Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
\AtEveryBibitem\clearfield

month \DeclareFieldFormatvolumevolume #1 \DeclareFieldFormat[article]volume#1 \addbibresourcemybib.bib

Transient subtraction: A control variate method for computing transport coefficients

Pierre Monmarchéa, LJLL and LCT, Sorbonne Université, Paris, France Renato Spacekb, MATHERIALS team, Inria Paris, France CERMICS, École des Ponts, France Gabriel Stoltzc, CERMICS, École des Ponts, France MATHERIALS team, Inria Paris, France
Abstract

In molecular dynamics, transport coefficients measure the sensitivity of the invariant probability measure of the stochastic dynamics at hand with respect to some perturbation. They are typically computed using either the linear response of nonequilibrium dynamics, or the Green–Kubo formula. The estimators for both approaches have large variances, which motivates the study of variance reduction techniques for computing transport coefficients. We present an alternative approach, called the transient subtraction technique (inspired by early work by Ciccotti and Jaccucci in 1975), which amounts to simulating a transient dynamics, from which we subtract a sensibly coupled equilibrium trajectory, resulting in an estimator with smaller variance. We present the mathematical formulation of the transient subtraction technique, give error estimates on the bias and variance of the associated estimator, and demonstrate the relevance of the method through numerical illustrations for various systems.

1 Introduction

When considering large systems of interacting particles, quantities of interest are typically macroscopic properties, such as temperature and pressure, rather than microscopic ones. Generally, full microscopic descriptions are not only too large to be reasonably considered, but also largely uninteresting. From a numerical viewpoint, molecular dynamics provides an effective way of bridging the microscopic and macroscopic properties of such systems through computer simulations; see [tuckerman2010, leimkuhler2015, allen2017] for reference textbooks. These simulations are typically done via the numerical realization of a stochastic differential equation (SDE), such as the Langevin dynamics, which evolves the positions q𝑞qitalic_q and momenta p𝑝pitalic_p as

{dqt=M1ptdt,dpt=V(qt)dtγM1ptdt+2γβdWt,cases𝑑subscript𝑞𝑡superscript𝑀1subscript𝑝𝑡𝑑𝑡otherwise𝑑subscript𝑝𝑡𝑉subscript𝑞𝑡𝑑𝑡𝛾superscript𝑀1subscript𝑝𝑡𝑑𝑡2𝛾𝛽𝑑subscript𝑊𝑡otherwise\displaystyle\begin{cases}dq_{t}=M^{-1}p_{t}\,dt,\\ dp_{t}=-\nabla V(q_{t})\,dt-\gamma M^{-1}p_{t}\,dt+\sqrt{\dfrac{2\gamma}{\beta% }}\,dW_{t},\end{cases}{ start_ROW start_CELL italic_d italic_q start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT = italic_M start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_p start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT italic_d italic_t , end_CELL start_CELL end_CELL end_ROW start_ROW start_CELL italic_d italic_p start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT = - ∇ italic_V ( italic_q start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ) italic_d italic_t - italic_γ italic_M start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_p start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT italic_d italic_t + square-root start_ARG divide start_ARG 2 italic_γ end_ARG start_ARG italic_β end_ARG end_ARG italic_d italic_W start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT , end_CELL start_CELL end_CELL end_ROW (1)

where V𝑉Vitalic_V is the potential energy function, M𝑀Mitalic_M the mass matrix, γ>0𝛾0\gamma>0italic_γ > 0 the damping coefficient, β>0𝛽0\beta>0italic_β > 0 the inverse temperature and Wtsubscript𝑊𝑡W_{t}italic_W start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT a standard multidimensional Brownian motion.

One particular application of molecular dynamics is the computation of transport coefficients (such as the diffusivity, mobility and shear viscosity), which encode important physical properties of materials, and in particular measure how quickly a perturbed system returns to steady-state. At the microscopic level, transport coefficients are defined as the proportionality constant between the magnitude η1much-less-than𝜂1\eta\ll 1italic_η ≪ 1 of some external forcing exerted on the system, and some flux induced by this forcing. The flux is represented as the steady-state average 𝔼η(R)subscript𝔼𝜂𝑅\mathbb{E}_{\eta}(R)blackboard_E start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT ( italic_R ) for some given observable R𝑅Ritalic_R with average 0 with respect to the equilibrium system (η=0)𝜂0(\eta=0)( italic_η = 0 ). This can be made precise through the framework of linear response theory; see [chandler1987, Chapter 8] for an introduction. To numerically realize this, one considers a nonequilibrium system by adding a perturbation of magnitude η𝜂\etaitalic_η to the reference dynamics at hand (e.g. Langevin dynamics), and the appropriate flux is then measured as a time-average over a long trajectory; this is known as the nonequilibrium molecular dynamics (NEMD) method.

Alternatively, the linear response can be reformulated as an equilibrium integrated correlation, known as the Green–Kubo (GK) formula [green1954, kubo1957]. Both the NEMD and Green–Kubo methods are commonly used, and each has their advantages and drawbacks; see [stoltz2024] for a detailed numerical comparison of both approaches.

Although less common, a third class of techniques consists of methods based on transient dynamics, which typically rely on monitoring the system’s relaxation to steady-state after an initial perturbation (unlike the NEMD and GK methods, which are based on steady-state averages). As will be made precise in Section 2.2, transient methods can be applied in two different ways: (i) starting from an equilibrium system with perturbed initial conditions, and allowing the system to relax back to its equilibrium steady-state, e.g. the momentum impulse relaxation [arya2000] and the approach-to-equilibrium molecular dynamics methods [lampin2013]; or a somewhat dual approach, carried out by (ii) applying a driving force to an equilibrium system and monitoring its relaxation towards a nonequilibrium steady-state, such as the transient-time correlation function method [morriss1987, evans1988].

All three classes of methods suffer from severe numerical difficulties, in particular with the presence of large statistical error being the main challenge, as made precise in Section 2.2. For NEMD, the statistical error mainly arises from the large signal-to-noise ratio (due to the small magnitude of the perturbation η𝜂\etaitalic_η), which requires long integration times to offset the variance. For Green–Kubo, the statistical error scales linearly with the integration time T𝑇Titalic_T due to the decay of correlations, as it amounts to integrating a small quantity plagued by a large statistical error [sousaoliveira2017]. There have been several attempts at more efficient methods to compute transport coefficients in the context of variance reduction [pavliotis2023, spacek2023, blassel2024]. In particular, one such method is known as the subtraction technique, developed and investigated in [ciccotti1975], and further explored in [ciccotti1979]. The method is based on NEMD, and its key idea relies on estimating the equilibrium quantity 𝔼0(R)subscript𝔼0𝑅\mathbb{E}_{0}(R)blackboard_E start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ( italic_R ) in addition to the usual nonequilibrium response 𝔼η(R)subscript𝔼𝜂𝑅\mathbb{E}_{\eta}(R)blackboard_E start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT ( italic_R ). One would then subtract the estimated equilibrium trajectory (which has average 0 with respect to the stationary distribution by definition, thus leaving the transport coefficient unchanged) from the perturbed trajectory, yielding an estimator with lower variance provided that the two trajectories are sufficiently correlated.

As discussed in both works [ciccotti1975, ciccotti1979] (which consider a deterministic framework), the high correlation between trajectories is a natural artifact of the deterministic dynamics for reasonably short integration times. This allows for the statistical error to be effectively subtracted out through the equilibrium trajectory, thus making the subtraction step an effective control variate. In stochastic settings, however, using independent noises for equilibrium and nonequilibrium trajectories (corresponding to η=0𝜂0\eta=0italic_η = 0 and η0𝜂0\eta\neq 0italic_η ≠ 0, respectively) results in uncorrelated trajectories. This suggests the need for constructing a sensible coupling between the two systems; otherwise, the subtraction step would essentially amount to adding two independent Gaussians, doubling the variance of the estimator at hand.

One way to overcome this issue is to consider couplings, which have been used as a control variate to compute transport coefficients [goodman2009, garnier2022]. One particular example of a common coupling strategy is synchronous coupling, which amounts to using the same noise for both dynamics. A major challenge with coupling techniques, however, is ensuring that trajectories stay coupled for long times. This is especially problematic for systems which rely on long-time averages for convergence, e.g. NEMD. Typically, one hopes to obtain convergence of time-averages before trajectories decouple, but this cannot be assumed in general without additional (and often restrictive) requirements.

Synchronous coupling, for instance, typically requires conditions such as global dissipativity in order to ensure long-time couplings of the trajectories. In general, however, global dissipativity is only obtained under strong conditions. One such example is when the drift b𝑏bitalic_b is given by b=V𝑏𝑉b=-\nabla Vitalic_b = - ∇ italic_V for some strongly convex potential V𝑉Vitalic_V, which is too restrictive a requirement for actual applications in MD. This suggests that synchronous coupling is typically impractical for estimators that require long-time integration such as NEMD, as the decoupling time is much shorter than the time needed for convergence with no global dissipativity. Although in some cases, for instance at high temperatures, synchronously coupled trajectories might not decouple at all even with no global dissipativity; see [monmarche2023]. A natural way to address this problem would be to construct couplings with milder conditions which guarantee long-time couplings but this remains challenging (see for instance [darshan2024]).

We adopt an alternative viewpoint: we consider methods for which convergence of an observable is feasible over short-times. In particular, we devise a transient method, consisting of an initially perturbed trajectory relaxing to equilibrium. We thus do not require long-time averages for convergence, which suggests that we can use synchronous coupling under weak conditions, provided that the relaxation time is smaller than the decoupling time. Indeed, even though the dynamics might start to decouple before relaxation, the total variance might be nonetheless decreased due to the control variate, as analysis will show that variance reduction is still obtainable.

Outline

This work is organized as follows. We discuss in Section 2 some standard numerical methods for approximating transport coefficients, and present an approach based on integrating dynamics in the transient regime. Then, by employing the subtraction technique to the transient method, we construct in Section 3 an improved transient subtraction estimator. We provide some error analysis on its bias and variance. We then illustrate the efficacy of our method with numerical results for several systems in Section 4, namely by computing the mobility for one-dimensional Langevin dynamics, and mobility and shear viscosity for a Lennard–Jones fluid. Finally, conclusions and extensions are discussed in Section 5.

2 Transient method to compute transport coefficients

We discuss in this section the definition and computation of transport coefficients, and in particular the use of a transient method for their approximation. We start by presenting in Section 2.1 the general setting used to compute transport coefficients for a general SDE, then overview their standard numerical approximations and associated numerical difficulties in Section 2.2. We then introduce the transient method we consider in this work in Section 2.3.

2.1 General setting

Consider a general time-homogeneous SDE with additive noise defined on the state-space 𝒳𝒳\mathcal{X}caligraphic_X, where 𝒳𝒳\mathcal{X}caligraphic_X is typically dsuperscript𝑑\mathbb{R}^{d}blackboard_R start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT or 𝕋dsuperscript𝕋𝑑\mathbb{T}^{d}blackboard_T start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT (with 𝕋=/𝕋\mathbb{T}=\mathbb{R}/\mathbb{Z}blackboard_T = blackboard_R / blackboard_Z the one-dimensional torus):

dXt=b(Xt)dt+σdWt.𝑑subscript𝑋𝑡𝑏subscript𝑋𝑡𝑑𝑡𝜎𝑑subscript𝑊𝑡dX_{t}=b(X_{t})\,dt+\sigma\,dW_{t}.italic_d italic_X start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT = italic_b ( italic_X start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ) italic_d italic_t + italic_σ italic_d italic_W start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT . (2)

where b:𝒳d:𝑏𝒳superscript𝑑b\colon\mathcal{X}\to\mathbb{R}^{d}italic_b : caligraphic_X → blackboard_R start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT is a smooth function, σd×m𝜎superscript𝑑𝑚\sigma\in\mathbb{R}^{d\times m}italic_σ ∈ blackboard_R start_POSTSUPERSCRIPT italic_d × italic_m end_POSTSUPERSCRIPT is a constant matrix and Wtsubscript𝑊𝑡W_{t}italic_W start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT is a standard m𝑚mitalic_m-dimensional Brownian motion. We assume that (2) admits a unique strong solution (which is the case for instance when b𝑏bitalic_b is globally Lipschitz). We restrict ourselves to cases where σ𝜎\sigmaitalic_σ is constant, as the dynamics of interest considered later on involve additive noise, namely Langevin dynamics, and also because the coupling method we introduce in Section 3.1 is considerably easier to formulate in this setting. The dynamics (2) has associated infinitesimal generator

=b𝖳+12σσ𝖳:2=i=1dbixi+12i,j=1dk=1mσikσjkxixj2,:superscript𝑏𝖳12𝜎superscript𝜎𝖳superscript2superscriptsubscript𝑖1𝑑subscript𝑏𝑖subscriptsubscript𝑥𝑖12superscriptsubscript𝑖𝑗1𝑑superscriptsubscript𝑘1𝑚subscript𝜎𝑖𝑘subscript𝜎𝑗𝑘superscriptsubscriptsubscript𝑥𝑖subscript𝑥𝑗2\mathcal{L}=b^{\mathsf{T}}\nabla+\frac{1}{2}\sigma\sigma^{\mathsf{T}}\colon% \nabla^{2}=\sum_{i=1}^{d}b_{i}\partial_{x_{i}}+\frac{1}{2}\sum_{i,j=1}^{d}\sum% _{k=1}^{m}\sigma_{ik}\sigma_{jk}\partial_{x_{i}x_{j}}^{2},caligraphic_L = italic_b start_POSTSUPERSCRIPT sansserif_T end_POSTSUPERSCRIPT ∇ + divide start_ARG 1 end_ARG start_ARG 2 end_ARG italic_σ italic_σ start_POSTSUPERSCRIPT sansserif_T end_POSTSUPERSCRIPT : ∇ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT = ∑ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∂ start_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT + divide start_ARG 1 end_ARG start_ARG 2 end_ARG ∑ start_POSTSUBSCRIPT italic_i , italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT ∑ start_POSTSUBSCRIPT italic_k = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT italic_σ start_POSTSUBSCRIPT italic_i italic_k end_POSTSUBSCRIPT italic_σ start_POSTSUBSCRIPT italic_j italic_k end_POSTSUBSCRIPT ∂ start_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT , (3)

where ::\colon: denotes the Frobenius inner product. Throughout this work, we assume that (2) admits a unique invariant probability measure μ𝜇\muitalic_μ with a positive density with respect to the Lebesgue measure. We denote by

L02(μ)={φL2(μ)|𝒳φ𝑑μ=0}subscriptsuperscript𝐿20𝜇conditional-set𝜑superscript𝐿2𝜇subscript𝒳𝜑differential-d𝜇0L^{2}_{0}(\mu)=\left\{\varphi\in L^{2}(\mu)\,\middle|\,\int_{\mathcal{X}}% \varphi\,d\mu=0\right\}italic_L start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ( italic_μ ) = { italic_φ ∈ italic_L start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( italic_μ ) | ∫ start_POSTSUBSCRIPT caligraphic_X end_POSTSUBSCRIPT italic_φ italic_d italic_μ = 0 }

the space of L2(μ)superscript𝐿2𝜇L^{2}(\mu)italic_L start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( italic_μ ) functions with average 0 with respect to μ𝜇\muitalic_μ.

Transport coefficients measure how the steady state of the reference dynamics (2) changes when some external forcing is applied to it. This external forcing typically arises as an extra drift term of magnitude η𝜂\etaitalic_η, with |η|𝜂|\eta|| italic_η | small in order for the forcing to be considered as a small perturbation. In this context, the transport coefficient ρ𝜌\rhoitalic_ρ is defined as the proportionality constant between the steady-state flux of some observable R𝑅Ritalic_R of interest, and the magnitude of the external forcing needed to induce it, known as the linear response; see [chandler1987, Chapter 8] for an introduction to linear response theory, and for instance [spacek2023, Section 2] for a synthetic presentation. We assume that the observable R𝑅Ritalic_R has average 0 with respect to μ𝜇\muitalic_μ (without loss of generality, as it can always be recentered in case it has a nonzero average). The linear response can be reformulated in terms of an integrated time-correlation function, known as the Green–Kubo formula. For simplicity, we do not further recall the framework of linear response theory and instead directly write the Green–Kubo formula:

ρ=0+𝔼μ(R(Xt)S(X0))𝑑t,𝜌superscriptsubscript0subscript𝔼𝜇𝑅subscript𝑋𝑡𝑆subscript𝑋0differential-d𝑡\displaystyle\rho=\int_{0}^{+\infty}\mathbb{E}_{\mu}(R(X_{t})S(X_{0}))\,dt,italic_ρ = ∫ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT + ∞ end_POSTSUPERSCRIPT blackboard_E start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT ( italic_R ( italic_X start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ) italic_S ( italic_X start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) ) italic_d italic_t , (4)

where SL02(μ)𝑆subscriptsuperscript𝐿20𝜇S\in L^{2}_{0}(\mu)italic_S ∈ italic_L start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ( italic_μ ) is the conjugate response function, which depends on the extra drift term added to perturb the dynamics (see [lelievre2016, Section 5.2.3] for a precise definition), and where the expectation 𝔼μsubscript𝔼𝜇\mathbb{E}_{\mu}blackboard_E start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT is taken with respect to all initial conditions X0μsimilar-tosubscript𝑋0𝜇X_{0}\sim\muitalic_X start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ∼ italic_μ, and over all realizations of the dynamics (2). Let us emphasize that the conjugate response S𝑆Sitalic_S has average 0 by construction.

2.2 Numerical techniques to compute transport coefficients

Transport coefficients can be numerically estimated using a variety of techniques. Generally, such techniques fall into one of three main categories (see [lelievre2016, stoltz2024] for a detailed discussion and numerical comparison):

  1. (1)

    Equilibrium techniques based on the Green–Kubo formula (4). In order to numerically realize (4), one constructs an estimator by (i) truncating the time-integral to finite integration time T𝑇Titalic_T; and (ii) approximating the expectation with an average over K𝐾Kitalic_K independent trajectories of the system (Xtk)t0subscriptsuperscriptsubscript𝑋𝑡𝑘𝑡0(X_{t}^{k})_{t\geqslant 0}( italic_X start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_k end_POSTSUPERSCRIPT ) start_POSTSUBSCRIPT italic_t ⩾ 0 end_POSTSUBSCRIPT with 1kK1𝑘𝐾1\leqslant k\leqslant K1 ⩽ italic_k ⩽ italic_K. This leads to the following natural estimator:

    ρ^GKT,K=1Kk=1K0TR(Xtk)S(X0k)𝑑t.subscriptsuperscript^𝜌𝑇𝐾GK1𝐾superscriptsubscript𝑘1𝐾superscriptsubscript0𝑇𝑅superscriptsubscript𝑋𝑡𝑘𝑆superscriptsubscript𝑋0𝑘differential-d𝑡\mathcal{\widehat{\rho}}^{T,K}_{\rm GK}=\frac{1}{K}\sum_{k=1}^{K}\int_{0}^{T}R% (X_{t}^{k})S(X_{0}^{k})\,dt.over^ start_ARG italic_ρ end_ARG start_POSTSUPERSCRIPT italic_T , italic_K end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_GK end_POSTSUBSCRIPT = divide start_ARG 1 end_ARG start_ARG italic_K end_ARG ∑ start_POSTSUBSCRIPT italic_k = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_K end_POSTSUPERSCRIPT ∫ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_R ( italic_X start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_k end_POSTSUPERSCRIPT ) italic_S ( italic_X start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_k end_POSTSUPERSCRIPT ) italic_d italic_t . (5)

    The sources of error associated with the estimator (5) are

    • A statistical error O(T)O𝑇\mathrm{O}(T)roman_O ( italic_T ), which scales linearly with the time lag [sousaoliveira2017, plechac2022, gastaldello2024] and is typically the largest source of error;

    • An integration time truncation bias, which is small as correlations are typically exponentially decaying (as discussed in [plechac2022]);

    • A discretization bias, which arises from the finiteness of the timestep used to discretize (2), and from quadrature formulas for the time integral [leimkuhler2016, lelievre2016].

    The various sources of error suggest carefully choosing T𝑇Titalic_T in order to minimize the error as a tradeoff between T𝑇Titalic_T large enough for the time truncation bias to be small, but not too large in order to limit the increase in variance.

  2. (2)

    Nonequilibrium steady-state techniques. This method is based on linear response theory. It amounts to permanently adding an external forcing to the system, which induces a nonzero flux in the steady-state. The transport coefficient is then obtained by diving the average flux by the magnitude of the perturbation, for small values of the perturbation in order to ensure one stays in the linear response regime.

    There are several sources of error associated with this technique. In particular, the main concern is the statistical error, much larger than the usual asymptotic variance for standard time averages due to the small magnitude of the forcing. See [lelievre2016, Section 5], [spacek2023, Section 2] and [leimkuhler2016, Section 3] for a more detailed discussion on the numerical analysis of nonequilibrium methods.

  3. (3)

    Transient methods. While both Green–Kubo and nonequilibrium methods are based on steady-state dynamics, transient methods provide an alternative framework by monitoring the system’s relaxation to a steady-state after an initial perturbation, and can be classified into two main approaches:

    1. (3a)

      Equilibrium relaxation: A typical scenario is to perturb an equilibrium system by creating an initial profile of momentum or energy, for instance, which is then allowed to relax to an equilibrium steady-state through the time-evolution of the equilibrium dynamics. During relaxation, the corresponding transient profiles are monitored, which are often fit to a macroscopic effective PDE parametrized by the transport coefficient at hand in order to estimate this transport coefficient by some form of inverse problem fitting. Examples include the method proposed in [hulse2005] to compute the thermal conductivity, the momentum impulse relaxation method [arya2000], and the approach-to-equilibrium molecular dynamics method [lampin2013].

    2. (3b)

      Nonequilibrium relaxation: In a somewhat dual approach, one can alternatively start with an equilibrium system and drive it towards a nonequilibrium steady-state by applying an external forcing to the dynamics. The relaxation to a nonequilibrium steady-state is then monitored, from which the transport coefficient can be obtained. One example is the transient-time-correlation function (TTCF) method [morriss1987, evans1988], which generalizes the Green–Kubo relations to nonlinear regimes.

The limitations and drawbacks listed above suggest that there is space for alternative approaches, in particular in the context of variance reduction; this motivates the construction of the transient subtraction method.

2.3 Transient dynamics method

As discussed in Section 2.2, an alternative approach to the NEMD and GK methods for computing transport coefficients is based on transient dynamics. The fundamental idea is that, instead of applying an external forcing to the dynamics, or computing correlations for the equilibrium dynamics, we start from an initially perturbed system, and monitor its relaxation to steady-state by evolving equilibrium dynamics.

Mathematical formulation

The transient method relies on two main ingredients: (i) perturbing the distribution of initial conditions at order O(η)O𝜂\mathrm{O}(\eta)roman_O ( italic_η ) with η1much-less-than𝜂1\eta\ll 1italic_η ≪ 1; and (ii) monitoring return to stationarity via time integration. More precisely, we consider a process Xtηsuperscriptsubscript𝑋𝑡𝜂X_{t}^{\eta}italic_X start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT which evolves according to the reference dynamics (2), with X0ημ~ηsimilar-tosuperscriptsubscript𝑋0𝜂subscript~𝜇𝜂X_{0}^{\eta}\sim\widetilde{\mu}_{\eta}italic_X start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT ∼ over~ start_ARG italic_μ end_ARG start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT. The probability measure μ~ηsubscript~𝜇𝜂\widetilde{\mu}_{\eta}over~ start_ARG italic_μ end_ARG start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT is assumed to be a first-order perturbation of the invariant probability measure of the reference dynamics μ𝜇\muitalic_μ, satisfying

μ~η=(1+ηS)μ+O(η2).subscript~𝜇𝜂1𝜂𝑆𝜇Osuperscript𝜂2\widetilde{\mu}_{\eta}=(1+\eta S)\mu+\mathrm{O}(\eta^{2}).over~ start_ARG italic_μ end_ARG start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT = ( 1 + italic_η italic_S ) italic_μ + roman_O ( italic_η start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) . (6)

We then evolve the process Xtηsuperscriptsubscript𝑋𝑡𝜂X_{t}^{\eta}italic_X start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT according to the reference equilibrium dynamics, which relaxes over time to its equilibrium steady-state. In particular, although not immediately clear, the time integral of the expectation of R(Xtη)𝑅superscriptsubscript𝑋𝑡𝜂R(X_{t}^{\eta})italic_R ( italic_X start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT ), when divided by η𝜂\etaitalic_η, converges to the transport coefficient ρ𝜌\rhoitalic_ρ as η𝜂\etaitalic_η goes to 00:

ρ=limη01η0+𝔼(R(Xtη))𝑑t.𝜌subscript𝜂01𝜂superscriptsubscript0𝔼𝑅superscriptsubscript𝑋𝑡𝜂differential-d𝑡\rho=\lim_{\eta\to 0}\frac{1}{\eta}\int_{0}^{+\infty}\mathbb{E}(R(X_{t}^{\eta}% ))\,dt.italic_ρ = roman_lim start_POSTSUBSCRIPT italic_η → 0 end_POSTSUBSCRIPT divide start_ARG 1 end_ARG start_ARG italic_η end_ARG ∫ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT + ∞ end_POSTSUPERSCRIPT blackboard_E ( italic_R ( italic_X start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT ) ) italic_d italic_t . (7)

To motivate the equality (7), we consider finite η1much-less-than𝜂1\eta\ll 1italic_η ≪ 1. By writing the expectation in terms of the semigroup, and using that etRsuperscripte𝑡𝑅\mathrm{e}^{t\mathcal{L}}Rroman_e start_POSTSUPERSCRIPT italic_t caligraphic_L end_POSTSUPERSCRIPT italic_R has average 0 with respect to μ𝜇\muitalic_μ (by the invariance of μ𝜇\muitalic_μ by the dynamics and the fact that R𝑅Ritalic_R has average 0 with respect to μ𝜇\muitalic_μ), we have, informally,

1η0+𝔼(R(Xtη))𝑑t1𝜂superscriptsubscript0𝔼𝑅superscriptsubscript𝑋𝑡𝜂differential-d𝑡\displaystyle\frac{1}{\eta}\int_{0}^{+\infty}\mathbb{E}(R(X_{t}^{\eta}))\,dtdivide start_ARG 1 end_ARG start_ARG italic_η end_ARG ∫ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT + ∞ end_POSTSUPERSCRIPT blackboard_E ( italic_R ( italic_X start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT ) ) italic_d italic_t =1η0+𝒳(etR)𝑑μ~η𝑑tabsent1𝜂superscriptsubscript0subscript𝒳superscripte𝑡𝑅differential-dsubscript~𝜇𝜂differential-d𝑡\displaystyle=\frac{1}{\eta}\int_{0}^{+\infty}\int_{\mathcal{X}}\bigl{(}% \mathrm{e}^{t\mathcal{L}}R\bigr{)}\,d\widetilde{\mu}_{\eta}\,dt= divide start_ARG 1 end_ARG start_ARG italic_η end_ARG ∫ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT + ∞ end_POSTSUPERSCRIPT ∫ start_POSTSUBSCRIPT caligraphic_X end_POSTSUBSCRIPT ( roman_e start_POSTSUPERSCRIPT italic_t caligraphic_L end_POSTSUPERSCRIPT italic_R ) italic_d over~ start_ARG italic_μ end_ARG start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT italic_d italic_t (8)
=1η0+𝒳etR𝑑μ𝑑t+0+𝒳(etR)S𝑑μ𝑑t+O(η)absent1𝜂superscriptsubscript0subscript𝒳superscripte𝑡𝑅differential-d𝜇differential-d𝑡superscriptsubscript0subscript𝒳superscripte𝑡𝑅𝑆differential-d𝜇differential-d𝑡O𝜂\displaystyle=\frac{1}{\eta}\int_{0}^{+\infty}\int_{\mathcal{X}}\mathrm{e}^{t% \mathcal{L}}R\,d\mu\,dt+\int_{0}^{+\infty}\int_{\mathcal{X}}\bigl{(}\mathrm{e}% ^{t\mathcal{L}}R\bigr{)}S\,d\mu\,dt+\mathrm{O}(\eta)= divide start_ARG 1 end_ARG start_ARG italic_η end_ARG ∫ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT + ∞ end_POSTSUPERSCRIPT ∫ start_POSTSUBSCRIPT caligraphic_X end_POSTSUBSCRIPT roman_e start_POSTSUPERSCRIPT italic_t caligraphic_L end_POSTSUPERSCRIPT italic_R italic_d italic_μ italic_d italic_t + ∫ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT + ∞ end_POSTSUPERSCRIPT ∫ start_POSTSUBSCRIPT caligraphic_X end_POSTSUBSCRIPT ( roman_e start_POSTSUPERSCRIPT italic_t caligraphic_L end_POSTSUPERSCRIPT italic_R ) italic_S italic_d italic_μ italic_d italic_t + roman_O ( italic_η ) (9)
=0+𝒳(etR)S𝑑μ𝑑t+O(η)absentsuperscriptsubscript0subscript𝒳superscripte𝑡𝑅𝑆differential-d𝜇differential-d𝑡O𝜂\displaystyle=\int_{0}^{+\infty}\int_{\mathcal{X}}\bigl{(}\mathrm{e}^{t% \mathcal{L}}R\bigr{)}S\,d\mu\,dt+\mathrm{O}(\eta)= ∫ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT + ∞ end_POSTSUPERSCRIPT ∫ start_POSTSUBSCRIPT caligraphic_X end_POSTSUBSCRIPT ( roman_e start_POSTSUPERSCRIPT italic_t caligraphic_L end_POSTSUPERSCRIPT italic_R ) italic_S italic_d italic_μ italic_d italic_t + roman_O ( italic_η ) (10)
=0+𝔼μ(R(Xt)S(X0))𝑑t+O(η).absentsuperscriptsubscript0subscript𝔼𝜇𝑅subscript𝑋𝑡𝑆subscript𝑋0differential-d𝑡O𝜂\displaystyle=\int_{0}^{+\infty}\mathbb{E}_{\mu}\bigl{(}R(X_{t})S(X_{0})\bigr{% )}dt+\mathrm{O}(\eta).= ∫ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT + ∞ end_POSTSUPERSCRIPT blackboard_E start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT ( italic_R ( italic_X start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ) italic_S ( italic_X start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) ) italic_d italic_t + roman_O ( italic_η ) . (11)

It is clear that by letting η0𝜂0\eta\to 0italic_η → 0 we get the correct result, i.e. (7) is equivalent to the Green–Kubo formula (4):

limη01η0+𝔼(R(Xtη))𝑑t=0+𝔼μ(R(Xt)S(X0))𝑑t.subscript𝜂01𝜂superscriptsubscript0𝔼𝑅superscriptsubscript𝑋𝑡𝜂differential-d𝑡superscriptsubscript0subscript𝔼𝜇𝑅subscript𝑋𝑡𝑆subscript𝑋0differential-d𝑡\lim_{\eta\to 0}\frac{1}{\eta}\int_{0}^{+\infty}\mathbb{E}(R(X_{t}^{\eta}))\,% dt=\int_{0}^{+\infty}\mathbb{E}_{\mu}(R(X_{t})S(X_{0}))\,dt.roman_lim start_POSTSUBSCRIPT italic_η → 0 end_POSTSUBSCRIPT divide start_ARG 1 end_ARG start_ARG italic_η end_ARG ∫ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT + ∞ end_POSTSUPERSCRIPT blackboard_E ( italic_R ( italic_X start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT ) ) italic_d italic_t = ∫ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT + ∞ end_POSTSUPERSCRIPT blackboard_E start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT ( italic_R ( italic_X start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ) italic_S ( italic_X start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) ) italic_d italic_t . (12)

We recall that 𝔼μsubscript𝔼𝜇\mathbb{E}_{\mu}blackboard_E start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT denotes the expectation with respect to the reference dynamics started at equilibrium, while 𝔼𝔼\mathbb{E}blackboard_E on the left-hand side denotes the expectation with respect to the reference dynamics initialized as X0ημ~ηsimilar-tosuperscriptsubscript𝑋0𝜂subscript~𝜇𝜂X_{0}^{\eta}\sim\widetilde{\mu}_{\eta}italic_X start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT ∼ over~ start_ARG italic_μ end_ARG start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT.

The above discussion is an informal presentation of the method, and is done for motivational purposes; see Proposition 1 for the formal meaning of the initial distribution (6), and the rigorous form of the computation (8)–(11).

Estimators of transient dynamics

In practice, numerically estimating (7) requires first approximating the limit with (sufficiently small) finite η𝜂\etaitalic_η, truncating the time integral to finite T𝑇Titalic_T, and approximating the expectation with an average over K𝐾Kitalic_K realizations of the dynamics started from i.i.d. initial conditions X0μ~ηsimilar-tosubscript𝑋0subscript~𝜇𝜂X_{0}\sim\widetilde{\mu}_{\eta}italic_X start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ∼ over~ start_ARG italic_μ end_ARG start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT. This leads to the following estimator for (7):

ρ^transT,K,η=1ηKk=1K0TR(Xtη,k)𝑑t.subscriptsuperscript^𝜌𝑇𝐾𝜂trans1𝜂𝐾superscriptsubscript𝑘1𝐾superscriptsubscript0𝑇𝑅superscriptsubscript𝑋𝑡𝜂𝑘differential-d𝑡\widehat{\rho}^{T,K,\eta}_{\rm trans}=\frac{1}{\eta K}\sum_{k=1}^{K}\int_{0}^{% T}R(X_{t}^{\eta,k})\,dt.over^ start_ARG italic_ρ end_ARG start_POSTSUPERSCRIPT italic_T , italic_K , italic_η end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_trans end_POSTSUBSCRIPT = divide start_ARG 1 end_ARG start_ARG italic_η italic_K end_ARG ∑ start_POSTSUBSCRIPT italic_k = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_K end_POSTSUPERSCRIPT ∫ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_R ( italic_X start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η , italic_k end_POSTSUPERSCRIPT ) italic_d italic_t . (13)

Although these approximations lead to several sources of bias in (13), which are made precise in Section 3.2.2, the primary concern associated with (13) is its very large variance, as we discuss next. This disqualifies it as an appropriate numerical method.

Asymptotic variance of usual transient estimator

The asymptotic variance of the estimator (13) is

limT+T1Var(ρ^transT,K,η)=2Kη2𝒳R(1R)𝑑μ.subscript𝑇superscript𝑇1Varsubscriptsuperscript^𝜌𝑇𝐾𝜂trans2𝐾superscript𝜂2subscript𝒳𝑅superscript1𝑅differential-d𝜇\lim_{T\to+\infty}T^{-1}\operatorname{Var}\bigl{(}\widehat{\rho}^{T,K,\eta}_{% \rm trans}\bigr{)}=\frac{2}{K\eta^{2}}\int_{\mathcal{X}}R\left\lparen-\mathcal% {L}^{-1}R\right\rparen\,d\mu.roman_lim start_POSTSUBSCRIPT italic_T → + ∞ end_POSTSUBSCRIPT italic_T start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT roman_Var ( over^ start_ARG italic_ρ end_ARG start_POSTSUPERSCRIPT italic_T , italic_K , italic_η end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_trans end_POSTSUBSCRIPT ) = divide start_ARG 2 end_ARG start_ARG italic_K italic_η start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG ∫ start_POSTSUBSCRIPT caligraphic_X end_POSTSUBSCRIPT italic_R ( - caligraphic_L start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_R ) italic_d italic_μ . (14)

It corresponds to the usual asymptotic variance for time averages of ergodic equilibrium dynamics, except for the very large prefactor 1/η21superscript𝜂21/\eta^{2}1 / italic_η start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT.

Unlike the usual NEMD or Green–Kubo estimators of transport coefficients discussed in Section 2.2, the variance of (13) is magnified by two distinct contributions. First, as with NEMD, we divide (13) by η1much-less-than𝜂1\eta\ll 1italic_η ≪ 1 which gives rise to the O(η2)Osuperscript𝜂2\mathrm{O}(\eta^{-2})roman_O ( italic_η start_POSTSUPERSCRIPT - 2 end_POSTSUPERSCRIPT ) factor. Second, since the estimator is not a time average but a time integral as with GK, the variance also scales linearly in T𝑇Titalic_T, as opposed to the typical scaling O(1/T)O1𝑇\mathrm{O}(1/T)roman_O ( 1 / italic_T ) for time-averages. This leads to variance of order O(Tη2)O𝑇superscript𝜂2\mathrm{O}(T\eta^{-2})roman_O ( italic_T italic_η start_POSTSUPERSCRIPT - 2 end_POSTSUPERSCRIPT ), much higher than its NEMD and GK counterparts (although if T=O(1)𝑇O1T=\mathrm{O}(1)italic_T = roman_O ( 1 ), the transient method is comparable to NEMD).

This result calls for modifying the estimator with the use of variance reduction techniques, in particular to get rid of the η2superscript𝜂2\eta^{-2}italic_η start_POSTSUPERSCRIPT - 2 end_POSTSUPERSCRIPT contribution. To this end, we consider the use of couplings as a control variate, which are discussed more precisely in the next section.

3 Transient subtraction method

We propose in this section a method called transient subtraction technique, which employs a subtraction technique similar to the one suggested in [ciccotti1975] to the transient dynamics method discussed in Section 2.3 as a means for variance reduction. We first outline in Section 3.1 the construction of the method, then present the numerical analysis of its associated estimators in Section 3.2.

3.1 Constructing the method

In the transient dynamics setting of Section 2.3, one can consider the use of couplings as a control variate approach to construct an estimator with lower variance than (13). To this end, we consider the coupling (Xtη,Yt0)superscriptsubscript𝑋𝑡𝜂superscriptsubscript𝑌𝑡0(X_{t}^{\eta},Y_{t}^{0})( italic_X start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT , italic_Y start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT ), where the processes Xtηsuperscriptsubscript𝑋𝑡𝜂X_{t}^{\eta}italic_X start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT and Yt0superscriptsubscript𝑌𝑡0Y_{t}^{0}italic_Y start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT are evolved according to the same underlying reference dynamics and have different initial conditions:

{dYt0=b(Yt0)dt+σdWt,Y00μ,dXtη=b(Xtη)dt+σdW~t,X0ημ~η,cases𝑑superscriptsubscript𝑌𝑡0formulae-sequenceabsent𝑏superscriptsubscript𝑌𝑡0𝑑𝑡𝜎𝑑subscript𝑊𝑡similar-tosuperscriptsubscript𝑌00𝜇𝑑superscriptsubscript𝑋𝑡𝜂formulae-sequenceabsent𝑏superscriptsubscript𝑋𝑡𝜂𝑑𝑡𝜎𝑑subscript~𝑊𝑡similar-tosuperscriptsubscript𝑋0𝜂subscript~𝜇𝜂otherwise\begin{cases}\begin{aligned} dY_{t}^{0}&=b(Y_{t}^{0})\,dt+\sigma\,dW_{t},% \qquad Y_{0}^{0}\sim\mu,\\ dX_{t}^{\eta}&=b(X_{t}^{\eta})\,dt+\sigma\,d\widetilde{W}_{t},\qquad X_{0}^{% \eta}\sim\widetilde{\mu}_{\eta},\end{aligned}\end{cases}{ start_ROW start_CELL start_ROW start_CELL italic_d italic_Y start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT end_CELL start_CELL = italic_b ( italic_Y start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT ) italic_d italic_t + italic_σ italic_d italic_W start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT , italic_Y start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT ∼ italic_μ , end_CELL end_ROW start_ROW start_CELL italic_d italic_X start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT end_CELL start_CELL = italic_b ( italic_X start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT ) italic_d italic_t + italic_σ italic_d over~ start_ARG italic_W end_ARG start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT , italic_X start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT ∼ over~ start_ARG italic_μ end_ARG start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT , end_CELL end_ROW end_CELL start_CELL end_CELL end_ROW (15)

where Wtsubscript𝑊𝑡W_{t}italic_W start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT and W~tsubscript~𝑊𝑡\widetilde{W}_{t}over~ start_ARG italic_W end_ARG start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT are standard m𝑚mitalic_m-dimensional Brownian motions. The transport coefficient ρ𝜌\rhoitalic_ρ can then be computed as

ρ=limη01η0+𝔼(R(Xtη)R(Yt0))𝑑t.𝜌subscript𝜂01𝜂superscriptsubscript0𝔼𝑅superscriptsubscript𝑋𝑡𝜂𝑅superscriptsubscript𝑌𝑡0differential-d𝑡\rho=\lim_{\eta\to 0}\frac{1}{\eta}\int_{0}^{+\infty}\mathbb{E}\left\lparen R(% X_{t}^{\eta})-R(Y_{t}^{0})\right\rparen\,dt.italic_ρ = roman_lim start_POSTSUBSCRIPT italic_η → 0 end_POSTSUBSCRIPT divide start_ARG 1 end_ARG start_ARG italic_η end_ARG ∫ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT + ∞ end_POSTSUPERSCRIPT blackboard_E ( italic_R ( italic_X start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT ) - italic_R ( italic_Y start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT ) ) italic_d italic_t . (16)

Note that 0+R(Yt0)𝑑tsuperscriptsubscript0𝑅superscriptsubscript𝑌𝑡0differential-d𝑡\int_{0}^{+\infty}R(Y_{t}^{0})\,dt∫ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT + ∞ end_POSTSUPERSCRIPT italic_R ( italic_Y start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT ) italic_d italic_t acts as a control variate since 𝔼(R(Yt0))=0𝔼𝑅superscriptsubscript𝑌𝑡00\mathbb{E}(R(Y_{t}^{0}))=0blackboard_E ( italic_R ( italic_Y start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT ) ) = 0 for all t0𝑡0t\geqslant 0italic_t ⩾ 0. The expression (16) admits the following natural estimator, carried out with independent initial conditions for the couple (Xtη,k,Yt0,k)t0subscriptsuperscriptsubscript𝑋𝑡𝜂𝑘superscriptsubscript𝑌𝑡0𝑘𝑡0(X_{t}^{\eta,k},Y_{t}^{0,k})_{t\geqslant 0}( italic_X start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η , italic_k end_POSTSUPERSCRIPT , italic_Y start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 , italic_k end_POSTSUPERSCRIPT ) start_POSTSUBSCRIPT italic_t ⩾ 0 end_POSTSUBSCRIPT for 1kK1𝑘𝐾1\leqslant k\leqslant K1 ⩽ italic_k ⩽ italic_K and independent realizations of the dynamics (15):

ρ^subT,K,η=1ηKk=1K0T[R(Xtη,k)R(Yt0,k)]𝑑t.subscriptsuperscript^𝜌𝑇𝐾𝜂sub1𝜂𝐾superscriptsubscript𝑘1𝐾superscriptsubscript0𝑇delimited-[]𝑅superscriptsubscript𝑋𝑡𝜂𝑘𝑅superscriptsubscript𝑌𝑡0𝑘differential-d𝑡\widehat{\rho}^{T,K,\eta}_{\rm sub}=\frac{1}{\eta K}\sum_{k=1}^{K}\int_{0}^{T}% \left[R(X_{t}^{\eta,k})-R(Y_{t}^{0,k})\right]\,dt.over^ start_ARG italic_ρ end_ARG start_POSTSUPERSCRIPT italic_T , italic_K , italic_η end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_sub end_POSTSUBSCRIPT = divide start_ARG 1 end_ARG start_ARG italic_η italic_K end_ARG ∑ start_POSTSUBSCRIPT italic_k = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_K end_POSTSUPERSCRIPT ∫ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT [ italic_R ( italic_X start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η , italic_k end_POSTSUPERSCRIPT ) - italic_R ( italic_Y start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 , italic_k end_POSTSUPERSCRIPT ) ] italic_d italic_t . (17)

A sufficient condition for (17) to have smaller variance than the standard estimator (13) is for the trajectories to start η𝜂\etaitalic_η close, and to stay close for times of order 1/λ1𝜆1/\lambda1 / italic_λ, where λ𝜆\lambdaitalic_λ is the relaxation rate of the system to the stationary state (see Assumption 3). More precisely,

  1. (A)

    The initial distance |X0ηY00|superscriptsubscript𝑋0𝜂superscriptsubscript𝑌00|X_{0}^{\eta}-Y_{0}^{0}|| italic_X start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT - italic_Y start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT | must be of order η𝜂\etaitalic_η;

  2. (B)

    The dynamics must remain η𝜂\etaitalic_η close for finite times as the copies of the system evolve, i.e. |XtηYt0|superscriptsubscript𝑋𝑡𝜂superscriptsubscript𝑌𝑡0|X_{t}^{\eta}-Y_{t}^{0}|| italic_X start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT - italic_Y start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT | must be of order η𝜂\etaitalic_η for tT𝑡𝑇t\leqslant Titalic_t ⩽ italic_T.

Condition 1 amounts to finding a coupling measure which is concentrated along the diagonal in the (x,y)𝑥𝑦(x,y)( italic_x , italic_y ) space, so that initial conditions are η𝜂\etaitalic_η close. We emphasize that, although μ~ηsubscript~𝜇𝜂\widetilde{\mu}_{\eta}over~ start_ARG italic_μ end_ARG start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT is by construction a O(η)O𝜂\mathrm{O}(\eta)roman_O ( italic_η ) perturbation of μ𝜇\muitalic_μ, this is not enough to guarantee that the trajectories start η𝜂\etaitalic_η close when the initial conditions are independent, thus we require a coupling on the initial conditions.

We discuss in Section 3.1.1 a natural way of coupling the dynamics (15), and outline sufficient conditions for condition 2 to hold. Then, we formally construct the coupling measure on the initial conditions and discuss its properties in Section 3.1.2.

Remark 1 (Tangent dynamics).

The expression (16) of the transport coefficient can be formulated in terms of tangent dynamics [assaraf2017]. Denote by 𝒯tdsubscript𝒯𝑡superscript𝑑\mathcal{T}_{t}\in\mathbb{R}^{d}caligraphic_T start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ∈ blackboard_R start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT the tangent vector, where

𝒯t=limη0XtηXt0η.subscript𝒯𝑡subscript𝜂0superscriptsubscript𝑋𝑡𝜂superscriptsubscript𝑋𝑡0𝜂\mathcal{T}_{t}=\lim_{\eta\to 0}\frac{X_{t}^{\eta}-X_{t}^{0}}{\eta}.caligraphic_T start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT = roman_lim start_POSTSUBSCRIPT italic_η → 0 end_POSTSUBSCRIPT divide start_ARG italic_X start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT - italic_X start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT end_ARG start_ARG italic_η end_ARG . (18)

This vector evolves according to a random ordinary differential equation, obtained by linearizing (2). Moreover, (16) can be written as

ρ=limη01η0+𝔼(R(Xtη)R(Yt0))𝑑t=0+𝔼(𝒯tR(Xt0))𝑑t.𝜌subscript𝜂01𝜂superscriptsubscript0𝔼𝑅superscriptsubscript𝑋𝑡𝜂𝑅superscriptsubscript𝑌𝑡0differential-d𝑡superscriptsubscript0𝔼subscript𝒯𝑡𝑅superscriptsubscript𝑋𝑡0differential-d𝑡\rho=\lim_{\eta\to 0}\frac{1}{\eta}\int_{0}^{+\infty}\mathbb{E}\left\lparen R(% X_{t}^{\eta})-R(Y_{t}^{0})\right\rparen\,dt=\int_{0}^{+\infty}\mathbb{E}\left% \lparen\mathcal{T}_{t}\cdot\nabla R(X_{t}^{0})\right\rparen\,dt.italic_ρ = roman_lim start_POSTSUBSCRIPT italic_η → 0 end_POSTSUBSCRIPT divide start_ARG 1 end_ARG start_ARG italic_η end_ARG ∫ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT + ∞ end_POSTSUPERSCRIPT blackboard_E ( italic_R ( italic_X start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT ) - italic_R ( italic_Y start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT ) ) italic_d italic_t = ∫ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT + ∞ end_POSTSUPERSCRIPT blackboard_E ( caligraphic_T start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ⋅ ∇ italic_R ( italic_X start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT ) ) italic_d italic_t . (19)

3.1.1 Synchronous coupling

A natural way to ensure that the dynamics remain close is via synchronous coupling, which amounts to using the same Brownian motion for both processes, i.e. setting W~=W~𝑊𝑊\widetilde{W}=Wover~ start_ARG italic_W end_ARG = italic_W. It is known that synchronous coupling performs well in the presence of global dissipativity. Without it, however, trajectories decouple and we cannot control the coupling distance for long times. For the transient subtraction method, we do not require long-time results, as the relaxation time of the estimator (17) is typically of order O(1/λ)O1𝜆\mathrm{O}(1/\lambda)roman_O ( 1 / italic_λ ), with λ𝜆\lambdaitalic_λ the exponential convergence rate from Assumption 3. Thus, this suggests that synchronous coupling is an admissible control variate as long as the relaxation time is smaller than the decoupling time.

In order to more precisely state some results on the coupling distance, we give a sufficient condition for trajectories to decouple at most exponentially in time in Assumption 1.

Assumption 1.

There exists B𝐵B\in\mathbb{R}italic_B ∈ blackboard_R such that the drift b:𝒳d:𝑏𝒳superscript𝑑b\colon\mathcal{X}\to\mathbb{R}^{d}italic_b : caligraphic_X → blackboard_R start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT satisfies

(x,y)𝒳2,xy,b(x)b(y)B|xy|2.formulae-sequencefor-all𝑥𝑦superscript𝒳2𝑥𝑦𝑏𝑥𝑏𝑦𝐵superscript𝑥𝑦2\forall(x,y)\in\mathcal{X}^{2},\qquad\langle x-y,b(x)-b(y)\rangle\leqslant B|x% -y|^{2}.∀ ( italic_x , italic_y ) ∈ caligraphic_X start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT , ⟨ italic_x - italic_y , italic_b ( italic_x ) - italic_b ( italic_y ) ⟩ ⩽ italic_B | italic_x - italic_y | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT . (20)

A sufficient condition for (20) to be satisfied is when the drift b𝑏bitalic_b is globally Lipschitz with constant bLipsubscriptnorm𝑏Lip\|b\|_{\rm Lip}∥ italic_b ∥ start_POSTSUBSCRIPT roman_Lip end_POSTSUBSCRIPT, in which case B=bLip𝐵subscriptnorm𝑏LipB=\|b\|_{\rm Lip}italic_B = ∥ italic_b ∥ start_POSTSUBSCRIPT roman_Lip end_POSTSUBSCRIPT. In some fortunate cases where B<0𝐵0B<0italic_B < 0, the drift is globally dissipative. In particular, global dissipativity ensures uniform exponential decay of the coupling distance |XtηYt0|superscriptsubscript𝑋𝑡𝜂superscriptsubscript𝑌𝑡0|X_{t}^{\eta}-Y_{t}^{0}|| italic_X start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT - italic_Y start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT |. We next state a standard result providing an upper bound for how fast trajectories decouple based on the estimate considered in Assumption 1.

Lemma 1.

Suppose that Assumption 1 holds. Then, almost surely,

t0,|XtηYt0|etB|X0ηY00|.formulae-sequencefor-all𝑡0superscriptsubscript𝑋𝑡𝜂superscriptsubscript𝑌𝑡0superscripte𝑡𝐵superscriptsubscript𝑋0𝜂superscriptsubscript𝑌00\forall t\geqslant 0,\qquad|X_{t}^{\eta}-Y_{t}^{0}|\leqslant\mathrm{e}^{tB}|X_% {0}^{\eta}-Y_{0}^{0}|.∀ italic_t ⩾ 0 , | italic_X start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT - italic_Y start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT | ⩽ roman_e start_POSTSUPERSCRIPT italic_t italic_B end_POSTSUPERSCRIPT | italic_X start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT - italic_Y start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT | . (21)
Proof.

In order to bound the distance between the trajectories at time t𝑡titalic_t in terms of the initial distance, we first write, by Itô’s formula

d(|XtηYt0|2)𝑑superscriptsuperscriptsubscript𝑋𝑡𝜂superscriptsubscript𝑌𝑡02\displaystyle d\bigl{(}|X_{t}^{\eta}-Y_{t}^{0}|^{2}\bigr{)}italic_d ( | italic_X start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT - italic_Y start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) =2XtηYt0,dXtηdYt0absent2superscriptsubscript𝑋𝑡𝜂superscriptsubscript𝑌𝑡0𝑑superscriptsubscript𝑋𝑡𝜂𝑑subscriptsuperscript𝑌0𝑡\displaystyle=2\langle X_{t}^{\eta}-Y_{t}^{0},dX_{t}^{\eta}-dY^{0}_{t}\rangle= 2 ⟨ italic_X start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT - italic_Y start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT , italic_d italic_X start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT - italic_d italic_Y start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ⟩ (22)
=2XtηYt0,b(Xtη)b(Yt0)dtabsent2superscriptsubscript𝑋𝑡𝜂superscriptsubscript𝑌𝑡0𝑏superscriptsubscript𝑋𝑡𝜂𝑏superscriptsubscript𝑌𝑡0𝑑𝑡\displaystyle=2\langle X_{t}^{\eta}-Y_{t}^{0},b(X_{t}^{\eta})-b(Y_{t}^{0})% \rangle\,dt= 2 ⟨ italic_X start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT - italic_Y start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT , italic_b ( italic_X start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT ) - italic_b ( italic_Y start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT ) ⟩ italic_d italic_t (23)
2B|X0ηY00|2dt.absent2𝐵superscriptsuperscriptsubscript𝑋0𝜂superscriptsubscript𝑌002𝑑𝑡\displaystyle\leqslant 2B|X_{0}^{\eta}-Y_{0}^{0}|^{2}\,dt.⩽ 2 italic_B | italic_X start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT - italic_Y start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT italic_d italic_t . (24)

Grönwall’s lemma then gives the claimed bound. ∎

3.1.2 Properties of initial conditions

We consider a coupling measure μcoup(dxdy)subscript𝜇coup𝑑𝑥𝑑𝑦\mu_{\rm coup}(dx\,dy)italic_μ start_POSTSUBSCRIPT roman_coup end_POSTSUBSCRIPT ( italic_d italic_x italic_d italic_y ) with marginals μ~η(dx)subscript~𝜇𝜂𝑑𝑥\widetilde{\mu}_{\eta}(dx)over~ start_ARG italic_μ end_ARG start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT ( italic_d italic_x ) and μ(dy)𝜇𝑑𝑦\mu(dy)italic_μ ( italic_d italic_y ). In order to ensure that the initial conditions are η𝜂\etaitalic_η close, the coupling measure must be concentrated along the diagonal (more precisely, within η𝜂\etaitalic_η distance from the diagonal), as illustrated in Figure 1. A natural way of achieving this is to formulate X0ηsuperscriptsubscript𝑋0𝜂X_{0}^{\eta}italic_X start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT as a deterministic map of Y00superscriptsubscript𝑌00Y_{0}^{0}italic_Y start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT, i.e. to look for Φη:𝒳𝒳:subscriptΦ𝜂𝒳𝒳\Phi_{\eta}\colon\mathcal{X}\to\mathcal{X}roman_Φ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT : caligraphic_X → caligraphic_X such that X0η=Φη(Y00)superscriptsubscript𝑋0𝜂subscriptΦ𝜂superscriptsubscript𝑌00X_{0}^{\eta}=\Phi_{\eta}(Y_{0}^{0})italic_X start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT = roman_Φ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT ( italic_Y start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT ), with ΦηsubscriptΦ𝜂\Phi_{\eta}roman_Φ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT close to the identity function. The function ΦηsubscriptΦ𝜂\Phi_{\eta}roman_Φ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT should be chosen such that

μ~η=Φη#μ=(1+ηS)μ+O(η2),subscript~𝜇𝜂subscriptΦ𝜂#𝜇1𝜂𝑆𝜇Osuperscript𝜂2\widetilde{\mu}_{\eta}=\Phi_{\eta}\#\mu=(1+\eta S)\mu+\mathrm{O}(\eta^{2}),over~ start_ARG italic_μ end_ARG start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT = roman_Φ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT # italic_μ = ( 1 + italic_η italic_S ) italic_μ + roman_O ( italic_η start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) , (25)

where ##\## denotes the image measure of μ𝜇\muitalic_μ by ΦηsubscriptΦ𝜂\Phi_{\eta}roman_Φ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT: For any bounded measurable test function φ:𝒳:𝜑𝒳\varphi\colon\mathcal{X}\to\mathbb{R}italic_φ : caligraphic_X → blackboard_R,

𝒳φ𝑑μ~η=𝒳φΦη𝑑μ.subscript𝒳𝜑differential-dsubscript~𝜇𝜂subscript𝒳𝜑subscriptΦ𝜂differential-d𝜇\int_{\mathcal{X}}\varphi\,d\widetilde{\mu}_{\eta}=\int_{\mathcal{X}}\varphi% \circ\Phi_{\eta}\,d\mu.∫ start_POSTSUBSCRIPT caligraphic_X end_POSTSUBSCRIPT italic_φ italic_d over~ start_ARG italic_μ end_ARG start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT = ∫ start_POSTSUBSCRIPT caligraphic_X end_POSTSUBSCRIPT italic_φ ∘ roman_Φ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT italic_d italic_μ . (26)

We look for a map ΦηsubscriptΦ𝜂\Phi_{\eta}roman_Φ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT of the form

Φη(x)=x+ηφ1(x),subscriptΦ𝜂𝑥𝑥𝜂subscript𝜑1𝑥\Phi_{\eta}(x)=x+\eta\varphi_{1}(x),roman_Φ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT ( italic_x ) = italic_x + italic_η italic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ( italic_x ) , (27)

where φ1subscript𝜑1\varphi_{1}italic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT is determined by (25). It is in fact given by a solution to the partial differential equation (PDE) (43) below, as made precise in Proposition 1. Note that (27) can be formulated as a map higher than first-order in η𝜂\etaitalic_η; see Section 3.2.2 for a discussion of this point.

The transient subtraction technique then amounts to evolving synchronously coupled equilibrium dynamics starting from initial conditions which are deterministically related:

{dYt0=b(Yt0)dt+σdWt,Y00μ,dXtη=b(Xtη)dt+σdWt,X0η=Φη(Y00).cases𝑑superscriptsubscript𝑌𝑡0formulae-sequenceabsent𝑏superscriptsubscript𝑌𝑡0𝑑𝑡𝜎𝑑subscript𝑊𝑡similar-tosuperscriptsubscript𝑌00𝜇𝑑superscriptsubscript𝑋𝑡𝜂formulae-sequenceabsent𝑏superscriptsubscript𝑋𝑡𝜂𝑑𝑡𝜎𝑑subscript𝑊𝑡superscriptsubscript𝑋0𝜂subscriptΦ𝜂superscriptsubscript𝑌00otherwise\begin{cases}\begin{aligned} dY_{t}^{0}&=b(Y_{t}^{0})\,dt+\sigma\,dW_{t},% \qquad Y_{0}^{0}\sim\mu,\\ dX_{t}^{\eta}&=b(X_{t}^{\eta})\,dt+\sigma\,dW_{t},\qquad X_{0}^{\eta}=\Phi_{% \eta}(Y_{0}^{0}).\end{aligned}\end{cases}{ start_ROW start_CELL start_ROW start_CELL italic_d italic_Y start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT end_CELL start_CELL = italic_b ( italic_Y start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT ) italic_d italic_t + italic_σ italic_d italic_W start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT , italic_Y start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT ∼ italic_μ , end_CELL end_ROW start_ROW start_CELL italic_d italic_X start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT end_CELL start_CELL = italic_b ( italic_X start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT ) italic_d italic_t + italic_σ italic_d italic_W start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT , italic_X start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT = roman_Φ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT ( italic_Y start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT ) . end_CELL end_ROW end_CELL start_CELL end_CELL end_ROW (28)

We next perform error analysis on the transient subtraction technique estimator (17) for (Xtη)t0subscriptsuperscriptsubscript𝑋𝑡𝜂𝑡0(X_{t}^{\eta})_{t\geqslant 0}( italic_X start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT ) start_POSTSUBSCRIPT italic_t ⩾ 0 end_POSTSUBSCRIPT and (Yt0)t0subscriptsuperscriptsubscript𝑌𝑡0𝑡0(Y_{t}^{0})_{t\geqslant 0}( italic_Y start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT ) start_POSTSUBSCRIPT italic_t ⩾ 0 end_POSTSUBSCRIPT given by (28).

Refer to caption
Figure 1: Illustration of coupling measure on initial conditions.

3.2 Numerical analysis of the transient subtraction method

In this section, we perform error analysis on the transient subtraction estimator (17) realized with the dynamics (28). We start by making precise the functional setting and stating some estimates in Section 3.2.1. We then make precise in Section 3.2.2 the bounds on the bias, and finally discuss its variance in Section 3.2.3.

3.2.1 Functional estimates

Consider a family of Lyapunov functions (𝒦n)nsubscriptsubscript𝒦𝑛𝑛(\mathcal{K}_{n})_{n\in\mathbb{N}}( caligraphic_K start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ) start_POSTSUBSCRIPT italic_n ∈ blackboard_N end_POSTSUBSCRIPT with 𝒦n:𝒳[1,+):subscript𝒦𝑛𝒳1\mathcal{K}_{n}\colon\mathcal{X}\to[1,+\infty)caligraphic_K start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT : caligraphic_X → [ 1 , + ∞ ) such that

n,𝒦n𝒦n+1.formulae-sequencefor-all𝑛subscript𝒦𝑛subscript𝒦𝑛1\forall n\in\mathbb{N},\qquad\mathcal{K}_{n}\leqslant\mathcal{K}_{n+1}.∀ italic_n ∈ blackboard_N , caligraphic_K start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ⩽ caligraphic_K start_POSTSUBSCRIPT italic_n + 1 end_POSTSUBSCRIPT . (29)

The associated weighted Bsuperscript𝐵B^{\infty}italic_B start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT spaces are

Bn={φmeasurable|φBn:=supx𝒳|φ(x)𝒦n(x)|<+}.superscriptsubscript𝐵𝑛conditional-set𝜑measurableassignsubscriptnorm𝜑superscriptsubscriptBnsubscriptsupremumx𝒳𝜑xsubscript𝒦nxB_{n}^{\infty}=\left\{\varphi\,\rm{measurable}\;\middle|\;\|\varphi\|_{B_{n}^{% \infty}}:=\sup_{x\in\mathcal{X}}\left|\frac{\varphi(x)}{\mathcal{K}_{n}(x)}% \right|<+\infty\right\}.italic_B start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT = { italic_φ roman_measurable | ∥ italic_φ ∥ start_POSTSUBSCRIPT roman_B start_POSTSUBSCRIPT roman_n end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT := roman_sup start_POSTSUBSCRIPT roman_x ∈ caligraphic_X end_POSTSUBSCRIPT | divide start_ARG italic_φ ( roman_x ) end_ARG start_ARG caligraphic_K start_POSTSUBSCRIPT roman_n end_POSTSUBSCRIPT ( roman_x ) end_ARG | < + ∞ } . (30)

We next introduce the space 𝒮𝒮\mathscr{S}script_S of smooth functions φ𝜑\varphiitalic_φ belonging to the space Bnsubscriptsuperscript𝐵𝑛B^{\infty}_{n}italic_B start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT for some n𝑛nitalic_n, and whose derivatives also belong to Bmsubscriptsuperscript𝐵𝑚B^{\infty}_{m}italic_B start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT for some m𝑚mitalic_m:

𝒮={φC(𝒳)|kd,n,kφBn}.𝒮conditional-set𝜑superscript𝐶𝒳formulae-sequencefor-all𝑘superscript𝑑formulae-sequence𝑛superscript𝑘𝜑superscriptsubscript𝐵𝑛\mathscr{S}=\left\{\varphi\in C^{\infty}(\mathcal{X})\;\middle|\;\forall k\in% \mathbb{N}^{d},\;\exists n\in\mathbb{N},\;\partial^{k}\varphi\in B_{n}^{\infty% }\right\}.script_S = { italic_φ ∈ italic_C start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT ( caligraphic_X ) | ∀ italic_k ∈ blackboard_N start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT , ∃ italic_n ∈ blackboard_N , ∂ start_POSTSUPERSCRIPT italic_k end_POSTSUPERSCRIPT italic_φ ∈ italic_B start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT } . (31)

We finally define the subspace 𝒮0subscript𝒮0\mathscr{S}_{0}script_S start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT of functions in 𝒮𝒮\mathscr{S}script_S with average 0 with respect to μ𝜇\muitalic_μ.

We make the following assumptions on the Lyapunov functions.

Assumption 2 (Lyapunov estimates).

There exist n𝑛n\in\mathbb{N}italic_n ∈ blackboard_N and Cn+subscript𝐶𝑛superscriptC_{n}\in\mathbb{R}^{+}italic_C start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ∈ blackboard_R start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT such that

|x|Cn𝒦n(x).𝑥subscript𝐶𝑛subscript𝒦𝑛𝑥|x|\leqslant C_{n}\mathcal{K}_{n}(x).| italic_x | ⩽ italic_C start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT caligraphic_K start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ( italic_x ) . (32)

Furthermore, for any n𝑛n\in\mathbb{N}italic_n ∈ blackboard_N,

𝒦nL1(μ)<+.subscriptdelimited-∥∥subscript𝒦𝑛superscript𝐿1𝜇\lVert\mathcal{K}_{n}\rVert_{L^{1}(\mu)}<+\infty.∥ caligraphic_K start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ∥ start_POSTSUBSCRIPT italic_L start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT ( italic_μ ) end_POSTSUBSCRIPT < + ∞ . (33)

Moreover, we assume that the Lyapunov functions are stable by products: for any n,n𝑛superscript𝑛n,n^{\prime}\in\mathbb{N}italic_n , italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ∈ blackboard_N, there exist m𝑚m\in\mathbb{N}italic_m ∈ blackboard_N and Cn,n+subscript𝐶𝑛superscript𝑛superscriptC_{n,n^{\prime}}\in\mathbb{R}^{+}italic_C start_POSTSUBSCRIPT italic_n , italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT ∈ blackboard_R start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT such that

(𝒦n𝒦n)(x)Cn,n𝒦m(x).subscript𝒦𝑛subscript𝒦superscript𝑛𝑥subscriptsuperscript𝐶𝑛superscript𝑛subscript𝒦𝑚𝑥\left\lparen\mathcal{K}_{n}\mathcal{K}_{n^{\prime}}\right\rparen(x)\leqslant C% ^{\prime}_{n,n^{\prime}}\mathcal{K}_{m}(x).( caligraphic_K start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT caligraphic_K start_POSTSUBSCRIPT italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT ) ( italic_x ) ⩽ italic_C start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_n , italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT caligraphic_K start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ( italic_x ) . (34)

We also assume stability by compositions: for any n,n𝑛superscript𝑛n,n^{\prime}\in\mathbb{N}italic_n , italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ∈ blackboard_N and α+superscript𝛼superscript\alpha^{*}\in\mathbb{R}^{+}italic_α start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ∈ blackboard_R start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT, there exist m𝑚m\in\mathbb{N}italic_m ∈ blackboard_N and Cn,n,α+subscript𝐶𝑛superscript𝑛superscript𝛼superscriptC_{n,n^{\prime},\alpha^{*}}\in\mathbb{R}^{+}italic_C start_POSTSUBSCRIPT italic_n , italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_α start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT ∈ blackboard_R start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT such that

α[0,α],𝒦n(α𝒦n(x))Cn,n,α𝒦m(x).formulae-sequencefor-all𝛼0superscript𝛼subscript𝒦𝑛𝛼subscript𝒦superscript𝑛𝑥subscript𝐶𝑛superscript𝑛superscript𝛼subscript𝒦𝑚𝑥\forall\alpha\in[0,\alpha^{*}],\qquad\mathcal{K}_{n}(\alpha\mathcal{K}_{n^{% \prime}}(x))\leqslant C_{n,n^{\prime},\alpha^{*}}\mathcal{K}_{m}(x).∀ italic_α ∈ [ 0 , italic_α start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ] , caligraphic_K start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ( italic_α caligraphic_K start_POSTSUBSCRIPT italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT ( italic_x ) ) ⩽ italic_C start_POSTSUBSCRIPT italic_n , italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_α start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT caligraphic_K start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ( italic_x ) . (35)

Lastly, we assume that 𝒦nsubscript𝒦𝑛\mathcal{K}_{n}caligraphic_K start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT is nondecreasing:

(i=1,,d,|yi||zi|)𝒦n(y)𝒦n(z).formulae-sequencefor-all𝑖1𝑑subscript𝑦𝑖subscript𝑧𝑖subscript𝒦𝑛𝑦subscript𝒦𝑛𝑧\left\lparen\forall i=1,\dotsc,d,\quad|y_{i}|\leqslant|z_{i}|\right\rparen% \implies\mathcal{K}_{n}(y)\leqslant\mathcal{K}_{n}(z).( ∀ italic_i = 1 , … , italic_d , | italic_y start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT | ⩽ | italic_z start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT | ) ⟹ caligraphic_K start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ( italic_y ) ⩽ caligraphic_K start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ( italic_z ) . (36)

A useful corollary of Assumption 2, which we will use in our estimates, is the following: for any fBn𝑓subscriptsuperscript𝐵𝑛f\in B^{\infty}_{n}italic_f ∈ italic_B start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT and g=(g1,,gd)𝑔subscript𝑔1subscript𝑔𝑑g=(g_{1},\dotsc,g_{d})italic_g = ( italic_g start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_g start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT ) with giBnsubscript𝑔𝑖subscriptsuperscript𝐵superscript𝑛g_{i}\in B^{\infty}_{n^{\prime}}italic_g start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∈ italic_B start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT,

|fg|(x)fBn𝒦ng(x)fBn𝒦n(gBn𝒦n(x))fBnKn,n,gBn𝒦m(x),𝑓𝑔𝑥subscriptnorm𝑓superscriptsubscript𝐵𝑛subscript𝒦𝑛𝑔𝑥subscriptnorm𝑓superscriptsubscript𝐵𝑛subscript𝒦𝑛subscriptnorm𝑔subscriptsuperscript𝐵superscript𝑛subscript𝒦superscript𝑛𝑥subscriptdelimited-∥∥𝑓subscriptsuperscript𝐵𝑛subscript𝐾𝑛superscript𝑛subscriptnorm𝑔subscriptsuperscript𝐵superscript𝑛subscript𝒦𝑚𝑥\displaystyle|f\circ g|(x)\leqslant\|f\|_{B_{n}^{\infty}}\mathcal{K}_{n}\circ g% (x)\leqslant\|f\|_{B_{n}^{\infty}}\mathcal{K}_{n}\left\lparen\|g\|_{B^{\infty}% _{n^{\prime}}}\mathcal{K}_{n^{\prime}}(x)\right\rparen\leqslant\lVert f\rVert_% {B^{\infty}_{n}}K_{n,n^{\prime},\|g\|_{B^{\infty}_{n^{\prime}}}}\mathcal{K}_{m% }(x),| italic_f ∘ italic_g | ( italic_x ) ⩽ ∥ italic_f ∥ start_POSTSUBSCRIPT italic_B start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT caligraphic_K start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ∘ italic_g ( italic_x ) ⩽ ∥ italic_f ∥ start_POSTSUBSCRIPT italic_B start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT caligraphic_K start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ( ∥ italic_g ∥ start_POSTSUBSCRIPT italic_B start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT end_POSTSUBSCRIPT caligraphic_K start_POSTSUBSCRIPT italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT ( italic_x ) ) ⩽ ∥ italic_f ∥ start_POSTSUBSCRIPT italic_B start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_K start_POSTSUBSCRIPT italic_n , italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , ∥ italic_g ∥ start_POSTSUBSCRIPT italic_B start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT end_POSTSUBSCRIPT end_POSTSUBSCRIPT caligraphic_K start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ( italic_x ) , (37)

with m𝑚mitalic_m depending on n𝑛nitalic_n and nsuperscript𝑛n^{\prime}italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT.

A typical choice for 𝒦nsubscript𝒦𝑛\mathcal{K}_{n}caligraphic_K start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT are polynomial Lyapunov functions of the form 𝒦n(x)=1+|x|nsubscript𝒦𝑛𝑥1superscript𝑥𝑛\mathcal{K}_{n}(x)=1+|x|^{n}caligraphic_K start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ( italic_x ) = 1 + | italic_x | start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT. This is a standard choice for Langevin dynamics; see [mattingly2002, talay2002]. This choice satisfies Assumption 2 when μ𝜇\muitalic_μ has moments of all orders, which is a mild requirement.

We also make an assumption on the convergence of the semigroup etsuperscripte𝑡\mathrm{e}^{t\mathcal{L}}roman_e start_POSTSUPERSCRIPT italic_t caligraphic_L end_POSTSUPERSCRIPT in weighted Bsuperscript𝐵B^{\infty}italic_B start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT spaces. To this end, we introduce the space Bnsubscriptsuperscript𝐵𝑛B^{\infty}_{n}italic_B start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT of functions with average 0 with respect to μ𝜇\muitalic_μ:

Bn,0={φBn|𝒳φ𝑑μ=0}.superscriptsubscript𝐵𝑛0conditional-set𝜑subscriptsuperscript𝐵𝑛subscript𝒳𝜑differential-d𝜇0B_{n,0}^{\infty}=\left\{\varphi\in B^{\infty}_{n}\;\middle|\;\int_{\mathcal{X}% }\varphi\,d\mu=0\right\}.italic_B start_POSTSUBSCRIPT italic_n , 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT = { italic_φ ∈ italic_B start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT | ∫ start_POSTSUBSCRIPT caligraphic_X end_POSTSUBSCRIPT italic_φ italic_d italic_μ = 0 } . (38)
Assumption 3 (Decay estimates on semigroup operator).

For any n𝑛n\in\mathbb{N}italic_n ∈ blackboard_N, there exist Ln+subscript𝐿𝑛superscriptL_{n}\in\mathbb{R}^{+}italic_L start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ∈ blackboard_R start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT and λn>0subscript𝜆𝑛0\lambda_{n}>0italic_λ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT > 0 such that

φBn,0,etφBnLneλntφBn.formulae-sequencefor-all𝜑superscriptsubscript𝐵𝑛0subscriptnormsuperscripte𝑡𝜑subscriptsuperscript𝐵𝑛subscript𝐿𝑛superscriptesubscript𝜆𝑛𝑡subscriptdelimited-∥∥𝜑subscriptsuperscript𝐵𝑛\forall\varphi\in B_{n,0}^{\infty},\qquad\|\mathrm{e}^{t\mathcal{L}}\varphi\|_% {B^{\infty}_{n}}\leqslant L_{n}\mathrm{e}^{-\lambda_{n}t}\lVert\varphi\rVert_{% B^{\infty}_{n}}.∀ italic_φ ∈ italic_B start_POSTSUBSCRIPT italic_n , 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT , ∥ roman_e start_POSTSUPERSCRIPT italic_t caligraphic_L end_POSTSUPERSCRIPT italic_φ ∥ start_POSTSUBSCRIPT italic_B start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT end_POSTSUBSCRIPT ⩽ italic_L start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT roman_e start_POSTSUPERSCRIPT - italic_λ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT italic_t end_POSTSUPERSCRIPT ∥ italic_φ ∥ start_POSTSUBSCRIPT italic_B start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT end_POSTSUBSCRIPT . (39)

As a direct corollary of Assumption 3, the operator \mathcal{L}caligraphic_L is invertible on Bn,0superscriptsubscript𝐵𝑛0B_{n,0}^{\infty}italic_B start_POSTSUBSCRIPT italic_n , 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT, with

1=0+et𝑑t.superscript1superscriptsubscript0superscripte𝑡differential-d𝑡\mathcal{L}^{-1}=-\int_{0}^{+\infty}\mathrm{e}^{t\mathcal{L}}\,dt.caligraphic_L start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT = - ∫ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT + ∞ end_POSTSUPERSCRIPT roman_e start_POSTSUPERSCRIPT italic_t caligraphic_L end_POSTSUPERSCRIPT italic_d italic_t . (40)

Moreover, the following bound holds

1BnLnλn.subscriptdelimited-∥∥superscript1subscriptsuperscript𝐵𝑛subscript𝐿𝑛subscript𝜆𝑛\lVert\mathcal{L}^{-1}\rVert_{B^{\infty}_{n}}\leqslant\frac{L_{n}}{\lambda_{n}}.∥ caligraphic_L start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT ∥ start_POSTSUBSCRIPT italic_B start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT end_POSTSUBSCRIPT ⩽ divide start_ARG italic_L start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT end_ARG start_ARG italic_λ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT end_ARG . (41)

We refer for instance to [lelievre2016, Section 2] for a discussion on sufficient conditions for Assumption 3 to hold (based on [reybellet2006, hairer2011]), and for the proof of (40).

3.2.2 Analysis of the bias

There are several sources of bias arising from the estimator (17), such as the time truncation and time discretization bias when considering numerical schemes to integrate the dynamics. Quantifying such biases is standard practice for estimators of this form. Additionally, there is a bias arising from the finiteness of η𝜂\etaitalic_η, which is the main result of this section. This is made precise in Corollary 1, which builds upon the estimates on the coupling measures provided by Proposition 1 below.

To state the result, we denote by 𝒜superscript𝒜\mathcal{A}^{*}caligraphic_A start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT the adjoint of a closed operator 𝒜𝒜\mathcal{A}caligraphic_A on L2(μ)superscript𝐿2𝜇L^{2}(\mu)italic_L start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( italic_μ ): for any test functions φ,ϕC𝜑italic-ϕsuperscript𝐶\varphi,\phi\in C^{\infty}italic_φ , italic_ϕ ∈ italic_C start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT with compact support,

𝒳(𝒜φ)ϕ𝑑μ=𝒳φ(𝒜ϕ)𝑑μ.subscript𝒳𝒜𝜑italic-ϕdifferential-d𝜇subscript𝒳𝜑superscript𝒜italic-ϕdifferential-d𝜇\int_{\mathcal{X}}(\mathcal{A}\varphi)\phi\,d\mu=\int_{\mathcal{X}}\varphi(% \mathcal{A}^{*}\phi)\,d\mu.∫ start_POSTSUBSCRIPT caligraphic_X end_POSTSUBSCRIPT ( caligraphic_A italic_φ ) italic_ϕ italic_d italic_μ = ∫ start_POSTSUBSCRIPT caligraphic_X end_POSTSUBSCRIPT italic_φ ( caligraphic_A start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT italic_ϕ ) italic_d italic_μ . (42)
Proposition 1 (Finite η𝜂\etaitalic_η bias).

Suppose that Assumption 2 holds true and that, for S𝒮0𝑆subscript𝒮0S\in\mathscr{S}_{0}italic_S ∈ script_S start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT, there exist solutions φ1=(φ1,x1,,φ1,xd)(Bn)dsubscript𝜑1subscript𝜑1subscript𝑥1subscript𝜑1subscript𝑥𝑑superscriptsubscriptsuperscript𝐵𝑛𝑑\varphi_{1}=(\varphi_{1,x_{1}},\dotsc,\varphi_{1,x_{d}})\in(B^{\infty}_{n})^{d}italic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT = ( italic_φ start_POSTSUBSCRIPT 1 , italic_x start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT , … , italic_φ start_POSTSUBSCRIPT 1 , italic_x start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT end_POSTSUBSCRIPT ) ∈ ( italic_B start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT and φ2=(φ2,x1,,φ2,xd)(Bn)dsubscript𝜑2subscript𝜑2subscript𝑥1subscript𝜑2subscript𝑥𝑑superscriptsubscriptsuperscript𝐵𝑛𝑑\varphi_{2}=(\varphi_{2,x_{1}},\dotsc,\varphi_{2,x_{d}})\in(B^{\infty}_{n})^{d}italic_φ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT = ( italic_φ start_POSTSUBSCRIPT 2 , italic_x start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT , … , italic_φ start_POSTSUBSCRIPT 2 , italic_x start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT end_POSTSUBSCRIPT ) ∈ ( italic_B start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT for some n𝑛n\in\mathbb{N}italic_n ∈ blackboard_N to the equations

φ1=i=1dxiφ1,xi=S,superscriptsubscript𝜑1superscriptsubscript𝑖1𝑑superscriptsubscriptsubscript𝑥𝑖subscript𝜑1subscript𝑥𝑖𝑆\nabla^{*}\varphi_{1}=\sum_{i=1}^{d}\partial_{x_{i}}^{*}\varphi_{1,x_{i}}=S,∇ start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT italic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT = ∑ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT ∂ start_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT italic_φ start_POSTSUBSCRIPT 1 , italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT = italic_S , (43)

and

φ2=12i,j=1dxixj(φ1,xiφ1,xj)=12()2:φ1φ1.:superscriptsubscript𝜑212superscriptsubscript𝑖𝑗1𝑑superscriptsubscriptsubscript𝑥𝑖superscriptsubscriptsubscript𝑥𝑗subscript𝜑1subscript𝑥𝑖subscript𝜑1subscript𝑥𝑗12superscriptsuperscript2tensor-productsubscript𝜑1subscript𝜑1\nabla^{*}\varphi_{2}=-\frac{1}{2}\sum_{i,j=1}^{d}\partial_{x_{i}}^{*}\partial% _{x_{j}}^{*}(\varphi_{1,x_{i}}\varphi_{1,x_{j}})=-\frac{1}{2}(\nabla^{*})^{2}% \colon\varphi_{1}\otimes\varphi_{1}.∇ start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT italic_φ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT = - divide start_ARG 1 end_ARG start_ARG 2 end_ARG ∑ start_POSTSUBSCRIPT italic_i , italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT ∂ start_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ∂ start_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ( italic_φ start_POSTSUBSCRIPT 1 , italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_φ start_POSTSUBSCRIPT 1 , italic_x start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT ) = - divide start_ARG 1 end_ARG start_ARG 2 end_ARG ( ∇ start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT : italic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ⊗ italic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT . (44)

Fix η>0subscript𝜂0\eta_{*}>0italic_η start_POSTSUBSCRIPT ∗ end_POSTSUBSCRIPT > 0, and f𝒮𝑓𝒮f\in\mathscr{S}italic_f ∈ script_S. Then, there exists Kf,η+subscript𝐾𝑓subscript𝜂subscriptK_{f,\eta_{*}}\in\mathbb{R}_{+}italic_K start_POSTSUBSCRIPT italic_f , italic_η start_POSTSUBSCRIPT ∗ end_POSTSUBSCRIPT end_POSTSUBSCRIPT ∈ blackboard_R start_POSTSUBSCRIPT + end_POSTSUBSCRIPT (which depends on f𝑓fitalic_f and ηsubscript𝜂\eta_{*}italic_η start_POSTSUBSCRIPT ∗ end_POSTSUBSCRIPT) such that, for any |η|η𝜂subscript𝜂|\eta|\leqslant\eta_{*}| italic_η | ⩽ italic_η start_POSTSUBSCRIPT ∗ end_POSTSUBSCRIPT,

|𝒳fΦηα𝑑μ𝒳f𝑑μη𝒳fS𝑑μ|ηα+1𝒞f,η,subscript𝒳𝑓subscriptsuperscriptΦ𝛼𝜂differential-d𝜇subscript𝒳𝑓differential-d𝜇𝜂subscript𝒳𝑓𝑆differential-d𝜇superscript𝜂𝛼1subscript𝒞𝑓subscript𝜂\qquad\left|\int_{\mathcal{X}}f\circ\Phi^{\alpha}_{\eta}\,d\mu-\int_{\mathcal{% X}}f\,d\mu-\eta\int_{\mathcal{X}}fS\,d\mu\right|\leqslant\eta^{\alpha+1}% \mathcal{C}_{f,\eta_{*}},| ∫ start_POSTSUBSCRIPT caligraphic_X end_POSTSUBSCRIPT italic_f ∘ roman_Φ start_POSTSUPERSCRIPT italic_α end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT italic_d italic_μ - ∫ start_POSTSUBSCRIPT caligraphic_X end_POSTSUBSCRIPT italic_f italic_d italic_μ - italic_η ∫ start_POSTSUBSCRIPT caligraphic_X end_POSTSUBSCRIPT italic_f italic_S italic_d italic_μ | ⩽ italic_η start_POSTSUPERSCRIPT italic_α + 1 end_POSTSUPERSCRIPT caligraphic_C start_POSTSUBSCRIPT italic_f , italic_η start_POSTSUBSCRIPT ∗ end_POSTSUBSCRIPT end_POSTSUBSCRIPT , (45)

with α=1𝛼1\alpha=1italic_α = 1 or α=2𝛼2\alpha=2italic_α = 2, and

{Φη1(x)=x+ηφ1(x),Φη2(x)=x+ηφ1(x)+η2φ2(x).casesmissing-subexpressionsuperscriptsubscriptΦ𝜂1𝑥𝑥𝜂subscript𝜑1𝑥missing-subexpressionsuperscriptsubscriptΦ𝜂2𝑥𝑥𝜂subscript𝜑1𝑥superscript𝜂2subscript𝜑2𝑥otherwise\begin{cases}\begin{aligned} &\Phi_{\eta}^{1}(x)=x+\eta\varphi_{1}(x),\\ &\Phi_{\eta}^{2}(x)=x+\eta\varphi_{1}(x)+\eta^{2}\varphi_{2}(x).\end{aligned}% \end{cases}{ start_ROW start_CELL start_ROW start_CELL end_CELL start_CELL roman_Φ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT ( italic_x ) = italic_x + italic_η italic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ( italic_x ) , end_CELL end_ROW start_ROW start_CELL end_CELL start_CELL roman_Φ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( italic_x ) = italic_x + italic_η italic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ( italic_x ) + italic_η start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT italic_φ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ( italic_x ) . end_CELL end_ROW end_CELL start_CELL end_CELL end_ROW (46)
Remark 2.

The smoothness condition on f𝑓fitalic_f can be weakened, as it would be enough to have derivatives of f𝑓fitalic_f up to order 3 in Bmsubscriptsuperscript𝐵𝑚B^{\infty}_{m}italic_B start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT. In order to simplify the presentation of the result, however, we suppose f𝒮𝑓𝒮f\in\mathscr{S}italic_f ∈ script_S here.

This result states that the finite η𝜂\etaitalic_η bias in the linear response is of order O(ηα)Osuperscript𝜂𝛼\mathrm{O}(\eta^{\alpha})roman_O ( italic_η start_POSTSUPERSCRIPT italic_α end_POSTSUPERSCRIPT ) for a map ΦηsubscriptΦ𝜂\Phi_{\eta}roman_Φ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT which includes well-chosen terms up to order O(ηα)Osuperscript𝜂𝛼\mathrm{O}(\eta^{\alpha})roman_O ( italic_η start_POSTSUPERSCRIPT italic_α end_POSTSUPERSCRIPT ). It is of course possible to construct higher-order corrections in order to further decrease the bias. The associated PDEs for the corresponding φisubscript𝜑𝑖\varphi_{i}italic_φ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT terms, however, become increasingly cumbersome to solve, rendering it an impractical approach. In any case, a second-order map leads to an estimator with O(η2)Osuperscript𝜂2\mathrm{O}(\eta^{2})roman_O ( italic_η start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) bias, which is sufficiently small in general.

Remark 3 (Well-posedness of PDEs).

Although (43) might look difficult to solve, one can show that it admits gradient solutions of the form φ1=ψsubscript𝜑1𝜓\varphi_{1}=\nabla\psiitalic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT = ∇ italic_ψ under some conditions on μ𝜇\muitalic_μ. Indeed, (43) can then be written as

ψ=S,superscript𝜓𝑆\nabla^{*}\nabla\psi=S,∇ start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ∇ italic_ψ = italic_S , (47)

which has a unique solution when superscript\nabla^{*}\nabla∇ start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ∇ has a spectral gap when considered as an operator on L2(μ)superscript𝐿2𝜇L^{2}(\mu)italic_L start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( italic_μ ) (implied by μ𝜇\muitalic_μ satisfying a Poincaré inequality).Note that the solutions φ1subscript𝜑1\varphi_{1}italic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT and φ2subscript𝜑2\varphi_{2}italic_φ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT are defined up to an element of the kernel of superscript\nabla^{*}∇ start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT, i.e. if ψ𝜓\nabla\psi∇ italic_ψ is a solution to (43), then φ1=ψ+gsubscript𝜑1𝜓𝑔\varphi_{1}=\nabla\psi+gitalic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT = ∇ italic_ψ + italic_g with g=0superscript𝑔0\nabla^{*}g=0∇ start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT italic_g = 0 also an admissible solution.

Proof of Proposition 1.

It suffices to prove the result for α=2𝛼2\alpha=2italic_α = 2, from which the result for α=1𝛼1\alpha=1italic_α = 1 can be trivially deduced. By a Taylor expansion of f(Φη(x))𝑓subscriptΦ𝜂𝑥f(\Phi_{\eta}(x))italic_f ( roman_Φ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT ( italic_x ) ),

f(Φη(x))=f(x)𝑓subscriptΦ𝜂𝑥𝑓𝑥\displaystyle f(\Phi_{\eta}(x))=f(x)italic_f ( roman_Φ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT ( italic_x ) ) = italic_f ( italic_x ) +ηf(x)𝖳(φ1(x)+ηφ2(x))𝜂𝑓superscript𝑥𝖳subscript𝜑1𝑥𝜂subscript𝜑2𝑥\displaystyle+\eta\nabla f(x)^{\mathsf{T}}(\varphi_{1}(x)+\eta\varphi_{2}(x))+ italic_η ∇ italic_f ( italic_x ) start_POSTSUPERSCRIPT sansserif_T end_POSTSUPERSCRIPT ( italic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ( italic_x ) + italic_η italic_φ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ( italic_x ) ) (48)
+η22(φ1(x)+ηφ2(x))𝖳2f(x)(φ1(x)+ηφ2(x))superscript𝜂22superscriptsubscript𝜑1𝑥𝜂subscript𝜑2𝑥𝖳superscript2𝑓𝑥subscript𝜑1𝑥𝜂subscript𝜑2𝑥\displaystyle+\frac{\eta^{2}}{2}(\varphi_{1}(x)+\eta\varphi_{2}(x))^{\mathsf{T% }}\nabla^{2}f(x)(\varphi_{1}(x)+\eta\varphi_{2}(x))+ divide start_ARG italic_η start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 2 end_ARG ( italic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ( italic_x ) + italic_η italic_φ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ( italic_x ) ) start_POSTSUPERSCRIPT sansserif_T end_POSTSUPERSCRIPT ∇ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT italic_f ( italic_x ) ( italic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ( italic_x ) + italic_η italic_φ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ( italic_x ) ) (49)
+η363f(Θη(x))(φ1(x)+ηφ2(x))3,superscript𝜂36superscript3𝑓subscriptΘ𝜂𝑥superscriptsubscript𝜑1𝑥𝜂subscript𝜑2𝑥tensor-productabsent3\displaystyle+\frac{\eta^{3}}{6}\nabla^{3}f\left(\Theta_{\eta}(x)\right)\cdot(% \varphi_{1}(x)+\eta\varphi_{2}(x))^{\otimes 3},+ divide start_ARG italic_η start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT end_ARG start_ARG 6 end_ARG ∇ start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT italic_f ( roman_Θ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT ( italic_x ) ) ⋅ ( italic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ( italic_x ) + italic_η italic_φ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ( italic_x ) ) start_POSTSUPERSCRIPT ⊗ 3 end_POSTSUPERSCRIPT , (50)

where 3fsuperscript3𝑓\nabla^{3}f∇ start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT italic_f denotes the third-order derivative tensor (and 2fsuperscript2𝑓\nabla^{2}f∇ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT italic_f denotes the Hessian), and Θη(x)subscriptΘ𝜂𝑥\Theta_{\eta}(x)roman_Θ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT ( italic_x ) interpolates between x𝑥xitalic_x and Φη(x)subscriptΦ𝜂𝑥\Phi_{\eta}(x)roman_Φ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT ( italic_x ):

Θη(x)=(1θη(x))x+θη(x)Φη(x),θη(x)[0,1].formulae-sequencesubscriptΘ𝜂𝑥1subscript𝜃𝜂𝑥𝑥subscript𝜃𝜂𝑥subscriptΦ𝜂𝑥subscript𝜃𝜂𝑥01\Theta_{\eta}(x)=(1-\theta_{\eta}(x))x+\theta_{\eta}(x)\Phi_{\eta}(x),\qquad% \theta_{\eta}(x)\in[0,1].roman_Θ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT ( italic_x ) = ( 1 - italic_θ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT ( italic_x ) ) italic_x + italic_θ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT ( italic_x ) roman_Φ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT ( italic_x ) , italic_θ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT ( italic_x ) ∈ [ 0 , 1 ] . (51)

Integrating the Taylor expansion above yields

𝒳fΦη𝑑μsubscript𝒳𝑓subscriptΦ𝜂differential-d𝜇\displaystyle\int_{\mathcal{X}}f\circ\Phi_{\eta}\,d\mu∫ start_POSTSUBSCRIPT caligraphic_X end_POSTSUBSCRIPT italic_f ∘ roman_Φ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT italic_d italic_μ =𝒳f𝑑μ+η𝒳f𝖳φ1dμ+η2𝒳(φ2𝖳f+12φ1𝖳(2f)φ1)𝑑μ+η33,η,absentsubscript𝒳𝑓differential-d𝜇𝜂subscript𝒳superscript𝑓𝖳subscript𝜑1𝑑𝜇superscript𝜂2subscript𝒳superscriptsubscript𝜑2𝖳𝑓12superscriptsubscript𝜑1𝖳superscript2𝑓subscript𝜑1differential-d𝜇superscript𝜂3subscript3𝜂\displaystyle=\int_{\mathcal{X}}f\,d\mu+\eta\int_{\mathcal{X}}\nabla f^{% \mathsf{T}}\varphi_{1}\,d\mu+\eta^{2}\int_{\mathcal{X}}\left\lparen\varphi_{2}% ^{\mathsf{T}}\nabla f+\frac{1}{2}\varphi_{1}^{\mathsf{T}}(\nabla^{2}f)\varphi_% {1}\right\rparen\,d\mu+\eta^{3}\mathcal{R}_{3,\eta},= ∫ start_POSTSUBSCRIPT caligraphic_X end_POSTSUBSCRIPT italic_f italic_d italic_μ + italic_η ∫ start_POSTSUBSCRIPT caligraphic_X end_POSTSUBSCRIPT ∇ italic_f start_POSTSUPERSCRIPT sansserif_T end_POSTSUPERSCRIPT italic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT italic_d italic_μ + italic_η start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ∫ start_POSTSUBSCRIPT caligraphic_X end_POSTSUBSCRIPT ( italic_φ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT sansserif_T end_POSTSUPERSCRIPT ∇ italic_f + divide start_ARG 1 end_ARG start_ARG 2 end_ARG italic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT sansserif_T end_POSTSUPERSCRIPT ( ∇ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT italic_f ) italic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) italic_d italic_μ + italic_η start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT caligraphic_R start_POSTSUBSCRIPT 3 , italic_η end_POSTSUBSCRIPT , (52)

with 3,ηsubscript3𝜂\mathcal{R}_{3,\eta}caligraphic_R start_POSTSUBSCRIPT 3 , italic_η end_POSTSUBSCRIPT given by

3,η=𝒳[163f(Θη)(φ1+ηφ2)3+φ1𝖳(2f)φ2+η2φ2𝖳(2f)φ2]𝑑μ.subscript3𝜂subscript𝒳delimited-[]16superscript3𝑓subscriptΘ𝜂superscriptsubscript𝜑1𝜂subscript𝜑2tensor-productabsent3superscriptsubscript𝜑1𝖳superscript2𝑓subscript𝜑2𝜂2superscriptsubscript𝜑2𝖳superscript2𝑓subscript𝜑2differential-d𝜇\mathcal{R}_{3,\eta}=\int_{\mathcal{X}}\left[\frac{1}{6}\nabla^{3}f\left(% \Theta_{\eta}\right)\cdot(\varphi_{1}+\eta\varphi_{2})^{\otimes 3}+\varphi_{1}% ^{\mathsf{T}}(\nabla^{2}f)\varphi_{2}+\frac{\eta}{2}\varphi_{2}^{\mathsf{T}}(% \nabla^{2}f)\varphi_{2}\right]\,d\mu.caligraphic_R start_POSTSUBSCRIPT 3 , italic_η end_POSTSUBSCRIPT = ∫ start_POSTSUBSCRIPT caligraphic_X end_POSTSUBSCRIPT [ divide start_ARG 1 end_ARG start_ARG 6 end_ARG ∇ start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT italic_f ( roman_Θ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT ) ⋅ ( italic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT + italic_η italic_φ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT ⊗ 3 end_POSTSUPERSCRIPT + italic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT sansserif_T end_POSTSUPERSCRIPT ( ∇ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT italic_f ) italic_φ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT + divide start_ARG italic_η end_ARG start_ARG 2 end_ARG italic_φ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT sansserif_T end_POSTSUPERSCRIPT ( ∇ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT italic_f ) italic_φ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ] italic_d italic_μ . (53)

We next show that the remainder term 3,ηsubscript3𝜂\mathcal{R}_{3,\eta}caligraphic_R start_POSTSUBSCRIPT 3 , italic_η end_POSTSUBSCRIPT is uniformly bounded. We start with the first right-hand side integrand term in (53). Since φ1,φ2subscript𝜑1subscript𝜑2\varphi_{1},\varphi_{2}italic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_φ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT have components in Bnsubscriptsuperscript𝐵𝑛B^{\infty}_{n}italic_B start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT for some n𝑛n\in\mathbb{N}italic_n ∈ blackboard_N, we deduce that so does ΦηBnsubscriptΦ𝜂subscriptsuperscript𝐵𝑛\Phi_{\eta}\in B^{\infty}_{n}roman_Φ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT ∈ italic_B start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT (since the identity is in Bnsubscriptsuperscript𝐵𝑛B^{\infty}_{n}italic_B start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT upon possibly increasing n𝑛nitalic_n in view of (32)), and thus also ΘηsubscriptΘ𝜂\Theta_{\eta}roman_Θ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT. Therefore, since f𝒮𝑓𝒮f\in\mathscr{S}italic_f ∈ script_S and in view of (35) and (37), there exist m,n𝑚superscript𝑛m,n^{\prime}\in\mathbb{N}italic_m , italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ∈ blackboard_N and C+𝐶superscriptC\in\mathbb{R}^{+}italic_C ∈ blackboard_R start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT such that, for all x𝒳𝑥𝒳x\in\mathcal{X}italic_x ∈ caligraphic_X and all η[η,η]𝜂subscript𝜂subscript𝜂\eta\in[-\eta_{*},\eta_{*}]italic_η ∈ [ - italic_η start_POSTSUBSCRIPT ∗ end_POSTSUBSCRIPT , italic_η start_POSTSUBSCRIPT ∗ end_POSTSUBSCRIPT ],

|3f(Θη(x))|Ci,j,k=1dxi,xj,xk3fBm𝒦n(x),superscript3𝑓subscriptΘ𝜂𝑥𝐶superscriptsubscript𝑖𝑗𝑘1𝑑subscriptdelimited-∥∥superscriptsubscriptsubscript𝑥𝑖subscript𝑥𝑗subscript𝑥𝑘3𝑓superscriptsubscript𝐵𝑚subscript𝒦superscript𝑛𝑥\displaystyle\left\lvert\nabla^{3}f\left\lparen\Theta_{\eta}(x)\right\rparen% \right\rvert\leqslant C\smash{\sum_{i,j,k=1}^{d}}\left\lVert\partial_{x_{i},x_% {j},x_{k}}^{3}f\right\rVert_{B_{m}^{\infty}}\mathcal{K}_{n^{\prime}}(x),| ∇ start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT italic_f ( roman_Θ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT ( italic_x ) ) | ⩽ italic_C ∑ start_POSTSUBSCRIPT italic_i , italic_j , italic_k = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT ∥ ∂ start_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT , italic_x start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT , italic_x start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT italic_f ∥ start_POSTSUBSCRIPT italic_B start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT caligraphic_K start_POSTSUBSCRIPT italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT ( italic_x ) , (54)

so that

|𝒳3f(Θη(x))(φ1(x)+ηφ2(x))3𝑑μ|subscript𝒳superscript3𝑓subscriptΘ𝜂𝑥superscriptsubscript𝜑1𝑥𝜂subscript𝜑2𝑥tensor-productabsent3differential-d𝜇\displaystyle\left|\int_{\mathcal{X}}\nabla^{3}f\left\lparen\Theta_{\eta}(x)% \right\rparen\cdot(\varphi_{1}(x)+\eta\varphi_{2}(x))^{\otimes 3}\,d\mu\right|| ∫ start_POSTSUBSCRIPT caligraphic_X end_POSTSUBSCRIPT ∇ start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT italic_f ( roman_Θ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT ( italic_x ) ) ⋅ ( italic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ( italic_x ) + italic_η italic_φ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ( italic_x ) ) start_POSTSUPERSCRIPT ⊗ 3 end_POSTSUPERSCRIPT italic_d italic_μ | (55)
Ci,j,k=1dxi,xj,xk3fBm(φ1Bn+ηφ2Bn)3𝒳𝒦n3𝒦n𝑑μ,absent𝐶superscriptsubscript𝑖𝑗𝑘1𝑑subscriptdelimited-∥∥superscriptsubscriptsubscript𝑥𝑖subscript𝑥𝑗subscript𝑥𝑘3𝑓superscriptsubscript𝐵𝑚superscriptsubscriptdelimited-∥∥subscript𝜑1superscriptsubscript𝐵𝑛𝜂subscriptdelimited-∥∥subscript𝜑2superscriptsubscript𝐵𝑛3subscript𝒳superscriptsubscript𝒦𝑛3subscript𝒦superscript𝑛differential-d𝜇\displaystyle\qquad\leqslant C\smash{\sum_{i,j,k=1}^{d}}\left\lVert\partial_{x% _{i},x_{j},x_{k}}^{3}f\right\rVert_{B_{m}^{\infty}}\left\lparen\lVert\varphi_{% 1}\rVert_{B_{n}^{\infty}}+\eta\lVert\varphi_{2}\rVert_{B_{n}^{\infty}}\right% \rparen^{3}\int_{\mathcal{X}}\mathcal{K}_{n}^{3}\mathcal{K}_{n^{\prime}}\,d\mu,⩽ italic_C ∑ start_POSTSUBSCRIPT italic_i , italic_j , italic_k = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT ∥ ∂ start_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT , italic_x start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT , italic_x start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT italic_f ∥ start_POSTSUBSCRIPT italic_B start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT ( ∥ italic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ∥ start_POSTSUBSCRIPT italic_B start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT + italic_η ∥ italic_φ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ∥ start_POSTSUBSCRIPT italic_B start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT ∫ start_POSTSUBSCRIPT caligraphic_X end_POSTSUBSCRIPT caligraphic_K start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT caligraphic_K start_POSTSUBSCRIPT italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT italic_d italic_μ , (56)

where (35) ensures uniformity in η𝜂\etaitalic_η for the bounds above. By Assumption 2, there exist C+𝐶superscriptC\in\mathbb{R}^{+}italic_C ∈ blackboard_R start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT and \ell\in\mathbb{N}roman_ℓ ∈ blackboard_N such that 𝒦n3𝒦mC𝒦superscriptsubscript𝒦𝑛3subscript𝒦𝑚𝐶subscript𝒦\mathcal{K}_{n}^{3}\mathcal{K}_{m}\leqslant C\mathcal{K}_{\ell}caligraphic_K start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT caligraphic_K start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ⩽ italic_C caligraphic_K start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT. Since 𝒦L1(μ)subscript𝒦superscript𝐿1𝜇\mathcal{K}_{\ell}\in L^{1}(\mu)caligraphic_K start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT ∈ italic_L start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT ( italic_μ ) by (33), we conclude that (56) is uniformly bounded for |η|η𝜂subscript𝜂|\eta|\leqslant\eta_{*}| italic_η | ⩽ italic_η start_POSTSUBSCRIPT ∗ end_POSTSUBSCRIPT. A similar computation shows that the remaining two integrand terms in (53) are also uniformly bounded.

To prove (45), it remains to show that the first and second-order terms in η𝜂\etaitalic_η in (45) and (52) coincide, which is by design of φ1subscript𝜑1\varphi_{1}italic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT and φ2subscript𝜑2\varphi_{2}italic_φ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT. Indeed, by taking L2(μ)superscript𝐿2𝜇L^{2}(\mu)italic_L start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( italic_μ )-adjoints and in view of condition (43) on φ1subscript𝜑1\varphi_{1}italic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT,

𝒳f𝖳φ1dμ=𝒳fφ1dμ=𝒳fS𝑑μ.subscript𝒳superscript𝑓𝖳subscript𝜑1𝑑𝜇subscript𝒳𝑓superscriptsubscript𝜑1𝑑𝜇subscript𝒳𝑓𝑆differential-d𝜇\int_{\mathcal{X}}\nabla f^{\mathsf{T}}\varphi_{1}\,d\mu=\int_{\mathcal{X}}f% \nabla^{*}\varphi_{1}\,d\mu=\int_{\mathcal{X}}fS\,d\mu.∫ start_POSTSUBSCRIPT caligraphic_X end_POSTSUBSCRIPT ∇ italic_f start_POSTSUPERSCRIPT sansserif_T end_POSTSUPERSCRIPT italic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT italic_d italic_μ = ∫ start_POSTSUBSCRIPT caligraphic_X end_POSTSUBSCRIPT italic_f ∇ start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT italic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT italic_d italic_μ = ∫ start_POSTSUBSCRIPT caligraphic_X end_POSTSUBSCRIPT italic_f italic_S italic_d italic_μ . (57)

Similarly, applying L2(μ)superscript𝐿2𝜇L^{2}(\mu)italic_L start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( italic_μ )-adjoints to the second-order η𝜂\etaitalic_η terms in (52) yields

𝒳(φ2𝖳f+12φ1𝖳(2f)φ1)dμ=𝒳(fφ2+12()2:φ1φ1)dμ,\displaystyle\int_{\mathcal{X}}\left\lparen\varphi_{2}^{\mathsf{T}}\nabla f+% \frac{1}{2}\varphi_{1}^{\mathsf{T}}(\nabla^{2}f)\varphi_{1}\right\rparen\,d\mu% =\int_{\mathcal{X}}\left\lparen f\nabla^{*}\varphi_{2}+\frac{1}{2}(\nabla^{*})% ^{2}\colon\varphi_{1}\otimes\varphi_{1}\right\rparen\,d\mu,∫ start_POSTSUBSCRIPT caligraphic_X end_POSTSUBSCRIPT ( italic_φ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT sansserif_T end_POSTSUPERSCRIPT ∇ italic_f + divide start_ARG 1 end_ARG start_ARG 2 end_ARG italic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT sansserif_T end_POSTSUPERSCRIPT ( ∇ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT italic_f ) italic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) italic_d italic_μ = ∫ start_POSTSUBSCRIPT caligraphic_X end_POSTSUBSCRIPT ( italic_f ∇ start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT italic_φ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT + divide start_ARG 1 end_ARG start_ARG 2 end_ARG ( ∇ start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT : italic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ⊗ italic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) italic_d italic_μ , (58)

which vanishes when φ2subscript𝜑2\varphi_{2}italic_φ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT satisfies (44). This allows us to conclude the proof. ∎

The above result allows us to quantify the bias, as made precise in the corollary below. Before stating it, we require an additional assumption.

Assumption 4.

The generator \mathcal{L}caligraphic_L is invertible on 𝒮0subscript𝒮0\mathscr{S}_{0}script_S start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT. In other words, for any ϕ𝒮0italic-ϕsubscript𝒮0\phi\in\mathscr{S}_{0}italic_ϕ ∈ script_S start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT, there exists a unique solution Ψ𝒮0Ψsubscript𝒮0\Psi\in\mathscr{S}_{0}roman_Ψ ∈ script_S start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT to the Poisson equation Ψ=ϕΨitalic-ϕ-\mathcal{L}\Psi=\phi- caligraphic_L roman_Ψ = italic_ϕ.

Assumption 4 can be shown to hold for overdamped and underdamped Langevin dynamics under certain conditions on the potential V𝑉Vitalic_V [talay2002, kopec2014, kopec2015], and is a standard result in the literature.

Applying Proposition 1 together with the decay estimates from Assumption 3, as well as Assumption 4, to the transient subtraction estimator (17) yields the following result on the bias of the estimator (17).

Corollary 1.

Under the assumptions of Proposition 1 as well as Assumptions 3 and 4, there exists C+𝐶superscriptC\in\mathbb{R}^{+}italic_C ∈ blackboard_R start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT such that, for any T>0𝑇0T>0italic_T > 0 and η[η,η]{0}𝜂subscript𝜂subscript𝜂0\eta\in[-\eta_{*},\eta_{*}]\setminus\{0\}italic_η ∈ [ - italic_η start_POSTSUBSCRIPT ∗ end_POSTSUBSCRIPT , italic_η start_POSTSUBSCRIPT ∗ end_POSTSUBSCRIPT ] ∖ { 0 },

|𝔼(ρ^subT,K,η,α)ρ|C(ηα+eλTη),𝔼subscriptsuperscript^𝜌𝑇𝐾𝜂𝛼sub𝜌𝐶superscript𝜂𝛼superscripte𝜆𝑇𝜂\left|\mathbb{E}\left\lparen\widehat{\rho}^{T,K,\eta,\alpha}_{\rm sub}\right% \rparen-\rho\right|\leqslant C\left\lparen\eta^{\alpha}+\frac{\mathrm{e}^{-% \lambda T}}{\eta}\right\rparen,| blackboard_E ( over^ start_ARG italic_ρ end_ARG start_POSTSUPERSCRIPT italic_T , italic_K , italic_η , italic_α end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_sub end_POSTSUBSCRIPT ) - italic_ρ | ⩽ italic_C ( italic_η start_POSTSUPERSCRIPT italic_α end_POSTSUPERSCRIPT + divide start_ARG roman_e start_POSTSUPERSCRIPT - italic_λ italic_T end_POSTSUPERSCRIPT end_ARG start_ARG italic_η end_ARG ) , (59)

where ρ^subT,K,η,αsubscriptsuperscript^𝜌𝑇𝐾𝜂𝛼sub\widehat{\rho}^{T,K,\eta,\alpha}_{\rm sub}over^ start_ARG italic_ρ end_ARG start_POSTSUPERSCRIPT italic_T , italic_K , italic_η , italic_α end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_sub end_POSTSUBSCRIPT is defined as (17) with the dynamics (28) and ΦηαsuperscriptsubscriptΦ𝜂𝛼\Phi_{\eta}^{\alpha}roman_Φ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_α end_POSTSUPERSCRIPT given by (46).

This result, obtained as a direct consequence of Proposition 1, shows that the bias of the transient (subtraction) technique has two distinct contributions: an exponentially decaying bias term arising from the time truncation, as in the Green–Kubo method (however magnified by a η1superscript𝜂1\eta^{-1}italic_η start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT factor); and a bias of order ηαsuperscript𝜂𝛼\eta^{\alpha}italic_η start_POSTSUPERSCRIPT italic_α end_POSTSUPERSCRIPT due to the finiteness of η𝜂\etaitalic_η, corresponding to deviations from the linear regime.

Remark 4.

Although the constant C𝐶Citalic_C does not depend on T𝑇Titalic_T (which suggests taking T𝑇Titalic_T as large as possible to minimize the truncation bias contribution), the variance depends on T𝑇Titalic_T (see Proposition 2). This calls for equilibrating between the two in order to have the smallest overall error.

Proof of Corollary 1.

Fix η[η,η]{0}𝜂subscript𝜂subscript𝜂0\eta\in[-\eta_{*},\eta_{*}]\setminus\{0\}italic_η ∈ [ - italic_η start_POSTSUBSCRIPT ∗ end_POSTSUBSCRIPT , italic_η start_POSTSUBSCRIPT ∗ end_POSTSUBSCRIPT ] ∖ { 0 }. Since R𝑅Ritalic_R has average 0 with respect to μ𝜇\muitalic_μ, it holds that

|𝔼(ρ^subT,K,η,α)ρ|𝔼subscriptsuperscript^𝜌𝑇𝐾𝜂𝛼sub𝜌\displaystyle\left|\mathbb{E}\left\lparen\widehat{\rho}^{T,K,\eta,\alpha}_{\rm sub% }\right\rparen-\rho\right|| blackboard_E ( over^ start_ARG italic_ρ end_ARG start_POSTSUPERSCRIPT italic_T , italic_K , italic_η , italic_α end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_sub end_POSTSUBSCRIPT ) - italic_ρ | =1η|0+𝔼(R(Xtη)R(Yt0))𝑑tT+𝔼(R(Xtη)R(Yt0))𝑑tηρ|absent1𝜂superscriptsubscript0𝔼𝑅subscriptsuperscript𝑋𝜂𝑡𝑅subscriptsuperscript𝑌0𝑡differential-d𝑡superscriptsubscript𝑇𝔼𝑅subscriptsuperscript𝑋𝜂𝑡𝑅subscriptsuperscript𝑌0𝑡differential-d𝑡𝜂𝜌\displaystyle=\frac{1}{\eta}\left\lvert\int_{0}^{+\infty}\mathbb{E}\bigl{(}R(X% ^{\eta}_{t})-R(Y^{0}_{t})\bigr{)}\,dt-\int_{T}^{+\infty}\mathbb{E}\bigl{(}R(X^% {\eta}_{t})-R(Y^{0}_{t})\bigr{)}\,dt-\eta\rho\right\rvert= divide start_ARG 1 end_ARG start_ARG italic_η end_ARG | ∫ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT + ∞ end_POSTSUPERSCRIPT blackboard_E ( italic_R ( italic_X start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ) - italic_R ( italic_Y start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ) ) italic_d italic_t - ∫ start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT start_POSTSUPERSCRIPT + ∞ end_POSTSUPERSCRIPT blackboard_E ( italic_R ( italic_X start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ) - italic_R ( italic_Y start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ) ) italic_d italic_t - italic_η italic_ρ | (60)
1η|0+𝔼(R(Xtη))𝑑tηρ|+1η|T+𝔼(R(Xtη))𝑑t|,absent1𝜂superscriptsubscript0𝔼𝑅subscriptsuperscript𝑋𝜂𝑡differential-d𝑡𝜂𝜌1𝜂superscriptsubscript𝑇𝔼𝑅subscriptsuperscript𝑋𝜂𝑡differential-d𝑡\displaystyle\leqslant\frac{1}{\eta}\left\lvert\int_{0}^{+\infty}\mathbb{E}% \bigl{(}R(X^{\eta}_{t})\bigr{)}\,dt-\eta\rho\right\rvert+\frac{1}{\eta}\left% \lvert\int_{T}^{+\infty}\mathbb{E}\bigl{(}R(X^{\eta}_{t})\bigr{)}\,dt\right\rvert,⩽ divide start_ARG 1 end_ARG start_ARG italic_η end_ARG | ∫ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT + ∞ end_POSTSUPERSCRIPT blackboard_E ( italic_R ( italic_X start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ) ) italic_d italic_t - italic_η italic_ρ | + divide start_ARG 1 end_ARG start_ARG italic_η end_ARG | ∫ start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT start_POSTSUPERSCRIPT + ∞ end_POSTSUPERSCRIPT blackboard_E ( italic_R ( italic_X start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ) ) italic_d italic_t | , (61)

We first consider the second term in (61). By the semigroup definition of the expectation and the fact that μ~η=Φη#μsubscript~𝜇𝜂subscriptΦ𝜂#𝜇\widetilde{\mu}_{\eta}=\Phi_{\eta}\#\muover~ start_ARG italic_μ end_ARG start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT = roman_Φ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT # italic_μ, it holds that

|T+𝔼(R(Xtη))𝑑t|superscriptsubscript𝑇𝔼𝑅subscriptsuperscript𝑋𝜂𝑡differential-d𝑡\displaystyle\left\lvert\int_{T}^{+\infty}\mathbb{E}\bigl{(}R(X^{\eta}_{t})% \bigr{)}\,dt\right\rvert| ∫ start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT start_POSTSUPERSCRIPT + ∞ end_POSTSUPERSCRIPT blackboard_E ( italic_R ( italic_X start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ) ) italic_d italic_t | =|T+𝒳(etR)Φη𝑑μ𝑑t|.absentsuperscriptsubscript𝑇subscript𝒳superscripte𝑡𝑅subscriptΦ𝜂differential-d𝜇differential-d𝑡\displaystyle=\left\lvert\int_{T}^{+\infty}\int_{\mathcal{X}}\bigl{(}\mathrm{e% }^{t\mathcal{L}}R\bigr{)}\circ\Phi_{\eta}\,d\mu\,dt\right\rvert.= | ∫ start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT start_POSTSUPERSCRIPT + ∞ end_POSTSUPERSCRIPT ∫ start_POSTSUBSCRIPT caligraphic_X end_POSTSUBSCRIPT ( roman_e start_POSTSUPERSCRIPT italic_t caligraphic_L end_POSTSUPERSCRIPT italic_R ) ∘ roman_Φ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT italic_d italic_μ italic_d italic_t | . (62)

To bound the above quantity, we apply the semigroup decay estimate (39) and use (37):

|T+𝔼(R(Xtη))𝑑t|superscriptsubscript𝑇𝔼𝑅subscriptsuperscript𝑋𝜂𝑡differential-d𝑡\displaystyle\left\lvert\int_{T}^{+\infty}\mathbb{E}\bigl{(}R(X^{\eta}_{t})% \bigr{)}\,dt\right\rvert| ∫ start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT start_POSTSUPERSCRIPT + ∞ end_POSTSUPERSCRIPT blackboard_E ( italic_R ( italic_X start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ) ) italic_d italic_t | T+𝒳|(etR)Φη|𝑑μ𝑑tabsentsuperscriptsubscript𝑇subscript𝒳superscripte𝑡𝑅subscriptΦ𝜂differential-d𝜇differential-d𝑡\displaystyle\leqslant\int_{T}^{+\infty}\int_{\mathcal{X}}\left\lvert\bigl{(}% \mathrm{e}^{t\mathcal{L}}R\bigr{)}\circ\Phi_{\eta}\right\rvert\,d\mu\,dt⩽ ∫ start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT start_POSTSUPERSCRIPT + ∞ end_POSTSUPERSCRIPT ∫ start_POSTSUBSCRIPT caligraphic_X end_POSTSUBSCRIPT | ( roman_e start_POSTSUPERSCRIPT italic_t caligraphic_L end_POSTSUPERSCRIPT italic_R ) ∘ roman_Φ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT | italic_d italic_μ italic_d italic_t (63)
T+𝒳etRBn𝒦nΦη𝑑μ𝑑tabsentsuperscriptsubscript𝑇subscript𝒳subscriptdelimited-∥∥superscripte𝑡𝑅subscriptsuperscript𝐵𝑛subscript𝒦𝑛subscriptΦ𝜂differential-d𝜇differential-d𝑡\displaystyle\leqslant\int_{T}^{+\infty}\int_{\mathcal{X}}\left\lVert\mathrm{e% }^{t\mathcal{L}}R\right\rVert_{B^{\infty}_{n}}\mathcal{K}_{n}\circ\Phi_{\eta}% \,d\mu\,dt⩽ ∫ start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT start_POSTSUPERSCRIPT + ∞ end_POSTSUPERSCRIPT ∫ start_POSTSUBSCRIPT caligraphic_X end_POSTSUBSCRIPT ∥ roman_e start_POSTSUPERSCRIPT italic_t caligraphic_L end_POSTSUPERSCRIPT italic_R ∥ start_POSTSUBSCRIPT italic_B start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT end_POSTSUBSCRIPT caligraphic_K start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ∘ roman_Φ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT italic_d italic_μ italic_d italic_t (64)
T+etRBn𝒳𝒦n(ΦηBn𝒦n)𝑑μ𝑑tabsentsuperscriptsubscript𝑇subscriptdelimited-∥∥superscripte𝑡𝑅subscriptsuperscript𝐵𝑛subscript𝒳subscript𝒦𝑛subscriptdelimited-∥∥subscriptΦ𝜂subscriptsuperscript𝐵superscript𝑛subscript𝒦superscript𝑛differential-d𝜇differential-d𝑡\displaystyle\leqslant\int_{T}^{+\infty}\left\lVert\mathrm{e}^{t\mathcal{L}}R% \right\rVert_{B^{\infty}_{n}}\int_{\mathcal{X}}\mathcal{K}_{n}\left\lparen% \lVert\Phi_{\eta}\rVert_{B^{\infty}_{n^{\prime}}}\mathcal{K}_{n^{\prime}}% \right\rparen\,d\mu\,dt⩽ ∫ start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT start_POSTSUPERSCRIPT + ∞ end_POSTSUPERSCRIPT ∥ roman_e start_POSTSUPERSCRIPT italic_t caligraphic_L end_POSTSUPERSCRIPT italic_R ∥ start_POSTSUBSCRIPT italic_B start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT end_POSTSUBSCRIPT ∫ start_POSTSUBSCRIPT caligraphic_X end_POSTSUBSCRIPT caligraphic_K start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ( ∥ roman_Φ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT ∥ start_POSTSUBSCRIPT italic_B start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT end_POSTSUBSCRIPT caligraphic_K start_POSTSUBSCRIPT italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT ) italic_d italic_μ italic_d italic_t (65)
RBnT+LnCn,n,ηeλnt(𝒳𝒦m𝑑μ)𝑑tabsentsubscriptdelimited-∥∥𝑅subscriptsuperscript𝐵𝑛superscriptsubscript𝑇subscript𝐿𝑛subscript𝐶𝑛superscript𝑛superscript𝜂superscriptesubscript𝜆𝑛𝑡subscript𝒳subscript𝒦𝑚differential-d𝜇differential-d𝑡\displaystyle\leqslant\lVert R\rVert_{B^{\infty}_{n}}\int_{T}^{+\infty}L_{n}C_% {n,n^{\prime},\eta^{*}}\mathrm{e}^{-\lambda_{n}t}\left\lparen\int_{\mathcal{X}% }\mathcal{K}_{m}\,d\mu\right\rparen\,dt⩽ ∥ italic_R ∥ start_POSTSUBSCRIPT italic_B start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT end_POSTSUBSCRIPT ∫ start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT start_POSTSUPERSCRIPT + ∞ end_POSTSUPERSCRIPT italic_L start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT italic_C start_POSTSUBSCRIPT italic_n , italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_η start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT roman_e start_POSTSUPERSCRIPT - italic_λ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT italic_t end_POSTSUPERSCRIPT ( ∫ start_POSTSUBSCRIPT caligraphic_X end_POSTSUBSCRIPT caligraphic_K start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT italic_d italic_μ ) italic_d italic_t (66)
C~m,n,n,ηT+eλnt𝑑t=C~m,n,n,ηeλnTλn.absentsubscript~𝐶𝑚𝑛superscript𝑛superscript𝜂superscriptsubscript𝑇superscriptesubscript𝜆𝑛𝑡differential-d𝑡subscript~𝐶𝑚𝑛superscript𝑛superscript𝜂superscriptesubscript𝜆𝑛𝑇subscript𝜆𝑛\displaystyle\leqslant\widetilde{C}_{m,n,n^{\prime},\eta^{*}}\int_{T}^{+\infty% }\mathrm{e}^{-\lambda_{n}t}\,dt=\frac{\widetilde{C}_{m,n,n^{\prime},\eta^{*}}% \mathrm{e}^{-\lambda_{n}T}}{\lambda_{n}}.⩽ over~ start_ARG italic_C end_ARG start_POSTSUBSCRIPT italic_m , italic_n , italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_η start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT ∫ start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT start_POSTSUPERSCRIPT + ∞ end_POSTSUPERSCRIPT roman_e start_POSTSUPERSCRIPT - italic_λ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT italic_t end_POSTSUPERSCRIPT italic_d italic_t = divide start_ARG over~ start_ARG italic_C end_ARG start_POSTSUBSCRIPT italic_m , italic_n , italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_η start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT roman_e start_POSTSUPERSCRIPT - italic_λ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT italic_T end_POSTSUPERSCRIPT end_ARG start_ARG italic_λ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT end_ARG . (67)

We now consider the first term on the right-hand side of (61). Once again applying the semigroup definition of the expectation as well as the operator identity (40), it holds that

0+𝔼(R(Xtη))𝑑t=𝒳0+(etR)Φη𝑑t𝑑μ=𝒳(1R)Φη𝑑μ.superscriptsubscript0𝔼𝑅subscriptsuperscript𝑋𝜂𝑡differential-d𝑡subscript𝒳superscriptsubscript0superscripte𝑡𝑅subscriptΦ𝜂differential-d𝑡differential-d𝜇subscript𝒳superscript1𝑅subscriptΦ𝜂differential-d𝜇\displaystyle\int_{0}^{+\infty}\mathbb{E}\bigl{(}R(X^{\eta}_{t})\bigr{)}\,dt=% \int_{\mathcal{X}}\int_{0}^{+\infty}\bigl{(}\mathrm{e}^{t\mathcal{L}}R\bigr{)}% \circ\Phi_{\eta}\,dt\,d\mu=\int_{\mathcal{X}}\bigl{(}-\mathcal{L}^{-1}R\bigr{)% }\circ\Phi_{\eta}\,d\mu.∫ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT + ∞ end_POSTSUPERSCRIPT blackboard_E ( italic_R ( italic_X start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ) ) italic_d italic_t = ∫ start_POSTSUBSCRIPT caligraphic_X end_POSTSUBSCRIPT ∫ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT + ∞ end_POSTSUPERSCRIPT ( roman_e start_POSTSUPERSCRIPT italic_t caligraphic_L end_POSTSUPERSCRIPT italic_R ) ∘ roman_Φ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT italic_d italic_t italic_d italic_μ = ∫ start_POSTSUBSCRIPT caligraphic_X end_POSTSUBSCRIPT ( - caligraphic_L start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_R ) ∘ roman_Φ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT italic_d italic_μ . (68)

Writing the expectation in terms of the semigroup, and in view of the operator identity (40), we write the Green–Kubo formula (4) as

ρ=0+𝔼μ(R(Yt0)S(Y00))𝑑t=𝒳(1R)S𝑑μ.𝜌superscriptsubscript0subscript𝔼𝜇𝑅superscriptsubscript𝑌𝑡0𝑆superscriptsubscript𝑌00differential-d𝑡subscript𝒳superscript1𝑅𝑆differential-d𝜇\displaystyle\rho=\int_{0}^{+\infty}\mathbb{E}_{\mu}\bigl{(}R(Y_{t}^{0})S(Y_{0% }^{0})\bigr{)}\,dt=\int_{\mathcal{X}}(-\mathcal{L}^{-1}R)S\,d\mu.italic_ρ = ∫ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT + ∞ end_POSTSUPERSCRIPT blackboard_E start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT ( italic_R ( italic_Y start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT ) italic_S ( italic_Y start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT ) ) italic_d italic_t = ∫ start_POSTSUBSCRIPT caligraphic_X end_POSTSUBSCRIPT ( - caligraphic_L start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_R ) italic_S italic_d italic_μ . (69)

Thus, by Proposition 1 with f=1R𝒮0𝑓superscript1𝑅subscript𝒮0f=-\mathcal{L}^{-1}R\in\mathscr{S}_{0}italic_f = - caligraphic_L start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_R ∈ script_S start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT, it follows that

|1η0+𝔼(R(Xtη))𝑑tρ|ηα𝒞1R,η.1𝜂superscriptsubscript0𝔼𝑅subscriptsuperscript𝑋𝜂𝑡differential-d𝑡𝜌superscript𝜂𝛼subscript𝒞superscript1𝑅subscript𝜂\left\lvert\frac{1}{\eta}\int_{0}^{+\infty}\mathbb{E}\bigl{(}R(X^{\eta}_{t})% \bigr{)}\,dt-\rho\right\rvert\leqslant\eta^{\alpha}\mathcal{C}_{-\mathcal{L}^{% -1}R,\eta_{*}}.| divide start_ARG 1 end_ARG start_ARG italic_η end_ARG ∫ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT + ∞ end_POSTSUPERSCRIPT blackboard_E ( italic_R ( italic_X start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ) ) italic_d italic_t - italic_ρ | ⩽ italic_η start_POSTSUPERSCRIPT italic_α end_POSTSUPERSCRIPT caligraphic_C start_POSTSUBSCRIPT - caligraphic_L start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_R , italic_η start_POSTSUBSCRIPT ∗ end_POSTSUBSCRIPT end_POSTSUBSCRIPT . (70)

This allows us to obtain the desired result. ∎

3.2.3 Analysis of the variance

We state in this section some results on the scaling of the variance of the estimator (17) used with the dynamics (28).

Proposition 2 (Variance of transient subtraction estimator).

Suppose that Assumption 1 holds and that R𝑅Ritalic_R is globally Lipschitz with Lipschitz constant RLipsubscriptnorm𝑅Lip\|R\|_{\rm Lip}∥ italic_R ∥ start_POSTSUBSCRIPT roman_Lip end_POSTSUBSCRIPT. Then, for any T>0𝑇0T>0italic_T > 0,

Var(ρ^subT,K,η,α)RLip2K𝔼[|X0ηY00|2]η2(0TetB𝑑t)2.Varsubscriptsuperscript^𝜌𝑇𝐾𝜂𝛼subsuperscriptsubscriptnorm𝑅Lip2𝐾𝔼delimited-[]superscriptsuperscriptsubscript𝑋0𝜂superscriptsubscript𝑌002superscript𝜂2superscriptsuperscriptsubscript0𝑇superscripte𝑡𝐵differential-d𝑡2\operatorname{Var}\left\lparen\widehat{\rho}^{T,K,\eta,\alpha}_{\rm sub}\right% \rparen\leqslant\frac{\|R\|_{\rm Lip}^{2}}{K}\frac{\mathbb{E}\left[|X_{0}^{% \eta}-Y_{0}^{0}|^{2}\right]}{\eta^{2}}\left\lparen\int_{0}^{T}\mathrm{e}^{tB}% \,dt\right\rparen^{2}.roman_Var ( over^ start_ARG italic_ρ end_ARG start_POSTSUPERSCRIPT italic_T , italic_K , italic_η , italic_α end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_sub end_POSTSUBSCRIPT ) ⩽ divide start_ARG ∥ italic_R ∥ start_POSTSUBSCRIPT roman_Lip end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG italic_K end_ARG divide start_ARG blackboard_E [ | italic_X start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT - italic_Y start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ] end_ARG start_ARG italic_η start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG ( ∫ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT roman_e start_POSTSUPERSCRIPT italic_t italic_B end_POSTSUPERSCRIPT italic_d italic_t ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT . (71)

This result suggests that the variance grows at most exponentially fast in time, which is the case when B>0𝐵0B>0italic_B > 0. For dissipative drifts, i.e. B<0𝐵0B<0italic_B < 0, the variance is uniformly bounded in time by B2superscript𝐵2B^{-2}italic_B start_POSTSUPERSCRIPT - 2 end_POSTSUPERSCRIPT. Lastly, for B=0𝐵0B=0italic_B = 0 the variance grows linearly in T𝑇Titalic_T.

Note that the variance is uniformly bounded in η𝜂\etaitalic_η. In particular, since the functions φ1,φ2subscript𝜑1subscript𝜑2\varphi_{1},\varphi_{2}italic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_φ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT (defined in (43) and (44), respectively, and assumed to be in Bnsubscriptsuperscript𝐵𝑛B^{\infty}_{n}italic_B start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT for some n𝑛n\in\mathbb{N}italic_n ∈ blackboard_N) belong to L2(μ)superscript𝐿2𝜇L^{2}(\mu)italic_L start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( italic_μ ) by (33)–(34), it holds that

𝔼[|X0ηY00|2]η2={φ1L2(μ)2if α=1,φ1+ηφ2L2(μ)2if α=2.𝔼delimited-[]superscriptsuperscriptsubscript𝑋0𝜂superscriptsubscript𝑌002superscript𝜂2casessubscriptsuperscriptdelimited-∥∥subscript𝜑12superscript𝐿2𝜇if 𝛼1subscriptsuperscriptdelimited-∥∥subscript𝜑1𝜂subscript𝜑22superscript𝐿2𝜇if 𝛼2\frac{\mathbb{E}\left[\lvert X_{0}^{\eta}-Y_{0}^{0}\rvert^{2}\right]}{\eta^{2}% }=\begin{cases}\lVert\varphi_{1}\rVert^{2}_{L^{2}(\mu)}&\text{if }\alpha=1,\\ \lVert\varphi_{1}+\eta\varphi_{2}\rVert^{2}_{L^{2}(\mu)}&\text{if }\alpha=2.% \end{cases}divide start_ARG blackboard_E [ | italic_X start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT - italic_Y start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ] end_ARG start_ARG italic_η start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG = { start_ROW start_CELL ∥ italic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ∥ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_L start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( italic_μ ) end_POSTSUBSCRIPT end_CELL start_CELL if italic_α = 1 , end_CELL end_ROW start_ROW start_CELL ∥ italic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT + italic_η italic_φ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ∥ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_L start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( italic_μ ) end_POSTSUBSCRIPT end_CELL start_CELL if italic_α = 2 . end_CELL end_ROW (72)

Plugging (72) into (71) immediately implies a bound on the variance uniform in the perturbation parameter |η|η𝜂subscript𝜂|\eta|\leqslant\eta_{*}| italic_η | ⩽ italic_η start_POSTSUBSCRIPT ∗ end_POSTSUBSCRIPT.

For the result stated in Proposition 2 , we do not consider the asymptotic variance as for the transient estimator in Section 2.3, as we cannot observe the T+𝑇T\to+\inftyitalic_T → + ∞ limit due to the couplings we consider. As discussed in Section 3.1.1, the dynamics will decouple at large times, leading to a substantial increase in variance. We thus need to truncate the integration time T𝑇Titalic_T.

Remark 5 (Comparison with transient method).

Variance reduction is obtainable for synchronous coupling even without strong conditions on the drift of (2). In particular, when we have no dissipativity (i.e. B>0𝐵0B>0italic_B > 0), the subtraction technique is better than the usual transient method discussed in Section 2.3 provided that e2BTT/η2much-less-thansuperscripte2𝐵𝑇𝑇superscript𝜂2\mathrm{e}^{2BT}\ll T/\eta^{2}roman_e start_POSTSUPERSCRIPT 2 italic_B italic_T end_POSTSUPERSCRIPT ≪ italic_T / italic_η start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT, i.e. Tlog(η)/Bmuch-less-than𝑇𝜂𝐵T\ll-\log(\eta)/Bitalic_T ≪ - roman_log ( italic_η ) / italic_B. As one would typically consider η1much-less-than𝜂1\eta\ll 1italic_η ≪ 1, this suggests that the subtraction technique should therefore be preferred.

Proof of Proposition 2.

It suffices to consider the estimator (17) for K=1𝐾1K=1italic_K = 1, since Var(ρ^subT,K,η,α)=K1Var(ρ^subT,1,η,α)Varsubscriptsuperscript^𝜌𝑇𝐾𝜂𝛼subsuperscript𝐾1Varsubscriptsuperscript^𝜌𝑇1𝜂𝛼sub\operatorname{Var}(\widehat{\rho}^{T,K,\eta,\alpha}_{\rm sub})=K^{-1}% \operatorname{Var}(\widehat{\rho}^{T,1,\eta,\alpha}_{\rm sub})roman_Var ( over^ start_ARG italic_ρ end_ARG start_POSTSUPERSCRIPT italic_T , italic_K , italic_η , italic_α end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_sub end_POSTSUBSCRIPT ) = italic_K start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT roman_Var ( over^ start_ARG italic_ρ end_ARG start_POSTSUPERSCRIPT italic_T , 1 , italic_η , italic_α end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_sub end_POSTSUBSCRIPT ). To simplify the notation, we write ρ^subT,η,αsubscriptsuperscript^𝜌𝑇𝜂𝛼sub\widehat{\rho}^{T,\eta,\alpha}_{\rm sub}over^ start_ARG italic_ρ end_ARG start_POSTSUPERSCRIPT italic_T , italic_η , italic_α end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_sub end_POSTSUBSCRIPT instead of ρ^subT,1,η,αsubscriptsuperscript^𝜌𝑇1𝜂𝛼sub\widehat{\rho}^{T,1,\eta,\alpha}_{\rm sub}over^ start_ARG italic_ρ end_ARG start_POSTSUPERSCRIPT italic_T , 1 , italic_η , italic_α end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_sub end_POSTSUBSCRIPT:

ρ^subT,η,α=1η0T(R(Xtη)R(Yt0))𝑑t.subscriptsuperscript^𝜌𝑇𝜂𝛼sub1𝜂superscriptsubscript0𝑇𝑅superscriptsubscript𝑋𝑡𝜂𝑅superscriptsubscript𝑌𝑡0differential-d𝑡\widehat{\rho}^{T,\eta,\alpha}_{\rm sub}=\frac{1}{\eta}\int_{0}^{T}(R(X_{t}^{% \eta})-R(Y_{t}^{0}))\,dt.over^ start_ARG italic_ρ end_ARG start_POSTSUPERSCRIPT italic_T , italic_η , italic_α end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_sub end_POSTSUBSCRIPT = divide start_ARG 1 end_ARG start_ARG italic_η end_ARG ∫ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT ( italic_R ( italic_X start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT ) - italic_R ( italic_Y start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT ) ) italic_d italic_t . (73)

Since R𝑅Ritalic_R is Lipschitz, and using Lemma 1 to bound the coupling distance |XtηYt0|superscriptsubscript𝑋𝑡𝜂superscriptsubscript𝑌𝑡0|X_{t}^{\eta}-Y_{t}^{0}|| italic_X start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT - italic_Y start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT | in terms of the initial distance |X0ηY00|superscriptsubscript𝑋0𝜂superscriptsubscript𝑌00|X_{0}^{\eta}-Y_{0}^{0}|| italic_X start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT - italic_Y start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT |, we have

|ρ^subT,η,α|RLip0T|XtηYt0|η𝑑tRLip0TetB|X0ηY00|η𝑑t.subscriptsuperscript^𝜌𝑇𝜂𝛼subsubscriptnorm𝑅Lipsuperscriptsubscript0𝑇superscriptsubscript𝑋𝑡𝜂superscriptsubscript𝑌𝑡0𝜂differential-d𝑡subscriptnorm𝑅Lipsuperscriptsubscript0𝑇superscripte𝑡𝐵superscriptsubscript𝑋0𝜂superscriptsubscript𝑌00𝜂differential-d𝑡\displaystyle\left\lvert\widehat{\rho}^{T,\eta,\alpha}_{\rm sub}\right\rvert% \leqslant\|R\|_{\rm Lip}\int_{0}^{T}\frac{|X_{t}^{\eta}-Y_{t}^{0}|}{\eta}\,dt% \leqslant\|R\|_{\rm Lip}\int_{0}^{T}\frac{\mathrm{e}^{tB}|X_{0}^{\eta}-Y_{0}^{% 0}|}{\eta}\,dt.| over^ start_ARG italic_ρ end_ARG start_POSTSUPERSCRIPT italic_T , italic_η , italic_α end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_sub end_POSTSUBSCRIPT | ⩽ ∥ italic_R ∥ start_POSTSUBSCRIPT roman_Lip end_POSTSUBSCRIPT ∫ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT divide start_ARG | italic_X start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT - italic_Y start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT | end_ARG start_ARG italic_η end_ARG italic_d italic_t ⩽ ∥ italic_R ∥ start_POSTSUBSCRIPT roman_Lip end_POSTSUBSCRIPT ∫ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT divide start_ARG roman_e start_POSTSUPERSCRIPT italic_t italic_B end_POSTSUPERSCRIPT | italic_X start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT - italic_Y start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT | end_ARG start_ARG italic_η end_ARG italic_d italic_t . (74)

We bound the variance as

Var(ρ^subT,η,α)𝔼[|ρ^subT,η,α|2]=RLip2𝔼[|0TetB|X0ηY00|η𝑑t|2],Varsubscriptsuperscript^𝜌𝑇𝜂𝛼sub𝔼delimited-[]superscriptsubscriptsuperscript^𝜌𝑇𝜂𝛼sub2superscriptsubscriptnorm𝑅Lip2𝔼delimited-[]superscriptsuperscriptsubscript0𝑇superscripte𝑡𝐵superscriptsubscript𝑋0𝜂superscriptsubscript𝑌00𝜂differential-d𝑡2\displaystyle\operatorname{Var}\left\lparen\widehat{\rho}^{T,\eta,\alpha}_{\rm sub% }\right\rparen\leqslant\mathbb{E}\left[\left\lvert\widehat{\rho}^{T,\eta,% \alpha}_{\rm sub}\right\rvert^{2}\right]=\|R\|_{\rm Lip}^{2}\mathbb{E}\left[% \left\lvert\int_{0}^{T}\frac{\mathrm{e}^{tB}|X_{0}^{\eta}-Y_{0}^{0}|}{\eta}\,% dt\right\rvert^{2}\right],roman_Var ( over^ start_ARG italic_ρ end_ARG start_POSTSUPERSCRIPT italic_T , italic_η , italic_α end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_sub end_POSTSUBSCRIPT ) ⩽ blackboard_E [ | over^ start_ARG italic_ρ end_ARG start_POSTSUPERSCRIPT italic_T , italic_η , italic_α end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_sub end_POSTSUBSCRIPT | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ] = ∥ italic_R ∥ start_POSTSUBSCRIPT roman_Lip end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT blackboard_E [ | ∫ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT divide start_ARG roman_e start_POSTSUPERSCRIPT italic_t italic_B end_POSTSUPERSCRIPT | italic_X start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT - italic_Y start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT | end_ARG start_ARG italic_η end_ARG italic_d italic_t | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ] , (75)

which leads to the desired result. ∎

4 Application to Langevin dynamics

To illustrate the theoretical results obtained in Section 3, we apply the transient subtraction technique to compute the mobility and shear viscosity for a Lennard–Jones fluid, and to a low-dimensional example, with the Langevin dynamics (1) serving as the underlying dynamics for all cases. We present our numerical results in three parts:

  • In Section 4.1, we formulate the transient subtraction technique for Langevin dynamics by making precise φ1subscript𝜑1\varphi_{1}italic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT and φ2subscript𝜑2\varphi_{2}italic_φ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT for the conjugate responses S𝑆Sitalic_S of interest.

  • In Section 4.2, we numerically illustrate the finite η𝜂\etaitalic_η bias results from Corollary 1, which apply (45) to the subtraction estimator (17). In particular, we demonstrate the bias scaling for first and second-order maps ΦηαsuperscriptsubscriptΦ𝜂𝛼\Phi_{\eta}^{\alpha}roman_Φ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_α end_POSTSUPERSCRIPT. This is done with the one-dimensional Langevin dynamics, which allows to directly compute (45), at the level of operators, by discretizing the associated PDE. This enables a clear and effective presentation of the result, which would have otherwise been challenging to achieve with usual stochastic approaches.

  • Finally, we compute in Section 4.3 the mobility and shear viscosity for a Lennard–Jones fluid. This aim is to demonstrate the usefulness and viability of the method in more practical, high-dimensional molecular dynamics settings, particularly by highlighting its variance reduction capabilities.

4.1 Transient methods for Langevin dynamics

Although our transient method only considers equilibrium dynamics, it encodes the relevant nonequilibrium information through the conjugate response function S𝑆Sitalic_S, which is the key quantity allowing to obtain the transport coefficient. This is expressed through the first-order perturbation PDE (43), whose solution depends on S𝑆Sitalic_S. To define it, let F(q)d𝐹𝑞superscript𝑑F(q)\in\mathbb{R}^{d}italic_F ( italic_q ) ∈ blackboard_R start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT represent an external forcing, chosen appropriately based on the transport coefficient in consideration. Particular choices for F(q)𝐹𝑞F(q)italic_F ( italic_q ) are made precise for each scenario we consider in Section 4.3. For all such scenarios, the associated conjugate response function S𝑆Sitalic_S is given by

S(q,p)=βF(q)𝖳M1p.𝑆𝑞𝑝𝛽𝐹superscript𝑞𝖳superscript𝑀1𝑝S(q,p)=\beta F(q)^{\mathsf{T}}M^{-1}p.italic_S ( italic_q , italic_p ) = italic_β italic_F ( italic_q ) start_POSTSUPERSCRIPT sansserif_T end_POSTSUPERSCRIPT italic_M start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_p . (76)

We remark that the formal definition of S𝑆Sitalic_S is based on the associated nonequilibrium dynamics, and relies on linear response theory to be rigorously derived. In the interest of clarity, we do not provide such a rigorous discussion, and instead refer the reader to [lelievre2016, Section 5.2.3] for a comprehensive discussion.

Having identified the appropriate conjugate response function S𝑆Sitalic_S, one can now construct the map ΦηαsuperscriptsubscriptΦ𝜂𝛼\Phi_{\eta}^{\alpha}roman_Φ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_α end_POSTSUPERSCRIPT, for α=1,2𝛼12\alpha=1,2italic_α = 1 , 2, by solving the associated PDEs (43) and (44).

First-order map φ1subscript𝜑1\varphi_{1}italic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT

For convenience, let us first recall the expression for (43):

φ1=i=1dxiφ1,xi=S.superscriptsubscript𝜑1superscriptsubscript𝑖1𝑑superscriptsubscriptsubscript𝑥𝑖subscript𝜑1subscript𝑥𝑖𝑆\nabla^{*}\varphi_{1}=\sum_{i=1}^{d}\partial_{x_{i}}^{*}\varphi_{1,x_{i}}=S.∇ start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT italic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT = ∑ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT ∂ start_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT italic_φ start_POSTSUBSCRIPT 1 , italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT = italic_S . (77)

For Langevin dynamics, it is natural to consider the position and momentum components of φ1subscript𝜑1\varphi_{1}italic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT by writing φ1(q,p)=(φ1,q(q,p),φ1,p(q,p))subscript𝜑1𝑞𝑝subscript𝜑1𝑞𝑞𝑝subscript𝜑1𝑝𝑞𝑝\varphi_{1}(q,p)=(\varphi_{1,q}(q,p),\varphi_{1,p}(q,p))italic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ( italic_q , italic_p ) = ( italic_φ start_POSTSUBSCRIPT 1 , italic_q end_POSTSUBSCRIPT ( italic_q , italic_p ) , italic_φ start_POSTSUBSCRIPT 1 , italic_p end_POSTSUBSCRIPT ( italic_q , italic_p ) ), so that we can write φ1=qφ1,q+pφ1,psuperscriptsubscript𝜑1subscriptsuperscript𝑞subscript𝜑1𝑞subscriptsuperscript𝑝subscript𝜑1𝑝\nabla^{*}\varphi_{1}=\nabla^{*}_{q}\varphi_{1,q}+\nabla^{*}_{p}\varphi_{1,p}∇ start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT italic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT = ∇ start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT italic_φ start_POSTSUBSCRIPT 1 , italic_q end_POSTSUBSCRIPT + ∇ start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT italic_φ start_POSTSUBSCRIPT 1 , italic_p end_POSTSUBSCRIPT. More precisely, the action of the adjoint operators are given by

qi=qi+βqiV,pi=pi+β(M1p)i,formulae-sequencesubscriptsuperscriptsubscript𝑞𝑖subscriptsubscript𝑞𝑖𝛽subscriptsubscript𝑞𝑖𝑉subscriptsuperscriptsubscript𝑝𝑖subscriptsubscript𝑝𝑖𝛽subscriptsuperscript𝑀1𝑝𝑖\partial^{*}_{q_{i}}=-\partial_{q_{i}}+\beta\partial_{q_{i}}V,\qquad\partial^{% *}_{p_{i}}=-\partial_{p_{i}}+\beta(M^{-1}p)_{i},∂ start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_q start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT = - ∂ start_POSTSUBSCRIPT italic_q start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT + italic_β ∂ start_POSTSUBSCRIPT italic_q start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_V , ∂ start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_p start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT = - ∂ start_POSTSUBSCRIPT italic_p start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT + italic_β ( italic_M start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_p ) start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT , (78)

which can be obtained via integration by parts as in (42). In view of (78) and (76), we can write (43) more explicitly for Langevin dynamics as

divq(φ1,q)divp(φ1,p)+βV𝖳φ1,q+βp𝖳M1φ1,p=βF(q)𝖳M1p.subscriptdiv𝑞subscript𝜑1𝑞subscriptdiv𝑝subscript𝜑1𝑝𝛽superscript𝑉𝖳subscript𝜑1𝑞𝛽superscript𝑝𝖳superscript𝑀1subscript𝜑1𝑝𝛽𝐹superscript𝑞𝖳superscript𝑀1𝑝-\operatorname{div}_{q}(\varphi_{1,q})-\operatorname{div}_{p}(\varphi_{1,p})+% \beta\nabla V^{\mathsf{T}}\varphi_{1,q}+\beta p^{\mathsf{T}}M^{-1}\varphi_{1,p% }=\beta F(q)^{\mathsf{T}}M^{-1}p.- roman_div start_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT ( italic_φ start_POSTSUBSCRIPT 1 , italic_q end_POSTSUBSCRIPT ) - roman_div start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT ( italic_φ start_POSTSUBSCRIPT 1 , italic_p end_POSTSUBSCRIPT ) + italic_β ∇ italic_V start_POSTSUPERSCRIPT sansserif_T end_POSTSUPERSCRIPT italic_φ start_POSTSUBSCRIPT 1 , italic_q end_POSTSUBSCRIPT + italic_β italic_p start_POSTSUPERSCRIPT sansserif_T end_POSTSUPERSCRIPT italic_M start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_φ start_POSTSUBSCRIPT 1 , italic_p end_POSTSUBSCRIPT = italic_β italic_F ( italic_q ) start_POSTSUPERSCRIPT sansserif_T end_POSTSUPERSCRIPT italic_M start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_p . (79)

Therefore, a natural solution for (43) in any dimension is

φ1(q,p)=(φ1,q(q,p)φ1,p(q,p))=(0F(q)),subscript𝜑1𝑞𝑝matrixsubscript𝜑1𝑞𝑞𝑝subscript𝜑1𝑝𝑞𝑝matrix0𝐹𝑞\varphi_{1}(q,p)=\begin{pmatrix}\varphi_{1,q}(q,p)\\ \varphi_{1,p}(q,p)\end{pmatrix}=\begin{pmatrix}0\\ F(q)\end{pmatrix},italic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ( italic_q , italic_p ) = ( start_ARG start_ROW start_CELL italic_φ start_POSTSUBSCRIPT 1 , italic_q end_POSTSUBSCRIPT ( italic_q , italic_p ) end_CELL end_ROW start_ROW start_CELL italic_φ start_POSTSUBSCRIPT 1 , italic_p end_POSTSUBSCRIPT ( italic_q , italic_p ) end_CELL end_ROW end_ARG ) = ( start_ARG start_ROW start_CELL 0 end_CELL end_ROW start_ROW start_CELL italic_F ( italic_q ) end_CELL end_ROW end_ARG ) , (80)

and the transformation Φη1subscriptsuperscriptΦ1𝜂\Phi^{1}_{\eta}roman_Φ start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT is then given by

Φη1(q,p)=(qp+ηF(q)).superscriptsubscriptΦ𝜂1𝑞𝑝matrix𝑞𝑝𝜂𝐹𝑞\Phi_{\eta}^{1}(q,p)=\begin{pmatrix}q\\ p+\eta F(q)\end{pmatrix}.roman_Φ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT ( italic_q , italic_p ) = ( start_ARG start_ROW start_CELL italic_q end_CELL end_ROW start_ROW start_CELL italic_p + italic_η italic_F ( italic_q ) end_CELL end_ROW end_ARG ) . (81)

Thus, constructing the initial conditions for a first-order transient trajectory simply amounts to shifting the initial momentum p00superscriptsubscript𝑝00p_{0}^{0}italic_p start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT of some associated stationary equilibrium process by ηF(q00)𝜂𝐹superscriptsubscript𝑞00\eta F(q_{0}^{0})italic_η italic_F ( italic_q start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT ).

Second-order map φ2subscript𝜑2\varphi_{2}italic_φ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT

Constructing the second-order map amounts to finding φ2subscript𝜑2\varphi_{2}italic_φ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT by solving (44), which we recall for convenience:

φ2=12i,j=1dxixj(φ2,xiφ2,xj)=12()2:φ1φ1.:superscriptsubscript𝜑212superscriptsubscript𝑖𝑗1𝑑superscriptsubscriptsubscript𝑥𝑖superscriptsubscriptsubscript𝑥𝑗subscript𝜑2subscript𝑥𝑖subscript𝜑2subscript𝑥𝑗12superscriptsuperscript2tensor-productsubscript𝜑1subscript𝜑1\nabla^{*}\varphi_{2}=-\frac{1}{2}\sum_{i,j=1}^{d}\partial_{x_{i}}^{*}\partial% _{x_{j}}^{*}(\varphi_{2,x_{i}}\varphi_{2,x_{j}})=-\frac{1}{2}(\nabla^{*})^{2}% \colon\varphi_{1}\otimes\varphi_{1}.∇ start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT italic_φ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT = - divide start_ARG 1 end_ARG start_ARG 2 end_ARG ∑ start_POSTSUBSCRIPT italic_i , italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT ∂ start_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ∂ start_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ( italic_φ start_POSTSUBSCRIPT 2 , italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_φ start_POSTSUBSCRIPT 2 , italic_x start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT ) = - divide start_ARG 1 end_ARG start_ARG 2 end_ARG ( ∇ start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT : italic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ⊗ italic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT . (82)

Substituting the solution (80) for φ1subscript𝜑1\varphi_{1}italic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT in (44) leads to

φ2superscriptsubscript𝜑2\displaystyle\nabla^{*}\varphi_{2}∇ start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT italic_φ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT =12()2:(0F)(0F)12(p)2:FF.:absent12superscriptsuperscript2tensor-productmatrix0𝐹matrix0𝐹12superscriptsuperscriptsubscript𝑝2:tensor-product𝐹𝐹\displaystyle=-\frac{1}{2}(\nabla^{*})^{2}\colon\begin{pmatrix}0\\ F\end{pmatrix}\otimes\begin{pmatrix}0\\ F\end{pmatrix}\equiv-\frac{1}{2}(\nabla_{p}^{*})^{2}\colon F\otimes F.= - divide start_ARG 1 end_ARG start_ARG 2 end_ARG ( ∇ start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT : ( start_ARG start_ROW start_CELL 0 end_CELL end_ROW start_ROW start_CELL italic_F end_CELL end_ROW end_ARG ) ⊗ ( start_ARG start_ROW start_CELL 0 end_CELL end_ROW start_ROW start_CELL italic_F end_CELL end_ROW end_ARG ) ≡ - divide start_ARG 1 end_ARG start_ARG 2 end_ARG ( ∇ start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT : italic_F ⊗ italic_F . (83)

Thus, as in the first-order case, one can choose φ2,q=0subscript𝜑2𝑞0\varphi_{2,q}=0italic_φ start_POSTSUBSCRIPT 2 , italic_q end_POSTSUBSCRIPT = 0 so that φ2=(0,φ2,p(q,p))subscript𝜑20subscript𝜑2𝑝𝑞𝑝\varphi_{2}=(0,\varphi_{2,p}(q,p))italic_φ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT = ( 0 , italic_φ start_POSTSUBSCRIPT 2 , italic_p end_POSTSUBSCRIPT ( italic_q , italic_p ) ). Next, recalling that pi=piβ(M1p)isubscriptsuperscriptsubscript𝑝𝑖subscriptsubscript𝑝𝑖𝛽subscriptsuperscript𝑀1𝑝𝑖\partial^{*}_{p_{i}}=-\partial_{p_{i}}-\beta(M^{-1}p)_{i}∂ start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_p start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT = - ∂ start_POSTSUBSCRIPT italic_p start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT - italic_β ( italic_M start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_p ) start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT,

12(p)2:FF:12superscriptsuperscriptsubscript𝑝2tensor-product𝐹𝐹\displaystyle-\frac{1}{2}(\nabla_{p}^{*})^{2}\colon F\otimes F- divide start_ARG 1 end_ARG start_ARG 2 end_ARG ( ∇ start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT : italic_F ⊗ italic_F =12i,j=1dpipj(FiFj)absent12superscriptsubscript𝑖𝑗1𝑑superscriptsubscriptsubscript𝑝𝑖superscriptsubscriptsubscript𝑝𝑗subscript𝐹𝑖subscript𝐹𝑗\displaystyle=-\frac{1}{2}\sum_{i,j=1}^{d}\partial_{p_{i}}^{*}\partial_{p_{j}}% ^{*}\left\lparen F_{i}F_{j}\right\rparen= - divide start_ARG 1 end_ARG start_ARG 2 end_ARG ∑ start_POSTSUBSCRIPT italic_i , italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT ∂ start_POSTSUBSCRIPT italic_p start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ∂ start_POSTSUBSCRIPT italic_p start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ( italic_F start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_F start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) (84)
=β2i,j=1d[pi+β(M1p)i](M1p)jFiFjabsent𝛽2superscriptsubscript𝑖𝑗1𝑑delimited-[]subscriptsubscript𝑝𝑖𝛽subscriptsuperscript𝑀1𝑝𝑖subscriptsuperscript𝑀1𝑝𝑗subscript𝐹𝑖subscript𝐹𝑗\displaystyle=-\frac{\beta}{2}\sum_{i,j=1}^{d}\left[-\partial_{p_{i}}+\beta(M^% {-1}p)_{i}\right](M^{-1}p)_{j}F_{i}F_{j}= - divide start_ARG italic_β end_ARG start_ARG 2 end_ARG ∑ start_POSTSUBSCRIPT italic_i , italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT [ - ∂ start_POSTSUBSCRIPT italic_p start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT + italic_β ( italic_M start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_p ) start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ] ( italic_M start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_p ) start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT italic_F start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_F start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT (85)
=β22i,j=1d(M1p)i(M1p)jFiFj+β2i,j=1dpi(M1p)jFiFjabsentsuperscript𝛽22superscriptsubscript𝑖𝑗1𝑑subscriptsuperscript𝑀1𝑝𝑖subscriptsuperscript𝑀1𝑝𝑗subscript𝐹𝑖subscript𝐹𝑗𝛽2superscriptsubscript𝑖𝑗1𝑑subscriptsubscript𝑝𝑖subscriptsuperscript𝑀1𝑝𝑗subscript𝐹𝑖subscript𝐹𝑗\displaystyle=-\frac{\beta^{2}}{2}\sum_{i,j=1}^{d}(M^{-1}p)_{i}(M^{-1}p)_{j}F_% {i}F_{j}+\frac{\beta}{2}\sum_{i,j=1}^{d}\partial_{p_{i}}(M^{-1}p)_{j}F_{i}F_{j}= - divide start_ARG italic_β start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 2 end_ARG ∑ start_POSTSUBSCRIPT italic_i , italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT ( italic_M start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_p ) start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( italic_M start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_p ) start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT italic_F start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_F start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT + divide start_ARG italic_β end_ARG start_ARG 2 end_ARG ∑ start_POSTSUBSCRIPT italic_i , italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT ∂ start_POSTSUBSCRIPT italic_p start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT ( italic_M start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_p ) start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT italic_F start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_F start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT (86)
=β22i,j=1d(M1p)i(M1p)jFiFj+β2i,j=1dMj,i1FiFjabsentsuperscript𝛽22superscriptsubscript𝑖𝑗1𝑑subscriptsuperscript𝑀1𝑝𝑖subscriptsuperscript𝑀1𝑝𝑗subscript𝐹𝑖subscript𝐹𝑗𝛽2superscriptsubscript𝑖𝑗1𝑑subscriptsuperscript𝑀1𝑗𝑖subscript𝐹𝑖subscript𝐹𝑗\displaystyle=-\frac{\beta^{2}}{2}\sum_{i,j=1}^{d}(M^{-1}p)_{i}(M^{-1}p)_{j}F_% {i}F_{j}+\frac{\beta}{2}\sum_{i,j=1}^{d}M^{-1}_{j,i}F_{i}F_{j}= - divide start_ARG italic_β start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 2 end_ARG ∑ start_POSTSUBSCRIPT italic_i , italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT ( italic_M start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_p ) start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( italic_M start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_p ) start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT italic_F start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_F start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT + divide start_ARG italic_β end_ARG start_ARG 2 end_ARG ∑ start_POSTSUBSCRIPT italic_i , italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT italic_M start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_j , italic_i end_POSTSUBSCRIPT italic_F start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_F start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT (87)
=12(βp𝖳M1F)2+12βF𝖳M1F.absent12superscript𝛽superscript𝑝𝖳superscript𝑀1𝐹212𝛽superscript𝐹𝖳superscript𝑀1𝐹\displaystyle=-\frac{1}{2}\left\lparen\beta p^{\mathsf{T}}M^{-1}F\right\rparen% ^{2}+\frac{1}{2}\beta F^{\mathsf{T}}M^{-1}F.= - divide start_ARG 1 end_ARG start_ARG 2 end_ARG ( italic_β italic_p start_POSTSUPERSCRIPT sansserif_T end_POSTSUPERSCRIPT italic_M start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_F ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT + divide start_ARG 1 end_ARG start_ARG 2 end_ARG italic_β italic_F start_POSTSUPERSCRIPT sansserif_T end_POSTSUPERSCRIPT italic_M start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_F . (88)

Thus, a possible solution for the second-order term φ2subscript𝜑2\varphi_{2}italic_φ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT is

φ2(q,p)=(0βF(q)𝖳M1p2F(q))=(012S(q,p)F(q)).subscript𝜑2𝑞𝑝matrix0𝛽𝐹superscript𝑞𝖳superscript𝑀1𝑝2𝐹𝑞matrix012𝑆𝑞𝑝𝐹𝑞\varphi_{2}(q,p)=\begin{pmatrix}0\\ -\dfrac{\beta F(q)^{\mathsf{T}}M^{-1}p}{2}F(q)\end{pmatrix}=\begin{pmatrix}0\\ -\dfrac{1}{2}S(q,p)F(q)\end{pmatrix}.italic_φ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ( italic_q , italic_p ) = ( start_ARG start_ROW start_CELL 0 end_CELL end_ROW start_ROW start_CELL - divide start_ARG italic_β italic_F ( italic_q ) start_POSTSUPERSCRIPT sansserif_T end_POSTSUPERSCRIPT italic_M start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_p end_ARG start_ARG 2 end_ARG italic_F ( italic_q ) end_CELL end_ROW end_ARG ) = ( start_ARG start_ROW start_CELL 0 end_CELL end_ROW start_ROW start_CELL - divide start_ARG 1 end_ARG start_ARG 2 end_ARG italic_S ( italic_q , italic_p ) italic_F ( italic_q ) end_CELL end_ROW end_ARG ) . (89)

This leads to the second-order transformation ΦηαsuperscriptsubscriptΦ𝜂𝛼\Phi_{\eta}^{\alpha}roman_Φ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_α end_POSTSUPERSCRIPT

Φη2(q,p)=(qp+ηF(q)η22F(q)S(q,p)).superscriptsubscriptΦ𝜂2𝑞𝑝matrix𝑞𝑝𝜂𝐹𝑞superscript𝜂22𝐹𝑞𝑆𝑞𝑝\Phi_{\eta}^{2}(q,p)=\begin{pmatrix}q\\ p+\eta F(q)-\dfrac{\eta^{2}}{2}F(q)S(q,p)\end{pmatrix}.roman_Φ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( italic_q , italic_p ) = ( start_ARG start_ROW start_CELL italic_q end_CELL end_ROW start_ROW start_CELL italic_p + italic_η italic_F ( italic_q ) - divide start_ARG italic_η start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 2 end_ARG italic_F ( italic_q ) italic_S ( italic_q , italic_p ) end_CELL end_ROW end_ARG ) . (90)

4.2 One-dimensional Langevin dynamics

We next present some numerical results showcasing the scaling of the finite η𝜂\etaitalic_η bias for the first and second-order transformations ΦηαsuperscriptsubscriptΦ𝜂𝛼\Phi_{\eta}^{\alpha}roman_Φ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_α end_POSTSUPERSCRIPT derived in Section 3.1. As stated in Corollary 1, in particular (70), an estimator of order α𝛼\alphaitalic_α has bias O(ηα)Osuperscript𝜂𝛼\mathrm{O}(\eta^{\alpha})roman_O ( italic_η start_POSTSUPERSCRIPT italic_α end_POSTSUPERSCRIPT ):

|1η0+𝔼(R(Xtη))𝑑tρ|𝒞ηα.1𝜂superscriptsubscript0𝔼𝑅subscriptsuperscript𝑋𝜂𝑡differential-d𝑡𝜌𝒞superscript𝜂𝛼\left\lvert\frac{1}{\eta}\int_{0}^{+\infty}\mathbb{E}\bigl{(}R(X^{\eta}_{t})% \bigr{)}\,dt-\rho\right\rvert\leqslant\mathcal{C}\eta^{\alpha}.| divide start_ARG 1 end_ARG start_ARG italic_η end_ARG ∫ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT + ∞ end_POSTSUPERSCRIPT blackboard_E ( italic_R ( italic_X start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ) ) italic_d italic_t - italic_ρ | ⩽ caligraphic_C italic_η start_POSTSUPERSCRIPT italic_α end_POSTSUPERSCRIPT . (91)

Note that we did not truncate the time-integral in the estimator above as the finite-time integration bias vanishes as T+𝑇T\to+\inftyitalic_T → + ∞, allowing us to solely quantify the η𝜂\etaitalic_η bias. In view of (68), we can rewrite (91) as

|1η𝒳(1R)Φηα𝑑μ𝒳(1R)S𝑑μ|𝒞ηα,1𝜂subscript𝒳superscript1𝑅superscriptsubscriptΦ𝜂𝛼differential-d𝜇subscript𝒳superscript1𝑅𝑆differential-d𝜇𝒞superscript𝜂𝛼\left\lvert\frac{1}{\eta}\int_{\mathcal{X}}(-\mathcal{L}^{-1}R)\circ\Phi_{\eta% }^{\alpha}\,d\mu-\int_{\mathcal{X}}(-\mathcal{L}^{-1}R)S\,d\mu\right\rvert% \leqslant\mathcal{C}\eta^{\alpha},| divide start_ARG 1 end_ARG start_ARG italic_η end_ARG ∫ start_POSTSUBSCRIPT caligraphic_X end_POSTSUBSCRIPT ( - caligraphic_L start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_R ) ∘ roman_Φ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_α end_POSTSUPERSCRIPT italic_d italic_μ - ∫ start_POSTSUBSCRIPT caligraphic_X end_POSTSUBSCRIPT ( - caligraphic_L start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_R ) italic_S italic_d italic_μ | ⩽ caligraphic_C italic_η start_POSTSUPERSCRIPT italic_α end_POSTSUPERSCRIPT , (92)

where we used that 1Rsuperscript1𝑅\mathcal{L}^{-1}Rcaligraphic_L start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_R has average 0 with respect to μ𝜇\muitalic_μ. We write the bias in the form (92) since the low-dimensionality of the system in consideration allows us to directly compute the bias by discretizing \mathcal{L}caligraphic_L and solving the associated PDE. Note that the bias result presented here holds for both the naive transient (13) and subtraction (17) estimators.

Choice of observable

We consider the following observable, which has average 0 with respect to μ𝜇\muitalic_μ by construction:

R(q,p)=(cos(q)sin(q))eβV(q).𝑅𝑞𝑝𝑞𝑞superscripte𝛽𝑉𝑞R(q,p)=\left\lparen\cos(q)-\sin(q)\right\rparen\mathrm{e}^{\beta V(q)}.italic_R ( italic_q , italic_p ) = ( roman_cos ( italic_q ) - roman_sin ( italic_q ) ) roman_e start_POSTSUPERSCRIPT italic_β italic_V ( italic_q ) end_POSTSUPERSCRIPT . (93)

This choice is also considered in order to avoid symmetries in the response function (which may occur for typical observables such as p𝑝pitalic_p and V𝑉\nabla V∇ italic_V) so that the results are clearly presented; see [spacek2023, Section 4.2], for a more detailed discussion regarding the symmetries and the observable. Furthermore, the forcing F𝐹Fitalic_F in consideration is a normalized constant force, i.e. F=1𝐹1F=1italic_F = 1.

Numerically estimating the bias

The low dimensionality of this example allows us to analytically compute (92) through a direct computation of 1Rsuperscript1𝑅-\mathcal{L}^{-1}R- caligraphic_L start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_R via finite-difference methods, and the use of quadratures for the associated integrals over the phase-space; see [spacek2023, Appendix B] for precise details on the numerical implementation of the finite-difference scheme. The unbounded momentum domain is truncated to [pmax,pmax]subscript𝑝maxsubscript𝑝max[-p_{\mathrm{max}},p_{\mathrm{max}}][ - italic_p start_POSTSUBSCRIPT roman_max end_POSTSUBSCRIPT , italic_p start_POSTSUBSCRIPT roman_max end_POSTSUBSCRIPT ], with pmax=5subscript𝑝max5p_{\mathrm{max}}=5italic_p start_POSTSUBSCRIPT roman_max end_POSTSUBSCRIPT = 5. The domain [π,π]×[pmax,pmax]𝜋𝜋subscript𝑝maxsubscript𝑝max[-\pi,\pi]\times[-p_{\mathrm{max}},p_{\mathrm{max}}][ - italic_π , italic_π ] × [ - italic_p start_POSTSUBSCRIPT roman_max end_POSTSUBSCRIPT , italic_p start_POSTSUBSCRIPT roman_max end_POSTSUBSCRIPT ] is then discretized into mq=200subscript𝑚𝑞200m_{q}=200italic_m start_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT = 200 by mp=400subscript𝑚𝑝400m_{p}=400italic_m start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT = 400 points with uniform step sizes Δq=2π/mqΔ𝑞2𝜋subscript𝑚𝑞\Delta q=2\pi/m_{q}roman_Δ italic_q = 2 italic_π / italic_m start_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT and Δp=2pmax/(mp1)Δ𝑝2subscript𝑝maxsubscript𝑚𝑝1\Delta p=2p_{\mathrm{max}}/(m_{p}-1)roman_Δ italic_p = 2 italic_p start_POSTSUBSCRIPT roman_max end_POSTSUBSCRIPT / ( italic_m start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT - 1 ).

We consider two maps ΦηαsuperscriptsubscriptΦ𝜂𝛼\Phi_{\eta}^{\alpha}roman_Φ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_α end_POSTSUPERSCRIPT, for α=1,2𝛼12\alpha=1,2italic_α = 1 , 2, constructed from φ1subscript𝜑1\varphi_{1}italic_φ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT and φ2subscript𝜑2\varphi_{2}italic_φ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT obtained in Section 4.1, with particular forms

{Φη1(q,p)=p+η,Φη2(q,p)=p+ηη2β2p.casesmissing-subexpressionsuperscriptsubscriptΦ𝜂1𝑞𝑝𝑝𝜂missing-subexpressionsuperscriptsubscriptΦ𝜂2𝑞𝑝𝑝𝜂superscript𝜂2𝛽2𝑝otherwise\begin{cases}\begin{aligned} &\Phi_{\eta}^{1}(q,p)=p+\eta,\\ &\Phi_{\eta}^{2}(q,p)=p+\eta-\eta^{2}\frac{\beta}{2}p.\end{aligned}\end{cases}{ start_ROW start_CELL start_ROW start_CELL end_CELL start_CELL roman_Φ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT ( italic_q , italic_p ) = italic_p + italic_η , end_CELL end_ROW start_ROW start_CELL end_CELL start_CELL roman_Φ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( italic_q , italic_p ) = italic_p + italic_η - italic_η start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT divide start_ARG italic_β end_ARG start_ARG 2 end_ARG italic_p . end_CELL end_ROW end_CELL start_CELL end_CELL end_ROW (94)

The bias was computed for various values of η𝜂\etaitalic_η, and the results are shown in Figure 2 in a log-log scale, with reference lines included. This confirms that the bias associated with an α𝛼\alphaitalic_α-ordered map is itself of order α𝛼\alphaitalic_α, which is the main estimate of Proposition 1.

Refer to caption
Figure 2: Bias (92) as a function of η𝜂\etaitalic_η for the first and second-order maps, with overlayed reference lines.

4.3 Mobility and shear viscosity for Lennard–Jones fluids

We next present some numerical results highlighting the variance-reduction potential of the transient subtraction method. The example in consideration is the computation of shear viscosity and mobility for a Lennard–Jones fluid. The system is composed of N𝑁Nitalic_N particles in spatial dimension D=3𝐷3D=3italic_D = 3 (so that d=DN𝑑𝐷𝑁d=DNitalic_d = italic_D italic_N), evolving according to the underdamped Langevin dynamics (1). The potential energy corresponds to the sum of pairwise interactions

V(q)=1i<jNv(qiqj),𝑉𝑞subscript1𝑖𝑗𝑁𝑣delimited-∥∥subscript𝑞𝑖subscript𝑞𝑗V(q)=\sum_{1\leqslant i<j\leqslant N}v(\lVert q_{i}-q_{j}\rVert),italic_V ( italic_q ) = ∑ start_POSTSUBSCRIPT 1 ⩽ italic_i < italic_j ⩽ italic_N end_POSTSUBSCRIPT italic_v ( ∥ italic_q start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT - italic_q start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ∥ ) , (95)

with v(r)𝑣𝑟v(r)italic_v ( italic_r ) given by the standard 12-6 Lennard–Jones interaction potential:

v(r)=4ε[(σr)12(σr)6].𝑣𝑟4𝜀delimited-[]superscript𝜎𝑟12superscript𝜎𝑟6v(r)=4\varepsilon\left[\left\lparen\frac{\sigma}{r}\right\rparen^{12}-\left% \lparen\frac{\sigma}{r}\right\rparen^{6}\right].italic_v ( italic_r ) = 4 italic_ε [ ( divide start_ARG italic_σ end_ARG start_ARG italic_r end_ARG ) start_POSTSUPERSCRIPT 12 end_POSTSUPERSCRIPT - ( divide start_ARG italic_σ end_ARG start_ARG italic_r end_ARG ) start_POSTSUPERSCRIPT 6 end_POSTSUPERSCRIPT ] . (96)

The parameter ε𝜀\varepsilonitalic_ε represents the depth of the potential well, and σ𝜎\sigmaitalic_σ is the distance at which interactions are 0, i.e. when interactions go from attractive to repulsive. In practice, one truncates the range of (96) at some value rcsubscript𝑟cr_{\mathrm{c}}italic_r start_POSTSUBSCRIPT roman_c end_POSTSUBSCRIPT, after which interactions can be deemed negligible. We employ the truncated shifted-force cutoff method with a cutoff value of rc=2.5σsubscript𝑟c2.5𝜎r_{\mathrm{c}}=2.5\sigmaitalic_r start_POSTSUBSCRIPT roman_c end_POSTSUBSCRIPT = 2.5 italic_σ, resulting in the modified potential

vSF(r)=[v(r)v(rc)(rrc)v(rc)]𝟏rrc.subscript𝑣SF𝑟delimited-[]𝑣𝑟𝑣subscript𝑟c𝑟subscript𝑟csuperscript𝑣subscript𝑟csubscript1𝑟subscript𝑟cv_{\mathrm{SF}}(r)=[v(r)-v(r_{\mathrm{c}})-(r-r_{\mathrm{c}})v^{\prime}(r_{% \mathrm{c}})]\mathbf{1}_{r\leqslant r_{\mathrm{c}}}.italic_v start_POSTSUBSCRIPT roman_SF end_POSTSUBSCRIPT ( italic_r ) = [ italic_v ( italic_r ) - italic_v ( italic_r start_POSTSUBSCRIPT roman_c end_POSTSUBSCRIPT ) - ( italic_r - italic_r start_POSTSUBSCRIPT roman_c end_POSTSUBSCRIPT ) italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ( italic_r start_POSTSUBSCRIPT roman_c end_POSTSUBSCRIPT ) ] bold_1 start_POSTSUBSCRIPT italic_r ⩽ italic_r start_POSTSUBSCRIPT roman_c end_POSTSUBSCRIPT end_POSTSUBSCRIPT . (97)

We numerically integrate the dynamics using the BAOAB splitting scheme. The simulations were conducted using with the Molly.jl package [greener2024] in the Julia language, and were performed in dimensionless reduced units with σ=ε=kB=1𝜎𝜀subscript𝑘𝐵1\sigma=\varepsilon=k_{B}=1italic_σ = italic_ε = italic_k start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT = 1 and the mass matrix M=Id𝑀IdM=\mathrm{Id}italic_M = roman_Id. For both shear viscosity and mobility computations, the results we present correspond to averages over 105superscript10510^{5}10 start_POSTSUPERSCRIPT 5 end_POSTSUPERSCRIPT realizations of the system with i.i.d. initial conditions.

We next describe the strategy for initializing and evaluating the trajectories, which is the procedurally identical for the mobility and shear cases. Each independent realization of the system is initialized as follows. For the equilibrium control system, initial momenta were sampled from the Boltzmann–Gibbs measure, while initial positions were initialized on a uniform grid. The system was then evolved for a thermalization time of Ttherm=1subscript𝑇therm1T_{\mathrm{therm}}=1italic_T start_POSTSUBSCRIPT roman_therm end_POSTSUBSCRIPT = 1 with a timestep size Δt=103Δ𝑡superscript103\Delta t=10^{-3}roman_Δ italic_t = 10 start_POSTSUPERSCRIPT - 3 end_POSTSUPERSCRIPT in reduced units. We ensured that the thermalization time was sufficient long to melt the crystal structure and relax the system to a stationary-state, as monitored by the stabilization of kinetic and potential energies, and by visual inspection of the molecular structure. Next, we initialize the transient trajectory by applying the transformation (98) to a copy of the stationary equilibrium system:

(q0ηp0η)=Φη(q00,p00)=(q00p00+ηF(q00)),matrixsuperscriptsubscript𝑞0𝜂superscriptsubscript𝑝0𝜂subscriptΦ𝜂superscriptsubscript𝑞00superscriptsubscript𝑝00matrixsuperscriptsubscript𝑞00superscriptsubscript𝑝00𝜂𝐹superscriptsubscript𝑞00\displaystyle\begin{pmatrix}q_{0}^{\eta}\\ p_{0}^{\eta}\end{pmatrix}=\Phi_{\eta}(q_{0}^{0},p_{0}^{0})=\begin{pmatrix}q_{0% }^{0}\\ p_{0}^{0}+\eta F(q_{0}^{0})\end{pmatrix},( start_ARG start_ROW start_CELL italic_q start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT end_CELL end_ROW start_ROW start_CELL italic_p start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT end_CELL end_ROW end_ARG ) = roman_Φ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT ( italic_q start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT , italic_p start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT ) = ( start_ARG start_ROW start_CELL italic_q start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT end_CELL end_ROW start_ROW start_CELL italic_p start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT + italic_η italic_F ( italic_q start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT ) end_CELL end_ROW end_ARG ) , (98)

where the expression for F(q)𝐹𝑞F(q)italic_F ( italic_q ) is made precise for mobility and shear viscosity in Sections 4.3.1 and 4.3.2, respectively. The equilibrium and transient trajectories are then evolved simultaneously according to synchronously coupled standard equilibrium dynamics. The integration time T𝑇Titalic_T should not be much larger than the relaxation time of the transient trajectory, as decoupling becomes a significant source of error. Nevertheless, this can be overcome during postprocessing, during which one can choose the appropriate truncation time for the estimator. Observational runs should be performed beforehand to ensure that one knows the approximate relaxation time (which varies significantly depending on the system at hand), which can be deduced from reasonably coarse and inexpensive runs. All simulation parameters are made precise in Table 1.

Parameter Shear visc. Mobility
Integration time (T𝑇Titalic_T) 3.5 2.0
Thermalization time (Tthermsubscript𝑇thermT_{\mathrm{therm}}italic_T start_POSTSUBSCRIPT roman_therm end_POSTSUBSCRIPT) 1 1
No. of realizations (K𝐾Kitalic_K) 105superscript10510^{5}10 start_POSTSUPERSCRIPT 5 end_POSTSUPERSCRIPT 105superscript10510^{5}10 start_POSTSUPERSCRIPT 5 end_POSTSUPERSCRIPT
Timestep (ΔtΔ𝑡\Delta troman_Δ italic_t) 103superscript10310^{-3}10 start_POSTSUPERSCRIPT - 3 end_POSTSUPERSCRIPT 103superscript10310^{-3}10 start_POSTSUPERSCRIPT - 3 end_POSTSUPERSCRIPT
Inverse temp. (β𝛽\betaitalic_β) 1.25 0.8
Damping (γ𝛾\gammaitalic_γ) 1 1
No. of particles (N𝑁Nitalic_N) 1000 1000
Particle density (ϱitalic-ϱ\varrhoitalic_ϱ [N𝑁Nitalic_N/vol.]) 0.7 0.6
LJ cutoff (rcsubscript𝑟cr_{\mathrm{c}}italic_r start_POSTSUBSCRIPT roman_c end_POSTSUBSCRIPT) 2.5 2.5
Mass matrix M𝑀Mitalic_M Id Id
LJ param. (σ)𝜎(\sigma)( italic_σ ) 1 1
LJ param. (ε)𝜀(\varepsilon)( italic_ε ) 1 1
Boltzmann const. (kBsubscript𝑘𝐵k_{B}italic_k start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT) 1 1
Table 1: Simulation parameters

4.3.1 Mobility

When under the effect of a constant external field F𝐹Fitalic_F, the mobility quantifies the particles’ drift velocity in the direction of the applied field. In our Lennard–Jones fluid example, we consider a constant force applied in the x𝑥xitalic_x-direction, and in particular we consider colored drift, which amounts to perturbing half the particles to one direction, and the other half in the opposite direction:

F=1N(F1,F2,,FN)𝖳3N,Fi=((1)i+1,0,0),i=1,,N.formulae-sequence𝐹1𝑁superscriptsubscript𝐹1subscript𝐹2subscript𝐹𝑁𝖳superscript3𝑁formulae-sequencesubscript𝐹𝑖superscript1𝑖100𝑖1𝑁F=\frac{1}{\sqrt{N}}(F_{1},F_{2},\dotsc,F_{N})^{\mathsf{T}}\in\mathbb{R}^{3N},% \qquad F_{i}=((-1)^{i+1},0,0),\qquad i=1,\dotsc,N.italic_F = divide start_ARG 1 end_ARG start_ARG square-root start_ARG italic_N end_ARG end_ARG ( italic_F start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_F start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , … , italic_F start_POSTSUBSCRIPT italic_N end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT sansserif_T end_POSTSUPERSCRIPT ∈ blackboard_R start_POSTSUPERSCRIPT 3 italic_N end_POSTSUPERSCRIPT , italic_F start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT = ( ( - 1 ) start_POSTSUPERSCRIPT italic_i + 1 end_POSTSUPERSCRIPT , 0 , 0 ) , italic_i = 1 , … , italic_N . (99)

The observable we consider is the velocity in the direction F𝐹Fitalic_F, and is the standard choice for mobility computations [lelievre2016, Section 5.2.2]:

R(q,p)=F𝖳M1p.𝑅𝑞𝑝superscript𝐹𝖳superscript𝑀1𝑝R(q,p)=F^{\mathsf{T}}M^{-1}p.italic_R ( italic_q , italic_p ) = italic_F start_POSTSUPERSCRIPT sansserif_T end_POSTSUPERSCRIPT italic_M start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_p . (100)
Refer to caption
(a) Data for η=0.01𝜂0.01\eta=0.01italic_η = 0.01.
Refer to caption
(b) Data for η=0.1𝜂0.1\eta=0.1italic_η = 0.1.
Figure 3: Trajectories for the computation of the mobility of a Lennard–Jones fluid with colored drift and associated error bars. The top graphs correspond to the instantaneous response (normalized by η𝜂\etaitalic_η) as a function of time as transient trajectory relaxes, while the bottom graphs show the integrated response over time.
Variance at T=1𝑇1T=1italic_T = 1
η𝜂\etaitalic_η Naive Subtraction Ratio
0.01 2.66×1032.66superscript103{2.66}\times 10^{3}2.66 × 10 start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT 4.69×1034.69superscript103{4.69}\times 10^{-3}4.69 × 10 start_POSTSUPERSCRIPT - 3 end_POSTSUPERSCRIPT 5.67×1055.67superscript105{5.67}\times 10^{5}5.67 × 10 start_POSTSUPERSCRIPT 5 end_POSTSUPERSCRIPT
0.1 26.6 4.65×1034.65superscript103{4.65}\times 10^{-3}4.65 × 10 start_POSTSUPERSCRIPT - 3 end_POSTSUPERSCRIPT 5.71×1035.71superscript103{5.71}\times 10^{3}5.71 × 10 start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT
1.0 0.265 4.37×1034.37superscript103{4.37}\times 10^{-3}4.37 × 10 start_POSTSUPERSCRIPT - 3 end_POSTSUPERSCRIPT 60.6
(a) Data at T=1𝑇1T=1italic_T = 1 (start of decoupling)
Variance at T=2𝑇2T=2italic_T = 2
η𝜂\etaitalic_η Naive Subtraction Ratio
0.01 5.70×1035.70superscript103{5.70}\times 10^{3}5.70 × 10 start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT 11.8 485
0.1 57.2 5.52 10.4
1.0 0.564 0.286 1.97
(b) Data at final time T=2𝑇2T=2italic_T = 2 (total decoupling)
Table 2: Comparison of variances between naive and subtraction transient estimators for various values of η𝜂\etaitalic_η for the computation of mobility.

Each trajectory is integrated for a physical time T=2𝑇2T=2italic_T = 2. Although the system relaxes significantly before, we deliberately wanted to observe the decoupling point, which can be easily spotted in Figure 3, which shows the average trajectories for two values of η𝜂\etaitalic_η. Due to the large signal-to-noise ratio of the mobility response, the subtraction’s uniform bound in η𝜂\etaitalic_η of the variance indeed shows to make a difference, as can readily be seen from Figure 3. The associated error bars shown in Figure 3 were computed with empirical averages over the independent realizations.

To quantitatively assess the variance reduction and the decoupling effect, we consider the variance values for both the naive transient and subtraction methods at two different times T𝑇Titalic_T: one right before trajectories decouple T=1𝑇1T=1italic_T = 1, and one at the final time T=2𝑇2T=2italic_T = 2. These results are summarized in Table 2. At T=1𝑇1T=1italic_T = 1, we indeed see the variance’s uniform bound in η𝜂\etaitalic_η for the subtraction trajectory, while the η2superscript𝜂2\eta^{-2}italic_η start_POSTSUPERSCRIPT - 2 end_POSTSUPERSCRIPT factor shows for the naive trajectory. Additionally, we notice that at T=2𝑇2T=2italic_T = 2, even after significant decoupling, the subtraction method still provides significant variance reduction, even long after relaxation has occured.

4.3.2 Shear viscosity

The shear viscosity of a fluid can be computed in a variety of ways; see [todd2007] for a review on computational techniques. In this work, we consider a setting based on the transverse force-field method [joubaud2012] with a sinusoidal forcing profile. We denote by Fi3subscript𝐹𝑖superscript3F_{i}\in\mathbb{R}^{3}italic_F start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∈ blackboard_R start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT the force acting on the i𝑖iitalic_ith particle:

Fi=(f(qy),0,0)𝖳,f(y)=sin(2πyLy).formulae-sequencesubscript𝐹𝑖superscript𝑓subscript𝑞𝑦00𝖳𝑓𝑦2𝜋𝑦subscript𝐿𝑦F_{i}=(f(q_{y}),0,0)^{\mathsf{T}},\qquad f(y)=\sin\left\lparen\frac{2\pi y}{L_% {y}}\right\rparen.italic_F start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT = ( italic_f ( italic_q start_POSTSUBSCRIPT italic_y end_POSTSUBSCRIPT ) , 0 , 0 ) start_POSTSUPERSCRIPT sansserif_T end_POSTSUPERSCRIPT , italic_f ( italic_y ) = roman_sin ( divide start_ARG 2 italic_π italic_y end_ARG start_ARG italic_L start_POSTSUBSCRIPT italic_y end_POSTSUBSCRIPT end_ARG ) . (101)

The force acts on the x𝑥xitalic_x-component of the momenta based on the particle’s y𝑦yitalic_y-coordinate position. The observable R𝑅Ritalic_R of interest is the imaginary part of the first empirical Fourier coefficient U1subscript𝑈1U_{1}italic_U start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT:

R(q,p)=Im(U1),U1=1Nn=1N(M1p)n,xexp(2iπqn,yLy).formulae-sequence𝑅𝑞𝑝Imsubscript𝑈1subscript𝑈11𝑁superscriptsubscript𝑛1𝑁subscriptsuperscript𝑀1𝑝𝑛𝑥2i𝜋subscript𝑞𝑛𝑦subscript𝐿𝑦R(q,p)=\operatorname{Im}(U_{1}),\qquad U_{1}=\frac{1}{N}\sum_{n=1}^{N}(M^{-1}p% )_{n,x}\exp\left\lparen\frac{2\mathrm{i}\pi q_{n,y}}{L_{y}}\right\rparen.italic_R ( italic_q , italic_p ) = roman_Im ( italic_U start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) , italic_U start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT = divide start_ARG 1 end_ARG start_ARG italic_N end_ARG ∑ start_POSTSUBSCRIPT italic_n = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_N end_POSTSUPERSCRIPT ( italic_M start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_p ) start_POSTSUBSCRIPT italic_n , italic_x end_POSTSUBSCRIPT roman_exp ( divide start_ARG 2 roman_i italic_π italic_q start_POSTSUBSCRIPT italic_n , italic_y end_POSTSUBSCRIPT end_ARG start_ARG italic_L start_POSTSUBSCRIPT italic_y end_POSTSUBSCRIPT end_ARG ) . (102)

The initialization and evaluation of trajectories for this system were performed as described in Section 4.3. We consider the same numerical results as shown in Section 4.3.1, with largely the same interpretations and conclusion. A first difference, clearly seen from Figure 4, however, is the magnitude of the error for the naive trajectories. This is a trivial artifact of the observable R(q,p)𝑅𝑞𝑝R(q,p)italic_R ( italic_q , italic_p ): for the mobility, the observable is O(N)O𝑁\mathrm{O}(\sqrt{N})roman_O ( square-root start_ARG italic_N end_ARG ), while it is O(1)O1\mathrm{O}(1)roman_O ( 1 ) for the shear case, since (102) corresponds to some spatial averaging. Secondly, the relaxation time for the shear trajectories is significantly longer compared to the one for mobility, and in fact almost coincides with the decoupling time. Nonetheless, Table 3 shows that variance reduction is still obtainable.

Refer to caption
(a) Data for η=0.01𝜂0.01\eta=0.01italic_η = 0.01.
Refer to caption
(b) Data for η=0.1𝜂0.1\eta=0.1italic_η = 0.1.
Figure 4: Trajectories for the computation of the shear viscosity of a Lennard–Jones fluid and associated error bars. The top graphs correspond to the instantaneous response (normalized by η𝜂\etaitalic_η) as a function of time as transient trajectory relaxes, while the bottom graphs show the integrated response over time.
Variance at T=2𝑇2T=2italic_T = 2
η𝜂\etaitalic_η Naive Subtraction Ratio
0.01 7.32 0.0260 281
0.1 0.0726 5.49×1035.49superscript103{5.49}\times 10^{-3}5.49 × 10 start_POSTSUPERSCRIPT - 3 end_POSTSUPERSCRIPT 13.2
(a) Data at T=2𝑇2T=2italic_T = 2 (start of decoupling)
Variance at T=3.5𝑇3.5T=3.5italic_T = 3.5
η𝜂\etaitalic_η Naive Subtraction Ratio
0.01 15.0 2.45 6.12
0.1 0.150 0.0487 3.07
(b) Data at final time T=3.5𝑇3.5T=3.5italic_T = 3.5 (total decoupling)
Table 3: Comparison of variances between naive and subtraction transient estimators for various values of η𝜂\etaitalic_η for the computation of shear viscosity.

5 Conclusion and perspectives

We presented a variance reduction method to compute transport coefficients based on a transient approach with control variate, where the trajectories which relax are coupled to equilibrium ones. The numerical results in Section 4 show significant variance-reduction potential, suggesting this method is viable as its implementation is neither complex nor expensive; in fact, it is roughly twice the cost of a typical run due to the control system, so that the computational overhead is more than compensated by the variance reduction. For general systems, the bottleneck lies in the construction of the transformation ΦηsubscriptΦ𝜂\Phi_{\eta}roman_Φ start_POSTSUBSCRIPT italic_η end_POSTSUBSCRIPT, as the PDEs (43) and (44) might not have a readily available solution for some given conjugate response S𝑆Sitalic_S of interest.

This works calls for several extensions. A particularly appealing one is to explore other types of couplings. For the systems we considered, synchronous coupling was largely successful in keeping the trajectories sufficiently close during the transient relaxation. For the shear viscosity example, relaxation and decoupling almost coincided, which suggests that a less dissipative system is likely to undergo decoupling significantly before convergence. Such scenarios motivate exploring more robust coupling strategies to delay decoupling, such as the ones described in [guillin2012, monmarche2023a, chak2024]. \printbibliography