License: CC BY 4.0
arXiv:2201.00611v4 [math.NA] 19 Dec 2023
Sebastian Reich, University of Potsdam, Institute of Mathematics, Potsdam, Germany
Email: sebastian.reich@uni-postdam.de

Robust parameter estimation using the ensemble Kalman filter

Sebastian Reich
Abstract

Standard maximum likelihood or Bayesian approaches to parameter estimation for stochastic differential equations are not robust to perturbations in the continuous-in-time data. In this paper, we give a rather elementary explanation of this observation in the context of continuous-time parameter estimation using an ensemble Kalman filter. We employ the frequentist perspective to shed new light on three robust estimation techniques, namely subsampling the data, rough path corrections, and data filtering. We illustrate our findings through a simple numerical experiment.

keywords:
parameter estimation, stochastic differential equations, ensemble Kalman filter, frequentist approach, rough path theory

1 Introduction

In this note, which is an extended version of [30], we consider the well-studied problem of parameter estimation for stochastic differential equations (SDEs) from continuous-time observations $X_t^\dagger$, $t\in[0,T]$ [26]. It is well known that the corresponding maximum likelihood estimator does not depend continuously on the observations $X_t^\dagger$, $t\in[0,T]$, which can result in a systematic estimation bias [28, 15]. In other words, the maximum likelihood estimator is not robust with respect to perturbations in the observations. Here, we revisit this problem from the perspective of online (time-continuous) parameter estimation [7, 12] using the popular ensemble Kalman filter (EnKF) and its continuous-time ensemble Kalman–Bucy filter (EnKBF) formulations [16, 11, 27]. As for the corresponding maximum likelihood approaches, the EnKBF does not depend continuously on the incoming observations $X_t^\dagger$, $t\geq 0$, with respect to the uniform norm topology on the space of continuous functions. This fact was first investigated in [10] using rough path theory [17]. In particular, as already demonstrated for the related maximum likelihood estimator in [15], rough path theory allows one to specify an appropriately generalised topology which leads to a continuous dependence of the EnKBF estimators on the observations.
Here we expand the analysis of [10] to a frequentist analysis of the EnKBF in the spirit of [31], where the primary focus is on the expected behaviour of the EnKBF estimators over all admissible observation paths. One recovers that the discontinuous dependence of the EnKBF estimators on the driving observations results in a systematic bias from a frequentist perspective. This is also a well-known fact for SDEs driven by multiplicative noise [24].

The proposed frequentist perspective naturally enables the study of known bias correction methods, such as subsampling the data [28], a recently proposed data filtering approach [2], as well as novel de-biasing approaches [10] in the context of the EnKBF.

In order to facilitate a rather elementary mathematical analysis, we consider only the much simplified problem of parameter estimation for linear SDEs. This restriction allows us to avoid certain technicalities from rough path theory and enables a rather straightforward application of the numerical rough path approach put forward in [14]. As a result, we are able to demonstrate that the popular approach of subsampling the data [3, 28, 6] can be well justified from a frequentist perspective. The frequentist perspective also suggests a rather natural approach to the estimation of the required correction term when an EnKBF is implemented without subsampling.

We end this introductory paragraph with a reference to [1], which includes a broad survey on alternative estimation techniques. We also point to [10] for an in-depth discussion of rough path theory in connection to filtering and parameter estimation.

The remainder of this paper is structured as follows. The problem setting and the EnKBF are introduced in Section 2. The frequentist perspective and its implications for the specific implementations of an EnKBF in the context of low- and high-frequency data assimilation are laid out in Section 3. The importance of these considerations becomes transparent when applying the EnKBF to perturbed data in Section 4. Here again, we restrict attention to a rather simple model setting taken from [18] and also used in [10]. As a result, we build a clear connection between subsampling and the necessity for a correction term when high-frequency data are assimilated directly. We also provide a discussion of the data filtering approach [2] in the context of our simple model system. A brief numerical demonstration is provided in Section 5, which is followed by a concluding remark in Section 6.

2 Ensemble Kalman parameter estimation

We consider the SDE parameter estimation problem

\[ {\rm d}X_t = f(X_t,\theta)\,{\rm d}t + \gamma^{1/2}\,{\rm d}W_t \tag{1} \]

subject to observations $X_t^\dagger$, $t\in[0,T]$, which arise from the reference system

\[ {\rm d}X_t^\dagger = f^\dagger(X_t^\dagger)\,{\rm d}t + \gamma^{1/2}\,{\rm d}W_t^\dagger, \tag{2} \]

where the unknown drift function $f^\dagger(x)$ typically satisfies $f^\dagger(x) = f(x,\theta^\dagger)$ and $\theta^\dagger$ denotes the true parameter value. Here we assume for simplicity that the unknown parameter is scalar-valued and that the state variable is $d$-dimensional with $d\geq 1$. Furthermore, $W_t$ and $W_t^\dagger$ denote independent standard $d$-dimensional Brownian motions and $\gamma>0$ is the (known) diffusion constant.
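To make the data-generating mechanism concrete, the following minimal sketch simulates a path of the reference system (2) with the Euler–Maruyama scheme. The function name `euler_maruyama`, the linear drift, and all numerical values are illustrative assumptions and are not taken from the text.

```python
import numpy as np

def euler_maruyama(f, x0, gamma, dt, n_steps, rng):
    """Simulate dX = f(X) dt + gamma^{1/2} dW with the Euler-Maruyama scheme."""
    x = np.empty((n_steps + 1, x0.size))
    x[0] = x0
    for n in range(n_steps):
        # Brownian increment over one step: N(0, dt) in each component
        dw = rng.normal(scale=np.sqrt(dt), size=x0.size)
        x[n + 1] = x[n] + f(x[n]) * dt + np.sqrt(gamma) * dw
    return x

rng = np.random.default_rng(0)
theta_true = 1.0                         # assumed true parameter value
f_true = lambda x: -theta_true * x       # assumed drift, for illustration only
path = euler_maruyama(f_true, np.zeros(2), gamma=1.0, dt=1e-3,
                      n_steps=1000, rng=rng)
```

The array `path` then plays the role of the discretely sampled observations $X_{t_n}^\dagger$.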

Following the Bayesian paradigm, we treat the unknown parameter as a random variable $\Theta$. We apply a sequential approach and update $\Theta$ with the incoming data $X_t^\dagger$ as a function of time. Hence we introduce the random variable $\Theta_t$, which obeys the Bayesian posterior distribution given all observations $X_\tau^\dagger$, $\tau\in[0,t]$, up to time $t>0$. Instead of exactly solving the time-continuous Bayesian inference problem as specified by the associated Kushner–Stratonovich equation [7, 27], we define the time evolution of $\Theta_t$ by an application of the (deterministic) ensemble Kalman–Bucy filter (EnKBF) mean-field equations [11, 27], which take the form

\begin{align}
{\rm d}\Theta_t &= \gamma^{-1}\pi_t\left[(\theta-\pi_t[\theta])\otimes f(X_t^\dagger,\theta)\right]{\rm d}I_t, \tag{3a}\\
{\rm d}I_t &= {\rm d}X_t^\dagger - \frac{1}{2}\left(f(X_t^\dagger,\Theta_t)+\pi_t[f(X_t^\dagger,\theta)]\right){\rm d}t, \tag{3b}
\end{align}

where $\pi_t$ denotes the probability density function (PDF) of $\Theta_t$ and $\pi_t[g]$ the associated expectation value of a function $g(\theta)$. The column vector $I_t$, defined by (3b), is called the innovation, while the row vector

\[ K_t(\pi_t) = \gamma^{-1}\pi_t\left[(\theta-\pi_t[\theta])\otimes f(X_t^\dagger,\theta)\right], \tag{4} \]

premultiplying the innovation in (3a) is called the gain. Here the notation $a\otimes b = ab^{\rm T}$, where $a,b$ can be any two column vectors, has been used. The initial condition $\Theta_0\sim\pi_0$ is provided by the prior PDF of the unknown parameter.

A Monte-Carlo implementation of the mean-field equations (3) leads to the interacting particle system

\begin{align}
{\rm d}\Theta_t^{(i)} &= \gamma^{-1}\pi_t^M\left[(\theta-\pi_t^M[\theta])\otimes f(X_t^\dagger,\theta)\right]{\rm d}I_t^{(i)}, \tag{5a}\\
{\rm d}I_t^{(i)} &= {\rm d}X_t^\dagger - \frac{1}{2}\left(f(X_t^\dagger,\Theta_t^{(i)})+\pi_t^M[f(X_t^\dagger,\theta)]\right){\rm d}t, \tag{5b}
\end{align}

$i=1,\ldots,M$, where expectations are now taken with respect to the empirical measure. That is,

\[ \pi_t^M[g] = \frac{1}{M}\sum_{i=1}^M g(\Theta_t^{(i)}) \tag{6} \]

for a given function $g(\theta)$, and all Monte-Carlo samples are driven by the same (fixed) observations $X_t^\dagger$. The initial samples $\Theta_0^{(i)}$, $i=1,\ldots,M$, are drawn identically and independently from the prior distribution $\pi_0$.

We note in passing that there is also a stochastic variant of the innovation process [27] defined by

\[ {\rm d}I_t = {\rm d}X_t^\dagger - f(X_t^\dagger,\Theta_t)\,{\rm d}t - \gamma^{1/2}\,{\rm d}W_t, \tag{7} \]

which leads to the Monte-Carlo approximation

\[ {\rm d}I_t^{(i)} = {\rm d}X_t^\dagger - f(X_t^\dagger,\Theta_t^{(i)})\,{\rm d}t - \gamma^{1/2}\,{\rm d}W_t^{(i)} \tag{8} \]

of the innovation in (5).

Remark 2.1.

There is an intriguing connection to the stochastic gradient descent approach to the estimation of $\theta^\dagger$, as proposed in [32], which is written as

\begin{align}
{\rm d}\theta_t &= \frac{\alpha_t}{\gamma}\nabla_\theta f(X_t^\dagger,\theta_t)\,{\rm d}\tilde{I}_t, \tag{9a}\\
{\rm d}\tilde{I}_t &= {\rm d}X_t^\dagger - f(X_t^\dagger,\theta_t)\,{\rm d}t \tag{9b}
\end{align}

in our notation, where $\alpha_t>0$ denotes the learning rate. We note that (9) shares with (3) the gain-times-innovation structure. However, while (3) approximates the Bayesian inference problem, formulation (9) treats the parameter estimation problem from an optimisation perspective. Both formulations share, however, the discontinuous dependence on the observation path $X_t^\dagger$, and the proposed frequentist analysis of the EnKBF (3) also applies in simplified form to (9). We also point out that (3) is affine invariant [19] and does not require the computation of partial derivatives.

We now state a numerical implementation with step-size $\Delta t>0$ and denote the resulting numerical approximations at $t_n = n\Delta t$ by $\Theta_n\sim\pi_n$, $n\geq 1$. While a standard Euler–Maruyama approximation could be applied, the following stable discrete-time mean-field formulation of the EnKBF

\[ \Theta_{n+1} = \Theta_n + K_n\left\{(X_{t_{n+1}}^\dagger - X_{t_n}^\dagger) - \frac{1}{2}\left(f(X_{t_n}^\dagger,\Theta_n)+\pi_n[f(X_{t_n}^\dagger,\theta)]\right)\Delta t\right\} \tag{10} \]

is inspired by [4] with Kalman gain

\begin{align}
K_n &= \pi_n\left[(\theta-\pi_n[\theta])\otimes f(X_{t_n}^\dagger,\theta)\right]\times \nonumber\\
&\quad \left(\gamma + \Delta t\,\pi_n\left[\left(f(X_{t_n}^\dagger,\theta)-\pi_n[f(X_{t_n}^\dagger,\theta)]\right)\otimes f(X_{t_n}^\dagger,\theta)\right]\right)^{-1}. \tag{11}
\end{align}

It is straightforward to combine this time discretisation with the Monte-Carlo approximation (5) in order to obtain a complete numerical implementation of the EnKBF.
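The combination of the discrete-time update (10) with empirical expectations can be sketched as follows. This is a minimal illustration rather than a reference implementation: the function name `enkbf_step` is hypothetical, the parameter is assumed scalar, and the scalar $\gamma$ inside the inverse of the Kalman gain is interpreted as $\gamma$ times the identity matrix.

```python
import numpy as np

def enkbf_step(theta, x_now, x_next, f, gamma, dt):
    """One deterministic EnKBF update of a scalar-parameter ensemble.

    theta  : (M,) ensemble of parameter samples
    x_now  : (d,) observation at t_n;  x_next : (d,) observation at t_{n+1}
    f      : callable f(x, theta_i) -> (d,) drift vector
    """
    M, d = theta.size, x_now.size
    F = np.stack([f(x_now, th) for th in theta])     # (M, d) drift per member
    theta_mean, F_mean = theta.mean(), F.mean(axis=0)
    # Empirical pi_n[(theta - mean) (x) f]: a row vector of length d
    cov_tf = ((theta - theta_mean)[:, None] * F).mean(axis=0)
    # Empirical pi_n[(f - mean f) (x) f]: a d x d matrix
    cov_ff = (F - F_mean).T @ F / M
    S = gamma * np.eye(d) + dt * cov_ff              # assumption: gamma -> gamma * I
    K = np.linalg.solve(S.T, cov_tf)                 # solves K S = cov_tf
    # Innovation per ensemble member, shape (M, d)
    innov = (x_next - x_now) - 0.5 * (F + F_mean) * dt
    return theta + innov @ K, K

# Illustrative usage with an assumed linear drift f(x, theta) = theta * A x
A = np.array([[-1.0, 0.5], [-0.5, -1.0]])
f_lin = lambda x, th: th * (A @ x)
rng = np.random.default_rng(1)
theta0 = rng.normal(size=50)
x0, x1 = np.array([1.0, -2.0]), np.array([0.9, -1.8])
theta1, K = enkbf_step(theta0, x0, x1, f_lin, gamma=0.5, dt=0.01)
```

All ensemble members share the same gain `K` and the same observed increment; they differ only through their individual drift evaluations in the innovation.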

Remark 2.2.

The rough path analysis of the EnKBF presented in [10] is based on a Stratonovich reformulation of (3) and its appropriate time discretisation. Here we follow the Itô/Euler–Maruyama formulation of the data-driven term in (3),

\[ \int_0^T g(X_t^\dagger,t)\,{\rm d}X_t^\dagger = \lim_{\Delta t\to 0}\sum_{n=0}^{L-1} g(X_{t_n}^\dagger,t_n)\,(X_{t_{n+1}}^\dagger - X_{t_n}^\dagger) \tag{12} \]

for any continuous function $g(x,t)$ and $\Delta t = T/L$, as it corresponds to the standard implementation of the EnKBF and is easier to analyse in the context of this paper.
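The difference between the left-endpoint (Itô) sum in (12) and a symmetric (Stratonovich-type) sum can be illustrated numerically. The sketch below assumes, purely for illustration, that the driving path is a scalar standard Brownian motion $W_t$ and that $g(x,t)=x$; the two sums then differ by half the discrete quadratic variation, which is close to $T/2$.

```python
import numpy as np

rng = np.random.default_rng(0)
T, L = 1.0, 10_000
dt = T / L
dW = rng.normal(scale=np.sqrt(dt), size=L)      # Brownian increments
W = np.concatenate(([0.0], np.cumsum(dW)))      # path values W_{t_0}, ..., W_{t_L}

ito = np.sum(W[:-1] * dW)                       # left-endpoint (Ito) sum
strat = np.sum(0.5 * (W[:-1] + W[1:]) * dW)     # trapezoidal (Stratonovich) sum
quad_var = np.sum(dW**2)                        # discrete quadratic variation

print(ito, strat, 0.5 * quad_var)
```

The gap `strat - ito` equals `0.5 * quad_var` exactly, so the choice of discretisation of the data-driven term matters whenever the path has non-vanishing quadratic variation.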

The EnKBF provides only an approximate solution to the Bayesian inference problem for general nonlinear $f(x,\theta)$. However, it becomes exact in the mean-field limit for affine drift functions $f(x,\theta) = \theta Ax + Bx + c$.

Example 2.3.

Consider the stochastic partial differential equation

\[ \partial_t u = -U\partial_y u + \rho\,\partial_y^2 u + \dot{\mathcal{W}} \tag{13} \]

over a periodic spatial domain $y\in[0,L)$, where $\mathcal{W}(t,y)$ denotes space-time white noise, and $U\in\mathbb{R}$ and $\rho>0$ are given parameters. A standard finite-difference discretisation in space with $d$ grid points and mesh-size $\Delta y$ leads to a linear system of SDEs of the form

\[ {\rm d}{\bf u}_t = -(UD + \rho DD^{\rm T}){\bf u}_t\,{\rm d}t + \Delta y^{-1/2}\,{\rm d}W_t, \tag{14} \]

where ${\bf u}_t\in\mathbb{R}^d$ denotes the vector of grid approximations at time $t$, $D\in\mathbb{R}^{d\times d}$ a finite difference approximation of the spatial derivative $\partial_y$, and $W_t$ the standard $d$-dimensional Brownian motion. We can now set $X_t = {\bf u}_t$, $\gamma = \Delta y^{-1}$ and identify either $\theta = U$ or $\theta = \rho$ as the unknown parameter in order to obtain an SDE of the form (1).
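As an illustration of how the matrices in (14) may be assembled, the sketch below builds a periodic forward-difference matrix $D$. The choice of a forward difference is an assumption, since the text only requires some finite difference approximation of $\partial_y$; the helper name `periodic_forward_difference` and the numerical values are likewise hypothetical.

```python
import numpy as np

def periodic_forward_difference(d, length):
    """Return (D, dy): forward-difference matrix on a periodic grid with d points."""
    dy = length / d
    shift = np.roll(np.eye(d), 1, axis=1)   # (shift @ u)[i] == u[(i + 1) % d]
    return (shift - np.eye(d)) / dy, dy

d, length, U, rho = 8, 1.0, 0.5, 0.1        # illustrative values only
D, dy = periodic_forward_difference(d, length)
drift = -(U * D + rho * D @ D.T)            # matrix multiplying u_t in (14)
laplacian = -D @ D.T                        # approximates the second derivative
gamma = 1.0 / dy                            # diffusion constant gamma = 1 / dy
```

With this choice, $-DD^{\rm T}$ coincides with the standard three-point periodic Laplacian, and the drift matrix is normal because all matrices involved are circulant.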

In this note, we further simplify our given inference problem to the case

\[ f(x,\theta) = \theta Ax, \tag{15} \]

where $A\in\mathbb{R}^{d\times d}$ is a normal matrix with eigenvalues in the left half plane. That is, $\sigma(A)\subset\mathbb{C}_-$. The reference parameter value is set to $\theta^\dagger = 1$. Hence the SDE (2) possesses a Gaussian invariant measure with mean zero and covariance matrix

\[ C = -\gamma(A+A^{\rm T})^{-1}. \tag{16} \]

We assume from now on that the observations $X_t^\dagger$ are realisations of (2) with initial condition $X_0^\dagger\sim{\rm N}(0,C)$.
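As a brief consistency check, which is not part of the original argument, one can verify that (16) solves the stationary Lyapunov equation $AC + CA^{\rm T} + \gamma I = 0$ associated with the linear SDE ${\rm d}X_t = AX_t\,{\rm d}t + \gamma^{1/2}{\rm d}W_t$. Normality of $A$ implies that $A$ and $A^{\rm T}$ commute with $(A+A^{\rm T})^{-1}$, and the eigenvalue condition guarantees that $A+A^{\rm T}$ is invertible:

```latex
\begin{align*}
AC + CA^{\rm T}
  &= -\gamma\left[A(A+A^{\rm T})^{-1} + (A+A^{\rm T})^{-1}A^{\rm T}\right] \\
  &= -\gamma\,(A+A^{\rm T})^{-1}\left[A + A^{\rm T}\right]
   = -\gamma I .
\end{align*}
```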

Under these assumptions, the EnKBF (3) simplifies drastically, and we obtain

\begin{align}
{\rm d}\Theta_t &= \frac{\sigma_t}{\gamma}(AX_t^\dagger)^{\rm T}{\rm d}I_t, \tag{17a}\\
{\rm d}I_t &= {\rm d}X_t^\dagger - \frac{1}{2}\left(\Theta_t+\pi_t[\theta]\right)AX_t^\dagger\,{\rm d}t, \tag{17b}
\end{align}

with variance

\[ \sigma_t = \pi_t\left[(\theta-\pi_t[\theta])^2\right]. \tag{18} \]
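The simplification can be traced in two short steps, sketched here under the assumptions of a scalar parameter and the linear drift (15). The gain reduces to the variance times a fixed row vector, and the drift average in the innovation becomes an average over the parameter only:

```latex
\begin{align*}
\pi_t\left[(\theta-\pi_t[\theta])\otimes f(X_t^\dagger,\theta)\right]
  &= \pi_t\left[(\theta-\pi_t[\theta])\,\theta\right](AX_t^\dagger)^{\rm T}
   = \left(\pi_t[\theta^2]-\pi_t[\theta]^2\right)(AX_t^\dagger)^{\rm T}
   = \sigma_t\,(AX_t^\dagger)^{\rm T},\\
\frac{1}{2}\left(f(X_t^\dagger,\Theta_t)+\pi_t[f(X_t^\dagger,\theta)]\right)
  &= \frac{1}{2}\left(\Theta_t+\pi_t[\theta]\right)AX_t^\dagger .
\end{align*}
```
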
Remark 2.4.

For completeness, we state the corresponding formulation for the stochastic gradient descent approach (2.1):

\[
\begin{aligned}
{\rm d}\theta_t &= \frac{\alpha_t}{\gamma}(AX_t^\dagger)^{\rm T}{\rm d}\tilde{I}_t,\\
{\rm d}\tilde{I}_t &= {\rm d}X_t^\dagger - \theta_t AX_t^\dagger\,{\rm d}t.
\end{aligned}
\]

We find that the learning rate $\alpha_t$ takes the role of the variance $\sigma_t$ in (2). However, we emphasise again that the same pathwise stochastic integrals arise from both formulations, and therefore, the same robustness issue of the resulting estimators $\theta_t$, $t>0$, arises.

Similarly, the discrete-time mean-field EnKBF (10) reduces to

\[
\Theta_{n+1} = \Theta_n + K_n\left\{(X_{t_{n+1}}^\dagger - X_{t_n}^\dagger) - \frac{1}{2}\left(\Theta_n + \pi_n[\theta]\right)AX_{t_n}^\dagger\,\Delta t\right\} \tag{20}
\]

with Kalman gain

\[
K_n = \sigma_n (AX_{t_n}^\dagger)^{\rm T}\left(\gamma + \Delta t\,\sigma_n (AX_{t_n}^\dagger)^{\rm T}AX_{t_n}^\dagger\right)^{-1}. \tag{21}
\]
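To make the update (20) with gain (21) concrete, the following minimal Python sketch applies one step to a finite ensemble of scalar parameters. It is an illustration under stated assumptions rather than the implementation used in this paper: the function names `kalman_gain` and `enkbf_step` are hypothetical, and the mean-field quantities $\pi_n[\theta]$ and $\sigma_n$ are replaced by the empirical ensemble mean and variance.

```python
import numpy as np

def kalman_gain(sigma, A, x, gamma, dt):
    """Kalman gain (21) for a scalar parameter, returned as a length-d vector."""
    Ax = A @ x
    return sigma * Ax / (gamma + dt * sigma * (Ax @ Ax))

def enkbf_step(theta, A, x_now, x_next, gamma, dt):
    """One step of (20) for every ensemble member in `theta` (shape (M,)).

    Assumption: the ensemble mean and variance stand in for pi_n[theta]
    and sigma_n of the mean-field formulation.
    """
    mean = theta.mean()
    sigma = theta.var()
    K = kalman_gain(sigma, A, x_now, gamma, dt)
    dX = x_next - x_now
    Ax = A @ x_now
    # Innovation of member i: dX - 0.5 * (theta_i + mean) * A x * dt
    innovation = dX[None, :] - 0.5 * (theta + mean)[:, None] * Ax[None, :] * dt
    return theta + innovation @ K

# Usage with hypothetical data
rng = np.random.default_rng(0)
theta0 = rng.normal(loc=0.0, scale=2.0, size=100)
A = -np.eye(3)
x_now = np.array([1.0, -0.5, 0.2])
x_next = x_now + 0.01 * rng.normal(size=3)
theta1 = enkbf_step(theta0, A, x_now, x_next, gamma=0.5, dt=0.01)
```

In this sketch each member's deviation from the ensemble mean is multiplied by a factor strictly between $1/2$ and $1$, so the ensemble spread contracts in every step regardless of the observed increment.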

Furthermore, since $X_t^\dagger \sim {\rm N}(0,C)$,

\[
(AX_t^\dagger)^{\rm T}AX_t^\dagger = (A^{\rm T}A):(X_t^\dagger \otimes X_t^\dagger) \approx (A^{\rm T}A):C \tag{22}
\]

for $d \gg 1$, and we may simplify the Kalman gain to

\[
K_n = \sigma_n\,(AX_{t_n}^\dagger)^{\rm T}\left(\gamma + \Delta t\,\sigma_n\,(A^{\rm T}A):C\right)^{-1}. \tag{23}
\]

Here we have used the notation $A:B = \mbox{tr}(A^{\rm T}B)$ to denote the Frobenius inner product of two matrices $A,B \in \mathbb{R}^{d\times d}$. The approximation (22) becomes exact in the limit $d\to\infty$, which we will frequently assume in the following section. Please note that

\[
K_n = \frac{\sigma_n}{\gamma}\,(AX_{t_n}^\dagger)^{\rm T} + \mathcal{O}(\Delta t) \tag{24}
\]

under the stated assumptions.
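The identity and approximation in (22), as well as the expansion (24), can be checked numerically. The sketch below is illustrative only: the matrix `A`, the choice $C = I$, and the helper names `frob`, `kalman_gain` and `gain_gap` are assumptions made for this example and do not come from the paper.

```python
import numpy as np

def frob(P, Q):
    """Frobenius inner product P : Q = tr(P^T Q)."""
    return float(np.sum(P * Q))

def kalman_gain(sigma, A, x, gamma, dt):
    """Kalman gain (21) for a scalar parameter, returned as a length-d vector."""
    Ax = A @ x
    return sigma * Ax / (gamma + dt * sigma * (Ax @ Ax))

rng = np.random.default_rng(0)
d = 400
A = -np.eye(d) + 0.5 * np.eye(d, k=1)   # hypothetical test matrix
x = rng.normal(size=d)                  # one draw of X ~ N(0, C) with C = I (assumed)

M = A.T @ A
quad = (A @ x) @ (A @ x)
# Exact identity in (22): (Ax)^T Ax = (A^T A) : (x (x) x)
identity_gap = abs(quad - frob(M, np.outer(x, x)))
# Relative size of the approximation in (22); expected to shrink as d grows
rel_err = abs(quad - frob(M, np.eye(d))) / frob(M, np.eye(d))

sigma, gamma = 1.0, 1.0

def gain_gap(dt):
    """Norm of K_n - (sigma / gamma) (A x)^T, the O(dt) remainder in (24)."""
    return np.linalg.norm(kalman_gain(sigma, A, x, gamma, dt) - sigma / gamma * (A @ x))

# Halving dt should roughly halve the remainder if it is O(dt)
ratio = gain_gap(1e-6) / gain_gap(5e-7)
print(identity_gap, rel_err, ratio)
```

The remainder in (24) scales with $\Delta t\,\sigma_n\,|AX_{t_n}^\dagger|^2/\gamma$, so the expansion is only useful when this quantity is small, which becomes more demanding as $d$ grows.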

Remark 2.5.

The Stratonovitch reformulation of (2) replaces (2a) by

\[
{\rm d}\Theta_t = \frac{\sigma_t}{\gamma}\left\{(AX^\dagger_t)^{\rm T}\circ{\rm d}I_t - \frac{\gamma}{2}\,\mbox{tr}\,(A)\,{\rm d}t\right\}. \tag{25}
\]

The innovation $I_t$ remains as before. See Appendix B of [10] for more details. An appropriate time discretisation of the innovation-driven term replaces the Kalman gain (21) by

\[
K_{n+1/2} = \sigma_n (AX_{t_{n+1/2}}^\dagger)^{\rm T}\left(\gamma + \Delta t\,\sigma_n (AX_{t_{n+1/2}}^\dagger)^{\rm T}AX_{t_{n+1/2}}^\dagger\right)^{-1}, \tag{26}
\]

where

\[
X_{t_{n+1/2}}^\dagger = \frac{1}{2}\left(X_{t_n}^\dagger + X_{t_{n+1}}^\dagger\right). \tag{27}
\]

Please note that a midpoint discretisation of the data-driven term in (25) results in

\[
\begin{aligned}
(AX_{t_{n+1/2}}^\dagger)^{\rm T}(X_{t_{n+1}}^\dagger - X_{t_n}^\dagger) &= (AX_{t_n}^\dagger)^{\rm T}(X_{t_{n+1}}^\dagger - X_{t_n}^\dagger)\,+\\
&\quad\;\frac{1}{2}A^{\rm T}:(X_{t_{n+1}}^\dagger - X_{t_n}^\dagger)\otimes(X_{t_{n+1}}^\dagger - X_{t_n}^\dagger)
\end{aligned}
\]

and that

\[
\frac{1}{2}A^{\rm T}:(X_{t_{n+1}}^\dagger - X_{t_n}^\dagger)\otimes(X_{t_{n+1}}^\dagger - X_{t_n}^\dagger) \approx \frac{\Delta t\,\gamma}{2}\,\mbox{tr}\,(A), \tag{29}
\]

which justifies the additional drift term in (25). A precise meaning of the approximation in (29) will be given in Remark 3.2 below.
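The relation between the midpoint and left-point discretisations in this remark can be illustrated on simulated data. The sketch below assumes that the data follow ${\rm d}X_t^\dagger = AX_t^\dagger\,{\rm d}t + \gamma^{1/2}{\rm d}W_t^\dagger$, which is consistent with the drift and noise terms appearing in the frequentist analysis; the matrix `A`, the step-size and all variable names are hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(1)
gamma, dt, n_steps = 0.5, 1e-3, 20000
A = np.array([[-1.0, 0.3], [-0.3, -2.0]])   # hypothetical stable matrix

# Euler-Maruyama path of dX = A X dt + gamma^{1/2} dW (assumed data model)
X = np.zeros((n_steps + 1, 2))
for n in range(n_steps):
    X[n + 1] = X[n] + A @ X[n] * dt + np.sqrt(gamma * dt) * rng.normal(size=2)

dX = np.diff(X, axis=0)
X_mid = 0.5 * (X[:-1] + X[1:])                              # midpoint values (27)
left = np.einsum("ni,ni->n", X[:-1] @ A.T, dX)              # (A X_n)^T dX
mid = np.einsum("ni,ni->n", X_mid @ A.T, dX)                # (A X_{n+1/2})^T dX
second_order = 0.5 * np.einsum("ni,ij,nj->n", dX, A, dX)    # 0.5 * A^T : dX (x) dX

T = n_steps * dt
accumulated = second_order.sum()          # summed second-order terms over [0, T]
predicted = 0.5 * gamma * T * np.trace(A) # accumulated drift correction from (29)
print(accumulated, predicted)
```

The difference between `mid` and `left` equals `second_order` exactly at every step, whereas the agreement between `accumulated` and `predicted` is only statistical and improves as the step-size decreases.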

Alternatively, if one wishes to explicitly utilise the availability of continuous-time data $X^\dagger_t$, one could apply the following variant of (20):

\[
\Theta_{n+1} = \Theta_n + \frac{\sigma_n}{\gamma}\int_{t_n}^{t_{n+1}}(AX_t^\dagger)^{\rm T}{\rm d}X_t^\dagger - \frac{1}{2}K_n AX_{t_n}^\dagger\left(\Theta_n + \pi_n[\theta]\right)\Delta t, \tag{30}
\]

and following the Itô/Euler–Maruyama approximation (12), discretise the integral with a small inner step-size $\Delta\tau = \Delta t/L$, $L \gg 1$; that is,

\[
\int_{t_n}^{t_{n+1}}(AX_t^\dagger)^{\rm T}{\rm d}X_t^\dagger \approx \sum_{l=0}^{L-1}(AX_{\tau_l}^\dagger)^{\rm T}(X_{\tau_{l+1}}^\dagger - X_{\tau_l}^\dagger) \tag{31}
\]

with $\tau_l = t_n + l\Delta\tau$. We note that

\[
\begin{aligned}
\sum_{l=0}^{L-1}(AX_{\tau_l}^\dagger)^{\rm T}(X_{\tau_{l+1}}^\dagger - X_{\tau_l}^\dagger) &= (AX_{t_n}^\dagger)^{\rm T}(X_{t_{n+1}}^\dagger - X_{t_n}^\dagger)\,+\\
&\quad\;A^{\rm T}:\left(\sum_{l=0}^{L-1}(X_{\tau_l}^\dagger - X_{t_n}^\dagger)\otimes(X_{\tau_{l+1}}^\dagger - X_{\tau_l}^\dagger)\right),
\end{aligned}
\]

which is at the heart of rough path analysis [14] and which we utilise in the following section.
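The splitting of the fine-grid sum into a first-order increment and an iterated sum is a purely algebraic identity and holds for any sequence of points. The following sketch verifies it numerically for an arbitrary matrix and an arbitrary discrete path; the random test data and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
d, L = 3, 50
A = rng.normal(size=(d, d))
# Fine-grid samples X_{tau_0}, ..., X_{tau_L} within one outer step [t_n, t_{n+1}]
X = np.cumsum(rng.normal(size=(L + 1, d)), axis=0)

dX = np.diff(X, axis=0)
# Left-point sum on the fine grid, as on the right-hand side of (31)
riemann_sum = np.einsum("li,li->", X[:-1] @ A.T, dX)

# First-order part: (A X_{t_n})^T (X_{t_{n+1}} - X_{t_n})
first_order = (A @ X[0]) @ (X[-1] - X[0])
# Iterated sum: sum_l (X_{tau_l} - X_{t_n}) (x) (X_{tau_{l+1}} - X_{tau_l})
iterated = np.einsum("li,lj->ij", X[:-1] - X[0], dX)
# Frobenius product A^T : iterated = tr(A iterated)
second_order = np.sum(A.T * iterated)

print(riemann_sum, first_order + second_order)
```

Only the iterated sum depends on the fine-scale structure of the path between $t_n$ and $t_{n+1}$, which is why it carries the sensitivity to perturbations of the data.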

3 Frequentist analysis

It is well-known that the second-order contribution in (2) leads to a discontinuous dependence of the integral on the observed $X_t^\dagger$ in the uniform norm topology on the space of continuous functions. Rough path theory fixes this problem by defining appropriately extended topologies and has been extended to the EnKBF in [10]. In this section, we complement the path-wise analysis from [10] by an analysis of the impact of the second-order contribution on the EnKBF (2) from a frequentist perspective, which analyses the behaviour of the EnKBF over all possible observations $X_t^\dagger$ subject to (2). In other words, one switches from a strong solution concept to a weak one. While we assume throughout this section that the observations satisfy (2), we will analyse the impact of a perturbed observation process on the EnKBF in Section 4.

We first derive evolution equations for the conditional mean and variance under the assumption that $\Theta_0$ is Gaussian distributed with given prior mean $m_{\rm prior}$ and variance $\sigma_{\rm prior}$. It follows directly from (2) that the conditional mean $\mu_t = \pi_t[\theta]$, that is, the mean of $\Theta_t$, satisfies the SDE

\[
{\rm d}\mu_t = \frac{\sigma_t}{\gamma}\left((AX_t^\dagger)^{\rm T}{\rm d}X^\dagger_t - \mu_t\,(A^{\rm T}A):(X_t^\dagger\otimes X_t^\dagger)\,{\rm d}t\right), \tag{33}
\]

which simplifies to

\[
{\rm d}\mu_t = \frac{\sigma_t}{\gamma}\left((AX_t^\dagger)^{\rm T}{\rm d}X^\dagger_t - \mu_t\,(A^{\rm T}A):C\,{\rm d}t\right), \tag{34}
\]

under the approximation (22). The initial condition is $\mu_0 = m_{\rm prior}$. The evolution equation for the conditional variance, that is, the variance of $\Theta_t$, is given by

\[
\frac{\rm d}{{\rm d}t}\sigma_t = -\frac{\sigma_t^2}{\gamma}\,(A^{\rm T}A):(X_t^\dagger\otimes X_t^\dagger) \tag{35}
\]

with initial condition $\sigma_0 = \sigma_{\rm prior}$ and which again reduces to

\[
\frac{\rm d}{{\rm d}t}\sigma_t = -\frac{\sigma_t^2}{\gamma}\,(A^{\rm T}A):C \tag{36}
\]

under the approximation (22).

We now perform a frequentist analysis of the estimator $\mu_t$ defined by (34) and (36), that is, we perform a weak analysis of the SDE (34) in terms of the first two moments of $\mu_t$ [31]. In the first step, we take the expectation of (34) over all realisations $X_t^\dagger$ of the SDE (2), which we denote by

\[
m_t := \mathbb{E}^\dagger[\mu_t]. \tag{37}
\]

The associated evolution equation is given by

\[
\frac{\rm d}{{\rm d}t}m_t = \frac{\sigma_t}{\gamma}\,(A^{\rm T}A):\mathbb{E}^\dagger\left[X_t^\dagger\otimes X_t^\dagger\right] - \frac{\sigma_t}{\gamma}\,(A^{\rm T}A):C\,m_t, \tag{38}
\]

which reduces to

\[
\frac{\rm d}{{\rm d}t}m_t = \frac{\sigma_t}{\gamma}\,(A^{\rm T}A):C\,(1-m_t) = \sigma_t\,(A^{\rm T}A):(A+A^{\rm T})^{-1}\,(1-m_t). \tag{39}
\]
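Since $(A^{\rm T}A):C$ is constant in time under the approximation (22), the variance equation (36) and the mean equation (39) can be integrated in closed form. The short calculation below is a sketch under that assumption; the abbreviation $\kappa := \gamma^{-1}(A^{\rm T}A):C$ is introduced here for convenience only and is not part of the original notation. The initial value $m_0 = m_{\rm prior}$ follows from $\mu_0 = m_{\rm prior}$.

\[
\begin{aligned}
\frac{\rm d}{{\rm d}t}\frac{1}{\sigma_t} = \kappa
\quad&\Longrightarrow\quad
\sigma_t = \frac{\sigma_{\rm prior}}{1+\sigma_{\rm prior}\,\kappa\,t},\\
\frac{\rm d}{{\rm d}t}(1-m_t) = -\kappa\,\sigma_t\,(1-m_t)
\quad&\Longrightarrow\quad
1-m_t = \frac{1-m_{\rm prior}}{1+\sigma_{\rm prior}\,\kappa\,t}.
\end{aligned}
\]

In this idealised setting the expected estimate approaches the value $1$ at rate $1/t$, and it does so more quickly for larger prior variance $\sigma_{\rm prior}$.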

In the second step, we also look at the frequentist variance

\[
p_t := \mathbb{E}^\dagger\left[(\mu_t - m_t)^2\right]. \tag{40}
\]

Using

\[
\begin{aligned}
{\rm d}(\mu_t - m_t) &= \frac{\sigma_t}{\gamma}\left\{(A^{\rm T}A):\left(X_t^\dagger\otimes X_t^\dagger - C\right){\rm d}t + \gamma^{1/2}(AX_t^\dagger)^{\rm T}{\rm d}W^\dagger_t\right\}\,-\\
&\quad\;\frac{\sigma_t}{\gamma}\,(A^{\rm T}A):C\,(\mu_t - m_t)\,{\rm d}t,
\end{aligned}
\]

we obtain

\[
\begin{aligned}
\frac{\rm d}{{\rm d}t}p_{t} &= -\frac{\sigma_{t}}{\gamma}\,(A^{\rm T}A):C\left(2p_{t}-\sigma_{t}\right) \\
&\qquad + \frac{2\sigma_{t}}{\gamma}\,(A^{\rm T}A):\mathbb{E}^{\dagger}\left[(X_{t}^{\dagger}\otimes X_{t}^{\dagger}-C)\,(\mu_{t}-m_{t})\right],
\end{aligned}
\]

which we simplify to

\[ \frac{\rm d}{{\rm d}t}p_{t} = \frac{\sigma_{t}}{\gamma}\,(A^{\rm T}A):C\left(\sigma_{t}-2p_{t}\right) = \sigma_{t}\,(A^{\rm T}A):(A+A^{\rm T})^{-1}\left(\sigma_{t}-2p_{t}\right) \tag{43} \]

under the approximation (22). The initial conditions are $m_{0}=m_{\rm prior}$ and $p_{0}=0$, respectively. We note that the differential equations (36) and (43) are explicitly solvable. For example, it holds that

\[ \sigma_{t} = \frac{\sigma_{0}}{1+(A^{\rm T}A):(A^{\rm T}+A)^{-1}\,\sigma_{0}t} \tag{44} \]

and one finds that $\sigma_{t}\sim 1/((A^{\rm T}A):(A^{\rm T}+A)^{-1}\,t)$ for $t\gg 1$. It can also be shown that $p_{t}\leq\sigma_{t}$ for all $t\geq 0$. Furthermore, this analysis suggests that the learning rate in the stochastic gradient descent formulation (2.4) should be chosen as

\[ \alpha_{t} = \min\left\{\bar{\alpha},\frac{1}{(A^{\rm T}A):(A^{\rm T}+A)^{-1}\,t}\right\}, \tag{45} \]

where $\bar{\alpha}>0$ denotes an initial learning rate; for example $\bar{\alpha}=\sigma_{0}$.
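The explicit formulas above can be checked numerically. The following is a minimal sketch, not the author's code: the scalar `c` is a hypothetical stand-in for the contraction $(A^{\rm T}A):(A^{\rm T}+A)^{-1}$ and is assumed positive, and the closed form $p_t=\sigma_t(1-\sigma_t/\sigma_0)$ is not stated in the text. It is obtained here by substituting (44) into (43) with $p_0=0$ and is compared against a Runge-Kutta integration of (43).

```python
import numpy as np

def sigma_exact(t, sigma0, c):
    """Closed-form variance (44); c stands for (A^T A):(A^T + A)^{-1} (assumed > 0)."""
    return sigma0 / (1.0 + c * sigma0 * t)

def p_closed_form(t, sigma0, c):
    """Candidate solution of (43) with p_0 = 0 (an assumption checked below)."""
    s = sigma_exact(t, sigma0, c)
    return s * (1.0 - s / sigma0)

def learning_rate(t, alpha_bar, c):
    """Learning rate (45): capped at alpha_bar, decaying like 1/(c t)."""
    t = np.asarray(t, dtype=float)
    with np.errstate(divide="ignore"):
        decay = 1.0 / (c * t)          # equals +inf at t = 0
    return np.minimum(alpha_bar, decay)

def integrate_p(T, n_steps, sigma0, c):
    """Integrate (43), dp/dt = c * sigma_t * (sigma_t - 2 p), with classical RK4."""
    dt = T / n_steps

    def rhs(t, p):
        s = sigma_exact(t, sigma0, c)
        return c * s * (s - 2.0 * p)

    p, t = 0.0, 0.0
    for _ in range(n_steps):
        k1 = rhs(t, p)
        k2 = rhs(t + 0.5 * dt, p + 0.5 * dt * k1)
        k3 = rhs(t + 0.5 * dt, p + 0.5 * dt * k2)
        k4 = rhs(t + dt, p + dt * k3)
        p += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += dt
    return p
```

Under these assumptions the ratio $p_t/\sigma_t = 1-\sigma_t/\sigma_0$ stays below one, which is consistent with the bound $p_t\leq\sigma_t$ quoted above.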

We finally conduct a formal analysis of the ensemble Kalman filter time-stepping (20) and demonstrate that the method is first-order accurate with regard to the implied frequentist mean $m_{t}$. We recall (24) and conclude from (20) that the implied update on the variance $\sigma_{n}$ satisfies

\[ \sigma_{n+1} = \sigma_{n}-\frac{\sigma_{n}^{2}}{\gamma}\,(A^{\rm T}A):C\,\Delta t+\mathcal{O}(\Delta t^{2}), \tag{46} \]

which provides a first-order approximation to (36).
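As a rough illustration of this first-order behaviour, the sketch below iterates the leading-order part of (46) and compares it with the closed form (44). The positive scalar `c` is a hypothetical placeholder for the constant multiplying $\sigma_n^2\Delta t$, and the numerical values are chosen only for illustration.

```python
def euler_variance(T, n_steps, sigma0, c):
    """Iterate sigma_{n+1} = sigma_n - c * sigma_n**2 * dt, the leading part of (46)."""
    dt = T / n_steps
    sigma = sigma0
    for _ in range(n_steps):
        sigma -= c * sigma**2 * dt
    return sigma

def exact_variance(T, sigma0, c):
    """Solution of d(sigma)/dt = -c * sigma**2, matching the form of (44)."""
    return sigma0 / (1.0 + c * sigma0 * T)

# Illustrative values (assumptions, not taken from the paper).
sigma0, c, T = 1.0, 1.0, 1.0
errors = [abs(euler_variance(T, n, sigma0, c) - exact_variance(T, sigma0, c))
          for n in (100, 200, 400)]
# Halving the step size should roughly halve the error for a first-order scheme.
ratios = [errors[i] / errors[i + 1] for i in range(len(errors) - 1)]
```

The list `ratios` is expected to be close to two for each refinement, which is what first-order convergence means in practice.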

We next analyse the evolution equation (34) for the conditional mean $\mu_{t}$ and its numerical approximation

\[ \mu_{n+1} = \mu_{n}+K_{n}\left\{(X_{t_{n+1}}^{\dagger}-X_{t_{n}}^{\dagger})-\mu_{n}AX_{t_{n}}^{\dagger}\Delta t\right\} \tag{47} \]

arising from (20). Here we follow [14] in order to analyse the impact of the data $X_{t}^{\dagger}$ on the estimator. An in-depth theoretical treatment can be found in [10].

Comparing (47) to (34) and utilising (24), we find that the key quantity of interest is

\[ J^{\dagger}_{t_{n},t_{n+1}} := \int_{t_{n}}^{t_{n+1}}(AX_{t}^{\dagger})^{\rm T}{\rm d}X_{t}^{\dagger}, \tag{48} \]

which we can rewrite as

\[ J^{\dagger}_{t_{n},t_{n+1}} = A^{\rm T}:(X^{\dagger}_{t_{n}}\otimes X^{\dagger}_{t_{n},t_{n+1}})+A^{\rm T}:\mathbb{X}_{t_{n},t_{n+1}}^{\dagger}\,. \tag{49} \]

Here, motivated by (2) and following standard rough path notation, we have used

\[ X^{\dagger}_{t_{n},t_{n+1}} := X_{t_{n+1}}^{\dagger}-X_{t_{n}}^{\dagger} \tag{50} \]

and the second-order iterated Itô integral

\[ \mathbb{X}_{t_{n},t_{n+1}}^{\dagger} := \int_{t_{n}}^{t_{n+1}}(X^{\dagger}_{t}-X^{\dagger}_{t_{n}})\otimes{\rm d}X_{t}^{\dagger}. \tag{51} \]
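The decomposition (49) has an exact discrete counterpart when the integrals (48) and (51) are replaced by left-point sums on a fine sub-grid of $[t_n,t_{n+1}]$. The sketch below illustrates this. The helper names, the test path, and the matrix `A` are hypothetical, and the left-point sums are only stand-ins for the Itô integrals.

```python
import numpy as np

def contract(B, C):
    """Frobenius contraction B : C = sum_ij B_ij C_ij."""
    return float(np.sum(B * C))

def left_point_J(A, X):
    """Left-point sum for (48): sum_k (A X_k)^T (X_{k+1} - X_k)."""
    dX = np.diff(X, axis=0)
    return float(np.sum((X[:-1] @ A.T) * dX))

def iterated_sum(X):
    """Left-point sum for (51): sum_k (X_k - X_0) (x) (X_{k+1} - X_k)."""
    dX = np.diff(X, axis=0)
    return (X[:-1] - X[0]).T @ dX

# Hypothetical fine-grid path on one interval [t_n, t_{n+1}], shape (N + 1, d).
rng = np.random.default_rng(2)
X = np.cumsum(rng.normal(scale=0.01, size=(500, 2)), axis=0)
A = np.array([[-1.0, 0.3], [0.2, -0.5]])   # illustrative matrix, not from the paper

lhs = left_point_J(A, X)
rhs = (contract(A.T, np.outer(X[0], X[-1] - X[0]))   # first term of (49)
       + contract(A.T, iterated_sum(X)))             # second term of (49)
```

Here `lhs` and `rhs` agree up to rounding, since $A^{\rm T}:(x\otimes y)=(Ax)^{\rm T}y$ for any vectors $x$ and $y$.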

The difference between the integral (48) and its corresponding approximation in (47) is provided by $A^{\rm T}:\mathbb{X}_{t_{n},t_{n+1}}^{\dagger}$ plus higher-order terms arising from (24). The iterated integral $\mathbb{X}^{\dagger}_{t_{n},t_{n+1}}$ becomes a random variable from the frequentist perspective. Taking note of (2), we find that the drift, $f(x)=Ax$, contributes with terms of order $\mathcal{O}(\Delta t^{2})$ to $\mathbb{X}^{\dagger}_{t_{n},t_{n+1}}$ and the expected value of $\mathbb{X}^{\dagger}_{t_{n},t_{n+1}}$ therefore satisfies

\[ \mathbb{E}^{\dagger}[\mathbb{X}^{\dagger}_{t_{n},t_{n+1}}] = \mathcal{O}(\Delta t^{2}), \tag{52} \]

since $\mathbb{E}^{\dagger}[W^{\dagger}_{t_{n},\tau}]=0$ for $\tau>t_{n}$, and

\[ \mathbb{E}^{\dagger}[\mathbb{W}_{t_{n},t_{n+1}}^{\dagger}] = \frac{1}{2}\mathbb{E}^{\dagger}\left[W^{\dagger}_{t_{n},t_{n+1}}\otimes W^{\dagger}_{t_{n},t_{n+1}}-[W^{\dagger}_{t_{n}},W^{\dagger}_{t_{n},t_{n+1}}]\right]-\frac{\Delta t}{2}I = 0, \tag{53} \]

where we have introduced the commutator

\[ [W_{t_{n}}^{\dagger},W_{t_{n},t_{n+1}}^{\dagger}] := W^{\dagger}_{t_{n}}\otimes W^{\dagger}_{t_{n},t_{n+1}}-W^{\dagger}_{t_{n},t_{n+1}}\otimes W_{t_{n}}^{\dagger}. \tag{54} \]
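The role of the $\Delta t\,I/2$ term in (53) can be illustrated on a discretised Brownian path. The sketch below is an illustration under stated assumptions rather than a proof: it uses left-point sums on a hypothetical sub-grid and only addresses the symmetric part of the iterated sum. That symmetric part equals half of the increment product minus the discrete quadratic variation, and the latter is close to $\Delta t\,I$.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_sub, dt = 2, 10_000, 1.0          # illustrative dimension, sub-steps, interval length

# Brownian increments on the sub-grid and the resulting path, started at zero.
dW = rng.normal(scale=np.sqrt(dt / n_sub), size=(n_sub, d))
W = np.vstack([np.zeros(d), np.cumsum(dW, axis=0)])

# Left-point (Ito-type) iterated sum: sum_k (W_k - W_0) (x) dW_k.
WW = (W[:-1] - W[0]).T @ dW

incr = W[-1] - W[0]                    # increment over the whole interval
quad_var = dW.T @ dW                   # discrete quadratic variation, close to dt * I

# Exact discrete identity: WW + WW^T = incr (x) incr - quad_var.
sym_gap = WW + WW.T - (np.outer(incr, incr) - quad_var)
```

Since the expected value of `np.outer(incr, incr)` is $\Delta t\,I$, the symmetric part of the iterated sum averages to zero, which is the mechanism behind the vanishing expectation in (53).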

Hence we find that, while (47) is not a first-order (strong) approximation of the SDE (34), the approximation becomes first-order in $m_{t}$ when averaged over realisations $X_{t}^{\dagger}$ of the SDE (2). More precisely, one obtains

\[ \mathbb{E}^{\dagger}[J^{\dagger}_{t_{n},t_{n+1}}] = (A^{\rm T}A):C\,\Delta t+\mathcal{O}(\Delta t^{2}). \tag{55} \]

We note that the modified scheme (30) leads to the same time evolution in the variance $\sigma_{n}$ while the update in $\mu_{n}$ is changed to

\[ \mu_{n+1} = \mu_{n}+\frac{\sigma_{n}}{\gamma}\int_{t_{n}}^{t_{n+1}}(AX_{t}^{\dagger})^{\rm T}{\rm d}X_{t}^{\dagger}-K_{n}AX_{t_{n}}^{\dagger}\mu_{n}\Delta t. \tag{56} \]

This modification results in a more accurate evolution of the conditional mean $\mu_{n}$ but, because of (52), it does not affect the evolution of the underlying frequentist mean, $m_{n}=\mathbb{E}^{\dagger}[\mu_{n}]$, to leading order. We summarise our findings in the following proposition.

Proposition 3.1.

The discrete-time EnKBF implementations (20) and (30) both provide first-order approximations to the time evolution of the frequentist mean, $m_{t}$, and the frequentist variance, $p_{t}$. In other words, both methods converge weakly with order one.

We also note that the frequentist uncertainty is essentially data-independent and depends only on the time window $[0,T]$ over which the data is observed. Hence, for a fixed observation interval $[0,T]$, it makes sense to choose the step-size $\Delta t$ such that the discretisation error (bias) remains of the same order of magnitude as $p_{T}^{1/2}\approx\sigma_{T}^{1/2}$. Selecting a much smaller step-size would not significantly reduce the frequentist estimation error in the conditional estimator $\mu_{T}$.

Remark 3.2.

We can now give a precise reformulation of the approximation (29):

\[ \frac{1}{2}\mathbb{E}^{\dagger}\left[A^{\rm T}:(X_{t_{n},t_{n+1}}^{\dagger}\otimes X_{t_{n},t_{n+1}}^{\dagger})\right] = \frac{\Delta t\,\gamma}{2}\,\operatorname{tr}(A)+\mathcal{O}(\Delta t^{2}), \tag{57} \]

which is at the heart of the Stratonovich formulation (25) of the EnKBF [10].

4 Multi-scale data

We now have all the material in place to study the dependency of the EnKBF estimator on a set of observations $X_{t}^{(\epsilon)}$, $\epsilon>0$, which approach the theoretical $X_{t}^{\dagger}$ with respect to the uniform norm topology on the space of continuous functions as $\epsilon\to 0$. Since the second-order contribution in (2), that is (51), does not depend continuously on such perturbations, we demonstrate in this section that a systematic bias arises in the EnKBF. Furthermore, we show how the bias can be eliminated either via subsampling the data, which effectively amounts to ignoring these second-order contributions, or via an appropriate correction term, which ensures a continuous dependence on observations $X_{t}^{(\epsilon)}$ with respect to the uniform norm topology. More specifically, we investigate the impact of a possible discrepancy between the SDE model (1), for which we aim to estimate the parameter $\theta$, and the data generating SDE (2). We therefore replace (2) by the following two-scale SDE [18]:

\[
\begin{aligned}
{\rm d}X^{(\epsilon)}_{t} &= AX^{(\epsilon)}_{t}\,{\rm d}t+\frac{\gamma^{1/2}}{\epsilon}MP^{(\epsilon)}_{t}\,{\rm d}t, \\
{\rm d}P^{(\epsilon)}_{t} &= -\frac{1}{\epsilon}MP^{(\epsilon)}_{t}\,{\rm d}t+{\rm d}W^{\dagger}_{t},
\end{aligned}
\]

where

\[ M = \begin{pmatrix} 1 & \beta \\ -\beta & 1 \end{pmatrix}, \tag{59} \]

$\beta=2$ and $\epsilon=0.01$. The dimension of state space is $d=2$ throughout this section. While we restrict here to the simple two-scale model (4), similar scenarios can arise from deterministic fast-slow systems [25, 8].
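To make the two-scale model concrete, the following sketch generates a path of $(X^{(\epsilon)}_t,P^{(\epsilon)}_t)$ with an Euler-Maruyama discretisation. It is a minimal illustration under several assumptions that are not taken from the text: the function name, the zero initial conditions, the drift matrix `A`, the value of `gamma`, and the step size, which is chosen well below $\epsilon$ so that the explicit scheme remains stable.

```python
import numpy as np

def simulate_two_scale(A, gamma, eps, beta, dt, dW, x0=None, p0=None):
    """Euler-Maruyama discretisation of the two-scale system for (X, P).

    dW has shape (n_steps, 2) and holds Brownian increments with variance dt.
    Returns arrays X and P of shape (n_steps + 1, 2).
    """
    M = np.array([[1.0, beta], [-beta, 1.0]])    # matrix (59)
    n_steps = dW.shape[0]
    X = np.zeros((n_steps + 1, 2))
    P = np.zeros((n_steps + 1, 2))
    if x0 is not None:
        X[0] = x0
    if p0 is not None:
        P[0] = p0
    for n in range(n_steps):
        MP = M @ P[n]
        # Slow variable: linear drift plus fast forcing of size gamma^{1/2} / eps.
        X[n + 1] = X[n] + (A @ X[n] + np.sqrt(gamma) / eps * MP) * dt
        # Fast variable: relaxation on the time scale eps, driven by the noise.
        P[n + 1] = P[n] - MP / eps * dt + dW[n]
    return X, P

# beta and eps follow the text; all other values are assumptions for illustration.
rng = np.random.default_rng(1)
gamma, eps, beta, dt, n_steps = 1.0, 0.01, 2.0, 1.0e-3, 2000
dW = rng.normal(scale=np.sqrt(dt), size=(n_steps, 2))
A = -0.5 * np.eye(2)                              # hypothetical stable drift matrix
X, P = simulate_two_scale(A, gamma, eps, beta, dt, dW)
```

For $A=0$ the sum $X^{(\epsilon)}_t+\gamma^{1/2}P^{(\epsilon)}_t$ reproduces $\gamma^{1/2}W^{\dagger}_t$, both in the continuous model and in this discretisation, which gives a simple sanity check on the implementation.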

The associated EnKBF mean-field equations in the parameter $\Theta_{t}$, which we now denote by $\Theta_{t}^{(\epsilon)}$ in order to explicitly record its dependence on the scale parameter $\epsilon\ll 1$, become

\[
\begin{aligned}
{\rm d}\Theta_{t}^{(\epsilon)} &= \frac{\sigma_{t}^{(\epsilon)}}{\gamma}(AX^{(\epsilon)}_{t})^{\rm T}{\rm d}I_{t}^{(\epsilon)}, \\
{\rm d}I_{t}^{(\epsilon)} &= {\rm d}X_{t}^{(\epsilon)}-\frac{1}{2}\left(\Theta_{t}^{(\epsilon)}+\pi_{t}^{(\epsilon)}[\theta]\right)AX_{t}^{(\epsilon)}{\rm d}t,
\end{aligned}
\]

with variance

\[ \sigma_{t}^{(\epsilon)} = \pi_{t}^{(\epsilon)}\left[(\theta-\pi_{t}^{(\epsilon)}[\theta])^{2}\right] \tag{61} \]

and $\Theta^{(\epsilon)}_{t}\sim\pi_{t}^{(\epsilon)}$. The discrete-time mean-field EnKBF (20) turns into

\[ \Theta_{n+1}^{(\epsilon)} = \Theta_{n}^{(\epsilon)}+K_{n}^{(\epsilon)}\left\{\left(X_{t_{n+1}}^{(\epsilon)}-X_{t_{n}}^{(\epsilon)}\right)-\frac{1}{2}\left(\Theta_{n}^{(\epsilon)}+\pi_{n}^{(\epsilon)}[\theta]\right)AX_{t_{n}}^{(\epsilon)}\Delta t\right\} \tag{62} \]

with Kalman gain

$$K_{n}^{(\epsilon)}=\sigma_{n}^{(\epsilon)}(AX_{t_{n}}^{(\epsilon)})^{\rm T}\left(\gamma+\Delta t\,\sigma_{n}^{(\epsilon)}(AX_{t_{n}}^{(\epsilon)})^{\rm T}AX_{t_{n}}^{(\epsilon)}\right)^{-1}. \tag{63}$$
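The update (62) with gain (63) can be sketched for a finite ensemble as follows. This is only an illustration under stated assumptions: the parameter is taken to be scalar, the mean-field quantities $\pi_{n}^{(\epsilon)}[\theta]$ and $\sigma_{n}^{(\epsilon)}$ are replaced by the empirical ensemble mean and variance, and the function name `enkbf_step` and its argument layout are hypothetical.

```python
import numpy as np

def enkbf_step(theta, x_n, x_np1, A, gamma, dt):
    """One step of (62) with gain (63) for an ensemble of scalar parameters.

    theta      : (M,) ensemble of parameter values (assumed scalar parameter)
    x_n, x_np1 : (d,) observed states at t_n and t_{n+1}
    """
    mean = theta.mean()            # empirical stand-in for pi_n[theta]
    var = theta.var(ddof=1)        # empirical stand-in for sigma_n
    ax = A @ x_n                   # A X_{t_n}
    # Gain (63): gamma + dt * var * |ax|^2 is a scalar, so no matrix inverse is needed.
    gain = var * ax / (gamma + dt * var * (ax @ ax))
    dx = x_np1 - x_n
    # Innovation of (62), one row per ensemble member.
    innov = dx[None, :] - 0.5 * (theta[:, None] + mean) * ax[None, :] * dt
    return theta + innov @ gain
```

If the observed increment is exactly consistent with the current ensemble mean, the mean is left unchanged and only the ensemble spread contracts.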

We also consider the appropriately modified scheme (30):

$$\Theta_{n+1}^{(\epsilon)}=\Theta_{n}^{(\epsilon)}+\frac{\sigma_{n}^{(\epsilon)}}{\gamma}\int_{t_{n}}^{t_{n+1}}(AX_{t}^{(\epsilon)})^{\rm T}{\rm d}X_{t}^{(\epsilon)}-\frac{1}{2}K_{n}^{(\epsilon)}AX_{t_{n}}^{(\epsilon)}\left(\Theta_{n}^{(\epsilon)}+\pi_{n}^{(\epsilon)}[\theta]\right)\Delta t. \tag{64}$$

In order to understand the impact of the modified data generating process on the two mean-field EnKBF formulations (62) and (64), respectively, we follow [18] and investigate the difference between $X^{(\epsilon)}_{t}$ and $X^{\dagger}_{t}$:

$$\begin{aligned}
{\rm d}(X^{(\epsilon)}_{t}-X^{\dagger}_{t}) &= A(X^{(\epsilon)}_{t}-X^{\dagger}_{t})\,{\rm d}t+\frac{\gamma^{1/2}}{\epsilon}MP_{t}^{(\epsilon)}\,{\rm d}t-\gamma^{1/2}\,{\rm d}W_{t}^{\dagger}\\
&= A(X_{t}^{(\epsilon)}-X_{t}^{\dagger})\,{\rm d}t-\gamma^{1/2}\,{\rm d}P_{t}^{(\epsilon)}.
\end{aligned}$$

When $P^{(\epsilon)}_{t}$ is stationary, it is Gaussian with mean zero and covariance

$$\mathbb{E}_{\rm stat}\left[P_{t}^{(\epsilon)}\otimes P_{t}^{(\epsilon)}\right]=\epsilon\,(M+M^{\rm T})^{-1}=\frac{\epsilon}{2}I. \tag{66}$$
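The identity (66) can be checked numerically. The sketch below assumes that $P_{t}^{(\epsilon)}$ follows the linear SDE ${\rm d}P_{t}^{(\epsilon)}=-\epsilon^{-1}MP_{t}^{(\epsilon)}{\rm d}t+{\rm d}W_{t}^{\dagger}$, which is what equating the two forms of the difference equation above implies, so that its stationary covariance $C$ solves the Lyapunov equation $MC+CM^{\rm T}=\epsilon I$. The matrix $M$ used here is a hypothetical choice satisfying $M+M^{\rm T}=2I$, and the helper name is illustrative.

```python
import numpy as np

def stationary_cov(M, eps):
    """Solve M C + C M^T = eps * I for C by vectorisation."""
    d = M.shape[0]
    I = np.eye(d)
    # Row-major vec: vec(M C) = kron(M, I) vec(C) and vec(C M^T) = kron(I, M) vec(C).
    L = np.kron(M, I) + np.kron(I, M)
    c = np.linalg.solve(L, eps * I.reshape(-1))
    return c.reshape(d, d)

beta, eps = 2.0, 0.01                        # illustrative values
M = np.array([[1.0, beta], [-beta, 1.0]])    # hypothetical M with M + M^T = 2 I
C = stationary_cov(M, eps)                   # should equal (eps / 2) * I
```

The skew-symmetric part of $M$ does not affect the stationary covariance in this example; only $M+M^{\rm T}$ does.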

Hence $P^{(\epsilon)}_{t}\rightarrow 0$ as $\epsilon\rightarrow 0$ and also

$$X^{(\epsilon)}_{t}\rightarrow X^{\dagger}_{t} \tag{67}$$

in $L^{2}$ uniformly in $t$, provided $\sigma(A)\subset\mathbb{C}_{-}$ and $X^{(\epsilon)}_{0}=X^{\dagger}_{0}$. This is illustrated in Figure 1.

Figure 1: SDE driven by mathematical vs. physical Brownian motion ($\epsilon=0.01$). The top panel displays both $X_{t}^{\dagger}$ (blue) and $X_{t}^{(\epsilon)}$ (red) over the long time interval $t\in[0,10]$, while the lower panel provides a zoomed-in perspective over the interval $t\in[0,1]$.

In order to investigate the problem further, we study the integral

$$J^{(\epsilon)}_{t_{n},t_{n+1}}:=\int_{t_{n}}^{t_{n+1}}(AX_{t}^{(\epsilon)})^{\rm T}{\rm d}X_{t}^{(\epsilon)} \tag{68}$$

and its relation to (48). As for (48), we can rewrite (68) as

$$J^{(\epsilon)}_{t_{n},t_{n+1}}=A^{\rm T}:(X^{(\epsilon)}_{t_{n}}\otimes X^{(\epsilon)}_{t_{n},t_{n+1}})+A^{\rm T}:\mathbb{X}_{t_{n},t_{n+1}}^{(\epsilon)}. \tag{69}$$
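The decomposition (69) is purely algebraic and already holds for left-endpoint Riemann sums over a sampled path. The following sketch illustrates this. It assumes that $A:B$ denotes the Frobenius product $\sum_{ij}A_{ij}B_{ij}$, the convention under which $A^{\rm T}:(x\otimes y)=(Ax)^{\rm T}y$ and (69) is consistent with (68). The synthetic path and all helper names are illustrative.

```python
import numpy as np

def frob(A, B):
    """Frobenius product A : B = sum_ij A_ij B_ij (assumed convention)."""
    return float(np.sum(A * B))

def riemann_J(A, X):
    """Left-endpoint sum approximating the integral of (A X_t)^T dX_t."""
    dX = np.diff(X, axis=0)
    return float(np.sum((X[:-1] @ A.T) * dX))

def iterated_integral(X):
    """Left-endpoint sum approximating the integral of (X_t - X_{t_n}) (x) dX_t."""
    dX = np.diff(X, axis=0)
    return (X[:-1] - X[0]).T @ dX

rng = np.random.default_rng(0)
A = np.array([[-0.5, -1.0], [1.0, -0.5]])           # illustrative matrix
X = np.cumsum(rng.normal(size=(50, 2)), axis=0)     # synthetic sampled path
J = riemann_J(A, X)
rhs = frob(A.T, np.outer(X[0], X[-1] - X[0])) + frob(A.T, iterated_integral(X))
```

Here `J` and `rhs` agree up to rounding, so any discrepancy between the limits of $J^{(\epsilon)}$ and $J^{\dagger}$ must enter through the iterated integral.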

We now investigate the limit of the second-order iterated integral

$$\begin{aligned}
\mathbb{X}^{(\epsilon)}_{t_{n},t_{n+1}} &= \int_{t_{n}}^{t_{n+1}}X^{(\epsilon)}_{t_{n},t}\otimes{\rm d}X_{t}^{(\epsilon)}\\
&= \frac{1}{2}X_{t_{n},t_{n+1}}^{(\epsilon)}\otimes X_{t_{n},t_{n+1}}^{(\epsilon)}-\frac{1}{2}\int_{t_{n}}^{t_{n+1}}[X_{t_{n},t}^{(\epsilon)},{\rm d}X_{t}^{(\epsilon)}]
\end{aligned}$$

as $\epsilon\to 0$ [18]. Here $[.,.]$ denotes the commutator defined by (54).

Proposition 4.1.

The second-order iterated integral $\mathbb{X}^{(\epsilon)}_{t_{n},t_{n+1}}$ satisfies

$$\lim_{\epsilon\to 0}\mathbb{X}^{(\epsilon)}_{t_{n},t_{n+1}}=\mathbb{X}^{\dagger}_{t_{n},t_{n+1}}+\frac{\Delta t\,\gamma}{2}M \tag{71}$$
Proof 4.2.

The proof follows [18] and can be summarised as follows:

$$\begin{aligned}
\mathbb{X}^{(\epsilon)}_{t_{n},t_{n+1}} &= \int_{t_{n}}^{t_{n+1}}X^{(\epsilon)}_{t_{n},t}\otimes{\rm d}X^{(\epsilon)}_{t}\\
&\rightarrow \int_{t_{n}}^{t_{n+1}}X^{\dagger}_{t_{n},t}\otimes{\rm d}X^{\dagger}_{t}-\gamma^{1/2}\int_{t_{n}}^{t_{n+1}}X^{(\epsilon)}_{t_{n},t}\otimes{\rm d}P^{(\epsilon)}_{t}\\
&= \mathbb{X}^{\dagger}_{t_{n},t_{n+1}}-\gamma^{1/2}X^{(\epsilon)}_{t_{n},t_{n+1}}\otimes P^{(\epsilon)}_{t_{n+1}}+\gamma^{1/2}\int_{t_{n}}^{t_{n+1}}{\rm d}X^{(\epsilon)}_{t}\otimes P^{(\epsilon)}_{t}\\
&\rightarrow \mathbb{X}^{\dagger}_{t_{n},t_{n+1}}+\gamma^{1/2}\int_{t_{n}}^{t_{n+1}}\left\{AX_{t}^{(\epsilon)}+\frac{\gamma^{1/2}}{\epsilon}MP^{(\epsilon)}_{t}\right\}\otimes P^{(\epsilon)}_{t}\,{\rm d}t\\
&\rightarrow \mathbb{X}^{\dagger}_{t_{n},t_{n+1}}+\frac{\Delta t\,\gamma}{\epsilon}M\,\mathbb{E}_{\rm stat}\left[P_{t_{n}}^{(\epsilon)}\otimes P_{t_{n}}^{(\epsilon)}\right]\\
&= \mathbb{X}^{\dagger}_{t_{n},t_{n+1}}+\frac{\Delta t\,\gamma}{2}M.
\end{aligned}$$

As already discussed in detail in [10], Proposition 4.1 implies that the scheme (64) does not, in general, converge to the scheme (30) as $\epsilon\to 0$ since

$$J^{\dagger}_{t_{n},t_{n+1}}=\lim_{\epsilon\to 0}J^{(\epsilon)}_{t_{n},t_{n+1}}-\frac{\Delta t\,\gamma}{2}A^{\rm T}:M. \tag{73}$$
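The size of the correction in (73) depends on the driving noise only through $M$. The short sketch below evaluates it, assuming that $A:B$ denotes the Frobenius product $\sum_{ij}A_{ij}B_{ij}$, so that $A^{\rm T}:M=\mathrm{tr}(AM)$. The matrices, parameter values, and the function name `j_correction` are illustrative and not taken from the text.

```python
import numpy as np

def j_correction(A, M, gamma, dt):
    """Term (dt * gamma / 2) A^T : M that (73) subtracts from the limit of J^(eps)."""
    # Frobenius product of A^T and M (assumed convention for ':').
    return 0.5 * dt * gamma * float(np.sum(A.T * M))

A = np.array([[-0.5, -1.0], [1.0, -0.5]])   # illustrative matrix
M = np.array([[1.0, 2.0], [-2.0, 1.0]])     # hypothetical M with M + M^T = 2 I
corr = j_correction(A, M, gamma=1.0, dt=0.1)
```

Since the trace of a product of a symmetric and a skew-symmetric matrix vanishes, the skew-symmetric part of $M$ contributes to the correction only through the skew-symmetric part of $A$.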

This observation suggests the following modification

$$\begin{aligned}
\Theta_{n+1}^{(\epsilon)} &= \Theta_{n}^{(\epsilon)}+\frac{\sigma_{n}^{(\epsilon)}}{\gamma}\int_{t_{n}}^{t_{n+1}}(AX_{t}^{(\epsilon)})^{\rm T}{\rm d}X_{t}^{(\epsilon)}-\frac{\Delta t}{2}\sigma_{n}^{(\epsilon)}\,A^{\rm T}:M\,-\\
&\qquad\qquad\frac{1}{2}K_{n}^{(\epsilon)}AX_{t_{n}}^{(\epsilon)}\left(\Theta_{n}^{(\epsilon)}+\pi_{n}^{(\epsilon)}[\theta]\right)\Delta t
\end{aligned}$$

to (64). Please note that it follows from (4) that

$$\int_{t_{n}}^{t_{n+1}}(AX_{t}^{(\epsilon)})^{\rm T}{\rm d}X_{t}^{(\epsilon)}=A^{\rm T}:\left(X_{t_{n+1/2}}^{(\epsilon)}\otimes X_{t_{n},t_{n+1}}^{(\epsilon)}-\frac{1}{2}\int_{t_{n}}^{t_{n+1}}[X_{t_{n},t}^{(\epsilon)},{\rm d}X_{t}^{(\epsilon)}]\right). \tag{75}$$
Proposition 4.3.

The discrete-time EnKBF (62) converges to (20) for fixed $\Delta t$ as $\epsilon\to 0$. Similarly, (4) converges to (30) under the same limit.

Proof 4.4.

The first statement follows from $\sigma_{n}^{(\epsilon)}=\sigma_{n}$, the limiting behaviour (67), and

$$\lim_{\epsilon\to 0}K_{n}^{(\epsilon)}=K_{n}. \tag{76}$$

The second statement additionally requires (73) to be substituted into (4) when taking the limit $\epsilon\to 0$.

Remark 4.5.

The analogous adaptation of (4) to the gradient descent formulation (2.4) with $X_{t}^{\dagger}$ replaced by $X_{t}^{(\epsilon)}$ becomes

$$\begin{aligned}
\theta_{n+1}^{(\epsilon)} &= \theta_{n}^{(\epsilon)}+\frac{\alpha_{t_{n}}}{\gamma}\left(\int_{t_{n}}^{t_{n+1}}(AX_{t}^{(\epsilon)})^{\rm T}{\rm d}X_{t}^{(\epsilon)}-\frac{\gamma\Delta t}{2}A^{\rm T}:M\,-\right.\\
&\qquad\qquad\left.\theta_{n}^{(\epsilon)}(AX_{t_{n}}^{(\epsilon)})^{\rm T}AX_{t_{n}}^{(\epsilon)}\Delta t\right).
\end{aligned}$$

Alternatively, one can subsample the data, which leads to the simpler formulation

$$\theta_{n+1}^{(\epsilon)} = \theta_n^{(\epsilon)} + \frac{\alpha_{t_n}}{\gamma} (AX_{t_n}^{(\epsilon)})^{\rm T}\left((X_{t_{n+1}}^{(\epsilon)} - X_{t_n}^{(\epsilon)}) - \theta_n^{(\epsilon)} A X_{t_n}^{(\epsilon)} \Delta t\right). \tag{78}$$
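As a minimal sketch of the subsampled recursion (78), the following Python snippet assumes a scalar parameter, a known matrix $A$, and a constant gain `alpha` in place of $\alpha_{t_n}$. The function names and the array layout are illustrative choices and are not taken from the paper.

```python
import numpy as np

def subsampled_update(theta, x_now, x_next, A, alpha, gamma, dt):
    """One step of the subsampled recursion (78) for a scalar parameter theta."""
    Ax = A @ x_now
    # Increment of the subsampled data minus the model prediction theta * A x * dt
    innovation = (x_next - x_now) - theta * Ax * dt
    return theta + (alpha / gamma) * (Ax @ innovation)

def estimate_theta(x_sub, A, alpha, gamma, dt, theta0=0.0):
    """Run (78) over subsampled data x_sub of shape (N + 1, d).

    Assumption: the gain alpha is held constant, whereas the text allows a
    time-dependent gain alpha_{t_n}.
    """
    theta = theta0
    for x_now, x_next in zip(x_sub[:-1], x_sub[1:]):
        theta = subsampled_update(theta, x_now, x_next, A, alpha, gamma, dt)
    return theta
```

For noise-free increments of the form $\theta^\ast A X_{t_n}\Delta t$, the error $\theta_n - \theta^\ast$ in this sketch is multiplied by $1 - \alpha\gamma^{-1}\|AX_{t_n}\|^2\Delta t$ in each step, so the gain should be chosen such that this factor stays inside $(-1,1)$.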
Remark 4.6.

A two-scale SDE, closely related to (4), has been investigated in [9] in terms of the time-integrated autocorrelation function of $P_t^{(\epsilon)}$ and modified stochastic integrals. In our case, the modified quadrature rule, here denoted by $\diamond$, has to satisfy

$$\int_{t_n}^{t_{n+1}} (AX_t^\dagger)^{\rm T} \diamond {\rm d}X_t^\dagger = \lim_{\epsilon\to 0}\int_{t_n}^{t_{n+1}} (AX_t^{(\epsilon)})^{\rm T}\,{\rm d}X_t^{(\epsilon)}, \tag{79}$$

and it is therefore related to the standard Itô integral via

$$\int_{t_n}^{t_{n+1}} (AX_t^\dagger)^{\rm T} \diamond {\rm d}X_t^\dagger = \int_{t_n}^{t_{n+1}} (AX_t^\dagger)^{\rm T}\,{\rm d}X_t^\dagger + \frac{\Delta t\,\gamma}{2} A^{\rm T} : M. \tag{80}$$

Hence $M$ plays the role of the integrated autocorrelation function of $P_t^{(\epsilon)}$ in our approach. We note that the modified quadrature rule reduces to the standard Stratonovich integral if either $\beta = 0$ in (59) or $A$ is symmetric. While the results from [9] could, therefore, also be used as a starting point for discussing the induced estimation bias, practical implementations would still require knowledge of the integrated autocorrelation function of $P_t^{(\epsilon)}$ or, equivalently, the estimation of $M$ in addition to observing $X_t^{(\epsilon)}$. We address this aspect next.

4.1 Numerical implementation

The numerical implementation of (4) requires an estimator for the generally unknown $M$ in (73). This task is challenging as we only have access to $X_t^{(\epsilon)}$ without any explicit knowledge of the underlying generating process (4). While the estimator proposed in [10] is based on the idea of subsampling the data, the frequentist perspective taken in this note suggests the alternative estimator $M_{\rm est}$ defined by

$$\frac{\Delta t\,\gamma}{2} M_{\rm est} = \mathbb{E}^\dagger\left[\mathbb{X}^{(\epsilon)}_{t_n,t_{n+1}}\right], \tag{81}$$

which follows from (4.2f) and (52). That is, $\mathbb{E}^\dagger[\mathbb{X}^\dagger_{t_n,t_{n+1}}] = \mathcal{O}(\Delta t^2)$ for $\Delta t$ sufficiently small. Note that the second-order iterated integral $\mathbb{X}^{(\epsilon)}_{t_n,t_{n+1}}$ satisfies (4) and is therefore easy to compute. In practice, the frequentist expectation value can be replaced by an approximation along a given single observation path $X^{(\epsilon)}_t$, $t\in[0,T]$, under the assumption of ergodicity.
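The single-path approximation of (81) can be sketched as follows. This is an illustration under stated assumptions: the iterated integral is taken in the convention $\mathbb{X}_{s,t} = \int_s^t (X_r - X_s)\otimes{\rm d}X_r$, the path is sampled on a fine uniform grid, and the integral is approximated by the trapezoidal rule (which is exact for the piecewise-linear interpolant of the samples). The function names are hypothetical.

```python
import numpy as np

def iterated_integral(path):
    """Second-order iterated integral of a sampled path of shape (K + 1, d).

    Assumed convention: integral of (X_r - X_s) (x) dX_r over the window,
    evaluated with the trapezoidal rule.
    """
    increments = np.diff(path, axis=0)                    # X_{k+1} - X_k
    midpoints = 0.5 * (path[:-1] + path[1:]) - path[0]    # segment average minus X_s
    return midpoints.T @ increments                       # sum_k midpoint_k (x) increment_k

def estimate_M(path, steps_per_window, dt, gamma):
    """Time average of the iterated integral over windows of length dt, rescaled as in (81).

    The time average replaces the frequentist expectation, which presumes ergodicity.
    """
    d = path.shape[1]
    n_windows = (len(path) - 1) // steps_per_window
    total = np.zeros((d, d))
    for n in range(n_windows):
        window = path[n * steps_per_window:(n + 1) * steps_per_window + 1]
        total += iterated_integral(window)
    return 2.0 * (total / n_windows) / (dt * gamma)
```

Here `dt` is the outer step-size $\Delta t$ and `steps_per_window` is the number of fine samples per outer step, so that the fine sampling interval is `dt / steps_per_window`.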

An appropriate choice of the outer or sub-sampling step-size $\Delta t$ [28] constitutes an important aspect for the practical implementation of the EnKBF formulation (62) for finite values of $\epsilon > 0$ [27]. Consistency of the second-order iterated integrals [14] implies

$$\mathbb{X}^{(\epsilon)}_{t_n,t_{n+2}} = \mathbb{X}^{(\epsilon)}_{t_n,t_{n+1}} + \mathbb{X}^{(\epsilon)}_{t_{n+1},t_{n+2}} + X^{(\epsilon)}_{t_n,t_{n+1}} \otimes X^{(\epsilon)}_{t_{n+1},t_{n+2}}. \tag{82}$$
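The consistency relation (82) can be checked numerically on sampled data. The sketch below assumes the convention $\mathbb{X}_{s,t} = \int_s^t (X_r - X_s)\otimes{\rm d}X_r$ and a trapezoidal approximation, for which (82) holds up to round-off; the helper names are illustrative only.

```python
import numpy as np

def iterated_integral(path):
    """Trapezoidal approximation of the second-order iterated integral (assumed convention)."""
    increments = np.diff(path, axis=0)
    midpoints = 0.5 * (path[:-1] + path[1:]) - path[0]
    return midpoints.T @ increments

def chen_defect(path, split):
    """Norm of the difference between both sides of (82) for a window split at index `split`."""
    left, right = path[:split + 1], path[split:]
    # Tensor product of the two first-order increments X_{t_n,t_{n+1}} and X_{t_{n+1},t_{n+2}}
    cross = np.outer(left[-1] - left[0], right[-1] - right[0])
    lhs = iterated_integral(path)
    rhs = iterated_integral(left) + iterated_integral(right) + cross
    return np.linalg.norm(lhs - rhs)
```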

A sensible choice of $\Delta t$ is dictated by

$$\mathbb{E}^\dagger\left[X^{(\epsilon)}_{t_n,t_{n+1}} \otimes X^{(\epsilon)}_{t_{n+1},t_{n+2}}\right] = \mathcal{O}(\Delta t^2), \tag{83}$$

that is, the sub-sampled data $X_{t_n}^{(\epsilon)}$ behaves to leading order like solution increments from the reference model (2) at scale $\Delta t$ independent of the specific value of $\epsilon$. Note that, on the other hand,

$$\mathbb{E}^\dagger\left[X^{(\epsilon)}_{\tau_l,\tau_{l+1}} \otimes X^{(\epsilon)}_{\tau_{l+1},\tau_{l+2}}\right] = \mathcal{O}(\epsilon^{-1}\Delta\tau^2) \tag{84}$$

for an inner step-size $\Delta\tau \sim \epsilon$. In other words, a suitable step-size $\Delta t > 0$ can be defined by making

$$h(\Delta t) := \Delta t^{-2}\left\|\mathbb{E}^\dagger\left[X^{(\epsilon)}_{t_n,t_{n+1}} \otimes X^{(\epsilon)}_{t_{n+1},t_{n+2}}\right]\right\| \tag{85}$$

as small as possible while still guaranteeing an accurate numerical approximation in (62).
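A hedged sketch of how the diagnostic (85) might be evaluated from a single observed path follows. It assumes that the expectation is replaced by a time average along the path (which presumes ergodicity), that the path is stored on a fine uniform grid with spacing `fine_dt`, and that the Frobenius norm is used; none of these choices is prescribed by the text.

```python
import numpy as np

def h_diagnostic(path, stride, fine_dt):
    """Empirical version of h(dt) in (85) for the outer step dt = stride * fine_dt.

    path has shape (K + 1, d); every `stride`-th sample is retained.
    """
    dt = stride * fine_dt
    sub = path[::stride]
    inc = np.diff(sub, axis=0)                      # X_{t_n, t_{n+1}}
    # Time average of X_{t_n,t_{n+1}} (x) X_{t_{n+1},t_{n+2}} over consecutive pairs
    corr = inc[:-1].T @ inc[1:] / (len(inc) - 1)
    return np.linalg.norm(corr) / dt**2             # Frobenius norm (assumption)
```

Evaluating `h_diagnostic` over a range of `stride` values and selecting a small outer step for which the value has levelled off is one way to act on the criterion above.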

Remark 4.7.

The choice of the outer time step $\Delta t$ is less critical for the EnKBF formulation (4) since it does not rely on sub-sampling the data and is robust with regard to perturbations in the data provided the appropriate $M$ is explicitly available or has been estimated from the available data using (81). Furthermore, if $A$ is symmetric, then it follows from (75) and the skew-symmetry of the commutator $[\cdot,\cdot]$ that

$$\int_{t_n}^{t_{n+1}} (AX_t^{(\epsilon)})^{\rm T}\,{\rm d}X_t^{(\epsilon)} = A : \left(X^{(\epsilon)}_{t_{n+1/2}} \otimes X^{(\epsilon)}_{t_n,t_{n+1}}\right), \tag{86}$$

which can be used in (4). The same simplification arises when $M$ is symmetric. This insight is at the heart of the geometric rough path approach followed in [10], which starts from the Stratonovich formulation (25) of the EnKBF. See also [29] on the convergence of Wong–Zakai approximations for stochastic differential equations. In all other cases, a more refined numerical approximation of the data-driven integral in (4) is necessary, such as, for example, (31). For that reason, we rely on the Itô/Euler–Maruyama interpretation of (68) in this note instead, that is, the approximation (12).

4.2 Filtered data

We finally discuss a recently proposed [2] robust modification to the parameter estimation problem in the light of the mean-field EnKBF equations considered in this paper. The essential idea is to filter the observation paths $X_t^\dagger$, $t \ge 0$, via

$${\rm d}Z_t^\dagger = \frac{1}{\delta}(X_t^\dagger - Z_t^\dagger)\,{\rm d}t + \delta_{\rm noise}\sqrt{2}\,{\rm d}V_t^\dagger, \tag{87}$$

where $\delta > 0$ is a sufficiently small parameter and $V_t^\dagger$ denotes independent Brownian motion with $\delta_{\rm noise} = 1$ (noise added) or $\delta_{\rm noise} = 0$ (no noise added). Extending the methodology proposed in [2] to the mean-field EnKBF equations (2), we now consider

$$\begin{aligned} {\rm d}\Theta_t &= \frac{\sigma_t}{\gamma}(AZ_t^\dagger)^{\rm T}\,{\rm d}I_t, \\ {\rm d}I_t &= {\rm d}X_t^\dagger - \frac{1}{2}\left(\Theta_t + \pi_t[\theta]\right) A X_t^\dagger\,{\rm d}t, \end{aligned}$$

with the variance $\sigma_t$ defined as before.

Let us first investigate the long-time behaviour of the extended data generating system

$$\begin{aligned} {\rm d}X_t^\dagger &= AX_t^\dagger\,{\rm d}t + \gamma^{1/2}\,{\rm d}W_t^\dagger, \\ {\rm d}Z_t^\dagger &= \frac{1}{\delta}(X_t^\dagger - Z_t^\dagger)\,{\rm d}t + \delta_{\rm noise}\sqrt{2}\,{\rm d}V_t^\dagger, \end{aligned}$$

in some detail. Its stationary distribution is Gaussian with mean $m_\infty^x = m_\infty^z = 0$. The stationary covariance matrices satisfy the relations

$$\begin{aligned} 0 &= A\Sigma_\infty^{xx} + \Sigma_\infty^{xx}A^{\rm T} + \gamma I, \\ 0 &= \Sigma_\infty^{xz} + \Sigma_\infty^{zx} - 2(\Sigma_\infty^{zz} - \delta_{\rm noise}\delta I), \\ 0 &= A\Sigma_\infty^{xz} + \frac{1}{\delta}(\Sigma_\infty^{xx} - \Sigma_\infty^{xz}). \end{aligned}$$
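These stationary relations can be verified numerically by solving the Lyapunov equation of the joint linear system for $(X_t^\dagger, Z_t^\dagger)$. The following sketch assumes that $A$ is a stable matrix (all eigenvalues with negative real part) and solves the Lyapunov equation by plain vectorisation, which is only practical for small dimensions; the function name is illustrative.

```python
import numpy as np

def stationary_covariances(A, gamma, delta, delta_noise):
    """Solve F S + S F^T + Q = 0 for the joint stationary covariance S of (X, Z).

    Returns the blocks (S_xx, S_xz, S_zx, S_zz). Assumes A is stable.
    """
    d = A.shape[0]
    I = np.eye(d)
    Zero = np.zeros((d, d))
    F = np.block([[A, Zero], [I / delta, -I / delta]])          # joint drift matrix
    # Noise covariance: gamma*I for X and 2*delta_noise^2*I for Z
    # (delta_noise^2 equals delta_noise for the values 0 and 1 used in the text)
    Q = np.block([[gamma * I, Zero], [Zero, 2.0 * delta_noise**2 * I]])
    n = 2 * d
    # Row-major vectorisation: vec(F S) = (F kron I) vec(S), vec(S F^T) = (I kron F) vec(S)
    L = np.kron(F, np.eye(n)) + np.kron(np.eye(n), F)
    S = np.linalg.solve(L, -Q.reshape(-1)).reshape(n, n)
    return S[:d, :d], S[:d, d:], S[d:, :d], S[d:, d:]
```

Substituting the returned blocks into the three relations above should give residuals at the level of round-off.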

We note that $\Sigma_\infty^{xx} = C$ with the matrix $C$ defined in (16) and that the symmetric parts of $\Sigma_\infty^{zx}$ and $\Sigma_\infty^{xz}$ are both equal to $\Sigma_\infty^{zz} - \delta_{\rm noise}\delta I$. Hence, following (22), we again make the crucial approximation

$$(A^{\rm T}A) : Z_t^\dagger \otimes X_t^\dagger \approx (A^{\rm T}A) : \Sigma_\infty^{zx} = (A^{\rm T}A) : (\Sigma_\infty^{zz} - \delta_{\rm noise}\delta I) \tag{91}$$

for $d \gg 1$. Let us therefore introduce the shorthand

$$\tilde{C} = \Sigma_\infty^{zz} - \delta_{\rm noise}\delta I. \tag{92}$$

We also note that

$$\tilde{C} = C + \mathcal{O}(\delta). \tag{93}$$

The frequentist analysis from Section 3 delivers

$${\rm d}\mu_t = \frac{\sigma_t}{\gamma}\left((AZ_t^\dagger)^{\rm T}\,{\rm d}X_t^\dagger - \mu_t (A^{\rm T}A) : \tilde{C}\,{\rm d}t\right) \tag{94}$$

for the conditional mean and

$$\frac{\rm d}{{\rm d}t}\sigma_t = -\frac{\sigma_t^2}{\gamma}(A^{\rm T}A) : \tilde{C} \tag{95}$$

for the conditional variance of the random variable $\Theta_t$, as defined by the modified mean-field evolution equations (4.2). Furthermore, we find that $\mu_t$ still provides an asymptotically unbiased estimator since $m_t = \mathbb{E}^\dagger[\mu_t]$ satisfies

$$\frac{\rm d}{{\rm d}t}m_t = \frac{\sigma_t}{\gamma}(A^{\rm T}A) : \tilde{C}\,(1 - m_t). \tag{96}$$

Similarly, the variance $p_t$ of the estimator $\mu_t$ satisfies

$$\frac{\rm d}{{\rm d}t}p_t = \frac{\sigma_t}{\gamma}(A^{\rm T}A) : \tilde{C}\,(\sigma_t - 2p_t). \tag{97}$$

In summary, we find that the modified EnKBF mean-field equations (4.2) behave exactly as the original equations (2) with the only difference that the stationary covariance matrix $C = \Sigma_\infty^{xx}$ is replaced everywhere by (92).
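To illustrate the behaviour of (95)–(97), the three scalar equations can be integrated numerically. The sketch below assumes that the scalar $c = \gamma^{-1}(A^{\rm T}A):\tilde{C}$ is given as a positive constant and uses explicit Euler steps; the function name and the choice of integrator are illustrative.

```python
def integrate_moments(c, sigma0, m0, p0, T, n_steps):
    """Explicit Euler integration of (95)-(97) with c = (A^T A) : C_tilde / gamma.

    Returns (sigma_T, m_T, p_T).
    """
    h = T / n_steps
    sigma, m, p = sigma0, m0, p0
    for _ in range(n_steps):
        dsigma = -c * sigma**2                 # (95)
        dm = c * sigma * (1.0 - m)             # (96)
        dp = c * sigma * (sigma - 2.0 * p)     # (97)
        sigma, m, p = sigma + h * dsigma, m + h * dm, p + h * dp
    return sigma, m, p
```

For constant $c$ the equations can also be solved in closed form: with $u_t = 1 + c\,\sigma_0 t$ one obtains $\sigma_t = \sigma_0/u_t$, $1 - m_t = (1 - m_0)/u_t$ and $p_t = (p_0 + \sigma_0 (u_t - 1))/u_t^2$. Hence $m_t \to 1$ at rate $1/t$, and $p_t/\sigma_t \to 1$ as $t \to \infty$.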

Again following [2], given multi-scale observations $X_t^{(\epsilon)}$, $t \ge 0$, we define the associated filtered process $Z_t^{(\epsilon)}$ via

$${\rm d}Z_t^{(\epsilon)} = \frac{1}{\delta}(X_t^{(\epsilon)} - Z_t^{(\epsilon)})\,{\rm d}t + \delta_{\rm noise}\sqrt{2}\,{\rm d}V_t^\dagger. \tag{98}$$
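A minimal sketch of how the filter (98) might be discretised is given below. It assumes observations on a uniform grid with spacing `h`, an Euler–Maruyama step, and a user-supplied initial value `z0` (defaulting to the first observation); these are illustrative choices rather than the scheme used in the paper.

```python
import numpy as np

def filter_path(x, h, delta, delta_noise=0.0, z0=None, rng=None):
    """Euler-Maruyama discretisation of the filter (98).

    x has shape (K + 1, d) and holds the observations on a grid of spacing h.
    The step is stable for the deterministic part only if h < 2 * delta.
    """
    rng = np.random.default_rng() if rng is None else rng
    z = np.empty(x.shape, dtype=float)
    z[0] = x[0] if z0 is None else z0
    for k in range(len(x) - 1):
        # Brownian increment scaled by delta_noise * sqrt(2), zero if no noise is added
        noise = delta_noise * np.sqrt(2.0 * h) * rng.standard_normal(x.shape[1])
        z[k + 1] = z[k] + (h / delta) * (x[k] - z[k]) + noise
    return z
```

With `delta_noise=0.0` the recursion reduces to an exponential moving average of the observations with relaxation time $\delta$.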

The intriguing observation is that

\[ \tilde{J}_{t_{n},t_{n+1}}^{(\epsilon)}:=\int_{t_{n}}^{t_{n+1}}(AZ_{t}^{(\epsilon)})^{\rm T}{\rm d}X_{t}^{(\epsilon)} \tag{99} \]

converges to

\[ \tilde{J}^{\dagger}_{t_{n},t_{n+1}}:=\int_{t_{n}}^{t_{n+1}}(AZ_{t}^{\dagger})^{\rm T}{\rm d}X_{t}^{\dagger} \tag{100} \]

as $\epsilon\to 0$. This follows from the fact that both integrals can be interpreted as standard Riemann–Stieltjes integrals, so that convergence of $X_{t}^{(\epsilon)}\to X_{t}^{\dagger}$ and $Z_{t}^{(\epsilon)}\to Z_{t}^{\dagger}$ as $\epsilon\to 0$ in the standard topology of continuous functions is sufficient to conclude the convergence of the associated integrals $\tilde{J}_{t_{n},t_{n+1}}^{(\epsilon)}\to\tilde{J}_{t_{n},t_{n+1}}^{\dagger}$. In other words, the extended EnKBF formulation

\[
\begin{aligned}
{\rm d}\Theta_{t}^{(\epsilon)} &= \frac{\sigma_{t}^{(\epsilon)}}{\gamma}(AZ^{(\epsilon)}_{t})^{\rm T}{\rm d}I_{t}^{(\epsilon)},\\
{\rm d}I_{t}^{(\epsilon)} &= {\rm d}X_{t}^{(\epsilon)}-\frac{1}{2}\left(\Theta_{t}^{(\epsilon)}+\pi_{t}^{(\epsilon)}[\theta]\right)AX_{t}^{(\epsilon)}\,{\rm d}t,\\
{\rm d}Z_{t}^{(\epsilon)} &= \frac{1}{\delta}(X_{t}^{(\epsilon)}-Z_{t}^{(\epsilon)})\,{\rm d}t+\delta_{\rm noise}\sqrt{2}\,{\rm d}V_{t}^{\dagger}
\end{aligned}
\]

will converge to the correct parameter value $\theta^{\dagger}=1$ as $t\to\infty$ in the limit $\epsilon\to 0$, that is,

\[ \lim_{t\to\infty}\lim_{\epsilon\to 0}\Theta_{t}^{(\epsilon)}=\theta^{\dagger}. \tag{102} \]

This statement is in line with the results from [2]. The intriguing point is that the data filtering approach does not require knowledge of the rough path correction term implied by (73) while still delivering an unbiased estimator.
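The data filtering step can be illustrated with a short Python sketch. It discretises the filter equation (98) with an Euler–Maruyama step and approximates the integral (99) by a left-point Riemann sum. The function names `filter_path` and `filtered_integral`, the initialisation $Z_0=X_0$, and the choice of discretisation are assumptions made for this illustration and are not taken from the paper.

```python
import numpy as np


def filter_path(x, dtau, delta, delta_noise=0.0, seed=0):
    """Euler-Maruyama sketch of dZ = (X - Z)/delta dt + delta_noise*sqrt(2) dV.

    x has shape (n_steps + 1, d) and holds the observed path on a grid of
    width dtau. The start value Z_0 = X_0 is an assumption of this sketch.
    """
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    z = np.empty_like(x)
    z[0] = x[0]
    for n in range(len(x) - 1):
        # Brownian increment of the filter noise V over one step of size dtau
        dv = np.sqrt(dtau) * rng.standard_normal(x.shape[1:])
        z[n + 1] = (z[n] + (dtau / delta) * (x[n] - z[n])
                    + delta_noise * np.sqrt(2.0) * dv)
    return z


def filtered_integral(A, z, x):
    """Left-point sum approximating the integral of (A Z_t)^T dX_t."""
    dx = np.diff(x, axis=0)          # increments X_{n+1} - X_n
    az = z[:-1] @ A.T                # rows are (A Z_n)^T
    return float(np.sum(az * dx))
```

With `delta_noise=0.0` the filter reduces to a deterministic exponential low-pass filter with time constant `delta`; a step input is then tracked to within about $e^{-10}$ after ten time constants.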

5 Numerical example

We consider the linear SDE (2) with $\gamma=1$ and

\[ A=\frac{-1}{2}\left(\begin{array}{cc}1&-1\\ 1&1\end{array}\right). \tag{103} \]

We find that $C=I$ and $A^{\rm T}A=\frac{1}{2}I$. Hence $(A^{\rm T}A):C=1$, and the posterior variance simply satisfies $\sigma_{t}=\sigma_{0}/(1+\sigma_{0}t)$ according to (44). We set $m_{\rm prior}=0$ and $\sigma_{\rm prior}=4$ for the Gaussian prior distribution of $\Theta_{0}$, and the observation interval is $[0,T]$ with $T=6$. We find that $\sigma_{T}=0.16$. Solving (39) for given $\sigma_{t}$ with initial condition $m_{0}=0$ yields

\[ m_{t}=1-\frac{\sigma_{t}}{\sigma_{0}} \tag{104} \]

and $m_{T}=0.96$. The corresponding curves are displayed in red in Figure 2.
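The numbers quoted above can be checked with a few lines of Python. The sketch below reads the colon in $(A^{\rm T}A):C$ as the Frobenius inner product, which is an assumption about notation that is consistent with the stated value of 1; the variable and function names are chosen for this illustration only.

```python
import numpy as np

# Matrix A from (103) and the stationary covariance C = I stated in the text.
A = -0.5 * np.array([[1.0, -1.0], [1.0, 1.0]])
C = np.eye(2)

AtA = A.T @ A                    # should equal 0.5 * I
contraction = np.sum(AtA * C)    # Frobenius inner product (A^T A):C

sigma0 = 4.0                     # prior variance sigma_prior
T = 6.0                          # length of the observation interval


def sigma(t):
    """Posterior variance sigma_t = sigma_0 / (1 + sigma_0 t)."""
    return sigma0 / (1.0 + sigma0 * t)


def mean(t):
    """Predicted mean m_t = 1 - sigma_t / sigma_0 from (104), with m_0 = 0."""
    return 1.0 - sigma(t) / sigma0


sigma_T = sigma(T)   # 4 / 25
m_T = mean(T)        # 1 - 1 / 25
```

The values of `sigma_T` and `m_T` agree with $\sigma_T=0.16$ and $m_T=0.96$ up to floating-point rounding.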

Figure 2: a)–b): frequentist mean, $m_{t}$, and variance, $p_{t}$, from EnKBF implementation (20) with step-size $\Delta t=0.06$; c)–d): same results from EnKBF implementation (30) with inner time-step $\Delta\tau=\Delta t/600$. We also display the curves arising for $\sigma_{t}$ and $m_{t}$ from the standard Kalman theory using the approximation (22). Note that the posterior variance, $\sigma_{t}$, should provide an upper bound on the frequentist uncertainty $p_{t}$.

We implement the EnKBF schemes (20) and (30) with $t_{n}=n\,\Delta t$. The inner time-step is $\Delta\tau=10^{-4}$ while $\Delta t=0.06$, that is, $L=600$. We repeat the experiment $N=10^{4}$ times and compare the outcome with the predicted mean value of $m_{T}=0.96$ and the posterior variance of $\sigma_{T}=0.16$ in Figure 2. The differences in the computed time evolutions of $m_{t}$ and $p_{t}$ are rather minor and support the idea that it is not necessary to assimilate continuous-time data beyond $\Delta t$. We also find that the simple prediction (104), based on standard Kalman filter theory, is not very accurate for this low-dimensional problem ($d=2$). The corresponding approximation for $\sigma_{t}$ provides, however, a good upper bound for $p_{t}$.
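To make the two time-scales concrete, the following Python sketch generates one data path on the fine grid $\Delta\tau$ and subsamples it on the coarse grid $\Delta t$. It assumes that the linear SDE (2), which is not restated in this section, has the form ${\rm d}X_t=\theta A X_t\,{\rm d}t+\gamma^{1/2}\,{\rm d}W_t$ with $\theta^\dagger=1$; the Euler–Maruyama discretisation, the stationary initial condition $X_0\sim{\rm N}(0,I)$, and the function name `simulate_and_subsample` are likewise assumptions of this illustration. The EnKBF update schemes (20) and (30) themselves are not reproduced here.

```python
import numpy as np


def simulate_and_subsample(theta=1.0, gamma=1.0, T=6.0, dt=0.06,
                           dtau=1e-4, seed=0):
    """Simulate dX = theta*A*X dt + sqrt(gamma) dW on the fine grid dtau
    (assumed form of the data-generating SDE) and subsample at step dt."""
    A = -0.5 * np.array([[1.0, -1.0], [1.0, 1.0]])   # matrix (103)
    L = int(round(dt / dtau))         # inner steps per outer step
    n_outer = int(round(T / dt))      # number of outer steps t_n = n*dt
    n_fine = L * n_outer

    rng = np.random.default_rng(seed)
    x = np.empty((n_fine + 1, 2))
    x[0] = rng.standard_normal(2)     # assumed start from N(0, C) with C = I
    for k in range(n_fine):
        dw = np.sqrt(dtau) * rng.standard_normal(2)   # Brownian increment
        x[k + 1] = x[k] + dtau * theta * (A @ x[k]) + np.sqrt(gamma) * dw

    x_coarse = x[::L]                 # data seen by a scheme with step dt
    return x, x_coarse, L
```

With the default arguments the fine path has 60001 points and the subsampled path has 101 points, corresponding to $L=600$ inner steps per outer step.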

Figure 3: Same experimental setting as in Figure 2 but with the data now generated from the multi-scale SDE (4). Again, subsampling the data in intervals of $\Delta t=0.06$ and high-frequency assimilation with step-size $\Delta\tau=10^{-4}$ lead to very similar results in terms of their frequentist means and variances.

We now replace the data generating SDE model (2) by the multi-scale formulation (4) with $\epsilon=0.01$ and $\beta=2$. This parameter choice agrees with the one used in [10]. We again find that assimilating the data at the slow time-scale $\Delta t=0.06$ leads to results very similar to those obtained from an assimilation at the fast time-scale $\Delta\tau=10^{-4}$ with the EnKBF formulation (4), provided the correction term resulting from the second-order iterated integral (73) is included. See Figure 3. We also verified numerically that $\Delta t=0.06$ constitutes a nearly optimal step-size in the sense of making (85) sufficiently small while maintaining numerical accuracy. For example, reducing the outer step-size to $\Delta t=0.02$ leads to $h(0.02)-h(0.06)\approx 10$ in (85).

We finally implement the data filtering approaches (4.2) and (4.2) with $\delta=0.1$ using the true signal $X_{t}^{\dagger}$ and its multi-scale representation $X_{t}^{(\epsilon)}$, respectively. The numerical implementation with a step-size $\Delta t=\Delta\tau=10^{-4}$ resulted in approximations $\mu_{t_{n}}$ which converged to the true parameter value $\theta^{\dagger}=1$ as $t\to T=6$ without the need for including further correction terms, as expected from the results in Section 4.2.

6 Conclusions

In this follow-up note to [10], we have investigated the impact of subsampling, data filtering, and high-frequency data assimilation on the corresponding conditional mean estimators, $\mu_{t}$, both for data generated from the standard SDE model and a modified multi-scale SDE. A frequentist analysis supports the basic finding that all three approaches lead to comparable results provided that the systematic biases due to different second-order iterated integrals are properly accounted for. While the EnKBF is relatively easy to analyse and a full rough path approach can be avoided, extending these results to the nonlinear feedback particle filter [27, 10] will prove more challenging. Extensions to systems without a strong scale separation [5, 33] and applications to geophysical fluid dynamics [23, 13] are also of interest. In this context, the approximation quality of the proposed estimator (81) and the choice of the step-size $\Delta t$ following (85) (and potentially $\Delta\tau$) will be of particular interest. Finally, while we have investigated the univariate parameter estimation problem, a semi-parametric parametrisation of the drift term $f$ in (1), such as random feature maps [22], leads to high-dimensional parameter estimation problems and their statistics [20, 21]. This provides another fertile direction for future research.

Acknowledgements.

SR has been partially funded by Deutsche Forschungsgemeinschaft (DFG) - Project-ID 318763901 - SFB1294 and Project-ID 235221301 - SFB1114. He would also like to thank Nikolas Nüsken for many fruitful discussions on the subject of this paper.

References

  • Abdulle et al. [2021] A. Abdulle, G. Garegnani, G. A. Pavliotis, A. M. Stuart, and A. Zanoni. Drift estimation of multiscale diffusions based on filtered data. Foundations of Computational Mathematics, published online 2021/10/13:in press, 2021. 10.1007/s10208-021-09541-9.
  • Abdulle et al. [2023] A. Abdulle, G. Garegnani, G. Pavliotis, A. Stuart, and A. Zanoni. Drift estimation of multiscale diffusions based on filtered data. Foundations of Computational Mathematics, 23:33–84, 2023. 10.1007/s10208-021-09541-9.
  • Ait-Sahalia et al. [2005] Y. Ait-Sahalia, P. A. Mykland, and L. Zhang. How often to sample a continuous-time process in the presence of market microstructure noise. The Review of Financial Studies, 18:351–416, 2005.
  • Amezcua et al. [2014] J. Amezcua, E. Kalnay, K. Ide, and S. Reich. Ensemble transform Kalman-Bucy filters. Q.J.R. Meteor. Soc., 140:995–1004, 2014.
  • Arnold [2001] L. Arnold. Hasselmann’s program revisited: The analysis of stochasticity in deterministic climate models. In Stochastic Climate Models, pages 141–158. Birkhäuser Basel, 2001. 10.1007/978-3-0348-8287-3.
  • Azencott et al. [2013] R. Azencott, A. Beri, A. Jain, and I. Timofeyev. Sub-sampling and parametric estimation for multiscale dynamics. Communications in Mathematical Sciences, 11:939–970, 2013.
  • Bain and Crisan [2009] A. Bain and D. Crisan. Fundamentals of Stochastic Filtering, volume 60 of Stoch. Model. Appl. Probab. Springer, New York, 2009. 10.1007/978-0-387-76896-0.
  • Bálint and Melbourne [2018] P. Bálint and I. Melbourne. Statistical properties for flows with unbounded roof function, including the Lorenz attractor. Journal of Statistical Physics, 172:1101–1126, 2018. 10.1007/s10955-018-2093-y.
  • Bo and Celani [2013] S. Bo and A. Celani. White-noise limit of nonwhite nonequilibrium processes. Physical Review E, 88:062150, 2013. 10.1103/PhysRevE.88.062150.
  • Coghi et al. [2023] M. Coghi, T. Nilssen, N. Nüsken, and S. Reich. Rough McKean–Vlasov dynamics for robust ensemble Kalman filtering. Ann. Appl. Probab., 33 (6B):5693–5752, 2023.
  • Cotter and Reich [2013] C. Cotter and S. Reich. Ensemble filter techniques for intermittent data assimilation. Radon Ser. Comput. Appl. Math., 13:91–134, 2013. 10.1515/9783110282269.91.
  • Crisan et al. [2013] D. Crisan, J. Diehl, P. K. Friz, H. Oberhauser, et al. Robust filtering: correlated noise and multidimensional observation. The Annals of Applied Probability, 23:2139–2160, 2013.
  • Culina et al. [2011] J. Culina, S. Kravtsov, and A. H. Monahan. Stochastic parameterization schemes for use in realistic climate models. Journal of the Atmospheric Sciences, 68:284 – 299, 2011. 10.1175/2010JAS3509.1.
  • Davie [2008] A. M. Davie. Differential equations driven by rough paths: An approach via discrete approximation. Applied Mathematics Research eXpress, 2008, 2008. 10.1093/amrx/abm009. abm009.
  • Diehl et al. [2016] J. Diehl, P. Friz, and H. Mai. Pathwise stability of likelihood estimators for diffusion via rough paths. The Annals of Applied Probability, 26:2169–2192, 2016. 10.1214/15-AAP1143.
  • Evensen [2009] G. Evensen. Data assimilation. Springer-Verlag, Berlin, second edition, 2009. ISBN 978-3-642-03710-8. 10.1007/978-3-642-03711-5.
  • Friz and Hairer [2020] P. Friz and M. Hairer. A course on rough paths. Springer-Verlag, 2020.
  • Friz et al. [2015] P. Friz, P. Gassiat, and T. Lyons. Physical Brownian motion in a magnetic field as a rough path. Transactions of the American Mathematical Society, 367:7939–7955, 2015.
  • Garbuno-Inigo et al. [2020] A. Garbuno-Inigo, N. Nüsken, and S. Reich. Affine invariant interacting Langevin dynamics for Bayesian inference. SIAM J. Appl. Dyn. Syst., 19:1633–1658, 2020. 10.1137/19M1304891.
  • Ghosal and van der Vaart [2017] S. Ghosal and A. van der Vaart. Fundamentals of Nonparametric Bayesian Inference. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, 2017. 10.1017/9781139029834.
  • Giné and Nickl [2016] E. Giné and R. Nickl. Mathematical Foundations of Infinite-Dimensional Statistical Models. Cambridge University Press, Cambridge, 2016. 10.1017/CBO9781107337862.
  • Gottwald and Reich [2021] G. A. Gottwald and S. Reich. Supervised learning from noisy observations: Combining machine-learning techniques with data assimilation. Physica D: Nonlinear Phenomena, 423:132911, 2021. ISSN 0167-2789. https://doi.org/10.1016/j.physd.2021.132911.
  • Hasselmann [1976] K. Hasselmann. Stochastic climate models Part I. Theory. Tellus, 28:473–485, 1976. 10.1111/j.2153-3490.1976.tb00696.x.
  • Ikeda and Watanabe [1989] N. Ikeda and S. Watanabe. Stochastic differential equations and diffusion processes. North Holland Publishing Company, Amsterdam-New York, 2nd edition, 1989.
  • Kelly and Melbourne [2017] D. Kelly and I. Melbourne. Deterministic homogenization for fast-slow systems with chaotic noise. Journal of Functional Analysis, 272:4063–4102, 2017. 10.1016/j.jfa.2017.01.015.
  • Kutoyants [2013] Y. A. Kutoyants. Statistical inference for ergodic diffusion processes. Springer Science & Business Media, 2013.
  • Nüsken et al. [2019] N. Nüsken, S. Reich, and P. J. Rozdeba. State and parameter estimation from observed signal increments. Entropy, 21(5):505, 2019. 10.3390/e21050505.
  • Papavasiliou et al. [2009] A. Papavasiliou, G. Pavliotis, and A. Stuart. Maximum likelihood estimation for multiscale diffusions. Stochastic Processes and their Applications, 19:3173–3210, 2009.
  • Pathiraja [2020] S. Pathiraja. $L^{2}$ convergence of smooth approximations of stochastic differential equations with unbounded coefficients, 2020. arXiv:2011.13009.
  • Reich [2023] S. Reich. Frequentist perspective on robust parameter estimation using the ensemble Kalman filter. In B. Chapron, D. Crisan, D. Holm, E. Mémin, and A. Radomska, editors, Stochastic Transport in Upper Ocean Dynamics. STUOD 2021. Mathematics of Planet Earth, volume 10, pages 237–258, Cham., 2023. Springer. 10.1007/978-3-031-18988-3_15.
  • Reich and Rozdeba [2020] S. Reich and P. Rozdeba. Posterior contraction rates for non-parametric state and drift estimation. Foundation of Data Science, 2:333–349, 2020. 10.3934/fods.2020016.
  • Sirignano and Spiliopoulos [2017] J. Sirignano and K. Spiliopoulos. Stochastic gradient descent in continuous time. SIAM J. Financial Math., 8:933–961, 2017. 10.1137/17M1126825.
  • Wouters and Gottwald [2019] J. Wouters and G. A. Gottwald. Stochastic model reduction for slow-fast systems with moderate time scale separation. Multiscale Modeling & Simulation, 17:1172–1188, 2019.