Publicly Available Published by De Gruyter February 21, 2017

Limit theorems for weighted and regular Multilevel estimators

  • Daphné Giorgi, Vincent Lemaire and Gilles Pagès

Abstract

We aim at analyzing, in terms of a.s. convergence and weak rate, the performances of the Multilevel Monte Carlo estimator (MLMC) introduced in [7] and of its weighted version, the Multilevel Richardson–Romberg estimator (ML2R), introduced in [12]. These two estimators make it possible to compute a very accurate approximation of I_0 = 𝔼[Y_0] by a Monte Carlo-type estimator when the (non-degenerate) random variable Y_0 ∈ L²(ℙ) cannot be simulated (exactly) at a reasonable computational cost, whereas a family of simulatable approximations (Y_h)_{h∈ℋ} is available. We carry out these investigations in an abstract framework before applying our results, mainly a Strong Law of Large Numbers and a Central Limit Theorem, to some typical fields of application: discretization schemes of diffusions and nested Monte Carlo.

MSC 2010: 65C05; 65C30

1 Introduction

In recent years, there has been increasing interest in the Multilevel Monte Carlo approach, which delivers remarkable improvements in computational complexity in comparison with standard Monte Carlo in a biased framework. We refer the reader to [8] for a broad outline of the ideas behind the Multilevel Monte Carlo method and various recent generalizations and extensions. In this paper we establish a Strong Law of Large Numbers and a Central Limit Theorem for two kinds of multilevel estimators: the Multilevel Monte Carlo estimator (MLMC) introduced by Giles in [7] and the Multilevel Richardson–Romberg (weighted) estimator (ML2R) introduced in [12]. We consider a rather general, and in some ways abstract, framework which allows us to state these results whatever the strong rate parameter (usually denoted by β) is. To be more precise, we deal with the versions of these estimators designed to achieve a root mean squared error (RMSE) ε, and establish these results as ε → 0. Doing so, we retrieve some recent results established in [2] in the framework of Euler discretization schemes of Brownian diffusions. We also deduce an SLLN and a CLT for Multilevel nested Monte Carlo, which are new results to our knowledge. More generally, our results apply to any implementation of Multilevel Monte Carlo methods.

Let (Ω,𝒜,ℙ) be a probability space and let (Y_h)_{h∈ℋ} be a family of real-valued random variables in L²(ℙ) associated to Y_0, where ℋ = {𝐡/n : n ≥ 1}, such that lim_{h→0} ‖Y_h − Y_0‖_2 = 0. In the sequel, a fixed h will be called the bias parameter (though it appears in a different framework as a discretization parameter). In what follows we will be interested in the computational cost of the estimators, denoted by the Cost(·) function. We assume that the simulation of Y_h has an inverse linear complexity, i.e. Cost(Y_h) = h^{-1}. A natural estimator of I_0 = 𝔼[Y_0] is the standard Monte Carlo estimator, which reads, for a fixed h,

I_h^N = (1/N) Σ_{k=1}^N Y_h^k,  with Cost(I_h^N) = h^{-1} N,

where (Y_h^k)_{k≥1} are i.i.d. copies of Y_h and N is the size of the estimator, which controls the statistical error. In order to define a Multilevel estimator, we consider a depth R ≥ 2 (the finest level of simulation) and a geometrically decreasing sequence of bias parameters h_j = 𝐡/n_j with n_j = M^{j−1}, j = 1,…,R. If N is the estimator size, we consider an allocation policy q = (q_1,…,q_R) such that, at each level j = 1,…,R, we simulate N_j = ⌈N q_j⌉ scenarios (see (1) and (2) below). Thus, we consider R independent copies of the family Y^{(j)} = (Y_h^{(j)})_{h∈ℋ}, j = 1,…,R, attached to independent random copies Y_0^{(j)} of Y_0. Moreover, let (Y^{(j),k})_{k≥1} be independent sequences of independent copies of Y^{(j)}. We denote by I_π^N an estimator of I_0 of size N, attached to a simulation parameter π ∈ Π_d.
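The standard Monte Carlo estimator I_h^N above and its inverse-linear cost can be sketched as follows. This is a toy illustration only: the model sample_Yh, a standard Gaussian Y_0 plus a deterministic bias c_1·h, is hypothetical and not one of the paper's applications.

```python
import random

def crude_mc(sample_Yh, h, N):
    """Crude Monte Carlo estimator I_h^N = (1/N) sum_k Y_h^k,
    with inverse-linear simulation cost Cost(I_h^N) = h^{-1} N."""
    estimate = sum(sample_Yh(h) for _ in range(N)) / N
    cost = N / h
    return estimate, cost

# Hypothetical family: Y_h = Y_0 + c1*h with Y_0 ~ N(0, 1), so I_0 = E[Y_0] = 0
# and the bias of I_h^N is exactly c1*h (weak error rate alpha = 1).
def sample_Yh(h, c1=1.0):
    return random.gauss(0.0, 1.0) + c1 * h

random.seed(42)
est, cost = crude_mc(sample_Yh, h=0.1, N=10_000)
```

The statistical error of the estimate is of order 1/√N, while the bias c_1·h is fixed by the bias parameter; this is the trade-off the multilevel estimators below improve on.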

A standard Multilevel Monte Carlo (MLMC) estimator, as introduced by Giles in [7], reads

(1)  I_π^N = I_{h,R,q}^N = (1/N_1) Σ_{k=1}^{N_1} Y_h^{(1),k} + Σ_{j=2}^R (1/N_j) Σ_{k=1}^{N_j} (Y_{h_j}^{(j),k} − Y_{h_{j−1}}^{(j),k})

with π = (h,R,q).

A Multilevel Richardson–Romberg (ML2R) estimator, as introduced in [12], is a weighted version of (1) which reads

(2)  I_π^N = I_{h,R,q}^N = (1/N_1) Σ_{k=1}^{N_1} Y_h^{(1),k} + Σ_{j=2}^R (𝐖_j^R/N_j) Σ_{k=1}^{N_j} (Y_{h_j}^{(j),k} − Y_{h_{j−1}}^{(j),k})

with π = (h,R,q). The weights (𝐖_j^R)_{j=1,…,R} are explicitly defined as functions of the weak error rate α (see assumption (WE_{α,R̄}) below) and of the refiners n_j, j = 1,…,R, in order to kill the successive bias terms in the weak error expansion (see Section 4.3 for more details on the weights). When there is no ambiguity, we will keep denoting by I_π^N estimators of both classes. We notice that a Crude Monte Carlo estimator of size N formally appears as an ML2R estimator with R = 1, and an MLMC estimator appears as an ML2R estimator in which the weights are set to 𝐖_j^R = 1, j = 1,…,R. Based on the inverse linear complexity of Y_h, it is clear that the simulation cost of both MLMC and ML2R estimators is given by

Cost(I_{h,R,q}^N) = (N/h) Σ_{j=1}^R q_j (n_{j−1} + n_j)

with the convention n_0 = 0. The difference between the costs of the MLMC and ML2R estimators comes from the different choices of the parameters M, R, h, q and N.
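Estimators (1) and (2) can be sketched in a few lines. This is a minimal illustration, not the paper's calibrated implementation: sample_level returns a coarse sample Y_h^{(1),k} when no coarse bias parameter is given, and the coupled difference Y_{h_j}^{(j),k} − Y_{h_{j−1}}^{(j),k} otherwise; the coupled family built by make_sampler is hypothetical. Passing weights=None reproduces MLMC (all weights equal to 1), while passing the ML2R weights 𝐖_j^R gives the weighted version.

```python
import random

def multilevel(sample_level, h, R, M, N_sizes, weights=None):
    """Estimators (1)/(2): coarse-level mean plus weighted correcting-level means.
    N_sizes[j-1] is the number N_j of scenarios at level j; weights=None means
    W_j^R = 1 for all j (MLMC)."""
    W = weights if weights is not None else [1.0] * R
    est = sum(sample_level(h, None) for _ in range(N_sizes[0])) / N_sizes[0]
    for j in range(2, R + 1):
        h_fine, h_coarse = h / M ** (j - 1), h / M ** (j - 2)
        Nj = N_sizes[j - 1]
        est += W[j - 1] * sum(sample_level(h_fine, h_coarse) for _ in range(Nj)) / Nj
    return est

# Hypothetical coupled family: Y_h = Y_0 + c1*h + h^{beta/2}*Z with the SAME
# (Y_0, Z) used at both bias parameters of a level, so level variances decay.
def make_sampler(c1=1.0, beta=1.0):
    def sample(h_fine, h_coarse):
        y0, z = random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)
        Y = lambda h: y0 + c1 * h + h ** (beta / 2) * z
        return Y(h_fine) if h_coarse is None else Y(h_fine) - Y(h_coarse)
    return sample

random.seed(0)
est = multilevel(make_sampler(), h=0.5, R=4, M=2, N_sizes=[4000, 1000, 500, 250])
```

The telescoping structure makes the expectation of the MLMC estimator equal to 𝔼[Y_{h_R}], so only the finest-level bias remains, while most of the simulation budget stays at the cheap coarse level.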

The parameters are calibrated, a root M ≥ 2 being fixed, by minimizing the simulation cost for a given target Mean Squared Error (or L²-error) ε, namely,

(3)  (π(ε), N(ε)) = argmin_{‖I_π^N − I_0‖_2 ≤ ε} Cost(I_π^N).

This calibration has been carried out in [12] for both the MLMC and ML2R estimators under the following assumptions on the family (Y_h)_{h∈ℋ}. The first one, called bias error expansion (or weak error assumption), states:

(WE_{α,R̄})  ∃ α > 0, ∃ R̄ ≥ 1, ∃ (c_r)_{1≤r≤R̄}:  𝔼[Y_h] − 𝔼[Y_0] = Σ_{k=1}^{R̄} c_k h^{αk} + h^{αR̄} η_{R̄}(h),  lim_{h→0} η_{R̄}(h) = 0.

The second one, called the strong approximation error assumption, states:

(SE_β)  ∃ β > 0, ∃ V_1 ≥ 0:  ‖Y_h − Y_0‖_2² = 𝔼[|Y_h − Y_0|²] ≤ V_1 h^β.

Note that the strong error assumption can sometimes be replaced by the sharper

∃ β > 0, ∃ V_1 ≥ 0:  ‖Y_h − Y_{h′}‖_2² = 𝔼[|Y_h − Y_{h′}|²] ≤ V_1 |h − h′|^β,  h, h′ ∈ ℋ.

From now on, we set I_π^{N(ε)} := I_{π(ε)}^{N(ε)}, where π(ε) and N(ε) are close to solutions of (3) (see [12] for the construction of these parameters and Tables 1 and 2 for the explicit values). As mentioned by Duffie and Glynn in [5], the global cost of the standard Monte Carlo estimator with these optimal parameters satisfies

Cost(I_π^{N(ε)}) ≳ K(α) ε^{−(2+1/α)},

where the finite real constant K(α) depends on the structural parameters α, Var(Y_0), 𝐡, and we recall that f(ε) ≳ g(ε) if and only if lim sup_{ε→0} g(ε)/f(ε) ≤ 1. Giles for MLMC in [7] and Lemaire and Pagès for ML2R in [12] showed that, using these optimal parameters, the global cost is upper bounded by a function of ε depending on the weak error expansion rate α and on the strong error rate β. More precisely, for both estimators we have

(4)  Cost(I_π^{N(ε)}) ≲ K(α,β,M) v(ε),

where the finite real constant K(α,β,M) is explicit and differs between MLMC and ML2R (see [12] for more details). Denoting by v_MLMC and v_ML2R the dominating functions in (4) for the MLMC and ML2R estimator, respectively, we obtain two distinct cases. In the case β > 1, both estimators behave like an unbiased Monte Carlo estimator, i.e. v_MLMC(ε) = v_ML2R(ε) = ε^{−2}. In the case β ≤ 1, ML2R is asymptotically much better than MLMC since lim_{ε→0} v_ML2R/v_MLMC = 0. More precisely, we have the following table.

        v_MLMC(ε)              v_ML2R(ε)
β = 1   ε^{−2} log(1/ε)²       ε^{−2} log(1/ε)
β < 1   ε^{−2−(1−β)/α}         ε^{−2} exp(((1−β)/√α) √(2 log(1/ε) log M))
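The dominating functions of the table can be compared numerically; the sketch below encodes them directly for β ≤ 1 (the β < 1 expression for ML2R is taken as the stated reading of the table, which is an assumption of this sketch):

```python
import math

def v_mlmc(eps, alpha, beta):
    """Dominating complexity of MLMC for beta <= 1 (table above)."""
    if beta == 1:
        return eps ** -2 * math.log(1 / eps) ** 2
    return eps ** (-2 - (1 - beta) / alpha)

def v_ml2r(eps, alpha, beta, M=2):
    """Dominating complexity of ML2R for beta <= 1 (table above)."""
    if beta == 1:
        return eps ** -2 * math.log(1 / eps)
    return eps ** -2 * math.exp(
        (1 - beta) / math.sqrt(alpha)
        * math.sqrt(2 * math.log(1 / eps) * math.log(M)))
```

For β = 1 the ratio v_ML2R/v_MLMC equals 1/log(1/ε), and for β < 1 it is ε^{(1−β)/α} times a subpolynomial factor; in both cases it vanishes as ε → 0, as claimed.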

The aim of this paper is to prove a Strong Law of Large Numbers (SLLN) and a Central Limit Theorem (CLT) for both the MLMC and ML2R estimators calibrated with these optimal parameters. First notice that, as these parameters have been computed under the constraint ‖I_π^{N(ε)} − I_0‖_2 ≤ ε, convergence in L² holds by construction. As a consequence, it is straightforward that, for every sequence (ε_k)_{k≥1} such that Σ_{k≥1} ε_k² < +∞,

Σ_{k≥1} 𝔼[|I_π^{N(ε_k)} − I_0|²] < +∞,

so that

I_π^{N(ε_k)} → I_0 a.s. as k → +∞.

We will weaken the assumption on the sequence (ε_k)_{k≥1} when Y_h has higher finite moments, so we will investigate some L^p criteria for p ≥ 2. Moreover, under a sharper strong error assumption and some additional uniform integrability hypotheses, we will show that

(I_π^{N(ε)} − I_0)/ε − m(ε) ⇒ 𝒩(0, σ²) as ε → 0,

with m(ε) = μ(ε)/ε, where μ(ε) = 𝔼[I_π^{N(ε)}] − I_0 is the bias of the estimator, and m² + σ² ≤ 1, owing to the explicit expression of the constraint

(5)  ‖I_π^{N(ε)} − I_0‖_2² = μ(ε)² + Var(I_π^{N(ε)}) ≤ ε².

In particular, we will prove that lim_{ε→0} m(ε) = 0 for the ML2R estimator. More precisely, we will use in the proof the expansion

(I_π^{N(ε)} − I_0)/ε = m(ε) + σ_2 ζ_2^ε + (1/(ε√N(ε))) σ_1 ζ_1^ε as ε → 0,

where ζ_1^ε and ζ_2^ε are two independent variables such that (ζ_1^ε, ζ_2^ε) ⇒ 𝒩(0, I_2) as ε → 0. We will see that ζ_1^ε comes from the coarse level of the estimator, while ζ_2^ε derives from the sum of the refined levels. When β > 1, ε√N(ε) converges to a constant, hence the variance σ² results from the sum of the variance σ_1² of the coarse level and the variance σ_2² of the sum of the refined levels. When β ∈ (0,1], since ε√N(ε) diverges, the contribution of the coarse level to σ² disappears and only the variance of the refined levels contributes to σ². More details on m and σ will follow in Section 3.

The paper is organized as follows. In Section 2 we briefly recall the technical background for Multilevel Monte Carlo estimators. In Section 3 we state our main results: a Strong Law of Large Numbers and a Central Limit Theorem in a quite general framework. Section 4 is devoted to the analysis of the asymptotic behavior of the optimal parameters, to the study of the weights of the ML2R estimator, and to the bias of the estimators and its robustness. These are auxiliary results needed for the proofs of the main theorems, which we detail in Section 5. In Section 6 we apply these results first to discretization schemes of Brownian diffusions, where we retrieve recent results by Ben Alaya and Kebaier in [2], and then to nested Monte Carlo.

Notations.

  1. Let ℕ* = {1,2,…} denote the set of positive integers and ℕ = ℕ* ∪ {0}.

  2. For every x ∈ ℝ_+ = [0,+∞), ⌈x⌉ denotes the unique n ∈ ℕ* satisfying n − 1 < x ≤ n.

  3. If (a_n)_n and (b_n)_n are two sequences of real numbers, a_n ∼ b_n if a_n = ε_n b_n with lim_n ε_n = 1, a_n = O(b_n) if (ε_n)_n is bounded, and a_n = o(b_n) if lim_n ε_n = 0.

  4. We denote by Var(X) and σ(X) the variance and the standard deviation of a random variable X, respectively.

2 Brief background on MLMC and ML2R estimators

We follow [12] and briefly recall the construction of the optimal parameters derived from the optimization problem (3). The first step is a stratification procedure allowing us to establish the optimal allocation policy (q_1,…,q_R) when the other parameters R, h, M are fixed. We focus on the effort of the estimator, defined as the product of the cost times the variance, i.e. Effort(I_π^N) = Cost(I_π^N) × Var(I_π^N). Introducing the notations

Z_j = (hM^{−(j−1)})^{−β/2} (Y_{hM^{−(j−1)}}^{(j)} − Y_{hM^{−(j−2)}}^{(j)}) for all j ≥ 2,  and  Z_1 = Y_h^{(1)},

a Multilevel estimator, MLMC (1) or ML2R (2), can be written as

I_π^N = (1/N_1) Σ_{k=1}^{N_1} Z_1^k + Σ_{j=2}^R (1/N_j) Σ_{k=1}^{N_j} W_j^R (hM^{−(j−1)})^{β/2} Z_j^k,

where W_j^R = 1 for MLMC and W_j^R = 𝐖_j^R for ML2R. By definition, and using the approximation N_j ≈ N q_j, the effort satisfies

Effort(I_π^N) ≈ (Σ_{j=1}^R q_j Cost(Z_j)) (Var(Y_h)/q_1 + Σ_{j=2}^R (W_j^R)² (hM^{−(j−1)})^β Var(Z_j)/q_j).

Given R, h, M, a minimization of q ↦ Effort(I_π^N) over {q ∈ (0,1)^R : Σ_{j=1}^R q_j = 1} gives the solution

(6)  q_1* = μ* √(Var(Y_h)/Cost(Y_h)),  q_j* = μ* (hM^{−(j−1)})^{β/2} |W_j^R| √(Var(Z_j)/Cost(Z_j)),  j = 2,…,R,  with μ* such that Σ_{j=1}^R q_j = 1,

using the Schwarz inequality (see [12, Theorem 3.6] for a detailed proof). The strong error assumption (SE_β) allows us to upper bound Var(Y_h) and Var(Z_j) by Var(Y_0)(1+θh^{β/2})², with θ = √(V_1/Var(Y_0)), and by V_1(1+M^{β/2})², respectively. On the other hand, we assume that Cost(Z_j) = (1+M^{−1})(hM^{−(j−1)})^{−1}. Plugging these estimates into (6), we obtain the optimal allocation policy used in this paper and given in Tables 1 and 2. Notice that this particular choice of the q_j is not unique: if we replace (SE_β) by a different strong error assumption, for example the sharp version, then we have to replace the upper bound for Var(Z_j) by V_1|1−M^{β/2}|², and a new expression for the q_j follows. In the same spirit, Cost(Z_j) may differ and hence have an impact on the q_j; see [6] or the nested Monte Carlo method as examples of alternative costs.

The second step is to select h(ε) and R(ε) ≥ 2 so as to minimize the cost of the optimally allocated estimator given a prescribed RMSE ε > 0. To do this, we use the weak error assumption (WE_{α,R̄}) and obtain, for the MLMC estimator,

h(ε) = 𝐡/⌈𝐡 (1+2α)^{1/(2α)} |c_1|^{1/α} ε^{−1/α} M^{−(R−1)}⌉

with c_1 the first coefficient in the weak error expansion. For the ML2R estimator we make the additional assumption c̃ = lim_R |c_R|^{1/R} ∈ (0,+∞) and then obtain

h(ε) = 𝐡/⌈𝐡 (1+2αR)^{1/(2αR)} c̃^{1/α} ε^{−1/(αR)} M^{−(R−1)/2}⌉.

The depth parameter R ≥ 2 follows, and the choice of N is directly related to the constraint (5).

We report in Tables 1 and 2 the ML2R and MLMC values of R(ε), h(ε), q(ε) = (q_1(ε),…,q_R(ε)) and N(ε) computed in [12] and used throughout this paper. Note that these parameters are used in the web application of the LPMA at http://simulations.lpma-paris.fr/multilevel/. The following constants are used in this paper and in Tables 1 and 2:

θ = √(V_1/Var(Y_0))  and  c̃ = lim_R |c_R|^{1/R} ∈ (0,+∞)

and

C̲_{M,β} = (1+M^{β/2})/√(1+M^{−1})  and  C̄_{M,β} = (1+M^{β/2})√(1+M^{−1}).

Notice that 1+M^{β/2} comes from (SE_β) and √(1+M^{−1}) from the cost; hence the constants C̲_{M,β} and C̄_{M,β} depend on them, but not on anything else.

Table 1

Optimal parameters for the ML2R estimator.

R(ε)  ⌈1/2 + log(c̃^{1/α}𝐡)/log M + √((1/2 + log(c̃^{1/α}𝐡)/log M)² + 2 log(A/ε)/(α log M))⌉,  A = √(1+4α)
h(ε)  𝐡/⌈𝐡 (1+2αR)^{1/(2αR)} c̃^{1/α} ε^{−1/(αR)} M^{−(R−1)/2}⌉
q(ε)  q_1 = μ*(1+θh^{β/2}),  q_j = μ* θh^{β/2} C̄_{M,β} |𝐖_j^{(R,M)}| M^{−(1+β)(j−1)/2},  j = 2,…,R,  Σ_{1≤j≤R} q_j = 1
N(ε)  ⌈(1 + 1/(2αR)) (Var(Y_0)/(ε²μ*)) (1 + θh^{β/2} + θh^{β/2} C̄_{M,β} Σ_{j=2}^R |𝐖_j^{(R,M)}| M^{(1−β)(j−1)/2})⌉
Table 2

Optimal parameters for the MLMC estimator.

R(ε)  ⌈1 + log(|c_1|^{1/α}𝐡)/log M + log(A/ε)/(α log M)⌉,  A = √(1+2α)
h(ε)  𝐡/⌈𝐡 (1+2α)^{1/(2α)} |c_1|^{1/α} ε^{−1/α} M^{−(R−1)}⌉
q(ε)  q_1 = μ*(1+θh^{β/2}),  q_j = μ* θh^{β/2} C̄_{M,β} M^{−(1+β)(j−1)/2},  j = 2,…,R,  Σ_{1≤j≤R} q_j = 1
N(ε)  ⌈(1 + 1/(2α)) (Var(Y_0)/(ε²μ*)) (1 + θh^{β/2} + θh^{β/2} C̄_{M,β} Σ_{j=2}^R M^{(1−β)(j−1)/2})⌉

In what follows, we shorten these notations by setting

(7)  R(ε) = ⌈C_R^{(1)} + √(C_R^{(2)} + (2/(α log M)) log(1/ε))⌉

with

C_R^{(1)} = 1/2 + log(c̃^{1/α}𝐡)/log M  and  C_R^{(2)} = (1/2 + log(c̃^{1/α}𝐡)/log M)² + 2 log A/(α log M)

for ML2R, and

(8)  R(ε) = ⌈C_R^{(1)} + (1/(α log M)) log(1/ε)⌉

with

C_R^{(1)} = 1 + log(|c_1|^{1/α}𝐡)/log M + log A/(α log M)

for MLMC.
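The depth formulas (7)–(8) can be sketched directly. In this sketch the constants A = √(1+4α) (ML2R) and A = √(1+2α) (MLMC) are taken as the intended reading of Tables 1 and 2, and c_tilde, c1, h_bar stand for the user-supplied estimates ĉ, ĉ_1 and the coarsest bias parameter 𝐡; the floor at 2 enforces R ≥ 2:

```python
import math

def R_ml2r(eps, alpha, M, c_tilde=1.0, h_bar=1.0):
    """Depth R(eps) for ML2R, eq. (7): ceil(C1 + sqrt(C2 + 2 log(1/eps)/(alpha log M)))."""
    A = math.sqrt(1 + 4 * alpha)
    C1 = 0.5 + math.log(c_tilde ** (1 / alpha) * h_bar) / math.log(M)
    C2 = C1 ** 2 + 2 * math.log(A) / (alpha * math.log(M))
    return max(2, math.ceil(C1 + math.sqrt(C2 + 2 * math.log(1 / eps) / (alpha * math.log(M)))))

def R_mlmc(eps, alpha, M, c1=1.0, h_bar=1.0):
    """Depth R(eps) for MLMC, eq. (8): ceil(C1 + log(1/eps)/(alpha log M))."""
    A = math.sqrt(1 + 2 * alpha)
    C1 = 1 + math.log(abs(c1) ** (1 / alpha) * h_bar) / math.log(M) + math.log(A) / (alpha * math.log(M))
    return max(2, math.ceil(C1 + math.log(1 / eps) / (alpha * math.log(M))))
```

Note the different growth: the ML2R depth grows like √(log(1/ε)) while the MLMC depth grows like log(1/ε), which is the source of the complexity gain for β ≤ 1.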

3 Main results

The asymptotic behavior, as ε goes to 0, of the parameters given in Tables 1 and 2 will be presented in Section 4. We proceed here to the analysis of the asymptotic behavior of the estimator I_π^{N(ε)} := I_{π(ε)}^{N(ε)} as ε → 0.

3.1 Strong Law of Large Numbers

We will first prove a Strong Law of Large Numbers, namely

Theorem 3.1 (Strong Law of Large Numbers)

Let p ≥ 2. Assume (WE_{α,R̄}) for all R̄ ≥ 1 and Y_0 ∈ L^p. Assume furthermore the following L^p-strong error rate assumption:

(9)  ∃ β > 0, ∃ V_1^{(p)} ≥ 0:  ‖Y_h − Y_0‖_p^p = 𝔼[|Y_h − Y_0|^p] ≤ V_1^{(p)} h^{βp/2},  h ∈ ℋ.

Then, for every sequence of positive real numbers (ε_k)_{k≥1} such that Σ_{k≥1} ε_k^p < +∞, both MLMC and ML2R estimators satisfy

(10)  I_π^{N(ε_k)} → I_0 a.s. as k → +∞.

3.2 Central Limit Theorems

A necessary condition for a Central Limit Theorem to hold is that the ratio between the variance of the estimator and ε² converges as ε → 0. It seems intuitive that (SE_β) should then be reinforced by a sharper estimate as h → 0. We define

(11)  Z(h) := (h/M)^{−β/2} (Y_{h/M} − Y_h)  and  Z_j := Z(𝐡/n_{j−1}).

A necessary condition to obtain a CLT is that (Z(h))_{h∈ℋ} be L²-uniformly integrable. We state two results, the first one in the case β > 1 and the second one in the case β ≤ 1.

3.2.1 Case β>1

In this case, note that owing to (SE_β) we have sup_{j≥1} Var(Z_j) ≤ V_1 (1+M^{β/2})².

Theorem 3.2 (Central Limit Theorem, β > 1)

Assume (SE_β) for some β > 1 and that (Z(h))_{h∈ℋ} is L²-uniformly integrable. We set

σ_1² = (1/Σ) Var(Y_𝐡)/(Var(Y_0)(1+θ𝐡^{β/2}))  and  σ_2² = (1/Σ) (𝐡^{β/2}/(√(Var(Y_0)V_1) C̄_{M,β})) Σ_{j≥2} M^{(1−β)(j−1)/2} Var(Z_j)

with

Σ = Σ(M,β,θ,𝐡) = 1 + θ𝐡^{β/2} (1 + C̄_{M,β} M^{(1−β)/2}/(1 − M^{(1−β)/2})).

Then the following statements hold.

  1. ML2R estimator: Assume (WE_{α,R̄}) for all R̄ ≥ 1. Then

    (12)  (I_π^{N(ε)} − I_0)/ε ⇒ 𝒩(0, σ_1² + σ_2²) as ε → 0.
  2. MLMC estimator: Assume (WE_{α,R̄}) for R̄ = 1. Then there exists, for every ε > 0, m(ε) such that M^{−α}/√(1+2α) ≤ |m(ε)| ≤ 1/√(1+2α) and

    (I_π^{N(ε)} − I_0)/ε − m(ε) ⇒ 𝒩(0, (2α/(2α+1))(σ_1² + σ_2²)) as ε → 0.

Note that the variance of the first term Y_𝐡, associated to the coarse level, contributes to the asymptotic variance of the estimator through σ_1², while the variances Var(Z_j), j ≥ 2, of the correcting levels contribute through σ_2². The ML2R estimator is asymptotically unbiased, whereas the MLMC estimator has an a priori non-vanishing bias term. This gain on the bias for ML2R is balanced by the variance, which is reduced by a factor 2α/(1+2α) for MLMC. The constraint (5) yields σ_1² + σ_2² ≤ 1, which is easy to verify if we recall that

Var(Y_𝐡) ≤ Var(Y_0)(1+θ𝐡^{β/2})²,  Var(Z_j) ≤ V_1(1+M^{β/2})²  and  m(ε)² ≤ 1/(1+2α).

3.2.2 Case β ∈ (0,1]

In this case, we make the additional, sharper assumption that lim_{h→0} ‖Z(h)‖_2² = v(M,β). This assumption allows us to identify lim_{j→+∞} Var(Z_j). More precisely, note that owing to the consistency of the strong and weak error rates, 2α ≥ β, and owing to (WE_{α,R̄}), we have

𝔼[Z_j] = (𝐡/n_j)^{−β/2} 𝔼[Y_{𝐡/n_j} − Y_{𝐡/n_{j−1}}] = c_1 (1−M^α) (𝐡/n_j)^{α−β/2} + o((𝐡/n_j)^{α−β/2}),

so that

Var(Z_j) = ‖Z(𝐡/n_{j−1})‖_2² − c_1² (1−M^α)² (𝐡/n_j)^{2α−β} + o((𝐡/n_j)^{2α−β}).

We conclude that

lim_{j→+∞} Var(Z_j) = v(M,β) if 2α > β,  and  v(M,β) − c_1²(1−M^{β/2})² if 2α = β.
Theorem 3.3 (Central Limit Theorem, 0 < β ≤ 1)

Assume (SE_β) for some β ∈ (0,1]. Assume that (Z(h))_{h∈ℋ} is L²-uniformly integrable and, furthermore, that lim_{h→0} ‖Z(h)‖_2² = v(M,β). We set

(13)  σ² = v(M,β) (1+M^{β/2})^{−2} V_1^{−1} if 2α > β,  and  σ² = (v(M,β) − c_1²(1−M^{β/2})²) (1+M^{β/2})^{−2} V_1^{−1} if 2α = β.

Then the following statements hold.

  1. ML2R estimator: Assume (WE_{α,R̄}) for all R̄ ≥ 1. Then

    (14)  (I_π^{N(ε)} − I_0)/ε ⇒ 𝒩(0, σ²) as ε → 0.
  2. MLMC estimator: Assume (WE_{α,R̄}) for R̄ = 1 and that 2α > β when β < 1. Then there exists, for every ε > 0, m(ε) such that M^{−α}/√(1+2α) ≤ |m(ε)| ≤ 1/√(1+2α) and

    (I_π^{N(ε)} − I_0)/ε − m(ε) ⇒ 𝒩(0, (2α/(2α+1)) σ²) as ε → 0.

We will see in the proof that the asymptotic variance corresponds to the variance associated with the correcting levels.

3.3 Practitioner’s corner

In the proof of Theorems 3.2 and 3.3 we will obtain the more precise expansion

(I_π^{N(ε)} − I_0)/ε = m(ε) + Σ_2 ζ_2^ε + (1/(ε√N(ε))) Σ_1 ζ_1^ε as ε → 0,

where ζ_1^ε and ζ_2^ε are two independent variables such that (ζ_1^ε, ζ_2^ε) ⇒ 𝒩(0, I_2) as ε → 0, and the real values Σ_1 and Σ_2 depend on whether we are in the MLMC or the ML2R case and on the value of β. Fundamentally, Σ_1 comes from the variance of the coarse level and Σ_2 from the sum of the variances of the correcting levels.

When β > 1, we will prove in Lemma 4.5 that ε√N(ε) converges to a constant as ε → 0, hence both the coarse and the refined levels contribute to the asymptotic variance of the estimator.

When β ≤ 1, we will see that (ε√N(ε))^{−1} → 0 as ε → 0, so that, asymptotically, the variance of the coarse level fades and only the refined levels contribute to the asymptotic variance. Still, it is commonly known in the Multilevel framework that the coarse level is the one with the biggest size (in terms of N_j), hence this term is not really negligible. We can resolve this apparent contradiction by observing the inverse convergence rate to 0, namely ε√N(ε). It is equivalent, up to a constant, to √R(ε) when β = 1 and to M^{(1−β)R(ε)/4} when β < 1.

For ML2R, owing to the expression of R(ε) given in (7), ε√N(ε) ∼ C (log(1/ε))^{1/4}, where C is a positive constant, when β = 1, and ε√N(ε) = o(ε^{−η}) for every η > 0 when β < 1. Hence the convergence rate to 0 of (ε√N(ε))^{−1} is very slow. By contrast, Σ_1 ≫ Σ_2, since Σ_1 is related to the variance of the coarse level, which roughly approximates the value of interest, whereas Σ_2 is related to the variance of the refined levels, supposed to be smaller a priori. Hence the product (ε√N(ε))^{−1} Σ_1 turns out not to be negligible with respect to Σ_2 for the values of the RMSE ε usually prescribed in applications.

For MLMC, we get ε√N(ε) ∼ C √(log(1/ε)), C a positive constant, for β = 1 and ε√N(ε) ∼ C ε^{−(1−β)/(4α)} for β < 1. Hence, when β ≤ 1, the slow convergence phenomenon is still observed, though it is less significant.

Impact of the weights 𝐖_j^R, j = 1,…,R, on the asymptotic behavior of the ML2R estimator.

When β ≥ 1, one observes that neither the rate of convergence nor the asymptotic variance of the estimator depends in any way upon the weights 𝐖_j^R, j = 1,…,R. If β < 1, the dependence is somewhat hidden, through the multiplicative constant of ε^{−2} M^{(1−β)R(ε)/2} in the asymptotic of N(ε) (see Lemma 4.5 for more details). However, at finite range, the weights may have an impact on the variance of the estimator, bearing in mind, however, that, by construction, the depth of the ML2R estimator is lower than that of the MLMC estimator, which tempers this effect.

4 Auxiliary results

This section contains some useful results for the proofs of the Strong Law of Large Numbers and of the Central Limit Theorem. In more detail, we investigate the asymptotic behavior as ε → 0 of the optimal parameters given in Tables 1 and 2 and of the bias of the estimators, and we analyze the weights of the ML2R estimator.

4.1 Asymptotic of the bias parameter and of the depth

An important property of the MLMC and ML2R estimators is that h(ε) → 𝐡 and R(ε) → +∞ as ε → 0. The saturation of the bias parameter at 𝐡 is not intuitively obvious; indeed, it is well known that h(ε) → 0 as ε → 0 for the Crude Monte Carlo estimator. Still, this is a good property, because h = 𝐡 is the choice which minimizes the simulation cost of the variable Y_h, which we recall is inverse linear in h. First of all, we retrace the computations that led to the choice of the optimal h*(ε) and R*(ε), starting with the ML2R estimator. We define

h(ε,R) = (1+2αR)^{−1/(2αR)} |c_R|^{−1/(αR)} ε^{1/(αR)} M^{(R−1)/2}

and recall that this is the optimized bias parameter found in [12] at R fixed. Since the value of c_R is unknown, it is necessary to make the assumption |c_R|^{1/R} → c̃ as R → +∞, and |c_R|^{−1/(αR)} is replaced by c̃^{−1/α}. The value of c̃ is also unknown, and in the simulations we have to take an estimate of c̃, which we write ĉ. We follow the lines of [12] and define the polynomial

(15)  P(R) = (R(R−1)/2) log M − R log K − (1/α) log(√(1+4α)/ε),

where K = ĉ^{1/α} 𝐡. We denote by R_+(ε) the positive zero of P. The optimal value for the depth of the ML2R estimator is R*(ε) = ⌈R_+(ε)⌉. We notice that P(R*(ε)) ≥ 0, R*(ε) → +∞ as ε → 0, and R* is nondecreasing in ĉ. We can rewrite

h(ε,R) = ((1+4α)/(1+2αR))^{1/(2αR)} (ĉ/|c_R|^{1/R})^{1/α} e^{P(R)/R} 𝐡.

We notice that

h(ε,R_+) = ((1+4α)/(1+2αR_+))^{1/(2αR_+)} (ĉ/|c_{R_+}|^{1/R_+})^{1/α} 𝐡.

The optimal choice for the bias parameter is the projection of h(ε,R*(ε)) on the set ℋ = {𝐡/n : n ∈ ℕ*}, which reads

h*(ε) = 𝐡/⌈𝐡 h(ε,R*(ε))^{−1}⌉.

When we replace |c_R|^{1/R} by ĉ, we finally obtain

h*(ε) = 𝐡/⌈𝐡 (1+2αR*)^{1/(2αR*)} ĉ^{1/α} ε^{−1/(αR*)} M^{−(R*−1)/2}⌉ = 𝐡/⌈𝐡 (h(ε,R*) (|c_{R*}|^{1/R*} ĉ^{−1})^{1/α})^{−1}⌉.

Let us analyze the term inside the ceiling:

𝐡 (h(ε,R*) (|c_{R*}|^{1/R*} ĉ^{−1})^{1/α})^{−1} = ((1+4α)/(1+2αR*))^{−1/(2αR*)} e^{−P(R*)/R*}.

Since P(R*) ≥ 0 and since, for R large enough, ((1+4α)/(1+2αR))^{−1/(2αR)} ≤ 1, it follows that, up to reducing ε̄,

(16)  ((1+4α)/(1+2αR*))^{−1/(2αR*)} e^{−P(R*)/R*} ≤ 1 for all ε ∈ (0,ε̄),

which yields

⌈𝐡 (h(ε,R*) (|c_{R*}|^{1/R*} ĉ^{−1})^{1/α})^{−1}⌉ = 1  and  h*(ε) = 𝐡.

For MLMC we may follow the same reasoning, starting from h(ε,R) = (1+2α)^{−1/(2α)} |c_1|^{−1/α} ε^{1/α} M^{R−1}. We have just shown the following:

Proposition 4.1

There exists ε̄ > 0 such that h*(ε) = 𝐡 for all ε ∈ (0,ε̄].

In what follows, we will always assume that ε ∈ (0,ε̄], so that h*(ε) = 𝐡. This threshold ε̄ may be reduced line to line in what follows.

As ε → 0, we have R = R*(ε) → +∞ at the rate √((2/(α log M)) log(1/ε)) in the ML2R case and (1/(α log M)) log(1/ε) in the MLMC case.

4.2 Asymptotic of the bias and robustness

As part of a Central Limit Theorem, we will be faced with the quantity μ(𝐡,R(ε),M)/ε, where

μ(𝐡,R(ε),M) = 𝔼[I_π^{N(ε)}] − I_0

is the bias of the estimator. This leads us to analyze carefully its asymptotic behavior as ε → 0. Under assumption (WE_{α,R̄}), the bias of a Crude Monte Carlo estimator reads

μ(h) = c_1 h^α (1 + η_1(h)),  lim_{h→0} η_1(h) = 0.

The bias of Multilevel estimators is dramatically reduced compared to the Crude Monte Carlo one; more precisely, the following proposition is proved in [12]:

Proposition 4.2

The following statements hold.

  1. MLMC: Assume (WE_{α,R̄}) with R̄ = 1. Then

    μ(h,R,M) = c_1 (hM^{−(R−1)})^α (1 + η_1(hM^{−(R−1)}))

    with lim_{h→0} η_1(h) = 0.

  2. ML2R: Assume (WE_{α,R̄}) for all R̄ ≥ 1. Then

    μ(h,R,M) = (−1)^{R−1} c_R (h^R M^{−R(R−1)/2})^α (1 + η_{R,n̄}(h)),

    where η_{R,n̄}(h) = (−1)^{R−1} M^{αR(R−1)/2} Σ_{r=1}^R 𝐰_r n_r^{−αR} η_R(h/n_r) with lim_{h→0} η_R(h) = 0.

We notice that the ML2R estimator requires, and takes full advantage of, a higher-order expansion of the bias error (WE_{α,R̄}), whereas the MLMC estimator only needs a first-order expansion. As the computations were made under the constraint ‖I_π^N − I_0‖_2 ≤ ε, we clearly have |μ(𝐡,R(ε),M)|/ε ≤ 1. We focus our attention on the constants c̃ and c_1, which a priori we do not know and which we replace in the simulations by ĉ = ĉ_1 = 1. If we plug the values h(ε) = 𝐡 and R(ε) into the formulas for the bias, owing to (15) and (16) we get, for ML2R,

|μ(𝐡,R(ε),M)| = |c_{R(ε)}| (𝐡^{R(ε)} M^{−R(ε)(R(ε)−1)/2})^α = |c_{R(ε)}| 𝐡^{αR(ε)} (e^{αP(R(ε))} K^{αR(ε)})^{−1} ε/√(1+4α)
= (|c_{R(ε)}|/ĉ^{R(ε)}) e^{−αP(R(ε))} ε/√(1+4α) ≤ (|c_{R(ε)}|/ĉ^{R(ε)}) ε/√(1+2αR(ε))

and, for MLMC,

|c_1/ĉ_1| (M^{−α}/√(1+2α)) ε < |μ(𝐡,R(ε),M)| ≤ |c_1/ĉ_1| (1/√(1+2α)) ε.

We set m(ε) := |μ(𝐡,R(ε),M)|/ε. Hence, taking the true values ĉ = c̃ and ĉ_1 = c_1, we get

(17)  lim_{ε→0} m(ε) = 0 for ML2R,  and  M^{−α}/√(1+2α) < lim_{ε→0} m(ε) ≤ 1/√(1+2α) for MLMC.

For the ML2R estimator, if c_R has polynomial growth in R, we have

lim_{R→+∞} |c_R|^{1/R} = 1

and ĉ = 1 corresponds to the exact value of c̃. If the growth of c_R is slower than polynomial, the convergence to 0 in (17) still holds. The only uncertain case is when c_R grows faster than polynomially. Then, if ĉ ≥ |c_R|^{1/R}, |μ(ε)|/ε goes to 0 faster than 1/√(1+2αR(ε)); but if we had taken ĉ < 1, we would have obtained

lim_{R→+∞} |c_R|/ĉ^R = +∞,

hence ĉ < 1 is definitely not a good choice. In conclusion, whenever the growth of c_R is at most polynomial, ĉ = 1 remains a good choice. When the growth is faster than polynomial, it is better to overestimate ĉ than to underestimate it. The remarkable fact is that, when choosing ĉ, we are not forced to have a very precise idea of the expression of c_R, but only of its growth rate. The choice of ĉ_1 for the MLMC estimator is less robust: if we overestimate c_1, the inequality |μ(ε)|/ε ≤ 1/√(1+2α) still holds, but if we underestimate it, we may eventually fail to have |μ(ε)|/ε ≤ 1 as expected. Hence the bias of the MLMC estimator is closely connected to an accurate enough estimation of c_1.

In Figures 1a and 1b we show the values of |c_1| estimated with the formula

c_1 ≈ (h/2 − h)^{−1} (𝔼[Y_{h/2}] − 𝔼[Y_h]),

compared to the value ĉ_1 = 1 plugged into the simulations, for a Call option in a Black–Scholes model with X_0 = 100, K = 80, T = 1, σ = 0.4, and making the interest rate vary as r = 0.01, 0.1, 0.2, …, 0.9, 1. We simulated 𝔼[Y_h], with h = T/20, using an Euler and a Milstein discretization scheme and making a Crude Monte Carlo simulation of size N = 10^8.
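The estimation of c_1 above can be sketched on a toy family with a known weak-error coefficient. The model sample_Yh below is hypothetical (in the paper Y_h comes from an Euler or Milstein scheme); it just lets the two-parameter difference quotient be checked against a known c_1, here for α = 1:

```python
import random

def estimate_c1(sample_Yh, h, N):
    """Estimate c_1 (case alpha = 1) from two bias parameters h and h/2:
    c_1 ~ (E[Y_{h/2}] - E[Y_h]) / (h/2 - h), each mean by crude Monte Carlo."""
    mean = lambda hh: sum(sample_Yh(hh) for _ in range(N)) / N
    return (mean(h / 2) - mean(h)) / (h / 2 - h)

# Hypothetical model with known weak-error coefficient c_1 = 2.
def sample_Yh(h):
    return random.gauss(0.0, 0.1) + 2.0 * h

random.seed(1)
c1_hat = estimate_c1(sample_Yh, h=0.05, N=200_000)
```

As the text explains, underestimating c_1 is what endangers the bias constraint of MLMC, so in practice such an estimate should be checked from above rather than below.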

In Figures 2a and 2b we show the absolute value of the empirical bias for different values of r. In the simulations, we fixed ĉ_1 = 1 and ĉ = 1. We can observe that when |c_1| is underestimated, the biases of the MLMC and Crude Monte Carlo estimators do not satisfy the constraint |μ(ε)| ≤ ε, whereas the ML2R estimator appears to be less sensitive to the estimation of c̃.

Figure 1: Estimated |c_1| = |𝔼[Y_h] − 𝔼[Y_{h/2}]| (h − h/2)^{−1} when r varies. (a) Euler scheme (α = 1, β = 1). (b) Milstein scheme (α = 1, β = 2).

Figure 2: Empirical bias |μ(ε)| for a Call option in a Black–Scholes model for a prescribed RMSE ε = 2^{−5} and for different values of r, taking ĉ = ĉ_1 = 1. (a) Euler scheme (α = 1, β = 1). (b) Milstein scheme (α = 1, β = 2).

4.3 Properties of the weights of the ML2R estimator

One significant difficulty in the proofs of the Central Limit Theorems stated in Theorems 3.2 and 3.3 is to deal with the weights 𝐖_j^R appearing in the ML2R estimator. Moreover, the analysis of the behavior of the weights is necessary when studying the asymptotics of the parameters q = (q_1,…,q_R) and N. These weights are devised to kill the coefficients c_1,…,c_R in the bias expansion under (WE_{α,R̄}). They are defined as

(18)  𝐖_j^R = Σ_{r=j}^R 𝐰_r,  j = 1,…,R,

where the weights 𝐰 = (𝐰_r)_{r=1,…,R} are the solution to the Vandermonde system V𝐰 = e_1, the matrix V being defined by

V = V(1, n_2^{−α},…,n_R^{−α}), the R×R Vandermonde matrix with entries V_{i,j} = n_j^{−α(i−1)}, i,j = 1,…,R (with n_1 = 1).

Notice that 𝐖_1^R = 1 by construction. In order to give a more tractable expression of the weights 𝐖_j^R, one notices that the weights 𝐰 admit a closed form given by Cramer's rule, namely

𝐰_ℓ = a_ℓ b_{R−ℓ},  ℓ = 1,…,R,

where

a_ℓ = 1/∏_{1≤k≤ℓ−1} (1 − M^{−kα}),  ℓ = 1,…,R,

with the convention ∏_{k=1}^0 (1 − M^{−kα}) = 1, and

b_ℓ = (−1)^ℓ M^{−αℓ(ℓ+1)/2} / ∏_{1≤k≤ℓ} (1 − M^{−kα}),  ℓ = 0,…,R.

As a consequence,

𝐖_j^R = Σ_{ℓ=j}^R a_ℓ b_{R−ℓ} = Σ_{ℓ=0}^{R−j} a_{R−ℓ} b_ℓ,  j = 1,…,R.
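The closed form above translates directly into code; this is a sketch with M, α and the refiners n_j = M^{j−1} as in Section 1. A convenient sanity check is that 𝐖_1^R = Σ_ℓ 𝐰_ℓ = 1, which is forced by the first row of the Vandermonde system:

```python
def weights_ml2r(R, M, alpha):
    """ML2R weights via the closed form: w_l = a_l * b_{R-l},
    W_j^R = sum_{l=j}^R w_l."""
    def a(l):
        # a_l = 1 / prod_{k=1}^{l-1} (1 - M^{-k*alpha})
        p = 1.0
        for k in range(1, l):
            p *= 1.0 - M ** (-k * alpha)
        return 1.0 / p

    def b(l):
        # b_l = (-1)^l * M^{-alpha*l*(l+1)/2} / prod_{k=1}^{l} (1 - M^{-k*alpha})
        p = 1.0
        for k in range(1, l + 1):
            p *= 1.0 - M ** (-k * alpha)
        return (-1) ** l * M ** (-alpha * l * (l + 1) / 2) / p

    w = [a(l) * b(R - l) for l in range(1, R + 1)]
    # W_j^R = sum of the tail w_j + ... + w_R
    return [sum(w[j - 1:]) for j in range(1, R + 1)]
```

For instance, with R = 2, M = 2, α = 1, solving V𝐰 = e_1 by hand gives 𝐰 = (−1, 2), hence 𝐖_1² = 1 and 𝐖_2² = 2, which the closed form reproduces; the uniform bound |𝐖_j^R| ≤ a_∞B̃ of Lemma 4.3 below can also be observed numerically.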

We will make extensive use of the following properties, which are proved in Appendix A.

Lemma 4.3

Let α > 0 and consider the associated weights (𝐖_j^R)_{j=1,…,R} given in (18).

  1. lim_{ℓ→+∞} a_ℓ = a_∞ < +∞ and Σ_{ℓ=0}^{+∞} |b_ℓ| = B̃ < +∞.

  2. The weights 𝐖_j^R are uniformly bounded:

    (19)  |𝐖_j^R| ≤ a_∞ B̃ for all R ∈ ℕ* and all j ∈ {1,…,R}.
  3. For every γ > 0,

    lim_{R→+∞} Σ_{j=2}^R |𝐖_j^R| M^{−γ(j−1)} = 1/(M^γ − 1).
  4. Let (v_j)_{j≥1} be a bounded sequence of positive real numbers. Let γ ∈ ℝ and assume that lim_{j→+∞} v_j = 1 when γ ≥ 0. Then the following limits hold as R → +∞:

    Σ_{j=2}^R |𝐖_j^R| M^{γ(j−1)} v_j → Σ_{j≥2} M^{γ(j−1)} v_j < +∞ for γ < 0,  ∼ R for γ = 0,  ∼ M^{γR} a_∞ Σ_{j≥1} |Σ_{ℓ=0}^{j−1} b_ℓ| M^{−γj} for γ > 0.

4.4 Asymptotic of the allocation policy and of the size

Let us analyze the allocation policy q = (q_1,…,q_R) in the ML2R case. Since

(20)  q_1(ε) = μ*(ε)(1 + θ𝐡^{β/2})  and  q_j(ε) = θ𝐡^{β/2} C̄_{M,β} μ*(ε) |𝐖_j^{R(ε)}| M^{−(β+1)(j−1)/2},  j = 2,…,R(ε),

the condition Σ_{j=1}^{R(ε)} q_j = 1 yields

μ*(ε) = (1 + θ𝐡^{β/2}(1 + C̄_{M,β} Σ_{j=2}^{R(ε)} |𝐖_j^{R(ε)}| M^{−(β+1)(j−1)/2}))^{−1}.

Owing to Lemma 4.3 (c) with γ = (β+1)/2, the limit of this term as ε → 0 is

𝝁* = (1 + θ𝐡^{β/2}(1 + C̄_{M,β}/(M^{(1+β)/2} − 1)))^{−1}.

Moreover, for all ε ∈ (0,ε̄], the following inequalities hold:

(21)  1/μ̲* := 1 + θ𝐡^{β/2} ≤ 1/μ*(ε) ≤ 1 + θ𝐡^{β/2}(1 + C̄_{M,β} a_∞ B̃/(1 − M^{−(β+1)/2})) =: 1/μ̄*.
Remark 4.4

If we set 𝐖_j^{R(ε)} = 1 for all j = 1,…,R(ε), and replace a_∞B̃ by 1, we obtain the same results for the MLMC allocation policy.
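Following Remark 4.4 (weights identically 1), the MLMC allocation policy of Table 2 can be sketched as below. To avoid committing to a particular reading of the constant C̄_{M,β}, it is passed in precomputed (Cbar), and the normalization plays the role of μ*:

```python
def allocation_mlmc(R, M, beta, theta, h, Cbar):
    """MLMC allocation q = (q_1,...,q_R): q_1 = mu*(1 + theta*h^{beta/2}),
    q_j = mu* * theta * h^{beta/2} * Cbar * M^{-(beta+1)(j-1)/2} for j >= 2,
    with mu* chosen so that sum_j q_j = 1."""
    g = [1.0 + theta * h ** (beta / 2)]
    g += [theta * h ** (beta / 2) * Cbar * M ** (-(beta + 1) * (j - 1) / 2)
          for j in range(2, R + 1)]
    mu_star = 1.0 / sum(g)  # normalization: mu* enforces sum q_j = 1
    return [mu_star * x for x in g]

q = allocation_mlmc(5, 2, 1.0, 1.0, 1.0, 2.0)
```

The geometric decay M^{−(β+1)(j−1)/2} concentrates the bulk of the simulation budget on the coarse level, which is the phenomenon discussed in the practitioner's corner above.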

The asymptotic behavior of the estimator size N = N(ε) is given in the following lemma.

Lemma 4.5

We have N = N(ε) → +∞ as ε → 0, with a convergence rate depending on β as follows:

  1. Case β > 1: We have N(ε) ∼ C_β ε^{−2} with

    C_β = (Var(Y_0)/𝝁*) [1 + θ𝐡^{β/2}(1 + C̄_{M,β} M^{(1−β)/2}/(1 − M^{(1−β)/2}))] × {1 for ML2R, 1 + 1/(2α) for MLMC}.
  2. Case β ≤ 1: We recall the expression of R(ε) given in (7) for ML2R and (8) for MLMC. Then

    N(ε) ∼ C_β ε^{−2} × {R(ε) if β = 1, M^{(1−β)R(ε)/2} if β < 1},

    where the constant C_β reads

    C_β = (Var(Y_0)/𝝁*) θ𝐡^{1/2} C̄_{M,β} × {1 for ML2R, 1 + 1/(2α) for MLMC}, if β = 1,

    and

    C_β = (Var(Y_0)/𝝁*) θ𝐡^{β/2} C̄_{M,β} × {a_∞ Σ_{j≥1} |Σ_{ℓ=0}^{j−1} b_ℓ| M^{(β−1)j/2} for ML2R, (1 + 1/(2α))/(M^{(1−β)/2} − 1) for MLMC}, if β < 1.

We notice that for β ≥ 1 the asymptotic behavior of N(ε) for ML2R does not depend on the weights 𝐖_j^R, and the difference between the coefficient C_β for the ML2R and the MLMC estimator lies only in the factor (1 + 1/(2α)), whereas when β < 1, the asymptotics of the weights do have an impact on the behavior of N(ε) for ML2R. Still, in this case we observe that if a_∞ = 1 and |Σ_{ℓ=0}^{j−1} b_ℓ| = 1 for all j ≥ 1, then

a_∞ Σ_{j≥1} |Σ_{ℓ=0}^{j−1} b_ℓ| M^{(β−1)j/2} = 1/(M^{(1−β)/2} − 1)

and the factor (1 + 1/(2α)) again appears as the only difference between the coefficients C_β of N(ε) for the two estimators.

Proof.

ML2R: The estimator size N reads

N = N(ε) = ⌈(1 + 1/(2αR(ε))) (Var(Y_0)/μ*) (1/ε²) (1 + θ𝐡^{β/2} + θ𝐡^{β/2} C̄_{M,β} Σ_{j=2}^{R(ε)} |𝐖_j^{R(ε)}| M^{(1−β)(j−1)/2})⌉.

We notice that R(ε) → +∞ as ε → 0 and use Lemma 4.3 (d) with γ = (1−β)/2 and v_j = 1 for each j ≥ 1 to complete the proof in the ML2R framework.

MLMC: The result follows directly from the behavior of the series Σ_{j=2}^{R(ε)} M^{(1−β)(j−1)/2}, since N reads

N = N(ε) = ⌈(1 + 1/(2α)) (Var(Y_0)/μ*) (1/ε²) (1 + θ𝐡^{β/2}(1 + C̄_{M,β} Σ_{j=2}^{R(ε)} M^{(1−β)(j−1)/2}))⌉,

as desired. ∎

5 Proofs

We will use the notations

Ĩ_ε^1 := (1/N_1(ε)) Σ_{k=1}^{N_1(ε)} (Y_𝐡^{(1),k} − 𝔼[Y_𝐡^{(1),k}])  and  Ĩ_ε^2 := Σ_{j=2}^{R(ε)} (𝐖_j^{R(ε)}/N_j(ε)) Σ_{k=1}^{N_j(ε)} Ỹ_j^k,

where we set

Ỹ_j := Y_{𝐡/n_j}^{(j)} − Y_{𝐡/n_{j−1}}^{(j)} − 𝔼[Y_{𝐡/n_j}^{(j)} − Y_{𝐡/n_{j−1}}^{(j)}],  j = 1,…,R(ε).

These notations hold for both ML2R and MLMC estimators, where we set 𝐖_j^{R(ε)} = 1, j = 1,…,R(ε), for MLMC estimators. We notice that

(22)  I_π^{N(ε)} − I_0 = Ĩ_ε^1 + Ĩ_ε^2 + μ(𝐡,R(ε),M),

where the bias μ(𝐡,R(ε),M) → 0 as ε → 0 (see Section 4.2 for a detailed description of the bias).

5.1 Proof of Strong Law of Large Numbers

The proof of the Strong Law of Large Numbers is a consequence of the following proposition.

Proposition 5.1

Let p ≥ 2. There exists a positive real constant K(M,β,p) such that

(23)  𝔼[|Ĩ_ε^2|^p] ≤ K(M,β,p) ε^p.

Proof.

ML2R: We first give the proof of (23) for the ML2R estimator. As a first step we show that, for all p2,

(24)𝔼[|Y~j|p]CM,β,pM-βp2(j-1),j=1,,R(ε),with CM,β,p=2pV1(p)(1+Mβ2)p𝐡βp2.

By Minkowski’s inequality,

(𝔼[|Y~j|p])1pY𝐡nj-Y𝐡nj-1p+|𝔼[Y𝐡nj-Y𝐡nj-1]|Y𝐡nj-Y𝐡nj-1p+Y𝐡nj-Y𝐡nj-112Y𝐡nj-Y𝐡nj-1p.

Applying again Minkowski’s inequality, the Lp-strong approximation assumption (9) yields (24). As the random variables (Y~jk)k1 are i.i.d. and the (Y~j)j=1,,R(ε) are centered and independent, Burkholder’s inequality (see [10, Theorem 2.10, p. 23]) and (24) imply that there exists a positive universal real constant Cp such that

𝔼[|I~ε2|p]=𝔼[|j=2R(ε)k=1Nj(ε)𝐖jR(ε)Nj(ε)Y~jk|p]Cp𝔼[|j=2R(ε)k=1Nj(ε)(𝐖jR(ε)Nj(ε)Y~jk)2|p2]
Cp(j=2R(ε)k=1Nj(ε)(𝐖jR(ε)Nj(ε)Y~jk)2p2)p2=Cp(j=2R(ε)|𝐖jR(ε)|2Nj(ε)(𝔼[|Y~j|p])2p)p2
≤CpCM,β,p(j=2R(ε)|𝐖jR(ε)|2Nj(ε)M-β(j-1))p2.

As Nj(ε)=N(ε)qj(ε)N(ε)qj(ε), we derive that

1Nj(ε)1N(ε)qj(ε),j=1,,R(ε).

It follows from the expression of qj given in (20) and from inequality (21) that

|𝐖jR(ε)|qj(ε)1θ𝐡β2C¯M,βμ¯*Mβ+12(j-1),j=2,,R(ε).

Then, using that supj{1,,R},R1|𝐖jR|aB~, we get

𝔼[|I~ε2|p]≤CpCM,β,p(a∞B̃)p2(θ𝐡β2C¯M,βμ¯*)-p2(1N(ε)j=2R(ε)M1-β2(j-1))p2.

Owing to Lemma 4.5, up to reducing ε¯, we have

(25)1N(ε)2Cβε2{1if β>1,R(ε)-1if β=1,M-1-β2R(ε)if β<1,for all ε(0,ε¯].

Moreover,

j=2R(ε)M1-β2(j-1){11-M1-β2if β>1,R(ε)if β=1,M1-β2R(ε)M1-β2-1if β<1.

Then

(1N(ε)j=2R(ε)M1-β2(j-1))p2K1εp

with

K1=K1(M,β,p)=(2Cβ)p2{(1-M1-β2)-p2if β>1,1if β=1,(M1-β2-1)-p2if β<1.

Hence (23) holds with

K(M,β,p)=CpCM,β,p(a∞B̃)p2(θ𝐡β2C¯M,βμ¯*)-p2K1.

MLMC: The proof for the MLMC estimator follows the same steps, setting 𝐖jR(ε)=1, j=1,…,R(ε), and replacing a∞B̃ by 1. ∎
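The geometric-sum estimates used in the proof above are elementary and easy to check numerically. The following toy sketch (an illustration only, not part of the proof) evaluates the partial sums ∑j=2R M^((1−β)/2·(j−1)) in the three regimes β>1, β=1 and β<1, with an illustrative choice of root M, and compares them with the bounds quoted above.

```python
# Toy check of the three regimes of S_R = sum_{j=2}^R M^{(1-beta)/2 * (j-1)}.
M = 2.0  # illustrative root

def partial_sum(beta, R):
    gamma = (1.0 - beta) / 2.0
    return sum(M ** (gamma * (j - 1)) for j in range(2, R + 1))

R = 50
# beta > 1: the series converges; 1/(1 - M^{(1-beta)/2}) is an upper bound.
assert partial_sum(3.0, R) <= 1.0 / (1.0 - M ** ((1.0 - 3.0) / 2.0))
# beta = 1: every term equals 1, so S_R = R - 1 exactly.
assert partial_sum(1.0, R) == R - 1
# beta < 1: S_R grows geometrically, bounded by M^{(1-beta)R/2}/(M^{(1-beta)/2}-1).
g = (1.0 - 0.5) / 2.0
assert partial_sum(0.5, R) <= M ** (g * R) / (M ** g - 1.0)
```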

The Strong Law of Large Numbers follows as a consequence of Proposition 5.1.

Proof of Theorem 3.1.

Owing to the decomposition (22), equation (10) amounts to proving

I~εk1a.s.0as k+  and  I~εk2a.s.0as k+.

As limε0N1(ε)=+ and the (Y𝐡(1),k)k1 are i.i.d. and do not depend on ε, the convergence of I~εk1 is a direct application of the classical Strong Law of Large Numbers, for both ML2R and MLMC estimators.

To establish the a.s. convergence of I~εk2, owing to Proposition 5.1 it is straightforward that for every sequence of positive numbers (εk)k1 such that εk0 as k+ and k1εkp<+,

k1𝔼[|I~εk2|p]<+.

Hence, by Beppo–Levi’s Theorem, k1|I~εk2|p<+ a.s., which in turn implies I~εk2a.s.0 as k+. ∎

5.2 Proof of the Central Limit Theorem

This subsection is devoted to the proof of Theorems 3.2 and 3.3. In order to verify a Lindeberg condition, we will need the assumption that the family (Z(h))h is L2-uniformly integrable. Owing to (WEα,R̄) with R¯=1,

|𝔼[Z(h)]|=|c1(1-Mα)|hα-β2+o(hα-β2).

Since 2αβ, this deterministic sequence (𝔼[Z(h)])h is bounded. Hence, the L2-uniform integrability of (Z(h))h yields the L2-uniform integrability of the centered sequence (Z~(h))h=(Z(h)-𝔼[Z(h)])h.

One criterion to verify the L2-uniform integrability is the following.

Lemma 5.2

The following statements hold.

  (a) If there exists p>2 such that suph‖Z(h)‖p<+∞, then the family (Z(h))h is L2-uniformly integrable.

  (b) If there exists a random variable D(M)∈L2 such that Z(h) converges in distribution to D(M) as h→0, then the following conditions are equivalent (see [3, Theorem 3.6]):

    (i) The family (Z(h))h is L2-uniformly integrable.

    (ii) limh→0‖Z(h)‖2=‖D(M)‖2.

Now we are in a position to prove the Central Limit Theorem, in both cases β>1 and β(0,1].

Proof of Theorems 3.2 and 3.3.

Owing to the decomposition (22) (with 𝐖jR(ε)=1, j=1,,R(ε) for MLMC estimator),

IπN(ε)-I0ε=I~ε1ε+I~ε2ε+μ(𝐡,R(ε),M)ε,

where I~ε1 and I~ε2 are independent. The bias term has already been treated in (17).

ML2R: Formulas (12) and (14) amount to proving, as ε0,

(26)N(ε)I~ε1𝒩(0,Var(Y𝐡)𝐪1)

and

(27)I~ε2ε𝒩(0,σ22)

with σ2=σ for β(0,1]. Indeed, for (26) let us write I~ε1ε=1εN(ε)N(ε)I~ε1. Using Lemma 4.5, N(ε) reads

N(ε)Cβε-2{1if β>1,R(ε)if β=1,M1-β2R(ε)if β<1,as ε0.

In particular, since R(ε)→+∞ as ε→0, when β≤1 we have 1/√N(ε)=o(ε), so that the term I~ε1/ε→0 in probability. Since Y𝐡(1),k does not depend on ε, N1(ε)→+∞ and N1(ε)/N(ε)→𝐪1 as ε→0, the asymptotic behavior of the first term is driven by a regular Central Limit Theorem at rate √N(ε), i.e.

N(ε)I~ε1=N(ε)[1N1(ε)k=1N1(ε)(Y𝐡(1),k-𝔼[Y𝐡(1),k])]ε0𝒩(0,Var(Y𝐡)𝐪1),

which proves (26). We will use Lindeberg’s Theorem for triangular arrays of martingale increments (see [10, Corollary 3.1, p. 58]) to establish (27). The random variables Y~jk being centered and independent, the variance reads

Var(j=2R(ε)k=1Nj(ε)1ε𝐖jR(ε)Nj(ε)Y~jk)=1ε2j=2R(ε)(𝐖jR(ε)Nj(ε))2Nj(ε)Var(Y~j)=1ε2j=2R(ε)(𝐖jR(ε))2Nj(ε)Var(Y~j).

Noticing that 0≤1/x−1/⌈x⌉≤1/x² for every x>0, and that Nj(ε)=⌈qj(ε)N(ε)⌉, we derive

|1ε2j=2R(ε)(𝐖jR(ε))2Nj(ε)Var(Y~j)-1ε2j=2R(ε)(𝐖jR(ε))2qj(ε)N(ε)Var(Y~j)|1ε2j=2R(ε)(𝐖jR(ε))2(qj(ε)N(ε))2Var(Y~j).

The conclusion will follow from

(28)limε01ε2j=2R(ε)(𝐖jR(ε))2N(ε)qj(ε)Var(Y~j)=σ22

and

(29)limε01ε2j=2R(ε)(𝐖jR(ε))2(qj(ε)N(ε))2Var(Y~j)=0.

Owing to the definition of Zj given in (11), we get Var(Y~j)=(𝐡nj)βVar(Zj) and, using the expression of qj(ε) given in (20), we obtain

1ε2j=2R(ε)(𝐖jR(ε))2N(ε)qj(ε)Var(Y~j)=1ε2N(ε)𝐡β2θC¯M,βμ*(ε)j=2R(ε)|𝐖jR(ε)|M1-β2(j-1)Var(Zj).

Case β>1: Owing to the expression of N(ε) given in Lemma 4.5 when β>1,

limε01ε2N(ε)𝐡β2θC¯M,βμ*(ε)=1Σ𝐡β2Var(Y0)V1C¯M,β

and owing to the limit in Lemma 4.3 (d) with γ=1-β2<0,

j=2R(ε)|𝐖jR(ε)|M1-β2(j-1)Var(Zj)=j=2+M1-β2(j-1)Var(Zj)<+.

Hence the convergence of the variance (28) holds for Theorem 3.2.

Case β1: Owing to the expression of N(ε) given in Lemma 4.5 when β1, we get, as ε0,

1ε2N(ε)𝐡β2θC¯M,βμ*(ε)1V1(1+Mβ2)2{(R(ε))-1if β=1,(M1-β2R(ε)aj1|=0j-1b|Mβ-12j)-1if β<1.

We notice that

limj+Var(Zj)=v~(M,β)

if 2α>β and

limj+Var(Zj)=v~(M,β)-c12(1-Mβ2)2

if 2α=β. Hence, owing to the limit in Lemma 4.3 (d) with γ=1-β20, we obtain (28) with σ2=σ given in (13) in Theorem 3.3.

For (29), it follows from the expression of qj(ε) in (20) that

|𝐖jR(ε)|qj(ε)=Mβ+12(j-1)θ𝐡β2C¯M,βμ*(ε).

Owing to the definition of Zj in (11) and to inequality (21), we get

1ε2j=2R(ε)(𝐖jR(ε))2(qj(ε)N(ε))2Var(Y~j)𝐡β(θ𝐡β2C¯M,βμ¯*)21(εN(ε))2j=2R(ε)M(j-1)Var(Zj)
𝐡β(θ𝐡β2C¯M,βμ¯*)21(εN(ε))2(supj1Var(Zj))MR(ε)-1M-1.

We conclude by showing that

(30)MR(ε)(εN(ε))20as ε0.

Owing to the expression of R(ε) given in (7), we notice that

R(ε)=O(√(log(1/ε)))=o(log(1/ε))as ε→0.

Moreover, using Lemma 4.5, up to another reduction of ε¯, we have 1(εN(ε))2(2Cβ)2ε2 for all β>0. This in turn yields

MR(ε)(εN(ε))2(2Cβ)2ε2MR(ε)=Cβ-2elog(M)R(ε)-2log(1ε)0as ε0.

Then (29) is proved and so is the first condition of Lindeberg’s Theorem.

For the second condition of Lindeberg’s Theorem we need to prove that, for every η>0,

j=2R(ε)k=1Nj(ε)𝔼[(1ε𝐖jR(ε)Nj(ε)Y~jk)2𝟏{|1ε𝐖jR(ε)Nj(ε)Y~jk|>η}]0as ε0.

Since the (Y~jk)k=1,,Nj(ε) are identically distributed, we can write

j=2R(ε)k=1Nj(ε)𝔼[(1ε𝐖jR(ε)Nj(ε)Y~jk)2𝟏{|1ε𝐖jR(ε)Nj(ε)Y~jk|>η}]j=2R(ε)1ε2|𝐖jR(ε)|2Nj(ε)𝔼[(Y~j)2𝟏{|Y~j|>ηεNj(ε)|𝐖jR(ε)|}].

We set Z~j=Zj-𝔼[Zj]. Replacing qj by its values given in (20), using inequality (19) from Lemma 4.3 (b) and the elementary inequality 1Nj(ε)1qj(ε)N(ε) yields

j=2R(ε)k=1Nj(ε)𝔼[(1ε𝐖jR(ε)Nj(ε)Y~jk)2𝟏{|1ε𝐖jR(ε)Nj(ε)Y~jk|>η}]
1θ𝐡β2C¯M,βμ*(ε)1ε2N(ε)j=2R(ε)|𝐖jR(ε)|Mβ+12(j-1)𝔼[(Y~j)2𝟏{|Y~j|>ηθ𝐡β2C¯M,βμ*(ε)Mβ+12(j-1)εN(ε)}]
𝐡β2θC¯M,βμ¯*aB~1ε2N(ε)j=2R(ε)M1-β2(j-1)𝔼[(Z~j)2𝟏{|Z~j|>ηθC¯M,βμ¯*εN(ε)M-j-12}]
𝐡β2θC¯M,βμ¯*aB~sup2jR(ε)𝔼[(Zj)2𝟏{|Z~j|>ΘεN(ε)M-R(ε)2}]1ε2N(ε)j=2R(ε)M1-β2(j-1)

where we set Θ=ηθC¯M,βμ¯*M. Now, it follows from Lemma 4.5 that

1ε2N(ε)j=2R(ε)M1-β2(j-1)=1ε2N(ε)(M1-β2R(ε)-M1-β2M1-β2-1𝟏{β1}+(R(ε)+1)𝟏{β=1})K,

for all sufficiently small ε, where K is a positive real constant. Owing to (30), limε→0 ε√N(ε)M−R(ε)/2=+∞. Hence, since we assumed that the family (Zj)j1 is L2-uniformly integrable, we obtain that

limε0sup2jR(ε)𝔼[(Zj)2𝟏{|Zj|>ΘεN(ε)M-R(ε)-12}]=0

and the second condition of Lindeberg’s Theorem is proved.

MLMC: The proofs are quite the same as for ML2R, up to the constant 1+12α, coming from the constant Cβ in the asymptotic of N(ε). Using Lemma 4.5 and the expression of R(ε) given in (8), we obtain

N(ε)Cβε-2{1if β>1,1αlog(M)log(1ε)if β=1,ε-1-β2αif β<1,as ε0.

We set 𝐖jR=1, j=1,…,R, and replace a∞B̃ by 1. The only significant difference arises when β<1, in proving (30). In this case, owing to Lemma 4.5 as in (25), and using the expression of R(ε) given in (8), up to reducing ε¯ we can write

MR(ε)(εN(ε))2(2Cβ)2ε2M-(1-β)R(ε)MR(ε)(2Cβ)2Mβ(CR(1)+1)ε2-βα,

which goes to 0, owing to the strict inequality assumption 2α>β. ∎

6 Applications

6.1 Diffusions

In this subsection we recover a recent result by Ben Alaya and Kebaier (see [2]) obtained for MLMC estimators and we extend it to ML2R estimators and to path-dependent functionals. Let (Xt)t∈[0,T] be a Brownian diffusion process, solution to the stochastic differential equation

Xt=X0+0tb(s,Xs)𝑑s+0tσ(s,Xs)𝑑Ws,t[0,T],

where b:[0,T]×ℝd→ℝd and σ:[0,T]×ℝd→𝕄(d,q,ℝ) are continuous functions, Lipschitz continuous in x uniformly in t∈[0,T], and (Wt)t∈[0,T] is a q-dimensional Brownian motion independent of X0, both being defined on a probability space (Ω,𝒜,ℙ).

We know that X=(Xt)t∈[0,T] is the unique (ℱtW)t∈[0,T]-adapted solution to this equation, where ℱW denotes the augmented filtration of W. The process (Xt)t∈[0,T] cannot be simulated at a reasonable computational cost (at least in full generality), which leads us to introduce simulatable time discretization schemes, the simplest being undoubtedly the Euler scheme with step h=T/n, n≥1, defined by

(31)X¯tn=X0+0tb(s¯,X¯s¯n)𝑑s+0tσ(s¯,X¯s¯n)𝑑Ws

with s¯=nsn, s[0,T]. In particular, if we set tkn=kTn,

X¯tk+1nn=X¯tknn+b(tkn,X¯tknn)h+σ(tkn,X¯tknn)hUk+1n,k{0,,n-1},

where the Uk+1n=(Wtk+1n−Wtkn)/√h are i.i.d. with distribution 𝒩(0,Iq). Furthermore, we also derive from (31) that

X¯tn=X¯t¯n+b(t¯,X¯t¯n)(t-t¯)+σ(t¯,X¯t¯n)(Wt-Wt¯),t[0,T].

It is a classical result that, under the above assumptions on b, σ, X0 and W, the Euler scheme satisfies the following a priori Lp-error bounds:

(32)p2,cb,σ,p,T>0,supt[0,T]|Xt-X¯tn|pcb,σ,p,TTn(1+X0p).
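To make scheme (31) concrete, here is a minimal Euler–Maruyama sketch for a scalar SDE. The drift and diffusion passed below are illustrative placeholders, not tied to the paper's assumptions; the final sanity check uses the degenerate case σ≡0, where the scheme reduces to the explicit Euler scheme for the ODE x′=x.

```python
import math
import random

def euler_scheme(x0, b, sigma, T, n, rng):
    """Euler scheme with step h = T/n for dX_t = b(t, X_t) dt + sigma(t, X_t) dW_t."""
    h = T / n
    x, t = x0, 0.0
    path = [x]
    for _ in range(n):
        dw = rng.gauss(0.0, math.sqrt(h))   # Brownian increment W_{t+h} - W_t
        x = x + b(t, x) * h + sigma(t, x) * dw
        t += h
        path.append(x)
    return path

# Sanity check: with sigma = 0 and b(t, x) = x the scheme is the explicit Euler
# scheme for x' = x, so the terminal value (1 + h)^n * x0 approaches e^T * x0.
rng = random.Random(0)
path = euler_scheme(1.0, lambda t, x: x, lambda t, x: 0.0, 1.0, 1000, rng)
assert abs(path[-1] - math.exp(1.0)) < 5e-3
```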

For the weak error expansion the existing results are less general. Let us recall as an illustration the celebrated Talay–Tubaro’s and Bally–Talay’s weak error expansions for marginal functionals of Brownian diffusions, i.e. functionals of the form F(X)=f(XT).

Theorem 6.1

The following statements hold.

  1. Regular setting (Talay–Tubaro [13]): If b and σ are infinitely differentiable with bounded partial derivatives and if f:ℝd→ℝ is an infinitely differentiable function, with all its partial derivatives having polynomial growth, then for a fixed maturity T>0 and for every integer R∈ℕ*,

    (33)𝔼[f(X¯Tn)]-𝔼[f(XT)]=k=1Rck(1n)k+O((1n)R+1),

    where the coefficients ck depend on b,σ,f,T but not on n.

  2. (Hypo-)Elliptic setting (Bally–Talay [1]): If b and σ are infinitely differentiable with bounded partial derivatives and if σ is uniformly elliptic in the sense that

    σσ*(x)≥ε0Id for all x∈ℝd, with ε0>0,

    or more generally if (b,σ) satisfies the strong Hörmander hypo-ellipticity assumption, then (33) holds true for every bounded Borel function f:d.

For more general path-dependent functionals, no such result exists in full generality. For various classes of specific functionals, depending on the running maximum or mean or on some exit stopping time, first order weak expansions in hα, α∈(0,1], have sometimes been established (see [12] for a brief review in connection with multilevel methods). However, as emphasized by the numerical experiments carried out in [12], such weak error expansions can be strongly suspected to hold at any order under reasonable smoothness assumptions.

In this subsection we consider a Lipschitz continuous functional F:𝒞b([0,T],ℝd)→ℝ and we set

Y0=F(X)  and  Yh=F(X¯n)with h=T/n and n≥1 (i.e. 𝐡=T).

We assume the weak error expansion (WEα,R̄). We now prove that both the ML2R estimator (2) and the MLMC estimator (1) satisfy a Strong Law of Large Numbers and a Central Limit Theorem as ε tends to 0.

Theorem 6.2

Let X0∈L2 and assume that F:𝒞b([0,T],ℝd)→ℝ is a Lipschitz continuous functional. Then assumption (SEβ) is satisfied with β=1.

If X0∈Lp for some p≥2, then the Lp-strong error assumption ‖Yh−Y0‖p≤V1(p)√h is satisfied, so that both ML2R and MLMC estimators satisfy Theorem 3.1.

If X0∈Lp for some p>2 and if F is differentiable with DF continuous, then the sequence (Z(h))h∈H is L2-uniformly integrable and

(34)there exists v>0 such that limh→0‖Z(h)‖2²=(M−1)v.

As a consequence, both ML2R and MLMC estimators satisfy Theorem 3.3 (case β=1).

Proof.

First, note that if F is a Lipschitz continuous functional, with Lipschitz coefficient [F]Lip, we have for all p2,

Yh-Y0pp[F]Lipp𝔼[supt[0,T]|Xt-X¯tn|p][F]Lippcb,σ,p,Tp(1+X0p)php2,

then (Yh)h satisfies (SEβ) with β=1 and the Lp-strong error assumption as soon as X0∈Lp.

Assume now that X0∈Lp for some p>2. By a straightforward application of Minkowski’s inequality we deduce from the Lp-strong error assumption that ‖Yh/M−Yh‖p≤C√h and then that suph‖Z(h)‖p<+∞. Applying criterion (a) of Lemma 5.2, we deduce that (Z(h))h is L2-uniformly integrable.

At this stage it remains to prove (34). The key is [2, Theorem 3], where it is proved that

√(nM)(X¯n−X¯nM)⟹stablyU(M)as n→+∞,

where U(M)=(Ut(M))t[0,T] is the d-dimensional process satisfying

(35)Ut(M)=M-12i,j=1qVt0t(Vs)-1φ.j(Xs)φ.i(Xs)𝑑Bsi,j,t[0,T].

We recall the notations of Jacod and Protter [11]

dXt=φ(Xt)dWt=j=0qφ.j(Xt)dWtj

with φ.j representing the jth column of the matrix φ=[φij]i=1,,d,j=1,,q, for j=1,,q, φ0=b and Wt:=(t,Wt1,,Wtq) (column vector), where Wt0=t and the q remaining components make up a standard Brownian motion. Moreover, φ.j is a d×d matrix, where (φ.j)ik=xkφij (partial derivative of φij with respect to the kth coordinate) and (Vt)t[0,T] is the d×d valued process solution of the linear equation

Vt=Id+j=0q0tφ.j(Xs)𝑑WsjVs,t[0,T].

Here (Bij)1i,jq is a standard q2-dimensional Brownian motion independent of W. This process is defined on an extension (Ω~,~,(t~)t0,~) of the original space (Ω,,(t)t0,) on which lives W.

We write, using that h=Tn,

Z(h)=nM(F(X¯nM)-F(X¯n))=-01DF(uX¯n+(1-u)X¯nM)𝑑uUn(M),

where Un(M):=√(nM)(X¯n−X¯nM). The function (x1,x2,x3)↦∫01DF(ux1+(1−u)x2)du·x3 is continuous, and it suffices to prove that (X¯n,X¯nM,Un(M))⇒(X,X,U(M)) as n goes to infinity to conclude that

(36)Z(h)-DF(X)U(M)as h0.

Let ϕ:𝒞b([0,T],ℝ2d)→ℝ and ψ:𝒞b([0,T],ℝd)→ℝ be two bounded Lipschitz continuous functionals, and denote X~n=(X¯n,X¯nM) and X~=(X,X). We write

𝔼[ϕ(X~n)ψ(Un(M))-ϕ(X~)ψ(U(M))]=𝔼[(ϕ(X~n)-ϕ(X~))ψ(Un(M))+ϕ(X~)(ψ(Un(M))-ψ(U(M)))].

Since (Un(M))n1 converges stably with limit U(M), we have that

limn+𝔼[ϕ(X~)(ψ(Un(M))-ψ(U(M)))]=0.

On the other hand, owing to (32), we prove that

limn+𝔼[(ϕ(X~n)-ϕ(X~))ψ(Un(M))]=0.

By (36) and Lemma 5.2 (b) we have

limh0Z(h)22=DF(X)U(M)22=(M-1)v

with v=‖DF(X)U(M)/√(M−1)‖2², which does not depend on M owing to the definition of U(M) given in (35). ∎
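To fix ideas, here is a minimal unweighted MLMC sketch (all 𝐖jR=1) for 𝔼[f(XT)], with an Euler scheme of step T/M^(j−1) at level j and the fine and coarse schemes of each correcting level driven by the same Brownian increments. The geometric dynamics used as a benchmark, the level sizes Nj and all the parameters below are illustrative choices of ours, not the optimized ones of Tables 1 and 2.

```python
import math
import random

def euler_terminal(x0, r, s, T, n, dW):
    """Terminal value of the Euler scheme for dX = r X dt + s X dW, given n Brownian increments."""
    h = T / n
    x = x0
    for k in range(n):
        x = x + r * x * h + s * x * dW[k]
    return x

def mlmc(f, x0, r, s, T, M, R, sizes, rng):
    est = 0.0
    for j in range(1, R + 1):
        n_fine = M ** (j - 1)
        acc = 0.0
        for _ in range(sizes[j - 1]):
            # one Brownian path at the fine resolution of level j
            dW = [rng.gauss(0.0, math.sqrt(T / n_fine)) for _ in range(n_fine)]
            fine = f(euler_terminal(x0, r, s, T, n_fine, dW))
            if j == 1:
                acc += fine
            else:
                # coarse scheme reuses the same path, increments summed M by M
                dW_c = [sum(dW[i:i + M]) for i in range(0, n_fine, M)]
                acc += fine - f(euler_terminal(x0, r, s, T, n_fine // M, dW_c))
        est += acc / sizes[j - 1]
    return est

rng = random.Random(1)
# Benchmark: geometric Brownian motion, for which E[X_T] = x0 * exp(r T).
est = mlmc(lambda x: x, 1.0, 0.05, 0.2, 1.0, 2, 4, [100000, 20000, 10000, 5000], rng)
assert abs(est - math.exp(0.05)) < 0.01
```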

6.2 Nested Monte Carlo

The aim of a nested Monte Carlo method is to compute by Monte Carlo simulation

𝔼[f(𝔼[X|Y])],

where (X,Y) is a couple of ℝ×ℝqY-valued random variables defined on a probability space (Ω,𝒜,ℙ), with X∈L2(ℙ), and f:ℝ→ℝ is a Lipschitz continuous function with Lipschitz coefficient [f]Lip. We assume that there exists a Borel function F:ℝqξ×ℝqY→ℝ and a random variable ξ:(Ω,𝒜)→ℝqξ, independent of Y, such that

X=F(ξ,Y)

and we set 𝐡=1/K0 for some integer K0≥1, h=1/K with K∈K0ℕ*={K0,2K0,…}, and

Y0:=f(𝔼[X|Y]),Yh=Y1K:=f(1Kk=1KF(ξk,Y)),

where (ξk)k≥1 is a sequence of i.i.d. random variables, ξk∼ξ, independent of Y. A nested ML2R estimator then writes (with nj=Mj−1)

IπN=1N1i=1N1f(1Kk=1KF(ξk(1),i,Y(1),i))
+j=2R𝐖jRNji=1Nj(f(1njKk=1njKF(ξk(j),i,Y(j),i))-f(1nj-1Kk=1nj-1KF(ξk(j),i,Y(j),i))),

where (Y(j),i)i≥1 is a sequence of independent copies of Y(j)∼Y, j=1,…,R, with Y(j) independent of Y(ℓ) for j≠ℓ, and (ξk(j),i)k,i≥1,j=1,…,R is a sequence of i.i.d. random variables, ξk(j),i∼ξ. We saw in [12] that, when f is 2R times differentiable with f(k) bounded, the nested Monte Carlo estimator satisfies (SEβ) with β=1 and (WEα,R̄) with α=1 and R¯=R−1. Here we want to show that the nested Monte Carlo estimator also satisfies the assumptions of the Strong Law of Large Numbers 3.1 and of the Central Limit Theorem 3.3. For convenience, we define

ϕ0(y):=𝔼[F(ξ,y)],ϕh(y):=1Kk=1KF(ξk,y),KK0*,

so that Y0=f(ϕ0(Y)) and Yh=f(ϕh(Y)), and for a fixed y, we set σF(y):=Var(F(ξ,y)).
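A minimal sketch of the biased inner-mean estimator Yh=f(ϕh(Y)) on a toy model (our illustrative choice, not from the paper): X=Y+ξ with Y, ξ independent standard Gaussians and f(u)=u², so that 𝔼[f(𝔼[X|Y])]=𝔼[Y²]=1, while an exact computation gives 𝔼[Yh]=1+h with h=1/K, exhibiting the bias of order h.

```python
import random

def nested_estimate(f, K, N, rng):
    """Crude nested Monte Carlo for E[f(E[X | Y])] on the toy model X = Y + xi."""
    total = 0.0
    for _ in range(N):
        y = rng.gauss(0.0, 1.0)
        inner = sum(rng.gauss(0.0, 1.0) for _ in range(K)) / K  # phi_h(y) - y
        total += f(y + inner)
    return total / N

rng = random.Random(2)
f = lambda u: u * u                              # E[f(E[X|Y])] = E[Y^2] = 1
coarse = nested_estimate(f, 2, 100000, rng)      # h = 1/2: exact mean is 1.5
fine = nested_estimate(f, 100, 50000, rng)       # h = 1/100: exact mean is 1.01
assert abs(coarse - 1.5) < 0.05
assert abs(fine - 1.01) < 0.05
```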

Proposition 6.3

Assume still that f is Lipschitz continuous. If X∈Lp(ℙ) for some p≥2, then there exists V1(p) such that, for all h=1/K and h′=1/K′ with K,K′∈K0ℕ*,

(37)‖Yh−Yh′‖pp≤V1(p)|h−h′|p/2.

As a consequence, assumption (SEβ) and the Lp-strong error assumption (9) are satisfied with β=1. Then both ML2R and MLMC estimators satisfy a Strong Law of Large Numbers, see Theorem 3.1.

Proof.

Set X~k=F(ξk,Y)-𝔼[F(ξk,Y)|Y] and Sk==1kX~. As f is Lipschitz, we have

Yh-Yhpp=f(1Kk=1KF(ξk,Y))-f(1Kk=1KF(ξk,Y))pp
[f]Lipp1Kk=1KX~k-1Kk=1KX~kpp=[f]Lipp𝔼[|SKK-SKK|p].

Assume without loss of generality that K′≥K. Since p≥2, it follows that

𝔼[|SKK-SKK|p]=𝔼[|(1K-1K)SK+1K(SK-SK)|p]
2p-1[|1K-1K|p𝔼[|SK|p]+(1K)p𝔼[|SK-SK|p]].

Owing to Burkholder’s inequality, there exists a universal constant Cp such that

𝔼[|SK|p]Cp𝔼[|k=1KX~k2|p2]Cp(k=1KX~k2p2)p2=CpKp2𝔼[|X~1|p].

Hence, as SK′−SK has the same distribution as SK′−K,

𝔼[|SKK-SKK|p]2p-1Cp𝔼[|X~1|p][|1K-1K|pKp2+(1K)p|K-K|p2].

Keeping in mind that K′≥K, we derive

|1K-1K|pKp2+|1K|p|K-K|p2=|1K-1K|p2|KK-1|p2+|KK|p2|1K-1K|p22|1K-1K|p2.

We conclude by setting V1(p)=[f]Lipp2pCp𝔼[|X~1|p]. ∎

For the Central Limit Theorem to hold, the key point is the following lemma.

Lemma 6.4

Assume that f:ℝ→ℝ is a Lipschitz continuous function, differentiable with f′ continuous. Let ζ be an 𝒩(0,1)-distributed random variable independent of Y. Then, as h→0,

(38)Z(h)=√(M/h)(Yh/M−Yh)⟹√(M−1)f′(ϕ0(Y))σF(Y)ζ.

Proof.

First note that Z(h)=zh(M)(Y), where zh(M) is defined by

zh(M)(y)=√(M/h)(f(ϕh/M(y))−f(ϕh(y)))for all y∈ℝqY.

Let yqY. We have

(39)zh(M)(y)=-(01f(vϕh(y)+(1-v)ϕhM(y))𝑑v)uh(M)(y)

with

uh(M)(y)=√(M/h)(ϕh(y)−ϕh/M(y)).

We derive from the Strong Law of Large Numbers that

limh0ϕh(y)=ϕ0(y)=limh0ϕhM(y)a.s.

and by continuity of the function (x1,x2)↦∫01f′(vx1+(1−v)x2)dv (since f′ is continuous) we get

(40)limh→0∫01f′(vϕh(y)+(1−v)ϕh/M(y))dv=f′(ϕ0(y))a.s.

We now have to study the convergence of the random sequence uh(M)(y) as h goes to zero. We set ξ~k=ξK+k, k=1,…,K(M−1). Note that the (ξ~k)k=1,…,K(M−1) are i.i.d. with the distribution of ξ1 and are independent of (ξk)k=1,…,K. Then we can write

uh(M)(y)=MK(1Kk=1KF(ξk,y)-1MKk=1MKF(ξk,y))
=MK(M-1MKk=1K(F(ξk,y)-ϕ0(y))-1MKk=K+1MK(F(ξk,y)-ϕ0(y)))
=M-1M(1Kk=1KF(ξk,y)-ϕ0(y))-M-1M[1K(M-1)(k=1K(M-1)F(ξ~k,y)-ϕ0(y))].

Owing to the Central Limit Theorem and the independence of the two terms on the right-hand side of the above equality, we derive that

uh(M)(y)⟹((M−1)/√M)σF(y)ζ1−√((M−1)/M)σF(y)ζ2as h→0,

where ζ1 and ζ2 are two independent random variables, both following a standard Gaussian distribution. Hence, noting that ((M−1)/√M)²+(√((M−1)/M))²=M−1, we obtain

(41)uh(M)(y)⟹√(M−1)σF(y)ζwith ζ∼𝒩(0,1).

By Slutsky’s Theorem, we derive from (39), (40) and (41) that for every yqY,

(42)zh(M)(y)M-1f(ϕ0(y))σF(y)ζas h0.

Recall that Z(h)=zh(M)(Y). We prove (38) combining Fubini’s Theorem with the Lebesgue Dominated Convergence Theorem and (42). More precisely, for all G𝒞b we have

limh→0𝔼[G(Z(h))]=limh→0𝔼[G(zh(M)(Y))]=𝔼[limh→0G(zh(M)(Y))]
=𝔼[G(√(M−1)f′(ϕ0(Y))σF(Y)ζ)],

as desired. ∎

We are now in a position to prove that the nested Monte Carlo satisfies the assumptions of the Central Limit Theorem 3.3.

Theorem 6.5

Assume that f:ℝ→ℝ is a Lipschitz continuous function, differentiable with f′ continuous. Then (Z(h))h∈H is L2-uniformly integrable and

(43)limh→0‖Z(h)‖2²=(M−1)‖f′(ϕ0(Y))σF(Y)‖2².

As a consequence, the ML2R and MLMC estimators (2) and (1) satisfy a Central Limit Theorem in the sense of Theorem 3.3 (case β=1).

Proof.

We prove first the L2-uniform integrability of (Z(h))h. As f is Lipschitz, we have

|Z(h)|2≤[f]Lip2|uh(M)(Y)|2with uh(M)(y)=√(M/h)(ϕh(y)−ϕh/M(y)).

Consequently, it suffices to show that (uh(M)(Y))h is L2-uniformly integrable, to establish the L2-uniform integrability of (Z(h))h.

We saw in the proof of Lemma 6.4 that uh(M)(Y)⇒√(M−1)σF(Y)ζ as h goes to 0, where ζ is a standard normal random variable independent of Y. Owing to Lemma 5.2 (b), the uniform integrability will follow from limh→0‖uh(M)(Y)‖2=‖√(M−1)σF(Y)ζ‖2. In fact, this convergence holds as an equality. Indeed,

uh(M)(Y)22=MK𝔼[(SK-SMK)2]=MK𝔼[(M-1MKSK-1MK(SMK-SK))2].

We notice that SMK-SK is independent of SK. Hence, since the ξk are independent,

uh(M)(Y)22=MK(𝔼[(M-1MKSK)2]+𝔼[(1MK(SMK-SK))2])
=MK((M-1MK)2K𝔼[(X~1)2]+(1MK)2(MK-K)𝔼[(X~1)2])
=(M-1)𝔼[(X~1)2]=(M-1)𝔼[σF2(Y)].

We prove now (43) using again Lemma 5.2 (b) with the convergence in law of (Z(h))h established in Lemma 6.4. ∎

We notice that, if assumption (37) in Proposition 6.3 holds with some p>2, then the L2-uniform integrability condition is much easier to establish, since it is a direct consequence of Lemma 5.2 (a).

6.3 Smooth nested Monte Carlo

When the function f is smooth, namely f∈𝒞1+ρ(ℝ,ℝ) for some ρ∈(0,1] (i.e. f′ is ρ-Hölder), a variant of the former multilevel nested estimator has been used in [4] (see also [8]) to improve the strong rate of convergence, so as to attain the asymptotically unbiased setting β>1 in condition (SEβ). A root M being given, the idea is to replace, in the successive refined levels of the ML2R and MLMC estimators, the difference Yh/M−Yh (where h=1/K, K∈K0ℕ*) by

Yh,hM:=f(1MKk=1MKF(ξk,Y))-1Mm=1Mf(1Kk=1KF(ξ(m-1)K+k,Y)).

It is clear that

𝔼[Yh,hM]=𝔼[Yh/M−Yh].

Computations similar to those carried out in the proof of Proposition 6.3 yield that, if X=F(ξ,Y)∈Lp(1+ρ)(ℙ) for some p≥2, then

(44)‖Yh,hM‖pp≤VM(ρ,p)|h−h/M|p(1+ρ)/2=VM(ρ,p)|1−1/M|p(1+ρ)/2|h|p(1+ρ)/2.
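On the same kind of toy model as before (X=Y+ξ with Y, ξ independent standard Gaussians, f(u)=u², both illustrative choices of ours), the following sketch compares the empirical variance of the plain level difference Yh/M−Yh with that of the smoothed difference Yh,hM built above; with a smooth f the latter is markedly smaller, which is what pushes the strong rate from β=1 to β=1+ρ>1.

```python
import random

def level_differences(K, M, N, rng):
    """Empirical variances of the plain and smoothed level differences
    for f(u) = u^2 on the toy model X = Y + xi (Y, xi standard Gaussian)."""
    f = lambda u: u * u
    plain, smooth = [], []
    for _ in range(N):
        y = rng.gauss(0.0, 1.0)
        xi = [rng.gauss(0.0, 1.0) for _ in range(M * K)]
        fine = f(y + sum(xi) / (M * K))                       # Y_{h/M}, h = 1/K
        coarse = f(y + sum(xi[:K]) / K)                       # Y_h on the first block
        blocks = [f(y + sum(xi[m * K:(m + 1) * K]) / K) for m in range(M)]
        plain.append(fine - coarse)
        smooth.append(fine - sum(blocks) / M)                 # Y_{h,h}^M
    var = lambda z: sum(v * v for v in z) / len(z) - (sum(z) / len(z)) ** 2
    return var(plain), var(smooth)

rng = random.Random(3)
v_plain, v_smooth = level_differences(100, 2, 20000, rng)
assert v_smooth < v_plain / 10.0   # the smoothed difference is far less noisy
```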

SLLN:

The first consequence is that the SLLN also holds for these modified estimators along the sequences of RMSE (εk)k1 satisfying k1εkp<+ owing to Theorem 3.1.

CLT:

When (44) is satisfied with p=2, one derives that β=(p/2)(1+ρ)=1+ρ>1, whatever ρ∈(0,1] is. Hence the only condition requested in this setting to obtain a CLT (see Theorem 3.2) is the L2-uniform integrability of (h−β/2Yh,hM)h, since no sharp rate is needed when β>1. Moreover, if (44) holds for some p∈(2,+∞), i.e. if X=F(ξ,Y)∈Lp(1+ρ)(ℙ) with p>2, then

h-β2Yh,hMpVM(ρ,p)1p|1-1M|12(1+ρ),

which in turn ensures the L2-uniform integrability.

As a final remark, note that if the function f is convex, then Yh,hM≤0, so that 𝔼[Yh/M]≤𝔼[Yh], which in turn implies by an easy induction that 𝔼[Y0]≤𝔼[Yh] for every h. A noticeable consequence is that the MLMC estimator has a positive bias.

These results can be extended to locally ρ-Hölder continuous functions with polynomial growth at infinity. For more details and a complete proof we refer to [9].

7 Conclusion

We proved a Strong Law of Large Numbers and a Central Limit Theorem for Multilevel estimators, with and without weights, and exhibited two applications: discretization schemes for diffusions, where we extend a result of Ben Alaya and Kebaier [2], and nested Monte Carlo, first mentioned in the Multilevel framework by Lemaire and Pagès in [12]. The Strong Law of Large Numbers is essentially a consequence of the strong error assumption (SEβ) (or of its reinforced version (9)) and of the independence of the estimator levels. The understanding of the behavior of the weights in the Multilevel Richardson–Romberg estimator is also crucial at this stage, as it is for the proof of the Central Limit Theorem which follows. Under additional L2-uniform integrability assumptions, both the weighted and the standard Multilevel estimators satisfy a Central Limit Theorem at rate ε as the quadratic error ε goes to 0. We distinguish two cases, depending on the value of the strong error rate β. When β>1, both the first coarse level and the successive fine correcting levels contribute to the asymptotic variance of the estimator, whereas when β∈(0,1] the asymptotic variance contains only the contribution of the correcting levels. With the choice of optimal parameters made in Tables 1 and 2, the standard Multilevel Monte Carlo estimator has a (bounded) bias, whereas the weighted Multilevel Monte Carlo estimator is asymptotically unbiased; hence exact confidence intervals can be built for the weighted Multilevel Monte Carlo estimator.

A Asymptotic of the weights

We focus our attention on the behavior of 𝐖jR when R+. We recall

𝐖jR=∑ℓ=jRaℓbR−ℓ=∑ℓ=0R−jaR−ℓbℓ

with

aℓ=1/∏1≤k≤ℓ−1(1−M−kα)

with the convention ∏k=10(1−M−kα)=1, and

bℓ=(−1)ℓM−αℓ(ℓ+1)/2/∏1≤k≤ℓ(1−M−kα).

For convenience, we set 𝐖jR=0 for j≥R+1, R∈ℕ*. We first notice that (aℓ)ℓ is an increasing and convergent sequence, and we set

limℓ→+∞aℓ=a∞.

The sequence (bℓ)ℓ converges to zero and, furthermore, the series with general term bℓ is absolutely convergent, since ∑ℓ≥1M−αℓ(ℓ+1)/2<+∞. This leads us to set

B̃=∑ℓ=0+∞|bℓ|<+∞andB=∑ℓ=0+∞bℓ<+∞.

Claim (a) of Lemma 4.3 is then proved. As a consequence,

|𝐖jR|≤a∞B̃for all R∈ℕ* and all j∈{1,…,R},

which proves claim (b) in Lemma 4.3. For the proof of claims (c) and (d), we will need the following:

Lemma A.1

Let φ:ℕ*→ℕ* be such that φ(R)∈{1,…,R−1} for all R≥1, with φ(R)→+∞ and R−φ(R)→+∞ as R→+∞. Then

limR+sup1jφ(R)|𝐖jR-1|=0.

In particular, for every j∈ℕ* we have 𝐖jR→1 as R→+∞. However, this convergence is not uniform, since 𝐖R−j+1R→a∞∑ℓ=0j−1bℓ for every j∈ℕ* as R→+∞.

Proof.

We write

|𝐖jR-aB|=|=0R-jaR-b-a=0R-jb-aR-j+1b|=0R-j(a-aR-)|b|+aR-j+1|b|.

First note that

limR+supj{1,,φ(R)}R-j+1|b|limR+R-φ(R)+1|b|=0

as R-φ(R)+ and 0|b|<+. On the other hand, for every j{1,,φ(R)},

=0R-j(a-aR-)|b|==jR(a-a)|bR-|
==jφ(R)(a-a)|bR-|+=φ(R)+1R(a-a)|bR-|
a=R-φ(R)R-j|b|+(a-aφ(R)+1)=0R-φ(R)-1|b|
a=R-φ(R)+|b|+(a-aφ(R)+1)=0R-φ(R)-1|b|.

Consequently, supj∈{1,…,φ(R)}∑ℓ=0R−j(a∞−aR−ℓ)|bℓ|→0 as R→+∞, since φ(R)→+∞ and R−φ(R)→+∞. Finally,

limR+sup1jφ(R)|𝐖jR-aB|=0.

Moreover, by definition we have 𝐖1R=1 for all R, which implies that B=1/a∞ and completes the proof. Finally, as aj→a∞,

𝐖R−j+1R=∑ℓ=0j−1aR−ℓbℓ⟶R→+∞a∞∑ℓ=0j−1bℓ,

as desired. ∎
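These weight properties are easy to check numerically from the closed forms of aℓ and bℓ recalled at the beginning of this appendix. The sketch below (with the illustrative choice M=2, α=1/2, not tied to any particular application) verifies that 𝐖1R=1 for every R and that the low-index weights are already close to 1 for moderate R.

```python
M, alpha = 2.0, 0.5   # illustrative root and weak-error exponent

def a(l):
    p = 1.0
    for k in range(1, l):               # empty product = 1 when l = 1
        p *= 1.0 - M ** (-k * alpha)
    return 1.0 / p

def b(l):
    p = 1.0
    for k in range(1, l + 1):
        p *= 1.0 - M ** (-k * alpha)
    return (-1.0) ** l * M ** (-alpha * l * (l + 1) / 2.0) / p

def W(j, R):
    # W_j^R = sum_{l=0}^{R-j} a_{R-l} b_l
    return sum(a(R - l) * b(l) for l in range(0, R - j + 1))

R = 10
assert abs(W(1, R) - 1.0) < 1e-10       # W_1^R = 1 for every R
assert abs(W(2, R) - 1.0) < 1e-4        # W_j^R -> 1 as R -> +infinity, j fixed
```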

Proof of Lemma 4.3 (c) and (d).

(c) Let us consider the non-negative measure on * defined by

mβ(j)=Mγ(j-1),γ<0.

We notice that it is a finite measure since

j1dmβ(j)=11-Mγ.

Since, as we saw in Lemma A.1, 𝐖jR1 as R+ for every j* and |𝐖jR|aB~, we derive from Lebesgue’s Dominated Convergence Theorem that

limR+j=2R|𝐖jR|Mγ(j-1)=j=2+limR+|𝐖jR|Mγ(j-1)=1M-γ-1.

(d) If γ<0, we consider the non-negative finite measure on * defined by mβ(j)=Mγ(j-1)vj since (vj)j1 is a bounded sequence of positive real numbers. As in the previous case (c) we have

limR+j=2R|𝐖jR|Mγ(j-1)vj=j=2+Mγ(j-1)vj.

If γ=0, let us consider a sequence φ(R){1,,R-1} such that φ(R)R1, R-φ(R)+ as R+ (for example φ(R)=R-R). Then we can write

|1Rj=2R|𝐖jR|vj-1Rj=2Rvj|[1Rj=2φ(R)||𝐖jR|-1|+1Rj=φ(R)+1R(|𝐖jR|+1)]supj2vj
[sup2jφ(R)|𝐖jR-1|φ(R)R+(aB~+1)(1-φ(R)R)]supj2vj.

Owing to Lemma A.1, sup2jφ(R)|𝐖jR-1|0 as R+. Using furthermore that φ(R)R1 as R+ and that limj+vj=1, one concludes by noting that, owing to Césàro’s Lemma, limR+1Rj=2Rvj=1.

If γ>0, first, we notice that

(45)j=2R|𝐖jR|Mγ(j-1)|𝐖RR|MγR=|aR|MγR+.

Let η>0. Since limj→+∞vj=1, there exists Nη∈ℕ* such that, for every j>Nη, |vj−1|<η/2. Owing to Lemma A.1, there exists Rη such that, for every R≥Rη, sup2≤j≤Nη|𝐖jR|<1+η. Then

|j=2R|𝐖jR|Mγ(j-1)vjj=2R|𝐖jR|Mγ(j-1)-1|j=2R|𝐖jR|Mγ(j-1)|vj-1|j=2R|𝐖jR|Mγ(j-1)
j=2Nη|𝐖jR|Mγ(j-1)|vj-1|j=2R|𝐖jR|Mγ(j-1)+η2j=Nη+1R|𝐖jR|Mγ(j-1)j=2R|𝐖jR|Mγ(j-1)
max2jNηMγ(j-1)|vj-1|Nηsup2jNη|𝐖jR|j=2R|𝐖jR|Mγ(j-1)+η2
f(Nη)(1+η)j=2R|𝐖jR|Mγ(j-1)+η2

where f(N)=max2≤j≤NMγ(j−1)|vj−1|N does not depend on R. Thanks to (45), there exists Rη′>0 such that, for every R≥max(Rη,Rη′), ∑j=2R|𝐖jR|Mγ(j−1)>2f(Nη)(1+η)/η, which proves that

limR+j=2R|𝐖jR|Mγ(j-1)vjj=2R|𝐖jR|Mγ(j-1)=1.

This leads to analyze

1MγRj=2R|𝐖jR|Mγ(j-1)=j=2R|𝐖jR|M-γ(R-j+1)=j=1R-1|𝐖R-j+1R|M-γj.

Using that |𝐖jR|aB~ for j{1,,R} and Lemma A.1, one derives from Lebesgue’s Dominated Convergence Theorem that

j=1R-1|𝐖R-j+1R|M-γjR+aj1|=0j-1b|M-γj<+

since |∑ℓ=0j−1bℓ|≤∑ℓ=0j−1|bℓ|≤B̃. ∎

References

[1] Bally V. and Talay D., The law of the Euler scheme for stochastic differential equations. I. Convergence rate of the distribution function, Probab. Theory Related Fields 104 (1996), no. 1, 43–60. 10.1007/BF01303802

[2] Ben Alaya M. and Kebaier A., Central limit theorem for the multilevel Monte Carlo Euler method, Ann. Appl. Probab. 25 (2015), no. 1, 211–234. 10.1214/13-AAP993

[3] Billingsley P., Convergence of Probability Measures, 2nd ed., Wiley Ser. Probab. Stat., John Wiley & Sons, New York, 1999. 10.1002/9780470316962

[4] Bujok K., Hambly B. and Reisinger C., Multilevel simulation of functionals of Bernoulli random variables with application to basket credit derivatives, Methodol. Comput. Appl. Probab. 17 (2015), no. 3, 579–604. 10.1007/s11009-013-9380-5

[5] Duffie D. and Glynn P., Efficient Monte Carlo simulation of security prices, Ann. Appl. Probab. 5 (1995), no. 4, 897–905. 10.1214/aoap/1177004598

[6] Gerbi A. A., Jourdain B. and Clément E., Ninomiya–Victoir scheme: Strong convergence, antithetic version and application to multilevel estimators, Monte Carlo Methods Appl. 22 (2016), no. 3, 197–228. 10.1515/mcma-2016-0109

[7] Giles M., Multilevel Monte Carlo path simulation, Oper. Res. 56 (2008), no. 3, 607–617. 10.1287/opre.1070.0496

[8] Giles M. B., Multilevel Monte Carlo methods, Acta Numer. 24 (2015), 259–328. 10.1007/978-3-642-41095-6_4

[9] Giorgi D., Ph.D. thesis, in progress.

[10] Hall P. and Heyde C. C., Martingale Limit Theory and its Application, Probab. Math. Statist., Academic Press, New York, 1980.

[11] Jacod J. and Protter P., Asymptotic error distributions for the Euler method for stochastic differential equations, Ann. Probab. 26 (1998), no. 1, 267–307. 10.1214/aop/1022855419

[12] Lemaire V. and Pagès G., Multilevel Richardson–Romberg extrapolation, preprint 2014, https://arxiv.org/abs/1401.1177v1; to appear in Bernoulli. 10.2139/ssrn.2539114

[13] Talay D. and Tubaro L., Expansion of the global error for numerical schemes solving stochastic differential equations, Stoch. Anal. Appl. 8 (1990), no. 4, 483–509. 10.1080/07362999008809220

Received: 2016-11-17
Accepted: 2017-1-26
Published Online: 2017-2-21
Published in Print: 2017-3-1

© 2017 by De Gruyter
