Publicly Available Published by De Gruyter February 21, 2017

Limit theorems for weighted and regular Multilevel estimators

  • Daphné Giorgi, Vincent Lemaire and Gilles Pagès

Abstract

We aim at analyzing, in terms of a.s. convergence and weak rate, the performances of the Multilevel Monte Carlo estimator (MLMC) introduced in [7] and of its weighted version, the Multilevel Richardson–Romberg estimator (ML2R), introduced in [12]. These two estimators make it possible to compute a very accurate approximation of I_0 = 𝔼[Y_0] by a Monte Carlo-type estimator when the (non-degenerate) random variable Y_0 ∈ L²(ℙ) cannot be simulated (exactly) at a reasonable computational cost, whereas a family of simulatable approximations (Y_h)_{h∈ℋ} is available. We carry out these investigations in an abstract framework before applying our results, mainly a Strong Law of Large Numbers and a Central Limit Theorem, to some typical fields of application: discretization schemes of diffusions and nested Monte Carlo.

MSC 2010: 65C05; 65C30

1 Introduction

In recent years, there has been increasing interest in the Multilevel Monte Carlo approach, which delivers remarkable improvements in computational complexity in comparison with standard Monte Carlo in a biased framework. We refer the reader to [8] for a broad outline of the ideas behind the Multilevel Monte Carlo method and various recent generalizations and extensions. In this paper we establish a Strong Law of Large Numbers and a Central Limit Theorem for two kinds of multilevel estimators: the Multilevel Monte Carlo estimator (MLMC) introduced by Giles in [7] and the Multilevel Richardson–Romberg (weighted) estimator (ML2R) introduced in [12]. We consider a rather general, and in some ways abstract, framework which allows us to state these results whatever the strong rate parameter (usually denoted by β) is. To be more precise, we deal with the versions of these estimators designed to achieve a root mean squared error (RMSE) ε, and establish these results as ε → 0. Doing so, we retrieve some recent results established in [2] in the framework of Euler discretization schemes of Brownian diffusions. We also deduce an SLLN and a CLT for Multilevel nested Monte Carlo, which are new results to our knowledge. More generally, our results apply to any implementation of Multilevel Monte Carlo methods.

Let (Ω,𝒜,ℙ) be a probability space and let (Y_h)_{h∈ℋ} be a family of real-valued random variables in L²(ℙ) associated to Y_0, where ℋ = {𝐡/n : n ≥ 1}, such that lim_{h→0} ‖Y_h − Y_0‖_2 = 0. In the sequel, a fixed h will be called the bias parameter (though it appears in a different framework as a discretization parameter). In what follows we will be interested in the computational cost of the estimators, denoted by the Cost(·) function. We assume that the simulation of Y_h has an inverse linear complexity, i.e. Cost(Y_h) = h^{-1}. A natural estimator of I_0 = 𝔼[Y_0] is the standard Monte Carlo estimator, which reads, for a fixed h,

I_h^N = (1/N) Σ_{k=1}^N Y_h^k,  with Cost(I_h^N) = h^{-1} N,

where (Y_h^k)_{k≥1} are i.i.d. copies of Y_h and N is the size of the estimator, which controls the statistical error. In order to define a Multilevel estimator, we consider a depth R ≥ 2 (the finest level of simulation) and a geometrically decreasing sequence of bias parameters h_j = 𝐡/n_j with n_j = M^{j−1}, j = 1,…,R. If N is the estimator size, we consider an allocation policy q = (q_1,…,q_R) such that, at each level j = 1,…,R, we simulate N_j = ⌈N q_j⌉ scenarios (see (1) and (2) below). Thus, we consider R independent copies of the family Y^{(j)} = (Y_h^{(j)})_{h∈ℋ}, j = 1,…,R, attached to independent random copies Y_0^{(j)} of Y_0. Moreover, let (Y^{(j),k})_{k≥1} be independent sequences of independent copies of Y^{(j)}. We denote by I_π^N an estimator of I_0 of size N, attached to a simulation parameter π ∈ Π_d.
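The standard Monte Carlo estimator I_h^N above and its inverse-linear cost can be sketched as follows. This is a toy illustration only: the model sample_Yh, a standard Gaussian Y_0 plus a deterministic bias c_1·h, is hypothetical and not one of the paper's applications.

```python
import random

def crude_mc(sample_Yh, h, N):
    """Crude Monte Carlo estimator I_h^N = (1/N) sum_k Y_h^k,
    with inverse-linear simulation cost Cost(I_h^N) = h^{-1} N."""
    estimate = sum(sample_Yh(h) for _ in range(N)) / N
    cost = N / h
    return estimate, cost

# Hypothetical family: Y_h = Y_0 + c1*h with Y_0 ~ N(0, 1), so I_0 = E[Y_0] = 0
# and the bias of I_h^N is exactly c1*h (weak error rate alpha = 1).
def sample_Yh(h, c1=1.0):
    return random.gauss(0.0, 1.0) + c1 * h

random.seed(42)
est, cost = crude_mc(sample_Yh, h=0.1, N=10_000)
```

The statistical error of the estimate is of order 1/√N, while the bias c_1·h is fixed by the bias parameter; this is the trade-off the multilevel estimators below improve on.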

A standard Multilevel Monte Carlo (MLMC) estimator, as introduced by Giles in [7], reads

(1)  I_π^N = I_{h,R,q}^N = (1/N_1) Σ_{k=1}^{N_1} Y_h^{(1),k} + Σ_{j=2}^R (1/N_j) Σ_{k=1}^{N_j} (Y_{h_j}^{(j),k} − Y_{h_{j−1}}^{(j),k})

with π = (h,R,q).

A Multilevel Richardson–Romberg (ML2R) estimator, as introduced in [12], is a weighted version of (1) which reads

(2)  I_π^N = I_{h,R,q}^N = (1/N_1) Σ_{k=1}^{N_1} Y_h^{(1),k} + Σ_{j=2}^R (𝐖_j^R/N_j) Σ_{k=1}^{N_j} (Y_{h_j}^{(j),k} − Y_{h_{j−1}}^{(j),k})

with π = (h,R,q). The weights (𝐖_j^R)_{j=1,…,R} are explicitly defined as functions of the weak error rate α (see assumption (WE_{α,R̄}) below) and of the refiners n_j, j = 1,…,R, in order to kill the successive bias terms in the weak error expansion (see Section 4.3 for more details on the weights). When there is no ambiguity, we will keep denoting by I_π^N estimators of both classes. We notice that a Crude Monte Carlo estimator of size N formally appears as an ML2R estimator with R = 1, and an MLMC estimator appears as an ML2R estimator in which the weights are set to 𝐖_j^R = 1, j = 1,…,R. Based on the inverse linear complexity of Y_h, it is clear that the simulation cost of both MLMC and ML2R estimators is given by

Cost(I_{h,R,q}^N) = (N/h) Σ_{j=1}^R q_j (n_{j−1} + n_j)

with the convention n_0 = 0. The difference between the costs of the MLMC and ML2R estimators comes from the different choices of the parameters M, R, h, q and N.
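Estimators (1) and (2) can be sketched in a few lines. This is a minimal illustration, not the paper's calibrated implementation: sample_level returns a coarse sample Y_h^{(1),k} when no coarse bias parameter is given, and the coupled difference Y_{h_j}^{(j),k} − Y_{h_{j−1}}^{(j),k} otherwise; the coupled family built by make_sampler is hypothetical. Passing weights=None reproduces MLMC (all weights equal to 1), while passing the ML2R weights 𝐖_j^R gives the weighted version.

```python
import random

def multilevel(sample_level, h, R, M, N_sizes, weights=None):
    """Estimators (1)/(2): coarse-level mean plus weighted correcting-level means.
    N_sizes[j-1] is the number N_j of scenarios at level j; weights=None means
    W_j^R = 1 for all j (MLMC)."""
    W = weights if weights is not None else [1.0] * R
    est = sum(sample_level(h, None) for _ in range(N_sizes[0])) / N_sizes[0]
    for j in range(2, R + 1):
        h_fine, h_coarse = h / M ** (j - 1), h / M ** (j - 2)
        Nj = N_sizes[j - 1]
        est += W[j - 1] * sum(sample_level(h_fine, h_coarse) for _ in range(Nj)) / Nj
    return est

# Hypothetical coupled family: Y_h = Y_0 + c1*h + h^{beta/2}*Z with the SAME
# (Y_0, Z) used at both bias parameters of a level, so level variances decay.
def make_sampler(c1=1.0, beta=1.0):
    def sample(h_fine, h_coarse):
        y0, z = random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)
        Y = lambda h: y0 + c1 * h + h ** (beta / 2) * z
        return Y(h_fine) if h_coarse is None else Y(h_fine) - Y(h_coarse)
    return sample

random.seed(0)
est = multilevel(make_sampler(), h=0.5, R=4, M=2, N_sizes=[4000, 1000, 500, 250])
```

The telescoping structure makes the expectation of the MLMC estimator equal to 𝔼[Y_{h_R}], so only the finest-level bias remains, while most of the simulation budget stays at the cheap coarse level.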

The parameters are calibrated, a root M ≥ 2 being fixed, by minimizing the simulation cost for a given target Mean Squared Error (or L²-error) ε, namely,

(3)  (π(ε), N(ε)) = argmin_{‖I_π^N − I_0‖_2 ≤ ε} Cost(I_π^N).

This calibration has been carried out in [12] for both the MLMC and ML2R estimators under the following assumptions on the family (Y_h)_{h∈ℋ}. The first one, called bias error expansion (or weak error assumption), states:

(WE_{α,R̄})  ∃ α > 0, ∃ R̄ ≥ 1, ∃ (c_r)_{1≤r≤R̄}:  𝔼[Y_h] − 𝔼[Y_0] = Σ_{k=1}^{R̄} c_k h^{αk} + h^{αR̄} η_{R̄}(h),  lim_{h→0} η_{R̄}(h) = 0.

The second one, called the strong approximation error assumption, states:

(SE_β)  ∃ β > 0, ∃ V_1 ≥ 0:  ‖Y_h − Y_0‖_2² = 𝔼[|Y_h − Y_0|²] ≤ V_1 h^β.

Note that the strong error assumption can sometimes be replaced by the sharper

∃ β > 0, ∃ V_1 ≥ 0:  ‖Y_h − Y_{h′}‖_2² = 𝔼[|Y_h − Y_{h′}|²] ≤ V_1 |h − h′|^β,  h, h′ ∈ ℋ.

From now on, we set I_π^{N(ε)} := I_{π(ε)}^{N(ε)}, where π(ε) and N(ε) are close to solutions of (3) (see [12] for the construction of these parameters and Tables 1 and 2 for the explicit values). As mentioned by Duffie and Glynn in [5], the global cost of the standard Monte Carlo estimator with these optimal parameters satisfies

Cost(I_π^{N(ε)}) ≳ K(α) ε^{−(2+1/α)},

where the finite real constant K(α) depends on the structural parameters α, Var(Y_0), 𝐡, and we recall that f(ε) ≳ g(ε) if and only if lim sup_{ε→0} g(ε)/f(ε) ≤ 1. Giles for MLMC in [7] and Lemaire and Pagès for ML2R in [12] showed that, using these optimal parameters, the global cost is upper bounded by a function of ε depending on the weak error expansion rate α and on the strong error rate β. More precisely, for both estimators we have

(4)  Cost(I_π^{N(ε)}) ≲ K(α,β,M) v(ε),

where the finite real constant K(α,β,M) is explicit and differs between MLMC and ML2R (see [12] for more details). Denoting by v_MLMC and v_ML2R the dominating functions in (4) for the MLMC and ML2R estimator, respectively, we obtain two distinct cases. In the case β > 1, both estimators behave like an unbiased Monte Carlo estimator, i.e. v_MLMC(ε) = v_ML2R(ε) = ε^{−2}. In the case β ≤ 1, ML2R is asymptotically much better than MLMC since lim_{ε→0} v_ML2R/v_MLMC = 0. More precisely, we have the following table.

        v_MLMC(ε)              v_ML2R(ε)
β = 1   ε^{−2} log(1/ε)²       ε^{−2} log(1/ε)
β < 1   ε^{−2−(1−β)/α}         ε^{−2} exp(((1−β)/√α) √(2 log(1/ε) log M))
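The dominating functions of the table can be compared numerically; the sketch below encodes them directly for β ≤ 1 (the β < 1 expression for ML2R is taken as the stated reading of the table, which is an assumption of this sketch):

```python
import math

def v_mlmc(eps, alpha, beta):
    """Dominating complexity of MLMC for beta <= 1 (table above)."""
    if beta == 1:
        return eps ** -2 * math.log(1 / eps) ** 2
    return eps ** (-2 - (1 - beta) / alpha)

def v_ml2r(eps, alpha, beta, M=2):
    """Dominating complexity of ML2R for beta <= 1 (table above)."""
    if beta == 1:
        return eps ** -2 * math.log(1 / eps)
    return eps ** -2 * math.exp(
        (1 - beta) / math.sqrt(alpha)
        * math.sqrt(2 * math.log(1 / eps) * math.log(M)))
```

For β = 1 the ratio v_ML2R/v_MLMC equals 1/log(1/ε), and for β < 1 it is ε^{(1−β)/α} times a subpolynomial factor; in both cases it vanishes as ε → 0, as claimed.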

The aim of this paper is to prove a Strong Law of Large Numbers (SLLN) and a Central Limit Theorem (CLT) for both the MLMC and ML2R estimators calibrated with these optimal parameters. First notice that, as these parameters have been computed under the constraint ‖I_π^{N(ε)} − I_0‖_2 ≤ ε, convergence in L² holds by construction. As a consequence, it is straightforward that, for every sequence (ε_k)_{k≥1} such that Σ_{k≥1} ε_k² < +∞,

Σ_{k≥1} 𝔼[|I_π^{N(ε_k)} − I_0|²] < +∞,

so that

I_π^{N(ε_k)} → I_0 a.s. as k → +∞.

We will weaken the assumption on the sequence (ε_k)_{k≥1} when Y_h has higher finite moments, so we will investigate some L^p criteria for p ≥ 2. Moreover, under a sharper strong error assumption and some additional uniform integrability hypotheses, we will show that

(I_π^{N(ε)} − I_0)/ε − m(ε) ⇒ 𝒩(0, σ²) as ε → 0,

with m(ε) = μ(ε)/ε, where μ(ε) = 𝔼[I_π^{N(ε)}] − I_0 is the bias of the estimator, and m² + σ² ≤ 1, owing to the explicit expression of the constraint

(5)  ‖I_π^{N(ε)} − I_0‖_2² = μ(ε)² + Var(I_π^{N(ε)}) ≤ ε².

In particular, we will prove that lim_{ε→0} m(ε) = 0 for the ML2R estimator. More precisely, we will use in the proof the expansion

(I_π^{N(ε)} − I_0)/ε = m(ε) + σ_2 ζ_2^ε + (1/(ε√N(ε))) σ_1 ζ_1^ε as ε → 0,

where ζ_1^ε and ζ_2^ε are two independent variables such that (ζ_1^ε, ζ_2^ε) ⇒ 𝒩(0, I_2) as ε → 0. We will see that ζ_1^ε comes from the coarse level of the estimator, while ζ_2^ε derives from the sum of the refined levels. When β > 1, ε√N(ε) converges to a constant, hence the variance σ² results from the sum of the variance σ_1² of the coarse level and the variance σ_2² of the sum of the refined levels. When β ∈ (0,1], since ε√N(ε) diverges, the contribution of the coarse level to σ² disappears and only the variance of the refined levels contributes to σ². More details on m and σ will follow in Section 3.

The paper is organized as follows. In Section 2 we briefly recall the technical background for Multilevel Monte Carlo estimators. In Section 3 we state our main results: a Strong Law of Large Numbers and a Central Limit Theorem in a quite general framework. Section 4 is devoted to the analysis of the asymptotic behavior of the optimal parameters, to the study of the weights of the ML2R estimator, and to the bias of the estimators and its robustness. These are auxiliary results needed for the proofs of the main theorems, which we detail in Section 5. In Section 6 we apply these results first to discretization schemes of Brownian diffusions, where we retrieve recent results by Ben Alaya and Kebaier in [2], and then to nested Monte Carlo.

Notations.

  1. Let ℕ* = {1,2,…} denote the set of positive integers and ℕ = ℕ* ∪ {0}.

  2. For every x ∈ ℝ_+ = [0,+∞), ⌈x⌉ denotes the unique n ∈ ℕ* satisfying n − 1 < x ≤ n.

  3. If (a_n)_n and (b_n)_n are two sequences of real numbers, a_n ∼ b_n if a_n = ε_n b_n with lim_n ε_n = 1, a_n = O(b_n) if (ε_n)_n is bounded, and a_n = o(b_n) if lim_n ε_n = 0.

  4. We denote by Var(X) and σ(X) the variance and the standard deviation of a random variable X, respectively.

2 Brief background on MLMC and ML2R estimators

We follow [12] and briefly recall the construction of the optimal parameters derived from the optimization problem (3). The first step is a stratification procedure allowing us to establish the optimal allocation policy (q_1,…,q_R) when the other parameters R, h, M are fixed. We focus on the effort of the estimator, defined as the product of the cost times the variance, i.e. Effort(I_π^N) = Cost(I_π^N) × Var(I_π^N). Introducing the notations

Z_j = (hM^{−(j−1)})^{−β/2} (Y_{hM^{−(j−1)}}^{(j)} − Y_{hM^{−(j−2)}}^{(j)}) for all j ≥ 2,  and  Z_1 = Y_h^{(1)},

a Multilevel estimator, MLMC (1) or ML2R (2), can be written as

I_π^N = (1/N_1) Σ_{k=1}^{N_1} Z_1^k + Σ_{j=2}^R (1/N_j) Σ_{k=1}^{N_j} W_j^R (hM^{−(j−1)})^{β/2} Z_j^k,

where W_j^R = 1 for MLMC and W_j^R = 𝐖_j^R for ML2R. By definition, and using the approximation N_j ≈ N q_j, the effort satisfies

Effort(I_π^N) ≈ (Σ_{j=1}^R q_j Cost(Z_j)) (Var(Y_h)/q_1 + Σ_{j=2}^R (W_j^R)² (hM^{−(j−1)})^β Var(Z_j)/q_j).

Given R, h, M, a minimization of q ↦ Effort(I_π^N) over {q ∈ (0,1)^R : Σ_{j=1}^R q_j = 1} gives the solution

(6)  q_1* = μ* √(Var(Y_h)/Cost(Y_h)),  q_j* = μ* (hM^{−(j−1)})^{β/2} |W_j^R| √(Var(Z_j)/Cost(Z_j)),  j = 2,…,R,  with μ* such that Σ_{j=1}^R q_j = 1,

using the Schwarz inequality (see [12, Theorem 3.6] for a detailed proof). The strong error assumption (SE_β) allows us to upper bound Var(Y_h) and Var(Z_j) by Var(Y_0)(1+θh^{β/2})², with θ = √(V_1/Var(Y_0)), and by V_1(1+M^{β/2})², respectively. On the other hand, we assume that Cost(Z_j) = (1+M^{−1})(hM^{−(j−1)})^{−1}. Plugging these estimates into (6), we obtain the optimal allocation policy used in this paper and given in Tables 1 and 2. Notice that this particular choice of the q_j is not unique: if we replace (SE_β) by a different strong error assumption, for example the sharp version, then we have to replace the upper bound for Var(Z_j) by V_1|1−M^{β/2}|², and a new expression for the q_j follows. In the same spirit, Cost(Z_j) may differ and hence have an impact on the q_j; see [6] or the nested Monte Carlo method as examples of alternative costs.

The second step is to select h(ε) and R(ε) ≥ 2 so as to minimize the cost of the optimally allocated estimator given a prescribed RMSE ε > 0. To do this, we use the weak error assumption (WE_{α,R̄}) and obtain, for the MLMC estimator,

h(ε) = 𝐡/⌈𝐡 (1+2α)^{1/(2α)} |c_1|^{1/α} ε^{−1/α} M^{−(R−1)}⌉

with c_1 the first coefficient in the weak error expansion. For the ML2R estimator we make the additional assumption c̃ = lim_R |c_R|^{1/R} ∈ (0,+∞) and then obtain

h(ε) = 𝐡/⌈𝐡 (1+2αR)^{1/(2αR)} c̃^{1/α} ε^{−1/(αR)} M^{−(R−1)/2}⌉.

The depth parameter R ≥ 2 follows, and the choice of N is directly related to the constraint (5).

We report in Tables 1 and 2 the ML2R and MLMC values of R(ε), h(ε), q(ε) = (q_1(ε),…,q_R(ε)) and N(ε) computed in [12] and used throughout this paper. Note that these parameters are used in the web application of the LPMA at http://simulations.lpma-paris.fr/multilevel/. The following constants are used in this paper and in Tables 1 and 2:

θ = √(V_1/Var(Y_0))  and  c̃ = lim_R |c_R|^{1/R} ∈ (0,+∞)

and

C̲_{M,β} = (1+M^{β/2})/√(1+M^{−1})  and  C̄_{M,β} = (1+M^{β/2})√(1+M^{−1}).

Notice that 1+M^{β/2} comes from (SE_β) and √(1+M^{−1}) from the cost; hence the constants C̲_{M,β} and C̄_{M,β} depend on them, but not on anything else.

Table 1

Optimal parameters for the ML2R estimator.

R(ε)  ⌈1/2 + log(c̃^{1/α}𝐡)/log M + √((1/2 + log(c̃^{1/α}𝐡)/log M)² + 2 log(A/ε)/(α log M))⌉,  A = √(1+4α)
h(ε)  𝐡/⌈𝐡 (1+2αR)^{1/(2αR)} c̃^{1/α} ε^{−1/(αR)} M^{−(R−1)/2}⌉
q(ε)  q_1 = μ*(1+θh^{β/2}),  q_j = μ* θh^{β/2} C̄_{M,β} |𝐖_j^{(R,M)}| M^{−(1+β)(j−1)/2},  j = 2,…,R,  Σ_{1≤j≤R} q_j = 1
N(ε)  ⌈(1 + 1/(2αR)) (Var(Y_0)/(ε²μ*)) (1 + θh^{β/2} + θh^{β/2} C̄_{M,β} Σ_{j=2}^R |𝐖_j^{(R,M)}| M^{(1−β)(j−1)/2})⌉
Table 2

Optimal parameters for the MLMC estimator.

R(ε)  ⌈1 + log(|c_1|^{1/α}𝐡)/log M + log(A/ε)/(α log M)⌉,  A = √(1+2α)
h(ε)  𝐡/⌈𝐡 (1+2α)^{1/(2α)} |c_1|^{1/α} ε^{−1/α} M^{−(R−1)}⌉
q(ε)  q_1 = μ*(1+θh^{β/2}),  q_j = μ* θh^{β/2} C̄_{M,β} M^{−(1+β)(j−1)/2},  j = 2,…,R,  Σ_{1≤j≤R} q_j = 1
N(ε)  ⌈(1 + 1/(2α)) (Var(Y_0)/(ε²μ*)) (1 + θh^{β/2} + θh^{β/2} C̄_{M,β} Σ_{j=2}^R M^{(1−β)(j−1)/2})⌉

In what follows, we shorten these notations by setting

(7)  R(ε) = ⌈C_R^{(1)} + √(C_R^{(2)} + (2/(α log M)) log(1/ε))⌉

with

C_R^{(1)} = 1/2 + log(c̃^{1/α}𝐡)/log M  and  C_R^{(2)} = (1/2 + log(c̃^{1/α}𝐡)/log M)² + 2 log A/(α log M)

for ML2R, and

(8)  R(ε) = ⌈C_R^{(1)} + (1/(α log M)) log(1/ε)⌉

with

C_R^{(1)} = 1 + log(|c_1|^{1/α}𝐡)/log M + log A/(α log M)

for MLMC.
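The depth formulas (7)–(8) can be sketched directly. In this sketch the constants A = √(1+4α) (ML2R) and A = √(1+2α) (MLMC) are taken as the intended reading of Tables 1 and 2, and c_tilde, c1, h_bar stand for the user-supplied estimates ĉ, ĉ_1 and the coarsest bias parameter 𝐡; the floor at 2 enforces R ≥ 2:

```python
import math

def R_ml2r(eps, alpha, M, c_tilde=1.0, h_bar=1.0):
    """Depth R(eps) for ML2R, eq. (7): ceil(C1 + sqrt(C2 + 2 log(1/eps)/(alpha log M)))."""
    A = math.sqrt(1 + 4 * alpha)
    C1 = 0.5 + math.log(c_tilde ** (1 / alpha) * h_bar) / math.log(M)
    C2 = C1 ** 2 + 2 * math.log(A) / (alpha * math.log(M))
    return max(2, math.ceil(C1 + math.sqrt(C2 + 2 * math.log(1 / eps) / (alpha * math.log(M)))))

def R_mlmc(eps, alpha, M, c1=1.0, h_bar=1.0):
    """Depth R(eps) for MLMC, eq. (8): ceil(C1 + log(1/eps)/(alpha log M))."""
    A = math.sqrt(1 + 2 * alpha)
    C1 = 1 + math.log(abs(c1) ** (1 / alpha) * h_bar) / math.log(M) + math.log(A) / (alpha * math.log(M))
    return max(2, math.ceil(C1 + math.log(1 / eps) / (alpha * math.log(M))))
```

Note the different growth: the ML2R depth grows like √(log(1/ε)) while the MLMC depth grows like log(1/ε), which is the source of the complexity gain for β ≤ 1.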

3 Main results

The asymptotic behavior, as ε goes to 0, of the parameters given in Tables 1 and 2 will be presented in Section 4. We proceed here to the analysis of the asymptotic behavior of the estimator I_π^{N(ε)} := I_{π(ε)}^{N(ε)} as ε → 0.

3.1 Strong Law of Large Numbers

We will first prove a Strong Law of Large Numbers, namely

Theorem 3.1 (Strong Law of Large Numbers)

Let p ≥ 2. Assume (WE_{α,R̄}) for all R̄ ≥ 1 and Y_0 ∈ L^p. Assume furthermore the following L^p-strong error rate assumption:

(9)  ∃ β > 0, ∃ V_1^{(p)} ≥ 0:  ‖Y_h − Y_0‖_p^p = 𝔼[|Y_h − Y_0|^p] ≤ V_1^{(p)} h^{βp/2},  h ∈ ℋ.

Then, for every sequence of positive real numbers (ε_k)_{k≥1} such that Σ_{k≥1} ε_k^p < +∞, both MLMC and ML2R estimators satisfy

(10)  I_π^{N(ε_k)} → I_0 a.s. as k → +∞.

3.2 Central Limit Theorems

A necessary condition for a Central Limit Theorem to hold is that the ratio between the variance of the estimator and ε² converges as ε → 0. It seems intuitive that (SE_β) should then be reinforced by a sharper estimate as h → 0. We define

(11)  Z(h) := (h/M)^{−β/2} (Y_{h/M} − Y_h)  and  Z_j := Z(𝐡/n_{j−1}).

A necessary condition to obtain a CLT is that (Z(h))_{h∈ℋ} be L²-uniformly integrable. We state two results, the first one in the case β > 1 and the second one in the case β ≤ 1.

3.2.1 Case β>1

In this case, note that owing to (SE_β) we have sup_{j≥1} Var(Z_j) ≤ V_1 (1+M^{β/2})².

Theorem 3.2 (Central Limit Theorem, β > 1)

Assume (SE_β) for some β > 1 and that (Z(h))_{h∈ℋ} is L²-uniformly integrable. We set

σ_1² = (1/Σ) Var(Y_𝐡)/(Var(Y_0)(1+θ𝐡^{β/2}))  and  σ_2² = (1/Σ) (𝐡^{β/2}/(√(Var(Y_0)V_1) C̄_{M,β})) Σ_{j≥2} M^{(1−β)(j−1)/2} Var(Z_j)

with

Σ = Σ(M,β,θ,𝐡) = 1 + θ𝐡^{β/2} (1 + C̄_{M,β} M^{(1−β)/2}/(1 − M^{(1−β)/2})).

Then the following statements hold.

  1. ML2R estimator: Assume (WE_{α,R̄}) for all R̄ ≥ 1. Then

    (12)  (I_π^{N(ε)} − I_0)/ε ⇒ 𝒩(0, σ_1² + σ_2²) as ε → 0.
  2. MLMC estimator: Assume (WE_{α,R̄}) for R̄ = 1. Then there exists, for every ε > 0, m(ε) such that M^{−α}/√(1+2α) ≤ |m(ε)| ≤ 1/√(1+2α) and

    (I_π^{N(ε)} − I_0)/ε − m(ε) ⇒ 𝒩(0, (2α/(2α+1))(σ_1² + σ_2²)) as ε → 0.

Note that the variance of the first term Y_𝐡, associated to the coarse level, contributes to the asymptotic variance of the estimator through σ_1², while the variances Var(Z_j), j ≥ 2, of the correcting levels contribute through σ_2². The ML2R estimator is asymptotically unbiased, whereas the MLMC estimator has an a priori non-vanishing bias term. This gain on the bias for ML2R is balanced by the variance, which is reduced by a factor 2α/(1+2α) for MLMC. The constraint (5) yields σ_1² + σ_2² ≤ 1, which is easy to verify if we recall that

Var(Y_𝐡) ≤ Var(Y_0)(1+θ𝐡^{β/2})²,  Var(Z_j) ≤ V_1(1+M^{β/2})²  and  m(ε)² ≤ 1/(1+2α).

3.2.2 Case β ∈ (0,1]

In this case, we make the additional, sharper assumption that lim_{h→0} ‖Z(h)‖_2² = v(M,β). This assumption allows us to identify lim_{j→+∞} Var(Z_j). More precisely, note that owing to the consistency of the strong and weak error rates, 2α ≥ β, and owing to (WE_{α,R̄}), we have

𝔼[Z_j] = (𝐡/n_j)^{−β/2} 𝔼[Y_{𝐡/n_j} − Y_{𝐡/n_{j−1}}] = c_1 (1−M^α) (𝐡/n_j)^{α−β/2} + o((𝐡/n_j)^{α−β/2}),

so that

Var(Z_j) = ‖Z(𝐡/n_{j−1})‖_2² − c_1² (1−M^α)² (𝐡/n_j)^{2α−β} + o((𝐡/n_j)^{2α−β}).

We conclude that

lim_{j→+∞} Var(Z_j) = v(M,β) if 2α > β,  and  v(M,β) − c_1²(1−M^{β/2})² if 2α = β.
Theorem 3.3 (Central Limit Theorem, 0 < β ≤ 1)

Assume (SE_β) for some β ∈ (0,1]. Assume that (Z(h))_{h∈ℋ} is L²-uniformly integrable and, furthermore, that lim_{h→0} ‖Z(h)‖_2² = v(M,β). We set

(13)  σ² = v(M,β) (1+M^{β/2})^{−2} V_1^{−1} if 2α > β,  and  σ² = (v(M,β) − c_1²(1−M^{β/2})²) (1+M^{β/2})^{−2} V_1^{−1} if 2α = β.

Then the following statements hold.

  1. ML2R estimator: Assume (WE_{α,R̄}) for all R̄ ≥ 1. Then

    (14)  (I_π^{N(ε)} − I_0)/ε ⇒ 𝒩(0, σ²) as ε → 0.
  2. MLMC estimator: Assume (WE_{α,R̄}) for R̄ = 1 and that 2α > β when β < 1. Then there exists, for every ε > 0, m(ε) such that M^{−α}/√(1+2α) ≤ |m(ε)| ≤ 1/√(1+2α) and

    (I_π^{N(ε)} − I_0)/ε − m(ε) ⇒ 𝒩(0, (2α/(2α+1)) σ²) as ε → 0.

We will see in the proof that the asymptotic variance corresponds to the variance associated with the correcting levels.

3.3 Practitioner’s corner

In the proof of Theorems 3.2 and 3.3 we will obtain the more precise expansion

(I_π^{N(ε)} − I_0)/ε = m(ε) + Σ_2 ζ_2^ε + (1/(ε√N(ε))) Σ_1 ζ_1^ε as ε → 0,

where ζ_1^ε and ζ_2^ε are two independent variables such that (ζ_1^ε, ζ_2^ε) ⇒ 𝒩(0, I_2) as ε → 0, and the real values Σ_1 and Σ_2 depend on whether we are in the MLMC or the ML2R case and on the value of β. Fundamentally, Σ_1 comes from the variance of the coarse level and Σ_2 from the sum of the variances of the correcting levels.

When β > 1, we will prove in Lemma 4.5 that ε√N(ε) converges to a constant as ε → 0, hence both the coarse and the refined levels contribute to the asymptotic variance of the estimator.

When β ≤ 1, we will see that (ε√N(ε))^{−1} → 0 as ε → 0, so that, asymptotically, the variance of the coarse level fades and only the refined levels contribute to the asymptotic variance. Still, it is commonly known in the Multilevel framework that the coarse level is the one with the biggest size (in terms of N_j), hence this term is not really negligible. We can resolve this apparent contradiction by observing the inverse convergence rate to 0, namely ε√N(ε). It is equivalent, up to a constant, to √R(ε) when β = 1 and to M^{(1−β)R(ε)/4} when β < 1.

For ML2R, owing to the expression of R(ε) given in (7), ε√N(ε) ∼ C (log(1/ε))^{1/4}, where C is a positive constant, when β = 1, and ε√N(ε) = o(ε^{−η}) for every η > 0 when β < 1. Hence the convergence rate to 0 of (ε√N(ε))^{−1} is very slow. By contrast, Σ_1 ≫ Σ_2, since Σ_1 is related to the variance of the coarse level, which roughly approximates the value of interest, whereas Σ_2 is related to the variance of the refined levels, supposed to be smaller a priori. Hence the product (ε√N(ε))^{−1} Σ_1 turns out not to be negligible with respect to Σ_2 for the values of the RMSE ε usually prescribed in applications.

For MLMC, we get ε√N(ε) ∼ C √(log(1/ε)), C a positive constant, for β = 1 and ε√N(ε) ∼ C ε^{−(1−β)/(4α)} for β < 1. Hence, when β ≤ 1, the slow convergence phenomenon is still observed, though it is less significant.

Impact of the weights 𝐖_j^R, j = 1,…,R, on the asymptotic behavior of the ML2R estimator.

When β ≥ 1, one observes that neither the rate of convergence nor the asymptotic variance of the estimator depends in any way upon the weights 𝐖_j^R, j = 1,…,R. If β < 1, the dependence is somewhat hidden, through the multiplicative constant of ε^{−2} M^{(1−β)R(ε)/2} in the asymptotic of N(ε) (see Lemma 4.5 for more details). However, at finite range, the weights may have an impact on the variance of the estimator, bearing in mind, however, that, by construction, the depth of the ML2R estimator is lower than that of the MLMC estimator, which tempers this effect.

4 Auxiliary results

This section contains some useful results for the proofs of the Strong Law of Large Numbers and of the Central Limit Theorem. In more detail, we investigate the asymptotic behavior as ε → 0 of the optimal parameters given in Tables 1 and 2 and of the bias of the estimators, and we analyze the weights of the ML2R estimator.

4.1 Asymptotic of the bias parameter and of the depth

An important property of the MLMC and ML2R estimators is that h(ε) → 𝐡 and R(ε) → +∞ as ε → 0. The saturation of the bias parameter at 𝐡 is not intuitively obvious; indeed, it is well known that h(ε) → 0 as ε → 0 for the Crude Monte Carlo estimator. Still, this is a good property, because h = 𝐡 is the choice which minimizes the simulation cost of the variable Y_h, which we recall is inverse linear in h. First of all, we retrace the computations that led to the choice of the optimal h*(ε) and R*(ε), starting with the ML2R estimator. We define

h(ε,R) = (1+2αR)^{−1/(2αR)} |c_R|^{−1/(αR)} ε^{1/(αR)} M^{(R−1)/2}

and recall that this is the optimized bias parameter found in [12] at R fixed. Since the value of c_R is unknown, it is necessary to make the assumption |c_R|^{1/R} → c̃ as R → +∞, and |c_R|^{−1/(αR)} is replaced by c̃^{−1/α}. The value of c̃ is also unknown, and in the simulations we have to take an estimate of c̃, which we write ĉ. We follow the lines of [12] and define the polynomial

(15)  P(R) = (R(R−1)/2) log M − R log K − (1/α) log(√(1+4α)/ε),

where K = ĉ^{1/α} 𝐡. We denote by R_+(ε) the positive zero of P. The optimal value for the depth of the ML2R estimator is R*(ε) = ⌈R_+(ε)⌉. We notice that P(R*(ε)) ≥ 0, R*(ε) → +∞ as ε → 0, and R* is nondecreasing in ĉ. We can rewrite

h(ε,R) = ((1+4α)/(1+2αR))^{1/(2αR)} (ĉ/|c_R|^{1/R})^{1/α} e^{P(R)/R} 𝐡.

We notice that

h(ε,R_+) = ((1+4α)/(1+2αR_+))^{1/(2αR_+)} (ĉ/|c_{R_+}|^{1/R_+})^{1/α} 𝐡.

The optimal choice for the bias parameter is the projection of h(ε,R*(ε)) on the set ℋ = {𝐡/n : n ∈ ℕ*}, which reads

h*(ε) = 𝐡/⌈𝐡 h(ε,R*(ε))^{−1}⌉.

When we replace |c_R|^{1/R} by ĉ, we finally obtain

h*(ε) = 𝐡/⌈𝐡 (1+2αR*)^{1/(2αR*)} ĉ^{1/α} ε^{−1/(αR*)} M^{−(R*−1)/2}⌉ = 𝐡/⌈𝐡 (h(ε,R*) (|c_{R*}|^{1/R*} ĉ^{−1})^{1/α})^{−1}⌉.

Let us analyze the term inside the ceiling:

𝐡 (h(ε,R*) (|c_{R*}|^{1/R*} ĉ^{−1})^{1/α})^{−1} = ((1+4α)/(1+2αR*))^{−1/(2αR*)} e^{−P(R*)/R*}.

Since P(R*) ≥ 0 and since, for R large enough, ((1+4α)/(1+2αR))^{−1/(2αR)} ≤ 1, it follows that, up to reducing ε̄,

(16)  ((1+4α)/(1+2αR*))^{−1/(2αR*)} e^{−P(R*)/R*} ≤ 1 for all ε ∈ (0,ε̄),

which yields

⌈𝐡 (h(ε,R*) (|c_{R*}|^{1/R*} ĉ^{−1})^{1/α})^{−1}⌉ = 1  and  h*(ε) = 𝐡.

For MLMC we may follow the same reasoning, starting from h(ε,R) = (1+2α)^{−1/(2α)} |c_1|^{−1/α} ε^{1/α} M^{R−1}. We have just shown the following:

Proposition 4.1

There exists ε̄ > 0 such that h*(ε) = 𝐡 for all ε ∈ (0,ε̄].

In what follows, we will always assume that ε ∈ (0,ε̄], so that h*(ε) = 𝐡. This threshold ε̄ may be reduced line to line in what follows.

As ε → 0, we have R = R*(ε) → +∞ at the rate √((2/(α log M)) log(1/ε)) in the ML2R case and (1/(α log M)) log(1/ε) in the MLMC case.

4.2 Asymptotic of the bias and robustness

As part of a Central Limit Theorem, we will be faced with the quantity μ(𝐡,R(ε),M)/ε, where

μ(𝐡,R(ε),M) = 𝔼[I_π^{N(ε)}] − I_0

is the bias of the estimator. This leads us to analyze carefully its asymptotic behavior as ε → 0. Under assumption (WE_{α,R̄}), the bias of a Crude Monte Carlo estimator reads

μ(h) = c_1 h^α (1 + η_1(h)),  lim_{h→0} η_1(h) = 0.

The bias of Multilevel estimators is dramatically reduced compared to the Crude Monte Carlo one; more precisely, the following proposition is proved in [12]:

Proposition 4.2

The following statements hold.

  1. MLMC: Assume (WE_{α,R̄}) with R̄ = 1. Then

    μ(h,R,M) = c_1 (hM^{−(R−1)})^α (1 + η_1(hM^{−(R−1)}))

    with lim_{h→0} η_1(h) = 0.

  2. ML2R: Assume (WE_{α,R̄}) for all R̄ ≥ 1. Then

    μ(h,R,M) = (−1)^{R−1} c_R (h^R M^{−R(R−1)/2})^α (1 + η_{R,n̄}(h)),

    where η_{R,n̄}(h) = (−1)^{R−1} M^{αR(R−1)/2} Σ_{r=1}^R 𝐰_r n_r^{−αR} η_R(h/n_r) with lim_{h→0} η_R(h) = 0.

We notice that the ML2R estimator requires, and takes full advantage of, a higher-order expansion of the bias error (WE_{α,R̄}), whereas the MLMC estimator only needs a first-order expansion. As the computations were made under the constraint ‖I_π^N − I_0‖_2 ≤ ε, we clearly have |μ(𝐡,R(ε),M)|/ε ≤ 1. We focus our attention on the constants c̃ and c_1, which a priori we do not know and which we replace in the simulations by ĉ = ĉ_1 = 1. If we plug the values h(ε) = 𝐡 and R(ε) into the formulas for the bias, owing to (15) and (16) we get, for ML2R,

|μ(𝐡,R(ε),M)| = |c_{R(ε)}| (𝐡^{R(ε)} M^{−R(ε)(R(ε)−1)/2})^α = |c_{R(ε)}| 𝐡^{αR(ε)} (e^{αP(R(ε))} K^{αR(ε)})^{−1} ε/√(1+4α)
= (|c_{R(ε)}|/ĉ^{R(ε)}) e^{−αP(R(ε))} ε/√(1+4α) ≤ (|c_{R(ε)}|/ĉ^{R(ε)}) ε/√(1+2αR(ε))

and, for MLMC,

|c_1/ĉ_1| (M^{−α}/√(1+2α)) ε < |μ(𝐡,R(ε),M)| ≤ |c_1/ĉ_1| (1/√(1+2α)) ε.

We set m(ε) := |μ(𝐡,R(ε),M)|/ε. Hence, taking the true values ĉ = c̃ and ĉ_1 = c_1, we get

(17)  lim_{ε→0} m(ε) = 0 for ML2R,  and  M^{−α}/√(1+2α) < lim_{ε→0} m(ε) ≤ 1/√(1+2α) for MLMC.

For the ML2R estimator, if c_R has polynomial growth in R, we have

lim_{R→+∞} |c_R|^{1/R} = 1

and ĉ = 1 corresponds to the exact value of c̃. If the growth of c_R is slower than polynomial, the convergence to 0 in (17) still holds. The only uncertain case is when c_R grows faster than polynomially. Then, if ĉ ≥ |c_R|^{1/R}, |μ(ε)|/ε goes to 0 faster than 1/√(1+2αR(ε)); but if we had taken ĉ < 1, we would have obtained

lim_{R→+∞} |c_R|/ĉ^R = +∞,

hence ĉ < 1 is definitely not a good choice. In conclusion, whenever the growth of c_R is at most polynomial, ĉ = 1 remains a good choice. When the growth is faster than polynomial, it is better to overestimate ĉ than to underestimate it. The remarkable fact is that, when choosing ĉ, we are not forced to have a very precise idea of the expression of c_R, but only of its growth rate. The choice of ĉ_1 for the MLMC estimator is less robust: if we overestimate c_1, the inequality |μ(ε)|/ε ≤ 1/√(1+2α) still holds, but if we underestimate it, we may eventually fail to have |μ(ε)|/ε ≤ 1 as expected. Hence the bias of the MLMC estimator is closely connected to an accurate enough estimation of c_1.

In Figures 1a and 1b we show the values of |c_1| estimated with the formula

c_1 ≈ (h/2 − h)^{−1} (𝔼[Y_{h/2}] − 𝔼[Y_h]),

compared to the value ĉ_1 = 1 plugged into the simulations, for a Call option in a Black–Scholes model with X_0 = 100, K = 80, T = 1, σ = 0.4, and making the interest rate vary as r = 0.01, 0.1, 0.2, …, 0.9, 1. We simulated 𝔼[Y_h], with h = T/20, using an Euler and a Milstein discretization scheme and making a Crude Monte Carlo simulation of size N = 10^8.
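The estimation of c_1 above can be sketched on a toy family with a known weak-error coefficient. The model sample_Yh below is hypothetical (in the paper Y_h comes from an Euler or Milstein scheme); it just lets the two-parameter difference quotient be checked against a known c_1, here for α = 1:

```python
import random

def estimate_c1(sample_Yh, h, N):
    """Estimate c_1 (case alpha = 1) from two bias parameters h and h/2:
    c_1 ~ (E[Y_{h/2}] - E[Y_h]) / (h/2 - h), each mean by crude Monte Carlo."""
    mean = lambda hh: sum(sample_Yh(hh) for _ in range(N)) / N
    return (mean(h / 2) - mean(h)) / (h / 2 - h)

# Hypothetical model with known weak-error coefficient c_1 = 2.
def sample_Yh(h):
    return random.gauss(0.0, 0.1) + 2.0 * h

random.seed(1)
c1_hat = estimate_c1(sample_Yh, h=0.05, N=200_000)
```

As the text explains, underestimating c_1 is what endangers the bias constraint of MLMC, so in practice such an estimate should be checked from above rather than below.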

In Figures 2a and 2b we show the absolute value of the empirical bias for different values of r. In the simulations, we fixed ĉ_1 = 1 and ĉ = 1. We can observe that when |c_1| is underestimated, the biases of the MLMC and Crude Monte Carlo estimators do not satisfy the constraint |μ(ε)| ≤ ε, whereas the ML2R estimator appears to be less sensitive to the estimation of c̃.

Figure 1: Estimated |c_1| = |𝔼[Y_h] − 𝔼[Y_{h/2}]| (h − h/2)^{−1} when r varies. (a) Euler scheme (α = 1, β = 1). (b) Milstein scheme (α = 1, β = 2).

Figure 2: Empirical bias |μ(ε)| for a Call option in a Black–Scholes model for a prescribed RMSE ε = 2^{−5} and for different values of r, taking ĉ = ĉ_1 = 1. (a) Euler scheme (α = 1, β = 1). (b) Milstein scheme (α = 1, β = 2).

4.3 Properties of the weights of the ML2R estimator

One significant difficulty in the proofs of the Central Limit Theorems stated in Theorems 3.2 and 3.3 is to deal with the weights 𝐖_j^R appearing in the ML2R estimator. Moreover, the analysis of the behavior of the weights is necessary when studying the asymptotics of the parameters q = (q_1,…,q_R) and N. These weights are devised to kill the coefficients c_1,…,c_R in the bias expansion under (WE_{α,R̄}). They are defined as

(18)  𝐖_j^R = Σ_{r=j}^R 𝐰_r,  j = 1,…,R,

where the weights 𝐰 = (𝐰_r)_{r=1,…,R} are the solution to the Vandermonde system V𝐰 = e_1, the matrix V being defined by

V = V(1, n_2^{−α},…,n_R^{−α}), the R×R Vandermonde matrix with entries V_{i,j} = n_j^{−α(i−1)}, i,j = 1,…,R (with n_1 = 1).

Notice that 𝐖_1^R = 1 by construction. In order to give a more tractable expression of the weights 𝐖_j^R, one notices that the weights 𝐰 admit a closed form given by Cramer's rule, namely

𝐰_ℓ = a_ℓ b_{R−ℓ},  ℓ = 1,…,R,

where

a_ℓ = 1/∏_{1≤k≤ℓ−1} (1 − M^{−kα}),  ℓ = 1,…,R,

with the convention ∏_{k=1}^0 (1 − M^{−kα}) = 1, and

b_ℓ = (−1)^ℓ M^{−αℓ(ℓ+1)/2} / ∏_{1≤k≤ℓ} (1 − M^{−kα}),  ℓ = 0,…,R.

As a consequence,

𝐖_j^R = Σ_{ℓ=j}^R a_ℓ b_{R−ℓ} = Σ_{ℓ=0}^{R−j} a_{R−ℓ} b_ℓ,  j = 1,…,R.
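The closed form above translates directly into code; this is a sketch with M, α and the refiners n_j = M^{j−1} as in Section 1. A convenient sanity check is that 𝐖_1^R = Σ_ℓ 𝐰_ℓ = 1, which is forced by the first row of the Vandermonde system:

```python
def weights_ml2r(R, M, alpha):
    """ML2R weights via the closed form: w_l = a_l * b_{R-l},
    W_j^R = sum_{l=j}^R w_l."""
    def a(l):
        # a_l = 1 / prod_{k=1}^{l-1} (1 - M^{-k*alpha})
        p = 1.0
        for k in range(1, l):
            p *= 1.0 - M ** (-k * alpha)
        return 1.0 / p

    def b(l):
        # b_l = (-1)^l * M^{-alpha*l*(l+1)/2} / prod_{k=1}^{l} (1 - M^{-k*alpha})
        p = 1.0
        for k in range(1, l + 1):
            p *= 1.0 - M ** (-k * alpha)
        return (-1) ** l * M ** (-alpha * l * (l + 1) / 2) / p

    w = [a(l) * b(R - l) for l in range(1, R + 1)]
    # W_j^R = sum of the tail w_j + ... + w_R
    return [sum(w[j - 1:]) for j in range(1, R + 1)]
```

For instance, with R = 2, M = 2, α = 1, solving V𝐰 = e_1 by hand gives 𝐰 = (−1, 2), hence 𝐖_1² = 1 and 𝐖_2² = 2, which the closed form reproduces; the uniform bound |𝐖_j^R| ≤ a_∞B̃ of Lemma 4.3 below can also be observed numerically.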

We will make extensive use of the following properties, which are proved in Appendix A.

Lemma 4.3

Let α > 0 and consider the associated weights (𝐖_j^R)_{j=1,…,R} given in (18).

  1. lim_{ℓ→+∞} a_ℓ = a_∞ < +∞ and Σ_{ℓ=0}^{+∞} |b_ℓ| = B̃ < +∞.

  2. The weights 𝐖_j^R are uniformly bounded:

    (19)  |𝐖_j^R| ≤ a_∞ B̃ for all R ∈ ℕ* and all j ∈ {1,…,R}.
  3. For every γ > 0,

    lim_{R→+∞} Σ_{j=2}^R |𝐖_j^R| M^{−γ(j−1)} = 1/(M^γ − 1).
  4. Let (v_j)_{j≥1} be a bounded sequence of positive real numbers. Let γ ∈ ℝ and assume that lim_{j→+∞} v_j = 1 when γ ≥ 0. Then the following limits hold as R → +∞:

    Σ_{j=2}^R |𝐖_j^R| M^{γ(j−1)} v_j → Σ_{j≥2} M^{γ(j−1)} v_j < +∞ for γ < 0,  ∼ R for γ = 0,  ∼ M^{γR} a_∞ Σ_{j≥1} |Σ_{ℓ=0}^{j−1} b_ℓ| M^{−γj} for γ > 0.

4.4 Asymptotic of the allocation policy and of the size

Let us analyze the allocation policy q = (q_1,…,q_R) in the ML2R case. Since

(20)  q_1(ε) = μ*(ε)(1 + θ𝐡^{β/2})  and  q_j(ε) = θ𝐡^{β/2} C̄_{M,β} μ*(ε) |𝐖_j^{R(ε)}| M^{−(β+1)(j−1)/2},  j = 2,…,R(ε),

the condition Σ_{j=1}^{R(ε)} q_j = 1 yields

μ*(ε) = (1 + θ𝐡^{β/2}(1 + C̄_{M,β} Σ_{j=2}^{R(ε)} |𝐖_j^{R(ε)}| M^{−(β+1)(j−1)/2}))^{−1}.

Owing to Lemma 4.3 (c) with γ = (β+1)/2, the limit of this term as ε → 0 is

𝝁* = (1 + θ𝐡^{β/2}(1 + C̄_{M,β}/(M^{(1+β)/2} − 1)))^{−1}.

Moreover, for all ε ∈ (0,ε̄], the following inequalities hold:

(21)  1/μ̲* := 1 + θ𝐡^{β/2} ≤ 1/μ*(ε) ≤ 1 + θ𝐡^{β/2}(1 + C̄_{M,β} a_∞ B̃/(1 − M^{−(β+1)/2})) =: 1/μ̄*.
Remark 4.4

If we set 𝐖_j^{R(ε)} = 1 for all j = 1,…,R(ε), and replace a_∞B̃ by 1, we obtain the same results for the MLMC allocation policy.
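Following Remark 4.4 (weights identically 1), the MLMC allocation policy of Table 2 can be sketched as below. To avoid committing to a particular reading of the constant C̄_{M,β}, it is passed in precomputed (Cbar), and the normalization plays the role of μ*:

```python
def allocation_mlmc(R, M, beta, theta, h, Cbar):
    """MLMC allocation q = (q_1,...,q_R): q_1 = mu*(1 + theta*h^{beta/2}),
    q_j = mu* * theta * h^{beta/2} * Cbar * M^{-(beta+1)(j-1)/2} for j >= 2,
    with mu* chosen so that sum_j q_j = 1."""
    g = [1.0 + theta * h ** (beta / 2)]
    g += [theta * h ** (beta / 2) * Cbar * M ** (-(beta + 1) * (j - 1) / 2)
          for j in range(2, R + 1)]
    mu_star = 1.0 / sum(g)  # normalization: mu* enforces sum q_j = 1
    return [mu_star * x for x in g]

q = allocation_mlmc(5, 2, 1.0, 1.0, 1.0, 2.0)
```

The geometric decay M^{−(β+1)(j−1)/2} concentrates the bulk of the simulation budget on the coarse level, which is the phenomenon discussed in the practitioner's corner above.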

The asymptotic behavior of the estimator size N = N(ε) is given in the following lemma.

Lemma 4.5

We have N = N(ε) → +∞ as ε → 0, with a convergence rate depending on β as follows:

  1. Case β > 1: We have N(ε) ∼ C_β ε^{−2} with

    C_β = (Var(Y_0)/𝝁*) [1 + θ𝐡^{β/2}(1 + C̄_{M,β} M^{(1−β)/2}/(1 − M^{(1−β)/2}))] × {1 for ML2R, 1 + 1/(2α) for MLMC}.
  2. Case β ≤ 1: We recall the expression of R(ε) given in (7) for ML2R and (8) for MLMC. Then

    N(ε) ∼ C_β ε^{−2} × {R(ε) if β = 1, M^{(1−β)R(ε)/2} if β < 1},

    where the constant C_β reads

    C_β = (Var(Y_0)/𝝁*) θ𝐡^{1/2} C̄_{M,β} × {1 for ML2R, 1 + 1/(2α) for MLMC}, if β = 1,

    and

    C_β = (Var(Y_0)/𝝁*) θ𝐡^{β/2} C̄_{M,β} × {a_∞ Σ_{j≥1} |Σ_{ℓ=0}^{j−1} b_ℓ| M^{(β−1)j/2} for ML2R, (1 + 1/(2α))/(M^{(1−β)/2} − 1) for MLMC}, if β < 1.

We notice that for β ≥ 1 the asymptotic behavior of N(ε) for ML2R does not depend on the weights 𝐖_j^R, and the difference between the coefficient C_β for the ML2R and the MLMC estimator lies only in the factor (1 + 1/(2α)), whereas when β < 1, the asymptotics of the weights do have an impact on the behavior of N(ε) for ML2R. Still, in this case we observe that if a_∞ = 1 and |Σ_{ℓ=0}^{j−1} b_ℓ| = 1 for all j ≥ 1, then

a_∞ Σ_{j≥1} |Σ_{ℓ=0}^{j−1} b_ℓ| M^{(β−1)j/2} = 1/(M^{(1−β)/2} − 1)

and the factor (1 + 1/(2α)) again appears as the only difference between the coefficients C_β of N(ε) for the two estimators.

Proof.

ML2R: The estimator size N reads

N = N(ε) = ⌈(1 + 1/(2αR(ε))) (Var(Y_0)/μ*) (1/ε²) (1 + θ𝐡^{β/2} + θ𝐡^{β/2} C̄_{M,β} Σ_{j=2}^{R(ε)} |𝐖_j^{R(ε)}| M^{(1−β)(j−1)/2})⌉.

We notice that R(ε) → +∞ as ε → 0 and use Lemma 4.3 (d) with γ = (1−β)/2 and v_j = 1 for each j ≥ 1 to complete the proof in the ML2R framework.

MLMC: The result follows directly from the behavior of the series Σ_{j=2}^{R(ε)} M^{(1−β)(j−1)/2}, since N reads

N = N(ε) = ⌈(1 + 1/(2α)) (Var(Y_0)/μ*) (1/ε²) (1 + θ𝐡^{β/2}(1 + C̄_{M,β} Σ_{j=2}^{R(ε)} M^{(1−β)(j−1)/2}))⌉,

as desired. ∎

5 Proofs

We will use the notations

Ĩ_ε^1 := (1/N_1(ε)) Σ_{k=1}^{N_1(ε)} (Y_𝐡^{(1),k} − 𝔼[Y_𝐡^{(1),k}])  and  Ĩ_ε^2 := Σ_{j=2}^{R(ε)} (𝐖_j^{R(ε)}/N_j(ε)) Σ_{k=1}^{N_j(ε)} Ỹ_j^k,

where we set

Ỹ_j := Y_{𝐡/n_j}^{(j)} − Y_{𝐡/n_{j−1}}^{(j)} − 𝔼[Y_{𝐡/n_j}^{(j)} − Y_{𝐡/n_{j−1}}^{(j)}],  j = 1,…,R(ε).

These notations hold for both ML2R and MLMC estimators, where we set 𝐖_j^{R(ε)} = 1, j = 1,…,R(ε), for MLMC estimators. We notice that

(22)  I_π^{N(ε)} − I_0 = Ĩ_ε^1 + Ĩ_ε^2 + μ(𝐡,R(ε),M),

where the bias μ(𝐡,R(ε),M) → 0 as ε → 0 (see Section 4.2 for a detailed description of the bias).

5.1 Proof of Strong Law of Large Numbers

The proof of the Strong Law of Large Numbers is a consequence of the following proposition.

Proposition 5.1

Let p ≥ 2. There exists a positive real constant K(M,β,p) such that

(23)  𝔼[|Ĩ_ε^2|^p] ≤ K(M,β,p) ε^p.

Proof.

ML2R: We first give the proof of (23) for the ML2R estimator. As a first step we show that, for all p2,

(24)𝔼[|Y~j|p]CM,β,pM-βp2(j-1),j=1,,R(ε),with CM,β,p=2pV1(p)(1+Mβ2)p𝐡βp2.

By Minkowski’s inequality,

(𝔼[|Y~j|p])1pY𝐡nj-Y𝐡nj-1p+|𝔼[Y𝐡nj-Y𝐡nj-1]|Y𝐡nj-Y𝐡nj-1p+Y𝐡nj-Y𝐡nj-112Y𝐡nj-Y𝐡nj-1p.

Applying again Minkowski’s inequality, the Lp-strong approximation assumption (9) yields (24). As the random variables (Y~jk)k1 are i.i.d. and the (Y~j)j=1,,R(ε) are centered and independent, Burkholder’s inequality (see [10, Theorem 2.10, p. 23]) and (24) imply that there exists a positive universal real constant Cp such that

𝔼[|I~ε2|p]=𝔼[|j=2R(ε)k=1Nj(ε)𝐖jR(ε)Nj(ε)Y~jk|p]Cp𝔼[|j=2R(ε)k=1Nj(ε)(𝐖jR(ε)Nj(ε)Y~jk)2|p2]
Cp(j=2R(ε)k=1Nj(ε)(𝐖jR(ε)Nj(ε)Y~jk)2p2)p2=Cp(j=2R(ε)|𝐖jR(ε)|2Nj(ε)(𝔼[|Y~j|p])2p)p2
≤CpCM,β,p(j=2R(ε)|𝐖jR(ε)|2Nj(ε)M-β(j-1))p2.

As Nj(ε)=N(ε)qj(ε)N(ε)qj(ε), we derive that

1Nj(ε)1N(ε)qj(ε),j=1,,R(ε).

It follows from the expression of qj given in (20) and from inequality (21) that

|𝐖jR(ε)|qj(ε)1θ𝐡β2C¯M,βμ¯*Mβ+12(j-1),j=2,,R(ε).

Then, using that supj{1,,R},R1|𝐖jR|aB~, we get

𝔼[|I~ε2|p]≤CpCM,β,p(a∞B̃)p2(θ𝐡β2C¯M,βμ¯*)-p2(1N(ε)j=2R(ε)M1-β2(j-1))p2.

Owing to Lemma 4.5, up to reducing ε¯, we have

(25)1N(ε)2Cβε2{1if β>1,R(ε)-1if β=1,M-1-β2R(ε)if β<1,for all ε(0,ε¯].

Moreover,

j=2R(ε)M1-β2(j-1){11-M1-β2if β>1,R(ε)if β=1,M1-β2R(ε)M1-β2-1if β<1.

Then

(1N(ε)j=2R(ε)M1-β2(j-1))p2K1εp

with

K1=K1(M,β,p)=(2Cβ)p2{(1-M1-β2)-p2if β>1,1if β=1,(M1-β2-1)-p2if β<1.

Hence (23) holds with

K(M,β,p)=CpCM,β,p(a∞B̃)p2(θ𝐡β2C¯M,βμ¯*)-p2K1.

MLMC: The proof for the MLMC estimator follows the same steps, setting 𝐖jR(ε)=1, j=1,…,R(ε), and replacing a∞B̃ by 1. ∎
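The geometric-sum estimates used in the proof above are elementary and easy to check numerically. The following toy sketch (an illustration only, not part of the proof) evaluates the partial sums ∑j=2R M^((1−β)/2·(j−1)) in the three regimes β>1, β=1 and β<1, with an illustrative choice of root M, and compares them with the bounds quoted above.

```python
# Toy check of the three regimes of S_R = sum_{j=2}^R M^{(1-beta)/2 * (j-1)}.
M = 2.0  # illustrative root

def partial_sum(beta, R):
    gamma = (1.0 - beta) / 2.0
    return sum(M ** (gamma * (j - 1)) for j in range(2, R + 1))

R = 50
# beta > 1: the series converges; 1/(1 - M^{(1-beta)/2}) is an upper bound.
assert partial_sum(3.0, R) <= 1.0 / (1.0 - M ** ((1.0 - 3.0) / 2.0))
# beta = 1: every term equals 1, so S_R = R - 1 exactly.
assert partial_sum(1.0, R) == R - 1
# beta < 1: S_R grows geometrically, bounded by M^{(1-beta)R/2}/(M^{(1-beta)/2}-1).
g = (1.0 - 0.5) / 2.0
assert partial_sum(0.5, R) <= M ** (g * R) / (M ** g - 1.0)
```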

The Strong Law of Large Numbers follows as a consequence of Proposition 5.1.

Proof of Theorem 3.1.

Owing to the decomposition (22), equation (10) amounts to proving

I~εk1a.s.0as k+  and  I~εk2a.s.0as k+.

As limε0N1(ε)=+ and the (Y𝐡(1),k)k1 are i.i.d. and do not depend on ε, the convergence of I~εk1 is a direct application of the classical Strong Law of Large Numbers, for both ML2R and MLMC estimators.

To establish the a.s. convergence of I~εk2, owing to Proposition 5.1 it is straightforward that for every sequence of positive numbers (εk)k1 such that εk0 as k+ and k1εkp<+,

k1𝔼[|I~εk2|p]<+.

Hence, by Beppo–Levi’s Theorem, k1|I~εk2|p<+ a.s., which in turn implies I~εk2a.s.0 as k+. ∎

5.2 Proof of the Central Limit Theorem

This subsection is devoted to the proof of Theorems 3.2 and 3.3. In order to verify a Lindeberg condition, we will need the assumption that the family (Z(h))h is L2-uniformly integrable. Owing to (WEα,R̄) with R¯=1,

|𝔼[Z(h)]|=|c1(1-Mα)|hα-β2+o(hα-β2).

Since 2αβ, this deterministic sequence (𝔼[Z(h)])h is bounded. Hence, the L2-uniform integrability of (Z(h))h yields the L2-uniform integrability of the centered sequence (Z~(h))h=(Z(h)-𝔼[Z(h)])h.

One criterion to verify the L2-uniform integrability is the following.

Lemma 5.2

The following statements hold.

  (a) If there exists p>2 such that suph‖Z(h)‖p<+∞, then the family (Z(h))h is L2-uniformly integrable.

  (b) If there exists a random variable D(M)∈L2 such that Z(h) converges in distribution to D(M) as h→0, then the following conditions are equivalent (see [3, Theorem 3.6]):

    (i) The family (Z(h))h is L2-uniformly integrable.

    (ii) limh→0‖Z(h)‖2=‖D(M)‖2.

Now we are in a position to prove the Central Limit Theorem, in both cases β>1 and β(0,1].

Proof of Theorems 3.2 and 3.3.

Owing to the decomposition (22) (with 𝐖jR(ε)=1, j=1,,R(ε) for MLMC estimator),

IπN(ε)-I0ε=I~ε1ε+I~ε2ε+μ(𝐡,R(ε),M)ε,

where I~ε1 and I~ε2 are independent. The bias term has already been treated in (17).

ML2R: Formulas (12) and (14) amount to proving, as ε0,

(26)N(ε)I~ε1𝒩(0,Var(Y𝐡)𝐪1)

and

(27)I~ε2ε𝒩(0,σ22)

with σ2=σ for β(0,1]. Indeed, for (26) let us write I~ε1ε=1εN(ε)N(ε)I~ε1. Using Lemma 4.5, N(ε) reads

N(ε)Cβε-2{1if β>1,R(ε)if β=1,M1-β2R(ε)if β<1,as ε0.

In particular, since R(ε)→+∞ as ε→0, when β≤1 we have 1/√N(ε)=o(ε), so that the term I~ε1/ε→0 in probability. Since Y𝐡(1),k does not depend on ε, N1(ε)→+∞ and N1(ε)/N(ε)→𝐪1 as ε→0, the asymptotic behavior of the first term is driven by a regular Central Limit Theorem at rate √N(ε), i.e.

N(ε)I~ε1=N(ε)[1N1(ε)k=1N1(ε)(Y𝐡(1),k-𝔼[Y𝐡(1),k])]ε0𝒩(0,Var(Y𝐡)𝐪1),

which proves (26). We will use Lindeberg’s Theorem for triangular arrays of martingale increments (see [10, Corollary 3.1, p. 58]) to establish (27). The random variables Y~jk being centered and independent, the variance reads

Var(j=2R(ε)k=1Nj(ε)1ε𝐖jR(ε)Nj(ε)Y~jk)=1ε2j=2R(ε)(𝐖jR(ε)Nj(ε))2Nj(ε)Var(Y~j)=1ε2j=2R(ε)(𝐖jR(ε))2Nj(ε)Var(Y~j).

Noticing that 0≤1/x−1/⌈x⌉≤1/x² for every x>0, and that Nj(ε)=⌈qj(ε)N(ε)⌉, we derive

|1ε2j=2R(ε)(𝐖jR(ε))2Nj(ε)Var(Y~j)-1ε2j=2R(ε)(𝐖jR(ε))2qj(ε)N(ε)Var(Y~j)|1ε2j=2R(ε)(𝐖jR(ε))2(qj(ε)N(ε))2Var(Y~j).

The conclusion will follow from

(28)limε01ε2j=2R(ε)(𝐖jR(ε))2N(ε)qj(ε)Var(Y~j)=σ22

and

(29)limε01ε2j=2R(ε)(𝐖jR(ε))2(qj(ε)N(ε))2Var(Y~j)=0.

Owing to the definition of Zj given in (11), we get Var(Y~j)=(𝐡nj)βVar(Zj) and, using the expression of qj(ε) given in (20), we obtain

1ε2j=2R(ε)(𝐖jR(ε))2N(ε)qj(ε)Var(Y~j)=1ε2N(ε)𝐡β2θC¯M,βμ*(ε)j=2R(ε)|𝐖jR(ε)|M1-β2(j-1)Var(Zj).

Case β>1: Owing to the expression of N(ε) given in Lemma 4.5 when β>1,

limε01ε2N(ε)𝐡β2θC¯M,βμ*(ε)=1Σ𝐡β2Var(Y0)V1C¯M,β

and owing to the limit in Lemma 4.3 (d) with γ=1-β2<0,

j=2R(ε)|𝐖jR(ε)|M1-β2(j-1)Var(Zj)=j=2+M1-β2(j-1)Var(Zj)<+.

Hence the convergence of the variance (28) holds for Theorem 3.2.

Case β1: Owing to the expression of N(ε) given in Lemma 4.5 when β1, we get, as ε0,

1ε2N(ε)𝐡β2θC¯M,βμ*(ε)1V1(1+Mβ2)2{(R(ε))-1if β=1,(M1-β2R(ε)aj1|=0j-1b|Mβ-12j)-1if β<1.

We notice that

limj+Var(Zj)=v~(M,β)

if 2α>β and

limj+Var(Zj)=v~(M,β)-c12(1-Mβ2)2

if 2α=β. Hence, owing to the limit in Lemma 4.3 (d) with γ=1-β20, we obtain (28) with σ2=σ given in (13) in Theorem 3.3.

For (29), it follows from the expression of qj(ε) in (20) that

|𝐖jR(ε)|qj(ε)=Mβ+12(j-1)θ𝐡β2C¯M,βμ*(ε).

Owing to the definition of Zj in (11) and to inequality (21), we get

1ε2j=2R(ε)(𝐖jR(ε))2(qj(ε)N(ε))2Var(Y~j)𝐡β(θ𝐡β2C¯M,βμ¯*)21(εN(ε))2j=2R(ε)M(j-1)Var(Zj)
𝐡β(θ𝐡β2C¯M,βμ¯*)21(εN(ε))2(supj1Var(Zj))MR(ε)-1M-1.

We conclude by showing that

(30)MR(ε)(εN(ε))20as ε0.

Owing to the expression of R(ε) given in (7), we notice that

R(ε)=O(√(log(1/ε)))=o(log(1/ε))as ε→0.

Moreover, using Lemma 4.5, up to another reduction of ε¯, we have 1(εN(ε))2(2Cβ)2ε2 for all β>0. This in turn yields

MR(ε)(εN(ε))2(2Cβ)2ε2MR(ε)=Cβ-2elog(M)R(ε)-2log(1ε)0as ε0.

Then (29) is proved and so is the first condition of Lindeberg’s Theorem.

For the second condition of Lindeberg’s Theorem we need to prove that, for every η>0,

j=2R(ε)k=1Nj(ε)𝔼[(1ε𝐖jR(ε)Nj(ε)Y~jk)2𝟏{|1ε𝐖jR(ε)Nj(ε)Y~jk|>η}]0as ε0.

Since the (Y~jk)k=1,,Nj(ε) are identically distributed, we can write

j=2R(ε)k=1Nj(ε)𝔼[(1ε𝐖jR(ε)Nj(ε)Y~jk)2𝟏{|1ε𝐖jR(ε)Nj(ε)Y~jk|>η}]j=2R(ε)1ε2|𝐖jR(ε)|2Nj(ε)𝔼[(Y~j)2𝟏{|Y~j|>ηεNj(ε)|𝐖jR(ε)|}].

We set Z~j=Zj-𝔼[Zj]. Replacing qj by its values given in (20), using inequality (19) from Lemma 4.3 (b) and the elementary inequality 1Nj(ε)1qj(ε)N(ε) yields

j=2R(ε)k=1Nj(ε)𝔼[(1ε𝐖jR(ε)Nj(ε)Y~jk)2𝟏{|1ε𝐖jR(ε)Nj(ε)Y~jk|>η}]
1θ𝐡β2C¯M,βμ*(ε)1ε2N(ε)j=2R(ε)|𝐖jR(ε)|Mβ+12(j-1)𝔼[(Y~j)2𝟏{|Y~j|>ηθ𝐡β2C¯M,βμ*(ε)Mβ+12(j-1)εN(ε)}]
𝐡β2θC¯M,βμ¯*aB~1ε2N(ε)j=2R(ε)M1-β2(j-1)𝔼[(Z~j)2𝟏{|Z~j|>ηθC¯M,βμ¯*εN(ε)M-j-12}]
𝐡β2θC¯M,βμ¯*aB~sup2jR(ε)𝔼[(Zj)2𝟏{|Z~j|>ΘεN(ε)M-R(ε)2}]1ε2N(ε)j=2R(ε)M1-β2(j-1)

where we set Θ=ηθC¯M,βμ¯*M. Now, it follows from Lemma 4.5 that

1ε2N(ε)j=2R(ε)M1-β2(j-1)=1ε2N(ε)(M1-β2R(ε)-M1-β2M1-β2-1𝟏{β1}+(R(ε)+1)𝟏{β=1})K,

for all sufficiently small ε, where K is a positive real constant. Owing to (30), limε→0 ε√N(ε)M−R(ε)/2=+∞. Hence, since we assumed that the family (Zj)j1 is L2-uniformly integrable, we obtain that

limε0sup2jR(ε)𝔼[(Zj)2𝟏{|Zj|>ΘεN(ε)M-R(ε)-12}]=0

and the second condition of Lindeberg’s Theorem is proved.

MLMC: The proofs are quite the same as for ML2R, up to the constant 1+12α, coming from the constant Cβ in the asymptotic of N(ε). Using Lemma 4.5 and the expression of R(ε) given in (8), we obtain

N(ε)Cβε-2{1if β>1,1αlog(M)log(1ε)if β=1,ε-1-β2αif β<1,as ε0.

We set 𝐖jR=1, j=1,…,R, and replace a∞B̃ by 1. The only significant difference arises when β<1, in proving (30). In this case, owing to Lemma 4.5 as in (25), and using the expression of R(ε) given in (8), up to reducing ε¯ we can write

MR(ε)(εN(ε))2(2Cβ)2ε2M-(1-β)R(ε)MR(ε)(2Cβ)2Mβ(CR(1)+1)ε2-βα,

which goes to 0, owing to the strict inequality assumption 2α>β. ∎

6 Applications

6.1 Diffusions

In this subsection we recover a recent result by Ben Alaya and Kebaier (see [2]) obtained for MLMC estimators and we extend it to ML2R estimators and to path-dependent functionals. Let (Xt)t∈[0,T] be a Brownian diffusion process, solution to the stochastic differential equation

Xt=X0+0tb(s,Xs)𝑑s+0tσ(s,Xs)𝑑Ws,t[0,T],

where b:[0,T]×ℝd→ℝd and σ:[0,T]×ℝd→𝕄(d,q,ℝ) are continuous functions, Lipschitz continuous in x uniformly in t∈[0,T], and (Wt)t∈[0,T] is a q-dimensional Brownian motion independent of X0, both being defined on a probability space (Ω,𝒜,ℙ).

We know that X=(Xt)t∈[0,T] is the unique (ℱtW)t∈[0,T]-adapted solution to this equation, where ℱW denotes the augmented filtration of W. The process (Xt)t∈[0,T] cannot be simulated at a reasonable computational cost (at least in full generality), which leads us to introduce simulatable time discretization schemes, the simplest being undoubtedly the Euler scheme with step h=T/n, n≥1, defined by

(31)X¯tn=X0+0tb(s¯,X¯s¯n)𝑑s+0tσ(s¯,X¯s¯n)𝑑Ws

with s¯=nsn, s[0,T]. In particular, if we set tkn=kTn,

X¯tk+1nn=X¯tknn+b(tkn,X¯tknn)h+σ(tkn,X¯tknn)hUk+1n,k{0,,n-1},

where the Uk+1n=(Wtk+1n−Wtkn)/√h are i.i.d. with distribution 𝒩(0,Iq). Furthermore, we also derive from (31) that

X¯tn=X¯t¯n+b(t¯,X¯t¯n)(t-t¯)+σ(t¯,X¯t¯n)(Wt-Wt¯),t[0,T].

It is a classical result that, under the above assumptions on b, σ, X0 and W, the Euler scheme satisfies the following a priori Lp-error bounds:

(32)p2,cb,σ,p,T>0,supt[0,T]|Xt-X¯tn|pcb,σ,p,TTn(1+X0p).
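To make scheme (31) concrete, here is a minimal Euler–Maruyama sketch for a scalar SDE. The drift and diffusion passed below are illustrative placeholders, not tied to the paper's assumptions; the final sanity check uses the degenerate case σ≡0, where the scheme reduces to the explicit Euler scheme for the ODE x′=x.

```python
import math
import random

def euler_scheme(x0, b, sigma, T, n, rng):
    """Euler scheme with step h = T/n for dX_t = b(t, X_t) dt + sigma(t, X_t) dW_t."""
    h = T / n
    x, t = x0, 0.0
    path = [x]
    for _ in range(n):
        dw = rng.gauss(0.0, math.sqrt(h))   # Brownian increment W_{t+h} - W_t
        x = x + b(t, x) * h + sigma(t, x) * dw
        t += h
        path.append(x)
    return path

# Sanity check: with sigma = 0 and b(t, x) = x the scheme is the explicit Euler
# scheme for x' = x, so the terminal value (1 + h)^n * x0 approaches e^T * x0.
rng = random.Random(0)
path = euler_scheme(1.0, lambda t, x: x, lambda t, x: 0.0, 1.0, 1000, rng)
assert abs(path[-1] - math.exp(1.0)) < 5e-3
```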

For the weak error expansion the existing results are less general. Let us recall as an illustration the celebrated Talay–Tubaro’s and Bally–Talay’s weak error expansions for marginal functionals of Brownian diffusions, i.e. functionals of the form F(X)=f(XT).

Theorem 6.1

The following statements hold.

  1. Regular setting (Talay–Tubaro [13]): If b and σ are infinitely differentiable with bounded partial derivatives and if f:ℝd→ℝ is an infinitely differentiable function, with all its partial derivatives having polynomial growth, then for a fixed maturity T>0 and for every integer R∈ℕ*,

    (33)𝔼[f(X¯Tn)]-𝔼[f(XT)]=k=1Rck(1n)k+O((1n)R+1),

    where the coefficients ck depend on b,σ,f,T but not on n.

  2. (Hypo-)Elliptic setting (Bally–Talay [1]): If b and σ are infinitely differentiable with bounded partial derivatives and if σ is uniformly elliptic in the sense that

    σσ*(x)≥ε0Id for all x∈ℝd, with ε0>0,

    or more generally if (b,σ) satisfies the strong Hörmander hypo-ellipticity assumption, then (33) holds true for every bounded Borel function f:d.

For more general path-dependent functionals, no such result exists in full generality. For various classes of specific functionals, depending on the running maximum or mean or on some exit stopping time, first order weak expansions in hα, α∈(0,1], have sometimes been established (see [12] for a brief review in connection with multilevel methods). However, as emphasized by the numerical experiments carried out in [12], such weak error expansions can be strongly suspected to hold at any order under reasonable smoothness assumptions.

In this subsection we consider a Lipschitz continuous functional F:𝒞b([0,T],ℝd)→ℝ and we set

Y0=F(X)  and  Yh=F(X¯n)with h=T/n and n≥1 (i.e. 𝐡=T).

We assume the weak error expansion (WEα,R̄). We now prove that both the ML2R estimator (2) and the MLMC estimator (1) satisfy a Strong Law of Large Numbers and a Central Limit Theorem as ε tends to 0.

Theorem 6.2

Let X0∈L2 and assume that F:𝒞b([0,T],ℝd)→ℝ is a Lipschitz continuous functional. Then assumption (SEβ) is satisfied with β=1.

If X0∈Lp for some p≥2, then the Lp-strong error assumption ‖Yh−Y0‖p≤V1(p)√h is satisfied, so that both ML2R and MLMC estimators satisfy Theorem 3.1.

If X0∈Lp for some p>2 and if F is differentiable with DF continuous, then the sequence (Z(h))h∈H is L2-uniformly integrable and

(34)there exists v>0 such that limh→0‖Z(h)‖2²=(M−1)v.

As a consequence, both ML2R and MLMC estimators satisfy Theorem 3.3 (case β=1).

Proof.

First, note that if F is a Lipschitz continuous functional, with Lipschitz coefficient [F]Lip, we have for all p2,

Yh-Y0pp[F]Lipp𝔼[supt[0,T]|Xt-X¯tn|p][F]Lippcb,σ,p,Tp(1+X0p)php2,

then (Yh)h satisfies (SEβ) with β=1 and the Lp-strong error assumption as soon as X0∈Lp.

Assume now that X0∈Lp for some p>2. By a straightforward application of Minkowski’s inequality we deduce from the Lp-strong error assumption that ‖Yh/M−Yh‖p≤C√h and then that suph‖Z(h)‖p<+∞. Applying criterion (a) of Lemma 5.2, we deduce that (Z(h))h is L2-uniformly integrable.

At this stage it remains to prove (34). The key is [2, Theorem 3], where it is proved that

√(nM)(X¯n−X¯nM)⟹stablyU(M)as n→+∞,

where U(M)=(Ut(M))t[0,T] is the d-dimensional process satisfying

(35)Ut(M)=M-12i,j=1qVt0t(Vs)-1φ.j(Xs)φ.i(Xs)𝑑Bsi,j,t[0,T].

We recall the notations of Jacod and Protter [11]

dXt=φ(Xt)dWt=j=0qφ.j(Xt)dWtj

with φ.j representing the jth column of the matrix φ=[φij]i=1,,d,j=1,,q, for j=1,,q, φ0=b and Wt:=(t,Wt1,,Wtq) (column vector), where Wt0=t and the q remaining components make up a standard Brownian motion. Moreover, φ.j is a d×d matrix, where (φ.j)ik=xkφij (partial derivative of φij with respect to the kth coordinate) and (Vt)t[0,T] is the d×d valued process solution of the linear equation

Vt=Id+j=0q0tφ.j(Xs)𝑑WsjVs,t[0,T].

Here (Bij)1i,jq is a standard q2-dimensional Brownian motion independent of W. This process is defined on an extension (Ω~,~,(t~)t0,~) of the original space (Ω,,(t)t0,) on which lives W.

We write, using that h=Tn,

Z(h)=nM(F(X¯nM)-F(X¯n))=-01DF(uX¯n+(1-u)X¯nM)𝑑uUn(M),

where Un(M):=√(nM)(X¯n−X¯nM). The function (x1,x2,x3)↦∫01DF(ux1+(1−u)x2)du·x3 is continuous, and it suffices to prove that (X¯n,X¯nM,Un(M))⇒(X,X,U(M)) as n goes to infinity to conclude that

(36)Z(h)-DF(X)U(M)as h0.

Let ϕ:𝒞b([0,T],ℝ2d)→ℝ and ψ:𝒞b([0,T],ℝd)→ℝ be two bounded Lipschitz continuous functionals, and denote X~n=(X¯n,X¯nM) and X~=(X,X). We write

𝔼[ϕ(X~n)ψ(Un(M))-ϕ(X~)ψ(U(M))]=𝔼[(ϕ(X~n)-ϕ(X~))ψ(Un(M))+ϕ(X~)(ψ(Un(M))-ψ(U(M)))].

Since (Un(M))n1 converges stably with limit U(M), we have that

limn+𝔼[ϕ(X~)(ψ(Un(M))-ψ(U(M)))]=0.

On the other hand, owing to (32), we prove that

limn+𝔼[(ϕ(X~n)-ϕ(X~))ψ(Un(M))]=0.

By (36) and Lemma 5.2 (b) we have

limh0Z(h)22=DF(X)U(M)22=(M-1)v

with v=‖DF(X)U(M)/√(M−1)‖2², which does not depend on M owing to the definition of U(M) given in (35). ∎
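To fix ideas, here is a minimal unweighted MLMC sketch (all 𝐖jR=1) for 𝔼[f(XT)], with an Euler scheme of step T/M^(j−1) at level j and the fine and coarse schemes of each correcting level driven by the same Brownian increments. The geometric dynamics used as a benchmark, the level sizes Nj and all the parameters below are illustrative choices of ours, not the optimized ones of Tables 1 and 2.

```python
import math
import random

def euler_terminal(x0, r, s, T, n, dW):
    """Terminal value of the Euler scheme for dX = r X dt + s X dW, given n Brownian increments."""
    h = T / n
    x = x0
    for k in range(n):
        x = x + r * x * h + s * x * dW[k]
    return x

def mlmc(f, x0, r, s, T, M, R, sizes, rng):
    est = 0.0
    for j in range(1, R + 1):
        n_fine = M ** (j - 1)
        acc = 0.0
        for _ in range(sizes[j - 1]):
            # one Brownian path at the fine resolution of level j
            dW = [rng.gauss(0.0, math.sqrt(T / n_fine)) for _ in range(n_fine)]
            fine = f(euler_terminal(x0, r, s, T, n_fine, dW))
            if j == 1:
                acc += fine
            else:
                # coarse scheme reuses the same path, increments summed M by M
                dW_c = [sum(dW[i:i + M]) for i in range(0, n_fine, M)]
                acc += fine - f(euler_terminal(x0, r, s, T, n_fine // M, dW_c))
        est += acc / sizes[j - 1]
    return est

rng = random.Random(1)
# Benchmark: geometric Brownian motion, for which E[X_T] = x0 * exp(r T).
est = mlmc(lambda x: x, 1.0, 0.05, 0.2, 1.0, 2, 4, [100000, 20000, 10000, 5000], rng)
assert abs(est - math.exp(0.05)) < 0.01
```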

6.2 Nested Monte Carlo

The aim of a nested Monte Carlo method is to compute by Monte Carlo simulation

𝔼[f(𝔼[X|Y])],

where (X,Y) is a couple of ℝ×ℝqY-valued random variables defined on a probability space (Ω,𝒜,ℙ), with X∈L2(ℙ), and f:ℝ→ℝ is a Lipschitz continuous function with Lipschitz coefficient [f]Lip. We assume that there exists a Borel function F:ℝqξ×ℝqY→ℝ and a random variable ξ:(Ω,𝒜)→ℝqξ, independent of Y, such that

X=F(ξ,Y)

and we set 𝐡=1/K0 for some integer K0≥1, h=1/K with K∈K0ℕ*={K0,2K0,…}, and

Y0:=f(𝔼[X|Y]),Yh=Y1K:=f(1Kk=1KF(ξk,Y)),

where (ξk)k≥1 is a sequence of i.i.d. random variables, ξk∼ξ, independent of Y. A nested ML2R estimator then writes (with nj=Mj−1)

IπN=1N1i=1N1f(1Kk=1KF(ξk(1),i,Y(1),i))
+j=2R𝐖jRNji=1Nj(f(1njKk=1njKF(ξk(j),i,Y(j),i))-f(1nj-1Kk=1nj-1KF(ξk(j),i,Y(j),i))),

where (Y(j),i)i≥1 is a sequence of independent copies of Y(j)∼Y, j=1,…,R, with Y(j) independent of Y(ℓ) for j≠ℓ, and (ξk(j),i)k,i≥1,j=1,…,R is a sequence of i.i.d. random variables, ξk(j),i∼ξ. We saw in [12] that, when f is 2R times differentiable with f(k) bounded, the nested Monte Carlo estimator satisfies (SEβ) with β=1 and (WEα,R̄) with α=1 and R¯=R−1. Here we want to show that the nested Monte Carlo estimator also satisfies the assumptions of the Strong Law of Large Numbers 3.1 and of the Central Limit Theorem 3.3. For convenience, we define

ϕ0(y):=𝔼[F(ξ,y)],ϕh(y):=1Kk=1KF(ξk,y),KK0*,

so that Y0=f(ϕ0(Y)) and Yh=f(ϕh(Y)), and for a fixed y, we set σF(y):=Var(F(ξ,y)).
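A minimal sketch of the biased inner-mean estimator Yh=f(ϕh(Y)) on a toy model (our illustrative choice, not from the paper): X=Y+ξ with Y, ξ independent standard Gaussians and f(u)=u², so that 𝔼[f(𝔼[X|Y])]=𝔼[Y²]=1, while an exact computation gives 𝔼[Yh]=1+h with h=1/K, exhibiting the bias of order h.

```python
import random

def nested_estimate(f, K, N, rng):
    """Crude nested Monte Carlo for E[f(E[X | Y])] on the toy model X = Y + xi."""
    total = 0.0
    for _ in range(N):
        y = rng.gauss(0.0, 1.0)
        inner = sum(rng.gauss(0.0, 1.0) for _ in range(K)) / K  # phi_h(y) - y
        total += f(y + inner)
    return total / N

rng = random.Random(2)
f = lambda u: u * u                              # E[f(E[X|Y])] = E[Y^2] = 1
coarse = nested_estimate(f, 2, 100000, rng)      # h = 1/2: exact mean is 1.5
fine = nested_estimate(f, 100, 50000, rng)       # h = 1/100: exact mean is 1.01
assert abs(coarse - 1.5) < 0.05
assert abs(fine - 1.01) < 0.05
```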

Proposition 6.3

Assume still that f is Lipschitz continuous. If X∈Lp(ℙ) for some p≥2, then there exists V1(p) such that, for all h=1/K and h′=1/K′ with K,K′∈K0ℕ*,

(37)‖Yh−Yh′‖pp≤V1(p)|h−h′|p/2.

As a consequence, assumption (SEβ) and the Lp-strong error assumption (9) are satisfied with β=1. Then both ML2R and MLMC estimators satisfy a Strong Law of Large Numbers, see Theorem 3.1.

Proof.

Set X~k=F(ξk,Y)-𝔼[F(ξk,Y)|Y] and Sk==1kX~. As f is Lipschitz, we have

Yh-Yhpp=f(1Kk=1KF(ξk,Y))-f(1Kk=1KF(ξk,Y))pp
[f]Lipp1Kk=1KX~k-1Kk=1KX~kpp=[f]Lipp𝔼[|SKK-SKK|p].

Assume without loss of generality that K′≥K. Since p≥2, it follows that

𝔼[|SKK-SKK|p]=𝔼[|(1K-1K)SK+1K(SK-SK)|p]
2p-1[|1K-1K|p𝔼[|SK|p]+(1K)p𝔼[|SK-SK|p]].

Owing to Burkholder’s inequality, there exists a universal constant Cp such that

𝔼[|SK|p]Cp𝔼[|k=1KX~k2|p2]Cp(k=1KX~k2p2)p2=CpKp2𝔼[|X~1|p].

Hence, as SK′−SK has the same distribution as SK′−K,

𝔼[|SKK-SKK|p]2p-1Cp𝔼[|X~1|p][|1K-1K|pKp2+(1K)p|K-K|p2].

Keeping in mind that K′≥K, we derive

|1K-1K|pKp2+|1K|p|K-K|p2=|1K-1K|p2|KK-1|p2+|KK|p2|1K-1K|p22|1K-1K|p2.

We conclude by setting V1(p)=[f]Lipp2pCp𝔼[|X~1|p]. ∎

For the Central Limit Theorem to hold, the key point is the following lemma.

Lemma 6.4

Assume that f:ℝ→ℝ is a Lipschitz continuous function, differentiable with f′ continuous. Let ζ be an 𝒩(0,1)-distributed random variable independent of Y. Then, as h→0,

(38)Z(h)=√(M/h)(Yh/M−Yh)⟹√(M−1)f′(ϕ0(Y))σF(Y)ζ.

Proof.

First note that Z(h)=zh(M)(Y), where zh(M) is defined by

zh(M)(y)=√(M/h)(f(ϕh/M(y))−f(ϕh(y)))for all y∈ℝqY.

Let yqY. We have

(39)zh(M)(y)=-(01f(vϕh(y)+(1-v)ϕhM(y))𝑑v)uh(M)(y)

with

uh(M)(y)=√(M/h)(ϕh(y)−ϕh/M(y)).

We derive from the Strong Law of Large Numbers that

limh0ϕh(y)=ϕ0(y)=limh0ϕhM(y)a.s.

and by continuity of the function (x1,x2)↦∫01f′(vx1+(1−v)x2)dv (since f′ is continuous) we get

(40)limh→0∫01f′(vϕh(y)+(1−v)ϕh/M(y))dv=f′(ϕ0(y))a.s.

We now have to study the convergence of the random sequence uh(M)(y) as h goes to zero. We set ξ~k=ξK+k, k=1,…,K(M−1). Note that the (ξ~k)k=1,…,K(M−1) are i.i.d. with the distribution of ξ1 and are independent of (ξk)k=1,…,K. Then we can write

uh(M)(y)=MK(1Kk=1KF(ξk,y)-1MKk=1MKF(ξk,y))
=MK(M-1MKk=1K(F(ξk,y)-ϕ0(y))-1MKk=K+1MK(F(ξk,y)-ϕ0(y)))
=M-1M(1Kk=1KF(ξk,y)-ϕ0(y))-M-1M[1K(M-1)(k=1K(M-1)F(ξ~k,y)-ϕ0(y))].

Owing to the Central Limit Theorem and the independence of the two terms on the right-hand side of the above equality, we derive that

uh(M)(y)⟹((M−1)/√M)σF(y)ζ1−√((M−1)/M)σF(y)ζ2as h→0,

where ζ1 and ζ2 are two independent random variables, both following a standard Gaussian distribution. Hence, noting that ((M−1)/√M)²+(√((M−1)/M))²=M−1, we obtain

(41)uh(M)(y)⟹√(M−1)σF(y)ζwith ζ∼𝒩(0,1).

By Slutsky’s Theorem, we derive from (39), (40) and (41) that for every yqY,

(42)zh(M)(y)M-1f(ϕ0(y))σF(y)ζas h0.

Recall that Z(h)=zh(M)(Y). We prove (38) combining Fubini’s Theorem with the Lebesgue Dominated Convergence Theorem and (42). More precisely, for all G𝒞b we have

limh→0𝔼[G(Z(h))]=limh→0𝔼[G(zh(M)(Y))]=𝔼[limh→0G(zh(M)(Y))]
=𝔼[G(√(M−1)f′(ϕ0(Y))σF(Y)ζ)],

as desired. ∎

We are now in a position to prove that the nested Monte Carlo satisfies the assumptions of the Central Limit Theorem 3.3.

Theorem 6.5

Assume that f:ℝ→ℝ is a Lipschitz continuous function, differentiable with f′ continuous. Then (Z(h))h∈H is L2-uniformly integrable and

(43)limh→0‖Z(h)‖2²=(M−1)‖f′(ϕ0(Y))σF(Y)‖2².

As a consequence, the ML2R and MLMC estimators (2) and (1) satisfy a Central Limit Theorem in the sense of Theorem 3.3 (case β=1).

Proof.

We prove first the L2-uniform integrability of (Z(h))h. As f is Lipschitz, we have

|Z(h)|2≤[f]Lip2|uh(M)(Y)|2with uh(M)(y)=√(M/h)(ϕh(y)−ϕh/M(y)).

Consequently, it suffices to show that (uh(M)(Y))h is L2-uniformly integrable, to establish the L2-uniform integrability of (Z(h))h.

We saw in the proof of Lemma 6.4 that uh(M)(Y)⇒√(M−1)σF(Y)ζ as h goes to 0, where ζ is a standard normal random variable independent of Y. Owing to Lemma 5.2 (b), the uniform integrability will follow from limh→0‖uh(M)(Y)‖2=‖√(M−1)σF(Y)ζ‖2. In fact, this convergence holds as an equality. Indeed,

uh(M)(Y)22=MK𝔼[(SK-SMK)2]=MK𝔼[(M-1MKSK-1MK(SMK-SK))2].

We notice that SMK-SK is independent of SK. Hence, since the ξk are independent,

uh(M)(Y)22=MK(𝔼[(M-1MKSK)2]+𝔼[(1MK(SMK-SK))2])
=MK((M-1MK)2K𝔼[(X~1)2]+(1MK)2(MK-K)𝔼[(X~1)2])
=(M-1)𝔼[(X~1)2]=(M-1)𝔼[σF2(Y)].

We prove now (43) using again Lemma 5.2 (b) with the convergence in law of (Z(h))h established in Lemma 6.4. ∎

We notice that, if assumption (37) in Proposition 6.3 holds with some p>2, then the L2-uniform integrability condition is much easier to establish, since it is a direct consequence of Lemma 5.2 (a).

6.3 Smooth nested Monte Carlo

When the function f is smooth, namely f∈𝒞1+ρ(ℝ,ℝ) for some ρ∈(0,1] (i.e. f′ is ρ-Hölder), a variant of the former multilevel nested estimator has been used in [4] (see also [8]) to improve the strong rate of convergence, so as to attain the asymptotically unbiased setting β>1 in condition (SEβ). A root M being given, the idea is to replace, in the successive refined levels of the ML2R and MLMC estimators, the difference Yh/M−Yh (where h=1/K, K∈K0ℕ*) by

Yh,hM:=f(1MKk=1MKF(ξk,Y))-1Mm=1Mf(1Kk=1KF(ξ(m-1)K+k,Y)).

It is clear that

𝔼[Yh,hM]=𝔼[Yh/M−Yh].

Computations similar to those carried out in the proof of Proposition 6.3 yield that, if X=F(ξ,Y)∈Lp(1+ρ)(ℙ) for some p≥2, then

(44)‖Yh,hM‖pp≤VM(ρ,p)|h−h/M|p(1+ρ)/2=VM(ρ,p)|1−1/M|p(1+ρ)/2|h|p(1+ρ)/2.
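On the same kind of toy model as before (X=Y+ξ with Y, ξ independent standard Gaussians, f(u)=u², both illustrative choices of ours), the following sketch compares the empirical variance of the plain level difference Yh/M−Yh with that of the smoothed difference Yh,hM built above; with a smooth f the latter is markedly smaller, which is what pushes the strong rate from β=1 to β=1+ρ>1.

```python
import random

def level_differences(K, M, N, rng):
    """Empirical variances of the plain and smoothed level differences
    for f(u) = u^2 on the toy model X = Y + xi (Y, xi standard Gaussian)."""
    f = lambda u: u * u
    plain, smooth = [], []
    for _ in range(N):
        y = rng.gauss(0.0, 1.0)
        xi = [rng.gauss(0.0, 1.0) for _ in range(M * K)]
        fine = f(y + sum(xi) / (M * K))                       # Y_{h/M}, h = 1/K
        coarse = f(y + sum(xi[:K]) / K)                       # Y_h on the first block
        blocks = [f(y + sum(xi[m * K:(m + 1) * K]) / K) for m in range(M)]
        plain.append(fine - coarse)
        smooth.append(fine - sum(blocks) / M)                 # Y_{h,h}^M
    var = lambda z: sum(v * v for v in z) / len(z) - (sum(z) / len(z)) ** 2
    return var(plain), var(smooth)

rng = random.Random(3)
v_plain, v_smooth = level_differences(100, 2, 20000, rng)
assert v_smooth < v_plain / 10.0   # the smoothed difference is far less noisy
```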

SLLN:

The first consequence is that the SLLN also holds for these modified estimators along the sequences of RMSE (εk)k1 satisfying k1εkp<+ owing to Theorem 3.1.

CLT:

When (44) is satisfied with p=2, one derives that β=(p/2)(1+ρ)=1+ρ>1, whatever ρ∈(0,1] is. Hence the only condition requested in this setting to obtain a CLT (see Theorem 3.2) is the L2-uniform integrability of (h−β/2Yh,hM)h, since no sharp rate is needed when β>1. Moreover, if (44) holds for some p∈(2,+∞), i.e. if X=F(ξ,Y)∈Lp(1+ρ)(ℙ) with p>2, then

h-β2Yh,hMpVM(ρ,p)1p|1-1M|12(1+ρ),

which in turn ensures the L2-uniform integrability.

As a final remark, note that if the function f is convex, then Yh,hM≤0, so that 𝔼[Yh/M]≤𝔼[Yh], which in turn implies by an easy induction that 𝔼[Y0]≤𝔼[Yh] for every h. A noticeable consequence is that the MLMC estimator has a positive bias.

These results can be extended to locally ρ-Hölder continuous functions with polynomial growth at infinity. For more details and a complete proof we refer to [9].

7 Conclusion

We proved a Strong Law of Large Numbers and a Central Limit Theorem for Multilevel estimators, with and without weights, and exhibited two applications: discretization schemes for diffusions, where we extend a result of Ben Alaya and Kebaier [2], and nested Monte Carlo, first mentioned in the Multilevel framework by Lemaire and Pagès in [12]. The Strong Law of Large Numbers is essentially a consequence of the strong error assumption (SEβ) (or of its reinforced version (9)) and of the independence of the estimator levels. The understanding of the behavior of the weights in the Multilevel Richardson–Romberg estimator is also crucial at this stage, as it is for the proof of the Central Limit Theorem which follows. Under additional L2-uniform integrability assumptions, both the weighted and the standard Multilevel estimators satisfy a Central Limit Theorem at rate ε as the quadratic error ε goes to 0. We distinguish two cases, depending on the value of the strong error rate β. When β>1, both the first coarse level and the successive fine correcting levels contribute to the asymptotic variance of the estimator, whereas when β∈(0,1] the asymptotic variance contains only the contribution of the correcting levels. With the choice of optimal parameters made in Tables 1 and 2, the standard Multilevel Monte Carlo estimator has a (bounded) bias, whereas the weighted Multilevel Monte Carlo estimator is asymptotically unbiased; hence exact confidence intervals can be built for the weighted Multilevel Monte Carlo estimator.

A Asymptotic of the weights

We focus our attention on the behavior of 𝐖jR when R+. We recall

𝐖jR=∑ℓ=jRaℓbR−ℓ=∑ℓ=0R−jaR−ℓbℓ

with

aℓ=1/∏1≤k≤ℓ−1(1−M−kα)

with the convention ∏k=10(1−M−kα)=1, and

bℓ=(−1)ℓM−αℓ(ℓ+1)/2/∏1≤k≤ℓ(1−M−kα).

For convenience, we set 𝐖jR=0 for j≥R+1, R∈ℕ*. We first notice that (aℓ)ℓ is an increasing and convergent sequence, and we set

limℓ→+∞aℓ=a∞.

The sequence (bℓ)ℓ converges to zero and, furthermore, the series with general term bℓ is absolutely convergent, since ∑ℓ≥1M−αℓ(ℓ+1)/2<+∞. This leads us to set

B̃=∑ℓ=0+∞|bℓ|<+∞andB=∑ℓ=0+∞bℓ<+∞.

Claim (a) of Lemma 4.3 is then proved. As a consequence,

|𝐖jR|≤a∞B̃for all R∈ℕ* and all j∈{1,…,R},

which proves claim (b) in Lemma 4.3. For the proof of claims (c) and (d), we will need the following:

Lemma A.1

Let φ:ℕ*→ℕ* be such that φ(R)∈{1,…,R−1} for all R≥1, with φ(R)→+∞ and R−φ(R)→+∞ as R→+∞. Then

limR+sup1jφ(R)|𝐖jR-1|=0.

In particular, for every j∈ℕ* we have 𝐖jR→1 as R→+∞. However, this convergence is not uniform, since 𝐖R−j+1R→a∞∑ℓ=0j−1bℓ for every j∈ℕ* as R→+∞.

Proof.

We write

|𝐖jR-aB|=|=0R-jaR-b-a=0R-jb-aR-j+1b|=0R-j(a-aR-)|b|+aR-j+1|b|.

First note that

limR+supj{1,,φ(R)}R-j+1|b|limR+R-φ(R)+1|b|=0

as R-φ(R)+ and 0|b|<+. On the other hand, for every j{1,,φ(R)},

=0R-j(a-aR-)|b|==jR(a-a)|bR-|
==jφ(R)(a-a)|bR-|+=φ(R)+1R(a-a)|bR-|
a=R-φ(R)R-j|b|+(a-aφ(R)+1)=0R-φ(R)-1|b|
a=R-φ(R)+|b|+(a-aφ(R)+1)=0R-φ(R)-1|b|.

Consequently, supj∈{1,…,φ(R)}∑ℓ=0R−j(a∞−aR−ℓ)|bℓ|→0 as R→+∞, since φ(R)→+∞ and R−φ(R)→+∞. Finally,

limR+sup1jφ(R)|𝐖jR-aB|=0.

Moreover, by definition we have 𝐖1R=1 for all R, which implies that B=1/a∞ and completes the proof. Finally, as aj→a∞,

𝐖R−j+1R=∑ℓ=0j−1aR−ℓbℓ⟶R→+∞a∞∑ℓ=0j−1bℓ,

as desired. ∎
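These weight properties are easy to check numerically from the closed forms of aℓ and bℓ recalled at the beginning of this appendix. The sketch below (with the illustrative choice M=2, α=1/2, not tied to any particular application) verifies that 𝐖1R=1 for every R and that the low-index weights are already close to 1 for moderate R.

```python
M, alpha = 2.0, 0.5   # illustrative root and weak-error exponent

def a(l):
    p = 1.0
    for k in range(1, l):               # empty product = 1 when l = 1
        p *= 1.0 - M ** (-k * alpha)
    return 1.0 / p

def b(l):
    p = 1.0
    for k in range(1, l + 1):
        p *= 1.0 - M ** (-k * alpha)
    return (-1.0) ** l * M ** (-alpha * l * (l + 1) / 2.0) / p

def W(j, R):
    # W_j^R = sum_{l=0}^{R-j} a_{R-l} b_l
    return sum(a(R - l) * b(l) for l in range(0, R - j + 1))

R = 10
assert abs(W(1, R) - 1.0) < 1e-10       # W_1^R = 1 for every R
assert abs(W(2, R) - 1.0) < 1e-4        # W_j^R -> 1 as R -> +infinity, j fixed
```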

Proof of Lemma 4.3 (c) and (d).

(c) Let us consider the non-negative measure on * defined by

mβ(j)=Mγ(j-1),γ<0.

We notice that it is a finite measure since

j1dmβ(j)=11-Mγ.

Since, as we saw in Lemma A.1, 𝐖jR1 as R+ for every j* and |𝐖jR|aB~, we derive from Lebesgue’s Dominated Convergence Theorem that

limR+j=2R|𝐖jR|Mγ(j-1)=j=2+limR+|𝐖jR|Mγ(j-1)=1M-γ-1.

(d) If γ<0, we consider the non-negative finite measure on * defined by mβ(j)=Mγ(j-1)vj since (vj)j1 is a bounded sequence of positive real numbers. As in the previous case (c) we have

limR+j=2R|𝐖jR|Mγ(j-1)vj=j=2+Mγ(j-1)vj.

If γ=0, let us consider a sequence φ(R){1,,R-1} such that φ(R)R1, R-φ(R)+ as R+ (for example φ(R)=R-R). Then we can write

|1Rj=2R|𝐖jR|vj-1Rj=2Rvj|[1Rj=2φ(R)||𝐖jR|-1|+1Rj=φ(R)+1R(|𝐖jR|+1)]supj2vj
[sup2jφ(R)|𝐖jR-1|φ(R)R+(aB~+1)(1-φ(R)R)]supj2vj.

Owing to Lemma A.1, sup2jφ(R)|𝐖jR-1|0 as R+. Using furthermore that φ(R)R1 as R+ and that limj+vj=1, one concludes by noting that, owing to Césàro’s Lemma, limR+1Rj=2Rvj=1.

If γ>0, first, we notice that

(45)j=2R|𝐖jR|Mγ(j-1)|𝐖RR|MγR=|aR|MγR+.

Let η>0. Since limj→+∞vj=1, there exists Nη∈ℕ* such that, for every j>Nη, |vj−1|<η/2. Owing to Lemma A.1, there exists Rη such that, for every R≥Rη, sup2≤j≤Nη|𝐖jR|<1+η. Then

|j=2R|𝐖jR|Mγ(j-1)vjj=2R|𝐖jR|Mγ(j-1)-1|j=2R|𝐖jR|Mγ(j-1)|vj-1|j=2R|𝐖jR|Mγ(j-1)
j=2Nη|𝐖jR|Mγ(j-1)|vj-1|j=2R|𝐖jR|Mγ(j-1)+η2j=Nη+1R|𝐖jR|Mγ(j-1)j=2R|𝐖jR|Mγ(j-1)
max2jNηMγ(j-1)|vj-1|Nηsup2jNη|𝐖jR|j=2R|𝐖jR|Mγ(j-1)+η2
f(Nη)(1+η)j=2R|𝐖jR|Mγ(j-1)+η2

where f(N)=max2≤j≤NMγ(j−1)|vj−1|N does not depend on R. Thanks to (45), there exists Rη′>0 such that, for every R≥max(Rη,Rη′), ∑j=2R|𝐖jR|Mγ(j−1)>2f(Nη)(1+η)/η, which proves that

limR+j=2R|𝐖jR|Mγ(j-1)vjj=2R|𝐖jR|Mγ(j-1)=1.

This leads to analyze

1MγRj=2R|𝐖jR|Mγ(j-1)=j=2R|𝐖jR|M-γ(R-j+1)=j=1R-1|𝐖R-j+1R|M-γj.

Using that |𝐖jR|aB~ for j{1,,R} and Lemma A.1, one derives from Lebesgue’s Dominated Convergence Theorem that

j=1R-1|𝐖R-j+1R|M-γjR+aj1|=0j-1b|M-γj<+

since |∑ℓ=0j−1bℓ|≤∑ℓ=0j−1|bℓ|≤B̃. ∎

References

[1] Bally V. and Talay D., The law of the Euler scheme for stochastic differential equations. I. Convergence rate of the distribution function, Probab. Theory Related Fields 104 (1996), no. 1, 43–60. 10.1007/BF01303802

[2] Ben Alaya M. and Kebaier A., Central limit theorem for the multilevel Monte Carlo Euler method, Ann. Appl. Probab. 25 (2015), no. 1, 211–234. 10.1214/13-AAP993

[3] Billingsley P., Convergence of Probability Measures, 2nd ed., Wiley Ser. Probab. Stat., John Wiley & Sons, New York, 1999. 10.1002/9780470316962

[4] Bujok K., Hambly B. and Reisinger C., Multilevel simulation of functionals of Bernoulli random variables with application to basket credit derivatives, Methodol. Comput. Appl. Probab. 17 (2015), no. 3, 579–604. 10.1007/s11009-013-9380-5

[5] Duffie D. and Glynn P., Efficient Monte Carlo simulation of security prices, Ann. Appl. Probab. 5 (1995), no. 4, 897–905. 10.1214/aoap/1177004598

[6] Gerbi A. A., Jourdain B. and Clément E., Ninomiya–Victoir scheme: Strong convergence, antithetic version and application to multilevel estimators, Monte Carlo Methods Appl. 22 (2016), no. 3, 197–228. 10.1515/mcma-2016-0109

[7] Giles M., Multilevel Monte Carlo path simulation, Oper. Res. 56 (2008), no. 3, 607–617. 10.1287/opre.1070.0496

[8] Giles M. B., Multilevel Monte Carlo methods, Acta Numer. 24 (2015), 259–328. 10.1007/978-3-642-41095-6_4

[9] Giorgi D., Ph.D. thesis, in progress.

[10] Hall P. and Heyde C. C., Martingale Limit Theory and its Application, Probab. Math. Statist., Academic Press, New York, 1980.

[11] Jacod J. and Protter P., Asymptotic error distributions for the Euler method for stochastic differential equations, Ann. Probab. 26 (1998), no. 1, 267–307. 10.1214/aop/1022855419

[12] Lemaire V. and Pagès G., Multilevel Richardson–Romberg extrapolation, preprint 2014, https://arxiv.org/abs/1401.1177v1; to appear in Bernoulli. 10.2139/ssrn.2539114

[13] Talay D. and Tubaro L., Expansion of the global error for numerical schemes solving stochastic differential equations, Stoch. Anal. Appl. 8 (1990), no. 4, 483–509. 10.1080/07362999008809220

Received: 2016-11-17
Accepted: 2017-1-26
Published Online: 2017-2-21
Published in Print: 2017-3-1

© 2017 by De Gruyter
