
    Multilevel Monte Carlo path simulation

Michael B. Giles
Professor of Scientific Computing, Oxford University Computing Laboratory

    Abstract

We show that multigrid ideas can be used to reduce the computational complexity of estimating an expected value arising from a stochastic differential equation using Monte Carlo path simulations. In the simplest case of a Lipschitz payoff and an Euler discretisation, the computational cost to achieve an accuracy of $O(\epsilon)$ is reduced from $O(\epsilon^{-3})$ to $O(\epsilon^{-2}(\log \epsilon)^2)$. The analysis is supported by numerical results showing significant computational savings.

    1 Introduction

In Monte Carlo path simulations, which are used extensively in computational finance, one is interested in the expected value of a quantity which is a functional of the solution to a stochastic differential equation. To be specific, suppose we have a multi-dimensional SDE with general drift and volatility terms,

$$dS(t) = a(S,t)\,dt + b(S,t)\,dW(t), \qquad 0 < t < T, \qquad (1)$$

and given initial data $S_0$ we want to compute the expected value of $f(S(T))$ where $f(S)$ is a scalar function with a uniform Lipschitz bound, i.e. there exists a constant $c$ such that

$$|f(U) - f(V)| \le c\, \|U - V\|, \qquad \forall\, U, V. \qquad (2)$$
A simple Euler discretisation of this SDE with timestep $h$ is
$$\hat S_{n+1} = \hat S_n + a(\hat S_n, t_n)\, h + b(\hat S_n, t_n)\, \Delta W_n,$$

and the simplest estimate for $E[f(S(T))]$ is the mean of the payoff values $f(\hat S_{T/h})$ from $N$ independent path simulations,
$$\hat Y = N^{-1} \sum_{i=1}^{N} f\big(\hat S_{T/h}^{(i)}\big).$$
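As a point of reference, here is a minimal sketch of this baseline estimator for a scalar SDE (illustrative code, not from the paper; the drift `a`, volatility `b` and payoff `f` are user-supplied placeholders):

```python
import numpy as np

def standard_mc(a, b, f, S0, T, h, N, rng=None):
    """Plain Euler-Maruyama Monte Carlo estimate of E[f(S(T))]."""
    rng = np.random.default_rng() if rng is None else rng
    n_steps = int(round(T / h))
    S = np.full(N, S0, dtype=float)
    for n in range(n_steps):
        t = n * h
        dW = rng.normal(0.0, np.sqrt(h), N)   # Brownian increments over [t, t+h]
        S = S + a(S, t) * h + b(S, t) * dW    # Euler step
    return np.mean(f(S))

# Example: geometric Brownian motion with a discounted European call payoff
estimate = standard_mc(lambda S, t: 0.05 * S,
                       lambda S, t: 0.2 * S,
                       lambda S: np.exp(-0.05) * np.maximum(S - 1.0, 0.0),
                       S0=1.0, T=1.0, h=0.01, N=100_000)
```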


It is well established that, provided $a(S,t)$ and $b(S,t)$ satisfy certain conditions [1, 15, 20], the expected mean-square-error (MSE) in the estimate $\hat Y$ is asymptotically of the form
$$MSE \approx c_1 N^{-1} + c_2 h^2,$$
where $c_1, c_2$ are positive constants. The first term corresponds to the variance in $\hat Y$ due to the Monte Carlo sampling, and the second term is the square of the $O(h)$ bias introduced by the Euler discretisation.

To make the MSE $O(\epsilon^2)$, so that the r.m.s. error is $O(\epsilon)$, requires that $N = O(\epsilon^{-2})$ and $h = O(\epsilon)$, and hence the computational complexity (cost) is $O(\epsilon^{-3})$ [4]. The main theorem in this paper proves that the computational complexity for this simple case can be reduced to $O(\epsilon^{-2}(\log \epsilon)^2)$ through the use of a multilevel method which reduces the variance, leaving unchanged the bias due to the Euler discretisation. The multilevel method is very easy to implement and can be combined, in principle, with other variance reduction methods such as stratified sampling [7] and quasi Monte Carlo methods [16, 17, 19] to obtain even greater savings.

The method extends the recent work of Kebaier [14], who proved that the computational cost of the simple problem described above can be reduced to $O(\epsilon^{-2.5})$ through the appropriate combination of results obtained using two levels of timestep, $h$ and $O(h^{1/2})$. This is closely related to a more generally applicable approach of quasi control variates analysed by Emsermann and Simon [5].

Our technique generalises Kebaier's approach to multiple levels, using a geometric sequence of different timesteps $h_l = M^{-l}\,T$, $l = 0, 1, \ldots, L$, for integer $M \ge 2$, with the smallest timestep $h_L$ corresponding to the original $h$ which determines the size of the Euler discretisation bias. This idea of using a geometric sequence of timesteps comes from the multigrid method for the iterative solution of linear systems of equations arising from the discretisation of elliptic partial differential equations [2, 21]. The multigrid method uses a geometric sequence of grids, each typically twice as fine in each direction as its predecessor. If one were to use only the finest grid, the discretisation error would be very small, but the computational cost of a Jacobi or Gauss-Seidel iteration would be very large. On a much coarser grid, the accuracy is much less, but the cost is also much less. Multigrid solves the equations on the finest grid by computing corrections using all of the grids, thereby achieving the fine grid accuracy at a much lower cost. This is a very simplified explanation of multigrid, but the same essential idea is used here: retaining the accuracy/bias associated with the smallest timestep, but using calculations with larger timesteps to reduce the variance in a way that minimises the overall computational complexity.


A similar multilevel Monte Carlo idea has been used by Heinrich [9] for parametric integration, in which one is interested in evaluating a quantity $I(\lambda)$ which is defined as a multi-dimensional integral of a function with a parametric dependence on $\lambda$. Although the details of the method are quite different from Monte Carlo path simulation, the analysis of the computational complexity is quite similar.

The paper begins with the introduction of the new multilevel method and an outline of its asymptotic accuracy and computational complexity for the simple problem described above. The main theorem and its proof are then presented. This establishes the computational complexity for a broad category of applications and numerical discretisations with certain properties. The applicability of the theorem to the Euler discretisation is a consequence of its well-established weak and strong convergence properties. The paper then discusses some refinements to the method and its implementation, and the effects of different payoff functions and numerical discretisations. Finally, numerical results are presented to provide support for the theoretical analysis, and directions for further research are outlined.

    2 Multilevel Monte Carlo method

Consider Monte Carlo path simulations with different timesteps $h_l = M^{-l}\,T$, $l = 0, 1, \ldots, L$. For a given Brownian path $W(t)$, let $P$ denote the payoff $f(S(T))$, and let $\hat S_{l,M^l}$ and $\hat P_l$ denote the approximations to $S(T)$ and $P$ using a numerical discretisation with timestep $h_l$.

It is clearly true that
$$E[\hat P_L] = E[\hat P_0] + \sum_{l=1}^{L} E[\hat P_l - \hat P_{l-1}].$$
The multilevel method independently estimates each of the expectations on the right-hand side in a way which minimises the computational complexity.

Let $\hat Y_0$ be an estimator for $E[\hat P_0]$ using $N_0$ samples, and let $\hat Y_l$ for $l > 0$ be an estimator for $E[\hat P_l - \hat P_{l-1}]$ using $N_l$ paths. The simplest estimator that one might use is a mean of $N_l$ independent samples, which for $l > 0$ is
$$\hat Y_l = N_l^{-1} \sum_{i=1}^{N_l} \big( \hat P_l^{(i)} - \hat P_{l-1}^{(i)} \big). \qquad (3)$$
The key point here is that the quantity $\hat P_l^{(i)} - \hat P_{l-1}^{(i)}$ comes from two discrete approximations with different timesteps but the same Brownian path. This is easily implemented by first constructing the Brownian increments for the simulation of the discrete path leading to the evaluation of $\hat P_l^{(i)}$, and then summing them in groups of size $M$ to give the discrete Brownian increments for the evaluation of $\hat P_{l-1}^{(i)}$. The variance of this simple estimator is $V[\hat Y_l] = N_l^{-1} V_l$, where $V_l$ is the variance of a single sample. The same inverse dependence on $N_l$ would apply in the case of a more sophisticated estimator using stratified sampling or a zero-mean control variate to reduce the variance.
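A minimal sketch of this coupling for the Euler discretisation (illustrative code, not from the paper; the drift `a`, volatility `b` and payoff `f` are placeholders): the fine path at level $l$ takes $M^l$ timesteps, and its Brownian increments are summed in groups of $M$ to drive the coarse path at level $l-1$.

```python
import numpy as np

def mlmc_level_samples(a, b, f, S0, T, M, l, N, rng=None):
    """Return N samples of P_l - P_{l-1} (or of P_0 when l = 0), using the
    same Brownian path for the fine and coarse Euler discretisations."""
    rng = np.random.default_rng() if rng is None else rng
    hf = T / M**l                         # fine timestep at level l
    Sf = np.full(N, S0, dtype=float)      # fine paths
    Sc = np.full(N, S0, dtype=float)      # coarse paths (unused when l = 0)
    n_coarse = M**(l - 1) if l > 0 else 1
    for k in range(n_coarse):
        dWc = np.zeros(N)
        for m in range(M if l > 0 else 1):
            t = (k * M + m) * hf
            dWf = rng.normal(0.0, np.sqrt(hf), N)
            Sf = Sf + a(Sf, t) * hf + b(Sf, t) * dWf   # fine Euler step
            dWc += dWf                                  # sum increments in groups of M
        if l > 0:
            tc = k * M * hf
            Sc = Sc + a(Sc, tc) * M * hf + b(Sc, tc) * dWc   # coarse Euler step
    return f(Sf) - (f(Sc) if l > 0 else 0.0)
```

The level estimator $\hat Y_l$ in (3) is then simply the mean of these samples.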

The variance of the combined estimator $\hat Y = \sum_{l=0}^{L} \hat Y_l$ is $V[\hat Y] = \sum_{l=0}^{L} N_l^{-1} V_l$. The computational cost, if one ignores the asymptotically negligible cost of the final payoff evaluation, is proportional to
$$\sum_{l=0}^{L} N_l\, h_l^{-1}.$$

Treating the $N_l$ as continuous variables, the variance is minimised for a fixed computational cost by choosing $N_l$ to be proportional to $\sqrt{V_l\, h_l}$. This calculation of an optimal number of samples $N_l$ is similar to the approach used in optimal stratified sampling [7], except that in this case we also include the effect of the different computational cost of the samples on different levels.
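To see where the $\sqrt{V_l h_l}$ rule comes from (a short derivation sketch added here, not spelled out in the text), treat the $N_l$ as continuous and introduce a Lagrange multiplier $\mu^2$ for the cost constraint:
$$\frac{\partial}{\partial N_l}\left( \sum_{k=0}^{L} N_k^{-1} V_k + \mu^2 \sum_{k=0}^{L} N_k\, h_k^{-1} \right) = -N_l^{-2} V_l + \mu^2 h_l^{-1} = 0 \quad\Longrightarrow\quad N_l = \mu^{-1} \sqrt{V_l\, h_l},$$
with $\mu$ fixed by the total cost (or, equivalently, by the target variance, as in equation (12) later).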

The above analysis holds for any value of $L$. We now assume that $L \gg 1$, and consider the behaviour of $V_l$ as $l \to \infty$. In the particular case of the Euler discretisation and the Lipschitz payoff function, provided $a(S,t)$ and $b(S,t)$ satisfy certain conditions [1, 15, 20], there is $O(h)$ weak convergence and $O(h^{1/2})$ strong convergence. Hence, as $l \to \infty$,
$$E[\hat P_l - P] = O(h_l), \qquad (4)$$
and
$$E\big[\, \|\hat S_{l,M^l} - S(T)\|^2 \,\big] = O(h_l). \qquad (5)$$
From the Lipschitz property (2), it follows that
$$V[\hat P_l - P] \le E\big[ (\hat P_l - P)^2 \big] \le c^2\, E\big[\, \|\hat S_{l,M^l} - S(T)\|^2 \,\big].$$
Combining this with (5) gives $V[\hat P_l - P] = O(h_l)$. Furthermore, writing
$$\hat P_l - \hat P_{l-1} = (\hat P_l - P) - (\hat P_{l-1} - P)$$
gives
$$V[\hat P_l - \hat P_{l-1}] \le \Big( \big(V[\hat P_l - P]\big)^{1/2} + \big(V[\hat P_{l-1} - P]\big)^{1/2} \Big)^{2}.$$


Hence, for the simple estimator (3), the single sample variance $V_l$ is $O(h_l)$, and the optimal choice for $N_l$ is asymptotically proportional to $h_l$. Setting $N_l = O(\epsilon^{-2} L\, h_l)$, the variance of the combined estimator $\hat Y$ is $O(\epsilon^2)$. If $L$ is now chosen such that
$$L = \frac{\log \epsilon^{-1}}{\log M} + O(1),$$
as $\epsilon \to 0$, then $h_L = M^{-L}\,T = O(\epsilon)$, and so the bias error $E[\hat P_L - P]$ is $O(\epsilon)$, due to (4). Consequently, we obtain a MSE which is $O(\epsilon^2)$, with a computational complexity which is $O(\epsilon^{-2} L^2) = O(\epsilon^{-2} (\log \epsilon)^2)$.
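To make the operation count explicit (a small worked step, not spelled out above): with $N_l \propto \epsilon^{-2} L\, h_l$ the cost of level $l$ is $N_l\, h_l^{-1} = O(\epsilon^{-2} L)$, so
$$\sum_{l=0}^{L} N_l\, h_l^{-1} = O\big(\epsilon^{-2} L (L+1)\big) = O\big(\epsilon^{-2} (\log \epsilon)^2\big),$$
since $L = O(\log \epsilon^{-1})$.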

    3 Complexity theorem

The main theorem is worded quite generally so that it can be applied to a variety of financial models with output functionals which are not necessarily Lipschitz functions of the terminal state but may instead be a discontinuous function of the terminal state, or even path-dependent as in the case of barrier and lookback options. The theorem also does not specify which numerical approximation is used. Instead, it proves a result concerning the computational complexity of the multilevel method conditional on certain features of the underlying numerical approximation and the multilevel estimators. This approach is similar to that used by Duffie and Glynn [4].

Theorem 3.1 Let $P$ denote a functional of the solution of stochastic differential equation (1) for a given Brownian path $W(t)$, and let $\hat P_l$ denote the corresponding approximation using a numerical discretisation with timestep $h_l = M^{-l}\,T$.

If there exist independent estimators $\hat Y_l$ based on $N_l$ Monte Carlo samples, and positive constants $\alpha \ge \tfrac12$, $\beta$, $c_1$, $c_2$, $c_3$ such that

i) $\big| E[\hat P_l - P] \big| \le c_1\, h_l^{\alpha}$

ii) $E[\hat Y_l] = E[\hat P_0]$ for $l = 0$, and $E[\hat Y_l] = E[\hat P_l - \hat P_{l-1}]$ for $l > 0$

iii) $V[\hat Y_l] \le c_2\, N_l^{-1} h_l^{\beta}$

iv) $C_l$, the computational complexity of $\hat Y_l$, is bounded by $C_l \le c_3\, N_l\, h_l^{-1}$,

then there exists a positive constant $c_4$ such that for any $\epsilon < e^{-1}$ there are values $L$ and $N_l$ for which the multilevel estimator
$$\hat Y = \sum_{l=0}^{L} \hat Y_l$$
has a mean-square-error with bound
$$MSE \equiv E\big[ (\hat Y - E[P])^2 \big] < \epsilon^2,$$
with a computational complexity $C$ with bound
$$C \le \begin{cases} c_4\, \epsilon^{-2}, & \beta > 1, \\ c_4\, \epsilon^{-2} (\log \epsilon)^2, & \beta = 1, \\ c_4\, \epsilon^{-2 - (1-\beta)/\alpha}, & 0 < \beta < 1. \end{cases}$$


We now need to consider the different possible values for $\beta$.

a) If $\beta = 1$, we set $N_l = \big\lceil 2\, \epsilon^{-2} (L+1)\, c_2\, h_l \big\rceil$, so that
$$V[\hat Y] = \sum_{l=0}^{L} V[\hat Y_l] \le \sum_{l=0}^{L} c_2\, N_l^{-1} h_l \le \tfrac12 \epsilon^2,$$
which is the required upper bound on the variance of the estimator.

To bound the computational complexity $C$ we begin with an upper bound on $L$ given by
$$L \le \frac{\log \epsilon^{-1}}{\alpha \log M} + \frac{\log(\sqrt{2}\, c_1 T^{\alpha})}{\alpha \log M} + 1.$$
Given that $1 < \log \epsilon^{-1}$ for $\epsilon < e^{-1}$, it follows that
$$L + 1 \le c_5\, \log \epsilon^{-1},$$
where
$$c_5 = \frac{1}{\alpha \log M} + \max\!\left( 0,\; \frac{\log(\sqrt{2}\, c_1 T^{\alpha})}{\alpha \log M} \right) + 2.$$

Upper bounds for $N_l$ are given by
$$N_l \le 2\, \epsilon^{-2} (L+1)\, c_2\, h_l + 1.$$
Hence the computational complexity is bounded by
$$C \le c_3 \sum_{l=0}^{L} N_l\, h_l^{-1} \le c_3 \left( 2\, \epsilon^{-2} (L+1)^2\, c_2 + \sum_{l=0}^{L} h_l^{-1} \right).$$
Using the upper bound for $L+1$ and inequality (7), and the fact that $1 < \log \epsilon^{-1}$ for $\epsilon < e^{-1}$, it follows that $C \le c_4\, \epsilon^{-2} (\log \epsilon)^2$, where
$$c_4 = 2\, c_3\, c_5^2\, c_2 + c_3\, \frac{M^2}{M-1} \big( \sqrt{2}\, c_1 \big)^{1/\alpha}.$$

b) For $\beta > 1$, setting
$$N_l = \left\lceil 2\, \epsilon^{-2} c_2\, T^{(\beta-1)/2} \big( 1 - M^{-(\beta-1)/2} \big)^{-1} h_l^{(\beta+1)/2} \right\rceil,$$
then
$$\sum_{l=0}^{L} V[\hat Y_l] \le \tfrac12 \epsilon^2\, T^{-(\beta-1)/2} \big( 1 - M^{-(\beta-1)/2} \big) \sum_{l=0}^{L} h_l^{(\beta-1)/2}.$$
Using the standard result for a geometric series,
$$\sum_{l=0}^{L} h_l^{(\beta-1)/2} = T^{(\beta-1)/2} \sum_{l=0}^{L} \big( M^{-(\beta-1)/2} \big)^{l} < T^{(\beta-1)/2} \big( 1 - M^{-(\beta-1)/2} \big)^{-1}, \qquad (8)$$


and hence we obtain an $\tfrac12 \epsilon^2$ upper bound on the variance of the estimator. Using the $N_l$ upper bound, the computational complexity for this case is bounded by $C \le c_4\, \epsilon^{-2}$ for a suitable constant $c_4$.

c) For $0 < \beta < 1$, a similar argument applies, with the first inequality in (6) used to bound $h_L^{-(1-\beta)}$ in terms of $\epsilon$, giving $C \le c_4\, \epsilon^{-2-(1-\beta)/\alpha}$.

Note that for $\beta > 1$ the computational effort is primarily expended on the coarsest levels, for $\beta < 1$ it is on the finest levels, and for $\beta = 1$ it is roughly evenly spread across all levels.

In applying the theorem in different contexts, there will often be existing literature on weak convergence which will establish the correct exponent $\alpha$ for condition i). Constructing estimators with properties ii) and iv) is also straightforward. The main challenge will be in determining and proving the appropriate exponent $\beta$ for iii). An even bigger challenge might be to develop better estimators with a higher value for $\beta$.

In the case of the Euler discretisation with a Lipschitz payoff, there is existing literature on the conditions on $a(S,t)$ and $b(S,t)$ for $O(h)$ weak convergence and $O(h^{1/2})$ strong convergence [1, 15, 20], which in turn gives $\beta = 1$, as explained earlier.

The convergence is degraded if the payoff function $f(S(T))$ has a discontinuity. In this case, for a given timestep $h_l$, a fraction of the paths of size $O(h_l^{1/2})$ will have a final $\hat S_{l,M^l}$ which is $O(h_l^{1/2})$ from the discontinuity. With the Euler discretisation, this fraction of the paths has an $O(1)$ probability of $\hat P_l - \hat P_{l-1}$ being $O(1)$, due to $\hat S_{l,M^l}$ and $\hat S_{l-1,M^{l-1}}$ being on opposite sides of the discontinuity, and therefore $V_l = O(h_l^{1/2})$ and $\beta = \tfrac12$. Because the weak order of convergence is still $O(h_l)$ [1], so $\alpha = 1$, the overall complexity is $O(\epsilon^{-2.5})$, which is still better than the $O(\epsilon^{-3})$ complexity of the standard Monte Carlo method with an Euler discretisation. Further improvement may be possible through the use of adaptive sampling techniques which increase the sampling of those paths with large values for $\hat P_l - \hat P_{l-1}$ [8, 13, 18].

If the Euler discretisation is replaced by Milstein's method for a scalar SDE, its $O(h)$ strong convergence results in $V_l = O(h_l^2)$ for a Lipschitz payoff. Current research is investigating how to achieve a similar improvement in the convergence rate for lookback, barrier and digital options, based on the appropriate use of Brownian interpolation [7], as well as the extension to multi-dimensional SDEs.

    4 Extensions

    4.1 Optimal M

The analysis so far has not specified the value of the integer $M$, which is the factor by which the timestep is refined at each level. In the multigrid method for the iterative solution of discretisations of elliptic PDEs, it is usually optimal to use $M = 2$, but that is not necessarily the case with the multilevel Monte Carlo method introduced in this paper.

For the simple Euler discretisation with a Lipschitz payoff, $V[\hat P_l - P] \approx c_0\, h_l$ asymptotically, for some positive constant $c_0$. This corresponds to the case $\beta = 1$ in Theorem 3.1. From the identity
$$\hat P_l - \hat P_{l-1} = (\hat P_l - P) - (\hat P_{l-1} - P)$$
we obtain, asymptotically, the upper and lower bounds
$$\big( \sqrt{M} - 1 \big)^2 c_0\, h_l \;\le\; V[\hat P_l - \hat P_{l-1}] \;\le\; \big( \sqrt{M} + 1 \big)^2 c_0\, h_l,$$

Figure 1: A plot of the function $f(M) = (M - M^{-1})/(\log M)^2$.


with the two extremes corresponding to perfect correlation and anti-correlation between $\hat P_l - P$ and $\hat P_{l-1} - P$. Suppose now that the value of $V[\hat P_l - \hat P_{l-1}]$ is given approximately by the geometric mean of the two bounds,
$$V[\hat P_l - \hat P_{l-1}] \approx (M - 1)\, c_0\, h_l,$$
which corresponds to $c_2 = (M-1)\, c_0$ in Theorem 3.1. This results in

$$N_l \approx 2\, \epsilon^{-2} (L+1) (M-1)\, c_0\, h_l,$$
and so the computational cost of evaluating $\hat Y_l$ is proportional to
$$N_l \big( h_l^{-1} + h_{l-1}^{-1} \big) = N_l\, h_l^{-1} \big( 1 + M^{-1} \big) \approx 2\, \epsilon^{-2} (L+1) \big( M - M^{-1} \big)\, c_0.$$

Since $L = O(\log \epsilon^{-1} / \log M)$, summing the costs of all levels, we conclude that asymptotically, as $\epsilon \to 0$, the total computational cost is roughly proportional to
$$2\, \epsilon^{-2} (\log \epsilon)^2\, f(M),$$
where
$$f(M) = \frac{M - M^{-1}}{(\log M)^2}.$$
This function is illustrated in Figure 1. Its minimum near $M = 7$ is about half the value at $M = 2$, giving twice the computational efficiency. The numerical results presented later are all obtained using $M = 4$. This gives most of the benefits of a larger value of $M$, but at the same time $M$ is small enough to give a reasonable number of levels from which to estimate the bias, as explained in the next section.
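As a quick numerical check of this trade-off (an illustration, not part of the original paper), $f(M)$ can be tabulated directly:

```python
import numpy as np

def f(M):
    """Asymptotic cost factor (M - 1/M) / (log M)^2 from Section 4.1."""
    return (M - 1.0 / M) / np.log(M) ** 2

for M in (2, 3, 4, 7, 16):
    print(M, round(f(M), 3))
# f(2) ~ 3.12, f(4) ~ 1.95, f(7) ~ 1.81, f(16) ~ 2.07:
# the minimum near M = 7 is roughly half the M = 2 value,
# and M = 4 already captures most of the benefit.
```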

    4.2 Bias estimation and Richardson extrapolation

In the multilevel method, the estimates for the correction $E[\hat P_l - \hat P_{l-1}]$ at each level give information which can be used to estimate the remaining bias. In particular, for the Euler discretisation with a Lipschitz payoff, asymptotically, as $l \to \infty$,
$$E[P - \hat P_l] \approx c_1\, h_l,$$
for some constant $c_1$, and hence
$$E[\hat P_l - \hat P_{l-1}] \approx (M-1)\, c_1\, h_l \approx (M-1)\, E[P - \hat P_l].$$
This information can be used in one of two ways. The first is to use it as an approximate bound on the remaining bias, so that to obtain a bias which has magnitude less than $\epsilon/\sqrt{2}$ one increases the value for $L$ until
$$|\hat Y_L| < \frac{1}{\sqrt{2}}\, (M-1)\, \epsilon.$$


Being more cautious, the condition which we use in the numerical results presented later is
$$\max\big\{ M^{-1} |\hat Y_{L-1}|,\; |\hat Y_L| \big\} < \frac{1}{\sqrt{2}}\, (M-1)\, \epsilon. \qquad (10)$$
This ensures that the remaining error, based on an extrapolation from either of the two finest timesteps, is within the desired range. This modification is designed to avoid possible problems due to a change in sign of the correction, $E[\hat P_l - \hat P_{l-1}]$, on successive refinement levels.

An alternative approach is to use Richardson extrapolation to eliminate the leading order bias. Since $E[P - \hat P_L] \approx (M-1)^{-1} E[\hat P_L - \hat P_{L-1}]$, by changing the combined estimator to
$$\sum_{l=0}^{L} \hat Y_l + (M-1)^{-1} \hat Y_L = \frac{M}{M-1} \left( \hat Y_0 + \sum_{l=1}^{L} \big( \hat Y_l - M^{-1} \hat Y_{l-1} \big) \right),$$
the leading order bias is eliminated and the remaining bias is $o(h_L)$, usually either $O(h_L^{3/2})$ or $O(h_L^{2})$. The advantage of re-writing the new combined estimator in the form shown on the right-hand side is that one can monitor the convergence of the terms $\hat Y_l - M^{-1} \hat Y_{l-1}$ to decide when the remaining bias is sufficiently small, in exactly the same way as described previously for $\hat Y_l$. Assuming the remaining bias is $O(h_L^{2})$, the appropriate convergence test is
$$\big| \hat Y_L - M^{-1} \hat Y_{L-1} \big| < \frac{1}{\sqrt{2}}\, (M^2 - 1)\, \epsilon. \qquad (11)$$

5 Numerical algorithm

Putting together the elements already discussed, the multilevel algorithm used for the numerical tests is as follows:

1. start with $L = 0$
2. estimate $V_L$ using an initial $N_L = 10^4$ samples
3. define the optimal $N_l$, $l = 0, \ldots, L$, using Eqn. (12)
4. evaluate extra samples at each level as needed for the new $N_l$
5. if $L \ge 2$, test for convergence using Eqn. (10) or Eqn. (11)
6. if $L < 2$ or not converged, set $L := L+1$ and go to 2.


The equation for the optimal $N_l$ is
$$N_l = \left\lceil 2\, \epsilon^{-2} \sqrt{V_l\, h_l} \left( \sum_{l=0}^{L} \sqrt{V_l / h_l} \right) \right\rceil. \qquad (12)$$
This makes the estimated variance of the combined multilevel estimator less than $\tfrac12 \epsilon^2$, while Equation (10) tries to ensure that the bias is less than $\epsilon/\sqrt{2}$. Together, they should give a MSE which is less than $\epsilon^2$, with $\epsilon$ being a user-specified r.m.s. accuracy.

In step 4, the optimal $N_l$ from step 3 is compared to the number of samples already calculated at that level. If the optimal $N_l$ is larger, then the appropriate number of additional samples is calculated. The estimate for $V_l$ is then updated, and this improved estimate is used if step 3 is re-visited.
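The loop in steps 1-6 can be sketched as follows (a simplified illustration under the assumptions above, not the author's code); `level_sampler(l, N)` is assumed to return $N$ samples of $\hat P_l - \hat P_{l-1}$ (or of $\hat P_0$ at $l = 0$), for example using the coupled Euler paths sketched in Section 2, and $T = 1$ is assumed when forming $h_l$.

```python
import numpy as np

def mlmc(level_sampler, eps, M=4, N0=10_000, L_min=2):
    """Heuristic multilevel driver following steps 1-6, Eqn. (10) and Eqn. (12)."""
    samples = []                                          # retained samples per level
    L = 0
    while True:
        samples.append(level_sampler(L, N0))              # step 2: initial N_L samples
        V = np.array([s.var() for s in samples])          # variance estimates V_l
        h = np.array([float(M) ** -l for l in range(L + 1)])   # timesteps h_l (T = 1)
        # step 3: optimal N_l from Eqn. (12)
        N_opt = np.ceil(2 * eps**-2 * np.sqrt(V * h) * np.sum(np.sqrt(V / h))).astype(int)
        # step 4: extra samples where needed; the V_l estimates improve as a result
        for l in range(L + 1):
            extra = N_opt[l] - len(samples[l])
            if extra > 0:
                samples[l] = np.concatenate([samples[l], level_sampler(l, extra)])
        Y = np.array([s.mean() for s in samples])          # level estimators Y_l
        # step 5: bias convergence test, Eqn. (10)
        if L >= L_min and max(abs(Y[-2]) / M, abs(Y[-1])) < (M - 1) * eps / np.sqrt(2):
            return Y.sum()                                 # combined estimate of E[P]
        L += 1                                             # step 6: add a finer level
```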

It is important to note that this algorithm is heuristic; it is not guaranteed to achieve a MSE which is $O(\epsilon^2)$. The main theorem in Section 3 does provide a guarantee, but the conditions of the theorem assume a priori knowledge of the constants $c_1$ and $c_2$ governing the weak convergence and the variance convergence as $h \to 0$. These two constants are in effect being estimated in the numerical algorithm described above.

The accuracy of the variance estimate at each level depends on the size of the initial sample set. If this initial sample size were made proportional to $\epsilon^{-p}$ for some exponent $0 < p < 2$, then as $\epsilon \to 0$ it could be proved that the variance estimate will converge to the true value with probability 1, without an increase in the order of the computational complexity.

The weakness in the heuristic algorithm lies in the bias estimation, and it does not appear to be easily resolved. Suppose the numerical algorithm determines that $L$ levels are required. If $p(S)$ represents the probability density function for the final state $S(T)$ defined by the SDE, and $p_l(S)$, $l = 0, 1, \ldots, L$, are the corresponding probability densities for the level $l$ numerical approximations, then in general $p(S)$ and the $p_l(S)$ are likely to be linearly independent, and so
$$p(S) = g(S) + \sum_{l=0}^{L} a_l\, p_l(S),$$
for some set of coefficients $a_l$ and a non-zero function $g(S)$ which is orthogonal to the $p_l(S)$. If we consider $g(S)$ to be an increment to the payoff function, then its numerical expectation on each level is zero, since
$$E_{p_l}[g] = \int g(S)\, p_l(S)\, dS = 0,$$


while its true expectation is
$$E_{p}[g] = \int g(S)\, p(S)\, dS = \int g^2(S)\, dS > 0.$$
Hence, by adding an arbitrary amount of $g(S)$ to the payoff, we obtain an arbitrary perturbation of the true expected payoff, but the heuristic algorithm will on average terminate at the same level $L$ with the same expected value.

This is a fundamental problem which also applies to the standard Monte Carlo algorithm. In practice, it may require additional a priori knowledge or experience to choose an appropriate minimum value for $L$ to achieve a given accuracy. Being cautious, one is likely to use a value for $L$ which is larger than required in most cases. In this case, the use of the multilevel method will yield significant additional benefits. For the standard Monte Carlo method, the computational cost is proportional to $M^L$, the number of timesteps on the finest level, whereas for the multilevel method with the Euler discretisation and a Lipschitz payoff the cost is proportional to $L^2$. Thus the computational cost of being cautious in the choice of $L$ is much less severe for the multilevel algorithm than for the standard Monte Carlo method.

Even better would be a multilevel application with a variance convergence rate $\beta > 1$; for this the computational cost is approximately independent of $L$, suggesting that one could use a value for $L$ which is much larger than necessary. If there is a known value for $L$ which is guaranteed to give a bias which is much less than $\epsilon$, then it may be possible to define a numerical algorithm which will provably achieve a MSE of $\epsilon^2$ at a cost which is $O(\epsilon^{-2})$; this will be an area for future research.

In reporting the numerical results later, we define the computational cost as the total number of timesteps performed on all levels,
$$C = N_0 + \sum_{l=1}^{L} N_l \big( M^{l} + M^{l-1} \big).$$
The term $M^{l} + M^{l-1}$ reflects the fact that each sample at level $l > 0$ requires the computation of one fine path with $M^{l}$ timesteps and one coarse path with $M^{l-1}$ timesteps.

The computational costs are compared to those of the standard Monte Carlo method, which is calculated as
$$C = \sum_{l=0}^{L} N_l\, M^{l},$$
where $N_l = 2\, \epsilon^{-2}\, V[\hat P_l]$ so that the variance of the estimator is $\tfrac12 \epsilon^2$, as with the multilevel method. The summation over the grid levels corresponds to an application of the standard Monte Carlo algorithm on each grid level, to enable the estimation of the bias in order to apply the same heuristic termination criterion as the multilevel method.

Results are also shown for Richardson extrapolation in conjunction with both the multilevel and standard Monte Carlo methods. The costs for these are defined in the same way; the difference is in the choice of $L$, and in the definition of the extrapolated estimator, which has a slightly different variance.

    6 Numerical results

    6.1 Geometric Brownian motion

Figures 2-5 present results for a simple geometric Brownian motion,
$$dS = r\, S\, dt + \sigma\, S\, dW, \qquad 0 < t < 1,$$
with $S(0) = 1$, $r = 0.05$ and $\sigma = 0.2$.

6.1.1 European option

The first results, shown in Figure 2, are for a European call option with discounted payoff $P = \exp(-r)\, \max(0, S(1) - 1)$.


[Figure 2 comprises four panels: $\log_M$ variance (top left) and $\log_M |\text{mean}|$ (top right) of $\hat P_l$ and $\hat P_l - \hat P_{l-1}$ against level $l$; $N_l$ against $l$ for $\epsilon$ between $5\times10^{-5}$ and $10^{-3}$ (bottom left); and $\epsilon^2\,$Cost against $\epsilon$ for standard Monte Carlo and MLMC, with and without Richardson extrapolation (bottom right).]

    Figure 2: Geometric Brownian motion with European option (value 0.10).

In the top right plot, a slope of approximately $-1$ again implies an $O(h_l)$ convergence of $E[\hat P_l - \hat P_{l-1}]$. Even at $l = 3$, the relative error $E[P - \hat P_l]/E[P]$ is less than $10^{-3}$. Also plotted is a line for the multilevel method with Richardson extrapolation, showing significantly faster weak convergence.

The bottom two plots have results from two sets of multilevel calculations, with and without Richardson extrapolation, for five different values of $\epsilon$. Each line in the bottom left plot corresponds to one multilevel calculation and shows the values for $N_l$, $l = 0, \ldots, L$, with the values decreasing with $l$ because of the decrease in both $V_l$ and $h_l$. It can also be seen that the value for $L$, the maximum level of timestep refinement, increases as the value for $\epsilon$ decreases.

The bottom right plot shows the variation of the computational complexity $C$ (as defined in the previous section) with the desired accuracy $\epsilon$. The plot is of $\epsilon^2 C$ versus $\epsilon$, because we expect to see that $\epsilon^2 C$ is only very weakly dependent on $\epsilon$ for the multilevel method. Indeed, it can be seen that without Richardson extrapolation $\epsilon^2 C$ is a very slowly increasing function of $\epsilon^{-1}$ for the multilevel methods, in agreement with the theory which predicts it to be asymptotically proportional to $(\log \epsilon)^2$. For the standard Monte Carlo method, theory predicts that $\epsilon^2 C$ should be proportional to the number of timesteps on the finest level, which in turn is roughly proportional to $\epsilon^{-1}$ due to the weak convergence property. This can be seen in the figure, with the "staircase" effect corresponding to the fact that $L = 2$ for $\epsilon = 0.001, 0.0005$ and $L = 3$ for $\epsilon = 0.0002, 0.0001, 0.00005$.

With Richardson extrapolation, a priori theoretical analysis predicts that $\epsilon^2 C$ for the standard Monte Carlo method should be approximately proportional to $\epsilon^{-1/2}$. However, with extrapolation the numerical results require no more than the minimum two levels of refinement to achieve the desired accuracy, and so $\epsilon^2 C$ is found to be independent of $\epsilon$ for the range of $\epsilon$ in the tests. Nevertheless, for the most accurate case with $\epsilon = 5\times10^{-5}$, the multilevel method is still approximately 10 times more efficient than the standard Monte Carlo method when using extrapolation, and more than 60 times more efficient without extrapolation.

As a final check on the reliability of the heuristics in the multilevel numerical algorithm, ten sets of multilevel calculations have been performed for each value of $\epsilon$, and the root-mean-square-error (RMSE) is computed and compared to the target accuracy $\epsilon$. For all cases, with and without Richardson extrapolation, the ratio RMSE$/\epsilon$ was found to be in the range 0.43-0.96, indicating that the algorithm is correctly achieving the desired accuracy.

    6.1.2 Asian option

Figure 3 has results for the Asian option payoff, $P = \exp(-r)\, \max\big(0, \bar S - 1\big)$, where
$$\bar S = \int_0^1 S(t)\, dt,$$
which is approximated numerically by
$$\bar S_l = \sum_{n=1}^{1/h_l} \tfrac12 \big( \hat S_n + \hat S_{n-1} \big)\, h_l.$$
The $O(h_l)$ convergence of both $V_l$ and $E[\hat P_l - \hat P_{l-1}]$ is similar to the European option case, but in this case the Richardson extrapolation does not seem to have improved the order of weak convergence. Hence, the reliability of the bias estimation and grid level termination must be questioned for the Richardson extrapolation. Without extrapolation, the multilevel method is up to 30 times more efficient than the standard Monte Carlo method.
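A sketch of how this trapezoidal approximation of the average fits into the Euler path loop (illustrative code only; the strike of 1 and the discount factor follow the payoff stated above):

```python
import numpy as np

def asian_payoff_paths(r, sigma, n_steps, N, rng=None):
    """GBM Euler paths on [0, 1], accumulating S_bar = sum 0.5*(S_n + S_{n-1})*h,
    then returning the discounted Asian payoff exp(-r) * max(S_bar - 1, 0)."""
    rng = np.random.default_rng() if rng is None else rng
    h = 1.0 / n_steps
    S = np.ones(N)
    S_bar = np.zeros(N)
    for n in range(n_steps):
        S_prev = S
        dW = rng.normal(0.0, np.sqrt(h), N)
        S = S + r * S * h + sigma * S * dW     # Euler step for dS = r S dt + sigma S dW
        S_bar += 0.5 * (S_prev + S) * h        # trapezoidal contribution on [t_n, t_{n+1}]
    return np.exp(-r) * np.maximum(S_bar - 1.0, 0.0)
```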



    Figure 3: Geometric Brownian motion with Asian option (value 0.058).

    6.1.3 Lookback option

The results in Figure 4 are for the lookback option
$$P = \exp(-r)\, \Big( S(1) - \min_{0 < t < 1} S(t) \Big).$$


    Figure 4: Geometric Brownian motion with lookback option (value 0.17).

The use of Richardson extrapolation reduces the number of grid levels required, so that the multilevel method gives savings of up to a factor 65 without extrapolation, but of up to only 4 with extrapolation.

    6.1.4 Digital option

The final payoff which is considered is a digital option, $P = \exp(-r)\, H(S(1) - 1)$, where $H(x)$ is the Heaviside function. The results in Figure 5 show that $V_l = O(h_l^{1/2})$, instead of the $O(h_l)$ convergence of all of the previous options. Because of this, much larger values for $N_l$ on the finer refinement levels are required to achieve comparable accuracy, and the efficiency gains of the multilevel method are reduced accordingly. Richardson extrapolation is extremely effective in this case, although the resulting order of weak convergence is unclear, but the multilevel method still offers some additional computational savings.


Figure 5: Geometric Brownian motion with digital option (value 0.53).

The accuracy of the heuristic algorithm is again tested by performing ten sets of multilevel calculations and comparing the RMSE to the target accuracy $\epsilon$. The ratio is in the range 0.55-1.0 for all cases, with and without extrapolation.

    6.2 Heston stochastic volatility model

Figure 6 presents results for the same European call payoff considered previously, but this time based on the Heston stochastic volatility model [10],
$$dS = r\, S\, dt + \sqrt{V}\, S\, dW_1, \qquad 0 < t < 1,$$
$$dV = \lambda\, (\sigma^2 - V)\, dt + \xi\, \sqrt{V}\, dW_2,$$



    Figure 6: Heston model with European option (value 0.10).

with $S(0) = 1$, $V(0) = 0.04$, $r = 0.05$, $\sigma = 0.2$, $\lambda = 5$, $\xi = 0.25$, and correlation $\rho = -0.5$ between $dW_1$ and $dW_2$.

The accuracy and variance are both improved by defining a new variable
$$W = e^{\lambda t}\, (V - \sigma^2),$$
and applying the Euler discretisation to the SDEs for $W$ and $S$, which results in the discrete equations
$$\hat S_{n+1} = \hat S_n + r\, \hat S_n\, h + \sqrt{\hat V_n^{+}}\, \hat S_n\, \Delta W_{1,n},$$
$$\hat V_{n+1} = \sigma^2 + e^{-\lambda h}\, (\hat V_n - \sigma^2) + \xi\, \sqrt{\hat V_n^{+}}\, \Delta W_{2,n}.$$
Note that $\sqrt{V}$ is replaced by $\sqrt{V^{+}}$, where $V^{+} \equiv \max(V, 0)$, but as $h \to 0$ the probability of the discrete approximation to the volatility becoming negative approaches zero, for the chosen values of $\lambda$, $\sigma$, $\xi$ [12].

Because the volatility does not satisfy a global Lipschitz condition, there is no existing theory to predict the order of weak and strong convergence. The numerical results suggest the variance is decaying slightly slower than first order, while the weak convergence appears slightly faster than first order. The multilevel method without Richardson extrapolation gives savings of up to a factor 10 compared to the standard Monte Carlo method. Using a reference value computed using the numerical method of Kahl and Jäckel [11], the ratio of the RMSE to the target accuracy is found to be in the range 0.49-1.01.
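A sketch of a single timestep of the discretisation described above, with correlated Brownian increments (illustrative code; the variable names and the exact form of the variance update follow the reconstruction of the scheme given above):

```python
import numpy as np

def heston_euler_step(S, V, h, r, sigma2, lam, xi, rho, rng):
    """One Euler step of the transformed Heston discretisation: sigma2 is the
    long-run variance, lam the mean-reversion rate, xi the vol-of-vol, rho the
    correlation between dW1 and dW2; V^+ = max(V, 0) guards the square root."""
    Z1 = rng.normal(0.0, 1.0, S.shape)
    Z2 = rho * Z1 + np.sqrt(1.0 - rho**2) * rng.normal(0.0, 1.0, S.shape)
    dW1, dW2 = np.sqrt(h) * Z1, np.sqrt(h) * Z2
    Vp = np.maximum(V, 0.0)
    S_new = S + r * S * h + np.sqrt(Vp) * S * dW1
    V_new = sigma2 + np.exp(-lam * h) * (V - sigma2) + xi * np.sqrt(Vp) * dW2
    return S_new, V_new
```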

The results with Richardson extrapolation are harder to interpret. The order of weak convergence does not appear to be improved. The computational cost is reduced, but this is due to the heuristic termination criterion, which assumes the remaining error after extrapolation is second order, which it is not. Consequently, the ratio of the RMSE to the target accuracy is in the range 0.66-1.23, demonstrating that the termination criterion is not reliable in combination with extrapolation for this application.

    7 Concluding remarks

In this paper we have shown that a multilevel approach, using a geometric sequence of timesteps, can reduce the order of complexity of Monte Carlo path simulations. If we consider the generation of a discrete Brownian path through a recursive Brownian bridge construction, starting with the end points $W_0$ and $W_T$ at level 0, then computing the mid-point $W_{T/2}$ at level 1, then the interval mid-points $W_{T/4}, W_{3T/4}$ at level 2, and so on, then an interpretation of the multilevel method is that the level $l$ correction, $E[\hat P_l - \hat P_{l-1}]$, corresponds to the effect on the expected payoff due to the extra detail that is brought into the Brownian bridge construction at level $l$.
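A minimal sketch of this recursive midpoint construction (illustrative code, not from the paper): level 0 fixes the endpoints, and each subsequent level fills in interval midpoints by Brownian bridge sampling.

```python
import numpy as np

def brownian_bridge_levels(T, L, rng=None):
    """Build a Brownian path on 2^L intervals by successive midpoint refinement.
    Returns one array per level; level l holds the path on 2^l + 1 grid points."""
    rng = np.random.default_rng() if rng is None else rng
    W = np.array([0.0, rng.normal(0.0, np.sqrt(T))])    # level 0: W(0) and W(T)
    levels = [W]
    for l in range(1, L + 1):
        dt = T / 2**l
        refined = np.empty(2**l + 1)
        refined[0::2] = W                                # keep the coarser points
        # midpoints: average of neighbours plus noise of variance dt/2
        refined[1::2] = 0.5 * (W[:-1] + W[1:]) + rng.normal(0.0, np.sqrt(dt / 2), 2**(l - 1))
        W = refined
        levels.append(W)
    return levels
```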

The numerical results for a range of model problems show that the multilevel algorithm is efficient and reliable in achieving the desired accuracy, whereas the use of Richardson extrapolation is more problematic; in some cases it works well but in other cases it fails to double the weak order of convergence and hence does not achieve the target accuracy.

There are a number of areas for further research arising from this work. One is the development of improved estimators giving a convergence order $\beta > 1$. For scalar SDEs, the Milstein discretisation gives $\beta = 2$ for Lipschitz payoffs, but more work is required to obtain improved convergence for lookback, barrier and digital options. The extension to multi-dimensional SDEs is also challenging since, in most cases, the Milstein discretisation requires the simulation of Lévy areas [6, 7].

A second area for research concerns the heuristic nature of the multilevel numerical procedure. It would clearly be desirable to have a numerical procedure which is guaranteed to give a MSE which is less than $\epsilon^2$. This may be achievable by using estimators with $\beta > 1$, so that one can use an excessively large value for $L$ without significant computational penalty, thereby avoiding the problems with the bias estimation.

Thirdly, the multilevel method needs to be tested on much more complex applications, more representative of the challenges faced in the finance community. This includes payoffs which involve evaluations at multiple intermediate times in addition to the value at maturity, and basket options which involve high-dimensional SDEs.

Finally, it may be possible to further reduce the computational complexity by switching to quasi Monte Carlo methods such as Sobol sequences and lattice rules [16, 17]. This is likely to be particularly effective in conjunction with improved estimators with $\beta > 1$, because in this case the optimal $N_l$ for the true Monte Carlo sampling leads to the majority of the computational effort being applied to extremely coarse paths. These are ideally suited to the use of quasi Monte Carlo techniques, which may be able to lower the computational cost towards $O(\epsilon^{-1})$ to achieve a MSE of $\epsilon^2$.

    Acknowledgements

I am very grateful to Mark Broadie, Paul Glasserman and Terry Lyons for discussions on this work, and for their very helpful comments on drafts of the paper.

    References

[1] V. Bally and D. Talay. The law of the Euler scheme for stochastic differential equations, I: convergence rate of the distribution function. Probability Theory and Related Fields, 104(1):43-60, 1995.

[2] W.M. Briggs, V.E. Henson, and S.F. McCormick. A Multigrid Tutorial (second edition). SIAM, 2000.

[3] M. Broadie, P. Glasserman, and S. Kou. A continuity correction for discrete barrier options. Mathematical Finance, 7(4):325-348, 1997.


[18] J.S. Liu. Monte Carlo Strategies in Scientific Computing. Springer, New York, 2001.

[19] H. Niederreiter. Random Number Generation and Quasi-Monte Carlo Methods. SIAM, 1992.

[20] D. Talay and L. Tubaro. Expansion of the global error for numerical schemes solving stochastic differential equations. Stochastic Analysis and Applications, 8:483-509, 1990.

    [21] P. Wesseling. An Introduction to Multigrid Methods. John Wiley, 1992.
