

Applied Mathematical Sciences

Zhongqiang Zhang
George Em Karniadakis

Numerical Methods for Stochastic Partial Differential Equations with White Noise


Applied Mathematical Sciences

Volume 196

Editors
S.S. Antman, Institute for Physical Science and Technology, University of Maryland, College Park, MD, [email protected]
L. Greengard, Courant Institute of Mathematical Sciences, New York University, New York, NY, [email protected]
P.J. Holmes, Department of Mechanical and Aerospace Engineering, Princeton University, Princeton, NJ, [email protected]

Advisors
J. Bell, Lawrence Berkeley National Lab, Center for Computational Sciences and Engineering, Berkeley, CA, USA
P. Constantin, Department of Mathematics, Princeton University, Princeton, NJ, USA
R. Durrett, Department of Mathematics, Duke University, Durham, NC, USA
R. Kohn, Courant Institute of Mathematical Sciences, New York University, New York, USA
R. Pego, Department of Mathematical Sciences, Carnegie Mellon University, Pittsburgh, PA, USA
L. Ryzhik, Department of Mathematics, Stanford University, Stanford, CA, USA
A. Singer, Department of Mathematics, Princeton University, Princeton, NJ, USA
A. Stevens, Department of Applied Mathematics, University of Münster, Münster, Germany
A. Stuart, Mathematics Institute, University of Warwick, Coventry, United Kingdom

Founding Editors
Fritz John, Joseph P. LaSalle and Lawrence Sirovich


More information about this series at http://www.springer.com/series/34


Zhongqiang Zhang • George Em Karniadakis

Numerical Methods for Stochastic Partial Differential Equations with White Noise



Zhongqiang Zhang
Department of Mathematical Sciences
Worcester Polytechnic Institute
Worcester, Massachusetts, USA

George Em Karniadakis
Division of Applied Mathematics
Brown University
Providence, Rhode Island, USA

ISSN 0066-5452          ISSN 2196-968X (electronic)
Applied Mathematical Sciences
ISBN 978-3-319-57510-0  ISBN 978-3-319-57511-7 (eBook)
DOI 10.1007/978-3-319-57511-7

Library of Congress Control Number: 2017941192

Mathematics Subject Classification (2010): 65Cxx, 65Qxx, 65Mxx

© Springer International Publishing AG 2017
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Printed on acid-free paper

This Springer imprint is published by Springer Nature
The registered company is Springer International Publishing AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland


Preface

In his forward-looking paper [374] at the conference "Mathematics Towards the Third Millennium," our esteemed colleague at Brown University Prof. David Mumford argued that ". . . stochastic models and statistical reasoning are more relevant to i) the world, ii) to science and many parts of mathematics and iii) particularly to understanding the computations in our mind, than exact models and logical reasoning." Deterministic modeling and corresponding simulations are computationally much more manageable than stochastic simulations, but they also offer much less, i.e., a single point in the "design space" instead of a "sphere" of possible solutions that reflect the various random (or not) perturbations of the physical or biological problem we study.

In the last twenty years, three-dimensional simulations of physical and biological phenomena have gone from an exception to the rule, and they have been widely adopted in Computational Science and Engineering (CSE). This more realistic approach to simulating physical phenomena, together with the continuing fast increase of computer speeds, has also led to the desire to perform more ambitious and even more realistic simulations with multiscale physics. However, not all scales can be modeled directly, and stochastic modeling has been used to account for the unmodeled physics. In addition, there is a fundamental need to quantify the uncertainty of large-scale simulations, and this has led to a new emerging field in CSE, namely that of Uncertainty Quantification or UQ. Hence, computational scientists and engineers are interested in endowing their simulations with "error bars" that reflect not only numerical discretization errors but also uncertainties associated with unknown precise values of material properties, ill-defined or even unknown boundary conditions, or uncertain constitutive laws in the governing equations. However, performing UQ taxes the computational resources greatly, and hence the selection of the proper numerical method for stochastic simulations is of paramount importance, in some sense much more important than selecting a numerical method for deterministic modeling.



The most popular simulation method for stochastic modeling is the Monte Carlo method and its various extensions, but it requires a lot of computational effort to compute the thousands and often millions of sample paths required to obtain certain statistics of the quantity of interest. Specifically, Monte Carlo methods are quite general, but they suffer from slow convergence, so they are usually employed in conjunction with some variance reduction techniques to produce satisfactory accuracy in practice. More recently, and for applications that employ stochastic models with color noise, deterministic integration methods in random space have been used with great success as they lead to high accuracy, especially for a modest number of uncertain parameters. However, they are not directly applicable to stochastic partial differential equations (SPDEs) with temporal white noise, since their solutions are usually non-smooth and would require a very large number of random variables to obtain acceptable accuracy.
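The slow convergence of Monte Carlo and the role of variance reduction are easy to see in a toy computation. The sketch below is purely illustrative (the codes accompanying this book are in MATLAB; here plain Python/NumPy is used, and the functional E[exp(W_1)] and the antithetic pairing are our own choices, not taken from the book): the sampling error of the estimate of E[exp(W_1)] = e^{1/2} decays roughly like N^{-1/2}, and pairing each sample with its mirror image is a minimal variance reduction technique.

```python
import numpy as np

def mc_estimate(n_paths, antithetic=False, seed=0, T=1.0):
    """Monte Carlo estimate of E[exp(W_T)] (exact value e^{T/2}).
    Only the endpoint W_T ~ N(0, T) is needed for this functional."""
    rng = np.random.default_rng(seed)
    WT = rng.normal(0.0, np.sqrt(T), size=n_paths)
    if antithetic:
        # pair each sample with its mirror -W_T: same mean, smaller variance
        samples = 0.5 * (np.exp(WT) + np.exp(-WT))
    else:
        samples = np.exp(WT)
    return samples.mean()

exact = np.exp(0.5)  # E[exp(W_1)] = e^{1/2}
for n in (100, 10_000, 1_000_000):
    print(n, abs(mc_estimate(n) - exact))  # error decays roughly like n**-0.5
```

Halving the error thus requires four times as many samples, which is the cost referred to above.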

Methodology. For linear SPDEs, we can still apply deterministic integration methods by exploiting the linear property of these equations for a long-time numerical integration. This observation has been made in [315] for Zakai-type equations, where a recursive Wiener chaos method was developed. In this book, we adopt this idea and we further formulate a recursive strategy to solve linear SPDEs of general types using Wiener chaos methods and stochastic collocation methods in conjunction with sparse grids for efficiency. In order to apply these deterministic integration methods, we first truncate the stochastic processes (e.g., Brownian motion) represented by orthogonal expansions. In this book, we show that the orthogonal expansions can lead to higher-order schemes with proper time discretization when Wiener chaos expansion methods are employed. However, we usually keep only a small number of truncation terms in the orthogonal expansion of Brownian motion to efficiently use deterministic integration methods for temporal noise. For spatial noise, the orthogonal expansion of Brownian motions leads to higher-order methods in both random space and physical space when the solutions to the underlying SPDEs are smooth.
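To make the truncation concrete: writing W(t) as a sum of terms xi_k times the integral of m_k over [0, t], for an orthonormal basis {m_k} of L^2[0, T] and i.i.d. standard Gaussians xi_k, dropping all but the first n terms leaves a mean-square error that can be evaluated in closed form. The Python sketch below is illustrative only; the cosine basis is one common choice and not necessarily the expansion used later in the book.

```python
import numpy as np

def wiener_truncation_mse(n_terms, t, T=1.0):
    """Mean-square truncation error E[(W(t) - W_n(t))^2] for the spectral
    (cosine) expansion of Brownian motion on [0, T]:
    W(t) = sum_k xi_k * int_0^t m_k(s) ds, {m_k} orthonormal in L^2[0, T]."""
    # integrals of the basis functions:
    #   phi_1(t) = t / sqrt(T)
    #   phi_k(t) = sqrt(2T)/((k-1) pi) * sin((k-1) pi t / T),  k >= 2
    phi_sq = (t / np.sqrt(T)) ** 2
    for k in range(2, n_terms + 1):
        a = (k - 1) * np.pi
        phi_sq += (np.sqrt(2 * T) / a * np.sin(a * t / T)) ** 2
    # independence of the xi_k gives E[(W - W_n)^2] = t - sum_k phi_k(t)^2
    return t - phi_sq

for n in (1, 4, 16, 64):
    print(n, wiener_truncation_mse(n, t=0.5))  # decreases as n grows
```

The slow decay of this error is one reason why only a small number of truncation terms is kept for temporal noise, as discussed above.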

The framework we use is the Wong-Zakai approximation and Wiener chaos expansion. Wiener chaos expansion is associated with the Ito-Wick product, which was used intensively in [223]. The methodology and the proofs are introduced in [315] and in some subsequent papers by Rozovsky and his collaborators. In this framework, we are led to systems of deterministic partial differential equations whose unknowns are the Wiener chaos expansion coefficients, and it is important to understand the special structure of these linear systems. Another framework, such as viewing the SPDEs as infinite-dimensional stochastic differential equations (SDEs), admits a ready application of numerical methods for SDEs. We emphasize that there are multiple points of view for treating SPDEs, and accordingly there are many views on what the proper numerical methods for SPDEs are. However, irrespective of such views, the common difficulty remains: the solutions are typically very rough and do not have first-order derivatives either in time or in space. Hence, no high-order (higher than first-order) methods are known except in very special cases.
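The passage from a stochastic equation to a deterministic system for the chaos coefficients can be seen in miniature on a random ODE (our own toy example, not a scheme from the book): for u'(t) = xi*u(t) with xi ~ N(0, 1), expanding u as a sum of coefficients c_n(t) times probabilists' Hermite polynomials He_n(xi) and using the recurrence xi*He_n = He_{n+1} + n*He_{n-1} yields the coupled deterministic system c_n' = c_{n-1} + (n+1)*c_{n+1}.

```python
import numpy as np

def wce_mean(T=1.0, n_modes=8, n_steps=2000):
    """Wiener chaos solution of u' = xi*u, u(0) = 1, xi ~ N(0,1).
    The Hermite recurrence turns the random ODE into the deterministic
    coupled system  c_n' = c_{n-1} + (n+1) * c_{n+1}  for the coefficients."""
    c = np.zeros(n_modes)
    c[0] = 1.0                     # u(0) = 1 contributes only to He_0
    dt = T / n_steps
    for _ in range(n_steps):
        dc = np.zeros_like(c)
        for n in range(n_modes):
            lower = c[n - 1] if n >= 1 else 0.0
            upper = c[n + 1] if n + 1 < n_modes else 0.0
            dc[n] = lower + (n + 1) * upper
        c = c + dt * dc            # forward Euler on the truncated system
    return c[0]                    # E[u(T)] = c_0(T), since E[He_n(xi)] = 0 for n >= 1

print(wce_mean())                  # exact value is E[e^xi] = e^{1/2}
```

Everything random has been absorbed into the basis; what remains to be solved numerically is a small linear deterministic system, which is the structure exploited in Part II.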

Since the well-known monographs on numerical SDEs by Kloeden & Platen (1992) [259], Milstein (1995) [354], and Milstein & Tretyakov (2004) [358], numerical SPDEs with white noise have gained popularity, and there have been some new books on numerical SPDEs available, specifically:

• The book by Jentzen & Kloeden (2011) [251] on the development of stochastic Taylor expansions for mild solutions to stochastic parabolic equations and their application to numerical methods.

• The book by Grigoriu (2012) [174] on the application of stochastic Galerkin and collocation methods as well as Monte Carlo methods to partial differential equations with random data, especially elliptic equations. Numerical methods for SODEs with random coefficients are discussed as well.

• The book by Kruse (2014) [277] on numerical methods in space and time for semi-linear parabolic equations driven by space-time noise, addressing strong (Lp or mean-square) and weak (moments) senses of convergence.

• The book by Lord, Powell, & Shardlow (2014) [308] on the introduction of numerical methods for stochastic elliptic equations with color noise and stochastic semi-linear equations with space-time noise.

For numerical methods of stochastic differential equations with color noise, we refer the readers to [294, 485]. On the theory of SPDEs, there are also some new developments, and we refer the interested readers to the book [36] covering amplitude equations for nonlinear SPDEs and to the book [119] on homogenization techniques for effective dynamics of SPDEs.

How to use this book. This book can serve as a reference/textbook for graduate students or other researchers who would like to understand the state-of-the-art of numerical methods for SPDEs with white noise.

Reading this book requires some basic knowledge of probability theory and stochastic calculus, which are presented in Chapter 2 and Appendix A. Readers should also be familiar with the numerical methods for partial differential equations and SDEs presented in Chapter 3 before reading further. The reader can also refer to Chapter 3 for MATLAB implementations of test problems. More MATLAB codes for examples in this book are available upon request. Those who want to take a quick glance at numerical methods for stochastic partial differential equations are encouraged to read the review of these methods presented in Chapter 3. Exercises with hints are provided in most chapters to nurture the reader's understanding of the presented materials.

Part I. Numerical stochastic ordinary differential equations. We start with numerical methods for SDEs with delay using the Wong-Zakai approximation and finite difference in time in Chapter 4. The framework of Wong-Zakai approximation is used throughout the book. If the delay time is zero, we then recover the standard SDEs. We then discuss how to deal with strong nonlinearity and stiffness in SDEs in Chapter 5.

Part II. Temporal white noise. In Chapters 6–8, we consider SPDEs as PDEs driven by white noise, where discretization of white noise (Brownian motion) leads to PDEs with smooth noise, which can then be treated by numerical methods for PDEs. In this part, recursive algorithms based on Wiener chaos expansion and stochastic collocation methods are presented for linear stochastic advection-diffusion-reaction equations. Stochastic Euler equations in Chapter 9 are exploited as an application of stochastic collocation methods, where a numerical comparison with other integration methods in random space is made.
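The idea of replacing white noise by a smoothed increment and then applying a standard PDE solver can be sketched on the stochastic heat equation u_t = nu*u_xx + sigma*dW/dt with periodic boundary conditions. The Python code below is a hypothetical minimal example (explicit finite differences, not one of the book's recursive algorithms): on each step the white noise is replaced by the increment dW/dt, so the scheme is an ordinary heat-equation update with a smooth forcing, in the spirit of a Wong-Zakai discretization.

```python
import numpy as np

def heat_with_white_noise(sigma=0.5, nu=0.1, N=64, n_steps=400, T=0.5, seed=1):
    """Explicit finite differences for u_t = nu*u_xx + sigma*dW/dt on a
    periodic grid; the white noise is discretized step by step via the
    Brownian increments dW, which makes each step a smooth PDE update."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0, 2 * np.pi, N, endpoint=False)
    dx, dt = x[1] - x[0], T / n_steps          # dt*nu/dx^2 ~ 0.013: stable
    u = np.sin(x)                              # initial condition
    W = 0.0                                    # running Brownian path
    for _ in range(n_steps):
        lap = (np.roll(u, 1) - 2 * u + np.roll(u, -1)) / dx**2
        dW = rng.normal(0.0, np.sqrt(dt))
        u = u + dt * nu * lap + sigma * dW     # heat step + discretized noise
        W += dW
    return x, u, W
```

Because this particular noise is additive and spatially constant, the computed solution equals the noise-free solution shifted by sigma*W(t), which gives a simple built-in correctness check for the discretization.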

Part III. Spatial white noise. We discuss in Chapter 10 numerical methods for nonlinear elliptic equations as well as other equations with additive noise. Numerical methods for SPDEs with multiplicative noise are discussed using the Wiener chaos expansion method in Chapter 11. Some SPDEs driven by non-Gaussian white noise are discussed, where some model reduction methods are presented for generalized polynomial chaos expansion methods.

We have attempted to make the book self-contained. Necessary background knowledge is presented in the appendices. Basic knowledge of probability theory and stochastic calculus is presented in Appendix A. In Appendix B, we present some semi-analytical methods for SPDEs. In Appendix C, we provide a brief introduction to Gauss quadrature. In Appendix D, we list all the conclusions we need for proofs. In Appendix E, we present a method to compute convergence rates empirically.
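Computing a convergence rate empirically, of the kind treated in Appendix E, usually amounts to a least-squares fit of log(error) against log(step size). A minimal Python illustration, using deterministic forward Euler for u' = -u rather than any stochastic scheme from the book:

```python
import numpy as np

def euler_error(h, T=1.0):
    """Global error of forward Euler for u' = -u, u(0) = 1, at time T."""
    n = round(T / h)
    u = 1.0
    for _ in range(n):
        u *= (1.0 - h)             # one Euler step
    return abs(u - np.exp(-T))     # compare with the exact solution

hs = np.array([2.0**-k for k in range(4, 10)])
errs = np.array([euler_error(h) for h in hs])
# empirical order = slope of log(error) versus log(h)
order = np.polyfit(np.log(hs), np.log(errs), 1)[0]
print(order)                       # close to 1: first-order convergence
```

For stochastic schemes the same fit is applied to mean-square or weak errors estimated over many sample paths.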

Acknowledgments. This book is dedicated to Dr. Fariba Fahroo, who with her quiet and steady leadership at AFOSR and DARPA and with her generous advocacy and support has established the area of uncertainty quantification and stochastic modeling as an independent field, indispensable in all areas of computational simulation of physical and biological problems.

This book is based on our research through collaboration with Professor Boris L. Rozovsky at Brown University and Professor Michael V. Tretyakov at the University of Nottingham. We are indebted to them for their valuable advice. Specifically, Chapters 6, 7, and 8 are based on collaborative research with them. Chapter 5 is also from a collaboration with Professor Michael V. Tretyakov. We would also like to thank Professor Wanrong Cao at Southeast University for providing us with numerical results in Chapter 4, and Professor Guang Lin at Purdue University and Dr. Xiu Yang at Pacific Northwest National Laboratory for providing their code for Chapter 9.

The research of this book was supported by OSD/MURI grant FA9550-09-1-0613, NSF/DMS grant DMS-0915077, NSF/DMS grant DMS-1216437, and ARO and NIH grants, and also by the Collaboratory on Mathematics for Mesoscopic Modeling of Materials (CM4), which is sponsored by DOE. The first author was also supported by a start-up fund at Worcester Polytechnic Institute during the writing of the book.

Worcester, MA, USA        Zhongqiang Zhang
Providence, RI, USA       George Em Karniadakis


Contents

Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . V

1  Prologue . . . . . 1
   1.1  Why random and Brownian motion (white noise)? . . . . . 1
   1.2  Modeling with SPDEs . . . . . 4
   1.3  Specific topics of this book . . . . . 7

2  Brownian motion and stochastic calculus . . . . . 11
   2.1  Gaussian processes and their representations . . . . . 11
   2.2  Brownian motion and white noise . . . . . 17
        2.2.1  Some properties of Brownian motion . . . . . 18
        2.2.2  Approximation of Brownian motion . . . . . 21
   2.3  Brownian motion and stochastic calculus . . . . . 25
   2.4  Stochastic chain rule: Ito formula . . . . . 29
   2.5  Integration methods in random space . . . . . 31
        2.5.1  Monte Carlo method and its variants . . . . . 31
        2.5.2  Quasi-Monte Carlo methods . . . . . 34
        2.5.3  Wiener chaos expansion method . . . . . 35
        2.5.4  Stochastic collocation method . . . . . 37
        2.5.5  Application to SODEs . . . . . 40
   2.6  Bibliographic notes . . . . . 46
   2.7  Suggested practice . . . . . 50

3  Numerical methods for stochastic differential equations . . . . . 53
   3.1  Basic aspects of SODEs . . . . . 53
        3.1.1  Existence and uniqueness of strong solutions . . . . . 54
        3.1.2  Solution methods . . . . . 56
   3.2  Numerical methods for SODEs . . . . . 60
        3.2.1  Derivation of numerical methods based on numerical integration . . . . . 60
        3.2.2  Strong convergence . . . . . 62
        3.2.3  Weak convergence . . . . . 64
        3.2.4  Linear stability . . . . . 66
        3.2.5  Summary of numerical SODEs . . . . . 69
   3.3  Basic aspects of SPDEs . . . . . 69
        3.3.1  Functional spaces . . . . . 72
        3.3.2  Solutions in different senses . . . . . 73
        3.3.3  Solutions to SPDEs in explicit form . . . . . 76
        3.3.4  Linear stochastic advection-diffusion-reaction equations . . . . . 77
        3.3.5  Existence and uniqueness . . . . . 77
        3.3.6  Conversion between Ito and Stratonovich formulation . . . . . 78
   3.4  Numerical methods for SPDEs . . . . . 80
        3.4.1  Direct semi-discretization methods for parabolic SPDEs . . . . . 82
        3.4.2  Wong-Zakai approximation for parabolic SPDEs . . . . . 85
        3.4.3  Preprocessing methods for parabolic SPDEs . . . . . 86
        3.4.4  What could go wrong? Examples of stochastic Burgers and Navier-Stokes equations . . . . . 88
        3.4.5  Stability and convergence of existing numerical methods . . . . . 90
        3.4.6  Summary of numerical SPDEs . . . . . 93
   3.5  Summary and bibliographic notes . . . . . 94
   3.6  Suggested practice . . . . . 96

Part I Numerical Stochastic Ordinary Differential Equations

4  Numerical schemes for SDEs with time delay using the Wong-Zakai approximation . . . . . 103
   4.1  Wong-Zakai approximation for SODEs . . . . . 104
        4.1.1  Wong-Zakai approximation for SDDEs . . . . . 105
   4.2  Derivation of numerical schemes . . . . . 106
        4.2.1  A predictor-corrector scheme . . . . . 107
        4.2.2  The midpoint scheme . . . . . 111
        4.2.3  A Milstein-like scheme . . . . . 113
   4.3  Linear stability of some schemes . . . . . 119
   4.4  Numerical results . . . . . 123
   4.5  Summary and bibliographic notes . . . . . 129
   4.6  Suggested practice . . . . . 132


5  Balanced numerical schemes for SDEs with non-Lipschitz coefficients . . . . . 135
   5.1  A motivating example . . . . . 135
   5.2  Fundamental theorem . . . . . 137
        5.2.1  On application of Theorem 5.2.3 . . . . . 140
        5.2.2  Proof of the fundamental theorem . . . . . 141
   5.3  A balanced Euler scheme . . . . . 145
   5.4  Numerical examples . . . . . 153
        5.4.1  Some numerical schemes . . . . . 153
        5.4.2  Numerical results . . . . . 155
   5.5  Summary and bibliographic notes . . . . . 159
   5.6  Suggested practice . . . . . 160

Part II Temporal White Noise

6  Wiener chaos methods for linear stochastic advection-diffusion-reaction equations . . . . . 167
   6.1  Description of methods . . . . . 167
        6.1.1  Multistage WCE method . . . . . 168
        6.1.2  Algorithm for computing moments . . . . . 172
   6.2  Examples in one dimension . . . . . 175
        6.2.1  Numerical results for one-dimensional advection-diffusion-reaction equations . . . . . 177
   6.3  Comparison of the WCE algorithm and Monte Carlo type algorithms . . . . . 179
   6.4  A two-dimensional passive scalar equation . . . . . 183
        6.4.1  A Monte Carlo method based on the method of characteristics . . . . . 186
        6.4.2  Comparison between recursive WCE and Monte Carlo methods . . . . . 187
   6.5  Summary and bibliographic notes . . . . . 188
   6.6  Suggested practice . . . . . 190

7  Stochastic collocation methods for differential equations with white noise . . . . . 193
   7.1  Introduction . . . . . 193
   7.2  Isotropic sparse grid for weak integration of SDE . . . . . 195
        7.2.1  Probabilistic interpretation of SCM . . . . . 195
        7.2.2  Illustrative examples . . . . . 196
   7.3  Recursive collocation algorithm for linear SPDEs . . . . . 204
   7.4  Numerical results . . . . . 208
   7.5  Summary and bibliographic notes . . . . . 215
   7.6  Suggested practice . . . . . 217


8  Comparison between Wiener chaos methods and stochastic collocation methods . . . . . 219
   8.1  Introduction . . . . . 219
   8.2  Review of Wiener chaos and stochastic collocation . . . . . 220
        8.2.1  Wiener chaos expansion (WCE) . . . . . 220
        8.2.2  Stochastic collocation method (SCM) . . . . . 221
   8.3  Error estimates . . . . . 224
        8.3.1  Error estimates for WCE . . . . . 225
        8.3.2  Error estimate for SCM . . . . . 231
   8.4  Numerical results . . . . . 240
   8.5  Summary and bibliographic notes . . . . . 248
   8.6  Suggested practice . . . . . 249

9  Application of collocation method to stochastic conservation laws . . . . . 251
   9.1  Introduction . . . . . 251
   9.2  Theoretical background . . . . . 253
        9.2.1  Stochastic Euler equations . . . . . 254
   9.3  Verification of the Stratonovich- and Ito-Euler equations . . . . . 256
        9.3.1  A splitting method for stochastic Euler equations . . . . . 256
        9.3.2  Stratonovich-Euler equations versus first-order perturbation analysis . . . . . 257
        9.3.3  Stratonovich-Euler equations versus Ito-Euler equations . . . . . 258
   9.4  Applying the stochastic collocation method . . . . . 259
   9.5  Summary and bibliographic notes . . . . . 262
   9.6  Suggested practice . . . . . 265

Part III Spatial White Noise

10  Semilinear elliptic equations with additive noise . . . . . 271
    10.1  Introduction . . . . . 271
    10.2  Assumptions and schemes . . . . . 273
    10.3  Error estimates for strong and weak convergence order . . . . . 275
          10.3.1  Examples of other PDEs . . . . . 276
          10.3.2  Proofs of the strong convergence order . . . . . 279
          10.3.3  Weak convergence order . . . . . 282
    10.4  Error estimates for finite element approximation . . . . . 286
    10.5  Numerical results . . . . . 292
    10.6  Summary and bibliographic notes . . . . . 293
    10.7  Suggested practice . . . . . 296


11  Multiplicative white noise: The Wick-Malliavin approximation . . . . . 297
    11.1  Introduction . . . . . 297
    11.2  Approximation using the Wick-Malliavin expansion . . . . . 299
    11.3  Lognormal coefficient . . . . . 301
          11.3.1  One-dimensional example . . . . . 303
    11.4  White noise as coefficient . . . . . 304
          11.4.1  Error estimates . . . . . 307
          11.4.2  Numerical results . . . . . 314
    11.5  Application of Wick-Malliavin approximation to nonlinear SPDEs . . . . . 315
    11.6  Wick-Malliavin approximation: extensions for non-Gaussian white noise . . . . . 317
          11.6.1  Numerical results . . . . . 322
          11.6.2  Malliavin derivatives for Poisson noises . . . . . 325
    11.7  Summary and bibliographic notes . . . . . 329
    11.8  Suggested practice . . . . . 333

12  Epilogue . . . . . 335
    12.1  A review of this work . . . . . 335
    12.2  Some open problems . . . . . 338

Appendices

A Basics of probability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 343A.1 Probability space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 343

A.1.1 Random variable . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 343A.2 Conditional expectation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 344

A.2.1 Properties of conditional expectation . . . . . . . . . . . . . . . . 345A.2.2 Filtration and Martingales . . . . . . . . . . . . . . . . . . . . . . . . . 346

A.3 Continuous time stochastic process . . . . . . . . . . . . . . . . . . . . . . . 347

B Semi-analytical methods for SPDEs . . . . . . . . . . . . . . . . . . . . . . 349

C Gauss quadrature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 351C.1 Gauss quadrature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 351C.2 Gauss-Hermite quadrature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 354

D Some useful inequalities and lemmas . . . . . . . . . . . . . . . . . . . . . . 357

E Computation of convergence rate . . . . . . . . . . . . . . . . . . . . . . . . . 361

References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 363

Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 395


1 Prologue

Stochastic mathematical models have received increasing attention for their ability to represent intrinsic uncertainty in complex systems, e.g., representing various scales in particle simulations at molecular and mesoscopic scales, as well as extrinsic uncertainty, e.g., stochastic external forces, stochastic initial conditions, or stochastic boundary conditions. One important class of stochastic mathematical models is stochastic partial differential equations (SPDEs), which can be seen as deterministic partial differential equations (PDEs) driven by finite- or infinite-dimensional stochastic processes – either color noise or white noise. Though white noise is a purely mathematical construction, it can be a good model for rapid random fluctuations, and it is also the limit of color noise as the correlation length goes to zero.

In the following text, we first use a simple model to discuss why random variables/processes are used, and introduce the color of noise. Then we present some typical models using PDEs with random (stochastic) processes. Finally, we preview the topics covered in this book and the methodology we use.

1.1 Why random and Brownian motion (white noise)?

Consider the following simple population growth model

dy(t) = k(t)y(t) dt, y(0) = y0. (1.1.1)

Here y(t) is the size of the population and k(t) is the relative growth rate. In practice, k(t) is not completely known and is perturbed around some known quantity k̄(t):

k(t) = k̄(t) + ‘perturbation (noise)’.

Here k̄(t) is deterministic and usually known, while the exact behavior of the perturbation (noise) term is not. The uncertainty (lack of information) about

© Springer International Publishing AG 2017. Z. Zhang, G.E. Karniadakis, Numerical Methods for Stochastic Partial Differential Equations with White Noise, Applied Mathematical Sciences 196, DOI 10.1007/978-3-319-57511-7_1


Page 17: Zhongqiang˜Zhang George˜Em˜Karniadakis Numerical Methods ... · Leslie Greengard, Courant Institute of Mathematical Sciences, New York University, New York, NY, USA Greengard@cims.nyu.edu

2 1 Prologue

k(t) (the perturbation term) is naturally represented as a stochastic quantity, denoted by v(t, ω), where ω represents the randomness.

To address the dependence on ω, we write the ordinary differential equation as

dy(t, ω) = k(t, ω)y(t, ω) dt,¹ y(0) = y0. (1.1.2)

Here y0 can be a random variable, but we take y0 = 1 (deterministic) for simplicity.

What color is the noise?

In many situations, it is assumed that a stochastic process v(t) satisfies the following properties:

1. The expectation of v(t) is zero for all t, i.e., E[v(t)] = 0.
2. The covariance (two-point correlation) function of v(t) is more or less known. That is,

Cov[(v(t), v(s))] = E[(v(t) − E[v(t)])(v(s) − E[v(s)])].

When the covariance function is proportional to the Dirac delta function δ(t − s), the process v(t) is called uncorrelated, and is usually referred to as white noise. Otherwise, it is correlated and is referred to as color noise. White noise can be intuitively described as a stochastic process that has independent values at each time instance and infinite variance.

Due to the simplicity of Gaussian processes, v(t, ω) is modeled with a Gaussian process. One important Gaussian process in physics is Brownian motion, which describes the random motion of particles in a fluid under constantly varying fluctuations. It is fully characterized by its expectation (usually taken as zero) and its covariance function, see Chapter 2.2.

Another way to represent a Gaussian noise is through a Fourier series. A real-valued Gaussian process can be represented as

v(t) = ∑_{k=−∞}^{∞} e^{−ikt} a_k ξ_k,

where the ξ_k’s are i.i.d. standard Gaussian random variables (i.e., mean zero and variance 1) and i is the imaginary unit (i² = −1). When the a_k’s are the same constant, the process is called white noise. When |a_k|² is proportional to 1/k^α, it is called 1/f^α noise. When α = 2, the process is called red noise (Brownian noise). When α = 1, the process is called pink noise. It is called blue noise when α = −1. However, white noise (α = 0) is not observed in experiments and nature, whereas 1/f^α noise with 0 < α ≤ 2 was first observed back in

¹We assume that k(t, ω) is sufficiently smooth in t. It will become clear why some smoothness is needed when we introduce the stochastic product and calculus in Chapter 2.3.

Page 18: Zhongqiang˜Zhang George˜Em˜Karniadakis Numerical Methods ... · Leslie Greengard, Courant Institute of Mathematical Sciences, New York University, New York, NY, USA Greengard@cims.nyu.edu

1.1 Why random and Brownian motion (white noise)? 3

the 1910s, e.g., α = 1 [144]. As α approaches 0, the process v(t) becomes less correlated, and white noise (α = 0) can be treated as a limit of processes with extremely small correlation. This is illustrated in Figure 1.1, which is generated by the Matlab code in [429] using the Mersenne Twister pseudorandom number generator with seed 100. The smaller α is, the closer the noise is to white noise.
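The spectral construction above can be sketched in a few lines. The following Python sketch is illustrative only (it is not the Matlab code of [429]); the function name, the 2π phase convention, and the truncation level are choices made here, and the conjugate modes e^{±ikt} are combined into real cosine and sine terms.

```python
import numpy as np

def gaussian_1_over_f_alpha(alpha, n_modes=500, n_points=1000, rng=None):
    """Sample one path of a real-valued Gaussian 1/f^alpha process on [0, 1].

    Truncates the Fourier representation to n_modes terms, taking
    |a_k|^2 proportional to 1/k^alpha, with i.i.d. N(0, 1) weights.
    """
    rng = np.random.default_rng(rng)
    t = np.linspace(0.0, 1.0, n_points)
    k = np.arange(1, n_modes + 1)
    amp = k ** (-alpha / 2.0)              # |a_k| proportional to k^{-alpha/2}
    xi = rng.standard_normal(n_modes)      # cosine weights
    eta = rng.standard_normal(n_modes)     # sine weights
    phases = 2.0 * np.pi * np.outer(k, t)  # shape (n_modes, n_points)
    v = (amp[:, None] * (xi[:, None] * np.cos(phases)
                         + eta[:, None] * np.sin(phases))).sum(axis=0)
    return t, v

# Smaller alpha -> flatter spectrum -> rougher, "whiter" sample path.
t, path_red = gaussian_1_over_f_alpha(2.0, rng=100)    # red (Brownian) noise
t, path_white = gaussian_1_over_f_alpha(0.0, rng=100)  # white-like noise
```

Plotting `path_red` against `path_white` reproduces the qualitative behavior of Figure 1.1: the α = 0 path has a much larger variance and no visible correlation.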

Fig. 1.1. Sample paths of different Gaussian 1/f^α processes on [0, 1]: (a) Brownian motion, α = 2; (b) α = 1.5; (c) pink noise, α = 1; (d) α = 0.5; (e) α = 0.1; (f) white noise.

Page 19: Zhongqiang˜Zhang George˜Em˜Karniadakis Numerical Methods ... · Leslie Greengard, Courant Institute of Mathematical Sciences, New York University, New York, NY, USA Greengard@cims.nyu.edu

4 1 Prologue

Solutions to (1.1.2)

When k̄(t) = 0, k(t, ω) = v(t, ω) may take the following forms:

• v(t, ω) = ξ ∼ N(0, 1), a standard Gaussian random variable. The covariance (in this case, variance) of v(t, ω) is 1.
• v(t, ω) is a Gaussian process with covariance function exp(−|t − s|/A), where A is the correlation time.
• v(t, ω) = W(t, ω), a standard Brownian motion, whose covariance function is min(t, s).
• v(t, ω) = Ẇ(t, ω), white noise, whose covariance function is δ(t − s).

When k(t, ω) = Ẇ(t, ω), Equation (1.1.2) is understood in the Stratonovich sense,

dy = y ∘ dW(t), y(0) = y0, (1.1.3)

so that one can apply the classical chain rule. We discuss what the circle means in Chapter 2.3; roughly speaking, the circle denotes the Stratonovich product, which can be understood as a limit of Riemann sums using the midpoint rule.

The exact solution to Equation (1.1.2) is y = y0 exp(K(t)), where K(t) = ∫_0^t k(s) ds is again Gaussian with mean zero. It can be readily checked by the definition of moments that

• K(t) ∼ N(0, t²) when k(t, ω) = ξ ∼ N(0, 1);
• K(t) ∼ N(0, 2At + 2A²(exp(−t/A) − 1)) when k(t, ω) has the two-point correlation function exp(−|t₁ − t₂|/A);
• K(t) ∼ N(0, t³/3) when k(t, ω) = W(t, ω) is a standard Brownian motion;
• K(t) ∼ N(0, t), i.e., K(t) is a Brownian motion, when k(t, ω) = Ẇ(t, ω) is white noise.

Then the moments of the solution y are, for m = 1, 2, · · · ,

E[y^m(t)] = y0^m exp(m²σ²/2), (1.1.4)

where σ² = t², 2At + 2A²(exp(−t/A) − 1), t³/3, and t for the listed processes, respectively. These results can be used to check the accuracy of different numerical schemes applied to Equation (1.1.2).
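The Brownian-motion case is easy to check by Monte Carlo simulation. The following Python sketch (variable names and discretization parameters are choices made here for illustration) approximates K(T) = ∫_0^T W(s) ds by a Riemann sum along simulated paths and compares the sample mean of y(T) = y0 exp(K(T)) with formula (1.1.4) for m = 1, i.e., E[y(T)] = y0 exp(T³/6).

```python
import numpy as np

# Monte Carlo check of (1.1.4) for k(t, ω) = W(t, ω), a standard Brownian
# motion: K(T) = ∫_0^T W(s) ds ~ N(0, T^3/3), so E[y(T)] = y0 exp(T^3/6).
rng = np.random.default_rng(42)
n_paths, n_steps, T, y0 = 20_000, 200, 1.0, 1.0
dt = T / n_steps

dW = np.sqrt(dt) * rng.standard_normal((n_paths, n_steps))
W = np.cumsum(dW, axis=1)        # W(t_j) on each path
K = dt * W.sum(axis=1)           # Riemann sum for ∫_0^T W(s) ds
mc_mean = y0 * np.exp(K).mean()  # Monte Carlo estimate of E[y(T)]
exact = y0 * np.exp(T**3 / 6.0)  # formula (1.1.4) with m = 1, σ² = T³/3
```

The discrepancy between `mc_mean` and `exact` is of the order of the Monte Carlo sampling error plus the (small) Riemann-sum discretization bias.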

1.2 Modeling with SPDEs

SPDEs with white noise have been formulated for various applications, such as nonlinear filtering (see, e.g., [499]), turbulent flows (see, e.g., [43, 348]), fluid flows in random media (see, e.g., [223]), particle systems (see, e.g., [268]), population biology (see, e.g., [98]), and neuroscience (see, e.g., [463]). Since

Page 20: Zhongqiang˜Zhang George˜Em˜Karniadakis Numerical Methods ... · Leslie Greengard, Courant Institute of Mathematical Sciences, New York University, New York, NY, USA Greengard@cims.nyu.edu

1.2 Modeling with SPDEs 5

analytic solutions to SPDEs can rarely be obtained, numerical methods have to be developed to solve them. One of the motivations for numerical SPDEs in the early literature was to solve the Zakai equation of nonlinear filtering, see, e.g., [31, 78, 130, 150–152]. In the next section, we review some numerical methods for a semilinear equation (3.4.1), the advection-diffusion-reaction equation of nonlinear filtering (3.4.13), the stochastic Burgers equation (3.4.20), and the stochastic Navier-Stokes equation (1.2.4).

Let W(t) be an r-dimensional Brownian motion, i.e., W(t) = (W₁(t), . . . , W_r(t))ᵀ, where the W_i(t)’s are mutually independent Brownian motions – Gaussian processes with covariance function min(t, s). For a rigorous definition of Brownian motion, see Chapter 2.2.

Example 1.2.1 (Zakai equation of nonlinear filtering) Let y(t) be an r-dimensional observation of a signal x(t), given by

y(t) = y0 + ∫_0^t h(x(s)) ds + W(t),

where h = (h₁, h₂, . . . , h_r)ᵀ is an R^r-valued function defined on R^d and the signal x(t) satisfies the following stochastic differential equation

dx(t) = b(x(t)) dt + ∑_{k=1}^{q} σ_k(x(t)) dB_k, x(0) = x₀,

where b and the σ_k’s are d-dimensional vector functions on R^d, and B(t) = (B₁(t), B₂(t), . . . , B_q(t))ᵀ is a q-dimensional Brownian motion on (Ω, F, P), independent of W(t).

The (un-normalized) conditional probability density function of the signal x(t) given y(t) satisfies, see, e.g., [499],

du(t, x) = (1/2) ∑_{i,j=1}^{d} D_i D_j [(σσᵀ)_{ij} u(t, x)] dt − ∑_{i=1}^{d} D_i (b_i u(t, x)) dt + ∑_{l=1}^{r} h_l u(t, x) dy_l(t), x ∈ R^d. (1.2.1)

Here D_i := ∂_{x_i} is the partial derivative in the x_i-direction and σᵀ is the transpose of σ.

Equation (1.2.1) provides an analytical solution to the aforementioned filtering problem. Equation (1.2.1) and its variants have been a major motivation for the development of the theory of SPDEs (see, e.g., [280, 408]) and the corresponding numerical methods, see, e.g., [31, 78, 130, 150–152]; for a comprehensive review of numerical methods, see [52].

Page 21: Zhongqiang˜Zhang George˜Em˜Karniadakis Numerical Methods ... · Leslie Greengard, Courant Institute of Mathematical Sciences, New York University, New York, NY, USA Greengard@cims.nyu.edu

6 1 Prologue

Example 1.2.2 (Pressure equation) The following model was introduced as an example for the pressure p of a fluid in a porous and heterogeneous (but isotropic) medium at the point x over the physical domain D in R^d (d ≤ 3):

− div(K(x)∇p) = f(x), p|_{∂D} = 0, (1.2.2)

where K(x) > 0 is the permeability of the medium at the point x and f(x) is a mass source. In a typical porous medium, K(x) fluctuates in an unpredictable and irregular way and can be modeled with a stochastic process.

In [140, 141], K(x) = exp(∫_D φ(x − y) dW(y) − ‖φ‖²/2) is represented as a lognormal process; see also [223] for a formulation of the lognormal process using the Ito-Wick product (see its definition in Chapter 2.3). Here φ is a continuous function on D and ‖φ‖ = (∫_D φ² dx)^{1/2} < ∞.

We define infinite-dimensional Gaussian processes, see, e.g., [94, 408], as follows:

W_Q(t, x) = ∑_{i∈N^d} √(q_i) e_i(x) W_i(t), (1.2.3)

where the W_i(t)’s are mutually independent Brownian motions, q_i ≥ 0 for i ∈ N^d, and {e_i(x)} is an orthonormal basis in L²(D). When q_i = 1 for all i’s, we have space-time white noise. When ∑_i q_i < ∞, we call it space-time color noise; sometimes it is called a Q-Wiener process. We call the noise finite-dimensional when q_i = 0 for all sufficiently large i.

Example 1.2.3 (Turbulence model) The stochastic incompressible Navier-Stokes equation reads, see, e.g., [108, 348],

∂_t u + u · ∇u − νΔu + ∇p = σ(u)Ẇ_Q, div u = 0, (1.2.4)

where σ is Lipschitz continuous over the physical domain in R^d (d = 2, 3). Here E[W_Q(x, t)W_Q(y, s)] = q(x, y) min(s, t), and q(x, x) is square-integrable over the physical domain.

Example 1.2.4 (Reaction-diffusion equation)

du = (νΔu + f(u)) dt + σ(u) dW_Q(t, x).

This represents a wide class of SPDEs:

• In materials science, the stochastic Allen-Cahn equation, where f(u) = u(1 − u)(1 + u), σ(u) is a constant, and W_Q(t, x) is space-time white noise, see, e.g., [204, 302].
• In population genetics, this equation has been used to model changes in the structure of a population in time and space, where W_Q(t, x) is a Gaussian process white in time but color in space. For example, f(u) = 0 and σ(u) = γ√(max(u, 0)), where γ is a constant and u is the mass distribution of the population, see, e.g., [97]. Also, f(u) = αu − β and σ(u) = γ√(max(u(1 − u), 0)), where α, β, γ are constants, see, e.g., [129].

Page 22: Zhongqiang˜Zhang George˜Em˜Karniadakis Numerical Methods ... · Leslie Greengard, Courant Institute of Mathematical Sciences, New York University, New York, NY, USA Greengard@cims.nyu.edu

1.3 Specific topics of this book 7

• When σ(u) is a small constant, the equation can be treated as a random perturbation of the deterministic equation (σ(u) = 0), see, e.g., [136].

In the first and third cases, we say that the equation has additive noise, as the coefficient of the noise is independent of the solution. In the second case, we say that the equation has multiplicative noise, since the coefficient of the noise depends on the solution itself.

For more details on modeling with SPDEs, we refer to the books [223, 268]. For well-posedness (existence, uniqueness, and stability) of SPDEs, we refer to the books [94, 145, 170, 223, 399, 408]. For studies of the dynamics of SPDEs, we refer to [145] for the asymptotic behavior of solutions, to [92] for Kolmogorov equations for SPDEs, to [36] for amplitude equations of nonlinear SPDEs, and to [119] for homogenization of multiscale SPDEs. In this book, we present numerical methods for SPDEs. Specifically, we focus on forward problems, i.e., predicting the quantities of interest from known SPDEs with known inputs, especially SPDEs driven by white noise.
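As a concrete illustration of the additive-noise case of Example 1.2.4, one can discretize the 1-D stochastic heat equation (f = 0, σ constant, space-time white noise) with central finite differences in space and an explicit Euler-Maruyama step in time; the white-noise increment on a grid cell of size dx over a time step dt then scales like √(dt/dx). The sketch below uses illustrative parameters and is not a scheme analyzed in this book.

```python
import numpy as np

# Sketch: explicit finite-difference/Euler-Maruyama stepping for the 1-D
# stochastic heat equation du = ν u_xx dt + σ dW(t, x) with homogeneous
# Dirichlet boundary conditions and space-time white noise.
rng = np.random.default_rng(1)
nu, sigma = 0.1, 0.5
nx, nt = 64, 2000
dx, dt = 1.0 / nx, 1.0 / nt            # ν·dt/dx² ≈ 0.2 < 0.5: stable
x = np.linspace(0.0, 1.0, nx + 1)
u = np.sin(np.pi * x)                  # deterministic initial condition

for _ in range(nt):
    lap = (u[:-2] - 2.0 * u[1:-1] + u[2:]) / dx**2          # discrete u_xx
    noise = sigma * np.sqrt(dt / dx) * rng.standard_normal(nx - 1)
    u[1:-1] += dt * nu * lap + noise   # Euler-Maruyama update, interior only
    u[0] = u[-1] = 0.0                 # Dirichlet boundary conditions
```

The explicit step requires ν dt/dx² ≤ 1/2 for stability, which the parameters above satisfy; the resulting path u(·, T) is a rough but bounded random function.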

1.3 Specific topics of this book

In this book, we focus on two issues related to numerical methods for SPDEs with white noise: one is deterministic integration methods in random space, while the other is the effect of truncating the Brownian motion using a spectral approximation.² In Figure 1.2, we provide an overview of how this book is organized. In addition, we present some necessary details in the appendices to make the book self-contained.

In Chapter 2, we review some numerical methods for SPDEs with white noise, addressing primarily their discretization in random space. We then explore the effect of truncating Brownian motion using its spectral expansion (Wong-Zakai approximation) for stochastic differential equations with time delay in Chapter 4, and show that the Wong-Zakai approximation can facilitate the derivation of various numerical schemes.

For deterministic integration methods for SPDEs in random space, we aim at accurate long-time numerical integration of time-dependent equations, especially linear stochastic advection-reaction-diffusion equations. We study Wiener chaos expansion methods (WCE) and stochastic collocation methods (SCM), compare their performance, and prove their convergence order. To achieve longer integration times, we adopt the recursive WCE proposed in [315] for the Zakai equation of nonlinear filtering and develop algorithms for the first two moments of the solutions. Numerical results show that when high accuracy is required, WCE is superior to

²There are different numerical methods for SPDEs using different approximations of Brownian motion and different integration methods in random space; see Chapter 2 for a review, where discretization in time and space is also presented.

Page 23: Zhongqiang˜Zhang George˜Em˜Karniadakis Numerical Methods ... · Leslie Greengard, Courant Institute of Mathematical Sciences, New York University, New York, NY, USA Greengard@cims.nyu.edu

8 1 Prologue

Monte Carlo methods, while WCE is not as efficient if only low accuracy is required, see Chapter 6. We show that the recursive approach for SCM for linear advection-reaction-diffusion equations is efficient for long-time integration in Chapter 7. We first analyze the error of SCM (sparse grid collocation of Smolyak type) with an Euler scheme in time for linear SODEs, and show that the error is small only when the noise magnitude is small and/or the integration time is relatively short. We compare WCE and SCM using the recursive procedure in Chapter 8, where we derive error estimates of WCE and SCM for linear advection-reaction-diffusion equations and show, through careful numerical comparisons, that WCE and SCM are competitive in practice, even though WCE can be one order higher than SCM.

In almost all approximations for WCE and SCM, we use the Wong-Zakai approximation with a spectral approximation of Brownian motion. The convergence order with respect to the number of truncation modes is one half, see Chapter 8. However, WCE can be of higher convergence order, since it preserves orthogonality over the (infinite-dimensional) Wiener space, while SCM cannot, as orthogonality is only valid on discrete (finite-dimensional) spaces, see Chapter 8. In Chapter 9, we test the Wong-Zakai approximation in conjunction with the stochastic collocation method for the stochastic Euler equations modeling a stochastic piston problem, and show the effectiveness of this approximation in a practical situation. To further investigate the effect of truncating Brownian motion, we study the elliptic equation with additive white noise in Chapter 10. We show that the convergence of numerical solutions with truncated Brownian motion depends on the smoothing effects of the resolvent of the elliptic operator. We also show similar convergence when finite element methods are used.

As shown in Figure 1.2, we focus on deterministic integration methods in random space, i.e., Wiener chaos and stochastic collocation, in Chapters 6–9, and compare their performance with Monte Carlo methods and/or quasi-Monte Carlo methods. In Chapter 6, we compare WCE and Monte Carlo methods and show that WCE is superior to Monte Carlo methods if high accuracy is needed. In Chapters 7 and 9, we show theoretically and numerically the efficiency of SCM for short-time integration and for small magnitudes of noise. In Chapter 8, we compare WCE and SCM in conjunction with a recursive multistage procedure and show that they are comparable in performance. In Chapter 11, we consider WCE for elliptic equations with multiplicative noise. We use the Wick product for the interaction of two random variables, as well as the Wick-Malliavin approximation, to reduce the computational cost. We use Monte Carlo methods in Chapters 4 and 10, as the dimensionality in random space there is beyond deterministic integration methods.

In all chapters except Chapters 5 and 7, we apply the Wong-Zakai approximation with the Brownian motion approximated by its spectral truncation. It is shown that the convergence of numerical schemes based on the Wong-Zakai approximation is determined by the further discretization in space (Chapter 10) or in time (Chapter 4).

Page 24: Zhongqiang˜Zhang George˜Em˜Karniadakis Numerical Methods ... · Leslie Greengard, Courant Institute of Mathematical Sciences, New York University, New York, NY, USA Greengard@cims.nyu.edu

1.3 Specific topics of this book 9

Fig. 1.2. Conceptual and chapter overview of this book.


2 Brownian motion and stochastic calculus

In this chapter, we review some basic concepts of stochastic processes and stochastic calculus, as well as numerical integration methods in random space for obtaining statistics of stochastic processes.

We start from Gaussian processes and their representations in Chapter 2.1, and then introduce Brownian motion, its properties, and its approximations in Chapter 2.2. We discuss basic concepts in stochastic calculus: the Ito integral in Chapter 2.3 and the Ito formula in Chapter 2.4. We then focus on numerical integration methods in random space, such as Monte Carlo methods, quasi-Monte Carlo methods, the Wiener chaos method, and the stochastic collocation method (sparse grid collocation method), in Chapter 2.5. Examples of applying these methods to a simple equation are provided in Chapter 2.5.5 with Matlab code. In the bibliographic notes of Chapter 2.6, we review different types of approximation of Brownian motion and briefly discuss the pros and cons of the numerical integration methods in random space. Various exercises are provided for readers to familiarize themselves with the basic concepts presented in this chapter.

2.1 Gaussian processes and their representations

On a given probability space (Ω, F, P) (Ω = R), if the cumulative distribution function of a random variable X is normal, i.e.,

P(X < x) = ∫_{−∞}^{x} (1/(√(2π)σ)) e^{−(y−μ)²/(2σ²)} dy, σ > 0,



then the random variable X is called a Gaussian (normal) random variable on the probability space (Ω, F, P). Here X is completely characterized by its mean μ and its standard deviation σ. We denote X ∼ N(μ, σ²). The probability density function of X is

p(x) = (1/(√(2π)σ)) e^{−(x−μ)²/(2σ²)}.

When μ = 0 and σ = 1, we call X a standard Gaussian (normal) random variable. If X ∼ N(μ, σ²), then Z = (X − μ)/σ ∼ N(0, 1), i.e., Z is a standard Gaussian (normal) random variable.

Example 2.1.1 If the X_i are mutually independent Gaussian random variables, then ∑_{i=1}^{N} a_i X_i is a Gaussian random variable for any a_i ∈ R. In particular, if X₁ ∼ N(μ₁, σ₁²) and X₂ ∼ N(μ₂, σ₂²) are independent, then

αX₁ + βX₂ ∼ N(αμ₁ + βμ₂, α²σ₁² + β²σ₂²).

Definition 2.1.2 (Gaussian random vector) An R^n-valued random vector X = (X₁, X₂, . . . , X_n)ᵀ has an n-variate Gaussian distribution with mean µ and covariance matrix Σ if X = µ + AZ, where A is an n × n matrix with Σ = AAᵀ, and Z = (Z₁, Z₂, . . . , Z_n)ᵀ is a vector with independent standard Gaussian (normal) components.

When n = 1, X is a (univariate) Gaussian random variable. If Σ is nonsingular, the probability density of X is

p(x) = (1/((2π)^{n/2} |Σ|^{1/2})) e^{−(x−µ)ᵀ Σ⁻¹ (x−µ)/2}.

Example 2.1.3 A set of random variables {X_i}_{i=1}^{n} is called jointly Gaussian if ∑_{i=1}^{n} a_i X_i is a Gaussian random variable for any a_i ∈ R. Then X = (X₁, X₂, . . . , X_n)ᵀ is a Gaussian random vector.

The correlation of two random variables (vectors) is a normalized version of the covariance, with values ranging from −1 to 1:

Corr(X, Y) = Cov[(X, Y)] / √(Var[X] Var[Y]), Cov[(X, Y)] = E[(X − E[X])(Y − E[Y])ᵀ].

When Corr(X, Y) = 0, we say X and Y are uncorrelated.

Definition 2.1.4 (Gaussian process) A collection of random variables is called a Gaussian process if the joint distribution of any finite number of its members is Gaussian. In other words, a Gaussian process is an R^d-valued stochastic process with continuous time (or index) t such that (X(t₀), X(t₁), · · · , X(t_n))ᵀ is an (n + 1)-dimensional Gaussian random vector for any 0 ≤ t₀ < t₁ < · · · < t_n.


The Gaussian process is denoted by X = {X(t)}_{t∈T}, where T is a set of indexes; here T = [0, ∞).

The consistency theorem of Kolmogorov [255, Theorem 2.2] implies that the finite-dimensional distributions of a Gaussian stochastic process X(t) are uniquely characterized by two functions: the mean function μ_t = E[X(t)] and the covariance function C(t, s) = Cov[X(t), X(s)].

A Gaussian process {X(t)}_{t∈T} is called a centered Gaussian process if the mean function μ(t) = E[X(t)] = 0 for all t ∈ T.

Given a function μ(t): T → R and a nonnegative definite function C(t, s): T × T → R, there exists a Gaussian process {X(t)}_{t∈T} with mean function μ(t) and covariance function C(t, s).

To find such a Gaussian process, we can use the following expansion.

Theorem 2.1.5 (Karhunen-Loeve expansion) Let X(t) be a Gaussian stochastic process defined over a probability space (Ω, F, P) with t ∈ [a, b], −∞ < a < b < ∞. Suppose that X(t) has a continuous covariance function C(t, s) = Cov[(X(t), X(s))] = E[(X(t) − E[X(t)])(X(s) − E[X(s)])]. Then X(t) admits the following representation

X(t) = E[X(t)] + ∑_{k=1}^{∞} Z_k e_k(t),

where the convergence is in L², uniform in t, i.e.,

lim_{n→∞} max_{t∈[a,b]} E[(X(t) − E[X(t)] − ∑_{k=1}^{n} Z_k e_k(t))²] = 0,

and

Z_k = ∫_a^b (X(t) − E[X(t)]) e_k(t) dt.

Here the eigenfunctions e_k of the covariance function C, with respective eigenvalues λ_k, form an orthonormal basis of L²([a, b]), and

∫_a^b C(t, s) e_k(t) dt = λ_k e_k(s), k ≥ 1.

Furthermore, the random variables Z_k have zero mean, are uncorrelated, and have variance λ_k:

E[Z_k] = 0 for all k ≥ 1, and E[Z_i Z_j] = δ_{ij} λ_j for all i, j ≥ 1.

This is a direct application of Mercer’s theorem [497] on the representation of a symmetric positive-definite function as a sum of a convergent sequence of product functions. The stochastic process X(t) can be non-Gaussian.

The covariance function can be represented as C(t, s) = ∑_{k=1}^{∞} λ_k e_k(t) e_k(s). The variance of X(t) is the sum of the variances of the individual components of the sum:


Var[X(t)] = E[(X(t) − E[X(t)])²] = ∑_{k=1}^{∞} e_k²(t) Var[Z_k] = ∑_{k=1}^{∞} λ_k e_k²(t).

Here the Z_k are uncorrelated random variables.

The domain where the process is defined can be extended to domains in R^d. In Table 2.1 we present a list of covariance functions commonly used in practice. Here the constant l is called the correlation length, K_ν is the modified Bessel function of order ν, and Γ(·) is the gamma function:

Γ(t) = ∫_0^∞ x^{t−1} e^{−x} dx, t > 0.

Table 2.1. A list of covariance functions.

Wiener process: min(x, y), x, y ≥ 0
White noise: σ² δ(x − y), x, y ∈ R^d
Gaussian: exp(−|x − y|²/(2l²)), x, y ∈ R^d
Exponential: exp(−|x − y|/l), x, y ∈ R^d
Matern kernel: (2^{1−ν}/Γ(ν)) (√(2ν)|x − y|/l)^ν K_ν(√(2ν)|x − y|/l), x, y ∈ R^d
Rational quadratic: (1 + |x − y|²)^{−α}, x, y ∈ R^d, α ≥ 0

Here are some examples of Karhunen-Loeve expansions of Gaussian processes.

Example 2.1.6 (Brownian motion) When C(t, s) = min(t, s), t ∈ [0, 1], the Gaussian process X(t) can be written as

X(t) = √2 ∑_{k=1}^{∞} ξ_k sin((k − 1/2)πt) / ((k − 1/2)π).

Here the ξ_k’s are mutually independent standard Gaussian random variables. One can show that for t, s ∈ [0, 1], the eigenfunctions of the covariance function min(t, s) are

e_k(t) = √2 sin((k − 1/2)πt),

and the corresponding eigenvalues are

λ_k = 1 / ((k − 1/2)² π²).

In the next section, we will see that the process in Example 2.1.6 is actually a Brownian motion.
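The expansion in Example 2.1.6 is straightforward to test numerically. The Python sketch below (function name and truncation level are illustrative choices) builds truncated sample paths and checks that the empirical variance is close to Var[X(t)] = min(t, t) = t.

```python
import numpy as np

def kl_brownian(t, n_terms, xi):
    """Truncated Karhunen-Loeve expansion of Brownian motion on [0, 1].

    Rows of xi are the N(0, 1) coefficients; columns are independent paths.
    """
    freq = (np.arange(1, n_terms + 1) - 0.5) * np.pi
    basis = np.sqrt(2.0) * np.sin(np.outer(t, freq)) / freq  # e_k(t)·√λ_k
    return basis @ xi

rng = np.random.default_rng(3)
t = np.linspace(0.0, 1.0, 101)
n_terms, n_paths = 500, 5000
xi = rng.standard_normal((n_terms, n_paths))
X = kl_brownian(t, n_terms, xi)   # shape (len(t), n_paths)
var_hat = X.var(axis=1)           # empirical Var[X(t)], should be close to t
```

Increasing `n_terms` reduces the truncation bias (the tail of the eigenvalue sum), while increasing `n_paths` reduces the Monte Carlo error in `var_hat`.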


Example 2.1.7 (Brownian bridge) Let X(t), 0 ≤ t ≤ 1, be the Gaussian process in Example 2.1.6. Then Y(t) = X(t) − tX(1), 0 ≤ t ≤ 1, is also a Gaussian process and admits the following Karhunen-Loeve expansion:

Y(t) = ∑_{k=1}^{∞} η_k √2 sin(kπt) / (kπ).

Here the η_k’s are mutually independent standard Gaussian random variables.

Example 2.1.8 (Ornstein-Uhlenbeck process) Consider a centered one-dimensional Gaussian process with the exponential covariance function exp(−|t − s|/l). The Karhunen-Loeve expansion of such a Gaussian process over [−a, a] is

O(t) = ∑_{k=1}^{∞} ξ_k √(λ_k) e_k(t),

where the eigenvalues are λ_k = 2l/(l²θ_k² + 1) and the corresponding eigenfunctions are

e_{2i}(t) = cos(θ_{2i}t) / √(a + sin(2θ_{2i}a)/(2θ_{2i})), e_{2i−1}(t) = sin(θ_{2i−1}t) / √(a − sin(2θ_{2i−1}a)/(2θ_{2i−1})), for all i ≥ 1, t ∈ [−a, a].

The θ_k’s are solutions of the transcendental equations

1 − lθ tan(aθ) = 0 (cosine modes) and lθ + tan(aθ) = 0 (sine modes).

See Section 2.3 (p. 23) of [155] or [245] for a derivation of such an expansion.

For more general covariance functions C(t, s), it may not be possible to find the eigenfunctions and eigenvalues explicitly. The Karhunen-Loeve expansion can then be computed numerically, and in practice only a finite number of terms in the expansion is required. Specifically, we usually perform a principal component analysis by truncating the sum at some N such that

(∑_{i=1}^{N} λ_i) / (∑_{i=1}^{∞} λ_i) = (∑_{i=1}^{N} λ_i) / (∫_a^b Var[X(t)] dt) ≥ α.

Here α is typically taken as 0.9, 0.95, or 0.99. The eigenvalues and eigenfunctions are found by solving numerically the following eigenproblem (integral equation):

∫_a^b C(t, s) e_k(t) dt = λ_k e_k(s), s ∈ [a, b], k = 1, 2, . . . , N.

See, e.g., Section 2.3 of [155] or [419] for a Galerkin method for this problem. We can also apply the Nystrom (quadrature) method, where the integral is replaced with a weighted sum at representative points. Several numerical methods for representing a stochastic process with a given covariance kernel are presented in [308, Chapter 7]; these are based on Fourier analysis and other techniques rather than on the Karhunen-Loeve expansion.
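The Nystrom approach mentioned above reduces the integral eigenproblem to a matrix eigenvalue problem. The Python sketch below (uniform grid, exponential covariance on [−a, a]; all parameters are illustrative) computes the discrete eigenvalues and the number N of modes needed to capture a fraction α = 0.95 of the total variance; note that the eigenvalue sum approximates ∫_{−a}^{a} Var[X(t)] dt = 2a.

```python
import numpy as np

# Nystrom (quadrature) approximation of the KL eigenproblem: replacing
# ∫ C(t, s) e_k(t) dt = λ_k e_k(s) by a weighted sum on a uniform grid
# turns it into a symmetric matrix eigenvalue problem.
n, a, l = 400, 1.0, 1.0
t = np.linspace(-a, a, n)
w = 2.0 * a / n                                   # uniform quadrature weight
C = np.exp(-np.abs(t[:, None] - t[None, :]) / l)  # exponential covariance
lam = np.linalg.eigvalsh(w * C)[::-1]             # eigenvalues, descending
energy = np.cumsum(lam) / lam.sum()
N = int(np.searchsorted(energy, 0.95)) + 1        # modes for 95% of variance
```

For smoother kernels (e.g., the Gaussian covariance) the same computation returns a much smaller N, reflecting the faster eigenvalue decay discussed below.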

The decay of the eigenvalues in the Karhunen-Loeve expansion depends on the smoothness of the covariance function.


Definition 2.1.9 ([419]) A covariance function C: D × D → R is said to be piecewise analytic/smooth/H^{p,q} on D × D, 0 ≤ p, q ≤ ∞, if there exist a partition D = {D_j}_{j=1}^{J} of D into a finite sequence of simplexes D_j and a finite family G = {G_j}_{j=1}^{J} of open sets in R^d such that

D = ∪_{j=1}^{J} D_j, D_j ⊆ G_j, 1 ≤ j ≤ J,

and such that C|_{D_j × D_{j′}} has an extension to G_j × G_{j′} which is analytic in D_j × D_{j′} / is smooth in D_j × D_{j′} / is in H^p(D_j) ⊗ H^q(D_{j′}), for any pair (j, j′).

The following conclusions on the eigenvalues in the Karhunen-Loeve expansion are from [419].

Theorem 2.1.10 Assume that C \in L^2(D \times D) is a symmetric covariance function that defines a compact and nonnegative operator on L^2(D) by \mathcal{C}u(x) = \int_D C(x, y)\, u(y)\, dy. If C is piecewise analytic on D \times D, then the eigenvalues \lambda_k in the Karhunen-Loeve expansion satisfy

0 \le \lambda_k \le K_1 e^{-K_2 k^{1/d}}, \quad k \ge 1.

The constants K_1 and K_2 depend only on the covariance function C and the domain D. If C is piecewise H^p(D) \otimes L^2(D) with p \ge 1, then the eigenvalues \lambda_k in the Karhunen-Loeve expansion decay algebraically fast:

0 \le \lambda_k \le K_3 k^{-p/d}, \quad k \ge 1.

For the Gaussian covariance function C(x, y) = \sigma^2 \exp(-|x - y|^2/\gamma^2), with \gamma the correlation length relative to \mathrm{diam}(D), the eigenvalues \lambda_k in the Karhunen-Loeve expansion decay exponentially fast:

0 \le \lambda_k \le K_4\, \gamma^{-k^{1/d}-2} / \Gamma(0.5\, k^{1/d}), \quad k \ge 1.

A different approach to showing the decay of the eigenvalues is presented in [308, Chapter 7], using Fourier analysis for isotropic covariance kernels (the two-point covariance kernel depends only on the distance between the two points).

Theorem 2.1.11 ([419]) Assume that the process a(x, \omega) has a covariance function C which is piecewise analytic/in H^{p,q} on D \times D. Then the eigenfunctions are analytic/in H^p in each D_j \in \mathcal{D}.

With further conditions on the domains D_j in \mathcal{D}, it can be shown that the derivatives of the eigenfunctions e_k(x) are bounded by |\lambda_k|^{-s} when C is piecewise smooth, where s > 0 is an arbitrary number.


2.2 Brownian motion and white noise

Definition 2.2.1 (One-dimensional Brownian motion) A one-dimensional continuous-time stochastic process W(t) is called a standard Brownian motion if

• W(t) is almost surely continuous in t,
• W(t) has independent increments,
• W(t) - W(s) obeys the normal distribution with mean zero and variance t - s, for 0 \le s < t,
• W(0) = 0.

It can be readily shown that W(t) is a Gaussian process. We then call \dot{W}(t) = \frac{d}{dt}W(t), formally the first-order derivative of W(t) in time, white noise.

By Example 2.1.6 and Exercise 2.7.7, the Brownian motion W(t), t \in [0, 1], can be represented by

W(t) = \sqrt{2} \sum_{i=1}^{\infty} \xi_i\, \frac{\sin\big((i - \frac{1}{2})\pi t\big)}{(i - \frac{1}{2})\pi}, \quad t \in [0, 1],

where the \xi_i's are mutually independent standard Gaussian random variables. Brownian motion and white noise can also be defined in terms of orthogonal expansions. Suppose that \{m_k(t)\}_{k \ge 1} is a complete orthonormal system (CONS) in L^2([0, T]). The Brownian motion W(t), t \in [0, T], can be defined by (see, e.g., [315])

W(t) = \sum_{i=1}^{\infty} \xi_i \int_0^t m_i(s)\, ds, \quad t \in [0, T], \qquad (2.2.1)

where ξi’s are mutually independent standard Gaussian random variables.It can be checked that the Gaussian process defined by (2.2.1) is indeed astandard Brownian motion by Definition 2.2.1. Correspondingly, the whitenoise is defined by

\dot{W}(t) = \sum_{i=1}^{\infty} \xi_i\, m_i(t), \quad t \in [0, T]. \qquad (2.2.2)

When m_i(t) = \sqrt{2/T}\, \cos((i - 1/2)\pi t/T), i \ge 1, the representation (2.2.1) coincides with the Karhunen-Loeve expansion of Brownian motion in Example 2.1.6 when T = 1.
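The covariance structure implied by (2.2.1) can be spot-checked numerically; a small Python sketch (illustrative; the book's own codes are in Matlab) verifies that \sum_i \big(\int_0^t m_i\big)\big(\int_0^s m_i\big) converges to \min(t, s) for the cosine basis:

```python
import numpy as np

T = 1.0
i = np.arange(1, 100_001)

def Mi(t):
    # int_0^t m_i(s) ds for m_i(s) = sqrt(2/T) cos((i - 1/2) pi s / T)
    return np.sqrt(2 * T) * np.sin((i - 0.5) * np.pi * t / T) / ((i - 0.5) * np.pi)

t, s = 0.7, 0.4
var_t = np.sum(Mi(t) ** 2)       # -> Var[W(t)] = t
cov_ts = np.sum(Mi(t) * Mi(s))   # -> E[W(t) W(s)] = min(t, s)

# one sample-path value: W(t) ~ sum_i xi_i * Mi(t)
rng = np.random.default_rng(0)
w_t = np.sum(rng.standard_normal(i.size) * Mi(t))
```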

Definition 2.2.2 (Multidimensional Brownian motion) A continuous stochastic process W_t = (W_1(t), \ldots, W_m(t))^\top is called an m-dimensional Brownian motion on R^m when the W_i(t) are mutually independent standard Brownian motions on R.


Definition 2.2.3 (Multidimensional Brownian motion, alternative definition) An R^d-valued continuous Gaussian process X(t) with mean function \mu(t) = E[X(t)] and covariance function C(t, s) = \mathrm{Cov}[X(t), X(s)] = E[(X(t) - \mu(t))(X(s) - \mu(s))^\top] is called a d-dimensional Brownian motion if for any 0 \le t_0 < t_1 < \cdots < t_n,

• X(t_i) and X(t_{i+1}) - X(t_i) are independent;
• the covariance function C(t_i, t_j) (a matrix) is a diagonal matrix with entries \min(t_i, t_j), 0 \le i, j \le n.

When \mu(t) = 0 for all t and C(t, s) = \min(t, s)\, I_d, the Gaussian process is called a standard Brownian motion.

2.2.1 Some properties of Brownian motion

Theorem 2.2.4 The covariance \mathrm{Cov}[W(t), W(s)] = E[W(t)W(s)] = \min(t, s). Moreover:

• Time-homogeneity: for any s > 0, \widetilde{W}(t) = W(t + s) - W(s) is a Brownian motion, independent of \sigma(W(u), u \le s).
• Brownian scaling: for every c > 0, cW(t/c^2) is a Brownian motion.
• Time inversion: let \widetilde{W}(0) = 0 and \widetilde{W}(t) = tW(1/t) for t > 0. Then \widetilde{W}(t) is a Brownian motion.

Corollary 2.2.5 (Strong law of large numbers for Brownian motion) If W(t) is a Brownian motion, then it holds almost surely that

\lim_{t \to \infty} \frac{W(t)}{t} = 0.

Theorem 2.2.6 (Law of the iterated logarithm) Let W_t be a standard Brownian motion. Then

P\Big(\limsup_{t \to 0^+} \frac{W_t}{\sqrt{2t \log\log(1/t)}} = 1\Big) = 1, \qquad P\Big(\liminf_{t \to 0^+} \frac{W_t}{\sqrt{2t \log\log(1/t)}} = -1\Big) = 1,

P\Big(\limsup_{t \to \infty} \frac{W_t}{\sqrt{2t \log\log t}} = 1\Big) = 1, \qquad P\Big(\liminf_{t \to \infty} \frac{W_t}{\sqrt{2t \log\log t}} = -1\Big) = 1.

Example 2.2.7 (Ornstein-Uhlenbeck process) Consider a centered one-dimensional Gaussian process with exponential covariance function \exp(-|t - s|/\sigma). This Gaussian process is usually called an Ornstein-Uhlenbeck process. Suppose that W(t) is a standard Brownian motion. For t \ge 0, the Ornstein-Uhlenbeck process can be written as

O(t) = e^{-t/\sigma}\, W\big(e^{2t/\sigma}\big).
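The time-change representation can be verified directly from E[W(u)W(v)] = \min(u, v); a short Python check (illustrative, with a hypothetical value of \sigma) confirms the exponential covariance:

```python
import numpy as np

sigma = 1.5  # illustrative correlation time

def ou_cov_via_bm(t, s):
    # Cov of O(t) = exp(-t/sigma) W(exp(2t/sigma)), using E[W(u)W(v)] = min(u, v)
    return np.exp(-(t + s) / sigma) * min(np.exp(2 * t / sigma),
                                          np.exp(2 * s / sigma))

grid = [0.0, 0.3, 0.9, 2.0]
errs = [abs(ou_cov_via_bm(t, s) - np.exp(-abs(t - s) / sigma))
        for t in grid for s in grid]
```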


Example 2.2.8 The Brownian bridge X(t) is a one-dimensional Gaussian process with time t \in [0, 1] and covariance

\mathrm{Cov}[X(t), X(s)] = \min(t, s) - ts = \begin{cases} s(1 - t), & 0 \le s \le t \le 1, \\ t(1 - s), & 0 \le t \le s \le 1. \end{cases}

Suppose that W(t) is a standard Brownian motion. Then X(t) can be represented by

X(t) = W(t) - tW(1) = t\big(W(t) - W(1)\big) + (1 - t)\big(W(t) - W(0)\big), \quad 0 \le t \le 1.

The process X(t) bridges W(t) - W(1) and W(t) - W(0). It can be readily verified that \mathrm{Cov}[X(t), X(s)] = \min(t, s) - ts and that X(t) is continuous and starts from 0. Moreover,

W(t) = (t + 1)\, X\Big(\frac{t}{t + 1}\Big), \qquad X(t) = (1 - t)\, W\Big(\frac{t}{1 - t}\Big).
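Both identities can be checked by expanding covariances with E[W(a)W(b)] = \min(a, b); a brief Python sketch (illustrative):

```python
import numpy as np

def bridge_cov(t, s):
    # Cov[X(t), X(s)] for X(t) = W(t) - t W(1), expanded by bilinearity
    return min(t, s) - t * min(s, 1.0) - s * min(t, 1.0) + t * s

def bm_cov_from_bridge(t, s):
    # Cov of (t + 1) X(t / (t + 1)) -- should recover min(t, s)
    u, v = t / (1.0 + t), s / (1.0 + s)
    return (1.0 + t) * (1.0 + s) * bridge_cov(u, v)

pts = [0.1, 0.4, 0.7, 1.0]
err1 = max(abs(bridge_cov(t, s) - (min(t, s) - t * s)) for t in pts for s in pts)
err2 = max(abs(bm_cov_from_bridge(t, s) - min(t, s)) for t in pts for s in pts)
```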

Regularity of Brownian motion

A deterministic function f(x), x \in R, is Holder continuous of order \alpha \in (0, 1] if there exists a constant C such that

|f(x + h) - f(x)| \le C h^{\alpha}, \quad \text{for all } h > 0 \text{ and all } x.

When \alpha = 1, we call f Lipschitz continuous. When C depends on x, we call f locally Holder continuous of order \alpha:

|f(x + h) - f(x)| \le C(x)\, h^{\alpha}, \quad \text{for all sufficiently small } h > 0.

Definition 2.2.9 Consider two stochastic processes X(t) and Y(t) defined on the probability space (\Omega, \mathcal{F}, P). We call Y(t) a modification (or version) of X(t) if for every t \ge 0 we have

P(X(t) = Y(t)) = 1.

Theorem 2.2.10 (Kolmogorov and Centsov continuity theorem, [255, Section 2.2.B]) Given a stochastic process X(t) with t \in [a, b], if there exist constants p, r, K > 0 such that

E[|X(t) - X(s)|^p] \le K |t - s|^{1+r}, \quad \text{for } t, s \in [a, b],

then X(t) has a modification Y(t) which is almost surely continuous: for all t, s \in [a, b],

|Y(t, \omega) - Y(s, \omega)| \le C(\omega)\, |t - s|^{\alpha}, \quad 0 < \alpha < \frac{r}{p}.


For X(t), t \in T \subseteq R^d, if there exist constants p, r, K > 0 such that

E[|X(t) - X(s)|^p] \le K |t - s|^{d+r}, \quad \text{for } t, s \in T,

then X(t) has a modification Y(t) which is almost surely continuous, with

E\Big[\Big(\sup_{s \ne t} \frac{|Y(t, \omega) - Y(s, \omega)|}{|t - s|^{\alpha}}\Big)^p\Big] < \infty, \quad 0 < \alpha < \frac{r}{p}.

Theorem 2.2.11 For \alpha < \frac{1}{2}, Brownian motion has a modification which is locally Holder continuous of order \alpha.

Proof. For integer n \ge 1, by the Kolmogorov and Centsov continuity theorem it suffices to show that

E[|W(t) - W(s)|^{2n}] \le C_n |t - s|^n.

Recalling the conclusion of Exercise 2.7.1 then gives Holder continuity of any order \alpha < \frac{n-1}{2n}; letting n \to \infty yields any \alpha < \frac{1}{2}.

Theorem 2.2.12 ([255, Section 2.9.D]) Brownian motion is nowhere differentiable: for almost all \omega, the sample path (realization, trajectory) W(t, \omega) is nowhere differentiable as a function of t. Moreover, for almost all \omega, the path W(t, \omega) is nowhere Holder continuous with exponent \alpha > \frac{1}{2}.

Definition 2.2.13 (p-variation) The p-variation of a real-valued function f defined on an interval [a, b] \subset R is the quantity

|f|_{p,TV} = \sup_{\Pi_n} \sum_{i=0}^{n-1} |f(x_{i+1}) - f(x_i)|^p,

where the supremum runs over the set of all partitions \Pi_n = \{a = x_0 < x_1 < \cdots < x_n = b\} of the given interval.

Theorem 2.2.14 (Unbounded total variation of Brownian motion)The paths (realizations, trajectories) of Brownian motion are of infinite totalvariation almost surely (a.s., with probability one).

Proof. Without loss of generality, let us consider the interval [0, 1]. We have

|W|_{1,TV} = \sup_{\Pi_n} \sum_{i=0}^{n-1} |W(t_{i+1}) - W(t_i)| \ge \sum_{i=0}^{n-1} \Big|W\Big(\frac{i+1}{n}\Big) - W\Big(\frac{i}{n}\Big)\Big| =: V_n.

Write W\big(\frac{i+1}{n}\big) - W\big(\frac{i}{n}\big) = \frac{\xi_i}{\sqrt{n}}. Then the \xi_i's are i.i.d. \mathcal{N}(0, 1) random variables. Observe that E[V_n] = \sqrt{n}\, E[|\xi_1|] and \mathrm{Var}[V_n] = 1 - (E[|\xi_1|])^2.


It then follows from the Chebyshev inequality (see Appendix D) that

P\Big(V_n \ge \frac{1}{2} E[|\xi_1|] \sqrt{n}\Big) = P\Big(V_n - E[|\xi_1|]\sqrt{n} \ge -\frac{1}{2} E[|\xi_1|] \sqrt{n}\Big)
\ge 1 - P\Big(\big|V_n - E[|\xi_1|]\sqrt{n}\big| \ge \frac{1}{2} E[|\xi_1|] \sqrt{n}\Big)
\ge 1 - \frac{\mathrm{Var}[V_n]}{\big(\frac{1}{2} E[|\xi_1|] \sqrt{n}\big)^2} = 1 - 4\, \frac{1 - (E[|\xi_1|])^2}{n\, (E[|\xi_1|])^2}.

Thus we have

P\Big(|W|_{1,TV} \ge \frac{E[|\xi_1|]}{2} \sqrt{n}\Big) \ge P\Big(V_n \ge \frac{E[|\xi_1|]}{2} \sqrt{n}\Big) \ge 1 - 4\, \frac{1 - (E[|\xi_1|])^2}{n\, (E[|\xi_1|])^2}.

Letting n \to \infty, we obtain

P(|W|_{1,TV} = \infty) = 1.
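The growth E[V_n] = \sqrt{n}\, E[|\xi_1|] in the proof is easy to observe numerically; a Python sketch (illustrative; sample sizes are arbitrary), using E[|\xi_1|] = \sqrt{2/\pi}:

```python
import numpy as np

rng = np.random.default_rng(0)

def mean_variation(n, reps=500):
    # V_n = sum_i |W((i+1)/n) - W(i/n)| on [0, 1]; increments are xi_i / sqrt(n)
    xi = rng.standard_normal((reps, n))
    return np.mean(np.abs(xi / np.sqrt(n)).sum(axis=1))

v = {n: mean_variation(n) for n in (10, 100, 1000)}
# E[V_n] = sqrt(n) E|xi_1| = sqrt(2 n / pi), which diverges as n grows
```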

2.2.2 Approximation of Brownian motion

According to the representations in Section 2.2, we have at least three ways to approximate Brownian motion by a finite number of random variables.

By Definition 2.2.1, the Brownian motion at time t_{n+1} can be approximated by

\sum_{i=0}^{n} \Delta W_i = \sum_{i=0}^{n} \sqrt{\Delta t_i}\, \xi_i, \quad \text{where } \Delta W_i = W(t_{i+1}) - W(t_i) \text{ and } \Delta t_i = t_{i+1} - t_i, \qquad (2.2.3)

where ξi’s are i.i.d. standard Gaussian random variables. A sample path(realization, trajectory) of Brownian motion is illustrated in Figure 2.1. Hereis Matlab code for generating Figure 2.1.

Code 2.1. A sample path of Brownian motion.

% One realization of W(t) at time grids k*dt
clc, clear all
t = 2.5;
n = 1000;
dt = t / n;
% Increments of Brownian motion
Winc = zeros(n + 1, 1);
% Set the state of the random number generator -- Mersenne Twister
rng(100, 'twister');
Winc(1:n) = sqrt(dt) * randn(n, 1);
% Brownian motion -- cumulative sum of all previous increments
W(2:n+1, 1) = cumsum(Winc(1:n));
figure(10)
plot((0:n).' * dt, W, 'b-', 'Linewidth', 2);
xlabel('t')
ylabel('W(t)')
axis tight

One popular approximation of Brownian motion in continuous time is the piecewise linear approximation (also known as the polygonal approximation; see, e.g., [457, 481, 482] or [241, p. 396]), i.e.,

W^{(n)}(t) = W(t_i) + \big(W(t_{i+1}) - W(t_i)\big)\, \frac{t - t_i}{t_{i+1} - t_i}, \quad t \in [t_i, t_{i+1}). \qquad (2.2.4)
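A piecewise linear interpolant of a sampled path is one line with numpy; the sketch below (Python, illustrative grid and seed) implements (2.2.4) and matches the path at the grid points:

```python
import numpy as np

rng = np.random.default_rng(1)
T, K = 1.0, 16
tk = np.linspace(0.0, T, K + 1)
W = np.concatenate([[0.0],
                    np.cumsum(np.sqrt(np.diff(tk)) * rng.standard_normal(K))])

def W_lin(t):
    # piecewise linear (polygonal) approximation (2.2.4) of the sampled path
    return np.interp(t, tk, W)

mid = 0.5 * (tk[2] + tk[3])
```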

Another way to approximate Brownian motion is by a truncated orthogonal expansion:

W^{(n)}(t) = \sum_{i=1}^{n} \xi_i \int_0^t m_i(s)\, ds, \quad \xi_j = \int_0^T m_j(t)\, dW(t), \quad t \in [0, T], \qquad (2.2.5)

where {mi(t)} is a CONS in L2([0, T ]) and ξj are mutually independentstandard Gaussian random variables.

In this book, we mostly use the cosine basis {ml(s)}l≥1 given by

Fig. 2.1. An illustration of a sample path of Brownian motion using cumulative summation of increments.


m_1(s) = \frac{1}{\sqrt{T}}, \quad m_l(s) = \sqrt{\frac{2}{T}}\, \cos\Big(\frac{\pi(l-1)s}{T}\Big), \quad l \ge 2, \; 0 \le s \le T, \qquad (2.2.6)

or a piecewise version of the spectral expansion (2.2.5), obtained by taking a partition of [0, T], e.g., 0 = t_0 < t_1 < \cdots < t_{K-1} < t_K = T. We then have

W^{(n,K)}(t) = \sum_{k=1}^{K} \sum_{l=1}^{n} M_{k,l}(t)\, \xi_{k,l}, \quad \xi_{k,l} = \int_{I_k} m_{k,l}(s)\, dW(s), \qquad (2.2.7)

where M_{k,l}(t) = \int_{t_{k-1}}^{t_k \wedge t} m_{k,l}(s)\, ds (with t_k \wedge t = \min(t_k, t)), \{m_{k,l}\}_{l=1}^{\infty} is a CONS in L^2(I_k), and I_k = [t_{k-1}, t_k). The random variables \xi_{k,l} are i.i.d. standard Gaussian random variables. Sometimes (2.2.7) is written as

W^{(n,K)}(t) = \sum_{k=1}^{K} \sum_{l=1}^{n} \int_0^t 1_{I_k}(s)\, m_{k,l}(s)\, ds\; \xi_{k,l}, \qquad (2.2.8)

where 1_{\cdot} is the indicator function. Here different choices of CONS lead to different representations. The orthonormal piecewise constant basis over the time interval I_k = [t_{k-1}, t_k), with h_k = (t_k - t_{k-1})/n, is

m_{k,l}(t) = \frac{1}{\sqrt{h_k}}\, \chi_{[t_{k-1}+(l-1)h_k,\, t_{k-1}+l h_k)}(t), \quad l = 1, 2, \ldots, n. \qquad (2.2.9)

When n = 1, this basis gives the classical piecewise linear interpolation (2.2.4). The orthonormal Fourier basis gives Wiener's representation (see, e.g., [259, 358, 391]):

m_{k,1}(t) = \frac{1}{\sqrt{t_k - t_{k-1}}}, \quad m_{k,2l}(t) = \sqrt{\frac{2}{t_k - t_{k-1}}}\, \sin\Big(2l\pi\, \frac{t - t_{k-1}}{t_k - t_{k-1}}\Big),

m_{k,2l+1}(t) = \sqrt{\frac{2}{t_k - t_{k-1}}}\, \cos\Big(2l\pi\, \frac{t - t_{k-1}}{t_k - t_{k-1}}\Big), \quad t \in [t_{k-1}, t_k). \qquad (2.2.10)

Note that taking only m_{k,1} (n = 1) in (2.2.10) again leads to the piecewise linear interpolation (2.2.4). Besides, we can also use the Haar wavelet basis, which gives the Levy-Ciesielsky representation [255].

Remark 2.2.15 Once we have a formal representation (approximation) of Brownian motion, we can readily obtain a formal representation (approximation) of white noise, and thus we skip the formulas for white noise.

Lemma 2.2.16 Consider a uniform partition of [0, T], i.e., t_k = k\Delta, K\Delta = T. For t \in [0, T], there exists a constant C > 0 such that

E\big[\big(W(t) - W^{(n,K)}(t)\big)^2\big] \le C\, \frac{\Delta}{n},


and for sufficiently small \varepsilon > 0,

\big|W(t) - W^{(n,K)}(t)\big| \le O\big((\Delta/n)^{1/2-\varepsilon}\big).¹ \qquad (2.2.11)

For t = t_k, we have

W(t_k) - W^{(n,K)}(t_k) = 0, \qquad (2.2.12)

if the CONS \{m_{k,l}\}_{l=1}^{\infty} contains \frac{1}{\sqrt{t_k - t_{k-1}}} as an element, i.e.,

\int_{t_{k-1}}^{t_k} m_{k,l}(s)\, \frac{1}{\sqrt{t_k - t_{k-1}}}\, ds = \delta_{l,1}.

Proof. By the spectral approximation (2.2.7) of W(t) and the fact that the \xi_{k,l} are i.i.d., we have

E\big[\big(W(t) - W^{(n,K)}(t)\big)^2\big] = \sum_{k=1}^{K} \sum_{l=n+1}^{\infty} \Big(\int_{t_{k-1}}^{t \wedge t_k} m_{k,l}(s)\, ds\Big)^2 = \sum_{k=1}^{K} \sum_{l=n+1}^{\infty} \Big(\int_{t_{k-1}}^{t_k} \chi_{[0,t]}(s)\, m_{k,l}(s)\, ds\Big)^2 \le C\, \frac{\Delta}{n},

where t_k \wedge t = \min(t_k, t). Here we have applied the standard estimate for the L^2-projection onto the piecewise orthonormal basis m_{k,l}(s) (see, e.g., [417]) and used the fact that the indicator function \chi_{[0,t]}(s) belongs to the Sobolev space H^{1/2}((0, T)) for any T > t. Once we have the L^2-estimate, we can apply the Borel-Cantelli lemma (see Appendix D) to obtain the almost sure (a.s.) convergence (2.2.11).

If t = t_k, we have \int_{t_{k-1}}^{t_k} m_{k,l}(s)\, ds = 0 for any l \ge 2 whenever m_{k,1} = \frac{1}{\sqrt{t_k - t_{k-1}}}, and thus (2.2.12) holds.

Though any CONS in L^2([0, T]) can be used in the spectral approximation (2.2.5), we use a CONS containing a constant in the basis. Consequently, we have the following relation:

\int_{t_n}^{t_{n+1}} dW^{(n)}(t) = \Delta W_n, \quad \Delta W_n = W(t_{n+1}) - W(t_n). \qquad (2.2.13)

We will use these approximations in most chapters of this book for the Wong-Zakai approximation.

¹The big "O" implies that the error is bounded by a positive constant times the term in the parentheses.


2.3 Brownian motion and stochastic calculus

As Brownian motion W(t) is not a process of bounded variation, the integral \int_0^t f(s)\, dW(s) cannot be interpreted in the Riemann-Stieltjes or Lebesgue-Stieltjes sense, even for a very smooth stochastic process f. However, it can be understood as an Ito integral or a Stratonovich integral. For a process f(t) adapted to the natural filtration of Brownian motion, the Ito integral is defined by requiring (see, e.g., [388]), over partitions of the interval [0, T],

\lim_{|\Pi_n| \to 0} E\Big[\Big(\int_0^T f(t)\, dW(t) - \sum_{i=0}^{n-1} f(t_i)\, \Delta W_i\Big)^2\Big] = 0, \quad \Delta W_i = W(t_{i+1}) - W(t_i),

where \Pi_n = \{0 = t_0 < t_1 < t_2 < \cdots < t_n = T\} is a partition of the interval [0, T] and |\Pi_n| = \max_{0 \le i \le n-1} |t_{i+1} - t_i|. The finite sum in this definition is evaluated at the left endpoint of each subinterval of the partition. For the Stratonovich integral, the finite sum is evaluated at the midpoint of each subinterval, i.e.,

\int_0^T f(t) \circ dW(t) = \lim_{|\Pi_n| \to 0} \sum_{i=0}^{n-1} f\Big(\frac{t_i + t_{i+1}}{2}\Big)\, \Delta W_i.

Again, the limit is understood in the mean-square sense, see, e.g., [388, 431].

Example 2.3.1 It can be readily checked that

\int_0^T W(t)\, dW(t) = \frac{W^2(T) - T}{2}, \qquad \int_0^T W(t) \circ dW(t) = \frac{W^2(T)}{2}. \qquad (2.3.1)
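Both limits in (2.3.1) can also be observed numerically: a left-point Riemann sum gives the Ito value, while averaging the two endpoints (which telescopes exactly to W^2(T)/2) gives the Stratonovich value. A Python sketch (illustrative; the book's codes are in Matlab):

```python
import numpy as np

rng = np.random.default_rng(3)
n, T = 200_000, 1.0
W = np.concatenate([[0.0],
                    np.cumsum(np.sqrt(T / n) * rng.standard_normal(n))])
dW = np.diff(W)
ito = np.sum(W[:-1] * dW)                      # left-point (Ito) sum
strat = np.sum(0.5 * (W[:-1] + W[1:]) * dW)    # averaged-endpoint (Stratonovich) sum
```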

Let us show that the first formula holds. By simple calculation and the properties of the increments of Brownian motion, for 0 = t_0 < t_1 < t_2 < \cdots < t_n = T, we have

E\Big[\Big(\frac{W^2(T) - T}{2} - \sum_{i=0}^{n-1} W(t_i)\, \Delta W_i\Big)^2\Big]
= E\Big[\Big(\frac{W^2(T) - T}{2} - \sum_{i=0}^{n-1} \frac{W(t_i) + W(t_{i+1})}{2}\, \Delta W_i + \frac{1}{2} \sum_{i=0}^{n-1} (\Delta W_i)^2\Big)^2\Big]
= E\Big[\Big(\frac{W^2(T) - T}{2} - \sum_{i=0}^{n-1} \frac{W^2(t_{i+1}) - W^2(t_i)}{2} + \frac{1}{2} \sum_{i=0}^{n-1} (\Delta W_i)^2\Big)^2\Big]
= E\Big[\Big(-\frac{T}{2} + \frac{1}{2} \sum_{i=0}^{n-1} (\Delta W_i)^2\Big)^2\Big] \to 0, \quad |\Pi_n| \to 0.

The second formula can be derived similarly. Moreover, we have the following conversion formula.


Theorem 2.3.2 (Conversion of a Stratonovich integral to an Ito integral) A Stratonovich integral can be computed via the Ito integral:

\int_0^T f(t, W(t)) \circ dW(t) = \int_0^T f(t, W(t))\, dW(t) + \frac{1}{2} \int_0^T \partial_x f(t, W(t))\, dt.

Here f(t, W(t)) is a scalar function and \partial_x f is the derivative with respect to the second argument of f. When f \in R^{m \times n} is a matrix function, then

\Big[\int_0^T f(t, W(t)) \circ dW(t)\Big]_i = \Big[\int_0^T f(t, W(t))\, dW(t)\Big]_i + \frac{1}{2} \int_0^T \sum_{j=1}^{n} \partial_{x_j} f_{i,j}(t, W(t))\, dt, \quad i = 1, 2, \ldots, m.

Here v_i means the i-th component of a vector v.

The proof can be done using the definitions of the two integrals and the mean value theorem. We leave it to interested readers.

Definition 2.3.3 (Quadratic covariation) The quadratic covariation of two processes X and Y is

[X, Y]_t = \lim_{|\Pi_n| \to 0} \sum_{k=1}^{n} \big(X(t_k) - X(t_{k-1})\big)\big(Y(t_k) - Y(t_{k-1})\big).

Here \Pi_n = \{0 = t_0 < t_1 < \cdots < t_{n-1} < t_n = t\} is an arbitrary partition of the interval [0, t].

When X = Y, the quadratic covariation becomes the quadratic variation:

[X]_t = [X, X]_t = \lim_{|\Pi_n| \to 0} \sum_{k=1}^{n} \big(X(t_k) - X(t_{k-1})\big)^2.

The quadratic covariation can be computed by the polarization identity:

[X, Y]_t = \frac{1}{4}\big([X + Y]_t - [X - Y]_t\big).
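A Python sketch (illustrative seed and partition size): along a discrete partition the polarization identity holds exactly as algebra, while \sum (\Delta W_i)^2 approximates [W]_1 = 1 and the covariation of two independent Brownian motions is near zero.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
dW = np.sqrt(1.0 / n) * rng.standard_normal(n)  # increments of W on [0, 1]
dV = np.sqrt(1.0 / n) * rng.standard_normal(n)  # an independent Brownian motion
quad_var = np.sum(dW ** 2)                       # approximates [W]_1 = 1
cross = np.sum(dW * dV)                          # approximates [W, V]_1 = 0
polar = 0.25 * (np.sum((dW + dV) ** 2) - np.sum((dW - dV) ** 2))
```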

With the definition of quadratic covariation, we have

\int_0^T f(t, W(t)) \circ dW(t) = \int_0^T f(t, W(t))\, dW(t) + \frac{1}{2} \int_0^T \partial_x f(t, W(t))\, d[W, W]_t.

More generally, we have the following conversion rule:

\int_0^t Y(s) \circ dX(s) = \int_0^t Y(s)\, dX(s) + \frac{1}{2} [X, Y]_t. \qquad (2.3.2)


For the Ito integral, we have the following properties. Define

L^2_{ad}(\Omega; L^2([a, b])) = \Big\{ f_t(\omega) \;\Big|\; f_t(\omega) \text{ is } \mathcal{F}_t\text{-measurable and } E\Big[\int_a^b f_s^2\, ds\Big] < \infty \Big\}.

Theorem 2.3.4 For f, g \in L^2_{ad}(\Omega; L^2([0, T])), we have:

• (Linearity) \int_0^t (af(s) + bg(s))\, dW(s) = a \int_0^t f(s)\, dW(s) + b \int_0^t g(s)\, dW(s), for a, b \in R.

• (Ito isometry) E\Big[\Big(\int_0^t f(s)\, dW(s)\Big)^2\Big] = \int_0^t E[f^2(s)]\, ds.

• (Generalized Ito isometry) E\Big[\int_0^t f(s)\, dW(s) \int_0^t g(s)\, dW(s)\Big] = \int_0^t E[f(s)\, g(s)]\, ds.

• M_t = \int_0^t f(s)\, dW(s) is a continuous martingale. Moreover, the quadratic variation of M_t is |M|_{2,TV} = \int_0^t f^2(s)\, ds, 0 \le t \le T, and

E\Big[\sup_{0 \le t \le T} \Big(\int_0^t f(s)\, dW(s)\Big)^2\Big] \le 4\, E\Big[\int_0^T f^2(s)\, ds\Big].

Example 2.3.5 (Multiple Ito integral) Assuming that W(t) is a standard Brownian motion, show that

\int_0^t \int_0^{t_n} \cdots \int_0^{t_2} dW(t_1) \cdots dW(t_n) = \frac{t^{n/2}}{n!}\, H_n\Big(\frac{W(t)}{\sqrt{t}}\Big).

Here Hn is the n-th Hermite polynomial:

Hn(x) = (−1)nex2/2 dn

dxne−x2/2. (2.3.3)

When n = 0, H_0(x) = 1, and we use the convention that for n < 1 the integral is defined as 1. When n = 1, \int_0^t dW(t_1) = W(t) = t^{1/2} H_1(W(t)/\sqrt{t}), as H_1(x) = x. It can then be shown by induction that the integrand on the left-hand side is in L^2_{ad}(\Omega; L^2([0, t])), and thus the multiple integral is indeed an Ito integral and is equal to the right-hand side.
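The identity can be spot-checked for small n: numpy's probabilists' Hermite polynomials match (2.3.3), and the closed forms J_2 = (W^2 - t)/2 and J_3 = (W^3 - 3tW)/6 follow from Example 2.3.1 and the Ito formula. A Python sketch (illustrative t and samples):

```python
import math
import numpy as np
from numpy.polynomial.hermite_e import hermeval

rng = np.random.default_rng(4)
t = 0.7
w = np.sqrt(t) * rng.standard_normal(5)   # sampled values of W(t)

def rhs(n, w, t):
    # (t^{n/2} / n!) H_n(W(t)/sqrt(t)); He_n is the Hermite polynomial in (2.3.3)
    c = np.zeros(n + 1)
    c[n] = 1.0
    return t ** (n / 2) / math.factorial(n) * hermeval(w / np.sqrt(t), c)

j2 = (w ** 2 - t) / 2           # double iterated Ito integral
j3 = (w ** 3 - 3 * t * w) / 6   # triple iterated Ito integral
```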

When we want to define the integral \int_0^t f(s)\, dW(s) via a spectral representation of Brownian motion instead of using increments of Brownian motion, we have to use the so-called Ito-Wick product (Wick product) and the Stratonovich product.


The use of the Ito-Wick product relies on two facts: the integrand f can be expressed as a Hermite polynomial expansion in some random variables (a "random basis"), and the product is well defined for these basis elements. The basis and the Ito-Wick product will be introduced and defined shortly. Specifically, let \{\xi_k\}_{k=1}^{\infty} be a sequence of mutually independent standard Gaussian random variables from a spectral representation of Brownian motion and let \mathcal{F} = \sigma(\xi_k)_{k \ge 1}. The following Cameron-Martin theorem states that any element of L^2(\Omega, \mathcal{F}, P) can be represented as a linear combination of elements of the Cameron-Martin basis (2.3.4).

Theorem 2.3.6 (Cameron-Martin [57]) Let (\Omega, \mathcal{F}, P) be a complete probability space. The collection \Xi = \{\xi_\alpha\}_{\alpha \in \mathcal{J}} is an orthonormal basis in L^2(\Omega, \mathcal{F}, P), where the \xi_\alpha's are defined as

\xi_\alpha := \prod_{l} \Big(\frac{H_{\alpha_l}(\xi_l)}{\sqrt{\alpha_l!}}\Big), \quad \xi_l = \int_0^t m_l(s)\, dW(s), \quad \alpha \in \mathcal{J}, \qquad (2.3.4)

where \{m_l\} is a complete orthonormal basis in L^2([0, t]) and \mathcal{J} is the collection of multi-indices \alpha = (\alpha_l)_{l \ge 1} of finite length, i.e.,

\mathcal{J} = \Big\{ \alpha = (\alpha_l)_{l \ge 1},\; \alpha_l \in \{0, 1, 2, \ldots\},\; |\alpha| := \sum_{l} \alpha_l < \infty \Big\}.

Any \eta \in L^2(\Omega, \mathcal{F}, P) can be represented as the following Wiener chaos expansion:

\eta = \sum_{\alpha \in \mathcal{J}} \eta_\alpha \xi_\alpha, \quad \eta_\alpha = E[\eta\, \xi_\alpha], \quad \text{and} \quad E[\eta^2] = \sum_{\alpha \in \mathcal{J}} \eta_\alpha^2. \qquad (2.3.5)

The collection Ξ of random variables {ξα, α ∈ J } is called the Cameron-Martin basis. It can be readily shown that E[ξαξβ ] = 1 if α = β and 0otherwise. See some specific examples of the Cameron-Martin basis in Sec-tion 2.5.3.

Following [223, Section 2.5] and [316], under certain conditions on f(t) (a continuous semimartingale with respect to the natural filtration of Brownian motion), we have

\int_0^t f(s)\, dW(s) = \int_0^t f(s) \diamond \dot{W}(s)\, ds, \qquad (2.3.6)

where the definition of the Ito-Wick product "\diamond" is based on the product of elements of the Cameron-Martin basis:

\xi_\alpha \diamond \xi_\beta = \sqrt{\frac{(\alpha + \beta)!}{\alpha!\, \beta!}}\; \xi_{\alpha + \beta}.


With the approximation (2.2.5), Ogawa [386] defined the following so-called Ogawa integral:

\int_0^t f(s) * dW(s) = \sum_{i=1}^{\infty} \int_0^t f(s)\, m_i(s)\, ds \int_0^t m_i(s)\, dW(s), \quad t \in [0, T], \qquad (2.3.7)

where \{m_i(t)\} is the CONS in L^2([0, T]). (Note that Ogawa's original integral is only defined when t = T.) Nualart and Zakai [385] proved that the Ogawa integral is equivalent to the Stratonovich integral whenever the Ogawa integral exists, with the Stratonovich integral defined as

\int_0^t f(s) \circ dW(s) = \lim_{|\Pi_n| \to 0} \sum_{i=0}^{n-1} \frac{1}{t_{i+1} - t_i} \int_{t_i}^{t_{i+1}} f(s)\, ds\; \big(W(t_{i+1}) - W(t_i)\big). \qquad (2.3.8)

Moreover, if the integrand f(t) is a continuous semimartingale (e.g., an Ito process in Definition 2.4.2) with respect to the natural filtration of W(t), then the Ogawa integral coincides with the Stratonovich integral defined at the midpoints:

\int_0^t f(s) \circ dW(s) = \lim_{|\Pi_n| \to 0} \sum_{i=0}^{n-1} f\Big(\frac{t_i + t_{i+1}}{2}\Big)\big(W(t_{i+1}) - W(t_i)\big). \qquad (2.3.9)

As an application of stochastic integrals, the fractional Brownian motion B^H(t), t \ge 0, can be introduced. It is a centered Gaussian process with the following covariance function:

E[B^H(t) B^H(s)] = \frac{1}{2}\big(|t|^{2H} + |s|^{2H} - |t - s|^{2H}\big), \quad 0 < H < 1.

The constant H is called the Hurst index or Hurst parameter. The fractional Brownian motion can be represented by

B^H(t) = B^H(0) + \frac{1}{\Gamma(H + 1/2)} \Big\{ \int_{-\infty}^{0} \big[(t - s)^{H-1/2} - (-s)^{H-1/2}\big]\, dW(s) + \int_0^t (t - s)^{H-1/2}\, dW(s) \Big\}.
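Since only the covariance function is needed, exact Gaussian samples of B^H on a grid can be drawn by a Cholesky factorization of the covariance matrix; a Python sketch (grid, Hurst index, and sample size are illustrative):

```python
import numpy as np

def fbm_cov(t, s, H):
    return 0.5 * (np.abs(t) ** (2 * H) + np.abs(s) ** (2 * H)
                  - np.abs(t - s) ** (2 * H))

rng = np.random.default_rng(5)
H, n, m = 0.75, 40, 20_000
t = np.linspace(0.025, 1.0, n)
C = fbm_cov(t[:, None], t[None, :], H)
L = np.linalg.cholesky(C + 1e-12 * np.eye(n))   # jitter for numerical safety
paths = L @ rng.standard_normal((n, m))          # columns are exact samples of B^H
emp = paths @ paths.T / m                        # empirical covariance
```

Note that H = 1/2 recovers standard Brownian motion, since the covariance then reduces to \min(t, s).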

2.4 Stochastic chain rule: Ito formula

One motivation for the Ito formula is to evaluate Ito integrals with complicated integrands.

Theorem 2.4.1 (Ito formula in the simplest form) If f and its first twoderivatives are continuous on R, then it holds with probability one (almostsurely, a.s.) that


f(W(t)) = f(W(t_0)) + \int_{t_0}^{t} f'(W(s))\, dW(s) + \frac{1}{2} \int_{t_0}^{t} f''(W(s))\, ds.

From the theorem, we can compute the Ito integral \int_{t_0}^{t} f'(W(s))\, dW(s) by

\int_{t_0}^{t} f'(W(s))\, dW(s) = f(W(t)) - f(W(t_0)) - \frac{1}{2} \int_{t_0}^{t} f''(W(s))\, ds.

Definition 2.4.2 (Ito process) An Ito process is a stochastic process of the form

X_t = X(t_0) + \int_{t_0}^{t} a(s)\, ds + \int_{t_0}^{t} \sigma(s)\, dW(s),

where X(t_0) is \mathcal{F}_{t_0}-measurable, a_s and \sigma_s are adapted w.r.t. \mathcal{F}_s, and

\int_{t_0}^{t} |a(s)|\, ds < \infty, \qquad \int_{t_0}^{t} \|\sigma(s)\|^2\, ds < \infty \quad \text{a.s.}

The filtration \{\mathcal{F}_s, t_0 \le s \le t\} is defined such that

• for any s, W_s, a(s), and \sigma(s) are \mathcal{F}_s-measurable;
• for any t_1 \le t_2, W_{t_2} - W_{t_1} is independent of \mathcal{F}_{t_1}.

Suppose that X(t) exists a.s. such that

X(t) = X(t_0) + \int_{t_0}^{t} a(s, X(s))\, ds + \sum_{r=1}^{m} \int_{t_0}^{t} \sigma_r(s, X(s))\, dW_r(s).

Here X(t), X(t_0), a, \sigma_r \in R^d and \sigma \in R^{d \times m}. Also, the W_r(s) are mutually independent Brownian motions, a(s) and \sigma(s) are adapted w.r.t. \mathcal{F}_s, and

\int_{t_0}^{t} |a(s, X(s))|\, ds < \infty, \qquad \sum_{r=1}^{m} \int_{t_0}^{t} |\sigma_r(s, X(s))|^2\, ds < \infty \quad \text{a.s.}

The filtration \{\mathcal{F}_s, t_0 \le s \le t\} is defined such that for any s, W_r(s) is \mathcal{F}_s-measurable, and for any t_1 \le t_2, W_r(t_2) - W_r(t_1) is independent of \mathcal{F}_{t_1}. The Ito formula for a C^1([0, T]; C^2(R^d)) function f(t, \cdot) is

f(t, X(t)) = f(t_0, X(t_0)) + \sum_{r=1}^{m} \int_{t_0}^{t} \Lambda_r f(s, X(s))\, dW_r(s) + \int_{t_0}^{t} \mathcal{L} f(s, X(s))\, ds,

where

\Lambda_r = \sigma_r^\top \nabla = \sum_{i=1}^{d} \sigma_{i,r} \frac{\partial}{\partial x_i}, \qquad \nabla = \Big(\frac{\partial}{\partial x_1}, \frac{\partial}{\partial x_2}, \cdots, \frac{\partial}{\partial x_d}\Big)^\top,

\mathcal{L} = \frac{\partial}{\partial t} + a^\top \nabla + \frac{1}{2} \sum_{r=1}^{m} \sum_{i,j=1}^{d} \sigma_{i,r} \sigma_{j,r} \frac{\partial^2}{\partial x_i \partial x_j}.


Remark 2.4.3 For the multi-dimensional Ito formula, we can use the following multiplication table to memorize the formula when the W_j(t) are mutually independent Brownian motions:

×          dW_j(t)        dt
dW_i(t)    1_{i=j} dt     0
dt         0              0

Theorem 2.4.4 (Integration by parts formula) Let X(t) and Y(t) be Ito processes as defined in Definition 2.4.2. Then the following integration by parts formula holds:

X(t)Y(t) = X(t_0)Y(t_0) + \int_{t_0}^{t} X(s)\, dY(s) + \int_{t_0}^{t} Y(s)\, dX(s) + \int_{t_0}^{t} dX(s)\, dY(s).

Here dX(s)\, dY(s) can be formally computed using the table in Remark 2.4.3. This integration by parts formula is a corollary of the multi-dimensional Ito formula for Ito processes.

Consider two Ito processes X and Y: dX = a_X(t)\, dt + \sigma_X(t)\, dW(t) and dY = a_Y(t)\, dt + \sigma_Y(t)\, dW(t). Then we have from Remark 2.4.3 that

dX\, dY = \big(a_X(t)\, dt + \sigma_X(t)\, dW(t)\big)\big(a_Y(t)\, dt + \sigma_Y(t)\, dW(t)\big) = \sigma_X(t)\, \sigma_Y(t)\, dt.

2.5 Integration methods in random space

Numerical methods for SODEs and SPDEs usually depend on the Monte Carlo method and its variants to obtain the desired statistics of the solutions. The fundamental quantity of interest is of the following form:

\frac{1}{(\sqrt{2\pi})^d} \int_{R^d} f(x)\, e^{-\frac{|x|^2}{2}}\, dx.

This is a standard numerical integration problem. Let us consider d = 1 in a general setting. An integral can be treated as the expectation of a random variable:

\int_{R} f(x)\, p(x)\, dx = E[f(X)], \qquad (2.5.1)

where X has a probability density p(x) \ge 0 with \int_{R} p(x)\, dx = 1 (here p(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}).

2.5.1 Monte Carlo method and its variants

A simple (standard, brute-force) Monte Carlo estimator of this integral is

\mu = E[f(X)] \approx \frac{1}{n} \sum_{i=1}^{n} f(X_i) =: \hat{\mu}, \qquad (2.5.2)


where the X_i are copies of X (i.i.d. with X). The convergence of (2.5.2) is guaranteed by the law of large numbers. Observe that \hat{\mu} is a random variable; the mean of this estimator is E[\hat{\mu}] = \mu, and the variance of the estimator is

\mathrm{Var}[\hat{\mu}] = \frac{\mathrm{Var}[f(X)]}{n}. \qquad (2.5.3)

The error of the Monte Carlo estimator is measured by the following confidence interval (95%):

\Big(\hat{\mu} - 1.96\, \sqrt{\frac{\mathrm{Var}[f(X)]}{n}},\;\; \hat{\mu} + 1.96\, \sqrt{\frac{\mathrm{Var}[f(X)]}{n}}\Big), \qquad \mathrm{Var}[f(X)] = E[f^2(X)] - (E[f(X)])^2.

In practice, since we do not know \mathrm{Var}[f(X)], we replace it with the empirical variance (2.5.4) below.

The error estimate of the Monte Carlo estimator is based on the central limit theorem. Specifically, when n is large, the Monte Carlo estimator \hat{\mu} is treated as a Gaussian random variable and

Z := \frac{\hat{\mu} - \mu}{\sqrt{\mathrm{Var}[f(X)]/n}} = \frac{\hat{\mu} - \mu}{\sqrt{\mathrm{Var}[\hat{\mu}]}}

is a standard Gaussian random variable. Here the number 1.96² is the approximate value of z such that P(|Z| \le z) = 0.95, and the number 1.96\, \sqrt{\mathrm{Var}[f(X)]/n} is called the statistical error.

The convergence rate of the Monte Carlo estimator can also be shown by the Chebyshev inequality, using the fact that E[\hat{\mu}] = \mu and (2.5.3):

P\Big(|E[f(X)] - \hat{\mu}| \ge \sqrt{\frac{\mathrm{Var}[\hat{\mu}]}{\varepsilon}}\Big) = P\Big(|E[f(X)] - \hat{\mu}| \ge n^{-1/2} \sqrt{\frac{\mathrm{Var}[f(X)]}{\varepsilon}}\Big) \le \varepsilon.

For any fixed \varepsilon, the error of the Monte Carlo estimator decreases at the rate O(n^{-1/2}) as we increase the number of samples.

In practice, the random numbers X_i are replaced with pseudorandom numbers produced by pseudorandom number generators, see, e.g., [358, 376]. Also, the variance \mathrm{Var}[f(X)] is replaced by its numerical value (the empirical variance):

\sqrt{\frac{1}{n-1} \Big(\sum_{i=1}^{n} f^2(X_i) - n\hat{\mu}^2\Big)} \quad \text{or} \quad \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} \big(f(X_i) - \hat{\mu}\big)^2}. \qquad (2.5.4)

In Matlab (version 2011b and later), the estimator of E[X^p] (p \ge 1, X \sim \mathcal{N}(0, 1)) and its error can be implemented as follows.

²The value 2 is sometimes used in this book.


Code 2.2. Estimation of the second moment of a standard Gaussian random variable.

% Declare the state of the random number generator -- Mersenne Twister
rng(100, 'twister');
n = 1000;                            % the number of sample points
X = randn(n, 1);
p = 2;
mu_hat = mean(X.^p);
mu = 1;                              % E[X^2] = 1 for X standard Gaussian
mc_int_err = mu_hat - mu;            % integration error
stat_err = 1.96 * sqrt(var(X.^p) / n);   % statistical error

When the variance \mathrm{Var}[f(X)] is large, the Monte Carlo estimator \hat{\mu} is not accurate: the empirical variance in (2.5.4) may be large, the statistical error in the confidence interval may be so large that the interval cannot be trusted, and, moreover, the empirical variance itself may not be reliable either.

To have an accurate Monte Carlo estimator, i.e., a small confidence interval, e.g., when we want √(Var[f(X)]/n) = 10^{−2} but Var[f(X)] = 10, we then need n = 10^4 Var[f(X)] = 10^5 Monte Carlo sample points. To reduce this number of Monte Carlo sample points, we need the variance Var[f(X)] to be small. To "reduce" the variance, there are several methods available, such as importance sampling (change of measure), control variates, and stratified sampling (decomposing the sampling domain into smaller sub-domains). For such variance reduction methods, one can refer to many books on this topic, such as [259, 264, 376].
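As a quick illustration of the control variate idea, here is a minimal Python sketch (the listings in this chapter are otherwise in Matlab; the choice f(X) = e^X with control g(X) = X is ours, for illustration only). The control has known mean E[g] = 0, and the coefficient λ is estimated from the samples themselves.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
x = rng.standard_normal(n)

f = np.exp(x)   # integrand f(X) = e^X with E[f] = e^{1/2} ~ 1.6487
g = x           # control variate with known mean E[g] = 0

# Estimate the near-optimal coefficient lambda = Cov(f, g) / Var(g) from the samples
lam = np.cov(f, g)[0, 1] / np.var(g)

cv = f - lam * (g - 0.0)   # control-variate corrected samples

est_plain, est_cv = f.mean(), cv.mean()
var_plain, var_cv = f.var(ddof=1), cv.var(ddof=1)
print(est_plain, est_cv)   # both near 1.6487
print(var_plain, var_cv)   # var_cv is noticeably smaller than var_plain
```

Both estimators target the same expectation, but the corrected samples have smaller empirical variance, hence a tighter confidence interval for the same n.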

For one-dimensional integration, the convergence rate of the Monte Carlo method is only O(n^{−1/2}), too slow to compete with deterministic numerical integration methods. The advantage of Monte Carlo methods is, in fact, their efficiency for high-dimensional integration (large d): the statistical error of a Monte Carlo estimator does not depend on the dimensionality.

Multilevel Monte Carlo method

One of the recently developed variance reduction methods, the so-called multilevel Monte Carlo method (see, e.g., [156]), has attracted a lot of attention for numerical SODEs and SPDEs. The idea of the multilevel Monte Carlo method is to write the desired statistics as a telescoping sum and then to sample the difference terms (between terms defined on two different mesh sizes) in the telescoping sum, where the corresponding "variance" is small, with a small number of sample paths. The multilevel Monte Carlo method for (2.5.1) is based on the following formulation.3

3As a control variate method, it can be written in the following form:

∫_R f(x)p(x) dx = ∫_R [f(x) − λ(g(x) − E[g])] p(x) dx,


34 2 Brownian motion and stochastic calculus

∫_R f(x)p(x) dx = ∫_R [f(x) − f₀(x)] p(x) dx + ∫_R f_L(x) p(x) dx + Σ_{l=0}^{L−1} ∫_R [f_l(x) − f_{l+1}(x)] p(x) dx, L ≥ 0.⁴

The control variate f₀ is chosen such that f₀ ≈ f, it is much cheaper to simulate f_L (and E[f_L]), and f_l − f_{l+1} has smaller variance than f. Consider L = 0. The Monte Carlo estimate is then

(1/N₀) Σ_{k=1}^{N₀} f₀(X_k) + (1/N₁) Σ_{k=1}^{N₁} [f(X_{N₀+k}) − f₀(X_{N₀+k})].

Let C₀ be the cost of one sample in the Monte Carlo estimate of E[f₀] and C₁ the cost of one sample in the Monte Carlo estimate of E[f − f₀]. The total cost of the estimate is N₀C₀ + N₁C₁, while the variance is N₀^{−1} Var[f₀] + N₁^{−1} Var[f − f₀]. For a fixed

cost, we can minimize the variance by choosing

C₀/C₁ = (V₀/N₀²)/(V₁/N₁²), i.e., N₁/N₀ = √(V₁/C₁)/√(V₀/C₀).

For L > 0 the idea is similar, and one can take N_l proportional to √(V_l/C_l). Applied to the simulation of statistics of solutions to SDEs, the difference terms can be defined on finer meshes (smaller time step sizes), which admit smaller variances and thus require a smaller number of sample paths. The computational cost is thus reduced.

2.5.2 Quasi-Monte Carlo methods

Quasi-Monte Carlo (QMC) methods were originally designed as deterministic integration methods in random space and allowed only moderately high dimensional integration, see, e.g., [376, 423]. QMC has an estimator similar to the Monte Carlo estimator:

QMC_n(f) = (1/n) Σ_{k=1}^{n} f(x_k).

However, one significant difference is that the sequence x_k is deterministic instead of a random or pseudo-random sequence. The sequence is designed to

where the control variate g(x) has known expectation E[g] (w.r.t. p(x)) and g is well correlated with f; the optimal value of λ can be estimated from a few samples.

4We use the convention that when L = 0, the summation is zero.


provide better uniformity (measured in discrepancy) than a random sequence and to satisfy the basic (worst-case) bound

|E[f] − QMC_n(f)| ≤ C (log n)^k n^{−1} ‖∂^d f/(∂x₁ ··· ∂x_d)‖_{1,TV},

where the constants C > 0 and k ≥ 0 do not depend on n but may depend on the dimension d. Here it is required that the mixed derivative of f have bounded total variation, while MC requires only a bounded variance. Compared to Monte Carlo methods, the convergence rate is approximately proportional to n^{−1} instead of n^{−1/2}, and there is no statistical error since a deterministic sequence is used. However, the convergence deteriorates when the dimension d is large (in practice d is usually less than 40). To overcome the dependence on dimension, one can use randomized quasi-Monte Carlo methods.

Some commonly used quasi-Monte Carlo sequences are the Halton, Sobol, and Niederreiter sequences, etc. In an example in Chapter 2.5.5 and in Chapter 9.4, we will use randomized quasi-Monte Carlo sequences, the Halton sequence, and the Sobol sequence, and the Matlab code for these sequences is provided.
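For illustration, here is a self-contained Python sketch of the Halton sequence built from radical inverses (the Matlab codes later in this chapter use the built-in haltonset instead); it integrates f(x, y) = xy over [0, 1]² with n = 4096 deterministic points, whose exact value is 1/4.

```python
def van_der_corput(k, base):
    """Radical inverse of the integer k in the given base (one Halton coordinate)."""
    q, bk = 0.0, 1.0 / base
    while k > 0:
        k, r = divmod(k, base)
        q += r * bk
        bk /= base
    return q

def halton(n, bases=(2, 3)):
    """First n points of the Halton sequence in dimension len(bases)."""
    return [[van_der_corput(k, b) for b in bases] for k in range(1, n + 1)]

# Deterministic QMC estimate of E[xy] for (x, y) uniform on [0, 1]^2 (exact: 1/4)
pts = halton(4096)
qmc_est = sum(x * y for x, y in pts) / len(pts)
print(qmc_est)  # close to 0.25
```

Unlike a pseudorandom estimate, rerunning this code always gives the same value; there is no statistical error, only the deterministic discrepancy-driven error.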

We refer to [56] for an introduction to the Monte Carlo method and quasi-Monte Carlo methods, and to [115] for recent developments in deterministic and randomized quasi-Monte Carlo methods.

Compared to the Monte Carlo-type methods, the following two methods have no statistical errors and allow efficient short-time integration of SPDEs.

2.5.3 Wiener chaos expansion method

The Cameron-Martin theorem (Theorem 2.3.6) provides a spectral representation of square-integrable stochastic processes defined on the complete probability space (Ω, F, P). This representation is also known as the Wiener chaos expansion, see, e.g., [155, 479, 485].

Let us review what we have in Chapter 2.3. The Cameron-Martin basis is listed in Table 2.2, where there is only one Wiener process in (2.3.4).

Table 2.2. Some elements of the Cameron-Martin basis ξα in (2.3.4).

|α|  α                                                  ξα
0    α = (0, 0, . . .)                                  1
1    α = (0, . . . , 0, 1, 0, . . .)                    H₁(ξᵢ) = ξᵢ
2    α = (0, . . . , 0, 2, 0, . . .)                    H₂(ξᵢ)/√2 = (ξᵢ² − 1)/√2
2    α = (0, . . . , 0, 1, 0, . . . , 0, 1, 0, . . .)   H₁(ξᵢ)H₁(ξⱼ) = ξᵢξⱼ


In practice, we need to truncate the number of random variables, i.e., let the elements of α with large indices be zero. To be precise, we introduce the following notation for the order of a multi-index α:

d(α) = max{l ≥ 1 : α_{k,l} > 0 for some k ≥ 1}.

Also, we need to limit |α|. We define the truncated set of multi-indices

J_{N,n} = {α ∈ J : |α| ≤ N, d(α) ≤ n}.

This set involves only n random variables, and the number of its elements is

Σ_{i=0}^{N} C(n+i−1, i) = C(n+N, N) = C(n+N, n),

where C(a, b) denotes the binomial coefficient.

In Table 2.3, we list the elements of a truncated Cameron-Martin basis.

Table 2.3. Elements of a truncated Cameron-Martin basis ξα for a finite dimensional random space, where α ∈ J_{N,n}, N = 2 and n = 2.

|α|  α            ξα
0    α = (0, 0)   1
1    α = (1, 0)   H₁(ξ₁) = ξ₁
1    α = (0, 1)   H₁(ξ₂) = ξ₂
2    α = (2, 0)   H₂(ξ₁)/√2 = (ξ₁² − 1)/√2
2    α = (0, 2)   H₂(ξ₂)/√2 = (ξ₂² − 1)/√2
2    α = (1, 1)   H₁(ξ₁)H₁(ξ₂) = ξ₁ξ₂

More examples of the basis can be generated using the representation of the Hermite polynomials (2.3.3). Here are the first seven Hermite polynomials:

H₀(x) = 1,  H₁(x) = x,  H₂(x) = x² − 1,  H₃(x) = x³ − 3x,
H₄(x) = x⁴ − 6x² + 3,  H₅(x) = x⁵ − 10x³ + 15x,  H₆(x) = x⁶ − 15x⁴ + 45x² − 15.

The Hermite polynomials can be represented (computed) by the three-term recurrence relation

H_{n+1}(x) = xH_n(x) − nH_{n−1}(x), n ≥ 1, H₀(x) = 1, H₁(x) = x. (2.5.5)
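The recurrence (2.5.5) is straightforward to implement; a small Python sketch (the listings in this chapter are otherwise in Matlab):

```python
def hermite(n, x):
    """Probabilists' Hermite polynomial H_n(x) via H_{n+1} = x*H_n - n*H_{n-1}."""
    h_prev, h = 1.0, x   # H_0 and H_1
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, x * h - k * h_prev
    return h

print([hermite(n, 2.0) for n in range(7)])  # [1.0, 2.0, 3.0, 2.0, -5.0, -18.0, -11.0]
```

The printed values agree with the closed forms above evaluated at x = 2, e.g., H₆(2) = 64 − 240 + 180 − 15 = −11.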

Let us now consider the Wiener chaos expansion for computing the integral (2.5.1). The method is based on the Wiener chaos expansion of f(X), where X is a standard Gaussian random variable. Suppose that

f(X) = Σ_{j=0}^{∞} f_j ξ_j, where the ξ_j's are from the Cameron-Martin basis.


Once we find the f_j, we compute the integral (2.5.1) by

E[f(X)] = E[Σ_{j=0}^{∞} f_j ξ_j] = f₀.

In practice, the coefficients f_j of the Wiener chaos expansion can be numerically computed using the governing stochastic equation. Note that the ξ_j are orthonormal, and thus E[ξ_j] = 0 for j ≥ 1 and f_j = E[f(X)ξ_j]. Moreover, we can numerically compute E[g(f(X))] by

E[g(f(X))] ≈ E[g(Σ_{j=0}^{N} f_j ξ_j)].

We will illustrate the idea of the Wiener chaos method as a numerical method in Chapter 2.5.5. We will see that the Wiener chaos method is essentially a spectral Galerkin method in the random space.
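As a sketch of how the coefficients f_j = E[f(X)ξ_j] can be computed when f is known explicitly (the choice f(x) = x³ is ours, for illustration), one can use Gauss-Hermite quadrature in Python; here ξ_j = H_j/√(j!), and one expects f₀ = E[f(X)] = 0, f₁ = 3, f₃ = √6, with all other coefficients vanishing.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss
from math import factorial, sqrt

nodes, weights = hermegauss(10)
weights = weights / weights.sum()   # normalize to the standard Gaussian measure

def he(n, x):
    """Probabilists' Hermite polynomial H_n evaluated on an array x."""
    h_prev, h = np.ones_like(x), x
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, x * h - k * h_prev
    return h

# Chaos coefficients f_j = E[f(X) xi_j] with xi_j = H_j / sqrt(j!), for f(x) = x^3
f_vals = nodes ** 3
coeffs = [float(np.sum(weights * f_vals * he(j, nodes))) / sqrt(factorial(j))
          for j in range(5)]
print(coeffs)  # f_0 = 0, f_1 = 3, f_3 = sqrt(6); the rest vanish
```

The quadrature is exact here since all integrands are polynomials of degree at most 7 with 10 nodes; in the SODE setting of Chapter 2.5.5 the coefficients come instead from the propagator of the discretized equation.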

2.5.4 Stochastic collocation method

In the framework of deterministic integration methods for SPDEs in random space, another solution for nonlinear SPDEs, or for linear SPDEs with random coefficients, is to employ collocation techniques in random space. Here, by stochastic collocation methods we mean the sampling strategies using high dimensional deterministic quadratures (with certain polynomial exactness) to evaluate desired expectations of solutions to SPDEs.

Let us now consider the stochastic collocation method (SCM) for computing the integral (2.5.1). Note that the deterministic integral can be computed by any quadrature rule. Here we consider a sequence of one-dimensional Gauss–Hermite quadrature rules Q_n, with number of nodes n ∈ N, for univariate functions ψ(y), y ∈ R:

Q_n ψ(y) = Σ_{α=1}^{n} ψ(y_{n,α}) w_{n,α}, (2.5.6)

where y_{n,1} < y_{n,2} < ··· < y_{n,n} are the roots of the Hermite polynomial H_n(y) = (−1)^n e^{y²/2} (d^n/dy^n) e^{−y²/2}, and w_{n,α} = n!/(n²[H_{n−1}(y_{n,α})]²) are the associated weights. It is known that Q_n ψ is exactly equal to the integral I₁ψ when ψ is a polynomial of degree less than or equal to 2n − 1, i.e., the polynomial degree of exactness of the Gauss–Hermite quadrature rules Q_n is equal to 2n − 1. The integral (2.5.1) can be computed by

E[f(X)] = ∫_R f(x)p(x) dx ≈ Σ_{k=0}^{N} f(y_{N,k}) w_k.


The Gauss-Hermite quadrature points y_{N,k} are the zeros (roots) of the (N+1)-th order Hermite polynomial H_{N+1}, and the w_k's are the corresponding quadrature weights. The statistics E[g(f(X))] can be approximated by

E[g(f(X))] ≈ Σ_{k=0}^{N} g(f(y_{N,k})) w_k.
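A minimal Python check of such a rule, using NumPy's probabilists' Gauss-Hermite nodes and weights (numpy.polynomial.hermite_e.hermegauss; the raw weights sum to √(2π), so we normalize them): with n = 8 nodes the rule reproduces the Gaussian moments E[X⁴] = 3 and E[X⁶] = 15, since both degrees are below 2n − 1 = 15.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

n = 8
x, w = hermegauss(n)   # probabilists' Gauss-Hermite nodes and weights
w = w / w.sum()        # normalize: weights now sum to 1 for the N(0,1) measure

moment4 = float(np.sum(w * x**4))   # exact, since degree 4 <= 2n - 1 = 15
moment6 = float(np.sum(w * x**6))
print(moment4, moment6)   # 3.0 and 15.0 (up to roundoff)
```

This is the one-dimensional building block that the sparse grid rules below combine in high dimensions.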

The quadrature rule here is a starting point of stochastic collocation methods. As we are usually looking for numerical approximations of some statistics like E[f(X)], we can simply look for "function values" at some deterministic quadrature points corresponding to certain quadrature rules. The stochastic collocation methods then collocate a stochastic equation at these deterministic quadrature points, using the classical collocation method in random space, and seek the "function values" at the quadrature points by solving the resulting equations they satisfy.

We will illustrate the idea of the stochastic collocation method as a numerical method in Chapter 2.5.5. The stochastic collocation method is a spectral collocation method in the random space.

Smolyak’s sparse grid

Sparse grid quadrature is a certain reduction of product quadrature rules, which decreases the number of quadrature nodes and allows effective integration in moderately high dimensions [425] (see also [154, 381, 477]).

Consider a d-dimensional integral of a function ϕ(y), y ∈ R^d, with respect to a Gaussian measure:

I_d ϕ := (1/(2π)^{d/2}) ∫_{R^d} ϕ(y) exp(−(1/2) Σ_{i=1}^{d} y_i²) dy₁ ··· dy_d. (2.5.7)

We can approximate the multidimensional integral I_dϕ by a tensor product quadrature rule

I_dϕ ≈ Ĩ_dϕ := Q_n ⊗ Q_n ⊗ ··· ⊗ Q_n ϕ(y₁, y₂, ··· , y_d) = Q_n^{⊗d} ϕ(y₁, y₂, ··· , y_d)
     = Σ_{α₁=1}^{n} ··· Σ_{α_d=1}^{n} ϕ(y_{n,α₁}, …, y_{n,α_d}) w_{n,α₁} ··· w_{n,α_d}, (2.5.8)

where for simplicity we use the same number of nodes in all directions. The quadrature Ĩ_dϕ is exact for all polynomials from the space P_{k₁} ⊗ ··· ⊗ P_{k_d} with max_{1≤i≤d} k_i = 2n − 1, where P_k is the space of one-dimensional polynomials of degree less than or equal to k (we note in passing that this fact is easy to prove using probabilistic representations of I_dϕ and Ĩ_dϕ). The computational cost of quadrature rules is measured in terms of the number of function evaluations, which is equal to n^d in the case of the tensor product (2.5.8), i.e., the computational cost of (2.5.8) grows exponentially fast with the dimension.


The sparse grid of Smolyak [425] reduces the computational complexity of the tensor product rule (2.5.8) by exploiting difference quadrature formulas:

A(L, d)ϕ := Σ_{d ≤ |i| ≤ L+d−1} (Q_{i₁} − Q_{i₁−1}) ⊗ ··· ⊗ (Q_{i_d} − Q_{i_d−1}) ϕ,

where Q₀ = 0 and i = (i₁, i₂, …, i_d) is a multi-index with i_k ≥ 1 and |i| = i₁ + i₂ + ··· + i_d. The number L is usually referred to as the level of the sparse grid. The sparse grid rule can also be written in the following form [477]:

A(L, d)ϕ = Σ_{L ≤ |i| ≤ L+d−1} (−1)^{L+d−1−|i|} C(d−1, |i|−L) Q_{i₁} ⊗ ··· ⊗ Q_{i_d} ϕ, (2.5.9)

where C(a, b) denotes the binomial coefficient.

The quadrature A(L, d)ϕ is exact for polynomials from the space P_{k₁} ⊗ ··· ⊗ P_{k_d} with |k| = 2L − 1, i.e., for polynomials of total degree up to 2L − 1 [381, Corollary 1].

Denote the set of sparse grid points x_κ = (x¹_κ, ··· , x^d_κ) by H^{nq}_L, where x^j_κ (1 ≤ j ≤ d) belongs to the set of points used by the quadrature rule Q_{i_j}. According to (2.5.9), we only need to know the function values at the sparse grid H^{nq}_L:

A(L, d)ϕ = Σ_{κ=1}^{η(L,d)} ϕ(x_κ) W_κ, x_κ = (x¹_κ, ··· , x^d_κ) ∈ H^{nq}_L, (2.5.10)

where the W_κ are determined by (2.5.9) and the choice of the quadrature rules Q_{i_j}; they are called the sparse grid quadrature weights. Due to (2.5.9), the total number of nodes used by this sparse grid rule is estimated by

#H^{nq}_L = η(L, d) ≤ Σ_{L ≤ |i| ≤ L+d−1} i₁ × ··· × i_d.

Table 2.4 lists the number of sparse grid points up to level 5 when the level is not greater than d.

Table 2.4. The number of sparse grid points for the sparse grid quadrature (2.5.9) using the one-dimensional Gauss-Hermite quadrature rule (2.5.6), when the sparse grid level L ≤ d.

L        1   2      3           4                       5
η(L, d)  1   2d+1   2d²+2d+1    (4/3)d³+2d²+(14/3)d+1   (2/3)d⁴+(4/3)d³+(22/3)d²+(8/3)d+1


The quadrature Ĩ_dϕ from (2.5.8) is exact for polynomials of total degree 2L − 1 when n = L. It is not difficult to see that if the required polynomial exactness (in terms of total degree of polynomials) is relatively small, then the sparse grid rule (2.5.9) substantially reduces the number of function evaluations compared with the tensor-product rule (2.5.8). For instance, suppose that the dimension d = 40 and the required polynomial exactness is equal to 5 (i.e., L = n = 3). Then the cost of the tensor product rule (2.5.8) is 3⁴⁰ ≈ 1.2158 × 10¹⁹, while the cost of the sparse grid rule (2.5.9) based on the one-dimensional rule (2.5.6) is 3281.
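This cost comparison can be reproduced in a few lines of Python from the level-3 count in Table 2.4:

```python
# Tensor product: n = 3 Gauss-Hermite nodes per direction in d = 40 dimensions,
# versus the level-3 sparse grid count eta(3, d) = 2d^2 + 2d + 1 from Table 2.4.
d, n = 40, 3
tensor_cost = n ** d
sparse_cost = 2 * d**2 + 2 * d + 1
print(tensor_cost)  # 12157665459056928801, about 1.2158e19
print(sparse_cost)  # 3281
```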

2.5.5 Application to SODEs

Let us consider the stochastic differential equation (1.1.2):

dX = W(t)X dt, 0 < t ≤ 1, X(0) = X₀ = 1, (2.5.11)

where W(t) is a standard Brownian motion. We employ the simplest time discretization, the forward Euler scheme. For a uniform partition of [0, 1], t_k = kh, 1 ≤ k ≤ N, with Nh = 1, the forward Euler scheme is

X_{k+1} = X_k + W(t_k) X_k h, 0 ≤ k ≤ N − 1. (2.5.12)

The goal here is to numerically compute the mean and variance of the solution, or simply E[X_N^p], p = 1, 2. Here we notice that W(t_k) needs further discretization. We recall from Chapter 2.2.2 that there are two ways to approximate W(t_k). The first one is to use increments of Brownian motion (2.2.3), and the forward Euler scheme becomes

X_{k+1} = X_k + √h (Σ_{i=0}^{k} ξ_i) X_k h, 0 ≤ k ≤ N − 1. (2.5.13)

Here ξ₀ = 0. Then the solution can be represented by X_N(ξ₁, ξ₂, …, ξ_{N−1}), and the desired statistics are of the form (2.5.7). We can then use the methods described in the last section.

For Monte Carlo methods, we can use pseudo-random number generators. In Matlab, the Monte Carlo method for (2.5.13) can be implemented as follows:

Code 2.3. Monte Carlo method with the forward Euler scheme for Equation (2.5.11) using (2.5.13).

clc, clear all
rng(100, 'twister'); % for repeatable pseudorandom sequences
t_final = 1; x_ini = 1;
N = 1000; h = t_final/N;
num_sample_path = 1e4;
% time marching, Euler scheme
W_k = 0;
X_k = x_ini*ones(num_sample_path,1);
for k = 1:N-1
    W_k = W_k + sqrt(h)*randn(num_sample_path,1);
    X_k = X_k + W_k.*X_k*h;
end
X_mean = mean(X_k);
X_second_moment = mean(X_k.^2);
X_mean_stat_error = 1.96*sqrt(var(X_k)/num_sample_path);
X_second_moment_stat_error = 1.96*sqrt(var(X_k.^2)/num_sample_path);

For the quasi-Monte Carlo method, we use the scrambled Halton sequence [262].

Code 2.4. Quasi-Monte Carlo method with the forward Euler scheme for Equation (2.5.11) using (2.5.13).

clc, clear all
t_final = 1; x_ini = 1;
N = 1000; h = t_final/N;
num_sample_path = 1e4;
rng(100, 'twister'); % for repeatable randomized quasi-random sequences
qmc_sequence = haltonset(N-1, 'Skip', 1e3, 'Leap', 20); % Halton sequence
qmc_sequence = scramble(qmc_sequence, 'RR2'); % scramble, randomizing
qmc_sequence = net(qmc_sequence, num_sample_path);
qmc_sequence = erfinv(2*qmc_sequence-1)*sqrt(2); % inverse transformation for Gaussian
% time marching, Euler scheme
W_k = 0;
X_k = x_ini*ones(num_sample_path,1);
for k = 1:N-1
    W_k = W_k + sqrt(h)*qmc_sequence(:,k);
    X_k = X_k + W_k.*X_k*h;
end
X_mean = mean(X_k);
X_second_moment = mean(X_k.^2);
X_mean_stat_error = 1.96*sqrt(var(X_k)/num_sample_path);
X_second_moment_stat_error = 1.96*sqrt(var(X_k.^2)/num_sample_path);


In this code, the scrambled Halton sequence is used, see [262]. We can also use the scrambled Sobol sequence [340] instead of a scrambled Halton sequence.

Code 2.5. A scrambled Sobol sequence for the quasi-Monte Carlo method.

qmc_sequence = sobolset(N-1, 'Skip', 1e3, 'Leap', 20); % Sobol sequence
qmc_sequence = scramble(qmc_sequence, 'MatousekAffineOwen');
qmc_sequence = erfinv(2*qmc_sequence-1)*sqrt(2); % inverse transformation

A stochastic collocation method requires the values of X_k at quadrature points (in random space) according to (2.5.8). To find these values, we apply the collocation method in random space: for κ = 1, …, η(L, N − 1),

E[X_{k+1}(ξ₁, …, ξ_k) δ((ξ₁, …, ξ_k) − (x¹_κ, ··· , x^k_κ))]
= E[(1 + √h (Σ_{i=0}^{k} ξ_i) h) X_k(ξ₁, …, ξ_{k−1}) δ((ξ₁, …, ξ_k) − (x¹_κ, ··· , x^k_κ))].

By the property of the delta function, we have a system of deterministic and decoupled equations:

X_{k+1}(x¹_κ, ··· , x^k_κ) = (1 + √h (Σ_{i=0}^{k} x^i_κ) h) X_k(x¹_κ, ··· , x^{k−1}_κ), κ = 1, …, η(L, N − 1).

Here we use the sparse grid code 'nwspgr.m' from http://www.sparse-grids.de/. We now list the code for sparse grid collocation methods.

Code 2.6. Sparse grid collocation with the forward Euler scheme for Equation (2.5.11) using (2.5.13).

clc, clear all
t_final = 1; X_ini = 1;
N = 40; h = t_final/N;
sparse_grid_dim = N-1;
sparse_grid_level = 2;
[sparse_grid_nodes, sparse_grid_weights] = nwspgr('GQN', ...
    sparse_grid_dim, sparse_grid_level);
num_sample_path = size(sparse_grid_weights,1);
% time marching, Euler scheme
W_k = 0;
X_k = X_ini*ones(num_sample_path,1);
for k = 1:N-1
    W_k = W_k + sqrt(h)*sparse_grid_nodes(:,k);
    X_k = X_k + W_k.*X_k*h;
end
X_mean = sum(X_k.*sparse_grid_weights);
X_second_moment = sum(X_k.^2.*sparse_grid_weights);

Consider now the Wiener chaos method for (2.5.13). Suppose that X_k = Σ_{α∈J_{N,k}} x_{α,k} ξ_α for 1 ≤ k ≤ N. We first apply a Galerkin method in random space: multiply both sides of (2.5.13) by the Cameron-Martin basis ξ_β, β ∈ J_{N,k}, and take the expectation (integration over the random space). We then have

E[ξ_β Σ_{α∈J_{N,k}} x_{α,k} ξ_α] = E[ξ_β (1 + √h (Σ_{i=1}^{k} ξ_i) h) Σ_{α∈J_{N,k−1}} x_{α,k−1} ξ_α].

By the orthonormality of the Cameron-Martin basis, we have

x_{β,k} = x_{β,k−1} 1_{β_k=0} + h^{3/2} Σ_{i=1}^{k} Σ_{α∈J_{N,k−1}} x_{α,k−1} E[ξ_i ξ_α ξ_β].

This turns the discrete stochastic equation obtained from the forward Euler scheme into a system of deterministic equations for the Wiener chaos expansion coefficients x_{β,k}. This system of deterministic equations for the coefficients is called the propagator of the discrete stochastic equation (2.5.13).

To solve for the x_{β,k}, one needs to find the expectations of the triples E[ξ_i ξ_α ξ_β]. By the recurrence relation (2.5.5) and the orthogonality of Hermite polynomials, the triples are zero unless α = β ± ε_i, where ε_i is the multi-index with |ε_i| = 1 whose only nonzero element is the i-th one. Recalling (2.5.5), when β_i = α_i + 1 we have

E[ξ_i ξ_α ξ_β] = E[ξ_i H_{α_i}(ξ_i) H_{β_i}(ξ_i)] / (√(α_i!) √(β_i!))
             = E[H_{α_i+1}(ξ_i) H_{β_i}(ξ_i)] / (√(α_i!) √(β_i!)) = √(α_i + 1).

Then the triples can be computed as

E[ξ_i ξ_α ξ_β] = √(α_i + 1) 1_{α+ε_i=β} + √(β_i + 1) 1_{β+ε_i=α}. (2.5.14)

We have a propagator ready for implementation: for β ∈ J_{N,k},

x_{β,k} = ( x_{β,k−1} + h^{3/2} Σ_{i=1}^{k−1} ( √(β_i) x_{β−ε_i,k−1} + √(β_i + 1) x_{β+ε_i,k−1} 1_{|β|≤N−1} ) ) 1_{β_k=0}
        + h^{3/2} x_{α,k−1} 1_{β_k=1} 1_{α=(β_1,…,β_{k−1})}.

Now let us consider the spectral approximation of Brownian motion (2.2.5). Applying the spectral approximation (2.2.5), the problem (2.5.11) can be discretized as follows:

dX = W^{(n)}(t)X dt, X₀ = 1. (2.5.15)


Different from (2.5.13), Equation (2.5.15) is still continuous in time. We need a further discretization in time. Let us again use the forward Euler scheme:

X_{k+1} = X_k + W^{(n)}(t_k) X_k h, X₀ = 1. (2.5.16)

With the approximation (2.2.5) and the cosine basis (2.2.6), we have

W^{(n)}(t_k) = Σ_{i=1}^{n} ξ_i M_i(t_k), M₁(t) = t, M_i(t) = ∫₀^t m_i(s) ds = (√2/((i−1)π)) sin(π(i−1)t), i ≥ 2.
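As a sanity check of this truncation (a Python sketch; the M_i are the integrated basis functions given above): the variance of W^{(n)}(t) is Σ_{i=1}^{n} M_i(t)², which should approach Var[W(t)] = t as n grows.

```python
import math

def M(i, t):
    """M_1(t) = t; M_i(t) = sqrt(2) sin(pi (i-1) t) / ((i-1) pi) for i >= 2."""
    if i == 1:
        return t
    return math.sqrt(2) * math.sin(math.pi * (i - 1) * t) / ((i - 1) * math.pi)

n, t = 2000, 0.5
var_n = sum(M(i, t) ** 2 for i in range(1, n + 1))
print(var_n)  # approaches Var[W(t)] = t = 0.5 as n grows
```

With n = 40, as used in the codes below, the truncation error in the variance is already small, and it decays like O(1/n).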

The Monte Carlo method is similar to what we had before. The only difference between Code 2.3 and the code here is in computing the approximated Brownian motion.

Code 2.7. Monte Carlo method with the forward Euler scheme for Equation (2.5.11) using (2.5.16) (WZ).

clc, clear all
rng(100, 'twister'); % for repeatable pseudorandom sequences
t_final = 1; x_ini = 1;
N = 1000; h = t_final/N;
num_sample_path = 1e4;
% time marching, Euler scheme
W_k = 0;
X_k = x_ini*ones(num_sample_path,1);
n = 40; % truncation of spectral approximation
xi = randn(num_sample_path, n);
for k = 1:N-1
    t_k = k*h;
    W_k = t_k*xi(:,1);
    for i = 2:n
        W_k = W_k + sqrt(2)*xi(:,i)*sin(pi*(i-1)*t_k)/(i-1)/pi;
    end
    X_k = X_k + W_k.*X_k*h;
end
X_mean = mean(X_k);
X_second_moment = mean(X_k.^2);
X_mean_stat_error = 1.96*sqrt(var(X_k)/num_sample_path);
X_second_moment_stat_error = 1.96*sqrt(var(X_k.^2)/num_sample_path);

Similarly, the sparse grid collocation method here differs from Code 2.6 only in computing the approximated Brownian motion.


Code 2.8. Sparse grid collocation with the forward Euler scheme for Equation (2.5.11) using (2.5.16) (WZ).

clc, clear all
t_final = 1; X_ini = 1;
N = 10000; h = t_final/N;
n = 40; % truncation of spectral approximation
sparse_grid_dim = n; % n
sparse_grid_level = 2; % less than 5
[sparse_grid_nodes, sparse_grid_weights] = nwspgr('GQN', ...
    sparse_grid_dim, sparse_grid_level);
num_sample_path = size(sparse_grid_weights,1);
% time marching, Euler scheme
W_k = 0;
X_k = X_ini*ones(num_sample_path,1);
for k = 1:N-1
    t_k = k*h;
    W_k = t_k*sparse_grid_nodes(:,1);
    for i = 2:n
        W_k = W_k + sqrt(2)*sparse_grid_nodes(:,i) ...
            *sin(pi*(i-1)*t_k)/(i-1)/pi;
    end
    X_k = X_k + W_k.*X_k*h;
end
X_mean = sum(X_k.*sparse_grid_weights);
X_second_moment = sum(X_k.^2.*sparse_grid_weights);

Let us derive the Wiener chaos method for (2.5.16). Suppose that X_k = Σ_{α∈J_{N,n}} x_{α,k} ξ_α for k ≥ 1. We first apply a Galerkin method in random space: multiply both sides of (2.5.16) by the Cameron-Martin basis ξ_β, β ∈ J_{N,n}, and take the expectation (integration over the random space). We then have

E[ξ_β Σ_{α∈J_{N,n}} x_{α,k+1} ξ_α] = E[ξ_β (1 + h Σ_{i=1}^{n} ξ_i M_i(t_k)) Σ_{α∈J_{N,n}} x_{α,k} ξ_α].

By the orthonormality of the Cameron-Martin basis, we have the following propagator of the discrete stochastic equation (2.5.16):

x_{β,k+1} = x_{β,k} + h Σ_{i=1}^{n} M_i(t_k) Σ_{α∈J_{N,n}} x_{α,k} E[ξ_i ξ_α ξ_β].

Recalling the fact (2.5.14), we have a propagator ready for implementation:

x_{β,k+1} = x_{β,k} + h Σ_{i=1}^{n} M_i(t_k) ( √(β_i) x_{β−ε_i,k} + √(β_i + 1) x_{β+ε_i,k} 1_{|β|≤N−1} ), β ∈ J_{N,n}.


In Table 2.5, we present the first two moments (mean and second moment) obtained by the Monte Carlo, quasi-Monte Carlo, and stochastic collocation methods using different approximations of Brownian motion for Equation (2.5.11). We recall from Chapter 1.1 that the first moment at t = 1 is E[X(1)] = X(0) exp(1/2) ≈ 1.6487 and the second moment is E[X²(1)] = X²(0) exp(2) ≈ 7.3891. From the table, we can see that the quasi-Monte Carlo method gives the most accurate values for both the mean and the second moment, while the stochastic collocation method is the least accurate. One possible reason for the failure of the stochastic collocation method is the high dimensionality in random space; beyond about 40 dimensions, the stochastic collocation method is empirically inefficient.

2.6 Bibliographic notes

Other approximations of Brownian motion. Besides the piecewise linear approximation of Brownian motion (2.2.4), there are several other approximations, e.g., the mollifier approximation (see, e.g., [241, p. 397] and [387]) or the random walk given by Donsker's theorem (see, e.g., [255]). However, we omit the discussion of the construction of Brownian motion here, as only the first two approaches are used in practice and in this book.

Table 2.5. The first two moments using different methods in random space and the Euler scheme in time for Equation (2.5.11) at t = 1. For the Wong-Zakai (WZ) approximation, we use n = 40. The time step size is 10⁻⁴. The exact moments are given by (1.1.4): the mean is exp(1/6) ≈ 1.1814 and the second moment is exp(2/3) ≈ 1.9477.

Methods                               Mean              Second moment     Approximation of Brownian motion
Monte Carlo (MC)                      1.1856 ± 0.0148   1.9771 ± 0.0717   Increments
Quasi-Monte Carlo (QMC)               1.2237 ± 0.0152   2.0969 ± 0.0673   Increments
Stochastic collocation method (SCM)   1.1545            1.6325            Increments
Monte Carlo (MC)                      1.1777 ± 0.0145   1.9353 ± 0.0609   WZ
Quasi-Monte Carlo (QMC)               1.5662 ± 0.0173   3.2312 ± 0.0903   WZ
Stochastic collocation method (SCM)   1.1695            1.7141            WZ

Mollification of Brownian motion is also known as the mollifier approximation and is used in the approximation of stochastic integrals and SODEs, see, e.g., [117, 177, 239, 323, 325, 387, 397, 510, 511]:


W(t) = ∫_{t_n}^{t} ∫_R K(θ, s) dW(s) dθ, t ∈ [t_n, t_{n+1}), (2.6.1)

where K is symmetric. This type of approximation was proposed for a method of lines for SODEs in [387], where no numerical results were presented. When this approximation is applied to SODEs, consistency (convergence without order) has been proved in [117, 177, 179, 326], etc. In [241], the approaches of piecewise linear approximation and mollification were unified with a proven convergence order; these are known as Ikeda-Nakao-Yamato-type approximations, see also [195].

Spectral approximation of Brownian motion. With a piecewise constant basis in (2.2.5), the use of multiple Ito integrals (Wiener chaos expansion) and multiple Stratonovich integrals was addressed in [53, 54]. When the spectral basis is chosen as Haar wavelets, the approximation is also known as the Levy-Ciesielski approximation, see, e.g., [255]. The expansion (2.2.1) is an extension of the Fourier expansion of Brownian motion proposed by Wiener [391, Chapter IX]. It is also known as the Levy-Ciesielski approximation [81, 255, 278], the Ito-Nisio approximation [243], or the Fourier approximation [391]; see [240] for a historical review of this approximation.

The approximation with a trigonometric orthonormal basis has been used in Wiener chaos methods (see, e.g., [55, 225, 315, 316, 318, 505]) and will be the approximation for our Wong-Zakai approximation throughout this book. See Chapter 4 for the Wong-Zakai approximation using (2.2.5) for SODEs with time delay.

For the multilevel Monte Carlo method for numerical SPDEs, see, e.g., [1, 25, 75, 82, 440, 441] for elliptic equations with random coefficients, [363–365] for stochastic hyperbolic equations, [24, 158] for stochastic parabolic equations, and [160] for a stochastic Brinkman problem. Ref. [1] proposed time discretization schemes with large stability regions to further reduce the cost of the multilevel Monte Carlo method. However, it has been shown that the performance of multilevel Monte Carlo methods is not robust when the variances of the desired stochastic processes are large, see, e.g., [264, Chapter 4].

Quasi-Monte Carlo methods have also been investigated for numerical SPDEs. Some randomized quasi-Monte Carlo methods have been successfully applied to solve stochastic elliptic equations with random coefficients, see, e.g., [164, 165, 281, 282], where the solution is analytic in random space (parameter space). For a good review of quasi-Monte Carlo methods, see [115].

Wiener chaos expansion methods. As numerical methods, they have been summarized in [155, 439, 487]. The idea of representing a random variable (process) with orthogonal polynomials of the random variable (process) with respect to its corresponding probability density function is not limited to Gaussian random variables (processes) and has been extended to more general cases, see, e.g., [415, 488]. Xiu and Karniadakis [488] developed the Wiener-Askey polynomial chaos using a broad class of Askey's orthogonal polynomials [10]. Soize and Ghanem [427] discuss chaos expansions with respect to an arbitrary probability measure, see also Wan and Karniadakis [468]. These methods are known as generalized polynomial chaos, or gPC, see the reviews in, e.g., [484, 485].

For linear problems driven by white noise in space or in time, the Wiener chaos expansion method has been investigated both theoretically (see, e.g., [315–318]) and numerically (see, e.g., [469, 505]). The advantage of the Wiener chaos expansion method is that the resulting system of PDEs is linear, lower triangular, and deterministic; the method can also be of high accuracy. However, there are two main difficulties for the Wiener chaos expansion as a numerical method. The first is the efficiency of long-time integration: usually this method is only efficient for short-time integration, see, e.g., [53, 315]. This limitation can be somewhat removed when a recursive procedure is adopted for computing certain statistics, e.g., the first two moments of the solution, see, e.g., [505, 507]. The second is nonlinearity: when SPDEs are nonlinear, Wiener chaos expansion methods result in fully coupled systems of deterministic PDEs, and the interactions between different Wiener chaos expansion terms necessitate exhaustive computation. This effect has been shown numerically for the stochastic Burgers equation and the Navier-Stokes equations [225].

One remedy for nonlinear problems is to introduce the Wick-Malliavin approximation for nonlinear terms. Wick-Malliavin approximations can be seen as a perturbation of a Wick product formulation by adding high-order Malliavin derivatives of the nonlinear terms to the Wick product formulation; see [346, 462] and Chapter 11 for details. Basically, a lower-level Wick-Malliavin approximation (with lower-order Malliavin derivatives) allows weaker nonlinear interactions between the Wiener chaos expansion terms. Let us consider the Burgers equation with additive noise, for example. When only the Wick product is used (zeroth-order Malliavin derivatives only), the resulting system is lower triangular and contains only one nonlinear equation. When Malliavin derivatives of up to first order are used, the resulting system of PDEs is only weakly coupled and contains only two nonlinear equations. This approach has been shown to be very efficient for short-time integration of equations with quadratic nonlinearity and small noise, see, e.g., [462].

The Wick product had been formulated in [223] for various SPDEs before the Wick-Malliavin approximation was introduced. The Wick product formulation has been explored with finite element methods in physical space, see, e.g., [254, 327–333, 443–445, 498], and also [469] for a brief review on SPDEs equipped with the Wick product. In Chapter 11, we will discuss the Wick-Malliavin approximation for linear elliptic equations with multiplicative noise and some nonlinear equations with quadratic nonlinearity and additive noise.

A stochastic collocation method (SCM) was first introduced in [439] and later on by other authors, see, e.g., [11] and [486]. While WCE (Wiener chaos expansion) is a spectral Galerkin method in random space (see, e.g., [155, 488]), SCM is a stochastic version of the deterministic collocation methodology. As with collocation methods for deterministic problems (see, e.g., [163]), stochastic collocation methods exhibit high accuracy comparable to the WCE performance, see, e.g., [123] for elliptic equations with random coefficients.

For stochastic differential equations with color noise, it has been demonstrated in a number of works (see, e.g., [11, 12, 35, 370, 378, 486, 500] and references therein) that stochastic collocation methods (Smolyak's sparse grid collocation, SGC) can be a competitive alternative to the Monte Carlo technique and its variants in the case of differential equations. The success of these methods relies on the smoothness of the solution in the random space and can usually be achieved when it is sufficient to consider only a limited number of random variables (i.e., in the case of a low-dimensional random space). Like Wiener chaos methods, stochastic collocation methods are limited in practice, as they can be used only for a small number of random variables. Based on empirical evidence (see, e.g., [393]), the use of SGC is limited to problems with random space dimensionality of less than 40.

More efficient algorithms might be built using anisotropic SGC methods [172, 379] or goal-oriented quadrature rules, which employ more quadrature points along the "most important" directions, e.g., [373, 389, 390]. Here we consider only isotropic SGC with predetermined quadrature rules. In fact, the effectiveness of adaptive sparse grids relies heavily on the ordering of the random dimensions by their importance for the numerical solutions of stochastic differential equations, which is not always easy to obtain. Furthermore, all these sparse grids as integration methods in random space grow quickly with the random dimension and thus cannot be used for longer time integration (which usually entails large random dimensions).

For SODEs driven by white noise in time, the stochastic collocation method has been known as cubature on Wiener space (e.g., [202, 300, 322, 373, 377]), optimal quantization (e.g., [389, 390]) to solve SODEs in random space, sparse grids of Smolyak type (e.g., [153, 154, 172, 398]), or particle simulation (e.g., [132]). For stochastic collocation methods for equations with color noise, see, e.g., [11, 486].

The stochastic collocation methods result in decoupled systems of equations, as the Monte Carlo method and its variants do, which can be of great advantage in parallel computation. High accuracy and fast convergence can also be observed for stochastic evolution equations, e.g., [153, 154, 172, 398], where sparse grids of Smolyak type were used.

However, the fundamental limitation of these collocation methods is the exponential growth of sampling points with an increasing number of random parameters, see, e.g., [172], and thus a failure for longer time integration; see error estimates for cubature on Wiener space (e.g., [28, 70, 116]) and conclusions for optimal quantization (e.g., [389, 390]).


2.7 Suggested practice

Exercise 2.7.1 Show that for any centered Gaussian random variable ξ ∼ N(0, σ²), E[ξ^{2n}] = σ^{2n}(2n − 1)!! and E[ξ^{2n−1}] = 0 for any n ≥ 1. Here (2n − 1)!! = 1 × 3 × ⋯ × (2n − 1).

Exercise 2.7.2 Consider a Gaussian vector (X, Y) with X, Y satisfying Cov[(X, Y)] = 0 (uncorrelated). Show that X and Y are independent.

Exercise 2.7.3 For a Gaussian random variable X ∼ N(0, σ²) and a Bernoulli random variable Z with P(Z = ±1) = 1/2, where X and Z are independent, show that

a) the product ZX is a Gaussian random variable;
b) X and ZX are uncorrelated;
c) but X and ZX are not independent.

Check whether (X, ZX)^⊤ is a Gaussian random vector or not.

Exercise 2.7.4 Assume that X and Y are Gaussian random variables. Then X + Y being independent of X − Y implies X independent of Y if and only if (X, Y) is a Gaussian random vector.

Exercise 2.7.5 Show that the covariance function C is nonnegative definite, i.e., for all t1, …, tk ∈ T and all z1, z2, …, zk ∈ R,

∑_{i=1}^k ∑_{j=1}^k C(ti, tj) zi zj ≥ 0.

Hint: assume that E[X(t)] = 0 and write the sum as E[(∑_{i=1}^k zi X(ti))²].

Exercise 2.7.6 Assume that W(t) is a standard Brownian motion. Show that the covariance of W(t) is Cov[(W(t), W(s))] = min(t, s).

Exercise 2.7.7 If X(t) is a one-dimensional Gaussian process with covariance Cov[X(t), X(s)] = min(t, s), then it is a one-dimensional Brownian motion.

Exercise 2.7.8 Use the definition of Brownian motion to show that the process in (2.2.1) is indeed a Brownian motion on [0, T].

Exercise 2.7.9 Compute the Hölder exponent α for the following functions: f(t) = √t and f(t) = t^β (β > 0), for t ∈ [0, 1] and for t ∈ [ε, 1] (0 < ε < 1).

Exercise 2.7.10 Show that if there exists a constant C such that for any x,

|f(x + h) − f(x)| ≤ Ch^β, β > 1,

then f(x) is a constant.


Exercise 2.7.11 For integer n ≥ 1, verify that

E[|Bt − Bs|^{2n}] ≤ Cn |t − s|^n.

Exercise 2.7.12 If f is differentiable and its derivative is Riemann-integrable, its total variation over the interval [a, b] is

|f|_{1,TV,[a,b]} = ∫_a^b |f′(t)| dt.

Exercise 2.7.13 Show that a Brownian motion W(t) has bounded quadratic variation and that any p-variation of W(t) is zero when p ≥ 3.

Exercise 2.7.14 Assume that Vn is from Theorem 2.2.14. Show that E[Vn] = √n E[|ξ1|] and Var[Vn] = 1 − (E[|ξ1|])².

Exercise 2.7.15 Compute E[|ξ1|] in Theorem 2.2.14.

Exercise 2.7.16 Show that ξl = ∫_0^T ml(s) dW(s) are i.i.d. standard Gaussian random variables, where {ml} is a complete orthonormal basis in L²([0, T]).

Exercise 2.7.17 Show by definition the second formula in Example 2.3.1.

Exercise 2.7.18 Using Taylor's expansion and the bounded quadratic variation of Brownian motion, prove Theorem 2.4.1 when f has bounded first two derivatives.


3 Numerical methods for stochastic differential equations

In this chapter, we discuss some basic aspects of stochastic differential equations (SDEs), including stochastic ordinary (SODEs) and partial differential equations (SPDEs).

In Chapter 3.1, we first present some basic theory of SODEs, including the existence and uniqueness of strong solutions and solution methods for SODEs such as integrating factor methods and the method of moment equations of solutions. We then discuss in Chapter 3.2 numerical methods for SODEs and strong and weak convergence of numerical solutions, as well as linear stability theory of numerical SODEs. We summarize basic aspects of numerical SODEs in Chapter 3.2.5.

We present some basic estimates of regularity of solutions to SPDEs in Chapter 3.3. Then, we introduce the solutions in several senses: strong solution, variational solution, mild solution, and Wiener chaos solution. Existence and uniqueness of variational solutions are presented. In Chapter 3.4, we briefly review numerical methods for parabolic SPDEs, including different techniques for numerical solutions aiming at strong and weak convergence. A comparison of numerical stability between PDEs and SPDEs is also presented in Chapter 3.4. We summarize basic aspects of numerical methods for SPDEs in Chapter 3.4.6. Numerical methods for other types of equations, such as stochastic hyperbolic equations, are presented in the bibliographic notes of Chapter 3.5. Some exercises are provided at the end of this chapter.

3.1 Basic aspects of SODEs

Let us consider the following simple stochastic ordinary differential equation:

dX(t) = −λX(t) dt+ dW (t), λ > 0. (3.1.1)

© Springer International Publishing AG 2017. Z. Zhang, G.E. Karniadakis, Numerical Methods for Stochastic Partial Differential Equations with White Noise, Applied Mathematical Sciences 196, DOI 10.1007/978-3-319-57511-7_3


It can be readily verified by Ito's formula (Theorem 2.4.1) that the following process

X(t) = e^{−λt} x0 + ∫_0^t e^{−λ(t−s)} dW(s)    (3.1.2)

satisfies Equation (3.1.1). By the Kolmogorov continuity theorem (Theorem 2.2.10), the solution is Hölder continuous of order less than 1/2 in time, since

E[|X(t) − X(s)|²] ≤ (t − s)²(2/λ + x0²) + |t − s|.    (3.1.3)

This simple model shows that the solution to a stochastic differential equation is Hölder continuous of order less than 1/2 and thus does not have derivatives in time. This low regularity of solutions leads to different concerns in SODEs and their numerical methods.

3.1.1 Existence and uniqueness of strong solutions

Let (Ω, F, P) be a probability space and (W(t), F_t^W) = ((W1(t), …, Wm(t))^⊤, F_t^W) be an m-dimensional standard Wiener process, where F_t^W, 0 ≤ t ≤ T, is an increasing family of σ-subalgebras of F induced by W(t). Consider the system of Ito SODEs

dX = a(t, X) dt + ∑_{r=1}^m σr(t, X) dWr(t),  t ∈ (t0, T],  X(t0) = x0,    (3.1.4)

where X, a, σr are m-dimensional column vectors and x0 is independent of W. We assume that a(t, x) and σ(t, x) are sufficiently smooth and globally Lipschitz.

The SODEs (3.1.4) can be rewritten in the Stratonovich sense under mild conditions. With the relation (2.3.2), the equation (3.1.4) can be written as

dX = [a(t, X) − c(t, X)] dt + ∑_{r=1}^m σr(t, X) ∘ dWr(t),  t ∈ (t0, T],  X(t0) = x0,    (3.1.5)

where

c(t, X) = (1/2) ∑_{r=1}^m (∂σr(t, X)/∂x) σr(t, X),

and ∂σr/∂x is the Jacobi matrix of the column vector σr:

∂σr/∂x = [∂σr/∂x1 ⋯ ∂σr/∂xm] =
⎡ ∂σ1,r/∂x1 ⋯ ∂σ1,r/∂xm ⎤
⎢     ⋮      ⋱      ⋮     ⎥
⎣ ∂σm,r/∂x1 ⋯ ∂σm,r/∂xm ⎦ .


We denote f ∈ Lad(Ω; L²([a, b])) if f(t) is adapted to Ft and f(t, ω) ∈ L²([a, b]), i.e.,

Lad(Ω; L²([a, b])) = { f(t, ω) | f(t, ω) is Ft-measurable and P(∫_a^b f_s² ds < ∞) = 1 }.

Here {Ft; a ≤ t ≤ b} is a filtration such that

• for each t, f(t) and W(t) are Ft-measurable, i.e., f(t) and W(t) are adapted to the filtration Ft;
• for any s ≤ t, W(t) − W(s) is independent of the σ-field Fs.

Definition 3.1.1 (A strong solution to an SODE) We say that X(t) is a (strong) solution to the SDE (3.1.4) if

• a(t, X(t)) ∈ Lad(Ω, L¹([c, d])),
• σ(t, X(t)) ∈ Lad(Ω, L²([c, d])),
• and X(t) satisfies the following integral equation a.s.:

X(t) = x + ∫_0^t a(s, X(s)) ds + ∫_0^t σ(s, X(s)) dW(s).    (3.1.6)

In general, it is difficult to give a necessary and sufficient condition for the existence and uniqueness of strong solutions. Usually we can give sufficient conditions.

Theorem 3.1.2 (Existence and uniqueness) Assume that X0 is F0-measurable with E[X0²] < ∞ and that the coefficients a, σ satisfy the following conditions:

• (Lipschitz condition) a and σ are Lipschitz continuous, i.e., there is a constant K > 0 such that

|a(x) − a(y)| + ∑_{r=1}^m |σr(x) − σr(y)| ≤ K|x − y|.

• (Linear growth) a and σ grow at most linearly, i.e., there is a C > 0 such that

|a(x)| + |σ(x)| ≤ C(1 + |x|).

Then the SDE above has a unique strong solution, and the solution has the following properties:

• X(t) is adapted to the filtration generated by X0 and W(s), s ≤ t.
• E[∫_0^t X²(s) ds] < ∞.


Here are some examples where the conditions in the theorem are satisfied.

• (Geometric Brownian motion) For μ, σ ∈ R,

dX(t) = μX(t) dt + σX(t) dW(t), X0 = x.

• (Sine process) For σ ∈ R,

dX(t) = sin(X(t)) dt + σ dW(t), X0 = x.

• (Modified Cox-Ingersoll-Ross process) For θ1, θ2 ∈ R with θ1 + θ2²/2 > 0,

dX(t) = −θ1 X(t) dt + θ2 √(1 + X(t)²) dW(t), X0 = x.

Remark 3.1.3 The condition in the theorem is also known as the global Lipschitz condition. A straightforward generalization is the one-sided Lipschitz condition (global monotone condition)

(x − y)^⊤(a(x) − a(y)) + p0 ∑_{r=1}^m |σr(x) − σr(y)|² ≤ K|x − y|², p0 > 0,

and the growth condition can also be generalized as

x^⊤ a(x) + p1 ∑_{r=1}^m |σr(x)|² ≤ C(1 + |x|²).

We will discuss this condition in detail in Chapter 5.

Theorem 3.1.4 (Regularity of the solution) Under the conditions of Theorem 3.1.2, the solution is continuous and there exists a constant C > 0, depending only on t, such that

E[|X(t) − X(s)|²] ≤ C|t − s|.

Then by the Kolmogorov continuity theorem (Theorem 2.2.10), we can conclude that the solution is only Hölder continuous with exponent less than 1/2, which is the same as Brownian motion.

3.1.2 Solution methods

The process (3.1.2) is a special case of the Ornstein-Uhlenbeck process, which satisfies the equation

dX(t) = κ(θ − X(t)) dt + σ dW(t),    (3.1.7)


where κ, σ > 0 and θ ∈ R. The solution to (3.1.7) can be obtained by the method of change of variables: Y(t) = θ − X(t). Then by Ito's formula we have

dY(t) = −κY(t) dt + σ d(−W(t)).

Similar to (3.1.2), the solution is

Y(t) = e^{−κt} Y0 + σ ∫_0^t e^{−κ(t−s)} d(−W(s)).    (3.1.8)

Then by Y(t) = θ − X(t), we have

X(t) = X0 e^{−κt} + θ(1 − e^{−κt}) + σ ∫_0^t e^{−κ(t−s)} dW(s).
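As a quick numerical illustration (a sketch, not part of the book: the function name and all parameter values below are arbitrary choices), the explicit solution implies the exact Gaussian transition X(t) ∼ N(θ + (x0 − θ)e^{−κt}, σ²(1 − e^{−2κt})/(2κ)), which can be sampled directly without time stepping:

```python
import numpy as np

def ou_exact_sample(x0, kappa, theta, sigma, t, rng, n_paths):
    """Sample X(t) for dX = kappa*(theta - X) dt + sigma dW exactly,
    using the Gaussian law implied by the explicit solution."""
    mean = theta + (x0 - theta) * np.exp(-kappa * t)
    var = sigma**2 * (1.0 - np.exp(-2.0 * kappa * t)) / (2.0 * kappa)
    return mean + np.sqrt(var) * rng.standard_normal(n_paths)

rng = np.random.default_rng(0)
samples = ou_exact_sample(x0=2.0, kappa=1.5, theta=0.5, sigma=0.4, t=1.0,
                          rng=rng, n_paths=200_000)
# the sample mean should be close to theta + (x0 - theta) e^{-kappa t}
exact_mean = 0.5 + (2.0 - 0.5) * np.exp(-1.5)
print(abs(samples.mean() - exact_mean) < 1e-2)
```

Such exact sampling is only possible because the Ornstein-Uhlenbeck process is Gaussian; for most SODEs one must fall back on the numerical schemes of Section 3.2.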

In a more general case, we can use similar ideas to find explicit solutions to SODEs.

The integrating factor method

We apply the integrating factor method to solve nonlinear SDEs of the form

dX(t) = f(t, X(t)) dt + σ(t)X(t) dW(t), X0 = x,    (3.1.9)

where f is a continuous deterministic function from R+ × R to R.

• Step 1. Solve the equation

dG(t) = σ(t)G(t) dW(t).

Then we have

G(t) = exp(∫_0^t σ(s) dW(s) − (1/2) ∫_0^t σ²(s) ds).

The integrating factor function is defined by F(t) = G^{−1}(t). It can be readily verified that F(t) satisfies

dF(t) = −σ(t)F(t) dW(t) + σ²(t)F(t) dt.

• Step 2. Let X(t) = G(t)C(t), so that C(t) = F(t)X(t). Then by the product rule, (3.1.9) can be written as

d(F(t)X(t)) = F(t)f(t, X(t)) dt.


Then C(t) satisfies the following "deterministic" ODE

dC(t) = F(t)f(t, G(t)C(t)) dt.    (3.1.10)

• Step 3. Once we obtain C(t), we get X(t) from X(t) = G(t)C(t).

Remark 3.1.5 When (3.1.10) cannot be explicitly solved, we may use some numerical methods to obtain C(t).

Example 3.1.6 Use the integrating factor method to solve the SDE

dX(t) = (X(t))−1 dt+ αX(t) dW (t), X0 = x > 0,

where α is a constant.

Solution. Here f(t, x) = x^{−1} and F(t) = exp(−αW(t) + (α²/2)t). We only need to solve

dC(t) = F(t)[G(t)C(t)]^{−1} dt = (F²(t)/C(t)) dt.

This gives d(C(t))² = 2F²(t) dt and thus

(C(t))² = 2 ∫_0^t exp(−2αW(s) + α²s) ds + x².

Since the initial condition is x > 0, we take C(t) > 0, so that

X(t) = G(t)C(t) = exp(αW(t) − (α²/2)t) √( 2 ∫_0^t exp(−2αW(s) + α²s) ds + x² ) > 0.
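A numerical sanity check of this closed form can be sketched as follows (not from the book; function names, step sizes, and the seed are arbitrary choices). The time integral is approximated by a left-point Riemann sum on a Brownian path, and an Euler-Maruyama run on the same path provides an independent approximation. For α = 0 the SDE reduces to the deterministic x′ = 1/x with exact solution √(2t + x²), which both routines should reproduce:

```python
import numpy as np

def closed_form(x0, alpha, W, h):
    """X(T) from the integrating-factor solution; the time integral is
    approximated by a left-point Riemann sum on the Brownian path W."""
    n = len(W) - 1
    t = h * np.arange(n + 1)
    integrand = np.exp(-2 * alpha * W[:-1] + alpha**2 * t[:-1])
    C2 = 2 * h * integrand.sum() + x0**2
    T = t[-1]
    return np.exp(alpha * W[-1] - 0.5 * alpha**2 * T) * np.sqrt(C2)

def euler_maruyama(x0, alpha, W, h):
    """Euler-Maruyama for dX = X^{-1} dt + alpha X dW on the same path."""
    x = x0
    for k in range(len(W) - 1):
        x = x + h / x + alpha * x * (W[k + 1] - W[k])
    return x

h, n = 1e-4, 5000  # T = 0.5
rng = np.random.default_rng(1)
W = np.concatenate([[0.0], np.cumsum(np.sqrt(h) * rng.standard_normal(n))])

# alpha = 0: both should agree with the exact sqrt(2T + x0^2)
exact = np.sqrt(2 * 0.5 + 1.0)
print(abs(closed_form(1.0, 0.0, W, h) - exact) < 1e-3,
      abs(euler_maruyama(1.0, 0.0, W, h) - exact) < 1e-3)
```

For α ≠ 0 the two outputs agree only up to the discretization errors of the Riemann sum and of the Euler-Maruyama scheme, both of which vanish as h → 0.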

Moment equations of solutions

For a more complicated SODE, we cannot obtain a solution that can be written explicitly in terms of W(t). For example, the modified Cox-Ingersoll-Ross model (3.1.11) does not have an explicit solution:

dX(t) = κ(θ − X(t)) dt + σ√X(t) dW(t), X0 = x.    (3.1.11)

However, we can say a bit more about the moments of the process X(t). Write (3.1.11) in its integral form:

X(t) = x + κ ∫_0^t (θ − X(s)) ds + σ ∫_0^t √X(s) dW(s),    (3.1.12)

and using Ito's formula gives

X²(t) = x² + (2κθ + σ²) ∫_0^t X(s) ds − 2κ ∫_0^t X²(s) ds + 2σ ∫_0^t (X(s))^{3/2} dW(s).    (3.1.13)


From this equation and the properties of the Ito integral, we can obtain the moments of the solution. The first moment can be obtained by taking expectation on both sides of (3.1.12):

mt := E[X(t)] = x + κ(θt − ∫_0^t E[X(s)] ds),

because the expectation of the stochastic integral part is zero.¹ We can then solve the following ODE:

dmt = κ(θ − mt) dt.

The solution is given by

mt = θ + (x − θ)e^{−κt}.
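The moment ODE and its closed-form solution can be cross-checked in a few lines (a sketch, not from the book; parameter values are arbitrary):

```python
import math

def first_moment_ode(x, kappa, theta, T, n):
    """Forward Euler for the moment equation dm = kappa*(theta - m) dt, m(0) = x."""
    h, m = T / n, x
    for _ in range(n):
        m += h * kappa * (theta - m)
    return m

x, kappa, theta, T = 1.0, 2.0, 0.3, 1.0
closed = theta + (x - theta) * math.exp(-kappa * T)  # m_t = theta + (x - theta) e^{-kappa t}
approx = first_moment_ode(x, kappa, theta, T, n=100_000)
print(abs(approx - closed) < 1e-4)
```

Note that this checks only the deterministic moment equation; no sample paths of (3.1.11) are needed, which is precisely the appeal of the moment-equation approach.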

For the second moment, we get from (3.1.13) that

E[X²(t)] = x² + (2κθ + σ²) ∫_0^t E[X(s)] ds − 2κ ∫_0^t E[X²(s)] ds.

This is again an ODE, similar to the one for mt:

E[X²(t)] = x² + (2κθ + σ²)(θt + (x − θ)(1 − e^{−κt})/κ) − 2κ ∫_0^t E[X²(s)] ds.

Here we also assume that ∫_0^t E[|X(s)|³] ds < ∞, so that ∫_0^t (X(s))^{3/2} dW(s) is an Ito integral with a square-integrable integrand and thus E[∫_0^t (X(s))^{3/2} dW(s)] = 0.

Remark 3.1.7 It can be shown using Feller's test [255, Theorem 5.29] that the solution to (3.1.11) exists and is unique when 2κθ > σ² and X0 ≥ 0. Moreover, the solution is strictly positive when X0 > 0. If E[|X0|³] < ∞, then E[|X(t)|^p] < ∞ for 1 ≤ p ≤ 3.

Unfortunately, even the first few moments are difficult to obtain in general. For example, we cannot get a closure for the second-order moment of the following SDE:

dX(t) = κ(θ − X(t)) dt + (X(t))^α dW(t), 1/2 < α < 1.

We cannot even obtain the first-order moment of the following SDE:

dX(t) = sin(X(t)) dt + dW(t), X0 = x.

¹ Here we need to verify that ∫_0^t √X(s) dW(s) is indeed an Ito integral with a square-integrable integrand, by showing that ∫_0^t E[|X(s)|] ds < ∞. See Remark 3.1.7.


3.2 Numerical methods for SODEs

As explicit solutions to SODEs are usually hard to find, we seek numerical approximations of solutions.

3.2.1 Derivation of numerical methods based on numerical integration

A starting point for numerical SODEs is numerical integration. Consider the SODE (3.1.4) over [t, t + h]:

X(t + h) = X(t) + ∫_t^{t+h} a(s, X(s)) ds + ∑_{r=1}^m ∫_t^{t+h} σr(s, X(s)) dWr.

The simplest scheme for (3.1.4) is the forward Euler scheme. In the forward Euler scheme, we replace (approximate)

∫_t^{t+h} a(s, X(s)) ds with ∫_t^{t+h} a(t, X(t)) ds = a(t, X(t))h

and

∫_t^{t+h} σr(s, X(s)) dWr with ∫_t^{t+h} σr(t, X(t)) dWr = σr(t, X(t))(Wr(t + h) − Wr(t)).

Then we obtain the forward Euler scheme (also known as the Euler-Maruyama scheme):

Xk+1 = Xk + a(tk, Xk)h + ∑_{r=1}^m σr(tk, Xk) ΔkWr,    (3.2.1)

where h = (T − t0)/N, tk = t0 + kh, k = 0, …, N, X0 = x0, and ΔkWr = Wr(tk+1) − Wr(tk). The Euler scheme can be implemented by replacing the increments ΔkWr with Gaussian random variables:

Xk+1 = Xk + a(tk, Xk)h + ∑_{r=1}^m σr(tk, Xk) √h ξr,k+1,    (3.2.2)

where ξr,k+1 are i.i.d. N(0, 1) random variables.

Replacing (approximating) the drift term with its value at t + h, we have

∫_t^{t+h} a(s, X(s)) ds ≈ ∫_t^{t+h} a(t + h, X(t + h)) ds = a(t + h, X(t + h))h.


The resulting scheme is called the backward Euler scheme (also known as the drift-implicit Euler scheme):

Xk+1 = Xk + a(tk+1, Xk+1)h + ∑_{r=1}^m σr(tk, Xk) ΔkWr, k = 0, 1, …, N − 1.    (3.2.3)
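The forward and backward Euler schemes can be sketched as follows for a scalar equation (a minimal sketch, not from the book; function names and parameters are our own, and the implicit drift step is resolved by a few fixed-point iterations, which converge for small h under the Lipschitz assumption):

```python
import numpy as np

def forward_euler(a, sigma, x0, T, N, rng):
    """Euler-Maruyama for dX = a(t,X) dt + sigma(t,X) dW, scalar case."""
    h = T / N
    x = x0
    for k in range(N):
        t = k * h
        x = x + a(t, x) * h + sigma(t, x) * np.sqrt(h) * rng.standard_normal()
    return x

def backward_euler(a, sigma, x0, T, N, rng, iters=5):
    """Drift-implicit Euler; the implicit drift is resolved by fixed-point
    iteration started from the explicit predictor."""
    h = T / N
    x = x0
    for k in range(N):
        t, t1 = k * h, (k + 1) * h
        noise = sigma(t, x) * np.sqrt(h) * rng.standard_normal()
        y = x + a(t, x) * h + noise        # explicit predictor
        for _ in range(iters):             # solve y = x + a(t1, y) h + noise
            y = x + a(t1, y) * h + noise
        x = y
    return x

# sanity check with zero noise: dX = -X dt gives X(1) = e^{-1}
rng = np.random.default_rng(2)
a = lambda t, x: -x
zero = lambda t, x: 0.0
xf = forward_euler(a, zero, 1.0, 1.0, 10_000, rng)
xb = backward_euler(a, zero, 1.0, 1.0, 10_000, rng)
print(abs(xf - np.exp(-1)) < 1e-3, abs(xb - np.exp(-1)) < 1e-3)
```

With noise switched on, both schemes are of half-order mean-square accuracy for multiplicative noise, as discussed in Section 3.2.2; the backward scheme trades the extra fixed-point work for better stability (Section 3.2.4).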

The following schemes can be considered as extensions of the forward and backward Euler schemes:

Xk+1 = Xk + [(1 − λ)a(tk, Xk) + λa(tk+1, Xk+1)]h + ∑_{r=1}^m σr(tk, Xk) √h ξr,k+1,    (3.2.4)

where λ ∈ [0, 1], or similarly

Xk+1 = Xk + a((1 − λ)tk + λtk+1, (1 − λ)Xk + λXk+1)h + ∑_{r=1}^m σr(tk, Xk) √h ξr,k+1.    (3.2.5)

We can also derive numerical methods for (3.1.4) in order to get high-order convergence. For example, in (3.1.4), we can approximate the diffusion terms σr using their half-order Ito-Taylor expansion, which leads to the Milstein scheme [354]. Let us illustrate the derivation of the Milstein scheme for an autonomous SODE (a and σ do not explicitly depend on t) when m = 1 and r = 1, i.e., for a scalar equation with a single noise. With the following approximations,

∫_t^{t+h} a(X(s)) ds ≈ ∫_t^{t+h} a(X(t)) ds = a(X(t))h,

∫_t^{t+h} σ(X(s)) dW(s) ≈ ∫_t^{t+h} [σ(X(t)) + ∫_t^s σ′(X(t))σ(X(t)) dW(θ)] dW(s)
= σ(X(t))(W(t + h) − W(t)) + σ′(X(t))σ(X(t)) ∫_t^{t+h} ∫_t^s dW(θ) dW(s),

we can obtain the Milstein scheme

Xk+1 = Xk + a(Xk)h + σ(Xk)(W(tk+1) − W(tk)) + (1/2)σ(Xk)σ′(Xk)[(W(tk+1) − W(tk))² − h].

One can also derive a drift-implicit Milstein scheme as follows:

Xk+1 = Xk + a(Xk+1)h + σ(Xk)(W(tk+1) − W(tk)) + (1/2)σ(Xk)σ′(Xk)[(W(tk+1) − W(tk))² − h].
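The scalar Milstein scheme can be sketched in a few lines (not from the book; names and parameters are our own). A useful structural check: when the diffusion coefficient is constant (additive noise), σ′ ≡ 0 and the Milstein correction vanishes, so the scheme coincides with Euler-Maruyama on the same Brownian increments:

```python
import numpy as np

def milstein(a, s, ds, x0, h, dW):
    """Scalar Milstein scheme driven by the increments dW;
    a, s, ds are the drift, diffusion, and diffusion derivative."""
    x = x0
    for dw in dW:
        x = x + a(x) * h + s(x) * dw + 0.5 * s(x) * ds(x) * (dw**2 - h)
    return x

rng = np.random.default_rng(3)
h, n = 1e-3, 1000
dW = np.sqrt(h) * rng.standard_normal(n)

a = lambda x: -2.0 * x
# Euler-Maruyama with constant diffusion 0.3 on the same increments
em = 1.0
for dw in dW:
    em = em + a(em) * h + 0.3 * dw
# Milstein with s(x) = 0.3, s'(x) = 0: the correction term is identically zero
mi = milstein(a, lambda x: 0.3, lambda x: 0.0, 1.0, h, dW)
print(abs(em - mi) < 1e-12)
```

For genuinely multiplicative noise (e.g., s(x) = σx with ds(x) = σ), the correction term is what raises the mean-square order from 1/2 to 1.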


For (3.1.4), the Milstein scheme is as follows (see, e.g., [259, 358]):

Xk+1 = Xk + a(tk, Xk)h + ∑_{r=1}^m σr(tk, Xk) ξrk √h + ∑_{i,l=1}^m Λiσl(tk, Xk) I_{i,l,tk},    (3.2.6)

where I_{i,l,tk} = ∫_{tk}^{tk+1} ∫_{tk}^{s} dWi dWl. To efficiently evaluate this double integral, see Chapter 4 and the bibliographic notes. The scheme (3.2.6) is of first-order mean-square convergence. For commutative noises, i.e.,

Λiσl = Λlσi, where Λl = σl^⊤ ∂/∂x,    (3.2.7)

we can use only increments of Brownian motions in place of the double Ito integrals in (3.2.6), since

I_{i,l,tk} + I_{l,i,tk} = (ξik ξlk − δil)h,

where δil is the Kronecker delta. In this case, we have a simplified version of (3.2.6):

Xk+1 = Xk + a(tk, Xk)h + ∑_{r=1}^m σr(tk, Xk) ξrk √h + (1/2) ∑_{i,l=1}^m Λiσl(tk, Xk)(ξik ξlk − δil)h.    (3.2.8)

There has been an extensive literature on numerical methods for SODEs. We refer to [217, 416] for an introduction to numerical methods for SODEs and to [259, 354] for a systematic construction of such methods. In Chapter 4, we will present three different methods for SODEs with delay, which can also be applied to standard SODEs if we set the delay equal to zero.

For numerical methods for SODEs and SPDEs, the key issues are whether a numerical method converges and in what sense, whether it is stable in some sense, and how fast it converges.

3.2.2 Strong convergence

Definition 3.2.1 (Strong convergence in L^p) A method (scheme) is said to have strong convergence order γ in L^p if there exists a constant K > 0 independent of h such that

E[|Xk − X(tk)|^p] ≤ K h^{pγ}

for any k = 0, 1, …, N with Nh = T and sufficiently small h.


In many applications and in this book, strong convergence refers to convergence in the mean-square sense, i.e., p = 2.

If the coefficients of (3.1.4) satisfy the conditions in Theorem 3.1.2, the forward Euler scheme (3.2.1) and the backward Euler scheme (3.2.3) are convergent with half order (γ = 1/2) in the mean-square sense (strong convergence order one half), i.e.,

max_{1≤k≤N} E[|X(tk) − Xk|²] ≤ Kh,

where K is a positive constant independent of h. When the noise is additive, i.e., the coefficients of the noises are functions of time instead of functions of the solution, these schemes are of first-order convergence.
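The half-order mean-square rate can be observed numerically. The sketch below (not from the book; all names and parameters are our own) runs Euler-Maruyama for geometric Brownian motion, compares against the exact solution X(T) = x0 exp((μ − σ²/2)T + σW(T)) on the same Brownian increments, and fits the slope of the root-mean-square error:

```python
import numpy as np

def em_strong_error(mu, sig, x0, T, N, n_paths, rng):
    """RMS error at T of Euler-Maruyama vs the exact GBM solution,
    both driven by the same Brownian increments."""
    h = T / N
    dW = np.sqrt(h) * rng.standard_normal((n_paths, N))
    x = np.full(n_paths, x0)
    for k in range(N):
        x = x + mu * x * h + sig * x * dW[:, k]
    exact = x0 * np.exp((mu - 0.5 * sig**2) * T + sig * dW.sum(axis=1))
    return np.sqrt(np.mean((x - exact) ** 2))

rng = np.random.default_rng(4)
Ns = np.array([16, 32, 64, 128, 256])
errs = [em_strong_error(1.0, 1.0, 1.0, 1.0, N, 20_000, rng) for N in Ns]
# error ~ h^gamma = N^{-gamma}; the theoretical strong order is gamma = 1/2
slope = -np.polyfit(np.log(Ns), np.log(errs), 1)[0]
print(round(slope, 2))
```

Replacing the update by the Milstein scheme (Section 3.2.1) would shift the fitted slope toward 1, matching the orders stated above.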

Under the conditions in Theorem 3.1.2, the Milstein scheme (3.2.6) can be shown to have strong convergence order one, i.e., γ = 1.

Note that all these schemes are one-step schemes. One can use the following fundamental theorem of Milstein to derive their mean-square convergence order. Introduce the one-step approximation X̄_{t,x}(t + h), t0 ≤ t < t + h ≤ T, for the solution X_{t,x}(t + h) of (3.1.4), which depends on the initial point (t, x), a time step h, and {W1(θ) − W1(t), …, Wm(θ) − Wm(t), t ≤ θ ≤ t + h}, and which is defined as follows:

X̄_{t,x}(t + h) = x + A(t, x, h; Wi(θ) − Wi(t), i = 1, …, m, t ≤ θ ≤ t + h).    (3.2.9)

Using the one-step approximation (3.2.9), we recurrently construct the approximations (Xk, F_{tk}), k = 0, …, N, tk+1 − tk = hk+1, tN = T:

X0 = X(t0),  Xk+1 = X̄_{tk,Xk}(tk+1) = Xk + A(tk, Xk, hk+1; Wi(θ) − Wi(tk), i = 1, …, m, tk ≤ θ ≤ tk+1).    (3.2.10)

For simplicity, we will consider a uniform time step size, i.e., hk = h for all k. The proof of the following theorem can be found in [353] and [354, 358, Chapter 1].

Theorem 3.2.2 (Fundamental convergence theorem of one-step numerical methods) Suppose that

(i) the coefficients of (3.1.4) are Lipschitz continuous;
(ii) the one-step approximation X̄_{t,x}(t + h) from (3.2.9) has the following orders of accuracy: for some p ≥ 1 there are α ≥ 1, h0 > 0, and K > 0 such that for arbitrary t0 ≤ t ≤ T − h, x ∈ R^d, and all 0 < h ≤ h0,

|E[X_{t,x}(t + h) − X̄_{t,x}(t + h)]| ≤ K(1 + |x|²)^{1/2} h^{q1},
[E|X_{t,x}(t + h) − X̄_{t,x}(t + h)|^{2p}]^{1/(2p)} ≤ K(1 + |x|^{2p})^{1/(2p)} h^{q2},    (3.2.11)

with

q2 ≥ 1/2,  q1 ≥ q2 + 1/2.


Then for any N and k = 0, 1, …, N, the following inequality holds:

[E|X_{t0,X0}(tk) − X̄_{t0,X0}(tk)|^{2p}]^{1/(2p)} ≤ K(1 + E|X0|^{2p})^{1/(2p)} h^{q2−1/2},    (3.2.12)

where K > 0 does not depend on h and k, i.e., the order of accuracy of the method (3.2.10) is q = q2 − 1/2.

Many other schemes with strong convergence based on the Ito-Taylor expansion have been developed, such as Runge-Kutta methods, predictor-corrector methods, and splitting (split-step) methods; see, e.g., [259, 358]. Here we assume that the coefficients are Lipschitz continuous, while in practice the coefficients may be non-Lipschitz. We will discuss this issue in Chapter 5.

3.2.3 Weak convergence

The weak integration of SDEs means computing the expectation

E[f(X(T))],    (3.2.13)

where f(x) is a sufficiently smooth function with growth at infinity not faster than polynomial:

|f(x)| ≤ K(1 + |x|^κ)    (3.2.14)

for some K > 0 and κ ≥ 1.

Definition 3.2.3 (Weak convergence) A method (scheme) is said to have weak convergence order γ if there exists a constant K > 0 independent of h such that

|E[f(Xk)] − E[f(X(tk))]| ≤ K h^γ

for any k = 0, 1, …, N with Nh = T and sufficiently small h.

Under the conditions of Theorem 3.1.2, the following error estimate holds for the forward Euler scheme (3.2.2) (see, e.g., [358, Chapter 2]):

|E[f(Xk)] − E[f(X(tk))]| ≤ Kh,    (3.2.15)

where K > 0 is a constant independent of h. The backward Euler scheme (3.2.3) and the Milstein scheme (3.2.6) are also of weak convergence order 1.

This first-order weak convergence of the forward Euler scheme can also be achieved by replacing ξr,k+1 with discrete random variables [358]; e.g., the weak Euler scheme has the form

Xk+1 = Xk + ha(tk, Xk) + √h ∑_{r=1}^m σr(tk, Xk) ζr,k+1, k = 0, …, N − 1,    (3.2.16)

where X0 = x0 and ζr,k+1 are i.i.d. random variables with the law

P(ζ = ±1) = 1/2.    (3.2.17)


The following error estimate holds for (3.2.16)–(3.2.17) (see, e.g., [358, Chapter 2]):

|E[f(XN)] − E[f(X(T))]| ≤ Kh,    (3.2.18)

where K > 0 can be a different constant than that in (3.2.15). Introducing the function ϕ(y), y ∈ R^{mN}, so that

ϕ(ξ1,1, …, ξm,1, …, ξ1,N, …, ξm,N) = f(XN),    (3.2.19)

we have

E[f(X(T))] ≈ E[f(XN)] = E[ϕ(ξ1,1, …, ξm,1, …, ξ1,N, …, ξm,N)]    (3.2.20)
= (2π)^{−mN/2} ∫_{R^{mN}} ϕ(y1,1, …, ym,1, …, y1,N, …, ym,N) exp(−(1/2) ∑_{i=1}^{mN} yi²) dy.

Further, it is not difficult to see from (3.2.16)–(3.2.17) and (2.5.8) that

E[f(X(T))] ≈ E[f(XN)] = E[ϕ(ζ1,1, …, ζm,1, …, ζ1,N, …, ζm,N)]    (3.2.21)
= Q2^{⊗mN} ϕ(y1,1, …, ym,1, …, y1,N, …, ym,N),

where Q2 is the Gauss-Hermite quadrature rule with nodes ±1 and equal weights 1/2; see, e.g., [2, Equation 25.4.46]. We note that the approximation (3.2.21), computed with the discrete increments, can be viewed as an approximation of E[f(XN)] in (3.2.20), and that (cf. (3.2.15) and (3.2.18)) the difference between the two is O(h).

Remark 3.2.4 Let ζr,k+1 in (3.2.16) be i.i.d. random variables with the law

P(ζ = yn,j) = wn,j, j = 1, …, n,    (3.2.22)

where yn,j are the nodes of the Gauss-Hermite quadrature Qn and wn,j are the corresponding quadrature weights (see (2.5.6)). Then

E[f(XN)] = E[ϕ(ζ1,1, …, ζm,N)] = Qn^{⊗mN} ϕ(y1,1, …, ym,N),

which can be a more accurate approximation of E[f(XN)] in (3.2.20) than (3.2.21), but the weak-sense error of the SDE approximation, E[f(XN)] − E[f(X(T))], remains of order O(h).
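A sketch of the weak Euler scheme with both choices of increments (not from the book; names, parameters, and the seed are arbitrary): for f(x) = x and geometric Brownian motion, E[f(X(T))] = x0 e^{μT} is known in closed form, so both the Gaussian and the ±1 increments can be compared against it:

```python
import numpy as np

def weak_euler(mu, sig, x0, T, N, n_paths, rng, discrete=True):
    """Weak Euler for dX = mu*X dt + sig*X dW; increments are either
    +-1 with probability 1/2 each (scaled by sqrt(h)) or Gaussian."""
    h = T / N
    x = np.full(n_paths, x0)
    for _ in range(N):
        if discrete:
            zeta = rng.choice([-1.0, 1.0], size=n_paths)
        else:
            zeta = rng.standard_normal(n_paths)
        x = x + mu * x * h + sig * x * np.sqrt(h) * zeta
    return x.mean()

rng = np.random.default_rng(5)
exact = 1.0 * np.exp(0.5)  # E[X(1)] = x0 e^{mu T} for f(x) = x
m_disc = weak_euler(0.5, 0.2, 1.0, 1.0, 50, 200_000, rng, discrete=True)
m_gauss = weak_euler(0.5, 0.2, 1.0, 1.0, 50, 200_000, rng, discrete=False)
print(abs(m_disc - exact) < 0.02, abs(m_gauss - exact) < 0.02)
```

Both runs carry the same O(h) weak bias plus Monte Carlo noise; the discrete increments are cheaper to generate and, per (3.2.21), change nothing in the first-order weak rate.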


We can also use the second-order weak scheme (3.2.23) for (3.1.4) (see, e.g., [358, Chapter 2]):
\[
\begin{aligned}
X_{k+1} ={}& X_k + h\,a(t_k,X_k) + \sqrt{h}\sum_{i=1}^{m}\sigma_i(t_k,X_k)\xi_{i,k+1} + \frac{h^2}{2}La(t_k,X_k)\\
&+ h\sum_{i=1}^{m}\sum_{j=1}^{m}\Lambda_i\sigma_j(t_k,X_k)\eta_{i,j,k+1} + \frac{h^{3/2}}{2}\sum_{i=1}^{m}\big(\Lambda_i a(t_k,X_k) + L\sigma_i(t_k,X_k)\big)\xi_{i,k+1},
\end{aligned}
\quad k = 0,\ldots,N-1, \quad (3.2.23)
\]
where $X_0 = x_0$ and $\eta_{i,j} = \frac{1}{2}\xi_i\xi_j - \gamma_{i,j}\zeta_i\zeta_j/2$, with $\gamma_{i,j} = -1$ if $i < j$ and $\gamma_{i,j} = 1$ otherwise;

\[
\Lambda_l = \sum_{i=1}^{d}\sigma_l^{i}\,\frac{\partial}{\partial x_i}, \qquad
L = \frac{\partial}{\partial t} + \sum_{i=1}^{d}a^{i}\,\frac{\partial}{\partial x_i} + \frac{1}{2}\sum_{r=1}^{m}\sum_{i,j=1}^{d}\sigma_r^{i}\sigma_r^{j}\,\frac{\partial^2}{\partial x_i\partial x_j};
\]

and $\xi_{i,k+1}$ and $\zeta_{i,k+1}$ are mutually independent random variables, either with standard Gaussian distribution or with the laws $P(\xi = 0) = 2/3$, $P(\xi = \pm\sqrt{3}) = 1/6$ and $P(\zeta = \pm 1) = 1/2$. The following error estimate holds for (3.2.23) (see, e.g., [358, Chapter 2]):
\[ |\mathbb{E}f(X(T)) - \mathbb{E}f(X_N)| \le Kh^2. \]
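As a hedged illustration (not from the text) of this second-order weak convergence, consider (3.2.23) specialized to scalar geometric Brownian motion $dX = \lambda X\,dt + \sigma X\,dW$, for which $a(x) = \lambda x$, $\sigma_1(x) = \sigma x$, $La = \lambda^2 x$, $\Lambda a = L\sigma = \lambda\sigma x$, $\Lambda\sigma = \sigma^2 x$, and $\eta = (\xi^2-1)/2$ since $\zeta^2 = 1$. Each step multiplies $X$ by an i.i.d. factor, so the second moment can be propagated exactly under the three-point law and compared with $\mathbb{E}[X^2(T)] = e^{(2\lambda+\sigma^2)T}$:

```python
import math

SQ3 = math.sqrt(3.0)
# Discrete law P(xi = 0) = 2/3, P(xi = +-sqrt(3)) = 1/6: matches the first five
# Gaussian moments, which is what second-order weak schemes require.
LAW = [(0.0, 2.0 / 3.0), (SQ3, 1.0 / 6.0), (-SQ3, 1.0 / 6.0)]

def step_factor(lam, sig, h, xi):
    """One step of (3.2.23) specialized to dX = lam*X dt + sig*X dW:
    X_{k+1} = G(xi) * X_k, with eta = (xi^2 - 1)/2 (zeta^2 = 1)."""
    eta = (xi * xi - 1.0) / 2.0
    return (1.0 + lam * h + sig * math.sqrt(h) * xi
            + 0.5 * h * h * lam * lam      # (h^2/2) L a, with L a = lam^2 x
            + h * sig * sig * eta          # h Lambda sigma eta, Lambda sigma = sig^2 x
            + h ** 1.5 * lam * sig * xi)   # (h^{3/2}/2)(Lambda a + L sigma) xi

def second_moment_error(lam, sig, T, N):
    h = T / N
    amp = sum(w * step_factor(lam, sig, h, xi) ** 2 for xi, w in LAW)
    approx = amp ** N                      # E[X_N^2] with X_0 = 1 (i.i.d. step factors)
    return abs(approx - math.exp((2.0 * lam + sig * sig) * T))

e1 = second_moment_error(-1.0, 0.5, 1.0, 10)
e2 = second_moment_error(-1.0, 0.5, 1.0, 20)
print(e1 / e2)  # close to 4: second-order weak convergence
```

Halving $h$ reduces the error by roughly a factor of four, matching the $O(h^2)$ estimate.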

We again refer to [259, 358] for more weakly convergent numerical schemes.

3.2.4 Linear stability

To understand the stability of time integrators for SODEs, we consider the following linear model:
\[ dX = \lambda X\,dt + \sigma\,dW(t), \quad X(0) = x, \ \lambda < 0. \quad (3.2.24) \]
Consider one-step methods of the following form:
\[ X_{n+1} = A(z)X_n + B(z)\sqrt{h}\,\sigma\xi_n, \quad z = \lambda h. \quad (3.2.25) \]
Here $h$ is the time step size, $A(z)$ and $B(z)$ are analytic functions, and $\delta W_n = \sqrt{h}\,\xi_n$, where the $\xi_n$ are i.i.d. standard Gaussian random variables. For (3.2.24), $X(t)$ is a Gaussian random variable and
\[ \lim_{t\to\infty}\mathbb{E}[X(t)] = 0, \qquad \lim_{t\to\infty}\mathbb{E}[X^2(t)] = \frac{\sigma^2}{2|\lambda|}. \]



It can be readily shown that $X_n$ is also a Gaussian random variable with $\mathbb{E}[X_n] = A^n(z)x$ and
\[ \lim_{n\to\infty}\mathbb{E}[|X_n|^2] = \frac{\sigma^2}{2|\lambda|}R(z), \qquad R(z) = -\frac{2zB^2(z)}{1-A^2(z)}. \]

Here are some examples:

• Euler scheme: $A(z) = 1+z$, $B(z) = 1$, and $R(z) = \frac{2}{2+z}$.
• Backward Euler scheme: $A(z) = B(z) = \frac{1}{1-z}$ and $R(z) = \frac{2}{2-z}$.
• Trapezoidal rule: $A(z) = \frac{1+z/2}{1-z/2}$, $B(z) = \frac{1}{1-z/2}$, and $R(z) = 1$.

When $R(z) = 1$ and $|A(z)| < 1$, we obtain the exact distribution of $X(\infty)$. In this case, the one-step scheme is called A-stable.
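The listed values of $R(z)$ can be verified numerically: the second moment of (3.2.25) satisfies $v_{n+1} = A^2(z)v_n + B^2(z)\sigma^2 h$, and its fixed point must equal $\frac{\sigma^2}{2|\lambda|}R(z)$. A minimal sketch (parameter values are arbitrary):

```python
def limit_second_moment(A, B, lam, sigma, h, n_iter=2000):
    """Fixed point of v_{n+1} = A^2 v_n + B^2 sigma^2 h (second moment of (3.2.25))."""
    v = 0.0
    for _ in range(n_iter):
        v = A * A * v + B * B * sigma * sigma * h
    return v

lam, sigma, h = -2.0, 1.0, 0.1
z = lam * h
# (A(z), B(z), R(z)) for the three schemes listed above
schemes = {
    "euler":          (1.0 + z,                       1.0,                 2.0 / (2.0 + z)),
    "backward_euler": (1.0 / (1.0 - z),               1.0 / (1.0 - z),     2.0 / (2.0 - z)),
    "trapezoidal":    ((1.0 + z / 2) / (1.0 - z / 2), 1.0 / (1.0 - z / 2), 1.0),
}
for name, (A, B, R) in schemes.items():
    v_inf = limit_second_moment(A, B, lam, sigma, h)
    predicted = sigma ** 2 / (2.0 * abs(lam)) * R
    print(name, v_inf, predicted)  # the two values agree
```

For the trapezoidal rule the limit equals the exact stationary variance, reflecting $R(z)=1$.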

For long-time integration, L-stability is also helpful when a stiff problem (e.g., $|\lambda|$ large) is investigated. L-stability requires A-stability and that $A(-\infty) = 0$, so that $\lim_{z\to-\infty}\mathbb{E}[X_n] = \lim_{z\to-\infty}A^n(z)x = 0$ for any fixed $n$. When $|\lambda h|$ is large (e.g., $|\lambda|$ is too large for any practical $h$ to make $|\lambda h|$ small), L-stable schemes still capture the decay of $\mathbb{E}[X(t)]$ with moderately small time step sizes, while A-stable schemes usually require very small $h$. For example, the trapezoidal rule is A-stable but not L-stable since $|A(-\infty)| = 1$. In practice, this means that for an extremely large $|\lambda|$ the trapezoidal rule fails to damp the mean, since $|\mathbb{E}[X_{n+1}]| = |A(-\infty)|\,|\mathbb{E}[X_n]| = |\mathbb{E}[X_n]|$, while the exact solution satisfies $\mathbb{E}[X(t_{n+1})] = e^{z}\,\mathbb{E}[X(t_n)] \to 0$ as $z\to-\infty$, where $t_{n+1} - t_n = h$.

However, when $A(z)$ and $B(z)$ are rational functions of $z$, it is impossible to have both A-stability and L-stability, since $R(z) = 1$ forces $|A(-\infty)| = 1$. The claim can be proved by an argument by contradiction.

It is still possible to have a scheme that is both accurate and L-stable by post-processing. Define
\[ \widehat{X}_n = C(z)X_n + D(z)\sigma\sqrt{h}\,\xi_n, \quad z = \lambda h, \]
where $X_n$ is from (3.2.25). For example, for the backward Euler scheme, $A(z) = B(z) = \frac{1}{1-z}$ and
\[ \lim_{n\to\infty}\mathbb{E}\big[|\widehat{X}_n|^2\big] = \frac{\sigma^2}{2|\lambda|}\big(C^2(z)R(z) - 2zD^2(z)\big). \]
The limit is exactly the variance of $X(\infty)$ when $C(z) = 1$ and $D(z) = (1-z/2)^{-1}$. Such a scheme, with $\widehat{X}$ approximating $X$, is called a post-processing scheme or a predictor-corrector scheme.

The linear model (3.2.24) is designed for additive noise. For multiplicative noise, we can consider the following geometric Brownian motion:
\[ dX = \lambda X\,dt + \sigma X\,dW(t), \quad X(0) = 1. \quad (3.2.26) \]
Here we assume that $\lambda, \sigma \in \mathbb{R}$. The solution to (3.2.26) is
\[ X(t) = \exp\Big(\big(\lambda - \frac{1}{2}\sigma^2\big)t + \sigma W(t)\Big). \]



The solution is mean-square stable, i.e., $\lim_{t\to\infty}\mathbb{E}[X^2(t)] = 0$, if $\lambda + \frac{\sigma^2}{2} < 0$. It is asymptotically stable ($\lim_{t\to\infty}|X(t)| = 0$ almost surely) if $\lambda - \frac{\sigma^2}{2} < 0$. Mean-square stability implies asymptotic stability. Here we consider only mean-square stability.

Applying the forward Euler scheme (3.2.2) to the linear model (3.2.26), we have
\[ X_{k+1} = (1 + \lambda h + \sigma\sqrt{h}\,\xi_k)X_k. \]
The second moment is
\[ \mathbb{E}[X_{k+1}^2] = \mathbb{E}[X_k^2]\,\mathbb{E}\big[(1 + \lambda h + \sigma\sqrt{h}\,\xi_k)^2\big] = \mathbb{E}[X_k^2]\big((1+\lambda h)^2 + h\sigma^2\big). \]
For $\lim_{k\to\infty}\mathbb{E}[X_{k+1}^2] = 0$, we need $(1+\lambda h)^2 + h\sigma^2 < 1$. Similarly, for the backward Euler scheme (3.2.3), we need $1 + \sigma^2 h < (1-\lambda h)^2$.

We call the region of $(\lambda h, \sigma^2 h)$ where a scheme is mean-square stable the mean-square stability region of the scheme. To allow a relatively large $h$ for stiff problems, e.g., when $|\lambda|$ is large, we need a large stability region. Usually, explicit schemes have smaller stability regions than implicit (including semi-implicit and drift-implicit) schemes do.
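A quick numerical illustration (parameters chosen for the sketch): for $\lambda = -4$ and $\sigma = 1$ the exact solution of (3.2.26) is mean-square stable, yet at $h = 0.5$ the forward Euler per-step amplification factor of $\mathbb{E}[X_k^2]$ exceeds 1 while the backward Euler factor stays below 1:

```python
def forward_euler_factor(lam_h, sig2_h):
    """Per-step amplification of E[X_k^2] under forward Euler for (3.2.26)."""
    return (1.0 + lam_h) ** 2 + sig2_h

def backward_euler_factor(lam_h, sig2_h):
    """Per-step amplification of E[X_k^2] under backward Euler for (3.2.26)."""
    return (1.0 + sig2_h) / (1.0 - lam_h) ** 2

lam, sigma, h = -4.0, 1.0, 0.5       # lam + sigma^2/2 < 0: exact solution is MS-stable
fe = forward_euler_factor(lam * h, sigma ** 2 * h)
be = backward_euler_factor(lam * h, sigma ** 2 * h)
print(fe, be)  # fe = 1.5 > 1 (unstable at this step size), be = 0.1666... < 1 (stable)
```

The same point $(\lambda h, \sigma^2 h)$ lies outside the forward Euler stability region but inside the backward Euler one.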

Both schemes (3.2.2) and (3.2.6) are explicit and hence have small stability regions. To improve the stability region, we can use semi-implicit (drift-implicit) schemes, e.g., (3.2.3) and the drift-implicit Milstein scheme. Fully implicit schemes are also used in practice because of their symplecticity-preserving property and effectiveness in long-time integration. The following fully implicit scheme is from [358, 448]:

\[
\begin{aligned}
X_{k+1} ={}& X_k + a\big(t_{k+\lambda},(1-\lambda)X_k + \lambda X_{k+1}\big)h\\
&- \lambda\sum_{r=1}^{m}\sum_{j=1}^{d}\frac{\partial\sigma_r}{\partial x_j}\big(t_{k+\lambda},(1-\lambda)X_k + \lambda X_{k+1}\big)\,\sigma_r^{j}\big(t_{k+\lambda},(1-\lambda)X_k + \lambda X_{k+1}\big)h\\
&+ \sum_{r=1}^{m}\sigma_r\big(t_{k+\lambda},(1-\lambda)X_k + \lambda X_{k+1}\big)\,(\zeta_{rh})_k\sqrt{h}, \quad (3.2.27)
\end{aligned}
\]
where $0 < \lambda \le 1$, $t_{k+\lambda} = t_k + \lambda h$, and $(\zeta_{rh})_k$ are i.i.d. random variables such that
\[
\zeta_h = \begin{cases}\xi, & |\xi| \le A_h,\\ A_h, & \xi > A_h,\\ -A_h, & \xi < -A_h,\end{cases} \quad (3.2.28)
\]
with $\xi \sim \mathcal{N}(0,1)$ and $A_h = \sqrt{2l|\ln h|}$, $l \ge 1$. We recall [358, Lemma 1.3.4] that
\[ \big|\mathbb{E}[\xi^2 - \zeta_h^2]\big| \le \big(1 + 2\sqrt{2l|\ln h|}\big)h^l. \quad (3.2.29) \]
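The defect $\mathbb{E}[\xi^2 - \zeta_h^2]$ introduced by the truncation (3.2.28) has a closed form, $2\big[A_h\phi(A_h) + (1-A_h^2)(1-\Phi(A_h))\big]$, obtained by integrating $x^2\phi(x)$ by parts over the Gaussian tails (a sketch, not from the text), so the bound (3.2.29) can be checked directly:

```python
import math

def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def tail(x):
    """P(xi > x) for xi ~ N(0, 1)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def truncation_defect(h, l=1):
    """E[xi^2 - zeta_h^2] for the clamped variable zeta_h of (3.2.28),
    together with the right-hand side of the bound (3.2.29)."""
    A = math.sqrt(2.0 * l * abs(math.log(h)))
    defect = 2.0 * (A * phi(A) + (1.0 - A * A) * tail(A))
    bound = (1.0 + 2.0 * A) * h ** l
    return defect, bound

for h in (0.1, 0.01, 0.001):
    d, b = truncation_defect(h)
    print(h, d, b)  # 0 <= d <= b in each case
```

The defect is nonnegative (clamping only removes mass from the tails) and comfortably below the bound for each $h$.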

All these semi-implicit and fully implicit schemes are of half-order convergence in the mean-square sense; see, e.g., [358, Chapter 1]. When the noise is additive, i.e., the coefficients of the noises are functions of time instead of functions of the solution, these schemes are of first-order convergence.



3.2.5 Summary of numerical SODEs

Numerical methods for (3.1.4) with Lipschitz continuous coefficients have been investigated extensively; see, e.g., [238, 259, 354, 358]. As the solution to (3.1.4) is usually Hölder continuous with exponent $1/2-\epsilon$ ($0 < \epsilon < 1/2$), the convergence order in the mean-square sense is usually one half. In fact, by Itô's formula, we readily have
\[ \mathbb{E}\big[|X_{t_0,X_0}(t+h) - X_{t_0,X_0}(t)|^2\big] \le \exp(Ct)\,h. \]

Then we can conclude from the Kolmogorov continuity theorem (Theorem 2.2.10, or see, e.g., [255, Chapter 2.2.A]) that the solution is Hölder continuous with exponent $1/2-\epsilon$. First-order schemes of mean-square convergence can also be derived using the Itô-Taylor expansion, but they require significant computational effort; see, e.g., the Milstein scheme (3.2.6). When the coefficients of the noises satisfy the commutative conditions, the computational effort in simulating the double integrals in (3.2.6) can be significantly reduced, see (3.2.8). When weak convergence is desired, i.e., expectations of functionals of solutions or simply moments of solutions are sought, one can further approximate the Gaussian random variables in the schemes of strong convergence with simple symmetric random walks. For long-time integration of SDEs, structure-preserving schemes should be used, e.g., the midpoint method (3.2.27)–(3.2.28) as one of the symplectic methods (see [358, Chapter 4] for more symplectic schemes for SDEs).

However, numerical methods for (3.1.4) with non-Lipschitz continuous coefficients are far from mature. When the coefficients are of polynomial growth, several numerical schemes have been proposed; see, e.g., [218, 233, 235, 448] for schemes of strong convergence and [359] for schemes of weak convergence. Compared to numerical methods for SDEs with Lipschitz continuous coefficients, the key ingredient is the moment stability of the numerical schemes for SDEs with non-globally Lipschitz coefficients; see [448] and Chapter 5. An extension was made to numerical schemes whose numerical solutions have exponential moments; see, e.g., [232, 237].

3.3 Basic aspects of SPDEs

Let us discuss the regularity of a simple SPDE: the one-dimensional heat equation with additive space-time noise. As will be shown, the regularity is low and depends on the smoothness of the driving space-time noise.

Example 3.3.1 (Heat equation with random forcing) Consider the following one-dimensional heat equation driven by some space-time forcing:
\[ \partial_t u(t,x) = D\,\partial_x^2 u(t,x) + F(t,x), \quad (t,x) \in (0,\infty)\times(0,l), \quad (3.3.1) \]
with vanishing Dirichlet boundary conditions and deterministic initial condition $u_0(x)$. Here $D$ is a physical constant depending on the conductivity of the thin wire, and $F(t,x) = \sum_{k=1}^{\infty}\sqrt{q_k}\,m_k(x)\dot{W}_k(t)$, where $\sum_{k=1}^{\infty}q_k < \infty$, $\{m_k(x)\}_k$ is a complete orthonormal basis in $L^2([0,l])$, and the $W_k$'s are i.i.d. standard Brownian motions.

To find a solution, we apply the method of eigenfunction expansion. Let $\{e_k(x)\}_k$ be the eigenfunctions of the operator $\partial_x^2$ with vanishing Dirichlet boundary conditions, with corresponding eigenvalues $\lambda_k$:
\[ -\partial_x^2 e_k = \lambda_k e_k, \quad e_k|_{x=0,l} = 0, \quad k = 1,2,\ldots, \quad (3.3.2) \]
with $0 < \lambda_1 \le \lambda_2 \le \cdots \le \lambda_k \le \cdots$ and $\lim_{k\to\infty}\lambda_k = +\infty$. Actually, they can be computed explicitly:
\[ \lambda_k = k^2\Big(\frac{\pi}{l}\Big)^2, \qquad e_k(x) = \sqrt{\frac{2}{l}}\sin\Big(k\frac{\pi}{l}x\Big). \quad (3.3.3) \]

For simplicity, we let $D = 1$ and $m_k = e_k$ for every $k$. We look for a formal solution of the form
\[ u(t,x) = \sum_{k=1}^{\infty}u_k(t)e_k(x). \]
Plugging this formula into (3.3.1), multiplying both sides of the equation by $e_i$, and integrating, we have
\[ du_i(t) = -\lambda_i u_i(t)\,dt + \sqrt{q_i}\,dW_i(t), \qquad u_i(0) = \int_0^l u_0(x)e_i(x)\,dx. \]

This is the Ornstein-Uhlenbeck process (3.1.7), and the solution is
\[ u_i(t) = u_{0,i}e^{-\lambda_i t} + \sqrt{q_i}\int_0^t e^{-\lambda_i(t-s)}\,dW_i(s), \qquad u_{0,i} = \int_0^l u_0(x)e_i(x)\,dx. \]

Thus the solution is
\[ u(t,x) = \sum_{k=1}^{\infty}\Big[u_{0,k}e^{-\lambda_k t} + \sqrt{q_k}\int_0^t e^{-\lambda_k(t-s)}\,dW_k(s)\Big]e_k(x). \quad (3.3.4) \]

Now we show the regularity of the solution. When $\sum_{k=1}^{\infty}q_k < \infty$, we have $\mathbb{E}[\|u(t,\cdot) - u(s,\cdot)\|^2] \le C(t-s)$ (for $t-s$ small), and then the solution is Hölder continuous in time by Kolmogorov's continuity theorem. In fact,



\[
\begin{aligned}
\mathbb{E}[\|u(t,\cdot) - u(s,\cdot)\|^2] &= \sum_{k=1}^{\infty}\mathbb{E}[|u_k(t) - u_k(s)|^2]\\
&= \sum_{k=1}^{\infty}\big|u_{0,k}\big(e^{-\lambda_k t} - e^{-\lambda_k s}\big)\big|^2 + q_k\,\mathbb{E}\Big[\Big|\int_0^t e^{-\lambda_k(t-\theta)}\,dW_k(\theta) - \int_0^s e^{-\lambda_k(s-\theta)}\,dW_k(\theta)\Big|^2\Big]\\
&\le \sum_{k=1}^{\infty}\Big[\lambda_k|u_{0,k}|^2(t-s)^2 + q_k\frac{1-e^{-2\lambda_k(t-s)}}{\lambda_k}\Big]\\
&\le C(t-s)^2\sum_{k=1}^{\infty}|u_{0,k}|^2\lambda_k + (t-s)\sum_{k=1}^{\infty}q_k\frac{1-e^{-2\lambda_k(t-s)}}{\lambda_k(t-s)}\\
&\le C(t-s),
\end{aligned}
\]

where we require that $u_0 \in H^1([0,l])$, i.e., $\sum_{k=1}^{\infty}u_{0,k}^2 k^2 < \infty$. In the second to last line, we have also used the following conclusion:

\[ \mathbb{E}\Big[\Big|\int_0^t e^{-\lambda(t-\theta)}\,dW(\theta) - \int_0^s e^{-\lambda(s-\theta)}\,dW(\theta)\Big|^2\Big] \le \frac{1-e^{-2\lambda(t-s)}}{\lambda}, \quad \text{for any } t \ge s \ge 0,\ \lambda > 0. \quad (3.3.5) \]
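Although the proof of (3.3.5) is deferred to an exercise, both sides have closed forms by the Itô isometry, so the inequality can be checked on a grid (a sketch; the grid values are arbitrary):

```python
import math

def lhs(lam, s, t):
    """E[(I_t - I_s)^2] for I_t = int_0^t e^{-lam(t-theta)} dW(theta), s <= t,
    via the Ito isometry: E[I_t^2] = (1 - e^{-2 lam t}) / (2 lam) and
    E[I_t I_s] = (e^{-lam(t-s)} - e^{-lam(t+s)}) / (2 lam)."""
    Ett = (1.0 - math.exp(-2.0 * lam * t)) / (2.0 * lam)
    Ess = (1.0 - math.exp(-2.0 * lam * s)) / (2.0 * lam)
    Ets = (math.exp(-lam * (t - s)) - math.exp(-lam * (t + s))) / (2.0 * lam)
    return Ett + Ess - 2.0 * Ets

def rhs(lam, s, t):
    return (1.0 - math.exp(-2.0 * lam * (t - s))) / lam

ok = all(lhs(lam, s, s + d) <= rhs(lam, s, s + d) + 1e-12
         for lam in (0.5, 1.0, 10.0)
         for s in (0.0, 0.3, 2.0)
         for d in (0.0, 0.1, 1.0, 5.0))
print(ok)  # True
```

With $a = e^{-\lambda s}$ and $b = e^{-\lambda(t-s)}$, the left side reduces to $(1-b)\big[2-a^2(1-b)\big]/(2\lambda)$, so the inequality holds for all admissible arguments, not just those sampled here.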

The conclusion is left as an exercise (Exercise 3.6.10). We can also show that the solution is Hölder continuous in space with exponent less than 1. By the fact that $|e_k(x) - e_k(y)| \le \sqrt{\frac{2}{l}}\,\frac{k\pi}{l}\,|x-y|$, it can be readily checked that

\[
\begin{aligned}
\mathbb{E}[|u(t,x) - u(t,y)|^2] &= \sum_{k=1}^{\infty}\mathbb{E}[|u_k(t)|^2]\,|e_k(x) - e_k(y)|^2\\
&\le \frac{2\pi^2}{l^3}|x-y|^2\sum_{k=1}^{\infty}\mathbb{E}[|u_k(t)|^2]\,k^2\\
&= \frac{2\pi^2}{l^3}|x-y|^2\sum_{k=1}^{\infty}\Big(|u_{0,k}|^2e^{-2\lambda_k t} + q_k\frac{1-e^{-2\lambda_k t}}{2\lambda_k}\Big)k^2.
\end{aligned}
\]

Recalling (3.3.3) and $u_0 \in H^1([0,l])$, we have
\[ \mathbb{E}[|u(t,x) - u(t,y)|^2] \le \Big(C + \sum_{k=1}^{\infty}q_k\Big)|x-y|^2. \quad (3.3.6) \]

The regularity in $x$ follows from Kolmogorov's continuity theorem. By the Burkholder-Davis-Gundy inequality (see Appendix D), we have
\[ \mathbb{E}\Big[\sup_{0\le t\le T}|u(t,x)|^p\Big] \le C_p\Big|\sum_{k=1}^{\infty}q_k e_k^2(x)\frac{1-e^{-2\lambda_k T}}{2\lambda_k}\Big|^{p/2}, \quad p \ge 1. \quad (3.3.7) \]



As long as $\sum_{k=1}^{\infty}\frac{q_k}{\lambda_k}$ converges, $\mathbb{E}[\sup_{0\le t\le T}|u(t,x)|^p] < \infty$. However, the second-order derivative of the solution in $x$ should be understood as a distribution instead of a function. For simplicity, let us suppose that $u_0(x) = 0$. The solution becomes

\[ u(t,x) = \sum_{k=1}^{\infty}\sqrt{q_k}\int_0^t e^{-\lambda_k(t-s)}\,dW_k(s)\,e_k(x). \quad (3.3.8) \]

The second derivative of $u(t,x)$ in $x$ is, formally,
\[ \partial_x^2 u(t,x) = -\sum_{k=1}^{\infty}\lambda_k\sqrt{q_k}\int_0^t e^{-\lambda_k(t-s)}\,dW_k(s)\,e_k(x). \quad (3.3.9) \]

For this Gaussian process to exist almost surely, it requires a bounded second-order moment, i.e.,
\[ \mathbb{E}\big[(\partial_x^2 u(t,x))^2\big] = \sum_{k=1}^{\infty}q_k\lambda_k\frac{1-e^{-2\lambda_k t}}{2}e_k^2(x) \ge \frac{1-e^{-2\lambda_1 t}}{2}\sum_{k=1}^{\infty}q_k\lambda_k e_k^2(x). \]

Thus, if $q_k$ is proportional to $1/k^p$ with $p < 3$, then $\sum_{k=1}^{\infty}q_k\lambda_k$ diverges. The condition $\sum_{k=1}^{\infty}q_k < \infty$ alone will not give us second-order derivatives in a classical sense. In conclusion, the solution to (3.3.1) is not smooth and in general does not have second-order derivatives in space unless the space-time noise is very smooth in space. For example, when $q_k = 0$ for $k \ge N > 1$, we have a finite-dimensional noise and we can expect second-order derivatives in space.
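The divergence is easy to observe numerically. Taking $l = \pi$ so that $\lambda_k = k^2$, the partial sums of $\sum_k q_k\lambda_k$ grow without bound for $q_k = k^{-2}$ ($p = 2 < 3$) but converge for $q_k = k^{-4}$ (a sketch):

```python
def partial_sum(p, K):
    """Partial sum of sum_{k<=K} q_k * lambda_k with q_k = k^(-p), lambda_k = k^2 (l = pi)."""
    return sum(k ** (2.0 - p) for k in range(1, K + 1))

grow_p2 = [partial_sum(2.0, K) for K in (100, 1000, 10000)]  # grows like K: diverges
grow_p4 = [partial_sum(4.0, K) for K in (100, 1000, 10000)]  # approaches pi^2/6: converges
print(grow_p2)
print(grow_p4)
```

The first sequence grows linearly in the truncation level, so the second moment of $\partial_x^2 u$ is infinite; the second has essentially converged by $K = 10^4$.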

Example 3.3.2 (Multiplicative noise) Consider the following equation:
\[ du = au_{xx}\,dt + \sigma u_x\,dW(t), \quad x \in (0,2\pi), \]
with periodic boundary conditions. Then, by Fourier transform, when $2a - \sigma^2 > 0$, the solution has second-order moments.

3.3.1 Functional spaces

Consider a domain $D \subseteq \mathbb{R}^d$. When $k$ is a positive integer, we denote
\[ C^k(D) = \{u : D \to \mathbb{R} \mid D^\alpha u \text{ are continuous}, |\alpha| \le k\}. \]
When $k = 0$, we simply write $C(D)$ instead of $C^0(D)$. The space $C^\infty(D) = \cap_{k=1}^{\infty}C^k(D)$. The space $C_0^k(D)$ denotes the functions in $C^k(D)$ with compact support. Recall that the support of a function $f$ is the closure of the subset of $D$ where $f$ is nonzero: $\overline{\{x \in D \mid f(x) \neq 0\}}$. The Hölder space $C_b^r(D)$ is equipped with the following norm:
\[ \|f\|_{C_b^r} = \max_{0\le|\beta|\le\lfloor r\rfloor}\big\|D^\beta f\big\|_{L^\infty} + \sup_{\substack{x,y\in D\\ |\beta| = \lfloor r\rfloor,\ r > \lfloor r\rfloor}}\frac{\big|D^\beta f(x) - D^\beta f(y)\big|}{|x-y|^{r-\lfloor r\rfloor}}, \]
where $\lfloor r\rfloor$ is the integer part of the positive number $r$.



For $1 \le p \le \infty$, denote by $L^p(D)$ the space of Lebesgue measurable functions with finite $L^p$-norm: for $f \in L^p(D)$, $\|f\|_{L^p(D)} = \big(\int_D|f|^p\,dx\big)^{1/p} < \infty$ if $1 \le p < \infty$, and $\|f\|_{L^\infty} = \operatorname{ess\,sup}_D|f|$ when $p = \infty$. In the Sobolev space $W^{k,p}(D)$, $k = 0,1,2,\ldots$, $1 \le p \le \infty$, the Sobolev norm is defined as
\[ \|u\|_{W^{k,p}(D)} := \begin{cases}\Big(\sum_{|\alpha|\le k}\|D^\alpha u\|_{L^p(D)}^p\Big)^{1/p}, & 1 \le p < +\infty,\\[4pt] \max_{|\alpha|\le k}\|D^\alpha u\|_{L^\infty(D)}, & p = +\infty.\end{cases} \]
If $p = 2$, $W^{k,2}(D) = H^k(D)$ is a Sobolev-Hilbert space. When $s$ is not an integer, we need the following Slobodeckij semi-norm $[\cdot]_{\theta,p,D}$, $\theta \in (0,1)$, defined by
\[ [f]_{\theta,p,D} = \Big(\int_D\int_D\frac{|f(x)-f(y)|^p}{|x-y|^{\theta p+d}}\,dx\,dy\Big)^{1/p}. \quad (3.3.10) \]

The Sobolev-Slobodeckij space $W^{s,p}(D)$ is defined as
\[ W^{s,p}(D) = \Big\{f \in W^{\lfloor s\rfloor,p}(D)\ \Big|\ \sup_{|\alpha| = \lfloor s\rfloor}[D^\alpha f]_{\theta,p,D} < \infty\Big\}, \quad \theta = s - \lfloor s\rfloor, \]
associated with the norm
\[ \|f\|_{W^{s,p}(D)} = \|f\|_{W^{\lfloor s\rfloor,p}(D)} + \sup_{|\alpha| = \lfloor s\rfloor}[D^\alpha f]_{\theta,p,D}. \]

The space $W_0^{s,p}(D)$ is defined as the closure of $C_0^\infty(D)$ with respect to the norm $\|\cdot\|_{W^{s,p}(D)}$. When $p = 2$ and $k$ is a nonnegative integer, we write $H_0^k(D) = W_0^{k,2}(D)$.

For $k \ge 0$, $W^{-k,p}(D)$ is defined as the dual space of $W_0^{k,q}(D)$, where $q$ is the conjugate of $p$ ($1/p + 1/q = 1$, $p,q \ge 1$). In particular, $H^{-k}(D)$ is the dual space of $H_0^k(D)$, and for $f \in H^{-1}(D)$,
\[ \|f\|_{H^{-1}(D)} = \sup_{v\in H_0^1(D)}\frac{\langle f,v\rangle}{\|v\|_{H^1(D)}}. \]

We will drop the domain $D$ in norms if no confusion arises. For example, $\|f\|_{H^{-1}(D)}$ will be written as $\|f\|_{H^{-1}}$.

3.3.2 Solutions in different senses

Definition 3.3.3 (Strong solution) A predictable $L^2$-valued process $\{u(t)\}_{t\in[0,T]}$ is called a strong solution to (3.3.23)–(3.3.24) if
\[ u(t) = u_0 + \int_0^t\big[Lu(s) + f\big]\,ds + \sum_{k\ge1}\int_0^t\big(M_ku(s) + g_k\big)\,dW_k(s). \]



As shown at the beginning of this section, the solution in this sense is restrictive for infinite-dimensional noises, especially for space-time white noise, even in one dimension. For finite-dimensional noises, this sense of solution can still be used.

We now introduce solutions to stochastic partial differential equations in three senses: variational solution, mild solution, and Wiener chaos solution.

Definition 3.3.4 (Variational solution) A variational solution $u(t,x)$ is a predictable $L^2$-valued process such that for any $v \in C_0^\infty(D)$ ($C^\infty$ functions with compact support) and each $t \in (0,T]$,
\[ (u(t),v) = (u_0,v) + \int_0^t\big[\langle Lu,v\rangle + (f,v)\big]\,ds + \int_0^t\sum_{k=1}^{\infty}\big(M_ku + g_k, v\big)\,dW_k(s), \quad (3.3.11) \]

where $L - \frac{1}{2}\sum_{k=1}^{\infty}M_kM_k > 0$; for $L - \frac{1}{2}\sum_{k=1}^{\infty}M_kM_k = 0$ (fully degenerate),
\[ (u(t),v) = (u_0,v) + \int_0^t\big[(u,L^*v) + (f,v)\big]\,ds + \int_0^t\sum_{k=1}^{\infty}\big([M_ku + g_k]\,dW_k, v\big), \quad (3.3.12) \]
where $L^*$ is the adjoint operator of $L$ with respect to the inner product in $H$.

where L∗ is the adjoint operator of L with respect to the inner product in H.

A variational solution to the stochastic partial differential equation (3.3.1) can be defined as a random process $u$ such that for all $v \in C_0^\infty([0,l])$,
\[ \int_0^l u(t,x)v(x)\,dx = \int_0^l u_0(x)v(x)\,dx - \int_0^t\!\!\int_0^l \partial_x u(s,x)\,v'(x)\,dx\,ds + \int_0^t\!\!\int_0^l F(s,x)v(x)\,dx\,ds,
\]

or in an even weaker sense,
\[ \int_0^l u(t,x)v(x)\,dx = \int_0^l u_0(x)v(x)\,dx + \int_0^t\!\!\int_0^l u(s,x)\,v''(x)\,dx\,ds + \int_0^t\!\!\int_0^l F(s,x)v(x)\,dx\,ds. \]

Definition 3.3.5 (Mild solution) When $L$ is deterministic and time-independent, the mild form of the solution to Equation (3.3.23) is
\[ u(t) = e^{Lt}u_0 + \int_0^t e^{L(t-s)}f\,ds + \int_0^t e^{L(t-s)}\sum_{k=1}^{\infty}\big[M_ku + g_k\big]\,dW_k(s). \quad (3.3.13) \]



Let $G(t;x,y)$ be the Green's function for the linear equation (3.3.1) with vanishing Dirichlet boundary conditions:
\[ G(t;x,y) = \sum_{n=-\infty}^{\infty}\big[\Phi(t;x-y-2nl) - \Phi(t;x+y-2nl)\big], \]
where $\Phi(t;z) = \frac{1}{\sqrt{4\pi t}}e^{-z^2/(4t)}$. The solution to (3.3.1) in mild form is
\[ u(t,x) = \int_0^l G(t;x,y)u_0(y)\,dy + \int_0^t\!\!\int_0^l G(t-s;x,y)F(s,y)\,dy\,ds. \]
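The image series can be checked numerically: a truncated sum vanishes at $x = 0$ and $x = l$ (each image cancels its reflection) and is symmetric in $(x,y)$. A sketch with arbitrary parameters:

```python
import math

def Phi(t, z):
    """Whole-line heat kernel."""
    return math.exp(-z * z / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)

def G(t, x, y, l, n_max=50):
    """Dirichlet heat kernel on [0, l] by the method of images (truncated series)."""
    return sum(Phi(t, x - y - 2.0 * n * l) - Phi(t, x + y - 2.0 * n * l)
               for n in range(-n_max, n_max + 1))

t, l = 0.05, 1.0
boundary = max(abs(G(t, 0.0, y, l)) + abs(G(t, l, y, l)) for y in (0.2, 0.5, 0.9))
symmetry = abs(G(t, 0.3, 0.7, l) - G(t, 0.7, 0.3, l))
print(boundary, symmetry)  # both vanish up to roundoff
```

The minus sign between the two image families is what enforces the vanishing Dirichlet boundary conditions.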

As $F(t,x)$ is defined by a $Q$-cylindrical Wiener process, denoted by $W_Q(t,x)$, we have
\[ u(t,x) = \int_0^l G(t;x,y)u_0(y)\,dy + \int_0^t\!\!\int_0^l G(t-s;x,y)\,dW_Q(s,y). \]

For stochastic elliptic equations in Chapter 10, we use a solution in the mild sense, which is defined similarly.

Now we present a Wiener chaos solution to the linear SPDE (3.3.23)–(3.3.24) (see, e.g., [316, 318, 345]). Denote by $\mathcal{J}$ the set of multi-indices $\alpha = (\alpha_{k,l})_{k,l\ge1}$ of finite length $|\alpha| = \sum_{k,l=1}^{\infty}\alpha_{k,l}$, i.e.,
\[ \mathcal{J} = \big\{\alpha = (\alpha_{k,l},\ k,l \ge 1),\ \alpha_{k,l} \in \{0,1,2,\ldots\},\ |\alpha| < \infty\big\}. \]

Here $k$ indexes the Wiener processes and $l$ the Gaussian random variables approximating each Wiener process, as will be shown shortly. We represent the solution of (3.3.23)–(3.3.24) as
\[ u(t,x) = \sum_{\alpha\in\mathcal{J}}\frac{1}{\sqrt{\alpha!}}\varphi_\alpha(t,x)\xi_\alpha, \quad (3.3.14) \]
where $\{\xi_\alpha\}$ is a complete orthonormal system (CONS) in $L^2(\Omega,\mathcal{F}_t,P)$ (the Cameron-Martin basis) and $\alpha! = \prod_{k,l}(\alpha_{k,l}!)$. To obtain the coefficients $\varphi_\alpha(t,x)$,

we rewrite the SPDE (3.3.23) in the following form using the Ito-Wick product:
\[ du(t,x) = \big[Lu(t,x) + f(x)\big]\,dt + \sum_{k\ge1}\big[M_ku(t,x) + g_k(x)\big]\diamond\dot{W}_k\,dt, \quad (t,x) \in (0,T]\times D, \]
\[ u(0,x) = u_0(x), \quad x \in D, \quad (3.3.15) \]
where $\dot{W}_k$ is formally the first-order derivative of $W_k$ in time, i.e., $\dot{W}_k = \frac{d}{dt}W_k$.

Then we substitute the representation (3.3.14) into (3.3.15), multiply both sides by $\xi_\alpha$, and take expectations. With the properties of the Ito-Wick product, $\xi_\alpha\diamond\xi_\beta = \sqrt{\frac{(\alpha+\beta)!}{\alpha!\,\beta!}}\,\xi_{\alpha+\beta}$ and



$\mathbb{E}[\xi_\alpha\xi_\beta] = \delta_{\alpha,\beta}$, we then obtain that $\varphi_\alpha$ satisfies the following system of equations (the propagator):
\[
\begin{aligned}
\frac{\partial\varphi_\alpha(s,x)}{\partial s} &= L\varphi_\alpha(s,x) + f(x)1_{\{|\alpha|=0\}} + \sum_{k,l}\alpha_{k,l}m_l(s)\big[M_k\varphi_{\alpha^-(k,l)}(s,x) + g_k(x)1_{\{|\alpha|=1\}}\big], \quad (3.3.16)\\
&\qquad 0 < s \le t,\ x \in D,\\
\varphi_\alpha(0,x) &= u_0(x)1_{\{|\alpha|=0\}}, \quad x \in D,
\end{aligned}
\]

where $\alpha^-(k,l)$ is the multi-index with components
\[ \big(\alpha^-(k,l)\big)_{i,j} = \begin{cases}\max(0,\alpha_{i,j}-1), & \text{if } i = k \text{ and } j = l,\\ \alpha_{i,j}, & \text{otherwise}.\end{cases} \quad (3.3.17) \]

Here we also use the spectral approximation of Brownian motion (2.2.1).

Remark 3.3.6 Since the Cameron-Martin basis is complete (see Theorem 2.3.6), a truncation of the WCE (3.3.14) provides a consistent (convergent) numerical method for SPDEs.

3.3.3 Solutions to SPDEs in explicit form

The first model problem is the stochastic advection-diffusion equation with periodic boundary conditions, written in Stratonovich form as
\[ du(t,x) = \epsilon u_{xx}(t,x)\,dt + \sigma u_x(t,x)\circ dW(t), \quad t > 0,\ x \in (0,2\pi), \quad (3.3.18) \]
\[ u(0,x) = \sin(x), \]
or in Ito form as
\[ du(t,x) = au_{xx}(t,x)\,dt + \sigma u_x(t,x)\,dW(t), \quad u(0,x) = \sin(x). \]
Here $W(t)$ is a standard one-dimensional Wiener process, $\sigma > 0$ and $\epsilon \ge 0$ are constants, and $a = \epsilon + \sigma^2/2$. The solution of (3.3.18) is
\[ u(t,x) = e^{-\epsilon t}\sin(x + \sigma W(t)), \quad (3.3.19) \]
and its first and second moments are
\[ \mathbb{E}[u(t,x)] = e^{-at}\sin(x), \qquad \mathbb{E}[u^2(t,x)] = e^{-2\epsilon t}\Big(\frac{1}{2} - \frac{1}{2}e^{-2\sigma^2 t}\cos(2x)\Big). \]

We note that for $\epsilon = 0$ the equation (3.3.18) becomes degenerate. The second model problem is the following Ito reaction-diffusion equation with periodic boundary conditions:
\[ du(t,x) = au_{xx}(t,x)\,dt + \sigma u(t,x)\,dW(t), \quad t > 0,\ x \in (0,2\pi), \quad (3.3.20) \]
\[ u(0,x) = \sin(x), \]



where $\sigma > 0$ and $a \ge 0$ are constants. Its solution is
\[ u(t,x) = \exp\Big(-\big(a + \frac{\sigma^2}{2}\big)t + \sigma W(t)\Big)\sin(x), \quad (3.3.21) \]
and its first and second moments are
\[ \mathbb{E}[u(t,x)] = e^{-at}\sin(x), \qquad \mathbb{E}[u^2(t,x)] = \exp\big(-(2a-\sigma^2)t\big)\sin^2(x). \]
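The moment formulas for (3.3.19) can be spot-checked by Monte Carlo sampling of $W(t)$ (a sketch with arbitrary parameters and a fixed seed; the sampling error is $O(1/\sqrt{M})$):

```python
import math, random

random.seed(7)
eps, sigma, t, x, M = 0.2, 0.8, 0.5, 1.3, 200_000
a = eps + sigma ** 2 / 2.0

# Sample the exact solution u(t,x) = e^{-eps t} sin(x + sigma W(t)), W(t) ~ N(0, t)
samples = [math.exp(-eps * t) * math.sin(x + sigma * random.gauss(0.0, math.sqrt(t)))
           for _ in range(M)]
mean = sum(samples) / M
second = sum(s * s for s in samples) / M

mean_exact = math.exp(-a * t) * math.sin(x)
second_exact = math.exp(-2 * eps * t) * (0.5 - 0.5 * math.exp(-2 * sigma ** 2 * t) * math.cos(2 * x))
print(abs(mean - mean_exact), abs(second - second_exact))  # both small, O(1/sqrt(M))
```

Because the exact solution is available in closed form, such a check isolates the sampling error from any discretization error, which is why these two problems are convenient benchmarks for the numerical methods discussed later.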

3.3.4 Linear stochastic advection-diffusion-reaction equations

Consider the following SPDE written in Ito form:
\[ du(t,x) = \big[Lu(t,x) + f(x)\big]\,dt + \sum_{k\ge1}\big[M_ku(t,x) + g_k(x)\big]\,dW_k(t), \quad (t,x) \in (0,T]\times D, \quad (3.3.22) \]
\[ u(0,x) = u_0(x), \quad x \in D, \quad (3.3.23) \]

where
\[ Lu(t,x) = \sum_{i,j=1}^{d}a^{ij}(x)D_iD_ju(t,x) + \sum_{i=1}^{d}b^{i}(x)D_iu(t,x) + c(x)u(t,x), \]
\[ M_ku(t,x) = \sum_{i=1}^{d}\sigma^{i,k}(x)D_iu(t,x) + h^{k}(x)u(t,x), \quad (3.3.24) \]
$D_i := \partial_{x_i}$, and $D$ is an open domain in $\mathbb{R}^d$. We assume that $D$ is either bounded with a regular boundary or that $D = \mathbb{R}^d$. In the former case we will consider periodic boundary conditions and in the latter the Cauchy problem. Let $(W(t),\mathcal{F}_t) = (\{W_k(t),\ k \ge 1\},\mathcal{F}_t)$ be a system of one-dimensional independent standard Wiener processes defined on a complete probability space $(\Omega,\mathcal{F},P)$, where $\mathcal{F}_t$, $0 \le t \le T$, is a filtration satisfying the usual hypotheses.

Remark 3.3.7 The problem (3.3.23) with (3.3.24) can be regarded as a problem driven by a cylindrical Wiener process. Consider a cylindrical Wiener process $W(t,x) = \sum_{k=1}^{\infty}\lambda_kW_k(t)e_k(x)$, where $\sum_{k=1}^{\infty}\lambda_k^2 < \infty$, $\{W_k(t)\}$ are independent Wiener processes, and $\{e_k(x)\}_{k=1}^{\infty}$ is a complete orthonormal system (CONS) in $L^2(D)$; see, e.g., [94, 408]. Thus, we can view (3.3.23)–(3.3.24) as SPDEs driven by this cylindrical Wiener process when $M_ku = e_k(x)Mu$ and $M$ is a first-order or zeroth-order differential operator.

3.3.5 Existence and uniqueness

We assume the coercivity condition: there exist a constant $\delta_L > 0$ and a real number $C_L$ such that for any $v \in H^1(D)$,
\[ \langle Lv,v\rangle + \frac{1}{2}\sum_{k\ge1}\|M_kv\|^2 + \delta_L\|v\|_{H^1}^2 \le C_L\|v\|^2, \quad (3.3.25) \]



where $\langle\cdot,\cdot\rangle$ is the duality between the Sobolev spaces $H^{-1}(D)$ and $H^1(D)$ associated with the inner product over $L^2(D)$, and $\|\cdot\|$ is the $L^2(D)$-norm. A necessary condition for (3.3.25) is that the coefficients satisfy
\[ \sum_{i,j=1}^{d}\Big(2a^{i,j}(x) - \sum_{k\ge1}\sigma^{i,k}(x)\sigma^{k,j}(x)\Big)y_iy_j \ge 2\delta_L|y|^2, \quad x \in D,\ y \in \mathbb{R}^d. \]

With these assumptions, we have a unique square-integrable (variational) solution of (3.3.23)–(3.3.24) if we also have the following conditions:

• the coefficients of the operators $L$ and $M_k$ in (3.3.24) are uniformly bounded and predictable for every $x \in D$, and the coefficients $a^{i,j}(x)$ are Lipschitz continuous;
• for $\phi \in H^1(D)$, $\sum_{k\ge1}\mathbb{E}[\|M_k\phi(t)\|_{L^2}^2] < \infty$;
• the initial condition $u_0(x) \in L^2(\Omega;L^2)$ is $\mathcal{F}_0$-measurable;
• $f(t,\omega)$ and $g_k(t,\omega)$ are adapted and
\[ \int_0^T\|f(t)\|_{H^{-1}}^2\,dt < \infty, \qquad \sum_{k\ge1}\int_0^T\|g_k(t)\|_{L^2}^2\,dt < \infty. \]

Then for each $\phi \in H^1$ (or a dense subset of $H^1$) and all $t \in [0,T]$, the adapted process $u(t)$ is a variational solution to (3.3.23)–(3.3.24). With the coercivity condition and $\|L\phi\|_{H^{-1}} \le C_0\|\phi\|_{H^1}$, there exists a unique solution $u \in L^2(\Omega,C((0,T));L^2(D))$ satisfying
\[
\mathbb{E}\Big[\sup_{0<t<T}\|u(t)\|_{L^2}^2\Big] + \frac{\delta_L}{2}\mathbb{E}\Big[\int_0^T\|u(t)\|_{H^1}^2\,dt\Big] \le C\,\mathbb{E}[\|u_0\|_{L^2}^2] + C\,\mathbb{E}\Big[\int_0^T\|f(t)\|_{H^{-1}}^2\,dt\Big] + C\sum_{k\ge1}\int_0^T\|g_k(t)\|_{L^2}^2\,dt.
\]
Here $C$ depends on $C_0$, $C_L$, $\delta_L$, and $T$; see, e.g., [318, 345] for proofs.

3.3.6 Conversion between Ito and Stratonovich formulation

In Stratonovich form, (3.3.23) and (3.3.24) are written as
\[ du(t,x) = \big[\tilde{L}u(t,x) + f(x)\big]\,dt + \sum_{k=1}^{q}\big[M_ku(t,x) + g_k(x)\big]\circ dW_k(t), \quad (t,x) \in (0,T]\times D, \]
\[ u(0,x) = u_0(x), \quad x \in D, \quad (3.3.26) \]
where $\tilde{L}u = Lu - \frac{1}{2}\sum_{1\le k\le q}M_k[M_ku + g_k]$.

Example 3.3.8 Consider the following one-dimensional equation for $(t,x) \in (0,T]\times(0,2\pi)$:
\[ du = \Big[\big(\epsilon + \frac{1}{2}\sigma^2\big)\partial_x^2u + \beta\sin(x)\partial_xu\Big]\,dt + \sigma\partial_xu\,dW(t), \quad (3.3.27) \]
where $W(t)$ is a standard scalar Brownian motion (Wiener process) and $\epsilon > 0$, $\beta$, $\sigma$ are constants.



In Stratonovich form, Equation (3.3.27) can be written as
\[ du = \big[\epsilon\partial_x^2u + \beta\sin(x)\partial_xu\big]\,dt + \sigma\partial_xu\circ dW(t). \quad (3.3.28) \]

The problems (3.3.23) and (3.3.24) are said to have commutative noises if
\[ M_kM_j = M_jM_k, \quad 1 \le k,j \le q, \quad (3.3.29) \]
and to have noncommutative noises otherwise. When $q = 1$, (3.3.29) is satisfied, and thus this is a special case of commutative noises. When the $M_k$ are zeroth-order operators ($\sigma^{i,k} = 0$), (3.3.29) is also satisfied and the problem has commutative noises. The definition is consistent with that of commutative and noncommutative noises for stochastic ordinary differential equations; see, e.g., [259, 358].

Example 3.3.9 Consider the following one-dimensional equation for $(t,x) \in (0,T]\times(0,2\pi)$:
\[ du = \Big[\big(\epsilon + \frac{1}{2}\sigma_1^2\cos^2(x)\big)\partial_x^2u + \big(\beta\sin(x) - \frac{1}{4}\sigma_1^2\sin(2x)\big)\partial_xu\Big]\,dt + \sigma_1\cos(x)\partial_xu\,dW_1(t) + \sigma_2u\,dW_2(t), \quad (3.3.30) \]

where $(W_1(t),W_2(t))$ is a standard two-dimensional Wiener process and $\epsilon > 0$, $\beta$, $\sigma_1$, $\sigma_2$ are constants. In Stratonovich form, Equation (3.3.30) is written as
\[ du = \big[\epsilon\partial_x^2u + \beta\sin(x)\partial_xu\big]\,dt + \sigma_1\cos(x)\partial_xu\circ dW_1(t) + \sigma_2u\circ dW_2(t). \quad (3.3.31) \]
The problem has commutative noises (3.3.29):
\[ \big(\sigma_1\cos(x)\partial_x\big)\big(\sigma_2\,\mathrm{Id}\big) = \big(\sigma_2\,\mathrm{Id}\big)\big(\sigma_1\cos(x)\partial_x\big) = \sigma_1\sigma_2\cos(x)\partial_x. \]
Here $\mathrm{Id}$ is the identity operator.

Example 3.3.10 Consider the following one-dimensional equation for $(t,x) \in (0,T]\times(0,2\pi)$:
\[ du = \Big[\big(\epsilon + \frac{1}{2}\sigma_1^2\big)\partial_x^2u + \beta\sin(x)\partial_xu + \frac{1}{2}\sigma_2^2\cos^2(x)u\Big]\,dt + \sigma_1\partial_xu\,dW_1(t) + \sigma_2\cos(x)u\,dW_2(t), \quad (3.3.32) \]
where $(W_1(t),W_2(t))$ is a standard two-dimensional Wiener process and $\epsilon > 0$, $\beta$, $\sigma_1$, $\sigma_2$ are constants.

In Stratonovich form, Equation (3.3.32) is written as
\[ du = \big[\epsilon\partial_x^2u + \beta\sin(x)\partial_xu\big]\,dt + \sigma_1\partial_xu\circ dW_1(t) + \sigma_2\cos(x)u\circ dW_2(t). \quad (3.3.33) \]
The problem has noncommutative noises, as the coefficients do not satisfy (3.3.29).
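The two commutativity claims can be tested numerically by applying the operators on a periodic grid with central differences (a sketch; the grid and test function are arbitrary). For Example 3.3.9, $M_1 = \sigma_1\cos(x)\partial_x$ and $M_2 = \sigma_2\,\mathrm{Id}$ commute up to roundoff; for Example 3.3.10, the commutator of $M_1 = \sigma_1\partial_x$ and $M_2 = \sigma_2\cos(x)\,\mathrm{Id}$ equals $-\sigma_1\sigma_2\sin(x)u \neq 0$:

```python
import math

N = 256
h = 2.0 * math.pi / N
xs = [i * h for i in range(N)]

def dx(u):
    """Central difference with periodic wrap-around."""
    return [(u[(i + 1) % N] - u[(i - 1) % N]) / (2.0 * h) for i in range(N)]

def commutator_norm(M1, M2, u):
    """max_x |(M1 M2 - M2 M1) u|(x) on the grid."""
    return max(abs(a - b) for a, b in zip(M1(M2(u)), M2(M1(u))))

s1, s2 = 0.7, 1.1
u = [math.sin(2.0 * xv) for xv in xs]

# Example 3.3.9: M1 = s1 cos(x) d/dx, M2 = s2 Id -> commute
M1a = lambda w: [s1 * math.cos(xv) * d for xv, d in zip(xs, dx(w))]
M2a = lambda w: [s2 * v for v in w]
# Example 3.3.10: M1 = s1 d/dx, M2 = s2 cos(x) Id -> commutator is -s1 s2 sin(x) u
M1b = lambda w: [s1 * d for d in dx(w)]
M2b = lambda w: [s2 * math.cos(xv) * v for xv, v in zip(xs, w)]

print(commutator_norm(M1a, M2a, u))  # ~0 (roundoff only)
print(commutator_norm(M1b, M2b, u))  # O(1), matching max |s1 s2 sin(x) u(x)|
```

The first norm is at machine precision because a constant multiple commutes with any linear operator; the second is order one, confirming the noncommutativity.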



3.4 Numerical methods for SPDEs

In this section, we briefly review numerical methods for SPDEs, and broadly classify the numerical methods in the literature into three categories:

• Direct semi-discretization methods. In this category, we usually discretize the underlying SPDEs in time and/or in space, applying classical techniques from time-discretization methods for stochastic ordinary differential equations (SODEs) and/or from spatial discretization methods for partial differential equations (PDEs).
• Wong-Zakai approximation. In this category, we first discretize the space-time noise before any discretization in time and space, and thus we need further spatial-temporal discretizations.
• Preprocessing methods. In this category, we first transform the underlying SPDE into some equivalent form before we discretize.

We start by considering the following SPDE over the physical domain $D \subseteq \mathbb{R}^d$:
\[ dX = [AX + f(X)]\,dt + g(X)\,dW_Q, \quad (3.4.1) \]
where the $Q$-Wiener process $W_Q$ is defined in (1.2.3). The physical space is one-dimensional, i.e., $d = 1$, unless otherwise stated. When $D$ is bounded, we consider periodic boundary conditions (with further requirements on the domain) or Dirichlet boundary conditions.

The leading operator $A$ can be a second-order or fourth-order differential operator, which is positive definite. The nonlinear functions $f$, $g$ are usually Lipschitz continuous. The problem (3.4.1) is endowed either with only initial conditions in the whole space ($D = \mathbb{R}^d$) or with initial and boundary conditions in a bounded domain ($D \subsetneq \mathbb{R}^d$).

Let us introduce the stability and convergence of numerical schemes. We denote by $\delta t_k = t_{k+1} - t_k$ ($k = 1,2,\ldots,K$, $\sum_{k=1}^{K}\delta t_k = T$) the time step sizes. Sometimes we simply use the time step size $\delta t$ when all $\delta t_k$'s are equal. We also denote by $N > 0$ the number of orthogonal modes in spectral methods, or the number of discretization steps in space ($Nh = |D|$, where $|D|$ is the length of the interval $D \subset \mathbb{R}^d$ when $d = 1$) for finite difference or finite element methods. We denote a numerical solution to (3.4.1) by $X_{N,K}$.

Let $H$ be a separable Hilbert space (a Hilbert space with a countable basis) with corresponding norm $\|\cdot\|_H$. We usually take $H = L^2(D)$.

Definition 3.4.1 (Convergence) Assume that $X_{N,K}$ is a numerical solution to (3.4.1) and $X(x,T)$ is a solution to (3.4.1) at time $T$.

• Mean-square convergence (strong convergence). If there exists a constant $C$ independent of $h$ and $\delta t$ such that
\[ \mathbb{E}\big[\|X_{N,K} - X(\cdot,T)\|_{L^2(D)}^2\big] \le C\big(h^{2p_1} + (\delta t)^{2p_2}\big), \quad p_1, p_2 > 0, \quad (3.4.2) \]



then the numerical solution is convergent in the mean-square sense to the solution to (3.4.1). The mean-square convergence order in physical space is $p_1$ and the convergence order in time is $p_2$.

• Almost sure convergence (pathwise convergence). If there is a finite random variable $C(\omega) > 0$ independent of $h$ and $\delta t$ such that
\[ \|X_{N,K} - X(\cdot,T)\|_{L^2(D)} \le C(\omega)\big(h^{p_1} + (\delta t)^{p_2}\big), \quad (3.4.3) \]
then the numerical solution is convergent almost surely to the solution to (3.4.1).

• Weak convergence. If there exists a constant $C$ independent of $h$ and $\delta t$ such that
\[ \big\|\mathbb{E}[\phi(X_{N,K})] - \mathbb{E}[\phi(X(\cdot,T))]\big\|_{L^2(D)} \le C\big(h^{p_1} + (\delta t)^{p_2}\big), \quad (3.4.4) \]
then the numerical solution is weakly convergent to the solution to (3.4.1).

We say the convergence order (in the mean-square, almost sure, or weak sense) in physical space is $p_1$ and the convergence order in time is $p_2$.

Remark 3.4.2 Here we do not specify in which sense the solution is understood. The definition applies equally to strong solutions, variational solutions, and mild solutions of SPDEs.

We do not consider the effect of truncating the infinite-dimensional process $W_Q$. In general, the convergence of the truncated finite-dimensional process to $W_Q$ depends on the decay rate of the $q_i$ in (1.2.3) as well as on the smoothing effect of the inverse of the leading operator $A$.

The following zero-stability is concerned with whether the numerical solution can be controlled by its initial values.

Definition 3.4.3 (Zero-Stability) Assume that X_{N,K} is a numerical solution to (3.4.1) and X(x, T) is a solution to (3.4.1) at time T.

• Mean-square zero-stability. If there exists a constant C independent of N and K such that

E[‖X_{N,K}‖²] ≤ C max_{0≤k≤m} E[‖X_{N,k}‖^p], for some nonnegative integers m, p, (3.4.5)

then the numerical solution is stable in the mean-square sense.

• Almost sure zero-stability. If there is a finite random variable C(ω) > 0 independent of N and K such that

‖X_{N,K}‖ ≤ C(ω) max_{0≤k≤m} ‖X_{N,k}‖, for some nonnegative integer m, (3.4.6)

then the numerical solution is stable almost surely.


Remark 3.4.4 In most cases, we use m = 1, which is appropriate for one-step numerical methods. The case m ≥ 2 is for m-step numerical methods, where the first m steps cannot be obtained from the m-step method itself but can be obtained from some other numerical methods (usually one-step methods) with smaller time step sizes.

We can also define linear stability, which is concerned with the asymptotic behavior of numerical solutions as K goes to infinity while the time step size δt is fixed. Linear stability for evolutionary SPDEs is a straightforward extension of linear stability for SODEs, and can also be defined in the mean-square, almost sure, or weak sense. The concern for linear stability often leads to the Courant-Friedrichs-Lewy condition (often abbreviated as CFL condition): the mesh size in time has to be proportional to a certain power (depending on the order of the leading operator A) of the mesh size in space. This is similar to the linear stability of PDEs. An example of linear stability for a linear stochastic advection-diffusion equation is presented in Chapter 3.4.5.

3.4.1 Direct semi-discretization methods for parabolic SPDEs

The time-discretization methods for (3.4.1) can be seen as a straightforward application of numerical methods for SODEs, where increments of Brownian motions are used. After performing a truncation in physical space, we obtain a system of finite dimensional SODEs, and subsequently we can apply standard numerical methods for SODEs, e.g., those from [259, 354, 358]. It is very convenient to simply extend the methods of numerical PDEs and SODEs to solve SPDEs, and one can select the optimal numerical methods for the underlying SPDEs by carefully analyzing the characteristics of the related PDEs and SODEs.

However, it is not possible to derive high-order schemes with direct time discretization methods, as the solutions to SPDEs have very low regularity. For example, for the heat equation with additive space-time white noise in one dimension (3.3.1), see Exercise 3.6.11, the sample paths of the solution are Hölder continuous with exponent 1/4 − ε (ε > 0 arbitrarily small) in time and Hölder continuous with exponent 1/2 − ε in space.

Second-order equations

For finite dimensional noise, we can directly apply the time-discretization methods for SODEs to SPDEs, as solutions are usually smooth in space. Ref. [167] considered Euler and other explicit schemes for a scalar Wiener process, and Ref. [261] further considered linear-implicit schemes in time under the same problem setting. Specifically, both papers considered (3.4.1) with W^Q = W(t) being a one-dimensional Brownian motion. After discretizing in physical space, we obtain a system of SODEs:

dX_N = [A_N X_N + f(X_N)] dt + g(X_N) dW, (3.4.7)


where X_N can be derived from finite difference schemes, finite element schemes, spectral Galerkin/collocation schemes, finite volume schemes, and many other schemes for discretization in space. The explicit Euler scheme for (3.4.7) [167] is

X^{k+1}_N = X^k_N + [A^k_N X^k_N + f(X^k_N)] δt_k + g(X^k_N) ΔW_k, (3.4.8)

where the δt_k's are the time step sizes and X^k_N is an approximation of X_N at t_k = Σ_{i=1}^{k} δt_i. In [167], the first-order Milstein scheme was also applied. However, for explicit schemes, the CFL condition requires small time step sizes even when the mesh size in physical space is relatively large. When A = Δ, we need δt N² to be less than some constant which depends on the size of the domain. To avoid such a severe restriction, Ref. [261] applied the drift-implicit (linear-implicit) Euler scheme

X^{k+1}_N = X^k_N + [A^{k+1}_N X^{k+1}_N + f(X^k_N)] δt_k + g(X^k_N) ΔW_k, (3.4.9)

which does not require a severe CFL condition when g and f are Lipschitz continuous. However, the same conclusion does not hold when the coefficient of the noise involves first-order derivatives of the solution X. Recall that we do not have such an issue for deterministic PDEs [178]. See, e.g., Chapter 3.4.5 for such an example.
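As an illustrative sketch of (3.4.8) and (3.4.9) (not taken from [167, 261]): assume A_N is the standard finite difference Laplacian on (0, 1) with zero Dirichlet boundary conditions, W is a scalar Brownian motion, and f, g are illustrative Lipschitz functions. The time step below satisfies the CFL restriction (δt/δx² = 1/4) for the explicit scheme, while the drift-implicit scheme only requires solving a fixed linear system per step.

```python
import numpy as np

rng = np.random.default_rng(2)

# Finite difference semi-discretization (3.4.7) on (0,1), zero Dirichlet BCs.
M = 49                              # interior grid points
dx = 1.0 / (M + 1)
A = (np.diag(-2.0 * np.ones(M)) + np.diag(np.ones(M - 1), 1)
     + np.diag(np.ones(M - 1), -1)) / dx**2        # A_N: discrete Laplacian

f = lambda X: -X                    # illustrative Lipschitz drift
g = lambda X: 0.5 * X               # illustrative noise coefficient, scalar W

T, N = 0.1, 1000
dt = T / N                          # dt/dx^2 = 0.25: explicit scheme is stable

x = np.linspace(dx, 1.0 - dx, M)
Xe = np.sin(np.pi * x)              # explicit Euler iterate, scheme (3.4.8)
Xi = Xe.copy()                      # drift-implicit Euler iterate, scheme (3.4.9)
B = np.eye(M) - dt * A              # (I - dt*A_N) X^{k+1} = rhs, cf. (3.4.9)
for k in range(N):
    dW = np.sqrt(dt) * rng.standard_normal()       # same noise for both schemes
    Xe = Xe + (A @ Xe + f(Xe)) * dt + g(Xe) * dW
    Xi = np.linalg.solve(B, Xi + f(Xi) * dt + g(Xi) * dW)
```

For this linear example both iterates stay close; if δt/δx² is taken large, the explicit iterate blows up while the drift-implicit one remains stable, which is the point of (3.4.9).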

In a similar setting, Ref. [139] proposed the Milstein scheme for the Kolmogorov-Petrovskii-Piskunov (KPP) equation with multiplicative noise, using a finite difference scheme in space. See [158, 366, 404, 422, 426] for more numerical results.

For infinite dimensional noise but with fast decaying q_i in (1.2.3), Hausenblas (2003) [210] considered the mean-square convergence of the linear-implicit and explicit Euler schemes and the Crank-Nicolson scheme in time for (3.4.1) with certain smooth f and g, and proved half-order convergence for these schemes. Here, the Crank-Nicolson scheme means the linear-implicit Crank-Nicolson scheme, where the nonlinear terms and the coefficients of the noise are treated explicitly, as in the linear-implicit Euler scheme (3.4.9):

X^{k+1}_N = X^k_N + [A^{k+1/2}_N X^{k+1/2}_N + f(X^k_N)] δt + g(X^k_N) ΔW^N_k(x), (3.4.10)

where ΔW^N_k(x) is a discretization of dW^Q(t, x) and

A^{k+1/2}_N X^{k+1/2}_N = (A^{k+1}_N X^{k+1}_N + A^k_N X^k_N) / 2.

Here ΔW^N_k(x) can be W^Q(t_{k+1}, x_j) − W^Q(t_k, x_j) in a finite difference scheme or P_h W^Q(t_{k+1}, x) − P_h W^Q(t_k, x) in a finite element scheme, where P_h is an L²-projection onto a finite dimensional solution space.

The author remarked that for the Crank-Nicolson scheme the convergence order can be improved to one for linear equations with additive noise, as in the case of SODEs. Also, Hausenblas (2003) [211] proved first-order weak convergence of these numerical schemes for (3.4.1) with additive noise. Millet & Morien [350] considered (3.4.1) with space-time noise, where q_i and e_i are the eigenvalues and eigenfunctions of a specific isotropic kernel. Hausenblas (2002) [209] considered a slightly different equation

dX = [AX + f(t, X)] dt + Σ_j g_j(t, X) dW_j(t), (3.4.11)

where Σ_j ‖g_j(t, ·)‖²_{H²(D)} < ∞ and some boundedness of f and g is imposed. Half-order convergence in time is proved for the linear-implicit and explicit Euler schemes and the Crank-Nicolson scheme.

However, if space-time white noise is considered (q_i = 1 in (1.2.3)), the regularity in time is shown to be less than 1/4; see Exercise 3.6.11 for a linear equation. Thus, the optimal order of convergence in time is 1/4 − ε if only increments of Brownian motion (with equispaced time steps) are used, see, e.g., [6, 95] for the case of linear equations.

Gyongy and Nualart introduced an implicit numerical scheme in time for the SPDE (3.4.1) with additive noise and proved convergence in probability without an order in [196], and convergence with mean-square order 1/8 − ε in time for (3.4.1) in [197]. Gyongy [186, 188] also applied finite differences in space to the SPDE (3.4.1) and then used several temporal implicit and explicit schemes, including the linear-implicit Euler scheme. The author showed that these schemes converge with order 1/2 − ε in space and with order 1/4 − ε in time for multiplicative noise with Lipschitz nonlinear terms, similar to the linear equations in [6, 95]. Refs. [371, 372] proposed an implicit Euler scheme on nonuniform time grids for (3.4.1) with f = 0 to reduce the computational cost; the upper bound estimate of the mean-square errors in terms of the computational cost was presented in [371], while the lower bound was presented in [372].

As we mentioned before, the solution to (3.4.1) is of low regularity and thus it is not possible to derive high-order schemes with direct time discretization methods. See, e.g., [464] for a discussion of first-order schemes (Milstein type schemes) for (3.4.1) and also [249] for a review of numerical approximation of (3.4.1) along this line.

For spatial semi-discretization methods for solving SPDEs (including but not limited to (3.4.1)), see finite difference methods, e.g., [6, 313, 420, 495]; finite element methods, e.g., [6, 22, 152, 464, 469, 491, 494]; finite volume methods for hyperbolic problems, e.g., [274, 367]; and spectral methods, e.g., [65, 78, 242, 302]. See also [149, 193, 206, 207] for acceleration schemes in space using Richardson's extrapolation method. As spatial discretizations are classical topics in numerical methods for PDEs, we refer the readers to standard textbooks, such as [178] for finite difference schemes, [216] for spectral methods, and [296] for finite volume methods. In most cases, especially for linear problems (or essentially linear problems, e.g., a linear leading operator with a Lipschitz nonlinearity), one can simply apply the classical spatial discretization techniques. One caveat is for solutions of extremely low regularity, see, e.g., numerical methods for Burgers equations with additive space-time white noise in Chapter 3.4.4, where different spatial discretization methods indeed make a significant difference.

Fourth-order equations

Now we consider fourth-order equations, i.e., A is a fourth-order differential operator; these have been investigated in [265–267, 271, 291], etc. As the kernels associated with fourth-order operators can have more smoothing effects than those associated with second-order differential operators, we can expect better convergence in space and also in time.

Ref. [265] considered fully discrete finite element approximations for a fourth-order linear stochastic parabolic equation with additive space-time white noise in one space dimension, where strong convergence with order 3/8 in time and 3/2 − ε in space was proved. Ref. [271] proved the convergence of finite element approximations of the nonlinear stochastic Cahn-Hilliard-Cook equation driven by additive space-time color noise

dX = [−Δ²X + Δf(X)] dt + dW^Q. (3.4.12)

Ref. [71] presented some numerical results of a semi-implicit backward differentiation formula in time for the nonlinear Cahn-Hilliard equation, while no convergence analysis was given. For the linearized Cahn-Hilliard-Cook equation (f = 0) with additive space-time color noise, Ref. [291] applied a standard finite element method and an implicit Euler scheme in time and obtained quasi-optimal convergence order in space. Kossioris and Zouraris considered an implicit Euler scheme in time and finite elements in space for the linear Cahn-Hilliard equation with additive space-time white noise in [267], and the same equation but with even rougher noise, namely the first-order spatial derivative of the space-time white noise, in [266]. In [267], they proved that the strong convergence order is (4 − d)/8 in time and (4 − d)/2 − ε in space for d = 2, 3.

3.4.2 Wong-Zakai approximation for parabolic SPDEs

In this approach, we first truncate the Brownian motion with a smooth process of bounded variation, yielding a PDE with finite dimensional noise. Thus, after truncating the Brownian motion, we have to discretize a deterministic PDE both in time and in space to obtain fully discrete schemes.

The most popular approximation of Brownian motion in this approach is the piecewise linear approximation (2.2.4), see, e.g., [481]. Piecewise linear approximation for SPDEs has been well studied in theory, see, e.g., [142, 180, 227, 442, 454, 455, 457] (for mean-square convergence), [46, 181, 200, 201] (for pathwise convergence), and [18, 30, 72, 79, 182–185, 199, 351, 456] (for the support theorem, i.e., the relation between the support of the distribution of the solution and that of its Wong-Zakai approximation). For mean-square convergence of the Wong-Zakai approximation for (3.4.13) with M_k having no differential operator, Ref. [227] proved half-order convergence, see also [53, 54]. For pathwise convergence, Ref. [200] proved a 1/4 − ε order of convergence, and Ref. [201] proved a 1/2 − ε order of convergence when M_k is a first-order differential operator.
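For concreteness, the polygonal (piecewise linear) approximation W^h interpolates the Brownian path linearly between its values at the grid points, so its time derivative is the piecewise constant process (W(t_{n+1}) − W(t_n))/δt. The sketch below builds it for a scalar Brownian motion with illustrative parameters.

```python
import numpy as np

rng = np.random.default_rng(4)

T, n_coarse, refine = 1.0, 16, 64
dt = T / n_coarse
t_coarse = np.linspace(0.0, T, n_coarse + 1)
# Brownian path sampled at the coarse grid points
W = np.concatenate([[0.0], np.cumsum(np.sqrt(dt) * rng.standard_normal(n_coarse))])

# Piecewise linear (polygonal) approximation W^h evaluated on a fine grid
t_fine = np.linspace(0.0, T, n_coarse * refine + 1)
W_h = np.interp(t_fine, t_coarse, W)

# Its derivative, the smooth-noise surrogate for dW/dt, is piecewise constant:
dWh_dt = np.diff(W) / dt
```

W^h has bounded variation and agrees with W at the grid points, so the resulting Wong-Zakai equation is a PDE driven by a (random) piecewise constant forcing in time.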

All the aforementioned papers were on the convergence of the Wong-Zakai approximation itself, i.e., without any further discretization of the resulting PDEs. Numerical solutions of SPDEs based on the Wong-Zakai approximation have not yet been well explored. Even for SODEs, Ref. [307] seems to be the first attempt to obtain numerical solutions from the Wong-Zakai approximation, where the authors considered a stiff ODE solver instead of presenting new discretization schemes.

In this book, we will derive fully discrete schemes based on Wong-Zakai approximations and show the relationships between the derived schemes and the classical schemes (e.g., those in [259, 354, 358]); see Chapter 4 for details.

3.4.3 Preprocessing methods for parabolic SPDEs

In this type of method, the underlying equation is first transformed into an equivalent form, which may bring some benefits in computation, and is then dealt with by time discretization techniques. For example, splitting techniques split the underlying equation into a stochastic part and a deterministic part, and save computational cost if either part can be efficiently solved, numerically or even analytically. In the splitting methods, we also have the freedom to use different schemes for the different parts.

We will only review two methods in this class: splitting techniques and exponential integrator methods. In addition to these two methods, there are other preprocessing methods such as methods of averaging-over-characteristics, e.g., [361, 396, 428]; particle methods, e.g., [87–91, 285]; algebraic methods, e.g., [405]; filtering on space-time noise [310]; etc.

Splitting methods

Splitting methods are also known as fractional step methods, see, e.g., [162], and sometimes as predictor-corrector methods, see, e.g., [131]. They have been widely used for their computational convenience, see, e.g., [33, 85, 86, 189, 191, 192, 244, 293, 304, 305]. Typically, the splitting is formulated by the following Lie-Trotter splitting, which splits the underlying problem, say (3.4.13), into two parts: a 'stochastic part' (3.4.14a) and a 'deterministic part' (3.4.14b). Consider the following Cauchy problem (see, e.g., [131, 190, 191])

du(t, x) = Lu(t, x) dt + Σ_{k=1}^{d1} M_k u(t, x) ∘ dW_k, (t, x) ∈ (0, T] × D, (3.4.13)


where L is a linear second-order differential operator, M_k is a linear differential operator of up to first order, and D is the whole space R^d. The typical Lie-Trotter splitting scheme for (3.4.13) reads, over the time interval (t_n, t_{n+1}], in integral form

u_n(t, x) = u_n(t_n, x) + ∫_{t_n}^{t} Σ_{k=1}^{d1} M_k u_n(s, x) ∘ dW_k(s), t ∈ (t_n, t_{n+1}], (3.4.14a)

u_n(t, x) = u_n(t_{n+1}) + ∫_{t_n}^{t} L u_n(s, x) ds, t ∈ (t_n, t_{n+1}]. (3.4.14b)
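A minimal sketch of the Lie-Trotter splitting (3.4.14), assuming d1 = 1, M_1 u = u (a zeroth-order operator, so the stochastic substep can be solved exactly pathwise), and L = ν∂²_x on a periodic domain (so the deterministic substep is solved exactly in Fourier space); ν and the discretization parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

M = 64                                   # periodic grid on [0, 2*pi)
x = 2.0 * np.pi * np.arange(M) / M
k = np.fft.fftfreq(M, d=1.0 / M)         # integer wavenumbers

nu, T, N = 0.5, 1.0, 100                 # illustrative viscosity, horizon, steps
dt = T / N

u = np.sin(x)                            # initial condition
for n in range(N):
    # 'stochastic part' (3.4.14a): du = u o dW, solved exactly: u -> u*exp(dW)
    dW = np.sqrt(dt) * rng.standard_normal()
    u = u * np.exp(dW)
    # 'deterministic part' (3.4.14b): du = nu*u_xx dt, exact in Fourier space
    u = np.fft.ifft(np.exp(-nu * k**2 * dt) * np.fft.fft(u)).real
```

Each substep is solved exactly here, so the only error is the splitting error; when M_k is a first-order operator, the stochastic substep is itself a stochastic transport problem and needs its own discretization.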

When M_k is a zeroth-order differential operator, Ref. [433] presented results for pathwise convergence with half order in time under the L²-norm in space when d1 = 1. Under similar settings to [433], Ref. [244] proved that a normalization of the numerical density in the Zakai equation in a splitting scheme is equivalent to solving the Kushner equation (the nonlinear SPDE for the normalized density, see, e.g., [286]) by a similar splitting scheme (first order in the mean-square sense). When M_k is a first-order differential operator, Ref. [131] proved half-order mean-square convergence in time under the L²-norm in space. Gyongy and Krylov managed to prove first-order mean-square convergence in time under higher-order Sobolev-Hilbert norms [191], and under an even stronger norm in space [190].

Other than finite dimensional noise, Refs. [32, 33] considered semilinear parabolic equations (3.4.1) with multiplicative space-time color noises. With the Lie-Trotter splitting, they established strong convergence of the splitting scheme and proved half-order mean-square convergence in time. Cox and van Neerven [85] obtained mean-square and pathwise convergence orders of Lie-Trotter splitting methods for Cauchy problems of linear stochastic parabolic equations with additive space-time noise. Beyond the problems (3.4.1) and (3.4.13), the Lie-Trotter splitting techniques have been applied to different problems, such as stochastic hyperbolic equations (e.g., [7, 27, 412]), rough partial differential equations (e.g., [138]), the stochastic Schrodinger equation (e.g., [47, 168, 304, 305, 338]), etc.

Integrating factor (exponential integrator) techniques

In this approach, we first write the underlying SPDE in mild form (via an integrating factor) and then combine different time-discretization methods to derive fully discrete schemes. It was first proposed in [309, 369] under the name of exponential Euler scheme and was further developed to derive higher-order schemes, see, e.g., [29, 246–250, 252].

In this approach, it is possible to derive high-order schemes in the strong sense since we may incorporate the dynamics of the underlying problems, as shown for ODEs with smooth random inputs in [257]. By formulating Equation (3.4.1) with additive noise in mild form, we have


X(t) = e^{At} X_0 + ∫_0^t e^{A(t−s)} f(X(s)) ds + ∫_0^t e^{A(t−s)} dW^Q(s), (3.4.15)

then we can derive an exponential Euler scheme [309, 369]:

X_{k+1} = e^{Ah} [X_k + h f(X_k) + W^Q(t_{k+1}) − W^Q(t_k)], (3.4.16)

or, as in [250, 369],

X_{k+1} = e^{Ah} X_k + A^{−1}(e^{Ah} − I) f(X_k) + ∫_{t_k}^{t_{k+1}} e^{A(t_{k+1}−s)} dW^Q(s), (3.4.17)

where t_k = kh, k = 0, · · · , N, and Nh = T.

In certain cases, the total computational cost for the exponential Euler scheme can be reduced when η_k = ∫_{t_k}^{t_{k+1}} e^{A(t_{k+1}−s)} dW^Q(s) is simulated as a whole instead of using increments of Brownian motion. For example, when Ae_i = −λ_i e_i, we observe that η_k solves the following equation

Y = ∫_{t_k}^{t_{k+1}} AY ds + Σ_{i=1}^{∞} ∫_{t_k}^{t_{k+1}} √q_i e_i dW_i(s), (3.4.18)

and thus η_k can be represented by

η_k = Σ_{i=1}^{∞} √γ_i e_i(x) ξ_{k,i},  ξ_{k,i} = (1/√γ_i) ∫_{t_k}^{t_{k+1}} e^{−λ_i(t_{k+1}−s)} dW_i(s),

γ_i = (q_i / (2λ_i)) (1 − exp(−2λ_i h)). (3.4.19)

In this way, we incorporate the interaction between the dynamics and the noise, and thus we can have first-order mean-square convergence [249, 250]. See [248, 258, 311, 362] for further discussion of additive noise. For multiplicative noise, a first-order scheme (Milstein scheme) has been derived under this approach [252], where commutativity conditions on the diffusion coefficients for equations with infinite dimensional noises were identified, and a one-and-a-half-order scheme in the mean-square sense has been derived in [29]. See also [4, 21, 23, 275, 312, 473] for further discussion of exponential integration schemes for SPDEs with multiplicative noises.
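A minimal sketch of the exponential Euler scheme (3.4.17) with the exact sampling (3.4.19), assuming A = ∂²_x on (0, 1) with zero Dirichlet conditions (Ae_i = −λ_i e_i with λ_i = (iπ)²), f = 0, and an illustrative spectrum q_i = i^{−2}; the solution is advanced mode by mode in the eigenbasis.

```python
import numpy as np

rng = np.random.default_rng(0)

I = 50                                  # number of retained eigenmodes
i = np.arange(1, I + 1)
lam = (i * np.pi) ** 2                  # eigenvalues of -A on (0,1), Dirichlet
q = i ** -2.0                           # illustrative decay of the noise spectrum

T, N = 1.0, 200
h = T / N

# gamma_i from (3.4.19): the variance of eta_k in mode i
gamma = q * (1.0 - np.exp(-2.0 * lam * h)) / (2.0 * lam)

X = np.zeros(I)                         # eigenmode coefficients of X_0 = 0
for k in range(N):
    xi = rng.standard_normal(I)         # xi_{k,i} in (3.4.19)
    # exponential Euler step (3.4.17) with f = 0: exact per mode
    X = np.exp(-lam * h) * X + np.sqrt(gamma) * xi
```

Because η_k is sampled with its exact distribution, each mode is simulated exactly here; with a nonzero f, the term A^{−1}(e^{Ah} − I) f(X_k) is diagonal in the same basis.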

3.4.4 What could go wrong? Examples of stochastic Burgers and Navier-Stokes equations

As a special class of parabolic SPDEs, stochastic Burgers and Navier-Stokes equations require more attention because of the strong interactions between the strong nonlinearity and the noises. Similar to the linear heat equation with additive noise, the convergence order for time-discretization of one-dimensional Burgers equations is no more than 1/4, see [400] for multiplicative space-time noise with convergence in probability, and [38] for additive space-time noise with pathwise convergence. The convergence order in space is less than 1/4, see [5] for additive space-time white noise with pathwise convergence, and [37] for additive space-time color noise with pathwise convergence.

Because of the strong nonlinearity, the discretization in space and in time may cause some effects, such as "a spatial version of the Ito-Stratonovich correction" [203, 205]. Hairer et al. considered finite difference schemes for the Burgers equation with additive space-time noise in [205]:

∂_t u = ν ∂²_x u + (∇G(u)) ∂_x u + σ Ẇ^Q, x ∈ [0, 2π]. (3.4.20)

If we only consider the discretization of the first-order differential operator, e.g.,

∂_t u^ε = ν ∂²_x u^ε + (∇G(u^ε)) ∂^ε_x u^ε + σ Ẇ^Q,  ∂^ε_x u(x) := (u(x + aε) − u(x − bε)) / ((a + b)ε),  a, b > 0, (3.4.21)

then it can be proved that this equation converges to (see [203])

∂_t v = ν ∂²_x v + (∇G(v)) ∂_x v − σ² ((a − b)/(a + b)) ΔG(v) + σ Ẇ^Q, x ∈ [0, 2π], (3.4.22)

if W^Q is space-time white noise; no correction term appears if W^Q is more regular than space-time white noise, e.g., white in time but correlated in space. The effects of some other standard discretizations in space, e.g., Galerkin methods, and of fully discrete schemes were also discussed in [203].

Now we consider the stochastic incompressible Navier-Stokes equations (1.2.4). When the noise is color noise in space, i.e., E[W^Q(x, t) W^Q(y, s)] = q(x, y) min(s, t) and q(x, x) is square-integrable over the physical domain, Ref. [44] showed the existence and strong convergence of the solutions of the fully discrete schemes in the two-dimensional case. Ref. [69] considered three semi-implicit Euler schemes in time and standard finite element methods in space for the two-dimensional (1.2.4) with periodic boundary conditions. They presented convergence of the solution in probability with order 1/4 in time, similar to the one-dimensional stochastic Burgers equation with additive noise. They also showed that for the corresponding Stokes problem, the fully discrete scheme converges in the strong sense with order one half in time and order one in physical space.

For (1.2.4) in a bounded domain with Dirichlet boundary conditions, Ref. [502] considered the backward Euler scheme and proved half-order strong convergence when the multiplicative noise is space-time color noise. Ref. [493] considered an implicit-explicit scheme and proved a convergence order depending on the regularity index of the initial condition. Ref. [120] considered finite element methods and a semi-implicit Euler scheme for the stochastic Navier-Stokes equation (1.2.4), and Ref. [121] considered similar fully discrete schemes for the stochastic Navier-Stokes equations introduced in [348]. Ref. [492] provided a posteriori error estimates for the stochastic Navier-Stokes equation. See [41] (recursive approximation), [128] (implicit scheme), [169] (Wong-Zakai approximation), [414, 496] (Galerkin approximation), and [122] (Wiener chaos expansion) for more discussion of numerical methods and, e.g., [108] for existence and uniqueness of (1.2.4). See also [175] for strong convergence of Fourier Galerkin methods for the hyperviscous Burgers equation, and some numerical results for the stochastic Burgers equation equipped with the Wick product [445].

Many other evolution equations have also been explored, such as stochastic KdV equations (see, e.g., [102, 105, 106, 110, 111, 215]), the Ginzburg-Landau equation (see, e.g., [302]), stochastic Schrodinger equations (see, e.g., [26, 99–101, 103, 104, 369]), stochastic age-dependent populations (see, e.g., [214]), etc. For steady stochastic partial differential equations, especially stochastic elliptic equations, see, e.g., [6, 12, 34, 64, 118, 140, 194, 469]. See further discussion in Chapter 10.

3.4.5 Stability and convergence of existing numerical methods

There are various aspects to be considered for numerical methods for SPDEs, e.g., the sense in which solutions exist, the sense of convergence, the sense of stability, etc. Here the solutions and numerical solutions to SPDEs are usually interpreted as mild solutions or as variational solutions. We focus on weak convergence and pathwise convergence in this subsection. For strong convergence, we refer to [276] for an optimal convergence order of finite element methods with a linear-implicit Euler scheme in time for (3.4.1); see also the aforementioned papers for strong convergence in different problem settings.

Weak convergence

Similar to the weak convergence of numerical methods for SODEs, the main tool for weak convergence is the Kolmogorov equation associated with the functional and the underlying SPDE [92, 94]. For linear equations, the Kolmogorov equation for SPDEs is sufficient to obtain optimal weak convergence, see, e.g., [112, 147, 421]. Ref. [421] considered weak convergence of the θ-method in time and a spectral method in physical space for the heat equation with additive space-time noise, and showed that the weak convergence order is twice that of strong convergence for a finite dimensional functional. Ref. [147] obtained a similar conclusion for more general functionals, and the restriction on the functionals was further removed in [112]. More recently, there have been more works following this approach [269, 270, 299, 402] for linear equations. For the linear Cahn-Hilliard equation with additive noise, Ref. [269] obtained a weak error of order h^{2β}|log(h)| for the semidiscrete schemes with linear finite elements, where h^β is the strong convergence order and β is determined by the q_i's and the smoothness of the initial condition. Ref. [270] provided the weak convergence order for the same problem but with further time discretization and proved that the weak convergence order is twice the strong convergence order.


For nonlinear equations, Malliavin calculus for SPDEs has also been used to obtain optimal weak convergence, see, e.g., [109, 211, 475]. Ref. [211] applied Malliavin calculus to a parabolic SPDE to obtain the weak convergence of the linear-implicit Euler and Crank-Nicolson schemes in time for additive noise, where first-order weak convergence (under certain conditions on the functional) is obtained. Ref. [213] showed that the order of weak convergence of the leap-frog scheme, both in space and time, is twice that of strong convergence for the wave equation with additive noise, as shown for heat equations, see, e.g., [109, 112, 147]. Ref. [109] established the weak convergence order for the semilinear heat equation with multiplicative space-time noise and showed that the weak convergence order in time is twice the strong convergence order. Ref. [475] obtained the weak convergence order of the linear-implicit Euler scheme in time for (3.4.1) with additive noise and reached similar conclusions. For exponential Euler schemes for SODEs, it was proved that the weak convergence order is one (see, e.g., [369]), which is the same as the mean-square convergence order.

For weak convergence of numerical methods for elliptic equations, we can use multivariate calculus to compute the derivatives with respect to the (random) parameters and Taylor's expansion, see, e.g., [73, 74] and also Chapter 10.

Pathwise convergence

There are two approaches to obtaining pathwise convergence. The first is via mean-square convergence. By the Borel-Cantelli lemma (see, e.g., [186]), it can be shown that the pathwise convergence order is the same as the mean-square convergence order (up to an arbitrarily small constant ε > 0). For example, Ref. [95] first deduced pathwise convergence of schemes from the mean-square convergence order established in [188]. Refs. [21, 23, 86, 289, 290] first obtained the mean-square convergence order and then documented the pathwise convergence. The second approach works without knowing the mean-square convergence. In [433], the authors required pathwise boundedness (uniform boundedness in the time step sizes) to obtain pathwise convergence with order 1/2 − ε. In [248], it was shown that it is crucial to establish the pathwise regularity of the solution in order to obtain a pathwise convergence order.

Finally, we note that there are some other senses of convergence, see, e.g., [19] for convergence in probability using several approximations of white noise.

Stability

Here we will not review the stability of numerical methods for SPDEs, as stability results usually accompany a convergence study. We refer to [464] for the stability of fully discrete schemes for (3.4.1). We also refer to the following two papers for a general framework on stability and convergence. Ref. [288] proposed a version of the Lax equivalence theorem for (3.4.1) with additive and multiplicative noise, where W^Q is replaced with a cadlag (right continuous with left limits) square-integrable martingale. Ref. [275] suggested a general framework for Galerkin methods for (3.4.1) and applied it to Milstein schemes.

It is known that the mean-square stability region of a numerical scheme in time for SPDEs with multiplicative noise is smaller than that of the same scheme for the corresponding PDEs, e.g., the Crank-Nicolson scheme for (3.4.1) with multiplicative noise [464], or the alternating direction explicit scheme for the heat equation with multiplicative noise [426].

To illustrate the different stability requirements of numerical PDEs and SPDEs, we summarize a recent work on the mean-square stability of the Milstein scheme for a one-dimensional advection-diffusion equation with multiplicative scalar noise [158, 404]. Ref. [404] analyzed the linear stability (in the sense proposed in [49]; see also [219] for SODEs) of the first-order σ-θ scheme. For a specific equation of the form (3.4.13) with periodic boundary conditions:

dv = −μ ∂_x v dt + (1/2) ∂²_x v dt − √ρ ∂_x v dW_t, 0 ≤ ρ < 1, x ∈ (0, 1), (3.4.23)

the σ-θ scheme reads, with time step size δt and space step size δx,

V_{n+1} = V_n − (θ/2) ((δt/δx) μ D_1 − (δt/δx²) D_2) V_{n+1} − ((1 − θ)/2) ((δt/δx) μ D_1 − (δt/δx²) D_2) V_n  ('deterministic part')

− (ρ/2) (δt/δx²) [σ D_2 V_{n+1} + (1 − σ) D_2 V_n]  ('correction term due to stochastic part')

− (√ρ/2) (√δt/δx) D_1 V_n ξ_n + (ρ/2) (δt/δx²) D_2 V_n ξ_n²,  ('stochastic part')  (3.4.24)

where the ξ_n's are i.i.d. standard Gaussian random variables, θ ∈ [0, 1], and D_1 and D_2 are the (undivided) first and second central difference operators acting on the spatial index j:

(D_1 V)_j = V_{j+1} − V_{j−1},  (D_2 V)_j = V_{j+1} − 2V_j + V_{j−1}.

It was shown, by Fourier stability analysis, that the scheme is mean-square stable when

(δt/δx²) [1 − 2(θ − ρσ − ρ²)] < 1. (3.4.25)

In particular, when σ = −1 and θ > 1/2 the bracket in (3.4.25) is nonpositive and the scheme is unconditionally stable. When σ = 0 and θ = 0, the scheme becomes the Milstein discretization in time in conjunction with finite difference schemes in physical space introduced in [158], which requires μ²δt ≤ 1 − ρ in addition to (3.4.25). In Table 3.1, we summarize the CFL conditions for Equation (3.4.23) with various ρ and different discretization parameters θ and σ.



3.4.6 Summary of numerical SPDEs

For SPDEs driven by space-time noise, the solutions are usually of low regularity, especially when the noise is space-time white noise. Hence, it is difficult to obtain efficient high-order schemes.

Generally speaking, numerical methods for SPDEs fall into three classes: direct discretization, where numerical methods for PDEs and SODEs are applied directly; Wong-Zakai approximation, where the noise is discretized before any space-time discretization; and preprocessing methods, where the SPDEs are first reformulated, equivalently or approximately, before discretization.

Table 3.1. Stability region of the scheme (3.4.24) for Equation (3.4.23).

ρ           θ                          σ     CFL condition                      Scheme
0           0                          –     δt/δx² < 1                         Explicit
0           (0, 1/2)                   –     (δt/δx²)(1 − 2θ) < 1               Implicit
0           [1/2, 1]                   –     –                                  Implicit
(0, 1)      0                          0     (δt/δx²)(1 + 2ρ²) < 1              Explicit
(0, 1)      (0, min(1/2 + ρ², 1))      0     (δt/δx²)[1 − 2(θ − ρ²)] < 1        Implicit
(0, √2/2]   (min(1/2 + ρ², 1), 1)      0     –                                  Implicit
(0, 1)      [1/2 − ρ + ρ², 1]          −1    –                                  Implicit
[1, ∞)      –                          –     –                                  Not mean-square stable

The convergence and stability theory of numerical SODEs can be extended to numerical SPDEs. Different senses of convergence and stability can be considered, such as mean-square convergence/stability, almost sure convergence/stability, and weak convergence/stability. We do not discuss convergence in probability in detail here or elsewhere in this book. Numerical SPDEs usually require stability conditions at least as restrictive as those for the corresponding PDEs (when the noises vanish). For multiplicative noises, numerical SPDEs usually impose more restrictive CFL conditions than the corresponding PDEs; see Chapter 3.4.5.

Because of the low regularity of solutions to SPDEs, it is helpful to exploit specific properties of the underlying SPDEs and preprocessing techniques to derive higher-order schemes while keeping the computational cost low. For example, we can use the exponential Euler scheme (3.4.17) with (3.4.19) when the underlying SPDEs are driven by additive noise and their leading differential operators are independent of randomness and time. When SPDEs (with multiplicative noises) have commutative noises (see, e.g., (3.3.29) for a definition), we can use the Milstein scheme (first-order strong convergence, see, e.g., [252, 276, 366]) while sampling only increments of Brownian motions.



Another issue for numerical methods of SPDEs is reducing their computational cost in high-dimensional random space, as the noises involved are usually infinite-dimensional stochastic processes whose truncations converge very slowly. This is the case even when high-order schemes like (3.4.19) can be used. Efficient infinite-dimensional integration methods should be employed to obtain the desired statistics at reasonable computational cost. See Chapter 2.5 for a brief review of numerical integration methods in random space.

SPDEs are usually solved with Monte Carlo methods, so many similar deterministic equations have to be solved. In some special cases, however, SPDEs can be solved in a very efficient way. For example, for some SPDEs with periodic boundary conditions, we can transform the equation into a deterministic one, which can then be solved once with deterministic solvers; see Appendix B.

3.5 Summary and bibliographic notes

We have presented some basic aspects of SODEs and SPDEs and of numerical methods for stochastic differential equations driven by Gaussian white noises. Solutions to SODEs and SPDEs are usually obtained numerically. The convergence and stability theory of numerical methods for SODEs and SPDEs has been presented. Commutativity conditions on the coefficients of the noises may significantly reduce the computational cost.

We summarize the main points in this chapter.

• For SODEs, strong solutions and solution methods for analytical solutions are introduced in Chapter 3.1. Several senses of solutions to SPDEs are presented in Chapter 3.3.

• Conversion between the Ito and Stratonovich formulations of SODEs and SPDEs is presented.

• For both SODEs and SPDEs, numerical methods are often used to obtain approximate solutions.

• Strong and weak convergence for numerical SODEs and SPDEs are introduced. Schemes of strong convergence for numerical SODEs are derived and presented in Chapter 3.2. A brief review of numerical schemes for SPDEs, focusing on strong and weak convergence, is presented in Chapter 3.4.

• Linear stability theory is presented for numerical SODEs in Chapter 3.2.4. Linear stability of a specific numerical scheme for an advection-diffusion equation with Gaussian white noise is discussed in Chapter 3.4.5.

With these basics of numerical SODEs and SPDEs in hand, we are ready to discuss numerical SODEs and SPDEs in more detail in the following chapters.

Bibliographic notes. Mean-square convergence has its own area of applicability, e.g., for simulating scenarios, visualization of stochastic dynamics, filtering, etc.; see further discussion in [233, 259, 358] and references therein. Furthermore, the mean-square approximation is of theoretical interest, and it also provides guidance in constructing schemes of convergence in the weak sense (see, e.g., [259, 354, 358]).

Partial differential equations (PDEs) driven by white noise admit different interpretations of the stochastic products and hence lead to different numerical approximations, unlike PDEs driven by colored noise. Specifically, stochastic products for white noise are usually interpreted with two different products: the Ito product and the Stratonovich product, see, e.g., [8]. Under certain conditions, these two products can be used to formulate the same problem. However, different products lead to different performance of numerical solvers for SPDEs driven by white noise; see Chapter 8.

Compared to parabolic equations, stochastic wave equations of second order can have better smoothing in time: the solutions are Hölder continuous with exponent 1/2 − ε in time, and thus the optimal order of convergence in time is one half if only increments of Brownian motion are used; see [465] for the one-dimensional wave equation with multiplicative noise. Ref. [9] considered the linear wave equation with a single additive noise white in time, using integrating factor techniques, where the convergence of two-step finite difference schemes in time is of first order. Ref. [476] applied exponential integration with (3.4.19) to the semilinear wave equation with additive space-time noise and obtained first-order mean-square convergence in time and half-order convergence in space. Ref. [403] considered finite difference schemes in space for the stochastic semilinear wave equation with multiplicative space-time white noise and obtained optimal mean-square convergence of order less than 1/3 in space given smooth initial conditions. Finite element methods were investigated in [272], and their convergence order was connected with the regularity of the solution. Ref. [65] considered semi-discretization using spectral Galerkin methods in physical space. Beyond strong approximation of stochastic wave equations, Ref. [213] obtained second-order weak convergence in both space and time for a leap-frog scheme solving the one-dimensional semilinear wave equation driven by additive space-time white noise. Ref. [402] considered weak convergence of fully discrete finite element methods for the linear stochastic elastic equation driven by additive space-time noise and showed that the weak order is twice the strong order in both time and space.

Among stochastic hyperbolic problems, stochastic conservation laws have also attracted increasing interest; see, e.g., [76, 113, 126, 224, 413] for some theoretical results and, e.g., [27, 274, 367, 412] for some numerical studies. In Chapter 9 we present a practical example of a stochastic conservation law for a stochastic piston.



3.6 Suggested practice

Exercise 3.6.1 Show that the solution X(t) to (3.1.1) is a Gaussian process.

Exercise 3.6.2 Prove (3.1.3). You may try to prove a stronger conclusion:

E[|X(t)−X(s)|2] ≤ C |t− s| .

Here C does not depend on t and s.

Exercise 3.6.3 (Conversion of an Ito SDE to a Stratonovich SDE) With the relation (2.3.2), show that equation (3.1.4) can be written as (3.1.5) when the σ_r's are continuous in t and continuously differentiable in x.

Exercise 3.6.4 Suppose that X(t) is adapted and E[∫_0^t X²(s) ds] < ∞. Verify that X(t) in Theorem 3.1.2 is a solution using Definition 3.1.1.

Exercise 3.6.5 Verify that the following SDE has coefficients satisfying the conditions here with 1 ≤ p_0 ≤ λ/σ² + 1/2:

dX = κX(θ − X) dt + σ |X|^{3/2} dW(t),  X_0 > 0.

Find a range of p_1 such that the growth condition in Remark 3.1.3 holds.

Exercise 3.6.6 Use the integrating factor method to solve the SDE

dX(t) = (X(t))γ dt+ αX(t) dW (t), X0 = x > 0,

where α is a constant and γ ≤ 1.

Exercise 3.6.7 Find E[X(t)X(s)] when X(t) is a solution to (3.1.11).

Hint. Use the relation E[X(t)X(s)] − E[X²(s)] = E[(X(t) − X(s))X(s)] = E[(E[X(t) − X(s) | F_s]) X(s)].

Exercise 3.6.8 Explain why we need to use

∫_t^{t+h} σ_r(s, X(s)) dW_r ≈ ∫_t^{t+h} σ_r(t, X(t)) dW_r

instead of using σ_r(s, X(s)) at other time instants between t and t + h in both the forward and backward Euler schemes.

Exercise 3.6.9 Show that

∫_t^{t+h} ∫_t^s dW(θ) dW(s) = (1/2) [(W(t + h) − W(t))² − h].
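As a quick single-path sanity check of this identity (illustrative only, not part of the book's solutions), one can approximate the double Ito integral by its left-point Riemann sum on a fine grid; the discrepancy from the right-hand side is (1/2)(h − Σᵢ dWᵢ²), which vanishes as the grid is refined:

```python
import numpy as np

rng = np.random.default_rng(0)
h, n = 1.0, 200_000                          # interval length and number of substeps
dW = rng.normal(0.0, np.sqrt(h / n), size=n)
W = np.concatenate(([0.0], np.cumsum(dW)))   # W(t + s) - W(t) on the fine grid

# Left-point (Ito) Riemann sum: sum_i (W(s_i) - W(t)) * (W(s_{i+1}) - W(s_i))
ito_sum = np.sum(W[:-1] * dW)
identity = 0.5 * (W[-1] ** 2 - h)

print(abs(ito_sum - identity))  # small: the sums telescope to 0.5*(W_n^2 - sum dW_i^2)
```

This is only a numerical illustration; the exercise asks for a proof, e.g., via the telescoping identity used in the comment.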



Exercise 3.6.10 Show that (3.3.5) holds.

Exercise 3.6.11 Suppose that u_0(x) = 0 and q_k = 1 for all k ≥ 1. Show that the solution u(t, x) of Equation (3.3.1) is Hölder continuous of order less than 1/4 in t, and Hölder continuous of order less than 1/2 in x.

Hint. Apply Kolmogorov's continuity theorem and use the facts that Σ_{k=1}^∞ k^{2β−2} < ∞ when β < 1/2, and that, for 0 < β ≤ 1,

|e_k(x) − e_k(y)| = |e_k(x) − e_k(y)|^{1−β} |e_k(x) − e_k(y)|^β ≤ 2^{1−β} √(2/l) (kπ/l)^β |x − y|^β,

1 − e^{−λ_k(t_2−t_1)} ≤ (1 − e^{−λ_k(t_2−t_1)})^β ≤ λ_k^β (t_2 − t_1)^β.


Part I

Numerical Stochastic Ordinary DifferentialEquations



We can understand stochastic partial differential equations (SPDEs) as stochastic ordinary differential equations (SODEs) in infinite dimensions; see, e.g., [94, 145]. It is then important to investigate numerical methods for SODEs before studying SPDEs. In this first part of the book, which includes two chapters, we present the Wong-Zakai approximation for SODEs in Chapter 4, a topic that is somewhat less investigated in the literature. As discussed in the previous chapter, the Wong-Zakai approximation refers to approximating Brownian motion by a bounded variation process and thus requires further discretization in time. We show that the errors of schemes based on the Wong-Zakai approximation are determined by the discretization in time in addition to the choice of bounded variation processes.

In Chapter 4, the SODEs have globally Lipschitz coefficients. We also present some numerical methods for SODEs with non-globally Lipschitz coefficients in Chapter 5. For coefficients of polynomial growth, we show the key elements for establishing convergence (rates) of the numerical schemes. The key point of such schemes is to control the fast growth of the numerical solutions, which, without control, blow up with small but nonzero probability.

The methodology presented for SODEs in this part can be extended to SPDEs with care. In Part II, we apply the Wong-Zakai approximation to SPDEs. The basic findings of Chapter 4 still apply, but careful choices of discretization in time and also in space are needed. Solving SPDEs with coefficients of polynomial growth is a more delicate matter, but the numerical schemes of Chapter 5 can essentially be applied.


4

Numerical schemes for SDEs with time delay using the Wong-Zakai approximation

Will a spectral approximation of Brownian motion lead to higher-order numerical methods for stochastic differential equations, as the Karhunen-Loeve truncation of a smooth stochastic process does?

In classical numerical methods for stochastic differential equations, the Brownian motion is typically approximated by its piecewise linear interpolation simultaneously with the time discretization. Another approximation, the spectral approximation of Brownian motion, has been proposed in theory for years but has not yet been investigated extensively in the context of numerical methods.

In Chapter 4.2, we show how to derive three numerical schemes for stochastic delay differential equations (SDDEs) using the Wong-Zakai (WZ) approximation. By approximating the Brownian motion with its truncated spectral expansion and then using different discretizations in time, we obtain three schemes: a predictor-corrector scheme, a midpoint scheme, and a Milstein-like scheme. We prove that the predictor-corrector scheme converges with order one half in the mean-square sense while the Milstein-like scheme converges with order one. In Chapter 4.3, we discuss the linear stability of these numerical schemes for SDDEs. Numerical results in Chapter 4.4 confirm the theoretical predictions and demonstrate that the midpoint scheme is of half-order convergence. Numerical results also show that the predictor-corrector and midpoint schemes can be of first-order convergence under commutative noises when there is no delay in the diffusion coefficients.

All these conclusions are summarized in Chapter 4.5, where we also present a review of the Wong-Zakai approximation for SODEs, SDDEs, and SPDEs with Gaussian or non-Gaussian noises. The simulation of double Ito integrals in Milstein-type schemes is also discussed for SDDEs as well as for SODEs. Some exercises on implementing numerical methods for SDDEs are provided at the end of this chapter.

© Springer International Publishing AG 2017. Z. Zhang, G.E. Karniadakis, Numerical Methods for Stochastic Partial Differential Equations with White Noise, Applied Mathematical Sciences 196, DOI 10.1007/978-3-319-57511-7_4




4.1 Wong-Zakai approximation for SODEs

Let us first illustrate the Wong-Zakai approximation by considering the piecewise linear approximation (2.2.4) of the one-dimensional Brownian motion W(t) for the following Ito SODE, see, e.g., [481, 482],

dX = b(t, X) dt + σ(t, X) dW(t),  X(0) = X_0,    (4.1.1)

which yields the following ODE with smooth random inputs:

dX^{(n)} = b(t, X^{(n)}) dt + σ(t, X^{(n)}) dW^{(n)}(t),  X^{(n)}(0) = X_0.    (4.1.2)

It is proved in [481, 482] that (4.1.2) converges in the mean-square sense to

dX = (b(t, X) + (1/2) σ(t, X) σ_x(t, X)) dt + σ(t, X) dW(t),  X(0) = X_0,    (4.1.3)

under mild assumptions, which can be written in Stratonovich form [430]

dX = b(t, X) dt + σ(t, X) ∘ dW(t),  X(0) = X_0,    (4.1.4)

where ‘∘’ indicates the Stratonovich product. The term (1/2) σ(t, X) σ_x(t, X) in (4.1.3) is called the standard Wong-Zakai correction term.

It is essential to identify the Wong-Zakai correction term (i.e., the equation to which the equation resulting from the Wong-Zakai approximation converges) in various cases. For SODEs with scalar noise, e.g., (4.1.1), when the Brownian motion is approximated by a process of bounded variation (rather than by piecewise linear approximation), Ref. [434] proved that the convergence to (4.1.3) holds in the pathwise sense (almost surely) if the drift b is locally Lipschitz continuous and of linear growth and the diffusion σ is continuous with bounded first-order derivatives. However, this conclusion does not hold if σ does not have bounded first-order derivatives in x [434] or if the approximation of Brownian motion is not differentiable [342].

For SODEs with multiple noises, Sussmann [435] derived a generic Wong-Zakai correction term. Refs. [283, 284] provided a practical criterion to verify whether a general approximation of Brownian motions (or even of general semi-martingales) leads to the standard Wong-Zakai correction term (e.g., (1/2) σ_x σ for (4.1.4)) or to other Wong-Zakai correction terms. To obtain the standard Wong-Zakai correction term, the essential condition on the approximation W^{(n)} of Brownian motion is

lim_{n→∞} E[∫_0^T W^{(n)} dW^{(n)} − ∫_0^T W(t) ∘ dW(t)] = 0.    (4.1.5)

The convergence of the Wong-Zakai approximation for SODEs has been established in different senses, e.g., pathwise convergence (e.g., [434, 435]), support theorems (on the relation between the support of the distribution of the solution and that of its Wong-Zakai approximation, e.g., [17, 351, 432, 457]), mean-square convergence (e.g., [15, 195, 241, 457]), and convergence in probability (e.g., [16]).

Observe that Equation (4.1.2) is still continuous in time even though the Brownian motion is approximated by a finite-dimensional smooth process. Proper time-discretization schemes should then be applied to equation (4.1.2). We will return to this important issue shortly after we introduce SDDEs, and we will treat SODEs as a special case of SDDEs with vanishing delay.

4.1.1 Wong-Zakai approximation for SDDEs

Consider the following SDDE with constant delay in Stratonovich form:

dX(t) = f(X(t), X(t − τ)) dt + Σ_{l=1}^r g_l(X(t), X(t − τ)) ∘ dW_l(t),  t ∈ (0, T],
X(t) = φ(t),  t ∈ [−τ, 0],    (4.1.6)

where τ > 0 is a constant, (W(t), F_t) = ({W_l(t), 1 ≤ l ≤ r}, F_t) is a system of one-dimensional independent standard Wiener processes, and the functions f : R^d × R^d → R^d, g_l : R^d × R^d → R^d, φ(t) : [−τ, 0] → R^d are continuous with E[‖φ‖²_{L∞}] < ∞. We also assume that φ(t) is F_0-measurable.

For the mean-square stability of Equation (4.1.6), we assume that f, g_l, ∂_x g_l g_l, and ∂_{x_τ} g_l g_l (where ∂_x and ∂_{x_τ} denote the derivatives with respect to the first and second variables, respectively), l = 1, 2, ..., r, in Equation (4.1.6) satisfy the Lipschitz condition

|v(x_1, y_1) − v(x_2, y_2)|² ≤ L_v (|x_1 − x_2|² + |y_1 − y_2|²),    (4.1.7)

and the linear growth condition

|v(x_1, y_1)|² ≤ K (1 + |x_1|² + |y_1|²)    (4.1.8)

for every x_1, y_1, x_2, y_2 ∈ R^d, where L_v and K are positive constants depending only on v. Under these conditions, Equation (4.1.6) has a unique sample-continuous and F_t-adapted strong solution X(t) : [−τ, +∞) → R^d; see, e.g., [334, 368].

Now we present the WZ approximation of Equation (4.1.6), using the spectral approximation (2.2.8) with the piecewise constant basis (2.2.9) or the Fourier basis (2.2.10). With these orthogonal approximations, we have the following WZ approximation:

dX̃(t) = f(X̃(t), X̃(t − τ)) dt + Σ_{l=1}^r g_l(X̃(t), X̃(t − τ)) dW̃_l(t),  t ∈ [0, T],
X̃(t) = φ(t),  t ∈ (−τ, 0],    (4.1.9)

where W̃_l(t) can be any of the approximations of W_l(t) described above. For the piecewise linear interpolation (2.2.4), we have the following consistency result for the WZ approximation (4.1.9) of Equation (4.1.6).

Theorem 4.1.1 (Consistency, [453]) Suppose f and g_l in Equation (4.1.6) are Lipschitz continuous, satisfy conditions (4.1.7), and have continuous and bounded second-order partial derivatives. Suppose also that the initial segment φ(t), t ∈ [−τ, 0], is defined on the probability space (Ω, F, P), is F_0-measurable and right continuous, and satisfies E[‖φ‖²_{L∞}] < ∞. For X̃(t) in (4.1.9) with the piecewise linear approximation (2.2.4) of Brownian motion, we have, for any t ∈ (0, T],

lim_{n→∞} sup_{0≤s≤t} E[|X(s) − X̃(s)|²] = 0.    (4.1.10)

The consistency of the WZ approximation with the spectral approximation (2.2.8) can be established by an integration-by-parts argument as in [200, 241], under similar conditions on the drift and diffusion coefficients.

4.2 Derivation of numerical schemes

We further discretize Equation (4.1.9) in time and derive several numerical schemes for (4.1.6). To this end, we take a uniform time step size h satisfying τ = mh with m a positive integer; N_T = T/h (T is the final time); t_n = nh, n = 0, 1, ..., N_T. For simplicity, we take the partition for the WZ approximation to coincide exactly with the time discretization, i.e.,

t̃_n = t_n,  n = 0, 1, ..., N_T,  and  Δ := t_n − t_{n−1} = t̃_n − t̃_{n−1} = h.

For Equation (4.1.9), we have the following integral form over [t_n, t_{n+1}]:

∫_{t_n}^{t_{n+1}} dX̃(t) = ∫_{t_n}^{t_{n+1}} f(X̃(t), X̃(t − τ)) dt + Σ_{l=1}^r ∫_{t_n}^{t_{n+1}} g_l(X̃(t), X̃(t − τ)) dW̃_l(t)    (4.2.1)

= ∫_{t_n}^{t_{n+1}} f(X̃(t), X̃(t − τ)) dt + Σ_{l=1}^r ∫_{t_n}^{t_{n+1}} g_l(X̃(t), X̃(t − τ)) Σ_{j=1}^{N_h} m_j^{(n)}(t) ξ_{l,j}^{(n)} dt.

Here we emphasize that the time discretization of the diffusion term has to be at least of half order; otherwise the resulting scheme is not consistent. For example, Euler-type schemes, in general, converge to the corresponding SDDEs in the Ito sense instead of those in the Stratonovich sense. In fact, if g_l(X̃(t), X̃(t − τ)) (l = 1, ..., r) is approximated by g_l(X̃(t_n), X̃(t_n − τ)) in Equation (4.2.1), then we have, for both the Fourier basis (2.2.10) and the piecewise constant basis (2.2.9),

∫_{t_n}^{t_{n+1}} dX̃(t) = ∫_{t_n}^{t_{n+1}} f(X̃(t), X̃(t − τ)) dt + Σ_{l=1}^r g_l(X̃(t_n), X̃(t_n − τ)) ΔW_{l,n},

where ΔW_{l,n} = W_l(t_{n+1}) − W_l(t_n). This leads to an Euler-type scheme which converges to the following SDDE in the Ito sense, see, e.g., [14, 306], instead of (4.1.6):

dX(t) = f(X(t), X(t − τ)) dt + Σ_{l=1}^r g_l(X(t), X(t − τ)) dW_l(t).

In the following, three numerical schemes for solving Equation (4.1.6) are derived using Taylor expansions and different time discretizations in (4.2.1).

4.2.1 A predictor-corrector scheme

Taking N_h = 1, both bases (2.2.9) and (2.2.10) have only one term, m_1^{(n)} = 1/√h, over each subinterval. Using the trapezoidal rule to approximate the integrals on the right-hand side of (4.2.1), we get

X_{n+1} = X_n + (h/2) [f(X_n, X_{n−m}) + f(X_{n+1}, X_{n−m+1})]
          + (1/2) Σ_{l=1}^r [g_l(X_n, X_{n−m}) + g_l(X_{n+1}, X_{n−m+1})] ΔW_{l,n},    (4.2.2)

where X_n is an approximation of X̃(t_n) (and thus of X(t_n)). The initial conditions are X_n = φ(nh) for n = −m, −m + 1, ..., 0. Note that the scheme (4.2.2) is fully implicit and may fail to be solvable, since ΔW_{l,n} can take any value on the real line. To resolve this issue, we further apply the left rectangle rule to the right-hand side of (4.2.1) to obtain a predictor for X_{n+1} in (4.2.2), so that the resulting scheme is explicit. Consequently, we arrive at a predictor-corrector scheme for the SDDE (4.1.6):

in (4.2.2) so that the resulting scheme is explicit. Consequently, we arrive ata predictor-corrector scheme for SDDE (4.1.6):

X̄_{n+1} = X_n + h f(X_n, X_{n−m}) + Σ_{l=1}^r g_l(X_n, X_{n−m}) ΔW_{l,n},

X_{n+1} = X_n + (h/2) [f(X_n, X_{n−m}) + f(X̄_{n+1}, X_{n−m+1})]
          + (1/2) Σ_{l=1}^r [g_l(X_n, X_{n−m}) + g_l(X̄_{n+1}, X_{n−m+1})] ΔW_{l,n},    (4.2.3)

n = 0, 1, ..., N_T − 1.
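As an illustration only (this is our sketch, not code from the book), a minimal NumPy implementation of the predictor-corrector scheme (4.2.3) for a scalar SDDE with a single noise (d = r = 1); the drift f, diffusion g, and history phi in the example are placeholder choices:

```python
import numpy as np

def predictor_corrector_sdde(f, g, phi, tau, T, h, rng):
    """Scheme (4.2.3) for dX = f(X(t), X(t-tau)) dt + g(X(t), X(t-tau)) o dW."""
    m = int(round(tau / h))                  # delay in steps, tau = m*h
    NT = int(round(T / h))
    X = np.empty(NT + m + 1)
    X[: m + 1] = [phi(-tau + i * h) for i in range(m + 1)]  # history; X[m] is X_0
    for n in range(NT):
        i = n + m                            # array index of X_n
        dW = rng.normal(0.0, np.sqrt(h))
        fn, gn = f(X[i], X[i - m]), g(X[i], X[i - m])
        Xp = X[i] + h * fn + gn * dW         # predictor (left rectangle rule)
        X[i + 1] = (X[i] + 0.5 * h * (fn + f(Xp, X[i - m + 1]))
                    + 0.5 * (gn + g(Xp, X[i - m + 1])) * dW)
    return X[m:]                             # X_0, ..., X_{NT}

rng = np.random.default_rng(3)
path = predictor_corrector_sdde(
    f=lambda x, xd: -xd,                     # placeholder drift with delay
    g=lambda x, xd: 0.2 * x,                 # placeholder diffusion, no delay
    phi=lambda t: 1.0, tau=0.1, T=1.0, h=0.01, rng=rng)
print(path.shape)
```

Note that X_{n−m+1} is already known when the corrector is evaluated, so the scheme is fully explicit.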



Taking N_h = 1 is sufficient for half-order schemes such as the predictor-corrector scheme (4.2.3) and the midpoint scheme below. Both schemes employ

∫_{t_n}^{t_{n+1}} Σ_{j=1}^{N_h} m_j^{(n)}(t) ξ_{l,j}^{(n)} dt,

which is equal to ΔW_{l,n} for any N_h ≥ 1, according to (2.2.8) and our choices of orthonormal bases (2.2.9) and (2.2.10).

Theorem 4.2.1 Assume that f, g_l, ∂_x g_l g_q, and ∂_{x_τ} g_l g_q (l, q = 1, 2, ..., r) satisfy the Lipschitz condition (4.1.7), and that the g_l have bounded second-order partial derivatives with respect to all variables. If E[‖φ‖^p_{L∞}] < ∞ for some p ≥ 4, then for the predictor-corrector scheme (4.2.3),

max_{1≤n≤N_T} E[|X(t_n) − X_n|²] = O(h).    (4.2.4)

When τ = 0 in both the drift and diffusion coefficients, the scheme (4.2.3) degenerates into one family of the predictor-corrector schemes in [42], which can have a larger stability region than the explicit Euler scheme and some other one-step schemes, especially for stochastic differential equations with multiplicative noises; see Chapter 4.3. Moreover, we will show numerically that if the time delay appears only in the drift term of an SDDE with commutative noise (in the one-dimensional case, d = 1, the commutativity condition is g_l ∂_x g_q − g_q ∂_x g_l = 0, 1 ≤ l, q ≤ r), the predictor-corrector scheme can converge with order one in the mean-square sense.

Proof of Theorem 4.2.1. We present the proof for d = 1 in (4.1.6); it can be extended to the multi-dimensional case d > 1 without difficulty. Recall that for the Milstein scheme (4.2.26), see [228],

max_{1≤n≤N_T} E[|X(t_n) − X^M_n|²] = O(h²).

Then, by the triangle inequality, it suffices to prove that

max_{1≤n≤N_T} E[|X^M_n − X_n|²] = O(h).    (4.2.5)

We denote f_n = f(X_n, X_{n−m}) and g_{l,n} = g_l(X_n, X_{n−m}), and also

ρ_{f_n} = f(X̄_{n+1}, X_{n−m+1}) − f_n,    (4.2.6)

ρ_{g_{l,n}} = g_l(X̄_{n+1}, X_{n−m+1}) − [g_{l,n} + ∂_x g_{l,n} Σ_{q=1}^r g_{q,n} ΔW_{q,n} + ∂_{x_τ} g_{l,n} Σ_{q=1}^r g_{q,n−m} ΔW_{q,n−m}].

With (4.2.6), we can rewrite (4.2.3) as

X_{n+1} = X_n + h f_n + Σ_{l=1}^r g_{l,n} ΔW_{l,n} + (1/2) Σ_{l=1}^r Σ_{q=1}^r ∂_x g_{l,n} g_{q,n} ΔW_{q,n} ΔW_{l,n}
          + (1/2) Σ_{l=1}^r Σ_{q=1}^r ∂_{x_τ} g_{l,n} g_{q,n−m} ΔW_{q,n−m} ΔW_{l,n} + ρ_n,    (4.2.7)

where ρ_n = h ρ_{f_n} + (1/2) Σ_{l=1}^r ρ_{g_{l,n}} ΔW_{l,n}.



It can be readily checked that if f and the g_l satisfy the Lipschitz condition (4.1.7), and the g_l have bounded second-order derivatives (l = 1, ..., r), then by the predictor-corrector scheme (4.2.3) and a Taylor expansion of g_l(X̄_{n+1}, X_{n−m+1}), we have h² E[ρ²_{f_n}] ≤ Ch³ and E[(ρ_{g_{l,n}} ΔW_{l,n})²] ≤ Ch³, and thus, by the triangle inequality,

E[ρ_n²] ≤ Ch³,    (4.2.8)

where the constant C depends on r and the Lipschitz constants but is independent of h.

Subtracting (4.2.7) from (4.2.26), squaring both sides, and taking expectations, we have

E[(X^M_{n+1} − X_{n+1})²] = E[(X^M_n − X_n)²] + 2E[(X^M_n − X_n)(Σ_{i=0}^4 R_i − ρ_n)]
                            − 2 Σ_{i=0}^4 E[ρ_n R_i] + Σ_{i,j=0}^4 E[R_i R_j] + E[ρ_n²],    (4.2.9)

where we denote f^M_n = f(X^M_n, X^M_{n−m}) and g^M_{l,n} = g_l(X^M_n, X^M_{n−m}), and

R_0 = h (f^M_n − f_n) + Σ_{l=1}^r (g^M_{l,n} − g_{l,n}) ΔW_{l,n},

R_1 = Σ_{l=1}^r Σ_{q=1}^r [∂_x g^M_{l,n} g^M_{q,n} − ∂_x g_{l,n} g_{q,n}] ΔW_{q,n} ΔW_{l,n} / 2,

R_2 = Σ_{l=1}^r Σ_{q=1}^r [∂_{x_τ} g^M_{l,n} g^M_{q,n−m} − ∂_{x_τ} g_{l,n} g_{q,n−m}] ΔW_{q,n−m} ΔW_{l,n} / 2,

R_3 = Σ_{l=1}^r Σ_{q=1}^r ∂_x g^M_{l,n} g^M_{q,n} (I_{q,l,t_n,t_{n+1},0} − ΔW_{q,n} ΔW_{l,n} / 2),

R_4 = Σ_{l=1}^r Σ_{q=1}^r ∂_{x_τ} g^M_{l,n} g^M_{q,n−m} (I_{q,l,t_n,t_{n+1},τ} − ΔW_{q,n−m} ΔW_{l,n} / 2).

By the Lipschitz condition on f and g_l and the adaptedness of X_n and X^M_n, we have

E[R_0²] ≤ C(h² + h)(E[(X^M_n − X_n)²] + E[(X^M_{n−m} − X_{n−m})²]).    (4.2.10)

To bound E[R_i²] (i = 1, 2, 3, 4), we require that X_n and X^M_n have bounded (up to) fourth-order moments, which can be readily checked from the predictor-corrector scheme (4.2.3) and the Milstein scheme (4.2.26) under our assumptions. By the Lipschitz condition on g_l and ∂_{x_τ} g_l g_q, we have

E[R_2²] ≤ C max_{1≤l,q≤r} E[(|X^M_n − X_n| + |X^M_{n−m} − X_{n−m}|)² (ΔW_{q,n−m} ΔW_{l,n})²],



whence, by the Cauchy inequality and the boundedness of E[X_n⁴] and E[(X^M_n)⁴], we have E[R_2²] ≤ Ch². Similarly, we have E[R_1²] ≤ Ch². By Lemma 4.2.4 and the linear growth condition (4.1.8) for ∂_{x_τ} g_l g_q, we obtain

E[R_4²] ≤ C max_{1≤l<q≤r} E[(1 + |X^M_n|² + |X^M_{n−m}|²)(I_{q,l,t_n,t_{n+1},τ} − ΔW_{q,n−m} ΔW_{l,n} / 2)²] ≤ Ch²,

since X^M_n and X^M_{n−m} have bounded fourth-order moments and, by the Burkholder-Davis-Gundy inequality (see Appendix D), it holds for l ≠ q that

E[(I_{q,l,t_n,t_{n+1},τ} − ΔW_{q,n−m} ΔW_{l,n} / 2)⁴]
= E[(∫_{t_n}^{t_{n+1}} (W_q(t − τ) − (W_q(t_{n+1} − τ) + W_q(t_n − τ))/2) ∘ dW_l)⁴]
≤ C (E[∫_{t_n}^{t_{n+1}} (W_q(s − τ) − (W_q(t_{n+1} − τ) + W_q(t_n − τ))/2)² ds])² ≤ Ch⁴.

Similarly, we have E[R_3²] ≤ Ch². Thus, we have proved that

E[R_i²] ≤ Ch²,  i = 1, 2, 3, 4.    (4.2.11)

By the basic inequality 2ab ≤ a² + b², we have

2 |E[(X^M_n − X_n) ρ_n]| ≤ h E[(X^M_n − X_n)²] + h⁻¹ E[ρ_n²].    (4.2.12)

By the fact that X_n and X^M_n are F_{t_n}-measurable and the Lipschitz condition on f, we have

2E[(X^M_n − X_n) R_0] = 2h E[(X^M_n − X_n)(f^M_n − f_n)]
≤ Ch (E[(X^M_n − X_n)²] + E[(X^M_{n−m} − X_{n−m})²]).    (4.2.13)

Further, by the Lipschitz condition (4.1.7) for ∂_x g_l g_l, we have

2E[(X^M_n − X_n) R_1] = Σ_{l=1}^r E[(X^M_n − X_n)(∂_x g^M_{l,n} g^M_{l,n} − ∂_x g_{l,n} g_{l,n})] E[(ΔW_{l,n})²]
≤ Ch (E[(X^M_n − X_n)²] + E[(X^M_{n−m} − X_{n−m})²]).    (4.2.14)

By the adaptedness of X_n and X^M_n and the facts that E[ΔW_{l,n}] = 0 and E[I_{q,l,t_n,t_{n+1},0} − ΔW_{q,n} ΔW_{l,n} / 2] = 0, we have

E[(X^M_n − X_n) R_i] = 0,  i = 2, 3.    (4.2.15)

Again by the adaptedness of X_n and X^M_n, we also have

E[(X^M_n − X_n) R_4] = 0.    (4.2.16)



In fact, by Lemma 4.2.4, we can represent I_{q,l,t_n,t_{n+1},τ} as

I_{q,l,t_n,t_{n+1},τ} = (h/2) ξ^{(n−m)}_{q,1} ξ^{(n)}_{l,1} + h Σ_{p=1}^∞ (1/p) [ξ^{(n)}_{q,2p+1} ξ^{(n−m)}_{l,2p} − ξ^{(n−m)}_{q,2p} ξ^{(n)}_{l,2p+1} − √2 ξ^{(n−m)}_{q,1} ξ^{(n)}_{l,2p}].    (4.2.17)

Then, by the facts that E[|(X^M_n − X_n) R_4|] ≤ (E[(X^M_n − X_n)²])^{1/2} (E[R_4²])^{1/2} ≤ Ch and E[(X^M_n − X_n) ξ^{(n)}_{l,k}] = 0 for any k ≥ 1, we obtain (4.2.16) from Lebesgue's dominated convergence theorem.

With (4.2.15)-(4.2.16) and the Cauchy inequality, from (4.2.9) we have, for n ≥ m,

E[(X^M_{n+1} − X_{n+1})²] ≤ E[(X^M_n − X_n)²] + 2E[(X^M_n − X_n)(R_0 + R_1 − ρ_n)] + C Σ_{i=0}^4 E[R_i²] + C E[ρ_n²],

and further, by (4.2.8), (4.2.10)-(4.2.12), and (4.2.13)-(4.2.14), we obtain, for n ≥ m,

E[(X^M_{n+1} − X_{n+1})²] ≤ (1 + Ch) E[(X^M_n − X_n)²] + Ch E[(X^M_{n−m} − X_{n−m})²] + (C + h⁻¹) E[ρ_n²] + C Σ_{i=0}^4 E[R_i²]
                          ≤ (1 + Ch) E[(X^M_n − X_n)²] + Ch E[(X^M_{n−m} − X_{n−m})²] + Ch²,    (4.2.18)

where C is independent of h. Similarly, we can show that (4.2.18) also holds for n = 1, ..., m − 1. Taking the maximum over both sides of (4.2.18) and noting that X^M_i − X_i = 0 for −m ≤ i ≤ 0, we have

max_{1≤i≤n+1} E[(X^M_i − X_i)²] ≤ (1 + Ch) max_{1≤i≤n} E[(X^M_i − X_i)²] + Ch².

Then (4.2.5) follows from the discrete Gronwall inequality (see Appendix D). □

4.2.2 The midpoint scheme

Taking N_h = 1, applying the midpoint rule on the right side of (4.2.1), and using X(t + h/2) ≈ (X(t + h) + X(t))/2, we obtain the following midpoint scheme:

X_{n+1} = X_n + h f((X_n + X_{n+1})/2, (X_{n−m} + X_{n−m+1})/2)
  + Σ_{l=1}^r g_l((X_n + X_{n+1})/2, (X_{n−m} + X_{n−m+1})/2) ΔW̄_{l,n},  n = 0, 1, · · · , N_T − 1,  (4.2.19)

where we have truncated ΔW_n as ΔW̄_n so that the solution to (4.2.19) has finite second-order moments and is solvable (see, e.g., [358, Section 1.3]). Here


112 4 Numerical schemes for SDEs with time delay using the Wong-Zakai...

ΔW̄_n = ζ^{(n)} √h instead of ξ^{(n)} √h, where ζ^{(n)} is a truncation of the standard Gaussian random variable ξ^{(n)} (see, e.g., [358, p. 39] and (3.2.28)):

ζ^{(n)} = ξ^{(n)} χ_{|ξ^{(n)}| ≤ A_h} + sgn(ξ^{(n)}) A_h χ_{|ξ^{(n)}| > A_h},  A_h = √(4|log(h)|).
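As a quick sketch (in Python, with our own variable names rather than the book's notation), the truncation ζ^{(n)} can be implemented and checked as follows:

```python
import numpy as np

def truncate_gaussian(xi, h):
    """Truncate standard Gaussian samples at the level A_h = sqrt(4 |log h|),
    as in the definition of zeta^(n) above."""
    A_h = np.sqrt(4.0 * np.abs(np.log(h)))
    return np.where(np.abs(xi) <= A_h, xi, np.sign(xi) * A_h)

rng = np.random.default_rng(1)
h = 1.0 / 64
xi = rng.standard_normal(100_000)
zeta = truncate_gaussian(xi, h)
# zeta equals xi except on the rare event |xi| > A_h, where it is clipped to +-A_h.
```

Since A_h grows like √|log h|, a standard Gaussian draw is actually clipped only with probability of order h^2, which is why the truncation keeps the implicit scheme solvable without degrading the convergence order.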

When τ = 0, this fully implicit midpoint scheme allows long-time integration of stochastic Hamiltonian systems [356]. As in the case of no delay, the midpoint scheme complies with the Stratonovich calculus without differentiating the diffusion coefficients.

Theorem 4.2.2 Assume that f, g_l, ∂_x g_l g_q, and ∂_{x_τ} g_l g_q (l, q = 1, 2, · · · , r) satisfy the Lipschitz condition (4.1.7) and that the g_l have bounded second-order partial derivatives with respect to all variables. If E[‖φ‖_{L∞}^p] < ∞, p ≥ 4, then for the midpoint scheme (4.2.19) we have

max_{1≤n≤N_T} E|X(t_n) − X_n|^2 = O(h).

We refer the reader to [356] for the proof of the convergence rate of the midpoint scheme. The proof is almost the same if there is no delay in the diffusion coefficients. The scheme has first-order convergence for Stratonovich stochastic differential equations with commutative noise when no delay arises in the diffusion coefficients. However, it has only half-order convergence once the delay appears in the diffusion coefficients, and in this case no commutative noises are defined.¹ The convergence rates in the different cases will be shown numerically in Section 4.4.

Remark 4.2.3 The relation (2.2.13) is crucial in the derivation of the schemes (4.2.3) and (4.2.19). If a CONS contains no constants, e.g., {√(2/(t_{n+1} − t_n)) sin(kπ(s − t_n)/(t_{n+1} − t_n))}_{k=1}^∞, then from (4.2.1), ΔW_{l,n} in the scheme (4.2.3) should be replaced by

∫_{t_n}^{t_{n+1}} dW̃_l(t) = Σ_{j=1}^{N_h} √(2/(t_{n+1} − t_n)) ∫_{t_n}^{t_{n+1}} sin(jπ(s − t_n)/(t_{n+1} − t_n)) ds ξ_{l,j}^{(n)},  (4.2.20)

which will be simulated with i.i.d. Gaussian random variables with zero mean and variance Σ_{j=1}^{N_h} (2h/(j^2 π^2)) (1 − (−1)^j)^2. According to the proof of Theorem 4.2.1, we require N_h ~ O(h^{-1}) so that

E[|∫_{t_n}^{t_{n+1}} dW_l(t) − ∫_{t_n}^{t_{n+1}} dW̃_l(t)|^2] ~ O(h^2)  (4.2.21)

to make the corresponding scheme of half-order convergence. Numerical results show that the scheme (4.2.3) with ΔW_n replaced by (4.2.20) and N_h ~ O(h^{-1}) leads to similar accuracy and the same convergence order as the predictor-corrector scheme (4.2.3) (numerical results are not presented).

¹We can think of the noises as noncommutative, even when there is only a single noise.


4.2.3 A Milstein-like scheme

A first-order scheme, the Milstein scheme for SDDEs (4.1.6), can be derived based on the Ito-Taylor expansion [260] or on anticipative calculus; see, e.g., [228]. Here we derive a similar scheme (called the Milstein-like scheme) based on the WZ approximation (4.1.9) and a Taylor expansion of the diffusion terms. When s ∈ [t_n, t_{n+1}], we approximate f(X(s), X(s − τ)) by f(X(t_n), X(t_n − τ)), and by the Taylor expansion we have

g_l(X(s), X(s − τ)) ≈ g_l(X(t_n), X(t_n − τ)) + ∂_x g_l(X(t_n), X(t_n − τ)) [X(s) − X(t_n)]
  + ∂_{x_τ} g_l(X(t_n), X(t_n − τ)) [X(s − τ) − X(t_n − τ)].  (4.2.22)

Substituting the above approximations into (4.2.1) and omitting the terms whose order is higher than one, we obtain the following scheme:

X_{n+1} = X_n + h f(X_n, X_{n−m}) + Σ_{l=1}^r g_l(X_n, X_{n−m}) Ĩ_0
  + Σ_{l=1}^r Σ_{q=1}^r ∂_x g_l(X_n, X_{n−m}) g_q(X_n, X_{n−m}) Ĩ_{q,l,t_n,t_{n+1},0}  (4.2.23)
  + Σ_{l=1}^r Σ_{q=1}^r ∂_{x_τ} g_l(X_n, X_{n−m}) g_q(X_{n−m}, X_{n−2m}) χ_{t_n ≥ τ} Ĩ_{q,l,t_n,t_{n+1},τ},  n = 0, 1, · · · , N_T − 1,

where, with W̃ denoting the spectral approximation of W,

Ĩ_0 = ∫_{t_n}^{t_{n+1}} dW̃_l(t),  Ĩ_{q,l,t_n,t_{n+1},0} = ∫_{t_n}^{t_{n+1}} ∫_{t_n}^{t} dW̃_q(s) dW̃_l(t),  t_n ≥ 0;

Ĩ_{q,l,t_n,t_{n+1},τ} = ∫_{t_n}^{t_{n+1}} ∫_{t_n−τ}^{t−τ} dW̃_q(s) dW̃_l(t),  t_n ≥ τ.

Using the Fourier basis (2.2.10), the three stochastic integrals in (4.2.23) are computed by

I^F_0 = ∫_{t_n}^{t_{n+1}} m_1(t) dt ξ_{l,1}^{(n)} = ΔW_{l,n},  (4.2.24)

I^F_{q,l,t_n,t_{n+1},0} = (h/2) ξ_{q,1}^{(n)} ξ_{l,1}^{(n)} − (√2 h/(2π)) ξ_{q,1}^{(n)} Σ_{p=1}^{s} (1/p) ξ_{l,2p}^{(n)} + (h/(2π)) Σ_{p=1}^{s_1} (1/p) [ξ_{q,2p+1}^{(n)} ξ_{l,2p}^{(n)} − ξ_{q,2p}^{(n)} ξ_{l,2p+1}^{(n)}],

I^F_{q,l,t_n,t_{n+1},τ} = (h/2) ξ_{q,1}^{(n−m)} ξ_{l,1}^{(n)} − (√2 h/(2π)) ξ_{q,1}^{(n−m)} Σ_{p=1}^{s} (1/p) ξ_{l,2p}^{(n)} + (h/(2π)) Σ_{p=1}^{s_1} (1/p) [ξ_{q,2p+1}^{(n−m)} ξ_{l,2p}^{(n)} − ξ_{q,2p}^{(n−m)} ξ_{l,2p+1}^{(n)}],


where s = ⌊N_h/2⌋ and s_1 = ⌊(N_h − 1)/2⌋. When the piecewise constant basis (2.2.9) is used, these integrals are

I^L_0 = Σ_{j=0}^{N_h−1} ΔW_{l,n,j} = ΔW_{l,n},

I^L_{q,l,t_n,t_{n+1},0} = Σ_{j=0}^{N_h−1} ΔW_{l,n,j} [ΔW_{q,n,j}/2 + Σ_{i=0}^{j−1} ΔW_{q,n,i}],  (4.2.25)

I^L_{q,l,t_n,t_{n+1},τ} = Σ_{j=0}^{N_h−1} ΔW_{l,n,j} [ΔW_{q,n−m,j}/2 + Σ_{i=0}^{j−1} ΔW_{q,n−m,i}],

where ΔW_{k,n,j} = W_k(t_n + (j+1)h/N_h) − W_k(t_n + jh/N_h), k = 1, · · · , r, j = 0, · · · , N_h − 1, and ΔW_{k,n,−1} = 0. In Example 4.4.2, we will show that the piecewise linear interpolation is less efficient than the Fourier approximation for achieving the same order of accuracy.
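The mean-square truncation error of the Fourier approximation decays like O(1/N_h) (this is the content of Lemma 4.2.4 below). A minimal numerical sketch: sample the series with a large cutoff as a stand-in for the full expansion, truncate it, and check that the mean-square error roughly halves when the number of retained mode pairs doubles. The coefficients below follow our reading of (4.2.24), and we collapse the two cutoffs s, s_1 into a single one; only the decay rate, not the constants, is being checked.

```python
import numpy as np

rng = np.random.default_rng(1)
h = 1.0
n, P = 10_000, 256                  # Monte Carlo samples; P ~ "infinite" series
inv_p = 1.0 / np.arange(1, P + 1)

xq1 = rng.standard_normal(n)        # xi_{q,1}
xl1 = rng.standard_normal(n)        # xi_{l,1}
L2p = rng.standard_normal((n, P))   # xi_{l,2p}
L2p1 = rng.standard_normal((n, P))  # xi_{l,2p+1}
Q2p = rng.standard_normal((n, P))   # xi_{q,2p}
Q2p1 = rng.standard_normal((n, P))  # xi_{q,2p+1}

# running partial sums (over p) of the two series appearing in (4.2.24)
S2 = np.cumsum(inv_p * L2p, axis=1)
S3 = np.cumsum(inv_p * (Q2p1 * L2p - Q2p * L2p1), axis=1)

def I_F(k):
    """Truncation of the double-integral series after k mode pairs."""
    return (0.5 * h * xq1 * xl1
            - np.sqrt(2.0) * h / (2.0 * np.pi) * xq1 * S2[:, k - 1]
            + h / (2.0 * np.pi) * S3[:, k - 1])

I_ref = I_F(P)                       # proxy for the untruncated integral
mse = {k: np.mean((I_F(k) - I_ref) ** 2) for k in (8, 16)}
ratio = mse[8] / mse[16]             # close to 2 for an O(1/N_h) tail
```

Doubling the truncation level halves the mean-square error, consistent with the bound cΔ^2/(π^2 M) in the lemma.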

The scheme (4.2.23) can be seen as a further discretization of the Milstein scheme for Stratonovich SDDEs proposed in [228]:

X^M_{n+1} = X^M_n + h f(X^M_n, X^M_{n−m}) + Σ_{l=1}^r g_l(X^M_n, X^M_{n−m}) ΔW_{l,n}
  + Σ_{l=1}^r Σ_{q=1}^r ∂_x g_l(X^M_n, X^M_{n−m}) g_q(X^M_n, X^M_{n−m}) I_{q,l,t_n,t_{n+1},0}  (4.2.26)
  + Σ_{l=1}^r Σ_{q=1}^r ∂_{x_τ} g_l(X^M_n, X^M_{n−m}) g_q(X^M_{n−m}, X^M_{n−2m}) χ_{t_n ≥ τ} I_{q,l,t_n,t_{n+1},τ},  n = 0, 1, · · · , N_T − 1,

with the double integrals approximated by either the Fourier expansion or the piecewise linear interpolation: Ĩ_0, Ĩ_{q,l,t_n,t_{n+1},0}, and Ĩ_{q,l,t_n,t_{n+1},τ} are, respectively, approximations of the following integrals²:

I_0 = ∫_{t_n}^{t_{n+1}} ◦dW_l(t),  I_{q,l,t_n,t_{n+1},0} = ∫_{t_n}^{t_{n+1}} ∫_{t_n}^{t} ◦dW_q(s) ◦ dW_l(t),  t_n ≥ 0;

I_{q,l,t_n,t_{n+1},τ} = ∫_{t_n}^{t_{n+1}} ∫_{t_n−τ}^{t−τ} ◦dW_q(s) ◦ dW_l(t),  t_n ≥ τ.  (4.2.27)

Actually, we have the following relations.

²The approximation of double integrals in the present context is similar to approximations using numerical integration techniques, which have long been explored; see, e.g., [259, 358].


Lemma 4.2.4 For the Fourier basis (2.2.10) with Δ = t_{n+1} − t_n, it holds that

I^F_0 = I_0,  (4.2.28)

E[(I^F_{q,l,t_n,t_{n+1},0} − I_{q,l,t_n,t_{n+1},0})^2] = ς(N_h) 2Δ^2/(N_h π)^2 + Σ_{i=M}^∞ Δ^2/(iπ)^2 ≤ c Δ^2/(π^2 M),  (4.2.29)

E[(I^F_{q,l,t_n,t_{n+1},τ} − I_{q,l,t_n,t_{n+1},τ})^2] = ς(N_h) 2Δ^2/(N_h π)^2 + Σ_{i=M}^∞ Δ^2/(iπ)^2 ≤ c Δ^2/(π^2 M),  (4.2.30)

where ς(N_h) = 0 if N_h is odd and 1 otherwise, and M is the integer part of (N_h + 1)/2.

Proof. From (4.2.24), the first formula (4.2.28) is readily obtained. Now we consider (4.2.29). For l = q, it holds that

I_{l,l,t_n,t_{n+1},0} = Ĩ_{l,l,t_n,t_{n+1},0} = (ΔW_{l,n})^2/2

if (2.2.8) with the piecewise constant basis (2.2.9) or the Fourier basis (2.2.10) is used. For any orthogonal expansion (2.2.5), we have E[∫_{t_n}^{t_{n+1}} (W̃_q(s) − W_q(s)) dW_l ∫_{t_n}^{t_{n+1}} W̃_q(s) d(W̃_l − W_l)] = 0, and thus by W̃_q(t_n) = W_q(t_n), Ito's isometry, and integration by parts, we have, when l ≠ q,

E[(Ĩ_{q,l,t_n,t_{n+1},0} − I_{q,l,t_n,t_{n+1},0})^2]
  = E[(∫_{t_n}^{t_{n+1}} [W̃_q(s) − W_q(s)] ◦ dW_l + ∫_{t_n}^{t_{n+1}} W̃_q(s) d[W̃_l − W_l])^2]
  = E[(∫_{t_n}^{t_{n+1}} [W̃_q(s) − W_q(s)] dW_l)^2] + E[(∫_{t_n}^{t_{n+1}} W̃_q(s) d[W̃_l − W_l])^2]
  = ∫_{t_n}^{t_{n+1}} E[[W̃_q(s) − W_q(s)]^2] ds + E[(−∫_{t_n}^{t_{n+1}} [W̃_l − W_l] dW̃_q(s))^2].

Then by the mutual independence of the Gaussian random variables ξ_{q,i}^{(n)}, i = 1, 2, · · · , q = 1, 2, · · · , r, we obtain E[[W̃_q(s) − W_q(s)]^2] = Σ_{i=N_h+1}^∞ M_i^2(s), where M_i(s) = ∫_{t_n}^s m_i(θ) dθ, and, for l ≠ q,

E[(∫_{t_n}^{t_{n+1}} [W̃_l(s) − W_l(s)] dW̃_q)^2] = E[(Σ_{i=N_h+1}^∞ Σ_{j=1}^{N_h} ∫_{t_n}^{t_{n+1}} M_i(s) m_j(s) ds ξ_{l,i}^{(n)} ξ_{q,j}^{(n)})^2]
  = Σ_{i=N_h+1}^∞ Σ_{j=1}^{N_h} (∫_{t_n}^{t_{n+1}} M_i(s) m_j(s) ds)^2.


Then we have

E[(Ĩ_{q,l,t_n,t_{n+1},0} − I_{q,l,t_n,t_{n+1},0})^2] = Σ_{i=N_h+1}^∞ ∫_{t_n}^{t_{n+1}} M_i^2(s) ds + Σ_{i=N_h+1}^∞ Σ_{j=1}^{N_h} (∫_{t_n}^{t_{n+1}} M_i(s) m_j(s) ds)^2.  (4.2.31)

In (4.2.31), we consider the Fourier basis (2.2.10). Since the M_i(s) (i ≥ 2) are also sine or cosine functions, we have

Σ_{i=N_h+1}^∞ Σ_{j=1}^{N_h} (∫_{t_n}^{t_{n+1}} M_i(s) m_j(s) ds)^2 = (∫_{t_n}^{t_{n+1}} M_{N_h+1}(s) m_{N_h}(s) ds)^2  (4.2.32)

when N_h is even, and Σ_{i=N_h+1}^∞ Σ_{j=1}^{N_h} (∫_{t_n}^{t_{n+1}} M_i(s) m_j(s) ds)^2 = 0 when N_h is odd. Moreover, for i ≥ 2, it holds from simple calculations that

∫_{t_n}^{t_{n+1}} M_i^2(s) ds = 3Δ^2/(2⌊i/2⌋π)^2 if i is even, and Δ^2/(2⌊i/2⌋π)^2 otherwise.  (4.2.33)

Then by (4.2.31) and (4.2.32), we have

E[(I^F_{q,l,t_n,t_{n+1},0} − I_{q,l,t_n,t_{n+1},0})^2] = Σ_{i=N_h+1}^∞ ∫_{t_n}^{t_{n+1}} M_i^2(s) ds + Σ_{i=N_h+1}^∞ Σ_{j=1}^{N_h} (∫_{t_n}^{t_{n+1}} M_i(s) m_j(s) ds)^2
  = ς(N_h) Δ^2/(N_h π)^2 + Σ_{i=N_h+1}^∞ 3^{ς(i)} Δ^2/(2⌊i/2⌋π)^2 = ς(N_h) 2Δ^2/(N_h π)^2 + Σ_{i=M}^∞ Δ^2/(iπ)^2.

Hence, we arrive at (4.2.29) by the fact that Σ_{i=M}^∞ 1/i^2 ≤ 2/M. Similarly, we can obtain (4.2.30). □

With Lemma 4.2.4, we can show that the Milstein-like scheme (4.2.23) can be of first-order convergence in the mean-square sense.

Theorem 4.2.5 Assume that f, g_l, ∂_x g_l g_q, and ∂_{x_τ} g_l g_q (l, q = 1, 2, · · · , r) satisfy the Lipschitz condition (4.1.7) and that the g_l have bounded second-order partial derivatives with respect to all variables. If E[‖φ‖_{L∞}^p] < ∞, p ≥ 4, then for the Milstein-like scheme (4.2.23) we have

max_{1≤n≤N_T} E|X(t_n) − X_n|^2 = O(h^2),  (4.2.34)

when the double integrals Ĩ_{q,l,t_n,t_{n+1},0} and Ĩ_{q,l,t_n,t_{n+1},τ} are computed by (4.2.24) and N_h is of the order of 1/h.


Proof. We present the proof for d = 1 in (4.1.6); it can be extended to the multi-dimensional case d > 1 without difficulty. Subtracting (4.2.23) from (4.2.26) and taking expectations after squaring both sides, we have

E[(X^M_{n+1} − X_{n+1})^2] = E[(X^M_n − X_n)^2] + 2 Σ_{i=0}^4 E[(X^M_n − X_n)R_i] + Σ_{i,j=0}^4 E[R_i R_j],

where we denote f^M_n = f(X^M_n, X^M_{n−m}) and g^M_{l,n} = g_l(X^M_n, X^M_{n−m}), and

R_0 = h(f^M_n − f_n) + Σ_{l=1}^r (g^M_{l,n} − g_{l,n}) ΔW_{l,n},

R_1 = Σ_{l=1}^r Σ_{q=1}^r [∂_x g^M_{l,n} g^M_{q,n} − ∂_x g_{l,n} g_{q,n}] I^F_{q,l,t_n,t_{n+1},0},

R_2 = Σ_{l=1}^r Σ_{q=1}^r [∂_{x_τ} g^M_{l,n} g^M_{q,n−m} − ∂_{x_τ} g_{l,n} g_{q,n−m}] I^F_{q,l,t_n,t_{n+1},τ},

R_3 = Σ_{l=1}^r Σ_{q=1}^r ∂_x g^M_{l,n} g^M_{q,n} (I_{q,l,t_n,t_{n+1},0} − I^F_{q,l,t_n,t_{n+1},0}),

R_4 = Σ_{l=1}^r Σ_{q=1}^r ∂_{x_τ} g^M_{l,n} g^M_{q,n−m} (I_{q,l,t_n,t_{n+1},τ} − I^F_{q,l,t_n,t_{n+1},τ}).

Similar to the proof of Theorem 4.2.1, we have

E[R_0^2] ≤ C(h^2 + h)(E[(X^M_n − X_n)^2] + E[(X^M_{n−m} − X_{n−m})^2]),  (4.2.35)

E[R_1^2] ≤ C max_{1≤l,q≤r} E[|X^M_n − X_n|^2 + |X^M_{n−m} − X_{n−m}|^2] E[(I^F_{q,l,t_n,t_{n+1},0})^2],

E[R_2^2] ≤ C max_{1≤l,q≤r} E[(|X^M_n − X_n|^2 + |X^M_{n−m} − X_{n−m}|^2)(I^F_{q,l,t_n,t_{n+1},τ})^2],

E[R_3^2] ≤ C max_{1≤l<q≤r} E[1 + |X^M_n|^2 + |X^M_{n−m}|^2] E[(I_{q,l,t_n,t_{n+1},0} − I^F_{q,l,t_n,t_{n+1},0})^2],

E[R_4^2] ≤ C max_{1≤l<q≤r} E[(1 + |X^M_n|^2 + |X^M_{n−m}|^2)(I_{q,l,t_n,t_{n+1},τ} − I^F_{q,l,t_n,t_{n+1},τ})^2].

First, we establish the estimates

E[R_i^2] ≤ Ch^3,  i = 3, 4.  (4.2.36)

The case i = 3 follows directly from Lemma 4.2.4 and the boundedness of the moments of X_n and X^M_n. By Lemma 4.2.4 and (4.2.24), we have

E[(I_{q,l,t_n,t_{n+1},τ} − I^F_{q,l,t_n,t_{n+1},τ})^4]
  = E[(−(√2 h/(2π)) ξ_{q,1}^{(n−m)} Σ_{p=s+1}^∞ (1/p) ξ_{l,2p}^{(n)} + (h/(2π)) Σ_{p=s_1+1}^∞ (1/p) [ξ_{q,2p+1}^{(n−m)} ξ_{l,2p}^{(n)} − ξ_{q,2p}^{(n−m)} ξ_{l,2p+1}^{(n)}])^4]
  ≤ Ch^4 [(Σ_{p=s+1}^∞ 1/p^2)^2 + (Σ_{p=s_1+1}^∞ 1/p^2)^2] ≤ C h^4/N_h^2,


where s = ⌊N_h/2⌋ and s_1 = ⌊(N_h − 1)/2⌋. As N_h is of the order of 1/h, we have

E[(I_{q,l,t_n,t_{n+1},τ} − I^F_{q,l,t_n,t_{n+1},τ})^4] ≤ Ch^6.  (4.2.37)

Then by the fact that X_n and X^M_n have bounded fourth-order moments, the Cauchy inequality, and (4.2.37), we reach (4.2.36) for i = 4.

Second, we estimate E[R_i^2], i = 1, 2. By (4.2.24), the Lipschitz condition (4.1.7), and N_h being of the order of 1/h, we have

E[R_1^2] ≤ Ch(E[(X^M_n − X_n)^2] + E[(X^M_{n−m} − X_{n−m})^2]).  (4.2.38)

Now we need to estimate E[R_2^2]. By the Lipschitz condition (4.1.7), the adaptedness of X^M_{n−m} and X_{n−m}, and the Cauchy inequality (applied twice), we have

E[R_2^2] ≤ C max_{1≤l,q≤r} {E[|X^M_n − X_n|^2 (I^F_{q,l,t_n,t_{n+1},τ})^2] + E[|X^M_{n−m} − X_{n−m}|^2 (I^F_{q,l,t_n,t_{n+1},τ})^2]}
  ≤ C max_{1≤l,q≤r} (E[|X^M_n − X_n|^4])^{1/4} (E[(I^F_{q,l,t_n,t_{n+1},τ})^8])^{1/4} (E[|X^M_n − X_n|^2])^{1/2} + Ch^2 E[(X^M_{n−m} − X_{n−m})^2].

It can be readily checked from (4.2.24) that E[(I^F_{q,l,t_n,t_{n+1},τ})^8] ≤ Ch^8. Hence, from the boundedness of moments, we have

E[R_2^2] ≤ Ch^2 (E[(X^M_n − X_n)^2])^{1/2} + Ch^2 E[(X^M_{n−m} − X_{n−m})^2].  (4.2.39)

Now we estimate E[(X^M_n − X_n)R_i], i = 0, 1, 2, 3, 4. By the adaptedness of X_n and the Lipschitz condition for f, we have

E[(X^M_n − X_n)R_0] ≤ Ch E[|X^M_n − X_n|^2 + |X^M_{n−m} − X_{n−m}|^2].  (4.2.40)

By the adaptedness of X_n, the fact E[I^F_{q,l,t_n,t_{n+1},0}] = δ_{q,l} h/2 (δ_{q,l} is the Kronecker delta), and the Lipschitz condition for ∂_x g_l g_q, we have

E[(X^M_n − X_n)R_1] ≤ Ch E[|X^M_n − X_n|^2 + |X^M_{n−m} − X_{n−m}|^2].  (4.2.41)

By the adaptedness of X_n and E[I_{q,l,t_n,t_{n+1},0} − I^F_{q,l,t_n,t_{n+1},0}] = 0, we have

E[(X^M_n − X_n)R_3] = 0.  (4.2.42)

Similar to the proof of (4.2.16), we have

E[(X^M_n − X_n)R_4] = 0.  (4.2.43)

Then by (4.2.35), (4.2.36)–(4.2.39), (4.2.40)–(4.2.43), and the Cauchy inequality, we have, for n ≥ m,

E[(X^M_{n+1} − X_{n+1})^2] ≤ (1 + Ch) E[(X^M_n − X_n)^2] + Ch E[(X^M_{n−m} − X_{n−m})^2] + Ch^2 (E[(X^M_n − X_n)^2])^{1/2} + Ch^3.  (4.2.44)

Similarly, (4.2.44) also holds for 1 ≤ n ≤ m − 1. From here and by the nonlinear Gronwall inequality, we reach the conclusion (4.2.34). □


Remark 4.2.6 When the piecewise linear approximation of the double integrals (4.2.25) is used in the Milstein-like scheme (4.2.23), first-order strong convergence can be proved similarly when N_h is of the order of 1/h.

The spectral truncations we use come from the piecewise linear interpolation and a Fourier expansion. A comparison between these two truncations will be presented for a specific numerical example in Section 4.4, where it is shown that the Fourier approach is faster than the piecewise constant approach, similar to the case τ = 0 (SODEs); see [358, Section 1.4].

4.3 Linear stability of some schemes

For fixed h, let us analyze the behavior of the numerical schemes described above as t_j → ∞. This is done by analyzing the simple but revealing linear test problem

dX(t) = λX(t) dt + σX(t − τ) dW(t),  t ∈ (0, T],
X(t) = φ(t),  t ∈ [−τ, 0],  (4.3.1)

where λ < 0 and σ, τ ≥ 0. It can be shown that when 2λ + σ^2 < 0, the solution to (4.3.1) is asymptotically stable in the mean-square sense:

lim_{t→∞} E[X^2(t)] = 0.

Specifically, when τ = 0, the solution is

X(t) = φ(0) exp((λ − σ^2/2)t + σW(t)).

If 2λ + σ^2 < 0, then

lim_{t→∞} E[X^2(t)] = φ^2(0) lim_{t→∞} E[exp((2λ − σ^2)t + 2σW(t))] = φ^2(0) lim_{t→∞} exp((2λ + σ^2)t) = 0.

Stability region of the forward Euler scheme

Applying the forward Euler scheme to the linear test equation (4.3.1), we have

X_{n+1} = (1 + z)X_n + √y X_{n−m} ξ_n,  z = λh,  y = σ^2 h.

Since ξ_n is independent of X_n and X_{n−m}, the cross term vanishes and E[X^2_{n+1}] = (1 + z)^2 E[X^2_n] + y E[X^2_{n−m}], and thus

E[X^2_{n+1}] ≤ ((1 + z)^2 + y) max(E[X^2_n], E[X^2_{n−m}]).

We therefore need (1 + z)^2 + y < 1 so that E[X^2_n] is decreasing, i.e., asymptotically mean-square stable. Thus, the stability region is

{(z, y) ∈ (−2, 0) × (0, 1) | (1 + z)^2 + y < 1}.

This mean-square stability region is illustrated in Figure 4.1. In the following, it is shown that the stability regions of the midpoint and predictor-corrector schemes are larger than that of the forward Euler scheme when τ > 0.
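The boundary (1 + z)^2 + y = 1 can be confirmed by iterating the second-moment recursion E[X^2_{n+1}] = (1 + z)^2 E[X^2_n] + y E[X^2_{n−m}] directly; a small sketch, with a delay m and grid of our own choosing:

```python
import numpy as np

def ms_stable_euler(z, y, m=4, n_steps=4000):
    """Iterate M_{n+1} = (1+z)^2 M_n + y M_{n-m}, the second-moment
    recursion of forward Euler for the linear test equation (4.3.1)."""
    M = [1.0] * (m + 1)                 # history of second moments
    for _ in range(n_steps):
        new = (1.0 + z) ** 2 * M[-1] + y * M[0]
        if new > 1e12:
            return False                # blow-up: mean-square unstable
        M = M[1:] + [new]
    return M[-1] < 1.0                  # decayed: mean-square stable

agree = True
for z in np.linspace(-1.9, -0.1, 10):
    for y in np.linspace(0.05, 1.2, 10):
        margin = (1.0 + z) ** 2 + y - 1.0
        if abs(margin) < 0.05:          # skip points hugging the boundary
            continue
        agree = agree and (ms_stable_euler(z, y) == (margin < 0))
# agree stays True: the observed region matches (1+z)^2 + y < 1.
```

The outcome does not depend on the delay m, since the dominant characteristic root of this positive recursion crosses the unit circle exactly at (1 + z)^2 + y = 1.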


Fig. 4.1. The mean-square stability region of the forward Euler scheme (4.2.3) when τ ≥ 0. [Plot: σ^2 h (vertical, 0 to 4) versus λh (horizontal, −2 to 0); curves: exact solution, forward Euler.]

Stability analysis of the predictor-corrector scheme

In Stratonovich form, (4.3.1) is written as

dX(t) = λ̃X(t) dt + σX(t − τ) ◦ dW(t),  t ∈ (0, T],
X(t) = φ(t),  t ∈ [−τ, 0],  (4.3.2)

where λ̃ = λ − (σ^2/2) χ_{τ=0}. Applying (4.2.3) to (4.3.2), we have the predictor and corrector

X̄_{n+1} = X_n + λ̃h X_n + σ√h X_{n−m} ξ_n,  n ≥ 0,  m = τ/h > 0,
X_{n+1} = X_n + (λ̃h/2)(X_n + X̄_{n+1}) + (σ√h/2)(X_{n−m} + X̄_{n−m+1}) ξ_n.  (4.3.3)

The scheme can then be simplified as

X_{n+1} = (1 + λ̃h + λ̃^2h^2/2) X_n + (σ√h/2)(1 + λ̃h) ξ_n X_{n−m} + (σ√h/2) ξ_n X̄_{n−m+1}.

By the Cauchy-Schwarz inequality, the second moment of X_{n+1} is bounded by

E[X^2_{n+1}] ≤ R^2(λ̃h) E[X^2_n] + Q_1^2(λ̃h, σ√h) E[X^2_{n−m}] + Q_2^2(λ̃h, σ√h) E[X^2_{n−m+1}],


where R(λ̃h) = 1 + λ̃h + λ̃^2h^2/2 and

Q_1^2(λ̃h, σ√h) = (σ^2h/4)(1 + λ̃h)^2 + (σ^2h/4)|1 + λ̃h|,  Q_2^2(λ̃h, σ√h) = σ^2h/4 + (σ^2h/4)|1 + λ̃h|.

When

R^2(λ̃h) + Q_1^2(λ̃h, σ√h) + Q_2^2(λ̃h, σ√h) < 1,

i.e.,

(1 + λ̃h + λ̃^2h^2/2)^2 + (σ^2h/4)(|1 + λ̃h| + 1)^2 < 1,  (4.3.4)

the second moment E[X^2_{n+1}] is asymptotically stable. According to (4.3.4), the stability region of the predictor-corrector scheme (4.2.3) when τ > 0 is (here z = λ̃h and y = σ^2h)

{(z, y) ∈ (−2, 0) × [0, 3) | F(z, y) < 0 when −1 < z < 0, and G(z, y) > 0 when −2 < z ≤ −1 and y < 3},

where F(z, y) = (2z + y)(1 + z/2)^2 + z(2 + z)z^2/4 and G(z, y) = z^3/4 + z^2 + 2z + 2 + (y/4)z. The stability region is illustrated in Figure 4.2. The stability region of the linear equation, which is 2λ + σ^2 < 0, is plotted only for λh ≥ −2.5.

Now let us derive the stability region of the predictor-corrector scheme (4.2.3) from the condition (4.3.4). When σ = 0, we need |R(λ̃h)| < 1, which gives −2 < λ̃h < 0 (i.e., |1 + λ̃h| < 1). Denote z = λ̃h and y = σ^2h. If 1 + λ̃h > 0, then (4.3.4) becomes

(1 + z + z^2/2)^2 + (y/4)(z + 2)^2 < 1.

It can be verified that (1 + z + z^2/2)^2 + (y/4)(z + 2)^2 − 1 = (2z + y)(1 + z/2)^2 + z(2 + z)z^2/4, so for 1 + z > 0 and |1 + z| < 1 stability requires

F(z, y) = (2z + y)(1 + z/2)^2 + z(2 + z)z^2/4 < 0.

If 1 + z = 1 + λ̃h ≤ 0, then (4.3.4) becomes

(1 + z + z^2/2)^2 + (y/4)z^2 < 1.

Let g(z, y) = (1 + z + z^2/2)^2 + (y/4)z^2 − 1. We want to find the range of z such that

g(z, y) < 0  when −2 < z ≤ −1.

Recall that

g(z, y) = z(z^3/4 + z^2 + 2z + 2 + (y/4)z).


Fig. 4.2. The mean-square stability region of the predictor-corrector scheme (4.2.3) when τ > 0. [Plot: σ^2 h (vertical, 0 to 5) versus λh (horizontal, −2.5 to 0); curves: exact solution, predictor-corrector.]

Since z < 0, we then want G(z, y) = g(z, y)/z = z^3/4 + z^2 + 2z + 2 + (y/4)z > 0. Note that

G(−1, y) = (3 − y)/4,  G(−2, y) = −y/2,  ∂_z G(z, y) = (3/4)z^2 + 2z + 2 + y/4 > 0.

Thus, only when 3 − y > 0 does there exist −2 < z_0 < −1 such that 0 = G(z_0, y) < G(−1, y).
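A small consistency check (a sketch of our own): the piecewise characterization via F and G should reproduce the closed-form condition (4.3.4) on a grid of (z, y) values.

```python
import numpy as np

def y_star(z):
    """Boundary of (4.3.4): the largest y with
    (1 + z + z^2/2)^2 + (y/4)(|1 + z| + 1)^2 < 1."""
    R = 1.0 + z + 0.5 * z * z
    return 4.0 * (1.0 - R * R) / (np.abs(1.0 + z) + 1.0) ** 2

def F(z, y):   # stability criterion for -1 < z < 0
    return (2*z + y) * (1 + z/2)**2 + z * (2 + z) * z**2 / 4

def G(z, y):   # criterion (g divided by z) for -2 < z <= -1
    return z**3/4 + z**2 + 2*z + 2 + y*z/4

ok = True
for z in np.linspace(-1.99, -0.01, 199):
    for y in np.linspace(0.01, 4.0, 80):
        inside = y < y_star(z)
        crit = (F(z, y) < 0) if z > -1 else (G(z, y) > 0)
        ok = ok and (inside == crit)
# ok stays True: F < 0 (resp. G > 0) is exactly condition (4.3.4).
```

Note also that y_star(−1) = 3, which is where the bound y < 3 in the description of the stability region comes from.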

Stability analysis of the midpoint scheme

The midpoint scheme applied to (4.3.2) reads

X_{n+1} = X_n + λ̃h (X_n + X_{n+1})/2 + σ (X_{n−m} + X_{n−m+1})/2 ζ^{(n)} √h,  n ≥ 0,  m = τ/h ≥ 0,  (4.3.5)

where ζ^{(n)} is a truncation of the standard Gaussian random variable ξ^{(n)} as in (4.2.19). When τ > 0, it holds that X_{n+1} = R(λ̃h) X_n + Q(λ̃h, σ√h)(X_{n−m} + X_{n−m+1}) ζ^{(n)}, where

R(λ̃h) = (1 + λ̃h/2)/(1 − λ̃h/2),  Q(λ̃h, σ√h) = (1/2) σ√h/(1 − λ̃h/2).


Then we have

E[X^2_{n+1}] = R^2(λ̃h) E[X^2_n] + Q^2(λ̃h, σ√h) E[(X_{n−m} + X_{n−m+1})^2] E[(ζ^{(n)})^2]
  < R^2(λ̃h) E[X^2_n] + 2Q^2(λ̃h, σ√h)(E[X^2_{n−m}] + E[X^2_{n−m+1}])  (4.3.6)
  ≤ (R^2(λ̃h) + 4Q^2(λ̃h, σ√h)) max(E[X^2_n], E[X^2_{n−m}], E[X^2_{n−m+1}]).

When R^2(λ̃h) + 4Q^2(λ̃h, σ√h) < 1, i.e.,

((1 + λ̃h/2)/(1 − λ̃h/2))^2 + (σ√h/(1 − λ̃h/2))^2 < 1,

then E[X^2_{n+1}] is strictly decreasing and asymptotically stable. Expanding this condition gives σ^2 h < −2λ̃h, so we conclude (recalling λ̃ = λ when τ > 0) that the midpoint scheme is asymptotically stable for any h > 0 as long as λ + σ^2/2 < 0. In other words, the midpoint scheme has the same asymptotic mean-square stability region as the equation (4.3.1).
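The h-independence of this stability condition can be sanity-checked numerically (a sketch; the parameter values are ours, chosen off the knife edge σ^2 = −2λ):

```python
import numpy as np

def R(lam, h):
    return (1.0 + lam * h / 2.0) / (1.0 - lam * h / 2.0)

def Q(lam, sigma, h):
    return 0.5 * sigma * np.sqrt(h) / (1.0 - lam * h / 2.0)

# For tau > 0 (so lambda-tilde = lambda), R^2 + 4 Q^2 < 1 should hold
# exactly when 2*lambda + sigma^2 < 0, for every step size h > 0.
ok = True
for lam in (-0.5, -1.0, -4.0):
    for sigma in (0.3, 0.9, 1.1, 1.5, 2.7, 2.9):
        for h in (0.01, 0.1, 1.0, 10.0):
            stable_scheme = R(lam, h)**2 + 4.0 * Q(lam, sigma, h)**2 < 1.0
            stable_sde = 2.0 * lam + sigma**2 < 0.0
            ok = ok and (stable_scheme == stable_sde)
# ok stays True, for large step sizes h as well as small ones.
```

This unconditional (A-stable-like) behavior is what makes the implicit midpoint scheme attractive for long-time integration despite its cost per step.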

4.4 Numerical results

In this section, we test the convergence rates of the aforementioned schemes and compare their numerical performance under commutative or noncommutative noises, as well as the effect of delay. In the first example, we test the predictor-corrector scheme (4.2.3) and the midpoint scheme (4.2.19) and show that both methods are of half-order mean-square convergence. Further, both schemes converge with order one in the mean-square sense for an SDDE with a single white noise and no time delay in the diffusion coefficients. In the second example, we investigate the Milstein-like scheme (4.2.23) and show that it is first-order convergent for SDDEs with multiple white noises.

Throughout this section, mean-square errors of numerical solutions are defined as

ρ_{h,T} = ((1/n_p) Σ_{i=1}^{n_p} |X_h(T, ω_i) − X_{2h}(T, ω_i)|^2)^{1/2},

where ω_i denotes the i-th sample path and n_p is the total number of sample paths.

The numerical tests were performed using Matlab R2012a on a Dell Optiplex 780 computer (CPU E8500, 3.16 GHz). We used the Mersenne Twister random generator with seed 1 and took a large number of paths so that the statistical error can be ignored. Newton's method with tolerance h^2/100 was used to solve the nonlinear algebraic equations at each step of the implicit schemes.
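For reference, the observed orders reported in the tables below can be computed from successive errors as log(ρ_{2h,T}/ρ_{h,T})/log 2; a small helper (ours, not from the book):

```python
import numpy as np

def observed_orders(h_values, errors):
    """Observed convergence orders from errors on a decreasing
    sequence of step sizes (largest h first)."""
    h = np.asarray(h_values, dtype=float)
    e = np.asarray(errors, dtype=float)
    return np.log(e[:-1] / e[1:]) / np.log(h[:-1] / h[1:])

h = [2.0**-5, 2.0**-6, 2.0**-7, 2.0**-8]
orders = observed_orders(h, [0.3 * hi for hi in h])
# errors behaving like C*h give orders [1., 1., 1.]
```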

We test the convergence rates of the predictor-corrector scheme (4.2.3) and the midpoint scheme (4.2.19) for SDDEs with different types of noises: noncommutative noise and single noise. We will show that a time delay in a diffusion coefficient keeps both methods convergent only at half order, while for an SDDE with a single noise the two schemes can be of first-order accuracy in the mean-square sense if the time delay does not appear explicitly in the diffusion coefficients.

Example 4.4.1 Consider Equation (4.1.6) in one dimension with the initial function φ(t) = t + 0.2 and different diffusion terms:

• commutative (single) white noise without delay in the diffusion coefficient:

dX = [−X(t) + sin(X(t − τ))] dt + sin(X(t)) ◦ dW(t);  (4.4.1)

• noncommutative noises without delay in the diffusion coefficients:

dX = [−X(t) + sin(X(t − τ))] dt + sin(X(t)) ◦ dW_1(t) + 0.5X(t) ◦ dW_2(t),  (4.4.2)

where the noises are noncommutative as ∂_x(sin(x)) 0.5x − ∂_x(0.5x) sin(x) ≠ 0;

• single white noise with delay in the diffusion coefficient:

dX = [−X(t) + sin(X(t − τ))] dt + sin(X(t − τ)) ◦ dW(t).  (4.4.3)

For Equation (4.4.1), there is a single noise with no delay in the diffusion coefficient; this noise is a special case of commutative noises. The convergence order of these two schemes is one in the mean-square sense, see, e.g., [356]. The convergence order is numerically verified in Figure 4.3. The effect of the delay τ is small and does not affect the convergence order.
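The predictor-corrector scheme (4.2.3) itself is not restated in this excerpt, so the sketch below uses a Heun-type predictor-corrector in the same spirit (Euler predictor, trapezoidal corrector) for Equation (4.4.1); the horizon, step sizes, and path count are our own, and the error is measured between solutions at step sizes h and h/2 driven by the same Brownian paths:

```python
import numpy as np

def f(x, xd):                 # drift of (4.4.1)
    return -x + np.sin(xd)

def g(x):                     # diffusion of (4.4.1); single noise, no delay
    return np.sin(x)

def heun_sdde(h, tau, T, dW, phi=lambda t: t + 0.2):
    """Heun-type predictor-corrector for dX = f(X, X_tau) dt + g(X) o dW."""
    m, N = round(tau / h), round(T / h)
    X = np.empty((dW.shape[0], N + m + 1))
    X[:, :m + 1] = phi(np.linspace(-tau, 0.0, m + 1))   # history on [-tau, 0]
    for n in range(N):
        i = n + m                                       # column of time t_n
        xp = X[:, i] + h * f(X[:, i], X[:, n]) + g(X[:, i]) * dW[:, n]
        X[:, i + 1] = (X[:, i]
                       + 0.5 * h * (f(X[:, i], X[:, n]) + f(xp, X[:, n + 1]))
                       + 0.5 * (g(X[:, i]) + g(xp)) * dW[:, n])
    return X[:, -1]

rng = np.random.default_rng(1)
tau, T, n_paths = 0.25, 1.0, 400
errs = []
for h in (1.0 / 64, 1.0 / 128):
    n_fine = round(T / (h / 2))
    dW_fine = np.sqrt(h / 2) * rng.standard_normal((n_paths, n_fine))
    dW_coarse = dW_fine[:, ::2] + dW_fine[:, 1::2]      # same paths, step h
    diff = heun_sdde(h, tau, T, dW_coarse) - heun_sdde(h / 2, tau, T, dW_fine)
    errs.append(np.sqrt(np.mean(diff ** 2)))
ratio = errs[0] / errs[1]     # near 2 for first-order convergence
```

For this single-noise, drift-delay-only example the error roughly halves when h is halved, consistent with the first-order convergence shown in Figure 4.3; by the book's results, moving the delay into the diffusion coefficient (as in (4.4.3)) would drop the ratio toward √2.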

Fig. 4.3. Mean-square errors of the predictor-corrector (left) and midpoint (right) schemes for Example 4.4.1 at T = 5 with different τ, using n_p = 10000 sample paths: single white noise with a non-delayed diffusion term. These figures are adapted from [61]. [Log-log plots of mean-square error versus h for τ = 1/2, 1/4, 1/16, with reference slope h^1.]


Fig. 4.4. Mean-square convergence test of the predictor-corrector (left) and midpoint (right) schemes on Example 4.4.1 at T = 5 with different τ, using n_p = 10000 sample paths: multiple white noises with non-delayed diffusion terms. These figures are adapted from [61]. [Log-log plots of mean-square error versus h for τ = 1/2, 1/4, 1/16, with reference slope h^{1/2}.]

For Equation (4.4.2) with noncommutative noises, the convergence order is one half for both schemes; see Figure 4.4.

For Equation (4.4.3), there is a single noise, but the diffusion coefficient contains a delay term. In this case, we observe half-order strong convergence in Figure 4.5. This implies that the delay has affected the convergence order, since the order is one if no delay appears in the diffusion coefficient, as in (4.4.1).

Fig. 4.5. Mean-square convergence test of the predictor-corrector (left) and midpoint (right) schemes on Example 4.4.1 at T = 5 with different τ, using n_p = 10000 sample paths: single white noise with a delayed diffusion term. These figures are adapted from [61]. [Log-log plots of mean-square error versus h for τ = 1/2, 1/4, 1/16, with reference slope h^{1/2}.]

From this example, we conclude that for the predictor-corrector and midpoint schemes, when the time delay appears only in the drift term, the convergence order is one for equations with commutative noises and one half for equations with noncommutative noises. However, when the diffusion coefficients contain time delays, these two schemes are only of half order even for equations with a single white noise.

In the following examples, we test the Milstein-like scheme (4.2.23) using different bases, i.e., the piecewise constant basis (2.2.9) and the Fourier approximation (2.2.10), and compare its numerical performance with the predictor-corrector and midpoint schemes. For the Milstein-like scheme, we show that for multiple noises the computational cost for achieving the same accuracy is much higher than for the other two schemes (see Example 4.4.2), while for single noise the computational cost for the same accuracy is lower (see Example 4.4.3).

To reduce the computational cost, the double integrals in the Milstein-like scheme are computed by the Fourier expansion approximation (4.2.24) and the relation

I_{q,l,t_n,t_{n+1},0} = ΔW_{l,n} ΔW_{q,n} − I_{l,q,t_n,t_{n+1},0},  I_{l,l,t_n,t_{n+1},0} = (ΔW_{l,n})^2/2.  (4.4.4)

We also use the following relations:

I_{q,l,t_n,t_n+ph,0} = Σ_{j=0}^{p−1} [I_{q,l,t_n+jh,t_n+(j+1)h,0} + ΔW_{l,n+j} χ_{j≥1} Σ_{i=0}^{j−1} ΔW_{q,n+i}],

I_{q,l,t_n,t_n+ph,τ} = Σ_{j=0}^{p−1} [I_{q,l,t_n+jh,t_n+(j+1)h,τ} + ΔW_{l,n+j} χ_{j≥1} Σ_{i=0}^{j−1} ΔW_{q,n−m+i}].
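The first relation in (4.4.4) reflects the symmetry I_{q,l} + I_{l,q} = ΔW_{l,n} ΔW_{q,n} of Stratonovich double integrals; it is satisfied exactly (up to round-off) by the piecewise-constant approximations (4.2.25), as a quick sketch with our own sub-increments shows:

```python
import numpy as np

def I_L(dW_l, dW_q):
    """Piecewise-constant approximation (4.2.25):
    sum_j dW_l[j] * (dW_q[j]/2 + sum_{i<j} dW_q[i])."""
    before = np.concatenate(([0.0], np.cumsum(dW_q)[:-1]))
    return np.sum(dW_l * (0.5 * dW_q + before))

rng = np.random.default_rng(1)
Nh, h = 64, 1.0 / 32
dW_l = np.sqrt(h / Nh) * rng.standard_normal(Nh)   # sub-increments of W_l
dW_q = np.sqrt(h / Nh) * rng.standard_normal(Nh)   # sub-increments of W_q

lhs = I_L(dW_l, dW_q) + I_L(dW_q, dW_l)   # I_{q,l} + I_{l,q}
rhs = dW_l.sum() * dW_q.sum()             # Delta W_l * Delta W_q
diag = I_L(dW_l, dW_l)                    # equals (Delta W_l)^2 / 2 exactly
```

This is why only the integrals with q > l actually need to be simulated at each step, which is where the (3r^2 − r)/2 factor in the operation counts of Example 4.4.2 comes from.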

Example 4.4.2 We consider the Milstein-like scheme (4.2.23) for

dX(t) = [−9X(t) + sin(X(t − τ))] dt + [sin(X(t)) + X(t − τ)] ◦ dW_1(t) + [X(t) + cos(0.5X(t − τ))] ◦ dW_2(t),  t ∈ (0, T],
X(t) = t + τ + 0.1,  t ∈ [−τ, 0].  (4.4.5)

In Table 4.1, we show that for Equation (4.4.5) the Milstein-like scheme (4.2.23) converges with order one in the mean-square sense. Compared to the predictor-corrector scheme or the midpoint scheme at the same time step size, the computational cost of the Milstein-like scheme (4.2.23) is several times higher. In the Milstein-like scheme, the extra computational cost comes from evaluating the double integrals I^F_{q,l,t_n,t_{n+1},0} and I^F_{q,l,t_n,t_{n+1},τ} at each time step, which requires (7/(2h))(3r^2 − r)/2 operations when we take the relation (4.4.4) into account.

We also test the Milstein-like scheme (4.2.23) using the piecewise constant basis (2.2.9). The computational cost is even higher than that of using the Fourier basis for the same time step size. In fact, the number of operations for evaluating the double integrals using (4.2.25) is (1/(2h^2) + 5/(2h) − 1)(3r^2 − r)/2, which is O(1/h^2), much more than that of using the Fourier basis,


Table 4.1. Convergence rate of the Milstein-like scheme (left) for Equation (4.4.5) at T = 1 and comparison with the convergence rates of the predictor-corrector scheme (middle) and the midpoint scheme (right), using n_p = 4000 sample paths. The upper rows are with τ = 1/16 and the lower with τ = 1/4.

          Milstein-like                 predictor-corrector           midpoint
h         ρ_{h,T}    Order  Time (s)   ρ_{h,T}    Order  Time (s)   ρ_{h,T}    Order  Time (s)
2^{-5}    9.832e-02  –      1.0        7.164e-02  –      0.10       5.000e-02  –      0.29
2^{-6}    4.090e-02  1.27   1.7        3.734e-02  0.94   0.12       3.304e-02  0.60   0.41
2^{-7}    1.921e-02  1.09   3.3        2.308e-02  0.69   0.25       2.263e-02  0.55   0.79
2^{-8}    9.703e-03  0.99   6.4        1.616e-02  0.51   0.40       1.590e-02  0.51   1.54

2^{-5}    9.307e-02  –      0.93       6.956e-02  –      0.10       5.050e-02  –      0.22
2^{-6}    3.824e-02  1.28   1.6        3.582e-02  0.96   0.17       3.155e-02  0.68   0.39
2^{-7}    1.804e-02  1.08   2.8        2.205e-02  0.70   0.26       2.133e-02  0.56   0.78
2^{-8}    9.069e-03  0.99   5.5        1.434e-02  0.62   0.45       1.425e-02  0.58   1.59

O(1/h). Our numerical tests (not presented here) confirmed the fast increase in the number of operations.

However, the number of operations of the Milstein-like scheme can be significantly reduced when there is just a single diffusion term.

Example 4.4.3 We consider the Milstein-like scheme (4.2.23) for

dX(t) = [−2X(t) + 2X(t − τ)] dt + [sin(X(t)) + X(t − τ)] ◦ dW(t),  t ∈ (0, T],
X(t) = t + τ,  t ∈ [−τ, 0].  (4.4.6)

In Table 4.2, we observe that the Milstein-like scheme for Equation (4.4.6) is still of first-order convergence, while the predictor-corrector scheme and the midpoint scheme are only of half-order convergence. For the same accuracy, the computational cost of the Milstein-like scheme using the Fourier basis is less than that of the other two schemes. In fact, for single noise we only need to compute one double integral I_{1,1,t_n,t_{n+1},τ}. Moreover, when the coefficients of the diffusion term are small, a small number of Fourier modes suffices for large time step sizes, i.e., N_h can be O(1) instead of O(1/h). The computational cost can thus be reduced somewhat; see, e.g., [358, Chapter 3] for such a discussion for equations with small noises without delay.

In summary, the predictor-corrector scheme and the midpoint scheme are convergent with half order in the mean-square sense, and can be of first order if the underlying SDDE has a single (commutative) noise and the time delay appears only in the drift coefficients, see Example 4.4.1. In Example 4.4.2 the numerical tests show that the Milstein-like scheme is of first order in the mean-square sense for SDDEs with noncommutative noises wherever the time delay appears, i.e., in the drift and/or diffusion coefficients. Compared to the other two schemes, the Milstein-like scheme is more accurate but more expensive, as it requires evaluations of double integrals, with cost inversely proportional to the time step size and proportional to the square of the number of


128 4 Numerical schemes for SDEs with time delay using the Wong-Zakai...

Table 4.2. Convergence rate of the Milstein-like scheme (left) for Equation (4.4.6) (single white noise) at T = 1 and comparison with the convergence rates of the predictor-corrector scheme (middle) and the midpoint scheme (right) using np = 4000 sample paths. The delay τ is taken as 1/4.

h       ρ_{h,T}    Order  Time (s.) | h       ρ_{h,T}    Order  Time (s.) | ρ_{h,T}    Order  Time (s.)
2^{-5}  3.164e-02  –      0.28      | 2^{-8}  1.252e-02  –      0.37      | 1.263e-02  –      1.09
2^{-6}  1.688e-02  0.91   0.46      | 2^{-9}  9.219e-03  0.44   0.56      | 9.246e-03  0.45   2.05
2^{-7}  8.499e-03  0.99   0.79      | 2^{-10} 6.462e-03  0.51   1.03      | 6.471e-03  0.51   3.97
2^{-8}  4.570e-03  0.90   1.40      | 2^{-11} 4.617e-03  0.49   1.91      | 4.627e-03  0.48   7.58


noises. However, for SDDEs with single noise as in Example 4.4.3, the Milstein-like scheme (with the Fourier basis) can be superior to the predictor-corrector scheme and the midpoint scheme both in terms of accuracy and computational cost.

4.5 Summary and bibliographic notes

We have presented three distinct schemes using different time-discretization techniques after approximating the Brownian motion by a spectral expansion (a type of Wong-Zakai approximation): the midpoint scheme, the predictor-corrector scheme, and a Milstein-like scheme. The mechanism is similar to the Wong-Zakai approximation for stochastic differential equations without delay. Further, if there is no delay in the diffusion coefficients, the performance of all schemes is similar to that of schemes for stochastic ordinary differential equations without delay:

• The predictor-corrector scheme is of half-order convergence in the mean-square sense, see Theorem 4.2.1 and numerical examples in Chapter 4.4.

• The midpoint scheme is of half-order convergence in the mean-square sense, see numerical examples in Chapter 4.4.

• The Milstein-like scheme is of first-order convergence in the mean-square sense, see Theorem 4.2.5 and Examples 4.4.2 and 4.4.3.

Also, when the diffusion coefficients satisfy the commutative conditions, both the predictor-corrector scheme and the midpoint scheme can be of first-order convergence in the mean-square sense.

However, the numerical performance is a bit different if there is even a single diffusion coefficient with delay:

• The predictor-corrector scheme and the midpoint scheme are still of half-order convergence in the mean-square sense. This can be understood as the breaking of the commutative conditions on diffusion coefficients for stochastic ordinary differential equations.

• The Milstein-like scheme is much more expensive, as the double integrals involve the delayed history; this limits the use of this first-order scheme in practice, especially if high-dimensional Wiener processes are involved. However, it is efficient when there is only one Brownian motion (Wiener process).

Linear stability theory for the basic numerical schemes is presented in Section 4.3.

A spectral approximation with different time discretizations does not lead to higher-order numerical methods unless we use higher-order time discretizations. This is in accordance with classical theories on numerical methods for stochastic differential equations, see, e.g., [259, 354, 358]. In the following chapters, we show that this conclusion is also true for parabolic equations with



temporal white noise. However, in Chapter 10, we show that a spectral approximation of spatial white noise can lead to higher-order numerical methods for time-independent equations when the solutions are smooth, e.g., for elliptic and biharmonic equations with additive noise.

Bibliographic notes. The numerical solution of stochastic delay differential equations (SDDEs) has attracted increasing interest recently, as memory effects in the presence of noise are modeled with SDDEs in engineering and finance, e.g., [173, 221, 392, 449, 460]. Most of the numerical methods for SDDEs have focused on the convergence and stability of different time-discretization schemes since the early work [451, 452]. Currently, several time-discretization schemes have been well studied: Euler-type schemes (the forward Euler scheme [14, 279] and the drift-implicit Euler scheme [230, 306, 483]), the Milstein schemes [49, 222, 228, 260], the split-step schemes [176, 472, 501], and also some multi-step schemes [50, 51, 59, 60]. These schemes are usually based on the Ito-Taylor expansion, see, e.g., [260], or on anticipative calculus, see, e.g., [228].

The Wong-Zakai (WZ) approximation is different from these aforementioned approaches. The difference is that in WZ we first approximate the Brownian motion with an absolutely continuous process, see, e.g., [324, 453, 458] for a piecewise linear approximation of Brownian motion in SDDEs. Subsequently, we apply proper time-discretization schemes to the resulting equation, while the schemes mentioned in the last paragraph are ready for simulation without any further time discretization. The WZ approximation thus can be viewed as an intermediate step for deriving numerical schemes and can provide more flexibility in the discretization of Brownian motion before performing any time discretization. Moreover, with the WZ approximation we can apply the Taylor expansion rather than the Ito-Taylor expansion and anticipative calculus.

The Wong-Zakai approximation for SODEs, see, e.g., [481, 482], is a semi-discretization method where Brownian motion is approximated by finite-dimensional absolutely continuous stochastic processes before any discretization in time or in physical space. There are different types of WZ approximation, see, e.g., [241, 386, 434, 510]. The Wong-Zakai approximation has been extended in various aspects since the seminal papers [481, 482]:

• from single white noise to multiple white noise, see, e.g., [195, 435].
• from SODEs to SPDEs: hyperbolic equations (e.g., [352, 406, 407, 508]), parabolic equations (e.g., [3, 114]) including the Burgers equation (e.g., [382, 383]) and the Navier-Stokes equations (e.g., [79, 456, 457]), and equations on a manifold (e.g., [45, 198]), etc.
• from piecewise linear approximation to general approximations: mollifier type (e.g., [241, 323, 325]), Ikeda-Nakao-Yamato-type approximations (e.g., [143, 239, 323]) or their extensions (e.g., [195]), (Levy-Ciesielski) spectral type (e.g., [54, 315, 508]), general color noises (e.g., [3, 434, 435]), etc.



• from SODEs driven by Gaussian white noise to those with general processes: general semi-martingales (e.g., [127, 142, 212, 263, 283, 284, 401]), fractional Brownian motion (e.g., [20, 450]), rough paths (e.g., [137]), etc.

To the best of our knowledge, this is the first time that fully discretized numerical schemes via the Wong-Zakai approximation have been presented. One of the reasons that the Wong-Zakai approximation has not been popular as a numerical method is that it is difficult to establish error estimates. One difficulty is that in the Wong-Zakai approximation (not yet fully discretized in time), the solutions to the resulting equations are not adapted to the natural filtration of the Wiener process. To have an adapted solution, Gyongy and his collaborators, e.g., [200, 201], modified the standard piecewise linear approximation (2.2.4) as

W^{(n)}(t) = W(t_{i−1}) + (W(t_i) − W(t_{i−1})) (t − t_i)/(t_{i+1} − t_i), t ∈ [t_i, t_{i+1}). (4.5.1)

With such a modification, the authors of [200, 201] proved 1/4 − ε-order convergence [200] and 1/2 − ε-order convergence in the pathwise sense for the Wong-Zakai approximation of linear stochastic parabolic equations with multiplicative noise (white noise as coefficients of first-order differential operators).
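Gyongy's modification (4.5.1) is simple to implement: on each subinterval one interpolates with the *previous* Brownian increment, so the approximation at time s uses no future information. The following Python sketch (the helper name and the grid are our own, not from the text) illustrates this; the book's exercises use Matlab, so this is only an illustrative translation.

```python
import numpy as np

def gyongy_piecewise_linear(W, t, s):
    """Adapted piecewise-linear approximation (4.5.1): on [t_i, t_{i+1})
    interpolate with the previous increment W(t_i) - W(t_{i-1}), so the
    value at time s is measurable w.r.t. the past of W.  Valid for
    s in [t_1, t_N)."""
    i = max(np.searchsorted(t, s, side="right") - 1, 1)  # s in [t_i, t_{i+1})
    frac = (s - t[i]) / (t[i + 1] - t[i])
    return W[i - 1] + (W[i] - W[i - 1]) * frac

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 11)                    # uniform grid, h = 0.1
W = np.concatenate(([0.0], np.cumsum(rng.standard_normal(10) * np.sqrt(0.1))))
# At a grid point s = t_i the approximation equals W(t_{i-1}):
assert abs(gyongy_piecewise_linear(W, t, t[3]) - W[2]) < 1e-14
```

Note that, unlike the standard piecewise linear interpolant, this path does not pass through the grid values W(t_i); the one-step lag is the price paid for adaptedness.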

Simulation of double integrals (4.2.27). In [228], I_{q,l,t_n,t_{n+1},0} and I_{q,l,t_n,t_{n+1},τ} are approximated similarly. The Brownian motion W_q therein (taken relative to its value at t_n) is approximated by the sum of the linear interpolant (t − t_n)/(t_{n+1} − t_n) W_q(t_{n+1}) and a truncated Fourier expansion of the Brownian bridge W_q(t) − (t − t_n)/(t_{n+1} − t_n) W_q(t_{n+1}) for t_n ≤ t ≤ t_{n+1}, see also [159] and [259, Section 5.8]. It can be readily checked that this approximation is equivalent to the Fourier approximation (4.2.24). In numerical simulations (results are not presented), these two approximations lead to little difference in computational cost and accuracy, and the convergence order is the same.
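The linear-interpolant-plus-bridge construction can be sketched in a few lines of Python using the Karhunen-Loeve (Fourier sine) expansion of the Brownian bridge on one step; this is the expansion style referred to in [259, Section 5.8], though the function name and parameter values below are our own illustrative choices.

```python
import numpy as np

def brownian_fourier(dW, delta, t, N, rng):
    """Approximate Brownian motion on one step [0, delta] (relative to its
    left endpoint): linear interpolant of the increment dW plus an N-mode
    Fourier (Karhunen-Loeve sine) expansion of the Brownian bridge."""
    k = np.arange(1, N + 1)
    eta = rng.standard_normal(N)                 # i.i.d. N(0,1) amplitudes
    bridge = np.sum(np.sqrt(2.0 * delta) / (k * np.pi) * eta
                    * np.sin(k * np.pi * t / delta))
    return (t / delta) * dW + bridge

rng = np.random.default_rng(0)
# The bridge vanishes at both endpoints, so the approximation interpolates
# the true increment exactly, whatever the truncation level N:
assert abs(brownian_fourier(0.12, 0.01, 0.0, 50, rng)) < 1e-12
assert abs(brownian_fourier(0.12, 0.01, 0.01, 50, rng) - 0.12) < 1e-12
```

Integrating products of two such approximate paths over the step gives the Fourier approximations of the double integrals discussed above.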

As we note in the beginning of Chapter 2, the choice of complete orthonormal bases is arbitrary. However, the use of a general spectral approximation may lead to different accuracy, see, e.g., [287] for a detailed comparison of some spectral approximations of multiple Stratonovich integrals.

In addition to the Fourier approximation, several methods of approximating I_{q,l,t_n,t_{n+1},0} have been proposed: applying the trapezoidal rule, see, e.g., [358, Section 1.4], and a modified Fourier approximation, see, e.g., [480]. We note that the use of the trapezoidal rule leads to a formula similar to (4.2.25), which is shown to be less efficient than the Fourier approximation, see Example 4.4.2. In [480], I_{q,l,t_n,t_{n+1},0} is approximated by the sum of a Fourier approximation and a tail process A_{q,l,t_n,t_{n+1}}, where the tail A_{q,l,t_n,t_{n+1}} is modeled with the product of r(r − 1)/2-dimensional i.i.d. Gaussian random variables and a functional of increments of Brownian motion ΔW_{l,n}. It is shown in [159] that the modified Fourier approximation in [480] requires O(r^4 √h) i.i.d. Gaussian random variables to maintain first-order convergence, while the Fourier approximation requires O(r^2 h^{−1}) i.i.d.



Gaussian random variables. However, it is difficult to extend this approach to approximate I_{q,l,t_n,t_{n+1},τ} even when r is small, because the tail A_{q,l,t_n,t_{n+1},0} will be correlated with A_{q,l,t_n,t_{n+1},τ}; this correlation is difficult to identify and brings no computational benefit.

In practice, the cost of simulating double integrals is prohibitively expensive. However, there are cases where we can reduce or even avoid the simulation of double integrals. For example, when the diffusion coefficients are small and of the order ε, the coefficients of the double integrals are of order ε², and we may take ε²/N_h ∼ O(h) to achieve an accuracy of O(h) in the mean-square sense, according to the proof of Theorem 4.2.5. Thus, only a small N_h (the number of terms in the truncated spectral approximation of Brownian motion) is required if ε ∼ O(√h). Also, when the diffusion coefficients contain no delay and satisfy the so-called commutative noise condition, i.e.,

∂g_l(x)/∂x g_q(x) = ∂g_q(x)/∂x g_l(x),

the Milstein-like scheme can be rewritten as

X^M_{n+1} = X^M_n + h f(X^M_n, X^M_{n−m}) + Σ_{l=1}^r g_l(X^M_n) ΔW_{l,n}
            + (1/2) Σ_{l=1}^r Σ_{q=1}^r ∂_x g_l(X^M_n) g_q(X^M_n) ΔW_{l,n} ΔW_{q,n},  n = 0, 1, ..., N_T − 1.

In this case, only Wiener increments are used and the Milstein-like scheme is of low cost.
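The low cost of the commutative case comes from the fact that the double sum above involves only products ΔW_{l,n} ΔW_{q,n}. A minimal Python sketch (the book's exercises use Matlab; the function name, the delay-free diffusion, and the test parameters below are our own illustrative choices) for a scalar SDDE of this type:

```python
import numpy as np

def milstein_commutative(f, g, dg, x_hist, tau, h, T, r, rng):
    """Milstein-type step for a scalar Stratonovich SDDE whose diffusion
    coefficients g_l contain no delay and commute: only Wiener increments
    Delta W_{l,n} are needed (no double integrals).
    f      : drift f(x, x_delayed);  g, dg : lists of g_l and g_l'.
    x_hist : initial segment X(t) for t in [-tau, 0]; tau = m*h assumed."""
    m = int(round(tau / h))                 # delay in steps
    NT = int(round(T / h))
    vals = [x_hist(k * h - tau) for k in range(m + 1)]  # X(-tau), ..., X(0)
    for n in range(NT):                     # vals[m+n] = X_n, vals[n] = X_{n-m}
        Xn, Xd = vals[m + n], vals[n]
        dW = rng.standard_normal(r) * np.sqrt(h)
        incr = h * f(Xn, Xd) + sum(g[l](Xn) * dW[l] for l in range(r))
        incr += 0.5 * sum(dg[l](Xn) * g[q](Xn) * dW[l] * dW[q]
                          for l in range(r) for q in range(r))
        vals.append(Xn + incr)
    return np.array(vals[m:])               # X_0, ..., X_{NT}

# Illustrative run with the drift of (4.4.6) but a delay-free diffusion:
rng = np.random.default_rng(0)
X = milstein_commutative(lambda x, xd: -2.0 * x + 2.0 * xd,
                         [lambda x: 0.1 * x], [lambda x: 0.1],
                         lambda t: t + 0.25, tau=0.25, h=2.0 ** -6,
                         T=1.0, r=1, rng=rng)
```

Each step costs O(r²) products of increments rather than O(r² N_h) terms of truncated double integrals.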

4.6 Suggested practice

Exercise 4.6.1 Derive and plot the mean-square stability region of the Milstein-like scheme (4.2.23) for Equation (4.3.1).

Exercise 4.6.2 Use the definitions of the Ito and Stratonovich integrals in (2.3) to convert Equation (4.1.6) to an Ito equation. In particular, show that Equation (4.3.2) can be written as (4.3.1).

Hint. Apply the relation (2.3.2). Compare with the case τ = 0 in Exercise 3.6.3.

Exercise 4.6.3 Apply the backward (drift-implicit) Euler scheme to the linear test equation (4.3.1) and derive the mean-square stability region. Can you also apply the fully implicit Euler scheme and obtain the mean-square stability region?

Exercise 4.6.4 (Single noise, commutative noises) Write Matlab code for the Milstein-like scheme (4.2.23), the predictor-corrector scheme (4.2.3), and the midpoint scheme (4.2.19) for



dX(t) = [−2X(t) + 2X(t − τ)]dt + [X(t) + X(t − τ)] ◦ dW(t), t ∈ (0, T],

X(t) = t + τ, t ∈ [−τ, 0]. (4.6.1)

Check the pathwise (almost sure) convergence rate using

ρ^{a.s.}_{h,T} = |X_h(T, ω) − X_{2h}(T, ω)|,

where ω represents a user-defined simulation path. The convergence rates should be close to 1/2, 1/2, and 1, and can be computed as follows (cf. the rate in Appendix E):

log(ρ^{a.s.}_{h,T} / ρ^{a.s.}_{h/2,T}) / log(2).

Use the Fourier approximation (4.2.24) for the Milstein-like scheme.
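The bookkeeping of this pathwise test is the same for all three schemes: solve on a fine and a coarse grid driven by the same Brownian path, form ρ^{a.s.}_{h,T}, and take the logarithmic ratio. The Python sketch below uses a plain Euler-Maruyama step and a hypothetical additive-noise SDDE (for which Ito and Stratonovich coincide) as a stand-in for the schemes (4.2.23), (4.2.3), and (4.2.19), which are not reproduced here; all names and parameters are our own.

```python
import numpy as np

def euler_sdde(path_dW, h, tau, T, x_hist):
    """Euler-Maruyama stand-in solver for the hypothetical additive-noise SDDE
        dX = [-2 X(t) + 2 X(t - tau)] dt + 0.5 dW(t),
    driven by the prescribed increments path_dW."""
    m, NT = int(round(tau / h)), int(round(T / h))
    vals = [x_hist(k * h - tau) for k in range(m + 1)]   # X(-tau), ..., X(0)
    for n in range(NT):
        Xn, Xd = vals[m + n], vals[n]                    # X_n, X_{n-m}
        vals.append(Xn + h * (-2.0 * Xn + 2.0 * Xd) + 0.5 * path_dW[n])
    return vals[-1]                                      # X_h(T, omega)

def observed_order(rho_h, rho_half_h):
    """log(rho_h / rho_{h/2}) / log 2, as in Exercise 4.6.4."""
    return np.log(rho_h / rho_half_h) / np.log(2.0)

rng = np.random.default_rng(0)
tau, T, h = 0.25, 1.0, 2.0 ** -8
dW = rng.standard_normal(int(T / h)) * np.sqrt(h)        # one fine path
rho = abs(euler_sdde(dW, h, tau, T, lambda t: t + tau)
          - euler_sdde(dW.reshape(-1, 2).sum(axis=1), 2 * h, tau, T,
                       lambda t: t + tau))               # rho^{a.s.}_{h,T}
```

The key detail is `dW.reshape(-1, 2).sum(axis=1)`: the coarse-grid increments are sums of consecutive fine-grid increments, so both runs see the same ω.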

Exercise 4.6.5 (Non-commutative noise) Write Matlab code for the Milstein-like scheme (4.2.23), the predictor-corrector scheme (4.2.3), and the midpoint scheme (4.2.19) for Equation (4.4.2) and check the pathwise convergence rate as in Exercise 4.6.4.

Exercise 4.6.6 Write Matlab code for the Milstein-like scheme (4.2.23), the predictor-corrector scheme (4.2.3), and the midpoint scheme (4.2.19) for the following equation:

dX = [−X(t) + sin(X(t − τ))] dt + sin(X(t − τ)) ◦ dW_1(t) + 0.5X(t) ◦ dW_2(t), (4.6.2)

and check the pathwise convergence rate as in Exercise 4.6.4. Remark on your observations on the differences in the numerical results between τ = 0 and τ ≠ 0 for the Milstein-like scheme.


5

Balanced numerical schemes for SDEs with non-Lipschitz coefficients

In this chapter, we discuss numerical methods for SDEs with coefficients of polynomial growth. The nonlinear growth of the coefficients induces instabilities, especially when the growth is polynomial or even exponential. For stochastic differential equations (SDEs) with coefficients of polynomial growth at infinity that satisfy a one-sided Lipschitz condition, we prove a fundamental mean-square convergence theorem on the strong convergence order of a stable numerical scheme in Chapter 5.2. We apply the theorem to a number of existing numerical schemes. We present in Chapter 5.3 a special balanced scheme, which is explicit and of half-order mean-square convergence. Some numerical results are presented in Chapter 5.4. We summarize the chapter in Chapter 5.5 and present some bibliographic notes on numerical schemes for nonlinear SODEs. Three exercises are presented for interested readers.

5.1 A motivating example

Usually, in the numerical analysis of SDEs [259, 354, 358], it is assumed that the SDE coefficients are globally Lipschitz, which is a significant limitation, as many models of physical interest have coefficients growing faster at infinity than a linear function. If the global Lipschitz condition is violated, almost all standard (explicit) numerical methods will fail to converge, see, e.g., [218, 226, 359, 437]. To see this, let us consider the following example:

dX = −X^3 dt + dW(t), X(0) = X_0. (5.1.1)

Here the drift coefficient −X^3 is not globally Lipschitz, as it grows cubically. In the following, we show that the standard Euler scheme fails to converge.

© Springer International Publishing AG 2017
Z. Zhang, G.E. Karniadakis, Numerical Methods for Stochastic Partial Differential Equations with White Noise, Applied Mathematical Sciences 196, DOI 10.1007/978-3-319-57511-7_5




Suppose that we want to compute the solution at T, given a uniform partition of the time interval [0, T] and a time step size h. We assume that X(0) = X_0 = 1/h.¹ Then by the Euler scheme (3.2.1), we have

X_1 = X_0 − X_0^3 h + ξ_1 √h = 1/h − 1/h^2 + ξ_1 √h, ξ_1 ∼ N(0, 1).

With a large probability, X_1 is of the order 1/h^2, as ξ_1 takes small values with large probability, e.g., P(|ξ_1| ≤ 3) ≈ 0.997. By one more step of the Euler scheme, we have

X_2 = X_1 − X_1^3 h + ξ_2 √h = −1/h^2 + 1/h^5 + ξ_2 √h, ξ_2 ∼ N(0, 1),

which can be considered of the order 1/h^5 with a large probability. One more step gives us

X_3 = X_2 − X_2^3 h + ξ_3 √h ≈ −1/h^{14}, ξ_3 ∼ N(0, 1).

Even when h is relatively large, we may observe overflow in simulations, so the Euler scheme here is not convergent. It is proved in [234] that for (5.1.1) with any deterministic initial condition X_0, the Euler-Maruyama scheme with a uniform time step size diverges in the strong sense over any compact interval [0, T]. For this example, there are several approaches to obtain convergent numerical schemes:

• Euler schemes with variable step sizes. To avoid possible fast growth of the solution, we can set the time step size small enough when needed. For example, in the above example, we can take the time step size h_x as 1/x^2 when |x| > 1/√h, where x is the solution at the previous step.
• Implicit schemes, e.g., the scheme (3.2.27).
• Balanced implicit schemes, where the Euler scheme is modified with a penalty term, see Chapter 5.3.
• Tamed schemes (balanced explicit schemes), where the Euler scheme is modified so that the drift and diffusion coefficients are bounded by some polynomial growth of 1/h, where h is the time step size.
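Both the divergence mechanism and the tamed remedy can be seen in a few lines of Python: the sketch below (function name and parameters are our own; the taming a/(1 + h|a|) is the Hutzenthaler-Jentzen-Kloeden-type modification mentioned in the last bullet) runs the explicit and a tamed Euler scheme side by side for (5.1.1), started at X_0 = 1/h as above.

```python
import numpy as np

def euler_paths(h, T, X0, seed=0):
    """Explicit Euler vs a tamed (balanced explicit) Euler, both applied to
    dX = -X^3 dt + dW with the large initial value X0 = 1/h from the text.
    Taming replaces the drift a by a/(1 + h|a|), so one drift increment
    can never exceed 1 in magnitude."""
    rng = np.random.default_rng(seed)
    x_plain = x_tamed = np.float64(X0)
    with np.errstate(over="ignore", invalid="ignore"):
        for _ in range(int(round(T / h))):
            dW = rng.standard_normal() * np.sqrt(h)
            x_plain = x_plain + h * (-x_plain ** 3) + dW     # blows up
            a = -x_tamed ** 3
            x_tamed = x_tamed + h * a / (1.0 + h * abs(a)) + dW
    return x_plain, x_tamed

h = 0.05
plain, tamed = euler_paths(h, T=1.0, X0=1.0 / h)
# plain overflows to inf/nan within a few steps, mirroring the iterates
# X_1, X_2, X_3 computed above; tamed stays moderate.
```

The doubling-then-cubing of the exponent (1/h → 1/h^2 → 1/h^5 → 1/h^{14}) is exactly what drives the floating-point overflow.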

Strong schemes for SDEs with non-globally Lipschitz coefficients have been considered in a number of recent works, see, e.g., [218, 219, 226, 233, 235, 236, 335, 336, 375, 410, 474] and the references therein; see an extended literature review on this topic in [233].

When one is interested in simulating averages Eϕ(X(T)) of the solution to SDEs, weak-sense convergence cannot be guaranteed if the coefficients are not globally Lipschitz, see, e.g., [359, 360] for simulation of averages at finite

¹As Brownian motion can take values in R, the (numerical) solution may reach any value at a certain time step. We can assume that we compute the solutions from such a step and denote it as the zeroth step.



time and also of ergodic limits when ensemble averaging is used. The concept of rejecting exploding trajectories, proposed and justified in [359], allows us to use any numerical method for solving SDEs with non-globally Lipschitz coefficients for estimating averages. Following this concept, we do not take into account the approximate trajectories X(t) which leave a sufficiently large ball S_R := {x : |x| < R} during the time T. See other approaches for resolving this problem in the context of computing averages, including the case of simulating ergodic limits via time averaging, e.g., in [40, 341, 437].

For SDEs under non-globally Lipschitz assumptions on the coefficients, the convergence of many standard numerical methods can fail, and this motivates the recent interest both in theoretical support of existing numerical methods and in developing new methods.

In this chapter, we deal with mean-square (strong) approximation of SDEs with non-globally Lipschitz coefficients. We present a variant of the fundamental mean-square convergence theorem in the case of SDEs with non-globally Lipschitz coefficients proposed in [448], which is analogous to Milstein's fundamental theorem for the globally Lipschitz case [353] (see also [354, 358]). More precisely, we assume that the SDE coefficients can grow polynomially at infinity and satisfy a one-sided Lipschitz condition. The theorem is stated in Chapter 5.2. Its corollary on almost sure convergence is also given. In Chapter 5.2 we present a discussion on the applicability of the fundamental theorem, including its application to the drift-implicit Euler scheme of [336], and thus establish its order of convergence.

Here we present a particular balanced method proposed in [448] and prove its convergence with order half in the non-globally Lipschitz setting in Chapter 5.3. Some numerical experiments supporting our results are presented in Chapter 5.4. Similar balanced methods have also been proposed elsewhere, see, e.g., [411, 436].

5.2 Fundamental theorem

Let (Ω, F, P) be a complete probability space and F^W_t be an increasing family of σ-subalgebras of F induced by W(t) for 0 ≤ t ≤ T, where (W(t), F^W_t) = ((W_1(t), ..., W_m(t))^⊤, F^W_t) is an m-dimensional standard Wiener process.

We consider the system of Ito stochastic differential equations (SDEs):

dX = a(t, X)dt + Σ_{r=1}^m σ_r(t, X)dW_r(t), t ∈ (t_0, T], X(t_0) = X_0, (5.2.1)

where X, a, σ_r are d-dimensional column vectors and X_0 is independent of W. We assume that any solution X_{t_0,X_0}(t) of (5.2.1) is regular on [t_0, T], i.e., it is defined for all t_0 ≤ t ≤ T [208].

Let X_{t_0,X_0}(t) = X(t), t_0 ≤ t ≤ T, be a solution of the system (5.2.1). We will assume the following.



Assumption 5.2.1 (i) The initial condition is such that

E|X_0|^{2p} ≤ K < ∞, for all p ≥ 1. (5.2.2)

(ii) For a sufficiently large p_0 ≥ 1 there is a constant c_1 ≥ 0 such that for t ∈ [t_0, T],

(x − y, a(t, x) − a(t, y)) + (2p_0 − 1)/2 Σ_{r=1}^m |σ_r(t, x) − σ_r(t, y)|^2 ≤ c_1 |x − y|^2, x, y ∈ R^d. (5.2.3)

(iii) There exist c_2 ≥ 0 and κ ≥ 1 such that for t ∈ [t_0, T],

|a(t, x) − a(t, y)|^2 ≤ c_2 (1 + |x|^{2κ−2} + |y|^{2κ−2}) |x − y|^2, x, y ∈ R^d. (5.2.4)

The condition (5.2.3) implies that

(x, a(t, x)) + (2p_0 − 1 − ε)/2 Σ_{r=1}^m |σ_r(t, x)|^2 ≤ c_0 + c'_1 |x|^2, t ∈ [t_0, T], x ∈ R^d, (5.2.5)

where c_0 = |a(t, 0)|^2/2 + ((2p_0 − 1 − ε)(2p_0 − 1)/(2ε)) Σ_{r=1}^m |σ_r(t, 0)|^2 and c'_1 = c_1 + 1/2. The inequality (5.2.5) together with (5.2.2) is sufficient to ensure finiteness of moments [208]: there is K > 0 such that

E|X_{t_0,X_0}(t)|^{2p} < K(1 + E|X_0|^{2p}), 1 ≤ p ≤ p_0 − 1, t ∈ [t_0, T]. (5.2.6)

Also, (5.2.4) implies that

|a(t, x)|^2 ≤ c_3 + c'_2 |x|^{2κ}, t ∈ [t_0, T], x ∈ R^d, (5.2.7)

where c_3 = 2|a(t, 0)|^2 + 2c_2(κ − 1)/κ and c'_2 = 2c_2(1 + κ)/κ.

Example 5.2.2 Here is an example for Assumption 5.2.1 (ii):

dX = −μX|X|^{r_1−1} dt + λX^{r_2} dW,

where μ, λ > 0, r_1 ≥ 1, and r_2 ≥ 1. If r_1 + 1 > 2r_2 or r_1 = r_2 = 1, then (5.2.3) is valid for any p_0 ≥ 1. If r_1 + 1 = 2r_2 and r_1 > 1, then (5.2.3) is valid for 1 ≤ p_0 ≤ μ/λ^2 + 1/2.
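A quick Monte Carlo sanity check of the one-sided condition (5.2.3) for this example can be sketched as follows; the parameter values μ = 1, λ = 0.5, r_1 = 3, r_2 = 1 are our own choice (so r_1 + 1 > 2r_2 and any p_0 ≥ 1 should work), as is the admissible constant c_1.

```python
import numpy as np

# Monte Carlo check of (5.2.3) for Example 5.2.2 with a(x) = -mu*x|x|^{r1-1}
# and sigma(x) = lam*x^{r2}; parameters chosen so that r1 + 1 > 2*r2.
mu, lam, r1, r2, p0 = 1.0, 0.5, 3, 1, 2.0

def a(x):
    return -mu * x * np.abs(x) ** (r1 - 1)

def sigma(x):
    return lam * x ** r2

rng = np.random.default_rng(0)
x, y = rng.uniform(-50.0, 50.0, 10_000), rng.uniform(-50.0, 50.0, 10_000)
lhs = ((x - y) * (a(x) - a(y))
       + (2.0 * p0 - 1.0) / 2.0 * (sigma(x) - sigma(y)) ** 2)
c1 = 1.0                      # one admissible constant for these parameters
assert np.all(lhs <= c1 * (x - y) ** 2)
```

For r_1 = 3 the drift term contributes −(x − y)^2 (x^2 + xy + y^2) ≤ 0, so the quadratic diffusion difference is easily absorbed by c_1 |x − y|^2, which is what the check confirms numerically.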

We introduce the one-step approximation X̄_{t,x}(t + h), t_0 ≤ t < t + h ≤ T, for the solution X_{t,x}(t + h) of (5.2.1), which depends on the initial point (t, x), a time step h, and {W_1(θ) − W_1(t), ..., W_m(θ) − W_m(t), t ≤ θ ≤ t + h}, and which is defined as follows:

X̄_{t,x}(t + h) = x + A(t, x, h; W_i(θ) − W_i(t), i = 1, ..., m, t ≤ θ ≤ t + h). (5.2.8)

Using the one-step approximation (5.2.8), we recurrently construct the approximation (X_k, F_{t_k}), k = 0, ..., N, t_{k+1} − t_k = h_{k+1}, t_N = T:



X_0 = X(t_0), X_{k+1} = X̄_{t_k,X_k}(t_{k+1}) = X_k + A(t_k, X_k, h_{k+1}; W_i(θ) − W_i(t_k), i = 1, ..., m, t_k ≤ θ ≤ t_{k+1}). (5.2.9)

The following theorem is a generalization of Milstein's fundamental theorem [353] (see also [354, 358, Chapter 1]) from the globally Lipschitz to the non-globally Lipschitz case. It also has similarities with a strong convergence theorem in [218], proved for the case of non-globally Lipschitz drift, globally Lipschitz diffusion, and Euler-type schemes.

For simplicity, we will consider a uniform time step size, i.e., h_k = h for all k.

Theorem 5.2.3 ([448]) Suppose (i) Assumption 5.2.1 holds;
(ii) The one-step approximation X̄_{t,x}(t + h) from (5.2.8) has the following orders of accuracy: for some p ≥ 1 there are α ≥ 1, h_0 > 0, and K > 0 such that for arbitrary t_0 ≤ t ≤ T − h, x ∈ R^d, and all 0 < h ≤ h_0:

|E[X_{t,x}(t + h) − X̄_{t,x}(t + h)]| ≤ K(1 + |x|^{2α})^{1/2} h^{q_1}, (5.2.10)

[E|X_{t,x}(t + h) − X̄_{t,x}(t + h)|^{2p}]^{1/(2p)} ≤ K(1 + |x|^{2αp})^{1/(2p)} h^{q_2} (5.2.11)

with

q_2 ≥ 1/2, q_1 ≥ q_2 + 1/2; (5.2.12)

(iii) The approximation X_k from (5.2.9) has finite moments, i.e., for some p ≥ 1 there are β ≥ 1, h_0 > 0, and K > 0 such that for all 0 < h ≤ h_0 and all k = 0, ..., N:

E|X_k|^{2p} < K(1 + E|X_0|^{2pβ}). (5.2.13)

Then for any N and k = 0, 1, ..., N the following inequality holds:

[E|X_{t_0,X_0}(t_k) − X̄_{t_0,X_0}(t_k)|^{2p}]^{1/(2p)} ≤ K(1 + E|X_0|^{2γp})^{1/(2p)} h^{q_2−1/2}, (5.2.14)

where K > 0 and γ ≥ 1 do not depend on h and k, i.e., the order of accuracy of the method (5.2.9) is q = q_2 − 1/2.

Corollary 5.2.4 ([448]) In the setting of Theorem 5.2.3, for p ≥ 1/(2q) in (5.2.14), there is 0 < ε < q and an a.s. finite random variable C(ω) > 0 such that

|X_{t_0,X_0}(t_k) − X_k| ≤ C(ω) h^{q−ε},

i.e., the method (5.2.9) for (5.2.1) converges with order q − ε a.s.

The corollary is proved using the Borel-Cantelli lemma in Appendix D (see, e.g., [187, 361]).



Remark 5.2.5 The assumptions and the statement of Theorem 5.2.3 include the famous fundamental theorem of Milstein [353] (see also Theorem 3.2.2), proved under global conditions on the SDE coefficients, when the assumption (5.2.13) is naturally satisfied.

The constant K in (5.2.14) depends on p, t_0, T as well as on the SDE coefficients. The constant γ in (5.2.14) depends on α, β, and κ.

5.2.1 On application of Theorem 5.2.3

Theorem 5.2.3 says that if the moments of X_k are bounded and the scheme was proved to be convergent with order q in the globally Lipschitz case, then the scheme has the same convergence order q in the considered non-globally Lipschitz case.

However, checking the condition (5.2.13) on the moments of a method X_k is often rather difficult. Usually, each scheme for non-globally Lipschitz SDEs requires a special consideration, while for schemes for SDEs with globally Lipschitz coefficients, boundedness of the moments of X_k is a direct implication of the boundedness of the moments of the SDE solution and the one-step properties of the method, see [358, Lemma 1.1.5]. For a number of strong schemes, boundedness of moments in non-globally Lipschitz cases was proved, see, e.g., [218, 226, 233, 235, 437]. In Chapter 5.3 we show boundedness of moments for a balanced method. See also [448] for the fully implicit methods (3.2.27).

Consider the drift-implicit Euler scheme [358, p. 30]:

X_{k+1} = X_k + a(t_{k+1}, X_{k+1}) h + Σ_{r=1}^m σ_r(t_k, X_k) ξ_{rk} √h, (5.2.15)

where ξ_{rk} = (W_r(t_{k+1}) − W_r(t_k))/√h are Gaussian N(0, 1) i.i.d. random variables. Assume that the coefficients a(t, x) and σ_r(t, x) have continuous first-order partial derivatives in t, that the coefficient a(t, x) also has continuous first-order partial derivatives in x_i, and that all these derivatives and the coefficients themselves satisfy inequalities of the form (5.2.4). It is not difficult to show that the one-step approximation corresponding to (5.2.15) satisfies (5.2.10) and (5.2.11) with q_1 = 2 and q_2 = 1, respectively. Its boundedness of moments, in particular under the condition (5.2.5) for time steps h ≤ 1/(2c_1), is proved in [233]. Then, due to Theorem 5.2.3, (5.2.15) converges with mean-square order q = 1/2 (note that for q = 1/2, it is sufficient to have q_1 = 3/2, which can be obtained under lesser smoothness of a).

In the case of additive noise (i.e., σ_r(t, x) = σ_r(t), r = 1, ..., m), q_1 = 2 and q_2 = 3/2, and (5.2.15) converges with mean-square order 1 due to Theorem 5.2.3. We note that convergence of (5.2.15) with order half in the globally Lipschitz case is well known [259, 354, 358]; in the case of non-globally Lipschitz drift and globally Lipschitz diffusion it was proved in [218, 226] (see also related results in [187, 437]); and under Assumption 5.2.1 strong convergence of (5.2.15) without order was proved in [233, 335], with its strong order half established in [336].
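For the cubic-drift example (5.1.1), each step of the drift-implicit scheme (5.2.15) amounts to solving the scalar equation z + h z^3 = X_k + ΔW_k; since z ↦ z + h z^3 is strictly increasing, the root is unique and Newton's method applies. The Python sketch below (function name and parameters are our own illustrative choices) shows that, unlike the explicit scheme of Chapter 5.1, the implicit one remains stable even from the large initial value X_0 = 1/h.

```python
import numpy as np

def implicit_euler_cubic(X0, h, T, seed=0):
    """Drift-implicit Euler (5.2.15) for dX = -X^3 dt + dW.  Each step
    solves z + h z^3 = X_k + dW_k by Newton's method; the left-hand side
    is strictly increasing in z, so the root is unique."""
    rng = np.random.default_rng(seed)
    x = float(X0)
    for _ in range(int(round(T / h))):
        rhs = x + rng.standard_normal() * np.sqrt(h)
        z = x                              # warm start from the last value
        for _ in range(50):
            res = z + h * z ** 3 - rhs
            if abs(res) < 1e-12:
                break
            z -= res / (1.0 + 3.0 * h * z * z)
        x = z
    return x

h = 0.05
xT = implicit_euler_cubic(1.0 / h, h, T=1.0)   # stable from X0 = 1/h
```

The implicit drift acts as a strong restoring force: starting from 20, the iterates contract toward O(1) values within a few steps instead of overflowing.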



Due to the bound (5.2.6) on the moments of the solution X(t), it would be natural to require that β in (5.2.13) be equal to 1. Indeed, (5.2.13) with β = 1 holds for the drift-implicit method (5.2.15) [233] and for the fully implicit methods (see [448, Section 4] or (3.2.27)).

5.2.2 Proof of the fundamental theorem

In this section we shall use the letter K to denote various constants which are independent of h and k. The proof exploits the idea of the proof of this theorem in the globally Lipschitz case [353].

We need the following lemma to prove the fundamental theorem. Lemma 5.2.6 is analogous to Lemma 1.1.3 in [358].

Lemma 5.2.6 Suppose Assumption 5.2.1 holds. For the representation

X_{t,x}(t + θ) − X_{t,y}(t + θ) = x − y + Z_{t,x,y}(t + θ), (5.2.16)

we have for 1 ≤ p ≤ (p_0 − 1)/κ:

E|X_{t,x}(t + h) − X_{t,y}(t + h)|^{2p} ≤ |x − y|^{2p}(1 + Kh), (5.2.17)

E|Z_{t,x,y}(t + h)|^{2p} ≤ K(1 + |x|^{2κ−2} + |y|^{2κ−2})^{p/2} |x − y|^{2p} h^p. (5.2.18)

Proof. Introduce the process S_{t,x,y}(s) = S(s) := X_{t,x}(s) − X_{t,y}(s) and note that Z(s) = S(s) − (x − y). We first prove (5.2.17). Using the Ito formula and the condition (5.2.3) (recall that (5.2.3) implies (5.2.6)), we obtain for θ ≥ 0:

E|S(t + θ)|^{2p} = |x − y|^{2p}
  + 2p ∫_t^{t+θ} E|S|^{2p−2} [S^⊤(a(s, X_{t,x}(s)) − a(s, X_{t,y}(s))) + (1/2) Σ_{r=1}^m |σ_r(s, X_{t,x}(s)) − σ_r(s, X_{t,y}(s))|^2] ds
  + 2p(p − 1) ∫_t^{t+θ} E|S|^{2p−4} Σ_{r=1}^m [S^⊤(σ_r(s, X_{t,x}(s)) − σ_r(s, X_{t,y}(s)))]^2 ds
≤ |x − y|^{2p} + 2p ∫_t^{t+θ} E|S|^{2p−2} [S^⊤(a(s, X_{t,x}(s)) − a(s, X_{t,y}(s))) + (2p − 1)/2 Σ_{r=1}^m |σ_r(s, X_{t,x}(s)) − σ_r(s, X_{t,y}(s))|^2] ds
≤ |x − y|^{2p} + 2pc_1 ∫_t^{t+θ} E|S(s)|^{2p} ds,

from which (5.2.17) follows after applying the Gronwall inequality.



Now we prove (5.2.18). Using the Ito formula and the condition (5.2.3), we obtain for θ ≥ 0:

E|Z(t + θ)|^{2p} = 2p ∫_t^{t+θ} E|Z|^{2p−2} [Z^⊤(a(s, X_{t,x}(s)) − a(s, X_{t,y}(s))) + (1/2) Σ_{r=1}^m |σ_r(s, X_{t,x}(s)) − σ_r(s, X_{t,y}(s))|^2] ds
  + 2p(p − 1) ∫_t^{t+θ} E|Z|^{2p−4} Σ_{r=1}^m [Z^⊤(σ_r(s, X_{t,x}(s)) − σ_r(s, X_{t,y}(s)))]^2 ds
≤ 2p ∫_t^{t+θ} E|Z|^{2p−2} [S^⊤(a(s, X_{t,x}(s)) − a(s, X_{t,y}(s))) + (2p − 1)/2 Σ_{r=1}^m |σ_r(s, X_{t,x}(s)) − σ_r(s, X_{t,y}(s))|^2] ds
  − 2p ∫_t^{t+θ} E|Z|^{2p−2} (x − y, a(s, X_{t,x}(s)) − a(s, X_{t,y}(s))) ds
≤ 2pc_1 ∫_t^{t+θ} E|Z|^{2p−2}|S|^2 ds − 2p ∫_t^{t+θ} E|Z|^{2p−2} (x − y, a(s, X_{t,x}(s)) − a(s, X_{t,y}(s))) ds. (5.2.19)

Using the Young inequality, we get for the first term in the right-hand side of (5.2.19):

2pc_1 ∫_t^{t+θ} E|Z|^{2p−2}|S|^2 ds ≤ 4pc_1 ∫_t^{t+θ} E|Z|^{2p−2}(|Z|^2 + |x − y|^2) ds
  ≤ K ∫_t^{t+θ} E|Z|^{2p} ds + K|x − y|^2 ∫_t^{t+θ} E|Z|^{2p−2} ds.

Consider the second term in the right-hand side of (5.2.19). Using theHolder inequality (twice), (5.2.4), (5.2.17) and (5.2.6), we obtain

\begin{align*}
-2p&\int_t^{t+\theta} E\,|Z|^{2p-2}\big(x-y,\ a(s,X_{t,x}(s))-a(s,X_{t,y}(s))\big)\,ds \tag{5.2.20}\\
&\le 2p\int_t^{t+\theta} E\,|Z|^{2p-2}\,\big|a(s,X_{t,x}(s))-a(s,X_{t,y}(s))\big|\,|x-y|\,ds\\
&\le K|x-y|\int_t^{t+\theta}\big[E|Z|^{2p}\big]^{1-1/p}\big[E\big|a(s,X_{t,x}(s))-a(s,X_{t,y}(s))\big|^{p}\big]^{1/p}\,ds\\
&\le K|x-y|\int_t^{t+\theta}\big[E|Z|^{2p}\big]^{1-1/p}\Big(E\big[\big(1+|X_{t,x}(s)|^{2\kappa-2}+|X_{t,y}(s)|^{2\kappa-2}\big)^{p/2}\,|X_{t,x}(s)-X_{t,y}(s)|^{p}\big]\Big)^{1/p}\,ds\\
&\le K|x-y|\int_t^{t+\theta}\big[E|Z|^{2p}\big]^{1-1/p}\Big(E\big[\big(1+|X_{t,x}(s)|^{2\kappa-2}+|X_{t,y}(s)|^{2\kappa-2}\big)^{p}\big]\Big)^{1/(2p)}\Big(E\big[|X_{t,x}(s)-X_{t,y}(s)|^{2p}\big]\Big)^{1/(2p)}\,ds\\
&\le K|x-y|^2\big(1+|x|^{2\kappa-2}+|y|^{2\kappa-2}\big)^{1/2}\int_t^{t+\theta}\big[E|Z|^{2p}\big]^{1-1/p}\,ds.
\end{align*}

Substituting (5.2.19) and (5.2.20) into the bound for E|Z(t + θ)|^{2p} and applying the Hölder inequality to E|Z|^{2p−2}, we get

\[ E|Z(t+\theta)|^{2p} \le K\int_t^{t+\theta} E|Z|^{2p}\,ds + K|x-y|^2\big(1+|x|^{2\kappa-2}+|y|^{2\kappa-2}\big)^{1/2}\int_t^{t+\theta}\big[E|Z|^{2p}\big]^{1-1/p}\,ds, \]

whence we obtain (5.2.18) for integer p ≥ 1 using the Gronwall inequality, and then, by the Jensen inequality, for non-integer p > 1 as well.

Now we are ready to prove the fundamental theorem. Consider the error of the method $\bar X_{t_0,X_0}(t_{k+1})$ at the (k + 1)-st step:
\begin{align*}
\rho_{k+1} &:= X_{t_0,X_0}(t_{k+1}) - \bar X_{t_0,X_0}(t_{k+1}) = X_{t_k,X(t_k)}(t_{k+1}) - \bar X_{t_k,\bar X_k}(t_{k+1})\\
&= \big(X_{t_k,X(t_k)}(t_{k+1}) - X_{t_k,\bar X_k}(t_{k+1})\big) + \big(X_{t_k,\bar X_k}(t_{k+1}) - \bar X_{t_k,\bar X_k}(t_{k+1})\big). \tag{5.2.21}
\end{align*}

The first difference on the right-hand side of (5.2.21) is the error of the solution arising due to the error in the initial data at time t_k, accumulated over the first k steps, which we can rewrite as
\begin{align*}
S_{t_k,X(t_k),\bar X_k}(t_{k+1}) = S_{k+1} &:= X_{t_k,X(t_k)}(t_{k+1}) - X_{t_k,\bar X_k}(t_{k+1})\\
&= \rho_k + Z_{t_k,X(t_k),\bar X_k}(t_{k+1}) = \rho_k + Z_{k+1},
\end{align*}
where Z is as in (5.2.16). The second difference in (5.2.21) is the one-step error at the (k + 1)-st step, which we denote by r_{k+1}:
\[ r_{k+1} = X_{t_k,\bar X_k}(t_{k+1}) - \bar X_{t_k,\bar X_k}(t_{k+1}). \]

Let p ≥ 1 be an integer. We have
\begin{align*}
E|\rho_{k+1}|^{2p} &= E|S_{k+1}+r_{k+1}|^{2p} = E\big[(S_{k+1},S_{k+1}) + 2(S_{k+1},r_{k+1}) + (r_{k+1},r_{k+1})\big]^{p}\\
&\le E|S_{k+1}|^{2p} + 2p\,E\,|S_{k+1}|^{2p-2}(\rho_k+Z_{k+1},\,r_{k+1}) + K\sum_{l=2}^{2p} E\,|S_{k+1}|^{2p-l}\,|r_{k+1}|^{l}. \tag{5.2.22}
\end{align*}
Due to (5.2.17) of Lemma 5.2.6, the first term on the right-hand side of (5.2.22) is estimated as
\[ E|S_{k+1}|^{2p} \le E|\rho_k|^{2p}(1+Kh). \]


Consider the second term on the right-hand side of (5.2.22):

\begin{align*}
E\,|S_{k+1}|^{2p-2}(\rho_k+Z_{k+1},\,r_{k+1}) &= E\,|\rho_k|^{2p-2}(\rho_k,r_{k+1}) + E\big(|S_{k+1}|^{2p-2}-|\rho_k|^{2p-2}\big)(\rho_k,r_{k+1})\\
&\quad + E\,|S_{k+1}|^{2p-2}(Z_{k+1},r_{k+1}). \tag{5.2.23}
\end{align*}

Due to the F_{t_k}-measurability of ρ_k and the conditional variant of (5.2.10), we get for the first term on the right-hand side of (5.2.23):
\[ E\,|\rho_k|^{2p-2}(\rho_k,r_{k+1}) \le K\,E\,|\rho_k|^{2p-1}\big(1+|\bar X_k|^{2\alpha}\big)^{1/2}h^{q_1}. \tag{5.2.24} \]

Consider the second term on the right-hand side of (5.2.23), and first of all note that it is equal to zero for p = 1. We have for integer p ≥ 2:
\[ E\big(|S_{k+1}|^{2p-2}-|\rho_k|^{2p-2}\big)(\rho_k,r_{k+1}) \le K\,E\,|Z_{k+1}|\,|\rho_k|\,|r_{k+1}|\sum_{l=0}^{2p-3}|S_{k+1}|^{2p-3-l}|\rho_k|^{l}. \]

Further, using the F_{t_k}-measurability of ρ_k, the conditional variants of (5.2.11), (5.2.17), and (5.2.18), and the Cauchy-Schwarz inequality (twice), we get for p ≥ 2:
\[ E\big(|S_{k+1}|^{2p-2}-|\rho_k|^{2p-2}\big)(\rho_k,r_{k+1}) \le K\,E\,|\rho_k|^{2p-1}\big(1+|X(t_k)|^{2\kappa-2}+|\bar X_k|^{2\kappa-2}\big)^{1/4}h^{q_2+1/2}\big(1+|\bar X_k|^{2\alpha}\big)^{1/2}. \tag{5.2.25} \]

Due to the F_{t_k}-measurability of ρ_k, the conditional variants of (5.2.11) and (5.2.18), and the Cauchy-Schwarz inequality (twice), we obtain for the third term on the right-hand side of (5.2.23):
\begin{align*}
E\,|S_{k+1}|^{2p-2}(Z_{k+1},r_{k+1}) &\le E\Big[\big(E(|S_{k+1}|^{4p-4}\,|\,F_{t_k})\big)^{1/2}\big(E(|Z_{k+1}|^{4}\,|\,F_{t_k})\big)^{1/4}\big(E(|r_{k+1}|^{4}\,|\,F_{t_k})\big)^{1/4}\Big] \tag{5.2.26}\\
&\le K\,E\,|\rho_k|^{2p-1}\big(1+|X(t_k)|^{2\kappa-2}+|\bar X_k|^{2\kappa-2}\big)^{1/4}h^{q_2+1/2}\big(1+|\bar X_k|^{4\alpha}\big)^{1/4}.
\end{align*}

Due to the F_{t_k}-measurability of ρ_k, the conditional variants of (5.2.11) and (5.2.17), and the Cauchy-Schwarz inequality, we estimate the third term on the right-hand side of (5.2.22):
\begin{align*}
K\sum_{l=2}^{2p} E\,|S_{k+1}|^{2p-l}|r_{k+1}|^{l} &\le K\sum_{l=2}^{2p} E\Big[\big(E(|S_{k+1}|^{4p-2l}\,|\,F_{t_k})\big)^{1/2}\big(E(|r_{k+1}|^{2l}\,|\,F_{t_k})\big)^{1/2}\Big] \tag{5.2.27}\\
&\le K\sum_{l=2}^{2p} E\big[|\rho_k|^{2p-l}h^{lq_2}\big(1+|\bar X_k|^{2l\alpha}\big)^{1/2}\big].
\end{align*}

Substituting (5.2.23)–(5.2.27) and the estimate of E|S_{k+1}|^{2p} in (5.2.22), and recalling that q_1 ≥ q_2 + 1/2, we obtain
\begin{align*}
E|\rho_{k+1}|^{2p} &\le E|\rho_k|^{2p}(1+Kh) + K\,E\,|\rho_k|^{2p-1}\big(1+|\bar X_k|^{2\alpha}\big)^{1/2}h^{q_2+1/2}\\
&\quad + K\,E\,|\rho_k|^{2p-1}\big(1+|X(t_k)|^{2\kappa-2}+|\bar X_k|^{2\kappa-2}\big)^{1/4}h^{q_2+1/2}\big(1+|\bar X_k|^{2\alpha}\big)^{1/2}\\
&\quad + K\,E\,|\rho_k|^{2p-1}\big(1+|X(t_k)|^{2\kappa-2}+|\bar X_k|^{2\kappa-2}\big)^{1/4}h^{q_2+1/2}\big(1+|\bar X_k|^{4\alpha}\big)^{1/4}\\
&\quad + K\sum_{l=2}^{2p} E\big[|\rho_k|^{2p-l}h^{lq_2}\big(1+|\bar X_k|^{2l\alpha}\big)^{1/2}\big]\\
&\le E|\rho_k|^{2p}(1+Kh) + K\,E\,|\rho_k|^{2p-1}\big(1+|X(t_k)|^{2\kappa-2}+|\bar X_k|^{2\kappa-2}\big)^{1/4}h^{q_2+1/2}\big(1+|\bar X_k|^{2\alpha}\big)^{1/2}\\
&\quad + K\sum_{l=2}^{2p} E\big[|\rho_k|^{2p-l}h^{lq_2}\big(1+|\bar X_k|^{2l\alpha}\big)^{1/2}\big].
\end{align*}

Then, using the Young inequality and the conditions (5.2.6) and (5.2.13), we obtain
\[ E|\rho_{k+1}|^{2p} \le E|\rho_k|^{2p} + Kh\,E|\rho_k|^{2p} + K\big(1+E|X_0|^{\beta p(\kappa-1)+2p\alpha\beta}\big)h^{2p(q_2-1/2)+1}, \]
whence (5.2.14) with integer p ≥ 1 follows from the Gronwall inequality. Then by the Jensen inequality (5.2.14) holds for non-integer p as well. □

5.3 A balanced Euler scheme

In this section we present a particular balanced scheme from the class of balanced methods introduced in [355] (see also [358, Chapter 1.3]) and prove its mean-square convergence with order half using Theorem 5.2.3. In Chapter 5.4, we test this balanced scheme, which is similar to the one in [233], on a model problem and demonstrate that it is more efficient than the tamed scheme (5.4.2) from [233].

The concept of balanced methods was introduced in [355]: Euler-type numerical methods that balance the approximation of the stochastic terms in stiff SDEs. Specifically, balanced Euler schemes can be written as

\[ X_{k+1} = X_k + a(t_k,X_k)h + \sum_{r=1}^m \sigma_r(t_k,X_k)\big(W_r(t_{k+1})-W_r(t_k)\big) + P\big(t_k,t_{k+1},X_k,X_{k+1},h,W_r(t_{k+1})-W_r(t_k)\big), \]
where the term P is not zero unless h = 0. It can be considered as a penalty method or a Lagrange multiplier method for stiff SDEs. In particular, the explicit Euler scheme, for which P ≡ 0, is not a balanced method.

Consider the following scheme for (5.2.1), which is a balanced Euler scheme:
\[ X_{k+1} = X_k + \frac{a(t_k,X_k)h + \sum_{r=1}^m \sigma_r(t_k,X_k)\xi_{rk}\sqrt h}{1 + h|a(t_k,X_k)| + \sqrt h\sum_{r=1}^m |\sigma_r(t_k,X_k)\xi_{rk}|}, \tag{5.3.1} \]

where the ξ_{rk} are i.i.d. N(0,1) Gaussian random variables. This scheme is a balanced method since it can be written as

\begin{align*}
X_{k+1} &= X_k + a(t_k,X_k)h + \sum_{r=1}^m \sigma_r(t_k,X_k)\xi_{rk}\sqrt h\\
&\quad - \frac{a(t_k,X_k)h + \sum_{r=1}^m \sigma_r(t_k,X_k)\xi_{rk}\sqrt h}{1 + h|a(t_k,X_k)| + \sqrt h\sum_{r=1}^m |\sigma_r(t_k,X_k)\xi_{rk}|}\Big(h|a(t_k,X_k)| + \sqrt h\sum_{r=1}^m |\sigma_r(t_k,X_k)\xi_{rk}|\Big).
\end{align*}

Here the “extra term” (the second line of the equation) does not vanish unless h = 0.
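For concreteness, one step of the balanced Euler scheme (5.3.1) can be sketched in code. This is a minimal illustration (not from the book) for a scalar SDE; the drift `a`, the entries of `sigmas`, and the Gaussian samples `xi` are user-supplied:

```python
import numpy as np

def balanced_euler_step(t, x, h, a, sigmas, xi):
    """One step of the balanced Euler scheme (5.3.1) for a scalar SDE.

    a(t, x): drift; sigmas: list of m diffusion callables sigma_r(t, x);
    xi: m i.i.d. N(0,1) samples for the current step.
    """
    drift = a(t, x)
    noise = sum(s(t, x) * z for s, z in zip(sigmas, xi)) * np.sqrt(h)
    # The denominator "balances" large drift/diffusion values; it tends to 1
    # as h -> 0, so the update approaches the explicit Euler increment.
    denom = 1.0 + h * abs(drift) + np.sqrt(h) * sum(
        abs(s(t, x) * z) for s, z in zip(sigmas, xi))
    return x + (drift * h + noise) / denom
```

Note that the magnitude of the increment never exceeds one, since the numerator is bounded in absolute value by the denominator; this is exactly the bound (5.3.4) used in the moment estimates below.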

We will prove two lemmas, which show that the scheme (5.3.1) satisfies the conditions of Theorem 5.2.3. The first lemma is on boundedness of moments and uses a stopping time technique (see also, e.g., [233, 359]).

Lemma 5.3.1 Suppose Assumption 5.2.1 holds with sufficiently large p_0. For all natural N and all k = 0, ..., N the following inequality holds for the moments of the scheme (5.3.1):
\[ E|X_k|^{2p} \le K\big(1+E|X_0|^{2p\beta}\big),\qquad 1 \le p \le \frac{p_0-1}{3\kappa-1}-\frac12, \tag{5.3.2} \]
with some constants β ≥ 1 and K > 0 independent of h and k.

Remark 5.3.2 It is common that β is larger than 1 in Theorem 5.2.3 for tamed-type methods (see [235] and the bibliographic notes at the end of this chapter) or for the balanced Euler method (5.3.1).

Proof. In the proof we shall use the letter K to denote various constants which are independent of h and k.

The following elementary consequence of the inequalities (5.2.5) and (5.2.7) will be used in the proof: there exists a constant K > 0 such that
\[ \sum_{r=1}^m |\sigma_r(t,x)|^2 \le K\big(1+|x|^{\kappa+1}\big). \tag{5.3.3} \]

We observe from (5.3.1) that
\[ |X_{k+1}| \le |X_k| + 1 \le |X_0| + (k+1), \tag{5.3.4} \]
since the numerator in (5.3.1) is bounded in absolute value by the denominator.

Let R > 0 be a sufficiently large number. Introduce the events

ΩR,k := {ω : |Xl| ≤ R, l = 0, . . . , k}, (5.3.5)

and their complements Λ_{R,k}. We first prove the lemma for integer p ≥ 1. We have

\begin{align*}
E\chi_{\Omega_{R,k+1}}(\omega)|X_{k+1}|^{2p} &\le E\chi_{\Omega_{R,k}}(\omega)|X_{k+1}|^{2p} = E\chi_{\Omega_{R,k}}(\omega)\big|(X_{k+1}-X_k)+X_k\big|^{2p} \tag{5.3.6}\\
&\le E\chi_{\Omega_{R,k}}(\omega)|X_k|^{2p} + E\chi_{\Omega_{R,k}}(\omega)|X_k|^{2p-2}\big[2p(X_k,X_{k+1}-X_k) + p(2p-1)|X_{k+1}-X_k|^2\big]\\
&\quad + K\sum_{l=3}^{2p} E\chi_{\Omega_{R,k}}(\omega)|X_k|^{2p-l}|X_{k+1}-X_k|^{l}.
\end{align*}

Consider the second term in the right-hand side of (5.3.6):

\begin{align*}
E\chi_{\Omega_{R,k}}&(\omega)|X_k|^{2p-2}\big[2p(X_k,X_{k+1}-X_k)+p(2p-1)|X_{k+1}-X_k|^2\big] \tag{5.3.7}\\
&= 2p\,E\Bigg(\chi_{\Omega_{R,k}}(\omega)|X_k|^{2p-2}\,E\Bigg[\Big(X_k,\ \frac{a(t_k,X_k)h+\sum_{r=1}^m\sigma_r(t_k,X_k)\xi_{rk}\sqrt h}{1+h|a(t_k,X_k)|+\sqrt h\sum_{r=1}^m|\sigma_r(t_k,X_k)\xi_{rk}|}\Big)\\
&\qquad\qquad +\frac{2p-1}{2}\bigg|\frac{a(t_k,X_k)h+\sum_{r=1}^m\sigma_r(t_k,X_k)\xi_{rk}\sqrt h}{1+h|a(t_k,X_k)|+\sqrt h\sum_{r=1}^m|\sigma_r(t_k,X_k)\xi_{rk}|}\bigg|^2\ \Bigg|\ F_{t_k}\Bigg]\Bigg).
\end{align*}

Since the ξ_{rk} are independent of F_{t_k} and the Gaussian density function is symmetric, we obtain

\[ \chi_{\Omega_{R,k}}E\Bigg[\frac{\sum_{r=1}^m\sigma_r(t_k,X_k)\xi_{rk}\sqrt h}{1+h|a(t_k,X_k)|+\sqrt h\sum_{r=1}^m|\sigma_r(t_k,X_k)\xi_{rk}|}\ \Bigg|\ F_{t_k}\Bigg] = 0. \tag{5.3.8} \]

Similarly, we get for l ≠ r:

\[ \chi_{\Omega_{R,k}}E\Bigg[\frac{\sigma_r(t_k,X_k)\xi_{rk}\sqrt h\ \sigma_l(t_k,X_k)\xi_{lk}\sqrt h}{\big(1+h|a(t_k,X_k)|+\sqrt h\sum_{r=1}^m|\sigma_r(t_k,X_k)\xi_{rk}|\big)^2}\ \Bigg|\ F_{t_k}\Bigg] = 0. \tag{5.3.9} \]

Then the conditional expectation in (5.3.7) becomes

\begin{align*}
A &:= \chi_{\Omega_{R,k}}E\Bigg[\Big(X_k,\ \frac{a(t_k,X_k)h+\sum_{r=1}^m\sigma_r(t_k,X_k)\xi_{rk}\sqrt h}{D_k}\Big)
+\frac{2p-1}{2}\bigg|\frac{a(t_k,X_k)h+\sum_{r=1}^m\sigma_r(t_k,X_k)\xi_{rk}\sqrt h}{D_k}\bigg|^2\ \Bigg|\ F_{t_k}\Bigg] \tag{5.3.10}\\
&= \chi_{\Omega_{R,k}}E\Bigg[\frac{(X_k,\,a(t_k,X_k)h)}{D_k}
+\frac{2p-1}{2}\,\frac{a^2(t_k,X_k)h^2+h\sum_{r=1}^m\big(\sigma_r(t_k,X_k)\xi_{rk}\big)^2}{D_k^2}\ \Bigg|\ F_{t_k}\Bigg]\\
&\le \chi_{\Omega_{R,k}}E\Bigg[\frac{(X_k,\,a(t_k,X_k)h)}{D_k}
+\frac{2p-1}{2}\,\frac{h\sum_{r=1}^m|\sigma_r(t_k,X_k)|^2\xi_{rk}^2}{D_k}\ \Bigg|\ F_{t_k}\Bigg]
+\frac{2p-1}{2}\chi_{\Omega_{R,k}}a^2(t_k,X_k)h^2\\
&= \chi_{\Omega_{R,k}}E\Bigg[\frac{(X_k,\,a(t_k,X_k)h)+\frac{2p-1}{2}h\sum_{r=1}^m|\sigma_r(t_k,X_k)|^2}{D_k}
+\frac{2p-1}{2}\,\frac{h\sum_{r=1}^m|\sigma_r(t_k,X_k)|^2(\xi_{rk}^2-1)}{D_k}\ \Bigg|\ F_{t_k}\Bigg]
+\frac{2p-1}{2}\chi_{\Omega_{R,k}}a^2(t_k,X_k)h^2,
\end{align*}
where D_k := 1 + h|a(t_k,X_k)| + √h Σ_{r=1}^m |σ_r(t_k,X_k)ξ_{rk}| is the denominator in (5.3.1); the cross terms vanished by (5.3.8), (5.3.9), and the symmetry of the Gaussian density, and the inequality uses D_k² ≥ D_k ≥ 1.

Using (5.2.5) and (5.2.7), we obtain

\begin{align*}
A &\le c_0h + c_1'|X_k|^2h\,\chi_{\Omega_{R,k}} \tag{5.3.11}\\
&\quad + \frac{2p-1}{2}h\,\chi_{\Omega_{R,k}}\sum_{r=1}^m|\sigma_r(t_k,X_k)|^2\,
E\Bigg[\frac{\xi_{rk}^2-1}{1+h|a(t_k,X_k)|+\sqrt h\sum_{l=1}^m|\sigma_l(t_k,X_k)\xi_{lk}|}\ \Bigg|\ F_{t_k}\Bigg]\\
&\quad + Kh^2 + K\chi_{\Omega_{R,k}}|X_k|^{2\kappa}h^2.
\end{align*}

Since E(ξ_{rk}² − 1) = 0, the moments of ξ_{rk} are bounded, and the ξ_{rk} are independent of F_{t_k}, we obtain for the expectation in the second term of (5.3.11):

\begin{align*}
\chi_{\Omega_{R,k}}E\Bigg[\frac{\xi_{rk}^2-1}{1+h|a|+\sqrt h\sum_{l=1}^m|\sigma_l\xi_{lk}|}\ \Bigg|\ F_{t_k}\Bigg]
&= \chi_{\Omega_{R,k}}E\Bigg[\frac{\xi_{rk}^2-1}{1+h|a|+\sqrt h\sum_{l=1}^m|\sigma_l\xi_{lk}|} - (\xi_{rk}^2-1)\ \Bigg|\ F_{t_k}\Bigg] \tag{5.3.12}\\
&= -\chi_{\Omega_{R,k}}E\Bigg[(\xi_{rk}^2-1)\,\frac{h|a|+\sqrt h\sum_{l=1}^m|\sigma_l\xi_{lk}|}{1+h|a|+\sqrt h\sum_{l=1}^m|\sigma_l\xi_{lk}|}\ \Bigg|\ F_{t_k}\Bigg]\\
&\le \chi_{\Omega_{R,k}}E\Bigg[|\xi_{rk}^2-1|\Big(h|a|+\sqrt h\sum_{l=1}^m|\sigma_l||\xi_{lk}|\Big)\ \Bigg|\ F_{t_k}\Bigg]\\
&\le \chi_{\Omega_{R,k}}K\Big(h|a(t_k,X_k)|+\sqrt h\sum_{l=1}^m|\sigma_l(t_k,X_k)|\Big),
\end{align*}
where we abbreviate a = a(t_k, X_k) and σ_l = σ_l(t_k, X_k).


Using (5.2.7) and (5.3.3), we get from (5.3.11)-(5.3.12):

\begin{align*}
A &\le c_0h + c_1'\chi_{\Omega_{R,k}}|X_k|^2h + Kh\,\chi_{\Omega_{R,k}}\sum_{r=1}^m|\sigma_r(t_k,X_k)|^2\Big[h|a(t_k,X_k)| + \sqrt h\sum_{r=1}^m|\sigma_r(t_k,X_k)|\Big]\\
&\quad + Kh^2 + K\chi_{\Omega_{R,k}}|X_k|^{2\kappa}h^2 \tag{5.3.13}\\
&\le \chi_{\Omega_{R,k}}Kh\big(1 + |X_k|^2 + |X_k|^{2\kappa+1}h + |X_k|^{3(1+\kappa)/2}h^{1/2}\big).
\end{align*}

Now consider the last term in (5.3.6):

\begin{align*}
E\chi_{\Omega_{R,k}}(\omega)|X_k|^{2p-l}|X_{k+1}-X_k|^{l} &\le K\,E\chi_{\Omega_{R,k}}(\omega)|X_k|^{2p-l}\Big[h^{l}|a(t_k,X_k)|^{l} + h^{l/2}\sum_{r=1}^m|\sigma_r(t_k,X_k)|^{l}|\xi_{rk}|^{l}\Big] \tag{5.3.14}\\
&\le K\,E\chi_{\Omega_{R,k}}(\omega)|X_k|^{2p-l}h^{l/2}\Big[1 + h^{l/2}|X_k|^{l\kappa} + |X_k|^{l\frac{\kappa+1}{2}}\Big],\qquad l \ge 3,
\end{align*}

where we used (5.2.7) and (5.3.3) again, as well as the fact that χ_{Ω_{R,k}}(ω) and X_k are F_{t_k}-measurable while the ξ_{rk} are independent of F_{t_k}. Combining (5.3.6), (5.3.7), (5.3.10), (5.3.13), and (5.3.14), we obtain

\begin{align*}
E\chi_{\Omega_{R,k+1}}(\omega)|X_{k+1}|^{2p} &\le E\chi_{\Omega_{R,k}}(\omega)|X_k|^{2p} + Kh\,E\chi_{\Omega_{R,k}}(\omega)|X_k|^{2p-2}\big[1+|X_k|^2+|X_k|^{2\kappa+1}h+|X_k|^{3(1+\kappa)/2}h^{1/2}\big] \tag{5.3.15}\\
&\quad + K\sum_{l=3}^{2p}E\chi_{\Omega_{R,k}}(\omega)|X_k|^{2p-l}h^{l/2}\big[1+h^{l/2}|X_k|^{l\kappa}+|X_k|^{l(\kappa+1)/2}\big]\\
&\le E\chi_{\Omega_{R,k}}(\omega)|X_k|^{2p} + Kh\,E\chi_{\Omega_{R,k}}(\omega)|X_k|^{2p} + K\sum_{l=2}^{2p}E\chi_{\Omega_{R,k}}(\omega)|X_k|^{2p-l}h^{l/2}\\
&\quad + Kh\,E\chi_{\Omega_{R,k}}(\omega)|X_k|^{2p-2}\big[|X_k|^{2\kappa+1}h+|X_k|^{3(1+\kappa)/2}h^{1/2}\big]\\
&\quad + K\sum_{l=3}^{2p}E\chi_{\Omega_{R,k}}(\omega)|X_k|^{2p-l}h^{l/2}\big[h^{l/2}|X_k|^{l\kappa}+|X_k|^{l(\kappa+1)/2}\big].
\end{align*}

Choosing
\[ R = R(h) = h^{-1/(3\kappa-1)},\qquad \kappa \ge 1, \tag{5.3.16} \]

we get, for l = 3, . . . , 2p,

\begin{align*}
\chi_{\Omega_{R(h),k}}(\omega)|X_k|^{2p-2}\big[|X_k|^{2\kappa+1}h+|X_k|^{3(1+\kappa)/2}h^{1/2}\big] &\le 2\chi_{\Omega_{R(h),k}}|X_k|^{2p},\\
\chi_{\Omega_{R(h),k}}(\omega)|X_k|^{2p-l}h^{l/2}\big[h^{l/2}|X_k|^{l\kappa}+|X_k|^{l(\kappa+1)/2}\big] &\le 2h\,\chi_{\Omega_{R(h),k}}|X_k|^{2p},
\end{align*}


and hence we rewrite (5.3.15) as

\begin{align*}
E\chi_{\Omega_{R(h),k+1}}(\omega)|X_{k+1}|^{2p} &\le E\chi_{\Omega_{R(h),k}}(\omega)|X_k|^{2p} + Kh\,E\chi_{\Omega_{R(h),k}}(\omega)|X_k|^{2p} + K\sum_{l=1}^{p}E\chi_{\Omega_{R(h),k}}(\omega)|X_k|^{2(p-l)}h^{l} \tag{5.3.17}\\
&\le E\chi_{\Omega_{R(h),k}}(\omega)|X_k|^{2p} + Kh\,E\chi_{\Omega_{R(h),k}}(\omega)|X_k|^{2p} + Kh,
\end{align*}
where in the last line we have used the Young inequality. From here, we get by the Gronwall inequality that

\[ E\chi_{\Omega_{R(h),k}}(\omega)|X_k|^{2p} \le K\big(1+E|X_0|^{2p}\big), \tag{5.3.18} \]
where R(h) is from (5.3.16) and K does not depend on k and h, but depends on p.

It remains to estimate EχΛR(h),k(ω)|Xk|2p. We have

\[ \chi_{\Lambda_{R,k}} = 1-\chi_{\Omega_{R,k}} = 1-\chi_{\Omega_{R,k-1}}\chi_{|X_k|\le R} = \chi_{\Lambda_{R,k-1}} + \chi_{\Omega_{R,k-1}}\chi_{|X_k|>R} = \cdots = \sum_{l=0}^{k}\chi_{\Omega_{R,l-1}}\chi_{|X_l|>R}, \]

where we put χ_{Ω_{R,−1}} = 1. Then, using (5.3.4), (5.3.18), (5.2.2), and the Cauchy-Schwarz and Markov inequalities, we obtain

\begin{align*}
E\chi_{\Lambda_{R(h),k}}(\omega)|X_k|^{2p} &= E\sum_{l=0}^{k}|X_k|^{2p}\chi_{\Omega_{R(h),l-1}}\chi_{|X_l|>R(h)}\\
&\le \big(E(|X_0|+k)^{4p}\big)^{1/2}\sum_{l=0}^{k}\big(E\big[\chi_{\Omega_{R(h),l-1}}\chi_{|X_l|>R(h)}\big]\big)^{1/2}\\
&= \big(E(|X_0|+k)^{4p}\big)^{1/2}\sum_{l=0}^{k}\big(P(\chi_{\Omega_{R(h),l-1}}|X_l|>R(h))\big)^{1/2}\\
&\le \big(E(|X_0|+k)^{4p}\big)^{1/2}\sum_{l=0}^{k}\Bigg(\frac{E\big(\chi_{\Omega_{R(h),l-1}}|X_l|^{2(2p+1)(3\kappa-1)}\big)}{R(h)^{2(2p+1)(3\kappa-1)}}\Bigg)^{1/2}\\
&\le K\big(E(|X_0|+k)^{4p}\big)^{1/2}\big(E(1+|X_0|^{2(2p+1)(3\kappa-1)})\big)^{1/2}\,k\,h^{2p+1}\\
&\le K\big(1+E|X_0|^{2(2p+1)(3\kappa-1)}\big)^{1/2},
\end{align*}

which together with (5.3.18) implies (5.3.2) for integer p ≥ 1. Then, by the Jensen inequality, (5.3.2) holds for non-integer p as well. □

The next lemma gives estimates for the one-step error of the balanced Euler scheme (5.3.1).

Lemma 5.3.3 Assume that (5.2.6) holds. Assume that the coefficients a(t, x) and σ_r(t, x) have continuous first-order partial derivatives in t and that these derivatives and the coefficients satisfy inequalities of the form (5.2.4). Then the scheme (5.3.1) satisfies the inequalities (5.2.10) and (5.2.11) with q_1 = 3/2 and q_2 = 1, respectively.


As in the globally Lipschitz case [355, 358], the proof of Lemma 5.3.3 is a routine one-step error analysis.

Proof. We recall an auxiliary result from [448]: for ϕ(t, x) having a continuous first-order partial derivative in t such that the derivative and the function satisfy inequalities of the form (5.2.4), and for α ≥ 1 and s ≥ t, we have

\[ E\,|\varphi(s,X_{t,x}(s))-\varphi(t,x)|^{\alpha} \le K\big(1+|x|^{2\alpha\kappa-\alpha}\big)\big[(s-t)^{\alpha/2}+(s-t)^{\alpha}\big], \tag{5.3.19} \]

which, in particular, holds for the functions a(t, x) and σ_r(t, x) under the conditions of the lemma.

Now consider the one-step approximation of the SDE (5.2.1) corresponding to the balanced Euler method (5.3.1):

\[ \bar X = x + \frac{a(t,x)h + \sum_{r=1}^m \sigma_r(t,x)\xi_r\sqrt h}{1 + h|a(t,x)| + \sqrt h\sum_{r=1}^m|\sigma_r(t,x)\xi_r|} \tag{5.3.20} \]

and the one-step approximation corresponding to the explicit Euler scheme:

\[ \tilde X = x + a(t,x)h + \sum_{r=1}^m \sigma_r(t,x)\xi_r\sqrt h. \tag{5.3.21} \]

We start with the analysis of the one-step error of the Euler scheme (5.3.21):
\[ \rho(t,x) := X_{t,x}(t+h) - \tilde X. \]

Using (5.3.19), we obtain²

\[ |E\rho(t,x)| = \bigg|E\int_t^{t+h}\big(a(s,X_{t,x}(s))-a(t,x)\big)\,ds\bigg| \le E\int_t^{t+h}\big|a(s,X_{t,x}(s))-a(t,x)\big|\,ds \le Kh^{3/2}\big(1+|x|^{2\kappa-1}\big). \tag{5.3.22} \]

Further,
\[ E\rho^{2p}(t,x) \le K\,E\bigg|\int_t^{t+h}\big(a(s,X_{t,x}(s))-a(t,x)\big)\,ds\bigg|^{2p} + K\sum_{r=1}^{m}E\bigg|\int_t^{t+h}\big(\sigma_r(s,X_{t,x}(s))-\sigma_r(t,x)\big)\,dW_r(s)\bigg|^{2p}. \tag{5.3.23} \]

² Assuming additional smoothness of a(t, x), we can get an estimate for Eρ(t, x) of order h², but this will not improve the result of this lemma for the balanced Euler scheme (5.3.1).


Applying (5.3.19), we have

\[ E\bigg|\int_t^{t+h}\big(a(s,X_{t,x}(s))-a(t,x)\big)\,ds\bigg|^{2p} \le Kh^{3p}\big(1+|x|^{4p\kappa-2p}\big) \tag{5.3.24} \]

and
\[ E\bigg|\int_t^{t+h}\big(\sigma_r(s,X_{t,x}(s))-\sigma_r(t,x)\big)\,dW_r(s)\bigg|^{2p} \le Kh^{p-1}\int_t^{t+h}E\big|\sigma_r(s,X_{t,x}(s))-\sigma_r(t,x)\big|^{2p}\,ds \le Kh^{2p}\big(1+|x|^{4p\kappa-2p}\big). \tag{5.3.25} \]

It follows from (5.3.23)–(5.3.25) that
\[ E\rho^{2p}(t,x) \le Kh^{2p}\big(1+|x|^{4p\kappa-2p}\big). \tag{5.3.26} \]

Now we compare the one-step approximation (5.3.20) of the balanced Euler scheme with the one-step approximation (5.3.21) of the Euler scheme. Writing
\[ \bar X = x + \frac{a(t,x)h+\sum_{r=1}^m\sigma_r(t,x)\xi_r\sqrt h}{1+h|a(t,x)|+\sqrt h\sum_{r=1}^m|\sigma_r(t,x)\xi_r|},\qquad \tilde X - \bar X = \bar\rho(t,x), \tag{5.3.27} \]
we have
\[ \bar\rho(t,x) = \Big(a(t,x)h+\sum_{r=1}^m\sigma_r(t,x)\xi_r\sqrt h\Big)\,\frac{h|a(t,x)|+\sqrt h\sum_{r=1}^m|\sigma_r(t,x)\xi_r|}{1+h|a(t,x)|+\sqrt h\sum_{r=1}^m|\sigma_r(t,x)\xi_r|}. \]

Using the equality (5.3.8) and the assumptions made on the coefficients (see (5.2.4)), we obtain
\[ |E\bar\rho(t,x)| = \bigg|a(t,x)h\,E\,\frac{h|a(t,x)|+\sqrt h\sum_{r=1}^m|\sigma_r(t,x)\xi_r|}{1+h|a(t,x)|+\sqrt h\sum_{r=1}^m|\sigma_r(t,x)\xi_r|}\bigg| \le Kh^{3/2}\big(1+|x|^{3\kappa/2}\big), \]
which together with (5.3.27) and (5.3.22) implies that (5.3.20) satisfies (5.2.10) with q_1 = 3/2. Further,

\[ E\bar\rho^{2p}(t,x) \le h^{2p}\,E\Big[\sqrt h|a(t,x)| + \sum_{r=1}^m|\sigma_r(t,x)\xi_r|\Big]^{4p} \le Kh^{2p}\big(1+|x|^{4p\kappa}\big), \]
which together with (5.3.27) and (5.3.26) implies that (5.3.20) satisfies (5.2.11) with q_2 = 1. □

Lemmas 5.3.1 and 5.3.3 and Theorem 5.2.3 imply the following result.

Proposition 5.3.4 Under the assumptions of Lemmas 5.3.1 and 5.3.3, the balanced Euler scheme (5.3.1) is of mean-square order one half, i.e., the inequality (5.2.14) holds for it with q = q_2 − 1/2 = 1/2.


Remark 5.3.5 In the additive noise case, the mean-square order of the balanced Euler scheme (5.3.1) does not improve, since q_1 and q_2 remain 3/2 and 1, respectively.

Remark 5.3.6 One can consider the following scheme for (5.2.1) instead of (5.3.1):

\[ X_{k+1} = X_k + \frac{a(t_k,X_k)h+\sum_{r=1}^m\sigma_r(t_k,X_k)\xi_{rk}\sqrt h}{1+h|a(t_k,X_k)|+\sqrt h\sum_{r=1}^m|\sigma_r(t_k,X_k)|}. \tag{5.3.28} \]

This scheme is still a balanced scheme, but it imposes less restrictive conditions on the p-th order moments when p_0 is finite. The proof is left as an exercise at the end of this chapter.

5.4 Numerical examples

5.4.1 Some numerical schemes

In this section we list some numerical schemes for nonlinear stochastic differential equations. In the following schemes, ξ_{rk} = (W_r(t_{k+1}) − W_r(t_k))/√h are i.i.d. N(0,1) (Gaussian) random variables.

Explicit schemes

• the drift-tamed Euler scheme (a modified balanced method) [235]:
\[ X_{k+1} = X_k + \frac{ha(X_k)}{1+h|a(X_k)|} + \sum_{r=1}^m\sigma_r(t_k,X_k)\xi_{rk}\sqrt h. \tag{5.4.1} \]

• the fully tamed scheme [233]:
\[ X_{k+1} = X_k + \frac{a(X_k)h+\sum_{r=1}^m\sigma_r(t_k,X_k)\xi_{rk}\sqrt h}{\max\big(1,\ h\big|ha(X_k)+\sum_{r=1}^m\sigma_r(t_k,X_k)\xi_{rk}\sqrt h\big|\big)}. \tag{5.4.2} \]

• the balanced Euler method (5.3.1):
\[ X_{k+1} = X_k + \frac{a(t_k,X_k)h+\sum_{r=1}^m\sigma_r(t_k,X_k)\xi_{rk}\sqrt h}{1+h|a(t_k,X_k)|+\sqrt h\sum_{r=1}^m|\sigma_r(t_k,X_k)\xi_{rk}|}. \tag{5.4.3} \]
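As an illustration only (not the book's code), the one-step maps of the two tamed schemes above can be sketched for a scalar SDE; `a` and the entries of `sigmas` are user-supplied callables:

```python
import numpy as np

def drift_tamed_step(t, x, h, a, sigmas, xi):
    """One step of the drift-tamed Euler scheme (5.4.1): only the drift is tamed."""
    tamed_drift = h * a(x) / (1.0 + h * abs(a(x)))
    noise = sum(s(t, x) * z for s, z in zip(sigmas, xi)) * np.sqrt(h)
    return x + tamed_drift + noise

def fully_tamed_step(t, x, h, a, sigmas, xi):
    """One step of the fully tamed scheme (5.4.2): the whole increment is tamed."""
    noise = sum(s(t, x) * z for s, z in zip(sigmas, xi)) * np.sqrt(h)
    # Denominator as printed in (5.4.2): max(1, h*|h*a(X_k) + noise|).
    return x + (a(x) * h + noise) / max(1.0, h * abs(h * a(x) + noise))
```

Both maps are explicit, so their cost per step is comparable to that of the explicit Euler scheme.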

Drift-implicit schemes

• the drift-implicit Euler scheme (5.2.15):
\[ X_{k+1} = X_k + a(t_{k+1},X_{k+1})h + \sum_{r=1}^m\sigma_r(t_k,X_k)\xi_{rk}\sqrt h. \tag{5.4.4} \]


• the trapezoidal scheme [358, p. 30]:
\[ X_{k+1} = X_k + \frac h2\big[a(X_{k+1})+a(X_k)\big] + \sum_{r=1}^m\sigma_r(t_k,X_k)\xi_{rk}\sqrt h. \tag{5.4.5} \]
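Each step of the drift-implicit schemes (5.4.4) and (5.4.5) requires solving a nonlinear algebraic equation; as noted in Chapter 5.4.2, Newton's method can be used for this. A minimal scalar sketch of (5.4.4), an illustration under the assumption that a user-supplied derivative `da` of the drift is available:

```python
import numpy as np

def drift_implicit_euler_step(t, x, h, a, da, sigmas, xi, tol=1e-12, maxit=50):
    """One step of the drift-implicit Euler scheme (5.4.4) for a scalar SDE.

    Solves y = x + a(t+h, y)*h + noise by Newton's method;
    da(t, y) is the derivative of the drift a with respect to y.
    """
    noise = sum(s(t, x) * z for s, z in zip(sigmas, xi)) * np.sqrt(h)
    y = x + a(t, x) * h + noise  # explicit Euler predictor as initial guess
    for _ in range(maxit):
        f = y - x - a(t + h, y) * h - noise   # residual of the implicit equation
        fp = 1.0 - da(t + h, y) * h           # its derivative in y
        step = f / fp
        y -= step
        if abs(step) < tol:
            break
    return y
```

For the trapezoidal scheme (5.4.5) only the residual and its derivative change; the Newton loop is identical.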

Fully implicit schemes

• the fully implicit Euler scheme ((3.2.27) with λ = 1):
\begin{align*}
X_{k+1} &= X_k + a(t_{k+1},X_{k+1})h - \sum_{r=1}^m\sum_{j=1}^d\frac{\partial\sigma_r}{\partial x^j}(t_{k+1},X_{k+1})\,\sigma_r^{j}(t_{k+1},X_{k+1})\,h\\
&\quad + \sum_{r=1}^m\sigma_r(t_{k+1},X_{k+1})\,(\zeta_{rh})_k\sqrt h. \tag{5.4.6}
\end{align*}

• the midpoint method ((3.2.27) with λ = 1/2):
\begin{align*}
X_{k+1} &= X_k + a\Big(t_{k+\frac12},\frac{X_k+X_{k+1}}{2}\Big)h + \sum_{r=1}^m\sigma_r\Big(t_{k+\frac12},\frac{X_k+X_{k+1}}{2}\Big)(\zeta_{rh})_k\sqrt h\\
&\quad - \frac12\sum_{r=1}^m\sum_{j=1}^d\frac{\partial\sigma_r}{\partial x^j}\Big(t_{k+\frac12},\frac{X_k+X_{k+1}}{2}\Big)\,\sigma_r^{j}\Big(t_{k+\frac12},\frac{X_k+X_{k+1}}{2}\Big)h, \tag{5.4.7}
\end{align*}

where (ζ_{rh})_k are i.i.d. random variables defined in (3.2.28), which are truncations of ξ_{rk} ∼ N(0,1) with A_h = √(4|ln h|).

Convergence order of these schemes. The drift-tamed Euler scheme (5.4.1) converges with strong order one half under Assumption 5.2.1 together with Lipschitz diffusion coefficients [235]. The fully tamed Euler scheme (5.4.2) is proved to converge strongly, but without a rate, under Assumption 5.2.1; see [233]. A half-order strong convergence of (5.4.3) is proved in Chapter 5.3.

The half-order convergence of the drift-implicit scheme (5.4.4) is proved in [336] and in Chapter 5.2.1. The trapezoidal scheme (5.4.5) can be shown to be of mean-square convergence order one half using Theorem 5.2.3; it only remains to show boundedness of higher moments of the scheme, which is similar to the proof of bounded moments in [335].

When 1/2 < λ ≤ 1, the fully implicit scheme (3.2.27) is expected to converge with order one half. When the diffusion coefficients and ∂σ_r/∂x^j σ_r^j are Lipschitz continuous, the half-order convergence is proved in [448]. For λ = 1/2, the midpoint method is proved in [448] to converge with mean-square order one half when the diffusion coefficients are uniformly bounded. Moreover, the midpoint method is of mean-square order one for SDEs with commutative noises (see (3.2.7)).


5.4.2 Numerical results

In all the experiments with fully implicit schemes, where the truncated random variables ζ are used, we took l = 2 in (3.2.29). The experiments were performed using Matlab R2012a on a Macintosh desktop computer with an Intel Xeon CPU E5462 (quad-core, 2.80 GHz). In the simulations we used the Mersenne Twister random generator with seed 100. Newton's method was used to solve the nonlinear algebraic equations at each step of the implicit schemes.

We test the methods on two model problems. The first one has non-globally Lipschitz drift and globally Lipschitz diffusion with two noncommutative noises. The second example satisfies Assumption 5.2.1 (both drift and diffusion are non-globally Lipschitz). The aim of the tests is to compare the performance of the methods: their accuracy and computational costs.

Remark 5.4.1 Experiments cannot prove or disprove boundedness of moments of the schemes, since experiments rely on a finite sample of trajectories run over a finite time interval, while blow-up of moments in divergent methods (e.g., the explicit Euler scheme) is, in general, a result of large deviations [341, 359].

To compute the mean-square error, we run M independent trajectories X^{(i)}(t), X^{(i)}_k:
\[ \big(E[(X(T)-X_N)^2]\big)^{1/2} \doteq \Bigg(\frac1M\sum_{i=1}^{M}\big[X^{(i)}(T)-X^{(i)}_N\big]^2\Bigg)^{1/2}. \tag{5.4.8} \]
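A sketch of the estimator (5.4.8), assuming the two input arrays hold terminal values X^{(i)}(T) and X^{(i)}_N computed from the same Brownian paths:

```python
import numpy as np

def mean_square_error(ref_terminal, num_terminal):
    """Monte Carlo estimate (5.4.8) of the mean-square error at time T.

    ref_terminal: reference terminal values X^(i)(T); num_terminal: terminal
    values of the scheme, driven by the same Brownian increments.
    """
    ref = np.asarray(ref_terminal, dtype=float)
    num = np.asarray(num_terminal, dtype=float)
    return np.sqrt(np.mean((ref - num) ** 2))
```

Trajectory-wise coupling (the same increments for reference and scheme) is essential here: it is what makes (5.4.8) an estimate of the strong, rather than weak, error.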

We took time T = 50 and M = 10^4. The reference solution was computed by the midpoint method with the small time step h = 10^{-4}. It was verified that using a different implicit scheme for simulating a reference solution does not affect the outcome of the tests. We chose the midpoint scheme as a reference since in all the experiments it produced the most accurate results. The number of trajectories M = 10^4 was sufficiently large for the statistical errors (the Monte Carlo error with 95% confidence) not to significantly affect the mean-square errors.

Example 5.4.2 Consider the following Stratonovich SDE:

\[ dX = (1-X^5)\,dt + X\circ dW_1 + dW_2,\qquad X(0) = 0. \tag{5.4.9} \]

In Itô's sense, the drift of the equation becomes a(t, x) = 1 − x^5 + x/2. Here we tested the balanced Euler method (5.4.3), the drift-tamed scheme (5.4.1), the fully implicit Euler scheme (5.4.6), and the midpoint method (5.4.7).

Mean-square errors of these schemes are plotted in Figure 5.1. The observed rates of convergence of all the tested methods are close to the predicted 1/2. For a fixed time step h, the midpoint method is the most accurate scheme, while the balanced Euler method (5.4.3) is the least accurate.
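As an illustrative sketch (not the book's Matlab code), the Itô form of (5.4.9) can be integrated with the balanced Euler scheme (5.4.3) as follows; the time horizon, step size, and seed are placeholders:

```python
import numpy as np

def simulate_example_542(T=50.0, h=0.01, seed=100):
    """Integrate dX = (1 - X^5 + X/2) dt + X dW_1 + dW_2, X(0) = 0,
    i.e., the Ito form of (5.4.9), with the balanced Euler scheme (5.4.3)."""
    rng = np.random.default_rng(seed)
    x = 0.0
    for _ in range(int(round(T / h))):
        xi1, xi2 = rng.standard_normal(2)
        a = 1.0 - x**5 + 0.5 * x
        num = a * h + (x * xi1 + xi2) * np.sqrt(h)
        den = 1.0 + h * abs(a) + np.sqrt(h) * (abs(x * xi1) + abs(xi2))
        x += num / den
    return x

terminal_value = simulate_example_542(T=1.0, h=0.01)
```

Because the increment is bounded by one per step (cf. (5.3.4)), the iteration cannot blow up, in contrast to the explicit Euler scheme applied to this SDE.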


Fig. 5.1. Mean-square errors of the selected schemes for Example 5.4.2 (horizontal axis: time step h, from 0 to 0.1; vertical axis: mean-square errors; curves: fully implicit Euler, midpoint, drift-tamed Euler, balanced Euler).

For large time step sizes, the convergence rates of the balanced Euler scheme (5.4.3) are a bit off the predicted order 1/2. However, when the time step sizes become smaller, the convergence rates are much closer to 1/2; see the mean-square errors for the balanced Euler scheme (5.4.3) in Table 5.1.

Table 5.1. Mean-square errors of the balanced Euler scheme (5.4.3) for Example 5.4.2.

h      0.1       0.05      0.02      0.01      0.005     0.002     0.001
Errors 3.59e-01  3.02e-01  2.30e-01  1.78e-01  1.35e-01  9.27e-02  6.86e-02
Rate   –         0.25      0.30      0.37      0.39      0.41      0.44

To produce a result with accuracy ∼0.06–0.07 in our experiment with M = 10^4 trajectories, the drift-tamed Euler scheme (5.4.1) costs the least time and the balanced Euler scheme (5.4.3) costs the most. These numerical results confirm the conclusion of [235] that the scheme (5.4.1) from [235] is highly competitive (Table 5.2).³

³ However, (5.4.1) is not applicable when the diffusion grows faster than a linear function.


Table 5.2. Comparison of mean-square errors at the magnitude of 0.06–0.07 of different schemes for Example 5.4.2 at T = 50.

Methods                                   h      Errors    Time (sec.)
The tamed Euler (5.4.1)                   0.01   6.10e-02  170
The midpoint (5.4.7)                      0.02   5.26e-02  329
The fully implicit Euler scheme (5.4.6)   0.01   5.48e-02  723
The balanced Euler (5.4.3)                0.001  6.86e-02  1870

Example 5.4.3 Consider the SDE in the Stratonovich sense:

\[ dX = (1-X^5)\,dt + X^2\circ dW,\qquad X(0) = 0. \tag{5.4.10} \]

In Itô's sense, the drift of the equation becomes a(t, x) = 1 − x^5 + x^3. Here we tested the balanced Euler method (5.4.3), the fully tamed Euler scheme (5.4.2), the drift-implicit scheme (5.4.4), the fully implicit Euler scheme (5.4.6), the midpoint method (5.4.7), and the trapezoidal scheme (5.4.5).

It can be proved directly that the implicit algebraic equations arising from the application of the midpoint and fully implicit Euler schemes to (5.4.10) have unique solutions for sufficiently small time steps.

The fully tamed scheme (5.4.2) did not produce accurate results for time step sizes down to h = 0.005, and thus no errors are presented for it here. A half-order convergence of this scheme is not expected; see [448, Remark 5.2] for a detailed explanation.

Table 5.3 gives the mean-square errors and the experimentally observed convergence rates for the corresponding methods. In addition to the data in the table, we evaluated errors of (5.3.1) for smaller time steps: for h = 0.002 the error is 3.70e-02 (rate 0.41), for h = 0.001 it is 2.73e-02 (0.44), and for h = 0.0005 it is 2.00e-02 (0.45); i.e., for smaller h the observed convergence rate of (5.3.1) becomes closer to the theoretically predicted order 1/2. Since (5.4.10) has a single noise, the midpoint method demonstrates first-order convergence. The other implicit schemes show order half, as expected.

Table 5.4 presents the time costs in seconds. Let us fix the tolerance level at 0.05–0.06 and compare the corresponding entries in both tables. In this example the midpoint scheme is the most efficient, due to its first-order convergence in the commutative case. Among the methods of half order, the balanced Euler method (5.3.1) is the fastest, and one can expect that for multi-dimensional SDEs the explicit scheme (5.3.1) can considerably outperform implicit methods (see a similar outcome for the drift-tamed method (5.4.1), supported by experiments, in [235]). In comparison with the balanced Euler scheme (5.4.3), the drift-tamed Euler scheme (5.4.1) is divergent when the diffusion grows faster than a linear function at infinity.


Table 5.3. Example 5.4.3. Mean-square errors of the selected schemes.

h     | (5.4.4)   Rate | (5.4.6)   Rate | (5.4.7)   Rate | (5.4.5)   Rate | (5.4.3)   Rate
0.2   | 3.449e-01 –    | 1.816e-01 –    | 1.378e-01 –    | 4.920e-01 –    | 2.102e-01 –
0.1   | 2.441e-01 0.50 | 1.331e-01 0.45 | 8.723e-02 0.66 | 3.526e-01 0.48 | 1.637e-01 0.36
0.05  | 1.592e-01 0.62 | 9.619e-02 0.47 | 5.344e-02 0.71 | 2.230e-01 0.66 | 1.270e-01 0.37
0.02  | 8.360e-02 0.70 | 6.599e-02 0.41 | 2.242e-02 0.95 | 1.048e-01 0.82 | 9.170e-02 0.36
0.01  | 5.460e-02 0.61 | 4.919e-02 0.42 | 1.145e-02 0.97 | 5.990e-02 0.81 | 7.065e-02 0.38
0.005 | 3.682e-02 0.57 | 3.522e-02 0.48 | 5.945e-03 0.95 | 3.784e-02 0.66 | 5.393e-02 0.39


Table 5.4. Example 5.4.3. Computational times for the selected schemes.

h (5.4.4) (5.4.6) (5.4.7) (5.4.5) (5.4.3)

0.2 9.25e+00 1.10e+01 9.33e+00 1.20e+01 3.98e+00

0.1 1.77e+01 2.17e+01 1.80e+01 2.30e+01 7.49e+00

0.05 3.42e+01 4.26e+01 3.51e+01 4.48e+01 1.41e+01

0.02 8.33e+01 1.04e+02 8.69e+01 1.10e+02 3.37e+01

0.01 1.64e+02 2.05e+02 1.73e+02 2.19e+02 6.62e+01

0.005 3.25e+02 4.07e+02 3.47e+02 4.37e+02 1.32e+02

5.5 Summary and bibliographic notes

Under Assumption 5.2.1 (global monotone condition and polynomial growth of the coefficients), a solution to the equation (5.2.1) can be extremely large, and then (5.2.1) becomes highly stiff. To deal with this stiffness, several numerical methods have been proposed in the literature, such as implicit schemes and balanced schemes. To show the convergence order of these methods, it is natural to investigate a basic relationship between the local truncation error and the global truncation error of numerical methods for SDEs with locally Lipschitz continuous and polynomially growing coefficients; see Theorem 5.2.3.

It is important to observe that numerical methods for SDEs with coefficients of nonlinear growth require the Lp (p ≥ 2)-stability of the numerical solutions for mean-square convergence, while for numerical methods for SDEs with Lipschitz coefficients this stability is not required (though the Lp stability is naturally satisfied).

In Chapter 5.4, we show some comparisons among numerical results from balanced explicit schemes and implicit schemes. The balanced explicit scheme (5.4.3) is competitive for SDEs with drift and diffusion coefficients of polynomial growth; see Example 5.4.3. However, the balanced scheme (5.4.3) exhibits very large errors for SDEs with Lipschitz diffusion coefficients. We observe that it requires small time step sizes to reach the asymptotic region of convergence, even though it is proved to be of half order in the mean-square sense; see Examples 5.4.2 and 5.4.3.

Bibliographic notes. There have been many developments in numerical methods for SDEs with locally Lipschitz continuous coefficients since [448]; see Refs. [232, 237, 411, 436, 503, 512]. In [512], a semi-tamed Euler scheme is proposed for SDEs with non-Lipschitz continuous drift coefficients and Lipschitz continuous diffusion coefficients. The drift is decomposed into two parts, Lipschitz and non-Lipschitz, and only the non-Lipschitz part is tamed.

For SDEs with non-Lipschitz continuous drift and diffusion coefficients, Ref. [503] presents a class of first-order balanced schemes and proves the Lp-stability of the presented schemes using the fundamental theorem presented here. Ref. [411] proposes a tamed Euler scheme slightly different from (5.3.1)/(5.4.3) in this chapter:


160 5 Balanced numerical schemes for SDEs with non-Lipschitz coefficients

X_{k+1} = X_k + [a(t_k, X_k)h + ∑_{r=1}^m σ_r(t_k, X_k) ξ_{rk} √h] / [1 + √h |a(t_k, X_k)| + √h ∑_{r=1}^m |σ_r(t_k, X_k)|].   (5.5.1)
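As a concrete illustration of how a tamed step of the form (5.5.1) is implemented, here is a minimal sketch of our own (not code from the cited references), advancing one path of a scalar SDE; the common taming factor of size O(√h) bounds both the drift and the diffusion increments.

```python
import math
import random

def tamed_euler_step(x, t, h, a, sigmas, xi):
    """One step of the tamed scheme (5.5.1): both drift and diffusion
    increments are divided by a common taming denominator."""
    drift = a(t, x)
    diffs = [s(t, x) for s in sigmas]
    num = drift * h + sum(d * xi_r * math.sqrt(h) for d, xi_r in zip(diffs, xi))
    den = 1.0 + math.sqrt(h) * abs(drift) + math.sqrt(h) * sum(abs(d) for d in diffs)
    return x + num / den

# Toy usage: dX = (X - X^3) dt + 0.5 X dW, one path on [0, 1].
random.seed(0)
x, t, h = 1.0, 0.0, 1e-3
for _ in range(1000):
    x = tamed_euler_step(x, t, h, lambda t, x: x - x**3,
                         [lambda t, x: 0.5 * x], [random.gauss(0.0, 1.0)])
    t += h
print(abs(x) < 10.0)  # the taming keeps the path from blowing up
```

Without the denominator, an explicit Euler step with the cubic drift can overshoot and diverge for large excursions of the path; the taming makes every increment uniformly bounded by O(√h).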

A general class of tamed schemes is proposed in [436], targeting Lyapunov stability rather than simply Lp-stability; one such scheme is

X_{k+1} = X_k + a(t_k, X_k)h / [1 + √h |a(t_k, X_k)|] + ∑_{r=1}^m σ_r(t_k, X_k) ξ_{rk} √h / [1 + √h ∑_{r=1}^m |σ_r(t_k, X_k)|].   (5.5.2)

Under even more general conditions, Refs. [232, 237] propose a tamed Euler scheme of a similar type for SDEs with exponential moments and prove Lyapunov stability and half-order convergence.

In [448], the authors present fully implicit (i.e., implicit both in drift and diffusion, (3.2.27)) mean-square schemes for one-sided Lipschitz drift coefficients, which grow superlinearly but not faster than polynomially at infinity. The fully implicit scheme (3.2.27) was proposed and motivated by geometric integration of stochastic Hamiltonian equations in [356] (see also [358]), where its convergence was proved under globally Lipschitz coefficients.

In this book, we will discuss numerical methods for SPDEs with one-sided Lipschitz continuous coefficients in Chapters 9 and 10.

5.6 Suggested practice

Exercise 5.6.1 Show that the scheme (5.3.28) imposes less restrictive conditions on the p-th order moments when p0 is finite, compared with the scheme (5.3.1).

Exercise 5.6.2 Consider the following stochastic Ginzburg-Landau equation

dX(t) = (a(t)X(t) − b(t)X³(t)) dt + σ(t)X(t) dW(t),   (5.6.1)
X(0) = x > 0.   (5.6.2)

Here b(t) > 0, a(t), and σ(t) are bounded continuous functions on [0,∞).

a) Show that the coefficients satisfy Assumption 5.2.1.
b) Show that the solution to the stochastic Ginzburg-Landau equation is given by

X(t) = exp(∫_0^t [a(s) − σ²(s)/2] ds + ∫_0^t σ(s) dW(s)) / (x^{−2} + 2∫_0^t b(s) exp(2∫_0^s [a(θ) − σ²(θ)/2] dθ + 2∫_0^s σ(θ) dW(θ)) ds)^{1/2}.

c) Apply the schemes (5.4.1), (5.2.15), (5.4.5), (5.3.1), and (3.2.27) with λ = 1/2 and λ = 1 to solve the stochastic Ginzburg-Landau equation and compare their mean-square convergence orders.
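A quick sanity check of the closed-form solution in part b): taking constant coefficients a, b and σ = 0 (a simplifying assumption of ours, not part of the exercise), the formula reduces to X(t) = e^{at}/(x^{−2} + (b/a)(e^{2at} − 1))^{1/2}, which must solve the deterministic ODE X' = aX − bX³. The sketch below compares the formula with a Runge–Kutta integration.

```python
import math

a, b, x0 = 1.0, 2.0, 0.5   # constant coefficients, sigma = 0

def exact(t):
    # X(t) = e^{a t} / (x0^{-2} + 2 b \int_0^t e^{2 a s} ds)^{1/2}
    integral = (math.exp(2 * a * t) - 1.0) / (2 * a)
    return math.exp(a * t) / math.sqrt(x0 ** -2 + 2 * b * integral)

def rk4(f, x, dt, steps):
    # classical 4th-order Runge-Kutta for x' = f(x)
    for _ in range(steps):
        k1 = f(x); k2 = f(x + 0.5 * dt * k1)
        k3 = f(x + 0.5 * dt * k2); k4 = f(x + dt * k3)
        x += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return x

f = lambda x: a * x - b * x ** 3
approx = rk4(f, x0, 1e-3, 1000)       # integrate to t = 1
print(abs(approx - exact(1.0)))       # tiny discrepancy
```

Agreement here only checks the deterministic limit of the formula; the stochastic case is verified in the hint below the exercise via the integrating factor method.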



Hint. Apply the integrating factor method. Let Y(t) be the exponential process dY(t) = a(t)Y(t) dt + σ(t)Y(t) dW(t) and let X(t) = C(t)Y(t). By Itô's formula, we have

dC(t) = −b(t)Y 2(t)C3(t) dt,

from which we can find C(t) and thus X(t).
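For completeness, the ODE for C(t) separates (a standard computation, spelled out here for convenience, taking Y(0) = 1 so that C(0) = x):

```latex
dC(t) = -b(t)\,Y^2(t)\,C^3(t)\,dt
\;\Longrightarrow\;
d\bigl(C^{-2}(t)\bigr) = 2\,b(t)\,Y^2(t)\,dt
\;\Longrightarrow\;
C^{-2}(t) = x^{-2} + 2\int_0^t b(s)\,Y^2(s)\,ds.
```

Substituting Y(t) = exp(∫_0^t [a(s) − σ²(s)/2] ds + ∫_0^t σ(s) dW(s)) into X(t) = C(t)Y(t) then recovers the formula in part b) of Exercise 5.6.2.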

Exercise 5.6.3 Consider the following SDE

dX(t) = (X(t) − X³(t)) dt + σX²(t) dW(t),   (5.6.3)
X(0) = 1.   (5.6.4)

Here σ = 0.1.

a) Show that the coefficients satisfy the conditions in Assumption 5.2.1.
b) Apply the schemes (5.4.1), (5.2.15), (5.4.5), (5.3.1), and (3.2.27) with λ = 1/2 and λ = 1 to solve the equation and compare their mean-square convergence orders. Note that this time we have no analytical solution. Use a numerical solution by (3.2.27) with λ = 1/2 and a fine time step size as a reference solution ("exact solution").


Part II

Temporal White Noise



The standard approach to constructing SPDE solvers starts with a space discretization of an SPDE, for which spectral methods (see, e.g., [78, 167, 249]), finite element methods (see, e.g., [6, 152, 491]), or spatial finite differences (see, e.g., [6, 189, 420, 495]) can be used. The result of such a space discretization is a large system of ordinary stochastic differential equations (SDEs), which requires a time discretization to complete the numerical algorithm. In [101, 109] the SPDE is first discretized in time, and then a finite-element or finite-difference method can be applied to this semi-discretization. Other numerical approaches include those making use of splitting techniques [33, 191, 293], quantization [161], or an approach based on the averaging-over-characteristics formula [361, 396]. In [315, 344] numerical algorithms based on the Wiener chaos expansion (WCE) were introduced for solving the nonlinear filtering problem for hidden Markov models. Since then, WCE-based numerical methods have been successfully developed in a number of directions (see, e.g., [225, 489]).

In Part II of the book, we consider deterministic integration methods in random space for stochastic partial differential equations. In Chapter 6, we discuss Wiener chaos expansion (WCE) methods and a multistage WCE method for long-time integration of linear advection-diffusion-reaction equations with multiplicative noise. In Chapter 7, we discuss stochastic collocation methods (precisely, sparse grid collocation methods) for linear parabolic equations with multiplicative noise. Subsequently, we compare the two methods for these linear equations in Chapter 8, while in Chapter 9 we apply the stochastic collocation methods discussed in Chapters 7 and 8 to nonlinear equations, namely the stochastic Euler equations for the one-dimensional piston problem.


6 Wiener chaos methods for linear stochastic advection-diffusion-reaction equations

In this chapter, we discuss numerical algorithms using the Wiener chaos expansion (WCE) for solving second-order linear parabolic stochastic partial differential equations (SPDEs). The algorithm for computing moments of the SPDE solutions is deterministic, i.e., it does not involve any statistical errors from generating random numbers.

Although the Wiener chaos expansion (WCE) results in a triangular system of deterministic partial differential equations, the WCE is only efficient for short-time integration. Here, we present a recursive WCE method for longer-time integration of linear parabolic equations with temporal white noise. We compare the deterministic algorithm with the Monte Carlo method and demonstrate that the new recursive WCE method is more efficient when highly accurate solutions are required.

This chapter is organized as follows. We first describe in Chapter 6.1 the multistage WCE for computing solutions and moments of solutions and discuss the complexity of the resulting algorithm. An illustration of the algorithm with a simple example is presented in Chapter 6.2. We compare the multistage WCE and Monte Carlo-type algorithms for one-dimensional problems in Chapter 6.3 and for a two-dimensional passive scalar equation in Chapter 6.4. In Chapter 6.5, we highlight the main points of this chapter and comment on the WCE and the multistage WCE. We also provide two exercises focusing on time integration and the multistage WCE method at the end of the chapter.

6.1 Description of methods

In computing moments of SPDE solutions, the existing approaches to solving SPDEs are usually complemented by the Monte Carlo technique. Consequently, in these approaches numerical approximations of SPDE moments have two errors: the numerical integration (space-time discretization) error and the Monte Carlo (statistical) error. To reach a high accuracy, we have to run a very large number of independent simulations of the SPDE to reduce the Monte Carlo error. In contrast, WCE methods for computing moments of the SPDE solutions are statistical-error free (no random number generators are used) but are only subject to the error from the truncation of the WCE.

© Springer International Publishing AG 2017. Z. Zhang, G.E. Karniadakis, Numerical Methods for Stochastic Partial Differential Equations with White Noise, Applied Mathematical Sciences 196, DOI 10.1007/978-3-319-57511-7_6

6.1.1 Multistage WCE method

For the linear SPDE (3.3.23)–(3.3.24), we can use the propagator (3.3.16). In order to apply a truncation of the propagator, we introduce the following notation: the order of a multi-index α,

d(α) = max{l ≥ 1 : α_{k,l} > 0 for some k ≥ 1},

and the truncated set of multi-indices

J_{N,n} = {α ∈ J : |α| ≤ N, d(α) ≤ n}.

Recall that the multi-index length is |α| = ∑_{k,l=1}^∞ α_{k,l}. Here N is the highest Hermite polynomial order and n is the maximum number of Gaussian random variables for each Wiener process. Using (3.3.14), we introduce the truncated Wiener chaos solution

u_{N,n}(t, x) = ∑_{α∈J_{N,n}} (1/√α!) ϕ_α(t, x) ξ_α,   (6.1.1)

with the basis {m_l(s)}_{l≥1} given by

m_1(s) = 1/√t,   m_l(s) = √(2/t) cos(π(l − 1)s/t),  l ≥ 2,  0 ≤ s ≤ t.   (6.1.2)
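One can verify numerically that (6.1.2) is indeed an orthonormal system on [0, t]; the sketch below (our own check, using simple midpoint quadrature) computes the Gram matrix of the first few m_l and confirms it is close to the identity.

```python
import math

def m(l, s, t):
    # the cosine CONS (6.1.2) on [0, t]
    if l == 1:
        return 1.0 / math.sqrt(t)
    return math.sqrt(2.0 / t) * math.cos(math.pi * (l - 1) * s / t)

t, K = 2.0, 20_000                 # interval length and quadrature points
h = t / K
nodes = [(j + 0.5) * h for j in range(K)]   # midpoint rule nodes

# Gram matrix (m_i, m_j)_{L^2([0,t])} for i, j = 1..4
gram = [[h * sum(m(i, s, t) * m(j, s, t) for s in nodes)
         for j in range(1, 5)] for i in range(1, 5)]
err = max(abs(gram[i][j] - (1.0 if i == j else 0.0))
          for i in range(4) for j in range(4))
print(err < 1e-6)
```

The midpoint rule is second-order accurate, so with 20,000 nodes the deviation from the identity matrix is far below the 1e-6 tolerance.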

Letting α ∈ J_{N,n} in the propagator (3.3.16), we have that the coefficients ϕ_α(t, x; φ) satisfy the propagator

∂ϕ_α(t, x; φ)/∂t = Lϕ_α(t, x; φ) + f(x)1_{|α|=0}
  + ∑_{k=1}^q ∑_{l=1}^n α_{k,l} m_l(t) [M_k ϕ_{α−(k,l)}(t, x; φ) + g_k(x)1_{|α|=1}],  t ∈ (0, T],   (6.1.3)
ϕ_α(0, x) = φ(x)1_{|α|=0},

where α−(k, l) is the multi-index with components

(α−(k, l))_{i,j} = max(0, α_{i,j} − 1) if i = k and j = l, and (α−(k, l))_{i,j} = α_{i,j} otherwise.   (6.1.4)
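To make the index bookkeeping concrete, here is a small self-contained sketch of our own (not code from the book) that represents a multi-index α as a dictionary {(k, l): α_{k,l}}, implements |α|, d(α), and the lowering operation α−(k, l) of (6.1.4), and enumerates J_{N,n} for q Wiener processes.

```python
from itertools import product

def alpha_len(alpha):          # |alpha| = sum of all components
    return sum(alpha.values())

def order(alpha):              # d(alpha) = largest l with alpha_{k,l} > 0
    return max((l for (k, l), v in alpha.items() if v > 0), default=0)

def lower(alpha, k, l):        # the multi-index alpha-(k, l) of (6.1.4)
    out = dict(alpha)
    out[(k, l)] = max(0, out.get((k, l), 0) - 1)
    return {kl: v for kl, v in out.items() if v > 0}

def J(N, n, q):
    """Enumerate J_{N,n}: all alpha with |alpha| <= N and d(alpha) <= n,
    for q Wiener processes."""
    slots = list(product(range(1, q + 1), range(1, n + 1)))  # (k, l) pairs
    result = []
    def rec(i, remaining, current):
        if i == len(slots):
            result.append(dict(current))
            return
        for v in range(remaining + 1):
            if v > 0:
                current[slots[i]] = v
            rec(i + 1, remaining - v, current)
            current.pop(slots[i], None)
    rec(0, N, {})
    return result

# cardinality of J_{N,n} for one Wiener process: (N + n)! / (N! n!)
print(len(J(2, 1, 1)), len(J(2, 3, 1)))
```

The enumeration makes the cardinality formula used later in the cost estimates easy to check: with q·n "slots", the number of multi-indices with |α| ≤ N is the binomial coefficient C(N + qn, N).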



The truncated expansion (6.1.1) together with (6.1.3), (2.3.4), and (6.1.2) gives us a constructive approximation of the solution to (3.3.23), whose implementation requires that we numerically solve the propagator (3.3.16).

It is proved in [315, Theorem 2.2] that when b_i^k(t, x) = 0, c = 0, g_k = 0 (reaction-diffusion equation) and the number of noises is finite, there is a constant C > 0 such that for any t ∈ (0, T]

E[‖u_{N,n}(t, ·) − u(t, ·)‖²_{L²}] ≤ C e^{Ct} ( (Ct)^{N+1}/(N+1)! + t³/n ).   (6.1.5)

It follows from the error estimate (6.1.5) that the error of the approximation u_{N,n}(t, ·) grows exponentially in time t, which severely limits its practical use. To overcome this difficulty, it was proposed in [315] to introduce a time discretization with step Δ > 0 and to view (6.1.1), (3.3.16), (2.3.4), (6.1.2) as a one-step approximation of the SPDE solution.

To this end, we introduce the multistep basis for the WCE and its corresponding propagator. Let 0 = t_0 < t_1 < · · · < t_K = T be a uniform partition of the time interval [0, T] with time step size Δ, see Figure 6.1. Let {m_k^{(i)}} = {m_k^{(i)}(s)}_{k≥1} be the following CONS in L²([t_{i−1}, t_i]):

m_l^{(i)}(s) = m_l(s − t_{i−1}),  t_{i−1} ≤ s ≤ t_i,   (6.1.6)
m_1(s) = 1/√Δ,   m_l(s) = √(2/Δ) cos(π(l − 1)s/Δ),  l ≥ 2,  0 ≤ s ≤ Δ,
m_l(s) = 0,  l ≥ 1,  s ∉ [0, Δ].

Define the random variables ξ_α^{(i)}, i = 1, . . . , K, as

ξ_α^{(i)} := ∏_{k,l} ( H_{α_{k,l}}(ξ_{k,l}^{(i)}) / √(α_{k,l}!) ),  α ∈ J,   (6.1.7)

where ξ_{k,l}^{(i)} = ∫_{t_{i−1}}^{t_i} m_l^{(i)}(s) dW_k(s), and the H_n are the Hermite polynomials (2.3.3).
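The normalization in (6.1.7) rests on the identity E[H_m(ξ)H_n(ξ)] = n! δ_{m,n} for a standard Gaussian ξ and probabilists' Hermite polynomials, which makes each ξ_α^{(i)} have unit variance. This identity can be checked with Gauss–Hermite quadrature (a small verification of our own, using NumPy's `hermite_e` module):

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss, hermeval
from math import factorial, sqrt

# hermegauss integrates against the weight exp(-x^2/2); dividing the
# weights by sqrt(2*pi) turns the sum into a standard Gaussian expectation.
nodes, weights = hermegauss(30)
weights = weights / sqrt(2 * np.pi)

def He(n, x):
    # probabilists' Hermite polynomial He_n evaluated at x
    c = np.zeros(n + 1); c[n] = 1.0
    return hermeval(x, c)

for m in range(5):
    for n in range(5):
        val = np.sum(weights * He(m, nodes) * He(n, nodes))
        expected = factorial(n) if m == n else 0.0
        assert abs(val - expected) < 1e-8
print("orthogonality E[He_m He_n] = n! delta_mn verified for m, n < 5")
```

Since a 30-point Gauss rule integrates polynomials up to degree 59 exactly, the check is exact up to floating-point rounding for the orders used here.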

Let

u_{Δ,N,n}(0, x) = u_0(x)   (6.1.8)

and by induction for i = 1, . . . , K:

u_{Δ,N,n}(t_i, x) = ∑_{α∈J_{N,n}} (1/√α!) ϕ_α^{(i)}(Δ, x) ξ_α^{(i)},   (6.1.9)



where ϕ_α^{(i)}(Δ, x) solves the system

∂ϕ_α^{(i)}(s, x)/∂s = Lϕ_α^{(i)}(s, x) + f(x)1_{|α|=0}
  + ∑_{k,l} α_{k,l} m_l^{(i)}(s) [M_k ϕ_{α−(k,l)}^{(i)}(s, x) + g_k(x)1_{|α|=1}],  s ∈ (0, Δ],   (6.1.10)
ϕ_α^{(i)}(0, x) = u_{Δ,N,n}(t_{i−1}, x)1_{|α|=0}.

Thus, (6.1.8)–(6.1.10) together with (6.1.6) and (6.1.7) give us a recursive method for solving the SPDE (3.3.23), whose implementation requires numerically solving the propagator (6.1.10) at every time step.

Based on the one-step error (6.1.5), the following global error estimate for the recursive WCE method is proved in [315, Theorem 2.4] (for the case of b_i^k(t, x) = 0, c = 0, g_k = 0 and a finite number of noises):

E[‖u_{Δ,N,n}(t_i, ·) − u(t_i, ·)‖²_{L²}] ≤ C e^{CT} ( (CΔ)^N/(N+1)! + Δ²/n ),  i = 1, . . . , K,   (6.1.11)

for some C > 0 independent of Δ, N, and n, i.e., this method is of global mean-square order O( Δ^{N/2}/√((N+1)!) + Δ/√n ).

As mentioned above, the recursive WCE method requires solving the propagator (6.1.10) at every time step, which is computationally rather expensive. To reduce the cost, we introduce a modification of this method following [315]. The idea is to expand the initial condition u_0(x) in a basis {e_m}, represent u_{Δ,N,n}(t_i, x) as u_{Δ,N,n}(t_i, x) = ∑_m c_m e_m(x), and note that ϕ_α(Δ, x; u_{Δ,N,n}(t_i, ·)) = ∑_m c_m ϕ_α(Δ, x; e_m), where ϕ_α(s, x; φ) is the solution of the propagator (6.1.10) with the initial condition φ(x).

The idea is sketched in Figure 6.1. We first compute the propagator (6.1.12) (see below) on (0, Δ] and obtain a problem-dependent basis q_{α,l,m} (6.1.13). This step is called "offline" as in [315]. One then recursively computes the solution "online" by (6.1.14) and (6.1.15) only at the times iΔ (i = 2, . . . , K) using the obtained basis q_{α,l,m}. Specifically, we proceed as follows. Let {e_m} = {e_m(x)}_{m≥1} be a CONS in L²(D) with the boundary conditions satisfied and let (·, ·) be the inner product in that space.

Fig. 6.1. Illustration of the idea of the multistage WCE on the time levels 0, Δ, 2Δ, . . . , iΔ, . . . , T = KΔ. The dotted line denotes the "offline" computation, where we solve the propagator up to time Δ with a fine step δt. The dashed line indicates that one computes the solution "online" only on certain time levels instead of on the entire time interval.



For simplicity, we assume that f = g_k = 0. Let ϕ_α(s, x; φ) solve the following propagator:

∂ϕ_α(s, x; φ)/∂s = Lϕ_α(s, x; φ) + ∑_{k,l} α_{k,l} m_l(s) M_k ϕ_{α−(k,l)}(s, x; φ),  s ∈ (0, Δ],   (6.1.12)
ϕ_α(0, x) = φ(x)1_{|α|=0},

where the m_l(s) are the orthonormal cosine basis (6.1.2) (with t = Δ) on L²([0, Δ]). Define

q_{α,l,m} = (ϕ_α(Δ, ·; e_l), e_m),  l, m ≥ 1,   (6.1.13)

and then find by induction the coefficients

ψ_m(0; N, n) := (u_0, e_m),   (6.1.14)
ψ_m(i; N, n) := ∑_{α∈J_{N,n}} ∑_l (1/√α!) ψ_l(i − 1; N, n) q_{α,l,m} ξ_α^{(i)},  i = 1, . . . , K.

It can be readily shown that

u_{Δ,N,n}(t_i, x) = ∑_m ψ_m(i; N, n) e_m(x),  i = 0, . . . , K,  P-a.s.   (6.1.15)

We refer to the numerical method (6.1.15), (6.1.12)–(6.1.14), together with (6.1.6)–(6.1.7), as the multistage WCE method for the SPDE (3.3.23).

In practice, if the equation (3.3.23) has an infinite number of Wiener processes, we truncate them to a finite number r ≥ 1 of noises. We introduce the correspondingly truncated set

J_{N,n,r} = {α ∈ J : |α| ≤ N, d_r(α) ≤ n},

where d_r(α) = max{l ≥ 1 : α_{k,l} > 0 for some 1 ≤ k ≤ r}. We have the following algorithm to compute the numerical solution.

Algorithm 6.1.1 Choose a truncation of the number of noises r ≥ 1 and the algorithm's parameters: a CONS {e_m(x)}_{m≥1} and its truncation {e_m(x)}_{m=1}^M; a time step Δ; and N and n, which together with r determine the multi-index set J_{N,n,r}.

Step 1. For each m = 1, . . . , M, solve the propagator (6.1.12) for α ∈ J_{N,n,r} on the time interval [0, Δ] with the initial condition e_m(x) and denote the obtained solution by ϕ_α(Δ, x; e_m), α ∈ J_{N,n,r}, m = 1, . . . , M. We also need to choose a time step size δt to solve the equations in the propagator numerically.

Step 2. Evaluate ψ_m(0; N, n, M) = (u_0, e_m), m = 1, . . . , M, where u_0(x) is the initial condition for (3.3.23), and q_{α,l,m} = (ϕ_α(Δ, ·; e_l), e_m), l, m = 1, . . . , M.



Step 3. On the i-th time step (at time t = iΔ), generate the Gaussian random variables ξ_α^{(i)}, α ∈ J_{N,n,r}, according to (6.1.7), compute the coefficients

ψ_m(i; N, n, M) = ∑_{α∈J_{N,n,r}} ∑_{l=1}^M (1/√α!) ψ_l(i − 1; N, n, M) q_{α,l,m} ξ_α^{(i)},  m = 1, . . . , M,

and obtain the approximate solution of (3.3.23)

u^M_{Δ,N,n}(t_i, x) = ∑_{m=1}^M ψ_m(i; N, n, M) e_m(x).

In the next section, we present an algorithm based on Algorithm 6.1.1, which allows us to compute moments of the solution to (3.3.23) without using the Monte Carlo technique.

Remark 6.1.2 The cost of simulating the random field u(t_i, x) by Algorithm 6.1.1 over K time steps is proportional to K M² (N + nr)!/(N!(nr)!).

Remark 6.1.3 Choosing an orthonormal basis is an important topic in the research on spectral methods, see [163] and many subsequent works. Here we choose the Fourier basis for the problem (3.3.23) because of the periodic boundary conditions.

6.1.2 Algorithm for computing moments

Implementation of Algorithm 6.1.1 requires the generation of the random variables ξ_α^{(i)} (see (6.1.7)). Then, for computing moments of the solution of the SPDE (3.3.23), we also need to make use of the Monte Carlo technique. In this section we present a deterministic algorithm (Algorithm 6.1.4) for computing moments, i.e., an algorithm which does not require any random numbers and does not have a statistical error. In Chapters 6.2, 6.3, and 6.4 we compare Algorithm 6.1.4 with some Monte Carlo-type methods and demonstrate that Algorithm 6.1.4 can be more computationally efficient when higher accuracy is required.

The mean solution E[u(t, x)] is equal to the solution ϕ_{(0)}(t, x) of the propagator (6.1.12) with α = (0):

E[u(t, x)] = ϕ_{(0)}(t, x).

Thus, evaluating the mean E[u(t, x)] reduces to the numerical solution of the linear deterministic PDE for ϕ_{(0)}(t, x). We limit ourselves here to presenting an algorithm for computing the second moment of the solution, E[u²(t, x)]. Other moments of the solution u(t, x) can be treated analogously.

According to Algorithm 6.1.1, we approximate the solution u(t_i, x) of (3.3.23) by u^M_{Δ,N,n}(t_i, x) (when f = g_k = 0) as follows:

ψ_m(0; N, n, M) = (u_0, e_m),  m = 1, . . . , M,
ψ_m(t_i; N, n, M) = ∑_{α∈J_{N,n,r}} ∑_{l=1}^M (1/√α!) ψ_l(t_{i−1}; N, n, M) q_{α,l,m} ξ_α^{(i)},  m = 1, . . . , M,
u^M_{Δ,N,n}(t_i, x) = ∑_{m=1}^M ψ_m(t_i; N, n, M) e_m(x),  i = 1, . . . , K,

where the q_{α,l,m} are from (6.1.13) and the ξ_α^{(i)} are from (6.1.7). Then, we can evaluate the covariance matrices

Q_{lm}(0; N, n, M) := ψ_l(0; N, n, M) ψ_m(0; N, n, M),  l, m = 1, . . . , M,   (6.1.16)
Q_{lm}(t_i; N, n, M) := E[ψ_l(t_i; N, n, M) ψ_m(t_i; N, n, M)]
  = ∑_{j,k=1}^M Q_{jk}(t_{i−1}; N, n, M) ∑_{α∈J_{N,n,r}} (1/α!) q_{α,j,l} q_{α,k,m},  l, m = 1, . . . , M,  i = 1, . . . , K,

and, consequently, the second moment of the approximate solution

E[(u^M_{Δ,N,n}(t_i, x))²] = ∑_{l,m=1}^M Q_{lm}(t_i; N, n, M) e_l(x) e_m(x),  i = 1, . . . , K.   (6.1.17)

Implementation of (6.1.16)–(6.1.17) does not require generation of the random variables ξ_α^{(i)}. Hence, we have constructed a deterministic algorithm for computing the second moments of the solution to the SPDE (3.3.23) when f = g_k = 0, which we formulate below.

Algorithm 6.1.4 (Recursive multistage Wiener chaos expansion, [505, Algorithm 2]) Choose a truncation of the number of noises r ≥ 1 in (3.3.23) and the algorithm's parameters: a CONS {e_m(x)}_{m≥1} and its truncation {e_m(x)}_{m=1}^M; a time step Δ; and N and n, which together with r determine the size of the multi-index set J_{N,n,r}.

Step 1. For each m = 1, . . . , M, solve the propagator (6.1.12) for α ∈ J_{N,n,r} on the time interval [0, Δ] with the initial condition φ(x) = e_m(x) and denote the obtained solution by ϕ_α(Δ, x; e_m), α ∈ J_{N,n,r}, m = 1, . . . , M. Also, choose a time step size δt to solve the equations in the propagator numerically.

Step 2. Evaluate ψ_m(0; N, n, M) = (u_0, e_m), m = 1, . . . , M, where u_0(x) is the initial condition for (3.3.23), and q_{α,l,m} = (ϕ_α(Δ, ·; e_l), e_m), l, m = 1, . . . , M.

Step 3. Recursively compute the covariance matrices Q_{lm}(t_i; N, n, M) according to (6.1.16) and obtain the second moment E[(u^M_{Δ,N,n}(t_i, x))²] of the approximate solution to (3.3.23) by (6.1.17).

The accuracy of Algorithm 6.1.4 for a single noise (r = 1) will be shown in Theorem 8.3.6 and its Corollary 8.3.2 in Chapter 8. The error estimate for the approximation of the second moment E[u²(t_i, x)] by E[(u_{Δ,N,n}(t_i, x))²] is the same as the error given in (6.1.11). Due to the orthogonality of the random variables ξ_α^{(i)}, in the sense that E[ξ_α^{(i)} ξ_β^{(j)}] = 0 unless i = j and α = β, the following equality holds:

E[u²(t, x)] − E[u²_{N,n}(t, x)] = E[(u(t, x) − u_{N,n}(t, x))²].   (6.1.18)

Here, we do not discuss the errors arising from the noise truncation and from the truncation of the basis {e_m(x)}_{m≥1}.

Computational Cost. The computational costs of Steps 1 and 2 of Algorithm 6.1.4 are proportional to M² (N + nr)!/(N!(nr)!), and the computational cost of Step 3 over K time steps is proportional to K M⁴ (N + nr)!/(N!(nr)!). Taking this into account together with the error estimate (6.1.11), it is computationally beneficial to choose n = 1 and N = 2 or N = 1.

The main computational cost of Algorithm 6.1.4 is due to the total number M of basis functions in physical space required for reaching a satisfactory accuracy. For a fixed accuracy, the number M of basis functions {e_m}_{m=1}^M is proportional to C^d, where C depends on the choice of the basis and on the problem. If the variance of u²(t, x_i) is relatively large and the problem considered does not require a very large number of basis functions M, then one expects Algorithm 6.1.4 to be computationally more efficient in evaluating second moments than the combination of Algorithm 6.1.1 with the Monte Carlo technique.

The efficiency of Algorithm 6.1.4 can often be improved by choosing an appropriate basis {e_m} so that the majority of the functions q_{α,l,m} are identically zero or negligible and hence can be dropped from the computation of the covariance matrix {Q_{lm}(t_i; N, M)}_{l,m=1}^M, significantly decreasing the computational cost of Step 3. For instance, for the periodic passive scalar equation considered in Chapter 6.4 we choose the Fourier basis {e_m}. In this case the number of nonzero q_{α,l,m} is proportional just to M (the total number of q_{α,l,m} is proportional to M²) and, consequently, the computational cost of Step 3 (and hence that of Algorithm 6.1.4) becomes proportional to M² instead of the original M⁴. Moreover, the computation of the covariance matrix according to (6.1.16) can be done in parallel. Clearly, the use of reduced-order methods with offline/online strategies [409] can greatly reduce the value of M and hence will make the recursive WCE method very efficient.

Remark 6.1.5 It is more expensive to compute higher-order moments by a deterministic algorithm analogous to Algorithm 6.1.4. Since second moments give us physically important characteristics such as the energy and correlation functions, Algorithm 6.1.4 can be a competitive alternative to Monte Carlo-type methods in practice.



6.2 Examples in one dimension

We consider two one-dimensional problems and illustrate the application of Algorithm 6.1.4 to them. We will test Algorithm 6.1.4 by evaluating the second moments E[u²(t, x)] of the solutions to the stochastic advection-diffusion equation (3.3.18) and the stochastic reaction-diffusion equation (3.3.20).

Application of the WCE algorithms to the model problems. The problems (3.3.18) and (3.3.20) are simpler than the general linear SPDE (3.3.23). Consequently, Algorithm 6.1.4 applied to them takes a simpler form, see Algorithm 6.2.1 below.

The model problems (3.3.18) and (3.3.20) have a single Wiener process, and their solutions (3.3.19) and (3.3.21) have the form u(t, x) = f(t, x, W(t)), where f(t, x, y) is a smooth function. Consequently, the solutions are expandable in the basis consisting just of ξ_α = H_k(W(t)/√t)/√k! = H_k(ξ_1)/√k!, α = (k, 0, . . . , 0), k = 0, 1, . . . , i.e., we have

u(t, x) = ∑_{α∈J} (ϕ_α(t, x)/√α!) ξ_α = ∑_{N=0}^∞ ∑_{α∈J_{N,1}} (ϕ_α(t, x)/√α!) ξ_α = ∑_{k=0}^∞ (ϕ_k(t, x)/√k!) η_k,   (6.2.1)

where η_k = ξ_α with α = (k, 0, . . . , 0), k = 0, 1, . . . . Hence

u_{N,1}(t, x) =: u_N(t, x) = ∑_{k=0}^N (ϕ_k(t, x)/√k!) η_k,   (6.2.2)

which corresponds to setting n = 1 in (6.1.1). It is not difficult to show that applying Algorithm 6.1.4 to the model problems (3.3.18) and (3.3.20) is more accurate than in the general case of (3.3.23) (cf. (6.1.11) and (6.1.18)):

‖E[u²(t, ·)] − E[u²_{Δ,N}(t, ·)]‖_{L²} ≤ C (CΔ)^N/(N+1)!   (6.2.3)

for all sufficiently small Δ > 0 and a constant C > 0 independent of Δ and N (we assume that the errors arising from the truncation of the basis {e_m} are negligible).

For the problems (3.3.18) and (3.3.20), the propagator (6.1.12) takes the form

∂_t ϕ_0 = a ∂²_{xx} ϕ_0,  ϕ_0(0, x; φ) = φ(x),   (6.2.4)
∂_t ϕ_k = a ∂²_{xx} ϕ_k + (σk/√Δ) ∂_x ϕ_{k−1},  ϕ_k(0, x; 0) = 0,  k > 0,

and

∂_t ϕ_0 = a ∂²_{xx} ϕ_0,  ϕ_0(0, x; φ) = φ(x),   (6.2.5)
∂_t ϕ_k = a ∂²_{xx} ϕ_k + (σk/√Δ) ϕ_{k−1},  ϕ_k(0, x; 0) = 0,  k > 0,



respectively. We solve these propagators numerically using the Fourier collocation method with M nodes in physical space and the Crank–Nicolson time discretization with step δt in time. Denote by L_m(x), m = 1, . . . , M, the m-th Lagrangian trigonometric polynomial based on the M Fourier collocation nodes, i.e., L_m(x) is a trigonometric polynomial of degree at most M/2 satisfying L_m(x_l) = δ_{m,l}, where x_l = 2π(l − 1)/M, l = 1, . . . , M. The m-th Lagrangian trigonometric polynomial can be written as

L_m(x) = (1/M) sin(M(x − x_m)/2) cot((x − x_m)/2),  x ∈ [0, 2π].   (6.2.6)

In fact, the Lagrangian trigonometric polynomials can be represented as

L_m(x) = ∑_{k=−M/2+1}^{M/2} a_k exp(−ikx),  where i = √−1.

Applying the conditions L_m(x_l) = δ_{m,l}, we can find the a_k's and rewrite L_m(x) in the form of (6.2.6). More details on the derivation of the formula (6.2.6) can be found in [447, Chapter 3].
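The cardinal property L_m(x_l) = δ_{m,l} of (6.2.6) is easy to confirm numerically (our own check; the removable singularity at x = x_m is handled by taking the limit value 1):

```python
import math

M = 8                                     # an even number of collocation nodes

def x_node(l):                            # x_l = 2*pi*(l - 1)/M, l = 1..M
    return 2 * math.pi * (l - 1) / M

def L(m, x):
    # Lagrangian trigonometric polynomial (6.2.6)
    d = x - x_node(m)
    if abs(math.sin(d / 2)) < 1e-14:      # limit value at x = x_m
        return 1.0
    return math.sin(M * d / 2) / (M * math.tan(d / 2))

err = max(abs(L(m, x_node(l)) - (1.0 if m == l else 0.0))
          for m in range(1, M + 1) for l in range(1, M + 1))
print(err < 1e-12)
```

At the nodes x_l with l ≠ m, the factor sin(M(x_l − x_m)/2) = sin(π(l − m)) vanishes, while at x_m the product tends to 1, which is exactly what the check observes up to rounding.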

Now we formulate the realization of Algorithm 6.1.4 in these two modelproblems.

Algorithm 6.2.1 For given values of the model parameters a and σ, choose the algorithm parameters: the number of Fourier collocation nodes M, a time step δt for solving the propagator (6.2.4) (or (6.2.5)), a time step Δ, and the number of Hermite polynomials N.

Step 1. Solve the propagator (6.2.4) (or (6.2.5)) on the time interval [0, Δ] with the initial condition φ(x) = L_m(x), m = 1, . . . , M, using the Fourier collocation method with M nodes in physical space and the Crank–Nicolson scheme with step δt in time, and denote the obtained numerical approximation of ϕ_k(Δ, x_l; L_m) by ϕ^{M,δt}_k(Δ, x_l; L_m), l, m = 1, . . . , M, k = 0, 1, . . . , N.

Step 2. Recursively compute the covariance matrices

Q_{lm}(t_i; N, M) := E[u^{M,δt}_{Δ,N}(t_i, x_l) u^{M,δt}_{Δ,N}(t_i, x_m)],  t_i = iΔ,  i = 0, . . . , K,

of the approximate solution to (3.3.18) (or (3.3.20)):

Q_{lm}(0; N, M) = u_0(x_l) u_0(x_m),  l, m = 1, . . . , M,
Q_{lm}(t_i; N, M) = ∑_{k=0}^N ∑_{q=1}^M ∑_{r=1}^M (1/k!) Q_{qr}(t_{i−1}; N, M) ϕ^{M,δt}_k(Δ, x_l; L_q) ϕ^{M,δt}_k(Δ, x_m; L_r),
  l, m = 1, . . . , M,  i = 1, . . . , K,

where u_0(x) is the initial condition of (3.3.18) (or (3.3.20)). In particular, we obtain the second moment of the approximate solution to (3.3.18) (or (3.3.20)):

E[(u^{M,δt}_{Δ,N}(t_i, x_j))²] = Q_{jj}(t_i; N, M),  j = 1, . . . , M,  i = 1, . . . , K.



To approximate the solution of (3.3.18) (or (3.3.20)), one can use the truncated WCE u_N(t, x) from (6.2.2) and, in particular, evaluate the second moment E[u²(t, x)] as

E[u²(t, x)] ≈ E[u²_N(t, x)] = ∑_{k=0}^N ϕ²_k(t, x)/k! ≈ ∑_{k=0}^N [ϕ^{M,δt}_k(t, x)]²/k!,   (6.2.7)

where ϕ_0(t, x) = ϕ_0(t, x; u_0(x)) and ϕ_k(t, x) = ϕ_k(t, x; 0), k > 0, are solutions of the propagator (6.2.4) (or (6.2.5)), and the ϕ^{M,δt}_k(t, x) are their numerical approximations obtained, e.g., using the Fourier collocation method with M nodes in physical space and the Crank–Nicolson scheme with step δt in time. The approximation (6.2.7) can be viewed as a one-step approximation corresponding to Algorithm 6.2.1, i.e., the first step of Algorithm 6.2.1 with Δ = t, and its error is estimated by

‖E[u²(t, ·)] − E[u²_N(t, ·)]‖_{L²} ≤ C e^{Ct} (Ct)^{N+1}/(N+1)!.

This error grows exponentially with t, which can be readily verified by numerical tests with (3.3.18). To reach a satisfactory accuracy of the approximation (6.2.7) for a fixed t, one has to take a sufficiently large N, which is computationally expensive even for moderate values of t. In contrast, it is demonstrated in Chapter 6.2.1 that the error of Algorithm 6.2.1 grows only linearly with time and remains relatively small even for N = 1.

6.2.1 Numerical results for one-dimensional advection-diffusion-reaction equations

In this section we present some numerical results of Algorithm 6.2.1 for the two model problems (3.3.18) and (3.3.20).

In approximating the propagators (6.2.4) and (6.2.5), we choose a sufficiently large number of Fourier collocation nodes M and a sufficiently small time step δt so that the errors of the numerical solutions to the propagators have a negligible influence on the overall accuracy of Algorithm 6.2.1 in our simulations. In all the numerical tests it was sufficient to take M = 20; this choice of M was tested by running control tests with M = 80.

We measure numerical errors in the following norms:

ρ_2(t) = ( (2π/M) ∑_{m=1}^M (E[(u^{M,δt}_{Δ,N}(t, x_m))²] − E[u²(t, x_m)])² )^{1/2},

and

ρ_∞(t) = max_{1≤m≤M} |E[(u^{M,δt}_{Δ,N}(t, x_m))²] − E[u²(t, x_m)]|.



The results of our tests on the model problem (3.3.18) in the degenerate case (i.e., ε = 0) and in the nondegenerate case (i.e., ε > 0) are presented in Tables 6.1 and 6.2, respectively. Table 6.3 corresponds to the tests with the second model problem (3.3.20). Numerical tests with values of the parameters other than those used for Tables 6.1–6.3 were also performed and gave similar results.

Analyzing the results in Tables 6.1, 6.2, and 6.3, we observe a convergence order of Δ^N for a fixed N in all the tests, which confirms our theoretical prediction (6.2.3). We also ran other cases (not presented here) to confirm the conclusion from Chapter 6.2 that the number n of random variables ξ_k used per step does not influence the accuracy of Algorithm 6.1.4 in the case of the model problems (3.3.18) and (3.3.20).

In Figure 6.2 we demonstrate the dependence of the relative numerical error

ρ^r_2(t) = ρ_2(t)/‖E[u²(t, ·)]‖_{L²}

on the integration time. These results were obtained in the degenerate case of the problem (3.3.18), but similar behavior of the errors was observed in our tests with other parameters as well. One can conclude from Figure 6.2 that (after an initial fast growth) the error grows linearly with the integration time. This is a remarkable feature of the recursive WCE algorithm, which implies that the algorithm can be used for long-time integration of SPDEs.

Table 6.1. Performance of Algorithm 6.2.1 for Model (3.3.18). The parameters of the model (3.3.18) are σ = 1, ε = 0, and the time t = 10. In Algorithm 6.2.1 we take M = 20.

  N   Δ       δt        ρ2(10)       ρ∞(10)
  1   0.1     1×10⁻³    4.69×10⁻¹    1.87×10⁻¹
      0.01    1×10⁻⁴    6.07×10⁻²    2.42×10⁻²
      0.001   1×10⁻⁵    6.25×10⁻³    2.49×10⁻³
  2   0.1     1×10⁻³    1.92×10⁻²    7.67×10⁻³
      0.01    1×10⁻⁴    2.07×10⁻⁴    8.27×10⁻⁵
      0.001   1×10⁻⁵    2.09×10⁻⁶    8.33×10⁻⁷
  3   0.1     1×10⁻³    4.82×10⁻⁴    1.99×10⁻⁴
      0.01    1×10⁻⁴    5.16×10⁻⁷    2.06×10⁻⁷
      0.001   1×10⁻⁵    3.37×10⁻¹⁰   1.81×10⁻¹⁰
  4   0.1     1×10⁻³    9.36×10⁻⁶    3.73×10⁻⁶
      0.01    1×10⁻⁵    9.35×10⁻¹⁰   4.17×10⁻¹⁰



Table 6.2. Model (3.3.18): performance of Algorithm 6.2.1. The parameters of the model (3.3.18) are σ = 1, ε = 0.01, and the time t = 10. In Algorithm 6.2.1 we take M = 20.

  N   Δ       δt        ρ2(10)       ρ∞(10)
  1   0.1     1×10⁻³    3.84×10⁻¹    1.53×10⁻¹
      0.01    1×10⁻⁴    4.97×10⁻²    1.98×10⁻²
      0.001   1×10⁻⁴    5.11×10⁻³    2.04×10⁻³
  2   0.1     1×10⁻³    1.58×10⁻²    6.28×10⁻³
      0.01    1×10⁻⁴    1.70×10⁻⁴    6.77×10⁻⁵
      0.001   1×10⁻⁴    1.72×10⁻⁶    6.88×10⁻⁷
  3   0.1     1×10⁻³    3.95×10⁻⁴    1.57×10⁻⁴
      0.01    1×10⁻⁴    4.22×10⁻⁷    1.68×10⁻⁷
      0.001   1×10⁻⁵    3.65×10⁻¹⁰   2.01×10⁻¹⁰
  4   0.1     1×10⁻³    7.67×10⁻⁶    3.06×10⁻⁶
      0.01    1×10⁻⁵    8.39×10⁻¹⁰   3.90×10⁻¹⁰

6.3 Comparison of the WCE algorithm and Monte Carlo type algorithms

In this section, we compare the recursive WCE algorithm with some Monte Carlo-type algorithms.

As discussed in Chapter 2, there are other approaches to solving SPDEs, which are usually complemented by the Monte Carlo technique when one is interested in computing moments of SPDE solutions. In this section, using the problem (3.3.18), we compare the performance of Algorithm 6.2.1 and two Monte Carlo-type algorithms, one of which is based on the method of characteristics [361] and the other on the Fourier transform of the linear SPDE with subsequent simulation of SDEs and application of the Monte Carlo technique.

Table 6.3. Performance of Algorithm 6.2.1 for Model (3.3.20). The parameters of the model (3.3.20) are σ = 1, a = 0.5, and the time t = 10. In Algorithm 6.2.1 we take M = 20.

  N   Δ       δt        ρ2(10)      ρ∞(10)
  1   0.1     1×10⁻³    5.75×10⁻¹   3.74×10⁻¹
      0.01    1×10⁻⁴    7.44×10⁻²   4.85×10⁻²
      0.001   1×10⁻⁴    7.65×10⁻³   4.98×10⁻³
  2   0.1     1×10⁻³    2.36×10⁻²   1.53×10⁻²
      0.01    1×10⁻⁴    2.54×10⁻⁴   1.65×10⁻⁴
      0.001   1×10⁻⁴    2.58×10⁻⁶   1.68×10⁻⁶
  3   0.1     1×10⁻³    5.90×10⁻⁴   3.85×10⁻⁴
      0.01    1×10⁻⁴    6.32×10⁻⁷   4.12×10⁻⁷



Fig. 6.2. Dependence of the relative numerical error ρ^r_2(t) on integration time. Model (3.3.18) is simulated by Algorithm 6.2.1 with M = 20 and δt = Δ/100 and various Δ and N. The parameters of (3.3.18) are σ = 1 and ε = 0.

[Figure: semi-logarithmic plot of ρ^r_2(t) versus t ∈ [0, 10] for the four cases N = 1, Δ = 0.1; N = 2, Δ = 0.1; N = 1, Δ = 0.01; N = 2, Δ = 0.01; the error ranges roughly from 10⁻⁷ to 10⁰.]

The solution of (3.3.18) with ε = 0 (the degenerate case) can be represented via the method of characteristics [408]:

$$u(t,x) = \sin\big(X_{t,x}(0)\big), \qquad (6.3.1)$$

where X_{t,x}(s), 0 ≤ s ≤ t, is the solution of the system of backward characteristics

$$dX_{t,x}(s) = \sigma\,\overleftarrow{dW}(s), \qquad X_{t,x}(t) = x. \qquad (6.3.2)$$

The notation $\overleftarrow{dW}(s)$ means the backward Ito integral (see, e.g., [408]). It follows from (6.3.2) that X_{t,x}(0) has the same probability distribution as $x + \sigma\sqrt{t}\,\zeta$, where ζ is a standard Gaussian random variable (i.e., ζ ∼ N(0,1)). Since we are interested only in computing statistical moments, it is assumed that

$$X_{t,x}(0) = x + \sigma\sqrt{t}\,\zeta. \qquad (6.3.3)$$

Then, we can estimate the second moment m2(t,x) := E[u²(t,x)] as

$$m_2(t,x) \doteq \bar m_2(t,x) = \frac{1}{L}\sum_{l=1}^{L}\sin^2\big(x + \sigma\sqrt{t}\,\zeta^{(l)}\big), \qquad (6.3.4)$$

where ζ^{(l)}, l = 1, …, L, are i.i.d. standard Gaussian random variables. The estimate m̄2 of m2 is unbiased, and hence the numerical procedure for finding m2 based on (6.3.4) has only the Monte Carlo (i.e., statistical) error, which can be quantified via half of the length of the 95% confidence interval:



$$\rho_{MC}(t,x) = \frac{2\sqrt{\mathrm{Var}\big(\sin^2(x+\sigma\sqrt{t}\,\zeta)\big)}}{\sqrt{L}}.$$

Table 6.4 gives the statistical error for $\bar m_2(t,x)$ from (6.3.4) (there is no space-time discretization error in this algorithm), which is computed as

$$2\cdot\max_j \frac{\sqrt{\frac{1}{L}\sum_{l=1}^{L}\sin^4\big(x_j+\sigma\sqrt{t}\,\zeta^{(l)}\big) - \big[\bar m_2(t,x_j)\big]^2}}{\sqrt{L}}, \qquad (6.3.5)$$

where the set of x_j is the same as the one used in Table 6.5 by Algorithm 6.2.1 and ζ^{(l)} are as in (6.3.4).
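For concreteness, the estimator (6.3.4) and the error measure (6.3.5) can be sketched as follows (the grid size, sample size, and random seed are illustrative, and the variable names are ours); the closed form E[sin²(x + σ√t ζ)] = (1 − e^{−2σ²t} cos 2x)/2 serves as the reference.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma, t, L, M = 1.0, 10.0, 10**5, 20
x = 2 * np.pi * np.arange(M) / M               # nodes x_j
zeta = rng.standard_normal(L)                  # i.i.d. N(0,1) samples

# unbiased estimator (6.3.4) of m2(t, x_j) = E[sin^2(x_j + sigma*sqrt(t)*zeta)]
s2 = np.sin(x[:, None] + sigma * np.sqrt(t) * zeta[None, :])**2
m2_bar = s2.mean(axis=1)

# statistical error (6.3.5): twice the largest standard error over the grid
stat_err = 2 * np.max(np.sqrt((np.mean(s2**2, axis=1) - m2_bar**2) / L))

m2_exact = 0.5 * (1 - np.exp(-2 * sigma**2 * t) * np.cos(2 * x))
print(np.max(np.abs(m2_bar - m2_exact)), stat_err)
```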

All the tests were run using Matlab R2007b on a Macintosh desktop computer with an Intel Xeon CPU E5462 (quad-core, 2.80 GHz). Every effort was made to program and execute the different algorithms in as identical a way as possible.

The cost of simulation due to (6.3.4) is directly proportional to L. The slower increase of time for smaller L in Table 6.4 is due to the initialization time of the computer program being included in the time measurement.

Table 6.4. Performance of the method (6.3.4) for Model (3.3.18). The parameters of the model (3.3.18) are σ = 1, ε = 0, and the time t = 10. The statistical error is computed according to (6.3.5).

  L      Statistical error   CPU time (sec.)
  10²    8.87×10⁻²           6×10⁻³
  10⁴    7.40×10⁻³           6.7×10⁻²
  10⁶    7.09×10⁻⁴           7.4×10⁰
  10⁸    7.07×10⁻⁵           7.4×10²
  10¹⁰   7.07×10⁻⁶           7.3×10⁴

In Table 6.5 we repeat some of the results already presented in Table 6.1, now also accompanied by the CPU time for comparison.

Comparing the results in Tables 6.4 and 6.5, we conclude that for a relatively large error the estimate $\bar m_2(t,x)$ from (6.3.4) is computationally more efficient than Algorithm 6.2.1; however, Algorithm 6.2.1 has lower costs in reaching a higher accuracy (errors of order equal to or smaller than 10⁻⁶).

Now we compare Algorithm 6.2.1 with another approach exploiting the Monte Carlo technique for the problem (3.3.18) with ε = 0. We can represent the solution of this periodic problem via the Fourier transform:

$$u(t,x) = \sum_{k\in\mathbb{Z}} e^{ikx}\, u_k(t), \qquad (6.3.6)$$



Table 6.5. Performance of Algorithm 6.2.1 for Model (3.3.18). The parameters of the model (3.3.18) are σ = 1, ε = 0, and the time t = 10. The parameters of Algorithm 6.2.1 are Δ = 0.1, M = 20, δt = 0.001.

  N   ρ∞(10)      CPU time (sec.)
  1   1.87×10⁻¹   5.7×10⁰
  2   7.67×10⁻³   8.1×10⁰
  3   1.99×10⁻⁴   1.1×10¹
  4   3.73×10⁻⁶   1.3×10¹

with u_k(t), t ≥ 0, k ∈ Z, satisfying the system of SDEs:

$$du_k(t) = -\frac{k^2\sigma^2}{2}\, u_k(t)\,dt + ik\sigma\, u_k(t)\,dW(t), \qquad \mathrm{Re}\, u_k(0) = 0, \qquad (6.3.7)$$
$$\mathrm{Im}\, u_k(0) = \frac{1}{2}\big(\delta_{1k} - \delta_{-1k}\big).$$

Noting that here u_k(t) ≡ 0 for all |k| ≠ 1 and rewriting (6.3.6)-(6.3.7) in the trigonometric form, we get

$$u(t,x) = u_c(t)\cos x + u_s(t)\sin x, \qquad (6.3.8)$$

where

$$du_c(t) = -\frac{1}{2}\sigma^2 u_c(t)\,dt + \sigma u_s(t)\,dW(t), \qquad u_c(0) = 0, \qquad (6.3.9)$$
$$du_s(t) = -\frac{1}{2}\sigma^2 u_s(t)\,dt - \sigma u_c(t)\,dW(t), \qquad u_s(0) = 1.$$

The system (6.3.9) is a Hamiltonian system with multiplicative noise (see, e.g., [356, 358]). It is known [356, 358] that symplectic integrators have advantages over standard numerical methods in long time simulations of stochastic Hamiltonian systems. Here we apply one of the symplectic methods, the midpoint scheme, to (6.3.9); it takes the following form:

$$u_c(t_{k+1}) = u_c(t_k) + \frac{\sigma}{2}\big(u_s(t_k) + u_s(t_{k+1})\big)\sqrt{\Delta t}\,\zeta_{k+1}, \qquad u_c(0) = 0, \qquad (6.3.10)$$
$$u_s(t_{k+1}) = u_s(t_k) - \frac{\sigma}{2}\big(u_c(t_k) + u_c(t_{k+1})\big)\sqrt{\Delta t}\,\zeta_{k+1}, \qquad u_s(0) = 1,$$

where ζ_k are i.i.d. standard Gaussian random variables and Δt > 0 is a time step. The scheme (6.3.10) converges with mean-square order 1/2 and weak order 1 [358]. It is implicit but can be resolved analytically since we are dealing with a linear system here.
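Concretely, writing a = (σ/2)√Δt ζ_{k+1}, the implicit step (6.3.10) resolves to u_c(t_{k+1}) = [(1 − a²)u_c(t_k) + 2a u_s(t_k)]/(1 + a²) and u_s(t_{k+1}) = [(1 − a²)u_s(t_k) − 2a u_c(t_k)]/(1 + a²), a rotation that preserves u_c² + u_s² exactly. A minimal sketch (the function name is ours):

```python
import numpy as np

def midpoint_kubo(sigma, T, dt, rng):
    """One path of the midpoint scheme (6.3.10) for the Kubo
    oscillator (6.3.9); the implicit 2x2 step is solved in closed form."""
    uc, us = 0.0, 1.0
    for _ in range(int(round(T / dt))):
        a = 0.5 * sigma * np.sqrt(dt) * rng.standard_normal()
        # closed-form solution of the implicit step: a rotation
        uc, us = (((1 - a**2) * uc + 2 * a * us) / (1 + a**2),
                  ((1 - a**2) * us - 2 * a * uc) / (1 + a**2))
    return uc, us

rng = np.random.default_rng(0)
uc, us = midpoint_kubo(sigma=1.0, T=10.0, dt=0.01, rng=rng)
print(uc**2 + us**2)  # conserved quantity: equals 1 up to round-off
```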

Using (6.3.8) and (6.3.10), we evaluate the second moment of the solution to (3.3.18) with ε = 0 as



$$m_2(t_k,x) := E[u^2(t_k,x)] \doteq E\big[(u_c(t_k)\cos x + u_s(t_k)\sin x)^2\big] \qquad (6.3.11)$$
$$\doteq \bar m_2(t_k,x) = \frac{1}{L}\sum_{l=1}^{L}\big[u_{c,(l)}(t_k)\cos x + u_{s,(l)}(t_k)\sin x\big]^2,$$

where $u_{c,(l)}(t_k)$, $u_{s,(l)}(t_k)$ are independent realizations of the random variables $u_c(t_k)$, $u_s(t_k)$.

The estimate $\bar m_2(t_k,x)$ from (6.3.11) has two errors: the time discretization error due to the approximation of (6.3.9) by (6.3.10) and the Monte Carlo error. The errors presented in Table 6.6 are computed as $\max_j\big[\bar m_2(t_k,x_j) - E[u^2(t_k,x_j)]\big]$ and are given together with the 95% confidence interval.

Table 6.6. Model (3.3.18): performance of the method (6.3.11). The parameters of the model (3.3.18) are σ = 1, ε = 0, and the time t = 10.

  Δt      L     Error                   CPU time (sec.)
  0.1     10⁴   8.06×10⁻³ ± 7.09×10⁻³   4.72×10⁻¹
  0.01    10⁴   6.55×10⁻⁴ ± 7.08×10⁻⁴   3.90×10²
  0.001   10⁶   8.81×10⁻⁵ ± 7.07×10⁻⁵   3.81×10⁵

Comparing the results in Tables 6.6 and 6.5, we observe again that Algorithm 6.2.1 is computationally more efficient than the Monte Carlo-based algorithms in reaching a higher accuracy.

6.4 A two-dimensional passive scalar equation

We consider a special stochastic advection-diffusion equation (3.3.23)–(3.3.24), a two-dimensional passive scalar equation. This equation is motivated by the study of the turbulent transport problem; see [146, 273, 314] and the references therein. Here we perform numerical tests on the two-dimensional (d = 2) passive scalar equation with periodic boundary conditions:

$$du(t,x) + \sum_{k=1}^{\infty}\sum_{i=1}^{d} \sigma^i_k(x)\, D_i u \circ dW_k(t) = 0, \qquad t > 0,\; x \in (0,\ell)^2, \qquad (6.4.1)$$
$$u(t, x_1+\ell, x_2) = u(t, x_1, x_2+\ell) = u(t,x), \qquad t > 0,\; x \in (0,\ell)^2,$$
$$u(0,x) = u_0(x), \qquad x \in (0,\ell)^2,$$

where '∘' indicates the Stratonovich product, ℓ > 0, the initial condition u₀(x) is a periodic function with period (0,ℓ)², and the σ^i_k(x) are divergence-free periodic functions with period (0,ℓ)²:

$$\operatorname{div} \sigma_k = 0. \qquad (6.4.2)$$



In (6.4.1), we take a combination of such σ_k(x) so that the corresponding spatial covariance C is symmetric and stationary:

$$C(x-y) = \sum_{k=1}^{\infty} \lambda_k\, \sigma_k(x)\, \sigma_k^{\top}(y),$$

where λ_k are some nonnegative numbers. Namely, we consider

$$C(x-y) = \sum_{k=1}^{\infty} \lambda_k\, C(x-y;\, n_k, m_k), \qquad (6.4.3)$$

where n_k, m_k is a sequence of positive integers, and

$$C(x-y;\, n,m) = \cos\!\big(2\pi\big(n[x_1-y_1] + m[x_2-y_2]\big)/\ell\big)\begin{bmatrix} m^2 & -nm \\ -nm & n^2 \end{bmatrix},$$

which can be decomposed as

$$C(x-y;\, n,m) = \cos\!\big(2\pi[nx_1+mx_2]/\ell\big)\begin{bmatrix}-m\\ n\end{bmatrix}\cos\!\big(2\pi[ny_1+my_2]/\ell\big)\begin{bmatrix}-m & n\end{bmatrix}$$
$$+\ \sin\!\big(2\pi[nx_1+mx_2]/\ell\big)\begin{bmatrix}-m\\ n\end{bmatrix}\sin\!\big(2\pi[ny_1+my_2]/\ell\big)\begin{bmatrix}-m & n\end{bmatrix}.$$

Hence, $\{\sigma_k(x)\}_{k\ge 1}$ in (6.4.1) is an appropriate combination of vector functions of the form

$$\cos\!\big(2\pi[nx_1+mx_2]/\ell\big)\begin{bmatrix}-m\\ n\end{bmatrix} \quad\text{and}\quad \sin\!\big(2\pi[nx_1+mx_2]/\ell\big)\begin{bmatrix}-m\\ n\end{bmatrix}.$$

We rewrite (6.4.1) in Ito form:

$$du(t) - \frac{1}{2}\sum_{i,j=1}^{d} C_{ij}(0)\, D_i D_j u\, dt + \sum_{k=1}^{\infty}\sum_{i=1}^{d} \sigma^i_k(x)\, D_i u\, dW_k(t) = 0, \qquad (6.4.4)$$
$$u(t, x_1+\ell, x_2) = u(t, x_1, x_2+\ell) = u(t,x), \qquad t>0,\; x \in (0,\ell)^2,$$
$$u(0,x) = u_0(x), \qquad x \in (0,\ell)^2.$$

We present numerical results of Algorithm 6.1.4 applied to (6.4.4) and numerical results of the Monte Carlo-type algorithm from [361] based on the method of characteristics. We aim at computing the L²-norm of the second moment of the SPDE solution

$$\big\|E[u^2(T,\cdot)]\big\|_{L^2} = \left[\int_{[0,\ell]^2}\big(E[u^2(T,x)]\big)^2\, dx\right]^{1/2}. \qquad (6.4.5)$$

We considered the particular case of (6.4.1), (6.4.3) with ℓ = 2π, the initial condition

$$u_0(x) = \sin(2x_1)\sin(x_2), \qquad (6.4.6)$$

and with two noise terms:



$$\sigma_1(x) = \cos(x_1+x_2)\begin{bmatrix}-1\\ 1\end{bmatrix}, \qquad \sigma_2(x) = \sin(x_1+x_2)\begin{bmatrix}-1\\ 1\end{bmatrix}, \qquad \sigma_k(x) = 0 \ \text{for}\ k > 2. \qquad (6.4.7)$$

This example satisfies the so-called commutativity condition (3.3.29). The error estimate (6.1.11) holds in this case and is confirmed in the tests; see Chapter 8 for error estimates in the case of a single noise, a special case of commutative noises.
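For the fields (6.4.7) one can verify div σ_k = 0, and also the cancellation Σ_k (∂σ_k/∂x)σ_k = 0 that appears below in (6.4.10), either by hand or numerically. A finite-difference spot check (step size and test point are our choices):

```python
import numpy as np

def sigma(k, x):
    """The two noise fields of (6.4.7); x = (x1, x2)."""
    f = np.cos if k == 1 else np.sin
    return f(x[0] + x[1]) * np.array([-1.0, 1.0])

def jacobian(k, x, h=1e-5):
    """Central-difference Jacobian d sigma_k / dx."""
    J = np.empty((2, 2))
    for j in range(2):
        e = np.zeros(2); e[j] = h
        J[:, j] = (sigma(k, x + e) - sigma(k, x - e)) / (2 * h)
    return J

x = np.array([0.7, 1.9])
div = [np.trace(jacobian(k, x)) for k in (1, 2)]            # div sigma_k
drift = sum(jacobian(k, x) @ sigma(k, x) for k in (1, 2))   # sum_k (d sigma_k/dx) sigma_k
print(div, drift)  # all entries close to zero
```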

In Algorithm 6.1.4, we solve the propagator (6.1.12) corresponding to the SPDE (6.4.4) using a fourth-order explicit Runge–Kutta scheme with step size δt in time and the Fourier spectral method with M modes in physical space.

The computational cost of Algorithm 6.1.4 is proportional to M⁴, but with an appropriate choice of basis functions this cost can be considerably reduced by exploiting the sparsity of the Wiener chaos expansion coefficients. In particular, the use of a Fourier basis for the problem (6.4.4) reduces the computational cost to being proportional to M². This significant reduction is based on the following observation. Since we consider a finite number of noises with periodic σ_k(x), and ϕ_α(Δ, x; e_l) is the solution of the propagator (6.1.12) with initial condition equal to a single basis function e_l(x), the function ϕ_α(Δ, x; e_l) is expandable in a finite number of periodic functions e_m(x), and this number does not depend on M. Hence, for fixed α and l the number of nonzero q_{α,l,m} = (ϕ_α(Δ, ·; e_l), e_m(·)) is finite and small. Therefore, the overall number of nonzero q_{α,l,m} is proportional to M instead of M².

This sparsity was confirmed in our tests. We use the above fact in our computer realization of Algorithm 6.1.4 and reduce the computational cost of obtaining a single entry of the matrix Q_{l,m} from the order of O(M²) to order O(1). Hence, the computational cost of Step 3 (and hence that of Algorithm 6.1.4) becomes proportional to M² instead of the original M⁴.

We need a reference solution, as we do not have an exact solution of the problem (6.4.1). To this end, the L²-norm of the second moment of the SPDE solution at T = 1 was computed by Algorithm 6.1.4 with parameters N = 2, n = 1, M = 900 (i.e., 30 basis functions in each space direction), δt = 1×10⁻⁵, and Δ = 1×10⁻⁴; the computed value is 1.57976 (5 d.p.). This result was also verified by the Monte Carlo-type method described below with Δt = 1×10⁻³, M_s = 10, and L = 8×10⁷, which gave 1.579777 ± 7.6×10⁻⁵, where ± reflects the Monte Carlo error only.

For Algorithm 6.1.4, we measure the error of computing the L²-norm of the second moment of the SPDE solution as follows:

$$\rho(T) = \big\|E[u^2_{\mathrm{ref}}(T,\cdot)]\big\|_{l^2} - \big\|E\big[(u^{M,\delta t}_{\Delta,N}(T,\cdot))^2\big]\big\|_{l^2},$$

where

$$\|v(\cdot)\|_{l^2} = \frac{\ell}{M_s}\Big(\sum_{i,j=1}^{M_s} v^2(x^1_i, x^2_j)\Big)^{1/2}, \qquad x^1_i = x^2_i = (i-1)\ell/M_s, \quad i = 1,\ldots,M_s,$$

and $E[u^2_{\mathrm{ref}}(T,\cdot)]$ is the reference solution computed as explained



above. The results demonstrating second-order convergence (see (6.1.11) and the discussion after Algorithm 6.1.4) are given in Table 6.7. Some control tests with δt = 1×10⁻⁵ and M = 1600 showed that the errors presented in this table are not essentially influenced by the errors caused by the choice of δt = 1×10⁻⁴ and the cut-off of the basis at M = 900.

Table 6.7. Performance of Algorithm 6.1.4 for the passive scalar equation (6.4.4). The parameters of Algorithm 6.1.4 are N = 2, n = 1, M = 900, δt = 1×10⁻⁴.

  Δ      0.05     0.02     0.01     0.005    0.0025
  ρ(1)   0.1539   0.0326   0.0089   0.0023   0.0006

6.4.1 A Monte Carlo method based on the method of characteristics

Let us now describe the Monte Carlo-type algorithm based on the method of characteristics. The solution u(t,x) of (6.4.1) has the following (conditional) probabilistic representation (see [314, 408]):

$$u(t,x) = u_0\big(X_{t,x}(0)\big), \quad \text{a.s.}, \qquad (6.4.8)$$

where X_{t,x}(s), 0 ≤ s ≤ t, is the solution of the system of (backward) characteristics

$$-dX = \sum_k \sigma_k(X)\,\overleftarrow{dW_k}(s), \qquad X(t) = x. \qquad (6.4.9)$$

Due to (6.4.2), it holds that

$$\sum_k \frac{\partial \sigma_k}{\partial x}\,\sigma_k = 0, \qquad (6.4.10)$$

and the phase flow of (6.4.9) preserves phase volume (see, e.g., [358, p. 247, Equation (5.5)]). We also recall that the Ito and Stratonovich forms of (6.4.9) coincide. As shown in [358], it is beneficial to approximate (6.4.9) using phase volume preserving schemes, e.g., by the midpoint method [358, Chapter 4], which for (6.4.9) takes the following form (here we exploit that the Ito and Stratonovich forms of (6.4.9) coincide): for an integer m ≥ 1,

$$\bar X_m = x, \qquad (6.4.11)$$
$$\bar X_l = \bar X_{l+1} + \sum_k \sigma_k\!\left(\frac{\bar X_l + \bar X_{l+1}}{2}\right)\big(\zeta^{\Delta t}_k\big)_l\,\sqrt{\Delta t}, \qquad l = m-1, \ldots, 0,$$

where $(\zeta^{\Delta t}_k)_l$ are, e.g., i.i.d. random variables with the law



$$\zeta^{\Delta t}_k = \begin{cases} \xi_k, & |\xi_k| \le A_{\Delta t},\\ A_{\Delta t}, & \xi_k > A_{\Delta t},\\ -A_{\Delta t}, & \xi_k < -A_{\Delta t}, \end{cases} \qquad (6.4.12)$$

where ξ_k are i.i.d. standard Gaussian random variables and $A_{\Delta t} = \sqrt{2c\,|\ln \Delta t|}$, c ≥ 1. The weak order of the scheme is equal to one [358]. To solve the two-dimensional nonlinear equation at each step, we used the fixed-point method with a tolerance of 10⁻¹³. In this example, two fixed-point iterations were sufficient to reach this accuracy.

Using $\bar X_{t,x}(0) = \bar X_0$ obtained by (6.4.11) with Δt = T/m, we simulate the L²-norm of the second moment of the SPDE solution as follows:

$$\big\|E[u^2(T,\cdot)]\big\|_{L^2} = \left(\int_{[0,\ell]^2}\big(E[u^2(T,x)]\big)^2\, dx\right)^{1/2} \approx \big\|E[u^2(T,\cdot)]\big\|_{l^2} \qquad (6.4.13)$$
$$= \frac{\ell}{M_s}\left[\sum_{i,j=1}^{M_s}\Big(E\big[u_0^2\big(X_{T,x^1_i,x^2_j}(0)\big)\big]\Big)^2\right]^{1/2} \approx \frac{\ell}{M_s}\left[\sum_{i,j=1}^{M_s}\Big(E\big[u_0^2\big(\bar X_{T,x^1_i,x^2_j}(0)\big)\big]\Big)^2\right]^{1/2}$$
$$\approx \frac{\ell}{M_s}\left[\sum_{i,j=1}^{M_s}\left[\frac{1}{L}\sum_{l=1}^{L} u_0^2\big(\bar X^{(l)}_{T,x^1_i,x^2_j}(0)\big)\right]^2\right]^{1/2},$$

where $x^1_i = x^2_i = (i-1)\ell/M_s$, $i = 1,\ldots,M_s$, and $\bar X^{(l)}_{T,x^1_i,x^2_j}(0)$ are independent realizations of the random variable $\bar X_{T,x^1_i,x^2_j}(0)$. The approximation in (6.4.13) has three errors: (i) the error of discretization of the integral over the space domain [0,ℓ]², which is negligible in our example even for M_s = 10; (ii) the error of numerical integration due to the replacement of $X_{T,x^1_i,x^2_j}(0)$ by $\bar X_{T,x^1_i,x^2_j}(0)$; (iii) the Monte Carlo error.

6.4.2 Comparison between recursive WCE and Monte Carlo methods

We compare Algorithm 6.1.4 and the Monte Carlo algorithm (6.4.13) by simulating the example (6.4.1), (6.4.6), (6.4.7) at T = 1. From Tables 6.8 and 6.9,¹ we observe effects similar to the one-dimensional case: for lower accuracy the Monte Carlo algorithm (6.4.13) outperforms Algorithm 6.1.4, while Algorithm 6.1.4 is more efficient for obtaining higher accuracy.

¹Matlab R2010b was used for each test on a single core of two Intel Xeon 5540 (2.53 GHz) quad-core Nehalem processors.



Table 6.8. Performance of Algorithm 6.1.4 (recursive WCE) on the passive scalar equation (6.4.4) at T = 1. The parameters of Algorithm 6.1.4 are N = 2, n = 1, M = 900, δt = 1×10⁻⁴.

  Δ        ρ(1)        CPU time
  1×10⁻²   8.89×10⁻³   3.7×10⁴ (sec.)
  1×10⁻³   1.20×10⁻⁴   3.2×10⁵ (sec.)
  5×10⁻⁴   3.73×10⁻⁵   1.8×10² (hours)

Algorithm 6.1.4 with the parameters N = 2, n = 1, Δ = 0.01, M = 900, δt = 1×10⁻⁴ produced the result with the error 9×10⁻³; it needed 3.7×10⁴ seconds of computer time. However, the Monte Carlo algorithm (6.4.11)–(6.4.13) required only 12 seconds of computer time to produce the same level of accuracy (combined numerical integration and statistical error), with Δt = 0.2, M_s = 10, and L = 2.5×10⁴.

Table 6.9. Performance of Algorithm (6.4.11)–(6.4.13) (a Monte Carlo method) for the passive scalar equation (6.4.4) at T = 1. The parameter is M = 100.

  Δt       L         Error                   CPU time
  2×10⁻¹   2.5×10⁴   4.68×10⁻³ ± 4.38×10⁻³   1.2×10¹ (sec.)
  1×10⁻²   4×10⁷     1.46×10⁻⁴ ± 1.08×10⁻⁴   3.5×10⁵ (sec.)
  1×10⁻³   4×10⁸     ∼10⁻⁵ ± 3.03×10⁻⁵       9.7×10³ (hours)²

If we require higher accuracy, then Algorithm 6.1.4 outperforms the Monte Carlo algorithm (6.4.11)–(6.4.13). Algorithm 6.1.4 needs approximately 180 hours of computer time to obtain the error 3×10⁻⁵, with the parameters N = 2, n = 1, Δ = 5×10⁻⁴, M = 900, δt = 1×10⁻⁴. To achieve the same level of accuracy, the Monte Carlo algorithm (6.4.11)–(6.4.13) requires approximately 9700 hours of computer time (estimated, not experimentally measured), with Δt = 0.001, M_s = 10, and L = 4×10⁸.

6.5 Summary and bibliographic notes

We have presented a multistage Wiener chaos expansion (WCE) method for advection-diffusion-reaction equations with multiplicative noise, and complemented this method by a deterministic algorithm for computing second moments of the SPDE solutions without any use of the Monte Carlo technique; see Algorithm 6.2.1.

²This is an estimated time according to the tests with smaller Δt, L and with M = 100.



• The numerical tests demonstrated that the WCE-based deterministic algorithm can be more efficient than Monte Carlo-type methods in obtaining results of higher accuracy, scaling as Δ^N, where Δ is the time step of the "online" integration and N is the order of Wiener chaos.

• For obtaining results of lower accuracy, Monte Carlo-type methods outperform the deterministic algorithm for computing moments, even in the one-dimensional case. The recursive WCE algorithm is conceptually different from Monte Carlo-type methods and thus can be used for independent verification of results obtained by Monte Carlo solvers.

• The efficiency of the algorithm can be greatly improved if it is combined with reduced-order methods, so that only a handful of modes is required to represent the solution accurately in physical space, i.e., a case with small M; see the discussion at the end of Chapter 6.1. We can also use the sparsity of the WCE solution, as in the numerical simulations of Chapter 6.4, to reduce the computational cost.

We have shown the efficiency of the recursive WCE for linear stochastic parabolic equations with nonrandom coefficients, where the WCE of linear equations leads to a lower triangular system of deterministic PDEs. When the coefficients are random, WCE leads to a fully coupled system of PDEs, and thus the computational convenience may be lost. In the next chapter, we will introduce stochastic collocation methods, which lead to decoupled systems of PDEs.

Bibliographic notes. The idea of the recursive WCE originates from [315] for the Zakai equation of nonlinear filtering, where the operators M_k in (3.3.23) are just zeroth-order differential operators. In Chapter 8, it will be shown that the technique for the convergence proof in the case of M_k being first-order differential operators differs from the one in [315]. The difference lies in the fact that first-order differential operators M_k require more regularity of the solution and stronger assumptions on the coefficients of M_k, even if only second-order convergence in random space is needed.

In this chapter, we require the solution to be square-integrable in the random space. In many cases this requirement cannot be satisfied, and we need to seek solutions in a weaker sense. WCE can still be applied in this situation, but the solutions live only in a weighted space; see, e.g., [34, 318, 319, 332, 384, 459, 469]. The WCE methods in these papers are all associated with the Ito product between two random fields, which can be seen as a zeroth-order approximation of the classical product; see [346, 462, 470] and also Chapter 11.

Algorithm 6.1.1 coincides with the algorithm proposed in [315] for (3.3.23) in the case of b^i_k(t,x) = 0, c = 0, g_k = 0, and a finite number of noises, but generalizes it to a wider class of linear SPDEs of the form (3.3.23). In particular, the algorithm from [315] was applied to the nonlinear filtering problem for hidden Markov models in the case of independent noises in signal



and observation, while Algorithm 6.1.1 is also applicable when the noises in signal and observation are dependent.

Algorithm 6.1.1 allows us to simulate mean-square approximations of the solution to the SPDE (3.3.23). It can also be used together with the Monte Carlo technique for computing expectations of functionals of the solution to (3.3.23).

The recursive polynomial chaos has also been applied to linear parabolic equations driven by Gaussian color noise [321] (WCE) and Poisson noise [303] (generalized polynomial chaos).

It is possible to reduce the variance of the estimator on the right-hand side of (6.4.13) using variance reduction techniques to reduce the Monte Carlo error; see, e.g., [358, 361] and the references therein. However, for complex stochastic problems it is usually rather difficult to reduce the variance efficiently. In this chapter, we consider only direct Monte Carlo methods without variance reduction and give a comparison of computational costs for the WCE-based algorithm and these Monte Carlo methods.

One can recognize that (6.3.9) is a Kubo oscillator. A number of numerical tests with symplectic and non-symplectic integrators are performed on a Kubo oscillator in [356, 358].

6.6 Suggested practice

Exercise 6.6.1 Solve the Kubo oscillator (6.3.9) with the backward Euler scheme and numerically check the convergence rate.

Hint. The exact solution is (3.3.18) when ε = 0. The numerical convergence rate should be close to 1/2 in the mean-square sense and 1 in the weak sense. The number of Monte Carlo sampling paths should be large enough to have the statistical error significantly smaller than the integration error.
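One possible starting point for this exercise (our sketch, with illustrative parameters): the drift-implicit backward Euler step for (6.3.9) is linear and can be solved in closed form, and the closed-form solution u_c(t) = sin(σW(t)), u_s(t) = cos(σW(t)) (which recovers sin(x + σW(t)) through (6.3.8), and can be checked by the Stratonovich chain rule) can be evaluated on the same Brownian paths.

```python
import numpy as np

def ms_error(sigma, T, dt, L, rng):
    """Mean-square error at time T of the drift-implicit (backward)
    Euler scheme for the Kubo oscillator (6.3.9) over L paths."""
    uc, us, W = np.zeros(L), np.ones(L), np.zeros(L)
    d = 1.0 + 0.5 * sigma**2 * dt          # implicit-drift factor
    for _ in range(int(round(T / dt))):
        dW = np.sqrt(dt) * rng.standard_normal(L)
        uc, us = (uc + sigma * us * dW) / d, (us - sigma * uc * dW) / d
        W += dW
    err2 = (uc - np.sin(sigma * W))**2 + (us - np.cos(sigma * W))**2
    return np.sqrt(np.mean(err2))

rng = np.random.default_rng(0)
e_coarse = ms_error(1.0, 1.0, 0.02, 4000, rng)
e_fine = ms_error(1.0, 1.0, 0.005, 4000, rng)
print(e_coarse, e_fine, np.log2(e_coarse / e_fine) / 2)  # observed rate, expected near 1/2
```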

Exercise 6.6.2 Consider the following linear stochastic differential equation:

$$\frac{dy}{dt} = \xi y, \qquad y(0) = y_0, \qquad (6.6.1)$$

where y₀ = 1 and ξ ∼ N(0,1). We can apply the multistage WCE to solve (6.6.1) in the following steps:

a) Write down the propagator, i.e., the equations for the WCE expansion coefficients y_α, α = 0, 1, …, of Equation (6.6.1).

b) Solve the propagator in a) on [0, Δ], where α = 0, 1, 2, …, M for a sufficiently large M, say 200. Use the explicit fourth-order Runge-Kutta scheme in time. Specifically, for the equation dx/dt = f(t, x), x(0) = x₀, the explicit fourth-order Runge-Kutta scheme reads

$$x_{n+1} = x_n + \frac{h}{6}\big(k_1 + 2k_2 + 2k_3 + k_4\big), \qquad n = 0, 1, 2, \ldots, N-1, \quad Nh = \Delta,$$

where h is the time step size and

$$k_1 = f(t_n, x_n), \qquad k_2 = f\Big(t_n + \tfrac{h}{2},\ x_n + \tfrac{h}{2}k_1\Big), \qquad k_3 = f\Big(t_n + \tfrac{h}{2},\ x_n + \tfrac{h}{2}k_2\Big), \qquad k_4 = f(t_n + h,\ x_n + h k_3).$$
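The scheme above can be packaged as a small reusable routine; here it is checked on dx/dt = x, whose exact solution at t = 1 is e (this helper is ours, not part of the exercise statement):

```python
import math

def rk4(f, x0, t0, T, h):
    """Explicit fourth-order Runge-Kutta for dx/dt = f(t, x) on [t0, T]."""
    x, t = x0, t0
    for _ in range(int(round((T - t0) / h))):
        k1 = f(t, x)
        k2 = f(t + h / 2, x + h / 2 * k1)
        k3 = f(t + h / 2, x + h / 2 * k2)
        k4 = f(t + h, x + h * k3)
        x += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return x

print(rk4(lambda t, x: x, 1.0, 0.0, 1.0, 0.01))  # close to e = 2.71828...
```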

c) Compute the numerical solution at larger t ∈ (Δ, T]. Denote by z^{(k)} the solution in b) to the following problem on [0, Δ]:

$$\frac{dz}{dt} = \xi z, \quad t \in [0,\Delta], \qquad z(0) = H_k(\xi), \quad k = 1, \cdots, N. \qquad (6.6.2)$$

Then $z^{(k)} \approx \sum_{m=0}^{M} z^{(k)}_m(t) H_m(\xi)$, t ∈ [0, Δ]. Suppose $y_0 = \sum_{k=0}^{N} y_k H_k(\xi)$. Then a numerical solution to the problem (6.6.1) is

$$y_{M,N}(t) = \sum_{k=0}^{N} y_k \left(\sum_{m=0}^{M} z^{(k)}_m(t) H_m(\xi)\right) = \sum_{m=0}^{M}\left(\sum_{k=0}^{N} y_k z^{(k)}_m(t)\right) H_m(\xi), \quad t \in [0,\Delta]. \qquad (6.6.3)$$

d) Compute the moments of the numerical solution y_{M,N}(t) for t = lΔ, l = 2, 3, ….

Use also a Monte Carlo method to compute moments of solutions approximately. Compare the computational time (for both low and high accuracy) with those for the above multistage WCE at t = 1 and t = 10.

Hint. The exact solution to (6.6.1) is y₀ exp(tξ), whose moments can be computed explicitly. For the comparison between multistage WCE and Monte Carlo methods, a similar conclusion as in this chapter is expected.
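For orientation, here is a single-stage version of parts a)-b) (a sketch under our conventions: normalized Hermite polynomials, so that ξH_m = √(m+1)H_{m+1} + √m H_{m−1} and the propagator is ẏ_m = √m y_{m−1} + √(m+1) y_{m+1}, a system coupled in both directions because the coefficient ξ is random; the truncation level M and step size h are illustrative). The zeroth coefficient approximates the mean E[y(t)] = e^{t²/2}.

```python
import numpy as np

M, h, T = 40, 1e-3, 1.0
sq = np.sqrt(np.arange(1, M + 1))      # sqrt(1), ..., sqrt(M)

def rhs(y):
    """Truncated propagator y_m' = sqrt(m) y_{m-1} + sqrt(m+1) y_{m+1}."""
    dy = np.zeros(M + 1)
    dy[1:] += sq * y[:-1]              # sqrt(m) * y_{m-1}
    dy[:-1] += sq * y[1:]              # sqrt(m+1) * y_{m+1}
    return dy

y = np.zeros(M + 1); y[0] = 1.0        # y(0) = 1 is deterministic
for _ in range(int(round(T / h))):     # classical RK4 in time
    k1 = rhs(y); k2 = rhs(y + h / 2 * k1)
    k3 = rhs(y + h / 2 * k2); k4 = rhs(y + h * k3)
    y = y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

print(y[0], np.exp(T**2 / 2))          # mean: E[y(1)] = e^{1/2}
print(np.sum(y**2), np.exp(2 * T**2))  # second moment: E[y^2(1)] = e^2
```

The exact coefficients are y_m(t) = e^{t²/2} t^m/√(m!), so the truncation error at M = 40 is negligible here.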


7 Stochastic collocation methods for differential equations with white noise

Stochastic collocation methods can lead to a fully decoupled system of PDEs, which can be readily implemented on parallel computers. However, stochastic collocation methods do not work well when longer time integration is required. Though these methods also suffer from the curse of dimensionality, we apply the recursive strategy for longer time integration discussed in Chapter 6 and investigate the error behavior of stochastic collocation methods in one-step approximation and short time integration.

Here, as an example, we consider one specific collocation method, namely Smolyak's sparse grid collocation; see, e.g., Chapter 2.5.4. If no confusion arises, we will still use the term stochastic collocation method (SCM) instead of sparse grid collocation method. We first analyze the error of Smolyak's sparse grid collocation used to evaluate expectations of functionals of solutions to stochastic differential equations discretized by the Euler scheme. We show theoretically and numerically that this algorithm can have satisfactory accuracy for a small magnitude of noise or relatively short integration time. However, it does not converge, in general, either with a decrease of the Euler scheme's time step size or with an increase of Smolyak's sparse grid level. Subsequently, we use this method as a building block for presenting a new algorithm that combines sparse grid collocation with a recursive procedure. This approach allows us to numerically integrate linear stochastic partial differential equations over longer times. This is illustrated in numerical tests on a stochastic advection-diffusion equation.

7.1 Introduction

In a number of applications from physics, financial engineering, biology, and chemistry it is of interest to compute expectations of some functionals of solutions of ordinary stochastic differential equations (SDE) and stochastic

© Springer International Publishing AG 2017
Z. Zhang, G.E. Karniadakis, Numerical Methods for Stochastic Partial Differential Equations with White Noise, Applied Mathematical Sciences 196, DOI 10.1007/978-3-319-57511-7_7




partial differential equations (SPDE) driven by white noise. Usually, evaluation of such expectations requires approximating solutions of stochastic equations and then computing the corresponding averages with respect to the approximate trajectories. The most commonly used approach for computing the averages is the Monte Carlo technique, which is known for its slow rate of convergence and hence limits the computational efficiency of stochastic simulations. To speed up computation of the averages, variance reduction techniques (see, e.g., [358, 361] and the references therein), quasi-Monte Carlo algorithms [376, 423], and the multilevel Monte Carlo method [156, 157] have been proposed and used.

In this chapter, we consider a sparse grid collocation method accompanied by time discretization of differential equations perturbed by time-dependent noise. We obtain an error estimate for the SCM in conjunction with the Euler scheme for evaluating expectations of smooth functionals of solutions of a scalar linear SDE with additive noise. In particular, we conclude that the SCM can work successfully for a small magnitude of noise and a relatively short integration time, while in general it converges neither by decreasing the time discretization step used for the SDE approximation nor by increasing the level of Smolyak's sparse grid; see Remark 7.2.4. Numerical tests in Section 7.2 confirm our theoretical conclusions, and we also observe first-order convergence in the time step size for the algorithm using the SCM, as long as the SCM error is small relative to the error of the time discretization of the SDE. We note that our conclusion is, to some extent, similar to that for cubatures on Wiener space [70], for the Wiener chaos method [225, 315, 316, 505], and for some other functional expansion approaches [53, 54].

To achieve accurate longer-time integration by numerical algorithms using the SCM, we exploit the idea of the recursive approach for a linear SPDE with time-independent coefficients presented in [315] and Chapter 6. The recursive approach works as follows. We first find an approximate solution of an SPDE at a relatively small time t = h, then take the approximation at t = h as the initial value in order to compute the approximate solution at t = 2h, and so on, until we reach the final integration time T = Nh. To obtain second moments of the SPDE solution, we store a covariance matrix of the approximate solution at each time step kh and recursively compute the first two moments. Such an algorithm is presented in Chapter 7.3; in Chapter 7.4 we demonstrate numerically that this algorithm converges in the time step h and works well on longer time intervals.

At the end of this chapter, we summarize and present a brief review of deterministic high-dimensional quadratures in random space, including some disadvantages of the sparse grid collocation method. A restarting strategy for long-time integration of nonlinear SODEs is also presented. Two exercises on applying stochastic collocation methods are provided.


7.2 Isotropic sparse grid for weak integration of SDE

We consider the application of the sparse grid rule (2.5.9) to the integral in (3.2.20). In this approach, the total error has two parts:

\[
|Ef(X(T)) - A(\mathrm{L},N)\varphi| \le \big|Ef(X(T)) - Ef(X_N)\big| + \big|Ef(X_N) - A(\mathrm{L},N)\varphi\big|,
\]

where A(L, N) is defined in (2.5.9) and ϕ is from (3.2.19). The first part is controlled by the time step size h (see (3.2.15)) and converges to zero with order one in h. The second part is controlled by the sparse grid level L, but it also depends on h, since decreasing h increases the dimension of the random space. Some illustrative examples are presented in Chapter 7.2.2.

7.2.1 Probabilistic interpretation of SCM

It is not difficult to show that the SCM admits a probabilistic interpretation; e.g., in the case of level L = 2 we have

\[
\begin{aligned}
A(2,N)\varphi(y_{1,1},\ldots,y_{r,1},\ldots,y_{1,N},\ldots,y_{r,N}) \qquad &(7.2.1)\\
= (Q_2\otimes Q_1\otimes\cdots\otimes Q_1)\varphi + (Q_1\otimes Q_2\otimes Q_1\otimes\cdots\otimes Q_1)\varphi
+ \cdots + (Q_1\otimes Q_1\otimes\cdots\otimes Q_2)\varphi
&- (Nr-1)\,(Q_1\otimes Q_1\otimes\cdots\otimes Q_1)\varphi\\
= \sum_{i=1}^{N}\sum_{j=1}^{r} E\varphi(0,\ldots,0,\zeta_{j,i},0,\ldots,0) - (Nr-1)\,\varphi(0,0,\ldots,0),&
\end{aligned}
\]

where ζ_{j,i} are i.i.d. random variables with the law (3.2.17). Using (3.2.21), (7.2.1), Taylor's expansion, and the symmetry of ζ_{j,i}, we obtain the relationship between the weak Euler scheme (3.2.16) and the SCM (2.5.9):

\[
\begin{aligned}
Ef(\bar{X}_N)-A(2,N)\varphi &= E\varphi(\zeta_{1,1},\ldots,\zeta_{r,1},\ldots,\zeta_{1,N},\ldots,\zeta_{r,N})
- \Big(\sum_{i=1}^{N}\sum_{j=1}^{r} E\varphi(0,\ldots,0,\zeta_{j,i},0,\ldots,0) - (Nr-1)\,\varphi(0,0,\ldots,0)\Big) \qquad (7.2.2)\\
&= \sum_{|\alpha|=4}\frac{4}{\alpha!}\, E\Big[\prod_{i=1}^{N}\prod_{j=1}^{r}(\zeta_{j,i})^{\alpha_{j,i}} \int_0^1 (1-z)^3 D^{\alpha}\varphi(z\zeta_{1,1},\ldots,z\zeta_{r,N})\,dz\Big]\\
&\quad - \frac{1}{3!}\sum_{i=1}^{N}\sum_{j=1}^{r} E\Big[\zeta_{j,i}^4 \int_0^1 (1-z)^3 \frac{\partial^4}{(\partial y_{j,i})^4}\varphi(0,\ldots,0,z\zeta_{j,i},0,\ldots,0)\,dz\Big],
\end{aligned}
\]

where the multi-index \(\alpha = (\alpha_{1,1},\ldots,\alpha_{r,N}) \in \mathbb{N}_0^{rN}\), \(|\alpha| = \sum_{i=1}^{N}\sum_{j=1}^{r}\alpha_{j,i}\), \(\alpha! = \prod_{i=1}^{N}\prod_{j=1}^{r}\alpha_{j,i}!\), and \(D^{\alpha} = \partial^{|\alpha|}/\big((\partial y_{1,1})^{\alpha_{1,1}}\cdots(\partial y_{r,N})^{\alpha_{r,N}}\big)\). The error of the SCM applied to the weak approximation of SDE is further studied in Chapter 7.2.2.
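The probabilistic form (7.2.1) is easy to check numerically. The sketch below (our illustration; the function names are ours) evaluates the level-2 rule through one-coordinate expectations with the discrete law P(ζ = ±1) = 1/2 from (3.2.17) and compares it with the full tensor-product expectation; for a polynomial ϕ of total degree at most 3 the two agree, consistent with the exactness of the level-2 rule, while they differ for a quartic.

```python
import itertools
import numpy as np

def smolyak_level2(phi, d):
    """Level-2 Smolyak rule in the probabilistic form (7.2.1), using the
    discrete law P(zeta = +/-1) = 1/2 of (3.2.17):
        A(2,d) phi = sum_k E phi(0,...,zeta_k,...,0) - (d-1) phi(0)."""
    val = -(d - 1) * phi(np.zeros(d))
    for k in range(d):
        e = np.zeros(d)
        e[k] = 1.0
        val += 0.5 * (phi(e) + phi(-e))   # one-coordinate expectation
    return val

def tensor_expectation(phi, d):
    """Full tensor-product expectation over all 2^d sign patterns."""
    pts = itertools.product((-1.0, 1.0), repeat=d)
    return sum(phi(np.asarray(s)) for s in pts) / 2.0**d

d = 6
rng = np.random.default_rng(1)
a, B = rng.normal(size=d), rng.normal(size=(d, d))
quad = lambda y: 1.0 + a @ y + y @ B @ y          # total degree 2
print(abs(smolyak_level2(quad, d) - tensor_expectation(quad, d)))  # ~ 0 up to round-off
```

The level-2 rule needs only 2d + 1 evaluations instead of 2^d, which is exactly why it is attractive when the dimension d = rN grows with the number of time steps.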


In the SCM context, it is beneficial to exploit higher-order or higher-accuracy schemes for approximating the SDE (3.1.4), because they allow us to reach a desired accuracy using larger time step sizes and therefore fewer random variables than the first-order Euler scheme (3.2.2) or (3.2.16). For example, we can use the second-order scheme (3.2.23). Roughly speaking, to achieve O(h) accuracy with (3.2.23), we need only \(\sqrt{2rN}\) (\(\sqrt{rN}\) in the case of additive noise) random variables, while we need rN random variables for the Euler scheme (3.2.2). This reduces the dimension of the random space and hence can increase the efficiency and applicability of the SCM (see, in particular, Example 7.4.1 in Chapter 7.4 for a numerical illustration). We note that when the noise intensity is relatively small, we can use high-accuracy low-order schemes designed for SDEs with small noise [357] (see also [358, Chapter 3]) in order to achieve a desired accuracy using fewer random variables than the Euler scheme (3.2.2).

Similarly, we can write down a probabilistic interpretation for any level L and derive a similar error representation. For example, for L = 3 we have

\[
\begin{aligned}
E[\varphi(\zeta^{(3)}_{1,1},\ldots,\zeta^{(3)}_{r,N})]-A(3,N)\varphi
&= \sum_{|\alpha|=6}\frac{6}{\alpha!}\, E\Big[\prod_{i=1}^{N}\prod_{j=1}^{r}(\zeta^{(3)}_{j,i})^{\alpha_{j,i}} \int_0^1 (1-z)^5 D^{\alpha}\varphi(z\zeta^{(3)}_{1,1},\ldots,z\zeta^{(3)}_{r,N})\,dz\Big]\\
&\quad - \sum_{\substack{|\alpha|=\alpha_{j,i}+\alpha_{l,k}=6\\ (j-l)^2+(i-k)^2\neq 0}} \frac{6}{\alpha_{j,i}!\,\alpha_{l,k}!}\, E\Big[(\zeta^{(3)}_{j,i})^{\alpha_{j,i}}(\zeta^{(3)}_{l,k})^{\alpha_{l,k}} \int_0^1 (1-z)^5 D^{\alpha}\varphi(\cdots,z\zeta^{(3)}_{j,i},0,\cdots,0,z\zeta^{(3)}_{l,k},\cdots)\,dz\Big]\\
&\quad - \sum_{i=1}^{N}\sum_{j=1}^{r} \frac{6}{6!}\, E\Big[(\zeta_{j,i})^{6} \int_0^1 (1-z)^5 D^{\alpha}\varphi(0,\cdots,z\zeta_{j,i},\cdots,0)\,dz\Big],
\end{aligned}
\]

where ζ_{j,i} are defined in (3.2.17) and ζ^{(3)}_{j,i} are i.i.d. random variables with the law \(P(\zeta^{(3)}_{j,i} = \pm\sqrt{3}) = 1/6\), \(P(\zeta^{(3)}_{j,i} = 0) = 2/3\).
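The laws of ζ and ζ^{(3)} above are exactly the node/weight patterns of the 2- and 3-point probabilists' Gauss-Hermite rules. A quick check (our sketch, using NumPy's `hermegauss`): the 2-point rule (nodes ±1, weights 1/2) already misses the Gaussian fourth moment, while the 3-point rule (nodes 0, ±√3 with weights 2/3, 1/6) reproduces it, in line with polynomial exactness 2n − 1.

```python
import numpy as np

def gh_expect(psi, n):
    """E[psi(xi)], xi ~ N(0,1), via the n-point probabilists' Gauss-Hermite
    rule (weight exp(-x^2/2)); exact for polynomials of degree <= 2n - 1."""
    x, w = np.polynomial.hermite_e.hermegauss(n)
    return (w @ psi(x)) / np.sqrt(2.0 * np.pi)

print(gh_expect(lambda x: x**4, 2))  # ~ 1.0: degree 4 exceeds exactness 2n-1 = 3
print(gh_expect(lambda x: x**4, 3))  # ~ 3.0: the true Gaussian fourth moment
```
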

7.2.2 Illustrative examples

In this section we show some limitations of the use of the SCM in the weak approximation of SDEs. To this end, it is convenient and sufficient to consider the scalar linear SDE

\[
dX = \lambda X\,dt + \varepsilon\,dW(t), \qquad X_0 = 1, \qquad (7.2.3)
\]

where λ and ε are some constants. We will compute the expectations Ef(X(T)) for some f(x) and X(t) from (7.2.3) by applying the Euler scheme (3.2.2) and the SCM (2.5.9). This simple example provides us with clear insight into when algorithms of this type are able to produce accurate results and when they are likely to fail. Using direct calculations, we first derive (see Examples 7.2.1-7.2.2 below) an estimate of the error |Ef(X_N) − A(2,N)ϕ|, with X_N from (3.2.2) applied to (7.2.3), for some particular f(x). This illustrates how the error of the SCM with a practical level (i.e., L ≤ 6) behaves. Then, in Proposition 7.2.3, we obtain an estimate of the error |Ef(X_N) − A(L,N)ϕ| for a smooth f(x) which grows no faster than a polynomial at infinity. We will observe that the considered algorithm is not convergent in the time step h, and that it is not convergent in the level L unless the noise intensity and the integration time are small.

It follows from (3.2.15) and (3.2.18) that

\[
|Ef(X_N)-A(\mathrm{L},N)\varphi| \le \big|Ef(\bar{X}_N)-A(\mathrm{L},N)\varphi\big| + \big|Ef(X_N)-Ef(\bar{X}_N)\big|
\le \big|Ef(\bar{X}_N)-A(\mathrm{L},N)\varphi\big| + Kh, \qquad (7.2.4)
\]

where \(\bar{X}_N\) is from the weak Euler scheme (3.2.16) applied to (7.2.3), which can be written as
\(\bar{X}_N = (1+\lambda h)^N + \sum_{j=1}^{N}(1+\lambda h)^{N-j}\varepsilon\sqrt{h}\,\zeta_j\). Introducing the function

\[
X(N;y) = (1+\lambda h)^N + \sum_{j=1}^{N}(1+\lambda h)^{N-j}\varepsilon\sqrt{h}\,y_j, \qquad (7.2.5)
\]

we see that \(\bar{X}_N = X(N;\zeta_1,\ldots,\zeta_N)\). We have

\[
\partial_{y_i} X(N;y) = (1+\lambda h)^{N-i}\varepsilon\sqrt{h} \quad\text{and}\quad \frac{\partial^2}{\partial y_i\,\partial y_j} X(N;y) = 0. \qquad (7.2.6)
\]

Then we obtain from (7.2.2):

\[
\begin{aligned}
R &:= Ef(\bar{X}_N) - A(2,N)\varphi \qquad (7.2.7)\\
&= \varepsilon^4 h^2 \sum_{|\alpha|=4}\frac{4}{\alpha!}\, E\Big[\prod_{i=1}^{N}\big(\zeta_i(1+\lambda h)^{N-i}\big)^{\alpha_i} \int_0^1 (1-z)^3 z^4 \frac{d^4}{dx^4} f\big(X(N;z\zeta_1,\ldots,z\zeta_N)\big)\,dz\Big]\\
&\quad - \frac{1}{3!}\varepsilon^4 h^2 \sum_{i=1}^{N} E\Big[\zeta_i^4 \int_0^1 (1-z)^3 z^4 \frac{d^4}{dx^4} f\big(X(N;0,\ldots,0,z\zeta_i,0,\ldots,0)\big)\,(1+\lambda h)^{4N-4i}\,dz\Big].
\end{aligned}
\]

Non-convergence in time step h

We illustrate the lack of convergence in h of the SCM for levels two and three through two examples, in which sharp estimates of the error \(|Ef(\bar{X}_N)-A(2,N)\varphi|\) are derived. Higher-level SCM can also be considered, but the conclusions do not change. In contrast, the algorithm of tensor-product integration in random space combined with the Euler scheme in time (i.e., the weak Euler scheme (3.2.16)-(3.2.17)) converges with order one in h. We also note that, in practice, the SCM is typically employed with level no more than six.

Example 7.2.1 For f(x) = x^p with p = 1, 2, 3, it follows from (7.2.7) that R = 0, i.e., the SCM does not introduce any additional error, and hence by (7.2.4)

\[
|Ef(X_N)-A(2,N)\varphi| \le Kh, \qquad f(x)=x^p,\quad p=1,2,3.
\]

For f(x) = x^4, we get from (7.2.7):

\[
R = \frac{6}{35}\varepsilon^4 h^2 \sum_{i=1}^{N}\sum_{j=i+1}^{N}(1+\lambda h)^{4N-2i-2j}
  = \frac{6}{35}\varepsilon^4 \times
  \begin{cases}
  \dfrac{(1+\lambda h)^{2N}-1}{\lambda^2(2+\lambda h)^2}\Big[\dfrac{(1+\lambda h)^{2N}+1}{1+(1+\lambda h)^2}-1\Big], & \lambda\neq 0,\ 1+\lambda h\neq 0,\\[2mm]
  \dfrac{T^2}{2}-\dfrac{Th}{2}, & \lambda=0.
  \end{cases}
\]

We see that R does not go to zero as h → 0 and that, for sufficiently small h > 0,

\[
|Ef(X_N)-A(2,N)\varphi| \le Kh + \frac{6}{35}\varepsilon^4 \times
\begin{cases}
\dfrac{1}{\lambda^2}\big(1+e^{4T\lambda}\big), & \lambda\neq 0,\\[1mm]
\dfrac{T^2}{2}, & \lambda=0.
\end{cases}
\]

We observe that the SCM algorithm does not converge as h → 0 for higher moments. In the considered case of a linear SDE, increasing the level L of the SCM makes the SCM error R vanish for higher moments; e.g., for L = 3 the error R = 0 for moments up to the 5th, but the algorithm will not converge in h for the 6th moment, and so on (see Proposition 7.2.3 below). Further (see the continuation of the illustration below), in the case of, e.g., f(x) = cos x, this error R does not converge in h for any L, which is also the case for nonlinear SDEs. We also note that one can expect this error R to be small when the noise intensity is relatively small and either the time T is small or the SDE has, in some sense, stable behavior (in the linear case this corresponds to λ < 0).
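This plateau can be reproduced in a few lines. The sketch below (our own independent check, not the book's derivation) evaluates both Ef(X̄_N) for f(x) = x⁴ under the discrete law ζ = ±1 and the level-2 value A(2,N)ϕ in closed form, using E[ζ²] = E[ζ⁴] = 1; as h = T/N decreases, the gap between them levels off at an O(ε⁴) value instead of vanishing.

```python
import numpy as np

def moments_setup(lam, eps, T, N):
    """Mean and squared noise weights of X(N; y) in (7.2.5)."""
    h = T / N
    m = (1.0 + lam * h) ** N
    c2 = (eps**2 * h) * (1.0 + lam * h) ** (2 * np.arange(N - 1, -1, -1))
    return m, c2

def exact_Ef4(lam, eps, T, N):
    """E[Xbar_N^4] for the weak Euler scheme with zeta = +/-1."""
    m, c2 = moments_setup(lam, eps, T, N)
    S2, S4 = c2.sum(), (c2**2).sum()
    return m**4 + 6 * m**2 * S2 + 3 * S2**2 - 2 * S4

def smolyak2_f4(lam, eps, T, N):
    """A(2,N)phi for phi(y) = X(N;y)^4, via the probabilistic form (7.2.1)."""
    m, c2 = moments_setup(lam, eps, T, N)
    S2, S4 = c2.sum(), (c2**2).sum()
    # sum_j E(m + c_j zeta)^4 - (N-1) m^4, odd moments of zeta vanishing
    return m**4 + 6 * m**2 * S2 + S4

for N in (8, 16, 32, 64):
    print(N, abs(exact_Ef4(-1.0, 0.3, 1.0, N) - smolyak2_f4(-1.0, 0.3, 1.0, N)))
# the printed gap stays at roughly the same size for all N: no convergence in h
```
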

Example 7.2.2 Now consider f(x) = cos(x). It follows from (7.2.7) that

\[
\begin{aligned}
R &= \varepsilon^4 h^2 \sum_{|\alpha|=4}\frac{4}{\alpha!}\, E\Big[\prod_{i=1}^{N}\big(\zeta_i(1+\lambda h)^{N-i}\big)^{\alpha_i} \int_0^1 (1-z)^3 z^4 \cos\Big((1+\lambda h)^N + z\sum_{j=1}^{N}(1+\lambda h)^{N-j}\varepsilon\sqrt{h}\,\zeta_j\Big)\,dz\Big]\\
&\quad - \frac{1}{3!}\varepsilon^4 h^2 \sum_{i=1}^{N}(1+\lambda h)^{4N-4i}\int_0^1 (1-z)^3 z^4\, E\Big[\zeta_i^4 \cos\big((1+\lambda h)^N + z(1+\lambda h)^{N-i}\varepsilon\sqrt{h}\,\zeta_i\big)\Big]\,dz
\end{aligned}
\]


and after routine calculations we obtain

\[
\begin{aligned}
R = \varepsilon^4 h^2 \cos\big((1+\lambda h)^N\big)\Bigg[
&\Big(\frac{1}{6}\sum_{i=1}^{N}(1+\lambda h)^{4N-4i} + 2\sum_{i=1}^{N}\sum_{j=i+1}^{N}(1+\lambda h)^{4N-2i-2j}\Big)
\int_0^1 (1-z)^3 z^4 \prod_{l=1}^{N}\cos\big(z(1+\lambda h)^{N-l}\varepsilon\sqrt{h}\big)\,dz\\
&+\Bigg(\frac{2}{3}\sum_{\substack{i,j=1\\ i\neq j}}^{N}(1+\lambda h)^{4N-3i-j}
+ 2\sum_{\substack{k,i,j=1\\ i\neq j,\ i\neq k,\ k\neq j}}^{N}(1+\lambda h)^{4N-2k-i-j}\Bigg)\\
&\qquad\times\int_0^1 (1-z)^3 z^4 \prod_{l=i,j}\sin\big(z(1+\lambda h)^{N-l}\varepsilon\sqrt{h}\big)
\prod_{\substack{l=1\\ l\neq i,\ l\neq j}}^{N}\cos\big(z(1+\lambda h)^{N-l}\varepsilon\sqrt{h}\big)\,dz\\
&+ 4\sum_{\substack{i,j,k,m=1\\ i,j,k,m\ \text{distinct}}}^{N}(1+\lambda h)^{4N-i-j-k-m}
\int_0^1 (1-z)^3 z^4 \prod_{l=i,j,k,m}\sin\big(z(1+\lambda h)^{N-l}\varepsilon\sqrt{h}\big)
\prod_{\substack{l=1\\ l\notin\{i,j,k,m\}}}^{N}\cos\big(z(1+\lambda h)^{N-l}\varepsilon\sqrt{h}\big)\,dz\\
&- \frac{1}{6}\sum_{i=1}^{N}(1+\lambda h)^{4N-4i}\int_0^1 (1-z)^3 z^4 \cos\big(z(1+\lambda h)^{N-i}\varepsilon\sqrt{h}\big)\,dz\Bigg].
\end{aligned}
\]

It is not difficult to see that R does not go to zero as h → 0. In fact, taking into account that \(|\sin(z(1+\lambda h)^{N-j}\varepsilon\sqrt{h})| \le z(1+\lambda h)^{N-j}\varepsilon\sqrt{h}\), and that there are N^4 terms of order h^4 and N^3 terms of order h^3, we get for sufficiently small h > 0

\[
|R| \le C\varepsilon^4\big(1+e^{4T\lambda}\big),
\]

where C > 0 is independent of ε and h. Hence

\[
|Ef(X_N)-A(2,N)\varphi| \le C\varepsilon^4\big(1+e^{4T\lambda}\big) + Kh, \qquad (7.2.8)
\]

and we arrive at a similar conclusion for f(x) = cos x as for f(x) = x^4. Similarly, for L = 3 we have

\[
|Ef(X_N)-A(3,N)\varphi| \le C\varepsilon^6\big(1+e^{6T\lambda}\big) + Kh.
\]

This example shows that for L = 3 the error of the SCM with the Euler scheme in time also does not converge in h.

Error estimate for SCM with fixed level

Now we address the effect of the SCM level L. To this end, we need the following error estimate for Gauss-Hermite quadratures. Let ψ(y), y ∈ R, be a sufficiently smooth function which, together with its derivatives, grows no faster than a polynomial at infinity. Using the Peano kernel theorem (see, e.g., [96]) and the fact that a Gauss-Hermite quadrature with n nodes has polynomial exactness of order 2n − 1, we obtain for the approximation error R_{n,γ}ψ of the Gauss-Hermite quadrature Q_nψ:

\[
R_{n,\gamma}(\psi) := Q_n\psi - I_1\psi = \int_{\mathbb{R}} \frac{d^{\gamma}}{dy^{\gamma}}\psi(y)\, R_{n,\gamma}(\Gamma_{y,\gamma})\,dy, \qquad 1\le\gamma\le 2n, \qquad (7.2.9)
\]

where \(\Gamma_{y,\gamma}(z) = (z-y)^{\gamma-1}/(\gamma-1)!\) if z ≥ y and 0 otherwise. One can show (see, e.g., [339, Theorem 2]) that there is a constant c > 0, independent of n and y, such that for any 0 < β < 1

\[
|R_{n,\gamma}(\Gamma_{y,\gamma})| \le \frac{c}{\sqrt{2\pi}}\, n^{-\gamma/2}\exp\Big(-\frac{\beta y^2}{2}\Big), \qquad 1\le\gamma\le 2n. \qquad (7.2.10)
\]

We also note that (7.2.10) and the triangle inequality imply, for 1 ≤ γ ≤ 2(n−1),

\[
|R_{n,\gamma}(\Gamma_{y,\gamma}) - R_{n-1,\gamma}(\Gamma_{y,\gamma})| \le \frac{c}{\sqrt{2\pi}}\big[n^{-\gamma/2} + (n-1)^{-\gamma/2}\big]\exp\Big(-\frac{\beta y^2}{2}\Big). \qquad (7.2.11)
\]

Now we consider the error of the sparse grid rule (2.5.9) combined with the Euler scheme (3.2.2) for computing expectations of solutions to (7.2.3).

Proposition 7.2.3 Assume that a function f(x) and its derivatives up to 2L-th order satisfy the polynomial growth condition (3.2.14). Let X_N be obtained by the Euler scheme (3.2.2) applied to the linear SDE (7.2.3), and let A(L,N)ϕ be the sparse grid rule (2.5.9) with level L applied to the integral corresponding to Ef(X_N) as in (3.2.20). Then for any L and sufficiently small h > 0

\[
|Ef(X_N)-A(\mathrm{L},N)\varphi| \le K\varepsilon^{2\mathrm{L}}\big(1+e^{\lambda(2\mathrm{L}+\varkappa)T}\big)\big(1+(3c/2)^{\mathrm{L}\wedge N}\big)\,\beta^{-(\mathrm{L}\wedge N)/2}\, T^{\mathrm{L}}, \qquad (7.2.12)
\]

where K > 0 is independent of h, L, and N; c and β are from (7.2.10); and κ is from (3.2.14).

Proof. We recall (see (3.2.20)) that

\[
Ef(X_N) = I_N\varphi = \frac{1}{(2\pi)^{N/2}}\int_{\mathbb{R}^{N}} \varphi(y_1,\ldots,y_N)\exp\Big(-\frac{1}{2}\sum_{i=1}^{N}y_i^2\Big)\,dy.
\]

Introduce the integrals

\[
I_1^{(k)}\varphi = \frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}} \varphi(y_1,\ldots,y_k,\ldots,y_N)\exp\Big(-\frac{y_k^2}{2}\Big)\,dy_k, \qquad k=1,\ldots,N, \qquad (7.2.13)
\]

and their approximations \(Q_n^{(k)}\) by the corresponding one-dimensional Gauss-Hermite quadratures with n nodes. Also, let \(U_{i_k}^{(k)} = Q_{i_k}^{(k)} - Q_{i_k-1}^{(k)}\).


Using (2.5.9) and the recipe from the proof of Lemma 3.4 in [379], we obtain

\[
I_N\varphi - A(\mathrm{L},N)\varphi = \sum_{l=2}^{N} S(\mathrm{L},l)\otimes_{k=l+1}^{N} I_1^{(k)}\varphi + \big(I_1^{(1)}-Q_{\mathrm{L}}^{(1)}\big)\otimes_{k=2}^{N} I_1^{(k)}\varphi, \qquad (7.2.14)
\]

where

\[
S(\mathrm{L},l) = \sum_{i_1+\cdots+i_{l-1}+i_l = \mathrm{L}+l-1} \otimes_{k=1}^{l-1} U_{i_k}^{(k)} \otimes \big(I_1^{(l)}-Q_{i_l}^{(l)}\big). \qquad (7.2.15)
\]

Due to (7.2.9), we have for n > 1 and 1 ≤ γ ≤ 2(n−1)

\[
U_n\psi = Q_n\psi - Q_{n-1}\psi = [Q_n\psi - I_1\psi] - [Q_{n-1}\psi - I_1\psi]
= \int_{\mathbb{R}} \frac{d^{\gamma}}{dy^{\gamma}}\psi(y)\big[R_{n,\gamma}(\Gamma_{y,\gamma}) - R_{n-1,\gamma}(\Gamma_{y,\gamma})\big]\,dy, \qquad (7.2.16)
\]

and for n = 1

\[
U_1\psi = Q_1\psi - Q_0\psi = Q_1\psi = \psi(0). \qquad (7.2.17)
\]

By (7.2.15), (7.2.13), and (7.2.9), we obtain for the first term on the right-hand side of (7.2.14):

\[
\begin{aligned}
S(\mathrm{L},l)\otimes_{n=l+1}^{N} I_1^{(n)}\varphi
&= \sum_{i_1+\cdots+i_l=\mathrm{L}+l-1} \otimes_{k=1}^{l-1} U_{i_k}^{(k)} \otimes \big(I_1^{(l)}-Q_{i_l}^{(l)}\big)
\otimes \int_{\mathbb{R}^{N-l}} \varphi(y)\,\frac{1}{(2\pi)^{(N-l)/2}}\exp\Big(-\sum_{k=l+1}^{N}\frac{y_k^2}{2}\Big)\,dy_{l+1}\ldots dy_N\\
&= -\sum_{i_1+\cdots+i_l=\mathrm{L}+l-1} \otimes_{k=1}^{l-1} U_{i_k}^{(k)}
\otimes \int_{\mathbb{R}^{N-l+1}} \frac{d^{2i_l}}{dy_l^{2i_l}}\varphi(y)\, R_{i_l,2i_l}(\Gamma_{y_l,2i_l})
\,\frac{1}{(2\pi)^{(N-l)/2}}\exp\Big(-\sum_{k=l+1}^{N}\frac{y_k^2}{2}\Big)\,dy_l\ldots dy_N.
\end{aligned}
\]

Now consider two cases. If \(i_{l-1} > 1\), then by (7.2.16):

\[
\begin{aligned}
S(\mathrm{L},l)\otimes_{n=l+1}^{N} I_1^{(n)}\varphi
= -\sum_{i_1+\cdots+i_l=\mathrm{L}+l-1} \otimes_{k=1}^{l-2} U_{i_k}^{(k)}
\otimes \int_{\mathbb{R}^{N-l+2}} &\frac{d^{2i_{l-1}-2}}{dy_{l-1}^{2i_{l-1}-2}}\,\frac{d^{2i_l}}{dy_l^{2i_l}}\varphi(y)\, R_{i_l,2i_l}(\Gamma_{y_l,2i_l})\\
&\times \big[R_{i_{l-1},2i_{l-1}-2}(\Gamma_{y_{l-1},2i_{l-1}-2}) - R_{i_{l-1}-1,2i_{l-1}-2}(\Gamma_{y_{l-1},2i_{l-1}-2})\big]\\
&\times \frac{1}{(2\pi)^{(N-l)/2}}\exp\Big(-\sum_{k=l+1}^{N}\frac{y_k^2}{2}\Big)\,dy_{l-1}\ldots dy_N;
\end{aligned}
\]


otherwise (i.e., if \(i_{l-1} = 1\)), by (7.2.17):

\[
S(\mathrm{L},l)\otimes_{n=l+1}^{N} I_1^{(n)}\varphi
= -\sum_{i_1+\cdots+i_l=\mathrm{L}+l-1} \otimes_{k=1}^{l-2} U_{i_k}^{(k)}
\otimes \int_{\mathbb{R}^{N-l+1}} Q_1^{(l-1)}\frac{d^{2i_l}}{dy_l^{2i_l}}\varphi(y)\, R_{i_l,2i_l}(\Gamma_{y_l,2i_l})
\,\frac{1}{(2\pi)^{(N-l)/2}}\exp\Big(-\sum_{k=l+1}^{N}\frac{y_k^2}{2}\Big)\,dy_l\ldots dy_N.
\]

Repeating the above process for \(i_{l-2},\ldots,i_1\), we obtain

\[
\begin{aligned}
S(\mathrm{L},l)\otimes_{n=l+1}^{N} I_1^{(n)}\varphi
= \sum_{i_1+\cdots+i_l=\mathrm{L}+l-1} \int_{\mathbb{R}^{N-\#F_{l-1}}} &\big[\otimes_{m\in F_{l-1}} Q_1^{(m)} D^{2\alpha_l}\varphi(y)\big]\, R_{l,\alpha_l}(y_1,\ldots,y_l) \qquad (7.2.18)\\
&\times \frac{1}{(2\pi)^{(N-l)/2}}\exp\Big(-\sum_{k=l+1}^{N}\frac{y_k^2}{2}\Big)\prod_{n\in G_{l-1}} dy_n \times dy_l\ldots dy_N,
\end{aligned}
\]

where the multi-index \(\alpha_l = (i_1-1,\ldots,i_{l-1}-1,\,i_l,\,0,\ldots,0)\) has m-th element \(\alpha_l^m\); the sets \(F_{l-1} = F_{l-1}(\alpha_l) = \{m : \alpha_l^m = 0,\ m=1,\ldots,l-1\}\) and \(G_{l-1} = G_{l-1}(\alpha_l) = \{m : \alpha_l^m > 0,\ m=1,\ldots,l-1\}\); the symbols \(\#F_{l-1}\) and \(\#G_{l-1}\) stand for the number of elements in the corresponding sets; and

\[
R_{l,\alpha_l}(y_1,\ldots,y_l) = -R_{i_l,2i_l}(\Gamma_{y_l,2i_l}) \otimes_{n\in G_{l-1}} \big[R_{i_n,2i_n-2}(\Gamma_{y_n,2i_n-2}) - R_{i_n-1,2i_n-2}(\Gamma_{y_n,2i_n-2})\big].
\]

Note that \(\#G_{l-1} \le (\mathrm{L}-1)\wedge(l-1)\), and recall that \(i_j \ge 1\), \(j=1,\ldots,l\). Using (7.2.10), (7.2.11) and the inequality

\[
\prod_{n\in G_{l-1}} \big[i_n^{-(i_n-1)} + (i_n-1)^{-(i_n-1)}\big]\, i_l^{-i_l} \le (3/2)^{\#G_{l-1}},
\]

we get

\[
\begin{aligned}
|R_{l,\alpha_l}(y_1,\ldots,y_l)| &\le \prod_{n\in G_{l-1}} \big[i_n^{-(i_n-1)} + (i_n-1)^{-(i_n-1)}\big]\, i_l^{-i_l}\,
\frac{c^{\#G_{l-1}+1}}{(2\pi)^{(\#G_{l-1}+1)/2}} \exp\Big(-\sum_{n\in G_{l-1}}\frac{\beta y_n^2}{2} - \frac{\beta y_l^2}{2}\Big) \qquad (7.2.19)\\
&\le \frac{(3c/2)^{\#G_{l-1}+1}}{(2\pi)^{(\#G_{l-1}+1)/2}} \exp\Big(-\sum_{n\in G_{l-1}}\frac{\beta y_n^2}{2} - \frac{\beta y_l^2}{2}\Big).
\end{aligned}
\]


Substituting (7.2.19) in (7.2.18), we arrive at

\[
\begin{aligned}
\Big|S(\mathrm{L},l)\otimes_{n=l+1}^{N} I_1^{(n)}\varphi\Big|
&\le \sum_{i_1+\cdots+i_l=\mathrm{L}+l-1} \frac{(3c/2)^{\#G_{l-1}+1}}{(2\pi)^{(N-\#F_{l-1})/2}}
\int_{\mathbb{R}^{N-\#F_{l-1}}} \Big|\otimes_{m\in F_{l-1}} Q_1^{(m)} D^{2\alpha_l}\varphi(y)\Big| \qquad (7.2.20)\\
&\qquad\times \exp\Big(-\sum_{n\in G_{l-1}}\frac{\beta y_n^2}{2} - \frac{\beta y_l^2}{2} - \sum_{k=l+1}^{N}\frac{y_k^2}{2}\Big)\prod_{n\in G_{l-1}} dy_n \times dy_l\ldots dy_N.
\end{aligned}
\]

Using (7.2.6) and the assumption that \(\big|\frac{d^{2\mathrm{L}}}{dx^{2\mathrm{L}}} f(x)\big| \le K(1+|x|^{\varkappa})\) for some K > 0 and κ ≥ 1, we get

\[
\begin{aligned}
\big|D^{2\alpha_l}\varphi(y)\big| &= \varepsilon^{2\mathrm{L}} h^{\mathrm{L}} \Big|\frac{d^{2\mathrm{L}}}{dx^{2\mathrm{L}}} f\big(X(N;y)\big)\Big|\, (1+\lambda h)^{2\mathrm{L}N - 2\sum_{i=1}^{l} i\alpha_l^i} \qquad (7.2.21)\\
&\le K\varepsilon^{2\mathrm{L}} h^{\mathrm{L}} (1+\lambda h)^{2\mathrm{L}N - 2\sum_{i=1}^{l} i\alpha_l^i}\big(1+|X(N;y)|^{\varkappa}\big).
\end{aligned}
\]

Substituting (7.2.21) and (7.2.5) in (7.2.20) and doing further calculations, we obtain

\[
\begin{aligned}
\Big|S(\mathrm{L},l)\otimes_{n=l+1}^{N} I_1^{(n)}\varphi\Big|
&\le K\varepsilon^{2\mathrm{L}} h^{\mathrm{L}} \big(1+e^{\lambda\varkappa T}\big)\big(1+(3c/2)^{\mathrm{L}\wedge l}\big)\beta^{-(\mathrm{L}\wedge l)/2}
\sum_{i_1+\cdots+i_l=\mathrm{L}+l-1} (1+\lambda h)^{2\mathrm{L}N-2\sum_{i=1}^{l} i\alpha_l^i}\\
&\le K\varepsilon^{2\mathrm{L}} h^{\mathrm{L}} \big(1+e^{\lambda(2\mathrm{L}+\varkappa)T}\big)\big(1+(3c/2)^{\mathrm{L}\wedge l}\big)\beta^{-(\mathrm{L}\wedge l)/2}\binom{\mathrm{L}+l-2}{\mathrm{L}-1} \qquad (7.2.22)\\
&\le K\varepsilon^{2\mathrm{L}} h^{\mathrm{L}} \big(1+e^{\lambda(2\mathrm{L}+\varkappa)T}\big)\big(1+(3c/2)^{\mathrm{L}\wedge l}\big)\beta^{-(\mathrm{L}\wedge l)/2}\, l^{\mathrm{L}-1},
\end{aligned}
\]

with a new K > 0 which does not depend on h, ε, L, c, β, and l. In the last line of (7.2.22) we used

\[
\binom{\mathrm{L}+l-2}{\mathrm{L}-1} = \prod_{i=1}^{\mathrm{L}-1}\Big(1+\frac{l-1}{i}\Big)
\le \Big[\frac{1}{\mathrm{L}-1}\sum_{i=1}^{\mathrm{L}-1}\Big(1+\frac{l-1}{i}\Big)\Big]^{\mathrm{L}-1} \le l^{\mathrm{L}-1}.
\]

Substituting (7.2.22) in (7.2.14) and observing that \(\big|(I_1^{(1)}-Q_{\mathrm{L}}^{(1)})\otimes_{k=2}^{N} I_1^{(k)}\varphi\big|\) is of order \(O(h^{\mathrm{L}})\), we arrive at (7.2.12).

Remark 7.2.4 By Examples 7.2.1 and 7.2.2, the error estimate (7.2.12) proved in Proposition 7.2.3 is quite sharp, and we conclude that, in general, the SCM algorithm for the weak approximation of SDEs converges neither by decreasing the time step h nor by increasing the level L. At the same time, the algorithm is convergent in L (when L ≤ N) if ε²T is sufficiently small and the SDE has some stable behavior (e.g., λ ≤ 0). Furthermore, the algorithm is sufficiently accurate when the noise intensity ε and the integration time T are relatively small.

Remark 7.2.5 It follows from the proof (see (7.2.21)) that if \(\frac{d^{2\mathrm{L}}}{dx^{2\mathrm{L}}} f(x) = 0\), then the error \(I_N(\varphi)-A(\mathrm{L},N)\varphi = 0\). We emphasize that this is a feature of the linear SDE (7.2.3), thanks to (7.2.6); in the case of a nonlinear SDE this error remains of the form (7.2.12) even if the 2L-th derivative of f is zero. See also the discussion at the end of Example 7.2.1 and the numerical tests in Example 7.4.1.

7.3 Recursive collocation algorithm for linear SPDEs

In the previous section we demonstrated the limitations of SCM algorithms in application to SDEs: in general, such an algorithm will not work unless the integration time T and the magnitude of noise are small. It is not difficult to understand that SCM algorithms have the same limitations in the case of SPDEs as well, which, in particular, is demonstrated in Example 7.4.2, where a stochastic Burgers equation is considered. To treat this deficiency and achieve longer time integration in the case of linear SPDEs, we exploit the idea of the recursive approach presented in Chapter 6 and in [315, 505] in the case of a Wiener chaos expansion method. To this end, we apply the SCM algorithm in conjunction with a time discretization of the SPDE over a small interval [(k − 1)h, kh], instead of the whole interval [0, T] as in the previous section, and build a recursive scheme to compute the second-order moments of solutions to linear SPDEs.

Consider the linear SPDE (3.3.23) with finite-dimensional noise. We continue to use the notation from the previous section: h is the step of a uniform discretization of the interval [0, T], N = T/h, and t_k = kh, k = 0, …, N. We apply the midpoint rule in time to the SPDE (3.3.23):

\[
\begin{aligned}
u^{k+1}(x) &= u^{k}(x) + h\Big[\mathcal{L} u^{k+1/2}(x) - \frac{1}{2}\sum_{l=1}^{r}\mathcal{M}_l g_l(x) + f(x)\Big] \qquad (7.3.1)\\
&\quad + \sum_{l=1}^{r}\big[\mathcal{M}_l u^{k+1/2}(x) + g_l(x)\big]\sqrt{h}\,(\xi_{lh})_{k+1}, \qquad x\in D,\\
u^{0}(x) &= u_0(x),
\end{aligned}
\]

where \(u^k(x)\) approximates \(u(t_k,x)\), \(u^{k+1/2} = (u^{k+1}+u^{k})/2\), and \((\xi_{lh})_k\) are i.i.d. random variables such that

\[
\xi_h = \begin{cases} \xi, & |\xi| \le A_h,\\ A_h, & \xi > A_h,\\ -A_h, & \xi < -A_h, \end{cases} \qquad (7.3.2)
\]


with \(\xi \sim \mathcal{N}(0,1)\) and \(A_h = \sqrt{2p|\ln h|}\), p ≥ 1. We note that the cut-off of the Gaussian random variables is needed in order to ensure that the implicitness of (7.3.1) does not lead to nonexistence of the second moment of \(u^k(x)\) [356, 358]. Based on standard numerical methods for SDEs [358], it is natural to expect that, under some regularity assumptions on the coefficients and the initial condition of (3.3.23), the approximation \(u^k(x)\) from (7.3.1) converges with order 1/2 in the mean-square sense and with order 1 in the weak sense; in the latter case one can use the discrete random variables \(\zeta_{l,k+1}\) from (3.2.17) instead of \((\xi_{lh})_{k+1}\) (see also, e.g., [109, 167, 261]), but we do not prove such a result here.
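The cut-off (7.3.2) is straightforward to implement; a minimal sketch (the function name is ours):

```python
import numpy as np

def xi_h(h, p=1, size=None, rng=None):
    """Sample the truncated Gaussians of (7.3.2): xi ~ N(0,1) clipped at
    A_h = sqrt(2p|ln h|), p >= 1, which keeps the second moment of the
    implicit scheme (7.3.1) finite."""
    rng = rng or np.random.default_rng()
    A_h = np.sqrt(2.0 * p * abs(np.log(h)))
    return np.clip(rng.standard_normal(size), -A_h, A_h)

samples = xi_h(h=0.01, size=10_000, rng=np.random.default_rng(0))
print(samples.max(), np.sqrt(2 * abs(np.log(0.01))))  # all samples lie within +/- A_h
```

Since A_h grows like \(\sqrt{|\ln h|}\), the clipping becomes negligible as h decreases, so the weak-order properties of the scheme are not affected.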

In the following, we denote by \(u_H^k(x;\phi(\cdot)) = u_H^k(x;\phi(\cdot);(\xi_{lh})_k,\ l=1,\ldots,r)\) the approximation (7.3.1) of the solution \(u(t_k,x)\) to the SPDE (3.3.23) with f(x) = 0 and g_l(x) = 0 for all l (homogeneous SPDE) and with the initial condition φ(·) prescribed at time \(t = t_{k-1}\); and by \(u_O^k(x) = u_O^k(x;(\xi_{lh})_k,\ l=1,\ldots,r)\) the approximation (7.3.1) of the solution \(u(t_k,x)\) to the SPDE (3.3.23) with the initial condition φ(x) = 0 prescribed at time \(t = t_{k-1}\). Note that \(u_O^k(x) = 0\) if f(x) = 0 and g_l(x) = 0 for all l.

Let \(\{e_i\} = \{e_i(x)\}_{i\ge 1}\) be a complete orthonormal system (CONS) in \(L^2(D)\) satisfying the boundary conditions, and let (·,·) be the inner product in that space. Then we can write

\[
u^{k-1}(x) = \sum_{i=1}^{\infty} c_i^{k-1} e_i(x) \qquad (7.3.3)
\]

with \(c_i^{k-1} = (u^{k-1}, e_i)\) and, due to the SPDE's linearity,

\[
u^{k}(x) = u^{k}_O(x) + \sum_{i=1}^{\infty} c_i^{k-1}\, u^{k}_H(x; e_i(\cdot)).
\]

We have

\[
c_l^0 = (u_0, e_l), \qquad c_l^k = q^k_{Ol} + \sum_{i=1}^{\infty} c_i^{k-1} q^k_{Hli}, \qquad l = 1,2,\ldots,\ k = 1,\ldots,N,
\]

where \(q^k_{Ol} = (u^k_O, e_l)\) and \(q^k_{Hli} = (u^k_H(\cdot;e_i), e_l(\cdot))\).

Using (7.3.3), we represent the second moment of the approximation \(u^k(x)\) from (7.3.1) of the solution \(u(t_k,x)\) to the SPDE (3.3.23) as follows:

\[
E[u^k(x)]^2 = \sum_{i,j=1}^{\infty} C^k_{ij}\, e_i(x)\, e_j(x), \qquad (7.3.4)
\]

where \(C^k_{ij} = E[c_i^k c_j^k]\) is the covariance matrix. Introducing also the means \(M_i^k\), one can obtain the following recurrence relations in k:


\[
\begin{aligned}
M_i^0 &= c_i^0 = (u_0, e_i), \qquad C^0_{ij} = c_i^0 c_j^0, \qquad (7.3.5)\\
M_i^k &= E[q^k_{Oi}] + \sum_{l=1}^{\infty} M_l^{k-1} E[q^k_{Hil}],\\
C^k_{ij} &= E[q^k_{Oi}q^k_{Oj}] + \sum_{l=1}^{\infty} M_l^{k-1}\big(E[q^k_{Oi}q^k_{Hjl}] + E[q^k_{Oj}q^k_{Hil}]\big)
+ \sum_{l,p=1}^{\infty} C^{k-1}_{lp}\, E[q^k_{Hil}q^k_{Hjp}],\\
&\qquad i,j = 1,2,\ldots,\quad k = 1,\ldots,N.
\end{aligned}
\]

Since the coefficients of the SPDE (3.3.23) are time-independent, none of the expectations involving the quantities \(q^k_{Oi}\) and \(q^k_{Hil}\) in (7.3.5) depend on k; hence it suffices to compute them just once, on the single step k = 1, and we get

\[
\begin{aligned}
M_i^0 &= c_i^0 = (u_0, e_i), \qquad C^0_{ij} = c_i^0 c_j^0, \qquad (7.3.6)\\
M_i^k &= E[q^1_{Oi}] + \sum_{l=1}^{\infty} M_l^{k-1} E[q^1_{Hil}],\\
C^k_{ij} &= E[q^1_{Oi}q^1_{Oj}] + \sum_{l=1}^{\infty} M_l^{k-1}\big(E[q^1_{Oi}q^1_{Hjl}] + E[q^1_{Oj}q^1_{Hil}]\big)
+ \sum_{l,p=1}^{\infty} C^{k-1}_{lp}\, E[q^1_{Hil}q^1_{Hjp}],\\
&\qquad i,j = 1,2,\ldots,\quad k = 1,\ldots,N.
\end{aligned}
\]

These expectations can be approximated by the quadrature rules from Chapter 2.5. If the number of noises r is small, it is natural to use the tensor product rule (2.5.8) with one-dimensional Gauss-Hermite quadratures of order n = 2 or 3 (note that when r = 1, we can use just a one-dimensional Gauss-Hermite quadrature of order n = 2 or 3). If the number of noises r is large, it might be beneficial to use the sparse grid quadrature (2.5.9) of level L = 2 or 3. More specifically,

\[
\begin{aligned}
E[q^1_{Oi}] &\doteq \sum_{p=1}^{\eta}\big(u^1_O(\cdot;Y_p),e_i(\cdot)\big)W_p, \qquad
E[q^1_{Hil}] \doteq \sum_{p=1}^{\eta}\big(u^1_H(\cdot;e_l;Y_p),e_i(\cdot)\big)W_p, \qquad (7.3.7)\\
E[q^1_{Oi}q^1_{Oj}] &\doteq \sum_{p=1}^{\eta}\big(u^1_O(\cdot;Y_p),e_i(\cdot)\big)\big(u^1_O(\cdot;Y_p),e_j(\cdot)\big)W_p,\\
E[q^1_{Oi}q^1_{Hjl}] &\doteq \sum_{p=1}^{\eta}\big(u^1_O(\cdot;Y_p),e_i(\cdot)\big)\big(u^1_H(\cdot;e_l;Y_p),e_j(\cdot)\big)W_p,\\
E[q^1_{Hil}q^1_{Hjk}] &\doteq \sum_{p=1}^{\eta}\big(u^1_H(\cdot;e_l;Y_p),e_i(\cdot)\big)\big(u^1_H(\cdot;e_k;Y_p),e_j(\cdot)\big)W_p,
\end{aligned}
\]

where \(Y_p \in \mathbb{R}^r\) are the nodes of the quadrature, \(W_p\) are the corresponding quadrature weights, and \(\eta = n^r\) in the case of the tensor product rule (2.5.8) with one-dimensional Gauss-Hermite quadratures of order n, or η is the total number of nodes #S used by the sparse grid quadrature (2.5.9) of level L. To find \(u^1_O(x;Y_p)\) and \(u^1_H(x;e_l;Y_p)\), we need to solve the corresponding elliptic PDE problems, which we do by using the spectral method in physical space, i.e., using a truncation \(\{e_l\}_{l=1}^{l_*}\) of the CONS to represent the numerical solution.

To summarize, we formulate the following deterministic recursive algorithm for the second-order moments of the solution to the SPDE (3.3.23).

Algorithm 7.3.1 Choose the algorithm's parameters: a complete orthonormal basis \(\{e_l(x)\}_{l\ge 1}\) in \(L^2(D)\) and its truncation \(\{e_l(x)\}_{l=1}^{l_*}\); a time step size h; and a quadrature rule (i.e., nodes \(Y_p\) and quadrature weights \(W_p\), p = 1,…,η).

Step 1. For each p = 1,…,η and l = 1,…,l_*, find approximations \(\bar{u}^1_O(x;Y_p) \approx u^1_O(x;Y_p)\) and \(\bar{u}^1_H(x;e_l;Y_p) \approx u^1_H(x;e_l;Y_p)\) using the spectral method in physical space.

Step 2. Using the quadrature rule, approximately find the expectations as in (7.3.7), but with the approximate \(\bar{u}^1_O(x;Y_p)\) and \(\bar{u}^1_H(x;e_l;Y_p)\) instead of \(u^1_O(x;Y_p)\) and \(u^1_H(x;e_l;Y_p)\), respectively.

Step 3. Recursively compute the approximations of the means \(M_i^k\), i = 1,…,l_*, and covariance matrices \(\{C^k_{ij},\ i,j = 1,\ldots,l_*\}\) for k = 1,…,N according to (7.3.6), with the approximate expectations found in Step 2 instead of the exact ones.

Step 4. Compute the approximation of the second-order moment \(E[u^k(x)]^2\) using (7.3.4) with the approximate covariance matrix found in Step 3 instead of the exact \(\{C^k_{ij}\}\).

We emphasize that Algorithm 7.3.1 for computing moments has no statistical error. An error analysis of a slightly different version of this algorithm is given in Chapter 8.
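Step 3 of the algorithm is a plain linear recursion once the single-step expectations are tabulated for the l_* retained modes. The sketch below (our illustration; the array layout and names are ours, with `EqOH[i,j,l]` = E[q¹_{Oi} q¹_{Hjl}] and `EqHH[i,l,j,p]` = E[q¹_{Hil} q¹_{Hjp}]) implements (7.3.6) with NumPy. As a consistency check, for a deterministic single step the covariance must stay rank-one, C^k = M^k (M^k)^T.

```python
import numpy as np

def propagate_moments(M0, C0, EqO, EqH, EqOO, EqOH, EqHH, nsteps):
    """Recursion (7.3.6) for the first two moments, truncated to lstar modes.
    EqO[i] = E[q1_Oi], EqH[i,l] = E[q1_Hil], EqOO[i,j] = E[q1_Oi q1_Oj],
    EqOH[i,j,l] = E[q1_Oi q1_Hjl], EqHH[i,l,j,p] = E[q1_Hil q1_Hjp]."""
    M, C = M0.copy(), C0.copy()
    for _ in range(nsteps):
        Mnew = EqO + EqH @ M
        C = (EqOO
             + np.einsum('l,ijl->ij', M, EqOH)     # sum_l M_l E[q_Oi q_Hjl]
             + np.einsum('l,jil->ij', M, EqOH)     # sum_l M_l E[q_Oj q_Hil]
             + np.einsum('lp,iljp->ij', C, EqHH))  # sum_lp C_lp E[q_Hil q_Hjp]
        M = Mnew
    return M, C

# Consistency check with a deterministic step q1_Oi = a_i, q1_Hil = B_il:
# then c^k = a + B c^{k-1}, and C^k must equal the rank-one matrix M^k M^k.T.
lstar = 4
rng = np.random.default_rng(2)
a, B = rng.normal(size=lstar), 0.5 * rng.normal(size=(lstar, lstar))
M0 = rng.normal(size=lstar)
M, C = propagate_moments(M0, np.outer(M0, M0), a, B,
                         np.outer(a, a),
                         np.einsum('i,jl->ijl', a, B),
                         np.einsum('il,jp->iljp', B, B), nsteps=5)
print(np.max(np.abs(C - np.outer(M, M))))  # ~ 0 up to round-off
```

Storing EqHH as a dense l_*⁴ array is what drives the cost quoted in Remark 7.3.3; in practice one would exploit structure or reduced-order bases to compress it.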

Remark 7.3.2 Algorithms analogous to Algorithm 7.3.1 can also be constructed based on time-discretization methods other than the trapezoidal rule used here, or on other types of SPDE approximations; e.g., one can exploit the Wong-Zakai approximation, as we do in Chapter 8.

Remark 7.3.3 The cost of this algorithm is, similar to the algorithm in [505], of order \((T/h)\,\eta\, l_*^4\), and the storage is of order \(\eta\, l_*^2\). The total cost can be reduced by employing reduced-order methods in physical space, becoming proportional to \(l_*^2\) instead of \(l_*^4\). The discussion of the computational efficiency of the recursive Wiener chaos method is also valid here; see [505, Remark 4.1].


7.4 Numerical results

In this section we illustrate via three examples how SCM algorithms can be used for the weak-sense approximation of SDEs and SPDEs. The first example is a scalar SDE with multiplicative noise, where we show that the SCM algorithm's error is small when the noise magnitude is small; we also observe that when the noise magnitude is large, the SCM algorithm does not work well. In the second example we demonstrate that the SCM can be used successfully for simulating a Burgers equation with additive noise when the integration time is relatively small. In the last example we show that the recursive algorithm from Chapter 7.3 works effectively for computing moments of the solution to an advection-diffusion equation with multiplicative noise over a longer integration time.

In all the tests we limit the dimension of the random space to 40, an empirical limitation on the dimensionality of Smolyak's SCM [393]. Also, we take the sparse grid level less than or equal to five in order to avoid an excessive number of sparse grid points. All the tests were run using Matlab R2012b on a Macintosh desktop computer with an Intel Xeon CPU E5462 (quad-core, 2.80 GHz).

Example 7.4.1 (Modified Cox-Ingersoll-Ross (mCIR); see, e.g., [83]) Consider the Ito SDE

\[
dX = -\theta_1 X\,dt + \theta_2\sqrt{1+X^2}\,dW(t), \qquad X(0) = x_0. \qquad (7.4.1)
\]

For \(\theta_2^2 - 2\theta_1 \neq 0\), the first two moments of X(t) are equal to

\[
EX(t) = x_0\exp(-\theta_1 t), \qquad
EX^2(t) = -\frac{\theta_2^2}{\theta_2^2-2\theta_1} + \Big(x_0^2 + \frac{\theta_2^2}{\theta_2^2-2\theta_1}\Big)\exp\big((\theta_2^2-2\theta_1)t\big).
\]

ρr1(T ) =|EX(T )− EXN |

|EX(T )| , ρr2(T ) =

∣∣EX2(T )− EX2N

∣∣EX2(T )

. (7.4.2)

Table 7.1 presents the errors for the SCM algorithms based on the Eulerscheme (left) and on the second-order scheme (3.2.23) (right), when the noisemagnitude is small. For the parameters given in the table’s description, theexact values (up to 4 d.p.) of the first and second moments are 3.679× 10−2

and 4.162 × 10−2, respectively. We see that increase of the SCM level Labove 2 in the Euler scheme case and above 3 in the case of the second-orderscheme does not improve accuracy. When the SCM error is relatively smallin comparison with the error due to time discretization, we observe decreaseof the overall error of the algorithms in h: proportional to h for the Eulerscheme and to h2 for the second-order scheme. We underline that in thisexperiment the noise magnitude is small.


7.4 Numerical results

Table 7.1. Comparison of the SCM algorithms based on the Euler scheme (left) and on the second-order scheme (3.2.23) (right). The parameters of the model (7.4.1) are x0 = 0.1, θ1 = 1, θ2 = 0.3, and T = 1.

h L ρ_1^r(1) Order ρ_2^r(1) Order | L ρ_1^r(1) Order ρ_2^r(1) Order
5×10−1 2 3.20×10−1 – 3.72×10−1 – | 3 6.05×10−2 – 8.52×10−2 –
2.5×10−1 2 1.40×10−1 1.2 1.40×10−1 1.4 | 3 1.14×10−2 2.4 2.10×10−2 2.0
1.25×10−1 2 6.60×10−2 1.1 4.87×10−2 1.5 | 3 1.75×10−3 2.7 6.73×10−3 1.6
6.25×10−2 2 3.21×10−2 1.0 8.08×10−3 2.6 | 4 3.64×10−4 2.3 1.21×10−3 2.5
3.125×10−2 2 1.58×10−2 1.0 1.12×10−2 -0.5 | 4 8.48×10−4 -1.2 3.75×10−4 1.7
2.5×10−2 2 1.26×10−2 – 1.49×10−2 – | 2 9.02×10−4 – 5.72×10−2 –
2.5×10−2 3 1.26×10−2 – 1.48×10−2 – | 3 9.15×10−5 – 2.84×10−3 –
2.5×10−2 4 1.26×10−2 – 1.55×10−2 – | 4 1.06×10−4 – 2.77×10−4 –
2.5×10−2 5 1.26×10−2 – 1.56×10−2 – | 5 1.06×10−4 – 1.81×10−4 –

7 Stochastic collocation methods for differential equations with white noise

In Table 7.2 we give results of the numerical experiment when the noise magnitude is not small. For the parameters given in the table's description, the exact values (to 4 d.p.) of the first and second moments are 0.2718 and 272.3202, respectively. Although for the Euler scheme the error in computing the mean decreases proportionally to h, there is almost no decrease of the error in the rest of this experiment. The large value of the second moment apparently affects the efficiency of the SCM here. For the Euler scheme, increasing L and decreasing h can slightly improve accuracy in computing the second moment; e.g., the smallest relative error for the second moment is 56.88% when h = 0.03125 and L = 5 (this level requires 750337 sparse grid points) out of the considered cases of h = 0.5, 0.25, 0.125, 0.0625, and 0.03125 and L ≤ 5. For the mean, increase of the level L from 2 to 3, 4, or 5 does not improve accuracy. For the second-order scheme (3.2.23), relative errors for the mean can be decreased by increasing L for a fixed h: e.g., for h = 0.25, the relative errors are 0.5121, 0.1753, 0.0316, and 0.0086 when L = 2, 3, 4, and 5, respectively.

We also see in Table 7.2 that the SCM algorithm based on the second-order scheme may not admit higher accuracy than the one based on the Euler scheme: for h = 0.5, 0.25, and 0.125 the second-order scheme yields higher accuracy, while the Euler scheme demonstrates higher accuracy for the smaller h = 0.0625 and 0.03125. Further decrease in h was not considered because this would increase the dimension of the random space beyond 40, where the sparse grid of Smolyak (2.5.9) may fail and the SCM algorithm may also lose its competitive edge over Monte Carlo-type techniques.

Table 7.2. Comparison of the SCM algorithms based on the Euler scheme (left) and on the second-order scheme (3.2.23) (right). The parameters of the model (7.4.1) are x0 = 0.08, θ1 = −1, θ2 = 2, and T = 1. The sparse grid level is L = 4.

h ρr1(1) Order ρr2(1) ρr1(1) ρr2(1)

5×10−1 1.72×10−1 – 9.61×10−1 2.86×10−2 7.69×10−1

2.5×10−1 1.02×10−1 0.8 8.99×10−1 8.62×10−3 6.04×10−1

1.25×10−1 5.61×10−2 0.9 7.87×10−1 1.83×10−2 7.30×10−1

6.25×10−2 2.96×10−2 0.9 6.62×10−1 3.26×10−2 8.06×10−1

3.125×10−2 1.52×10−2 1.0 5.64×10−1 4.20×10−2 8.40×10−1

In this example, we have shown that the SCM algorithms based on first- and second-order schemes can produce sufficiently accurate results when the noise magnitude is small. The second-order scheme is preferable since for the same accuracy it uses random spaces of lower dimension than the first-order Euler scheme; compare, e.g., the error values in Table 7.1. When the noise magnitude is large (see Table 7.2), the SCM algorithms do not work well, as predicted in Chapter 7.2.


Example 7.4.2 (Burgers equation with additive noise) Consider the stochastic Burgers equation [93, 225]:

du + u∂xu dt = ν∂x²u dt + σ cos(x) dW,  0 ≤ x ≤ ℓ,  ν > 0, (7.4.3)

with the initial condition u0(x) = 2ν(2π/ℓ) sin(2πx/ℓ)/(a + cos(2πx/ℓ)), a > 1, and periodic boundary conditions. In the numerical tests the values of the parameters are ℓ = 2π and a = 2.

Apply the Fourier collocation method in physical space and the trapezoidal rule in time to (7.4.3):

(u^{j+1} − u^j)/h − νD²(u^{j+1} + u^j)/2 = −(1/2) D((u^{j+1} + u^j)/2)² + σΓ ξ_j/√h, (7.4.4)

where u^j = (u(t_j, x_1), . . . , u(t_j, x_M))ᵀ, t_j = jh, and D is the Fourier spectral differentiation matrix with entries D_{ij} = ∂xL_i(x_j) = (−1)^{i−j} cot((i − j)h/2)/2 for i ≠ j and D_{ii} = 0 (in this formula h = ℓ/M denotes the spatial grid size, as in Code 7.1). It can be implemented in Matlab using a Toeplitz matrix as follows; see, e.g., [447, Chapter 3].

Code 7.1. Fourier spectral differentiation matrix

x_left=0; x_right=2*pi; % l=2*pi
M=128; h=(x_right-x_left)/M;
% set up Fourier spectral differentiation matrix
column=[0 0.5*(-1).^(1:M-1).*cot((1:M-1)*h/2)];
D=toeplitz(column, column([1 M:-1:2]));
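For readers not using Matlab, an equivalent construction in Python/NumPy is sketched below (our port, not from the book). For even M the Toeplitz matrix above is in fact circulant, which avoids a `toeplitz` helper; applying D to sin(x) on the periodic grid recovers cos(x) to machine precision:

```python
import numpy as np

M = 128                       # must be even for the circulant shortcut below
h = 2 * np.pi / M             # spatial grid size (l = 2*pi)
x = h * np.arange(1, M + 1)   # periodic grid x_m = m*h, m = 1..M

# First column of D: D_11 = 0, then 0.5*(-1)^k * cot(k*h/2), k = 1..M-1
col = np.r_[0.0, 0.5 * (-1.0) ** np.arange(1, M) / np.tan(np.arange(1, M) * h / 2)]
# For even M, toeplitz(column, column([1 M:-1:2])) is the circulant matrix
# D[i, j] = col[(i - j) mod M], and D is skew-symmetric.
D = col[(np.arange(M)[:, None] - np.arange(M)[None, :]) % M]

err = np.max(np.abs(D @ np.sin(x) - np.cos(x)))
print(err)   # spectral accuracy: error at round-off level
```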

Also, ξ_j are i.i.d. N(0, 1) random variables, and Γ = (cos(x_1), . . . , cos(x_M))ᵀ. The Fourier collocation points in physical space are x_m = mℓ/M (m = 1, . . . , M), and in the experiment we used M = 100. We aim at computing moments of u^j, which are integrals with respect to the Gaussian measure corresponding to the collection of ξ_j, and we approximate these integrals using the SCM from Chapter 2.5. The use of the SCM amounts to substituting sparse-grid nodes for ξ_j in (7.4.4), which results in a system of (deterministic) nonlinear equations of the form (7.4.4). To solve the nonlinear equations, we used the fixed-point iteration method with tolerance h²/100.

The errors in computing the first and second moments are measured as follows:

ρ_1^{r,2}(T) = ‖Eu_ref(T, ·) − Eu_num(T, ·)‖ / ‖Eu_ref(T, ·)‖,  ρ_2^{r,2}(T) = ‖Eu²_ref(T, ·) − Eu²_num(T, ·)‖ / ‖Eu²_ref(T, ·)‖,
ρ_1^{r,∞}(T) = ‖Eu_ref(T, ·) − Eu_num(T, ·)‖_∞ / ‖Eu_ref(T, ·)‖_∞,  ρ_2^{r,∞}(T) = ‖Eu²_ref(T, ·) − Eu²_num(T, ·)‖_∞ / ‖Eu²_ref(T, ·)‖_∞, (7.4.5)

where ‖v(·)‖ = (2π/M Σ_{m=1}^{M} v²(x_m))^{1/2}, ‖v(·)‖_∞ = max_{1≤m≤M} |v(x_m)|, x_m are the Fourier collocation points, and u_num and u_ref are the numerical solution


obtained by the SCM algorithm and the reference solution, respectively. The first and second moments of the reference solution u_ref were computed by the same solver in space and time (7.4.4) but accompanied by the Monte Carlo method with a large number of realizations, ensuring that the statistical errors were negligible.
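The discrete norms in (7.4.5) are straightforward to implement; as a quick sanity check (helper names `l2_norm`/`linf_norm` are ours), the discrete L2-norm of sin on a uniform periodic grid equals √π, since ∫₀^{2π} sin² dx = π:

```python
import numpy as np

def l2_norm(v, M):
    """Discrete L2-norm (2*pi/M * sum_m v(x_m)^2)^(1/2) used in (7.4.5)."""
    return np.sqrt(2 * np.pi / M * np.sum(v**2))

def linf_norm(v):
    """Discrete L-infinity norm max_m |v(x_m)|."""
    return np.max(np.abs(v))

M = 100
x = 2 * np.pi / M * np.arange(1, M + 1)   # Fourier collocation points
v = np.sin(x)
print(l2_norm(v, M))   # ~ sqrt(pi): the trapezoidal sum of sin^2 is exact here
```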

First, we choose ν = 0.1 and σ = 1. We obtain the reference solution with h = 10−4 and 1.92×10⁶ Monte Carlo realizations. The corresponding statistical error is 1.004×10−3 for the mean (maximum of the statistical error for Eu_ref(0.5, x_j)) and 9.49×10−4 for the second moment (maximum of the statistical error for Eu²_ref(0.5, x_j)) with 95% confidence interval, and the corresponding estimates of the L2-norms of the moments are ‖Eu_ref(0.5, ·)‖ ≈ 0.18653 and ‖Eu²_ref(0.5, ·)‖ ≈ 0.72817. We see from the results of the experiment presented in Table 7.3 that for L = 2 the error in computing the mean decreases when h decreases down to h = 0.05, but the accuracy does not improve with further decrease of h. For the second moment, we observe no improvement in accuracy with decrease of h. For L = 4, the error in computing the second moment decreases with h. When h = 0.0125, increasing the sparse grid level improves the accuracy for the mean: L = 3 yields ρ_1^{r,2}(0.5) ≈ 9.45×10−3 and L = 4 yields ρ_1^{r,2}(0.5) ≈ 8.34×10−3. As seen in Table 7.3, increase of the level L also improves the accuracy for the second moment when h = 0.05, 0.025, and 0.0125.

Second, we choose ν = 1 and σ = 0.5. We obtain the first two moments of the reference solution u_ref using h = 10−4 and the Monte Carlo method with 3.84×10⁶ realizations. The corresponding statistical error is 3.2578×10−4 for the mean and 2.2871×10−4 for the second moment with 95% confidence interval, and the corresponding estimates of the L2-norms of the moments are ‖Eu_ref(0.5, ·)‖ ≈ 1.11198 and ‖Eu²_ref(0.5, ·)‖ ≈ 0.66199.

The results of the experiment are presented in Table 7.4. We see that the accuracy is sufficiently high and there is some decrease of the errors with decrease of the time step h. However, as expected, no convergence in h is observed, and further numerical tests (not presented here) showed that taking h smaller than 1.25×10−2 with level L = 2 or 3 does not improve accuracy. In additional experiments we also noticed that there was no improvement of accuracy for the mean when we increased the level L up to 5. For the second moment, we observed some improvement in accuracy when L increases from 2 to 3 (see Table 7.4), but additional experiments (not presented here) showed that further increase of L (up to 5) did not reduce the errors.

For the errors measured in the L∞-norm (7.4.5) we had similar observations (not presented here) as in the case of the L2-norm.

In summary, this example has illustrated that SCM algorithms can produce accurate results in finding moments of solutions of nonlinear SPDEs when the integration time is relatively small. Comparing Tables 7.3 and 7.4, we observe better accuracy for the first two moments when the magnitude of the noise is smaller. In some situations, higher sparse grid levels L improve accuracy, but the dependence of the errors on L is not monotone. No convergence in the time step h or in the level L was observed, which is consistent with the theoretical prediction in Chapter 7.2.


Table 7.3. Errors of the SCM algorithm applied to the stochastic Burgers equation (7.4.3) with parameters T = 0.5, ν = 0.1, and σ = 1.

h ρ_1^{r,2}(0.5), L=2 ρ_1^{r,2}(0.5), L=3 ρ_2^{r,2}(0.5), L=2 ρ_2^{r,2}(0.5), L=3 ρ_2^{r,2}(0.5), L=4
2.5×10−1 1.28×10−1 1.3661×10−1 4.01×10−2 1.05×10−2 1.25×10−2
1.00×10−1 4.70×10−2 5.3874×10−2 4.48×10−2 4.82×10−3 4.69×10−3
5.00×10−2 2.75×10−2 2.7273×10−2 4.73×10−2 5.89×10−3 2.82×10−3
2.50×10−2 2.51×10−2 1.4751×10−2 4.87×10−2 6.92×10−3 2.34×10−3
1.25×10−2 2.67×10−2 9.4528×10−3 4.95×10−2 7.51×10−3 2.29×10−3


Table 7.4. Errors of the SCM algorithm applied to the stochastic Burgers equation (7.4.3) with parameters ν = 1, σ = 0.5, and T = 0.5.

h ρ_1^{r,2}(0.5), L=2 ρ_2^{r,2}(0.5), L=2 ρ_2^{r,2}(0.5), L=3

2.5×10−1 4.94×10−3 8.75×10−3 8.48×10−3

1×10−1 8.20×10−4 1.65×10−3 1.13×10−3

5×10−2 4.88×10−4 1.18×10−3 6.47×10−4

2.5×10−2 3.83×10−4 1.08×10−3 5.01×10−4

1.25×10−2 3.45×10−4 1.07×10−3 4.26×10−4

Example 7.4.3 (Stochastic advection-diffusion equation) Consider the stochastic advection-diffusion equation in the Ito sense:

du = ((ε² + σ²)/2 ∂x²u + β sin(x)∂xu) dt + σ∂xu dW(t),  (t, x) ∈ (0, T] × (0, 2π), (7.4.6)
u(0, x) = φ(x),  x ∈ (0, 2π),

where W(t) is a standard scalar Wiener process and ε ≥ 0, β, and σ are constants. In the tests we took φ(x) = cos(x), β = 0.1, σ = 0.5, and ε = 0.2.

We apply Algorithm 7.3.1 to (7.4.6) to compute the first two moments at a relatively large time T = 5. The Fourier basis was taken as the CONS in physical space. Since (7.4.6) has a single noise only, we used one-dimensional Gauss–Hermite quadratures of order n. The implicitness due to the use of the trapezoidal rule was resolved by the fixed-point iteration with stopping criterion h²/100.
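Since the noise here is one-dimensional, collocation in random space reduces to a Gauss–Hermite rule of order n for expectations against the standard Gaussian density. A sketch using NumPy's physicists' Hermite nodes, rescaled by √2 for N(0, 1) (the wrapper name is ours):

```python
import numpy as np

def gauss_hermite_expectation(g, n):
    """Approximate E[g(xi)] for xi ~ N(0,1) with an order-n Gauss-Hermite rule."""
    z, w = np.polynomial.hermite.hermgauss(n)     # nodes/weights for weight exp(-z^2)
    return np.sum(w * g(np.sqrt(2.0) * z)) / np.sqrt(np.pi)

# An order-n rule is exact for polynomials up to degree 2n-1:
m2 = gauss_hermite_expectation(lambda x: x**2, 2)   # E[xi^2] = 1, exact for n = 2
m4 = gauss_hermite_expectation(lambda x: x**4, 3)   # E[xi^4] = 3, exact for n = 3
print(m2, m4)
```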

As we have no exact solution of (7.4.6), we chose to find the reference solution by Algorithm 6.1.4 (multistage WCE method with the trapezoidal rule in time and the Fourier collocation method in physical space) with the following parameters: the number of Fourier collocation points M = 30, the length of the time subintervals for the recursion procedure h = 10−4, the highest order of Hermite polynomials N = 4, the number of modes approximating the Wiener process n = 4, and the time step in the trapezoidal rule h = 10−5. We obtain the second moment in the L2-norm ‖Eu²_ref(5, ·)‖ ≈ 1.065195. The errors are computed as follows:

ϱ_2²(T) = | ‖Eu²_ref(T, ·)‖ − ‖Eu²_numer(T, ·)‖ |,  ϱ_2^{r,2}(T) = ϱ_2²(T) / ‖Eu²_ref(T, ·)‖, (7.4.7)

where the norm is defined as in (7.4.5).

In Table 7.5, we observe first-order convergence in h for the second moments. We notice that increasing the quadrature order n from 2 to 3 does not improve accuracy, which is expected. Indeed, the trapezoidal rule used is of weak order one in h in the case of multiplicative noise, and a more accurate


quadrature rule cannot improve the order of convergence. This observation confirms that the total error should be expected to be O(h) + O(h^{L−1}), which is proved in Chapter 8. In the case of additive noise, we expect to see second-order convergence in h when n = 3 due to the properties of the trapezoidal rule.

In conclusion, we showed that the recursive Algorithm 7.3.1 can work effectively for accurate computation of second moments of solutions to linear stochastic advection-diffusion equations at relatively large times. We observed convergence of order one in h.

Table 7.5. Errors in computing the second moment of the solution to the stochastic advection-diffusion equation (7.4.6) with σ = 0.5, β = 0.1, ε = 0.2 at T = 5 by Algorithm 7.3.1 with l∗ = 20 and the one-dimensional Gauss–Hermite quadrature of order n = 2 (left) and n = 3 (right).

h ϱ_2^{r,2}(5) Order CPU time (sec.) ϱ_2^{r,2}(5) Order CPU time (sec.)

5×10−2 1.01×10−3 – 7.41 1.06×10−3 – 1.10×10

2×10−2 4.07×10−4 1.0 1.65×10 4.25×10−4 1.0 2.43×10

1×10−2 2.04×10−4 1.0 3.43×10 2.12×10−4 1.0 5.10×10

5×10−3 1.02×10−4 1.0 6.81×10 1.06×10−4 1.0 1.00×102

2×10−3 4.08×10−5 1.0 1.70×102 4.25×10−5 1.0 2.56×102

1×10−3 2.04×10−5 1.0 3.37×102 2.12×10−5 1.0 5.12×102

7.5 Summary and bibliographic notes

We show when and how well Smolyak's sparse grid collocation works for a model problem in theory and for several problems in simulations. We also combine the collocation methods with the recursive strategy for longer-time integration of linear stochastic parabolic equations. The key points of this chapter are:

• We show that for the linear SODE (7.2.3), for computing expectations of solutions, the sparse grid collocation method of fixed level with the Euler scheme in time is convergent only when ε²T is small, where ε is the noise intensity (diffusion coefficient) and T is the final integration time. See Proposition 7.2.3 and Chapter 7.2.2 for examples.

• The sparse grid collocation is not convergent in the time step size. To have a convergent scheme in time, we apply the recursive strategy in [315] and in Chapter 6 and develop Algorithm 7.3.1 to compute the first two moments of solutions to linear stochastic advection-diffusion equations at relatively large times.

In the next chapter, we will compare theoretically and numerically the recursive WCE and the recursive SCM for linear problems.


Bibliographic notes. For SDEs with time-dependent white noise, stochastic collocation methods, which are deterministic high-dimensional quadratures to evaluate integrals, have been discussed under different names: cubatures on Wiener space [322], de-randomization [373], optimal quantization [389, 390], and sparse grids of Smolyak type [153, 154, 172]. See a brief review in Chapter 2.5.4. The sparse grid of Smolyak type has been considered in [153, 154, 172, 398], where high accuracy was observed. However, the use of sparse grids in [153, 172] relies on exact sampling of geometric Brownian motion and of solutions of other simple SDE models.

While de-randomization and optimal quantization aim at finding quadrature rules which are in some sense optimal for computing the particular expectation under consideration, cubatures on Wiener space and a stochastic collocation method using Smolyak sparse grid quadratures use predetermined quadrature rules in a universal way, without being tailored towards a specific expectation unless some adaptive strategies are applied. One of the major differences between SCM and the other aforementioned methods is that SCM is endowed with negative weights. In practice, this difference leads to numerical performance different from that of cubatures on Wiener space, where only quadrature rules with positive weights are used.

For long-time simulation, deterministic methods for SDEs with white noise, e.g., stochastic collocation methods and functional expansion methods, do not work well unless some restarting strategies are employed. The fundamental obstacle is the increasing number of random variables induced by the discretization of Brownian motion, which requires significant reduction to compress the history data. For ordinary SDEs, the approach proposed in [300], which compresses the history data via l1 regression at each time step, can be promising; it is summarized in the following two paragraphs.

Suppose at time t_k we have obtained the solution at cubature points x_i (1 ≤ i ≤ n) and the first m moments of the solution by the employed cubature rule. If we still use the same cubature rule on (t_k, t_{k+1}], then, according to the definition of Brownian motion, we will have n² cubature points (the use of the tensor product reflects the Markovian property of the solution). To control the growth of the number of cubature points, we reduce the number of points at t_k by the following procedure:

• Choose a subset of {x_i}, i = 1, . . . , n, say {x̄_k}, k = 1, . . . , n̄ (n̄ < n), and some positive weights {w̄_k}, k = 1, . . . , n̄, to form a new cubature rule such that the first m moments obtained by the new cubature rule match the moments computed from the old cubature rule.

This procedure can be formulated as l1 regression; see more details in [300]. As this requires compression at each time step, more analysis and reduction should be done to further reduce the computational cost. For SPDEs, it is not straightforward to extend the approach in [300], as we are dealing with a random field in both space and time. It seems that the recursive strategy proposed in [315] and in Chapter 6 may be the only feasible one. However,


it is only efficient for the first two moments of solutions of linear equations. Hence, more effort should be put into long-time simulation with deterministic sampling methods.

7.6 Suggested practice

Exercise 7.6.1 Apply the stochastic collocation method to solve the Kubo oscillator (6.3.9) and also its corresponding stochastic advection-diffusion equation (3.3.18) with periodic boundary conditions. Use the midpoint scheme in time and the Fourier spectral collocation method in space. Compare the accuracy when the levels of the sparse grid are L = 0, 1, 2, 3.

Exercise 7.6.2 Consider the following stochastic advection-diffusion-reaction equation

du(t, x) = (εu_xx(t, x) + f(u)) dt + σu_x(t, x) ∘ dW(t),  t > 0,  x ∈ (0, 2π), (7.6.1)
u(0, x) = sin(x),

where f(u) = u − u³. Use the midpoint scheme in time and the Fourier spectral collocation method in space. Apply the sparse grid collocation in random space with sparse grid levels L = 0, 1, 2, 3. Plot the mean and variance of the solution at different locations at t = kδ, δ = 0.1, k = 0, 1, 2, · · · , 10.


8

Comparison between Wiener chaos methods and stochastic collocation methods

In the last two chapters, we incorporated the recursive strategy into both Wiener chaos expansion (WCE) methods and stochastic collocation methods (SCM). In this chapter, we will compare both methods for linear stochastic advection-reaction-diffusion equations with commutative and noncommutative noises. To make a fair comparison, we develop a recursive multistage SCM using a spectral truncation of Brownian motion, as in the case of WCE.

Numerical results demonstrate that the recursive multistage SCM is of order Δ (the time step size) in the second-order moments, while the recursive multistage WCE is of order Δ^N + Δ² (N is the order of the Wiener chaos) for advection-diffusion-reaction equations with commutative noises. These numerical results are in agreement with the theoretical error estimates. However, for noncommutative noises, both methods are of order one (Δ) in the second-order moments.

8.1 Introduction

In this chapter, we will show theoretically and through numerical examples that for white noise driven PDEs, WCE and SCM have quite different performance when the noises are commutative. This is different from the case of colored noise, where WCE and SCM exhibit similar performance for smooth solutions.

To apply WCE and SCM and have a fair comparison between these two methods, we first discretize the Brownian motion with its truncated spectral expansion, see Section 2.2 and also, e.g., [391, Chapter IX] and [315], and subsequently we employ the corresponding functional expansion (WCE and SCM) to represent the solution in random space. In principle, we can employ any functional expansion; however, different expansions are preferred for different stochastic products because of computational efficiency.

© Springer International Publishing AG 2017
Z. Zhang, G.E. Karniadakis, Numerical Methods for Stochastic Partial Differential Equations with White Noise, Applied Mathematical Sciences 196, DOI 10.1007/978-3-319-57511-7_8



In practice, WCE is associated with the Ito–Wick product, see (3.3.15) and Chapter 11, as the product is defined with Wiener chaos modes, yielding a weakly coupled (lower-triangular) system of PDEs for linear equations. On the other hand, SCM is associated with the Stratonovich product, see (3.3.26), yielding a decoupled system of PDEs. These different formulations lead to different numerical performance, as we demonstrate in Chapter 8.4; in particular, WCE can be of second-order convergence in time while SCM is only of first order in time in the second-order moments for commutative noises. Further, when the noises serve as the advection coefficients, SCM can be more accurate than WCE when both methods are of first-order convergence, as the SCM (Stratonovich formulation) can lead to smaller diffusion coefficients than those for WCE (Ito–Wick formulation).

This chapter is organized as follows. After the Introduction, we briefly revisit the WCE method and SCM for linear parabolic SPDEs and present a new recursive SCM using a spectral truncation of Brownian motion, following the same recursive procedure as in Chapters 6 and 7. In Chapter 8.3, we present the error estimates for both methods for linear advection-diffusion-reaction equations, together with their proofs. In Chapter 8.4, we present numerical results of WCE and SCM for linear SPDEs with both commutative and noncommutative noises and verify the error estimates of WCE and SCM for commutative noises. We conclude in Chapter 8.5 with comments on multistage WCE and SCM. Four exercises are provided for readers to practice and compare multistage WCE and SCM for stochastic advection-diffusion-reaction equations in one dimension.

8.2 Review of Wiener chaos and stochastic collocation

In this section, we briefly revisit WCE and SCM for the linear SPDE (3.3.23). In both WCE and SCM, we discretize the Brownian motion using the spectral representation (2.2.1):

lim_{n→∞} E[(W(t) − W^{(n)}(t))²] = 0,  W^{(n)}(t) = Σ_{i=1}^{n} ξ_i ∫_0^t m_i(s) ds,  t ∈ [0, T], (8.2.1)

where {m_i}_{i=1}^∞ is a CONS in L²([0, T]) and the ξ_i are mutually independent standard Gaussian random variables.
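Because the ξ_i are orthonormal, the truncation error in (8.2.1) has the closed form E[(W(t) − W^{(n)}(t))²] = t − Σ_{i=1}^{n} (∫_0^t m_i(s) ds)², which decays to 0 as n grows. A sketch with the cosine CONS m_1 = 1/√T, m_i(s) = √(2/T) cos((i−1)πs/T) (an assumed choice of basis, one of the standard options for Section 2.2):

```python
import numpy as np

def truncation_mse(t, T, n):
    """E[(W(t) - W^(n)(t))^2] for the cosine CONS on [0, T]."""
    # integrals a_i = int_0^t m_i(s) ds
    a = [t / np.sqrt(T)]                                  # i = 1: m_1 = 1/sqrt(T)
    for i in range(2, n + 1):
        k = i - 1
        a.append(np.sqrt(2.0 * T) / (k * np.pi) * np.sin(k * np.pi * t / T))
    return t - sum(ai**2 for ai in a)

mse = truncation_mse(0.5, 1.0, 50)
print(mse)   # small and positive; the tail decays like O(1/n)
```

At t = T the error vanishes for any n ≥ 1, since only m_1 has a nonzero integral over the full interval.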

8.2.1 Wiener chaos expansion (WCE)

The SPDE (3.3.23) with finite-dimensional noises can be written in the following form using the Ito–Wick product:


du(t, x) = [Lu(t, x) + f(x)] dt + Σ_{k=1}^{q} [M_k u(t, x) + g_k(x)] ⋄ Ẇ_k dt,  (t, x) ∈ (0, T] × D,
u(0, x) = u_0(x),  x ∈ D, (8.2.2)

where Ẇ_k is formally the first-order derivative of W_k in time, i.e., Ẇ = dW/dt.

To obtain the coefficients ϕ_α(t, x; φ), we approximate W_k with its spectral truncation (8.2.1), W^{(n)}_k, and then we substitute the representation (3.3.14) into (3.3.23). Multiplying both sides of (3.3.23) by ξ_α and taking expectations, using the properties of the Ito–Wick product ξ_α ⋄ ξ_β = √((α + β)!/(α!β!)) ξ_{α+β} and E[ξ_α ξ_β] = δ_{α=β}, we find that the coefficients ϕ_α(t, x; φ) satisfy the propagator (6.1.3).
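The orthonormality E[ξ_α ξ_β] = δ_{α=β} underlying this derivation can be checked numerically in one dimension, where ξ_α reduces to the normalized probabilists' Hermite polynomial He_k(ξ)/√(k!) of a single standard Gaussian. A sketch (helper names ours), using a Gauss–Hermite rule exact for the polynomial degrees involved:

```python
import math
import numpy as np

def he(k, x):
    """Probabilists' Hermite He_k via the recurrence He_{k+1} = x*He_k - k*He_{k-1}."""
    p0, p1 = np.ones_like(x), x
    if k == 0:
        return p0
    for j in range(1, k):
        p0, p1 = p1, x * p1 - j * p0
    return p1

def inner(k, l, n=12):
    """E[He_k(xi)He_l(xi)]/sqrt(k! l!) for xi ~ N(0,1), by Gauss-Hermite quadrature."""
    z, w = np.polynomial.hermite.hermgauss(n)     # weight exp(-z^2); rescale by sqrt(2)
    x = np.sqrt(2.0) * z
    vals = he(k, x) * he(l, x) / math.sqrt(math.factorial(k) * math.factorial(l))
    return np.sum(w * vals) / math.sqrt(math.pi)

G = np.array([[inner(k, l) for l in range(5)] for k in range(5)])
print(np.max(np.abs(G - np.eye(5))))   # ~ 0: the basis is orthonormal
```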

In practical computations, we are only interested in the truncated Wiener chaos solution (6.1.1). However, the error induced by the truncation of the Wiener chaos expansion grows exponentially with time, and thus WCE is not efficient for long-time integration. To control the error behavior, we can use the recursive WCE (see Algorithm 6.1.4) for computing the second moments, E[u²(t, x)], of the solution of the SPDE (3.3.23). See Chapter 6 for more details.

Note that in Algorithm 6.1.4 we discretize the Brownian motion using the following spectral representation in a multi-element version, i.e., using K multi-elements:

w^{(n,K)}(t) = Σ_{k=1}^{K} Σ_{i=1}^{n} ξ_{i,k} ∫_{t_{k−1}∧t}^{t_k∧t} m_{i,k}(s) ds,  t ∈ [0, T], (8.2.3)

where 0 = t_0 < t_1 < · · · < t_K = T, t_k ∧ t is the minimum of t_k and t, {m_{i,k}}_{i=1}^∞ is a CONS in L²([t_{k−1}, t_k]), and the ξ_{i,k} are mutually independent standard Gaussian random variables. This approximation of the Brownian motion will also be used in the stochastic collocation methods presented below.

8.2.2 Stochastic collocation method (SCM)

As we discussed in the last chapter, this method leads to a fully decoupled system instead of the weakly coupled system from the WCE. First, we rewrite the SPDE (3.3.23) with finite-dimensional noises in the Stratonovich form (3.3.26). Second, we replace the Brownian motion with its multi-element spectral expansion (8.2.3) and obtain the following partial differential equation with smooth random inputs:

du(t, x) = [Lu(t, x) + f(x)] dt + Σ_{k=1}^{q} [M_k u(t, x) + g_k(x)] dW^{(n,K)}_k(t),  (t, x) ∈ (0, T] × D,
u(0, x) = u_0(x),  x ∈ D. (8.2.4)


Now we can apply standard numerical techniques of high-dimensional integration to obtain the p-th moments of the solution to (3.3.23):

E[u^p_{n,K}(x, t)] = (2π)^{−nqK/2} ∫_{R^{nqK}} F^p(u_0(x), x, t, y) e^{−yᵀy/2} dy,  p = 1, 2, · · ·, (8.2.5)

where y = (y_{i,k,l}), i ≤ n, k ≤ K, l ≤ q, and the functional F represents the solution functional for (8.2.4). Here, we employ sparse grid collocation if the dimension nqK is moderately large. As pointed out in [11, 486], we are led to a fully decoupled system of equations, as in the case of Monte Carlo methods.

In practice, we use the sparse grid quadrature rule (2.5.9). Here again, the direct application of SCM is efficient only for short-time integration. To achieve long-time integration, we apply the recursive multistage idea used in Algorithm 6.1.4, i.e., we use SCM over the small time intervals (t_{i−1}, t_i] instead of over the whole interval (0, T] and compute the second-order moments of the solution recursively in time. The derivation of such a recursive algorithm makes use of properties of the problem (3.3.23) and of the orthogonality of the bases both in physical space and in random space, as will be shown shortly.

We solve (8.2.4) with spectral methods in physical space, i.e., using a truncation {e_m}_{m=1}^{M} of a CONS in physical space to represent the numerical solution. The corresponding approximation of u_{Δ,n}(t, x) is denoted by u^M_{Δ,n}(t, x). Further, let υ(t, x; s, υ_0) be the approximation u^M_{Δ,n}(t, x) of u_{Δ,n}(t, x) with the initial data υ_0 prescribed at time s: u_{Δ,n}(s, x) = υ_0(x). Note that

u^M_{Δ,n}(t_i, x) = υ(t_i, x; t_{i−1}, u^M_{Δ,n}(t_{i−1}, ·)),  t_i = iΔ. (8.2.6)

Denote Φ_m(t_i; Δ, n, M) = (u^M_{Δ,n}(t_i, ·), e_m). Then the second moments are computed by

E[(u^M_{Δ,n}(t_i, x))²] = Σ_{l,m=1}^{M} H_{lm}(t_i; Δ, n, M) e_l(x) e_m(x), (8.2.7)

where H_{lm}(t_i; Δ, n, M) = E[Φ_l(t_i; Δ, n, M) Φ_m(t_i; Δ, n, M)]. Now we show how the matrix H_{lm}(t_i; Δ, n, M) can be computed recursively. By the linearity of (8.2.4), we have

u^M_{Δ,n}(t_i, x) = Σ_{l=1}^{M} Φ_l(t_{i−1}; Δ, n, M) υ(t_i, x; t_{i−1}, e_l).

Denote h_{l,m,i−1} = (υ(t_i, ·; t_{i−1}, e_l), e_m). Then, by the orthonormality of the e_m, we have

Φ_m(t_i; Δ, n, M) = Σ_{l=1}^{M} Φ_l(t_{i−1}; Δ, n, M) h_{l,m,i−1}.


The matrix H_{lm}(t_i; Δ, n, M) can be computed recursively as

H_{lm}(t_i; Δ, n, M) = Σ_{j=1}^{M} Σ_{k=1}^{M} H_{jk}(t_{i−1}; Δ, n, M) E[h_{j,l,i−1} h_{k,m,i−1}].

We note that the expectation E[h_{j,l,i−1} h_{k,m,i−1}] does not depend on i − 1 because, according to (8.2.4) and (3.3.24), υ(t_i, x; t_{i−1}, e_l) depends on the length of the time interval Δ and the random variables ξ_{l,k,i} (l ≤ n, k ≤ q) but is independent of the time t_{i−1}. Denote υ(t_i, ·; t_{i−1}, e_l) with ξ_{l,k,i} anchored at the sparse grid point x_κ ∈ H^{nq}_L by υ_κ(Δ, ·; e_l), and let h_{κ,l,m} = (υ_κ(Δ, ·; e_l), e_m). Then, using the sparse grid quadrature rule (2.5.9), we obtain the recursive approximation of H_{lm}(t_i; Δ, n, M):

H_{lm}(t_i; Δ, n, M) ≈ H_{lm}(t_i; Δ, L, n, M) := Σ_{j=1}^{M} Σ_{k=1}^{M} H_{jk}(t_{i−1}; Δ, L, n, M) Σ_{κ=1}^{η(L,nq)} h_{κ,j,l} h_{κ,k,m} W_κ. (8.2.8)

Substituting (8.2.8) in (8.2.7), we obtain an approximation of the second moments of u(t, x), denoted by M^M_{Δ,L,n}(t_i, x). When M = ∞ (i.e., when the CONS {e_m} is not truncated), we denote this approximation by M_{Δ,L,n}(t_i, x).

Remark 8.2.1 For nonhomogeneous equations, i.e., with forcing terms, we can construct similar algorithms. Indeed, the same procedure applies once we split the nonhomogeneous equation into two equations: a nonhomogeneous equation with zero initial value and a homogeneous equation with the given initial value. See Chapter 7.3 for a derivation of similar algorithms where only increments of Brownian motion are used, which is different from the spectral approximation of Brownian motion used here.

Now we state the algorithm for the second moments of the approximate solution when $f = g_k = 0$.

Algorithm 8.2.2 (Recursive multistage stochastic collocation method) Choose a CONS $\{e_m(x)\}_{m\ge 1}$ and its truncation $\{e_m(x)\}_{m=1}^{M}$; a time step $\Delta$; and the sparse grid level $L$ and $n$, which together with the number of noises $q$ determine the sparse grid $\mathcal{H}^{nq}_{L}$ containing $\eta(L,nq)$ sparse grid points.

Step 1. For each $m = 1,\ldots,M$, solve the system of equations (8.2.4) on the sparse grid $\mathcal{H}^{nq}_{L}$ over the time interval $[0,\Delta]$ with the initial condition $\phi(x) = e_m(x)$, and denote the obtained solutions by $\upsilon_\kappa(\Delta,x;e_m)$, $m = 1,\ldots,M$, $\kappa = 1,\ldots,\eta(L,nq)$. Also, choose a time step size $\delta t$ to solve (8.2.4) numerically.

Step 2. Evaluate $h_{\kappa,l,m} = (\upsilon_\kappa(\Delta,\cdot;e_l),\,e_m)$, $l,m = 1,\ldots,M$.


Step 3. Recursively compute the covariance matrices $H_{lm}(t_i;\Delta,L,n,M)$, $l,m = 1,\ldots,M$, as follows:
\[
H_{lm}(0;\Delta,L,n,M) = (u_0,e_l)(u_0,e_m),
\]
\[
H_{lm}(t_i;\Delta,L,n,M) = \sum_{j,k=1}^{M} H_{jk}(t_{i-1};\Delta,L,n,M)\sum_{\kappa=1}^{\eta(L,nq)} h_{\kappa,j,l}\,h_{\kappa,k,m}\,W_\kappa, \quad i = 1,\ldots,K,
\]
where $u_0(x)$ is the initial condition for (3.3.23), and obtain the approximate second moments $\mathcal{M}^M_{\Delta,L,n}(t_i,x)$ of the solution $u(t,x)$ to (3.3.23) as
\[
\mathcal{M}^M_{\Delta,L,n}(t_i,x) = \sum_{l,m=1}^{M} H_{lm}(t_i;\Delta,L,n,M)\,e_l(x)\,e_m(x), \quad i = 1,\ldots,K. \qquad (8.2.9)
\]

Remark 8.2.3 Similar to Algorithm 6.1.4, the cost of this algorithm is $\frac{T}{\Delta}\eta(L,nq)M^4$ and the storage is $\eta(L,nq)M^2$. The total cost can be reduced to the order of $M^2$ by adopting reduced-order methods in physical space. The discussion of the computational efficiency of the recursive WCE methods in Chapter 6.1 is also valid for Algorithm 8.2.2.

8.3 Error estimates

Although WCE and SCM use the same spectral truncation of Brownian motion, the former is associated with the Itô-Wick product while the latter is related to the Stratonovich product. Note that WCE employs orthogonal polynomials as a basis, whereas SCM does not have such orthogonality. This difference enables WCE to attain a better convergence rate than SCM in the second-order moments; see Corollary 8.3.2 and Theorem 8.3.6.

Assume that the operator $\mathcal{L}$ generates a semigroup $\{\mathcal{T}_t\}_{t\ge 0}$ with the following properties: for $v\in H^k(D)$,
\[
\|\mathcal{T}_t v\|^2_{H^k} \le C(k,\mathcal{L})\,e^{2C_{\mathcal{L}}t}\,\|v\|^2_{H^k}, \qquad (8.3.1)
\]
where $C(0,\mathcal{L}) = 1$, and
\[
\int_s^t e^{2C_{\mathcal{L}}(t-\theta)}\,\|\mathcal{T}_\theta v\|^2_{H^{k+1}}\,d\theta \le \delta_{\mathcal{L}}^{-1}\,C(k,\mathcal{L})\,e^{2C_{\mathcal{L}}(t-s)}\,\|v\|^2_{H^k}. \qquad (8.3.2)
\]
Also, we assume that there exists a constant $C(k,\mathcal{M})$ such that
\[
\|\mathcal{M}_l v\|^2_{H^k} \le C(k,\mathcal{M})\,\|v\|^2_{H^{k+1}}, \quad \text{for } v\in H^{k+1},\ l = 1,\ldots,q, \qquad (8.3.3)
\]


and that there exists a constant $C(k,\mathcal{L})$ such that
\[
\|\mathcal{L}v\|^2_{H^k} \le C(k,\mathcal{L})\,\|v\|^2_{H^{k+2}}, \quad \text{for } v\in H^{k+2}. \qquad (8.3.4)
\]
The conditions (8.3.1) and (8.3.3) are satisfied with $k\le r$, and (8.3.4) is satisfied with $k\le r-1$, when the coefficients from (3.3.24) belong to the Hölder space $C_b^{r+1}(D)$. Define also
\[
C_k = \max_{1\le j\le k}\big\{C(j,\mathcal{L})\,C(j-1,\mathcal{M})\big\}. \qquad (8.3.5)
\]

8.3.1 Error estimates for WCE

For the WCE for the SPDE (3.3.23) with single noise ($q = 1$), we have the convergence results stated below. In the general case we have not succeeded in proving such theorems, but we numerically check the convergence orders using examples with commutative and noncommutative noises in Chapter 8.4.

Theorem 8.3.1 Let $q = 1$ in (3.3.23). Assume that $\sigma_{i,1}, a_{i,j}, b_i, c, \nu_1$ in (3.3.24) belong to $C_b^{r+1}(D)$ and $u_0\in H^r(D)$, where $r\ge N+2$ and $N$ is the order of the Wiener chaos. Also assume that (3.3.25) holds. Then for $C_1 < \delta_{\mathcal{L}}$, the error of the truncated Wiener chaos solution $u_{N,n}(t_i,x)$ from (6.1.1) is estimated as
\[
\big(E[\|u_{N,n}(t_i,\cdot) - u(t_i,\cdot)\|^2]\big)^{1/2} \le (C_{\lceil r\rceil}\Delta)^{N/2}\,e^{C_{\mathcal{L}}T}\Big[\frac{e^{C_{\lceil r\rceil}T}}{(N+1)!} + \frac{(C_{\lceil r\rceil}\Delta)^{\lceil r\rceil-N-1}}{\lceil r\rceil!}\,\frac{\delta_{\mathcal{L}}}{\delta_{\mathcal{L}}-C_1}\Big]^{1/2}\|u_0\|_{H^r}
\]
\[
\quad + \sqrt{2C_{N+2}\,C(N+2,\mathcal{L})\,C(N,\mathcal{L})}\;e^{C_{N+2}T+C_{\mathcal{L}}T}\,\frac{\Delta}{\sqrt{n}\,\pi}\,\|u_0\|_{H^{N+2}}, \qquad (8.3.6)
\]
where $t_i = i\Delta$, the constants $\delta_{\mathcal{L}}$ and $C_{\mathcal{L}}$ are from (3.3.25), $C_{\lceil r\rceil}$ is defined in (8.3.5), $C(N,\mathcal{L})$ is from (8.3.4), and $C(N+2,\mathcal{L})$ is from (8.3.1).

From Theorem 8.3.1 we have that the mean-square error of the recursive multistage WCE is $O(\Delta^{N/2}) + O(\Delta)$. The same result is proved for $q = 1$ and $\sigma_{i,r} = 0$ in [315], where the condition $C_1 < \delta_{\mathcal{L}}$ is not required. Also, for the case $\sigma_{i,r}\ne 0$, mean-square convergence without a rate, but also without the condition $C_1 < \delta_{\mathcal{L}}$, was proved in [314, 316].

Corollary 8.3.2 Under the assumptions of Theorem 8.3.1, we have
\[
\big|E[\|u_{N,n}(t_i,\cdot)\|^2] - E[\|u(t_i,\cdot)\|^2]\big| = E[\|u_{N,n}(t_i,\cdot) - u(t_i,\cdot)\|^2]
\]
\[
\le (C_{\lceil r\rceil}\Delta)^{N}e^{2C_{\mathcal{L}}T}\Big[\frac{e^{C_{\lceil r\rceil}T}}{(N+1)!} + \frac{(C_{\lceil r\rceil}\Delta)^{\lceil r\rceil-N-1}}{\lceil r\rceil!}\,\frac{\delta_{\mathcal{L}}}{\delta_{\mathcal{L}}-C_1}\Big]\|u_0\|^2_{H^r} + 2C_{N+2}\,C(N+2,\mathcal{L})\,C(N,\mathcal{L})\,e^{2C_{N+2}T+2C_{\mathcal{L}}T}\,\frac{\Delta^2}{n\pi^2}\,\|u_0\|^2_{H^{N+2}}. \qquad (8.3.7)
\]


This corollary states that the convergence rate of the error in the second-order moments (8.3.7) is twice that of the mean-square error (8.3.6), i.e., $O(\Delta^N) + O(\Delta^2)$. The corollary can be proved by the orthogonality of the WCE. In fact, it holds that
\[
E[u^2(t_i,x)] - E[u^2_{N,n}(t_i,x)] = E[(u(t_i,x) - u_{N,n}(t_i,x))^2], \qquad (8.3.8)
\]
as the different terms in the Cameron-Martin basis are mutually orthogonal [57]. Integrating over the physical domain and applying the Fubini theorem (see Appendix D), we reach the conclusion of the corollary.

The idea of the proof of Theorem 8.3.1 is to first establish an estimate for the one-step ($\Delta = T$) error, from which the global error can be readily derived. We need the following two lemmas for the one-step errors. Introduce (cf. (3.3.14))
\[
u_N(t,x) = \sum_{|\alpha|\le N,\ \alpha\in\mathcal{J}_q}\frac{1}{\sqrt{\alpha!}}\,\varphi_\alpha(t,x)\,\xi_\alpha. \qquad (8.3.9)
\]

Lemma 8.3.3 Let $q = 1$ in (3.3.23). Assume that $\sigma_{i,1}, a_{i,j}, b_i, c, \nu_1$ belong to $C_b^{r+1}(D)$ and $u_0\in H^r(D)$, where $r\ge N+1$. Let $u$ in (3.3.14) be the solution to (3.3.23) and let $u_N$ be given by (8.3.9). For $C_1 < \delta_{\mathcal{L}}$, the following estimate holds:
\[
E[\|u(\Delta,\cdot) - u_N(\Delta,\cdot)\|^2] \le (C_{\lceil r\rceil}\Delta)^{N+1}e^{2C_{\mathcal{L}}\Delta}\Big[\frac{e^{C_{\lceil r\rceil}\Delta}}{(N+1)!} + \frac{(C_{\lceil r\rceil}\Delta)^{\lceil r\rceil-N-1}}{\lceil r\rceil!}\,\frac{\delta_{\mathcal{L}}}{\delta_{\mathcal{L}}-C_1}\Big]\|u_0\|^2_{H^{\lceil r\rceil}},
\]
where the constants $\delta_{\mathcal{L}}$ and $C_{\mathcal{L}}$ are from (3.3.25) and $C_{\lceil r\rceil}$ is from (8.3.5).

Lemma 8.3.4 Under the assumptions of Lemma 8.3.3 and with $r\ge N+2$, we have
\[
E[\|u_{N,n}(\Delta,\cdot) - u_N(\Delta,\cdot)\|^2] \le \frac{2\Delta^3}{n\pi^2}\,C(N+2,\mathcal{L})\,C(N,\mathcal{L})\,C_{N+2}\,e^{2C_{N+2}\Delta+2C_{\mathcal{L}}\Delta}\,\|u_0\|^2_{H^{N+2}},
\]
where $C_{\mathcal{L}}$ is from (3.3.25), $C(N+2,\mathcal{L})$ is from (8.3.1), $C(N,\mathcal{L})$ is from (8.3.4), and $C_{N+2}$ is from (8.3.5).

Using Lemmas 8.3.3 and 8.3.4, we can establish the estimate of the global error stated in Theorem 8.3.1. Specifically, the one-step error is bounded by the sum of $E[(u(\Delta)-u_N(\Delta))^2]$ and $E[(u_N(\Delta)-u_{N,n}(\Delta))^2]$, which are estimated in Lemmas 8.3.3 and 8.3.4, respectively. The global error is then estimated using the recursive nature of Algorithm 6.1.4, as in the proof of [315, Theorem 2.4], which completes the proof of Theorem 8.3.1.


Now we proceed to prove Lemmas 8.3.3 and 8.3.4. Denote by $s^k$ the ordered set $(s_1,\ldots,s_k)$, and for $k\ge 1$ write $ds^k := ds_1\ldots ds_k$,
\[
\int^{(k)} (\cdots)\,ds^k = \int_0^\Delta\!\int_0^{s_k}\!\cdots\int_0^{s_2} (\cdots)\,ds_1\ldots ds_k, \qquad
\int_{(k)} (\cdots)\,ds^k = \int_0^\Delta\!\int_{s_1}^{\Delta}\!\cdots\int_{s_{k-1}}^{\Delta} (\cdots)\,ds_k\ldots ds_2\,ds_1,
\]
and $F(\Delta;s^k;x) = \mathcal{T}_{\Delta-s_k}\mathcal{M}\cdots\mathcal{T}_{s_2-s_1}\mathcal{M}\mathcal{T}_{s_1}u_0(x)$, where $\mathcal{M} := \mathcal{M}_1$.

Proof of Lemma 8.3.3. It follows from (3.3.25) and the assumptions on the coefficients that (8.3.1) and (8.3.2) hold, cf. [125, Section 7.1.3]. Also, by the assumption that $\sigma_{i,1}, \nu_1$ belong to $C_b^{r+1}(D)$, it can be readily checked that (8.3.3) holds.

By (3.3.14), (8.3.9), and the orthogonality of the $\xi_\alpha$ (see (6.1.7)), we have
\[
E[\|u(\Delta,\cdot) - u_N(\Delta,\cdot)\|^2] = \sum_{k>N}\sum_{|\alpha|=k}\frac{\|\varphi_\alpha(\Delta,\cdot)\|^2}{\alpha!}.
\]
Similar to the proof of Proposition A.1 in [315], we have
\[
\sum_{|\alpha|=k}\frac{\varphi^2_\alpha(\Delta,x)}{\alpha!} = \int^{(k)}\big|F(\Delta;s^k;x)\big|^2\,ds^k.
\]
Then by the Fubini theorem (see Appendix D),
\[
\sum_{|\alpha|=k}\frac{\|\varphi_\alpha(\Delta,\cdot)\|^2}{\alpha!} = \int^{(k)}\big\|F(\Delta;s^k;\cdot)\big\|^2\,ds^k. \qquad (8.3.10)
\]
Assume first that $r > 0$ is an integer; when $r > 0$ is not an integer, we use $\lceil r\rceil$ instead.

Denote $X_k = \mathcal{T}_{s_k-s_{k-1}}\mathcal{M}\cdots\mathcal{T}_{s_2-s_1}\mathcal{M}\mathcal{T}_{s_1}u_0$ and $Y_k = \mathcal{M}X_k$ for $k\ge 1$, and also $X = \mathcal{T}_{\Delta-s_k}Y_k$. Then $X_k = \mathcal{T}_{s_k-s_{k-1}}Y_{k-1}$ and $Y_{k-1} = \mathcal{M}X_{k-1}$.

By the definition of $F$, (8.3.1), (8.3.3), and (8.3.5), we have for $r\ge k$:
\[
\big\|F(\Delta;s^k;\cdot)\big\|^2 \le e^{2C_{\mathcal{L}}(\Delta-s_k)}\|Y_k\|^2_{H^0} = e^{2C_{\mathcal{L}}(\Delta-s_k)}\|\mathcal{M}X_k\|^2_{H^0} \le C(0,\mathcal{M})\,e^{2C_{\mathcal{L}}(\Delta-s_k)}\|X_k\|^2_{H^1}
\]
\[
\le C_1\,e^{2C_{\mathcal{L}}(\Delta-s_{k-1})}\|Y_{k-1}\|^2_{H^1} \le \cdots \le C_k^k\,e^{2C_{\mathcal{L}}\Delta}\,\|u_0\|^2_{H^k},
\]
where $C(k-1,\mathcal{M})$ is from (8.3.3) and $C_k$ is defined in (8.3.5). We then have
\[
\int^{(k)}\big\|F(\Delta;s^k;\cdot)\big\|^2\,ds^k \le C_k^k\,e^{2C_{\mathcal{L}}\Delta}\,\|u_0\|^2_{H^k}\int^{(k)} ds^k. \qquad (8.3.11)
\]


If $r < k$, by changing the order of integration and applying (8.3.1), (8.3.3), and (8.3.2), we get
\[
\int^{(k)}\big\|F(\Delta;s^k;\cdot)\big\|^2\,ds^k = \int^{(k)}\|X\|^2\,ds^k = \int_{(k)}\|X\|^2\,ds^k \le \int_{(k)} e^{2C_{\mathcal{L}}(\Delta-s_k)}\|Y_k\|^2\,ds^k = \int_{(k)} e^{2C_{\mathcal{L}}(\Delta-s_k)}\|\mathcal{M}X_k\|^2\,ds^k
\]
\[
\le C(0,\mathcal{M})\int_{(k)} e^{2C_{\mathcal{L}}(\Delta-s_k)}\|X_k\|^2_{H^1}\,ds^k = C(0,\mathcal{M})\int_{(k-1)}\int_{s_{k-1}}^{\Delta} e^{2C_{\mathcal{L}}(\Delta-s_k)}\|X_k\|^2_{H^1}\,ds_k\,ds^{k-1} \le \delta_{\mathcal{L}}^{-1}C_1\int_{(k-1)} e^{2C_{\mathcal{L}}(\Delta-s_{k-1})}\|Y_{k-1}\|^2\,ds^{k-1},
\]
where $C_1$ is from (8.3.5). Repeating this procedure and using (8.3.11), we obtain
\[
\int^{(k)}\big\|F(\Delta;s^k;\cdot)\big\|^2\,ds^k \le \delta_{\mathcal{L}}^{r-k}C_1^{k-r}\int_{(r)} e^{2C_{\mathcal{L}}(\Delta-s_r)}\|Y_r\|^2\,ds^r \le \delta_{\mathcal{L}}^{r-k}C_1^{k-r}C_r^r\,e^{2C_{\mathcal{L}}\Delta}\,\|u_0\|^2_{H^r}\int^{(r)} ds^r. \qquad (8.3.12)
\]

By (8.3.9), (8.3.10), (8.3.11), and (8.3.12), together with $\int^{(k)} ds^k = \frac{\Delta^k}{k!}$, we conclude that, for $r\ge N+1$ and $C_1 < \delta_{\mathcal{L}}$,
\[
E[\|u(\Delta,\cdot) - u_N(\Delta,\cdot)\|^2] = \sum_{N<k\le r}\int^{(k)}\big\|F(\Delta;s^k;\cdot)\big\|^2\,ds^k + \sum_{k>r}\int^{(k)}\big\|F(\Delta;s^k;\cdot)\big\|^2\,ds^k
\]
\[
\le \sum_{N<k\le r}\frac{\Delta^k}{k!}\,C_r^k\,e^{2C_{\mathcal{L}}\Delta}\,\|u_0\|^2_{H^k} + \frac{\Delta^r}{r!}\,C_r^r\,e^{2C_{\mathcal{L}}\Delta}\,\|u_0\|^2_{H^r}\sum_{k>r}\delta_{\mathcal{L}}^{r-k}C_1^{k-r}
\]
\[
\le (C_r\Delta)^{N+1}e^{2C_{\mathcal{L}}\Delta}\Big[\frac{e^{C_r\Delta}}{(N+1)!} + \frac{(C_r\Delta)^{r-N-1}}{r!}\,\frac{\delta_{\mathcal{L}}}{\delta_{\mathcal{L}}-C_1}\Big]\|u_0\|^2_{H^r}. \qquad \square
\]
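The combinatorial factor $\int^{(k)} ds^k = \Delta^k/k!$ used in the proof is simply the volume of the ordered simplex $\{0 < s_1 < \cdots < s_k < \Delta\}$. A quick Monte Carlo sanity check of this fact (Python; the parameter values $k = 3$, $\Delta = 2$ are our own choice for illustration):

```python
import numpy as np
from math import factorial

rng = np.random.default_rng(1)

k, Delta, N = 3, 2.0, 400_000
pts = rng.uniform(0, Delta, size=(N, k))
# fraction of uniform points in the cube that fall in the ordered region s_1 < s_2 < s_3
frac = np.all(np.diff(pts, axis=1) > 0, axis=1).mean()
vol_mc = frac * Delta**k                 # Monte Carlo estimate of the simplex volume
vol_exact = Delta**k / factorial(k)      # Delta^k / k!
```

Since any of the $k!$ orderings of a uniform sample is equally likely, the hit fraction concentrates at $1/k!$, recovering $\Delta^k/k!$.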


Remark 8.3.5 Lemma 8.3.3 holds for $r = \infty$ if $C_\infty < \infty$. Based on (8.3.11), we can prove that
\[
E[\|u(\Delta,\cdot) - u_N(\Delta,\cdot)\|^2] \le \sum_{k>N}\frac{\Delta^k}{k!}\,C_\infty^k\,e^{2C_{\mathcal{L}}\Delta}\,\|u_0\|^2_{H^k} \le (C_\infty\Delta)^{N+1}e^{2C_{\mathcal{L}}\Delta}\,\frac{e^{C_\infty\Delta}}{(N+1)!}\,\|u_0\|^2_{H^\infty}.
\]
If $r < \infty$, we need to require that $C_1 < \delta_{\mathcal{L}}$, i.e., $C(0,\mathcal{M})C(1,\mathcal{L}) < \delta_{\mathcal{L}}$. For example, for $\mathcal{L} = \Delta$ (the Laplacian) and $\mathcal{M}_1 = \frac{1}{2}D_1$, we have $C(0,\mathcal{M})C(1,\mathcal{L}) = \frac{1}{2} < \delta_{\mathcal{L}} = 1$.

Proof of Lemma 8.3.4. It can be proved as in [315, p. 447] that
\[
E[|u_N(\Delta,\cdot) - u_{N,n}(\Delta,\cdot)|^2] = \sum_{l\ge n+1}\sum_{k=1}^{N}\sum_{|\alpha|=k,\ i^\alpha_k=l}\frac{\varphi^2_\alpha(\Delta,\cdot)}{\alpha!}, \qquad (8.3.13)
\]
where $i^\alpha_{|\alpha|}$ is the index of the last nonzero element of $\alpha$, and the last summation on the right-hand side can be bounded by (see [315, (3.7)])
\[
\sum_{|\alpha|=k,\ i^\alpha_k=l}\frac{\varphi^2_\alpha(\Delta,x)}{\alpha!} \le \int^{(k-1)}\Big|\sum_{j=1}^{k}\int_{s_{j-1}}^{s_{j+1}} F_j(\Delta;s^k;x)\,M_l(s_j)\,ds_j\Big|^2\,ds^k_j,
\]
where $ds^k_j = ds_1\cdots ds_{j-1}\,ds_{j+1}\cdots ds_k$, $s_0 := 0$, $s_{k+1} := \Delta$, $M_l(t) = \int_0^t m_l(s)\,ds$, and
\[
F_j(\Delta;s^k;x) = \frac{\partial F(\Delta;s^k;x)}{\partial s_j} = \mathcal{T}_{\Delta-s_k}\mathcal{M}\cdots\mathcal{T}_{s_{j+1}-s_j}\mathcal{M}\mathcal{L}\mathcal{T}_{s_j-s_{j-1}}\cdots\mathcal{T}_{s_1}u_0(x) - \mathcal{T}_{\Delta-s_k}\mathcal{M}\cdots\mathcal{M}\mathcal{L}\mathcal{T}_{s_{j+1}-s_j}\cdots\mathcal{T}_{s_1}u_0(x) =: F^1_j + F^2_j.
\]

Then by the Fubini theorem (see Appendix D) and the Cauchy-Schwarz inequality, we have
\[
\sum_{|\alpha|=k,\ i^\alpha_k=l}\frac{\|\varphi_\alpha(\Delta,\cdot)\|^2}{\alpha!} \le k\int^{(k-1)}\sum_{j=1}^{k}\int_{s_{j-1}}^{s_{j+1}}\big\|F_j(\Delta;s^k;\cdot)\big\|^2\,ds_j\int_{s_{j-1}}^{s_{j+1}} M_l^2(s_j)\,ds_j\,ds^k_j.
\]
We claim that
\[
\big\|F_j(\Delta;s^k;\cdot)\big\|^2 \le 2\max_{1\le j\le k}\big\|F^1_j\big\|^2 \le 2\,C_{k+2}^{k}\,C(k,\mathcal{L})\,C(k+2,\mathcal{L})\,e^{2C_{\mathcal{L}}\Delta}\,\|u_0\|^2_{H^{k+2}}. \qquad (8.3.14)
\]


Thus, by (8.3.14) we have
\[
\sum_{|\alpha|=k,\ i^\alpha_k=l}\frac{\|\varphi_\alpha(\Delta,\cdot)\|^2}{\alpha!} \le 2k\Delta\,C_{k+2}^{k}\,C(k+2,\mathcal{L})\,C(k,\mathcal{L})\,e^{2C_{\mathcal{L}}\Delta}\,\|u_0\|^2_{H^{k+2}}\int_0^\Delta M_l^2(s)\,ds\int^{(k-1)} ds^k_j. \qquad (8.3.15)
\]

Then by (8.3.13), (8.3.15), and $M_l(t) = \frac{\sqrt{2\Delta}}{(l-1)\pi}\sin\big(\frac{(l-1)\pi}{\Delta}t\big)$ (by (2.2.6)), we obtain
\[
E[\|u_N(\Delta,\cdot) - u_{N,n}(\Delta,\cdot)\|^2] \le \sum_{l\ge n+1}\frac{\Delta^2}{(l-1)^2\pi^2}\,e^{2C_{\mathcal{L}}\Delta}\sum_{k=1}^{N} C_{k+2}^{k}\,C(k+2,\mathcal{L})\,C(k,\mathcal{L})\,\|u_0\|^2_{H^{k+2}}\,\frac{2k\Delta^k}{(k-1)!}
\]
\[
\le \frac{2\Delta^3}{n\pi^2}\,e^{2C_{\mathcal{L}}\Delta}\sum_{k=1}^{N} C_{k+2}^{k}\,C(k,\mathcal{L})\,C(k+2,\mathcal{L})\,\|u_0\|^2_{H^{k+2}}\,\frac{k\Delta^{k-1}}{(k-1)!} \le \frac{2\Delta^3}{n\pi^2}\,C_{N+2}\,C(N+2,\mathcal{L})\,C(N,\mathcal{L})\,e^{2C_{N+2}\Delta+2C_{\mathcal{L}}\Delta}\,\|u_0\|^2_{H^{N+2}}.
\]

It remains to prove (8.3.14). Note that it is sufficient to estimate $\|F^1_j\|$, due to the same structure of the two terms in $F_j(\Delta;s^k;x)$. By the assumption that $a_{i,j}$, $b_i$, and $c$ belong to $C_b^{N+3}(D)$, it can be readily checked that (8.3.4) holds with $l\le N+1$. Repeatedly using (8.3.1) and (8.3.3) gives
\[
\big\|F^1_j\big\|^2 = \big\|\mathcal{T}_{\Delta-s_k}\mathcal{M}\cdots\mathcal{T}_{s_{j+1}-s_j}\mathcal{M}\mathcal{L}\mathcal{T}_{s_j-s_{j-1}}\cdots\mathcal{T}_{s_1}u_0\big\|^2 \le e^{2C_{\mathcal{L}}(\Delta-s_k)}\big\|\mathcal{M}\cdots\mathcal{T}_{s_{j+1}-s_j}\mathcal{M}\mathcal{L}\mathcal{T}_{s_j-s_{j-1}}\cdots\mathcal{T}_{s_1}u_0\big\|^2
\]
\[
\le C(0,\mathcal{M})\,e^{2C_{\mathcal{L}}(\Delta-s_k)}\big\|\mathcal{T}_{s_k-s_{k-1}}\cdots\mathcal{T}_{s_{j+1}-s_j}\mathcal{M}\mathcal{L}\mathcal{T}_{s_j-s_{j-1}}\cdots\mathcal{T}_{s_1}u_0\big\|^2_{H^1} \le C_1\,e^{2C_{\mathcal{L}}(\Delta-s_{k-1})}\big\|\mathcal{M}\cdots\mathcal{T}_{s_{j+1}-s_j}\mathcal{M}\mathcal{L}\mathcal{T}_{s_j-s_{j-1}}\cdots\mathcal{T}_{s_1}u_0\big\|^2_{H^1}
\]
\[
\le \cdots \le C_{k-j}^{k-j}\,e^{2C_{\mathcal{L}}(\Delta-s_j)}\big\|\mathcal{M}\mathcal{L}\mathcal{T}_{s_j-s_{j-1}}\cdots\mathcal{T}_{s_1}u_0\big\|^2_{H^{k-j}} \le C_{k-j}^{k-j}\,C(k-j,\mathcal{M})\,e^{2C_{\mathcal{L}}(\Delta-s_j)}\big\|\mathcal{L}\mathcal{T}_{s_j-s_{j-1}}\cdots\mathcal{T}_{s_1}u_0\big\|^2_{H^{k-j+1}}
\]
\[
\le C_{k-j}^{k-j}\,C(k-j,\mathcal{M})\,C(k-j+1,\mathcal{L})\,e^{2C_{\mathcal{L}}(\Delta-s_j)}\big\|\mathcal{T}_{s_j-s_{j-1}}\cdots\mathcal{T}_{s_1}u_0\big\|^2_{H^{k-j+3}} \le C_{k-j+1}^{k-j+1}\,C(k-j+1,\mathcal{L})\,e^{2C_{\mathcal{L}}(\Delta-s_j)}\big\|\mathcal{T}_{s_j-s_{j-1}}\mathcal{M}\cdots\mathcal{T}_{s_1}u_0\big\|^2_{H^{k-j+3}},
\]


where we have used (8.3.4) in the second-to-last step and the fact that $C(k-j+1,\mathcal{L})\ge 1$. Similarly, we have
\[
\big\|\mathcal{T}_{s_j-s_{j-1}}\mathcal{M}\cdots\mathcal{T}_{s_1}u_0\big\|^2_{H^{k-j+3}} \le C(k-j+3,\mathcal{L})\,C_{k+2}^{j-1}\,e^{2C_{\mathcal{L}}s_j}\,\|u_0\|^2_{H^{k+2}}.
\]
Thus, we arrive at (8.3.14). This ends the proof of Lemma 8.3.4. $\square$

8.3.2 Error estimate for SCM

For the SCM applied to the SPDE (3.3.23), we have the following estimates: the first is the weak convergence of the Wong-Zakai-type approximation $u_{\Delta,n}(t,x)$ from (8.2.4) to $u(t,x)$ from (3.3.23), see Theorem 8.3.6; the second is the convergence of the SCM, i.e., the convergence of $\mathcal{M}_{\Delta,L,n}(t_i,x)$ to $E[u^2_{\Delta,n}(t_i,x)]$, see Theorem 8.3.8. Here we prove the convergence rate when $\sigma_{i,r} = 0$, which belongs to the case of commutative noises (3.3.29). Our proof of Theorem 8.3.6 is based on the mean-square convergence of the Wong-Zakai-type approximation (8.2.4) to (3.3.23). When $\sigma_{i,r}\ne 0$, we have not succeeded in proving this mean-square convergence and, as far as we know, only a rate of almost sure convergence of the Wong-Zakai-type approximations to (3.3.23) has been proved [201].

According to Theorems 8.3.6 and 8.3.8, the error of the SCM in the second-order moments is $O(\Delta^{2L-1}) + O(\Delta)$. Compared to Corollary 8.3.2, the SCM is one order lower than the WCE when $N = 2$, as the error of the WCE is $O(\Delta^N) + O(\Delta^2)$.

To prove Theorems 8.3.6 and 8.3.8, we need a probabilistic representation of the solution to (3.3.23). Let $(\{B_k(s)\},\ 1\le k\le d,\ \mathcal{F}^B_s)$ be a system of one-dimensional standard Wiener processes on a complete probability space $(\Omega_1,\mathcal{F}_1,Q)$, independent of $w(s)$ on the product space $(\Omega\otimes\Omega_1,\mathcal{F}\otimes\mathcal{F}_1,P\otimes Q)$. Consider the following backward stochastic differential equation on $(\Omega_1,\mathcal{F}_1,Q)$, for $0\le s\le t$:
\[
\overleftarrow{d}X_{t,x}(s) = b(X_{t,x}(s))\,ds + \sum_{r=1}^{d}\alpha_r(X_{t,x}(s))\,\overleftarrow{d}B_r(s), \qquad X_{t,x}(t) = x. \qquad (8.3.16)
\]
The symbol $\overleftarrow{d}$ denotes the backward (stochastic) differential; see, e.g., [280, 408] for the treatment of backward stochastic integrals. The $d\times d$ matrix $\alpha(x)$ is defined by $\alpha(x)\alpha^{\top}(x) = 2a(x)$. Here $a(x)$ and $b(x)$ are from (3.3.24). Consider also the following backward stochastic differential equation on $(\Omega\otimes\Omega_1,\mathcal{F}\otimes\mathcal{F}_1,P\otimes Q)$, for $0\le s\le t$:
\[
\overleftarrow{d}Y_{t,1,x}(s) = c(X_{t,x}(s))\,Y_{t,1,x}(s)\,ds + \sum_{r=1}^{q}\nu_r(X_{t,x}(s))\,Y_{t,1,x}(s)\,\overleftarrow{d}W_r(s), \qquad Y_{t,1,x}(t) = 1. \qquad (8.3.17)
\]

Here $c(x)$ and $\nu_r(x)$ are from (3.3.24). When $u_0(x)\in C^2_b(D)$, $\alpha(x), b(x), c(x), \nu_r(x)\in C^0_b(D)$, and $\sigma_{i,r} = 0$, the solution to (3.3.23)-(3.3.24) can be represented by (see, e.g., [280])


\[
u(t,x) = E^Q\Big[u_0(X_{t,x}(0))\,\exp\Big(\sum_{r=1}^{q}\int_0^t \nu_r(X_{t,x}(s))\,\overleftarrow{d}W_r(s) + \int_0^t \tilde{c}(X_{t,x}(s))\,ds\Big)\Big], \qquad (8.3.18)
\]
where $\tilde{c}(x) = c(x) - \frac{1}{2}\sum_{r=1}^{q}\nu_r^2(x)$.

Theorem 8.3.6 (Weak convergence of the Wong-Zakai approximation) Assume that $\sigma_{i,r} = 0$ and that the initial condition $u_0$ and the coefficients in (3.3.24) are in $C^2_b(D)$. Let $u(t,x)$ be the solution to (3.3.23) and $u_{\Delta,n}(t,x)$ be the solution to (8.2.4). Then for any $\varepsilon > 0$ there exists a constant $C > 0$ such that the one-step error is estimated by
\[
\big|E[u^2(\Delta,x)] - E[u^2_{\Delta,n}(\Delta,x)]\big| \le C\exp(C\Delta)(\Delta^6 + \Delta^2)\,n^{-1+\varepsilon}, \qquad (8.3.19)
\]
and the global error is estimated by
\[
\big|E[u^2(t_i,x)] - E[u^2_{\Delta,n}(t_i,x)]\big| \le C\exp(CT)\,\Delta\,n^{-1+\varepsilon}, \quad 1\le i\le K. \qquad (8.3.20)
\]

To prove Theorem 8.3.6, we first establish the one-step error (8.3.19) and then the global error (8.3.20). We follow the recipe of the proofs in [227, Theorem 3.1] and [54, Theorem 4.4], where $n = 1$ and $K > 1$.

We need the following mean-square convergence rate for the one-step error.

Proposition 8.3.7 (Mean-square convergence) Assume that $\sigma_{i,r} = 0$ and that the initial condition $u_0$ and the coefficients in (3.3.24) are in $C^2_b(D)$. Let $u(t,x)$ be the solution to (3.3.23) and $u_{\Delta,n}(t,x)$ the solution to (8.2.4). Then for any $\varepsilon > 0$,
\[
E[|u(\Delta,x) - u_{\Delta,n}(\Delta,x)|^2] \le C\exp(C\Delta)(\Delta^3 + \Delta^2)\,n^{-1+\varepsilon}, \qquad (8.3.21)
\]
where the constant $C > 0$ is independent of $n$.

Proof. The solution to (8.2.4) using the spectral truncation $W^{(\Delta,n)}_r$ of Brownian motion from (8.2.3) can be represented by (see, e.g., [54, 227])
\[
u_{\Delta,n}(\Delta,x) = E^Q\Big[u_0(X_{\Delta,x}(0))\,\exp\Big(\sum_{r=1}^{q}\int_0^\Delta \nu_r(X_{\Delta,x}(s))\,dW^{(\Delta,n)}_r(s) + \int_0^\Delta \tilde{c}(X_{\Delta,x}(s))\,ds\Big)\Big]. \qquad (8.3.22)
\]
Using $e^x - e^y = e^{\theta x+(1-\theta)y}(x-y)$ for some $0\le\theta\le 1$, the boundedness of $\tilde{c}(x)$ and $u_0(x)$, and the Cauchy-Schwarz inequality (twice), we have for some $C > 0$:


\[
E[|u_{\Delta,n}(\Delta,x) - u(\Delta,x)|^2] \qquad (8.3.23)
\]
\[
= E\Big[\Big(E^Q\Big[u_0(X_{\Delta,x}(0))\exp\Big(\int_0^\Delta \tilde{c}(X_{\Delta,x}(s))\,ds\Big)\exp\Big(\sum_{r=1}^{q}\int_0^\Delta \nu_r(X_{\Delta,x}(s))\big[\theta\,dW^{(\Delta,n)}_r(s) + (1-\theta)\,dW_r(s)\big]\Big) \times \Big(\sum_{r=1}^{q}\int_0^\Delta \nu_r(X_{\Delta,x}(s))\big[dW^{(\Delta,n)}_r(s) - dW_r(s)\big]\Big)\Big]\Big)^2\Big]
\]
\[
\le C\exp(C\Delta)\,E\Big[\Big(E^Q\Big[\exp\Big(\sum_{r=1}^{q}\int_0^\Delta \nu_r(X_{\Delta,x}(s))\big[\theta\,dW^{(\Delta,n)}_r(s) + (1-\theta)\,dW_r(s)\big]\Big) \times \Big|\sum_{r=1}^{q}\int_0^\Delta \nu_r(X_{\Delta,x}(s))\big[dW^{(\Delta,n)}_r(s) - dW_r(s)\big]\Big|\Big]\Big)^2\Big]
\]
\[
\le C\exp(C\Delta)\Big(E\Big[E^Q\Big[\exp\Big(\sum_{r=1}^{q}\int_0^\Delta 4\nu_r(X_{\Delta,x}(s))\big[\theta\,dW^{(\Delta,n)}_r(s) + (1-\theta)\,dW_r(s)\big]\Big)\Big]\Big]\Big)^{1/2} \times \Big(E\Big[E^Q\Big[\Big(\sum_{r=1}^{q}\int_0^\Delta \nu_r(X_{\Delta,x}(s))\big[dW^{(\Delta,n)}_r(s) - dW_r(s)\big]\Big)^4\Big]\Big]\Big)^{1/2}.
\]
Recall that $E[\cdot] = E^P[\cdot]$ is the expectation with respect to $P$ only. Hence, we need to estimate
\[
I_1 = \Big(E\Big[E^Q\Big[\Big(\sum_{r=1}^{q}\int_0^\Delta \nu_r(X_{\Delta,x}(s))\big[dW^{(\Delta,n)}_r(s) - dW_r(s)\big]\Big)^4\Big]\Big]\Big)^{1/2}
\]
and
\[
I_2 = \Big(E\Big[E^Q\Big[\exp\Big(\sum_{r=1}^{q}\int_0^\Delta 4\nu_r(X_{\Delta,x}(s))\big[\theta\,dW^{(\Delta,n)}_r(s) + (1-\theta)\,dW_r(s)\big]\Big)\Big]\Big]\Big)^{1/2}.
\]

We first estimate $I_1$. Due to the independence of the $B_k$ and the $W_r$, and according to [386] and (8.2.1), we have
\[
\int_0^\Delta \nu_r(X_{\Delta,x}(s))\,dW_r(s) = \int_0^\Delta \nu_r(X_{\Delta,x}(s))\circ dW_r(s) = \sum_{i=0}^{\infty}\xi_{r,i}\int_0^\Delta \nu_r(X_{\Delta,x}(s))\,m_{r,i}(s)\,ds.
\]
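The rate $n^{-1+\varepsilon}$ in the estimates above ultimately comes from the $L^2$-truncation of Brownian motion in this expansion. Assuming the cosine CONS $m_1(s) = 1/\sqrt{\Delta}$, $m_i(s) = \sqrt{2/\Delta}\cos((i-1)\pi s/\Delta)$, whose antiderivatives are the $M_l(t)$ appearing in the proof of Lemma 8.3.4, Parseval's identity gives the pointwise truncation variance $E[(W(t) - W^{(\Delta,n)}(t))^2] = \sum_{i>n} M_i(t)^2$. A small numerical check of the $O(1/n)$ decay (Python; a sketch with our own parameter choices):

```python
import numpy as np

Delta = 1.0

def M(i, t):
    # M_i(t) = integral of m_i over [0, t] for the cosine CONS on [0, Delta]:
    # m_1 = 1/sqrt(Delta), m_i = sqrt(2/Delta) cos((i-1) pi s / Delta), i >= 2
    if i == 1:
        return t / np.sqrt(Delta)
    return np.sqrt(2.0 * Delta) / ((i - 1) * np.pi) * np.sin((i - 1) * np.pi * t / Delta)

def truncation_var(t, n, tail=20_000):
    # E[(W(t) - W^{(Delta,n)}(t))^2] = sum_{i > n} M_i(t)^2 by Parseval
    return sum(M(i, t) ** 2 for i in range(n + 1, n + tail))

v10 = truncation_var(0.3, 10)
v20 = truncation_var(0.3, 20)   # roughly halves when n doubles: O(1/n) decay
```

Since $M_i(t)^2 = O(i^{-2})$, the tail sum behaves like $\Delta/(\pi^2 n)$ on average, matching the $n^{-1+\varepsilon}$ factor in (8.3.24)-(8.3.25).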


Thus, by the Fubini theorem (see Appendix D), (8.2.1), and (8.2.3), we can represent $I_1$ as
\[
I_1 = \Big(E^Q\Big[E\Big[\Big|\sum_{r=1}^{q}\Big[\int_0^\Delta \nu_r(X_{\Delta,x}(s))\,dW^{(\Delta,n)}_r(s) - \int_0^\Delta \nu_r(X_{\Delta,x}(s))\circ dW_r(s)\Big]\Big|^4\Big]\Big]\Big)^{1/2}
\]
\[
= \Big(E^Q\Big[E\Big[\Big|\sum_{r=1}^{q}\sum_{i=n+1}^{\infty}\xi_{r,i}\int_0^\Delta \nu_r(X_{\Delta,x}(s))\,m_{r,i}(s)\,ds\Big|^4\Big]\Big]\Big)^{1/2} \le \Big(3\,E^Q\Big[\Big(\sum_{r=1}^{q}\sum_{i=n+1}^{\infty}\Big(\int_0^\Delta \nu_r(X_{\Delta,x}(s))\,m_{r,i}(s)\,ds\Big)^2\Big)^2\Big]\Big)^{1/2},
\]
where we have used twice the fact that $X_{\Delta,x}$ is independent of $W_r$ and $W^{(\Delta,n)}_r$. Then, by standard estimates of the $L^2$-projection error (cf. [58, (5.1.10)]), we have for $0 < \varepsilon < 1$,
\[
\sum_{i=n+1}^{\infty}\Big(\int_0^\Delta \nu_r(X_{\Delta,x}(s))\,m_{r,i}(s)\,ds\Big)^2 \le C\,\Delta^{1-\varepsilon}n^{-1+\varepsilon}\,\big|\nu_r(X_{\Delta,x}(\cdot))\big|^2_{\frac{1-\varepsilon}{2},2,[0,\Delta]}, \qquad (8.3.24)
\]
where the Slobodeckij semi-norm $|f|_{\theta,p,[0,\Delta]}$ is defined by (3.3.10) and the factor $\Delta^{1-\varepsilon}$ appears due to the length of the domain; see, e.g., [58, Chapter 5.4]. Thus, we obtain
\[
I_1 \le C\,\Delta^{1-\varepsilon}n^{-1+\varepsilon}\Big(\sum_{r=1}^{q}E^Q\big[\big|\nu_r(X_{\Delta,x}(\cdot))\big|^4_{\frac{1-\varepsilon}{2},2,[0,\Delta]}\big]\Big)^{1/2}, \quad 0 < \varepsilon < 1. \qquad (8.3.25)
\]

By (8.3.16) and the Itô formula, we have
\[
X_{\Delta,x}(s) - X_{\Delta,x}(s_1) = \int_{s_1}^{s} b(X_{\Delta,x}(s_2))\,ds_2 + \sum_{k=1}^{d}\alpha_k(X_{\Delta,x}(s_1))\big[B_k(s) - B_k(s_1)\big] + R(s_1,s),
\]
where $E^Q[|R(s_1,s)|^{2l}] \le C|s_1 - s|^{2l}$ ($l\ge 1$) when $b(x)$ and $\alpha_k(x)$ belong to $C^2_b(D)$. By the Lipschitz continuity of $\nu_r$ and the definition of the Slobodeckij semi-norm, it is not difficult to show that
\[
E^Q\big[\big|\nu_r(X_{\Delta,x}(\cdot))\big|^4_{\frac{1-\varepsilon}{2},2,[0,\Delta]}\big] \le C(\Delta^{4+2\varepsilon} + \Delta^{2+2\varepsilon}). \qquad (8.3.26)
\]
Thus, by (8.3.25) and (8.3.26), we have
\[
I_1 \le C(\Delta^3 + \Delta^2)\,n^{-1+\varepsilon}. \qquad (8.3.27)
\]


Now we estimate $I_2$. Using the following facts (see, e.g., [227, Lemma 2.5]):
\[
E\Big[\exp\Big(\sum_{r=1}^{q}\int_0^\Delta 4\nu_r(X_{\Delta,x}(s))\,dW_r\Big)\Big] = \exp\Big(\sum_{r=1}^{q}8\int_0^\Delta \nu_r^2(X_{\Delta,x}(s))\,ds\Big),
\]
\[
E\Big[\exp\Big(\sum_{r=1}^{q}\int_0^\Delta 4\nu_r(X_{\Delta,x}(s))\,dW^{(\Delta,n)}_r(s)\Big)\Big] \le 4\exp\Big(\sum_{r=1}^{q}8\int_0^\Delta \nu_r^2(X_{\Delta,x}(s))\,ds\Big),
\]
we have $I_2 \le 4\exp(C\Delta)$. From this, (8.3.27), and (8.3.23), we reach (8.3.21). $\square$

Now we are ready to prove Theorem 8.3.6, i.e., the convergence of the second moments.

Proof of Theorem 8.3.6. For simplicity of notation, we consider $q = 1$; the case $q > 1$ can be proved similarly. Denote
\[
U_{\Delta,n,m,\theta}(t,x,\mathbf{y}) := u_0(X_{t,x}(0))\,\exp\Big(\sum_{i=1}^{n}\nu_{1,i}\,y_i + \theta\sum_{j=n+1}^{m}\nu_{1,j}\,y_j + \int_0^t \tilde{c}(X_{t,x}(s))\,ds\Big), \quad m\ge n,
\]
where $\nu_{1,i}(t,x) = \int_0^t \nu_1(X_{t,x}(s))\,m_i(s)\,ds$ for $i\le m$ ($X_{t,x}(s)$ is the solution to (8.3.16)) and $\mathbf{y} = (y_1,\ldots,y_n,y_{n+1},\ldots,y_m)$. Write $u_{\Delta,n,m,\theta}(t,x,\Xi) = E^Q[U_{\Delta,n,m,\theta}(t,x,\Xi)]$, where $\Xi = (\xi_1,\ldots,\xi_n,\xi_{n+1},\ldots,\xi_m)$. With this notation, we have
\[
u_{\Delta,m}(t,x) = u_{\Delta,n,m,1}(t,x,\Xi), \qquad u_{\Delta,n}(t,x) = u_{\Delta,n,m,0}(t,x,\Xi).
\]

For $m > n$, by the first-order Taylor expansion, we have
\[
\big|E[u^2_{\Delta,m}(\Delta,x) - u^2_{\Delta,n}(\Delta,x)]\big|
\]
\[
= \Big|2\sum_{i,j=n+1}^{m}\frac{1}{\delta_{i,j}+1}\int_0^1 \theta(1-\theta)\,E\big[u_{\Delta,n,m,\theta}(\Delta,x,\Xi)\,E^Q[U_{\Delta,n,m,\theta}(\Delta,x,\Xi)\,\nu_{1,i}(\Delta,x)\,\nu_{1,j}(\Delta,x)]\,\xi_i\xi_j\big]\,d\theta
\]
\[
\quad + 2\sum_{i,j=n+1}^{m}\frac{1}{\delta_{i,j}+1}\int_0^1 \theta(1-\theta)\,E\big[E^Q[U_{\Delta,n,m,\theta}(\Delta,x,\Xi)\,\nu_{1,i}(\Delta,x)]\,E^Q[U_{\Delta,n,m,\theta}(\Delta,x,\Xi)\,\nu_{1,j}(\Delta,x)]\,\xi_i\xi_j\big]\,d\theta\Big|
\]
\[
\le 2\Big|\int_0^1 (1-\theta)\theta\,E\Big[u_{\Delta,n,m,\theta}(\Delta,x,\Xi)\,E^Q\Big[U_{\Delta,n,m,\theta}(\Delta,x,\Xi)\Big(\sum_{i=n+1}^{m}\nu_{1,i}(\Delta,x)\xi_i\Big)^2\Big]\Big]\,d\theta\Big| + 2\Big|\int_0^1 (1-\theta)\theta\,E\Big[\Big(\sum_{i=n+1}^{m}E^Q[U_{\Delta,n,m,\theta}(\Delta,x,\Xi)\,\nu_{1,i}(\Delta,x)]\,\xi_i\Big)^2\Big]\,d\theta\Big|, \qquad (8.3.28)
\]


where $\delta_{i,j} = 1$ if $i = j$ and $0$ otherwise, and where we have used the facts that the $\xi_i$, $i > n$, are independent of $u_{\Delta,n}(t,x)$ and that $E[\xi_i] = 0$.

By the Cauchy-Schwarz inequality (applied twice), we have for the first term in (8.3.28):
\[
2\Big|\int_0^1 (1-\theta)\theta\,E\Big[u_{\Delta,n,m,\theta}(\Delta,x,\Xi)\,E^Q\Big[U_{\Delta,n,m,\theta}(\Delta,x,\Xi)\Big(\sum_{i=n+1}^{m}\nu_{1,i}(\Delta,x)\xi_i\Big)^2\Big]\Big]\,d\theta\Big| \le C\Big(E\Big[E^Q\Big[\Big(\sum_{i=n+1}^{m}\nu_{1,i}(\Delta,x)\xi_i\Big)^8\Big]\Big]\Big)^{1/4}. \qquad (8.3.29)
\]

Here we also used that $E[u^2_{\Delta,n,m,\theta}(\Delta,x,\Xi)]$ and $E[(E^Q[U^2_{\Delta,n,m,\theta}(\Delta,x,\Xi)])^2]$ are bounded by a constant $C$, which can be readily checked in the same way as in the proof of Proposition 8.3.7.

By the Taylor expansion of $U_{\Delta,n,m,\theta}(\Delta,x,\mathbf{y})$ in $\theta$, we have
\[
U_{\Delta,n,m,\theta}(\Delta,x,\mathbf{y}) = U_{\Delta,n,m,0}(\Delta,x,\mathbf{y}) + \sum_{i=n+1}^{m}\nu_{1,i}(\Delta,x)\Big(\int_0^1 (1-\theta_1)\theta_1\theta\,U_{\Delta,n,m,\theta\theta_1}(\Delta,x,\mathbf{y})\,d\theta_1\Big)y_i.
\]

Then, by the Cauchy-Schwarz inequality (applied several times) and the fact that the $\xi_i$, $i > n$, are independent of $U_{\Delta,n,m,0}(t,x,\Xi)$, we have for the second term in (8.3.28):
\[
2\Big|\int_0^1 (1-\theta)\theta\,E\Big[\Big(\sum_{i=n+1}^{m}E^Q[U_{\Delta,n,m,\theta}(\Delta,x,\Xi)\,\nu_{1,i}(\Delta,x)]\,\xi_i\Big)^2\Big]\,d\theta\Big|
\]
\[
\le 4\Big|\int_0^1 (1-\theta)\theta\sum_{i=n+1}^{m}E\big[\big(E^Q[U_{\Delta,n,m,0}(\Delta,x,\Xi)\,\nu_{1,i}(\Delta,x)]\big)^2\big]\,d\theta\Big| + 4\Big|\int_0^1 (1-\theta)\theta^3\,E\Big[\Big(E^Q\Big[\int_0^1 (1-\theta_1)\theta_1\,U_{\Delta,n,m,\theta\theta_1}(\Delta,x,\Xi)\,d\theta_1\Big(\sum_{i=n+1}^{m}\nu_{1,i}(\Delta,x)\xi_i\Big)^2\Big]\Big)^2\Big]\,d\theta\Big|
\]
\[
\le E\big[E^Q[U^2_{\Delta,n,m,0}(\Delta,x,\Xi)]\big]\sum_{i=n+1}^{m}E^Q[\nu^2_{1,i}(\Delta,x)]
\]


\[
\quad + C\Big|\int_0^1 (1-\theta)\theta^3\Big(E\Big[E^Q\Big[\int_0^1 (1-\theta_1)\theta_1\,U^4_{\Delta,n,m,\theta\theta_1}(\Delta,x,\Xi)\,d\theta_1\Big]\Big]\Big)^{1/2}d\theta\Big|\,\Big(E\Big[E^Q\Big[\Big(\sum_{i=n+1}^{m}\nu_{1,i}(\Delta,x)\xi_i\Big)^8\Big]\Big]\Big)^{1/2}
\]
\[
\le C\sum_{i=n+1}^{m}E^Q[\nu^2_{1,i}(\Delta,x)] + C\Big(E\Big[E^Q\Big[\Big(\sum_{i=n+1}^{m}\nu_{1,i}(\Delta,x)\xi_i\Big)^8\Big]\Big]\Big)^{1/2}. \qquad (8.3.30)
\]

Here we used that $E[E^Q[U^2_{\Delta,n,m,0}(\Delta,x,\Xi)]]$ and $E[E^Q[U^4_{\Delta,n,m,\theta\theta_1}(\Delta,x,\Xi)]]$ are bounded by a constant $C$, which can be readily checked in the same way as in the proof of Proposition 8.3.7.

By (8.3.28), (8.3.29), and (8.3.30), we have
\[
\big|E[u^2_{\Delta,m}(\Delta,x) - u^2_{\Delta,n}(\Delta,x)]\big| \le C\sum_{i=n+1}^{m}E^Q[\nu^2_{1,i}(\Delta,x)] + C\Big(E\Big[E^Q\Big[\Big(\sum_{i=n+1}^{m}\nu_{1,i}(\Delta,x)\xi_i\Big)^8\Big]\Big]\Big)^{1/4} + C\Big(E\Big[E^Q\Big[\Big(\sum_{i=n+1}^{m}\nu_{1,i}(\Delta,x)\xi_i\Big)^8\Big]\Big]\Big)^{1/2}. \qquad (8.3.31)
\]

Similar to the proof of (8.3.25), we have
\[
E\Big[E^Q\Big[\Big(\sum_{i=n+1}^{m}\nu_{1,i}(\Delta,x)\xi_i\Big)^8\Big]\Big] \le C\,E^Q\Big[\Big(\sum_{i=n+1}^{m}\nu^2_{1,i}(\Delta,x)\Big)^4\Big] \le C\,\Delta^{4(1-\varepsilon)}\,n^{-4(1-\varepsilon)}\,E^Q\big[\big|\nu_1(X_{\Delta,x}(\cdot))\big|^8_{\frac{1-\varepsilon}{2},2,[0,\Delta]}\big].
\]
Similar to the proof of (8.3.26), we can estimate $E^Q\big[\big|\nu_1(X_{\Delta,x}(\cdot))\big|^8_{\frac{1-\varepsilon}{2},2,[0,\Delta]}\big]$ as follows:
\[
E^Q\big[\big|\nu_1(X_{\Delta,x}(\cdot))\big|^8_{\frac{1-\varepsilon}{2},2,[0,\Delta]}\big] \le C(\Delta^{8+4\varepsilon} + \Delta^{4+4\varepsilon}),
\]
and thus
\[
E\Big[E^Q\Big[\Big(\sum_{i=n+1}^{m}\nu_{1,i}(\Delta,x)\xi_i\Big)^8\Big]\Big] \le C(\Delta^{12} + \Delta^{8}). \qquad (8.3.32)
\]

Similarly, we have
\[
E\Big[E^Q\Big[\Big(\sum_{i=n+1}^{m}\nu_{1,i}(\Delta,x)\xi_i\Big)^2\Big]\Big] = \sum_{i=n+1}^{m}E^Q[\nu^2_{1,i}(\Delta,x)] \le C(\Delta^{3} + \Delta^{2}). \qquad (8.3.33)
\]


By (8.3.31), (8.3.32), and (8.3.33), we have
\[
\big|E[u^2_{\Delta,m}(\Delta,x) - u^2_{\Delta,n}(\Delta,x)]\big| \le C\exp(C\Delta)(\Delta^{6} + \Delta^{2})\,n^{-1+\varepsilon}. \qquad (8.3.34)
\]

By the triangle inequality and the Cauchy-Schwarz inequality, we obtain
\[
\big|E[u^2(\Delta,x) - u^2_{\Delta,n}(\Delta,x)]\big| \le \big|E[u^2(\Delta,x) - u^2_{\Delta,m}(\Delta,x)]\big| + \big|E[u^2_{\Delta,m}(\Delta,x) - u^2_{\Delta,n}(\Delta,x)]\big|
\]
\[
\le C\big(E[|u(\Delta,x) - u_{\Delta,m}(\Delta,x)|^2]\big)^{1/2} + \big|E[u^2_{\Delta,m}(\Delta,x) - u^2_{\Delta,n}(\Delta,x)]\big|.
\]
The one-step error (8.3.19) then follows from (8.3.34) and Proposition 8.3.7 by letting $m\to+\infty$. The global error (8.3.20) is estimated using the recursive nature of Algorithm 7.3.1, as in the proof of [315, Theorem 2.4]. $\square$

The following theorem addresses the convergence of the second moments computed by the SCM to those of the solution to (8.2.4).

Theorem 8.3.8 Let $u_{\Delta,n}(t,x)$ be the solution to (8.2.4) and let $\mathcal{M}_{\Delta,L,n}(t_i,x)$ be the limit of $\mathcal{M}^M_{\Delta,L,n}(t_i,x)$ from (8.2.9) as $M\to\infty$. Under the assumptions of Theorem 8.3.6, for any $\varepsilon > 0$, the one-step error is estimated by
\[
\big|\mathcal{M}_{\Delta,L,n}(\Delta,x) - E[u^2_{\Delta,n}(\Delta,x)]\big| \le C\exp(C\Delta)(\Delta^{3L} + \Delta^{2L})\big(1 + (3c/2)^{L\wedge n}\big)\,\beta^{-(L\wedge n)/2}\,\varepsilon^{-L}L^{-1}n^{L\varepsilon},
\]
and the global error is estimated by, for $1\le i\le K$,
\[
\big|\mathcal{M}_{\Delta,L,n}(t_i,x) - E[u^2_{\Delta,n}(t_i,x)]\big| \le C\exp(CT)\,\Delta^{2L-1}\big(1 + (3c/2)^{L\wedge n}\big)\,\beta^{-(L\wedge n)/2}\,\varepsilon^{-L}L^{-1}n^{L\varepsilon}.
\]
Here the positive constants $C$, $c$, and $\beta < 1$ are independent of $\Delta$, $L$, and $n$; the expression $L\wedge n$ denotes the minimum of $L$ and $n$.

Proof. Setting $\phi(y_1,\ldots,y_n) = u^2_{\Delta,n}(t,x,y_1,\ldots,y_n)$, we have that $A(L,n)\phi$ is the approximation of the second moment of the solution obtained by the sparse grid collocation method. Recall from (8.3.22) that $u_{\Delta,n}(t,x,\mathbf{y}) = E^Q[U_{\Delta,n}(t,x,\mathbf{y})]$, where $U_{\Delta,n}(t,x,\mathbf{y}) = U_{\Delta,n,m,0}(t,x,\mathbf{y})$.

Now we estimate $D^{2\alpha_l}[u^2_{\Delta,n}(\Delta,x,y_1,\ldots,y_n)]$. To this end, we first estimate $D^{\beta_l}[u_{\Delta,n}(\Delta,x,y_1,\ldots,y_n)]$, where $\beta_l\le 2\alpha_l$. By (8.3.24), we have for $0 < \varepsilon < 1$,
\[
\nu^2_{1,k}(\Delta,x) \le C\,\Delta^{1-\varepsilon}\big(\max(k-1,1)\big)^{\varepsilon-1}\,\big|\nu_1(X_{\Delta,x}(\cdot))\big|^2_{\frac{1-\varepsilon}{2},2,[0,\Delta]}.
\]


Then, by the Cauchy-Schwarz inequality, we have
\[
\big|D^{\beta_l}u_{\Delta,n}(\Delta,x,\mathbf{y})\big| = \Big|E^Q\Big[U_{\Delta,n}(\Delta,x,\mathbf{y})\prod_{k=1}^{l}\big(\nu_{1,k}(\Delta,x)\big)^{\beta_k^l}\Big]\Big| \qquad (8.3.35)
\]
\[
\le \big(E^Q[U^2_{\Delta,n}(\Delta,x,\mathbf{y})]\big)^{1/2}\Big(E^Q\Big[\prod_{k=1}^{l}\big(\nu_{1,k}(\Delta,x)\big)^{2\beta_k^l}\Big]\Big)^{1/2} \le (C\Delta^{1-\varepsilon})^{|\beta_l|/2}\prod_{k=2}^{l}(k-1)^{(\varepsilon-1)\beta_k^l/2}\,\big(E^Q[U^2_{\Delta,n}(\Delta,x,\mathbf{y})]\big)^{1/2}\Big(E^Q\big[\big|\nu_1(X_{\Delta,x}(\cdot))\big|^{2|\beta_l|}_{\frac{1-\varepsilon}{2},2,[0,\Delta]}\big]\Big)^{1/2}.
\]

By the Leibniz rule for derivatives of products of multivariate functions, we have
\[
D^{2\alpha_l}[u^2_{\Delta,n}(\Delta,x,\mathbf{y})] = \sum_{\beta_l+\gamma_l=2\alpha_l}(2\alpha_l)!\,\frac{D^{\beta_l}u_{\Delta,n}(\Delta,x,\mathbf{y})}{\beta_l!}\,\frac{D^{\gamma_l}u_{\Delta,n}(\Delta,x,\mathbf{y})}{\gamma_l!},
\]
and thus, by (8.3.35) and the fact that $\sum_{\beta_l+\gamma_l=2\alpha_l}\frac{(2\alpha_l)!}{\beta_l!\,\gamma_l!} = 2^{2|\alpha_l|-1}$, we have
\[
\big|D^{2\alpha_l}[u^2_{\Delta,n}(\Delta,x,\mathbf{y})]\big| \le 2^{2|\alpha_l|-1}(C\Delta^{1-\varepsilon})^{|\alpha_l|}\,E^Q[U^2_{\Delta,n}(\Delta,x,\mathbf{y})]\prod_{k=2}^{l}(k-1)^{(\varepsilon-1)\alpha_k^l} \times \max_{\beta_l+\gamma_l=2\alpha_l}\Big(\big(E^Q\big[\big|\nu_1(X_{\Delta,x}(\cdot))\big|^{2|\beta_l|}_{\frac{1-\varepsilon}{2},2,[0,\Delta]}\big]\big)^{1/2}\big(E^Q\big[\big|\nu_1(X_{\Delta,x}(\cdot))\big|^{2|\gamma_l|}_{\frac{1-\varepsilon}{2},2,[0,\Delta]}\big]\big)^{1/2}\Big).
\]
Similar to (8.3.32), we have $E^Q\big[\big|\nu_1(X_{\Delta,x}(\cdot))\big|^{2|\beta_l|}_{\frac{1-\varepsilon}{2},2,[0,\Delta]}\big] \le C(\Delta^{|\beta_l|(2+\varepsilon)} + \Delta^{|\beta_l|(1+\varepsilon)})$ and
\[
\big|D^{2\alpha_l}[u^2_{\Delta,n}(\Delta,x,\mathbf{y})]\big| \le C(\Delta^{3|\alpha_l|} + \Delta^{2|\alpha_l|})\prod_{k=2}^{l}(k-1)^{(\varepsilon-1)\alpha_k^l}\,E^Q[U^2_{\Delta,n}(\Delta,x,\mathbf{y})]. \qquad (8.3.36)
\]
Then by (7.2.20) and (8.3.36), we obtain
\[
\Big|S(L,l)\otimes_{k=l+1}^{n}I^{(k)}_1\,\phi\Big| \le C(\Delta^{3L} + \Delta^{2L})\big(1 + (3c/2)^{L\wedge l}\big)\,\beta^{-(L\wedge l)/2}\,E\big[E^Q[U^2_{\Delta,n}(\Delta,x,\mathbf{y})]\big]\sum_{i_1+\cdots+i_l=L+l-1}\ \prod_{k=2}^{l}(k-1)^{(\varepsilon-1)\alpha_k^l}
\]
\[
\le C(\Delta^{3L} + \Delta^{2L})\big(1 + (3c/2)^{L\wedge l}\big)\,\beta^{-(L\wedge l)/2}\,\varepsilon^{1-L}(l-1)^{L\varepsilon-1} \qquad (8.3.37)
\]


with a constant $C > 0$ that does not depend on $n$, $\varepsilon$, $L$, $c$, $\beta$, or $l$. In the last line we used the fact that $E[E^Q[U^2_{\Delta,n}(\Delta,x,\mathbf{y})]]$ is bounded and that
\[
\sum_{i_1+\cdots+i_l=L+l-1}\ \prod_{k=2}^{l}(k-1)^{(\varepsilon-1)\alpha_k^l} = (l-1)^{\varepsilon-1}\sum_{i_1+\cdots+i_l=L+l-1}\ \prod_{k=2}^{l}(k-1)^{(\varepsilon-1)(i_k-1)} \le (l-1)^{\varepsilon-1}\Big(\sum_{k=2}^{l}(k-1)^{\varepsilon-1}\Big)^{L-1} \le (l-1)^{\varepsilon-1}\big(\varepsilon^{-1}(l-1)^{\varepsilon}\big)^{L-1} = \varepsilon^{1-L}(l-1)^{L\varepsilon-1}.
\]

Then by (7.2.15) and (8.3.37), we have
\[
|I_n\phi - A(L,n)\phi| \le C(\Delta^{3L} + \Delta^{2L})\big(1 + (3c/2)^{L\wedge n}\big)\,\beta^{-(L\wedge n)/2}\,\varepsilon^{1-L}\sum_{l=2}^{n}(l-1)^{L\varepsilon-1} + \Big|\big(I^{(1)}_1 - Q^{(1)}_L\big)\otimes_{k=2}^{n}I^{(k)}_1\,\phi\Big|
\]
\[
\le C(\Delta^{3L} + \Delta^{2L})\big(1 + (3c/2)^{L\wedge n}\big)\,\beta^{-(L\wedge n)/2}\,\varepsilon^{-L}L^{-1}n^{L\varepsilon},
\]
where the term in the second line is estimated by the classical error estimate for the Gauss-Hermite quadrature $Q$ (see, e.g., [339]) and the estimate of the derivatives (8.3.36).

The global error is estimated using the recursive nature of Algorithm 7.3.1, as in the proof of [315, Theorem 2.4]. $\square$

8.4 Numerical results

In this section, we compare Algorithms 6.1.4 and 8.2.2 for linear stochastic advection-diffusion-reaction equations with commutative and noncommutative noises. We test the computational performance of these two methods in terms of accuracy and computational cost. All the tests were run using Matlab R2012b on a Macintosh desktop computer with an Intel Xeon CPU E5462 (quad-core, 2.80 GHz). Every effort was made to program and execute the different algorithms in as close to an identical way as possible.

We do not have exact solutions for all examples, and hence we evaluate the errors of the second-order moments using reference solutions, denoted by $E[u^2_{\mathrm{ref}}(T,x)]$, which are obtained by either Algorithm 6.1.4 or Algorithm 8.2.2 with fine resolution. We do not use solutions obtained from Monte Carlo methods as reference solutions, since Monte Carlo methods are of low accuracy and are less accurate than the recursive multistage WCE; see the comparison between WCE and Monte Carlo methods in Chapter 6 and also below.


The following error measures are used in the numerical examples below:
\[
\varrho^{2}_{2}(T) = \Big|\big\|E[u^2_{\mathrm{ref}}(T,\cdot)]\big\|_{l^2} - \big\|\mathcal{M}^M_\Delta(T,\cdot)\big\|_{l^2}\Big|, \qquad \varrho^{r,2}_{2}(T) = \frac{\varrho^{2}_{2}(T)}{\big\|E[u^2_{\mathrm{ref}}(T,\cdot)]\big\|_{l^2}}, \qquad (8.4.1)
\]
\[
\varrho^{\infty}_{2}(T) = \Big|\big\|E[u^2_{\mathrm{ref}}(T,\cdot)]\big\|_{l^\infty} - \big\|\mathcal{M}^M_\Delta(T,\cdot)\big\|_{l^\infty}\Big|, \qquad \varrho^{r,\infty}_{2}(T) = \frac{\varrho^{\infty}_{2}(T)}{\big\|E[u^2_{\mathrm{ref}}(T,\cdot)]\big\|_{l^\infty}}, \qquad (8.4.2)
\]
where $\mathcal{M}^M_\Delta(T,x)$ is either $E[(u^M_{\Delta,N,n}(T,x))^2]$ from Algorithm 6.1.4 or $\mathcal{M}^M_{\Delta,L,n}(T,x)$ from Algorithm 8.2.2,
\[
\|v\|_{l^2} = \Big(\frac{2\pi}{M}\sum_{m=1}^{M}v^2(x_m)\Big)^{1/2}, \qquad \|v\|_{l^\infty} = \max_{1\le m\le M}|v(x_m)|,
\]
and the $x_m$ are the Fourier collocation points.

The computational complexity of Algorithm 6.1.4 is $\binom{N+nq}{N}\frac{T}{\Delta}M^4$ and that of Algorithm 8.2.2 is $\eta(L,nq)\frac{T}{\Delta}M^4$. The ratio of the computational cost of the SCM over that of the WCE is therefore $\eta(L,nq)/\binom{N+nq}{N}$. For example, when $N = 1$ and $L = 2$, the ratio is $(1+2nq)/(1+nq)$; this setting will be used in the three numerical examples. The complexity increases exponentially with $nq$ and $L$ (see, e.g., [148]) or with $N$, but only linearly with $T/\Delta$. Hence, we only consider low values of $L$ and $N$.

Example 8.4.1 (Single noise) We consider the advection-diffusion equation (3.3.27) over the domain $(0,T]\times(0,2\pi)$,

$$du = \Big[\big(\epsilon + \tfrac{1}{2}\sigma^2\big)\partial_x^2 u + \beta\big(\sin(x)\partial_x u + u\big)\Big]\,dt + \sigma\,\partial_x u\,dW(t),$$

with periodic boundary conditions and nonrandom initial condition $u(0,x) = \cos(x)$.

In this example, we compare Algorithms 6.1.4 and 8.2.2 for (3.3.27) with the parameters β = 0.1, σ = 0.5, and ε = 0.02. We will show that the recursive multistage WCE is at most of order Δ² in the second-order moments and the recursive multistage SCM is of order Δ.

In Step 1 of Algorithm 6.1.4, we employ the Crank-Nicolson scheme in time and Fourier collocation in physical space. Since we have no exact solution to (3.3.27), we obtain the reference solution by Algorithm 6.1.4 with the same solver but finer resolution. The reason for this choice of reference solution is as follows. For single noise, it is proved in Theorem 8.3.1 that the recursive multistage WCE is of second-order convergence in second-order moments. The second-order convergence is numerically verified in Chapter 6. For this specific example, a Monte Carlo method with $10^6$ sampling paths (which costs 27.6 hours) gives $\|\mathbb{E}[u_{\mathrm{MC}}^2]\| = 1.06517 \pm 6.1\times 10^{-4}$ and $\|\mathbb{E}[u_{\mathrm{MC}}^2]\|_\infty = 0.51746 \pm 6.1\times 10^{-4}$, where the numbers after '±' are the statistical errors with the 95% confidence interval. We use Fourier collocation in space with M = 20 and Crank-Nicolson in time with $\delta t = 10^{-3}$ for the Monte Carlo method.
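The quoted statistical error is the usual 95% confidence half-width, $1.96\,\hat\sigma/\sqrt{K}$ for $K$ sampling paths. A quick sanity check (the sample standard deviation 0.31 below is a made-up value chosen to reproduce a half-width near $6.1\times 10^{-4}$, not a number from the book's run):

```python
import math

def ci_halfwidth(sample_std, n_paths, z=1.96):
    """95% confidence half-width of a Monte Carlo mean estimate."""
    return z * sample_std / math.sqrt(n_paths)

# a hypothetical sample std of 0.31 over 10^6 paths
hw = ci_halfwidth(0.31, 10**6)  # about 6.1e-4
```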

Page 246: Zhongqiang˜Zhang George˜Em˜Karniadakis Numerical Methods ... · Leslie Greengard, Courant Institute of Mathematical Sciences, New York University, New York, NY, USA Greengard@cims.nyu.edu

242 8 Comparison between Wiener chaos methods and stochastic...

The reference solution is obtained via Algorithm 6.1.4 with M = 30, $\Delta = 10^{-4}$, N = 4, n = 4, $\delta t = 10^{-5}$. It gives the second-order moments in the $l^2$-norm $\|\mathbb{E}[u_{\mathrm{ref}}^2]\|_{l^2} = 1.065194550063$ and in the $l^\infty$-norm $\|\mathbb{E}[u_{\mathrm{ref}}^2]\|_{l^\infty} = 0.5174746141105$.

From Table 8.1, we observe that the recursive WCE is $O(\Delta^N) + O(\Delta^2)$ for the second-order moments. When N = 2, the method is of second-order convergence in Δ, and of first-order convergence when N = 1. When N = 3, the method is still second-order in Δ (not presented here). This verifies the estimate in Corollary 8.3.2.

Table 8.1. Algorithm 6.1.4: recursive multistage Wiener chaos method for (3.3.27) at T = 5: σ = 0.5, β = 0.1, ε = 0.02, and M = 20, n = 1.

Δ       δt      N     ϱ_2^{r,2}(T)   Order   ϱ_2^{r,∞}(T)   Order   CPU time (sec.)

1.0e-1 1.0e-2 1 1.5249e-2 – 8.8177e-3 – 3.57

1.0e-2 1.0e-3 1 1.5865e-3 Δ0.98 8.9310e-4 Δ0.99 33.22

1.0e-3 1.0e-4 1 1.5934e-4 Δ1.00 8.9429e-5 Δ1.00 348.03

1.0e-1 1.0e-2 2 1.9070e-4 – 4.1855e-5 – 5.14

1.0e-2 1.0e-3 2 2.0088e-6 Δ1.98 4.2889e-7 Δ1.99 51.75

1.0e-3 1.0e-4 2 2.0386e-8 Δ1.99 4.8703e-9 Δ1.94 490.04
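The "Order" columns in the tables are the standard observed rates between successive rows, $\log(e_{\mathrm{coarse}}/e_{\mathrm{fine}})/\log(\Delta_{\mathrm{coarse}}/\Delta_{\mathrm{fine}})$. For instance, the first two N = 1 rows of Table 8.1 reproduce the reported rate 0.98:

```python
import math

def observed_order(e_coarse, e_fine, d_coarse, d_fine):
    """Observed convergence rate between two (step size, error) pairs."""
    return math.log(e_coarse / e_fine) / math.log(d_coarse / d_fine)

# errors 1.5249e-2 and 1.5865e-3 at time steps 1.0e-1 and 1.0e-2 (Table 8.1)
rate = observed_order(1.5249e-2, 1.5865e-3, 1.0e-1, 1.0e-2)  # about 0.98
```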

In Step 1 of Algorithm 8.2.2, we use the Crank-Nicolson scheme in time and the Fourier collocation method in physical space. The errors are again measured as in (8.4.1) and (8.4.2). The reference solution is obtained by Algorithm 6.1.4 as in the case of WCE. We observe in Table 8.2 that the convergence order for second-order moments is one in Δ even when the sparse grid level L is 2, 3, or 4 (the last is not presented here). The errors for L = 3 are smaller than those for L = 2 by more than a factor of two, while the time cost for L = 3 is about 1.5 times that for L = 2.

Table 8.2. Algorithm 8.2.2: recursive multistage stochastic collocation method for (3.3.27) at T = 5: σ = 0.5, β = 0.1, ε = 0.02, and M = 20, n = 1.

Δ       δt      L     ϱ_2^{r,2}(T)   Order   ϱ_2^{r,∞}(T)   Order   CPU time (sec.)

1e-01 1e-02 2 3.4808e-04 – 3.0383e-03 – 3.71

1e-02 1e-03 2 3.4839e-05 Δ1.00 3.0130e-04 Δ1.00 33.88

1e-03 1e-04 2 3.4844e-06 Δ1.00 3.0106e-05 Δ1.00 325.06

1e-01 1e-02 3 1.6815e-04 – 3.4829e-04 – 5.16

1e-02 1e-03 3 1.6230e-05 Δ1.02 3.2283e-05 Δ1.03 50.59

1e-03 1e-04 3 1.6170e-06 Δ1.00 3.2026e-06 Δ1.00 486.08

In summary, from Tables 8.1 and 8.2, we observe that the recursive multistage WCE is $O(\Delta^N) + O(\Delta^2)$ and the recursive multistage SCM is $O(\Delta)$, as predicted by the error estimates in Chapter 8.3. While the SCM and the WCE are of the same order when N = 1 and L ≥ 2, the former can be more accurate than the latter. In fact, when N = 1 and L = 2, the recursive multistage SCM error is almost two orders of magnitude smaller than that of the recursive multistage WCE, while the computational cost for both is almost the same, as predicted ($\binom{N+nq}{N} = \eta(L,nq) = 2$). The recursive multistage WCE with N = 2 is of order Δ² and its errors are almost two orders of magnitude smaller than those of the recursive multistage SCM (with level 2 or 3) for the second-order moments.

In this example, the recursive multistage SCM outperforms the recursive multistage WCE with N = 1. The reason can be as follows. In SCM, we solve an advection-dominated equation rather than the diffusion-dominated equation in WCE, since SCM is associated with the Stratonovich product, which leads to the removal of the term $\frac{1}{2}\sigma^2\partial_x^2 u$ in the resulting equation, see (3.3.28). The larger σ is, the more dominant the diffusion is. In fact, results for σ = 1 and σ = 0.1 (not presented here) show that when σ = 1, the relative error of SCM with L = 2 is almost three orders of magnitude smaller than that of WCE with N = 1; when σ = 0.1, the relative error of SCM with L = 2 is less than one order of magnitude smaller than that of WCE with N = 1. With the Crank-Nicolson scheme in time and Fourier collocation in physical space, we cannot achieve better accuracy for WCE with N = 1 and Δ no less than 0.0005 when M ≤ 40.

Example 8.4.2 (Commutative noises) We consider Equation (3.3.30)

$$du = \Big[\big(\epsilon + \tfrac{1}{2}\sigma_1^2\cos^2(x)\big)\partial_x^2 u + \big(\beta\sin(x) - \tfrac{1}{4}\sigma_1^2\sin(2x)\big)\partial_x u\Big]\,dt + \sigma_1\cos(x)\partial_x u\,dW_1(t) + \sigma_2 u\,dW_2(t),$$

with two commutative noises over the domain $(0,T]\times(0,2\pi)$, with periodic boundary conditions and nonrandom initial condition $u(0,x) = \cos(x)$. The problem has commutative noises in the sense of (3.3.29):

$$\sigma_1\cos(x)\partial_x\,\sigma_2\mathrm{Id} = \sigma_2\mathrm{Id}\,\sigma_1\cos(x)\partial_x = \sigma_1\sigma_2\cos(x)\partial_x.$$

Here Id is the identity operator.
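The commutativity condition can be checked by applying the two noise operators, $A_1 v = \sigma_1\cos(x)\partial_x v$ and $A_2 v = \sigma_2 v$, in both orders; a small SymPy sketch (illustrative, not the book's code), which also contrasts the pair $\sigma_1\partial_x$ and $\sigma_2\cos(x)\mathrm{Id}$ from the noncommutative example below:

```python
import sympy as sp

x, s1, s2 = sp.symbols('x sigma1 sigma2')
u = sp.Function('u')(x)

# noise operators of Example 8.4.2: A1 v = sigma1*cos(x)*v_x, A2 v = sigma2*v
A1 = lambda v: s1 * sp.cos(x) * sp.diff(v, x)
A2 = lambda v: s2 * v
commutator = sp.simplify(A1(A2(u)) - A2(A1(u)))    # 0: commutative

# noise operators of the noncommutative example: B1 v = sigma1*v_x,
# B2 v = sigma2*cos(x)*v
B1 = lambda v: s1 * sp.diff(v, x)
B2 = lambda v: s2 * sp.cos(x) * v
nc_commutator = sp.simplify(B1(B2(u)) - B2(B1(u)))  # -sigma1*sigma2*sin(x)*u
```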

In this example, we take σ1 = 0.5, σ2 = 0.2, β = 0.1, ε = 0.02. We again observe first-order convergence for SCM and WCE with N = 1, and second-order convergence for WCE with N = 2, as in the last example with single noise.

We choose the same space-time solver for the recursive multistage WCE and SCM as in the last example. We compute the errors as in (8.4.1) and (8.4.2). For WCE, the reference second moments are $\|M_{\Delta=10^{-4},N,n}^M(T,\cdot)\|_{l^2}$ and $\|M_{\Delta=10^{-4},N,n}^M(T,\cdot)\|_{l^\infty}$, obtained by Algorithm 6.1.4 with $\delta t = 10^{-5}$ and all the other truncation parameters the same as stated in the table. For SCM, the reference second moments are $\|M_{\Delta=10^{-4},L,n}^M(T)\|_{l^2}$ and $\|M_{\Delta=10^{-4},L,n}^M(T)\|_{l^\infty}$, obtained by Algorithm 8.2.2 with $\delta t = 10^{-5}$, while all the other truncation parameters are the same.



Here we do not compare the performance of Monte Carlo simulations with our algorithms, as the main cost of Monte Carlo methods lies in reducing the statistical errors. For the same parameters described above, when we used $10^6$ Monte Carlo sampling paths, we could only reach a statistical error of $8.3\times 10^{-4}$, in 3.9 hours. To obtain an error of $1\times 10^{-5}$, seven thousand times more Monte Carlo sampling paths would be needed, requiring about three years of computational time, and thus this is not considered here. We have similar situations in the next example and hence we will not consider Monte Carlo simulations there either. This also demonstrates the computational efficiency of Algorithms 6.1.4 and 8.2.2 in comparison with Monte Carlo methods.
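Since the Monte Carlo statistical error decays as $K^{-1/2}$ in the number of paths $K$, shrinking it from $8.3\times 10^{-4}$ to $10^{-5}$ multiplies the required number of paths by $(8.3\times 10^{-4}/10^{-5})^2 = 6889$, which is where the "seven thousand times" and "three years" estimates come from:

```python
def extra_paths_factor(err_now, err_target):
    """Factor by which the number of Monte Carlo paths must grow to reduce
    the statistical error from err_now to err_target (error ~ K**-0.5)."""
    return (err_now / err_target) ** 2

factor = extra_paths_factor(8.3e-4, 1.0e-5)  # 83**2 = 6889
years = 3.9 * factor / (24 * 365)            # projected wall time, about 3 years
```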

For WCE, we observe in Figure 8.1 convergence of order $\Delta^N$ ($N \le 2$) in the second-order moments: first-order convergence when N = 1, and second-order convergence when N = 2. Numerical results for N = 3 (not presented here) show that the convergence order is still two, even though the accuracy is further improved when N increases from 2 to 3. This is consistent with our estimate $O(\Delta^N) + O(\Delta^2)$ in Corollary 8.3.2.

We also tested the case n = 2, which gives similar results and the same convergence order.

Fig. 8.1. Relative errors in second-order moments of recursive WCE and SCM for Example 8.4.2 with commutative noises at T = 1. The parameters are δt = Δ/10, M = 30, n = 1, σ1 = 0.5, σ2 = 0.2, β = 0.1, ε = 0.02. Left: l² errors; right: l∞ errors. [Both panels plot relative errors in second-order moments (10⁻⁹ to 10⁻²) against Δ (10⁻³ to 10⁻¹) for WCE N = 1, WCE N = 2, SCM L = 2, and SCM L = 3, with reference slopes Δ and Δ².]

For SCM, we observe first-order convergence in Δ from Figure 8.1 when L = 2, 3. We note that further refinement of the truncation parameters in random space, i.e., increasing L and/or n, neither changes the convergence order nor improves the accuracy. The case L = 3 actually leads to slightly worse accuracy compared with the case L = 2. We tested the case L = 4, which leads to errors of the same magnitude as L = 3. We also tested n = 2 and observed no improved accuracy for L = 2, 3, 4. These numerical results are not presented here.



Table 8.3. Recursive multistage WCE (left) and SCM (right) for commutative noises (3.3.30) at T = 1: σ1 = 0.5, σ2 = 0.2, β = 0.1, ε = 0.02, and M = 30, n = 1.

Δ       N     ϱ_2^{r,2}(T)   Order   CPU time (sec.)     L     ϱ_2^{r,2}(T)   Order   CPU time (sec.)

1e-01 1 1.6994e-03 – 3.19 2 1.2453e-03 – 5.18

1e-02 1 1.7838e-04 Δ0.98 32.74 2 1.2009e-04 Δ1.02 54.70

1e-03 1 1.6323e-05 Δ1.04 329.15 2 1.0889e-05 Δ1.04 545.20

1e-01 2 4.0658e-05 – 6.53 3 2.0482e-04 – 13.26

1e-02 2 4.4805e-07 Δ1.96 65.89 3 1.7897e-05 Δ1.06 142.23

1e-03 2 4.4682e-09 Δ2.00 657.55 3 1.6062e-06 Δ1.05 1420.24

For the two commutative noises, we conclude from this example that the recursive multistage WCE is of order $\Delta^N + \Delta^2$ in the second-order moments and that the recursive multistage SCM is of order Δ in the second-order moments, no matter what sparse grid level is used. The errors of the recursive multistage SCM are one order of magnitude smaller than those of the recursive multistage WCE with N = 1, while the time cost of SCM is about 1.6 times that of WCE, see Table 8.3. For noises of large magnitude (σ1 = σ2 = 1, numerical results not presented), we observed that SCM with L = 2 and WCE with N = 1 have the same order-of-magnitude accuracy. In this example, the use of SCM with L = 2 for noises of small magnitude is competitive with the use of WCE with N = 1.

Example 8.4.3 (Noncommutative noises) We consider (3.3.32)

$$du = \Big[\big(\epsilon + \tfrac{1}{2}\sigma_1^2\big)\partial_x^2 u + \beta\sin(x)\partial_x u + \tfrac{1}{2}\sigma_2^2\cos^2(x)u\Big]\,dt + \sigma_1\partial_x u\,dW_1(t) + \sigma_2\cos(x)u\,dW_2(t),$$

with two noncommutative noises over the domain $(0,T]\times(0,2\pi)$, with periodic boundary conditions and nonrandom initial condition $u(0,x) = \cos(x)$. The problem has noncommutative noises, as the coefficients do not satisfy (3.3.29).

We take the same constants ε > 0, β, σ1, σ2 as in the last example, as well as the same space-time solver. In the current example, we observe only first-order convergence for SCM (level L = 2, 3, 4) and WCE (N = 1, 2, 3) when n = 1, 2; see Table 8.4 for part of the numerical results.

The errors are computed as in the last example. The reference solutions are obtained by Algorithm 6.1.4 for the recursive multistage WCE solutions and by Algorithm 8.2.2 for the recursive multistage SCM solutions, with $\Delta = 5\times 10^{-4}$ and $\delta t = 5\times 10^{-5}$ and all the other truncation parameters the same as stated in Tables 8.4 and 8.5.



Table 8.4. Algorithm 6.1.4 (recursive multistage Wiener chaos expansion, left) and Algorithm 8.2.2 (recursive multistage stochastic collocation method, right) for (3.3.32) at T = 1: σ1 = 0.5, σ2 = 0.2, β = 0.1, ε = 0.02, and M = 20, n = 1. The time step size δt is Δ/10. The reported CPU time is in seconds.

Δ       N     ϱ_2^{r,2}(T)   Order   time (sec.)     L     ϱ_2^{r,2}(T)   Order   time (sec.)

1.0e-1 1 3.7516e-03 – 1.04 2 6.4343e-04 – 1.65

5.0e-2 1 1.8938e-03 Δ0.99 2.11 2 3.1738e-04 Δ1.02 3.31

2.0e-2 1 7.5292e-04 Δ1.01 5.12 2 1.2440e-04 Δ1.02 8.64

1.0e-2 1 3.6796e-04 Δ1.03 10.19 2 6.0502e-05 Δ1.04 17.12

5.0e-3 1 1.7457e-04 Δ1.08 20.01 2 2.8635e-05 Δ1.08 33.82

2.0e-3 1 5.8246e-05 Δ1.20 50.39 2 9.5401e-06 Δ1.12 86.44

1.0e-1 2 9.4415e-05 – 2.16 3 1.5803e-04 – 4.03

5.0e-2 2 3.7303e-05 Δ1.81 4.11 3 7.6548e-05 Δ1.05 8.68

2.0e-2 2 1.2282e-05 Δ1.34 9.97 3 2.9673e-05 Δ1.03 22.08

1.0e-2 2 5.5807e-06 Δ1.21 20.03 3 1.4378e-05 Δ1.05 43.85

5.0e-3 2 2.5471e-06 Δ1.14 40.25 3 6.7925e-06 Δ1.08 88.35

2.0e-3 2 8.2965e-07 Δ1.22 101.34 3 2.2605e-06 Δ1.20 223.15

In this example, our error estimate for the recursive multistage WCE is no longer valid, and the numerical results suggest that the errors behave as $\Delta^N + C\Delta/n$. For N = 1 and n = 10 (not presented), the error is almost the same as for n = 1. When N = 2 and n = 10, the error first decreases as O(Δ²) for large time step sizes and then as O(Δ) for small time step sizes; see Table 8.5. When N = 2 and n = 10, the errors with Δ = 0.005, 0.002, 0.001 are ten percent (1/n) of those with the same parameters but n = 1 in Table 8.4. Here the constant in front of Δ, C/n, plays an important role: when Δ is large and this constant is small, the order of two can be observed; when Δ is small, CΔ/n is dominant so that only first-order convergence can be observed.

The recursive multistage SCM is of first-order convergence when L = 2, 3, 4 and n = 1, 2, 10 (only part of the results presented). In contrast to Example 8.4.2, the errors for L = 3 are one order of magnitude smaller than those for L = 2. Recalling that the number of sparse grid points is η(2, 2) = 5 and η(3, 2) = 13, the cost for L = 3 is about 2.6 times that for L = 2. However, it is expected that in practice a low-level sparse grid is more efficient than a high-level one when nq is large, as the number of sparse grid points η(L, nq) increases exponentially with nq and L. In other words, L = 2 is preferred when SPDEs with many noises (large q) are considered.
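A hedged sketch of the point counts quoted above, assuming η(L, d) counts the distinct nodes of a Smolyak grid built from one-dimensional Gauss-Hermite rules with i points at level i; this assumption reproduces η(2, 2) = 5 and η(3, 2) = 13 and the ratio (1+2nq)/(1+nq) used earlier, but the construction of [148] may differ in details:

```python
import itertools
import numpy as np

def eta(L, d):
    """Distinct-node count of a level-L Smolyak grid in d dimensions,
    assuming 1-D Gauss-Hermite rules with i nodes at level i."""
    nodes = set()
    for idx in itertools.product(range(1, L + 1), repeat=d):
        if sum(idx) > L + d - 1:    # Smolyak admissibility condition
            continue
        rules = [np.polynomial.hermite.hermgauss(i)[0] for i in idx]
        for pt in itertools.product(*rules):
            # round to merge nodes that coincide up to floating-point noise
            nodes.add(tuple(round(c, 12) for c in pt))
    return len(nodes)

# q = 2 noises, n = 1 mode each: eta(2, 2) = 5 and eta(3, 2) = 13,
# so the L = 3 grid costs 13/5 = 2.6 times the L = 2 grid
```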

As discussed at the beginning of this section, the ratio of the time cost of SCM to that of WCE is $\eta(L,nq)/\binom{N+nq}{N}$. The cost of the recursive multistage SCM with L = 2 is at most 1.8 times (1.6 predicted by the ratio above, with q = 2 and n = 1) that of the recursive multistage WCE with N = 1. However, in this example, the error of the recursive multistage SCM with L = 2 is one order of magnitude smaller than that of the recursive multistage WCE with N = 1. In Table 8.4, we present in bold the errors between $3.5\times 10^{-5}$ and $8.0\times 10^{-5}$. Among the four cases listed in the table, the most efficient, for the given accuracy above, is WCE with N = 2, which outperforms SCM with L = 3 and L = 2. WCE with N = 1 is less efficient than the other three cases. We also observed that when σ1 = σ2 = 1, the error of SCM with L = 2 is one order of magnitude smaller than that of WCE with N = 1 (results not presented here).

For the noncommutative noises in this example, we showed that the error for WCE behaves as $\Delta^2 + C\Delta/n$ and the error for SCM as Δ. The numerical results suggest that SCM with L = 2 is competitive with WCE with N = 1 for noises of both small and large magnitude if n = 1.

Table 8.5. Algorithm 6.1.4: recursive multistage Wiener chaos expansion for (3.3.32) at T = 1: σ1 = 0.5, σ2 = 0.2, β = 0.1, ε = 0.02. The parameters are M = 20, N = 2, and n = 10. The time step size δt is Δ/10.

Δ       ϱ_2^{r,2}(T)   Order   ϱ_2^{r,∞}(T)   Order   CPU time (sec.)

1.0e-1 4.9310e-05 – 2.6723e-05 – 84.00

5.0e-2 1.4031e-05 Δ1.81 7.3571e-06 Δ1.86 160.50

2.0e-2 2.9085e-06 Δ1.71 1.4171e-06 Δ1.80 391.40

1.0e-2 9.8015e-07 Δ1.57 4.4324e-07 Δ1.68 749.40

5.0e-3 3.5978e-07 Δ1.45 1.5082e-07 Δ1.56 1557.60

2.0e-3 9.8910e-08 Δ1.41 3.8369e-08 Δ1.49 3887.50

With these three examples, we observe that the convergence order of the recursive multistage SCM in the second-order moments is one for both commutative and noncommutative noises. We verified that our error estimate for WCE, $\Delta^N + \Delta^2$, is valid for commutative noises, see Examples 8.4.1 and 8.4.2; the numerical results for noncommutative noises, see Example 8.4.3, suggest that the errors are of order $\Delta^N + C\Delta/n$, where C is a constant depending on the coefficients of the noises.

For stochastic advection-diffusion-reaction equations, different formulations of stochastic products (Ito-Wick product for WCE, Stratonovich product for SCM) lead to different numerical performance. When the white noise is in the velocity, the Ito formulation has stronger diffusion in the resulting PDE than the Stratonovich formulation. As stronger diffusion requires more resolution, the recursive multistage WCE with N = 1 may produce less accurate results than the recursive multistage SCM with L = 2 with the same PDE solver under the same resolution, as shown in the first and third examples.

To achieve convergence of approximations of second moments at first order in the time step Δ, we can use the recursive multistage SCM Algorithm 8.2.2 with L = 2, n = 1 and also the recursive multistage WCE Algorithm 6.1.4 with N = 1, n = 1, as each can outperform the other in certain cases. For commutative noises, Algorithm 6.1.4 with N = 2 is preferable when the number of noises, q, is small, and hence the number of WCE modes is small, so that the computational cost grows slowly.

We also note that the errors of Algorithms 6.1.4 and 8.2.2 depend on the SPDE coefficients and the integration time (cf. the theoretical results of Chapter 8.3). For some SPDEs, the constants multiplying the powers of Δ in the errors can be very large and, to reach desired levels of accuracy, we need to use very small step sizes Δ or develop numerical algorithms further (e.g., higher-order or structure-preserving approximations; see such ideas for SODEs, e.g., in [358]). Further, in practice, we need to balance the three parts of the errors of Algorithms 6.1.4 and 8.2.2 (truncation of the Wiener processes, functional truncation of WCE/SCM, and space-time discretizations of the deterministic PDEs appearing in the algorithms) for higher computational efficiency.

8.5 Summary and bibliographic notes

We have shown that both methods, WCE and SCM in conjunction with the recursive strategy, are comparable in computational performance for linear advection-diffusion-reaction equations, even though WCE can be formally of higher order:

• For commutative noises, the accuracy of the recursive WCE is $O(\Delta^N) + O(\Delta/n)$ (see Theorem 8.3.1 and Corollary 8.3.2), while the accuracy of the recursive SCM is O(Δ) (see Theorem 8.3.8). Here N is the order of Wiener chaos and n is the truncation parameter of the spectral expansion of Brownian motion.

• For noncommutative noises, the accuracy of both the recursive WCE and the recursive SCM is O(Δ); see the numerical results in Example 8.4.3.

• With the truncation of the spectral expansion of Brownian motions in use, WCE and SCM are associated with different stochastic products: WCE with the Ito-Wick product and SCM with the Stratonovich product. Hence, WCE usually has more diffusion than SCM does, see (8.2.2) and (3.3.26). This effect requires special attention when employing efficient numerical solvers in time and physical space.

• The numerical performance of these two methods depends on the intensity of the noises and the dynamics of the underlying equations; see the numerical examples in Chapter 8.4.

However, when the underlying equations are nonlinear rather than linear, the recursive strategy is no longer applicable. Then SCM is preferable because it results in fully decoupled systems of equations, while WCE results in fully coupled systems of equations. In the next chapter, we show how to apply SCM to a classic practical nonlinear problem and evaluate the performance of SCM in stochastic flow simulations.



Bibliographic notes. A comparison between WCE and SCM is demonstrated in [13, 123] for elliptic equations with color noise. One big difference between color noise and white noise is that there is no issue in defining stochastic products for color noise (more precisely, for noise smoother than Brownian motion). Also, for color noise, it is shown in [13, 123] that there are only small differences in the numerical performance of generalized polynomial chaos expansion and SCM.

The convergence of WCE has been discussed in [314–316, 318] for white noise in the reaction rate and white noise in the advection velocity. However, the convergence rate of WCE is only known in the case of white noise in the reaction rate, see, e.g., [315]; in this case, the problem has commutative noises, (3.3.29). In this chapter, we discussed the convergence rate in the case of a single white noise in the advection velocity, which is a special case of commutative noises (3.3.29). The case of noncommutative noises has not been discussed in the literature; only numerical results are presented in this chapter.

The convergence rate of SCM has been discussed for color noise, see, e.g., [11, 379, 380, 500] for Smolyak's sparse grid collocation. For white noise, the convergence rate has been discussed in [506, 507]. Due to the low regularity in random space of solutions to PDEs with white noise, the convergence cannot be as high as in the case of color noise.

The number of operations for the recursive multistage WCE is of order M⁴, where M is the number of nodes employed in the discretization of physical space. As shown in Chapter 6.4, this computational complexity can be reduced to order M² using sparse representations (see, e.g., [418]), since many coefficients of WCE modes are negligible in computation. The complexity of the recursive multistage SCM is also of order M⁴, but it is not clear whether SCM admits sparse representations as WCE does.

8.6 Suggested practice

Exercise 8.6.1 Consider the following stochastic advection-diffusion-reaction equation over the domain $(0,T]\times(0,2\pi)$,

$$du = \Big[\big(\epsilon + \tfrac{1}{2}\sigma_1^2\cos^2(x)\big)\partial_x^2 u + \big(\beta\sin(x) - \tfrac{1}{4}\sigma_1^2\sin(2x)\big)\partial_x u + \sin(3x)\Big]\,dt + \big[\sigma_1\sin(x)\partial_x u + \cos(2x)\big]\,dW_1(t) + \sigma_2 u\,dW_2(t),$$

where the initial condition is $u(0,x) = \cos(x)$ and periodic boundary conditions are imposed. The equation is similar to that in Example 8.4.2, with commutative noises, but there are also forcing terms. Derive the recursive WCE and SCM algorithms to compute second moments of solutions as suggested in Remark 8.2.1.



Exercise 8.6.2 Consider the following stochastic advection-diffusion-reaction equation over the domain $(0,T]\times(0,2\pi)$:

$$du = \Big[\big(\epsilon + \tfrac{1}{2}\sigma_1^2\big)\partial_x^2 u + \beta\sin(x)\partial_x u + \tfrac{1}{2}\sigma_2^2\cos^2(x)u + \sin(3x)\Big]\,dt + \big[\sigma_1\partial_x u + \cos(x)\big]\,dW_1(t) + \sigma_2\cos(x)u\,dW_2(t).$$

The equation has two noncommutative noises; periodic boundary conditions are imposed and the initial condition $u(0,x) = \cos(x)$ is nonrandom. Derive the recursive WCE and SCM algorithms to compute second moments of solutions as suggested in Remark 8.2.1.

Exercise 8.6.3 Write Matlab code for Exercises 8.6.1 and 8.6.2. Compare the performance of WCE and SCM with respect to the following aspects:

• convergence order in time;
• accuracy when the order of WCE and the level of the sparse grid change.

Hint. Use the multistage WCE with a fine resolution in physical space, time, and random space to produce a reference solution.

Exercise 8.6.4 Apply the stochastic collocation method (Hermite collocationmethod in random space) instead of WCE in Exercise 6.6.2.


9

Application of collocation method to stochastic conservation laws

In this chapter we demonstrate how to apply the methods presented in the previous chapters to stochastic nonlinear conservation laws. Specifically, since the problem is nonlinear, we apply the stochastic collocation method (SCM) using Smolyak's sparse grid to a one-dimensional piston problem and test its computational performance. This is a classical problem in every aerodynamics textbook, with an analytical solution if the piston velocity is fixed. However, here we consider a piston with a velocity perturbed by Brownian motion moving into a straight tube filled with a perfect gas at rest. The shock generated ahead of the piston can be located by solving the one-dimensional Euler equations driven by white noise using the Stratonovich or Ito formulations. We apply the Lie-Trotter splitting method before approximating the Brownian motion with its spectral truncation, and subsequently apply stochastic collocation using either the sparse grid or the quasi-Monte Carlo (QMC) method. Numerical results verify the Stratonovich-Euler and Ito-Euler models against stochastic perturbation results, and demonstrate the efficiency of sparse grid collocation and QMC for small and large random piston motions, respectively.

9.1 Introduction

We consider the stochastic piston problem in [298], which defines a testbed for numerical solvers in both random and physical space. The piston, driven by time-varying random motions, moves into a straight tube filled with a perfect gas at rest. Of interest is to quantify the perturbation of the shock position ahead of the piston corresponding to the random motion. For the perturbed shock position, Lin et al. [298] obtained analytical solutions for small amplitudes of noises and numerical solutions for large amplitudes of noises, with the methods of stochastic perturbation analysis and polynomial chaos, respectively. A specific random motion of the piston was studied, where the piston velocity was perturbed by a correlated stochastic process with zero mean and exponential covariance kernel. It was concluded, by both the perturbation analysis and numerical simulations of the corresponding Euler equations, that the variance of the shock location grows quadratically with time for small time and linearly for large time. Numerical results from the Monte Carlo method and the polynomial chaos method (e.g., [488]) for the stochastic Euler equations showed good agreement with the results from the perturbation analysis.

© Springer International Publishing AG 2017
Z. Zhang, G.E. Karniadakis, Numerical Methods for Stochastic Partial Differential Equations with White Noise, Applied Mathematical Sciences 196, DOI 10.1007/978-3-319-57511-7_9

Here we consider the case of a piston velocity perturbed by Brownian motion, which leads to Euler equations subject to white noise rather than the Euler equations subject to color noise in [298]. We use the Monte Carlo method and the stochastic collocation method presented in the previous two chapters for equations driven by white noise. Note that the method of perturbation analysis in [298] is independent of the type of noise as long as the noise has continuous paths in random space, so that the results of the perturbation analysis can be understood in a path-wise sense. Therefore, the stochastic piston problem defined in [298] can serve as a rigorous testbed for evaluating numerical stochastic solvers, and we use the variances from perturbation analysis as reference solutions.

We use the stochastic collocation method (SCM) for time-dependent equations driven by white noise in time, presented in the previous chapters. Here we also adopt the quasi-Monte Carlo (QMC) method to compute up to larger times and/or for large amplitudes of noises. The QMC method is efficient and converges faster than the Monte Carlo method when relatively low-dimensional integration is considered, see, e.g., [376, 424]; see also [165] for the application of the QMC method to elliptic equations in random porous media.

This chapter is organized as follows. In Chapter 9.2, we describe the piston problem driven by stochastic processes and review two different approaches to obtain the shock location: the perturbation analysis and the one-dimensional Euler equations. When the piston is driven by Brownian motion, we introduce two types of Euler equations according to different interpretations of stochastic products for white noise, i.e., the Stratonovich-Euler equations and the Ito-Euler equations. In Chapter 9.3, we describe a splitting method for the Euler equations before comparing the variances from the two stochastic Euler equations with those from first-order perturbation analysis. We demonstrate that the Stratonovich-Euler equations are suitable for obtaining the variances of the perturbed shock location. We apply the stochastic collocation method in Chapter 9.4 to solve the Stratonovich-Euler equations in the splitting-method setting. We conclude in Chapter 9.5 with a summary and comments on computational efficiency, where we also compare the shock locations when the piston is driven by three different stochastic processes. Two exercises are presented for readers to practice the splitting method for stochastic parabolic equations.



9.2 Theoretical background

Suppose that the piston velocity is perturbed by a time-dependent stochastic process, so that the piston velocity is $u_p = U_p + v_p(t,\omega)$, where ω is a point in random space; see Figure 9.1 for a sketch of a shock tube driven by a piston with perturbed random motion. Here we write $v_p(t,\omega) = \epsilon U_p V(t,\omega)$ and denote the stochastic process $V(t,\omega)$ by $V(t)$ for brevity.

When ε = 0, i.e., no perturbation is imposed on the piston, the pistonmoves into the tube with a constant velocity Up, the shock speed S (and thusthe shock location) can be determined analytically, see [297, 298]. When ε isvery small, one can determine the perturbation process of the shock locationusing the first-order perturbation analysis [298], that is:

Fig. 9.1. A sketch of piston-driven shock tube with random piston motion.

Up + v

p ( t )

S + vs ( t )

U = 0 P = P

+ρ = ρ

+C = C

+

U = Up + v

p ( t )

P = P−

ρ = ρ−

C = C−

\[
z(t) = \varepsilon U_p q S' \sum_{n=0}^{\infty} (-r)^n \int_0^t V(\alpha\beta^n t_1)\, dt_1, \qquad (9.2.1)
\]
where z(t) + tS is the shock location induced by the random motion of the piston,
\[
S' = \frac{\gamma+1}{4}\,\frac{S}{S - \frac{\gamma+1}{4}U_p}, \qquad
q = \frac{2}{1+k}, \qquad r = \frac{1-k}{1+k}, \qquad
k = C_-\,\frac{S + S'U_p}{1 + \gamma S U_p},
\]
\[
\alpha = \frac{C_- + U_p - S}{C_-}, \qquad
\beta = \frac{C_- + U_p - S}{C_- + S - U_p}.
\]

Here γ is the ratio of the specific heats and C− is the sound speed behind the shock when the piston is unperturbed. The first two moments of the perturbation process z(t) are


254 9 Application of collocation method to stochastic conservation laws

\[
\mathbb{E}[z(t)] = 0, \qquad
\mathbb{E}[z^2(t)] = (\varepsilon U_p q S')^2\,
\mathbb{E}\bigg[\bigg(\sum_{n=0}^{\infty} (-r)^n \int_0^t V(\alpha\beta^n t_1)\, dt_1\bigg)^{\!2}\bigg].
\]

We note that the perturbation analysis in [298] is independent of the perturbation process whenever the process is continuous, so that the analysis can be understood in a path-wise way. Taking V(t, ω) to be the Brownian motion W(t) (omitting ω), we then have

\[
\begin{aligned}
\mathbb{E}[z^2(t)] &= (\varepsilon U_p q S')^2\,
\mathbb{E}\bigg[\bigg(\sum_{n=0}^{\infty} (-r)^n \int_0^t W(\alpha\beta^n t_1)\, dt_1\bigg)^{\!2}\bigg] \\
&= (\varepsilon U_p q S')^2\,
\mathbb{E}\bigg[\bigg(\sum_{n=0}^{\infty} (-r)^n \int_0^t \sqrt{\alpha\beta^n}\, W(t_1)\, dt_1\bigg)^{\!2}\bigg] \\
&= (\varepsilon U_p q S')^2 \bigg(\sum_{n=0}^{\infty} (-r)^n \sqrt{\alpha\beta^n}\bigg)^{\!2}\,
\mathbb{E}\bigg[\bigg(\int_0^t W(t_1)\, dt_1\bigg)^{\!2}\bigg] \\
&= \frac{\alpha t^3}{3}\,(\varepsilon U_p q S')^2\, \frac{1}{(1 + r\beta^{1/2})^2}, \qquad (9.2.2)
\end{aligned}
\]
where we use the scaling property of Brownian motion, W(αβ^n t1) = √(αβ^n) W(t1) in distribution, and the fact that ∫₀^t W(t1) dt1 is a Gaussian random variable with zero mean and variance t³/3.
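The key fact used in the last step, namely that the time integral of Brownian motion over [0, t] has variance t³/3, can be checked numerically from the covariance E[W(s)W(u)] = min(s, u). The following sketch (illustrative code, not from the book) evaluates the double integral of this covariance by the midpoint rule:

```python
import numpy as np

# Var(int_0^t W(s) ds) = int_0^t int_0^t E[W(s)W(u)] ds du, where
# E[W(s)W(u)] = min(s, u); the result should be t**3 / 3.
def integrated_bm_variance(t, m=400):
    s = (np.arange(m) + 0.5) * t / m          # midpoint quadrature nodes
    S, U = np.meshgrid(s, s, indexing="ij")
    return np.minimum(S, U).sum() * (t / m) ** 2

t = 2.0
print(integrated_bm_variance(t), t**3 / 3)    # both close to 8/3
```

The midpoint rule is exact for the kernel away from the diagonal (where min(s, u) is affine in each cell), so the quadrature error comes only from the diagonal cells and is very small.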

9.2.1 Stochastic Euler equations

The stochastic piston problem can be modeled by the following Euler equations with an unsteady stochastic boundary:

\[
\partial_t \mathbf{U} + \partial_x \big( \mathbf{f}(\mathbf{U}) \big) = 0, \qquad (9.2.3)
\]
where
\[
\mathbf{U} = \begin{pmatrix} \rho \\ \rho u \\ E \end{pmatrix}, \qquad
\mathbf{f}(\mathbf{U}) = \begin{pmatrix} \rho u \\ \rho u^2 + P \\ u(P + E) \end{pmatrix},
\]
ρ is the density, u is the velocity, E is the total energy, and P is the pressure, given by P = (γ − 1)(E − ½ρu²) with γ = 1.4. The initial and boundary conditions are given by

\[
u(x, 0) = 0, \quad P(x, 0) = P_+, \quad \rho(x, 0) = \rho_+, \qquad x > X_p(t),
\]
\[
P(X_p(t), 0) = P_-, \quad \rho(X_p(t), 0) = \rho_-,
\]
and
\[
u(X_p(t), t) = \frac{\partial}{\partial t} X_p(t) = u_p(t), \qquad t > 0,
\]



where Xp(t) is the position of the piston and up(t) is the velocity of the piston.

This problem is a moving boundary problem and can be transformed into a fixed boundary problem by defining a new coordinate (y, τ) from (x, t) via the following transform:

\[
y = x - \int_0^\tau u_p(\tau_1, \omega)\, d\tau_1, \qquad \tau = t. \qquad (9.2.4)
\]

Defining v = u − up, we then have the following Euler equations with a source term [298]:

\[
\partial_\tau \mathbf{V} + \partial_y \big( \mathbf{f}(\mathbf{V}) \big) = \mathbf{g}(\mathbf{V})\, \frac{\partial u_p}{\partial \tau}, \qquad (9.2.5)
\]
where
\[
\mathbf{V} = \begin{pmatrix} \rho \\ \rho v \\ E \end{pmatrix}, \qquad
E = \frac{P}{\gamma - 1} + \frac{1}{2}\rho v^2, \qquad
\mathbf{g}(\mathbf{V}) = \begin{pmatrix} 0 \\ -\rho \\ -\rho v \end{pmatrix}.
\]
The initial and boundary conditions are given by

\[
v(y, 0) = -U_p, \quad P(y, 0) = P_+, \quad \rho(y, 0) = \rho_+, \qquad y > 0,
\]
\[
P(0, 0) = P_-, \quad \rho(0, 0) = \rho_-, \qquad (9.2.6)
\]
and
\[
v(0, \tau) = 0, \qquad \tau \ge 0.
\]

Our goal here is to compute the variance of the shock location perturbation z(τ). The perturbation of the shock location is z(τ) = Xs(τ) − τS = Xs(t) − tS, where Xs(τ) = Ys(τ) + ∫₀^τ up(t1) dt1 is the shock location and Ys(τ) is the shock location in the new coordinate (y, τ).

If we take up(t) = Up(1 + εW(t)), where W(t) is a scalar Brownian motion, we are led to the following Euler equations,

we are led to the following Euler equations

∂τV +

∂y

(f(V)

)= εUpg(V) ∗ W , (9.2.7)

where “∗” denotes two different products as follows:(1) Stratonovich-Euler equations

∂τV +

∂y

(f(V)

)= εUpg(V) ◦ W , (9.2.8)

(2) Ito-Euler equations

∂τV +

∂y

(f(V)

)= εUpg(V)W . (9.2.9)

The initial and boundary conditions are imposed as above. We will verify the two models (9.2.8) and (9.2.9) by solving them numerically with a splitting method in the next section.



9.3 Verification of the Stratonovich- and Ito-Euler equations

In the previous section, we introduced two approaches (perturbation analysis and stochastic Euler equations) to obtain the variances of the shock location. Here, we verify the correctness of the stochastic Euler equations by comparing the variances of the shock location obtained by the two approaches, i.e., the first-order perturbation analysis and the numerical solution of the stochastic Euler equations, up to time T = 5.

For the numerical simulations, we consider the piston velocity Up = 1.25, for which the Mach number of the shock is M = 2 and γ = 1.4. We normalize all velocities with C+, the sound speed ahead of the shock, i.e., C+ = 1. Then the initial conditions are given through the unperturbed relations of the state variables [298] as follows (with “+” denoting the state ahead of the shock and “−” the state behind it):

P+ = 1.0, P− = 4.5, ρ+ = 1.4, ρ− = 3.73.
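These numbers follow from the classical unperturbed piston-shock relations. As a quick numeric check (an illustrative sketch, not the book's code, assuming the normalization C+ = 1, P+ = 1 and the standard normal-shock relations):

```python
import math

gamma, Up, Cp, Pp = 1.4, 1.25, 1.0, 1.0    # normalized: C+ = 1, P+ = 1

# Shock speed for a piston moving at constant speed Up (see [297, 298])
a = (gamma + 1) / 4 * Up
S = a + math.sqrt(a**2 + Cp**2)
M = S / Cp                                  # shock Mach number

# Rankine-Hugoniot jumps across a normal shock of Mach number M
P_minus = Pp * (2 * gamma * M**2 - (gamma - 1)) / (gamma + 1)
rho_plus = gamma * Pp / Cp**2               # ideal gas: rho = gamma * P / C^2
rho_minus = rho_plus * (gamma + 1) * M**2 / ((gamma - 1) * M**2 + 2)

print(S, M, P_minus, rho_minus)             # 2.0, 2.0, 4.5, 3.73...
```

So Up = 1.25 indeed produces a Mach-2 shock, with post-shock pressure 4.5 and post-shock density 3.73 in these units.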

9.3.1 A splitting method for stochastic Euler equations

We use the source-term (noise-term) splitting method proposed in [224] for a scalar conservation law with a time-dependent white noise source term. Holden and Risebro [224] considered a Cauchy problem on the whole line with multiplicative white noise in Ito's sense, ∂_t u + ∂_x f(u) = g(u)Ẇ(t), with a deterministic, essentially bounded initial condition, where f and g are both Lipschitz and g has bounded support. They proved almost sure convergence of this splitting method to a weak solution of the Cauchy problem, assuming an initial condition with bounded support and finitely many extrema; no convergence rate was provided. According to our discussion in Chapter 3.4.3 for parabolic equations with multiplicative white noise, the convergence order of splitting methods is usually one half (√Δτ, where Δτ is the time step size of the splitting method) and at most one (Δτ).

Here we extend this splitting method to the system (9.2.7). Specifically, given the solution Vn at τn, to obtain the solution at τn+1 we first solve, on the small time interval [τn, τn+1),

\[
\partial_\tau \mathbf{V}^{(1)} + \partial_y \big( \mathbf{f}(\mathbf{V}^{(1)}) \big) = 0, \qquad (9.3.1)
\]

with the boundary conditions (9.2.6) and initial condition V(1)(τn) = Vn; then we solve the following Cauchy problem, again on [τn, τn+1),

\[
\partial_\tau \mathbf{V}^{(2)} = \varepsilon U_p\, \mathbf{g}(\mathbf{V}^{(2)}) * \dot{W}, \qquad (9.3.2)
\]

with the initial condition V(2)(τn) = V(1)(τn+1). The solution at time τn+1, Vn+1, is then set to V(2)(τn+1) (subject to the error from the splitting).



Let us denote by S(τ, τn) the operator that takes V(τn) as the initial condition at τn to the weak solution of (9.3.1), and by R(τ, τn) the operator that takes the initial condition at time τn to the solution of the stochastic differential equation (9.3.2). Then the approximate solution at τn+1 is defined by Vn+1 = R(τn+1, τn)S(τn+1, τn)Vn. Thus, we define a sequence of approximate solutions {Vn} to (9.2.7) at the times {τn}.

The application of the splitting technique requires numerical methods for (9.3.1) and (9.3.2). The splitting scheme allows us to employ efficient existing methods to solve them separately. To solve (9.3.1), we use a fifth-order WENO scheme in physical space and a second-order strong-stability-preserving (SSP) Runge-Kutta scheme in time [253]. To solve (9.3.2), we employ two different methods: the Monte Carlo method and the stochastic collocation method. We employ 1000 points for the fifth-order WENO scheme over the interval [0, 5] and the time step size dτ = 0.0005, so that the error from the time discretization is negligible. As mentioned before, our goal is to compute the variance of the perturbed shock location. Since there is always only one shock, we obtain Ys(τ) by finding the biggest jump of the pressure, where the error is of order O(dx) (dx is the mesh size in physical space).
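The shock-detection step described above can be sketched as follows (illustrative code, not the book's implementation): the shock cell is taken where the discrete pressure jump is largest, which locates the shock to within O(dx).

```python
import numpy as np

def shock_location(y, P):
    """Locate the shock as the cell with the largest pressure jump."""
    j = np.argmax(np.abs(np.diff(P)))
    return 0.5 * (y[j] + y[j + 1])     # midpoint of the jump cell

# Idealized pressure profile: post-shock value 4.5 behind, 1.0 ahead
y = np.linspace(0.0, 5.0, 1001)        # 1000 cells on [0, 5], dx = 0.005
P = np.where(y < 3.2, 4.5, 1.0)
print(shock_location(y, P))            # within dx of 3.2
```

In the actual simulation the WENO scheme smears the shock over a few cells, but the largest jump still identifies the shock cell since there is only one shock.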

9.3.2 Stratonovich-Euler equations versus first-order perturbation analysis

We first compare the results obtained by solving the Stratonovich-Euler equations with the Monte Carlo method and those obtained from the first-order perturbation analysis.

To solve the Stratonovich-Euler equations (9.2.8) with the splitting method, we solve Equation (9.3.2) by the following Crank-Nicolson scheme,
\[
\mathbf{V}^{(2)}(\tau_{n+1}) = \mathbf{V}^{(2)}(\tau_n) + \varepsilon U_p\, \mathbf{g}\big(\mathbf{V}^{(2)}(\tau_{n+1/2})\big)\,\Delta W_n, \qquad (9.3.3)
\]
to accommodate the definition of the Stratonovich integral. In our simulation, the value of g(V(2)(τ)) at τn+1/2 is approximated by the average (g(V(2)(τn)) + g(V(2)(τn+1)))/2. Note that for the specific form of g, we do not have to invert the resulting matrix in (9.3.3).

Figure 9.2 verifies that the Stratonovich-Euler equations (9.2.8) can capture the variances of the shock location for the stochastic piston problem driven by Brownian motion. Here we employ 10,000 realizations, so that the statistical error can be neglected for noises with amplitude larger than 0.05 but smaller than 0.5. For noises with amplitude less than 0.05, the error of the adopted methods is dominated by the statistical error from the Monte Carlo method and also by the space discretization error from WENO. Figure 9.2 presents the variances obtained by the Monte Carlo method (9.3.1)–(9.3.3) and the variance estimates from the first-order perturbation analysis (9.2.2). We observe agreement between the results from the Monte Carlo method and the perturbation analysis for short time and for small



Fig. 9.2. Comparison between the results from the first-order perturbation analysis (9.2.2) and from solving the Stratonovich-Euler equations (9.2.8) by the splitting method (9.3.1)–(9.3.3). Each panel plots the variance of the shock location versus t for the Monte Carlo method and the perturbation analysis: (a) small noises, ε = 0.01, 0.02, 0.05; (b) large noises, ε = 0.1, 0.2, 0.5.

noises. Figure 9.2(a) shows the results for small noises, i.e., ε ∼ O(10⁻²), while Figure 9.2(b) shows those for large noises, i.e., ε ∼ O(10⁻¹). The difference between the variances from the Monte Carlo method and from the first-order perturbation analysis (9.2.2) is at most 12%–13% of the variances (9.2.2), up to time T = 5, for all cases except ε = 0.5; for the latter, the difference between the variances is at most 19.3% of the variance (9.2.2). However, for small time (t < 1) the variances by Monte Carlo and by perturbation analysis agree well, while they deviate substantially after t = 2. This effect can be explained as follows. For t < 1, the variance of the driving process (Brownian motion) has a small value (standard deviation √t), corresponding to a weak perturbation, while at later times it has a larger value, increasing the perturbation substantially. (We remind the reader that the perturbation process in [298] has unit variance.)

9.3.3 Stratonovich-Euler equations versus Ito-Euler equations

For the Ito-Euler equations (9.2.9), we solve (9.3.2) by the forward Euler scheme
\[
\mathbf{V}^{(2)}(\tau_{n+1}) = \mathbf{V}^{(2)}(\tau_n) + \varepsilon U_p\, \mathbf{g}\big(\mathbf{V}^{(2)}(\tau_n)\big)\,\Delta W_n. \qquad (9.3.4)
\]
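The qualitative difference between the two products can already be seen for the scalar test equation dX = X ∗ dW (a hypothetical toy model, not the book's solver): the Ito solution has constant mean, while the Stratonovich mean grows like exp(t/2). The sketch below contrasts the forward Euler scheme, cf. (9.3.4), with the midpoint scheme, cf. (9.3.3):

```python
import numpy as np

rng = np.random.default_rng(0)
n_paths, n_steps, T = 20000, 200, 1.0
dt = T / n_steps
dW = rng.normal(0.0, np.sqrt(dt), (n_paths, n_steps))

x_ito = np.ones(n_paths)   # forward Euler: Ito interpretation
x_str = np.ones(n_paths)   # midpoint rule: Stratonovich interpretation
for k in range(n_steps):
    x_ito = x_ito + x_ito * dW[:, k]
    # midpoint step x_new = x + 0.5*(x + x_new)*dW, solved here in closed form
    x_str = x_str * (1 + 0.5 * dW[:, k]) / (1 - 0.5 * dW[:, k])

print(x_ito.mean(), x_str.mean())   # near 1.0 and near exp(0.5) = 1.65
```

For the piston problem the two interpretations nevertheless give nearly identical variances of the shock location, as shown next.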

Next we compare the numerical results for the Stratonovich-Euler equations and the Ito-Euler equations using the above time discretizations. We observe from Figure 9.3 that for both small and large noises, these two types of equations give almost the same variances of the perturbed shock location E[z²(t)] up to time T = 5. In fact, the difference between the variances from the Stratonovich-Euler and Ito-Euler equations for ε ≤ 0.2 is less than 10⁻³ up to time t = 5, which lies within the discretization errors. For ε = 0.5, we present in Table 9.1 the difference of the variances for the two approaches using the same sequence of Monte Carlo points. The Stratonovich-Euler equations exhibit larger variances at large times, but the difference from those by the



Fig. 9.3. Comparison between solving the Stratonovich-Euler equations (9.2.8) and the Ito-Euler equations (9.2.9) by the splitting method (9.3.1)–(9.3.2). Each panel plots the variance of the shock location versus t, together with the perturbation analysis: (a) ε = 0.05, 0.1; (b) ε = 0.2, 0.5.

Ito-Euler equations is less than 10% of the variances by the Ito-Euler equations. We therefore conclude that the Stratonovich-Euler equations are a suitable model for the piston problem driven by Brownian motion, and we will consider only this approach hereafter.

Table 9.1. The difference of the variances of the shock location by the Stratonovich-Euler and Ito-Euler equations for ε = 0.5.

t                    1.0      2.0      3.0      4.0      5.0
Error in variance    0.0007   0.0129   0.0742   0.2353   0.2421

9.4 Applying the stochastic collocation method

Next we test the stochastic collocation method against the Monte Carlo method for the Stratonovich-Euler equations (9.2.8). To solve the Stratonovich-Euler equations (9.2.8), we again use the splitting method (9.3.1)–(9.3.2). In (9.3.2), we adopt the stochastic collocation method, where we first introduce a spectral approximation of the Brownian motion and subsequently apply the sparse grid method. Specifically, we first approximate the Brownian motion by its spectral approximation with K multi-elements:

\[
W^{(n,K)}(\tau) = \sum_{k=0}^{K-1} \sum_{i=1}^{n} \int_0^\tau \chi_{[t_k, t_{k+1})}(s)\, m_{k,i}(s)\, ds\; \xi_{k,i}, \qquad \tau \in [0, T],
\]

where 0 = t0 < t1 < · · · < tK = T, χ[tk,tk+1)(τ) is the indicator function of the interval [tk, tk+1), {mk,i}, i = 1, 2, . . ., is a complete orthonormal basis in L²([tk, tk+1]),



and ξk,i are mutually independent standard Gaussian random variables (with zero mean and variance one). Hence, we obtain the following partial differential equation with smooth inputs:

\[
\partial_\tau \mathbf{V}^{(2)} = \varepsilon U_p\, \mathbf{g}(\mathbf{V}^{(2)}) \sum_{k=0}^{K-1} \sum_{i=1}^{n} \chi_{[t_k, t_{k+1})}(\tau)\, m_{k,i}(\tau)\, \xi_{k,i}. \qquad (9.4.1)
\]

In (9.4.1) we apply the stochastic collocation method [11, 439, 486] for smooth noises. The stochastic collocation method adopted here is the sparse grid of Smolyak type based on one-dimensional Gaussian-Hermite quadrature; see, e.g., [148] and also Chapter 2.5.4. In this chapter we use the Matlab code (‘nwspgr.m’) available at http://www.sparse-grids.de/.

The first issue for the piston problem here is the discontinuity of the solution to (9.2.8), for which the conditions for the spectral approximation to work may be invalid [434]. In practice, we solve the problem with the WENO scheme, which smears the shock somewhat, and thus we have higher regularity than that of the original problem. A second issue is that the stochastic collocation method (Smolyak sparse grid) with Gaussian quadrature may not exhibit fast convergence because of the low regularity. Thus, we use n = 1 or 2 with large K (small time step in W(n,K)) instead of large n with small K. This choice of n is verified by control tests with n = 3, 4 for different K, where the numerical results show large deviations from those of the Monte Carlo method, with high oscillations. We choose a low sparse grid level (i.e., L = 2) to be consistent with the “available regularity” (numerical tests with a high sparse grid level show an instability). The third issue is the so-called “curse of dimensionality”: in practice, when the number of random variables, Kn, increases, the Smolyak sparse grid method no longer works well and is replaced by the QMC method.
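The one-dimensional building block of the Smolyak construction is Gauss-Hermite quadrature, which, with q nodes, integrates polynomials of degree up to 2q − 1 exactly against the Gaussian density. A minimal sketch (illustrative code; the chapter itself relies on the Matlab routine ‘nwspgr.m’):

```python
import numpy as np

# Probabilists' Gauss-Hermite rule with q = 5 nodes; after normalizing the
# weights it computes E[f(xi)] for xi ~ N(0, 1) exactly for deg(f) <= 9.
nodes, weights = np.polynomial.hermite_e.hermegauss(5)
weights = weights / weights.sum()

print(weights @ nodes**2, weights @ nodes**4)   # E[xi^2] = 1, E[xi^4] = 3
```

The sparse grid combines such one-dimensional rules across the Kn Gaussian variables ξ_{k,i} with far fewer points than the full tensor product.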

Here we adopt a uniform partition of the time interval [0, T], that is, tk = kΔ, k = 0, 1, · · · , K, with Δ = T/K. The complete orthonormal basis we employ in L²([tk, tk+1]) is the cosine basis

\[
m_{k,1}(t) = \frac{1}{\sqrt{\Delta}}, \qquad
m_{k,i}(t) = \sqrt{\frac{2}{\Delta}}\, \cos\Big( \frac{(i-1)\pi}{\Delta}\,(t - t_k) \Big), \quad i \ge 2.
\]
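With this basis, only the constant mode contributes to W^(n,K)(T): over a full element the cosine modes integrate to zero, so the variance of the approximation at τ = T is K · (√Δ)² = T, matching Var(W(T)). A quick numerical check (illustrative sketch, not the book's code):

```python
import numpy as np

# Variance of W^(n,K)(T): the coefficient of xi_{k,i} is the integral of
# m_{k,i} over its element; since the xi's are i.i.d. N(0, 1), the variance
# is the sum of the squared coefficients.
def variance_at_T(n, K, T, m=1000):
    Delta = T / K
    u = (np.arange(m) + 0.5) * Delta / m          # midpoints within one element
    var = 0.0
    for k in range(K):
        for i in range(1, n + 1):
            if i == 1:
                mki = np.full(m, 1.0 / np.sqrt(Delta))
            else:
                mki = np.sqrt(2.0 / Delta) * np.cos((i - 1) * np.pi * u / Delta)
            var += (mki.sum() * Delta / m) ** 2   # (integral of m_{k,i})^2
    return var

print(variance_at_T(n=2, K=5, T=5.0))             # equals T = 5, i.e., Var(W(T))
```

The higher modes i ≥ 2 therefore refine the approximation inside each element without changing the endpoint statistics.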

Figure 9.4 compares the numerical results from the Monte Carlo method (9.3.1)–(9.3.3) and the stochastic collocation method for (9.3.1) and (9.4.1), with both small and large noises. For each ε, we use different Δ (the length of the elements of the uniform partition of the time interval [0, T]), i.e., different numbers of elements K. We note that all the numerical solutions obtained by the stochastic collocation method agree with those from the Monte Carlo method (9.3.1)–(9.3.3) for short time. Here we do not observe convergence in n, recalling that such convergence requires smoothness in random space.

We note that smaller Δ and larger n may lead to a larger number of random variables and thus to the breakdown of the sparse grid method [486].



So we first test the cases of small Δ for which we can still apply the sparse grid method. Figure 9.4 shows that a low-level sparse grid method works well for the piston problem with small perturbations. We note that our sparse grid level is two, and thus the number of collocation points is 2nT/Δ + 1.

When n = 1, we observe in Figure 9.4 good agreement between the results from the stochastic collocation method and the Monte Carlo method for small time (t ≤ 2). Notice that when n = 1, (9.4.1) is the classical Wong-Zakai approximation [481]

\[
\partial_\tau \mathbf{V}^{(2)} = \varepsilon U_p\, \mathbf{g}(\mathbf{V}^{(2)})\, \frac{1}{\sqrt{\Delta}} \sum_{k=0}^{K-1} \chi_{[t_k, t_{k+1})}(\tau)\, \xi_{k,1}. \qquad (9.4.2)
\]
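In other words, for n = 1 the white noise is replaced by a rate that is constant on each element, ξ_{k,1}/√Δ, and integrating this rate over [0, T] recovers the Brownian increments exactly. A small sketch (illustrative code, not from the book):

```python
import numpy as np

rng = np.random.default_rng(7)
T, K = 5.0, 50
Delta = T / K
xi = rng.standard_normal(K)              # one N(0, 1) variable per element

rate = xi / np.sqrt(Delta)               # piecewise-constant rate, cf. (9.4.2)
W_T_from_rate = np.sum(rate * Delta)     # integral of the rate over [0, T]
W_T = np.sum(np.sqrt(Delta) * xi)        # sum of the Brownian increments

print(abs(W_T_from_rate - W_T))          # 0 up to round-off
```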

Fig. 9.4. Comparison between numerical results from the Stratonovich-Euler equations (9.2.8) using the Monte Carlo method (9.3.1)–(9.3.3) and the stochastic collocation method (9.4.1). The sparse grid level is 2, and Δ is the size of the element in time in the stochastic collocation method. Each panel plots the variance of the shock location versus t for Δ = 1, 0.5, 0.2 with n = 1, 2, against the Monte Carlo (MC-Stratonovich) reference: (a) ε = 0.05; (b) ε = 0.1; (c) ε = 0.2; (d) ε = 0.5.



However, for n = 2, there are some disagreements between the results. In Figures 9.4(a) and 9.4(c), the results for the case n = 2 and Δ = 0.2 (note that we then have nK = 50 random variables) underestimate the results from the Monte Carlo method and from the stochastic collocation method with a smaller number of random variables (n = 1). The larger number of random variables (n = 2 here) does not result in convergence, since we do not have a smooth solution, as mentioned above.

For the case with large perturbation, ε = 0.5, we require smaller Δ and thus more random variables. This is why we observe the disagreement in Figure 9.4(d). For all cases in Figure 9.4, we observe a deviation of the numerical results by stochastic collocation methods from those of the Monte Carlo method over large times. Similar effects arise in the application of spectral methods in random space, e.g., in Wiener chaos methods. The interested reader may refer to [505] for a discussion of this effect.

To cope with the high dimensionality (large number of random variables), we employ the QMC method instead of sparse grid methods. We consider two popular QMC sequences: one is a scrambled Halton sequence with the method RR2 proposed in [262], and the other is a scrambled Sobol sequence suggested in [340]. Both sequences lie in the unit hypercube, and thus an inverse transformation is adopted to generate sequences in the entire space based on these two sequences. The Matlab code for generating a scrambled Halton sequence is listed in Code 2.4, and the code for a scrambled Sobol sequence is listed in Code 2.5. In Figure 9.5, we test the large noise case, i.e., ε = 0.5. Both Halton and Sobol sequences work if a moderately large sample of the sequences is adopted. For 1000 sample points, the variances from both sequences are closer to those from the Monte Carlo method (9.3.1)–(9.3.3) than those from 500 sample points.
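The idea behind these codes can be sketched as follows (an illustrative, unscrambled Halton construction standing in for the book's scrambled Matlab Codes 2.4/2.5): low-discrepancy points in the unit hypercube are mapped to Gaussian coordinates through the inverse normal CDF.

```python
import numpy as np
from statistics import NormalDist

def van_der_corput(n, base):
    """Radical-inverse (van der Corput) sequence in the given base."""
    seq = np.zeros(n)
    for idx in range(1, n + 1):          # start at 1 so no point is exactly 0
        f, k, x = 1.0, idx, 0.0
        while k > 0:
            f /= base
            x += f * (k % base)
            k //= base
        seq[idx - 1] = x
    return seq

primes = [2, 3, 5, 7]                    # one base per random dimension
halton = np.column_stack([van_der_corput(256, p) for p in primes])
gauss = np.vectorize(NormalDist().inv_cdf)(halton)   # inverse transform to N(0, 1)

print(halton.shape, bool(halton.min() > 0), bool(np.isfinite(gauss).all()))
```

Scrambling (as in the RR2 Halton and scrambled Sobol variants the chapter actually uses) randomizes the digits to break correlations in high dimensions while preserving the low-discrepancy structure.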

9.5 Summary and bibliographic notes

In this chapter we demonstrated how to apply the stochastic collocation method to nonlinear conservation laws driven by white noise. Specifically, we simulated a stochastic piston problem, where a piston is pushed with a velocity perturbed by a time-varying Brownian motion, and ahead of the piston there is an adiabatic tube of constant area.

• This one-dimensional problem is modeled with the Euler equations driven by white noise, and we verified the simulation results against the first-order stochastic perturbation analysis presented in [298]. The stochastic products can be either Ito products or Stratonovich products. We showed numerically that the Stratonovich-Euler equations are a suitable model for the piston problem driven by Brownian motion.

• By splitting the Euler equations into two parts – a “deterministic part” and a “stochastic part” – we solved the “stochastic part” by the Monte Carlo method and the stochastic collocation method.



Fig. 9.5. Comparison between numerical results from the Stratonovich-Euler equations (9.2.8) using the direct Monte Carlo method (9.3.1)–(9.3.3) and the QMC method for (9.4.1) with a large noise, ε = 0.5. Each panel plots the variance of the shock location versus t for Δ = 0.1, 0.05, 0.02 with n = 1, 2, against the Monte Carlo (MC-Stratonovich) reference: (a) Sobol sequence, 500 sample points; (b) Sobol sequence, 1000 sample points; (c) Halton sequence, 500 sample points; (d) Halton sequence, 1000 sample points.

• QMC is employed for more accurate longer-time integration. The sparse grid collocation method is efficient only for short-time integration when the noise magnitude is large, and for relatively long-time integration when the noise magnitude is small. For large noises, we need a small time interval Δ for the stochastic collocation method to converge; for smaller Δ, which leads to a larger number of random variables, the QMC method still gives accurate solutions.

• We tested two types of QMC sequences for the “stochastic part,” using a multi-element spectral approximation of the Brownian motion, when the noise is large. The stochastic collocation and QMC methods are superior to the Monte Carlo method in the sense that they can achieve faster convergence than the classic Monte Carlo method.



The low accuracy of the stochastic collocation method, especially for long times, is caused by the discontinuity of the solution. Due to the deterministic solver, the accuracy of the numerical shock location is only first-order in the spatial step size, i.e., O(dx), where dx is the mesh size in physical space.

With regard to computational efficiency, the stochastic collocation method is more efficient than Monte Carlo simulation when a small number of random variables is involved, since the number of collocation points is then far less than the number of Monte Carlo sampling points. As time becomes larger, we introduce more random variables and thus need to employ the more efficient QMC method. In other applications involving long-time integration, it may be possible to use all three ways of sampling: starting with the sparse grid for early times, continuing with QMC for moderate times, and even switching to the Monte Carlo method for long times.

Bibliographic notes. The concept of a moving piston was used by the English physicist James Joule to demonstrate the mechanical equivalent of heat in his pioneering studies, almost two centuries ago. In the last century, the moving piston has also been used extensively in fundamental studies of fluid mechanics and shock discontinuities. This now classical problem has been solved analytically in one dimension and also in higher space dimensions; see, e.g., [84, 478].

However, many variants of the moving piston problem cannot be solved analytically. In gas dynamics, in particular in the context of normal shock waves, the one-dimensional classical problem describes a piston moving at constant speed in a tube of constant area and adiabatic walls; the shock wave is created ahead of the piston. Closed-form analytical solutions of this flow problem with general time-dependent piston speeds are difficult to obtain; see the semi-analytical solutions in [292] for accelerating and decelerating pistons, which are valid only for short times.

In [298], a moving piston with random velocity was investigated numerically, where the effect of the random velocity was measured by the variances of the shock location. A stochastic perturbation analysis was also performed to provide reference solutions. When the random velocity is a Gaussian random field with zero mean and exponential covariance kernel exp(− |t1 − t2|), it was shown that the mean of the shock location is the same as the shock location under a constant velocity, while the variance of the shock location grows quadratically with the time t.

In this chapter, we showed that the variances of the shock location grow cubically with time when the random velocity is a standard Brownian motion. These variances are significantly different from those for a piston driven by colored noise. In Figure 9.6 we compare the variances of the shock positions induced by three different Gaussian noises: Brownian motion, a stochastic process with zero mean and exponential covariance kernel exp(− |t1 − t2|) (see [298]), and a standard Gaussian random variable, where the noise amplitude is ε = 0.1. The results are obtained via the stochastic perturbation analysis in [298].



Fig. 9.6. Comparison among the variances of the shock positions induced by three different Gaussian noises: Brownian motion, a stochastic process with zero mean and exponential covariance kernel exp(− |t1 − t2|), and a standard Gaussian random variable. The noise amplitude is ε = 0.1. The variance of the shock location is plotted versus t up to t = 10, with the reference growth rates t³, t², and t indicated.

The case of Brownian motion induces smaller variances than the other two cases for short times and greater variances for longer times.

Polynomial chaos methods have been applied to solve stochastic hyperbolic problems with shocks; see, e.g., [343, 395]. Chaos methods with non-polynomial bases, e.g., the wavelet-based chaos methods proposed for smooth solutions in [294, Chapter 8], are used in [394] for solutions with strong discontinuities.

Some other statistical-error-free methods besides functional expansion methods have also been proposed, e.g., methods based on equations for the probability density function (PDF) [77, 461] and for the cumulative distribution function (CDF) [471]. In these methods, integro-differential equations for the PDF or CDF are derived, and numerical methods for deterministic equations are then applied to obtain the PDF/CDF numerically.

The stochastic shock problem is listed as an open computational problem in [294], and it remains open today. An analytical methodology for wave propagation in random media has been presented in [134].

9.6 Suggested practice

Exercise 9.6.1 Consider the following stochastic Burgers equation on (0, T ] × (0, 1):



\[
\partial_t u + \big( u + \sigma \dot{W}(t) \big) \circ \partial_x u = \mu\, \partial_x^2 u, \qquad u(0, x) = u_0(x), \qquad u(t, 0) = u(t, 1). \qquad (9.6.1)
\]

Here u0(x) = sin(2πx) is a deterministic function.

• Write down the equation in Ito form.
• Solve the problem using the following splitting method:
\[
\partial_t u^{(1)} + u^{(1)} \partial_x u^{(1)} = \mu\, \partial_x^2 u^{(1)}, \qquad u^{(1)}(t_n) = u(t_n), \qquad t \in (t_n, t_{n+1}], \qquad (9.6.2)
\]
\[
\partial_t u^{(2)} = \sigma \dot{W}(t) \circ \partial_x u^{(2)}, \qquad u^{(2)}(t_n) = u^{(1)}(t_{n+1}), \qquad t \in (t_n, t_{n+1}], \qquad (9.6.3)
\]
and set u(t_{n+1}) = u^{(2)}(t_{n+1}).

Apply the explicit fourth-order Runge-Kutta scheme to the first equation and the midpoint scheme to the second equation. In random space, use the Monte Carlo method. The total error of this splitting scheme is dominated by the splitting error, which is expected to be of half-order in the mean-square sense. Check numerically that the convergence order is indeed half. Make sure the statistical error is much smaller than the integration error.

Hint. A solution to this problem can be obtained semi-analytically, seeAppendix B.
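The convergence check suggested above is typically done by halving the step size and estimating the observed order from the ratio of successive errors: with e(Δτ) ≈ C·Δτ^{1/2}, each ratio gives an order estimate close to 0.5. A sketch with synthetic errors (illustrative only; in the exercise, the errors must be measured from your splitting solver against the semi-analytical reference):

```python
import math

# Synthetic mean-square errors obeying e = C * dtau**0.5 (placeholder values;
# replace with measured errors from the splitting scheme).
dtaus = [0.02, 0.01, 0.005]
errors = [0.1 * math.sqrt(dt) for dt in dtaus]

# Observed order between consecutive step-size halvings: log2(e1 / e2)
orders = [math.log(errors[i] / errors[i + 1], 2) for i in range(len(errors) - 1)]
print(orders)   # both close to 0.5
```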

Exercise 9.6.2 Apply the Ito product in Equation (9.6.1) instead of the Stratonovich product and redo the previous exercise.


Part III

Spatial White Noise



In this part, we discuss numerical methods for spatial white noise. Specifically, we consider a semilinear elliptic equation with additive noise in Chapter 10 and an elliptic equation with multiplicative noise in Chapter 11.

After the truncation of the Brownian motion (or white noise), the resulting equations are usually solved with standard numerical methods such as finite element methods, finite difference methods, and spectral methods. We present some error estimates of the finite element approximation of the resulting equations, as well as the error of truncating the Brownian motion. It is shown that the convergence rate of the finite element methods is the same as that for deterministic elliptic equations when one- or two-dimensional problems are considered. For additive noise, the convergence rate in three-dimensional space is lower than in the deterministic case because of the low regularity of the spatial white noise.

For elliptic equations whose coefficients are spatial noise, the solutions may not be square-integrable in random space; instead, they may lie in some weighted stochastic Sobolev space. However, when the solutions are square-integrable, they can be obtained numerically with high accuracy using Wiener chaos expansion (WCE) methods. Moreover, a perturbation technique called the Wick-Malliavin approximation can be applied to reduce the computational cost of WCE.


10

Semilinear elliptic equations with additive noise

With temporal white noise, the solutions of stochastic parabolic equations have low regularity in time (Hölder continuous with exponent $1/2-\varepsilon$, where $\varepsilon>0$ is arbitrarily small), and thus the spectral approximation of Brownian motion leads to only half-order convergence in its truncation mode $n$. With spatial white noise, however, the solution can be smoother, and we can expect higher-order convergence with a spectral approximation of Brownian motion.

We investigate in this chapter the strong and weak convergence orders of piecewise linear finite element methods for a class of semilinear elliptic equations with additive spatial white noise, using a spectral truncation of the white noise. We show that the strong convergence order of the finite element approximation is $h^{2-d/2}$, where $h$ is the element size and $d\le 3$ is the dimension. We also show that the weak convergence order is $h^{\min(4-d,2)}$. Moreover, we consider a fourth-order equation and show that a spectral approximation of the white noise can lead to a higher convergence order in both the strong and the weak sense when the solutions are smooth. Numerical results confirm our prediction for one- and two-dimensional elliptic problems.

10.1 Introduction

Let us first consider strong and weak convergence orders for the spectral truncation of white noise in a linear elliptic equation with additive noise.

© Springer International Publishing AG 2017
Z. Zhang, G.E. Karniadakis, Numerical Methods for Stochastic Partial Differential Equations with White Noise, Applied Mathematical Sciences 196, DOI 10.1007/978-3-319-57511-7_10




Example 10.1.1 (Linear elliptic equation, see, e.g., [6, 118]) Consider the following linear problem:
$$-\partial_x^2 u + bu = g(x) + \partial_x W(x), \quad x\in(0,1), \qquad u(0)=u(1)=0,$$
where $b > -\pi^2$ and $g\in L^2([0,1])$. The solution can be represented by
$$u = \sum_{k=1}^{\infty} \frac{\xi_k + g_k}{b + k^2\pi^2}\, e_k, \quad \text{where } g_k = \int_0^1 g\, e_k\, dx, \quad \partial_x W(x) = \sum_{k=1}^{\infty} e_k \xi_k,$$
and the $\xi_k$'s are i.i.d. standard Gaussian random variables. The basis $\{e_k\}_{k=1}^{\infty}$ can be any orthonormal basis in $L^2([0,1])$, e.g., $e_k = \sqrt{2}\sin(k\pi x)$, which are the corresponding eigenfunctions of $-\Delta u = \lambda u$ over $[0,1]$ with $u(0)=u(1)=0$. The first two moments of $u$ are
$$\mathbb{E}[u] = \sum_{k=1}^{\infty} \frac{g_k}{b + k^2\pi^2}\, e_k, \qquad \mathbb{E}[\|u\|^2] = \sum_{k=1}^{\infty} \frac{1 + g_k^2}{(b + k^2\pi^2)^2} = \sum_{k=1}^{\infty} \frac{1}{(b + k^2\pi^2)^2} + \|\mathbb{E}[u]\|^2.$$
It can be readily checked that there exists $C>0$ independent of $n$ such that
$$\mathbb{E}[\|u - u_n\|^2] = \mathbb{E}[\|u\|^2 - \|u_n\|^2] \le C\frac{1}{n^3}, \quad \text{where } u_n = \sum_{k=1}^{n} \frac{\xi_k + g_k}{b + k^2\pi^2}\, e_k.$$
In this example, the weak convergence order is twice the mean-square convergence order. This conclusion holds for nonlinear equations as well when some mild assumptions on the nonlinear terms are made.
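The $n^{-3}$ truncation rate in Example 10.1.1 can be checked numerically. The sketch below sums the exact series for $\mathbb{E}[\|u-u_n\|^2]$ under the illustrative assumptions $g=0$ (so $g_k=0$) and $b=1$ (both choices are ours, not from the text) and compares successive tails:

```python
import math

# Tail of the series E||u - u_n||^2 = sum_{k>n} (1 + g_k^2)/(b + k^2 pi^2)^2
# from Example 10.1.1, with the illustrative choices g = 0 and b = 1.
def tail(n, K=100000):
    return sum(1.0 / (1.0 + (k * math.pi) ** 2) ** 2 for k in range(n + 1, K))

for n in (8, 16, 32):
    print(n, tail(n) / tail(2 * n))  # halving-ratio approaches 2^3 = 8 for O(n^-3)
```

The printed ratios drift toward $8$, consistent with third-order decay of the truncation error.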

In this chapter, we study the numerical approximation of the following semilinear elliptic equation with additive white noise, using a spectral approximation of the spatial white noise:
$$-\Delta u(x) + f(u(x)) = g(x) + \frac{\partial^d}{\partial x_1\partial x_2\cdots\partial x_d} W(x), \quad x=(x_1,\cdots,x_d)\in D, \qquad (10.1.1)$$
with the Dirichlet boundary condition
$$u(x) = 0, \quad x\in\partial D, \qquad (10.1.2)$$
where $D=(0,1)^d$, $W(x)$ is a Brownian sheet on $\bar D=[0,1]^d$, and $f, g\in L^2(D)$ are such that (10.1.1) is well posed; see Chapter 10.2 for details.

The piecewise linear approximation for Brownian motion leads to a piece-wise constant approximation of white noise, which has been widely used inapproximating temporal noise for solving SDEs (see, e.g., [259] and [358])as well as in approximating spatial noises (see, e.g., [6, 64, 118, 194]). Us-ing piecewise constant approximation, Gyongy and Martinez [194] consid-ered a finite difference scheme in physical space for the problem (10.1.1) and

Page 275: Zhongqiang˜Zhang George˜Em˜Karniadakis Numerical Methods ... · Leslie Greengard, Courant Institute of Mathematical Sciences, New York University, New York, NY, USA Greengard@cims.nyu.edu

10.2 Assumptions and schemes 273

obtained dimension-dependent convergence order in the mean-square sense:h if d = 1; and h2− d

2−ε if d = 2, 3, where h is the finite difference stepsize. Here and throughout this chapter, ε > 0 is an arbitrary small constant.For finite element methods for (10.1.1) in physical space, [6] obtained themean-square convergence order h for a one-dimensional linear problem; also[64] considered a two-dimensional problem (10.1.1) over a general boundedconvex domain and established the mean-square convergence order h1−ε. Inother words, the finite element methods basically yield the same mean-squareconvergence order as the finite difference methods do for d = 1, 2, when thepiecewise constant approximation of white noise is used.

With a spectral approximation of the spatial additive noise, we will show that the mean-square convergence order is $h^{2-d/2}$, where we use a piecewise linear finite element approximation in physical space; see Theorem 10.4.1. Specifically, in the one-dimensional case, we obtain the mean-square convergence order $h^{3/2}$ instead of the order $h$ from the piecewise constant approximation of white noise [6]. We note that for $d=1$ the solution is actually in $H^{3/2-\varepsilon}(D)$, and the spectral approximation benefits from the smoothness of the solution, as will be shown in Chapter 10.2, where we also show similar effects for fourth-order equations.

We show that the weak convergence order of the spectral approximation of white noise is twice its mean-square convergence order when only the white noise is discretized in (10.1.1); see Theorem 10.3.2. When further discretizing (10.1.1) with a piecewise linear finite element method, we show that the weak error is $h^{\min(4-d,2)}$ for $d\le 3$; see Theorem 10.4.3. We present some numerical results for one- and two-dimensional semilinear elliptic equations in Chapter 10.5. At the end of this chapter, we summarize the main points of the chapter, review the piecewise constant approximation of white noise, and discuss some disadvantages of the spectral approximation of white noise. Some exercises are provided.

10.2 Assumptions and schemes

For (10.1.1) to be well posed, we require the following assumption, as in [48, 64, 194].

Assumption 10.2.1 The function f satisfies the following conditions:

• There exists a constant $L < C_p$ such that
$$[f(s) - f(t)](s-t) \ge -L|s-t|^2, \quad \forall s,t\in\mathbb{R}. \qquad (10.2.1)$$
• There exist constants $M \ge 0$ and $R \ge 0$ such that
$$|f(s) - f(t)| \le M + R|s-t|, \quad \forall s,t\in\mathbb{R}. \qquad (10.2.2)$$

Here $C_p$ is the constant in the Poincaré inequality:
$$\|\nabla v\|^2 \ge C_p\|v\|^2, \quad v\in H_0^1(D).$$

Under Assumption 10.2.1, the solution to (10.1.1) is proved to exist and to be unique in $L^p(\Omega, L^2(D))$ when $d\le 3$ [48]. For $d\ge 4$, for Equation (10.1.1) to be well posed in $L^p(\Omega, L^2(D))$, [337] considered additive color noise instead of white noise. The solution to (10.1.1) is understood in the sense of a mild solution:
$$u(x) + \int_D K(x,y) f(u(y))\,dy = \int_D K(x,y) g(y)\,dy + \int_D K(x,y)\,dW(y), \qquad (10.2.3)$$
where $K(x,y)$ is the Green's function of the Poisson equation.

Remark 10.2.2 The assumption (10.2.1) allows the nonlinear function $f$ to be of the form $f = f_1 + f_2$, where $f_1$ is Lipschitz continuous with a small Lipschitz constant and $f_2$ can be a sum of nondecreasing bounded functions and a Lipschitz continuous function. For example, $f_2$ can be the sign function $\operatorname{sgn}(x)$, the Heaviside function, or a summation of both.

Here we represent the spatial white noise with an orthogonal series expansion,
$$\frac{\partial^d}{\partial x_1\partial x_2\cdots\partial x_d} W(x) = \sum_{\alpha\in\mathcal{J}} e_\alpha(x)\xi_\alpha, \qquad (10.2.4)$$
or, for the spatial Brownian motion (Brownian sheet),
$$W(x) = \sum_{\alpha\in\mathbb{N}^d}\int_0^{x_d}\int_0^{x_{d-1}}\cdots\int_0^{x_1} e_\alpha(y)\,dy_1\cdots dy_d\,\xi_\alpha, \qquad (10.2.5)$$
where $\{e_\alpha(x)\}_{|\alpha|=1}^{\infty}$ is a complete orthonormal basis in $L^2(D)$ and the $\xi_\alpha$, $\alpha=(\alpha_1,\cdots,\alpha_d)$, are independent standard Gaussian random variables. In practice, we can take any orthonormal basis in $L^2(D)$. Here we take the eigenfunctions of the elliptic eigenvalue problem
$$-\Delta\psi = \lambda\psi, \quad x\in D, \qquad \psi = 0, \quad x\in\partial D, \qquad (10.2.6)$$
which form an orthonormal basis in $L^2(D)$. We denote the truncation of $W(x)$ in (10.2.5) by $W_n$:
$$W_n(x) = \sum_{|\alpha|\le n,\ \alpha\in\mathbb{N}^d}\int_0^{x_d}\int_0^{x_{d-1}}\cdots\int_0^{x_1} e_\alpha(y)\,dy_1\cdots dy_d\,\xi_\alpha. \qquad (10.2.7)$$
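For $d=1$ with the sine basis $e_k(x)=\sqrt{2}\sin(k\pi x)$ of (10.2.6), the antiderivatives are $M_k(x)=\sqrt{2}(1-\cos(k\pi x))/(k\pi)$, so a path of $W_n$ can be sampled directly. A minimal sketch (the truncation level and grid are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)

# Sample one path of the truncated Brownian motion W_n in (10.2.7) for d = 1,
# using e_k(x) = sqrt(2) sin(k pi x) and M_k(x) = sqrt(2)(1 - cos(k pi x))/(k pi).
def sample_W_n(x, n):
    xi = rng.standard_normal(n)                 # i.i.d. N(0,1) coefficients
    k = np.arange(1, n + 1)
    M = np.sqrt(2.0) * (1.0 - np.cos(np.outer(x, k) * np.pi)) / (k * np.pi)
    return M @ xi

x = np.linspace(0.0, 1.0, 101)
path = sample_W_n(x, n=500)
print(path[0])  # W_n(0) = 0 by construction
```

As a sanity check, $\mathbb{E}[W_n(1)^2]=\sum_{k\le n}M_k(1)^2$ approaches $1$ as $n\to\infty$, matching the variance of Brownian motion at $x=1$.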

A semi-discrete scheme for (10.1.1) (to be precise, for (10.2.3)) is then as follows:
$$u_n(x) + \int_D K(x,y) f(u_n(y))\,dy = \int_D K(x,y) g(y)\,dy + \int_D K(x,y)\,dW_n(y), \qquad (10.2.8)$$

which is equivalent to
$$-\Delta u_n(x) + f(u_n(x)) = g(x) + \frac{\partial^d}{\partial x_1\partial x_2\cdots\partial x_d} W_n(x). \qquad (10.2.9)$$

The semi-discrete scheme (10.2.9) requires further discretization in physical space. Here we consider a finite element approximation. Let $V_h$ be a linear finite element subspace of $H_0^1(D)$ with a quasi-uniform triangulation $\mathcal{T}_h$. The linear finite element approximation of $u_n$ in (10.2.9) is then to find $u_n^h\in V_h$ such that
$$(\nabla u_n^h, \nabla v) + (f(u_n^h), v) = \Big(g + \frac{\partial^d}{\partial x_1\cdots\partial x_d} W_n,\ v\Big), \quad \forall v\in V_h. \qquad (10.2.10)$$
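To make the scheme concrete, here is a minimal sketch of (10.2.10) in $d=1$ on a uniform mesh. The data $f(u)=\sin(u)$, $g=1$, and $n=20$ noise modes are illustrative assumptions, not from the text; the nonlinearity is handled by a simple Picard iteration, which contracts here since the Lipschitz constant of $\sin$ is below $C_p=\pi^2$.

```python
import numpy as np

rng = np.random.default_rng(1)

# Piecewise linear FEM for -u'' + sin(u) = 1 + dW_n/dx on (0,1), u(0)=u(1)=0,
# with the noise truncated as in (10.2.4) using e_k(x) = sqrt(2) sin(k pi x).
def solve_fem(N=64, n=20, iters=60):
    h = 1.0 / N
    xin = np.linspace(0.0, 1.0, N + 1)[1:-1]        # interior nodes
    xi = rng.standard_normal(n)
    k = np.arange(1, n + 1)
    noise = (np.sqrt(2.0) * np.sin(np.pi * np.outer(xin, k))) @ xi
    load = h * (1.0 + noise)                        # lumped load for g + noise
    # stiffness matrix of -u'' for piecewise linear elements
    A = (2.0 * np.eye(N - 1) - np.eye(N - 1, k=1) - np.eye(N - 1, k=-1)) / h
    u = np.zeros(N - 1)
    for _ in range(iters):                          # Picard iteration on sin(u)
        u = np.linalg.solve(A, load - h * np.sin(u))
    return np.concatenate(([0.0], u, [0.0]))

u = solve_fem()
print(u.shape)
```

Mass lumping is used for the load and the nonlinear term to keep the sketch short; a consistent mass matrix would not change the convergence order.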

Before ending this section, we remark that this approach can be extended further as follows:

• The domain $D$ can be a bounded domain with a smooth boundary $\partial D$, and the conclusions in this chapter remain true. For example, one can consider the problem in [64], where the domain $D$ is bounded and convex.
• The operator $\Delta$ can be replaced by a general self-adjoint, positive-definite linear operator, say $\mathcal{A}$, with compact inverse.

We emphasize again that any orthonormal basis in $L^2(D)$ can be used in the spectral expansion (10.2.5), though it may be convenient to use the eigenfunctions of $\mathcal{A}$ as a basis if they can be obtained explicitly.

10.3 Error estimates for strong and weak convergence order

In this section, we will discuss strong and weak convergence orders.

Theorem 10.3.1 (Strong error) Let $u$ be the solution to (10.1.1) and $u_n$ the solution to (10.2.9). Under Assumption 10.2.1, we have
$$\mathbb{E}[\|u-u_n\|^2] \le C\Big(M\Big(\sum_{|\alpha|=n+1}^{\infty}\lambda_\alpha^{-2}\Big)^{1/2} + \sum_{|\alpha|=n+1}^{\infty}\lambda_\alpha^{-2}\Big) \le C\big(Mn^{-(2-d/2)} + (C_p+R)^2 n^{-(4-d)}\big),$$
where the constant $C$ depends only on $d, C_p, L, M, R$, and the $\lambda_\alpha$ are the eigenvalues of the problem (10.2.6). The constants $L, M, R$ are from Assumption 10.2.1.

Under further smoothness assumptions on the nonlinear function $f$, we show that the weak convergence order is twice the mean-square convergence order.


Theorem 10.3.2 (Weak error) Let $u$ be the solution to (10.1.1) and $u_n$ the solution to (10.2.9). In addition to Assumption 10.2.1, assume also that $f$ and $F$ and their derivatives up to fourth order are of at most polynomial growth at infinity:
$$\Big|\frac{d^k}{dx^k}G(x)\Big| \le c(1+|x|^\kappa), \quad \kappa<\infty,\ k=1,2,3,4,\ \text{and } G=f,F. \qquad (10.3.1)$$
Furthermore, we assume that $M=0$ in Assumption 10.2.1. Then we have
$$\|\mathbb{E}[F(u)] - \mathbb{E}[F(u_n)]\|_{L^q} \le C\sum_{|\alpha|=n+1}^{\infty}\lambda_\alpha^{-2} \le Cn^{-(4-d)}, \quad 1\le q<\infty. \qquad (10.3.2)$$
The constant $C$ depends on $d, C_p, L, M, R$ as in Theorem 10.3.1 and also on the constant in (10.3.1).

10.3.1 Examples of other PDEs

It seems that the fact that the weak convergence order is twice the strong convergence order is quite general. In the following examples, we show that it holds for an advection equation with multiplicative noise and for a fourth-order equation with additive noise, using the spectral approximation (10.2.7).

Example 10.3.3 (Advection-reaction, see, e.g., [438])
$$\partial_t u + \partial_x u = \sigma(u-1)\circ\dot W(x), \quad x\in[0,L], \qquad (10.3.3)$$
with initial condition $u_0(x)$ and zero inflow. The stochastic product $u\circ\dot W$ is the Stratonovich product. Equation (10.3.3) can be written in Itô form as
$$\partial_t u + \partial_x u = \frac{\sigma^2}{2}(u-1) + \sigma(u-1)\diamond\dot W(x), \quad x\in[0,L], \qquad (10.3.4)$$
where "$\diamond$" represents the Itô-Wick product. The exact solution of (10.3.3) is
$$u = 1 + [u_0(x-t)-1]\exp[\sigma W(x) - \sigma W(x-t)]. \qquad (10.3.5)$$
Applying the truncated spectral expansion (10.2.7) in one-dimensional physical space, we then have the following approximation to Equation (10.3.3):
$$\partial_t u_n + \partial_x u_n = \sigma(u_n-1)\frac{d}{dx}W_n(x), \quad x\in[0,L], \qquad (10.3.6)$$
whose solution is
$$u_n = 1 + [u_0(x-t)-1]\exp[\sigma W_n(x) - \sigma W_n(x-t)]. \qquad (10.3.7)$$


Theorem 10.3.4 Let $u$ be the solution to (10.3.3) and $u_n$ the solution to (10.3.6). Then we have
$$\mathbb{E}[|u-u_n|^2] \le C_1\frac{1}{n}, \qquad \big|\mathbb{E}[u^k - u_n^k]\big| \le C_2\frac{1}{n}, \quad \forall k>0, \qquad (10.3.8)$$
where $C_1$ and $C_2$ depend only on $t$, $x$, and $\sigma$ in the former inequality, and $C_2$ depends also on $k$ in the latter.

Proof. We first prove the strong convergence. Note that
$$\mathbb{E}[(u-u_n)^2] = (u_0(x-t)-1)^2\,\mathbb{E}\big[\big(\exp(\sigma W(x)-\sigma W(x-t)) - \exp(\sigma W_n(x)-\sigma W_n(x-t))\big)^2\big]. \qquad (10.3.9)$$
By the fact that $\exp(a) - \exp(b) = \exp(\theta a + (1-\theta)b)(a-b)$ for some $0\le\theta\le 1$, and by the Cauchy-Schwarz inequality, we have
$$\mathbb{E}[(u-u_n)^2] \le (u_0(x-t)-1)^2\big(\mathbb{E}[\exp(4\sigma\theta(W(x)-W(x-t)) + 4\sigma(1-\theta)(W_n(x)-W_n(x-t)))]\big)^{1/2}$$
$$\times\ \sigma^2\big(\mathbb{E}[((W(x)-W(x-t)) - (W_n(x)-W_n(x-t)))^4]\big)^{1/2}.$$
It remains to estimate the two expectations in this inequality. The first one is bounded as follows:
$$\big(\mathbb{E}[\exp(4\sigma\theta(W(x)-W(x-t)) + 4(1-\theta)\sigma(W_n(x)-W_n(x-t)))]\big)^{1/2}$$
$$\le \big(\mathbb{E}[\exp(8\sigma\theta(W(x)-W(x-t)))]\big)^{1/4}\big(\mathbb{E}[\exp(8(1-\theta)\sigma(W_n(x)-W_n(x-t)))]\big)^{1/4}$$
$$\le \exp(8\sigma^2\theta^2 t)\exp(8(1-\theta)^2\sigma^2 t) \le \exp(8\sigma^2 t). \qquad (10.3.10)$$

Now we estimate the second expectation, $\mathbb{E}[(W(x)-W(x-t) - (W_n(x)-W_n(x-t)))^4]$. In fact,
$$\mathbb{E}[(W(x)-W(x-t) - (W_n(x)-W_n(x-t)))^4] = \mathbb{E}\Big[\Big(\sum_{k=n+1}^{\infty}[M_k(x)-M_k(x-t)]\xi_k\Big)^4\Big]$$
$$= \mathbb{E}\Big[\sum_{k=n+1}^{\infty}\sum_{l=n+1}^{\infty}[M_k(x)-M_k(x-t)]^2[M_l(x)-M_l(x-t)]^2\,\xi_k^2\xi_l^2\Big]$$
$$\le 3\sum_{k=n+1}^{\infty}\sum_{l=n+1}^{\infty}[M_k(x)-M_k(x-t)]^2[M_l(x)-M_l(x-t)]^2 = 3\Big(\sum_{k=n+1}^{\infty}[M_k(x)-M_k(x-t)]^2\Big)^2 \le C\frac{1}{n^2}, \qquad (10.3.11)$$
where $M_k = \int_0^x m_k(y)\,dy$ with $m_1(x) = 1/\sqrt{L}$, $m_k(x) = \sqrt{2/L}\cos(\pi(k-1)x/L)$, and $C$ depends only on $t$ and $x$.


By (10.3.10) and (10.3.11), we have the first estimate in (10.3.8).

Now we prove the weak convergence. It suffices to estimate $\mathbb{E}[(u-1)^k] - \mathbb{E}[(u_n-1)^k]$. By (10.3.5) and (10.3.7), we have
$$\big|\mathbb{E}[(u-1)^k] - \mathbb{E}[(u_n-1)^k]\big| = \Big|(u_0(x-t)-1)^k\exp\Big(\frac{k^2}{2}\sigma^2\,\mathbb{E}[(W(x)-W(x-t))^2]\Big)$$
$$-\ (u_0(x-t)-1)^k\exp\Big(\frac{k^2}{2}\sigma^2\,\mathbb{E}[(W_n(x)-W_n(x-t))^2]\Big)\Big|$$
$$\le |u_0(x-t)-1|^k\exp\Big(\frac{k^2}{2}\sigma^2\,\mathbb{E}[(W(x)-W(x-t))^2]\Big)\,\frac{k^2}{2}\sigma^2\big(\mathbb{E}[(W(x)-W(x-t))^2] - \mathbb{E}[(W_n(x)-W_n(x-t))^2]\big),$$
where we have used the fact that $e^x - e^y = e^{\theta x + (1-\theta)y}(x-y)$ $(0\le\theta\le 1)$ and that $\mathbb{E}[(W_n(x)-W_n(x-t))^2] \le \mathbb{E}[(W(x)-W(x-t))^2]$. By
$$\mathbb{E}[(W(x)-W(x-t))^2] - \mathbb{E}[(W_n(x)-W_n(x-t))^2] = \sum_{k=n+1}^{\infty}\big(M_k(x)-M_k(x-t)\big)^2,$$
we then have
$$\big|\mathbb{E}[(u-1)^k] - \mathbb{E}[(u_n-1)^k]\big| \le \frac{k^2}{2}\sigma^2\big|(u_0(x-t)-1)^k\big|\exp\Big(\frac{k^2}{2}\sigma^2 t\Big)\frac{C}{n}.$$
Hence, the estimate of the weak convergence order follows. □

For one-dimensional advection equations with multiplicative noise, we thus have order $1/\sqrt{n}$ for strong convergence and $1/n$ for weak convergence. We do not expect a better convergence order, as in the case of elliptic equations, where the smoothing effect of the inverse of the Laplacian operator is involved. The following example shows that when stronger smoothing effects are present, e.g., for biharmonic equations, the strong convergence order can be even higher than in the case of elliptic operators.
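The explicit solutions (10.3.5) and (10.3.7) make the strong error in Theorem 10.3.4 computable in closed form at a fixed $(x,t)$: writing $W(x)-W(x-t)=\sum_k a_k\xi_k$ with $a_k=M_k(x)-M_k(x-t)$, a short Gaussian moment computation (ours, not in the text) gives $\mathbb{E}[(u-u_n)^2]=(u_0(x-t)-1)^2\big(e^{2\sigma^2 s}-2e^{\sigma^2(s+3s_n)/2}+e^{2\sigma^2 s_n}\big)$, where $s=\sum_k a_k^2$ and $s_n=\sum_{k\le n}a_k^2$. The sketch below checks the $O(1/n)$ decay under the illustrative assumptions $L=1$, $\sigma=1$, $u_0\equiv 0$:

```python
import numpy as np

# Closed-form strong error for Example 10.3.3 at fixed (x, t), using the
# cosine basis: M_1(y) = y, M_k(y) = sqrt(2) sin(pi(k-1)y)/(pi(k-1)), L = 1.
def strong_err(n, x=0.7, t=0.3, sigma=1.0, K=20000):
    k = np.arange(2, K)
    def M(y):
        return np.concatenate(([y], np.sqrt(2.0) * np.sin(np.pi * (k - 1) * y)
                               / (np.pi * (k - 1))))
    a = M(x) - M(x - t)
    s, sn = np.sum(a ** 2), np.sum(a[:n] ** 2)
    # (u_0(x-t) - 1)^2 = 1 since u_0 = 0 here
    return (np.exp(2 * sigma**2 * s) - 2 * np.exp(sigma**2 * (s + 3 * sn) / 2)
            + np.exp(2 * sigma**2 * sn))

for n in (20, 40, 80):
    print(n, strong_err(n) / strong_err(2 * n))  # halving-ratios near 2: O(1/n)
```

Note also that $s=\mathbb{E}[(W(x)-W(x-t))^2]=t$, a convenient sanity check on the basis.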

Example 10.3.5 (Linear biharmonic equations with additive noise) Consider the following linear biharmonic equation with additive noise:
$$\Delta^2 u + bu = g(x) + \frac{\partial^d}{\partial x_1\partial x_2\cdots\partial x_d}W(x), \quad x=(x_1,\cdots,x_d)\in D=[0,1]^d, \qquad (10.3.12)$$
with $u=0$ and $\Delta u = 0$ on $\partial D$, and $g\in L^2(D)$. Suppose the operator $\Delta^2$ has eigenvalues $\lambda_\alpha$ and eigenfunctions $e_\alpha$. Then $\lambda_\alpha$ is proportional to $\pi^4(\alpha_1^4 + \cdots + \alpha_d^4)$. We approximate (10.3.12) by truncating the white noise using the spectral representation (10.2.4):
$$\Delta^2 u_n + bu_n = g(x) + \sum_{|\alpha|\le n} e_\alpha(x)\xi_\alpha. \qquad (10.3.13)$$


Then we have
$$u = \sum_\alpha \frac{g_\alpha + \xi_\alpha}{b + \lambda_\alpha}\,e_\alpha, \qquad u_n = \sum_{|\alpha|\le n}\frac{g_\alpha + \xi_\alpha}{b + \lambda_\alpha}\,e_\alpha. \qquad (10.3.14)$$
Similar to Theorem 10.3.1, we can conclude that
$$\mathbb{E}[\|u\|^2 - \|u_n\|^2] = \mathbb{E}[\|u-u_n\|^2] \le Cn^{-(8-d)}. \qquad (10.3.15)$$
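The $n^{-(8-d)}$ rate in (10.3.15) can be observed from the tail sum directly. A sketch for $d=1$ with the illustrative assumptions $b=1$, $g=0$, and $\lambda_k = \pi^4 k^4$ (all ours, not from the text):

```python
import math

# Tail sum for the biharmonic example: with g = 0 and b = 1,
# E||u - u_n||^2 = sum_{k>n} 1/(1 + pi^4 k^4)^2, expected order n^{-7} in d = 1.
def tail(n, K=4000):
    return sum(1.0 / (1.0 + math.pi**4 * k**4) ** 2 for k in range(n + 1, K))

order = math.log2(tail(20) / tail(40))
print(order)  # observed order, close to 8 - d = 7
```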

10.3.2 Proofs of the strong convergence order

The eigenvalue problem (10.2.6) has the orthonormal eigenfunctions
$$e_\alpha(x) := e_\alpha(x_1,x_2,\cdots,x_d) = \prod_{i=1}^{d}\sqrt{2}\sin(\pi\alpha_i x_i), \quad \alpha_i\ge 1, \qquad (10.3.16)$$
and the corresponding eigenvalues $\lambda_\alpha = \pi^2|\alpha|^2$. We also use the single-indexed eigenvalues $\lambda_i$ and eigenfunctions $e_i(x)$ when no confusion arises, as a single-indexed system can always be achieved by a proper arrangement of the multi-indices.
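One such arrangement of the multi-indices is simply to sort them by eigenvalue. A minimal sketch (the dimension $d=2$ and the per-component cap are illustrative):

```python
import itertools, math

# Single-indexing of the multi-indices alpha in N^d by sorting on
# lambda_alpha = pi^2 |alpha|^2, as used after (10.3.16). Here d = 2 and
# components are capped at 6 purely for illustration.
d, cap = 2, 6
alphas = sorted(itertools.product(range(1, cap + 1), repeat=d),
                key=lambda a: sum(x * x for x in a))
lams = [math.pi**2 * sum(x * x for x in a) for a in alphas]
print(alphas[0], lams[0])  # smallest: alpha = (1, 1), lambda = 2 pi^2
```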

To prove the strong and weak convergence orders, we need the space
$$\dot H^s = \dot H^s(D) = \mathcal{D}\big((-\Delta)^{s/2}\big) = \Big\{v\ \Big|\ \|v\|_s = \big\|(-\Delta)^{s/2}v\big\| = \Big(\sum_{k=1}^{\infty}\lambda_k^s (v,e_k)^2\Big)^{1/2} < \infty\Big\}, \quad s\in\mathbb{R}.$$
It is known that this space is equivalent to the classical Sobolev-Hilbert space $H^s$, i.e., $\dot H^s = H^s$; see, e.g., [446].

The Green's function $K(x,y)$ can be represented by
$$K(x,y) = \sum_{\alpha\in\mathbb{N}^d}\frac{1}{\pi^2|\alpha|^2}\,e_\alpha(x)e_\alpha(y). \qquad (10.3.17)$$

We first consider the regularity of solutions to (10.1.1).

Lemma 10.3.6 There exists a constant $C$ depending only on $d$ such that
$$\int_D |K(x,y)|^2\,dy \le C, \qquad \int_D \|K(\cdot,y)\|_{2-d/2-\varepsilon}^2\,dy \le C\varepsilon^{-1}. \qquad (10.3.18)$$

Proof. By (10.3.17) and the orthonormality of $\{e_\alpha\}$, we have
$$\int_D |K(x,y)|^2\,dy = \sum_\alpha \frac{1}{\pi^4|\alpha|^4}\,e_\alpha^2(x) \le C\sum_\alpha \frac{1}{|\alpha|^4} \le C.$$
By the fact that $\lambda_\alpha \le C|\alpha|^2$, we then have
$$\int_D \|K(\cdot,y)\|_{2-d/2-\varepsilon}^2\,dy = \sum_\alpha \lambda_\alpha^{2-d/2-\varepsilon}\frac{1}{\pi^4|\alpha|^4} \le C\sum_\alpha \frac{1}{|\alpha|^{d+\varepsilon}} \le C\varepsilon^{-1},$$
where we use the fact that the series $\sum_{\alpha\in\mathbb{N}^d}|\alpha|^{-s}$ converges if and only if $s > d$.


Lemma 10.3.7 For any $\varepsilon>0$, we have
$$\mathbb{E}\Big[\Big\|\frac{\partial^d}{\partial x_1\partial x_2\cdots\partial x_d}W_n\Big\|_{-d/2-\varepsilon}^2\Big] \le \mathbb{E}\Big[\Big\|\frac{\partial^d}{\partial x_1\partial x_2\cdots\partial x_d}W\Big\|_{-d/2-\varepsilon}^2\Big] < C(d)\varepsilon^{-1}. \qquad (10.3.19)$$

Proof. By the definition of the norm in $\dot H^{-\beta}$, where $\beta>0$, we have
$$\mathbb{E}\Big[\Big\|\frac{\partial^d}{\partial x_1\cdots\partial x_d}W_n\Big\|_{-\beta}^2\Big] \le \mathbb{E}\Big[\Big\|\frac{\partial^d}{\partial x_1\cdots\partial x_d}W\Big\|_{-\beta}^2\Big] = \sum_{\alpha\in\mathbb{N}^d}\lambda_\alpha^{-\beta} = \frac{1}{\pi^{2\beta}}\sum_{\alpha\in\mathbb{N}^d}\frac{1}{|\alpha|^{2\beta}}.$$
Then, for $2\beta \ge d+\varepsilon$, we have $\mathbb{E}\big[\big\|\frac{\partial^d}{\partial x_1\cdots\partial x_d}W\big\|_{-\beta}^2\big] \le C(d)\varepsilon^{-1} < \infty$. □

From Lemmas 10.3.6 and 10.3.7, we have the following regularity result for (10.1.1).

Theorem 10.3.8 (Regularity) Under Assumption 10.2.1, the solution to (10.1.1) satisfies
$$\mathbb{E}[\|u\|_{L^p}^q] < \infty, \quad 1\le p,q<\infty. \qquad (10.3.20)$$
Furthermore, if $M=0$ in (10.2.2) of Assumption 10.2.1, then
$$\mathbb{E}[\|u\|_{2-d/2-\varepsilon}^2] \le C\varepsilon^{-1}. \qquad (10.3.21)$$

Proof. We prove the $L^p$-stability for Equation (10.1.1). For $p=2$, the $L^2$ regularity can be found in [194]. By (10.2.3) and (10.2.2), taking the $L^p$-norm over both sides, we have
$$\mathbb{E}[\|u\|_{L^p}^q] \le C\,\mathbb{E}\Big[\Big\|\int_D |K(\cdot,y)|(1+|u|)\,dy\Big\|_{L^p}^q\Big] + C\,\mathbb{E}\Big[\Big\|\sum_{|\alpha|=1}^{\infty}\frac{1}{\pi^2|\alpha|^2}\,e_\alpha(x)\xi_\alpha\Big\|_{L^p}^q\Big]. \qquad (10.3.22)$$
By the Cauchy-Schwarz inequality and Lemma 10.3.6, we have
$$\mathbb{E}\Big[\Big\|\int_D |K(\cdot,y)||u|\,dy\Big\|_{L^p}^q\Big] \le \Big\|\Big(\int_D K^2(\cdot,y)\,dy\Big)^{1/2}\Big\|_{L^p}^q\,\mathbb{E}[\|u\|^q] \le C\,\mathbb{E}[\|u\|^q].$$
By the Littlewood-Paley inequality (see Appendix D and [301]), we have, for any $1<p<\infty$,
$$\Big\|\sum_{|\alpha|=1}^{\infty}\frac{1}{\pi^2|\alpha|^2}\,e_\alpha(x)\xi_\alpha\Big\|_{L^p} \le C\Big\|\Big(\sum_{|\alpha|=1}^{\infty}\frac{1}{\pi^4|\alpha|^4}\,e_\alpha^2(x)\xi_\alpha^2\Big)^{\frac12}\Big\|_{L^p} \le C\Big(\sum_{|\alpha|=1}^{\infty}\frac{1}{\pi^4|\alpha|^4}\,\xi_\alpha^2\Big)^{\frac12}.$$
Then, by the Cauchy-Schwarz inequality and the triangle inequality, we have
$$\mathbb{E}\Big[\Big\|\sum_{|\alpha|=1}^{\infty}\frac{1}{\pi^2|\alpha|^2}\,e_\alpha(x)\xi_\alpha\Big\|_{L^p}^q\Big] \le C\,\mathbb{E}\Big[\Big(\sum_{|\alpha|=1}^{\infty}\frac{1}{\pi^4|\alpha|^4}\,\xi_\alpha^2\Big)^{\frac q2}\Big] \le C\Big(\mathbb{E}\Big[\Big(\sum_{|\alpha|=1}^{\infty}\frac{1}{\pi^4|\alpha|^4}\,\xi_\alpha^2\Big)^q\Big]\Big)^{\frac12}$$
$$\le C\Big(\sum_{|\alpha|=1}^{\infty}\frac{1}{\pi^4|\alpha|^4}\big(\mathbb{E}[\xi_\alpha^{2q}]\big)^{\frac1q}\Big)^{\frac q2} \le C\Big(\sum_{|\alpha|=1}^{\infty}\frac{1}{\pi^4|\alpha|^4}\Big)^{\frac q2} < \infty.$$
Then, by (10.3.22), we have the inequality in (10.3.20) when $p>1$. The inequality for $p=1$ follows readily from the Cauchy-Schwarz inequality and the case $p=2$.

With Lemma 10.3.7, the estimate (10.3.21) can be proved similarly. □

Now we can discuss the strong convergence order for the spectral truncation of white noise.

Lemma 10.3.9 For $\eta = \int_D K(x,y)\,d[W(y)-W_n(y)]$, we have
$$\mathbb{E}[\|\eta\|^2] \le C(d)(n+1)^{-(4-d)}. \qquad (10.3.23)$$

Proof. By (10.3.17), (10.2.5), and (10.2.7), we have
$$\mathbb{E}[\|\eta\|^2] = \mathbb{E}\Big[\Big\|\int_D K(x,y)\,d[W(y)-W_n(y)]\Big\|^2\Big] = \mathbb{E}\Big[\int_D\Big(\int_D\sum_{\alpha\in\mathbb{N}^d}\frac{e_\alpha(x)e_\alpha(y)}{\pi^2|\alpha|^2}\sum_{|\alpha'|\ge n+1}e_{\alpha'}(y)\xi_{\alpha'}\,dy\Big)^2 dx\Big]$$
$$= \mathbb{E}\Big[\int_D\Big(\sum_{|\alpha|\ge n+1}\frac{1}{\pi^2|\alpha|^2}\,e_\alpha(x)\xi_\alpha\Big)^2 dx\Big] = \sum_{|\alpha|\ge n+1}\int_D\frac{e_\alpha^2(x)}{\pi^4|\alpha|^4}\,dx$$
$$= \sum_{|\alpha|\ge n+1}\frac{1}{\pi^4|\alpha|^4} \le \int_{|\alpha|\ge n}\frac{d\alpha}{\pi^4|\alpha|^4} \le C(d)\,n^{-(4-d)}.$$
Here we also used the mutual independence of the $\xi_\alpha$'s and the orthonormality of the basis $\{e_\alpha\}$. □


Lemma 10.3.10 (Cf. [194, Theorem 2.3]) Under Assumption 10.2.1, we have
$$\mathbb{E}[\|u-u_n\|^2] \le C\big(M\big(\mathbb{E}[\|\eta\|^2]\big)^{\frac12} + (C_p+R)^2\,\mathbb{E}[\|\eta\|^2]\big),$$
where the constant $C$ depends only on $C_p, L$, and $M, R$.

Proof. By (10.2.3) and (10.2.8), the error equation reads
$$u(x) - u_n(x) = -\int_D K(x,y)[f(u(y))-f(u_n(y))]\,dy + \eta(x). \qquad (10.3.24)$$
Multiplying both sides of (10.3.24) by $[f(u(x))-f(u_n(x))]$, applying the inequality (10.2.1), and integrating over the domain $D$, we have
$$-L\|u-u_n\|^2 \le \int_D \eta(x)[f(u(x))-f(u_n(x))]\,dx - \int_D\int_D K(x,y)[f(u(y))-f(u_n(y))]\,dy\,[f(u(x))-f(u_n(x))]\,dx$$
$$\le \int_D \eta(x)[f(u(x))-f(u_n(x))]\,dx - C_p\int_D\Big(\int_D K(x,y)[f(u(y))-f(u_n(y))]\,dy\Big)^2 dx$$
$$= \int_D \eta(x)[f(u(x))-f(u_n(x))]\,dx - C_p\int_D[u-u_n-\eta]^2\,dx.$$
Then, by (10.2.2) and the Cauchy-Schwarz inequality, we have
$$(C_p-L)\|u-u_n\|^2 \le (C_p+R)\|u-u_n\|\|\eta\| + 2M\|\eta\|.$$
By Assumption 10.2.1 ($C_p-L>0$), we have
$$\|u-u_n\|^2 \le C\big(M\|\eta\| + (C_p+R)^2\|\eta\|^2\big),$$
where the constant $C$ depends only on $C_p, L$ and $M, R$. Taking expectations and applying the Cauchy-Schwarz inequality to the first term yields the conclusion. □

Theorem 10.3.1 follows from the triangle inequality and Lemmas 10.3.9 and 10.3.10.

10.3.3 Weak convergence order

To prove Theorem 10.3.2, we need the following lemmas. We introduce the equation
$$-\Delta u(x; z_1,\cdots,z_n,\cdots) + f(u(x; z_1,\cdots,z_n,\cdots)) = g(x) + \sum_{i=1}^{\infty} e_i(x)z_i, \qquad (10.3.25)$$
where the $z_i$ are parameters on the real line. Note that $u(x;\xi_1,\cdots)$ is the solution to the problem (10.1.1). Here the $\xi_i$'s are single-indexed in the same way as the eigenvalues $\lambda_i$.


Lemma 10.3.11 In addition to Assumption 10.2.1, assume also that $f$ satisfies the polynomial growth condition (10.3.1). Then there exists a constant $C>0$ depending only on $d,\kappa,\beta$ and the constants in Assumption 10.2.1 such that
$$\mathbb{E}[\|D^\beta u\|_{L^q}^2] \le C\prod_i \lambda_i^{-2\beta_i}, \quad 1\le|\beta|\le 4,\ 1\le q\le\infty.$$

Lemma 10.3.12 Suppose that $F$ satisfies the polynomial growth condition (10.3.1). Under the conditions of Lemma 10.3.11, we then have, for some constant $C>0$ depending only on $d,\kappa,\beta$ and the constants in Assumption 10.2.1,
$$\mathbb{E}[\|D^\beta(F(u))\|_{L^q}^2] \le C\prod_i \lambda_i^{-2\beta_i}, \quad |\beta|\le 4,\ 1\le q<\infty.$$

Proof of Lemma 10.3.11. To estimate the derivatives of the solution with respect to the parameters, we need the following auxiliary equation: for $g\in L^2(D)$,
$$-\Delta v + f'(u)v = g(x),\ x\in D, \qquad v=0\ \text{on}\ \partial D. \qquad (10.3.26)$$
By Assumption 10.2.1, we claim the estimate
$$\|v\|_{L^q} \le C\Big\|\int_D K(x,y)g(y)\,dy\Big\|_{L^\infty}, \quad 1\le q\le\infty. \qquad (10.3.27)$$
We first establish the case $q=2$. Equation (10.3.26) can be written in integral form as
$$v(x) + \int_D K(x,y)f'(u)v\,dy = \int_D K(x,y)g(y)\,dy. \qquad (10.3.28)$$
Multiplying both sides of (10.3.28) by $f'(u)v$, by the Poincaré inequality (see Appendix D) and (10.3.28), we have
$$0 = (f'(u)v, v) + \Big(\int_D K(\cdot,y)f'(u(y))v(y)\,dy,\ f'(u)v\Big) - \Big(\int_D K(\cdot,y)g(y)\,dy,\ f'(u)v\Big)$$
$$\ge (f'(u)v, v) + C_p\Big\|\int_D K(\cdot,y)f'(u(y))v(y)\,dy\Big\|^2 - \Big(\int_D K(\cdot,y)g(y)\,dy,\ f'(u)v\Big)$$
$$\ge -L\|v\|^2 + C_p\Big\|v - \int_D K(\cdot,y)g(y)\,dy\Big\|^2 - \Big(\int_D K(\cdot,y)g(y)\,dy,\ f'(u)v\Big).$$
Then, by the facts that $f'\ge -L > -C_p$ and $|f'|\le R$, we obtain (10.3.27) for $q=2$. Taking the $L^q$-norm over both sides of (10.3.28) and using $\int_D K^2(x,y)\,dy \le C$ (Lemma 10.3.6), we have
$$\|v\|_{L^q} \le RC\|v\| + \Big\|\int_D K(\cdot,y)g(y)\,dy\Big\|_{L^q}, \quad 1\le q\le\infty,$$
and thus, by Lemma 10.3.6, we reach (10.3.27).


Taking the derivative with respect to $z_i$ in Equation (10.3.25), we have
$$-\Delta D^{\varepsilon_i}u(x;z_1,\cdots,z_n,\cdots) + f'(u(x;z_1,\cdots,z_n,\cdots))\,D^{\varepsilon_i}u(x;z_1,\cdots,z_n,\cdots) = e_i(x).$$
Thus, by (10.3.27) and (10.3.17), we have
$$\|D^{\varepsilon_i}u\|_{L^q} \le C\Big\|\int_D K(\cdot,y)e_i(y)\,dy\Big\|_{L^\infty} = C\|\lambda_i^{-1}e_i\|_{L^\infty} \le C\lambda_i^{-1}, \quad 1\le q\le\infty. \qquad (10.3.29)$$
Taking the derivatives with respect to $z_i$ and $z_j$ in Equation (10.3.25), we obtain the equation
$$-\Delta D^{\varepsilon_i+\varepsilon_j}u + f'(u)D^{\varepsilon_i+\varepsilon_j}u = -f''(u)\,D^{\varepsilon_i}u\,D^{\varepsilon_j}u.$$
Then, by (10.3.27), Lemma 10.3.6, and (10.3.29), we have
$$\|D^{\varepsilon_i+\varepsilon_j}u\|_{L^q} \le C\Big\|\int_D K(\cdot,y)f''(u)\,D^{\varepsilon_i}u\,D^{\varepsilon_j}u\,dy\Big\|_{L^\infty} \le C\lambda_i^{-1}\lambda_j^{-1}\|f''(u)\|, \quad 1\le q\le\infty.$$
Similarly, we have
$$\|D^{\varepsilon_i+\varepsilon_j+\varepsilon_k}u\|_{L^q} \le C\lambda_i^{-1}\lambda_j^{-1}\lambda_k^{-1}\big(\|f^{(3)}(u)\| + \|f''(u)\|\big),$$
$$\|D^{\varepsilon_i+\varepsilon_j+\varepsilon_k+\varepsilon_l}u\|_{L^q} \le C\lambda_i^{-1}\lambda_j^{-1}\lambda_k^{-1}\lambda_l^{-1}\big(\|f^{(4)}(u)\| + \|f^{(3)}(u)\| + \|f''(u)\|\big).$$
By the assumption of polynomial growth at infinity for $f$ and its derivatives and the $L^p$-stability (Theorem 10.3.8), we reach the conclusion. □

Proof of Lemma 10.3.12. By the multivariate chain rule (the multivariate Faà di Bruno formula), we have $D^{\varepsilon_i+\varepsilon_j}F(u) = F'(u)D^{\varepsilon_i+\varepsilon_j}u + F''(u)D^{\varepsilon_i}u\,D^{\varepsilon_j}u$, and thus, by Lemma 10.3.11,
$$\|D^{\varepsilon_i+\varepsilon_j}F(u)\|_{L^q} \le C\big(\|F'(u)\|_{L^q} + \|F''(u)\|_{L^q}\big)\lambda_i^{-1}\lambda_j^{-1}, \quad 1\le q<\infty.$$
Similarly, we have
$$\|D^{\varepsilon_i+\varepsilon_j+\varepsilon_k+\varepsilon_l}F(u)\|_{L^q} \le C\lambda_i^{-1}\lambda_j^{-1}\lambda_k^{-1}\lambda_l^{-1}\big(\|F'(u)\|_{L^q} + \|F''(u)\|_{L^q} + \|F^{(3)}(u)\|_{L^q} + \|F^{(4)}(u)\|_{L^q}\big),$$
and the conclusion follows from the assumption (10.3.1) of polynomial growth of $F$ and its derivatives at infinity and from Theorem 10.3.8. □

Proof of Theorem 10.3.2. By a first-order Taylor expansion, we have, for $m>n$,
$$\mathbb{E}[F(u_m) - F(u_n)] = \mathbb{E}[F(u_m(\xi_1,\cdots,\xi_{n'},\cdots,\xi_{m'})) - F(u_n(\xi_1,\cdots,\xi_{n'}))]$$
$$= \mathbb{E}\Big[\sum_{i=n'+1}^{m'} D^{\varepsilon_i}\big(F(u_m(\xi_1,\cdots,\xi_{n'},0,\cdots,0))\big)\xi_i\Big] + \sum_{i,j=n'+1}^{m'}\mathbb{E}\Big[\int_0^1(1-t)\,D^{\varepsilon_i+\varepsilon_j}\big(F(u_m(\xi_1,\cdots,\xi_{n'},t\xi_{n'+1},\cdots,t\xi_{m'}))\big)\xi_i\xi_j\,dt\Big]$$
$$= \sum_{i,j=n'+1}^{m'}\int_0^1(1-t)\,\mathbb{E}\big[D^{\varepsilon_i+\varepsilon_j}\big(F(u_m(\xi_1,\cdots,\xi_{n'},t\xi_{n'+1},\cdots,t\xi_{m'}))\big)\xi_i\xi_j\big]\,dt, \qquad (10.3.30)$$
where $n' = n!/(d!\,(n-d)!)$ and $m' = m!/(d!\,(m-d)!)$, and we used the fact that $\xi_i$ ($i\ge n'+1$) is independent of $F(u_m(\xi_1,\cdots,\xi_{n'},0,\cdots,0))$ and $\mathbb{E}[\xi_i]=0$.

To estimate (10.3.30), we split the term into two parts:
$$I = \sum_{i=n'+1}^{m'}\int_0^1(1-t)\,\mathbb{E}\big[D^{2\varepsilon_i}\big(F(u_m(\xi_1,\cdots,\xi_{n'},t\xi_{n'+1},\cdots,t\xi_{m'}))\big)\xi_i^2\big]\,dt,$$
$$II = 2\sum_{i<j,\ i,j=n'+1}^{m'}\int_0^1(1-t)\,\mathbb{E}\big[D^{\varepsilon_i+\varepsilon_j}\big(F(u_m(\xi_1,\cdots,\xi_{n'},t\xi_{n'+1},\cdots,t\xi_{m'}))\big)\xi_i\xi_j\big]\,dt.$$

By Lemma 10.3.12, we have, for $1\le q<\infty$,
$$\|I\|_{L^q} = \Big\|\sum_{i=n'+1}^{m'}\int_0^1(1-t)\,\mathbb{E}\big[D^{2\varepsilon_i}\big(F(u_m(\xi_1,\cdots,\xi_{n'},t\xi_{n'+1},\cdots,t\xi_{m'}))\big)\xi_i^2\big]\,dt\Big\|_{L^q} \le C\sum_{i=n'+1}^{m'}\lambda_i^{-2}. \qquad (10.3.31)$$

For $II$, we use the recipe of the proof of Theorem 2.8 in [74]. For simplicity, we define
$$X_{i,j}^{t,r,s} = (\xi_1,\cdots,\xi_{n'},t\xi_{n'+1},\cdots,tr\xi_i,\cdots,ts\xi_j,\cdots,t\xi_{m'}).$$
Noticing that $\mathbb{E}[D^{\varepsilon_i+\varepsilon_j}(F(u_m(X_{i,j}^{t,0,1})))\xi_i\xi_j] = 0$ ($i<j$), we have
$$\int_0^1(1-t)\,\mathbb{E}[D^{\varepsilon_i+\varepsilon_j}(F(u_m(X_{i,j}^{t,1,1})))\xi_i\xi_j]\,dt$$
$$= \int_0^1(1-t)\,\mathbb{E}[D^{\varepsilon_i+\varepsilon_j}(F(u_m(X_{i,j}^{t,1,1})))\xi_i\xi_j]\,dt - \int_0^1(1-t)\,\mathbb{E}[D^{\varepsilon_i+\varepsilon_j}(F(u_m(X_{i,j}^{t,0,1})))\xi_i\xi_j]\,dt$$
$$= \int_0^1\int_0^1(1-t)t\,\mathbb{E}[D^{2\varepsilon_i+\varepsilon_j}(F(u_m(X_{i,j}^{t,r,1})))\xi_i\xi_j]\,dt\,dr.$$


With $\mathbb{E}[D^{2\varepsilon_i+\varepsilon_j}(F(u_m(X_{i,j}^{t,r,0})))\xi_i\xi_j] = 0$ ($i<j$), we have similarly
$$\int_0^1\int_0^1(1-t)t\,\mathbb{E}[D^{2\varepsilon_i+\varepsilon_j}(F(u_m(X_{i,j}^{t,r,1})))\xi_i\xi_j]\,dt\,dr = \int_0^1\int_0^1\int_0^1(1-t)t^2\,\mathbb{E}[D^{2\varepsilon_i+2\varepsilon_j}(F(u_m(X_{i,j}^{t,r,s})))\xi_i^2\xi_j^2]\,dt\,dr\,ds,$$
and thus, for $i<j$,
$$\int_0^1(1-t)\,\mathbb{E}[D^{\varepsilon_i+\varepsilon_j}(F(u_m(X_{i,j}^{t,1,1})))\xi_i\xi_j]\,dt = \int_0^1\int_0^1\int_0^1(1-t)t^2\,\mathbb{E}[D^{2\varepsilon_i+2\varepsilon_j}(F(u_m(X_{i,j}^{t,r,s})))\xi_i^2\xi_j^2]\,dt\,dr\,ds.$$

Now, with Lemma 10.3.12, we can bound $II$:
$$\|II\|_{L^q} \le \Big\|2\sum_{i<j,\ i,j=n'+1}^{m'}\int_0^1\int_0^1\int_0^1(1-t)t^2\,\mathbb{E}[D^{2\varepsilon_i+2\varepsilon_j}(F(u_m(X_{i,j}^{t,r,s})))\xi_i^2\xi_j^2]\,dt\,dr\,ds\Big\|_{L^q} \le c\sum_{i<j,\ i,j=n'+1}^{m'}\lambda_i^{-2}\lambda_j^{-2}. \qquad (10.3.32)$$
Thus, by (10.3.30), (10.3.31), and (10.3.32), we have
$$\|\mathbb{E}[F(u_m) - F(u_n)]\|_{L^q} \le c\sum_{|\alpha|=n+1}^{m}\lambda_\alpha^{-2} + c\Big(\sum_{|\alpha|=n+1}^{m}\lambda_\alpha^{-2}\Big)^2. \qquad (10.3.33)$$
Then, by $\lambda_\alpha = \pi^2|\alpha|^2$, we arrive at the conclusion. □

10.4 Error estimates for finite element approximation

In this section, we show that the strong convergence order of the finite element approximation can be the same as the strong order of the spectral truncation of white noise (Theorem 10.4.1), while the weak convergence order is constrained by the convergence order of the piecewise linear finite element approximation in one dimension (Theorem 10.4.3). It follows that for one-dimensional problems, a piecewise quadratic finite element approximation should be used to obtain a higher convergence order.

Theorem 10.4.1 (Finite element approximation, strong error) Let $u$ be the solution to (10.1.1) and $u_n^h$ the solution to (10.2.10). Under Assumption 10.2.1, we have the following estimate for the piecewise linear finite element approximation of (10.1.1):
$$\mathbb{E}[\|u-u_n^h\|^2] \le 2\,\mathbb{E}[\|u-u_n\|^2] + 2\,\mathbb{E}[\|u_n-u_n^h\|^2] \le C\big(n^{-(4-d)} + Mn^{-(2-\frac d2)}\big) + C\big(h^4 n^d + Mh^2 n^{d/2}\big).$$
Taking $n = 1/h$, we have
$$\mathbb{E}[\|u-u_n^h\|^2] \le C\big(h^{4-d} + Mh^{2-\frac d2}\big). \qquad (10.4.1)$$

Remark 10.4.2 The convergence order in Theorem 10.4.1 is optimal, as the solution $u$ to (10.1.1) belongs to $H^{2-d/2-\varepsilon}(D)$ when $M=0$ in Assumption 10.2.1; see Theorem 10.3.8. Compared to the finite difference and finite element methods in [6, 64, 194], the convergence order is half an order higher than that presented in [6] for the one-dimensional problem and is the same for higher-dimensional problems [64, 194].

Define the Ritz projection R_h: H_0^1(D) \to V_h by

(\nabla R_h w, \nabla v) = (\nabla w, \nabla v), \quad \forall v \in V_h,\ w \in H_0^1(D).

Then it holds (see, e.g., [446]) that there is a constant C independent of h such that for 0 \le l < r \le 2,

\|w - R_h w\|_l \le C h^{r-l}\|w\|_r, \quad w \in H^2(D) \cap H_0^1(D). \qquad (10.4.2)
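The estimate (10.4.2) can be checked numerically in one dimension. The sketch below (all parameters are illustrative choices, not from the text) computes the Ritz projection of w(x) = sin(\pi x) on (0, 1) by solving the discrete Poisson problem with load -w'' and verifies that the observed L^2 convergence order is close to r - l = 2:

```python
import numpy as np

# A 1D numerical sketch of the Ritz projection estimate (10.4.2) with l = 0, r = 2:
# ||w - R_h w||_{L^2} = O(h^2) for w(x) = sin(pi x) on (0, 1).
def ritz_l2_error(Ne):
    h = 1.0 / Ne
    nodes = np.linspace(0.0, 1.0, Ne + 1)
    N = Ne - 1                              # number of interior nodes
    # stiffness matrix of piecewise linear elements
    A = (np.diag(np.full(N, 2.0))
         - np.diag(np.ones(N - 1), 1) - np.diag(np.ones(N - 1), -1)) / h
    # composite trapezoidal quadrature on a fine grid aligned with the mesh
    xq = np.linspace(0.0, 1.0, 100 * Ne + 1)
    dx = xq[1] - xq[0]
    wq = np.full_like(xq, dx); wq[0] = wq[-1] = dx / 2
    # hat functions at interior nodes evaluated on the fine grid
    phi = np.maximum(0.0, 1.0 - np.abs(xq[None, :] - nodes[1:-1, None]) / h)
    # (grad R_h w, grad phi_j) = (grad w, grad phi_j) = (-w'', phi_j)
    b = phi @ (wq * np.pi**2 * np.sin(np.pi * xq))
    c = np.linalg.solve(A, b)
    Rw = c @ phi                            # the Ritz projection on the fine grid
    return np.sqrt(np.sum(wq * (np.sin(np.pi * xq) - Rw) ** 2))

e1, e2 = ritz_l2_error(16), ritz_l2_error(32)
order = np.log2(e1 / e2)
print(order)                                # observed order, close to 2
```

In one dimension the Ritz projection with exactly integrated load is nodally exact, so the L^2 error here coincides with the interpolation error, which is second order.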

Proof. It can be readily checked from (10.2.9) and (10.2.10) that

(\nabla(R_h u_n - u_n^h), \nabla v) + (f(u_n) - f(u_n^h), v) = 0, \quad v \in V_h. \qquad (10.4.3)

Taking v = R_h u_n - u_n^h and using (10.2.1), (10.2.2), and the Cauchy-Schwarz inequality, we have

\|\nabla(R_h u_n - u_n^h)\|^2 = -(f(u_n) - f(u_n^h), u_n - u_n^h) + (f(u_n) - f(u_n^h), R_h u_n - u_n)
 \le L\|u_n - u_n^h\|^2 + c(M + R\|u_n - u_n^h\|)\|R_h u_n - u_n\|
 \le \frac{L + C_p}{2}\|u_n - u_n^h\|^2 + CM\|R_h u_n - u_n\| + C\|R_h u_n - u_n\|^2.

Then by the Poincare inequality \|\nabla(R_h u_n - u_n^h)\|^2 \ge C_p\|R_h u_n - u_n^h\|^2, the triangle inequality, and L < C_p, there exists a constant C independent of h but dependent on C_p, R, L such that

\|R_h u_n - u_n^h\|_1^2 + \|u_n - u_n^h\|^2 \le C(M\|R_h u_n - u_n\| + \|R_h u_n - u_n\|^2). \qquad (10.4.4)

Then by (10.4.2), we have

\|R_h u_n - u_n^h\|_1^2 + \|u_n - u_n^h\|^2 \le C(Mh^2\|u_n\|_2 + h^4\|u_n\|_2^2). \qquad (10.4.5)


288 10 Semilinear elliptic equations with additive noise

Similar to the proof of Theorem 10.3.8, we have

E[\|u_n\|_2^2] \le C E\Big[\Big\|g + \frac{\partial^d}{\partial x_1\cdots\partial x_d}W_n\Big\|^2\Big] \le C n^d, \qquad (10.4.6)

where we have used the fact that

E\Big[\Big\|\frac{\partial^d}{\partial x_1\cdots\partial x_d}W_n\Big\|^2\Big] = \sum_{|\alpha|\le n}\|e_\alpha\|^2 \le C n^d.

By (10.4.4) and (10.4.6), we obtain that

\|u_n - u_n^h\|^2 \le C(Mh^2\|u_n\|_2 + h^4\|u_n\|_2^2),

whence we can reach the conclusion by setting h = n^{-1}.

Theorem 10.4.3 (Finite element approximation, weak error) Let u be the solution to (10.1.1) and u_n^h the solution to (10.2.10). Under the conditions of Theorem 10.3.2, we have, for d \le 3,

\big|E[\|u\|^2 - \|u_n^h\|^2]\big| \le C[n^{-(4-d)} + h^4 n^d + h^3 n^{d/2} + h^2 n^{\min(d-2,0)}].

To get optimal convergence in h, we take n at the order of 1/h and have

\big|E[\|u\|^2 - \|u_n^h\|^2]\big| \le C h^{\min(4-d,2)}.

The constant C depends on d, C_p, L, R as in Theorem 10.3.1 and also on the constant c in (10.3.1).

Lemma 10.4.4 Let u_n be the solution to (10.2.8) or (10.2.9) and u_n^h be the finite element solution in (10.2.10). Then we have

E[\|u_n\|_{L^p}^q] < C < \infty, \quad p, q \ge 1, \qquad (10.4.7)

and

E[\|u_n\|_1^2] \le C n^{\min(d-2,0)}, \qquad (10.4.8)

where C does not depend on n. For u_n^h, we have

E[\|u_n^h\|_{L^p}^q] < \infty, \quad p, q \ge 1. \qquad (10.4.9)

Proof. The proof of (10.4.7) is similar to the proof of (10.3.20). In particular, we have

E[\|u_n\|^2] \le C.

Then multiplying both sides of (10.2.9) by u_n and using integration by parts, we have

\|\nabla u_n\|^2 = -(f(u_n), u_n) + (g, u_n) + \Big(\frac{\partial^{d-1}}{\partial x_2\cdots\partial x_d}W_n,\ \partial_{x_1}u_n\Big).


By (10.3.16) and the definition of the Brownian sheet (10.2.5), we have

E\Big[\Big\|\frac{\partial^{d-1}}{\partial x_2\cdots\partial x_d}W_n\Big\|^2\Big] = E\Big[\int_D\Big(\sum_{|\alpha|\le n+1}\int_0^{x_1} e_\alpha(y_1, x_2, \cdots, x_d)\xi_\alpha\,dy_1\Big)^2 dx\Big] \le C n^{\min(d-2,0)}.

Then by Assumption 10.2.1 and the Cauchy inequality, we have

\|\nabla u_n\|^2 \le C(1 + \|u_n\|^2 + \|g\|^2) + \frac{1}{2}\Big\|\frac{\partial^{d-1}}{\partial x_2\cdots\partial x_d}W_n\Big\|^2 + \frac{1}{2}\|\partial_{x_1}u_n\|^2,

and thus we reach (10.4.8) from the fact that \|\partial_{x_1}u_n\|^2 \le \|\nabla u_n\|^2. In fact,

E[\|u_n\|_1^2] = E[\|u_n\|^2] + E[\|\nabla u_n\|^2] \le C n^{\min(d-2,0)}. \qquad (10.4.10)

Now we prove (10.4.9). When p = 2, (10.4.9) follows from (10.2.10) if we take v = u_n^h and apply Assumption 10.2.1. According to Chapter 2 in [446], we have

u_n^h + \int_D R_h K(x,y) f(u_n^h)\,dy = \int_D R_h K(x,y) g(y)\,dy + \int_D R_h K(x,y)\,dW_n(y). \qquad (10.4.11)

To prove (10.4.11), the key is to show that the inverse of -\Delta_h, the discrete Laplacian from S_h to S_h defined by

-(\Delta_h \varphi_h, v) := (\nabla\varphi_h, \nabla v), \quad \varphi_h, v \in S_h,

is R_h(-\Delta)^{-1}. Denote T = (-\Delta)^{-1}: L^2 \to H_0^1(D). Then for the elliptic problem -\Delta\varphi = g over D with homogeneous Dirichlet boundary conditions, the solution is \varphi = Tg. Denote the inverse of -\Delta_h by T_h. Then we have (see, e.g., [446, (2.16)])

T_h = R_h T.

This gives that

T_h g = R_h T g = \int_D R_h K(x,y) g(y)\,dy. \qquad (10.4.12)

The finite element approximation (10.2.10) can be rewritten as

-\Delta_h u_n^h + P_h f(u_n^h) = P_h\Big(g + \frac{\partial^d}{\partial x_1\cdots\partial x_d}W_n\Big),

where P_h is the L^2 projection onto S_h. Then by T_h P_h = T_h (see [446, (2.24)]), we have

u_n^h = -T_h f(u_n^h) + T_h\Big(g + \frac{\partial^d}{\partial x_1\cdots\partial x_d}W_n\Big).

Thus, by (10.4.12), we reach (10.4.11).


For p \neq 2, we follow the same idea as in the proof of Theorem 10.3.8. Taking the L^p-norm over both sides of (10.4.11) and using (10.2.2), we then have

E[\|u_n^h\|_{L^p}^q] \le C E\Big[\Big\|\int_D |R_h K(\cdot,y)|(1 + |u_n^h|)\,dy\Big\|_{L^p}^q\Big] + C E\Big[\Big\|\sum_{|\alpha|=1}^{\infty}\frac{1}{\pi^2|\alpha|^2}R_h e_\alpha(x)\xi_\alpha\Big\|_{L^p}^q\Big]. \qquad (10.4.13)

By the Cauchy-Schwarz inequality, we have

E\Big[\Big\|\int_D |R_h K(\cdot,y)|\,|u_n^h|\,dy\Big\|_{L^p}^q\Big] \le \Big\|\Big(\int_D (R_h K(\cdot,y))^2\,dy\Big)^{1/2}\Big\|_{L^p}^q E[\|u_n^h\|^q] \le C E[\|u_n^h\|^q].

Similar to the proof of (10.3.20), we conclude that (10.4.9) holds.

Proof. We will use the duality argument and Theorem 10.3.2 to prove Theorem 10.4.3.

By Theorem 10.3.2 (taking q = 1 in (10.3.2)), we have

\big|E[\|u\|^2 - \|u_n\|^2]\big| \le C n^{-(4-d)}.

By the standard estimate of the Ritz operator in negative norms (see, e.g., [446, Theorem 5.1]),

\|u_n - R_h u_n\|_{-r} \le C h^{q+r}\|u_n\|_q, \quad 1 \le q \le s,\ 0 \le r \le s-2, \qquad (10.4.14)

we have, taking q = r = 1 and by the fact \|R_h u_n\|_1 \le C\|u_n\|_1,

\big|E[\|u_n\|^2 - \|R_h u_n\|^2]\big| \le E[\|u_n - R_h u_n\|_{-1}\|u_n + R_h u_n\|_1] \le C h^2 E[\|u_n\|_1^2]. \qquad (10.4.15)

Similar to the proof of Theorem 10.3.8, we have

E[\|u_n\|_1^2] \le C\ (d = 1, 2) \quad \text{and} \quad E[\|u_n\|_1^2] \le C n\ (d = 3).

From here and (10.4.15), we have

\big|E[\|u_n\|^2 - \|R_h u_n\|^2]\big| \le C h^2 n^{\min(d-2,0)}. \qquad (10.4.16)

In order to estimate \big|E[\|u\|^2 - \|u_n^h\|^2]\big|, we only need to estimate \big|E[\|R_h u_n\|^2] - E[\|u_n^h\|^2]\big|. We use the duality argument to obtain such an estimate. To this end, we introduce the following linear adjoint problem over the domain D:

-\Delta\psi + f'(u_n)\psi = \varphi, \quad \psi|_{\partial D} = 0. \qquad (10.4.17)

It holds that \|\psi\|_2 \le C\|\varphi\| since f'(u_n) \ge -L > -C_p and is bounded. Introducing e = R_h u_n - u_n^h, e_1 = u_n - u_n^h, and e_2 = R_h u_n - u_n, we then have, by (10.4.17), the definition of the Ritz projection, and the error equation (10.4.3),


(e, \varphi) = (\nabla e, \nabla\psi) + (f'(u_n)e, \psi) = (\nabla e, \nabla R_h\psi) + (f'(u_n)e, \psi)
 = -(f(u_n) - f(u_n^h), R_h\psi) + (f'(u_n)e, \psi)
 = (f(u_n) - f(u_n^h), \psi - R_h\psi) + (f'(u_n)e - (f(u_n) - f(u_n^h)), \psi).

Thus we have, by (10.4.2), |f'(u_n)| \le R, and Taylor's expansion,

|(e, \varphi)| \le C\|e_1\|h^2\|\psi\|_2 + \frac{1}{2}\big|(f''(\theta u_n + (1-\theta)u_n^h)e^2, \psi)\big| \qquad (10.4.18)
 \le C\|e_1\|h^2\|\psi\|_2 + C\|e\|_{L^4}^2\|\psi\|_\infty(1 + \|u_n\|_{L^{2\kappa}}^\kappa + \|u_n^h\|_{L^{2\kappa}}^\kappa),

where 0 \le \theta \le 1 and we used the polynomial growth condition (10.3.1) for f''. Then we have, by the embedding \|v\|_\infty \le C\|v\|_2 and \|e\|_{L^4} \le C\|e\|_1, (10.4.18), and \|\psi\|_2 \le C\|\varphi\|, that

|(e, \varphi)| \le C\big(h^2\|e_1\| + (1 + \|u_n\|_{L^{2\kappa}}^\kappa + \|u_n^h\|_{L^{2\kappa}}^\kappa)\|e\|_1^2\big)\|\varphi\|.

Thus, by the definition of the negative norm and the Holder inequality, we have, for any 0 \le r \le 1,

E[\|e\|_{-r}^2] \le C h^4 E[\|e_1\|^2] + E[\|e\|_1^4(1 + \|u_n\|_{L^{2\kappa}}^{2\kappa} + \|u_n^h\|_{L^{2\kappa}}^{2\kappa})]
 \le C h^4 E[\|e_1\|^2] + C(E[\|e\|_1^{4(1+\epsilon)}])^{1/(1+\epsilon)}\big(1 + (E[\|u_n\|_{L^{2\kappa}}^{2\kappa(1+1/\epsilon)}])^{\epsilon/(1+\epsilon)} + (E[\|u_n^h\|_{L^{2\kappa}}^{2\kappa(1+1/\epsilon)}])^{\epsilon/(1+\epsilon)}\big). \qquad (10.4.19)

Then by Lemma 10.4.4, (10.4.14), and (10.4.5), we have, for d \le 3 and any 0 \le r \le 1,

(E[\|e\|_{-r}^2])^{1/2} \le C h^2(E[\|e_1\|^2])^{1/2} + C(E[\|e\|_1^{4(1+\epsilon)}])^{1/(2+2\epsilon)} \le C(h^3 n^{d/2} + h^4 n^d),

whence we obtain, by (10.4.14), (10.4.4), and (10.4.6),

\big|E[\|u_n^h\|^2 - \|R_h u_n\|^2]\big| \le E[\|u_n^h - R_h u_n\|\,\|u_n^h + R_h u_n\|]
 \le (E[\|e\|^2])^{1/2}\big((E[\|e\|^2])^{1/2} + 2(E[\|R_h u_n\|^2])^{1/2}\big)
 \le C(h^4 n^d + h^3 n^{d/2}), \qquad (10.4.20)

where we applied (10.4.5) and (10.4.16). Then by the triangle inequality, Theorem 10.3.2, (10.4.16), and (10.4.20), we reach the conclusion.

Remark 10.4.5 When f is linear, we can improve the error bound for \big|E[\|u_n^h\|^2 - \|R_h u_n\|^2]\big|, as can be checked from the proof. However, the conclusion on the convergence order will not change, since the error (10.4.16) is dominant in the total error.


10.5 Numerical results

In this section, we present some numerical results for the piecewise linear finite element approximation of one- and two-dimensional elliptic equations (10.1.1), with the spatial Brownian motion approximated by its spectral truncation (10.2.7).

To compute the expectations, we use Monte Carlo sampling for both problems with the Mersenne Twister random number generator (seed 100). The experiments were performed using Matlab R2012a on a Macintosh desktop computer with an Intel Xeon E5462 CPU (quad-core, 2.80 GHz). A fixed-point iteration with tolerance h^2/100 was used to solve the nonlinear algebraic equations at each step of the implicit schemes.

Example 10.5.1 (One-dimensional elliptic equation)

-\partial_x^2 u = \frac{1}{2}u + \sigma\partial_x W(x), \quad x \in D = (0, 2), \qquad (10.5.1)

with zero Dirichlet boundary conditions.

In this example, we truncate the Brownian motion as W(x) = \sum_{k=1}^{n}\int_0^x m_k(y)\,dy\,\xi_k, where we use the cosine basis of L^2(D):

m_1(x) = \frac{1}{\sqrt{|D|}}, \quad m_k(x) = \sqrt{\frac{2}{|D|}}\cos\Big(\frac{(k-1)\pi}{|D|}x\Big), \quad k \ge 2.
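For illustration, sample paths of this truncated Brownian motion can be generated directly from the closed-form antiderivatives of the cosine basis above; a minimal sketch (the truncation level and plotting grid are illustrative choices, not from the text):

```python
import numpy as np

# Sample paths of the truncated Brownian motion W(x) = sum_{k=1}^n int_0^x m_k(y) dy * xi_k
# on D = (0, 2), using closed-form antiderivatives of the cosine basis m_k.
rng = np.random.default_rng(0)
L, n = 2.0, 64                      # domain length and truncation level (illustrative)
x = np.linspace(0.0, L, 401)

def int_m(k, x):
    """Antiderivative int_0^x m_k(y) dy of the cosine basis."""
    if k == 1:
        return x / np.sqrt(L)
    return np.sqrt(2.0 * L) / ((k - 1) * np.pi) * np.sin((k - 1) * np.pi * x / L)

xi = rng.standard_normal(n)
W = sum(xi[k - 1] * int_m(k, x) for k in range(1, n + 1))
print(W[0], W.shape)                # W(0) = 0 by construction
```

Note that only the first mode contributes at x = |D|, since the sine antiderivatives of the other modes vanish there.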

The errors are measured in the weak sense:

\rho_{r2} = \frac{\big|E[\|u_n^h\|^2] - E[\|u_{\rm ref}\|^2]\big|}{E[\|u_{\rm ref}\|^2]}, \qquad (10.5.2)

where \|v\| is the L^2 norm in physical space and n = 2/h in this example. We take \sigma = 1 and obtained the value E[\|u_{\rm ref}\|^2] = 0.2731183 (up to 7 digits) analytically as in Example 10.1.1.

In Table 10.1, we observe that the weak convergence of the finite element method is of second order, in agreement with Theorem 10.4.3. We use 4\times 10^8 Monte Carlo sample paths to obtain the numerical solution. The numbers after "\pm" are the statistical errors with the 95% confidence interval.

Example 10.5.2 (Two-dimensional elliptic equation)

-\Delta u + \sin(u) = \sigma\frac{\partial^2}{\partial x_1\partial x_2}W(x), \quad x \in D = (0,1)\times(0,1), \qquad (10.5.3)

with zero Dirichlet boundary conditions.


Table 10.1. Weak convergence of piecewise linear finite element methods for the one-dimensional problem (10.5.1) with a spectral approximation of white noise (10.2.7) using n = 2/h.

# Elements   n    \rho_{r2}                        Order
4            4    3.4237×10^-2 ± 4.08×10^-5        --
8            8    1.0658×10^-2 ± 3.77×10^-5        1.68
16           16   2.7521×10^-3 ± 3.69×10^-5        1.95
32           32   7.2822×10^-4 ± 3.67×10^-5        1.92

In this example, we test the weak convergence of the piecewise linear finite element (rectangular element) approximation of (10.5.3) with different noise magnitudes. The errors are measured in the following weak sense:

\rho_{r1} = \frac{\big|\|(E[u_n^h])^2\| - \|(E[u_{2n}^{h/2}])^2\|\big|}{\|(E[u_{2n}^{h/2}])^2\|}, \qquad \rho_{r2} = \frac{\big|E[\|u_n^h\|^2] - E[\|u_{2n}^{h/2}\|^2]\big|}{E[\|u_{2n}^{h/2}\|^2]}.

When 32\times 32 elements are used, we employ 2\times 10^5 Monte Carlo sample paths and obtain:

for \sigma = 0.5, \|(E[u_{32}^{\sqrt{2}/32}])^2\| = 0.22861 \pm 2.3\times 10^{-4} and E[\|u_{32}^{\sqrt{2}/32}\|^2] = 0.22965 \pm 4.5\times 10^{-4};

for \sigma = 1.0, \|(E[u_{32}^{\sqrt{2}/32}])^2\| = 0.22861 \pm 4.7\times 10^{-4} and E[\|u_{32}^{\sqrt{2}/32}\|^2] = 0.23278 \pm 9.0\times 10^{-4}.

In Table 10.2, we observe second-order convergence of the piecewise linear approximation (10.2.10) for the two-dimensional semilinear problem (10.5.3), which is consistent with our theoretical prediction in Theorem 10.4.3.

10.6 Summary and bibliographic notes

For spatial noise, we can expect higher-order convergence from the spectral approximation of Brownian motion. When an explicit spectral representation of Brownian motion is available, we can use the spectral truncation: its convergence is at least as good as that of the piecewise linear approximation of Brownian motion, and better in the weak sense. With the spectral approximation of Brownian motion, we observe the following:

• For semilinear elliptic equations with additive noise, the weak convergence rate is twice the strong convergence rate if only the white noise (Brownian motion) is truncated; see Theorems 10.3.1 and 10.3.2. This is also true for other PDEs; see Chapter 10.3.1.


Table 10.2. Weak convergence of piecewise linear finite element approximation of the two-dimensional semilinear problem (10.5.3) with a spectral approximation of white noise (10.2.7) using n = \sqrt{2}/h.

\sigma   # MC    # Elements   \rho_{r1}                  Order   \rho_{r2}                  Order   CPU time (s.)
0.5      10^3    4            1.831×10^-2 ± 3.2×10^-3    --      1.800×10^-2 ± 6.5×10^-3    --      0.1
0.5      10^4    8            4.201×10^-3 ± 5.3×10^-4    2.12    4.172×10^-3 ± 1.0×10^-3    2.11    2.2
0.5      10^4    16×16        1.113×10^-3 ± 3.7×10^-4    1.92    1.121×10^-3 ± 7.1×10^-4    1.90    80.1
1.0      10^3    4            1.779×10^-2 ± 4.5×10^-3    --      1.662×10^-2 ± 9.2×10^-3    --      0.1
1.0      10^5    8            4.281×10^-3 ± 6.7×10^-4    2.05    4.177×10^-3 ± 1.3×10^-3    1.99    71.6
1.0      10^5    16×16        1.231×10^-3 ± 4.7×10^-4    1.80    1.255×10^-3 ± 9.0×10^-4    1.73    191.2


• For finite element discretization of semilinear elliptic equations with additive noise, the strong convergence order of the finite element approximation is h^{2-d/2} (Theorem 10.4.1) and the weak convergence order is h^{\min(4-d,2)} (Theorem 10.4.3).

• When the solutions are smooth in random space, e.g., for a fourth-order equation with additive noise, a spectral approximation of the white noise can lead to a higher convergence order in both the strong and the weak sense.

In this chapter, we considered nonlinear elliptic equations with additive noise, where no stochastic products are involved. In the next chapter, we will consider elliptic equations with multiplicative noise, where the stochastic products have to be carefully defined.

Bibliographic notes. Investigating the benchmark problem (10.1.1) is helpful for better understanding the influence of discretizing the Brownian motion/sheet, as well as more complex noises, in the context of approximating stochastic partial differential equations. For example, when higher-dimensional white noise is considered, as is the case for space-time white noise (see, e.g., [249]), we can combine one of the above approximation methods in each dimension and thus obtain different approximations of white noise. It is then crucial to understand the performance of different approximation methods in a simple case such as problem (10.1.1).

Piecewise constant approximation of white noise has been used for many problems with white noise since its use in linear elliptic equations [6]; see, e.g., [64, 66] for nonlinear elliptic problems, [67, 68] for Helmholtz equations, [118] for linear elliptic equations with additive color noise, [490] for the heat equation with additive space-time noise, [464] for a reaction-diffusion equation with space-time white noise as the coefficient of the nonlinear reaction term, [491] for space-time noise (color in space and white in time), and [256] for Allen-Cahn equations with additive space-time white noise.

Spectral approximation can be and has been considered for elliptic equations with multiplicative noise; see, e.g., linear elliptic equations with lognormal diffusivity [73, 74, 140, 141] and with white noise diffusivity [469].

Disadvantage of spectral approximation. The spectral approximation may not be explicitly known: the eigenfunctions of the leading operator may not be easily found, and the L^2-CONS in the spectral approximation may not be explicitly expressed even when the domains have arbitrary but smooth boundary curves; cf. the last paragraph in Chapter 10.2.

It is crucial to assume that the nonlinear term f satisfies Assumption 10.2.1, as in [48, 64, 194], especially the monotone condition (10.2.1). The monotone condition (10.2.1) allows a large class of nonlinear terms such as f(x) = x(1 - x^2) (e.g., in the Allen-Cahn equation).


10.7 Suggested practice

In the following problems, \{e_k\}_{k=1}^{\infty} is a CONS of L^2(D), where D is the domain considered, and the \xi_k's are i.i.d. standard Gaussian random variables. Let

\partial_x W_Q(x) = \sum_{k=1}^{\infty}\sqrt{q_k}\,e_k(x)\xi_k, \quad q_k \ge 0.

Assume that either a) q_k = 1/k^2 for every integer k \ge 1, or b) q_k = 1/k^4 for every integer k \ge 1.

Exercise 10.7.1 Consider the following stochastic elliptic equation with additive noise:

-\partial_x^2 u + bu(x) = g(x) + \partial_x W_Q(x), \quad x \in (0,1), \qquad u(0) = u(1) = 0.

Here g is smooth enough, say g(x) = \sin(x), and b > 0. Derive a regularity estimate for u similar to that in Example 10.1.1, where q_k = 1 for every integer k \ge 1.

Exercise 10.7.2 Consider the following nonlinear stochastic elliptic equation with additive noise:

-\partial_x^2 u + f(u) = g(x) + \partial_x W_Q(x), \quad x \in (0,1), \qquad u(0) = u(1) = 0.

Assume that f satisfies Assumption 10.2.1. Derive a regularity estimate for u as in Theorem 10.3.8.

Exercise 10.7.3 Consider the following one-dimensional linear stochastic elliptic equation:

-\partial_x^2 u = \frac{1}{2}u + \partial_x W_Q(x), \quad x \in D = (0,2), \qquad (10.7.1)

with zero Dirichlet boundary conditions. Numerically check the mean-square convergence order when q_k = 1/k^2 for every integer k \ge 1.

Hint. See Example 10.5.1.

Exercise 10.7.4 Consider the following one-dimensional elliptic equation:

-\partial_x^2 u = \sin(u) + \partial_x W_Q(x), \quad x \in D = (0,2), \qquad (10.7.2)

with zero Dirichlet boundary conditions. Numerically check the mean-square convergence order when q_k = 1/k^2 for every integer k \ge 1.


11 Multiplicative white noise: The Wick-Malliavin approximation

In this chapter, we consider the Wiener chaos expansion (WCE) for elliptic equations with multiplicative noise. Unlike the stochastic collocation methods (SCM), a direct application of WCE leads to a fully coupled linear system. To sparsify the resulting linear system, we present WCE with the use of the Ito-Wick product and an approximation/reduction technique called the Wick-Malliavin approximation. Specifically, we consider the Wick-Malliavin approximation for elliptic equations with lognormal coefficients and use the Wick product for elliptic equations with spatial white noise as coefficients. Numerical results demonstrate that high-order Wick-Malliavin approximation is efficient even when the noise intensity is relatively large.

The Wick-Malliavin approximation can be used as a reduction method for WCE of nonlinear problems. It can significantly reduce the computational cost while maintaining the high accuracy of WCE. Moreover, the Wick-Malliavin approximation can be applied to SPDEs with non-Gaussian white noise. For stochastic collocation methods (SCM), such an approximation is not available, since the Wick-Malliavin approximation is based on the Wiener chaos expansion or, more generally, the polynomial chaos expansion.

11.1 Introduction

Consider WCE with the Wick-Malliavin approximation for the following stochastic elliptic equation with multiplicative noise:

-{\rm div}(a(x,\omega)\nabla u(x,\omega)) = f(x),\ x \in D, \qquad u(x) = 0,\ x \in \partial D, \qquad (11.1.1)

where D is a domain with Lipschitz boundary, a(x,\omega) is a random field, and f is either random or deterministic. Many methods have been proposed for (11.1.1) with different assumptions on a(x,\omega); see the bibliographic notes at the end of this chapter. Here we consider WCE for lognormal a(x,\omega) and for white noise a(x,\omega).

© Springer International Publishing AG 2017. Z. Zhang, G.E. Karniadakis, Numerical Methods for Stochastic Partial Differential Equations with White Noise, Applied Mathematical Sciences 196, DOI 10.1007/978-3-319-57511-7_11

WCE for (11.1.1) leads to a fully coupled linear system of deterministic elliptic equations. To reduce the computational cost of WCE, the product in (11.1.1) between a(x,\omega) and \nabla u has been replaced with the Ito-Wick product; see, e.g., [320, 466]:

-{\rm div}(a \diamond \nabla u) = f(x,\omega),\ x \in D, \qquad u(x) = 0,\ x \in \partial D. \qquad (11.1.2)

The definition of the Ito-Wick product "\diamond" can be found in Chapter 2.3. The application of WCE to (11.1.2) results in a weakly coupled linear system which is lower triangular. One feature of (11.1.2) is that the mean field of the solution, E[u], satisfies the deterministic equation

-{\rm div}(E[a]\nabla E[u]) = f(x), \qquad E[u](x)|_{\partial D} = 0.

However, the use of the Wick product can compromise the existence of the solution in L^2(\Omega,\mathcal{F},P), and some weighted space has to be used; see Chapter 11.4. The solution to (11.1.2) can be treated as an approximation of (11.1.1) of order two (in the intensity of the noise) in the mean-square sense. In Chapter 11.3, we consider a higher-order approximation using the recently developed Wick-Malliavin approximation in [346, 470],

-\nabla\cdot\Big(a \diamond \nabla u + \sum_{q=1}^{Q}\frac{1}{q!}D^q a \diamond D^q\nabla u\Big) = f(x),\ x \in D, \qquad u(x) = 0,\ x \in \partial D, \qquad (11.1.3)

where D is the Malliavin derivative, which will be defined shortly. When Q = 0, (11.1.3) becomes (11.1.2).

This chapter is organized as follows. First, we introduce the Wick-Malliavin approximation in Chapter 11.2 and apply this approximation to elliptic equations with lognormal coefficients in Chapter 11.3. In Chapter 11.4, we discuss the case when a is a spatial white noise and Q = 0, and its finite element approximation. We then present the Wick-Malliavin approximation for nonlinear equations with a Gaussian random forcing in Chapter 11.5 and for nonlinear equations with a non-Gaussian random forcing in Chapter 11.6. Numerical results are presented in most of the sections. At the end of the chapter, we summarize the conclusions of this chapter and present a brief review of numerical methods for elliptic equations with random coefficients, including different treatments of the stochastic products between the random coefficient and the gradient of the solution. A review of convergence rates of WCE is also presented. Some exercises are provided to enhance the reader's understanding of generalized Wiener chaos methods and the Wick-Malliavin approximation.


11.2 Approximation using the Wick-Malliavin expansion

The Wick-Malliavin approximation states that the product of two square-integrable random variables can be approximated as in Taylor's expansion; see Theorem 11.2.1.

Let us introduce the Malliavin derivative. Consider a spatial white noise W_\phi = \sum_{k=1}^{\infty}\phi_k(x)\xi_k on the probability space (\Omega,\mathcal{F},P), taking values in the Hilbert space U = L^2(D), where \{\phi_k\}_{k=1}^{\infty} is a CONS of U. Here \xi_k = W_{\phi_k} = \int_D \phi_k(x)\,dW(x) are i.i.d. standard Gaussian random variables and E[W_\phi W_\psi] = (\phi,\psi)_U, with (\cdot,\cdot)_U the inner product associated with the Hilbert space U.

Denote by D the Malliavin derivative with respect to W_\phi on L^2(\Omega,\mathcal{F},P), which can be defined as a directional derivative:

D\xi_\alpha = \sum_{k\ge 1}\sqrt{\alpha_k}\,\phi_k\,\xi_{\alpha-\epsilon_k} = \sum_{\beta\in\mathcal{J}}\Big(\sum_{|\gamma|=1}1_{\beta+\gamma=\alpha}\frac{\sqrt{\alpha!}}{\sqrt{\beta!}}\prod_k(\phi_k)^{\gamma_k}\Big)\xi_\beta \in L^2((\Omega,\mathcal{F},P);U), \qquad (11.2.1)

where \epsilon_k \in \mathcal{J} has only one nonzero component: (\epsilon_k)_k = 1 and (\epsilon_k)_j = 0 for any j \neq k. The higher-order Malliavin derivative can be defined recursively as

D^n\xi_\alpha = \sum_{\beta\in\mathcal{J}}\Big(\sum_{|\gamma|=n}1_{\beta+\gamma=\alpha}\frac{\sqrt{\alpha!}}{\sqrt{\beta!}}\,\phi^\gamma\Big)\xi_\beta \in L^2((\Omega,\mathcal{F},P);U^{\otimes n}), \qquad (11.2.2)

where \phi^\gamma = \sum_{k_1,\cdots,k_n}\sum_{\epsilon_{k_1}+\cdots+\epsilon_{k_n}=\gamma}\phi_{k_1}\otimes\cdots\otimes\phi_{k_n} \in U^{\otimes n}. For example, for n = 2,

D^2\xi_\alpha = D(D\xi_\alpha) = \sum_{|\gamma|=1,\ \beta=\alpha-\gamma}\frac{\sqrt{\alpha!}}{\sqrt{\beta!}}\,D\xi_\beta\prod_k(\phi_k)^{\gamma_k}
 = \sum_{|\gamma|=1,\ \beta=\alpha-\gamma}\frac{\sqrt{\alpha!}}{\sqrt{\beta!}}\Big(\sum_{|\eta|=1,\ \theta=\beta-\eta}\frac{\sqrt{\beta!}}{\sqrt{\theta!}}\,\xi_\theta\prod_l(\phi_l)^{\eta_l}\Big)\otimes\prod_k(\phi_k)^{\gamma_k}
 = \sum_{|\gamma|=1,\ |\eta|=1,\ \theta=\alpha-\gamma-\eta}\frac{\sqrt{\alpha!}}{\sqrt{\theta!}}\,\xi_\theta\prod_l(\phi_l)^{\eta_l}\otimes\prod_k(\phi_k)^{\gamma_k}.

This can be written in the form of (11.2.2) as

D^2\xi_\alpha = \sum_{\theta\in\mathcal{J}}\Big(\sum_{|\zeta|=2}1_{\zeta+\theta=\alpha}\frac{\sqrt{\alpha!}}{\sqrt{\theta!}}\,\phi^\zeta\Big)\xi_\theta.
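In the one-dimensional case (a single Gaussian \xi with a single basis function), (11.2.1) reduces to D(H_k(\xi)/\sqrt{k!}) = \sqrt{k}\,H_{k-1}(\xi)/\sqrt{(k-1)!}, i.e., the chain rule He_k' = k\,He_{k-1} for probabilists' Hermite polynomials. A minimal numerical check using NumPy's HermiteE utilities (the degree k is an illustrative choice):

```python
import numpy as np
from numpy.polynomial.hermite_e import hermeder
from math import sqrt, factorial

# 1D check of (11.2.1): the derivative of He_k/sqrt(k!) equals
# sqrt(k)*He_{k-1}/sqrt((k-1)!), since He_k' = k He_{k-1}.
k = 5
c = np.zeros(k + 1); c[k] = 1.0 / sqrt(factorial(k))   # coefficients of He_k/sqrt(k!)
lhs = hermeder(c)                                      # derivative, in the HermiteE basis
rhs = np.zeros(k); rhs[k - 1] = sqrt(k) / sqrt(factorial(k - 1))
assert np.allclose(lhs, rhs)
```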

By the definition of the Malliavin derivative on the Cameron-Martin basis, it can be readily checked that for two square-integrable random variables u and v,

D(u \diamond v) = Du \diamond v + u \diamond Dv. \qquad (11.2.3)


Theorem 11.2.1 (Mikulevicius-Rozovsky formula, [346]) For elements of the Cameron-Martin basis \xi_\alpha and \xi_\beta, the following relation holds with probability 1:

\xi_\alpha\xi_\beta = \xi_\alpha \diamond \xi_\beta + \sum_{q=1}^{\infty}\frac{D^q\xi_\alpha \diamond D^q\xi_\beta}{q!}. \qquad (11.2.4)

Moreover, the relation can be extended to any square-integrable random variables X and Y, i.e.,

XY = X \diamond Y + \sum_{q=1}^{\infty}\frac{D^q X \diamond D^q Y}{q!}. \qquad (11.2.5)

The theorem is a combination of Proposition 4 and Remark 11 in [346]. The proof of this theorem is based on the following two observations:

• the linearization coefficients of the product,

\xi_\alpha\xi_\beta = \sum_{\gamma\le\alpha\wedge\beta}B(\alpha,\beta,\gamma)\,\xi_{\alpha+\beta-2\gamma}; \qquad (11.2.6)

• the identity

D^n\xi_\alpha \diamond D^n\xi_\beta = n!\sum_{|\gamma|=n,\ \gamma\le\alpha\wedge\beta}B(\alpha,\beta,\gamma)\,\xi_{\alpha+\beta-2\gamma}, \qquad (11.2.7)

or its equivalent form

D^n\xi_\alpha \diamond D^n\xi_\beta = n!\sum_{|\gamma|=n}\sum_{\theta\le\gamma}1_{\gamma+\gamma-\theta=\beta}\,1_{\gamma+\theta=\alpha}\,B(\alpha,\beta,\gamma)\,\xi_{\alpha+\beta-2\gamma}, \qquad (11.2.8)

where

B(\alpha,\beta,\gamma) = \Big(\binom{\alpha}{\gamma}\binom{\beta}{\gamma}\binom{\alpha+\beta-2\gamma}{\alpha-\gamma}\Big)^{\frac{1}{2}}. \qquad (11.2.9)

Proof. Assume that u = \sum_{\kappa\in\mathcal{J}}u_\kappa\xi_\kappa and v = \sum_{\kappa\in\mathcal{J}}v_\kappa\xi_\kappa. We have

uv = \sum_{\alpha\in\mathcal{J}}\sum_{\theta,\kappa\in\mathcal{J}}u_\theta v_\kappa E[\xi_\theta\xi_\kappa\xi_\alpha]\,\xi_\alpha
 = \sum_{\alpha\in\mathcal{J}}\sum_{\theta,\kappa\in\mathcal{J}}u_\theta v_\kappa\sum_{\gamma\le\theta\wedge\kappa}B(\theta,\kappa,\gamma)E[\xi_{\theta+\kappa-2\gamma}\xi_\alpha]\,\xi_\alpha
 = \sum_{\alpha\in\mathcal{J}}\sum_{\gamma\in\mathcal{J}}\sum_{(0)\le\beta\le\alpha}u_{\beta+\gamma}v_{\alpha-\beta+\gamma}\,\Phi(\alpha,\beta,\gamma)\,\xi_\alpha,

where \Phi(\alpha,\beta,\gamma) = B(\beta+\gamma,\alpha-\beta+\gamma,\gamma) and

\Phi(\alpha,\beta,\gamma) = \Big[\binom{\alpha}{\beta}\binom{\beta+\gamma}{\gamma}\binom{\alpha-\beta+\gamma}{\gamma}\Big]^{1/2}. \qquad (11.2.10)

The product of u and v can then be represented as

uv = u \diamond v + \sum_{Q=1}^{\infty}\sum_{\alpha\in\mathcal{J}}\sum_{\gamma\in\mathcal{J},|\gamma|=Q}\sum_{(0)\le\beta\le\alpha}u_{\beta+\gamma}v_{\alpha-\beta+\gamma}\,\Phi(\alpha,\beta,\gamma)\,\xi_\alpha.


Then by the definition of Malliavin derivatives, (11.2.5) can be derived.
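In one dimension, the linearization identity (11.2.6) with coefficient (11.2.9) can be verified with NumPy's HermiteE polynomial arithmetic. The sketch below checks h_m h_n = \sum_q B(m,n,q)\,h_{m+n-2q} for the normalized probabilists' Hermite polynomials h_m = He_m/\sqrt{m!} (the degrees m, n are illustrative choices):

```python
import numpy as np
from numpy.polynomial import hermite_e as He
from math import comb, factorial, sqrt

def B(m, n, q):
    # the linearization coefficient (11.2.9) in one dimension
    return sqrt(comb(m, q) * comb(n, q) * comb(m + n - 2 * q, m - q))

def h_coeffs(m, size):
    # coefficient vector of the normalized Hermite polynomial He_m/sqrt(m!)
    c = np.zeros(size); c[m] = 1.0 / sqrt(factorial(m))
    return c

m, n = 4, 3
size = m + n + 1
prod = He.hermemul(h_coeffs(m, m + 1), h_coeffs(n, n + 1))   # h_m * h_n in the HermiteE basis
rhs = np.zeros(size)
for q in range(min(m, n) + 1):
    rhs += B(m, n, q) * h_coeffs(m + n - 2 * q, size)
assert np.allclose(prod, rhs)
```

The check is purely algebraic: both sides are expanded in the HermiteE coefficient basis and compared term by term.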

By the Mikulevicius-Rozovsky formula (Theorem 11.2.1) and the fact that D^n\xi_\alpha = 0 if |\alpha| < n, we have the following conclusion.

Corollary 11.2.2 For elements of the Cameron-Martin basis \xi_\theta and \xi_\kappa, the following relation holds with probability 1:

\xi_\theta\xi_\kappa = \xi_\theta \diamond \xi_\kappa + \sum_{q=1}^{Q}\frac{D^q\xi_\theta \diamond D^q\xi_\kappa}{q!}, \qquad (11.2.11)

where Q = \min(|\theta|,|\kappa|).

Now let us look at Equation (11.1.1) again. Assume that a(x,\omega) and \nabla u are square-integrable in random space. Then by Theorem 11.2.1, Equation (11.1.1) can be rewritten as

-\nabla\cdot\Big(a \diamond \nabla u + \sum_{q=1}^{\infty}\frac{1}{q!}D^q a \diamond D^q\nabla u\Big) = f(x),\ x \in D, \qquad u(x) = 0,\ x \in \partial D. \qquad (11.2.12)

Here the above assumption can be satisfied when a(x,\omega) is smooth enough; see, e.g., [73, 74] and Chapter 11.3.

11.3 Lognormal coefficient

Consider equation (11.1.1) where the coefficient a(x,\omega) is lognormal:

a(x,\omega) = \exp\Big(\sigma\sum_{k=1}^{\infty}\lambda_k^{1/2}\phi_k(x)\xi_k(\omega)\Big), \qquad (11.3.1)

where the \lambda_k are nonnegative real numbers, \phi_k(x) is a CONS of L^2(D), and the \xi_k's are i.i.d. standard Gaussian random variables. This representation can be obtained from the Karhunen-Loeve expansion (Theorem 2.1.5) of \ln(a(x,\omega)), which is a Gaussian field with zero mean and covariance kernel K(x-y), e.g., one of the kernels in Table 2.1. When K(z) is in C^{0,1}(\mathbb{R}_+), a and \ln(a) belong to C^{0,\mu}(D) a.s. for \mu < 1/2. By Mercer's theorem, the correlation function K(x-y) can be expressed as

K(x,y) = \sigma^2\sum_{k=1}^{\infty}\lambda_k\phi_k(x)\phi_k(y), \qquad (11.3.2)

where \{\phi_k(x)\}_{k=1}^{\infty} is a CONS of L^2(D), and

\xi_k(\omega) = \sigma^{-1}\lambda_k^{-1/2}\int_D \ln(a(x,\omega))\phi_k(x)\,dx, \quad k = 1, 2, \cdots.

Theorem 11.3.1 (Existence and uniqueness, [74, Proposition 1]) Assume that D is an open bounded domain in \mathbb{R}^d with C^2 boundary. When f \in L^2(D) and K(z) is in C^{0,1}(\mathbb{R}_+), Equation (11.1.1) has a unique solution in L^q(\Omega,H_0^1(D)).


By the Cameron-Martin theorem, \xi_\alpha = \prod_k \frac{H_{\alpha_k}(\xi_k)}{\sqrt{\alpha_k!}} is a CONS of L^2(\Omega,\mathcal{F},P). Then the lognormal process a(x,\omega) in (11.3.1) can be represented by the following WCE:

a(x,\omega) = e^{\sigma^2/2}\sum_{\alpha\in\mathcal{J}}\frac{\sigma^{|\alpha|}\prod_k\lambda_k^{\alpha_k/2}\phi_k^{\alpha_k}(x)}{\sqrt{\alpha!}}\,\xi_\alpha. \qquad (11.3.3)

Now we truncate a with a_n as

a_n = e^{\sigma^2/2}\exp\Big(\sigma\sum_{k=1}^{n}\lambda_k^{1/2}\phi_k(x)\xi_k\Big) = e^{\sigma^2/2}\sum_{\alpha\in\mathcal{J},\ \alpha_i=0,\ i>n}\frac{\sigma^{|\alpha|}\prod_k\lambda_k^{\alpha_k/2}\phi_k^{\alpha_k}(x)}{\sqrt{\alpha!}}\,\xi_\alpha.

We then have the following stochastic elliptic equation with finitely many random variables:

-{\rm div}(a_n(x)\nabla u_n) = f(x),\ x \in D, \qquad u_n(x) = 0,\ x \in \partial D. \qquad (11.3.4)

For any \mu, \nu with 0 \le \nu < \mu < 1/2 and q \ge 1, we have \|a_n - a\|_{L^q(\Omega,C^{0,\nu}(D))} \le C(R_n^\mu)^{1/2}, where the positive constant C depends only on \mu, \nu, q, and

R_n^\mu = \sum_{k=n+1}^{\infty}\lambda_k\|\phi_k\|_{C^{0,\mu}}^2

is well defined and convergent for all 0 \le \mu \le 1/2.

Theorem 11.3.2 ([74, Theorem 2.8]) For f \in L^p(D), p \ge d, and 0 < \nu < \min(\frac{1}{2}, 1 - \frac{d}{p}), the strong convergence order is

E[\|u - u_n\|_{C^{1,\nu}(D)}^q] \le C(R_n^\nu)^{q/2}, \quad q \ge 1. \qquad (11.3.5)

Assume also that \psi \in C^6(\mathbb{R}) and that \psi and its derivatives have at most polynomial growth. Then the weak convergence holds:

\|E[\psi(u)] - E[\psi(u_n)]\|_{C^{1,\nu}(D)} \le C R_n^\nu,

where the positive constant C depends only on \beta, p, f, and \psi.

Remark 11.3.3 Many important processes admit a well-defined R_n^\mu, such as Brownian motion and Gaussian processes with exponential kernel. For (spatial) Brownian motion over (0, L),

W(x) = \sum_{k=1}^{\infty}\int_0^x e_k(y)\,dy\,\xi_k, \quad e_1(x) = \sqrt{\frac{1}{L}}, \quad e_k(x) = \sqrt{\frac{2}{L}}\cos\Big(\frac{(k-1)\pi x}{L}\Big),\ k \ge 2,

we have \lambda_k \sim O(\frac{1}{k^2}) and \|\phi_k\|_{C^{0,\nu}([0,L])} \sim O(k^\nu), and thus R_n^\nu \sim O(n^{2\nu-1}). A similar conclusion holds for a one-dimensional Gaussian process with exponential kernel K(z) = \exp(-\frac{|z|}{l_c}), where l_c is called the correlation length; see [74].
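The decay rate R_n^\nu \sim O(n^{2\nu-1}) can be checked by summing the tail directly; a minimal sketch (the exponent \nu, the test values of n, and the finite cutoff of the infinite sum are all illustrative choices):

```python
import numpy as np

# numeric check that R_n^nu ~ n^{2*nu - 1} when lambda_k ~ 1/k^2 and
# ||phi_k||_{C^{0,nu}} ~ k^nu, i.e., the tail sum of k^{2*nu - 2}
nu = 0.25
def R(n, K=200000):                     # truncated tail sum; K is an illustrative cutoff
    k = np.arange(n + 1, K, dtype=float)
    return np.sum(k ** (2 * nu - 2))

r1, r2 = R(100), R(200)
print(np.log2(r1 / r2))                 # close to 1 - 2*nu = 0.5
```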


The Wick-Malliavin approximation of Q-th order to (11.3.4) is

-{\rm div}\Big(a_n \diamond \nabla u_{n,Q} + \sum_{q=1}^{Q}\frac{1}{q!}D^q a_n \diamond D^q\nabla u_{n,Q}\Big) = f(x),\ x \in D, \qquad (11.3.6)

where the boundary condition is u_{n,Q}|_{\partial D} = 0.

Theorem 11.3.4 Let u_n be the solution to (11.3.4) and u_{n,Q} be the solution to (11.3.6). Then there exists a positive constant C such that

E[\|u_{n,Q} - u_n\|_{L^2(D)}^2] \le C(C_1\sigma)^{2(Q+1)}, \qquad (11.3.7)

where C depends on n, on proper norms of u_n, u_{n,Q}, a, a_n, and f, and on \lambda_k^{1/2}\phi_k (k = 1, 2, \cdots, n), but is independent of \sigma and Q. Here C_1 is a constant depending only on f, a, and a_n.

The conclusion can be proved similarly to [470, Lemma 4.2]. We note that the constant C in (11.3.7) depends on n and may blow up when n \to \infty. To have an estimate (11.3.7) in which the constant does not depend on n, it is required that the \lambda_k decay fast enough, e.g., \lambda_1 = 1 and \lambda_k = 0 for k \ge 2.

11.3.1 One-dimensional example

Consider the following test model:

-\partial_x(a(\xi)\partial_x u) = 1, \qquad (11.3.8)

where a(\xi) = e^{\sigma\xi-\sigma^2/2} = \sum_{k=0}^{\infty}\frac{\sigma^k H_k(\xi)}{k!} and \xi is a standard Gaussian random variable.

Suppose that u_N = \sum_{k=0}^{N}u_k(x)\frac{H_k(\xi)}{\sqrt{k!}}. The full discretization by WCE yields a solution u_N satisfying

-\sum_{k=0}^{N}\partial_x^2 u_k(x)\,E\Big[a(\xi)\frac{H_k(\xi)}{\sqrt{k!}}\frac{H_l(\xi)}{\sqrt{l!}}\Big] = E\Big[\frac{H_l(\xi)}{\sqrt{l!}}\Big], \quad l = 0, 1, \cdots, N.

In matrix form, we have

Su(x) = (1, 0, · · · , 0)�,

where u(x) = (u0(x), · · · , uN(x))� and S is a full matrix of size (N + 1) ×

(N+ 1). In fact,

S_{l,k} = E\Big[a(\xi)\frac{H_k(\xi)}{\sqrt{k!}}\frac{H_l(\xi)}{\sqrt{l!}}\Big] = \sum_{n=0}^{\infty} \frac{\sigma^n}{\sqrt{n!}}\, E\Big[\frac{H_n(\xi)}{\sqrt{n!}}\frac{H_k(\xi)}{\sqrt{k!}}\frac{H_l(\xi)}{\sqrt{l!}}\Big]
= \sum_{n=0}^{\infty} \frac{\sigma^n}{\sqrt{n!}}\, E\Big[\sum_{q\le k\wedge n} B(k,n,q)\frac{H_{n+k-2q}(\xi)}{\sqrt{(n+k-2q)!}}\frac{H_l(\xi)}{\sqrt{l!}}\Big]
= \sum_{n=0}^{\infty} \frac{\sigma^n}{\sqrt{n!}} \sum_{q\le k\wedge n} B(k,n,q)\,\delta_{n+k-2q,l}.
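The series for S_{l,k} can also be evaluated directly with the triple-product identity for probabilists' Hermite polynomials, E[H_n H_k H_l] = n!\,k!\,l!/\big((s-n)!(s-k)!(s-l)!\big), s = (n+k+l)/2, which vanishes unless s is an integer with s \ge \max(n,k,l); this identity is equivalent to the B(k,n,q) form above. A small sketch (the series cutoff nmax is an arbitrary choice; the series converges fast for \sigma < 1):

```python
import math

def S_entry(l, k, sigma, nmax=60):
    """S_{l,k} = E[a(xi) H_k(xi)/sqrt(k!) * H_l(xi)/sqrt(l!)] for the lognormal
    coefficient a(xi) = exp(sigma*xi - sigma^2/2) = sum_n sigma^n H_n(xi)/n!,
    evaluated with the Hermite triple-product formula."""
    total = 0.0
    for n in range(nmax):
        if (n + k + l) % 2:          # triple product vanishes for odd n+k+l
            continue
        s = (n + k + l) // 2
        if s < max(n, k, l):
            continue
        triple = (math.factorial(n) * math.factorial(k) * math.factorial(l)
                  // (math.factorial(s - n) * math.factorial(s - k) * math.factorial(s - l)))
        total += sigma ** n / math.factorial(n) * triple
    return total / math.sqrt(math.factorial(k) * math.factorial(l))
```

In particular, S_{l,0} = \sigma^l/\sqrt{l!}, and S is symmetric.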


11 Multiplicative white noise: The Wick-Malliavin approximation

The Q-th order Wick-Malliavin approximation (11.3.6) leads to

S^{(Q)}\,u(x) = (1, 0, \cdots, 0)^\top,

where S^{(Q)}_{l,k} = 0 if l < k - 2Q, and the entries can be calculated from the Mikulevicius-Rozovsky formula (Theorem 11.2.1):

S^{(Q)}_{l,k} = E\Big[\Big( a(\xi) \diamond \frac{H_k(\xi)}{\sqrt{k!}} + \sum_{q=1}^{Q} \frac{1}{q!}\, D^q a \diamond D^q\Big(\frac{H_k(\xi)}{\sqrt{k!}}\Big) \Big) \frac{H_l(\xi)}{\sqrt{l!}} \Big]
= \sum_{n=0}^{\infty} \frac{\sigma^n}{\sqrt{n!}} \sum_{q\le k\wedge n,\; q\le Q} B(k,n,q)\,\delta_{n+k-2q,l}.

It can be readily seen that S = S^{(N)}. Compared to S, S^{(Q)} can be sparse, especially when Q \ll N. For example, S^{(0)} is lower triangular, since S^{(0)}_{l,k} = 0 if 0 \le l < k \le N.

Now we present some numerical results for different Q when N is fixed. Here we take N = 30 (N = 40 leads to similar results), and we measure the errors by

\Big( \sum_{n=0}^{N} \|u_n - u_n^{N,Q}\|^2 \Big)^{1/2},

which is a discrete analogue of \big(E[\|u - u^{(Q)}\|^2]\big)^{1/2}.

In Figure 11.1, we observe that the errors of the Wick-Malliavin approximation decrease as the level Q increases. When the noise magnitude is small, e.g., \sigma = 0.1, level Q = 4 can lead to an accuracy of 10^{-7}. However, when the noise magnitude is large, e.g., \sigma = 0.5, we only achieve an accuracy of 10^{-2} with Q = 4. Moreover, when the noise magnitude is even larger, e.g., \sigma = 0.65, we need a large level Q: for Q = 1, the mean-square error is larger than 1, and to have a reasonable accuracy we need Q = 5, where the error is around 10^{-1}.
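The sparsity pattern of S^{(Q)} is easy to reproduce numerically. The sketch below assembles S^{(Q)} from the delta formula above; the explicit expression B(k,n,q) = \binom{k}{q}\binom{n}{q}\,q!\,\sqrt{(k+n-2q)!/(k!\,n!)} used here is an assumption (it is consistent with (11.6.10); the definition (11.2.9) is not restated in this section):

```python
import math

def B(k, n, q):
    # assumed linearization coefficient for normalized Hermite polynomials
    return (math.comb(k, q) * math.comb(n, q) * math.factorial(q)
            * math.sqrt(math.factorial(k + n - 2*q)
                        / (math.factorial(k) * math.factorial(n))))

def SQ(N, Q, sigma):
    """(N+1)x(N+1) Wick-Malliavin matrix: S^(Q)_{l,k} = sum_n sigma^n/sqrt(n!)
    * sum_{q <= min(k,n,Q)} B(k,n,q) * delta_{n+k-2q,l}; only n = l-k+2q survives."""
    S = [[0.0] * (N + 1) for _ in range(N + 1)]
    for l in range(N + 1):
        for k in range(N + 1):
            for q in range(min(k, Q) + 1):
                n = l - k + 2 * q
                if n < 0 or q > n:
                    continue
                S[l][k] += sigma ** n / math.sqrt(math.factorial(n)) * B(k, n, q)
    return S
```

For Q = 0 the matrix is lower triangular; for Q \ge N it reproduces the full Galerkin matrix S.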

Numerical results in this section verify the claim of Theorem 11.3.4. The Wick-Malliavin approximation appears to behave like a perturbation method: it is efficient when the noise magnitude is small and loses efficiency when the noise magnitude is large. However, the Wick-Malliavin approximation can be of high order in Q, and only a small Q is needed for noise of small magnitude. When Q is large, the method can work for noise of larger magnitude, such as \sigma = 0.5, which improves on the results of the perturbation analysis in [39], where \sigma is less than 0.4.

11.4 White noise as coefficient

We discuss WCE with finite element methods for stochastic elliptic equations with spatial white noise, i.e., (11.1.1) with

a(x, \omega) = E[a(x,\cdot)] + \dot{W}(x),


Fig. 11.1. Mean-square errors of the Wick-Malliavin approximation (11.3.6) with different levels Q for (11.3.8) with different noise intensities (\sigma = 0.1, 0.2, 0.5, 0.65).

where \dot{W} = \dot{W}(x) is a centered (zero-mean) Gaussian white noise process in the spatial domain D. Specifically, \dot{W}(x) = \sum_{k=1}^{\infty} m_k(x)\xi_k, where \{m_k(x)\}_{k\ge 1} is a complete orthonormal basis in L^2(D) and the \xi_k are mutually independent standard Gaussian random variables. Considering that the interaction of u and \dot{W} is through the Wick product "\diamond", we can write (11.1.1) as

A u = M u \diamond \dot{W} + f, \quad \text{in } D, \qquad u|_{\partial D} = 0, \qquad (11.4.1)

where A = M = \Delta. The operators A, M can be extended to second-order differential operators of more general form, where A is uniformly elliptic and

M v \diamond \dot{W} = (M v) \diamond \Big(\sum_{k\ge 1} m_k \xi_k\Big) =: \sum_{k\ge 1} (M_k v) \diamond \xi_k,

and M_k satisfies that for all u, v \in H_0^1(D), (M_k u, v) \le M_k \|u\|_1 \|v\|_1.

Given the Cameron-Martin basis \{\xi_\alpha\}_{\alpha\in\mathcal{J}}, where \mathcal{J} is the collection of multi-indices with only finitely many nonzero components, any solution in L^2(\Omega,\mathcal{F},\mathbb{P}) can be represented in the form \sum_{\alpha\in\mathcal{J}} u_\alpha \xi_\alpha. However, the solution u of (11.4.1) belongs to the weighted space [319]

\mathcal{R}L^2(\mathcal{F}; H^1(D)) = \Big\{ u = \sum_{\alpha\in\mathcal{J}} u_\alpha\xi_\alpha \,\Big|\, \sum_{\alpha\in\mathcal{J}} \|u_\alpha\|_{H^1}^2 r_\alpha^2 < \infty \Big\},


instead of lying in L^2(\mathcal{F}; H^1(D)) = \big\{ u = \sum_{\alpha\in\mathcal{J}} u_\alpha\xi_\alpha \,\big|\, \sum_{\alpha\in\mathcal{J}} \|u_\alpha\|_{H^1}^2 < \infty \big\}.

Here, the weights r_\alpha are [319, Theorem 3.1.1]

r_\alpha = \frac{q^\alpha}{\sqrt{|\alpha|!}}, \quad \text{where } q^\alpha = \prod_{k=1}^{\infty} q_k^{\alpha_k}, \qquad (11.4.2)

and the q_k, k \ge 1, are chosen such that \sum_{k\ge 1} q_k^2 C_k^2 < 1, with \|A^{-1}M_k v\|_1 \le C_k \|v\|_1.

With this interpretation of solutions in weighted spaces, we can still plug the representation \sum_{\alpha\in\mathcal{J}} u_\alpha\xi_\alpha into (11.4.1), multiply both sides of the equation by \xi_\alpha, and take expectations to obtain the so-called propagator: for each \alpha\in\mathcal{J},

A u_\alpha = \sum_{k=1}^{\infty} \sqrt{\alpha_k}\, M_k u_{\alpha-\varepsilon_k} + f 1_{|\alpha|=0}, \quad \text{in } D, \qquad u_\alpha|_{\partial D} = 0. \qquad (11.4.3)

Here \varepsilon_k \in \mathcal{J} with |\varepsilon_k| = 1 and (\varepsilon_k)_k = 1, and we use the convention that u_{\alpha-\varepsilon_k} = 0 if \alpha_k = 0. We observe that, with the use of the Wick product, we have transformed the stochastic problem into a weakly coupled system of deterministic equations; otherwise, we would be led to a fully coupled system of deterministic equations.
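To see the weak coupling concretely: if the multi-indices of \mathcal{J}_{N,n} are enumerated by increasing total degree |\alpha|, every right-hand side in (11.4.3) involves only \alpha - \varepsilon_k, which appears earlier in the list, so the system can be solved sequentially. A minimal sketch of the index set and of the weights (11.4.2) (the tie-breaking order within a degree is an arbitrary choice):

```python
import math
from itertools import product

def multi_indices(n, N):
    """The truncated set J_{N,n}: alpha with n components and |alpha| <= N,
    sorted by total degree so that each alpha - eps_k precedes alpha."""
    J = [a for a in product(range(N + 1), repeat=n) if sum(a) <= N]
    return sorted(J, key=lambda a: (sum(a), a))

def weight(alpha, q):
    """r_alpha = q^alpha / sqrt(|alpha|!), with q a list of the q_k."""
    q_alpha = 1.0
    for qk, ak in zip(q, alpha):
        q_alpha *= qk ** ak
    return q_alpha / math.sqrt(math.factorial(sum(alpha)))
```

The cardinality of \mathcal{J}_{N,n} is \binom{N+n}{n}, e.g., 10 for n = 2, N = 3.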

In numerical methods, we are only interested in a truncated propagator, e.g., the equations of (11.4.3) with

\alpha \in \mathcal{J}_{N,n} := \Big\{ \alpha = (\alpha_1,\cdots,\alpha_n) \,\Big|\, |\alpha| := \sum_{k=1}^{n} \alpha_k \le N \Big\}.

To facilitate the finite element approximation of the truncated propagator

in physical space, we first state the propagator (11.4.3) in its variational form,

A(u_\alpha, v) = \sum_{k=1}^{\infty} \sqrt{\alpha_k}\, M_k(u_{\alpha-\varepsilon_k}, v) + (f 1_{|\alpha|=0}, v), \quad \forall v \in H_0^1(D), \qquad (11.4.4)

where A(u,v) and M_k(u,v) are the bilinear forms associated with A and M_k, and (\cdot,\cdot) is the inner product in L^2(D). Suppose that we use a finite element space S_h, a finite-dimensional subspace of H_0^1(D) containing piecewise polynomials of degree at most r-1 (r \ge 2), where the partition \mathcal{T}_h of the domain D is quasi-uniform and h is the maximal element size. The FEA of (11.4.1) (to be precise, the FEA of the truncated propagator of (11.4.1)) is: find u_{N,n}^h = \sum_{\alpha\in\mathcal{J}_{N,n}} u_\alpha^h \xi_\alpha, where the Wiener chaos expansion coefficients u_\alpha^h \in S_h satisfy

A^h(u_\alpha^h, v) = \sum_{k=1}^{n} \sqrt{\alpha_k}\, M_k^h(u_{\alpha-\varepsilon_k}^h, v) + (f 1_{|\alpha|=0}, v), \quad \forall v \in S_h, \qquad (11.4.5)

where A^h(\cdot,\cdot) and M_k^h(\cdot,\cdot) are approximations of A(\cdot,\cdot) and M_k(\cdot,\cdot) by numerical integration, respectively. For simplicity of presentation, we assume that A^h = A and M_k^h = M_k over S_h \times S_h.
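To make the sequential solve concrete, here is a minimal one-dimensional finite-difference sketch (not the spectral/finite element discretization discussed here) with a single random variable (n = 1), A = -\partial_x^2 and M_1 u = (\phi u')'; the choices \phi(x) = \sigma\cos(\pi x), \sigma = 0.5 and f = 1 are hypothetical. Each coefficient u_a comes from one tridiagonal solve that reuses the previously computed u_{a-1}:

```python
import math

def solve_tridiag(sub, diag, sup, rhs):
    """Thomas algorithm for a tridiagonal linear system."""
    n = len(rhs)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = sup[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        m = diag[i] - sub[i] * cp[i - 1]
        cp[i] = sup[i] / m
        dp[i] = (rhs[i] - sub[i] * dp[i - 1]) / m
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

def propagator_1d(N=4, J=200, sigma=0.5):
    """Sequentially solve -u_a'' = sqrt(a)*(phi*u_{a-1}')' + 1_{a=0} on (0,1)
    with zero Dirichlet BCs, a = 0..N -- a 1D analogue of the propagator (11.4.3)."""
    h = 1.0 / J
    phi = [sigma * math.cos(math.pi * i * h) for i in range(J + 1)]
    sub = [-1.0 / h**2] * (J - 1)
    diag = [2.0 / h**2] * (J - 1)
    sup = [-1.0 / h**2] * (J - 1)
    us = []
    for a in range(N + 1):
        if a == 0:
            rhs = [1.0] * (J - 1)
        else:
            prev = [0.0] + us[-1] + [0.0]            # boundary values are zero
            flux = [0.5 * (phi[i] + phi[i + 1]) * (prev[i + 1] - prev[i]) / h
                    for i in range(J)]               # phi * u' at cell midpoints
            rhs = [math.sqrt(a) * (flux[i] - flux[i - 1]) / h for i in range(1, J)]
        us.append(solve_tridiag(sub, diag, sup, rhs))
    return us
```

For a = 0 this is just -u_0'' = 1, whose discrete solution matches x(1-x)/2; each higher coefficient is driven solely by lower ones, mirroring the lower-triangular structure.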


11.4.1 Error Estimates

In this section, we study the finite element and Wiener chaos truncation errors for (11.4.1), and especially how the errors from the individual equations in the propagator grow in the weakly coupled system of deterministic equations.

For the FEA of problem (11.4.1) by (11.4.5), we have the following error estimate.

Theorem 11.4.1 Assume that the domain D is open, bounded, and convex with smooth boundary. Suppose that f \in H^{m-1}(D); then u \in \mathcal{R}L^2(\mathcal{F}; H^{m+1}(D) \cap H_0^1(D)) solves (11.4.1). Suppose that u_{N,n}^h \in S_h is the FEA of (11.4.5). Under the assumptions on S_h, we then have

\|u - u_{N,n}^h\|^2_{\mathcal{R}L^2(\mathcal{F};H^1(D))} = \sum_{\alpha\in\mathcal{J}_{N,n}} \|u_\alpha - u_\alpha^h\|_1^2 r_\alpha^2 + \sum_{\alpha\notin\mathcal{J}_{N,n}} \|u_\alpha\|_1^2 r_\alpha^2
\le C \Big( \frac{1-(q-q_W)^{N+1}}{1-(q-q_W)} \Big(\frac{1-\bar{q}_n^{N+1}}{1-\bar{q}_n}\Big)^2 + \frac{1-\bar{q}_n^{N+1}}{1-\bar{q}_n} \Big) h^{2m} \|f\|_{m-1}^2
+ C \Big( \frac{q_W}{(1-q)^2} + \frac{(q-q_W)^{N+1}}{1-q} \Big) \|f\|_{H^{-1}}^2,

where C is a constant depending solely on m, A_1, A_2 and D, q = \sum_{k\ge 1} C_k^2 q_k^2 < 1 and q_W = \sum_{k>n} C_k^2 q_k^2. The constants \hat{C}_k come from (11.4.18), and q_n = \sum_{k=1}^n C_k^2 q_k^2 < \bar{q}_n = \sum_{k=1}^n \bar{C}_k q_k < 1 with \bar{C}_k = \max(C_k, \hat{C}_k).

Remark 11.4.2 We have required that \bar{q}_n = \sum_{k=1}^n \bar{C}_k q_k < 1, which leads to q_n < \bar{q}_n < 1. In this case, the constant in front of h^{2m} can be small. Error estimates for the FEA of (11.4.1) have been investigated in [295, 469]. In [469], the authors present a total error estimate of the FEA error and the truncation of WCE, showing how the numerical method behaves with the truncation parameters in random space. In a subsequent work [295], the authors show optimal convergence in the L^2-norm of some FEA in physical space, but there is a large constant in front of the FEA error in physical space, \binom{N+p}{p}, i.e., the number of equations one solves in the truncated propagator. Theorem 11.4.1 implies that the discretizations in random space and in physical space have very weak effects on each other.

As we bound \|u - u_{N,n}^h\|^2_{\mathcal{R}L^2(\mathcal{F};H^1(D))} by the two parts

\sum_{\alpha\in\mathcal{J}_{N,n}} \|u_\alpha - u_\alpha^h\|_1^2 r_\alpha^2 \quad \text{and} \quad \sum_{\alpha\notin\mathcal{J}_{N,n}} \|u_\alpha\|_1^2 r_\alpha^2,

we present the error estimates in two lemmas, which readily lead to Theorem 11.4.1.


Lemma 11.4.3 (Error estimate of the WCE truncation error, cf. [469]) Assume that the domain D is open, bounded, and convex with smooth boundary. Suppose that f \in H^{-1}(D); then u \in \mathcal{R}L^2(\mathcal{F}, H_0^1(D)) solves (11.4.1). We then have

\sum_{\alpha\notin\mathcal{J}_{N,n}} \|u_\alpha\|_1^2 r_\alpha^2 \le C \Big( \frac{q_W}{(1-q)^2} + \frac{(q-q_W)^{N+1}}{1-q} \Big) \|f\|_{H^{-1}}^2, \qquad (11.4.6)

where C is a constant depending solely on A_1, A_2 and D, q = \sum_{k\ge 1} C_k^2 q_k^2 < 1 and q_W = \sum_{k>n} C_k^2 q_k^2.

Proof. We observe that

\{\alpha \notin \mathcal{J}_{N,n}\} = \{\alpha \,|\, |\alpha| > N\} \oplus \{\alpha \,|\, |\alpha| \le N \text{ and } \alpha_k > 0 \text{ for some } k > n\} =: \mathcal{J}_1 \oplus \mathcal{J}_2.

Then we can write

\sum_{\alpha\notin\mathcal{J}_{N,n}} \|u_\alpha\|_1^2 r_\alpha^2 = \sum_{\alpha\in\mathcal{J}_1} \|u_\alpha\|_1^2 r_\alpha^2 + \sum_{\alpha\in\mathcal{J}_2} \|u_\alpha\|_1^2 r_\alpha^2.

The first term in the summation can be estimated by (see Theorem 3.11 in [319])

\sum_{\alpha\in\mathcal{J}_1} \|u_\alpha\|_1^2 r_\alpha^2 = \sum_{|\alpha|>N} \|u_\alpha\|_1^2 r_\alpha^2 \le C^2(A) \|f\|_{H^{-1}}^2 \sum_{|\alpha|>N} \Big(\sum_{k=1}^{\infty} C_k^2 q_k^2\Big)^{|\alpha|}. \qquad (11.4.7)

We write \alpha \in \mathcal{J}_2 as \alpha = \alpha^{(1)} + \alpha^{(2)}, where \alpha^{(1)} \in \mathcal{J}_{N,n} and all components \alpha^{(2)}_k with k \le n are zero. Let |\alpha^{(1)}| = l and |\alpha^{(2)}| = |\alpha| - l, where l is a nonnegative integer.

\sum_{\alpha\in\mathcal{J}_2} \|u_\alpha\|_1^2 r_\alpha^2 = \sum_{|\alpha|\ge 1,\,\alpha\in\mathcal{J}_2} \sum_{l=0}^{|\alpha|-1} \sum_{|\alpha^{(1)}|=l} \|u_\alpha\|_1^2 r_\alpha^2
\le C^2(A) \|f\|_{H^{-1}}^2 \sum_{|\alpha|\ge 1,\,\alpha\in\mathcal{J}_2} \sum_{l=0}^{|\alpha|-1} \sum_{|\alpha^{(1)}|=l} \frac{|\alpha|!}{\alpha^{(1)}!\,\alpha^{(2)}!} \prod_k (C_k^2 q_k^2)^{\alpha_k}, \qquad (11.4.8)

where we recall from [319] the estimate \|u_\alpha\|_1^2 r_\alpha^2 \le C^2(A)\|f\|_{H^{-1}}^2 \frac{|\alpha|!}{\alpha^{(1)}!\alpha^{(2)}!} \prod_k (C_k^2 q_k^2)^{\alpha_k} for \alpha\in\mathcal{J}_2. Then by the multinomial expansion, the summation in (11.4.8) can be estimated by


\sum_{|\alpha|\ge 1,\,\alpha\in\mathcal{J}_2} \sum_{l=0}^{|\alpha|-1} \sum_{|\alpha^{(1)}|=l} \frac{|\alpha|!}{\alpha^{(1)}!\alpha^{(2)}!} \prod_k (C_k^2 q_k^2)^{\alpha_k}
= \sum_{|\alpha|\ge 1,\,\alpha\in\mathcal{J}_2} \sum_{l=0}^{|\alpha|-1} \frac{|\alpha|!}{l!(|\alpha|-l)!} \sum_{|\alpha^{(1)}|=l} \frac{l!(|\alpha|-l)!}{\alpha^{(1)}!\alpha^{(2)}!} \prod_k (C_k^2 q_k^2)^{\alpha_k}
\le \sum_{|\alpha|=1}^{\infty} \sum_{l=0}^{|\alpha|-1} \frac{|\alpha|!}{l!(|\alpha|-l)!} (q-q_W)^l q_W^{|\alpha|-l},

where q = \sum_k C_k^2 q_k^2 and q_W = \sum_{k>n} C_k^2 q_k^2. From here and (11.4.8), it holds that

\sum_{\alpha\notin\mathcal{J}_{N,n}} \|u_\alpha\|_1^2 r_\alpha^2 = \sum_{|\alpha|>N} \|u_\alpha\|_1^2 r_\alpha^2 + \sum_{|\alpha|=1}^{N} \sum_{l=0}^{|\alpha|-1} \sum_{|\alpha^{(1)}|=l} \|u_\alpha\|_1^2 r_\alpha^2
\le C^2(A) \|f\|_{H^{-1}}^2 \Big( \sum_{|\alpha|=1}^{N} \sum_{l=0}^{|\alpha|-1} \frac{|\alpha|!}{l!(|\alpha|-l)!} (q-q_W)^l q_W^{|\alpha|-l} + \sum_{|\alpha|>N} q^{|\alpha|} \Big).

Now we claim that

\sum_{|\alpha|=1}^{N} \sum_{l=0}^{|\alpha|-1} \frac{|\alpha|!}{l!(|\alpha|-l)!} (q-q_W)^l q_W^{|\alpha|-l} + \sum_{|\alpha|>N} q^{|\alpha|} \le \frac{q_W}{(1-q)^2} + \frac{(q-q_W)^{N+1}}{1-q}.

In fact, by the binomial expansion,

\sum_{|\alpha|=1}^{N} \sum_{l=0}^{|\alpha|-1} \frac{|\alpha|!}{l!(|\alpha|-l)!} (q-q_W)^l q_W^{|\alpha|-l} + \sum_{|\alpha|>N} q^{|\alpha|}
\le \sum_{|\alpha|=1}^{N} \big( q^{|\alpha|} - (q-q_W)^{|\alpha|} \big) + \sum_{|\alpha|>N} q^{|\alpha|}
\le \sum_{|\alpha|=1}^{\infty} \big( q^{|\alpha|} - (q-q_W)^{|\alpha|} \big) + \sum_{|\alpha|>N} (q-q_W)^{|\alpha|}
\le \frac{q}{1-q} - \frac{q-q_W}{1-q+q_W} + \frac{(q-q_W)^{N+1}}{1-q+q_W}
= \frac{q_W}{(1-q+q_W)(1-q)} + \frac{(q-q_W)^{N+1}}{1-q+q_W}
\le \frac{q_W}{(1-q)^2} + \frac{(q-q_W)^{N+1}}{1-q}.

Thus we arrive at (11.4.6).


Let us introduce the Ritz-Galerkin projection and its error estimate. The Ritz-Galerkin projection \pi_h: H_0^1(D) \to S_h is defined by

A(\pi_h w - w, v_h) = 0, \quad \forall v_h \in S_h, \; w \in H_0^1(D).

Denote by T_h the discrete solution operator in S_h: for a given g, T_h g \in S_h solves the deterministic problem with homogeneous Dirichlet boundary condition, i.e., w_h = \pi_h w = T_h g \in S_h satisfies

A(w_h, v) = A^h(\pi_h w, v) = (g, v), \quad \forall v \in S_h. \qquad (11.4.9)

For example, let A u = -\sum_{i,j=1}^d D_i(D_j u) + a_0 u with a_0 > 0. Then A is uniformly positive definite and

A^h(\pi_h u, v_h) = \sum_{i,j=1}^d (D_j \pi_h u, D_i v_h) + a_0(\pi_h u, v_h) = \sum_{i,j=1}^d (D_j w_h, D_i v_h) + a_0(w_h, v_h) = A(w_h, v_h).

The standard error estimate for \pi_h (associated with A and its bilinear form) is the following [446]:

\|\pi_h w - w\| + h\|\pi_h w - w\|_1 \le C h^s \|w\|_s, \quad \forall w \in H^s(D) \cap H_0^1(D), \; 1 \le s \le r. \qquad (11.4.10)

Lemma 11.4.4 (Error estimate of the FEM error) Assume that the domain D is open, bounded, and convex with smooth boundary. Suppose that f \in H^{m-1}(D); then u \in \mathcal{R}L^2(\mathcal{F}, H^{m+1}(D)\cap H_0^1(D)) solves (11.4.1). Suppose that u_{N,n}^h \in S_h is the FEA of (11.4.5). Under the assumptions on S_h, we then have

\sum_{\alpha\in\mathcal{J}_{N,n}} \|u_\alpha - u_\alpha^h\|_1^2 r_\alpha^2 \le C \Big( \frac{1-(q-q_W)^{N+1}}{1-(q-q_W)} \Big(\frac{1-\bar{q}_n^{N+1}}{1-\bar{q}_n}\Big)^2 + \frac{1-\bar{q}_n^{N+1}}{1-\bar{q}_n} \Big) h^{2m} \|f\|_{m-1}^2,

where C is a constant depending solely on m, A_1, A_2 and D, q = \sum_{k\ge 1} C_k^2 q_k^2 < 1 and q_W = \sum_{k>n} C_k^2 q_k^2. All other constants are defined in Theorem 11.4.1.

Proof. Denote e_\alpha = \pi_h u_\alpha - u_\alpha^h \in S_h and \eta_\alpha = u_\alpha - \pi_h u_\alpha, where \pi_h is the Ritz-Galerkin projection operator. From (11.4.3) and (11.4.5), one can get the following error equation for each \alpha \in \mathcal{J}_{N,n}:

A^h(e_\alpha, v) = \sum_{k=1}^{n} \sqrt{\alpha_k}\, (M_k e_{\alpha-\varepsilon_k}, v) + \sum_{k=1}^{n} \sqrt{\alpha_k}\, (M_k \eta_{\alpha-\varepsilon_k}, v) \qquad (11.4.11)

for all v \in S_h. By the definition of T_h, this error equation can be rewritten as

e_\alpha = \sum_{k=1}^{n} \sqrt{\alpha_k}\, T_h M_k^h e_{\alpha-\varepsilon_k} + F_\alpha, \quad F_\alpha = \sum_{k=1}^{n} \sqrt{\alpha_k}\, T_h M_k^h \eta_{\alpha-\varepsilon_k} \in S_h. \qquad (11.4.12)


To address the dependence on the right-hand side and on \gamma, we denote by e_\alpha(g;\gamma) the solution of (11.4.12) with F_\alpha = g 1_{\alpha=\gamma}, where \gamma \in \mathcal{J}_{N,n} and g \in H^1. Noting that e_\alpha(g;\gamma) = 0 if |\alpha| < |\gamma|, we have

\sum_{\alpha\in\mathcal{J}_{N,n}} \|e_\alpha(F_\gamma;\gamma)\|_1^2 r_\alpha^2 = \sum_{\alpha,\,\alpha+\gamma\in\mathcal{J}_{N,n}} \|e_{\alpha+\gamma}(F_\gamma;\gamma)\|_1^2 r_{\alpha+\gamma}^2. \qquad (11.4.13)

Define \tilde{e}_\alpha = e_\alpha (\alpha!)^{-1/2}. By the linearity of the error equation (11.4.12), we have

\tilde{e}_{\alpha+\gamma}(F_\gamma;\gamma) = \tilde{e}_\alpha\big(F_\gamma(\gamma!)^{-1/2}; (0)\big).

The term on the right-hand side can be estimated as follows. Following the arguments in the proof of Theorem 4.5 in [319], we have, for |\alpha| = n,

(\alpha!)^{1/2}\, \tilde{e}_\alpha\big(F_\gamma(\gamma!)^{-1/2};(0)\big) = e_\alpha\big(F_\gamma(\gamma!)^{-1/2};(0)\big) = \frac{1}{\sqrt{\alpha!}} \sum_{\sigma\in\mathcal{P}_n} T_h M_{k_{\sigma_n}} \cdots T_h M_{k_{\sigma_1}} F_\gamma (\gamma!)^{-1/2}, \qquad (11.4.14)

where \mathcal{P}_n is the permutation group of the set \{1,2,\cdots,n\}. Thus, by \|T_h M_k v\|_1 \le C_k \|v\|_1,^1 we have from (11.4.14)

\|\tilde{e}_{\alpha+\gamma}(F_\gamma;\gamma)\|_1 \le C_A \frac{|\alpha|!}{\alpha!\sqrt{\gamma!}} \|F_\gamma\|_1 \prod_k C_k^{\alpha_k}. \qquad (11.4.15)

Thus we have, by the triangle inequality, (11.4.13) and (11.4.15),

\Big(\sum_{\alpha\in\mathcal{J}_{N,n}} \|e_\alpha\|_1^2 r_\alpha^2\Big)^{1/2} \le \sum_{\gamma\in\mathcal{J}_{N,n}} \Big(\sum_{\alpha\in\mathcal{J}_{N,n}} \|e_\alpha(F_\gamma;\gamma)\|_1^2 r_\alpha^2\Big)^{1/2}
= \sum_{\gamma\in\mathcal{J}_{N,n}} \Big(\sum_{\alpha\in\mathcal{J}_{N,n}} \|e_{\alpha+\gamma}(F_\gamma;\gamma)\|_1^2 r_{\alpha+\gamma}^2\Big)^{1/2}
\le \sum_{\gamma\in\mathcal{J}_{N,n}} \Big(\sum_{\alpha\in\mathcal{J}_{N,n}} C_A^2 \frac{(|\alpha|!)^2}{(\alpha!)^2\gamma!} \|F_\gamma\|_1^2 \prod_k C_k^{2\alpha_k}\, (\alpha+\gamma)!\, r_{\alpha+\gamma}^2 \Big)^{1/2}
= C_A \sum_{\gamma\in\mathcal{J}_{N,n}} \|F_\gamma\|_1 r_\gamma \Big( \sum_{\alpha\in\mathcal{J}_{N,n}} \frac{|\alpha|!\,|\gamma|!}{\alpha!\,\gamma!} \frac{(\alpha+\gamma)!}{|\alpha+\gamma|!} \Big[ \frac{|\alpha|!}{\alpha!} \prod_k (C_k q_k)^{2\alpha_k} \Big] \Big)^{1/2},

where we recall the weights (11.4.2) in the last two steps. Then by the fact that \frac{|\alpha|!\,|\gamma|!}{\alpha!\,\gamma!}\frac{(\alpha+\gamma)!}{|\alpha+\gamma|!} \le 1 ([295, Lemma B.2]), we have

^1 The constant here is usually not the same as that in the estimate \|A^{-1}M_k v\|_1 \le C_k \|v\|_1. We use the same constant for simplicity.


\Big(\sum_{\alpha\in\mathcal{J}_{N,n}} \|e_\alpha\|_1^2 r_\alpha^2\Big)^{1/2} \le C_A \sum_{\gamma\in\mathcal{J}_{N,n}} \|F_\gamma\|_1 r_\gamma \Big( \sum_{\alpha\in\mathcal{J}_{N,n}} \Big[\frac{|\alpha|!}{\alpha!} \prod_k (C_k q_k)^{2\alpha_k}\Big] \Big)^{1/2}
\le C_A \sum_{\gamma\in\mathcal{J}_{N,n}} \|F_\gamma\|_1 r_\gamma \Big( \sum_{j=0}^{N} \Big(\sum_{k=1}^{n} C_k^2 q_k^2\Big)^j \Big)^{1/2}
= C_A \Big( \frac{1-(q-q_W)^{N+1}}{1-q+q_W} \Big)^{1/2} \sum_{\gamma\in\mathcal{J}_{N,n}} \|F_\gamma\|_1 r_\gamma, \qquad (11.4.16)

where we denote \sum_{k=1}^{\infty} C_k^2 q_k^2 = q and \sum_{k>n} C_k^2 q_k^2 = q_W.

It remains to estimate \|F_\gamma\|_1. Recalling (11.4.12) and \|T_h M_k^h f\|_1 \le C_k \|f\|_1, we have

\|F_\gamma\|_1 = \Big\| \sum_{k=1}^{n} \sqrt{\gamma_k}\, T_h M_k^h \eta_{\gamma-\varepsilon_k} \Big\|_1 \le \sum_{k=1}^{n} \sqrt{\gamma_k}\, \|T_h M_k^h \eta_{\gamma-\varepsilon_k}\|_1
\le \sum_{k=1}^{n} \sqrt{\gamma_k}\, C_k \|\eta_{\gamma-\varepsilon_k}\|_1 \le C h^m \sum_{k=1}^{n} \sqrt{\gamma_k}\, C_k \|u_{\gamma-\varepsilon_k}\|_{m+1}, \qquad (11.4.17)

where we have applied the error estimate (11.4.10) to \pi_h u - u. Assume that for any v \in H^{m+1}(D), it holds that

\|A^{-1}M_k v\|_{m+1} \le \hat{C}_k \|v\|_{m+1}. \qquad (11.4.18)

Similar to the proof of (11.4.15), there exists a constant C_{m,A} such that

\|u_{\gamma-\varepsilon_k}\|_{m+1} \le C_{m,A} \frac{|\gamma-\varepsilon_k|!}{\sqrt{(\gamma-\varepsilon_k)!}} \|f\|_{m-1} \prod_{j=1}^{n} \hat{C}_j^{(\gamma-\varepsilon_k)_j}. \qquad (11.4.19)

From here and (11.4.17), we then have

\|F_\gamma\|_1 \le C h^m \sum_{k=1}^{n} \sqrt{\gamma_k}\, C_k \|u_{\gamma-\varepsilon_k}\|_{m+1} \le C_{m,A} C h^m \|f\|_{m-1} \frac{|\gamma|!}{\sqrt{\gamma!}} \prod_{j=1}^{n} \bar{C}_j^{\gamma_j} \sum_{k=1}^{n} \frac{\gamma_k}{|\gamma|}
= C_{m,A} C h^m \|f\|_{m-1} \frac{|\gamma|!}{\sqrt{\gamma!}} \prod_{j=1}^{n} \bar{C}_j^{\gamma_j},

where \bar{C}_k = \max(C_k, \hat{C}_k). Hence, by the multinomial expansion and (11.4.2), we have that


\sum_{\gamma\in\mathcal{J}_{N,n}} \|F_\gamma\|_1 r_\gamma \le C h^m \|f\|_{m-1} \sum_{j=0}^{N} \sum_{|\gamma|=j} \frac{\sqrt{|\gamma|!}}{\sqrt{\gamma!}} \prod_{i=1}^{n} \bar{C}_i^{\gamma_i} q_i^{\gamma_i}
\le C h^m \|f\|_{m-1} \sum_{j=0}^{N} \sum_{|\gamma|=j} \frac{|\gamma|!}{\gamma!} \prod_{i=1}^{n} \bar{C}_i^{\gamma_i} q_i^{\gamma_i}
\le C h^m \|f\|_{m-1} \sum_{j=0}^{N} \Big( \sum_{i=1}^{n} \bar{C}_i q_i \Big)^j
= C h^m \|f\|_{m-1} \frac{1-\bar{q}_n^{N+1}}{1-\bar{q}_n}, \quad \Big(\bar{q}_n = \sum_{j=1}^{n} \bar{C}_j q_j < 1\Big). \qquad (11.4.20)

From (11.4.20), (11.4.16), and (11.4.19), we then conclude that

\sum_{\alpha\in\mathcal{J}_{N,n}} \|u_\alpha - u_\alpha^h\|_1^2 r_\alpha^2 \le 2\sum_{\alpha\in\mathcal{J}_{N,n}} \|e_\alpha\|_1^2 r_\alpha^2 + 2\sum_{\alpha\in\mathcal{J}_{N,n}} \|\eta_\alpha\|_1^2 r_\alpha^2
\le C \frac{1-(q-q_W)^{N+1}}{1-(q-q_W)} \Big(\frac{1-\bar{q}_n^{N+1}}{1-\bar{q}_n}\Big)^2 h^{2m} \|f\|_{m-1}^2 + C h^{2m}\|f\|_{m-1}^2 \sum_{\alpha\in\mathcal{J}_{N,n}} \frac{(|\alpha|!)^2}{\alpha!} \prod_{j=1}^{n} \hat{C}_j^{2\alpha_j}\, r_\alpha^2
\le C h^{2m} \|f\|_{m-1}^2 \Big( \frac{1-(q-q_W)^{N+1}}{1-(q-q_W)} \Big(\frac{1-\bar{q}_n^{N+1}}{1-\bar{q}_n}\Big)^2 + \frac{1-\hat{q}_n^{N+1}}{1-\hat{q}_n} \Big),

where \hat{q}_n = \sum_{j=1}^{n} \hat{C}_j^2 q_j^2 < \bar{q}_n < 1 and we have used the following fact:

\sum_{\alpha\in\mathcal{J}_{N,n}} \frac{(|\alpha|!)^2}{\alpha!} \prod_{j=1}^{n} \hat{C}_j^{2\alpha_j}\, r_\alpha^2 = \sum_{\alpha\in\mathcal{J}_{N,n}} \frac{|\alpha|!}{\alpha!} \prod_{j=1}^{n} \hat{C}_j^{2\alpha_j}\, q^{2\alpha} = \sum_{j=0}^{N} \Big(\sum_{i=1}^{n} \hat{C}_i^2 q_i^2\Big)^j \le \frac{1-\hat{q}_n^{N+1}}{1-\hat{q}_n}.

Remark 11.4.5 Here we assume that the coefficients in A and M are suf-ficiently smooth. The operator A can be a second-order differential operatorof general form but has to be positive definite.

Remark 11.4.6 In the proof, we assume for simplicity that A = A^h and M = M^h, i.e., we assume that no extra errors are introduced by the numerical integration of the bilinear forms A and M. The conclusion remains valid if these integration errors are taken into account, and the convergence rate does not change as long as a numerical integration method of sufficiently high order (higher than the convergence rate) is adopted. Strang's first lemma (e.g., [80, Theorem 4.1.1]) sheds light on this issue: for each \alpha,

\|u_\alpha^h - u_\alpha\|_1 \le C \inf_{v_h\in S_h} \Big\{ \|v_h - u_\alpha\|_1 + \sup_{w_h\in S_h} \frac{|A(v_h,w_h) - A^h(v_h,w_h)|}{\|w_h\|_1} \Big\} + \sum_{k=1}^{n} C_k \sqrt{\alpha_k}\, \|u_{\alpha-\varepsilon_k}^h - u_{\alpha-\varepsilon_k}\|_1.


11.4.2 Numerical results

We consider a two-dimensional stochastic elliptic problem (11.4.1) in the following form,

-\Delta u = \mathrm{div}(\nabla u) \diamond \dot{W} + 1, \quad x \in D = [0,1]\times[0,1], \qquad (11.4.21)

with homogeneous Dirichlet boundary conditions. Recall that \dot{W} is a spatial white noise, \dot{W} = \sum_{k=1}^{\infty} \phi_k(x)\xi_k, where \{\phi_k(x)\} is a CONS in L^2(D) and the \xi_k are mutually independent standard normal random variables. Here we take \phi_k(x) as a proper reordering of the CONS with basis functions m_l(x_1)m_n(x_2), where \{m_l(x_1)\} is a CONS in L^2([0,1]), such that l + n is increasing and l starts from 0. For example,

\phi_1(x) = m_0(x_1)m_0(x_2), \quad \phi_2(x) = m_0(x_1)m_1(x_2), \quad \phi_3(x) = m_1(x_1)m_0(x_2),
\phi_4(x) = m_0(x_1)m_2(x_2), \quad \phi_5(x) = m_1(x_1)m_1(x_2), \quad \phi_6(x) = m_2(x_1)m_0(x_2), \cdots

and we can deduce that

\phi_{19} = m_2(x_1)m_3(x_2), \quad \phi_{20} = m_1(x_1)m_4(x_2), \quad \phi_{21} = m_0(x_1)m_5(x_2).
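One way to generate this reordering programmatically is to group the index pairs (l, n) by their total degree l + n; within each diagonal group the order is only a convention. The sketch below ascends in l within each group, which reproduces \phi_1-\phi_6 above (\phi_{19}-\phi_{21} as printed follow the reverse within-group order, so only the grouping by degree should be relied upon):

```python
def cons_ordering(num):
    """First `num` index pairs (l, n) for phi_k(x) = m_l(x1) * m_n(x2),
    grouped by nondecreasing total degree d = l + n, ascending in l in a group."""
    pairs, d = [], 0
    while len(pairs) < num:
        pairs.extend((l, d - l) for l in range(d + 1))
        d += 1
    return pairs[:num]
```

With this convention, the degree-d group contributes d + 1 functions, so the first 21 functions cover all degrees up to 5.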

In the computation we take m_0(x_1) = 1 and m_l(x_1) = \sqrt{2}\cos(l\pi x_1) for l \ge 1. Equation (11.4.21) can be rewritten as

-\Delta u = \sum_{k=1}^{\infty} \mathrm{div}(\phi_k \nabla u) \diamond \xi_k + 1.

Let u_{N,n} = \sum_{j=0}^{N} \sum_{|\alpha|=j} u_\alpha \xi_\alpha be the truncated WCE of the solution up to polynomial order N. Then the truncated propagator reads

-\Delta u_\alpha = \sum_{k=1}^{n} \sqrt{\alpha_k}\, \mathrm{div}(\phi_k \nabla u_{\alpha-\varepsilon_k}) + f 1_{|\alpha|=0}, \quad \text{in } D, \qquad u_\alpha|_{\partial D} = 0. \qquad (11.4.22)

The weighted norm \|u\|^2_{\mathcal{R}L^2(\mathcal{F};H^1(D))} is computed as \sum_{j=0}^{\infty} \sum_{|\alpha|=j} \|u_\alpha\|^2_{H^1} r_\alpha^2, where

r_\alpha = \frac{q^\alpha}{\sqrt{|\alpha|!}}, \quad q_k = \frac{1}{(1+k)C_k}, \quad C_k = \|\phi_k\|_\infty.

Each equation in the propagator (11.4.22) is solved with a spectral element method with 10 \times 10 elements and 6 \times 6 nodes (Gauss-Legendre-Lobatto nodes in both the x_1- and x_2-directions) in each element. In Figure 11.2, we plot the errors with respect to N with n = 21, which are computed by

\|u_h^{N,n} - u_h^{N-1,n}\|_{\mathcal{R}L^2(\mathcal{F};H^1(D))} = \Big( \sum_{|\alpha|=N} \|u_\alpha\|^2_{H^1} r_\alpha^2 \Big)^{1/2}.

We observe that spectral convergence in N, the order of WCE, is obtained, in agreement with our error estimate in Theorem 11.4.1.


Fig. 11.2. Weighted-H^1 errors versus the polynomial order N of WCE for Equation (11.4.22). A spectral element method with uniform partition (10\times 10 elements) and 6\times 6 Gauss-Legendre-Lobatto nodes in each element is used. The number of random variables used is n = 21.

11.5 Application of Wick-Malliavin approximation to nonlinear SPDEs

In Chapter 11.3, we showed both numerically and theoretically that the accuracy of the Wick-Malliavin approximation to the elliptic equation with lognormal coefficient (11.3.8) decreases as the noise intensity increases, similar to perturbation methods. In this section, we show that the zeroth-order Wick-Malliavin approximation leads to the same deterministic systems as those derived from perturbation methods for SPDEs with quadratic nonlinearity driven by Gaussian random fields, such as the stochastic Burgers equation and the stochastic Navier-Stokes equation.

Consider the following stochastic Burgers equation,

\partial_t v + v\partial_x v = \nu \partial_x^2 v + \sigma\cos(x) \sum_{k=1}^{\infty} \lambda_k m_k(t)\xi_k, \quad (t,x) \in (0,T]\times(0,2\pi), \qquad (11.5.1)

with deterministic initial condition v(0,x) and periodic boundary conditions. Here \{m_k(t)\} is a CONS in L^2([0,T]), the \xi_k are mutually independent standard Gaussian random variables, and the \lambda_k are real numbers. When \lambda_k = 1 for all k \ge 1, \sum_{k=1}^{\infty} \lambda_k m_k(t)\xi_k is the white noise; see Chapter 2.2.

Consider the WCE for the stochastic Burgers equation (11.5.1). Suppose that v = \sum_{\alpha\in\mathcal{J}} v_\alpha \xi_\alpha. By (11.2.7) and Theorem 11.2.1, we know that (see also [225, 348])

v^2 = \sum_{\alpha\in\mathcal{J}} \sum_{\gamma\in\mathcal{J}} \sum_{(0)\le\beta\le\alpha} B(\alpha,\beta,\gamma)\, v_{\alpha-\beta+\gamma} v_{\beta+\gamma}\, \xi_\alpha,


where B(\alpha,\beta,\gamma) is defined in (11.2.9). Then we can readily obtain the propagator for (11.5.1):

\partial_t v_\alpha + \frac{1}{2} \sum_{\gamma\in\mathcal{J}} \sum_{(0)\le\beta\le\alpha} B(\alpha,\beta,\gamma)\, \partial_x(v_{\alpha-\beta+\gamma} v_{\beta+\gamma}) = \nu\partial_x^2 v_\alpha + \sigma\cos(x) \sum_{k=1}^{\infty} 1_{\{\alpha_j=\delta_{j,k}\}} \lambda_k m_k(t), \qquad (11.5.2)

where, for fixed k, 1_{\{\alpha_j=\delta_{j,k}\}} = 1 if \alpha_j = \delta_{j,k} for all j (i.e., \alpha = \varepsilon_k), and 0 otherwise. The initial conditions for v_\alpha are

v_\alpha(0,x) = \delta_{|\alpha|=0}\, v(0,x). \qquad (11.5.3)

Again, we are solving a truncated propagator: for \alpha \in \mathcal{J}_{N,n},

\partial_t v_\alpha + \frac{1}{2} \sum_{\gamma\in\mathcal{J}_{N,n}} \sum_{(0)\le\beta\le\alpha} B(\alpha,\beta,\gamma)\, \partial_x(v_{\alpha-\beta+\gamma} v_{\beta+\gamma}) = \nu\partial_x^2 v_\alpha + \sigma\cos(x) \sum_{k=1}^{\infty} 1_{\{\alpha_j=\delta_{j,k}\}} \lambda_k m_k(t), \qquad (11.5.4)

with the initial condition (11.5.3). By the Wick-Malliavin approximation, the truncated propagator can be approximated by

\partial_t u_\alpha + \frac{1}{2} \sum_{\gamma\in\mathcal{J}_{Q,n}} \sum_{(0)\le\beta\le\alpha} B(\alpha,\beta,\gamma)\, \partial_x(u_{\alpha-\beta+\gamma} u_{\beta+\gamma}) = \nu\partial_x^2 u_\alpha + \sigma\cos(x) \sum_{k=1}^{\infty} 1_{\{\alpha_j=\delta_{j,k}\}} \lambda_k m_k(t), \qquad (11.5.5)

with the initial condition (11.5.3). Here we use u_\alpha to represent an approximation of v_\alpha, as the two are generally different. In particular, when Q = 0, we have

\partial_t u_\alpha + \frac{1}{2} \sum_{(0)\le\beta\le\alpha} \partial_x(u_{\alpha-\beta} u_\beta) = \nu\partial_x^2 u_\alpha + \sigma\cos(x) \sum_{k=1}^{\infty} 1_{\{\alpha_j=\delta_{j,k}\}} \lambda_k m_k(t). \qquad (11.5.6)

When Q \ge N, we have from (11.2.11) that the truncated propagator (11.5.5) coincides with (11.5.4).

Now, we apply a perturbation method to solve (11.5.1): first write the solution in a power series expansion,

v = \bar{u}_{(0)} + \sum_{|\alpha|\ge 1} \bar{u}_\alpha \prod_k \xi_k^{\alpha_k},

and then plug this expansion into (11.5.1) and compare the coefficients of \prod_k \xi_k^{\alpha_k} to obtain an equation that \bar{u}_\alpha satisfies:

\partial_t \bar{u}_\alpha + \frac{1}{2} \sum_{(0)\le\beta\le\alpha} \partial_x(\bar{u}_{\alpha-\beta}\, \bar{u}_\beta) = \nu\partial_x^2 \bar{u}_\alpha + \sigma\cos(x) \sum_{k=1}^{\infty} 1_{\{\alpha_j=\delta_{j,k}\}} \lambda_k m_k(t). \qquad (11.5.7)

We observe that \bar{u}_\alpha satisfies exactly the same equation as u_\alpha in (11.5.6) and also the same initial condition (11.5.3). A similar observation is made in [462], where a one-dimensional Burgers equation is considered with a linear combination of several Gaussian random variables as additive noise.
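The equivalence can be checked on a scalar toy problem. The sketch below uses the quadratic ODE v' = -v^2 + \sigma\xi with a single standard Gaussian \xi (an assumed stand-in for (11.5.1), not the Burgers equation itself). Expanding v = \sum_k u_k(t)\xi^k gives u_k' = -\sum_{j\le k} u_j u_{k-j} + \sigma 1_{k=1}, the common zeroth-order Wick-Malliavin/perturbation propagator; evaluating \sum_k u_k \xi^k at a sample \xi and comparing with a direct solve exposes the O(\sigma^{N+1}) truncation error:

```python
def propagator(N=6, sigma=0.1, T=1.0, steps=2000, v0=1.0):
    """Forward-Euler integration of u_k' = -sum_{j<=k} u_j u_{k-j} + sigma*1_{k=1},
    the shared Q=0 Wick-Malliavin / perturbation system for v' = -v^2 + sigma*xi."""
    dt = T / steps
    u = [v0] + [0.0] * N
    for _ in range(steps):
        du = [-sum(u[j] * u[k - j] for j in range(k + 1))
              + (sigma if k == 1 else 0.0) for k in range(N + 1)]
        u = [u[k] + dt * du[k] for k in range(N + 1)]
    return u

def direct(xi, sigma=0.1, T=1.0, steps=2000, v0=1.0):
    """Forward-Euler solve of v' = -v^2 + sigma*xi at a fixed sample xi."""
    dt, v = T / steps, v0
    for _ in range(steps):
        v += dt * (-v * v + sigma * xi)
    return v
```

At \xi = 1, \sum_k u_k agrees with the direct solve up to the neglected terms of order \sigma^{N+1}.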


Remark 11.5.1 Here we illustrated the idea for SPDEs with temporal whitenoise instead of spatial white noise or spatial-temporal white noise. ForSPDEs with temporal white noise, the regularity can be low in time but canbe high in space. For SPDEs with spatial white noise, the regularity in spacecan be low as we have seen in Chapter 10. Thus we need to use more termsin Wiener chaos expansion and Wick-Malliavin approximation.

11.6 Wick-Malliavin approximation: extensions for non-Gaussian white noise

Let \{m_k(x)\}_{k\ge 1} be a CONS in L^2(D), D \subseteq \mathbb{R}^n. Then we can define the white noise as

\dot{N}(x) = \sum_{k=1}^{\infty} m_k(x)\xi_k, \qquad (11.6.1)

where the \xi_k are mutually uncorrelated random variables obeying the same distribution, with mean zero and variance one.

For example, when D = [0,T] and \{m_k(t)\}_{k\ge 1} is a CONS in L^2([0,T]), the stochastic process \dot{N} is exactly the Gaussian white noise if the \xi_k are mutually independent standard Gaussian random variables. As in Chapter 2.2, the process \int_0^t \dot{N}(s)\,ds is Brownian motion:

N(t) := \int_0^t \dot{N}(s)\,ds = \sum_k \xi_k \int_0^t m_k(s)\,ds, \quad 0 \le t \le T. \qquad (11.6.2)

This is exactly the formulation of Brownian motion using an orthogonal expansion.

Even if the \xi_k are not standard Gaussian random variables, we still have the following properties:

• N(t) has uncorrelated increments;
• E[N(t)] = 0;
• E[N(t)N(s)] = t \wedge s; in particular, when t = s, E[(N(t))^2] = t.

We can also define stochastic integrals in Ito's sense: for any continuous deterministic f(t) on [0,T],

\int_0^t f(s)\,dN(s) = \lim_{|\Pi_n|\to 0} \sum_{i=1}^{n} f(t_i)\big(N_{t_{i+1}} - N_{t_i}\big) \quad \text{in } L^2,

where \Pi_n = \{t_i = t_i^n, \; 0 \le i \le n\} is a partition of [0,T].

Example 11.6.1 (Non-Gaussian white noise) Let the \xi_k be i.i.d. with \mathbb{P}(\xi_k = \pm 1) = 1/2. Then the process N_t in (11.6.2) is not Gaussian, since its characteristic function is, for each n > 1 and \theta \in \mathbb{R},


E\Big[\exp\Big(i\theta \sum_{k=1}^{n} \xi_k \int_0^t m_k(s)\,ds\Big)\Big] = \prod_{k=1}^{n} \cos\Big(\theta \int_0^t m_k(s)\,ds\Big).

Let the \xi_k be i.i.d. and uniform on [-\sqrt{3},\sqrt{3}]. The stochastic process (11.6.2) is non-Gaussian as

E\Big[\exp\Big(i\theta \sum_{k=1}^{n} \xi_k \int_0^t m_k(s)\,ds\Big)\Big] = \prod_{k=1}^{n} \frac{\sin\big(\sqrt{3}\,\theta \int_0^t m_k(s)\,ds\big)}{\sqrt{3}\,\theta \int_0^t m_k(s)\,ds}.

Recall that the characteristic function E[e^{i\theta X}] uniquely determines the distribution of the random variable X. Thus, to show that a process (random variable) is not Gaussian, it suffices to show that its characteristic function is not the same as that of a Gaussian process (random variable). For i.i.d. standard Gaussian random variables \xi_k, the characteristic function is

E\Big[\exp\Big(i\theta \sum_{k=1}^{n} \xi_k \int_0^t m_k(s)\,ds\Big)\Big] = \prod_{k=1}^{n} \exp\Big( -\frac{\theta^2 \big(\int_0^t m_k(s)\,ds\big)^2}{2} \Big).

For many different distributions, we can derive generalized polynomial chaos using orthogonal polynomials, which are listed in Table 11.1. More generally, the whole family of the Askey scheme of orthogonal polynomials can be used to construct polynomial chaos for various distributions; see, e.g., [488].

Table 11.1. Commonly used distributions (measures) and corresponding orthogonal polynomials.

Distribution | Orthogonal polynomials | Support           | Alias
Gaussian     | Hermite polynomials    | R                 | Wiener chaos (Hermite chaos)
Uniform      | Legendre polynomials   | [a, b]            | Legendre chaos
Beta         | Jacobi polynomials     | [a, b]            | Jacobi chaos
Gamma        | Laguerre polynomials   | [0, ∞)            | Laguerre chaos
Poisson      | Charlier polynomials   | {0, 1, 2, ...}    | Charlier chaos
Binomial     | Krawtchouk polynomials | {0, 1, 2, ..., N} | Krawtchouk chaos

Assume that \xi obeys one of the distributions in Table 11.1 and let P_k be the k-th order orthogonal polynomial corresponding to that measure. Assume also that

\int_D P_k(\alpha\xi+\beta)\, P_l(\alpha\xi+\beta)\,d\mu = \delta_{l,k}\, k!,

where D is the support of the random variable \xi and \alpha, \beta are constants such that \alpha\xi+\beta is well defined in the domain of the P_k's (the support D). Here \mu is the corresponding distribution (measure) of \xi. Then we define

N_k(\xi) = P_k(\alpha\xi+\beta), \quad k \ge 0. \qquad (11.6.3)

By the completeness of these polynomials in L^2(\mu), the set \{N_k\}_{k\ge 0} is a complete orthogonal basis of L^2(\Omega,\sigma(\xi),\mathbb{P}).

Example 11.6.2 Assume that \xi obeys a uniform distribution on [-\sqrt{3},\sqrt{3}]. The orthogonal basis consists of Legendre polynomials, which can be defined by the Rodrigues formula

L_k(x) = \frac{(-1)^k}{2^k k!} \frac{d^k}{dx^k}\big[(1-x^2)^k\big]. \qquad (11.6.4)

While the Legendre polynomials L_k are defined on [-1,1], we need the polynomials L_k(x/\sqrt{3}) instead (\alpha = 1/\sqrt{3}, \beta = 0 in (11.6.3)). Moreover, we can check from the definition of the Legendre polynomials and integration by parts that for any k, l \ge 0,

\frac{1}{2\sqrt{3}} \int_{-\sqrt{3}}^{\sqrt{3}} L_k\Big(\frac{x}{\sqrt{3}}\Big) L_l\Big(\frac{x}{\sqrt{3}}\Big)\,dx = \frac{1}{2} \int_{-1}^{1} L_k(y) L_l(y)\,dy = \frac{1}{2k+1}\,\delta_{k,l}. \qquad (11.6.5)

Define

N_k(\xi) = \sqrt{k!(2k+1)}\, L_k\Big(\frac{\xi}{\sqrt{3}}\Big). \qquad (11.6.6)

Then the set \{N_k(\xi)\}_{k\ge 0} is a complete orthogonal basis of L^2(\Omega,\sigma(\xi),\mathbb{P}).
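The normalization in (11.6.6) can be verified numerically: with \xi uniform on [-\sqrt{3},\sqrt{3}], one should find E[N_k N_l] = k!\,\delta_{k,l}. A small sketch using the three-term recurrence for the Legendre polynomials and a simple trapezoidal quadrature (the number of quadrature points is an arbitrary choice):

```python
import math

def legendre(k, x):
    """L_k(x) via the recurrence (j+1) L_{j+1} = (2j+1) x L_j - j L_{j-1}."""
    p0, p1 = 1.0, x
    if k == 0:
        return p0
    for j in range(1, k):
        p0, p1 = p1, ((2*j + 1) * x * p1 - j * p0) / (j + 1)
    return p1

def N(k, xi):
    # normalized basis (11.6.6): N_k(xi) = sqrt(k! (2k+1)) * L_k(xi / sqrt(3))
    return math.sqrt(math.factorial(k) * (2*k + 1)) * legendre(k, xi / math.sqrt(3))

def expect(f, pts=20001):
    """E[f(xi)] for xi ~ Uniform[-sqrt(3), sqrt(3)], by the trapezoidal rule."""
    a = math.sqrt(3.0)
    h = 2 * a / (pts - 1)
    s = sum(f(-a + i * h) * (0.5 if i in (0, pts - 1) else 1.0) for i in range(pts))
    return s * h / (2 * a)
```

For instance, E[N_2^2] \approx 2! = 2 and E[N_1 N_3] \approx 0.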

For a multi-index \alpha \in \mathcal{J}, we define the polynomial in a product fashion:

N_\alpha = \prod_k N_{\alpha_k}(\xi_k). \qquad (11.6.7)

Note that E[N_\alpha^2] = \alpha!. The set \{N_\alpha, \alpha\in\mathcal{J}\} is a complete orthogonal basis in L^2(\Omega, \sigma(\xi_k, k\ge 1), \mathbb{P}).

Similar to the Cameron-Martin theorem (Theorem 2.3.6), we can use orthogonal expansions to represent square-integrable stochastic processes.

Theorem 11.6.3 (Generalized polynomial chaos expansion) Let \mathcal{F} = \sigma(\xi_k, k\ge 1). Then \{N_\alpha\}_{\alpha\in\mathcal{J}} from (11.6.7) is a complete orthogonal system of L^2(\Omega,\mathcal{F},\mathbb{P}): for each \eta \in L^2(\Omega,\mathcal{F},\mathbb{P}),

\eta = \sum_\alpha \eta_\alpha N_\alpha, \quad \eta_\alpha = \frac{E[\eta N_\alpha]}{\alpha!}, \quad E[N_\alpha^2] = \alpha!.

Moreover, E[\eta^2] = \sum_\alpha \eta_\alpha^2\, \alpha! < \infty.


The orthogonal expansion in the theorem is called a generalized polynomial chaos expansion; cf. the Wiener (Hermite polynomial) chaos expansion in Section 2.5.3. The basis \{N_\alpha\} is called a generalized Cameron-Martin basis.

Now we define the Wick product, first for the generalized Cameron-Martin basis and then for square-integrable stochastic processes:

N_\alpha \diamond N_\beta = N_{\alpha+\beta}, \quad 1 \diamond N_\alpha = N_\alpha, \quad \alpha,\beta \in \mathcal{J}.

For u = \sum_\alpha u_\alpha N_\alpha and v = \sum_\alpha v_\alpha N_\alpha, where u_\alpha, v_\alpha \in \mathbb{R},

u \diamond v = \sum_\alpha \sum_{\beta\le\alpha} u_\beta v_{\alpha-\beta}\, N_\alpha.

The following example shows that the propagators of Wick-nonlinear equations are formally the same regardless of the distribution of the random variables or stochastic processes; see, e.g., [349].

Example 11.6.4 Consider the equation

A u - u^{\diamond 3} + \sum_{k=1}^{\infty} M_k u \diamond \xi_k = f.

Here u^{\diamond 3} denotes u \diamond u \diamond u. The operators A and M_k are the same as those in Chapter 11.4.

By Theorem 11.6.3, we can seek a chaos solution u = \sum_{\alpha\in\mathcal{J}} u_\alpha N_\alpha. By the definition of the Wick product,

u^{\diamond 3} = u \diamond u \diamond u = \sum_{\alpha,\beta,\gamma\in\mathcal{J}} u_\alpha u_\beta u_\gamma N_{\alpha+\beta+\gamma} = \sum_{\theta\in\mathcal{J}} \Big( \sum_{\alpha+\beta+\gamma=\theta} u_\alpha u_\beta u_\gamma \Big) N_\theta. \qquad (11.6.8)

Then by the orthogonality of the generalized polynomial chaos expansion, we obtain the propagator (the coefficients of the chaos expansion):

A u_\theta - \sum_{\alpha+\beta+\gamma=\theta} u_\alpha u_\beta u_\gamma + \sum_{k=1}^{\infty} M_k u_{\theta-\varepsilon_k} = f_\theta, \quad f_\theta = \frac{E[f N_\theta]}{\theta!}, \quad \theta \in \mathcal{J}. \qquad (11.6.9)

The propagator (11.6.9) holds for any random variables \xi_k as long as they have distributions corresponding to the polynomial chaos listed in Table 11.1. For example, the \xi_k can all be Gaussian random variables, or the \xi_{2k} can be Gaussian random variables while the \xi_{2k-1} have uniform distributions.

For equations with polynomial nonlinearity, we can use the Wick-Malliavin expansion to obtain propagators. Let us define the Malliavin derivative, first for the generalized Cameron-Martin basis and then for square-integrable stochastic processes.

DN_α = ∑_{|γ|=1, γ≤α} [α!/(α−γ)!] N_{α−γ} = ∑_{β∈J} ∑_{|γ|=1, β+γ=α} [(γ+β)!/β!] N_β.


11.6 Wick-Malliavin approximation: extensions for non-Gaussian white noise 321

For u = ∑_α u_α N_α ∈ L²,

D_γ u = ∑_{|α|≥1} α_k u_α N_{α^(k)} = ∑_{α≥γ} [α!/(α−γ)!] u_α N_{α−γ},  |γ| = 1,

Du = ∑_{|γ|=1} D_γ u = ∑_{|γ|=1} ∑_{α≥γ} [α!/(α−γ)!] u_α N_{α−γ} = ∑_{α∈J} ∑_{|γ|=1} [(α+γ)!/α!] u_{α+γ} N_α.

Here α^(k) = α − ε_k ≥ 0, and ε_k ∈ J is the multi-index with one at the k-th entry and zeros elsewhere, so that |ε_k| = 1. Higher-order Malliavin derivatives can be defined as follows:

D^n_γ u = ∑_{α≥γ} [α!/(α−γ)!] u_α N_{α−γ},  |γ| = n,

D^n u = ∑_{|γ|=n} (1/γ!) D^n_γ u = ∑_{|γ|=n} ∑_{α≥γ} [α!/((α−γ)! γ!)] u_α N_{α−γ} = ∑_{α∈J} ∑_{|γ|=n} [(α+γ)!/(α! γ!)] u_{α+γ} N_α.
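In a truncated expansion, Du only shuffles and rescales coefficients: the coefficient of N_{α−ε_k} receives a contribution α_k u_α. A small sketch of this bookkeeping (illustrative names):

```python
from collections import defaultdict

def malliavin_D(u):
    """First-order Malliavin derivative of a truncated expansion.

    u: dict mapping multi-index tuples to coefficients of N_alpha.
    Implements D u = sum_{|gamma|=1, gamma<=alpha} alpha!/(alpha-gamma)! u_alpha N_{alpha-gamma},
    where for gamma = eps_k the factor alpha!/(alpha-gamma)! is just alpha_k.
    """
    out = defaultdict(float)
    for a, ua in u.items():
        for k, ak in enumerate(a):
            if ak > 0:
                b = a[:k] + (ak - 1,) + a[k + 1:]   # alpha - eps_k
                out[b] += ak * ua                   # factor alpha_k
    return dict(out)

u = {(2, 0): 1.0, (1, 1): 1.0}      # u = N_(2,0) + N_(1,1)
Du = malliavin_D(u)                 # = 3*N_(1,0) + N_(0,1)
```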

In general, the Mikulevicius-Rozovsky formula (11.2.5) is not valid for polynomial chaos other than the Hermite polynomial chaos. However, it is still possible to define a Wick-Malliavin approximation because of the polynomial nature of generalized polynomial chaos. In fact, the idea of the Mikulevicius-Rozovsky formula is based on the linearization coefficient problem for Hermite polynomials. For univariate Hermite polynomials, the linearization coefficient problem is to find coefficients c_{k,n,m} such that

H_n(x)H_m(x) = ∑_{k=0}^{m+n} c_{k,n,m} H_k(x) = ∑_{q=0}^{m∧n} b_{n,m,q} H_{m+n−2q}(x),  b_{n,m,q} = B(n,m,q) √(n!m!)/√((n+m−2q)!). (11.6.10)
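For probabilists' Hermite polynomials the linearization coefficients are explicit: He_n He_m = ∑_{q=0}^{n∧m} \binom{n}{q}\binom{m}{q} q! He_{n+m−2q}, which is the unnormalized form of (11.6.10). The sketch below (an illustration, not the book's code) checks this identity against NumPy's HermiteE arithmetic:

```python
import numpy as np
from numpy.polynomial import hermite_e as He
from math import comb, factorial

def herme_lin_coeffs(n, m):
    """Linearization of probabilists' Hermite polynomials:
    He_n * He_m = sum_{q=0}^{min(n,m)} comb(n,q)*comb(m,q)*q! * He_{n+m-2q}.
    Returns the coefficient vector in the He basis."""
    c = np.zeros(n + m + 1)
    for q in range(min(n, m) + 1):
        c[n + m - 2 * q] = comb(n, q) * comb(m, q) * factorial(q)
    return c

n, m = 3, 2
# product computed directly by NumPy in the HermiteE basis
direct = He.hermemul([0] * n + [1], [0] * m + [1])
```

For n = 3, m = 2 both routes give He_5 + 6 He_3 + 6 He_1; keeping only q = 0 is exactly the Wick product.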

The Wick product keeps only the highest-order polynomial by setting q = 0. The Wick-Malliavin approximation of level Q keeps the terms with 0 ≤ q ≤ Q, where 0 ≤ Q ≤ n ∧ m. Similarly, the linearization coefficient problem for general orthogonal polynomials is

P_n(x)P_m(x) = ∑_{k=0}^{m+n} a_{k,n,m} P_k(x) = ∑_{0≤2q≤m+n} b_{n,m,q} P_{m+n−2q}(x). (11.6.11)

Similar to the idea of the Wick-Malliavin approximation, we define the following approximation:

P_n(x)P_m(x) ≈ P_n(x) ⋄_Q P_m(x) = ∑_{0≤2q≤Q} b_{n,m,q} P_{m+n−2q}(x),  Q ≤ m + n.

Thus, for (11.6.3) and (11.6.7), the Wick-Malliavin approximation is defined by

N_α ⋄_Q N_β = ∑_{0≤2|γ|≤Q} b_{α,β,γ} N_{α+β−γ},  Q ≤ |α + β|. (11.6.12)


Assume that u = ∑_{α∈J} u_α N_α and v = ∑_{α∈J} v_α N_α. Then

u ⋄_Q v = ∑_{α,β∈J} u_α v_β N_α ⋄_Q N_β = ∑_{α,β∈J} u_α v_β ∑_{0≤2|γ|≤Q(α,β)} b_{α,β,γ} N_{α+β−γ},  Q(α,β) ≤ |α + β|. (11.6.13)

11.6.1 Numerical results

Consider the following Burgers equation with additive noise:

∂_t u + u ∂_x u = ν ∂_x² u + (σ/n) ∑_{k=1}^{n} cos(2kπx) cos(2kπt) ξ_k,  x ∈ (0, 2π), (11.6.14)

with deterministic initial condition u_0(x) = 1 + sin(2x) and periodic boundary conditions. We assume that the ξ_k's are either all standard Gaussian random variables or all uniform random variables on [−1, 1]. The propagator of the Burgers equation is

∂_t u_α + (1/2) ∑_{q∈J_{N,n}} ∑_{β≤α} C(α, β, q) ∂_x(u_{α−β+q} u_{β+q}) = ν ∂_x² u_α + σ ∑_{k=1}^{n} 1_{{α_j=δ_{j,k}, |α|=1}} m_k(t), (11.6.15)

where, for fixed k, the indicator 1_{{α_j=δ_{j,k}, |α|=1}} equals 1 if α = ε_k and 0 otherwise. The initial conditions for the u_α are

u_α(0, x) = δ_{|α|=0} u_0(x). (11.6.16)
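The truncated index set J_{N,n} = {α of length n : |α| ≤ N} over which the propagator runs can be enumerated directly; its cardinality is \binom{N+n}{n}. A small sketch (illustrative names):

```python
from itertools import product as iproduct
from math import comb

def multi_indices(N, n):
    """All multi-indices alpha of length n with |alpha| = alpha_1+...+alpha_n <= N,
    i.e., the truncated index set J_{N,n} used in the propagator."""
    idx = [a for a in iproduct(range(N + 1), repeat=n) if sum(a) <= N]
    # order by total degree first, then lexicographically
    return sorted(idx, key=lambda a: (sum(a), a))

J = multi_indices(2, 2)
# |J_{N,n}| = comb(N + n, n); here comb(4, 2) = 6 indices, starting at (0, 0)
```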

If the ξ_k's are standard Gaussian random variables, C(α, β, q) = Φ(α, β, q). If the ξ_k's are normalized uniform random variables over [−1, 1], C(α, β, q) = Ψ(α, β, q). For the Legendre polynomial chaos expansion, we have, for u, v ∈ L²,

uv = u ⋄ v + ∑_{Q=1}^{∞} ∑_{α∈J} ∑_{κ∈J, |κ|=Q} ∑_{(0)≤β≤α} u_{β+κ} v_{α−β+κ} Ψ(α, β, κ) N_α/√(α!),

where N_α is the Legendre polynomial chaos (11.6.6)–(11.6.7) and

Ψ(α, β, q) = (A_q A_β A_{α−β}/A_{α+q}) √((2α+1)/(2(α+q)+1)) √((2(β+q)+1)(2(α−β+q)+1)). (11.6.17)

Here A_q = Γ(q+1/2)/(q! Γ(1/2)) and Γ(·) is the Gamma function.

Let us first check the convergence in σ. In Figure 11.3, we plot the errors in the second and fourth moments of the Wick-Malliavin approximation for (11.6.15) with Gaussian noise. Again, we do not have an exact solution, so we choose as reference a solution from a stochastic collocation method with a very fine resolution in random space. We observe that the error decreases faster with larger Q up to a certain σ. We can actually observe a further decrease in errors with smaller σ when we increase the resolution in physical space and time (numerical results are not shown here). In other words, in the current plot, the further decreasing trend in σ for large Q is dominated by the space-time discretization error. We have a very similar observation in Figure 11.4 for (11.6.15) with uniform noise.

Fig. 11.3. Relative errors in moments of the Wick-Malliavin approximation for (11.6.15) with different levels Q and a single Gaussian random variable: T = 1 and n = 1. Left: Error in the second moment; Right: Error in the fourth moment.


Now we present the error behavior in time, where n = 4 and we use N = 8, for (11.6.15) with Gaussian noise and with uniform noise. Here we obtain the reference solution by computing the solution with polynomial chaos methods without the Wick-Malliavin approximation.

Fig. 11.4. Relative errors in moments of the Wick-Malliavin approximation for (11.6.15) with different levels Q and a single normalized uniform random variable: T = 1 and n = 1. Left: Error in the second moment; Right: Error in the fourth moment.

For this problem, we observe that the errors in the second moment and the fourth moment change slowly with time up to t = 0.9, see Figure 11.5 for Gaussian noise and Figure 11.6 for uniform noise. The error behaviors are similar, but the errors for the uniform noise are smaller. This effect can be explained as follows: the variance of the uniform random variable (1/3) is smaller than that of the standard Gaussian. Also, it is interesting to observe that when Q = 3, the errors in the second moments actually decrease with time. One possible reason is that low-level Wick-Malliavin approximations may lead to large accumulated Wick-Malliavin errors, while a high-level approximation (level 3 in this case) keeps the accumulated Wick-Malliavin errors small enough that the total error is dominated by the space-time discretization errors.

Fig. 11.5. Relative errors in moments of the Wick-Malliavin approximation for (11.6.15) with Gaussian noise using different levels Q: σ = 10 and n = 4. Left: Error in the second moment; Right: Error in the fourth moment.


Fig. 11.6. Relative errors in moments of the Wick-Malliavin approximation for (11.6.15) with uniform noise using different levels Q: σ = 10 and n = 4. Left: Error in the second moment; Right: Error in the fourth moment.


11.6.2 Malliavin derivatives for Poisson noises

The Malliavin derivatives of random variables depend on the nature of the associated orthogonal polynomials. For Poisson noises, the measure associated with a Poisson random variable is ∑_{k∈S} [e^{−λ} λ^k/k!] δ(x − k) (the distribution is P(x = k) = e^{−λ} λ^k/k! and the mean is λ), where S = {0, 1, 2, ...}. The corresponding orthogonal polynomials are the Charlier polynomials, whose generating function is

∑_{n=0}^{∞} [C_n(x;λ)/n!] t^n = e^{−λt}(1 + t)^x. (11.6.18)

The Charlier polynomials satisfy the following orthogonality relation:

∑_{k∈S} [e^{−λ} λ^k/k!] C_m(k;λ) C_n(k;λ) = n! λ^n δ_{m,n}. (11.6.19)

Similar to the Jacobi and Hermite polynomials, the Charlier polynomials can be computed via a recurrence relation:

C_0(x;λ) = 1,  C_1(x;λ) = x − λ,
C_{n+1}(x;λ) = (x − λ − n) C_n(x;λ) − λn C_{n−1}(x;λ).

The first few Charlier polynomials are

C_0(x;λ) = 1,  C_1(x;λ) = x − λ,
C_2(x;λ) = x² − 2λx − x + λ²,
C_3(x;λ) = x³ − 3λx² − 3x² + 3λ²x + 3λx + 2x − λ³.

In general, the Charlier polynomials can be written as

C_n(x;λ) = ∑_{k=0}^{n} \binom{n}{k} (−1)^{n−k} λ^{n−k} x(x−1)⋯(x−k+1). (11.6.20)
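The recurrence and the closed form (11.6.20) are easy to check numerically, together with the orthogonality (11.6.19) under the Poisson weights. A sketch (λ = 1 is assumed for the checks; names are illustrative):

```python
from math import comb, exp, factorial

def charlier(n, x, lam):
    """Charlier polynomial C_n(x; lam) via the three-term recurrence
    C_{n+1} = (x - lam - n)*C_n - lam*n*C_{n-1},  C_0 = 1, C_1 = x - lam."""
    c_prev, c = 1.0, x - lam
    if n == 0:
        return c_prev
    for k in range(1, n):
        c_prev, c = c, (x - lam - k) * c - lam * k * c_prev
    return c

def charlier_explicit(n, x, lam):
    """Closed form: sum_k comb(n,k) (-1)^{n-k} lam^{n-k} x(x-1)...(x-k+1)."""
    total = 0.0
    for k in range(n + 1):
        ff = 1.0
        for j in range(k):
            ff *= (x - j)            # falling factorial (x)_k
        total += comb(n, k) * (-1) ** (n - k) * lam ** (n - k) * ff
    return total

lam = 1.0
def inner(m, n, K=60):
    """sum_k e^{-lam} lam^k/k! C_m(k) C_n(k), truncated at k = K;
    should equal n! lam^n delta_{mn} by (11.6.19)."""
    return sum(exp(-lam) * lam**k / factorial(k)
               * charlier(m, k, lam) * charlier(n, k, lam) for k in range(K))
```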

Consider the product of two Charlier polynomials, which can be represented as a linear combination of Charlier polynomials. Moreover, by (11.6.18), we have

C_n(k;λ) C_m(k;λ) = ∑_{l≤m+n} ∑_{p≥0} [m! n! λ^p / (p! (p−m+l)! (p−n+l)! (m+n−l−2p)!)] C_l(k;λ),

where p−m+l, p−n+l, m+n−l−2p ≥ 0. Letting l = m + n − 2q, where 2q runs over the integers from 0 to m+n (so q may be a half-integer, since for Charlier polynomials all lower orders appear in the product), we then have

C_n(k;λ) C_m(k;λ) = ∑_{2q=0}^{m+n} b_{m,n,q} C_{m+n−2q}(k;λ), (11.6.21)


where

b_{m,n,q} = ∑_{0≤p≤q, p+m≥2q, p+n≥2q} m! n! λ^p / (p! (p+m−2q)! (p+n−2q)! (2q−2p)!).
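Instead of evaluating the closed-form b_{m,n,q}, the linearization coefficients can also be obtained by projection, using the orthogonality (11.6.19): the coefficient of C_l in C_m C_n is E[C_m C_n C_l]/(l! λ^l). A sketch for a single variable (λ = 1 assumed; an illustration, not the book's code):

```python
from math import exp, factorial

lam = 1.0

def charlier(n, x, lam=lam):
    """C_n(x; lam) via the three-term recurrence."""
    c_prev, c = 1.0, x - lam
    if n == 0:
        return c_prev
    for k in range(1, n):
        c_prev, c = c, (x - lam - k) * c - lam * k * c_prev
    return c

def poisson_E(f, K=80):
    """E[f(xi)] for xi ~ Poisson(lam), truncating the support at K."""
    return sum(exp(-lam) * lam**k / factorial(k) * f(k) for k in range(K))

def lin_coeffs(m, n):
    """Coefficients c_l with C_m C_n = sum_{l=0}^{m+n} c_l C_l,
    by projection: c_l = E[C_m C_n C_l] / (l! lam^l)."""
    return [poisson_E(lambda k: charlier(m, k) * charlier(n, k) * charlier(l, k))
            / (factorial(l) * lam**l) for l in range(m + n + 1)]

c = lin_coeffs(1, 1)   # C_1^2 = lam*C_0 + C_1 + C_2
```

For m = n = 1 this returns [λ, 1, 1], i.e., C_1² = λC_0 + C_1 + C_2: in contrast to the Hermite or Legendre chaos, the odd step (here C_1) also appears.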

Now we define the Charlier polynomial chaos and the Wick-Malliavin approximation of a product of two square-integrable random variables. Suppose that u = ∑_{m=0}^{∞} u_m C_m(ξ;λ) ∈ L² and v = ∑_{n=0}^{∞} v_n C_n(ξ;λ) ∈ L². When uv ∈ L²,

uv = ∑_{m,n=0}^{∞} u_m v_n C_m(ξ;λ) C_n(ξ;λ) = ∑_{m,n=0}^{∞} u_m v_n ∑_{2q=0}^{m+n} b_{m,n,q} C_{m+n−2q}(ξ;λ). (11.6.22)

Define the Wick product as

C_m(ξ;λ) ⋄ C_n(ξ;λ) = b_{m,n,0} C_{m+n}(ξ;λ) = C_{m+n}(ξ;λ).

As before, the idea is to simplify the form of the product in computation. Based on similar ideas, we can define the Wick-Malliavin product as follows:

C_n(ξ;λ) ⋄_Q C_m(ξ;λ) = ∑_{2q=0}^{Q} b_{m,n,q} C_{m+n−2q}(ξ;λ),  Q = 0, 1, 2, .... (11.6.23)

When Q = 0,

C_n(ξ;λ) ⋄_0 C_m(ξ;λ) = b_{m,n,0} C_{m+n}(ξ;λ) = C_n(ξ;λ) ⋄ C_m(ξ;λ). (11.6.24)

When Q ≥ m + n,

C_n(ξ;λ) ⋄_Q C_m(ξ;λ) = C_n(ξ;λ) C_m(ξ;λ). (11.6.25)

We can interpret the operator ⋄_Q as the Wick-Malliavin approximation, which recovers the full product as Q goes to m + n from below. With the approximation (11.6.23), we can define

u ⋄_Q v = ∑_{m,n=0}^{∞} u_m v_n C_m(ξ;λ) ⋄_Q C_n(ξ;λ) = ∑_{m,n=0}^{∞} u_m v_n ∑_{2q=0}^{Q} b_{m,n,q} C_{m+n−2q}(ξ;λ). (11.6.26)

When Q = ∞, u ⋄_Q v is exactly uv if every term is well defined.

For multiple Poisson random variables (ξ_k's, k ≥ 2), the Charlier polynomial chaos basis can be defined as

C_α = ∏_k C_{α_k}(ξ_k; λ_k),  α = (α_1, α_2, ...) ∈ J.

The Wick-Malliavin product for these elements of the polynomial chaos basis is

C_α ⋄_Q C_β = ∏_k (C_{α_k}(ξ_k;λ_k) ⋄_{q_k} C_{β_k}(ξ_k;λ_k)),  Q = (q_1, q_2, ...) ∈ J, and α, β ∈ J. (11.6.27)


This definition of the Wick-Malliavin product is slightly different from that in (11.6.12), where Q is a scalar. When Q = (0, 0, ...), the Wick-Malliavin product becomes the Wick product:

C_α ⋄ C_β = C_{α+β}. (11.6.28)

For two square-integrable stochastic processes u, v, we define

u ⋄_Q v = (∑_{α∈J} u_α C_α) ⋄_Q (∑_{β∈J} v_β C_β) = ∑_{α,β∈J} u_α v_β (C_α ⋄_Q C_β), (11.6.29)

when uv is also square integrable.

Example 11.6.5 Let the ξ_k's all be Poisson random variables with mean λ. Consider the following Burgers equation:

∂_t u + u ∂_x u = ν ∂_x² u + σ ∑_{k=1}^{n} m_k(t, x) C_1(ξ_k),  x ∈ (−π, π), (11.6.30)

with a deterministic initial condition and periodic boundary conditions. Here the m_k(t, x)'s are real-valued functions of t and x. A Wick-Malliavin approximation of (11.6.30) is

∂_t v + v ⋄_Q ∂_x v = ν ∂_x² v + σ ∑_{k=1}^{n} m_k(t, x) C_1(ξ_k),  x ∈ (−π, π), (11.6.31)

where Q = (q_1, q_2, ..., q_n) ∈ J_{N,n}.

The propagator of the Burgers equation (11.6.30) reads: for any α ∈ J,

∂_t u_α + ∑_{γ,β∈J; β+γ=α+2θ, θ∈J} b_{β,γ,θ} (∂_x u_β) u_γ = ν ∂_x² u_α + σ ∑_{k=1}^{n} 1_{{α_j=δ_{j,k}, |α|=1}} m_k(t), (11.6.32)

where, for fixed k, the indicator 1_{{α_j=δ_{j,k}, |α|=1}} equals 1 if α = ε_k and 0 otherwise, and b_{β,γ,θ} is the multivariate version of the coefficient in (11.6.21). The initial conditions for the u_α are

u_α(0, x) = δ_{|α|=0} u_0(x). (11.6.33)

In computation, we are only interested in a truncation of the propagator (with a slight abuse of notation): for α ∈ J_{N,n},

∂_t u_α + ∑_{γ,β∈J_{N,n}; β+γ=α+2θ, θ∈J_{N,n}} b_{β,γ,θ} (∂_x u_β) u_γ = ν ∂_x² u_α + σ ∑_{k=1}^{n} 1_{{α_j=δ_{j,k}, |α|=1}} m_k(t). (11.6.34)

A Wick-Malliavin approximation is then, for α ∈ J_{N,n},

∂_t u_α + ∑_{γ,β∈J_{N,n}; β+γ=α+2θ, θ∈J_{N,n}, θ≤Q} b_{β,γ,θ} (∂_x u_β) u_γ = ν ∂_x² u_α + σ ∑_{k=1}^{n} 1_{{α_j=δ_{j,k}, |α|=1}} m_k(t). (11.6.35)


Numerical results

Consider the Burgers equation (11.6.30) with the deterministic initial condition u_0(x) = 1 − sin(x) and periodic boundary conditions. More numerical examples can be found in [509].

We seek a numerical solution of the form

u^Q_{N,n}(t, x) = ∑_{α∈J_{N,n}} u_α(t, x) C_α,

where uα(t, x) satisfies the Wick-Malliavin approximation (11.6.35). To solvethe propagator, we use the following time discretization

un+1α − ν δt

2 ∂2xu

n+1α = un

α + ν δt2 ∂2

xunα +

∑γ,β∈JN,n

∑β+γ=α+2θ

θ∈JN,n, θ≤Qbβ,γ,θ∂xu

nβu

+σ∑n

k=1 1{αj=δj,k,|α|=1}mk((tn + tn+1)/2). (11.6.36)
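For one propagator mode, the semi-implicit step above amounts to a Crank-Nicolson solve for the diffusion, diagonalized by the FFT on a periodic grid, with the explicit nonlinear and forcing terms lumped into a right-hand side. A minimal sketch (not the authors' code; the sanity check below runs it on the pure heat equation, whose exact decay is known):

```python
import numpy as np

def cn_fourier_step(u, rhs, dt, nu):
    """One step of u^{n+1} - (nu*dt/2) u_xx^{n+1} = u^n + (nu*dt/2) u_xx^n + dt*rhs
    on a 2*pi-periodic grid, diagonalized by the FFT."""
    k = np.fft.fftfreq(u.size, d=1.0 / u.size)      # integer wavenumbers
    sym = 0.5 * nu * dt * k**2                      # symbol of -(nu*dt/2) d_xx
    u_hat = np.fft.fft(u)
    new_hat = ((1.0 - sym) * u_hat + dt * np.fft.fft(rhs)) / (1.0 + sym)
    return np.fft.ifft(new_hat).real

# sanity check on the heat equation u_t = nu*u_xx with u0 = sin(x)
M, nu, dt = 100, 0.5, 1e-2
x = 2 * np.pi * np.arange(M) / M
u = np.sin(x)
for _ in range(100):                                # integrate to t = 1
    u = cn_fourier_step(u, np.zeros(M), dt, nu)
# the exact solution at t = 1 is exp(-nu) * sin(x)
```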

In space, we use a Fourier collocation method as in Section 7.4. We compute the following relative error in the second moments:

ρ_2(t) = ‖E[(u^Q_{N,n})²] − E[u_ref²]‖_{l²} / ‖E[u_ref²]‖_{l²}. (11.6.37)

Here the reference solution u_ref is computed by taking ξ = k, k = 0, 1, 2, ..., K, where K is chosen such that P(ξ = K) = e^{−λ}λ^K/K! > 10^{−16} and P(ξ = K+1) < 10^{−16}. In the computation, 100 Fourier collocation points are used and the time step size is specified in each figure.
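The support cutoff K for the reference solution can be computed directly from the Poisson probabilities. A sketch (names illustrative; starting the search at the mode avoids stopping in the left tail for large λ):

```python
from math import exp, factorial

def poisson_cutoff(lam, tol=1e-16):
    """Smallest K beyond the mode with P(xi = K) > tol >= P(xi = K+1)
    for xi ~ Poisson(lam), as used to build the reference solution."""
    K = max(1, int(lam))           # start at the mode, where the pmf is largest
    while exp(-lam) * lam**(K + 1) / factorial(K + 1) > tol:
        K += 1
    return K
```

For λ = 1 and tol = 10^{−16} this gives K = 17.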

For simplicity, we consider a single random variable with n = 1, m_1(x, t) = 1, and λ = 1. We investigate the accuracy of the Wick-Malliavin approximation as Q varies. We take the Charlier polynomial chaos up to sixth order (N = 6). We observe in Figure 11.7 that the error decreases with increasing Q.

We then check the effect of Q for various polynomial chaos orders N. In Figure 11.8, we observe that for N = 1, 2, 3, 4, the accuracy can be improved when Q ≤ N − 1, but the improvement in accuracy is less significant when Q > N − 1; see also [509]. Moreover, when Q = N, the Wick-Malliavin approximation (11.6.35) leads to the same propagator as (11.6.34).

An adaptive Wick-Malliavin approximation method. Now let us describe a simple but effective adaptive method proposed in [509]. The goal is to keep the relative error ρ_2(t) at the time grid points no larger than a tolerance ε, say 10^{−10}. N-adaptivity refers to refinement in N to keep ρ_2(t) ≤ ε: when ρ_2(t) > ε, we increase the order N. Q-adaptivity refers to refinement in Q when ρ_2(t) > ε, as long as Q ≤ N. In Figure 11.9, we fix either N or Q to keep the relative error ρ_2(t) no larger than ε.


11.7 Summary and bibliographic notes

For elliptic equations with spatial noise, we considered two cases of the coefficients: a lognormal coefficient and spatial white noise. The use of the Wick product leads to a significant reduction of the computational cost by sparsifying the resulting linear system of deterministic elliptic equations.

Fig. 11.7. Different levels of Wick-Malliavin approximation of the Burgers equation using (11.6.31): the relative errors ρ_2(t) versus t. Here ν = 1/2 and σ = 0.1. The polynomial chaos order is N = 6 and the time step size is δt = 2×10^{−4}. These figures are adapted from [509].

• For elliptic equations with lognormal coefficients, the Wick-Malliavin approximation was used in the framework of WCE to sparsify the resulting linear systems. The Wick-Malliavin approximation can be seen as a high-order perturbation method in terms of the noise intensity, see Theorem 11.3.4. Numerical results show that the Wick-Malliavin approximation can work well even when the magnitude of the noise is relatively large.

• For elliptic equations with spatial-white-noise coefficients, solutions lie in a weighted stochastic Sobolev space. The error estimates of the finite element approximation and the WCE are derived in Theorem 11.4.1. It is shown that, in a proper stochastic Sobolev space, the finite element error will not be polluted in the linear system of deterministic elliptic equations (the propagator), i.e., the finite element approximation error can be small even if there is a large number of equations in the propagator.

• The Wick-Malliavin approximation is applied to nonlinear equations as well. We showed that the zeroth-order Wick-Malliavin approximation leads to the same system of PDEs as that resulting from stochastic perturbation methods, see Section 11.5.

• The Wick-Malliavin approximation is generalized to non-Gaussian white noise. One-dimensional Burgers equations driven by non-Gaussian white noises are considered, and numerical results are presented using the corresponding generalized polynomial chaos. An adaptive method with varying polynomial chaos orders and levels of the Wick-Malliavin approximation is presented.

The Wick-Malliavin approximation provides a systematic way to reduce the computational cost of generalized polynomial chaos methods, especially for nonlinear equations. The approximation can be thought of as a high-order perturbation analysis method. However, it is not clear under what conditions the Wick-Malliavin approximation errors will not blow up with the truncation parameters.

Bibliographic notes. In different applications, the following forms of a(x, ω) have been used in (11.1.1):

• a(x, ω) is a bounded stochastic field, i.e., for a.e. ω, a(x, ω) is uniformly bounded in x (or uniformly bounded in both x and ω), see, e.g., [12].

• a(x, ω) is a lognormal field, i.e., ln(a(x, ω)) is a Gaussian process, see, e.g.,[73, 124, 140, 141, 220, 462, 469].

• a(x, ω) is a Gaussian field, see, e.g., [319, 469]. In this case, the elliptic problem is not well posed in the classical sense and is usually accompanied by the Wick product, where the solution lies in some weighted space along the stochastic direction, see, e.g., [319, 469].

When a(x, ω) is uniformly positive, bounded, and sufficiently regular, well-posedness and some finite element analysis have been established, see, e.g., [12] for rigorous error analysis under the assumption of finite-dimensional noise. See also, e.g., [11], where deterministic integration methods are used (SCM); [107] (Monte Carlo methods); [1, 25] (multilevel Monte Carlo methods); [281] (quasi-Monte Carlo methods); and [282] (multilevel quasi-Monte Carlo methods).

For lognormal diffusion, the well-posedness question has been considered in, e.g., [73, 74, 141, 220, 320]. In random space, several integration methods have been considered, see, e.g., [220, 320] (WCE); [124] (SCM); [75, 166, 440] (Monte Carlo and multilevel Monte Carlo methods); [165] (quasi-Monte Carlo methods); and [164] (multilevel quasi-Monte Carlo methods).


Fig. 11.8. Different levels of Wick-Malliavin approximation of the Burgers equation using (11.6.31): the relative errors ρ_2(0.5) versus Q. Here ν = 1 and σ = 1. The time step size is δt = 5×10^{−4}. These figures are adapted from [509].

Fig. 11.9. The Charlier polynomial chaos method for the Burgers equation with N- or Q-adaptivity: the relative error ρ_2(t) versus t. Here ν = 1/2 and the time step is δt = 2×10^{−4}. Left: Q-adaptivity with polynomial chaos order N = 6 and σ = 1; Right: N-adaptivity with Wick-Malliavin level Q = 3 and σ = 0.1. These figures are adapted from [509].


Another Wick-type model for elliptic equations with lognormal coefficient is proposed in [466, 467]:

−div((a^{−1})^{⋄(−1)} ⋄ ∇u) = f.

This is also a second-order approximation (in the intensity of the noise) of the model (11.1.1) and is numerically demonstrated to be a more accurate approximation than (11.1.2).

Karhunen–Loève expansion. A convergence study of the Karhunen–Loève expansion has been presented numerically in [231] and theoretically in [135], where an error analysis of truncating the Karhunen–Loève expansion is provided, with finite element methods used to obtain numerically the φ_k(x), the elements of the complete orthonormal system (CONS).

Interpretation of WCE solutions to SPDEs. With Wiener chaos methods, Rozovsky and his colleagues construct solutions in weighted Wiener chaos spaces for SPDEs, especially for linear equations, see [315, 319, 345, 347, 348, 384], etc. Note that the solutions are not always in L²(F), even for a simple wave equation with additive white noise in two dimensions. Thus, weighted spaces have to be introduced carefully. Rozovsky and his colleagues provide a systematic approach to the Wiener chaos solution. By comparison with the approach based on white noise theory, more appropriate weighted spaces are chosen carefully for different problems, offering more flexible solution spaces than the framework in [223].

For Wiener chaos methods for white noise SPDEs, a lot of work has been done for linear equations, see, e.g., [34, 327, 330, 332, 333, 443, 469]. Convergence analysis can be found in [34, 62, 63, 469], etc. However, the error analysis is far from what is practically demanded, e.g., for nonlinear equations and for balancing the errors from deterministic solvers against the truncation errors in random space.

Convergence rate of WCE for elliptic equations with lognormal coefficients. The key argument for the convergence of WCE is to estimate the derivatives with respect to the parameters (random variables). There are two approaches to estimating these derivatives. The first is to use a multivariate Taylor expansion to obtain regularity estimates from the elliptic equations that the multivariate derivatives of the solution satisfy, see, e.g., [73, 74, 220]. Once these derivatives are estimated, the convergence rate of WCE can be found [220]. The second approach is to estimate the WCE coefficients of the solution directly, see, e.g., [140, 141], where some weighted spaces in random space are used.

The Mikulevicius-Rozovsky formula (11.2.5) was mentioned in [229, Theorem 4.10] but was largely forgotten until the formula was derived in [346] with a simple proof for square-integrable random fields.

The Wick-Malliavin approximation has also been applied to some SPDEs with quadratic nonlinearity, driven by Gaussian random fields [462] or by discrete processes [509]. A general framework for the Wick-Malliavin approximation for random fields with a given distribution has been developed in [349]. However, no rigorous analysis of the Wick-Malliavin approximation for these problems is available.

11.8 Suggested practice

Exercise 11.8.1 Show that the formula (11.2.11) holds.

Exercise 11.8.2 Show that the Legendre polynomials defined by (11.6.4) satisfy the orthogonality relation

∫_{−1}^{1} L_l(x) L_k(x) dx = [2/(2k+1)] δ_{k,l}

and the three-term recurrence relation

(k+1) L_{k+1}(x) = (2k+1) x L_k(x) − k L_{k−1}(x),

where L_0(x) = 1 and L_1(x) = x. Apply the Legendre polynomial chaos (11.6.6) and the explicit fourth-order Runge-Kutta method to solve the following linear model:

dy/dt = −ξy,  t ∈ [0, 5],

where ξ obeys the uniform distribution on [0, 1].
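A sketch of one way to carry out the last part of this exercise (an illustration, not a unique solution): project dy/dt = −ξy onto the Legendre basis in η = 2ξ − 1, which yields a linear system for the chaos coefficients, integrated here by classical RK4. The mean of the solution, y_0(t), can be compared with the exact E[y(t)] = (1 − e^{−t})/t.

```python
import numpy as np
from numpy.polynomial import legendre as leg
from math import exp

# Galerkin system for dy/dt = -xi*y, xi = (eta+1)/2 with eta ~ U[-1,1]:
# y(t) = sum_k y_k(t) L_k(eta)  =>  dy_k/dt = -sum_j A[k,j] y_j,
# with A[k,j] = E[xi L_j L_k] / E[L_k^2], computed by Gauss-Legendre quadrature.
N = 8
eta, w = leg.leggauss(2 * N + 2)
xi = 0.5 * (eta + 1.0)
P = np.array([leg.legval(eta, [0] * k + [1]) for k in range(N + 1)])
A = (P * (xi * w)) @ P.T / (2.0 / (2 * np.arange(N + 1) + 1))[:, None]

def rhs(y):
    return -A @ y

y = np.zeros(N + 1)
y[0] = 1.0                                   # deterministic y(0) = 1
dt, T = 1e-2, 5.0
for _ in range(int(round(T / dt))):          # classical fourth-order Runge-Kutta
    k1 = rhs(y)
    k2 = rhs(y + 0.5 * dt * k1)
    k3 = rhs(y + 0.5 * dt * k2)
    k4 = rhs(y + dt * k3)
    y += dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
mean = y[0]                                  # approximates E[y(5)] = (1 - e^{-5})/5
```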

Exercise 11.8.3 Apply the multistage method as in Exercise 6.6.2 to solve the linear model from the last exercise over the time interval [0, 100].

Exercise 11.8.4 Derive a Q-level Wick-Malliavin approximation and write down the corresponding propagator for the following equation:

∂_t u − ∂_x² u = u − u² + ∑_{k=1}^{m} cos(kx) ξ_k,  x ∈ (0, 2π),

under the following conditions:

• m = 1, ξ_1 is a standard Gaussian random variable;
• m = 2, the ξ_k's (k = 1, 2) are i.i.d. standard Gaussian random variables;
• m = 1, ξ_1 obeys the uniform distribution on [0, 1];
• m = 2, the ξ_k's (k = 1, 2) obey the uniform distribution on [0, 1] and are i.i.d.


12 Epilogue

Stochastic partial differential equations usually have solutions of low regularity due to the nature of infinite-dimensional rough noises. The low regularity results in an enormous amount of computational time spent on Monte Carlo simulations. Despite the simplicity of Monte Carlo methods, their slow convergence is the main bottleneck in computing numerical solutions to SPDEs. Although substantial improvements in Monte Carlo methods have been made in recent years, it is still desirable to have further accelerated sampling techniques. Depending on the specific problem, the integration in random space can be made effective using different methods such as quasi-Monte Carlo methods, Wiener chaos methods, and stochastic collocation methods.

In addition to methods of integration in random space, it is of great importance to understand the underlying SPDEs. While numerical methods are usually discussed for general equations, it is appreciated in numerical SPDEs that a specialized numerical method can be applied to solve a small class of SPDEs. For example, for linear equations with deterministic coefficients, we can make full use of linearity, as done in Chapters 6 and 7.

In this work, we apply the Wong-Zakai approximation to stochasticdifferential equations with white noise. Our focus is to present how to usedeterministic integration methods in random space, particularly Wiener chaosmethods and stochastic collocation methods.

12.1 A review of this work

In Part I, we consider numerical methods for stochastic ordinary differential equations (SDEs). For stochastic differential equations with constant time delay, we derive three schemes from the Wong-Zakai approximation using the spectral approximation: the predictor-corrector scheme, the mid-point scheme, and the Milstein scheme. For stochastic ordinary differential equations with or without delay, we observe that the convergence order of numerical schemes via the Wong-Zakai approximation is not determined by the Wong-Zakai approximation itself but depends on the further discretization in time. For example, under the assumption of Lipschitz continuous coefficients, the Wong-Zakai approximation itself is of order half in the mean-square sense; however, the Milstein scheme based on the Wong-Zakai approximation (called the Milstein-like scheme in Chapter 4) is of order one.

© Springer International Publishing AG 2017. Z. Zhang, G.E. Karniadakis, Numerical Methods for Stochastic Partial Differential Equations with White Noise, Applied Mathematical Sciences 196, DOI 10.1007/978-3-319-57511-7

In practice, the coefficients of SDEs are often not Lipschitz continuous, or not even of linear growth. We consider stochastic differential equations with non-Lipschitz continuous coefficients in both drift and diffusion. Under a one-sided Lipschitz condition on the coefficients, we present a fundamental limit theorem, i.e., a relationship between the local truncation error and the global error in the mean-square sense for numerical schemes for nonlinear stochastic differential equations. We present an explicit balanced scheme so that we can efficiently integrate stochastic differential equations with super-linearly growing coefficients over a finite time interval.

In Part II, we consider Wiener chaos and stochastic collocation methodsfor linear advection-diffusion-reaction equations with multiplicative noises.

We present a recursive multistage Wiener chaos expansion method (WCE) and a recursive multistage stochastic collocation method (SCM) for longer-time integration of linear stochastic advection-reaction-diffusion equations with finite-dimensional noises. To compute the first two moments of the solution with such a recursive multistage procedure, we first compute the covariance matrix of the solution at different physical points at a time step, and then recursively compute the covariance matrix of the solution at the next time step using the covariance matrix at the previous time step. We continue this process until we reach the final integration time.

We compare the recursive multistage WCE with the method of characteristics plus a standard Monte Carlo sampling strategy, and show that the multistage WCE is more efficient than the standard Monte Carlo method if high accuracy in the first two moments is desired.

We also compare WCE and SCM in conjunction with the recursive multistage procedure. Although WCE theoretically exhibits higher-order convergence than SCM, we show that both methods are comparable in performance, depending on the underlying problem. The computational cost is proportional to the fourth power of the number of nodes or modes employed in physical space, but the cost can be reduced to the second power if we make full use of the sparsity of the solution.

For SCM, we also discuss a benchmark problem for stochastic nonlinear conservation laws: a stochastic piston problem in one-dimensional physical space. The problem of a piston moving into a tube (with the piston velocity being a Brownian motion) is modeled with stochastic Euler equations driven by white noise. After splitting the stochastic Euler equations into two parts (by Lie-Trotter splitting), we truncate the Brownian motion with its spectral expansion and apply SCM to obtain the variances of the shock locations at different time instants. The conclusion is that SCM is efficient for short-time simulations, while quasi-Monte Carlo methods are more efficient for relatively longer-time simulations.

We also illustrate the efficiency of SCM with the Euler scheme in time through a linear stochastic ordinary differential equation: error estimates show that SCM using a sparse grid of Smolyak type is efficient for short-time integration and for small noise magnitudes.

Our conclusion on integration methods in random space is as follows. With our recursive approach, WCE and SCM are efficient for longer-time integration of linear problems, and for a small number of noise terms within short-time simulation. However, as time increases, many random variables have already been employed, either from increments of Brownian motion or from the modes of a spectral truncation of Brownian motion. Deterministic integration methods are then no longer efficient, since their computational cost grows exponentially with the number of random variables. We then have to use randomized sampling strategies, such as Monte Carlo methods or randomized quasi-Monte Carlo methods, possibly together with variance reduction techniques to reduce the statistical errors.

For both WCE and SCM, we apply the Wong-Zakai approximation using a spectral approximation of Brownian motion. However, we use different stochastic products for WCE and SCM for reasons of computational efficiency. In practice, WCE is associated with the Ito-Wick product, which yields a weakly coupled system of PDEs for linear equations; SCM is associated with the Stratonovich product, which yields a decoupled system of PDEs. These different formulations lead to different numerical behavior, but the two methods are comparable in performance for linear problems.

In Part III, we consider elliptic equations with additive noise and multiplicative noise.

Using a spectral approximation of Brownian motion, we discuss a semilinear equation with additive spatial white noise. We find that for problems in two or three physical dimensions, we cannot expect better convergence from the spectral approximation of Brownian motion than from a piecewise linear approximation. However, we may expect high-order convergence when the solution is highly regular. For example, for elliptic equations with additive noise in one-dimensional physical space, or for higher-order equations in two- or three-dimensional physical space, we can expect high regularity and thus benefit from the spectral truncation of Brownian motion.

For elliptic equations with multiplicative noise, we consider WCE for a lognormal coefficient as well as for spatial white noise as the coefficient. In the former case, we use the Wick-Malliavin approximation to reduce the computational cost; it is shown that the Wick-Malliavin approximation can be a higher-order perturbation even when the noise intensity is relatively large. In the latter case (spatial white noise as the coefficient), the solution lies in weighted stochastic Sobolev spaces. Though the WCE leads to a weakly coupled linear system of deterministic equations, which is of great convenience in computation, it is not clear what the physical meaning of these numerical solutions is, as they have no bounded second-order moments.

For those who are interested in general numerical methods for SPDEs, we have included a brief review in Chapter 3.

12.2 Some open problems

What is the most important question to ask in numerical SPDEs? In general, the answer depends on what you want from SPDEs. Are you seeking the average behavior of solutions, e.g., the mean and covariance, some behavior along trajectories, or the probability distribution of solutions? The answer immediately leads to different numerical treatments and different senses of convergence. Currently, the main focus is on mean-square convergence and weak convergence (in moments or functionals of solutions). In applications where the interest is in the probability distribution of solutions, the current methodology, whether Monte Carlo methods and their variants or deterministic integration methods, can be very inefficient.

For stochastic partial differential equations with space-time noise, deterministic integration methods in random space are too expensive, as many random variables are needed to truncate the space-time noise. Monte Carlo methods and associated variance reduction methods, including the multilevel Monte Carlo method, could potentially be applied to resolve this issue. Further, model reduction methods could be applied to reduce the heavy computational load, e.g., homogenization for multiscale stochastic partial differential equations [119].

For long-time integration of nonlinear stochastic differential equations using deterministic integration methods in random space, dimensionality in random space remains the essential difficulty: the number of random variables grows linearly in time, and the number of Wiener chaos modes or stochastic collocation points grows exponentially with it. For linear equations solved with the recursive multistage WCE or SCM, the computational cost grows quickly for statistics of the solutions other than the first two moments; e.g., the cost for third-order moments is proportional to the sixth power of the number of nodes or modes employed in physical space. For nonlinear equations, the recursive multistage approach fails, as nonlinear equations usually depend strongly on the initial condition and the superposition principle no longer applies.


To lift the curse of dimensionality, we have to suppress the history data and restart from time to time to keep the dimensionality in random space (and thus the computational cost) low. To suppress the history data, we should employ reduction methods, such as functional analysis of variance (see, e.g., [133, 171, 504]), to reduce the number of random variables used before integrating over the next time interval.


Appendices


A

Basics of probability

A.1 Probability space

Definition A.1.1 (probability measure) A probability measure P on a measurable space (Ω, F) is a function from F to [0, 1] such that

• P(∅) = 0 and P(Ω) = 1;
• if {An}n≥1 ⊆ F and Ai ∩ Aj = ∅ for i ≠ j, then P(∪_{n=1}^∞ An) = Σ_{n=1}^∞ P(An).

Definition A.1.2 (probability space) A triple (Ω, F, P) is called a probability space if

• Ω is a sample space, i.e., a collection of all samples;
• F is a σ-algebra¹ on Ω;
• P is a probability measure on (Ω, F).

Definition A.1.3 (complete probability space) A probability space (Ω, F, P) is said to be a complete probability space if for all B ∈ F with P(B) = 0 and all A ⊆ B one has A ∈ F.

A.1.1 Random variable

Denote σ(D) = ∩{H : H is a σ-algebra on Ω, D ⊆ H}. We call σ(D) the σ-algebra generated by D.

Definition A.1.4 (F-measurable) If (Ω, F, P) is a given probability space, then a function Y : Ω → R^n is called F-measurable if Y^{−1}(U) = {ω ∈ Ω : Y(ω) ∈ U} ∈ F holds for all open sets U ⊆ R^n (or, equivalently, for all Borel sets U ⊆ R^n).

¹A σ-algebra on a set X is a collection of subsets of X that includes the empty set and is closed under complementation and under countable unions.

© Springer International Publishing AG 2017Z. Zhang, G.E. Karniadakis, Numerical Methods for StochasticPartial Differential Equations with White Noise,Applied Mathematical Sciences 196, DOI 10.1007/978-3-319-57511-7



If X : Ω → R^n is a function, then σ(X) is the smallest σ-algebra on Ω containing all the sets X^{−1}(U) for all open sets U in R^n.

Definition A.1.5 (random variable) Suppose that (Ω, F, P) is a given complete probability space. A random variable X is an F-measurable function X : Ω → R^n.

Theorem A.1.6 (Doob-Dynkin theorem) If X, Y : Ω → R^n are two given functions, then Y is σ(X)-measurable if and only if there exists a Borel measurable function g : R^n → R^n such that Y = g(X).

Every random variable X induces a probability measure μX (the distribution of X) on R^n:

μX(B) = P(X^{−1}(B)).

If ∫_Ω |X(ω)| dP(ω) < ∞, the expectation of X w.r.t. P is defined by

E[X] = ∫_Ω X(ω) dP(ω) = ∫_{R^n} x dμX(x).

For a continuous random variable X ≥ 0,

E[X] = ∫_0^∞ P(X > λ) dλ.

For a discrete (nonnegative integer-valued) random variable X,

E[X] = Σ_{n=1}^∞ P(X ≥ n).

The p-th moment of X is defined as (when the integrals are well defined)

E[X^p] = ∫_Ω X^p(ω) dP(ω) = ∫_{R^n} x^p dμX(x).

The centered moments are defined by E[|X − E[X]|^p], p = 1, 2, . . .. When p = 2, the centered moment is also called the variance.

A.2 Conditional expectation

Let (Ω, F, P) be a probability space, G ⊆ F a sub-σ-algebra, and X an integrable random variable (i.e., E[|X|] < ∞).

Definition A.2.1 (conditional expectation) The conditional expectation of X given G, denoted by E[X|G], is a random variable Y such that

• Y is G-measurable;
• ∫_A Y dP = ∫_A X dP for all A ∈ G.

Example A.2.2 Let Ω = [0, 1), F = B([0, 1)), and let P be Lebesgue measure. Then a random variable is simply a Borel measurable function X : [0, 1) → R. Let G = {∅, [0, 1/2), [1/2, 1), [0, 1)}. Then

E[X|G](x) = 2 ∫_0^{1/2} X(y) dy for x ∈ [0, 1/2), and E[X|G](x) = 2 ∫_{1/2}^1 X(y) dy for x ∈ [1/2, 1).

In other words, the conditional expectation simply performs a “partial average” over the partition of the underlying probability space.

Example A.2.3 Let {Λi}_{i=1}^N be a disjoint partition of Ω:

∪_{i=1}^N Λi = Ω, Λi ∩ Λj = ∅ (i ≠ j).

Suppose that P(Λi) > 0 for i = 1, 2, . . . , N and G = σ(Λ1, . . . , ΛN). Then a version of the conditional expectation of an integrable random variable X is

E[X|G](ω) = Σ_{i=1}^N 1_{Λi}(ω) E[1_{Λi} X] / P(Λi).

When ω ∈ Λj (j = 1, 2, . . . , N), this gives

E[X|G](ω) = E[1_{Λj} X] / P(Λj).
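The partition examples above can be checked numerically. The following sketch (our own illustration; the random variable X and the two-cell partition are arbitrary choices) builds E[X|G] as a partial average over cells and verifies its defining property together with E[E[X|G]] = E[X]:

```python
import numpy as np

rng = np.random.default_rng(1)
omega = rng.uniform(0.0, 1.0, size=100_000)   # samples from Omega = [0, 1)
X = np.sin(2.0 * np.pi * omega)               # an arbitrary random variable X(omega)

# Partition generated by {[0, 1/2), [1/2, 1)}: E[X|G] is the average of X
# over whichever cell omega falls in ("partial average").
cell = (omega >= 0.5).astype(int)
cond_exp = np.array([X[cell == i].mean() for i in (0, 1)])[cell]

# Defining property: the integral of E[X|G] over each cell A equals the
# integral of X over A (here, empirical means over the cell).
for i in (0, 1):
    a = cell == i
    assert abs(cond_exp[a].mean() - X[a].mean()) < 1e-12
# E[E[X|G]] = E[X].
assert abs(cond_exp.mean() - X.mean()) < 1e-12
```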

A.2.1 Properties of conditional expectation

• (Linearity) E[aX + bY |G] = aE[X|G] + bE[Y |G].
• E[E[X|G]] = E[X] for any G.
• If X is G-measurable, then E[X|G] = X.
• If X is independent of G, then E[X|G] = E[X]. If H is independent of σ(σ(X), G), then E[X|σ(H, G)] = E[X|G].
• (“Taking out what is known”) If E[XY] is well defined and Y is G-measurable, then E[XY |G] = Y E[X|G].
• (Tower property) If E[X] is well defined (or simply E[|X|] < ∞) and H ⊆ G, then E[E[X|G]|H] = E[X|H] = E[E[X|H]|G].
• (Conditional Jensen's inequality) If φ is convex and E[|φ(X)|] < ∞, then φ(E[X|G]) ≤ E[φ(X)|G].


Theorem A.2.4 Consider a sequence of random variables {Xn} on a probability space (Ω, F, P), and let G ⊆ F be a σ-algebra. Then the following results hold.

• (Conditional monotone convergence theorem) If Xn ≥ 0 and {Xn} is increasing with limit X, then E[Xn|G] is increasing and lim_{n→∞} E[Xn|G] = E[X|G] a.s.
• (Conditional Fatou lemma) If Xn ≥ 0, then lim inf_{n→∞} E[Xn|G] ≥ E[lim inf_{n→∞} Xn|G] a.s.
• (Conditional dominated convergence theorem) If |Xn| ≤ Y with E[Y] < ∞ and Xn → X a.s., then E[|Xn − X| |G] → 0 a.s. and lim_{n→∞} E[Xn|G] = E[X|G] a.s.

Theorem A.2.5 (Best estimator/predictor) Let Z be a σ(Y)-measurable, square-integrable random variable, and let X be square-integrable (E[X²] < ∞). Then

E[|X − E[X|Y]|²] ≤ E[|X − Z|²].

A.2.2 Filtration and Martingales

On a probability space (Ω, F, P), a filtration refers to an increasing sequence of σ-algebras:

F0 ⊆ F1 ⊆ F2 ⊆ · · · ⊆ Fn ⊆ · · · .

The natural filtration (w.r.t. X) is the smallest filtration that contains the information of X: it is generated by X, with F_n^X = σ(X1, . . . , Xn) and F_0^X = {∅, Ω}. If lim_{n→∞} Fn ⊆ F, then we call (Ω, F, {Fn}_{n≥1}, P) a filtered probability space. A stochastic process {Xn} on a filtered probability space is an adapted process if Xn is Fn-measurable for each n.

Definition A.2.6 (martingale) The process and filtration {(Xn, Fn)} is called a martingale if for each n

• Xn is Fn-measurable;
• E[|Xn|] < ∞;
• E[Xn+1|Fn] = Xn.

A submartingale is defined by replacing the third condition with E[Xn+1|Fn] ≥ Xn; a supermartingale is defined by replacing it with E[Xn+1|Fn] ≤ Xn.


A.3 Continuous time stochastic process

Definition A.3.1 Let (Ω, F, P) be a probability space and let T ⊆ R denote time. A collection of random variables Xt, t ∈ T, with values in R is called a stochastic process.

If Xt takes values in S = R^d, it is called a vector-valued stochastic process (often abbreviated as stochastic process). If T is a discrete subset of R, then Xt is called a discrete time stochastic process. If T is an interval, R+ or R, it is called a continuous time stochastic process. For any fixed ω ∈ Ω, one can regard Xt(ω) as a function of t, called a sample function of the stochastic process.

Definition A.3.2 A stochastic process is measurable if X : Ω × T → S is measurable with respect to the product σ-algebra F ⊗ B(T).

Definition A.3.3 (filtration) A family of sub-σ-algebras Ft ⊆ F indexed by t ∈ [0, ∞) is called a filtration if it is increasing: Fs ⊆ Ft whenever 0 ≤ s ≤ t < ∞.

A stochastic process X is adapted to the filtration {Ft}_{t∈[0,∞)} if Xt is Ft-measurable for every t ∈ [0, ∞). We assume that the filtration {Ft}_{t∈[0,∞)} satisfies the so-called usual conditions, i.e.,

• F0 contains all the P-negligible sets (hence so does every Ft);
• the filtration {Ft}_{t∈[0,∞)} is right-continuous, i.e., Ft = F_{t+} := ∩_{s>t} Fs = ∩_{n=1}^∞ F_{t+1/n}.

Let X(t, ω) : T × Ω → R be a stochastic process. For a fixed ω0 ∈ Ω, we call X(t, ω0) a realization (a sample) of the stochastic process X(t).

We say a stochastic process is mean-square continuous if

lim_{ε→0} E[|X(t + ε) − X(t)|²] = 0.


B

Semi-analytical methods for SPDEs

Here we recall some semi-analytical methods for obtaining solutions of SPDEs, especially stochastic transformation methods and integrating factor methods, which transform SPDEs into deterministic PDEs. For integrating factor methods, we refer to Chapter 3.

Consider the following stochastic Burgers equation on (0, T] × (0, 1):

∂t u + u ∂x u = μ ∂x² u + σ(t, x) Ẇ(t), u(0, x) = u0(x), u(t, 0) = u(t, 1).   (B.0.1)

Here W(t) is a standard Brownian motion and μ > 0. If σ(t, x) = σ(t) depends only on t, it can be readily checked that the solution is

u(t, x) = v(t, Y(t, x)) + ∫_0^t σ(s) dW(s),   (B.0.2)

where Y(t, x) = x − ∫_0^t ∫_0^s σ(r) dW(r) ds and v(t, x) satisfies the deterministic Burgers equation

∂t v + v ∂x v = μ ∂x² v, v(0, x) = u0(x), v(t, 0) = v(t, 1).   (B.0.3)

The solution to the following Stratonovich Burgers equation on (0, T] × (0, 1)

∂t u + (u + σ Ẇ(t)) ∘ ∂x u = μ ∂x² u, u(0, x) = u0(x), u(t, 0) = u(t, 1)   (B.0.4)

is given by

u(t, x) = v(t, x − σ W(t)),   (B.0.5)

where v(t, x) satisfies Equation (B.0.3). Since we usually do not have analytical solutions to Equation (B.0.3), we first find a numerical solution for v in Equation (B.0.3) and then obtain the solution to Equation (B.0.1) using (B.0.2) and the solution to (B.0.4) using (B.0.5). For the above stochastic Burgers equation, the methodology exploits (stochastic) analytical transforms together with numerical methods and is thus called semi-analytical. This approach is very efficient when periodic boundary conditions are imposed, since no SPDEs are solved directly.

Here are more equations that can be transformed into deterministic ones. The stochastic Korteweg-de Vries (KdV) equation

∂t u + u ∂x u + μ ∂x³ u + γ u = f(t, ω)   (B.0.6)

can be transformed into the following KdV equation on (0, T]

∂t v + v ∂x v + μ ∂x³ v + γ v = 0,   (B.0.7)

if we let u(t, x) = v(t, x − ∫_0^t g(s, ω) ds) + g(t, ω), where g(t, ω) = e^{−γt} ∫_0^t e^{γs} f(s, ω) ds. Here f(t, ω) can be very rough, e.g., white noise. The Navier-Stokes equation with additive random forcing can also be transformed into a deterministic one. Through the substitutions u(t, x) = U(t, x − ∫_0^t g(s, ω) ds) + g(t, ω) and p(t, x) = P(t, x − ∫_0^t g(s, ω) ds), where g(t, ω) = ∫_0^t f(s, ω) ds, we obtain from

∂t u + (u · ∇)u = μ Δu − ∇p + f(t, ω)   (B.0.8)

that

∂t U + (U · ∇)U = μ ΔU − ∇P.   (B.0.9)

Though the boundary conditions for (B.0.8) may differ from those for (B.0.9), the initial conditions are the same. Moreover, if we are given periodic boundary conditions for (B.0.8), we can still apply periodic boundary conditions to (B.0.9).

For multiplicative noise, we can apply similar techniques if the noise is only time-dependent. Consider, for example, the advection-diffusion equation

∂t u + f(t, ω) ∂x u = μ ∂x² u, μ ≥ 0.

Let u(t, x, ω) = v(t, x − ∫_0^t f(s, ω) ds). Then we have

∂t v = μ ∂x² v.

Here again the noise f(t, ω) can be rough, e.g., white noise. However, if f(t, ω) is white noise, we have to interpret the product f(t, ω) ∂x u in the Stratonovich sense, f(t, ω) ∘ ∂x u, as in (B.0.4).


C

Gauss quadrature

We recall some basic facts about Gauss quadrature, which is used in the construction of sparse grid collocation methods.

C.1 Gauss quadrature

Definition C.1.1 Suppose that

I(f) = ∫_a^b f(x) dx ≈ In(f) = Σ_{k=0}^n A_k f(x_k).

When In(f) has polynomial exactness 2n + 1, we call it the Gauss-Legendre quadrature rule, and the corresponding points x_k, k = 0, 1, . . . , n, are called Gauss-Legendre points.

I(f) ≈ In(f) has polynomial exactness 2n + 1 if and only if

∫_a^b x^i dx = Σ_{k=0}^n A_k x_k^i, i = 0, 1, . . . , 2n + 1.

Example C.1.2 Find A0, A1 and x0, x1 such that the following rule is a Gauss quadrature:

∫_{−1}^1 f(x) dx ≈ A0 f(x0) + A1 f(x1).

Solution. When n = 1, we need polynomial exactness 2n + 1 = 3. Thus


f(x) = 1: A0 + A1 = ∫_{−1}^1 1 dx = 2,
f(x) = x: A0 x0 + A1 x1 = ∫_{−1}^1 x dx = 0,
f(x) = x²: A0 x0² + A1 x1² = ∫_{−1}^1 x² dx = 2/3,
f(x) = x³: A0 x0³ + A1 x1³ = ∫_{−1}^1 x³ dx = 0.

We then obtain A0 = A1 = 1, x0 = −1/√3, x1 = 1/√3. So the desired Gauss quadrature rule on [−1, 1] is

∫_{−1}^1 f(x) dx ≈ f(−1/√3) + f(1/√3).

In general, consider the Gauss-Legendre quadrature rule on [−1, 1]:

I(g) = ∫_{−1}^1 g(t) dt ≈ Σ_{k=0}^n A_k g(t_k).

The zeros of the Legendre polynomial P_{n+1}(t) are the Gauss quadrature points, and the Gauss quadrature weights are

A_k = ∫_{−1}^1 Π_{j=0, j≠k}^n (t − t_j)/(t_k − t_j) dt, k = 0, 1, . . . , n.

When n = 0: t0 = 0, A0 = 2, and

∫_{−1}^1 g(t) dt ≈ 2 g(0).

When n = 1: t0 = −1/√3, t1 = 1/√3, A0 = A1 = 1, and

∫_{−1}^1 g(t) dt ≈ g(−1/√3) + g(1/√3).

When n = 2: t0 = −√(3/5), t1 = 0, t2 = √(3/5), A0 = 5/9, A1 = 8/9, A2 = 5/9, and

∫_{−1}^1 g(t) dt ≈ (5/9) g(−√(3/5)) + (8/9) g(0) + (5/9) g(√(3/5)).
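These nodes and weights can be reproduced with a standard library routine; the sketch below (assuming NumPy's `numpy.polynomial.legendre.leggauss`) also illustrates the exactness 2n + 1 for the two-point rule:

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

# Two-point rule (n + 1 = 2): nodes ±1/√3, weights 1.
t2, w2 = leggauss(2)
assert np.allclose(t2, [-1 / np.sqrt(3), 1 / np.sqrt(3)])
assert np.allclose(w2, [1.0, 1.0])

# Three-point rule (n + 1 = 3): nodes ±√(3/5), 0; weights 5/9, 8/9, 5/9.
t3, w3 = leggauss(3)
assert np.allclose(t3, [-np.sqrt(3 / 5), 0.0, np.sqrt(3 / 5)])
assert np.allclose(w3, [5 / 9, 8 / 9, 5 / 9])

# Polynomial exactness 2n + 1 = 3: the 2-point rule integrates x³ exactly ...
assert abs(np.dot(w2, t2**3) - 0.0) < 1e-14
# ... but not x⁴ (the exact integral over [-1, 1] is 2/5).
assert abs(np.dot(w2, t2**4) - 2 / 5) > 1e-3
```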


Remark C.1.3 (Gauss quadrature rule on [a, b]) Using the linear transformation x = (a + b)/2 + ((b − a)/2) t, we have

I(f) = ∫_a^b f(x) dx = ∫_{−1}^1 ((b − a)/2) f((a + b)/2 + ((b − a)/2) t) dt.

By the Gauss quadrature rule on [−1, 1], we have

In(f) = Σ_{k=0}^n ((b − a)/2) A_k f((a + b)/2 + ((b − a)/2) t_k).

Quadrature rules for integration with weights. Consider the integral

I(f) = ∫_a^b ρ(x) f(x) dx, ρ(x) ∈ C(a, b),

where f(x) has a sufficient number of derivatives over [a, b] (is smooth enough). The weight function ρ(x) satisfies the following conditions:

1. ρ(x) ≥ 0 for x ∈ (a, b);
2. ∫_a^b ρ(x) dx > 0;
3. ∫_a^b x^k ρ(x) dx is well defined for k = 0, 1, 2, . . ..

Recall the definition of polynomial exactness for the quadrature rule

∫_a^b ρ(x) f(x) dx ≈ Σ_{k=0}^n A_k f(x_k).   (C.1.1)

We say that the quadrature rule (C.1.1) has polynomial exactness m when (C.1.1) is exact for f(x) = 1, x, x², . . . , x^m but not for f(x) = x^{m+1}. We call a quadrature rule a Gauss quadrature rule when the polynomial exactness is 2n + 1.

Example C.1.4 Consider the integral with weight I(f) = ∫_0^1 f(x)/√x dx and the rule

I(f) ≈ A f(1/5) + B f(1).

Find A, B to make the polynomial exactness as high as possible.

Solution. Here we have two unknowns, so we can ask for at least polynomial exactness 1. When f(x) = 1, I(f) = ∫_0^1 (1/√x) dx = 2 = A + B. When f(x) = x, I(f) = ∫_0^1 (x/√x) dx = 2/3 = A/5 + B. Thus

A + B = 2, A/5 + B = 2/3.


This gives A = 5/3, B = 1/3. The quadrature rule is

I(f) ≈ (5/3) f(1/5) + (1/3) f(1).

When f(x) = x², I(f) = ∫_0^1 (x²/√x) dx = 2/5, and the quadrature rule gives the same value: (5/3)(1/5)² + (1/3)(1)² = 2/5.

When f(x) = x³, I(f) = ∫_0^1 (x³/√x) dx = 2/7, but the quadrature rule gives a different value: (5/3)(1/5)³ + (1/3)(1)³ = 26/75. The polynomial exactness is therefore 2.

One fundamental theorem for Gauss quadrature is that the Gauss quadrature points are exactly the n + 1 zeros of the (n + 1)-th order orthogonal polynomial with respect to the weight ρ(x). For example, when ρ = 1, the orthogonal polynomials are the Legendre polynomials, and the Gauss quadrature points are exactly the zeros of the (n + 1)-th order Legendre polynomial. When ρ(x) = (1 − x)^α (1 + x)^β (α, β > −1), the corresponding orthogonal polynomial is called the Jacobi polynomial, and the quadrature is then called Gauss-Jacobi quadrature. When ρ(x) = exp(−x²/2), the corresponding orthogonal polynomial is called the Hermite polynomial and the quadrature is called Gauss-Hermite quadrature.

Gauss-Lobatto quadrature

Definition C.1.5 Suppose that a positive function ρ(x) satisfies the conditions in Section C.1, and

I(f) = ∫_a^b f(x) ρ(x) dx ≈ In(f) = Σ_{k=0}^n A_k f(x_k), x0 = a, xn = b.

When In(f) has polynomial exactness 2n − 1, we call it the Gauss-Lobatto quadrature rule, and the corresponding points x_k, k = 0, 1, . . . , n, are called Gauss-Lobatto quadrature points.

When ρ(x) = 1, the Gauss-Lobatto quadrature is called Gauss-Legendre-Lobatto quadrature. The Gauss-Lobatto quadrature points are the zeros of (1 − x²) ∂x P_n(x), where P_n(x) is the n-th order Legendre polynomial.
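For example, the Gauss-Legendre-Lobatto points for n = 4 can be computed from the zeros of (1 − x²) P4′(x); the sketch below (assuming NumPy's polynomial module; weights omitted) recovers the closed-form values ±1, ±√(3/7), 0:

```python
import numpy as np
from numpy.polynomial.legendre import Legendre

n = 4
Pn = Legendre.basis(n)                 # Legendre polynomial P_4
interior = Pn.deriv().roots()          # zeros of P_4'(x)
# Lobatto points: the endpoints ±1 together with the interior zeros.
nodes = np.concatenate(([-1.0], np.sort(interior.real), [1.0]))

# Closed-form values for n = 4: P_4'(x) ∝ x(7x² - 3), so ±√(3/7) and 0.
assert np.allclose(nodes, [-1.0, -np.sqrt(3 / 7), 0.0, np.sqrt(3 / 7), 1.0])
```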

C.2 Gauss-Hermite quadrature

Let ψ(y), y ∈ R, be a smooth function and let Qnψ be the Gauss-Hermite quadrature applied to ψ, i.e.,

E[ψ(ξ)] = (2π)^{−1/2} ∫_R ψ(y) exp(−y²/2) dy ≈ Qnψ := Σ_{i=1}^n ψ(y_i) w_i,   (C.2.1)

where ξ is a standard Gaussian random variable, y_i = y_i^{(n)}, i = 1, 2, . . . , n, are the roots of the n-th Hermite polynomial

H_n(y) = (−1)^n exp(y²/2) (d^n/dy^n) exp(−y²/2),

and the associated weights w_i = w_i^{(n)} are given by

w_i = n! / (n² [H_{n−1}(y_i)]²).   (C.2.2)

For instance,

n = 1: w1 = 1, y1 = 0;   (C.2.3)
n = 2: w1 = w2 = 1/2, y1 = −1, y2 = 1;
n = 3: w1 = w3 = 1/6, w2 = 2/3, y1 = −√3, y2 = 0, y3 = √3.

The quadrature Qnψ is exactly equal to E[ψ(ξ)] for polynomials ψ(y) of degree ≤ 2n − 1.
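The nodes and weights in (C.2.3) can be reproduced with NumPy's probabilists' Hermite routine `numpy.polynomial.hermite_e.hermegauss` (a sketch; note that its weights integrate the weight exp(−y²/2) and therefore sum to √(2π), so we normalize to match the convention above):

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

y, w = hermegauss(3)
w = w / np.sqrt(2.0 * np.pi)      # normalize so the weights sum to 1

# Matches the n = 3 row of (C.2.3).
assert np.allclose(y, [-np.sqrt(3.0), 0.0, np.sqrt(3.0)])
assert np.allclose(w, [1 / 6, 2 / 3, 1 / 6])

# Exact for degree <= 2n - 1 = 5: e.g., E[xi^4] = 3 for xi ~ N(0, 1).
assert abs(np.dot(w, y**4) - 3.0) < 1e-12
```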


D

Some useful inequalities and lemmas

We give a summary of basic inequalities and lemmas that we use in the book.

Cauchy-Schwarz inequality (a.k.a. Cauchy inequality). If f, g ∈ L²(D), then fg ∈ L¹(D) and

∫_D fg ≤ (∫_D f²)^{1/2} (∫_D g²)^{1/2}.

Hölder inequality. If 1/p + 1/q = 1 with p, q ≥ 1, and f ∈ Lp(D), g ∈ Lq(D), then fg ∈ L¹(D) and

∫_D |fg| ≤ (∫_D |f|^p)^{1/p} (∫_D |g|^q)^{1/q}.

Young inequality. If 1/p + 1/q = 1 with p, q > 1, then for all a, b ≥ 0,

ab ≤ a^p/p + b^q/q.

Gronwall inequality (a.k.a. Gronwall's lemma, Gronwall-Bellman inequality). Assume that u(t), k(t), ϕ(t) ≥ 0 are continuous on [t0, T] and

u(t) ≤ ϕ(t) + ∫_{t0}^t k(s) u(s) ds for all t ∈ [t0, T].

Then u(t), t ∈ [t0, T], is bounded by

u(t) ≤ ϕ(t) + ∫_{t0}^t k(s) ϕ(s) exp(∫_s^t k(θ) dθ) ds.


If ϕ(t) is a constant ϕ, then

u(t) ≤ ϕ exp(∫_{t0}^t k(s) ds).

Nonlinear Gronwall inequality. Assume that u(t), k(t) ≥ 0 are continuous on [t0, T] and

u²(t) ≤ M + ∫_{t0}^t k(s) u(s) ds, M ≥ 0, for all t ∈ [t0, T].

Then u(t), t ∈ [t0, T], is bounded by

u(t) ≤ √M + (1/2) ∫_{t0}^t k(s) ds.

Discrete Gronwall inequality. Assume that un, Kn, and kn are nonnegative sequences and

u_n ≤ K_n + Σ_{j=0}^{n−1} k_j u_j, n ≥ 0.

Then it holds that for n ≥ 0,

u_n ≤ K_n + Σ_{j=0}^{n−1} k_j K_j Π_{j<i<n} (1 + k_i).

Poincaré inequality. Let p ≥ 1 and let Ω be a bounded subset of R^d. Then there exists a constant C, depending only on Ω and p, such that for any u ∈ W_0^{1,p}(Ω),

‖u‖_{Lp(Ω)} ≤ C ‖∇u‖_{Lp(Ω)}.

Here the constant C is called the Poincaré constant.

Littlewood-Paley inequality [301]. Suppose that f(x) = Σ_{k=1}^∞ s_k m_k(x) exists in L²([a, b]), with a and b finite. If f ∈ Lp([a, b]), 1 < p < ∞, then there exist constants L > 0 and M > 0 such that

L ‖f‖_{Lp} ≤ ‖(Σ_{k=1}^∞ s_k² m_k²(x))^{1/2}‖_{Lp} ≤ M ‖f‖_{Lp}.

The function inside the norm in the middle term is called the Littlewood-Paley function.

Markov inequality. Assume that φ is a monotonically increasing function from the nonnegative reals to the nonnegative reals. If X is a random variable, E[φ(|X|)] < ∞, c > 0, and φ(c) > 0, then

P(|X| ≥ c) ≤ E[φ(|X|)] / φ(c).


Chebyshev inequality. Let X be a random variable with mean μ = E[X], |μ| < ∞, and variance Var[X] = σ² < ∞. Then for any real number c > 0,

P(|X − μ| ≥ cσ) ≤ 1/c².

Jensen inequality. If X is a random variable, φ is a convex function, and E[|φ(X)|] < ∞, then

φ(E[X]) ≤ E[φ(X)].

An example of a convex function in this book is φ(x) = |x|^p, p ≥ 1.

Central limit theorem. Let the Xi be i.i.d. (independent and identically distributed) with E[Xi²] < ∞, and let μ = E[X1], σ² = Var(X1), and Sn = Σ_{i=1}^n Xi. Then

P{a ≤ (Sn − nμ)/(σ√n) ≤ b} → ∫_a^b (1/√(2π)) e^{−x²/2} dx, as n → ∞.

Borel-Cantelli lemma. Let {An} be a sequence of events in a probability space. If Σ_{n=1}^∞ P(An) < ∞, then

P(lim sup_{n→∞} An) = 0.

Recall that lim sup_{n→∞} An = ∩_{n=1}^∞ ∪_{k≥n} Ak.

Burkholder-Davis-Gundy inequality. For any 1 ≤ p < ∞, there exist constants cp, Cp > 0 such that for all (local) martingales X with X0 = 0 and all stopping times τ, the following inequality holds:

cp E[[X]_τ^{p/2}] ≤ E[(X*_τ)^p] ≤ Cp E[[X]_τ^{p/2}].

Here X*_t = sup_{s≤t} |X_s| is the maximum process of X and [X] is the quadratic variation of X. Furthermore, for continuous (local) martingales, this statement holds for all 0 < p < ∞.

Fubini theorem. (More precisely, this is the Fubini-Tonelli theorem, but it is often called the Fubini theorem.) Consider two σ-finite measure spaces (X, E, μ) and (Y, F, ν), and the product measure space (X × Y, E ⊗ F, π). If f is a measurable function on the product measure space such that any one of the three integrals

∫_{X×Y} |f(x, y)| π(dx dy), ∫_X [∫_Y |f(x, y)| ν(dy)] μ(dx), ∫_Y [∫_X |f(x, y)| μ(dx)] ν(dy)

is finite, then

∫_{X×Y} f(x, y) π(dx dy) = ∫_X [∫_Y f(x, y) ν(dy)] μ(dx) = ∫_Y [∫_X f(x, y) μ(dx)] ν(dy).

Recall that a measure defined on a σ-algebra of subsets of a set X is called σ-finite if X is a countable union of measurable sets with finite measure.


E

Computation of convergence rate

Suppose that gn is a good approximation of f, say ‖f − gn‖ ∼ C n^{−r} (i.e., ‖f − gn‖ is proportional to n^{−r}), where r > 0 and C does not depend on n. To determine the convergence rate of an approximation method, we can use the formula

log(‖f − g_{n2}‖ / ‖f − g_{n1}‖) / log(n2/n1).

Denote En = ‖f − gn‖ and suppose that En ∼ C n^{−r}. We then have

E_{n2}/E_{n1} ∼ (n2/n1)^{−r}.

Taking the logarithm of both sides leads to the formula above; note that it yields −r, so the convergence rate r is the negative of the computed value.

When f is not known, we can replace f with some fN obtained with a numerical method, where N is sufficiently large that f − fN is much smaller than fN − gn:

‖f − gn‖ = ‖(f − fN) + (fN − gn)‖ ≈ ‖fN − gn‖.

We call this fN a reference solution and measure the convergence rate by

log(‖fN − g_{n2}‖ / ‖fN − g_{n1}‖) / log(n2/n1), n1, n2 ≪ N.
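For instance, the sketch below (our own illustration) applies this formula to forward Euler for y' = −y, y(0) = 1, whose error behaves like E_n ∼ C n^{−1}; the computed value is close to −1, i.e., r = 1:

```python
import numpy as np

def euler_error(n, T=1.0):
    # Forward Euler error at t = T for y' = -y, y(0) = 1.
    h, y = T / n, 1.0
    for _ in range(n):
        y += h * (-y)
    return abs(y - np.exp(-T))

n1, n2 = 100, 200
rate = np.log(euler_error(n2) / euler_error(n1)) / np.log(n2 / n1)
# E_n ~ C n^{-1}, so the formula yields approximately -1 (r = 1).
assert abs(rate - (-1.0)) < 0.05
```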


References

1. A. Abdulle, A. Barth, C. Schwab, Multilevel Monte Carlo methods for stochastic elliptic multiscale PDEs. Multiscale Model. Simul. 11, 1033–1070 (2013)

2. M. Abramowitz, I.A. Stegun (eds.), Handbook of Mathematical Functions, with Formulas, Graphs, and Mathematical Tables (Dover, Mineola, 1972). 10th printing, with corrections

3. P. Acquistapace, B. Terreni, An approach to Ito linear equations in Hilbert spaces by approximation of white noise with coloured noise. Stoch. Anal. Appl. 2, 131–186 (1984)

4. I.A. Adamu, G.J. Lord, Numerical approximation of multiplicative SPDEs. Int. J. Comput. Math. 89, 2603–2621 (2012)

5. A. Alabert, I. Gyongy, On numerical approximation of stochastic Burgers' equation, in From Stochastic Calculus to Mathematical Finance (Springer, Berlin, 2006), pp. 1–15

6. E.J. Allen, S.J. Novosel, Z. Zhang, Finite element and difference approximation of some linear stochastic partial differential equations. Stoch. Stoch. Rep. 64, 117–142 (1998)

7. V.V. Anh, W. Grecksch, A. Wadewitz, A splitting method for a stochastic Goursat problem. Stoch. Anal. Appl. 17, 315–326 (1999)

8. L. Arnold, Stochastic Differential Equations: Theory and Applications (Wiley-Interscience, New York, 1974)

9. A. Ashyralyev, M. Akat, An approximation of stochastic hyperbolic equations: case with Wiener process. Math. Methods Appl. Sci. 36, 1095–1106 (2013)

10. R. Askey, J.A. Wilson, Some Basic Hypergeometric Orthogonal Polynomials that Generalize Jacobi Polynomials (American Mathematical Society, Providence, 1985)

11. I. Babuska, F. Nobile, R. Tempone, A stochastic collocation method for elliptic partial differential equations with random input data. SIAM J. Numer. Anal. 45, 1005–1034 (2007)

12. I. Babuska, R. Tempone, G.E. Zouraris, Galerkin finite element approximations of stochastic elliptic partial differential equations. SIAM J. Numer. Anal. 42, 800–825 (2004)

13. J. Back, F. Nobile, L. Tamellini, R. Tempone, Stochastic spectral Galerkin and collocation methods for PDEs with random coefficients: a numerical comparison, in Spectral and High Order Methods for Partial Differential Equations (Springer, Berlin, Heidelberg, 2011), pp. 43–62

14. C.T.H. Baker, E. Buckwar, Numerical analysis of explicit one-step methods for stochastic delay differential equations. LMS J. Comput. Math. 3, 315–335 (2000)

15. V. Bally, Approximation for the solutions of stochastic differential equations. I. Lp-convergence. Stoch. Stoch. Rep. 28, 209–246 (1989)

16. V. Bally, Approximation for the solutions of stochastic differential equations. II. Strong convergence. Stoch. Stoch. Rep. 28, 357–385 (1989)

17. V. Bally, Approximation for the solutions of stochastic differential equations. III. Jointly weak convergence. Stoch. Stoch. Rep. 30, 171–191 (1990)

18. V. Bally, A. Millet, M. Sanz-Sole, Approximation and support theorem in Holder norm for parabolic stochastic partial differential equations. Ann. Probab. 23, 178–222 (1995)

19. X. Bardina, M. Jolis, L. Quer-Sardanyons, Weak convergence for the stochastic heat equation driven by Gaussian white noise. Electron. J. Probab. 15(39), 1267–1295 (2010)

20. X. Bardina, I. Nourdin, C. Rovira, S. Tindel, Weak approximation of a fractional SDE. Stoch. Process. Appl. 120, 39–65 (2010)

21. A. Barth, A. Lang, Milstein approximation for advection-diffusion equations driven by multiplicative noncontinuous martingale noises. Appl. Math. Optim. 66, 387–413 (2012)

22. A. Barth, A. Lang, Simulation of stochastic partial differential equations using finite element methods. Stochastics 84, 217–231 (2012)

23. A. Barth, A. Lang, Lp and almost sure convergence of a Milstein scheme for stochastic partial differential equations. Stoch. Process. Appl. 123, 1563–1587 (2013)

24. A. Barth, A. Lang, C. Schwab, Multilevel Monte Carlo method for parabolic stochastic partial differential equations. BIT Numer. Math. 53, 3–27 (2013)

25. A. Barth, C. Schwab, N. Zollinger, Multi-level Monte Carlo finite element method for elliptic PDEs with stochastic coefficients. Numer. Math. 119, 123–161 (2011)

26. M. Barton-Smith, A. Debussche, L. Di Menza, Numerical study of two-dimensional stochastic NLS equations. Numer. Methods Partial Differ. Equ. 21, 810–842 (2005)

27. C. Bauzet, On a time-splitting method for a scalar conservation law with a multiplicative stochastic perturbation and numerical experiments. J. Evol. Equ. 14, 333–356 (2014)

28. C. Bayer, P.K. Friz, Cubature on Wiener space: pathwise convergence. Appl. Math. Optim. 67, 261–278 (2013)

29. S. Becker, A. Jentzen, P.E. Kloeden, An exponential Wagner-Platen type scheme for SPDEs. SIAM J. Numer. Anal. 54, 2389–2426 (2016)

30. G. Ben Arous, M. Gradinaru, M. Ledoux, Holder norms and the support theorem for diffusions. Ann. Inst. H. Poincare Probab. Stat. 30, 415–436 (1994)

31. A. Bensoussan, R. Glowinski, A. Rascanu, Approximation of the Zakai equation by the splitting up method. SIAM J. Control Optim. 28, 1420–1431 (1990)

32. A. Bensoussan, R. Glowinski, A. Rascanu, Approximation of the Zakai equation by the splitting up method. SIAM J. Control Optim. 28, 1420–1431 (1990)

33. A. Bensoussan, R. Glowinski, A. Rascanu, Approximation of some stochastic differential equations by the splitting up method. Appl. Math. Optim. 25, 81–106 (1992)

34. F.E. Benth, J. Gjerde, Convergence rates for finite element approximations of stochastic partial differential equations. Stoch. Stoch. Rep. 63, 313–326 (1998)

35. M. Bieri, C. Schwab, Sparse high order FEM for elliptic SPDEs. Comput. Methods Appl. Mech. Eng. 198, 1149–1170 (2009)

36. D. Blomker, Amplitude equations for stochastic partial differential equations, in Interdisciplinary Mathematical Sciences, vol. 3 (World Scientific Publishing Co. Pte. Ltd., Hackensack, NJ, 2007)

37. D. Blomker, A. Jentzen, Galerkin approximations for the stochastic Burgers equation. SIAM J. Numer. Anal. 51, 694–715 (2013)

38. D. Blomker, M. Kamrani, S.M. Hosseini, Full discretization of the stochastic Burgers equation with correlated noise. IMA J. Numer. Anal. 33(3), 825–848 (2013)

39. F. Bonizzoni, F. Nobile, Perturbation analysis for the Darcy problem with log-normal permeability. SIAM/ASA J. Uncertain. Quant. 2, 223–244 (2014)

40. N. Bou-Rabee, E. Vanden-Eijnden, A patch that imparts unconditional stability to explicit integrators for Langevin-like equations. J. Comput. Phys. 231, 2565–2580 (2012)

41. H. Breckner, Approximation of the solution of the stochastic Navier-Stokes equation. Optimization 49, 15–38 (2001)

42. N. Bruti-Liberati, E. Platen, Strong predictor-corrector Euler methods for stochastic differential equations. Stoch. Dyn. 8, 561–581 (2008)

43. Z. Brzezniak, M. Capinski, F. Flandoli, Stochastic partial differential equations and turbulence. Math. Models Methods Appl. Sci. 1, 41–59 (1991)

44. Z. Brzezniak, E. Carelli, A. Prohl, Finite-element-based discretizations of the incompressible Navier-Stokes equations with multiplicative random forcing. IMA J. Numer. Anal. 33, 771–824 (2013)

45. Z. Brzezniak, A. Carroll, Approximations of the Wong-Zakai type for stochastic differential equations in M-type 2 Banach spaces with applications to loop spaces, in Seminaire de Probabilites XXXVII (Springer, Berlin, 2003), pp. 251–289

46. Z. Brzezniak, F. Flandoli, Almost sure approximation of Wong-Zakai type for stochastic partial differential equations. Stoch. Process. Appl. 55, 329–358 (1995)

47. Z. Brzezniak, A. Millet, On the splitting method for some complex-valued quasilinear evolution equations, in Stochastic Analysis and Related Topics, ed. by L. Decreusefond, J. Najim (Springer, Berlin, 2012), pp. 57–90

48. R. Buckdahn, E. Pardoux, Monotonicity methods for white noise driven quasi-linear SPDEs, in Diffusion Processes and Related Problems in Analysis, Evanston, IL, 1989, vol. I (Birkhauser, Boston, 1990), pp. 219–233

49. E. Buckwar, T. Sickenberger, A comparative linear mean-square stability analysis of Maruyama- and Milstein-type methods. Math. Comput. Simul. 81, 1110–1127 (2011)

50. E. Buckwar, R. Winkler, Multistep methods for SDEs and their application to problems with small noise. SIAM J. Numer. Anal. 44, 779–803 (2006)

51. E. Buckwar, R. Winkler, Multi-step Maruyama methods for stochastic delay differential equations. Stoch. Anal. Appl. 25, 933–959 (2007)

52. A. Budhiraja, L. Chen, C. Lee, A survey of numerical methods for nonlinear filtering problems. Phys. D: Nonlinear Phenom. 230, 27–36 (2007)

53. A. Budhiraja, G. Kallianpur, Approximations to the solution of the Zakai equation using multiple Wiener and Stratonovich integral expansions. Stoch. Stoch. Rep. 56, 271–315 (1996)

54. A. Budhiraja, G. Kallianpur, The Feynman-Stratonovich semigroup and Stratonovich integral expansions in nonlinear filtering. Appl. Math. Optim. 35, 91–116 (1997)

55. A. Budhiraja, G. Kallianpur, Two results on multiple Stratonovich integrals. Stat. Sinica 7, 907–922 (1997)

56. R.E. Caflisch, Monte Carlo and quasi-Monte Carlo methods. Acta Numer. 7, 1–49 (1998)

57. R.H. Cameron, W.T. Martin, The orthogonal development of non-linear functionals in series of Fourier-Hermite functionals. Ann. Math. (2) 48, 385–392 (1947)

58. C. Canuto, M.Y. Hussaini, A. Quarteroni, T.A. Zang, Spectral Methods (Springer, Berlin, 2006)

59. W. Cao, Z. Zhang, Simulations of two-step Maruyama methods for nonlinear stochastic delay differential equations. Adv. Appl. Math. Mech. 4, 821–832 (2012)

60. W. Cao, Z. Zhang, On exponential mean-square stability of two-step Maruyama methods for stochastic delay differential equations. J. Comput. Appl. Math. 245, 182–193 (2013)

61. W. Cao, Z. Zhang, G.E. Karniadakis, Numerical methods for stochastic delay differential equations via Wong-Zakai approximation. SIAM J. Sci. Comput. 37(1), A295–A318 (2015)

62. Y. Cao, On convergence rate of Wiener-Ito expansion for generalized random variables. Stochastics 78, 179–187 (2006)

63. Y. Cao, Z. Chen, M. Gunzburger, Error analysis of finite element approximations of the stochastic Stokes equations. Adv. Comput. Math. 33, 215–230 (2010)

64. Y. Cao, H. Yang, L. Yin, Finite element methods for semilinear elliptic stochastic partial differential equations. Numer. Math. 106, 181–198 (2007)

65. Y. Cao, L. Yin, Spectral Galerkin method for stochastic wave equations driven by space-time white noise. Commun. Pure Appl. Anal. 6, 607–617 (2007)

66. Y. Cao, L. Yin, Spectral method for nonlinear stochastic partial differential equations of elliptic type. Numer. Math. Theory Methods Appl. 4, 38–52 (2011)

67. Y. Cao, R. Zhang, K. Zhang, Finite element and discontinuous Galerkin method for stochastic Helmholtz equation in two- and three-dimensions. J. Comput. Math. 26, 702–715 (2008)

68. Y. Cao, R. Zhang, K. Zhang, Finite element method and discontinuous Galerkin method for stochastic scattering problem of Helmholtz type in R^d (d = 2, 3). Potential Anal. 28, 301–319 (2008)

69. E. Carelli, A. Prohl, Rates of convergence for discretizations of the stochastic incompressible Navier-Stokes equations. SIAM J. Numer. Anal. 50, 2467–2496 (2012)

70. T. Cass, C. Litterer, On the error estimate for cubature on Wiener space. Proc. Edinb. Math. Soc. (2) 57, 377–391 (2014)

71. H.D. Ceniceros, G.O. Mohler, A practical splitting method for stiff SDEs with applications to problems with small noise. Multiscale Model. Simul. 6, 212–227 (2007)

72. M. Chaleyat-Maurel, D. Michel, A Stroock Varadhan support theorem in nonlinear filtering theory. Probab. Theory Relat. Fields 84, 119–139 (1990)

73. J. Charrier, Strong and weak error estimates for elliptic partial differential equations with random coefficients. SIAM J. Numer. Anal. 50, 216–246 (2012)

74. J. Charrier, A. Debussche, Weak truncation error estimates for elliptic PDEs with lognormal coefficients. Stoch. PDE: Anal. Comp. 1, 63–93 (2013)

75. J. Charrier, R. Scheichl, A.L. Teckentrup, Finite element error analysis of elliptic PDEs with random coefficients and its application to multilevel Monte Carlo methods. SIAM J. Numer. Anal. 51, 322–352 (2013)

76. G.-Q. Chen, Q. Ding, K.H. Karlsen, On nonlinear stochastic balance laws. Arch. Ration. Mech. Anal. 204, 707–743 (2012)

77. H. Cho, D. Venturi, G.E. Karniadakis, Statistical analysis and simulation of random shocks in stochastic Burgers equation. Proc. R. Soc. Lond. Ser. A Math. Phys. Eng. Sci. 470, 20140080, 21 pp. (2014)

78. P.L. Chow, J.-L. Jiang, J.-L. Menaldi, Pathwise convergence of approximate solutions to Zakai's equation in a bounded domain, in Stochastic Partial Differential Equations and Applications, Trento, 1990 (Longman Scientific & Technical, Harlow, 1992), pp. 111–123

79. I. Chueshov, A. Millet, Stochastic two-dimensional hydrodynamical systems: Wong-Zakai approximation and support theorem. Stoch. Anal. Appl. 29, 570–611 (2011)

80. P.G. Ciarlet, The Finite Element Method for Elliptic Problems (SIAM, Philadelphia, PA, 2002)

81. Z. Ciesielski, Holder conditions for realizations of Gaussian processes. Trans. Am. Math. Soc. 99, 403–413 (1961)

82. K.A. Cliffe, M.B. Giles, R. Scheichl, A.L. Teckentrup, Multilevel Monte Carlo methods and applications to elliptic PDEs with random coefficients. Comput. Vis. Sci. 14, 3–15 (2011)

83. F. Comte, V. Genon-Catalot, Y. Rozenholc, Penalized nonparametric mean square estimation of the coefficients of diffusion processes. Bernoulli 13, 514–543 (2007)

84. R. Courant, K.O. Friedrichs, Supersonic Flow and Shock Waves (Interscience Publishers, Inc., New York, 1948)

85. S. Cox, J. van Neerven, Convergence rates of the splitting scheme for parabolic linear stochastic Cauchy problems. SIAM J. Numer. Anal. 48, 428–451 (2010)

86. S. Cox, J. van Neerven, Pathwise Holder convergence of the implicit-linear Euler scheme for semi-linear SPDEs with multiplicative noise. Numer. Math. 125, 259–345 (2013)

87. D. Crisan, Exact rates of convergence for a branching particle approximation to the solution of the Zakai equation. Ann. Probab. 31, 693–718 (2003)

88. D. Crisan, Particle approximations for a class of stochastic partial differential equations. Appl. Math. Optim. 54, 293–314 (2006)

89. D. Crisan, J. Gaines, T. Lyons, Convergence of a branching particle method to the solution of the Zakai equation. SIAM J. Appl. Math. 58, 1568–1590 (1998)

90. D. Crisan, T. Lyons, A particle approximation of the solution of the Kushner-Stratonovitch equation. Probab. Theory Relat. Fields 115, 549–578 (1999)

91. D. Crisan, J. Xiong, Numerical solutions for a class of SPDEs over bounded domains, in Conference Oxford sur les methodes de Monte Carlo sequentielles (EDP Sciences, Les Ulis, 2007), pp. 121–125

92. G. Da Prato, Kolmogorov Equations for Stochastic PDEs (Birkhauser, Basel, 2004)

93. G. Da Prato, A. Debussche, R. Temam, Stochastic Burgers' equation. NoDEA Nonlinear Differ. Equ. Appl. 1, 389–402 (1994)

94. G. Da Prato, J. Zabczyk, Stochastic Equations in Infinite Dimensions (Cambridge University Press, Cambridge, 1992)

95. A.M. Davie, J.G. Gaines, Convergence of numerical schemes for the solution of parabolic stochastic partial differential equations. Math. Comp. 70, 121–134 (2001)

96. P.J. Davis, P. Rabinowitz, Methods of Numerical Integration. Computer Science and Applied Mathematics, 2nd edn. (Academic, Orlando, FL, 1984)

97. D.A. Dawson, Stochastic evolution equations. Math. Biosci. 15, 287–316 (1972)

98. D.A. Dawson, E.A. Perkins, Measure-valued processes and renormalization of branching particle systems, in Stochastic Partial Differential Equations: Six Perspectives, ed. by R.A. Carmona, B. Rozovskii (AMS, Providence, RI, 1999), pp. 45–106

99. A. de Bouard, A. Debussche, A stochastic nonlinear Schrodinger equation with multiplicative noise. Commun. Math. Phys. 205, 161–181 (1999)

100. A. De Bouard, A. Debussche, A semi-discrete scheme for the stochastic nonlinear Schrodinger equation. Numer. Math. 96, 733–770 (2004)

101. A. de Bouard, A. Debussche, Weak and strong order of convergence of a semidiscrete scheme for the stochastic nonlinear Schrodinger equation. Appl. Math. Optim. 54, 369–399 (2006)

102. A. de Bouard, A. Debussche, Random modulation of solitons for the stochastic Korteweg-de Vries equation. Ann. Inst. H. Poincare Anal. Non Lineaire 24, 251–278 (2007)

103. A. de Bouard, A. Debussche, The nonlinear Schrodinger equation with white noise dispersion. J. Funct. Anal. 259, 1300–1321 (2010)

104. A. de Bouard, A. Debussche, L. Di Menza, Theoretical and numerical aspects of stochastic nonlinear Schrodinger equations. Monte Carlo Methods Appl. 7, 55–63 (2001)

105. A. de Bouard, A. Debussche, Y. Tsutsumi, White noise driven Korteweg-de Vries equation. J. Funct. Anal. 169, 532–558 (1999)

106. A. de Bouard, A. Debussche, On the stochastic Korteweg-de Vries equation. J. Funct. Anal. 154, 215–251 (1998)

107. M.K. Deb, I.M. Babuska, J.T. Oden, Solution of stochastic partial differential equations using Galerkin finite element techniques. Comput. Methods Appl. Mech. Eng. 190, 6359–6372 (2001)

108. A. Debussche, The 2D-Navier-Stokes equations perturbed by a delta correlated noise, in Probabilistic Methods in Fluids (World Scientific Publishers, River Edge, NJ, 2003), pp. 115–129

109. A. Debussche, Weak approximation of stochastic partial differential equations: the nonlinear case. Math. Comp. 80, 89–117 (2011)

110. A. Debussche, J. Printems, Numerical simulation of the stochastic Korteweg-de Vries equation. Phys. D 134, 200–226 (1999)

111. A. Debussche, J. Printems, Convergence of a semi-discrete scheme for the stochastic Korteweg-de Vries equation. Discrete Contin. Dyn. Syst. Ser. B 6, 761–781 (2006)

112. A. Debussche, J. Printems, Weak order for the discretization of the stochastic heat equation. Math. Comp. 78, 845–863 (2009)

113. A. Debussche, J. Vovelle, Scalar conservation laws with stochastic forcing. J. Funct. Anal. 259, 1014–1042 (2010)

114. A. Deya, M. Jolis, L. Quer-Sardanyons, The Stratonovich heat equation: a continuity result and weak approximations. Electron. J. Probab. 18(3), 34 (2013)

115. J. Dick, F.Y. Kuo, I.H. Sloan, High-dimensional integration: the quasi-Monte Carlo way. Acta Numer. 22, 133–288 (2013)

116. P. Dorsek, Semigroup splitting and cubature approximations for the stochastic Navier-Stokes equations. SIAM J. Numer. Anal. 50, 729–746 (2012)

117. H. Doss, Liens entre equations differentielles stochastiques et ordinaires. C. R. Acad. Sci. Paris Ser. A-B 283, Ai, A939–A942 (1976)

118. Q. Du, T. Zhang, Numerical approximation of some linear stochastic partial differential equations driven by special additive noises. SIAM J. Numer. Anal. 40, 1421–1445 (2002)

119. J. Duan, W. Wang, Effective Dynamics of Stochastic Partial Differential Equations (Elsevier, Amsterdam, 2014)

120. Y. Duan, X. Yang, On the convergence of a full discretization scheme for the stochastic Navier-Stokes equations. J. Comput. Anal. Appl. 13, 485–498 (2011)

121. Y. Duan, X. Yang, The finite element method of a Euler scheme for stochastic Navier-Stokes equations involving the turbulent component. Int. J. Numer. Anal. Model. 10, 727–744 (2013)

122. M.A. El-Tawil, A.-H.A. El-Shikhipy, Approximations for some statistical moments of the solution process of stochastic Navier-Stokes equation using WHEP technique. Appl. Math. Inf. Sci. 6, 1095–1100 (2012)

123. H.C. Elman, C.W. Miller, E.T. Phipps, R.S. Tuminaro, Assessment of collocation and Galerkin approaches to linear diffusion equations with random data. Int. J. Uncertain. Quant. 1, 19–33 (2011)

124. O. Ernst, B. Sprungk, Stochastic collocation for elliptic PDEs with random data: the lognormal case, in Sparse Grids and Applications - Munich 2012, ed. by J. Garcke, D. Pfluger (Springer International Publishing, Cham, 2014), pp. 29–53

125. L.C. Evans, Partial Differential Equations (AMS, Providence, RI, 1998)

126. J. Feng, D. Nualart, Stochastic scalar conservation laws. J. Funct. Anal. 255, 313–373 (2008)

127. G. Ferreyra, A Wong-Zakai-type theorem for certain discontinuous semimartingales. J. Theor. Probab. 2, 313–323 (1989)

128. F. Flandoli, V.M. Tortorelli, Time discretization of Ornstein-Uhlenbeck equations and stochastic Navier-Stokes equations with a generalized noise. Stoch. Stoch. Rep. 55, 141–165 (1995)

129. W. Fleming, Distributed parameter stochastic systems in population biology, in Control Theory, Numerical Methods and Computer Systems Modelling, ed. by A. Bensoussan, J. Lions (Springer, Berlin, 1975), pp. 179–191

130. P. Florchinger, F. Le Gland, Time-discretization of the Zakai equation for diffusion processes observed in correlated noise, in Analysis and Optimization of Systems (Antibes, 1990) (Springer, Berlin, 1990), pp. 228–237

131. P. Florchinger, F. Le Gland, Time-discretization of the Zakai equation for diffusion processes observed in correlated noise. Stoch. Stoch. Rep. 35, 233–256 (1991)

132. P. Florchinger, F. Le Gland, Particle approximation for first order stochastic partial differential equations, in Applied Stochastic Analysis, New Brunswick, NJ, 1991 (Springer, Berlin, 1992), pp. 121–133

133. J. Foo, G.E. Karniadakis, Multi-element probabilistic collocation method in high dimensions. J. Comput. Phys. 229, 1536–1557 (2010)

134. J.-P. Fouque, J. Garnier, G. Papanicolaou, K. Sølna, Wave Propagation and Time Reversal in Randomly Layered Media. Stochastic Modelling and Applied Probability, vol. 56 (Springer, New York, 2007)

135. P. Frauenfelder, C. Schwab, R.A. Todor, Finite elements for elliptic problems with stochastic coefficients. Comput. Methods Appl. Mech. Eng. 194, 205–228 (2005)

136. M.I. Freidlin, Random perturbations of reaction-diffusion equations: the quasideterministic approximation. Trans. Am. Math. Soc. 305, 665–697 (1988)

137. P. Friz, H. Oberhauser, Rough path limits of the Wong-Zakai type with a modified drift term. J. Funct. Anal. 256, 3236–3256 (2009)

138. P. Friz, H. Oberhauser, On the splitting-up method for rough (partial) differential equations. J. Differ. Equ. 251, 316–338 (2011)

139. J.G. Gaines, Numerical experiments with S(P)DE's, in Stochastic Partial Differential Equations (Edinburgh, 1994) (Cambridge University Press, Cambridge, 1995), pp. 55–71

140. J. Galvis, M. Sarkis, Approximating infinity-dimensional stochastic Darcy's equations without uniform ellipticity. SIAM J. Numer. Anal. 47, 3624–3651 (2009)

141. J. Galvis, M. Sarkis, Regularity results for the ordinary product stochastic pressure equation. SIAM J. Math. Anal. 44, 2637–2665 (2012)

142. A. Ganguly, Wong-Zakai type convergence in infinite dimensions. Electron. J. Probab. 18(31), 34 (2013)

143. J. Garcia, Convergence of stochastic integrals and SDE's associated to approximations of the Gaussian white noise. Adv. Appl. Stat. 10, 155–177 (2008)

144. M. Gardner, White and brown music, fractal curves and one-over-f fluctuations. Sci. Am. 238, 16–27 (1978)

145. L. Gawarecki, V. Mandrekar, Stochastic Differential Equations in Infinite Dimensions with Applications to Stochastic Partial Differential Equations. Probability and its Applications (New York) (Springer, Heidelberg, 2011)

146. K. Gawedzki, M. Vergassola, Phase transition in the passive scalar advection. Phys. D 138, 63–90 (2000)

147. M. Geissert, M. Kovacs, S. Larsson, Rate of weak convergence of the finite element method for the stochastic heat equation with additive noise. BIT Numer. Math. 49, 343–356 (2009)

148. A. Genz, B.D. Keister, Fully symmetric interpolatory rules for multiple integrals over infinite regions with Gaussian weight. J. Comput. Appl. Math. 71, 299–309 (1996)

149. M. Gerencser, I. Gyongy, Finite difference schemes for stochastic partial differential equations in Sobolev spaces. Appl. Math. Optim. 72, 77–100 (2015)

150. A. Germani, L. Jetto, M. Piccioni, Galerkin approximation for optimal linear filtering of infinite-dimensional linear systems. SIAM J. Control Optim. 26, 1287–1305 (1988)

151. A. Germani, M. Piccioni, A Galerkin approximation for the Zakai equation, in System Modelling and Optimization (Copenhagen, 1983) (Springer, Berlin, 1984), pp. 415–423

152. A. Germani, M. Piccioni, Semidiscretization of stochastic partial differential equations on R^d by a finite-element technique. Stochastics 23, 131–148 (1988)

153. T. Gerstner, Sparse Grid Quadrature Methods for Computational Finance. Habilitation thesis, University of Bonn, 2007

154. T. Gerstner, M. Griebel, Numerical integration using sparse grids. Numer. Algorithms 18, 209–232 (1998)

155. R.G. Ghanem, P.D. Spanos, Stochastic Finite Elements: A Spectral Approach (Springer, New York, 1991)

156. M.B. Giles, Multilevel Monte Carlo path simulation. Oper. Res. 56, 607–617 (2008)

157. M.B. Giles, Multilevel Monte Carlo methods, in Monte Carlo and Quasi-Monte Carlo Methods 2012 (Springer, Berlin, 2013)

158. M.B. Giles, C. Reisinger, Stochastic finite differences and multilevel Monte Carlo for a class of SPDEs in finance. SIAM J. Financ. Math. 3, 572–592 (2012)

159. H. Gilsing, T. Shardlow, SDELab: a package for solving stochastic differential equations in MATLAB. J. Comput. Appl. Math. 205, 1002–1018 (2007)

160. C.J. Gittelson, J. Konno, C. Schwab, R. Stenberg, The multi-level Monte Carlo finite element method for a stochastic Brinkman problem. Numer. Math. 125, 347–386 (2013)

161. E. Gobet, G. Pages, H. Pham, J. Printems, Discretization and simulation of the Zakai equation. SIAM J. Numer. Anal. 44, 2505–2538 (2006)

162. N.Y. Goncharuk, P. Kotelenez, Fractional step method for stochastic evolution equations. Stoch. Process. Appl. 73, 1–45 (1998)

163. D. Gottlieb, S.A. Orszag, Numerical Analysis of Spectral Methods: Theory and Applications (SIAM, Philadelphia, PA, 1977)

164. I. Graham, F. Kuo, J. Nichols, R. Scheichl, C. Schwab, I. Sloan, Quasi-Monte Carlo finite element methods for elliptic PDEs with lognormal random coefficients. Numer. Math. 131, 329–368 (2015)

165. I.G. Graham, F.Y. Kuo, D. Nuyens, R. Scheichl, I.H. Sloan, Quasi-Monte Carlo methods for elliptic PDEs with random coefficients and applications. J. Comput. Phys. 230, 3668–3694 (2011)

166. I.G. Graham, R. Scheichl, E. Ullmann, Mixed finite element analysis of lognormal diffusion and multilevel Monte Carlo methods. Stoch. PDE: Anal. Comp. 4, 41–75 (2016)

167. W. Grecksch, P.E. Kloeden, Time-discretised Galerkin approximations of parabolic stochastic PDEs. Bull. Aust. Math. Soc. 54, 79–85 (1996)

168. W. Grecksch, H. Lisei, Approximation of stochastic nonlinear equations of Schrodinger type by the splitting method. Stoch. Anal. Appl. 31, 314–335 (2013)

169. W. Grecksch, B. Schmalfuß, Approximation of the stochastic Navier-Stokes equation. Mat. Apl. Comput. 15, 227–239 (1996)

170. W. Grecksch, C. Tudor, Stochastic Evolution Equations. A Hilbert Space Approach. Mathematical Research, vol. 85 (Akademie, Berlin, 1995)

171. M. Griebel, Sparse grids and related approximation schemes for higher dimensional problems, in Foundations of Computational Mathematics, Santander 2005 (Cambridge University Press, Cambridge, 2006), pp. 106–161

172. M. Griebel, M. Holtz, Dimension-wise integration of high-dimensional functions with applications to finance. J. Complex. 26, 455–489 (2010)

173. M. Grigoriu, Control of time delay linear systems with Gaussian white noise. PrEM 12, 89–96 (1997)

174. M. Grigoriu, Stochastic Systems: Uncertainty Quantification and Propagation (Springer, London, 2012)

175. C. Gugg, H. Kielhofer, M. Niggemann, On the approximation of the stochastic Burgers equation. Commun. Math. Phys. 230, 181–199 (2002)

176. Q. Guo, W. Xie, T. Mitsui, Convergence and stability of the split-step θ-Milstein method for stochastic delay Hopfield neural networks. Abstr. Appl. Anal. (2013), Art. ID 169214, 12 pp.

177. S.J. Guo, On the mollifier approximation for solutions of stochastic differential equations. J. Math. Kyoto Univ. 22, 243–254 (1982)

178. B. Gustafsson, H.-O. Kreiss, J. Oliger, Time Dependent Problems and Difference Methods (Wiley, New York, 1995)

179. I. Gyongy, On the approximation of stochastic differential equations. Stochastics 23, 331–352 (1988)

180. I. Gyongy, On the approximation of stochastic partial differential equations. I. Stochastics 25, 59–85 (1988)

181. I. Gyongy, On the approximation of stochastic partial differential equations. II. Stoch. Stoch. Rep. 26, 129–164 (1989)

182. I. Gyongy, The stability of stochastic partial differential equations and applications. I. Stoch. Stoch. Rep. 27, 129–150 (1989)

183. I. Gyongy, The stability of stochastic partial differential equations. II. Stoch. Stoch. Rep. 27, 189–233 (1989)

184. I. Gyongy, The approximation of stochastic partial differential equations and applications in nonlinear filtering. Comput. Math. Appl. 19, 47–63 (1990)

185. I. Gyongy, On the support of the solutions of stochastic differential equations. Teor. Veroyatnost. i Primenen. 39, 649–653 (1994)

186. I. Gyongy, Lattice approximations for stochastic quasi-linear parabolic partial differential equations driven by space-time white noise. I. Potential Anal. 9, 1–25 (1998)

187. I. Gyongy, A note on Euler's approximations. Potential Anal. 8, 205–216 (1998)

188. I. Gyongy, Lattice approximations for stochastic quasi-linear parabolic partial differential equations driven by space-time white noise. II. Potential Anal. 11, 1–37 (1999)

189. I. Gyongy, Approximations of stochastic partial differential equations, in Stochastic Partial Differential Equations and Applications (Trento, 2002) (Dekker, New York, 2002), pp. 287–307

190. I. Gyongy, N. Krylov, On the rate of convergence of splitting-up approximations for SPDEs, in Stochastic Inequalities and Applications (Birkhauser, Basel, 2003), pp. 301–321

191. I. Gyongy, N. Krylov, On the splitting-up method and stochastic partial differential equations. Ann. Probab. 31, 564–591 (2003)

192. I. Gyongy, N. Krylov, An accelerated splitting-up method for parabolic equations. SIAM J. Math. Anal. 37, 1070–1097 (2005)

193. I. Gyöngy, N. Krylov, Accelerated finite difference schemes for linear stochastic partial differential equations in the whole space. SIAM J. Math. Anal. 42, 2275–2296 (2010)

194. I. Gyöngy, T. Martínez, On numerical solution of stochastic partial differential equations of elliptic type. Stochastics 78, 213–231 (2006)

195. I. Gyöngy, G. Michaletzky, On Wong-Zakai approximations with δ-martingales. Proc. R. Soc. Lond. Ser. A Math. Phys. Eng. Sci. 460, 309–324 (2004)

196. I. Gyöngy, D. Nualart, Implicit scheme for quasi-linear parabolic partial differential equations perturbed by space-time white noise. Stoch. Process. Appl. 58, 57–72 (1995)

197. I. Gyöngy, D. Nualart, Implicit scheme for stochastic parabolic partial differential equations driven by space-time white noise. Potential Anal. 7, 725–757 (1997)

198. I. Gyöngy, D. Nualart, M. Sanz-Solé, Approximation and support theorems in modulus spaces. Probab. Theory Relat. Fields 101, 495–509 (1995)

199. I. Gyöngy, T. Pröhle, On the approximation of stochastic differential equation and on Stroock-Varadhan's support theorem. Comput. Math. Appl. 19, 65–70 (1990)

200. I. Gyöngy, A. Shmatkov, Rate of convergence of Wong-Zakai approximations for stochastic partial differential equations. Appl. Math. Optim. 54, 315–341 (2006)

201. I. Gyöngy, P.R. Stinga, Rate of convergence of Wong-Zakai approximations for stochastic partial differential equations, in Seminar on Stochastic Analysis, Random Fields and Applications VII, ed. by R.C. Dalang, M. Dozzi, F. Russo (Springer, Basel, 2013), pp. 95–130

202. L.G. Gyurkó, T.J. Lyons, Efficient and practical implementations of cubature on Wiener space, in Stochastic Analysis 2010 (Springer, Heidelberg, 2011), pp. 73–111

203. M. Hairer, J. Maas, A spatial version of the Ito-Stratonovich correction. Ann. Probab. 40, 1675–1714 (2012)

204. M. Hairer, M.D. Ryser, H. Weber, Triviality of the 2D stochastic Allen-Cahn equation. Electron. J. Probab. 17, 1–14 (2012)

205. M. Hairer, J. Voss, Approximations to the stochastic Burgers equation. J. Nonlinear Sci. 21, 897–920 (2011)

206. E.J. Hall, Accelerated spatial approximations for time discretized stochastic partial differential equations. SIAM J. Math. Anal. 44, 3162–3185 (2012)

207. E.J. Hall, Higher order spatial approximations for degenerate parabolic stochastic partial differential equations. SIAM J. Math. Anal. 45, 2071–2098 (2013)

208. R.Z. Has'minskiĭ, Stochastic Stability of Differential Equations (Sijthoff & Noordhoff, Alphen, 1980)

209. E. Hausenblas, Numerical analysis of semilinear stochastic evolution equations in Banach spaces. J. Comput. Appl. Math. 147, 485–516 (2002)

210. E. Hausenblas, Approximation for semilinear stochastic evolution equations. Potential Anal. 18, 141–186 (2003)

211. E. Hausenblas, Weak approximation for semilinear stochastic evolution equations, in Stochastic Analysis and Related Topics VIII (Birkhäuser, Basel, 2003), pp. 111–128

212. E. Hausenblas, Wong-Zakai type approximation of SPDEs of Lévy noise. Acta Appl. Math. 98, 99–134 (2007)

213. E. Hausenblas, Weak approximation of the stochastic wave equation. J. Comput. Appl. Math. 235, 33–58 (2010)

214. J. He, Numerical analysis for stochastic age-dependent population equations with diffusion, in Advances in Electronic Commerce, Web Application and Communication, ed. by D. Jin, S. Lin (Springer, Berlin, 2012), pp. 37–43

215. R.L. Herman, A. Rose, Numerical realizations of solutions of the stochastic KdV equation. Math. Comput. Simul. 80, 164–172 (2009)

216. J.S. Hesthaven, S. Gottlieb, D. Gottlieb, Spectral Methods for Time-Dependent Problems. Cambridge Monographs on Applied and Computational Mathematics, vol. 21 (Cambridge University Press, Cambridge, 2007)

217. D.J. Higham, An algorithmic introduction to numerical simulation of stochastic differential equations. SIAM Rev. 43, 525–546 (2001)

218. D.J. Higham, X. Mao, A.M. Stuart, Strong convergence of Euler-type methods for nonlinear stochastic differential equations. SIAM J. Numer. Anal. 40, 1041–1063 (2002)

219. D.J. Higham, X. Mao, L. Szpruch, Convergence, non-negativity and stability of a new Milstein scheme with applications to finance. Discrete Contin. Dyn. Syst. Ser. B 18, 2083–2100 (2013)

220. V.H. Hoang, C. Schwab, N-term Wiener chaos approximation rates for elliptic PDEs with lognormal Gaussian random inputs. Math. Models Methods Appl. Sci. 24, 797–826 (2014)

221. D.G. Hobson, L.C.G. Rogers, Complete models with stochastic volatility. Math. Financ. 8, 27–48 (1998)

222. N. Hofmann, T. Müller-Gronbach, A modified Milstein scheme for approximation of stochastic delay differential equations with constant time lag. J. Comput. Appl. Math. 197, 89–121 (2006)

223. H. Holden, B. Øksendal, J. Ubøe, T. Zhang, Stochastic Partial Differential Equations (Birkhäuser, Boston, MA, 1996)

224. H. Holden, N.H. Risebro, Conservation laws with a random source. Appl. Math. Optim. 36, 229–241 (1997)

225. T.Y. Hou, W. Luo, B. Rozovskii, H.-M. Zhou, Wiener chaos expansions and numerical solutions of randomly forced equations of fluid mechanics. J. Comput. Phys. 216, 687–706 (2006)

226. Y. Hu, Semi-implicit Euler-Maruyama scheme for stiff stochastic equations, in Stochastic Analysis and Related Topics (Birkhäuser, Boston, MA, 1996), pp. 183–202

227. Y. Hu, G. Kallianpur, J. Xiong, An approximation for the Zakai equation. Appl. Math. Optim. 45, 23–44 (2002)

228. Y. Hu, S.-E.A. Mohammed, F. Yan, Discrete-time approximations of stochastic delay equations: the Milstein scheme. Ann. Probab. 32, 265–314 (2004)

229. Y.-Z. Hu, J.-A. Yan, Wick calculus for nonlinear Gaussian functionals. Acta Math. Appl. Sin. Engl. Ser. 25, 399–414 (2009)

230. C. Huang, S. Gan, D. Wang, Delay-dependent stability analysis of numerical methods for stochastic delay differential equations. J. Comput. Appl. Math. 236, 3514–3527 (2012)

231. S.P. Huang, S.T. Quek, K.K. Phoon, Convergence study of the truncated Karhunen-Loève expansion for simulation of stochastic processes. Int. J. Numer. Methods Eng. 52, 1029–1043 (2001)

232. M. Hutzenthaler, A. Jentzen, On a perturbation theory and on strong convergence rates for stochastic ordinary and partial differential equations with non-globally monotone coefficients. ArXiv (2014). https://arxiv.org/abs/1401.0295

233. M. Hutzenthaler, A. Jentzen, Numerical approximation of stochastic differential equations with non-globally Lipschitz continuous coefficients. Mem. Am. Math. Soc. 236, 1 (2015)

234. M. Hutzenthaler, A. Jentzen, P.E. Kloeden, Strong and weak divergence in finite time of Euler's method for stochastic differential equations with non-globally Lipschitz continuous coefficients. Proc. R. Soc. A 467, 1563–1576 (2011)

235. M. Hutzenthaler, A. Jentzen, P.E. Kloeden, Strong convergence of an explicit numerical method for SDEs with nonglobally Lipschitz continuous coefficients. Ann. Appl. Probab. 22, 1611–1641 (2012)

236. M. Hutzenthaler, A. Jentzen, P.E. Kloeden, Divergence of the multilevel Monte Carlo Euler method for nonlinear stochastic differential equations. Ann. Appl. Probab. 23, 1913–1966 (2013)

237. M. Hutzenthaler, A. Jentzen, X. Wang, Exponential integrability properties of numerical approximation processes for nonlinear stochastic differential equations. Math. Comp. (2017). https://doi.org/10.1090/mcom/3146

238. S.M. Iacus, Simulation and Inference for Stochastic Differential Equations. Springer Series in Statistics (Springer, New York, 2008). With R examples

239. N. Ikeda, S. Nakao, Y. Yamato, A class of approximations of Brownian motion. Publ. Res. Inst. Math. Sci. 13, 285–300 (1977/1978)

240. N. Ikeda, S. Taniguchi, The Ito-Nisio theorem, quadratic Wiener functionals, and 1-solitons. Stoch. Process. Appl. 120, 605–621 (2010)

241. N. Ikeda, S. Watanabe, Stochastic Differential Equations and Diffusion Processes (North-Holland Publishing Co., Amsterdam, 1981)

242. K. Ito, Approximation of the Zakai equation for nonlinear filtering. SIAM J. Control Optim. 34, 620–634 (1996)

243. K. Ito, M. Nisio, On the convergence of sums of independent Banach space valued random variables. Osaka J. Math. 5, 35–48 (1968)

244. K. Ito, B. Rozovskii, Approximation of the Kushner equation for nonlinear filtering. SIAM J. Control Optim. 38, 893–915 (2000)

245. M. Jardak, C.-H. Su, G.E. Karniadakis, Spectral polynomial chaos solutions of the stochastic advection equation. J. Sci. Comput. 17, 319–338 (2002)

246. A. Jentzen, Pathwise numerical approximation of SPDEs with additive noise under non-global Lipschitz coefficients. Potential Anal. 31, 375–404 (2009)

247. A. Jentzen, Taylor expansions of solutions of stochastic partial differential equations. Discrete Contin. Dyn. Syst. Ser. B 14, 515–557 (2010)

248. A. Jentzen, Higher order pathwise numerical approximations of SPDEs with additive noise. SIAM J. Numer. Anal. 49, 642–667 (2011)

249. A. Jentzen, P.E. Kloeden, The numerical approximation of stochastic partial differential equations. Milan J. Math. 77, 205–244 (2009)

250. A. Jentzen, P.E. Kloeden, Overcoming the order barrier in the numerical approximation of stochastic partial differential equations with additive space-time noise. Proc. R. Soc. Lond. Ser. A Math. Phys. Eng. Sci. 465, 649–667 (2009)

251. A. Jentzen, P.E. Kloeden, Taylor Approximations for Stochastic Partial Differential Equations (SIAM, Philadelphia, PA, 2011)

252. A. Jentzen, M. Röckner, A Milstein scheme for SPDEs. Found. Comput. Math. 15, 313–362 (2015)

253. G.-S. Jiang, C.-W. Shu, Efficient implementation of weighted ENO schemes. J. Comput. Phys. 126, 202–228 (1996)

254. E. Kalpinelli, N. Frangos, A. Yannacopoulos, Numerical methods for hyperbolic SPDEs: a Wiener chaos approach. Stoch. PDE: Anal. Comp. 1, 606–633 (2013)

255. I. Karatzas, S.E. Shreve, Brownian Motion and Stochastic Calculus, 2nd edn. (Springer, New York, 1991)

256. M.A. Katsoulakis, G.T. Kossioris, O. Lakkis, Noise regularization and computations for the 1-dimensional stochastic Allen-Cahn problem. Interfaces Free Bound. 9, 1–30 (2007)

257. P.E. Kloeden, A. Jentzen, Pathwise convergent higher order numerical schemes for random ordinary differential equations. Proc. R. Soc. Lond. Ser. A Math. Phys. Eng. Sci. 463, 2929–2944 (2007)

258. P.E. Kloeden, G.J. Lord, A. Neuenkirch, T. Shardlow, The exponential integrator scheme for stochastic partial differential equations: pathwise error bounds. J. Comput. Appl. Math. 235, 1245–1260 (2011)

259. P.E. Kloeden, E. Platen, Numerical Solution of Stochastic Differential Equations (Springer, Berlin, 1992)

260. P.E. Kloeden, T. Shardlow, The Milstein scheme for stochastic delay differential equations without using anticipative calculus. Stoch. Anal. Appl. 30, 181–202 (2012)

261. P.E. Kloeden, S. Shott, Linear-implicit strong schemes for Ito-Galerkin approximations of stochastic PDEs. J. Appl. Math. Stoch. Anal. 14, 47–53 (2001)

262. L. Kocis, W.J. Whiten, Computational investigations of low-discrepancy sequences. ACM Trans. Math. Softw. 23, 266–294 (1997)

263. F. Konecny, On Wong-Zakai approximation of stochastic differential equations. J. Multivar. Anal. 13, 605–611 (1983)

264. R. Korn, E. Korn, G. Kroisandt, Monte Carlo Methods and Models in Finance and Insurance (CRC Press, Boca Raton, FL, 2010)

265. G.T. Kossioris, G.E. Zouraris, Fully-discrete finite element approximations for a fourth-order linear stochastic parabolic equation with additive space-time white noise. M2AN Math. Model. Numer. Anal. 44, 289–322 (2010)

266. G.T. Kossioris, G.E. Zouraris, Finite element approximations for a linear Cahn-Hilliard-Cook equation driven by the space derivative of a space-time white noise. Discrete Contin. Dyn. Syst. Ser. B 18, 1845–1872 (2013)

267. G.T. Kossioris, G.E. Zouraris, Finite element approximations for a linear fourth-order parabolic SPDE in two and three space dimensions with additive space-time white noise. Appl. Numer. Math. 67, 243–261 (2013)

268. P. Kotelenez, Stochastic Ordinary and Stochastic Partial Differential Equations (Springer, New York, 2008)

269. M. Kovács, S. Larsson, F. Lindgren, Weak convergence of finite element approximations of linear stochastic evolution equations with additive noise. BIT Numer. Math. 52, 85–108 (2012)

270. M. Kovács, S. Larsson, F. Lindgren, Weak convergence of finite element approximations of linear stochastic evolution equations with additive noise II. Fully discrete schemes. BIT Numer. Math. 53, 497–525 (2013)

271. M. Kovács, S. Larsson, A. Mesforush, Finite element approximation of the Cahn-Hilliard-Cook equation. SIAM J. Numer. Anal. 49, 2407–2429 (2011)

272. M. Kovács, S. Larsson, F. Saedpanah, Finite element approximation of the linear stochastic wave equation with additive noise. SIAM J. Numer. Anal. 48, 408–427 (2010)

273. R.H. Kraichnan, Small-scale structure of a scalar field convected by turbulence. Phys. Fluids 11, 945–953 (1968)

274. I. Kröker, C. Rohde, Finite volume schemes for hyperbolic balance laws with multiplicative noise. Appl. Numer. Math. 62, 441–456 (2012)

275. R. Kruse, Consistency and stability of a Milstein-Galerkin finite element scheme for semilinear SPDE. Stoch. Partial Differ. Equ. Anal. Comput. 2, 471–516 (2014)

276. R. Kruse, Optimal error estimates of Galerkin finite element methods for stochastic partial differential equations with multiplicative noise. IMA J. Numer. Anal. 34, 217–251 (2014)

277. R. Kruse, Strong and Weak Approximation of Semilinear Stochastic Evolution Equations (Springer, Cham, 2014)

278. N.V. Krylov, Introduction to the Theory of Diffusion Processes (AMS, Providence, RI, 1995)

279. U. Küchler, E. Platen, Strong discrete time approximation of stochastic differential equations with time delay. Math. Comput. Simul. 54, 189–205 (2000)

280. H. Kunita, Stochastic partial differential equations connected with nonlinear filtering, in Nonlinear Filtering and Stochastic Control (Cortona, 1981) (Springer, Berlin, 1982), pp. 100–169

281. F.Y. Kuo, C. Schwab, I.H. Sloan, Quasi-Monte Carlo finite element methods for a class of elliptic partial differential equations with random coefficients. SIAM J. Numer. Anal. 50, 3351–3374 (2012)

282. F.Y. Kuo, C. Schwab, I.H. Sloan, Multi-level quasi-Monte Carlo finite element methods for a class of elliptic PDEs with random coefficients. Found. Comput. Math. 15, 411–449 (2015)

283. T.G. Kurtz, P. Protter, Weak limit theorems for stochastic integrals and stochastic differential equations. Ann. Probab. 19, 1035–1070 (1991)

284. T.G. Kurtz, P. Protter, Wong-Zakai corrections, random evolutions, and simulation schemes for SDEs, in Stochastic Analysis (Academic, Boston, MA, 1991), pp. 331–346

285. T.G. Kurtz, J. Xiong, Numerical solutions for a class of SPDEs with application to filtering, in Stochastics in Finite and Infinite Dimensions. Trends in Mathematics (Birkhäuser, Boston, MA, 2001), pp. 233–258

286. H.J. Kushner, On the differential equations satisfied by conditional probability densities of Markov processes, with applications. J. Soc. Indust. Appl. Math. Ser. A Control 2, 106–119 (1964)

287. D.F. Kuznetsov, Strong approximation of multiple Ito and Stratonovich stochastic integrals: multiple Fourier series approach, St. Petersburg State Polytechnic University, St. Petersburg, Russian edition, 2011

288. A. Lang, A Lax equivalence theorem for stochastic differential equations. J. Comput. Appl. Math. 234, 3387–3396 (2010)

289. A. Lang, Almost sure convergence of a Galerkin approximation for SPDEs of Zakai type driven by square integrable martingales. J. Comput. Appl. Math. 236, 1724–1732 (2012)

290. A. Lang, P.-L. Chow, J. Potthoff, Almost sure convergence for a semidiscrete Milstein scheme for SPDEs of Zakai type. Stochastics 82, 315–326 (2010)

291. S. Larsson, A. Mesforush, Finite-element approximation of the linearized Cahn-Hilliard-Cook equation. IMA J. Numer. Anal. 31, 1315–1333 (2011)

292. M.P. Lazarev, P. Prasad, S.K. Sing, An approximate solution of one-dimensional piston problem. Z. Angew. Math. Phys. 46, 752–771 (1995)

293. F. Le Gland, Splitting-up approximation for SPDEs and SDEs with application to nonlinear filtering, in Stochastic Partial Differential Equations and Their Applications (Springer, Berlin, 1992), pp. 177–187

294. O.P. Le Maître, O.M. Knio, Spectral Methods for Uncertainty Quantification (Springer, Dordrecht, 2010)

295. C.Y. Lee, B. Rozovskii, A stochastic finite element method for stochastic parabolic equations driven by purely spatial noise. Commun. Stoch. Anal. 4, 271–297 (2010)

296. R.J. LeVeque, Numerical Methods for Conservation Laws. Lectures in Mathematics ETH Zürich (Birkhäuser, Basel, 1990)

297. H.W. Liepmann, A. Roshko, Elements of Gasdynamics (Wiley, New York, 1957)

298. G. Lin, C.H. Su, G.E. Karniadakis, The stochastic piston problem. Proc. Natl. Acad. Sci. U.S.A. 101, 15840–15845 (2004)

299. F. Lindner, R. Schilling, Weak order for the discretization of the stochastic heat equation driven by impulsive noise. Potential Anal. 38, 345–379 (2013)

300. C. Litterer, T. Lyons, High order recombination and an application to cubature on Wiener space. Ann. Appl. Probab. 22, 1301–1327 (2012)

301. J.E. Littlewood, R.E.A.C. Paley, Theorems on Fourier series and power series (II). Proc. Lond. Math. Soc. S2-42, 52–89 (1937)

302. D. Liu, Convergence of the spectral method for stochastic Ginzburg-Landau equation driven by space-time white noise. Commun. Math. Sci. 1, 361–375 (2003)

303. H. Liu, On spectral approximations of stochastic partial differential equations driven by Poisson noise, PhD thesis, University of Southern California, 2007

304. J. Liu, A mass-preserving splitting scheme for the stochastic Schrödinger equation with multiplicative noise. IMA J. Numer. Anal. 33, 1469–1479 (2013)

305. J. Liu, Order of convergence of splitting schemes for both deterministic and stochastic nonlinear Schrödinger equations. SIAM J. Numer. Anal. 51, 1911–1932 (2013)

306. M. Liu, W. Cao, Z. Fan, Convergence and stability of the semi-implicit Euler method for a linear stochastic differential delay equation. J. Comput. Appl. Math. 170, 255–268 (2004)

307. J.A. Londoño, A.M. Ramírez, Numerical performance of some Wong-Zakai type approximations for stochastic differential equations, Technical report, Department of Mathematics, National University of Colombia, Bogotá, Colombia, 2006

308. G.J. Lord, C.E. Powell, T. Shardlow, An Introduction to Computational Stochastic PDEs (Cambridge University Press, Cambridge, 2014)

309. G.J. Lord, J. Rougemont, A numerical scheme for stochastic PDEs with Gevrey regularity. IMA J. Numer. Anal. 24, 587–604 (2004)

310. G.J. Lord, T. Shardlow, Postprocessing for stochastic parabolic partial differential equations. SIAM J. Numer. Anal. 45, 870–889 (2007)

311. G.J. Lord, A. Tambue, A modified semi-implicit Euler-Maruyama scheme for finite element discretization of SPDEs. ArXiv (2010). https://arxiv.org/abs/1004.1998

312. G.J. Lord, A. Tambue, Stochastic exponential integrators for the finite element discretization of SPDEs for multiplicative and additive noise. IMA J. Numer. Anal. 33, 515–543 (2013)

313. G.J. Lord, V. Thümmler, Computing stochastic traveling waves. SIAM J. Sci. Comput. 34, B24–B43 (2012)

314. S.V. Lototskiĭ, B.L. Rozovskiĭ, The passive scalar equation in a turbulent incompressible Gaussian velocity field. Uspekhi Mat. Nauk 59, 105–120 (2004)

315. S. Lototsky, R. Mikulevicius, B.L. Rozovskii, Nonlinear filtering revisited: a spectral approach. SIAM J. Control Optim. 35, 435–461 (1997)

316. S. Lototsky, B. Rozovskii, Stochastic differential equations: a Wiener chaos approach, in From Stochastic Calculus to Mathematical Finance (Springer, Berlin, 2006), pp. 433–506

317. S.V. Lototsky, Wiener chaos and nonlinear filtering. Appl. Math. Optim. 54, 265–291 (2006)

318. S.V. Lototsky, B.L. Rozovskii, Wiener chaos solutions of linear stochastic evolution equations. Ann. Probab. 34, 638–662 (2006)

319. S.V. Lototsky, B.L. Rozovskii, Stochastic partial differential equations driven by purely spatial noise. SIAM J. Math. Anal. 41, 1295–1322 (2009)

320. S.V. Lototsky, B.L. Rozovskii, X. Wan, Elliptic equations of higher stochastic order. M2AN Math. Model. Numer. Anal. 44, 1135–1153 (2010)

321. S.V. Lototsky, K. Stemmann, Solving SPDEs driven by colored noise: a chaos approach. Quart. Appl. Math. 66, 499–520 (2008)

322. T. Lyons, N. Victoir, Cubature on Wiener space. Proc. R. Soc. Lond. Ser. A Math. Phys. Eng. Sci. 460, 169–198 (2004)

323. V. Mackevičius, On Ikeda-Nakao-Yamato type approximations. Litovsk. Mat. Sb. 30, 752–757 (1990)

324. V. Mackevičius, On approximation of stochastic differential equations with coefficients depending on the past. Liet. Mat. Rink. 32, 285–298 (1992)

325. V. Mackyavichyus, Symmetric stochastic differential equations with nonsmooth coefficients. Mat. Sb. (N.S.) 116(158), 585–592, 608 (1981)

326. P. Malliavin, Stochastic calculus of variation and hypoelliptic operators, in Proceedings of the International Symposium SDE, Kyoto 1976, ed. by K. Ito (Springer, Kinokuniya, 1978), pp. 195–263

327. H. Manouzi, A finite element approximation of linear stochastic PDEs driven by multiplicative white noise. Int. J. Comput. Math. 85, 527–546 (2008)

328. H. Manouzi, Numerical solutions of SPDEs involving white noise. Int. J. Tomogr. Stat. 10, 96–108 (2008)

329. H. Manouzi, SPDEs driven by additive and multiplicative white noises: a numerical study. Math. Comput. Simul. 81, 2234–2243 (2011)

330. H. Manouzi, M. Seaid, Solving Wick-stochastic water waves using a Galerkin finite element method. Math. Comput. Simul. 79, 3523–3533 (2009)

331. H. Manouzi, M. Seaid, M. Zahri, Wick-stochastic finite element solution of reaction-diffusion problems. J. Comput. Appl. Math. 203, 516–532 (2007)

332. H. Manouzi, T.G. Theting, Mixed finite element approximation for the stochastic pressure equation of Wick type. IMA J. Numer. Anal. 24, 605–634 (2004)

333. H. Manouzi, T.G. Theting, Numerical analysis of the stochastic Stokes equations of Wick type. Numer. Methods Partial Differ. Equ. 23, 73–92 (2007)

334. X. Mao, Stochastic Differential Equations and Their Applications (Horwood Publishing Limited, Chichester, 1997)

335. X. Mao, L. Szpruch, Strong convergence and stability of implicit numerical methods for stochastic differential equations with non-globally Lipschitz continuous coefficients. J. Comput. Appl. Math. 238, 14–28 (2013)

336. X. Mao, L. Szpruch, Strong convergence rates for backward Euler-Maruyama method for non-linear dissipative-type stochastic differential equations with super-linear diffusion coefficients. Stochastics 85, 144–171 (2013)

337. T. Martínez, M. Sanz-Solé, A lattice scheme for stochastic partial differential equations of elliptic type in dimension d ≥ 4. Appl. Math. Optim. 54, 343–368 (2006)

338. R. Marty, On a splitting scheme for the nonlinear Schrödinger equation in a random medium. Commun. Math. Sci. 4, 679–705 (2006)

339. G. Mastroianni, G. Monegato, Error estimates for Gauss-Laguerre and Gauss-Hermite quadrature formulas, in Approximation and Computation (Birkhäuser, Boston, 1994), pp. 421–434

340. J. Matoušek, On the L2-discrepancy for anchored boxes. J. Complex. 14, 527–556 (1998)

341. J.C. Mattingly, A.M. Stuart, D.J. Higham, Ergodicity for SDEs and approximations: locally Lipschitz vector fields and degenerate noise. Stoch. Process. Appl. 101, 185–232 (2002)

342. E.J. McShane, Stochastic differential equations and models of random processes, in Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability (Univ. California, Berkeley, Calif., 1970/1971), Vol. III: Probability theory, Berkeley, Calif., 1972 (University of California Press, Berkeley, 1972), pp. 263–294

343. W.C. Meecham, D.-T. Jeng, Use of the Wiener-Hermite expansion for nearly normal turbulence. J. Fluid Mech. 32, 225–249 (1968)

344. R. Mikulevicius, B. Rozovskii, Separation of observations and parameters in nonlinear filtering, in Proceedings of the 32nd IEEE Conference on Decision and Control, vol. 2 (1993), pp. 1564–1569

345. R. Mikulevicius, B. Rozovskii, Linear parabolic stochastic PDEs and Wiener chaos. SIAM J. Math. Anal. 29, 452–480 (1998)

346. R. Mikulevicius, B. Rozovskii, On unbiased stochastic Navier-Stokes equations. Probab. Theory Relat. Fields 54, 787–834 (2012)

347. R. Mikulevicius, B.L. Rozovskii, Fourier–Hermite expansions for nonlinear filtering. Theory Probab. Appl. 44, 606–612 (2000)

348. R. Mikulevicius, B.L. Rozovskii, Stochastic Navier-Stokes equations for turbulent flows. SIAM J. Math. Anal. 35, 1250–1310 (2004)

349. R. Mikulevicius, B.L. Rozovskii, On distribution free Skorokhod–Malliavin calculus. Stoch. PDE: Anal. Comp. 4, 319–360 (2016)

350. A. Millet, P.-L. Morien, On implicit and explicit discretization schemes for parabolic SPDEs in any dimension. Stoch. Process. Appl. 115, 1073–1106 (2005)

351. A. Millet, M. Sanz-Solé, A simple proof of the support theorem for diffusion processes, in Séminaire de Probabilités, XXVIII. Lecture Notes in Mathematics, vol. 1583 (Springer, Berlin, 1994), pp. 36–48

352. A. Millet, M. Sanz-Solé, Approximation and support theorem for a wave equation in two space dimensions. Bernoulli 6, 887–915 (2000)

353. G.N. Mil'shteĭn, A theorem on the order of convergence of mean-square approximations of solutions of systems of stochastic differential equations. Teor. Veroyatnost. i Primenen. 32, 809–811 (1987)

354. G.N. Milstein, Numerical Integration of Stochastic Differential Equations (Kluwer Academic Publishers Group, Dordrecht, 1995)

355. G.N. Milstein, E. Platen, H. Schurz, Balanced implicit methods for stiff stochastic systems. SIAM J. Numer. Anal. 35, 1010–1019 (1998)

356. G.N. Milstein, Y.M. Repin, M.V. Tretyakov, Numerical methods for stochastic systems preserving symplectic structure. SIAM J. Numer. Anal. 40, 1583–1604 (2002)

357. G.N. Milstein, M.V. Tretyakov, Numerical methods in the weak sense for stochastic differential equations with small noise. SIAM J. Numer. Anal. 34, 2142–2167 (1997)

358. G.N. Milstein, M.V. Tretyakov, Stochastic Numerics for Mathematical Physics (Springer, Berlin, 2004)

359. G.N. Milstein, M.V. Tretyakov, Numerical integration of stochastic differential equations with nonglobally Lipschitz coefficients. SIAM J. Numer. Anal. 43, 1139–1154 (2005)

360. G.N. Milstein, M.V. Tretyakov, Computing ergodic limits for Langevin equations. Phys. D 229, 81–95 (2007)

361. G.N. Milstein, M.V. Tretyakov, Solving parabolic stochastic partial differential equations via averaging over characteristics. Math. Comp. 78, 2075–2106 (2009)

362. J. Ming, M. Gunzburger, Efficient numerical methods for stochastic partial differential equations through transformation to equations driven by correlated noise. Int. J. Uncertain. Quant. 3, 321–339 (2013)

363. S. Mishra, C. Schwab, Sparse tensor multi-level Monte Carlo finite volume methods for hyperbolic conservation laws with random initial data. Math. Comp. 81, 1979–2018 (2012)

364. S. Mishra, C. Schwab, J. Šukys, Multi-level Monte Carlo finite volume methods for nonlinear systems of conservation laws in multi-dimensions. J. Comput. Phys. 231, 3365–3388 (2012)

365. S. Mishra, C. Schwab, J. Šukys, Multilevel Monte Carlo finite volume methods for shallow water equations with uncertain topography in multi-dimensions. SIAM J. Sci. Comput. 34, B761–B784 (2012)

366. Y.S. Mishura, G.M. Shevchenko, Approximation schemes for stochastic differential equations in a Hilbert space. Teor. Veroyatn. Primen. 51, 476–495 (2006)

367. K. Mohamed, M. Seaid, M. Zahri, A finite volume method for scalar conservation laws with stochastic time-space dependent flux functions. J. Comput. Appl. Math. 237, 614–632 (2013)

368. S.E.A. Mohammed, Stochastic Functional Differential Equations (Pitman Publishing Ltd., Boston, MA, 1984)

369. C.M. Mora, Numerical solution of conservative finite-dimensional stochastic Schrödinger equations. Ann. Appl. Probab. 15, 2144–2171 (2005)

370. M. Motamed, F. Nobile, R. Tempone, A stochastic collocation method for the second order wave equation with a discontinuous random speed. Numer. Math. 1–44 (2012)

371. T. Müller-Gronbach, K. Ritter, An implicit Euler scheme with non-uniform time discretization for heat equations with multiplicative noise. BIT Numer. Math. 47, 393–418 (2007)

372. T. Müller-Gronbach, K. Ritter, Lower bounds and nonuniform time discretization for approximation of stochastic heat equations. Found. Comput. Math. 7, 135–181 (2007)

373. T. Müller-Gronbach, K. Ritter, L. Yaroslavtseva, Derandomization of the Euler scheme for scalar stochastic differential equations. J. Complex. 28, 139–153 (2012)

374. D. Mumford, The dawning of the age of stochasticity. Atti Accad. Naz. Lincei Cl. Sci. Fis. Mat. Natur. Rend. Lincei (9) Mat. Appl. (2000), pp. 107–125. Mathematics towards the third millennium (Rome, 1999)

375. A. Neuenkirch, L. Szpruch, First order strong approximations of scalar SDEs defined in a domain. Numer. Math. 128, 103–136 (2014)

376. H. Niederreiter, Random Number Generation and Quasi-Monte Carlo Methods (SIAM, Philadelphia, PA, 1992)

377. M. Ninomiya, Application of the Kusuoka approximation with a tree-based branching algorithm to the pricing of interest-rate derivatives under the HJM model. LMS J. Comput. Math. 13, 208–221 (2010)

378. F. Nobile, R. Tempone, Analysis and implementation issues for the numerical approximation of parabolic equations with random coefficients. Int. J. Numer. Methods Eng. 80, 979–1006 (2009)

379. F. Nobile, R. Tempone, C.G. Webster, An anisotropic sparse grid stochastic collocation method for partial differential equations with random input data. SIAM J. Numer. Anal. 46, 2411–2442 (2008)

380. F. Nobile, R. Tempone, C.G. Webster, A sparse grid stochastic collocation method for partial differential equations with random input data. SIAM J. Numer. Anal. 46, 2309–2345 (2008)

381. E. Novak, K. Ritter, Simple cubature formulas with high polynomial exactness. Constr. Approx. 15, 499–522 (1999)

382. A. Nowak, A Wong-Zakai type theorem for stochastic systems of Burgers equations. Panamer. Math. J. 16, 1–25 (2006)

383. A. Nowak, K. Twardowska, On support and invariance theorems for a stochastic system of Burgers equations. Demonstratio Math. 39, 691–710 (2006)

384. D. Nualart, B. Rozovskii, Weighted stochastic Sobolev spaces and bilinear SPDEs driven by space-time white noise. J. Funct. Anal. 149, 200–225 (1997)

385. D. Nualart, M. Zakai, On the relation between the Stratonovich and Ogawa integrals. Ann. Probab. 17, 1536–1540 (1989)

386. S. Ogawa, Quelques propriétés de l'intégrale stochastique du type non-causal. Jpn. J. Appl. Math. 1, 405–416 (1984)

387. S. Ogawa, On a deterministic approach to the numerical solution of the SDE. Math. Comput. Simul. 55, 209–214 (2001)

388. B. Øksendal, Stochastic Differential Equations. Universitext, 6th edn. (Springer, Berlin, 2003). An introduction with applications

389. G. Pagès, H. Pham, Optimal quantization methods for nonlinear filtering with discrete-time observations. Bernoulli 11, 893–932 (2005)

390. G. Pagès, J. Printems, Optimal quadratic quantization for numerics: the Gaussian case. Monte Carlo Methods Appl. 9, 135–165 (2003)

391. R. Paley, N. Wiener, Fourier Transforms in the Complex Domain (AMS Colloquium Publications, New York, 1934)

392. M.D. Paola, A. Pirrotta, Time delay induced effects on control of linear systems under random excitation. PrEM 16, 43–51 (2001)


393. K. Petras, SmolPack: a software for Smolyak quadrature with Clenshaw-Curtis basis-sequence (2003). http://people.sc.fsu.edu/~jburkardt/c_src/smolpack/smolpack.html

394. P. Pettersson, G. Iaccarino, J. Nordström, A stochastic Galerkin method for the Euler equations with Roe variable transformation. J. Comput. Phys. 257, 481–500 (2014)

395. P. Pettersson, J. Nordström, G. Iaccarino, Boundary procedures for the time-dependent Burgers' equation under uncertainty. Acta Math. Sci. Ser. B Engl. Ed. 30, 539–550 (2010)

396. J. Picard, Approximation of nonlinear filtering problems and order of convergence, in Filtering and Control of Random Processes (Paris, 1983) (Springer, Berlin, 1984), pp. 219–236

397. J. Picard, Approximation of stochastic differential equations and application of the stochastic calculus of variations to the rate of convergence, in Stochastic Analysis and Related Topics (Silivri, 1986) (Springer, Berlin, 1988), pp. 267–287

398. F. Pizzi, Stochastic collocation and option pricing, Master's thesis, Politecnico di Milano, 2012

399. C. Prévôt, M. Röckner, A Concise Course on Stochastic Partial Differential Equations (Springer, Berlin, 2007)

400. J. Printems, On the discretization in time of parabolic stochastic partial differential equations. M2AN Math. Model. Numer. Anal. 35, 1055–1078 (2001)

401. P. Protter, Approximations of solutions of stochastic differential equations driven by semimartingales. Ann. Probab. 13, 716–743 (1985)

402. R. Qi, X. Yang, Weak convergence of finite element method for stochastic elastic equation driven by additive noise. J. Sci. Comput. 56, 450–470 (2013)

403. L. Quer-Sardanyons, M. Sanz-Solé, Space semi-discretisations for a stochastic wave equation. Potential Anal. 24, 303–332 (2006)

404. C. Reisinger, Mean-square stability and error analysis of implicit time-stepping schemes for linear parabolic SPDEs with multiplicative Wiener noise in the first derivative. Int. J. Comput. Math. 89, 2562–2575 (2012)

405. A.J. Roberts, A step towards holistic discretisation of stochastic partial differential equations. ANZIAM J. 45, C1–C15 (2003/04)

406. C. Roth, A combination of finite difference and Wong-Zakai methods for hyperbolic stochastic partial differential equations. Stoch. Anal. Appl. 24, 221–240 (2006)

407. C. Roth, Weak approximations of solutions of a first order hyperbolic stochastic partial differential equation. Monte Carlo Methods Appl. 13, 117–133 (2007)

408. B.L. Rozovskiĭ, Stochastic Evolution Systems (Kluwer, Dordrecht, 1990)

409. G. Rozza, D.B.P. Huynh, A.T. Patera, Reduced basis approximation and a posteriori error estimation for affinely parametrized elliptic coercive partial differential equations: application to transport and continuum mechanics. Arch. Comput. Methods Eng. 15, 229–275 (2008)

410. S. Sabanis, Euler approximations with varying coefficients: the case of superlinearly growing diffusion coefficients. Ann. Appl. Probab. 26, 2083–2105 (2016)

411. S. Sabanis, A note on tamed Euler approximations. Electron. Commun. Probab. 18, 1–10 (2013), no. 47

412. M. Sango, Splitting-up scheme for nonlinear stochastic hyperbolic equations. Forum Math. 25, 931–965 (2013)

413. B. Saussereau, I.L. Stoica, Scalar conservation laws with fractional stochastic forcing: existence, uniqueness and invariant measure. Stoch. Process. Appl. 122, 1456–1486 (2012)

414. B. Schmalfuss, On approximation of the stochastic Navier-Stokes equations. Wiss. Z. Tech. Hochsch. Leuna-Merseburg 27, 605–612 (1985)

415. W. Schoutens, Stochastic Processes and Orthogonal Polynomials. Lecture Notes in Statistics, vol. 146 (Springer, New York, 2000)

416. H. Schurz, Numerical analysis of stochastic differential equations without tears, in Handbook of Stochastic Analysis and Applications (Dekker, New York, 2002), pp. 237–359

417. C. Schwab, p- and hp-Finite Element Methods (The Clarendon Press, Oxford University Press, New York, 1998)

418. C. Schwab, C.J. Gittelson, Sparse tensor discretizations of high-dimensional parametric and stochastic PDEs. Acta Numer. 20, 291–467 (2011)

419. C. Schwab, R.A. Todor, Karhunen-Loève approximation of random fields by generalized fast multipole methods. J. Comput. Phys. 217, 100–122 (2006)

420. T. Shardlow, Numerical methods for stochastic parabolic PDEs. Numer. Funct. Anal. Optim. 20, 121–145 (1999)

421. T. Shardlow, Weak convergence of a numerical method for a stochastic heat equation. BIT Numer. Math. 43, 179–193 (2003)

422. G. Shevchenko, Rate of convergence of discrete approximations of solutions to stochastic differential equations in a Hilbert space. Theory Probab. Math. Stat. 69, 187–199 (2004)

423. I.H. Sloan, S. Joe, Lattice Methods for Multiple Integration (Oxford University Press, New York, 1994)

424. I.H. Sloan, H. Woźniakowski, When are quasi-Monte Carlo algorithms efficient for high-dimensional integrals? J. Complex. 14, 1–33 (1998)

425. S.A. Smolyak, Quadrature and interpolation formulas for tensor products of certain classes of functions. Sov. Math. Dokl. 4, 240–243 (1963)

426. A.R. Soheili, M.B. Niasar, M. Arezoomandan, Approximation of stochastic parabolic differential equations with two different finite difference schemes. Bull. Iran. Math. Soc. 37, 61–83 (2011)


427. C. Soize, R. Ghanem, Physical systems with random uncertainties: chaos representations with arbitrary probability measure. SIAM J. Sci. Comput. 26, 395–410 (2004)

428. V.N. Stanciulescu, M.V. Tretyakov, Numerical solution of the Dirichlet problem for linear parabolic SPDEs based on averaging over characteristics, in Stochastic Analysis 2010 (Springer, Heidelberg, 2011), pp. 191–212

429. M. Stoyanov, M. Gunzburger, J. Burkardt, Pink noise, 1/fα noise, and their effect on solutions of differential equations. Int. J. Uncertain. Quant. 1, 257–278 (2011)

430. R.L. Stratonovic, A new form of representing stochastic integrals and equations. Vestnik Moskov. Univ. Ser. I Mat. Meh. 1964, 3–12 (1964)

431. R.L. Stratonovich, A new representation for stochastic integrals and equations. SIAM J. Control 4, 362–371 (1966)

432. D.W. Stroock, S.R.S. Varadhan, On the support of diffusion processes with applications to the strong maximum principle, in Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, CA, vol. III (Univ. California Press, Berkeley, 1972), pp. 333–359

433. M. Sun, R. Glowinski, Pathwise approximation and simulation for the Zakai filtering equation through operator splitting. Calcolo 30, 219–239 (1993)

434. H.J. Sussmann, On the gap between deterministic and stochastic ordinary differential equations. Ann. Probab. 6, 19–41 (1978)

435. H.J. Sussmann, Limits of the Wong-Zakai type with a modified drift term, in Stochastic Analysis (Academic, Boston, MA, 1991), pp. 475–493

436. L. Szpruch, V-stable tamed Euler schemes. ArXiv (2013). https://arxiv.org/abs/1310.0785

437. D. Talay, Stochastic Hamiltonian systems: exponential convergence to the invariant measure, and discretization by the implicit Euler scheme. Markov Process. Relat. Fields 8, 163–198 (2002)

438. D.M. Tartakovsky, M. Dentz, P.C. Lichtner, Probability density functions for advective-reactive transport with uncertain reaction rates. Water Resour. Res. 45 (2009)

439. M. Tatang, G. McRae, Direct treatment of uncertainty in models of reaction and transport, tech. rep., Department of Chemical Engineering, MIT, 1994

440. A. Teckentrup, R. Scheichl, M. Giles, E. Ullmann, Further analysis of multilevel Monte Carlo methods for elliptic PDEs with random coefficients. Numer. Math. 125, 569–600 (2013)

441. A.L. Teckentrup, Multilevel Monte Carlo methods for highly heterogeneous media, in Proceedings of the Winter Simulation Conference, Winter Simulation Conference (2012), pp. 32:1–32:12


442. G. Tessitore, J. Zabczyk, Wong-Zakai approximations of stochastic evolution equations. J. Evol. Equ. 6, 621–655 (2006)

443. T.G. Theting, Solving Wick-stochastic boundary value problems using a finite element method. Stoch. Stoch. Rep. 70, 241–270 (2000)

444. T.G. Theting, Solving parabolic Wick-stochastic boundary value problems using a finite element method. Stoch. Stoch. Rep. 75, 49–77 (2003)

445. T.G. Theting, Numerical solution of Wick-stochastic partial differential equations, in Proceedings of the International Conference on Stochastic Analysis and Applications (Kluwer Academic Publishers, Dordrecht, 2004), pp. 303–349

446. V. Thomée, Galerkin Finite Element Methods for Parabolic Problems, 2nd edn. (Springer, Berlin, 2006)

447. L.N. Trefethen, Spectral Methods in MATLAB (SIAM, Philadelphia, PA, 2000)

448. M.V. Tretyakov, Z. Zhang, A fundamental mean-square convergence theorem for SDEs with locally Lipschitz coefficients and its applications. SIAM J. Numer. Anal. 51, 3135–3162 (2013)

449. L.S. Tsimring, A. Pikovsky, Noise-induced dynamics in bistable systems with delay. Phys. Rev. Lett. 87, 250602 (2001)

450. C. Tudor, Wong-Zakai type approximations for stochastic differential equations driven by a fractional Brownian motion. Z. Anal. Anwend. 28, 165–182 (2009)

451. C. Tudor, M. Tudor, On approximation of solutions for stochastic delay equations. Stud. Cerc. Mat. 39, 265–274 (1987)

452. M. Tudor, Approximation schemes for stochastic equations with hereditary argument. Stud. Cerc. Mat. 44, 73–85 (1992)

453. K. Twardowska, On the approximation theorem of the Wong-Zakai type for the functional stochastic differential equations. Probab. Math. Stat. 12, 319–334 (1991)

454. K. Twardowska, An extension of the Wong-Zakai theorem for stochastic evolution equations in Hilbert spaces. Stoch. Anal. Appl. 10, 471–500 (1992)

455. K. Twardowska, An approximation theorem of Wong-Zakai type for nonlinear stochastic partial differential equations. Stoch. Anal. Appl. 13, 601–626 (1995)

456. K. Twardowska, An approximation theorem of Wong-Zakai type for stochastic Navier-Stokes equations. Rend. Sem. Mat. Univ. Padova 96, 15–36 (1996)

457. K. Twardowska, Wong-Zakai approximations for stochastic differential equations. Acta Appl. Math. 43, 317–359 (1996)

458. K. Twardowska, T. Marnik, M. Pasławska-Połuniak, Approximation of the Zakai equation in a nonlinear filtering problem with delay. Int. J. Appl. Math. Comput. Sci. 13, 151–160 (2003)

459. G. Våge, Variational methods for PDEs applied to stochastic partial differential equations. Math. Scand. 82, 113–137 (1998)


460. K. Vasilakos, A. Beuter, Effects of noise on a delayed visual feedback system. J. Theor. Biol. 165, 389–407 (1993)

461. D. Venturi, D. Tartakovsky, A. Tartakovsky, G.E. Karniadakis, Exact PDF equations and closure approximations for advective-reactive transport. J. Comput. Phys. 243, 323–343 (2013)

462. D. Venturi, X. Wan, R. Mikulevicius, B.L. Rozovskii, G.E. Karniadakis, Wick-Malliavin approximation to nonlinear stochastic partial differential equations: analysis and simulations. Proc. R. Soc. Edinb. Sect. A 469 (2013)

463. J.B. Walsh, A stochastic model of neural response. Adv. Appl. Probab. 13, 231–281 (1981)

464. J.B. Walsh, Finite element methods for parabolic stochastic PDE's. Potential Anal. 23, 1–43 (2005)

465. J.B. Walsh, On numerical solutions of the stochastic wave equation. Ill. J. Math. 50, 991–1018 (2006)

466. X. Wan, A note on stochastic elliptic models. Comput. Methods Appl. Mech. Eng. 199, 2987–2995 (2010)

467. X. Wan, A discussion on two stochastic elliptic modeling strategies. Commun. Comput. Phys. 11, 775–796 (2012)

468. X. Wan, G. Karniadakis, Beyond Wiener-Askey expansions: handling arbitrary PDFs. J. Sci. Comput. 27, 455–464 (2006)

469. X. Wan, B. Rozovskii, G.E. Karniadakis, A stochastic modeling methodology based on weighted Wiener chaos and Malliavin calculus. Proc. Natl. Acad. Sci. U.S.A. 106, 14189–14194 (2009)

470. X. Wan, B.L. Rozovskii, The Wick-Malliavin approximation of elliptic problems with log-normal random coefficients. SIAM J. Sci. Comput. 35, A2370–A2392 (2013)

471. P. Wang, D.M. Tartakovsky, Uncertainty quantification in kinematic-wave models. J. Comput. Phys. 231, 7868–7880 (2012)

472. X. Wang, S. Gan, B-convergence of split-step one-leg theta methods for stochastic differential equations. J. Appl. Math. Comput. 38, 489–503 (2012)

473. X. Wang, S. Gan, A Runge-Kutta type scheme for nonlinear stochastic partial differential equations with multiplicative trace class noise. Numer. Algorithms 62, 193–223 (2013)

474. X. Wang, S. Gan, The tamed Milstein method for commutative stochastic differential equations with non-globally Lipschitz continuous coefficients. J. Differ. Equ. Appl. 19, 466–490 (2013)

475. X. Wang, S. Gan, Weak convergence analysis of the linear implicit Euler method for semilinear stochastic partial differential equations with additive noise. J. Math. Anal. Appl. 398, 151–169 (2013)

476. X. Wang, S. Gan, J. Tang, Higher order strong approximations of semilinear stochastic wave equation with additive space-time white noise. SIAM J. Sci. Comput. 36, A2611–A2632 (2015)


477. G.W. Wasilkowski, H. Woźniakowski, Explicit cost bounds of algorithms for multivariate tensor product problems. J. Complex. 11, 1–56 (1995)

478. G.B. Whitham, Linear and Nonlinear Waves (Wiley, New York, 1974)

479. N. Wiener, The homogeneous chaos. Am. J. Math. 60, 897–936 (1938)

480. M. Wiktorsson, Joint characteristic function and simultaneous simulation of iterated Ito integrals for multiple independent Brownian motions. Ann. Appl. Probab. 11, 470–487 (2001)

481. E. Wong, M. Zakai, On the convergence of ordinary integrals to stochastic integrals. Ann. Math. Stat. 36, 1560–1564 (1965)

482. E. Wong, M. Zakai, On the relation between ordinary and stochastic differential equations. Int. J. Eng. Sci. 3, 213–229 (1965)

483. F. Wu, X. Mao, L. Szpruch, Almost sure exponential stability of numerical solutions for stochastic delay differential equations. Numer. Math. 115, 681–697 (2010)

484. D. Xiu, Fast numerical methods for stochastic computations: a review. Commun. Comput. Phys. 5, 242–272 (2009)

485. D. Xiu, Numerical Methods for Stochastic Computations: A Spectral Method Approach (Princeton University Press, Princeton, 2010)

486. D. Xiu, J. Hesthaven, High-order collocation methods for differential equations with random inputs. SIAM J. Sci. Comput. 27, 1118–1139 (2005)

487. D. Xiu, G.E. Karniadakis, Modeling uncertainty in steady state diffusion problems via generalized polynomial chaos. Comput. Methods Appl. Mech. Eng. 191, 4927–4948 (2002)

488. D. Xiu, G.E. Karniadakis, The Wiener-Askey polynomial chaos for stochastic differential equations. SIAM J. Sci. Comput. 24, 619–644 (2002)

489. J. Xu, J. Li, Sparse Wiener chaos approximations of Zakai equation for nonlinear filtering, in Proceedings of the 21st Annual International Conference on Chinese Control and Decision Conference, CCDC'09, Piscataway, NJ (IEEE Press, New York, 2009), pp. 910–913

490. Y. Yan, Semidiscrete Galerkin approximation for a linear stochastic parabolic partial differential equation driven by an additive noise. BIT Numer. Math. 44, 829–847 (2004)

491. Y. Yan, Galerkin finite element methods for stochastic parabolic partial differential equations. SIAM J. Numer. Anal. 43, 1363–1384 (2005)

492. X. Yang, Y. Duan, Y. Guo, A posteriori error estimates for finite element approximation of unsteady incompressible stochastic Navier-Stokes equations. SIAM J. Numer. Anal. 48, 1579–1600 (2010)

493. X. Yang, W. Wang, Y. Duan, The approximation of a Crank-Nicolson scheme for the stochastic Navier-Stokes equations. J. Comput. Appl. Math. 225, 31–43 (2009)


494. R.-M. Yao, L.-J. Bo, Discontinuous Galerkin method for elliptic stochastic partial differential equations on two and three dimensional spaces. Sci. China Ser. A 50, 1661–1672 (2007)

495. H. Yoo, Semi-discretization of stochastic partial differential equations on R1 by a finite-difference method. Math. Comp. 69, 653–666 (2000)

496. N. Yoshida, Stochastic shear thickening fluids: strong convergence of the Galerkin approximation and the energy equality. Ann. Appl. Probab. 22, 1215–1242 (2012)

497. K. Yosida, Functional Analysis, vol. 11 (Springer, Berlin, 1995), p. 14. Reprint of the sixth (1980) edition

498. M. Zahri, M. Seaid, H. Manouzi, M. El-Amrani, Wiener-Ito chaos expansions and finite-volume solution of stochastic advection equations, in Finite Volumes for Complex Applications IV (ISTE, London, 2005), pp. 525–538

499. M. Zakai, On the optimal filtering of diffusion processes. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 11, 230–243 (1969)

500. G. Zhang, M. Gunzburger, Error analysis of a stochastic collocation method for parabolic partial differential equations with random input data. SIAM J. Numer. Anal. 50, 1922–1940 (2012)

501. H. Zhang, S. Gan, L. Hu, The split-step backward Euler method for linear stochastic delay differential equations. J. Comput. Appl. Math. 225, 558–568 (2009)

502. L. Zhang, Q.M. Zhang, Convergence of numerical solutions for the stochastic Navier-Stokes equation. Math. Appl. (Wuhan) 21, 504–509 (2008)

503. Z. Zhang, H. Ma, Order-preserving strong schemes for SDEs with locally Lipschitz coefficients. Appl. Numer. Math. 112, 1–16 (2017)

504. Z. Zhang, M. Choi, G.E. Karniadakis, Error estimates for the ANOVA method with polynomial chaos interpolation: tensor product functions. SIAM J. Sci. Comput. 34, A1165–A1186 (2012)

505. Z. Zhang, B. Rozovskii, M.V. Tretyakov, G.E. Karniadakis, A multistage Wiener chaos expansion method for stochastic advection-diffusion-reaction equations. SIAM J. Sci. Comput. 34, A914–A936 (2012)

506. Z. Zhang, M.V. Tretyakov, B. Rozovskii, G.E. Karniadakis, A recursive sparse grid collocation method for differential equations with white noise. SIAM J. Sci. Comput. 36, A1652–A1677 (2014)

507. Z. Zhang, M.V. Tretyakov, B. Rozovskii, G.E. Karniadakis, Wiener chaos vs stochastic collocation methods for linear advection-diffusion equations with multiplicative white noise. SIAM J. Numer. Anal. 53, 153–183 (2015)

508. Z. Zhang, X. Yang, G. Lin, G.E. Karniadakis, Numerical solution of the Stratonovich- and Ito-Euler equations: application to the stochastic piston problem. J. Comput. Phys. 236, 15–27 (2013)


509. M. Zheng, B. Rozovsky, G.E. Karniadakis, Adaptive Wick-Malliavin approximation to nonlinear SPDEs with discrete random variables. SIAM J. Sci. Comput. 37, A1872–A1890 (2015)

510. B. Zhibaĭtis, V. Matskyavichyus, Gaussian approximations of Brownian motion in a stochastic integral. Liet. Mat. Rink. 33, 508–526 (1993)

511. B. Zibaitis, Mollifier approximation of Brownian motion in stochastic integral. Litovsk. Mat. Sb. 30, 717–727 (1990)

512. X. Zong, F. Wu, C. Huang, Convergence and stability of the semi-tamed Euler scheme for stochastic differential equations with non-Lipschitz continuous coefficients. Appl. Math. Comput. 228, 240–250 (2014)


Index

Q-Wiener process, 80
z value, 32

additive noise, 7, 67, 83, 84, 87
autonomous SODE, 61

backward stochastic differential equation, 231
backward stochastic integral, 231
balanced scheme, 145
bounded variation, 104
Brownian bridge, 19
Brownian motion
  mutually independent, 6
  spectral approximation, 7
  spectral truncation, 8
Burkholder-Davis-Gundy inequality, 71, 110

Cameron-Martin basis, 28, 36, 43, 45
Cameron-Martin theorem, 35, 319
Charlier polynomial chaos, 318, 326, 328
Chebyshev inequality, 32
color noise, 2
  space-time color noise, 6
  space-time white noise, 6
color of noise
  1/fα noise, 2, 3
  blue noise, 2
  pink noise, 2
  red noise, see also Brownian motion, 2
  white noise, 3
commutative noises, 62, 69, 79, 93, 124, 129, 132, 157, 185, 231, 249
complete orthonormal system (CONS), 17, 77
confidence interval, 32, 33
conversion
  between Ito's and Stratonovich SPDEs, 78
  of a Stratonovich integral to an Ito integral, 26
  of a Stratonovich SDDE to an Ito SDDE, 132
  of an Ito SDDE to a Stratonovich SDDE, 120
  of an Ito SDE to a Stratonovich SDE, 96
correlation, 3, 12
correlation length, 1, 4, 14
covariance function, 4, 14
covariance matrix, 176
cumulative distribution function, 11

deterministic integration methods in random space, 8
dominated convergence theorem, 111

© Springer International Publishing AG 2017. Z. Zhang, G.E. Karniadakis, Numerical Methods for Stochastic Partial Differential Equations with White Noise, Applied Mathematical Sciences 196, DOI 10.1007/978-3-319-57511-7



elliptic equation
  with additive noise, 8
  with multiplicative noise, 8
empirical variance, 32, 33
equation of cumulative distribution function method, 265
equation of probability density function method, 265
Euler equation, 252
  Ito-Euler equation, 252
  stochastic Euler equation, 8, 254
  Stratonovich-Euler equation, 252
explicit Euler scheme, 83
explicit fourth-order Runge-Kutta scheme, RK4, 190
exponential moments, 160

Feller's test, 59
finite dimensional noise, 87
finite elements methods, 89
Fourier spectral differentiation matrix, 211
Fourier spectral method, 185
fractional Brownian motion, 29
fully implicit schemes, 68, 154, 160
  implicit Euler scheme, 156

Gauss quadrature points, 352
Gauss quadrature weights, 352
Gaussian process, 2, 11, 12, 96
  white in time but color in space, 6
generalized polynomial chaos, 48, 330
generalized polynomial chaos expansion, 319, 320

Hölder continuous, 19
Hamiltonian system, 182
  stochastic, 182
Hermite polynomial, 27
homogenization, 338

indicator function, 23
integrating factor method, 57, 87, 161
inverse problems, 7
Ito isometry, 27
Ito process, 30
Ito-Wick product, 6, 28, 298
  see also Wick product, 27

Karhunen-Loève expansion, 13, 14, 17, 301

Lagrange multiplier method, 145
law of the iterated logarithm, 18
Legendre polynomial chaos, 318, 322
linear-implicit Euler scheme, 90
linearization coefficient problem, 321
Lipschitz continuous, 6, 19
  global, 56
  one-sided, 56
lognormal diffusion, 330
lognormal process, 6
Lyapunov stability, 160

Malliavin calculus, 91
Malliavin derivative, 299
martingale, 27
mean-square stability, 82
  of a stochastic advection-diffusion equation, 91
mean-square stability region, 92
method of characteristics, 179, 180, 186
method of eigenfunction expansion, 70
method of moment equations, 58
midpoint scheme, 182
Mikulevicius-Rozovsky formula, 300, 321, 332
mild form, 274
mild solution, 74, 274
Milstein scheme, 61
modified Bessel function, 14
Monte Carlo methods, 8
moving boundary problem, 255
multiplicative noise, 7, 67, 87, 88, 182, 297
multiplicative noises, 93

non-commutative noises, 79, 124, 133, 248, 249
non-Gaussian white noise, 297
nonlinear filtering, 5
Nyström method, 15

Ogawa integral, 29
Ornstein-Uhlenbeck process, 18


orthogonal expansions, 17
orthonormality, 222

pathwise convergence
  numerical SPDEs, 81, 91
penalty method, 145
perturbation analysis, 253, 304, 330
Poisson noise, 325
polynomial exactness, 353
population growth model, 1
post-processing scheme, 67
predictor-corrector scheme, 67, 120
pressure equation, 6
principal component analysis, PCA, 15
propagator, 43, 45, 76, 320, 330
pseudorandom number generator, 32
  Mersenne Twister, 3, 22, 33, 123, 155, 292

Q-Wiener process, 6
quasi-Monte Carlo methods, 8
  Halton sequence, 41, 42, 262
  Sobol sequence, 42, 262

Riemann sum, 4
Runge-Kutta method
  strong-property-preserving, 257

sample path, 21
  see also realization, trajectory
semi-implicit Euler, 89
semi-tamed Euler scheme, 159
shock location, 253
shock tube, 253
Slobodeckij semi-norm, 234
space-time white noise, 84
sparse grid collocation
  Smolyak type, 8
sparse grid collocation method, 44
sparsity, 185
SPDEs
  amplitude equation, 7
  Kolmogorov equation, 7
  multiscale equation, 7
  well-posedness, 7
splitting method, 86, 252, 266
stability
  A-stability, 67
  L-stability, 67
  linear stability, 82, 92
  numerical SDDEs, 119, 129
  numerical SODEs, 66
  numerical SPDEs, 91
  zero-stability
    numerical SPDEs, 81
stability region
  mean-square, 68
statistical error, 32, 33
stochastic advection-reaction-diffusion equations, 7
stochastic Burgers equation, 89
stochastic collocation methods, 7, 37, 38
  recursive SCM, 8
stochastic hyperbolic problems, 265
stochastic Navier-Stokes equation, 6
stochastic passive scalar equation, 183
stochastic piston problem, 8
stochastic reaction-diffusion equation, 6
Stratonovich integral, 29
Stratonovich product, 4, 27
strong convergence
  numerical SODEs, 62
  numerical SPDEs, 80
Strong law of large numbers for Brownian motion, 18
strong solution
  SODEs, 55
  SPDEs, 73
symplectic schemes, 182

the method of change-of-variable, 57
three-term recurrence relation
  for Hermite polynomial, 36
  for Legendre polynomial, 333
time integration
  long time integration, 8
  short time integration, 8
Toeplitz matrix, 211
total variation, 35

variance reduction methods, 33, 190
  control variates, 33
  multilevel Monte Carlo method, 33
variational solution, 74

weak convergence
  numerical SODEs, 64


  numerical SPDEs, 81, 90

WENO scheme, 257

white noise, 4

  definition, 317
  spatial, 299
Wick product, 297
  for generalized Cameron-Martin basis, 320
Wick-Malliavin approximation, 8, 297
Wiener chaos expansion, 28, 35
Wiener chaos expansion methods, 7
  recursive WCE, 7
Wiener chaos solution, 74
Wong-Zakai approximation, 7, 8, 80
Wong-Zakai correction term, 104

Zakai equation, 5, 7