
M5A44 COMPUTATIONAL STOCHASTIC PROCESSES

Professor G.A. Pavliotis

Department of Mathematics, Imperial College London, UK
[email protected]

WINTER TERM 2014-2015
14/01/2015

Pavliotis (IC) CompStochProc 1 / 298

Lectures: Mondays 11:00-13:00, Wednesdays 12:00-13:00, Huxley 658.

Office Hours: Mondays 14:00-15:00, Wednesdays 14:00-15:00 or by appointment, Huxley 621.
Course webpage: http://www.ma.imperial.ac.uk/~pavl/comp_stoch_proc.htm

Text: Lecture notes, available from the course webpage. Also, recommended reading from various textbooks/review articles.

Pavliotis (IC) CompStochProc 2 / 298

This is an introductory course on computational stochastic processes, aimed towards 4th year, MSc and MRes students in applied mathematics and theoretical physics.

Prior knowledge of basic stochastic processes in continuous time, scientific computing and a programming language such as Matlab or C is assumed.

Pavliotis (IC) CompStochProc 3 / 298

1 PART I: INTRODUCTION Random number generators. Random variables, probability distribution functions. Introduction to Monte Carlo techniques. Variance reduction techniques.

2 SIMULATION OF STOCHASTIC PROCESSES Introduction to stochastic processes. Brownian motion and related stochastic processes and their simulation. Gaussian stochastic processes. Karhunen-Loève expansion and simulation of Gaussian random fields. Numerical solution of ODEs and PDEs with random coefficients.

Pavliotis (IC) CompStochProc 4 / 298

3. NUMERICAL SOLUTION OF STOCHASTIC DIFFERENTIAL EQUATIONS Basic properties of stochastic differential equations. Itô's formula and stochastic Taylor expansions. Examples of numerical methods: the Euler-Maruyama and Milstein schemes. Theoretical issues: convergence, consistency, stability. Weak versus strong convergence. Implicit schemes and stiff SDEs.

Pavliotis (IC) CompStochProc 5 / 298

4. ERGODIC DIFFUSION PROCESSES Deterministic methods: numerical solution of the Fokker-Planck equation. Numerical calculation of the invariant measure/steady state simulation. Calculation of expectation values, the Feynman-Kac formula.

5. MARKOV CHAIN MONTE CARLO (MCMC) The Metropolis-Hastings and MALA algorithms. Diffusion limits for MCMC algorithms. Peskun and Tierney orderings and Markov chain comparison. Optimization of MCMC algorithms. Bias correction and variance reduction techniques.

Pavliotis (IC) CompStochProc 6 / 298

6. STATISTICAL INFERENCE FOR DIFFUSION PROCESSES Estimation of the diffusion coefficient (volatility). Maximum likelihood estimation of drift coefficients in SDEs. Nonparametric and Bayesian techniques.

7. APPLICATIONS Probabilistic methods for the numerical solution of partial differential equations. Exit time problems. Molecular dynamics/computational statistical mechanics. Molecular motors, stochastic resonance.

Pavliotis (IC) CompStochProc 7 / 298


Prerequisites

Basic knowledge of ODEs and PDEs.

Familiarity with the theory of stochastic processes.

Stochastic differential equations.

Basic knowledge of functional analysis and stochastic analysis.

Numerical methods and scientific computing.

Familiarity with a programming language: Matlab, C, Fortran....

Pavliotis (IC) CompStochProc 8 / 298


Assessment

Based on 1 project (25%) and a final exam (75%).

Mastery exam: additional project/paper to study and write a report.

The project will be (mostly) computational.

The exam will be theoretical (e.g. analysis of algorithms).

Pavliotis (IC) CompStochProc 9 / 298

Bibliography

Lecture notes will be provided for all the material that we will cover in this course. They will be posted on the course webpage. Books that cover (parts of) the contents of this course are:

G.A. Pavliotis, Stochastic Processes and Applications, Springer (2014).

Kloeden & Platen, Numerical Solution of Stochastic Differential Equations, Springer (1992).

S.M. Iacus, Simulation and Inference for Stochastic Differential Equations, Springer (2008).

Asmussen & Glynn, Stochastic Simulation, Springer (2007).

Huynh, Lai & Soumare, Stochastic Simulation and Applications in Finance with Matlab Programs, Wiley (2008).

Pavliotis (IC) CompStochProc 10 / 298


Lectures

Slides and whiteboard.

Computer experiments in Matlab.

Pavliotis (IC) CompStochProc 11 / 298

Why Computational Stochastic Processes?

Many models in science, engineering and economics are probabilistic in nature and we have to deal with uncertainty.

Many deterministic problems can be solved more efficiently using probabilistic techniques.

Pavliotis (IC) CompStochProc 12 / 298

Deriving dynamical models from paleoclimatic records. F. Kwasniok and G. Lohmann, Phys. Rev. E 80(6), 066104 (2009).

Pavliotis (IC) CompStochProc 13 / 298

Fit this data to a bistable SDE:

\[ dx = -V'(x; a)\, dt + \sigma\, dW, \qquad V(x) = \sum_{j=1}^{4} a_j x^j. \tag{1} \]

Estimate the coefficients in the drift from the paleoclimatic data using the unscented Kalman filter.

The resulting potential is highly asymmetric.

Pavliotis (IC) CompStochProc 14 / 298


Pavliotis (IC) CompStochProc 15 / 298

Computational Statistical Physics

Sample from the Boltzmann distribution:

\[ \pi(x) = \frac{1}{Z} e^{-\beta U(x)}, \tag{2} \]

where $\beta = (k_B T)^{-1}$ is the inverse temperature and the normalization constant $Z$ is the partition function

\[ Z = \int_{\mathbb{R}^{3k}} e^{-\beta U(x)}\, dx. \tag{3} \]

$x$ denotes the configuration of the system, $x = \{x_i : i = 1, \dots, k\}$, $x_i = (x_{i1}, x_{i2}, x_{i3})$. The potential $U(x)$ is a sum of pairwise interactions

\[ U(x) = \sum_{i,j} \Phi(|x_i - x_j|), \qquad \Phi(r) = 4\epsilon \left[ \left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6} \right]. \]

Pavliotis (IC) CompStochProc 16 / 298

We need to be able to calculate $Z(\beta)$, which is a very high dimensional integral.

We want to be able to calculate expectations with respect to $\pi(x)$:

\[ \mathbb{E}_\pi f(x) = \int_{\mathbb{R}^{3k}} f(x) \pi(x)\, dx. \]

One method for doing this is by simulating a diffusion process (MCMC):

\[ dX_t = -\nabla U(X_t)\, dt + \sqrt{2\beta^{-1}}\, dW_t. \]
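As an illustration, here is a minimal Matlab sketch of this approach: the SDE is discretized with an explicit Euler scheme (the Euler-Maruyama method, studied later in the course). The one-dimensional double-well potential U(x) = (x^2 - 1)^2/4 and all parameter values are assumptions chosen for the example.

function xs = sampleBoltzmann(beta, dt, N)
% Approximate sampling from pi(x) ~ exp(-beta*U(x)) by discretizing
% dX = -U'(X) dt + sqrt(2/beta) dW with the Euler-Maruyama scheme.
% Assumed example potential: U(x) = (x^2-1)^2/4, so U'(x) = x^3 - x.
gradU = @(x) x.^3 - x;
xs = zeros(N,1);
x = 0;                                     % initial condition
for n = 1:N
    x = x - gradU(x)*dt + sqrt(2*dt/beta)*randn;
    xs(n) = x;
end
end

For example, xs = sampleBoltzmann(3, 1e-2, 1e5); hist(xs, 50) displays the two modes of the corresponding Boltzmann density.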

Pavliotis (IC) CompStochProc 17 / 298

Some interesting questions

How many experiments should we do?

How do we compute expectations with respect to stationary distributions?

Can we exploit the problem structure to speed up the computation?

How do we efficiently compute probabilities of rare events?

How do we estimate the sensitivity of a stochastic model to changes in a parameter?

How do we use stochastic simulation to optimize our choice of decision parameters?

Pavliotis (IC) CompStochProc 18 / 298

Example: Brownian motion in a periodic potential

Consider the Langevin dynamics in a smooth periodic potential

\[ \ddot{q} = -\nabla V(q) - \gamma \dot{q} + \sqrt{2\gamma\beta^{-1}}\, \dot{W}, \tag{4} \]

where $W(t)$ denotes standard $d$-dimensional Brownian motion, $\gamma > 0$ the friction coefficient and $\beta$ the inverse temperature (strength of the noise).

Write as a first order system:

\[ dq_t = p_t\, dt, \tag{5a} \]
\[ dp_t = -\nabla_q V(q_t)\, dt - \gamma p_t\, dt + \sqrt{2\gamma\beta^{-1}}\, dW_t. \tag{5b} \]

Pavliotis (IC) CompStochProc 19 / 298

At long times the solution $q_t$ performs an effective Brownian motion with diffusion coefficient $D$: $\mathbb{E} q_t^2 \sim 2Dt$, $t \gg 1$.

We compute the diffusion coefficient for $V(q) = \cos(q)$ using the formula

\[ D = \lim_{t \to \infty} \frac{\mathrm{Var}(q_t)}{2t}, \]

as a function of $\gamma$, at a fixed temperature $\beta^{-1}$.

The SDE (5) has to be solved numerically using an appropriate scheme.

The numerical simulations suggest that

\[ D \sim \frac{1}{\gamma}, \quad \text{for both } \gamma \ll 1 \text{ and } \gamma \gg 1. \tag{6} \]
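A minimal Matlab sketch of this computation, using an explicit Euler-Maruyama discretization of (5) with V(q) = cos(q); the time horizon, timestep and number of paths are assumptions chosen for the example.

function D = estimateD(gamma, beta, T, dt, M)
% Estimate D = Var(q_T)/(2T) for the Langevin dynamics (5) with
% V(q) = cos(q), so -V'(q) = sin(q). Explicit Euler-Maruyama over M paths.
N = round(T/dt);
q = zeros(M,1); p = zeros(M,1);
for n = 1:N
    dW = sqrt(dt)*randn(M,1);
    pnew = p + (sin(q) - gamma*p)*dt + sqrt(2*gamma/beta)*dW;
    q = q + p*dt;
    p = pnew;
end
D = var(q)/(2*T);
end

For instance, D = estimateD(0.5, 1, 1e3, 1e-2, 1e3) gives a crude estimate; T must be large enough for the diffusive regime to set in.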

Pavliotis (IC) CompStochProc 20 / 298

a. $\mathrm{Var}(q_t)/2t$ vs $t$, for $\gamma = 0.01, 0.0158, 0.0251, 0.0398, 0.063$.  b. $D$ vs $\gamma$: $D_{\mathrm{eff}}$, $D^0_{\mathrm{eff}}$, and the fit $0.1826/\gamma^{0.97}$.

Pavliotis (IC) CompStochProc 21 / 298

The One-Dimensional Random Walk

We let time be discrete, i.e. $t = 0, 1, \dots$. Consider the following stochastic process $S_n$:

$S_0 = 0$;

at each time step it moves $\pm 1$ with equal probability $\frac{1}{2}$.

In other words, at each time step we flip a fair coin. If the outcome is heads, we move one unit to the right. If the outcome is tails, we move one unit to the left. Alternatively, we can think of the random walk as a sum of independent random variables:

\[ S_n = \sum_{j=1}^{n} X_j, \]

where $X_j \in \{-1, 1\}$ with $P(X_j = \pm 1) = \frac{1}{2}$.

Pavliotis (IC) CompStochProc 22 / 298

We can simulate the random walk on a computer:

We need a (pseudo)random number generator to generate $n$ independent random variables which are uniformly distributed in the interval $[0,1]$.

If the value of the random variable is $> \frac{1}{2}$ then the particle moves to the left, otherwise it moves to the right.

We then take the sum of all these random moves.

The sequence $\{S_n\}_{n=1}^{N}$ indexed by the discrete time $T = \{1, 2, \dots, N\}$ is the path of the random walk. We use a linear interpolation (i.e. connect the points $(n, S_n)$ by straight lines) to generate a continuous path.
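A minimal Matlab implementation of this recipe; the path length N and the number of paths M are assumptions for the example.

N = 50; M = 3;                      % number of steps and of paths
U = rand(N, M);                     % uniform random numbers in [0,1]
X = 1 - 2*(U > 0.5);                % U > 1/2: move left (-1); otherwise right (+1)
S = [zeros(1,M); cumsum(X)];        % paths of the walk, with S_0 = 0
plot(0:N, S);                       % linear interpolation of the points (n, S_n)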

Pavliotis (IC) CompStochProc 23 / 298


Figure: Three paths of the random walk of length N = 50.

Pavliotis (IC) CompStochProc 24 / 298


Figure: Three paths of the random walk of length N = 1000.

Pavliotis (IC) CompStochProc 25 / 298

Every path of the random walk is different: it depends on the outcome of a sequence of independent random experiments.

We can compute statistics by generating a large number of paths and computing averages. For example, $\mathbb{E}(S_n) = 0$, $\mathbb{E}(S_n^2) = n$.

The paths of the random walk (without the linear interpolation) are not continuous: the random walk has a jump of size 1 at each time step.

This is an example of a discrete time, discrete space stochastic process.

The random walk is a time-homogeneous (the probabilistic law of evolution is independent of time) Markov (the future depends only on the present and not on the past) process.

If we take a large number of steps, the random walk starts looking like a continuous time process with continuous paths.

Pavliotis (IC) CompStochProc 26 / 298

Consider the sequence of continuous time stochastic processes

\[ Z_t^n := \frac{1}{\sqrt{n}} S_{nt}. \]

In the limit as $n \to \infty$, the sequence $Z_t^n$ converges (in some appropriate sense) to a Brownian motion with diffusion coefficient $D = \frac{\Delta x^2}{2\Delta t} = \frac{1}{2}$.

Pavliotis (IC) CompStochProc 27 / 298

Figure: Sample Brownian paths ($U(t)$ vs $t$; mean of 1000 paths and 5 individual paths).

Pavliotis (IC) CompStochProc 28 / 298

Brownian motion $W(t)$ is a continuous time stochastic process with continuous paths that starts at 0 ($W(0) = 0$) and has independent, normally distributed (Gaussian) increments.

We can simulate Brownian motion on a computer using a random number generator that generates normally distributed, independent random variables.

Pavliotis (IC) CompStochProc 29 / 298

We can write an equation for the evolution of the paths of a Brownian motion $X_t$ with diffusion coefficient $D$ starting at $x$:

\[ dX_t = \sqrt{2D}\, dW_t, \qquad X_0 = x. \]

This is an example of a stochastic differential equation.

The probability of finding $X_t$ at $y$ at time $t$, given that it was at $x$ at time $t = 0$, the transition probability density $\rho(y, t)$, satisfies the PDE

\[ \frac{\partial \rho}{\partial t} = D \frac{\partial^2 \rho}{\partial y^2}, \qquad \rho(y, 0) = \delta(y - x). \]

This is an example of the Fokker-Planck equation.

The connection between Brownian motion and the diffusion equation was made by Einstein in 1905.

Pavliotis (IC) CompStochProc 30 / 298


ELEMENTS OF PROBABILITY THEORY

Pavliotis (IC) CompStochProc 31 / 298

A collection of subsets of a set $\Omega$ is called a $\sigma$-algebra if it contains $\Omega$ and is closed under the operations of taking complements and countable unions of its elements.

A sub-$\sigma$-algebra is a collection of subsets of a $\sigma$-algebra which satisfies the axioms of a $\sigma$-algebra.

A measurable space is a pair $(\Omega, \mathcal{F})$ where $\Omega$ is a set and $\mathcal{F}$ is a $\sigma$-algebra of subsets of $\Omega$.

Let $(\Omega, \mathcal{F})$ and $(E, \mathcal{G})$ be two measurable spaces. A function $X : \Omega \to E$ such that the event

\[ \{\omega \in \Omega : X(\omega) \in A\} =: \{X \in A\} \]

belongs to $\mathcal{F}$ for arbitrary $A \in \mathcal{G}$ is called a measurable function or random variable.

Pavliotis (IC) CompStochProc 32 / 298

Let $(\Omega, \mathcal{F})$ be a measurable space. A function $\mu : \mathcal{F} \to [0, 1]$ is called a probability measure if $\mu(\emptyset) = 0$, $\mu(\Omega) = 1$ and $\mu(\cup_{k=1}^{\infty} A_k) = \sum_{k=1}^{\infty} \mu(A_k)$ for all sequences of pairwise disjoint sets $\{A_k\}_{k=1}^{\infty} \in \mathcal{F}$.

The triplet $(\Omega, \mathcal{F}, \mu)$ is called a probability space.

Let $X$ be a random variable (measurable function) from $(\Omega, \mathcal{F}, \mu)$ to $(E, \mathcal{G})$. If $E$ is a metric space then we may define the expectation with respect to the measure $\mu$ by

\[ \mathbb{E}[X] = \int_{\Omega} X(\omega)\, d\mu(\omega). \]

More generally, let $f : E \to \mathbb{R}$ be $\mathcal{G}$-measurable. Then,

\[ \mathbb{E}[f(X)] = \int_{\Omega} f(X(\omega))\, d\mu(\omega). \]

Pavliotis (IC) CompStochProc 33 / 298

Let $U$ be a topological space. We will use the notation $\mathcal{B}(U)$ to denote the Borel $\sigma$-algebra of $U$: the smallest $\sigma$-algebra containing all open sets of $U$. Every random variable from a probability space $(\Omega, \mathcal{F}, \mu)$ to a measurable space $(E, \mathcal{B}(E))$ induces a probability measure on $E$:

\[ \mu_X(B) = P X^{-1}(B) = \mu(\omega \in \Omega; X(\omega) \in B), \qquad B \in \mathcal{B}(E). \]

The measure $\mu_X$ is called the distribution (or sometimes the law) of $X$.

Example

Let $I$ denote a subset of the positive integers. A vector $\rho_0 = \{\rho_{0,i}, i \in I\}$ is a distribution on $I$ if it has nonnegative entries and its total mass equals 1: $\sum_{i \in I} \rho_{0,i} = 1$.

Pavliotis (IC) CompStochProc 34 / 298

We can use the distribution of a random variable to compute expectations and probabilities:

\[ \mathbb{E}[f(X)] = \int_{S} f(x)\, d\mu_X(x) \]

and

\[ P[X \in G] = \int_{G} d\mu_X(x), \qquad G \in \mathcal{B}(E). \]

When $E = \mathbb{R}^d$ and we can write $d\mu_X(x) = \rho(x)\, dx$, then we refer to $\rho(x)$ as the probability density function (pdf), or density with respect to Lebesgue measure for $X$.

When $E = \mathbb{R}^d$ then by $L^p(\Omega; \mathbb{R}^d)$, or sometimes $L^p(\Omega; \mu)$ or even simply $L^p(\mu)$, we mean the Banach space of measurable functions on $\Omega$ with norm

\[ \|X\|_{L^p} = \left( \mathbb{E}|X|^p \right)^{1/p}. \]

Pavliotis (IC) CompStochProc 35 / 298

Example

Consider the random variable $X : \Omega \to \mathbb{R}$ with pdf

\[ \gamma_{\sigma,m}(x) := (2\pi\sigma)^{-\frac{1}{2}} \exp\left( -\frac{(x - m)^2}{2\sigma} \right). \]

Such an $X$ is termed a Gaussian or normal random variable. The mean is

\[ \mathbb{E}X = \int_{\mathbb{R}} x\, \gamma_{\sigma,m}(x)\, dx = m \]

and the variance is

\[ \mathbb{E}(X - m)^2 = \int_{\mathbb{R}} (x - m)^2 \gamma_{\sigma,m}(x)\, dx = \sigma. \]

Pavliotis (IC) CompStochProc 36 / 298

Example (Continued)

Let $m \in \mathbb{R}^d$ and $\Sigma \in \mathbb{R}^{d \times d}$ be symmetric and positive definite. The random variable $X : \Omega \to \mathbb{R}^d$ with pdf

\[ \gamma_{\Sigma,m}(x) := \left( (2\pi)^d \det\Sigma \right)^{-\frac{1}{2}} \exp\left( -\frac{1}{2} \langle \Sigma^{-1}(x - m), (x - m) \rangle \right) \]

is termed a multivariate Gaussian or normal random variable. The mean is

\[ \mathbb{E}(X) = m \tag{7} \]

and the covariance matrix is

\[ \mathbb{E}\left( (X - m) \otimes (X - m) \right) = \Sigma. \tag{8} \]

Pavliotis (IC) CompStochProc 37 / 298

Since the mean and variance specify completely a Gaussian random variable on $\mathbb{R}$, the Gaussian is commonly denoted by $\mathcal{N}(m, \sigma)$. The standard normal random variable is $\mathcal{N}(0, 1)$.

Since the mean and covariance matrix completely specify a Gaussian random variable on $\mathbb{R}^d$, the Gaussian is commonly denoted by $\mathcal{N}(m, \Sigma)$.

Pavliotis (IC) CompStochProc 38 / 298

The Characteristic Function

Many of the properties of (sums of) random variables can be studied using the Fourier transform of the distribution function.

The characteristic function of the random variable $X$ is defined to be the Fourier transform of the distribution function

\[ \phi(t) = \int_{\mathbb{R}} e^{it\lambda}\, d\mu_X(\lambda) = \mathbb{E}(e^{itX}). \tag{9} \]

The characteristic function determines uniquely the distribution function of the random variable, in the sense that there is a one-to-one correspondence between $F(\lambda)$ and $\phi(t)$.

The characteristic function of a $\mathcal{N}(m, \Sigma)$ is

\[ \phi(t) = e^{i\langle m, t \rangle - \frac{1}{2}\langle t, \Sigma t \rangle}. \]

Pavliotis (IC) CompStochProc 39 / 298

Lemma

Let $X_1, X_2, \dots, X_n$ be independent random variables with characteristic functions $\phi_j(t)$, $j = 1, \dots, n$ and let $Y = \sum_{j=1}^{n} X_j$ with characteristic function $\phi_Y(t)$. Then

\[ \phi_Y(t) = \prod_{j=1}^{n} \phi_j(t). \]

Lemma

Let $X$ be a random variable with characteristic function $\phi(t)$ and assume that it has finite moments. Then

\[ \mathbb{E}(X^k) = \frac{1}{i^k} \phi^{(k)}(0). \]

Pavliotis (IC) CompStochProc 40 / 298

Types of Convergence and Limit Theorems

One of the most important aspects of the theory of random variables is the study of limit theorems for sums of random variables.

The most well known limit theorems in probability theory are the law of large numbers and the central limit theorem.

There are various different types of convergence for sequences of random variables.

Pavliotis (IC) CompStochProc 41 / 298

Definition

Let $\{Z_n\}_{n=1}^{\infty}$ be a sequence of random variables. We will say that

(a) $Z_n$ converges to $Z$ with probability one (almost surely) if

\[ P\left( \lim_{n \to +\infty} Z_n = Z \right) = 1. \]

(b) $Z_n$ converges to $Z$ in probability if for every $\varepsilon > 0$

\[ \lim_{n \to +\infty} P\left( |Z_n - Z| > \varepsilon \right) = 0. \]

(c) $Z_n$ converges to $Z$ in $L^p$ if

\[ \lim_{n \to +\infty} \mathbb{E}\left[ |Z_n - Z|^p \right] = 0. \]

(d) Let $F_n(\lambda)$, $n = 1, \dots, +\infty$, $F(\lambda)$ be the distribution functions of $Z_n$, $n = 1, \dots, +\infty$ and $Z$, respectively. Then $Z_n$ converges to $Z$ in distribution if

\[ \lim_{n \to +\infty} F_n(\lambda) = F(\lambda) \]

for all $\lambda \in \mathbb{R}$ at which $F$ is continuous.

Pavliotis (IC) CompStochProc 42 / 298

Let $\{X_n\}_{n=1}^{\infty}$ be iid random variables with $\mathbb{E}X_n = V$. Then, the strong law of large numbers states that the average of the sum of the iid random variables converges to $V$ with probability one:

\[ P\left( \lim_{N \to +\infty} \frac{1}{N} \sum_{n=1}^{N} X_n = V \right) = 1. \]

The strong law of large numbers provides us with information about the behavior of a sum of random variables (or, a large number of repetitions of the same experiment) on average. We can also study fluctuations around the average behavior. Indeed, let $\mathbb{E}(X_n - V)^2 = \sigma^2$. Define the centered iid random variables $Y_n = X_n - V$. Then, the sequence of random variables $\frac{1}{\sigma\sqrt{N}} \sum_{n=1}^{N} Y_n$ converges in distribution to a $\mathcal{N}(0, 1)$ random variable:

\[ \lim_{N \to +\infty} P\left( \frac{1}{\sigma\sqrt{N}} \sum_{n=1}^{N} Y_n \leqslant a \right) = \int_{-\infty}^{a} \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}x^2}\, dx. \]

Pavliotis (IC) CompStochProc 43 / 298

Assume that $\mathbb{E}|X| < \infty$ and let $\mathcal{G}$ be a sub-$\sigma$-algebra of $\mathcal{F}$. The conditional expectation of $X$ with respect to $\mathcal{G}$ is defined to be the function $\mathbb{E}[X|\mathcal{G}] : \Omega \to E$ which is $\mathcal{G}$-measurable and satisfies

\[ \int_{G} \mathbb{E}[X|\mathcal{G}]\, d\mu = \int_{G} X\, d\mu \quad \forall G \in \mathcal{G}. \]

We can define $\mathbb{E}[f(X)|\mathcal{G}]$ and the conditional probability $P[X \in F|\mathcal{G}] = \mathbb{E}[I_F(X)|\mathcal{G}]$, where $I_F$ is the indicator function of $F$, in a similar manner.

Pavliotis (IC) CompStochProc 44 / 298

Random Number Generators

How can a deterministic machine produce random numbers? We can use a chaotic dynamical system:

\[ x_{n+1} = f(x_n), \qquad x_0 = x, \quad n = 1, 2, \dots \]

Provided that this dynamical system is "sufficiently chaotic", we can hope that for $n$ sufficiently large the sequence of numbers $x_n, x_{n+1}, x_{n+2}, \dots$ is random with nice statistical properties. Statistical tests can be used in order to determine whether this sequence of pseudo-random numbers can be used in order to perform Monte Carlo simulations, simulate stochastic processes etc. Modern programming languages have well tested built-in RNGs. In Matlab: rand for U(0, 1), randn for N(0, 1).

Pavliotis (IC) CompStochProc 45 / 298

Definition

A uniform pseudo-number generator is an algorithm which, starting from an initial value $x_0$ (the seed), produces a sequence $u_i = D^i u_0$ of values in $[0, 1]$.

Lemma

Suppose $U \sim U(0, 1)$ and $F$ is a one-dimensional cumulative distribution function. Then $X = F^{-1}(U)$ with $F^{-1}(u) = \inf\{x \,;\, F(x) \geqslant u\}$ has the distribution $F$.

We can use the Box-Muller algorithm to generate Gaussian random variables, given a pseudonumber generator of uniform r.v.: given $U_1, U_2 \sim U(0, 1)$ independent, then

\[ Z_0 = \sqrt{-2 \ln U_1} \cos(2\pi U_2), \qquad Z_1 = \sqrt{-2 \ln U_1} \sin(2\pi U_2) \]

are independent $\mathcal{N}(0, 1)$ r.v.
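A minimal Matlab sketch of both recipes; the exponential distribution (rate lambda, an assumption for the example) illustrates the inverse transform lemma:

N = 1e5; lambda = 2;                 % assumed sample size and rate
U = rand(N,1);
X = -log(1 - U)/lambda;              % inverse transform: F(x) = 1 - exp(-lambda*x)
U1 = rand(N,1); U2 = rand(N,1);      % Box-Muller from two independent uniforms
Z0 = sqrt(-2*log(U1)).*cos(2*pi*U2);
Z1 = sqrt(-2*log(U1)).*sin(2*pi*U2); % Z0, Z1 independent N(0,1)
[mean(Z0), var(Z0)]                  % should be close to [0, 1]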

Pavliotis (IC) CompStochProc 46 / 298

Fast Chaotic Noise

Fast/slow system:

\[ \frac{dx}{dt} = x - x^3 + \frac{\lambda}{\varepsilon} y_2, \]
\[ \frac{dy_1}{dt} = \frac{10}{\varepsilon^2}(y_2 - y_1), \]
\[ \frac{dy_2}{dt} = \frac{1}{\varepsilon^2}(28 y_1 - y_2 - y_1 y_3), \]
\[ \frac{dy_3}{dt} = \frac{1}{\varepsilon^2}\left( y_1 y_2 - \frac{8}{3} y_3 \right). \]

Effective dynamics [Melbourne, Stuart '11]:

\[ dX_t = A(X_t - X_t^3)\, dt + \sqrt{\sigma}\, dW_t. \]

True values:

\[ A = 1, \qquad \lambda = \frac{2}{45}, \qquad \sigma = 2\lambda^2 \int_0^{\infty} \lim_{T \to \infty} \frac{1}{T} \int_0^T \psi_s(y) \psi_{s+t}(y)\, ds\, dt. \]

Pavliotis (IC) CompStochProc 47 / 298

Fast Chaotic Noise: Estimators

Values for $\sigma$ reported in the literature ($\varepsilon = 10^{-3/2}$): $0.126 \pm 0.003$ via Gaussian moment approximation; $0.13 \pm 0.01$ via HMM.

Here: $\varepsilon = 10^{-1} \to \sigma \approx 0.121$ and $\varepsilon = 10^{-3/2} \to \sigma \approx 0.124$.

But we estimate also $A$.

Pavliotis (IC) CompStochProc 48 / 298

Generation of Multidimensional Gaussian Random Variables

We are given $Z \sim \mathcal{N}(0, I)$, where $I \in \mathbb{R}^{d \times d}$ denotes the identity matrix.

We want to generate a Gaussian random vector $Y \sim \mathcal{N}(b, \Sigma)$, $\mathbb{E}Y = b$, $\mathbb{E}((Y - b)(Y - b)^T) = \Sigma$.

$X = Y - b$ is mean zero, so it is sufficient to consider this case.

We can use the Cholesky decomposition: assuming that $\Sigma > 0$ there exists a unique lower triangular matrix $\Lambda$ such that

\[ \Sigma = \Lambda \Lambda^T. \tag{10} \]

Then for $X = \Lambda Z$ we have $X \sim \mathcal{N}(0, \Sigma)$.

The computational cost is $O(d^3)$.

In Matlab: chol(Σ). A short sketch is given below.
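A minimal sketch (the mean and covariance are assumptions for the example); note that Matlab's chol returns an upper triangular factor, hence the transpose:

d = 2; M = 1e4;
b = [1; -1];                               % assumed mean
Sigma = [1.0, 0.5; 0.5, 2.0];              % assumed covariance matrix (SPD)
Lambda = chol(Sigma)';                     % lower triangular, Sigma = Lambda*Lambda'
Y = repmat(b, 1, M) + Lambda*randn(d, M);  % columns are N(b, Sigma) samples
cov(Y')                                    % should be close to Sigma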

Pavliotis (IC) CompStochProc 49 / 298

Eigenvalue decomposition of Cov(X)

The Cholesky decomposition is not unique for covariance matrices that are not strictly positive definite.

The eigenvalues and eigenvectors of the covariance matrix provide us with complete information about the Gaussian random vector.

Principal Component Analysis.

Let $X \sim \mathcal{N}(0, \Sigma)$, $\Sigma > 0$. Then

\[ \Sigma = P D P^T, \qquad P P^T = I, \qquad D = \mathrm{diag}(d_1, \dots, d_d). \tag{11} \]

Set $L = P\sqrt{D}$ and proceed as with the Cholesky decomposition, i.e. $X = LZ$.

In Matlab: [P,D] = eig(Σ)
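The same sampling step via the eigenvalue decomposition, continuing with the assumed Sigma, d and M of the previous sketch:

[P, D] = eig(Sigma);                  % Sigma = P*D*P'
L = P*sqrt(D);                        % matrix square root: L*L' = Sigma
X = L*randn(d, M);                    % columns are N(0, Sigma) samples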

Pavliotis (IC) CompStochProc 50 / 298


Other techniques for generating random numbers with a givendistribution:

Acceptance-rejection algorithm.

Markov Chain Monte Carlo.

Pavliotis (IC) CompStochProc 51 / 298

Monte Carlo Techniques

Example:

\[ I = \int_{D} f(x)\, dx = \mathbb{E}f(X), \qquad x \sim (U(D))^d, \]

\[ I \approx \frac{1}{N} \sum_{i=1}^{N} f(x_i) =: I_N, \qquad x_i \text{ iid } U(D). \]

To prove convergence use the (strong) law of large numbers:

\[ \lim_{N \to +\infty} \frac{1}{N} \sum_{i=1}^{N} f(x_i) = I \quad \text{a.s.} \]
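A minimal Matlab sketch of the estimator on D = [0,1]^d; the integrand is an assumption chosen for the example:

N = 1e5; d = 2;
f = @(x) exp(-sum(x.^2, 1));          % assumed integrand; columns of x are points
x = rand(d, N);                       % iid uniform points in [0,1]^d
IN = mean(f(x));                      % Monte Carlo estimate I_N
stderrN = sqrt(var(f(x))/N);          % standard error, of order N^{-1/2}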

Pavliotis (IC) CompStochProc 52 / 298

To obtain an estimate on the rate of convergence use the central limit theorem:

\[ \sqrt{N}\left( I_N - I \right) \to \mathcal{N}(0, \sigma^2), \quad \text{in distribution}, \]

where $\sigma^2 = \mathrm{Var}_\pi(f(x))$.

The error of the Monte Carlo approximation is $O(N^{-1/2})$, regardless of the dimensionality of $x$.

For smooth functions, deterministic approximations (Riemann sum, trapezoidal rule...) can have a better convergence rate, but they don't scale well as the number of dimensions increases.

Pavliotis (IC) CompStochProc 53 / 298

More generally, consider the integral

\[ I = \int_{\mathbb{R}^d} f(x) \pi(x)\, dx, \tag{12} \]

where $\pi(x)$ is a probability distribution. We can always write integrals in this form (multiply and divide by $\pi(x)$).

Monte Carlo estimator:

\[ I_N = \frac{1}{N} \sum_{i=1}^{N} f(X_i), \qquad X_i \sim \pi(x), \text{ iid}. \tag{13} \]

The MC estimator is unbiased:

\[ \mathbb{E}_\pi I_N = I. \]

Variance of the estimator:

\[ \sigma_I^2 := \mathrm{var}_\pi(I_N) = \frac{1}{N} \mathrm{var}_\pi(f(x)) = \frac{1}{N} \int_{\mathbb{R}^d} (f(x) - I)^2 \pi(x)\, dx. \]

Pavliotis (IC) CompStochProc 54 / 298

From the strong law of large numbers:

\[ \lim_{N \to +\infty} I_N = I \quad \text{a.s.} \]

From the weak LLN and Chebyshev's inequality:

\[ P\left( |I_N - I| \leqslant \varepsilon \right) \geqslant 1 - \frac{\sigma_I^2}{\varepsilon^2} = 1 - \frac{\sigma_f^2}{N \varepsilon^2}. \tag{14} \]

The accuracy of the MC estimator depends on $N$ and on the variance $\sigma_I^2$. In order for

\[ P\left( I_N \in [I - \varepsilon, I + \varepsilon] \right) = 1 - \alpha, \]

for a confidence level $\alpha \in (0, 1)$ we need

\[ N \geqslant \frac{\sigma_f^2}{\alpha \varepsilon^2}. \]

The MC error scales like $1/\sqrt{N}$, independently of the number of dimensions. It is expected that $\sigma_f^2$ increases as the number of dimensions increases.

Pavliotis (IC) CompStochProc 55 / 298

Example (Huynh et al., Stochastic Simulation and Applications in Finance, ch. 5)

function [Call, VarCall] = CalculateCall2(N)
% Price a call option
% N: number of generated points to compute the expectation
x = exp(sqrt(0.1)*randn(1,N) + 5);
Call = mean(max(x - 110, 0));
VarCall = var(max(x - 110, 0))/N;
end

Pavliotis (IC) CompStochProc 56 / 298

VARIANCE REDUCTION TECHNIQUES

To improve the accuracy of the MC estimator and to reduce the computational cost we can use variance reduction techniques. Examples of such techniques are:

Antithetic variables. Control variates. Importance sampling.

To implement these techniques we need to be able to estimate the mean and variance "on the fly":

\[ I_{N+1} = \frac{1}{N+1}\left( N I_N + f(X_{N+1}) \right) \]

and

\[ \mathrm{var}(I_N) = \frac{1}{N}\left( \frac{1}{N-1} \sum_{i=1}^{N} |f(X_i)|^2 - \frac{N}{N-1} I_N^2 \right). \]
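A sketch of these running updates in Matlab (the stream of values f(X_i) is simulated here by a vector fX, an assumption for the example):

fX = randn(1e4,1);                     % assumed stream of values f(X_i)
N = length(fX);
IN = 0; sumsq = 0;
for n = 1:N
    IN = ((n-1)*IN + fX(n))/n;         % I_n = (1/n)((n-1) I_{n-1} + f(X_n))
    sumsq = sumsq + fX(n)^2;
end
varIN = (sumsq/(N-1) - N/(N-1)*IN^2)/N;   % the variance formula above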

Pavliotis (IC) CompStochProc 57 / 298

Antithetic Variables: generate additional random variables that have the same distribution as $X_i$, $i = 1, \dots, N$ but are negatively correlated to $X_i$.

Example: suppose we want to estimate the mean $\mu_X$ of iid random variables $X_i$, $i = 1, \dots, N$. We introduce the auxiliary r.v. $X_i^a$, $i = 1, \dots, N$ with

\[ \mathbb{E}[(X_i^a - \mu_X)(X_j - \mu_X)] = -\delta_{ij} \sigma_X^2. \]

The variance of the sample mean is

\[ \mathrm{Var}(\hat{\mu}_a) = \frac{1}{2N} \sigma_X^2 + \frac{1}{4N^2} \sum_{k,j=1}^{N} \mathbb{E}[(X_k^a - \mu_X)(X_j - \mu_X)] = \frac{1}{4N} \sigma_X^2. \]

It is not possible in general to find perfectly anti-correlated random variables.

Pavliotis (IC) CompStochProc 58 / 298

Example

Let $X \sim U(0, 1)$. Then $X^a = 1 - X$ is an antithetic variable of $X$.

Example

Let $X \sim \mathcal{N}(\mu, \sigma^2)$. Then $X^a = 2\mu - X$ is an antithetic variable of $X$.

Example

Let $X \sim \mathcal{N}(\mu, \Sigma)$ be a 2d Gaussian random vector. We use antithetic variables to reduce the variance and increase the accuracy of the estimation of $\mathbb{E}f(X_1, X_2)$. We will estimate the expectation of $Y_j = e^{X_j}$, $j = 1, 2$. These integrals can be calculated analytically.

Pavliotis (IC) CompStochProc 59 / 298

(Huynh et al., Stochastic Simulation and Applications in Finance, ch. 5)

function Rep = SimulationsAnti
% Calculate the mean of an exponential of a Gaussian using
% antithetic variables.
NbTraj = 100;
% Construct the covariance matrix
sigma1 = 0.2; sigma2 = 0.5; rho = -0.3;
MoyTheo = [10; 15];
CovTheo = [sigma1^2, rho*sigma1*sigma2; rho*sigma1*sigma2, sigma2^2];
L = chol(CovTheo)';
% Simulation of the random variables
Sample = randn(2, NbTraj);
SampleSimple = repmat(MoyTheo, 1, NbTraj) + L*Sample;
% Introduction of the antithetic variables
XAnti = cat(2, SampleSimple, 2*repmat(MoyTheo, 1, NbTraj) - SampleSimple);
% Transformation of Xi into Zi
Z1Anti = exp(XAnti(1,:));
Z2Anti = exp(XAnti(2,:));
Rep = mean(Z1Anti, 2);
end

Pavliotis (IC) CompStochProc 60 / 298

CONTROL VARIATES

Let $Z$ be a random variable. We want to estimate $\mathbb{E}Z$.

We look for a r.v. $W$ that is strongly correlated with $Z$ and with known $\mathbb{E}W$.

Consider the new r.v.

\[ X = Z + \alpha(W - \mathbb{E}W). \tag{15} \]

We have

\[ \mathrm{Var}(X) = \mathrm{Var}(Z) + \alpha^2 \mathrm{Var}(W) + 2\alpha \mathrm{Cov}(Z, W). \]

We minimize the variance with respect to $\alpha$:

\[ \alpha = -\frac{\mathrm{Cov}(Z, W)}{\mathrm{Var}(W)} \quad \text{and} \quad \mathrm{Var}_{\min}(X) = \mathrm{Var}(Z)(1 - \rho^2). \]

Pavliotis (IC) CompStochProc 61 / 298

CONTROL VARIATES

The estimator (15) is unbiased and the variance is reduced when the correlation coefficient $\rho^2 = \frac{\mathrm{Cov}(Z, W)^2}{\mathrm{Var}(Z)\mathrm{Var}(W)}$ is close to 1.

The optimal value of $\alpha$ and $\mathrm{Var}(X)$ can be estimated using the empirical means.

We can also construct confidence intervals using the empirical values.

We can add $R$ control variates:

\[ X = Z + \sum_{j=1}^{R} \alpha_j (W_j - \mathbb{E}W_j). \tag{16} \]

A similar idea can be used for vector valued r.v. $X$, or even in infinite dimensions.
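A minimal Matlab sketch of a single control variate; the choice Z = exp(U) with control W = U (so that E W = 1/2 is known) is an assumption for the example:

N = 1e5;
U = rand(N,1);
Z = exp(U);                            % E[Z] = e - 1, to be estimated
W = U; EW = 0.5;                       % control variate with known mean
C = cov(Z, W);                         % 2x2 empirical covariance matrix
alpha = -C(1,2)/C(2,2);                % empirical optimum -Cov(Z,W)/Var(W)
X = Z + alpha*(W - EW);                % samples of the estimator (15)
[mean(Z) var(Z)/N; mean(X) var(X)/N]   % the variance should drop markedly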

Pavliotis (IC) CompStochProc 62 / 298

Importance sampling: choose a good probability distribution from which to simulate random variables:

\[ \int_{\mathbb{R}^d} f(x)\pi(x)\, dx = \int_{\mathbb{R}^d} \frac{f(x)\pi(x)}{\psi(x)} \psi(x)\, dx = \mathbb{E}_\psi\left( \frac{f(x)\pi(x)}{\psi(x)} \right), \]

where $\psi(x)$ is a probability distribution.

Consider the MC estimator

\[ I_N^\psi = \frac{1}{N} \sum_{j=1}^{N} \frac{f(X_j)\pi(X_j)}{\psi(X_j)}, \qquad X_j \sim \psi(x). \]

The estimator is unbiased. The variance is

\[ \mathrm{var}(I_N^\psi) = \frac{1}{N} \mathrm{var}\left( \frac{f(X)\pi(X)}{\psi(X)} \right). \]

Pavliotis (IC) CompStochProc 63 / 298

We want to choose $\psi(x)$ so that it has a similar shape to $f(x)\pi(x)$.

The closer $\frac{f(X)\pi(X)}{\psi(X)}$ is to a constant, the smaller the variance is.

Example

We use importance sampling to estimate the integral

\[ I = \int_0^1 c \exp(-a(x - b)^4)\, dx = 18.128, \]

with $c = 100$, $a = 10000$, $b = 0.5$. We choose $\psi = \gamma(0.5, 0.05)$. For $N = 10000$ we have $I_N = 18.2279$, $\mathrm{var}(I_N) = 0.1193$, $\mathrm{err}_N = 0.0999$; $I_{IC} = 18.1271$, $\mathrm{var}(I_{IC}) = 0.0041$, $\mathrm{err}(I_{IC}) = 8.56 \times 10^{-4}$.

Pavliotis (IC) CompStochProc 64 / 298

Figure: $c\exp(-a(x - b)^4)$ and $\gamma(0.5, 0.05)$.

Pavliotis (IC) CompStochProc 65 / 298

function [yun, varyun, errun, yIC, varyIC, errIC] = quarticIS(N)
% Use importance sampling to calculate the mean
% of exp(-V) where V is a quartic polynomial
a = 10000; b = 0.5; c = 100;
intexact = c*0.18128;
x = rand(N,1);
yun = mean(c*exp(-a*(x-b).^4));
varyun = var(c*exp(-a*(x-b).^4))/N;
errun = abs(intexact - yun);
mu = 0.5; sigma = 0.05;
xx = mu + sigma*randn(N,1);
yIC = mean((c*exp(-a*(xx-b).^4))./normpdf(xx,mu,sigma));
varyIC = var((c*exp(-a*(xx-b).^4))./normpdf(xx,mu,sigma))/N;
errIC = abs(intexact - yIC);
end

Pavliotis (IC) CompStochProc 66 / 298


ELEMENTS OF THE THEORY OF STOCHASTIC PROCESSES

Pavliotis (IC) CompStochProc 67 / 298

Let $T$ be an ordered set. A stochastic process is a collection of random variables $X = \{X_t; t \in T\}$ where, for each fixed $t \in T$, $X_t$ is a random variable from $(\Omega, \mathcal{F})$ to $(E, \mathcal{G})$. The measurable space $(\Omega, \mathcal{F})$ is called the sample space. The space $(E, \mathcal{G})$ is called the state space.

In this course we will take the set $T$ to be $[0, +\infty)$.

The state space $E$ will usually be $\mathbb{R}^d$ equipped with the $\sigma$-algebra of Borel sets.

A stochastic process $X$ may be viewed as a function of both $t \in T$ and $\omega \in \Omega$. We will sometimes write $X(t)$, $X(t, \omega)$ or $X_t(\omega)$ instead of $X_t$. For a fixed sample point $\omega \in \Omega$, the function $X_t(\omega) : T \to E$ is called a sample path (realization, trajectory) of the process $X$.

Pavliotis (IC) CompStochProc 68 / 298

The finite dimensional distributions (fdd) of a stochastic process are the distributions of the $E^k$-valued random variables $(X(t_1), X(t_2), \dots, X(t_k))$ for arbitrary positive integer $k$ and arbitrary times $t_i \in T$, $i \in \{1, \dots, k\}$:

\[ F(x) = P(X(t_i) \leqslant x_i, \; i = 1, \dots, k), \]

with $x = (x_1, \dots, x_k)$.

We will say that two processes $X_t$ and $Y_t$ are equivalent if they have the same finite dimensional distributions.

From experiments or numerical simulations we can only obtain information about the fdd of a process.

Pavliotis (IC) CompStochProc 69 / 298

Definition

A Gaussian process is a stochastic process for which $E = \mathbb{R}^d$ and all the finite dimensional distributions are Gaussian:

\[ F(x) = P(X(t_i) \leqslant x_i, \; i = 1, \dots, k) = (2\pi)^{-n/2} (\det K_k)^{-1/2} \exp\left[ -\frac{1}{2} \langle K_k^{-1}(x - \mu_k), x - \mu_k \rangle \right], \]

for some vector $\mu_k$ and a symmetric positive definite matrix $K_k$.

A Gaussian process $x(t)$ is characterized by its mean

\[ m(t) := \mathbb{E}x(t) \]

and the covariance function

\[ C(t, s) = \mathbb{E}\left( \left( x(t) - m(t) \right) \otimes \left( x(s) - m(s) \right) \right). \]

Thus, the first two moments of a Gaussian process are sufficient for a complete characterization of the process.

Pavliotis (IC) CompStochProc 70 / 298

Examples of Gaussian stochastic processes

Random Fourier series: let $\xi_i, \zeta_i \sim \mathcal{N}(0, 1)$, $i = 1, \dots, N$ and define

\[ X(t) = \sum_{j=1}^{N} \left( \xi_j \cos(2\pi j t) + \zeta_j \sin(2\pi j t) \right). \]

Brownian motion is a Gaussian process with $m(t) = 0$, $C(t, s) = \min(t, s)$.

Brownian bridge is a Gaussian process with $m(t) = 0$, $C(t, s) = \min(t, s) - ts$.

The Ornstein-Uhlenbeck process is a Gaussian process with $m(t) = 0$, $C(t, s) = \lambda e^{-\alpha|t - s|}$ with $\alpha, \lambda > 0$.

Pavliotis (IC) CompStochProc 71 / 298

To simulate a Gaussian stochastic process:

Fix $\Delta t$ and define $t_j = (j - 1)\Delta t$, $j = 1, \dots, N$.

Set $X_j := X(t_j)$ and define the Gaussian random vector $X^N = \{X_j^N\}_{j=1}^{N}$. Then $X^N \sim \mathcal{N}(\mu^N, \Gamma^N)$ with $\mu^N = (\mu(t_1), \dots, \mu(t_N))$ and $\Gamma^N_{ij} = C(t_i, t_j)$.

Then $X^N = \mu^N + \Lambda\, \mathcal{N}(0, I)$ with $\Gamma^N = \Lambda \Lambda^T$.

We can calculate the square root of the covariance matrix either using the Cholesky factorization or by calculating its eigenvalue decomposition (PCA).

$\Gamma$ might be a full matrix and the calculation of $\Lambda$ can be computationally expensive.

Pavliotis (IC) CompStochProc 72 / 298

function [t, x, gamma, mx, varx] = gaussiansimulation(T, dt, M)
% Simulate a mean zero Gaussian process.
% Use the eigenvalue decomposition to calculate sqrt(Gamma).
% Input: T: length of path
%        dt: timestep
%        gamma: covariance
%        M: number of paths simulated
% Calculate first and second moment.
t = 0:dt:T;
N = length(t);
for i = 1:N
    for j = 1:N
        % gamma(i,j) = covBM(t(i),t(j));
        % gamma(i,j) = covOU(t(i),t(j),1,1);
        % gamma(i,j) = covBB(t(i),t(j));
        gamma(i,j) = covfBM(t(i),t(j),0.2);
    end
end
[vv, dd] = eig(gamma); lambda = vv*sqrt(dd)*vv';
x = lambda*randn(N,M);
mx = mean(x'); varx = mean(x'.*x');
figure; plot(t, x(:,1:10), 'Linewidth', 1.0);
hold ...

Pavliotis (IC) CompStochProc 73 / 298

function rr = covBM(x,y)
% covariance function of Brownian motion
rr = min(x,y);
end

function rr = covBB(x,y)
% covariance function of the Brownian bridge
rr = min(x,y) - x*y;
end

function rr = covOU(x,y,dd,aa)
% covariance function of the Ornstein-Uhlenbeck process
% aa: inverse correlation time
% dd: diffusion coefficient, rr(0) = dd/aa
rr = (dd/aa)*exp(-aa*abs(x-y));
end

Pavliotis (IC) CompStochProc 74 / 298

function rr = covfBM(x,y,H)
% covariance function of fractional Brownian motion
% H: Hurst exponent
rr = (1/2)*(abs(x)^(2*H) + abs(y)^(2*H) - abs(x-y)^(2*H));
end

Pavliotis (IC) CompStochProc 75 / 298

Let $(\Omega, \mathcal{F}, P)$ be a probability space. Let $X_t$, $t \in T$ (with $T = \mathbb{R}$ or $\mathbb{Z}$) be a real-valued random process on this probability space with finite second moment, $\mathbb{E}|X_t|^2 < +\infty$ (i.e. $X_t \in L^2$).

Definition

A stochastic process $X_t \in L^2$ is called second-order stationary or wide-sense stationary if the first moment $\mathbb{E}X_t$ is a constant and the second moment $\mathbb{E}(X_t X_s)$ depends only on the difference $t - s$:

\[ \mathbb{E}X_t = \mu, \qquad \mathbb{E}(X_t X_s) = C(t - s). \]

Pavliotis (IC) CompStochProc 76 / 298

The constant $\mu$ is called the expectation of the process $X_t$. We will set $\mu = 0$.

The function $C(t)$ is called the covariance or the autocorrelation function of $X_t$.

Notice that $C(t) = \mathbb{E}(X_t X_0)$, whereas $C(0) = \mathbb{E}(X_t^2)$, which is finite, by assumption.

Since we have assumed that $X_t$ is a real valued process, we have that $C(t) = C(-t)$, $t \in \mathbb{R}$.

Pavliotis (IC) CompStochProc 77 / 298

The correlation function of a second order stationary process enables us to associate a time scale to $X_t$, the correlation time $\tau_{\mathrm{cor}}$:

\[ \tau_{\mathrm{cor}} = \frac{1}{C(0)} \int_0^{\infty} C(\tau)\, d\tau = \int_0^{\infty} \mathbb{E}(X_\tau X_0)/\mathbb{E}(X_0^2)\, d\tau. \]

The slower the decay of the correlation function, the larger the correlation time is. We have to assume sufficiently fast decay of correlations so that the correlation time is finite.

Pavliotis (IC) CompStochProc 78 / 298

Example

Consider the mean zero, second order stationary process with covariance function

\[ R(t) = \frac{D}{\alpha} e^{-\alpha|t|}. \tag{17} \]

The spectral density of this process is:

\[ f(x) = \frac{1}{2\pi} \frac{D}{\alpha} \int_{-\infty}^{+\infty} e^{-ixt} e^{-\alpha|t|}\, dt = \frac{1}{2\pi} \frac{D}{\alpha} \left( \int_{-\infty}^{0} e^{-ixt} e^{\alpha t}\, dt + \int_{0}^{+\infty} e^{-ixt} e^{-\alpha t}\, dt \right) = \frac{1}{2\pi} \frac{D}{\alpha} \left( \frac{1}{ix + \alpha} + \frac{1}{-ix + \alpha} \right) = \frac{D}{\pi} \frac{1}{x^2 + \alpha^2}. \]

Example (Continued)

This function is called the Cauchy or the Lorentz distribution.

The Gaussian stochastic process with covariance function (17) is called the stationary Ornstein-Uhlenbeck process.

The correlation time is (we have that $C(0) = D/\alpha$)

\[ \tau_{\mathrm{cor}} = \int_0^{\infty} e^{-\alpha t}\, dt = \alpha^{-1}. \]

Pavliotis (IC) CompStochProc 80 / 298

Second order stationary processes are ergodic: time averages equal phase space (ensemble) averages. An example of an ergodic theorem for a stationary process is the following $L^2$ (mean-square) ergodic theorem.

Theorem

Let $\{X_t\}_{t \geqslant 0}$ be a second order stationary process on a probability space $(\Omega, \mathcal{F}, P)$ with mean $\mu$ and covariance $R(t)$, and assume that $R(t) \in L^1(0, +\infty)$. Then

\[ \lim_{T \to +\infty} \mathbb{E}\left| \frac{1}{T} \int_0^T X(s)\, ds - \mu \right|^2 = 0. \tag{18} \]

Pavliotis (IC) CompStochProc 81 / 298

Proof.

We have

\[ \mathbb{E}\left| \frac{1}{T} \int_0^T X(s)\, ds - \mu \right|^2 = \frac{1}{T^2} \int_0^T \int_0^T R(t - s)\, dt\, ds = \frac{2}{T^2} \int_0^T \int_0^t R(t - s)\, ds\, dt = \frac{2}{T^2} \int_0^T (T - u) R(u)\, du \to 0, \]

using the dominated convergence theorem and the assumption $R(\cdot) \in L^1$. In the above we used the fact that $R$ is a symmetric function, together with the change of variables $u = t - s$, $v = t$ and an integration over $v$.

Pavliotis (IC) CompStochProc 82 / 298

Definition

A stochastic process is called (strictly) stationary if all finite dimensional distributions are invariant under time translation: for any integer $k$ and times $t_i \in T$, the distribution of $(X(t_1), X(t_2), \dots, X(t_k))$ is equal to that of $(X(s + t_1), X(s + t_2), \dots, X(s + t_k))$ for any $s$ such that $s + t_i \in T$ for all $i \in \{1, \dots, k\}$. In other words,

\[ P(X_{t_1 + t} \in A_1, X_{t_2 + t} \in A_2, \dots, X_{t_k + t} \in A_k) = P(X_{t_1} \in A_1, X_{t_2} \in A_2, \dots, X_{t_k} \in A_k), \quad \forall t \in T. \]

Let $X_t$ be a strictly stationary stochastic process with finite second moment (i.e. $X_t \in L^2$). The definition of strict stationarity implies that $\mathbb{E}X_t = \mu$, a constant, and $\mathbb{E}((X_t - \mu)(X_s - \mu)) = C(t - s)$. Hence, a strictly stationary process with finite second moment is also stationary in the wide sense. The converse is not true.

Pavliotis (IC) CompStochProc 83 / 298

Remarks

1 A sequence $Y_0, Y_1, \dots$ of independent, identically distributed random variables is a stationary process with $R(k) = \sigma^2 \delta_{k0}$, $\sigma^2 = \mathbb{E}(Y_k)^2$.

2 Let $Z$ be a single random variable with known distribution and set $Z_j = Z$, $j = 0, 1, 2, \dots$. Then the sequence $Z_0, Z_1, Z_2, \dots$ is a stationary sequence with $R(k) = \sigma^2$.

3 The first two moments of a Gaussian process are sufficient for a complete characterization of the process. A corollary of this is that a second order stationary Gaussian process is also a (strictly) stationary process.

Pavliotis (IC) CompStochProc 84 / 298

The most important continuous time stochastic process is Brownian motion. Brownian motion is a mean zero, continuous (i.e. it has continuous sample paths: for a.e. $\omega \in \Omega$ the function $X_t$ is a continuous function of time) process with independent Gaussian increments.

A process $X_t$ has independent increments if for every sequence $t_0 < t_1 < \dots < t_n$ the random variables

\[ X_{t_1} - X_{t_0}, \; X_{t_2} - X_{t_1}, \dots, X_{t_n} - X_{t_{n-1}} \]

are independent.

If, furthermore, for any $t_1$, $t_2$ and Borel set $B \subset \mathbb{R}$, $P(X_{t_2 + s} - X_{t_1 + s} \in B)$ is independent of $s$, then the process $X_t$ has stationary independent increments.

Pavliotis (IC) CompStochProc 85 / 298

Definition

A one dimensional standard Brownian motion $W(t) : \mathbb{R}^+ \to \mathbb{R}$ is a real valued stochastic process with the following properties:

1 $W(0) = 0$;
2 $W(t)$ is continuous;
3 $W(t)$ has independent increments;
4 for every $t > s \geqslant 0$, $W(t) - W(s)$ has a Gaussian distribution with mean 0 and variance $t - s$. That is, the density of the random variable $W(t) - W(s)$ is

\[ g(x; t, s) = \left( 2\pi(t - s) \right)^{-\frac{1}{2}} \exp\left( -\frac{x^2}{2(t - s)} \right). \tag{19} \]

Pavliotis (IC) CompStochProc 86 / 298

Definition (Continued)

A $d$-dimensional standard Brownian motion $W(t) : \mathbb{R}^+ \to \mathbb{R}^d$ is a collection of $d$ independent one dimensional Brownian motions:

\[ W(t) = (W_1(t), \dots, W_d(t)), \]

where $W_i(t)$, $i = 1, \dots, d$ are independent one dimensional Brownian motions. The density of the Gaussian random vector $W(t) - W(s)$ is thus

\[ g(x; t, s) = \left( 2\pi(t - s) \right)^{-d/2} \exp\left( -\frac{\|x\|^2}{2(t - s)} \right). \]

Brownian motion is sometimes referred to as the Wiener process.

Pavliotis (IC) CompStochProc 87 / 298

To generate Brownian paths we can use the general algorithm for simulating Gaussian processes with $\mu^N = 0$ and $\Gamma = \{\min(t_i, t_j)\}_{i,j=1}^{N}$.

It is easier to use the fact that Brownian motion has independent increments with $dW \sim \mathcal{N}(0, dt)$:

\[ W_j = W_{j-1} + dW_j, \quad j = 1, 2, \dots, N, \quad \text{where } dW_j = \sqrt{\Delta t}\, \mathcal{N}(0, 1). \]

After having generated Brownian paths we can compute statistics using Monte Carlo:

\[ \mathbb{E}f(W_t) \approx \frac{1}{N} \sum_{j=1}^{N} f(W_t^j). \]

We can also simulate processes related to Brownian motion such as the geometric Brownian motion:

\[ S_t = S_0 e^{at + \sigma W_t}. \tag{20} \]
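A minimal Matlab sketch of this increment-based construction (the horizon, timestep, number of paths and the geometric BM parameters are assumptions for the example):

T = 1; dt = 1e-3; M = 1000;
t = (0:dt:T)'; N = length(t);
dW = sqrt(dt)*randn(N-1, M);            % independent N(0, dt) increments
W = [zeros(1,M); cumsum(dW)];           % Brownian paths, W(0) = 0
S = 2*exp(repmat(0.5*t,1,M) + 0.3*W);   % geometric BM (20): S0 = 2, a = 0.5, sigma = 0.3
plot(t, W(:,1:5)); hold on;
plot(t, mean(W,2), 'k', 'Linewidth', 2);   % mean over the M paths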

Pavliotis (IC) CompStochProc 88 / 298

Figure: Brownian sample paths ($U(t)$ vs $t$; mean of 1000 paths and 5 individual paths).

Pavliotis (IC) CompStochProc 89 / 298

It is possible to prove rigorously the existence of the Wiener process (Brownian motion):

Theorem

(Wiener) There exists an almost-surely continuous process $W_t$ with independent increments and $W_0 = 0$, such that for each $t \geqslant 0$ the random variable $W_t$ is $\mathcal{N}(0, t)$. Furthermore, $W_t$ is almost surely locally Hölder continuous with exponent $\alpha$ for any $\alpha \in (0, \frac{1}{2})$.

Notice that Brownian paths are not differentiable.

Pavliotis (IC) CompStochProc 90 / 298

Brownian motion is a Gaussian process. For the $d$-dimensional Brownian motion, and for $I$ the $d \times d$ dimensional identity, we have (see (7) and (8))

\[ \mathbb{E}W(t) = 0 \quad \forall t \geqslant 0 \]

and

\[ \mathbb{E}\left( (W(t) - W(s)) \otimes (W(t) - W(s)) \right) = (t - s)I. \tag{21} \]

Moreover,

\[ \mathbb{E}\left( W(t) \otimes W(s) \right) = \min(t, s)I. \tag{22} \]


From the formula for the Gaussian density g(x, t − s), eqn. (19), we immediately conclude that W(t) − W(s) and W(t + u) − W(s + u) have the same pdf. Consequently, Brownian motion has stationary increments.

Notice, however, that Brownian motion itself is not a stationary process.

Since W(t) = W(t) − W(0), the pdf of W(t) is

g(x, t) = (1/√(2πt)) e^{−x²/2t}.

We can easily calculate all moments of the Brownian motion:

E(xⁿ(t)) = (1/√(2πt)) ∫_{−∞}^{+∞} xⁿ e^{−x²/2t} dx
         = 1 · 3 · · · (n − 1) t^{n/2}, n even,
         = 0, n odd.


We can define the OU process through Brownian motion via a time change.

Lemma
Let W(t) be a standard Brownian motion and consider the process

V(t) = e^{−t} W(e^{2t}).

Then V(t) is a Gaussian second order stationary process with mean 0 and covariance

K(s, t) = e^{−|t−s|}.
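A quick numerical check of the lemma (a sketch; the grid, horizon and path count are arbitrary choices): sample W at the deterministic times e^{2t} using independent Gaussian increments, then rescale.

function ou_timechange
% sample V(t) = exp(-t) W(exp(2t)) and check E V(t)^2 = K(t,t) = 1
t = 0:0.01:2; M = 5000;
s = exp(2*t);                                 % times at which W is needed
incr = [sqrt(s(1))*randn(M,1), ...
        sqrt(diff(s)).*randn(M,numel(s)-1)];  % independent increments
W = cumsum(incr,2);                           % W(e^{2t}) along each path
V = exp(-t).*W;                               % V(t) = e^{-t} W(e^{2t})
figure; plot(t, mean(V.^2));                  % should stay close to 1
end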


Definition
A (normalized) fractional Brownian motion W^H_t, t ≥ 0 with Hurst parameter H ∈ (0, 1) is a centered Gaussian process with continuous sample paths whose covariance is given by

E(W^H_t W^H_s) = (1/2)( s^{2H} + t^{2H} − |t − s|^{2H} ).    (23)

Fractional Brownian motion has the following properties (a simulation sketch based on (23) follows this list).

1. When H = 1/2, W^{1/2}_t becomes the standard Brownian motion.
2. W^H_0 = 0, EW^H_t = 0, E(W^H_t)² = |t|^{2H}, t ≥ 0.
3. It has stationary increments: E(W^H_t − W^H_s)² = |t − s|^{2H}.
4. It has the following self-similarity property:

   (W^H_{αt}, t ≥ 0) = (α^H W^H_t, t ≥ 0), α > 0,

   where the equivalence is in law.
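Since the covariance (23) is explicit, one direct (if O(N³)) way to simulate fractional Brownian motion on a grid is to factor the covariance matrix; a minimal sketch (the grid and the jitter size are ad hoc choices; for long paths circulant-embedding/FFT methods are preferable):

function X = fbm_chol(H,N)
% sample W^H at t = (1:N)/N via Cholesky factorisation of (23)
t = (1:N)'/N;
C = 0.5*( t.^(2*H) + (t').^(2*H) - abs(t - t').^(2*H) );  % E(W^H_t W^H_s)
L = chol(C + 1e-12*eye(N), 'lower');   % small jitter for numerical stability
X = L*randn(N,1);                      % one sample path
end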


A very useful result is that we can expand every centered (EXt = 0) stochastic process with continuous covariance (and hence, every L²-continuous centered stochastic process) into a random Fourier series.

Let (Ω, F, P) be a probability space and {Xt}_{t∈T} a centered process which is mean square continuous. Then the Karhunen-Loève theorem states that Xt can be expanded in the form

Xt = ∑_{n=1}^∞ ξn φn(t),    (24)

where the ξn are an orthogonal sequence of random variables with E|ξk|² = λk, and {λk, φk}_{k=1}^∞ are the eigenvalues and eigenfunctions of the integral operator whose kernel is the covariance of the process Xt. The convergence is in L²(P) for every t ∈ T.

If Xt is a Gaussian process, then the ξk are independent Gaussian random variables.


Set T = [0, 1]. Let K : L²(T) → L²(T) be defined by

Kψ(t) = ∫_0^1 K(s, t)ψ(s) ds.

The kernel of this integral operator is a continuous (in both s and t), symmetric, nonnegative function. Hence, the corresponding integral operator has eigenvalues and eigenfunctions

Kφn = λn φn, λn ≥ 0, (φn, φm)_{L²} = δnm,

such that the covariance operator can be expanded in a uniformly convergent series

B(t, s) = ∑_{n=1}^∞ λn φn(t)φn(s).


The random variables ξn are defined as

ξn = ∫_0^1 X(t)φn(t) dt.

The orthogonality of the eigenfunctions of the covariance operator implies that

E(ξn ξm) = λn δnm.

Thus the random variables are orthogonal. When X(t) is Gaussian, the ξk are Gaussian random variables. Furthermore, since they are also orthogonal, they are independent Gaussian random variables.


Example

The Karhunen-Loève expansion for Brownian motion. We set T = [0, 1]. The covariance function of Brownian motion is C(t, s) = min(t, s). The eigenvalue problem Cψn = λnψn becomes

∫_0^1 min(t, s)ψn(s) ds = λn ψn(t).

Or,

∫_0^t s ψn(s) ds + t ∫_t^1 ψn(s) ds = λn ψn(t).

We differentiate this equation twice:

∫_t^1 ψn(s) ds = λn ψ′n(t) and −ψn(t) = λn ψ″n(t),

where primes denote differentiation with respect to t.


Example
From the above equations we immediately see that the right boundary conditions are ψ(0) = ψ′(1) = 0. The eigenvalues and eigenfunctions are

ψn(t) = √2 sin( (1/2)(2n − 1)πt ),  λn = ( 2/((2n − 1)π) )².

Thus, the Karhunen-Loève expansion of Brownian motion on [0, 1] is

Wt = √2 ∑_{n=1}^∞ ξn sin( (n − 1/2)πt ) / ( (n − 1/2)π ).    (25)


Example: KL expansion for Brownian motion

function [t,bb] = KLBmotion(dt)
% generate paths of Brownian motion on [0,1]
% using the KL expansion (25), truncated at N terms
t = 0:dt:1;
N = length(t);
bb = zeros(1,N);
for i = 1:N
    xi = sqrt(2)*randn/((i-0.5)*pi);
    bb = bb + xi*sin((i-0.5)*pi*t);
end
figure; plot(t,bb,'Linewidth',2);
end


Example
The Karhunen-Loève expansion of the Brownian bridge Bt = Wt − tW1 on [0, 1] is

Bt = ∑_{n=1}^∞ ξn √2 sin(nπt) / (nπ).    (26)


Example: KL expansion for Brownian bridge

function [t,bb] = KLBBridge(dt)
% generate paths of the Brownian bridge on [0,1]
% using the KL expansion (26), truncated at N terms
t = 0:dt:1;
N = length(t);
bb = zeros(1,N);
for i = 1:N
    xi = sqrt(2)*randn/(i*pi);
    bb = bb + xi*sin(i*pi*t);
end
figure; plot(t,bb,'Linewidth',2);
end


STOCHASTIC DIFFERENTIAL EQUATIONS (SDEs)


In this part of the course we will study stochastic differential equations (SDEs): ODEs driven by Gaussian white noise.

Let W(t) denote a standard m-dimensional Brownian motion, h : Z → R^d a smooth vector-valued function and γ : Z → R^{d×m} a smooth matrix-valued function (in this course we will take Z = T^d, R^d or R^l ⊕ T^{d−l}).

Consider the SDE

dz/dt = h(z) + γ(z) dW/dt, z(0) = z0.    (27)

We think of the term dW/dt as representing Gaussian white noise: a mean-zero Gaussian process with correlation δ(t − s)I.

The function h in (27) is sometimes referred to as the drift and γ as the diffusion coefficient.


Such a process exists only as a distribution. The precise interpretation of (27) is as an integral equation for z(t) ∈ C(R+, Z):

z(t) = z0 + ∫_0^t h(z(s)) ds + ∫_0^t γ(z(s)) dW(s).    (28)

In order to make sense of this equation we need to define the stochastic integral against W(s).


The Itô Stochastic Integral

For the rigorous analysis of stochastic differential equations it is necessary to define stochastic integrals of the form

I(t) = ∫_0^t f(s) dW(s),    (29)

where W(t) is a standard one-dimensional Brownian motion. This is not straightforward because W(t) does not have bounded variation.

In order to define the stochastic integral we assume that f(t) is a random process, adapted to the filtration Ft generated by the process W(t), and such that

E( ∫_0^T f(s)² ds ) < ∞.


The Itô stochastic integral I(t) is defined as the L²-limit of the Riemann sum approximation of (29):

I(t) := lim_{K→∞} ∑_{k=1}^{K−1} f(t_{k−1}) ( W(t_k) − W(t_{k−1}) ),    (30)

where t_k = k∆t and K∆t = t.

Notice that the function f(t) is evaluated at the left end of each interval [t_{k−1}, t_k] in (30).

The resulting Itô stochastic integral I(t) is a.s. continuous in t.

These ideas are readily generalized to the case where W(s) is a standard d-dimensional Brownian motion and f(s) ∈ R^{m×d} for each s.


The resulting integral satisfies the Itô isometry

E|I(t)|² = ∫_0^t E|f(s)|²_F ds,    (31)

where | · |_F denotes the Frobenius norm |A|_F = √(tr(AᵀA)).

The Itô stochastic integral is a martingale:

EI(t) = 0

and

E[I(t) | Fs] = I(s) ∀ t ≥ s,

where Fs denotes the filtration generated by W(s).


Example
Consider the Itô stochastic integral

I(t) = ∫_0^t f(s) dW(s),

where f, W are scalar-valued. This is a martingale with quadratic variation

⟨I⟩_t = ∫_0^t (f(s))² ds.

More generally, for f, W in arbitrary finite dimensions, the integral I(t) is a martingale with quadratic variation

⟨I⟩_t = ∫_0^t ( f(s) ⊗ f(s) ) ds.


The Stratonovich Stochastic Integral

In addition to the Itô stochastic integral, we can also define the Stratonovich stochastic integral. It is defined as the L²-limit of a different Riemann sum approximation of (29), namely

I_strat(t) := lim_{K→∞} ∑_{k=1}^{K−1} (1/2)( f(t_{k−1}) + f(t_k) ) ( W(t_k) − W(t_{k−1}) ),    (32)

where t_k = k∆t and K∆t = t. Notice that the function f(t) is evaluated at both endpoints of each interval [t_{k−1}, t_k] in (32).

The multidimensional Stratonovich integral is defined in a similar way. The resulting integral is written as

I_strat(t) = ∫_0^t f(s) ∘ dW(s).


The limit in (32) gives rise to an integral which differs from the Itô integral.

The situation is more complex than that arising in the standard theory of Riemann integration for functions of bounded variation: in that case the points in [t_{k−1}, t_k] where the integrand is evaluated do not affect the definition of the integral, via a limiting process.

In the case of integration against Brownian motion, which does not have bounded variation, the limits differ.

When f and W are correlated through an SDE, a formula exists to convert between the two integrals.


The Itô and Stratonovich stochastic integrals coincide provided that the integrand is sufficiently regular.

Suppose that there exist C < ∞, ε > 0 such that

E|f(t, ω) − f(s, ω)|² ≤ C|t − s|^{1+ε},

for all s, t ∈ [0, T]. Then

∫_0^T f(t, ω) dWt = ∫_0^T f(t, ω) ∘ dWt.

This is more regularity than the solution of an SDE possesses.

On the other hand:

∫_0^T Wt dWt = (1/2)W²_T − (1/2)T,   ∫_0^T Wt ∘ dWt = (1/2)W²_T.


Itô and Stratonovich stochastic integrals (Higham 2001)

function [itoerr, straterr] = stint(T,N)
%
% Ito and Stratonovich integrals of W dW
%
dt = T/N;

dW = sqrt(dt)*randn(1,N);   % increments
W = cumsum(dW);             % cumulative sum

ito = sum([0,W(1:end-1)].*dW)
strat = sum((0.5*([0,W(1:end-1)]+W) + 0.5*sqrt(dt)*randn(1,N)).*dW)

itoerr = abs(ito - 0.5*(W(end)^2-T))
straterr = abs(strat - 0.5*W(end)^2)
end


We can associate a second order differential operator with an SDE, the generator of the process.

Consider the Itô SDE

dXt = b(Xt) dt + σ(Xt) dW(t).    (33)

We define

Σ(x) = σ(x)σ(x)ᵀ.    (34)

The generator of Xt is

L = b · ∇ + (1/2) ∑_{i,j=1}^d Σij(x) ∂²/∂xi∂xj.    (35)


Using the generator we can write Itô's formula, which enables us to calculate the rate of change in time of functions V(t, Xt).

In the absence of noise the rate of change of V can be written as

d/dt V(t, Xt) = ∂V/∂t (t, Xt) + AV(t, Xt),    (36)

where Xt is the solution of the ODE Ẋt = b(Xt) and A is

A = b(x) · ∇.    (37)

For the SDE the chain rule (36) has to be modified by the addition of a term due to noise:

d/dt V(t, Xt) = ∂V/∂t (t, Xt) + LV(t, Xt) + ⟨ ∇V(t, Xt), σ(Xt) dW/dt (t) ⟩.    (38)


The additional term in LV is proportional to Σ and it arises from the lack of smoothness of Brownian motion.

The precise interpretation of the expression for the rate of change of V is in integrated form:

V(t, Xt) = V(0, X0) + ∫_0^t ∂V/∂s (s, Xs) ds + ∫_0^t LV(s, Xs) ds + ∫_0^t ⟨ ∇V(s, Xs), σ(Xs) dW(s) ⟩.    (39)

The Brownian differential scales like the square root of the differential in time: E(dWt)² = dt. We can write Itô's formula in the form

dV(t, x(t)) = ∂V/∂t dt + ∑_{i=1}^d ∂V/∂xi dxi + (1/2) ∑_{i,j=1}^d ∂²V/∂xi∂xj dxi dxj.    (40)

We are using the convention dWi(t) dWj(t) = δij dt, dWi(t) dt = 0, i, j = 1, . . . , d. Thus, we can think of (40) as a generalization of Leibniz' rule (38) where second order differentials are kept.


Taking the expectation of (40) and using the fact that the stochastic integral is a martingale we obtain (we assume that V is independent of time and that the initial conditions are deterministic)

EV(Xt) = V(X0) + E ∫_0^t LV(Xs) ds.

We can use Itô's formula to calculate the expectation of functionals of the solution of an SDE.

For a Stratonovich SDE the rules of standard calculus apply; let Xt denote the solution of the Stratonovich SDE

dXt = b(Xt) dt + σ(Xt) ∘ dWt,    (41)

where b : R^d → R^d and σ : R^d → R^{d×d}. Then the generator of Xt is

L· = b · ∇· + (1/2) σ : ∇(σᵀ∇·).    (42)

Furthermore, the Newton-Leibniz chain rule applies: let h ∈ C²(R^d) and define Yt = h(Xt). Then

dYt = ∇h(Xt) · ( b(Xt) dt + σ(Xt) ∘ dWt ).    (43)


Example (Geometric Brownian motion)
Consider the linear Itô SDE with multiplicative noise

dXt = λXt dt + σXt dWt, X0 = x.    (44)

The generator of Xt is

L = λx ∂/∂x + (σ²x²/2) ∂²/∂x².

The solution of (44) is

X(t) = x exp( (λ − σ²/2)t + σW(t) ).    (45)


Example (Geometric Brownian motion contd.)
To derive this formula, we apply Itô's formula to the function f(x) = log(x):

d log(Xt) = L( log(Xt) ) dt + σXt ∂x log(Xt) dWt
          = ( λXt (1/Xt) + (σ²X²t/2)(−1/X²t) ) dt + σ dWt
          = ( λ − σ²/2 ) dt + σ dWt.

Consequently:

log( Xt/X0 ) = ( λ − σ²/2 )t + σW(t),

from which (45) follows.


Existence and Uniqueness of solutions for SDEs

Definition

By a solution of (27) we mean an R^d-valued stochastic process Xt on t ∈ [0, T] with the properties:

1. Xt is continuous and Ft-adapted, where the filtration is generated by the Brownian motion W(t);
2. h(Xt) ∈ L¹((0, T)), γ(Xt) ∈ L²((0, T));
3. equation (27) holds for every t ∈ [0, T] with probability 1.

The solution is called unique if any two solutions Xi(t), i = 1, 2 satisfy

P( X1(t) = X2(t), ∀ t ∈ [0, T] ) = 1.


It is well known that existence and uniqueness of solutions for ODEs (i.e. when γ ≡ 0 in (27)) holds for globally Lipschitz vector fields h(x).

A very similar theorem holds when γ ≠ 0.

As for ODEs, the conditions can be weakened when a priori bounds on the solution can be found.


Theorem

Assume that both h(·) and γ(·) are globally Lipschitz on R^d and that X0 is a random variable independent of the Brownian motion W(t) with

E|X0|² < ∞.

Then the SDE (27) has a unique solution z(t) ∈ C(R+; R^d) with

E[ ∫_0^T |z(t)|² dt ] < ∞ ∀ T < ∞.

Furthermore, the solution of the SDE is a Markov process.


Remarks
The Stratonovich analogue of (27) is

dX/dt = h(X) + γ(X) ∘ dW/dt, X(0) = X0.    (46)

By this we mean that X ∈ C(R+, R^d) satisfies the integral equation

X(t) = X(0) + ∫_0^t h(X(s)) ds + ∫_0^t γ(X(s)) ∘ dW(s).    (47)

By using definitions (30) and (32) it can be shown that X satisfying the Stratonovich SDE (46) also satisfies the Itô SDE

dX/dt = h(X) + (1/2)∇ · ( γ(X)γ(X)ᵀ ) − (1/2)γ(X)∇ · ( γ(X)ᵀ ) + γ(X) dW/dt,    (48a)
X(0) = X0,    (48b)

provided that γ(X) is differentiable.


White noise is, in most applications, an idealization of a stationary random process with short correlation time. In this context the Stratonovich interpretation of an SDE is particularly important because it often arises as the limit obtained by using smooth approximations to white noise.

On the other hand, the martingale machinery which comes with the Itô integral makes it more important as a mathematical object.

It is very useful that we can convert between the Itô and the Stratonovich interpretations of the stochastic integral.

There are other interpretations of the stochastic integral, e.g. the Klimontovich stochastic integral.


Examples of SDEs

The Ornstein-Uhlenbeck process (α, σ > 0):

dXt = −αXt dt + σ dWt.    (49)

The solution is

Xt = e^{−αt}X0 + σ ∫_0^t e^{−α(t−s)} dWs.

Geometric Brownian motion:

dXt = rXt dt + σXt dWt.    (50)

The solution is

Xt = X0 e^{σWt + (r − σ²/2)t}.

Note that the solution is different for the Itô and the Stratonovich SDEs.

The Cox-Ingersoll-Ross SDE (α, b > 0):

dXt = α(b − Xt) dt + σ√Xt dWt.    (51)


Examples of SDEs (contd.)

Stochastic Verhulst equation (population dynamics):

dXt = (λXt − X²t) dt + σXt dWt.    (52)

Lotka-Volterra SDEs:

dXi(t) = Xi(t) ( ai + ∑_{j=1}^d bij Xj(t) ) dt + σi Xi(t) dWi(t).    (53)

Protein kinetics:

dXt = ( α − Xt + λXt(1 − Xt) ) dt + σXt(1 − Xt) dWt.    (54)


Examples of SDEs (contd.)

Tracer particle (turbulent diffusion):

dXt = u(Xt, t) dt + σ dWt, ∇ · u(x, t) = 0.    (55)

Josephson junction (pendulum with friction and noise):

φ̈t = −sin(φt) − γφ̇t + √(2γβ⁻¹) Ẇt.    (56)

Noisy Duffing oscillator (stochastic resonance):

Ẍt = −βXt − αX³t − γẊt + A cos(ωt) + σẆt.    (57)

Stochastic heat equation:

∂tu = ∂²x u + ∂tW(x, t),    (58)

on [0, 1] with Dirichlet boundary conditions and W(x, t) denoting an infinite dimensional Brownian motion.


Consider the SDE (80). The generator L and its L²-adjoint are

L = b(x)∂x + (1/2)σ²(x)∂²x,

and

L*· = ∂x( −b(x) · + (1/2)∂x( σ²(x) · ) ).

We can obtain evolution equations for the expectation of functionals of the solution of the SDE and for the transition probability density.


Theorem

(Kolmogorov) Let f(x) ∈ Cb(R) and let

u(x, s) := E( f(Xt) | Xs = x ) = ∫ f(y) p(y, t|x) dy ∈ C²b(R).

Assume furthermore that the functions b(x), Σ(x) = σ²(x) are continuous in x. Then u(x, s) ∈ C^{2,1}(R × R+) and it solves the final value problem

−∂u/∂s = b(x) ∂u/∂x + (1/2)Σ(x) ∂²u/∂x², lim_{s→t} u(s, x) = f(x).    (59)


Theorem

(Kolmogorov) Assume that p(y, t|·, ·), b(y), Σ(y) ∈ C²(R × R+). Then the transition probability density of Xt satisfies the equation

∂p/∂t = −∂/∂y ( b(y)p ) + (1/2) ∂²/∂y² ( Σ(y)p ), p(y, 0|x) = δ(x − y).    (60)


Assume that the initial distribution of Xt is ρ0(x). Define

p(y, t) := ∫ p(y, t|x) ρ0(x) dx.

We multiply the forward Kolmogorov equation (60) by ρ0(x) and integrate with respect to x to obtain the equation

∂p(y, t)/∂t = −∂/∂y ( b(y)p(y, t) ) + (1/2) ∂²/∂y² ( Σ(y)p(y, t) ),    (61)

together with the initial condition

p(y, 0) = ρ0(y).    (62)


Numerical Solution of SDEs

We consider the scalar Itô SDE

dXt = b(Xt) dt + σ(Xt) dWt, X0 = x,    (63)

posed on the interval [0, T]. The initial condition can be either deterministic or random.

Our goal is to obtain an approximate solution to this equation in order to compute quantities of the form Ef(Xt), where E denotes the expectation with respect to the law of the process Xt.

The simplest numerical method is the Euler-Maruyama method, the analogue of the explicit Euler method for ODEs.


We partition the interval [0, T] into N equal subintervals of size ∆t with N∆t = T.

We set

Xj := X(j∆t), j = 0, 1, . . . , N.

The Euler-Maruyama method for the SDE (63) is

Xj = Xj−1 + b(Xj−1)∆t + σ(Xj−1)∆Wj, j = 1, . . . , N,    (64)

where ∆Wj = Wj − Wj−1 denotes the jth Brownian increment.

We can write ∆Wj = ξj√∆t with ξj ∼ N(0, 1) i.i.d.

The Euler-Maruyama scheme can then be written as

Xj = Xj−1 + b(Xj−1)∆t + σ(Xj−1)√∆t ξj, j = 1, . . . , N.    (65)


Figure: Five sample paths of the Ornstein-Uhlenbeck process (Xt against t).


We solve the Ornstein-Uhlenbeck SDE in one dimension using the EM scheme:

dXt = −αXt dt + √(2σ) dWt.    (66)

We solve (66) for α = 1, σ = 1/2 for t ∈ [0, 10] with ∆t = 0.0098 and initial conditions X0 ∼ 2U(0, 1), where U(0, 1) denotes the uniform distribution on the interval (0, 1).

In Figure 7 we present five sample paths of the OU process.

In Figure 8 we plot the first two moments of the Euler-Maruyama approximation of the OU process. We compare against the theoretical solution. (A sketch of the code follows.)
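A minimal sketch of this experiment. The exact moment formulas used for comparison follow from the OU solution and are stated here as an aside, not taken from the slides: for (66), EXt = e^{−αt}EX0 and EX²t = e^{−2αt}EX²0 + (σ/α)(1 − e^{−2αt}).

function em_ou
% Euler-Maruyama for dX = -alpha X dt + sqrt(2 sigma) dW, eqn (66)
alpha = 1; sigma = 0.5; T = 10; dt = 0.0098;
N = floor(T/dt); M = 1000;
X = zeros(M,N+1);
X(:,1) = 2*rand(M,1);                  % X0 ~ 2 U(0,1)
for j = 1:N
    X(:,j+1) = X(:,j) - alpha*X(:,j)*dt ...
             + sqrt(2*sigma)*sqrt(dt)*randn(M,1);
end
t = 0:dt:N*dt;
figure; plot(t, mean(X), t, mean(X.^2));   % compare with the exact moments
end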


Figure: First and second moments of the Ornstein-Uhlenbeck process using the Euler-Maruyama method (a. EXt, b. EX²t, each compared against the exact formula).


We use the analytical solution to investigate the error of the Euler-Maruyama scheme as a function of ∆t.

We generate 1,000 paths using the EM scheme as well as the exact solution, using the same Brownian paths.

Since the noise in the OU SDE is additive, we expect that the strong order of convergence is 1.

This is verified by our numerical experiments, see Figure 9.


Figure: Strong order of convergence for the Euler-Maruyama method for the Ornstein-Uhlenbeck process (log-log plot of E|X(T) − X_em(T)| against ∆t). The linear equation with slope 1 is plotted for comparison.


Consider the motion of a Brownian particle in a bistable potential:

dXt = −V′(Xt) dt + √(2β⁻¹) dWt,    (67)

with

V(x) = x⁴/4 − x²/2.    (68)

The deterministic dynamical system has two stable equilibria.

For weak noise strengths the solution of the SDE spends most of its time oscillating around the minima of the potential.

In Figure 10 we present a sample path of the SDE with β = 10, obtained using the EM algorithm.


Figure: Sample path of the SDE (67) (Xt against t).


Figure: Second moment of the bistable SDE (67) (E(X(t)²) against t).


Another standard numerical method for solving stochastic differential equations is Milstein's method:

Xj = Xj−1 + b(Xj−1)∆t + σ(Xj−1)ξj√∆t + (1/2)σ(Xj−1)σ′(Xj−1)∆t( ξ²j − 1 ).    (69)

Notice that there is an additional term in comparison to the EM scheme.

The Milstein scheme converges strongly with order 1.
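For the geometric Brownian motion (50) one has σ(x) = σx, so the correction in (69) is (1/2)σ²Xj−1∆t(ξ²j − 1); a sketch comparing the strong errors of EM and Milstein against the exact solution (the parameters and sample sizes are illustrative choices):

function milstein_gbm
% strong error at T of EM vs Milstein for dX = r X dt + sigma X dW
r = 1; sigma = 0.5; T = 1; X0 = 1;
dt = 1e-3; N = round(T/dt); M = 1000;
dW = sqrt(dt)*randn(M,N);
Xem = X0*ones(M,1); Xmil = Xem;
for j = 1:N
    Xem  = Xem  + r*Xem*dt  + sigma*Xem.*dW(:,j);
    Xmil = Xmil + r*Xmil*dt + sigma*Xmil.*dW(:,j) ...
         + 0.5*sigma^2*Xmil.*(dW(:,j).^2 - dt);     % Milstein correction
end
W = sum(dW,2);                                      % W_T along each path
Xex = X0*exp((r - 0.5*sigma^2)*T + sigma*W);        % exact solution of (50)
fprintf('EM: %.3e   Milstein: %.3e\n', ...
        mean(abs(Xex - Xem)), mean(abs(Xex - Xmil)));
end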


The drift term is discretized in the same way in the Euler and Milstein schemes.

On the other hand, the Milstein scheme has an additional term, which is related to the approximation of the stochastic term in (63).

We can derive the Milstein scheme as follows. First, we write an increment of the solution to the SDE in the form

Xj+1 = Xj + ∫_{j∆t}^{(j+1)∆t} b(Xs) ds + ∫_{j∆t}^{(j+1)∆t} σ(Xs) dWs.    (70)

We apply Itô's formula to the drift and diffusion coefficients to obtain

b(Xs) = b(Xj) + ∫_{j∆t}^s (Lb)(Xℓ) dℓ + ∫_{j∆t}^s (b′σ)(Xℓ) dWℓ

and

σ(Xs) = σ(Xj) + ∫_{j∆t}^s (Lσ)(Xℓ) dℓ + ∫_{j∆t}^s (σ′σ)(Xℓ) dWℓ,

for s ∈ [j∆t, (j + 1)∆t], where L denotes the generator of the process Xt.

for s ∈ [j∆t, (j + 1)∆t], where L denotes the generator of theprocess Xt.


We substitute these formulas into (70) to obtain, with ξj ∼ N(0, 1),

Xj+1 = Xj + b(Xj)∆t + σ(Xj)∆Wj
       + ∫_{j∆t}^{(j+1)∆t} ∫_{j∆t}^s (Lb)(Xℓ) dℓ ds + ∫_{j∆t}^{(j+1)∆t} ∫_{j∆t}^s (b′σ)(Xℓ) dWℓ ds
       + ∫_{j∆t}^{(j+1)∆t} ∫_{j∆t}^s (Lσ)(Xℓ) dℓ dWs + ∫_{j∆t}^{(j+1)∆t} ∫_{j∆t}^s (σ′σ)(Xℓ) dWℓ dWs

     = Xj + b(Xj)∆t + σ(Xj)∆Wj + (σ′σ)(Xj) ∫_{j∆t}^{(j+1)∆t} ∫_{j∆t}^s dWℓ dWs + O(∆t^{3/2})

     ≈ Xj + b(Xj)∆t + σ(Xj)∆Wj + (1/2)(σ′σ)(Xj)( ∆W²j − ∆t )

     = Xj + b(Xj)∆t + σ(Xj)∆Wj + (1/2)∆t(σ′σ)(Xj)( ξ²j − 1 ).

In the above, we have used the fact that (∆t)^α (∆Wj)^β = O( (∆t)^{α+β/2} ) and that, in one dimension,

∫_{j∆t}^{(j+1)∆t} ∫_{j∆t}^s dWℓ dWs = (1/2)( ∆W²j − ∆t ).    (71)


We will say that a numerical scheme has strong order of convergence α if there exists a positive constant C such that

E|Xj − X(j∆t)| ≤ C∆t^α,    (72)

for any j = 1, . . . , N and for ∆t sufficiently small.

We usually evaluate (72) at t = T.

We will show that the Euler-Maruyama method has strong order of convergence 1/2.

The Milstein scheme has strong order 1.

The strong order of convergence for the Euler-Maruyama method becomes 1 when the noise is additive.

There are applications such as filtering and statistical inference where strong convergence of the numerical solution of the SDE is needed.


Quite often we are interested in the calculation of statistical quantities of solutions to SDEs, such as moments, rather than in pathwise properties.

For such a purpose, the strong convergence (72) is more than what we actually need.

We will say that a numerical method has weak order of convergence β provided that there exists a positive constant C such that for all f ∈ C^ℓ_P(R) (the class of ℓ times continuously differentiable functions which, together with their derivatives up to and including order ℓ, have at most polynomial growth)

|Ef(Xj) − Ef(X(j∆t))| ≤ C∆t^β,    (73)

for any j = 1, . . . , N and for ∆t sufficiently small.

Both the Euler-Maruyama and the Milstein schemes have weak order of convergence 1.

The Milstein scheme requires more function evaluations.


For the calculation of the expectation in (72) and (73) we need to use a Monte Carlo method.

We generate N sample paths of the exact solution of the SDE and of the numerical discretization, driven by the same noise:

ε = (1/N) ∑_{k=1}^N |X^k_T − X̄^k_T|.

ε is a random variable for which we can prove a central limit theorem.

In addition to the discretization error we also have the Monte Carlo error.

We can perform similar numerical experiments in order to study the weak order of convergence. In this case the exact and the approximate solutions do not need to be driven by the same noise.


To estimate the strong order of convergence of a numerical scheme we plot the error as a function of the step size in log-log scale and use least squares fitting (a sketch of this fit is given below):

log(err) = log C + γ log ∆t.

The weak convergence error consists of two parts, the discretization error and the statistical error due to the MC approximation.

We estimate (X̄_T denotes the solution of the numerical scheme):

ε = | Ef(X_T) − (1/M) ∑_{m=1}^M f(X̄^m_{N∆t}) |
  ≤ | Ef(X_T) − Ef(X̄_T) | + | Ef(X̄_T) − (1/M) ∑_{m=1}^M f(X̄^m_{N∆t}) |
  ≤ CN^{−γ} + CM^{−1/2}.

Fixed computational resources: optimize over N, M.
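In MATLAB the least squares fit above is a one-liner; a sketch, assuming vectors Dt of step sizes and err of the corresponding measured strong errors from an experiment such as the one just described (both names are placeholders):

% fit log(err) = log C + gamma log(Dt) by least squares
p = polyfit(log(Dt), log(err), 1);
gamma = p(1);                  % fitted slope = estimated order
C = exp(p(2));                 % fitted constant
figure; loglog(Dt, err, 'o-', Dt, C*Dt.^gamma, '--');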


The EM or Milstein schemes provide us with the solution of the SDE only at the discretization times. The values at intermediate instants can be estimated using interpolation:

Constant interpolation:

X̄(t) = X_{j_t}, j_t = max{ j = 0, 1, 2, . . . , N : j∆t ≤ t };

linear interpolation:

X̄(t) = X_{j_t} + ( (t − τ_{j_t}) / (τ_{j_t+1} − τ_{j_t}) ) ( X_{j_t+1} − X_{j_t} ),

where τ_j = j∆t and where we have used X̄(t) to denote the interpolated numerical solution of the SDE.


Consider an approximate Brownian motion W^h(t) constructed as a Gaussian random walk at t_n = nh, h = T/N, T = 1, and by linear interpolation in between. Then

E ∫_0^1 |W^h(t) − W(t)| dt = c1 / N^{1/2},

where c1 = √(π/32). Furthermore,

lim_{N→+∞} √(N / log N) E sup_{0≤t≤1} |W^h(t) − W(t)| = c2,

for some constant c2 ∈ (0, +∞).

We can obtain similar results for diffusion processes.

It is possible to simulate a diffusion process exactly: Exact simulation of diffusions, A. Beskos, G.O. Roberts, The Annals of Applied Probability, 2005.


Consider an SDE in arbitrary dimensions:

dXt = b(Xt) dt + σ(Xt) dWt,    (74)

where Xt : [0, T] → R^d, b ∈ C^∞(R^d), σ ∈ C^∞(R^{d×d}), and Wt is a standard d-dimensional Brownian motion.

The EM scheme is (we use the notation b^i_n = b^i(X_n) for the ith component of the vector b, etc.)

X^i_{n+1} = X^i_n + b^i_n ∆t + ∑_{j=1}^d σ^{ij}_n ∆W^j_n.    (75)

It is straightforward to implement this scheme; a generic sketch follows.
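A vectorised sketch of (75), assuming function handles b (returning a d-vector) and sigma (returning a d×d matrix); the function name and calling convention are illustrative:

function X = em_multi(b, sigma, X0, T, N)
% Euler-Maruyama for dX = b(X) dt + sigma(X) dW in R^d, eqn (75)
d  = numel(X0);
dt = T/N;
X  = zeros(d, N+1);
X(:,1) = X0(:);
for n = 1:N
    dWn = sqrt(dt)*randn(d,1);                       % Brownian increment
    X(:,n+1) = X(:,n) + b(X(:,n))*dt + sigma(X(:,n))*dWn;
end
end

For instance, em_multi(@(x) -x, @(x) eye(2), [1;0], 10, 1e4) integrates a two-dimensional OU process.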


For the Milstein scheme we have:

X^i_{n+1} = X^i_n + b^i_n ∆t + ∑_{j=1}^d σ^{ij}_n ∆W^j_n + ∑_{j1,j2=1}^d ∑_{ℓ=1}^d σ^{ℓj1}_n (∂/∂xℓ) σ^{ij2}_n I_{j1j2},    (76)

where I_{j1j2} denotes a double Itô integral:

I_{j1j2} = ∫_{t_n}^{t_{n+1}} ∫_{t_n}^{s1} dW^{j1}_{s2} dW^{j2}_{s1}.    (77)

The diagonal terms I_{jj} are given by the one-dimensional formula (71). The off-diagonal elements might be difficult/expensive to simulate.

We can obtain higher order schemes using the stochastic Taylor expansion.

Higher order schemes require the evaluation of a large number of derivatives of the drift and diffusion coefficients.


We can exploit the structure of the SDE to simplify the calculation of the Milstein correction to the EM scheme.

Consider a nonlinear oscillator with friction and noise (example: the noisy Duffing-Van der Pol oscillator):

Ẍt = b(Xt) − γ(Xt)Ẋt + ∑_{j=1}^d σj(Xt) Ẇ^j.    (78)

Write it as a first order system:

dX¹t = X²t dt,
dX²t = ( b(X¹t) − γ(X¹t)X²t ) dt + ∑_{j=1}^d σj(X¹t) dW^j.

The Milstein discretization for this SDE is

Y¹_{n+1} = Y¹_n + Y²_n ∆t,
Y²_{n+1} = Y²_n + ( b(Y¹_n) − γ(Y¹_n)Y²_n )∆t + ∑_{j=1}^d σj(Y¹_n) ∆W^j_n.


No Milstein correction appears (the double Itô integrals vanish). This scheme has strong order of convergence 1.

Solve the first equation for Y²_n and substitute into the second to obtain a multistep scheme for Y¹_n:

Y¹_{n+2} = ( 2 − γ(Y¹_n)∆t )Y¹_{n+1} − ( 1 − γ(Y¹_n)∆t )Y¹_n + b(Y¹_n)(∆t)²
           + ∑_{j=1}^d σj(Y¹_n)∆W^j_n ∆t.    (79)

This is the Lepingle-Ribemont scheme. The first equation in the Milstein scheme can be used as a starting routine and for calculating Y²_n.


Population dynamics:

dXt = rXt(K − Xt) dt + βXt dWt, X(0) = X0.

Motion in a periodic incompressible velocity field subject to molecular diffusion:

dXt = v(Xt) dt + √(2κ) dWt, ∇ · v(x) = 0.


% compute effective diffusivities for particles moving in periodic velocity
% fields using monte carlo. the velocity field is a Taylor-Green flow
% dt: time step   N: number of steps   M: number of paths
% D: molecular diffusion
function [kxx, kyy, tint] = effdiff(N,dt,M,D)
T = dt*N;
tint = [0:dt:T];
u = zeros(M,N);
v = zeros(M,N);
uin = u(:,1);
vin = v(:,1);
DDD = num2str(D);
MMM = num2str(M);
elll = num2str(dt);
dW1 = sqrt(dt)*randn(M,N);   % Brownian increments
dW2 = sqrt(dt)*randn(M,N);
for j = 1:N-1
    % chsowx, chsowy return the components of the Taylor-Green velocity field
    u(:,j+1) = u(:,j) + chsowx(u(:,j),v(:,j))*dt + sqrt(2*D)*dW1(:,j);
    v(:,j+1) = v(:,j) + chsowy(u(:,j),v(:,j))*dt + sqrt(2*D)*dW2(:,j);
end
v = [vin,v];
u = [uin,u];
figure
plot(u(1,:),v(1,:))
grid on
kxx = var(v)./(2*tint);
kyy = var(u)./(2*tint);
figure
plot(tint,kxx,'r','LineWidth',3)
ylabel('var(x)/2t','Fontsize',16,'Rotation',0)
xlabel('t','Fontsize',16)
title(['var(x)/2t ', ' D = ' DDD ', M = ' MMM ', dt = ' elll])
figure
plot(tint,kyy,'r','LineWidth',3)
ylabel('var(y)/2t','Fontsize',16,'Rotation',0)
xlabel('t','Fontsize',16)
title(['var(y)/2t ', ' D = ' DDD ', M = ' MMM ', dt = ' elll])


We consider the Itô SDE

dXt = b(Xt) dt + σ(Xt) dWt, X0 = x.    (80)

We assume that the coefficients are smooth, globally Lipschitz continuous and satisfy the linear growth condition:

|b(x) − b(y)| + |σ(x) − σ(y)| ≤ C|x − y|,    (81)

and

|b(x)| + |σ(x)| ≤ C(1 + |x|).    (82)

We also assume that the initial condition is a random variable independent of the Brownian motion Wt with

E|X0|² ≤ C.

Under the above assumptions there exists a unique strong solution to (80) for t ∈ [0, T].


Lemma

Under the above assumptions the solution to the SDE satisfies

E sup_{t∈[0,T]} |Xt|² ≤ C(T).    (83)

Proof.
Use the assumptions on the coefficients b(·) and σ(·) and on the initial condition, together with the Cauchy-Schwarz, Burkholder-Davis-Gundy and Gronwall inequalities.

The BDG inequality that we need is

c2 E ∫_0^t α²(s) ds ≤ E sup_{t∈[0,T]} ( ∫_0^t α(s) dWs )² ≤ C2 E ∫_0^t α²(s) ds.    (84)


Let {X^∆t_n}_{n=0}^N denote the Euler discretisation of the SDE (80) with constant interpolation.

We partition the interval [0, T]: t_n = n∆t, n = 0, . . . , N, ∆t = T/N.

Lemma

Under the above assumptions we have

E sup_{t∈[t_n,t_{n+1}]} |Xt − X_{t_n}|² ≤ C∆t.    (85)

Proof.
Use the assumptions on the coefficients, the BDG inequality and Lemma 34.


Theorem

Let {X^∆t_n}_{n=0}^N denote the Euler discretisation of the SDE (80) with constant interpolation. Under the above assumptions we have

E sup_{t∈[0,T]} |Xt − X^∆t_t|² ≤ C∆t.    (86)

For the proof we will need the discrete Gronwall inequality in the following form: let {y_n}_{n=0}^N be a sequence and A, B ≥ 0 satisfying

y_0 = 0, y_n ≤ A + Bh ∑_{j=0}^{n−1} y_j, 1 ≤ n ≤ N, h = 1/N.

Then

max_{0≤i≤N} y_i ≤ A e^B.    (87)


We can write the Euler scheme in the form

X^∆t_{n+1} = X^∆t_n + ∫_{t_n}^{t_{n+1}} b(X^∆t_n) ds + ∫_{t_n}^{t_{n+1}} σ(X^∆t_n) dWs.

For the exact solution we have

X_{n+1} = X_n + ∫_{t_n}^{t_{n+1}} b(X_n) ds + ∫_{t_n}^{t_{n+1}} σ(X_n) dWs + ε_n,

where

ε_n = ∫_{t_n}^{t_{n+1}} ( b(Xs) − b(X_n) ) ds + ∫_{t_n}^{t_{n+1}} ( σ(Xs) − σ(X_n) ) dWs.


Take the difference between the Euler approximation and the exact solution (use the notation δX_n = X_n − X^∆t_n, δb(X_n) = b(X_n) − b(X^∆t_n), etc.):

δX_{n+1} = δX_n + ∫_{t_n}^{t_{n+1}} δb_n ds + ∫_{t_n}^{t_{n+1}} δσ_n dWs + ε_n.

Sum over n and use δX_0 = 0 to obtain

δX_{n+1} = ∑_{k=0}^{n} ∫_{t_k}^{t_{k+1}} δb_k ds + ∑_{k=0}^{n} ∫_{t_k}^{t_{k+1}} δσ_k dWs + R_n,

where

R_n = ∑_{k=0}^{n} ε_k.


Use the Lipschitz continuity and the Cauchy-Schwarz and BDG inequalities to obtain (for ∆t ≤ 1)

E|δX_{n+1}|² ≤ C(T)∆t ∑_{k=0}^{n} E|δX_k|² + C E|R_N|².

Use Lemma 35, the Lipschitz continuity and the Cauchy-Schwarz and BDG inequalities to obtain

E|R_N|² ≤ C∆t.

Use the discrete Gronwall inequality to conclude the proof of Theorem 36.


Theorem

Let {X^∆t_n}_{n=0}^N denote the Euler discretisation of the SDE (80) with constant interpolation and assume that b, σ ∈ C⁴_b and f ∈ C⁴_P. Then

|Ef(X^∆t_T) − Ef(X_T)| ≤ C∆t.    (88)


For the proof of this theorem we will use the backward Kolmogorov equation. Let u(t, x) denote the solution of the backward Kolmogorov equation with u(T, x) = f(x). We have

Ef(X^∆t_T) − Ef(X_T) = E( u(T, X^∆t_T) ) − u(0, X0)
= ∑_{i=0}^{n−1} E[ u(t_{i+1}, X^∆t_{t_{i+1}}) − u(t_i, X^∆t_{t_i}) ]
= ∑_{i=0}^{n−1} E ∫_{t_i}^{t_{i+1}} [ ∂t u(s, X^∆t_s) + L(X^∆t_{t_i}) u(s, X^∆t_s) ] ds
= ∑_{i=0}^{n−1} E ∫_{t_i}^{t_{i+1}} [ ( ∂t + L(X^∆t_{t_i}) )( u(s, X^∆t_s) − u(t_i, X^∆t_{t_i}) ) ] ds,

where L(X^∆t_{t_i}) denotes the generator with the coefficients frozen at X^∆t_{t_i}. We use Itô's formula:

E| ∂t u(s, X^∆t_s) − ∂t u(t_i, X^∆t_{t_i}) | = E|χ(X^∆t_{t_i})| O(∆t).


When we are interested in weak approximation, we can replace the Brownian increments by different random variables that might be easier to simulate and have the right statistics.

We can use, for example, the two-point distributed random variable √∆t ζn, where the ζn are i.i.d. with

P( ζn = ±1 ) = 1/2.    (89)

For higher order approximations we can use the 3-point distributed random variable

P( ζn = ±√3 ) = 1/6, P( ζn = 0 ) = 2/3.    (90)

This is useful when considering weak convergence for fully implicit schemes.
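A sketch of a weak Euler step with the two-point increments (89) in place of Gaussian ones, on the illustrative test equation dX = −X dt + dW (for which EX²t → 1/2; the equation and all parameters are my own choices for the demonstration):

function weak_em
% weak EM for dX = -X dt + dW using two-point increments (89)
dt = 0.01; N = 1000; M = 10000;
X = zeros(M,1);                        % X0 = 0
for j = 1:N
    zeta = sign(rand(M,1) - 0.5);      % P(zeta = +1) = P(zeta = -1) = 1/2
    X = X - X*dt + sqrt(dt)*zeta;
end
fprintf('E X_T^2 ~ %.3f (limit 1/2)\n', mean(X.^2));
end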


Itô's formula can be used to obtain a probabilistic description of solutions to linear partial differential equations of parabolic type.

Let X^x_t be a diffusion process with drift b(·), diffusion Σ(·) = σσᵀ(·) and generator L with X^x_0 = x, and let f ∈ C²_0(R^d) and V ∈ C(R^d), bounded from below. Then the function

u(x, t) = E( e^{−∫_0^t V(X^x_s) ds} f(X^x_t) )    (91)

is the solution to the initial value problem

∂u/∂t = Lu − Vu,    (92a)
u(0, x) = f(x).    (92b)


To derive this result, we introduce the variable Y_t = −∫_0^t V(X_s) ds.

We rewrite the SDE for Xt as

dX^x_t = b(X^x_t) dt + σ(X^x_t) dWt, X^x_0 = x,    (93a)
dY^x_t = −V(X^x_t) dt, Y^x_0 = 0.    (93b)

The process (X^x_t, Y^x_t) is a diffusion process with generator

L_{x,y} = L − V(x) ∂/∂y.

We can write

E( e^{−∫_0^t V(X^x_s) ds} f(X^x_t) ) = E( φ(X^x_t, Y^x_t) ),

where φ(x, y) = f(x)e^y. We apply Itô's formula to this function to obtain (92).


The representation formula (91) for the solution of the initial value problem (92) is called the Feynman-Kac formula.

We can use the Feynman-Kac formula to develop a numerical scheme for solving parabolic PDEs based on the numerical solution of the underlying SDE.
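A sketch of such a scheme in one dimension: evolve M copies of (93a) with EM, accumulate the integral of V along each path, and average as in (91). The drift b, potential V, initial datum f and all parameters below are illustrative choices:

function u = fk_mc(x, t)
% Monte Carlo evaluation of u(x,t) in (91) for
% dX = b(X) dt + dW with b(y) = -y, V(y) = y^2, f(y) = cos(y)
b = @(y) -y; V = @(y) y.^2; f = @(y) cos(y);
dt = 1e-3; N = round(t/dt); M = 20000;
X = x*ones(M,1);
I = zeros(M,1);                        % running integral of V(X_s)
for j = 1:N
    I = I + V(X)*dt;
    X = X + b(X)*dt + sqrt(dt)*randn(M,1);
end
u = mean(exp(-I).*f(X));               % Feynman-Kac average (91)
end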


Stability analysis of numerical schemes

Goal: solve SDEs accurately over long time intervals.

The constant in the strong/weak order of convergence estimates grows exponentially fast in time (Gronwall's lemma).

Stability analysis: fix ∆t, study the T → +∞ limit.

Test the stability of the scheme on the linear SDE (geometric BM)

dXt = λXt dt + σXt dWt, X(0) = X0,    (94)

where λ, σ ∈ C.


We will say that an SDE is stable in mean square if for all X0 ≠ 0 with prob. 1

lim_{t→+∞} E|Xt|² = 0.    (95)

We will say that an SDE is asymptotically stable if for all X0 ≠ 0 with prob. 1

P( lim_{t→+∞} |Xt| = 0 ) = 1.    (96)

For the gBM (94) we have

lim_{t→+∞} E|Xt|² = 0 ⇔ Re(λ) + (1/2)|σ|² < 0,    (97)

and

P( lim_{t→+∞} |Xt| = 0 ) = 1 ⇔ Re( λ − (1/2)σ² ) < 0.    (98)


Lemma
The geometric Brownian motion (94) with λ, σ ∈ C is stable in mean square provided that

Re(λ) + (1/2)|σ|² < 0.    (99)


Proof.

We apply Itô's formula to |Xt|² = (Re Xt)² + (Im Xt)² to obtain

d|Xt|² = ( 2Re(λ) + |σ|² )|Xt|² dt + dMt,

where Mt is a martingale. We take the expectation to obtain

d/dt E|Xt|² = ( 2Re(λ) + |σ|² ) E|Xt|².

The solution to this equation is

E|Xt|² = exp( ( 2Re(λ) + |σ|² )t ) E|X0|²,

from which (99) follows.


Mean square stability is more stringent than asymptotic stability.

Suppose that the parameters λ and σ are chosen so that gBM is mean square stable. For what values of ∆t is the EM scheme also mean square stable?

lim_{j→+∞} EX²_j = 0 ⇔ |1 + λ∆t|² + ∆t|σ|² < 1.    (100)

Similar results can be obtained for the Milstein scheme.


Definition

Let X^∆t_t denote a time-discrete approximation of the SDE

dXt = b(Xt) dt + σ(Xt) dWt    (101)

with step size ∆t starting at X^∆t_0 at t = 0, and let X̄^∆t_t denote the approximation starting at X̄^∆t_0. We will say that X^∆t_t is stochastically numerically stable for the SDE (101) if for any finite interval [0, T] there exists ∆t_0 > 0 such that for all ε > 0 and all ∆t ∈ (0, ∆t_0) we have

lim_{|X^∆t_0 − X̄^∆t_0| → 0} sup_{t∈[0,T]} P( |X^∆t_{n_t} − X̄^∆t_{n_t}| ≥ ε ) = 0.    (102)

We will say that a numerical scheme is stochastically numerically stable if it satisfies (102) for all SDEs for which it is convergent.


Asymptotic stochastic stability: replace (102) with

lim_{|X^∆t_0 − X̄^∆t_0| → 0} lim_{T→+∞} P( |X^∆t_{n_t} − X̄^∆t_{n_t}| ≥ ε ) = 0.    (103)

Consider the linear SDE

dXt = λXt dt + dWt.

It is mean square stable for Re(λ) < 0.

Consider numerical schemes that can be written in the form

X^∆t_{n+1} = X^∆t_n G(λ∆t) + Z^∆t_n.

The region of absolute stability of the scheme is the set of λ∆t with Re(λ) < 0 for which

|G(λ∆t)| < 1.

If the region of absolute stability contains the left half complex plane, the scheme is A-stable.


Just as with ODEs, implicit schemes have better stability properties than explicit schemes.

The implementation of an implicit scheme requires the solution of an additional algebraic equation at each time step, which can usually be done using the Newton-Raphson algorithm.

Treating the noise implicitly would involve reciprocals of Gaussian random variables, which do not have finite absolute moments.

We will therefore treat the drift implicitly and the noise explicitly.

The implicit Euler scheme is

Xn+1 = Xn + b(Xn+1)∆t + σ(Xn)∆Wn.    (104)

If we are only interested in weak convergence, we can replace √∆t N(0, 1) by random variables that are accurate in the weak sense and whose reciprocals are easier to handle.


Family of implicit Euler schemes (stochastic theta method, STM):

Xn+1 = Xn + ( θ b(Xn+1) + (1 − θ)b(Xn) )∆t + σ(Xn)∆Wn,    (105)

for θ ∈ [0, 1].

Family of implicit Milstein schemes:

Xn+1 = Xn + ( θ b(Xn+1) + (1 − θ)b(Xn) )∆t    (106)
       + σ(Xn)∆Wn + (1/2)(σσ′)(Xn)( (∆Wn)² − ∆t ).    (107)

For both schemes we have

G(λ∆t) = ( 1 + (1 − θ)λ∆t ) / ( 1 − θλ∆t ).


Test problem: geometric Brownian motion (linear SDE with multiplicative noise), eqn. (94).

For what stepsizes ∆t does the stochastic θ method share the stability properties of the test problem?

Use the concept of mean square stability.

For the SDE, mean square stability depends on the parameters λ, σ.

For the numerical scheme it depends on the model parameters λ, σ and on the numerical scheme parameters ∆t, θ.

Our goal is to find for which values of ∆t, θ, λ, σ the stochastic theta method is also mean square stable.


The STM for geometric Brownian motion becomes

$X_{n+1} = X_n + (1-\theta)\lambda h X_n + \theta\lambda h X_{n+1} + \sigma\sqrt{h}\,\xi_n X_n,$ (108)

with h := ∆t. We can rewrite this equation as

$X_{n+1} = \big(a + b\,\xi_n\big)\,X_n,$ (109)

with

$a = \frac{1 + (1-\theta)h\lambda}{1 - \theta h\lambda},\qquad b = \frac{\sqrt{h}\,\sigma}{1 - \theta h\lambda}.$ (110)

This is a homogeneous Markov chain with an uncountable state space.

To study the stability properties of the STM we need to study the long time behavior of the Markov chain (109).


Definition
The numerical scheme (Markov chain) (109) is mean square stable iff

$\lim_{n\to+\infty}\mathbb{E}|X_n|^2 = 0.$ (111)

Theorem
The MC (109) is mean square stable if and only if

$|a|^2 + |b|^2 < 1.$ (112)

In particular, for the STM we have

$\frac{|1 + (1-\theta)h\lambda|^2 + h|\sigma|^2}{|1 - \theta h\lambda|^2} < 1.$ (113)
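A sketch (parameter values are illustrative assumptions) comparing the stability indicator $|a|^2 + |b|^2$ from (110) and (112) with the simulated second moment of the STM chain (109):

%A sketch: check the mean square stability criterion (112) for the STM
%applied to gBM, with real lambda and sigma for simplicity.
theta = 0.5; h = 0.5; lambda = -4; sigma = 1;
a = (1 + (1-theta)*h*lambda)/(1 - theta*h*lambda);
b = sqrt(h)*sigma/(1 - theta*h*lambda);
M = 1e4; X = ones(M, 1);
for n = 1:100
    X = (a + b*randn(M,1)).*X;    %one STM step for gBM, cf. (109)
end
%the chain is MS stable iff a^2 + b^2 < 1, cf. (112)
disp([a^2 + b^2, mean(X.^2)]);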


Define the stability regions for the gBM and for the STM:

$S_{SDE} := \Big\{\lambda,\sigma\in\mathbb{C} :\ \mathrm{Re}(\lambda) + \tfrac12|\sigma|^2 < 0\Big\}$ (114)

and

$S_{STM}(\theta,h) := \Big\{\lambda,\sigma\in\mathbb{C} :\ \frac{|1 + (1-\theta)h\lambda|^2 + h|\sigma|^2}{|1 - \theta h\lambda|^2} < 1\Big\}.$ (115)

Theorem
For all h > 0 we have

$S_{STM}(\theta,h)\subset S_{SDE}$ for $\theta\in[0,\tfrac12)$;

$S_{STM}(\theta,h) = S_{SDE}$ for $\theta = \tfrac12$;

$S_{STM}(\theta,h)\supset S_{SDE}$ for $\theta\in(\tfrac12, 1]$.


Definition
The numerical scheme (109) is (mean square) A-stable provided that whenever the test problem (94) is mean square stable, (109) is also mean square stable for all ∆t > 0.

Corollary
The STM is A-stable for all $\theta\in[\tfrac12, 1]$.

If the SDE is unstable, then so is the STM for all h.

If the SDE is stable, then so is the STM for sufficiently small h. In this case, the resulting stepsize restriction can be arbitrarily severe.


Numerical methods for ergodic SDEs

We consider SDEs in $\mathbb{R}^d$:

$dX_t = b(X_t)\,dt + \sigma(X_t)\,dW_t.$ (116)

We assume that Xt is ergodic with respect to the invariant measure π(dx) = ρ∞(x) dx. Assuming sufficient regularity, the invariant density ρ∞(x) is the unique normalizable solution of the stationary Fokker-Planck equation

$\mathcal{L}^*\rho_\infty = 0.$ (117)

In many applications we need to be able to calculate expectations with respect to the stationary distribution:

$\mathbb{E}_\pi f(x) = \int_{\mathbb{R}^d} f(x)\,\rho_\infty(x)\,dx.$ (118)


In high dimensions it is computationally prohibitive to solve the stationary Fokker-Planck equation.

We also need to calculate the high dimensional integral (118).

For ergodic SDEs time averages equal phase-space averages:

$\lim_{T\to+\infty}\frac{1}{T}\int_0^T f(X_s)\,ds = \mathbb{E}_\pi f(x).$ (119)

We can solve numerically the SDE (116) and then calculate the time average in (119) to compute $\mathbb{E}_\pi f(x)$.

Let $X^h_n = X^h(nh)$, h = ∆t, denote the solution of the numerical scheme. We need to estimate the difference

$\Big|\frac{1}{N}\sum_{n=1}^{N} f(X^h_n) - \int_{\mathbb{R}^d} f(x)\rho_\infty(x)\,dx\Big|$ (120)

as a function of h and N.
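A minimal sketch of this procedure (the OU test case and parameter values are illustrative assumptions): for $dX_t = -X_t\,dt + \sqrt{2}\,dW_t$ the invariant measure is $\pi = \mathcal{N}(0,1)$, so for f(x) = x² the time average should approach $\mathbb{E}_\pi f = 1$.

%A sketch: time-average estimator (120) along an EM trajectory of the
%OU process dX = -X dt + sqrt(2) dW, whose invariant law is N(0,1).
h = 1e-2; N = 1e6;
x = 0; favg = 0;
for n = 1:N
    x = x - x*h + sqrt(2*h)*randn;   %explicit Euler (EM) step
    favg = favg + x^2;               %accumulate f(X_n) for f(x) = x^2
end
disp(favg/N);                        %close to 1 for small h and large N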


Let $\{X^{\Delta t}_n\}_{n=0}^{n_T}$ denote a (long) numerically calculated trajectory. We have

$F_T := \frac{1}{T}\int_0^T f(X_s)\,ds \approx \frac{1}{n_T}\sum_{n=0}^{n_T-1} f(X^{\Delta t}_n) =: F^{\Delta t}_T.$

Define

$F^{\Delta t} = \lim_{T\to+\infty} F^{\Delta t}_T.$

We will say that a time discrete approximation $X^{\Delta t}_n$ converges with respect to the ergodic criterion with order β > 0 to the ergodic diffusion process Xt as ∆t → 0 provided that for every $f\in C^\infty_P(\mathbb{R}^d,\mathbb{R})$ there exist a positive constant $C_f$ independent of ∆t and a ∆t₀ such that

$|F^{\Delta t} - \mathbb{E}_\pi f| \leq C_f\,(\Delta t)^\beta.$ (121)


This is an extension of the weak convergence criterion to the infinite time horizon T = ∞.

The explicit Euler scheme converges with respect to the ergodic criterion with order β = 1.0.

We expect that most numerical schemes with weak order β also converge with respect to the ergodic criterion with the same order β, under the assumptions of the Hasminski criterion.


We need to address the following issues.

1. Does the numerical scheme have an invariant measure (density) $\rho^h_\infty$? How quickly does it converge to it?

2. How close is $\rho^h_\infty$ to $\rho_\infty$?

3. How close is the time averaging estimator to the stationary measure of the SDE?

The standard weak convergence results (e.g. for the explicit Euler method) are valid only over finite time intervals. In order to be able to control the difference between the long time average of the numerical scheme and $\mathbb{E}_\pi f$ we need estimates on the numerical scheme that are uniform in time.

For this we need to use the structural property of (geometric)ergodicity of the SDE.


A very important concept in the study of limit theorems for stochastic processes is that of ergodicity.

This concept, in the context of Markov processes, provides us with information on the long-time behavior of a Markov semigroup.

Definition
A Markov process is called ergodic if the equation

$P_t g = g,\quad g\in C_b(E),\ \forall t > 0$

has only constant solutions.

Roughly speaking, ergodicity corresponds to the case where the semigroup $P_t$ is such that $P_t - I$ has only constants in its null space or, equivalently, to the case where the generator $\mathcal{L}$ has only constants in its null space. This follows from the definition of the generator of a Markov process.


Under some additional compactness assumptions, an ergodic Markov process has an invariant measure µ with the property that, in the case $T = \mathbb{R}^+$,

$\lim_{t\to+\infty}\frac{1}{t}\int_0^t g(X_s)\,ds = \mathbb{E}g(x),$

where E denotes the expectation with respect to µ.

This is a physicist's definition of an ergodic process: time averages equal phase space averages.

Using the adjoint semigroup we can define an invariant measure as the solution of the equation

$P^*_t\mu = \mu.$

If this measure is unique, then the Markov process is ergodic.


Using this, we can obtain an equation for the invariant measure in terms of the adjoint $\mathcal{L}^*$ of the generator, which is the generator of the semigroup $P^*_t$. Indeed, from the definition of the generator of a semigroup and the definition of an invariant measure, we conclude that a measure µ is invariant if and only if

$\mathcal{L}^*\mu = 0$

in some appropriate generalized sense ($(\mathcal{L}^*\mu, f) = 0$ for every bounded measurable function f).

Assume that µ(dx) = ρ(x) dx. Then the invariant density satisfies the stationary Fokker-Planck equation

$\mathcal{L}^*\rho = 0.$

The invariant measure (distribution) governs the long-time dynamics of the Markov process.


If X₀ is distributed according to µ, then so is Xt for all t > 0. The resulting stochastic process, with X₀ distributed in this way, is stationary.

In this case the transition probability density (the solution of the Fokker-Planck equation) is independent of time: ρ(x, t) = ρ(x).

Consequently, the statistics of the Markov process are independent of time.


Example
The one dimensional Brownian motion is not an ergodic process: the null space of the generator $\mathcal{L} = \frac12\frac{d^2}{dx^2}$ on $\mathbb{R}$ is not one dimensional!

Example
Consider a one-dimensional Brownian motion on [0, 1], with periodic boundary conditions. The generator of this Markov process is the differential operator $\mathcal{L} = \frac12\frac{d^2}{dx^2}$, equipped with periodic boundary conditions on [0, 1]. This operator is self-adjoint. The null space of both $\mathcal{L}$ and $\mathcal{L}^*$ comprises constant functions on [0, 1]. Both the backward Kolmogorov and the Fokker-Planck equation reduce to the heat equation

$\frac{\partial\rho}{\partial t} = \frac12\frac{\partial^2\rho}{\partial x^2}$

with periodic boundary conditions on [0, 1]. Fourier analysis shows that the solution converges to a constant at an exponential rate.


Example
The one dimensional Ornstein-Uhlenbeck (OU) process is a Markov process with generator

$\mathcal{L} = -\alpha x\frac{d}{dx} + D\frac{d^2}{dx^2}.$

The null space of $\mathcal{L}$ comprises constants in x. Hence, it is an ergodic Markov process. In order to calculate the invariant measure we need to solve the stationary Fokker-Planck equation:

$\mathcal{L}^*\rho = 0,\quad \rho \geq 0,\quad \|\rho\|_{L^1(\mathbb{R})} = 1.$ (122)


Example (continued)
Let us calculate the L²-adjoint of $\mathcal{L}$. Assuming that f, h decay sufficiently fast at infinity, we have:

$\int_{\mathbb{R}}(\mathcal{L}f)\,h\,dx = \int_{\mathbb{R}}\big[(-\alpha x\,\partial_x f)h + (D\,\partial_x^2 f)h\big]\,dx = \int_{\mathbb{R}}\big[f\,\partial_x(\alpha x h) + f\,(D\,\partial_x^2 h)\big]\,dx =: \int_{\mathbb{R}} f\,\mathcal{L}^*h\,dx,$

where

$\mathcal{L}^*h := \frac{d}{dx}(\alpha x h) + D\frac{d^2 h}{dx^2}.$

We can calculate the invariant distribution by solving equation (122).

The invariant measure of this process is the Gaussian measure

$\mu(dx) = \sqrt{\frac{\alpha}{2\pi D}}\exp\Big(-\frac{\alpha}{2D}x^2\Big)\,dx.$

If the initial condition of the OU process is distributed according to the invariant measure, then the OU process is a stationary process.
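A sketch verifying this numerically (parameter values are illustrative assumptions): a long EM trajectory of the OU process should have sample variance close to the invariant variance D/α.

%A sketch: compare the sample variance of a long EM trajectory of the OU
%process dX = -alpha*X dt + sqrt(2D) dW with the invariant variance D/alpha.
alpha = 2; D = 0.5; h = 1e-2; N = 1e6;
x = 0; s = zeros(1, N);
for n = 1:N
    x = x - alpha*x*h + sqrt(2*D*h)*randn;
    s(n) = x;
end
disp([var(s), D/alpha]);   %sample variance vs. invariant variance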


Approximation of Invariant Measures and MCMC

Consider the SDE in $\mathbb{R}^d$:

$dX_t = b(X_t)\,dt + \sigma(X_t)\,dW_t,\quad X_0 = x.$ (123)

The generator of Xt is

$\mathcal{L} = b(x)\cdot\nabla + \frac12\sum_{i,j=1}^d \Sigma_{ij}(x)\frac{\partial^2}{\partial x_i\partial x_j},\qquad \Sigma = \sigma\sigma^T.$ (124)

The L²-adjoint (Fokker-Planck operator) is

$\mathcal{L}^*\,\cdot\, = \nabla\cdot\Big(-b(x)\,\cdot\, + \frac12\nabla\cdot\big(\Sigma(x)\,\cdot\,\big)\Big).$ (125)


The expectation $u(x,t) = \mathbb{E}(\varphi(X_t)\,|\,X_0 = x)$ is the solution of the initial value problem for the backward Kolmogorov equation

$\frac{\partial u}{\partial t} = \mathcal{L}u,\quad u(x,0) = \varphi(x).$ (126)

Using the transition probability density p(t, x, y) we can write

$u(x,t) = \int_{\mathbb{R}^d} p(t,x,y)\,\varphi(y)\,dy.$

p(t, x, y) is the solution of the forward Kolmogorov (Fokker-Planck) equation

$\frac{\partial p}{\partial t} = \mathcal{L}^*_y p,\quad p(0,x,y) = \delta(x-y).$ (127)


Connection between SDEs, (elliptic and parabolic) PDEs and probability measures.

We can use SDEs in order to calculate functional integrals, the solution of elliptic and parabolic PDEs, and to sample from a given probability distribution.

We need to study the ergodic properties of the SDE (123).

We denote by $P_t$ and $P^*_t$ the semigroups generated by $\mathcal{L}$ and $\mathcal{L}^*$, respectively:

$P_t = e^{\mathcal{L}t}\quad\text{and}\quad P^*_t = e^{\mathcal{L}^*t}.$

$P_t$ acts on $L^\infty$ functions and $P^*_t$ on probability measures.


A probability measure µ is invariant under the dynamics (123) provided that

$P^*_t\mu = \mu.$ (128)

We will say that the dynamics Xt is ergodic if there exists only one invariant measure. In this case

$\lim_{T\to+\infty}\frac{1}{T}\int_0^T f(X_s)\,ds = \int_{\mathbb{R}^d} f(x)\,\mu(dx).$ (129)

Let µ(dx) = p∞(x) dx. Subtracting µ from both sides of (128), dividing by t and passing to the limit t → 0, we obtain the stationary Fokker-Planck equation

$\mathcal{L}^* p_\infty = 0,$ (130)

together with the appropriate boundary conditions.

The SDE (123) is ergodic provided that there exists a unique normalizable solution to (130).


Consider the dynamics (123) with X₀ ∼ p₀(x). The law of the process Xt is the solution of the initial value problem for the Fokker-Planck equation

$\frac{\partial p}{\partial t} = \mathcal{L}^* p,\quad p(0,x) = p_0(x).$ (131)

When Xt is ergodic we have that

$\lim_{t\to+\infty}\|p(\cdot,t) - p_\infty(\cdot)\|_{L^1(\mathbb{R}^d)} = 0.$

In applications to MCMC we need quantitative information on the rate of convergence to equilibrium.


We can study the ergodic properties of an SDE using techniques from stochastic analysis, functional analysis (functional inequalities), Lyapunov function techniques etc.

Hasminski's criterion: suppose the drift and diffusion coefficients are smooth, the diffusion coefficient is strictly positive definite and bounded, and there exist a constant β > 0 and a compact subset $K\subset\mathbb{R}^d$ such that

$\langle b(x), x\rangle \leq -\beta|x|^2$

for all $x\in\mathbb{R}^d\setminus K$. Then Xt is ergodic.

Even if we can prove that an SDE is ergodic, we usually do not have an analytic formula for the invariant measure.

Example: nonequilibrium steady states.


We can consider numerical schemes that can be written in the form

$X^h_{n+1} = X^h_n + b(X^h_n; h)\,h + \sigma(X^h_n, h)\sqrt{h}\,\eta_n,$ (132)

where $b:\mathbb{R}^d\times(0,1)\to\mathbb{R}^d$, $\sigma:\mathbb{R}^d\times(0,1)\to\mathbb{R}^{d\times m}$, and $\{\eta_n\}$ is a collection of iid real-valued random variables with

$\mathbb{E}\eta_{n,i} = \mathbb{E}\eta_{n,i}^3 = 0,\quad \mathbb{E}\eta_{n,i}^2 = 1,\quad \mathbb{E}\eta_{n,i}^{2r} < +\infty,$

for r sufficiently large.

For example, we can have $\eta_{n,i}\sim\mathcal{N}(0,1)$ iid, or iid two-point random variables as in (89).


The convergence of the long time average of the solution of the SDE to its invariant measure can be proved by studying the solution of the Poisson equation

$-\mathcal{L}\phi = f - \mathbb{E}_\pi f,\qquad \mathbb{E}_\pi\phi = 0.$ (133)

We apply Itô's formula to φ:

$d\phi(X_t) = \mathcal{L}\phi(X_t)\,dt + dM_t = -\big(f(X_t) - \mathbb{E}_\pi f\big)\,dt + dM_t,$

where $M_t = \int_0^t \nabla\phi(X_s)\,\sigma(X_s)\,dW_s$. Consequently,

$\frac{1}{T}\int_0^T f(X_t)\,dt - \mathbb{E}_\pi f = -\frac{1}{T}\big(\phi(X_T) - \phi(X_0)\big) + \frac{1}{T}M_T.$


Assume that solutions of the Poisson equation (133) are bounded (for example, if the diffusion is uniformly elliptic and the state space is compact, e.g. the unit torus $\mathbb{T}^d$). Then:

$\lim_{T\to+\infty}\mathbb{E}\Big(\frac{1}{T}\big(\phi(X_T) - \phi(X_0)\big)\Big)^2 = 0$

and

$\lim_{T\to+\infty}\mathbb{E}\frac{1}{T^2}M_T^2 = \lim_{T\to+\infty}\frac{1}{T}\mathbb{E}\langle M\rangle_T = 0,$

using the law of large numbers and the central limit theorem for martingales.

Consequently:

$\mathbb{E}\Big(\frac{1}{T}\int_0^T f(X_t)\,dt - \mathbb{E}_\pi f\Big)^2 \leq \frac{C}{T}.$


This is a quantitative version of the mean ergodic theorem. We can also prove an almost sure ergodic theorem (strong law of large numbers):

$\lim_{T\to+\infty}\frac{1}{T}\int_0^T f(X_t)\,dt = \mathbb{E}_\pi f\quad a.s.$

The long time average of f(Xt) is an estimator of the integral $\mathbb{E}_\pi f$. Its bias and variance are:

$\mathrm{Bias}\Big(\frac{1}{T}\int_0^T f(X_t)\,dt\Big) := \mathbb{E}_\pi\Big(\frac{1}{T}\int_0^T f(X_t)\,dt - \mathbb{E}_\pi f\Big) = O\Big(\frac{1}{T}\Big)$ (134)

and

$\mathrm{Var}_\pi\Big(\frac{1}{T}\int_0^T f(X_t)\,dt\Big) := \mathbb{E}_\pi\Big(\frac{1}{T}\int_0^T f(X_t)\,dt - \mathbb{E}_\pi f\Big)^2 = O\Big(\frac{1}{T}\Big).$ (135)


We have that

$\lim_{T\to+\infty} T\,\mathrm{Var}_\pi\Big(\frac{1}{T}\int_0^T f(X_t)\,dt\Big) = 2\,\langle\phi, f - \mathbb{E}_\pi f\rangle_\pi,$ (136)

where φ denotes the solution of the Poisson equation (133) and $\langle h, g\rangle_\pi = \int h\,g\,\pi(dx)$ denotes the L²(π) inner product.

This follows from Itô's formula or, equivalently, from (setting $\bar f = f - \mathbb{E}_\pi f$)

$\lim_{T\to+\infty} T\,\mathrm{Var}_\pi\Big(\frac{1}{T}\int_0^T \bar f(X_t)\,dt\Big) = 2\int_0^{+\infty}\mathbb{E}_\pi\big(\bar f(X_t)\bar f(X_0)\big)\,dt$ (137)

and the backward Kolmogorov equation.
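As a worked example added here (not in the original slides) that checks (136) against (137): consider the stationary OU process $dX_t = -\alpha X_t\,dt + \sqrt{2D}\,dW_t$ with observable f(x) = x, for which $\mathbb{E}_\pi f = 0$. Solving $-\mathcal{L}\phi = f$ with $\mathcal{L} = -\alpha x\,\partial_x + D\,\partial_x^2$ gives $\phi(x) = x/\alpha$, so by (136)

$\lim_{T\to+\infty} T\,\mathrm{Var}_\pi\Big(\frac{1}{T}\int_0^T X_t\,dt\Big) = 2\,\langle\phi, f\rangle_\pi = \frac{2}{\alpha}\,\mathbb{E}_\pi x^2 = \frac{2D}{\alpha^2}.$

The same value follows from (137), since $\mathbb{E}_\pi(X_t X_0) = \frac{D}{\alpha}e^{-\alpha t}$ and $2\int_0^{+\infty}\frac{D}{\alpha}e^{-\alpha t}\,dt = \frac{2D}{\alpha^2}$.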


We have (for the centered observable $\bar f = f - \mathbb{E}_\pi f$):

$\mathrm{Var}_\pi\Big(\frac{1}{T}\int_0^T \bar f(X_t)\,dt\Big) = \mathbb{E}_\pi\Big(\frac{1}{T}\int_0^T \bar f(X_t)\,dt\Big)^2 = \frac{1}{T^2}\int_0^T\!\!\int_0^T \mathbb{E}_\pi\big(\bar f(X_t)\bar f(X_s)\big)\,dt\,ds =: \frac{1}{T^2}\int_0^T\!\!\int_0^T R_{\bar f}(t,s)\,dt\,ds = \frac{2}{T^2}\int_0^T (T-s)\,R_{\bar f}(s)\,ds = \frac{2}{T}\int_0^T\Big(1 - \frac{s}{T}\Big)\mathbb{E}_\pi\big(\bar f(X_s)\bar f(X_0)\big)\,ds,$

from which (137) follows; by stationarity $R_{\bar f}(t,s)$ depends only on |t − s|, which gives the fourth equality.


Let u(x, t) denote the solution of the backward Kolmogorov equation with initial condition f(x). We can formally write

$\mathbb{E}(f(X_t)\,|\,X_0 = x) = u(x,t) = e^{\mathcal{L}t}f(x),$

where $\mathcal{L}$ denotes the generator of Xt.

For the calculation of the time integral of the stationary autocorrelation function we have (X denotes the state space):

$\int_0^{+\infty}\mathbb{E}_\pi\big(f(X_t)f(X_0)\big)\,dt = \int_X\int_0^{+\infty}\big(e^{\mathcal{L}t}f(x)\big)f(x)\,dt\,\pi(dx) = \int_X f(x)\Big(\int_0^{+\infty} e^{\mathcal{L}t}f(x)\,dt\Big)\pi(dx) = \int_X f(x)\,(-\mathcal{L})^{-1}f(x)\,\pi(dx) = \langle f, \phi\rangle_\pi.$


Consider the long time average estimator of $\mathbb{E}_\pi f$

$\bar f_N = \frac{1}{N}\sum_{n=0}^{N-1} f(X^h_n),$ (138)

where $\{X^h_n\}_{n=0}^{N-1}$ is the Markov chain obtained from the numerical discretization of the SDE, for example using the EM scheme.

We can calculate the variance of this estimator by generating a long trajectory:

1. We generate a long trajectory of length M·T, with T = Nh, and we split it into M blocks of length T.

2. We evaluate the estimators $\bar f_{m,N}$. Assuming that T is large enough and that we have fast decay of correlations, these estimators are close to being uncorrelated.

3. We compute the sample variance

$D = \frac{1}{M-1}\Bigg(\sum_{m=1}^M \bar f_{m,N}^2 - \frac{1}{M}\Big(\sum_{m=1}^M \bar f_{m,N}\Big)^2\Bigg).$ (139)


For sufficiently large N, M we have the following confidence interval:

$\mathbb{E}_\pi f\in\Big(\bar f_{NM} - c\,\frac{\sqrt{D}}{\sqrt{M}},\ \bar f_{NM} + c\,\frac{\sqrt{D}}{\sqrt{M}}\Big),$ (140)

with probability 95% for c = 2 and probability 99.7% for c = 3.


MARKOV CHAIN MONTE CARLO


Goal: sample from a distribution π(x) that is known only up to a constant. We will sometimes write

$\pi(x) = \frac{1}{Z}e^{-V(x)},\qquad Z = \int_{\mathbb{R}^d} e^{-V(x)}\,dx.$ (141)

We want to calculate integrals of the form

$\mathbb{E}_\pi f := \int_{\mathbb{R}^d} f(x)\,\pi(dx).$

1. MC approach: calculate the normalization constant $Z = \int_{\mathbb{R}^d}\pi(x)\,dx$ and the integral $\mathbb{E}_\pi f(x) = \int_{\mathbb{R}^d} f(x)\pi(x)\,dx$ using MC, together with an appropriate variance reduction technique.

2. SDE approach: construct an appropriate stochastic dynamics whose invariant distribution is π(x).


Markov Chain Monte Carlo

Construct ergodic stochastic dynamics whose invariant distribution is π(x).

There are many different dynamics whose invariant distribution is given by π(x).

Different discretizations of the corresponding SDE can behave very differently, and can even fail to converge to π(x).

Computational efficiency:

1. Choose the dynamics that converges to equilibrium as quickly as possible (bias correction).

2. Choose the dynamics that leads to the minimum asymptotic variance (variance reduction).

Consider these problems either for a class of observables or for a specific observable.


We want to calculate the integral

$\mathbb{E}_\pi f = \int_{\mathbb{R}^d} f(x)\,\pi(dx),$ (142)

where π(dx) is known up to the normalization constant.

We want to use a diffusion process Xt that is ergodic with respect to π(dx):

$\lim_{T\to+\infty}\frac{1}{T}\int_0^T f(X_s)\,ds = \mathbb{E}_\pi f\quad a.s.$ (143)

for all f ∈ L¹(π).

We will consider diffusions that satisfy a functional central limit theorem:

$\lim_{T\to+\infty}\sqrt{T}\Big(\frac{1}{T}\int_0^T f(X_s)\,ds - \mathbb{E}_\pi f\Big) = \mathcal{N}(0, 2\sigma_f^2),$ (144)

in distribution, for all f ∈ L²(π).

Our goal is to choose the diffusion process so as to speed up convergence to equilibrium and minimize the asymptotic variance.


In order for π(x) to be the invariant distribution of Xt, we require that it is the (unique) solution of the stationary Fokker-Planck equation

$\nabla\cdot\big(-b(x)\pi(x) + \nabla\cdot(\Sigma(x)\pi(x))\big) = 0,$ (145)

together with appropriate boundary conditions (if we are in a bounded domain).

If the detailed balance condition

$J_s := -b\pi + \nabla\cdot(\Sigma\pi) = 0$ (146)

is satisfied, then the process Xt is reversible with respect to π(x).


There are (infinitely) many reversible diffusions that can be used in order to sample from π(x). A reversible diffusion can be written in the form

$dX_t = \Sigma(X_t)\nabla\log\pi(X_t)\,dt + \nabla\cdot\Sigma(X_t)\,dt + \sqrt{2\Sigma(X_t)}\,dW_t.$ (147)

The rate of convergence to equilibrium depends on the tails of the distribution π(x) and on the choice of the diffusion process Xt.

The standard choice is the overdamped Langevin dynamics, $\Sigma = \frac12 I$, $b(x) = \frac12\nabla\log\pi(x)$:

$dX_t = \frac12\nabla\log\pi(X_t)\,dt + dW_t.$ (148)


Definition
A stationary stochastic process Xt is time reversible if its law is invariant under time reversal: for every T ∈ (0, +∞), Xt and the time-reversed process $X_{T-t}$ have the same distribution.

The processes Xt and $X_{T-t}$ have the same finite dimensional distributions. Equivalently, for each $N\in\mathbb{N}^+$, a collection of times 0 = t₀ < t₁ < ... < t_N = T, and bounded measurable functions with compact support $f_j$, j = 0,...,N, we have that

$\mathbb{E}_\mu\prod_{j=0}^N f_j(X_{t_j}) = \mathbb{E}_\mu\prod_{j=0}^N f_j(X_{T-t_j}),$ (149)

where µ(dx) denotes the invariant measure of Xt and $\mathbb{E}_\mu$ denotes expectation with respect to µ.

Reversible diffusion processes can be characterized in terms of the properties of their generator: time-reversal is equivalent to the selfadjointness of the generator in the Hilbert space $L^2(\mathbb{R}^d;\mu)$.


Theorem
A stationary Markov process Xt in $\mathbb{R}^d$ with generator $\mathcal{L}$ and invariant measure µ is reversible if and only if its generator is selfadjoint in $L^2(\mathbb{R}^d;\mu)$.


Proof.
Assume first (149). We take N = 1 and t₀ = 0, t₁ = T to deduce that

$\mathbb{E}_\mu\big(f_0(X_0)f_1(X_T)\big) = \mathbb{E}_\mu\big(f_0(X_T)f_1(X_0)\big),\quad \forall f_0, f_1\in L^2(\mathbb{R}^d;\mu).$

This is equivalent to

$\int\big(e^{\mathcal{L}t}f_0(x)\big)f_1(x)\,\mu(dx) = \int f_0(x)\big(e^{\mathcal{L}t}f_1(x)\big)\,\mu(dx),$

i.e.

$\langle e^{\mathcal{L}t}f_1, f_2\rangle_{L^2_\mu} = \langle f_1, e^{\mathcal{L}t}f_2\rangle_{L^2_\mu},\quad \forall f_1, f_2\in L^2(\mathbb{R}^d;\mu).$ (150)

Consequently, the semigroup $e^{\mathcal{L}t}$ generated by $\mathcal{L}$ is selfadjoint. Differentiating (150) at t = 0 gives that $\mathcal{L}$ is selfadjoint.


Proof (continued).
Conversely, assume that $\mathcal{L}$ is selfadjoint in $L^2(\mathbb{R}^d;\mu)$. We will use an induction argument. Our assumption of selfadjointness implies that (149) is true for N = 1:

$\mathbb{E}_\mu\prod_{j=0}^1 f_j(X_{t_j}) = \mathbb{E}_\mu\prod_{j=0}^1 f_j(X_{T-t_j}).$ (151)

Assume that it is true for N = k. We have that

$\mathbb{E}_\mu\prod_{j=0}^k f_j(X_{t_j}) = \int\!\cdots\!\int f_0(x_0)\,\mu(dx_0)\prod_{j=1}^k f_j(x_j)\,p(t_j - t_{j-1}, x_{j-1}, dx_j) = \int\!\cdots\!\int f_k(x_k)\,\mu(dx_k)\prod_{j=1}^k f_{j-1}(x_{j-1})\,p(t_j - t_{j-1}, x_j, dx_{j-1}).$ (152)


Proof (continued).
Now we show that (149) is true for N = k + 1. We calculate, using (151) and (152),

$\mathbb{E}_\mu\prod_{j=0}^{k+1} f_j(X_{t_j}) = \mathbb{E}_\mu\Big[\prod_{j=0}^{k} f_j(X_{t_j})\,f_{k+1}(X_{t_{k+1}})\Big]$

$= \int\!\cdots\!\int\mu(dx_0)\,f_0(x_0)\prod_{j=1}^k f_j(x_j)\,p(t_j - t_{j-1}, x_{j-1}, dx_j)\;f_{k+1}(x_{k+1})\,p(t_{k+1} - t_k, x_k, dx_{k+1})$

$\overset{(152)}{=} \int\!\cdots\!\int\mu(dx_k)\,f_0(x_k)\prod_{j=1}^k f_{j-1}(x_{j-1})\,p(t_j - t_{j-1}, x_j, dx_{j-1})\;f_{k+1}(x_{k+1})\,p(t_{k+1} - t_k, x_k, dx_{k+1})$

$\overset{(151)}{=} \int\!\cdots\!\int\mu(dx_{k+1})\,f_0(x_{k+1})\prod_{j=1}^{k+1} f_{j-1}(x_{j-1})\,p(t_j - t_{j-1}, x_j, dx_{j-1}) = \mathbb{E}_\mu\prod_{j=0}^{k+1} f_j(X_{T-t_j}).$


In order to sample from π(x) using the overdamped Langevin dynamics (148) we first have to discretize the SDE. The Euler discretization gives

$X_{n+1} = X_n + \frac{h}{2}\nabla\log\pi(X_n) + \sqrt{h}\,\xi_n,\qquad \xi_n\sim\mathcal{N}(0, I).$ (153)

This can be written as

$X_{n+1}\sim\mathcal{N}\Big(X_n + \frac{h}{2}\nabla\log\pi(X_n),\ hI\Big).$ (154)

This defines a Markov chain $\{X_n\}_{n=0}^N$. It is not clear that this Markov chain has the same ergodic properties as the SDE.

We can choose to either accept or reject the proposed move $X_{n+1}$ with a certain probability. Introducing this accept-reject step to a given Markov chain leads to the Metropolis-Hastings algorithm.

The resulting Metropolis adjusted algorithm is always ergodic with respect to the target distribution π(x).
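A sketch of a single Metropolis adjusted Langevin (MALA) step built from (153)-(154) (the function and its handles logpi and gradlogpi are illustrative assumptions, not code from the notes):

function x = mala_step(x, h, logpi, gradlogpi)
%One MALA step: ULA proposal (153) plus Metropolis-Hastings correction.
%logpi and gradlogpi are handles for log pi(x) (up to a constant) and its gradient.
y = x + 0.5*h*gradlogpi(x) + sqrt(h)*randn(size(x));
logq_xy = -norm(y - x - 0.5*h*gradlogpi(x))^2/(2*h);   %log q(x,y), up to a constant
logq_yx = -norm(x - y - 0.5*h*gradlogpi(y))^2/(2*h);   %log q(y,x), up to a constant
if log(rand) < logpi(y) + logq_yx - logpi(x) - logq_xy
    x = y;    %accept; otherwise remain at x
end
end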


%SDE parameters
dt = 1.0e-2;
kappa = 0.1;
diffc = sqrt(2 * kappa * dt);   %noise amplitude (renamed from diff, which shadows a built-in)

%number of iterations
N = 10^6;

%batch count (N must be divisible by m)
m = floor(sqrt(N));

%Significance value for confidence intervals
Delta = 0.01;   % 99% CI

%Derivative of potential
Vprime = @(x) x^3 - x;

%Observable
f = @(x) x;

obs = zeros(1, N);
x = 0;

%simulate SDE (Euler-Maruyama)
for i = 1:N
    x = x - Vprime(x) * dt + diffc * randn(1);
    obs(i) = f(x);
end

%The estimator is then given by
av = mean(obs);
disp(['Mean ', num2str(av)]);

%Compute the confidence intervals using a batch-means estimator
B = reshape(obs, m, []);

%Compute batch means
bmeans = mean(B, 1);

%Compute batch variance
D = var(bmeans);
disp(['Variance ', num2str(D)]);

%Compute confidence interval
cinv = tinv(1 - Delta/2, m - 1) * sqrt(D/m);

%Plot the running average with confidence bands
hold on
plot((1:N) * dt, cumsum(obs)./(1:N), '-r');
xlim([10 N * dt])

hline = refline(0, av - cinv);
set(hline, 'LineStyle', ':')
hline = refline(0, av + cinv);
set(hline, 'LineStyle', ':')
ylabel('$f(X_t)$', 'interpreter', 'latex')
xlabel('$t$', 'interpreter', 'latex')
title('Estimating $f(x)=x$ from $e^{-(x^2-1)^2/4\kappa}$', 'interpreter', 'latex');
hold off


The Metropolis-Hastings Algorithm

We are given a target distribution π(x) on $\mathbb{R}^d$ that is known up to the normalisation constant.

We are also given a Markov chain Xn with transition kernel density q(x, y) = q(x − y). Examples include:

For the random walk with diffusion coefficient σ² we have $q(x,\cdot) = \mathcal{N}(x,\sigma^2)$.

For the Euler discretization of the overdamped Langevin dynamics we consider the Markov chain (154).

Given π and Xn, a proposed value $Y_{n+1}$ is generated from the density $q(X_n, y)$ and is then accepted with probability

$\alpha(x,y) = \begin{cases}\min\Big(\dfrac{\pi(y)\,q(y,x)}{\pi(x)\,q(x,y)},\ 1\Big) & \text{if } \pi(x)q(x,y) > 0,\\ 1 & \text{if } \pi(x)q(x,y) = 0.\end{cases}$ (155)

If the proposed value is accepted then set $X_{n+1} = Y_{n+1}$; otherwise set $X_{n+1} = X_n$.


The transition probability density for the Metropolis adjusted Markov chain is

$p(x,y) = q(x,y)\,\alpha(x,y),\quad \text{for } y\neq x,$ (156)

with probability of remaining at the same point given by

$r(x) = P(x,x) = \int q(x,y)\big[1 - \alpha(x,y)\big]\,dy.$

With this choice of the acceptance probability the resulting Markov chain is ergodic (in fact, reversible) with respect to the target measure π(dx):

$\pi(A) = \int\pi(x)\,P(x,A)\,dx,\quad x\in X,\ A\in\mathcal{B},$ (157)

where X denotes the state space and $\mathcal{B}$ the Borel sets.


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% rwmh: Implementation of Random-Walk Metropolis-Hastings
% Code written by A. Duncan
% Parameters:
%   X0: starting state.
%   Delta: step size
%   nsamp: number of samples
%   targetDistribution: function handle to the (unnormalized) density.
%
% Output:
%   X: [nsamp, length(X0)]-dimensional array of samples
%   acc: Array of accept-reject flags
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [X, acc] = rwmh(X0, Delta, nsamp, targetDistribution)
%Metropolis-Hastings algorithm

dim = length(X0);
X = zeros(nsamp, dim);
acc = zeros([1 nsamp]);

%set the initial state
X(1,:) = X0;

for i = 1:nsamp-1
    %Generate proposal
    Y = proposal(X(i,:), Delta);

    %Compute acceptance probability
    pX_given_Y = proposal_prob(Y, X(i,:), Delta);
    pY_given_X = proposal_prob(X(i,:), Y, Delta);

    piY = targetDistribution(Y);
    piX = targetDistribution(X(i,:));
    alpha = min(1, pX_given_Y * piY/(pY_given_X * piX));

    %Accept/Reject sample.
    if (rand < alpha)
        X(i+1,:) = Y;
        acc(i) = 1;
    else
        X(i+1,:) = X(i,:);
        acc(i) = 0;
    end
end

end

% RWMH proposal function
function Y = proposal(X, Delta)
Y = normrnd(X, sqrt(Delta), [1, length(X)]);
end

% RWMH proposal probability (up to a constant; symmetric in X and Y)
function prob = proposal_prob(Y, X, Delta)
d = dot((Y - X), (Y - X));
prob = exp(-d/(2 * Delta));
end


Given essentially any probability distribution π, the Metropolis-Hastings algorithm provides us with a method for generating a Markov chain Xn that has π as a stationary distribution.

When the transition kernel density q(·, ·) is symmetric, the formula for the acceptance probability simplifies to

$\alpha(x,y) = \min\Big(\frac{\pi(y)}{\pi(x)},\ 1\Big).$

For the symmetric Random Walk Metropolis algorithm (RWM), $q(x,\cdot) = \mathcal{N}(x,\sigma^2)$: set $X_n = x$, choose $Y_{n+1}\sim\mathcal{N}(x,\sigma^2)$, and set

$X_{n+1} = \begin{cases} Y_{n+1} & \text{with probability } \alpha(x, Y_{n+1}),\\ X_n & \text{with probability } 1 - \alpha(x, Y_{n+1}).\end{cases}$ (158)


We consider an MCMC algorithm for sampling from π(x), with or without an accept-reject step, giving a Markov chain Φn. The n-step transition probability is

$P^n(x,A) = \mathbb{P}\big(\Phi_n\in A\,|\,\Phi_0 = x\big),$

where x ∈ X (the state space) and $A\in\mathcal{B}$.

For the RWMH and the Metropolis adjusted Langevin algorithm (MALA) the chain converges to π in total variation:

$\|P^n(x,\cdot) - \pi\|_{TV} := \frac12\sup_{A\in\mathcal{B}}|P^n(x,A) - \pi(A)|\to 0.$ (159)

Under additional assumptions we also have a central limit theorem.

The rate of convergence to the target distribution and the asymptotic variance can be used to measure the efficiency of the MCMC algorithm.


For an MCMC algorithm to be efficient, it has to be geometrically ergodic (exponentially fast convergence to equilibrium).

Without the Metropolis-Hastings step, the (unadjusted) Euler discretization of the Langevin dynamics (ULA) might fail to be (geometrically) ergodic.

This depends on the tails of the target distribution.

We can test the convergence of different MCMC algorithms for a class of 1d distributions $\mathcal{E}(\beta,\gamma)$:

$\pi\in\mathcal{E}(\beta,\gamma) \iff \pi(x)\propto e^{-\gamma|x|^\beta},\quad |x| > x_0,$ (160)

for some x₀ and some positive constants γ, β.

The ULA Markov chain might fail to be (geometrically) ergodic even when the Langevin dynamics is, depending on the values of γ and β and on the step size.


Optimal Scaling for the Metropolis-Hastings Algorithm

For the RWMH algorithm we can choose the variance of the proposal: $q(x,\cdot) = \mathcal{N}(x,\sigma^2)$.

We want to choose the parameter σ in an optimal way.

The RWMH can perform arbitrarily badly for σ ≪ 1 and for σ ≫ 1. There exists an optimal value σ*.

We can measure the efficiency of the MCMC algorithm in terms of the asymptotic variance (for all f ∈ L²(π)). The efficiency of the algorithm depends on the acceptance rate

$a = \int \alpha(x,y)\,\pi(x)\,q(x,y)\,dx\,dy$ (161)

$\ \ = \lim_{n\to+\infty}\frac{1}{n}\,(\text{number of accepted moves}).$ (162)

Choosing the optimal scaling reduces the computational cost.


The optimal acceptance rate for RWMH is

$a_{opt} = 0.234.$

For MALA:

$a_{opt} = 0.574.$

The MALA algorithm asymptotically mixes considerably faster than the RWMH algorithm.


Ordering for MCMC algorithms

Consider the class S of diffusion processes that are ergodic with respect to π, the distribution from which we want to sample.

1. Choose Xt so that it converges as quickly as possible to equilibrium.

2. Choose Xt so that the asymptotic variance is minimized:

$\sigma_f^2 = \langle(-\mathcal{L})^{-1}\bar f,\ \bar f\rangle_\pi,\qquad \bar f = f - \mathbb{E}_\pi f.$ (163)

We can use (2) to introduce a partial ordering in S (Peskun-Tierney ordering).


We can consider the nonreversible dynamics (Hwang et al. 1993, 2005)

$dX_t^\gamma = \big(-\nabla V(X_t^\gamma) + \gamma(X_t^\gamma)\big)\,dt + \sqrt{2}\,dW_t,$ (164)

where the vector field γ is taken to be divergence-free with respect to the invariant distribution $\pi(dx) = Z^{-1}e^{-V}\,dx$:

$\nabla\cdot\big(\gamma\,e^{-V}\big) = 0.$ (165)

This ensures that π(dx) is still the invariant measure of the dynamics (164).

We can construct such vector fields by taking

$\gamma = J\nabla V,\qquad J = -J^T.$ (166)


The dynamics (164) is non-reversible: $(X_t^\gamma)_{0\leq t\leq T}$ has the same law as $(X_{T-t}^{-\gamma})_{0\leq t\leq T}$, and thus not the same law as $(X_{T-t}^{\gamma})_{0\leq t\leq T}$.

Equivalently, the system does not satisfy detailed balance: the stationary probability flux is not zero.

From (166) it is clear that there are many (in fact, infinitely many) different ways of modifying the reversible dynamics without changing the invariant measure.


We consider the nonreversible dynamics

$dX_t = (-I + \delta J)\nabla V(X_t)\,dt + \sqrt{2\beta^{-1}}\,dW_t,$ (167)

with δ ∈ ℝ and J the standard 2 × 2 antisymmetric matrix, i.e. J₁₂ = 1, J₂₁ = −1. For this class of nonreversible perturbations the parameter that we wish to choose in an optimal way is δ.

However, the numerical experiments illustrate that even a non-optimal choice of δ can significantly accelerate convergence to equilibrium.

We will use the potential

$V(x,y) = \frac14(x^2 - 1)^2 + \frac12 y^2.$ (168)


[Plot omitted: two curves, δ = 0 and δ = 10, of $\mathbb{E}(x^2 + y^2)$ against t ∈ [0, 5].]

Figure: Second moment as a function of time for (167) with the potential (168). We take zero initial conditions and β⁻¹ = 0.1.


STATISTICAL INFERENCE FOR STOCHASTIC DIFFERENTIAL EQUATIONS


Statistical Inference for SDEs

Many of the stochastic models that are used in applications include unknown parameters that have to be determined from observations.

We can use techniques from statistics to estimate parameters in the drift and diffusion coefficients.

We will consider one dimensional Itô SDEs of the form

$dX_t = b(X_t;\theta)\,dt + \sigma(X_t;\theta)\,dW_t,\quad X_0 = x,$ (169)

where $\theta\in\Theta\subset\mathbb{R}^N$ is a finite set of parameters that we want to estimate from observations.


We can consider either the case where only discrete observations are available, or the case where an entire path Xt, t ∈ [0, T], is observed.

The length of the path can be either fixed, or we can consider the case where the observation interval increases, T → +∞.

We can also consider noisy observations:

$Y_{t_j} = X_{t_j} + \xi\,\varepsilon_{t_j},\qquad \varepsilon_{t_j}\sim\mathcal{N}(0,1),\ \text{iid}.$


Examples

Estimate the diffusion coefficient of Brownian motion:

$dX_t = \sqrt{2\sigma}\,dW_t.$

Estimate the drift and diffusion coefficients of the (stationary) OU process:

$dX_t = -\alpha X_t\,dt + \sqrt{2\sigma}\,dW_t.$

Estimate the drift and diffusion coefficients $\theta = (A, B, \sigma_a, \sigma_b)$ in the Landau-Stuart equation with additive and multiplicative noise:

$dX_t = (A X_t - B X_t^3)\,dt + \sqrt{\sigma_a^2 + \sigma_b^2 X_t^2}\,dW_t.$ (170)

Estimate the volatility in geometric Brownian motion:

$dX_t = \mu X_t\,dt + \sigma X_t\,dW_t.$


Estimate the integrated stochastic volatility in the Heston model:

$dX_t = \mu X_t\,dt + \sqrt{\nu_t}\,X_t\,dW^1_t,$
$d\nu_t = \kappa(\theta - \nu_t)\,dt + \sigma\sqrt{\nu_t}\,dW^2_t,$

where $\mathbb{E}(W^1_t W^2_t) = \rho\,t$.

The integrated stochastic volatility is

$\sigma(T) = \int_0^T \nu_t X_t^2\,dt.$


To estimate parameters in the diffusion coefficient we can use the quadratic variation of the process Xt:

$\langle X, X\rangle_t := \int_0^t \sigma^2(X_s;\theta)\,ds = \lim_{\Delta t_k\to0}\sum_{t_k\leq t}|X_{t_{k+1}} - X_{t_k}|^2,$ (171)

where the limit is in probability.

When the diffusion coefficient is constant the convergence in (171) becomes almost sure:

$\lim_{n\to+\infty}\sum_{i=1}^{2^n}\big[X_{iT2^{-n}} - X_{(i-1)T2^{-n}}\big]^2 = \sigma^2 T,\quad a.s.$ (172)

If we fix the length of the observation window [0, T] and let the number of observations become infinite, n → +∞, we can in fact determine (not only estimate) the diffusion coefficient.

This is called the high frequency limit. T can be (arbitrarily) small.

For the estimation of the diffusion coefficient we do not need to assume that the process Xt is stationary.


Theorem
Let $\{X_j\}_{j=0}^J$ be a sequence of equidistant observations of the SDE $dX_t = b(X_t;\theta)\,dt + \sigma\,dW_t$ with time step ∆t = δ and Jδ = T fixed. Assume that the drift b(x; θ) is bounded and define

$\hat\sigma_J^2 = \frac{1}{J\delta}\sum_{j=0}^{J-1}\big(X_{j+1} - X_j\big)^2.$ (173)

Then

$|\mathbb{E}\hat\sigma_J^2 - \sigma^2| \leq C(\delta + \delta^{1/2}).$ (174)

In particular,

$\lim_{J\to+\infty}|\mathbb{E}\hat\sigma_J^2 - \sigma^2| = 0.$ (175)


We have

$X_{j+1} - X_j = \int_{j\delta}^{(j+1)\delta} b(X_s;\theta)\,ds + \sigma\,\Delta W_j,$

where $\Delta W_j = W_{(j+1)\delta} - W_{j\delta}\sim\mathcal{N}(0,\delta)$. We substitute this into (173):

$\hat\sigma_J^2 = \sigma^2\,\frac{1}{\delta J}\sum_{j=0}^{J-1}(\Delta W_j)^2 + \frac{2}{\delta J}\sum_{j=0}^{J-1} I_j M_j + \frac{1}{\delta J}\sum_{j=0}^{J-1} I_j^2,$

where

$I_j := \int_{j\delta}^{(j+1)\delta} b(X_s;\theta)\,ds$

and $M_j := \sigma\,\Delta W_j$.

Note that $\mathbb{E}(\Delta W_j)^2 = \delta$.


From the boundedness of b(x; θ) and using the Cauchy-Schwarz inequality we get

$\mathbb{E}I_j^2 \leq \delta\int_{j\delta}^{(j+1)\delta}\mathbb{E}\big(b(X_s;\theta)\big)^2\,ds \leq C\delta^2.$

Consequently:

$\big|\mathbb{E}\hat\sigma_J^2 - \sigma^2\big| \leq \frac{1}{\delta}\mathbb{E}I_j^2 + \frac{2}{\delta}\mathbb{E}\big|I_j M_j\big| \leq C\delta + \frac{C}{\delta}\Big(\frac{1}{\alpha}\mathbb{E}I_j^2 + \alpha\,\mathbb{E}M_j^2\Big) \leq C\big(\delta + \delta^{1/2}\big).$

In the above we used Cauchy's inequality with $\alpha = \delta^{1/2}$.


Assume that we have already estimated the diffusion coefficient. Consider the SDE

$dX_t = b(X_t;\theta)\,dt + dW_t.$ (176)

Our goal is to estimate the unknown parameters in the drift, θ ∈ Θ, from discrete observations.

We denote the true value by θ₀.

We will use the maximum likelihood approach, which is based on maximizing the likelihood function.

We first study the problem of estimating parameters in the distribution function of random variables using observations.


We consider a random variable X whose probability distribution function f(x|θ) is known up to parameters θ that we want to estimate from observations.

For example, X may be a Gaussian random variable, in which case the parameters to be estimated are the mean and variance, θ = (µ, σ).

Suppose that we have N independent observations of the random variable X.

We define the likelihood function

$L\big(\{x_i\}_{i=1}^N\,\big|\,\theta\big) = \prod_{i=1}^N f(x_i|\theta).$ (177)

The likelihood function is essentially the probability density function of the random variable X, viewed as a function of the parameters θ. The maximum likelihood estimator (MLE) is then

$\hat\theta = \arg\max L(x|\theta),$ (178)

with $x = \{x_i\}_{i=1}^N$.


The MLE is a random variable that depends on the observations $\{x_i\}_{i=1}^N$. When X is a Gaussian random variable, $X\sim\mathcal{N}(\mu,\sigma^2)$, the likelihood function takes the form

$L\big(\{x_i\}_{i=1}^N\,\big|\,\theta\big) = \Big(\frac{1}{2\pi\sigma^2}\Big)^{N/2}\exp\Big(-\frac{\sum_{i=1}^N(x_i-\mu)^2}{2\sigma^2}\Big).$

Maximizing (the log likelihood function) with respect to µ and σ² we obtain the maximum likelihood estimators

$\hat\mu = \frac{1}{N}\sum_{i=1}^N x_i,\qquad \hat\sigma^2 = \frac{1}{N}\sum_{i=1}^N(x_i - \hat\mu)^2.$ (179)

Notice that

$\mathbb{E}\hat\mu = \mu\quad\text{and}\quad \mathbb{E}\hat\sigma^2 = \frac{N-1}{N}\sigma^2.$


Under appropriate assumptions the maximum likelihood estimator is consistent: we can estimate the true values of the parameters as accurately as we wish if we have a sufficiently large number of observations N.

Use the LLN and the CLT to study the asymptotic (large N) properties of the MLE.

In particular, we can show that

$\lim_{N\to+\infty}\hat\theta = \theta_0,$

either in probability or almost surely, and

$\lim_{N\to+\infty}\sqrt{N}(\hat\theta - \theta_0) = \mathcal{N}(0, D^2),$

in distribution, for an appropriate constant D².


We want to use this idea in order to estimate the parameters in the drift of the SDE (176).

The (independent) observations of the random variable X are now replaced by observations of the process Xt: $\{X_i\}_{i=1}^N$ with $X_i = X_{ih}$ and hN = T.

We work either with N equidistant observations or with an entire path Xt, t ∈ [0, T].

Assume then that a path of the process Xt is observed.

The analogue of the likelihood function (177) is the law of the process on path space.

To obtain a formula for the likelihood function we need to use Girsanov's theorem.


The law of Xt, denoted by $\mathbb{P}_X$, is absolutely continuous with respect to the Wiener measure $\mathbb{P}_W$, the law of Brownian motion.

The density of $\mathbb{P}_X$ with respect to the Wiener measure is given by the Radon-Nikodym derivative:

$\frac{d\mathbb{P}_X}{d\mathbb{P}_W} = \exp\Big(\int_0^T b(X_s;\theta)\,dX_s - \frac12\int_0^T\big(b(X_s;\theta)\big)^2\,ds\Big) =: L\big(\{X_t\}_{t\in[0,T]};\theta\big).$ (180)

The maximum likelihood estimator (MLE) is defined as

$\hat\theta = \arg\max_{\theta\in\Theta} L\big(\{X_t\}_{t\in[0,T]};\theta\big).$ (181)

The MLE is a random variable that depends on the path $\{X_t\}_{t\in[0,T]}$.


Assume that the diffusion process (176) is stationary.

The MLE (181) is asymptotically unbiased: in the limit as the window of observation becomes infinite, T → +∞, the MLE $\hat\theta$ converges to the true value θ₀.

Assume that there are N parameters to be estimated, θ = (θ₁, ..., θ_N). The MLE is obtained by solving the (generally nonlinear) system of equations

$\frac{\partial L}{\partial\theta_i} = 0,\quad i = 1,\dots,N.$ (182)

The solution of this system of equations can be expressed in terms of functionals (e.g. moments) of the observed path $\{X_t\}_{t\in[0,T]}$:

$\hat\theta = F\big(\{X_t\}_{t\in[0,T]}\big).$


Example (MLE for the stationary OU process). Consider the stationary OU process

$dX_t = -\alpha X_t\,dt + dW_t$ (183)

with $X_0\sim\mathcal{N}\big(0, \frac{1}{2\alpha}\big)$. The log likelihood function is

$\log L = -\alpha\int_0^T X_t\,dX_t - \frac{\alpha^2}{2}\int_0^T X_t^2\,dt.$

We solve the equation $\frac{\partial\log L}{\partial\alpha} = 0$, from which we obtain

$\hat\alpha = -\frac{\int_0^T X_t\,dX_t}{\int_0^T X_t^2\,dt} =: -\frac{B_1(\{X_t\}_{t\in[0,T]})}{M_2(\{X_t\}_{t\in[0,T]})},$ (184)

where we have used the notation

$B_n(\{X_t\}_{t\in[0,T]}) := \int_0^T X_t^n\,dX_t,\qquad M_n(\{X_t\}_{t\in[0,T]}) := \int_0^T X_t^n\,dt.$ (185)


Given a set of discrete equidistant observations $\{X_j\}_{j=0}^J$, $X_j = X_{j\Delta t}$, $\Delta X_j = X_{j+1} - X_j$, formula (184) can be approximated by

$\hat\alpha = -\frac{\sum_{j=0}^{J-1} X_j\,\Delta X_j}{\sum_{j=0}^{J-1}|X_j|^2\,\Delta t}.$ (186)

The MLE (186) becomes asymptotically unbiased in the large sample limit J → +∞, ∆t fixed.
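A sketch of (186) in practice (the true parameter and step size are illustrative assumptions):

%A sketch: discrete drift MLE (186) computed from a simulated OU path.
alpha0 = 1; dt = 1e-2; J = 1e5;
X = zeros(1, J+1);
for j = 1:J
    X(j+1) = X(j) - alpha0*X(j)*dt + sqrt(dt)*randn;   %EM path of (183)
end
dX = diff(X);
alphahat = -sum(X(1:J).*dX)/(sum(X(1:J).^2)*dt);       %formula (186)
disp(alphahat);   %approaches alpha0 for large J, up to a small O(dt) bias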


[Plot omitted: $\hat\alpha$ against T, for T up to 6000.]

Figure: MLE for the OU process.


Using Itô's formula we can obtain an alternative formula for the maximum likelihood estimator of the drift coefficient of the OU process.

The numerator in (184) can be written as

$\int_0^t X_s\,dX_s = -\alpha\int_0^t X_s^2\,ds + \int_0^t X_s\,dW_s.$

We apply Itô's formula to the function $V(x) = \frac12 x^2$ to obtain

$dV(X_t) = -\alpha X_t^2\,dt + \frac12\,dt + X_t\,dW_t.$

We combine the above two equations to obtain

$\int_0^t X_s\,dX_s = \frac{X_t^2 - X_0^2 - t}{2}.$

The formula for the MLE now becomes

$\hat\alpha = -\frac{X_T^2 - X_0^2 - T}{2\int_0^T X_t^2\,dt}.$ (187)


Example
Consider the following generalization of the previous example:

$dX_t = \alpha\,b(X_t)\,dt + dW_t,$ (188)

where b(x) is such that the equation has a unique ergodic solution. The log likelihood function is

$\log L = \alpha\int_0^T b(X_t)\,dX_t - \frac{\alpha^2}{2}\int_0^T b(X_t)^2\,dt.$

The MLE is

$\hat\alpha = \frac{\int_0^T b(X_t)\,dX_t}{\int_0^T\big(b(X_t)\big)^2\,dt}.$


Example (MLE for a stationary bistable SDE). Consider the SDE

$dX_t = \big(\alpha X_t - \beta X_t^3\big)\,dt + dW_t.$ (189)

This SDE is of the form $dX_t = -V'(X_t)\,dt + dW_t$ with $V(x) = -\frac{\alpha}{2}x^2 + \frac{\beta}{4}x^4$ and is ergodic with invariant distribution

$\rho(x) = Z^{-1}e^{-2V(x)}.$

Our goal is to estimate the coefficients α and β from observations using the maximum likelihood approach. The log likelihood function reads

$\log L = \int_0^T\big(\alpha X_t - \beta X_t^3\big)\,dX_t - \frac12\int_0^T\big(\alpha X_t - \beta X_t^3\big)^2\,dt = \alpha B_1 - \beta B_3 - \frac12\alpha^2 M_2 - \frac12\beta^2 M_6 + \alpha\beta M_4,$

using the notation (185).


Example (continued). Equations (182) become
\[
\frac{\partial \log L}{\partial \alpha}(\hat\alpha, \hat\beta) = 0, \qquad \frac{\partial \log L}{\partial \beta}(\hat\alpha, \hat\beta) = 0.
\]
This leads to a linear system of equations,
\[
\begin{pmatrix} M_2 & -M_4 \\ M_4 & -M_6 \end{pmatrix}
\begin{pmatrix} \hat\alpha \\ \hat\beta \end{pmatrix}
=
\begin{pmatrix} B_1 \\ B_3 \end{pmatrix},
\]
the solution of which is
\[
\hat\alpha = \frac{B_1 M_6 - B_3 M_4}{M_2 M_6 - M_4^2}, \qquad
\hat\beta = \frac{B_1 M_4 - B_3 M_2}{M_2 M_6 - M_4^2}. \tag{190}
\]
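A minimal MATLAB sketch of (190), with a path of (189) simulated via Euler-Maruyama (parameter values and variable names are ours):

% Sketch: MLE (190) for the bistable SDE (189).
alpha_true = 1; beta_true = 1; T = 5000; dt = 0.005; J = floor(T/dt);
X = zeros(J+1,1); X(1) = 1;
for j = 1:J
    X(j+1) = X(j) + (alpha_true*X(j) - beta_true*X(j)^3)*dt + sqrt(dt)*randn;
end
dX = diff(X); Y = X(1:end-1);
B1 = sum(Y.*dX);    B3 = sum(Y.^3.*dX);                     % stochastic integrals B_n
M2 = sum(Y.^2)*dt;  M4 = sum(Y.^4)*dt;  M6 = sum(Y.^6)*dt;  % time integrals M_n
D = M2*M6 - M4^2;
alpha_hat = (B1*M6 - B3*M4)/D
beta_hat  = (B1*M4 - B3*M2)/D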


Figure: MLE estimators $\hat\alpha$, $\hat\beta$ for a bistable potential, plotted against the observation time $T$.


The rigorous justification of the MLE is based on Girsanov's theorem. Here we present a heuristic derivation of (180) that is based on the Euler-Maruyama discretization of (176):
\[
X_{n+1} - X_n = b(X_n; \theta) \, \Delta t + \Delta W_n, \tag{191}
\]
where $X_n = X(n\Delta t)$ and $\Delta W_n = W_{n+1} - W_n =: \xi_n$. The increments $\xi_n \sim \mathcal{N}(0, \Delta t)$ are mutually independent.

Our goal is to calculate the Radon-Nikodym derivative of the law of the discrete-time process $\{X_n\}_{n=0}^{N-1}$ with respect to the law of the discretized Brownian motion. In the discrete case this derivative becomes the ratio between the (joint) density functions of the two processes.


We rewrite (191) in the form
\[
\Delta X_n = b_n \, \Delta t + \xi_n, \tag{192}
\]
where $b_n := b(X_n; \theta)$, or, normalizing the noise,
\[
\Delta X_n = b_n \, \Delta t + \xi_n \sqrt{\Delta t}, \qquad \xi_n \sim \mathcal{N}(0,1) \ \text{i.i.d.} \tag{193}
\]
The density of the discretized Brownian motion is
\[
p_W^N = \prod_{i=0}^{N-1} \frac{1}{\sqrt{2\pi\Delta t}} \exp\Big(-\frac{1}{2\Delta t}(\Delta W_i)^2\Big)
= \frac{1}{\big(\sqrt{2\pi\Delta t}\big)^N} \exp\Big(-\frac{1}{2\Delta t} \sum_{i=0}^{N-1} (\Delta W_i)^2\Big). \tag{194}
\]


Similarly, for the law of the discretized process $\{X_n\}_{n=0}^{N-1}$, using the fact that $p(X_{i+1}|X_i) \sim \mathcal{N}(X_i + b_i \Delta t, \Delta t)$, we can write
\[
p_X^N = \frac{1}{\big(\sqrt{2\pi\Delta t}\big)^N} \exp\Big(-\sum_{i=0}^{N-1} \Big(\frac{1}{2\Delta t}(\Delta X_i)^2 + \frac{1}{2} b_i^2 \, \Delta t - b_i \, \Delta X_i\Big)\Big). \tag{195}
\]
Now we can calculate the ratio of the laws of the two processes, evaluated along the path $\{X_n\}_{n=0}^{N-1}$:
\[
\frac{dP_X^N}{dP_W^N} = \exp\Big(-\frac{1}{2}\sum_{i=0}^{N-1} b_i^2 \, \Delta t + \sum_{i=0}^{N-1} b_i \, \Delta X_i\Big).
\]
Passing now (formally) to the limit $N \to +\infty$, $\Delta t \to 0$ with $T = N\Delta t$ fixed, we obtain (180).
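The discrete likelihood ratio above is easy to evaluate on data. A sketch (names ours) for a generic drift handle b, which can then be profiled over the parameter:

% Sketch: discrete Girsanov log-likelihood log(dP_X/dP_W) for a path X
% with time step dt and drift function handle b.
loglik = @(b, X, dt) sum(b(X(1:end-1)).*diff(X)) - 0.5*sum(b(X(1:end-1)).^2)*dt;
% Example: profile the OU log-likelihood over alpha (X, dt from earlier sketches):
% alphas = linspace(0.1, 2, 50);
% ll = arrayfun(@(a) loglik(@(x) -a*x, X, dt), alphas);
% plot(alphas, ll)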


Lamperti's Transformation and Girsanov's Theorem

For SDEs in one dimension it is possible to map multiplicative noise to additive noise. Consider a one-dimensional Itô SDE with multiplicative noise,
\[
dX_t = f(X_t) \, dt + \sigma(X_t) \, dW_t. \tag{196}
\]
We ask whether there exists a transformation $z = h(x)$ that maps (196) into an SDE with additive noise. Setting $Z_t = h(X_t)$ and applying Itô's formula, we obtain
\[
dZ_t = \mathcal{L}h(X_t) \, dt + h'(X_t)\sigma(X_t) \, dW_t,
\]
where $\mathcal{L}$ denotes the generator of $X_t$.


In order to obtain an SDE with unit diffusion coefficient we need to impose the condition
\[
h'(x)\sigma(x) = 1,
\]
from which we deduce that
\[
h(x) = \int_{x_0}^x \frac{1}{\sigma(y)} \, dy, \tag{197}
\]
where $x_0$ is arbitrary. We have
\[
\mathcal{L}h(x) = \frac{f(x)}{\sigma(x)} - \frac{1}{2}\sigma'(x).
\]
Consequently, the transformed SDE has the form
\[
dY_t = f_Y(Y_t) \, dt + dW_t \tag{198}
\]
with
\[
f_Y(y) = \frac{f(h^{-1}(y))}{\sigma(h^{-1}(y))} - \frac{1}{2}\sigma'(h^{-1}(y)).
\]
This is called the Lamperti transformation.


Consider the Cox-Ingersoll-Ross (CIR) SDE
\[
dX_t = (\mu - \alpha X_t) \, dt + \sigma \sqrt{X_t} \, dW_t, \qquad X_0 = x > 0, \tag{199}
\]
where $\mu, \alpha, \sigma > 0$. From (197) we deduce $h(x) = \frac{2}{\sigma}\sqrt{x}$. The generator of the CIR process is
\[
\mathcal{L} = (\mu - \alpha x)\frac{d}{dx} + \frac{\sigma^2}{2} x \frac{d^2}{dx^2}.
\]
We have
\[
\mathcal{L}h(x) = \Big(\frac{\mu}{\sigma} - \frac{\sigma}{4}\Big) x^{-1/2} - \frac{\alpha}{\sigma} x^{1/2}.
\]
The CIR SDE becomes, for $Y_t = \frac{2}{\sigma}\sqrt{X_t}$,
\[
dY_t = \Big(\frac{\mu}{\sigma} - \frac{\sigma}{4}\Big)\frac{1}{\sqrt{X_t}} \, dt - \frac{\alpha}{\sigma}\sqrt{X_t} \, dt + dW_t
= \Big[\Big(\frac{2\mu}{\sigma^2} - \frac{1}{2}\Big)\frac{1}{Y_t} - \frac{\alpha}{2} Y_t\Big] dt + dW_t.
\]
When $\mu = \frac{\sigma^2}{4}$ the equation above becomes the Ornstein-Uhlenbeck process for $Y_t$.
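A sketch (parameter values ours) that simulates the CIR process through its Lamperti transform: the transformed equation has additive noise, so a plain Euler-Maruyama step does not suffer from the square-root diffusion degenerating near zero, and we map back at the end.

% Sketch: simulate CIR via its Lamperti transform Y = (2/sigma)*sqrt(X).
mu = 0.5; alpha = 1; sigma = 0.3; T = 10; dt = 1e-3; N = floor(T/dt);
Y = zeros(N+1,1); Y(1) = (2/sigma)*sqrt(0.5);   % X_0 = 0.5
c = 2*mu/sigma^2 - 0.5;                         % coefficient of 1/Y in the drift
for n = 1:N
    Y(n+1) = Y(n) + (c/Y(n) - 0.5*alpha*Y(n))*dt + sqrt(dt)*randn;
end
X = (sigma/2)^2 * Y.^2;                         % map back to the CIR variable
plot((0:N)*dt, X)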


The Lamperti transformation is very useful when estimating parameters for an SDE model using the MLE approach.

Such a transformation does not exist for arbitrary SDEs in higher dimensions. In particular, it is not possible, in general, to transform a multidimensional Itô SDE with multiplicative noise into an SDE with additive noise.

Multidimensional diffusion processes for which the reduction from multiplicative to additive noise is possible are called reducible.

Conditions on the coefficients of an SDE so that it is reducible are obtained in Aït-Sahalia, Closed-form likelihood expansions for multivariate diffusions, Ann. Statist. 36(2) (2008), pp. 906-937.


Conversely, it is also sometimes possible to remove the drift term from an SDE and obtain an equation where only (multiplicative) noise is present. Consider the one-dimensional SDE
\[
dX_t = b(X_t) \, dt + dW_t. \tag{200}
\]
We introduce the following functional of $X_t$:
\[
M_t = \exp\Big(-\frac{1}{2}\int_0^t b^2(X_s) \, ds - \int_0^t b(X_s) \, dW_s\Big). \tag{201}
\]
We can write $M_t = e^{-Y_t}$, where $Y_t$ is the solution of the SDE
\[
dY_t = \frac{1}{2} b^2(X_t) \, dt + b(X_t) \, dW_t, \qquad Y_0 = 0. \tag{202}
\]
We can now apply Itô's formula to obtain the SDE
\[
dM_t = -M_t b(X_t) \, dW_t. \tag{203}
\]
Notice that this is an equation without drift.


Under appropriate conditions on the drift $b(\cdot)$, it is possible to show that the law of the process $X_t$, denoted by $\mathbb{P}$, which is a probability measure on the space of continuous functions, is absolutely continuous with respect to the Wiener measure $\mathbb{P}_W$, the law of the Brownian motion $W_t$.

The Radon-Nikodym derivative between these two measures is the inverse of the stochastic process $M_t$ given in (201):
\[
\frac{d\mathbb{P}}{d\mathbb{P}_W}(X) = \exp\Big(\frac{1}{2}\int_0^t b^2(X_s) \, ds + \int_0^t b(X_s) \, dW_s\Big). \tag{204}
\]
This is a form of Girsanov's theorem. Using now equation (200) we can rewrite (204) as
\[
\frac{d\mathbb{P}}{d\mathbb{P}_W}(X) = \exp\Big(\int_0^t b(X_s) \, dX_s - \frac{1}{2}\int_0^t |b(X_s)|^2 \, ds\Big). \tag{205}
\]


A form of Girsanov's theorem that is very useful in statistical inference for diffusion processes is the following. Consider the two SDEs
\[
dX_t = b_1(X_t) \, dt + \sigma(X_t) \, dW_t, \qquad X_0 = x_1, \quad t \in [0,T], \tag{206a}
\]
\[
dX_t = b_2(X_t) \, dt + \sigma(X_t) \, dW_t, \qquad X_0 = x_2, \quad t \in [0,T], \tag{206b}
\]
where $\sigma(x) > 0$. We assume existence and uniqueness of strong solutions for both SDEs, and that $x_1$ and $x_2$ are either random variables with densities $f_1(x)$, $f_2(x)$ with respect to the Lebesgue measure which have the same support, or nonrandom and equal to the same constant.

Let $\mathbb{P}_1$ and $\mathbb{P}_2$ denote the laws of these two SDEs. Then these two measures are equivalent and their Radon-Nikodym derivative is
\[
\frac{d\mathbb{P}_2}{d\mathbb{P}_1}(X) = \frac{f_2(X_0)}{f_1(X_0)} \exp\Big(\int_0^T \frac{b_2(X_t) - b_1(X_t)}{\sigma^2(X_t)} \, dX_t - \frac{1}{2}\int_0^T \frac{b_2^2(X_t) - b_1^2(X_t)}{\sigma^2(X_t)} \, dt\Big). \tag{207}
\]


Example: estimation of the diffusion and the drift for the OU process

% Estimate parameters in the OU process dX = -alpha*X dt + sqrt(2*lambda) dW.
% Drift via MLE, diffusion via the quadratic variation.
% Input: q, a two-column array [t, x] of equidistant observation times
% and the corresponding path of the SDE.
function [alpha, lambda, t, x] = param_est(q)
t = q(:,1); x = q(:,2);
figure; plot(t, x, 'Linewidth', 2)
hold on
N = length(t); dt = t(2) - t(1); T = (N-1)*dt;                 % total observation time
lambda = (1/(2*T))*sum(diff(x).^2)                             % quadratic-variation estimator
alpha = -sum(x(1:end-1).*diff(x))/(sum(x(1:end-1).^2)*dt)      % discretized MLE (186)
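A possible usage (ours), assuming the function above is saved as param_est.m: generate a synthetic path with $\alpha = 1$, $\lambda = 0.5$ and check that the estimators recover these values.

% Sketch: generate dX = -X dt + sqrt(2*0.5) dW and call param_est.
dt = 0.01; T = 1000; N = floor(T/dt); t = (0:N)'*dt;
x = zeros(N+1,1);
for n = 1:N
    x(n+1) = x(n) - x(n)*dt + sqrt(2*0.5*dt)*randn;
end
[alpha_hat, lambda_hat] = param_est([t, x])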


The MLE (181) depends on the path $\{X_t\}_{t\in[0,T]}$ and consequently it is a random variable. We have to prove that, in an appropriate limit and under appropriate assumptions on the diffusion process $X_t$, the MLE converges to the true value $\theta_0$, and we also want to obtain information about the fluctuations around the limiting value $\theta_0$.

Assuming that $X_t$ is stationary, we can prove that the MLE $\hat\theta$ converges, in the limit as $T \to +\infty$ (assuming that the entire path $\{X_t\}_{t\in[0,T]}$ is available to us), to $\theta_0$.

Furthermore, we can prove asymptotic normality of the maximum likelihood estimator:
\[
\sqrt{T}\big(\hat\theta - \theta_0\big) \to \mathcal{N}(0, \sigma^2), \tag{208}
\]
for a variance $\sigma^2$ that can be calculated.


Theorem. Let $X_t$ be the stationary OU process
\[
dX_t = -\alpha X_t \, dt + dW_t, \qquad X_0 \sim \mathcal{N}\Big(0, \frac{1}{2\alpha}\Big),
\]
and let $\hat\alpha$ denote the MLE (184). Then, in distribution,
\[
\lim_{T\to+\infty} \sqrt{T}\big(\hat\alpha - \alpha\big) = \mathcal{N}(0, 2\alpha). \tag{209}
\]


For the proof of this theorem we will need the following result from probability theory.

Theorem (Slutsky). Let $\{X_n\}_{n=1}^{+\infty}$, $\{Y_n\}_{n=1}^{+\infty}$ be sequences of random variables such that $X_n$ converges in distribution to a random variable $X$ and $Y_n$ converges in probability to a constant $c \neq 0$. Then, in distribution,
\[
\lim_{n\to+\infty} Y_n^{-1} X_n = c^{-1} X.
\]


Proof of Theorem 57. First we observe that
\[
\hat\alpha = -\frac{\int_0^T X_t \, dX_t}{\int_0^T X_t^2 \, dt} = \alpha - \frac{\int_0^T X_t \, dW_t}{\int_0^T X_t^2 \, dt}.
\]
Consequently,
\[
\hat\alpha - \alpha = -\frac{\int_0^T X_t \, dW_t}{\int_0^T X_t^2 \, dt}
= -\frac{1}{\sqrt{T}} \, \frac{\frac{1}{\sqrt{T}}\int_0^T X_t \, dW_t}{\frac{1}{T}\int_0^T X_t^2 \, dt}
\overset{\text{Law}}{=} -\frac{1}{\sqrt{T}} \, \frac{W\big(\frac{1}{T}\int_0^T X_t^2 \, dt\big)}{\frac{1}{T}\int_0^T X_t^2 \, dt},
\]
where the scaling property of Brownian motion was used, together with
\[
\int_0^T f(s) \, dW(s) = W\Big(\int_0^T f^2(s) \, ds\Big). \tag{210}
\]


The process $X_t$ is stationary. We use the ergodic theorem for stationary Markov processes to obtain
\[
\lim_{T\to+\infty} \frac{1}{T}\int_0^T X_t^2 \, dt = \mathbb{E}X_t^2 = \frac{1}{2\alpha}. \tag{211}
\]
Let now $Y = \mathcal{N}(0, \sigma^2)$ with $\sigma^2 = \frac{1}{2\alpha}$; we can write $Y = W(\sigma^2)$. We use the Hölder continuity of Brownian motion to conclude that, almost surely,
\[
\Big| W\Big(\frac{1}{T}\int_0^T X_t^2 \, dt\Big) - W\Big(\frac{1}{2\alpha}\Big) \Big| \tag{212}
\]
\[
\qquad \leqslant \text{Höl}(W) \, \Big| \frac{1}{T}\int_0^T X_t^2 \, dt - \frac{1}{2\alpha} \Big|^{\frac{1}{2}-\epsilon}, \tag{213}
\]
where Höl(W) denotes the Hölder constant of Brownian motion.


We have, in distribution,
\[
\lim_{T\to+\infty} \frac{1}{\sqrt{T}}\int_0^T X_t \, dW_t = \mathcal{N}\Big(0, \frac{1}{2\alpha}\Big). \tag{214}
\]
We combine (211) with (214) and use Slutsky's theorem to conclude that, in distribution,
\[
\lim_{T\to+\infty} \sqrt{T}\big(\hat\alpha - \alpha\big) = \mathcal{N}(0, 2\alpha). \tag{215}
\]
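A Monte Carlo sketch of (215) (ours; sample sizes are illustrative, and the time step is chosen small enough that the discretization bias of (186) is negligible relative to the statistical fluctuations):

% Sketch: check sqrt(T)*(alpha_hat - alpha) ~ N(0, 2*alpha) over many paths.
alpha = 1; T = 100; dt = 0.01; J = floor(T/dt); M = 500;
Z = zeros(M,1);
for m = 1:M
    X = zeros(J+1,1); X(1) = sqrt(1/(2*alpha))*randn;   % stationary start
    for j = 1:J
        X(j+1) = X(j) - alpha*X(j)*dt + sqrt(dt)*randn;
    end
    a_hat = -sum(X(1:end-1).*diff(X))/(sum(X(1:end-1).^2)*dt);
    Z(m) = sqrt(T)*(a_hat - alpha);
end
fprintf('sample variance %.2f, predicted 2*alpha = %.2f\n', var(Z), 2*alpha)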


Econometrics: Market Microstructure Noise
Zhang, Mykland, Aït-Sahalia, J. American Statistical Association 100(472) (2005), pp. 1394-1411.

$S_t$ is the price process of a security and $X_t = \log(S_t)$ is the solution of
\[
dX_t = \mu_t \, dt + \sigma_t \, dW_t^1, \tag{216}
\]
where $\mu_t$, $\sigma_t$ are stochastic processes. For example, we can take (Heston model)
\[
\mu_t = \mu - \nu_t/2, \qquad \nu_t = \sigma_t^2,
\]
and
\[
d\nu_t = \kappa(\alpha - \nu_t) \, dt + \gamma \nu_t^{1/2} \, dW_t^2, \tag{217}
\]
with $\mathbb{E}(dW_t^1 \, dW_t^2) = \rho \, dt$.

Goal: estimate the integrated stochastic volatility of $X_t$ from noisy observations $Y_t$.


Assume that we have market microstructure (observation error):
\[
Y_{t_i} = X_{t_i} + \varepsilon_{t_i}, \qquad i = 0, \dots, N, \tag{218}
\]
where the $\varepsilon_{t_i}$ are i.i.d. with $\mathbb{E}\varepsilon_{t_i} = 0$ and $\mathbb{E}\varepsilon_{t_i}^2 = \sigma_\varepsilon^2$.

Continuous-time modelling of discrete-time data: the quadratic-variation estimator of the volatility fails when the data is sampled at the highest frequencies. In the absence of market microstructure,
\[
\lim_{\Delta t \to 0} [X,X]_T = \int_0^T \sigma_t^2 \, dt,
\]
in probability, where
\[
[X,X]_T = \sum_{t_i} \big(X_{t_{i+1}} - X_{t_i}\big)^2. \tag{219}
\]


On the other hand, for the noisy measurements,
\[
\lim_{\Delta t \to 0} [Y,Y]_T = 2N\sigma_\varepsilon^2 + O(N^{1/2}).
\]
Thus $\frac{1}{2N}\lim_{\Delta t \to 0}[Y,Y]_T$ provides us with an estimate of the variance of the noise!

The standard estimator for the integrated volatility gives us the wrong result; we need to ignore the highest-frequency data. This is called the "fifth best estimator" in Aït-Sahalia et al.


Fourth best estimator: sample at an arbitrary sparse frequency $n_s$:
\[
[Y,Y]_T^{\mathrm{sp}} \approx \langle X, X\rangle_T + 2 n_s \sigma_\varepsilon^2 + C(\varepsilon, \sigma_t) Z, \qquad Z \sim \mathcal{N}(0,1).
\]
Third best estimator: sample at the optimal sparse frequency, obtained by minimizing the mean squared error:
\[
n^{\mathrm{opt}} = \Big(\frac{T}{4(\sigma_\varepsilon^2)^2} \int_0^T \sigma_t^4 \, dt\Big)^{1/3}.
\]
Since we are then using small sample sizes, the variance can be quite large (we use only about one out of every 300 observations).


Second best estimator: average over the data (moving average). Averaging over $K$ grids of average size $\bar{n}$ we have
\[
[Y,Y]_T^{\mathrm{avg}} \approx \langle X, X\rangle_T + 2\bar{n}\sigma_\varepsilon^2 + C(\varepsilon, \sigma_t) Z, \qquad Z \sim \mathcal{N}(0,1).
\]
The optimal choice of $\bar{n}$ is
\[
\bar{n}^{\mathrm{opt}} = \Big(\frac{T}{6(\sigma_\varepsilon^2)^2} \int_0^T \sigma_t^4 \, dt\Big)^{1/3}.
\]
The first best estimator: subsampling, averaging and bias correction. Use the fifth best estimator to estimate the noise and subtract the resulting bias off the averaged estimator:
\[
\widehat{\langle X, X\rangle}_T = [Y,Y]_T^{\mathrm{avg}} - \frac{\bar{n}}{N}[Y,Y]_T^{\mathrm{all}}.
\]
This estimator is asymptotically unbiased:
\[
\widehat{\langle X, X\rangle}_T = \langle X, X\rangle_T + N^{-1/6} C(\varepsilon, \sigma_t) Z, \qquad Z \sim \mathcal{N}(0,1).
\]
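A sketch of the subsample-average-and-correct construction on synthetic data (ours; constant volatility for simplicity, so that $\langle X,X\rangle_T = \sigma^2 T$):

% Sketch: two-scale realized volatility with i.i.d. microstructure noise.
sigma = 0.2; sig_eps = 0.05; T = 1; N = 23400; dt = T/N;
X = cumsum([0; sigma*sqrt(dt)*randn(N,1)]);   % efficient log-price
Y = X + sig_eps*randn(N+1,1);                 % observed log-price
rv_all = sum(diff(Y).^2);                     % "fifth best": dominated by noise
noise_var = rv_all/(2*N)                      % estimate of sig_eps^2 (= 0.0025)
K = 300;                                      % number of subgrids
rv_k = zeros(K,1);
for k = 1:K
    rv_k(k) = sum(diff(Y(k:K:end)).^2);       % sparse RV on the k-th subgrid
end
rv_avg = mean(rv_k);
nbar = (N - K + 1)/K;                         % average subgrid size
tsrv = rv_avg - (nbar/N)*rv_all               % bias-corrected; compare sigma^2*T = 0.04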


Thermal Motion in a Two-Scale Potential
A.M. Stuart and G.A. Pavliotis, J. Stat. Phys. 127(4) (2007), pp. 741-781.

Consider the SDE
\[
dx^\varepsilon(t) = -\nabla V\Big(x^\varepsilon(t), \frac{x^\varepsilon(t)}{\varepsilon}\Big) \, dt + \sqrt{2\sigma} \, dW(t),
\]
with a separable potential, linear in the coefficient $\alpha$:
\[
V(x, y; \alpha) := \alpha V(x) + p(y),
\]
where $p(y)$ is a mean-zero smooth periodic function. Then $x^\varepsilon(t) \Rightarrow X(t)$ weakly in $C([0,T];\mathbb{R}^d)$, where $X(t)$ is the solution of the homogenized equation
\[
dX(t) = -\alpha K \nabla V(X(t)) \, dt + \sqrt{2\sigma K} \, dW(t).
\]


Figure: Bistable potential with periodic fluctuations.


The coefficients $A$, $\Sigma$ are given by the standard homogenization formulas.

Goal: fit a time series of $x^\varepsilon(t)$, the solution of the multiscale SDE above, to the homogenized SDE.

Problem: the data is not compatible with the homogenized equation at small scales. This is a form of model misspecification. Similar difficulties arise when studying inverse problems for PDEs with a multiscale structure.


In one dimension,
\[
dx^\varepsilon(t) = -\alpha V'(x^\varepsilon(t)) \, dt - \frac{1}{\varepsilon} p'\Big(\frac{x^\varepsilon(t)}{\varepsilon}\Big) \, dt + \sqrt{2\sigma} \, dW(t).
\]
The homogenized equation is
\[
dX(t) = -A V'(X(t)) \, dt + \sqrt{2\Sigma} \, dW(t),
\]
where $(A, \Sigma)$ are given by
\[
A = \frac{\alpha L^2}{Z \widehat{Z}}, \qquad \Sigma = \frac{\sigma L^2}{Z \widehat{Z}}, \qquad
Z = \int_0^L e^{-p(y)/\sigma} \, dy, \quad \widehat{Z} = \int_0^L e^{p(y)/\sigma} \, dy.
\]
$A$ and $\Sigma$ decay to 0 exponentially fast as $\sigma \to 0$. The homogenized coefficients satisfy the detailed-balance relation
\[
\frac{A}{\alpha} = \frac{\Sigma}{\sigma}.
\]


We are given a path of
\[
dx^\varepsilon(t) = -\alpha V'(x^\varepsilon(t)) \, dt - \frac{1}{\varepsilon} p'\Big(\frac{x^\varepsilon(t)}{\varepsilon}\Big) \, dt + \sqrt{2\sigma} \, d\beta(t),
\]
and we want to fit the data to
\[
dX(t) = -A V'(X(t)) \, dt + \sqrt{2\Sigma} \, d\beta(t).
\]
It is reasonable to assume that we have some information on the large-scale structure of the potential $V(x)$; we do not assume that we know anything about the small-scale fluctuations.


We fit the drift and diffusion coefficients via maximum likelihood and quadratic variation, respectively. For simplicity we fit scalars $A$, $\Sigma$ in
\[
dx(t) = -A \nabla V(x(t)) \, dt + \sqrt{2\Sigma} \, dW(t).
\]
The Radon-Nikodym derivative of the law of this SDE with respect to Wiener measure is
\[
L = \exp\Big(-\frac{1}{\Sigma}\int_0^T \big\langle A \nabla V(x(s)), dx(s)\big\rangle - \frac{1}{2\Sigma}\int_0^T |A \nabla V(x(s))|^2 \, ds\Big).
\]
This is the maximum likelihood function.


Let $x$ denote $\{x(t)\}_{t\in[0,T]}$ or $\{x(n\delta)\}_{n=0}^{N}$ with $N\delta = T$.

The diffusion coefficient is estimated from the quadratic variation:
\[
\hat\Sigma_{N,\delta}(x) = \frac{1}{2dN\delta} \sum_{n=0}^{N-1} |x_{n+1} - x_n|^2,
\]
where $d$ is the dimension. We choose $\hat A$ to maximize $\log L$:
\[
\hat A(x) = -\frac{\int_0^T \langle \nabla V(x(s)), dx(s)\rangle}{\int_0^T |\nabla V(x(s))|^2 \, ds}.
\]


In practice we use the estimators on discrete-time data, with the following discretizations:
\[
\hat\Sigma_{N,\delta}(x) = \frac{1}{2dN\delta} \sum_{n=0}^{N-1} |x_{n+1} - x_n|^2,
\]
\[
\hat A_{N,\delta}(x) = -\frac{\sum_{n=0}^{N-1} \langle \nabla V(x_n), (x_{n+1} - x_n)\rangle}{\sum_{n=0}^{N-1} |\nabla V(x_n)|^2 \, \delta},
\]
together with the alternative drift estimator
\[
\widetilde A_{N,\delta}(x) = \hat\Sigma_{N,\delta} \, \frac{\sum_{n=0}^{N-1} \Delta V(x_n) \, \delta}{\sum_{n=0}^{N-1} |\nabla V(x_n)|^2 \, \delta}.
\]
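A sketch of this fitting procedure in one dimension, with $V(x) = x^2/2$ and $p(y) = \cos(y)$ (our choices; the subsampling stride s below gives $\delta = s\,\Delta t$):

% Sketch: estimate (A, Sigma) from multiscale data, with subsampling.
% Model: dx = -alpha*V'(x) dt - (1/epsl)*p'(x/epsl) dt + sqrt(2*sigma) dW,
% with V(x) = x^2/2 and p(y) = cos(y), so -p'(y) = sin(y).
alpha = 1; sigma = 0.5; epsl = 0.1; dt = 1e-4; T = 100; N = floor(T/dt);
x = zeros(N+1,1);
for n = 1:N
    drift = -alpha*x(n) + (1/epsl)*sin(x(n)/epsl);
    x(n+1) = x(n) + drift*dt + sqrt(2*sigma*dt)*randn;
end
s = 2^10;                                   % stride: delta = s*dt, epsl^2 << delta << 1
xs = x(1:s:end); delta = s*dt; Ns = length(xs) - 1;
gradV = xs(1:end-1);                        % V'(x) = x for the quadratic potential
Sigma_hat = sum(diff(xs).^2)/(2*Ns*delta)   % d = 1 here
A_hat = -sum(gradV.*diff(xs))/(sum(gradV.^2)*delta)

Sweeping s from 1 upwards should reproduce the behaviour in the figures that follow: with s = 1 the estimators return (roughly) the unhomogenized values, while for intermediate strides they approach $(A, \Sigma)$.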


No Subsampling

Generate data from the unhomogenized equation (quadratic or bistable potential, simple trigonometric perturbation).

Solve the SDE numerically using Euler-Maruyama for a single realization of the noise; the time step is sufficiently small that discretization errors are negligible.

Fit to the homogenized equation, using data on a fine scale $\delta \ll \varepsilon^2$ (i.e. use all the data).

Parameter estimation fails.


Figure: $\hat A$, $\hat\Sigma$ vs $\varepsilon$ for the quadratic potential.


Figure: $\hat A$, $\hat\Sigma$ vs $\sigma$ for the quadratic potential with $\varepsilon = 0.1$.


Subsampling

Generate data from the unhomogenized equation and fit to the homogenized equation.

Use data on a coarse scale $\varepsilon^2 \ll \delta \ll 1$. More precisely,
\[
\delta := \Delta t_{\mathrm{sam}} = 2^k \Delta t, \qquad k = 0, 1, \dots,
\]
and study the estimators as a function of $\Delta t_{\mathrm{sam}}$.

Parameter estimation succeeds.


Figure: $\hat A$, $\hat\Sigma$ vs $\Delta t_{\mathrm{sam}}$ for the quadratic potential with $\varepsilon = 0.1$.


Figure: $\hat A$, $\hat B$ vs $\Delta t_{\mathrm{sam}}$ for the bistable potential with $\sigma = 0.5$, $\varepsilon = 0.1$.


Figure: $\hat B_{ij}$, $i, j = 1, 2$, vs $\Delta t_{\mathrm{sam}}$ for the 2d quadratic potential with $\sigma = 0.5$, $\varepsilon = 0.1$.


Conclusions From Numerical Experiments

Parameter estimation fails when we take the small-scale (high-frequency) data into account.

$\hat A$, $\hat\Sigma$ become exponentially wrong as $\sigma \to 0$.

$\hat A$, $\hat\Sigma$ do not improve as $\varepsilon \to 0$.

Parameter estimation succeeds when we subsample (use only data on a coarse scale).

There is an optimal sampling rate, which depends on $\sigma$.

The optimal sampling rate is different in different directions in higher dimensions.


Theorem (No Subsampling)

Let $x^\varepsilon(t): \mathbb{R}^+ \to \mathbb{R}^d$ be generated by the unhomogenized equation. Then
\[
\lim_{\varepsilon \to 0} \lim_{T \to \infty} \hat A(x^\varepsilon(t)) = \alpha, \quad \text{a.s.}
\]
Fix $T = N\delta$. Then, for every $\varepsilon > 0$,
\[
\lim_{N \to \infty} \hat\Sigma_{N,\delta}(x^\varepsilon(t)) = \sigma, \quad \text{a.s.}
\]
Thus the unhomogenized parameters are estimated: the wrong answer.


Theorem (With Subsampling)

Fix $T = N\delta$ with $\delta = \varepsilon^a$, $a \in (0,1)$. Then
\[
\lim_{\varepsilon \to 0} \hat\Sigma_{N,\delta}(x^\varepsilon) = \Sigma \quad \text{in distribution}.
\]
Let $\delta = \varepsilon^a$ with $a \in (0,1)$ and $N = [\varepsilon^{-\gamma}]$, $\gamma > a$. Then
\[
\lim_{\varepsilon \to 0} \hat A_{N,\delta}(x^\varepsilon) = A \quad \text{in distribution}.
\]
Thus we get the right answer provided subsampling is used.
