
Multilevel Sampling Methods for Inverse Problems Constrained by Partial Differential Equations

Aretha Teckentrup

School of Mathematics, University of Edinburgh

Joint work with:

Tim Dodwell (Exeter), Chris Ketelsen (DigitalGlobe), Rob Scheichl (Heidelberg), Andrew Stuart (Caltech)

University of Greenwich - February 20, 2019

A. Teckentrup (Edinburgh) Multilevel methods in IPs February 20, 2019 1 / 25


Outline

1 Bayesian Inverse Problems

2 (Multilevel) Ratio Estimators

3 (Multilevel) Metropolis Hastings Estimators


Bayesian Inverse Problems: Definition and Applications

An inverse problem is concerned with determining causal factors from observed results.

In mathematical terms, we want to determine unknown parameters in a model from indirect observations.

Inverse problems appear in many application areas, including

- the determination of an earthquake’s epicentre using observations of seismic waves on the earth’s surface,

- the detection of flaws or cracks within a concrete structure from acoustic or electromagnetic measurements at its surface.

The model of the process is frequently given by a partial differential equation, where we have observations of the solution and want to determine initial conditions/boundary conditions/coefficients.


Bayesian Inverse Problems: Mathematical Formulation

Suppose we are given a mathematical model of a process, represented by a (typically non-linear) map $G$.

We are interested in the following inverse problem: given observational data $y \in \mathbb{R}^{d_y}$, determine model parameters $u \in U \subseteq \mathbb{R}^{d_u}$ such that

$$ y = G(u) + \eta, $$

where $\eta$ represents observational noise, due to, for example, measurement error.

Simply ”inverting $G$” is not possible, since

- we do not know the value of $\eta$, and

- typically the problem is ill-posed and/or ill-conditioned.


Bayesian Inverse Problems: Mathematical Formulation [Stuart ’10] [Kaipio, Somersalo ’04]

The Bayesian approach treats $u$, $y$ and $\eta$ as random variables.

We choose a prior measure $\mu_0$ on $u$ with density $\pi_0$.

Under the measurement model $y = G(u) + \eta$ with $\eta \sim N(0, \Gamma)$, we have $y|u \sim N(G(u), \Gamma)$, and the likelihood of the data $y$ is

$$ P(y|u) \propto \exp\Big(-\frac{1}{2}\|y - G(u)\|^2_{\Gamma^{-1}}\Big) =: \exp\big(-\Phi(u)\big). $$

Using Bayes’ Theorem, we obtain the posterior measure $\mu^y$ of $u|y$ with density $\pi^y$, given by

$$ \pi^y(u) = \frac{1}{Z}\exp\big(-\Phi(u)\big)\,\pi_0(u), \qquad \Big(\frac{d\mu^y}{d\mu_0}(u) = \frac{1}{Z}\exp\big(-\Phi(u)\big)\Big) $$

where $Z = \int_U \exp\big(-\Phi(u)\big)\,\pi_0(u)\,du = \mathbb{E}_{\pi_0}\big[\exp(-\Phi)\big]$.
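The potential $\Phi$ and the unnormalised posterior density can be sketched in a few lines. This is a minimal illustration only: the forward map `toy_forward_map`, the standard normal prior and the noise covariance below are hypothetical choices, not the PDE model from the talk.

```python
import numpy as np

def potential(u, y, forward_map, noise_cov):
    """Phi(u) = 0.5 * ||y - G(u)||^2, weighted by the inverse noise covariance."""
    residual = y - forward_map(u)
    return 0.5 * residual @ np.linalg.solve(noise_cov, residual)

def unnormalised_posterior(u, y, forward_map, noise_cov, prior_density):
    """exp(-Phi(u)) * pi_0(u); the normalising constant Z is left out."""
    return np.exp(-potential(u, y, forward_map, noise_cov)) * prior_density(u)

# Hypothetical toy problem: two parameters, two observations.
def toy_forward_map(u):
    return np.array([u[0] + u[1], u[0] * u[1]])

def standard_normal_density(u):
    return np.exp(-0.5 * u @ u) / (2.0 * np.pi) ** (len(u) / 2.0)

noise_cov = 0.1 * np.eye(2)
u_true = np.array([1.0, 2.0])
y = toy_forward_map(u_true)  # noise-free data, for illustration only

phi_at_truth = potential(u_true, y, toy_forward_map, noise_cov)
phi_at_zero = potential(np.zeros(2), y, toy_forward_map, noise_cov)
```

Because the data here are noise-free, the potential vanishes at `u_true` and the unnormalised posterior reduces to the prior density there.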


Bayesian Inverse Problems: Computational Challenges

For the remainder of this talk, suppose we are interested in computing $\mathbb{E}_{\pi^y}[\phi]$, for some quantity of interest $\phi(u)$.

In most cases, we do not have a closed form expression for the posterior density $\pi^y$, since the normalising constant $Z$ is not known explicitly. (Exception: parameter-to-observation map $G$ linear and prior $\pi_0$ Gaussian $\Rightarrow$ posterior $\pi^y$ also Gaussian.)

We will present two efficient approaches to computing $\mathbb{E}_{\pi^y}[\phi]$:

- (multilevel) ratio estimators,

- (multilevel) Metropolis Hastings estimators.


(Multilevel) Ratio Estimators: Computing Expectations

Suppose we are interested in computing $\mathbb{E}_{\pi^y}[\phi]$. Using Bayes’ Theorem, we can write this as

$$ \mathbb{E}_{\pi^y}[\phi] = \mathbb{E}_{\pi_0}\Big[\frac{1}{Z}\exp(-\Phi)\,\phi\Big] = \frac{\mathbb{E}_{\pi_0}[\phi \exp(-\Phi)]}{\mathbb{E}_{\pi_0}[\exp(-\Phi)]}. $$

We have rewritten the posterior expectation as a ratio of two prior expectations.

We can now approximate

$$ \mathbb{E}_{\pi^y}[\phi] = \frac{Q}{Z} \approx \frac{\hat{Q}}{\hat{Z}}, $$

where $\hat{Q}$ is an estimator of $Q := \mathbb{E}_{\pi_0}[\phi \exp(-\Phi)]$ and $\hat{Z}$ is an estimator of $Z = \mathbb{E}_{\pi_0}[\exp(-\Phi)]$.

The prior distribution is known in closed form, and furthermore often has a simple structure (e.g. i.i.d. Gaussian or uniform).
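As a rough sketch of the ratio estimator, the example below uses a hypothetical scalar problem with prior $u \sim N(0,1)$, identity forward map and Gaussian noise, chosen only because the posterior mean is then known in closed form. Reusing the same prior samples for $\hat{Q}$ and $\hat{Z}$ is an assumption of this sketch.

```python
import numpy as np

def ratio_estimate(phi, potential, prior_samples):
    """Estimate E_posterior[phi] as Q_hat / Z_hat using samples from the prior."""
    weights = np.exp(-potential(prior_samples))
    q_hat = np.mean(phi(prior_samples) * weights)  # estimates E_prior[phi * exp(-Phi)]
    z_hat = np.mean(weights)                       # estimates E_prior[exp(-Phi)]
    return q_hat / z_hat

# Hypothetical conjugate toy problem: G(u) = u, prior N(0, 1), noise N(0, noise_var).
rng = np.random.default_rng(0)
y_obs, noise_var = 1.0, 1.0
prior_samples = rng.standard_normal(200_000)

estimate = ratio_estimate(
    phi=lambda u: u,
    potential=lambda u: 0.5 * (y_obs - u) ** 2 / noise_var,
    prior_samples=prior_samples,
)
exact = y_obs / (1.0 + noise_var)  # posterior mean of the conjugate Gaussian model
```

Only prior samples are drawn, so no knowledge of the posterior normalising constant is needed.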


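(Multilevel) Ratio Estimators: Monte Carlo Estimators

The standard Monte Carlo (MC) estimator of $Z = \mathbb{E}_{\pi_0}[\exp(-\Phi)]$,

$$ \hat{Z}^{MC}_{h,N} = \frac{1}{N}\sum_{i=1}^{N} \exp\big(-\Phi_h(u^{(i)})\big), $$

is an equal weighted average of $N$ i.i.d. samples $\exp(-\Phi_h(u^{(i)}))$, where $\Phi_h$ denotes an approximation of $\Phi$ using mesh width $h$.

The mean square error of $\hat{Z}^{MC}_{h,N}$ is

$$ e(\hat{Z}^{MC}_{h,N})^2 := \mathbb{E}\big[(\hat{Z}^{MC}_{h,N} - \mathbb{E}_{\pi_0}[\exp(-\Phi)])^2\big] = \underbrace{\frac{\mathbb{V}_{\pi_0}[\exp(-\Phi_h)]}{N}}_{\text{sampling error}} + \underbrace{\big(\mathbb{E}_{\pi_0}[\exp(-\Phi_h) - \exp(-\Phi)]\big)^2}_{\text{numerical error}}. $$

$\Rightarrow$ need to choose $N$ large and $h$ small!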
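The split into sampling and numerical error is the usual bias-variance decomposition. A short worked derivation, using the abbreviations $\hat{Z}$, $Z_h$ and $Z$ introduced here only for readability:

```latex
% Abbreviations (introduced for this derivation only):
%   \hat{Z} = \hat{Z}^{MC}_{h,N},  Z_h = E_{\pi_0}[\exp(-\Phi_h)],  Z = E_{\pi_0}[\exp(-\Phi)].
\begin{align*}
\mathbb{E}\big[(\hat{Z} - Z)^2\big]
  &= \mathbb{E}\big[(\hat{Z} - Z_h)^2\big]
   + 2\,(Z_h - Z)\,\mathbb{E}\big[\hat{Z} - Z_h\big]
   + (Z_h - Z)^2 \\
  &= \mathbb{V}\big[\hat{Z}\big] + (Z_h - Z)^2
   \qquad \text{since } \mathbb{E}[\hat{Z}] = Z_h \\
  &= \frac{\mathbb{V}_{\pi_0}[\exp(-\Phi_h)]}{N} + (Z_h - Z)^2
   \qquad \text{since the } N \text{ samples are i.i.d.}
\end{align*}
```

The first term decays like $1/N$ regardless of $h$, while the second depends only on the discretisation, which is why both $N$ and $h$ have to be controlled.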


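(Multilevel) Ratio Estimators: Multilevel Monte Carlo estimators [Giles ’08], [Heinrich ’01]

Multilevel methods use a decreasing sequence of mesh widths $\{h_\ell\}_{\ell=0}^{L}$.

Linearity of expectation gives us

$$ \mathbb{E}_{\pi_0}\big[e^{-\Phi_{h_L}}\big] = \mathbb{E}_{\pi_0}\big[e^{-\Phi_{h_0}}\big] + \sum_{\ell=1}^{L} \mathbb{E}_{\pi_0}\big[e^{-\Phi_{h_\ell}} - e^{-\Phi_{h_{\ell-1}}}\big]. $$

The multilevel Monte Carlo (MLMC) estimator

$$ \hat{Z}^{ML}_{\{h_\ell, N_\ell\}} = \frac{1}{N_0}\sum_{i=1}^{N_0} e^{-\Phi_{h_0}(u^{(i,0)})} + \sum_{\ell=1}^{L} \frac{1}{N_\ell}\sum_{i=1}^{N_\ell} \Big( e^{-\Phi_{h_\ell}(u^{(i,\ell)})} - e^{-\Phi_{h_{\ell-1}}(u^{(i,\ell)})} \Big), $$

is a sum of $L+1$ independent MC estimators. The sequence $\{N_\ell\}$ is decreasing, which means a significant portion of the computational effort is shifted onto the coarse grids.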
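A minimal sketch of the MLMC estimator follows. The integrand `toy_integrand` stands in for $e^{-\Phi_h}$ with a hypothetical "discretised" forward map $(1+h)u$ and a standard normal prior; it was picked only because its expectation has a closed form, and the mesh widths and sample sizes are arbitrary illustrative values.

```python
import numpy as np

def mlmc_estimate(integrand, mesh_widths, num_samples, sample_prior, rng):
    """Sum of independent MC estimators, one per term of the telescoping sum."""
    total = 0.0
    for level, (h, n) in enumerate(zip(mesh_widths, num_samples)):
        u = sample_prior(n, rng)  # fresh, independent samples on every level
        if level == 0:
            total += np.mean(integrand(u, h))
        else:
            h_coarse = mesh_widths[level - 1]
            # The same samples u enter both terms, which keeps the variance small.
            total += np.mean(integrand(u, h) - integrand(u, h_coarse))
    return total

# Hypothetical stand-in for exp(-Phi_h(u)) with y = 1 and G_h(u) = (1 + h) * u.
def toy_integrand(u, h, y_obs=1.0):
    return np.exp(-0.5 * (y_obs - (1.0 + h) * u) ** 2)

def exact_value(h, y_obs=1.0):
    """Closed form of E[toy_integrand(u, h)] for u ~ N(0, 1)."""
    a2 = (1.0 + h) ** 2
    return np.exp(-y_obs ** 2 / (2.0 * (1.0 + a2))) / np.sqrt(1.0 + a2)

rng = np.random.default_rng(1)
mesh_widths = [0.5, 0.25, 0.125, 0.0625]   # decreasing h_l
num_samples = [200_000, 50_000, 12_000, 3_000]  # decreasing N_l

z_ml = mlmc_estimate(toy_integrand, mesh_widths, num_samples,
                     lambda n, generator: generator.standard_normal(n), rng)
z_reference = exact_value(mesh_widths[-1])
```

The estimator targets the expectation on the finest level $h_L$, so `z_reference` is evaluated at `mesh_widths[-1]`; the remaining numerical error relative to $h \to 0$ is not removed by sampling.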


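(Multilevel) Ratio Estimators: Multilevel Monte Carlo estimators [Giles ’08], [Heinrich ’01]

The mean square error of $\hat{Z}^{ML}_{\{h_\ell, N_\ell\}}$ is

$$ e(\hat{Z}^{ML}_{\{h_\ell, N_\ell\}})^2 = \underbrace{\frac{\mathbb{V}_{\pi_0}[e^{-\Phi_{h_0}}]}{N_0} + \sum_{\ell=1}^{L} \frac{\mathbb{V}_{\pi_0}[e^{-\Phi_{h_\ell}} - e^{-\Phi_{h_{\ell-1}}}]}{N_\ell}}_{\text{sampling error}} + \underbrace{\big(\mathbb{E}_{\pi_0}[e^{-\Phi_{h_L}} - e^{-\Phi}]\big)^2}_{\text{numerical error}}. $$

Thus,

$N_0$ still needs to be large, but samples are much cheaper to obtain on coarse grids.

$N_\ell$ ($\ell \gg 0$) are much smaller, since $\mathbb{V}_{\pi_0}[e^{-\Phi_{h_\ell}} - e^{-\Phi_{h_{\ell-1}}}] \to 0$ as $h_\ell \to 0$.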
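How the $N_\ell$ might be chosen is not stated on the slide. A common choice in the MLMC literature (following [Giles ’08]) minimises total cost subject to a sampling error of at most $\epsilon^2/2$, giving $N_\ell \propto \sqrt{V_\ell / C_\ell}$. The sketch below assumes that per-level variances $V_\ell$ and per-sample costs $C_\ell$ are already known; the numbers used are hypothetical.

```python
import math

def mlmc_sample_sizes(variances, costs, tolerance):
    """N_l = ceil(2 * eps^-2 * sqrt(V_l / C_l) * sum_k sqrt(V_k * C_k))."""
    total = sum(math.sqrt(v * c) for v, c in zip(variances, costs))
    return [math.ceil(2.0 * tolerance ** -2 * math.sqrt(v / c) * total)
            for v, c in zip(variances, costs)]

# Hypothetical values: variance of the level differences decays, cost per sample grows.
level_variances = [1e-1, 1e-2, 1e-3, 1e-4]
level_costs = [1.0, 4.0, 16.0, 64.0]
tolerance = 0.01

sample_sizes = mlmc_sample_sizes(level_variances, level_costs, tolerance)
sampling_error = sum(v / n for v, n in zip(level_variances, sample_sizes))
```

With decaying variances and growing costs, the resulting sample sizes decrease with the level, in line with the remark above.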


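(Multilevel) Ratio Estimators: Convergence of Mean Square Error [Stuart, Scheichl, T. ’17]

We want to bound the mean square error (MSE): $\mathbb{E}\Big[\Big(\frac{\hat{Q}}{\hat{Z}} - \frac{Q}{Z}\Big)^2\Big]$.

Rearranging the MSE and applying the triangle inequality, we have

$$ \mathbb{E}\Big[\Big(\frac{\hat{Q}}{\hat{Z}} - \frac{Q}{Z}\Big)^2\Big] \le \frac{2}{Z^2}\Big( \mathbb{E}\big[(Q - \hat{Q})^2\big] + \mathbb{E}\big[(\hat{Q}/\hat{Z})^2 (Z - \hat{Z})^2\big] \Big). $$

If $(\hat{Q}/\hat{Z})^2 \in L^\infty$, then the MSE of $\hat{Q}/\hat{Z}$ can be bounded in terms of the individual MSEs of $\hat{Q}$ and $\hat{Z}$.

If $(\hat{Q}/\hat{Z})^2 \in L^r$, for some $2 < r < \infty$, then the MSE of $\hat{Q}/\hat{Z}$ can be bounded in terms of the MSE of $\hat{Q}$ and a higher order error of $\hat{Z}$, $\mathbb{E}\big[(Z - \hat{Z})^p\big]$ for some $p > 2$.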
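The rearrangement behind the bound can be written out in two lines. This is a worked sketch of the standard argument, not necessarily the exact route taken in the cited paper:

```latex
\begin{align*}
\frac{\hat{Q}}{\hat{Z}} - \frac{Q}{Z}
  &= \frac{\hat{Q} - Q}{Z} + \frac{\hat{Q}}{\hat{Z}} \cdot \frac{Z - \hat{Z}}{Z}, \\
\Big(\frac{\hat{Q}}{\hat{Z}} - \frac{Q}{Z}\Big)^2
  &\le \frac{2}{Z^2}\Big( (\hat{Q} - Q)^2 + \big(\hat{Q}/\hat{Z}\big)^2 (Z - \hat{Z})^2 \Big)
  \qquad \text{using } (a+b)^2 \le 2a^2 + 2b^2 .
\end{align*}
```

Taking expectations gives the stated inequality. In the $L^\infty$ case the factor $(\hat{Q}/\hat{Z})^2$ can be pulled out of the second expectation by its essential supremum. In the $L^r$ case, one natural route (an assumption here) is Hölder's inequality with exponents $r$ and $r' = r/(r-1)$, which produces a moment of $Z - \hat{Z}$ of order $p = 2r' > 2$.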


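(Multilevel) Ratio Estimators: Convergence of Mean Square Error [Stuart, Scheichl, T. ’17]

Consider the diffusion problem

$$ -\nabla \cdot \big(k(x)\nabla p(x)\big) = g(x) \ \text{ in } D \subset \mathbb{R}^d, \qquad y = \mathcal{O}(p) + \eta, $$

with prior distribution

- a log-normal distribution $k(x) = \exp\big(\sum_{n=1}^{d_u} u_n \phi_n(x)\big)$, with $\{\phi_n\}_{n=1}^{d_u}$ given functions in $L^\infty(D)$ and $u_n \sim N(0,1)$ $\Rightarrow$ $\hat{Q}/\hat{Z} \in L^r$, $r < \infty$,

- a uniform distribution, i.e. $k(x) = m(x) + \sum_{n=1}^{d_u} u_n \phi_n(x)$, with $\{\phi_n\}_{n=1}^{d_u}$ given functions in $L^\infty(D)$ and $u_n \sim U[-1,1]$ $\Rightarrow$ $\hat{Q}/\hat{Z} \in L^\infty$.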
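To make the two priors concrete, the sketch below draws one realisation of $k$ under each on a 1D grid. The basis functions $\phi_n(x) = \sin(n\pi x)/n^2$ and the constant mean $m(x) = 2$ are hypothetical choices for illustration; the talk does not specify them.

```python
import numpy as np

def basis(x, num_terms):
    """Hypothetical basis phi_n(x) = sin(n * pi * x) / n^2, shape (num_terms, len(x))."""
    n = np.arange(1, num_terms + 1)[:, None]
    return np.sin(n * np.pi * x[None, :]) / n ** 2

def lognormal_coefficient(u, x):
    """k(x) = exp(sum_n u_n * phi_n(x)); positive for any real u."""
    return np.exp(u @ basis(x, len(u)))

def uniform_coefficient(u, x, mean=2.0):
    """k(x) = m(x) + sum_n u_n * phi_n(x) with constant m(x) = mean."""
    return mean + u @ basis(x, len(u))

rng = np.random.default_rng(2)
x = np.linspace(0.0, 1.0, 101)
num_terms = 10

k_lognormal = lognormal_coefficient(rng.standard_normal(num_terms), x)
k_uniform = uniform_coefficient(rng.uniform(-1.0, 1.0, num_terms), x)
```

In the uniform case, positivity of $k$ depends on the mean dominating the perturbation; with these choices $\sum_n 1/n^2 \le \pi^2/6 < 2$, so `k_uniform` stays bounded away from zero. In the log-normal case $k$ is positive but not uniformly bounded, which is consistent with the weaker integrability noted above.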


(Multilevel) Ratio Estimators: Numerical Results [Stuart, Scheichl, T. ’17]

Flow cell model problem on $(0,1)^2$

$u$ log-normal random field

Observed data: $y = \{p(x_i) + \eta_i\}_{i=1}^{9}$, with $\eta_i \sim N(0, 0.09)$

QoI $\phi$ is outflow over right boundary

[Figure: two log-log plots of computational cost (axis range $10^3$ to $10^{10}$) against accuracy $\epsilon$ (axis range $10^{-2}$ to $10^{0}$). One panel compares MC (dep.), QMC (dep.) and MLMC (dep.); the other compares MC (indep.), QMC (indep.) and MLMC (indep.). Reference slopes $\epsilon^{-4}$, $\epsilon^{-3}$ and $\epsilon^{-2}$ are shown in both panels.]


(Multilevel) Ratio Estimators: Related works

A number of works have recently considered the ratio estimator approach in the context of PDE constrained inverse problems, as it allows the reuse of machinery developed for $\pi_0$:

[Schillings, Schwab ’13]: dimension-adaptive sparse grids

[Dick, Gantner, Le Gia, Schwab ’17]: (multilevel) higher order Quasi-Monte Carlo

[Gantner, Peters ’18]: higher order Quasi-Monte Carlo for PDEs on random domains

When $\|\Gamma\| \ll 1$ or $d_y \gg 1$, the posterior density $\pi^y$ may concentrate, and the prior expectations become difficult to evaluate accurately.

[Schillings, Schwab ’16]: rescaling of parameter space around (unique) MAP point

[Schillings, Sprungk, Wacker ’19]: use of Laplace approximations


(Multilevel) Metropolis Hastings Estimators: Introduction

The second approach we are going to discuss is the (multilevel) Metropolis-Hastings algorithm, a type of Markov chain Monte Carlo (MCMC) algorithm.

The estimator of $\mathbb{E}_{\pi^y}[\phi]$ takes the form

$$ \hat{E}^{MH}_{h,N} = \frac{1}{N}\sum_{i=1}^{N} \phi_h(u^{(i)}), $$

where $u^{(i)} \sim \pi^y_h$, for $1 \le i \le N$, and $\pi^y_h(u) = \frac{1}{Z_h}\exp(-\Phi_h(u))\,\pi_0(u)$.

Since $\pi^y_h$ is not known in closed form, it is generally not possible to generate i.i.d. samples distributed according to $\pi^y_h$.


(Multilevel) Metropolis Hastings Estimators: Standard Metropolis-Hastings [Robert, Casella ’99]

ALGORITHM 1. (Standard MH-MCMC)

Choose $u^{(1)}$ with $\pi^y_h(u^{(1)}) > 0$.

At state $u^{(i)}$, sample a proposal $u'$ from density $q(u' \,|\, u^{(i)})$.

Accept sample $u'$ with probability

$$ \alpha(u' \,|\, u^{(i)}) = \min\Big(1, \frac{\pi^y_h(u')\, q(u^{(i)} \,|\, u')}{\pi^y_h(u^{(i)})\, q(u' \,|\, u^{(i)})}\Big), $$

i.e. $u^{(i+1)} = u'$ with probability $\alpha(u' \,|\, u^{(i)})$; otherwise stay at $u^{(i+1)} = u^{(i)}$.

The proposal density $q$ is chosen to be easy to sample from.

The accept/reject step is added in order to obtain samples from $\pi^y_h$.

Knowledge of the normalising constant $Z_h$ of $\pi^y_h$ is not required.
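Algorithm 1 can be sketched as follows for a scalar parameter. The Gaussian random-walk proposal is an assumption of this sketch (Algorithm 1 allows any proposal density $q$); because it is symmetric, $q$ cancels in the acceptance ratio. The target is the same hypothetical conjugate Gaussian toy posterior $N(0.5, 0.5)$, supplied only up to its normalising constant.

```python
import numpy as np

def metropolis_hastings(log_target, initial_state, num_steps, step_size, rng):
    """Random-walk Metropolis-Hastings for a scalar state.

    log_target may omit the normalising constant, since only differences are used.
    """
    chain = np.empty(num_steps)
    current = initial_state
    current_log = log_target(current)
    num_accepted = 0
    for i in range(num_steps):
        proposal = current + step_size * rng.standard_normal()
        proposal_log = log_target(proposal)
        # Symmetric proposal: accept with probability min(1, target ratio).
        if np.log(rng.uniform()) < proposal_log - current_log:
            current, current_log = proposal, proposal_log
            num_accepted += 1
        chain[i] = current  # a rejected proposal repeats the current state
    return chain, num_accepted / num_steps

# Hypothetical toy posterior: prior N(0, 1), datum y = 1, unit noise variance.
def log_unnormalised_posterior(u):
    return -0.5 * (1.0 - u) ** 2 - 0.5 * u ** 2

rng = np.random.default_rng(3)
chain, acceptance_rate = metropolis_hastings(
    log_unnormalised_posterior, initial_state=0.0,
    num_steps=50_000, step_size=1.0, rng=rng)
posterior_mean_estimate = chain[5_000:].mean()  # discard an initial burn-in
```

The samples in `chain` are correlated, so the effective sample size is smaller than `num_steps`; the burn-in length used here is an arbitrary illustrative value.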


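(Multilevel) Metropolis Hastings Estimators: Definition of Multilevel Metropolis-Hastings [Dodwell, Ketelsen, Scheichl, T. ’15]

Key ingredients in multilevel method:

telescoping sum: $\mathbb{E}[\phi_{h_L}] = \mathbb{E}[\phi_{h_0}] + \sum_{\ell=1}^{L} \big(\mathbb{E}[\phi_{h_\ell}] - \mathbb{E}[\phi_{h_{\ell-1}}]\big)$,

number of samples per level $N_\ell$ is decreasing.

The target distribution $\pi^y_{h_\ell}$ depends on $h_\ell$, so we need to define the multilevel estimator carefully!

$$ \mathbb{E}_{\pi^y_{h_L}}[\phi_{h_L}] = \underbrace{\mathbb{E}_{\pi^y_{h_0}}[\phi_{h_0}]}_{\text{standard MCMC}} + \sum_{\ell=1}^{L} \underbrace{\Big(\mathbb{E}_{\pi^y_{h_\ell}}[\phi_{h_\ell}] - \mathbb{E}_{\pi^y_{h_{\ell-1}}}[\phi_{h_{\ell-1}}]\Big)}_{\text{2 level MCMC (NEW)}} $$

$$ \hat{E}^{MLMH}_{\{h_\ell, N_\ell\}} := \frac{1}{N_0}\sum_{i=1}^{N_0} \phi_{h_0}(u_0^{(i)}) + \sum_{\ell=1}^{L} \frac{1}{N_\ell}\sum_{i=1}^{N_\ell} \Big( \phi_{h_\ell}(u_\ell^{(i)}) - \phi_{h_{\ell-1}}(U_{\ell-1}^{(i)}) \Big) $$

We need to be able to choose $N_\ell \to 1$ as $\ell \to \infty$!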


(Multilevel) Metropolis Hastings Estimators: Definition of Multilevel Metropolis-Hastings [Dodwell, Ketelsen, Scheichl, T. ’15]

We need to generate (coupled) Markov chains $\{u_\ell^{(i)}\}$ and $\{U_{\ell-1}^{(i)}\}$, such that

$\{U_{\ell-1}^{(i)}\}$ has marginal distribution $\pi^y_{h_{\ell-1}}$,

$\{u_\ell^{(i)}\}$ has marginal distribution $\pi^y_{h_\ell}$,

$\mathbb{V}\big[\phi_{h_\ell}(u_\ell^{(i)}) - \phi_{h_{\ell-1}}(U_{\ell-1}^{(i)})\big] \to 0$ as $\ell \to \infty$.

The main idea of our algorithm is to:

use Metropolis-Hastings with $U'_{\ell-1} \sim q(\cdot \,|\, U_{\ell-1}^{(i)})$ to generate $U_{\ell-1}^{(i+1)}$,

use Metropolis-Hastings with $u'_\ell = U_{\ell-1}^{(i+1)}$ to generate $u_\ell^{(i+1)}$.

In practice, we use $t_\ell$ steps of Metropolis-Hastings to generate $U_{\ell-1}^{(i+1)}$, such that $U_{\ell-1}^{(i)}$ and $U_{\ell-1}^{(i+1)}$ are essentially uncorrelated. This significantly simplifies the computation of the acceptance probability for $u_\ell^{(i+1)}$.
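A rough sketch of the two-level coupling for a scalar parameter is given below. The acceptance probability on the fine level is not stated above; the sketch assumes that the subsampled coarse state can be treated as an independent draw from $\pi^y_{h_{\ell-1}}$, in which case Algorithm 1 with that proposal gives $\min\big(1, \pi^y_{h_\ell}(u')\,\pi^y_{h_{\ell-1}}(u) / (\pi^y_{h_\ell}(u)\,\pi^y_{h_{\ell-1}}(u'))\big)$. The targets `log_level` are hypothetical toy posteriors, not the PDE model.

```python
import numpy as np

def two_level_mh(log_fine, log_coarse, num_samples, subsampling, step_size, rng):
    """Coupled chains: coarse random-walk MH, fine chain proposed from the coarse one."""
    coarse_state, fine_state = 0.0, 0.0
    coarse_chain = np.empty(num_samples)
    fine_chain = np.empty(num_samples)
    for i in range(num_samples):
        # Advance the coarse chain by `subsampling` steps (the role of t_l above).
        for _ in range(subsampling):
            proposal = coarse_state + step_size * rng.standard_normal()
            if np.log(rng.uniform()) < log_coarse(proposal) - log_coarse(coarse_state):
                coarse_state = proposal
        # Propose the new coarse state on the fine level (assumed acceptance ratio).
        log_alpha = (log_fine(coarse_state) - log_fine(fine_state)
                     + log_coarse(fine_state) - log_coarse(coarse_state))
        if np.log(rng.uniform()) < log_alpha:
            fine_state = coarse_state
        coarse_chain[i] = coarse_state
        fine_chain[i] = fine_state
    return fine_chain, coarse_chain

# Hypothetical toy targets: prior N(0, 1), datum y = 1, forward map (1 + h) * u.
def log_level(u, h):
    return -0.5 * (1.0 - (1.0 + h) * u) ** 2 - 0.5 * u ** 2

rng = np.random.default_rng(4)
fine_chain, coarse_chain = two_level_mh(
    log_fine=lambda u: log_level(u, 0.125),
    log_coarse=lambda u: log_level(u, 0.25),
    num_samples=20_000, subsampling=5, step_size=1.0, rng=rng)

burn_in = 1_000
level_difference = fine_chain[burn_in:] - coarse_chain[burn_in:]
```

When the two targets are close, most fine-level proposals are accepted, so the two chains coincide in most steps and `level_difference` has a much smaller variance than either chain on its own.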


(Multilevel) Metropolis Hastings Estimators
Implementation details [Dodwell, Ketelsen, Scheichl, T. ’15]

The acceptance probability $\alpha_{2L}$ for $u_\ell^{(i)}$ is easy to compute, and we can prove $\mathbb{E}[\alpha_{2L}] \to 1$ as $\ell \to \infty$. This means $\mathbb{P}[u_\ell^{(i)} = U_{\ell-1}^{(i)}] \to 1$ as $\ell \to \infty$.

To define the level $\ell$ approximation $\pi^y_{h_\ell}$, we can

- change the number of parameters on level $\ell$: $u \in \mathbb{R}^N \to u_{1:R_\ell} \in \mathbb{R}^{R_\ell}$,

- change the noise level on level $\ell$: $\Gamma \to \Gamma_\ell$, where $\eta \sim N(0, \Gamma)$.

In practice, we always start at level 0 when generating samples, and use a sub-sampling rate $t_\ell$.

- This leads to an efficient implementation with small integrated autocorrelation times ($O(1)$) on levels $\ell \ge 1$.
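To illustrate why sub-sampling reduces the integrated autocorrelation time, the sketch below estimates it for a strongly correlated chain and for the same chain thinned by a factor of 20. The AR(1) process is an assumed stand-in for a Metropolis-Hastings chain, and the estimator (sum of sample autocorrelations, truncated at the first non-positive value) is one simple choice among several; the function names and parameters are hypothetical.

```python
import random


def integrated_autocorr_time(x, max_lag=200):
    """Estimate 1 + 2 * sum of autocorrelations, truncated at the first
    non-positive sample autocorrelation (a simple windowing heuristic)."""
    n = len(x)
    mean = sum(x) / n
    centred = [xi - mean for xi in x]
    var = sum(c * c for c in centred) / n
    if var == 0.0:
        return 1.0
    tau = 1.0
    for lag in range(1, min(max_lag, n - 1) + 1):
        rho = sum(centred[i] * centred[i + lag] for i in range(n - lag)) / (n * var)
        if rho <= 0.0:
            break
        tau += 2.0 * rho
    return tau


def ar1_chain(n, rho, seed=0):
    """AR(1) process x_{k+1} = rho * x_k + noise, a toy correlated chain."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = rho * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


if __name__ == "__main__":
    chain = ar1_chain(20000, rho=0.9, seed=1)
    thinned = chain[::20]  # keep every 20th state (sub-sampling rate t = 20)
    print("IAT of full chain:   ", integrated_autocorr_time(chain))
    print("IAT of thinned chain:", integrated_autocorr_time(thinned))
```

For an AR(1) process with coefficient $\rho$ the exact integrated autocorrelation time is $(1+\rho)/(1-\rho)$, so thinning by $t$ replaces $\rho$ with $\rho^t$ and drives the value towards 1.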


(Multilevel) Metropolis Hastings Estimators
Numerical Example [Dodwell, Ketelsen, Scheichl, T. ’15]

Flow cell model problem on $(0, 1)^2$

$u$ log-normal random field

Observed data: $y = \{p(x_i) + \eta_i\}_{i=1}^{16}$, with $\eta_i \sim N(0, 10^{-4})$

QoI $\phi$ is outflow over right boundary

[Figure: cost in CPU time (secs), on a logarithmic axis from $10^2$ to $10^6$, for MLMCMC and standard MCMC; the horizontal axis runs from 0 to 0.04.]


(Multilevel) Metropolis Hastings Estimators
Related works

Earlier related work using coarse models to "pre-screen" proposals for fine models:

- [Liu ’01]: surrogate transition method

- [Christen, Fox ’05]: delayed acceptance

- [Efendiev, Hou, Luo ’06]: application to PDEs

There are several other works discussing multilevel methods in the context of inverse problems, see the MLMC community webpage.


Conclusions

Standard sampling methods quickly become computationally infeasible for practical applications.

Multilevel methods are powerful tools for reducing the computational burden in practical applications, leading to a cost reduction of orders of magnitude.

In the context of Bayesian inverse problems, multilevel methods include

- multilevel ratio estimators (importance sampling),

- multilevel Metropolis-Hastings estimators.

Multilevel methods are generally applicable!


References I

Multilevel Community Webpage. https://people.maths.ox.ac.uk/gilesm/mlmc_community.html.

J. A. Christen and C. Fox, Markov chain Monte Carlo using an approximation, Journal of Computational and Graphical Statistics, 14 (2005), pp. 795–810.

J. Dick, R. N. Gantner, Q. T. Le Gia, and C. Schwab, Multilevel higher-order quasi-Monte Carlo Bayesian estimation, Mathematical Models and Methods in Applied Sciences, 27 (2017), pp. 953–995.

T. Dodwell, C. Ketelsen, R. Scheichl, and A. Teckentrup, A hierarchical multilevel Markov chain Monte Carlo algorithm with applications to uncertainty quantification in subsurface flow, SIAM/ASA Journal on Uncertainty Quantification, 3 (2015), pp. 1075–1108.

Y. Efendiev, T. Hou, and W. Luo, Preconditioning Markov chain Monte Carlo simulations using coarse-scale models, SIAM Journal on Scientific Computing, 28 (2006), pp. 776–803.


References II

R. N. Gantner and M. D. Peters, Higher-Order Quasi-Monte Carlo for Bayesian Shape Inversion, SIAM/ASA Journal on Uncertainty Quantification, 6 (2018), pp. 707–736.

J. Kaipio and E. Somersalo, Statistical and computational inverse problems, Springer, 2004.

J. S. Liu, Monte Carlo strategies in scientific computing, Springer, 2001.

C. Robert and G. Casella, Monte Carlo Statistical Methods, Springer, 1999.

D. Rudolf, Explicit error bounds for Markov chain Monte Carlo, arXiv preprint arXiv:1108.3201, (2011).

R. Scheichl, A. Stuart, and A. Teckentrup, Quasi-Monte Carlo and Multilevel Monte Carlo Methods for Computing Posterior Expectations in Elliptic Inverse Problems, SIAM/ASA Journal on Uncertainty Quantification, 5 (2017), pp. 493–518.

C. Schillings and C. Schwab, Sparse, adaptive Smolyak quadratures for Bayesian inverse problems, Inverse Problems, 29 (2013), p. 065011.


References III

C. Schillings and C. Schwab, Scaling limits in computational Bayesian inversion, ESAIM: Mathematical Modelling and Numerical Analysis, 50 (2016), pp. 1825–1856.

C. Schillings, B. Sprungk, and P. Wacker, On the Convergence of the Laplace Approximation and Noise-Level-Robustness of Laplace-based Monte Carlo Methods for Bayesian Inverse Problems, arXiv preprint arXiv:1901.03958, (2019).

A. M. Stuart, Inverse Problems: A Bayesian Perspective, Acta Numerica, 19 (2010), pp. 451–559.
