surrogate and reduced-order models · surrogate and reduced-order models problem: difficult to...

Surrogate and Reduced-Order ModelsProblem: Difficult to obtain sufficient number of realizations of discretized PDE models for Bayesian model calibration, design and control.

Solution: Construct surrogate models• Also termed data-fit models, response surface models, emulators, meta-models

• Projection-based models often called reduced-order models

Constitutive Closure Relations: e.g.,

Mass

Momentum

Energy

Water

Aerosol

@⇢

@t+r · (⇢v) = 0

@v@t

= -v ·rv -1⇢rp - gk̂ - 2⌦⇥ v

⇢cV@T@t

+ pr · v = -r · F +r · (krT ) + ⇢q̇(T , p, ⇢)

p = ⇢RT

@mj

@t= -v ·rmj + Smj (T , mj ,�j , ⇢) , j = 1, 2, 3,

@�j

@t= -v ·r�j + S�j (T ,�j , ⇢) , j = 1, · · · , J,

Surrogate Models: MotivationExample: Consider the heat equation

Notes:• Requires approximation of PDE in 3-D• What would be a simple surrogate?

t

1 x , y , z

with the response

Boundary Conditions

Initial Conditions

y(q) =

Z 1

0

Z 1

0

Z 1

0

Z 1

0u(t , x , y , z)dxdydzdt

@u

@t

=@2

u

@x

2 +@2

u

@y

2 +@2

u

@z

2 + f (q)

Surrogate Models: MotivationExample: Consider the heat equation

with the response

Boundary Conditions

Initial Conditions

y(q) =

Z 1

0

Z 1

0

Z 1

0

Z 1


@u

@t

=@2

u

@x

2 +@2

u

@y

2 +@2

u

@z

2 + f (q)

Question: How do you construct a polynomial surrogate?• Regression• Interpolation

t

1 x , y , z

Surrogate: Quadraticys(q) = (q - 0.25)2 + 0.5

Surrogate ModelsRecall: Consider the model

Question: How do you construct a polynomial surrogate?• Interpolation• Regression

M=7k=2

with the response

Boundary Conditions

Initial Conditions

y(q) =

Z 1

0

Z 1

0

Z 1

0

Z 1


@u

@t

=@2

u

@x

2 +@2

u

@y

2 +@2

u

@z

2 + f (q)

t

1 x , y , z

Surrogate: Quadraticys(q) = (q - 0.25)2 + 0.5

Surrogate and Reduced-Order ModelsIssues: •Techniques for regression versus interpolation

• Grid choices for interpolation• Techniques for time-dependent problems

Data-Fit ModelsNotes:• Often termed response surface models, surrogates, emulators, meta-models.

• Rely on interpolation or regression.

• Data can consist of high-fidelity simulations or experiments.

• Common techniques: polynomial models, kriging (Gaussian process regression), orthogonal polynomials.

Statistical Model:

y = f(q)

Strategy: Consider high fidelity model

with M model evaluations

y = f (q)

ym = f (qm) , m = 1, ... , M

fs(q): Surrogate for f (q)

ym = fs(qm) + "m , m = 1, ... , M

Data-Fit Models – Polynomial EmulatorQuadratic Emulator: Regression

Least Squares Estimate:

Deterministic System:

Notes:• Good choice for optimization;

• Accurate approximation may require high-order polynomials;

• Does not provide uncertainty bounds for uncertainty quantification.

fs(q;�) = �0 + �1q + �2q2

yobs

= X�

� = [XTX]�1XT yobs

Quadratic

Res

pons

e

Data

Parameters q

Emulator

Data-Fit Models – CollocationStrategy: Consider high fidelity model

y = f(q)

with M model evaluations

ym = f(qm) , m = 1, · · · ,M

Collocation Surrogate:

ys(q) = fs(q) =MX

m=1

ymLm(q)

Lm(q) =MY

j=0j 6=m

q � q j

qm � q j=

(q � q 1) · · · (q � qm�1)(q � qm+1) · · · (q � qM )

(qm � q 1) · · · (qm � qm�1)(qm � qm+1) · · · (qm � qM )

Note:

Lm(qj) = �jm =

⇢0 , j 6= m1 , j = m

Result: ys(qm) = f(qm)

where Lm(q) is a Lagrange polynomial, which in 1-D is represented by

ym

Data-Fit Models – CollocationStrategy: Consider high fidelity model

Alternative: Collocation surrogate

Note: Result:

Lm(q) =MY

j=0j 6=m

q - q j

q m - q j =(q - q 1) · · · (q - q m-1)(q - q m+1) · · · (q - q M)

(q m - q 1) · · · (q m - q m-1)(q m - q m+1) · · · (q m - q M)

Lm(q j) = �jm =

�0 , j 6= m1 , j = m

y = f (q)

where Lm(q) is a Lagrange polynomial, which in 1-D, is represented by

ym

with M samples ym , m = 1, ... , M

yM(q) =MX

m=1

ymLm(q)

yM(qm) = ym

Properties:• Easy to add points• Method is nonintrusive and treats

code as black box.

Data-Fit Models – Gaussian Process Emulator

Gaussian Process: Kriging

fs(q;�) = gT (q)�+ Z (q)

• gT (q)� : Trend function

• Z (q): Gaussian process with

cov[Z (qi), Z (qj)] = �2R(qi, qj) + �2

0

�(qi, qj)

R(qi, qj) = exp

-pX

k=1

��✓k(qik - qj

k)��k

!

High Fidelity Model: M evaluations

ym = f (qm) , m = 1, ... , M

Statistical Model:

ym = fs(qm,�) + "m , "m ⇠ N(0,�20)

Note:�(qi , qj) =

�1 , kqj - qik = 00 , else

Universal Kriging: gT (q)� =PK

k=0 �kgk(q)

Ordinary Kriging: gT (q)� = �0

Gaussian Process Emulator


fs(q;�) = gT (q)�+ Z (q)

cov[Z (qi), Z (qj)] = �2R(qi, qj) + �2

0

�(qi, qj)

R(qi, qj) = exp

-pX

k=1

��✓k(qik - qj

k)��k

!

Statistical Model:

ym = fs(qm,�) + "m , "m ⇠ N(0,�20)

Note: Hyper-parameters ✓k ,�ktuned to achieve varying degrees of correlation

−4 −2 0 2 40

0.2

0.4

0.6

0.8

1

qi−qj

R(q

i ,qj )

γ=0.5γ=1γ=2

Gaussian Process Emulator


fs(q;�) = gT (q)�+ Z (q)

cov[Z (qi), Z (qj)] = �2R(qi, qj) + �2

0

�(qi, qj)

R(qi, qj) = exp

-pX

k=1

��✓k(qik - qj

k)��k

!

Statistical Model:

ym = fs(qm,�) + "m , "m ⇠ N(0,�20)

Note: In absence of observation noise, GP surrogate interpolates; i.e.,

ym = fs(qm,�) , m = 1, ... , M

Uncertainty Bounds:

Gaussian Process EmulatorObservations:

Likelihood:

Log-likelihood:

and

Take ys = [y1, ... , yM ]T , 1 = [1, · · · , 1] and Rij = R(qi , qj)

L(�0

,�2

; q) =1

(2⇡�2)M/2|R|1/2

exp

-

1

2�2

(ys - �0

1)TR-1(ys - �0

1)�

` = -M2

ln(2⇡)-M2

ln(�2)-12

ln(R)-1

2�2 (ys - �01)TR-1(ys - �01)

) @`

@�0=

12�2 1TR-1(ys - �01) · 2 = 0

) �0(✓,�) =⇥1TR-11

⇤-1 ⇥1TR-1ys⇤

@`

@�2 = -M2

1�2 +

12(�2)2 (ys - �01)TR-1(ys - �01) = 0

) �2(✓,�) =1M

[ys - �01]T R-1 [ys - �01]

Note: Optimize to infer ✓,�

Gaussian Process PredictionStrategy:

Assume we have M points ys = [f (q1), ... , f (qM)]T and we want

data ys and prediction yp. Form augmented vector

ey =

ys

yp

�=

2

6664

f (q1)...

f (qM)yp

3

7775

Let

and

to make prediction yp that maximizes likelihood of observed

eR =

"R rrT 1

#

=

"M ⇥ M M ⇥ 11 ⇥ M 1 ⇥ 1

#

r(q) =

2

64cov[Z (q1), Z (q)]

.

.

.

cov[Z (qM), Z (q)]

3

75 =

2

64R(q1

, q).

.

.

R(qM, q)

3

75

Gaussian Process PredictionLog-Likelihood:

Last term:

Note: Form partitioned matrix

Note:

=

" eR-111

eR-112

eR-121

eR-122

#

eR-1 =

"R-1 + R-1r

�1 - rTR-1r

�-1 rTR-1 -R-1r�1 - rTR-1r

�-1

-�1 - rTR-1r

�-1 rTR-1�1 - rTR-1r

�-1

#

` = -M2

ln(2⇡)-M2

ln e�2 -12

ln |eR|- 12e�2 [ey - �01]T eR-1 [ey - �01]

-12e�2

"ys - �01yp - �0

#T "R rrT 1

#-1 "ys - �01yp - �0

#

ys - �01yp - �0

�T

=⇥f (q1)- �0, ... , f (qM)- �0 | yp - �0

⇤

Gaussian Process PredictionThen

Thus@`

@yp= 0

Best Unbiased Estimate for Mean:

"ys - �01yp - �0

#T " eR11 eR12

eR21 eR22

#-1 "ys - �01yp - �0

#

=h(ys - �01)T eR-1

11 + (yp - �0)eR-121 (ys - �01)T eR-1

12 + (yp - �0)eR-122

i "ys - �01yp - �0

#

= F (ys)+ (yp -�0)eR-121 (ys -�01)+ (ys -�01)T eR-1

12 (yp -�0)+ (yp -�0)2eR-1

22

= F (ys) +

-2rTR-1(ys - �01)

1 - rTR-1r(yp - �0) +

11 - rTR-1r

(yp - �0)

�

) -11 - rTR-1r

(yp - �0) +rTR-1(ys - �01)

1 - rTR-1r= 0

yp(q) = E[fs(q,�0)] = �0 + rT (q)R-1(ys - �01)

Gaussian Process PredictionBest Unbiased Estimate for Mean:

yp(q) = E[fs(q,�0)] = �0 + rT (q)R-1(ys - �01)

Note:

r(q) =

2

64

R(q1, q)...

R(qM , q)

3

75 ) r(qj) =

2

64

R(q1, qj)...

R(qM , qj)

3

75

Then

rTR-1[ys - �01] =⇥R(q1, qj), ... , R(qM , qj)

⇤2

4 R-1

3

5

2

64

y1 - �0...

yM - �0

3

75

= yj - �0

so yp(qj) = yj

Gaussian Process Prediction

Variance:

where

results from the condition

�2 =1M

[ys - �0(✓,�)1]T R-1 [ys - �0(✓,�)1]

var[fs(q,�0)] = �21 - rTR-1r +

(rTR-11 - 1)2

1TR-11

�

@`

@�2 = 0

References:• C. Rasmussen and C. Williams, Gaussian Processes for Machine Learning, MIT Press,

2006

• A.I.J. Forrester, A.Sóbester and A.J. Keane, Engineering Design via Surrogate Modelling: A Practical Guide, Progress in Astronautics and Aeronautics Volume 26, John Wiley and Sons, Chickester, UK, 2007

Example: Scaled Branin Function

Function:

where

and

• Functionmovedup80andflipped

• Usedprimarilytotestoptimizationroutines• 3uniquemaxima

Example: Scaled Branin Function• N=1000randomlyselectedpoints usedtoconstructsurrogatemodel

• m=100randomlyselectedtestpointsusedtoassessaccuracy

• MATLABfitrgp andpredictfunctions• Maximumlikelihoodestimateofhyperparameters

• RMSerrorovermpredictions=0.0460

Example: Scaled Branin Function

1-D Function:

• Sameconstantsusedasbefore

• Uniquemaximum

• MATLABprovides99%predictionintervals

• Addingnewdatatightenspredictionintervals

• Regression,sonotexactatdatapoints

Surrogate Models – Grid Choice

Example:

−1 −0.5 0 0.5 1−0.2

0

0.2

0.4

0.6

0.8

1

1.2

q

f(q)

−1 −0.5 0 0.5 10

0.2

0.4

0.6

0.8

1

q

f(q)

−1 −0.5 0 0.5 1−80

−60

−40

−20

0

20

q

f(q)

−1 −0.5 0 0.5 10

0.2

0.4

0.6

0.8

1

q

f(q)

Consider the Runge function f (q) = 1

1+25q2

with points

qj = -1 + (j - 1)2M

, j = 1, ... , M qj = - cos

⇡(j - 1)

M - 1

, j = 1, ... , M

Sparse Grid TechniquesTensored Grids: Exponential growth as a function of dimension

Sparse Grids: Same accuracy with significantly reduce number of points

R

0 1

1 x y

2 x

2xy y

2

3 x

3x

2y xy

2y

3

4 x

4x

3y x

2y

2xy

3y

4

Motivation: Do not need full set of points to achieve same degree of accuracy

Sparse Grid TechniquesTensored Grids: Exponential growth Sparse Grids: Same accuracy

p R` Sparse Grid R Tensored Grid R = (R`)p

2 9 29 81

5 9 241 59,049

10 9 1581 > 3⇥ 109

50 9 171,901 > 5⇥ 1047

100 9 1,353,801 > 2⇥ 1095

Reduced-order ModelsBased on Projections: e.g., Evolution model:

Full-Order Approximate Solution

satisfies

Goal: Construct reduced-order solution Note: • Basis construction crux of method.

• Eigenfunctions

• Proper Orthogonal Decomposition (POD)

Reduced-order ModelsExample: Eigenfunctions for heat equation

Weak Formulation:

Reduced-Order Basis:

Reduced-Order Model:

satisfying Reduced-Order System:

Reduced-order ModelsProper Orthogonal Decomposition (POD): •Based on snapshots

•Set typically highly redundant;

•POD extracts coherent structure having largest mean square projection onto set of observations.

•Related to Karhunen-Loeve expansion and singular value decomposition.

Strategy: Seek basis function of the form


Reduced-order ModelsStrategy: Seek basis function of the form


Note: R is positive, self-adjoint operator

Equivalent Problem: Find largest eigenvalue of

or

surrogate and reduced-order models · surrogate and reduced-order models problem: difficult to...

Documents