Project leader’s report MUCM Advisory Panel Meeting, November 2006

www.mucm.group.shef.ac.uk

Outline

Background: uncertainty in models
MUCM overview
Putting the structures in place
Specific progress

Background: Uncertainty in Models

Computer models

In almost all fields of science, technology, industry and policy making, people use mechanistic models to describe complex real-world processes
  For understanding, prediction, control
There is a growing realisation of the importance of uncertainty in model predictions
  Can we trust them?
  Without any quantification of output uncertainty, it’s easy to dismiss them

Sources of uncertainty

A computer model takes inputs x and produces outputs y = f(x)
How might y differ from the true real-world value z that the model is supposed to predict?
  Error in inputs x
    Initial values, forcing inputs, model parameters
  Error in model structure or solution
    Wrong, inaccurate or incomplete science
    Bugs, solution errors

Quantifying uncertainty

The ideal is to provide a probability distribution p(z) for the true real-world value
  The centre of the distribution is a best estimate
  Its spread shows how much uncertainty about z is induced by uncertainties on the last slide
How do we get this?
  Input uncertainty: characterise p(x), propagate through to p(y)
  Structural uncertainty: characterise p(z-y)

Example: UK carbon flux in 2000

Vegetation model predicts carbon exchange from each of 700 pixels over England & Wales
  Principal output is Net Biosphere Production
Accounting for uncertainty in inputs
  Soil properties
  Properties of different types of vegetation
Aggregated to England & Wales total
  Allowing for correlations
  Estimate 7.55 Mt C
  Std deviation 0.57 Mt C

Maps

Sensitivity analysis

Map shows proportion of overall uncertainty in each pixel that is due to uncertainty in the vegetation parameters
  As opposed to soil parameters
Contribution of vegetation uncertainty is largest in grasslands/moorlands
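The "proportion of uncertainty due to one group of inputs" idea can be illustrated with a generic variance-based estimator. This is only a sketch on a made-up one-pixel model, not the method used to produce the map: the function `model`, the standard normal input distributions and the pick-freeze estimator are all assumptions introduced for illustration.

```python
import numpy as np

def model(veg, soil):
    """Hypothetical stand-in for one pixel's output (not the vegetation model)."""
    return 2.0 * veg + soil + 0.5 * veg * soil

rng = np.random.default_rng(1)
n = 200_000

# Assumed input uncertainty: independent standard normal "vegetation" and "soil" inputs.
veg = rng.normal(size=n)
soil = rng.normal(size=n)
soil_resampled = rng.normal(size=n)  # fresh soil draws, same vegetation draws

y = model(veg, soil)
y_same_veg = model(veg, soil_resampled)

# Pick-freeze estimate: Cov(y, y') estimates Var(E[y | veg]) when only veg is shared.
share_veg = np.cov(y, y_same_veg)[0, 1] / y.var(ddof=1)

# For this toy model the exact share is 4 / 5.25.
print(f"share of variance due to vegetation input: {share_veg:.3f}")
```

Each evaluation here is a model run, which is why this kind of analysis becomes expensive for slow simulators.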

England & Wales aggregate

PFT           Plug-in estimate (Mt C)   Mean (Mt C)   Variance ((Mt C)²)
Grass         5.28                      4.64          0.269
Crop          0.85                      0.45          0.034
Deciduous     2.13                      1.68          0.013
Evergreen     0.80                      0.78          0.001
Covariances                                           0.001
Total         9.06                      7.55          0.321
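As a quick arithmetic check, the table's totals can be tied back to the headline figures of 7.55 Mt C and 0.57 Mt C. The sketch below only re-adds numbers copied from the table; the variable names are illustrative.

```python
import math

# Posterior means (Mt C) per plant functional type, copied from the table.
means = {"Grass": 4.64, "Crop": 0.45, "Deciduous": 1.68, "Evergreen": 0.78}

# Total variance ((Mt C)^2) from the "Total" row, which includes the covariance term.
total_variance = 0.321

total_mean = sum(means.values())
total_sd = math.sqrt(total_variance)

print(f"total mean = {total_mean:.2f} Mt C")
print(f"total std deviation = {total_sd:.2f} Mt C")
```

The displayed variance entries add up to 0.318 rather than 0.321; the small gap is presumably rounding in the individual rows, which is an assumption rather than something the slides state.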

Reducing uncertainty

To reduce uncertainty, get more information!
Informal – more/better science
  Tighten p(x) through improved understanding
  Tighten p(z-y) through improved modelling or programming
Formal – using real-world data
  Calibration – learn about model parameters
  Data assimilation – learn about the state variables
  Learn about structural error z-y
  Validation

Example: Nuclear accident

Radiation was released after an accident at the Tomsk-7 chemical plant in 1993

Data comprise measurements of the deposition of ruthenium 106 at 695 locations obtained by aerial survey after the release

The computer code is a simple Gaussian plume model for atmospheric dispersion

Two calibration parameters
  Total release of 106Ru (source term)
  Deposition velocity

Data

Calibration

A small sample (N=10 to 25) of the 695 data points was used to calibrate the model
Then the remaining observations were predicted and RMS prediction error computed
  On a log scale, error of 0.7 corresponds to a factor of 2

Sample size N          10     15     20     25
Best fit calibration   0.82   0.79   0.76   0.66
Bayesian calibration   0.49   0.41   0.37   0.38
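The "0.7 corresponds to a factor of 2" remark can be checked by exponentiating the errors. The sketch below assumes the errors are on a natural-log scale, which is what makes 0.7 map to roughly 2; the dictionary keys are illustrative labels.

```python
import math

# RMS prediction errors on the log scale, copied from the table (N = 10, 15, 20, 25).
rms_log_error = {
    "best fit": [0.82, 0.79, 0.76, 0.66],
    "bayesian": [0.49, 0.41, 0.37, 0.38],
}

# Assuming natural logs, exp(error) is the typical multiplicative prediction error.
factors = {
    method: [math.exp(e) for e in errors]
    for method, errors in rms_log_error.items()
}

print(f"exp(0.7) = {math.exp(0.7):.2f}")
for method, values in factors.items():
    print(method, [f"{v:.2f}" for v in values])
```

Under that assumption the best-fit errors correspond to factors of about 1.9 to 2.3, and the Bayesian ones to roughly 1.4 to 1.6.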

So far, so good, but

In principle, all this is straightforward
In practice, there are many technical difficulties
  Formulating uncertainty on inputs
    Elicitation of expert judgements
  Propagating input uncertainty
  Modelling structural error
  Anything involving observational data!
    The last two are intricately linked
  And computation

The problem of big models

Tasks like uncertainty propagation and calibration require us to run the model many times
Uncertainty propagation
  Implicitly, we need to run f(x) at all possible x
  Monte Carlo works by taking a sample of x from p(x)
  Typically needs thousands of model runs
Calibration
  Traditionally this is done by searching the x space for good fits to the data
This is impractical if the model takes more than a few seconds to run
  We need a more efficient technique
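A minimal sketch of Monte Carlo uncertainty propagation follows. The function `f`, the normal input distributions and the run count are hypothetical stand-ins; a real simulator would replace `f`, and every row of `x` would cost one model run.

```python
import numpy as np

def f(x):
    """Hypothetical cheap stand-in for the simulator: inputs (n, 2) -> outputs (n,)."""
    return np.sin(x[:, 0]) + 0.5 * x[:, 1] ** 2

rng = np.random.default_rng(0)
n_runs = 5000  # "thousands of model runs"

# Assumed p(x): two independent normal inputs with the given means and std deviations.
x = rng.normal(loc=[1.0, 0.0], scale=[0.2, 0.5], size=(n_runs, 2))

# Propagate: one model evaluation per sampled input vector gives a sample from p(y).
y = f(x)
y_mean = y.mean()
y_sd = y.std(ddof=1)

print(f"mean of y = {y_mean:.3f}, std deviation of y = {y_sd:.3f}")
```

With a simulator that takes even a few seconds per run, 5000 evaluations already amount to hours of computing, which is the motivation for a cheaper surrogate.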

Gaussian process representation

More efficient approach
  First work in early 1980s
Consider the code as an unknown function
  f(.) becomes a random process
  We represent it as a Gaussian process (GP)
Training runs
  Run model for sample of x values
  Condition GP on observed data
  Typically requires many fewer runs than MC
    And x values don’t need to be chosen randomly

Emulation

Analysis is completed by prior distributions for, and posterior estimation of, hyperparameters
The posterior distribution is known as an emulator of the computer code
  Posterior mean estimates what the code would produce for any untried x (prediction)
  With uncertainty about that prediction given by posterior variance
  Correctly reproduces training data
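The emulator's defining properties can be sketched in a few lines of NumPy. This is a simplified illustration, not the project's implementation: it assumes a zero-mean GP with a squared-exponential covariance and fixed hyperparameters (`variance`, `length`), whereas the slides describe priors on, and posterior estimation of, the hyperparameters. The simulator `f` is a hypothetical cheap stand-in.

```python
import numpy as np

def sq_exp_kernel(a, b, variance=1.0, length=0.3):
    """Squared-exponential covariance between 1-D input arrays a and b."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length) ** 2)

def emulate(x_train, y_train, x_new, jitter=1e-10):
    """Posterior mean and variance of a zero-mean GP conditioned on training runs."""
    K = sq_exp_kernel(x_train, x_train) + jitter * np.eye(len(x_train))
    Ks = sq_exp_kernel(x_new, x_train)
    mean = Ks @ np.linalg.solve(K, y_train)
    cov = sq_exp_kernel(x_new, x_new) - Ks @ np.linalg.solve(K, Ks.T)
    return mean, np.clip(np.diag(cov), 0.0, None)

def f(x):
    """Hypothetical stand-in for the expensive code."""
    return np.sin(2.0 * np.pi * x)

x_train = np.array([0.1, 0.4, 0.6, 0.9])   # training runs
y_train = f(x_train)
x_new = np.array([0.1, 0.25, 0.5])          # first point is a training input

mean, var = emulate(x_train, y_train, x_new)
print("posterior mean:", mean)
print("posterior variance:", var)
```

At the training input the posterior mean matches the code output and the variance is essentially zero; at the untried inputs the variance is positive, which is the emulator's statement of its own uncertainty.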

2 code runs

Consider one input and one output
Emulator estimate interpolates data
Emulator uncertainty grows between data points

3 code runs

Adding another point changes estimate and reduces uncertainty

5 code runs

And so on
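The progression from 2 to 3 to 5 code runs can be mimicked numerically. The sketch below is an assumption-laden illustration rather than a reconstruction of the slides' figures: it uses a zero-mean, unit-variance GP with a squared-exponential kernel, a fixed length scale of 0.3, and evenly spaced hypothetical designs on [0, 1]. With fixed hyperparameters the posterior variance depends only on where the runs are, not on the outputs, so no simulator is needed here.

```python
import numpy as np

def kernel(a, b, length=0.3):
    """Unit-variance squared-exponential covariance for 1-D inputs."""
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / length) ** 2)

def max_posterior_sd(x_train, grid):
    """Largest posterior standard deviation over the grid, given the run locations."""
    K = kernel(x_train, x_train) + 1e-10 * np.eye(len(x_train))
    Ks = kernel(grid, x_train)
    # Diagonal of Ks K^{-1} Ks^T, computed row by row.
    explained = np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
    var = np.clip(1.0 - explained, 0.0, None)
    return float(np.sqrt(var.max()))

grid = np.linspace(0.0, 1.0, 201)
designs = {
    2: [0.0, 1.0],
    3: [0.0, 0.5, 1.0],
    5: [0.0, 0.25, 0.5, 0.75, 1.0],
}
sds = {n: max_posterior_sd(np.array(x), grid) for n, x in designs.items()}

for n, sd in sds.items():
    print(f"{n} code runs: largest posterior sd = {sd:.3f}")
```

Because each design contains the previous one, the worst-case uncertainty can only shrink as runs are added.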

Then what?

Given enough training data points we can emulate any model accurately
  So that posterior variance is small “everywhere”
  Typically, feasible with orders of magnitude fewer model runs than traditional methods
Use the emulator to make inference about other things of interest
  Uncertainty analysis, sensitivity analysis, calibration, data assimilation, optimisation, …
Conceptually very straightforward in the Bayesian framework
  But of course can be technically hard

Research directions

Models with heterogeneous local behaviour
  Regions of input space with rapid response, jumps
High dimensional models
  Many inputs, outputs, data points
Dynamic models
  Data assimilation
Stochastic models
Relationship between models and reality
  Model/emulator validation
  Multiple models
Design of experiments
  Sequential design

MUCM Overview

MUCM in a nutshell

Managing Uncertainty in Complex Models
  Four-year research grant
    7 postdoctoral research assistants
    4 PhD studentships
  Started in June 2006
  Based in Sheffield and 4 other UK universities
Objective:
  To develop Bayesian model uncertainty methods into a robust technology …
    toolkit, UML specifications
  that is widely applicable across the spectrum of modelling applications
    case studies

Theme 1: High Dimensionality

Tackling problems associated with dimensionality of inputs, outputs, parameters, and data
WP 1.1 – Screening (PS)
  Identifying most important inputs/outputs
WP 1.2 – Sparsity and Projection (RA)
  Dimension reduction using modern computational techniques
WP 1.3 – Multiscale models (RA)
  Linking models and data at different resolutions
Theme leader: Dan Cornford

Theme 2: Using Observational Data

Tackling problems associated with model structural error to link models to field data
WP 2.1 – Linking Models to Reality (RA)
  Modelling structural error
WP 2.2 – Diagnostics and Validation (PS)
  Criticising our statistical representations
WP 2.3 – Calibration & Data Assimilation (RA)
  Extending calibration techniques, particularly to dynamic models
Theme leader: Michael Goldstein

Theme 3: Realising the Potential

Turning theory into reliable, widely applicable techniques across a wide range of models
WP 3.1 – Experimental Design (RA + PS)
  Designing input sets for running models, and planning observational studies
WP 3.2 – The Toolkit (RA + PS)
  Distilling experience with methods into robust tools, relaxing constraints
WP 3.3 – Case Studies (RA)
  Three substantial case studies
Theme leader: Peter Challenor

Organisation overview

Organisation by theme

Theme leaders: 1. Cornford   2. Goldstein   3. Challenor

WP    RA/PS            Investigators
1.1   Boukouvalas      Cornford (Challenor)
1.2   Maniyar          Cornford (Wynn)
1.3   Cumming          Goldstein (Rougier)
2.1   House            Goldstein (O’Hagan)
2.2   Bastos           O’Hagan (Rougier)
2.3   Bhattacharya     Oakley (Cornford)
3.1   Maruri-Aguilar   Wynn (Goldstein)
3.1   Youssef          Wynn (Oakley)
3.2   Gattiker         Challenor (O’Hagan, Cornford)
3.2   Stephenson       Challenor (Oakley)
3.3   Gosling          O’Hagan (Challenor)

O’Hagan

Organisation by committee

The whole Team meets twice a year
  Presentations, reports and planning
The Project Management Board meets four times a year
  Formal decision making, budgeting, personnel matters
The Advisory Panel meets with the investigators twice a year
  Providing external support and advice

The Team

Investigators
  Challenor, Cornford, Goldstein, Oakley, O’Hagan, Rougier, Wynn
Project manager
  Green
RAs
  Bhattacharya, Cumming, Gattiker, Gosling, House, Maniyar, Maruri-Aguilar
PSs
  Bastos, Boukouvalas, Stephenson, Youssef

The Board

Project Management Board is the primary project management body
  Tony O’Hagan (Sheffield, Chair)
  Dan Cornford (Aston)
  Peter Challenor (Southampton)
  Michael Goldstein (Durham)
  Henry Wynn (LSE)
Non-voting
  Jeremy Oakley (Sheffield)
  Jonty Rougier (Durham)
  Jo Green (Sheffield)

The Panel

Advisory Panel comprises modellers, model users and model uncertainty experts from a wide range of fields
Industry
  Bob Parish, Hilmi Kurt-Elli, Clive Bowman
Academia
  Ron Akehurst, Martin Dove, Keith Beven, Douglas Kell, Ian Woodward
Research institutions
  Richard Haylock, Andrea Saltelli, Andy Hart, David Higdon, Mat Collins

The Mentor

Peter Green (Bristol)
  Appointed by EPSRC
  Liaise between project team and EPSRC
  Advise team

Putting the Structures in Place

General

All RAs, PSs and Project Manager recruited
  Started at various times from 1 June to 1 October
  Need to replace Bhattacharya
Website, wiki, email lists, logo, templates created
  Reading list, glossary under development
Monthly reporting established
RAs set up reading club
Links established with related projects
  Particularly with SAMSI programme in USA

Project planning

First draft of rolling workplans
  Descriptions and objectives
  Detailed plans and milestones for 12 months ahead
    With month-by-month detail for 6 months
  Outline plans and milestones for remainder of project
Will be updated quarterly
Milestones and deliverables carefully monitored
Panel will receive plans from previous Board

Financial management

Handled at quarterly Board meetings
Phased budget plan created for each institution
RAs appointed initially for 3 years
  Fourth year funds retained in reserve

Contacts with Panel members

Introductory meetings held with most members
An RA has been assigned to each
  To develop understanding of the models and the modelling area
  To act as link between other team members and Panel member
Beginning to explore use of models
  Some models also sourced from other contacts

Specific progress 1

Emulator fitting
  Study of methods to estimate roughness parameters
  Acquisition of existing packages
Multiscale models
  Multiscale version of Daisyworld model created
Non-homogeneous models
  Voronoi tessellation method improved
  Paper in preparation

Specific progress 2

Design
  Study of aberration and relationship to kernel
  Paper in preparation
Dynamic models
  Basic theory of dynamic emulation developed
  Toy dynamic model created and emulated
  Paper in preparation
  Hydrological model acquired