Over-parameterisation and model reduction Neil Crout Environmental Science School of Biosciences University of Nottingham


Post on 17-Dec-2015


Page 1

Over-parameterisation and model reduction

Neil Crout

Environmental Science

School of Biosciences

University of Nottingham

Page 2

Co-workers

Davide Tarsitano, Glen Cox, James Gibbons, Andy Wood, Jim Craigon, Steve Ramsden, Yan Jiao, Tim Reid

With thanks to BBSRC and Leverhulme

Page 3

Motivation

Page 4

Over-parameterisation

Increase the complexity of the model to improve agreement with observations

Can continue this until ALL the variation in the data is accounted for
– including the noise!

Models can become over-parameterised

Page 5

Over-parameterisation

Judgements have to be made about the level of complexity in a model (parsimony)

In a statistical context methods for model selection exist

Key feature: comparing models
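As an illustration of what "comparing models" can mean in a statistical setting, here is a minimal sketch using the Akaike Information Criterion in its least-squares form, AIC = n ln(RSS/n) + 2k. The observations, the two sets of predictions and the parameter counts are invented for the example and do not come from the talk; Gaussian errors are assumed.

```python
import math

def rss(observed, predicted):
    """Residual sum of squares between observations and predictions."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))

def aic_least_squares(rss_value, n_obs, n_params):
    """AIC for a least-squares fit with Gaussian errors (constant terms dropped)."""
    return n_obs * math.log(rss_value / n_obs) + 2 * n_params

# Invented data: the complex model fits a little better but uses more parameters
observed = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0]
simple_pred = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]    # assumed 2 fitted parameters
complex_pred = [1.1, 1.9, 3.1, 4.0, 5.0, 6.1]   # assumed 5 fitted parameters

n = len(observed)
aic_simple = aic_least_squares(rss(observed, simple_pred), n, 2)
aic_complex = aic_least_squares(rss(observed, complex_pred), n, 5)

# Lower AIC is preferred: the better fit does not pay for the extra parameters here
preferred = "simple" if aic_simple < aic_complex else "complex"
```

The point of the sketch is only that a closer fit is not enough on its own; the comparison needs at least two candidate models, which is the difficulty the following slides address.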

Page 6

So what!

Our models are not statistical

Mechanistic
No fitting involved
Predicting lots of different things (multivariate)
Have to account for lots of different cases
Did we mention they're mechanistic?
– Newton
– Thermodynamics
– etc., all on our side

Page 7

Overparameterisation and mechanistic models?

Mechanistically inspired, but empirically implemented
So normal statistical rules of business apply
Model parameters are often not formally fitted
– worse, they may be 'tuned'
– might be measured (but model structure may not be consistent with real value)

Often (maybe always) models are implicitly fitted
– comparisons are made with the real world
– model changes
– development iterates

But normal rules of business apply
– greater complexity may not equal greater realism

Page 8

Overparameterisation and mechanistic models?

Mechanistic models have the potential for over-parameterisation

This is a source of uncertainty

Perhaps of equal importance
Mechanistic models are cited as tools to test understanding
For this we do need to know whether a model's formulation really can be supported by observations

Page 9

Model Reduction – Why?

Judgements have to be made about the level of complexity in a model

Mostly this is subjective
– supported by tools such as sensitivity analysis etc.
– sometimes by consideration of alternative model formulations
– but this is generally difficult….

Want some statistical support for model selection
But that requires a set of alternative models to compare

Page 10

Model Reduction – Why?

Want to assess whether a model has the most appropriate level of detail
We can get some idea on this by comparing models of the same system which have different levels of detail
But, we don't have lots of different models
So we consider reducing the model we have to create alternative model formulations which can be compared

Comparison is restricted
– reduced models are drawn from the same source
– but hopefully better than no comparison at all

Page 11

Some illustrations

Page 12

Example: radiocaesium uptake

Page 13

'Absalom' radiocaesium uptake model

[Figure: schematic of the model. Labels: NH4, % clay, % C, K+, pH, θclay, θhumus, Kx soil, M_camg, CEChumus, CECclay, Kx humus (2), mK (1), Kdclay (1), Kdhumus, RIPclay (2), CF (2), Kdl, Kdr, D factor, Cssol, Cssoil, Csplant.]

Page 14

Model Reduction – How?

Start with a 'full' model of a system
Reduce it (systematically, automatically)
Producing a 'set' of alternative model formulations for the same system

Reduction?
– Inter-connected nature of typical models means we can't simply leave things out
– We replace variables with a constant
– e.g. the mean or median that the variable attains in the full model
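A minimal sketch of how replacement constants might be obtained: run the full model once, record each candidate variable, and take its mean or median. The toy model, its variable names (`soil_water`, `growth`) and its coefficients are hypothetical and stand in for whatever full model is being reduced.

```python
import statistics

def run_full_model(rainfall):
    """Hypothetical two-variable model: returns the trajectory of each variable."""
    soil_water, growth = [], []
    store = 50.0
    for rain in rainfall:
        store = 0.9 * store + rain      # simple leaky-bucket water store
        soil_water.append(store)
        growth.append(0.02 * store)     # growth proportional to stored water
    return {"soil_water": soil_water, "growth": growth}

rainfall = [0.0, 5.0, 10.0, 0.0, 2.0, 8.0]   # invented driving data
trajectories = run_full_model(rainfall)

# Candidate replacement constants: mean and median attained in the full run
replacement_constants = {
    name: {"mean": statistics.mean(values), "median": statistics.median(values)}
    for name, values in trajectories.items()
}
```

Either constant can then stand in for the variable wherever other parts of the model read it, which keeps the rest of the model intact.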

Page 15

Reduction by variable replacement

Suppose the full model has a variable $V_i$, which is a function of the model variables, inputs, and parameters:

$$V_i = f_i(\mathbf{V}, \mathbf{I}, \boldsymbol{\alpha})$$

Arrange so that the model has

$$V_i = \delta_i R_i + (1 - \delta_i)\, f_i(\mathbf{V}, \mathbf{I}, \boldsymbol{\alpha})$$

where $R_i$ is the replacement constant and the switch, written $\delta_i$ here, is either 0 (no replacement) or 1 (replaced)

Systematically reducing the model then involves setting the values of the vector $\boldsymbol{\delta}$
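The switch between a variable's process equation and its replacement constant can be sketched as a small helper. The names `delta_i`, `replacement_i` and the process function `uptake_rate` are assumptions made for this illustration, not names from the talk.

```python
def reduced_value(delta_i, replacement_i, f_i, *args):
    """Return the constant if the variable is replaced, else evaluate its equation.

    Equivalent to delta_i * R_i + (1 - delta_i) * f_i(...), with delta_i in {0, 1},
    but avoids evaluating f_i when the variable is replaced.
    """
    if delta_i not in (0, 1):
        raise ValueError("delta_i must be 0 (no replacement) or 1 (replaced)")
    return replacement_i if delta_i == 1 else f_i(*args)

def uptake_rate(soil_conc, transfer_coeff):
    """Hypothetical process equation for one model variable."""
    return transfer_coeff * soil_conc

full = reduced_value(0, 0.35, uptake_rate, 2.0, 0.25)      # uses the equation
reduced = reduced_value(1, 0.35, uptake_rate, 2.0, 0.25)   # uses the constant
```

Writing every candidate variable through one helper like this means a whole model configuration is described by a single vector of 0s and 1s.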

Page 16

Search the model replacement space

Search the combinations of possible models (i.e. varying the vector of replacement switches)
– exhaustive if computationally feasible
– stochastic search (e.g. MCMC) can be more efficient
– discrete space

Calculate a posterior probability for each model configuration

This is based on comparison with observations
– So this is a data-led approach
– Data has to be representative for analysis to be representative
– Certainly need to be aware of the data's limitations
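A minimal sketch of the exhaustive case: enumerate every 0/1 replacement vector, score each configuration against observations, and normalise to a posterior probability. The talk does not state the likelihood or prior used, so a Gaussian error model with known standard deviation and a uniform prior over configurations are assumed here; the toy model, its replacement constants and the data are invented.

```python
import itertools
import math

def predict(x, delta):
    """Hypothetical model with two replaceable terms (delta[i] == 1 means replaced)."""
    a = 2.0 * x if delta[0] == 0 else 5.0              # assumed constant for term a
    b = 0.01 * math.sin(x) if delta[1] == 0 else 0.0   # assumed constant for term b
    return a + b

xs = [1.0, 2.0, 3.0, 4.0]
observations = [2.1, 3.9, 6.0, 8.1]   # invented data
sigma = 0.2                           # assumed known observation error

def log_likelihood(delta):
    """Gaussian log-likelihood up to a constant shared by all configurations."""
    return sum(
        -0.5 * ((obs - predict(x, delta)) / sigma) ** 2
        for x, obs in zip(xs, observations)
    )

# Exhaustive search: all 2**n replacement vectors
configs = list(itertools.product((0, 1), repeat=2))
log_like = {delta: log_likelihood(delta) for delta in configs}

# Uniform prior: posterior proportional to likelihood (subtract the peak for stability)
peak = max(log_like.values())
weights = {delta: math.exp(ll - peak) for delta, ll in log_like.items()}
total = sum(weights.values())
posterior = {delta: w / total for delta, w in weights.items()}
ranked = sorted(posterior, key=posterior.get, reverse=True)
```

In this toy case, replacing the dominant term is heavily penalised while replacing the negligible term costs almost nothing, which is the kind of ranking the search is meant to reveal.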

Page 17

10 candidate variables – $2^{10}$ (1,024) combinations

[Figure: the Absalom model schematic shown on Page 13.]

Page 18

Posterior probability for top 10 models

[Figure: bar chart of marginal posterior probability (log scale, 0.00001 to 1) against model rank (1 to 10), with the full model marked.]

Page 19

Full model (as published)

[Figure: the Absalom model schematic shown on Page 13.]

Page 20

this one works better…

[Figure: the Absalom model schematic shown on Page 13, for the preferred reduced model.]

Page 21

Better?

Reduced uncertainty, better predicting models
– Tested with independent data

Requiring fewer inputs
– Specifically soil pH
– Model is used spatially, so this is a good thing operationally

Greater confidence that the model is not over-parameterised….?

Diagnostic to guide/support model development

Page 22

Methane Emissions from Wetlands

Page 23

Methane emission from wetlands

Page 24

Top 'performing' models

[Figure: bar chart of marginal proposal model probability (0 to 0.5) against model rank (1 to 10), with the full model marked.]

Page 25

Mechanistic Interpretation

Model diagnostics – present results as probability of replacing specific variables

Summing over the model combinations explored
– seasonal transfer of plant matter to soil organic matter (100%)
– reducing methane oxidation from Michaelis-Menten to first order (39%)
– temperature dependency of methane oxidation (15%)
– seasonal dependency of plant growth (99%)
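The per-variable diagnostic can be sketched as a marginal sum: for each candidate variable, add up the posterior probability of every explored configuration in which that variable is replaced. The posterior values below are invented and the three-variable setup is hypothetical.

```python
def replacement_probabilities(posterior):
    """Sum posterior mass over all configurations in which each variable is replaced."""
    n_vars = len(next(iter(posterior)))
    return [
        sum(p for delta, p in posterior.items() if delta[i] == 1)
        for i in range(n_vars)
    ]

# Invented posterior over explored configurations of three candidate variables
posterior = {
    (1, 0, 0): 0.50,
    (1, 1, 0): 0.35,
    (1, 0, 1): 0.10,
    (0, 0, 0): 0.05,
}
probs = replacement_probabilities(posterior)
```

A value near 1 suggests the data do not support the detail in that variable's equation; a value near 0 suggests the detail is needed.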

Page 26

C & N dynamics in the arctic

15N applied to field plots

3 years of subsequent measurement

used to parameterise a model based on MBL-GEM

Page 27

C & N dynamics in the arctic

[Figure: model structure diagram. Sections: CARBON, NITROGEN, ACCLIMATION. Pools: Veg C, Detritus C, CO2, Veg N, Detritus N, Inorganic Nitrogen, UpEffort C, UpEffort N. Fluxes: Uptake C/Veg Resp, Micro Resp, Litter C, Litter N, Uptake N/Release N, Min N/Immob N, Input N, Leaching N. Other labels: A, QPT/QVRT, QMRT, QNRT/QVUT.]

Page 28

14 candidate variables ($2^{14}$, about 16k combinations)

[Figure: marginal posterior probability (log scale, 0.000001 to 1) against model ID (1 to 4049), with the full model marked.]

Page 29

Ensemble prediction over set of models

[Figure: system net C balance (g m-2, -200 to 100) against time (0 to 50 years).]
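One way to form an ensemble prediction over a set of models is a posterior-weighted average, in the spirit of Bayesian Model Averaging. This is a minimal sketch under that assumption: the model names, weights and predicted values are invented, and the spread reported is the between-model spread only.

```python
import math

def ensemble_prediction(predictions, posterior):
    """Posterior-weighted mean and between-model standard deviation at each time step."""
    models = list(posterior)
    n_steps = len(predictions[models[0]])
    means, spreads = [], []
    for t in range(n_steps):
        mean_t = sum(posterior[m] * predictions[m][t] for m in models)
        var_t = sum(posterior[m] * (predictions[m][t] - mean_t) ** 2 for m in models)
        means.append(mean_t)
        spreads.append(math.sqrt(var_t))
    return means, spreads

# Invented net carbon balance predictions (g m-2) from three model configurations
predictions = {
    "full": [0.0, -10.0, -20.0],
    "reduced_a": [0.0, -20.0, -40.0],
    "reduced_b": [0.0, 10.0, 20.0],
}
posterior = {"full": 0.5, "reduced_a": 0.25, "reduced_b": 0.25}   # must sum to 1

means, spreads = ensemble_prediction(predictions, posterior)
```

The widening spread over time in this toy case shows how disagreement between plausible model structures translates into prediction uncertainty.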

Page 30

Sirius – Wheat Crop Growth Model

Page 31

[Figure: structure diagram of the Sirius model. Legible sub-model labels: Soil moisture, Water balance, Percolation, Equilibrium, Soil evap, Water uptake, GRAIN, Dry matter, PHENO, Vernalisation, Roots, CANOPY, EVAPO, PENMAN, Inputs (Temp Max, Temp Min, Rain, Radiation, VARIETY, Soil) and Results.]

Page 32

Top 75% of the distribution/ensemble

[Figure: posterior model probability (PMP ×10^4, 0 to 7) against model rank (1 to 5001); annotated values 1139 and 1052.]
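One possible reading of "top 75% of the distribution" is the set of highest-ranked models whose posterior probabilities together reach 0.75. The sketch below assumes that reading; the model names and probabilities are invented.

```python
def top_fraction(posterior, fraction=0.75):
    """Highest-ranked models whose cumulative posterior first reaches `fraction`."""
    selected, cumulative = [], 0.0
    for model in sorted(posterior, key=posterior.get, reverse=True):
        if cumulative >= fraction:
            break
        selected.append(model)
        cumulative += posterior[model]
    return selected, cumulative

# Invented posterior over four model configurations
posterior = {"m1": 0.4, "m2": 0.3, "m3": 0.2, "m4": 0.1}
selected, covered = top_fraction(posterior)
```

Restricting later analysis to such a subset keeps the ensemble manageable while discarding only configurations with little support.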

Page 33

Interpretation....

Crop N physiology: 50%
Vernalisation: 99%
N-mineralisation temp/moisture adjustment: ~99%
Canopy temperature: 96%
Diurnal adjustment of thermal time: 99%
Nitrogen leaching: 50%
40 other variables: <<50%

Page 34

Other applications…

Applying the same/similar approach to other models, e.g.

Marine Ecology Models (carbon cycling)
JULES – land surface scheme for GCM
FARM-ADAPT – very large farm management model
– Linear programming based model
– V. large number of variables (10 000s)
– More challenging
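When the number of candidate variables is large, the replacement space cannot be enumerated and a stochastic search is the practical option. The following is a minimal sketch of a single-flip Metropolis sampler over 0/1 replacement vectors; the function names and the toy scoring function are assumptions for illustration, not the scheme used for any of the models named above.

```python
import math
import random

def metropolis_search(log_posterior, n_vars, n_steps, seed=0):
    """Single-flip Metropolis sampler over 0/1 replacement vectors."""
    rng = random.Random(seed)
    current = (0,) * n_vars                  # start from the full model
    current_lp = log_posterior(current)
    visits = {}
    for _ in range(n_steps):
        i = rng.randrange(n_vars)
        proposal = current[:i] + (1 - current[i],) + current[i + 1:]
        proposal_lp = log_posterior(proposal)
        # Symmetric proposal: accept with probability min(1, p_new / p_old)
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        visits[current] = visits.get(current, 0) + 1
    return visits

def toy_log_posterior(delta):
    """Invented score: variables 0 and 1 are needed, the rest are redundant."""
    score = 0.0
    for i, replaced in enumerate(delta):
        if replaced:
            score += -8.0 if i in (0, 1) else 1.0
    return score

visits = metropolis_search(toy_log_posterior, n_vars=6, n_steps=5000)
most_visited = max(visits, key=visits.get)
```

Visit counts approximate posterior probabilities, so the most visited configurations play the role of the top-ranked models in an exhaustive search.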

Page 35

Some conclusions

Whether a model's formulation is appropriate is uncertain
Model reduction provides a way to test (challenge) model formulation (perhaps a brutal test)
May give us some insurance against over-parameterisation (reduce uncertainty)
Does test the understanding built into our models
All models investigated so far can be reduced by variable replacement
Most reduced models have some advantage

Page 36

Useful references?

NMJ Crout, D Tarsitano, AT Wood. Is my model too complex? Evaluating model formulation using model reduction. Environmental Modelling & Software, in press.

JM Gibbons, GM Cox, ATA Wood, J Craigon, SJ Ramsden, D Tarsitano, NMJ Crout (2008). Applying Bayesian Model Averaging to mechanistic models: an example and comparison of methods. Environmental Modelling & Software, 23:973-985.

Cox GM, Gibbons JM, Wood ATA, Craigon J, Ramsden SJ, Crout NMJ (2006). Towards the systematic simplification of mechanistic models. Ecological Modelling 198:240-246.

Bernhardt, K. 2008. Finding alternatives and reduced formulations for process based models. Evolutionary Computation 16:1-16. Alternative approach.

Asgharbeygi, N., Langley, P., Bay, S., Arrigo, 2006. Inductive revision of quantitative process models. Ecol. Mod., 194:70-79. Alternative approach to model reduction, albeit without a statistical framework.

Brooks RJ, Tobias AM (1996) Choosing the best model: level of detail, complexity, & model... Mathl. Comput. Mod. 24:1-14. Interesting article on model utility.

Anderson, T.R., 2005. Plankton functional type modelling: running before we can walk? J. Plankton Research 27:1073-1081. Nice discussion of model complexity in context of marine systems. Quite controversial, replies to the replies to the replies.

Jakeman, A.J., Letcher, R.A., Norton, J.P., 2006. Ten iterative steps in development and evaluation of environmental models. Environmental Modelling & Software 21:602-614. Very nice discussion on what is good practice in model development.

Myung, J., Pitt, M.A., 2002. When a good fit can be bad. Trends Cogn. Sci., 6:421-425. Good introduction to model selection criteria.

Burnham, K.P., Anderson, D.R., 2002. Model Selection and Multimodel Inference: A Practical Information-Theoretical Approach. 2nd ed. New York: Springer-Verlag. Ultimate reference, but not without critics.