collaborative science: give us your information, not your conclusions andy packard mechanical...

Collaborative Science: Give us your information, not your conclusions

Andy PackardMechanical Engineering

jointly with Michael Frenklach, Ryan Feeley and Trent RussiUniversity of California

Berkeley, CA

A workshop on the occasion of Keith Glover’s 60th birthday21 and 22 April 2006

Support from NSF grants: CTS-0113985 and CHE-0535542Support from CITRIS

Copyright 2006, Packard, Frenklach, Feeley, and, Russi. This work is licensed under the Creative Commons Attribution-ShareAlike License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/2.0/ or send a letter to Creative Commons, 559 Nathan Abbott Way, Stanford, California 94305, USA.

http://creativecommons.org/licenses/by-sa/2.0/

Its just Model Invalidation on seemingly hard problems

–Assess joint consistency (or not) of ≈100 measured observables and models

–Common parameter uncertainty (≈100 parameters) across models

–Each model high order (≈100 states)

–Each model nonlinear ODE– the nonlinearities matter

–Mathematically unstructured measurements/observables– Time between successive peaks

– Ratios of steady-state responses

– Ratio of time between successive peaks for an experiment and controlled variation

– …

– Expert, handcrafted “filtering and sensor fusion”

–User is a modestly skeptical science community. Their goal/dream:– bolster scientific panel decisions

– choose research directions

by making better use of their community’s research portfolio

Keith’s take on “Collaborative Science”

Insensitive to known unknowns, uncontrolled effects

Collaborative Science: Who does what?

Function approximation on Cubes (polynomial

and/or rational)

error assessment

Experiment Design

Sensitivity Analysis

on each model

(many) Laboratory Experiments

(many) Science Models

Prior Information

(many) Science researchers

Invalidation certificates (polynomial inequalities)

Nonlinear programming (polynomial constraints

and objective)

μ mindset

High performance scientific computing

Numerical analyst

Embarrassingly Parallelizable structure

scalability

Branch & Bound

accuracy

Consistency

Prediction

Relevance

questions(reformulate as

invalidation)

Revisit/think fundamentals

feedback

The result of this process

When we say Invalidation Proof, or Certificate, what do we really mean?

Binary tree (from Branch & Bound)– The tree partitions prior information into cubes

– At each leaf, you find a cube, and

• Array of Rational functions (one for each Model)

• Array of Approximation Error bounds (one for each approximation)

• Sum of Squares certificate proving emptiness of the array of constraints

Fairly appealing to user. Good tools to drill in and interactively confirm, are necessary (initially…)

The conclusion might be wrong.1. Error bound on function approximation wrong

– SOS emptiness proof is not necessarily valid

2. Experiment result wrong

3. Science model wrong

4. A-priori info on parameters wrong

Proof is wrong!

Collaborative Science Limit “science” to mean mechanism understanding through modeling and experimentation for the purpose of prediction

Applicability:–chemical kinetics modeling

–atmospheric chemistry modeling

–…

–systems biology

Goal of “collaboration”: quantify the joint information implicit in the community’s research portfolio.–Portfolio: diverse, and individually generated

–parametrized models which explain/govern behavior –observed facts about behavior

Result:–better decisions

–more reliance on “the facts”

–quicker path to answer relevant questions

focus onProcess models are complex, though physics based

governing equations are widely accepted

Uncertainty in process behavior exists, but much is known regarding “where” the uncertainty lies in the governing equations (uncertain parameters)

Numerical simulations of process, with uncertain parameters “fixed” to certain values, may be performed “reliably”

Processes are studied experimentally in labs

starting point

Methane Combustion: CH4 + 2 O2 CO2 + 2 H2O

Gaseous methane reaction models have grown in complexity over time with the aim of improving predictions.

(~1970): 15 elementary reactions with 12 species

(~1980): 75 elementary reactions with 25 species

(Now): 300+ elementary reactions, 50+ species. Used to predict heat release and concentrations over a wide scale, from work production to pollutant formation.–ODE model called “GRI-Mech 3.0”

–several releases, now static…–used as a common benchmark–Result of 0.5 alpha collab. science

Pathway diagram for methane combustion [Turns]

How did the GRI-Mech come about?

Each of the GRI-Mech ODE releases embody the work of many people, but not explicitly working together. How did the successful collaboration occur?

Informal mode?– assimilate conclusions of each paper sequentially– “read my paper”– “data is available on my website”

No. Didn’t/Doesn’t work – community tried it, but predictive capability of model did not reliably improve as more high-quality experiments were done.

– Papers tend to lump modeling and theory, experiments, analysis and convenience assumptions, leading to a concise text-based conclusion

– Conclusions are conditioned on these additional assumptions necessary to make the conclusion concise.

– Impossible to anonymously “collaborate" since the convenience assumptions are unique to each paper.

– Goals of one paper are often the convenience assumptions of another.– Difficult/impossible to trace the quality of a conclusion reached sequentially

across papers– Posted data is often the text-based conclusion in e-form, little additional

information

authors stake professional reputation on these…

but if really, really pressed, perhaps not these (perhaps doubt those of their colleagues as well)

Lessons LearnedChemical kinetics modeling is a form of

– high dimensional (mechanisms are complex),

– distributed (efforts of many, working separately)

system identification.

The effort of researchers yields complex, intertwined, factual assertions about the unfalsified values of the model parameters

– Handbook style of {parameter, nominal, range, reference} will not work

– Unrepeatable/undocumented data analysis can be as confounding as unrepeatable experiments (destructive too…)

– Each individual assertion is usually not illuminating in the problem’s natural coordinates. Concise individual conclusions are actually rare.

– Information-rich, “anonymous” collaboration is necessary

– Computers must do the heavy lifting

• Managing lists of assertions, reasoning and inference

– Useful role of journal paper: document methodology leading to assertion

Here, we take deterministic, worst-case view– Understand the impact (on all posed questions:

prediction, consistency, relevance) of the currently unfalsified parameter set.

Alternate model: Separate asserted facts from analysis

Two types of assertions: models and observed behavior – Assertion of models of physical processes (e.g., “if we knew the

parameter values, this parametrized mathematics would accurately model the process”)

– Assertion of measured outcomes of physical processes (e.g., “I performed experiment, and the process behaved as follows…”)

Together, these form constraints in "world"-parameter space of physical constants.

Analysis (global optimization) on the constraints– Check consistency of a collection of assertions

– Explore the information implied by the assertions

– …

– (old standby) Generate consistent parameter samples.

Collaborative Science is the open availability of these 3 components

Collaborative Science

Data Collaboration

“Data”

“…how well data are turned into knowledge depends on how they gathered, organized, managed, and exhibited—and those tasks are increasingly arduous as the data increase. … databases can be far more than repositories—they can serve as tools for creating new knowledge”

Is this really necessary?

2000 NRC Workshop on Bioinformatics: Converting Data to Knowledge

We think so, at least in chemical kinetics modeling.

Other fields have related views…

GRI-Mech: Successful Data CollaborationResult:

High quality, predictive Methane reaction model: 50+ Species/300+ Reactions

Based on:77 peer-reviewed, published Experiments/Measured Outcomes of ~25 groups

Infrastructure to use these did not exist– Grassroots effort of 4 groups– Decide on a common, “encompassing” list of species/reactions– Extract the information in each paper, not simply assimilate conclusions– Reverse-engineer assertions in light of the common reaction model

The rest was relatively “easy”– Optimization to get “best” fit single parameter vector– Validation (on ~120 other published results)

Features (www.me.berkeley.edu/gri_mech)– Only use "raw" scientific assertions - not the potentially erroneous conclusions

– “give me your information, not your conclusions…”– Treats the models/experiments as information, and combines them all.

Moving forward– With the assertions now in place, much more can be inferred…

eliminating the incompatible convenience assumptions

http://www.me.berkeley.edu/gri_mech

GRI DataSet (assertion set)

d2 u2 d77 u77

d1 u1

300+ Reactions,50+ Species

Process P1

ProcessP2

ProcessP77

The GRI-Mech (www.me.berkeley.edu/gri_mech) DataSet is collection of 77 experimental reports, consisting of models and ``raw'' measurement data, compiled/arranged towards obtaining a complete mechanism for CH4 + 2O2 → 2H2O + CO2 capable of accurately predicting pollutant formation. The DataSet consists of:• Reaction model: 53 chemical species, 325 reactions, depending on…• Unknown parameters (): 102 active parameters, essentially the various rate constants. • Prior Information: , each normalized parameter is presumed known to lie between -1 and 1.• Processes (Pj): 77 widely trusted, high-quality laboratory experiments, all involving methane combustion, but under different physical manifestations, and different conditions.

• Process Models (Mj): 77 0-d, 1-d and 2-d numerical PDE models, coupled with the common reaction model.• Measured Data (dj,uj) data and measurement uncertainty from 77 peer-reviewed papers reporting above experiments.

CH4 + 2O2

↓2H2O + CO2

•kth assertion associated with prior info:

•Assertions associated with jth dataset unit:

The prior information, models and measured data constitute assertions about possible parameter values.

100+ unknown parameters

Chemistry() Transport 1M1()



Research portfolio expressed as deterministic constraintsSuitable for analysis (generally optimization over these)

http://www.me.berkeley.edu/gri_mech

Questions to ask

Given– Prior info, – Models, measurements & uncertainty

The feasible set is implied

Consistency: Quantify a measure of consistency of the dataset

Prediction: Bound the range of another model , on

Explore feasible set:

sensitivity to u

the “dataset”

Inner/Outer Bounds

Example: Prediction: Bound the range of another model , on

Upper bound on maximum by showing infeasability (emptiness) of

Lower bound on maximum by evaluating at a feasible point, given

Outer bound

Inner bound

Given the GRI dataset, and , and an additional model, . Consider the prediction problem

Treat the outer bounds as functions of the experimental uncertainty level, .

Look at differential sensitivity of the prediction width to this level.Compute these for many random

0 0.05 0.1 0.15 0.20

1000

2000

3000

4000

5000

6000

7000

8000

9000

10000

# of

occ

urre

nces

Prediction Sensitivity

Some values of j, (i.e., a model, measured data, measurement uncertainty) are particularly uninformative in this manner.

“How important is my experimental contribution when considered as just part of a larger collection?” Others are relevant in a modest number of

cases.

A few seem to contribute almost always.

In isolation, none of these individual 4 constraints stands out as special.

Computational exercise: assess capability of assertions in predicting the outcome of an additional model, M0.

Method H: Use only the prior information ( 2 H) on parameters; gives the prediction interval

Method Q: Community “pools” prior information and assertions, yielding the consistent coordinate-aligned cube. The prediction interval for M0 uses this,

Method F: The prediction directly uses the raw model/data pairs from all assertions, as well as the prior information.

High price of low cost uncertainty description

d1 u1 Process P1

CH4 + 2O2

↓2H20 + C02

d0 u0 ?

ProcessP0

M0

M1

d77 u77

ProcessP77

M77

How much information is lost when resorting to method Q instead of F?

Define the “loss in using method Q''

No loss (LQ=0) if prediction by Q is as tight as that achieved by F.

Complete loss (LQ=1) occurs if prediction by Q is no better than method H (only using prior info). In such case, the experimental results are effectively wasted.

Method Q pays a significant price for its crude representation of the constraints.

Loss using consistent, coordinate-aligned cube

0 0.2 0.4 0.6 0.8 10

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

Loss (LQ

)

Fraction of Cases with Loss x

Frequency of Loss

In 70% of cases, the loss exceeds 0.7

0 20 40 60 80 1000

0.1

0.2

0.3

0.4

0.5

Parameter #

0 20 40 60 80 1000

0.1

0.2

0.3

0.4

0.5

Parameter #

0 20 40 60 800

0.1

0.2

0.3

0.4

0.5

Experiment #

Experiment #0 20 40 60 80

0

0.1

0.2

0.3

0.4

0.5

Consistency results for GRI-DataSet assertions

GRI-DataSet is consistent,

Nevertheless, the consistency measure is very sensitive (using multipliers from the dual form) to 2 particular experimental assertions, but not to the prior info.

The scientists involved rechecked calculations, and concluded that reporting errors had been made.

Both reports were updated -- one measurement value increased, one decreased -- exactly what the consistency analysis had suggested (without us informing them of that).

Sensitivity of the consistency measure to individual assertions is greatly reduced, and spread more evenly across data set.

PrIMe: Process Informatics Modeling (www.primekinetics.org)

Combustion impacts everything– Economies– Politics– Environment

Predictive capability leads to informed decisions and policymaking

PrIMe: A community activity aimed at the development of predictive reaction models for combustion

Challenge– to meet immediate needs for predictive reaction models in combustion

engineering, the petrochemical industry, and pharmaceuticals– build reaction models in a consistent and systematic way incorporating all

data and including all members of the scientific community

Details– partial support from NSF CyberInfrastructure program– UC Berkeley CITRIS-hosted project (ww.citris.berkeley.edu)– Kickoff meeting was yesterday in Berkeley

I have an idea of how to measure the elusive reaction between C14H7

and C3H3 forming C16H8 and CH2. What impact would such a measurement have on the three competing hypotheses concerning the nucleation of interstellar dust?

If the rate coefficient is established to within 3% accuracy, I will be able to discriminate between hypotheses A and B.

I do not think my experiment can attain better that 10% accuracy. What is the next best thing I can do experimentally to advance knowledge of this subject?

Measure the reaction between C10H7 and C3H2; I can then discriminate between hypotheses B and C.

Chemist to PrIMe

PrIMe to Chemist

Chemist to PrIMe

PrIMe to Chemist

Sometime in 2008…

What fueling rate produces peak output power while holding NOx yields within the EPA prescribed limits in a HCCI engine running GTLprescribed fuel #22 with design and operating parameters: xx, yy, ...…

Engineer to PrIMe

Sometime in 2010…

How much longer will there be an Antarctic ozone hole?

…

Policymaker to PrIMe

Sometime in 2020…

PrIMe to Engineer

PrIMe to Policymaker

PrIMe

Theoretician

Experimentalist

Physical Modeler

Scientific Computing

Numerical Analyst

elements

species

reactions

experiments

tools

numerical models

Contributors

Parameter IDtools

PrIMe

elements

species

reactions

experiments

numerical models

User: “Check joint consistency of these experiments/models”

Specified by user

Associated with experiments

Update referable analysis archive

Consistency Analysis

CA alg RFTA.03B

Some specification of the consistency analysis (type, and exact codes) for repeatability

Alliance for Cellular Signaling (AfCS) Similar origin to GRI Mech – a few people, frustrated by the uncoordinated, tunnel vision (deliberately leaving out interactions for simplicity sake) of the signaling community

– spearheaded by Gilman (UT Southwestern Medical Center)– saw the need for a large-scale examination/treatment of the problem

10 laboratories investigating basic questions in cell signaling– How complex is signal processing in cells?– What is the structure and dynamics of the network?– Can functional modules be defined?

Key Advantage of AfCS:– High quality data from single cell type– All findings/data available (www.signaling-gateway.org)

from Henry Bourne, UCSF “The collaboration itself is the biggest experiment of all. After all, the scientific culture of biology is traditionally very individualistic and it will be interesting to see if scientists can work as a large and

complex exploratory expedition.” (http://www.nature.com/nature/journal/v420/n6916/full/420600a.html)

Vision paper in Nature talks about socialistic aspects of science (http://www.nature.com/nature/journal/v420/n6916/full/nature01304.html)

Distributed system IDFocus is on high quality, community dataModels, analysis tools not emphasized

http://www.signaling-gateway.org/

http://www.nature.com/nature/journal/v420/n6916/full/420600a.html

http://www.nature.com/nature/journal/v420/n6916/full/nature01304.html

Calcium Signaling Application

Together with AfCS scientists, we extracted key, relevant features of calcium response to create 18 experimental assertions– Rise time, peak value, fall time

– 6 different stimuli levels

Published models constitute various model assertions– Goldbeter, Proc. Natl. Acad Sci. 1990

– Wiesner, American J. Physiology. 1996

– Lemon, J. Theor. Biology, 2003

Models are ODEs, each derived from a hypothesized network

Calcium Signaling Application

Results (18 assertions, plus prior info)– Goldbeter, 6 states, 20 parameters, invalidated 30 minutes

– Wiesner, 8 states, 27 parameters

• 10 node “machine”

• Invalidated in 2 days

– Lemon, 8 states, 34 parameters

• Feasible points found in ~8 hours

• New data led to invalidation

Conclusion: likely that more proteins and accompanying interactions are necessary to mathematically describe the signaling pathway.

These tools (eg., model invalidation, model-directed experimentation) were not part of the original AfCS mission, but the alliance is acquiring an “appreciation” of modeling and verification.

Smaller problems, but relatively much harder than the chemistry analysis

How are we computing? Invalidation Certificates

Consider invalidating the constraints (prior info, and N dataset units)

The invalidation certificate is a binary tree, with L leaves. At the i’th leaf – coordinate-aligned cube

– Polynomial/rational functions (“surrogate models”) & error bounds, which satisfy

– sum of squares certificate proving the emptiness of

Moreover Caveat: with each Mj relatively complex, these error bounds are generally heuristic,

implicitly assuming regularity in Mj

How are we computing? Invalidation Certificates

Why do emptiness proofs on the algebraic models?

Easier. The original problem was

Could derive invalidation certificates directly for the ODEs, in principle– ODE reachability analysis using barrier (Lyapunov) functions

• eg., ODE solution cannot get within of for any value of

– Use sum of squares certificates to bound reachability

– (N&) Sufficient conditions using semidefinite programming

– For the methane/transport models, the SDPs would be almost unimaginably large

– Perhaps a fresh look could reveal a new approach…

In its simplest form, think of Mj(ρ) as the response at a fixed time, of an ODE model (with parameters ρ) from a

fixed initial condition

Vinter, Prajna, Papachristodoulou, Doyle, Allgöwer,…

Error bounds: pragmatic issues

Recall, at the i’th leaf – coordinate-aligned cube– collection of surrogate models and error bounds, which satisfy

Error bounds are estimated statistically. They are more likely “reliable” if M is well-behaved. So, through:

– experience, and– domain-specific knowledge

the scientist is responsible to design/select experiments/features that– are measurable in the lab– feature model is well-behaved over the parameter space, and– show sensitivity to some coordinates of the parameter space

Random experimental investigations could lead nowhere, and even break the analysis… therefore…

Prudent experiment selection is critical to success

Summary: How are we computing?

Transforming real models to polynomial/rational models– Large-scale computer “experimentation” on

• Random sampling and sensitivity calculations to determine active parameters

• Latin Hypercube experiment design on active parameter cube

– Polynomial or rational fit

– Assess residuals, account for fit error

Assertions become polynomial/rational inequality constraints

Most analysis is optimization subject to these constraints– S-procedure, sum-of-squares (emptiness proofs, outer bounds)

– Constrained nonlinear optimization for inner bounds

– Branch & Bound to eliminate ambiguity due to fit errors

– Overall, straightforward and brute force, parallelizes rather easily

Collaborative Science: Conclusions

GRI Mech, AfCS and PrIMe are domain specific examples

Requires–Data sharing/contributions

–Model sharing/contributions

–Math tools sharing/contributions

Benefits–Scalable distributed version of the scientific method

–Roadmap to reliable prediction

–Information transfer between disciplines and scales

Present challenges–Community involvement and participation

–Privacy versus Open/CommunityAnalyzing proprietary data

–Convenient infrastructure

–Math analysis methods

Is a rich, large-scale, practical problem that improves the

consistency in which scientific results are used to make decisions

and set policy.

Collaborators Pete Seiler (UCB, Honeywell)Adam Arkin and Matt Onsum (UCB)Greg Smith (SRI)

GRI-Mech Team: Michael Frenklach, Hai Wang, Michael Goldenberg, Nigel Moriarty, Boris Eiteener, Bill Gardiner, Huixing Yang, Zhiwei Qin, Tom Bowman, Ron Hanson, David Davidson, David Golden, Greg Smith, Dave Crossley

PrIMe Team: UCB: Michael Frenklach, Andy Packard, Zoran Djurisic, Ryan Feeley, Trent RussiStanford: David Golden, Tom Bowman, …MIT: Bill Green, Greg McRae, …EU: Mike Pilling, …NIST: Tom Allision, Greg Rosasco, …ANL: Branko Ruscic, …CMCS…

Support from NSF grants: CTS-0113985 (ITR, 2001-2006)CHE-0535542 (CyberInfrastructure, 2005-2010)

DisseminationPapersRyan Feeley, Michael Frenklach, Matt Onsum, Trent Russi, Adam Arkin and Andy Packard, “Model Discrimination using Data Collaboration,” to appear, J. Physical Chemistry A, 2006.

Greg Smith, Michael Frenklach, Ryan Feeley, Andy Packard and Pete Seiler, “A System Analysis Approach to Atmospheric Observations and Models: the Mesospheric HOx Dilemma,” to appear J. Geophys. Res. (Atmospheres), 2006.

Pete Seiler, Michael Frenklach, Andy Packard and Ryan Feeley, “Numerical approaches for collaborative data processing,” to appear Optimization and Engineering, Kluwer, 2006.

Ryan Feeley, Pete Seiler, Andy Packard and Michael Frenklach, “Consistency of a reaction data set,” J. Physical Chemistry A, vol. 108, pp. 9573-9583, 2004.

Michael Frenklach, Andy Packard, Pete Seiler and Ryan Feeley, “Collaborative data processing in developing predictive models of complex reaction systems,” International Journal of Chemical Kinetics, vol. 36, issue 1, pp. 57-66, 2004.

Michael Frenklach, Andy Packard and Pete Seiler, “Prediction uncertainty from models and data,” 2002 American Control Conference, pp. 4135-4140, Anchorage, May 8-10, 2002.

Project websiteSlides, drafts, notes, proposal and related links, etc can be found at http://jagger.me.berkeley.edu/~pack/nsfuncertainty

collaborative science: give us your information, not your conclusions andy packard mechanical...

Documents

skeptical science community

process behavior

just model invalidation

parameters wrongproof

research directions

joint information implicit

onprocess models

generatedparametrized