Pathway Modeling and Problem Solving Environments Cliff Shaffer Department of Computer Science Virginia Tech Blacksburg, VA 24061

Post on 11-Feb-2016

Page 1: Pathway Modeling and Problem Solving Environments

Pathway Modeling and Problem Solving Environments

Cliff Shaffer
Department of Computer Science
Virginia Tech
Blacksburg, VA 24061

Page 2: Pathway Modeling and Problem Solving Environments

The Fundamental Goal of Molecular Cell Biology

Page 3: Pathway Modeling and Problem Solving Environments

Application: Cell Cycle Modeling

How do cells convert genes into behavior?
  Create proteins from genes
  Protein interactions
  Protein effects on the cell

Our study system is the cell cycle of the budding yeast Saccharomyces cerevisiae.

Page 4: Pathway Modeling and Problem Solving Environments

[Figure: cell cycle phases G1, S (DNA replication), G2, and M (mitosis), with cell division.]

Page 5: Pathway Modeling and Problem Solving Environments

[Figure: wiring diagram of the budding yeast cell cycle control network. Labeled species and complexes include CDKs, Cln2, Cln3, Bck2, Clb2, Clb5, Sic1 (and phosphorylated Sic1), SBF, MBF, Mcm1, Swi5, SCF, APC (and APC-P), Cdc20, Cdh1, Cdc14, Cdc15/MEN, Tem1-GDP, Tem1-GTP, Bub2, Lte1, Mad2, Net1 (and Net1P), RENT, PPX, Esp1, Pds1, and inactive trimers. Labeled events include growth, budding, DNA synthesis, mitosis, sister chromatid separation, and unaligned chromosomes.]

Page 6: Pathway Modeling and Problem Solving Environments

Modeling Techniques

One method: use ODEs that describe the rate at which each protein concentration changes.
  Protein B degrades protein A:

  \[ \frac{d[A]}{dt} = -c\,[A][B] \]

  … with initial condition [A](0) = A0.
  Parameter c determines the rate of degradation.
Sometimes modelers use “creative” rate laws to approximate subsystems.
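A minimal sketch of how such a rate law can be integrated numerically, reading the degradation law on this slide as d[A]/dt = -c[A][B]. The function name, the choice of a hand-written fourth-order Runge-Kutta stepper, the constant [B], and all numerical values are assumptions made for illustration; the talk does not say which solver is used.

```python
import math

def simulate_degradation(a0, b, c, t_end, dt=0.01):
    """Integrate d[A]/dt = -c*[A]*[B] with classical RK4, holding [B] constant."""
    def rate(a):
        return -c * a * b

    a = a0
    steps = round(t_end / dt)
    for _ in range(steps):
        k1 = rate(a)
        k2 = rate(a + 0.5 * dt * k1)
        k3 = rate(a + 0.5 * dt * k2)
        k4 = rate(a + dt * k3)
        a += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return a

# Illustrative values only (not taken from the talk).
a_final = simulate_degradation(a0=1.0, b=2.0, c=0.5, t_end=3.0)

# With constant [B] the exact solution is A0 * exp(-c * [B] * t).
exact = 1.0 * math.exp(-0.5 * 2.0 * 3.0)
```

With [B] held constant the numerical result can be checked against the closed-form exponential, which is a useful sanity test before moving to coupled, nonlinear systems where no closed form exists.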

Page 7: Pathway Modeling and Problem Solving Environments

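Mathematical Model

\[ \frac{d[\mathrm{Cln2}]}{dt} = \underbrace{k_1' + k_1[\mathrm{SBF}]}_{\text{synthesis}} - \underbrace{k_2[\mathrm{Cln2}]}_{\text{degradation}} \]

\[ \frac{d[\mathrm{Clb2}]}{dt} = \underbrace{k_3' + k_3[\mathrm{Mcm1}]}_{\text{synthesis}} - \underbrace{\left(k_4' + k_4[\mathrm{Cdh1}]\right)[\mathrm{Clb2}]}_{\text{degradation}} - \underbrace{k_5[\mathrm{Sic1}][\mathrm{Clb2}]}_{\text{binding}} \]

\[ \frac{d[\mathrm{Cdh1}]}{dt} = \underbrace{\frac{\left(k_6' + k_6[\mathrm{Cdc20}]\right)\left([\mathrm{Cdh1}]_T - [\mathrm{Cdh1}]\right)}{J_6 + [\mathrm{Cdh1}]_T - [\mathrm{Cdh1}]}}_{\text{activation}} - \underbrace{\frac{\left(k_7' + k_7[\mathrm{Clb5}]\right)[\mathrm{Cdh1}]}{J_7 + [\mathrm{Cdh1}]}}_{\text{inactivation}} \]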

Page 8: Pathway Modeling and Problem Solving Environments

Simulation of the budding yeast cell cycle

[Figure: simulated time courses of mass, CKI, Clb2, Cln2, Cdh1, and Cdc20 over 0 to 150 min (x-axis: Time (min)), with the G1 and S/M phases marked.]

Page 9: Pathway Modeling and Problem Solving Environments

Table 6. Properties of clb, sic1, and hct1 mutants

Values in parentheses are the time of occurrence of the event, in minutes.

| # | Strain | Mass at birth | Mass at SBF 50% | Mass at DNA repl. | Mass at bud ini. | Mass at division | TG1 (min) | Changed parameter | Comments |
|---|--------|---------------|-----------------|-------------------|------------------|------------------|-----------|-------------------|----------|
| 1 | wild type (daughter) | 0.71 | 1.07 (71’) | 1.15 (84’) | 1.15 (84’) | 1.64 (146’) | 84 | | CT 146 min |
| 2 | clb1 clb2 | 0.71 | 1.07 | 1.16 | 1.16 | No mit | | k's,b2 = 0; k"s,b2 = 0 | Surana 1991 Table 1, G2 arrest. |
| 3 | clb1 clb2 1X GAL-CLB2 | 0.65 | 1.10 | 1.19 | 1.19 | 1.50 | 105 | k's,b2 = 0.1; k"s,b2 = 0 | Surana 1993 Fig 4, 1X GAL-CLB2 is OK, 4X GAL-CLB2 (or 1X GAL-CLB2db) causes telophase arrest. |
| 4 | clb5 clb6 | 0.73 | 1.07 (65’) | 1.30 (99’) | 1.17 (80’) | 1.70 (146’) | 99 | k's,b5 = 0; k"s,b5 = 0 | Schwob 1993 Fig 4, DNA repl begins 30 min after SBF activation. |
| 5 | clb5 clb6 GAL-CLB5 | 0.61 | 0.93 | 0.92 | 0.96 | 1.41 | 73 | k's,b5 = 0.1; k"s,b5 = 0 | Schwob 1993 Fig 6, DNA repl concurrent with SBF activation in both GAL-CLB5 and GAL-CLB5db. |
| 6 | sic1 | 0.66 | 1.00 (73’) | 0.82 (37’) | 1.06 (83’) | 1.52 (146’) | 38 | k's,c1 = 0; k"s,c1 = 0 | Schneider 1996 Fig 4, sic1 uncouples S phase from budding. |
| 7 | sic1 GAL-SIC1 | 0.80 | 1.07 | 1.38 | 1.17 | 1.86 | 94 | k's,c1 = 0.1; k"s,c1 = 0 | Verma 1997 Fig3B, Nugroho & Mendenhall 1994 Fig 2, most cells are viable. |
| 8 | hct1 | 0.73 | 1.08 | 1.17 | 1.18 | 1.69 | 82 | k"d,b2 = 0.01 | Schwab 1997 Fig 2, viable, size like WT, Clb2 level high throughout the cycle. |
| 9 | sic1 hct1 | 0.71 | No SBF | 0.72 | No bud | No mit | | k's,c1 = 0; k"d,b2 = 0.01 | Visintin 1997, telophase arrest. |

10 sic1 GAL-CLB5 (first cycle, second cycle). Values in extracted order, column alignment lost: 0.71, 0.52, 0.74, 0.73, No repl, 0.76, 1.20. Changed parameters: k's,b5 = 0.1; k"s,b5 = 0; k's,c1 = 0. Comments: Schwob 1994 Fig 7C, inviable. First cycle OK, DNA repl advanced; but pre-repl complexes cannot form and cell dies after the first cycle.

Page 10: Pathway Modeling and Problem Solving Environments

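Differential equations

\[ \frac{d\,\mathrm{CDK}}{dt} = k_1 - \left(v_2' + v_2''\,\mathrm{Cdh1}\right)\mathrm{CDK} \]

\[ \frac{d\,\mathrm{Cdh1}}{dt} = \frac{\left(k_3' + k_3''\,\mathrm{Cdc20_A}\right)\left(1 - \mathrm{Cdh1}\right)}{J_3 + 1 - \mathrm{Cdh1}} - \frac{\left(k_4' + k_4''\,\mathrm{CDK}\cdot M\right)\mathrm{Cdh1}}{J_4 + \mathrm{Cdh1}} \]

\[ \frac{d\,\mathrm{IEP}}{dt} = k_9\,\mathrm{CDK}\cdot M\,\left(1 - \mathrm{IEP}\right) - k_{10}\,\mathrm{IEP} \]

\[ \frac{d\,\mathrm{Cdc20_T}}{dt} = k_5' + k_5''\,\frac{\left(\mathrm{CDK}\cdot M\right)^4}{J_5^4 + \left(\mathrm{CDK}\cdot M\right)^4} - k_6\,\mathrm{Cdc20_T} \]

\[ \frac{d\,\mathrm{Cdc20_A}}{dt} = \frac{k_7\,\mathrm{IEP}\left(\mathrm{Cdc20_T} - \mathrm{Cdc20_A}\right)}{J_7 + \mathrm{Cdc20_T} - \mathrm{Cdc20_A}} - \frac{k_8\,\mathrm{MAD}\,\mathrm{Cdc20_A}}{J_8 + \mathrm{Cdc20_A}} - k_6\,\mathrm{Cdc20_T} \]

Parameter values

k1 = 0.0013, v2’ = 0.001, v2” = 0.17,
k3’ = 0.02, k3” = 0.85, k4’ = 0.01, k4” = 0.9,
J3 = 0.01, J4 = 0.01, k9 = 0.38, k10 = 0.2,
k5’ = 0.005, k5” = 2.4, J5 = 0.5, k6 = 0.33,
k7 = 2.2, J7 = 0.05, k8 = 0.2, J8 = 0.05

Experimental data: Table 6 (properties of clb, sic1, and hct1 mutants)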
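As a rough illustration of how the five differential equations and the listed parameter values on this slide turn into code, the sketch below evaluates their right-hand sides at one state. The dictionary keys, the function name, and the default values mass=1.0 and mad=1.0 are assumptions (the slide lists no values for M or MAD); the last term of the Cdc20A equation is copied as it appears on the slide.

```python
# Parameter values as listed on the slide; "p" stands for one prime, "pp" for two.
PARAMS = {
    "k1": 0.0013, "v2p": 0.001, "v2pp": 0.17,
    "k3p": 0.02, "k3pp": 0.85, "k4p": 0.01, "k4pp": 0.9,
    "J3": 0.01, "J4": 0.01, "k9": 0.38, "k10": 0.2,
    "k5p": 0.005, "k5pp": 2.4, "J5": 0.5, "k6": 0.33,
    "k7": 2.2, "J7": 0.05, "k8": 0.2, "J8": 0.05,
}

def rhs(state, p, mass=1.0, mad=1.0):
    """Return the five time derivatives for state = (CDK, Cdh1, IEP, Cdc20T, Cdc20A).

    mass (M) and mad (MAD) are hypothetical defaults, not values from the talk.
    """
    cdk, cdh1, iep, cdc20t, cdc20a = state
    x = cdk * mass  # the product CDK * M appears in several terms
    d_cdk = p["k1"] - (p["v2p"] + p["v2pp"] * cdh1) * cdk
    d_cdh1 = ((p["k3p"] + p["k3pp"] * cdc20a) * (1 - cdh1) / (p["J3"] + 1 - cdh1)
              - (p["k4p"] + p["k4pp"] * x) * cdh1 / (p["J4"] + cdh1))
    d_iep = p["k9"] * x * (1 - iep) - p["k10"] * iep
    d_cdc20t = (p["k5p"] + p["k5pp"] * x**4 / (p["J5"]**4 + x**4)
                - p["k6"] * cdc20t)
    d_cdc20a = (p["k7"] * iep * (cdc20t - cdc20a) / (p["J7"] + cdc20t - cdc20a)
                - p["k8"] * mad * cdc20a / (p["J8"] + cdc20a)
                - p["k6"] * cdc20t)  # term kept as written on the slide
    return (d_cdk, d_cdh1, d_iep, d_cdc20t, d_cdc20a)

# Derivatives at a G1-like state: no CDK activity, Cdh1 fully active.
sample = rhs((0.0, 1.0, 0.0, 0.0, 0.0), PARAMS)
```

A function of this shape is what a general-purpose ODE solver expects as input, so the same code serves both for simulation and for checking individual terms by hand.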

Page 11: Pathway Modeling and Problem Solving Environments

Tyson’s Budding Yeast Model

Tyson’s model contains over 30 ODEs, some nonlinear.
Events can cause concentrations to be reset.
About 140 rate constant parameters
  Most are unavailable from experiment and must be set by the modeler

Page 12: Pathway Modeling and Problem Solving Environments

Fundamental Activities

Collect information
  Search literature (databases), lab notebooks
Define/modify models
  A user interface problem
Run simulations
  Equation solvers (ODEs, PDEs, deterministic, stochastic)
Compare simulation results to experimental data
  Analysis

Page 13: Pathway Modeling and Problem Solving Environments

Modeling Lifecycle

Page 14: Pathway Modeling and Problem Solving Environments

Our Mission: Build Software to Help the Modelers

Typical cycle time for changing the model used to be one month
  Collect data on paper lab notebooks
  Convert to differential equations by hand
  Calibrate the model by trial and error
  Inadequate analysis tools
Goal: Change the model once per day.
  Bottleneck should shift to the experimentalists

Page 15: Pathway Modeling and Problem Solving Environments

Another View

Current models of simple organisms contain a few tens of equations.
Modeling mammalian systems might require two additional orders of magnitude in complexity.
We hope our current vision for tools can supply one order of magnitude.
The other order of magnitude is an open problem.

Page 16: Pathway Modeling and Problem Solving Environments

JigCell

Current Primary Software Components:
  JigCell Model Builder
  JigCell Run Manager
  JigCell Comparator
  Automated Parameter Estimation (PET)
  Bifurcation Analysis (Oscill8)

http://jigcell.biol.vt.edu

Page 17: Pathway Modeling and Problem Solving Environments

[Figure: JigCell components and data flow. Boxes: Model Builder, Run Manager, Comparator, Parameter Optimizer. Data labels: Parameter Values, Optimum Parameter Values.]

Page 18: Pathway Modeling and Problem Solving Environments

From a wiring diagram…

JigCell Model Builder

Page 19: Pathway Modeling and Problem Solving Environments

N.B. Parameters are given names, not numerical values!

…to a reaction mechanism

… to ordinary differential equations (ode files, SBML)

JigCell Model Builder

Page 20: Pathway Modeling and Problem Solving Environments

Mutations

Wild type cell
Mutations
  Typically caused by gene knockout
  Consider a mutant with no B to degrade A.
    Set c = 0
  We have about 130 mutations
    Each requires a separate simulation run
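One way to picture "each mutation requires a separate simulation run" is to treat every mutant as a small set of parameter changes applied to the wild type, then loop over them. The sketch below does this for the toy degradation model (B degrades A with rate constant c). The names WILD_TYPE, MUTANTS, final_a, and run_all, and all numbers, are hypothetical; the closed-form solution stands in for a real simulator.

```python
import math

WILD_TYPE = {"c": 0.5}            # illustrative rate constant
MUTANTS = {"no_B": {"c": 0.0}}    # knockout: nothing left to degrade A

def final_a(params, a0=1.0, b=2.0, t_end=3.0):
    """Closed-form solution of d[A]/dt = -c*[A]*[B] for constant [B]."""
    return a0 * math.exp(-params["c"] * b * t_end)

def run_all(wild_type, mutants):
    """Run the wild type, then one simulation per mutant parameter set."""
    results = {"wild_type": final_a(wild_type)}
    for name, changes in mutants.items():
        params = {**wild_type, **changes}   # mutant overrides wild-type values
        results[name] = final_a(params)
    return results

results = run_all(WILD_TYPE, MUTANTS)
```

In the knockout run A stays at its initial value, while in the wild-type run it decays, which is the qualitative difference a modeler would compare against the observed phenotype.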

Page 21: Pathway Modeling and Problem Solving Environments

Run Manager

• Inheritance patterns

[Figure: inheritance tree of parameter sets. Basal Set (wild-type); Derived Sets (mutant A), (mutant B), (mutant C), (mutant A’), (mutant AB), (mutant A’C).]
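The inheritance pattern between basal and derived parameter sets can be sketched with Python's collections.ChainMap, assuming a derived set stores only the values it changes and looks everything else up in its parent. The parameter names and values are hypothetical, and this is not a description of how the JigCell Run Manager is implemented.

```python
from collections import ChainMap

# Hypothetical wild-type (basal) parameter values.
basal = {"k_a": 1.0, "k_b": 2.0, "k_c": 3.0}

def derive(parent, **changes):
    """Return a parameter set that overrides `changes` and inherits the rest."""
    return ChainMap(changes, parent)

mutant_a = derive(basal, k_a=0.0)
mutant_b = derive(basal, k_b=0.0)
mutant_ab = derive(mutant_a, k_b=0.0)   # inherits mutant A's change, adds B's

# An edit to the basal set is seen by every derived set that does not override it.
basal["k_c"] = 3.5
```

The design choice this illustrates is that a change to the wild-type model propagates automatically, so roughly 130 mutant definitions do not have to be edited by hand each time the basal set changes.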

Page 22: Pathway Modeling and Problem Solving Environments

JigCell Run Manager

Page 23: Pathway Modeling and Problem Solving Environments

Phenotypes

Each mutant has some observed outcome (“experimental” data).
  Generally qualitative.
    Cell lived
    Cell died in G1 phase
Model should match the experimental data.
Model should not be overly sensitive to the rate constants.
  Overly sensitive biological systems tend not to survive

Page 24: Pathway Modeling and Problem Solving Environments

Visualize results

Kumagai1 Kumagai2

Comparator

Page 25: Pathway Modeling and Problem Solving Environments

Comparator

Page 26: Pathway Modeling and Problem Solving Environments

Optimization

How to decide on parameter values?
Key features of optimization
  Each problem is a point in multidimensional space
  Each point can be assigned a value by an objective function
  The goal is to find the best point in the space as defined by the objective function
  We usually settle for a “good” point
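The ideas on this slide (a point in parameter space, an objective function that scores it, and settling for a "good" point) can be sketched with a one-parameter example. The data values are made up, the model is the toy exponential decay, and plain random sampling is used only because it is the simplest possible search; it is not the method used in the talk.

```python
import math
import random

TIMES = [0.0, 1.0, 2.0, 3.0]
OBSERVED = [1.0, 0.61, 0.37, 0.22]   # made-up measurements, roughly exp(-0.5 * t)

def model(c, t):
    """Toy model: A(t) = exp(-c * t)."""
    return math.exp(-c * t)

def objective(c):
    """Sum of squared differences between model and data; lower is better."""
    return sum((model(c, t) - y) ** 2 for t, y in zip(TIMES, OBSERVED))

def random_search(lo, hi, n_samples, seed=0):
    """Score random points in [lo, hi] and keep the best one seen."""
    rng = random.Random(seed)
    best_c, best_val = None, float("inf")
    for _ in range(n_samples):
        c = rng.uniform(lo, hi)
        val = objective(c)
        if val < best_val:
            best_c, best_val = c, val
    return best_c, best_val

best_c, best_val = random_search(0.0, 2.0, n_samples=200)
```

Nothing guarantees that best_c is the true optimum; it is simply the best of the points that were tried, which is the sense in which one "settles for a good point".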

Page 27: Pathway Modeling and Problem Solving Environments

Parameter Optimization

Page 28: Pathway Modeling and Problem Solving Environments

Error function: orthogonal distance regression

Levenberg-Marquardt algorithm

Parameter Optimization
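To give a feel for the Levenberg-Marquardt algorithm named on this slide, here is a hand-written one-parameter version that fits the decay constant of the toy model A(t) = exp(-c t). Two simplifications are assumptions of this sketch: the error is the ordinary vertical least-squares residual rather than an orthogonal distance, and the data are synthetic and noise-free. All names are hypothetical.

```python
import math

def residuals(c, times, observed):
    """Model minus data at each time point."""
    return [math.exp(-c * t) - y for t, y in zip(times, observed)]

def jacobian(c, times):
    """Derivative of each residual with respect to c."""
    return [-t * math.exp(-c * t) for t in times]

def levenberg_marquardt(c0, times, observed, lam=1e-2, max_iter=100, tol=1e-12):
    """Damped Gauss-Newton iteration for a single parameter."""
    c = c0
    cost = sum(r * r for r in residuals(c, times, observed))
    for _ in range(max_iter):
        r = residuals(c, times, observed)
        j = jacobian(c, times)
        grad = sum(ji * ri for ji, ri in zip(j, r))   # J^T r
        hess = sum(ji * ji for ji in j)               # J^T J
        step = -grad / (hess + lam)                   # damped step
        new_cost = sum(x * x for x in residuals(c + step, times, observed))
        if new_cost < cost:
            # Step helped: accept it and trust the quadratic model more.
            c, cost = c + step, new_cost
            lam /= 10.0
            if abs(step) < tol:
                break
        else:
            # Step did not help: reject it and increase the damping.
            lam *= 10.0
    return c, cost

TIMES = [0.0, 1.0, 2.0, 3.0, 4.0]
TRUE_C = 0.7
DATA = [math.exp(-TRUE_C * t) for t in TIMES]   # synthetic, noise-free data
c_fit, cost = levenberg_marquardt(c0=0.2, times=TIMES, observed=DATA)
```

The damping parameter lam is what distinguishes this from plain Gauss-Newton: large values give short, cautious steps, and small values give nearly full Gauss-Newton steps close to the solution.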

Page 29: Pathway Modeling and Problem Solving Environments

Only 1 experiment shown here. The model must be fitted simultaneously to many different experiments.

Parameter Optimization

Page 30: Pathway Modeling and Problem Solving Environments

Global DIRECT Search (DIViding RECTangles)

Page 31: Pathway Modeling and Problem Solving Environments

Global DIRECT Search (DIViding RECTangles)

Page 32: Pathway Modeling and Problem Solving Environments
Page 33: Pathway Modeling and Problem Solving Environments
Page 34: Pathway Modeling and Problem Solving Environments

Composition Motivation

Models are reaching the limits of manageability due to an increase in:
  Size
  Complexity
Making a model suitable for stochastic simulation increases the number of reactions by a factor of 3-5.
Models of the mammalian cell cycle will require 100-1000 reactions (even more for stochastic simulation).

Page 35: Pathway Modeling and Problem Solving Environments

Model Composition

Notice that the yeast cell diagram contains natural components

Page 36: Pathway Modeling and Problem Solving Environments

Composition Processes

Fusion
  Merging two or more existing models
Composition
  Build up model hierarchy from existing models by describing their interactions and connections
Aggregation
  Connects modular blocks using controlled interfaces (ports)
Flattening
  Convert hierarchy back into a single “flat” model for use with standard simulators
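Of the four processes, fusion is the easiest to sketch: two models are merged after a mapping table states which species names refer to the same thing. The dictionary layout, the function names, and the species and rate names below are all hypothetical and are not the JigCell or SBML representation.

```python
def rename(model, mapping):
    """Return a copy of `model` with species renamed according to `mapping`."""
    def new(name):
        return mapping.get(name, name)
    return {
        "species": {new(s) for s in model["species"]},
        "reactions": [
            (tuple(new(s) for s in reactants), tuple(new(s) for s in products), rate)
            for reactants, products, rate in model["reactions"]
        ],
    }

def fuse(model_a, model_b, species_mapping):
    """Merge model_b into model_a after mapping model_b's species names."""
    b = rename(model_b, species_mapping)
    reactions = list(model_a["reactions"])
    for reaction in b["reactions"]:
        if reaction not in reactions:      # keep shared reactions only once
            reactions.append(reaction)
    return {"species": model_a["species"] | b["species"], "reactions": reactions}

# Two illustrative submodels; submodel B uses the name "X" for A's "Clb2".
model_a = {
    "species": {"Clb2", "Cdh1"},
    "reactions": [(("Clb2", "Cdh1"), ("Cdh1",), "k_deg")],
}
model_b = {
    "species": {"X", "Sic1", "Trimer"},
    "reactions": [(("X", "Sic1"), ("Trimer",), "k_bind")],
}

fused = fuse(model_a, model_b, species_mapping={"X": "Clb2"})
```

The species mapping plays the role of the species mapping table a user would fill in; a fuller tool would also need a reaction mapping and conflict checks on parameter names.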

Page 37: Pathway Modeling and Problem Solving Environments

Composition Processes

Page 38: Pathway Modeling and Problem Solving Environments

Sample Sub-models

Page 39: Pathway Modeling and Problem Solving Environments

Sample Composed Model

Page 40: Pathway Modeling and Problem Solving Environments

Composition Wizard: Final Species Mapping Table

Page 41: Pathway Modeling and Problem Solving Environments

Composition Wizard: Final Reaction Mapping Table

Page 42: Pathway Modeling and Problem Solving Environments

Aggregated Submodels

Page 43: Pathway Modeling and Problem Solving Environments

Final Aggregated Model

Page 44: Pathway Modeling and Problem Solving Environments

Aggregation Connector

Page 45: Pathway Modeling and Problem Solving Environments

Composition in SBML

Virginia Tech’s proposed language features to support composition/aggregation are being written into the forthcoming SBML Level 3 definition.

Page 46: Pathway Modeling and Problem Solving Environments

Stochastic Simulation

ODE-based (deterministic) models cannot explain behaviors introduced by the random nature of the system.
  Variations in mass at division
  Variations in time of events
  Differences in gross outcomes

Page 47: Pathway Modeling and Problem Solving Environments

Gillespie’s Stochastic Simulation Algorithm

There is a population for each chemical species.
There is a “propensity” for each reaction, in part determined by population.
Each reaction changes the population of its associated species.
Loop:
  Pick next reaction (random, propensity)
  Update populations, propensities
Slow; there are approximations to speed it up.
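The loop described on this slide can be sketched as follows, using the direct method: draw an exponentially distributed waiting time from the total propensity, then pick a reaction with probability proportional to its propensity. The data layout (a dictionary of populations, reactions as propensity-function and change pairs) and the example reaction and rate constant are assumptions for illustration.

```python
import random

def gillespie(populations, reactions, t_end, seed=0):
    """Direct-method stochastic simulation.

    populations: dict mapping species name -> integer count
    reactions:   list of (propensity_fn, change) pairs, where propensity_fn(pop)
                 returns a float and change maps species -> integer change
    """
    rng = random.Random(seed)
    pop = dict(populations)
    t = 0.0
    trajectory = [(t, dict(pop))]
    while True:
        propensities = [fn(pop) for fn, _ in reactions]
        total = sum(propensities)
        if total <= 0.0:
            break                          # no reaction can fire any more
        t += rng.expovariate(total)        # waiting time until the next reaction
        if t > t_end:
            break
        # Pick a reaction with probability proportional to its propensity.
        threshold = rng.random() * total
        cumulative = 0.0
        for a, (_, change) in zip(propensities, reactions):
            cumulative += a
            if threshold < cumulative:
                break
        for species, delta in change.items():
            pop[species] += delta
        trajectory.append((t, dict(pop)))
    return trajectory

# Example: B degrades A (A + B -> B) with an illustrative rate constant 0.01.
reactions = [
    (lambda pop: 0.01 * pop["A"] * pop["B"], {"A": -1}),
]
trajectory = gillespie({"A": 100, "B": 5}, reactions, t_end=1000.0)
```

Every reaction event costs one pass through the loop, which is why the exact algorithm becomes slow for large populations and why approximate variants exist.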

Page 48: Pathway Modeling and Problem Solving Environments

Comments on Collaboration

Domain team routinely underestimates how difficult it is to create reliable and usable software.
CS team routinely underestimates how difficult it is to stay focused on the needs of the domain team.
Partial solution: truly integrate.

Page 49: Pathway Modeling and Problem Solving Environments

How to Succeed in CBB

Programming skills are necessary but not sufficient.
Math is usually the biggest bottleneck.
  Statistics for bioinformatics
  Numerical analysis, optimization, differential equations for computational biology
Chemistry/biochemistry are good choices for domain knowledge.
You have to have an “interdisciplinary attitude”.