Post on 24-Dec-2015

Bayesian network models of Biological signaling pathways

karensachs@stanford.edu

K. Sachs2

From Phospho-molecular profiling to Signaling pathways

[Figure: high-throughput flow measurements of Raf, Erk, p38, PKA, PKC, Jnk, PIP2, PIP3, Plc, Akt, ... in individual cells (Cell 1, Cell 2, Cell 3, Cell 4, ..., Cell 600) are used to infer signaling pathways. Picture: John Albeck]

Outline

What are signaling pathways?

What kind of data is available to study them?

How do we use Bayesian networks to learn their structure?

Two extensions: the Markov neighborhood algorithm and Bayesian network based cyclic networks (BBCs)

Cells respond to their environment

Responses: cell death, proliferation, cytokine secretion

Inside each cell is a molecular network

“Central Dogma”

DNA → (transcription) → mRNA → (translation) → Protein → (modification) → Modified protein

DNA: ‘blueprint’, instructions for production of all proteins

mRNA: delivers instructions for a specific gene

Ribosome: protein-production factory

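Signaling & Genetic pathways

[Figure: a signaling pathway among A, B and C, and a genetic pathway involving A, B, a transcription factor (TF), DNA, RNA and C, both leading to a cell response]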

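Spectrum of Modeling Tools in Systems Biology

[Figure: spectrum of modeling tools, including differential-equation models such as the one below]

$\frac{d[R]}{dt} = k_1[LR] - k_2[R][L]$

...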
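
As a rough illustration of the differential-equation end of the spectrum, the rate-law fragment on this slide can be read as d[R]/dt = k1[LR] - k2[R][L] (the signs are an interpretation of the garbled slide text). The sketch below integrates it with forward Euler; the mass-balance updates for [L] and [LR], the rate constants and the initial concentrations are all assumptions made for the example.

```python
def simulate_receptor(k1, k2, r0, l0, lr0, dt=1e-3, steps=20000):
    """Forward-Euler integration of d[R]/dt = k1*[LR] - k2*[R]*[L].

    Assumes simple mass balance: every complex LR that dissociates
    releases one free receptor R and one free ligand L.
    """
    r, l, lr = r0, l0, lr0
    for _ in range(steps):
        flux = k1 * lr - k2 * r * l  # net production rate of free receptor
        r += dt * flux
        l += dt * flux
        lr -= dt * flux
    return r, l, lr


# Hypothetical parameters, chosen only so that the steady state is easy to check.
r, l, lr = simulate_receptor(k1=1.0, k2=2.0, r0=1.0, l0=1.0, lr0=0.0)
```

With these values the steady state satisfies k1[LR] = k2[R][L], which gives [R] = [L] = [LR] = 0.5.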

Bayesian Networks

Graph

Node: Measured level/activity of protein

Edge: Influence (dependency) between proteins

Conditional probability distributions

Each node has a conditional probability distribution given its parents

[Figure: example network over Protein A, Protein B, Protein C, Protein D and Protein E, with a plot of the conditional distribution P(B|A=‘On’)]
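
A minimal sketch of what "each node has a conditional probability given its parents" means in practice: the joint distribution is the product of the per-node conditionals. The three-node chain A → B → C and every probability value below are made up for illustration and are not taken from the slide's figure.

```python
from itertools import product

# Hypothetical binary network: A -> B -> C (1 = 'On', 0 = 'Off').
parents = {"A": [], "B": ["A"], "C": ["B"]}

# cpt[node][parent values] = P(node = 1 | parents); values are illustrative.
cpt = {
    "A": {(): 0.3},
    "B": {(0,): 0.1, (1,): 0.8},
    "C": {(0,): 0.2, (1,): 0.7},
}


def joint(assign):
    """P(A, B, C) as the product of each node's conditional given its parents."""
    p = 1.0
    for node, pa in parents.items():
        p_on = cpt[node][tuple(assign[x] for x in pa)]
        p *= p_on if assign[node] == 1 else 1.0 - p_on
    return p


# The factorized joint sums to 1 over all assignments.
total = sum(joint(dict(zip("ABC", v))) for v in product([0, 1], repeat=3))

# Recover P(B = 1 | A = 1) by marginalizing C out of the joint.
num = sum(joint({"A": 1, "B": 1, "C": c}) for c in (0, 1))
den = sum(joint({"A": 1, "B": b, "C": c}) for b in (0, 1) for c in (0, 1))
p_b_given_a_on = num / den
```

Marginalizing the joint returns the table entry for P(B=1 | A=1), which is the quantity the slide plots as P(B|A=‘On’).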

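How do we use Bayesian Networks to infer pathways?

The Technical Details

Score candidate models:

$\mathrm{BayesianScore}(S) = \log P(S \mid D) = \log P(S) + \log P(D \mid S) + c$

Use a heuristic search to find high-scoring models

The likelihood of the data given the structure integrates over the parameters $\Theta = (\theta_1, \ldots, \theta_n)$ (analytical solution!):

$P(D \mid S) = \int \cdots \int P(D, \Theta \mid S)\, d\theta_1 \cdots d\theta_n = \int \cdots \int P(D \mid \Theta, S)\, P(\Theta \mid S)\, d\theta_1 \cdots d\theta_n$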
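
To make the "analytical solution" concrete, here is a sketch of the closed-form log marginal likelihood for discrete (binary) data under a Dirichlet prior, used to compare two candidate structures. The slides do not state which prior or discretization was used, so the uniform Dirichlet prior (`alpha=1.0`), the uniform structure prior (so log P(S) is absorbed into the constant c) and the simulated data are all assumptions.

```python
import math
import random
from collections import Counter


def family_score(data, child, parents, alpha=1.0):
    """Log marginal likelihood of one binary node given its parents,
    with a Dirichlet(alpha, alpha) prior integrated out in closed form."""
    counts = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    score = 0.0
    for cfg in {c for c, _ in counts}:  # observed parent configurations
        n = [counts[(cfg, v)] for v in (0, 1)]
        score += math.lgamma(2 * alpha) - math.lgamma(2 * alpha + sum(n))
        score += sum(math.lgamma(alpha + nk) - math.lgamma(alpha) for nk in n)
    return score


def bayesian_score(data, structure):
    """log P(D | S) for a structure given as {child: [parents]}."""
    return sum(family_score(data, child, pa) for child, pa in structure.items())


# Simulated data in which B copies A 90% of the time (hypothetical).
rng = random.Random(0)
data = []
for _ in range(500):
    a = int(rng.random() < 0.5)
    b = a if rng.random() < 0.9 else 1 - a
    data.append({"A": a, "B": b})

with_edge = bayesian_score(data, {"A": [], "B": ["A"]})
no_edge = bayesian_score(data, {"A": [], "B": []})
```

The structure containing the edge scores higher on this data. A heuristic search would repeatedly propose edge additions, deletions and reversals and keep the changes that raise the score.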

Protein data

Western blot

Protein arrays

Mass spectrometry

All of these lysate approaches give 1 measurement per protein for 10^3-10^7 cells

Flow Cytometry: Single Cell Analysis

Thousands of datapoints

Stimulations and perturbations

[Figure: T-cell signaling diagram showing CD3, CD28, LFA-1, Lck, Zap70, LAT, SLP-76, VAV, PI3K, Cytohesin, JAB-1, RAS, Raf, Mek1/2, Erk1/2, MAPKKK, MEK3/6, MEK4/7, p38, JNK, PLC, PKC, PKA, Akt, PIP2 and PIP3, with numbered markers at the targets of the reagents listed below]

Activators

1. α-CD3

2. α-CD28

3. ICAM-2

4. PMA

5. β2cAMP

Inhibitors

6. G06976

7. AKT inh

8. Psitect

9. U0126

10. LY294002

T-Lymphocyte Data

[Figure: conditions in multi-well format (perturbation a, perturbation b, ..., perturbation n) are measured by 12-color flow cytometry, giving datasets of cells for condition ‘a’, condition ‘b’, ..., condition ‘n’ with columns Raf, Mek1/2, Erk, p38, PKA, PKC, Jnk, PIP2, PIP3, Plc, Akt]

Primary human T-cells

9 conditions (6 specific interventions)

9 phosphoproteins, 2 phospholipids

600 cells per condition, 5400 data points

Omar Perez

Statistical Dependencies

[Figure: example network over A, B, C, D and E, with a scatter plot of Phospho A against Phospho B]

Edges can be directed (primarily) due to the use of interventions
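
A small simulation can illustrate why interventions help direct edges. Observational correlation between A and B is symmetric, but clamping A shifts B while clamping B leaves A unchanged. The ground truth A → B, the linear-Gaussian form and all coefficients are assumptions chosen for the example, not the slide's model.

```python
import math
import random


def simulate(n, rng, clamp_a=None, clamp_b=None):
    """Sample (A, B) from an assumed ground truth A -> B.
    A clamped variable is set by the experimenter and ignores its parents."""
    rows = []
    for _ in range(n):
        a = rng.gauss(0, 1) if clamp_a is None else clamp_a
        b = (0.9 * a + rng.gauss(0, 0.5)) if clamp_b is None else clamp_b
        rows.append((a, b))
    return rows


def mean(xs):
    return sum(xs) / len(xs)


def pearson(x, y):
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


rng = random.Random(0)
obs = simulate(2000, rng)
corr_obs = pearson([a for a, _ in obs], [b for _, b in obs])  # symmetric in A and B

mean_b_when_a_clamped = mean([b for _, b in simulate(2000, rng, clamp_a=2.0)])
mean_a_when_b_clamped = mean([a for a, _ in simulate(2000, rng, clamp_b=2.0)])
```

Only the A-clamping experiment moves the other variable, which is the asymmetry that lets a structure learner prefer A → B over B → A.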

Overview

[Figure: conditions in multi-well format (perturbation a, perturbation b, ..., perturbation n) → multiparameter flow cytometry → datasets of cells for condition ‘a’, condition ‘b’, ..., condition ‘n’ (Raf, Mek1/2, Erk, p38, PKA, PKC, Jnk, PIP2, PIP3, Plc, Akt) → Bayesian network analysis → influence diagram of measured variables]

Inferred Network

[Figure: inferred network over PKC, Raf, P44/42, Mek, Plc, PKA, Akt, Jnk, P38, PIP2 and PIP3. Legend: phospho-proteins, phospho-lipids, perturbed in data]

How well did we do?

Direct phosphorylation

[Figure: inferred network over the 11 measured molecules. Legend: phospho-proteins, phospho-lipids, perturbed in data]

Features of Approach

Direct phosphorylation: Mek → Erk

Difficult to detect using other forms of high-throughput data:

-Protein-protein interaction data

-Microarrays

How well did we do?

Indirect signaling

[Figure: inferred network over the 11 measured molecules. Legend: phospho-proteins, phospho-lipids, perturbed in data]

Indirect signaling

Dismissing edges: Raf → Mek → Erk

PKC → Jnk stands for PKC → Mapkkk → Mek4/7 → Jnk (Mapkkk and Mek4/7 not measured)

Indirect connections can be found even when the intermediate molecule(s) are not measured
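
The two cases on this slide can be sketched numerically. In a chain Raf → Mek → Erk, Raf and Erk are correlated, yet the dependence vanishes once Mek is accounted for, so a direct Raf → Erk edge can be dismissed when Mek is measured. If Mek were not measured, only the marginal Raf and Erk dependence would be visible and an indirect edge would be reported. The linear-Gaussian model and its coefficients are assumptions for illustration; partial correlation is used here as a simple stand-in for the conditional-independence reasoning a Bayesian network score performs.

```python
import math
import random


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def partial_corr(x, y, z):
    """Correlation of x and y after linearly removing the effect of z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Assumed chain Raf -> Mek -> Erk with hypothetical coefficients.
rng = random.Random(1)
raf = [rng.gauss(0, 1) for _ in range(5000)]
mek = [0.8 * r + rng.gauss(0, 0.6) for r in raf]
erk = [0.8 * m + rng.gauss(0, 0.6) for m in mek]

marginal = pearson(raf, erk)            # what is seen if Mek is not measured
given_mek = partial_corr(raf, erk, mek)  # what is seen if Mek is measured
```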

Indirect signaling - Complex example

[Figure: inferred connections among PKC, Raf and Mek]

Is this a mistake?

The real picture

[Figure: PKC, Ras, Raf S259, Raf S497 and Mek]

Phospho-protein specific

More than one pathway of influence

Signaling pathway reconstruction

How well did we do?

[Figure: inferred network over PKC, Raf, Erk, Mek, Plc, PKA, Akt, Jnk, P38, PIP2 and PIP3. Node legend: phospho-proteins, phospho-lipids, perturbed in data. Edge legend: expected pathway, reported, missed, reversed]

15/17 Classic

17/17 Reported

3 Missed

[Sachs et al 2005]

Caveats

Inhibitor specificity: binding sites are similar across proteins

Reagent availability and specificity

Data quality

These are issues in many biological applications!

[Cartoon caption: “I think I’ll bind here”]

Markov Neighborhood Algorithm

Building larger networks

~80 proteins are involved in MAPK signaling (11, at the cutting edge, is NOT enough!)

[Figure: the 11-molecule network over PKC, Raf, P44/42, Mek, Plc, PKA, Akt, Jnk, P38, PIP2 and PIP3]

4 color capability: model 12 variables

12 color capability: model 50-100 variables

Measured subsets = incomplete dataset (missing data)

Insufficient information for standard approaches (they will perform poorly)

Use a set of biologically motivated assumptions to constrain the search...

And to reduce the number of experiments: measuring every 4-molecule subset of the 11 molecules would take $\binom{11}{4} = 330$ experiments
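
The count on this slide is the number of ways to choose 4 molecules out of 11. A quick check, which also enumerates the subsets explicitly (the 1-based molecule numbering is only for illustration):

```python
import math
from itertools import combinations

n_molecules = 11  # molecules in the full network
panel_size = 4    # molecules measurable together with 4 colors

n_subsets = math.comb(n_molecules, panel_size)

# Enumerate every possible 4-molecule panel, numbering molecules 1..11.
all_panels = list(combinations(range(1, n_molecules + 1), panel_size))
```

Here `n_subsets` and `len(all_panels)` both equal 330, the number of experiments an exhaustive design would need.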

Constraining the search

Identify candidate parents (for each variable) using ‘Markov neighborhoods’

Plus potential perturbation parents
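
The slides do not say how a Markov neighborhood is computed, so the sketch below swaps in a simple stand-in: for each variable, any other variable whose absolute pairwise correlation exceeds a threshold becomes a candidate parent, and known perturbation targets are appended. The function name `candidate_parents`, the threshold, the `perturbations` mapping and the simulated data are all hypothetical.

```python
import math
import random


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def candidate_parents(data, threshold=0.3, perturbations=None):
    """Stand-in for a Markov neighborhood: correlated variables per node,
    plus any perturbation known to target that node."""
    perturbations = perturbations or {}
    candidates = {}
    for v in data:
        near = [u for u in data if u != v and abs(pearson(data[u], data[v])) >= threshold]
        candidates[v] = near + list(perturbations.get(v, []))
    return candidates


# Simulated measurements: Raf -> Mek -> Erk, with Akt unrelated (assumed).
rng = random.Random(2)
raf = [rng.gauss(0, 1) for _ in range(2000)]
mek = [0.8 * r + rng.gauss(0, 0.6) for r in raf]
erk = [0.8 * m + rng.gauss(0, 0.6) for m in mek]
akt = [rng.gauss(0, 1) for _ in range(2000)]
data = {"Raf": raf, "Mek": mek, "Erk": erk, "Akt": akt}

cands = candidate_parents(data, perturbations={"Akt": ["AKT_inh"]})
```

The candidate sets are deliberately supersets of the true parents (Erk also lists Raf). The constrained search then only has to score structures whose parents come from these sets.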

Approach overview

[Figure: overlapping measured subsets (molecules 1, 3, 7, 9; molecules 2, 4, 7, 10; molecules 1, 2, 6, 11) drawn from Raf, Mek1/2, Erk, p38, PKA, PKC, Jnk, PIP2, PIP3, Plc and Akt feed a Bayesian network analysis with constrained search]

Neighborhood reduction

[Figure: example neighborhood over A, B, C, D, E and F]

4 color capability

Conditional independencies in the substructure A, B, C?

Accurate reproduction of model

~15 experiments, 4 colors

[Figure: reconstructed network over PKC, Raf, Erk, Mek, Plc, Akt, Jnk, P38, PIP2, PIP3 and PKA. Legend: confidence value different from original model]

Active learning approach

[Figure: measured molecules Raf, Mek1/2, Erk, p38, PKA, PKC, Jnk, PIP2, PIP3, Plc and Akt]

Learning cyclic structures with Bayesian networks

Biological networks contain many loops

Bayesian networks are constrained to be acyclic

So…

Overcoming acyclicity

Signaling pathways contain many cycles

Bayesian networks are constrained to be acyclic

How can we accurately model pathways with cycles?

[Figure: pathway over GRB2/SOS, Ras, Raf, MEK and Erk]

Develop a new, Bayesian network derived algorithm that models cycles…

Bayesian Network Based Cyclic Networks (BBCs)

Solomon Itani

I. Break loops with molecule inhibitors

Detect loops: P(A)_{A*} ~= P(A)

II. Use BN to learn the structure (now not cyclic!)

III. Close loops

P(B|Pa(B))_{A*} ~= P(B|Pa(B))

[Figure: pathway over GRB2/SOS, Ras, Raf, MEK and Erk, with a Mek inhibitor applied; inset with nodes A and B]
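
Step I can be pictured on the graph itself: once an inhibitor blocks a molecule's downstream influence, the remaining graph may be acyclic and therefore learnable as an ordinary Bayesian network. The sketch below assumes a feedback edge from Erk back to GRB2/SOS to create the loop and models the Mek inhibitor as removing Mek's outgoing edges; both are illustrative assumptions, not details given on the slide.

```python
def has_cycle(edges):
    """Depth-first search for a directed cycle in a list of (source, target) edges."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, []).append(v)
        graph.setdefault(v, [])
    state = {}  # 1 = on the current DFS path, 2 = fully explored

    def visit(node):
        state[node] = 1
        for nxt in graph[node]:
            if state.get(nxt) == 1 or (nxt not in state and visit(nxt)):
                return True
        state[node] = 2
        return False

    return any(node not in state and visit(node) for node in graph)


def inhibit(edges, node):
    """Model an activity inhibitor as cutting the node's outgoing influence."""
    return [(u, v) for u, v in edges if u != node]


# Cascade from the slide's figure, plus an assumed Erk -> GRB2/SOS feedback edge.
pathway = [
    ("GRB2/SOS", "Ras"),
    ("Ras", "Raf"),
    ("Raf", "MEK"),
    ("MEK", "Erk"),
    ("Erk", "GRB2/SOS"),
]

cyclic_before = has_cycle(pathway)
cyclic_after = has_cycle(inhibit(pathway, "MEK"))
```

After the structure of the broken graph is learned (step II), step III would add the removed influence back to close the loop.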

Future work

Larger network from overlapping sets (Markov neighborhood)

Dynamic models over time

Differences in signaling (sub-populations, treatment conditions, cell types, disease states)

Acknowledgements

Shigeru Okumura

Solomon Itani

Garry Nolan

Dana Pe’er

Doug Lauffenburger

Omar Perez

Dennis Mitchell

Mesrob Ohannessian

Funding

LLS post doctoral fellowship

Extra slides

Mathematical Intuition

[Figure: network over A, B and C] C is independent of A given B.

[Figure: network over A, B, C and D] C is independent of A given B and D.

1) No need to introduce time!!!

2) When loops are broken, the result is a BN!!!

Prediction: Erk → Akt

Erk1/2 unperturbed

Erk → Akt not well established in literature

Predictions:

Erk1/2 influences Akt

While correlated, Erk1/2 does not influence PKA

[Figure: subnetwork over PKC, Raf, Mek, Erk1/2, PKA and Akt]

Validation

siRNA on Erk1/Erk2 → select transfected cells → measure Akt and PKA

[Figure: flow cytometry histograms of P-Akt (APC-A: p-akt-647) and P-PKA (PE-A: p-pka-546), comparing “control, stimulated” with “Erk1 siRNA, stimulated”]

P-Akt: P=9.4e-5

P-PKA: P=0.28