dynamic modelling of microarray data

Dynamic modelling of microarray data. Martino Barenco Institute of Child Health / UCL


Post on 21-Mar-2016


Page 1: Dynamic modelling of microarray data

Dynamic modelling of microarray data.

Martino Barenco

Institute of Child Health / UCL

Page 2: Dynamic modelling of microarray data

Goal: predict targets of a known transcription factor in a complex response using dynamic models and time course microarray data.

HVDM: Hidden Variable Dynamic Modelling

Outline

1) Principle + Results (Genome Biology 2006)

2) Techniques (R/Bioconductor implementation: rHVDM)

Page 3: Dynamic modelling of microarray data

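Gene expression model

Transcript concentration $X_j(t)$:

$\frac{dX_j(t)}{dt} = B_j + S_j f(t) - D_j X_j(t)$

- $B_j$, $S_j$: transcription rates
- $D_j$: degradation rate
- $f(t)$: transcription factor activity

[Figure: transcription factor activity $f(t)$ and the resulting profiles $X_j(t)$. Baseline $B_j = 0$, $S_j = 3$, $D_j = 1$; panels vary $B_j$ (0, 10), $D_j$ (1, 0.1) and $S_j$ (3, 6).]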
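To make the effect of the three kinetic parameters concrete, the model can be integrated numerically. The following is a minimal sketch, not the method used in the talk: it uses a classical fourth-order Runge-Kutta scheme, and the activity profile `pulse` is a hypothetical smooth bump chosen only for illustration. The parameter values are the ones legible on the slide.

```python
import math

def simulate(B, S, D, f, x0, t_end, dt=0.01):
    """Integrate dX/dt = B + S*f(t) - D*X with classical RK4; returns (times, values)."""
    def rhs(t, x):
        return B + S * f(t) - D * x

    n_steps = int(round(t_end / dt))
    t, x = 0.0, x0
    times, values = [t], [x]
    for _ in range(n_steps):
        k1 = rhs(t, x)
        k2 = rhs(t + dt / 2, x + dt * k1 / 2)
        k3 = rhs(t + dt / 2, x + dt * k2 / 2)
        k4 = rhs(t + dt, x + dt * k3)
        x += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += dt
        times.append(t)
        values.append(x)
    return times, values

def pulse(t):
    # Hypothetical activity profile (not from the slides): a smooth bump peaking at t = 4.
    return math.exp(-((t - 4.0) ** 2) / 4.0)

# Baseline values from the slide: B_j = 0, S_j = 3, D_j = 1.
times, baseline = simulate(B=0.0, S=3.0, D=1.0, f=pulse, x0=0.0, t_end=12.0)
# Same gene with slower degradation (D_j = 0.1): the transcript persists for longer.
_, slow_decay = simulate(B=0.0, S=3.0, D=0.1, f=pulse, x0=0.0, t_end=12.0)
# Same gene with doubled sensitivity (S_j = 6): the response is scaled up.
_, sensitive = simulate(B=0.0, S=6.0, D=1.0, f=pulse, x0=0.0, t_end=12.0)
```

With a small $D_j$ the profile lags behind the activity and decays slowly; with a large $D_j$ it tracks $f(t)$ closely.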

Page 4: Dynamic modelling of microarray data

Algorithm principle:

I) Training step:

Inputs:
- Previous biological knowledge: known targets of the transcription factor
- Expression values of those targets

Output:
- Transcription factor activity (the hidden variable)
- Kinetic parameters for the training genes

II) Screening step (for each single gene):

Input:
- Transcription factor activity
- Expression profile of the gene

Output:
- Dependency status of the gene: target or not?

Page 5: Dynamic modelling of microarray data

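Training step (j: training genes):

$\frac{dX_j(t)}{dt} = B_j + S_j f(t) - D_j X_j(t)$

Here $X_j(t)$ is measured; $f(t)$ and the kinetic parameters $B_j$, $S_j$, $D_j$ are estimated.

Screening step (j: individual gene being screened):

$\frac{dX_j(t)}{dt} = B_j + S_j f(t) - D_j X_j(t)$

Here $X_j(t)$ is measured and $f(t)$ is taken from the training step; $B_j$, $S_j$, $D_j$ are estimated for the gene.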

Page 6: Dynamic modelling of microarray data

The p53 network

[Figure: p53 network diagram. Nodes shown: DNA damage, ATM, active ATM, CHK2, p53, active p53, MDM2, p19Arf, E2F1, Rb, Rb/E2F1, CDK4, p73, 14-3-3, Jun-B, p21, Bax, p53AIF, Puma, Fas, Pidd, DR5, bcl2, myb, Jun. Outcomes shown: cell cycle G1/S arrest, G2/M arrest, survival, death receptor, mitochondrial apoptosis.]

Page 7: Dynamic modelling of microarray data

Experimental setup

Human T cells (MOLT4, p53 wild-type) were exposed to 5 Gy irradiation.

mRNA was harvested 2, 4, 6, 8, 10 and 12 hours after irradiation, and just before it (0 h time point).

Affymetrix microarrays (HG-U133) were then run.

The experiment was run in triplicate.

Page 8: Dynamic modelling of microarray data

Results of training step: activity profile of p53

Page 9: Dynamic modelling of microarray data

Screening

Q: what are the other genes that are p53-activated?

Putative p53 targets must both:

a) Fit the model well
b) Have a sensitivity coefficient $S_j > 0$

$\frac{dX_j(t)}{dt} = B_j + S_j f(t) - D_j X_j(t)$
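The two screening criteria can be written as a simple filter. This is a sketch under stated assumptions: a lower model score is taken to mean a better fit, the Z-score is taken to be the fitted sensitivity divided by its standard error, and both thresholds (`max_model_score`, `min_z_score`) are hypothetical placeholders rather than values given in the slides.

```python
def sensitivity_z_score(s_estimate, s_std_error):
    """How many standard errors the fitted sensitivity S_j lies above zero (assumed definition)."""
    return s_estimate / s_std_error

def is_putative_target(model_score, s_estimate, s_std_error,
                       max_model_score=100.0, min_z_score=2.5):
    """Apply both screening criteria; the default thresholds are hypothetical."""
    fits_well = model_score <= max_model_score                      # a) fits the model well
    z_score = sensitivity_z_score(s_estimate, s_std_error)
    clearly_positive = z_score >= min_z_score                       # b) S_j > 0, beyond noise
    return fits_well and clearly_positive

# Made-up fit results for three genes: (model score, S_j estimate, S_j standard error).
print(is_putative_target(50.0, 2.0, 0.5))    # good fit, S_j well above zero
print(is_putative_target(150.0, 2.0, 0.5))   # poor fit
print(is_putative_target(50.0, 0.1, 0.5))    # S_j indistinguishable from zero
```

Requiring a Z-score rather than a bare $S_j > 0$ reflects the point that a positive estimate only counts if it is large compared with its uncertainty.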

Page 10: Dynamic modelling of microarray data

Gene Title | Gene Symbol | Affymetrix Identifier | Model score M | Sensitivity (Z-score)
damage-specific DNA binding protein 2, 48kDa | DDB2 | 203409_at | 18.74 | 18.24
CD38 antigen (p45) | CD38 | 205692_s_at | 36.69 | 14.77
ferredoxin reductase | FDXR | 207813_s_at | 79.82 | 13.19
hypothetical protein FLJ22457 | FLJ22457 | 221081_s_at | 60.45 | 11.01
tripartite motif-containing 22 | TRIM22 | 213293_s_at | 41.36 | 10.99
carnitine O-octanoyltransferase | CROT | 204573_at | 84.4 | 10.98
glutaminase 2 (liver, mitochondrial) | GLS2 | 205531_s_at | 42.83 | 10.28
leucine-rich repeats and death domain containing | LRDD | 219019_at | 78.8 | 9.9
hect domain and RLD 5 | HERC5 | 219863_at | 37.65 | 9.55
cyclin G1 | CCNG1 | 208796_s_at | 17.04 | 9.37
BCL2-interacting killer | BIK | 205780_at | 19.43 | 9.35
activating signal cointegrator 1 complex subunit 3 | ASCC3 | 212815_at | 60.34 | 9.26
sestrin 1 | SESN1 | 218346_s_at | 8.37 | 9.25
p53 target zinc finger protein | WIG1 | 219628_at | 41.33 | 9.19
tumor necrosis factor receptor superfamily, member 10b | TNFRSF10B | 209295_at | 27.34 | 9.05
chromosome 6 open reading frame 4 | C6orf4 | 215411_s_at | 86.45 | 8.81
cyclin-dependent kinase inhibitor 1A (p21) | CDKN1A | 202284_s_at | 24.98 | 8.4
etoposide induced 2.4 mRNA | EI24/PIG8 | 216396_s_at | 88.04 | 8.2
mitogen-activated protein kinase kinase kinase kinase 4 | MAP4K4 | 206571_s_at | 62.88 | 7.54
lymphoid-restricted membrane protein | LRMP | 204674_at | 26.92 | 7.36
xeroderma pigmentosum, group C | XPC | 209375_at | 43.09 | 7.36
TNF (ligand) superfamily, member 4 (Ox40L) | TNFSF4 | 207426_s_at | 34.73 | 7.15
Human cleavage/polyadenylation specificity factor | CPSF1 | 33132_at | 77.75 | 7.09
AMP-activated protein kinase, beta 1 subunit | PRKAB1 | 201834_at | 25.72 | 7.01
transducer of ERBB2, 1 | TOB1 | 202704_at | 92.69 | 6.79
p53-inducible cell-survival factor | P53CSV | 218403_at | 48.33 | 6.5
sortilin-related receptor, L(DLR class) | SORL1 | 203509_at | 15.66 | 6.34
Fas (TNF receptor superfamily, member 6) | FAS | 216252_x_at | 44.31 | 6.23
ribonucleotide reductase M1 polypeptide | RRM1 | 201477_s_at | 46.58 | 6.19
archaemetzincins-2 | AMZ2 | 218167_at | 37.48 | 6.16
galactose-3-O-sulfotransferase 4 | GAL3ST4 | 219815_at | 38.62 | 5.97
growth arrest and DNA-damage-inducible, alpha | GADD45A | 203725_at | 84.23 | 5.89
hypothetical protein FLJ11259 | FLJ11259 | 218627_at | 7.23 | 5.87
major histocompatibility complex, class I, B | HLA-B | 209140_x_at | 89.77 | 5.79
testis specific, 10 | TSGA10 | 220623_s_at | 20.85 | 5.67
hypothetical protein MDS025 | MDS025 | 218288_s_at | 31.35 | 5.66
TP53 activated protein 1 | TP53AP1 | 209917_s_at | 22.22 | 5.65
leukemia inhibitory factor | LIF | 205266_at | 14.86 | 5.62
interferon stimulated exonuclease gene 20kDa-like 1 | ISG20L1 | 219361_s_at | 48.55 | 5.56

Page 11: Dynamic modelling of microarray data

p21: part of the training set

CD38: uncovered by screening

Page 12: Dynamic modelling of microarray data

Verification experiment

siRNA knock down of p53:

HVDM predictions:

Page 13: Dynamic modelling of microarray data

Ingredients needed

1) ODE integrator:

$\frac{dX_j(t)}{dt} = B_j + S_j f(t) - D_j X_j(t)$ + parameter values → $X_{j,\mathrm{MODEL}}(t)$

2) Model fitting:

Find a set of parameter values such that $X_{j,\mathrm{MODEL}}(t) \cong X_{j,\mathrm{DATA}}(t)$

3) Want to take measurement noise in the data into account

4) Specifically for the Bioconductor implementation: be reasonably quick

Page 14: Dynamic modelling of microarray data

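1) ODE integration

[Figure: model profile plotted over t = 0 to 12; vertical axis 0 to 80.]

- Want to estimate the slope of $X_{j,\mathrm{MODEL}}(t)$ at t=6

- Slope = weighted sum of time points around t=6:

$\frac{d\mathbf{X}_{j,\mathrm{MODEL}}(t)}{dt} \cong A\,\mathbf{X}_{j,\mathrm{MODEL}}(t)$

$A\,\mathbf{X}_j(t) = B_j \mathbf{1} + S_j \mathbf{f}(t) - D_j \mathbf{X}_j(t)$

- i.e. the ODE is turned into a system of linear equations

Formal solution:

$\mathbf{X}_{j,\mathrm{MODEL}}(t) = (A + D_j I)^{-1}(B_j \mathbf{1} + S_j \mathbf{f}(t))$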
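The formal solution can be tried out on a small grid. The sketch below is an illustration only: the slides say only that the slope is a weighted sum of neighbouring time points, so the particular weights used here (second-order finite differences on a uniform grid) are an assumption, and the small Gaussian-elimination routine `solve` is included only to keep the example dependency-free.

```python
def differentiation_matrix(n, h):
    """n x n matrix A with (A x)[i] approximating dx/dt at t_i on a uniform grid of spacing h."""
    A = [[0.0] * n for _ in range(n)]
    A[0][0], A[0][1], A[0][2] = -3 / (2 * h), 4 / (2 * h), -1 / (2 * h)       # forward stencil
    for i in range(1, n - 1):
        A[i][i - 1], A[i][i + 1] = -1 / (2 * h), 1 / (2 * h)                   # central stencil
    A[-1][-3], A[-1][-2], A[-1][-1] = 1 / (2 * h), -4 / (2 * h), 3 / (2 * h)   # backward stencil
    return A

def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def model_profile(B, S, D, f_values, h):
    """X_model = (A + D I)^-1 (B 1 + S f), evaluated at the measurement times."""
    n = len(f_values)
    A = differentiation_matrix(n, h)
    lhs = [[A[i][k] + (D if i == k else 0.0) for k in range(n)] for i in range(n)]
    rhs = [B + S * f for f in f_values]
    return solve(lhs, rhs)

# Seven time points spaced 2 apart (0, 2, ..., 12); the activity values are made up.
f_values = [0.0, 1.0, 2.0, 1.5, 1.0, 0.5, 0.2]
x_model = model_profile(B=0.0, S=3.0, D=1.0, f_values=f_values, h=2.0)
```

A useful sanity check is a constant activity: every row of `A` sums to zero, so the solution collapses to the steady state $(B_j + S_j f)/D_j$ at every time point. Because only one small linear solve is needed per gene, this approach is cheap compared with stepping an ODE solver.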

Page 15: Dynamic modelling of microarray data

2) Model fitting

1) Start with a “random” set of parameters:

$\mathbf{p} = \{B_1, \ldots, B_m, S_1, \ldots, S_m, D_1, \ldots, D_m, \mathbf{f}\}$

2) Compute a solution:

$\mathbf{X}_{j,\mathrm{MODEL}}(t) = (A + D_j I)^{-1}(B_j \mathbf{1} + S_j \mathbf{f}(t)), \quad j = 1, \ldots, m$

3) Compare with data using a merit function:

$M(\mathbf{p}) = \sum_{j=1}^{m} \sum_{i=1}^{n} \left( \frac{\hat{X}_j(t_i) - X_j(t_i)}{\sigma(X_j(t_i))} \right)^2$

(n time points (i), m genes (j))

4) Vary p systematically until a minimum value for M(p) is reached.
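The merit function in step 3 is a noise-weighted sum of squared differences between model and data. A minimal sketch follows; the numbers are made up for illustration, and the nested-list layout (one inner list per gene) is just one convenient choice.

```python
def merit(model, data, sigma):
    """Sum over genes j and time points i of ((model - data) / sigma)^2."""
    total = 0.0
    for model_j, data_j, sigma_j in zip(model, data, sigma):
        for m, x, s in zip(model_j, data_j, sigma_j):
            total += ((m - x) / s) ** 2
    return total

# Two hypothetical genes measured at three time points.
data = [[10.0, 14.0, 12.0], [5.0, 5.5, 6.0]]
sigma = [[1.0, 2.0, 1.0], [0.5, 0.5, 0.5]]     # measurement standard deviations
model = [[11.0, 12.0, 12.0], [5.0, 6.0, 6.0]]

print(merit(model, data, sigma))   # 3.0
```

Dividing by sigma means that a deviation of 2 units at a noisy point (sigma = 2) counts the same as a deviation of 1 unit at a precise one (sigma = 1).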

Page 16: Dynamic modelling of microarray data

Fitting algorithms:

- Originally used a simplex-based method (Nelder-Mead) (GB paper)
- Followed by an MCMC step to determine confidence intervals (GB paper)
- rHVDM (Bioconductor) uses Levenberg-Marquardt (gradient-based). A by-product is the Hessian, from which confidence intervals can be computed.
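How a Hessian yields a confidence interval can be shown on a one-parameter toy problem. This is a sketch of the standard least-squares approximation (variance roughly 2 divided by the second derivative of a chi-square-type merit function at its minimum); the slides do not spell out the exact computation in rHVDM, so it should not be read as that package's procedure. The data and the names `merit_fn`, `g` and `S_hat` are made up.

```python
import math

def second_derivative(func, p, step=1e-3):
    """Central finite-difference estimate of the second derivative of func at p."""
    return (func(p + step) - 2.0 * func(p) + func(p - step)) / step ** 2

def confidence_interval(merit_fn, p_hat, z=1.96):
    """Approximate 95% interval from the curvature of the merit function at its minimum.

    One-parameter case of: covariance ~ 2 * inverse Hessian of a chi-square merit.
    """
    variance = 2.0 / second_derivative(merit_fn, p_hat)
    half_width = z * math.sqrt(variance)
    return p_hat - half_width, p_hat + half_width

# Toy problem: data = S * g + noise, with a single unknown sensitivity S.
g = [0.0, 1.0, 2.0, 3.0]
data = [0.1, 2.1, 3.9, 6.2]
sigma = [0.2, 0.2, 0.2, 0.2]

def merit_fn(S):
    return sum(((S * gi - xi) / si) ** 2 for gi, xi, si in zip(g, data, sigma))

# For this linear model the minimiser has a closed form (weighted least squares).
weights = [1.0 / si ** 2 for si in sigma]
S_hat = (sum(w * gi * xi for w, gi, xi in zip(weights, g, data))
         / sum(w * gi ** 2 for w, gi in zip(weights, g)))

low, high = confidence_interval(merit_fn, S_hat)
```

A sharply curved minimum gives a narrow interval and a shallow one gives a wide interval, which is why the curvature information that a gradient-based fit produces anyway is enough for a first estimate.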

Page 17: Dynamic modelling of microarray data

Difference between MCMC and LM confidence intervals.

[Figure: four panels. Basal rates, Sensitivity and Degradation, each shown for probe sets 203409_at, 218346_s_at, 209295_at, 202284_s_at and 205780_at; Transcription factor activity (sample 1) shown at time points 1 to 7.]

Page 18: Dynamic modelling of microarray data

Importance of confidence intervals

Biological data is inherently noisy. We don’t want to assume that measurements are exact.

Example: genes with a flat profile would be a good fit to the equation (with $S_j = 0$).

It is essential to identify these situations in order to detect targets of the transcription factor.

$\frac{dX_j(t)}{dt} = B_j + S_j f(t) - D_j X_j(t)$

Page 19: Dynamic modelling of microarray data

Parameter count reduction / identifiability

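$\frac{dX_j(t)}{dt} = B_j + S_j f(t) - D_j X_j(t)$

Replace $f(t)$ with $f(t) = \alpha g(t) + \beta$:

$\frac{dX_j(t)}{dt} = B_j + S_j (\alpha g(t) + \beta) - D_j X_j(t)$

$= (B_j + S_j \beta) + \alpha S_j g(t) - D_j X_j(t)$

$= \tilde{B}_j + \tilde{S}_j g(t) - D_j X_j(t)$

The data therefore cannot distinguish $f$ from $g$: the scale $\alpha$ and the offset $\beta$ are not identifiable.

Solution: let $S_{p21} = 1$ (removes the “α” ambiguity) and $f(0) = 0$ (removes the “β” ambiguity). The parameter count is reduced by 2.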
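The ambiguity created by the substitution f(t) = αg(t) + β is easy to verify numerically: rescaling the activity while adjusting the basal rate and the sensitivity leaves the right-hand side of the ODE unchanged. In the sketch below all numbers are made up, and `reparametrise` is a hypothetical helper name.

```python
def rhs(B, S, D, f_t, x):
    """Right-hand side of dX/dt = B + S f(t) - D X at a single time point."""
    return B + S * f_t - D * x

def reparametrise(B, S, alpha, beta):
    """Map (B, S) used with activity f = alpha*g + beta to the pair used with activity g."""
    return B + S * beta, alpha * S

B, S, D = 2.0, 3.0, 0.5            # made-up kinetic parameters
alpha, beta = 4.0, 1.5             # arbitrary scale and offset
g_values = [0.0, 0.3, 1.2, 0.8]    # made-up activity profile g(t)
x_values = [4.0, 5.0, 9.0, 8.0]    # made-up transcript levels

B_tilde, S_tilde = reparametrise(B, S, alpha, beta)
for g_t, x in zip(g_values, x_values):
    original = rhs(B, S, D, alpha * g_t + beta, x)
    rescaled = rhs(B_tilde, S_tilde, D, g_t, x)
    assert abs(original - rescaled) < 1e-12   # identical dynamics, different parameters
```

Since any choice of α and β gives the same dynamics, two extra constraints are needed to pin the solution down, which is what fixing one sensitivity and one value of the activity achieves.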

Page 20: Dynamic modelling of microarray data

Confidence intervals importance II

Solution: measure one of the kinetic parameters independently and integrate that measurement into the fitting:

Initial fitting:

Page 21: Dynamic modelling of microarray data

- Measurement error
- Algorithmic speed
- Parameter identifiability
- Parameter count reduction
- Confidence intervals

Page 22: Dynamic modelling of microarray data

Acknowledgements

Sonia Shah (Bloomsbury Centre for Bioinformatics)
Dan Brewer (Institute of Cancer Research)
Crispin Miller (Paterson Institute for Cancer Research)
Daniela Tomescu (ICH)
Mike Hubank (ICH)
Robin Callard (ICH)
Jaroslav Stark (CISBIC, Imperial College)