netbiosig2014-talk by ashwini patil

TimeXNet: Identifying active gene sub-networks using time- course gene expression profiles Ashwini Patil Institute of Medical Science University of Tokyo NetBio SIG, ISMB 2014

Upload: alexander-pico

Post on 10-May-2015


DESCRIPTION

NetBioSIG2014 at ISMB in Boston, MA, USA on July 11, 2014

TRANSCRIPT

TimeXNet: Identifying active gene sub-networks using time-course gene expression profiles

Ashwini Patil

Institute of Medical Science, University of Tokyo

NetBio SIG, ISMB 2014

Goal

• Comprehensive computational analysis of the innate immune response

Mouse interaction network: 103218 protein-protein, protein-DNA and post-translational modification interactions

Time-course gene expression: RNA-seq expression levels in dendritic cells on LPS stimulus at 8 time points

Innate immune system

Kawai & Akira, Nat. Immunology, 2010

Method - TimeXNet

Partition differentially expressed genes into 3 time-based groups

Identify most probable paths in the network connecting the three groups

Patil et al., PLOS Comp. Biol., 2013
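The partitioning step can be sketched in Python as follows. This is only an illustration under stated assumptions: the time windows are taken from the slide labels (0.5-1 h, 2-4 h, 6-8 h), the fold-change threshold of 2.0 and the function name `partition_genes` are hypothetical, only up-regulation is considered, and the expression values are invented toy numbers, not data from the talk.

```python
from typing import Dict, List

# Time windows in hours, taken from the slide labels (early, intermediate, late).
WINDOWS = {"T1": (0.5, 1.0), "T2": (2.0, 4.0), "T3": (6.0, 8.0)}


def partition_genes(fold_changes: Dict[str, Dict[float, float]],
                    threshold: float = 2.0) -> Dict[str, List[str]]:
    """Assign each gene to the window containing its maximal fold change.

    `threshold` is a hypothetical cut-off for calling a gene differentially
    expressed; genes whose peak stays below it are left out of all groups.
    """
    groups: Dict[str, List[str]] = {name: [] for name in WINDOWS}
    for gene, series in fold_changes.items():
        peak_time = max(series, key=series.get)   # time point of maximal fold change
        if series[peak_time] < threshold:
            continue                              # no change in expression
        for name, (start, end) in WINDOWS.items():
            if start <= peak_time <= end:
                groups[name].append(gene)
                break
    return groups


# Toy fold-change profiles (invented values, keyed by time in hours).
toy = {
    "geneA": {0.5: 4.0, 1: 3.0, 2: 1.5, 4: 1.2, 6: 1.0, 8: 1.0},
    "geneB": {0.5: 1.2, 1: 2.0, 2: 6.0, 4: 3.0, 6: 2.0, 8: 1.5},
    "geneC": {0.5: 1.0, 1: 1.1, 2: 2.0, 4: 5.0, 6: 9.0, 8: 7.0},
    "geneD": {0.5: 1.0, 1: 1.1, 2: 1.2, 4: 1.1, 6: 1.0, 8: 1.0},
}
groups = partition_genes(toy)
print(groups)
```

In this toy input, geneA peaks early, geneB in the intermediate window, geneC late, and geneD never crosses the threshold, so it is excluded from all three groups.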

Minimum cost flow optimization

• ResponseNet
  • Identifies paths between two groups of genes (genetic hits and differentially expressed genes in yeast)

- Yeger-Lotem et al., Nat. Genetics, 2009

TimeXNet methodology

• Edge cost: inversely proportional to edge reliability
• Edge capacity: directly proportional to
  • Fold change in expression of adjacent gene(s)
  • Absolute tag counts of adjacent gene(s)
• Objective function: minimize cost of flow through the network from T1 to T3 genes
• Constraint: flow must pass through intermediate nodes (T2 genes)

Most probable paths connecting T1->T2->T3 genes: 2681 scored interactions among 1225 proteins
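A minimal sketch of the idea, assuming a toy network, is given below. It is not the TimeXNet implementation (which, per the slides, calls the GLPK solver): the solver here is a plain successive-shortest-path routine, the fixed `demand` replaces whatever role the gamma parameters play, and the two-layer construction in `build_layered` is just one hypothetical way to force flow through T2 genes. All node names, reliabilities and capacities are invented, and capacities are not shared between the two layers.

```python
import math
from typing import Dict, List, Tuple


class MinCostFlow:
    """Successive shortest paths (Bellman-Ford) on a residual graph."""

    def __init__(self) -> None:
        self.adj: Dict[str, List[int]] = {}  # node -> indices into self.edges
        self.edges: List[list] = []          # [head, residual capacity, cost]

    def add_edge(self, u: str, v: str, capacity: float, cost: float) -> None:
        self.adj.setdefault(u, []).append(len(self.edges))
        self.edges.append([v, capacity, cost])
        self.adj.setdefault(v, []).append(len(self.edges))
        self.edges.append([u, 0.0, -cost])   # paired reverse (residual) edge

    def solve(self, s: str, t: str, demand: float) -> Tuple[float, float]:
        flow = cost = 0.0
        while flow < demand - 1e-9:
            dist = {n: math.inf for n in self.adj}
            prev: Dict[str, int] = {}
            dist[s] = 0.0
            for _ in range(len(self.adj) - 1):   # Bellman-Ford relaxation passes
                for u, out in self.adj.items():
                    if dist[u] == math.inf:
                        continue
                    for i in out:
                        v, cap, c = self.edges[i]
                        if cap > 1e-9 and dist[u] + c < dist[v] - 1e-12:
                            dist[v], prev[v] = dist[u] + c, i
            if dist[t] == math.inf:
                break                            # no augmenting path left
            push, v = demand - flow, t
            while v != s:                        # find the bottleneck capacity
                push = min(push, self.edges[prev[v]][1])
                v = self.edges[prev[v] ^ 1][0]
            v = t
            while v != s:                        # augment along the path
                i = prev[v]
                self.edges[i][1] -= push
                self.edges[i ^ 1][1] += push
                v = self.edges[i ^ 1][0]
            flow += push
            cost += push * dist[t]
        return flow, cost


def build_layered(edges, t1, t2, t3) -> MinCostFlow:
    """Copy the network into layer A (before T2) and layer B (after T2).

    Only T2 genes connect A to B, so every S->T path visits a T2 gene.
    """
    net = MinCostFlow()
    for u, v, reliability, capacity in edges:
        for layer in ("A", "B"):
            net.add_edge(f"{layer}:{u}", f"{layer}:{v}", capacity,
                         -math.log10(reliability))
    for g in t1:
        net.add_edge("S", f"A:{g}", 1.0, 0.0)
    for g in t2:
        net.add_edge(f"A:{g}", f"B:{g}", math.inf, 0.0)
    for g in t3:
        net.add_edge(f"B:{g}", "T", 1.0, 0.0)
    return net


# Invented toy network: (source, target, reliability, capacity).
toy_edges = [("early1", "mid1", 0.9, 1.0),
             ("mid1", "late1", 0.8, 1.0),
             ("early1", "late1", 0.99, 1.0)]   # shortcut that bypasses T2
net = build_layered(toy_edges, ["early1"], ["mid1"], ["late1"])
flow, cost = net.solve("S", "T", demand=1.0)
print(flow, cost)
```

Although the direct early1 -> late1 edge is the most reliable, the layered construction routes the unit of flow through mid1, so the total cost is the sum of the negative log reliabilities of the two edges on the T1 -> T2 -> T3 path.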

Candidate genes

Early genes           Intermediate genes    Late genes            Genes with no change
(0.5-1 hour)          (2-4 hours)           (6-8 hours)           in expression
Gene    Flow          Gene    Flow          Gene    Flow          Gene    Flow
Jun     13.68         Socs3   85.85         Cxcl10  10.91         Stat1   8.74
Fos     10.34         Nfkb1   76.87         Ddx58   9.33          Mapk8   8.72
Il1b    9.86          Jak2    54.44         Stat2   8.65          Irf5    7.60
Tnf     9.36          Src     38.30         Atf3    8.29          Adcy5   7.43
Cxcl2   7.59          Pik3r5  27.86         Isg15   8.15          Mapk1   7.40
Il1a    7.40          Rela    23.35         Irf7    7.30          Sp1     7.37
Akt1    6.43          Stat5a  20.40         Nos2    6.91          Stat6   7.17
Atf4    5.49          Met     18.94         Ifnar2  5.20          Sp3     7.13

Candidate networks

[Figure: predicted network. Node labels include interleukin and interferon receptors (e.g. Ifnar1, Ifnar2, Ifngr1, Ifngr2, Il6ra, Il10rb, Lifr), Jak2, Socs3, Stat1, Stat2, Stat4, Stat5a, Stat5b, Stat6, Fos, Jun, Rela, Nfkb1, Sp3, Foxo3, Bcl10 and Ddx58]

• Socs3
  • Suppressor of cytokine signaling 3
  • Induced by Nfkb and inhibits a large number of proteins, specifically the interleukin receptors

Candidate networks

Method evaluation

• Comparison with experimentally identified regulators
  • Amit et al., Science 2009: 49.6% previously unknown genes identified
  • Chevrier et al., Cell 2011: 69.8% regulators (novel and known) and 54.9% TLR target genes identified
• Overlap with KEGG pathways
  • Directed paths of 3 to 7 edges identified in 13 KEGG pathways
  • Jak-STAT signaling pathway, Chemokine signaling pathway, Toll-like receptor pathway, MAPK signaling pathway
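The percentages above are read here as the fraction of an experimentally derived reference gene set that also appears in the predicted network. A small sketch of that calculation follows; the function name `percent_recovered` and the gene identifiers are hypothetical placeholders, not the sets used in the talk.

```python
from typing import Iterable


def percent_recovered(predicted: Iterable[str], reference: Iterable[str]) -> float:
    """Percentage of reference genes that are present in the predicted set."""
    reference_set = set(reference)
    if not reference_set:
        raise ValueError("reference set must not be empty")
    hits = reference_set & set(predicted)
    return 100.0 * len(hits) / len(reference_set)


# Placeholder identifiers, for illustration only.
predicted_genes = ["g1", "g2", "g3"]
known_regulators = ["g2", "g3", "g4", "g5"]
recovered = percent_recovered(predicted_genes, known_regulators)
print(f"{recovered:.1f}% of reference genes recovered")
```

Two of the four placeholder reference genes are in the predicted set, so the sketch reports 50.0%.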

Noise in the interaction network

Comparison with other methods

Method         Experimentally confirmed            KEGG pathways with  Execution time      Prior knowledge Time-course
               regulators (3 datasets)             predicted paths     (4 CPUs, 2.4 GHz,   required        data
                                                   (max length)        12 GB RAM)
TimeXNet       49.6% [1]  69.8% [2]  54.9% [3]     13 (7 edges)        3 min               None            Yes
ResponseNet*   39.2% [1]  53.5% [2]  39.2% [3]     0 (3 edges)         1 min               None            No
SDREM          12.0% [1]  32.6% [2]  11.8% [3]     2 (4 edges)         ~10 days            Initial genes   Yes

[1] Regulatory genes from Amit et al., Science, 2009
[2] Regulatory genes from Chevrier et al., Cell, 2011
[3] Target genes from Chevrier et al., Cell, 2011

*Local implementation using GLPK

Yeast osmotic stress response

• Time-course gene expression (min) in yeast on hyperosmotic stress
  - Romero-Santacreu et al., RNA 2009
• Previously used to evaluate SDREM and ResponseNet
  - Gitter et al., Genome Research 2013
• Genes with 1.5 fold change in expression
  • Initial response genes: 2-4 min
  • Intermediate regulators: 6-8 min
  • Final effectors: 10-15 min

Predicted osmotic stress response network

• 2-4 min

• 6-8 min

• 10-15 min

• Predicted

Method         Gold Standard*   TFs*   Hog1   Runtime
TimeXNet       19               5      Yes    5 sec
SDREM*         10               4      Yes    -
ResponseNet*   3                2      No     -

*Taken from Gitter et al., Genome Research 2013

Circadian regulation of metabolism in mouse liver cells

- Unpublished

• Paths connecting genes showing rhythmic patterns of expression in 24 hours
• Network predicted by TimeXNet contains Sphk2, Pld1, Pld2, Glud1

TimeXNet Availability: http://timexnet.hgc.jp/

• Input
  • 3 sets of genes with scores
  • Weighted interaction network
  • Parameters gamma1 and gamma2
  • Location of glpsol executable from the GLPK
  • Directory where results will be stored

Cytoscape

Running TimeXNet

• Standalone application
• Command line version
• Iterative command line version to identify optimal parameters

Patil & Nakai, under review

Conclusion

• TimeXNet: A method to predict active gene sub-networks using time-course gene expression profiles
• Advantages
  • Accurate and fast
  • Independent of biological system: innate immune response, circadian regulation of metabolism in mouse, yeast osmotic stress response
  • Amenable to incorporation of other time-course data types: phosphorylation levels, protein levels, epigenetic information
• Issues to be addressed
  • Allowing path prediction between more than 3 groups of genes while maintaining speed and accuracy
  • Incorporating other forms of time-course information
  • Enhancements: automatic install of GLPK, allowing users to enter non-numeric gene IDs

Patil et al., PLOS Comp. Biol., 2013

Acknowledgements

• Innate immune response
  • Prof. Kenta Nakai, University of Tokyo
  • Dr. Yutaro Kumagai, Osaka University
  • Dr. Kuo-ching Liang, University of Tokyo
  • Prof. Yutaka Suzuki, University of Tokyo
  • Dr. Tomonao Inobe, Toyama University
• Yeast osmotic stress response
  • Dr. Anthony Gitter, Microsoft Research
• Circadian regulation of metabolism
  • Dr. Craig Jolley, RIKEN Center for Developmental Biology, Kobe
• Funding
  • Japan Society for the Promotion of Science (JSPS) FIRST Program
  • JSPS Grant-in-Aid for Young Scientists
  • Takeda Science Foundation (with Dr. Tomonao Inobe)
• Computational resources
  • Supercomputer at the Human Genome Center, Institute of Medical Science, University of Tokyo

Thank you!

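Edge Capacities

For edges between the auxiliary source, S, and the initial response genes GT1,

$$C_{Si} = \log_2\left(\frac{fc_{i\max}}{\sum_k fc_{k\max}/N}\cdot\frac{e_i}{\sum_k e_k/N}\right), \quad \forall\, i \in G_{T1} \qquad (3)$$

For edges connected to the intermediate regulators GT2,

$$C_{ij} = \log_2\left(\frac{fc_{i\max}}{\sum_k fc_{k\max}/N}\cdot\frac{e_i}{\sum_k e_k/N}\right), \quad \forall\, i \in G_{T2},\ j \notin G_{T2} \qquad (4)$$

$$C_{ij} = \frac{1}{2}\left[\log_2\left(\frac{fc_{i\max}}{\sum_k fc_{k\max}/N}\cdot\frac{e_i}{\sum_k e_k/N}\right) + \log_2\left(\frac{fc_{j\max}}{\sum_k fc_{k\max}/N}\cdot\frac{e_j}{\sum_k e_k/N}\right)\right], \quad \forall\, i, j \in G_{T2} \qquad (5)$$

For edges between the late effectors, GT3, and the auxiliary sink T,

$$C_{iT} = \log_2\left(\frac{fc_{i\max}}{\sum_k fc_{k\max}/N}\cdot\frac{e_i}{\sum_k e_k/N}\right), \quad \forall\, i \in G_{T3} \qquad (6)$$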
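As a rough numerical illustration of these capacities, the sketch below assumes that each capacity is the base-2 logarithm of a gene's maximal fold change and its average expression, each divided by the mean over all N genes and then multiplied together. That reading, the function names `capacity_term` and `pair_capacity`, and all values are assumptions made for illustration only.

```python
import math
from typing import Dict


def capacity_term(fc_max: Dict[str, float], expr: Dict[str, float], gene: str) -> float:
    """log2 of (fold change / mean fold change) * (expression / mean expression)."""
    mean_fc = sum(fc_max.values()) / len(fc_max)   # sum over genes divided by N
    mean_e = sum(expr.values()) / len(expr)
    return math.log2((fc_max[gene] / mean_fc) * (expr[gene] / mean_e))


def pair_capacity(fc_max: Dict[str, float], expr: Dict[str, float],
                  gene_i: str, gene_j: str) -> float:
    """Average of the two gene terms, for an edge whose ends are both T2 genes."""
    return 0.5 * (capacity_term(fc_max, expr, gene_i)
                  + capacity_term(fc_max, expr, gene_j))


# Invented toy values: maximal fold change and average expression per gene.
fc_max = {"a": 8.0, "b": 4.0, "c": 2.0, "d": 2.0}        # mean = 4.0
expr = {"a": 40.0, "b": 20.0, "c": 10.0, "d": 10.0}      # mean = 20.0

cap_a = capacity_term(fc_max, expr, "a")       # log2(2 * 2)
cap_ab = pair_capacity(fc_max, expr, "a", "b") # (log2(4) + log2(1)) / 2
print(cap_a, cap_ab)
```

Under this reading, genes at or below the mean yield a zero or negative term (genes c and d in the toy data); how such values are handled is not stated on the slides.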


• Graph G = (V, E) with E edges and V nodes (containing S: auxiliary source, T: auxiliary sink)
• fc = fold change
• e = average expression level at all time points
• N = number of genes with expression values
• S = auxiliary source node
• T = auxiliary sink node
• GT1, GT2, GT3 = genes having maximal fold change at times T1, T2 and T3

For all other edges, not connected to the intermediate regulators or the auxiliary source and sink,

$$C_{ij} = 1, \quad \forall\, i, j \notin \{S\} \cup G_{T2} \cup \{T\}$$

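Edge costs

$$w_{Si} = C_{Si}, \quad \forall\, i \in G_{T1} \qquad (8)$$

$$w_{ij} = C_{ij}, \quad \forall\, i \in G_{T2} \qquad (9)$$

$$w_{iT} = C_{iT}, \quad \forall\, i \in G_{T3} \qquad (10)$$

$$w_{ij} = f(s_{ij}), \quad \forall\, i, j \notin \{S\} \cup G_{T2} \cup \{T\}, \text{ as per equation (2)}$$

The edge costs were calculated as:

$$A_{ij} = -\log_{10} w_{ij}, \quad \forall\, (i, j) \in E \qquad (11)$$

Where f() = scaling function

$$s_{ij} = \text{likelihood ratio}, \quad \forall\, (i, j) \in \text{HitPredict}; \quad 0.163 \le s_{ij} \le 999$$

$$s_{ij} = 999, \quad \forall\, (i, j) \in \text{InnateDB, KEGG}$$

$$s_{ij} = \text{Transfac score}, \quad \forall\, (i, j) \in \text{TRANSFAC}; \quad 1 \le s_{ij} \le 6$$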
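A short sketch of why a logarithmic edge cost fits the "most probable paths" framing: if each edge weight is treated as a reliability in (0, 1] and the cost is taken as its negative base-10 logarithm (an assumption about the sign, consistent with cost being inversely related to reliability), then the summed cost of a path equals the negative log of the product of its weights. The scaling function f() is not shown on the slides, so the weights below are assumed to be already scaled; `edge_cost` and `path_cost` are hypothetical helper names.

```python
import math
from typing import Sequence


def edge_cost(weight: float) -> float:
    """Negative log10 of an edge weight assumed to lie in (0, 1]."""
    if not 0.0 < weight <= 1.0:
        raise ValueError("weight must be in (0, 1]")
    return -math.log10(weight)


def path_cost(weights: Sequence[float]) -> float:
    """Total cost of a path, given the weights of its edges."""
    return sum(edge_cost(w) for w in weights)


# Invented weights for two alternative paths between the same endpoints.
reliable_path = [0.9, 0.8]         # product 0.72
unreliable_path = [0.9, 0.5, 0.9]  # product 0.405

cost_reliable = path_cost(reliable_path)
cost_unreliable = path_cost(unreliable_path)
print(cost_reliable, cost_unreliable)
```

The path with the larger product of weights has the smaller total cost, so minimizing summed costs favors the more reliable path.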


Optimization problem