Identifying Causal Genes and Dysregulated Pathways in Complex Diseases


Yoo-Ah Kim, NIH / NLM / NCBI
Nov. 6th, 2010

Complex Diseases
- Associated with the effects of multiple genes, as opposed to single-gene diseases
- The combination of genomic alterations may vary strongly among different patients
- Different alterations dysregulate the same components, thus often leading to the same disease phenotype
- Difficult to study and treat
- Examples: cancer, heart disease, diabetes, etc.

Copy Number Variations
- Two copies of each gene are generally assumed to be present in a genome
- Genomic regions may be deleted or duplicated, causing copy number variations (CNVs)
- Some CNVs are associated with susceptibility or resistance to diseases such as cancer

[Figure: copy number variations in 158 glioblastoma patients]

Identifying Genomic Causes in Complex Diseases
- Identify genotypic causes in individual patients as well as dysregulated pathways
- Systems biology approach
- Genome-wide search
- Graph-theoretic algorithms: circuit flow, set cover
- Data: 158 glioblastoma multiforme patients

Glioblastoma multiforme (GBM)
- The most common and most aggressive type of primary brain tumor in humans

Expression as Quantitative Trait
- Genotype: copy number variations
- Phenotype: gene expression

eQTL (expression Quantitative Trait Loci) Analysis
- While we assume that the genetic variation is the cause and the expression change is the effect, we don't know the molecular pathways behind the relation
[Figure labels: putative causal gene/loci, putative target gene]

Method Outline
- Target gene selection, from gene expression
- eQTL: find associations between expression and copy number
- Circuit flow algorithm over molecular interactions, yielding candidate causal genes
- Causal gene selection: weighted multiset cover
[Figure: method overview relating cases, tag loci s1 ... sn, target genes g1 ... gm, and causal genes]

Target Gene Selection
- Select a representative set of disease genes
- Filter differentially expressed genes for each case
- Multi-set cover
[Figure: gene expression of Gene 1, Gene 2, Gene 3, ... in controls versus disease cases]

Target Gene Selection (Continued)
- Minimum multi-set cover
- A gene covers a particular disease case if the gene is differentially expressed in that case
- Find a smallest set of genes that covers (almost) all cases at least k times
- Result: 74 target genes selected
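The covering step above can be sketched as a greedy heuristic. The slides do not say how the minimum multi-set cover was actually solved, so the greedy rule, the function name `greedy_multiset_cover`, and the `max_uncovered` tolerance (standing in for "almost all cases") are assumptions made for illustration only.

```python
from typing import Dict, List, Set


def greedy_multiset_cover(covers: Dict[str, Set[str]], cases: Set[str],
                          k: int, max_uncovered: int = 0) -> List[str]:
    """Greedy heuristic (an assumption, not the authors' solver).

    covers[g] is the set of cases in which gene g is differentially expressed.
    Genes are added until at most `max_uncovered` cases are covered < k times.
    """
    need = {c: k for c in cases}      # remaining coverage demand per case
    remaining = dict(covers)          # genes not yet chosen
    chosen: List[str] = []

    def gain(gene: str) -> int:
        # Number of still-needed coverage units this gene would supply.
        return sum(1 for c in remaining[gene] if need.get(c, 0) > 0)

    while sum(1 for v in need.values() if v > 0) > max_uncovered and remaining:
        best = max(sorted(remaining), key=gain)   # ties broken alphabetically
        if gain(best) == 0:
            break                                 # nothing left can help
        for c in remaining.pop(best):
            if need.get(c, 0) > 0:
                need[c] -= 1
        chosen.append(best)
    return chosen


# Toy instance with hypothetical genes A-D and cases c1-c4.
toy = {"A": {"c1", "c2", "c3"}, "B": {"c3", "c4"}, "C": {"c1", "c4"}, "D": {"c2"}}
print(greedy_multiset_cover(toy, {"c1", "c2", "c3", "c4"}, k=1))
```

A greedy rule gives a logarithmic approximation for set cover in general rather than a guaranteed minimum, which is usually acceptable when the goal is a compact, representative gene set.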

[Figure: example multi-set cover instance with genes A-E and disease cases case1-case7]

eQTL: associations between the expression of target genes and copy number variations of genomic loci
- Linear regression for every pair of tag loci and target genes
[Figure: eQTL between the expression matrix (cases x target genes) and the copy number matrix (cases x tag loci)]
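A minimal sketch of the pairwise regression scan described above. The slides state only that linear regression is run for every pair of tag locus and target gene; the data layout, the function names, and the `min_abs_r` cutoff used to call an association are assumptions for illustration.

```python
from math import sqrt
from typing import Dict, List, Tuple


def linear_fit(x: List[float], y: List[float]) -> Tuple[float, float, float]:
    """Least-squares fit y = slope * x + intercept; also returns Pearson r."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0, my, 0.0           # constant input: no association
    slope = sxy / sxx
    return slope, my - slope * mx, sxy / sqrt(sxx * syy)


def eqtl_scan(copy_number: Dict[str, List[float]],
              expression: Dict[str, List[float]],
              min_abs_r: float = 0.8) -> List[Tuple[str, str, float, float]]:
    """Regress each target gene's expression on each tag locus's copy number.

    Both inputs map a name to one value per case, in the same case order.
    The |r| threshold is a placeholder for whatever significance test is used.
    """
    hits = []
    for locus, cn in copy_number.items():
        for gene, ex in expression.items():
            slope, _, r = linear_fit(cn, ex)
            if abs(r) >= min_abs_r:
                hits.append((locus, gene, slope, r))
    return hits
```

In practice a p-value with multiple-testing correction would replace the fixed correlation cutoff, since the scan runs one regression per locus-gene pair.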

Finding Candidate Causal Genes
[Figure, built up over several slides: genotypic variations, candidate genes C1-C5, and a target gene D, connected through an interaction network]
- Interaction network: protein-protein interactions, phosphorylation events, transcription factor interactions
- Current flow (+/-) between the target gene D and the candidate genes, along network edges (u, v)

- Resistance(u, v) is set to be inversely proportional to (|corr(expr(u), expr(D))| + |corr(expr(v), expr(D))|) / 2
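The resistance rule can be written out as a small function. This is a sketch under stated assumptions: the proportionality constant is taken to be 1, `corr` is taken to be the Pearson correlation across cases, and the `eps` floor (to avoid division by zero when both correlations vanish) is not mentioned in the slides.

```python
from math import sqrt
from typing import List


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sqrt(sxx * syy)


def resistance(expr_u: List[float], expr_v: List[float],
               expr_d: List[float], eps: float = 1e-6) -> float:
    """Resistance of edge (u, v) with respect to target gene D.

    Edges whose endpoints are both strongly correlated with D (in absolute
    value) get low resistance, so more current flows through them.
    """
    conductance = (abs(pearson(expr_u, expr_d)) + abs(pearson(expr_v, expr_d))) / 2
    return 1.0 / max(conductance, eps)
```

Using absolute correlations means that an activating and a repressing relationship with D lower the resistance equally.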

- Compute the amount of current entering each causal gene by solving a system of linear equations
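One way to set up that linear system is Kirchhoff's current law on the interaction network. The sketch below assumes the target gene is held at potential 1 and every candidate causal gene is grounded at 0; the slides do not spell out the boundary conditions, so these, along with all function names, are illustrative assumptions. The network is assumed to be connected.

```python
from typing import Dict, List, Tuple


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def current_into_candidates(edges: Dict[Tuple[str, str], float], target: str,
                            candidates: List[str]) -> Dict[str, float]:
    """edges maps an undirected edge (u, v) to its resistance."""
    fixed = {target: 1.0, **{c: 0.0 for c in candidates}}
    nodes = sorted({n for e in edges for n in e})
    free = [n for n in nodes if n not in fixed]
    idx = {n: i for i, n in enumerate(free)}
    a = [[0.0] * len(free) for _ in free]
    b = [0.0] * len(free)
    # Kirchhoff's current law at every free node: net current is zero.
    for (u, v), r in edges.items():
        g = 1.0 / r
        for p, q in ((u, v), (v, u)):
            if p in idx:
                a[idx[p]][idx[p]] += g
                if q in idx:
                    a[idx[p]][idx[q]] -= g
                else:
                    b[idx[p]] += g * fixed[q]
    volt = dict(fixed)
    volt.update(zip(free, solve_linear(a, b)))
    # Ohm's law on each edge touching a candidate gives its incoming current.
    inflow = {c: 0.0 for c in candidates}
    for (u, v), r in edges.items():
        for p, q in ((u, v), (v, u)):
            if p in inflow:
                inflow[p] += (volt[q] - volt[p]) / r
    return inflow
```

For a genome-scale network a sparse solver would replace the dense elimination used here, which is kept only to make the example dependency-free.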

Final Causal Gene Selection
[Figure: bipartite graph between cases and causal genes, with edges labeled WEIGHT]
- A putative causal gene explains a disease case if
  - its corresponding tag locus has a copy number alteration
  - its affected target genes (i.e., genes sending a significant amount of current to the causal gene) are differentially expressed in the disease case
- Find a smallest set of genes covering (almost) all cases at least k times: minimum weighted multi-set cover
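A hedged sketch of a weighted variant of the covering step. The slides do not define how weights are assigned or how the optimization was solved, so this example assumes one cost per gene and uses the standard greedy rule of picking the gene with the lowest cost per newly satisfied coverage unit. All names are hypothetical.

```python
from typing import Dict, List, Set


def greedy_weighted_multiset_cover(explains: Dict[str, Set[str]],
                                   weight: Dict[str, float],
                                   cases: Set[str], k: int,
                                   max_uncovered: int = 0) -> List[str]:
    """Greedy weighted multi-set cover (illustrative assumption).

    explains[g] is the set of disease cases explained by causal gene g;
    weight[g] is the (assumed per-gene) cost of selecting g.
    """
    need = {c: k for c in cases}
    remaining = set(explains)
    chosen: List[str] = []

    def gain(gene: str) -> int:
        return sum(1 for c in explains[gene] if need.get(c, 0) > 0)

    while sum(1 for v in need.values() if v > 0) > max_uncovered:
        usable = [g for g in sorted(remaining) if gain(g) > 0]
        if not usable:
            break                      # remaining demand cannot be met
        best = min(usable, key=lambda g: weight[g] / gain(g))
        for c in explains[best]:
            if need.get(c, 0) > 0:
                need[c] -= 1
        remaining.discard(best)
        chosen.append(best)
    return chosen


# Hypothetical genes and patients: the cheap pair B + C beats the costly A.
toy_explains = {"A": {"p1", "p2", "p3"}, "B": {"p1", "p2"}, "C": {"p3"}}
toy_weight = {"A": 3.0, "B": 1.0, "C": 1.0}
print(greedy_weighted_multiset_cover(toy_explains, toy_weight, {"p1", "p2", "p3"}, k=1))
```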

Dysregulated Pathways
- Causal path between a target and a causal gene: a maximum current path
[Figure: network of candidate genes C1-C5 and target gene D]
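The slides do not define "maximum current path" precisely. One plausible reading, sketched here purely as an assumption, is to start at the target gene and repeatedly follow the outgoing edge that carries the most current until a causal gene is reached. The input format (a mapping from directed edges to positive current values) is also hypothetical.

```python
from typing import Dict, List, Set, Tuple


def max_current_path(currents: Dict[Tuple[str, str], float],
                     source: str, sinks: Set[str]) -> List[str]:
    """Follow the largest outgoing current from `source` until a sink.

    currents[(u, v)] > 0 means current flows from u to v.
    Returns an empty list if the walk reaches a dead end.
    """
    out: Dict[str, List[Tuple[float, str]]] = {}
    for (u, v), i in currents.items():
        if i > 0:
            out.setdefault(u, []).append((i, v))
    path = [source]
    seen = {source}
    while path[-1] not in sinks:
        options = [(i, v) for i, v in out.get(path[-1], []) if v not in seen]
        if not options:
            return []
        _, nxt = max(options)          # edge carrying the most current
        path.append(nxt)
        seen.add(nxt)
    return path
```

Because current only flows from higher to lower potential, the directed current graph has no cycles, so the walk cannot loop; the `seen` set is a safeguard for inputs that are not true circuit solutions.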

Results
- 158 GBM patient samples
- 32 non-tumor control samples
- 74 target genes
- 128 causal genes
- Disease hubs: genes frequently appearing on causal paths

Selected Causal Genes

Step                    Number of genes    Overlap with GBM genes
Step B: eQTL            16056              0.56 (75)
Step C: Circuit flow    701                0.045 (10)
Step D: Set cover       128                4.7 x 10^-4 (6)

Results
- 128 causal genes from set cover (Step D)
- 701 candidate causal genes from the circuit flow algorithm (Step C)

Causal Genes
Functional analysis using DAVID:

Pathway                  P-value    Genes
Glioma                   0.008      PRKCA, EGFR, AKT1, CDKN2A, CAMK2G, TP53, RB1, PTEN
Cell cycle               0.028      MCM7, CDKN2A, CDC2, TP53, ORC5L, RB1, ATR, BUB3, CUL1
p53 signaling pathway    0.030      CDKN2A, CDC2, TP53, ATR, FAS, THBS1, PTEN
Proteasome               0.026      PSMA1, PSMC6, PSMB1, PSMC3, PSMA5, PSMA4

- The selected causal gene set includes many known cancer-implicated genes

PTEN as causal gene
[Figure: network around PTEN with cell cycle genes and glioma genes. Legend: fold change (- 0 +); kinase, TF, causal genes; TF-DNA and protein-protein interactions]

EGFR as causal and target gene
[Figure: network showing causal EGFR and target EGFR. Legend: fold change (- 0 +); kinase, TF, causal genes; TF-DNA, protein-protein, and phosphorylation interactions]

Conclusion
- A novel computational method to simultaneously identify causal genes and dysregulated pathways
  - Circuit flow algorithm
  - Multi-set cover
- Augmenting eQTL evidence with interaction information resulted in a very powerful approach
  - Uncovers potential causal genes as well as intermediate nodes on molecular pathways
- Our method can be applied to any disease system where genetic variations play a fundamental causal role

Acknowledgements
- Teresa M. Przytycka
- Stefan Wuchty
- Other group members: Dong Yeon Cho, Yang Huang, Damian Wojtowicz, Jie Zheng


Our Method
- Integrate several types of data
  - Gene expression
  - Copy number variations
  - Molecular interactions

Methods and Results
- Method
  - Model the expression change of disease genes as a function of genomic alterations
  - Translate the propagation of information from a potential causal gene to a disease gene as the flow of electric current through a network of molecular interactions
  - Multi-set cover: select the most prominent genes
- Validated our approach by testing the enrichment of selected causal genes with known GBM/glioma-related genes
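The slides do not name the statistical test behind the enrichment check. A one-sided hypergeometric test is a common choice for this kind of overlap question, and the sketch below shows that calculation under this assumption; the function name and the toy numbers are illustrative and are not taken from the talk.

```python
from math import comb


def hypergeom_enrichment_p(population: int, known: int,
                           selected: int, overlap: int) -> float:
    """P(X >= overlap) for a hypergeometric draw.

    population: number of genes in the background set
    known:      number of known disease-related genes in the background
    selected:   number of genes picked by the method
    overlap:    number of picked genes that are known disease genes
    """
    total = comb(population, selected)
    upper = min(known, selected)
    tail = sum(comb(known, x) * comb(population - known, selected - x)
               for x in range(overlap, upper + 1))
    return tail / total


# Toy numbers: 3 known genes among 10, and all 3 selected genes are known.
print(hypergeom_enrichment_p(population=10, known=3, selected=3, overlap=3))
```

A small p-value indicates that the overlap between the selected genes and the known disease genes is larger than expected from random selection of the same number of genes.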
