Identifying Causal Genes and Dysregulated Pathways in Complex Diseases
Post on 22-Feb-2016
Yoo-Ah Kim, NIH / NLM / NCBI. Identifying Causal Genes and Dysregulated Pathways in Complex Diseases. Nov. 6th, 2010.
Identifying Causal Genes and Dysregulated Pathways in Complex Diseases
Nov. 6th, 2010
Yoo-Ah Kim
NIH / NLM / NCBI

Complex Diseases
- Associated with the effects of multiple genes, as opposed to single-gene diseases
- The combination of genomic alterations may vary strongly among different patients
- Different combinations may dysregulate the same components, thus often leading to the same disease phenotype
- Difficult to study and treat
- Examples: cancer, heart disease, diabetes

Copy Number Variations
- Two copies of each gene are generally assumed to be present in a genome
- Genomic regions may be deleted or duplicated, causing copy number variations (CNVs)
- Some CNVs are associated with susceptibility or resistance to diseases such as cancer
[Figure: copy number variations in 158 glioblastoma patients]

Identifying Genomic Causes in Complex Diseases
- Identify genotypic causes in individual patients as well as dysregulated pathways
- Systems biology approach
  - Genome-wide search
  - Graph-theoretic algorithms: circuit flow, set cover
- Data: 158 glioblastoma multiforme patients

Glioblastoma multiforme (GBM)
- The most common and most aggressive type of primary brain tumor in humans

Expression as Quantitative Trait
- Genotype: copy number variations
- Phenotype: gene expression

eQTL (expression Quantitative Trait Loci) Analysis
- We assume that the genetic variation is the cause and the expression change is the effect, but we don't know the molecular pathways behind the relation
[Figure labels: putative target gene; putative causal gene/loci]

Method Outline
- A. Target gene selection (input: gene expression)
- B. eQTL: find associations between expression and copy number
- C. Circuit flow algorithm (input: molecular interactions; output: candidate causal genes)
- D. Causal gene selection: weighted multi-set cover
[Figure: pipeline diagram with panels A-D linking cases, target genes g1..gm, tag loci s1..sn, and causal genes through an interaction network (TF-DNA, phosphorylation event, protein-protein) with current flowing between + and - terminals]

Target Gene Selection
- Select a representative set of disease genes
- Filter differentially expressed genes for each case
- Multi-set cover
[Figure: gene expression of Gene 1, Gene 2, Gene 3, ... in controls vs. disease cases]

Target Gene Selection (Continued)
- Minimum multi-set cover
  - A gene covers a particular disease case if the gene is differentially expressed in that case
  - Find a smallest set of genes that covers (almost) all cases at least k times
- Selected 74 target genes
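
The minimum multi-set cover step can be illustrated with a greedy heuristic. This is only a sketch under stated assumptions: the slides do not say how the cover is computed, and the function name `greedy_multiset_cover`, the `covers` mapping (gene to the cases in which it is differentially expressed), and the `max_uncovered` parameter (a stand-in for "almost all") are hypothetical.

```python
from typing import Dict, List, Set


def greedy_multiset_cover(covers: Dict[str, Set[str]], cases: Set[str],
                          k: int, max_uncovered: int = 0) -> List[str]:
    """Greedily pick genes until all but `max_uncovered` cases are covered k times.

    Greedy selection is an assumed heuristic, not necessarily the authors' solver.
    """
    need = {c: k for c in cases}      # remaining coverage demand per case
    remaining = dict(covers)          # genes not selected yet
    chosen: List[str] = []

    def gain(gene: str) -> int:
        # number of cases whose unmet demand this gene would reduce
        return sum(1 for c in remaining[gene] if need.get(c, 0) > 0)

    while sum(1 for n in need.values() if n > 0) > max_uncovered and remaining:
        # sorted() makes tie-breaking deterministic (alphabetical)
        best = max(sorted(remaining), key=gain)
        if gain(best) == 0:
            break                     # no remaining gene helps any unmet case
        for c in remaining.pop(best):
            if need.get(c, 0) > 0:
                need[c] -= 1
        chosen.append(best)
    return chosen


# Toy data (made up): five genes, seven cases.
toy_covers = {
    "A": {"case1", "case2", "case3"},
    "B": {"case3", "case4", "case5"},
    "C": {"case5", "case6", "case7"},
    "D": {"case1", "case4", "case7"},
    "E": {"case2", "case6"},
}
toy_cases = {f"case{i}" for i in range(1, 8)}
print(greedy_multiset_cover(toy_covers, toy_cases, k=1))  # ['A', 'C', 'B']
```

Raising `k` asks for redundancy: with `k=2` on the toy data every gene is needed, because each case is covered by exactly two genes.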
[Figure: example of genes A-E covering disease cases case1-case7]

Associations between the expression of target genes and copy number variations of genomic loci
- eQTL: linear regression for every pair of tag locus and target gene
[Figure: target genes × cases (expression) and tag loci × cases (copy number)]
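
The pairwise regression can be sketched as follows. The slides do not state how significance is assessed, so the cutoff on the absolute Pearson correlation (`min_abs_r`) is an assumption made for illustration, as are the function names and the toy data.

```python
from math import sqrt
from typing import Dict, List, Tuple


def simple_regression(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Least-squares fit y = a + b*x; returns (slope b, Pearson r)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0, 0.0   # constant vector: no association can be estimated
    return sxy / sxx, sxy / sqrt(sxx * syy)


def eqtl_scan(copy_number: Dict[str, List[float]],
              expression: Dict[str, List[float]],
              min_abs_r: float = 0.8) -> List[Tuple[str, str, float, float]]:
    """Regress every target gene on every tag locus across cases.

    Returns (locus, gene, slope, r) for pairs with |r| >= min_abs_r.
    The |r| cutoff is a placeholder for whatever test the authors used.
    """
    hits = []
    for locus, cn in copy_number.items():
        for gene, ex in expression.items():
            slope, r = simple_regression(cn, ex)
            if abs(r) >= min_abs_r:
                hits.append((locus, gene, slope, r))
    return hits


# Toy data (made up): four cases, two tag loci, one target gene.
toy_cn = {"s1": [1.0, 2.0, 3.0, 4.0], "s2": [2.0, 2.0, 2.0, 2.0]}
toy_expr = {"g1": [0.5, 1.0, 1.5, 2.0]}
for hit in eqtl_scan(toy_cn, toy_expr):
    print(hit)
```

Only the pair (s1, g1) is reported here: s2 has constant copy number, so no association with it can be estimated.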

Finding Candidate Causal Genes
[Figure builds: genotypic variations and target genes (link unknown, marked "?"); candidate genes C1-C5 added; interaction network around gene D added; current flow (+/-) added with an example edge (u, v)]
- Candidate genes: C1, ..., C5
- Interaction network: protein-protein interactions, phosphorylation events, transcription factor interactions
- Current flow through the interaction network
  - Resistance(u, v) is set to be inversely proportional to (|corr(expr(u), expr(D))| + |corr(expr(v), expr(D))|) / 2
- Compute the amount of current entering each causal gene by solving a system of linear equations
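
The current computation can be sketched as a small resistor-network problem. Several details are assumptions rather than facts from the slides: the target gene is held at potential 1, every candidate gene is grounded at 0, the proportionality constant of the resistance is 1, and the network has no self-loops. All function names are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, List


def edge_conductance(corr_u_d: float, corr_v_d: float) -> float:
    """Conductance = 1/resistance, taken equal to the mean absolute correlation."""
    return (abs(corr_u_d) + abs(corr_v_d)) / 2


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[piv][col]) < 1e-12:
            raise ValueError("singular system: a node is cut off from all terminals")
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def current_into_candidates(conductance: Dict[FrozenSet[str], float],
                            target: str,
                            candidates: Iterable[str]) -> Dict[str, float]:
    """Current entering each candidate gene, from Kirchhoff's current law."""
    nodes = sorted({n for e in conductance for n in e})
    fixed = {target: 1.0, **{c: 0.0 for c in candidates}}   # assumed terminals
    free = [n for n in nodes if n not in fixed]
    idx = {n: i for i, n in enumerate(free)}
    a = [[0.0] * len(free) for _ in free]
    b = [0.0] * len(free)
    for e, g in conductance.items():
        u, v = tuple(e)
        for p, q in ((u, v), (v, u)):
            if p in idx:                  # balance equation of a free node
                a[idx[p]][idx[p]] += g
                if q in idx:
                    a[idx[p]][idx[q]] -= g
                else:
                    b[idx[p]] += g * fixed[q]
    volt = dict(fixed)
    if free:
        volt.update(zip(free, solve_linear(a, b)))
    inflow = {c: 0.0 for c in fixed if c != target}
    for e, g in conductance.items():
        u, v = tuple(e)
        for p, q in ((u, v), (v, u)):
            if p in inflow:
                inflow[p] += g * (volt[q] - volt[p])   # Ohm's law on edge q -> p
    return inflow


# Toy network (made up): D - u - C1 and D - v - C2.
toy_net = {
    frozenset({"D", "u"}): 1.0, frozenset({"u", "C1"}): 1.0,
    frozenset({"D", "v"}): 0.5, frozenset({"v", "C2"}): 0.5,
}
print(current_into_candidates(toy_net, "D", ["C1", "C2"]))
```

In the toy network the two branches are independent series circuits, so C1 receives twice the current of C2, reflecting the stronger correlations assumed along its branch.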

Final Causal Gene Selection
[Figure: cases vs. causal genes; the final build adds the label WEIGHT]
- A putative causal gene explains a disease case if
  - its corresponding tag locus has a copy number alteration, and
  - its affected target genes (i.e., genes sending a significant amount of current to the causal gene) are differentially expressed in the disease case
- Find a smallest set of genes covering (almost) all cases at least k times: minimum weighted multi-set cover
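
A sketch of the "explains" relation and of a weighted cover follows. The slides do not define the weights or the solver, so two assumptions are made here: one differentially expressed affected target is enough for a gene to explain a case, and each gene carries a cost that a greedy cost-per-gain rule tries to keep low. Names such as `explained_cases` and `weighted_multiset_cover` are hypothetical.

```python
from typing import Dict, List, Set


def explained_cases(altered_cases: Set[str], affected_targets: Set[str],
                    diff_expressed: Dict[str, Set[str]]) -> Set[str]:
    """Cases explained by one putative causal gene.

    altered_cases: cases with a copy number alteration at the gene's tag locus.
    diff_expressed: case -> target genes differentially expressed in that case.
    Assumption: a single differentially expressed affected target suffices.
    """
    return {c for c in altered_cases
            if affected_targets & diff_expressed.get(c, set())}


def weighted_multiset_cover(explains: Dict[str, Set[str]], cost: Dict[str, float],
                            cases: Set[str], k: int) -> List[str]:
    """Greedy heuristic: repeatedly take the gene with the lowest cost per unit
    of unmet demand it satisfies (an assumed rule, not the authors' solver)."""
    need = {c: k for c in cases}
    remaining = set(explains)
    chosen: List[str] = []

    def gain(gene: str) -> int:
        return sum(1 for c in explains[gene] if need.get(c, 0) > 0)

    while any(n > 0 for n in need.values()):
        useful = [g for g in sorted(remaining) if gain(g) > 0]
        if not useful:
            break                      # remaining demand cannot be met
        best = min(useful, key=lambda g: cost[g] / gain(g))
        for c in explains[best]:
            if need.get(c, 0) > 0:
                need[c] -= 1
        remaining.remove(best)
        chosen.append(best)
    return chosen


# Toy data (made up).
toy_explains = {"G1": {"p1", "p2", "p3"}, "G2": {"p1", "p2"}, "G3": {"p3"}}
toy_cost = {"G1": 3.0, "G2": 1.0, "G3": 1.0}
print(weighted_multiset_cover(toy_explains, toy_cost, {"p1", "p2", "p3"}, k=1))
# ['G2', 'G3']
```

With equal costs the rule reduces to the unweighted greedy cover; in the toy data the expensive gene G1 is skipped because two cheap genes explain the same cases.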

Dysregulated Pathways
- Causal path between a target gene and a causal gene: a maximum current path
[Figure: network with candidate genes C1-C5 and gene D]
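
The slides do not define "maximum current path" precisely. One plausible reading, used in this sketch, is the path whose weakest edge carries the most current (a widest-path search with a Dijkstra-style heap). The `current` mapping of directed edges to current values and the function name are assumptions.

```python
import heapq
from typing import Dict, List, Optional, Tuple


def max_current_path(current: Dict[Tuple[str, str], float],
                     source: str, goal: str) -> Optional[Tuple[List[str], float]]:
    """Path from source to goal maximizing the smallest edge current on it.

    current maps a directed edge (u, v) to the current flowing from u to v.
    Returns (path, bottleneck current), or None if goal is unreachable.
    """
    adj: Dict[str, List[Tuple[str, float]]] = {}
    for (u, v), i in current.items():
        if i > 0:
            adj.setdefault(u, []).append((v, i))
    best = {source: float("inf")}     # best bottleneck found so far per node
    prev: Dict[str, str] = {}
    heap = [(-best[source], source)]  # max-heap via negated bottleneck
    while heap:
        neg_w, u = heapq.heappop(heap)
        w = -neg_w
        if w < best.get(u, 0.0):
            continue                  # stale heap entry
        if u == goal:
            break
        for v, i in adj.get(u, []):
            cand = min(w, i)
            if cand > best.get(v, 0.0):
                best[v] = cand
                prev[v] = u
                heapq.heappush(heap, (-cand, v))
    if goal not in best:
        return None
    path = [goal]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1], best[goal]


# Toy currents (made up) from a target gene D to a causal gene C1.
toy_current = {
    ("D", "a"): 0.6, ("D", "b"): 0.4,
    ("a", "C1"): 0.1, ("a", "b"): 0.5,
    ("b", "C1"): 0.9,
}
print(max_current_path(toy_current, "D", "C1"))  # (['D', 'a', 'b', 'C1'], 0.5)
```

A simpler alternative would follow the largest outgoing current at each step; the two readings can give different paths, so the choice should be checked against the authors' paper.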

Results
- 158 GBM patient samples
- 32 non-tumor control samples
- 74 target genes
- 128 causal genes
- Disease hubs: genes frequently appearing on causal paths

Selected Causal Genes
Step                    Number of genes   Overlap with GBM genes
Step B: eQTL            16056             0.56 (75)
Step C: Circuit flow    701               0.045 (10)
Step D: Set cover       128               4.7 × 10^-4 (6)

Results
- 128 causal genes from set cover (Step D)
- 701 candidate causal genes from circuit flow algorithm (Step C)

Causal Genes
Functional analysis using DAVID
Term                    P-value   Genes
Glioma                  0.008     PRKCA, EGFR, AKT1, CDKN2A, CAMK2G, TP53, RB1, PTEN
Cell cycle              0.028     MCM7, CDKN2A, CDC2, TP53, ORC5L, RB1, ATR, BUB3, CUL1
p53 signaling pathway   0.030     CDKN2A, CDC2, TP53, ATR, FAS, THBS1, PTEN
Proteasome              0.026     PSMA1, PSMC6, PSMB1, PSMC3, PSMA5, PSMA4
- The selected causal gene set includes many known cancer-implicated genes
[Slide footer: BSOSC Review, November 2008]

PTEN as causal gene
[Figure: causal paths for PTEN. Legend: fold change (-, 0, +); edge types TF-DNA and protein-protein; node types kinase, TF, causal genes. Targets include cell cycle genes and glioma genes.]

EGFR as causal and target gene
[Figure: causal paths for EGFR. Legend: fold change (-, 0, +); node types kinase, TF, causal genes; edge types TF-DNA, protein-protein, phosphorylation. Labels: causal EGFR, target EGFR.]

Conclusion
- A novel computational method to simultaneously identify causal genes and dysregulated pathways
  - Circuit flow algorithm
  - Multi-set cover
- Augmentation of eQTL evidence with interaction information resulted in a very powerful approach
  - Uncovers potential causal genes as well as intermediate nodes on molecular pathways
- Our method can be applied to any disease system where genetic variations play a fundamental causal role

Acknowledgements
- Teresa M. Przytycka
- Stefan Wuchty
- Other group members: Dong Yeon Cho, Yang Huang, Damian Wojtowicz, Jie Zheng
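
EGFR as causal and target gene: Causal Paths
[Figure: causal paths for EGFR. Legend: fold change (-, 0, +); node types kinase, TF, causal genes; edge types TF-DNA, protein-protein, phosphorylation. Labels: causal EGFR, target EGFR.]

PTEN as causal gene: Causal Paths
[Figure: causal paths for PTEN. Legend: fold change (-, 0, +); edge types TF-DNA and protein-protein; node types kinase, TF, causal genes.]

Our Method
- Integrate several types of data
  - Gene expression
  - Copy number variations
  - Molecular interactions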

Methods and Results
- Method
  - Model the expression change of disease genes as a function of genomic alterations
  - Translate the propagation of information from a potential causal gene to a disease gene into the flow of electric current through a network of molecular interactions
  - Multi-set cover: select the most prominent genes
- Validated our approach by testing the enrichment of selected causal genes with known GBM/glioma-related genes
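
The slides do not say which enrichment test was used. A common choice is the one-sided hypergeometric test, sketched here with made-up numbers; the function name and all parameter names are assumptions.

```python
from math import comb


def hypergeom_enrichment_p(universe: int, known: int,
                           selected: int, overlap: int) -> float:
    """One-sided hypergeometric p-value P(X >= overlap).

    universe: number of genes considered overall
    known:    number of known disease genes in the universe
    selected: number of genes selected by the method
    overlap:  number of selected genes that are known disease genes
    """
    upper = min(known, selected)
    tail = sum(comb(known, i) * comb(universe - known, selected - i)
               for i in range(overlap, upper + 1))
    return tail / comb(universe, selected)


# Made-up example: 1000 genes, 50 known disease genes, 20 selected, 5 overlapping.
print(hypergeom_enrichment_p(1000, 50, 20, 5))
```

A small p-value means that an overlap at least this large would rarely arise if the selected genes were drawn at random from the universe.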