Pathway analysis for genomics
Gary Bader Oct.23.2012 – FGED Toronto
[Figure: enrichment map of gene sets, with a zoom of the CNS development cluster. Cluster labels include microtubule cytoskeleton, cell projection and cell motility, cell proliferation, glycosylation, adhesion, regulation of GTPase, GTPase/Ras signaling, kinase activity/regulation, and CNS development. Legend: node type (gene set): enriched in deletions (FDR 0% to 12.5%), known disease genes (ASD, ID, or both), enriched only in disease genes; edge type (gene-set overlap): from disease genes to enriched gene sets, between gene sets enriched in deletions, between sets enriched in deletions and in disease genes or between disease sets only.]
Pinto et al. Functional impact of global rare copy number variation in autism spectrum disorders. Nature. 2010 Jun 9.
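The edges in the enrichment map figure represent gene-set overlap. As a rough illustration of how such edges could be computed, the sketch below scores the overlap between pairs of gene sets with the Jaccard index and the overlap coefficient, then keeps pairs above a cutoff. The gene identifiers, set names, and the 0.5 threshold are hypothetical placeholders, not values taken from the figure or from Pinto et al.

```python
from itertools import combinations


def jaccard(a, b):
    """Size of the intersection divided by the size of the union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def overlap_coefficient(a, b):
    """Size of the intersection divided by the size of the smaller set."""
    a, b = set(a), set(b)
    smaller = min(len(a), len(b))
    return len(a & b) / smaller if smaller else 0.0


def build_edges(gene_sets, threshold=0.5, score=overlap_coefficient):
    """Return (set_1, set_2, score) for every pair scoring at or above threshold."""
    edges = []
    for first, second in combinations(sorted(gene_sets), 2):
        value = score(gene_sets[first], gene_sets[second])
        if value >= threshold:
            edges.append((first, second, value))
    return edges


# Hypothetical gene sets with placeholder gene identifiers.
example_sets = {
    "set_a": {"g1", "g2", "g3", "g4"},
    "set_b": {"g1", "g2", "g5"},
    "set_c": {"g6", "g7"},
}

example_edges = build_edges(example_sets)
for first, second, value in example_edges:
    print(f"{first} -- {second}: overlap {value:.2f}")
```

The overlap coefficient is more forgiving than the Jaccard index when a small gene set is largely contained in a big one, which is one reason to offer both as options.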
Correlation to Causation
• GWAS: find genetic markers correlated with disease
– powerful approach, but:
– genomics reduces statistical power (more multiple testing correction with more SNPs)
– rare variants = more samples needed
• Associate pathways to increase power
– Fewer pathways, organize many rare variants (damaging the system causes the disease)
• Use pathway knowledge to identify potential disease causes
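The power argument on this slide can be made concrete with a Bonferroni correction, used here only as one simple example of a multiple testing correction. The test counts below (one million SNPs, two thousand pathways) are hypothetical round numbers chosen for illustration, not figures from the talk.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


ALPHA = 0.05
N_SNPS = 1_000_000   # hypothetical number of SNP-level tests
N_PATHWAYS = 2_000   # hypothetical number of pathway-level tests

snp_threshold = bonferroni_threshold(ALPHA, N_SNPS)
pathway_threshold = bonferroni_threshold(ALPHA, N_PATHWAYS)

print(f"per-SNP threshold:     {snp_threshold:.1e}")
print(f"per-pathway threshold: {pathway_threshold:.1e}")
print(f"ratio: {pathway_threshold / snp_threshold:.0f}x less stringent")
```

Under these assumed counts, each pathway-level test faces a threshold several hundred times less stringent than each SNP-level test, which is the sense in which testing fewer units increases power.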
The Systems Biology Pyramid
Cary, Bader, Sander, FEBS Letters 579 (2005) 1815-20
Chris Sander, MSKCC
Cytoscape cPath2
Pathway Commons BioPAX
http://pathguide.org
Vuk Pavlovic Sylva Donaldson
>320 Pathway Databases!
• Varied formats, representation, coverage
• Pathway data extremely difficult to combine and use
BioPAX Pathway Language
• Represent:
– Metabolic pathways
– Signaling pathways
– Protein-protein, molecular interactions
– Gene regulatory pathways
– Genetic interactions
• Community effort: pathway databases distribute pathway information in standard format
www.biopax.org
Emek Demir
SBGN.org
BioCarta
BioPAX
Aim: Convenient Access to Pathway Information
Facilitate creation and communication of pathway data
Aggregate pathway data in the public domain
Provide easy access for pathway analysis
Long term: Converge to integrated cell map
http://pathwaycommons.org
Pathway Commons: cPath2
• http://www.pathwaycommons.org/pc2/
Emek Demir, Igor Rodchenkov, Chris Sander Ozgun Babur, Arman Aksoy, Onur Sumer, Ethan Cerami, Ben Gross
Network visualization and analysis
UCSD, ISB, Agilent, MSKCC, Pasteur, UCSF, Unilever, UToronto, U Texas
http://cytoscape.org
Pathway comparison, Literature mining, Gene Ontology analysis, Active modules, Complex detection, Network motif search
Cytoscape 3
• Complete re-architecture: OSGi – everything is an app
• Enables future features:
– More stable and powerful APIs
– Scripting, macros, recordable history, better undo/redo
– Command line mode, good for use on compute clusters
– Interactive control from other scripting languages e.g. R
• Fixing bugs and porting plugins
• 3.0 beta now available
– Mirror functionality in 2.8
– Encourage plugin to app porting
– http://www.cytoscape.org/cy3.html
http://cytoscapeweb.cytoscape.org/
Christian Tannus-Lopes, Max Franz, Yue Dong
Compound Nodes Onur Sumer Ugur Dogrusoz Bilkent, Ankara
http://cytoscapeweb.cytoscape.org/demos/compound
Cytoscape.js: HTML5 – iPad cytoscape.github.com/cytoscape.js/
Gene Function Prediction
http://www.genemania.org
Quaid Morris (Donnelly), Sara Mostafavi Rashad Badrawi, Ovi Comes, Sylva Donaldson, Max Franz, Christian Lopes, Farzana Kazi, Jason Montojo, Harold Rodriguez, Khalid Zuberi
• Guilt-by-association principle
• Biological networks are combined intelligently to optimize prediction accuracy
• Algorithm is faster and more accurate than its peers
Mostafavi S et al. Genome Biol. 2008;9 Suppl 1:S4
Warde-Farley D et al. Nucleic Acids Res. 2010 Jul;38:W214-20.
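To illustrate the guilt-by-association principle named above, here is a minimal sketch that combines two networks with fixed weights and then scores each candidate gene by the fraction of its connection strength that goes to known (seed) genes. This is a simple neighbor-voting scheme chosen for illustration; it is not the GeneMANIA algorithm, and the gene identifiers, edge strengths, and equal network weights are all hypothetical.

```python
from collections import defaultdict


def combine_networks(networks, weights):
    """Weighted sum of edge strengths across networks (edges are undirected)."""
    combined = defaultdict(float)
    for network, weight in zip(networks, weights):
        for (a, b), strength in network.items():
            combined[frozenset((a, b))] += weight * strength
    return dict(combined)


def neighbor_vote(network, seeds):
    """Score non-seed genes by the share of their edge strength linked to seeds."""
    seeds = set(seeds)
    to_seeds = defaultdict(float)
    total = defaultdict(float)
    for pair, strength in network.items():
        if len(pair) != 2:  # ignore self-loops
            continue
        a, b = tuple(pair)
        for gene, other in ((a, b), (b, a)):
            total[gene] += strength
            if other in seeds:
                to_seeds[gene] += strength
    return {
        gene: to_seeds[gene] / total[gene]
        for gene in total
        if gene not in seeds and total[gene] > 0
    }


# Hypothetical networks with placeholder gene identifiers.
coexpression = {("g1", "g2"): 1.0, ("g2", "g3"): 0.5, ("g3", "g4"): 1.0}
physical = {("g1", "g3"): 1.0, ("g4", "g5"): 1.0}

combined = combine_networks([coexpression, physical], weights=[0.5, 0.5])
scores = neighbor_vote(combined, seeds={"g1", "g2"})

for gene, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{gene}: {score:.2f}")
```

In this toy example the gene most strongly connected to the seed genes ranks first. Choosing the network weights to optimize prediction accuracy, as the slide describes, is the harder part and is left out of this sketch.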
The Factoid Project
• Publishing in science
– Highly inefficient
– Outdated technology, difficult to search and compute
• http://www.elseviergrandchallenge.com/
– Winner: http://reflect.ws/
• Pathway and network information database curation
– Highly inefficient
• The factoid project
Max Franz, Igor Rodchenkov, Ozgun Babur, Emek Demir, Chris Sander
Pathway and Network Analysis
1. Gene set: pathway enrichment analysis
2. Network: network regions (modules), regulation
3. Process model: classical pathways
4. Simulation model: detailed models
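For the first item in the list above, gene-set enrichment, a common approach is an over-representation test followed by a false discovery rate correction. The sketch below uses a one-sided hypergeometric test and the Benjamini-Hochberg procedure as one standard pairing; the talk does not say which test was used, and the background, gene list, and gene sets are hypothetical placeholders.

```python
from math import comb


def hypergeom_upper_tail(k, n_list, n_set, n_background):
    """P(X >= k): chance of k or more gene-list genes landing in the gene set."""
    max_k = min(n_list, n_set)
    total = comb(n_background, n_list)
    tail = sum(
        comb(n_set, i) * comb(n_background - n_set, n_list - i)
        for i in range(k, max_k + 1)
    )
    return tail / total


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def enrichment(gene_list, gene_sets, background):
    """Return {set name: (overlap size, p-value, BH-adjusted p-value)}."""
    background = set(background)
    gene_list = set(gene_list) & background
    names = sorted(gene_sets)
    overlaps, p_values = [], []
    for name in names:
        members = set(gene_sets[name]) & background
        k = len(gene_list & members)
        overlaps.append(k)
        p_values.append(
            hypergeom_upper_tail(k, len(gene_list), len(members), len(background))
        )
    adjusted = benjamini_hochberg(p_values)
    return {n: (k, p, q) for n, k, p, q in zip(names, overlaps, p_values, adjusted)}


# Hypothetical data: 20 background genes, 5 genes of interest, 2 gene sets.
background = {f"g{i}" for i in range(20)}
gene_list = {"g0", "g1", "g2", "g3", "g4"}
gene_sets = {
    "set_x": {"g0", "g1", "g2", "g3", "g10"},
    "set_y": {"g4", "g11", "g12", "g13", "g14"},
}

results = enrichment(gene_list, gene_sets, background)
for name, (k, p, q) in results.items():
    print(f"{name}: overlap={k}, p={p:.4g}, q={q:.4g}")
```

The choice of background matters: restricting it to the genes that could have been detected in the experiment, rather than the whole genome, usually gives more honest p-values.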
[Figure axis labels: Mechanistic Understanding; Genome Coverage and use]
Increase Coverage and Depth
Cellular Process Representation
• Gene set
• Network
• Process model
• Simulation model
• Analysis methods need to keep up
[Figure: axes labeled Coverage and Depth; labels: Data and analysis methods, Present, Future; example parameter Km = 3.0 × 10^-4]
Acknowledgements
Bader Lab
Domain Interaction Team: Chris Tan, Shirley Hui, Shobhit Jain, Brian Law, Jüri Reimand
Former: David Gfeller, Xiaojian Shao
Funding
http://baderlab.org
www.GeneMANIA.org Quaid Morris (Donnelly) Rashad Badrawi, Ovi Comes, Sylva Donaldson, Christian Lopes, Farzana Kazi, Jason Montojo, Sara Mostafavi, Harold Rodriguez, Khalid Zuberi
Genetic Interactions, Pathways: Anastasia Baryshnikova, Iain Wallace, Magali Michaut, Ron Ammar, Daniele Merico, Ruth Isserlin, Vuk Pavlovic, Igor Rodchenkov
Pathway Commons: Chris Sander, Ethan Cerami, Ben Gross, Emek Demir, Igor Rodchenkov, Nadia Anwar, Ozgun Babur