Enrichment Network Analysis and Visualization (ENViz) Cytoscape plugin for integrative statistical analysis and visualization of multiple sample matched data sets Anya Tsalenko Agilent Laboratories December 14, 2012


Page 1: Enrichment Network Analysis and Visualization (ENViz) Cytoscape plugin for integrative statistical analysis and visualization of multiple sample matched

Enrichment Network Analysis and Visualization (ENViz)

Cytoscape plugin for integrative statistical analysis and visualization of multiple sample matched data sets

Anya Tsalenko, Agilent Laboratories

December 14, 2012

Page 2

Why ENViz?

Many high-throughput data sets measured in the same set of samples:
- ‘omics’
- proteomics
- metabolomics

Rich databases with systematic annotations:
- GO
- pathways
- drug targets

How do we analyze these data together to gain deeper biological insight into the studied phenotype?

Page 3

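Integrated Biology

[Diagram. Labels recovered from the slide:]
• Molecules measured: proteins, metabolites, DNA / RNA, miRNA
• Measurement technologies: LC/MS, GC/MS, microarrays, target enrichment, NMR, microfluidics
• Analysis layers: Primary Analysis, Integrated Biology Informatics, Integrated Analysis, Network Biology
• Software and data: Genomic Workbench, GeneSpring, MassHunter Workstation, Genome Browser, Public Data
• Outcome: BIOLOGICAL INSIGHT! (hypothesis, experiment, model)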

Page 4

Example: breast cancer study

“miRNA-mRNA integrated analysis reveals roles for miRNAs in primary breast tumors”

Enerly et al., PLoS ONE 2011

• Cancer dataset from the Anne-Lise Børresen-Dale lab at the Norwegian Radium Hospital, Oslo

• 100 breast tumor samples with various characteristics

• Matched miRNA and mRNA data, Agilent microarrays

Page 5

Correlation of miRNA and mRNA expression, miR-150

Sorted expression of miR-150

Genes sorted by correlation to miR-150 across 100 breast cancer samples
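The ranking behind this slide can be sketched in a few lines. This is a minimal illustration, assuming Pearson correlation (the slides do not say which correlation measure is used); the function names and the toy expression values are hypothetical and are not taken from the study.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # constant profile: correlation is undefined, treated as 0 here
    return cov / sqrt(vx * vy)

def rank_by_correlation(pivot_profile, primary):
    """Return (gene, r) pairs sorted from most positive to most negative r."""
    scored = [(gene, pearson(pivot_profile, values))
              for gene, values in primary.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Toy data: 5 matched samples (hypothetical values).
mir150 = [1.0, 2.0, 3.0, 4.0, 5.0]
mrna = {
    "GENE_A": [2.0, 4.0, 6.0, 8.0, 10.0],  # rises with the miRNA
    "GENE_B": [5.0, 4.0, 3.0, 2.0, 1.0],   # falls as the miRNA rises
    "GENE_C": [1.0, 3.0, 2.0, 3.0, 1.0],   # uncorrelated with the miRNA
}
ranked = rank_by_correlation(mir150, mrna)
```

The two ends of `ranked` hold the most positively and most negatively correlated genes, which are the parts of the list that enrichment analysis examines.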

Page 6

Enrichment analysis of ranked list of genes correlated to miR-150

GO term enrichment analysis at the top of the list of genes ordered by correlation to miR-150, based on minimum hypergeometric (mHG) statistics (Eden et al., PLoS Comput Biol 2007)

mHG p-value < 1E-147

Analysis and visualization in GOrilla software: http://cbl-gorilla.cs.technion.ac.il/
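For readers unfamiliar with the statistic, here is a minimal sketch of the mHG score. For a ranked list of N genes, B of which carry the annotation, the score is the smallest hypergeometric tail probability HGT(N, B, n, b_n) over all cutoffs n, where b_n counts annotated genes among the top n. The function names are hypothetical, and the sketch computes only the score: the multiple-cutoff correction that turns it into the mHG p-value is not shown, and this is not GOrilla's implementation.

```python
from math import comb

def hypergeom_tail(N, B, n, b):
    """P(X >= b) when n items are drawn from N, of which B are annotated."""
    favorable = sum(comb(B, i) * comb(N - B, n - i)
                    for i in range(b, min(n, B) + 1))
    return favorable / comb(N, n)

def mhg_score(labels):
    """labels: 0/1 flags in ranked order (1 = gene carries the annotation).

    Returns (score, cutoff): the smallest tail probability over all
    prefixes of the ranked list, and the prefix length that achieves it.
    """
    N, B = len(labels), sum(labels)
    best_score, best_cutoff = 1.0, 0
    hits = 0
    for n, flag in enumerate(labels, start=1):
        hits += flag  # annotated genes seen in the top n
        score = hypergeom_tail(N, B, n, hits)
        if score < best_score:
            best_score, best_cutoff = score, n
    return best_score, best_cutoff

# Toy ranked list: both annotated genes sit at the very top.
ranked_flags = [1, 1, 0, 0, 0, 0]
score, cutoff = mhg_score(ranked_flags)
```

Because the cutoff is chosen by the data rather than fixed in advance, the raw score is optimistic and should not be read as a p-value.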

Page 7

Biological validation

Association between miR-19a and the cell-cycle module was substantiated as an association to proliferation.

Further validated using high-throughput transfection assays, in which transfection of miR-19a into the MCF7 cell line resulted in increased proliferation.

GO enrichment for genes correlated to miR-19a

Page 8

Generic three-matrix enrichment analysis

Two different types of measurements in the same set of samples:
• mRNA and miRNA expression (or other non-coding RNAs)
• mRNA expression and quantitative clinical phenotypes
• mRNA expression and metabolite levels
• mRNA expression and copy number

Roy Navon

Analysis is based on statistical enrichment of annotation elements in lists ranked by correlation.

Enrichment can be calculated based on any annotation, such as GO, pathways, disease ontology, or other custom categories of the primary data.

[Figure: three matrices. Primary Data: genes × samples. Pivot Data: miRNAs/other × samples. Annotation: Pathways/GO/other. Derived results: Correlations, Enrichments.]
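The three-matrix scheme can be sketched as a loop over every (pivot, annotation) pair. This is a simplified illustration under stated assumptions: Pearson correlation, only the positively correlated end of each list, and a fixed top-n cutoff with a plain hypergeometric tail standing in for the mHG statistic. All names and values are hypothetical.

```python
from math import comb, sqrt

def pearson(x, y):
    """Pearson correlation; returns 0.0 for constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy) if vx and vy else 0.0

def hypergeom_tail(N, B, n, b):
    """P(X >= b) when n items are drawn from N, of which B are annotated."""
    return sum(comb(B, i) * comb(N - B, n - i)
               for i in range(b, min(n, B) + 1)) / comb(N, n)

def enrichment_table(primary, pivot, annotation, top_n):
    """Score every annotation set against every pivot element.

    primary:    {gene: values across samples}
    pivot:      {pivot element: values across the same samples}
    annotation: {term: set of genes}
    """
    genes = list(primary)
    table = {}
    for pivot_name, profile in pivot.items():
        ranked = sorted(genes, key=lambda g: pearson(profile, primary[g]),
                        reverse=True)
        top = set(ranked[:top_n])  # genes most positively correlated to the pivot
        for term, members in annotation.items():
            measured = members & set(genes)  # ignore genes absent from the data
            overlap = len(measured & top)
            table[(pivot_name, term)] = hypergeom_tail(
                len(genes), len(measured), top_n, overlap)
    return table

# Toy matrices over 4 matched samples (hypothetical values).
primary = {"G1": [1, 2, 3, 4], "G2": [2, 4, 6, 9],
           "G3": [4, 3, 2, 1], "G4": [1, 3, 1, 3]}
pivot = {"miR_X": [1, 2, 3, 4]}
annotation = {"pathway_P": {"G1", "G2"}, "pathway_Q": {"G3", "G4"}}
table = enrichment_table(primary, pivot, annotation, top_n=2)
```

Each entry of `table` corresponds to one candidate edge between a pivot element and an annotation term.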

Page 9

ENViz: what it is

Enrichment Network Visualization (ENViz): a Cytoscape plugin for integrative statistical analysis and visualization of multiple sample-matched data sets

Page 10

Control Panel

Use the main control panel to:
• Input primary data, pivot, and annotation files
• Run analysis
• Set thresholds that control the size of the enrichment network to visualize
• Run the visualization

Separate sub-panels can be collapsed or expanded by clicking on their handles (collapsible subpanels, Bader Lab, U Toronto)

Interactive Legend:
• Graphical overview of the workflow
• Click on labeled boxes for a file prompt
• Drag and drop a file reference onto a labeled box

Page 11

Enrichment Network

Enrichment network built from mRNA and miRNA data from Enerly et al, using WikiPathway annotation.

Results are represented as a bipartite graph: nodes = pathways (yellow->red) and miRNAs (grey).

An edge represents enrichment of the pathway node in the set of genes whose expression correlates with the expression pattern of the miRNA node: red = positive correlation, blue = negative correlation.
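The bipartite structure described above can be sketched as follows. This is a hypothetical illustration, not the plugin's code: the record layout, the `p_threshold` parameter, and the example values are all assumptions made for the sketch.

```python
def build_enrichment_network(results, p_threshold):
    """Build a bipartite network from enrichment results.

    results: iterable of (pivot, pathway, p_value, direction) tuples, where
    direction is "+" for positive and "-" for negative correlation.
    Only results with p_value <= p_threshold become edges.
    """
    pivots, pathways, edges = set(), set(), []
    for pivot, pathway, p_value, direction in results:
        if p_value > p_threshold:
            continue  # a stricter threshold yields a smaller network
        pivots.add(pivot)
        pathways.add(pathway)
        edges.append({
            "source": pivot,
            "target": pathway,
            "p_value": p_value,
            "color": "red" if direction == "+" else "blue",
        })
    return {
        "pivot_nodes": sorted(pivots),
        "pathway_nodes": sorted(pathways),
        "edges": edges,
    }

# Hypothetical results: names and p-values are made up for illustration.
results = [
    ("miR_X", "pathway_P", 1e-8, "+"),
    ("miR_X", "pathway_Q", 0.2, "-"),
    ("miR_Y", "pathway_P", 1e-5, "-"),
]
network = build_enrichment_network(results, p_threshold=1e-3)
```

Nodes appear only when at least one of their edges passes the threshold, which is how a threshold controls the size of the network.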

Page 12

Enrichment Network Zoom:

• Zoom in to see details around selected nodes and edges
• See the zoomed-in network in the context of the whole network on the bottom left

Page 13

Pathway visualization in WikiPathways

• Clicking on a selected edge loads and shows the corresponding WikiPathway
• All gene nodes in the mRNA processing pathway that map to primary data elements are color coded (blue -> red) for correlation score between the primary data element (mRNA) and the pivot data element for the clicked edge (hsa-miR-92a)
• Thick borders and high opacity show genes above the correlation threshold that were included in the gene set used for enrichment analysis.

Page 14

Tiling Pathway views

• Double-click on a Pathway Node to load multiple WikiPathways, one for each Edge connected to the Node and each colored by correlation with that Edge's pivot datum, up to a user-configurable limit

• Network views are tiled in a small multiples view that accentuates contrasts between correlations for different pivot data.

Page 15

Gene Ontology enrichment and visualization

• Enrichment network built from Enerly et al. mRNA and miRNA data, and Gene Ontology annotation.

• Left = bipartite graph for GO terms (yellow -> red scale) and miRNAs (grey)
• An edge is enrichment of the GO term in the set of genes most correlated with the miRNA
• Right = GO summary network for GO terms in the left enrichment network. Each GO node is color-coded by cumulative enrichment score for its set of pivot nodes
• Parent terms are added to complete the GO hierarchy view
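The two steps behind the GO summary network, scoring each enriched term across its pivot nodes and adding parent terms, can be sketched as below. The slides do not define the cumulative enrichment score, so summing -log10(p) over a term's edges is an assumption made only for this sketch. The term names and the parent map are hypothetical placeholders, not real GO identifiers.

```python
from math import log10

def add_ancestors(terms, parents):
    """Return the given terms plus every ancestor reachable via `parents`.

    parents: {term: [direct parent terms]}
    """
    closed = set(terms)
    stack = list(terms)
    while stack:
        term = stack.pop()
        for parent in parents.get(term, ()):
            if parent not in closed:
                closed.add(parent)
                stack.append(parent)
    return closed

def cumulative_scores(edge_p_values):
    """Sum -log10(p) over each term's edges (assumed definition, see lead-in).

    edge_p_values: {term: [p-values of its edges to pivot nodes]}
    """
    return {term: sum(-log10(p) for p in p_values)
            for term, p_values in edge_p_values.items()}

# Hypothetical three-level hierarchy and one enriched term with two edges.
parents = {
    "leaf_term": ["mid_term"],
    "other_leaf": ["mid_term"],
    "mid_term": ["root_term"],
}
enriched = {"leaf_term": [1e-3, 1e-2]}

summary_nodes = add_ancestors(enriched, parents)
scores = cumulative_scores(enriched)
```

Ancestors added this way carry no score of their own; they only connect the enriched terms into a single hierarchy view.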

Page 16

miR-150-oriented GO terms

• Double-click on a pivot node in the enrichment network to show GO terms in the GO Summary network that have significant enrichment values for the selected pivot.

• GO Summary network on the right is color-coded by enrichment of genes correlated to miR-150

Page 17

Summary: key features of ENViz

• Enrichment of annotation elements among primary data most correlated to secondary (pivot) data across a set of samples, computed for each pivot and each annotation node

• Representation of results as bi-partite graph (network)

• Pathway and GO enrichment analysis with customized visualization

• Zoom in on results in the context of WikiPathways
• Interactive and intuitive data loading and analysis
• Power of network analysis in Cytoscape

Page 18

Next steps

• Beta release for collaborators

email: [email protected]

• Working on performance, completeness, robustness for Cytoscape plugin release

• Extend support for other organisms beyond Homo sapiens, Mus musculus, Mycobacterium tuberculosis

• Extend the range of database id mappings

• Possible future features: heatmap view, sample grouping, more built-in annotation types (TFs, disease ontologies)

Page 19

Acknowledgements

• Agilent Team
  – Allan Kuchinsky, Roy Navon, Zohar Yakhini, Michael Creech

• Technion
  – Israel Steinfeld

• Collaborators
  – Norwegian Radium Hospital, Oslo: Espen Enerly, Kristine Kleivi, Vessela N. Kristensen, Anne-Lise Børresen-Dale
  – UCSF/Gladstone: Alex Pico, Nathan Salomonis, Kristina Hanspers, Bruce Conklin, Scooter Morris
  – Maastricht University: Thomas Kelder, Martijn van Iersel, Chris Evelo
  – Cytoscape core developers and PIs: Trey Ideker, Chris Sander, Gary Bader, Benno Schwikowski, Mike Smoot, Peng Liang, Kei Ono, Leroy Hood, Ben Gross, Ethan Cerami