linking proteomic and transcriptional data through the interactome and epigenome reveals a map of...

28
Linking Proteomic and Transcriptional Data through the Interactome and Epigenome Reveals a Map of Oncogene-induced Signaling Anthony Gitter Cancer Bioinformatics (BMI 826/CS 838) April 16, 2015 All figures and quotes from Huang2013 unless noted otherwise

Upload: virgil-cooper

Post on 19-Dec-2015

228 views

Category:

Documents


3 download

TRANSCRIPT

Page 1: Linking Proteomic and Transcriptional Data through the Interactome and Epigenome Reveals a Map of Oncogene- induced Signaling Anthony Gitter Cancer Bioinformatics

Linking Proteomic and Transcriptional Data

through the Interactome and Epigenome Reveals a Map of Oncogene-induced

Signaling

Anthony GitterCancer Bioinformatics (BMI 826/CS 838)

April 16, 2015

All figures and quotes from Huang2013 unless noted otherwise

Page 2: Linking Proteomic and Transcriptional Data through the Interactome and Epigenome Reveals a Map of Oncogene- induced Signaling Anthony Gitter Cancer Bioinformatics

Prize-Collecting Steiner Tree (PCST)• Pathways needed to determine how heterogeneous

drivers lead to cancer-related phenotypes• PCST focuses on learning pathway structure

Page 3: Linking Proteomic and Transcriptional Data through the Interactome and Epigenome Reveals a Map of Oncogene- induced Signaling Anthony Gitter Cancer Bioinformatics

Pathway structure important for therapeutics

Vogelstein2013

Kinase inhibitors targetspecific pathway members

Page 4: Linking Proteomic and Transcriptional Data through the Interactome and Epigenome Reveals a Map of Oncogene- induced Signaling Anthony Gitter Cancer Bioinformatics

Glioblastoma model

• Phosphorylation changes (88 proteins)• DNase hypersensitivity changes (~13000 regions)• Gene expression changes (1623 genes)

U87H – High EGFRvIII U87DK – Dead Kinase

Page 5: Linking Proteomic and Transcriptional Data through the Interactome and Epigenome Reveals a Map of Oncogene- induced Signaling Anthony Gitter Cancer Bioinformatics

Pathways for integrating data• Low correlation between

phosphorylation and transcription• Provide complementary

information

Page 6: Linking Proteomic and Transcriptional Data through the Interactome and Epigenome Reveals a Map of Oncogene- induced Signaling Anthony Gitter Cancer Bioinformatics

Two stages: define prizes, solve PCST

OmicsIntegrator

Page 7: Linking Proteomic and Transcriptional Data through the Interactome and Epigenome Reveals a Map of Oncogene- induced Signaling Anthony Gitter Cancer Bioinformatics

Stage 1: phosphorylation prizes• Prize based on phosphorylation fold change• Proteins with prizes called terminals• p: protein• prizeph: phosphorylation prize

𝑝𝑟𝑖𝑧𝑒 h𝑝 (𝑝 )=|log 2 h h𝑝 𝑜𝑠𝑝 𝑜𝑈 87𝐻 (𝑝 )h h𝑝 𝑜𝑠𝑝 𝑜𝑈 87𝐷𝐾 (𝑝 )|

Page 8: Linking Proteomic and Transcriptional Data through the Interactome and Epigenome Reveals a Map of Oncogene- induced Signaling Anthony Gitter Cancer Bioinformatics

Stage 1: TF prizes

• Find differentially expressed genes• Find differential DNase peaks• Create gene X TF score matrix

DNA binding motif affinity for TF τ at score for peak i in condition c proximal to gene g

Peaks mapped to gene g in U87DK cellsPeaks mapped to gene g in U87H cells

Page 9: Linking Proteomic and Transcriptional Data through the Interactome and Epigenome Reveals a Map of Oncogene- induced Signaling Anthony Gitter Cancer Bioinformatics

Stage 1: TF prizes continued• Test each TF independently for association with

differentially expressed gene g• Are coefficients in linear regression significantly

different from 0?

Regression coefficient, use t-test statistic for prize

Binding affinity for TF τ near gene gGene g log2 expression fold change

Page 10: Linking Proteomic and Transcriptional Data through the Interactome and Epigenome Reveals a Map of Oncogene- induced Signaling Anthony Gitter Cancer Bioinformatics

Stage 2: identify subnetwork

Page 11: Linking Proteomic and Transcriptional Data through the Interactome and Epigenome Reveals a Map of Oncogene- induced Signaling Anthony Gitter Cancer Bioinformatics

Solving PCST

• Use off-the-shelf Steiner tree solver• Solver creates an integer program

Penalty for omitted prize nodes

Cost of selected edgesPrize vs. edge cost tradeoff

𝑜 (𝐹 )=𝛽 ∑𝑣∉𝑉 𝐹

𝑝 (𝑣 )+ ∑𝑒∈𝐸𝐹

𝑐 (𝑒)

Page 12: Linking Proteomic and Transcriptional Data through the Interactome and Epigenome Reveals a Map of Oncogene- induced Signaling Anthony Gitter Cancer Bioinformatics

EGRFvIII signaling pathway

Page 13: Linking Proteomic and Transcriptional Data through the Interactome and Epigenome Reveals a Map of Oncogene- induced Signaling Anthony Gitter Cancer Bioinformatics

Comparing with other network approaches

• Also compare to xenograft phosphorylation

Page 14: Linking Proteomic and Transcriptional Data through the Interactome and Epigenome Reveals a Map of Oncogene- induced Signaling Anthony Gitter Cancer Bioinformatics

Using pathway structure for validation• Which proteins to test?• What are appropriate negative controls?

𝑠𝑐𝑜𝑟𝑒 (𝑣 𝑖 )= ∑𝑒𝑖𝑗∈𝐸 ,𝑣 𝑗∈𝑉𝐹

(1−𝑐 (𝑒𝑖𝑗 ))

Page 15: Linking Proteomic and Transcriptional Data through the Interactome and Epigenome Reveals a Map of Oncogene- induced Signaling Anthony Gitter Cancer Bioinformatics

Three tiers of proteins to test

Page 16: Linking Proteomic and Transcriptional Data through the Interactome and Epigenome Reveals a Map of Oncogene- induced Signaling Anthony Gitter Cancer Bioinformatics

Inhibitor viability screening

• U87H cells very sensitive to inhibiting high-ranked targets

Page 17: Linking Proteomic and Transcriptional Data through the Interactome and Epigenome Reveals a Map of Oncogene- induced Signaling Anthony Gitter Cancer Bioinformatics

Inhibitor viability screening

• 3 of 4 top targets more toxic in U87H• 1 of 3 lower targets more toxic, requires high dose

Page 18: Linking Proteomic and Transcriptional Data through the Interactome and Epigenome Reveals a Map of Oncogene- induced Signaling Anthony Gitter Cancer Bioinformatics

ChIP-seq to explore EP300 role• EP300 is a Steiner node, top-ranked TF• Find its targets include epithelial-mesenchymal

transformation markers

Page 19: Linking Proteomic and Transcriptional Data through the Interactome and Epigenome Reveals a Map of Oncogene- induced Signaling Anthony Gitter Cancer Bioinformatics

PCST versus other pathway approaches• MEMo / Multi-Dendrix (mutual exclusivity)• RPPA regression• GSEA / PARADIGM (pathway enrichment / activity)• HotNet / NBS (network diffusion)• ActiveDriver (phosphorylation impact)

Page 20: Linking Proteomic and Transcriptional Data through the Interactome and Epigenome Reveals a Map of Oncogene- induced Signaling Anthony Gitter Cancer Bioinformatics

Multi-sample Prize-Collecting Steiner Forest (Multi-PCSF)• PCST => PCSF: allow multiple disjoint trees• PCSF => Multi-PCSF: jointly optimize pathways for

many related samples (e.g. tumors)• Approximate optimization with belief propagation

instead of integer program

vi

vj vlvk

Page 21: Linking Proteomic and Transcriptional Data through the Interactome and Epigenome Reveals a Map of Oncogene- induced Signaling Anthony Gitter Cancer Bioinformatics

PCST formulation

Protein-protein interactions

Alteration No data

Penalty for omitted prize nodes

Cost of selected edgesPrize vs. edge cost tradeoff

𝑜 (𝐹 )=𝛽 ∑𝑣∉𝑉 𝐹

𝑝 (𝑣 )+ ∑𝑒∈𝐸𝐹

𝑐 (𝑒)

Page 22: Linking Proteomic and Transcriptional Data through the Interactome and Epigenome Reveals a Map of Oncogene- induced Signaling Anthony Gitter Cancer Bioinformatics

Multi-PCSF formulation

Alteration Important in related tumors No data

Artificial prizes to share informationOriginal Steiner tree objective

Explaining individuals’ data vs. similarity tradeoff

𝑜 (ℱ )=∑𝑖=1

𝑁

𝑜 (𝐹 𝑖 )+𝜆∑𝑖=1

𝑁

∑𝑣∉𝑉 𝐹 𝑖

𝜙(𝛼 ,𝑣 ,𝑝𝑖 (𝑣 ) ,ℱ {𝐹 ¿¿ 𝑖)

Page 23: Linking Proteomic and Transcriptional Data through the Interactome and Epigenome Reveals a Map of Oncogene- induced Signaling Anthony Gitter Cancer Bioinformatics

Mutation or proteomic change

Important in related tumors

Steiner node

Breast cancer tumor TCGA-AN-A0AR

Page 24: Linking Proteomic and Transcriptional Data through the Interactome and Epigenome Reveals a Map of Oncogene- induced Signaling Anthony Gitter Cancer Bioinformatics

Unique pathways with common core

Mutation or proteomic change

Important in related tumors

Steiner node

Page 25: Linking Proteomic and Transcriptional Data through the Interactome and Epigenome Reveals a Map of Oncogene- induced Signaling Anthony Gitter Cancer Bioinformatics

Appendix

Page 26: Linking Proteomic and Transcriptional Data through the Interactome and Epigenome Reveals a Map of Oncogene- induced Signaling Anthony Gitter Cancer Bioinformatics

Alternative pathway identification algorithms• Steiner tree/forest (related to PCST)• Prize-collecting Steiner forest (PCSF)• Belief propagation approximation (msgsteiner)

• k-shortest paths• Ruths2007• Shih2012

• Integer programs• Signaling-regulatory Pathway INferencE (SPINE)• Chasman2014

Page 27: Linking Proteomic and Transcriptional Data through the Interactome and Epigenome Reveals a Map of Oncogene- induced Signaling Anthony Gitter Cancer Bioinformatics

Alternative pathway identification algorithms continued• Path-based objectives• Physical Network Models (PNM)• Maximum Edge Orientation (MEO)• Signaling and Dynamic Regulatory Events Miner (SDREM

)

• Maximum flow• ResponseNet

• Hybrid approaches• PathLinker: random walk + shortest paths• ANAT: shortest path + Steiner tree

Page 28: Linking Proteomic and Transcriptional Data through the Interactome and Epigenome Reveals a Map of Oncogene- induced Signaling Anthony Gitter Cancer Bioinformatics

Recent developments in pathway discovery• Multi-task learning: jointly model several related

biological conditions• ResponseNet extension: SAMNet• Steiner forest extension: Multi-PCSF• SDREM extension: MT-SDREM

• Temporal data• ResponseNet extension: TimeXNet• Pathway synthesis