Prediction of protein networks through data integration Lars Juhl Jensen EMBL Heidelberg

Upload: lars-juhl-jensen

Post on 27-Jun-2015

DESCRIPTION

MIPS Retreat, Kloster Frauenchiemsee, Chiemsee, Germany, July 9-10, 2007

TRANSCRIPT

Page 1: Prediction of protein networks through data integration

Prediction of protein networks through data integration

Lars Juhl Jensen

EMBL Heidelberg

Page 2: Prediction of protein networks through data integration

prediction of interactions

Page 3: Prediction of protein networks through data integration

STRING

Page 4: Prediction of protein networks through data integration
Page 5: Prediction of protein networks through data integration

functional interactions

Page 6: Prediction of protein networks through data integration

373 genomes

Page 7: Prediction of protein networks through data integration

model organism databases

Page 8: Prediction of protein networks through data integration

Ensembl

Page 9: Prediction of protein networks through data integration

Genome Reviews

Page 10: Prediction of protein networks through data integration

RefSeq

Page 11: Prediction of protein networks through data integration

genomic context methods

Page 12: Prediction of protein networks through data integration

gene neighborhood

Page 13: Prediction of protein networks through data integration
Page 14: Prediction of protein networks through data integration

gene fusion

Page 15: Prediction of protein networks through data integration
Page 16: Prediction of protein networks through data integration

phylogenetic profiles

Page 17: Prediction of protein networks through data integration
Page 18: Prediction of protein networks through data integration
Page 19: Prediction of protein networks through data integration
Page 20: Prediction of protein networks through data integration
Page 21: Prediction of protein networks through data integration

Cell

Cellulosomes

Cellulose

Page 22: Prediction of protein networks through data integration

correct interactions

Page 23: Prediction of protein networks through data integration

wrong associations

Page 24: Prediction of protein networks through data integration

phylogenetic profiles

Page 25: Prediction of protein networks through data integration
Page 26: Prediction of protein networks through data integration

SVD

Singular Value Decomposition

Page 27: Prediction of protein networks through data integration

Euclidean distance
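As a rough illustration of comparing phylogenetic profiles by Euclidean distance, the sketch below ranks gene pairs by how similar their presence/absence patterns are across genomes. It leaves out the SVD step named on the previous slide, and the gene names and five-genome profiles are invented for the example.

```python
from math import sqrt


def euclidean_distance(profile_a, profile_b):
    """Euclidean distance between two equal-length numeric profiles."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same genomes")
    return sqrt(sum((a - b) ** 2 for a, b in zip(profile_a, profile_b)))


def rank_pairs(profiles):
    """Return (distance, gene, gene) tuples, most similar profiles first."""
    names = sorted(profiles)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            distance = euclidean_distance(profiles[first], profiles[second])
            pairs.append((distance, first, second))
    return sorted(pairs)


# Hypothetical presence/absence profiles over five genomes (1 = gene present).
profiles = {
    "geneA": [1, 0, 1, 1, 0],
    "geneB": [1, 0, 1, 1, 0],
    "geneC": [0, 1, 0, 1, 1],
}

ranked = rank_pairs(profiles)
```

Here geneA and geneB occur in exactly the same genomes, so they come out at the top of the ranking with distance 0.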

Page 28: Prediction of protein networks through data integration

gene neighborhood

Page 29: Prediction of protein networks through data integration
Page 30: Prediction of protein networks through data integration

sum of intergenic distances
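One way to read "sum of intergenic distances" is as the total length of the gaps between consecutive genes lying between two genes of interest. The sketch below assumes that reading, with inclusive coordinates and all genes on one strand of one replicon; the gene names and coordinates are made up.

```python
def sum_intergenic_distances(genes, first, second):
    """Sum the gaps between consecutive genes from `first` to `second`.

    genes: list of (name, start, end) tuples with inclusive coordinates,
    all assumed to lie on the same strand of the same replicon.
    """
    ordered = sorted(genes, key=lambda gene: gene[1])
    names = [gene[0] for gene in ordered]
    i, j = sorted((names.index(first), names.index(second)))
    total = 0
    for left, right in zip(ordered[i:j], ordered[i + 1:j + 1]):
        # Overlapping genes contribute a gap of zero rather than a negative one.
        total += max(0, right[1] - left[2] - 1)
    return total


# Hypothetical gene coordinates.
genes = [("a", 1, 100), ("b", 111, 200), ("c", 251, 400)]
gap_ab = sum_intergenic_distances(genes, "a", "b")
gap_ac = sum_intergenic_distances(genes, "c", "a")
```

A small sum means the genes sit close together in a run, which is the kind of raw score that would then need calibration.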

Page 31: Prediction of protein networks through data integration

raw quality scores

Page 32: Prediction of protein networks through data integration

rank by reliability

Page 33: Prediction of protein networks through data integration

not comparable

Page 34: Prediction of protein networks through data integration

Euclidean distance

Page 35: Prediction of protein networks through data integration

sum of intergenic distances

Page 36: Prediction of protein networks through data integration

benchmarking

Page 37: Prediction of protein networks through data integration

calibrate vs. gold standard

Page 38: Prediction of protein networks through data integration
Page 39: Prediction of protein networks through data integration

raw quality scores

Page 40: Prediction of protein networks through data integration

probabilistic scores
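A minimal sketch of how raw quality scores might be calibrated against a gold standard: bin the pairs by raw score and, within each bin, take the fraction of benchmarkable pairs that the gold standard calls positive. The binning scheme, the pair names, and the positive/negative sets are assumptions for illustration, not the procedure used in the talk.

```python
def calibrate(scored_pairs, gold_positive, gold_negative, bin_edges):
    """Fraction of gold-standard positives per raw-score bin.

    Bins are half-open: bin_edges[b] <= score < bin_edges[b + 1].
    Pairs absent from both gold-standard sets are ignored.
    Returns None for bins without any benchmarkable pair.
    """
    counts = [[0, 0] for _ in range(len(bin_edges) - 1)]  # [positives, total]
    for pair, score in scored_pairs:
        key = frozenset(pair)
        if key not in gold_positive and key not in gold_negative:
            continue
        for b in range(len(bin_edges) - 1):
            if bin_edges[b] <= score < bin_edges[b + 1]:
                counts[b][1] += 1
                if key in gold_positive:
                    counts[b][0] += 1
                break
    return [pos / total if total else None for pos, total in counts]


# Hypothetical raw scores and gold standard.
scored_pairs = [
    (("a", "b"), 0.9), (("a", "c"), 0.3), (("b", "c"), 0.8),
    (("c", "d"), 0.1), (("d", "e"), 0.95), (("e", "f"), 0.7),
]
gold_positive = {frozenset(("a", "b")), frozenset(("b", "c"))}
gold_negative = {frozenset(("a", "c")), frozenset(("c", "d")), frozenset(("e", "f"))}

calibrated = calibrate(scored_pairs, gold_positive, gold_negative, [0.0, 0.5, 1.0])
```

Because every evidence type is mapped onto the same scale (fraction correct), scores that were not comparable as raw values become comparable after this step.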

Page 41: Prediction of protein networks through data integration

curated knowledge

Page 42: Prediction of protein networks through data integration

many sources

Page 43: Prediction of protein networks through data integration

KEGG

Kyoto Encyclopedia of Genes and Genomes

Page 44: Prediction of protein networks through data integration

Reactome

Page 45: Prediction of protein networks through data integration

PID

NCI-Nature Pathway Interaction Database

Page 46: Prediction of protein networks through data integration

STKE

Signal Transduction Knowledge Environment

Page 47: Prediction of protein networks through data integration

MIPS

Munich Information center for Protein Sequences

Page 48: Prediction of protein networks through data integration

Gene Ontology

Page 49: Prediction of protein networks through data integration

different gene identifiers

Page 50: Prediction of protein networks through data integration

synonyms list
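The identifier problem can be sketched as a lookup table built from a synonyms list, so that records from different sources resolve to one canonical identifier. All identifiers below are invented, and dropping ambiguous synonyms is just one possible policy.

```python
def build_lookup(synonyms):
    """Map every lower-cased name to its canonical identifier.

    synonyms: {canonical_id: [alternative names]}.
    Names claimed by more than one canonical identifier are dropped.
    """
    lookup = {}
    ambiguous = set()
    for canonical, names in synonyms.items():
        for name in [canonical, *names]:
            key = name.lower()
            if key in lookup and lookup[key] != canonical:
                ambiguous.add(key)
            lookup[key] = canonical
    for key in ambiguous:
        del lookup[key]
    return lookup


def normalize(name, lookup):
    """Return the canonical identifier for a name, or None if unknown."""
    return lookup.get(name.lower())


# Hypothetical synonyms list.
synonyms = {
    "GENE0001": ["abc1", "protX"],
    "GENE0002": ["abc2", "protX"],  # "protX" is ambiguous on purpose
}
lookup = build_lookup(synonyms)
```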

Page 51: Prediction of protein networks through data integration

literature mining

Page 52: Prediction of protein networks through data integration

MEDLINE

Page 53: Prediction of protein networks through data integration

SGD

Saccharomyces Genome Database

Page 54: Prediction of protein networks through data integration

The Interactive Fly

Page 55: Prediction of protein networks through data integration

OMIM

Online Mendelian Inheritance in Man

Page 56: Prediction of protein networks through data integration

co-mentioning
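Co-mentioning can be illustrated by counting how often two gene names appear in the same abstract. The sketch below uses naive token matching and toy sentences built around the gene names from the NLP example slide; it assumes a real system would match names through a synonyms list rather than by exact token.

```python
import re
from collections import Counter
from itertools import combinations


def count_comentions(abstracts, names):
    """Count, for each pair of names, the abstracts mentioning both."""
    counts = Counter()
    for text in abstracts:
        tokens = set(re.findall(r"[a-z0-9]+", text.lower()))
        mentioned = sorted(name for name in names if name.lower() in tokens)
        counts.update(combinations(mentioned, 2))
    return counts


# Toy abstracts written for this example.
abstracts = [
    "HAP1 controls expression of CYC1 and CYC7.",
    "CYC1 is a cytochrome gene.",
    "GAL4 binds DNA. HAP1 regulates CYC1.",
]
names = ["HAP1", "CYC1", "CYC7", "GAL4"]

comentions = count_comentions(abstracts, names)
```

The resulting counts are again raw scores, so they would have to be calibrated before being combined with other evidence.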

Page 57: Prediction of protein networks through data integration

NLP

Natural Language Processing

Page 58: Prediction of protein networks through data integration

Gene and protein names

Cue words for entity recognition

Verbs for relation extraction

[nxgene The GAL4 gene]

[nxexpr The expression of [nxgene the cytochrome genes [nxpg CYC1 and CYC7]]] is controlled by [nxpg HAP1]

Page 59: Prediction of protein networks through data integration

calibrate vs. gold standard

Page 60: Prediction of protein networks through data integration
Page 61: Prediction of protein networks through data integration

primary experimental data

Page 62: Prediction of protein networks through data integration

gene expression

Page 63: Prediction of protein networks through data integration

GEO

Gene Expression Omnibus

Page 64: Prediction of protein networks through data integration

expression compendia

Page 65: Prediction of protein networks through data integration

protein interactions

Page 66: Prediction of protein networks through data integration

BIND

Biomolecular Interaction Network Database

Page 67: Prediction of protein networks through data integration

BioGRID

General Repository for Interaction Datasets

Page 68: Prediction of protein networks through data integration

DIP

Database of Interacting Proteins

Page 69: Prediction of protein networks through data integration

IntAct

Page 70: Prediction of protein networks through data integration

MINT

Molecular Interactions Database

Page 71: Prediction of protein networks through data integration

HPRD

Human Protein Reference Database

Page 72: Prediction of protein networks through data integration

many sources

Page 73: Prediction of protein networks through data integration

different gene identifiers

Page 74: Prediction of protein networks through data integration

redundancy

Page 75: Prediction of protein networks through data integration

not comparable

Page 76: Prediction of protein networks through data integration

merge data by publication
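Merging by publication can be sketched as keying every interaction record on the unordered protein pair plus the paper it came from, so the same experiment imported by several databases is counted once. The record layout and the placeholder publication identifiers below are assumptions.

```python
def merge_by_publication(records):
    """Collapse redundant records that report one pair from one publication.

    records: iterable of (database, protein_a, protein_b, publication_id).
    Returns {(pair, publication_id): set of databases reporting it}.
    """
    merged = {}
    for database, protein_a, protein_b, publication_id in records:
        key = (frozenset((protein_a, protein_b)), publication_id)
        merged.setdefault(key, set()).add(database)
    return merged


# Hypothetical records; publication identifiers are placeholders.
records = [
    ("BIND", "p1", "p2", "pub_1"),
    ("DIP", "p2", "p1", "pub_1"),   # same experiment, reversed order
    ("MINT", "p1", "p2", "pub_2"),  # independent report
    ("IntAct", "p3", "p4", "pub_1"),
]
merged = merge_by_publication(records)
```

Four records collapse into three distinct observations, and only the pair reported in two different publications gains independent support.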

Page 77: Prediction of protein networks through data integration

raw quality scores

Page 78: Prediction of protein networks through data integration

calibrate vs. gold standard

Page 79: Prediction of protein networks through data integration
Page 80: Prediction of protein networks through data integration

combine all evidence

Page 81: Prediction of protein networks through data integration

spread over many species

Page 82: Prediction of protein networks through data integration

transfer by orthology
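A deliberately simplified sketch of transfer by orthology: evidence for a pair in one species is carried over to the corresponding pair in another species when both proteins have an ortholog there. It assumes a one-to-one ortholog table and copies the score unchanged, whereas a real transfer scheme would presumably down-weight transferred evidence; all identifiers are invented.

```python
def transfer_by_orthology(interactions, orthologs):
    """Map scored pairs from a source species onto a target species.

    interactions: {(protein_a, protein_b): score} in the source species.
    orthologs: {source_protein: target_protein}, assumed one-to-one.
    If several source pairs land on one target pair, the best score is kept.
    """
    transferred = {}
    for (protein_a, protein_b), score in interactions.items():
        if protein_a not in orthologs or protein_b not in orthologs:
            continue
        pair = tuple(sorted((orthologs[protein_a], orthologs[protein_b])))
        if pair[0] == pair[1]:
            continue  # both proteins map to the same target protein
        transferred[pair] = max(score, transferred.get(pair, 0.0))
    return transferred


# Hypothetical source-species evidence and ortholog table.
interactions = {("y1", "y2"): 0.8, ("y2", "y3"): 0.6, ("y1", "y4"): 0.9}
orthologs = {"y1": "h1", "y2": "h2", "y3": "h3"}

transferred = transfer_by_orthology(interactions, orthologs)
```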

Page 83: Prediction of protein networks through data integration

naïve Bayesian scoring
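Under a naive independence assumption, calibrated probabilities from separate evidence types can be combined as one minus the probability that every source is wrong. The sketch below shows only that core formula; it ignores any correction for the prior probability of an interaction, which a full scoring scheme may include.

```python
def combine_scores(scores):
    """Combine independent probabilistic scores: 1 - prod(1 - s_i)."""
    remaining = 1.0  # probability that every evidence source is wrong
    for score in scores:
        if not 0.0 <= score <= 1.0:
            raise ValueError("scores must be probabilities in [0, 1]")
        remaining *= 1.0 - score
    return 1.0 - remaining


# Hypothetical per-evidence scores for one protein pair.
combined = combine_scores([0.5, 0.5])
```

Two moderately reliable sources at 0.5 each combine to 0.75, which is why several weak but independent lines of evidence can add up to a confident prediction.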

Page 84: Prediction of protein networks through data integration
Page 85: Prediction of protein networks through data integration

prediction of interactions

Page 86: Prediction of protein networks through data integration

NetworKIN

Page 87: Prediction of protein networks through data integration
Page 88: Prediction of protein networks through data integration

the idea

Page 89: Prediction of protein networks through data integration

phosphoproteomics

Page 90: Prediction of protein networks through data integration

mass spectrometry

Page 91: Prediction of protein networks through data integration
Page 92: Prediction of protein networks through data integration

phosphorylation sites

Page 93: Prediction of protein networks through data integration

Phospho.ELM

Page 94: Prediction of protein networks through data integration

in vivo

Page 95: Prediction of protein networks through data integration

kinases are unknown

Page 96: Prediction of protein networks through data integration

computational methods

Page 97: Prediction of protein networks through data integration

NetPhosK

Page 98: Prediction of protein networks through data integration

Scansite

Page 99: Prediction of protein networks through data integration

sequence motifs

Page 100: Prediction of protein networks through data integration
Page 101: Prediction of protein networks through data integration

kinase families

Page 102: Prediction of protein networks through data integration

overprediction

Page 103: Prediction of protein networks through data integration

no context

Page 104: Prediction of protein networks through data integration

what a kinase could do

Page 105: Prediction of protein networks through data integration

not what it actually does

Page 106: Prediction of protein networks through data integration

context

Page 107: Prediction of protein networks through data integration

co-activators

Page 108: Prediction of protein networks through data integration

scaffolders

Page 109: Prediction of protein networks through data integration

protein networks

Page 110: Prediction of protein networks through data integration
Page 111: Prediction of protein networks through data integration

the algorithm

Page 112: Prediction of protein networks through data integration

NetworKIN
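The idea of adding network context to motif-based predictions can be sketched as follows: the sequence motif proposes a kinase family, and the network picks the family member best connected to the substrate. Multiplying the two scores is an assumption made for this example and is not claimed to be the published NetworKIN scoring; all names and numbers are invented.

```python
def assign_kinase(site_family_scores, family_members, network, substrate):
    """Pick the kinase with the best combined motif and context score.

    site_family_scores: {kinase_family: motif score for this site}.
    family_members: {kinase_family: [kinases in that family]}.
    network: {frozenset((protein, protein)): association score}.
    Returns (kinase, combined_score), or None if there is no candidate.
    """
    best = None
    for family, motif_score in site_family_scores.items():
        for kinase in family_members.get(family, []):
            context = network.get(frozenset((kinase, substrate)), 0.0)
            combined = motif_score * context  # assumed combination rule
            if best is None or combined > best[1]:
                best = (kinase, combined)
    return best


# Hypothetical motif scores, kinase families and network.
site_family_scores = {"FAM_A": 0.6, "FAM_B": 0.5}
family_members = {"FAM_A": ["kinA1", "kinA2"], "FAM_B": ["kinB1"]}
network = {
    frozenset(("kinA1", "sub")): 0.1,
    frozenset(("kinA2", "sub")): 0.9,
    frozenset(("kinB1", "sub")): 0.4,
}

best = assign_kinase(site_family_scores, family_members, network, "sub")
```

The motif alone cannot tell kinA1 from kinA2, since both belong to the same family; the network context is what separates them.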

Page 113: Prediction of protein networks through data integration
Page 114: Prediction of protein networks through data integration

benchmarking

Page 115: Prediction of protein networks through data integration

Phospho.ELM

Page 116: Prediction of protein networks through data integration
Page 117: Prediction of protein networks through data integration

2.5-fold better accuracy

Page 118: Prediction of protein networks through data integration

context is crucial

Page 119: Prediction of protein networks through data integration

global statistics

Page 120: Prediction of protein networks through data integration
Page 121: Prediction of protein networks through data integration

visualization

Page 122: Prediction of protein networks through data integration
Page 123: Prediction of protein networks through data integration

ATM signaling

Page 124: Prediction of protein networks through data integration
Page 125: Prediction of protein networks through data integration

experimental validation

Page 126: Prediction of protein networks through data integration

summary

Page 127: Prediction of protein networks through data integration

reanalysis

Page 128: Prediction of protein networks through data integration

benchmarking

Page 129: Prediction of protein networks through data integration

integration

Page 130: Prediction of protein networks through data integration

complementary data types

Page 131: Prediction of protein networks through data integration

computational methods

Page 132: Prediction of protein networks through data integration

reproduce what is known

Page 133: Prediction of protein networks through data integration

biological discoveries

Page 134: Prediction of protein networks through data integration

testable hypotheses

Page 135: Prediction of protein networks through data integration

Acknowledgments

The STRING database

– Christian von Mering

– Michael Kuhn

– Berend Snel

– Martijn Huynen

– Sean Hooper

– Samuel Chaffron

– Julien Lagarde

– Mathilde Foglierini

– Peer Bork

Literature mining

– Jasmin Saric

– Rossitza Ouzounova

– Isabel Rojas

The NetworKIN method

– Rune Linding

– Gerard Ostheimer

– Francesca Diella

– Karen Colwill

– Jing Jin

– Pavel Metalnikov

– Vivian Nguyen

– Adrian Pasculescu

– Jin Gyoon Park

– Leona D. Samson

– Rob Russell

– Peer Bork

– Michael Yaffe

– Tony Pawson