proteomics technologies and protein-protein interaction

Proteomics technologies and protein-protein interaction Lars Kiemer Center for Biological Sequence Analysis The Technical University of Denmark Advanced bioinformatics – November

Page 1: Proteomics technologies and protein-protein interaction

Proteomics technologies and protein-protein interaction

Lars Kiemer

Center for Biological Sequence Analysis

The Technical University of Denmark

Advanced bioinformatics – November 2005

Page 2: Proteomics technologies and protein-protein interaction

Outlining the problem

Around 30% of the human proteins still have no annotated function.

Even if the function is known, we often don’t know anything about the big picture (regulation?, multiple functions?, pathogenesis?, mutations?, splice variants?).

In fact, the individual proteins are as interesting as bricks in a wall – what we want to know about is the system.

Page 3: Proteomics technologies and protein-protein interaction

Example: signal transduction cascade

[Figure: signalling diagram spanning the EXTRACELLULAR, CYTOPLASM and NUCLEUS compartments. Labelled components: NCAM, FGFR, CB1, Fyn, Frs2, PLC, Shc, Fak, DAGL, Ca2+, PKC, PKA, Grb2, Sos, GAP43, CaMKII, Rap1, bRaf, Ras, Raf, MEK, MAPK, CREB, C-Fos.]

Page 4: Proteomics technologies and protein-protein interaction

Example: signal transduction cascade

[Figure: the signalling diagram with additional labels. Labelled components: NCAM, FGFR, CB1, Fyn, Frs2, PLC, Shc, Fak, DAGL, DAG, PIP2, 2-AG, IP3, Ca2+, cAMP, PKC, PKA, Grb2, Sos, GAP43, CaMKII, Rap1, bRaf, Ras, Raf, MEK, MAPK, CREB, C-Fos, Transcription. Compartments: EXTRACELLULAR, CYTOPLASM, NUCLEUS.]

Page 5: Proteomics technologies and protein-protein interaction

Obtaining data

High-throughput data can provide information about interactions with other proteins, protein abundance in different tissues, transcriptional regulation, etc.

High-throughput experimental techniques provide large data sets – thus no manual curation is possible.

These data sets often contain false positives, but combining several such data sets increases confidence.
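
One simple way to see why combining data sets increases confidence is the noisy-OR rule sketched below. It is only an illustration: it assumes that the data sets are independent and that each one has a known reliability (the fraction of its reported interactions that are true). Neither the function name nor the reliability values come from the slides.

```python
from math import prod


def combined_confidence(reliabilities):
    """Probability that at least one supporting observation is correct,
    assuming the observations are independent (noisy-OR)."""
    return 1.0 - prod(1.0 - r for r in reliabilities)


# Hypothetical reliabilities of three data sets that all report the same interaction.
one_source = combined_confidence([0.5])
two_sources = combined_confidence([0.5, 0.5])
three_sources = combined_confidence([0.5, 0.5, 0.3])

print(one_source, two_sources, three_sources)
```

Under these assumptions, each additional supporting data set can only raise the combined score. Real data sets are rarely fully independent, so the result should be read as an upper bound rather than a calibrated probability.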

Page 6: Proteomics technologies and protein-protein interaction

Protein interactions reveal a lot!

Hints of the function of a protein are revealed when its interaction partners are known.

Guilt by association!

Complexes in which none of the interaction partners have known functions are even more interesting.
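
Guilt by association can be sketched as a vote among annotated interaction partners. The example below is a minimal illustration, not the method used in the lecture: the protein names, the annotations and the majority-vote rule are all assumptions.

```python
from collections import Counter


def predict_function(protein, interactions, annotations):
    """Return the most common annotated function among the interaction
    partners of `protein`, or None if no partner is annotated."""
    partners = set()
    for a, b in interactions:
        if a == protein:
            partners.add(b)
        elif b == protein:
            partners.add(a)
    votes = Counter(annotations[p] for p in partners if p in annotations)
    if not votes:
        return None
    return votes.most_common(1)[0][0]


# Hypothetical network: X is unannotated, A-C are annotated, D and E are not.
interactions = [("X", "A"), ("X", "B"), ("C", "X"), ("D", "E")]
annotations = {"A": "DNA repair", "B": "DNA repair", "C": "translation"}

print(predict_function("X", interactions, annotations))
print(predict_function("D", interactions, annotations))
```

The second call returns None, which corresponds to the case on the slide where none of the partners has a known function.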

Page 7: Proteomics technologies and protein-protein interaction

Yeast two-hybrid screening

Has been widely used

Only binary interactions

High false positive rate

Proteins must be able to enter the nucleus

Page 8: Proteomics technologies and protein-protein interaction

Affinity purification

Large-scale

Can be done on any preparation of cells

Often complexes are purified and the order of binding is not obtained

An extra step is needed to identify purified proteins

Page 9: Proteomics technologies and protein-protein interaction

Mass spectrometer

3 principal components:

Ion source: converts the analyte into gas-phase ions.

Mass analyzer(s): separates gas-phase ions by m/z (labelled Q1, q2 and TOF in the diagram).

Detector: ions are detected as they discharge on the detector.
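
The quantity the mass analyzer separates on, m/z, can be worked out for a peptide from standard monoisotopic residue masses. The sketch below assumes an unmodified peptide that is ionized by adding z protons; the function names and the example sequence are illustrative and not taken from the slides.

```python
# Standard monoisotopic residue masses in daltons.
MONOISOTOPIC_RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # one water per peptide (free N- and C-terminus)
PROTON = 1.007276   # mass added per positive charge


def peptide_mass(sequence):
    """Monoisotopic mass of the neutral, unmodified peptide."""
    return sum(MONOISOTOPIC_RESIDUE_MASS[aa] for aa in sequence) + WATER


def mz(sequence, charge):
    """m/z of the peptide carrying `charge` extra protons."""
    return (peptide_mass(sequence) + charge * PROTON) / charge


# Illustrative peptide, singly and doubly charged.
print(round(mz("PEPTIDE", 1), 4))
print(round(mz("PEPTIDE", 2), 4))
```

Note that leucine and isoleucine have identical masses, so they cannot be told apart by mass alone.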

Page 10: Proteomics technologies and protein-protein interaction

Mass spectrometry in short

Extremely sensitive

Mass precision down to a single atom

In principle, detection of one relatively short peptide allows for unambiguous identification.

Some proteins are difficult to chop up with proteases.

Some peptides are very difficult to ionize.

Due to the high sensitivity of the method, contamination is difficult to avoid.
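
The idea that a single peptide can identify a protein can be illustrated with an in-silico digest. The sketch below assumes trypsin as the protease, with the common simplified rule (cleave after K or R unless the next residue is P, no missed cleavages), and a minimum peptide length of six residues. The two sequences are made up for the example.

```python
def tryptic_peptides(sequence, min_length=6):
    """Digest a protein sequence with a simplified trypsin rule:
    cleave after K or R, except when the next residue is P."""
    peptides, start = set(), 0
    for i, residue in enumerate(sequence):
        at_end = i == len(sequence) - 1
        cleave = residue in "KR" and not at_end and sequence[i + 1] != "P"
        if cleave or at_end:
            peptide = sequence[start:i + 1]
            if len(peptide) >= min_length:
                peptides.add(peptide)
            start = i + 1
    return peptides


def identifying_peptides(database):
    """Map each protein to the peptides that occur in no other protein."""
    owners = {}
    for name, sequence in database.items():
        for peptide in tryptic_peptides(sequence):
            owners.setdefault(peptide, set()).add(name)
    return {
        name: {p for p, names in owners.items() if names == {name}}
        for name in database
    }


# Hypothetical two-protein database sharing one peptide (EDPEPTIDEK).
database = {
    "protA": "GASPVLKSHAREDPEPTIDEKLLLLLLR",
    "protB": "MMMMMMKEDPEPTIDEKWWWWWWR",
}
unique = identifying_peptides(database)
print(unique)
```

Observing any peptide from `unique["protA"]` would point to protA alone, whereas the shared peptide cannot distinguish the two proteins.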

Page 11: Proteomics technologies and protein-protein interaction

Protein interaction databases: Spoke/Matrix

[Figure: an affinity pulldown with one bait and its prey proteins, drawn three ways: Spoke, Matrix, and Truth?]
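
The two representations differ in which pairs they record for one pulldown. In the spoke model only bait-prey pairs are stored; in the matrix model every pair of co-purified proteins is stored. The following sketch shows the expansion for a single pulldown; the function names and protein labels are illustrative.

```python
from itertools import combinations


def spoke_pairs(bait, preys):
    """Spoke model: the bait is paired with each prey."""
    return {frozenset((bait, prey)) for prey in preys if prey != bait}


def matrix_pairs(bait, preys):
    """Matrix model: every pair among the bait and the preys."""
    members = sorted(set(preys) | {bait})
    return {frozenset(pair) for pair in combinations(members, 2)}


# Hypothetical pulldown: one bait and three preys.
spoke = spoke_pairs("bait", ["prey1", "prey2", "prey3"])
matrix = matrix_pairs("bait", ["prey1", "prey2", "prey3"])

print(len(spoke), len(matrix))
```

For n preys the spoke model yields n pairs and the matrix model n(n+1)/2, so the matrix model tends to add false positives while the spoke model may miss true prey-prey contacts.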

Page 12: Proteomics technologies and protein-protein interaction

Protein interaction databases: Overlap

Protein interaction data: a total of 18,629 articles are represented in the databases (June 2005).

Database               Unique article references   Interaction pairs in unique references
DIP                    1,353                       5,403 (binary?)
MINT                   1,406                       5,430 (spoke)
IntAct                 355                         6,836 (spoke)
GRID                   1,232                       49,135 (binary?)
BIND* (protein part)   5,733                       44,279 (spoke/matrix)
HPRD                   6,989                       14,533 (matrix)

* Approx. 10% of the protein-protein interactions in BIND are database imports.

Page 13: Proteomics technologies and protein-protein interaction

Species bias in available data

A few select organisms are very well-studied, while others are not.

The BIND database, species distribution (Alfarano et al., NAR, 2005):

[Pie chart legend: Yeast, Drosophila, Human, C. elegans, Mouse, Helicobacter, Bos taurus, HIV, Other.]

Page 14: Proteomics technologies and protein-protein interaction

Trans-organism protein interaction network

Orthologs?

Orthologs are genes in different species that descend directly from a single gene in their common ancestor (O'Brien K, Remm et al. 2005).

[Figure: orthologous genes in S. cerevisiae, D. melanogaster and H. sapiens.]
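
Transferring interactions between organisms through orthologs can be sketched as a lookup: an interaction seen in a source species is inferred in the target species when both partners have an ortholog there. The identifiers and the ortholog table below are hypothetical, and the rule shown is a simplification that ignores ortholog confidence.

```python
def transfer_interactions(interactions, orthologs):
    """Map source-species interactions onto the target species.
    `orthologs` maps a source protein to a list of target orthologs."""
    inferred = set()
    for a, b in interactions:
        for target_a in orthologs.get(a, ()):
            for target_b in orthologs.get(b, ()):
                if target_a != target_b:
                    inferred.add(frozenset((target_a, target_b)))
    return inferred


# Hypothetical yeast interactions and yeast-to-human ortholog table.
yeast_interactions = [("yA", "yB"), ("yB", "yC")]
yeast_to_human = {"yA": ["hA"], "yB": ["hB1", "hB2"]}

human_inferred = transfer_interactions(yeast_interactions, yeast_to_human)
print(sorted(tuple(sorted(pair)) for pair in human_inferred))
```

The yB-yC interaction is not transferred because yC has no ortholog in the table, which is one reason an inferred network stays incomplete.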

Page 15: Proteomics technologies and protein-protein interaction

Trans-organism protein interaction network

[Figure labels: D. melanogaster (experimental), C. elegans (experimental), S. cerevisiae (experimental), H. sapiens (MOSAIC).]

Page 16: Proteomics technologies and protein-protein interaction

Repetition of experiments adds credibility

Light blue connection – 1 experiment.

Darker blue connection – >1 experiment, 1 organism.

Purple connection – >1 experiment, >1 organism.
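
The colour scheme amounts to counting, for each interaction, how many distinct experiments and organisms support it. A minimal sketch follows, assuming each observation is recorded as (protein A, protein B, experiment id, organism); that record layout and the example data are assumptions made for illustration.

```python
from collections import defaultdict


def classify_edges(observations):
    """Assign each interaction the colour class described in the legend."""
    experiments = defaultdict(set)
    organisms = defaultdict(set)
    for a, b, experiment, organism in observations:
        edge = frozenset((a, b))
        experiments[edge].add(experiment)
        organisms[edge].add(organism)

    classes = {}
    for edge in experiments:
        if len(experiments[edge]) == 1:
            classes[edge] = "light blue"    # 1 experiment
        elif len(organisms[edge]) == 1:
            classes[edge] = "darker blue"   # >1 experiment, 1 organism
        else:
            classes[edge] = "purple"        # >1 experiment, >1 organism
    return classes


# Hypothetical observations; (B, A) counts as the same edge as (A, B).
observations = [
    ("A", "B", "exp1", "yeast"),
    ("B", "A", "exp2", "yeast"),
    ("A", "C", "exp1", "yeast"),
    ("C", "D", "exp3", "yeast"),
    ("C", "D", "exp4", "fly"),
]
edge_classes = classify_edges(observations)
print(edge_classes)
```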

Page 17: Proteomics technologies and protein-protein interaction

Adding co-expression data

Red connector – co-expression in 80 different tissues with a correlation coefficient above 0.7.

Grey nodes – no expression data available.
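
The co-expression criterion can be sketched as a correlation across tissue expression profiles. The slides do not say which correlation coefficient was used, so the Pearson coefficient below is an assumption, and the toy profiles cover five tissues rather than 80.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / sqrt(var_x * var_y)


def coexpressed(expression, a, b, threshold=0.7):
    """True/False for the red connector; None if a profile is missing
    (the grey-node case)."""
    if a not in expression or b not in expression:
        return None
    return pearson(expression[a], expression[b]) > threshold


# Hypothetical expression levels in five tissues.
expression = {
    "g1": [1, 2, 3, 4, 5],
    "g2": [2, 4, 6, 8, 10],
    "g3": [5, 1, 4, 2, 3],
}
print(coexpressed(expression, "g1", "g2"))
print(coexpressed(expression, "g1", "g3"))
print(coexpressed(expression, "g1", "g4"))
```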

Page 18: Proteomics technologies and protein-protein interaction

Nucleolus dynamics

Nodes are coloured according to level of protein in the nucleolus following transcriptional inhibition (Andersen et al., Nature, 2005).

[Figure legend: relative level of protein in the nucleolus after inhibition of transcription: decreased, unchanged, increased.]

Page 19: Proteomics technologies and protein-protein interaction

Adding up to make high quality associations

Integration of various data sources builds up confidence

Page 20: Proteomics technologies and protein-protein interaction

Upon integration comes enlightenment

Page 21: Proteomics technologies and protein-protein interaction

Upon integration comes enlightenment

Page 22: Proteomics technologies and protein-protein interaction

Identifying functional complexes

Ribosome (predominantly 60S)

DNA repair

SMARCA complex

TFIID

Arp2/3

Page 23: Proteomics technologies and protein-protein interaction

Summary

Protein-protein interactions can reveal hints about the function of a protein (guilt by association).

Information about protein interactions is obtained with different technologies, each with its own advantages and weaknesses.

Due to the high degree of systemic conservation, interactions can be inferred from observed interactions in other species.

Data are always error-prone. Repeated observations build up confidence.

Integrating different types of data can further build up confidence.