Jacques.van.Helden@ulb.ac.be Proteome and interactome bioinformatics

Upload: hollie-mclaughlin

Post on 03-Jan-2016

Page 1: Jacques.van.Helden@ulb.ac.be Proteome and interactome Bioinformatics

Jacques.van.Helden@ulb.ac.be

Proteome and interactome

Bioinformatics

Page 2: Jacques.van.Helden@ulb.ac.be Proteome and interactome Bioinformatics

Contents

Protein-protein interactions
• Two-hybrid assays
• Mass spectrometry

Cellular localization of proteins
• GFP tags

Protein-DNA interactions
• ChIP-on-chip

Page 3: Jacques.van.Helden@ulb.ac.be Proteome and interactome Bioinformatics

Two-hybrid assays

Functional genomics

Page 4: Jacques.van.Helden@ulb.ac.be Proteome and interactome Bioinformatics

Two-hybrid method

Transcription factor
• A DNA-binding domain combined with an activation domain, which recruits the RNA polymerase (RNApol).

Hybrid constructions
• Bait: DNA-binding domain fused to ORF A
• Prey: ORF B fused to the activation domain

Outcome
• Interaction between A and B: the reporter gene is expressed.
• No interaction: the reporter gene is not expressed.

Page 5: Jacques.van.Helden@ulb.ac.be Proteome and interactome Bioinformatics

Two-hybrid

Ito et al. (2001) PNAS 98: 4569-4574

Uetz et al. (2000). Nature 403: 623-631

Page 6: Jacques.van.Helden@ulb.ac.be Proteome and interactome Bioinformatics

Comparison of the results

When the second “comprehensive” analysis was published, the overlap between the results obtained in the two independent studies was surprisingly low.

How should this be interpreted?
• Problem of coverage? Each study would only represent a fraction of what remains to be discovered.
• Problem of noise? Either or both studies might contain a large number of false positives.
• Differences in experimental conditions?

Ito et al. (2001) PNAS 98: 4569-4574
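The overlap between two interaction sets can be measured as in the minimal sketch below. The protein names and the two toy lists are hypothetical and are not taken from either study; the only point is that each interaction is treated as an unordered pair, so that (A, B) and (B, A) count as the same interaction.

```python
def normalize(pairs):
    """Represent each interaction as an unordered pair (frozenset)."""
    return {frozenset(pair) for pair in pairs}

# Hypothetical toy data, for illustration only.
study_a = normalize([("P1", "P2"), ("P2", "P3"), ("P4", "P5"), ("P6", "P7")])
study_b = normalize([("P2", "P1"), ("P3", "P8"), ("P5", "P4"),
                     ("P9", "P10"), ("P11", "P12")])

shared = study_a & study_b                 # interactions found by both studies
union = study_a | study_b                  # interactions found by at least one
jaccard = len(shared) / len(union)         # overlap relative to the union
frac_a = len(shared) / len(study_a)        # part of study A confirmed by B
frac_b = len(shared) / len(study_b)        # part of study B confirmed by A

print(f"shared={len(shared)} jaccard={jaccard:.2f} "
      f"A confirmed={frac_a:.2f} B confirmed={frac_b:.2f}")
```

A low overlap computed this way does not by itself distinguish between the interpretations listed above: low coverage and a high false positive rate both reduce the shared fraction.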

Page 7: Jacques.van.Helden@ulb.ac.be Proteome and interactome Bioinformatics

Connectivity in protein interaction networks

Jeong et al. (2001) calculated the connectivity (number of interaction partners per protein) in the protein interaction network revealed by the two-hybrid analysis of Uetz and co-workers.

The connectivity follows a power law:
• most proteins have few connections;
• a few proteins are highly connected.

Highly connected proteins tend to be essential proteins.

Jeong, H., S.P. Mason, A.L. Barabasi, and Z.N. Oltvai. 2001. Lethality and centrality in protein networks. Nature 411: 41-42.
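As a rough illustration of what "connectivity" means here, the sketch below counts the partners of each protein in an edge list and fits a straight line to the degree distribution on log-log axes. The edge list is a hypothetical toy network, and a least-squares fit on log-log values is only a crude way to estimate a power-law exponent; it is not claimed to be the procedure used by Jeong et al.

```python
import math
from collections import Counter

def degrees(edges):
    """Count the distinct interaction partners of each protein."""
    partners = {}
    for a, b in edges:
        if a == b:
            continue  # self-interactions are ignored in this sketch
        partners.setdefault(a, set()).add(b)
        partners.setdefault(b, set()).add(a)
    return {protein: len(p) for protein, p in partners.items()}

def degree_distribution(deg):
    """Map each connectivity k to the number of proteins with k partners."""
    return Counter(deg.values())

def loglog_slope(dist):
    """Least-squares slope of log10(count) versus log10(k)."""
    xs = [math.log10(k) for k in dist]
    ys = [math.log10(n) for n in dist.values()]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct connectivity values")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Hypothetical toy network: one hub (H) and several poorly connected proteins.
edges = [("H", "A"), ("H", "B"), ("H", "C"), ("H", "D"), ("H", "E"),
         ("H", "F"), ("A", "B"), ("G", "I")]

deg = degrees(edges)
dist = degree_distribution(deg)
slope = loglog_slope(dist)  # for P(k) ~ k^(-gamma), the slope estimates -gamma
print(dict(dist), round(slope, 2))
```

In a power-law network, the number of proteins decreases with k along a straight line on log-log axes, so the fitted slope is negative.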

Page 8: Jacques.van.Helden@ulb.ac.be Proteome and interactome Bioinformatics

Mass-spectrometry

Functional genomics

Page 9: Jacques.van.Helden@ulb.ac.be Proteome and interactome Bioinformatics

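Isolation of protein complexes

1. Construction of a bank of TAG-fused ORFs
2. Expression of the tagged baits in yeast
3. Cell lysis
4. Affinity purification (anti-tag epitope)

[Figure: the tagged bait Y is expressed in yeast; after cell lysis, affinity purification against the tag separates the bait and its associated proteins (A, B, C, D, E) from the other cellular proteins.]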

Slide from Nicolas Simonis

Page 10: Jacques.van.Helden@ulb.ac.be Proteome and interactome Bioinformatics

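Mass spectrometry - Protein identification

[Figure: the purified complex (bait Y with proteins A, B, C, D, E) is separated by 1-dimension SDS-PAGE; each band is isolated and analysed by mass spectrometry, which identifies the proteins as yeast ORFs (YLR258w, YER133w, YER054c, YPR184w, YKL085w, YPR160w).]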

Slide from Nicolas Simonis

Page 11: Jacques.van.Helden@ulb.ac.be Proteome and interactome Bioinformatics

Protein complexes

High-throughput mass-spectrometric protein complex identification (HMS-PCI)
• MDS Proteomics: 493 complexes

Tandem Affinity Purification (TAP)
• Cellzome: 232 complexes

Ho et al. (2002). Nature 415: 180-183

Gavin et al. (2002). Nature 415: 141-147

Page 12: Jacques.van.Helden@ulb.ac.be Proteome and interactome Bioinformatics

Network of complexes

Gavin et al. (2002). Nature 415: 141-147

Page 13: Jacques.van.Helden@ulb.ac.be Proteome and interactome Bioinformatics

Assessment of interactome data

Functional genomics

Page 14: Jacques.van.Helden@ulb.ac.be Proteome and interactome Bioinformatics

Assessment of interactome data

von Mering et al (2002). Nature.

Page 15: Jacques.van.Helden@ulb.ac.be Proteome and interactome Bioinformatics

Comparison of large-scale interaction data

von Mering et al. (2002) compared the results from
• Two-hybrid assays
• Mass spectrometry (TAP and HMS-PCI)
• Co-expression in microarray experiments
• Synthetic lethality
• Comparative genomics (conservation of operons, phylogenetic profiles, and gene fusion)

Among 80,000 interactions, no more than 2,400 are supported by two different methods.

Each method is more specifically related to some
• functional classes
• cellular locations

Reference: von Mering et al. (2002). Nature 417: 399-403
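Counting how many interactions are supported by more than one method amounts to grouping the pairs by the set of methods that report them. The sketch below does this on hypothetical toy data (the method names and protein identifiers are placeholders), then reproduces the arithmetic behind the figures quoted above.

```python
from collections import defaultdict

def support_by_method(datasets):
    """Map each unordered protein pair to the set of methods reporting it."""
    support = defaultdict(set)
    for method, pairs in datasets.items():
        for a, b in pairs:
            support[frozenset((a, b))].add(method)
    return support

# Hypothetical toy data, for illustration only.
datasets = {
    "two_hybrid": [("P1", "P2"), ("P3", "P4")],
    "mass_spec": [("P2", "P1"), ("P5", "P6"), ("P7", "P8")],
    "coexpression": [("P5", "P6"), ("P9", "P10")],
}

support = support_by_method(datasets)
multi = [pair for pair, methods in support.items() if len(methods) >= 2]
toy_fraction = len(multi) / len(support)

# Figures quoted on the slide: at most 2,400 of 80,000 interactions.
slide_fraction = 2400 / 80000

print(f"toy: {len(multi)}/{len(support)} = {toy_fraction:.0%}")
print(f"slide: {slide_fraction:.0%}")
```

With the slide's figures, the multiply supported interactions represent at most 3% of the total.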

Page 16: Jacques.van.Helden@ulb.ac.be Proteome and interactome Bioinformatics

Comparison of pairs of interacting proteins with functional classes

von Mering et al. (2002). Nature 417: 399-403.

Page 17: Jacques.van.Helden@ulb.ac.be Proteome and interactome Bioinformatics

von Mering et al (2002). Nature.

Validation with annotated complexes

von Mering et al. (2002) collected information on experimentally proven physical protein-protein interactions, and measured the coverage and positive predictive value of each predictive method.

Coverage
• Fraction of the reference set covered by the data.

Positive predictive value
• Fraction of the data confirmed by the reference set.
• (Note: they call this “accuracy”, but this term is usually not used in this way.)

Beware: the scale is logarithmic! This emphasizes the differences in the lower part of the percentages (0-10), but “compresses” the values between 10 and 50, which gives a false impression of good accuracy.
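The two statistics defined above can be computed as in the following sketch. The function name and the toy pairs are hypothetical; the last lines illustrate the remark about the logarithmic scale, using 1% as the lower bound because the logarithm of 0 is undefined.

```python
import math

def coverage_and_ppv(predicted, reference):
    """Return (coverage, positive predictive value) for two sets of pairs."""
    predicted = {frozenset(pair) for pair in predicted}
    reference = {frozenset(pair) for pair in reference}
    if not predicted or not reference:
        raise ValueError("both sets must be non-empty")
    confirmed = predicted & reference
    coverage = len(confirmed) / len(reference)  # fraction of reference recovered
    ppv = len(confirmed) / len(predicted)       # fraction of data confirmed
    return coverage, ppv

# Hypothetical toy data, for illustration only.
reference = [("A", "B"), ("C", "D"), ("E", "F"), ("G", "H")]
predicted = [("B", "A"), ("C", "D"), ("X", "Y"), ("X", "Z"), ("Y", "Z")]

coverage, ppv = coverage_and_ppv(predicted, reference)
print(f"coverage={coverage:.2f} ppv={ppv:.2f}")

# Axis length taken by each interval on a log10 scale.
low_span = math.log10(10) - math.log10(1)    # from 1% to 10%
high_span = math.log10(50) - math.log10(10)  # from 10% to 50%
print(f"1-10%: {low_span:.2f}  10-50%: {high_span:.2f}")
```

On a logarithmic axis, the interval from 1% to 10% is longer than the interval from 10% to 50%, although the latter spans 40 percentage points.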

Page 18: Jacques.van.Helden@ulb.ac.be Proteome and interactome Bioinformatics

Cellular localization of proteins

Bioinformatics

Page 19: Jacques.van.Helden@ulb.ac.be Proteome and interactome Bioinformatics

4156 proteins detected by fluorescence microscopy analysis

Nature (2003) 425: 686-691

Slide adapted from Bruno André

Page 20: Jacques.van.Helden@ulb.ac.be Proteome and interactome Bioinformatics

Global analysis of protein localization

This analysis provided information on thousands of proteins whose cellular localization was previously unknown.

Slide adapted from Bruno André

Page 21: Jacques.van.Helden@ulb.ac.be Proteome and interactome Bioinformatics

Localisation and ORF function

For historical reasons, the yeast genome is “over-annotated”.
• The method used for predicting genes from genome sequences produced many false positives, especially among short predicted ORFs.
• Most of the questionable ORFs were not observed in the global localization analysis. These mainly correspond to short ORFs.

Source: Bruno André

Page 22: Jacques.van.Helden@ulb.ac.be Proteome and interactome Bioinformatics

Protein-DNA interactions

ChIP-on-chip technology

Functional genomics

Page 23: Jacques.van.Helden@ulb.ac.be Proteome and interactome Bioinformatics

The ChIP-on-chip method

Chromatin Immuno-precipitation (ChIP)
• Tagging of a transcription factor of interest with a protein fragment recognized by some antibody.
• Immobilization of protein-DNA interactions with a fixative agent.
• DNA fragmentation by ultrasonication.
• Precipitation of the DNA-protein complexes.
• Release of the DNA by breaking the DNA-protein bonds.

Measurement of DNA enrichment
• Two extracts are co-hybridized on a microarray (chip), where each spot contains one DNA fragment where a factor is likely to bind (e.g. an intergenic region, or a smaller fragment).
    • For the yeast S. cerevisiae, chips have been designed with all the intergenic regions (6000 regions, avg. 500 bp/region).
    • Recent technology allows spotting 300,000 DNA fragments of 300 bp on a single slide.
• The first extract (labelled in red) is enriched in DNA fragments bound to the tagged transcription factor.
• The second extract (labelled in green) has not been enriched.
• The log-ratio between the red and green channels indicates the enrichment for each intergenic region.
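The log-ratio can be computed per spot as in the minimal sketch below. The region names, the intensities, the pseudocount and the threshold are all assumptions chosen for illustration; real analyses also involve background subtraction and normalization, which are not shown here.

```python
import math

def log_ratios(red, green, pseudocount=1.0):
    """Return log2(red / green) for each region.

    The pseudocount (an assumption of this sketch) avoids taking the
    logarithm of zero or dividing by zero for empty spots.
    """
    return {
        region: math.log2((red[region] + pseudocount)
                          / (green[region] + pseudocount))
        for region in red
    }

# Hypothetical intensities for three intergenic regions.
red = {"region_1": 799, "region_2": 99, "region_3": 49}    # enriched extract
green = {"region_1": 99, "region_2": 99, "region_3": 99}   # non-enriched extract

ratios = log_ratios(red, green)

threshold = 1.0  # hypothetical cut-off: at least a two-fold enrichment
enriched = [region for region, value in ratios.items() if value >= threshold]
print(ratios, enriched)
```

A log-ratio of 0 means equal signal in both channels; each unit on the log2 scale corresponds to a two-fold difference, so positive values indicate regions enriched in the immunoprecipitated extract.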

Page 24: Jacques.van.Helden@ulb.ac.be Proteome and interactome Bioinformatics

Lee et al (2002)

In 2002, Lee et al. published a systematic characterization of the binding regions of 106 yeast transcription factors.

Lee et al. 2002. Science 298: 799-804.