jacques.van.helden@ulb.ac.be proteome and interactome bioinformatics

Post on 03-Jan-2016


Jacques.van.Helden@ulb.ac.be

Proteome and interactome

Bioinformatics

Contents

Protein-protein interactions
• Two-hybrid assays
• Mass spectrometry

Cellular localization of proteins
• GFP tags

Protein-DNA interactions
• ChIP-on-chip


Two-hybrid assays

Functional genomics

Two-hybrid method

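[Figure: principle of the two-hybrid method. A transcription factor combines a DNA-binding domain and an activation domain, which recruits the RNA polymerase (RNApol). Hybrid constructions: ORF A is fused to the DNA-binding domain (bait) and ORF B to the activation domain (prey).]

• Interaction between bait and prey: the reporter gene is expressed.

• No interaction: the reporter gene is not expressed.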

Two-hybrid

Ito et al. (2001) PNAS 98: 4569-4574

Uetz et al. (2000). Nature 403: 623-631

Comparison of the results

When the second “comprehensive” analysis was published, the overlap between the results obtained in the two independent studies was surprisingly low.

How to interpret this?

• Problem of coverage? Each study would only represent a fraction of what remains to be discovered.

• Problem of noise? Either or both studies might contain a large number of false positives.

• Differences in experimental conditions?

Ito et al. (2001) PNAS 98: 4569-4574
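
The overlap between two interaction sets can be quantified in a few lines. The sketch below is only an illustration: the protein names are hypothetical toy data (not taken from Ito et al. or Uetz et al.), and the function names `normalize` and `overlap_stats` are introduced here. Interactions are treated as unordered pairs, so that bait/prey orientation does not hide a shared interaction.

```python
def normalize(pairs):
    """Treat interactions as unordered pairs: (A, B) equals (B, A)."""
    return {frozenset(p) for p in pairs}


def overlap_stats(study_1, study_2):
    """Count shared interactions and express them relative to each study.

    Assumes both studies report at least one interaction.
    """
    s1, s2 = normalize(study_1), normalize(study_2)
    shared = s1 & s2
    return {
        "shared": len(shared),
        "fraction_of_study_1": len(shared) / len(s1),
        "fraction_of_study_2": len(shared) / len(s2),
        # Jaccard index: shared interactions over all reported interactions
        "jaccard": len(shared) / len(s1 | s2),
    }


# Hypothetical toy data, for illustration only
study_1 = [("A", "B"), ("A", "C"), ("B", "D"), ("E", "F")]
study_2 = [("B", "A"), ("C", "D"), ("E", "G")]

stats = overlap_stats(study_1, study_2)
print(stats)
```

A low value can reflect limited coverage, noise, or both: the overlap alone does not distinguish between the interpretations listed above.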

Connectivity in protein interaction networks

Jeong et al. (2001) calculated the connectivity (number of interaction partners per protein) in the protein interaction network revealed by the two-hybrid analysis of Uetz and co-workers.

The connectivity follows a power law:

• most proteins have a few connections;

• a few proteins are highly connected.

Highly connected proteins are more often essential (their deletion is lethal) than weakly connected ones.

Jeong, H., S.P. Mason, A.L. Barabasi, and Z.N. Oltvai. 2001. Lethality and centrality in protein networks. Nature 411: 41-42.
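
As a rough illustration of what "connectivity" means here, the sketch below counts the number of distinct partners of each protein in a list of pairwise interactions, then tabulates how many proteins have each connectivity value. The network is a hypothetical toy example and the function names are introduced for this sketch; it is not the procedure of Jeong et al.

```python
from collections import Counter


def degrees(interactions):
    """Number of distinct partners per protein (self-interactions ignored)."""
    partners = {}
    for a, b in interactions:
        if a == b:
            continue
        partners.setdefault(a, set()).add(b)
        partners.setdefault(b, set()).add(a)
    return {protein: len(p) for protein, p in partners.items()}


def degree_distribution(interactions):
    """Map each connectivity k to the number of proteins with k partners."""
    return dict(Counter(degrees(interactions).values()))


# Hypothetical toy network: one hub and several weakly connected proteins
toy = [("hub", "p1"), ("hub", "p2"), ("hub", "p3"), ("hub", "p4"),
       ("p1", "p2"), ("p5", "p6")]

deg = degrees(toy)
dist = degree_distribution(toy)
print(deg)
print(dist)
```

On a real network, plotting the distribution on log-log axes is a common first check of whether it is compatible with a power law.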


Mass-spectrometry

Functional genomics

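Isolation of protein complexes

1. Construction of a bank of TAG-fused ORFs

2. Expression of the tagged baits in yeast

3. Cell lysis

4. Affinity purification (anti-tag epitope)

[Figure: a tagged bait Y (tag fused to the ORF) is expressed in yeast. After cell lysis, the extract contains all cellular proteins. Affinity purification against the tag epitope recovers the bait together with the proteins A, B, C, D and E, separated from the other proteins.]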

Slide from Nicolas Simonis

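[Figure: the purified complex (bait Y with proteins A, B, C, D and E) is separated by one-dimensional SDS-PAGE. Each band is isolated and identified by mass spectrometry. ORF identifiers shown in the figure: YLR258w, YER133w, YER054c, YPR184w, YKL085w, YPR160w.]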

Mass spectrometry - Protein identification

Slide from Nicolas Simonis

Protein complexes

High-throughput mass-spectrometric protein complex identification (HMS-PCI)

• MDS Proteomics: 493 complexes

• Ho et al. (2002). Nature 415: 180-183

Tandem Affinity Purification (TAP)

• CELLZOME: 232 complexes

• Gavin et al. (2002). Nature 415: 141-147

Network of complexes

Gavin et al. (2002). Nature 415: 141-147


Assessment of interactome data

Functional genomics

Assessment of interactome data

von Mering et al (2002). Nature.

Comparison of large-scale interaction data

von Mering et al. (2002) compared the results from

• Two-hybrid assays

• Mass spectrometry (TAP and HMS-PCI)

• Co-expression in microarray experiments

• Synthetic lethality

• Comparative genomics (conservation of operons, phylogenetic profiles, and gene fusion)

Among 80,000 interactions, no more than 2,400 are supported by more than one method.

Each method is biased towards some

• functional classes

• cellular locations

Reference: von Mering et al. (2002). Nature 417: 399-403
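
With the figures quoted above, 2,400 out of 80,000 interactions amounts to 3%. The sketch below shows one way such a count could be obtained: each interaction is recorded with the set of methods that report it, and only those reported by at least two methods are retained. The data sets and the function name `support_by_method` are hypothetical, introduced for illustration only.

```python
from collections import defaultdict


def support_by_method(datasets):
    """Map each unordered protein pair to the set of methods reporting it."""
    support = defaultdict(set)
    for method, pairs in datasets.items():
        for a, b in pairs:
            support[frozenset((a, b))].add(method)
    return support


# Hypothetical toy data sets, one list of pairs per method
datasets = {
    "two_hybrid": [("A", "B"), ("A", "C")],
    "mass_spec": [("B", "A"), ("D", "E")],
    "coexpression": [("D", "E"), ("F", "G"), ("A", "B")],
}

support = support_by_method(datasets)
multi = {pair for pair, methods in support.items() if len(methods) >= 2}
fraction = len(multi) / len(support)
print(len(support), len(multi), fraction)
```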

Comparison of pairs of interacting proteins with functional classes

von Mering et al. (2002). Nature 417: 399-403.

von Mering et al (2002). Nature.

Validation with annotated complexes

von Mering et al. (2002) collected information on experimentally proven physical protein-protein interactions, and measured the coverage and positive predictive value of each predictive method.

Coverage

• Fraction of the reference set covered by the data.

Positive predictive value

• Fraction of the data confirmed by the reference set.

• (Note: they call this “accuracy”, but this term is usually not used in this way.)

Beware: the scale is logarithmic! This emphasizes the differences in the lower part of the percentages (0-10), but “compresses” the values between 10 and 50, which gives a false impression of good accuracy.
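
The two measures defined above can be computed as follows. This is a simplified sketch on hypothetical toy data: it compares a predicted set directly with a reference set of pairs, and does not reproduce the exact filtering applied by von Mering et al. The function name `coverage_and_ppv` is introduced here.

```python
def coverage_and_ppv(predicted, reference):
    """Return (coverage, positive predictive value) of a predicted set.

    coverage = confirmed pairs / pairs in the reference set
    PPV      = confirmed pairs / pairs in the predicted set
    Assumes both sets are non-empty.
    """
    pred = {frozenset(p) for p in predicted}
    ref = {frozenset(p) for p in reference}
    confirmed = pred & ref
    return len(confirmed) / len(ref), len(confirmed) / len(pred)


# Hypothetical toy data, for illustration only
reference = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
predicted = [("B", "A"), ("C", "D"), ("X", "Y"), ("X", "Z"), ("Y", "Z")]

coverage, ppv = coverage_and_ppv(predicted, reference)
print(f"coverage = {coverage:.2f}, PPV = {ppv:.2f}")
```

A method can score well on one measure and poorly on the other, which is why both are reported.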


Cellular localization of proteins

Bioinformatics

4156 proteins detected by fluorescence microscopy analysis

Nature (2003) 425: 686-691

Slide adapted from Bruno André

Global analysis of protein localization

This analysis provided information for thousands of proteins whose cellular localization was previously unknown.

Slide adapted from Bruno André

Localisation and ORF function

• For historical reasons, the yeast genome is “over-annotated”. The method used for predicting genes from genome sequences included many false positives, especially among short predicted ORFs.

• Most of the questionable ORFs were not observed in the global localization analysis. These mainly correspond to short ORFs.

Source: Bruno André


Protein-DNA interactions

ChIP-on-chip technology

Functional genomics

The ChIP-on-chip method

Chromatin Immuno-precipitation (ChIP)

• Tagging of a transcription factor of interest with a protein fragment recognized by some antibody.

• Immobilization of protein-DNA interactions with a fixative agent.

• DNA fragmentation by ultrasonication.

• Precipitation of the DNA-protein complexes.

• Reversal of the DNA-protein bonds.

Measurement of DNA enrichment

• Two extracts are co-hybridized on a microarray (chip), where each spot contains one DNA fragment where a factor is likely to bind (e.g. an intergenic region, or a smaller fragment).

• For the yeast S. cerevisiae, chips have been designed with all the intergenic regions (6000 regions, avg. 500 bp/region).

• Recent technology allows spotting 300,000 DNA fragments of 300 bp on a single slide.

• The first extract (labelled in red) is enriched in DNA fragments bound to the tagged transcription factor.

• The second extract (labelled in green) has not been enriched.

• The log-ratio between the red and green channels indicates the enrichment for each intergenic region.
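
The log-ratio can be sketched as below. The intensities and region names are hypothetical, base 2 is chosen for convenience, and the pseudocount (added to avoid dividing by zero) is an assumption of this sketch. Real analyses also include background subtraction and normalization steps, which are not shown here.

```python
import math


def log_ratios(red, green, pseudocount=1.0):
    """log2(red / green) for each region measured in both channels.

    red, green: dicts mapping a region name to its spot intensity.
    pseudocount: small constant added to both channels (an assumption of
    this sketch, not part of the method described above).
    """
    return {
        region: math.log2((red[region] + pseudocount)
                          / (green[region] + pseudocount))
        for region in red
        if region in green
    }


# Hypothetical intensities, for illustration only
red = {"region_1": 799, "region_2": 99, "region_3": 49}
green = {"region_1": 99, "region_2": 99, "region_3": 199}

ratios = log_ratios(red, green)
for region, value in sorted(ratios.items()):
    print(region, round(value, 2))
```

A positive log-ratio indicates a region enriched in the immunoprecipitated extract, a value close to zero indicates no enrichment.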

Lee et al (2002)

In 2002, Lee et al. published a systematic characterization of the binding regions of 106 yeast transcription factors.

Lee et al. 2002. Science 298: 799-804.
