

www.sciencemag.org/cgi/content/full/1162609/DC1

Supporting Online Material for

Conservation and Rewiring of Functional Modules Revealed by an Epistasis Map in Fission Yeast

Assen Roguev, Sourav Bandyopadhyay, Martin Zofall, Ke Zhang, Tamas Fischer, Sean R. Collins, Hongjing Qu, Michael Shales, Han-Oh Park, Jacqueline Hayles, Kwang-Lae Hoe, Dong-Uk Kim, Trey Ideker,* Shiv I. Grewal,* Jonathan S. Weissman,* Nevan J. Krogan*

*To whom correspondence should be addressed. E-mail: [email protected] (T.I.);

[email protected] (S.I.G.); [email protected] (J.S.W.); [email protected] (N.J.K.)

Published 25 September 2008 on Science Express DOI: 10.1126/science.1162609

The main PDF file includes:

Materials and Methods
SOM Text
Figs. S1 to S5
References

Other Supporting Online Material for this manuscript includes the following (available at www.sciencemag.org/cgi/content/full/1162609/DC1):

Tables S1 to S8 as a zipped archive: 1162609sTablesS1-S8.zip
Databases S1 to S4 as zipped archives: 1162609sDataset_S1.zip, 1162609sDataset_S2.zip, 1162609sDataset_S3.zip, 1162609sDataset_S3.zip


Materials and Methods

1. Strain construction – outlines the procedures used to build the strain library used

2. Genetic crosses – detailed protocols for performing the genetic crosses

3. Gene set – provides details about selection of genes

4. Data acquisition and analysis – information about image acquisition and data processing

5. Characterization of Rsh1 – methods used for the characterization of the new RNAi pathway component Rsh1

6. Estimation of rates and significance of conservation – the methods used for generating the graphs in Figure 4 and Figure S2


1. Strain construction

The array (G418-resistant) haploid single deletion mutants were isogenic to SP286 (h+ ade6-M210 (M216); ura4-D18; leu1-32) and assembled from the BIONEER single deletion set. The G418 resistance marker was switched to NAT by amplifying the NAT selection module from pFA6a-NatMX6 (S1) using the following oligonucleotides (5’-3’):

MX4/6_fwd: GACATGGAGGCCCAGAATAC

MX4/6_rev: TGGATGGCGGCGTTAGTATC

The switching module was introduced into the G418-resistant background. After the genomic integrations were confirmed by PCR, a NAT selection targeting cassette was amplified from genomic DNA and introduced into the PEM2 background (S2). Thus, the resulting deletion alleles are identical to the ones present in the BIONEER set and differ only in the selectable marker. DAmP (S3) alleles were constructed by inserting a NatMX6 selectable module into the 3’-UTR of the gene of interest. For genes not present in the BIONEER set, deletion mutants were constructed by replacing the entire open reading frame with a selectable marker cassette amplified using oligos containing long (up to 180 nucleotides) homology arms flanking the insertion point.
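The design of such long targeting oligos can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `targeting_oligos`, the default arm length, and the use of the MX4/6 primers above as the 3' priming ends are hypothetical and are not stated in the protocol.

```python
# Sketch: build long targeting oligos for replacing an ORF with a marker cassette.
# Assumption (not from the protocol): the 3' priming ends are the MX4/6 primers.

MX46_FWD = "GACATGGAGGCCCAGAATAC"
MX46_REV = "TGGATGGCGGCGTTAGTATC"
MAX_ARM = 180  # upper limit on homology arm length quoted in the text

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def targeting_oligos(upstream: str, downstream: str, arm_len: int = 80):
    """Return (forward, reverse) oligos, both written 5'-3'.

    upstream:   top-strand sequence ending immediately before the ORF
    downstream: top-strand sequence starting immediately after the ORF
    """
    if not 0 < arm_len <= MAX_ARM:
        raise ValueError("arm_len must be between 1 and 180 nucleotides")
    if len(upstream) < arm_len or len(downstream) < arm_len:
        raise ValueError("flanking sequence shorter than requested arm")
    forward = upstream[-arm_len:] + MX46_FWD
    reverse = reverse_complement(downstream[:arm_len]) + MX46_REV
    return forward, reverse
```

The homology arm sits at the 5' end of each oligo, so the PCR product carries the marker cassette flanked by the sequences on either side of the insertion point.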

PROTOCOL: Quick DNA prep from S. pombe (produces DNA good enough for genotyping)

Materials

25 mM NaOH in water
PCR thermocycler
Thin-wall PCR tubes or plates

Procedure

1. Resuspend a small amount of cells in 50 ul of 25 mM NaOH.
2. Incubate for 25 minutes at 100 C in a PCR cycler.
3. Vortex and spin down the cell debris.
4. Use 1-1.5 ul per 25 ul PCR reaction for genotyping.


PROTOCOL: Long DNA prep from yeast (produces high quality DNA that can be used for amplifying targeting cassettes from genomic DNA)

Materials

0.5 mm glass beads (biospec or similar)
YDEB:
  10 mM Tris pH 8
  100 mM NaCl
  1 mM EDTA
  2% Triton X100
  1% SDS
P1/RNAse and EB from Qiagen miniprep kits

Procedure

1. Spin cells down @ max for 10 secs in a screw-cap tube.
2. Add 200 ul YDEB and ca. 400 ul glass beads.
3. Add 200 ul P/C/I (phenol/chloroform/isoamyl alcohol) pH 8.
4. Vortex for 2-3 min @ max setting.
5. Spin down 3 mins @ max.
6. Take 150 ul of the upper phase.
7. Add 150 ul of P1/RNAse/NaOAc (600 mM NaOAc in P1).
8. Incubate 10 min @ 37 C.
9. Add 800 ul 96% EtOH, vortex, spin down for 3 min @ max.
10. Discard the supernatant, wash pellet w/ 500 ul 70% EtOH.
11. Spin down for 1 min @ max.
12. Dry pellet (speed-vac is best).
13. Dissolve in 50 ul EB.

PROTOCOL: 96 well transformation of S. pombe

Materials

GENETIX 48 well plates with selective media
96 well PCR plates
Multi-channel pipette
Salmon sperm DNA @ 2 mg/ml (ssDNA); denature by boiling for 5 mins and immediately placing on ice
Drugs (final concentrations in the medium): NAT = 100 ug/ml, G418 = 100 ug/ml
YE5S medium (5 g/l yeast extract, 30 g/l glucose, 225 mg/l adenine, histidine, leucine, uracil and lysine hydrochloride)

Solutions (prepare ex tempore and filter sterilize)

LiAc/TE:
  10 mM Tris pH 8.0
  1 mM EDTA
  100 mM LiAc

LP-50:
  10 mM Tris pH 8.0
  1 mM EDTA
  100 mM LiAc
  50 % w/v PEG 3350 (or 4000)

Procedure

The following is for 96 transformations.

1. Grow 100 ml culture to OD < 1. Split into 2 x 50 ml Falcon tubes.
2. Spin down @ 900 x g (= 2000 rpm) for 5 mins, RT. Pool pellets into 1 x 50 ml Falcon tube.
3. Re-suspend in 20 ml ddH2O, spin again @ 900 x g (= 2000 rpm) for 5 mins, RT.
4. Re-suspend in 20 ml 0.1 M LiAc/TE, spin again @ 900 x g (= 2000 rpm) for 5 mins, RT.
5. Re-suspend in 3 ml 0.1 M LiAc/TE.
6. Add 30 ul of cells to the transforming mix (10 ul DNA + 10 ul denatured ssDNA) in a 96 well PCR plate.
7. Incubate 15 min @ RT.
8. Add 150 ul LP-50 and mix.
9. Incubate 1 h @ 30 C.
10. Add 20 ul DMSO, mix.
11. Heat-shock 10 min @ 42 C.
12. Spin down 5 min, 900 x g (= 2000 rpm), discard sup.
13. Add 50 ul YE5S, mix, incubate @ 30 C for 3-4 h.
14. Plate onto selective 48 square-well plate.
15. Colonies should start appearing in 2 - 4 days.
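Since LP-50 is prepared fresh each time, a small helper for the dilution arithmetic (C1 x V1 = C2 x V2) may be convenient. The sketch below is illustrative only: the function names and the stock concentrations (1 M Tris, 0.5 M EDTA, 1 M LiAc) are assumptions, not part of the protocol, and should be replaced with the stocks actually on hand.

```python
# Sketch: volumes of stock solutions needed for a batch of LP-50.
# Final concentrations come from the protocol; stock concentrations are assumed.

LP50_FINAL_MM = {"Tris pH 8.0": 10.0, "EDTA": 1.0, "LiAc": 100.0}
ASSUMED_STOCK_MM = {"Tris pH 8.0": 1000.0, "EDTA": 500.0, "LiAc": 1000.0}
PEG_PERCENT_WV = 50.0  # 50 % w/v means 50 g per 100 ml


def stock_volume_ml(final_mm: float, stock_mm: float, total_ml: float) -> float:
    """Volume of stock to add, from C1 * V1 = C2 * V2."""
    return final_mm * total_ml / stock_mm


def lp50_recipe(total_ml: float, stocks_mm=None) -> dict:
    """Return stock volumes (ml) and PEG mass (g) for total_ml of LP-50."""
    stocks = stocks_mm or ASSUMED_STOCK_MM
    recipe = {
        f"{name} (ml)": stock_volume_ml(final, stocks[name], total_ml)
        for name, final in LP50_FINAL_MM.items()
    }
    recipe["PEG 3350 (g)"] = PEG_PERCENT_WV / 100.0 * total_ml
    return recipe


if __name__ == "__main__":
    # 96 wells x 150 ul = 14.4 ml are needed, so 20 ml leaves some margin.
    for component, amount in lp50_recipe(20).items():
        print(f"{component}: {amount:g}")
```

The remaining volume is made up with water after the PEG has dissolved.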


2. Genetic crosses

Genetic crosses were carried out on the Singer ROTOR pinning station using the following modified PEM procedure (S2).

PROTOCOL: Genetic Screens in S. pombe using the PEM2 system

Preliminaries

PLATES

Plate names
YE5S = YE5S
SPAS = SPAS
NAT = YE5S + 100 ug/ml NAT
G418 = YE5S + 100 ug/ml G418
GC = YE5S + 100 ug/ml G418 + 100 ug/ml cycloheximide (CYH)
GNC = YE5S + 100 ug/ml G418 + 100 ug/ml NAT + 100 ug/ml cycloheximide (CYH)
GC1 and GC2 are GC plates used in two consecutive steps of the protocol.

Plate amounts
YE5S = Q-arrays / 3
NAT = 2 x Q-arrays
SPAS = Q-arrays
GC = 2 x Q-arrays
GNC = Q-arrays

Plate colorcodes
YE5S I
NAT I I
G418 I I
SPAS I I
GC I I I I I
GNC I I I I I I I
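The plate amounts above translate directly into a small calculation. The sketch below is an illustration; the function name `plates_needed` is hypothetical, and rounding the YE5S count up to a whole plate is an assumption.

```python
import math


def plates_needed(q_arrays: int) -> dict:
    """Number of plates of each type for a screen with q_arrays query arrays.

    Ratios follow the 'Plate amounts' list; YE5S is rounded up (assumption).
    """
    if q_arrays < 1:
        raise ValueError("at least one Q-array is required")
    return {
        "YE5S": math.ceil(q_arrays / 3),
        "NAT": 2 * q_arrays,
        "SPAS": q_arrays,
        "GC": 2 * q_arrays,
        "GNC": q_arrays,
    }


if __name__ == "__main__":
    for plate, count in plates_needed(30).items():
        print(f"{plate}: {count}")
```

For 30 Q-arrays this gives 10 YE5S plates, matching the example given for the T-arrays in the protocol.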

Query arrays (Q-arrays) in 384 format

Prepare Q-lawns
Spread up to 500 ul of thick culture onto a NAT plate using glass beads and incubate at 30 C for 2-3 days.

Prepare Q-arrays
Source plate: NAT ( I I )
Target plate: NAT ( I I )
Program: Agar-Agar / Replicate / Replicate One 384
Parameters:
  Source pressure: 100 %
  Target pressure: 100 %
  Offset: Manual
  Offset radius: 1 mm
Do 2 pins per plate, picking cells from different parts of the lawn plate.
Incubate at 30 C for 2-3 days.

Target arrays (T-arrays) in 384 format

Prepare T-arrays
Source plate: G418 ( I I )
Target plate: YE5S ( I )
Program: Agar-Agar / Replicate / Replicate Many 384
Parameters:
  Source pressure: 100 %
  Target pressure: 100 %
  Offset: Manual
  Number of replicas: 2
  Economy: ON
  Revisit source: ON
  Offset radius: different radius may be needed to hit smaller colonies
NOTE: Use relatively fresh copies of the T-arrays. The number of replicas needed is ca. the number of Q-arrays / 3 (e.g. for 30 Q-arrays one needs 10 T-arrays).
Incubate at 30 C for 2-3 days.

Mating (Day 0)

Source plate: T-array ( I ) and Q-array ( I I )
Target plate: SPAS ( I I )
Combine the T-array and the Q-array onto a SPAS plate, generating a 1536 density array. First pin the T-array and then pin the Q-array on top of it. You will need two (2) 384 pads per mating.
Program: Agar-Agar / Array / Single Source 384-1536
Parameters:
  Source pressure: 100 %
  Target pressure: 100 %
  Offset: Manual
  Economy: ON
  Revisit source: OFF
  Offset radius: different radius may be needed to hit smaller colonies
Incubate for 5 - 6 days at !!! ROOM TEMPERATURE !!!, packing the plates in plastic bags to prevent drying.

SPAS-GC1 (Day 6)

Source plate: SPAS ( I I )
Target plate: GC ( I I I I I )
Replicate the mating arrays from SPAS onto GC plates using 384 pads. Do 2 pins per array onto the same target GC plate. You will need two (2) 384 pads per array.
Program: Agar-Agar / Replicate / Replicate One 1536
Parameters:
  Source pressure: 100 %
  Target pressure: 100 %
  Offset: OFF (if the source arrays are offset, switch to Manual)
  Economy: ON
  Revisit source: ON
  Offset radius: different radius may be needed to hit smaller colonies
When loading the pads click ‘Modify’ to change to 384 pads.
Incubate for 3 days at 30 C.

GC1-GC2 (Day 9)

Source plate: GC ( I I I I I )
Target plate: GC ( I I I I I )
Replicate the arrays from the GC1 plates onto GC2 plates using 1536 pads. Do 1 pin per array.
Program: Agar-Agar / Replicate / Replicate One 1536
Parameters:
  Source pressure: 100 %
  Target pressure: 100 %
  Offset: Manual
  Offset radius: different radius may be needed to hit smaller colonies
Incubate for 2 days at 30 C.
Optional: Take pictures of the GC1 plates.

GC2-GNC (Day 11)

Source plate: GC ( I I I I I )
Target plate: GNC ( I I I I I I I )
Replicate the arrays from the GC2 plates onto GNC plates using 1536 pads. Do 1 pin per array.
Program: Agar-Agar / Replicate / Replicate One 1536
Parameters:
  Source pressure: 100 %
  Target pressure: 100 %
  Offset: Manual
  Offset radius: different radius may be needed to hit smaller colonies
Optional: Take pictures of the GC2 plates.
Incubate at 30 C. Take pictures of the GNC plates at 24, 36 and 48 hours. Store the final plates in the coldroom.
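For planning a screen, the day offsets given in the step headings (Day 0, 6, 9 and 11) can be turned into calendar dates. This is a convenience sketch only; the function name `screen_schedule` and the data layout are hypothetical.

```python
from datetime import date, timedelta

# Day offsets taken from the step headings of the protocol.
SCREEN_STEPS = [
    (0, "Mating (T-array + Q-array onto SPAS)"),
    (6, "SPAS-GC1"),
    (9, "GC1-GC2"),
    (11, "GC2-GNC"),
]
GNC_IMAGING_HOURS = (24, 36, 48)  # imaging time points after GC2-GNC pinning


def screen_schedule(mating_day: date):
    """Return a list of (calendar date, step name) for one screen."""
    return [(mating_day + timedelta(days=offset), step)
            for offset, step in SCREEN_STEPS]


if __name__ == "__main__":
    for day, step in screen_schedule(date.today()):
        print(day.isoformat(), step)
```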


3. Gene set

Gene selection

Selection of genes used in this study was mainly based on signal-rich genetic profiles from two previously published datasets (S3, S4) as well as conserved pathways present in S. pombe but not in budding yeast (e.g. the RNAi pathway). For a complete list of genes see Table S1.

Within the gene set there are several meiotic genes not expressed in mitosis (e.g. rec12, rdh54), yet genetic interactions with these genes were detected in vegetative growth. Several explanations may exist. For example, these factors may, in fact, play roles in mitotically growing cells and would therefore produce significant genetic interactions when combined with other mutations. Another, perhaps more plausible, explanation is that during the genetic screen, following mating of the two single mutants, the resulting diploid cells undergo meiosis, when these factors are expressed and function. Therefore, if these genes are required for efficient meiotic progression, spore formation or germination, their absence would have a detrimental effect on the outcome of the cross and would ultimately be manifested as a negative genetic interaction at the end of the screen.

Sequence conservation biases

To evaluate sequence conservation biases that could potentially influence downstream analyses, a pre-calculated set of BLAST results over 5 eukaryotic genomes (S. cerevisiae, D. melanogaster, C. elegans, A. thaliana and H. sapiens) from the COGs database (http://www.ncbi.nlm.nih.gov/COG/) was used. For each protein sequence in S. pombe, a vector containing the P-values of the best BLAST hits from each of the 5 genomes was created after applying a conservative cutoff of 10^-10. Then, the median over this vector was computed and used as a measure of evolutionary conservation of protein sequences. The complete (550 genes) and the ortholog (239 genes) sets show no significant biases compared to the rest of the S. pombe genome (Figure S5A). Also, we did not observe a significant association between sequence conservation and genetic interaction profile correlations over the set of orthologs used in the evolutionary analysis of genetic interaction networks (Figure S5B). Moreover, the distributions of correlation coefficients between ortholog profiles of the conserved and non-conserved proteins were statistically indistinguishable (two-sample t-test, P = 0.26, at the 5% significance level) (Figure S5C).
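The conservation measure described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name is hypothetical, and two details are assumptions, namely that hits failing the cutoff (or absent) are dropped from the vector before taking the median, and that the score is reported on a log10 scale as on the axis of Figure S5.

```python
import math
from statistics import median

PVALUE_CUTOFF = 1e-10   # conservative cutoff quoted in the text
PVALUE_FLOOR = 1e-300   # assumption: avoids log10(0) for P-values reported as 0


def conservation_score(best_hit_pvalues):
    """log10 of the median best-hit BLAST P-value across genomes.

    best_hit_pvalues: one value per genome; None means no hit was found.
    Returns None if no hit passes the cutoff.
    """
    kept = [p for p in best_hit_pvalues
            if p is not None and p < PVALUE_CUTOFF]
    if not kept:
        return None
    return math.log10(max(median(kept), PVALUE_FLOOR))


if __name__ == "__main__":
    # Hypothetical best hits in five genomes for one S. pombe protein.
    print(conservation_score([1e-50, 1e-20, 1e-30, 1e-5, None]))
```

More negative scores correspond to more strongly conserved proteins.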

S. pombe protein-protein interaction dataset

A set of 151 protein-protein interaction pairs (Table S2) was compiled from the BioGRID (S5) and BIOBASE International (www.biobase-international.com) databases as well as unpublished data from A. F. Stewart, considering only interactions derived from stringent biochemical methods, i.e. mainly affinity tagging/purification combined with mass spectrometry.


4. Data acquisition and analysis

Data collection

Images of the agar plates were acquired and analyzed using a setup similar to the published one (S6) and the raw data were processed using the E-MAP toolbox (S6). The final dataset was subjected to hierarchical clustering using the Cluster package (S7).

Data processing

Colony size measured from high-density arrays (Figure S1B) was used as a quantitative phenotypic readout to compute a genetic interaction score (S-score) (S6). Consistent with strong genetic interactions being rare and most double mutant combinations having weak or no effect (S3, S4, S8, S9), a normal distribution of S-scores was observed (Figure S1C). Linkage biases due to the lower recombination frequency between closely linked loci (manifested by slower apparent growth and thus resulting in a lower S-score) were eliminated after examining the relationship between the chromosomal distance and the strength of the observed phenotype (Figure S1D, Figure S4). A conservative threshold of 500 kb from each locus was applied over the initial dataset (Dataset S3) and scores for gene pairs within this window (10,238 interactions in total, Table S7) were removed.
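The linkage filter can be illustrated with a short sketch. The data layout (a dictionary of scores keyed by gene pair and a dictionary of chromosomal positions), the function names, and the treatment of pairs exactly 500 kb apart as linked are assumptions made for this example.

```python
LINKAGE_WINDOW_BP = 500_000  # 500 kb threshold quoted in the text


def is_linked(locus_a, locus_b, window=LINKAGE_WINDOW_BP):
    """True if two loci lie on the same chromosome within the window.

    Each locus is a (chromosome, position_in_bp) tuple.
    """
    chrom_a, pos_a = locus_a
    chrom_b, pos_b = locus_b
    return chrom_a == chrom_b and abs(pos_a - pos_b) <= window


def drop_linked_pairs(scores, loci):
    """Remove S-scores for gene pairs that fall inside the linkage window.

    scores: {(gene_a, gene_b): s_score}
    loci:   {gene: (chromosome, position_in_bp)}
    """
    return {
        pair: s
        for pair, s in scores.items()
        if not is_linked(loci[pair[0]], loci[pair[1]])
    }
```

Pairs on different chromosomes are never treated as linked, whatever their coordinates.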

Data quality assessment

The quality of the data was assessed by examining the correlation among replicate and “marker-swap” experiments, in which each genetic interaction is measured using mutant alleles marked with antibiotic resistance genes (Kanamycin (KAN) and Nourseothricin (NAT)) in a reciprocal fashion (i.e. geneAΔ::KAN X geneBΔ::NAT and geneAΔ::NAT X geneBΔ::KAN) (S2, S6). The correlation of the scores from these experiments can therefore be used to assess the quality of the dataset and to identify systematic biases as well as corrupt strains, which were removed. Our final dataset is of high quality, since the S-scores from these replicate and “marker-swap” experiments display a strong positive correlation (r=0.57), which is comparable to data we have generated in budding yeast (S3, S4) (Figure S1A).

The final dataset comprises 118,575 measurements and contains 5,772 significant negative (S-score ≤ -2.5) and 1,812 significant positive (S-score ≥ 2) interactions. All data generated in this study can be accessed through an interactive and searchable website (http://interactome-cmp.ucsf.edu) and will also be deposited into the BioGRID database (S5).
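The significance thresholds quoted above can be applied with a few lines of Python. The function names are hypothetical, and the "neutral" label for scores between the two thresholds is a naming choice made for this sketch.

```python
NEGATIVE_CUTOFF = -2.5  # S-score <= -2.5 is a significant negative interaction
POSITIVE_CUTOFF = 2.0   # S-score >= 2 is a significant positive interaction


def classify(s_score: float) -> str:
    """Label a single S-score using the thresholds from the text."""
    if s_score <= NEGATIVE_CUTOFF:
        return "negative"
    if s_score >= POSITIVE_CUTOFF:
        return "positive"
    return "neutral"


def count_significant(s_scores) -> dict:
    """Count negative, positive and neutral interactions in a list of scores."""
    counts = {"negative": 0, "positive": 0, "neutral": 0}
    for s in s_scores:
        counts[classify(s)] += 1
    return counts
```

Note that the thresholds are asymmetric, so a score of -2.2 is not called significant while a score of +2.2 is.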


5. Characterization of Rsh1

Reverse transcription (RT-PCR): Total RNA was extracted from exponentially growing cells and treated with RNase-free DNase I (Promega). Centromeric transcripts were detected by reverse transcription performed with the One Step RT PCR kit (Qiagen). PCR products were resolved on a 2% agarose gel and visualized by ethidium bromide staining. Samples without reverse transcriptase (-RT) were processed in parallel to control for DNA contamination.

Chromatin immunoprecipitation: ChIP was performed as described previously (S10). Immunoprecipitations were performed using antibodies raised against full-length Swi6 protein, dimethylated H3K9 peptides (Abcam) and the Myc epitope (Santa Cruz Biotechnology).

Small RNAs: Twenty µg of the small RNA fraction, purified with the mirVana miRNA purification kit (Ambion), was resolved on 15% urea-PAGE and transferred to a HybondN+ membrane. The membrane was hybridized with single-stranded RNA, transcribed with α-P32-UTP and hydrolyzed to average lengths of ~50 nucleotides.

Spot silencing assays: A ura4+ reporter gene was inserted at the outer repeat region of centromere 1 (otr1::ura4+). Serial dilutions of the respective mutants were plated on nonselective (NS), uracil-deficient (-URA) and counterselective (FOA) media.

Oligonucleotides (5’-3’):

act1
act1frw: GAAGTACCCCATTGAGCACGG
act1rev: CAATTTCACGTTCGGCGGTAG

leu1
leu1.4: TAGAAGCCTCACCTCCCAAA
leu1.3: TTTGGTCAAGAGCCCTCGTA

otr
dg660frw: GACCTAGAAGTAAAATTCGT
dg660rev: GCGGTTGTTTGGCACTGAATGTAA

otr dh
dh383frw: TGCTGTCATACTACACTGCA
dh383rev: TTCTGAATAATTGGGATCGC

otr::ura4
jpo4: CGTGAGTATACAAACAAATACACTAGG
jpo17: CTACTCTTCTCGATGATCCTGTAA


6. Estimation of rates and significance of conservation

A set of 239 direct orthologs between S. cerevisiae and S. pombe was compiled based on curated annotation (Valerie Wood, personal communication) and the YOGY database (S11). As ortholog definitions are subject to change and there may be some misplaced assignments, a frozen version of the set used for the downstream analysis is provided in Table S3.

There were 17,251 genetic interactions that were shared among orthologs in both species. The pair-wise scores between species were found to be significantly related via Pearson correlation (r=0.14, p < 10^-170, Figure S2A). For comparison, we also generated a random dataset based on 100 permutations of these scores, which showed no correlation (r=0.009, p=0.103).
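A Pearson correlation with a permutation control can be sketched as follows. This is an illustration only: the function names are hypothetical, and how the 100 permutations were combined into the single reported value is not described in the text, so the sketch simply returns the correlation for each shuffle.

```python
import random
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def permuted_correlations(xs, ys, n_permutations=100, seed=0):
    """Correlations obtained after randomly shuffling ys against xs."""
    rng = random.Random(seed)
    shuffled = list(ys)
    results = []
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        results.append(pearson_r(xs, shuffled))
    return results
```

Here `xs` and `ys` would hold the S. cerevisiae and S. pombe S-scores for the same ortholog pairs; shuffling one of them breaks the pairing while keeping both score distributions unchanged.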

To determine biological subsets that might show trends of conservation, we assembled two different datasets: physically interacting protein pairs and functionally related pairs. Known physical protein interactions in S. cerevisiae were taken from (S12) and pruned for high confidence interactions (PE conf > 0.2), for a total of 119 interactions (Table S5). We found these pairs were highly conserved (r=0.41, Figure S2A). We also determined a set of functionally related proteins as the top 5% (13,052) most functionally similar gene pairs covered in the chromosomal biology E-MAP (S4). Functional similarity was determined by comparison to the background probability of picking two genes with the same shared functional annotation (S13) from the entire yeast genome (via a hypergeometric test). This set was then limited to pairs falling between the 239 orthologous genes, and the 119 physical protein interactions were removed, for a total of 939 functionally related non-interacting protein pairs (Table S6). Genetic interactions of pairs from this set were also correlated between species (r=0.30).
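The background probability mentioned above can be written as a hypergeometric probability. The sketch below shows only that one quantity; the function names are hypothetical, and how the authors converted it into a ranked similarity score is not described here and is not reproduced.

```python
from math import comb


def hypergeom_pmf(k, population, successes, draws):
    """P(exactly k successes in `draws` draws without replacement)."""
    return (comb(successes, k) * comb(population - successes, draws - k)
            / comb(population, draws))


def shared_annotation_background(genome_size, genes_with_term):
    """Probability that two genes picked at random both carry a given
    functional annotation term (two draws, two successes)."""
    return hypergeom_pmf(2, genome_size, genes_with_term, 2)


if __name__ == "__main__":
    # Hypothetical numbers: a term annotated to 50 of 6,000 genes.
    print(shared_annotation_background(6000, 50))
```

The smaller this background probability, the more informative it is that two genes share the term, which is why specific annotations count for more than broad ones.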

For negative interactions, the conservation rate was determined by calculating for every protein pair in S. cerevisiae the probability of observing the same S-score or less between the orthologous genes in S. pombe. For positive interactions, the probability was calculated based on observing the same S-score or greater in S. pombe. We evaluated the conservation rate over a variety of cutoffs (Figure S2B). This conservation rate was then assessed for significance using Fisher's exact test based on a 2 x 2 contingency table, and a two-tailed p-value was calculated. The significance of the conservation rate of all the data versus the randomized set is shown for a variety of cutoffs (Figure S2C), as well as for physically associated and functionally related pairs versus all of the data (Figure S2D). The significant conservation of genetic interactions observed in this work was also not due to the conservation of protein-protein interactions and functionally related proteins alone, as all genetically interacting pairs excluding those that are functionally related or whose proteins physically interact had a conservation rate comparable to that of 'all' genetic interactions in Figure 4A, with a 15% conservation rate of negative interactions (p=5x10^-10 versus random) and a 5% conservation rate of positive interactions (p=6x10^-3 versus random).
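One way to read the procedure above is sketched below, using only the Python standard library. It is an interpretation, not the authors' implementation: the conservation rate is computed as the fraction of pairs passing a cutoff in S. cerevisiae that also pass it in S. pombe, and the layout of the 2 x 2 table passed to the Fisher test is left to the caller. Function names are hypothetical.

```python
from math import comb


def conservation_rate(pairs, cutoff):
    """Fraction of pairs passing `cutoff` in S. cerevisiae that also pass
    it in S. pombe. pairs: iterable of (sc_score, sp_score).

    Negative cutoffs use 'score <= cutoff'; positive ones 'score >= cutoff'.
    Returns None when no S. cerevisiae pair passes the cutoff.
    """
    if cutoff < 0:
        passes = lambda s: s <= cutoff
    else:
        passes = lambda s: s >= cutoff
    sp_scores = [sp for sc, sp in pairs if passes(sc)]
    if not sp_scores:
        return None
    return sum(passes(sp) for sp in sp_scores) / len(sp_scores)


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of a table with x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables at least as unlikely as the observed one
    # (small relative tolerance guards against floating-point ties).
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= observed * (1 + 1e-9))
```

For example, a table could hold conserved and non-conserved counts for the real data in one row and for the permuted data in the other.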

We also compared the observed conservation trends to independent studies of synthetic lethality and synthetic sickness data deposited into the BioGRID database (S5). These types of interactions correspond to a strongly negative S-score, which we compared with strong negative interactions among orthologs in S. pombe (S-score < -2.5). Consistent with Figures 4A, S2C and S2D, we observe an 18% negative conservation rate overall (p=4x10^-12 versus random) and a 31% conservation rate among functionally related protein pairs that are not physically interacting (p=10^-2 versus all).


Supplementary text

Interpreting positive and negative genetic interactions

We have published several quantitative genetic interaction maps in budding yeast (S3, S4) and have observed similar ratios of positive to negative genetic interactions. While we cannot give an exact explanation of why this is the case, it may be that these ratios are, in fact, a characteristic feature of genetic networks from unicellular eukaryotes. More work will be needed to understand the meaning of the general trends we see. It is, however, not particularly surprising that there would be more negative than positive interactions. Positive interactions often occur when two genes work exclusively in the same cellular pathway. However, proteins are often multi-functional and are not restricted to a single pathway. Hence, a factor that works in multiple pathways may not necessarily display positive genetic interactions with any factors in any single pathway. Also, in the past, we have noted that genes coding for proteins that are physically associated often exhibit positive genetic interactions (S4). However, protein-protein interaction pairs may also display negative genetic interactions, especially when the corresponding proteins are part of an essential complex. The logic is as follows: deletion of one non-essential component of an essential complex does not completely disable the complex, whereas deletion of a second one does. For example, the non-essential components of the 19S proteasome (e.g. Sem1, Rpn4, Rpn9, Rpn10) show negative interactions with one another (S4). Furthermore, if a protein complex is composed of several functionally distinct sub-modules, then negative interactions may exist between components of the different modules. For example, the transcriptional initiation complex, Mediator, is composed of four different modules (the head, tail, middle and Cdk modules), and components in the different modules display negative genetic interactions with one another (S4). These are additional reasons why negative genetic interactions would outnumber positive ones.

Figure S1
A. Scatter plot of interaction scores from replicate and “marker-swap” experiments (see text). Each point represents scores from two independent measurements for a single pair of genes and the correlation coefficient is r = 0.57. (Axes: unaveraged score #1 vs. unaveraged score #2.)
B. A representative image of a high-density (1536) colony array used in the genetic analysis. Examples of negative and positive genetic interactions are highlighted with blue and yellow boxes, respectively.
C. Distribution of interaction scores across the entire spectrum of interaction strengths. As expected, the distribution is centered over 0 and most interactions are weak and fall in the interval -2 to +2. (Panel title: Distribution of S-scores; axes: S-score vs. frequency.)
D. Median interaction scores as a function of chromosomal distance between genes. A conservative cutoff of 500 kb (vertical black line) was applied to eliminate biases due to linkage effects. (Panel title: Linkage Curve; axes: chromosomal distance (Mb) vs. S-score.)

Figure S2
For a detailed description of the analysis used to generate this figure see Materials and Methods.
A. Scatter plot of S-scores from the 239 direct 1:1 orthologs corresponding to 17,251 genetic interactions measured in both species. Gene pairs corresponding to protein-protein interactions (119, see Table S5) in budding yeast (PE confidence > 0.2) are represented in yellow. (Axes: S-score (Sc) vs. S-score (Sp); all pairs r = 0.14, PPI pairs r = 0.41.)
B. Conservation rate of positive and negative genetic interactions based on comparison with S. cerevisiae. Conservation rates of the random set (black), all pairs (red), pairs of genes coding for physically interacting proteins (green) and pairs of functionally related genes, excluding genes coding for physically interacting proteins (blue), are shown using a sliding S-score cutoff. (Axes: S-score (Sc) vs. conservation rate in S. pombe (%).)
C. Significance of conservation of all genetic interactions between orthologs compared to randomly permuted data from B. (Panel title: Significance of Conservation Between All Data vs. Random; axes: S-score vs. -log10(p-value).)
D. Same as C but for the subsets of functionally related genes (blue) and genes coding for physically interacting proteins (green). (Panel title: Significance of Enrichment vs. All Data; axes: S-score vs. -log10(p-value).)
E. Scatter plot of COP (Complex or Linear Pathway) scores with pairs of genes coding for physically associated proteins in yellow. (Axes: COP score (Sc) vs. COP score (Sp).)
F. Distribution of the cross-species Pearson correlation coefficient of genetic profiles. Data for all pairs (blue), direct orthologs (red) and PPI pairs (green) are shown. (Axes: correlation coefficient vs. frequency.)

Figure S3
Comparison of genetic interaction profiles of the Prefoldin complex and the HIR chromatin remodeling complex in S. cerevisiae and S. pombe. Analogous sets of genetic interactions from the two organisms are shown (see Dataset S2) with regions of interest highlighted.
[Figure not reproduced. Panels: S.c. HIR-C, S.p. HIR-C, S.c. Prefoldin, S.p. Prefoldin. Labeled regions include the Replication Checkpoint Complex, HIR-C, SET1-C, SET3-C, TMA-C, SAGA, SKI-C, SWR-C, Spindle Checkpoint and Prefoldin.]

Figure S4
Distributions of interaction scores (S-scores) removed due to linkage effects (red) and of the final dataset (blue). (Axes: S-score vs. frequency.)

Figure S5
BLAST P-values for this analysis (see Materials and Methods) are from http://www.ncbi.nlm.nih.gov/COG/ and only hits with a P-value lower than 10^-10 were considered.
A. Histogram of protein sequence conservation over the whole genome (blue), the set of 550 genes on the E-MAP (green) and the set of 239 orthologs (red) against 5 eukaryotic genomes. (Axes: log10 of median best BLAST P-value vs. frequency.)
B. Scatter plot of between-species correlation coefficients of genetic interaction profiles of the set of 239 orthologs as a function of protein sequence conservation (r = -0.0817). Blue and green rectangles contain data points corresponding to non-conserved and conserved genes, respectively. (Axes: -log10 of median best BLAST P-value vs. correlation coefficient of ortholog profiles.)
C. Distribution of correlation coefficients of ortholog profiles for the two boxed sub-populations from B. The two distributions are indistinguishable at the 5% significance level (P = 0.26). (Axes: correlation coefficient of ortholog profiles vs. frequency.)

References

S1. P. Hentges, B. Van Driessche, L. Tafforeau, J. Vandenhaute, A. M. Carr, Yeast 22, 1013 (2005).
S2. A. Roguev, M. Wiren, J. S. Weissman, N. J. Krogan, Nat Methods 4, 861 (2007).
S3. M. Schuldiner et al., Cell 123, 507 (2005).
S4. S. R. Collins et al., Nature 446, 806 (2007).
S5. C. Stark et al., Nucleic Acids Res 34, D535 (2006).
S6. S. R. Collins, M. Schuldiner, N. J. Krogan, J. S. Weissman, Genome Biol 7, R63 (2006).
S7. M. B. Eisen, P. T. Spellman, P. O. Brown, D. Botstein, Proc Natl Acad Sci U S A 95, 14863 (1998).
S8. A. H. Tong et al., Science 303, 808 (2004).
S9. D. Segre, A. Deluna, G. M. Church, R. Kishony, Nat Genet 37, 77 (2005).
S10. J. Nakayama, A. J. Klar, S. I. Grewal, Cell 101, 307 (2000).
S11. C. J. Penkett, J. A. Morris, V. Wood, J. Bahler, Nucleic Acids Res 34, W330 (2006).
S12. S. R. Collins et al., Mol Cell Proteomics 6, 439 (2007).
S13. M. Ashburner et al., Nat Genet 25, 25 (2000).