
Cell Reports

Supplemental Information

Lipid Cooperativity as a General

Membrane-Recruitment Principle for PH Domains

Ivana Vonkova, Antoine-Emmanuel Saliba, Samy Deghou, Kanchan Anand, Stefano

Ceschia, Tobias Doerks, Augustinus Galih, Karl G. Kugler, Kenji Maeda, Vladimir Rybin,

Vera van Noort, Jan Ellenberg, Peer Bork, and Anne-Claude Gavin


SUPPLEMENTAL DATA

Supplemental figures

Figure S1. Assessment of the quality of the dataset and overview of the detected PH

domain-liposome interactions (related to Figure 1).

A. Control for potential position-dependent artifacts on the liposome array. Each cell of the

heatmap corresponds to a position on the liposome array and indicates the logarithmic value

of the ratio between the number of experiments of NBI > 0.037 (i.e., interactions) and the

total number of experiments measured at the particular coordinate. Squared cells represent the

positions of the positive control (liposomes containing a PE-biotin lipid), which remained

fixed across all replicates. B. Control for PH domain-sfGFP fusions folding using circular

dichroism (CD). C. For each PH domain studied with CD, a secondary structure prediction

from the CD data is given (columns two and three; structured [%] represents the sum of helix,

beta and turn elements predictions). The calculated melting temperature is given in column

four. D. Determination of NBI cutoff maximizing sensitivity and specificity of LBDs-

liposome interaction detection. Top, ROC curve and, bottom, precision-recall curve analyses

of the NBIs of the screen dataset. The NBI cutoff of 0.037 yields a true positive rate (recall)

of 75.2%, a false positive rate of 3.4%, precision of 86% (dashed lines). AUC (area under

curve) of ROC curve was 0.95. E. Qualitative reproducibility of the screen. All LBDs-

liposome experiments were assigned as interaction or no binding according to the mean NBI

value calculated from all available replicates (interaction if NBI ≥ 0.037, otherwise no

binding). The big pie chart shows reproducibility of these annotations based on NBI

annotations of the corresponding replicates. In cases of inconsistent annotation,

reproducibility based on visual inspection of the images is shown (small pie charts). F.

Determination of the reproducibility index (RI) for each domain-liposome type experiment


studied. The upper plot represents the standard error (SE_NBI) as a function of the NBI (log10

transformed). The lower plot represents the RI as a function of the NBI (log10 transformed).

Interactions yield a positive RI while no binding events yield a negative RI. The closer the RI

is to 0, the more confident is the datapoint (interaction or no binding). The plot on the right

side of the panel represents randomly picked examples of domain-liposome type experiments

which yielded different ranges of RI. G. Assessment of the ability of LiMA to recover the

specific lipid-binding profile of four positive control LBDs (EEA1-FYVE, HSV2,

Lactadherin-C2, p40phox-PX). The numbers behind the lipid names indicate lipid

concentration (mol %). H. For the four positive controls shown in panel (G), the boxplots

show the NBI and 1/RI for the known specific lipid(s) partners and other lipids and lipid

mixtures. The P values show the statistical significance between indicated pairs of boxplots. I.

Comparison of dissociation constants (Kd) calculated from titration experiments using LiMA

on a set of PH domains with reference Kds (Ref. Kd). J. Correlation between NBI

measurements and literature-derived Kds for selected PH domain-lipid combinations. The

Pearson correlation coefficient of the presented interactions was 0.74, with an associated P =

2.6 × 10⁻³. K. Visualization of the relationship between the time delay in the imaging (x axis) and

NBI measurements between replicates (y axis). L. Correlation analysis of NBIs measured on

the same liposome array imaged twice, once at time 0 (imaging 1) and then two hours later

(imaging 2), for PLCD1 (upper plot) and EEA1 (lower plot). M. Correlation analysis of LBDs

concentration versus NBIs of all interactions detected in the screen. N. Pairwise correlation

analysis of NBIs for LBDs with more than 1.5 fold difference in protein concentration

between replicates. The NBIs from the corresponding replicates were compared pairwise

(Wilcoxon test followed by Bonferroni correction) and corrected P values, reflecting

statistical significance of difference in NBIs, were plotted for each pair. P value 0.05 (solid

line) was used as a threshold for statistically significant difference. The domain name together

with the concentration (μM) of the two corresponding replicates (Repl.) is given. O. Number


of interactions detected per PH domain. The proteins for which data from only one replicate

are available are labeled in red, the proteins for which the mean RI of the interactions was ≥ 2

are marked in blue.


Figure S2: Hierarchical clustering of liposome types according to the similarities of their

PH domain-binding profiles (related to Figure 2).

A. Top, hierarchical clustering of single signalling lipid-containing liposomes according to the

similarities in their PH domain-binding profiles. The colours indicate lipid families to which

the signalling lipids belong. The lipids that are not physiological in S. cerevisiae are marked

with §. Middle, barplot represents the number of PH domains that were recruited to

membranes of particular liposome type. Bottom, boxplot giving the NBI values of interacting

PH domains. The numbers behind the lipid names indicate lipid concentration (mol %) used

for each signalling lipid. B. Detailed view on the hierarchical clustering shown in the Figure

2A. The liposome types composed of combination of signalling lipids are clustered according

to their PH domain binding profiles. Right, the number of PH domains interacting with each

liposome type and left, the distribution of NBI values are shown. The lipids that are not

physiological in S. cerevisiae are marked with §. C. The optimum number of clusters in the

hierarchical clustering shown in (B) and Figure 2A was decided based on a partitioning around

medoids. Two clusters composed of liposome types containing PM PIPs (triangles) or

organelle PIPs (circles), respectively, were found. D. The silhouette plot assesses the

robustness of the two clusters identified in (C). The average silhouette width of 0.54 indicates

that a robust structure has been found. # indicates the dipalmitoyl variant of PI(3,5)P2.


Figure S3: Cooperation of lipids in the targeting of PH domains to membranes (related

to Figure 3).

A. Landscape of cooperating lipids in the targeting of mammalian PH domains to liposomal

membranes. B. Validation of interactions of PH domains with cooperating lipids using dose

response experiments. The figure gives the results of dose response experiments (heatmaps)

for 31 selected PH domain-lipids combination pairs and compares them with the results

obtained from the PH domain screen (bar plots and RI values). The results are divided into

four groups: true positive, true negative, false positive and false negative. Each cell in the

heatmaps gives the NBI value (violet) measured for the PH domain interaction with liposomes

containing the indicated concentration of signalling lipids. The grey colour indicates missing

data. Values are mean (n ≥ 2). The bar plots show mean NBI values (± s.d.) measured in the

PH domain screen for each liposome type containing indicated signalling lipids or their

combinations (black, NBI ≥ 0.037; white, NBI < 0.037). The concentration of signalling lipids

in liposomes used in the PH domain screen was 10 mol %, except the combination of

PI(4,5)P2 and DHS1P, for which data from liposomes containing 7 mol % of both lipids are

shown. The RI value for each experiment is given under the bar plots. In the group of false

negative, the PH domains recognizing cooperating lipids at lower lipid concentrations are

marked with a star.


Figure S4: Correlation of cooperativity indices and in vivo validation (related to Figure

4).

A, Correlations between CIs calculated for combinations of PI(4,5)P2 with PS and other

negatively charged auxiliary lipids (DHS1P, PHS1P, S1P and Cer1P). B, Summary of in vivo

experiments of GFP fusions of SKG3 and CAF120 proteins in S. cerevisiae. The metabolism

of phosphatidylserine and PI(4,5)P2 was perturbed with CHO1 (phosphatidylserine synthase

deletion) and MSS4ts (thermosensitive mutant of the phosphatidylinositol-4-phosphate 5-

kinase). Scale bars, 3 μm. The following columns summarize information on the classification

as member of CDC42 network (based on STRING database) and on the subcellular

localization in bud (based on Yeast GFP fusion localization database). The last two columns

show in vitro data for N terminal PH domains of SKG3 (SKG3N) and CAF120 (CAF120N).

They indicate whether cooperation between PI(4,5)P2 and phosphatidylserine was detected with

high confidence (as shown in Figure 3A) by giving the CI and RI calculated for the

interactions with PI(4,5)P2:phosphatidylserine combination.


Figure S5: PIP-binding profile of PH domains and details of validation of the organelle

PIP binding motif (OBM) (related to Figure 5).

A. Heatmap representing the PIP-binding profile of PH domains. The solid line separates

groups of PH domains interacting and not interacting with organelle PIPs, respectively, which

were used as training sets during identification of the OBM. The barplot on the left shows the

mean protein concentration (μM) across the replicates used for each PH domain. The PH

domains for which data from only one replicate are available are marked in red. B. OBM

(pink) and basic sequence motif (BSM, blue) scoring scheme. Each position in OBM and

BSM is associated with a score and is pointed out on the secondary structure scheme of a PH

domain. C. Selected segments of PH domains [corresponding to the parts schematically

represented above in (B)] from CERT, FAPP1, OSBP2 and SWH1 proteins with the mutated

residues highlighted. (*) indicates naturally occurring mutations.


Supplemental table legends

Table S1: List of lipids (related to Figure 1)

A. List of lipids (related to Figure 1)

30 different lipids (including 1 fluorescent lipid, 1 PEGylated lipid and 1 biotinylated lipid)

have been used in this study. The first column provides the category of the lipids and the

second one the abbreviation used in this study [(§) indicates that the lipid is not physiological

in yeast]. Common name, systematic name and synonyms are listed in the third, fourth and

fifth column, respectively. CAS number is given in column six. PubChem substance ID and

LIPID MAPS ID (http://www.lipidmaps.org) are indicated in the columns seven and eight.

The lipid supplier, the catalog number and the lipid origin (yeast, Saccharomyces cerevisiae)

are given in the columns nine, ten and eleven. Lipid solubility is given in the column twelve.

The column thirteen lists the overall charge calculated for each lipid using the data from

Gallego et al (PMID: 21119626). * and # indicate different variants of PI(4)P and PI(3,5)P2

lipid, respectively. (n/a: not available)

B. Current knowledge of in vivo lipid concentration and commonly used lipid

concentration in physiological in vitro protein-lipid interaction assay (related to Figure

1)

For every lipid species used in this study (column 1), the in vivo probes available in literature

are listed in column 2. In column 3, the in vivo concentrations available from lipidomics

studies for every lipid used in this study are reported with the associated organism and the

Pubmed ID number (PMID). Remark: since sterols are not present, the total of mol% of

glycerolipids, glycerophospholipids, glycerophosphoinositol phosphates and sphingolipids


does not reach 100%. For every lipid the subcellular enrichment is reported in column 4.

Finally, the lipid concentrations commonly used in physiological in vitro protein-lipid interaction assays are

compiled. n/a: not available, (§) lipids not physiological in yeast.

C. List of lipid mixtures (related to Figure 1)

In total, 125 lipid mixtures were used for the liposome arrays. In all lipid mixtures, PC is used

as a carrier lipid (second column) complemented with up to four signaling lipids (columns

three to six). Each lipid mixture contained a PEGylated lipid (column seven) and a fluorescent

lipid (column eight). The corresponding molar ratio for each lipid mixture is indicated in the

columns labeled mol % (columns nine to fifteen). The lipid abbreviations are as defined in

Table S1A. (§) indicates that the lipid is not physiological in S. cerevisiae. # indicates

different variant of PI(3,5)P2 lipid.


Table S2: Proteins used in the screen (related to Figure 1)

In total, 95 LBDs were tested, out of which 91 were PH domains. The first column gives

protein names as used in this study. The suffixes _1 or _2 indicate two variants of the same

PH domain. The suffixes _A or _B indicate two different proteins from C. thermophilum that

are orthologous to the same protein from S. cerevisiae. The second column gives systematic

(ORF) names, where applicable. The third and the fourth column contain Uniprot ID and

species, respectively. The type of LBD is given in the fifth column. The prediction score for

each PH domain derived from SMART database is shown in the sixth column. In case of high

SMART e-value, the e-values from BLAST or NCBI CDS are given. The column seven

indicates the studies (PMID) that studied or predicted the particular LBD. The eighth and ninth

columns contain information about amino acid (AA) borders and exact AA sequence of

probed LBDs, respectively. Source of the DNA template for all LBDs is indicated in the tenth

column. The expression level in E. coli and the solubility are given in the column eleven and

twelve, respectively. The column thirteen shows median protein concentration in μM across

all replicates. Proteins with more than 1.5 fold difference in protein concentration between at

least two replicates are marked (‡). Proteins for which the different protein concentrations had

an impact on the affinity to liposomes are marked (*). Columns fourteen to twenty-two give

information about protein concentration (μM) used in each replicate. Column twenty-three

indicates whether the average RI of detected interactions reflects high confidence (0 < RI < 2), and

column twenty-four gives the cooperativity class. Column twenty-five gives the

annotations based on in vivo localization from the Yeast GFP fusion localization database (Huh et

al., 2003; PMID: 14562095). Column twenty-six gives the annotations used in Fig. 4C, D

and E. CDC42 network annotation was derived from STRING database, nucleus annotation is

based on information available in Yeast GFP fusion localization database or GO biological

process annotation (marked with #). (n/a: not available; ND: not determined)


Table S3: Comprehensive view on the results of all domain-liposomes experiments

probed in the systematic analysis of PH domains (related to Figure 2 and 3)

Summary of mean NBI values from all domain-liposome experiments tested. The domain-

liposome experiments with the mean NBI value above the threshold (0.037) are in green. The

experiments with the mean NBI value below threshold are in white. Domains are indicated in

the first column and liposome types in the first row. The number behind the name of a

liposome type indicates concentration (mol %) of the lipid(s) used. (n/a: not available)

The worksheet named "RIs" gives the RI values for all domain-liposome experiments tested.

The worksheets named "NBIs replicate 1-4" give all the NBI values measured for each

replicate.

Table S4: Current knowledge of lipid-binding properties of the LBDs used in this study

(related to Figure 1)

A. The table shows 45 PH domains that were previously tested for their lipid-binding

properties. The first column gives the name of the PH domain as used in this study. The

second and third columns indicate if the interaction with lipid was detected in previous studies

(referred to with PMID) and in this study with high confidence (0 < RI < 2), respectively. The

last column summarizes the match of our data with the literature-based benchmark dataset.

B. The table summarizes the current knowledge about the lipid ligand preferences and the

protein recruitment to biological membranes of the LBDs tested in this study that were used

as a benchmark. The data of lipid ligands were obtained from liposome based/SPR studies or

supported by a structure information (Pleckstrin). The first column gives protein names as

used in this study. The second column indicates the LBD type that was used in each particular

study. The third column refers to the studies (PMID) showing in vivo protein/domain

recruitment to the biological membranes detected either by imaging or Ras rescue assay


(marked with *). The preferred lipid ligands are listed in the column four. The information

about the affinity of the interaction (if available) is given in the fifth column (Kd). The

column six indicates the source of the information (PMID). The interactions recapitulated in

this study are marked green (interactions with only one replicate available are in pale green),

the interactions not recapitulated are marked in red. (n/a: not available)

C. The table summarizes the PH domains for which new high-confidence (0 < RI < 2) interactions were

detected. Novel interaction indicates the 34 PH domains that were not previously reported to

interact with any membranes. New specificity/mechanism indicates the 26 PH domains for

which the interaction with lipids was previously reported and for which we propose additional

specificity and/or binding mechanism. The 30 PH domains that did not interact with any

liposomes with high confidence in this study are in italics. (n/a: not available)

Table S5: Summary of cooperative interactions detected, BSM/OBM scores, and

proteins and primers used for additional experiments (related to Figures 3, 4 and 5, and

Experimental Procedures)

A. Basic sequence motif (BSM) and Organelle PIP-binding motif (OBM) scores (related

to Figure 5)

The first column gives the protein names, using either the Uniprot ID or, for the PH

domains used in this study, the PH domain name as defined in Table S2. The second and the

third column give the OBM and BSM score. The fourth column gives the total score that is

the sum of OBM and BSM scores.

B. Proteins used in additional experiments (related to Figures 3, 4 and 5)

The table summarizes information about proteins used in the additional, validation

experiments. The first column gives domain names as used in this study. The second and the


third columns give standard and systematic (ORF) protein names where applicable. The

fourth and the fifth column indicate Uniprot ID and species, respectively. The amino acid

sequence borders of the PH domains and the exact amino acid sequences are in the column six

and seven, respectively. The source of the DNA template for the constructs is indicated in the

column eight. The columns nine to fifteen summarize information about protein concentration

(μM) used in the individual experiments shown in Figure 3B, Figure 4B, Figure 5B,D,E,F; and

Figure S1B,C,I and Figure S3B.

C. Summary of the cooperative interactions detected (related to Figure 3 and 4)

The PH domains are divided into three groups according to their cooperativity class (class 1,

2a and 2b). The barplots show NBI values measured with liposome types composed of either

PIP species alone (yellow bars), or combination of PIPs and auxiliary lipids (colour/white

bars, order indicated in the legend in the figure, from left to right). The colour bars indicate

cooperative interactions - the combinations composed of lipids physiological in S. cerevisiae

in blue, the combinations composed of at least one lipid non-physiological in S. cerevisiae in

green, the incomplete data for which the cooperativity could not be assessed are in gray, and

the white bars indicate non-cooperative interactions or no binding. The horizontal line

indicates the threshold (NBI = 0.037). Only data of PH domains with at least one

cooperativity event of physiological lipids are shown. (§) indicates that the lipid is not

physiological in S. cerevisiae, # indicates different variants of PI(3,5)P2 lipid.

D. Primers used for cloning (related to Experimental Procedures)

The summary of primers used for cloning of all domains and full length proteins. The first

column indicates the domain name. The second and the third column give the sequences of

forward and reverse primers, respectively. The lower case letters represent the specific

sequence matching the template DNA. The upper case letters indicate extra nucleotides


needed for cloning (red - restriction site; blue - additional nucleotides added to secure proper

function of restriction enzymes). The fourth column indicates the final vector into which the

cloned genes were inserted.


SUPPLEMENTAL EXPERIMENTAL PROCEDURES

Selection of Lipid Binding Domains (LBDs)

The protein candidates were identified using Hidden Markov Models (HMMs) of PH

domains from the Pfam database (http://pfam.sanger.ac.uk/) to search against S. cerevisiae

and C. thermophilum proteomes. The candidate list was supplemented with additional S.

cerevisiae and mammalian PH domains found in the literature. The prediction of amino acid

sequences of PH domains was refined by secondary (Psipred; bioinf.cs.ucl.ac.uk/psipred/)

and 3D structure (Phyre; http://www.sbg.bio.ic.ac.uk/~phyre2/html/page.cgi?id=index)

prediction modeling, which enabled us to set borders of each PH domain more precisely. In

cases of ambiguous predictions of boundaries, multiple variants of particular domain were

selected. The amino acid sequences of all PH domains were submitted to SMART database

(smart.embl-heidelberg.de/) and NCBI CDS (Conserved Domain Search;

www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi) to obtain the e-value score (Table S2).

Recombinant protein expression

DNA fragments encoding 74 LBDs were codon optimized for expression in E. coli and

synthesized (Entelechon). Sequences coding for the remaining LBDs were cloned either from S.

cerevisiae genomic DNA (14) or cDNA isolated from C. thermophilum (5) (kind gift of N.

Silva Martin, EMBL). Full length sequence of HSV2 was cloned from S. cerevisiae genomic

DNA. C2 domain of lactadherin was cloned from p416-GFP-Lact-C2 vector (Haematologic

technologies). PH domain of CERT was cloned from HA-CERT-pcDNA3.1 plasmid (kind

gift of J. Holthuis, University of Osnabrück), FAPP1 PH domain was cloned from PHFAPP1-pBGP

plasmid (Levine and Munro, 2002). PH domains of OSBP proteins (OSBP1, OSBP2,

OSBPL3, OSBPL7, OSBPL8, OSBPL10 and OSBPL11) were cloned from human cDNA

(Openbiosystems, Table S5B), except PH domain of OSBPL5 that was cloned from


synthetic gene (Entelechon). For sequence details of all primers see Table S5D. Single point

mutations were introduced by the QuikChange Lightning Site-Directed Mutagenesis Kit

(Agilent Technologies). All sequences were cloned into pETM11 vector and proteins were

expressed as N-terminal His6-SUMO3 and C-terminal sfGFP fusions in E. coli (BL21 STAR,

Invitrogen) (Saliba et al., 2014).

Proteins were produced in 5 mL cultures in 24 deep-well plates in auto-inducing ZY media.

Cells were grown at 37 °C up to OD600 ~2, subsequently the temperature was shifted to 15 °C

(synthetic genes) or 18 °C (cloned genes), and proteins were produced overnight (14-15 h). Cells

were pelleted at 3,000 g for 20 min and washed in cold PBS. Final pellets (volume ~100 μL)

were snap frozen in liquid nitrogen and stored at -80 °C for further use.

Preparation of cell extracts

Cell lysis was performed as described previously (Saliba et al., 2014). Expression level and

protein solubility were evaluated on a Coomassie stained gel and a western blot with anti-GFP

antibody (Miltenyi Biotec GmbH, MACS Molecular). Fluorescence intensity of sfGFP-tagged

protein in the cell extracts was measured on a microplate reader (BioTek) at excitation and

emission wavelengths of 485 nm and 528 nm, respectively. Concentrations of sfGFP-fusions

in the cell extracts were estimated from the fluorescence intensity as described previously

(Saliba et al., 2014). Subsequently, a small amount (final concentration 50 μg/ml) of

streptavidin-AF488 (Life Technologies) was spiked into each cell extract. Cell extracts were

centrifuged for 10 min at 16,000 g at 4°C prior to use.

Protein purification

Protein purification was performed as described previously (Saliba et al., 2014). For the

titration experiments and the circular dichroism (CD) spectroscopy analysis, the gel filtration


elution buffer (10 mM HEPES pH 7.5, 250 mM NaCl) was changed for the assay buffer (10

mM HEPES pH 7.5, 150 mM NaCl) via an overnight dialysis.

Circular dichroism (CD) spectroscopy

CD spectra were collected on a Jasco-815 spectropolarimeter in a quartz cuvette (2 mm path

length) at 20°C with protein concentrations in the range of 4.5-7.0 μM (Table S5B) in the

assay buffer (10 mM HEPES pH 7.5, 150 mM NaCl). Secondary structure analysis was

performed using Jasco Spectra Manager software package. Thermal denaturation experiments

were performed by monitoring the CD signal at 222 nm while heating protein solutions from

20°C to 90°C at a constant rate of 1°C per minute.

Liposome microarrays preparation

Fabrication and principle of the liposome microarray-based assay (LiMA) have been

previously described (Saliba et al., 2014). All lipid mixtures were prepared in a

chloroform-based solvent (Table S1A) and stored in 1.5 ml glass vials (Sigma) under argon at −20 °C.

The final concentration of each lipid mixture was 0.26 mM. Each lipid mixture was composed

of a carrier lipid (phosphatidylcholine PC, up to 95 mol %), a signalling lipid (up to 10 mol

%) or a combination of signaling lipids (up to 10 mol %), and completed with

phosphatidylethanolamine (PE)-Atto 647 (0.1 mol %) to ensure autofocusing during

automated microscopy, and PE-PEG350 (0.5 mol %) that facilitates the generation of

liposomes. The signalling lipids consisted of phosphoinositides, sphingolipids,

glycerophospholipids and DAG. Altogether 122 different lipid mixtures were used (Table

S1C). For phosphatidylinositol PI(3,5)P2, two variants differing in fatty acyl chains were

probed: dioleoyl [DOPI(3,5)P2] and dipalmitoyl [DPPI(3,5)P2] (Table S1). Unlike liposomes

containing DOPI(3,5)P2, the liposomes containing DPPI(3,5)P2 were not recognized by the

specific PI(3,5)P2 sensor Hsv2 protein. The interactions with DOPI(3,5)P2 represented the


majority of binding events reported for PI(3,5)P2-containing liposomes, therefore only those

were considered for further analysis and only DOPI(3,5)P2 was used in validation

experiments.

A thin agarose layer (TAL) of 250 nm height was prepared by coating coverslips (30x45 mm,

#1; Menzel) with 1% low melting agarose (Sigma) in water. Lipid mixtures were spotted on

the TAL by a syringe-driven spotter (Automatic TLC-spotter4, Camag). Each spot on the

array measured 800 μm × 800 μm, with a distance of 200 μm between adjacent spots (the overall spot

density was 100 spots per cm²). The array consisted of 120 spots in 4 groups of 30 spots on

an area of 10 mm × 18 mm. A polydimethylsiloxane (PDMS) device containing 4 separate

chambers was then bonded onto the array as described before (Saliba et al., 2014). Assembled

microfluidic devices were stored at 4 °C under inert atmosphere before use.
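
The quoted spot density follows from the spot size and spacing. A minimal sketch of that arithmetic, assuming a square grid with uniform pitch (the function name spot_density_per_cm2 is illustrative and not part of the original protocol):

```python
def spot_density_per_cm2(spot_um: float, gap_um: float) -> float:
    """Spots per cm^2 on a square grid; pitch = spot edge + gap."""
    pitch_um = spot_um + gap_um        # centre-to-centre distance in um
    spots_per_cm = 10_000 / pitch_um   # 1 cm = 10,000 um
    return spots_per_cm ** 2

# 800 um x 800 um spots separated by 200 um, as stated in the text
density = spot_density_per_cm2(800, 200)
print(density)
```

With a pitch of 1 mm, ten spots fit along each centimetre, which gives the stated 100 spots per cm².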

The impact of mutations in the organelle PIP binding motif (OBM) was tested on arrays

containing plasma membrane (PM) PIPs [PI(3,4)P2 and PI(4,5)P2] and organelle PIPs

[PI(3)P, PI(4)P, PI(5)P and PI(3,5)P2] of two different concentrations, 5 and 10 mol %

(Figures 5D and E). For the dose response experiments, arrays containing liposomes with

increasing concentrations (from 0 to 7 mol %) of PI(4)P* (porcine brain extract; Table S1A)

and PI(4,5)P2 were used (Figure 5F).

For dose response experiments for validation of cooperative interactions, the liposome arrays

containing liposomes of increasing concentrations (0, 3, 6 and 10 mol %) of two lipids

[PI(4,5)P2/ PI(3,5)P2 and DHS1P/ PHS1P/ PS/DHS/PHS/Phytocer] were used (Figure 3B and

Figure S3B). For the combination of PI(4,5)P2 and DHS1P, only concentrations up to 6 mol %

were used for PI(4,5)P2.
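
The dose response design can be enumerated as a small concentration grid. The sketch below assumes a full factorial grid of the four stated concentration steps for each lipid pair, which the text does not state explicitly; the function name dose_grid is hypothetical.

```python
from itertools import product

LEVELS = (0, 3, 6, 10)   # mol % steps given in the text

def dose_grid(pip: str, aux: str):
    """All (PIP mol %, auxiliary lipid mol %) pairs for one combination."""
    grid = list(product(LEVELS, LEVELS))   # assumed full factorial design
    if (pip, aux) == ("PI(4,5)P2", "DHS1P"):
        # For this pair only PI(4,5)P2 concentrations up to 6 mol % were used.
        grid = [(p, a) for p, a in grid if p <= 6]
    return grid

full = dose_grid("PI(4,5)P2", "PS")
restricted = dose_grid("PI(4,5)P2", "DHS1P")
print(len(full), len(restricted))
```

Each pair in the returned list corresponds to one cell of a dose response heatmap of the kind shown in Figure S3B.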


For titration experiments for Kd calculations, the liposome arrays containing various PIPs

[PI(3)P, PI(4)P*, PI(3,5)P2, PI(3,4)P2 and PI(4,5)P2] with a concentration of 3 mol % were

used (Figure S1I).

Measurement of membrane recruitment using LiMA

Giant liposomes (> 5 μm) were spontaneously formed within 5 minutes after injection of the

assay buffer (10 mM HEPES, 150 mM NaCl, pH 7.4) into the microfluidic chambers.

Successful formation of liposomes was confirmed by microscopy. Each liposome spot was

automatically imaged in two fluorescence channels, Atto 647 (3 ms exposure time) for

monitoring the presence of liposomes, and GFP (100 ms exposure time) to check for possible

autofluorescence. After acquisition of the blank images, 20 μL (= volume of one chamber) of

cell extracts were injected in each chamber using a syringe pump (KD scientific). After 20

minutes of incubation, unbound proteins were washed away with 4 chamber volumes (80 μL) of assay

buffer. Subsequently, the microfluidic device was coupled to the microscope. Images of every

single liposome spot were acquired automatically.

Each protein domain was tested on the array in multiple replicates. The position of the

liposome spots on the array was shuffled for each replicate to control for potential position

bias. On each array ten liposome spots containing lipid mixture with PE-biotin were placed

on different positions providing a specific ligand for streptavidin-AF488 (Figure S1A).

Positive signal from these control liposomes served as assurance of a successful experimental

protocol.
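
The replicate design described above (shuffled test spots, fixed PE-biotin controls) can be sketched as follows. The A1 to J12 coordinate scheme comes from the imaging description; the control coordinates, the mixture names and the function make_layout are hypothetical placeholders, not the positions used in the study.

```python
import random
from string import ascii_uppercase

ROWS = ascii_uppercase[:10]                 # rows A..J
COLS = range(1, 13)                         # columns 1..12
POSITIONS = [f"{r}{c}" for r in ROWS for c in COLS]   # 120 coordinates

def make_layout(mixtures, control_positions, seed):
    """Assign mixtures to free positions at random; controls stay fixed."""
    free = [p for p in POSITIONS if p not in control_positions]
    if len(mixtures) != len(free):
        raise ValueError("need exactly one mixture per free position")
    shuffled = list(mixtures)
    random.Random(seed).shuffle(shuffled)   # reproducible per-replicate shuffle
    layout = {p: "PE-biotin control" for p in control_positions}
    layout.update(zip(free, shuffled))
    return layout

# Hypothetical control coordinates and placeholder mixture names.
controls = ["A1", "A12", "C4", "C9", "E6", "F7", "H4", "H9", "J1", "J12"]
mixtures = [f"mix_{i:03d}" for i in range(1, 111)]    # 110 test spots

replicate_1 = make_layout(mixtures, controls, seed=1)
replicate_2 = make_layout(mixtures, controls, seed=2)
```

Using a different seed per replicate changes where each test mixture sits while the ten control spots keep their coordinates, so position effects can be separated from lipid effects.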

Image acquisition and image processing

Images were acquired using the same equipment and settings as described previously (Saliba

et al., 2014). One constant exposure time of 3 ms was used for Atto 647 images, representing

the position of liposome membranes, and exposure times of 5 ms, 10 ms, 30 ms, 50 ms, 75 ms


and 100 ms were used for GFP images, representing protein-liposome binding events.

Multiple exposure times for GFP were selected in order to capture a broad range of binding

intensities of tested domains. Three liposome arrays (3x 120 spots) were always prepared in

parallel and imaged in a sequence one after another. The whole procedure of imaging of the

three arrays took about 3 hours altogether.

Images were processed with the same software and settings as described previously (Saliba et

al., 2014). Normalized binding intensity (NBI) values were calculated as described previously

(Saliba et al., 2014).

Titration experiments for Kd calculations

The purified sfGFP fusions of PH domains of Boi2, Swh1 and AKT1 were diluted in the

assay buffer (10 mM HEPES pH 7.5, 150 mM NaCl) to reach the following protein

concentrations (BOI2: 0.2, 0.5, 1, 3, and 7 μM; AKT1: 0.05, 0.1, 0.2, 0.5, 1, 2, 3, 4, and 6 μM;

SWH1: 0.1, 0.2, 0.5, 1, 1.5, 2, 3, 5, and 7 μM). Subsequently, all these dilutions were

separately probed on arrays made of various PIPs-containing liposomes [PI(3)P, PI(4)P*,

PI(3,5)P2, PI(3,4)P2 and PI(4,5)P2] with a constant PIPs concentration of 3 mol % and NBI

values were measured for all interactions (see below). The Kd calculation was performed

using a non-linear regression analysis with Origin 7.5 software.
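
The binding model used for the regression is not stated here. As an illustration only, the following minimal Python sketch assumes a one-site saturation model, NBI = Bmax · c / (Kd + c), and fits it by least squares. The function names, the search bounds and the synthetic data are hypothetical; the original analysis was done in Origin 7.5.

```python
import math

def one_site(conc, bmax, kd):
    """One-site saturation binding (assumed model): bmax * conc / (kd + conc)."""
    return bmax * conc / (kd + conc)

def best_bmax(concs, nbis, kd):
    """For a fixed kd the model is linear in bmax, so bmax has a closed form."""
    f = [c / (kd + c) for c in concs]
    return sum(y * fi for y, fi in zip(nbis, f)) / sum(fi * fi for fi in f)

def sse(concs, nbis, kd):
    """Sum of squared residuals at the best bmax for this kd."""
    bmax = best_bmax(concs, nbis, kd)
    return sum((y - one_site(c, bmax, kd)) ** 2 for c, y in zip(concs, nbis))

def fit_kd(concs, nbis, kd_min=1e-3, kd_max=1e3, iters=100):
    """Golden-section search for kd on a log scale (assumes a single minimum)."""
    phi = (math.sqrt(5) - 1) / 2
    lo, hi = math.log(kd_min), math.log(kd_max)
    a = hi - phi * (hi - lo)
    b = lo + phi * (hi - lo)
    for _ in range(iters):
        if sse(concs, nbis, math.exp(a)) < sse(concs, nbis, math.exp(b)):
            hi, b = b, a          # minimum lies in [lo, b]
            a = hi - phi * (hi - lo)
        else:
            lo, a = a, b          # minimum lies in [a, hi]
            b = lo + phi * (hi - lo)
    kd = math.exp((lo + hi) / 2)
    return kd, best_bmax(concs, nbis, kd)

# Synthetic, noise-free example on the AKT1 concentration series (in µM).
# The parameters bmax=0.8 and kd=0.7 are made up for the illustration.
concs = [0.05, 0.1, 0.2, 0.5, 1, 2, 3, 4, 6]
nbis = [one_site(c, bmax=0.8, kd=0.7) for c in concs]
kd, bmax = fit_kd(concs, nbis)
```

Because Bmax is solved in closed form for every candidate Kd, the search is one-dimensional, which keeps the sketch free of external libraries.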

Assessment of the relationship between the time delay in the imaging and NBI

measurements

The liposome arrays were always imaged automatically in the same order, starting at the initial

position (top left, A1) and ending at the final position (bottom right, J12) (Figure S1A). Since

the positions of the individual lipid mixtures were reshuffled between replicates, each

interaction between a particular domain and a liposome type can be assigned a certain

‘time delay in imaging’ (i.e., the time needed for the automatic microscope to reach the

position of the specific liposome spot, starting from the initial position of the array). We

then calculated the time delay (i.e., the difference in ‘time delay in imaging’) between the

respective replicates and correlated these delays with the differences in corresponding NBIs

measured (Figure S1K).
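
The pairing of replicates described above can be sketched as follows. This is a minimal Python illustration; the function name, the input layout and the sign convention (later-imaged replicate minus earlier-imaged replicate) are assumptions, and the values are made up.

```python
from itertools import combinations

def replicate_differences(replicates):
    """replicates: list of (delay_in_imaging, nbi) for one domain-liposome pair.

    Returns one (delay difference, NBI difference) tuple per pair of
    replicates, always computed as later-imaged minus earlier-imaged.
    """
    ordered = sorted(replicates)  # earliest-imaged replicate first
    return [(later[0] - earlier[0], later[1] - earlier[1])
            for earlier, later in combinations(ordered, 2)]

# Made-up example: delays in minutes, NBIs in arbitrary units
pairs = replicate_differences([(10, 0.30), (90, 0.28), (150, 0.31)])
```

The resulting delay differences and NBI differences can then be pooled over all domain-liposome pairs and correlated.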

To further assess possible bias introduced by time delay in imaging, we also performed two

experiments (using PLCD1 and EEA1 proteins) in which we imaged the same liposome

arrays twice, once at time 0 and once two hours later, and then correlated the NBIs

obtained at the two time points (Figure S1L). No decrease in NBIs was detected over the

time ranges used for data collection.

ROC curve-based evaluation of interactions

To use only high-quality data for analysis, all acquired images were visually

inspected and poor-quality images (no liposomes formed, unfocused images and/or presence

of protein precipitates) were removed. The images were manually annotated as “interaction”,

“no binding” or “dubious”. We performed a ROC analysis (package ROCR (Sing et al.,

2005)) using the manual annotation in order to extract an NBI threshold that was

subsequently used as the interaction predictor. The threshold was set to an NBI value of 0.037

(interactions with NBI ≥ 0.037, otherwise no binding), yielding a true positive rate of 75.2%,

a false positive rate of 3.4%, a precision of 86% and an AUC of the ROC curve of 0.95 (Figure

S1D). The final NBI value for each domain-liposome experiment was calculated as the mean

NBI from all available replicates. The mean NBI values were used for further analysis.
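
To make the reported rates concrete, the following minimal Python sketch shows how a true positive rate, false positive rate and precision follow from comparing threshold-based calls with manual annotations. It is not the ROCR analysis itself; the function name is hypothetical, the example values are made up, and images annotated as “dubious” are assumed to have been excluded beforehand.

```python
def classification_rates(nbis, labels, threshold=0.037):
    """Compare NBI-based calls (NBI >= threshold) with manual annotations.

    labels: True for 'interaction', False for 'no binding'.
    Assumes both classes and at least one positive call are present.
    """
    tp = fp = tn = fn = 0
    for nbi, is_interaction in zip(nbis, labels):
        called = nbi >= threshold
        if called and is_interaction:
            tp += 1
        elif called:
            fp += 1
        elif is_interaction:
            fn += 1
        else:
            tn += 1
    return {
        "tpr": tp / (tp + fn),        # recall
        "fpr": fp / (fp + tn),
        "precision": tp / (tp + fp),
    }

# Made-up example values, not data from the screen
nbis = [0.20, 0.05, 0.01, 0.03, 0.04, 0.002, 0.50, 0.036]
labels = [True, True, False, True, False, False, True, False]
rates = classification_rates(nbis, labels)
```

Repeating this calculation over a range of thresholds gives the points of a ROC curve and of a precision-recall curve.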

Assessment of effect of protein concentration and positional bias

NBI values for all images manually annotated as interaction were collected for all replicates

of domains with more than 1.5-fold difference in protein concentration between replicates.

A Wilcoxon test followed by Bonferroni correction (R package stats) was performed to detect

significant differences between measured NBIs (threshold was set to P < 0.05 after

correction).

To look for position bias, we computed the logarithmic value of the ratio between

the number of data points with NBI > 0.037 (i.e., interactions) and the total number of data points

measured at the particular coordinate of the array.
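
The per-coordinate ratio can be sketched in Python as follows. The base of the logarithm is not specified above, so base 10 is an assumption here, as are the function name, the input layout and the handling of coordinates without any interaction.

```python
import math
from collections import defaultdict

def position_log_ratios(measurements, threshold=0.037):
    """measurements: iterable of (coordinate, nbi), e.g. ("A1", 0.12).

    Returns log10(interactions / total) per coordinate, or None where no
    interaction was observed (the logarithm of zero is undefined).
    """
    total = defaultdict(int)
    hits = defaultdict(int)
    for coord, nbi in measurements:
        total[coord] += 1
        if nbi > threshold:
            hits[coord] += 1
    return {c: (math.log10(hits[c] / total[c]) if hits[c] else None)
            for c in total}

# Made-up example values
ratios = position_log_ratios([
    ("A1", 0.10), ("A1", 0.01),   # 1 of 2 experiments is an interaction
    ("B2", 0.50), ("B2", 0.20),   # 2 of 2
    ("C3", 0.00),                 # 0 of 1
])
```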

Assessment of data reproducibility

The quantitative reproducibility of the data was computed by calculating the Pearson

correlation between the NBIs of the corresponding replicates (Figure 1B). The qualitative

reproducibility (Figure S1E) was assessed by comparing the annotations of protein-liposome

experiments based on both the NBI threshold and manual annotation. Each domain-liposome

experiment was assigned as interaction or no binding based on the mean NBI value.

Subsequently, the annotations of all individual replicates were compared. For the cases of

inconsistent annotation based on NBI, the manual annotation was used.
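
For reference, the Pearson correlation used for the quantitative reproducibility can be written out as in the minimal Python sketch below. The function name and the replicate values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Made-up NBIs of the same four domain-liposome pairs in two replicates
replicate_1 = [0.01, 0.05, 0.20, 0.40]
replicate_2 = [0.02, 0.04, 0.25, 0.38]
r = pearson(replicate_1, replicate_2)
```

The function is undefined when one of the inputs is constant, because the corresponding standard deviation is zero.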

Comparison of S. cerevisiae and C. thermophilum binding profiles

To assess the similarity of the binding profiles of 27 pairs of S. cerevisiae and C.

thermophilum PH domain orthologs, we computed the Pearson correlation coefficient of the

NBIs measured on the entire dataset (Figure 1C).

Clustering analysis

The uncentered Pearson correlation was used to cluster the liposome types according to the

similarities between their PH domain binding profiles (i.e., what PH domains interacted with

them and with what affinity) using the log(NBI) values of base 1. The liposomes were

clustered using the hclust function from the R package stats. The dissimilarity matrix was

computed using the Dist function from the amap package. The same approach was used for

clustering liposomes containing single signalling lipids and liposomes containing combinations

of signalling lipids. A Fisher test (R package stats) was used to assess the statistical

significance of differences in auxiliary lipid charge distribution between individual subgroups

(Figure 2A).
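
The uncentered Pearson dissimilarity that feeds the hierarchical clustering can be illustrated with the minimal Python sketch below. It uses the standard definition, 1 − Σxy / √(Σx² · Σy²), and does not reproduce the clustering step itself. The liposome names and the log-transformed values are made up, and the function names are hypothetical.

```python
import math

def uncentered_pearson_distance(x, y):
    """1 - sum(x*y) / sqrt(sum(x^2) * sum(y^2)); the means are not subtracted."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return 1.0 - dot / norm

def distance_matrix(profiles):
    """profiles: dict mapping a liposome name to its vector of log(NBI) values."""
    names = list(profiles)
    return {(p, q): uncentered_pearson_distance(profiles[p], profiles[q])
            for p in names for q in names}

# Made-up log-transformed binding profiles over three PH domains
profiles = {
    "lipo_A": [-1.0, -2.0, -0.5],
    "lipo_B": [-2.0, -4.0, -1.0],   # proportional to lipo_A
    "lipo_C": [-0.5, -0.3, -2.0],
}
dist = distance_matrix(profiles)
```

Profiles that are proportional to each other have a distance of 0 under this measure, even when their absolute values differ.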

To further validate the separation of clusters of plasma membrane PIPs and organelle PIPs

observed in the hierarchical clustering analysis, we performed a Principal Coordinates

Analysis (PCoA) and represented the first two dimensions, MDS1 and MDS2, of the two

different clusters in boxplots (Figure 2C). The PCoA was performed using the implementation

in the R package vegan (Dixon, 2003). Subsequently, we performed a Wilcoxon test between

those two dimensions (MDS1 and MDS2). We further demonstrated the robustness of these

clusters using the partitioning around medoids (Hennig, 2010) (Figure S2C) and the silhouette

coefficient (Rousseeuw, 1987) (Figure S2D).

Calculation of the Reproducibility Index (RI)

The reproducibility index (RI) is computed for each interaction that was measured at least

twice. For each interaction between a protein domain and a liposome type, we calculated the

mean NBI of the replicates (NBI) as well as the associated standard error ($\mathrm{SE}_{\mathrm{NBI}}$). We then

calculated the RI of the interaction as follows, where Th is the binding threshold (0.037):

If NBI > Th then

If NBI < Th then

Based on the RI calculation, the experiments were assigned to four categories: high

confidence interactions (RI ∈ ]0;2[), high confidence no binding (RI ∈ ]-2;0[), low confidence

interactions (RI ≥ 2) and low confidence no binding (RI ≤ -2).
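
The assignment to the four categories can be sketched in Python as follows. The sketch only maps an already computed RI value to its category and does not compute the RI itself; the function name and the treatment of RI = 0, which falls outside the open intervals, are assumptions.

```python
def ri_category(ri):
    """Map a reproducibility index to one of the four confidence categories."""
    if ri >= 2:
        return "low confidence interaction"
    if 0 < ri < 2:
        return "high confidence interaction"
    if -2 < ri < 0:
        return "high confidence no binding"
    if ri <= -2:
        return "low confidence no binding"
    return "undefined"  # ri == 0 is not covered by the intervals

# Made-up RI values
categories = [ri_category(v) for v in (0.4, 2.0, -1.5, -3.0)]
```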

Cooperativity assessment

To evaluate the role of cooperative associations or rheostasis (i.e., the changes in binding

affinity for one lipid owing to the presence of another lipid) in the recruitment of PH domains

to phosphoinositide-containing membranes, we derived a cooperativity index

($\mathrm{CI} = \mathrm{NBI}_{L1+L2} / [\mathrm{NBI}_{L1} + \mathrm{NBI}_{L2}]$). An interaction was considered cooperative when CI > 1, that is, when the

binding affinity to liposomes containing two signaling lipids was stronger than the sum of the

binding affinities to liposomes containing the individual lipids ($\mathrm{NBI}_{L1+L2} > \mathrm{NBI}_{L1} + \mathrm{NBI}_{L2}$ and,

to account for technical variability, $\mathrm{NBI}_{L1+L2}$ also has to be higher than the highest value

obtained for $\mathrm{NBI}_{L1} + \mathrm{NBI}_{L2}$ amongst the replicates). We also observed instances of negative

cooperativity ($\mathrm{NBI}_{L1+L2} < \max\{\mathrm{NBI}_{L1}; \mathrm{NBI}_{L2}\}$; $\mathrm{CI} = \mathrm{NBI}_{L1+L2} / \max\{\mathrm{NBI}_{L1}; \mathrm{NBI}_{L2}\}$ and, to

account for technical variability, $\mathrm{NBI}_{L1+L2}$ also has to be lower than the lowest value obtained

for $\mathrm{NBI}_{L1}$ or $\mathrm{NBI}_{L2}$ amongst the replicates).
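
The criteria above can be illustrated with the minimal Python sketch below. Several details are assumptions made for the illustration rather than statements about the original analysis: mean NBIs are used for the main comparison, replicates of the two single-lipid liposomes are treated as unpaired (so the highest sum is the sum of the two maxima), and the lowest single-lipid value is taken over all replicates of both lipids. The function name and the example values are hypothetical.

```python
def cooperativity(nbi_l1, nbi_l2, nbi_mix):
    """Classify one domain-lipid pair; each argument is a list of replicate NBIs.

    Returns (label, ci) with label in {"positive", "negative", "none"}.
    Assumes the single-lipid NBIs are not all zero.
    """
    def mean(values):
        return sum(values) / len(values)

    m1, m2, m_mix = mean(nbi_l1), mean(nbi_l2), mean(nbi_mix)
    highest_sum = max(nbi_l1) + max(nbi_l2)   # unpaired replicates (assumption)
    lowest_single = min(nbi_l1 + nbi_l2)      # over both lipids (assumption)

    if m_mix > m1 + m2 and m_mix > highest_sum:
        return "positive", m_mix / (m1 + m2)
    if m_mix < max(m1, m2) and m_mix < lowest_single:
        return "negative", m_mix / max(m1, m2)
    return "none", m_mix / (m1 + m2)

# Made-up replicate NBIs
pos_label, pos_ci = cooperativity([0.05, 0.06], [0.04, 0.05], [0.30, 0.34])
neg_label, neg_ci = cooperativity([0.30, 0.32], [0.05, 0.06], [0.02, 0.03])
```

In the first example the mixed liposome binds about three times more strongly than the sum of the single-lipid liposomes, so the pair is called positively cooperative; in the second, binding to the mixture falls below every single-lipid replicate.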

The dose-response experiments on 31 randomly selected PH domain-lipid combination pairs

were performed on liposome arrays containing liposomes with increasing concentrations of two

lipids (Figure S3B). The PH domains were tested at the same protein concentration as used in

the PH domain screen where the cooperative/non-cooperative interactions were observed.

The screen results (“predictions”) along with the validation results (the “truth”) were used to

estimate a true positive rate and false negative rate of the cooperativity heatmap.

The PH domains were separated into three cooperativity classes (Figure 4A) based on

manual inspection of data represented in plots in Table S5C.

Principal component analysis of proteins based on the cooperativity index values

Principal component analysis (PCA) was performed using the R package ade4 (Dray and

Dufour, 2007). The PCA was performed on a set of S. cerevisiae PH domains with CI >1 for

at least one liposome type (total 37 PH domains). Only CI values obtained for liposome types

containing PI(4,5)P2:PS (10:10 mol %), PI(4,5)P2:DHS1P (10:10 mol %), or PI(4,5)P2:PHS1P

were considered.

The PH domains were annotated using the information on the corresponding full-length

proteins available in the STRING database (Franceschini et al., 2013) and the Yeast GFP fusion

localization database (Huh et al., 2003) (Table S2). From the STRING database, protein-

protein interactions with a high experimental score (>0.8) were considered. With this filter,

the protein CDC42 was identified as the common and specific interactor of the majority of

proteins containing PH domains with high PC2 values (> 75% quantile) as defined by the PCA,

which correspond to high CI for PI(4,5)P2:PS liposomes. To account for one

outlier (BUD4), we subsequently refined our filter and also considered interactions whose

STRING text mining score exceeded 0.5. The refined filter identified only

BUD4 (score 0.649) as an additional CDC42 interactor.

The nuclear localization annotation was derived from the Yeast GFP fusion database (nucleus and

nuclear periphery annotations considered) and supplemented with proteins manually curated

as being involved in positive regulation of transcription from RNA polymerase II promoter in

Gene Ontology biological process annotations (http://www.yeastgenome.org).

Search of the organelle PIP binding motifs (OBM)

We first defined two groups of PH domains: the PH domains able to interact with

organelle PIPs (16) represented the foreground set, and all remaining PH domains

(interacting with PM PIPs only, or not interacting with any individual PIPs at all) represented

the background set (74) (Figure S5A). All PH domain sequences from both sets were aligned

(HMMER version 3.0) against the Pfam PH domain seed HMM. Two PH domains were

removed from the foreground set: the AVO1 PH domain (high SMART e-value prediction

score) and the PDK1 PH domain (not aligned properly with the other PH domains). The multi-

harmony algorithm (Brandt et al., 2010) was then run on the multiple alignments of the

remaining sequences (14 PH domains of the foreground set and 74 PH domains of the

background set). The algorithm computes how the amino acid content of one group at one

position differs from the amino acid content of the other group and thus identifies

residues/positions in the alignment that are specific for one group. The algorithm proposed

17 best-scoring positions. Out of these 17 positions we selected nine that (i) showed no

overlap in overrepresented amino acids between the foreground and the background, and (ii)

were located inside the predicted PH domain secondary structure.

The sequences of 90 PH domains tested in this study and 1,115 PH domains extracted from

the UniProt database (http://www.uniprot.org/; UniProt release 2013_11) were ranked according

to a manually defined scoring scheme, which works as follows:

1) conservation of OBM residues - 3 points for the two positions of lowest variability (T in

the β1-β2 loop and K in the β7-strand) and 1 point for the other seven. The maximal score for

OBM was 13.

2) conservation of BSM (basic sequence motif) residues - 1 point for each. The maximal

score for BSM was 3.

When summed, OBM and BSM could give a maximum score of 16 (Figure S5B, Table

S5A).
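
The scoring scheme above can be written out as in the minimal Python sketch below. The weights follow the description (3 points for each of the two least variable OBM positions, 1 point for each of the other seven, 1 point for each BSM residue); the function name, the input layout and the ordering of the positions are assumptions.

```python
# Weights per motif position; the two low-variability OBM positions come first
OBM_WEIGHTS = [3, 3, 1, 1, 1, 1, 1, 1, 1]
BSM_WEIGHTS = [1, 1, 1]

def motif_score(obm_conserved, bsm_conserved):
    """Score one PH domain sequence.

    obm_conserved: nine booleans, True where the OBM residue is conserved.
    bsm_conserved: three booleans, True where the BSM residue is conserved.
    Returns (obm_score, bsm_score, total_score).
    """
    if len(obm_conserved) != len(OBM_WEIGHTS):
        raise ValueError("expected nine OBM positions")
    if len(bsm_conserved) != len(BSM_WEIGHTS):
        raise ValueError("expected three BSM positions")
    obm = sum(w for w, ok in zip(OBM_WEIGHTS, obm_conserved) if ok)
    bsm = sum(w for w, ok in zip(BSM_WEIGHTS, bsm_conserved) if ok)
    return obm, bsm, obm + bsm

# A fully conserved sequence reaches the maximal scores of 13, 3 and 16
full = motif_score([True] * 9, [True] * 3)

# Made-up partial conservation pattern
partial = motif_score(
    [True, False, True, True, False, False, False, False, False],
    [True, True, False],
)
```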

The statistical significance of the difference between NBI values measured for interactions of

PH domains of different scores with liposomes containing either PM, or organelle PIPs was

calculated using the Wilcoxon test. The glutamine residue used for lysine/arginine

substitution in PH domains with engineered mutations was selected based on the PH

consensus sequence from the Pfam database (based on 163 PH domains) (Levine and

Munro, 2002). Naturally occurring variants in the OBM were searched for in the COSMIC database

(Forbes et al., 2010) (http://cancer.sanger.ac.uk/cancergenome/projects/cosmic/). The

statistical significance of the differences in NBI values measured for wild-type and mutated

forms of PH domains was calculated using the Wilcoxon test.

Yeast strains

Yeast strains with deleted or depleted lipid metabolizing genes (CHO1, MSS4ts) were

created in the SGA Y7039 strain (a gift from C. Boone, University of Toronto) derived from

the BY4741 background (Tong and Boone, 2007) (MATα can1Δ::STE2pr-LEU2 lyp1Δ

ura3Δ0 leu2Δ0 his3Δ1 met15Δ0) by standard yeast molecular biology procedures (Janke et

al., 2004). These mutant strains were mated with selected yeast strains from Yeast GFP Clone

Collection (Invitrogen) (BEM3, BOI1, BOI2, CDC24, CLA4) using robot-facilitated mating,

sporulation and strain selection following the standard SGA protocols (Huh et al., 2003; Tong and

Boone, 2007). The genotypes of the final strains were confirmed by PCR.

CERT-PH, FAPP1-PH, OSBP2-PH, OPY1C-PH and PLCD1-PH domains and their mutants

were cloned into pRS315-mCherry vector (kind gift of E.C. Hurt, BZH, Heidelberg). SWH1-

PH domain was cloned from yeast genomic DNA (amino acid sequence as for E. coli

expression) (Table S5D) and then inserted into the pRS315-mCherry vector. Single point

mutations were introduced with the QuikChange Lightning Site-Directed Mutagenesis Kit

(Agilent Technologies). All domains were expressed with a C-terminal mCherry tag from the

ADH1 promoter. For colocalization experiments, a yeast strain with the GFP-tagged trans-Golgi

marker KEX2 (Yeast GFP Clone Collection) was used. The effect of a lowered PI(4)P level in

Golgi membranes was tested with the thermosensitive PIK1ts strain (Audhya et al., 2000) (kind

gift of S. Emr, Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca)

[MATa leu2-3,112 ura3-52 his3-D200 trp1-D901 lys2-801 suc2-D9, pik1::HIS3, harboring

pRS314pik1-83 (Amp, TRP1 CEN6 pik1-83)].

Imaging of yeast strains

Yeast strains harbouring mCherry-tagged PH domains (KEX2-GFP and PIK1ts strains) were

grown in SD medium without tryptophan overnight at 30 °C (KEX2-GFP) or 25 °C (PIK1ts).

Cells were diluted to OD600 < 0.05 and further grown for ~3 h at the respective temperature. The

cells were adhered to glass slides (BioTek) coated with Concanavalin A (Sigma) and imaged.

The PIK1ts strains tested at the nonpermissive temperature were incubated at 37 °C for 50

minutes prior to imaging. Images were acquired with an Olympus IX-81 microscope with a ×100/NA

1.45 oil objective lens and an Orca-ER camera (Hamamatsu).

For live-cell imaging of yeast strains harbouring both GFP-fused genes and

deletions/depletions of genes of lipid metabolic enzymes, cells were inoculated in SD medium

without tryptophan and histidine, supplemented in addition with 1.0 mM ethanolamine in the

case of cho1, and grown overnight at 30 °C. Cells were diluted to OD600 = 0.1, adhered to

Concanavalin A-coated 96-well glass-bottom plates, and imaged on a fully motorized Olympus

fluorescence microscope system (Olympus IX-81) at 30 °C (temperature-controlled

incubator, EMBL manufacture). The mss4ts strains tested at the nonpermissive temperature were

incubated at 37 °C for 50 minutes prior to imaging. Images were acquired using a ×100/NA

1.45 objective, a low-noise, highly sensitive ORCA-R camera (Hamamatsu), an MT20 illumination system,

and a Uniblitz Electro-Programmable Shutter system. All acquired images were processed with

ImageJ software.

SUPPLEMENTAL REFERENCES

Audhya, A., Foti, M., and Emr, S.D. (2000). Distinct roles for the yeast phosphatidylinositol

4-kinases, Stt4p and Pik1p, in secretion, cell growth, and organelle membrane dynamics.

Molecular biology of the cell 11, 2673-2689.

Brandt, B.W., Feenstra, K.A., and Heringa, J. (2010). Multi-Harmony: detecting functional

specificity from sequence alignment. Nucleic Acids Res 38, W35-40.

Dixon, P. (2003). VEGAN, a package of R functions for community ecology. Journal of

Vegetation Science 14, 927-930.

Dray, S., and Dufour, A.-B. (2007). The ade4 package: implementing the duality diagram for

ecologists. Journal of statistical software 22, 1-20.

Forbes, S.A., Tang, G., Bindal, N., Bamford, S., Dawson, E., Cole, C., Kok, C.Y., Jia, M.,

Ewing, R., Menzies, A., et al. (2010). COSMIC (the Catalogue of Somatic Mutations in

Cancer): a resource to investigate acquired mutations in human cancer. Nucleic Acids Res 38,

D652-657.

Franceschini, A., Szklarczyk, D., Frankild, S., Kuhn, M., Simonovic, M., Roth, A., Lin, J.,

Minguez, P., Bork, P., von Mering, C., et al. (2013). STRING v9.1: protein-protein

interaction networks, with increased coverage and integration. Nucleic Acids Res 41, D808-

815.

Hennig, C. (2010). fpc: Flexible procedures for clustering. R package version Available:

http://cran.r-project.org/web/packages/fpc/.

Huh, W.K., Falvo, J.V., Gerke, L.C., Carroll, A.S., Howson, R.W., Weissman, J.S., and

O'Shea, E.K. (2003). Global analysis of protein localization in budding yeast. Nature 425,

686-691.

Janke, C., Magiera, M.M., Rathfelder, N., Taxis, C., Reber, S., Maekawa, H., Moreno-

Borchart, A., Doenges, G., Schwob, E., Schiebel, E., et al. (2004). A versatile toolbox for

PCR-based tagging of yeast genes: new fluorescent proteins, more markers and promoter

substitution cassettes. Yeast 21, 947-962.

Levine, T.P., and Munro, S. (2002). Targeting of Golgi-specific pleckstrin homology domains

involves both PtdIns 4-kinase-dependent and -independent components. Current biology : CB

12, 695-704.

Rousseeuw, P.J. (1987). Silhouettes: a graphical aid to the interpretation and validation of

cluster analysis. Journal of computational and applied mathematics 20, 53-65.

Saliba, A.E., Vonkova, I., Ceschia, S., Findlay, G.M., Maeda, K., Tischer, C., Deghou, S., van

Noort, V., Bork, P., Pawson, T., et al. (2014). A quantitative liposome microarray to

systematically characterize protein-lipid interactions. Nat Methods 11, 47-50.

Sing, T., Sander, O., Beerenwinkel, N., and Lengauer, T. (2005). ROCR: visualizing classifier

performance in R. Bioinformatics 21, 3940-3941.

Tong, A.H.Y., and Boone, C. (2007). High-Throughput Strain Construction and Systematic

Synthetic Lethal Screening in Saccharomyces cerevisiae. Methods in Microbiology 36, 369–

386, 706–707.
