Cell Reports
Supplemental Information
Lipid Cooperativity as a General
Membrane-Recruitment Principle for PH Domains
Ivana Vonkova, Antoine-Emmanuel Saliba, Samy Deghou, Kanchan Anand, Stefano
Ceschia, Tobias Doerks, Augustinus Galih, Karl G. Kugler, Kenji Maeda, Vladimir Rybin,
Vera van Noort, Jan Ellenberg, Peer Bork, and Anne-Claude Gavin
SUPPLEMENTAL DATA
Supplemental figures
Figure S1. Assessment of the quality of the dataset and overview of the detected PH
domain-liposome interactions (related to Figure 1).
A. Control for potential position-dependent artifacts on the liposome array. Each cell of the
heatmap corresponds to a position on the liposome array and indicates the logarithmic value
of the ratio between the number of experiments of NBI > 0.037 (i.e., interactions) and the
total number of experiments measured at the particular coordinate. Squared cells represent the
positions of the positive control (liposomes containing a PE-biotin lipid), which remained
fixed across all replicates. B. Control for PH domain-sfGFP fusions folding using circular
dichroism (CD). C. For each PH domain studied with CD, a secondary structure prediction
from the CD data is given (columns two and three; structured [%] represents the sum of helix,
beta and turn element predictions). The calculated melting temperature is given in column
four. D. Determination of NBI cutoff maximizing sensitivity and specificity of LBDs-
liposome interaction detection. Top, ROC curve and, bottom, precision-recall curve analyses
of the NBIs of the screen dataset. The NBI cutoff of 0.037 yields a true positive rate (recall)
of 75.2%, a false positive rate of 3.4%, precision of 86% (dashed lines). AUC (area under
curve) of ROC curve was 0.95. E. Qualitative reproducibility of the screen. All LBDs-
liposome experiments were assigned as interaction or no binding according to the mean NBI
value calculated from all available replicates (interaction if NBI ≥ 0.037, otherwise no
binding). The big pie chart shows reproducibility of these annotations based on NBI
annotations of the corresponding replicates. In cases of inconsistent annotation,
reproducibility based on visual inspection of the images is shown (small pie charts). F.
Determination of the reproducibility index (RI) for each domain-liposome type experiment
studied. The upper plot represents the standard error (SENBI) as a function of the NBI (log10
transformed). The lower plot represents the RI as a function of the NBI (log10 transformed).
Interactions yield a positive RI while no binding events yield a negative RI. The closer the RI
is to 0, the more confident is the datapoint (interaction or no binding). The plot on the right
side of the panel represents randomly picked examples of domain-liposome type experiments
which yielded different ranges of RI. G. Assessment of the ability of LiMA to recover the
specific lipid-binding profile of four positive control LBDs (EEA1-FYVE, HSV2,
Lactadherin-C2, p40phox-PX). The numbers behind the lipid names indicate lipid
concentration (mol %). H. For the four positive controls shown in panel (G), the boxplots
show the NBI and 1/RI for the known specific lipid(s) partners and other lipids and lipid
mixtures. The P values show the statistical significance between indicated pairs of boxplots. I.
Comparison of dissociation constants (Kd) calculated from titration experiments using LiMA
on a set of PH domains with reference Kds (Ref. Kd). J. Correlation between NBI
measurements and literature derived Kds for selected PH domain-lipid combinations. The
Pearson correlation coefficient of the presented interactions was 0.74, with an associated
P = 2.6 × 10⁻³. K. Visualization of a relationship between the time delay in the imaging (x axis) and
NBI measurements between replicates (y axis). L. Correlation analysis of NBIs measured on
the same liposome array imaged twice, once at time 0 (imaging 1) and then two hours later
(imaging 2), for PLCD1 (upper plot) and EEA1 (lower plot). M. Correlation analysis of LBDs
concentration versus NBIs of all interactions detected in the screen. N. Pairwise correlation
analysis of NBIs for LBDs with more than 1.5 fold difference in protein concentration
between replicates. The NBIs from the corresponding replicates were compared pairwise
(Wilcoxon test followed by Bonferroni correction) and corrected P values, reflecting
statistical significance of difference in NBIs, were plotted for each pair. P value 0.05 (solid
line) was used as a threshold for statistically significant difference. The domain name together
with the concentration (μM) of the two corresponding replicates (Repl.) is given. O. Number
of interactions detected per PH domain. The proteins for which data from only one replicate
are available are labeled in red, the proteins for which the mean RI of the interactions was ≥ 2
are marked in blue.
Figure S2: Hierarchical clustering of liposome types according to the similarities of their
PH domain-binding profiles (related to Figure 2).
A. Top, hierarchical clustering of single signalling lipid-containing liposomes according to the
similarities in their PH domain-binding profiles. The colours indicate lipid families to which
the signalling lipids belong. The lipids that are not physiological in S. cerevisiae are marked
with §. Middle, barplot represents the number of PH domains that were recruited to
membranes of particular liposome type. Bottom, boxplot giving the NBI values of interacting
PH domains. The numbers behind the lipid names indicate lipid concentration (mol %) used
for each signalling lipid. B. Detailed view on the hierarchical clustering shown in the Figure
2A. The liposome types composed of combination of signalling lipids are clustered according
to their PH domain binding profiles. Right, the number of PH domains interacting with each
liposome type and left, the distribution of NBI values are shown. The lipids that are not
physiological in S. cerevisiae are marked with §. C. The optimum number of clusters in the
hierarchical clustering shown in (B) and Figure 2A was decided based on a partitioning around
medoids. Two clusters composed of liposome types containing PM PIPs (triangles) or
organelle PIPs (circles), respectively, were found. D. The silhouette plot assesses the
robustness of the two clusters identified in (C). The average silhouette width of 0.54 indicates
that a robust structure has been found. # indicates the dipalmitoyl variant of PI(3,5)P2.
Figure S3: Cooperation of lipids in the targeting of PH domains to membranes (related
to Figure 3).
A. Landscape of cooperating lipids in the targeting of mammalian PH domains to liposomal
membranes. B. Validation of interactions of PH domains with cooperating lipids using dose
response experiments. The figure gives the results of dose response experiments (heatmaps)
for 31 selected PH domain-lipids combination pairs and compares them with the results
obtained from the PH domain screen (bar plots and RI values). The results are divided into
four groups: true positive, true negative, false positive and false negative. Each cell in the
heatmaps gives the NBI value (violet) measured for the PH domain interaction with liposomes
containing the indicated concentration of signalling lipids. The grey colour indicates missing
data. Values are mean (n ≥ 2). The bar plots show mean NBI values (± s.d.) measured in the
PH domain screen for each liposome type containing indicated signalling lipids or their
combinations (black, NBI ≥ 0.037; white, NBI < 0.037). The concentration of signalling lipids
in liposomes used in the PH domain screen was 10 mol %, except the combination of
PI(4,5)P2 and DHS1P, for which data from liposomes containing 7 mol % of both lipids are
shown. The RI values for each experiment are given under the bar plots. In the group of false
negative, the PH domains recognizing cooperating lipids at lower lipid concentrations are
marked with a star.
Figure S4: Correlation of cooperativity indices and in vivo validation (related to Figure
4).
A, Correlations between CIs calculated for combinations of PI(4,5)P2 with PS and other
negatively charged auxiliary lipids (DHS1P, PHS1P, S1P and Cer1P). B, Summary of in vivo
experiments of GFP fusions of SKG3 and CAF120 proteins in S. cerevisiae. The metabolism
of phosphatidylserine and PI(4,5)P2 was perturbed with CHO1 (phosphatidylserine synthase
deletion) and MSS4ts (thermosensitive mutant of the phosphatidylinositol-4-phosphate 5-kinase).
Scale bars, 3 μm. The following columns summarize information on the classification
as member of CDC42 network (based on STRING database) and on the subcellular
localization in bud (based on Yeast GFP fusion localization database). The last two columns
show in vitro data for N terminal PH domains of SKG3 (SKG3N) and CAF120 (CAF120N).
They indicate if any high confidence detection of PI(4,5)P2 and phosphatidylserine
cooperation (as shown in Figure 3A) were detected by giving CI and RI calculated for the
interactions with PI(4,5)P2:phosphatidylserine combination.
Figure S5: PIP-binding profile of PH domains and details of validation of the organelle
PIP binding motif (OBM) (related to Figure 5).
A. Heatmap representing the PIP-binding profile of PH domains. The solid line separates
groups of PH domains interacting and not interacting with organelle PIPs, respectively, which
were used as training sets during identification of the OBM. The barplot on the left shows the
mean protein concentration (μM) across the replicates used for each PH domain. The PH
domains for which data from only one replicate are available are marked in red. B. OBM
(pink) and basic sequence motif (BSM, blue) scoring scheme. Each position in OBM and
BSM is associated with a score and is pointed out on the secondary structure scheme of a PH
domain. C. Selected segments of PH domains [corresponding to the parts schematically
represented above in (B)] from CERT, FAPP1, OSBP2 and SWH1 proteins with the mutated
residues highlighted. (*) indicates naturally occurring mutations.
Supplemental table legends
Table S1: List of lipids (related to Figure 1)
A. List of lipids (related to Figure 1)
30 different lipids (including 1 fluorescent lipid, 1 PEGylated lipid and 1 biotinylated lipid)
have been used in this study. The first column provides the category of the lipids and the
second one the abbreviation used in this study [(§) indicates that the lipid is not physiological
in yeast]. Common name, systematic name and synonyms are listed in the third, fourth and
fifth column, respectively. CAS number is given in column six. PubChem substance ID and
LIPID MAPS ID (http://www.lipidmaps.org) are indicated in the columns seven and eight.
The lipid supplier, the catalog number and the lipid origin (yeast, Saccharomyces cerevisiae)
are given in the columns nine, ten and eleven. Lipid solubility is given in the column twelve.
The column thirteen lists the overall charge calculated for each lipid using the data from
Gallego et al (PMID: 21119626). * and # indicate different variants of PI(4)P and PI(3,5)P2
lipid, respectively. (n/a: not available)
B. Current knowledge of in vivo lipid concentration and commonly used lipid
concentration in physiological in vitro protein-lipid interaction assay (related to Figure
1)
For every lipid species used in this study (column 1), the in vivo probes available in the literature
are listed in column 2. In column 3, the in vivo concentrations available from lipidomics
studies for every lipid used in this study are reported with the associated organism and the
Pubmed ID number (PMID). Remark: since sterols are not present, the total of mol% of
glycerolipids, glycerophospholipids, glycerophosphoinositol phosphates and sphingolipids
does not reach 100%. For every lipid the subcellular enrichment is reported in column 4.
Finally, the lipid concentrations commonly used in physiological in vitro protein-lipid interaction assays are
compiled. n/a: not available, (§) lipids not physiological in yeast.
C. List of lipid mixtures (related to Figure 1)
In total, 125 lipid mixtures were used for the liposome arrays. In all lipid mixtures, PC is used
as a carrier lipid (second column) complemented with up to four signaling lipids (columns
three to six). Each lipid mixture contained a PEGylated lipid (column seven) and a fluorescent
lipid (column eight). The corresponding molar ratio for each lipid mixture is indicated in the
columns labeled mol % (columns nine to fifteen). The lipid abbreviations are as defined in
Table S1A. (§) indicates that the lipid is not physiological in S. cerevisiae. # indicates
different variant of PI(3,5)P2 lipid.
Table S2: Proteins used in the screen (related to Figure 1)
In total, 95 LBDs were tested, out of which 91 were PH domains. The first column gives
protein names as used in this study. The suffixes _1 or _2 indicate two variants of the same
PH domain. The suffixes _A or _B indicate two different proteins from C. thermophilum that
are orthologous to the same protein from S. cerevisiae. The second column gives systematic
(ORF) names, where applicable. The third and the fourth column contain Uniprot ID and
species, respectively. The type of LBD is given in the fifth column. The prediction score for
each PH domain derived from SMART database is shown in the sixth column. In case of high
SMART e-value, the e-values from BLAST or NCBI CDS are given. The column seven
indicates the studies (PMID) that studied or predicted the particular LBD. The eighth and ninth
columns contain information about amino acid (AA) borders and exact AA sequence of
probed LBDs, respectively. Source of the DNA template for all LBDs is indicated in the tenth
column. The expression level in E. coli and the solubility are given in the column eleven and
twelve, respectively. The column thirteen shows median protein concentration in μM across
all replicates. Proteins with more than 1.5 fold difference in protein concentration between at
least two replicates are marked (‡). Proteins for which the different protein concentrations had
impact on the affinity to liposomes are marked (*). Columns fourteen to twenty-two give
information about protein concentration (μM) used in each replicate. Column twenty-three
shows whether the average RI of detected interactions indicates high confidence (0 < RI < 2), and
column twenty-four gives the cooperativity class. Column twenty-five gives the
annotations based on in vivo localization from the Yeast GFP fusion localization database (Huh et
al., 2003; PMID: 14562095). Column twenty-six gives the annotations used in Fig. 4C, D
and E. CDC42 network annotation was derived from STRING database, nucleus annotation is
based on information available in Yeast GFP fusion localization database or GO biological
process annotation (marked with #). (n/a: not available; ND: not determined)
Table S3: Comprehensive view on the results of all domain-liposomes experiments
probed in the systematic analysis of PH domains (related to Figure 2 and 3)
Summary of mean NBI values from all domain-liposome experiments tested. The domain-
liposome experiments with the mean NBI value above the threshold (0.037) are in green. The
experiments with the mean NBI value below threshold are in white. Domains are indicated in
the first column and liposome types in the first row. The number behind the name of a
liposome type indicates concentration (mol %) of the lipid(s) used. (n/a: not available)
The worksheet named "RIs" gives the RI values for all domain-liposome experiments tested.
The worksheets named "NBIs replicate 1-4" give all the NBI values measured for each
replicate.
Table S4: Current knowledge of lipid-binding properties of the LBDs used in this study
(related to Figure 1)
A. The table shows 45 PH domains that were previously tested for their lipid-binding
properties. The first column gives the name of the PH domain as used in this study. The
second and third columns indicate if the interaction with lipid was detected in previous studies
(referred to by PMID) and in this study with high confidence (0 < RI < 2), respectively. The
last column summarizes the match of our data with the literature-based benchmark dataset.
B. The table summarizes the current knowledge about the lipid ligand preferences and the
protein recruitment to biological membranes of the LBDs tested in this study that were used
as a benchmark. The data of lipid ligands were obtained from liposome based/SPR studies or
supported by a structure information (Pleckstrin). The first column gives protein names as
used in this study. The second column indicates the LBD type that was used in each particular
study. The third column refers to the studies (PMID) showing in vivo protein/domain
recruitment to the biological membranes detected either by imaging or Ras rescue assay
(marked with *). The preferred lipid ligands are listed in the column four. The information
about the affinity of the interaction (if available) is given in the fifth column (Kd). The
column six indicates the source of the information (PMID). The interactions recapitulated in
this study are marked green (interactions with only one replicate available are in pale green),
the interactions not recapitulated are marked in red. (n/a: not available)
C. The table summarizes the PH domains for which new high-confidence (0 < RI < 2) interactions were
detected. Novel interaction indicates the 34 PH domains that were not previously reported to
interact with any membranes. New specificity/mechanism indicates the 26 PH domains for
which the interaction with lipids was previously reported and for which we propose additional
specificity and/or binding mechanism. The 30 PH domains that did not interact with any
liposomes with high confidence in this study are in italics. (n/a: not available)
Table S5: Summary of cooperative interactions detected, BSM/OBM scores, and
proteins and primers used for additional experiments (related to Figures 3, 4 and 5, and
Experimental Procedures)
A. Basic sequence motif (BSM) and Organelle PIP-binding motif (OBM) scores (related
to Figure 5)
The first column gives the names of protein either using the Uniprot ID or, for the PH
domains used in this study, the PH domain name as defined in Table S2. The second and the
third column give the OBM and BSM score. The fourth column gives the total score that is
the sum of OBM and BSM scores.
B. Proteins used in additional experiments (related to Figures 3, 4 and 5)
The table summarizes information about proteins used in the additional, validation
experiments. The first column gives domain names as used in this study. The second and the
third columns give standard and systematic (ORF) protein names where applicable. The
fourth and the fifth column indicate Uniprot ID and species, respectively. The amino acid
sequence borders of the PH domains and the exact amino acid sequences are in the column six
and seven, respectively. The source of the DNA template for the constructs is indicated in the
column eight. The columns nine to fifteen summarize information about protein concentration
(μM) used in the individual experiments shown in Figure 3B, Figure 4B, Figure 5B, D, E and F,
Figure S1B, C and I, and Figure S3B.
C. Summary of the cooperative interactions detected (related to Figure 3 and 4)
The PH domains are divided into three groups according to their cooperativity class (class 1,
2a and 2b). The barplots show NBI values measured with liposome types composed of either
PIP species alone (yellow bars), or combination of PIPs and auxiliary lipids (colour/white
bars, order indicated in the legend in the figure, from left to right). The colour bars indicate
cooperative interactions - the combinations composed of lipids physiological in S. cerevisiae
in blue, the combinations composed of at least one lipid non-physiological in S. cerevisiae in
green, the incomplete data for which the cooperativity could not be assessed are in gray, and
the white bars indicate non-cooperative interactions or no binding. The horizontal line
indicates the threshold (NBI = 0.037). Only data of PH domains with at least one
cooperativity event of physiological lipids are shown. (§) indicates that the lipid is not
physiological in S. cerevisiae, # indicates different variants of PI(3,5)P2 lipid.
D. Primers used for cloning (related to Experimental Procedures)
The summary of primers used for cloning of all domains and full length proteins. The first
column indicates the domain name. The second and the third column give the sequences of
forward and reverse primers, respectively. The lower case letters represent the specific
sequence matching the template DNA. The upper case letters indicate extra nucleotides
needed for cloning (red - restriction site; blue - additional nucleotides added to secure proper
function of restriction enzymes). The fourth column indicates the final vector where the
cloned genes were inserted to.
SUPPLEMENTAL EXPERIMENTAL PROCEDURES
Selection of Lipid Binding Domains (LBDs)
The protein candidates were identified using Hidden Markov Models (HMMs) of PH
domains from the Pfam database (http://pfam.sanger.ac.uk/) to search against S. cerevisiae
and C. thermophilum proteomes. The candidate list was supplemented with additional S.
cerevisiae and mammalian PH domains found in the literature. The prediction of amino acid
sequences of PH domains was refined by secondary (Psipred; bioinf.cs.ucl.ac.uk/psipred/)
and 3D structure (Phyre; http://www.sbg.bio.ic.ac.uk/~phyre2/html/page.cgi?id=index)
prediction modeling, which enabled us to set borders of each PH domain more precisely. In
cases of ambiguous predictions of boundaries, multiple variants of particular domain were
selected. The amino acid sequences of all PH domains were submitted to SMART database
(smart.embl-heidelberg.de/) and NCBI CDS (Conserved Domain Search;
www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi) to obtain the e-value score (Table S2).
Recombinant protein expression
DNA fragments encoding 74 LBDs were codon optimized for expression in E. coli and
synthesized (Entelechon). Sequences encoding the remaining LBDs were cloned either from S.
cerevisiae genomic DNA (14) or cDNA isolated from C. thermophilum (5) (kind gift of N.
Silva Martin, EMBL). Full length sequence of HSV2 was cloned from S. cerevisiae genomic
DNA. C2 domain of lactadherin was cloned from p416-GFP-Lact-C2 vector (Haematologic
technologies). PH domain of CERT was cloned from HA-CERT-pcDNA3.1 plasmid (kind
gift of J. Holthuis, University of Osnabrück), FAPP1 PH domain was cloned from PHFAPP1-pBGP
plasmid (Levine and Munro, 2002). PH domains of OSBP proteins (OSBP1, OSBP2,
OSBPL3, OSBPL7, OSBPL8, OSBPL10 and OSBPL11) were cloned from human cDNA
(Openbiosystems, Table S5B), except PH domain of OSBPL5 that was cloned from
synthetic gene (Entelechon). For sequence details of all primers see Table S5D. Single point
mutations were introduced by the QuikChange Lightning Site-Directed Mutagenesis Kit
(Agilent Technologies). All sequences were cloned into pETM11 vector and proteins were
expressed as N-terminal His6-SUMO3 and C-terminal sfGFP fusions in E. coli (BL21 STAR,
Invitrogen) (Saliba et al., 2014).
Proteins were produced in 5 mL cultures in 24 deep-well plates in auto-inducing ZY media.
Cells were grown at 37 °C up to OD600 ~2, subsequently the temperature was shifted to 15 °C
(synthetic genes) or 18 °C (cloned genes) and proteins were produced overnight for 14-15 h. Cells
were pelleted at 3,000 g for 20 min and washed in cold PBS. Final pellets (volume ~100 μL)
were snap frozen in liquid nitrogen and stored at -80 °C for further use.
Preparation of cell extracts
Cell lysis was performed as described previously (Saliba et al., 2014). Expression level and
protein solubility were evaluated on a Coomassie stained gel and a western blot with anti-GFP
antibody (Miltenyi Biotec GmbH, MACS Molecular). Fluorescence intensity of sfGFP-tagged
protein in the cell extracts was measured on a microplate reader (BioTek) at excitation and
emission wavelengths of 485 nm and 528 nm, respectively. Concentrations of sfGFP-fusions
in the cell extracts were estimated from the fluorescence intensity as described previously
(Saliba et al., 2014). Subsequently, a small amount (final concentration 50 μg/ml) of
streptavidin-AF488 (Life Technologies) was spiked into each cell extract. Cell extracts were
centrifuged for 10 min at 16,000 g at 4°C prior to use.
Protein purification
Protein purification was performed as described previously (Saliba et al., 2014). For the
titration experiments and the circular dichroism (CD) spectroscopy analysis, the gel filtration
elution buffer (10 mM HEPES pH 7.5, 250 mM NaCl) was changed for the assay buffer (10
mM HEPES pH 7.5, 150 mM NaCl) via an overnight dialysis.
Circular dichroism (CD) spectroscopy
CD spectra were collected on a Jasco-815 spectropolarimeter in a quartz cuvette (2 mm path
length) at 20°C with protein concentrations in the range of 4.5-7.0 μM (Table S5B) in the
assay buffer (10 mM HEPES pH 7.5, 150 mM NaCl). Secondary structure analysis was
performed using Jasco Spectra Manager software package. Thermal denaturation experiments
were performed by monitoring the CD signal at 222 nm while heating protein solutions from
20°C to 90°C with a constant rate 1°C per minute.
Liposome microarrays preparation
Fabrication and principle of the liposome microarray-based assay (LiMA) have been
previously described (Saliba et al., 2014). All lipid mixtures were prepared in chloroform-based
solvent (Table S1A) and stored in 1.5 ml glass vials (Sigma) under argon at −20 °C.
The final concentration of each lipid mixture was 0.26 mM. Each lipid mixture was composed
of a carrier lipid (phosphatidylcholine PC, up to 95 mol %), a signalling lipid (up to 10 mol
%) or a combination of signaling lipids (up to 10 mol %), and completed with
phosphatidylethanolamine (PE)-Atto 647 (0.1 mol %) to ensure autofocusing during
automated microscopy, and PE-PEG350 (0.5 mol %) that facilitates the generation of
liposomes. The signalling lipids consisted of phosphoinositides, sphingolipids,
glycerophospholipids and DAG. Altogether 122 different lipid mixtures were used (Table
S1C). For phosphatidylinositol PI(3,5)P2 two variants differing in fatty acyl chains were
probed: dioleoyl [DOPI(3,5)P2] and dipalmitoyl [DPPI(3,5)P2] (Table S1). Unlike liposomes
containing DOPI(3,5)P2, the liposomes containing DPPI(3,5)P2 were not recognized by the
specific PI(3,5)P2 sensor Hsv2 protein. The interactions with DOPI(3,5)P2 represented the
majority of binding events reported for PI(3,5)P2-containing liposomes, therefore only those
were considered for further analysis and only DOPI(3,5)P2 was used in validation
experiments.
A thin agarose layer (TAL) of 250 nm height was prepared by coating coverslips (30x45 mm,
#1; Menzel) with 1% low melting agarose (Sigma) in water. Lipid mixtures were spotted on
the TAL by a syringe-driven spotter (Automatic TLC-spotter4, Camag). Each spot on the
array measured 800 μm × 800 μm, with a distance between two spots of 200 μm (the overall spot
density was 100 spots per 1 cm²). The array consisted of 120 spots in 4 groups of 30 spots on
an area of 10 mm × 18 mm. A polydimethylsiloxane (PDMS) device containing 4 separate
chambers was then bonded onto the array as described before (Saliba et al., 2014). Assembled
microfluidic devices were stored at 4 °C under inert atmosphere before use.
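The quoted spot density can be checked from the spot size and spacing alone. The short sketch below assumes a regular square grid whose pitch is the spot side plus the gap; the function name is illustrative and not part of the original protocol.

```python
# Geometry quoted in the text (all lengths in micrometres).
SPOT_SIDE_UM = 800
SPOT_GAP_UM = 200
UM_PER_CM = 10_000


def spots_per_cm2(spot_side_um, gap_um):
    """Spots per square centimetre, assuming a regular square grid."""
    pitch_um = spot_side_um + gap_um        # centre-to-centre distance
    spots_per_cm = UM_PER_CM // pitch_um    # spots along 1 cm
    return spots_per_cm ** 2


density = spots_per_cm2(SPOT_SIDE_UM, SPOT_GAP_UM)
print(density)  # 100
```

With a 1 mm pitch, 120 spots alone would cover about 120 mm², which fits within the stated 10 mm × 18 mm array area once spacing between the four groups is allowed for.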
The impact of mutations in the organelle PIP binding motif (OBM) was tested on arrays
containing plasma membrane (PM) PIPs [PI(3,4)P2 and PI(4,5)P2] and organelle PIPs
[PI(3)P, PI(4)P, PI(5)P and PI(3,5)P2] of two different concentrations, 5 and 10 mol %
(Figures 5D and E). For the dose response experiments, arrays containing liposomes with
increasing concentrations (from 0 to 7 mol %) of PI(4)P* (porcine brain extract; Table S1A)
and PI(4,5)P2 were used (Figure 5F).
For dose response experiments for validation of cooperative interactions, the liposome arrays
containing liposomes of increasing concentrations (0, 3, 6 and 10 mol %) of two lipids
[PI(4,5)P2/ PI(3,5)P2 and DHS1P/ PHS1P/ PS/DHS/PHS/Phytocer] were used (Figure 3B and
Figure S3B). For combination of PI(4,5)P2 and DHS1P only concentration up to 6 mol %
were used for PI(4,5)P2.
For titration experiments for Kd calculations, the liposome arrays containing various PIPs
[PI(3)P, PI(4)P*, PI(3,5)P2, PI(3,4)P2 and PI(4,5)P2] with a concentration of 3 mol % were
used (Figure S1I).
Measurement of membrane recruitment using LiMA
Giant liposomes (> 5 μm) were spontaneously formed within 5 minutes after injection of the
assay buffer (10 mM Hepes, 150 mM NaCl, pH 7.4) into the microfluidic chambers.
Successful formation of liposomes was confirmed by microscopy. Each liposome spot was
automatically imaged in two fluorescence channels, Atto 647 (3 ms exposure time) for
monitoring the presence of liposomes, and GFP (100 ms exposure time) to check for possible
autofluorescence. After acquisition of the blank images, 20 μL (= volume of one chamber) of
cell extracts were injected in each chamber using a syringe pump (KD Scientific). After 20
minutes incubation, unbound proteins were washed by 4 chamber volumes (80 μL) of assay
buffer. Subsequently, the microfluidic device was coupled to the microscope. Images of every
single liposome spot were acquired automatically.
Each protein domain was tested on the array in multiple replicates. The position of the
liposome spots on the array was shuffled for each replicate to control for potential position
bias. On each array ten liposome spots containing lipid mixture with PE-biotin were placed
on different positions providing a specific ligand for streptavidin-AF488 (Figure S1A).
Positive signal from these control liposomes served as assurance of a successful experimental
protocol.
Image acquisition and image processing
Images were acquired using the same equipment and settings as described previously (Saliba
et al., 2014). One constant exposure time of 3 ms was used for Atto 647 images, representing
the position of liposome membranes, and exposure times of 5 ms, 10 ms, 30 ms, 50 ms, 75 ms
and 100 ms were used for GFP images, representing protein-liposome binding events.
Multiple exposure times for GFP were selected in order to capture a broad range of binding
intensities of tested domains. Three liposome arrays (3x 120 spots) were always prepared in
parallel and imaged in a sequence one after another. The whole procedure of imaging of the
three arrays took about 3 hours altogether.
Images were processed with the same software and settings as described previously (Saliba et
al., 2014). Normalized binding intensity (NBI) values were calculated as described previously
(Saliba et al., 2014).
Titration experiments for Kd calculations
The purified sfGFP fusions of PH domains of Boi2, Swh1 and AKT1 were diluted in the
assay buffer (10 mM HEPES pH 7.5, 150 mM NaCl) to reach the following protein
concentrations (BOI2: 0.2, 0.5, 1, 3, and 7 μM; AKT1: 0.05, 0.1, 0.2, 0.5, 1, 2, 3, 4, and 6 μM;
SWH1: 0.1, 0.2, 0.5, 1, 1.5, 2, 3, 5, and 7 μM). Subsequently, all these dilutions were
separately probed on arrays made of various PIPs-containing liposomes [PI(3)P, PI(4)P*,
PI(3,5)P2, PI(3,4)P2 and PI(4,5)P2] with a constant PIPs concentration of 3 mol % and NBI
values were measured for all interactions (see below). The Kd calculation was performed
using a non-linear regression analysis with Origin 7.5 software.
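As an illustration of what such a regression might look like, the sketch below fits a one-site saturation model, NBI = Bmax · c / (Kd + c), to a titration series. The binding model, the function names and the synthetic data are assumptions made for this example; the text does not state which model was used in Origin. For each trial Kd the best Bmax has a closed form, so only Kd needs a one-dimensional search.

```python
import math


def profile_fit(concs, nbis, kd):
    """For a fixed Kd, return the least-squares Bmax and the residual sum of squares."""
    f = [c / (kd + c) for c in concs]
    bmax = sum(y * fi for y, fi in zip(nbis, f)) / sum(fi * fi for fi in f)
    sse = sum((y - bmax * fi) ** 2 for y, fi in zip(nbis, f))
    return bmax, sse


def fit_kd(concs, nbis, log_lo=-3.0, log_hi=3.0, n_grid=601, n_refine=60):
    """Fit Kd and Bmax: coarse scan over log10(Kd), then golden-section refinement."""
    def sse_at(log_kd):
        return profile_fit(concs, nbis, 10.0 ** log_kd)[1]

    step = (log_hi - log_lo) / (n_grid - 1)
    grid = [log_lo + i * step for i in range(n_grid)]
    best = min(range(n_grid), key=lambda i: sse_at(grid[i]))
    a = grid[max(best - 1, 0)]
    b = grid[min(best + 1, n_grid - 1)]

    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(n_refine):
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if sse_at(c) < sse_at(d):
            b = d
        else:
            a = c
    kd = 10.0 ** ((a + b) / 2.0)
    bmax, _ = profile_fit(concs, nbis, kd)
    return kd, bmax


# Protein concentrations (μM) of the AKT1 series quoted above; the NBI values
# are synthetic, generated from a hypothetical Kd of 0.8 μM and Bmax of 0.5.
AKT1_CONCS_UM = [0.05, 0.1, 0.2, 0.5, 1, 2, 3, 4, 6]
synthetic_nbis = [0.5 * c / (0.8 + c) for c in AKT1_CONCS_UM]

kd_fit, bmax_fit = fit_kd(AKT1_CONCS_UM, synthetic_nbis)
print(round(kd_fit, 2), round(bmax_fit, 2))
```

On real data a dedicated non-linear least-squares routine that also reports parameter uncertainties would be preferable; the scan-and-refine approach is used here only to keep the example dependency-free.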
Assessment of the relationship between the time delay in the imaging and NBI
measurements
The liposome arrays were automatically imaged in the same way, starting from the initial
position (top left, A1) and ending at the final position (bottom right, J12) (Figure S1A). Since
the positions of the individual lipid mixtures were reshuffled between replicates, each
interaction between a particular domain with a liposome type can be assigned to a certain
‘time delay in imaging’ (ie. the time needed for the automatic microscope to reach the
27
position the specific liposome spot starting from the initial position of the array). We have
then calculated the time delay (ie. the difference in ‘time delay in imaging’) between the
respective replicates and correlated these delays with the differences in corresponding NBIs
measured (Figure S1K).
To further assess possible bias introduced by the time delay in imaging, we also performed two
experiments (using PLCD1 and EEA1 proteins) in which we imaged the same liposome
arrays twice, once at time 0 and once two hours later, and then correlated the NBIs
obtained at the two time points (Figure S1L). No decrease in NBIs was detected over the
time ranges used for data collection.
ROC curve-based evaluation of interactions
In order to use only high-quality data for the analysis, all acquired images were visually
inspected, and poor-quality images (no liposomes formed, unfocused images and/or presence
of protein precipitates) were removed. The images were manually annotated as “interaction”,
“no binding” or “dubious”. We performed a ROC analysis (R package ROCR (Sing et al.,
2005)) using the manual annotation in order to extract an NBI threshold that was
subsequently used as the interaction predictor. The threshold was set to an NBI value of 0.037
(interaction if NBI ≥ 0.037, otherwise no binding), giving a true positive rate of 75.2%,
a false positive rate of 3.4%, a precision of 86% and an AUC of the ROC curve of 0.95 (Figure
S1D). The final NBI value for each domain-liposome experiment was calculated as the mean
NBI from all available replicates. The mean NBI values were used for further analysis.
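The rates quoted for the 0.037 cutoff can be computed from a confusion matrix of threshold calls against the manual annotation. The sketch below shows that calculation in Python; it is not the ROCR workflow used in the study, the NBI values and annotations are hypothetical, and images annotated as “dubious” are simply left out, which is an assumption.

```python
def threshold_metrics(nbi_values, is_interaction, threshold=0.037):
    """Rates obtained when an interaction is called for NBI >= threshold."""
    tp = fp = tn = fn = 0
    for nbi, truth in zip(nbi_values, is_interaction):
        called = nbi >= threshold
        if called and truth:
            tp += 1
        elif called and not truth:
            fp += 1
        elif truth:
            fn += 1
        else:
            tn += 1
    return {
        "tpr": tp / (tp + fn),        # recall (sensitivity)
        "fpr": fp / (fp + tn),        # 1 - specificity
        "precision": tp / (tp + fp),
    }

# Hypothetical NBIs with manual annotations (True = "interaction").
nbi_values = [0.210, 0.090, 0.041, 0.030, 0.036, 0.012, 0.005, 0.050]
is_interaction = [True, True, True, True, False, False, False, False]
metrics = threshold_metrics(nbi_values, is_interaction)
print(metrics)
```

Sweeping the threshold over all observed NBI values and plotting "tpr" against "fpr" would trace out the ROC curve from which a cutoff is chosen.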
Assessment of effect of protein concentration and positional bias
NBI values for all images manually annotated as interaction were collected for all replicates
of domains with a more than 1.5-fold difference in protein concentration between replicates.
A Wilcoxon test followed by Bonferroni correction (R package stats) was performed to detect
significant differences between the measured NBIs (the threshold was set to P < 0.05 after
correction).
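For readers unfamiliar with the correction step, a minimal sketch of Bonferroni adjustment follows: each raw P value is multiplied by the number of tests and capped at 1. The raw P values are hypothetical, and the Wilcoxon test that would produce them is not reimplemented here.

```python
def bonferroni(p_values, alpha=0.05):
    """Return (adjusted P value, significant?) for each test."""
    m = len(p_values)
    adjusted = [min(1.0, p * m) for p in p_values]
    return [(p_adj, p_adj < alpha) for p_adj in adjusted]

# Hypothetical raw Wilcoxon P values, one per domain compared.
raw = [0.004, 0.020, 0.300, 0.049]
results = bonferroni(raw)
for p, (p_adj, significant) in zip(raw, results):
    print(f"raw P = {p:.3f}, adjusted P = {p_adj:.3f}, significant: {significant}")
```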
To look for positional bias, we computed the logarithmic value of the ratio between the
number of data points with NBI > 0.037 (i.e., interactions) and the total number of data points
measured at the particular coordinate of the array.
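The positional-bias statistic can be sketched as follows. The base of the logarithm is not stated in the text, so base 10 is an assumption, as are the function name, the handling of coordinates without any interaction, and the example records.

```python
import math
from collections import defaultdict

def position_log_ratios(records, threshold=0.037):
    """Map each array coordinate to log10(#interactions / #measurements)."""
    hits = defaultdict(int)
    total = defaultdict(int)
    for coord, nbi in records:
        total[coord] += 1
        if nbi > threshold:
            hits[coord] += 1
    # A coordinate with zero interactions has no finite logarithm; report None.
    return {
        coord: (math.log10(hits[coord] / total[coord]) if hits[coord] else None)
        for coord in total
    }

# Hypothetical (coordinate, NBI) measurements pooled over replicates.
records = [("A1", 0.10), ("A1", 0.01), ("A1", 0.05), ("A1", 0.02),
           ("J12", 0.00), ("J12", 0.01), ("B3", 0.20)]
ratios = position_log_ratios(records)
print(ratios)
```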
Assessment of data reproducibility
The quantitative reproducibility of the data was computed by calculating the Pearson
correlation between the NBIs of the corresponding replicates (Figure 1B). The qualitative
reproducibility (Figure S1E) was assessed by comparing the annotations of protein-liposome
experiments based on both the NBI threshold and manual annotation. Each domain-liposome
experiment was assigned as interaction or no binding based on the mean NBI value.
Subsequently, the annotations of all individual replicates were compared. For the cases of
inconsistent annotation based on NBI, the manual annotation was used.
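The two reproducibility measures can be illustrated with a short sketch: a Pearson correlation between the NBIs of two replicates (quantitative) and the fraction of experiments given the same threshold-based call in both replicates (qualitative). The replicate values are hypothetical, and the fallback to manual annotation for inconsistent calls is not modelled.

```python
import math

def pearson(x, y):
    """Centered Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def call(nbi, threshold=0.037):
    """Threshold-based annotation of a single measurement."""
    return "interaction" if nbi >= threshold else "no binding"

# Hypothetical NBIs of the same five experiments in two replicates.
rep1 = [0.210, 0.004, 0.050, 0.030, 0.120]
rep2 = [0.190, 0.010, 0.041, 0.045, 0.100]
r = pearson(rep1, rep2)
agreement = sum(call(a) == call(b) for a, b in zip(rep1, rep2)) / len(rep1)
print(f"Pearson r = {r:.3f}, call agreement = {agreement:.2f}")
```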
Comparison of S. cerevisiae and C. thermophilum binding profiles
To assess the similarity of the binding profiles of 27 pairs of S. cerevisiae and C.
thermophilum PH domain orthologs, we computed the Pearson correlation coefficient of the
NBIs measured on the entire dataset (Figure 1C).
Clustering analysis
The uncentered Pearson correlation was used to cluster the liposome types according to the
similarities between their PH domain binding profiles (i.e., what PH domains interacted with
them and with what affinity) using the log(NBI) values of base 1. The liposomes were
clustered using the hclust function from the R package stats. The dissimilarity matrix was
computed using the Dist function from the amap package. The same approach was used for
clustering liposomes containing single signalling lipids and liposomes containing combinations
of signalling lipids. A Fisher test (R package stats) was used to assess the statistical
significance of differences in auxiliary lipid charge distribution between individual subgroups
(Figure 2A).
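The dissimilarity underlying the clustering can be sketched as one minus the uncentered Pearson correlation between two liposome profiles. This Python illustration is not the amap/hclust code used in the study; the profiles are hypothetical, already log-transformed values, and the hierarchical clustering step itself is omitted.

```python
import math

def uncentered_pearson(x, y):
    """Uncentered Pearson correlation: the means are not subtracted."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))

def dissimilarity_matrix(profiles):
    """Pairwise 1 - r_uncentered between profiles (dict: name -> vector)."""
    names = list(profiles)
    return {
        (p, q): 1.0 - uncentered_pearson(profiles[p], profiles[q])
        for p in names for q in names
    }

# Hypothetical log-transformed NBI profiles across three PH domains.
profiles = {
    "PI(4,5)P2": [-0.5, -1.0, -2.0],
    "PI(3,4)P2": [-1.0, -2.0, -4.0],  # same shape as PI(4,5)P2, scaled by 2
    "PI(3)P": [-2.0, -1.0, -0.5],
}
dist = dissimilarity_matrix(profiles)
for pair, d in dist.items():
    print(pair, round(d, 3))
```

Profiles that differ only by a scale factor have a dissimilarity of zero, so liposome types are grouped by the pattern of domains they bind rather than by overall signal strength.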
To further validate the separation of the clusters of plasma membrane PIPs and organelle PIPs
observed in the hierarchical clustering analysis, we performed a Principal Coordinates
Analysis (PCoA) and represented the first two dimensions, MDS1 and MDS2, of the two
different clusters in boxplots (Figure 2C). The PCoA was performed using the implementation
in the R package vegan (Dixon, 2003). Subsequently, we performed a Wilcoxon test between
those two dimensions (MDS1 and MDS2). We further demonstrated the robustness of these
clusters using partitioning around medoids (Hennig, 2010) (Figure S2C) and the silhouette
coefficient (Rousseeuw, 1987) (Figure S2D).
Calculation of the Reproducibility Index (RI)
The reproducibility index (RI) is computed for each interaction that was measured at least
twice. For each interaction between a protein domain and a liposome type, we calculated the
mean NBI of the replicates (NBI) as well as the associated standard error (SE_NBI). We then
calculated the RI of the interaction as follows, where Th is the binding threshold (0.037):
If NBI > Th then
If NBI < Th then
Based on the RI calculation, the experiments have been assigned to four categories: high
confidence interactions (RI ]0;2[), high confidence no binding (RI ]-2;0[), low confidence
interactions (RI ≥ 2) and low confidence no binding (RI ≤ -2).
Cooperativity assessment
To evaluate the role of cooperative associations or rheostasis (i.e., the changes in binding
affinity for one lipid owing to the presence of another lipid) in the recruitment of PH domains
to phosphoinositide-containing membranes, we derived a cooperativity index
(CI = NBI_{L1+L2} / [NBI_{L1} + NBI_{L2}]). An interaction was considered cooperative when
CI > 1, that is, when the binding affinity to liposomes containing two signaling lipids was
stronger than the sum of the binding affinities to liposomes containing the individual lipids
(NBI_{L1+L2} > NBI_{L1} + NBI_{L2} and, to account for technical variability, NBI_{L1+L2} also
had to be higher than the highest value obtained for NBI_{L1} + NBI_{L2} amongst the
replicates). We also observed instances of negative cooperativity
(NBI_{L1+L2} < max{NBI_{L1}; NBI_{L2}}; CI = NBI_{L1+L2} / max{NBI_{L1}; NBI_{L2}} and, to
account for technical variability, NBI_{L1+L2} also had to be lower than the lowest value
obtained for NBI_{L1} or NBI_{L2} amongst the replicates).
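The classification rule above can be sketched in a few lines. Two points are interpretations rather than statements from the text: the “highest value obtained for the sum amongst the replicates” is taken as the sum of the highest single-lipid replicates, and the “lowest value” as the minimum over all single-lipid replicates. Function and variable names, and the replicate NBIs, are hypothetical.

```python
from statistics import mean

def classify_cooperativity(combined, l1, l2):
    """Classify one PH domain / lipid pair from replicate NBIs.

    combined, l1, l2: replicate NBIs for liposomes carrying both lipids,
    lipid 1 only and lipid 2 only. Returns (label, CI).
    Assumes the single-lipid mean NBIs are greater than zero.
    """
    both, m1, m2 = mean(combined), mean(l1), mean(l2)
    # Positive cooperativity: more than additive, and above the most
    # favourable replicate-based sum (assumed reading of the replicate rule).
    ci_pos = both / (m1 + m2)
    if ci_pos > 1 and both > max(l1) + max(l2):
        return "positive", ci_pos
    # Negative cooperativity: below the stronger single lipid, and below
    # every single-lipid replicate (assumed reading of the replicate rule).
    ci_neg = both / max(m1, m2)
    if ci_neg < 1 and both < min(l1 + l2):
        return "negative", ci_neg
    return "none", ci_pos

# Hypothetical replicate NBIs for three domain / lipid-pair combinations.
pos = classify_cooperativity([0.30, 0.34], [0.05, 0.07], [0.04, 0.06])
neg = classify_cooperativity([0.02, 0.03], [0.10, 0.12], [0.05, 0.06])
non = classify_cooperativity([0.09, 0.11], [0.06, 0.08], [0.03, 0.05])
print(pos, neg, non)
```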
Dose-response experiments for 31 randomly selected PH domain-lipid combination pairs
were performed on liposome arrays containing liposomes of increasing concentrations of two
lipids (Figure S3B). The PH domains were tested at the same protein concentration as used in
the PH domain screen where the cooperative/non-cooperative interactions were observed.
The screen results (“predictions”) along with the validation results (the “truth”) were used to
estimate a true positive rate and false negative rate of the cooperativity heatmap.
The PH domains were separated into three cooperativity classes (Figure 4A) based on
manual inspection of data represented in plots in Table S5C.
Principal component analysis of proteins based on the cooperativity index values
Principal component analysis (PCA) was performed using the R package ade4 (Dray and
Dufour, 2007). The PCA was performed on a set of S. cerevisiae PH domains with CI >1 for
at least one liposome type (total 37 PH domains). Only CI values obtained for liposome types
containing PI(4,5)P2:PS (10:10 mol %), PI(4,5)P2:DHS1P (10:10 mol %), or PI(4,5)P2:PHS1P
were considered.
The PH domains were annotated using information on the corresponding full-length
proteins available in the STRING database (Franceschini et al., 2013) and the Yeast GFP fusion
localization database (Huh et al., 2003) (Table S2). From the STRING database, protein-protein
interactions with a high experimental score (> 0.8) were considered. With this filter,
the protein CDC42 was identified as the common and specific interactor of the majority of
proteins containing PH domains with high PC2 values (> quantile 75%) as defined by the PCA,
which correspond to a high CI for PI(4,5)P2:PS liposomes. To account for one
outlier (BUD4), we subsequently refined our filter and also considered interactions whose
STRING text mining score exceeded 0.5. The refined filter identified only
BUD4 (score 0.649) as an additional CDC42 interactor.
The nuclear localization annotation was derived from the Yeast GFP fusion database (nucleus and
nuclear periphery annotations considered) and supplemented with proteins manually curated
as being involved in positive regulation of transcription from RNA polymerase II promoter in
Gene Ontology biological process annotations (http://www.yeastgenome.org).
Search of the organelle PIP binding motifs (OBM)
We first defined two groups of PH domains: the PH domains able to interact with
organelle PIPs (16) represented the foreground set, and all remaining PH domains
(interacting with PM PIPs only, or not interacting with any individual PIPs at all) represented
the background set (74) (Figure S5A). All PH domain sequences from both sets were aligned
(HMMER version 3.0) against the Pfam PH domain seed HMM. Two PH domains were
removed from the foreground set: the AVO1 PH domain (high SMART e-value prediction
score) and the PDK1 PH domain (not aligned properly with the other PH domains). The
Multi-Harmony algorithm (Brandt et al., 2010) was then run on the multiple alignment of the
remaining sequences (14 PH domains of the foreground set and 74 PH domains of the
background set). The algorithm computes how the amino acid content of one group at one
position differs from the amino acid content of the other group and thus identifies
residues/positions in the alignment that are specific for one group. The algorithm proposed
17 best-scoring positions. Out of these 17 positions we selected nine that (i) showed no
overlap in overrepresented amino acids between the foreground and the background, and (ii)
were located inside the predicted PH domain secondary structure.
The sequences of the 90 PH domains tested in this study and 1,115 PH domains extracted from the
UniProt database (http://www.uniprot.org/; UniProt release 2013_11) were ranked according
to a manually defined scoring scheme that works as follows:
1) conservation of OBM residues - 3 points for the two positions of lowest variability (T in
the β1-β2 loop and K in the β7-strand) and 1 point for the other seven. The maximal score for
OBM was 13.
2) conservation of BSM (basic sequence motif) residues - 1 point for each. The maximal
score for BSM was 3.
When summed, OBM and BSM could give a maximum score of 16 (Figure S5B, Table
S5A).
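The scoring scheme above amounts to a weighted count of conserved motif positions. In the sketch below, the ordering of positions, the names and the Boolean input format are assumptions for illustration; deciding whether a residue counts as conserved at a given alignment position is not modelled.

```python
# Weights per OBM position: the two lowest-variability positions (T in the
# beta1-beta2 loop and K in the beta7 strand) come first with 3 points each,
# followed by the seven other positions with 1 point each.
OBM_WEIGHTS = [3, 3, 1, 1, 1, 1, 1, 1, 1]
BSM_WEIGHTS = [1, 1, 1]

def motif_score(obm_conserved, bsm_conserved):
    """Return (OBM score, BSM score, total) from per-position Boolean flags."""
    if len(obm_conserved) != len(OBM_WEIGHTS):
        raise ValueError("expected 9 OBM flags")
    if len(bsm_conserved) != len(BSM_WEIGHTS):
        raise ValueError("expected 3 BSM flags")
    obm = sum(w for w, ok in zip(OBM_WEIGHTS, obm_conserved) if ok)
    bsm = sum(w for w, ok in zip(BSM_WEIGHTS, bsm_conserved) if ok)
    return obm, bsm, obm + bsm

# A domain matching every motif position reaches the maximal scores.
full = motif_score([True] * 9, [True] * 3)
# A hypothetical domain that keeps only one of the two heavily weighted
# OBM positions, three of the other OBM positions and two BSM residues.
partial = motif_score(
    [True, False, True, True, True, False, False, False, False],
    [True, True, False],
)
print(full, partial)
```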
The statistical significance of the difference between NBI values measured for interactions of
PH domains of different scores with liposomes containing either PM or organelle PIPs was
calculated using the Wilcoxon test. The glutamine residue used for lysine/arginine
substitution in PH domains with engineered mutations was selected based on the PH
consensus sequence from the Pfam database (based on 163 PH domains) (Levine and
Munro, 2002). Naturally occurring variants in the OBM were searched for in the COSMIC database
(Forbes et al., 2010) (http://cancer.sanger.ac.uk/cancergenome/projects/cosmic/). The
statistical significance of the differences in NBI values measured for wild-type and mutated
forms of PH domains was calculated using the Wilcoxon test.
Yeast strains
Yeast strains with deleted or depleted lipid-metabolizing genes (CHO1, MSS4ts) were
created in the SGA Y7039 strain (a gift from C. Boone, University of Toronto) derived from
the BY4741 background (Tong and Boone, 2007) (MATα can1Δ::STE2pr-LEU2 lyp1Δ
ura3Δ0 leu2Δ0 his3Δ1 met15Δ0) by standard yeast molecular biology procedures (Janke et
al., 2004). These mutant strains were mated with selected yeast strains from the Yeast GFP Clone
Collection (Invitrogen) (BEM3, BOI1, BOI2, CDC24, CLA4) using robot-facilitated mating,
sporulation and strain selection following the standard SGA protocols (Huh et al., 2003; Tong and
Boone, 2007). The genotypes of the final strains were confirmed by PCR.
CERT-PH, FAPP1-PH, OSBP2-PH, OPY1C-PH and PLCD1-PH domains and their mutants
were cloned into the pRS315-mCherry vector (kind gift of E.C. Hurt, BZH, Heidelberg). The SWH1-PH
domain was cloned from yeast genomic DNA (amino acid sequence as for E. coli
expression) (Table S5D) and then inserted into the pRS315-mCherry vector. Single point
mutations were introduced with the QuikChange Lightning Site-Directed Mutagenesis Kit
(Agilent Technologies). All domains were expressed with a C-terminal mCherry tag from the
ADH1 promoter. For colocalization experiments, a yeast strain with the GFP-tagged trans-Golgi
marker KEX2 (Yeast GFP Clone Collection) was used. The effect of a lowered PI(4)P level in
Golgi membranes was tested with the thermosensitive PIK1ts strain (Audhya et al., 2000) (kind
gift of S. Emr, Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca)
[MATa leu2-3,112 ura3-52 his3-Δ200 trp1-Δ901 lys2-801 suc2-Δ9, pik1::HIS3, harboring
pRS314pik1-83 (Amp, TRP1 CEN6 pik1-83)].
Imaging of yeast strains
Yeast strains harbouring mCherry-tagged PH domains (KEX2-GFP and PIK1ts strains) were
grown in SD medium without tryptophan overnight at 30 °C (KEX2-GFP) or 25 °C (PIK1ts).
Cells were diluted to OD600 < 0.05 and grown for a further ~3 h at the respective temperature. The
cells were adhered to glass slides (BioTek) coated with Concanavalin A (Sigma) and imaged.
The PIK1ts strains tested at the nonpermissive temperature were incubated at 37 °C for 50
minutes prior to imaging. Images were acquired with an Olympus IX-81 microscope with a ×100/NA 1.45
oil objective lens and an Orca-ER camera (Hamamatsu).
For live-cell imaging of yeast strains harbouring both GFP-fused genes and
deletions/depletions of genes encoding lipid metabolic enzymes, cells were inoculated in SD medium
without tryptophan and histidine, additionally supplemented with 1.0 mM ethanolamine in the
case of cho1Δ, and grown overnight at 30 °C. Cells were diluted to OD600 = 0.1, adhered to
Concanavalin A-coated 96-well glass-bottom plates, and imaged on a fully motorized Olympus
fluorescence microscope system (Olympus IX-81) at 30 °C (temperature-controlled
incubator, EMBL manufacture). The mss4ts strains tested at the nonpermissive temperature were
incubated at 37 °C for 50 minutes prior to imaging. Images were acquired using a ×100/NA
1.45 objective lens, a low-noise, highly sensitive ORCA-R camera (Hamamatsu), an MT20 illumination system,
and a Uniblitz Electro-Programmable Shutter system. All acquired images were processed with
ImageJ software.
SUPPLEMENTAL REFERENCES
Audhya, A., Foti, M., and Emr, S.D. (2000). Distinct roles for the yeast phosphatidylinositol
4-kinases, Stt4p and Pik1p, in secretion, cell growth, and organelle membrane dynamics.
Molecular biology of the cell 11, 2673-2689.
Brandt, B.W., Feenstra, K.A., and Heringa, J. (2010). Multi-Harmony: detecting functional
specificity from sequence alignment. Nucleic Acids Res 38, W35-40.
Dixon, P. (2003). VEGAN, a package of R functions for community ecology. Journal of
Vegetation Science 14, 927-930.
Dray, S., and Dufour, A.-B. (2007). The ade4 package: implementing the duality diagram for
ecologists. Journal of statistical software 22, 1-20.
Forbes, S.A., Tang, G., Bindal, N., Bamford, S., Dawson, E., Cole, C., Kok, C.Y., Jia, M.,
Ewing, R., Menzies, A., et al. (2010). COSMIC (the Catalogue of Somatic Mutations in
Cancer): a resource to investigate acquired mutations in human cancer. Nucleic Acids Res 38,
D652-657.
Franceschini, A., Szklarczyk, D., Frankild, S., Kuhn, M., Simonovic, M., Roth, A., Lin, J.,
Minguez, P., Bork, P., von Mering, C., et al. (2013). STRING v9.1: protein-protein
interaction networks, with increased coverage and integration. Nucleic Acids Res 41, D808-
815.
Hennig, C. (2010). fpc: Flexible procedures for clustering. R package version Available:
http://cran.r-project.org/web/packages/fpc/.
Huh, W.K., Falvo, J.V., Gerke, L.C., Carroll, A.S., Howson, R.W., Weissman, J.S., and
O'Shea, E.K. (2003). Global analysis of protein localization in budding yeast. Nature 425,
686-691.
Janke, C., Magiera, M.M., Rathfelder, N., Taxis, C., Reber, S., Maekawa, H., Moreno-
Borchart, A., Doenges, G., Schwob, E., Schiebel, E., et al. (2004). A versatile toolbox for
PCR-based tagging of yeast genes: new fluorescent proteins, more markers and promoter
substitution cassettes. Yeast 21, 947-962.
Levine, T.P., and Munro, S. (2002). Targeting of Golgi-specific pleckstrin homology domains
involves both PtdIns 4-kinase-dependent and -independent components. Current biology : CB
12, 695-704.
Rousseeuw, P.J. (1987). Silhouettes: a graphical aid to the interpretation and validation of
cluster analysis. Journal of computational and applied mathematics 20, 53-65.
Saliba, A.E., Vonkova, I., Ceschia, S., Findlay, G.M., Maeda, K., Tischer, C., Deghou, S., van
Noort, V., Bork, P., Pawson, T., et al. (2014). A quantitative liposome microarray to
systematically characterize protein-lipid interactions. Nat Methods 11, 47-50.
Sing, T., Sander, O., Beerenwinkel, N., and Lengauer, T. (2005). ROCR: visualizing classifier
performance in R. Bioinformatics 21, 3940-3941.
Tong, A.H.Y., and Boone, C. (2007). High-Throughput Strain Construction and Systematic
Synthetic Lethal Screening in Saccharomyces cerevisiae. Methods in Microbiology 36, 369–
386, 706–707.