2019 Rokos Award Research Internship Report – Alex Welch
24 October 2019
This summer I was fortunate to have the opportunity to carry out a research internship at the MRC London Institute of Medical
Sciences with Dr Holger Kramer. Without the funding provided by the Rokos Award this placement would not have been
possible, so I am extremely grateful for the generous support.
My experience
This opportunity was so valuable because it gave me an insight into the world of
academic research and allowed me to experience working on a project where the
outcomes of experiments really were unknown. In addition to learning about the
complex mass spectrometers in the facility and the way in which the data is
analysed, I spent many hours working in the wet lab. This gave me the chance to
develop broader lab skills and learn from experienced researchers working in the
facility, which will be of great value in all future lab projects I undertake. Also, I had
the chance to talk to the researchers about different career routes and hear their
views on subjects such as the differences between working in academia and industry.
During my placement I stayed in Imperial College student halls close to Hyde Park. I
thoroughly enjoyed my time living in London and luckily for me, the BBC Proms is
held every summer in the Royal Albert Hall. This meant I also had the chance to
attend many concerts in the evenings from world class orchestras and soloists.
Background
The Biological Mass Spectrometry and Proteomics Facility at MRC LMS
utilises advanced mass spectrometry techniques to improve our
understanding of how proteins and their complexes are regulated in health
and disease. The aspects of proteomics on which the facility focuses include:
• Quantitative analysis of protein phosphorylation and other post-
translational modifications (e.g. ubiquitination, hydroxylation)
• Characterisation of dynamic protein-protein interactions
• Chemical proteomics approaches
• Subcellular proteomics
• Analysis of enzymatic activities by mass spectrometry
• Clinical proteomics and biomarker discovery
The research
Project aim: Optimise a method for total protein hydrolysis and amino acid analysis to enable the identification of
post-translational modifications.
The lab’s standard workflow for the analysis of a protein sample first involves digesting the protein with the enzyme
trypsin. The resulting peptides are loaded on a nanoflow high-performance liquid chromatography (HPLC) system paired with an
electrospray ionisation mass spectrometer (ESI-MS). We used mass spectrometers containing an ion trap mass analyser called
the Orbitrap, which consists of two outer electrodes and a central electrode. Ions entering the Orbitrap are captured through
“electrodynamic squeezing”, after which they oscillate around the central electrode. As ions of different mass-to-charge ratio
(m/z) oscillate at different frequencies, the ions are separated. The oscillating ions induce an image current on the outer
electrodes, and the mass spectrum is obtained by Fourier transform of this frequency signal. Orbitrap
technology produces very high-resolution, accurate-mass (HRAM) data and revolutionised the MS field when it was first
made commercially available in 2005. The amino acid sequences of the tryptic peptides can be determined from the b/y
fragment ions produced in the MS/MS scan, where peptide ions are fragmented in the higher-energy collisional dissociation
(HCD) cell, a form of collision-induced dissociation (CID).
Diagram of Thermo Q Exactive Hybrid Quadrupole-Orbitrap mass spectrometer.
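As a rough illustration of how b/y fragment ions relate to a peptide sequence, the sketch below computes singly charged b- and y-ion m/z values from standard monoisotopic residue masses. The function name `by_ions` and the example peptide `PEPTIDE` are assumptions made for illustration; this is not the search software's implementation and it ignores modifications and higher charge states.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276   # mass of a proton (Da)
WATER = 18.010565   # monoisotopic mass of H2O (Da)


def by_ions(peptide):
    """Return (b, y) lists of singly charged fragment m/z values.

    b[i] contains the first i+1 residues; y[i] contains the last i+1
    residues plus water. The peptide is assumed to be unmodified.
    """
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b, y = [], []
    for i in range(1, len(peptide)):
        b.append(sum(masses[:i]) + PROTON)
        y.append(sum(masses[-i:]) + WATER + PROTON)
    return b, y


# Hypothetical example peptide, not one from the project.
b_ions, y_ions = by_ions("PEPTIDE")
for i, (b_mz, y_mz) in enumerate(zip(b_ions, y_ions), start=1):
    print(f"b{i} = {b_mz:.4f}   y{i} = {y_mz:.4f}")
```

The mass difference between consecutive b ions (or consecutive y ions) equals one residue mass, which is how the sequence is read from the MS/MS spectrum.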
Using Mascot, a software search engine that matches mass spectrometry data against protein sequence
databases, the proteins present in the sample can be identified. It is possible to identify post-translational modifications (PTMs)
by also searching for ‘variable modifications’, but unless we know which PTMs to expect, we would have to search
for all possible PTMs at once. This makes the search space large, so there is a high
probability of finding a match by chance. The expectation values for matches will therefore be high,
and the matches are unlikely to be classed as significant. By hydrolysing the protein into its amino acids and using MS to
identify which PTMs are present on them, we would only need to search for peptides carrying the variable modifications
found on the amino acids, reducing the search space so that matches are more likely to be significant.
Example of the results of a Mascot search.
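The effect of variable modifications on search space can be sketched with a simple counting model. The code below is an assumption-laden illustration, not how Mascot counts or scores candidates: each residue is taken to be either unmodified or to carry one applicable modification, and the modification lists and the peptide are hypothetical.

```python
from math import prod


def count_candidates(peptide, variable_mods):
    """Count the modified forms of one peptide under a simple model.

    variable_mods maps a modification name to the residues it can occupy.
    Each residue is either unmodified or carries exactly one applicable
    modification, so the count is a product over residues.
    """
    options_per_residue = []
    for aa in peptide:
        n_mods = sum(1 for residues in variable_mods.values() if aa in residues)
        options_per_residue.append(1 + n_mods)
    return prod(options_per_residue)


# Illustrative modification lists (assumptions, not the lab's search settings).
broad = {
    "Phospho": "STY", "Acetyl": "K", "Methyl": "KR", "Dimethyl": "KR",
    "Trimethyl": "K", "Oxidation": "MP", "GlyGly": "K",
}
narrow = {"Phospho": "STY"}

peptide = "SKTYSMKR"  # hypothetical peptide
print(count_candidates(peptide, broad))   # many candidate forms
print(count_candidates(peptide, narrow))  # far fewer candidate forms
```

Under this model the number of candidates grows multiplicatively with every modification allowed, which is why restricting the search to modifications actually observed on the free amino acids shrinks the search space so sharply.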
We selected three proteins with which to develop the method: bovine serum albumin (BSA), beta-casein and bovine cytochrome c.
These were good candidates as they are relatively inexpensive and easily sourced from a biomolecule supplier. Also, when we
searched UniProt to learn more about these proteins (e.g. the presence of signal peptides and PTMs), there was literature to suggest
that these proteins had PTMs such as serine phosphorylation. We first carried out the standard tryptic digest procedure (FASP
protocol) and analysed the samples using a Thermo LTQ Velos Orbitrap LC-MS instrument. Surprisingly, the Mascot database
search showed that the ‘pure’ beta-casein bought from the biomolecule supplier actually contained a mixture of proteins (in very
small amounts). Although this wasn’t an issue for our project, it demonstrated the sensitivity of the instrument and highlighted
the importance of being aware of the biomolecule supplier’s criteria for determining purity.
After reading several papers suggesting different protocols for hydrolysing proteins into amino acids, we decided to use pronase,
a commercially available mixture of proteases isolated from the extracellular fluid of Streptomyces griseus (a bacterium). Using a
protocol involving molecular weight cut-off filters and several rounds of centrifugation, we could carry out the digestion of the
proteins into amino acids and separate the amino acids from any undigested protein. To use the nanoflow LC-MS method used
by the lab for proteomics, we first needed to carry out a dansyl derivatisation reaction with the amino acids. A literature search
indicated that this reaction is sensitive to pH and we found a paper that suggested a particular pH range. Therefore, we
carried out this reaction at pH 9.4 and pH 9.9, analysed the samples using LC-MS and compared the extracted ion
chromatograms for each amino acid at the two pH values. This was carried out using Thermo Xcalibur Qual Browser software and we
concluded that the lower pH provided better peak shapes and normalised levels.
An example showing the extracted ion chromatograms for glycine, with pH 9.4 on the left.
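For readers unfamiliar with extracted ion chromatograms, the idea can be sketched in a few lines: for each scan, sum the intensity of peaks whose m/z falls within a narrow tolerance of the target. The scan data below are synthetic and the 5 ppm tolerance is an assumption; this is not how Xcalibur is implemented. The target value is the calculated [M+H]+ of dansyl-glycine (C14H16N2O4S).

```python
def extract_ion_chromatogram(scans, target_mz, tol_ppm=5.0):
    """Sum intensity within +/- tol_ppm of target_mz for each scan.

    scans: iterable of (retention_time_min, [(mz, intensity), ...]).
    Returns a list of (retention_time_min, summed_intensity).
    """
    half_width = target_mz * tol_ppm / 1e6
    xic = []
    for rt, peaks in scans:
        total = sum(i for mz, i in peaks if abs(mz - target_mz) <= half_width)
        xic.append((rt, total))
    return xic


# Synthetic scans for illustration only (not data from the report).
scans = [
    (1.0, [(309.0904, 1.0e4), (310.2000, 5.0e3)]),
    (1.1, [(309.0903, 8.0e4), (309.2000, 2.0e3)]),
    (1.2, [(309.0905, 3.0e4)]),
]
DANSYL_GLYCINE_MH = 309.0904  # calculated [M+H]+ of dansyl-glycine

xic = extract_ion_chromatogram(scans, DANSYL_GLYCINE_MH)
apex_rt = max(xic, key=lambda point: point[1])[0]
print(xic, apex_rt)
```

A narrow tolerance is only useful with accurate-mass data; on a lower-resolution instrument the window would need to be much wider and would admit more interfering ions.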
We then compared the data from the nanoflow LC-MS method using dansyl-derivatised (pH 9.4) amino acids with data produced
from unmodified amino acids using a hydrophilic interaction liquid chromatography (HILIC) MS method. This was a method
frequently used by a postdoc in the lab who worked in metabolomics, and it uses much higher flow rates than are typical for
proteomics. The HILIC approach produced better signal and peak shape overall, although we saw better separation of
closely related amino acids using the nanoflow method. After discussion, we concluded that the HILIC MS method would be best
for the project.
The Thermo Orbitrap Q-Exactive instrument with Vanquish UHPLC we used for the HILIC MS method.
Optimising HILIC MS method
The next step was to optimise this method for amino acid analysis. We investigated the effect of pH on the data (pH 3.5 and pH
4.6) and found that some amino acids had better normalised levels and peak shape at the higher pH whereas others were better
at the lower pH, with no consistent pattern. The extracted ion chromatograms were obtained using Thermo Scientific Freestyle
software, searching the amino acid masses automatically by applying a template XML document containing all the masses we
were investigating. This dramatically increased the speed of analysis compared with the approach used to assess the effect of pH
on the dansyl derivatisation.
The extracted ion chromatograms of amino acids in Thermo Scientific Freestyle, followed by a comparison of these
at pH 3.5 and pH 4.6 for threonine, glycine, serine, citrulline and glutamic acid.
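A target list of masses like the one held in the template can be derived from molecular formulas. The sketch below computes monoisotopic [M+H]+ and [M-H]- values for the amino acids named in the caption above; the helper names are assumptions, the formula parser handles only simple formulas without brackets, and it says nothing about the format of the Freestyle template itself.

```python
import re

# Monoisotopic masses of the elements needed here (Da).
ELEMENT_MASS = {
    "C": 12.0, "H": 1.00782503, "N": 14.00307401,
    "O": 15.99491462, "P": 30.97376200, "S": 31.97207117,
}
PROTON = 1.00727647


def monoisotopic_mass(formula):
    """Monoisotopic mass of a simple formula such as 'C2H5NO2'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ELEMENT_MASS[element] * (int(count) if count else 1)
    return total


def ion_mz(formula, polarity="+"):
    """m/z of the singly charged [M+H]+ ('+') or [M-H]- ('-') ion."""
    mass = monoisotopic_mass(formula)
    return mass + PROTON if polarity == "+" else mass - PROTON


TARGETS = {
    "threonine": "C4H9NO3", "glycine": "C2H5NO2", "serine": "C3H7NO3",
    "citrulline": "C6H13N3O3", "glutamic acid": "C5H9NO4",
}
for name, formula in TARGETS.items():
    print(f"{name:14s} [M+H]+ = {ion_mz(formula):.4f}")
```

The same function with `polarity="-"` gives the values to look for in negative ion mode, for example for phosphoserine (C3H8NO6P).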
We then experimented with changing various parts of the protocol, such as pronase concentration and incubation time. As
pronase also contains phosphatases, we ran a set of these samples with phosphatase inhibitor cocktails to try to retain
phosphoamino acids (phosphorylation being one of the most important PTMs). The data suggested that higher concentrations of
pronase led to substantial autodigestion. This is a major issue for us because we would not know whether the amino acids
detected came from our digested proteins of interest or from components of pronase. We decided the best method was a lower
pronase concentration with a 90 hr incubation time. We then compared the data between the 24 hr and 90 hr incubation times
(using Freestyle software), with and without the phosphatase inhibitors. Surprisingly, we found that amino acid signals were
lower with the phosphatase inhibitors than without at each time point. This suggests that the phosphatase inhibitors also
inhibited the proteolytic activity of the pronase. Furthermore, we checked for the phosphoamino acids in negative ion mode on
the MS and did not see peaks at the expected m/z values, suggesting that the phosphatase inhibitor cocktails were ineffective at
preventing the loss of phosphorylation. We also compared the amino acid signals after pronase digestion of the intact proteins
with those after pronase digestion of the tryptic digests of the proteins. We found that the tryptic digest followed by
incubation with pronase produced the highest normalised amino acid levels. This may be because in the trypsin digest protocol
we reduced the disulfide bridges in the proteins (BSA in particular contains many), allowing pronase to hydrolyse the peptide
bonds more effectively.
To allow quantitative analysis of the data, we used Xcalibur software to integrate the areas under the peaks of the extracted ion
chromatograms. By also integrating the peak areas for each of the amino acids in the standard mix (where all amino
acids are present at 10 µM), we could construct a 2-point calibration curve to estimate the concentration of the amino acids
released by the proteolytic activity of pronase. We then subtracted the concentration of each amino acid found in the
control sample (pronase and buffer) from the corresponding concentration in the samples containing protein and pronase, as
the amino acids in the control would be a result of pronase autodigestion. This allowed us to compare the concentrations of
amino acids with what would be expected from the protein sequence (by finding the % composition of each amino acid using the
online tools UniProt and Proteomics Toolkit). Using Microsoft Excel, we produced graphs showing that for certain amino acids
the concentration was very close to the expected value, whereas for others it was very different. It is possible that subtracting
the background amino acid concentration in the pronase control did not properly account for the rate of pronase
autodigestion in the samples containing the proteins. For example, there may have been a lower rate of pronase
autodigestion in the protein samples than in the control, as the pronase enzymes may be less likely to
encounter each other in solution when substrate protein is present. This may explain why the estimated concentrations did not
match particularly well with the expected values from the protein sequence.
A screenshot of the work in Excel to produce the graphs comparing the experimental concentrations of amino acids with the
predicted concentrations.
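The quantification steps described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the spreadsheet used in the project: the two calibration points are assumed to be the origin and the 10 µM standard, the function names are hypothetical, and any numbers passed in would be peak areas exported from the integration software.

```python
from collections import Counter

STANDARD_CONC_UM = 10.0  # each amino acid in the standard mix (from the report)


def area_to_conc(area, standard_area):
    """Two-point calibration, assumed to be a line through the origin
    and the 10 uM standard. Returns a concentration in uM."""
    return STANDARD_CONC_UM * area / standard_area


def background_corrected(sample_area, control_area, standard_area):
    """Concentration in the protein sample minus that in the
    pronase-only control (the autodigestion background)."""
    return (area_to_conc(sample_area, standard_area)
            - area_to_conc(control_area, standard_area))


def expected_fractions(sequence):
    """Mole fraction of each residue expected from a protein sequence."""
    counts = Counter(sequence)
    return {aa: n / len(sequence) for aa, n in counts.items()}


def observed_fractions(concs):
    """Normalise corrected concentrations so that they sum to 1."""
    total = sum(concs.values())
    return {aa: c / total for aa, c in concs.items()}


def compare(sequence, concs):
    """Return {residue: (expected, observed)} for residues that were measured."""
    expected = expected_fractions(sequence)
    observed = observed_fractions(concs)
    return {aa: (expected.get(aa, 0.0), observed[aa]) for aa in concs}
```

Comparing mole fractions rather than absolute concentrations removes the dependence on how much protein was digested, but it does not correct for the possibility raised above that autodigestion differs between the control and the protein samples.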
The future of the project
PTMs of particular interest occur in histone proteins as these can impact gene expression by altering chromatin structure or
recruiting histone modifiers. Lysine residues in histones often have PTMs such as acetylation and methylation. We investigated
whether PTMs on lysine amino acids would survive the incubation step with pronase. To test this, we incubated 5 lysine derivative
standards (Hydroxylysine, Acetyl-Lysine, Dimethyl Lysine, Trimethyl Lysine & Lysine(Butyryl)-OH) with pronase/buffer and also
set up negative controls (Lysine derivatives in buffer). At this stage, my time with the facility came to an end but the project is
still in progress.
Typical PTMs in histone proteins.
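To show what distinguishing these lysine derivatives by mass involves, the sketch below computes the [M+H]+ m/z of each standard from the monoisotopic mass of lysine plus the mass shift of the modification, and estimates the resolving power needed to separate two of them. The function names are assumptions, the values are calculated rather than measured, and "Lysine(Butyryl)-OH" is taken to mean butyryl-lysine.

```python
LYSINE_MASS = 146.10553  # monoisotopic mass of lysine, C6H14N2O2 (Da)
PROTON = 1.00728

# Monoisotopic mass shifts relative to unmodified lysine (Da).
MASS_SHIFT = {
    "hydroxylysine": 15.99491,     # +O
    "dimethyl-lysine": 28.03130,   # +C2H4
    "acetyl-lysine": 42.01056,     # +C2H2O
    "trimethyl-lysine": 42.04695,  # +C3H6
    "butyryl-lysine": 70.04186,    # +C4H6O
}


def mh_plus(name):
    """Calculated m/z of the singly charged positive ion.

    Trimethyl-lysine carries a fixed positive charge; treating it as
    [M+H]+ of the neutral zwitterion gives the same m/z.
    """
    return LYSINE_MASS + MASS_SHIFT[name] + PROTON


def resolving_power_needed(name_a, name_b):
    """Approximate m / delta-m required to separate two ions."""
    mz_a, mz_b = mh_plus(name_a), mh_plus(name_b)
    return min(mz_a, mz_b) / abs(mz_a - mz_b)


for name in MASS_SHIFT:
    print(f"{name:18s} {mh_plus(name):.4f}")
print(round(resolving_power_needed("acetyl-lysine", "trimethyl-lysine")))
```

Acetyl-lysine and trimethyl-lysine share the same nominal mass and differ by only about 0.036 Da, so telling them apart by mass alone relies on accurate-mass measurement.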