2019 rokos award research internship report alex welch · 2019. 10. 24. · 2019 rokos award...

2019 Rokos Award Research Internship Report – Alex Welch

This summer I was fortunate to have the opportunity to carry out a research internship at the MRC London Institute of Medical Sciences with Dr Holger Kramer. Without the funding provided by the Rokos Award this placement would not have been possible, so I am extremely grateful for the generous support.

My experience

This opportunity was so valuable because it gave me an insight into the world of academic research and allowed me to experience working on a project where the outcomes of experiments really were unknown. In addition to learning about the complex mass spectrometers in the facility and the way in which the data is analysed, I spent many hours working in the wet lab. This gave me the chance to develop broader lab skills and learn from experienced researchers working in the facility, which will be of great value in all future lab projects I undertake. Also, I had the chance to talk to the researchers about different career routes and hear their views on subjects such as the differences between working in academia and industry.

During my placement I stayed in Imperial College student halls close to Hyde Park. I thoroughly enjoyed my time living in London and luckily for me, the BBC Proms is held every summer in the Royal Albert Hall. This meant I also had the chance to attend many concerts in the evenings from world class orchestras and soloists.

Background

The Biological Mass Spectrometry and Proteomics Facility at MRC LMS utilises advanced mass spectrometry techniques to improve our understanding of how proteins and their complexes are regulated in health and disease. The aspects of proteomics on which the facility focuses include:

• Quantitative analysis of protein phosphorylation and other post-translational modifications (e.g. ubiquitination, hydroxylation)

• Characterisation of dynamic protein-protein interactions

• Chemical proteomics approaches

• Subcellular proteomics

• Analysis of enzymatic activities by mass spectrometry

• Clinical proteomics and biomarker discovery

The research

Project aim: Optimise a method for total protein hydrolysis and amino acid analysis to enable the identification of post-translational modifications.

The lab’s standard workflow for the analysis of a protein sample begins with a digest using the enzyme trypsin. The resulting peptides are loaded onto a nanoflow high-performance liquid chromatography (HPLC) system coupled to an electrospray ionisation mass spectrometer (ESI-MS). We used mass spectrometers containing an ion trap mass analyser called the Orbitrap, which consists of two outer electrodes and a central electrode. Ions entering the Orbitrap are captured through "electrodynamic squeezing", after which they oscillate around the central electrode. Ions of different mass-to-charge ratio (m/z) oscillate at different frequencies, which allows them to be distinguished. The oscillating ions induce an image current on the outer electrodes, and a Fourier transform of this signal gives the oscillation frequencies, from which the mass spectrum is obtained. Orbitrap technology produces incredibly high-resolution, accurate-mass (HRAM) data and revolutionised the MS field when it was first made commercially available in 2005. The amino acid sequences of the tryptic peptides can be determined from the b/y fragment ions produced in the MS/MS scan, in which peptide ions are fragmented in the higher-energy collisional dissociation (HCD) cell, a form of collision-induced dissociation (CID).

Diagram of Thermo Q Exactive Hybrid Quadrupole-Orbitrap mass spectrometer.
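To illustrate how b/y fragment ions relate to a peptide sequence, the sketch below computes singly charged b and y ion m/z values from standard monoisotopic residue masses. It is a minimal illustration rather than the software used in the lab: the function name `by_ions` and the example peptide are hypothetical, and only unmodified residues are handled.

```python
# Standard monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276   # mass of a proton in Da
WATER = 18.010565   # monoisotopic mass of H2O in Da


def by_ions(peptide):
    """Return two dicts mapping ion index to singly charged m/z.

    b[i] contains the first i residues (N-terminal fragment);
    y[i] contains the last i residues plus water (C-terminal fragment).
    """
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    n = len(masses)
    b, y = {}, {}
    for i in range(1, n):
        b[i] = sum(masses[:i]) + PROTON
        y[n - i] = sum(masses[i:]) + WATER + PROTON
    return b, y


if __name__ == "__main__":
    # "SAMPLER" is a hypothetical peptide used only for illustration.
    b, y = by_ions("SAMPLER")
    for i in sorted(b):
        print(f"b{i} = {b[i]:.4f}   y{i} = {y[i]:.4f}")
```

Consecutive b ions (or consecutive y ions) differ by the mass of one residue, which is how a sequence can be read from an MS/MS spectrum. A residue carrying a PTM would show up as a shifted mass difference.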

Using Mascot, a software search engine that uses mass spectrometry data to identify proteins from peptide sequence databases, the proteins present in the sample can be identified. It is possible to identify post-translational modifications (PTMs) by also searching this database for ‘variable modifications’, but unless we know which PTMs to expect, we would have to search for all possible variable modifications at once. This means that the search space is large and there is a high probability of finding a match in the database by chance. The expectation values for matches will therefore be high, and the matches are unlikely to be classed as significant. By hydrolysing the protein into its amino acids and identifying which PTMs are present on these using MS, we would only need to search the database for peptides with the variable modifications found on the amino acids, reducing the search space so that matches are more likely to be significant.


Example of the results of a Mascot search.
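The effect of variable modifications on the search space can be sketched with a simple count. The example below is not how Mascot works internally; it only counts how many modification states a single peptide can take if every residue is either unmodified or carries one applicable modification. The function name, the peptide and the lists of modifications and target residues are illustrative assumptions.

```python
from math import prod


def count_modified_forms(peptide, variable_mods):
    """Count the distinct modification states of one peptide.

    variable_mods maps a modification name to the residues it may occupy.
    Each residue is either unmodified or carries exactly one applicable
    modification, so the options multiply along the sequence.
    """
    return prod(
        1 + sum(1 for targets in variable_mods.values() if aa in targets)
        for aa in peptide
    )


# Hypothetical "search everything" list versus a targeted list.
ALL_MODS = {
    "phospho": "STY",
    "acetyl": "K",
    "methyl": "K",
    "GlyGly (ubiquitination remnant)": "K",
}
TARGETED_MODS = {"phospho": "STY"}

if __name__ == "__main__":
    peptide = "SKTYSK"  # hypothetical peptide
    print(count_modified_forms(peptide, ALL_MODS))       # every mod allowed
    print(count_modified_forms(peptide, TARGETED_MODS))  # only mods found by amino acid analysis
```

Every candidate form is another chance of a random match, so restricting the list to the modifications actually observed on the hydrolysed amino acids shrinks the number of candidates considerably.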

We selected 3 proteins to work with to develop the method: bovine serum albumin (BSA), beta-casein and bovine cytochrome c. These were good candidates as they are relatively inexpensive and easily sourced from a biomolecule supplier. Also, when we searched UniProt to learn more about these proteins (e.g. presence of signal peptides and PTMs), there was literature to suggest that these proteins had PTMs such as serine phosphorylation. We first carried out the standard tryptic digest procedure (FASP protocol) and analysed the samples using a Thermo LTQ Velos Orbitrap LC-MS instrument. Surprisingly, the Mascot database search told us that the ‘pure’ beta-casein bought from the biomolecule supplier actually contained a mixture of proteins (in very small amounts). Although this wasn’t an issue for our project, it demonstrated the sensitivity of the instrument and highlighted the importance of being aware of the biomolecule supplier’s criteria for determining purity.

After reading several papers suggesting different protocols for hydrolysing proteins into amino acids, we decided to use pronase, a commercially available mixture of proteases isolated from the extracellular fluid of Streptomyces griseus (a bacterium). Using a protocol involving molecular weight cut-off filters and several rounds of centrifugation, we could digest the proteins into amino acids and separate the amino acids from any undigested protein. To use the nanoflow LC-MS method used by the lab for proteomics, we first needed to carry out a dansyl derivatisation reaction with the amino acids. A literature search indicated that this reaction is sensitive to pH, and we found a paper that suggested a particular pH range. We therefore carried out this reaction at pH 9.4 and pH 9.9, analysed the samples using LC-MS and compared the extracted ion chromatograms for each amino acid at the 2 pHs. This was carried out using Thermo Xcalibur Qual Browser software and we concluded that the lower pH provided better peak shapes and normalised levels.

An example showing the extracted ion chromatograms for glycine, with pH 9.4 on the left.

We then compared the data from the dansyl derivatised (pH 9.4) amino acids analysed by nanoflow LC-MS with data from unmodified amino acids analysed by a hydrophilic interaction liquid chromatography (HILIC) MS method. This was a method frequently used by a postdoc in the lab who worked in metabolomics, and it uses much higher flow rates than are typical for proteomics. The HILIC approach produced improved signal & peak shape overall, although we saw better separation of closely related amino acids using the nanoflow method. After discussion, we concluded that the HILIC MS method would be best for the project.

The Thermo Orbitrap Q-Exactive instrument with Vanquish UHPLC we used for the HILIC MS method.

Optimising HILIC MS method

The next step was to optimise this method for amino acid analysis. We investigated the effect of pH on the data (pH 3.5 and pH 4.6) and found that some amino acids had better normalised levels and peak shape at the higher pH whereas others were better at the lower, showing no consistent pattern. The extracted ion chromatograms were obtained using Thermo Scientific Freestyle software, searching the amino acid masses automatically by applying a template XML document containing all the masses we were investigating. This dramatically increased the speed of analysis compared to the method used to compare the effect of pH on the dansyl derivatisation.

Shown below are the extracted ion chromatograms of amino acids in Thermo Scientific Freestyle, followed by a comparison of these at pH 3.5 and pH 4.6 for threonine, glycine, serine, citrulline & glutamic acid.
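The idea behind extracting ion chromatograms for a list of target masses can be sketched in a few lines. This is not the Freestyle template mechanism itself: the dictionary of targets stands in for the template of masses, the scan data structure and the 5 ppm tolerance are assumptions, and the scans in the usage example are made up purely to show the input shape.

```python
PROTON = 1.007276  # mass of a proton in Da

# Monoisotopic masses (Da) of three free amino acids; [M+H]+ is mass + proton.
AMINO_ACID_MASS = {
    "glycine": 75.03203,
    "serine": 105.04259,
    "threonine": 119.05824,
}
TARGET_MZ = {name: mass + PROTON for name, mass in AMINO_ACID_MASS.items()}


def ppm_window(mz, ppm):
    """Return the (low, high) m/z bounds for a tolerance given in ppm."""
    delta = mz * ppm / 1e6
    return mz - delta, mz + delta


def extracted_ion_chromatogram(scans, target_mz, ppm=5.0):
    """Sum the intensity inside the m/z window for every scan.

    scans is a list of (retention_time, [(mz, intensity), ...]) tuples.
    Returns a list of (retention_time, summed_intensity).
    """
    low, high = ppm_window(target_mz, ppm)
    return [
        (rt, sum(inten for mz, inten in peaks if low <= mz <= high))
        for rt, peaks in scans
    ]


if __name__ == "__main__":
    # Made-up scans, only to demonstrate the expected input format.
    scans = [
        (1.0, [(76.0393, 100.0), (76.1000, 50.0)]),
        (1.1, [(76.0394, 300.0)]),
        (1.2, [(90.0000, 10.0)]),
    ]
    for name, mz in TARGET_MZ.items():
        print(name, extracted_ion_chromatogram(scans, mz))
```

Looping over a list of targets in this way is what makes a template faster than typing each mass into the software by hand.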

We then experimented with changing various parts of the protocol such as pronase concentration and incubation time. As pronase also contains phosphatases, we ran a set of these samples with phosphatase inhibitor cocktails to try to retain phosphoamino acids (phosphorylation being one of the most important PTMs). The data suggested that autodigestion was substantial at higher concentrations of pronase. This is a major issue for us because we wouldn’t know whether the amino acids detected came from our digested proteins of interest or from components of pronase. We decided that the best approach was a lower pronase concentration with a longer, 90 hr incubation time. We then compared the data between the 24 hr and 90 hr incubation times (using Freestyle software), with and without the phosphatase inhibitors. Surprisingly, we found that amino acid signals were lower with the phosphatase inhibitors than without at each time point. This suggests that the phosphatase inhibitors also inhibited the proteolytic activity of the pronase. Furthermore, we checked for the phosphoamino acids in negative ion mode on the MS and did not see peaks at the expected m/z values, suggesting that the phosphatase inhibitor cocktails were ineffective at preventing the loss of phosphorylation. We also compared the amino acid signals obtained by pronase digestion of the intact proteins with those obtained by pronase digestion of the tryptic digests of the proteins. We found that the tryptic digest followed by incubation with pronase produced the highest normalised amino acid levels. This may be because in the trypsin digest protocol we reduced the disulfide bridges in the proteins (BSA in particular contains many), allowing pronase to hydrolyse the peptide bonds more effectively.

To allow quantitative analysis of the data, we used Xcalibur software to integrate the areas under the peaks of the extracted ion chromatograms. By also integrating the areas of the peaks for each of the amino acids in the standard mix (where all amino acids are present at 10 µM) we could construct a 2-point calibration curve to estimate the concentration of the amino acids produced from the proteolytic activity of pronase. We then subtracted the concentration of each amino acid found in the control sample (pronase & buffer) from the corresponding concentration in the samples with protein and pronase, as the amino acids in the control would be a result of pronase autodigestion. This allowed us to compare the concentrations of amino acids with what would be expected from the protein sequence (by finding the % composition of each amino acid using the online tools UniProt and proteomics toolkit). Using Microsoft Excel, we produced graphs that showed that for certain amino acids the concentration was highly similar to the expected value, whereas for others it was very different. It is possible that subtracting the background amino acid concentration in the pronase control did not properly account for the rate of pronase autodigestion in the samples containing protein. For example, there may have been a lower rate of pronase autodigestion in the protein samples than in the blank, as there may be a lower probability that the pronase enzymes would make contact with each other in solution. This may explain why the estimated concentrations did not match particularly well with the expected values from the protein sequence.

A screenshot of the work in Excel to produce the graphs comparing the experimental concentrations of amino acids with the predicted concentrations.
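The quantification steps described above (calibration against the 10 µM standard, subtraction of the pronase control, comparison with the sequence composition) can be sketched as follows. This is an illustration under stated assumptions, not the actual spreadsheet: the two calibration points are assumed to be the origin and the 10 µM standard, negative values after subtraction are clamped to zero, and all function names, peak areas and the sequence in the usage example are hypothetical.

```python
from collections import Counter

STANDARD_UM = 10.0  # each amino acid in the standard mix is at 10 µM


def concentration(area, standard_area, standard_conc=STANDARD_UM):
    """Two-point calibration: a straight line through the origin and the standard."""
    return area / standard_area * standard_conc


def background_corrected(sample_areas, control_areas, standard_areas):
    """Concentration in the protein sample minus the pronase-only control."""
    corrected = {}
    for aa, std_area in standard_areas.items():
        sample = concentration(sample_areas.get(aa, 0.0), std_area)
        control = concentration(control_areas.get(aa, 0.0), std_area)
        corrected[aa] = max(sample - control, 0.0)  # clamping is an assumption
    return corrected


def mole_fractions(values):
    """Normalise a dict of amounts so that the values sum to 1."""
    total = sum(values.values())
    return {key: value / total for key, value in values.items()}


def expected_fractions(sequence, amino_acids):
    """Expected mole fraction of each listed amino acid from a protein sequence."""
    counts = Counter(aa for aa in sequence if aa in amino_acids)
    return mole_fractions({aa: counts.get(aa, 0) for aa in amino_acids})


if __name__ == "__main__":
    # Hypothetical peak areas and sequence, for illustration only.
    standard = {"G": 1000.0, "S": 2000.0}
    sample = {"G": 500.0, "S": 1000.0}
    control = {"G": 100.0, "S": 0.0}
    measured = background_corrected(sample, control, standard)
    print(mole_fractions(measured))
    print(expected_fractions("GGSS", ["G", "S"]))
```

Comparing mole fractions rather than absolute concentrations removes the dependence on how much protein was loaded. It would not correct for the concern raised above, that autodigestion may proceed at a different rate in the control than in the protein samples.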

The future of the project

PTMs of particular interest occur in histone proteins, as these can impact gene expression by altering chromatin structure or recruiting histone modifiers. Lysine residues in histones often have PTMs such as acetylation & methylation. We investigated whether PTMs on lysine would survive the incubation step with pronase. To test this, we incubated 5 lysine derivative standards (Hydroxylysine, Acetyl-Lysine, Dimethyl Lysine, Trimethyl Lysine & Lysine(Butyryl)-OH) with pronase/buffer and also set up negative controls (lysine derivatives in buffer). At this stage, my time with the facility came to an end but the project is still in progress.

Typical PTMs in histone proteins.
