a novel adaptive background subtraction ......ground is complex, partially resolved and, commonly,...
TRANSCRIPT
TO DOWNLOAD A COPY OF THIS POSTER VISIT WWW.WATERS.COM/POSTERS ©2006 Waters Corporation
m/z1360 1380 1400 1420 1440 1460 1480
%
0
100
%
0
1001463.0
1370.9
1362.9
1412.91388.9
1382.9
1398.91444.9
1424.9 1436.9 1454.91473.9
1463.0
1370.91412.9
1388.9 1445.0
OVERVIEW PURPOSE Reduction of chemical noise using a novel ‘Adaptive’ Background Subtraction (ABS) algorithm. METHODS Statistical analysis of the local intensity distribution for each nominal mass region. Estimation and removal of intensity distribution due to repetitive background. RESULTS Comparison of databank search results. a) MALDI-MS PMF 500 attomoles Enolase –1 Peptide coverage without ABS……. 26% Peptide coverage with ABS………. 42% b) LC-ESI-MS “Protein Expression System” E. Coli Digest. Proteins identified >95% confidence without ABS.. 57 Proteins identified >95% confidence with ABS…...105
A NOVEL ADAPTIVE BACKGROUND SUBTRACTION ALGORITHM FOR REDUCTION OF CHEMICAL NOISE IN MASS SPECTRA.
Martin R Green1, Keith Richardson1,Martin Palmer1, Richard Denny1, Iain Campuzano1, John Skilling2. 1Waters Corporation, Manchester, England; 2Maximum Entropy Data Consultants Ltd, Kenmare, Ireland.
ABS SUBTRACTION METHOD A unique ‘background’ intensity distribution is calculated for each nominal mass bin in the mass spectrum. The method used to calculate the background distribution for a single nominal mass channel M is as follows. 1. Spectra are divided (conceptually) into m nominal mass
channels centered on nominal mass channel M. (Figure 1) 2. Each of the m nominal mass channels are further sub-
divided into n discreet channels. 3. The intensity data within the first of the n subdivisions for
each of the m nominal mass channels is collected into a single data set and an intensity value corresponding to the 50% quantile calculated. (Figure 2)
4. This procedure is repeated for each of the corresponding n subdivisions for each of the m nominal mass channels defining a new intensity distribution.
5. The calculated intensity distribution is subtracted from the input data centred on the nominal mass channel M. (Figure 3).
RESULTS-MALDI PMF MALDI PMF (Peptide Mass Fingerprint) search results (SwissProt protein database) from 500 attomoles of Yeast Enolase tryptic digest were compared with and without application of adaptive background subtraction. Both PLGS and MASCOT® (Matrix Science, UK) )search engines were used through the ‘ProteinLynx’ Browser. Figure 4 A. Shows the MALDI MS spectra from 500 attomoles of Yeast Enolase-1 tryptic digest. Figure 4 B. Shows the same spectra after application of the adaptive background subtraction algorithm. Figure 5 A. Shows an expanded region of the spectrum shown in figure 4A prior to subtraction. Figure 5 B. Shows an expanded region of the spectrum shown in figure 4B after subtraction. Successful protein identification by MALDI-PMF, at low concentrations is often limited by the presence of background chemical noise. In this example the adaptive background subtraction algorithm efficiently reduces the strongly periodic chemical noise improving identification confidence.
CONCLUSION
• The Adaptive Background Subtraction (ABS) algorithm is capable of dramatically reducing background with short range (~ 1Da) periodicity and longer range intensity variation.
• The algorithm adapts to the nature of the
background throughout the mass spectrum resulting in optimum signal to noise throughout the acquired mass range.
• For very complex protein digest samples
application of ABS consistently improves peptide coverage and protein identification confidence from protein databank searches.
ASMS 2006
INTRODUCTION For many mass spectrometric techniques detection limits are re-stricted by the presence of chemical background. The precise nature of this background is rarely known. In Electrospray, chemical background may arise from clustering of solvent and analyte ions. In MALDI, chemical background may arise from matrix cluster ions. In these techniques the chemical back-ground is complex, partially resolved and, commonly, periodic in nature. Here we describe a novel background subtraction algorithm capable of reducing chemical noise in mass spectra. Improvements in signal to noise are illustrated for complex pro-tein digest samples using both MALDI and ESI modes .
EXPERIMENTAL All results obtained using a Waters® Q-TofTM Premier mass spec-trometer. MALDI: Sample—500 attomoles Yeast Enolase tryptic digest. Matrix—Alpha cyano-4-hydroxy-cinnamic acid. SwissProt protein databank. ESI: Sample mixture— 4ug of E. coli tryptic digest, 500fmol Yeast Enolase, 500fmol Bovine Serum Albumin,500fmol Yeast Alco-hol Dehydrogenase,500fmol Rabbit Glycogen Phosphorylase. Chemistry—Waters Atlantis® 300µm x 15cm dC18. Conditions—Flow rate 4µL/min Gradient - 40% acetonitrile (0.1% formic acid) over 90mins. Custom databank containing: E. coli proteins, 12,885 random protein sequences, yeast alcohol dehydrogenase , rabbit gly-cogen phosphorylase, bovine serum albumin and yeast enolase. Processing: Waters ProteinLynxTM Global SERVER 2.2.5 (PLGS), and Waters Protein Expression Informatics.
0
2
4
6
8
10
12
14
10 20 30 40 50 60 70
Intensity %
Freq
uenc
y
50% intensity quantile
Figure 1. Example of an ESI mass spectrum subdivided into 9 nominal mass channels centered on nominal mass channel M. Intensity data from the first of the n sub-divisions is shown in figure 2.
Figure 2. Distribution of intensities produced after collapsing data from first sub-division of each nominal mass channel shown in Figure 1.
646 646.5 647 647.5 648 648.5 649
-50
0
50
100
150
200
250
300
350
400
450
500
647 647.2 647.4 647.6 647.8
Fig 2 100%
Inte
nsity
m/z Figure 3. Portion of ESI mass spectrum showing input data, calculated background distribution and data after subtraction (Inset).
Figure 4. MALDI-MS 500 attomoles Yeast Enolase tryptic digest. A = be-
m/z1000 1200 1400 1600 1800 2000 2200 2400
%
0
100
%
0
100 1066.1
1175.81463.0 1841.91658.1
1066.1
1175.81463.01277.1 1841.9
1658.1 2442.1
Figure 6 A. Shows the PLGS databank search results without adaptive background subtraction. Figure 6 B. Shows the PLGS databank search results after adaptive background subtraction. The signal to noise ratio of the spectra submitted for database searching is clearly im-proved.
Figure 6. PLGS database search A. = without ABS. B. = With ABS.
A
B
Search Engine Number of Peptides NO ABS
Number of Peptides WITH ABS
PGLS 10 (26%) 19 (42%)
MASCOT 6 (17%) 10 (30%)
Table 1 Shows a summary of the PMF results for Yeast Enolase –1 obtained using the PLGS and MASCOT search en-gines. The number of peptides identified and the percentage peptide coverage is shown. In both cases application of adap-tive background subtraction resulted in significant improve-ment.
Table 1. Summary of PLGS and MASCOT databank search results for the protein Yeast Enolase -1 with and without ABS.
RESULTS LC-ESI-MS LC-ESI-MS data was acquired from a sample of E. Coli tryptic digest containing four non E. Coli standard protein digests spiked at low concentrations, using the Waters Protein Expression System. Precursor and Fragment data were acquired by alternating between low (MS) and elevated (MSE) collision energy. The data was processed using Waters Protein Expression System Informatics before and after application of ABS. Results were searched using the PLGS search engine against a custom databank containing E. Coli proteins, the four spiked proteins and a number of randomly generated protein sequences. Figure 6. Shows the PLGS search result window for the Protein Expression analysis using ABS.
Figure 6. PGLS database search results. Protein expression with ABS.
0
10
20
30
40
50
60
70
80
ECol
i_Pr
otei
n_17
8630
4_py
ruva
te_d
ehyd
roge
nase
_(de
carb
oxyla
s...
ECol
i_Pr
otei
n_00
5_PH
S2_R
ABIT
_Glyc
ogen
_pho
spho
ryla
se__
mu.
..
ECol
i_Pr
otei
n_17
8973
8_G
TP-b
indi
ng_p
rote
in_c
hain
_elo
ngat
ion_
f...
ECol
i_Pr
otei
n_00
6_AL
BU_B
OVI
N_S
erum
_alb
umin
_pre
curs
or_(
Al...
ECol
i_Pr
otei
n_17
8619
6_ch
aper
one_
Hsp7
0;_D
NA_b
iosy
nthe
sis;_
...
ECol
i_Pr
otei
n_17
8714
0_30
S_rib
osom
al_s
ubun
it_pr
otei
n_S1
ECol
i_Pr
otei
n_17
9058
6_G
roEL
_cha
pero
ne_H
sp60
_pep
tide-
depe
...
ECol
i_Pr
otei
n_17
9036
1_gl
ycer
ol_k
inas
e
ECol
i_Pr
otei
n_17
9014
4_try
ptop
hana
se
ECol
i_Pr
otei
n_17
8993
4_gl
utam
ate_
deca
rbox
ylase
_iso
zym
e
ECol
i_Pr
otei
n_17
8768
4_al
dehy
de_d
ehyd
roge
nase
_NAD
-link
ed
ECol
i_Pr
otei
n_17
8834
1_gl
ucon
ate-
6-ph
osph
ate_
dehy
drog
enas
e_...
ECol
i_Pr
otei
n_17
8630
7_lip
oam
ide_
dehy
drog
enas
e_(N
ADH)
;_co
...
ECol
i_Pr
otei
n_17
8664
0_tri
gger
_fac
tor;_
a_m
olec
ular
_cha
pero
ne_i
...
ECol
i_Pr
otei
n_17
8693
9_cit
rate
_syn
thas
e
ECol
i_Pr
otei
n_17
9044
5_iso
citra
te_l
yase
ECol
i_Pr
otei
n_17
8738
1_iso
citra
te_d
ehyd
roge
nase
_spe
cific
_for
_NAD
P
ECol
i_Pr
otei
n_17
8914
1_en
olas
e
ECol
i_Pr
otei
n_17
8890
2_se
rine_
hydr
oxym
ethy
ltran
sfer
ase
ECol
i_Pr
otei
n_17
9046
6_pe
ripla
smic
_mal
tose
-bin
ding
_pro
tein
;_su
...
ECol
i_Pr
otei
n_17
9041
2_pr
otei
n_ch
ain_
elon
gatio
n_fa
ctor
_EF-
Tu_.
..
ECol
i_Pr
otei
n_17
8694
8_su
ccin
yl-C
oA_s
ynth
etas
e_be
ta_s
ubun
it
ECol
i_Pr
otei
n_17
8929
4_ph
osph
oglyc
erat
e_kin
ase
ECol
i_Pr
otei
n_17
8857
2_gl
ycer
opho
spho
dies
ter_
phos
phod
iest
era.
..
ECol
i_Pr
otei
n_17
8929
3_fru
ctos
e-bi
spho
spha
te_a
ldol
ase_
class
_II
ECol
i_Pr
otei
n_00
2_AD
H1_Y
EAST
_Alc
ohol
_deh
ydro
gena
se_I
_(E.
..
ECol
i_Pr
otei
n_17
8847
3_ga
lact
ose-
bind
ing_
trans
port_
prot
ein;
_rec
...
ECol
i_Pr
otei
n_17
8807
9_gl
ycer
alde
hyde
-3-p
hosp
hate
_deh
ydro
ge...
ECol
i_Pr
otei
n_17
8963
2_m
alat
e_de
hydr
ogen
ase
ECol
i_Pr
otei
n_17
8920
9_ho
mol
og_o
f_pe
ctin
_deg
radi
ng_e
nzym
e_...
ECol
i_Pr
otei
n_17
9019
2_D-
ribos
e_pe
ripla
smic_
bind
ing_
prot
ein
ECol
i_Pr
otei
n_17
8636
6_pr
otei
n_ch
ain_
elon
gatio
n_fa
ctor
_EF-
Ts
ECol
i_Pr
otei
n_17
8697
0_ph
osph
oglyc
erom
utas
e_1
ECol
i_Pr
otei
n_17
8920
8_2-
deox
y-D
-glu
cona
te_3
-deh
ydro
gena
se
ECol
i_Pr
otei
n_17
8815
6_2-
keto
-3-d
eoxy
gluc
onat
e_6-
phos
phat
e_al
...
ECol
i_Pr
otei
n_17
9083
6_hy
pero
smot
ically
_ind
ucib
le_p
erip
lasm
ic_...
ECol
i_Pr
otei
n_17
8682
2_al
kyl_
hydr
oper
oxid
e_re
duct
ase_
C22
_sub
...
ECol
i_Pr
otei
n_17
8973
9_30
S_rib
osom
al_s
ubun
it_pr
otei
n_S7
_ini
ti...
ECol
i_Pr
otei
n_17
8875
7_PT
S_sy
stem
_glu
cose
-spe
cific_
IIA_c
om...
ECol
i_Pr
otei
n_17
8758
4_th
iol_
pero
xidas
e
ECol
i_Pr
otei
n_17
9003
8_pr
otei
n_ex
port;
_mol
ecul
ar_c
hape
rone
;_...
ECol
i_Pr
otei
n_17
9064
7_50
S_rib
osom
al_s
ubun
it_pr
otei
n_L9
ECol
i_Pr
otei
n_17
8748
9_DN
A-bi
ndin
g_pr
otei
n_HL
P-II_
(HU_
BH2_
...
ECol
i_Pr
otei
n_17
9069
1_or
f_hy
poth
etic
al_p
rote
in
ECol
i_Pr
otei
n_17
8969
7_50
S_rib
osom
al_s
ubun
it_pr
otei
n_L1
5
ECol
i_Pr
otei
n_23
6733
4_50
S_rib
osom
al_s
ubun
it_pr
otei
n_L1
1
ECol
i_Pr
otei
n_17
8927
1_in
_gly
cine_
cleav
age_
com
plex
_car
rier_
o...
ECol
i_Pr
otei
n_17
8992
5_or
f_hy
poth
etic
al_p
rote
in
ECol
i_Pr
otei
n_17
9041
8_50
S_rib
osom
al_s
ubun
it_pr
otei
n_L7
/L12
ECol
i_Pr
otei
n_17
8992
6_or
f_hy
poth
etic
al_p
rote
in
ECol
i_Pr
otei
n_17
8957
7_50
S_rib
osom
al_s
ubun
it_pr
otei
n_L2
1
ECol
i_Pr
otei
n_17
8860
1_or
f_hy
poth
etic
al_p
rote
in
Pept
ide
cove
rage
%
Without ABS With ABS
Figure 7. Proteins identified with >95% confidence.
Figure 7. Shows a Venn diagram illustrating the number of proteins identified (>95% confidence). Approximately twice as many proteins were confidently identified when ABS was used compared to processing without ABS.
Figure 8. Shows comparison of peptide coverage for each of the 53 proteins common to the data sets shown in Figure 7. Consistently higher peptide coverage is reported after process-ing with ABS .
Figure 8. Comparison of peptide coverage assigned proteins. • = With ABS
• = Without ABS
Table 2 Shows a summary of the PLGS search results (> 95% confidence) for the four spiked protein standards. The number of peptides identified and the percentage peptide coverage is shown. Using ABS all four target proteins were identified. Without ABS only three were identified. In addition the results after processing with ABS show significantly higher peptide coverage for those proteins identified.
A
B
A
B
Figure 5. MALDI-MS 500 attomoles Yeast Enolase tryptic digest. A = be-fore ABS. B = After ABS
5 Proteins
105 Proteins 53 Proteins
100 %
Inte
nsity
m/z 643 644 645 646 647 648 649 650 651 652
M
Protein Number of Peptides NO ABS
Number of Peptides WITH ABS
BSA 4 (6%) 14 (24%)
PhosB 5 (8%) 14 (22%)
ADH 3 (9%) 7 (25%)
Enolase X 5 (15%) Table 2. Summary of PLGS search results (> 95% confidence) for protein standards spiked into E Coli Tryptic digest.