
TO DOWNLOAD A COPY OF THIS POSTER VISIT WWW.WATERS.COM/POSTERS ©2006 Waters Corporation

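[Figure 5 plot: two panels of an expanded MALDI-MS spectrum, x-axis m/z 1360 to 1480, y-axis relative intensity (0 to 100%). Labelled peaks in the first panel: 1362.9, 1370.9, 1382.9, 1388.9, 1398.9, 1412.9, 1424.9, 1436.9, 1444.9, 1454.9, 1463.0, 1473.9. Labelled peaks in the second panel: 1370.9, 1388.9, 1412.9, 1445.0, 1463.0.]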

OVERVIEW

PURPOSE
Reduction of chemical noise using a novel ‘Adaptive’ Background Subtraction (ABS) algorithm.

METHODS
Statistical analysis of the local intensity distribution for each nominal mass region. Estimation and removal of intensity distribution due to repetitive background.

RESULTS
Comparison of databank search results.
a) MALDI-MS PMF, 500 attomoles Enolase-1
   Peptide coverage without ABS: 26%
   Peptide coverage with ABS: 42%
b) LC-ESI-MS “Protein Expression System”, E. coli digest
   Proteins identified >95% confidence without ABS: 57
   Proteins identified >95% confidence with ABS: 105

A NOVEL ADAPTIVE BACKGROUND SUBTRACTION ALGORITHM FOR REDUCTION OF CHEMICAL NOISE IN MASS SPECTRA.

Martin R Green¹, Keith Richardson¹, Martin Palmer¹, Richard Denny¹, Iain Campuzano¹, John Skilling².
¹Waters Corporation, Manchester, England; ²Maximum Entropy Data Consultants Ltd, Kenmare, Ireland.

ABS SUBTRACTION METHOD
A unique ‘background’ intensity distribution is calculated for each nominal mass bin in the mass spectrum. The method used to calculate the background distribution for a single nominal mass channel M is as follows.

1. Spectra are divided (conceptually) into m nominal mass channels centred on nominal mass channel M (Figure 1).

2. Each of the m nominal mass channels is further subdivided into n discrete channels.

3. The intensity data within the first of the n subdivisions of each of the m nominal mass channels are collected into a single data set, and the intensity value corresponding to the 50% quantile (the median) is calculated (Figure 2).

4. This procedure is repeated for each of the corresponding n subdivisions of each of the m nominal mass channels, defining a new intensity distribution.

5. The calculated intensity distribution is subtracted from the input data centred on the nominal mass channel M (Figure 3).
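The five steps above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names (`abs_background`, `abs_subtract`, `_locate`), the default values of m and n, the 1 Da channel width, the convention that channel k spans [k - 0.5, k + 0.5), and the treatment of empty subdivisions are all hypothetical choices that the poster does not specify.

```python
import numpy as np


def _locate(mz, n, width=1.0):
    """Return (nominal channel index, subdivision index 0..n-1) per m/z value.

    Assumption: channel k spans [k - 0.5, k + 0.5) in units of `width`.
    """
    pos = np.asarray(mz, dtype=float) / width + 0.5
    channel = np.floor(pos).astype(int)
    sub = np.minimum((n * (pos - channel)).astype(int), n - 1)
    return channel, sub


def abs_background(mz, intensity, M, m=9, n=20, quantile=0.5, width=1.0):
    """Estimate the n-point background distribution for nominal channel M."""
    intensity = np.asarray(intensity, dtype=float)
    channel, sub = _locate(mz, n, width)
    # Steps 1 and 2: m channels centred on M, each split into n subdivisions.
    in_window = np.abs(channel - M) <= m // 2
    background = np.zeros(n)
    for j in range(n):
        # Steps 3 and 4: pool subdivision j across the window, take the quantile.
        values = intensity[in_window & (sub == j)]
        if values.size:  # assumption: an empty subdivision contributes 0
            background[j] = np.quantile(values, quantile)
    return background


def abs_subtract(mz, intensity, m=9, n=20, quantile=0.5, width=1.0):
    """Subtract a locally estimated background from every nominal channel."""
    intensity = np.asarray(intensity, dtype=float)
    channel, sub = _locate(mz, n, width)
    result = intensity.copy()
    for M in np.unique(channel):
        background = abs_background(mz, intensity, M, m, n, quantile, width)
        selected = channel == M
        # Step 5: subtract the background value of each point's subdivision.
        result[selected] -= background[sub[selected]]
    return result
```

Because the median is taken across several neighbouring nominal masses, a peak that appears in only one channel has little influence on the estimate, while background that repeats every nominal mass unit is captured and removed. Recomputing the estimate for each channel M is what lets the background follow longer-range intensity changes across the spectrum.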

RESULTS-MALDI PMF
MALDI PMF (Peptide Mass Fingerprint) search results (SwissProt protein database) from 500 attomoles of Yeast Enolase tryptic digest were compared with and without application of adaptive background subtraction. Both the PLGS and MASCOT® (Matrix Science, UK) search engines were used through the ‘ProteinLynx’ Browser.

Figure 4A shows the MALDI-MS spectrum from 500 attomoles of Yeast Enolase-1 tryptic digest. Figure 4B shows the same spectrum after application of the adaptive background subtraction algorithm. Figure 5A shows an expanded region of the spectrum shown in Figure 4A prior to subtraction. Figure 5B shows an expanded region of the spectrum shown in Figure 4B after subtraction.

Successful protein identification by MALDI-PMF at low concentrations is often limited by the presence of background chemical noise. In this example the adaptive background subtraction algorithm efficiently reduces the strongly periodic chemical noise, improving identification confidence.

CONCLUSION

• The Adaptive Background Subtraction (ABS) algorithm is capable of dramatically reducing background with short-range (~1 Da) periodicity and longer-range intensity variation.

• The algorithm adapts to the nature of the background throughout the mass spectrum, resulting in optimum signal to noise throughout the acquired mass range.

• For very complex protein digest samples, application of ABS consistently improves peptide coverage and protein identification confidence from protein databank searches.

ASMS 2006

INTRODUCTION
For many mass spectrometric techniques detection limits are restricted by the presence of chemical background. The precise nature of this background is rarely known. In Electrospray, chemical background may arise from clustering of solvent and analyte ions. In MALDI, chemical background may arise from matrix cluster ions. In these techniques the chemical background is complex, partially resolved and, commonly, periodic in nature.

Here we describe a novel background subtraction algorithm capable of reducing chemical noise in mass spectra. Improvements in signal to noise are illustrated for complex protein digest samples using both MALDI and ESI modes.

EXPERIMENTAL
All results were obtained using a Waters® Q-Tof™ Premier mass spectrometer.

MALDI: Sample: 500 attomoles Yeast Enolase tryptic digest. Matrix: alpha-cyano-4-hydroxycinnamic acid. SwissProt protein databank.

ESI: Sample mixture: 4 µg of E. coli tryptic digest, 500 fmol Yeast Enolase, 500 fmol Bovine Serum Albumin, 500 fmol Yeast Alcohol Dehydrogenase, 500 fmol Rabbit Glycogen Phosphorylase. Chemistry: Waters Atlantis® 300 µm x 15 cm dC18. Conditions: flow rate 4 µL/min; gradient: 40% acetonitrile (0.1% formic acid) over 90 min. Custom databank containing: E. coli proteins, 12,885 random protein sequences, yeast alcohol dehydrogenase, rabbit glycogen phosphorylase, bovine serum albumin and yeast enolase.

Processing: Waters ProteinLynx™ Global SERVER 2.2.5 (PLGS), and Waters Protein Expression Informatics.

[Figure 2 plot: histogram of frequency (y-axis, 0 to 14) against intensity % (x-axis, 10 to 70), with the 50% intensity quantile marked.]

Figure 1. Example of an ESI mass spectrum subdivided into 9 nominal mass channels centred on nominal mass channel M. Intensity data from the first of the n subdivisions are shown in Figure 2.

Figure 2. Distribution of intensities produced after collapsing data from the first subdivision of each nominal mass channel shown in Figure 1.

[Figure 3 plot: intensity against m/z. Tick labels recovered: m/z 646 to 649 and 647 to 647.8; intensity -50 to 500.]

Figure 3. Portion of ESI mass spectrum showing input data, calculated background distribution and data after subtraction (inset).

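Figure 4. MALDI-MS 500 attomoles Yeast Enolase tryptic digest. A = before ABS. B = after ABS.

[Figure 4 plot: two panels, x-axis m/z 1000 to 2400, y-axis relative intensity (0 to 100%). Labelled peaks in the first panel: 1066.1, 1175.8, 1463.0, 1658.1, 1841.9. Labelled peaks in the second panel: 1066.1, 1175.8, 1277.1, 1463.0, 1658.1, 1841.9, 2442.1.]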

Figure 6A shows the PLGS databank search results without adaptive background subtraction. Figure 6B shows the PLGS databank search results after adaptive background subtraction. The signal to noise ratio of the spectra submitted for database searching is clearly improved.

Figure 6. PLGS database search. A = without ABS. B = with ABS.

Search Engine    Number of Peptides, NO ABS    Number of Peptides, WITH ABS
PLGS             10 (26%)                      19 (42%)
MASCOT           6 (17%)                       10 (30%)

Table 1 shows a summary of the PMF results for Yeast Enolase-1 obtained using the PLGS and MASCOT search engines. The number of peptides identified and the percentage peptide coverage are shown. In both cases application of adaptive background subtraction increased both measures: PLGS rose from 10 peptides (26% coverage) to 19 (42%), and MASCOT from 6 (17%) to 10 (30%).

Table 1. Summary of PLGS and MASCOT databank search results for the protein Yeast Enolase-1 with and without ABS.
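The size of the gain in Table 1 can be expressed as a relative increase in the number of identified peptides. The short Python sketch below uses only the peptide counts reported in Table 1; the names `table1` and `relative_gain` are illustrative and do not come from the poster.

```python
# Peptide counts from Table 1: (without ABS, with ABS).
table1 = {"PLGS": (10, 19), "MASCOT": (6, 10)}


def relative_gain(before, after):
    """Fractional increase in identified peptides after applying ABS."""
    return (after - before) / before


gains = {engine: relative_gain(*counts) for engine, counts in table1.items()}
for engine, gain in gains.items():
    print(f"{engine}: {gain:.0%} more peptides with ABS")
# PLGS: 90% more peptides with ABS
# MASCOT: 67% more peptides with ABS
```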

RESULTS LC-ESI-MS
LC-ESI-MS data was acquired from a sample of E. coli tryptic digest containing four non-E. coli standard protein digests spiked at low concentrations, using the Waters Protein Expression System. Precursor and fragment data were acquired by alternating between low (MS) and elevated (MSE) collision energy. The data was processed using Waters Protein Expression System Informatics before and after application of ABS. Results were searched using the PLGS search engine against a custom databank containing E. coli proteins, the four spiked proteins and a number of randomly generated protein sequences.

Figure 6 shows the PLGS search result window for the Protein Expression analysis using ABS.

Figure 6. PLGS database search results. Protein expression with ABS.

[Figure 8 plot: peptide coverage % (y-axis, 0 to 80) for each protein (x-axis). The x-axis tick labels are databank entry names, for example EColi_Protein_1789141_enolase; many are truncated with "...".]

[Figure 7 Venn diagram: set labels "Without ABS" and "With ABS".]

Figure 7. Proteins identified with >95% confidence.

Figure 7 shows a Venn diagram illustrating the number of proteins identified (>95% confidence). Approximately twice as many proteins were confidently identified when ABS was used compared to processing without ABS.

Figure 8 shows a comparison of peptide coverage for each of the 53 proteins common to the data sets shown in Figure 7. Consistently higher peptide coverage is reported after processing with ABS.

Figure 8. Comparison of peptide coverage for assigned proteins, with ABS and without ABS.

Table 2 shows a summary of the PLGS search results (> 95% confidence) for the four spiked protein standards. The number of peptides identified and the percentage peptide coverage are shown. Using ABS, all four target proteins were identified; without ABS, only three were identified (Enolase was not). In addition, the results after processing with ABS show higher peptide coverage for the proteins identified in both cases; for example, BSA coverage rises from 6% to 24%.


Figure 5. MALDI-MS 500 attomoles Yeast Enolase tryptic digest. A = before ABS. B = after ABS.

[Figure 7 Venn diagram region labels: 5 Proteins, 105 Proteins, 53 Proteins.]

[Figure 1 plot: intensity (%) against m/z 643 to 652, with the central nominal mass channel marked M.]

Protein    Number of Peptides, NO ABS    Number of Peptides, WITH ABS
BSA        4 (6%)                        14 (24%)
PhosB      5 (8%)                        14 (22%)
ADH        3 (9%)                        7 (25%)
Enolase    X                             5 (15%)

Table 2. Summary of PLGS search results (> 95% confidence) for protein standards spiked into E. coli tryptic digest.


720001724EN
