Use of SEQUEST search results with ProteoRed.org MIAPE Extractor

HPP Sp-HPP, La Cristalera, Miraflores de la Sierra, 10-11 December 2012


TRANSCRIPT

Page 1: Use of SEQUEST search results with ProteoRed MIAPE Extractor

La Cristalera, Miraflores de la Sierra, 10-11 December 2012

HPP Sp-HPP

Use of SEQUEST search results with ProteoRed.org MIAPE Extractor

Page 2: Use of SEQUEST search results with ProteoRed MIAPE Extractor

INDEX

1. A working workflow to extract MIAPE information from Proteome Discoverer 1.3 search results using the ProteoRed MIAPE Toolkit
Óscar Gallardo, Joan Villanueva, Montserrat Carrascal, Joaquín Abián

2. Data dependent acquisition using inclusion list (IL)
Joan Villanueva, Óscar Gallardo, Joaquín Abián, Montserrat Carrascal

Page 3: Use of SEQUEST search results with ProteoRed MIAPE Extractor

Ó. Gallardo

MASCOT WORKFLOW

[Workflow diagram. Labels: RAW, MGF; Mass Spectra Identification: Mascot; Output file: mzIdentML; MIAPE Generation (MIAPE MS, MIAPE MSI): MIAPE Extractor, MIAPE Generator Tool]

Page 4: Use of SEQUEST search results with ProteoRed MIAPE Extractor

Ó. Gallardo

PROTEOME DISCOVERER WORKFLOW

[Workflow diagram. Labels: RAW, MSF, MGF; Mass Spectra Identification: Proteome Discoverer; Output file: mzIdentML; MIAPE Extractor]

Page 5: Use of SEQUEST search results with ProteoRed MIAPE Extractor

Ó. Gallardo

PROTEOME DISCOVERER WORKFLOW

[Workflow diagram. Labels: RAW, MGF]

(GPL) LP-CSIC/UAB 2011-2012

Page 6: Use of SEQUEST search results with ProteoRed MIAPE Extractor

Ó. Gallardo

PROTEOME DISCOVERER WORKFLOW

[Workflow diagram. Labels: RAW, MGF; Proteome Discoverer; Discoverer Daemon]

Page 7: Use of SEQUEST search results with ProteoRed MIAPE Extractor

Ó. Gallardo

PROTEOME DISCOVERER WORKFLOW

[Workflow diagram. Labels: RAW, MSF, MGF; Mass Spectra Identification: Proteome Discoverer, Discoverer Daemon; Output file: mzIdentML; MIAPE Extractor]

Page 8: Use of SEQUEST search results with ProteoRed MIAPE Extractor

ProCon 0.9.152

A. Medina August 2012

Ó. Gallardo

PROTEOME DISCOVERER WORKFLOW

[Workflow diagram. Labels: MSF, mzIdentML]

Page 9: Use of SEQUEST search results with ProteoRed MIAPE Extractor

ProCon 0.9.162

Ó. Gallardo

PROTEOME DISCOVERER WORKFLOW

[Workflow diagram. Labels: MSF, .Prot.XML, mzIdentML]

...........................................................67% finished
.....................TaxID for organismName unknown: Leptospira interrogans serogroup Icterohaemorrhagiae serovar Lai
.TaxID for organismName unknown: Sphaerochaeta globosa
...TaxID for organismName unknown: Leptospira borgpetersenii serovar
.....MyProgressBar for getSpectrumIdentificationListAndProteinDetectionListAndPeptideEvidences for SEQ finished
SequenceCollection written
CV term for unknown modification Deamidated / +0.984 Da (N, Q) not found.
CV term for unknown modification Acetyl / +42.011 Da (Any NTerminus) not found.
Exception in thread "AWT-EventQueue-0" java.lang.NullPointerException
    at de.mpc.Prot2MzIdent.AParamHandler.createThresholdParameterList(AParamHandler.java:526)
    at de.mpc.Prot2MzIdent.PD12ToMzIdentML.getProteinDetectionProtocol(PD12ToMzIdentML.java:851)

ERROR!!

1. ProCon 0.9.162 was unable to correctly interpret the Controlled Vocabulary (CV) used by Proteome Discoverer to identify Post Translational Modifications (PTMs)

2. ProCon 0.9.162 also had problems with its internal array references
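The two "CV term ... not found" lines name modifications that do have Unimod entries (Deamidated is UNIMOD:7, Acetyl is UNIMOD:1). The following is a minimal sketch, not part of ProCon, that scans console output like the one above and attaches a Unimod accession where one is known. The function name, the lookup table and the regular expression are assumptions made for illustration.

```python
import re

# Unimod accessions for the two modifications named in the log.
# A real lookup would cover the full Unimod list; this table is illustrative only.
UNIMOD = {
    "Deamidated": "UNIMOD:7",
    "Acetyl": "UNIMOD:1",
}

# Matches e.g. "CV term for unknown modification Acetyl / +42.011 Da (Any NTerminus) not found."
PATTERN = re.compile(
    r"CV term for unknown modification (?P<name>\S+) / "
    r"(?P<delta>[+-]\d+\.\d+) Da \((?P<sites>[^)]*)\) not found\."
)


def unresolved_modifications(log_text):
    """Return one dict per 'unknown modification' message found in the log."""
    found = []
    for match in PATTERN.finditer(log_text):
        name = match.group("name")
        found.append({
            "name": name,
            "delta_da": float(match.group("delta")),
            "sites": match.group("sites"),
            "unimod": UNIMOD.get(name),  # None when the name is not in the table
        })
    return found


log = (
    "SequenceCollection written"
    "CV term for unknown modification Deamidated / +0.984 Da (N, Q) not found."
    "CV term for unknown modification Acetyl / +42.011 Da (Any NTerminus) not found."
)
mods = unresolved_modifications(log)
for mod in mods:
    print(mod)
```

Because the pattern does not rely on line breaks, it also works on output where the messages run together, as in the captured text.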

Page 10: Use of SEQUEST search results with ProteoRed MIAPE Extractor

ProCon 0.9.163

Ó. Gallardo

PROTEOME DISCOVERER WORKFLOW

[Workflow diagram. Labels: MSF, .Prot.XML, mzIdentML]

...........................................................67% finished
.....................TaxID for organismName unknown: Leptospira interrogans serogroup Icterohaemorrhagiae serovar Lai
.TaxID for organismName unknown: Sphaerochaeta globosa
...TaxID for organismName unknown: Leptospira borgpetersenii serovar
.....MyProgressBar for getSpectrumIdentificationListAndProteinDetectionListAndPeptideEvidences for SEQ finished
SequenceCollection written
CV term for unknown modification Deamidated / +0.984 Da (N, Q) not found.
CV term for unknown modification Acetyl / +42.011 Da (Any NTerminus) not found.
Exception in thread "AWT-EventQueue-0" java.lang.NullPointerException
    at de.mpc.Prot2MzIdent.AParamHandler.createThresholdParameterList(AParamHandler.java:526)
    at de.mpc.Prot2MzIdent.PD12ToMzIdentML.getProteinDetectionProtocol(PD12ToMzIdentML.java:851)

ERROR!!

1. ProCon 0.9.163 was unable to correctly identify Post Translational Modifications (PTMs), marking all of them as “unknown modification” in the resulting mzIdentML file

2. ProCon 0.9.163 still had problems with its internal array references

Page 11: Use of SEQUEST search results with ProteoRed MIAPE Extractor

ProCon 0.9.16

Ó. Gallardo

PROTEOME DISCOVERER WORKFLOW

[Workflow diagram. Labels: MSF, .Prot.XML, mzIdentML]

34

Page 12: Use of SEQUEST search results with ProteoRed MIAPE Extractor

Ó. Gallardo

PROTEOME DISCOVERER WORKFLOW

[Workflow diagram. Labels: RAW, MSF, MGF, .Prot.XML; Mass Spectra Identification: Proteome Discoverer, Discoverer Daemon; Output file: mzIdentML; MIAPE Generation: MIAPE Extractor, MIAPE Generator Tool]

Page 13: Use of SEQUEST search results with ProteoRed MIAPE Extractor

Ó. Gallardo

PROTEOME DISCOVERER WORKFLOW

[Workflow diagram. Labels: RAW, MSF, MGF, .Prot.XML; Mass Spectra Identification: Proteome Discoverer, Discoverer Daemon; Output file: mzIdentML; MIAPE Generation: MIAPE Extractor]

...........................................................67% finished
.....................TaxID for organismName unknown: Leptospira interrogans serogroup Icterohaemorrhagiae serovar Lai
.TaxID for organismName unknown: Sphaerochaeta globosa
...TaxID for organismName unknown: Leptospira borgpetersenii serovar
.....MyProgressBar for getSpectrumIdentificationListAndProteinDetectionListAndPeptideEvidences for SEQ finished
SequenceCollection written
CV term for unknown modification Deamidated / +0.984 Da (N, Q) not found.
CV term for unknown modification Acetyl / +42.011 Da (Any NTerminus) not found.

Spectra IDs didn’t match between the MGF file and the mzIdentML file

[Diagram labels: ID (mgf), ID (mzid), PepMS, Charge, RT]
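The diagram labels on this slide (ID, PepMS, Charge, RT) suggest that spectra can be paired through precursor mass, charge and retention time when their IDs differ between files. The following is a minimal sketch of that idea, offered as an assumption about the approach and not as the authors' code. The identification records, the tolerances and all function names are hypothetical. Only the MGF header keys (TITLE, PEPMASS, CHARGE, RTINSECONDS) are standard.

```python
def parse_mgf(text):
    """Return one dict of KEY=VALUE headers per BEGIN IONS / END IONS block."""
    spectra, current = [], None
    for line in text.splitlines():
        line = line.strip()
        if line == "BEGIN IONS":
            current = {}
        elif line == "END IONS":
            if current is not None:
                spectra.append(current)
            current = None
        elif current is not None and "=" in line:
            key, value = line.split("=", 1)
            current[key] = value  # peak lines have no "=" and are skipped
    return spectra


def mgf_fields(spec):
    """Extract (precursor m/z, charge, retention time in seconds) from headers."""
    mz = float(spec["PEPMASS"].split()[0])     # PEPMASS may carry an intensity too
    charge = int(spec["CHARGE"].rstrip("+-"))  # "2+" -> 2 (polarity ignored here)
    rt = float(spec["RTINSECONDS"])
    return mz, charge, rt


def match_spectra(mgf_spectra, identifications, mz_tol=0.01, rt_tol=1.0):
    """Map each identification ID to the TITLE of the first MGF spectrum that
    agrees in charge and lies within the m/z and RT tolerances (assumed values)."""
    mapping = {}
    for ident in identifications:
        for spec in mgf_spectra:
            mz, charge, rt = mgf_fields(spec)
            if (charge == ident["charge"]
                    and abs(mz - ident["mz"]) <= mz_tol
                    and abs(rt - ident["rt"]) <= rt_tol):
                mapping[ident["id"]] = spec["TITLE"]
                break
    return mapping


MGF_TEXT = """BEGIN IONS
TITLE=spectrum_a
PEPMASS=401.2348 15000.0
CHARGE=2+
RTINSECONDS=1200.5
150.1 20.0
END IONS
BEGIN IONS
TITLE=spectrum_b
PEPMASS=409.5392
CHARGE=3+
RTINSECONDS=1850.0
175.1 35.0
END IONS
"""

# Hypothetical records, as they might be read from an mzIdentML file.
identifications = [
    {"id": "SIR_7", "mz": 409.5390, "charge": 3, "rt": 1850.2},
    {"id": "SIR_9", "mz": 401.2350, "charge": 2, "rt": 1200.4},
    {"id": "SIR_11", "mz": 500.0, "charge": 2, "rt": 900.0},
]
mapping = match_spectra(parse_mgf(MGF_TEXT), identifications)
print(mapping)
```

The nested loop is quadratic and only meant to show the matching rule. For full-size files an index on charge and rounded m/z would be a natural next step.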

Page 14: Use of SEQUEST search results with ProteoRed MIAPE Extractor

Ó. Gallardo

PROTEOME DISCOVERER WORKFLOW

[Workflow diagram. Labels: RAW, MSF, MGF, .Prot.XML; Mass Spectra Identification: Proteome Discoverer, Discoverer Daemon; Output file: mzIdentML; MIAPE Generation (MIAPE MS, MIAPE MSI): MIAPE Extractor, MIAPE Generator Tool]

[Diagram labels: ID, PepMS, Charge, RT]

Page 15: Use of SEQUEST search results with ProteoRed MIAPE Extractor

Ó. Gallardo

PROTEOME DISCOVERER WORKFLOW

[Workflow diagram. Labels: RAW, MSF, MGF, .Prot.XML; Mass Spectra Identification: Proteome Discoverer, Discoverer Daemon; Output file: mzIdentML; MIAPE Generation (MIAPE MS, MIAPE MSI): MIAPE Extractor, MIAPE Generator Tool]

Page 16: Use of SEQUEST search results with ProteoRed MIAPE Extractor

WORK IN PROGRESS

Ó. Gallardo

MIAPE Extractor, open issues:

1. Uploading of MSF + mzIdentML files through MIAPE Extractor is not yet automated

2. Although we can generate MIAPE data from Sequest search results, MIAPE Toolkit doesn’t work very well with this data at the analysis stage: we cannot retrieve the identified proteins, there are problems with the Sequest Score fields, …

MIAPE Extractor, in progress:

1. We are working on a script to automate MIAPE Extractor data extraction: MIAPE Extractor Automator v.2

2. Development of MIAPE Extractor and MIAPE Generator Tool continues, with improvements in each version

MSF to mzIdentML conversion (ProCon), open issues:

1. Export of Prot.XML files from the MSF files, and the subsequent conversion of MSF + Prot.XML files to mzIdentML files, is not automated

2. ProCon still has some errors, is very slow with large files, and is memory hungry

MSF to mzIdentML conversion (ProCon), in progress:

ProCon developers are working on a new version that doesn’t need Prot.XML files, making the conversion process much faster and easier.

Page 17: Use of SEQUEST search results with ProteoRed MIAPE Extractor

INDEX

1. A working workflow to extract MIAPE information from Proteome Discoverer 1.3 search results using the ProteoRed MIAPE Toolkit
Óscar Gallardo, Joan Villanueva, Montserrat Carrascal, Joaquín Abián

2. Data dependent acquisition using inclusion list (IL)
Joan Villanueva, Óscar Gallardo, Joaquín Abián, Montserrat Carrascal

Page 18: Use of SEQUEST search results with ProteoRed MIAPE Extractor

RATIONALE FOR USING DDP WITH INCLUSION LIST (IL):

a.- Most target proteins assigned to the groups of the shotgun project were not detected using shotgun approaches.

b.- The few detected peptides were not optimal for MRM analysis (not proteotypic, containing Met/Cys, with missed cleavages).

c.- Preliminary tests at LP-CSIC/UAB using targeted approaches required a limited list of peptides (the list of target m/z values had to be restricted to 20-30) and failed to detect the target proteins.

DDP with inclusion list increases the probability of detecting low-abundance proteins/peptides without the constraints of targeted approaches.

16 PROTEINS SELECTED FOR INCLUSION LIST

- 6 proteins assigned to the LP-CSIC/UAB laboratory
- 10 proteins assigned to MRM labs and not detected by shotgun

Laboratory   Uniprot   Name
Canals       P69905    HBA_HUMAN
FB           Q6GPI1    CTRB2_HUMAN
CG           P24855    DNAS1_HUMAN
MPV          Q6A1A2    PDPK2_HUMAN
FC           P16444    DPEP1_HUMAN
CG           Q9BSW7    SYT17_HUMAN
CG           P11597    CETP_HUMAN
MPV          P15391    CD19_HUMAN
CG           Q53FZ2    ACSM3_HUMAN
FV           Q8N4N3    KLH36_HUMAN
Abian        Q9BUU2    METTL22_HUMAN
Abian        P33076    CIITA_HUMAN
Abian        Q9Y661    HS3ST4_HUMAN
Abian        Q14703    MBTPS1_HUMAN
Abian        B7ZMK8    PRSS36_HUMAN
Abian        A4GXA9    EME2_HUMAN

Data dependent acquisition with inclusion list

J. Villanueva

Page 19: Use of SEQUEST search results with ProteoRed MIAPE Extractor

To obtain the inclusion list:

1.- All tryptic peptides 7-25 AA.
2.- m/z values assuming z=2 and z=3 for all peptides.
3.- Filter duplicate m/z values (software requirement).

Number of m/z values in the inclusion list: 556 (num peptides 282)

Signal ID               m/z
P33076_GCTLLLTARPR      400.9013
P11597_VFHSLAK          401.2348
P16444_YPDLIAELLR       401.5646
Q53FZ2_EGWGNLK          402.2062
P24855_YDIALVQEVR       402.5561
Q8N4N3_VASMNQR          403.2032
Q8N4N3_VKPAVCSLLPK      404.5779
Q14703_APCPGCSHLTLK     409.5392
Q9Y661_AISDYTQTLSK      409.5473
Q9BSW7_TAVEQWHSLR       409.5478
P69905_VDPVNFK          409.7243
P16444_TLEQMDVVHR       409.8769
A4GXA9_MGLLAVGPDLSR     410.2292
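The three steps used to obtain the inclusion list can be sketched as follows. This is a minimal illustration under stated assumptions, not the script used for the slides: cleavage after K or R except before P, no missed cleavages, unmodified residues, monoisotopic masses, and a hypothetical toy sequence and accession ("TEST"). If every peptide contributed both charge states, 282 peptides would give 564 values, which would mean 8 duplicates were filtered to reach the reported 556.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def tryptic_peptides(seq, min_len=7, max_len=25):
    """Step 1: fully tryptic peptides of 7-25 residues (assumed cleavage rule:
    after K/R unless the next residue is P; no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(seq):
        last = i == len(seq) - 1
        if last or (aa in "KR" and seq[i + 1] != "P"):
            pep = seq[start:i + 1]
            if min_len <= len(pep) <= max_len:
                peptides.append(pep)
            start = i + 1
    return peptides


def mz(peptide, z):
    """Step 2: m/z of the unmodified peptide carrying z protons."""
    mass = sum(MONO[aa] for aa in peptide) + WATER
    return (mass + z * PROTON) / z


def inclusion_list(proteins, charges=(2, 3)):
    """Step 3: collect m/z values, dropping duplicates at 4 decimal places."""
    seen, rows = set(), []
    for accession, seq in proteins.items():
        for pep in tryptic_peptides(seq):
            for z in charges:
                value = round(mz(pep, z), 4)
                if value not in seen:
                    seen.add(value)
                    rows.append((f"{accession}_{pep}", value))
    return sorted(rows, key=lambda row: row[1])


# Toy sequence (hypothetical) that contains the peptide VFHSLAK from the table.
rows = inclusion_list({"TEST": "AAAKVFHSLAKGGR"})
for signal_id, value in rows:
    print(signal_id, value)
```

Under these assumptions VFHSLAK at z=2 comes out near 401.2345, close to the 401.2348 listed for P11597_VFHSLAK; the small offset is most likely due to slightly different mass constants.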

Procedure: Data Dependent with IL

J. Villanueva

Samples CCD18 and MCF7

Aliquot 250 µg protein

OffGel (12 fractions)

FASP digestion

LC-MS/MS (DDP, IL, Targeted)

Proteome Discoverer

Page 20: Use of SEQUEST search results with ProteoRed MIAPE Extractor

DATA DEPENDENT WITH INCLUSION LIST: LTQ-ORBITRAP

J. Villanueva

Sample VH: MCF-7, MS traces

[Figure: total ion chromatograms (TIC), relative abundance (0-100) vs. time (0-140 min), RT: 0.00 - 140.02, F: FTMS + p NSI Full ms [400.00-1800.00]]

Offgel Fr6: HPP_VallHebron_DDPorbi_Test1_120724_Fr06_06, NL: 2.17E9

Offgel Fr7: HPP_VallHebron_DDPorbi_Test1_120724_Fr07_07, NL: 9.66E8

Page 21: Use of SEQUEST search results with ProteoRed MIAPE Extractor

RESULTS: Inclusion list and targeted

J. Villanueva

RESULT: Data dependent with IL: the 282 listed peptides were not detected (same as in the targeted experiments)

- Low amount of target proteins
- Proteins not expressed in these cells

DATA PROCESSING FOR IL DATA:

1.- MGF generation with PD v1.3
2.- Database search: Proteome Discoverer and Mascot
3.- FDR 5%

Page 22: Use of SEQUEST search results with ProteoRed MIAPE Extractor

DATA PROCESSING:

1.- MGF generation with PD v1.3
2.- Database search: Proteome Discoverer (and Mascot)
3.- Search results and filtering (1% FDR): MIAPE Extractor (Data Inspector Module) and Proteome Discoverer.
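The slides do not say how the FDR is estimated. One common approach is the target-decoy strategy, where the FDR at a score cutoff is approximated by the ratio of decoy to target hits above that cutoff. The following is a minimal sketch under that assumption. It does not reproduce what MIAPE Extractor or Proteome Discoverer do internally, and the function name and the input layout are hypothetical.

```python
def score_threshold_at_fdr(psms, max_fdr=0.01):
    """Return the lowest score cutoff whose cumulative decoy/target ratio
    stays at or below max_fdr, or None if no cutoff qualifies.

    psms: iterable of (score, is_decoy) pairs; higher score means better match.
    """
    ranked = sorted(psms, key=lambda psm: psm[0], reverse=True)
    targets = decoys = 0
    threshold = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR among all hits scoring at least `score`.
        if targets and decoys / targets <= max_fdr:
            threshold = score
    return threshold


# Toy data: (score, is_decoy)
psms = [(10, False), (9, False), (8, True), (7, False), (6, True), (5, True)]
strict = score_threshold_at_fdr(psms, max_fdr=0.01)
loose = score_threshold_at_fdr(psms, max_fdr=0.5)
print(strict, loose)
```

With only six toy hits the 1% cutoff is very conservative; on real search results the same function would be applied to thousands of peptide-spectrum matches.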

Work in progress: MIAPE EXTRACTOR

The data could be uploaded and the FDR filtering could be carried out.

Data Inspector Module: errors detected and still to be solved: unable to extract protein information from SEQUEST data.

 

Chromosome 16 protein description: Data Dependent Analysis

J. Villanueva

Page 23: Use of SEQUEST search results with ProteoRed MIAPE Extractor

J. Villanueva

                                          MIAPE EXTRACTOR              PROTEOME DISCOVERER
Sample  Acquisition method  Search method Num peptides  Num proteins   Num peptides  Num proteins
MCF7    DDP                 MASCOT        3079          2316           --            --
MCF7    DDP                 SEQUEST       3561          1422           3616          1282
CCD18   DDP                 MASCOT        3102          2370           3765          1180
CCD18   DDP                 SEQUEST       2250          980            2475          946

Work in progress...

Number of proteins that passed the 1% FDR filter:

1.- Significant differences between search algorithms. An in-depth data revision is needed.

Page 24: Use of SEQUEST search results with ProteoRed MIAPE Extractor

La Cristalera, Miraflores de la Sierra, 10-11 December 2012

HPP Sp-HPP

Use of SEQUEST search results with ProteoRed.org MIAPE Extractor

Thank you for your attention.

Any questions?