proteome discoverer - hecklab.com · proteome discoverer. workflow concept data goes through the...

PROTEOME DISCOVERER

Post on 05-Nov-2019

Page 1: Proteome Discoverer - Hecklab.com · PROTEOME DISCOVERER. Workflow concept Data goes through the workflow Spectra Peptides Quantitation A “Node” contains an operation An edge

PROTEOME DISCOVERER

Workflow concept

Data goes through the workflow: spectra, peptides, quantitation.

A “Node” contains an operation. An edge represents data flow. The results are brought together in tables: the protein (group) table and the peptide (group) table.

Two aspects of the analysis

Identification: MS1 + MS2

Quantification: MS1

Identification

Decision tree based MS

How to analyze?

Decision tree is implemented in acquisition software

Data analysis is slightly different for each fragmentation / detector method

Rebuild the decision tree in the workflow!

Overview of the steps

1. Select the file(s) to use
2. Select only the MS2 spectra
3. Choose the different fragmentation methods
4. Create a specific workflow for each fragmentation type
   1. Spectrum filtering
   2. Database search
5. PSM validation, PTM scoring
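Steps 2 to 4 amount to routing each MS2 spectrum to the branch that matches how it was acquired. The following is a minimal sketch of that routing outside Proteome Discoverer; the record fields (`ms_level`, `activation`, `analyzer`) and the branch labels are assumptions made for illustration, not the software's own data model.

```python
from collections import defaultdict

# Hypothetical, simplified spectrum records; real raw files carry far more metadata.
spectra = [
    {"scan": 1, "ms_level": 1, "activation": None, "analyzer": "FT"},
    {"scan": 2, "ms_level": 2, "activation": "HCD", "analyzer": "FT"},
    {"scan": 3, "ms_level": 2, "activation": "ETD", "analyzer": "FT"},
    {"scan": 4, "ms_level": 2, "activation": "ETD", "analyzer": "IT"},
    {"scan": 5, "ms_level": 2, "activation": "HCD", "analyzer": "FT"},
]


def branch_name(spectrum):
    """Name the workflow branch for one MS2 spectrum (labels are illustrative)."""
    if spectrum["activation"] == "HCD":
        return "HCD"
    if spectrum["activation"] == "ETD" and spectrum["analyzer"] == "FT":
        return "ETD-FT"
    return "ETD"


def route(all_spectra):
    """Keep only MS2 spectra and collect their scan numbers per branch."""
    branches = defaultdict(list)
    for s in all_spectra:
        if s["ms_level"] != 2:  # step 2: select only the MS2 spectra
            continue
        branches[branch_name(s)].append(s["scan"])  # step 3: split by fragmentation
    return dict(branches)


routed = route(spectra)
print(routed)
```

Each branch would then get its own spectrum filtering and database search settings, as in step 4.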

Example dataset 1

Human cancer cell line with 3 treatments. Fractionated with SCX. Decision tree: HCD / ETD-FT / ETD.

1. Select files to use

2. Extract the MS2 spectra

3. Select spectrum origins

Side note:

For ETD: precursor removal

For all: size reduction

Identifications: Mascot

MASCOT search engine

Proteomics Mass Spectrometry

Trypsin Digest

Single Stage MS

MS

Tandem Mass Spectrometry (MS/MS)

Precursor selection

Tandem Mass Spectrometry (MS/MS)

Precursor selection + collision-induced dissociation (CID)

MS/MS

How do search engines work?

Input: spectra, a sequence database (proteins), and search parameters.

Match each spectrum to possible peptides, then score the matches.

From the spectrum, the precursor m/z and charge give the precursor mass. Together with the mass range, this gives the range of possible peptide masses.
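As a worked illustration of this step, the sketch below converts a precursor m/z and charge into a neutral mass and then into a mass window. The 10 ppm tolerance and the function names are assumptions chosen for the example; the actual tolerance is a search parameter.

```python
PROTON = 1.007276  # mass of a proton in Da


def neutral_mass(mz, charge):
    """Neutral (uncharged) precursor mass from the observed m/z and charge."""
    return (mz - PROTON) * charge


def mass_window(mz, charge, tol_ppm=10.0):
    """Range of possible peptide masses for a precursor, given a ppm tolerance."""
    mass = neutral_mass(mz, charge)
    delta = mass * tol_ppm / 1e6
    return mass - delta, mass + delta


low, high = mass_window(500.0, 2, tol_ppm=10.0)
print(f"possible peptide masses: {low:.4f} to {high:.4f} Da")
```

Only peptides whose calculated mass falls inside this window need to be considered as candidates for the spectrum.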

A peptide database

We now know the range in which the peptide mass has to lie.

Which peptides conform to this criterion? Candidates are taken from the protein database, with modifications and missed cleavages taken into account.

Digest, modify

The protein database is digested using the allowed missed cleavages, and the fixed and variable modifications are applied. The result is a peptide database with modifications, indexed by mass.
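The digest-and-modify step can be sketched in a few lines. This is a simplified illustration, not how Mascot builds its index: it assumes a tryptic rule (cleave after K or R unless followed by P), carbamidomethylated cysteine as the fixed modification, and oxidized methionine as the variable modification. The toy protein sequence and all function names are made up for the example.

```python
import bisect

# Monoisotopic residue masses in Da.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
FIXED = {"C": 57.021464}     # assumed fixed modification: carbamidomethyl (C)
VARIABLE = {"M": 15.994915}  # assumed variable modification: oxidation (M)


def digest(protein, missed=1):
    """Tryptic peptides with up to `missed` missed cleavages."""
    sites = [i + 1 for i, aa in enumerate(protein[:-1])
             if aa in "KR" and protein[i + 1] != "P"]
    bounds = [0] + sites + [len(protein)]
    peptides = []
    for i in range(len(bounds) - 1):
        for j in range(i + 1, min(i + 2 + missed, len(bounds))):
            peptides.append(protein[bounds[i]:bounds[j]])
    return peptides


def mass_variants(peptide):
    """One (mass, peptide, n_oxidations) entry per possible number of oxidized M."""
    base = sum(MONO[aa] + FIXED.get(aa, 0.0) for aa in peptide) + WATER
    return [(base + k * VARIABLE["M"], peptide, k)
            for k in range(peptide.count("M") + 1)]


def build_index(proteins, missed=1):
    """Peptide database with modifications, sorted (indexed) by mass."""
    entries = set()
    for protein in proteins:
        for peptide in digest(protein, missed):
            entries.update(mass_variants(peptide))
    return sorted(entries)


def candidates(index, low, high):
    """All index entries whose mass lies within [low, high]."""
    masses = [entry[0] for entry in index]
    return index[bisect.bisect_left(masses, low):bisect.bisect_right(masses, high)]


index = build_index(["AMKSGFLEEDELKTCR"], missed=1)
hits = candidates(index, 1165.5, 1165.6)
print(hits)
```

Sorting by mass is what makes the later lookup cheap: a precursor mass window maps to one contiguous slice of the index.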

Matching the spectra

Now we’re able to match the spectrum. For that, theoretical spectra are composed. This is done with the information from the peptide database.

The range of possible peptide masses selects candidates from the peptide database (with modifications, indexed by mass). Theoretical spectra for these candidates are compared with the MS/MS spectrum and scored, which yields a ranked list of peptides.

How to build a theoretical spectrum

Information is needed about: amino acid masses, modification masses, and methods of fragmentation.

Peptide Fragmentation

Peptide: S-G-F-L-E-E-D-E-L-K

  MW  ion                            ion    MW
  88  b1   S           GFLEEDELK     y9   1080
 145  b2   SG          FLEEDELK      y8   1022
 292  b3   SGF         LEEDELK       y7    875
 405  b4   SGFL        EEDELK        y6    762
 534  b5   SGFLE       EDELK         y5    633
 663  b6   SGFLEE      DELK          y4    504
 778  b7   SGFLEED     ELK           y3    389
 907  b8   SGFLEEDE    LK            y2    260
1020  b9   SGFLEEDEL   K             y1    147
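The table can be reproduced with a short calculation. The sketch below assumes singly charged b and y ions and monoisotopic residue masses; the slide does not state which mass type it used, so the function name and constants are illustrative choices.

```python
# Monoisotopic residue masses in Da for the residues in this peptide.
MONO = {
    "S": 87.03203, "G": 57.02146, "F": 147.06841, "L": 113.08406,
    "E": 129.04259, "D": 115.02694, "K": 128.09496,
}
PROTON = 1.007276
WATER = 18.010565


def by_ions(peptide):
    """Singly charged b and y ion m/z values for every backbone cleavage."""
    masses = [MONO[aa] for aa in peptide]
    rows = []
    for i in range(1, len(peptide)):
        b_mz = sum(masses[:i]) + PROTON          # N-terminal fragment
        y_mz = sum(masses[i:]) + WATER + PROTON  # C-terminal fragment
        rows.append((f"b{i}", peptide[:i], b_mz,
                     f"y{len(peptide) - i}", peptide[i:], y_mz))
    return rows


rows = by_ions("SGFLEEDELK")
for b_name, b_seq, b_mz, y_name, y_seq, y_mz in rows:
    print(f"{b_mz:9.3f} {b_name} {b_seq:<10}{y_seq:<10}{y_name} {y_mz:9.3f}")
```

With these assumptions the computed values agree with the table to within about 1 Da; small differences in the largest y ions come from rounding and from the choice of mass type.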

Identifications: Mascot

PSM Thresholds

PSM thresholds

Several options: target/decoy databases, Percolator.

False Discovery Rate

This is a number indicating what fraction of your hits is expected to be ‘false’

If the FDR is 0.01 (1%), we expect 1 out of 100 peptide hits to be a false identification

The number is estimated from the number of hits in the decoy database search result: FDR = ‘decoy hits’ / ‘target hits’
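A small sketch of how this estimate can be used to pick a score threshold. The PSM list is invented toy data and the function names are hypothetical; the formula is the decoy/target ratio given above.

```python
def fdr_at_threshold(psms, threshold):
    """Estimated FDR (decoy hits / target hits) among PSMs scoring >= threshold."""
    accepted = [p for p in psms if p["score"] >= threshold]
    decoys = sum(1 for p in accepted if p["is_decoy"])
    targets = len(accepted) - decoys
    return decoys / targets if targets else 0.0


def lowest_threshold_for_fdr(psms, max_fdr=0.01):
    """Lowest observed score at which the estimated FDR is within max_fdr."""
    for threshold in sorted({p["score"] for p in psms}):
        if fdr_at_threshold(psms, threshold) <= max_fdr:
            return threshold
    return None


# Toy data: seven target hits and two decoy hits.
psms = [{"score": s, "is_decoy": False} for s in (50, 45, 40, 35, 30, 25, 20)]
psms += [{"score": s, "is_decoy": True} for s in (22, 18)]

print(fdr_at_threshold(psms, 20))
print(lowest_threshold_for_fdr(psms, max_fdr=0.01))
```

On real data the estimate is not guaranteed to fall steadily as the threshold rises, so a production implementation would typically work with q-values rather than a single scan like this.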

Decoy hits

Too much “noise”, a mixed spectrum, or overall low signal

MS2 peaks do match with a peptide

Two ways of FDR calculation

Concatenated database: one search. A spectrum matches either one or the other (“competitive”).

Separate decoy database (in Mascot): two searches. A spectrum may match both a decoy and a target sequence (“non-competitive”).

FDR = decoy hits / total hits

[Figure: forward and reversed databases, shown once for each approach]

Limitation: one dimension

[Figure: two plots with Mascot score (0 to 40) on the horizontal axis; the second plot adds mass delta (ppm) as a second axis]

Introducing Percolator

Percolator (Lukas Käll, MacCoss Lab, 2007): machine learning using multiple features.

What is a support vector machine?

A classification algorithm that finds the best hyperplane between two groups.

Soft edges for SVMs

Using several features

Score, mass delta (ppm), delta score, number of matched ions, …
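To make the idea of separating hits on more than one feature concrete, here is a toy linear classifier trained on the hinge loss, which is the core of a linear soft-margin SVM. It is not Percolator's actual algorithm (Percolator trains iteratively on target and decoy PSMs); the two features, their scaling, the data points and the hyperparameters are all assumptions for illustration.

```python
def train_linear_svm(samples, labels, lr=0.1, lam=0.001, epochs=2000):
    """Subgradient descent on the L2-regularized hinge loss."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:
                # Point is inside the margin: move the hyperplane towards it.
                w = [wi + lr * (y * xi - lam * wi) for wi, xi in zip(w, x)]
                b += lr * y
            else:
                # Point is fine: only apply the regularization shrinkage.
                w = [wi - lr * lam * wi for wi in w]
    return w, b


def predict(w, b, x):
    """+1 for the 'correct hit' side of the hyperplane, -1 for the other side."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Toy features per PSM: (score / 40, |mass delta in ppm| / 40).
samples = [(0.9, 0.05), (0.2, 0.3), (0.8, 0.1), (0.3, 0.5), (0.7, 0.025), (0.1, 0.2)]
labels = [1, -1, 1, -1, 1, -1]  # +1 = target-like, -1 = decoy-like

w, b = train_linear_svm(samples, labels)
print(w, b)
print([predict(w, b, x) for x in samples])
```

Adding further features such as delta score or the number of matched ions only lengthens each sample tuple; the training loop stays the same.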

Higher dimensions

PTM scoring

PhosphoRS node: parallel to the Percolator node.

PhosphoRS version 2: names change (PhosphoRS score).

Mao will explain in the afternoon

Quantification

Labeled Quantification

Peptide modifications. Extracted Ion Chromatogram (XIC).

Node to extract the intensities

How to interpret the XICs

The XICs need to be grouped: by isotopes and by known modification (e.g. dimethylation, SILAC).

The node needs details: the quantification method.
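A rough sketch of what extracting and grouping XICs involves, using a SILAC pair as the known modification. The scan data are invented, the label is assumed to be heavy lysine (13C6 15N2, +8.014199 Da), and the function names are hypothetical; the quantification node itself also handles isotope envelopes and peak detection, which are left out here.

```python
LYS8_SHIFT = 8.014199  # mass shift of 13C6 15N2 lysine in Da


def xic(scans, target_mz, tol_ppm=10.0):
    """Summed intensity within the m/z tolerance for each MS1 scan."""
    tol = target_mz * tol_ppm / 1e6
    return [(rt, sum(i for mz, i in peaks if abs(mz - target_mz) <= tol))
            for rt, peaks in scans]


def area(trace):
    """Trapezoidal area under an XIC trace."""
    return sum((t2 - t1) * (i1 + i2) / 2
               for (t1, i1), (t2, i2) in zip(trace, trace[1:]))


def heavy_mz(light_mz, charge, n_lys=1):
    """m/z of the heavy partner of a light peptide carrying n_lys lysines."""
    return light_mz + n_lys * LYS8_SHIFT / charge


# Toy MS1 scans: (retention time in minutes, [(m/z, intensity), ...]).
scans = [
    (10.0, [(500.000, 100.0), (504.0071, 50.0)]),
    (10.1, [(500.001, 200.0), (504.0070, 100.0)]),
    (10.2, [(500.000, 100.0), (504.0072, 50.0), (600.000, 999.0)]),
]

light = 500.0
heavy = heavy_mz(light, charge=2)
ratio = area(xic(scans, heavy)) / area(xic(scans, light))
print(f"heavy/light ratio: {ratio:.2f}")
```

The quantification method tells the node which mass shifts to expect, which is why it needs that detail before it can pair the traces.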

Manage quant methods

PD Output

Understanding PD output

PD run log, node result columns, protein output, peptide output.

NB: the “modified peptide” column comes from the Mascot node, whilst the phosphoRS site localizations come from the phosphoRS node. The modified peptide does not represent the “best” localizations.

“Peptides” tab

Lists peptides. Grouping (right click): grouped or ungrouped (PTMs).

Grouping parameters are in the “Result Filters” tab: based on sequence, or based on sequence and mass (i.e. modifications).

PD will filter the PTMs based on the settings in the “Result Filters” tab

Peptide output

< 0.5 and > 2

“Protein” tab

Lists proteins. Grouping (right click): grouped or ungrouped.

Grouping based on shared peptides
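One simple way to picture grouping by shared peptides is to join any proteins that have at least one identified peptide in common. The sketch below does exactly that with a union-find structure; it is a deliberate simplification and should not be read as Proteome Discoverer's actual grouping rules, which are not described here. Protein names and peptide sets are invented.

```python
def group_proteins(protein_peptides):
    """Group proteins that are linked by at least one shared peptide."""
    parent = {protein: protein for protein in protein_peptides}

    def find(protein):
        # Follow parent links to the group representative, compressing the path.
        while parent[protein] != protein:
            parent[protein] = parent[parent[protein]]
            protein = parent[protein]
        return protein

    first_owner = {}
    for protein, peptides in protein_peptides.items():
        for peptide in peptides:
            if peptide in first_owner:
                parent[find(protein)] = find(first_owner[peptide])
            else:
                first_owner[peptide] = protein

    groups = {}
    for protein in protein_peptides:
        groups.setdefault(find(protein), set()).add(protein)
    return sorted(sorted(group) for group in groups.values())


identified = {
    "P1": {"AAK", "SGFLEEDELK"},
    "P2": {"SGFLEEDELK"},
    "P3": {"TCR"},
}
protein_groups = group_proteins(identified)
print(protein_groups)
```

Here P1 and P2 end up together because they share one peptide, while P3 stays on its own.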

What’s Grouped?

Grouping allows for the calculation of standard deviations

Show the highest scores

Filters

Example dataset 2

Yeast time-course experiment. Labeled with TMT 6-plex. Fractionated with SCX.

… build this!

Alternative TMT analysis

Isobar package: nice statistical analysis, nice output, freely available.

Disadvantages: written in R, somewhat hard to use. Ask Bas.