Direct Experimental Observation of Functional Protein Isoforms by Tandem Mass Spectrometry
Nathan Edwards
Center for Bioinformatics and Computational Biology
University of Maryland, College Park
Synopsis
• MS/MS spectra provide evidence for the amino-acid sequence of functional proteins.
• Key concepts:
  • Spectrum acquisition is unbiased
  • Direct observation of amino-acid sequence
  • Sensitive to small sequence variations
Synopsis
• MS/MS spectra provide evidence for the amino-acid sequence of functional proteins.
• Applications:
  • Cancer biomarkers
  • Genome annotation
Mass Spectrometry for Proteomics
• Measure mass of many (bio)molecules simultaneously
  • High bandwidth
• Mass is an intrinsic property of all (bio)molecules
  • No prior knowledge required
Mass Spectrometer
Sample → Ionizer → Mass Analyzer → Detector
• Ionizer:
  • MALDI
  • Electrospray Ionization (ESI)
• Mass Analyzer:
  • Time-of-Flight (TOF)
  • Quadrupole
  • Ion Trap
• Detector:
  • Electron Multiplier (EM)
High Bandwidth
[Figure: example mass spectrum; x-axis m/z (250 to 1000), y-axis % intensity (0 to 100)]
Mass is fundamental!
Mass Spectrometry for Proteomics
• Measure mass of many molecules simultaneously
  • ...but not too many, abundance bias
• Mass is an intrinsic property of all (bio)molecules
  • ...but need a reference to compare to
Mass Spectrometry for Proteomics
• Mass spectrometry has been around since the turn of the 20th century...
  • ...why is MS-based proteomics so new?
• Ionization methods
  • MALDI, Electrospray
• Protein chemistry & automation
  • Chromatography, Gels, Computers
• Protein / genome sequences
  • A reference for comparison
Sample Preparation for Peptide Identification
Enzymatic Digest and Fractionation
Single Stage MS
[Figure: single-stage MS spectrum over m/z]
Tandem Mass Spectrometry (MS/MS)
Precursor selection
[Figure: two spectra over m/z, illustrating precursor selection]
Tandem Mass Spectrometry (MS/MS)
Precursor selection + collision-induced dissociation (CID)
[Figure: precursor spectrum and resulting MS/MS spectrum, both over m/z]
Peptide Identification
• For each (likely) peptide sequence:
  1. Compute fragment masses
  2. Compare with spectrum
  3. Retain those that match well
• Peptide sequences from (any) sequence database
  • Swiss-Prot, IPI, NCBI’s nr, ESTs, genomes, ...
• Automated, high-throughput peptide identification in complex mixtures
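The three steps above can be sketched in a few lines. This is a minimal illustration, not the scoring used by any particular search engine: it assumes singly charged b- and y-ions, monoisotopic residue masses, and a simple count of matched peaks as the score. The function names (`fragment_mz`, `count_matches`, `best_peptide`) and the 0.5 m/z tolerance are hypothetical choices made for this example.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def fragment_mz(peptide):
    """Step 1: singly charged b- and y-ion m/z values for a peptide."""
    masses = [MONO[aa] for aa in peptide]
    b_ions, y_ions = [], []
    for i in range(1, len(peptide)):
        b_ions.append(sum(masses[:i]) + PROTON)          # N-terminal fragment
        y_ions.append(sum(masses[i:]) + WATER + PROTON)  # C-terminal fragment
    return b_ions, y_ions


def count_matches(peptide, peaks, tol=0.5):
    """Step 2: number of theoretical fragments with an observed peak within tol."""
    b_ions, y_ions = fragment_mz(peptide)
    return sum(any(abs(mz - p) <= tol for p in peaks) for mz in b_ions + y_ions)


def best_peptide(candidates, peaks, tol=0.5):
    """Step 3: retain the candidate that explains the most peaks."""
    return max(candidates, key=lambda pep: count_matches(pep, peaks, tol))


if __name__ == "__main__":
    # Synthetic "spectrum": the theoretical fragments of MARNR itself.
    b, y = fragment_mz("MARNR")
    print(best_peptide(["MVMAR", "MARNR"], b + y))
```

Real search engines add charge states, intensity models, and statistical significance estimates; the sketch only shows why a candidate sequence from any database can be tested against a spectrum.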
Peptide Identification
...can provide direct experimental evidence for the amino-acid sequence of functional proteins.
Evidence for:
• Functional protein isoforms
• Translation start and frame
• Proteins with short open-reading-frames
Why is this useful for ... genome annotation?
• Evidence for SNPs and alternative splicing stops at the transcript level
• No genomic or transcript evidence for translation start-site.
• Conservation doesn’t stop at coding bases!
• Statistical gene-finders struggle with micro-exons, translation start-site, and short ORFs.
Why is this useful for ... cancer biomarkers?
• Alternative splicing is the norm!
  • Only 20-25K human genes
  • Each gene makes many proteins
  • Some splicing is believed to be silencing
  • Lots of splicing in cancer
• Proteins have clinical implications
  • Statistical biomarker discovery
  • Putative malfunctioning proteins
What can be observed?
• Known coding SNPs
• Novel coding mutations
• Alternative splicing isoforms
• Microexons ( non-canonical splice-sites )
• Alternative translation start-sites ( codons )
• Alternative translation frames
• “Dark” open-reading-frames
Splice Isoform
• Human Jurkat leukemia cell-line
  • Lipid-raft extraction protocol, targeting T cells
  • von Haller, et al. MCP 2003.
• LIME1 gene:
  • LCK interacting transmembrane adaptor 1
• LCK gene:
  • Leukocyte-specific protein tyrosine kinase
  • Proto-oncogene
  • Chromosomal aberration involving LCK in leukemias.
• Multiple significant peptide identifications
Novel Splice Isoform
Novel Mutation
• HUPO Plasma Proteome Project
  • Pooled samples from 10 male & 10 female healthy Chinese subjects
  • Plasma/EDTA sample protocol
  • Li, et al. Proteomics 2005. (Lab 29)
• TTR gene
  • Transthyretin (pre-albumin)
  • Defects in TTR are a cause of amyloidosis.
  • Familial amyloidotic polyneuropathy
    • late-onset, dominant inheritance
Novel Mutation
Ala2→Pro associated with familial amyloid polyneuropathy
Translation Start-Site
• Human erythroleukemia K562 cell-line
  • Depth of coverage study
  • Resing et al. Anal. Chem. 2004.
• THOC2 gene:
  • Part of the heteromultimeric THO/TREX complex.
• Initially believed to be a “novel” ORF
  • RefSeq mRNA in Jun 2007, no RefSeq protein
  • TrEMBL entry Feb 2005, no SwissProt entry
  • Genbank mRNA in May 2002 (complete CDS)
  • Plenty of EST support
  • ~ 100,000 bases upstream of other isoforms
Easily distinguish minor sequence variations
Two B. anthracis Sterne α/β SASP annotations
• RefSeq/Gb: MVMARN... (7441 Da)
• CMR: MARN... (7211 Da)
• Intact proteins differ by 230 Da
  • 7441 Da vs 7211 Da
• N-terminal tryptic peptides:
  • MVMAR (606.3 Da), MVMARNR (876.4 Da), vs
  • MARNR (646.3 Da)
  • Very different MS/MS spectra
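The peptide masses quoted above can be reproduced as a quick check. The sketch assumes monoisotopic residue masses plus one water per peptide; the `peptide_mass` helper is a hypothetical name, and only the five residues needed here are tabulated.

```python
# Monoisotopic residue masses (Da) for the residues in these peptides only.
MONO = {"M": 131.04049, "V": 99.06841, "A": 71.03711,
        "R": 156.10111, "N": 114.04293}
WATER = 18.01056  # one water per intact peptide


def peptide_mass(peptide):
    """Neutral monoisotopic mass of a peptide: residues plus one water."""
    return sum(MONO[aa] for aa in peptide) + WATER


if __name__ == "__main__":
    for pep in ("MVMAR", "MVMARNR", "MARNR"):
        print(pep, round(peptide_mass(pep), 1))
    # The two annotations differ by the residues M and V (about 230 Da).
    print("M + V:", round(MONO["M"] + MONO["V"], 1))
```

The extra Met-Val at the N-terminus accounts for the roughly 230 Da difference between the two intact-protein masses.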
Bacterial Gene-Finding
…TAGAAAAATGGCTCTTTAGATAAATTTCATGAAAAATATTGA…
[Figure labels: Stop codon, Stop codon]
• Find all the open-reading-frames...
...courtesy of Art Delcher
Bacterial Gene-Finding
…TAGAAAAATGGCTCTTTAGATAAATTTCATGAAAAATATTGA…
[Figure labels: Stop codon, Stop codon]
…ATCTTTTTACCGAGAAATCTATTTAAAGTACTTTTTATAACT…
[Figure labels: Shifted stop, Stop codon, Reverse strand]
• Find all the open-reading-frames...
...but they overlap – which ones are correct?
...courtesy of Art Delcher
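As a rough illustration of "find all the open-reading-frames", the sketch below enumerates stop-free codon runs in all six reading frames of a DNA string. It is not Glimmer's algorithm, only the enumeration step that precedes any scoring; the function names and the `min_codons` parameter are assumptions made for this example, and the input is the fragment shown on the slide with the elided flanks left out.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def open_reading_frames(seq, min_codons=1):
    """Return (strand, frame, start, end) for every stop-free codon run.

    Coordinates are 0-based, half-open, and relative to the strand being
    read. Runs are bounded by stop codons or by the ends of the sequence.
    """
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            run_start = frame
            for i in range(frame, len(s) - 2, 3):
                if s[i:i + 3] in STOPS:
                    if (i - run_start) // 3 >= min_codons:
                        orfs.append((strand, frame, run_start, i))
                    run_start = i + 3
            # Trailing run that reaches the end of the sequence.
            end = frame + ((len(s) - frame) // 3) * 3
            if (end - run_start) // 3 >= min_codons:
                orfs.append((strand, frame, run_start, end))
    return orfs


if __name__ == "__main__":
    fragment = "TAGAAAAATGGCTCTTTAGATAAATTTCATGAAAAATATTGA"
    for orf in open_reading_frames(fragment, min_codons=3):
        print(orf)
```

Even on this short fragment, runs in different frames and on both strands overlap, which is exactly the ambiguity a coding-sequence score has to resolve.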
Coding-Sequence “Score”
...courtesy of Art Delcher
Glimmer3 Performance
Genome columns: Organism, Length, GC%, # Genes. Glimmer3 Predictions columns: Matches, Correct Starts, Extra.

| Organism | Length | GC% | # Genes | Matches | Matches (%) | Correct Starts | Correct Starts (%) | Extra |
|---|---|---|---|---|---|---|---|---|
| Archaeoglobus fulgidus | 2.18 Mb | 48.6 | 1165 | 1162 | 99.7% | 875 | 75.1% | 1305 |
| Bacillus anthracis | 5.23 Mb | 35.4 | 3132 | 3129 | 99.9% | 2768 | 88.4% | 2340 |
| Bacillus subtilis | 4.21 Mb | 43.5 | 1576 | 1567 | 99.4% | 1429 | 90.7% | 2879 |
| Campylobacter jejuni | 1.78 Mb | 30.3 | 1233 | 1233 | 100.0% | 1149 | 93.2% | 668 |
| Carboxydothermus hydrogenoformans | 2.40 Mb | 42.0 | 1753 | 1752 | 99.9% | 1590 | 90.7% | 865 |
| Caulobacter crescentus | 4.02 Mb | 67.2 | 2192 | 2187 | 99.8% | 1552 | 70.8% | 1559 |
| Chlorobium tepidum | 2.15 Mb | 56.5 | 1292 | 1289 | 99.8% | 949 | 73.5% | 765 |
| Clostridium perfringens | 3.03 Mb | 28.6 | 1504 | 1503 | 99.9% | 1385 | 92.1% | 1178 |
| Colwellia psychrerythraea | 5.37 Mb | 38.0 | 3063 | 3060 | 99.9% | 2663 | 86.9% | 1714 |
| Dehalococcoides ethenogenes | 1.47 Mb | 48.9 | 1069 | 1059 | 99.1% | 929 | 86.9% | 483 |
| Escherichia coli | 4.64 Mb | 50.8 | 3603 | 3553 | 98.6% | 3150 | 87.4% | 913 |
| Geobacter sulfurreducens | 3.81 Mb | 60.9 | 2351 | 2340 | 99.5% | 1974 | 84.0% | 1091 |
| Haemophilus influenzae | 1.83 Mb | 38.1 | 1170 | 1170 | 100.0% | 1054 | 90.1% | 639 |
| Helicobacter pylori | 1.67 Mb | 38.9 | 915 | 914 | 99.9% | 805 | 88.0% | 765 |
| Listeria monocytogenes | 2.91 Mb | 38.0 | 1966 | 1965 | 99.9% | 1797 | 91.4% | 845 |
| Methylococcus capsulatus | 3.30 Mb | 63.6 | 2015 | 2005 | 99.5% | 1542 | 76.5% | 1231 |
| Mycobacterium tuberculosis | 4.40 Mb | 65.6 | 2217 | 2205 | 99.5% | 1493 | 67.3% | 2104 |
| Neisseria meningitidis | 2.27 Mb | 51.5 | 1232 | 1217 | 98.8% | 1042 | 84.6% | 1329 |
| Porphyromonas gingivalis | 2.34 Mb | 48.3 | 1200 | 1198 | 99.8% | 933 | 77.8% | 887 |
| Pseudomonas fluorescens | 7.07 Mb | 63.3 | 4535 | 4503 | 99.3% | 3577 | 78.9% | 1871 |
| Pseudomonas putida | 6.18 Mb | 61.5 | 3633 | 3596 | 99.0% | 2825 | 77.8% | 1916 |
| Ralstonia solanacearum | 3.72 Mb | 67.0 | 2512 | 2487 | 99.0% | 2061 | 82.0% | 1077 |
| Staphylococcus epidermidis | 2.62 Mb | 32.1 | 1650 | 1649 | 99.9% | 1511 | 91.6% | 771 |
| Streptococcus agalactiae | 2.16 Mb | 35.6 | 1441 | 1438 | 99.8% | 1336 | 92.7% | 683 |
| Streptococcus pneumoniae | 2.16 Mb | 39.7 | 1359 | 1355 | 99.7% | 1214 | 89.3% | 780 |
| Thermotoga maritima | 1.86 Mb | 46.2 | 1092 | 1090 | 99.8% | 892 | 81.7% | 804 |
| Treponema denticola | 2.84 Mb | 37.9 | 1463 | 1463 | 100.0% | 1332 | 91.0% | 1210 |
| Treponema pallidum | 1.14 Mb | 52.8 | 575 | 572 | 99.5% | 425 | 73.9% | 557 |
| Ureaplasma parvum | 0.75 Mb | 25.5 | 327 | 327 | 100.0% | 300 | 91.7% | 293 |
| Wolbachia endosymbiont | 1.08 Mb | 34.2 | 628 | 627 | 99.8% | 528 | 84.1% | 537 |
| Averages: | | | | | 99.6% | | 84.3% | |
• Glimmer3 trained & compared to RefSeq genes with annotated function
• Correct STOP: 99.6%
• Correct START: 84.3%
• “Not all the genomes necessarily have carefully/accurately annotated start sites, so the results for number of correct starts may be suspect.”
N-terminal peptides
• (Protein) N-terminal peptides establish
  • start-site of known & unexpected ORFs
Use:
• Directly to annotate genomes
• Evaluate and improve algorithms
• Map cross-species
N-terminal peptide workflows
• Typical proteomics workflows sample peptides from the proteome “randomly”
• Caulobacter crescentus (70%)
  • 3733 Proteins (RefSeq Genome annot.)
  • 66K tryptic peptides (600 Da to 3000 Da)
  • 2085 N-terminal tryptic peptides (3%)
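Counts like these come from an in-silico digest. The sketch below shows one way such a tally could be made, assuming a simplified trypsin rule (cleave after every K or R, ignoring the proline exception), monoisotopic masses, and the 600 to 3000 Da window from the slide. The function names are hypothetical and the example protein is invented for illustration.

```python
import re

# Monoisotopic residue masses (Da) for the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def peptide_mass(peptide):
    """Neutral monoisotopic peptide mass."""
    return sum(MONO[aa] for aa in peptide) + WATER


def tryptic_peptides(protein, missed_cleavages=0):
    """Return (is_n_terminal, peptide) pairs; simplified rule: cut after K/R."""
    pieces = re.findall(r"[^KR]*[KR]|[^KR]+$", protein)
    peptides = []
    for i in range(len(pieces)):
        last = min(i + missed_cleavages, len(pieces) - 1)
        for j in range(i, last + 1):
            peptides.append((i == 0, "".join(pieces[i:j + 1])))
    return peptides


def n_terminal_fraction(proteins, lo=600.0, hi=3000.0, missed_cleavages=0):
    """Fraction of in-window tryptic peptides that are protein N-terminal."""
    n_term = total = 0
    for protein in proteins:
        for is_n_term, pep in tryptic_peptides(protein, missed_cleavages):
            if lo <= peptide_mass(pep) <= hi:
                total += 1
                n_term += is_n_term
    return n_term / total if total else 0.0


if __name__ == "__main__":
    toy_proteome = ["MVMAR" + "G" * 10 + "K"]  # invented sequence
    print(n_terminal_fraction(toy_proteome))
```

On a real proteome most proteins contribute many internal peptides and at most one N-terminal peptide, which is why random sampling rarely lands on the N-terminus.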
N-terminal peptide workflow
• Protect protein N-terminus
• Digest to peptides
• Chemically modify free peptide N-term
• Use chem. mod. to capture unwanted peptides
Nat Biotech, Vol. 21, pp. 566-569, 2003.
Increasing N-terminal peptide coverage
• Multiple (digest) enzymes:
  • trypsin-R: 60% (80%)
  • acid + lys-C + trypsin: 85% (94%)
• Repeated LC-MS/MS
• Precursor Exclusion / Inclusion lists
• MALDI / ESI
• Protein separation and/or orthogonal fractionation
Anal Chem, Vol. 76, pp. 4193-4201, 2004.
Proteomics Informatics
• Search spectra against:
  • Entire bacterial genome;
  • All Met initiated peptides; or
  • Statistically likely Met initiated peptides.
• Easily consider initial Met loss PTM, too
• Off-the-shelf MS/MS search engines (Mascot / X!Tandem / OMSSA)
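One way to build the "all Met initiated peptides" search space is sketched below: for every Met in an ORF translation, emit the first tryptic peptide starting there, with and without the initiator Met. This is an assumption-laden illustration, not the talk's actual pipeline. It ignores alternative bacterial start codons, uses a simplified cut-after-K/R rule, and the function names and example sequence are hypothetical.

```python
import re


def first_tryptic_peptide(seq):
    """Leading tryptic peptide of seq (simplified rule: cut after K or R)."""
    return re.match(r"[^KR]*[KR]?", seq).group(0)


def candidate_n_terminal_peptides(orf_translation):
    """Candidate protein N-terminal peptides for every Met in an ORF.

    Each Met is treated as a possible translation start. Both the intact
    peptide and the form with the initiator Met removed are returned.
    """
    candidates = set()
    for pos, aa in enumerate(orf_translation):
        if aa != "M":
            continue
        peptide = first_tryptic_peptide(orf_translation[pos:])
        candidates.add(peptide)
        if len(peptide) > 1:
            candidates.add(peptide[1:])  # initial Met loss
    return sorted(candidates)


if __name__ == "__main__":
    # Invented upstream residues followed by the MVMARNR start region.
    print(candidate_n_terminal_peptides("LSMVMARNRQ"))
```

The resulting peptide list can be written out as a FASTA file and searched with an off-the-shelf engine, so no custom search software is needed.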
Other Practical Issues
• Suitable for commonly available instrumentation
  • Only the sample prep. is (somewhat) novel.
• Need living organism
  • Stage of life-cycle?
• Bang for buck?
  • N-terminal peptides / $$$$
• In discussions with JCVI (ex TIGR)
  • Possible pilot project?
Other Research Projects
• Improving peptide identification by MS/MS
  • Spectral matching using HMMs
  • Combining search engine results
  • Spectral matching for detection and quantitation
• Microorganism identification using MS
  • Live public web-site and database
• (Inexact) uniqueness guarantees
  • Primer/Probe oligo design
  • Pathogen detection (DNA & Peptide)
  • Significant false-positive peptide identifications
Spectral Matching
• Detection vs. identification
  • Increased sensitivity
  • No novel peptides
• NIST GC/MS Spectral Library
  • Identifies small molecules
  • 100,000’s of (consensus) spectra
  • Bundled/Sold with many instruments
  • “Dot-product” spectral comparison
  • Current project: Peptide MS/MS
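A "dot-product" comparison can be sketched as a normalized dot product (cosine similarity) between binned spectra. This is a plain, unweighted version chosen for illustration; library search tools typically apply their own intensity and m/z weighting, which is not reproduced here. The function names and the 1.0 m/z bin width are assumptions.

```python
import math
from collections import defaultdict


def binned(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins."""
    vector = defaultdict(float)
    for mz, intensity in peaks:
        vector[int(mz / bin_width)] += intensity
    return vector


def dot_product_score(spectrum_a, spectrum_b, bin_width=1.0):
    """Normalized dot product of two spectra given as (m/z, intensity) pairs.

    Returns 1.0 for identical binned spectra and 0.0 when no bins are shared.
    """
    va = binned(spectrum_a, bin_width)
    vb = binned(spectrum_b, bin_width)
    dot = sum(x * vb[k] for k, x in va.items() if k in vb)
    norm = (math.sqrt(sum(x * x for x in va.values()))
            * math.sqrt(sum(x * x for x in vb.values())))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    library = [(175.1, 40.0), (289.2, 100.0), (445.3, 60.0)]
    query = [(175.1, 35.0), (289.2, 90.0), (516.3, 20.0)]
    print(round(dot_product_score(library, query), 3))
```

Because the score depends on a previously observed spectrum, this approach can detect known peptides with high sensitivity but cannot identify peptides that are absent from the library.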
Peptide DLATVYVDVLK
Hidden Markov Models for Spectral Matching
• Capture statistical variation and consensus in peak intensity
• Capture semantics of peaks
  • Extrapolate model to other peptides
• Good specificity with superior sensitivity for peptide detection
  • Assign 1000’s of additional spectra (w/ p-value < 10⁻⁵)
www.RMIDb.org
Statistics:
• 16.7 × 10⁶ (6.4 × 10⁶) protein sequences
• ~ 40,000 organisms, ~ 19,700 species
• 557 (415) complete genomes
Sources:
• TIGR’s CMR, SwissProt, TrEMBL, Genbank Proteins, RefSeq Proteins & Genomes
• Inclusive Glimmer3 predictions on Genomes
• Pfam and GO assignments using BOINC grid
www.RMIDb.org
Accessed from all over the world...
Uniqueness guarantees
• 20-mer oligo signatures for B. anthracis
  • In all available strains as exact match
  • No (inexact) match to other Bacillus species

| Specificity | # Signatures | % of genome |
|---|---|---|
| Exact | 2035086 | 39.4% |
| k = 1 | 866787 | 16.8% |
| k = 2 | 75795 | 1.5% |
| k = 3 | 174 | 0.003% |
Uniqueness guarantees
• Human genome primer design problem
• “4-unique” DNA 20-mers:
  • Edit-distance ≥ 5 to any non-specific hybridization site
  • No such valid loci on Chr. 22!
  • Currently analyzing entire genome
• “3-unique” DNA 20-mers:
  • Initial experiments suggest ~ 0.01% valid
  • Approx. 1 valid oligo every 10,000 bases
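The "k-unique" condition can be stated as a small brute-force check: an oligo is k-unique if its edit distance to every substring of the background is at least k + 1. The sketch uses the standard approximate-matching dynamic program (first row of zeros, so a match may start anywhere in the text). It is only a definition made executable: it is far too slow for a genome, it assumes the background excludes the oligo's own intended locus, and it does not consider the reverse strand. Function names are hypothetical.

```python
def min_edit_distance_to_substring(pattern, text):
    """Smallest edit distance between pattern and any substring of text."""
    prev = [0] * (len(text) + 1)  # empty pattern matches anywhere at cost 0
    for i, p in enumerate(pattern, 1):
        cur = [i] + [0] * len(text)
        for j, t in enumerate(text, 1):
            cur[j] = min(prev[j] + 1,             # delete p from the pattern
                         cur[j - 1] + 1,          # insert t into the pattern
                         prev[j - 1] + (p != t))  # match or substitute
        prev = cur
    return min(prev)


def is_k_unique(oligo, background, k):
    """True if no background sequence contains a site within edit distance k."""
    return all(min_edit_distance_to_substring(oligo, seq) >= k + 1
               for seq in background)


if __name__ == "__main__":
    background = ["TTAGGTTT", "CCCCCCCC"]  # invented sequences
    print(is_k_unique("ACGT", background, 0))  # no exact occurrence
    print(is_k_unique("ACGT", background, 1))  # AGGT is one edit away
```

Under this definition a "4-unique" 20-mer needs distance at least 5 from every off-target site, which is why so few valid loci remain as k grows.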
Future Research Plans
• Cancer biomarkers:
• Optimize proteomics workflow for protein sequence coverage
• Improve informatics infrastructure to make interpretation easier
• Identify splice variants in cancer cell-lines (MCF-7) and clinical brain tumor samples
Future Research Plans
• Genome Annotation
• Collect evidence for functional alternative splicing in public datasets into dbPEP.
• Conduct pilot project for bacterial genome annotation with JCVI.
• Improve informatics infrastructure to make interpretation easier.
Future Research Plans
• Peptide Identification
• Expand library of HMM models for high-confidence spectral matching
• Spectral matching for biomarkers and quantitation (with Calibrant).
• Specificity metric for peptides identified using MS/MS
Future Research Plans
• Microorganism identification by mass spectrometry
• Specificity of tandem mass spectra
• Revamp RMIDb prototype
• Incorporate spectral matching, top-down.
Future Research Plans
• Oligonucleotide Design
• Uniqueness oracle for inexact match in human
• Integration with Primer3
• Tiling, multiplexing, pooling, & tag arrays
Acknowledgements
• Catherine Fenselau, Steve Swatkoski
  • UMCP Biochemistry
• Chau-Wen Tseng, Xue Wu
  • UMCP Computer Science
• Cheng Lee, Brian Balgley
  • Calibrant Biosystems
• PeptideAtlas, HUPO PPP, X!Tandem
• Funding: NIH/NCI, USDA/ARS