Automatic annotation of N-glycan species in MALDI-TOF-TOF spectra for rapid profiling and comparing

Chuan-Yih, Yu

2010.05.14 Capstone

Advisor: Prof. Haixu Tang

Indiana University Bloomington School of Informatics and Computing 

Outline

• Introduction
  – Glycoprotein, monosaccharides, N-linked glycosylation, and mass spectrometry
• Problem set
• Goals
• MultiNGlycan
• Result
• Future work

Introduction

• Post-translational modification (PTM)
  – An enzyme-catalyzed change to a protein after it is synthesized
  – Acetylation, cleavage, glycosylation, methylation, phosphorylation, and prenylation
• 50% of all eukaryotic proteins are glycosylated [1] (Apweiler et al.)

1. Apweiler, R., H. Hermjakob, and N. Sharon, On the frequency of protein glycosylation, as deduced from analysis of the SWISS-PROT database. Biochim Biophys Acta, 1999. 1473(1): p. 4-8.

http://yahoo.brand.edgar-online.com/EFX_dll/EDGARpro.dll?FetchFilingHTML1?SessionID=WD8AC7y2l3h1FMr&ID=5101862

Glycosylation

• N-linked glycosylation
  – Core structure: 2 GlcNAc + 3 Man
  – Attached at Asn-X-Ser or Asn-X-Thr, where X can be any amino acid except Pro (the glycosylation sequon)
  – Glycosylation occurs before folding
• O-linked glycosylation
  – Many different core structures
  – Attached at serine or threonine
  – Glycosylation occurs after folding
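The sequon rule above (Asn-X-Ser/Thr, X not Pro) can be checked mechanically. The following is a minimal sketch, not part of MultiNGlycan; the function name `find_sequons` and the example sequence are made up for illustration.

```python
import re

# Asn, then any residue except Pro, then Ser or Thr.
# The zero-width lookahead lets overlapping sequons (e.g. "NNTS") all be found.
SEQUON = re.compile(r"(?=N[^P][ST])")

def find_sequons(protein):
    """Return the 0-based position of the Asn in every candidate sequon."""
    return [m.start() for m in SEQUON.finditer(protein.upper())]

# Made-up sequence: "NGS" and the overlapping "NNT"/"NTS" match, "NPT" does not.
print(find_sequons("MNGSANPTNNTS"))  # [1, 8, 9]
```

A match only marks a potential site; whether it is actually glycosylated has to be established experimentally.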

N-linked glycosylation

• Tree structure
• Monosaccharides: the building blocks of the polysaccharide chain
• Diverse linkage: at most four branches
• Three types of N-linked glycan tree
  – High mannose
  – Complex
  – Hybrid

Graphs: Varki, A., Essentials of glycobiology. 2nd ed. 2009, Cold Spring Harbor, N.Y.: Cold Spring Harbor Laboratory Press. xxix, 784 p

Name               Molecular formula
Mannose (Man)      C6H12O6
Galactose (Gal)    C6H12O6
Fucose (Fuc)       C6H12O5
GlcNAc             C8H15NO6
NeuNAc             C11H19NO9
NeuNGc             C11H19NO10
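As a rough illustration of how these formulas turn into masses, the sketch below computes monoisotopic masses from the formulas listed above and combines them into a glycan mass by removing one water per glycosidic bond. It assumes a native, underivatized, uncharged glycan, so it is not expected to reproduce masses quoted elsewhere in the slides, which may reflect derivatization or adduct ions. The names `formula_mass` and `glycan_mass` are illustrative only.

```python
import re

# Monoisotopic masses (Da) of the most abundant isotope of each element.
ELEMENT_MASS = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}

# Formulas of the free monosaccharides, copied from the table above.
MONOSACCHARIDES = {
    "Man": "C6H12O6",
    "Gal": "C6H12O6",
    "Fuc": "C6H12O5",
    "GlcNAc": "C8H15NO6",
    "NeuNAc": "C11H19NO9",
    "NeuNGc": "C11H19NO10",
}

def formula_mass(formula):
    """Monoisotopic mass of a simple formula such as 'C6H12O6'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ELEMENT_MASS[element] * (int(count) if count else 1)
    return total

WATER = formula_mass("H2O")

def glycan_mass(composition):
    """Mass of a native glycan: sum of its units minus one water per bond.

    Assumption: a tree of n units has n - 1 glycosidic bonds.
    """
    units = sum(composition.values())
    total = sum(formula_mass(MONOSACCHARIDES[name]) * n
                for name, n in composition.items())
    return total - (units - 1) * WATER

print(round(formula_mass("C6H12O6"), 4))                  # 180.0634
print(round(glycan_mass({"GlcNAc": 2, "Man": 9}), 4))     # 1882.6447
```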

MALDI-TOF-TOF

• Matrix-assisted laser desorption/ionization (MALDI)
• Time of flight (TOF)

Graph: MALDI-TOF Mass Analysis. (2008, 11 16). Retrieved May 2, 2009, from The Protein Facility of the Iowa State University Office of Biotechnology, www.protein.iastate.edu/maldi.html

Problem Sets

Glycopeptide isotope pattern overlap

Graphs: Isotope Pattern Calculator v4.0 http://yanjunhua.tripod.com/pattern.htm http://en.wikipedia.org/wiki/Carbon

2 GlcNAc + 9 Man = 2374.5960
7 GlcNAc + 3 Man = 2375.63

Relative intensity (%) by mass:

Mass    2 GlcNAc + 9 Man    7 GlcNAc + 3 Man
2371    0.0                 -
2372    84.3                0.0
2373    100.0               82.4
2374    68.5                100.0
2375    34.3                68.8
2376    13.9                34.4
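To make the overlap concrete, the sketch below adds the two tabulated patterns into the envelope an instrument would record if both species were present, and reports how much of each observed peak comes from the first species. The intensities are copied from the slide; the equal weighting, the treatment of unlisted masses as zero, and the function names are assumptions for illustration.

```python
# Relative intensities (%) from the table above, keyed by nominal mass.
PATTERN_A = {2371: 0.0, 2372: 84.3, 2373: 100.0, 2374: 68.5, 2375: 34.3, 2376: 13.9}
PATTERN_B = {2372: 0.0, 2373: 82.4, 2374: 100.0, 2375: 68.8, 2376: 34.4}

def mixed_envelope(a, b, weight_a=1.0, weight_b=1.0):
    """Sum two isotope patterns peak by peak; masses missing from a pattern count as 0."""
    masses = sorted(set(a) | set(b))
    return {m: weight_a * a.get(m, 0.0) + weight_b * b.get(m, 0.0) for m in masses}

def share_of_first(a, b, weight_a=1.0, weight_b=1.0):
    """Fraction of each non-empty peak in the mixture that comes from pattern a."""
    mixed = mixed_envelope(a, b, weight_a, weight_b)
    return {m: weight_a * a.get(m, 0.0) / total
            for m, total in mixed.items() if total > 0}

envelope = mixed_envelope(PATTERN_A, PATTERN_B)
shares = share_of_first(PATTERN_A, PATTERN_B)
for mass in envelope:
    if mass in shares:
        print(mass, round(envelope[mass], 1), round(shares[mass], 2))
```

Every peak from 2373 upward is shared, so neither species can be read off a single peak; this is what motivates scoring whole isotope patterns rather than individual masses.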

Problem Sets

High-throughput glycan profiling

http://www.functionalglycomics.org

Goals

• Glycan profile correlation
  – Report scores for non-overlapping and overlapping profiles
  – Glycan examination
• Glycan profile comparison
  – Report glycans that differ significantly between groups
  – Glycan biomarker discovery

Glycan Profile Correlation

• For each glycan combination
  – 412 different glycan combinations (Krambeck et al.) [1]
  – Generate a theoretical isotope pattern
  – Calculate the correlation for the following cases
    1. Glycan
    2. Glycan + glycan, with a linear combination applied
    3. Glycan + unknown, with a linear combination applied
• Mercury algorithm [2]
  – Generates the isotope pattern of the unknown

1. Krambeck, F.J. and M.J. Betenbaugh, A mathematical model of N-linked glycosylation. Biotechnol Bioeng, 2005. 92(6): p. 711-28.
2. Rockwood, A., S. Van Orden, and R. Smith, Rapid Calculation of Isotope Distributions. Analytical Chemistry, 1995. 67: p. 2699-2704.
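A theoretical isotope pattern can be sketched from an elemental composition. Note that this is not the Mercury algorithm, which uses Fourier transforms; the version below substitutes plain repeated convolution of per-element isotope abundances at nominal-mass resolution, which is slower but easier to follow. The function names and the glucose example are illustrative assumptions.

```python
# Natural isotope abundances, listed by increasing nominal mass offset (+0, +1, +2).
ABUNDANCE = {
    "C": [0.9893, 0.0107],
    "H": [0.999885, 0.000115],
    "N": [0.99636, 0.00364],
    "O": [0.99757, 0.00038, 0.00205],
}

def convolve(p, q, max_len):
    """Polynomial product of two distributions, truncated to max_len peaks."""
    out = [0.0] * min(len(p) + len(q) - 1, max_len)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            if i + j < len(out):
                out[i + j] += pi * qj
    return out

def isotope_pattern(counts, n_peaks=6):
    """Relative intensities (%) of the first n_peaks isotope peaks.

    counts maps element symbol to atom count, e.g. {"C": 6, "H": 12, "O": 6}.
    """
    pattern = [1.0]
    for element, n in counts.items():
        base, power = ABUNDANCE[element], [1.0]
        while n:  # exponentiation by squaring: base ** n under convolution
            if n & 1:
                power = convolve(power, base, n_peaks)
            base = convolve(base, base, n_peaks)
            n >>= 1
        pattern = convolve(pattern, power, n_peaks)
    top = max(pattern)
    return [100.0 * x / top for x in pattern]

# Hexose, C6H12O6: the +1 peak is a few percent of the monoisotopic peak.
print([round(x, 2) for x in isotope_pattern({"C": 6, "H": 12, "O": 6})])
```

Truncating to `n_peaks` loses nothing for the peaks that are kept, because each low-order peak depends only on lower-order terms of the factors.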

Three Cases

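[Figure: the experiment spectrum is scored against a theoretical isotope pattern in each of the three cases: a single glycan, two glycans weighted by α and β, and a glycan plus an unknown weighted by α and β. Example scores shown in the figure: 0.2, 0.8, and 0.6.]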
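The two-component cases can be sketched as a small least-squares problem: find non-negative weights α and β so that α·P + β·Q best matches the observed peaks, then score the fit. The slides do not say which correlation measure or fitting method MultiNGlycan uses, so Pearson correlation and ordinary least squares are assumptions here, as are all function names. The same function covers the "glycan + unknown" case if Q is the unknown's generated pattern.

```python
from math import sqrt

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0

def fit_two(observed, p, q):
    """Non-negative least-squares (alpha, beta) for observed ~ alpha*p + beta*q."""
    pp, qq, pq = dot(p, p), dot(q, q), dot(p, q)
    po, qo = dot(p, observed), dot(q, observed)
    # Boundary candidates: neither pattern, only p, or only q.
    candidates = [(0.0, 0.0)]
    if pp > 0:
        candidates.append((max(0.0, po / pp), 0.0))
    if qq > 0:
        candidates.append((0.0, max(0.0, qo / qq)))
    # Interior candidate from the 2x2 normal equations, if it is feasible.
    det = pp * qq - pq * pq
    if det > 1e-12:
        alpha = (po * qq - qo * pq) / det
        beta = (qo * pp - po * pq) / det
        if alpha >= 0 and beta >= 0:
            candidates.append((alpha, beta))

    def residual(w):
        return sum((o - w[0] * a - w[1] * b) ** 2
                   for o, a, b in zip(observed, p, q))

    return min(candidates, key=residual)

def score(observed, p, q):
    """Return (alpha, beta, correlation between observed and fitted envelope)."""
    alpha, beta = fit_two(observed, p, q)
    model = [alpha * a + beta * b for a, b in zip(p, q)]
    return alpha, beta, pearson(observed, model)

# Patterns from the isotope-overlap slide on a common axis (2371..2376);
# the "observed" spectrum is synthetic: 70% of P plus 30% of Q.
P = [0.0, 84.3, 100.0, 68.5, 34.3, 13.9]
Q = [0.0, 0.0, 82.4, 100.0, 68.8, 34.4]
observed = [0.7 * a + 0.3 * b for a, b in zip(P, Q)]
print(score(observed, P, Q))
```

For the single-glycan case, the score reduces to `pearson(observed, P)` with no fitting step.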

Glycan Profile Comparison

• Multiple spectra comparison
• Biomarker discovery
  – Given spectra from several conditions
  – Find the glycans that distinguish the samples

Graph: Ressom, H.W., et al., Analysis of MALDI-TOF mass spectrometry data for discovery of peptide and glycan biomarkers of hepatocellular carcinoma. J Proteome Res, 2008. 7(2): p. 603-10.

HCC: hepatocellular carcinoma (liver cancer)
CLD: chronic liver disease

Concept

Healthy spectra (H1, H2, H3, …, Hk)

Disease spectra (D1, D2, D3, …, Dk)

Remove the least significant component. Repeat until all scores are above the threshold [1].

1. Hastie, T., et al., 'Gene shaving' as a method for identifying distinct sets of genes with similar expression patterns. Genome Biol, 2000. 1(2): p. RESEARCH0003.

70% identical with a cutoff at 0.5
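The remove-and-repeat loop described on this slide can be sketched generically. The slide does not specify how a component's significance is scored, so the sketch takes the scoring function as a parameter and rescoring happens on whatever remains after each removal. The demonstration score (difference of group means divided by the pooled spread) and the intensity values are made up; gene shaving itself derives its scores from principal components.

```python
from statistics import mean, pstdev

def shave(candidates, score_fn, threshold):
    """Drop the lowest-scoring candidate until every remaining score >= threshold.

    score_fn takes the list of remaining names and returns {name: score},
    so scores may depend on which candidates are still present.
    """
    kept = list(candidates)
    while kept:
        scores = score_fn(kept)
        worst = min(kept, key=lambda name: scores[name])
        if scores[worst] >= threshold:
            break
        kept.remove(worst)
    return kept

def separation_scores(healthy, disease):
    """Hypothetical score: |mean difference| / spread of the pooled intensities."""
    def score_fn(names):
        scores = {}
        for name in names:
            h, d = healthy[name], disease[name]
            spread = pstdev(h + d) or 1.0
            scores[name] = abs(mean(h) - mean(d)) / spread
        return scores
    return score_fn

# Made-up intensities for three glycans in three healthy and three disease spectra.
healthy = {"G1": [1.0, 1.1, 0.9], "G2": [1.0, 1.2, 0.8], "G3": [0.5, 0.6, 0.4]}
disease = {"G1": [3.0, 3.1, 2.9], "G2": [1.1, 0.9, 1.0], "G3": [0.5, 0.7, 0.45]}
print(shave(["G1", "G2", "G3"], separation_scores(healthy, disease), 1.0))  # ['G1']
```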

MultiNGlycan

• Software requirements
  – .NET Framework 2.0 (C#)
  – C++ runtime
  – R
  – Thermo Scientific Xcalibur
• Input
  – Spectrum
    • Plain text (peak list), mzXML [1], or RAW (Thermo Scientific raw file)
  – Glycan list
    • CSV file (user-defined)
• Output
  – List of glycans with scores

1. Pedrioli, P., et al., A Common Open Representation of Mass Spectrometry Data and its Application in a Proteomics Research Environment. Nature Biotechnology, 2004. 22(11): p. 1459-1466.

Software Interface

Software Features

• Signal preprocessing
  – Background subtraction
  – Peak smoothing
  – Mass tolerance to account for mass spectrometer accuracy
• Flexible parameters that reflect the actual experiment
• Useful tools
  – Isotope pattern generator
• Content-rich output in multiple formats
  – CSV, text, HTML
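The three preprocessing steps named above can be illustrated with very simple stand-ins. The slides do not describe which algorithms MultiNGlycan actually uses, so the moving-average smoother, the local-minimum baseline, the ppm-based tolerance, and all parameter defaults below are assumptions chosen for brevity.

```python
def smooth(intensities, window=5):
    """Centered moving average; the window shrinks at the edges."""
    half = window // 2
    out = []
    for i in range(len(intensities)):
        chunk = intensities[max(0, i - half): i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out

def subtract_background(intensities, window=50):
    """Subtract the local minimum around each point as a crude baseline."""
    half = window // 2
    out = []
    for i, y in enumerate(intensities):
        baseline = min(intensities[max(0, i - half): i + half + 1])
        out.append(y - baseline)
    return out

def within_tolerance(observed_mz, theoretical_mz, ppm=100.0):
    """True if the observed m/z lies within +/- ppm of the theoretical m/z."""
    return abs(observed_mz - theoretical_mz) <= theoretical_mz * ppm * 1e-6

print(smooth([0, 0, 3, 0, 0], window=3))               # [0.0, 1.0, 1.0, 1.0, 0.0]
print(subtract_background([5, 6, 9, 6, 5], window=3))  # [0, 1, 3, 1, 0]
print(within_tolerance(2374.60, 2374.596))             # True
```

With a 100 ppm window at m/z 2374.596 the tolerance is about 0.24, so a peak one mass unit away is not matched to the same species.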

Software screenshot

HTML result export

Software screenshot

Result

• Data set
  – Liver cancer: 73 individuals
  – Healthy: 78 individuals
• 412 glycan structures are tested
• Glycan criteria
  – Correlation score cutoff: 0.5 (glycans scoring below 0.5 are cut)
  – Present in 30% of total spectra

Tang, Z., et al., Identification of N-glycan serum markers associated with hepatocellular carcinoma from mass spectrometry data. J Proteome Res, 2010. 9(1): p. 104-12.

Ressom, H.W., et al., Analysis of MALDI-TOF mass spectrometry data for discovery of peptide and glycan biomarkers of hepatocellular carcinoma. J Proteome Res, 2008. 7(2): p. 603-10.

Mayampurath, A. and C.-Y. Yu, A Multi-PCA Approach to Glycan Biomarker Discovery using Mass Spectrometry Profile Data. I690 project, Fall 2009.
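The two glycan criteria can be expressed as a small filter. The reading used here is an assumption: a glycan counts as "present" in a spectrum when its correlation score there is at least 0.5, and it is retained when that happens in at least 30% of all spectra. The function name and the example scores are made up.

```python
def select_glycans(scores_by_spectrum, cutoff=0.5, min_fraction=0.30):
    """Keep glycans scoring >= cutoff in at least min_fraction of the spectra.

    scores_by_spectrum: one dict per spectrum mapping glycan name to score;
    a glycan absent from a dict is treated as not detected in that spectrum.
    """
    total = len(scores_by_spectrum)
    if total == 0:
        return []
    counts = {}
    for scores in scores_by_spectrum:
        for glycan, value in scores.items():
            if value >= cutoff:
                counts[glycan] = counts.get(glycan, 0) + 1
    return sorted(g for g, c in counts.items() if c / total >= min_fraction)

# Made-up scores for two glycans across four spectra.
spectra = [
    {"A": 0.9, "B": 0.4},
    {"A": 0.7, "B": 0.6},
    {"A": 0.2, "B": 0.1},
    {"A": 0.8},
]
print(select_glycans(spectra))  # ['A']
```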

Result

[Figure annotations: "Filtered out"; "Can't find the glycan structure in CFG database"; "Correlation score"; "Overlap with 2192"]

Result

Future Work

• Test on more clinical samples
• Verify the correlation between the glycan modifications reported by MultiNGlycan and hepatocellular carcinoma
• Perform these tasks on O-linked glycans
• Apply de novo glycan sequencing to the reported glycans (ongoing)

References

• Mayampurath, A. and C.-Y. Yu, A Multi-PCA Approach to Glycan Biomarker Discovery using Mass Spectrometry Profile Data. I690 project, Fall 2009.
• Apweiler, R., H. Hermjakob, and N. Sharon, On the frequency of protein glycosylation, as deduced from analysis of the SWISS-PROT database. Biochim Biophys Acta, 1999. 1473(1): p. 4-8.
• Shental-Bechor, D. and Y. Levy, Effect of glycosylation on protein folding: A close look at thermodynamic stabilization. PNAS, June 11, 2008.
• Hastie, T., et al., 'Gene shaving' as a method for identifying distinct sets of genes with similar expression patterns. Genome Biol, 2000. 1(2): p. RESEARCH0003.
• Krambeck, F.J. and M.J. Betenbaugh, A mathematical model of N-linked glycosylation. Biotechnol Bioeng, 2005. 92(6): p. 711-28.
• Pedrioli, P., et al., A Common Open Representation of Mass Spectrometry Data and its Application in a Proteomics Research Environment. Nature Biotechnology, 2004. 22(11): p. 1459-1466.
• Ressom, H.W., et al., Analysis of MALDI-TOF mass spectrometry data for discovery of peptide and glycan biomarkers of hepatocellular carcinoma. J Proteome Res, 2008. 7(2): p. 603-10.
• Rockwood, A., S. Van Orden, and R. Smith, Rapid Calculation of Isotope Distributions. Analytical Chemistry, 1995. 67: p. 2699-2704.
• Tang, Z., et al., Identification of N-glycan serum markers associated with hepatocellular carcinoma from mass spectrometry data. J Proteome Res, 2010. 9(1): p. 104-12.
• Varki, A., Essentials of glycobiology. 2nd ed. 2009, Cold Spring Harbor, N.Y.: Cold Spring Harbor Laboratory Press. xxix, 784 p.

Acknowledgments

• Advisor: Prof. Haixu Tang
• Co-worker: Anoop Mayampurath
• Collaborator: Yehia Mechref, Department of Chemistry
• COL Lab members
• This work will be presented on May 26 at the 58th ASMS Conference in Salt Lake City, Utah, and submitted to Bioinformatics as an Application Note.

Thank You