CENTER FOR BIOLOGICAL SEQUENCE ANALYSIS Technical University of Denmark - DTU Department of systems biology B CELL EPITOPES AND PREDICTIONS Thursday, 11 June 2009


CENTER FOR BIOLOGICAL SEQUENCE ANALYSIS

Technical University of Denmark - DTU
Department of systems biology

B CELL EPITOPES AND PREDICTIONS

Thursday, 11 June 2009


ECCB/ISMB-2009 - Immunological Bioinformatics Tutorial

OUTLINE

• What is a B-cell epitope?

• How can you predict B-cell epitopes?


WHAT IS A B-CELL EPITOPE?

Antibody Fab fragment

• B-cell epitopes:

• Accessible structural feature of a pathogen molecule.

• Antibodies are developed to bind the epitope specifically through their complementarity-determining regions (CDRs).


THE BINDING INTERACTIONS

• Salt bridges

• Hydrogen bonds

• Hydrophobic interactions

• Van der Waals forces

Binding strength


B-CELL EPITOPE CLASSIFICATION

B-cell epitope: structural feature of a molecule or pathogen, accessible and recognizable by B-cell receptors and antibodies

• Linear epitopes: one segment of the amino acid chain

• Discontinuous epitope (with linear determinant)

• Discontinuous epitope: several small segments brought into proximity by the protein fold


BINDING OF A DISCONTINUOUS EPITOPE

Antibody Fab fragment complexed with guinea fowl lysozyme (1FBI).

Black: light chain. Blue: heavy chain. Yellow: residues with atoms less than 5 Å from the Fab fragment.

Guinea Fowl Lysozyme KVFGRCELAAAMKRHGLDNYRGYSLGNWVCAAKFESNFNSQNRNTDGS

DYGVLNSRWWCNDGRTPGSRNLCNIPCSALQSSDITATANCAKKIVSDG

GMNAWVAWRKCKGTDVRVWIKGCRL


B-CELL EPITOPE ANNOTATION

• Linear epitopes:

– Chop sequence into small pieces and measure binding to antibody

• Discontinuous epitopes:

– Measure binding of whole protein to antibody

• The best annotation method: X-ray crystal structure of the antibody-epitope complex


B-CELL EPITOPE DATABASES

• Databases:

• IEDB, Los Alamos HIV database, Protein Data Bank, AntiJen, BciPep

• Large amount of data available for linear epitopes

• Few data available for discontinuous epitopes


B CELL EPITOPE PREDICTION


SEQUENCE-BASED METHODS FOR PREDICTION OF LINEAR EPITOPES

• Protein hydrophobicity – hydrophilicity algorithms

– Parker, Fauchere, Janin, Kyte and Doolittle, Manavalan

– Sweet and Eisenberg, Goldman, Engelman and Steitz (GES), von Heijne

• Protein flexibility prediction algorithm

– Karplus and Schulz

• Protein secondary structure prediction algorithms

– PsiPred (D. Jones)

• Protein “antigenicity” prediction:

– Hopp and Woods, Welling

TSQDLSVFPLASCCKDNIASTSVTLGCLVTGYLPMSTTVTWDTGSLNKNVTTFPTTFHETYGLHSIVSQVTASGKWAKQRFTCSVAHAESTAINKTFSACALNFIPPTVKLFHSSCNPVGDTHTTIQLLCLISGYVPGDMEVIWLVDGQKATNIFPYTAPGTKEGNVTSTHSELNITQGEWVSQKTYTCQVTYQGFTFKDEARKCSESDPRGVTSYLSPPSPL


PROPENSITY SCALES: THE PRINCIPLE

• The Parker hydrophilicity scale

• Derived from experimental data

Amino acid      D     E     N     S     Q     G     K     T     R     P

Hydrophilicity  2.46  1.86  1.64  1.50  1.37  1.28  1.26  1.15  0.87  0.30

Amino acid      H     C     A     Y      V      M      I      F      L      W

Hydrophilicity  0.30  0.11  0.03  -0.78  -1.27  -1.41  -2.45  -2.78  -2.87  -3.00


PROPENSITY SCALES: THE PRINCIPLE

….LISTFVDEKRPGSDIVEDLILKDENKTTVI….

Average over the 7-residue window FVDEKRP:

(-2.78 + -1.27 + 2.46 + 1.86 + 1.26 + 0.87 + 0.30)/7 = 0.39

Prediction scores:

0.38 0.1 0.6 0.9 1.0 1.2 2.6 1.0 0.9 0.5 -0.5

Epitope
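The window average on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the code behind any published tool: the scale values are the Parker values listed on the previous slide, while the function name `window_scores`, the choice to assign each mean to the central residue, and the handling of sequence ends (no partial windows) are assumptions.

```python
# Parker hydrophilicity values as listed on the previous slide.
PARKER = {
    "D": 2.46, "E": 1.86, "N": 1.64, "S": 1.50, "Q": 1.37,
    "G": 1.28, "K": 1.26, "T": 1.15, "R": 0.87, "P": 0.30,
    "H": 0.30, "C": 0.11, "A": 0.03, "Y": -0.78, "V": -1.27,
    "M": -1.41, "I": -2.45, "F": -2.78, "L": -2.87, "W": -3.00,
}


def window_scores(sequence, scale=PARKER, window=7):
    """Return (centre_index, mean scale value) for every full window.

    Assumption: the mean is assigned to the central residue and windows
    that would run past either end of the sequence are skipped.
    """
    half = window // 2
    scores = []
    for start in range(len(sequence) - window + 1):
        segment = sequence[start:start + window]
        mean = sum(scale[aa] for aa in segment) / window
        scores.append((start + half, mean))
    return scores


seq = "LISTFVDEKRPGSDIVEDLILKDENKTTVI"
scores = window_scores(seq)

# The window FVDEKRP starts at index 4 and is centred on E (index 7).
print(round(dict(scores)[7], 2))  # 0.39, the value worked out on the slide
```

Residues whose window mean exceeds a chosen threshold would then be reported as predicted epitope residues; the threshold itself is not given on the slide.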


EVALUATION OF PERFORMANCE


TURN PREDICTION AND B-CELL EPITOPES

• Pellequer et al. found that 50% of the epitopes in a data set of 11 proteins were located in turns

• Turn propensity scales for each position in the turn were used for epitope prediction.

Pellequer et al., Immunology Letters, 1993

[Figure: turn with positions 1, 2, 3 and 4 labelled]


BLYTHE AND FLOWER 2005

• Extensive evaluation of propensity scales for epitope prediction

• Conclusion:

–Basically all the classical scales perform close to random!

–Other methods must be used for epitope prediction


BEPIPRED

• Parker hydrophilicity scale

• PSSM

• PSSM based on linear epitopes extracted from the AntiJen database

• The Parker prediction scores and the PSSM scores are combined into a single prediction score

• Tested on the Pellequer dataset and epitopes in the HIV Los Alamos database


PSSM

A C D ………… G S T V W

Pos 1 7.28

Pos 2 9.39

Pos 3 0.3

Pos 4 5.2

Pos 5 7.9

….LISTFVDEKRPGSDIVEDLILKDENKTTVI….

2.46 + 1.86 + 1.26 + 0.87 + 0.3 = 6.75 (prediction value)
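A position-specific scoring matrix (PSSM) is scored by looking up, for each position of a window, the entry for the residue found there and summing the entries. The sketch below illustrates only that lookup-and-sum step. It is a toy: the slide elides most of the matrix, so `TOY_PSSM` places just the five values from the sum above at positions 1 to 5 for the residues D, E, K, R and P (an assumption about which residues they belong to), and every other entry falls back to a placeholder of 0.0. The names `pssm_score` and `scan` are hypothetical.

```python
# Toy 5-position PSSM. Assumption: the five values from the slide's sum sit at
# positions 1-5 for residues D, E, K, R, P. All other entries are unknown here
# and fall back to a placeholder default of 0.0.
TOY_PSSM = [
    {"D": 2.46},
    {"E": 1.86},
    {"K": 1.26},
    {"R": 0.87},
    {"P": 0.30},
]


def pssm_score(peptide, matrix, default=0.0):
    """Sum the matrix entry of each residue at its position in the peptide."""
    if len(peptide) != len(matrix):
        raise ValueError("peptide length must equal the number of PSSM positions")
    return sum(column.get(aa, default) for column, aa in zip(matrix, peptide))


def scan(sequence, matrix):
    """Score every window of the sequence; returns (start_index, score) pairs."""
    width = len(matrix)
    return [
        (start, pssm_score(sequence[start:start + width], matrix))
        for start in range(len(sequence) - width + 1)
    ]


print(round(pssm_score("DEKRP", TOY_PSSM), 2))  # 6.75, the sum on the slide
```

Calling `scan("LISTFVDEKRPGSDIVEDLILKDENKTTVI", TOY_PSSM)` slides the matrix along the example sequence in the same way the propensity-scale window is moved.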


ROC EVALUATION

Evaluation on HIV Los Alamos data set


BEPIPRED PERFORMANCE

• Pellequer data set:

– Levitt AROC = 0.66

– Parker AROC = 0.65

– BepiPred AROC = 0.68

• HIV Los Alamos data set:

– Levitt AROC = 0.57

– Parker AROC = 0.59

– BepiPred AROC = 0.60
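The AROC values above are areas under ROC curves: 0.5 corresponds to random ranking and 1.0 to perfect separation of epitope from non-epitope residues. As a minimal sketch, the area can be computed as the fraction of (epitope, non-epitope) residue pairs in which the epitope residue gets the higher score, with ties counting one half. The function name `roc_auc` and the toy scores are illustrative and not taken from the evaluation on the slides.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: prediction score per residue.
    labels: 1 for epitope residues, 0 for non-epitope residues.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # correctly ranked pair
            elif p == n:
                wins += 0.5      # tie counts as half
    return wins / (len(positives) * len(negatives))


# Toy example: three of the four epitope/non-epitope pairs are ranked correctly.
print(roc_auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0]))  # 0.75
```

The pairwise loop is quadratic in the number of residues; it is written this way for readability rather than speed.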


BEPIPRED

• BepiPred conclusion:

• On both of the evaluation data sets, BepiPred was shown to perform better than the Levitt and Parker scales

• Still the AROC value is low compared to T-cell epitope prediction tools!

• BepiPred is available as a webserver:

• www.cbs.dtu.dk/services/BepiPred


PREDICTION OF LINEAR EPITOPES

Pro

• easily predicted computationally

• easily identified experimentally

• immunodominant epitopes in many cases

• do not need 3D structural information

• easy to produce and check binding activity experimentally

Con

• only ~10% of epitopes can be classified as “linear”

• weakly immunogenic in most cases

• most epitope peptides do not provide antigen-neutralizing immunity

• in many cases represent hypervariable regions


SEQUENCE BASED PREDICTION METHODS

• Linear methods for prediction of B-cell epitopes have low performance

• The problem is analogous to the problem of representing the surface of the earth on a two-dimensional map

• Reduction of the dimensions leads to distortions of scales, directions and distances

• The world of B-cell epitopes is three-dimensional and therefore more sophisticated methods must be developed

Van Regenmortel 1996, Meth. of Enzym. 9.


SO WHAT IS MORE SOPHISTICATED?

• Use of the three dimensional structure of the pathogen protein

• Analyze the structure to find surface exposed regions

• Additional use of information about conformational changes, glycosylation and trans-membrane helices


SOURCES OF THREE-DIMENSIONAL STRUCTURES

• Experimental determination

– X-ray crystallography

– NMR spectroscopy

– Both methods are time consuming and not easily done on a larger scale

• Structure prediction

– Homology modeling

– Fold recognition

– Less time consuming, but there is a possibility of incorrect predictions, especially in loop regions


PROTEIN STRUCTURE PREDICTION METHODS

• Homology/comparative modeling: >25% sequence identity (sequence-to-sequence alignment)

• Fold recognition: <25% sequence identity (PSI-BLAST search / PSSM-to-sequence alignment)

• Ab initio structure prediction: 0% sequence identity


WHAT DO ANTIBODIES RECOGNIZE IN A PROTEIN?

A: Everything accessible to a 10 Å probe on a protein surface

Novotny J. A static accessibility model of protein antigenicity. Int Rev Immunol 1987 Jul;2(4):379-89

[Figure labels: probe, antibody Fab fragment, protrusion index]


THE CEP SERVER

• Conformational epitope server

http://202.41.70.74:8080/cgi-bin/cep.pl

• Uses protein structure as input

• Finds stretches in sequences which are surface exposed


THE DISCOTOPE SERVER

• CBS server for prediction of discontinuous epitopes

• Uses protein structure as input

• Combines propensity scale values of amino acids in discontinuous epitopes with surface exposure

• http://www.cbs.dtu.dk/services/DiscoTope


DISCOTOPE

• Prediction of residues in discontinuous B cell epitopes using protein 3D structures

Pernille Haste Andersen, Morten Nielsen and Ole Lund, Protein Science 2006


A DATA SET OF DISCONTINUOUS B CELL EPITOPES

• Structures of antibody/antigen protein complexes in the Protein Data Bank

• Dr. Andrew Martin’s SACS database (available at http://www.bioinf.org.uk/abs/sacs) was used to get an overview of PDB entries

• Epitopes in the data set were identified by finding antigen residues within 4 Å of the heavy or light chains of the antibodies

• We used homology grouping and cross-validation for the training and testing of the method to avoid biasing towards specific antigens

• The 5 sets used for cross-validated training/testing are available at: http://www.cbs.dtu.dk/suppl/immunology/DiscoTope.php

An example: The epitope of the outer surface protein A from Borrelia burgdorferi (1OSP)
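The distance rule used to annotate epitope residues can be sketched as follows. This is a minimal illustration on plain coordinate tuples, not the procedure actually used to build the data set: parsing of PDB files is left out, the function name `epitope_residues` and the input layout are assumptions, and whether the cutoff is strict or inclusive is not stated on the slide (a strict `<` is used here).

```python
import math


def epitope_residues(antigen_atoms, antibody_atoms, cutoff=4.0):
    """Return antigen residue ids with at least one atom closer than `cutoff`
    (in Å) to any antibody atom.

    antigen_atoms:  list of (residue_id, (x, y, z)) tuples, one per atom.
    antibody_atoms: list of (x, y, z) tuples for heavy- and light-chain atoms.
    """
    hits = set()
    for residue_id, xyz in antigen_atoms:
        if residue_id in hits:
            continue  # residue already annotated through another atom
        if any(math.dist(xyz, ab_xyz) < cutoff for ab_xyz in antibody_atoms):
            hits.add(residue_id)
    return sorted(hits)


# Toy coordinates (hypothetical, in Å): residues 10 and 12 touch the antibody.
antigen = [
    (10, (-1.0, 0.0, 0.0)),
    (10, (1.5, 0.0, 0.0)),
    (11, (9.0, 0.0, 0.0)),
    (12, (3.0, 3.0, 0.0)),
]
antibody = [(4.0, 0.0, 0.0), (4.0, 4.0, 3.0)]

print(epitope_residues(antigen, antibody))  # [10, 12]
```

Changing `cutoff` to 5.0 gives the rule quoted for the lysozyme example earlier in the deck.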


LOG-ODDS RATIOS OF AMINO ACIDS IN DISCONTINUOUS EPITOPES

Frequencies of amino acids in epitope residues compared to frequencies of non-epitope residues

Several discrepancies compared to the Parker hydrophilicity scale

Predictive performance (AUC) of B cell epitopes:

– Parker 0.614

– Epitope log-odds 0.634


DISCOTOPE: A PREDICTION METHOD USING 3D STRUCTURES

A combination method:

• Addition of epitope log-odds values of residues in spatial proximity

• Contact numbers

.LIST..FVDEKRPGSDIVED……ALILKDENKTTVI.

[Figure: log-odds values of residues in spatial proximity (-0.145, +0.346, +1.136, +0.691, +0.346, +1.136, +1.180, +1.164) are added; the sum of log-odds values and the contact number (7 in the example) are combined into the DiscoTope prediction value]
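The two ingredients named on this slide can be sketched on C-alpha coordinates. This is an illustration under stated assumptions, not the DiscoTope implementation: the 10 Å radius, the `weight` parameter, the sign convention (a high contact number, meaning a buried residue, lowers the score) and all function names are hypothetical, since the slide gives neither the distance threshold nor the exact combination rule.

```python
import math


def contact_number(index, ca_coords, radius=10.0):
    """Number of other C-alpha atoms within `radius` Å of residue `index`."""
    centre = ca_coords[index]
    return sum(
        1
        for j, xyz in enumerate(ca_coords)
        if j != index and math.dist(centre, xyz) <= radius
    )


def proximity_sum(index, ca_coords, log_odds, radius=10.0):
    """Sum of log-odds values of residue `index` and its spatial neighbours."""
    centre = ca_coords[index]
    return sum(
        value
        for xyz, value in zip(ca_coords, log_odds)
        if math.dist(centre, xyz) <= radius
    )


def combined_score(index, ca_coords, log_odds, weight=0.5, radius=10.0):
    """Hypothetical combination: log-odds sum minus weighted contact number."""
    return (proximity_sum(index, ca_coords, log_odds, radius)
            - weight * contact_number(index, ca_coords, radius))


# Toy structure: four C-alpha atoms on a line (coordinates in Å, hypothetical).
coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
odds = [0.346, 1.136, -0.145, 0.691]

print(contact_number(0, coords))                  # 2
print(round(proximity_sum(0, coords, odds), 3))   # 1.337
print(round(combined_score(0, coords, odds), 3))  # 0.337
```

Summing over spatial neighbours rather than sequence neighbours is what lets residues from distant parts of the chain contribute to the same score.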


DISCOTOPE: PREDICTION OF DISCONTINUOUS EPITOPES

• Receiver operating characteristic (ROC) curves were used as performance measures

• The reported performance is an average of the AUC values of the non-homologous groups of antigens:

– Parker 0.614 Seq.-based

– Epitope log-odds 0.634 Seq.-based

– Contact numbers 0.647 Str.-based

– Naccess 0.673 Str.-based

– DiscoTope 0.711 Seq./Str.-based


EVALUATION EXAMPLE AMA1

• Apical membrane antigen 1 from Plasmodium falciparum (not used for training/testing)

• Two epitopes were identified using phage-display, sequence variance analysis and point-mutation (green backbone)

• Most residues identified as epitopes were successfully predicted by DiscoTope (black side chains)

DiscoTope is available as webserver: http://www.cbs.dtu.dk/services/DiscoTope/


BIOINFORMATICS APPLICATIONS NOTE, Vol. 24 no. 12 2008, pages 1459–1460, doi:10.1093/bioinformatics/btn199

Structural bioinformatics

PEPITO: improved discontinuous B-cell epitope prediction using multiple distance thresholds and half sphere exposure

Michael J. Sweredoski (1,2) and Pierre Baldi (1,2,*)

(1) Department of Computer Science and (2) Institute for Genomics and Bioinformatics, University of California, Irvine, 92697-3435, California, USA

Received on March 3, 2008; revised on April 18, 2008; accepted on April 20, 2008

Advance Access publication April 28, 2008

Associate Editor: Anna Tramontano

ABSTRACT

Motivation: Accurate prediction of B-cell epitopes is an important goal of computational immunology. Up to 90% of B-cell epitopes are discontinuous in nature, yet most predictors focus on linear epitopes. Even when the tertiary structure of the antigen is available, the accurate prediction of B-cell epitopes remains challenging.

Results: Our predictor, PEPITO, uses a combination of amino-acid propensity scores and half sphere exposure values at multiple distances to achieve state-of-the-art performance. PEPITO achieves an area under the curve (AUC) of 75.4 on the Discotope dataset. Additionally, we benchmark PEPITO as well as the Discotope predictor on the more recent Epitome dataset, achieving AUCs of 68.3 and 66.0, respectively.

Availability: PEPITO is available as part of the SCRATCH suite of protein structure predictors via www.igb.uci.edu.

Supplementary information: Supplementary data are available at Bioinformatics online.

1 INTRODUCTION

B-cell epitope prediction is an important, but unsolved problem in bioinformatics. The ability to accurately predict B-cell epitopes would aid researchers in a variety of immunological applications.

Initial attempts at predicting B-cell epitopes involved the calculation of propensity scales (Hopp and Woods, 1981). While this information can be useful in predicting B-cell epitopes, Blythe and Flower (2005) showed that propensity scales alone are not enough to accurately predict epitopes.

Many of the previous predictors have focused on linear B-cell epitopes. Some of these methods include ABCpred (Saha and Raghava, 2006), BEPITOPE (Odorico and Pellequer, 2003), Bepipred (Larsen et al., 2006) and PEOPLE (Alix, 1999). However, past surveys have estimated that only 10% of the B-cell epitopes are continuous (van Regenmortel, 1996). Additionally, van Regenmortel (2006) noted that even linear epitopes adopt a conformational structure and therefore the distinction is somewhat blurred. Far fewer predictors have been developed for discontinuous B-cell epitopes. One of the first methods explicitly created for identification of discontinuous epitopes was conformational epitope predictor (CEP) (Kulkarni-Kale et al., 2005). Another method described by Rapberger et al. (2007) incorporates epitope–paratope shape complementarity to predict interaction sites. One of the most recent, state-of-the-art, predictors of discontinuous epitopes is Discotope (Andersen et al., 2006), which uses both contact numbers (i.e. the number of Cα atoms within a certain distance threshold) and an amino-acid propensity scale.

Our predictor, PEPITO, attempts to overcome some of the limitations of previous predictors by incorporating an amino-acid propensity scale along with side chain orientation and solvent accessibility information using half sphere exposure values (Hamelryck, 2005). To increase robustness, PEPITO uses propensity scales and half sphere exposure values at multiple distance thresholds from the target residue.

2 METHODS

2.1 Datasets

We obtained epitope datasets for benchmarking prediction methods from both the Discotope Supplementary Materials (Andersen et al., 2006) and Epitome (Schlessinger et al., 2006). The two datasets contain different sets of protein chains and differ in their epitope/non-epitope classification rules. The Discotope dataset, which consists of 75 protein chains, labels all residues in antigen chains within 4 Å of an antibody as epitopes. The Epitome dataset, which consists of 140 protein chains, seeks to eliminate incidental contacts by labeling residues in the antigen within 6 Å of the complementary determining regions of the antibody chains as epitopes.

We derived two additional datasets, C[Discotope] and C[Epitome], from the set of protein chains that are common to both the Epitome and Discotope datasets. The two datasets differ in the method used to identify epitope residues. Eight hundred and seventy-five of the residues in the derived datasets are defined as epitopes using both methods. Four hundred and seventy-one of the residues in the derived datasets are defined as epitopes using the Epitome method but not the Discotope method. One hundred and nine of the residues in the derived datasets are defined as epitopes using the Discotope method but not the Epitome method. The assertions by Schlessinger et al. (2006) would indicate that the 471 residues are integral to the antigen–antibody binding while the 109 residues result from incidental contacts.

Testing procedures require that the protein chains present in the datasets be clustered to prevent any one family from dominating the performance measures. Protein families were previously annotated for the Discotope dataset. UniqueProt (Mika and Rost, 2003) was used to identify protein families in the Epitome dataset and the two derived datasets.

*To whom correspondence should be addressed.

© The Author 2008. Published by Oxford University Press. All rights reserved.

BMC Bioinformatics (BioMed Central)

Open Access, Software

ElliPro: a new structure-based tool for the prediction of antibody epitopes

Julia Ponomarenko* (1,2), Huynh-Hoa Bui (3), Wei Li, Nicholas Fusseder, Philip E Bourne (1,2), Alessandro Sette (4) and Bjoern Peters (4)

Address: (1) San Diego Supercomputer Center, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093, USA, (2) Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093, USA, (3) Isis Pharmaceuticals, Inc., 1896 Rutherford Road, Carlsbad, California 92008, USA and (4) La Jolla Institute for Allergy and Immunology, 9420 Athena Circle, La Jolla, California 92037, USA

* Corresponding author

Abstract

Background: Reliable prediction of antibody, or B-cell, epitopes remains challenging yet highly desirable for the design of vaccines and immunodiagnostics. A correlation between antigenicity, solvent accessibility, and flexibility in proteins was demonstrated. Subsequently, Thornton and colleagues proposed a method for identifying continuous epitopes in the protein regions protruding from the protein's globular surface. The aim of this work was to implement that method as a web-tool and evaluate its performance on discontinuous epitopes known from the structures of antibody-protein complexes.

Results: Here we present ElliPro, a web-tool that implements Thornton's method and, together with a residue clustering algorithm, the MODELLER program and the Jmol viewer, allows the prediction and visualization of antibody epitopes in a given protein sequence or structure. ElliPro has been tested on a benchmark dataset of discontinuous epitopes inferred from 3D structures of antibody-protein complexes. In comparison with six other structure-based methods that can be used for epitope prediction, ElliPro performed the best and gave an AUC value of 0.732, when the most significant prediction was considered for each protein. Since the rank of the best prediction was at most in the top three for more than 70% of proteins and never exceeded five, ElliPro is considered a useful research tool for identifying antibody epitopes in protein antigens. ElliPro is available at http://tools.immuneepitope.org/tools/ElliPro.

Conclusion: The results from ElliPro suggest that further research on antibody epitopes considering more features that discriminate epitopes from non-epitopes may further improve predictions. As ElliPro is based on the geometrical properties of protein structure and does not require training, it might be more generally applied for predicting different types of protein-protein interactions.

Published: 2 December 2008

BMC Bioinformatics 2008, 9:514 doi:10.1186/1471-2105-9-514

Received: 24 September 2008

Accepted: 2 December 2008

This article is available from: http://www.biomedcentral.com/1471-2105/9/514

© 2008 Ponomarenko et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

RECENT DEVELOPMENTS


SECONDARY STRUCTURE IN EPITOPES

Sec. struct.        H      T      B      E      S      G      I      .
Log odds ratio  -0.19   0.30   0.21  -0.27   0.24  -0.04   0.00   0.17

H: Alpha-helix (hydrogen bond from residue i to residue i+4)
G: 3-10 helix (hydrogen bond from residue i to residue i+3)
I: Pi helix (hydrogen bond from residue i to residue i+5)
E: Extended strand
B: Beta bridge (one-residue short strand)
S: Bend (five-residue bend centered at residue i)
T: H-bonded turn (3-turn, 4-turn or 5-turn)
. : Coil

[Guillermo Carbajosa]
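The slide does not say how the log odds ratios were derived. A common definition is the logarithm of a state's frequency among epitope residues divided by its frequency in a background set, and the sketch below assumes that definition and the natural logarithm. The counts are hypothetical and `log_odds` is an illustrative helper, not the authors' code.

```python
import math


def log_odds(epitope_counts, background_counts):
    """Return {state: ln(f_epitope / f_background)}.

    States with a zero count on either side are skipped, because the
    ratio is undefined without pseudocounts.
    """
    epi_total = sum(epitope_counts.values())
    bg_total = sum(background_counts.values())
    result = {}
    for state, epi_n in epitope_counts.items():
        bg_n = background_counts.get(state, 0)
        if epi_n == 0 or bg_n == 0:
            continue
        result[state] = math.log((epi_n / epi_total) / (bg_n / bg_total))
    return result


# Hypothetical counts of DSSP states (H = helix, T = turn, E = strand).
epitope = {"H": 20, "T": 30, "E": 50}
background = {"H": 300, "T": 200, "E": 500}
odds = log_odds(epitope, background)
```

Under this definition a positive value means the state is over-represented in epitopes and a negative value means it is depleted.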


AMINO ACIDS IN EPITOPES

Amino acid   G     A     V     L     I     M     P     F     W     S
e/E          0.09  0.07  0.05  0.08  0.04  0.02  0.06  0.03  0.01  0.08
.            0.07  0.08  0.07  0.10  0.06  0.03  0.05  0.05  0.02  0.07

Amino acid   C     T     Q     N     H     Y     E     D     K     R
e/E          0.03  0.08  0.04  0.04  0.02  0.04  0.06  0.07  0.07  0.04
.            0.03  0.06  0.04  0.05  0.02  0.03  0.04  0.04  0.05  0.04

[Guillermo Carbajosa]
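The two frequency rows can be compared directly as a per-residue enrichment. The sketch below assumes that the "e/E" row holds frequencies among epitope residues and the "." row holds frequencies among the remaining residues; the slide does not label the rows explicitly. Because the values are rounded to two decimals, the resulting ratios are only rough.

```python
import math

# Values copied from the table above, in the order the slide lists them.
AMINO_ACIDS = "GAVLIMPFWSCTQNHYEDKR"
EPITOPE_FREQ = [0.09, 0.07, 0.05, 0.08, 0.04, 0.02, 0.06, 0.03, 0.01, 0.08,
                0.03, 0.08, 0.04, 0.04, 0.02, 0.04, 0.06, 0.07, 0.07, 0.04]
OTHER_FREQ = [0.07, 0.08, 0.07, 0.10, 0.06, 0.03, 0.05, 0.05, 0.02, 0.07,
              0.03, 0.06, 0.04, 0.05, 0.02, 0.03, 0.04, 0.04, 0.05, 0.04]


def enrichment_table():
    """Return (amino acid, log2 ratio) pairs, most enriched first."""
    rows = [(aa, math.log2(e / o))
            for aa, e, o in zip(AMINO_ACIDS, EPITOPE_FREQ, OTHER_FREQ)]
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


ranked = enrichment_table()
```

With these numbers the charged and polar residues D, E and K come out at the enriched end, while W, F and I sit at the depleted end.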


DIHEDRAL ANGLES IN EPITOPES

Z-scores for number of dihedral angle combinations in epitopes vs. non-epitopes

Phi\Psi       1      2      3      4      5      6      7      8      9     10     11     12
1         -0.47   0.44  -0.58   0.45   0.46   0.00   0.00  -0.73  -0.79   0.00  -0.83   1.42
2         -0.01  -0.12  -1.82   0.52   1.75   0.00   0.00   0.00   1.42  -0.82   0.00   0.00
3          1.82  -2.26  -1.57   0.48   0.10   0.00  -0.77   0.45   1.77   0.00  -0.82   0.99
4          1.76   1.15  -0.34   0.75   0.00   0.00   0.97   0.16   0.38   1.03   0.00   0.00
5         -0.85   0.45  -1.09   0.57   0.00   0.00   0.00   0.13   1.52   0.00   1.02  -0.79
6          0.60   1.28   1.30   1.73   0.00   0.00   0.00   0.00   1.32  -0.89  -0.76   0.00
7          0.27  -0.91   1.67  -0.51   0.00   0.00   0.00   0.00  -1.02  -1.09   0.00   0.00
8          0.93   1.21  -0.23  -3.63   0.49   0.00   0.00   0.00   0.00  -0.19   0.31  -0.82
9          0.00   0.28  -0.67   0.33   0.01  -0.83   0.00   0.00   0.87   0.23   0.00   0.00
10         0.00   0.95   1.71  -0.70   0.00   0.00   0.00   1.29   1.08   0.00   1.00   0.00
11         0.00   0.00   1.02   0.00   0.00   0.00   0.00   0.86  -0.75   0.00   0.00   0.00
12         0.42   0.83   0.28   1.68   0.00   0.00   0.00   0.00   1.03  -0.21  -0.79   0.93

[Guillermo Carbajosa]
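The slide does not define the 12 phi and psi classes. One plausible reading, assumed here, is twelve equal 30-degree bins covering -180 to 180 degrees, numbered from 1. The sketch below shows how residues could be counted into such a 12 x 12 grid before any Z-score is computed; `dihedral_bin` and `count_bins` are hypothetical names and the angle pairs are invented.

```python
def dihedral_bin(angle_deg, n_bins=12):
    """Map an angle in degrees to a 1-based bin; -180 and 180 share bin 1."""
    width = 360.0 / n_bins
    index = int(((angle_deg + 180.0) % 360.0) // width)
    # Guard against floating-point wrap-around landing exactly on 360.0.
    return min(index, n_bins - 1) + 1


def count_bins(phi_psi_pairs, n_bins=12):
    """Count (phi, psi) pairs into an n_bins x n_bins grid, rows = phi."""
    counts = [[0] * n_bins for _ in range(n_bins)]
    for phi, psi in phi_psi_pairs:
        counts[dihedral_bin(phi, n_bins) - 1][dihedral_bin(psi, n_bins) - 1] += 1
    return counts


# Hypothetical backbone angles: two helix-like residues, one strand-like.
grid = count_bins([(-60.0, -45.0), (-60.0, -45.0), (-120.0, 130.0)])
```

Building one such grid for epitope residues and one for non-epitope residues gives the counts that a Z-score per cell would compare.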


RATIONAL VACCINE DESIGN

>PATHOGEN PROTEIN
KVFGRCELAAAMKRHGLDNYRGY
SLGNWVCAAKFESNF



RATIONAL B-CELL EPITOPE DESIGN

• Protein target choice

• Structural analysis of antigen

  • Known structure or homology model
  • Precise domain structure
  • Physical annotation (flexibility, electrostatics, hydrophobicity)
  • Functional annotation (sequence variations, active sites, binding sites, glycosylation sites, etc.)

[Figure panels: Known 3D structure; Model]


RATIONAL B-CELL EPITOPE DESIGN

• Protein target choice

• Structural annotation

• Epitope prediction and ranking

  • Surface accessibility
  • Protrusion index
  • Conserved sequence
  • Glycosylation status
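The four ranking criteria can be combined in many ways, and the slide does not prescribe one. As a sketch only, the Python below scores each candidate as a weighted sum of accessibility, protrusion and conservation (each assumed to be normalized to 0..1) and, as a further assumption, gives glycosylated candidates a score of zero on the reasoning that a glycan may shield the site. The weights, the `Candidate` class and the example values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    accessibility: float  # normalized surface accessibility, 0..1
    protrusion: float     # normalized protrusion index, 0..1
    conservation: float   # fraction of conserved positions, 0..1
    glycosylated: bool    # True if a glycosylation site overlaps the epitope


def score(candidate, weights=(0.4, 0.3, 0.3)):
    """Weighted sum of the three numeric criteria; illustrative weights."""
    if candidate.glycosylated:
        return 0.0  # assumption: glycan-shielded sites are deprioritized
    w_acc, w_pro, w_con = weights
    return (w_acc * candidate.accessibility
            + w_pro * candidate.protrusion
            + w_con * candidate.conservation)


def rank(candidates):
    """Return candidates sorted from highest to lowest score."""
    return sorted(candidates, key=score, reverse=True)


candidates = [
    Candidate("loop_A", 0.9, 0.8, 0.7, False),
    Candidate("loop_B", 1.0, 1.0, 1.0, True),
    Candidate("helix_C", 0.5, 0.5, 0.5, False),
]
ranked_names = [c.name for c in rank(candidates)]
```

In practice the weights would need to be chosen, or replaced by a different combination rule, according to the pathogen and the data available.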


RATIONAL B-CELL EPITOPE DESIGN

• Protein target choice

• Structural annotation

• Epitope prediction and ranking

• Optimal Epitope presentation

  • Fold minimization, or design of structural mimics
  • Choice of carrier (conjugates, DNA plasmids, virus-like particles)
  • Multiple chain protein engineering


MULTI-EPITOPE PROTEIN DESIGN

Rational optimization of epitope-VLP chimeric proteins:

Design a library of possible linkers (<10 aa)

Perform global energy optimization in VLP (virus-like particle) context

Rank according to estimated energy strain

[Figure labels: B-cell epitope; T-cell epitope]
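The enumerate-then-rank workflow listed on this slide can be sketched as follows. The energy optimization itself requires a structural modeling tool and is not reproduced here: `toy_strain` is a placeholder scoring function with no physical meaning, and the glycine/serine alphabet and the length limits are assumptions chosen to keep the example small.

```python
from itertools import product


def linker_library(alphabet="GS", min_len=1, max_len=4):
    """Yield every linker sequence over `alphabet` within the length range."""
    if max_len >= 10:
        raise ValueError("the slide restricts linkers to fewer than 10 aa")
    for length in range(min_len, max_len + 1):
        for combo in product(alphabet, repeat=length):
            yield "".join(combo)


def rank_linkers(linkers, strain_fn):
    """Sort linkers from lowest to highest estimated strain."""
    return sorted(linkers, key=strain_fn)


def toy_strain(linker, target_len=3):
    """Placeholder score, NOT an energy model: prefers a given length."""
    return abs(len(linker) - target_len) + 0.1 * linker.count("S")


library = list(linker_library())
best_first = rank_linkers(library, toy_strain)
```

A real pipeline would replace `toy_strain` with the strain energy estimated after optimizing each chimeric construct in the VLP context.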


CONCLUSIONS

• Rational vaccines can be designed to induce strong and epitope-specific B-cell responses

• Selection of protective B-cell epitopes involves structural, functional and immunogenic analysis of the pathogenic proteins

• When you can, use protein structure for prediction

• Structural modeling tools are helpful in prediction of epitopes, design of epitope mimics and optimal epitope presentation
