Prediction of Protein-Protein Binding Sites and Epitope Mapping · 2018. 4. 26.

Copyright © 2017 Chemical Computing Group ULC All Rights Reserved.

Prediction of Protein-Protein Binding Sites and Epitope Mapping

Epitope Mapping

▪ Antibodies interact with antigens at epitopes

– An epitope is a collection of residues on the antigen

– Continuous (sequence) or non-continuous

– Patentable druggable sites

▪ Methods for mapping epitopes

– X-ray crystallography (gold standard)

– Peptide scanning

– Display technologies (e.g. phage display)

– Hydrogen-Deuterium exchange (HDX)

▪ In silico methods for mapping epitopes

– Patch Analysis (antigen surface properties)

– Protein Docking (pose prediction)

– Very challenging: large surface area / conformational flexibility

▪ Key question: are in silico methods accurate?

– Suggest mutagenesis experiments early in development

– Reduce crystallographic efforts / expense

[Figure: antigen with a non-continuous interaction and a continuous interaction]

Hydrophobic Patches

▪ Identify key hydrophobic regions on the molecular surface

– Hydrophobic patches implicated in protein binding

– Desolvation of hydrophobic patches is a driver of protein-protein interactions (e.g. binding affinity, HIC, aggregation, etc.)

▪ Protein Patch Analyzer Methodology

– Derive patches from local lipophilic potential [1]

– Map atomic logP (octanol/water partition coefficient) onto surface

– Remove portions of surface with local logP < 0.1 (≈ methyl C)

– Retain contiguous patches of at least 50 Å² (≈ methane)
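The thresholding and patch-retention steps above can be sketched in a few lines. This is a minimal illustration, not the Protein Patch Analyzer implementation: the surface is assumed to be a list of discrete elements, each with a local logP value and an area, and the `neighbors` adjacency list is a hypothetical stand-in for whatever surface connectivity the real method uses. Only the two cutoffs (0.1 and 50 Å²) come from the slide.

```python
from collections import deque

LOGP_CUTOFF = 0.1      # from the slide: drop surface with local logP < 0.1
MIN_PATCH_AREA = 50.0  # from the slide: keep contiguous patches >= 50 Å^2


def hydrophobic_patches(logp, area, neighbors):
    """Return (sorted element indices, total area) for each retained patch.

    logp[i], area[i]: local logP and area (Å^2) of surface element i.
    neighbors[i]: indices of elements adjacent to element i (hypothetical
    adjacency; the actual surface representation is not given in the slides).
    """
    keep = [value >= LOGP_CUTOFF for value in logp]
    seen = set()
    patches = []
    for start in range(len(logp)):
        if not keep[start] or start in seen:
            continue
        # Breadth-first search over hydrophobic neighbors: one contiguous patch.
        seen.add(start)
        queue = deque([start])
        members = []
        while queue:
            i = queue.popleft()
            members.append(i)
            for j in neighbors[i]:
                if keep[j] and j not in seen:
                    seen.add(j)
                    queue.append(j)
        total = sum(area[i] for i in members)
        if total >= MIN_PATCH_AREA:
            patches.append((sorted(members), total))
    return patches
```

Calling `hydrophobic_patches` on a toy chain of five elements returns only the connected hydrophobic runs whose summed area reaches the cutoff; smaller runs are discarded.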

[Figure: antibody (PDB: 3G6D) surface with hydrophobic and hydrophilic regions, a hydrophobic patch and a negative patch labeled]

1. Wildman, S.A., Crippen, G.M.; J. Chem. Inf. Comput. Sci. 39(5), 868-873 (1999)

% Inclusion Body Predictions for 31 Adnectins

▪ Sequences and experimental data provided in publication [1]

– % IB: a measure of insoluble protein aggregate formation

– 31 triple mutants in series

▪ Mutating key residues of hydrophobic patches improves solution behavior

– Mutated residues correspond to hydrophobic patches

– Polar variants reduce % IB

– Rational design using homology models effective in this case

Adnectin    % IB
31          76
1           17
2           19
3           20

[Figure: Adnectin 31 homology model, with 31, 1, 2 and 3 labeled]

1. Trainor et al, JMB 2016, 428(6), p1365-1374

Hydrophobic Patches Predict Adnectin Solubility

▪ Homology model of all 31 mutant adnectins

– Measure hydrophobic patch surface area

– Correlate with experimental data – compare with SAP score

[Figure: predicted vs. experimental % in inclusion bodies, both axes 0 to 80; MOE (R2=0.701), SAP (R2=0.709)]

MOE Hydrophobic Patch Descriptor

– Sum of hydrophobic patch surface areas

– Averaged over 25 LowModeMD conformations

SAP Spatial Aggregation Propensity (Trout)

– Score of exposed residues

– Averaged over MD simulation
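As a rough sketch of how such a descriptor and its fit quality could be computed: the first function sums patch areas within each conformation and averages over conformations, as described for the MOE descriptor. The second computes a squared Pearson correlation, which is assumed here to be what the reported R2 values mean; the slides do not state the regression details. The function names and input layout are illustrative.

```python
from statistics import mean


def patch_descriptor(conformations):
    """Sum patch areas within each conformation, then average over conformations.

    conformations: one list of patch surface areas (Å^2) per conformation.
    """
    return mean(sum(patch_areas) for patch_areas in conformations)


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)
```

In use, `patch_descriptor` would be evaluated once per mutant and the resulting values compared with experimental % IB through `r_squared`.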

Chennamsetty et al, J Phys Chem B 2010,114(19), p6614-6624

Hydrophobic Patch Statistics

▪ ~3000 crystal structures of antibodies in PDB, 1582 used

– Removed structures with resolution > 3 Å or missing segments in Fv domain

▪ Small hydrophobic patches are common

– Unusually large patches present concern for developability

– Multiple hydrophobic patches together can reduce solubility

[Figure: histogram of frequency (0 to 350) vs. # of hydrophobic patches (1 to 16)]

[Figure: histogram of frequency (0 to 300) vs. SA of largest hydrophobic patch (Å², bins from 60 to 460), with 120 Å² marked]

HIC Predictions for 137 Clinical Candidates

▪ Recent publication providing sequences and experimental data [1]

– Diverse set of antibody Fv regions expressed on common IgG1 scaffold

– Use data as benchmark to measure broad applicability of HIC predictions

▪ Generate homology models and predictions with descriptors / QSPR

– Automatic antibody modeling and patch surface area calculation

[Figure: antibody homology model]

1. Jain et al. PNAS 2017;114:944-949

HIC Predictions for 137 Clinical Candidates

▪ 3D hydrophobicity descriptors outperform sequence-based index

– Signals are generally modest

– Likely due to clinical candidates having been solubility optimized

– Difficult to differentiate narrow range of low HIC RT values

▪ CDR region important

– Candidates -> framework mutations

– LowModeMD sampling improvement

▪ Prediction quality is case dependent

– Adnectin predictions more accurate

– Triple mutants easier than multi-class

▪ QSPR model more predictive

– 4-point QSPR can enhance signal

[Figure: HIC prediction correlation plots; r2 = 0.17, r2 = 0.10, r2 = 0.23, r2 = 0.33, r2 = 0.41]

HIC Predictions for 137 Clinical Candidates

▪ Sum of hydrophobic patch surface areas of CDRs calculated

▪ Many candidates hydrophobic vs PDB antibodies

– Trastuzumab has a low HIC RT (9.7 min)

– Cixutumumab has higher HIC RT (11.8 min) but longer H3 loop dilutes the hydrophobicity

– Cixutumumab removed from pipeline after phase II

▪ Predictions across classes are difficult

– Relative predictions within a series are more reliable

– Ab:Ag co-crystal structure useful for optimization

[Figure: histogram of frequency (0 to 25) vs. cdr_hyd SA (Å², 0 to 700) for PDB antibodies, with Trastuzumab and Cixutumumab marked and annotations at 240 Å² and 700 Å²]

Hydrophobic Patches Prevalent at PPI Sites

▪ 750 antibody:antigen complexes from antibody structural database

– Antigens larger than 2000 residues removed

– Count hydrophobic patch frequency at PPI

– Order by patch surface area (blue is largest)

▪ Useful for epitope mapping and identifying residues for PPI optimization

– To reduce undesired PPIs, mutate residues forming hydrophobic patches

[Figure: bar chart of % (0 to 100) for ab and ag; structure images labeled Antibody and Antigen]

Patch Analysis of Interleukin-1β

▪ Known complexes with Canakinumab and Gevokizumab

– Protein-protein interfaces consist of clusters of multiple patches

– Patches are present in other exposed regions

▪ Patch analysis alone is insufficient to predict epitopes

– Ranking patches by size does not correlate with interface

[Figure: Interleukin-1β (4I1B) with Canakinumab (4G6J) and Gevokizumab (4G6M)]

Protein-Protein Docking

▪ Predict macromolecular complexes with protein docking

– More comprehensive approach for identifying binding sites

– Computationally more expensive than Protein Patch Analysis

▪ First principles physical approximations

– No protein structure training set used in parametrization

▪ Site restraints incorporated using generic energy term

– Applied automatically to antibody CDR regions

[Figure: antibody (PDB: 3G6D) with antibody CDRs and antigen pose predictions (100-200)]

Antibody Docking Results

▪ Repacking/refinement improves ranking

– Backbone and sidechain rearrangement in complex

– Scoring the single correct pose is not realistic

– Ranking of top 100 shows good enrichment

– Use ensemble of poses to refine prediction

▪ ZDOCK 4.0 Benchmark test set

– Success defined as a structure with ligand RMSD < 10 Å

– Prior knowledge (site restraint) needed to achieve high accuracy with state-of-the-art dockers

[Figure: Performance of Protein Dockers on Antibodies; success rate (%) from 0 to 100 for Top 5, Top 30 and Top 100; dockers: PatchDock, HEX/G, GRAMM, HEX, FTDock, FTDock/G, ZDOCK2.1, MolFit/G, MolFig/GHDOT, ZDOCK1.3, SDOCK, ZDOCK2.…, FRODOCK, ATTRACT; 21 antibody-antigen complexes in ZDOCK set]

[Figure: ATTRACT:LJ, ZDOCK3.0.2, PIPER, MOE]

Huang, S.-Y.; Exploring the potential of global protein–protein docking; Drug Discovery Today, 20(8) (2015) 969–977
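The benchmark success criterion (a pose with ligand RMSD < 10 Å among the top-ranked poses) can be written down as a short sketch. It assumes each target comes with a list of ligand RMSD values already ordered by docking rank; the function name and input layout are illustrative and not taken from any docking package.

```python
def success_rate(ranked_rmsds_per_target, top_n, rmsd_cutoff=10.0):
    """Percentage of targets with at least one hit among the top_n ranked poses.

    ranked_rmsds_per_target: for each target, ligand RMSD values (Å) to the
    native complex, ordered by docking rank (best-scored pose first).
    """
    hits = sum(
        any(rmsd < rmsd_cutoff for rmsd in rmsds[:top_n])
        for rmsds in ranked_rmsds_per_target
    )
    return 100.0 * hits / len(ranked_rmsds_per_target)
```

Evaluating the same data with `top_n` set to 5, 30 and 100 gives the three series of a Top 5 / Top 30 / Top 100 comparison.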

Epitope Mapping Uses Multiple Docked Poses

▪ An approach for analyzing the protein poses is required

– 100-200 poses needed for reasonable performance

– Use protein-protein interaction fingerprints to describe poses

▪ Goal: Identify sets of interacting residues

– Dock antibody Fv restricted to CDRs to the entire antigen

– For each pose, identify antigen residues that are implicated

– Generate statistical summaries of docked poses

– Is epitope identification more robust than predicting a single pose?

▪ Validation: Similarity to native complex structure

– Measure overlap between sets of interacting residues in different poses

– Use ZDOCK5 benchmark complex structures
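One common way to measure overlap between two sets of interacting residues is the Tanimoto (Jaccard) coefficient. The slides do not name the similarity measure that is used, so the sketch below is an assumption chosen for illustration; residues are represented by arbitrary hashable identifiers.

```python
def tanimoto(residues_a, residues_b):
    """Shared residues divided by all residues seen in either pose."""
    a, b = set(residues_a), set(residues_b)
    union = a | b
    if not union:
        return 0.0  # two empty contact sets: treat as no similarity
    return len(a & b) / len(union)
```

A value of 1.0 means both poses contact exactly the same antigen residues, and 0.0 means they share none.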

[Figure: antigen with antibody pose 1 (blue) and antibody pose 2 (red); interacting residues shown as spheres]

Identify Interactions using Surface Patch Contacts

▪ Identify residue-residue interactions from antibody poses

– Fingerprints: surface patch contacts on the antigen in each pose

– Count residue on the antigen if it is interacting with the antibody

▪ Classify patch contacts into four categories

– Both HYD, One HYD, Any POS/NEG, None

▪ Example: Colicin E7 nuclease (7CEI), key residue N26
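The four contact categories can be illustrated with a small classifier. The precedence used here (hydrophobic patches checked first, then charged patches) follows the order in which the categories are listed and is an assumption; the slides do not spell out how overlapping cases are resolved. Patch type labels are illustrative strings.

```python
def classify_contact(antigen_patch, antibody_patch):
    """Classify one surface contact from the patch type on each side.

    Patch types are "HYD", "POS", "NEG", or None (no patch at that site).
    """
    kinds = (antigen_patch, antibody_patch)
    n_hyd = kinds.count("HYD")
    if n_hyd == 2:
        return "Both HYD"
    if n_hyd == 1:
        return "One HYD"
    if "POS" in kinds or "NEG" in kinds:
        return "Any POS/NEG"
    return "None"
```

Under this ordering, a contact between a hydrophobic patch and a charged patch is reported as "One HYD" rather than "Any POS/NEG".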

[Figure: native complex crystal contacts vs. unbound structure docked pose (5 Å); H-bonds, receptor surface and inhibitor surface shown]

Surface Contacts Describe Native Complexes

▪ Benchmark data set of protein-protein complexes

– 150 rigid-body targets from ZDOCK test set

– 32 antibody-antigen complexes, 118 others

▪ Protein contacts calculated using distance and energy

– Cut-offs applied to select most significant interactions

– Average number of contacts: ~6 per complex

▪ Surface interactions calculated from patch fingerprints

– Most contacts correspond to surface interactions

– Energetic calculations tend to over-emphasize charged interactions

– Surface contacts better represent hydrophobic contacts

[Figure: histogram of Protein Contacts (0 to 13) for non-Ab and Antibody targets]

[Figure: histogram of Residues in Common (0 to 9) for non-Ab and Antibody targets]

Protein-Protein Interaction Fingerprints

▪ Populations are calculated for each residue in all poses

– The top-200 refined poses are used in the analysis

– The most likely residues to interact with the antibody are identified

▪ Highest peaks do not necessarily represent epitopes

– Cluster poses to identify residues that are spatially close
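Per-residue populations over a set of poses amount to a simple count. The sketch below assumes each pose has already been reduced to the collection of antigen residues it contacts (its fingerprint); the function name and data layout are illustrative.

```python
from collections import Counter


def residue_populations(pose_fingerprints):
    """Fraction of poses in which each antigen residue contacts the antibody."""
    counts = Counter()
    for residues in pose_fingerprints:
        counts.update(set(residues))  # count each residue once per pose
    n_poses = len(pose_fingerprints)
    return {residue: n / n_poses for residue, n in counts.items()}
```

Sorting the returned dictionary by value gives the histogram peaks; as the slide notes, the highest peaks alone need not form a spatially coherent epitope.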

[Figure: Interleukin-1β/canakinumab pose interaction histogram]

Using Fingerprints to Determine Epitopes

[Figure: Interleukin-1β/canakinumab Epitope Selection; poses in top cluster selected, consensus residues in epitope]

▪ Cluster poses by fingerprint similarity

– Connected poses are defined by a similarity cutoff (0.4)

▪ Rank clusters by ensemble free energy

– Effective temperature is an empirical parameter fit to score values

▪ For each cluster generate the epitope

– Each cluster is a potential epitope (spatially close)

– Select highest-population residues from each cluster
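The three steps (cluster, rank, select) can be sketched end to end under several assumptions that are not stated in the slides: similarity is taken to be the Tanimoto coefficient, "connected poses" is read as connected components of the graph that links poses at or above the 0.4 cutoff, and the ensemble free energy is taken to have the standard form F = -T ln Σ exp(-E_i / T) over the docking scores E_i in a cluster. The default temperature and the number of residues kept per cluster are placeholders.

```python
import math
from collections import Counter


def tanimoto(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_poses(fingerprints, cutoff=0.4):
    """Connected components of the graph linking poses with similarity >= cutoff."""
    n = len(fingerprints)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if tanimoto(fingerprints[i], fingerprints[j]) >= cutoff:
                parent[find(i)] = find(j)
    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values())


def ensemble_free_energy(scores, temperature):
    """F = -T ln(sum_i exp(-E_i / T)), evaluated stably (log-sum-exp)."""
    lowest = min(scores)
    total = sum(math.exp(-(e - lowest) / temperature) for e in scores)
    return lowest - temperature * math.log(total)


def predict_epitopes(fingerprints, scores, temperature=1.0, n_residues=3):
    """Return (free energy, consensus residues) per cluster, best cluster first.

    fingerprints: one set of antigen residue ids per pose.
    scores: one docking score per pose (lower assumed better).
    """
    results = []
    for members in cluster_poses(fingerprints):
        free_energy = ensemble_free_energy([scores[i] for i in members], temperature)
        counts = Counter(r for i in members for r in fingerprints[i])
        # Highest-population residues; ties broken by residue id for determinism.
        top = sorted(counts, key=lambda r: (-counts[r], r))[:n_residues]
        results.append((free_energy, sorted(top)))
    results.sort(key=lambda item: item[0])
    return results
```

Because the free energy sums over all poses in a cluster, a large cluster of moderately scored poses can outrank a single well-scored outlier, which fits the observation that ranking ensembles is more robust than ranking single poses.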

Epitope Benchmark Results

▪ Predictions carried out using ZDOCK test set

– Selected targets with representative docking performance

– Divided into antibody (20) and non-antibody (30) classes

▪ Native contacts identified in predicted epitopes

▪ At least 3 native contacts in top-ranked epitope

– Antibody: 60%, non-antibody: 53%
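The "at least 3 native contacts" criterion is a set intersection followed by a threshold. A minimal sketch, assuming each target supplies the residues of its top-ranked predicted epitope and the residues observed in contact in the native complex; the names and data layout are illustrative.

```python
def native_contact_count(predicted_epitope, native_contacts):
    """Number of predicted epitope residues that are contacts in the native complex."""
    return len(set(predicted_epitope) & set(native_contacts))


def percent_with_min_contacts(targets, min_contacts=3):
    """targets: list of (predicted_epitope, native_contacts) residue collections."""
    hits = sum(
        native_contact_count(predicted, native) >= min_contacts
        for predicted, native in targets
    )
    return 100.0 * hits / len(targets)
```

Changing `min_contacts` to 1 or 5 yields the other contact-count bands shown in the benchmark charts.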

[Figure: percentage of targets with native contacts (0 to 100) in the Top 1 and Top 5 predicted epitopes, for Antibody and non-Antibody targets; contact bands 5+, 3-4, 1-2]

[Figure: Benchmark Docking Results; number of targets (0 to 10) vs. number of hits in top-200 (0, 1, 2, 3-5, 6-9, 10-14, 15+), for Antibody and non-Antibody targets]

Predicted Epitopes of IL-1β Complexes

▪ Results of docking with different antibody structures

– Correct ordering impossible from patch analysis alone

– Top-ranked docking poses are not in top-ranked clusters

▪ PPI fingerprint analysis correctly predicts crystal epitopes

[Figure: IL-1β with predicted epitopes #1, #2 and #3 for Gevokizumab and for Canakinumab]

Hydrogen-Deuterium Exchange (HDX)

▪ Soak the protein with deuterium

– Deuterium replaces H over time; buried atoms are not replaced

– Solvent accessibility, H-bonding, sidechain neighbors affect uptake

▪ Digest with pepsin and use mass spectrometry

– Collection of partially deuterated sequence fragments

▪ Measure deuteration of backbone amides in solution

[Figure: peptide N-H and peptide N-D in D2O]

Iacob et al, Expert Rev. Proteomics 2015, 12(2), p1-11

HDX Useful to Probe Binding Interfaces

▪ Compare bound and unbound deuterated structures

– Reduction in deuteration indicates interface residues

▪ Provides low-resolution information – sequence segments

– Qualitative effect, divided into weak and strong categories

– Effects may not be due to contact, e.g. flexibility changes

– Not all interactions appear

▪ Idea: use HDX data to bias docking to find IL-23:adnectin epitope residues

[Figure: IL-23 (p40 and p19) structure; Adnectin interface residues show less deuteration (less exposed to solvent)]

[Figure: HDX MS Data (IL-23) along the p40 domain sequence (0 to 300) and the p19 domain sequence (0 to 150)]

Iacob et al, Expert Rev. Proteomics 2015, 12(2), p1-11
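Comparing bound and unbound HDX data reduces to a per-segment difference in deuterium uptake, sorted into weak and strong protection. In the sketch below the fractional-uptake representation and both cutoffs (0.1 and 0.3) are illustrative assumptions; the slides only say that the effect is qualitative and split into weak and strong categories.

```python
def protected_segments(unbound, bound, weak=0.1, strong=0.3):
    """Label peptide segments by their drop in deuterium uptake on binding.

    unbound, bound: dicts mapping (first_residue, last_residue) to fractional
    deuterium uptake. The weak/strong cutoffs are illustrative assumptions.
    """
    labels = {}
    for segment, free_uptake in unbound.items():
        if segment not in bound:
            continue  # peptide not observed in both states
        reduction = free_uptake - bound[segment]
        if reduction >= strong:
            labels[segment] = "strong"
        elif reduction >= weak:
            labels[segment] = "weak"
    return labels
```

The output is segment-level, which mirrors the resolution limit of the experiment: a labeled segment says that some residues within it are protected, not which ones.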

Characterization of Binding Interface

▪ HDX data alone is not enough to characterize epitope

– Resolution does not label single residues

– Energy-based methods assume accurate atomic structure

– Buried surface alone does not distinguish different types of interactions (e.g. hydrophobic, H-bonds)

– Surface Patch Fingerprints can be effective

▪ Patch contacts reflect key interactions

– Close agreement with atomic-level interactions in native structure

– More selective than contacts based on energy alone

– Surface patches provide a more robust characterization of docked conformations

[Figure: epitope residues from HDX MS Data (IL-23), Co-crystal Inspection and Co-crystal Patch FP along the p40 (50 to 300) and p19 (50 to 150) sequences]

Interleukin-23 : Adnectin Calculation

▪ IL-23 and Adnectin show conformational changes in the complex

– Docking the bound conformations works very well (RMSD < 0.5 Å, #1)

– Docking unbound conformations fails (too much loop movement)

▪ Docking procedure

1. 20 Adnectin models

2. 20 IL-23 models

3. HDX IL-23 constraints

4. FG-loop constraints

5. Dock Pairs – 400 runs

▪ Epitope Analysis

– Keep top 1,000 poses

– Calculate patch FPs and cluster

– Rank clusters by docking score, contact area, HDX overlap

– Output top rank

[Figure: 20 IL-23 Homology Models (from PDB:3D87 - unbound); IL-23 loop not well resolved in unbound crystal structure]

[Figure: 20 Adnectin Homology Models (from PDB:1FNA - unbound); Adnectin FG-loop labeled]
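The bookkeeping behind the ensemble docking protocol (20 x 20 model pairs, then the best 1,000 poses) can be sketched as follows. The docking itself is not shown. Poses are assumed to arrive as (score, pose_id) tuples with lower scores being better, which is an assumption about the scoring convention; the function names are illustrative.

```python
from heapq import nsmallest
from itertools import product


def docking_pairs(n_adnectin=20, n_il23=20):
    """All (adnectin model index, IL-23 model index) combinations."""
    return list(product(range(n_adnectin), range(n_il23)))


def keep_top_poses(poses, n_keep=1000):
    """poses: iterable of (score, pose_id); lower score assumed better."""
    return nsmallest(n_keep, poses, key=lambda pose: pose[0])
```

With the defaults, `docking_pairs()` yields the 400 runs listed in the procedure, and `keep_top_poses` trims the pooled poses from all runs before fingerprinting and clustering.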

Interleukin-23 : Adnectin Epitope Prediction

▪ Ensemble modeling preserves key features of complex

– Consensus fingerprints show good correspondence (4/8) to crystal interactions

– Homology models outperform docking of unbound structures when there is significant conformational flexibility

▪ HDX biased protein docking can elucidate epitope

[Figure: predicted epitope residues along the p40 (0 to 300) and p19 (0 to 150) sequences for 3D87:1FNA models, 3D87 models:Xtal, Xtal:1FNA models and Crystal inspection]

[Figure: receptor models, docked poses and predicted epitope, with variable loops labeled]

Conclusions

▪ Protein Patch Analyzer for detecting protein hot spots

– Fast method for predicting protein binding sites on a single body

– Identify hydrophobic patches implicated in PPIs

▪ Predict protein-protein complexes with docking

– Predicts protein interactions between two bodies

– Site restraints derived from automatic sequence annotation are effective for docking antibody-antigen complexes

▪ Analysis of docked poses can be used for epitope mapping

– Similar poses determined by clustering PPI fingerprints

– First-ranked epitope correct in 56% of antibody-antigen targets

▪ Epitope analysis can be used to interpret HDX data

– Case study: Adnectin complex with IL-23

– HDX data can be incorporated into docking calculations

– Consensus fingerprints show good agreement with crystal structure

▪ Epitope mapping available in MOE 2018
