discovery of ligand-protein interactome
Genome-Wide Discovery of Protein-Ligand Interactions by a Combined Computational & Energetic-Based Approach
Daisuke Kihara
Department of Biological Sciences, Department of Computer Science
Purdue University, IN, USA
http://kiharalab.org
Comprehensive Detection of Protein-Ligand Interactions in a Cell
From single molecules to interactions and networks
Protein-protein interaction networks can be identified by several experimental methods, e.g. yeast two-hybrid and tagged proteins combined with mass spectrometry
Protein-ligand interaction networks (pathways), e.g. KEGG: compilations of individual interactions reported in the literature
Combined Computational & Experimental Approach
(Zeng et al., J Proteome Res, 2016)
Patch-Surfer 2.0: Local Patch-Based Pocket Comparison Method
(Sael & Kihara, Proteins, 2012)
(Zhu, Xiong, & Kihara, Bioinformatics, 2015)
3D Zernike invariants
An extension of spherical harmonics-based descriptors
A 3D object can be represented by a series of orthogonal functions, and thus in practice by a series of coefficients used as a feature vector
• Compact
• Rotation invariant
A surface representation of 1ew0A (A) is reconstructed from its 3D Zernike invariants of the order 5, 10, 15, 20, and 25 (B-F). (Sael & Kihara, 2009)
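$Z_{nl}^{m}(r, \vartheta, \varphi) = R_{nl}(r)\, Y_{l}^{m}(\vartheta, \varphi)$
$Y_{l}^{m}(\vartheta, \varphi)$: spherical harmonics; $R_{nl}(r)$: radial functions
$Z_{nl}^{m}(r, \vartheta, \varphi)$: polynomials in Cartesian coordinates
Zernike moments: $\Omega_{nl}^{m} = \frac{3}{4\pi} \int_{|\mathbf{x}| \le 1} f(\mathbf{x})\, \overline{Z_{nl}^{m}(\mathbf{x})}\, d\mathbf{x}$
Zernike invariants: $F_{nl} = \sqrt{\sum_{m=-l}^{l} \left(\Omega_{nl}^{m}\right)^{2}}$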
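As a minimal sketch of why the Zernike invariants are compact and rotation invariant, the Python below collapses a set of moments into one number per (n, l) pair. It reads the square in the invariant as a squared modulus, since the moments are complex. The moment values are made up for illustration, the function names are hypothetical, and only a rotation about the z-axis is checked (it changes each moment by a unit phase in m); this is not the Patch-Surfer implementation.

```python
import cmath
import math


def zernike_index_pairs(max_order):
    """(n, l) pairs with 0 <= l <= n <= max_order and n - l even."""
    return [(n, l)
            for n in range(max_order + 1)
            for l in range(n + 1)
            if (n - l) % 2 == 0]


def zernike_invariants(moments, max_order):
    """Collapse moments[(n, l, m)] (complex) into one invariant per (n, l)."""
    invariants = []
    for n, l in zernike_index_pairs(max_order):
        norm_sq = sum(abs(moments[(n, l, m)]) ** 2 for m in range(-l, l + 1))
        invariants.append(math.sqrt(norm_sq))
    return invariants


def rotate_about_z(moments, alpha):
    """A rotation about z only multiplies each moment by a unit phase in m."""
    return {(n, l, m): value * cmath.exp(-1j * m * alpha)
            for (n, l, m), value in moments.items()}


# Made-up moments for illustration only (not computed from a real structure).
MAX_ORDER = 4
toy_moments = {(n, l, m): complex(n + 1, m) / (l + 1)
               for n, l in zernike_index_pairs(MAX_ORDER)
               for m in range(-l, l + 1)}

before = zernike_invariants(toy_moments, MAX_ORDER)
after = zernike_invariants(rotate_about_z(toy_moments, 0.7), MAX_ORDER)
print(len(zernike_index_pairs(20)), "invariants at order 20")
print(all(math.isclose(a, b) for a, b in zip(before, after)))
```

With the index rule used here, order 20 gives 121 (n, l) pairs, so a surface property is summarized by a 121-number feature vector regardless of how the object is oriented.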
Pocket Features to Compare
3DZD for:
• Shape
• Electrostatic Potential
• Hydrophobicity
• Visibility
Approximate Patch Position: Histogram of Geodesic Distance to other Seed Points
The Number of Patches for Several Ligand Binding Pockets
Non-Redundant Database of Ligand Binding Pockets
Selected from ligand-bound protein structures in the PDB
2444 different ligand types, 6547 pockets
117 ligands have more than 5 pockets
Predicting Binding Ligand from Screening Results
Query pocket → Pocket database → Matched pockets → Ligand of the pocket

Matched pocket   Ligand
1lj8_A           NAD
1ebw_AB          BEI
3b4y_A           F42_FLC
3oa2_ACD         NAD
1bxk_A           NAD
3c1o_A           NAP
1nuq_A           DND
2jhf_A           NAD
1nyt_B           NAP
……               ……

Ligand   Pocket Score
NAP      22.87
NAD      18.75
NDP      16.55
DND      14.81
ATP      12.75
…..      …..
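To illustrate how a ranked list of matched pockets can be turned into a ranked list of ligands, here is a minimal Python sketch. It uses a simple rank-weighted vote with weight log(n / rank), which is an assumption made for illustration and not the published Pocket Score formula. It is run only on the nine matches visible on the slide, so it does not reproduce the Pocket Score values shown there.

```python
import math
from collections import defaultdict

# Ranked matches copied from the slide, best match first: (pocket, ligand).
MATCHED_POCKETS = [
    ("1lj8_A", "NAD"), ("1ebw_AB", "BEI"), ("3b4y_A", "F42_FLC"),
    ("3oa2_ACD", "NAD"), ("1bxk_A", "NAD"), ("3c1o_A", "NAP"),
    ("1nuq_A", "DND"), ("2jhf_A", "NAD"), ("1nyt_B", "NAP"),
]


def rank_weighted_scores(matches):
    """Sum log(n / rank) per ligand; an illustrative weighting, not Pocket Score."""
    n = len(matches)
    scores = defaultdict(float)
    for rank, (_pocket, ligand) in enumerate(matches, start=1):
        # Higher-ranked matches contribute more; the last match contributes 0.
        scores[ligand] += math.log(n / rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


for ligand, score in rank_weighted_scores(MATCHED_POCKETS):
    print(f"{ligand:8s} {score:.2f}")
```

The design point is that a ligand is rewarded both for appearing often among the matched pockets and for appearing near the top of the list.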
Binding Ligand Prediction Results
                               Top 5   Top 10  Top 15  Top 20  Top 25
117 Ligands                    0.254   0.438   0.547   0.611   0.659
50 Groups*                     0.459   0.628   0.726   0.791   0.835
(without flexible ligands)**
107 Ligands                    0.272   0.472   0.587   0.653   0.699
50 Groups                      0.487   0.663   0.754   0.810   0.845
* Ligands are grouped by SIMCOMP ligand structure similarity. At this level, ligands that differ by up to a few atoms are clustered. E.g., glucose and mannose are grouped together, but not with sucrose; NAD and NADP are clustered together, but not with ATP.
** After removing the 10 ligands with the largest flexibility (the average number of rotatable bonds per atom).
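A Top-k figure like those in the table can be read as the fraction of query pockets whose true ligand appears among the k best-scoring predicted ligands. The sketch below is one way to compute it, with an optional ligand-to-group mapping to mimic evaluation at the group level. The function name, the toy predictions, and the group label are hypothetical; the evaluation code behind the slide is not shown in the source.

```python
def top_k_success_rate(ranked_predictions, true_ligands, k, group_of=None):
    """Fraction of queries whose true ligand (or its group) is in the top k."""
    group_of = group_of or {}

    def canonical(ligand):
        # Ligands without a group entry stand for themselves.
        return group_of.get(ligand, ligand)

    hits = 0
    for predictions, truth in zip(ranked_predictions, true_ligands):
        top_k = {canonical(ligand) for ligand in predictions[:k]}
        if canonical(truth) in top_k:
            hits += 1
    return hits / len(true_ligands)


# Toy data: three query pockets with their ranked ligand predictions.
predictions = [["NAP", "NAD", "ATP"], ["ATP", "NAD"], ["FAD", "ATP"]]
truths = ["NAD", "ATP", "NAD"]
# NAD and NADP (NAP) placed in one group, as in the footnote's example.
groups = {"NAD": "NAD-like", "NAP": "NAD-like"}

print(top_k_success_rate(predictions, truths, k=1))
print(top_k_success_rate(predictions, truths, k=1, group_of=groups))
print(top_k_success_rate(predictions, truths, k=2))
```

Grouping can only raise the success rate, which is consistent with the "50 Groups" rows being higher than the per-ligand rows.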
Ligand Types with High and Low Accuracies
darunavir
NADPH
Iron-sulfur cluster
Tris-aminomethane
polyethylene glycol
N-acetyl-D-glucosamine
3-Pyridinium-1-Ylpropane-1-Sulfonate
Patch-Surfer Retrieval Results for Flexible Ligands: FAD and NAD
Flavin adenine dinucleotide (FAD)
Nicotinamide adenine dinucleotide (NAD)
1cqx 1jr8 1e8g 1k87 1mi3 1s7g
Patch-based: 3rd Global pocket: 31st
Patch-based: 1st Global pocket: 18th
Patch-based: 2nd Global pocket: 16th
PL-PatchSurfer2: Local Surface-Based Virtual Screening
(Shin, Christoffer, Wang, & Kihara, J. Chem. Inf. Model., 2016)
Benchmark
25 targets from the DUD set (Huang et al., 2006):
• Nuclear receptors: 8
• Kinases: 7
• Serine proteases: 2
• Other proteins: 8
About 40 to 360 actives for each target. The active:decoy ratio is kept at 1:29.
If the library is larger than 3000 compounds, 60 actives and 1740 decoys are selected.
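The library construction rule can be sketched as follows. The source does not say how the 60 actives and 1740 decoys are chosen, so random sampling with a fixed seed is an assumption here, as is reading "library size" as actives plus decoys. The function name and compound labels are hypothetical.

```python
import random


def build_screening_library(actives, decoys, max_size=3000,
                            n_actives=60, decoys_per_active=29, seed=0):
    """Cap large libraries at 60 actives and keep a 1:29 active:decoy ratio."""
    rng = random.Random(seed)  # fixed seed so the selection is repeatable
    if len(actives) + len(decoys) > max_size:
        actives = rng.sample(actives, n_actives)
    # 29 decoys per retained active; raises ValueError if too few decoys exist.
    decoys = rng.sample(decoys, decoys_per_active * len(actives))
    return actives, decoys


# Toy target with 120 actives and 3480 decoys (3600 compounds in total).
big_actives = [f"active_{i}" for i in range(120)]
big_decoys = [f"decoy_{i}" for i in range(120 * 29)]
kept_actives, kept_decoys = build_screening_library(big_actives, big_decoys)
print(len(kept_actives), len(kept_decoys))
```

Note that 60 actives and 1740 decoys preserve the 1:29 ratio (1740 = 29 × 60), so capped and uncapped targets remain comparable.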
Program EF1% EF5% EF10% BEDROC
PL-PatchSurfer 15.47 5.25 3.11 0.310
AutoDock Vina 7.92 5.05 3.37 0.276
AutoDock4 7.36 3.83 2.74 0.226
DOCK6 11.47 4.02 2.47 0.239
ROCS 11.76 5.54 3.52 0.317
Screening Results on the DUD set
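For readers unfamiliar with the EF columns, the enrichment factor at x% is commonly defined as the hit rate among the top x% of the ranked library divided by the hit rate of the whole library. The sketch below implements that common definition on a made-up ranking; it is not taken from the benchmark code, and the rounding of the top-x% cutoff is an assumption.

```python
def enrichment_factor(ranked_is_active, fraction):
    """EF at the given fraction (0.01 for EF1%) of a best-first ranked list."""
    n_total = len(ranked_is_active)
    n_top = max(1, round(n_total * fraction))  # assumed cutoff rounding
    hit_rate_top = sum(ranked_is_active[:n_top]) / n_top
    hit_rate_all = sum(ranked_is_active) / n_total
    return hit_rate_top / hit_rate_all


# Made-up ranking of 60 actives and 1740 decoys (1:29), best score first:
# 9 actives land in the top 18 compounds, the other 51 at the very bottom.
ranking = [True, False] * 9 + [False] * (1740 - 9) + [True] * 51

print(enrichment_factor(ranking, 0.01))
print(enrichment_factor(ranking, 0.05))
```

With a 1:29 active:decoy ratio, actives make up 1/30 of the library, so the largest possible EF1% is 30 (every compound in the top 1% is active) and a random ranking gives about 1.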
Programs EF1% EF5% EF10% BEDROC
PL-PatchSurfer 14.69/12.48 5.12/4.55 3.03/2.91 0.30/0.28
DOCK6 12.49/7.69 4.60/3.58 2.86/2.48 0.27/0.21
Vina 7.35/3.86 4.95/2.55 3.39/2.00 0.27/0.16
Difference of Enrichment Factor for 19 Holo/Apo Target Structures
Screening Results Using Structure Models
Method          Structure  EF1%   EF5%  EF10%  BEDROC
PL-PatchSurfer  X-ray      12.86  5.28  3.29   0.31
PL-PatchSurfer  TBM        11.76  5.28  3.35   0.31
AutoDock Vina   X-ray      8.63   6.14  4.09   0.33
AutoDock Vina   TBM        1.68   1.30  1.30   0.09
DOCK6           X-ray      11.70  4.40  2.98   0.26
DOCK6           TBM        2.58   1.88  1.58   0.12
TBM: template-based models
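One way to read the table is to ask how much of the X-ray EF1% each method retains when template-based models are used instead. The short sketch below does that arithmetic on the EF1% values from the table; the "retained fraction" measure is introduced here for illustration and does not appear in the source.

```python
# EF1% values from the table: method -> (X-ray, TBM).
EF1_PERCENT = {
    "PL-PatchSurfer": (12.86, 11.76),
    "AutoDock Vina": (8.63, 1.68),
    "DOCK6": (11.70, 2.58),
}


def retained_fraction(xray_ef, tbm_ef):
    """Share of the X-ray enrichment that survives on template-based models."""
    return tbm_ef / xray_ef


for method, (xray_ef, tbm_ef) in EF1_PERCENT.items():
    print(f"{method:15s} {retained_fraction(xray_ef, tbm_ef):.0%}")
```

By this measure PL-PatchSurfer keeps roughly 91% of its X-ray EF1%, while AutoDock Vina and DOCK6 keep roughly 19% and 22%.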
Combined Computational & Experimental Approach
(Zeng et al., J Proteome Res, 2016)
Identifying NAD Binding Proteins with Pulse Proteolysis in the E. coli Proteome
• Stabilization from ligand binding leads to a change in protein abundance after pulse proteolysis.
• Peptides digested by pulse proteolysis are filtered out by FASP (filter-aided sample preparation).
• The change in abundance was measured by tandem mass tag (TMT) labeling coupled with quantitative mass spectrometry.
Detected NAD Binding Proteins
Three urea concentrations (3.5 M, 4.0 M, and 4.5 M) were used. A protein was considered NAD binding if its stability changed by 1.25-fold or more with versus without NAD in two or more replicates.
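The selection rule can be written as a small predicate. In this sketch the fold change is taken as the abundance ratio with NAD over without NAD, and a change in either direction (at least 1.25 or at most 1/1.25) counts; both readings are assumptions, as is applying the rule to one list of replicates without modelling how the three urea concentrations are combined. The function name and the replicate values are hypothetical.

```python
def is_nad_binder(fold_changes, threshold=1.25, min_replicates=2):
    """True if enough replicates show a fold change of `threshold` or more.

    fold_changes: per-replicate abundance ratios (with NAD / without NAD).
    A change in either direction is counted (an assumption of this sketch).
    """
    changed = [ratio for ratio in fold_changes
               if ratio >= threshold or ratio <= 1 / threshold]
    return len(changed) >= min_replicates


# Made-up replicate ratios for three hypothetical proteins.
print(is_nad_binder([1.30, 1.40, 1.00]))  # two replicates pass the cutoff
print(is_nad_binder([1.30, 1.10, 1.00]))  # only one replicate passes
print(is_nad_binder([0.70, 0.75, 1.00]))  # two replicates change downward
```

Requiring agreement in at least two replicates guards against calling a binder from a single noisy measurement.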
(Zeng et al., J Proteome Res, 2016)
Predicted NAD binding poses of the eight predicted novel NAD binding proteins
NAD is colored in cyan, and the crystal structure of the cognate ligand is shown in magenta.
Summary
• Patch-Surfer2.0 compares a query pocket against a database of known ligand binding pockets and predicts binding ligands for the pocket.
• PL-PatchSurfer2 compares a pocket directly against ligand molecules.
• Tolerant to small differences in the conformations of molecules.
• Combined with pulse proteolysis to identify novel NAD binding proteins in the E. coli proteome.
Acknowledgements
W. Andy Tao (Purdue)
Lingfei Zeng
Woong-Hee Shin
Chiwook Park (Purdue)
Nathan Gardner
Lyman Monroe
@kiharalab