Post on 17-Dec-2015
Protein Surface Analysis for Functional Analysis and Prediction
T. Andrew Binkowski and Andrzej Joachimiak
2009 NIGMS Workshop: Enabling Technologies for Structural Biology
March 4-6, 2009
Outline
How Can Surface Analysis Aid Your Structural Genomics Effort?
Protein Surfaces
Comparing Surfaces of Proteins
Surface Analysis in the Structural Genomics Pipeline
The Global Protein Surface Survey
Functional Inference in Proteins
Transfer function based on similarity to a protein with known biological activity
Sequence: 30-70%
Functional sites result from spatial interactions of key residues in diverse regions of primary sequence
Structure: reveals more distant relationships
1 fold ~ many functions; vice versa
Example: generalized secondary structural element
Different SSEs can bring residues into spatial proximity (Jaroszewski & Godzik, ISMB 2000)
Functional Inference in Proteins
Functional surfaces may be the most conserved structural features of proteins
Surfaces performing identical biochemical activity can be found within different protein scaffolds or in the absence of clear evolutionary relationships
Exploit the ability of proteins to preserve local spatial residue patterns
Presents another opportunity to gain insight into their biological function and mechanisms
Novel heme-monooxygenase: 12% sequence identity; vs. all; experimentally verified activity
Surfaces of Proteins
Surface: local grouping of solvent-accessible atoms
Pockets: empty concavities on a protein surface into which solvent can gain access
Identifying surfaces:
Methods: solvent accessibility, geometry, grids, spheres
Applications: CASTp, Surfnet, Pocket, Ligsite, Pass
Our approach: computational geometry (alpha shape)
CASTp, PDB, Swiss-Prot, Catalytic Site Atlas
Ligand binding surfaces: exclusion contact surface (solvent accessibility difference)
Edelsbrunner & Mücke, ACM Trans Graph, 1994; Edelsbrunner, Facello, Liang, Disc Appl Math, 1996; Liang, Edelsbrunner, Woodward, Protein Sci, 1998
Global Protein Surface Survey
http://gpss.mcsg.anl.gov
Comparing Surfaces of Proteins
SurfaceScreen: methodology for identifying similarly shaped proteins and aligning them
Optimizes two components:
Global shape: perceived similarity; size and scale, independent of chemistry
Local physicochemical texture: preserved atom/residue orientation; conservation of chemical complementarity
[Figure labels: Global Surface Shape Filtering; Surface Shape Alignment; Constrained Spatial Surface Refinement; Apply Scoring Functions]
Comparing Surfaces of Proteins: Global Shape Similarity
Surface Shape Signatures (SSS): represent the signature of a surface as a distribution sampled from a shape function (Osada et al., 2002)
Comparison of probability distributions: Kolmogorov-Smirnov, Earth Mover’s Distance
ATP binding sites: (a) protein kinase CK2 from Z. mays; (b) phosphopantetheine adenylyltransferase from E. coli; (c) maltose/maltodextrin transport protein from E. coli (d, cyan chain A, light blue chain B)
50 non-homologous sites (< 30% sequence identity)
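The signature-and-compare idea on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the SurfaceScreen implementation: it assumes the D2 shape function from Osada et al. (distance between two randomly chosen surface points), which the slide does not name, and the function names `d2_signature`, `ks_statistic`, and `emd_1d` are hypothetical.

```python
import math
import random


def d2_signature(points, n_pairs=2000, seed=0):
    """Sorted sample of distances between random pairs of surface points.

    Assumption: the D2 shape function is used; the slide only says
    'a shape function'.
    """
    rng = random.Random(seed)
    distances = []
    for _ in range(n_pairs):
        p, q = rng.sample(points, 2)  # two distinct surface points
        distances.append(math.dist(p, q))
    return sorted(distances)


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the
    empirical cumulative distributions of samples a and b."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    largest_gap = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Step both empirical CDFs past the value x before comparing them.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        largest_gap = max(largest_gap, abs(i / len(a) - j / len(b)))
    return largest_gap


def emd_1d(a, b):
    """Earth Mover's Distance between two equal-size 1-D samples:
    mean absolute difference of the sorted values."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal-size samples")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)
```

Given two lists of (x, y, z) surface-atom coordinates, `ks_statistic(d2_signature(A), d2_signature(B))` is 0 for identical signatures and approaches 1 as the shapes diverge. Neither score uses atom or residue types, in keeping with the "independent of chemistry" point above.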
Spatial Surface Alignment Refinement
Combinatorial comparison of residue sets in “neighborhood”
Maintain “like” correspondence of types
Maximum common residues
Enumerate and evaluate alignment orientations
Find optimal superposition using SVD of correlation matrix (Umeyama 1991)
Heme binding pockets of myoglobin from different organisms.
Evaluating Surface Alignments
Surface Volume Overlap:
Interpretation of SVOT is not straightforward
Need global and local
[Equations: SVOT, gSVOT (global, surfaces A and B), and lSVOT (local, surfaces a and b), each defined from surface volumes V]
RMSD Distance:
Estimate the probability of obtaining a specific RMSD for nres
Compute random surface alignments (108) and build lookup tables
RMSD variants: cRMSD (coordinate), oRMSD (orientation)
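The lookup-table idea for RMSD significance can be sketched as follows. This is a minimal illustration under stated assumptions: the background is a list of (number of aligned residues, RMSD) pairs from random alignments, the probability is the empirical fraction of background RMSDs at or below the observed one, and the names `build_lookup` and `rmsd_probability` are hypothetical. How the random alignments are generated is not described on the slide and is not modeled here.

```python
import bisect
from collections import defaultdict


def build_lookup(random_alignments):
    """Group background RMSDs by number of aligned residues and sort them.

    random_alignments: iterable of (n_res, rmsd) pairs obtained from
    random surface alignments (assumed to be precomputed elsewhere).
    """
    table = defaultdict(list)
    for n_res, rmsd in random_alignments:
        table[n_res].append(rmsd)
    return {n_res: sorted(values) for n_res, values in table.items()}


def rmsd_probability(table, n_res, rmsd):
    """Empirical probability that a random alignment of n_res residues
    has an RMSD less than or equal to the observed value."""
    background = table[n_res]
    # bisect_right counts the background values <= rmsd in O(log n).
    return bisect.bisect_right(background, rmsd) / len(background)
```

Keeping a separate sorted list per residue count reflects the slide's point that the probability depends on nres: a low RMSD over many residues is harder to obtain by chance than the same RMSD over a few.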
Benchmarking Surface Alignments
Heme Binding Site Retrieval
[Figure labels: seq. & fold; surface analysis]
Heme (iron-protoporphyrin IX)
Multi-functional (e.g., oxygen binding/transport, electron transfer and redox)
Binding on 20 different folds
Between proteins <2% seq. id.
Query myoglobin (gray) against PDB structures to identify hemoproteins
Retrieval rate (area under ROC curve): Sequence: 68.7%; Structure (SSM): 64.4%; Surface: 95.8%
Detection of convergent heme binding site on IsdG from S. aureus
Missing characteristic sequence motif
12% seq id; different scaffold
Experimentally verified monooxygenase activity
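The retrieval rate quoted above is an area under the ROC curve. A minimal sketch of that metric, assuming each database entry has a similarity score to the query and a true/false label for whether it is a real hemoprotein (the function name `roc_auc` and the toy scores in the usage note are illustrative, not the authors' data):

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via its rank interpretation: the
    probability that a randomly chosen positive scores higher than a
    randomly chosen negative, with ties counted as one half."""
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

For example, `roc_auc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0])` gives 0.75, because three of the four positive/negative pairs are ranked correctly. A value of 0.5 corresponds to random ranking and 1.0 to perfect retrieval.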
ATP: Retrieval of a Flexible Ligand
Adenosine 5’-triphosphate: multifunctional nucleotide (e.g., cell signaling, energy transfer)
58 unique EC classifications (#.#.#.#)
Conformational flexibility
Retrieval rates for 4 conformations (79.1%-85.4%); method is tolerant to flexible ligands
Prediction and Validation of GDP Binding Surface
Structure of F420-0:gamma-glutamyl ligase from A. fulgidus
Large binding surface was searched to support functional predictions, and a GDP binding surface was identified
Posed GDP based on superposition of surfaces (red)
Co-crystallization experiments validate the prediction
Surface Analysis in the Structural Genomics Pipeline
Exploiting Protein Surfaces in Structural Genomics
Developing surface-based tools to address specific needs of the structural genomics pipeline
Ligands for co-crystallization
Aid in the assignment of electron density
Functional annotation tools
Drive further studies (e.g., ligand binding, discovery)
[Figure labels: Co-crystallization; Mutation; Functional Analysis; Ligand Identification; Future Studies; Discovery]
Surface Identification
Partially Solved or Low Quality Structure
Search GPSS for Binding Sites
Co-crystallization Experiments
Crystallization/Structure Improvement
Introduction of GDP to F420-0:gamma-glutamyl ligase from A. fulgidus improves resolution from 2.8 to 1.9 Angstroms and orders loop regions.
Assisted Electron Density Assignment
Unidentified ligand density
Construct surface surrounding density and search against ligand surface library
Does not require entire structure to be built
Assisted Electron Density Assignment
Applicable to ligands of various molecular weights and sizes
Fructose (pdb id=1zx5)
NADP (pdb id=2ag8)
Suggest a list in cases of ambiguity
Landscape Analysis: ATP
Classification based on surface similarity shows that functional families have preferred (not necessarily unique) surfaces and conformations
Automated Protein Kinase Classification
All-against-all surface comparison of all protein kinases in the PDB
Color labeled by expert annotation (KinBase)
Surface clustering identifies:
Dual substrate specificity of CK2 proteins
Active/inactive states
Similarity detected between MAP p38 kinase and Abelson leukemia virus tyrosine kinase (Abl) with bound cancer drug STI-571
MAP kinase has unique DFG “out” conformation not previously seen in ser/thr kinases
Function Sleuth
Conserved protein of unknown function (VCA0319) from V. cholerae apc29617
Unique arrangement of common structural motifs
Problematic for secondary structure and fold analysis
Surface analysis identifies DNA binding surface and 5 putative metal binding sites
All 5 metal binding sites showed strong preference for Mg
Putative metalloregulated repressor with Mg-regulated mechanism of DNA binding
1bdb NAD
1hoh MGD
2qwr ANP
1jbw ACQ
Function Sleuth
Target APC7761 (3fd3)
Agrobacterium tumefaciens str. C58
Function Sleuth
Target APC61725 (3fz5)
Rhodobacter sphaeroides 2.4.1
Top 17 most similar surfaces bind B12
1i9c
Global Protein Surface Survey
SurfaceScreen for PSI ‘function sleuth’ targets
Automated analysis of largest 5 surfaces (per chain and unit)
Technical Note: DOE INCITE on Blue Gene/P at ANL
http://gpss.mcsg.anl.gov
Conclusion
Comparing surfaces of proteins can be a useful tool with many applications:
Functional characterization
Assisted electron density assignment
Automated classification
Global Protein Surface Survey http://gpss.mcsg.anl.gov
Acknowledgements
ANL/MCSG: H. An, G. Babnigg, L. Bigelow, A. Binkowski, C-s. Chang, S. Clancy, G. Cobb, M. Cuff, M. Donnelly, C. Giometti, W. Eschenfeldt, Y. Fan, C. Hatzos, R. Hendricks, G. Joachimiak, H. Li, L. Keigher, Y-c. Kim, N. Maltseva, E. Marland, S. Moy, R. Mulligan, B. Nocek, J. Osipiuk, M. Schiffer, A. Sather, G. Shackelford, L. Stols, K. Tan, C. Tesar, R-y. Wu, L. Volkart, R-g. Zhang, M. Zhou
ANL/SBC: N. Duke, S. Ginell, F. Rotella
G. Montelione, Rutgers Univ., NESGC; T. Terwilliger, Los Alamos, ITCSG; Z. Derewenda, Univ. of Virginia, ITCSG; Z. Dauter, NCI; J. Liang, Univ. of Illinois; D. Sherman, U. Michigan
Washington Univ.: D. Fremont, T. Brett, C. Nelson
Univ. of Virginia: W. Minor, M. Chruszcz, M. Cyborowski, M. Grabowski, P. Lasota, P. Miles, M. Zimmerman, H. Zheng
Univ. of Texas SWMC: Z. Otwinowski, D. Borek, A. Kudlicki, A. Q. Mei, M. Rowicka
Northwestern Univ.: W. Anderson, O. Kiryukhina, D. Miller, G. Minasov, L. Shuvalova, X. Yang, Y. Tang
Univ. College London @ EBI: J. Thornton, C. Orengo, M. Bashton, R. Laskowski, D. Lee, R. Marsden, D. McKenzie, A. Todd, J. Watson
Univ. of Toronto: A. Edwards, C. Arrowsmith, A. Savchenko, E. Evdokimova, J. Guthrie, A. Khachatryan, M. Kudrytska, T. Skarina, X. (Linda) Xu
Univ. of Chicago: O. Schneewind, D. Missiakas, P. Gornicki, S. Koide, ITCSG, W-j. Tang, B. Roux, J. L. Robertson, M.R. Rosner, T. Kossiakoff, ITCSG, V. Tereshko
Funding: NIH and DOE
Thank you