![Page 1: AMPLE – Using de novo or ab initio protein structure modelling techniques to create and enhance search models for use in Molecular Replacement Jaclyn Bibby,](https://reader037.vdocuments.net/reader037/viewer/2022110304/5519c922550346443e8b479e/html5/thumbnails/1.jpg)
AMPLE – Using de novo or ab initio protein structure modelling techniques to create
and enhance search models for use in Molecular Replacement
Jaclyn Bibby, Jens Thomas, Olga Mayans and Daniel Rigden
Institute of Integrative Biology
Ronan Keegan and Martyn Winn
Collaborative Computational Project 4 (CCP4)
![Page 2: AMPLE – Using de novo or ab initio protein structure modelling techniques to create and enhance search models for use in Molecular Replacement Jaclyn Bibby,](https://reader037.vdocuments.net/reader037/viewer/2022110304/5519c922550346443e8b479e/html5/thumbnails/2.jpg)
AMPLE: ab initio modelling of proteins for molecular replacement
• Joint development by CCP4 and the University of Liverpool
• AMPLE is a comprehensive project to assess the suitability of using cheaply obtained ab initio models in molecular replacement
• An additional goal of the project is to make AMPLE into an automated software tool that can be made generally available to potential users through CCP4
![Page 3: AMPLE – Using de novo or ab initio protein structure modelling techniques to create and enhance search models for use in Molecular Replacement Jaclyn Bibby,](https://reader037.vdocuments.net/reader037/viewer/2022110304/5519c922550346443e8b479e/html5/thumbnails/3.jpg)
Ab initio structure prediction
• Ab initio (or de novo) structure prediction is the prediction of a target structure fold based purely on its sequence information
• Methods have greatly improved in recent years with the aid of the CASP experiments (Critical Assessment of Protein structure prediction)
• Some examples are:
– Rosetta
– I-TASSER
– QUARK
![Page 4: AMPLE – Using de novo or ab initio protein structure modelling techniques to create and enhance search models for use in Molecular Replacement Jaclyn Bibby,](https://reader037.vdocuments.net/reader037/viewer/2022110304/5519c922550346443e8b479e/html5/thumbnails/4.jpg)
Ab initio structure prediction
1. Thousands of “decoys” are assembled from fragments of PDB structures
2. Decoys are clustered, and centroid representatives of the largest cluster are considered candidate fold predictions
3. Side chains are added to selected decoys
4. Refinement under a more realistic physics-based force field
• Initial fragment assembly stage requires relatively modest computing power
• Refinement stage can require supercomputing resources
![Page 6: AMPLE – Using de novo or ab initio protein structure modelling techniques to create and enhance search models for use in Molecular Replacement Jaclyn Bibby,](https://reader037.vdocuments.net/reader037/viewer/2022110304/5519c922550346443e8b479e/html5/thumbnails/6.jpg)
Ab initio modelling and Molecular Replacement
• Combining this method with molecular replacement can be a powerful technique for solving the phase problem in cases where there are no obvious homologous structures available
• Two approaches have been taken:
1. All-atom modelling to produce single search models of maximum completeness and accuracy (Qian et al., 2007)
– Solved 1/3 of a test set of 30 targets (Das et al., 2009)
– Computationally expensive
2. Cheaply obtained decoys from the initial fragment assembly step, once suitably prepared, have been shown to make successful MR search models (Rigden et al., 2008; Caliandro et al., 2009)
![Page 7: AMPLE – Using de novo or ab initio protein structure modelling techniques to create and enhance search models for use in Molecular Replacement Jaclyn Bibby,](https://reader037.vdocuments.net/reader037/viewer/2022110304/5519c922550346443e8b479e/html5/thumbnails/7.jpg)
Synergies

| ab initio modelling | Molecular Replacement |
| --- | --- |
| Produces clusters of similar model structures | Works effectively with superposed ensembles approximating the target |
| Within and between clusters, similarity indicates accuracy, so inaccurate regions can be trimmed to leave a more reliable core | May only require a partial model |
| Fast modelling is polyAla only | Side chains are often (partially) removed for MR |
![Page 8: AMPLE – Using de novo or ab initio protein structure modelling techniques to create and enhance search models for use in Molecular Replacement Jaclyn Bibby,](https://reader037.vdocuments.net/reader037/viewer/2022110304/5519c922550346443e8b479e/html5/thumbnails/8.jpg)
The AMPLE Pipeline
• Uses Rosetta to perform ab initio modelling and the generation of decoy models
• Can also accept models generated externally
• Currently designed for < 120 residues and resolution better than 2.2 Å (but may work outside these restrictions e.g. transmembrane, coiled-coil proteins)
![Page 9: AMPLE – Using de novo or ab initio protein structure modelling techniques to create and enhance search models for use in Molecular Replacement Jaclyn Bibby,](https://reader037.vdocuments.net/reader037/viewer/2022110304/5519c922550346443e8b479e/html5/thumbnails/9.jpg)
Decoy Generation
• Decoys are generated first with Rosetta
• QUARK has also been used during development
• The typical number of decoys required is 1000, but this can be varied
• In easier cases as few as 50 decoys can be sufficient
![Page 10: AMPLE – Using de novo or ab initio protein structure modelling techniques to create and enhance search models for use in Molecular Replacement Jaclyn Bibby,](https://reader037.vdocuments.net/reader037/viewer/2022110304/5519c922550346443e8b479e/html5/thumbnails/10.jpg)
Decoy Clustering
• Clustering using SPICKER to identify the most likely fold for the target
• A large top cluster is usually indicative of a correct prediction
• A subset of decoys (max. 200) closest to the centroid of each of the 3 largest clusters is selected for further processing
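The selection rule on this slide can be illustrated with a minimal sketch. It does not call SPICKER; it assumes the clustering has already been done and that each cluster is available as a list of (decoy name, distance to centroid) pairs. The function name `select_decoys` and the input layout are hypothetical and only serve the illustration.

```python
from typing import Dict, List, Tuple


def select_decoys(
    clusters: Dict[str, List[Tuple[str, float]]],
    n_clusters: int = 3,
    max_per_cluster: int = 200,
) -> Dict[str, List[str]]:
    """Pick the decoys nearest the centroid from the largest clusters.

    `clusters` maps a cluster id to (decoy_name, distance_to_centroid) pairs.
    This layout is an assumption for illustration, not SPICKER's output format.
    """
    # Rank clusters by number of members, largest first.
    largest = sorted(clusters, key=lambda c: len(clusters[c]), reverse=True)[:n_clusters]
    selected = {}
    for cluster_id in largest:
        # Order members by distance to the centroid, closest first.
        members = sorted(clusters[cluster_id], key=lambda m: m[1])
        selected[cluster_id] = [name for name, _ in members[:max_per_cluster]]
    return selected
```

With the defaults, this keeps at most 200 decoys from each of the 3 largest clusters, matching the numbers quoted on the slide.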
![Page 11: AMPLE – Using de novo or ab initio protein structure modelling techniques to create and enhance search models for use in Molecular Replacement Jaclyn Bibby,](https://reader037.vdocuments.net/reader037/viewer/2022110304/5519c922550346443e8b479e/html5/thumbnails/11.jpg)
Decoy Clustering
• Each cluster is then structurally aligned using the maximum likelihood algorithm implemented in Theseus (Theobald & Wuttke, 2006)
– This helps to identify structurally conserved regions
– Gives a variance score which can later be used to guide truncation
![Page 12: AMPLE – Using de novo or ab initio protein structure modelling techniques to create and enhance search models for use in Molecular Replacement Jaclyn Bibby,](https://reader037.vdocuments.net/reader037/viewer/2022110304/5519c922550346443e8b479e/html5/thumbnails/12.jpg)
Ensemble Truncation
• We’ve found that success or failure in the molecular replacement step is highly sensitive to the accuracy of the search model
• Sampling many degrees of truncation with different levels of side chain inclusion is essential
• High variance regions are cut away in steps to give a set of ensemble models
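As a rough illustration of stepwise truncation, the sketch below takes one variance value per residue (such as the per-residue variance from the structural alignment) and returns the residues kept at each level, always discarding the most variable residues first. The equal-percentage step scheme, the function name `truncation_levels` and the default of four levels are assumptions made for the example; the slides do not specify the exact intervals AMPLE uses.

```python
from typing import Dict, List


def truncation_levels(variances: List[float], n_levels: int = 4) -> Dict[int, List[int]]:
    """Return {percent_kept: residue indices kept} for a series of truncations.

    The equal-percentage steps are a hypothetical scheme for illustration.
    """
    n = len(variances)
    # Residue indices ordered from lowest variance (most conserved) to highest.
    order = sorted(range(n), key=lambda i: variances[i])
    levels = {}
    for step in range(n_levels):
        # Keep 100% first, then progressively less in equal-sized steps.
        percent = 100 - step * (100 // n_levels)
        n_keep = max(1, round(n * percent / 100))
        # Report kept residues in sequence order.
        levels[percent] = sorted(order[:n_keep])
    return levels
```

Each returned residue set would correspond to one truncated ensemble; sampling several such levels reflects the point above that many degrees of truncation need to be tried.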
![Page 13: AMPLE – Using de novo or ab initio protein structure modelling techniques to create and enhance search models for use in Molecular Replacement Jaclyn Bibby,](https://reader037.vdocuments.net/reader037/viewer/2022110304/5519c922550346443e8b479e/html5/thumbnails/13.jpg)
Further processing
• These truncated clusters are further processed and side chains are added to give a large set of search models for molecular replacement
![Page 14: AMPLE – Using de novo or ab initio protein structure modelling techniques to create and enhance search models for use in Molecular Replacement Jaclyn Bibby,](https://reader037.vdocuments.net/reader037/viewer/2022110304/5519c922550346443e8b479e/html5/thumbnails/14.jpg)
Molecular replacement
• Molecular replacement is performed using MrBUMP from the CCP4 suite, which automates the procedure
– Search models are processed by both Phaser and Molrep
– After molecular replacement, positioned search models are refined using Refmac5 to get an initial indication of success or failure
– C-alpha tracing with SHELXE, model building with Buccaneer and ARP/wARP
![Page 15: AMPLE – Using de novo or ab initio protein structure modelling techniques to create and enhance search models for use in Molecular Replacement Jaclyn Bibby,](https://reader037.vdocuments.net/reader037/viewer/2022110304/5519c922550346443e8b479e/html5/thumbnails/15.jpg)
Testing
• Test set of 295 small proteins (40-120 residues) from the PDB
• Structure factor data also available
• Resolution of 2.2 Å or better
• Single molecule in the asymmetric unit
• Mixture of all-α, all-β and mixed α-β secondary structure
• 1000 decoys generated for each case using Rosetta
• Information from any homologues was excluded from the fragment generation step
![Page 16: AMPLE – Using de novo or ab initio protein structure modelling techniques to create and enhance search models for use in Molecular Replacement Jaclyn Bibby,](https://reader037.vdocuments.net/reader037/viewer/2022110304/5519c922550346443e8b479e/html5/thumbnails/16.jpg)
Assessing Solutions
• Initially we used Reforigin to compare solutions with the deposited structures
• More stringent method: attempt to rebuild the structures
– SHELXE: partial CC of >25% & average fragment length of 10 or more
– Further confirmation provided through building with ARP/wARP and Buccaneer.
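The SHELXE criterion above amounts to a two-condition check, sketched here. The function name `shelxe_success` is hypothetical, and the sketch assumes the CC (in percent) and the average traced fragment length have already been read from the SHELXE output; it does not parse any log file.

```python
def shelxe_success(cc_percent: float, avg_fragment_length: float) -> bool:
    """Apply the stated thresholds: partial CC above 25% and an
    average traced fragment length of 10 residues or more."""
    return cc_percent > 25.0 and avg_fragment_length >= 10.0
```

For example, `shelxe_success(30.2, 14)` passes, while a CC of exactly 25% or an average fragment length of 9.5 would not.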
![Page 17: AMPLE – Using de novo or ab initio protein structure modelling techniques to create and enhance search models for use in Molecular Replacement Jaclyn Bibby,](https://reader037.vdocuments.net/reader037/viewer/2022110304/5519c922550346443e8b479e/html5/thumbnails/17.jpg)
• Using these guidelines, 126 successes out of 295 (~43%) were achieved
• These are solutions that could be successfully traced in SHELXE
• Other well positioned solutions existed but could not be traced. These may be possible to solve through manual model building
![Page 18: AMPLE – Using de novo or ab initio protein structure modelling techniques to create and enhance search models for use in Molecular Replacement Jaclyn Bibby,](https://reader037.vdocuments.net/reader037/viewer/2022110304/5519c922550346443e8b479e/html5/thumbnails/18.jpg)
Results based on secondary structure type
Overall success rate: all-α = 80%; all-β = 2%; mixed α-β = 37%
![Page 19: AMPLE – Using de novo or ab initio protein structure modelling techniques to create and enhance search models for use in Molecular Replacement Jaclyn Bibby,](https://reader037.vdocuments.net/reader037/viewer/2022110304/5519c922550346443e8b479e/html5/thumbnails/19.jpg)
Variance and Truncation
• Variability between decoys in each cluster corresponds to their deviation from the deposited native structure
• 2P5K example: C-terminal region predicted as least reliably modelled portion by Theseus alignment variance score
![Page 20: AMPLE – Using de novo or ab initio protein structure modelling techniques to create and enhance search models for use in Molecular Replacement Jaclyn Bibby,](https://reader037.vdocuments.net/reader037/viewer/2022110304/5519c922550346443e8b479e/html5/thumbnails/20.jpg)
Search model ensemble size/truncation
![Page 21: AMPLE – Using de novo or ab initio protein structure modelling techniques to create and enhance search models for use in Molecular Replacement Jaclyn Bibby,](https://reader037.vdocuments.net/reader037/viewer/2022110304/5519c922550346443e8b479e/html5/thumbnails/21.jpg)
Running Times
• Average time for a complete run (decoy generation, preparation, MR and chain tracing) was 2 CPU days
• A parallelised version of the code making use of Sun Grid Engine for batch farming of model generation and molecular replacement significantly speeds up the process. Results can be achieved in less than 1 hour
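To put the two figures side by side, the back-of-envelope sketch below asks how many cores would be needed to fit 2 CPU days of work into 1 hour of wall-clock time. It assumes perfectly even scaling across cores, which is optimistic, and the slides do not state the cluster size actually used or whether runs stop early once a solution is found. The helper `min_cores` is hypothetical.

```python
import math


def min_cores(cpu_hours: float, wall_hours: float) -> int:
    """Smallest core count that fits `cpu_hours` of work into `wall_hours`,
    assuming ideal (perfectly even) parallel scaling."""
    return math.ceil(cpu_hours / wall_hours)


# 2 CPU days of work squeezed into 1 hour of wall-clock time.
print(min_cores(2 * 24, 1.0))  # → 48
```

Under these assumptions, finishing in under an hour implies a batch farm on the order of 48 or more cores.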
![Page 22: AMPLE – Using de novo or ab initio protein structure modelling techniques to create and enhance search models for use in Molecular Replacement Jaclyn Bibby,](https://reader037.vdocuments.net/reader037/viewer/2022110304/5519c922550346443e8b479e/html5/thumbnails/22.jpg)
Exploiting distant homologues
Figure: a. Clustered decoy models, b. Truncated ensemble, c. Positioned MR solution, d. SHELXE C-alpha trace, e. Completed structure
![Page 23: AMPLE – Using de novo or ab initio protein structure modelling techniques to create and enhance search models for use in Molecular Replacement Jaclyn Bibby,](https://reader037.vdocuments.net/reader037/viewer/2022110304/5519c922550346443e8b479e/html5/thumbnails/23.jpg)
Remodelling related NMR structures or distant homologues
• Can provide AMPLE with a template for the target which could be a related NMR structure or a distant homologue
• AMPLE will use Rosetta to “re-model” this template to something that should in theory be closer to the target
![Page 24: AMPLE – Using de novo or ab initio protein structure modelling techniques to create and enhance search models for use in Molecular Replacement Jaclyn Bibby,](https://reader037.vdocuments.net/reader037/viewer/2022110304/5519c922550346443e8b479e/html5/thumbnails/24.jpg)
Transmembrane Proteins
• Experimentally very difficult to work with/crystallise
• Represent ~30% of all proteins
• Make up < 3% of structures in the Protein Data Bank
• Presence in the membrane constrains their shape, so they can be easier to model
• MR with ab initio transmembrane models hasn't been tried yet
Figure labels: extracellular (aqueous), transmembrane region (hydrophobic), intracellular (aqueous)
Image: http://en.wikiversity.org/wiki/File:Cytochrome_C_Oxidase_1OCC_in_Membrane_2.png
![Page 25: AMPLE – Using de novo or ab initio protein structure modelling techniques to create and enhance search models for use in Molecular Replacement Jaclyn Bibby,](https://reader037.vdocuments.net/reader037/viewer/2022110304/5519c922550346443e8b479e/html5/thumbnails/25.jpg)
Selected 18 transmembrane proteins:
• 23 – 249 residues
• 1.45 – 2.5 Å resolution
• 7 clear successes
• 5 possible successes
• 223-residue structure (3GD8) could be the largest ever solved with ab initio modelling
![Page 27: AMPLE – Using de novo or ab initio protein structure modelling techniques to create and enhance search models for use in Molecular Replacement Jaclyn Bibby,](https://reader037.vdocuments.net/reader037/viewer/2022110304/5519c922550346443e8b479e/html5/thumbnails/27.jpg)
Coiled-coil targets
• Difficult to solve in MR even with good homologues
• α-helical nature makes them suitable targets for AMPLE
• Initial testing has been very promising with 80% success rate
• Some novel structures have also been solved
![Page 28: AMPLE – Using de novo or ab initio protein structure modelling techniques to create and enhance search models for use in Molecular Replacement Jaclyn Bibby,](https://reader037.vdocuments.net/reader037/viewer/2022110304/5519c922550346443e8b479e/html5/thumbnails/28.jpg)
AMPLE Availability
• Beta version available as part of CCP4 6.3.0.
• Improved and more robust version to be released as part of CCP4 6.4.0.
• Requires installation of several non-CCP4 packages:
– Rosetta, SHELXE, Theseus, SPICKER, Maxcluster
• Future versions will have a reduced number of dependencies
![Page 29: AMPLE – Using de novo or ab initio protein structure modelling techniques to create and enhance search models for use in Molecular Replacement Jaclyn Bibby,](https://reader037.vdocuments.net/reader037/viewer/2022110304/5519c922550346443e8b479e/html5/thumbnails/29.jpg)
Documentation available from http://ccp4wiki.org
![Page 30: AMPLE – Using de novo or ab initio protein structure modelling techniques to create and enhance search models for use in Molecular Replacement Jaclyn Bibby,](https://reader037.vdocuments.net/reader037/viewer/2022110304/5519c922550346443e8b479e/html5/thumbnails/30.jpg)
Summary
• AMPLE is a pipeline designed to prepare cheaply obtained decoy models from ab initio modelling for use as search models in molecular replacement
• Results show that the method works well for smaller proteins, particularly those containing α-helical secondary structure
• Tests were limited to structures of 120 residues in length, but the method has worked for cases up to 250 residues
• New avenues – NMR, homologue remodelling
• Several real successes
• Currently available as a beta version in CCP4 6.3.0
![Page 31: AMPLE – Using de novo or ab initio protein structure modelling techniques to create and enhance search models for use in Molecular Replacement Jaclyn Bibby,](https://reader037.vdocuments.net/reader037/viewer/2022110304/5519c922550346443e8b479e/html5/thumbnails/31.jpg)
Acknowledgements
• Jaclyn Bibby, Daniel Rigden, Jens Thomas, University of Liverpool
• Olga Mayans, University of Liverpool
• Martyn Winn, Daresbury Laboratory
• Andrea Thorn, Tim Gruene & George Sheldrick (SHELX)
• Developers of Rosetta and Quark
• Refmac: Garib Murshudov, LMB-MRC Cambridge
• Molrep: Alexei Vagin & Andrey Lebedev
• Phaser: Randy Read, Airlie McCoy & Gábor Bunkóczi
• Thanks to authors of all underlying programs
• Funding:
– BBSRC
– Support from CCP4 & the Research Complex at Harwell
Poster: MS04-12 (Rootes Building)