swaati robettaa

CENTRAL UNIVERSITY OF BIHAR

BIS 553: Protein Modelling and Simulation

ROBETTA DE NOVO STRUCTURE PREDICTION (lec-1)

Robetta as de novo Structure Prediction

Submitted to:- Dr. Durg Vijay Singh

Submitted by:- Swati Kumari, Roll no- 22, 2nd semester

Central University of Bihar, Patna

Upload: swaatisoni

Post on 15-Apr-2017



CONTENT :-

1. Introduction
2. Performance of Robetta in CASP
3. Development and history
4. Aims of Robetta
5. Robetta as multiple functional modules
6. Steps of Robetta
7. Domain prediction
8. What is Ginzu Protocol
9. High resolution structure prediction
10. Component of Low Resolution Scoring Function
11. Component of High Resolution Scoring Function
12. Fragment-based Methods (Rosetta)
13. Limitations using Robetta
14. Reference


Introduction :- “Protein structure prediction and analysis using the Robetta server”

Rosetta is a well-established computational software suite with a variety of tools developed for macromolecular modeling, structure prediction and functional design.

It uses a massive distributed computing infrastructure (Rosetta@home).

The Robetta server (http://robetta.bakerlab.org) provides automated tools for protein structure prediction and analysis.

For structure prediction, sequences submitted to the server are parsed into putative domains and structural models are generated using either comparative modeling or de novo structure prediction methods.

If a confident match to a protein of known structure is found using BLAST, PSI-BLAST, FFAS03 or 3D-Jury, it is used as a template for comparative modeling. If no match is found, structure predictions are made using the de novo Rosetta fragment insertion method.
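The choice between the two modelling routes can be sketched as a small decision function. This is only an illustration of the rule stated above, not Robetta's actual code: the function name, the dictionary layout and the template ID are hypothetical, and the methods are simply tried in the order in which the text lists them.

```python
from typing import Dict, Optional, Tuple

# Template detection methods, in the order the text lists them.
METHODS = ("BLAST", "PSI-BLAST", "FFAS03", "3D-Jury")


def choose_protocol(hits: Dict[str, Optional[str]]) -> Tuple[str, Optional[str]]:
    """Return (protocol, template) for one putative domain.

    `hits` maps a method name to the template it matched with confidence,
    or to None when that method found nothing (hypothetical layout).
    """
    for method in METHODS:
        template = hits.get(method)
        if template is not None:
            # A confident match exists: use it as the template.
            return "comparative modeling", template
    # No method found a match: fall back to de novo prediction.
    return "de novo fragment insertion", None


# "1abcA" is a made-up template ID used only for illustration.
print(choose_protocol({"BLAST": None, "PSI-BLAST": "1abcA"}))
print(choose_protocol({}))
```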

Experimental nuclear magnetic resonance (NMR) constraints data can also be submitted with a query sequence for RosettaNMR de novo structure determination. Other current capabilities include the prediction of the effects of mutations on protein–protein interactions using computational interface alanine scanning.

The Rosetta method was originally developed for de novo protein structure prediction and is regularly one of the best performers in the community-wide Critical Assessment of Structure Prediction (CASP).

Robetta provides both ab initio and comparative models of protein domains.

Domains without a detectable PDB homolog are modeled with the Rosetta de novo protocol.

Comparative models are built from template PDB structures that are detected and aligned by locally installed versions of HHsearch/HHpred, RaptorX and SPARKS-X.

Alignments are clustered and comparative models are generated using the RosettaCM protocol.

Procedure is fully automated.

Robetta is continually evaluated through CAMEO (server11).

Robetta is evaluated in the blind benchmarking of CASP.

Robetta uses the ROSETTA software, which is developed and maintained by RosettaCommons.

In addition to de novo structure prediction, Rosetta also has methods for :-

1. protein-protein and protein-small molecule docking,
2. homology modeling,
3. novel protein design,
4. redesign of existing proteins for altered function.

In principle, Rosetta implements mostly knowledge-guided Metropolis Monte Carlo sampling approaches coupled with knowledge-guided energy functions to perform two tasks: sampling the conformational space and evaluating the energy of the resulting structural models.
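The two tasks named above, sampling and energy evaluation, meet in the Metropolis acceptance rule. Below is a minimal sketch of that rule on a made-up one-dimensional energy; the energy function, step size and temperature are assumptions chosen for illustration and have nothing to do with Rosetta's real energy functions.

```python
import math
import random


def metropolis_accept(e_old, e_new, kT=1.0, rng=random):
    """Metropolis criterion: always accept a move that does not raise the
    energy, otherwise accept with probability exp(-(e_new - e_old) / kT)."""
    if e_new <= e_old:
        return True
    return rng.random() < math.exp(-(e_new - e_old) / kT)


def toy_energy(x):
    """Hypothetical energy with a single minimum at x = 2."""
    return (x - 2.0) ** 2


def sample(steps=5000, kT=0.1, seed=0):
    rng = random.Random(seed)
    x = 10.0                                   # start far from the minimum
    for _ in range(steps):
        trial = x + rng.uniform(-0.5, 0.5)     # sampling: propose a move
        # Evaluation: compare the energies of the old and new states.
        if metropolis_accept(toy_energy(x), toy_energy(trial), kT, rng):
            x = trial
    return x


print(sample())  # ends close to the minimum at x = 2
```

Uphill moves are sometimes accepted, which is what lets the search escape local minima instead of stopping at the first one it finds.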

The well-validated energy function and sampling methodologies used in Rosetta form the foundation for the high quality prediction and design of macromolecular structures and interactions, with successful stories ranging from fibril structure prediction to RNA folding to the design of new enzyme catalysts.

With citations in hundreds of research publications, Rosetta is a trusted resource for many top research teams in pharmaceutical companies and non-profit institutions.

http://rosetta.insilicos.com/what/
http://en.wikipedia.org/wiki/Rosetta@home
http://nar.oxfordjournals.org/content/32/suppl_2/W526.full

Performance of Robetta in Critical Assessment of Techniques for Protein Structure Prediction (CASP):-

Results from the fourth, fifth and sixth Critical Assessments of Structure Prediction (CASP4, CASP5, CASP6) have shown that Rosetta, the method behind Robetta, is currently one of the best methods for de novo protein structure prediction and distant fold recognition.

Robetta has participated as an automated prediction server in the biennial CASP experiments since CASP5 in 2002, performing among the best in the automated server prediction category.


Robetta has since competed in CASP6 and 7, where it did better than average among both automated server and human predictor groups.

In modeling protein structure as of CASP6, Robetta first searches for structural homologs using BLAST, PSI-BLAST, and 3D-Jury, then parses the target sequence into its individual domains, or independently folding units of proteins, by matching the sequence to structural families in the Pfam database. Domains with structural homologs then follow a "template-based model" (i.e., homology modeling) protocol.

In CASP8, Robetta was augmented to use Rosetta's high resolution all-atom refinement method, the absence of which was cited as the main cause for Robetta being less accurate than the Rosetta@home network in CASP7.

http://en.wikipedia.org/wiki/Rosetta@home

Development and history :-

Originally introduced by the Baker laboratory at the University of Washington in 1998 as an ab initio approach to structure prediction, Rosetta has since branched into several development streams and distinct services. The Rosetta platform derives its name from the Rosetta Stone, as it attempts to decipher the structural "meaning" of proteins' amino acid sequences.

More than seven years after Rosetta's first appearance, the Rosetta@home project was released (i.e. announced as no longer beta) on October 6, 2005.

Many of the graduate students and other researchers involved in Rosetta's initial development have since moved to other universities and research institutions, and subsequently enhanced different parts of the Rosetta project.

http://en.wikipedia.org/wiki/Rosetta@home

Aim of Robetta :-

Rosetta@home aims to predict protein–protein docking and design new proteins with the help of about sixty thousand active volunteered computers processing at 83 teraFLOPS on average as of April 18, 2014.

Foldit, a Rosetta@Home videogame, aims to reach these goals with a crowdsourcing approach. Though much of the project is oriented towards basic research on improving the accuracy and robustness of the proteomics methods, Rosetta@home also does applied research on malaria, Alzheimer's disease and other pathologies.

In addition to disease-related research, the Rosetta@home network serves as a testing framework for new methods in structural bioinformatics.

These new methods are then used in other Rosetta-based applications, like RosettaDock and the Human Proteome Folding Project, after being sufficiently developed and proven stable on Rosetta@home's large and diverse collection of volunteer computers.

http://en.wikipedia.org/wiki/Rosetta@home

Robetta as multiple functional modules :-

Rosetta’s macromolecular modeling capabilities empower researchers to address a wide variety of questions in structural biology. The software contains multiple functional modules with some representative features as follows.

Application Name : Description

Structure prediction

AbinitioRelax : Predict high resolution 3-D structure of a protein from its amino acid sequence

Comparative_modeling : Build structural models of proteins using one or more known structures as templates

Rna_denovo : De novo tertiary structure prediction and design of complex RNAs with high resolution

Design

FixedBBProteinDesign : Redesign the amino acids for target protein backbones

Enzdes : Design a protein active site to catalyze a chemical reaction

AnchoredDesign : Design interfaces between known target structures and new binding partners

Dock

Protein_docking : Predict and refine the docked conformation of two proteins with a known structure

Ligand_docking : Predict the orientation in which a small molecule binds to a protein target and calculate the binding energy

AntibodyModeler : Predict antibody Fv region structures and perform antibody-antigen docking

http://rosetta.insilicos.com/what/

Steps of Robetta :-

Robetta works in two phases:-

1. Initially, a search of the conformational landscape, i.e., the low resolution approach.

2. A high resolution approach, where atomic detail and physically derived energy functions are employed.

Low resolution prediction (predicts an approximate structure) :-

In the low resolution phase, the overall topology is searched using a statistical scoring function and fragment assembly.

This is followed by an atomic-detail refinement phase using rotamers, small backbone angle moves and a more physically relevant scoring function.

Robetta uses information from the PDB to estimate the possible conformations for local sequence segments.

It first generates libraries of local sequence fragments excised from a non-redundant (curated) version of the PDB on the basis of local sequence similarity (3-9 residue matches between the query sequence and a given structure in the PDB).

Selecting fragments of local structure on the basis of local sequence matches dramatically reduces the size of the accessible conformational landscape.
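Fragment selection by local sequence similarity can be sketched as a sliding-window search. This is a simplified illustration under stated assumptions: it ranks windows by plain sequence identity, the library is a hypothetical mapping from made-up structure IDs to sequences, and a real fragment would also carry the backbone geometry of the matched segment.

```python
def window_identity(a, b):
    """Number of positions at which two equal-length sequences agree."""
    return sum(1 for x, y in zip(a, b) if x == y)


def pick_fragments(query, library, size=9, top=3):
    """For each query window of length `size`, return the `top` library
    windows ranked by sequence identity as (score, structure_id, position)."""
    picks = {}
    for start in range(len(query) - size + 1):
        window = query[start:start + size]
        scored = []
        for struct_id, seq in library.items():
            for pos in range(len(seq) - size + 1):
                score = window_identity(window, seq[pos:pos + size])
                scored.append((score, struct_id, pos))
        # Highest identity first; ties broken by structure ID and position.
        scored.sort(key=lambda t: (-t[0], t[1], t[2]))
        picks[start] = scored[:top]
    return picks


# Hypothetical two-entry "non-redundant PDB" used only for illustration.
library = {"s1": "AAAAGGGGGGGGGAAAA", "s2": "MKVLAAGIVGLLLAQ"}
for start, frags in pick_fragments("MKVLAAGIVG", library, size=9, top=2).items():
    print(start, frags)
```

Keeping only a few candidates per window is what shrinks the search space: each position can adopt only the handful of local structures that its best-matching fragments supply.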


Tertiary structures are generated using a Monte Carlo search of the possible combinations of likely local structures, minimizing a scoring function that accounts for non-local interactions such as compactness, hydrophobic burial, specific pair interactions and strand pairing (beta strands).
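A toy version of the fragment-insertion search is sketched below. Everything in it is an assumption made to keep the example small: the chain lives in two dimensions, a "fragment" is three consecutive turning angles from a made-up library, and the score measures compactness only, whereas the scoring function described above combines several terms.

```python
import math
import random

# Hypothetical fragment library: each fragment is three turning angles (radians).
FRAGMENTS = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0), (-1.0, -1.0, -1.0), (1.0, -1.0, 1.0)]


def build_chain(angles):
    """Walk unit-length steps in 2D, turning by each angle in turn."""
    x = y = heading = 0.0
    coords = [(x, y)]
    for turn in angles:
        heading += turn
        x += math.cos(heading)
        y += math.sin(heading)
        coords.append((x, y))
    return coords


def compactness(angles):
    """Toy score: mean squared distance from the centroid (lower = more compact)."""
    coords = build_chain(angles)
    cx = sum(p[0] for p in coords) / len(coords)
    cy = sum(p[1] for p in coords) / len(coords)
    return sum((px - cx) ** 2 + (py - cy) ** 2 for px, py in coords) / len(coords)


def fragment_assembly(n_res, fragments, steps=2000, kT=0.5, seed=0):
    rng = random.Random(seed)
    frag_len = len(fragments[0])
    angles = [0.0] * n_res                      # start from an extended chain
    score = compactness(angles)
    best_angles, best_score = list(angles), score
    for _ in range(steps):
        pos = rng.randrange(n_res - frag_len + 1)
        trial = list(angles)
        trial[pos:pos + frag_len] = rng.choice(fragments)   # fragment insertion
        trial_score = compactness(trial)
        delta = trial_score - score
        # Metropolis rule: keep improvements, sometimes keep worse states.
        if delta <= 0 or rng.random() < math.exp(-delta / kT):
            angles, score = trial, trial_score
            if score < best_score:
                best_angles, best_score = list(angles), score
    return best_angles, best_score


best_angles, best_score = fragment_assembly(20, FRAGMENTS)
print(compactness([0.0] * 20), best_score)   # extended chain vs. best found
```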

High resolution prediction (predicts an accurate structure) :-

For the second-stage refinement, the centroid representation of amino acid side chains used in the low resolution phase is replaced with an atomic-detail rotamer representation.

In this phase the scoring function includes a solvation term, an H-bond term and other terms with a direct physical interpretation.

Fig :- steps of robetta

source – Structural Bioinformatics, 2nd edition, edited by Jenny Gu, Philip E. Bourne.

Refinement Phase :-

The most natural starting point for simulating high resolution protein folding is standard molecular dynamics simulation (numerically integrating Newton's equations of motion for the polypeptide chain) using a physically reasonable potential function.
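What "numerically integrating Newton's equations of motion" means can be shown on the smallest possible system. The sketch below uses the velocity Verlet scheme, a common choice in molecular dynamics, on a single particle attached to a spring; the harmonic potential, mass and time step are assumptions for illustration, not a protein force field.

```python
def velocity_verlet(x, v, force, mass=1.0, dt=0.01, steps=1000):
    """Integrate m * a = F(x) for one particle with the velocity Verlet scheme."""
    a = force(x) / mass
    for _ in range(steps):
        x = x + v * dt + 0.5 * a * dt * dt     # advance the position
        a_new = force(x) / mass                # force at the new position
        v = v + 0.5 * (a + a_new) * dt         # advance the velocity
        a = a_new
    return x, v


def spring_force(x, k=1.0):
    """Hypothetical potential U(x) = 0.5 * k * x**2, so F(x) = -k * x."""
    return -k * x


x, v = velocity_verlet(1.0, 0.0, spring_force)
total_energy = 0.5 * v * v + 0.5 * x * x
print(x, v, total_energy)   # total energy stays close to its initial value of 0.5
```

In a real simulation the same update is applied to every atom of the polypeptide chain, and the force comes from the gradient of the full potential function.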

Domain prediction :-

Domain prediction is a critical prerequisite to structure prediction: “As the size of the protein increases, its conformational space also increases.”

Current de novo methods are limited to protein domains of 150 residues for alpha-beta proteins, 80 residues for beta folds and 150 residues for alpha-only folds. To overcome this, two approaches can be applied:-

1. Increase the size range of de novo structure prediction.

2. Divide the protein into domains prior to attempting structure prediction.

"A domain is generally defined as a portion of a protein that folds independently of the rest of the protein."

So dividing a query sequence into its smallest component domains prior to folding is a straightforward way to increase the size range of the prediction.
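The effect of domain division on the size limit can be sketched in a few lines. The size limits are the figures quoted in this section; the function names and the boundary position in the example are hypothetical, and finding the boundaries is the hard part that this sketch takes as given.

```python
# De novo size limits (in residues) as quoted in this section.
SIZE_LIMIT = {"alpha-beta": 150, "beta": 80, "alpha": 150}


def split_into_domains(sequence, boundaries):
    """Cut `sequence` at the given 0-based boundary positions."""
    cuts = [0] + sorted(boundaries) + [len(sequence)]
    return [sequence[a:b] for a, b in zip(cuts, cuts[1:])]


def within_de_novo_range(domain, fold_class):
    """True if the domain is short enough for de novo prediction."""
    return len(domain) <= SIZE_LIMIT[fold_class]


# A made-up 200-residue chain with one predicted boundary at position 120.
chain = "A" * 200
domains = split_into_domains(chain, [120])
print(within_de_novo_range(chain, "alpha-beta"))
print([(len(d), within_de_novo_range(d, "alpha-beta")) for d in domains])
```

The whole chain is too long for de novo prediction, while each of the two domains on its own falls within the limit.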

For many proteins, domain divisions can be found easily, while several domains remain beyond our ability to detect them correctly.

The determination of domains, their family membership and their boundaries for multidomain proteins is a vital step in structure annotation/prediction.

In brief, most protein domain parsing methods rely on hierarchically searching for domains in a query sequence with a collection of primary sequence methods, domain library searches and matches to structural domains in the PDB.

What is Ginzu Protocol :-

Ginzu is a PDB template identification and domain prediction protocol that attempts to determine the regions of a protein chain that are aligned to PDB templates with reasonable confidence, and in regions where templates are not detected, it attempts to find regions that will fold into globular units, called "domains".

Reference – http://robetta.bakerlab.org/faqs.jsp#removingjobs

source – Structural Bioinformatics, 2nd edition, edited by Jenny Gu, Philip E. Bourne.

Component of Low Resolution Scoring Function :-

1. Residue Environment (solvation)
2. Residue Pair Interaction (electrostatics and disulfides)
3. Steric Repulsion
4. Radius of Gyration (compactness measure - Van der Waals interaction and solvation)
5. C-β Density (solvation, correction for the excluded volume effect introduced by the simulation)
6. Strand Pairing (H-bonding)
7. Strand Arrangement into Sheets
8. Helix-Strand Packing

Component of High Resolution Scoring Function :-

1. Ramachandran Torsion Preferences
2. Lennard-Jones Interaction
3. H-bonding
4. Solvation
5. Residue Pair Interaction (electrostatic interaction, disulphide bond)
6. Rotamer Self Energy
7. Unfolded State Preferences
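As an example of one term in this list, the Lennard-Jones interaction in its textbook 12-6 form can be written as a short function. This is the generic formula only; the parameter values are placeholders, and the form actually implemented in Rosetta may differ in detail, for example in how the steep repulsive region is handled.

```python
def lennard_jones(r, epsilon=1.0, r_min=1.0):
    """Textbook 12-6 Lennard-Jones energy.

    `epsilon` is the depth of the energy well and `r_min` is the distance
    at which the energy reaches its minimum of -epsilon.
    """
    ratio = r_min / r
    return epsilon * (ratio ** 12 - 2.0 * ratio ** 6)


for r in (0.8, 1.0, 2.0):
    print(r, lennard_jones(r))
```

Atoms closer than `r_min` are strongly penalised (steric repulsion), atoms near `r_min` are favoured, and the attraction fades quickly at larger distances.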

Fragment-based Methods (Rosetta) :-

Hypothesis: the PDB contains all the possible conformations that a short region of a protein chain might adopt.

How do we choose fragments that are most likely to correctly represent the query sequence?


Limitation using Robetta :-

There are many limitations to consider when using Robetta, as with all structure prediction methods. The de novo protocol is optimized for small single domain proteins (<120 residues). Within this limit, models are frequently around 3–7 Å RMSD to more than half of the native structure. Above this limit, models are still likely to have at least 50 residues within 4 Å RMSD, as shown in Table 1. For comparative modeling, the quality of the model is greatly dependent on the correct selection of the best possible parent template and alignment. Because of these factors, results are highly dependent on the accuracy of the domain assignments. A general rule to follow is that BLAST, PSI-BLAST, FFAS03 and 3D-Jury parent detections should be considered the most reliable, in that order. Domains predicted from Pfam-A and the MSA should be treated with caution, particularly for longer domains and also those that were assigned solely by the MSA.
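The accuracy figures above are given as RMSD in Å. A minimal sketch of the calculation follows; it assumes that the model and the native structure have already been superimposed and that atoms are listed in matching order, and the coordinates in the example are made up.

```python
import math


def rmsd(model, native):
    """Root-mean-square deviation between two equal-length lists of
    (x, y, z) coordinates. No superposition is performed here."""
    if len(model) != len(native):
        raise ValueError("structures must have the same number of atoms")
    total = sum(
        (mx - nx) ** 2 + (my - ny) ** 2 + (mz - nz) ** 2
        for (mx, my, mz), (nx, ny, nz) in zip(model, native)
    )
    return math.sqrt(total / len(model))


# Made-up coordinates: the model is the native shifted by 3 Å along z.
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model = [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)]
print(rmsd(model, native))
```

Statements such as "at least 50 residues within 4 Å RMSD" apply this measure to the best-matching subset of residues rather than to the whole chain.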

http://nar.oxfordjournals.org/content/32/suppl_2/W526.full

Reference:
1. http://rosetta.insilicos.com/what/
2. http://en.wikipedia.org/wiki/Rosetta@home
3. http://nar.oxfordjournals.org/content/32/suppl_2/W526.full
4. Structural Bioinformatics, 2nd edition, edited by Jenny Gu, Philip E. Bourne.
5. http://robetta.bakerlab.org/faqs.jsp#removingjobs