protein structure prediction using coarse grain force fields · nasir mahmood. 12.02.2010. protein...

Nasir Mahmood

12.02.2010

Protein Structure Prediction using Coarse Grain Force Fields

Overview

• Introduction
• Probabilistic Ab Initio – Standard
  – Score Function
  – Search Method
  – Results
• Probabilistic Ab Initio – Extended
  – Score Function: Introducing Solvation
  – Search Method: Bias Fix
  – Results
• Outlook
• Summary

“All the information required by protein to adopt its final conformation is encoded in its sequence”

Christian B. Anfinsen (1916 - 1995)

Source: http://nobelprize.org/

• The information he referred to has not been decoded yet

• Interestingly, these days we also know about proteins like ‘prions’

Experimental Methods

• X-Ray Crystallography
• NMR Spectroscopy
• Cryo-EM

[Figure: number of structures N versus Time (year).]

More than 3 decades and only 60000+ structures

Sequence Database Growth

[Figure: N versus Time (year); vertical axis ticks from 10 × 10^6 to 100 × 10^6.]

Methods

Experimental Methods
• X-Ray Crystallography
• NMR Spectroscopy
• Cryo-EM

Computational Methods
• Homology Modeling
• Fold Recognition
• Ab Initio Modeling

[Diagram labels: Accuracy, Computation cost, PDB dependence; Experimental Data (PDB), Physical Principles.]

Ab Initio Methods

• Physics-based
  – Best but most difficult (Force fields)
  – Computationally expensive
• Statistics-based
  – Boltzmann distributions
  – Statistical mechanical ensembles
• We use Descriptive Statistics
  – Bayesian formulation
  – No hidden approximations
  – No energies but find distributions
• Monte Carlo Methods
• Molecular Dynamics

Boltzmann statistics: $P_i = e^{-\Delta E / k_B T}$

Our Ab Initio Method

• Purely Probabilistic Force Field
  – Mixture of Probabilities: Sequence, Structure, Solvation
  – No energies
  – No Boltzmann statistics
• Coarse Grained
  – reduced dimensionality
  – relies on dihedral angles
  – no side chains
  – 5-atoms representation
  – Fragment Assembly
• Simulated Annealing / Monte Carlo
  – Move set: biased & unbiased
  – Acceptance criterion: ratio of probabilities

Probabilistic Score Function

• Representation:
  – Reduced, Simplified
  – 5-atoms per amino acid
  – dihedral angles (phi, psi)

1. Sequence
• Multi-way Bernoulli

2. Structure
• Bivariate Gaussian

[Figure: amino acid one-letter codes.]
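The two terms above can be sketched as a single fragment score: a multi-way Bernoulli (categorical) probability for each residue, multiplied by a bivariate Gaussian density for each (phi, psi) pair. This is only an illustration under stated assumptions. The function names, the per-position model layout (`residue_probs`, `mean`, `std`, `rho`) and all parameter values are hypothetical, and a plain Gaussian ignores the periodicity of dihedral angles.

```python
import math

def log_categorical(residue, probs):
    """Log-probability of one residue under a multi-way Bernoulli (categorical) model."""
    return math.log(probs[residue])

def log_bivariate_gaussian(phi, psi, mean, std, rho):
    """Log-density of a bivariate Gaussian over (phi, psi) with correlation rho."""
    zx = (phi - mean[0]) / std[0]
    zy = (psi - mean[1]) / std[1]
    one_minus_r2 = 1.0 - rho * rho
    quad = (zx * zx - 2.0 * rho * zx * zy + zy * zy) / one_minus_r2
    log_norm = math.log(2.0 * math.pi * std[0] * std[1] * math.sqrt(one_minus_r2))
    return -0.5 * quad - log_norm

def fragment_log_prob(residues, angles, model):
    """Sum the sequence and structure log-terms over all positions of a fragment.

    `model` is a hypothetical list with one dict per fragment position.
    """
    total = 0.0
    for site, residue, (phi, psi) in zip(model, residues, angles):
        total += log_categorical(residue, site["residue_probs"])
        total += log_bivariate_gaussian(phi, psi, site["mean"], site["std"], site["rho"])
    return total
```

Working in log space keeps the product of many small probabilities numerically stable; the probabilities themselves are recovered with `math.exp`.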

[Figure: (A) overlapping fragments at residues i, i + 1, i + 2; (B) one fragment pairs a Sequence with its Structure, e.g. P L E N R R V with dihedral pairs (3.1, 1.1), (2.0, 0.9), (1.5, -2.5), (1.7, 2.3), (-2.0, -0.9), (-1.5, -1.2), (-1.2, -0.8); (C) fragments taken at successive positions i, i + 1, i + 2, … N, labelled 1.5 × 10^6.]

Fragment Generation

[Diagram labels: Fragment Library (each fragment has a Sequence and a Structure, e.g. sequence A S L T with its dihedral angles); Expectation Maximization; Statistical Models; Bayesian Classifier; Classified fragments (class 0 to class 6), e.g. GGGG .., GAEG .., DCWF .., WFDC .., STDC .., STST .., WFTG .., CCAD .., ACAD ..]
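The Bayesian classification step named above can be sketched with Bayes' rule: each class contributes its weight times the likelihood of the fragment under that class, and the fragment goes to the class with the largest posterior. This is an illustrative sketch only. The function names and the inputs (`log_likelihoods`, `class_weights`) are assumptions, and how the per-class likelihoods are obtained is left open.

```python
import math

def class_posteriors(log_likelihoods, class_weights):
    """Posterior class probabilities from per-class log-likelihoods and prior weights."""
    joint = [math.log(w) + ll for w, ll in zip(class_weights, log_likelihoods)]
    top = max(joint)  # subtract the maximum before exponentiating, for stability
    unnormalised = [math.exp(j - top) for j in joint]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]

def classify(log_likelihoods, class_weights):
    """Index of the class with the highest posterior probability."""
    posteriors = class_posteriors(log_likelihoods, class_weights)
    return max(range(len(posteriors)), key=posteriors.__getitem__)
```

The same posterior computation is the expectation step of a standard Expectation Maximization fit for a mixture model.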

Search Method

[Figure: Probability over Conformational space, from the Initial (random) conformation through steps (i-1) and (i) to the Final Model.]

Relative probabilities: $P_i = \dfrac{p(x_i)}{p(x_{i-1})}$

• Normal methods: $P_i = e^{-\Delta E / k_B T}$
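The two acceptance rules can be put side by side in a short sketch. It assumes the usual Metropolis convention of accepting with probability min(1, P), which the slides do not spell out. The function names are hypothetical, and how the annealing temperature enters the probability-ratio rule is not shown here.

```python
import math
import random

def accept_by_probability_ratio(log_p_new, log_p_old, rng=random):
    """Accept a move with probability min(1, p(x_i) / p(x_{i-1})), using log-probabilities."""
    log_ratio = log_p_new - log_p_old
    if log_ratio >= 0.0:
        return True  # the new conformation is at least as probable
    return rng.random() < math.exp(log_ratio)

def accept_boltzmann(delta_e, kt, rng=random):
    """Conventional energy-based rule: accept with probability min(1, exp(-dE / kT))."""
    if delta_e <= 0.0:
        return True  # downhill in energy
    return rng.random() < math.exp(-delta_e / kt)
```

The probability-ratio form needs no energy function at all, only the ability to evaluate p(x) for the old and the new conformation.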

[Figure: move set. Unbiased: a Random Angle Generator draws (phi, psi) over the range -180 to 180. Biased: fragments are drawn from the Fragment Library built from the PDB (≈ 2 × 10^6 fragments).]
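The two kinds of move can be sketched as follows. This is a minimal illustration, assuming a conformation is stored as a list of (phi, psi) pairs in degrees and that a library fragment is itself a short list of such pairs no longer than the chain. The function names and the data layout are hypothetical.

```python
import random

def unbiased_move(angles, rng):
    """Replace one residue's (phi, psi) with random angles between -180 and 180."""
    new = list(angles)
    position = rng.randrange(len(new))
    new[position] = (rng.uniform(-180.0, 180.0), rng.uniform(-180.0, 180.0))
    return new

def biased_move(angles, fragment_library, rng):
    """Overwrite a window of residues with the angles of a randomly chosen library fragment."""
    new = list(angles)
    fragment = rng.choice(fragment_library)
    first = rng.randrange(len(new) - len(fragment) + 1)
    new[first:first + len(fragment)] = fragment
    return new
```

Both functions return a new list and leave the input untouched, so a rejected move can simply be discarded.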

Interplay of Cartesian Coordinates & Dihedral Angles

Choi, V.: 2005, On Updating torsion angles of molecular conformations, J Chem Inf Model 46, 438–444.
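One direction of this interplay, reading a dihedral angle off four Cartesian points, can be sketched with the standard atan2 formula. This is not the update scheme of the cited paper, which concerns the opposite direction (moving coordinates when torsion angles change). The function names are hypothetical.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_degrees(p0, p1, p2, p3):
    """Dihedral angle in degrees defined by four consecutive points."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))
```

For a protein backbone, phi of residue i would use the atoms C(i-1), N(i), CA(i), C(i), and psi would use N(i), CA(i), C(i), N(i+1).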

Results

[Figures: Native and Model structures shown side by side (proteins 2hfq, 2hd3, 2gzv, 2hj1); Phi versus Psi plots; Score and Temperature versus Time.]

Score Function: Introducing Solvation

[Figure labels: PDB; Trp, Gly, Lys, Ser.]

• Representation:
  – Reduced, Simplified
  – 5-atoms per amino acid
  – dihedral angles (phi, psi)

1. Sequence
• Multi-way Bernoulli

2. Structure
• Bivariate Gaussian

3. Solvation
• Simple Gaussian

[Figure: amino acid one-letter codes.]

• Mixture Models: connections between Residues, Geometry, and Location in protein

[Diagram labels: PDB; Fragment Library; Expectation Maximization; Statistical Models; Bayesian Classifier; Re-Classified fragments, e.g. GGGG .., GAEG .., DCWF .., WFDC .., STDC .., STST .., WFTG .., CCAD .., ACAD ..]

Sequence    Structure (dihedral pairs)                               Solvation
A S L T     (-3.1, -1.1) (-2.0, -0.9) (-0.5, -0.7) (-1.7, -0.5)      12 07 08 11
S L T I     (-2.0, -0.9) (-0.5, -0.7) (-1.7, -0.5) (-1.2, -0.4)      07 08 11 09
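The two example fragments overlap by three residues, which is what a sliding window over per-residue data produces. A minimal sketch, assuming one sequence letter, one dihedral pair and one solvation value per residue. The function name, the dictionary layout and the default window length of 4 are assumptions taken from this example only.

```python
def sliding_fragments(sequence, dihedrals, solvation, length=4):
    """Cut overlapping fragments of `length` residues, one starting at every position."""
    fragments = []
    for first in range(len(sequence) - length + 1):
        last = first + length
        fragments.append({
            "sequence": sequence[first:last],
            "structure": dihedrals[first:last],
            "solvation": solvation[first:last],
        })
    return fragments

# Per-residue values read off the example fragments above.
sequence = "ASLTI"
dihedrals = [(-3.1, -1.1), (-2.0, -0.9), (-0.5, -0.7), (-1.7, -0.5), (-1.2, -0.4)]
solvation = [12, 7, 8, 11, 9]

fragments = sliding_fragments(sequence, dihedrals, solvation)
```

Here `fragments[0]` corresponds to A S L T and `fragments[1]` to S L T I.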

Search Method: Bias Fix & Combining Fragments

Bias Fix

Combining Fragments and Probabilities

Results

[Figures: Native and Model structures shown side by side for 1fsv, 2hep, 2k4x, 1agt, 2k53, 2k4n, and 2hf1.]

Future Outlook

• Introduce hydrogen bonds as a probabilistic term
• Hydrogen bond energies are normally distributed
• Use Simple Gaussian model

[Figure: N versus Hydrogen bond energy (kcal/mol).]
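A simple Gaussian term of this kind could be sketched as below: estimate a mean and standard deviation from observed hydrogen bond energies, then score a new energy by its log-density. The function names are hypothetical, and any energies passed in would have to come from real structures. Nothing here reflects fitted values.

```python
import math
import statistics

def fit_gaussian(samples):
    """Mean and sample standard deviation of a list of energies (kcal/mol)."""
    return statistics.mean(samples), statistics.stdev(samples)

def gaussian_log_density(x, mean, std):
    """Log-density of a one-dimensional Gaussian at x."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))
```

Because the term is a log-probability, it can be added directly to the other log-terms of a probabilistic score.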

Summary

• Purely Probabilistic Approach for Protein Structure Prediction
• Score function consists of a set of probability distributions
• Conformation probabilities: mixture of probabilities, no energies at all
• Generates protein/protein-like conformations
• Long-range interactions not well represented
• In future, hydrogen bond term could improve results
• Application to sequence optimization
• Rapid sampling: combine with other score functions

Thanks for your attention!
