the folding network of villin headpiece subdomain

Hongxing Lei, Beijing Institute of Genomics, Chinese Academy of Sciences

Upload: janna-mcclain

Post on 30-Dec-2015


TRANSCRIPT

Page 1: The folding network of villin headpiece subdomain

The folding network of villin headpiece subdomain

Hongxing Lei

Beijing Institute of Genomics

Chinese Academy of Sciences

Page 2: The folding network of villin headpiece subdomain

The Protein Folding Problem

?

Page 3: The folding network of villin headpiece subdomain

The importance of protein folding

Amyloid diseases
– Alzheimer's disease (AD)
– Parkinson's disease (PD)
– Huntington's disease
– Prion diseases
– Amyotrophic lateral sclerosis (ALS)

Protein structure prediction

Protein design

Page 4: The folding network of villin headpiece subdomain

[Diagram labels: unfolded state; formation of microdomains; diffusion and collision of microdomains; formation of a nucleus; collapse; native state]

Page 5: The folding network of villin headpiece subdomain

Folding funnel

Onuchic & Wolynes, COSB 2004, 14:70-75

Page 6: The folding network of villin headpiece subdomain

The challenges in all-atom protein folding

Time scale
– Protein folding: seconds
– Simulation: microseconds
– Gap: 10^6
– Solution: ultrafast-folding proteins / supercomputers

Energetic accuracy
– ΔG_fold (a few kcal/mol, hydrogen bond)
– High accuracy of force field

Page 7: The folding network of villin headpiece subdomain

Ab initio all-atom protein folding

1998: villin headpiece, 36 amino acids, 3+ Å

2002/2003:
– trpcage, 20 amino acids, 1 Å
– Villin headpiece by Folding@Home (3.8 Å)
– Villin headpiece by Shen et al. (3.0 Å)
– BBA5 by Folding@Home (2.2-2.5 Å)

Recently (Scheraga and others)
– A few small proteins, 2.0-4.0 Å

Page 8: The folding network of villin headpiece subdomain

Villin headpiece subdomain (HP35)

Page 9: The folding network of villin headpiece subdomain

Review of previous work

Page 10: The folding network of villin headpiece subdomain

Best folded structure from simulation

Cα RMSD 0.39 Å

Page 11: The folding network of villin headpiece subdomain

Four states from simulation

Page 12: The folding network of villin headpiece subdomain

Thermodynamic properties from simulation

Page 13: The folding network of villin headpiece subdomain

The folding pathway of HP35

Page 14: The folding network of villin headpiece subdomain

Results from 10μs simulations

Page 15: The folding network of villin headpiece subdomain

Folding trajectory #1

Page 16: The folding network of villin headpiece subdomain

Segment folding

Page 17: The folding network of villin headpiece subdomain

Population of native hydrogen bonds

[Figure: hydrogen bond occupancy (%), axis from 0 to 90, for helix I, helix II and helix III]

Page 18: The folding network of villin headpiece subdomain

Folding landscape

[Figure: population map over RMSD of segment A (x axis) and RMSD of segment B (y axis), axis ticks at 4 and 8; color scale from 1.000 to 6.420E4]

Page 19: The folding network of villin headpiece subdomain

Free energy landscape

[Figure panels: 0-2.5 μs, 2.5-5.0 μs, 5.0-7.5 μs, 7.5-10.0 μs]

Page 20: The folding network of villin headpiece subdomain

Top ten clusters

Each entry: RMSD, population share
1. 5.90 Å, 12.42%
2. 6.33 Å, 9.79%
3. 6.13 Å, 5.12%
4. 3.07 Å, 3.47%
5. 1.50 Å, 3.21%
6. 5.87 Å, 3.16%
7. 5.54 Å, 2.68%
8. 5.65 Å, 2.55%
9. 5.85 Å, 2.54%
10. 6.22 Å, 2.34%

Page 21: The folding network of villin headpiece subdomain

Folding network (RMSD)

Page 22: The folding network of villin headpiece subdomain

Folding network (Epot)

Page 23: The folding network of villin headpiece subdomain

Scale free property

[Figure: log10(p(k)) versus log10(k); x axis from 0.0 to 1.6, y axis from -3 to 0; linear fit with R² = 0.786]
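The scale-free test on this slide amounts to a straight-line fit of log10(p(k)) against log10(k), where k is the degree of a node and p(k) the fraction of nodes with that degree. The sketch below is one minimal way to run such a fit; the function name `fit_loglog` and the plain least-squares procedure are assumptions for illustration, not the method used in the talk.

```python
import math

def fit_loglog(degrees):
    """Least-squares line through (log10 k, log10 p(k)).

    `degrees` is a list with one degree value per node (hypothetical input).
    Returns (slope, intercept, r_squared).
    """
    # Count how many nodes have each positive degree k.
    counts = {}
    for k in degrees:
        if k > 0:
            counts[k] = counts.get(k, 0) + 1
    total = sum(counts.values())

    xs = [math.log10(k) for k in counts]
    ys = [math.log10(c / total) for c in counts.values()]

    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))

    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = sxy * sxy / (sxx * syy)
    return slope, intercept, r_squared
```

A degree distribution that follows a power law gives a negative slope and an R² close to 1; an R² below 1, as on the slide, indicates only approximate power-law behavior.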

Page 24: The folding network of villin headpiece subdomain

Hubs

– Degree: 17; RMSD-ALL: 5.98 Å; RMSD-CA: 4.27 Å; RMSD-segment A: 3.96 Å; RMSD-segment B: 1.18 Å; RGYR: 10.80 Å; Population: 1735

– Degree: 45; RMSD-ALL: 7.26 Å; RMSD-CA: 5.90 Å; RMSD-segment A: 5.17 Å; RMSD-segment B: 1.63 Å; RGYR: 9.75 Å; Population: 124243

– Degree: 24; RMSD-ALL: 3.75 Å; RMSD-CA: 1.50 Å; RMSD-segment A: 0.36 Å; RMSD-segment B: 0.59 Å; RGYR: 10.17 Å; Population: 32090
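A hub is a node with many connections. As a rough sketch of how degree and population could be obtained, assume the trajectory has already been reduced to one cluster label per snapshot and that two clusters are linked whenever the trajectory moves directly from one to the other. Both the input format and the edge definition are assumptions; the slides do not state how the network was built.

```python
from collections import defaultdict

def build_network(cluster_ids):
    """Build an undirected transition network from a cluster-label trajectory.

    cluster_ids: sequence of cluster labels, one per snapshot (hypothetical input).
    Returns (neighbors, population): neighbor sets and snapshot counts per cluster.
    """
    neighbors = defaultdict(set)
    population = defaultdict(int)
    for c in cluster_ids:
        population[c] += 1
    # Link clusters that are visited in consecutive snapshots.
    for a, b in zip(cluster_ids, cluster_ids[1:]):
        if a != b:
            neighbors[a].add(b)
            neighbors[b].add(a)
    return neighbors, population

def hubs(neighbors, top=3):
    """Return the `top` clusters with the highest degree as (degree, cluster) pairs."""
    return sorted(((len(v), k) for k, v in neighbors.items()), reverse=True)[:top]
```

Degree and population are separate quantities in this picture, which fits the slide: the hub with degree 17 has a population of only 1735.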

Page 25: The folding network of villin headpiece subdomain

Bottlenecks

– Betweenness: 2.78; RMSD-ALL: 6.24 Å; RMSD-CA: 5.02 Å; RMSD-segment A: 4.40 Å; RMSD-segment B: 1.53 Å; RGYR: 10.86 Å; Population: 550

– Betweenness: 4.11; RMSD-ALL: 6.63 Å; RMSD-CA: 4.03 Å; RMSD-segment A: 4.64 Å; RMSD-segment B: 1.07 Å; RGYR: 11.02 Å; Population: 873

– Betweenness: 2.95; RMSD-ALL: 5.70 Å; RMSD-CA: 4.34 Å; RMSD-segment A: 3.38 Å; RMSD-segment B: 1.34 Å; RGYR: 10.42 Å; Population: 237
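Betweenness measures how often a node lies on shortest paths between other nodes, so a node with high betweenness and a small population acts as a bottleneck. The sketch below uses Brandes' algorithm on an undirected, unweighted graph. The slides do not say how betweenness was scaled or whether edges were weighted, so the unnormalized values here are an assumption and are not expected to reproduce the numbers above.

```python
from collections import deque

def betweenness(neighbors):
    """Unnormalized shortest-path betweenness (Brandes' algorithm).

    neighbors: dict mapping each node to the set of its neighbors; every
    node must appear as a key (hypothetical input format).
    """
    nodes = list(neighbors)
    score = {v: 0.0 for v in nodes}
    for s in nodes:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in nodes}
        sigma = {v: 0 for v in nodes}
        dist = {v: -1 for v in nodes}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in neighbors[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate path dependencies from the farthest nodes back to s.
        delta = {v: 0.0 for v in nodes}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair of endpoints was counted twice.
    return {v: b / 2.0 for v, b in score.items()}
```

On a three-node chain, only the middle node receives a nonzero value, because it sits on the single shortest path between the two ends.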

Page 26: The folding network of villin headpiece subdomain

Folding trajectory #2

Page 27: The folding network of villin headpiece subdomain

Segment folding

Page 28: The folding network of villin headpiece subdomain

Population of native hydrogen bonds

[Figure: hydrogen bond occupancy (%), axis from 0 to 90, for helix III, helix II and helix I]

Page 29: The folding network of villin headpiece subdomain

Folding landscapes

[Figure: population map over RMSD of segment A (x axis) and RMSD of segment B (y axis), axis ticks at 4 and 8; color scale from 1.000 to 2.130E4]

Page 30: The folding network of villin headpiece subdomain

Free energy landscape

[Figure panels: 0-2.5 μs, 2.5-5.0 μs, 5.0-7.5 μs, 7.5-10.0 μs]

Page 31: The folding network of villin headpiece subdomain

Top ten clusters

Each entry: RMSD, population share
1. 3.19 Å, 8.54%
2. 2.31 Å, 7.25%
3. 1.71 Å, 6.15%
4. 3.43 Å, 5.17%
5. 1.10 Å, 3.56%
6. 6.79 Å, 1.94%
7. 7.38 Å, 1.88%
8. 3.31 Å, 1.84%
9. 6.85 Å, 1.50%
10. 3.88 Å, 1.42%

Page 32: The folding network of villin headpiece subdomain

Folding network (RMSD)

Page 33: The folding network of villin headpiece subdomain

Folding network (Epot)

Page 34: The folding network of villin headpiece subdomain

Scale free property

[Figure: log(p(k)) versus log(k); x axis from 0.0 to 1.4, y axis from -3 to 0; linear fit with R² = 0.723]

Page 35: The folding network of villin headpiece subdomain

Hubs

– Degree: 36; RMSD-ALL: 3.73 Å; RMSD-CA: 1.71 Å; RMSD-segment A: 0.63 Å; RMSD-segment B: 0.69 Å; RGYR: 10.05 Å; Population: 61485

– Degree: 31; RMSD-ALL: 5.99 Å; RMSD-CA: 3.92 Å; RMSD-segment A: 4.13 Å; RMSD-segment B: 0.97 Å; RGYR: 11.50 Å; Population: 2689

– Degree: 30; RMSD-ALL: 6.83 Å; RMSD-CA: 5.83 Å; RMSD-segment A: 4.88 Å; RMSD-segment B: 1.65 Å; RGYR: 9.93 Å; Population: 5991

– Degree: 22; RMSD-ALL: 6.75 Å; RMSD-CA: 5.13 Å; RMSD-segment A: 5.04 Å; RMSD-segment B: 0.61 Å; RGYR: 12.30 Å; Population: 2854

Page 36: The folding network of villin headpiece subdomain

Bottlenecks

– Betweenness: 2.46; RMSD-ALL: 7.23 Å; RMSD-CA: 5.80 Å; RMSD-segment A: 5.17 Å; RMSD-segment B: 0.82 Å; RGYR: 10.63 Å; Population: 392

– Betweenness: 2.27; RMSD-ALL: 6.22 Å; RMSD-CA: 4.50 Å; RMSD-segment A: 4.84 Å; RMSD-segment B: 1.82 Å; RGYR: 10.97 Å; Population: 890

– Betweenness: 2.48; RMSD-ALL: 6.62 Å; RMSD-CA: 4.93 Å; RMSD-segment A: 4.50 Å; RMSD-segment B: 1.13 Å; RGYR: 11.43 Å; Population: 260

Page 37: The folding network of villin headpiece subdomain

A SCORING FUNCTION FOR STRUCTURE PREDICTION

Page 38: The folding network of villin headpiece subdomain

SCORING FUNCTIONS

Knowledge-based functions (well compacted; surface area; contact order)

Physics-based functions (free energy; potential energy; hydrogen bond energy; VDW energy)

Page 39: The folding network of villin headpiece subdomain

OUR SCORING FUNCTION

$F(E) = E_{SE} + a \cdot E_{FF} + b \cdot E_{HB}$

$E_{SE}$ = the statistical energy

$E_{FF}$ = the force field physical energy with GB solvation model

$E_{HB}$ = the main chain hydrogen bonding energy

$a$ = the coefficient of the force field physical energy term

$b$ = the coefficient of the main chain hydrogen bonding energy term
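The combined score is a weighted sum of three energy terms, so applying it is a one-line calculation once the terms are available. The sketch below assumes the three energies have already been computed for each decoy and that a lower score is better. The function names are hypothetical, and the default coefficients are the grid-search values reported later in the talk (a = 0.12, b = 0.06).

```python
def combined_score(e_se, e_ff, e_hb, a=0.12, b=0.06):
    """F(E) = E_SE + a * E_FF + b * E_HB for a single decoy."""
    return e_se + a * e_ff + b * e_hb

def rank_decoys(energies, a=0.12, b=0.06):
    """Return decoy indices sorted from best (lowest score) to worst.

    energies: list of (e_se, e_ff, e_hb) tuples, one per decoy (hypothetical input).
    Assumes lower F(E) means a better decoy.
    """
    scores = [combined_score(e_se, e_ff, e_hb, a, b) for e_se, e_ff, e_hb in energies]
    return sorted(range(len(scores)), key=lambda i: scores[i])
```

For example, `rank_decoys(energies)[:10]` would give the top-10 list for one decoy set under these assumptions.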

Page 40: The folding network of villin headpiece subdomain

DECOY SETS

http://depts.washington.edu/bakerpg/decoys/

1. a wide variety of different proteins;

2. close to the native structure;

3. produced by a relatively unbiased procedure

Page 41: The folding network of villin headpiece subdomain

Decoy sets

Training sets (14 × 100)

Testing sets (13 × 100)
– Group a: contain 3-11 acceptable decoys
– Group b: contain at least 93 acceptable decoys

Acceptable decoys: RMSD < 5 Å

Total: 534, 38.14%
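The 38.14% figure is consistent with 534 acceptable decoys out of the 14 × 100 training decoys. A minimal sketch of the counting rule follows, assuming each decoy comes with an RMSD to the native structure; the function name and input format are hypothetical.

```python
def acceptable_fraction(rmsds, cutoff=5.0):
    """Count decoys with RMSD strictly below the cutoff (in Å).

    rmsds: list of RMSD values, one per decoy (hypothetical input).
    Returns (number_acceptable, fraction_acceptable).
    """
    n_ok = sum(1 for r in rmsds if r < cutoff)
    return n_ok, n_ok / len(rmsds)

# Check of the arithmetic on the slide: 534 of 14 * 100 decoys.
training_percentage = round(100 * 534 / (14 * 100), 2)
```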

Page 42: The folding network of villin headpiece subdomain

Decoy sets

Page 43: The folding network of villin headpiece subdomain

$F(E) = E_{SE} + a \cdot E_{FF} + b \cdot E_{HB}$

Scoring method   CC_ave with RMSD (SD)   CC_ave with TM-score (SD)   Number
DFIRE            0.473 (0.312)           -0.451 (0.261)              98
RAPDF            0.497 (0.203)           -0.478 (0.173)              95
DOPE             0.520 (0.214)           -0.442 (0.243)              93
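CC_ave reads as a correlation coefficient averaged over the decoy sets, with its standard deviation in parentheses. The sketch below computes a Pearson correlation between score and RMSD within each set and then averages across sets. The choice of Pearson correlation and of the population standard deviation are assumptions, since the slides do not specify either.

```python
import math
import statistics

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def average_cc(decoy_sets):
    """Mean and SD of the per-set correlation between score and RMSD.

    decoy_sets: list of (scores, rmsds) pairs, one per protein (hypothetical input).
    """
    ccs = [pearson(scores, rmsds) for scores, rmsds in decoy_sets]
    return statistics.mean(ccs), statistics.pstdev(ccs)
```

A positive correlation with RMSD means that lower scores go with lower RMSD. The negative sign in the TM-score column is expected for the same reason, because a higher TM-score indicates a structure closer to the native one.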

Page 44: The folding network of villin headpiece subdomain

$F(E) = E_{SE} + a \cdot E_{FF} + b \cdot E_{HB}$

Page 45: The folding network of villin headpiece subdomain

$F(E) = E_{SE} + a \cdot E_{FF} + b \cdot E_{HB}$

$E_{FF}$ = the force field physical energy with GB solvation model

Two protocols:
– minimization only;
– minimization, then a 40 ps molecular dynamics run followed by another minimization.

(The results from both protocols are very similar; therefore the less time-consuming protocol was adopted.)

Page 46: The folding network of villin headpiece subdomain

$F(E) = E_{SE} + a \cdot E_{FF} + b \cdot E_{HB}$

Scoring method   CC_ave with RMSD (SD)   CC_ave with TM-score (SD)   Number
AMBER99          0.196 (0.204)           -0.216 (0.243)              77
OPLS-aa          0.211 (0.241)           -0.224 (0.271)              79
CHARMM27         0.014 (0.216)           -0.015 (0.198)              58

Page 47: The folding network of villin headpiece subdomain

Various force fields in Tinker

Page 48: The folding network of villin headpiece subdomain

$F(E) = E_{SE} + a \cdot E_{FF} + b \cdot E_{HB}$

Scoring method   CC_ave with RMSD (SD)   CC_ave with TM-score (SD)   Number
AMBER03          0.313 (0.223)           -0.331 (0.232)              97
AMBER99          0.254 (0.162)           -0.272 (0.146)              86
AMBER99SB        0.342 (0.162)           -0.353 (0.152)              96
AMBER96          0.293 (0.136)           -0.325 (0.157)              90
AMBER94          0.242 (0.227)           -0.261 (0.206)              82

Page 49: The folding network of villin headpiece subdomain

AMBER force fields

Page 50: The folding network of villin headpiece subdomain

$F(E) = E_{SE} + a \cdot E_{FF} + b \cdot E_{HB}$

Scoring method   CC_ave with RMSD (SD)   CC_ave with TM-score (SD)   Number
DSSP             0.019 (0.328)           -0.007 (0.284)              58
ROSETTA          -0.186 (0.432)          0.103 (0.376)               34

Page 51: The folding network of villin headpiece subdomain

Hydrogen bonding energy

Page 52: The folding network of villin headpiece subdomain

Parameters from grid search

The grid search maximized the total number of acceptable decoys among the top-10 lists.

Both "a" and "b" ranged from 0 to 0.5.

The maximum number of acceptable decoys was 112 out of the 140 selections (14 × 10).

The corresponding parameters are a = 0.12 and b = 0.06.

The resulting 80% share of acceptable decoys is well above the 38.1% in the whole training sets.
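The search described above can be sketched as a double loop over a and b that counts, for every decoy set, how many of the ten best-scoring decoys are acceptable. The step size of 0.01, the input format and the rule that a lower score is better are assumptions; the slides give only the range and the resulting parameters.

```python
def grid_search(decoy_sets, step=0.01, upper=0.5, top=10, cutoff=5.0):
    """Find (a, b) maximizing the number of acceptable decoys in the top lists.

    decoy_sets: list of decoy lists; each decoy is a tuple
        (e_se, e_ff, e_hb, rmsd) (hypothetical input format).
    Returns (best_hits, best_a, best_b); ties keep the first pair found.
    """
    n_steps = int(round(upper / step))
    best = (-1, 0.0, 0.0)
    for i in range(n_steps + 1):
        for j in range(n_steps + 1):
            a, b = i * step, j * step
            hits = 0
            for decoys in decoy_sets:
                # Rank by F(E) = E_SE + a * E_FF + b * E_HB, lowest first.
                ranked = sorted(decoys, key=lambda d: d[0] + a * d[1] + b * d[2])
                hits += sum(1 for d in ranked[:top] if d[3] < cutoff)
            if hits > best[0]:
                best = (hits, a, b)
    return best
```

With 14 training sets and `top=10`, the largest possible value of `best_hits` is 140, which matches the denominator quoted on the slide.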

Page 53: The folding network of villin headpiece subdomain

Comparison with Rosetta energy

Scoring method   CC_ave with RMSD (SD)   CC_ave with TM-score (SD)   Number
F(E)             0.538 (0.223)           -0.476 (0.248)              112
ROSETTA          0.399 (0.293)           -0.391 (0.321)              95

Page 54: The folding network of villin headpiece subdomain

Comparison with Rosetta energy

Page 55: The folding network of villin headpiece subdomain

Performance on the training set

Page 56: The folding network of villin headpiece subdomain

Performance on the training set

[Figure: score (kcal/mol) versus RMSD (Å)]

Page 57: The folding network of villin headpiece subdomain

Performance on the testing set

[Figure: score (kcal/mol) versus RMSD (Å)]

Page 58: The folding network of villin headpiece subdomain

Performance on the testing set

[Figure: score (kcal/mol) versus RMSD (Å)]

Page 59: The folding network of villin headpiece subdomain

Acknowledgements