geofold: a mechanistic model to study the effect of topology on protein unfolding pathways and...

1
GeoFold: A mechanistic model to study the effect of topology on protein unfolding pathways and kinetics Vibin Ramakrishnan*, Saeed Salem , Saipraveen Srinivasan , Suzanne Matthews , Mohammed J. Zaki , Wilfredo Colón , Christopher Bystroff * CENTER FOR BIOTECHNOLOGY AND INTERDISCIPLINARY STUDIES, DEPTS OF BIOLOGY*, CHEMISTRY & COMPUTER SCIENCE RENSSELAER POLYTECHNIC INSTITUTE, TROY, NEW YORK 12180 Figure 1 .(a) E lemental subsystem for kinetic model. f is a su bset ofthe protein , and is partitioned int o u 1 and u 2 , passin over energy barr ier ‡. (b) Topologi operators. A pivot motion is a rot around a point. A hinge is a rotati around two points. A breakis a translatio Rotations and translati ons mustnot cross chains. Abstract: Protein unfolding is modeled as an ensemble of pathways, where each is a tree of intermediate states and branches in the tree represent additional degrees of conformational freedom. Using a known protein structure, the program (GeoFold) generates a directed acyclic graph of linked elemental subsystems, each modeling a partitioning of a substructure into two at topologically allowed positions in the chain. The graph begins at the native state and ends at small fragments representing the fully unfolded state. Each substructure is assigned a free energy based on its buried solvent accessible surface area, sidechain entropy, and backbone configurational entropy. Each bifurcating edge, representing the transition state of a single partitioning, is assigned a free energy barrier height based on the principle that exposed surfaces are solvated before the configurational entropy is expressed. To simulate unfolding on the graph, rates are calculated for each elemental subsystem at each time step using transition state theory. The model exhibits two-state behavior with respect to temperature or denaturant and shows the expected linear relationship of overall folding rate with denaturant. Predicted unfolding rates are compared with experimental values for fifteen well-studied proteins. Strengths and deficiencies of the model are discussed. Funding from DBI-0448072 PDB code MW Length Class OMP M T L B 1dpc 26259 243 24 α/β C N 1dwk 17263 156 10 α+α/β N N,C N,C 1hxx 37103 340 3 β X 1ino 19611 175 6 α/β Mn N,C C N,C 1isc 21163 192 2 α+α/β Fe C 1jsw 52427 478 4 α N,C,loop 1koj 63102 557 2 α/β N,C,loop 1mpf 37177 340 3 β X 1onr 35139 316 2 α/β N,C C 1pd5 25700 219 3 α/β C N N 1pho 37633 330 3 β X 1s7c 35587 331 4 α/β C N 1sβp 34542 310 1 α/β N C 2ddm 30889 283 2 α/β N,C N Selected seqs from E. coli lysate 2D gel Outer membrane protein Bound metal Termini buried Latch Termini blocked b y oligomer Kinetically stable proteins share structural features. ln(ku) (calculate -40 -30 -20 -10 0 10 20 0 5 10 15 20 25 30 35 40 45 50 Omega 2CAB 1INO 1YAL 1SSX 1QJ8 9PAP 1SPD 1UU8 1RX4 2IZB 1ELP 1NOC 2YYY 1SHF 1C9O 1PBA 1ARR 1UCW No observed kinetic stabili Observed Kinetic stability B D C A Voltage High MW Low MW V oltage High M W Low MW Non-kinetically stable proteins Kinetically stable protein (A) Cell extract SDS-PAGE separation on a narrow gel strip. (B) The gel strip is immersed in boiling 1% SDS. (C) Second dimension SDS-PAGE separation. (D) The gel is stained. The non- kinetically stable proteins migrate equal distances in both dimension. 1D High MW 1D Low MW 2D High MW 2D Low MW Latch Buried termini or barrel less kinetically stable more kinetically stable Both Termini blocked by dimerization monomer Shared latches or intertwined. Terminus blocked by dimerization By manually unfolding 1uuf, we discover that we have few choices! Loop hinges away Latch terminus detaches first helix pivots out First strand of C-terminal domain swings away C-terminal domain begins unravelling N-terminal domain separates from C- terminal domain C-terminal domain falls apart N-terminal latch, now unblocked, swings away. loops unhinge. Strand pivots out. Beta barrel comes apart. Done. 1UUF, order of loss of inter- residue contacts. early late . . . 1. Kinetically stable (KS)proteins can be identified by differential resistance to SDS denaturation 2. KS proteins in E.coli has recurrent structural themes _ High throughput method for isolating all KS proteins in a cell lysate Meinhold, D., Boswell, S., Colon, W., Biochemistry 2005, 44, 14715-14724 Xia, K., Manning, M., Hesham, H., Lin, Q., et al., Proc Nat Acad Sci 2007, 104, 17329-17334 3. GeoFold models folding as a series of splittings of three types: break, pivot and hinge. k qu =− d [f ] dt = γ[ f ] e (−ΔG f q RT ) (a)Elemental subsystem for the kinetic model. f is a subset of the protein, and is partitioned into u 1 and u 2 , passing over energy barrier ‡. (b) Topological operators. A pivot motion is a rotation around a point. A hinge is a rotation around two points. A break is a translation. Rotations and translations must not cross chains. Pivots, hinges and breaks have different entropy values S, because they add different numbers of degrees of freedom to the system. 4. GeoFold generates a kinetic model for a protein . The concentrations of all species except the native structure are set to zero. The change in concentration of f at each time step, for one transitions state (‡, red diamonds) is calculated as 5. UnfoldSim runs a kinetic simulation . 9. Unfolding pathways can be summarized using contact maps _ When the simulation reaches 50% unfolded, we look at the concentration of each pairwise contact, rank the concentrations from small (red) to large (blue). Blue regions of the map show the folding core, while red regions show the early unfolding events (latches, dimer contacts, etc) Unfolding of 1UUF at omega=1.0 0 20 40 60 80 100 120 0.00E+00 2.00E-06 4.00E-06 6.00E-06 8.00E-06 1.00E-05 1.20E-05 1.40E-05 time (s) [Folded] [Unfolded] [Intermediates] GeoFold generates a directed acyclic graph (DAG) linking all topologically possible transitions (red diamonds), each splitting one intermediate substructure (orange circles) into two. After running UnfoldSim (see 5), the pathways that account for most of the unfolding traffic are identified (think lines and large symbols). by the program UnfoldSim. 6. High traffic unfolding pathways may be dissected and visualized (here for 1UUF, alcohol dehydrogenase, a KS protein) 7. Unfolding rates can be measured as k u =ln(2)/t 1/2 _ Following the changes in concentration with time, we sum over all the nodes that represent small segments of the chain, to get [Unfolded]. The time at which [Unfolded] reaches 50% is t 1/2 , the half-life of unfolding. t 1/2 depends on temperature, urea concentration (1/omega in our simulations), and other parameters of the force field, including sidechain entropy, void volume, pivot entropy, hinge entropy, and break entropy. ln(k u ) can be plotted versus omega (solvation free energy, inversely related to [Urea]). We get the expected linear relationship. By extrapolating, we can infer the rates in pure water, which are too slow to simulate. We find the GeoFold rates agree with experimental ones! 8. Simulated k u s agree with experimental k u s for KS and non- KS proteins URL: http://www.bioinfo.rpi.edu/matths3/geofold.html

Post on 21-Dec-2015

220 views

Category:

Documents


3 download

TRANSCRIPT

Page 1: GeoFold: A mechanistic model to study the effect of topology on protein unfolding pathways and kinetics Vibin Ramakrishnan *, Saeed Salem ‡, Saipraveen

GeoFold: A mechanistic model to study the effect of topology on protein unfolding pathways and kineticsVibin Ramakrishnan*, Saeed Salem‡, Saipraveen Srinivasan¶, Suzanne Matthews‡, Mohammed J. Zaki‡, Wilfredo Colón¶, Christopher Bystroff *‡

CENTER FOR BIOTECHNOLOGY AND INTERDISCIPLINARY STUDIES, DEPTS OF BIOLOGY*, CHEMISTRY¶ & COMPUTER SCIENCE‡

RENSSELAER POLYTECHNIC INSTITUTE, TROY, NEW YORK 12180

Figure 1. (a) Elemental subsystem for the kinetic model. f is a su bset of the protein, and is partitioned int o u1 and u2, passing over energy barrier ‡. (b) Topological operators. A pivot motion is a rotation around a point. A hinge is a rotation around two points. A break is a translation. Rotations and translati ons must not cross chains.

Abstract:

Protein unfolding is modeled as an ensemble of pathways, where each is a tree of intermediate states and branches in the tree represent additional degrees of conformational freedom. Using a known protein structure, the program (GeoFold) generates a directed acyclic graph of linked elemental subsystems, each modeling a partitioning of a substructure into two at topologically allowed positions in the chain. The graph begins at the native state and ends at small fragments representing the fully unfolded state. Each substructure is assigned a free energy based on its buried solvent accessible surface area, sidechain entropy, and backbone configurational entropy. Each bifurcating edge, representing the transition state of a single partitioning, is assigned a free energy barrier height based on the principle that exposed surfaces are solvated before the configurational entropy is expressed. To simulate unfolding on the graph, rates are calculated for each elemental subsystem at each time step using transition state theory. The model exhibits two-state behavior with respect to temperature or denaturant and shows the expected linear relationship of overall folding rate with denaturant. Predicted unfolding rates are compared with experimental values for fifteen well-studied proteins. Strengths and deficiencies of the model are discussed.

Funding from

DBI-0448072

PDB code MW Length 4° Class OMP M T L B1dpc 26259 243 24 α/β C N1dwk 17263 156 10 α+α/β N ,N C ,N C1hxx 37103 340 3 β X1ino 19611 175 6 α/β Mn ,N C C ,N C1isc 21163 192 2 α+α/β Fe C1jsw 52427 478 4 α , ,N C loop1koj 63102 557 2 α/β , ,N C loop1mpf 37177 340 3 β X1onr 35139 316 2 α/β ,N C C

1 5pd 25700 219 3 α/β C N N

1pho 37633 330 3 β X1 7s c 35587 331 4 α/β C N1sbp 34542 310 1 α/β N C2ddm 30889 283 2 α/β ,N C N

Selected seqs from E. coli lysate 2D gel

Outer membrane protein

Bound metal

Termini b

uried

Latch Termini b

locked

by oligomerKinetically stable proteins share structural features.

ln(ku) (calculated) vs

-40

-30

-20

-10

0

10

20

0 5 10 15 20 25 30 35 40 45 50

Omega

ln(ku), unfolding rate

2CAB 1INO

1YAL 1SSX

1QJ8 9PAP

1SPD 1UU8

1RX4 2IZB

1ELP 1NOC

2YYY 1SHF

1C9O 1PBA

1ARR 1UCW

No observed kinetic stabilityObserved Kinetic stability

B D

CA

Voltage

High

MW

Low

MW

V oltage

H igh M W

L ow MW

Non-kinetically

stable proteins

Kineticallystable protein

(A) Cell extract SDS-PAGE separation on a narrow gel strip. (B) The gel strip is immersed in boiling 1% SDS. (C) Second dimension SDS-PAGE separation. (D) The gel is stained. The non-kinetically stable proteins migrate equal distances in both dimension.

1D High MW

1D Low MW2D

HighMW

2D LowMW

Latch

Buried termini or barrel

less kinetically stable more kinetically stable

Both Termini blockedby dimerization

monomer

Shared latchesor intertwined.

Terminus blockedby dimerization

By manually unfolding 1uuf, we discover that we have few choices!

Loop hinges away

Latch terminus detachesfirst

helix pivots out

First strand of C-terminal domain swings away

C-terminal domain begins unravelling

N-terminal domain separates from C-terminal domain

C-terminal domain falls apart

N-terminal latch, now unblocked, swings away.

loops unhinge.

Strand pivots out.Beta barrel comes apart.Done.

1UUF, order of loss of inter-residue contacts.

early

late

.

.

.

1. Kinetically stable (KS)proteins can be identified by differential resistance to SDS denaturation

2. KS proteins in E.coli has recurrent structural themes _

High throughput method for isolating all KS proteins in a cell lysate

Meinhold, D., Boswell, S., Colon, W., Biochemistry 2005, 44, 14715-14724

Xia, K., Manning, M., Hesham, H., Lin, Q., et al., Proc Nat Acad Sci 2007, 104, 17329-17334

3. GeoFold models folding as a series of splittings of three types: break, pivot and hinge.

kqu = −d[ f ]

dt= γ [ f ]e(−ΔG f− q

‡ RT )

(a)Elemental subsystem for the kinetic model. f is a subset of the protein, and is partitioned into u1 and u2, passing over energy barrier ‡.

(b) Topological operators. A pivot motion is a rotation around a point. A hinge is a rotation around two points. A break is a translation. Rotations and translations must not cross chains.

Pivots, hinges and breaks have different entropy values S, because they add different numbers of degrees of freedom to the system.

4. GeoFold generates a kinetic model for a protein .

The concentrations of all species except the native structure are set to zero. The change in concentration of f at each time step, for one transitions state (‡, red diamonds) is calculated as

5. UnfoldSim runs a kinetic simulation .

9. Unfolding pathways can be summarized using contact maps _

When the simulation reaches 50% unfolded, we look at the concentration of each pairwise contact, rank the concentrations from small (red) to large (blue). Blue regions of the map show the folding core, while red regions show the early unfolding events (latches, dimer contacts, etc)

Unfolding of 1UUF at omega=1.0

0

20

40

60

80

100

120

0.00E+00 2.00E-06 4.00E-06 6.00E-06 8.00E-06 1.00E-05 1.20E-05 1.40E-05

time (s)

relative concentration (%)

[Folded]

[Unfolded]

[Intermediates]

GeoFold generates a directed acyclic graph (DAG) linking all topologically possible transitions (red diamonds), each splitting one intermediate substructure (orange circles) into two. After running UnfoldSim (see 5), the pathways that account for most of the unfolding traffic are identified (think lines and large symbols).

by the program UnfoldSim.

6. High traffic unfolding pathways may be dissected and visualized (here for 1UUF, alcohol dehydrogenase, a KS protein)

7. Unfolding rates can be measured as ku=ln(2)/t1/2 _

Following the changes in concentration with time, we sum over all the nodes that represent small segments of the chain, to get [Unfolded]. The time at which [Unfolded] reaches 50% is t1/2, the half-life of unfolding. t1/2 depends on temperature, urea concentration (1/omega in our simulations), and other parameters of the force field, including sidechain entropy, void volume, pivot entropy, hinge entropy, and break entropy.

ln(ku) can be plotted versus omega (solvation free energy, inversely related to [Urea]). We get the expected linear relationship. By extrapolating, we can infer the rates in pure water, which are too slow to simulate.

We find the GeoFold rates agree with experimental ones!

8. Simulated kus agree with experimental kus for KS and non-KS proteins URL: http://www.bioinfo.rpi.edu/matths3/geofold.html