absolute binding free energies
...of ligands from umbrella sampling and steered MD simulations;
...applications to ions, small molecules and toxin peptides.
Introduction
Contents of this presentation I
General principles of free energy calculations
Illustrations by simple examples
  Toy models: ions in small water boxes
Theoretical underpinnings
  Ways to affirm and void the fidelity of simulation models to reality
Advanced examples
  Ligands in gramicidin-A channel, K+ channels
Introduction
Contents of this presentation II
Full analysis of an absolute free energy of binding calculation
  Using toxin peptides (my work)
Summary and notes
  Implications of various assumptions
  Warning signs to watch out for
...let’s go!
Introduction
Goal of FE calculations
Accuracy
  The chief motivation for conducting calculations is to predict/replicate experimental values
  We must not lose sight of this
Therefore, it is worthwhile to test and verify our foundations
  Especially for calculations that last over a month
Introduction
Mechanics of FE calculations
Umbrella sampling and steered MD are path-dependent calculations
e.g. the ligand must physically move out of the receptor
A well-constructed simulation can:
  Take information from known reaction mechanisms
  Provide information on how those mechanisms occur
Introduction
Foundation of FE calculations
F = ma
All simulations are models
  Classical approximation of QM phenomena
The system proceeds via forces imposed on atoms
  Energy calculations occur via manipulation of these forces
A small loss in accuracy must be granted
...motivated?
A simple manipulation
Steered molecular dynamics
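$U(r,t) = \tfrac{1}{2} k\,(r - r_0 - vt)^2$, so $F = -k\,(r - r_0 - vt)$, where the spring's equilibrium position is $r_0 + vt$
SMD is a “moving spring”
  Force is imposed on an atom or set of atoms
  Equilibrium position moves to create a path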
Work done by this spring is used to find the local free energy surface.
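A minimal sketch of the moving spring, assuming a one-dimensional coordinate and a restraint centre that travels as r0 + v·t. The function names and the sign convention are illustrative assumptions, not taken from the slides or from any MD package.

```python
def spring_centre(t, r0, v):
    """Position of the moving restraint centre at time t (assumed r0 + v*t)."""
    return r0 + v * t


def smd_energy(r, t, r0, v, k):
    """Harmonic restraint energy U = 1/2 * k * (r - centre)^2."""
    dr = r - spring_centre(t, r0, v)
    return 0.5 * k * dr * dr


def smd_force(r, t, r0, v, k):
    """Force on the steered coordinate, F = -dU/dr = -k * (r - centre)."""
    return -k * (r - spring_centre(t, r0, v))
```

With this convention, an atom lagging behind the spring centre feels a positive force pulling it along the path.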
Steered Molecular Dynamics
A simple model
A small box, containing: ~400 water atoms, one neon atom.
Suppose that we create an artificial gaussian barrier (of ~5 kT) for the Ne atom.
How do I measure the free energy surface of that barrier, using the Ne as a probe?
Steered Molecular Dynamics
Simulation design
Constrain position of Ne in (x,y,z) to a starting location
Define path along water box
Measure the total amount of work done
W = ∫ F(z) dz
W → manipulations → free energy surface
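The work integral W = ∫ F(z) dz can be sketched numerically as follows, assuming the spring force has been recorded at a series of positions z along the path. The trapezoid rule and the function names are illustrative choices, not the method stated in the slides.

```python
def work_along_path(z, force):
    """Trapezoid-rule estimate of W = integral of F(z) dz over the recorded path."""
    if len(z) != len(force):
        raise ValueError("z and force must have the same length")
    total = 0.0
    for i in range(1, len(z)):
        total += 0.5 * (force[i] + force[i - 1]) * (z[i] - z[i - 1])
    return total


def cumulative_work(z, force):
    """Work accumulated up to each recorded position (first entry is 0)."""
    if len(z) != len(force):
        raise ValueError("z and force must have the same length")
    out = [0.0]
    for i in range(1, len(z)):
        step = 0.5 * (force[i] + force[i - 1]) * (z[i] - z[i - 1])
        out.append(out[-1] + step)
    return out
```

The cumulative form gives a work profile W(z) for each trajectory, which is the quantity that is later averaged over trajectories.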
Steered Molecular Dynamics
Work done(?)
Resulting potential of mean force
Includes environmental influences
  i.e. action of water around the atom
NB: notice that PMF does not return to zero readily
Steered Molecular Dynamics
Averaging
A single trajectory does not “sample” the reaction sufficiently
  i.e. a free energy surface requires comprehensive knowledge of the sub-states, that is, of the possible trajectories
Trajectories contain influence from the SMD pulling itself
  Some of the work is directed at displacing the solvent around Ne
  This needs to be accounted for
Steered Molecular Dynamics
Jarzynski’s Equality
∴ Take multiple trajectories
  The average of multiple trajectories should even out random influences on the PMF
$\langle e^{-W/kT} \rangle \approx \frac{1}{n} \sum_{i=1}^{n} e^{-W_i/kT}$
∴ Apply Jarzynski’s Equality (given certain assumptions)
$e^{-\Delta G/kT} = \langle e^{-W/kT} \rangle$
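The exponential (Boltzmann) average over trajectories can be sketched as below. The function name is hypothetical, and factoring out the smallest work value is only a numerical-stability choice made for this sketch. Works and kT must share the same energy unit.

```python
import math


def jarzynski_free_energy(works, kT):
    """Estimate dG from exp(-dG/kT) = <exp(-W/kT)> over n trajectories."""
    n = len(works)
    if n == 0:
        raise ValueError("need at least one work value")
    w_min = min(works)
    # Factor out exp(-w_min/kT) so no exponential can overflow.
    s = sum(math.exp(-(w - w_min) / kT) for w in works)
    return w_min - kT * math.log(s / n)
```

Because the average is exponential, the estimate is dominated by the lowest-work trajectories and never exceeds the arithmetic mean of the work values.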
Steered Molecular Dynamics
Average work
The Boltzmann average of 10 calculations
v = 10 Å ns⁻¹
A very fast calculation, converges because Neon does not interact with environment
...next: theory
SMD Assumptions
Criteria of SMD-based FE calculations
Reaction coordinate must be well chosen
  This coordinate measures all contributions to the real ΔG, and only those contributions.
The simulation must be near-equilibrium
  Jarzynski's Equality holds when no dissipative work is done by the pulling
SMD Assumptions
Reaction coordinates
In SMD, the dimension(s) controlled by the constraining potential is a reaction coordinate
The reaction coordinate can be:
  Distance
  Center of mass (collective variable)
  RMSD to target structure
The path traced through the SMD simulation is a reaction path
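As an illustration of a collective-variable reaction coordinate, the sketch below computes a centre of mass and its separation from a reference group along one axis. The function names, the choice of z as the default axis, and the plain-tuple coordinate format are assumptions made for this example.

```python
def centre_of_mass(coords, masses):
    """Mass-weighted mean position of a set of (x, y, z) coordinates."""
    total = sum(masses)
    return tuple(
        sum(m * c[i] for c, m in zip(coords, masses)) / total for i in range(3)
    )


def axial_separation(lig_coords, lig_masses, ref_coords, ref_masses, axis=2):
    """Ligand COM minus reference COM, projected on one Cartesian axis."""
    lig = centre_of_mass(lig_coords, lig_masses)
    ref = centre_of_mass(ref_coords, ref_masses)
    return lig[axis] - ref[axis]
```

Measuring the ligand against a chosen reference group, rather than the whole receptor, is one way to keep the coordinate tied to the binding site itself.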
SMD Assumptions
Theoretical interpretation
Our systems are isothermal-isobaric (NPT) ensembles: Z(N,P,T)
We wish to measure particular sub-states of the system
  e.g. ligand-bound and ligand-unbound.
Thus, steered molecular dynamics (SMD):
  Constrains the system to certain sub-states according to some coordinate
  Drives the system along this coordinate to new sub-states by application of forces.
SMD Assumptions
Implications on path design
Both the reaction-path and the simulations must sufficiently sample the required sub-space
May not be complete, but must be representative
In a ligand-unbinding process
  Translation must be primary
  Rotation may be secondary
[Figure: phase space between the bound and bulk states]
SMD Assumptions
Implications on path design
Sampling and stability of a system suffer where large barriers must be crossed
Therefore, choose path of least resistance
Simply using the distance between the ligand and the whole receptor may not be a good choice…
SMD Assumptions
Criteria of SMD-based FE calculations
Reaction coordinate must be well chosen
  This coordinate measures all contributions to the real ΔG, and only those contributions.
The simulation must be conducted in a near-equilibrium state
  Jarzynski's Equality holds when no dissipative work is done by the pulling
SMD Assumptions
Theoretical implications
SMD –(Jarzynski's Equality)→ Free energy
JE comes with certain qualifications:
  Dissipative work can be done on the system
  Pulling velocity
  i.e. the system must equilibrate around the SMD perturbation, else the perturbation will also be measured. (We will show this later.)
...next:
Collecting local information
Umbrella Sampling
$U_i(r) = \tfrac{1}{2} k\,(r - r_i)^2$, with window centres $r_i$ placed along the path
US is a static potential
  Force is also imposed on an atom or set of atoms
  Multiple overlapping states are constructed to cover the reaction path.
Each state provides information about the local surface
  Link to derive complete surface
...
Umbrella sampling
Second toy model
A box containing two ions: sodium and chloride
Solution known
  FE surface related to radial distribution function
This was done ab-initio yesterday
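The link between the free energy surface and the radial distribution function is the standard relation W(r) = −kT ln g(r). A minimal sketch, assuming g(r) is supplied as a list of values normalised to 1 in bulk (the function name is hypothetical):

```python
import math


def pmf_from_rdf(g_values, kT):
    """Potential of mean force W(r) = -kT * ln g(r) for each g(r) value."""
    # Bins with g = 0 were never visited; report them as infinitely unfavourable.
    return [-kT * math.log(g) if g > 0 else math.inf for g in g_values]
```

With this normalisation the PMF is zero wherever g(r) = 1, so the bulk region sets the reference level.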
Umbrella sampling
Second toy model
Reaction coordinate
  Na – Cl separation
Umbrella potential
  1 Å apart, 2.5-9.5 Å
  k = 10 kcal mol⁻¹ Å⁻²
Derive original by WHAM analysis
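The window layout for this toy model can be sketched as below, using the spacing, range and force constant quoted on the slide. The function names are illustrative, and the bias is written as an energy, ½ k (r − r_i)², which is an assumption about the intended harmonic form.

```python
def window_centres(start, stop, spacing):
    """Evenly spaced umbrella window centres from start to stop inclusive."""
    n = int(round((stop - start) / spacing)) + 1
    return [start + i * spacing for i in range(n)]


def umbrella_bias(r, centre, k):
    """Harmonic bias energy for one window, U_i(r) = 1/2 * k * (r - r_i)^2."""
    return 0.5 * k * (r - centre) ** 2


# Values from the slide: 2.5-9.5 Angstrom, 1 Angstrom apart, k = 10 kcal/mol/A^2.
centres = window_centres(2.5, 9.5, 1.0)
```

This gives eight windows. Each would be simulated independently, and the biased histograms are then combined by WHAM, which is not shown here.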
Umbrella sampling
PMF convergence
[Figure: phase space between the bound and bulk states]
Umbrella sampling
PMF
[Figure: phase space between the bound and bulk states]
...pretty...
US assumptions
Criteria of US-based FE calculations
Reaction coordinate must be well chosen
  This coordinate measures all contributions to the real ΔG, and only those contributions.
Sufficient sampling over the entire path:
  Convergence of PMF curve means that environmental variables are well sampled.
  Overlap between adjacent windows.
US assumptions
Reaction coordinate
Same arguments as for SMD (underlying physics identical)
Umbrella sampling is capable of treating two/three dimensions
  As long as all dimensions are properly sampled
US assumptions
Criteria of US-based FE calculations
Reaction coordinate must be well chosen
  This coordinate measures all contributions to the real ΔG, and only those contributions.
Sufficient sampling over the entire path:
  Convergence of PMF curve means that environmental variables are well sampled
  Overlap between adjacent windows
US assumptions
Environmental variables
All coordinates not included in your reaction coordinate(s) must be well sampled
This means that simulation has visited dimensions perpendicular to reaction path that may contribute to FEbind
[Figure: phase space between the bound and bulk states]
US assumptions
Window Overlap
Require enough sampling to accurately interpolate between windows
[Figure: phase space between the bound and bulk states]
US assumptions
Window-overlap
Define measure of overlap between two distributions
When underlying surface is flat, harmonic potential produces gaussian distributions
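Theoretical overlap: $\Omega = 1 - \mathrm{erf}\left(d / (\sqrt{8}\,\sigma)\right)$, where d is the window spacing and σ the width of each distribution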
US assumptions
$\Omega = 1 - \mathrm{erf}\left(d / (\sqrt{8}\,\sigma)\right)$
In practice, minimum overlap is ~2%
Overlap should agree with theoretical value when in bulk
k = 20 kcal mol⁻¹ Å⁻², d = 0.5 Å: Ω = 15%
k = 40 kcal mol⁻¹ Å⁻², d = 0.5 Å: Ω = 4%
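The quoted overlaps can be checked with a short sketch. It assumes the erf argument is d/(√8 σ), the closed form for two equal-width Gaussians a distance d apart, with σ = √(kT/k) on a flat surface. The temperature of 300 K is an assumption, since the slides do not state it.

```python
import math

KB = 0.0019872041  # Boltzmann constant in kcal mol^-1 K^-1


def gaussian_overlap(k, d, temperature=300.0):
    """Overlap of two equal-width Gaussians from harmonic windows spaced d apart.

    k is the force constant in kcal mol^-1 A^-2 and d the spacing in Angstrom.
    """
    sigma = math.sqrt(KB * temperature / k)  # width on a flat underlying surface
    return 1.0 - math.erf(d / (math.sqrt(8.0) * sigma))
```

Under these assumptions, `gaussian_overlap(20.0, 0.5)` and `gaussian_overlap(40.0, 0.5)` come out close to the 15% and 4% values on the slide.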
Steered MD –vs– Umbrella Sampling
Summary
Constructing FE surfaces via SMD
  Straightforward construction
  Relies on JE conditions: difficult to achieve in practice
Constructing FE surfaces via US
  Additional checks and balances
Both dependent on sufficient sampling
...take a break?
J Chem. Phys. 128:155104 (2008)
JE/US comparisons
Using several test cases of ions and molecules in channel systems
  Ion transit through membrane
  Ion binding to gramicidin exterior
  Organic-cation binding to gramicidin-A
J Chem. Phys. 128:155104 (2008)
JE/US comparisons
Comparing PMFs
  Tests balanced by equalising the total simulation time
  SMD setup @ v = 5 Å ns⁻¹ ~ US setup
SMD: Also use different pulling velocities to test reversibility of JE
J Chem. Phys. 128:155104(2008)
Ion transit (nanotube)
Results equivalent between two methods
Energy surface not equal at both openings resulting from system setup
JE valid
J Chem. Phys. 128:155104(2008)
Ion transit (gA)
Umbrella sampling, not SMD, gives symmetric surface
Pulling at different velocities does not seem to help
  Can potentially use v < 1 Å ns⁻¹
  But more time consuming than equivalent US setup
J Chem. Phys. 128:155104(2008)
JE: Practical problems?
Equilibration time sharply increases for peptide environments
  Nanotube highly ordered ∴ fast dissipation
Reliance on sampling “negative work” trajectories
  High v: low probability for the environment to push SMD particle
J Chem. Phys. 128:155104(2008)
K+-binding to gA
(next test case:)
gA has a weak ion binding site at the entrances
Smaller potentials
  Perhaps using a smaller k will reduce the perturbations
v = 2.5 Å ns⁻¹
J Chem. Phys. 128:155104(2008)
K+-binding to gA
What about using different force constants?
  No significant help in repairing JE assumptions
Using small k may reduce perturbations on system
  However, binding site shape lost
  k must be greater than binding well ‘potential’
v = 2.5 Å ns⁻¹
k = 2 kcal mol⁻¹ Å⁻²
k = 20 kcal mol⁻¹ Å⁻²
J Chem. Phys. 128:155104(2008)
K+-binding to gA
There are hard limits to varying parameters
J Chem. Phys. 128:155104(2008)
EA and TEA binding
Ethylammonium (EA) and tetra-ethylammonium (TEA) bind weakly to gA
  v = 2.5 Å ns⁻¹
  k = 20 kcal mol⁻¹ Å⁻²
Reducing barrier height produces no difference here
J Chem. Phys. 128:155104(2008)
Toy comparison
If it doesn’t work for small cations, it won’t work for a peptide
  Test for a purported binding of CnErg1 toxin to hERG channel
Since the rate of dissipation to the environment is less than the rate of work done…
  SMD-PMFs essentially measure the work done to move solvent
[Figure: CnErg1]
J Chem. Phys. 128:155104 (2008)
Mechanisms
Input: work done on system
  Carried out by imposed SMD forces (irreversible)
  Contribution from underlying FE surface (reversible)
Output: dissipation to heat-bath (NPT systems)
Equilibration occurs by two means:
  Forces bleed out to atoms far from SMD location
  Temperature coupling to thermostat
JE maintained in O < I conditions (only in nanotube)
J Chem. Phys. 128:155104 (2008)
Mechanisms
The time for equilibration is such that v << 2.5 Å ns⁻¹ is required for the JE condition to hold
This velocity requirement becomes more stringent as:
  ligand size increases
  interactions increase
Not as efficient as umbrella sampling.
...paper finished.
J. Phys. Chem. B 111:11303 (2007)
Organic cations-gA
Block of gA by various small molecular ligands
Energetics dependent on ligand size and partial charges
Comparison between ligands, and with extant experimental data
J. Phys. Chem. B 111:11303 (2007)
Organic cations
Use Autodock3 to find potential binding sites of molecules
MD, umbrella sampling to find PMF and free energy of binding
COM-coordinates of ligands
  10 kcal mol⁻¹ Å⁻²
  0.5 Å ns⁻¹
J. Phys. Chem. B 111:11303 (2007)
Molecule list
Use of six different molecules
Varying sizes and polarity
Determines strength of binding, and whether molecules can permeate through gA
J. Phys. Chem. B 111:11303 (2007)
MA and EA
J. Phys. Chem. B 111:11303 (2007)
FMI and GNI
J. Phys. Chem. B 111:11303 (2007)
TMA and TEA
J. Phys. Chem. B 111:11303 (2007)
Comparison of FEbind
FMI and GNI binding
  Channel lifetimes increase many-fold
  Binding must influence the center of pocket
  A binding site likely exists in the centre of the channel, not in simulation

Molecule  z (Å)       r (Å)      FEbind (kT)  K (M⁻¹) (expt.)
MA        10.7 ± 0.8  0.7 ± 0.4  -1.4         4.1 (4.4)
EA        12.5 ± 0.6  1.4 ± 0.6   1.6         0.2 (~0)
FMI       12.6 ± 0.5  1.5 ± 0.7   0.5         0.6 (23)
GNI       12.8 ± 0.3  2.3 ± 0.5  -2.2         8.9
TMA       13.2 ± 0.6  1.4 ± 0.6   0           1
TEA       14.1 ± 0.5  2.2 ± 0.8   0.9         0.4
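The K column appears consistent with K = exp(−FEbind/kT) at a 1 M standard state. This is an inference from the tabulated numbers rather than something the slide states, and the function name below is hypothetical.

```python
import math


def binding_constant(fe_bind_in_kT):
    """Binding constant in M^-1 from a binding free energy given in units of kT.

    Assumes a 1 M standard state, so that K = exp(-FE / kT).
    """
    return math.exp(-fe_bind_in_kT)
```

For example, the MA entry (−1.4 kT) gives roughly 4.1 M⁻¹ and the EA entry (1.6 kT) roughly 0.2 M⁻¹, matching the table to its quoted precision.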
...wait, how did we get FE?
How did we finally derive FEbind?
From PMF to free energy
Assumption: x-y variations in the PMF are “small” in the region sampled
$K_{eq} = \iiint e^{-W(z)/kT}\,dx\,dy\,dz = \pi R^2 \int e^{-W(z)/kT}\,dz$
$\Delta G_b = -kT \ln(K_{eq} C_o)$
$C_o$ is the standard concentration
What is R?
How did we finally derive FEbind?
Integration Volume
The measured PMF occurs over a certain volume
  This represents the size of the entire binding pocket
  Larger binding pocket => larger effective ∆Gb
This is then standardised to 1 M of ligand
  Equivalent to 1 ligand per 1661 Å³
[Figure: phase space between the bound and bulk states]
How did we finally derive FEbind?
Integration Volume
One dimensional PMF merely hides the fact that binding sites have volume
  Assumption essentially states that the PMF value at some position is the average PMF value of the local slice
R is the radius of these local slices
  This depends on the actual area sampled
[Figure: phase space between the bound and bulk states]
How did we finally derive FEbind?
Integration Volume
In practice, R does not change significantly over the length of the bound area
  Therefore, can pick uniform average R at a minimal loss of accuracy
$e^{-W(z)/kT}$ means that only the sampling around binding site is critical
  The rest of the path connects this to bulk
  Set R from the average area visited by the ligand in the site
[Figure: phase space between the bound and bulk states]
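The steps from PMF to binding free energy can be sketched as follows, assuming a one-dimensional PMF W(z) that is zero in bulk, a uniform radius R, and the standard volume of 1661 Å³ per ligand quoted above. The function name and the trapezoid integration are illustrative choices, not the procedure used in the work itself.

```python
import math

STANDARD_VOLUME = 1661.0  # Angstrom^3 per ligand at 1 M, so C_o = 1 / 1661 A^-3


def binding_free_energy(z, pmf, radius, kT):
    """dG_b = -kT ln(K_eq * C_o), with K_eq = pi R^2 * integral exp(-W(z)/kT) dz.

    z and radius are in Angstrom; pmf and kT share one energy unit.
    The z range should cover the bound region only.
    """
    if len(z) != len(pmf):
        raise ValueError("z and pmf must have the same length")
    integral = 0.0
    for i in range(1, len(z)):
        a = math.exp(-pmf[i - 1] / kT)
        b = math.exp(-pmf[i] / kT)
        integral += 0.5 * (a + b) * (z[i] - z[i - 1])  # trapezoid rule
    k_eq = math.pi * radius ** 2 * integral  # units of Angstrom^3
    return -kT * math.log(k_eq / STANDARD_VOLUME)
```

A flat PMF over a cylinder whose volume equals 1661 Å³ returns zero, which is a quick sanity check on the standard-state term.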
How did we finally derive FEbind?
From PMF to free energy
$\Delta G_b = -kT \ln(K_{eq} C_o)$
Other notes:
These derivations assume a simple two state mechanism
  L + B ⇌ LB
In cases of e.g. cooperative binding, the relationship needs a different derivation
How did we finally derive FEbind?
From PMF to free energy
$\Delta G_b = -kT \ln(K_{eq} C_o)$
Other notes:
We also ignore possibilities of multiple binding sites
Secondary binding pockets within the same site may contribute – this depends on your sampling and reaction paths
...next paper...
Biophys J. (2011)
K+ permeation
PMFs can be used to study permeation processes subject to classical
assumptions
The K+ channel conducts ions at a near diffusion rate
  This implies that only small barriers exist along the path
  How does this occur?
Biophys J. (2011)
K+ permeation
K+ ions occupy the filter at all times in-vivo
  S1/S3, S0/S2/S4
If fewer ions are present, the channel closes
  ∴ conduction must involve concerted movement
Biophys J. (2011)
Setup
Kv1.2 (Shaker)
Reaction coordinate along channel-axis
Define various PMFs for the different conditions that may exist:
  One K+ approaching two ions within filter
  Two ions moving along filter
  Three ions moving along filter
Biophys J. (2011)
One ion movt.
Approach from left
Biophys J. (2011)
One ion movt.
Approach from right
Significant difference between the two occupancy states
Biophys J. (2011)
Barrier-less permeation?
The “barrier-less” transport involves
  Movement of the two ions in the filter
  Filling the hole left behind
  Every pair must be separated by 1 water
Adjacent K+ states are unfavourable
  States like S1/S3/S4 should not exist
We further test the cohesive movement
Biophys J. (2011)
K+ permeation
2-ion movement
S2/S4 -> S1/S3
  Barrier-less
S1/S3 -> S0/S2
  Large barrier?
Biophys J. (2011)
K+ permeation
3-ion movement
S1/S3/‘S5’ -> S0/S2/S4
  also large barrier
Biophys J. (2011)
Barrier-less permeation?
The large barrier cannot physically exist
  But simulation well converged: there must be a problem with the simulation itself
Classical forcefields used here are not polarisable
  Leads to difference between protein behaviour near solvent (S0) and within filter (S1-S4)?
Biophys J. (2011)
Barrier-less permeation?
The large barrier cannot physically exist
Similar problem with permeation
  All due to polarisation of molecules in different media
Stay tuned for next generation forcefields
...take a break?
Binding of Charybdotoxin to KcsA
  Current paper in submission
Various aspects shown above in illustrations
We will now cover the project in detail
  Highlighting the tests that affirm accuracy
  i.e. sampling sufficiency, coordinate choices, control…
US assumptions
Criteria of US-based FE calculations
Reaction coordinate must be well chosen
  This coordinate measures all contributions to the real ΔG, and only those contributions.
Sufficient sampling over the entire path:
  Convergence of PMF curve means that environmental variables are well sampled.
  Overlap between adjacent windows.
Complexity from collective variables:
  Influence of internal coordinates on PMF.
US assumptions
Ligand Complexity
Reaction coordinates work by contracting the entire freedom of the ligand
A ligand molecule has 3N degrees of freedom
  A fraction of these are important
  Maximum of ~3 coordinates is reasonable.
In complex ligands with multiple sites of interaction, must either:
  Select all important coordinates as reaction coordinates, e.g. charge-centers.
  Contract them further, e.g. center-of-mass
    Then must deal with how well reaction coordinate corresponds to the interaction sites
Ligand complexity
Small ligands have dozens of atoms
  Relatively few internal degrees of freedom
  Choosing COM does not introduce complications
Ligand complexity
Proteins contain 100s of atoms with tertiary structure
  Many internal degrees of freedom
  Low-frequency vibrational modes.
  Choosing COM: internal modes may respond to umbrella potential
The binding of charybdotoxin
Background
Scorpion toxins
  Selectively target neuronal ion channels and block conduction.
Specificity can be altered by mutations
  Modified toxins for therapeutic targets
  Potential for:
    therapeutics
    in-vivo studies of channel distribution
The binding of charybdotoxin
Background
Potassium channel targets
  Kv1 family, calcium-activated channels
  Structure difficult to obtain.
Bacterial K+ channel, KcsA, is easier
  Mutate residues and bind scorpion toxins
KcsA-ChTX complex obtained by NMR + previous crystallography
  Park et al. (2005)
The binding of charybdotoxin
MD system setups
Reaction coordinate:
  Toxin backbone center-of-mass.
  Path extends along channel-axis.
Harmonic constraints on residues 3-35 and 17-20
  Prevent unfolding of protein during sampling.
~60,000 atoms
  Match experimental ionic concentration (150 mM)
NPT ensemble
KcsA and membrane lightly constrained
  Prevent unlikely case of drifting
The binding of charybdotoxin
The center-of-mass coordinate
Represents the translational freedoms of the ligand.
Sampling must integrate rotational effects and internal modes
Works for ions and small ligands with single important site of interaction
May not work alone for peptides with multiple sites of interaction
The binding of charybdotoxin
Cartoon example
A ligand is pulled from its binding site to the bulk
However, ligand unfolds over the path of reaction coordinate.
Then the chosen path must include the energy of unfolding.
The binding of charybdotoxin
Prior Work
We’ve carried out a direct umbrella sampling procedure for a scorpion toxin
  COM coordinate
Results in partial unfolding of alpha helix
  Trapped in alternate conformation
The binding of charybdotoxin
Solution
Parts of the protein are restrained in order to prevent unfolding
As a check, calculate the free energy of restraining and unrestraining
The binding of charybdotoxin
Solution
Contribution of thermodynamic cycle needs to be calculated
  (site) 1.43
  (bulk) 1.47
Negligible in this case
Not negligible if restraints applied to functional residues
The binding of charybdotoxin
Spoiler
Parts of the protein are restrained in order to prevent unfolding
A different “path” is sampled when toxin is restrained
The binding of charybdotoxin
Sampling Overlap
Plot overlap between successive windows
  Gaps likely at transitional barriers
  These require additional windows
A junction can introduce errors of up to 0.5 kcal mol⁻¹
The binding of charybdotoxin
Sampling Overlap
The binding of charybdotoxin
Spoiler II
WHAM analysis interpolates data to connect two adjacent windows.
  Gaps in this case introduce ~0.5 kcal mol⁻¹ differences
  Important w.r.t. accuracy
The binding of charybdotoxin
Spoiler II - note
Angular peaks within an otherwise smooth PMF curve
  Location of barrier
  Potential indication of insufficient sampling
PMF convergence
(work computer died)
PMF
[Figure: PMFs. A: non-restrained PMFs; B & C: toxin-restrained PMFs]
The binding of charybdotoxin
path-dependence of PMF
The binding of charybdotoxin
path-dependence of PMF
The binding of charybdotoxin
k-dependence of PMF
The binding of charybdotoxin
k-dependence of PMF
There is no dependence of the binding free energy on k
  The storage of elastic energy in the peptide is conservative
(error is calculated by standard deviations of subsets of sampling data)

Source set   FE of binding
A20          -17.1 ± 0.9
A40          -16.8 ± 2.3
B20           -8.7 ± 1.7
C40           -7.6 ± 1.1
Experiment    -8.3
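The subset-based error estimate can be sketched as below. The slide only states that errors come from standard deviations over subsets of the sampling data, so the contiguous, equal-sized blocks and the `estimator` callback are assumptions. In a real analysis the estimator would rerun the full PMF and integration on each subset; the plain mean stands in for that here.

```python
import statistics


def block_estimates(samples, n_blocks, estimator):
    """Apply estimator to each of n_blocks contiguous, equal-sized subsets."""
    size = len(samples) // n_blocks
    if size == 0:
        raise ValueError("not enough samples for the requested number of blocks")
    return [estimator(samples[i * size:(i + 1) * size]) for i in range(n_blocks)]


def subset_error(samples, n_blocks, estimator=statistics.mean):
    """Mean and standard deviation of the per-subset estimates."""
    if n_blocks < 2:
        raise ValueError("need at least two blocks for a standard deviation")
    estimates = block_estimates(samples, n_blocks, estimator)
    return statistics.mean(estimates), statistics.stdev(estimates)
```

Contiguous blocks are used rather than random subsets so that correlated samples stay together, which tends to give a more honest spread.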
The binding of charybdotoxin
Kinetic analyses
Rotational freedom of ChTX increases in steps
  Significant increase after contact dissociation
Clearly there should be free rotation in bulk
The binding of charybdotoxin
Kinetic analyses
Translational freedom only achieved outside binding pocket
Effective binding site is actually rather small
Transition region with some charge contacts
The binding of charybdotoxin
Kinetic analyses
Although ligand is 100’s of atoms, sampling is not as hard as one might imagine
The binding of charybdotoxin
Kinetic analyses
Binding site is small due to contacts
Multiple charge interactions “lock” the toxin to the pore
  Therefore a very narrow and deep binding pocket
The binding of charybdotoxin
Kinetic analyses
Significant residues identifiable
  R25, K27, R34, Y36
  Not K11 or other charges
Correlates with existing mutational data
Addendum
Take home message
Many types of ligand interactions can be explored
Important caveats exist, but not extraordinarily challenging
Most useful in explaining why a process occurs
  Complementary with experimental data