absolute binding free energies

...of ligands from umbrella sampling and steered MD simulations;

...applications to ions, small molecules and toxin peptides.

Introduction

Contents of this presentation I

General principles of free energy calculations

Illustrations by simple examples: toy models (ions in small water boxes)

Theoretical underpinnings: ways to affirm and void the fidelity of simulation models to reality

Advanced examples: ligands in the gramicidin-A channel, K+ channels

Introduction

Contents of this presentation II

Full analysis of an absolute free energy of binding calculation, using toxin peptides (my work)

Summary and notes: implications of various assumptions; warning signs to watch out for

...let’s go!

Introduction

Goal of FE calculations

Accuracy: the chief motivation for conducting these calculations is to predict or replicate experimental values. We must not lose sight of this.

Therefore, it is worthwhile to test and verify our foundations, especially for calculations that last over a month.

Introduction

Mechanics of FE calculations

Umbrella sampling and steered MD are path-dependent calculations

e.g. the ligand must physically move out of the receptor

A well-constructed simulation can:

Take information from known reaction mechanisms

Provide information on how those mechanisms occur

Introduction

Foundation of FE calculations

F = ma

All simulations are models: a classical approximation of QM phenomena

The system proceeds via forces imposed on atoms

Energy calculations occur via manipulation of these forces

A small loss in accuracy must be granted

...motivated?

A simple manipulation

Steered molecular dynamics

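$U(r, t) = \tfrac{1}{2} k \left[ r - (r_0 + vt) \right]^2$, giving a spring force $F = -k \left[ r - (r_0 + vt) \right]$

SMD is a “moving spring”

Force is imposed on an atom or set of atoms

Equilibrium position moves to create a path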

Work done by this spring is used to find the local free energy surface.

Steered Molecular Dynamics

A simple model

A small box, containing: ~400 water atoms, one neon atom.

Suppose that we create an artificial Gaussian barrier (of ~5 kT) for the Ne atom.

How do I measure the free energy surface of that barrier, using the Ne as a probe?


Simulation design

Constrain position of Ne in (x,y,z) to a starting location

Define path along water box

Measure the total amount of work done

W = ∫ F(z) dz

W → manipulations → free energy surface
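The work integral W = ∫ F(z) dz can be evaluated numerically from a recorded force and position series. The following is a minimal sketch, assuming the spring force and the z position of the pulled atom have been saved at regular intervals; the function name `cumulative_work` and the constant-force test data are illustrative and do not come from any MD package.

```python
import numpy as np

def cumulative_work(z, force):
    """Cumulative work W(z) = integral of F dz, using the trapezoidal rule."""
    z = np.asarray(z, dtype=float)
    force = np.asarray(force, dtype=float)
    dz = np.diff(z)                               # step lengths along the path
    mid_force = 0.5 * (force[1:] + force[:-1])    # average force on each step
    return np.concatenate(([0.0], np.cumsum(mid_force * dz)))

# Synthetic check: a constant force of 2 over a path of length 5 gives W = 10
z = np.linspace(0.0, 5.0, 51)
w = cumulative_work(z, np.full_like(z, 2.0))
print(w[-1])
```

The returned array gives the work accumulated up to each point on the path, which is the quantity that is later averaged over trajectories.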


Work done(?)

Resulting potential of mean force

Includes environmental influences, i.e. the action of water around the atom

NB: notice that PMF does not return to zero readily


Averaging

A single trajectory does not “sample” the reaction sufficiently, i.e. a free energy surface requires comprehensive knowledge of the sub-states, à la possible trajectories

Trajectories contain influence from the SMD pulling itself

Some of the work is directed at displacing the solvent around Ne

This needs to be accounted for


Jarzynski’s Equality

∴ Take multiple trajectories

The average of multiple trajectories should even out random influences on the PMF

$\langle e^{-W/kT} \rangle \approx \frac{1}{n} \sum_{i=1}^{n} e^{-W_i/kT}$

∴ Apply Jarzynski’s Equality (given certain assumptions)

$e^{-\Delta G/kT} = \langle e^{-W/kT} \rangle$
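The exponential average in Jarzynski's Equality can be sketched in a few lines. This is an illustration only: the work values are made up, the energy unit (kcal/mol, kT ≈ 0.593 at about 298 K) is an assumption, and `jarzynski_free_energy` is a hypothetical helper name. The maximum exponent is subtracted before exponentiating so that large work values do not underflow.

```python
import numpy as np

KT = 0.593  # kcal/mol at ~298 K (assumed units and temperature)

def jarzynski_free_energy(work, kT=KT):
    """Estimate dG from exp(-dG/kT) = <exp(-W/kT)> over n trajectories."""
    x = -np.asarray(work, dtype=float) / kT
    x_max = x.max()
    # log of the mean of exp(x), shifted by x_max for numerical stability
    log_mean = x_max + np.log(np.mean(np.exp(x - x_max)))
    return -kT * log_mean

works = [3.1, 2.7, 3.4, 2.9, 3.0]   # hypothetical work values, kcal/mol
dg = jarzynski_free_energy(works)
print(dg, np.mean(works))
```

The estimate never exceeds the plain average of the work, and it is dominated by the lowest-work trajectories, which is why rare “negative work” trajectories matter so much in practice.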


Average work

The Boltzmann average of 10 calculations

v = 10 Å ns⁻¹

A very fast calculation; it converges because neon does not interact with the environment

...next: theory

SMD Assumptions

Criteria of SMD-based FE calculations

Reaction coordinate must be well chosen: this coordinate measures all contributions to the real ΔG, and only those contributions.

The simulation must be near-equilibrium: Jarzynski's Equality holds when no dissipative work is done by the pulling

SMD Assumptions

Reaction coordinates

In SMD, the dimension(s) controlled by the constraining potential form the reaction coordinate

The reaction coordinate can be:

Distance

Center of mass (collective variable)

RMSD to target structure

The path traced through the SMD simulation is a reaction path

SMD Assumptions

Theoretical interpretation

Our systems are isothermal-isobaric (NPT) ensembles: Z(N,P,T)

We wish to measure particular sub-states of the system, e.g. ligand-bound and ligand-unbound.

Thus, steered molecular dynamics (SMD):

Constrains the system to certain sub-states according to some coordinate

Drives the system along this coordinate to new sub-states by application of forces.

SMD Assumptions

Implications on path design

Both the reaction-path and the simulations must sufficiently sample the required sub-space

May not be complete, but must be representative

In a ligand-unbinding process: translation must be primary; rotation may be secondary

[Figure: phase-space diagram with regions labelled Bulk and Bound]

SMD Assumptions

Implications on path design

Sampling and stability of a system suffer where large barriers must be crossed

Therefore, choose the path of least resistance

Simply using the distance between the ligand and the whole receptor may not be a good choice…

SMD Assumptions

Criteria of SMD-based FE calculations


Reaction coordinate must be well chosen: this coordinate measures all contributions to the real ΔG, and only those contributions.

The simulation must be conducted in a near-equilibrium state: Jarzynski's Equality holds when no dissipative work is done by the pulling

SMD Assumptions

Theoretical implications

SMD –(Jarzynski's Equality)→ free energy

JE comes with certain qualifications:

Dissipative work can be done on the system

Pulling velocity matters, i.e. the system must equilibrate around the SMD perturbation, else the perturbation will also be measured. (We will show this later.)

...next:

Collecting local information

Umbrella Sampling

$U_i(r) = \tfrac{1}{2} k (r - r_i)^2$, with window centres $r_i$ placed along the path

US is a static potential

Force is also imposed on an atom or set of atoms

Multiple overlapping states are constructed to cover the reaction path.

Each state provides information about the local surface

Link them to derive the complete surface

...

Umbrella sampling

Second toy model

A box containing two ions: sodium and chloride

Solution known: the FE surface is related to the radial distribution function

This was done ab-initio yesterday
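For reference, the standard relation between a radial distribution function and a PMF is W(r) = −kT ln g(r), which is zero wherever g(r) = 1 (bulk). The sketch below assumes that this is the relation meant here; the g(r) values are made up for illustration, and `pmf_from_rdf` is a hypothetical helper name.

```python
import numpy as np

def pmf_from_rdf(g, kT=0.593):
    """W(r) = -kT ln g(r); bins with g = 0 are assigned an infinite PMF."""
    g = np.asarray(g, dtype=float)
    w = np.full(g.shape, np.inf)
    sampled = g > 0
    w[sampled] = -kT * np.log(g[sampled])
    return w

# Hypothetical g(r): excluded core, depleted region, contact peak, bulk
g = [0.0, 0.5, 2.0, 1.0]
w = pmf_from_rdf(g)
print(w)
```

A peak in g(r) maps to a well in the PMF and a depleted region maps to a barrier, which is what makes this toy model a useful known answer for testing umbrella sampling.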

Umbrella sampling

Second toy model

Reaction coordinate: Na–Cl separation

Umbrella potentials: 1 Å apart over 2.5–9.5 Å, k = 10 kcal mol⁻¹ Å⁻²

Derive the original surface by WHAM analysis
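The window layout for this toy model can be written down directly. This is a sketch that assumes the usual harmonic bias U_i(r) = ½ k (r − r_i)²; the spacing, range and force constant are taken from the slide, while the function name `umbrella_energy` is illustrative.

```python
import numpy as np

K_SPRING = 10.0                      # kcal mol^-1 Å^-2, from the slide
centres = np.linspace(2.5, 9.5, 8)   # window centres r_i, 1 Å apart

def umbrella_energy(r, r_i, k=K_SPRING):
    """Harmonic bias U_i(r) = 1/2 k (r - r_i)^2 for one window (assumed form)."""
    return 0.5 * k * (r - r_i) ** 2

# Bias energy felt by an ion pair at 3.0 Å separation in each window
print(umbrella_energy(3.0, centres))
```

Each window is then simulated independently with its own bias, and WHAM removes the biases when the window histograms are combined.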

Umbrella sampling

PMF convergence


Umbrella sampling

PMF


...pretty...

US assumptions

Criteria of US-based FE calculations


Reaction coordinate must be well chosen: this coordinate measures all contributions to the real ΔG, and only those contributions.

Sufficient sampling over the entire path:

Convergence of PMF curve means that environmental variables are well sampled.

Overlap between adjacent windows.

US assumptions

Reaction coordinate

Same arguments as for SMD (underlying physics identical)

Umbrella sampling is capable of treating two or three dimensions, as long as all dimensions are properly sampled

US assumptions




Convergence of PMF curve means that environmental variables are well sampled

Overlap between adjacent windows

US assumptions

Environmental variables

All coordinates not included in your reaction coordinate(s) must be well sampled

This means that simulation has visited dimensions perpendicular to reaction path that may contribute to FEbind


US assumptions

Window Overlap

Require enough sampling to accurately interpolate between windows


US assumptions

Window-overlap

Define measure of overlap between two distributions

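When the underlying surface is flat, a harmonic potential produces Gaussian distributions

Theoretical overlap: $\Omega = 1 - \mathrm{erf}\left( \frac{d}{\sqrt{8}\,\sigma} \right)$, where d is the spacing between window centres and $\sigma = \sqrt{kT/k}$ is the width of each distribution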

US assumptions

$\Omega = 1 - \mathrm{erf}\left( \frac{d}{\sqrt{8}\,\sigma} \right)$

In practice, minimum overlap is ~2%

Overlap should agree with theoretical value when in bulk

k = 20 kcal mol⁻¹ Å⁻², d = 0.5 Å: Ω = 15%

k = 40 kcal mol⁻¹ Å⁻², d = 0.5 Å: Ω = 4%
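The two quoted overlaps can be reproduced from the overlap of two equal-width Gaussians, Ω = 1 − erf(d / (√8 σ)), with σ = √(kT/k). The sketch below assumes a temperature of 300 K (kT ≈ 0.596 kcal/mol), which is not stated on the slide; `window_overlap` is an illustrative name.

```python
import math

KT = 0.596  # kcal/mol at 300 K (assumed temperature)

def window_overlap(k, d, kT=KT):
    """Overlap of two Gaussians of width sqrt(kT/k) whose centres are d apart."""
    sigma = math.sqrt(kT / k)
    return 1.0 - math.erf(d / (math.sqrt(8.0) * sigma))

for k in (20.0, 40.0):
    print(f"k = {k:.0f}: overlap = {100 * window_overlap(k, 0.5):.1f}%")
```

Under these assumptions the function returns roughly 15% for k = 20 and 4% for k = 40, matching the values above. Doubling k at fixed spacing cuts the overlap sharply, so stiffer windows need closer spacing.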

Steered MD –vs– Umbrella Sampling

Summary

Constructing FE surfaces via SMD: straightforward construction; relies on JE conditions, which are difficult to achieve in practice

Constructing FE surfaces via US: additional checks and balances

Both are dependent on sufficient sampling

...take a break?

J Chem. Phys. 128:155104 (2008)

JE/US comparisons

Using several test cases of ions and molecules in channel systems:

Ion transit through membrane

Ion binding to gramicidin exterior

Organic-cation binding to gramicidin-A

J Chem. Phys. 128:155104 (2008)

JE/US comparisons

Comparing PMFs

Tests balanced by equalising the total simulation time: SMD setup at v = 5 Å ns⁻¹ ~ US setup

SMD: also use different pulling velocities to test reversibility of JE

J Chem. Phys. 128:155104(2008)

Ion transit (nanotube)


Results equivalent between two methods

Energy surface not equal at both openings resulting from system setup

JE valid

J Chem. Phys. 128:155104(2008)

Ion transit (gA)


Umbrella sampling, not SMD, gives symmetric surface

Pulling at different velocities does not seem to help

Can potentially use v < 1 Å ns⁻¹, but this is more time consuming than the equivalent US setup

J Chem. Phys. 128:155104(2008)

JE: Practical problems?

Equilibration time sharply increases for peptide environments

Nanotube is highly ordered ∴ fast dissipation

Reliance on sampling “negative work” trajectories

High v: low probability for the environment to push the SMD particle

J Chem. Phys. 128:155104(2008)

K+-binding to gA

(Next test case:) gA has a weak ion binding site at the entrances

Smaller potentials: perhaps using a smaller k will reduce the perturbations

[Figure label: v = 2.5 Å ns⁻¹]

J Chem. Phys. 128:155104(2008)

K+-binding to gA

What about using different force constants? No significant help in repairing JE assumptions

Using a small k may reduce perturbations on the system

However, the binding site shape is lost

k must be greater than the binding well ‘potential’

[Figure labels: v = 2.5 Å ns⁻¹; k = 2 kcal mol⁻¹ Å⁻² and k = 20 kcal mol⁻¹ Å⁻²]

J Chem. Phys. 128:155104(2008)

K+-binding to gA

There are hard limits to varying parameters


J Chem. Phys. 128:155104(2008)

EA and TEA binding

Ethylammonium (EA) and tetra-ethylammonium (TEA) bind weakly to gA

v = 2.5 Å ns⁻¹, k = 20 kcal mol⁻¹ Å⁻²

Reducing barrier height produces no difference here

J Chem. Phys. 128:155104(2008)

Toy comparison


If it doesn’t work for small cations, it won’t work for a peptide

Test for a purported binding of CnErg1 toxin to the hERG channel

Since the rate of dissipation to the environment is less than the rate of work done, SMD-PMFs essentially measure the work done to move solvent

CnErg1

J Chem. Phys. 128:155104 (2008)

Mechanisms


Input: work done on system

Carried out by imposed SMD forces (irreversible)

Contribution from underlying FE surface (reversible)

Output: dissipation to heat-bath (NPT systems)

Equilibration occurs by two means:

Forces bleed out to atoms far from SMD location

Temperature coupling to thermostat

JE maintained in O < I conditions (only in nanotube)

J Chem. Phys. 128:155104 (2008)

Mechanisms

The time for equilibration is such that v << 2.5 Å ns⁻¹ is required for the JE condition to hold

This velocity requirement becomes more stringent as ligand size increases and as interactions increase

Not as efficient as umbrella sampling.

...paper finished.

J. Phys. Chem. B 111:11303 (2007)

Organic cations-gA

Block of gA by various small-molecule ligands

Energetics dependent on ligand size and partial charges

Comparison between ligands, and with extant experimental data

J. Phys. Chem. B 111:11303 (2007)

Organic cations

Use Autodock3 to find potential binding sites of molecules

MD, umbrella sampling to find PMF and free energy of binding

COM coordinates of ligands

10 kcal mol⁻¹ Å⁻²

0.5 Å ns⁻¹

J. Phys. Chem. B 111:11303 (2007)

Molecule list

Use of six different molecules

Varying sizes and polarity

Determines strength of binding, and whether molecules can permeate through gA

J. Phys. Chem. B 111:11303 (2007)

MA and EA

J. Phys. Chem. B 111:11303 (2007)

FMI and GNI

J. Phys. Chem. B 111:11303 (2007)

TMA and TEA

J. Phys. Chem. B 111:11303 (2007)

Comparison of FEbind

FMI and GNI binding:

Channel lifetimes increase many-fold

Binding must influence the centre of the pocket

A binding site likely exists in the centre of the channel, but not in the simulation

Molecule   z (Å)         r (Å)        FEbind (kT)   K (M⁻¹) (expt.)
MA         10.7 ± 0.8    0.7 ± 0.4    -1.4          4.1 (4.4)
EA         12.5 ± 0.6    1.4 ± 0.6     1.6          0.2 (~0)
FMI        12.6 ± 0.5    1.5 ± 0.7     0.5          0.6 (23)
GNI        12.8 ± 0.3    2.3 ± 0.5    -2.2          8.9
TMA        13.2 ± 0.6    1.4 ± 0.6     0            1
TEA        14.1 ± 0.5    2.2 ± 0.8     0.9          0.4

...wait, how did we get FE?

How did we finally derive FEbind?

From PMF to free energy

Assumption: x-y variations in the PMF are “small” in the region sampled

$K_{eq} = \iiint e^{-W(z)/kT}\,dx\,dy\,dz = \pi R^2 \int e^{-W(z)/kT}\,dz$

$\Delta G_b = -kT \ln (K_{eq} C_0)$

$C_0$ is the standard concentration

What is R?


Integration Volume

The measured PMF occurs over a certain volume

This represents the size of the entire binding pocket

Larger binding pocket => larger effective ∆Gb

This is then standardised to 1 M of ligand

Equivalent to 1 ligand per 1661 Å³
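The conversion from a one-dimensional PMF to a standard binding free energy can be sketched numerically. This assumes the cylindrical approximation Keq = πR² ∫ e^(−W(z)/kT) dz over the bound region and a standard concentration of one ligand per 1661 Å³; the function name, the kT value (kcal/mol near 298 K) and the flat test PMF are illustrative assumptions.

```python
import numpy as np

KT = 0.593    # kcal/mol at ~298 K (assumed units and temperature)
V0 = 1661.0   # Å^3 per ligand at 1 M standard concentration

def binding_free_energy(z, pmf, radius, kT=KT):
    """Return (Keq in Å^3, dG_b) from a 1D PMF W(z) and a slice radius R."""
    z = np.asarray(z, dtype=float)
    boltz = np.exp(-np.asarray(pmf, dtype=float) / kT)
    # trapezoidal integral of exp(-W/kT) along z, in Å
    integral = np.sum(0.5 * (boltz[1:] + boltz[:-1]) * np.diff(z))
    k_eq = np.pi * radius ** 2 * integral
    return k_eq, -kT * np.log(k_eq / V0)

# Sanity check: a flat PMF over a cylinder of volume 1661 Å^3 gives dG_b = 0
z = np.linspace(0.0, V0 / np.pi, 1001)
k_eq, dg = binding_free_energy(z, np.zeros_like(z), radius=1.0)
print(k_eq, dg)
```

The flat-PMF case shows why the integration volume matters: a site with no energetic preference at all still yields a non-zero ∆Gb unless its volume happens to equal the standard volume.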



Integration Volume

A one-dimensional PMF merely hides the fact that binding sites have volume

The assumption essentially states that the PMF value at some position is the average PMF value of the local slice

R is the radius of these local slices

This depends on the actual area sampled



Integration Volume

In practice, R does not change significantly over the length of the bound area

Therefore, one can pick a uniform average R at a minimal loss of accuracy

The $e^{-W(z)/kT}$ weighting means that only the sampling around the binding site is critical

The rest of the path connects this to bulk

Set R from the average area visited by the ligand in the site




$\Delta G_b = -kT \ln (K_{eq} C_0)$

Other notes:

These derivations assume a simple two-state mechanism

L + B ⇌ LB

In cases of e.g. cooperative binding, the relationship needs a different derivation



$\Delta G_b = -kT \ln (K_{eq} C_0)$

Other notes:

We also ignore possibilities of multiple binding sites

Secondary binding pockets within the same site may contribute – this depends on your sampling and reaction paths

...next paper...

Biophys J. (2011)

K+ permeation

PMFs can be used to study permeation processes subject to classical assumptions

The K+ channel conducts ions at a near-diffusion rate

This implies that only small barriers exist along the path

How does this occur?

Biophys J. (2011)

K+ permeation

K+ ions occupy the filter at all times in vivo: S1/S3, S0/S2/S4

If fewer ions are present, the channel closes

∴ conduction must involve concerted movement

Biophys J. (2011)

Setup

Kv1.2 (Shaker)

Reaction coordinate along channel-axis

Define various PMFs for the different conditions that may exist:

One K+ approaching two ions within filter

Two ions moving along filter

Three ions moving along filter

Biophys J. (2011)

One ion movt.

Approach from left

Biophys J. (2011)

One ion movt.

Approach from right

Significant difference between the two occupancy states

Biophys J. (2011)

Barrier-less permeation?

The “barrier-less” transport involves:

Movement of the two ions in the filter

Filling the hole left behind

Every pair must be separated by 1 water

Adjacent K+ states are unfavourable

States like S1/S3/S4 should not exist

We further test the cohesive movement

Biophys J. (2011)

K+ permeation

2-ion movement

S2/S4 -> S1/S3: barrier-less

S1/S3 -> S0/S2: large barrier?

Biophys J. (2011)

K+ permeation

3-ion movement

S1/S3/‘S5’ -> S0/S2/S4: also large barrier

Biophys J. (2011)


The large barrier cannot physically exist

But the simulation is well converged: there must be a problem with the simulation itself

Classical forcefields used here are not polarisable

Leads to difference between protein behaviour near solvent (S0) and within filter (S1-S4)?

Biophys J. (2011)


The large barrier cannot physically exist

Similar problem with permeation

All due to polarisation of molecules in different media

Stay tuned for next generation forcefields

...take a break?

Binding of Charybdotoxin to KcsA

Current paper in submission

Various aspects shown above in illustrations

We will now cover the project in detail, highlighting the tests that affirm accuracy, i.e. sampling sufficiency, coordinate choices, control…

US assumptions




Convergence of PMF curve means that environmental variables are well sampled.

Overlap between adjacent windows.

Complexity from collective variables:

Influence of internal coordinates on PMF.

US assumptions

Ligand Complexity

Reaction coordinates work by contracting the entire freedom of the ligand

A ligand molecule has 3N degrees of freedom

A fraction of these are important

A maximum of ~3 coordinates is reasonable.

In complex ligands with multiple sites of interaction, must either:

Select all important coordinates as reaction coordinates, e.g. charge-centers.

Contract them further, e.g. center-of-mass

Then must deal with how well the reaction coordinate corresponds to the interaction sites

Ligand complexity

Small ligands have dozens of atoms

Relatively few internal degrees of freedom

Choosing COM does not introduce complications

Ligand complexity

Proteins contain 100s of atoms with tertiary structure

Many internal degrees of freedom, low-frequency vibrational modes

Choosing COM: internal modes may respond to umbrella potential

The binding of charybdotoxin

Background

Scorpion toxins: selectively target neuronal ion channels and block conduction.

Specificity can be altered by mutations

Modified toxins for therapeutic targets

Potential for: therapeutics; in vivo studies of channel distribution


Background

Potassium channel targets: Kv1 family, calcium-activated channels

Structure difficult to obtain.

Bacterial K+ channel, KcsA, is easier: mutate residues and bind scorpion toxins

KcsA-ChTX complex obtained by NMR + previous crystallography, Park et al. (2005)


MD system setups

Reaction coordinate: toxin backbone center-of-mass. Path extends along the channel axis.

Harmonic constraints on residues 3-35 and 17-20: prevent unfolding of protein during sampling.

~60,000 atoms

Match experimental ionic concentration (150 mM)

NPT ensemble

KcsA and membrane lightly constrained: prevent unlikely case of drifting


The center-of-mass coordinate

Represents the translational freedoms of the ligand.

Sampling must integrate rotational effects and internal modes

Works for ions and small ligands with single important site of interaction

May not work alone for peptides with multiple sites of interaction


Cartoon example

A ligand is pulled from its binding site to the bulk

However, the ligand unfolds over the path of the reaction coordinate.

Then the chosen path must include the energy of unfolding.


Prior Work

We’ve carried out a direct umbrella sampling procedure for a scorpion toxin

COM coordinate

Results in partial unfolding of alpha helix

Trapped in alternate conformation


Solution

Parts of the protein are restrained in order to prevent unfolding

As a check, calculate the free energy of restraining and unrestraining


Solution

Contribution of thermodynamic cycle needs to be calculated: (site) 1.43, (bulk) 1.47

Negligible in this case

Not negligible if restraints applied to functional residues


Spoiler

Parts of the protein are restrained in order to prevent unfolding

A different “path” is sampled when toxin is restrained


Sampling Overlap

Plot overlap between successive windows

Gaps likely at transitional barriers

These require additional windows

A junction can introduce errors of up to 0.5 kcal mol⁻¹


Sampling Overlap


Spoiler II

WHAM analysis interpolates data to connect two adjacent windows.

Gaps in this case introduce ~0.5 kcal mol⁻¹ differences

Important w.r.t. accuracy


Spoiler II - note

Angular peaks within an otherwise smooth PMF curve

Location of barrier

Potential indication of insufficient sampling

PMF convergence

(work computer died)

PMF

[Figure legend: A: non-restrained PMFs; B & C: toxin-restrained PMFs]


path-dependence of PMF


k-dependence of PMF


k-dependence of PMF

There is no dependence of the binding free energy on k

The storage of elastic energy in the peptide is conservative

(Error is calculated from the standard deviations of subsets of the sampling data.)

Source set    FE of binding
A20           -17.1 ± 0.9
A40           -16.8 ± 2.3
B20           -8.7 ± 1.7
C40           -7.6 ± 1.1
Experiment    -8.3
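An error estimate from subsets of the sampling data can be sketched as follows. This is a generic illustration, not the analysis used for the table: in a real calculation each subset would be passed through the full WHAM and integration steps, whereas here a plain mean over synthetic numbers stands in for that estimator. The names `subset_error` and `n_subsets` are hypothetical.

```python
import numpy as np

def subset_error(samples, estimator, n_subsets=4):
    """Apply the estimator to consecutive subsets; return (mean, std dev)."""
    blocks = np.array_split(np.asarray(samples, dtype=float), n_subsets)
    estimates = np.array([estimator(block) for block in blocks])
    return estimates.mean(), estimates.std(ddof=1)

# Synthetic stand-in data; np.mean replaces the real free energy estimator
rng = np.random.default_rng(0)
data = rng.normal(loc=-8.0, scale=1.0, size=4000)
mean, err = subset_error(data, np.mean)
print(f"{mean:.2f} ± {err:.2f}")
```

Consecutive (rather than shuffled) subsets are used so that slow drifts in the sampling show up as a larger spread between subsets.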


Kinetic analyses

Rotational freedom of ChTX increases in steps

Significant increase after contact dissociation

Clearly there should be free rotation in bulk


Kinetic analyses

Translational freedom only achieved outside binding pocket

Effective binding site is actually rather small

Transition region with some charge contacts


Kinetic analyses

Although ligand is 100’s of atoms, sampling is not as hard as one might imagine


Kinetic analyses

Binding site is small due to contacts

Multiple charge interactions “lock” the toxin to the pore

Therefore a very narrow and deep binding pocket


Kinetic analyses

Significant residues identifiable: R25, K27, R34, Y36; not K11 or other charges

Correlates with existing mutational data

Addendum

Take home message

Many types of ligand interactions can be explored

Important caveats exist, but not extraordinarily challenging

Most useful in explaining why a process occurs

Complementary with experimental data
