kasson gromacs workshop tutorial 2013.pdf

How long do I need to run my simulations? An exercise in thinking about rates and stability

Peter Kasson
Departments of Molecular Physiology and Biological Physics and Biomedical Engineering
University of Virginia

Theory and examples from Kasson et al., JACS 2009

Scenarios

You have a snazzy new simulation. Say you want to assess the stability of a molecular complex, of a protein-ligand interaction, or of a particular protein conformation.

Situation 1: You do controls. Excellent idea. After 1 microsecond of simulation, your known-unstable mutant has one dissociation event and your wild-type has none. What does this mean?

Situation 2: You don't have good controls. Life is hard. You have a model for the bound complex of the never-before-seen kickbutt receptor. You do 1 microsecond of simulation and it sits there like a rock. Do you submit to Science?

Fundamental process at play

• All of these are examples of some A-B transition

• If we can treat this transition as two-state, we can use single-event observations (simulations or otherwise) to learn a lot about rates and stability.

Some nit-picky details

• Any given point in phase space, which we will call a microstate (set of positions and velocities) b in B, decorrelates with time tau such that P(x(t) is in D | x(0) = b) ~ P(y(t) is in D | y(0) = c) for any molecular trajectories x,y and microstates b,c in B if t > tau. [There is a slight simplification encoded in the "~".]

• So now if we consider molecular trajectories starting in B and potentially transitioning to D, using one trajectory within B to estimate rates B->D is equivalent to using multiple trajectories within B if the length of each trajectory > tau.

Basic probability rules for a single-exponential process
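
For reference, the standard relations for a single-exponential (first-order, two-state) process can be written as below. The symbols k (rate constant), t (event time), and T (observation time) are notation chosen here; the equations on the original slide may have been laid out differently.

```latex
\begin{align}
  p(t \mid k) &= k\, e^{-k t}
    && \text{probability density of an event at time } t \\
  P(\text{event before } T \mid k) &= 1 - e^{-k T} \\
  P(\text{no event before } T \mid k) &= e^{-k T}
\end{align}
```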

So what do we do with this?

• We encapsulate a set of simulation data D = (N,n,{ti,d},{Ti}) as a group of N simulations, n of which lead to dissociation with times of dissociation {t1..tn}, and N-n of which do not dissociate over the observation time intervals {T1 .. TN-n}. Then,
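
Under the single-exponential assumption, the likelihood of the data presumably takes the following form. This is a reconstruction from the definitions in the bullet above, not a copy of the slide's equation: each dissociating simulation contributes a density k e^{-k t_i}, and each non-dissociating simulation contributes a survival probability e^{-k T_j}.

```latex
\begin{equation}
  P(D \mid k)
  = \prod_{i=1}^{n} k\, e^{-k t_i} \prod_{j=1}^{N-n} e^{-k T_j}
  = k^{n} \exp\!\left[-k \left( \sum_{i=1}^{n} t_i + \sum_{j=1}^{N-n} T_j \right)\right]
\end{equation}
```

Only the number of events n and the total observed time enter the likelihood, which is why one long trajectory and several shorter ones (each longer than tau) carry the same information.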

Now the cool part...

• Using a uniform prior, we obtain
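
As a sketch of where a uniform prior leads, assuming the single-exponential likelihood k^n exp(-k T_tot) with T_tot the total observed time (the name T_tot is introduced here for convenience), normalizing over k from 0 to infinity gives a Gamma distribution with shape n+1 and rate T_tot:

```latex
\begin{equation}
  P(k \mid D) = \frac{T_{\mathrm{tot}}^{\,n+1}}{n!}\, k^{n}\, e^{-k T_{\mathrm{tot}}},
  \qquad
  T_{\mathrm{tot}} = \sum_{i=1}^{n} t_i + \sum_{j=1}^{N-n} T_j
\end{equation}
```

The most probable rate is n / T_tot and the posterior mean is (n+1) / T_tot.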

So what does this mean in practice?

Given a set of observations (protein unfolds or fails to unfold, ligand dissociates or stays), we can compute the likelihood of a rate for this process.

Example

We look at oligosaccharide dissociation from influenza hemagglutinin. We measure the distance between the sialic acid and the binding pocket.

Scoring by visual inspection

Example: Ligand dissociation from hemagglutinin

• We define ligand dissociation as when a sialic acid moves 1 nm from the closest binding pocket atom. In practice, this usually indicates full and irreversible dissociation.

• Simple example: 1 x 80-ns simulation, 3 ligands. 1 ligand dissociates after 68 ns, 2 ligands remain bound for 80 ns. What does this tell us?
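
One way to work through this example is sketched below. It assumes the three ligands behave as independent observations of the same single-exponential process and that a uniform prior is used, so the posterior over the rate is a Gamma distribution; the variable names are illustrative and are not taken from rate_estimate.py.

```python
from scipy.stats import gamma

# Data from the example: one ligand dissociates at 68 ns,
# two ligands remain bound for the full 80 ns.
dissociation_times_ns = [68.0]
censored_times_ns = [80.0, 80.0]

n_events = len(dissociation_times_ns)
total_time_ns = sum(dissociation_times_ns) + sum(censored_times_ns)  # 228 ns

# Assumption: uniform prior on k, so the posterior is
# Gamma(shape = n + 1, rate = total observed time).
posterior = gamma(a=n_events + 1, scale=1.0 / total_time_ns)

k_mode = n_events / total_time_ns              # most probable rate (1/ns)
k_mean = posterior.mean()                      # (n + 1) / total time (1/ns)
k_low, k_high = posterior.ppf([0.025, 0.975])  # central 95% credible interval

print(f"total observed time: {total_time_ns:.0f} ns")
print(f"most probable rate:  {k_mode:.4g} per ns")
print(f"posterior mean rate: {k_mean:.4g} per ns")
print(f"95% interval:        {k_low:.4g} to {k_high:.4g} per ns")
```

With a single event the interval is wide, which is the main point of the exercise: one dissociation pins down the order of magnitude of the rate and little more.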

Practice!

1. Download rate_estimate.py

2. Start python (with scipy, eventually gflags)

3. import rate_estimate (the import statement takes the module name without the .py extension)

4. Play...

5. What is the probability that the rate is faster than 1/ns? What about 1/us?
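
If you want to check your answer to step 5 independently of rate_estimate.py, the following sketch uses scipy directly. It assumes a uniform prior (Gamma posterior) and uses the hemagglutinin example data (1 event, 228 ns of total observed time); the helper name prob_rate_faster_than is hypothetical and is not part of rate_estimate.py.

```python
from scipy.stats import gamma


def prob_rate_faster_than(k0, n_events, total_time):
    """P(k > k0 | data) for a Gamma(n_events + 1, rate=total_time) posterior.

    k0 and total_time must use reciprocal units (here 1/ns and ns).
    """
    return gamma.sf(k0, a=n_events + 1, scale=1.0 / total_time)


# Example data: 1 dissociation event in 228 ns of total observation.
p_faster_than_per_ns = prob_rate_faster_than(1.0, n_events=1, total_time=228.0)
p_faster_than_per_us = prob_rate_faster_than(1.0e-3, n_events=1, total_time=228.0)

print(f"P(k > 1/ns) = {p_faster_than_per_ns:.3g}")
print(f"P(k > 1/us) = {p_faster_than_per_us:.3g}")
```

For these data, a rate faster than 1/ns is essentially ruled out, while a rate faster than 1/us is very likely.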

Let’s go back to protein stability

• If you’re running 100-ns simulations, how many simulations would you run to be 95% sure that the protein is stable on a 100-ns timescale?

• What about a microsecond timescale?
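
A minimal sketch of one way to answer these questions follows. It assumes that "stable on a timescale tau" means the rate k is below 1/tau, that no unfolding events are observed in any simulation, and that a uniform prior is used. Under those assumptions the posterior is exponential in k, and P(k < 1/tau) = 1 - exp(-N T / tau) for N simulations of length T. The function name is illustrative.

```python
import math


def simulations_needed(sim_length, timescale, confidence=0.95):
    """Smallest number of event-free simulations of length sim_length such that
    P(k < 1/timescale | data) >= confidence, assuming a uniform prior on k.

    With zero events the posterior is exponential with rate N * sim_length, so
    P(k < 1/timescale) = 1 - exp(-N * sim_length / timescale).
    """
    return math.ceil(-math.log(1.0 - confidence) * timescale / sim_length)


n_for_100ns = simulations_needed(sim_length=100.0, timescale=100.0)
n_for_1us = simulations_needed(sim_length=100.0, timescale=1000.0)

print(f"100-ns timescale: {n_for_100ns} simulations")
print(f"1-us timescale:   {n_for_1us} simulations")
```

Under these assumptions the required total simulation time is about three times the timescale of interest (ln 20 is roughly 3), and every observed unfolding event raises that requirement.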

Comparing rates

So now you have two mutants of a protein. Which is more stable? How confident are you in this?

The probability of a difference delta = k1 - k2

Lovely equation...

We can use the same sequence of times

• T1 = {T1i}: observation times of simulations of system 1 that do not dissociate; t1 = {t1i}: times of simulations of system 1 that do dissociate, after time t1i

• T2 = {T2i}: observation times of simulations of system 2 that do not dissociate; t2 = {t2i}: times of simulations of system 2 that do dissociate, after time t2i
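
The closed-form expression for P(delta) is not reproduced here. As a stand-in, the sketch below estimates the probability that k1 exceeds k2 by Monte Carlo sampling from the two posteriors. It assumes uniform priors (Gamma posteriors) and uses made-up dissociation data purely for illustration; this is a substitute technique, not the analytical formula from the slide.

```python
import numpy as np


def sample_rate_posterior(event_times, censored_times, size, rng):
    """Draw rates from the Gamma(n + 1, rate=total time) posterior
    that follows from a uniform prior on k."""
    n_events = len(event_times)
    total_time = float(np.sum(event_times) + np.sum(censored_times))
    return rng.gamma(shape=n_events + 1, scale=1.0 / total_time, size=size)


rng = np.random.default_rng(0)

# Hypothetical data (ns): system 1 dissociates three times, system 2 never does.
t1, T1 = [30.0, 55.0, 68.0], [80.0]
t2, T2 = [], [80.0, 80.0, 80.0, 80.0]

k1 = sample_rate_posterior(t1, T1, size=100_000, rng=rng)
k2 = sample_rate_posterior(t2, T2, size=100_000, rng=rng)

delta = k1 - k2
p_k1_faster = float(np.mean(delta > 0))

print(f"P(k1 > k2) is approximately {p_k1_faster:.3f}")
```

A histogram of delta gives the full distribution of the rate difference rather than just its sign.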

Given a series of observations (many simulations), we can calculate this

[Figure legend: blue lines = probabilities; red triangles = confidence intervals]

Confidence intervals? You didn’t say anything about that!

As always, when counting small event numbers, error estimates are essential!

Big nasty equation...how do I do error estimates?

Bootstrap is your friend!
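
A minimal bootstrap sketch is given below, assuming each simulation is an independent observation described by its length and whether it ended in dissociation. Simulations are resampled with replacement, the most probable rate (events divided by total time) is recomputed for each resample, and percentiles of the resampled estimates give the interval. The data and function names are hypothetical.

```python
import numpy as np


def most_probable_rate(times, dissociated):
    """Posterior-mode rate under a uniform prior: events / total observed time."""
    return dissociated.sum() / times.sum()


def bootstrap_rate_interval(times, dissociated, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the rate, resampling whole simulations."""
    rng = np.random.default_rng(seed)
    times = np.asarray(times, dtype=float)
    dissociated = np.asarray(dissociated, dtype=bool)
    n_sims = len(times)
    estimates = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n_sims, size=n_sims)  # resample with replacement
        estimates[b] = most_probable_rate(times[idx], dissociated[idx])
    tail = 100.0 * (1.0 - level) / 2.0
    return np.percentile(estimates, [tail, 100.0 - tail])


# Hypothetical data: ten simulations, three of which dissociate (times in ns).
times_ns = [68, 80, 80, 45, 80, 80, 72, 80, 80, 80]
dissociated = [True, False, False, True, False, False, True, False, False, False]

point = most_probable_rate(np.asarray(times_ns, dtype=float), np.asarray(dissociated))
low, high = bootstrap_rate_interval(times_ns, dissociated)
print(f"rate = {point:.4g} per ns, 95% bootstrap interval {low:.4g} to {high:.4g}")
```

The same resampling loop works for any derived quantity, such as the probability that one mutant dissociates faster than another: recompute that quantity inside the loop instead of the rate.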

Other cool tricks...computing event sensitivity

Expected number of dissociation events detected. The expected number of dissociation events detected in our simulations is plotted as a function of the ΔΔG‡ of a given mutant. The plot in blue assumes a wild-type koff of 84 s^-1, and the plot in red assumes a koff of 10^-4 s^-1.
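
A rough sketch of how such a sensitivity curve could be computed follows. Every modeling choice here is an assumption rather than the procedure used for the figure: the mutant rate is scaled as k_mut = k_wt * exp(-ΔΔG‡ / RT), where ΔΔG‡ is the change in activation free energy relative to wild type (negative values lower the barrier and speed up dissociation); the temperature is 300 K; and the number and length of simulations are placeholders.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol K)


def expected_events(ddg_kcal_per_mol, k_wt_per_s, n_sims, sim_length_s,
                    temperature_k=300.0):
    """Expected number of simulations showing a dissociation event.

    Assumes k_mut = k_wt * exp(-ddG / RT) and single-exponential kinetics,
    so each simulation dissociates with probability 1 - exp(-k_mut * T).
    """
    k_mut = k_wt_per_s * math.exp(
        -ddg_kcal_per_mol / (R_KCAL_PER_MOL_K * temperature_k))
    return n_sims * -math.expm1(-k_mut * sim_length_s)


# Placeholder setup: 100 simulations of 100 ns each, wild-type koff = 84 per second.
for ddg in (0.0, -5.0, -10.0):
    events = expected_events(ddg, k_wt_per_s=84.0, n_sims=100, sim_length_s=100e-9)
    print(f"ddG = {ddg:6.1f} kcal/mol -> expected events = {events:.3g}")
```

Sweeping ddg over a range and plotting the result shows how large a destabilization must be before any dissociation events become likely within the simulated time.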

More code and details coming!

• This tutorial is very much a work in progress.

• More code and examples (and more polished slides) coming soon!
