The tau-leap method for simulating stochastic kinetic models
Approximate accelerated stochastic simulation of chemically reacting systems
Colin Gillespie
October 16, 2013
Overview
Stochastic kinetic models
Issues with the Direct method
τ-leap method (Gillespie, 2001)
Leap conditions
Mid-point estimation
Examples
Source code (R and LaTeX of these slides): https://github.com/csgillespie/talks/
Stochastic kinetic models
A biochemical network is represented as a set of pseudo-biochemical reactions:
u species and v reactions
$R_i:\; p_{i1}X_1 + p_{i2}X_2 + \cdots + p_{iu}X_u \xrightarrow{c_i} q_{i1}X_1 + q_{i2}X_2 + \cdots + q_{iu}X_u$
Stochastic rate constant $c_i$.
Hazard/instantaneous rate: $h_i(X_t, c_i)$, where $X_t = (X_{1,t}, \ldots, X_{u,t})$ is the current state of the system.
Under mass-action stochastic kinetics, the hazard function is proportional to a product of binomial coefficients, with
$$h_i(X_t, c_i) = c_i \prod_{j=1}^{u} \binom{X_{j,t}}{p_{ij}}.$$
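As a rough illustration of the hazard formula, the following Python sketch evaluates a mass-action hazard from the reactant coefficients of one reaction. The function name `hazard` and the example counts are assumptions made for this illustration; they are not part of the original slides.

```python
from math import comb


def hazard(x, p, c):
    """Mass-action hazard c * prod_j C(x_j, p_j) for a single reaction.

    x: current species counts (X_1, ..., X_u)
    p: reactant coefficients (p_i1, ..., p_iu) of the reaction
    c: stochastic rate constant c_i
    """
    h = c
    for x_j, p_j in zip(x, p):
        h *= comb(x_j, p_j)  # comb returns 0 when x_j < p_j
    return h


# Illustrative state with two species (hypothetical values).
x = (100, 100)
print(hazard(x, (1, 0), 0.5))     # first order in X_1: c * x_1
print(hazard(x, (1, 1), 0.0025))  # X_1 + X_2: c * x_1 * x_2
print(hazard(x, (2, 0), 1.0))     # 2 X_1: c * x_1 * (x_1 - 1) / 2
```

Using binomial coefficients means the hazard drops to zero automatically when there are too few molecules for the reaction to fire.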
Stochastic kinetic model
Describe the SKM by a Markov jump process (MJP)
The effect of reaction $R_j$ is to change the value of each species $X_i$ by $q_{ji} - p_{ji}$
The stoichiometry matrix $S$ has elements $s_{ij} = q_{ji} - p_{ji}$
It can be shown that the time to the next reaction is
$$t \sim \mathrm{Exp}(h_0(X_t, c)), \quad \text{where } h_0(X_t, c) = \sum_{i=1}^{v} h_i(X_t, c_i),$$
and the reaction is of type $i$ with probability $h_i(X_t, c_i)/h_0(X_t, c)$
The process is easily simulated using the Direct method (Gillespie algorithm)
Example: Lotka-Volterra system
R1: Prey reproduction
$X_1 \xrightarrow{c_1} 2X_1$
R2: Prey death, predator reproduction
$X_1 + X_2 \xrightarrow{c_2} 2X_2$
R3: Predator death
$X_2 \xrightarrow{c_3} \emptyset$
[Figure: Stochastic realisations. Two panels of prey and predator population (0 to 500) against time (0 to 30).]
$X(0) = (100, 100)$, $c = (0.5, 0.0025, 0.3)$
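For concreteness, here is a worked instance of the earlier definitions for this system. It is derived here from the three reactions and the stated initial state and rate constants, rather than taken from the slides.

```latex
% Stoichiometry matrix (rows: species X_1, X_2; columns: reactions R_1, R_2, R_3)
S = \begin{pmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \end{pmatrix},
\qquad
h_1 = c_1 x_1, \quad h_2 = c_2 x_1 x_2, \quad h_3 = c_3 x_2 .

% At X(0) = (100, 100) with c = (0.5, 0.0025, 0.3):
h_1 = 50, \quad h_2 = 25, \quad h_3 = 30, \quad h_0 = 105,
% so the mean time to the first reaction is 1/h_0 \approx 0.0095.
```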
The direct method
1 Initialisation: initial conditions, reaction constants, and random number generators
2 Propensities update: Update each of the v hazard functions, $h_i(x)$
3 Propensities total: Calculate the total hazard $h_0 = \sum_{i=1}^{v} h_i(x)$
4 Reaction time: $\tau = -\ln[U(0, 1)]/h_0$ and $t = t + \tau$
5 Reaction selection: A reaction is chosen with probability proportional to its hazard
6 Reaction execution: Update species
7 Iteration: If the simulation time is exceeded stop, otherwise go back to step 2
Typically there are a large number of iterations
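The seven steps can be sketched in Python for the Lotka-Volterra system. This is a minimal illustration, not the author's R code: the names `S`, `hazards` and `direct_method`, the seed and the short time horizon are all assumptions made here.

```python
import random

# Net state change (X1, X2) caused by each Lotka-Volterra reaction R1, R2, R3.
S = [(1, 0), (-1, 1), (0, -1)]


def hazards(x, c):
    """Mass-action hazards for the Lotka-Volterra reactions."""
    x1, x2 = x
    return [c[0] * x1, c[1] * x1 * x2, c[2] * x2]


def direct_method(x0, c, t_max, rng):
    """Simulate one path; returns a list of (time, state) pairs."""
    t, x = 0.0, tuple(x0)                       # step 1
    path = [(t, x)]
    while True:
        h = hazards(x, c)                       # step 2
        h0 = sum(h)                             # step 3
        if h0 <= 0.0:                           # nothing can react any more
            break
        t += rng.expovariate(h0)                # step 4: tau ~ Exp(h0)
        if t > t_max:                           # step 7
            break
        u, i, acc = rng.random() * h0, 0, h[0]  # step 5
        while acc <= u and i < len(h) - 1:
            i += 1
            acc += h[i]
        x = tuple(xj + sj for xj, sj in zip(x, S[i]))  # step 6
        path.append((t, x))
    return path


path = direct_method((100, 100), (0.5, 0.0025, 0.3), 5.0, random.Random(1))
print(len(path) - 1, "reactions fired; final state", path[-1])
```

Every reaction event costs one pass through the loop, which is why the number of iterations grows with the total hazard.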
Approximations
Relax some assumptions (e.g. discreteness and stochasticity) in order to make simulation faster and more scalable
Diffusion approximation / chemical Langevin equation (CLE)
Linear noise approximation (LNA) / moment closure (2MA)
ODE
Hybrid discrete-continuous models
Poisson leap
If all reactions are zeroth-order, then the model is a homogeneous Poisson process
Hence the number of reactions in $(t_0, t_1)$ follows a Poisson distribution
For a more general model, if we consider a small time interval, $(t, t + \Delta t)$, then:
  the hazard rates should be approximately constant
  the number of reactions (of a given type) can be sampled from a Poisson distribution
A balance between speed and accuracy
Poisson leap method
1 Set $t = 0$. Initialise the rate constants and the initial molecule numbers $x$
2 Calculate $h_i(x, c_i)$, for $i = 1, \ldots, v$, and simulate the v-dimensional reaction vector $r$, with $i$th entry a $\mathrm{Po}(h_i(x, c_i)\Delta t)$ random quantity
3 Update the state accordingly: $x := x + S r$
4 Update $t := t + \Delta t$
5 Output $t$ and $x$. If $t < T_{\max}$ return to step 2
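A minimal Python sketch of these five steps for the Lotka-Volterra system follows. Two things in it are assumptions rather than part of the slides: negative counts are clamped at zero (the algorithm as stated does not say what to do when a leap overshoots), and the Poisson sampler is a simple stdlib-only routine whose cost grows with the mean.

```python
import random

S = [(1, 0), (-1, 1), (0, -1)]  # state change of (X1, X2) per reaction


def hazards(x, c):
    x1, x2 = x
    return [c[0] * x1, c[1] * x1 * x2, c[2] * x2]


def poisson(lam, rng):
    """Draw Po(lam) by counting unit-rate exponential arrivals in [0, lam]."""
    k, total = 0, rng.expovariate(1.0)
    while total < lam:
        k += 1
        total += rng.expovariate(1.0)
    return k


def poisson_leap(x0, c, dt, t_max, rng):
    x = tuple(x0)                                    # step 1
    out = [(0.0, x)]
    n_steps = round(t_max / dt)  # step count avoids floating-point drift in t
    for n in range(1, n_steps + 1):
        h = hazards(x, c)                            # step 2
        r = [poisson(hi * dt, rng) for hi in h]
        # step 3: x := x + S r, clamped at zero (an assumption, see above)
        x = tuple(
            max(xj + sum(ri * s[j] for ri, s in zip(r, S)), 0)
            for j, xj in enumerate(x)
        )
        out.append((n * dt, x))                      # steps 4 and 5
    return out


path = poisson_leap((100, 100), (0.5, 0.0025, 0.3), 0.1, 5.0, random.Random(1))
print(path[-1])
```

The loop runs a fixed number of times regardless of how many reactions fire, which is where the speed-up over the Direct method comes from.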
How do you choose $\Delta t$?
Example: Lotka-Volterra
Suppose we are interested in parameter inference
A possible prior could be independent U(−8, 8) priors for each $\log(c_i)$
Three samples from this prior yield very different realisations
The probability of extinction by time $t = 30$ is around 0.86
Each simulation would require a very different $\Delta t$
[Figure: Gillespie simulations. Three panels of population against time (0 to 30), with population axes running to 100, 2000 and 10000.]
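To see why a single $\Delta t$ cannot suit every draw, the sketch below samples rate constants from the prior and prints the total hazard at $X(0) = (100, 100)$. It assumes the natural logarithm is meant by $\log(c_i)$; the function name `sample_rates` and the seed are hypothetical.

```python
import math
import random

rng = random.Random(2013)  # arbitrary seed, for reproducibility only


def sample_rates(rng, n_reactions=3, low=-8.0, high=8.0):
    """Draw c with log(c_i) ~ U(low, high), independently for each i."""
    return [math.exp(rng.uniform(low, high)) for _ in range(n_reactions)]


for _ in range(3):
    c = sample_rates(rng)
    # Total hazard at X(0) = (100, 100) for the Lotka-Volterra hazards.
    h0 = c[0] * 100 + c[1] * 100 * 100 + c[2] * 100
    print(["%.3g" % ci for ci in c], "h0 = %.3g" % h0)
```

Because each $c_i$ ranges over roughly seven orders of magnitude, the mean time between reactions, $1/h_0$, varies just as widely from draw to draw.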
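Basic τ-leap method
Suppose a temporal leap τ will result in a state change λ
Choose a value of τ that satisfies the leap condition. For each reaction, $R_j$, we want
$$|h_j(x + \lambda) - h_j(x)|$$
to be small
Sample $k_j \sim \mathrm{Po}(h_j(x)\tau)$
Compute $\lambda = \sum_{j=1}^{v} k_j s_j$, where $s_j$ is the state change caused by reaction $R_j$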
Set t := t + τ and x := x + λ
Choosing τ
If the hazards don’t depend on x, the leap condition will be satisfied exactly for any τ (and so the method is exact)
If population numbers are large, then it would take a large number ofreactions to “noticeably” change the hazard functions
If satisfying the leap condition requires a very small value $\tau \ll 1/h_0(x)$, then we may as well use the exact SSA
A procedure for selecting τ (Gillespie 2001)
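The expected net change in $(t, t + \tau)$ will be:
$$\bar{\lambda} = \sum_{j=1}^{v} [h_j(x)\tau]\, s_j = \tau\,\xi(x), \quad \text{where } \xi(x) = \sum_{j=1}^{v} h_j(x)\, s_j.$$
So we require that the expected changes in the propensity functions in time τ are bounded by some fraction ε of the sum of all propensity functions, i.e.
$$|h_j(x + \bar{\lambda}) - h_j(x)| \le \varepsilon\, h_0(x) \quad \text{for } j = 1, \ldots, v.$$
Estimate the difference using a first-order Taylor expansion:
$$h_j(x + \bar{\lambda}) - h_j(x) \simeq \sum_{i=1}^{u} \tau\,\xi_i(x)\, \frac{\partial}{\partial x_i} h_j(x)$$
Defining
$$b_{ji}(x) \equiv \frac{\partial h_j(x)}{\partial x_i} \quad (j = 1, \ldots, v;\; i = 1, \ldots, u)$$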
A procedure for selecting τ
The requirement becomes
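$$\tau \left| \sum_{i=1}^{u} \xi_i(x)\, b_{ji}(x) \right| \le \varepsilon\, h_0(x) \quad (j = 1, \ldots, v)$$
The largest value of τ consistent with this condition (and hence optimal) is
$$\tau = \min_{j \in [1, v]} \left\{ \frac{\varepsilon\, h_0(x)}{\left| \sum_{i=1}^{u} \xi_i(x)\, b_{ji}(x) \right|} \right\}$$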
Typical values of ε are around 0.05
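The selection rule can be sketched for the Lotka-Volterra system, where the derivatives $b_{ji}$ are easy to write by hand. The function name `select_tau` and the handling of a zero denominator (treated as imposing no bound) are assumptions of this sketch.

```python
import math


def select_tau(x, c, eps=0.05):
    """Largest tau satisfying the leap condition for Lotka-Volterra."""
    x1, x2 = x
    h = [c[0] * x1, c[1] * x1 * x2, c[2] * x2]   # hazards h_j(x)
    h0 = sum(h)
    # xi(x) = sum_j h_j(x) s_j with s_1 = (1, 0), s_2 = (-1, 1), s_3 = (0, -1)
    xi = (h[0] - h[1], h[1] - h[2])
    # b[j][i] = d h_j / d x_i
    b = [(c[0], 0.0), (c[1] * x2, c[1] * x1), (0.0, c[2])]
    tau = math.inf
    for bj in b:
        denom = abs(sum(xi_i * b_ji for xi_i, b_ji in zip(xi, bj)))
        if denom > 0.0:          # a zero denominator imposes no bound
            tau = min(tau, eps * h0 / denom)
    return tau


print(select_tau((100, 100), (0.5, 0.0025, 0.3)))
```

At $X(0) = (100, 100)$ with $c = (0.5, 0.0025, 0.3)$ the hazards are (50, 25, 30), so $\xi = (25, -5)$ and the binding constraint is reaction 1, giving τ of about 0.42.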
Example: Lotka-Volterra
[Figure: hazards h1, h2 and h3 (0 to 200) against time (0 to 30), with the selected τ (0.2 to 1) against time (0 to 30)]
Mean τ
[Figure: histogram of mean τ (0.0 to 0.5) against frequency (0 to 1000)]
The estimated-midpoint technique
The leap condition requires that the hazard functions do not “appreciably” change in the course of a leap
But we want to take large leaps, so we will inevitably get computational errors
This is similar to solving the ODE
$$\frac{dX(t)}{dt} = f(X)$$
using an Euler scheme
A standard technique is to use a second-order Runge-Kutta or modified Euler method
$$X(t + \Delta t) = X(t) + f[X(t) + 0.5 f(X(t))\Delta t]\,\Delta t$$
i.e. we use an Euler method to estimate the midpoint during $[t, t + \Delta t]$, then calculate the increment in $X$ by evaluating the slope function $f$ at that estimated midpoint
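A small Python sketch of the ODE analogy: one Euler step and one midpoint step on a linear decay problem, compared with the exact solution. The test problem $dX/dt = -\mu X$ and its parameter values are hypothetical choices for illustration.

```python
import math


def euler_step(f, x, dt):
    return x + f(x) * dt


def midpoint_step(f, x, dt):
    x_mid = x + 0.5 * f(x) * dt      # Euler estimate of the midpoint
    return x + f(x_mid) * dt         # slope evaluated at the midpoint


mu, x0, dt = 1.0, 100.0, 0.4         # hypothetical test problem dX/dt = -mu X
f = lambda x: -mu * x
exact = x0 * math.exp(-mu * dt)
print("Euler error   ", abs(euler_step(f, x0, dt) - exact))
print("midpoint error", abs(midpoint_step(f, x0, dt) - exact))
```

For this problem the midpoint step lands much closer to the exact value than the Euler step, at the cost of one extra evaluation of `f`.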
Example: Death model
The death model contains a single reaction
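$X \xrightarrow{\mu} \emptyset$
and has hazard function $h_1(x, \mu) = \mu x$ and state change vector $s = -1$.
The solution to the CME, starting from $x_0$ individuals, is:
$$\Pr(X = x; \tau) = \binom{x_0}{x} e^{-\mu\tau x}\,(1 - e^{-\mu\tau})^{x_0 - x}$$
Equivalently, the number of reactions $k = x_0 - x$ that occur in time τ is Binomial with $x_0$ trials and success probability $1 - e^{-\mu\tau}$.
Death model
[Figure: Exact distribution Pr(k) (0.00 to 0.15) against population (0 to 150), in three panels for τ = 0.1, 0.4 and 0.9]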
Death model: p-leap
[Figure: Exact and Poisson-leap distributions Pr(k) (0.00 to 0.15) against population (0 to 150), in three panels for τ = 0.1, 0.4 and 0.9]
If we perform a single leap of length τ, the number of executed reactions is Poisson distributed:
$$P_p(k; \mu x_0, \tau) = \frac{e^{-\mu x_0 \tau}(\mu x_0 \tau)^k}{k!} \quad \text{for } k = 0, 1, \ldots$$
Death model: τ-leap + mid-point
[Figure: Exact, Poisson-leap and mid-point distributions Pr(k) (0.00 to 0.15) against population (0 to 150), in three panels for τ = 0.1, 0.4 and 0.9]
We estimate the mid-point to be:
$$x' \equiv x_0 - \lfloor 0.5\,\tau \mu x_0 \rfloor$$
so the number of reactions executed is distributed as
$$P_p(k; \mu x', \tau) = \frac{e^{-\mu x' \tau}(\mu x' \tau)^k}{k!} \quad \text{for } k = 0, 1, \ldots$$
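To make the comparison concrete, the sketch below computes the mean number of reactions in a single leap under the exact solution, the plain Poisson leap and the mid-point version. The values $x_0 = 100$ and $\mu = 1$ are hypothetical, since the slides do not state the ones used in the plots; the function name `mean_reactions` is likewise an assumption.

```python
import math


def mean_reactions(x0, mu, tau):
    """Mean number of deaths in one leap of length tau under each scheme."""
    exact = x0 * (1.0 - math.exp(-mu * tau))        # Binomial mean
    p_leap = mu * x0 * tau                          # Poisson mean, hazard at x0
    x_mid = x0 - math.floor(0.5 * tau * mu * x0)    # estimated mid-point x'
    mid = mu * x_mid * tau                          # Poisson mean, hazard at x'
    return exact, p_leap, mid


x0, mu = 100, 1.0  # hypothetical values; the slides do not state them
for tau in (0.1, 0.4, 0.9):
    exact, p_leap, mid = mean_reactions(x0, mu, tau)
    print(f"tau={tau}: exact={exact:.2f} p-leap={p_leap:.2f} mid-point={mid:.2f}")
```

Under these assumed values the mid-point mean is closer to the exact mean than the plain Poisson-leap mean for all three leap sizes, and the gap widens as τ grows.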
Example: Lotka-Volterra
[Figure: prey and predator populations (0 to 400) by simulator (p-leap, τ-leap, τ-leap + mid), for ∆t = 0.09, 0.18, 0.27, 0.36, 0.45, 0.54 and 0.63]
For these parameter values and this model, we have the approximate relationship (obtained via simulation)
$$\varepsilon \simeq 0.556 \times \Delta t$$
Summary
Similar issues arise when solving SDEs with the Euler-Maruyama scheme
In fact, if we substitute Poisson with Gaussian random numbers in the p-leap scheme, we get the Euler-Maruyama algorithm
Choosing a fixed ∆t for a wide range of parameter combinations doesn’t make sense
Is it possible to use these ideas when constructing bridges for SDEs?
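The Poisson-to-Gaussian substitution mentioned above can be sketched as follows for the Lotka-Volterra system: each $\mathrm{Po}(h_i \Delta t)$ draw is replaced by a Normal draw with the same mean and variance. Clamping the state at zero is an assumption of this sketch (it keeps the hazards, and hence the variances, non-negative), and the function names are hypothetical.

```python
import math
import random

S = [(1, 0), (-1, 1), (0, -1)]  # state change of (X1, X2) per reaction


def hazards(x, c):
    x1, x2 = x
    return [c[0] * x1, c[1] * x1 * x2, c[2] * x2]


def gaussian_leap(x0, c, dt, t_max, rng):
    x = tuple(float(v) for v in x0)  # the state is now continuous
    out = [(0.0, x)]
    for n in range(1, round(t_max / dt) + 1):
        h = hazards(x, c)
        # Po(h dt) replaced by N(h dt, h dt): same mean and variance.
        r = [rng.gauss(hi * dt, math.sqrt(hi * dt)) for hi in h]
        # x := x + S r, clamped at zero (an assumption of this sketch)
        x = tuple(
            max(xj + sum(ri * s[j] for ri, s in zip(r, S)), 0.0)
            for j, xj in enumerate(x)
        )
        out.append((n * dt, x))
    return out


path = gaussian_leap((100, 100), (0.5, 0.0025, 0.3), 0.1, 5.0, random.Random(1))
print(path[-1])
```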
References
Gillespie, DT. Exact stochastic simulation of coupled chemical reactions. The Journal of Physical Chemistry, 1977, 81: 2340-2361. Direct method.
Gillespie, DT. Approximate accelerated stochastic simulation of chemically reacting systems. The Journal of Chemical Physics, 2001, 115: 1716. τ-leap method.
Sandmann, W. Streamlined formulation of adaptive explicit-implicit tau-leaping with automatic tau selection. Proceedings of the 2009 Winter Simulation Conference (WSC). IEEE, 2009. Nice overview of the various τ-leap flavours.
Golightly, A & Gillespie, CS. Simulation of Stochastic Kinetic Models. In Silico Systems Biology. Humana Press, 2013, 169-187. Overview of various SSA.
https://github.com/csgillespie/In-silico-Systems-Biology