
Embedding Machine Learning in Formal Stochastic Models of Biological Processes

Jane Hillston

Joint work with Anastasis Georgoulas and Guido Sanguinetti, School of Informatics, University of Edinburgh

21st September 2016


Outline

1 Introduction

2 Probabilistic Programming

3 ProPPA

4 Inference

5 Results

6 Conclusions


Formal modelling

Formal languages (process algebras, Petri nets, rule-based) provide a convenient interface for describing complex systems.

High-level abstraction makes writing and manipulating models easier.

They can capture different kinds of behaviour: deterministic, stochastic, . . .

Formal nature lends itself to automatic, rigorous methods for analysis and verification.

. . . but what if parts of the system are unknown?


Alternative perspective

[Figure: a system whose internal structure is unknown, observed only through data.]

Model creation is data-driven


Modelling

There are two approaches to model construction:

Machine Learning: extracting a model from the data generated by the system, or refining a model based on system behaviour using statistical techniques.

Mechanistic Modelling: starting from a description or hypothesis, construct a model that algorithmically mimics the behaviour of the system, validated against data.


Machine Learning

[Diagram: prior + data → inference → posterior]

Bayes’ Theorem

For the distribution of a parameter θ and observed data D,

P(θ | D) ∝ P(θ)P(D | θ)


Bayesian statistics

Represent belief and uncertainty as probability distributions (prior, posterior).

Treat parameters and unobserved variables similarly.

Bayes’ Theorem:

P(θ | D) = P(θ) · P(D | θ) / P(D)

posterior ∝ prior · likelihood
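To make the update concrete, a minimal conjugate example in Python (an illustration of my own, not from the slides): a Beta prior over a success probability updated by Bernoulli observations.

# Illustrative only: Beta prior over a success probability, updated by
# Bernoulli data, showing posterior ∝ prior · likelihood in closed form.
from scipy.stats import beta

a, b = 1.0, 1.0              # Beta(1,1) prior: flat over [0,1]
data = [1, 0, 1, 1, 0, 1]    # hypothetical observations

posterior = beta(a + sum(data), b + len(data) - sum(data))
print(posterior.mean(), posterior.interval(0.95))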


Bayesian statistics

[Figure: the distribution over the parameter, starting from a broad prior (no data) and narrowing after 20, 40 and 60 data points.]


Mechanistic modelling

Models are constructed reflecting what is known about the components of the biological system and their behaviour.

Several approaches originating in theoretical computer science have been proposed to capture the system behaviour in a high-level way.

These are then compiled into executable models1 which can be run to deepen understanding of the model.

Executing the model generates data that can be compared with biological data.

1 Jasmin Fisher, Thomas A. Henzinger: Executable cell biology. Nature Biotechnology, 2007


Optimizing models

Usual process of parameterising a model is iterative and manual.

[Diagram: a loop in which the model is simulated/analysed, compared against data, and updated.]


Comparing the techniques

Data-driven modelling:

+ rigorous handling of parameter uncertainty
- limited or no treatment of stochasticity
- in many cases bespoke solutions are required, which can limit the size of system which can be handled

Mechanistic modelling:

+ general execution “engine” (deterministic or stochastic) can be reused for many models
+ models can be used speculatively to investigate roles of parameters, or alternative hypotheses
- parameters are assumed to be known and fixed, or costly approaches must be used to seek appropriate parameterisation


Developing a probabilistic programming approach

What if we could...

include information about uncertainty in the model?

automatically use observations to refine this uncertainty?

do all this in a formal context?

Starting from an existing process algebra (Bio-PEPA), we have developed a new language ProPPA that addresses these issues2.

2 Anastasis Georgoulas, Jane Hillston, Dimitrios Milios, Guido Sanguinetti: Probabilistic Programming Process Algebra. QEST 2014: 249-264.


Outline

1 Introduction

2 Probabilistic Programming

3 ProPPA

4 Inference

5 Results

6 Conclusions


Probabilistic programming

A programming paradigm for describing incomplete knowledge scenarios, and resolving the uncertainty.

Describe how the data is generated in syntax like a conventional programming language, but leaving some variables uncertain.

Specify observations, which impose constraints on acceptable outputs of the program.

Run program forwards: Generate data consistent with observations.

Run program backwards: Find values for the uncertain variables which make the output match the observations.
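As a toy illustration of running a program forwards and backwards (my own sketch in Python, not ProPPA syntax): the uncertain variable is recovered by keeping only runs whose output matches the observation.

# Toy probabilistic program: one uncertain variable k, a forward generative
# model, and a "backwards" run by rejection against an observation.
import random

def program(k):                  # forward model: how the data is generated
    return sum(random.random() < k for _ in range(20))

observed = 14                    # the observation we condition on

accepted = []
while len(accepted) < 1000:
    k = random.uniform(0, 1)     # uncertain variable with a Uniform(0,1) prior
    if program(k) == observed:   # keep only runs consistent with the observation
        accepted.append(k)

print(sum(accepted) / len(accepted))   # estimate of the posterior mean of k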


Outline

1 Introduction

2 Probabilistic Programming

3 ProPPA

4 Inference

5 Results

6 Conclusions


Molecular processes as concurrent computations

Concurrency                        | Molecular Biology                      | Metabolism              | Signal Transduction
Concurrent computational processes | Molecules                              | Enzymes and metabolites | Interacting proteins
Synchronous communication          | Molecular interaction                  | Binding and catalysis   | Binding and catalysis
Transition or mobility             | Biochemical modification or relocation | Metabolite synthesis    | Protein binding, modification or sequestration

A. Regev and E. Shapiro, Cells as computation, Nature 419, 2002.


Stochastic Process Algebra

In a stochastic process algebra actions (reactions) have a name or type, but also a stochastic duration or rate.

In systems biology modelling it is these rates that are often unknown.

The language may be used to generate a Markov Process (CTMC).

SPA MODEL --(SOS rules)--> LABELLED TRANSITION SYSTEM --(state transition diagram)--> CTMC

Q is the infinitesimal generator matrix characterising the CTMC.

Models are typically executed by simulation using Gillespie’s Stochastic Simulation Algorithm (SSA) or similar.


Bio-PEPA modelling

The state of the system at any time consists of the local states of each of its sequential/species components.

The local states of components are quantitative rather than functional, i.e. biological changes to species are represented as distinct components.

A component varying its state corresponds to it varying its amount.

This is captured by an integer parameter associated with the species and the effect of a reaction is to vary that parameter by a number corresponding to the stoichiometry of this species in the reaction.


The abstraction

Each species i is described by a species component Ci.

Each reaction j is associated with an action type αj and its dynamics is described by a specific function fαj.

The species components (now quantified) are then composed together to describe the behaviour of the system.


The semantics

The semantics is defined by two transition relations:

First, a capability relation — is a transition possible?

Second, a stochastic relation — gives the rate of a transition, derived from the parameters of the model.

The labelled transition system generated by the stochastic relation formally defines the underlying CTMC.


Example — in Bio-PEPA

[Diagram: reaction network over species I, S and R with reactions spread, stop1 and stop2]

k_s = 0.5;

k_r = 0.1;

kineticLawOf spread : k_s * I * S;

kineticLawOf stop1 : k_r * S * S;

kineticLawOf stop2 : k_r * S * R;

I = (spread,1) ↓ ;

S = (spread,1) ↑ + (stop1,1) ↓ + (stop2,1) ↓ ;

R = (stop1,1) ↑ + (stop2,1) ↑ ;

I[10] <*> S[5] <*> R[0]
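To show what executing such a model means, a minimal Gillespie-style simulation of the same reactions in Python (my own sketch, not the Bio-PEPA tool), using the rates and initial populations above.

# Sketch of Gillespie's SSA for the model above: k_s = 0.5, k_r = 0.1,
# initial populations I = 10, S = 5, R = 0.
import random

def ssa(k_s=0.5, k_r=0.1, I=10, S=5, R=0, t_end=5.0):
    t, trace = 0.0, [(0.0, I, S, R)]
    while t < t_end:
        rates = [k_s * I * S,    # spread: consumes one I, produces one S
                 k_r * S * S,    # stop1:  consumes one S, produces one R
                 k_r * S * R]    # stop2:  consumes one S, produces one R
        total = sum(rates)
        if total == 0:
            break                            # no reaction can fire any more
        t += random.expovariate(total)       # exponential time to next reaction
        if random.uniform(0, total) < rates[0]:
            I, S = I - 1, S + 1
        else:
            S, R = S - 1, R + 1
        trace.append((t, I, S, R))
    return trace

print(ssa()[-1])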


A Probabilistic Programming Process Algebra: ProPPA

The objective of ProPPA is to retain the features of the stochastic process algebra:

simple model description in terms of components

rigorous semantics giving an executable version of the model...

... whilst also incorporating features of a probabilistic programming language:

recording uncertainty in the parameters

ability to incorporate observations into models

access to inference to update uncertainty based on observations


Example Revisited

[Diagram: reaction network over species I, S and R with reactions spread, stop1 and stop2]

k_s = 0.5;

k_r = 0.1;

kineticLawOf spread : k_s * I * S;

kineticLawOf stop1 : k_r * S * S;

kineticLawOf stop2 : k_r * S * R;

I = (spread,1) ↓ ;

S = (spread,1) ↑ + (stop1,1) ↓ + (stop2,1) ↓ ;

R = (stop1,1) ↑ + (stop2,1) ↑ ;

I[10] <*> S[5] <*> R[0]


Additions

Declaring uncertain parameters:

k_s = Uniform(0,1);

k_r = Uniform(0,1);

Providing observations:

observe(’trace’)

Specifying inference approach:

infer(’ABC’)


Additions

[Diagram: reaction network over species I, S and R with reactions spread, stop1 and stop2]

k_s = Uniform(0,1);

k_r = Uniform(0,1);

kineticLawOf spread : k_s * I * S;

kineticLawOf stop1 : k_r * S * S;

kineticLawOf stop2 : k_r * S * R;

I = (spread,1) ↓ ;

S = (spread,1) ↑ + (stop1,1) ↓ + (stop2,1) ↓ ;

R = (stop1,1) ↑ + (stop2,1) ↑ ;

I[10] <*> S[5] <*> R[0]

observe(’trace’)

infer(’ABC’) //Approximate Bayesian Computation


Semantics

A Bio-PEPA model can be interpreted as a CTMC; however, CTMCs cannot capture uncertainty in the rates (every transition must have a concrete rate).

ProPPA models include uncertainty in the parameters, which translates into uncertainty in the transition rates.

A ProPPA model should be mapped to something like a distribution over CTMCs.


[Diagram, shown in three stages: a concrete parameter (k = 2) maps the model to a single CTMC; an interval (k ∈ [0,5]) maps it to a set of CTMCs; a distribution (k ∼ p) maps it to a distribution μ over CTMCs.]


Constraint Markov Chains

Constraint Markov Chains3 (CMCs) are a generalization of DTMCs, in which the transition probabilities are not concrete, but can take any value satisfying some constraints.

Constraint Markov Chain

A CMC is a tuple 〈S, o, A, V, φ〉, where:

S is the set of states, of cardinality k.

o ∈ S is the initial state.

A is a set of atomic propositions.

V : S → 2^(2^A) gives a set of acceptable labellings for each state.

φ : S × [0, 1]^k → {0, 1} is the constraint function.

3 Caillaud et al., Constraint Markov Chains, Theoretical Computer Science, 2011


Constraint Markov Chains

In a CMC, arbitrary constraints are permitted, expressed through the function φ: φ(s, p) = 1 iff p is an acceptable vector of transition probabilities from state s.

However,

CMCs are defined only for the discrete-time case, and

this does not say anything about how likely a value is to be chosen, only about whether it is acceptable.

To address these shortcomings, we define Probabilistic Constraint Markov Chains.


Probabilistic CMCs

A Probabilistic Constraint Markov Chain is a tuple 〈S , o,A,V , φ〉, where:

S is the set of states, of cardinality k .

o ∈ S is the initial state.

A is a set of atomic propositions.

V : S → 2^(2^A) gives a set of acceptable labellings for each state.

φ : S × [0, ∞)^k → [0, ∞) is the constraint function.

This is applicable to continuous-time systems.

φ(s, ·) is now a probability density function on the transition rates from state s.
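One operational reading of this definition (a sketch of my own, not the ProPPA implementation) is to carry φ(s, ·) around as a sampler over outgoing rate vectors.

# Illustrative encoding of a PCMC: phi(s, .) is represented by a function that
# samples a vector of outgoing transition rates for state s.
import random
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Set

@dataclass
class PCMC:
    states: List[str]                                   # S
    initial: str                                        # o
    propositions: Set[str]                              # A
    labelling: Dict[str, Set[FrozenSet[str]]]           # V(s): acceptable labellings
    rate_density: Dict[str, Callable[[], List[float]]]  # phi(s, .), here as a sampler

# Two-state example: the 'on' -> 'off' rate is uncertain, Uniform(0, 5).
example = PCMC(
    states=["on", "off"],
    initial="on",
    propositions={"active"},
    labelling={"on": {frozenset({"active"})}, "off": {frozenset()}},
    rate_density={"on": lambda: [0.0, random.uniform(0, 5)],
                  "off": lambda: [1.0, 0.0]},
)
print(example.rate_density["on"]())    # one sampled rate vector from state 'on'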


Semantics of ProPPA

The semantics definition follows that of Bio-PEPA, which is defined using two transition relations:

Capability relation — is a transition possible?

Stochastic relation — gives the distribution of the rate of a transition

The distribution over the parameter values induces a distribution over transition rates.

Rules are expressed as state-to-function transition systems (FuTS4).

This gives rise to the underlying PCMC.

4 De Nicola et al., A Uniform Definition of Stochastic Process Calculi, ACM Computing Surveys, 2013


Simulating Probabilistic Constraint Markov Chains

Probabilistic Constraint Markov Chains are open to two alternative dynamic interpretations:

1 For each trajectory, for each uncertain transition rate, sample once at the start of the run and use that value throughout;

2 During each trajectory, each time a transition with an uncertain rate is encountered, sample a value but then discard it and re-sample whenever this transition is visited again.

1 Uncertain Markov Chains

2 Imprecise Markov Chains

Our current work is focused on the Uncertain Markov Chain case.
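A minimal sketch (my own) of the Uncertain Markov Chain reading for the rumour model: the uncertain parameters are drawn once from their Uniform(0,1) priors and then fixed for the whole trajectory.

# Uncertain Markov Chain interpretation, sketched for the rumour model:
# k_s and k_r are sampled once and used throughout the trajectory
# (re-sampling at every transition would give the Imprecise reading instead).
import random

def simulate_uncertain(t_end=5.0):
    k_s, k_r = random.uniform(0, 1), random.uniform(0, 1)   # sampled once
    I, S, R, t = 10, 5, 0, 0.0
    while t < t_end:
        rates = [k_s * I * S, k_r * S * S, k_r * S * R]
        total = sum(rates)
        if total == 0:
            break
        t += random.expovariate(total)
        if random.uniform(0, total) < rates[0]:
            I, S = I - 1, S + 1              # spread
        else:
            S, R = S - 1, R + 1              # stop1 or stop2
    return (k_s, k_r), (I, S, R)

print(simulate_uncertain())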


Outline

1 Introduction

2 Probabilistic Programming

3 ProPPA

4 Inference

5 Results

6 Conclusions


Inference

[Diagram: parameter k ∼ p and the model give a distribution μ over CTMCs; combining μ with observations through inference gives a posterior distribution μ*.]


Inference

P(θ | D) ∝ P(θ)P(D | θ)

The ProPPA semantics does not define a single inference algorithm, allowing for a modular approach.

Different algorithms can act on different input (time-series vs properties), return different results or in different forms.

Exact inference is often impossible, as we cannot calculate the likelihood.

We must use approximate algorithms or approximations of the system.


Inferring likelihood in uncertain CTMCs

Transient probabilities can be expressed as:

dp_i(t)/dt = Σ_{j≠i} p_j(t) · q_ji − p_i(t) · Σ_{j≠i} q_ij

The probability of a single observation (y, t) can then be expressed as

p(y, t) = Σ_{i∈S} p_i(t) · π(y | i)

where π(y | i) is the probability of observing y when in state i.

The likelihood can then be expressed as

P(D | θ) = ∏_{j=1}^{N} Σ_{i∈S} p_i(t_j | θ) · π(y_j | i)


Calculating the transient probabilities

For finite state-spaces, the transient probabilities can, in principle, be computed as

p(t) = p(0) e^(Qt)

Likelihood is hard to compute:

Computing e^(Qt) is expensive if the state space is large

Impossible directly in infinite state-spaces
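For a small, truncated state space the direct computation is straightforward; a sketch of my own using SciPy, with an assumed birth-death model and a Gaussian observation model for π(y | i).

# Sketch: transient probabilities p(t) = p(0) expm(Q t) and the resulting
# likelihood for noisy observations, on a small truncated state space.
import numpy as np
from scipy.linalg import expm
from scipy.stats import norm

n = 20                       # truncated state space {0, ..., n-1}
birth, death = 2.0, 0.5      # assumed example rates

Q = np.zeros((n, n))
for i in range(n):
    if i + 1 < n:
        Q[i, i + 1] = birth
    if i > 0:
        Q[i, i - 1] = death * i
    Q[i, i] = -Q[i].sum()    # rows of a generator sum to zero

p0 = np.zeros(n); p0[0] = 1.0
observations = [(1.0, 3), (2.0, 5)]       # (time t_j, observed count y_j)

likelihood = 1.0
for t, y in observations:
    p_t = p0 @ expm(Q * t)                # transient distribution at time t
    pi_y = norm.pdf(y, loc=np.arange(n), scale=1.0)   # pi(y | i), Gaussian noise
    likelihood *= float(p_t @ pi_y)       # sum_i p_i(t) pi(y | i)
print(likelihood)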


Basic Inference

Approximate Bayesian Computation is a simple simulation-based solution:

Approximates posterior distribution over parameters as a set of samples

Likelihood of parameters is approximated with a notion of distance.


[Figure: a sampled parameter set is used to simulate a trajectory, which is compared point-wise with the observed trace; the sample is rejected if Σ(x_i − y_i)² > ε and accepted if Σ(x_i − y_i)² < ε.]


Approximate Bayesian Computation

ABC algorithm

1 Sample a parameter set from the prior distribution.

2 Simulate the system using these parameters.

3 Compare the simulation trace obtained with the observations.

4 If distance < ε, accept, otherwise reject.

This results in an approximation to the posterior distribution.

As ε → 0, the set of samples converges to the true posterior.

We use a more elaborate version based on Markov Chain Monte Carlo sampling.
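A minimal rejection-sampling version of this loop for the rumour model (my own sketch; the tool itself uses the MCMC variant just mentioned).

# ABC rejection sketch for the rumour model: draw (k_s, k_r) from the
# Uniform(0,1) priors, simulate, accept if close enough to the observations.
import random

def simulate(k_s, k_r, times):
    """Return the spreader count S at each observation time."""
    I, S, R, t, out = 10, 5, 0, 0.0, []
    for t_obs in times:
        while True:
            rates = [k_s * I * S, k_r * S * S, k_r * S * R]
            total = sum(rates)
            dt = random.expovariate(total) if total > 0 else float("inf")
            if t + dt > t_obs:
                break                        # no further jumps before t_obs
            t += dt
            if random.uniform(0, total) < rates[0]:
                I, S = I - 1, S + 1
            else:
                S, R = S - 1, R + 1
        t = t_obs
        out.append(S)
    return out

times = [1.0, 2.0, 3.0, 4.0]
observed = simulate(0.5, 0.1, times)         # synthetic data from "true" parameters
eps, accepted = 20.0, []
while len(accepted) < 200:
    k_s, k_r = random.uniform(0, 1), random.uniform(0, 1)
    trace = simulate(k_s, k_r, times)
    if sum((x - y) ** 2 for x, y in zip(trace, observed)) < eps:
        accepted.append((k_s, k_r))          # these samples approximate the posterior
print(len(accepted), accepted[0])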


Inference for infinite state spaces

Various methods become inefficient or inapplicable as the state-space grows.

How to deal with unbounded systems?

Multiple simulation runs

Large population approximations (diffusion, Linear Noise,. . . )

Systematic truncation

Random truncations


Expanding the likelihood

The likelihood can be written as an infinite series:

p(x′, t′ | x, t) = Σ_{N=0}^{∞} p^(N)(x′, t′ | x, t)
                = Σ_{N=0}^{∞} [ f^(N)(x′, t′ | x, t) − f^(N−1)(x′, t′ | x, t) ]

where

x* = max{x, x′}

p^(N)(x′, t′ | x, t) is the probability of going from state x at time t to state x′ at time t′ through a path with maximum state x* + N

f^(N) is the same, except the maximum state cannot exceed x* + N (but does not have to reach it)

Any finite number of terms can be computed — can the infinite sum be computed or estimated?
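One concrete reading of this construction (a sketch of my own, for an assumed birth-death chain): f^(N) is obtained from the sub-generator restricted to states up to x* + N, so that probability mass which would leave that set is simply lost, and p^(N) = f^(N) − f^(N−1).

# Sketch: compute f_N (probability of reaching x' from x by time t without the
# path exceeding x* + N) from the truncated sub-generator.
import numpy as np
from scipy.linalg import expm

birth, death = 1.0, 0.4              # assumed example rates
x, x_prime, t = 2, 5, 1.0
x_star = max(x, x_prime)

def f(N):
    m = x_star + N + 1                        # states 0 .. x* + N
    Q = np.zeros((m, m))
    for i in range(m):
        if i + 1 < m:
            Q[i, i + 1] = birth               # births leaving the set are dropped...
        if i > 0:
            Q[i, i - 1] = death * i
        Q[i, i] = -(birth + death * i)        # ...but still counted in the exit rate
    return expm(Q * t)[x, x_prime]

partial = f(0)
for N in range(1, 8):
    partial += f(N) - f(N - 1)                # adding p_N terms
print(partial, f(15))                         # the partial sums approach the full probability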


[Figure: a two-dimensional state space drawn as a grid, with the initial state x, the target state x′ and the maximal state x* marked; nested sets S0, S1 show the growing truncations of the state space.]


Russian Roulette Truncation

We want to estimate the value of

f = Σ_{n=0}^{∞} f_n

where the f_n are computable.

Choose a single term f_k with probability p_k; estimate f̂ = f_k / p_k.

f̂ is unbiased... but its variance can be high.

Truncate the sum randomly: stop at term k with probability q_k.

Form f̂ as a partial sum of the f_n, n = 0, . . . , k, rescaled appropriately:

f̂ = Σ_{n=0}^{k} f_n / ∏_{j=0}^{n−1} (1 − q_j)


Russian Roulette

f̂ ← f_0
i ← 1
p ← 1
loop
    Choose to stop with probability q_i
    if stopping then
        return f̂
    else
        p ← p · (1 − q_i)
        f̂ ← f̂ + f_i / p
        i ← i + 1
    end if
end loop
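The same procedure in runnable Python (my own sketch), checked on a geometric series whose exact sum is 2.

# Russian roulette estimator, following the pseudocode above.
import random

def russian_roulette(f, q=0.3):
    """Unbiased estimate of sum_n f(n); stop after each term with probability q."""
    estimate = f(0)
    p, i = 1.0, 1
    while True:
        if random.random() < q:          # choose to stop with probability q_i
            return estimate
        p *= (1 - q)                     # probability of having survived to term i
        estimate += f(i) / p             # rescale the term to keep the estimate unbiased
        i += 1

estimates = [russian_roulette(lambda n: 0.5 ** n) for _ in range(100000)]
print(sum(estimates) / len(estimates))   # close to 2 on average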


Russian Roulette

f̂ = Σ_{n=0}^{k} f_n / ∏_{j=0}^{n−1} (1 − q_j)

In our case, f is a probability that we wish to approximate.

Using f̂ instead of f leads to an error; however f̂ is unbiased: E[f̂] = f.

f̂ is also guaranteed to be positive.

Pseudo-marginal algorithms can use this and still draw samples from the correct distribution.

We have developed both Metropolis-Hastings and Gibbs-like sampling algorithms based on this approach5.

5 Unbiased Bayesian Inference for Population Markov Jump Processes via Random Truncations. A. Georgoulas, J. Hillston and G. Sanguinetti, to appear in Statistics and Computing.
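A minimal pseudo-marginal Metropolis-Hastings loop (an illustration of my own, not the algorithm from the paper): the exact likelihood is replaced by an unbiased, positive estimate, and the estimate is carried along with the current sample.

# Pseudo-marginal Metropolis-Hastings sketch: an unbiased, positive likelihood
# estimate is plugged into an otherwise standard MH acceptance ratio.
import math
import random

def likelihood_hat(theta):
    # Stand-in for an unbiased estimator (e.g. a Russian roulette sum):
    # here a noisy but unbiased, positive estimate of exp(-(theta - 1)^2).
    return random.uniform(0.5, 1.5) * math.exp(-(theta - 1) ** 2)

def prior(theta):
    return 1.0 if 0.0 <= theta <= 2.0 else 0.0     # Uniform(0, 2)

theta, L_hat = 1.0, likelihood_hat(1.0)
samples = []
for _ in range(10000):
    proposal = theta + random.gauss(0, 0.3)        # symmetric random walk
    L_prop = likelihood_hat(proposal)
    ratio = prior(proposal) * L_prop / (prior(theta) * L_hat)
    if random.random() < ratio:
        theta, L_hat = proposal, L_prop            # keep the estimate with the state
    samples.append(theta)
print(sum(samples) / len(samples))                 # approximate posterior mean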


[Figure: SSA samples]

[Figure: Posterior samples]


Outline

1 Introduction

2 Probabilistic Programming

3 ProPPA

4 Inference

5 Results

6 Conclusions


Example model

[Diagram: reaction network over species I, S and R]

k_s = Uniform(0,1);

k_r = Uniform(0,1);

kineticLawOf spread : k_s * I * S;

kineticLawOf stop1 : k_r * S * S;

kineticLawOf stop2 : k_r * S * R;

I = (spread,1) ↓ ;

S = (spread,1) ↑ + (stop1,1) ↓ + (stop2,1) ↓ ;

R = (stop1,1) ↑ + (stop2,1) ↑ ;

I[10] <*> S[5] <*> R[0]

observe(’trace’)

infer(’ABC’) //Approximate Bayesian Computation


Results

Tested on the rumour-spreading example, giving the two parameters uniform priors.

Approximate Bayesian Computation

Returns posterior as a set of points (samples)

Observations: time-series (single simulation)


Results: ABC

[Figure: scatter plot of the accepted ABC samples in the (k_s, k_r) plane.]

[Figure: histograms of the number of samples for k_r and k_s.]


Genetic Toggle Switch

Two mutually-repressing genes: promoters (unobserved) and their protein products

Bistable behaviour: switching induced by environmental changes

Synthesised in E. coli6

Stochastic variant7 where switching is induced by noise

6 Gardner, Cantor & Collins, Construction of a genetic toggle switch in Escherichia coli, Nature, 2000

7 Tian & Burrage, Stochastic models for regulatory networks of the genetic toggle switch, PNAS, 2006


Genetic Toggle Switch

[Diagram: genes G1 and G2 switch between on and off states and express proteins P1 and P2, which degrade (∅); edges marked “participates” and “accelerates” show each protein accelerating the deactivation of the opposing gene.]


Toggle switch model: species

G1 = activ1 ↑ + deact1 ↓ + expr1 ⊕;
G2 = activ2 ↑ + deact2 ↓ + expr2 ⊕;
P1 = expr1 ↑ + degr1 ↓ + deact2 ⊕;
P2 = expr2 ↑ + degr2 ↓ + deact1 ⊕

G1[1] <*> G2[0] <*> P1[20] <*> P2[0]

observe(toggle_obs);

infer(rouletteGibbs);


θ_1 = Gamma(3,5); //etc...

kineticLawOf expr1 : θ_1 * G1;

kineticLawOf expr2 : θ_2 * G2;

kineticLawOf degr1 : θ_3 * P1;

kineticLawOf degr2 : θ_4 * P2;

kineticLawOf activ1 : θ_5 * (1 - G1);

kineticLawOf activ2 : θ_6 * (1 - G2);

kineticLawOf deact1 : θ_7 * exp(r * P2) * G1;

kineticLawOf deact2 : θ_8 * exp(r * P1) * G2;

G1 = activ1 ↑ + deact1 ↓ + expr1 ⊕;
G2 = activ2 ↑ + deact2 ↓ + expr2 ⊕;
P1 = expr1 ↑ + degr1 ↓ + deact2 ⊕;
P2 = expr2 ↑ + degr2 ↓ + deact1 ⊕

G1[1] <*> G2[0] <*> P1[20] <*> P2[0]

observe(toggle_obs);

infer(rouletteGibbs);


Experiment

Simulated observations

Gamma priors on all parameters (required by algorithm)

Goal: learn posterior of 8 parameters

5000 samples taken using the Gibbs-like random truncation algorithm


Promoters

[Figure: promoter (gene) states over the time course 0 to 100.]

Proteins

[Figure: protein counts over the time course 0 to 100.]

Observations used

[Figure: the observed protein counts used for inference.]


Results

[Figure: inference results.]


Outline

1 Introduction

2 Probabilistic Programming

3 ProPPA

4 Inference

5 Results

6 Conclusions


Workflow

[Diagram: the model is compiled into a low-level description; an inference algorithm is applied (infer) to give inference results (samples), which feed statistics, plotting, prediction, ...]


Summary

ProPPA is a process algebra that incorporates uncertainty and observations directly in the model, influenced by probabilistic programming.

Syntax remains similar to Bio-PEPA.

Semantics defined in terms of an extension of Constraint Markov Chains.

Observations can be either time-series or logical properties.

Parameter inference based on random truncations (Russian Roulette) offers new possibilities for inference.


Thanks

Anastasis Georgoulas Guido Sanguinetti
