Approximate Bayesian Computation using
Indirect Inference
Chris Drovandi ([email protected])
Acknowledgement: Prof Tony Pettitt and Prof Malcolm Faddy
School of Mathematical Sciences, Queensland University of Technology
June 26, 2012
Approximate Bayesian Computation (ABC)
Bayesian statistics involves inference based on the posterior distribution
π(θ|y) ∝ f(y|θ)π(θ).
What happens when the likelihood f(y|θ) is unavailable?
ABC instead uses model simulations and compares simulated with observed data
Naive Algorithm - Rejection Sampling
1. Sample θ ∼ π(·)
2. Simulate x ∼ f(·|θ)
3. Accept θ if ρ(y, x) ≤ ǫ
Repeat the above until we have N draws, θ1, . . . , θN
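To make the rejection sampler concrete, here is a minimal Python sketch on a hypothetical Gaussian toy model; the prior, simulator, summaries and tolerance below are illustrative assumptions, not choices from this talk.

```python
# Minimal ABC rejection sampling sketch on a toy Gaussian model.
import numpy as np

rng = np.random.default_rng(1)
y_obs = rng.normal(loc=2.0, scale=1.0, size=100)   # stand-in "observed" data

def prior_sample():                  # theta ~ pi(.)
    return rng.uniform(-5.0, 5.0)

def simulate(theta):                 # x ~ f(.|theta)
    return rng.normal(loc=theta, scale=1.0, size=100)

def summary(data):                   # S(.) = (S1(.), ..., Sp(.))
    return np.array([data.mean(), data.std()])

def rho(y, x):                       # rho(y, x) = ||S(y) - S(x)||
    return np.linalg.norm(summary(y) - summary(x))

def abc_rejection(N, eps):
    draws = []
    while len(draws) < N:
        theta = prior_sample()
        x = simulate(theta)
        if rho(y_obs, x) <= eps:     # accept theta if simulation is close
            draws.append(theta)
    return np.array(draws)

posterior_draws = abc_rejection(N=1000, eps=0.2)
```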
The Approximate Posterior
Involves a joint ‘approximate’ posterior distribution
π(θ, x|y, ǫ) ∝ g(y|x, ǫ)f(x|θ)π(θ)
g(y|x, ǫ) is a weighting function (Reeves and Pettitt, 2005).
How do we compare y and x? Directly? Usually too high dimensional.
Typically use statistics to summarise the data, S(·) = (S1(·), . . . , Sp(·)) (low dimensional), and set
ρ(y, x) = ‖S(y) − S(x)‖
One popular choice is to set
g(y|x, ǫ) = 1(ρ(y, x) ≤ ǫ)
Choice of ǫ is a trade-off between accuracy and efficiency
The Approximation and Summary Statistic Choice
Errors arise from insufficient summaries and from ǫ > 0 (see Fearnhead and Prangle (2012))
An efficient algorithm permits a low tolerance
Thus summary statistic choice is vital for good approximation
Approaches for Summary Statistics
Full data (Barthelme and Chopin (2011), White et al (2012))
Data reduction techniques (Blum et al (2012))
Choose subset out of larger set (Nunes and Balding (2010), Barnes et al (2012))
Indirect inference (motivated by application in this talk)
Advantage of indirect inference approach: knowledge that summary statistics are ‘good’ can be obtained prior to ABC analysis
ABC Algorithms
Rejection Sampling (above) (Pritchard et al, 1999 and Beaumont et al, 2002)
Advantage: simplicity, independent draws
Disadvantage: acceptance rate can be prohibitively low
MCMC ABC (Marjoram et al, 2003)
Carefully chosen proposal q(θ∗, x∗|θ, x) = q(θ∗|θ)f(x∗|θ∗), so the intractable likelihoods cancel in the acceptance ratio (a sketch follows after this list)
Advantage: efficient relative to rejection sampling
Disadvantage: still quite inefficient, can get stuck, struggles with multi-modal targets
SMC ABC (Sisson et al (2007), Toni et al (2009), Drovandiand Pettitt (2011), Beaumont et al (2009), Del Moral et al(2012))
SMC Advantages: Very efficient, adaptive
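For illustration, a minimal sketch of an ABC-MCMC kernel in the spirit of Marjoram et al (2003); the Gaussian random-walk proposal and the user-supplied prior_logpdf, simulate and dist functions are assumptions of the sketch, not details from the talk.

```python
# ABC-MCMC sketch: accept only if the simulation matches within eps,
# so the intractable likelihoods cancel from the MH ratio.
import numpy as np

def mcmc_abc(theta0, n_iter, eps, prior_logpdf, simulate, dist,
             step=0.1, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    theta = np.asarray(theta0, dtype=float)
    chain = np.empty((n_iter, theta.size))
    for i in range(n_iter):
        prop = theta + step * rng.standard_normal(theta.size)
        # symmetric proposal: MH ratio reduces to the prior ratio,
        # gated by the ABC matching condition dist(x) <= eps
        if dist(simulate(prop)) <= eps and \
           np.log(rng.uniform()) < prior_logpdf(prop) - prior_logpdf(theta):
            theta = prop
        chain[i] = theta                 # duplicate theta on rejection
    return chain
```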
Motivating Application - Macroparasite Immunity
Estimate parameters of a Markov process model explaining macroparasite population development with host immunity
212 hosts (cats), i = 1, . . . , 212. Each cat injected with li juvenile Brugia pahangi larvae (approximately 100 or 200).
At time ti the host is sacrificed and the number of mature parasites is recorded
Host assumed to develop an immunity
Three variable problem: M(t) matures, L(t) juveniles, I(t) immunity.
Only L(0) and M(ti) are observed for each host
[Figure: observed proportion of matures versus sacrifice time for each host]
Trivariate Markov Process of Riley et al (2003)
[Diagram: compartments for mature parasites M(t), juvenile parasites L(t) and host immunity I(t), all three invisible (unobserved) over time. Transitions: maturation at rate γL(t); natural death of juveniles at rate µLL(t); death of juveniles due to immunity at rate βI(t)L(t); gain of immunity at rate νL(t); loss of immunity at rate µII(t); natural death of matures at rate µMM(t).]
The Model and Intractable Likelihood
Deterministic form of the model:
dL/dt = −µLL − βIL − γL,
dM/dt = γL − µMM,
dI/dt = νL − µII,
γ and µM fixed; ν, µL, µI, β require estimation
Likelihood-based inference appears intractable.
Could form complete likelihood with missing larvae and immunity (too much missing data)
Matrix exponential form of the likelihood (matrix too large)
Simulation from the model is straightforward using Gillespie’s algorithm (Gillespie, 1977)
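As a concrete illustration, a sketch of Gillespie's algorithm for this trivariate process, with event rates taken from the diagram above; the parameter values in the usage comment are placeholders, not estimates.

```python
# Gillespie simulation of one host: events and rates follow the diagram
# (maturation gamma*L; juvenile deaths mu_L*L and beta*I*L; immunity
# gain nu*L and loss mu_I*I; mature death mu_M*M).
import numpy as np

def simulate_host(L0, t_end, nu, mu_L, mu_I, beta,
                  gamma=0.04, mu_M=0.0015, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    L, M, I = L0, 0, 0
    t = 0.0
    while t < t_end:
        rates = np.array([
            gamma * L,        # maturation:        L -> L-1, M -> M+1
            mu_L * L,         # natural death:     L -> L-1
            beta * I * L,     # death by immunity: L -> L-1
            nu * L,           # gain of immunity:  I -> I+1
            mu_I * I,         # loss of immunity:  I -> I-1
            mu_M * M,         # natural death:     M -> M-1
        ])
        total = rates.sum()
        if total == 0.0:
            break                          # absorbed: no event possible
        t += rng.exponential(1.0 / total)  # time to next event
        if t >= t_end:
            break
        event = rng.choice(6, p=rates / total)
        if event == 0: L -= 1; M += 1
        elif event in (1, 2): L -= 1
        elif event == 3: I += 1
        elif event == 4: I -= 1
        else: M -= 1
    return M                               # only M(t_i) is observed

# e.g. one host injected with 100 larvae, sacrificed at t = 300
# (placeholder parameter values):
# m = simulate_host(100, 300.0, nu=1e-3, mu_L=0.01, mu_I=1e-3, beta=1e-3)
```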
Developing Summary Statistics
Need to develop summary statistics that efficiently summarise the data!
Covariates: number of juveniles injected and sacrifice time.
An approach based on indirect inference (Gourieroux et al., 1993)
Propose an auxiliary model pa(y|θa) whose parameter θa is easily estimable
Auxiliary model is flexible enough to provide a good description of the data
Simulate xθ from the target intractable likelihood p(·|θ) and find θa(xθ)
Estimate θ using the θa(xθ) closest to θa(y)
Estimates of the parameters of the auxiliary model fitted to the data become the summary statistics in ABC (see the sketch below).
Compare and contrast models based on the Beta-Binomial (BB) and Binomial mixture (BM).
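A small sketch of this pipeline for generating ABC summaries; aux_loglike is a stand-in for an auxiliary log-likelihood such as the Beta-Binomial or Binomial mixture models defined below.

```python
# Indirect inference summaries: fit the auxiliary model by maximum
# likelihood and use the fitted parameters as the ABC summary statistics.
import numpy as np
from scipy.optimize import minimize

def aux_mle(data, theta_a0, aux_loglike):
    """Auxiliary-model MLE theta_a(data), used as the summary statistic."""
    res = minimize(lambda th: -aux_loglike(th, data), np.asarray(theta_a0),
                   method="Nelder-Mead")
    return res.x

# In ABC: S(y) = aux_mle(y, start, aux_loglike), and each simulated
# dataset x gets S(x) = aux_mle(x, start, aux_loglike).
```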
Summary statistics
For each model, compare the auxiliary estimates for simulated data x, θa(x), and observations y, θa(y), with the Mahalanobis distance
ρ(y, x) = ρ(θa(y), θa(x)) = √{(θa(y) − θa(x))ᵀ S⁻¹ (θa(y) − θa(x))},
where S is the covariance matrix of the MLE
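A direct translation of this distance into code; here S is assumed to be an estimate of the MLE covariance, e.g. the inverse observed information at θa(y).

```python
# Mahalanobis distance between auxiliary MLEs.
import numpy as np

def mahalanobis(theta_a_y, theta_a_x, S):
    """rho(y, x), with S the covariance matrix of the auxiliary MLE."""
    d = np.asarray(theta_a_y) - np.asarray(theta_a_x)
    return float(np.sqrt(d @ np.linalg.solve(S, d)))
```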
Auxiliary Beta-Binomial model
The data show too much variation for a Binomial model
A Beta-Binomial model has an extra parameter to capture the overdispersion
p(mi | αi, βi) = (li choose mi) B(mi + αi, li − mi + βi) / B(αi, βi),
Useful reparameterisation: pi = αi/(αi + βi) and θi = 1/(αi + βi)
Relate the proportion and overdispersion parameters to the time, ti, and initial larvae, li, covariates:
logit(pi) = β0 + β1 log(ti) + β2(log(ti))²,
log(θi) = η100 if li ≈ 100; η200 if li ≈ 200
Five parameters: (β0, β1, β2, η100, η200)
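A sketch of this auxiliary Beta-Binomial log-likelihood under the reparameterisation above, with αi = pi/θi and βi = (1 − pi)/θi; the threshold l < 150 used to split the two dose groups is an assumption for illustration.

```python
# Auxiliary Beta-Binomial log-likelihood with covariate regression.
import numpy as np
from scipy.special import betaln, gammaln

def log_choose(n, k):                      # log binomial coefficient
    return gammaln(n + 1) - gammaln(k + 1) - gammaln(n - k + 1)

def bb_loglike(params, m, l, t):
    """params = (b0, b1, b2, eta100, eta200); m, l, t are data arrays."""
    b0, b1, b2, eta100, eta200 = params
    lt = np.log(t)
    p = 1.0 / (1.0 + np.exp(-(b0 + b1 * lt + b2 * lt**2)))     # logit link
    theta = np.where(l < 150, np.exp(eta100), np.exp(eta200))  # dose group (assumed split)
    a, b = p / theta, (1.0 - p) / theta
    return np.sum(log_choose(l, m) + betaln(m + a, l - m + b) - betaln(a, b))
```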
Auxiliary Binomial Mixture model
An auxiliary model based on a three-component Binomial mixture.
The ith observation has the distribution
p(mi | Θ) = (li choose mi) Σ_{k=1}^{3} wk (θi(k))^{mi} (1 − θi(k))^{li − mi},
where w3 = 1 − w1 − w2.
Reparameterise the θi(k): logit(θi(k)) = γ0(k) + γ1 log(ti), so that each component has the same slope but a different intercept.
Six parameters, Θ = (w1, w2, γ0(1), γ0(2), γ0(3), γ1).
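A matching sketch of the Binomial mixture log-likelihood, with the common slope γ1 and component-specific intercepts as above.

```python
# Auxiliary three-component Binomial mixture log-likelihood.
import numpy as np
from scipy.special import gammaln, logsumexp

def bm_loglike(params, m, l, t):
    """params = (w1, w2, g01, g02, g03, g1); w3 = 1 - w1 - w2."""
    w1, w2, g01, g02, g03, g1 = params
    w = np.array([w1, w2, 1.0 - w1 - w2])
    lc = gammaln(l + 1) - gammaln(m + 1) - gammaln(l - m + 1)
    comp = []
    for g0 in (g01, g02, g03):       # same slope g1, different intercepts
        theta = 1.0 / (1.0 + np.exp(-(g0 + g1 * np.log(t))))
        comp.append(m * np.log(theta) + (l - m) * np.log1p(-theta))
    comp = np.stack(comp)            # shape (3, n): component log-pmfs
    return np.sum(lc + logsumexp(comp + np.log(w)[:, None], axis=0))
```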
ABC Summary Statistics from auxiliary models
How to choose between different auxiliary models?
Use data analytic tools on the original data set
Try each model/summary statistic (Nunes and Balding, 2010)
The former is far less computer intensive, but does it give the best ABC approximation?
Fit to data
Relative measure: Beta-Binomial better than Binomial mixture by 170 on the log-likelihood scale, with one fewer parameter
Absolute measure: Generalized Pearson statistics
Beta-Binomial: 213.2 on 207 d.o.f.
Binomial mixture: 278.9 on 206 d.o.f.
Plots suggest the Binomial mixture does not capture the variability of the data
Residual analysis also favours the Beta-Binomial
SMC Algorithm Settings and Posterior Results
Algorithm Settings
Take N = 1000 particles, start with 60% acceptance, discard half the particles each iteration, finish with 3% acceptance for the MCMC kernel.
Repeat the MCMC kernel so that roughly 99% of particles are moved at each iteration
Fixed values γ = 0.04, µM = 0.0015
Prior choices:
ν: U(0,1)
µL: U(0,1)
µI: U(0,2)
β: U(0,2)
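A condensed sketch of the replenishment-style SMC ABC that these settings describe, in the spirit of Drovandi and Pettitt (2011); prior_sample, prior_logpdf, simulate and dist are user-supplied placeholders, and a single MCMC pass stands in for the adaptively repeated kernel mentioned above.

```python
# Replenishment SMC ABC sketch: drop the worst half of the particles
# each iteration, lower the tolerance, and repair the sample with an
# ABC-MCMC move kernel. dist(x) = rho(y_obs, x).
import numpy as np

def smc_abc(N, eps_target, prior_sample, prior_logpdf, simulate, dist,
            rng=None):
    rng = np.random.default_rng() if rng is None else rng
    thetas = np.array([prior_sample() for _ in range(N)])   # (N, d)
    dists = np.array([dist(simulate(th)) for th in thetas])
    while True:
        order = np.argsort(dists)
        thetas, dists = thetas[order], dists[order]
        keep = N // 2
        eps = dists[keep - 1]          # new tolerance = worst kept distance
        if eps <= eps_target:
            return thetas[:keep], eps
        # replenish: resample the dropped half from the survivors
        idx = rng.integers(0, keep, size=N - keep)
        thetas[keep:], dists[keep:] = thetas[idx], dists[idx]
        # diversify duplicates with an ABC-MCMC kernel tuned to particles
        cov = np.cov(thetas[:keep], rowvar=False)
        for i in range(keep, N):
            prop = rng.multivariate_normal(thetas[i], cov)
            d = dist(simulate(prop))
            log_mh = prior_logpdf(prop) - prior_logpdf(thetas[i])
            if d <= eps and np.log(rng.uniform()) < log_mh:
                thetas[i], dists[i] = prop, d
        # (the talk repeats this kernel until each particle moves with
        #  probability ~0.99; a single pass is shown here for brevity)
```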
Posterior Densities
Posteriors Beta Binomial vs Binomial mixture
[Figure: marginal posterior densities of ν, µI, µL and β under the two auxiliary models]
Bivariate Posteriors
Posteriors Beta Binomial vs Binomial mixture
[Figure: bivariate posteriors of (ν, µI), (µL, µI), (µL, β) and (µI, β); green is BM, red is BB]
Posterior Density Differences
For ν the BB posterior is more concentrated than BM, with similar modes
For µL, similar concentration but very different modes: 0.0055 vs 0.0244.
Using the Nunes-Balding criterion for summary statistics would prefer BB to BM.
But what are the predictions from each ABC fitted model?
ABC Fit to data
95 percent prediction intervals from the stochastic model using ABC modal estimates: with the Beta-Binomial (left) and the Binomial mixture (right) auxiliary models.
[Figure: 95% prediction intervals for the proportion of matures versus autopsy time; Beta-Binomial auxiliary model (left), Binomial mixture auxiliary model (right)]
The ABC model fit based on the Beta-Binomial auxiliary model captures the variability of the data
Summary of Method
Advantages
Can be applied in non-iid settings
Appropriateness of the statistics can be assessed via data analytic techniques before the ABC analysis
Statistics can be complicated functions of the data
Disadvantages
Requires likelihood maximisation of the auxiliary model for each simulated dataset
References
Drovandi et al (2011). JRSS C 60, 317-337.
Drovandi and Pettitt (2011). Biometrics 67, 225-233.
Riley et al (2003). J. Theoret. Biol. 225, 419-430.
Heggland and Frigessi (2004). JRSS B 66, 447-462.
Nunes and Balding (2010). Stat. Appl. Genet. Mol. Biol. 9(1).