value of information calculations in r

Value of information calculations in R

Howard Thom with

Wei Fang, Zhenru Wang, Mike Giles, Chris Jackson, Mike Giles,

Christoph Andrieu and Nicky Welton

UCL, 11th July 2018

School of Social and Community Medicine

Overview• Recall the formula for the expected value of partial perfect information (EVPPI)

𝐸𝑉𝑃𝑃𝐼 𝑋 = 𝐸𝑋 max𝑑∈𝐷𝐸𝑌|𝑋 𝑁𝐵𝑑 𝑋, 𝑌 − max

𝑑∈𝐷𝐸𝑋,𝑌 𝑁𝐵𝑑 𝑋, 𝑌

• EVPPI can be estimated (in principle) by nested Monte Carlo simulation:

1

𝑁

𝑛=1

𝑁

max𝑑∈𝐷

1

𝑀

𝑚=1

𝑀

𝑁𝐵𝑑 𝑋(𝑛), 𝑌(𝑚,𝑛) −max

𝑑∈𝐷

1

𝑀𝑁

𝑛=1

𝑁

𝑚=1

𝑀

𝑁𝐵𝑑 𝑋(𝑛), 𝑌(𝑚,𝑛)

• This requires 𝑁 samples of 𝑋(𝑛) and M samples 𝑌(𝑚,𝑛) conditional on each 𝑋(𝑛)

• This nested simulation is biased due to Jensen’s inequality and computationally intensive.

• Near impossible to estimate using Excel; possible in R but may struggle for complex models

• R can parallelise computation to increase efficiency: run each sample from 𝑋(𝑛) on separate CPU.• Package ‘parallel’ in R.

• Can also use meta-modelling techniques, some of which have been implemented in easy-to-use R

packages or online servies.

• We propose using efficient Monte Carlo sampling schemes for efficient and accurate estimation

Meta-modelling approaches

• Consider the first term of the EVPPI

𝐸𝑉𝑃𝑃𝐼 𝑋 = 𝐸𝑋 max𝑑∈𝐷𝐸𝑌|𝑋 𝑁𝐵𝑑 𝑋, 𝑌 − max

𝑑∈𝐷𝐸𝑋,𝑌 𝑁𝐵𝑑 𝑋, 𝑌

• We can remove the nested simulation by regressing 𝑁𝐵𝑑 𝑋, 𝑌 on 𝑋𝑁𝐵𝑑 𝑋, 𝑌 = 𝐸𝑌|𝑋 𝑁𝐵𝑑 𝑋, 𝑌 + 휀

𝑁𝐵𝑑 𝑋, 𝑌 = 𝑔𝑑(𝑋) + 휀

• Where 𝑔𝑑(𝑋) is some unknown smooth function and 𝐸 휀 = 0

• The fitted values 𝑔𝑑 𝑋 are estimates of 𝐸𝑌|𝑋 𝑁𝐵𝑑 𝑋, 𝑌

• EVPPI now requires only a single loop

𝐸𝑉𝑃𝑃𝐼 𝑋 ≈1

𝐾

𝑘=1

𝐾

max𝑑 𝑔𝑑 𝑋(𝑘) −max

𝑑

1

𝐾

𝑘=1

𝐾

𝑔𝑑 𝑋(𝑘)

• Need to choose a 𝑔𝑑() that represents well how 𝑁𝐵𝑑 depends on 𝑋, 𝑌, 𝑁𝐵𝑑• Estimating 𝑔𝑑() itself may be computationally challenging

Generalised additive model• The generalised additive model (GAM) is commonly used to estimate EVPPI for sets of up to

5 parameters.

• Assuming 𝑠(𝑋) is a smooth function, the GAM is

𝑁𝐵𝑑𝑘= 𝑠 𝑋 𝑘 + 휀(𝑘)

• Where the function 𝑠(𝑋) is a sum of known ‘basis’ functions

𝑠 𝑋 =

𝑛=1

𝑁

𝛾𝑛𝑏𝑛(𝑋)

• The 𝛾𝑛 are parameters estimated from the samples of 𝑁𝐵𝑑 and 𝑋.

• In higher dimensions would need to specify a multivariate smooth function

𝑁𝐵𝑑 = 𝑠 𝑋1, … , 𝑋𝑞 + 휀

• Would need 𝑁𝑞 basis functions where 𝑁 is number of basis functions needed for each

dimension.

• Computationally infeasible if number of independent variables greater than 5 or 6

• GAM are implemented in the BCEA for R package and the SAVI website.

Gaussian process regression• A further option implemented in BCEA and SAVI is Gaussian process (GP) regression

• The regression model is

𝑁𝐵𝑑 = 𝑔𝑑(𝑿𝑖) + 휀

with 휀~𝑁(0, 𝜎2)

• The 𝑿𝑖 are the set of 𝑞 focal parameters

• Now the function 𝑔𝑑(𝑿𝑖) is a GP which is equivalent to

𝑁𝐵(𝑋)~𝑁𝑜𝑟𝑚𝑎𝑙(𝑯𝛽, 𝜎2𝜮(𝛿) + ν𝑰)

• Where

𝑯 =

1 𝑋1(1)

1 𝑋1(2)

⋯ 𝑋𝑞(1)

⋯ 𝑋𝑞(2)

⋮ ⋮

1 𝑋1(𝑁)⋱ ⋮

⋯ 𝑋𝑞(𝑁)

• Hyperparameters are pre-specified by the SAVI software, which may adversely affect its performance.

• R code available that can better estimate the hyperparameters and conduct GP regression.

• GP involves inverting an 𝑁 × 𝑁 matrix at 𝑂(𝑁3) computational cost.

• The dimension of the focal parameters 𝑿𝑖 also influences the cost.

SPDE-INLA• The BCEA package includes a highly efficient adaptation of the GP regression method.

• Adapts ideas from spatial statistics for approximation of high dimensional with low dimensional

problems

• Approximates high dimensional GP with low dimensional GP that is solution to a form of SPDE.

• This changes task from inverting dense 𝑁 ×𝑁 matrix at 𝑂(𝑁3) cost to inverting sparse 𝑁 ×𝑁 matrix at 𝑂(𝑁 32) cost

• SPDE is latent Gaussian model, efficiently solved by integrated nested Laplace approximation (INLA)• BCEA uses the ‘R-INLA’ package in R.

• SPDE approach doesn’t work in high dimensional parameter space.

• BCEA therefore uses principal fitted components projection to reduce to 2-dimensions.• BCEA uses the ‘ldr’ package in R.

• Syntax in R is relatively simple:

evppi.inla<-evppi(param.indices,all.param.samples,bcea.model.object,method="INLA")

Multilevel Monte-Carlo• Key idea is to choose an estimator 𝐸𝑉𝑃𝑃𝐼0 with a larger bias but much lower cost than 𝐸𝑉𝑃𝑃𝐼.

• We can then construct the EVPPI estimator

1

𝑁0

𝑛=1

𝑁0

𝐸𝑉𝑃𝑃𝐼0(𝑛)+1

𝑁1

𝑛=1

𝑁1

𝐸𝑉𝑃𝑃𝐼(𝑛) − 𝐸𝑉𝑃𝑃𝐼0(𝑛)

• We then need only estimate the (cheap) 𝐸𝑉𝑃𝑃𝐼0 and the difference 𝐸𝑉𝑃𝑃𝐼 − 𝐸𝑉𝑃𝑃𝐼0.

• So long as 𝐸𝑉𝑃𝑃𝐼 and 𝐸𝑉𝑃𝑃𝐼0 are highly correlated the variance will be small and number of samples

𝑁1 for accurate estimation will be small.

• The variance of 𝐸𝑉𝑃𝑃𝐼0 is close to that of 𝐸𝑉𝑃𝑃𝐼 so 𝑁0 is similar to number needed to estimate 𝐸𝑉𝑃𝑃𝐼

• This extends naturally via a sequence of estimators 𝐸𝑉𝑃𝑃𝐼0, 𝐸𝑉𝑃𝑃𝐼1, …, 𝐸𝑉𝑃𝑃𝐼𝐿 to the estimator

1

𝑁0

𝑛=1

𝑁0

𝐸𝑉𝑃𝑃𝐼0(0,𝑛)+

𝑙=1

𝐿1

𝑁𝑙 𝐸𝑉𝑃𝑃𝐼(𝑙,𝑛) − 𝐸𝑉𝑃𝑃𝐼0

(𝑙,𝑛)

MLMC for EVPPI• In practice it is more efficient o estimate the quantity

𝐷𝐼𝐹𝐹 = 𝐸𝑉𝑃𝐼 − 𝐸𝑉𝑃𝑃𝐼

• The MLMC estimator using 𝑁𝑙 outer samples for each level 𝑙 is then

𝐷𝐼𝐹𝐹 =

𝑙=1

𝐿1

𝑁𝑙

𝑛=1

𝑁𝑙

𝐷𝐼𝐹𝐹𝑙(𝑙,𝑛)− 𝐷𝐼𝐹𝐹𝑙−1

(𝑙,𝑛)

• Unlike NMC, MLMC provides an estimate of the estimation bias

𝐸 𝐷𝐼𝐹𝐹𝐿+1 − 𝐷𝐼𝐹𝐹𝐿 ~

𝑙=𝐿+1

∞

𝐸 𝐷𝐼𝐹𝐹𝑙 − 𝐷𝐼𝐹𝐹𝑙−1 = 𝐷𝐼𝐹𝐹 − 𝐸 𝐷𝐼𝐹𝐹𝐿

• Number of levels 𝐿 is chosen to achieve bias ≤ 0.5휀 and 𝑁𝑙 chosen to achieve variance ≤ 0.75휀2.

• This achieves Mean Square Error (MSE) 휀2 at a cost which is only 𝑂 휀−2

• Nested Monte-Carlo would require 𝑂 휀−3

• Package ‘mlmc’ exits in R and forthcoming publication will include example code.

Quasi Monte Carlo

• Assuming 𝑔 is a real-valued function and 𝑋 is a random variable with cumulative distribution 𝐹, we

estimate 𝐸 𝑔(𝑋) by

1

𝑁

𝑛=1

𝑁

𝑔 𝑋𝑖

where 𝑋𝑖 = 𝐹−1 𝑈𝑖 and 𝑈𝑖 are uniformly distributed in 0,1 .

• Standard MC chooses 𝑈𝑖 for 𝑋𝑖 randomly.

• QMC chooses 𝑈𝑖 for 𝑋𝑖 systematically.

• QMC is deterministic and achieves a better convergence rate of the error than MC

• For EVPPI, only apply QMC to outer samples N as not many inner samples 𝑀 are needed for

accuracy.

• A confidence interval for estimated EVPPI can be provided using randomized QMC.

• Randomized QMC is unbiased and has much lower variance than NMC.

Quasi Monte Carlo

• Example of QMC uniform random number generation in 2-dimensions for 256 points using using rank-1

lattice points or a Sobol sequence.

Depression toy example

• Artificial example comparing no-treatment with CBT and antidepressants.

• Decision tree model follows structure used in NICE CG90 guidelines

• Analysed network meta-analysis data using Markov Chain Monte Carlo (MCMC)

Depression toy example - results

• MC, MLMC, and QMC based on 50,000 samples; GAM, GP, and INLA on 7500 to account for

additional computation time for regression.

• MLMC did not offer reliable computational savings over MC.

• QMC offered substantial computational savings on larger EVPPI values.

• GAM and INLA perform well on small EVPPI values.

ParametersMC reference

(RMSE)MC QMC MLMC GAM (SE) GP (SE) INLA

Probabilities

(6)275.71 (0.5) 273.62 (5.19) 290.20 (3.03) 274.84 (12.00) NA 332.84 (44.43) 293.44

Costs and

QALYs (6)287.29 (0.5) 286.86 (5.46) 286.93 (4.58) 285.13 (9.80) NA 281.50 (29.92) 547.42

CBT lor (2) 7.35 (0.5) 29.53 (20.47) 44.65 (20.44) 22.03 (19.44) 11.01 (6.46) NA 13.83

Antidepressan

t lor (2)1.78 (0.25) .28 (18.73) 5.12 (16.58) 4.10 (16.00) 2.26 (6.51) NA 4.81

Diagnostic plots from INLA

• INLA gave unreliable estimates for the cost and utilities parameter set, but diagnostic plots alert us.

• These are generated using

diag.evppi(x=evppi.inla.lor.anti,y=he.depression)

diag.evppi(x=evppi.inla.lor.anti,y=he.depression,diag="qqplot")

• Not clear what the problem is: the model structure is a simple decision tree and these (log) cost and

utilities parameters follow independent normal distributions.

DOACs for atrial fibrillation

• 17 health states plus two

transient events (TIA and

SE)

• 5 treatment options:

coumarin, apixaban,

dabigatran, rivaroxaban,

edoxaban.

• Competing risks NMA fit

using MCMC (35-dim)

• Baseline hazards (7-dim)

and hazard ratios relative to

no treatment (7-dim) also

used MCMC

• Impact of previous events on

future risks, cost and utility

parameters bring total to 99.

AF Well

MI

Stroke

Major Bleed

ICH

B+S

ICH+S

MI+S

B+MI

B+MI+S

B+ICH+S S+B

+MI+ICH De

ad

ICH+MI

B+ICH+MI

ICH+MI+S

B+ICH

DOACs model results

• MC, MLMC, and QMC cost are 105 samples.

• INLA and GP based on only 7500 samples (max accepted by SAVI)

• GAM cannot be used as parameter sets too large.

• INLA and GP did not give accurate results with small number of simulations.

• Only (parallel computationally intensive) nested MC, MLMC and QMC provided reliable results…

Parameters

(N)

Reference

value (RMSE)MC cost MLMC cost QMC cost GP (SE) INLA

Simple trial

(14)196.69 (1.23) 2144 68.63 - 561.26 (68.22) 741.09

Complex trial

(79)273.33 (1.23) 157 33.78 89.54 869.79 (69.87) -

All MCMC

(41)348.23 (1.23) 139.9 24.47 31.72 782.38 (73.95) 256.56

Loghr MCMC

(28)286.48 (1.23) 317.1 31.85 21.21 978.03 (94.67) 551.98

Conclusions

• Wealth of options for estimating EVPPI in R

• For simple models (decision trees, low-state Markov) and low dimensional

parameter sets, SAVI and BCEA can be efficient and reliable.

• In more complex situations, necessary to use parallel computing Monte Carlo

or advanced sampling schemes.

• However, these are more challenging to implement and there are no black-

box functions similar to BCEA or SAVI.

• Parallel computing and efficient sampling schemes are near impossible to

implement in Excel.

Acknowledgements• Hubs for Trials Methodology Research (HTMR), Network Grant N79, Efficient sample schemes for

estimation of value of information of future research.

• The HTMR Collaboration and innovation in Difficult and Complex randomised controlled Trials In

Invasive procedures (ConDuCT-II) hub provided support for this research.

• Funding for the DOACs model was provided by NIHR HTA grant 11/92/17

• This study was supported by the NIHR Biomedical Research Centre at the University Hospitals Bristol

NHS Foundation Trust and the University of Bristol. The views expressed in this presentation are those

of the author(s) and not necessarily those of the NHS, the National Institute for Health Research or

the Department of Health.

MLMC for EVPPI

• To apply MLMC to EVPPI we instead estimate the quantity

𝐷𝐼𝐹𝐹 = 𝐸𝑉𝑃𝐼 − 𝐸𝑉𝑃𝑃𝐼 = 𝐸𝑋,𝑌 max𝑑∈𝐷𝑓(𝑋, 𝑌) − 𝐸𝑋 max

𝑑∈𝐷𝐸𝑌|𝑋 𝑓𝑑 𝑋, 𝑌

• The NMC estimator with 2𝑙 inner samples based on 𝑋 𝑛 is

𝐷𝐼𝐹𝐹𝑙 =1

2𝑙

𝑚=1

2𝑙

max𝑑∈𝐷𝑓𝑑 𝑋(𝑛), 𝑌(𝑛,𝑚) −max

𝑑∈𝐷

1

2𝑙

𝑚=1

2𝑙

𝑓𝑑 𝑋(𝑛), 𝑌(𝑛,𝑚)

• MLMC generates a sequence of differences 𝐷𝐼𝐹𝐹𝑙 − 𝐷𝐼𝐹𝐹𝑙−1 and estimates the telescoping sum

𝐸 𝐷𝐼𝐹𝐹𝐿 = 𝐸 𝐷𝐼𝐹𝐹0 +

𝑙=1

𝐿

𝐸 𝐷𝐼𝐹𝐹𝑙 − 𝐷𝐼𝐹𝐹𝑙−1

• For low 𝑙, 𝐷𝐼𝐹𝐹𝑙 − 𝐷𝐼𝐹𝐹𝑙−1 is cheap to evaluate, for high 𝑙 it has a low variance so few samples are

needed to estimate its expectation

value of information calculations in r

Documents