
Accounting for Model Uncertainty via MCMC

Ricardo Sandes Ehlers

Departamento de Estatística

Universidade Federal do Paraná

http://www.est.ufpr.br/~ehlers


8th Brazilian Meeting on Bayesian Statistics March 26-29, 2006


Examples

• choice of explanatory variables in regression (and its extensions);

• order selection in polynomial models;

• order specification in (S)ARIMA and regime switching time series models;

“The number of things that you don’t know is one of the things that you don’t know”.


Model Selection

In real problems, model choice is typically subjective, resulting from a combination of factors like

• quantitative measures,

• personal experience and

• costs.

Here we will talk about quantitative criteria.



Bayesian Inference

Bayesian inference is based on Bayes' theorem:

π(θ) ∝ L(y|θ) p(θ)

y: observed data and

θ: model parameters.

In words

posterior distribution ∝ likelihood × prior distribution.

We want to make inferences about a function g(θ) by computing its posterior mean

$$E_\pi[g(\theta)] = \int g(\theta)\,\pi(\theta)\,d\theta,$$

which is typically analytically intractable.



Markov chain Monte Carlo

We want to be able to generate θ1, . . . , θm ∼ π(θ) (the target distribution) by defining the transition densities P(θt, θt+1) of a Markov chain.

Given the realizations {θ(t), t = 0, 1, . . . } of a Markov chain that has π as its equilibrium distribution then, under certain conditions,

$$\theta^{(t)} \xrightarrow{\;t\to\infty\;} \pi(\theta) \quad\text{and}\quad \frac{1}{n}\sum_{t=1}^{n} g(\theta_i^{(t)}) \xrightarrow{\;n\to\infty\;} E_\pi(g(\theta_i)) \quad\text{a.s.}$$

Although the chain values are dependent by construction, their arithmetic mean is a consistent estimator of the theoretical mean.



The Metropolis-Hastings Algorithm

Starting at θ0, at each iteration t = 0, 1, 2, . . .

1. sample a candidate value φ ∼ q(·|θt).

2. sample u ∼ U(0, 1) and if

$$u \le \min\left\{1,\ \frac{\pi(\phi)}{\pi(\theta_t)}\,\frac{q(\theta_t|\phi)}{q(\phi|\theta_t)}\right\}$$

set θt+1 = φ, otherwise set θt+1 = θt.

q is arbitrary, but in practice common choices are

• q(φ|θt) = q(φ) (independence sampler),

• q(φ|θt) = q(|φ − θt|) (random walk),

• q(φ|θt) symmetric, i.e. q(φ|θt) = q(θt|φ), so that the proposal ratio cancels.
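The two steps above can be sketched in a few lines. This is a minimal illustration rather than code from the talk: it assumes a symmetric Gaussian random-walk proposal (so the q terms cancel) and a made-up one-dimensional target, N(3, 2²), known only up to a constant. The names `metropolis_hastings`, `log_target` and `step` are hypothetical.

```python
import math
import random


def metropolis_hastings(log_target, theta0, n_iter, step=1.0, seed=0):
    """Random-walk Metropolis: symmetric Gaussian proposal, so q cancels."""
    rng = random.Random(seed)
    theta = theta0
    log_p = log_target(theta)
    chain = []
    accepted = 0
    for _ in range(n_iter):
        phi = theta + rng.gauss(0.0, step)        # candidate phi ~ q(.|theta_t)
        log_p_phi = log_target(phi)
        # accept if u <= min(1, pi(phi)/pi(theta_t)), compared on the log scale
        if math.log(1.0 - rng.random()) <= log_p_phi - log_p:
            theta, log_p = phi, log_p_phi
            accepted += 1
        chain.append(theta)                       # theta_{t+1}
    return chain, accepted / n_iter


def log_target(theta):
    """Illustrative target (assumption): N(3, 2^2) up to an additive constant."""
    return -0.5 * ((theta - 3.0) / 2.0) ** 2


chain, acc_rate = metropolis_hastings(log_target, theta0=0.0, n_iter=50_000, step=2.5)
kept = chain[5_000:]                              # discard burn-in
post_mean = sum(kept) / len(kept)
post_var = sum((x - post_mean) ** 2 for x in kept) / len(kept)
print(f"acceptance rate {acc_rate:.2f}, mean {post_mean:.2f}, variance {post_var:.2f}")
```

Working with log densities avoids underflow when π is a product of many likelihood terms; only ratios of π are needed, so the normalizing constant never has to be computed.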



The Gibbs Sampler

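The transition kernel is formed by the complete conditional distributions

$$\pi(\theta_i|\theta_{-i}) = \frac{\pi(\theta)}{\int \pi(\theta)\,d\theta_i}, \qquad \theta_{-i} = (\theta_1,\dots,\theta_{i-1},\theta_{i+1},\dots,\theta_d)'.$$

At each iteration we obtain a new value θ′ by generating

$$\begin{aligned}
\theta'_1 &\sim \pi(\theta_1|\theta_2,\theta_3,\dots,\theta_d)\\
\theta'_2 &\sim \pi(\theta_2|\theta'_1,\theta_3,\dots,\theta_d)\\
&\;\;\vdots\\
\theta'_d &\sim \pi(\theta_d|\theta'_1,\theta'_2,\dots,\theta'_{d-1})
\end{aligned}$$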
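As a small illustration of this updating scheme, the sketch below runs a Gibbs sampler on a target chosen purely for convenience (an assumption, not an example from the talk): a standard bivariate normal with correlation ρ, whose complete conditionals are θ1|θ2 ∼ N(ρθ2, 1 − ρ²) and θ2|θ1 ∼ N(ρθ1, 1 − ρ²).

```python
import math
import random


def gibbs_bivariate_normal(rho, n_iter, seed=0):
    """Gibbs sampler for a standard bivariate normal with correlation rho."""
    rng = random.Random(seed)
    sd = math.sqrt(1.0 - rho ** 2)           # conditional standard deviation
    th1, th2 = 0.0, 0.0                      # arbitrary starting point
    draws = []
    for _ in range(n_iter):
        th1 = rng.gauss(rho * th2, sd)       # theta1' ~ pi(theta1 | theta2)
        th2 = rng.gauss(rho * th1, sd)       # theta2' ~ pi(theta2 | theta1')
        draws.append((th1, th2))
    return draws


draws = gibbs_bivariate_normal(rho=0.8, n_iter=20_000)
kept = draws[1_000:]                         # discard burn-in
n = len(kept)
m1 = sum(a for a, _ in kept) / n
m2 = sum(b for _, b in kept) / n
v1 = sum((a - m1) ** 2 for a, _ in kept) / n
v2 = sum((b - m2) ** 2 for _, b in kept) / n
corr = sum((a - m1) * (b - m2) for a, b in kept) / n / math.sqrt(v1 * v2)
print(f"means ({m1:.2f}, {m2:.2f}), sample correlation {corr:.2f}")
```

Each component is updated using the most recent values of the others, exactly as in the display above; no accept/reject step is needed because every draw comes from a complete conditional.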



The Toolbox

BUGS (Bayesian inference Using Gibbs Sampling): WinBUGS, GeoBUGS, PKBUGS, OpenBUGS (http://www.mrc-bsu.cam.ac.uk/bugs/).

You can run WinBUGS on Linux! (http://www.est.ufpr.br/dicas)

JAGS (Just Another Gibbs Sampler) (www-fis.iarc.fr/~martyn/software/jags).

BOA (Bayesian Output Analysis Program).

CODA (Convergence Diagnostics and Output Analysis).

R: R-CODA, mcmc, MCMCpack, bayesSurv, bayesm (http://www.est.ufpr.br/R).

More specific programs:

Nmix (Fortran): Bayesian analysis of univariate normal mixtures, implementing Richardson and Green (1997).

AutoRJ (Fortran): automatic RJMCMC (Green, 2003).

Updated information: MCMC Preprint Service (http://www.statslab.cam.ac.uk/~mcmc).


Searching for the “Best” Model(s)

Suppose that the number M of alternative models is quite large.

E.g. a linear model with 19 possible covariates: 2^19 = 524288 alternative models (with no interactions).

Enumerating all possible models, estimating each one and attaching a measure of fit and parsimony to it may not be the best strategy.

How to compare competing models?

How to make averaged inferences using the competing models (or a subset of them)?



Bayesian Approach

Suppose we have k different models M1, . . . , Mk.

A priori we assign probabilities p(Mi) to each model.

For each model there is a vector of parameters θi ∈ R^{ni} with:

• a likelihood given the observations p(y|θi, Mi)

• a prior distribution p(θi|Mi).

We obtain the joint posterior distribution of the models and their associated parameters via Bayes' theorem,

π(Mi, θi) ∝ p(y|θi, Mi) p(θi|Mi) p(Mi).

Posterior model probabilities are then estimated by obtaining a sample from this posterior and counting the number of times each model is visited by the chain.



Bayesian Model Averaging

Hoeting, Madigan, Raftery and Volinsky (1999) Statistical Science 14, 382-401

Let M denote the set that indexes all entertained models. Assume that ∆ is an outcome of interest well defined across models (e.g. a future value y_{t+k}).

The posterior distribution for ∆ is

$$p(\Delta|y) = \sum_{i} p(\Delta|M_i, y)\,p(M_i|y)$$

for data y and posterior model probability

$$p(M_i|y) = \frac{p(y|M_i)\,p(M_i)}{p(y)}$$

where

$$p(y|M_i) = \int p(y|\theta_i, M_i)\,p(\theta_i|M_i)\,d\theta_i.$$
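Once the marginal likelihoods p(y|Mi) are available, the averaging itself is simple arithmetic. The sketch below assumes three models with made-up log marginal likelihoods, equal prior probabilities and made-up model-specific predictive means E(∆|Mi, y); none of these numbers come from the talk, and `posterior_model_probs` is a hypothetical helper name.

```python
import math


def posterior_model_probs(log_marglik, prior):
    """p(Mi|y) proportional to p(y|Mi) p(Mi), normalized on the log scale."""
    logs = [lm + math.log(p) for lm, p in zip(log_marglik, prior)]
    top = max(logs)                                   # guard against underflow
    unnorm = [math.exp(v - top) for v in logs]
    total = sum(unnorm)                               # plays the role of p(y)
    return [u / total for u in unnorm]


# Hypothetical inputs (assumptions for illustration only).
log_marglik = [-104.2, -101.7, -102.3]                # log p(y|Mi)
prior = [1 / 3, 1 / 3, 1 / 3]                         # p(Mi)
pred_mean = [1.10, 1.45, 1.30]                        # E(Delta | Mi, y)

probs = posterior_model_probs(log_marglik, prior)
bma_mean = sum(p * m for p, m in zip(probs, pred_mean))   # E(Delta | y)
print([round(p, 3) for p in probs], round(bma_mean, 3))
```

The hard part in practice is the integral defining p(y|Mi), which is why the later slides estimate p(Mi|y) from the proportion of visits of a trans-dimensional chain instead.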



Classical Approach

Model discrimination is based on comparison of information criteria, e.g.

Akaike (1974): AIC(θi, Mi) = −2 log p(y|θi, Mi) + 2ni

Schwarz (1978): BIC(θi, Mi) = −2 log p(y|θi, Mi) + ni log T, where T is the sample size.

Ideally we should compare AIC (or BIC) weights

$$w_i \propto \exp(-\mathrm{AIC}(\theta_i, M_i)/2) \propto p(y|\theta_i, M_i)\,\exp(-n_i).$$

The weights wi are proportional to the posterior model probabilities (in θi) with priors

p(θi|Mi) ∝ constant and p(Mi) ∝ e^{−ni}.



Deviance Information Criterion

Spiegelhalter, Best, Carlin, and van der Linde (2002)

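$$\mathrm{DIC}(\theta_i, M_i) = -2\log p(y|\bar\theta_i, M_i) + 2p_D = \bar D + p_D$$

where D(θi) = −2 log p(y|θi, Mi) is the deviance, θ̄i = E(θi|y), D̄ = E(D(θi)|y) and pD = D̄ − D(θ̄i).

DIC is easily calculated during the chain simulations. If θi^1, . . . , θi^m is a sample from π(θi) then

$$\bar D \approx \frac{1}{m}\sum_{k=1}^{m} D(\theta_i^k) \quad\text{and}\quad D(\bar\theta_i) \approx D\left(\frac{1}{m}\sum_{k=1}^{m}\theta_i^k\right).$$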

WinBUGS version 1.4 computes DIC automatically.
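For readers computing DIC outside WinBUGS, the sample-based approximations above translate directly into code. The sketch below assumes a toy model that is not from the talk: yi ∼ N(θ, 1) with a flat prior, so the posterior is N(ȳ, 1/n) and can be sampled directly in place of MCMC output. For this model pD should come out close to 1, the number of free parameters.

```python
import math
import random


def deviance(theta, y):
    """D(theta) = -2 log p(y|theta) for y_i ~ N(theta, 1)."""
    return len(y) * math.log(2.0 * math.pi) + sum((yi - theta) ** 2 for yi in y)


rng = random.Random(1)
y = [rng.gauss(0.5, 1.0) for _ in range(30)]          # simulated data (assumption)
n = len(y)
ybar = sum(y) / n

# Stand-in for MCMC output: direct draws from the posterior N(ybar, 1/n).
samples = [rng.gauss(ybar, 1.0 / math.sqrt(n)) for _ in range(20_000)]

d_bar = sum(deviance(t, y) for t in samples) / len(samples)   # posterior mean deviance
theta_bar = sum(samples) / len(samples)                       # posterior mean of theta
d_at_mean = deviance(theta_bar, y)                            # deviance at posterior mean
p_d = d_bar - d_at_mean                                       # effective number of parameters
dic = d_bar + p_d
print(f"pD = {p_d:.3f}, DIC = {dic:.3f}")
```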

Gelfand and Ghosh (1998): utility rather than probability to guide model choice,

$$D_\gamma = \frac{\gamma}{\gamma+1}\sum_{i=1}^{n}(\mu_i - y_{i,\mathrm{obs}})^2 + \sum_{i=1}^{n}\sigma_i^2.$$



Example: Mapping Homicide Rates in Curitiba City

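Silva, Mota, and Ehlers (2004). Data: number of homicides in 2000 by district of Curitiba city,

$$Y_i \mid e_i, \psi_i \sim \mathrm{Poisson}(e_i\psi_i), \quad i = 1,\dots,n.$$

Model: hierarchical Bayes with a spatial component,

$$\psi_i = \exp(X_i'\beta + \theta_i + \phi_i),$$

with covariate effects of median income, illiteracy and households at risk.

Priors:

$$\phi_i \mid \phi_j,\ j\neq i \;\sim\; N\left(\frac{\sum_{j\in\delta_i} w_{ij}\phi_j}{\sum_{j\in\delta_i} w_{ij}},\ \frac{1}{\tau_\phi\sum_{j\in\delta_i} w_{ij}}\right) \tag{1}$$

$$\begin{aligned}
\beta &\sim N(0, I\sigma_\beta^2)\\
\beta_0 &\sim U(-\infty,\infty)\\
\tau_\phi &\sim \Gamma(0.5, 0.0005)\\
\tau_\theta &\sim \Gamma(0.5, 0.0005)\\
\theta_i &\sim N(0, 1/\tau_\theta)
\end{aligned}$$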



Table 1: DIC values and normalized DIC weights for each model from WinBUGS with 30000 simulations.

Model                          pD       DIC       Weight
No spatial effect              43.728   274.556   0.0000
No covariates                  33.292   261.888   0.0062
Illiteracy                     18.978   252.448   0.6936
Household                      63.434   292.265   0.0000
Income                         29.812   259.717   0.0183
Income+Illiteracy              20.632   254.249   0.2819
Illiteracy+Household           61.703   285.940   0.0000
Income+Household               63.323   291.764   0.0000
Income+Illiteracy+Household    57.467   286.244   0.0000

We apply the Occam’s razor principle (Madigan and Raftery, 1994) and use the more parsimonious model.
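The weight column of Table 1 appears to be consistent with weights proportional to exp(−DIC/2), normalized to sum to one, by analogy with the AIC weights defined earlier. The sketch below recomputes the weights under that assumption from the DIC values in the table; `ic_weights` is a hypothetical helper name.

```python
import math

# DIC values copied from Table 1.
dic = {
    "No spatial effect": 274.556,
    "No covariates": 261.888,
    "Illiteracy": 252.448,
    "Household": 292.265,
    "Income": 259.717,
    "Income+Illiteracy": 254.249,
    "Illiteracy+Household": 285.940,
    "Income+Household": 291.764,
    "Income+Illiteracy+Household": 286.244,
}


def ic_weights(values):
    """Normalized weights proportional to exp(-IC/2)."""
    best = min(values)                                   # subtract the minimum for stability
    raw = [math.exp(-(v - best) / 2.0) for v in values]
    total = sum(raw)
    return [r / total for r in raw]


weights = dict(zip(dic, ic_weights(list(dic.values()))))
for name, w in weights.items():
    print(f"{name:30s} {w:.4f}")
```

Subtracting the smallest criterion value before exponentiating leaves the normalized weights unchanged and avoids underflow when the criteria are large.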



Figure 1: Maps of (a) relative risk point estimates and (b) posterior standard deviations in the Bayesian hierarchical model, by district of Curitiba. Legend classes: (a) 0.24-0.3, 0.3-0.5, 0.5-0.7, 0.7-1.31, 1.31-7.4; (b) 0.08-0.12, 0.12-0.17, 0.17-0.22, 0.22-0.37, 0.37-2.7.



Sample based AIC

Given a sample θi^1, . . . , θi^m ∼ π(θi|Mi), an obvious extension of the usual AIC is

$$\mathrm{EAIC} = E[\mathrm{AIC}(\theta_i, M_i)\,|\,y] = \bar D(\theta_i, M_i) + 2n_i \quad\text{or}\quad D(\bar\theta_i, M_i) + 2n_i.$$

Same for EBIC = E[BIC(θi, Mi)|y].

Could we use posterior medians or modes instead of posterior means?

What is the distance between values of these measures?



Example: Autoregressive Models

$$\mathrm{AR}(k):\quad y_t = \sum_{j=1}^{k} a_j y_{t-j} + \epsilon_t, \qquad \epsilon_t \sim N(0, \sigma_\epsilon^2)$$

Figure 2: 114 observations of base 10 logarithms minus the mean of the annual trappings of Canadian lynx, 1821-1934. [Time series plot.]



Table 4 in Brooks, S.P. (2002), Discussion to Spiegelhalter et al. (2002). AR(k) models for Lynx data.

 k   pD      DIC       EAIC     EBIC     π(k)    w_k^DIC   w_k^EAIC   w_k^EBIC
 1    1.88   206.660   206.78   209.51   0.000   0.000     0.000      0.000
 2    2.85   126.580   127.72   133.19   0.243   0.000     0.003      0.858
 3    3.78   127.060   129.27   137.48   0.016   0.000     0.001      0.101
 4    4.76   125.520   128.75   139.70   0.007   0.000     0.002      0.033
 5    5.70   125.230   129.52   143.20   0.002   0.000     0.001      0.006
 6    6.62   126.300   131.68   148.09   0.001   0.000     0.004      0.000
 7    7.60   122.340   128.72   147.88   0.002   0.000     0.002      0.001
 8    8.61   121.810   129.19   151.08   0.002   0.000     0.001      0.000
 9    9.58   122.750   131.16   155.79   0.001   0.000     0.001      0.000
10   10.54   118.940   128.40   155.76   0.002   0.001     0.002      0.000
11   11.33   106.510   117.16   147.26   0.154   0.431     0.566      0.001
12   12.61   106.890   118.27   151.10   0.268   0.356     0.325      0.000
13   13.56   108.740   121.17   156.74   0.135   0.142     0.076      0.000
14   14.46   110.770   124.30   162.61   0.067   0.051     0.016      0.000
15   15.37   112.896   127.42   168.47   0.000   0.019     0.003      0.000



Trans-dimensional Jumps

• Propose a jump from model Mi to model Mj w.p. rij,

• generate a vector u of dimension nj − ni from q(·),

• set θj = fij(θi, u), where fij : Θi × R^{nj−ni} → Θj denotes a bijective function,

• accept the jump w.p. min(1, A), where

$$A = \underbrace{\frac{\pi(\theta_j, M_j)}{\pi(\theta_i, M_i)}}_{\text{target ratio}}\;\underbrace{\frac{r_{ji}}{r_{ij}\,q(u)}\left|\frac{\partial f_{ij}(\theta_i, u)}{\partial(\theta_i, u)}\right|}_{\text{proposal ratio}}.$$

The choice of the proposal distribution q is crucial to cover the model and parameter spaces.

When possible use the complete conditionals, or approximations to the complete conditionals (Brooks and Ehlers, 2002).
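To make the jump mechanics concrete, here is a minimal sketch on a toy problem that is not from the talk. It assumes two nested models, M1: yi ∼ N(0, 1) with no free parameter and M2: yi ∼ N(µ, 1) with µ ∼ N(0, τ²), equal prior model probabilities, r12 = r21 = 1, and the identity map θ2 = f(u) = u, so the Jacobian is 1. Following the advice above, q is taken to be the complete conditional of µ under M2. The function names are hypothetical.

```python
import math
import random


def norm_logpdf(x, mean, var):
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def rjmcmc(y, tau2=1.0, n_iter=20_000, seed=0):
    """Estimate p(M2|y) by the proportion of iterations spent in M2."""
    rng = random.Random(seed)
    n, s = len(y), sum(y)
    post_var = 1.0 / (n + 1.0 / tau2)         # complete conditional of mu under M2
    post_mean = post_var * s

    def log_joint(model, mu):
        # log p(y|theta, M) + log p(theta|M); the equal p(M) terms cancel
        centre = mu if model == 2 else 0.0
        loglik = sum(norm_logpdf(yi, centre, 1.0) for yi in y)
        return loglik + (norm_logpdf(mu, 0.0, tau2) if model == 2 else 0.0)

    model, mu, visits2 = 1, 0.0, 0
    for _ in range(n_iter):
        if model == 1:                        # propose M1 -> M2 with u ~ q, theta_2 = u
            u = rng.gauss(post_mean, math.sqrt(post_var))
            log_a = (log_joint(2, u) - log_joint(1, 0.0)
                     - norm_logpdf(u, post_mean, post_var))
            if math.log(1.0 - rng.random()) <= log_a:
                model, mu = 2, u
        else:                                 # propose M2 -> M1: u = mu, ratio is 1/A
            log_a = (log_joint(1, 0.0) + norm_logpdf(mu, post_mean, post_var)
                     - log_joint(2, mu))
            if math.log(1.0 - rng.random()) <= log_a:
                model, mu = 1, 0.0
        if model == 2:                        # within-model Gibbs update of mu
            mu = rng.gauss(post_mean, math.sqrt(post_var))
        visits2 += (model == 2)
    return visits2 / n_iter


rng = random.Random(42)
y = [rng.gauss(0.4, 1.0) for _ in range(20)]  # simulated data (assumption)
ybar = sum(y) / len(y)
# Exact answer for this toy problem, from the marginal densities of ybar.
log_b21 = (norm_logpdf(ybar, 0.0, 1.0 + 1.0 / len(y))
           - norm_logpdf(ybar, 0.0, 1.0 / len(y)))
exact = 1.0 / (1.0 + math.exp(-log_b21))
estimate = rjmcmc(y)
print(f"RJMCMC estimate {estimate:.3f}, exact {exact:.3f}")
```

Because q is the exact complete conditional here, A does not depend on the drawn u and equals the Bayes factor, which is the ideal case; with an approximate q the same code applies but acceptance rates drop.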



Possible Targets

Joint posterior distribution

π(Mi,θi) ∝ p(y|θi,Mi) p(θi|Mi) p(Mi).

Boltzmann Distribution

$$\pi_T(\theta_i, M_i) \propto \exp\left(-\frac{g(\theta_i, M_i)}{T}\right).$$

MCMC + Simulated Annealing: Brooks, Friel, and King (2003)

MCMC + Genetic Algorithms: Ehlers and Ferreira (2005)



ARIMA models

Ehlers and Brooks (2004): reparameterization in terms of reciprocal roots of the characteristic polynomials,

$$\mathrm{AR(I)MA}(p, q):\quad \prod_{i=1}^{k}(1 - \lambda_i L)\,y_t = \prod_{j=1}^{q}(1 - \delta_j L)\,\epsilon_t, \qquad \epsilon_t \sim N(0, \sigma^2).$$

Stationarity/Invertibility: |λi| < 1, i = 1, . . . , k and |δi| < 1, i = 1, . . . , q.

Possible jumps: addition/deletion of 1 real root or a pair of (conjugate) complex roots.

Possible proposals: Truncated normal, Beta-based, Logistic-based.



Table 2: Model probabilities (first row for each proposal) and proportion of correct model (second row) for a simulated AR(3).

                    Sample size
Proposal            20       50       100      200      500      1000
Truncated normal    0.0092   0.0398   0.1074   0.2677   0.4565   0.5480
                    0.0000   0.1500   0.3500   0.6000   0.9000   0.9500
Beta-based          0.0096   0.0441   0.1059   0.2702   0.4812   0.5404
                    0.0000   0.1500   0.3500   0.6500   0.9500   0.9000
Logistic-based      0.0092   0.0414   0.1058   0.2628   0.4822   0.5436
                    0.0000   0.1500   0.3000   0.6000   0.9500   0.9000



Figure 3: Southern oscillation index (SOI), 540 measurements taken between 1950-1995. [Time series plot.]



Table 3: Posterior model order probabilities, 500,000 iterations after a 500,000 burn-in.

Proposal           d   (p, q)   0        1        2        3        4        5
Truncated normal   0   1        0.0000   0.2245   0.0518   0.0503   0.0398   0.0336
                       2        0.0024   0.0164   0.0556   0.0292   0.0249   0.0184
                       3        0.0111   0.0173   0.0374   0.0286   0.0247   0.0186
                       4        0.0130   0.0083   0.0178   0.0134   0.0140   0.0104
                       5        0.0200   0.0071   0.0148   0.0113   0.0103   0.0085
                   1   0        0.0000   0.0135   0.0089   0.0156   0.0147   0.0193
                       1        0.0000   0.0023   0.0042   0.0067   0.0078   0.0105
                       2        0.0001   0.0018   0.0032   0.0053   0.0067   0.0086
                       3        0.0004   0.0020   0.0024   0.0041   0.0042   0.0058
                       4        0.0004   0.0012   0.0024   0.0034   0.0050   0.0060



AR Models with Logistic Smooth Transition

LSTAR(m, p1, . . . , pm): m regimes and pi lags in each regime.

$$y_t = \alpha_1' x_{1,t} + (\alpha_2' x_{2,t} - \alpha_1' x_{1,t})\,G_1(y_{t-d}, \gamma_1, c_1) + \cdots + (\alpha_m' x_{m,t} - \alpha_{m-1}' x_{m-1,t})\,G_{m-1}(y_{t-d}, \gamma_{m-1}, c_{m-1}) + \epsilon_t, \qquad \epsilon_t \sim N(0, \sigma_\epsilon^2)$$

where

$$G_i(y_{t-d}, \gamma_i, c_i) = \frac{1}{1 + \exp[-\gamma_i(y_{t-d} - c_i)]}, \quad i = 1,\dots,m-1.$$

x_{j,t} = (1, y_{t−1}, . . . , y_{t−pj})

αj = (α0, α1, . . . , αpj) : AR coefficients

γi > 0 : smoothing parameters

ci : threshold parameters

Total number of parameters

$$k = \sum_{j=1}^{m}(1 + p_j) + 2(m-1) + 1.$$

Lubrano (2000): m = 2, p1 = p2 = p known.

Lopes and Salazar (2006): m = 2, p1 = p2 = p unknown.

Ehlers (2005): m, p1, . . . , pm unknown.
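The pieces of the LSTAR definition can be checked numerically. The sketch below implements the logistic transition function, the parameter count k, and the conditional mean of a two-regime model; the function names and the numerical inputs are illustrative assumptions, not values or code from the talk.

```python
import math


def transition(y_lag, gamma, c):
    """Logistic transition G(y_{t-d}; gamma, c) = 1 / (1 + exp(-gamma (y_{t-d} - c)))."""
    return 1.0 / (1.0 + math.exp(-gamma * (y_lag - c)))


def n_parameters(lags):
    """k = sum_j (1 + p_j) + 2 (m - 1) + 1 for lags = (p_1, ..., p_m)."""
    m = len(lags)
    return sum(1 + p for p in lags) + 2 * (m - 1) + 1


def lstar2_mean(y_hist, alpha1, alpha2, gamma, c, d):
    """Conditional mean of a two-regime LSTAR; y_hist[-1] is y_{t-1}."""
    x1 = [1.0] + [y_hist[-i] for i in range(1, len(alpha1))]   # (1, y_{t-1}, ..., y_{t-p1})
    x2 = [1.0] + [y_hist[-i] for i in range(1, len(alpha2))]   # (1, y_{t-1}, ..., y_{t-p2})
    m1 = sum(a * x for a, x in zip(alpha1, x1))                # alpha_1' x_{1,t}
    m2 = sum(a * x for a, x in zip(alpha2, x2))                # alpha_2' x_{2,t}
    g = transition(y_hist[-d], gamma, c)
    return m1 + (m2 - m1) * g


# Illustrative values (assumptions): p1 = 1, p2 = 2, delay d = 1.
y_hist = [2.9, 3.0, 3.5]
mean_t = lstar2_mean(y_hist, alpha1=[0.5, 0.2], alpha2=[1.0, -0.3, 0.1],
                     gamma=5.0, c=3.2, d=1)
print(n_parameters((1, 2)), round(transition(3.5, 5.0, 3.2), 3), round(mean_t, 3))
```

As γ grows the transition approaches a step function at c, and the model approaches a threshold autoregression; as γ approaches 0 the two regimes are mixed with weight close to 1/2 everywhere.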



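The joint posterior of (m, p1, . . . , pm, γ, c, α, σ²ε | y) is proportional to

$$p(y|m, p, \gamma, c, \alpha, \sigma_\epsilon^2)\; p(c|m)\; p(\gamma|m)\; p(m)\; p(\sigma_\epsilon^2)\prod_{j=1}^{m} p(p_j|m)\, p(\alpha_j|p_j).$$

Priors (Lubrano, 2000),

$$p(\alpha_1, \sigma^2) \propto 1/\sigma^2 \quad\text{and}\quad \alpha_j \mid \sigma_\epsilon^2, \gamma^* \sim N(0, \sigma_\epsilon^2 \exp(\gamma^*) I_{p_j}), \quad j = 2,\dots,m$$

where α′ = (α′1, . . . , α′m) and γ∗ = max{γ1, . . . , γm−1}.

σ²ε ∼ IG(a, b), a, b > 0.

$$p(c|m) = \begin{cases}(m-1)!\,\prod_{j=1}^{m-1} p(c_j), & c_1 < c_2 < \cdots < c_{m-1},\\[2pt] 0, & \text{otherwise}\end{cases}$$

$$p(\gamma|m) = \prod_{j=1}^{m-1} p(\gamma_j), \qquad \gamma_j \sim \mathrm{Cauchy}(0, \sigma_\gamma) \text{ truncated to } \gamma_j > 0.$$

m ∼ U{1, . . . , mmax} and pj ∼ U{1, . . . , pmax}, j = 1, . . . , m.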



RJMCMC in LSTAR Models

split/combine jumps: randomly choose between creating a new regime or combining 2 existing ones.

Creating regimes

• sample j ∈ {1, . . . , m − 1}

• set $c'_j = c_j - \epsilon$, $c'_{j+1} = c_j + \epsilon$, $\gamma'_j = \gamma_j\tau$, $\gamma'_{j+1} = \gamma_j/\tau$

• sample α(m+1) from its complete conditional distribution.

Combining regimes

• randomly choose a pair of adjacent thresholds cj < cj+1

• set $c'_j = (c_j + c_{j+1})/2$, $\gamma'_j = \sqrt{\gamma_j\gamma_{j+1}}$, $\epsilon = (c_{j+1} - c_j)/2$, $\tau = \sqrt{\gamma_j/\gamma_{j+1}}$

• update the coefficients α(m−1)

Within-regime jumps

• randomly choose a regime j and propose a new value p′j for pj,

• update α(m) from its complete conditional.
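The split and combine maps for the thresholds and smoothing parameters are designed to be inverses of each other, which can be checked directly. In the sketch below the functions `split`, `combine` and `split_jacobian` are hypothetical names, the regression coefficients α are left out, and the Jacobian 4γ/τ is derived here from the split map as stated (it is not quoted from the talk).

```python
import math


def split(c, gamma, eps, tau):
    """Split one (threshold, smoothness) pair into two using auxiliary (eps, tau)."""
    return (c - eps, c + eps), (gamma * tau, gamma / tau)


def combine(c_pair, gamma_pair):
    """Merge two adjacent regimes; also recovers the auxiliary (eps, tau)."""
    c_lo, c_hi = c_pair
    g_lo, g_hi = gamma_pair
    c = 0.5 * (c_lo + c_hi)
    gamma = math.sqrt(g_lo * g_hi)
    eps = 0.5 * (c_hi - c_lo)
    tau = math.sqrt(g_lo / g_hi)
    return c, gamma, eps, tau


def split_jacobian(gamma, tau):
    """|d(c'_j, c'_{j+1}, gamma'_j, gamma'_{j+1}) / d(c, gamma, eps, tau)|.

    The threshold block contributes 2 and the smoothness block 2*gamma/tau.
    """
    return 4.0 * gamma / tau


# Round trip with illustrative values (assumptions).
c, gamma, eps, tau = 3.1, 2.0, 0.2, 1.5
c_pair, gamma_pair = split(c, gamma, eps, tau)
print(c_pair, gamma_pair, combine(c_pair, gamma_pair), split_jacobian(gamma, tau))
```

The new thresholds are automatically ordered when ε > 0, and τ > 0 keeps both new smoothing parameters positive.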



RJMCMC in LSTAR models

RJMCMC for LSTAR Models, Iterations= 50000 Burn-in= 25000 Thinning= 1

Time difference of 17.35 mins

1074 models visited out of 2366 candidates

           splits   combines   births   deaths
proposed     8379       4054    12440    12560
accepted     2049       2049     3208     3112

      prob    model   m   lags        thres           smoo         sig2
 1   0.2134      24   2    2 11  0    3.301           6.38         0.039
 2   0.0493      25   2    2 12  0    3.215           5.67         0.040
 3   0.0409     363   3    2  2 12    2.927 3.275     1.77 2.15    0.042
 4   0.0374      11   2    1 11  0    3.149           5.32         0.041
 5   0.0191      38   2    3 12  0    3.223           4.82         0.039
 6   0.0188     155   2   12 12  0    3.128           2.86         0.039
 7   0.0184     157   2   13  1  0    3.130           2.80         0.040
 8   0.0183      26   2    2 13  0    3.133           4.04         0.039
 9   0.0155       8   2    1  8  0    3.160           4.45         0.047
10   0.0144       9   2    1  9  0    3.201           3.36         0.047
11   0.0118     180   3    1  1 11    2.844 3.266     1.67 1.69    0.046
12   0.0113     181   3    1  1 12    2.977 3.296     1.64 1.87    0.042
13   0.0108      52   2    4 13  0    3.142           4.61         0.039

+ models with prob < 0.05*P(24)



RJMCMC + Genetic Algorithms

E.g. linear models,

$$E(Y) = \beta_0 + \beta_{j_1}x_{j_1} + \cdots + \beta_{j_k}x_{j_k}, \quad k = 0,\dots,k_{max},$$

with 2^{kmax} possible models with intercept.

If estimation requires little computational effort, we propose an effective and semi-automatic method for model comparison.

Given a population of models Z = (z1, . . . , zM), where zij ∈ {0, 1}, propose a new population z′ via genetic operators (esp. mutation and crossover).

Accept the new population with probability

$$\min\left(1,\ \frac{\exp\{-\mathrm{BIC}(z')\}}{\exp\{-\mathrm{BIC}(z)\}}\,\frac{P(z', z)}{P(z, z')}\right)$$

where

P(z, z′) = Pr(proposing a jump from population z to z′).



Crossover Move

Randomly choose a pair of individuals zi, zj and propose a new population as follows,

1. randomly choose k ∈ {1, . . . , p − 1}

2. set z′i = (zi,1, . . . , zi,k−1, zj,k, . . . , zj,p)

3. set z′j = (zj,1, . . . , zj,k−1, zi,k, . . . , zi,p)

4. accept this new population with probability min(1, A) where

$$A = \frac{\exp(-\mathrm{BIC}(z'_i) - \mathrm{BIC}(z'_j))}{\exp(-\mathrm{BIC}(z_i) - \mathrm{BIC}(z_j))}\,\frac{P(z', z)}{P(z, z')}.$$

This updating scheme is repeated for all [M/2] pairs of individuals selected without replacement from the population.
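A single crossover update can be sketched as follows. The BIC function here is a made-up stand-in (`toy_bic`, an assumption, penalizing disagreement with an arbitrary reference model), since in practice it would come from fitting each regression. The sketch also assumes that, because k is drawn uniformly and the same k reverses the move, the factor P(z′, z)/P(z, z′) equals 1.

```python
import math
import random


def crossover(z_i, z_j, k):
    """One-point crossover at 1-based position k: swap the tails from position k on."""
    cut = k - 1                                   # 0-based index of the first swapped entry
    return z_i[:cut] + z_j[cut:], z_j[:cut] + z_i[cut:]


def crossover_step(z_i, z_j, bic, rng):
    """Propose a crossover of (z_i, z_j) and accept it w.p. min(1, A)."""
    p = len(z_i)
    k = rng.randint(1, p - 1)                     # k in {1, ..., p-1}
    new_i, new_j = crossover(z_i, z_j, k)
    # log A with the proposal ratio taken as 1 (assumption stated above)
    log_a = -(bic(new_i) + bic(new_j)) + (bic(z_i) + bic(z_j))
    if math.log(1.0 - rng.random()) <= log_a:
        return new_i, new_j
    return z_i, z_j


def toy_bic(z, reference=(1, 0, 1, 1, 0)):
    """Hypothetical criterion: 2 units per entry that differs from a reference model."""
    return 2.0 * sum(a != b for a, b in zip(z, reference))


rng = random.Random(0)
parent_i, parent_j = (1, 0, 0, 0, 1), (0, 1, 1, 1, 0)
child_i, child_j = crossover_step(parent_i, parent_j, toy_bic, rng)
print(child_i, child_j)
```

Models are stored as tuples of 0/1 inclusion indicators so that they can be used as dictionary keys when counting visits.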



Mutation Move

Either include a new regressor with probability w, or delete an existing one with probability 1 − w.

Suppose we are now updating zi and an inclusion is proposed. Then,

1. randomly choose j ∈ J0 = {j : zij = 0} and set z′ij = 1

2. accept this move w.p. min(1, A) where

$$A = \frac{\exp(-\mathrm{BIC}(z'_i))}{\exp(-\mathrm{BIC}(z_i))}\;\frac{(1-w)\,|J_0|}{w\,(|J_1| + 1)}$$

with J1 = {j : zij = 1} and |J| denoting the cardinality of J. The second factor is P(z′, z)/P(z, z′): the forward inclusion is proposed with probability w/|J0| and the reverse deletion with probability (1 − w)/(|J1| + 1).

Likewise, if a deletion is proposed

1. choose j ∈ J1 and set z′ij = 0.

2. accept the move w.p. min(1, A⁻¹), where A is the ratio for the reverse inclusion move from z′i to zi.

This updating scheme is repeated for all individuals in the population.
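A single mutation update can be sketched as below. The Hastings factor is computed directly from the stated proposal probabilities (inclusion with probability w and a uniform choice among excluded regressors, deletion with probability 1 − w and a uniform choice among included ones). The criterion `toy_bic` is a made-up stand-in for a fitted model's BIC, and all names are hypothetical.

```python
import math
import random


def mutation_step(z, bic, w, rng):
    """Propose including or deleting one regressor; accept w.p. min(1, A)."""
    zeros = [j for j, v in enumerate(z) if v == 0]    # regressors currently excluded
    ones = [j for j, v in enumerate(z) if v == 1]     # regressors currently included
    if rng.random() < w:                              # inclusion proposed
        if not zeros:
            return z                                  # nothing to include: stay put
        j = rng.choice(zeros)
        forward = w / len(zeros)
        reverse = (1.0 - w) / (len(ones) + 1)
    else:                                             # deletion proposed
        if not ones:
            return z                                  # nothing to delete: stay put
        j = rng.choice(ones)
        forward = (1.0 - w) / len(ones)
        reverse = w / (len(zeros) + 1)
    proposal = tuple(1 - v if i == j else v for i, v in enumerate(z))
    # log A = -BIC(z') + BIC(z) + log(reverse / forward)
    log_a = bic(z) - bic(proposal) + math.log(reverse / forward)
    if math.log(1.0 - rng.random()) <= log_a:
        return proposal
    return z


def toy_bic(z, reference=(1, 0, 1)):
    """Hypothetical criterion: 2 units per entry that differs from a reference model."""
    return 2.0 * sum(a != b for a, b in zip(z, reference))


rng = random.Random(3)
z = (0, 0, 0)
visits = {}
for _ in range(20_000):
    z = mutation_step(z, toy_bic, w=0.3, rng=rng)
    visits[z] = visits.get(z, 0) + 1
most_visited = max(visits, key=visits.get)
print(most_visited, visits[most_visited] / 20_000)
```

With w different from 1/2 the Hastings factor matters: dropping it would bias the chain towards larger or smaller models, depending on the sign of w − 1/2.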



Simulated Data

50 observations with kmax = 19 (524288 candidate models). Output based on 2500 iterations (after 2500 burn-in) and a population of size 20.

      Full enumeration      GA-RJMCMC
      weights   model       probs    model
 1    0.5974    10512       0.5928   10512
 2    0.0451    11536       0.0497   26896
 3    0.0427    26896       0.0464   11536
 4    0.0371    43280       0.0413   43280
 5    0.0299    272656      0.0330   11024
 6    0.0288    11024       0.0295   272656
 7    0.0243    141584      0.0233   141584
 8    0.0184    76048       0.0172   76048
 9    0.0169    10576       0.0139   10640
10    0.0149    10640       0.0132   10576



Real Data

Crime rates in 47 US states, 15 potential regressors (Raftery, Painter, and Volinsky

2005).

GA-RJMCMC Time difference of 3.2 mins

612 models visited out of 32768 candidates

10 most visited models in descending order

        M  So  Ed  Po1  Po2  LF  M.F  Pop  NW  U1  U2  GDP  Ineq  Prob  Time
0.209   1   0   1    1    0   0    0    0   1   0   1    0     1     1     1
0.123   1   0   1    1    0   0    0    0   1   0   1    0     1     1     0
0.060   1   0   1    1    0   0    0    0   1   0   1    1     1     1     1
0.055   1   0   1    1    0   0    0    1   1   0   1    0     1     1     0
0.053   1   0   1    0    1   0    0    0   1   0   1    0     1     1     0
0.036   1   0   1    0    1   0    0    0   1   0   1    0     1     1     1
0.026   1   0   1    1    0   0    0    0   1   0   0    0     1     1     1
0.025   1   0   1    1    0   0    0    0   1   1   1    0     1     1     1
0.023   1   0   1    1    0   0    0    0   0   0   1    0     1     1     0
0.022   1   0   1    0    1   0    0    0   1   0   1    1     1     1     1



     regressor   prob.inc
 1   M           0.9890
 2   So          0.0549
 3   Ed          1.0000
 4   Po1         0.7714
 5   Po2         0.2459
 6   LF          0.0290
 7   M.F         0.0347
 8   Pop         0.2049
 9   NW          0.9227
10   U1          0.0889
11   U2          0.8891
12   GDP         0.2414
13   Ineq        1.0000
14   Prob        0.9956
15   Time        0.4963

           births   deaths   mutations   crossovers
proposed    50101    49899      100000        37299
accepted     6792     6793       13585        14678



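Figure 4: Models visited by GA-MCMC. [Image plot: regressors (M, So, Ed, Po1, Po2, LF, M.F, Pop, NW, U1, U2, GDP, Ineq, Prob, Time) on the vertical axis against model number (1, 2, 3, 4, 5, 7, 12, 21, 54) on the horizontal axis.]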



References

Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control 19, 716–723.

Brooks, S., N. Friel, and R. King (2003). Classical model selection via simulated annealing. Journal

of the Royal Statistical Society, Series B 65, 503–520.

Brooks, S. P. and R. S. Ehlers (2002). Efficient construction of reversible jump MCMC proposals for

autoregressive time series models. Technical report, University of Cambridge.

Ehlers, R. S. (2005). Fully Bayesian analysis of regime switching models with an unknown number

of components. In Preparation.

Ehlers, R. S. and S. P. Brooks (2004). Bayesian analysis of order uncertainty in ARIMA models. Technical Report 2004/05-B, Federal University of Paraná.

Ehlers, R. S. and M. A. Ferreira (2005). Trans-dimensional genetic algorithms for model

discrimination. In Preparation.

Gelfand, A. E. and S. K. Ghosh (1998). Model choice: A posterior predictive loss approach. Biometrika 85, 1–11.

Lopes, H. F. and E. Salazar (2006). Bayesian model uncertainty in smooth transition

autoregressions. Journal of Time Series Analysis 27, 99–117.

Lubrano, M. (2000). Bayesian analysis of nonlinear time series models with a threshold. In

Proceedings of the Eleventh International Symposium in Economic Theory. Helsinki: Cambridge

University Press.

Madigan, D. and A. E. Raftery (1994). Model selection and accounting for model uncertainty in

graphical models using Occam’s window. Journal of the American Statistical Association 89,

1535–1546.



Raftery, A. E., I. S. Painter, and C. T. Volinsky (2005). BMA: An R package for Bayesian model

averaging. R News 5(2), 2–8.

Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics 6, 461–464.

Silva, S. A., L. L. M. Mota, and R. S. Ehlers (2004). Spatial analysis of incidence rates: A Bayesian approach. Technical Report 2004/02-B, Federal University of Paraná.

Spiegelhalter, D. J., N. G. Best, B. P. Carlin, and A. van der Linde (2002). Bayesian measures of

model complexity and fit (with discussion). Journal of the Royal Statistical Society, Series B 64, 1–34.

