estimating distributional parameters in hierarchical models

46
Estimating Distributional Parameters in Hierarchical Models

Upload: others

Post on 12-Jun-2022

10 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Estimating Distributional Parameters in Hierarchical Models

Estimating Distributional Parameters in

Hierarchical Models

Page 2: Estimating Distributional Parameters in Hierarchical Models

Introduction: Variability in Hierarchical Models

Page 3: Estimating Distributional Parameters in Hierarchical Models

Linear Models

• Modelling central tendency

• Response (𝑦𝑖𝑗) is a sum of intercept (𝛽0), slopes (𝛽1, 𝛽2, …), and error (𝑒𝑖𝑗)

• Error is assumed to be normally distributed around zero

𝑦𝑖𝑗 = 𝛽0 + 𝛽1𝑋𝑖𝑗 + 𝑒𝑖𝑗𝑒𝑖𝑗 ∼ 𝑁(0, σ2)

Page 4: Estimating Distributional Parameters in Hierarchical Models

Linear Models

• Modelling central tendency

• Response (y) is a sum of intercept (implicit), slopes (pred), and error (implicit)

• Error is assumed to be normally distributed around zero

lm(y ~ pred)

Page 5: Estimating Distributional Parameters in Hierarchical Models

Linear Mixed Effects Models

• Modelling central tendency

• Response (𝑦𝑖𝑗) is a sum of intercept (𝛽0), slopes (𝛽1, 𝛽2, …), random unit intercepts (μ0𝑖), random unit slopes (μ1𝑖), and error (𝑒𝑖𝑗)

• Error, random intercepts, and random slopes are assumed to be normally distributed around zero

𝑦𝑖𝑗 = 𝛽0 + μ0𝑖 + (𝛽1 + μ1𝑖)𝑋𝑖𝑗 + 𝑒𝑖𝑗μ0𝑖 ∼ 𝑁(0, σ2)μ1𝑖 ∼ 𝑁(0, σ2)𝑒𝑖𝑗 ∼ 𝑁(0, σ2)

Page 6: Estimating Distributional Parameters in Hierarchical Models

Linear Mixed Effects Models

• Modelling central tendency• Response (y) is a sum of intercept (implicit), slopes (pred), random unit

intercepts (pred || rand_unit), random unit slopes (pred | rand_unit), and error (implicit)

• Error, random intercepts, and random slopes are assumed to be normally distributed around zero

lmer(y ~ pred + (pred | rand_unit))

Page 7: Estimating Distributional Parameters in Hierarchical Models

Example Non-Gaussian Data: RT

•2AFC: does the word match the picture?

•Congruency (2) x Predictability (12% – 100%)

•35 Subjects, 200 trials

+

bandage

sardine

Page 8: Estimating Distributional Parameters in Hierarchical Models

Gamma Family GLMM

m_glmer <- glmer(

rt ~ cong * pred +

(cong * pred | subj) +

(cong | image) +

(1 | word),

family = Gamma(identity),

control = glmerControl(

optimizer = “bobyqa”,

optCtrl = list(maxfun = 2e5)

)

)

Page 9: Estimating Distributional Parameters in Hierarchical Models

GLMM Results

Page 10: Estimating Distributional Parameters in Hierarchical Models

GLMM Results – Random Effects

summary(m_glmer)

Page 11: Estimating Distributional Parameters in Hierarchical Models

GLMM Results – Random Effects

ranef(m_glmer)

Page 12: Estimating Distributional Parameters in Hierarchical Models

GLMM Results – Random Effects

m_glmer %>% ranef() %>% as.data.frame()

Page 13: Estimating Distributional Parameters in Hierarchical Models

GLMM Results – Random Effects

ranef(m_glmer) %>%

as_tibble() %>%

filter(grpvar == “subj") %>%

mutate(grp = fct_reorder2(grp, term, condval)) %>%

ggplot(aes(

x = grp, y = condval,

ymin = condval - condsd,

ymax = condval + condsd

)) +

geom_pointrange(size=0.25) +

facet_wrap(vars(term), scales="free", nrow=2)

Page 14: Estimating Distributional Parameters in Hierarchical Models

GLMM Results – Random Effects – Subject

Page 15: Estimating Distributional Parameters in Hierarchical Models

GLMM Results – Random Effects – Image

Page 16: Estimating Distributional Parameters in Hierarchical Models

GLMM Results – Random Effects – Word

Page 17: Estimating Distributional Parameters in Hierarchical Models
Page 18: Estimating Distributional Parameters in Hierarchical Models

Estimating Distributional Parameters in

Hierarchical Models

Page 19: Estimating Distributional Parameters in Hierarchical Models

What if Meaningful Effects on Variance?

•All glm variants model single parameters(i.e. central tendency)•What if your effect looks like this?

Page 20: Estimating Distributional Parameters in Hierarchical Models

What if Meaningful Effects on Variance?

•Mu is higher F(1, 1998) = 3237, p<.001

•Sigma is higher Levene’s F(1, 1998) = 550, p<.001

Page 21: Estimating Distributional Parameters in Hierarchical Models

Assumption-free Distribution Comparison

•Within a single model?

•Assumption free distribution comparison (e.g. Kolmogorov–Smirnov) could be one approach!

•Overlapping index (Pastore & Calcagni, 2019) from 0 (no overlap) to 1 (identical distribution)

Page 22: Estimating Distributional Parameters in Hierarchical Models

Assumption-free Distribution Comparisonx <- rnorm(1000, 10, 1),

y <- rnorm(1000, 10.5, 1.5)

Page 23: Estimating Distributional Parameters in Hierarchical Models

Assumption-free Distribution Comparison

Page 24: Estimating Distributional Parameters in Hierarchical Models

Overlap Index Mu * Sigma Parameter Space

Page 25: Estimating Distributional Parameters in Hierarchical Models

Overlap Index Mu * Sigma Parameter Space

Page 26: Estimating Distributional Parameters in Hierarchical Models

Overlap Index Mu * Sigma Parameter Space

Page 27: Estimating Distributional Parameters in Hierarchical Models

Weirder Distribution Example

Page 28: Estimating Distributional Parameters in Hierarchical Models

Weirder Distribution Example

Page 29: Estimating Distributional Parameters in Hierarchical Models

Summary so far

• Assumption-free approaches are flexible but don’t allow us to test/make any specific predictions

• Equivalent of shrugging and saying “yeah idk probs something going on there” (though useful for very weird distributions)

• Explicitly modelling multiple parameters of an assumed distribution can give us more meaningful info

Page 30: Estimating Distributional Parameters in Hierarchical Models

Distributional Parameters in brms

brm(

bf(

dv ~ Intercept + iv + (iv | rand_unit),

sigma ~ Intercept + iv + (iv | rand_unit)

),

control = list(

adapt_delta = 0.999,

max_treedepth = 12

),

sample_all_pars = TRUE

)

Page 31: Estimating Distributional Parameters in Hierarchical Models

Shifted Log-Normal Distribution

Page 32: Estimating Distributional Parameters in Hierarchical Models

Shifted Log-Normal Distribution

Page 33: Estimating Distributional Parameters in Hierarchical Models

Bayesian Shifted Log-Normal Mixed Effects Model with Distributional Parameters

brms::bf(

rt ~ Intercept + cong * pred +

(cong * pred | subj) +

(cong | image) +

(1 | word),

sigma ~ rt ~ Intercept + cong * pred +

(cong * pred | subj) +

(cong | image) +

(1 | word),

ndt ~ rt ~ Intercept + cong * pred +

(cong * pred | subj) +

(cong | image) +

(1 | word)

)

Page 34: Estimating Distributional Parameters in Hierarchical Models
Page 35: Estimating Distributional Parameters in Hierarchical Models
Page 36: Estimating Distributional Parameters in Hierarchical Models
Page 37: Estimating Distributional Parameters in Hierarchical Models
Page 38: Estimating Distributional Parameters in Hierarchical Models
Page 39: Estimating Distributional Parameters in Hierarchical Models
Page 40: Estimating Distributional Parameters in Hierarchical Models

ranef(m_bme)

Bayesian Results – Random Effects

ID (e.g. subj_01, subj_02…) * value (est, err, Q2.5, Q97.5) * fixed parameter

Page 41: Estimating Distributional Parameters in Hierarchical Models
Page 42: Estimating Distributional Parameters in Hierarchical Models
Page 43: Estimating Distributional Parameters in Hierarchical Models
Page 44: Estimating Distributional Parameters in Hierarchical Models
Page 45: Estimating Distributional Parameters in Hierarchical Models

Caveats

•Computationally intensive if using non-informative priors for complex hierarchical formulae

•Have to avoid temptation to try over-infer about mechanisms unless using more cognitively informed models (e.g. drift diffusion)

Page 46: Estimating Distributional Parameters in Hierarchical Models

Summary

Hierarchical models with maximal structures for distributional parameters are a robust and appropriate way of looking at or accounting for subject/item/etc variability in fixed effects when you’re interested in more than central tendency.

But, if you can assume no systematic differences in distributional parameters, GLMMs will suffice (and save you a lot of time and effort)!