generalized additive models...generalized additive models matteo fasiolo (university of bristol, uk)...
TRANSCRIPT
Generalized additive models
Matteo Fasiolo (University of Bristol, UK)
June 27, 2018
Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 1 / 30
Today’s plan
Morning session
1 Intro to Generalized Additive Models (GAMs)
2 Smooth effect types & Big Data methods
Afternoon session1 Beyond mean modelling: GAMLSS models
2 Distribution-free modelling: Quantile GAMs
Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 2 / 30
Intro to Generalized Additive Models (GAMs)
Structure:1 What is an additive model?
2 Introducing smooth effects
3 Introducing random effects
4 Diagnostics and model selection tools
5 GAM modelling using mgcv and mgcViz
Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 3 / 30
Structure of the talk
Structure:1 What is an additive model?
2 Introducing smooth effects
3 Introducing random effects
4 Diagnostics and model selection tools
5 GAM modelling using mgcv and mgcViz
Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 4 / 30
What is an additive model
Regression setting:
y is our response or dependent variable
x is a vector of covariates or independent variables
In distributional regression we want a good model for Dist(y |x).
Model is Distm{y |θ1(x), . . . , θq(x)}, where θ1(x), . . . , θq(x) are param.
In a Gaussian model, the mean depends on the covariates
y |x ∼ N{y |µ = θ(x), σ2)
},
where µ = E(y |x) and σ2 = Var(y).
Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 5 / 30
What is an additive model
−2 −1 0 1 2
−4
−2
02
4
x
y
Figure : Gaussian model with variable mean.In mgcv: gam(y~s(x), family=gaussian).
Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 6 / 30
What is an additive model
Gaussian additive model:
y |x ∼ N(y |µ(x), σ2)
where µ(x) = E(y |x) =∑m
j=1 fj(x).
fj ’s can be fixed, random or smooth effects with coefficients β.
β̂ is the maximizer of penalized log-likelihood
β̂ = argmaxβ
{Ly (β)− Pen(β)
}.
where:
Ly (β) =∑
i log p(yi |β) is log-likelihood
Pen(β) penalizes the complexity of the fj ’s
Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 7 / 30
What is an additive model
Generalized additive model (GAM) (Hastie and Tibshirani, 1990):
y |x ∼ Distr{y |θ1 = µ(x), θ2, . . . , θp},
where
E(y |x) = µ(x) = g−1{ m∑
j=1
fj(x)}
,
and g is the link function.
Poisson GAM:
y |x ∼ Pois{y |µ(x)}
E(y |x) = Var(y |x) = exp{∑m
j=1 fj(x)}
g = log assures µ(x) > 0
Here E(y |x) and Var(y |x) is implied by model...
Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 8 / 30
What is an additive model
... or we can have extra parameters for scale and shape.
Scaled Student’s t GAM:
y |x ∼ ScaledStud{y |µ(x), σ, ν}
E(y |x) = µ(x) =∑m
j=1 fj(x)
σ is scale parameter
ν is shape parameter (degrees of freedom)
Var(y |x) = σ2 ν
ν−2
In the afternoon will see models with multiple linear predictors, eg:
y |x ∼ N{y |µ(x), σ(x)}
Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 9 / 30
Structure of the talk
Structure:1 What is an additive model?
2 Introducing smooth effects
3 Introducing random effects
4 Diagnostics and model selection tools
5 GAM modelling using mgcv and mgcViz
Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 10 / 30
Introducing smooth effects
Consider additive model
E(y |x) = µ(x) = g−1{
f1(x) + f2(x) + f3(x)}
,
where
f1(x) = β0 + β1x1 + β2x21
f2(x) =
{0 if x2 = FALSEβ4 if x2 = TRUE
f3(x) = f3(x3) is a non-linear smooth function.
Smooth effects built using spine bases
f3(x3) =r∑
k=1
βkbk(x3)
where βk are unknown coeff and bk(x3) are known spline basis functions.
Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 11 / 30
Introducing smooth effects
Example 1: B-splines
Figure : B-spline basis (left) and smooth (right).
Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 12 / 30
Introducing smooth effects
Example 2: Thin plate regression splines (TPRS)
Figure : Rank 7 TPRS basis. Image from Wood (2006).
Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 13 / 30
Introducing smooth effects
Figure : Rank 17 2D TPRS basis. Courtesy of Simon Wood.Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 14 / 30
Introducing smooth effects
In general
f (x) =
r∑
k=1
βkbk(x).
To determine complexity of f (x):
the basis rank r is large enough for sufficient flexibility
a complexity penalty on β controls the wiggliness of the effects
In first morning practical we’ll see only 1D effects.
In mgcv:
gam(y ~ 1 + x0 + s(x1, bs = "tp", k = 15), ...)
Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 15 / 30
Structure of the talk
Structure:1 What is an additive model?
2 Introducing smooth effects
3 Introducing random effects
4 Diagnostics and model selection tools
5 GAM modelling using mgcv and mgcViz
Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 16 / 30
Introducing random effects
Suppose we have data on bone mineral density (bmd) as a function of age.
We have m subjects and n data pairs per subject
subj 1: {bmd11, age11}, . . . , {bmdn1, agen1}
subj j: {bmd1j , age1j}, . . . , {bmdnj , agenj}
subj m: {bmd1m, age1m}, . . . , {bmdnm, agenm}
Standard linear model ignores individual differences
E(bmd |ageij) = µ(ageij ) = α+ βageij .
We can include random intercept per subject
µ(ageij) = α+ βageij + aj ,
where a = {a1, . . . , am} ∼ N(0,Σ).
Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 17 / 30
Introducing random effects
11.0 11.5 12.0 12.5 13.0
0.6
0.8
1.0
1.2
age
bmd
11.0 11.5 12.0 12.5 13.0
age
We can also include random slopes
µ(ageij) = α+ (β + bj)ageij + aj ,
where a ∼ N(0,Σa) and b ∼ N(0,Σb).
Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 18 / 30
Introducing random effects
In mgcv random effect are specified as:
gam(bmd ~ 1 + s(subject, bs = "re") +
age + s(age, subject, bs = "re"), ...)
In simplest case Σa = γaI and Σb = γbI, that is
Σa =
γa 0 0 . . . 00 γa 0 . . . 0...
......
. . ....
0 0 0 . . . γa
Variances γa and γb must be estimated (later I’ll explain how).
Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 19 / 30
Structure of the talk
Structure:1 What is an additive model?
2 Introducing smooth effects
3 Introducing random effects
4 Diagnostics and model selection tools
5 GAM modelling using mgcv and mgcViz
Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 20 / 30
Diagnostics and model selection tools
In first hands-on session we’ll use few basic diagnostics.QQ-plots
−4
−2
0
2
4
−2 0 2
Em
piric
al q
uant
iles
Right skewness
−4
−2
0
2
4
−2 0 2
Left skewness
−6
−3
0
3
−2 0 2
Theoretical quantiles
Em
piric
al q
uant
iles
Heavy tails
−5.0
−2.5
0.0
2.5
−2 0 2
Theoretical quantiles
Over−dispersed
Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 21 / 30
Diagnostics and model selection tools
Useful for choosing model Distm(y |x) (e.g. Poisson vs Neg. Binom.)
Less useful for finding omitted variables and non-linearities.
0
1
−1.0 −0.5 0.0 0.5 1.0
x
y
−2.5
0.0
2.5
−2 0 2
Theoretical quantiles
Em
piric
al q
uant
iles
Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 22 / 30
Diagnostics and model selection tools
Conditional residuals checks are more helpful here.
−1.0
−0.5
0.0
0.5
1.0
−1.0 −0.5 0.0 0.5 1.0
x
Res
idua
ls
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
−2
0
2
4
−1.0 −0.5 0.0 0.5 1.0
x
Mea
n re
sidu
als
Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 23 / 30
Diagnostics and model selection tools
Recall structure of smooth effects:
f (x) =k∑
j=1
βjbj(x).
where β shrunk toward zero by smoothness penalty.
Effective number of parameters we are using is < k .
Approximation is Effective Degrees of Freedom (EDF) < k .
By default k = 10 but this is arbitrary.
Exact choice of k not important, but it must not be too low.
Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 24 / 30
Diagnostics and model selection tools
Checking whether k is too low:
1 look at conditional residuals checks
2 look at output of check(fit):
## k’ edf k-index p-value
## s(wM) 9.00 8.60 0.91 <2e-16 ***
## s(wM_s95) 9.00 8.13 1.02 0.76
## s(Posan) 8.00 2.66 1.04 0.97
3 increase k and see if a model selection criterion improves
Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 25 / 30
Diagnostics and model selection tools
Model selection
General criterion is approximate Akaike Information Criterion (AIC):
AIC = −2 log p(y|β̂)︸ ︷︷ ︸
goodness of fit
+ 2τ︸︷︷︸
model complexity
where τ is EDF.
If AICm1 < AICm2 choose model 1.
To select which effects to include we can also look at p-values:
summary(fit)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 267.2004 75.4197 3.543 0.000405 ***
## Fl 6.2854 1.0457 6.010 2.20e-09 ***
## loc2 79.8459 80.4130 0.993 0.320858
## loc3 -71.2728 86.1725 -0.827 0.408284
Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 26 / 30
Structure of the talk
Structure:1 What is an additive model?
2 Introducing smooth effects
3 Introducing random effects
4 Diagnostics and model selection tools
5 GAM modelling using mgcv and mgcViz
Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 27 / 30
GAMs in mgcv and mgcViz
mgcv is the recommended R package for fitting GAMs.
Today we’ll work with mgcViz’s interface.
mgcViz extends mgcv’s tools for:
plotting the estimated effects
doing visual model checking
But most of the computation is done by mgcv under the hood.
Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 28 / 30
Further reading
Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 29 / 30
References I
Hastie, T. and R. Tibshirani (1990). Generalized Additive Models, Volume 43. CRCPress.
Wood, S. (2006). Generalized additive models: an introduction with R. CRC press.
Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 30 / 30