generalized additive models...generalized additive models matteo fasiolo (university of bristol, uk)...

30
Generalized additive models Matteo Fasiolo (University of Bristol, UK) [email protected] June 27, 2018 Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 1 / 30

Upload: others

Post on 09-Jul-2020

8 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Generalized additive models...Generalized additive models Matteo Fasiolo (University of Bristol, UK) matteo.fasiolo@bristol.ac.uk June 27, 2018 Matteo Fasiolo (University of Bristol,

Generalized additive models

Matteo Fasiolo (University of Bristol, UK)

[email protected]

June 27, 2018

Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 1 / 30

Page 2: Generalized additive models...Generalized additive models Matteo Fasiolo (University of Bristol, UK) matteo.fasiolo@bristol.ac.uk June 27, 2018 Matteo Fasiolo (University of Bristol,

Today’s plan

Morning session

1 Intro to Generalized Additive Models (GAMs)

2 Smooth effect types & Big Data methods

Afternoon session1 Beyond mean modelling: GAMLSS models

2 Distribution-free modelling: Quantile GAMs

Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 2 / 30

Page 3: Generalized additive models...Generalized additive models Matteo Fasiolo (University of Bristol, UK) matteo.fasiolo@bristol.ac.uk June 27, 2018 Matteo Fasiolo (University of Bristol,

Intro to Generalized Additive Models (GAMs)

Structure:1 What is an additive model?

2 Introducing smooth effects

3 Introducing random effects

4 Diagnostics and model selection tools

5 GAM modelling using mgcv and mgcViz

Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 3 / 30

Page 4: Generalized additive models...Generalized additive models Matteo Fasiolo (University of Bristol, UK) matteo.fasiolo@bristol.ac.uk June 27, 2018 Matteo Fasiolo (University of Bristol,

Structure of the talk

Structure:1 What is an additive model?

2 Introducing smooth effects

3 Introducing random effects

4 Diagnostics and model selection tools

5 GAM modelling using mgcv and mgcViz

Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 4 / 30

Page 5: Generalized additive models...Generalized additive models Matteo Fasiolo (University of Bristol, UK) matteo.fasiolo@bristol.ac.uk June 27, 2018 Matteo Fasiolo (University of Bristol,

What is an additive model

Regression setting:

y is our response or dependent variable

x is a vector of covariates or independent variables

In distributional regression we want a good model for Dist(y |x).

Model is Distm{y |θ1(x), . . . , θq(x)}, where θ1(x), . . . , θq(x) are param.

In a Gaussian model, the mean depends on the covariates

y |x ∼ N{y |µ = θ(x), σ2)

},

where µ = E(y |x) and σ2 = Var(y).

Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 5 / 30

Page 6: Generalized additive models...Generalized additive models Matteo Fasiolo (University of Bristol, UK) matteo.fasiolo@bristol.ac.uk June 27, 2018 Matteo Fasiolo (University of Bristol,

What is an additive model

−2 −1 0 1 2

−4

−2

02

4

x

y

Figure : Gaussian model with variable mean.In mgcv: gam(y~s(x), family=gaussian).

Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 6 / 30

Page 7: Generalized additive models...Generalized additive models Matteo Fasiolo (University of Bristol, UK) matteo.fasiolo@bristol.ac.uk June 27, 2018 Matteo Fasiolo (University of Bristol,

What is an additive model

Gaussian additive model:

y |x ∼ N(y |µ(x), σ2)

where µ(x) = E(y |x) =∑m

j=1 fj(x).

fj ’s can be fixed, random or smooth effects with coefficients β.

β̂ is the maximizer of penalized log-likelihood

β̂ = argmaxβ

{Ly (β)− Pen(β)

}.

where:

Ly (β) =∑

i log p(yi |β) is log-likelihood

Pen(β) penalizes the complexity of the fj ’s

Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 7 / 30

Page 8: Generalized additive models...Generalized additive models Matteo Fasiolo (University of Bristol, UK) matteo.fasiolo@bristol.ac.uk June 27, 2018 Matteo Fasiolo (University of Bristol,

What is an additive model

Generalized additive model (GAM) (Hastie and Tibshirani, 1990):

y |x ∼ Distr{y |θ1 = µ(x), θ2, . . . , θp},

where

E(y |x) = µ(x) = g−1{ m∑

j=1

fj(x)}

,

and g is the link function.

Poisson GAM:

y |x ∼ Pois{y |µ(x)}

E(y |x) = Var(y |x) = exp{∑m

j=1 fj(x)}

g = log assures µ(x) > 0

Here E(y |x) and Var(y |x) is implied by model...

Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 8 / 30

Page 9: Generalized additive models...Generalized additive models Matteo Fasiolo (University of Bristol, UK) matteo.fasiolo@bristol.ac.uk June 27, 2018 Matteo Fasiolo (University of Bristol,

What is an additive model

... or we can have extra parameters for scale and shape.

Scaled Student’s t GAM:

y |x ∼ ScaledStud{y |µ(x), σ, ν}

E(y |x) = µ(x) =∑m

j=1 fj(x)

σ is scale parameter

ν is shape parameter (degrees of freedom)

Var(y |x) = σ2 ν

ν−2

In the afternoon will see models with multiple linear predictors, eg:

y |x ∼ N{y |µ(x), σ(x)}

Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 9 / 30

Page 10: Generalized additive models...Generalized additive models Matteo Fasiolo (University of Bristol, UK) matteo.fasiolo@bristol.ac.uk June 27, 2018 Matteo Fasiolo (University of Bristol,

Structure of the talk

Structure:1 What is an additive model?

2 Introducing smooth effects

3 Introducing random effects

4 Diagnostics and model selection tools

5 GAM modelling using mgcv and mgcViz

Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 10 / 30

Page 11: Generalized additive models...Generalized additive models Matteo Fasiolo (University of Bristol, UK) matteo.fasiolo@bristol.ac.uk June 27, 2018 Matteo Fasiolo (University of Bristol,

Introducing smooth effects

Consider additive model

E(y |x) = µ(x) = g−1{

f1(x) + f2(x) + f3(x)}

,

where

f1(x) = β0 + β1x1 + β2x21

f2(x) =

{0 if x2 = FALSEβ4 if x2 = TRUE

f3(x) = f3(x3) is a non-linear smooth function.

Smooth effects built using spine bases

f3(x3) =r∑

k=1

βkbk(x3)

where βk are unknown coeff and bk(x3) are known spline basis functions.

Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 11 / 30

Page 12: Generalized additive models...Generalized additive models Matteo Fasiolo (University of Bristol, UK) matteo.fasiolo@bristol.ac.uk June 27, 2018 Matteo Fasiolo (University of Bristol,

Introducing smooth effects

Example 1: B-splines

Figure : B-spline basis (left) and smooth (right).

Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 12 / 30

Page 13: Generalized additive models...Generalized additive models Matteo Fasiolo (University of Bristol, UK) matteo.fasiolo@bristol.ac.uk June 27, 2018 Matteo Fasiolo (University of Bristol,

Introducing smooth effects

Example 2: Thin plate regression splines (TPRS)

Figure : Rank 7 TPRS basis. Image from Wood (2006).

Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 13 / 30

Page 14: Generalized additive models...Generalized additive models Matteo Fasiolo (University of Bristol, UK) matteo.fasiolo@bristol.ac.uk June 27, 2018 Matteo Fasiolo (University of Bristol,

Introducing smooth effects

Figure : Rank 17 2D TPRS basis. Courtesy of Simon Wood.Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 14 / 30

Page 15: Generalized additive models...Generalized additive models Matteo Fasiolo (University of Bristol, UK) matteo.fasiolo@bristol.ac.uk June 27, 2018 Matteo Fasiolo (University of Bristol,

Introducing smooth effects

In general

f (x) =

r∑

k=1

βkbk(x).

To determine complexity of f (x):

the basis rank r is large enough for sufficient flexibility

a complexity penalty on β controls the wiggliness of the effects

In first morning practical we’ll see only 1D effects.

In mgcv:

gam(y ~ 1 + x0 + s(x1, bs = "tp", k = 15), ...)

Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 15 / 30

Page 16: Generalized additive models...Generalized additive models Matteo Fasiolo (University of Bristol, UK) matteo.fasiolo@bristol.ac.uk June 27, 2018 Matteo Fasiolo (University of Bristol,

Structure of the talk

Structure:1 What is an additive model?

2 Introducing smooth effects

3 Introducing random effects

4 Diagnostics and model selection tools

5 GAM modelling using mgcv and mgcViz

Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 16 / 30

Page 17: Generalized additive models...Generalized additive models Matteo Fasiolo (University of Bristol, UK) matteo.fasiolo@bristol.ac.uk June 27, 2018 Matteo Fasiolo (University of Bristol,

Introducing random effects

Suppose we have data on bone mineral density (bmd) as a function of age.

We have m subjects and n data pairs per subject

subj 1: {bmd11, age11}, . . . , {bmdn1, agen1}

subj j: {bmd1j , age1j}, . . . , {bmdnj , agenj}

subj m: {bmd1m, age1m}, . . . , {bmdnm, agenm}

Standard linear model ignores individual differences

E(bmd |ageij) = µ(ageij ) = α+ βageij .

We can include random intercept per subject

µ(ageij) = α+ βageij + aj ,

where a = {a1, . . . , am} ∼ N(0,Σ).

Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 17 / 30

Page 18: Generalized additive models...Generalized additive models Matteo Fasiolo (University of Bristol, UK) matteo.fasiolo@bristol.ac.uk June 27, 2018 Matteo Fasiolo (University of Bristol,

Introducing random effects

11.0 11.5 12.0 12.5 13.0

0.6

0.8

1.0

1.2

age

bmd

11.0 11.5 12.0 12.5 13.0

age

We can also include random slopes

µ(ageij) = α+ (β + bj)ageij + aj ,

where a ∼ N(0,Σa) and b ∼ N(0,Σb).

Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 18 / 30

Page 19: Generalized additive models...Generalized additive models Matteo Fasiolo (University of Bristol, UK) matteo.fasiolo@bristol.ac.uk June 27, 2018 Matteo Fasiolo (University of Bristol,

Introducing random effects

In mgcv random effect are specified as:

gam(bmd ~ 1 + s(subject, bs = "re") +

age + s(age, subject, bs = "re"), ...)

In simplest case Σa = γaI and Σb = γbI, that is

Σa =

γa 0 0 . . . 00 γa 0 . . . 0...

......

. . ....

0 0 0 . . . γa

Variances γa and γb must be estimated (later I’ll explain how).

Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 19 / 30

Page 20: Generalized additive models...Generalized additive models Matteo Fasiolo (University of Bristol, UK) matteo.fasiolo@bristol.ac.uk June 27, 2018 Matteo Fasiolo (University of Bristol,

Structure of the talk

Structure:1 What is an additive model?

2 Introducing smooth effects

3 Introducing random effects

4 Diagnostics and model selection tools

5 GAM modelling using mgcv and mgcViz

Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 20 / 30

Page 21: Generalized additive models...Generalized additive models Matteo Fasiolo (University of Bristol, UK) matteo.fasiolo@bristol.ac.uk June 27, 2018 Matteo Fasiolo (University of Bristol,

Diagnostics and model selection tools

In first hands-on session we’ll use few basic diagnostics.QQ-plots

−4

−2

0

2

4

−2 0 2

Em

piric

al q

uant

iles

Right skewness

−4

−2

0

2

4

−2 0 2

Left skewness

−6

−3

0

3

−2 0 2

Theoretical quantiles

Em

piric

al q

uant

iles

Heavy tails

−5.0

−2.5

0.0

2.5

−2 0 2

Theoretical quantiles

Over−dispersed

Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 21 / 30

Page 22: Generalized additive models...Generalized additive models Matteo Fasiolo (University of Bristol, UK) matteo.fasiolo@bristol.ac.uk June 27, 2018 Matteo Fasiolo (University of Bristol,

Diagnostics and model selection tools

Useful for choosing model Distm(y |x) (e.g. Poisson vs Neg. Binom.)

Less useful for finding omitted variables and non-linearities.

0

1

−1.0 −0.5 0.0 0.5 1.0

x

y

−2.5

0.0

2.5

−2 0 2

Theoretical quantiles

Em

piric

al q

uant

iles

Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 22 / 30

Page 23: Generalized additive models...Generalized additive models Matteo Fasiolo (University of Bristol, UK) matteo.fasiolo@bristol.ac.uk June 27, 2018 Matteo Fasiolo (University of Bristol,

Diagnostics and model selection tools

Conditional residuals checks are more helpful here.

−1.0

−0.5

0.0

0.5

1.0

−1.0 −0.5 0.0 0.5 1.0

x

Res

idua

ls

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

−2

0

2

4

−1.0 −0.5 0.0 0.5 1.0

x

Mea

n re

sidu

als

Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 23 / 30

Page 24: Generalized additive models...Generalized additive models Matteo Fasiolo (University of Bristol, UK) matteo.fasiolo@bristol.ac.uk June 27, 2018 Matteo Fasiolo (University of Bristol,

Diagnostics and model selection tools

Recall structure of smooth effects:

f (x) =k∑

j=1

βjbj(x).

where β shrunk toward zero by smoothness penalty.

Effective number of parameters we are using is < k .

Approximation is Effective Degrees of Freedom (EDF) < k .

By default k = 10 but this is arbitrary.

Exact choice of k not important, but it must not be too low.

Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 24 / 30

Page 25: Generalized additive models...Generalized additive models Matteo Fasiolo (University of Bristol, UK) matteo.fasiolo@bristol.ac.uk June 27, 2018 Matteo Fasiolo (University of Bristol,

Diagnostics and model selection tools

Checking whether k is too low:

1 look at conditional residuals checks

2 look at output of check(fit):

## k’ edf k-index p-value

## s(wM) 9.00 8.60 0.91 <2e-16 ***

## s(wM_s95) 9.00 8.13 1.02 0.76

## s(Posan) 8.00 2.66 1.04 0.97

3 increase k and see if a model selection criterion improves

Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 25 / 30

Page 26: Generalized additive models...Generalized additive models Matteo Fasiolo (University of Bristol, UK) matteo.fasiolo@bristol.ac.uk June 27, 2018 Matteo Fasiolo (University of Bristol,

Diagnostics and model selection tools

Model selection

General criterion is approximate Akaike Information Criterion (AIC):

AIC = −2 log p(y|β̂)︸ ︷︷ ︸

goodness of fit

+ 2τ︸︷︷︸

model complexity

where τ is EDF.

If AICm1 < AICm2 choose model 1.

To select which effects to include we can also look at p-values:

summary(fit)

## Estimate Std. Error t value Pr(>|t|)

## (Intercept) 267.2004 75.4197 3.543 0.000405 ***

## Fl 6.2854 1.0457 6.010 2.20e-09 ***

## loc2 79.8459 80.4130 0.993 0.320858

## loc3 -71.2728 86.1725 -0.827 0.408284

Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 26 / 30

Page 27: Generalized additive models...Generalized additive models Matteo Fasiolo (University of Bristol, UK) matteo.fasiolo@bristol.ac.uk June 27, 2018 Matteo Fasiolo (University of Bristol,

Structure of the talk

Structure:1 What is an additive model?

2 Introducing smooth effects

3 Introducing random effects

4 Diagnostics and model selection tools

5 GAM modelling using mgcv and mgcViz

Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 27 / 30

Page 28: Generalized additive models...Generalized additive models Matteo Fasiolo (University of Bristol, UK) matteo.fasiolo@bristol.ac.uk June 27, 2018 Matteo Fasiolo (University of Bristol,

GAMs in mgcv and mgcViz

mgcv is the recommended R package for fitting GAMs.

Today we’ll work with mgcViz’s interface.

mgcViz extends mgcv’s tools for:

plotting the estimated effects

doing visual model checking

But most of the computation is done by mgcv under the hood.

Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 28 / 30

Page 29: Generalized additive models...Generalized additive models Matteo Fasiolo (University of Bristol, UK) matteo.fasiolo@bristol.ac.uk June 27, 2018 Matteo Fasiolo (University of Bristol,

Further reading

Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 29 / 30

Page 30: Generalized additive models...Generalized additive models Matteo Fasiolo (University of Bristol, UK) matteo.fasiolo@bristol.ac.uk June 27, 2018 Matteo Fasiolo (University of Bristol,

References I

Hastie, T. and R. Tibshirani (1990). Generalized Additive Models, Volume 43. CRCPress.

Wood, S. (2006). Generalized additive models: an introduction with R. CRC press.

Matteo Fasiolo (University of Bristol, UK) Additive modelling June 27, 2018 30 / 30