Term 4, 2005 BIO656 Multilevel Models

Hierarchical Models for Pooling: A Case Study in Air Pollution Epidemiology

Francesca Dominici

NMMAPS

• National Morbidity and Mortality Air Pollution Study (NMMAPS)

• Daily data on cardiovascular/respiratory mortality in 10 largest cities in U.S.

• Daily particulate matter (PM10) data

• Log-linear regression estimates the relative risk of mortality per 10 µg/m³ increase in PM10 for each city

• Estimate and statistical standard error for each city
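As a rough illustration of the city-specific step, the sketch below fits a log-linear (Poisson) regression to simulated daily death counts for one hypothetical city and converts the PM10 coefficient into a percent increase in mortality per 10 units of PM10. The data, the baseline of 50 deaths per day, and the true effect of 0.5% are all invented for the example. The actual NMMAPS models also adjust for confounders such as weather and season, which are left out here.

```r
set.seed(1)

# Simulated data for one hypothetical city (all values are assumptions)
n.days <- 2000
pm10 <- rgamma(n.days, shape = 4, scale = 10)   # daily PM10, mean 40
true.beta <- log(1.005) / 10                    # 0.5% increase per 10 units
deaths <- rpois(n.days, lambda = 50 * exp(true.beta * pm10))

# Log-linear regression of daily deaths on PM10
fit <- glm(deaths ~ pm10, family = poisson)

# Percent increase in mortality per 10-unit increase in PM10,
# with a delta-method standard error
b <- unname(coef(fit)["pm10"])
se.b <- sqrt(vcov(fit)["pm10", "pm10"])
pct.per.10 <- 100 * (exp(10 * b) - 1)
se.pct <- 100 * 10 * exp(10 * b) * se.b

c(estimate = pct.per.10, se = se.pct)
```

The pair (estimate, se) is the kind of city-level summary that the pooling steps in the rest of the lecture take as input.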

Relative Risks* for Six Largest Cities

City              RR Estimate (% per 10 µg/m³)   Statistical Standard Error   Statistical Variance
Los Angeles       0.25                           0.13                         .0169
New York          1.4                            0.25                         .0625
Chicago           0.60                           0.13                         .0169
Dallas/Ft Worth   0.25                           0.55                         .3025
Houston           0.45                           0.40                         .1600
San Diego         1.0                            0.45                         .2025

* Approximate values read from graph in Daniels et al. 2000, AJE

[Figure: City-specific MLEs for Log Relative Risks. Vertical axis: Percent Change, from -2 to 4.]

Notation

Sources of Variation

$$\hat\beta_c = \beta_c + e_c$$

$$\beta_c = \alpha + d_c$$

$$\hat\beta_c = \alpha + d_c + e_c$$

$$\mathrm{Var}(\hat\beta_c) = \mathrm{Var}(d_c) + \mathrm{Var}(e_c)$$
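A quick simulation can make the two sources of variation concrete. The sketch below assumes that each city's estimate is the overall mean plus a natural (between-city) deviation d plus a statistical error e, with d and e independent. It draws many hypothetical cities with made-up variances and checks that the variance of the estimates is close to the sum of the two components.

```r
set.seed(2)

n.cities <- 100000     # many hypothetical cities, so sample variances are stable
alpha <- 0.65          # overall mean (assumed value)
nat.var <- 0.08        # Var(d), natural variance (assumed value)
stat.var <- 0.12       # Var(e), statistical variance (assumed value)

d <- rnorm(n.cities, mean = 0, sd = sqrt(nat.var))    # city deviations from alpha
e <- rnorm(n.cities, mean = 0, sd = sqrt(stat.var))   # estimation errors
beta.true <- alpha + d
beta.hat <- beta.true + e

# Variance of the estimates versus the sum of the two components
c(var.beta.hat = var(beta.hat), sum.components = nat.var + stat.var)
```

Rearranging the same identity suggests a moment estimate of the natural variance: the variance of the observed estimates minus the average statistical variance.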

[Figure: City-specific MLEs for Log Relative Risks (*) and True Values (o). Horizontal axis: city. Vertical axis: Percent Change, from -2 to 4.]

[Figure: City-specific MLEs for Log Relative Risks. Horizontal axis: city. Vertical axis: Percent Change, from -2 to 4.]

Notation

Estimating Overall Mean

• Idea: give more weight to more precise values

• Specifically, weight estimates inversely proportional to their variances

Estimating the Overall Mean
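In symbols, the weighted estimate can be written as below. The notation is borrowed from the calculation slides that follow (v_c for the statistical variance, NV for the natural variance, TV_c for the total variance, w_c for the weights), and the hats and the symbol α for the overall mean are assumptions. Treat this as a reconstruction consistent with those calculations rather than the slide's own formula.

```latex
\hat\alpha = \sum_c w_c \,\hat\beta_c, \qquad
w_c = \frac{1/TV_c}{\sum_{c'} 1/TV_{c'}}, \qquad
TV_c = v_c + NV, \qquad
\mathrm{Var}(\hat\alpha) = \frac{1}{\sum_c 1/TV_c}
```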

Calculations for Empirical Bayes Estimates

City      Log RR (bc)   Stat Var (vc)   Total Var (TVc)   1/TVc   wc
LA        0.25          .0169           .0994             10.1    .27
NYC       1.4           .0625           .145              6.9     .18
Chi       0.60          .0169           .0994             10.1    .27
Dal       0.25          .3025           .385              2.6     .07
Hou       0.45          .160            .243              4.1     .11
SD        1.0           .2025           .285              3.5     .09
Overall   0.65                                            37.3    1.00

Overall = .27*0.25 + .18*1.4 + .27*0.60 + .07*0.25 + .11*0.45 + .09*1.0 ≈ 0.65

Var(Overall) = 1/Sum(1/TVc) = 0.164^2

Software in R

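# City-specific estimates (% per 10 µg/m³) and their standard errors
beta.hat <- c(0.25, 1.4, 0.60, 0.25, 0.45, 1.0)
se <- c(0.13, 0.25, 0.13, 0.55, 0.40, 0.45)
# Natural variance: variance of the estimates minus the average statistical variance
NV <- var(beta.hat) - mean(se^2)
# Total variance of each estimate
TV <- se^2 + NV
tmp <- 1/TV
# Weights, normalized to sum to 1
ww <- tmp/sum(tmp)
# Variance of the overall estimate: 1/sum(1/TV)
v.alphahat <- 1/sum(tmp)
# Overall estimate: weighted mean of the city-specific estimates
alpha.hat <- sum(ww*beta.hat)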
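One practical wrinkle: the moment estimate var(beta.hat) - mean(se^2) can come out negative when the city estimates happen to vary less than their standard errors would suggest. A minimal sketch of one common workaround, truncating the natural variance at zero, is below. The function name pool.estimates and its return format are invented for this example and are not part of the course code. The Chicago estimate is taken as 0.60, as in the tables.

```r
# Pool city-specific estimates with inverse total-variance weights.
# pool.estimates is a hypothetical helper, not from the original slides.
pool.estimates <- function(beta.hat, se) {
  # Moment estimate of the natural variance, truncated at zero
  nv <- max(0, var(beta.hat) - mean(se^2))
  tv <- se^2 + nv                 # total variance for each city
  inv.tv <- 1 / tv
  w <- inv.tv / sum(inv.tv)       # weights that sum to 1
  list(nv = nv,
       weights = w,
       alpha.hat = sum(w * beta.hat),          # pooled estimate
       se.alpha.hat = sqrt(1 / sum(inv.tv)))   # its standard error
}

# Six-city values from the table
res <- pool.estimates(beta.hat = c(0.25, 1.4, 0.60, 0.25, 0.45, 1.0),
                      se = c(0.13, 0.25, 0.13, 0.55, 0.40, 0.45))
round(c(res$alpha.hat, res$se.alpha.hat), 2)
```

For the six cities the pooled estimate and its standard error should come out near 0.65 and 0.16, in line with the calculation table.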

Two Extremes

• Natural variance >> Statistical variances
  – Weights wc approximately constant = 1/n
  – Use ordinary mean of estimates regardless of their relative precision

• Statistical variances >> Natural variance
  – Weight each estimator inversely proportional to its statistical variance
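The two extremes can be checked numerically with the six-city values. The sketch below treats the natural variance as a known input (an assumption made only to show the limits) and computes the pooled mean when it is zero and when it is very large.

```r
beta.hat <- c(0.25, 1.4, 0.60, 0.25, 0.45, 1.0)
se <- c(0.13, 0.25, 0.13, 0.55, 0.40, 0.45)

# Pooled mean for a given natural variance nv (treated as known)
pooled.mean <- function(nv) {
  w <- 1 / (se^2 + nv)
  sum(w * beta.hat) / sum(w)
}

# Statistical variances dominate: pure inverse-variance weighting
pooled.mean(0)

# Natural variance dominates: weights approach 1/n,
# so the pooled mean approaches the ordinary mean
pooled.mean(1e6)
mean(beta.hat)
```

With no natural variance the precise Los Angeles and Chicago estimates dominate and the pooled mean is about 0.55. With a very large natural variance it moves up to the ordinary mean, about 0.66.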

[Figure: Sensitivity of Inferences to Natural Variance. Horizontal axis: Log2(Natural Variance), from -4 to 1. Vertical axis: Average Relative Risk, from -0.5 to 1.5.]

Estimating Relative Risk for Each City

• Disease screening analogy
  – Test result from imperfect test
  – Positive predictive value combines prevalence with test result using Bayes theorem

• Empirical Bayes estimator of the true value for a city is the conditional expectation of the true value given the data, $E(\beta_c \mid \hat\beta_c)$
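The screening side of the analogy is a short Bayes theorem calculation. The sketch below uses made-up values for prevalence, sensitivity and specificity. Prevalence plays the role of the prior knowledge about true values across cities, and the test result plays the role of a city's own noisy estimate.

```r
# Hypothetical screening test (all three inputs are made-up values)
prevalence <- 0.01     # P(disease)
sensitivity <- 0.90    # P(test positive | disease)
specificity <- 0.95    # P(test negative | no disease)

# Bayes theorem: P(disease | test positive)
p.pos <- sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
ppv <- sensitivity * prevalence / p.pos
ppv
```

Even with a fairly accurate test, the positive predictive value here is only about 15% because the condition is rare. In the same way, an imprecise city estimate is pulled strongly toward what is known about cities in general.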

Empirical Bayes Estimate

$$\tilde\beta_c = E(\beta_c \mid \hat\beta_c) = \lambda_c \hat\beta_c + (1 - \lambda_c)\hat\alpha$$

$$\lambda_c = \frac{NV}{NV + \mathrm{var}(\hat\beta_c)}$$

$$\mathrm{Var}(\tilde\beta_c \mid \beta_c) = \lambda_c^2 \,\mathrm{Var}(\hat\beta_c) \le \mathrm{Var}(\hat\beta_c)$$

Calculations for Empirical Bayes Estimates

City      Log RR   Stat Var (vc)   Total Var (TVc)   1/TVc   wc     λc    RR.EB   se RR.EB
LA        0.25     .0169           .0994             10.1    .27    .83   0.32    0.11
NYC       1.4      .0625           .145              6.9     .18    .57   1.1     0.14
Chi       0.60     .0169           .0994             10.1    .27    .83   0.61    0.11
Dal       0.25     .3025           .385              2.6     .07    .21   0.56    0.12
Hou       0.45     .160            .243              4.1     .11    .34   0.58    0.14
SD        1.0      .2025           .285              3.5     .09    .29   0.75    0.13
Overall   0.65                     1/37.3 = 0.027    37.3    1.00         0.65    0.16

λc = NV/TVc is the weight on the city's own estimate; RR.EB = λc*(Log RR) + (1 - λc)*(Overall).
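The Empirical Bayes columns can be recomputed from the city estimates and standard errors. In the sketch below the shrinkage weight is the natural variance divided by the total variance, and the standard error of each Empirical Bayes estimate is taken as that weight times the original standard error. The second rule is an inference from the tabulated values, treating the natural variance and overall mean as known, and is not a formula stated in the source.

```r
beta.hat <- c(LA = 0.25, NYC = 1.4, Chi = 0.60, Dal = 0.25, Hou = 0.45, SD = 1.0)
se <- c(0.13, 0.25, 0.13, 0.55, 0.40, 0.45)

nv <- var(beta.hat) - mean(se^2)                 # natural variance (moment estimate)
tv <- se^2 + nv                                  # total variance
alpha.hat <- sum(beta.hat / tv) / sum(1 / tv)    # pooled estimate

lambda <- nv / tv                                # weight on the city's own estimate
rr.eb <- lambda * beta.hat + (1 - lambda) * alpha.hat
se.eb <- lambda * se                             # assumes nv and alpha.hat are known

round(cbind(lambda, rr.eb, se.eb), 2)
```

Cities with large standard errors (Dallas, Houston, San Diego) get small weights on their own estimates and move most of the way to the pooled value, while Los Angeles and Chicago barely move.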

[Figure: City-specific MLEs (Left) and Empirical Bayes Estimates (Right). Horizontal axis: city. Vertical axis: Percent Change, from -2 to 4.]

[Figure: Shrinkage of Empirical Bayes Estimates. Horizontal axis: Percent Increase in Mortality, from 0.0 to 2.0. Maximum likelihood estimates (*) are shown against Empirical Bayes estimates (o).]

[Figure: Sensitivity of Empirical Bayes Estimates. Horizontal axis: Natural Variance, from 0.0 to 1.0. Vertical axis: Relative Risk, from -0.5 to 2.5.]

Key Ideas

• Better to use data for all cities to estimate the relative risk for a particular city
  – Reduce variance by adding some bias
  – Smooth compromise between city-specific estimates and overall mean

• Empirical Bayes estimates depend on measure of natural variation
  – Assess sensitivity to estimate of NV
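One simple way to assess that sensitivity is to recompute the Empirical Bayes estimates over a grid of candidate natural variances. The grid below is an arbitrary choice for illustration, and eb.estimates is a hypothetical helper name.

```r
beta.hat <- c(LA = 0.25, NYC = 1.4, Chi = 0.60, Dal = 0.25, Hou = 0.45, SD = 1.0)
se <- c(0.13, 0.25, 0.13, 0.55, 0.40, 0.45)

# Empirical Bayes estimates for a given natural variance nv
eb.estimates <- function(nv) {
  tv <- se^2 + nv
  alpha.hat <- sum(beta.hat / tv) / sum(1 / tv)
  lambda <- nv / tv
  lambda * beta.hat + (1 - lambda) * alpha.hat
}

# Grid of candidate natural variances (arbitrary choice)
nv.grid <- c(0, 0.02, 0.08, 0.3, 1)
eb.by.nv <- sapply(nv.grid, eb.estimates)
colnames(eb.by.nv) <- paste0("NV=", nv.grid)
round(eb.by.nv, 2)
```

At a natural variance of zero every city collapses to the pooled value. As the natural variance grows, the estimates fan back out toward the city-specific MLEs.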

Daily time series of air pollution, mortality and weather in Baltimore 1987-1994

90 Largest Locations in the USA

Statistical Methods

• Semi-parametric regressions for estimating associations between day-to-day variations in air pollution and mortality, controlling for confounding factors

• Hierarchical models for estimating:
  – national-average relative rate
  – national-average exposure-response relationship
  – heterogeneity of air pollution effects across the country

Hierarchical Models for Estimating a National Average Relative Rate of Mortality

Pooling

City-specific relative rates are pooled across cities to:

1. estimate a national-average air pollution effect on mortality;

2. explore geographical patterns of variation of air pollution effects across the country

Pooling

• Implement the old idea of borrowing strength across studies

• Estimate heterogeneity and its uncertainty

• Estimate a national-average effect which takes into account heterogeneity

City-specific and regional estimates

Spatial Model for Relative Rates

Three Models

• “Three stage”: as in previous slide

• “Two stage”: ignore region effects; assume cities have exchangeable random effects

• Two stage with “spatial” correlation: city random effects have isotropic exponentially decaying autocorrelation function
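To make the third model more concrete, the sketch below builds an isotropic, exponentially decaying correlation matrix for a handful of cities. The coordinates, the range parameter phi and the natural variance are all invented values, and the exact parameterization used in the NMMAPS analyses may differ.

```r
# Hypothetical city coordinates in arbitrary units (not real locations)
coords <- cbind(x = c(0, 1, 4, 9), y = c(0, 2, 1, 5))
rownames(coords) <- paste0("city", 1:4)

# Pairwise Euclidean distances between cities
dist.mat <- as.matrix(dist(coords))

# Isotropic exponential correlation: depends only on distance d,
# and decays as exp(-d / phi)
phi <- 3   # range parameter (assumed value)
corr.mat <- exp(-dist.mat / phi)

# Covariance of the city random effects for an assumed natural variance
nat.var <- 0.08
cov.mat <- nat.var * corr.mat
round(corr.mat, 2)
```

Under the exchangeable two-stage model the off-diagonal entries would all be zero. Here nearby cities have correlated random effects, so they borrow more strength from each other than from distant cities.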

Estimating a national-average relative rate

Dominici, Zeger, Samet RSSA 2000

Samet, Dominici, Zeger et al. NEJM 2000

Epidemiological Evidence from NMMAPS

Maximum likelihood and Bayesian estimates of air pollution effects

• Maximum likelihood: use only city-specific information

• Bayesian: borrow strength across cities

Dominici, McDermott, Zeger, Samet EHP 2003

Shrinkage

Posterior Distribution of National Average

Results Stratified by Cause of Death

Regional map of air pollution effects

Partition of the United States used in the 1996 Review of the NAAQS

Findings

• NMMAPS has provided at least four important findings about air pollution and mortality

1. There is evidence of an association between acute exposure to particulate air pollution and mortality

2. This association is strongest for cardiovascular and respiratory mortality

3. The association is strongest in the Northeast region of the USA

4. The exposure-response relationship is linear

Caveats

• Used simplistic methods to illustrate the key ideas:
  – Treated natural variance and overall estimate as known when calculating uncertainty in EB estimates
  – Assumed normal distribution of true relative risks

• Can do better using Markov Chain Monte Carlo methods (more to come)
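As a preview of what an MCMC treatment might look like, here is a minimal Gibbs sampler for a normal-normal hierarchical model fitted to the six-city values. It is a sketch under stated assumptions, not the method used in NMMAPS: the statistical variances are treated as known, the overall mean has a flat prior, and the natural variance tau2 has an inverse-gamma prior whose parameters a and b are arbitrary choices. With only six cities the results will be sensitive to that prior.

```r
set.seed(3)

beta.hat <- c(0.25, 1.4, 0.60, 0.25, 0.45, 1.0)
v <- c(0.13, 0.25, 0.13, 0.55, 0.40, 0.45)^2   # statistical variances, treated as known
n <- length(beta.hat)

# Assumed prior: flat on alpha, inverse-gamma(a, b) on the natural variance tau2
a <- 1
b <- 0.1

n.iter <- 5000
alpha.draws <- numeric(n.iter)
tau2.draws <- numeric(n.iter)
beta.draws <- matrix(NA_real_, n.iter, n)

alpha <- mean(beta.hat)   # starting values
tau2 <- 0.1

for (i in seq_len(n.iter)) {
  # True city effects given alpha and tau2: precision-weighted compromise
  post.var <- 1 / (1 / v + 1 / tau2)
  post.mean <- post.var * (beta.hat / v + alpha / tau2)
  beta <- rnorm(n, post.mean, sqrt(post.var))

  # Overall mean given the city effects
  alpha <- rnorm(1, mean(beta), sqrt(tau2 / n))

  # Natural variance given the city effects and the overall mean
  tau2 <- 1 / rgamma(1, shape = a + n / 2, rate = b + sum((beta - alpha)^2) / 2)

  beta.draws[i, ] <- beta
  alpha.draws[i] <- alpha
  tau2.draws[i] <- tau2
}

keep <- 1001:n.iter   # discard the first 1000 draws as burn-in
c(alpha = mean(alpha.draws[keep]), sd.alpha = sd(alpha.draws[keep]))
round(colMeans(beta.draws[keep, ]), 2)
```

Unlike the plug-in calculations, the spread of the retained draws reflects uncertainty in the natural variance and the overall mean as well as in each city's estimate, which addresses the first caveat above.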