
BIO656--Multilevel Models, Term 4, 2006

PROJECTS ARE DUE

• By midnight, Friday, May 19th

• Electronic submission only to tlouis@jhsph.edu

• Please name the file: [myname]-project.[filetype] or [name1_name2]-project.[filetype]


Efficiency-Robustness Trade-offs

• First, we consider alternatives to the Gaussian distribution for random effects

• Then, we move to issues of weighting, starting with some formalism

• Then, we move to an example of informative sample size

• Finally, we give a basic example of choosing among weighting schemes, one that has broad implications


Alternatives to the Gaussian Distribution for Random Effects


The t-distribution

• Broader tails than the Gaussian

• So, shrinks less for deviant Y-values

• The t-prior allows “outlying” parameters, so a deviant Y is less indicative of a large level-1 residual


Creating a t-distribution

• Assume a Gaussian sampling distribution

• Using the sample standard deviation produces the t-distribution

• Z is t with a large df

• t3 is the most different from Z for t-distributions with a finite variance
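The construction above can be checked by simulation. The following is a minimal sketch, not part of the original slides: the sample size of 4 (giving 3 df), the cutoff of 3, the seed, and the number of replications are all illustrative assumptions. It standardizes Gaussian sample means once with the sample standard deviation and once with the true one, then compares how often each statistic lands far from zero.

```python
import numpy as np
from math import erfc, sqrt

rng = np.random.default_rng(0)   # fixed seed so the run is repeatable (assumption)
n = 4                            # sample size, so the t-statistic has n - 1 = 3 df
reps = 200_000                   # number of simulated samples (assumption)

x = rng.standard_normal((reps, n))    # Gaussian samples, true mean 0 and true sd 1
xbar = x.mean(axis=1)
s = x.std(axis=1, ddof=1)             # sample standard deviation

t_stat = xbar / (s / np.sqrt(n))      # standardized with the sample sd: t with 3 df
z_stat = xbar / (1.0 / np.sqrt(n))    # standardized with the true sd: Z

tail_t = float(np.mean(np.abs(t_stat) > 3.0))   # simulated P(|t3| > 3)
tail_z = float(np.mean(np.abs(z_stat) > 3.0))   # simulated P(|Z| > 3)
exact_z = erfc(3.0 / sqrt(2.0))                 # exact Gaussian tail P(|Z| > 3)

print(f"P(|t3| > 3) ~ {tail_t:.4f}")
print(f"P(|Z|  > 3) ~ {tail_z:.4f} (exact {exact_z:.4f})")
```

The t3 tail comes out many times heavier than the Gaussian tail, which is the "broader tails" property that makes the t a candidate prior for random effects.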


With a t-prior, the shrinkage depends on the data: B is B(Y), and the weight on Y, (1 - B), increases with |Y - μ|
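A small numerical sketch may help here. It assumes the usual convention that the posterior mean is μ + (1 - B)(Y - μ), so (1 - B) is the weight on Y, and it takes μ = 0, a unit sampling variance, and unit-scale priors purely for illustration; none of these values come from the slides. The posterior mean is computed by brute-force integration on a grid.

```python
import numpy as np

GRID = np.linspace(-40.0, 40.0, 80001)   # theta values for numerical integration

def posterior_mean(y, prior, sigma=1.0):
    """Posterior mean of theta when Y | theta ~ N(theta, sigma^2) and theta ~ prior."""
    like = np.exp(-0.5 * ((y - GRID) / sigma) ** 2)
    post = like * prior(GRID)             # unnormalized posterior on the grid
    return float(np.sum(GRID * post) / np.sum(post))

def gaussian_prior(theta):
    return np.exp(-0.5 * theta ** 2)      # N(0, 1) prior, unnormalized

def t3_prior(theta):
    return (1.0 + theta ** 2 / 3.0) ** -2.0   # Student-t prior with 3 df, unnormalized

ys = [1.0, 3.0, 6.0]                      # increasingly deviant observations
# With the prior centered at 0, the weight on Y is (posterior mean) / Y.
w_gauss = [posterior_mean(y, gaussian_prior) / y for y in ys]
w_t3 = [posterior_mean(y, t3_prior) / y for y in ys]

for y, wg, wt in zip(ys, w_gauss, w_t3):
    print(f"Y = {y:3.1f}   (1 - B) Gaussian prior: {wg:.3f}   t3 prior: {wt:.3f}")
```

Under the Gaussian prior the weight on Y stays constant, while under the t3 prior it grows as Y moves away from the center, so deviant observations are shrunk less.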


Z is distance from the center

(1-B) = ½ = 0.50


Z is distance from the center

(1 - B) = 2/3 ≈ 0.667


Estimated Gaussian & Fully Non-parametric priors for the USRDS data


USRDS estimated Priors


Informative Sample Size
(Similar to informative censoring)

See Louis et al. SMMR 2006


Choosing among weighting schemes

“Optimality” versus goal achievement


Inferential Context

Question: What is the average length of in-hospital stay?

A more specific question

• What is the average length of stay for:

– Several hospitals of interest?

– Maryland hospitals?

– All hospitals?

– .......


“Data” Collection & Goal

Data gathered from 5 hospitals

• Hospitals are selected by some method

• $n_{hosp}$ patient records are sampled at random

• Length of stay (LOS) is recorded

Goal is to: Estimate the “population” mean


Procedure

• Compute hospital-specific means

• “Average” them

– For simplicity, assume that the population variance is known and the same for all hospitals

How should we compute the average?

• Need a goal and then a good/best way to combine information


“DATA”

Hospital   # sampled ($n_{hosp}$)   Hospital size   % of total size ($100\pi_{hosp}$)   Mean LOS   Within-hospital variance
1          30                       100             10                                  25         $\sigma^2/30$
2          60                       150             15                                  35         $\sigma^2/60$
3          15                       200             20                                  15         $\sigma^2/15$
4          30                       250             25                                  40         $\sigma^2/30$
5          15                       300             30                                  10         $\sigma^2/15$
Total      150                      1000            100                                 ?          ?


Weighted averages & Variances
(Variances are based on FE not RE)

Weighting approach                    Weights x100 (hospitals 1-5)   Mean   Variance ratio 100*(Var/min)
Equal                                 20 20 20 20 20                 25.0   130
Proportional to reciprocal variance   20 40 10 20 10                 29.5   100
Population ($\pi_{hosp}$)             10 15 20 25 30                 23.8   172

Each weighted average is mean = $\sum_k w_k \bar{X}_k$, where $\bar{X}_k$ is the mean LOS in hospital $k$

Reciprocal variance weights minimize variance

Is that our goal?
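The three rows of the weighting table can be reproduced directly from the “DATA” slide. The sketch below uses only the sample sizes, hospital sizes, and mean LOS values given there; the variable and function names are illustrative. Because the common variance σ² cancels in the ratio, variances are computed in units of σ².

```python
n = [30, 60, 15, 30, 15]             # records sampled per hospital
size = [100, 150, 200, 250, 300]     # hospital sizes
mean_los = [25, 35, 15, 40, 10]      # hospital-specific mean LOS

def normalize(raw):
    """Scale raw weights so they sum to 1."""
    total = sum(raw)
    return [r / total for r in raw]

schemes = {
    "equal": normalize([1] * 5),
    "reciprocal variance": normalize(n),   # 1 / (sigma^2 / n_k) is proportional to n_k
    "population": normalize(size),         # pi_k = hospital size / total size
}

def weighted_mean(w):
    return sum(wk * xk for wk, xk in zip(w, mean_los))

def variance_over_sigma2(w):
    # Var(sum_k w_k Xbar_k) = sigma^2 * sum_k w_k^2 / n_k (fixed-effects variance)
    return sum(wk ** 2 / nk for wk, nk in zip(w, n))

results = {name: (weighted_mean(w), variance_over_sigma2(w)) for name, w in schemes.items()}
min_var = min(v for _, v in results.values())
ratios = {name: 100 * v / min_var for name, (_, v) in results.items()}

for name, (m, _) in results.items():
    print(f"{name:20s} mean = {m:5.2f}   100*(Var/min) = {ratios[name]:6.1f}")
```

The unrounded population-weighted mean is 23.75 and its variance ratio is 171.875, which the table reports as 23.8 and 172.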


There are many weighting choices and weighting goals

• Minimize variance by using reciprocal variance weights

• Minimize bias for the population mean by using population weights (“survey weights”)

• Use policy weights (e.g., equal weighting)

• Use “my weights,” ...


General Setting

When the model is correct

• All weighting schemes estimate the same quantities

– same value for slopes in a multiple regression

• So, it is clearly best to minimize variance by using reciprocal variance weights

When the model is incorrect

• Must consider analysis goals and use appropriate weights

• Of course, it is generally true that our model is not correct!


Weights and their properties

Let $(\mu_1, \mu_2, \mu_3, \mu_4, \mu_5)$ be the hospital-specific E(LOS). Then, $\sum_k w_k \bar{Y}_k$ estimates $\mu_w = \sum_k w_k \mu_k$, where $\bar{Y}_k$ is the sample mean LOS in hospital $k$

• But if $\mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5 = \mu$, then all weighted averages estimate the population mean: $\mu_\pi = \sum_k \pi_k \mu_k$

So, it’s best to minimize the variance

But, if the hospital-specific $\mu_k$ are not all equal, then

• Each set of weights estimates a different target

• Minimizing variance might not be “best”

• For an unbiased estimate of $\mu_\pi$, set $w_k = \pi_k$


The variance-bias tradeoff

General idea: Trade off variance & bias to produce low Mean Squared Error (MSE)

$$\mathrm{MSE} = E\left[(\text{Estimate} - \text{True})^2\right] = \text{Variance} + (\text{Bias})^2$$

• Bias is unknown unless we know the $\mu_k$ (the true hospital-specific mean LOS)

• But, we can study MSE($\mu$, $w$, $\pi$)

• In practice, make some “guesses” and do sensitivity analyses
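One such sensitivity analysis can be sketched as follows. The guesses here are assumptions made only for illustration: the true hospital means are set equal to the observed mean LOS values from the data slide, and the common within-hospital variance is set to a hypothetical 100. The target is the population mean, the sum of $\pi_k \mu_k$.

```python
n = [30, 60, 15, 30, 15]                    # records sampled per hospital
pi = [0.10, 0.15, 0.20, 0.25, 0.30]         # population shares
mu_guess = [25.0, 35.0, 15.0, 40.0, 10.0]   # guessed true means (assumption)
sigma2 = 100.0                              # guessed within-hospital variance (hypothetical)

schemes = {
    "equal": [0.2] * 5,
    "reciprocal variance": [nk / sum(n) for nk in n],
    "population": pi,
}

def mse_parts(w):
    """Return (variance, bias, MSE) of sum_k w_k Ybar_k as an estimate of sum_k pi_k mu_k."""
    target = sum(pk * mk for pk, mk in zip(pi, mu_guess))
    bias = sum(wk * mk for wk, mk in zip(w, mu_guess)) - target
    variance = sigma2 * sum(wk ** 2 / nk for wk, nk in zip(w, n))
    return variance, bias, variance + bias ** 2

parts = {name: mse_parts(w) for name, w in schemes.items()}
for name, (var, bias, mse) in parts.items():
    print(f"{name:20s} variance = {var:6.3f}   bias = {bias:6.3f}   MSE = {mse:7.3f}")
```

Under these guesses the minimum-variance weights have by far the largest MSE because their bias dominates; rerunning with other guesses for `mu_guess` and `sigma2` shows how sensitive that ranking is.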


Variance, Bias and MSE as a function of (the $\mu$s, $w$, $\pi$)

• Consider a true value for the variation of the between-hospital means ($\mu^*$ is the “overall mean”)

$$T = \sum_k (\mu_k - \mu^*)^2$$

• Study BIAS, Variance, MSE for weights that optimize MSE for an assumed value (A) of the between-hospital variance

• So, when A = T, MSE is minimized by this optimizer

• In the following plot, A is converted to a fraction of the total variance A/(A + within-hospital)

– Fraction = 0: minimize variance

– Fraction = 1: minimize bias
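The slides do not spell out the optimizer, so the following is only one possible sketch of compromise weights, under a stated assumption: the hospital means scatter independently around the overall mean with variance A, and the target is the population mean. Under that assumption the expected MSE of a weighted average with weights summing to 1 is the sum of $w_k^2 v_k$ plus A times the sum of $(w_k - \pi_k)^2$, where $v_k = \sigma^2 / n_k$, and a Lagrange multiplier gives the minimizer in closed form. The value of σ² and the function name are hypothetical.

```python
import numpy as np

n = np.array([30, 60, 15, 30, 15], dtype=float)   # records sampled per hospital
pi = np.array([0.10, 0.15, 0.20, 0.25, 0.30])     # population shares
sigma2 = 100.0                                    # hypothetical within-hospital variance
v = sigma2 / n                                    # variance of each hospital mean

def compromise_weights(A):
    """Minimize sum(w^2 * v) + A * sum((w - pi)^2) subject to sum(w) = 1."""
    inv = 1.0 / (v + A)
    # c is half the Lagrange multiplier, chosen so the weights sum to 1
    c = (1.0 - np.sum(A * pi * inv)) / np.sum(inv)
    return (A * pi + c) * inv

for A in (0.0, 1.0, 10.0, 1e6):
    w = compromise_weights(A)
    print(f"A = {A:9.1f}   weights x100 = {np.round(100 * w, 1)}")
```

At A = 0 this returns the reciprocal variance weights, and as A grows the weights move toward the population weights, matching the two endpoints described above.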


The bias-variance trade-off

[Figure: X-axis is the assumed variance fraction; Y-axis is performance computed under the true fraction. Plot label: Assumed $\mu_k$]


Summary

• Much of statistics depends on weighted averages

• Weights should depend on assumptions and goals

• If you trust your (regression) model,

– Then, minimize the variance, using “optimal” weights

– This generalizes the equal case

• If you worry about model validity (bias for $\mu_\pi$),

– You can buy full insurance by using population weights

– But, you pay in variance (efficiency)

– So, consider purchasing only the insurance you need by using compromise weights