term 4, 2006bio656--multilevel models 1 projects are due by midnight, friday, may 19 th electronic...

BIO656--Multilevel Models 1Term 4, 2006

PROJECTS ARE DUEPROJECTS ARE DUE

• By midnight, Friday, May 19th • Electronic submission only to [email protected]• Please name the file: [myname]-project.[filetype]or [name1_name2]-project.[filetype]


Efficiency-Robustness Trade-offsEfficiency-Robustness Trade-offs

• First, we consider alternatives to the Gaussian distribution for random effects

• Then, we move to issues of weighting, starting with some formalism

• Then, move to an example of informative sample size

• And, finally give a basic example that has broad implications of choosing among weighting schemes


Alternatives to the Gaussian Alternatives to the Gaussian Distribution for Random EffectsDistribution for Random Effects


The t-distributionThe t-distribution

• Broader tails than the Gaussian

• So, shrinks less for deviant Y-values

• The t-prior allows “outlying” parameters and so a deviant Y is not so indicative of a large, level 1 residual


Creating a t-distributionCreating a t-distribution

• Assume a Gaussian sampling distribution,

• Using the sample standard deviation produces the t-distribution• Z is t with a large df• t3 is the most different from Z for t-distributions with a finite variance


With a t-prior, B is B(Y), increasing with |Y - |


Z is distancefrom the center

(1-B) = ½ = 0.50


Z is distancefrom the center

(1- B) = 2/3 = 0.666


Estimated Gaussian & Estimated Gaussian & Fully Non-parametric priors Fully Non-parametric priors

for the USRDS data


USRDS estimated PriorsUSRDS estimated Priors


Informative Sample SizeInformative Sample Size(Similar to informative Censoring)(Similar to informative Censoring)

See Louis et al. SMMR 2006


Choosing among weighting schemesChoosing among weighting schemes

“Optimality” versus goal achievement


Inferential ContextInferential Context

Question What is the average length of in-hospital stay?

A more specific question• What is the average length of stay for:

– Several hospitals of interest?

– Maryland hospitals?

– All hospitals?

– .......


““Data” Collection & GoalData” Collection & Goal

Data gathered from 5 hospitals

• Hospitals are selected by some method

• nhosp patient records are sampled at random

• Length of stay (LOS) is recorded

Goal is to: Estimate the “population” mean


ProcedureProcedure

• Compute hospital-specific means

• “Average” them– For simplicity assume that the population variance

is known and the same for all hospitals

How should we compute the average?• Need a goal and then a good/best way to combine information


““DATA”DATA”

Hospital # sampled nhosp

Hospital size

% of Total

size: 100hosp

Mean LOS

Within-hospital variance

1 30 100 10 25 2/30

2 60 150 15 35 2/60

3 15 200 20 15 2/15

4 30 250 25 40 2/30

5 15 300 30 10 2/15

Total 150 1000 100 ? ?


Weighted averages & Variances Weighted averages & Variances (Variances are based on FE not RE)

Weightingapproach

Weightsx100

Mean VarianceRatio

100*(Var/min)

Equal 20 20 20 20 20 25.0 130

Proportional to Reciprocal variance

20 40 10 20 10 29.5 100

Population hosp10 15 20 25 30 23.8 172

Each weighted average is mean =

Reciprocal variance weights minimize varianceIs that our goal?

kkXw


There are many weighting There are many weighting choices and weighting goalschoices and weighting goals

• Minimize variance by using reciprocal variance weights

• Minimize bias for the population mean by using population weights (“survey weights”)

• Use policy weights (e.g., equal weighting)

• Use “my weights,” ...


General SettingGeneral Setting

When the model is correct• All weighting schemes estimate the same quantities

– same value for slopes in a multiple regression

• So, it is clearly best to minimize variance by using reciprocal variance weights

When the model is incorrect• Must consider analysis goals and use appropriate weights• Of course, it is generally true that our model is not correct!


Weights and their propertiesWeights and their properties

• But if1 = 2 = 3 = 4 = 5 =

then all weighted averages estimate the population

mean: kk

So, it’s best to minimize the variance

But, if the hospital-specific k are not all equal, then• Each set of weights estimates a different target• Minimizing variance might not be “best”• For an unbiased estimate of setwk = k

kkkk

54321

w estimates Yw Then,

E(LOS) specific-hospital the be(Let

w

),,,,

πμ

πμ

πμ


The variance-bias tradeoffThe variance-bias tradeoff

General idea Trade-off variance & bias to produce low Mean Squared Error (MSE)

MSE = Expected(Estimate - True)(Estimate - True)22

= Variance + (Bias)Variance + (Bias)22

• Bias is unknown unless we know the k

(the true hospital-specific mean LOS)

• But, we can study MSE (, w, )

• In practice, make some “guesses” and do sensitivity analyses


Variance, Bias and MSE Variance, Bias and MSE as a function of (the as a function of (the ss, , ww, , ))

• Consider a true value for the variation of the between hospital means (* is the “overall mean”)

T = (k - *)2

• Study BIAS, Variance, MSE for weights that optimize MSE for an assumed value (A) of the between-hospital variance

• So, when A = T, MSE is minimized by this optimizer

• In the following plot, A is converted to a fraction of the total variance A/(A + within-hospital)– Fraction = 0 minimize variance– Fraction = 1 minimize bias


The bias-variance trade-offThe bias-variance trade-offX-axis is assumed variance fraction

Y is performance computed under the true fraction

Assumed k


SummarySummary

• Much of statistics depends on weighted averages

• Weights should depend on assumptions and goals

• If you trusttrust your (regression) model,– Then, minimize the variance, using “optimal” weights– This generalizes the equal case

• If you worryworry about model validity (bias for – You can buy full insurance by using population weights– But, you pay in variance (efficiency)– So, consider purchasing only the insurance you need by

using compromise weightsusing compromise weights

term 4, 2006bio656--multilevel models 1 projects are due by midnight, friday, may 19 th electronic...

Documents