projects are due
BIO656--Multilevel Models, Term 4, 2006
PROJECTS ARE DUE
• By midnight, Friday, May 19th
• Electronic submission only to [email protected]
• Please name the file: [myname]-project.[filetype] or [name1_name2]-project.[filetype]
Efficiency-Robustness Trade-offs
• First, we consider alternatives to the Gaussian distribution for random effects
• Then, we move to issues of weighting, starting with some formalism
• Then, move to an example of informative sample size
• And finally, give a basic example of choosing among weighting schemes, one that has broad implications
Alternatives to the Gaussian Distribution for Random Effects
The t-distribution
• Broader tails than the Gaussian
• So, shrinks less for deviant Y-values
• The t-prior allows “outlying” parameters and so a deviant Y is not so indicative of a large, level 1 residual
Creating a t-distribution
• Assume a Gaussian sampling distribution,
• Using the sample standard deviation produces the t-distribution
• Z is t with a large df
• $t_3$ is the most different from Z among t-distributions with a finite variance
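The construction on this slide can be checked by simulation. The following is a minimal sketch and not part of the original lecture: it assumes samples of size n = 4 from a standard Gaussian, so the studentized mean has 3 degrees of freedom. The function names, the seeds, and the 1.96 cutoff are illustrative choices.

```python
import math
import random


def studentized_means(n, reps, seed=0):
    """Simulate Ybar / (s / sqrt(n)) for Gaussian samples with true mean 0."""
    rng = random.Random(seed)
    draws = []
    for _ in range(reps):
        y = [rng.gauss(0.0, 1.0) for _ in range(n)]
        ybar = sum(y) / n
        # Sample standard deviation (n - 1 divisor) replaces the known sigma.
        s = math.sqrt(sum((v - ybar) ** 2 for v in y) / (n - 1))
        draws.append(ybar / (s / math.sqrt(n)))
    return draws


def tail_fraction(values, cutoff=1.96):
    """Fraction of values whose absolute value exceeds the cutoff."""
    return sum(abs(v) > cutoff for v in values) / len(values)


# n = 4 observations per sample gives a t-statistic with 3 degrees of freedom.
t3_draws = studentized_means(n=4, reps=50_000)

# Reference draws from Z (standard Gaussian), i.e. t with a very large df.
rng_z = random.Random(1)
z_draws = [rng_z.gauss(0.0, 1.0) for _ in range(50_000)]

print("tail fraction beyond 1.96, t3:", tail_fraction(t3_draws))
print("tail fraction beyond 1.96, Z: ", tail_fraction(z_draws))
```

The studentized draws should put noticeably more mass beyond 1.96 than the Gaussian draws do, which is the "broader tails" property in simulated form.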
With a t-prior, the shrinkage factor depends on the data: B is B(Y), and the weight on the data, (1 - B(Y)), increases with |Y - μ|
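How the shrinkage varies with Y under a t-prior can be explored numerically. The sketch below is an illustration under stated assumptions, not the computation behind the lecture's figures: it assumes Y | θ ~ N(θ, 1), a standard $t_3$ prior for θ centered at μ = 0, and a plain grid sum in place of exact integration. It reports the weight on the data, 1 - B(Y) = (E[θ | Y] - μ)/(Y - μ); all function names are hypothetical.

```python
import math


def posterior_mean_t_prior(y, df=3.0, mu=0.0, sigma=1.0, half_width=40.0, step=0.01):
    """Posterior mean of theta when Y | theta ~ N(theta, sigma^2) and
    (theta - mu) has a standard t prior with `df` degrees of freedom.
    A Riemann sum over a grid is used; unnormalized densities suffice
    because the normalizing constants cancel in the ratio."""
    num = 0.0
    den = 0.0
    steps = int(2 * half_width / step)
    for i in range(steps + 1):
        theta = mu - half_width + i * step
        likelihood = math.exp(-0.5 * ((y - theta) / sigma) ** 2)
        prior = (1.0 + (theta - mu) ** 2 / df) ** (-(df + 1.0) / 2.0)
        weight = likelihood * prior
        num += theta * weight
        den += weight
    return num / den


def data_weight(y, mu=0.0, **kwargs):
    """(1 - B(Y)): share of the distance from mu to Y kept by the posterior mean."""
    return (posterior_mean_t_prior(y, mu=mu, **kwargs) - mu) / (y - mu)


y_values = (0.5, 3.0, 8.0)
weights = [data_weight(y) for y in y_values]
for y, w in zip(y_values, weights):
    print(f"Y = {y:4.1f}: 1 - B(Y) = {w:.3f}")
```

Under a Gaussian prior the same ratio would be constant in Y; here it grows as Y moves away from the center, so deviant observations are shrunk proportionally less.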
Z is distance from the center
(1 - B) = ½ = 0.50
Z is distance from the center
(1 - B) = 2/3 ≈ 0.667
Estimated Gaussian & Fully Non-parametric priors for the USRDS data
USRDS estimated Priors
Informative Sample Size (Similar to Informative Censoring)
See Louis et al. SMMR 2006
Choosing among weighting schemes
“Optimality” versus goal achievement
Inferential Context
Question: What is the average length of in-hospital stay?
A more specific question
• What is the average length of stay for:
– Several hospitals of interest?
– Maryland hospitals?
– All hospitals?
– .......
“Data” Collection & Goal
Data gathered from 5 hospitals
• Hospitals are selected by some method
• $n_{hosp}$ patient records are sampled at random
• Length of stay (LOS) is recorded
Goal: Estimate the “population” mean
Procedure
• Compute hospital-specific means
• “Average” them
– For simplicity assume that the population variance is known and the same for all hospitals
How should we compute the average?
• Need a goal and then a good/best way to combine information
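“DATA”

| Hospital | # sampled ($n_{hosp}$) | Hospital size | % of total size ($100\pi_{hosp}$) | Mean LOS | Within-hospital variance |
|---|---|---|---|---|---|
| 1 | 30 | 100 | 10 | 25 | $\sigma^2/30$ |
| 2 | 60 | 150 | 15 | 35 | $\sigma^2/60$ |
| 3 | 15 | 200 | 20 | 15 | $\sigma^2/15$ |
| 4 | 30 | 250 | 25 | 40 | $\sigma^2/30$ |
| 5 | 15 | 300 | 30 | 10 | $\sigma^2/15$ |
| Total | 150 | 1000 | 100 | ? | ? |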
Weighted averages & Variances
(Variances are based on FE not RE)

| Weighting approach | Weights ×100 | Mean | Variance ratio: 100*(Var/min) |
|---|---|---|---|
| Equal | 20 20 20 20 20 | 25.0 | 130 |
| Proportional to reciprocal variance | 20 40 10 20 10 | 29.5 | 100 |
| Population ($100\pi_{hosp}$) | 10 15 20 25 30 | 23.8 | 172 |

Each weighted average is mean = $\sum_k w_k \bar{X}_k$

Reciprocal variance weights minimize variance. Is that our goal?
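The table entries can be reproduced from the five-hospital data. The sketch below is an illustration rather than the lecturer's own code: it assumes Var(mean_k) = σ²/n_k with a common σ², which is set to 1 because it cancels in the variance ratios. Variable and function names are illustrative.

```python
# Data from the five-hospital example.
n_sampled = [30, 60, 15, 30, 15]          # records sampled per hospital
hospital_size = [100, 150, 200, 250, 300]
mean_los = [25.0, 35.0, 15.0, 40.0, 10.0]
SIGMA2 = 1.0  # assumed common variance; it cancels in the variance ratios


def normalize(raw):
    """Rescale non-negative numbers so they sum to 1."""
    total = sum(raw)
    return [r / total for r in raw]


def weighted_mean(weights, means):
    return sum(w * m for w, m in zip(weights, means))


def weighted_variance(weights, n, sigma2=SIGMA2):
    """Variance of sum_k w_k * mean_k when Var(mean_k) = sigma2 / n_k."""
    return sum(w * w * sigma2 / nk for w, nk in zip(weights, n))


schemes = {
    "equal": normalize([1.0] * 5),
    # 1 / (sigma2 / n_k) is proportional to n_k
    "reciprocal variance": normalize(n_sampled),
    "population": normalize(hospital_size),
}

variances = {name: weighted_variance(w, n_sampled) for name, w in schemes.items()}
min_var = min(variances.values())
results = {
    name: (weighted_mean(w, mean_los), 100.0 * variances[name] / min_var)
    for name, w in schemes.items()
}

for name, (mean, ratio) in results.items():
    print(f"{name:20s} mean = {mean:5.2f}  variance ratio = {ratio:4.0f}")
```

The population-weighted mean comes out as 23.75, which the table reports rounded to 23.8.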
There are many weighting choices and weighting goals
• Minimize variance by using reciprocal variance weights
• Minimize bias for the population mean by using population weights (“survey weights”)
• Use policy weights (e.g., equal weighting)
• Use “my weights,” ...
General Setting
When the model is correct
• All weighting schemes estimate the same quantities
– same value for slopes in a multiple regression
• So, it is clearly best to minimize variance by using reciprocal variance weights
When the model is incorrect
• Must consider analysis goals and use appropriate weights
• Of course, it is generally true that our model is not correct!
Weights and their properties
Let $(\mu_1, \mu_2, \mu_3, \mu_4, \mu_5)$ be the hospital-specific E(LOS). Then, $\sum_k w_k \bar{Y}_k$ estimates $\mu_w = \sum_k w_k \mu_k$
• But if $\mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5 = \mu$, then all weighted averages estimate the population mean: $\mu_\pi = \sum_k \pi_k \mu_k$
So, it’s best to minimize the variance
But, if the hospital-specific $\mu_k$ are not all equal, then
• Each set of weights estimates a different target
• Minimizing variance might not be “best”
• For an unbiased estimate of $\mu_\pi$, set $w_k = \pi_k$
The variance-bias tradeoff
General idea: Trade off variance & bias to produce low Mean Squared Error (MSE)
$$\text{MSE} = E\left[(\text{Estimate} - \text{True})^2\right] = \text{Variance} + (\text{Bias})^2$$
• Bias is unknown unless we know the $\mu_k$ (the true hospital-specific mean LOS)
• But, we can study MSE($\mu$, $w$, $\pi$)
• In practice, make some “guesses” and do sensitivity analyses
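A small sensitivity analysis of this kind can be sketched as follows. This is an illustration under stated assumptions, not the analysis from the lecture: the target is taken to be the population-weighted mean, the two sets of hospital means and the value σ² = 100 are hypothetical guesses, and the function and variable names are invented for the example.

```python
n_sampled = [30, 60, 15, 30, 15]
pi = [0.10, 0.15, 0.20, 0.25, 0.30]  # population shares of the five hospitals


def bias_variance_mse(weights, mu, sigma2):
    """Bias, variance and MSE of sum_k w_k * mean_k as an estimate of
    the population-weighted mean sum_k pi_k * mu_k."""
    target = sum(p * m for p, m in zip(pi, mu))
    expected = sum(w * m for w, m in zip(weights, mu))
    bias = expected - target
    variance = sum(w * w * sigma2 / nk for w, nk in zip(weights, n_sampled))
    return bias, variance, variance + bias ** 2


schemes = {
    "equal": [0.2] * 5,
    "reciprocal variance": [nk / sum(n_sampled) for nk in n_sampled],
    "population": pi,
}

# Hypothetical guesses: (true hospital means, common within-patient variance).
guesses = {
    "unequal means": ([25.0, 35.0, 15.0, 40.0, 10.0], 100.0),
    "equal means": ([25.0] * 5, 100.0),
}

table = {
    label: {name: bias_variance_mse(w, mu, sigma2) for name, w in schemes.items()}
    for label, (mu, sigma2) in guesses.items()
}

for label, rows in table.items():
    print(label)
    for name, (bias, variance, mse) in rows.items():
        print(f"  {name:20s} bias = {bias:6.2f}  var = {variance:6.3f}  MSE = {mse:7.3f}")
```

Under the equal-means guess every scheme is unbiased and the reciprocal variance weights have the smallest MSE; under the unequal-means guess the bias term dominates and the population weights do best.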
Variance, Bias and MSE as a function of (the $\mu$s, $w$, $\pi$)
• Consider a true value for the variation of the between-hospital means ($\mu^*$ is the “overall mean”)
$T = \sum_k (\mu_k - \mu^*)^2$
• Study BIAS, Variance, MSE for weights that optimize MSE for an assumed value (A) of the between-hospital variance
• So, when A = T, MSE is minimized by this optimizer
• In the following plot, A is converted to a fraction of the total variance: A/(A + within-hospital)
– Fraction = 0: minimize variance
– Fraction = 1: minimize bias
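One way to make "weights that optimize MSE for an assumed A" concrete is sketched below. This is an assumed formalization, not necessarily the one used for the lecture's plot: the hospital means are treated as independent with common variance A around the overall mean, the target is the population-weighted mean, and the criterion is $\sum_k w_k^2 v_k + A \sum_k (w_k - \pi_k)^2$ with $v_k = \sigma^2/n_k$ and $\sum_k w_k = 1$. The value σ² = 100 and all names are hypothetical.

```python
n_sampled = [30, 60, 15, 30, 15]
pi = [0.10, 0.15, 0.20, 0.25, 0.30]
SIGMA2 = 100.0  # hypothetical within-patient variance
v = [SIGMA2 / nk for nk in n_sampled]  # sampling variance of each hospital mean


def compromise_weights(assumed_a):
    """Minimize sum_k w_k^2 v_k + A * sum_k (w_k - pi_k)^2 subject to sum_k w_k = 1.
    Setting the Lagrangian's derivative to zero gives
    w_k = (A * pi_k + lam) / (v_k + A), with lam fixed by the constraint."""
    denom = [vk + assumed_a for vk in v]
    lam = (1.0 - sum(assumed_a * p / d for p, d in zip(pi, denom))) / sum(1.0 / d for d in denom)
    return [(assumed_a * p + lam) / d for p, d in zip(pi, denom)]


def expected_mse(weights, true_a):
    """Expected MSE when the between-hospital variance really is true_a."""
    variance = sum(w * w * vk for w, vk in zip(weights, v))
    squared_bias = true_a * sum((w - p) ** 2 for w, p in zip(weights, pi))
    return variance + squared_bias


TRUE_A = 2.0  # hypothetical true between-hospital variance
for assumed in (0.0, 0.5, 2.0, 8.0, 1e6):
    w = compromise_weights(assumed)
    shown = " ".join(f"{100 * wk:5.1f}" for wk in w)
    print(f"A = {assumed:9.1f}  weights x100 = {shown}  MSE = {expected_mse(w, TRUE_A):.4f}")
```

With A = 0 the formula returns the reciprocal variance weights, and as A grows it approaches the population weights, matching the two ends of the fraction scale described above. Intermediate values of A give compromise weights.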
The bias-variance trade-off
X-axis is assumed variance fraction
Y-axis is performance computed under the true fraction
Assumed k
Summary
• Much of statistics depends on weighted averages
• Weights should depend on assumptions and goals
• If you trust your (regression) model,
– Then, minimize the variance, using “optimal” weights
– This generalizes the equal $\mu_k$ case
• If you worry about model validity (bias for $\mu_\pi$),
– You can buy full insurance by using population weights
– But, you pay in variance (efficiency)
– So, consider purchasing only the insurance you need by using compromise weights