Matt VanLandeghem and Grant Sorensen
Too many parameters:
• Lots of variance in predicted values
Too few parameters:
• Missing important parameters
Variance/bias tradeoff
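The tradeoff named above can be written out for squared-error loss. This is the standard textbook decomposition rather than something shown on the slides; the symbols $f$, $\hat{f}$, and $\sigma^2$ are notation introduced here, assuming $y = f(x) + \varepsilon$ with $\mathrm{Var}(\varepsilon) = \sigma^2$.

```latex
$$
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\operatorname{Var}\big[\hat{f}(x)\big]}_{\text{variance}}
  + \sigma^2
$$
```

Too few parameters tends to inflate the bias term; too many tends to inflate the variance term.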
See SAS website
• PROC GLMSELECT
• Version 9.3 documentation (not 9.2)
• http://support.sas.com/documentation/cdl/en/statug/63962/HTML/default/viewer.htm#statug_glmselect_sect037.htm
Variable importance represented as a selection frequency
• Instead of p-value from F test
Estimates based on several “good” models
Distributions of parameter estimates
All of these help us pick the most useful model
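A selection frequency can be illustrated with a minimal Python sketch (not SAS, and not the presenters' code). It assumes some selection method has already been run on several resampled data sets; the effect names, the five selected sets, and the 50% cutoff are all hypothetical values chosen for illustration.

```python
from collections import Counter

# Hypothetical input: the effects chosen by a selection method
# on each of five resampled data sets.
selected_per_sample = [
    {"x1", "x2"},
    {"x1", "x2", "x5"},
    {"x1"},
    {"x1", "x5"},
    {"x1", "x2"},
]


def selection_percentages(selected_sets):
    """Return the percentage of samples in which each effect was selected."""
    counts = Counter(effect for chosen in selected_sets for effect in chosen)
    n_samples = len(selected_sets)
    return {effect: 100.0 * count / n_samples for effect, count in counts.items()}


pct = selection_percentages(selected_per_sample)

# Keep only effects selected in at least min_pct percent of samples
# (the cutoff is an assumed value for this sketch).
min_pct = 50
retained = sorted(effect for effect, p in pct.items() if p >= min_pct)
```

An effect that appears in nearly every resampled model is a stronger candidate than one that appears only occasionally, which is the information a single F-test p-value does not convey.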
Any field where variable selection techniques are used
• Biology (Burnham and Anderson 2002)
• Atmospheric sciences (Sloughter et al. 2007)
• Econometrics (LeSage and Parent 2007)
• Finance (Pesaran et al. 2009)
• Psychology (Wasserman 2000)
• …and others
SAS implementation
• GLMSELECT
• Only GLMs
• Experimental
Sensitive to correlated predictors
• e.g. Homework #4
Extension of regression
• Typical assumptions still apply
• Not a “magic” solution
Other SAS options
• AIC or BIC from SAS procedure of choice
• Model weights based on AIC or BIC
• Averaged “by hand”
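The "by hand" route above can be sketched in a few lines. This is a minimal Python illustration (not SAS) of Akaike weights in the sense of Burnham and Anderson (2002); the AIC values and per-model predictions are hypothetical numbers, and in practice they would come from the output of whichever SAS procedure fit each candidate model.

```python
import math


def akaike_weights(aic_values):
    """Convert AIC scores into model weights that sum to 1."""
    best = min(aic_values)
    # Relative likelihood of each model: exp(-0.5 * delta AIC),
    # where delta AIC is measured from the lowest-AIC model.
    rel_likelihoods = [math.exp(-0.5 * (aic - best)) for aic in aic_values]
    total = sum(rel_likelihoods)
    return [r / total for r in rel_likelihoods]


def weighted_average(values, weights):
    """Weighted average of one quantity across the candidate models."""
    return sum(w * v for v, w in zip(values, weights))


# Hypothetical AIC scores for three candidate models.
aic = [100.0, 102.0, 110.0]
# Hypothetical predictions of the same observation from each model.
predictions = [4.8, 5.4, 7.0]

weights = akaike_weights(aic)
averaged_prediction = weighted_average(predictions, weights)
```

The same function works for BIC if BIC scores are passed in place of AIC. Models far from the best score receive weights near zero, so they contribute little to the averaged prediction.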
Burnham, K.P. and D.R. Anderson. 2002. Model selection and multimodel inference: a practical information-theoretic approach. Springer, New York.
LeSage, J.P. and O. Parent. 2007. Bayesian model averaging for spatial economic models. Geographical Analysis 39:241-267.
Pesaran, M.H., C. Schleicher, and P. Zaffaroni. 2009. Model averaging in risk management with an application to futures markets. Journal of Empirical Finance 16:280-305.
Sloughter, J.M., A.E. Raftery, T. Gneiting, and C. Fraley. 2007. Probabilistic quantitative precipitation forecasting using Bayesian model averaging. Monthly Weather Review 135:3209-3220.
Wasserman, L. 2000. Bayesian model selection and model averaging. Journal of Mathematical Psychology 44:92-107.
Whitney, M. and L. Ngo. 2004. Bayesian model averaging using SAS software. SUGI 29 Proceedings, Paper 203-29.
Pitfall picture: http://www.retrogameoftheday.com/2009/10/retro-game-of-day-pitfall.html
SAS model averaging webpage: http://support.sas.com/documentation/cdl/en/statug/63962/HTML/default/viewer.htm#statug_glmselect_sect026.htm
ods graphics on;
proc glmselect data=colstd seed=3 plots=all;
   model y = x1-x9 / selection=stepwise(choose=cv);
   modelAverage tables=(EffectSelectPct(all) ParmEst(all))
                refit(minpct=50 nsamples=100);
run;
ods graphics off;