Topic 9: Remedies - Purdue University (bacraig/notes512/Topic_9.pdf)
Topic 9: Remedies
Outline
• Review diagnostics for residuals
• Discuss remedies
  – Nonlinear relationship
  – Nonconstant variance
  – Non-Normal distribution
  – Outliers
Diagnostics for residuals
• Look at residuals to find serious/obvious violations of the model assumptions
  – nonlinear relationship
  – non-constant variance
  – non-Normal errors
    • presence of outliers
    • a strongly skewed distribution
Recommendations for checking assumptions
• Plot Y vs X (is it a linear relationship?)
• Use the i=sm## in symbol statement if using SAS to get smoothed curve fit
• If reasonable, fit model to get residuals
• Look at distribution of residuals
• Plot residuals vs X, time/run order, or any other potential explanatory variable
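The checklist above might be carried out along the following lines. This is only a sketch, not the course's own program: it assumes the Toluca data sit in a data set named toluca with variables hours and lotsize (the names used in the PROC REG call elsewhere in these notes). The output data set diag, the variables resid and pred, and the smoothing level 70 are illustrative choices.

```sas
* Step 1: scatterplot of Y vs X with a smoothed curve (i=sm##, ## from 0 to 99);
symbol1 v=circle i=sm70;
proc gplot data=toluca;
  plot hours*lotsize;
run;

* Step 2: fit the line and save residuals and predicted values;
proc reg data=toluca noprint;
  model hours=lotsize;
  output out=diag r=resid p=pred;
run;
quit;

* Step 3: residuals vs X with a horizontal reference line at zero;
symbol1 v=circle i=none;
proc gplot data=diag;
  plot resid*lotsize / vref=0;
run;
quit;
```

Replacing lotsize in the last plot with a run-order variable (if one was recorded) gives the residuals vs time plot.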
Plots of Residuals
• Plot residuals vs
  – Time (run order)
  – X or predicted value (b0+b1X)
• In all cases look for
  – nonrandom patterns
  – outliers (unusual observations)
1. Residuals vs Time
• Pattern in plot suggests dependent errors / lack of independence
• Pattern is usually a linear or quadratic trend and/or cyclical
• If you are interested in more info, read KNNL pp. 108-110
2. Residuals vs X or Ŷ
• Can look for
  – nonconstant variance
  – nonlinear relationship
  – outliers
  – to some extent, non-Normality of residuals
Assessment of Normality
• Look at distribution of residuals
  – Can look at them together because E(e) = 0 for every residual (unlike the Y’s, whose means depend on X)
• Common to use eye-test
  – Histogram
  – Normal quantile plot
  – Scatter in residual plot
Tests for Normality
• H0: data are an i.i.d. sample from a Normal population
• Ha: data are not an i.i.d. sample from a Normal population
• KNNL (p 115) suggest a correlation test that requires a table look-up, based on the relationship between the observations and their normal scores
Tests for Normality
• SAS has several choices for the significance testing procedure
• PROC UNIVARIATE with the normal option provides four procedures
    proc univariate normal;
• Shapiro-Wilk is most common choice
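Both the eye-test plots and the four tests can be obtained from one PROC UNIVARIATE call on the saved residuals. The sketch below assumes a data set toluca with variables hours and lotsize; the names diag and resid are made up for illustration and are not taken from the course files.

```sas
* Save residuals from the straight-line fit (diag and resid are illustrative names);
proc reg data=toluca noprint;
  model hours=lotsize;
  output out=diag r=resid;
run;
quit;

* normal option: Shapiro-Wilk, Kolmogorov-Smirnov, Cramer-von Mises, Anderson-Darling;
proc univariate data=diag normal;
  var resid;
  * histogram with a fitted Normal curve overlaid;
  histogram resid / normal;
  * Normal quantile plot with a reference line from the estimated mean and sd;
  qqplot resid / normal(mu=est sigma=est);
run;
```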
Tests for Normality
Test                  Statistic          p Value
Shapiro-Wilk          W      0.978904    Pr < W      0.8626
Kolmogorov-Smirnov    D      0.09572     Pr > D      >0.1500
Cramer-von Mises      W-Sq   0.033263    Pr > W-Sq   >0.2500
Anderson-Darling      A-Sq   0.207142    Pr > A-Sq   >0.2500
Data: Toluca data set
All P-values > 0.05… Do not reject H0
Not the same as concluding errors are Normally distributed. Just not enough evidence to reject H0
Other tests for model assumptions
• Durbin-Watson test for serially correlated errors (KNNL p 114)
• Modified Levene test for homogeneity of variance (KNNL p 116-118)
• Breusch-Pagan test for homogeneity of variance (KNNL p 118)
• For SAS commands see topic9.sas
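For orientation, two of these tests might be requested as sketched below. This is not the content of topic9.sas, which is not reproduced here. The sketch assumes the toluca data set with variables hours and lotsize, stored in time order for the Durbin-Watson test. The names diag, diag2, resid and group are invented, and the cutoff of 70 used to split the cases into two groups is an assumption made only for illustration. The Breusch-Pagan test is left out of the sketch.

```sas
* Durbin-Watson statistic (dw) with p-values (dwprob), rows assumed in time order;
proc reg data=toluca;
  model hours=lotsize / dw dwprob;
  output out=diag r=resid;
run;
quit;

* Modified Levene (Brown-Forsythe) test: split the cases into two groups by X;
* The cutoff of 70 is an illustrative assumption;
data diag2;
  set diag;
  group = (lotsize > 70);
run;

* hovtest=bf compares absolute deviations of the residuals from group medians;
proc glm data=diag2;
  class group;
  model resid=group;
  means group / hovtest=bf;
run;
quit;
```

With only two groups, the F statistic reported by hovtest=bf is the square of a pooled two-sample t statistic computed on the same absolute deviations.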
Plots vs significance tests
• Significance test results are very dependent on the sample size; with sufficiently large samples we can reject slight deviations from the null hypothesis that would not invalidate results if ignored
• Plots are more likely to suggest a remedy if one is needed
Default graphics with SAS 9.4
proc reg data=toluca;
  model hours=lotsize;
  id lotsize;
run;
Will discuss these diagnostics more with multiple regression
Plots provide rule of thumb limits
Questionable observation (30,273)
Additional summaries in these plots
• Rstudent: Studentized residual…almost all should be between ± 2
• Leverage: “Distance” of X from center…helps determine outlying X values in multivariable setting…outlying X values may be influential
• Cook’s D: Influence of ith case on all predicted values
Lack of fit
• When we have repeat observations at various values of X, we can do a significance test for nonlinearity
• Description in KNNL Section 3.7
• Details of approach discussed when we get to KNNL 17.9, p 762
• Basic idea is to compare two models
• Gplot with a smooth is a better (i.e., simpler) approach for this assessment
SAS code and output
proc reg data=toluca;
  model hours=lotsize / lackfit;
run;

Analysis of Variance
Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
Model               1           252378        252378    105.88   <.0001
Error              23            54825    2383.71562
  Lack of Fit       9            17245    1916.06954      0.71   0.6893
  Pure Error       14            37581    2684.34524
Corrected Total    24           307203
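The F value on the Lack of Fit line can be reproduced by hand from the mean squares in the table. The sketch below uses the usual lack-of-fit statistic, with c the number of distinct X values and n = 25. Note that c = 11 is not stated on the slide; it is inferred here from the degrees of freedom (9 = c - 2 and 14 = n - c).

```latex
F^{*} = \frac{MSLF}{MSPE}
      = \frac{SSLF/(c-2)}{SSPE/(n-c)}
      = \frac{1916.06954}{2684.34524}
      \approx 0.71,
\qquad F^{*} \sim F(9,\,14) \ \text{under } H_0
```

The two models being compared are the straight-line regression (reduced model) and a model with a separate mean at each distinct X value (full model). With a p-value of 0.6893 there is no evidence of lack of fit.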
Nonlinear relationships
• We can model many nonlinear relationships with linear models; some have several explanatory variables (i.e., need to move to multiple linear regression)
  – Y = β0 + β1X + β2X² + e (quadratic)
  – Y = β0 + β1log(X) + e
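Both models are linear in the β's, so PROC REG can fit them once the transformed predictors have been created in a data step. In this sketch the data set mydata and the variables y, x, x2 and logx are hypothetical names, and x is assumed to be positive so that log(x) exists.

```sas
* Create the transformed explanatory variables (all names are hypothetical);
data fit;
  set mydata;
  x2   = x**2;
  logx = log(x);
run;

* Two labeled models fit to the same data;
proc reg data=fit;
  quad: model y = x x2;
  logm: model y = logx;
run;
quit;
```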
Nonlinear Relationships
• Sometimes one can transform a nonlinear equation into a linear equation
• Consider Y = β0exp(β1X) + e
• Can form linear model using log
    log(Y) = log(β0) + β1X + d
• Note that we have changed our assumptions about the error. If we back-transform, Y = β0exp(β1X)exp(d), so the error is now multiplicative rather than additive
Nonlinear Relationships
• If we don’t want to alter assumptions on errors, we can instead perform a nonlinear regression analysis
  – Means vary in nonlinear manner
  – Obs deviate about these means just as in linear regression
• KNNL Chapter 13
• SAS PROC NLIN
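For the exponential mean function Y = β0exp(β1X) + e, a PROC NLIN call could look roughly like this. The data set mydata, the variables y and x, and the starting values in the parms statement are placeholders; reasonable starting values have to come from the data (for example, from the log-linear fit).

```sas
* Nonlinear least squares with additive errors;
* The starting values b0=1 and b1=0.1 are placeholders, not recommendations;
proc nlin data=mydata;
  parms b0=1 b1=0.1;
  model y = b0*exp(b1*x);
run;
```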
Nonconstant variance
• Sometimes we need to model the way in which the error variance changes
  – may be functionally related to X
  – In other words, changes with the mean
• In this case, we use a weighted analysis
• This is discussed in KNNL 11.1
• Use a weight statement in PROC REG
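A weighted fit might be set up as follows. The sketch assumes, purely for illustration, that the error standard deviation is proportional to X, so that the weights are 1/X²; in practice the variance function has to be estimated or justified from the residual plots. The names mydata, wdata, y, x and w are hypothetical.

```sas
* Weights are the reciprocal of the assumed error variance (here variance proportional to x squared);
data wdata;
  set mydata;
  w = 1/(x**2);
run;

* Weighted least squares through the weight statement;
proc reg data=wdata;
  model y = x;
  weight w;
run;
quit;
```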
Non-Normal errors
• Transformations of Y often allow you to still use linear regression
• If not, use a procedure that allows different distributions for the error term
  – SAS PROC GENMOD
Generalized Linear Model
• Possible distributions of Y:
  – Binomial (Y/N or percentage data)
  – Poisson (Count data)
  – Gamma (exponential)
  – Inverse gaussian
  – Negative binomial
  – Multinomial
• Specify a link function for E(Y)
  – May be linear or nonlinear function of X
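As one instance of the choices listed above, count data could be fit with a Poisson distribution and a log link. The data set counts and the variables y and x are hypothetical; the dist= and link= options in the MODEL statement are where the distribution and link function are chosen.

```sas
* Poisson regression: log of E(Y) is modeled as a linear function of x;
proc genmod data=counts;
  model y = x / dist=poisson link=log;
run;
```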
Transformations: Ladder of Re-expression
[Figure: ladder of powers p = 1.5, 1.0, 0.5, 0.0 (log), -0.5, -1.0]
Transformation is x^p or y^p
Transform of y affects error assumptions
Circle of Transformations
What quadrant best describes the shape of the relationship?
[Figure: circle on X and Y axes with quadrants labeled "X up, Y up", "X up, Y down", "X down, Y up", "X down, Y down"]
Transform using appropriate power
Box-Cox Transformations
• Also called power transformations
• These transformations adjust for non-Normality and nonconstant variance
• Y´ = Y^λ or Y´ = (Y^λ - 1)/λ
• In the second form, the limit as λ approaches zero is the (natural) log
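The limit in the last bullet follows from writing Y^λ as exp(λ ln Y) and expanding the exponential; a short derivation, assuming Y > 0:

```latex
\lim_{\lambda \to 0} \frac{Y^{\lambda} - 1}{\lambda}
  = \lim_{\lambda \to 0} \frac{e^{\lambda \ln Y} - 1}{\lambda}
  = \lim_{\lambda \to 0} \frac{\lambda \ln Y + \tfrac{1}{2}(\lambda \ln Y)^{2} + \cdots}{\lambda}
  = \ln Y
```

This is why λ = 0 is read as the log transformation even though Y^0 itself is constant.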
Important Special Cases
• λ = 1, Y´ = Y^1, no transformation
• λ = .5, Y´ = Y^(1/2), square root
• λ = -.5, Y´ = Y^(-1/2), one over square root
• λ = -1, Y´ = Y^(-1) = 1/Y, inverse
• λ = 0, Y´ = (natural) log of Y
Box-Cox Details
• We can estimate λ by including it as a parameter in a non-linear model
    Y^λ = β0 + β1X + e
  and use the method of maximum likelihood to estimate it
• Details are in KNNL p 134-137
• SAS code is in boxcox.sas
Box-Cox Solution
• Standardized transformed Y is
  – K1(Y^λ - 1) if λ ≠ 0
  – K2 log(Y) if λ = 0
  where K2 = (Π Yi)^(1/n) (the geometric mean) and K1 = 1/(λ K2^(λ-1))
• Run regressions with X as explanatory variable
• Best choice of λ minimizes SSE
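The link between maximum likelihood and "minimize SSE" can be stated compactly. The notation below (W for the standardized transformed response, SSE_W for the error sum of squares from regressing W on X) is introduced here for illustration; the result is the standard profile log-likelihood for the Box-Cox model with Normal errors.

```latex
W_i(\lambda) =
\begin{cases}
K_1 \left( Y_i^{\lambda} - 1 \right), & \lambda \neq 0 \\
K_2 \log Y_i, & \lambda = 0
\end{cases}
\qquad
\ell_{\max}(\lambda) = -\frac{n}{2} \log\!\left( \frac{SSE_W(\lambda)}{n} \right) + \text{const}
```

Because the log-likelihood is a decreasing function of SSE_W(λ), the λ with the smallest SSE is the maximum likelihood estimate. The constants K1 and K2 put the transformed responses on a common scale, which is what makes the SSE values comparable across different λ.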
Example
data a1;
  input age plasma @@;
cards;
0 13.44 0 12.84 0 11.91 0 20.09
0 15.60 1 10.11 1 11.38 1 10.28
1 8.96 1 8.59 2 9.83 2 9.00
2 8.65 2 7.85 2 8.88 3 7.94
3 6.01 3 5.14 3 6.90 3 6.77
4 4.86 4 5.10 4 5.67 4 5.75
4 6.23
;
Variability large initially and gets smaller. Is it just different at age=0? Or is it more a function of age?
Box-Cox Procedure
*Procedure that will automatically find the Box-Cox transformation;
proc transreg data=a1;
model boxcox(plasma)=identity(age);
run;
Transformation Information for BoxCox(plasma)
Lambda R-Square Log Like
-2.50 0.76 -17.0444
-2.00 0.80 -12.3665
-1.50 0.83 -8.1127
-1.00 0.86 -4.8523 *
-0.50 0.87 -3.5523 <
0.00 + 0.85 -5.0754 *
0.50 0.82 -9.2925
1.00 0.75 -15.2625
1.50 0.67 -22.1378
2.00 0.59 -29.4720
2.50 0.50 -37.0844
< - Best Lambda
* - Confidence Interval
+ - Convenient Lambda
Based on these results, either take inverse square root or log
Doing procedure inside SAS data steps
*The first part of the program gets the geometric mean;
data a2; set a1; lplasma=log(plasma);
proc univariate data=a2 noprint;
  var lplasma;
  output out=a3 mean=meanl;
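data a4; set a2; if _n_ eq 1 then set a3; keep age yl l;
k2=exp(meanl); *geometric mean of plasma;
do l = -1.0 to 1.0 by .1;
  *lambda = 0 uses the log form, which also avoids dividing by a near-zero l;
  if abs(l) < 1E-8 then yl=k2*log(plasma);
  else do;
    k1=1/(l*k2**(l-1));
    yl=k1*(plasma**l -1);
  end;
  output;
end;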
proc sort data=a4 out=a4; by l;
proc reg data=a4 noprint outest=a5; model yl=age; by l;
data a5; set a5; n=25; p=2; sse=(n-p)*(_rmse_)**2;
proc print data=a5; var l sse;
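The data step that computes sse relies on the fact that the outest= data set stores the root mean square error (_rmse_) rather than SSE. With n = 25 observations and p = 2 regression parameters (intercept and slope), SSE is recovered as:

```latex
SSE = (n - p)\, MSE = (n - p)\,(\mathrm{RMSE})^{2} = 23\,(\mathrm{RMSE})^{2}
```

The values n = 25 and p = 2 are hard-coded for this example and would need to change for another data set or model.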
Obs      l       sse
  1   -1.0   33.9089
  2   -0.9   32.7044
  3   -0.8   31.7645
  4   -0.7   31.0907
  5   -0.6   30.6868
  6   -0.5   30.5596 ***
  7   -0.4   30.7186
  8   -0.3   31.1763
  9   -0.2   31.9487
 10   -0.1   33.0552
symbol1 v=none i=join;
proc gplot data=a5;
  plot sse*l;
run;
Compare Regressions
• Can also create large data set of standardized transformed Y’s and compare R² fits
• Best λ (l in the SAS code) maximizes R²
• This approach is also provided in the SAS code.
data a2; set a1;
do l = -1.0 to -.1 by .1;
  yl=(plasma**l -1)/l;
  output;
end;
l=0; yl=log(plasma); output;
do l = .1 to 1 by .1;
  yl=(plasma**l -1)/l;
  output;
end;
proc sort data=a2 out=a2; by l;
proc reg noprint outest=a3; model yl=age/selection=rsquare; by l;
proc print data=a3; var l _rsq_;
proc gplot data=a3; plot _rsq_*l/frame;
run;
Now let’s apply the transformation
data a1; set a1;
  tplasma = plasma**(-.5);
  tplasma1 = log(plasma);
symbol1 v=circle i=sm50;
proc gplot; plot tplasma*age;
proc gplot; plot tplasma1*age;
run;
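After transforming, the diagnostics from earlier in the topic can be repeated on the new response to see whether the remedy worked. The sketch below does this for the inverse square root only. The data set and variable names a1t, diagt, resid and pred are made up for illustration; a1, age and plasma are the names used in the example above.

```sas
* Inverse square root transformation of the response;
data a1t;
  set a1;
  tplasma = plasma**(-.5);
run;

* Refit on the transformed scale and save residuals and predicted values;
proc reg data=a1t;
  model tplasma=age;
  output out=diagt r=resid p=pred;
run;
quit;

* Residuals vs age: the spread should now look roughly constant;
symbol1 v=circle i=none;
proc gplot data=diagt;
  plot resid*age / vref=0;
run;
quit;

* Normality checks on the new residuals;
proc univariate data=diagt normal;
  var resid;
  qqplot resid / normal(mu=est sigma=est);
run;
```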
Background Reading
• Sections 3.4 - 3.7 describe significance tests for assumptions (read it if you are interested).
• Box-Cox transformation is in boxcox.sas
• Read Sections 4.1, 4.2, 4.4, 4.5, and 4.6