![Page 1: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/1.jpg)
Fixing problems with the model
Transforming the data so that the simple linear regression model is
okay for the transformed data.
![Page 2: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/2.jpg)
Options for fixing problems with the model
• Abandon simple linear regression model and find a more appropriate – but typically more complex – model.
• Transform the data so that the simple linear regression model works for the transformed data.
![Page 3: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/3.jpg)
Abandoning the model
• If not linear: try a different function, like a quadratic (Ch. 7) or an exponential function (Ch. 13).
• If unequal error variances: use weighted least squares (Ch. 10).
• If error terms are not independent: try fitting a time series model (Ch. 12).
• If important predictor variables omitted: try fitting a multiple regression model (Ch. 6).
• If outlier: use robust estimation procedure (Ch. 10).
![Page 4: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/4.jpg)
Choices for transforming the data
• Transform X values only.
• Transform Y values only.
• Transform both the X and the Y values.
![Page 5: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/5.jpg)
Transforming the X values only
![Page 6: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/6.jpg)
Transforming the X values only
• Appropriate when non-linearity is the only problem – normality and equal variance okay – with the model.
• Transforming the Y values would likely change the well-behaved error terms into badly-behaved error terms.
![Page 7: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/7.jpg)
Memory retention
time     prop
1        0.84
5        0.71
15       0.61
30       0.56
60       0.54
120      0.47
240      0.45
480      0.38
720      0.36
1440     0.26
2880     0.20
5760     0.16
10080    0.08
• Subjects were asked to memorize a list of disconnected items, then asked to recall them at various times up to a week later.
• Predictor time = time, in minutes, since the list was initially memorized.
• Response prop = proportion of items recalled correctly.
Example 1
![Page 8: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/8.jpg)
Fitted line plot
[Figure: regression plot of prop versus time with the fitted line.]
prop = 0.525870 - 0.0000557 time
S = 0.152284   R-Sq = 57.1%   R-Sq(adj) = 53.2%
Example 1
![Page 9: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/9.jpg)
Residual vs. fits plot
[Figure: residuals versus the fitted values (response is prop).]
Example 1
![Page 10: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/10.jpg)
Normal probability plot
[Figure: normal probability plot of the residuals (RESI1).]
N: 13   Average: -0.0000000   StDev: 0.145801
W-test for Normality: R: 0.9751   P-Value (approx): > 0.1000
Example 1
![Page 11: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/11.jpg)
Transform the X values
time     prop    log10_time
1        0.84    0.00000
5        0.71    0.69897
15       0.61    1.17609
30       0.56    1.47712
60       0.54    1.77815
120      0.47    2.07918
240      0.45    2.38021
480      0.38    2.68124
720      0.36    2.85733
1440     0.26    3.15836
2880     0.20    3.45939
5760     0.16    3.76042
10080    0.08    4.00346
Change (“transform”) the predictor time to log10(time).
Example 1
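As a rough check on what this transformation does, the sketch below refits the memory retention data in plain Python, once with time and once with log10(time) as the predictor. The helper name `fit_line` and the variable names are illustrative choices, not part of the original slides, and the fit is ordinary least squares written out by hand rather than Minitab's own routine.

```python
import math

# Memory retention data from the slide: minutes since memorization and
# proportion of items recalled correctly.
time = [1, 5, 15, 30, 60, 120, 240, 480, 720, 1440, 2880, 5760, 10080]
prop = [0.84, 0.71, 0.61, 0.56, 0.54, 0.47, 0.45, 0.38, 0.36, 0.26, 0.20, 0.16, 0.08]


def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1, r_squared)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return b0, b1, 1 - sse / sst


# Transform the predictor only; the response is left unchanged.
log10_time = [math.log10(t) for t in time]

raw_fit = fit_line(time, prop)        # prop on time
log_fit = fit_line(log10_time, prop)  # prop on log10(time)

print("prop on time:        b0=%.6f b1=%.7f R-Sq=%.3f" % raw_fit)
print("prop on log10(time): b0=%.6f b1=%.6f R-Sq=%.3f" % log_fit)
```

Only the predictor column changes between the two fits, so any improvement in R-Sq comes from straightening the relationship, not from altering the error terms of the response.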
![Page 12: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/12.jpg)
Fitted line plot using transformed X values
[Figure: regression plot of prop versus log10time with the fitted line.]
prop = 0.846415 - 0.182427 log10time
S = 0.0233881   R-Sq = 99.0%   R-Sq(adj) = 98.9%
Example 1
![Page 13: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/13.jpg)
Residuals vs. fits plot using transformed X values
[Figure: residuals versus the fitted values (response is prop).]
Example 1
![Page 14: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/14.jpg)
Normal probability plot using transformed X values
[Figure: normal probability plot of the residuals (RESI1).]
N: 13   Average: -0.0000000   StDev: 0.0223924
W-test for Normality: R: 0.9786   P-Value (approx): > 0.1000
Example 1
![Page 15: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/15.jpg)
Predicting new proportion
Estimated regression function:
$$\hat{Y} = 0.846 - 0.182 \log_{10}(\text{time})$$
Therefore, we predict the proportion of words recalled after 1000 minutes is:
$$\hat{Y} = 0.846 - 0.182 \log_{10}(1000) = 0.846 - 0.182(3) = 0.30$$
Example 1
![Page 16: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/16.jpg)
Predicting new proportion
Example 1
Predicted Values for New Observations
New Obs   Fit     SE Fit    95.0% CI          95.0% PI
1         0.299   0.00765   (0.282, 0.316)    (0.245, 0.353)
Values of Predictors for New Observations
New Obs   log10tim
1         3.00
We can be 95% confident that a person will recall between 24.5% and 35.3% of the words after 1000 minutes.
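The two intervals in the Minitab output can be reproduced by hand from the reported Fit, SE Fit, and S. The sketch below is one way to do it, assuming the usual simple linear regression formulas; the critical value 2.201 for 11 degrees of freedom comes from a standard t table, not from the slides, and all variable names are illustrative.

```python
import math

# Values read from the Minitab output for Example 1.
fit = 0.299        # fitted value at log10(time) = 3
se_fit = 0.00765   # standard error of the fit
s = 0.0233881      # residual standard deviation S from the fitted line plot
t_crit = 2.201     # assumption: t(0.975, df = 13 - 2 = 11) from a t table

# Confidence interval for the mean response uses SE(fit) alone.
ci = (fit - t_crit * se_fit, fit + t_crit * se_fit)

# Prediction interval for a new observation adds the error variance S^2.
se_pred = math.sqrt(s ** 2 + se_fit ** 2)
pi = (fit - t_crit * se_pred, fit + t_crit * se_pred)

print("95%% CI: (%.3f, %.3f)" % ci)
print("95%% PI: (%.3f, %.3f)" % pi)
```

The prediction interval is wider because it accounts for the scatter of an individual response around the mean, not just the uncertainty in the estimated mean.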
![Page 17: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/17.jpg)
Transforming the Y values only
![Page 18: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/18.jpg)
Transforming the Y values only
• Appropriate when non-normality and/or unequal variances are the problems.
• The transformation on Y may also help to “straighten out” a curved relationship.
![Page 19: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/19.jpg)
Gestation time and birth weight for mammals
Mammal      Birthwgt   Gestation
Goat        2.75       155
Sheep       4.00       175
Deer        0.48       190
Porcupine   1.50       210
Bear        0.37       213
Hippo       50.00      243
Horse       30.00      340
Camel       40.00      380
Zebra       40.00      390
Giraffe     98.00      457
Elephant    113.00     670
• Predictor Birthwgt = birth weight, in kg, of mammal.
• Response Gestation = number of days until birth.
Example 2
![Page 20: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/20.jpg)
Fitted line plot
[Figure: regression plot of Gestation versus Birthwgt with the fitted line.]
Gestation = 187.084 + 3.59137 Birthwgt
S = 66.0943   R-Sq = 83.9%   R-Sq(adj) = 82.1%
Example 2
![Page 21: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/21.jpg)
Residual vs. fits plot
[Figure: residuals versus the fitted values (response is Gestation).]
Example 2
![Page 22: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/22.jpg)
Normal probability plot
[Figure: normal probability plot of the residuals (RESI1).]
N: 11   Average: -0.0000000   StDev: 62.7025
W-test for Normality: R: 0.9703   P-Value (approx): > 0.1000
Example 2
![Page 23: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/23.jpg)
Transform the Y values
Mammal      Birthwgt   Gestation   log10Gest
Goat        2.75       155         2.19033
Sheep       4.00       175         2.24304
Deer        0.48       190         2.27875
Porcupine   1.50       210         2.32222
Bear        0.37       213         2.32838
Hippo       50.00      243         2.38561
Horse       30.00      340         2.53148
Camel       40.00      380         2.57978
Zebra       40.00      390         2.59106
Giraffe     98.00      457         2.65992
Elephant    113.00     670         2.82607
Change (“transform”) the response Gestation to log10(Gestation).
Example 2
![Page 24: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/24.jpg)
Fitted line plot using transformed Y values
[Figure: regression plot of log10Gest versus Birthwgt with the fitted line.]
log10Gest = 2.29256 + 0.0045211 Birthwgt
S = 0.0939425   R-Sq = 80.3%   R-Sq(adj) = 78.1%
Example 2
![Page 25: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/25.jpg)
Residual vs. fits plot using transformed Y values
[Figure: residuals versus the fitted values (response is log10Gest).]
Example 2
![Page 26: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/26.jpg)
Normal probability plot using transformed Y values
[Figure: normal probability plot of the residuals (RESI2).]
N: 11   Average: -0.0000000   StDev: 0.0891217
W-test for Normality: R: 0.9743   P-Value (approx): > 0.1000
Example 2
![Page 27: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/27.jpg)
Predicting new gestation
Estimated regression function:
$$\log_{10}(\widehat{Gest}) = 2.29 + 0.0045 \, \text{Birthwgt}$$
Therefore, since:
$$\log_{10}(\widehat{Gest}) = 2.29 + 0.0045(50) = 2.515$$
we predict the gestation length of another mammal at 50 kg to be:
$$\widehat{Gest} = 10^{\log_{10}(\widehat{Gest})} = 10^{2.515} = 327.3$$
Example 2
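The same prediction can be traced in code. The sketch below refits log10(Gestation) on Birthwgt by ordinary least squares written out by hand, predicts on the log scale at 50 kg, and back-transforms with a power of 10. Variable names are illustrative. Because it keeps the unrounded coefficients, the back-transformed value comes out near 330 days, slightly above the 327.3 obtained on the slide from the rounded coefficients 2.29 and 0.0045.

```python
import math

# Mammal data from the slide: birth weight (kg) and gestation (days).
birthwgt = [2.75, 4.00, 0.48, 1.50, 0.37, 50.00, 30.00, 40.00, 40.00, 98.00, 113.00]
gestation = [155, 175, 190, 210, 213, 243, 340, 380, 390, 457, 670]

# Transform the response only.
log10_gest = [math.log10(g) for g in gestation]

# Least squares fit of log10(Gestation) on Birthwgt.
n = len(birthwgt)
x_bar = sum(birthwgt) / n
y_bar = sum(log10_gest) / n
sxx = sum((x - x_bar) ** 2 for x in birthwgt)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(birthwgt, log10_gest))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Predict on the log scale, then undo the log10 transformation.
log_pred = b0 + b1 * 50.0
pred_days = 10 ** log_pred

print("log10Gest = %.5f + %.7f Birthwgt" % (b0, b1))
print("predicted log10(Gestation) at 50 kg: %.4f" % log_pred)
print("predicted gestation at 50 kg: %.1f days" % pred_days)
```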
![Page 28: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/28.jpg)
Predicting new gestation
Example 2
Predicted Values for New Observations
New Obs   Fit      SE Fit   95.0% CI            95.0% PI
1         2.5186   0.0306   (2.4494, 2.5878)    (2.2951, 2.7421)
Values of Predictors for New Observations
New Obs   Birthwgt
1         50.0
Back-transforming the prediction interval limits:
$$10^{2.2951} = 197.3 \qquad 10^{2.7421} = 552.2$$
We can be 95% confident that the gestation length for a new mammal at 50 kg will be between 197.3 and 552.2 days.
![Page 29: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/29.jpg)
Transforming both the X and Y values
![Page 30: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/30.jpg)
Transforming both the X and Y values
• Appropriate when the error terms are not normal, have unequal variances, and the function is not linear.
• Transforming the Y values corrects the problems with the error terms (and may help the non-linearity).
• Transforming the X values corrects the non-linearity.
![Page 31: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/31.jpg)
Diameter (inches) and volume (cu. ft.) of 70 shortleaf pines
Example 3
[Figure: regression plot of Volume versus Diameter with the fitted line.]
Volume = -41.5681 + 6.83672 Diameter
S = 9.87485   R-Sq = 89.3%   R-Sq(adj) = 89.1%
![Page 32: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/32.jpg)
Residuals vs. fits plot
Example 3
[Figure: standardized residuals versus the fitted values (response is Volume).]
![Page 33: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/33.jpg)
Normal probability plot
Example 3
[Figure: normal probability plot of the standardized residuals (SRES1).]
N: 70   Average: 0.0085024   StDev: 1.02852
W-test for Normality: R: 0.9409   P-Value (approx): < 0.0100
![Page 34: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/34.jpg)
Transform the Y values only
Diameter   Volume   logVol
4.4        2.0      0.69315
4.6        2.2      0.78846
5.0        3.0      1.09861
5.1        4.3      1.45862
5.1        3.0      1.09861
5.2        2.9      1.06471
5.2        3.5      1.25276
5.5        3.4      1.22378
5.5        5.0      1.60944
5.6        7.2      1.97408
5.9        6.4      1.85630
5.9        5.6      1.72277
7.5        7.7      2.04122
7.6        10.3     2.33214
… and so on …
Transform the response volume to log_e(volume).
Example 3
![Page 35: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/35.jpg)
Fitted line plot using transformed Y values
[Figure: regression plot of logVol versus Diameter with the fitted line.]
logVol = 0.451703 + 0.239531 Diameter
S = 0.322919   R-Sq = 90.5%   R-Sq(adj) = 90.4%
Example 3
![Page 36: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/36.jpg)
Residuals vs. fits plot using transformed Y values
[Figure: standardized residuals versus the fitted values (response is logVol).]
Example 3
![Page 37: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/37.jpg)
Normal probability plot using transformed Y values
[Figure: normal probability plot of the standardized residuals (SRES4).]
N: 70   Average: -0.0077969   StDev: 1.01888
W-test for Normality: R: 0.9610   P-Value (approx): < 0.0100
Example 3
![Page 38: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/38.jpg)
Transform both the X and Y values
Diameter   Volume   logDiam   logVol
4.4        2.0      1.48160   0.69315
4.6        2.2      1.52606   0.78846
5.0        3.0      1.60944   1.09861
5.1        4.3      1.62924   1.45862
5.1        3.0      1.62924   1.09861
5.2        2.9      1.64866   1.06471
5.2        3.5      1.64866   1.25276
5.5        3.4      1.70475   1.22378
5.5        5.0      1.70475   1.60944
5.6        7.2      1.72277   1.97408
5.9        6.4      1.77495   1.85630
5.9        5.6      1.77495   1.72277
7.5        7.7      2.01490   2.04122
7.6        10.3     2.02815   2.33214
… and so on …
Transform the predictor diameter to log_e(diameter).
Transform the response volume to log_e(volume).
Example 3
![Page 39: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/39.jpg)
Fitted line plot using transformed X and Y values
Example 3
[Figure: regression plot of logVol versus logDiam with the fitted line.]
logVol = -2.87179 + 2.56442 logDiam
S = 0.170263   R-Sq = 97.4%   R-Sq(adj) = 97.3%
![Page 40: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/40.jpg)
Residual plot using transformed X and Y values
Example 3
[Figure: standardized residuals versus the fitted values (response is logVol).]
![Page 41: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/41.jpg)
Normal probability plot using transformed X and Y values
Example 3
[Figure: normal probability plot of the standardized residuals (SRES5).]
N: 70   Average: -0.0028401   StDev: 1.00930
W-test for Normality: R: 0.9896   P-Value (approx): > 0.1000
![Page 42: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/42.jpg)
Transformation strategies
![Page 43: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/43.jpg)
Effects of transformations
• Transforming the Y values corrects the problems with the error terms – and may simultaneously help non-linearity.
• Transforming the X values can only correct non-linearity.
![Page 44: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/44.jpg)
Transformation strategies
• If form of the relationship between x and y is known, then it may be possible to find a linearizing transformation analytically.
• Fitting a regression model empirically generally requires trial and error – try different transformations to see which does best.
![Page 45: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/45.jpg)
Transformation strategies
Finding a linearizing transformation analytically
![Page 46: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/46.jpg)
Knowing functional relationship is of the power form
If the relationship between x and y is of the power form:
$$y = \alpha x^{\beta}$$
taking log of both sides transforms it into a linear form:
$$\log_e y = \log_e \alpha + \beta \log_e x$$
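The log-log fit from Example 3, logVol = -2.87179 + 2.56442 logDiam, is an instance of this. Writing the power form as y = alpha * x ** beta (the symbols alpha and beta are assumed here, since the slide's own symbols did not survive extraction), the intercept back-transforms to alpha and the slope is beta. The sketch below checks that both forms give the same prediction; the diameter of 10 inches and the function names are hypothetical, chosen only for illustration.

```python
import math

# Coefficients of the log-log fit reported for Example 3 (natural logs):
#   logVol = -2.87179 + 2.56442 * logDiam
b0 = -2.87179
b1 = 2.56442

# Back-transform to the power form: volume = alpha * diameter ** beta.
alpha = math.exp(b0)
beta = b1


def volume_power_form(diameter):
    """Prediction written directly in the power form."""
    return alpha * diameter ** beta


def volume_log_form(diameter):
    """Prediction made on the log scale, then exponentiated."""
    return math.exp(b0 + b1 * math.log(diameter))


d = 10.0  # hypothetical diameter in inches, for illustration only
print("alpha = %.4f, beta = %.5f" % (alpha, beta))
print("power form: %.2f cu. ft." % volume_power_form(d))
print("log form:   %.2f cu. ft." % volume_log_form(d))
```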
![Page 47: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/47.jpg)
Knowing functional relationship is of the exponential form
If the relationship between x and y is of exponential form:
$$y = \alpha e^{\beta x}$$
taking log of both sides transforms it into a linear form:
$$\log_e y = \log_e \alpha + \beta x$$
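A small numerical sketch of the exponential case: if y = alpha * exp(beta * x) holds exactly, then regressing log_e(y) on x recovers beta as the slope and log_e(alpha) as the intercept. The symbols alpha and beta, the parameter values, and the x grid below are all assumptions made for illustration, not data from the slides.

```python
import math

# Hypothetical parameters, chosen only for illustration.
alpha_true = 2.0
beta_true = 0.5

x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [alpha_true * math.exp(beta_true * xi) for xi in x]  # noise-free exponential curve

# Transform the response only, then fit a straight line by least squares.
ln_y = [math.log(yi) for yi in y]
n = len(x)
x_bar = sum(x) / n
ln_y_bar = sum(ln_y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (li - ln_y_bar) for xi, li in zip(x, ln_y))
slope = sxy / sxx
intercept = ln_y_bar - slope * x_bar

# Back-transform the intercept to recover alpha; the slope is beta.
alpha_hat = math.exp(intercept)
beta_hat = slope

print("alpha_hat = %.6f, beta_hat = %.6f" % (alpha_hat, beta_hat))
```

With real data the recovery is only approximate, and the error terms are assumed to behave well on the log scale rather than the original scale.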
![Page 48: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/48.jpg)
Transformation strategies
Finding a transformation by trial and error
![Page 49: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/49.jpg)
Family of power transformations
The most common transformation involves transforming the response by taking it to some power λ. That is:
$$y^{*} = y^{\lambda}$$
Most commonly, for interpretation reasons, λ is a number between -1 and 2, such as -1, -0.5, 0, 0.5, (1), 1.5, and 2.
When λ = 0, the transformation is taken to be the log transformation. That is:
$$y^{*} = \log_e y$$
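The family can be written as one small function. The sketch below is a minimal version, assuming positive response values; the name `power_transform` and the sample values are illustrative, not from the slides.

```python
import math


def power_transform(y, lam):
    """Return y raised to the power lam; lam = 0 is taken to mean log_e(y)."""
    if lam == 0:
        return math.log(y)
    return y ** lam


# Hypothetical positive responses, used only to show the effect of each lambda.
values = [1.0, 4.0, 9.0, 100.0]

for lam in (-1, -0.5, 0, 0.5, 1, 1.5, 2):
    transformed = [round(power_transform(v, lam), 4) for v in values]
    print("lambda = %4s: %s" % (lam, transformed))
```

Trying each λ in turn and re-checking the residual plots is the trial-and-error strategy described on the slides.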
![Page 50: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/50.jpg)
Effect of loge transformation
[Figure: natural log function, f(x) versus x, for x from 0 to 1000.]
![Page 51: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/51.jpg)
Effect of loge transformation
[Figure: natural log function, f(x) versus x, for x from 0 to 5.]
![Page 52: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/52.jpg)
Some guidelines for specifying λ
• To make smaller values more spread out, use a smaller λ.
• To make larger values more spread out, use a larger λ.
![Page 53: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/53.jpg)
Possible transformations
[Figure: curved scatterplot shape of y versus x with candidate transformations. Recoverable labels: x^2, x^3, log y, 1/y.]
![Page 54: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/54.jpg)
Possible transformations
[Figure: curved scatterplot shape of y versus x with candidate transformations. Recoverable labels: y^2, y^3, log x, 1/x.]
![Page 55: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/55.jpg)
Possible transformations
[Figure: curved scatterplot shape of y versus x with candidate transformations. Recoverable labels: log y, 1/y, log x, 1/x.]
![Page 56: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/56.jpg)
Possible transformations
[Figure: curved scatterplot shape of y versus x with candidate transformations. Recoverable labels: x^2, x^3, y^2, y^3.]
![Page 57: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/57.jpg)
Transformation strategies
Variance stabilizing transformations
![Page 58: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/58.jpg)
Common variance stabilizing transformations
If the response is a Poisson count, so that the variance is proportional to the mean, use the square root transformation:
$$y^{*} = \sqrt{y} = y^{1/2}$$
If the response is a binomial proportion, use the arcsine square root transformation:
$$\hat{p}^{*} = \sin^{-1}\sqrt{\hat{p}}$$
![Page 59: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/59.jpg)
Common variance stabilizing transformations
If the variance is proportional to the mean squared, use the natural log transformation:
$$y^{*} = \log_e y$$
If the variance is proportional to the mean to the fourth power, use the reciprocal transformation:
$$y^{*} = \frac{1}{y}$$
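The four variance stabilizing transformations named on these two slides (square root, arcsine square root, natural log, reciprocal) can be collected into one helper. This is a sketch only: the function name `stabilize` and the `kind` labels are hypothetical, and the caller is assumed to have already decided which variance pattern applies.

```python
import math


def stabilize(y, kind):
    """Apply a variance stabilizing transformation to a single value.

    kind (hypothetical labels):
      "poisson"          variance proportional to the mean     -> sqrt(y)
      "binomial"         y is a proportion in [0, 1]           -> arcsin(sqrt(y))
      "var_mean_squared" variance proportional to mean squared -> log_e(y)
      "var_mean_fourth"  variance proportional to mean^4       -> 1 / y
    """
    if kind == "poisson":
        return math.sqrt(y)
    if kind == "binomial":
        return math.asin(math.sqrt(y))
    if kind == "var_mean_squared":
        return math.log(y)
    if kind == "var_mean_fourth":
        return 1.0 / y
    raise ValueError("unknown kind: %r" % kind)


print(stabilize(9, "poisson"))             # square root of a count
print(stabilize(0.25, "binomial"))         # arcsine square root of a proportion
print(stabilize(1.0, "var_mean_squared"))  # natural log
print(stabilize(4.0, "var_mean_fourth"))   # reciprocal
```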
![Page 60: Fixing problems with the model Transforming the data so that the simple linear regression model is okay for the transformed data](https://reader035.vdocuments.net/reader035/viewer/2022062715/56649da05503460f94a8b276/html5/thumbnails/60.jpg)
Transforming data in Minitab
• Select Calc >> Calculator …
• In the box labeled “Store result in variable,” tell Minitab in which column (variable) you want the transformed data stored.
• Type (input) the expression for the desired transformation in the box labeled Expression. (Use the available functions.)
• Select OK. The data will appear in the column of the worksheet that you specified.