Lecture 18
Miscellaneous Topics in Multiple Regression
STAT 512
Spring 2011
Background Reading
KNNL: 8.1-8.5,10.1, 11, 12
Topic Overview
• Polynomial Models (8.1)
• Interaction Models (8.2)
• Qualitative Predictors (8.3-8.5)
• Added-Variable Plots (10.1)
• More Complex Remedial Measures (11)
• Correlated (non-independent) Data (12)
Polynomial Models
• Can include 2nd, 3rd, 4th, etc. order terms in
models.
• If more than one predictor, can also include
an interaction term.
• Useful for prediction; parameter
interpretation becomes more difficult.
• Usually anything more than 2nd or 3rd order
should be used with caution.
Polynomial Models (2)
• Variables X and X² will be correlated
• Intercorrelation can be reduced by centering
the variables first (subtracting the mean)
• Instead of X, use the centered variable x = X − X̄.
• Compute squares and interactions using the
transformed variables.
• The model is still just as useful for prediction
as before.
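A quick numerical sketch (Python with simulated data, not part of the lecture, which uses SAS) of how much centering reduces the intercorrelation:

```python
import numpy as np

# Simulated illustration: a positive predictor X is nearly collinear
# with its own square, but the centered version is not.
rng = np.random.default_rng(0)
X = rng.uniform(10, 20, size=200)

r_raw = np.corrcoef(X, X**2)[0, 1]        # correlation of X with X^2
x = X - X.mean()                          # centered: x = X - Xbar
r_centered = np.corrcoef(x, x**2)[0, 1]   # correlation of x with x^2

print(round(r_raw, 3), round(abs(r_centered), 3))
```

The raw correlation is close to 1, while the centered one is near 0 because the predictor here is roughly symmetric about its mean.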
Polynomial Models in SAS
• PROC STANDARD can be used to obtain
centered variables.
• Obtain higher-order terms by creating new
predictor variables from the centered variables
in a DATA step.
• Do the regression as usual, simply treating
terms like X1², X1X2, etc. as additional
predictors.
Model Building (Polynomial)
• Model building procedures that we’ve
discussed can be used.
• Generally include all lower-order terms in
the model (even if they test non-significant).
For example, if including X1³, you would also
include the X1² and X1 terms.
Polynomial Models (Drawbacks)
• Intercorrelation / Multicollinearity can still
be a major problem, despite centering.
• Uses additional degrees of freedom.
• Estimates may lack interpretative value.
Interaction Models
• Products of predictor variables incorporated
into the model in the same way as squares,
cubes, etc.
• Interaction means that the effect of one
predictor depends on the value of the other.
• The model Y = β0 + β1X1 + β2X2 + β3X1X2 + ε can be
rewritten as follows:
Y = β0 + (β1 + β3X2)X1 + β2X2 + ε
Y = β0 + β1X1 + (β2 + β3X1)X2 + ε
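As a tiny worked example with hypothetical coefficient values (not from the text), the first rewritten form shows the X1 slope changing with X2:

```python
# Hypothetical coefficients for Y = b0 + b1*X1 + b2*X2 + b3*X1*X2.
b0, b1, b2, b3 = 2.0, 1.0, 0.5, 0.8

def slope_in_x1(x2):
    # From the rewrite Y = b0 + (b1 + b3*X2)*X1 + b2*X2 + error,
    # the coefficient on X1 is b1 + b3*X2.
    return b1 + b3 * x2

print(slope_in_x1(0.0))   # 1.0: at X2 = 0 the X1 slope is just b1
print(slope_in_x1(2.0))   # 2.6: at X2 = 2 the X1 slope is b1 + 2*b3
```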
Interaction Models (2)
• Additive Effects – no interaction (lines will
be parallel)
• Reinforcing Effect – combined effect is
greater than simply adding them together
• Interference Effect – Higher values of one
variable suppress the effect of the other
Additive Model
[figure not included in transcript]
Reinforcing Effect
[figure not included in transcript]
Interference Effect
[figure not included in transcript]
Qualitative Predictors
• X1 takes values 0 and 1 corresponding to two
different groups or categories
• X2 is a continuous variable.
• Use this 0/1 coding along with an interaction
term in the following model:
Y = β0 + β1X1 + β2X2 + β3X1X2 + ε
• This is a convenient way of writing down two
separate SLR models for the two categories.
Qualitative Predictors (2)
• When X1 = 0, the model becomes
Y = β0 + β1(0) + β2X2 + β3(0)X2 + ε
  = β0 + β2X2 + ε
This is a SLR model for Y as a function of X2
with intercept β0 and slope β2.
• When X1 = 1, the model becomes
Y = β0 + β1(1) + β2X2 + β3(1)X2 + ε
  = (β0 + β1) + (β2 + β3)X2 + ε
This is a SLR model for Y as a function of X2
with intercept (β0 + β1) and slope (β2 + β3).
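A Python sketch with simulated data (not the insurance example) confirms that the single combined fit reproduces the two separate SLR fits exactly:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
x1 = (rng.random(n) < 0.5).astype(float)   # 0/1 group indicator
x2 = rng.uniform(0, 10, n)                 # continuous predictor
y = 3 + 2*x1 - 0.5*x2 + 0.3*x1*x2 + rng.normal(0, 0.5, n)

# One combined fit: Y = b0 + b1*X1 + b2*X2 + b3*X1*X2
X = np.column_stack([np.ones(n), x1, x2, x1 * x2])
b = np.linalg.lstsq(X, y, rcond=None)[0]

def slr(x, y):
    # Simple linear regression: returns (intercept, slope)
    A = np.column_stack([np.ones(len(x)), x])
    return np.linalg.lstsq(A, y, rcond=None)[0]

a0 = slr(x2[x1 == 0], y[x1 == 0])          # separate fit, group X1 = 0
a1 = slr(x2[x1 == 1], y[x1 == 1])          # separate fit, group X1 = 1

# Group 0 line has intercept b0, slope b2; group 1 has b0+b1, b2+b3.
print(np.allclose(a0, [b[0], b[2]]),
      np.allclose(a1, [b[0] + b[1], b[2] + b[3]]))
```

This equivalence is what makes the combined model attractive: identical fitted lines, but a single pooled variance estimate with more error degrees of freedom.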
Qualitative Predictors (3)
• We could run two separate SLRs, one for X1 = 0
and one for X1 = 1.
• But by modeling in this way, we use all the
data to get the variance estimate, increasing our
error degrees of freedom.
• Some useful tests in this situation include:
- H0: β1 = β3 = 0 is the hypothesis that the
regression lines are the same.
- H0: β1 = 0 is the hypothesis that the two
intercepts are equal.
- H0: β3 = 0 is the hypothesis that the two
slopes are equal.
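The "lines are the same" test is a general linear F-test comparing full and reduced models; a Python sketch on simulated data (not the insurance example, and SSE computed by hand rather than by SAS):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 40
x1 = np.repeat([0.0, 1.0], n // 2)          # 0/1 group indicator
x2 = rng.uniform(0, 10, n)
y = 5 + 4*x1 - 1.0*x2 + rng.normal(0, 1, n)  # true intercepts differ

def sse(X, y):
    # Error sum of squares from a least-squares fit
    resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return resid @ resid

X_full = np.column_stack([np.ones(n), x1, x2, x1 * x2])  # two lines
X_red = np.column_stack([np.ones(n), x2])                # one line
sse_f, sse_r = sse(X_full, y), sse(X_red, y)

# H0: b1 = b3 = 0 drops 2 parameters; full model has n - 4 error df.
F = ((sse_r - sse_f) / 2) / (sse_f / (n - 4))
print(F)   # a large F rejects H0: the two lines differ
```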
Example
• Y is the number of months it takes for an
insurance company to adopt an innovation.
• X1 is the type of firm (a qualitative or
categorical variable): X1 is 0 if it is a mutual
company and 1 if it is a stock company.
• X2 is the size of the firm (a continuous
variable)
• SAS code: insurance.sas
Example (2)
[figure not included in transcript]
Example (3)
*Create the interaction variable;
data insurance;
   set insurance;
   sizestock = size*stock;
run;
*Run the model and test whether the two lines are the same or different;
proc reg data=insurance;
   model months = stock size sizestock;
   sameline: test stock, sizestock;
run;
Analysis of Variance
Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              3       1504.41904     501.47301     45.49   <.0001
Error             16        176.38096      11.02381
Corrected Total   19       1680.80000

R-Square   0.8951
Adj R-Sq   0.8754
Example (4)
Test sameline Results for Dependent Variable months
Source        DF   Mean Square   F Value   Pr > F
Numerator      2     158.12584     14.34   0.0003
Denominator   16      11.02381
- The two lines are not the same.

Parameter Estimates
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept    1             33.83837          2.44065     13.86     <.0001
stock        1              8.13125          3.65405      2.23     0.0408
size         1             -0.10153          0.01305     -7.78     <.0001
sizestock    1          -0.00041714          0.01833     -0.02     0.9821
- The slopes are not significantly different,
but the intercepts are.
Example (5)
proc reg data=insurance;
   model months = stock size;
run;
Parameter Estimates
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept    1             33.87407          1.81386     18.68     <.0001
stock        1              8.05547          1.45911      5.52     <.0001
size         1             -0.10174          0.00889    -11.44     <.0001

For mutual firms (X1 = 0), the estimated regression line is:
Ŷ = 33.874 − 0.102 X2
For stock firms (X1 = 1), the estimated regression line is:
Ŷ = (33.874 + 8.055) − 0.102 X2 = 41.93 − 0.102 X2
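As a quick arithmetic check, we can plug a hypothetical firm size into the two estimated lines (the size value 100 is an assumed value, not from the lecture):

```python
# Hypothetical check: plug size = 100 into the two estimated lines.
months_mutual = 33.874 - 0.102 * 100             # mutual firms (X1 = 0)
months_stock = (33.874 + 8.055) - 0.102 * 100    # stock firms (X1 = 1)
print(round(months_mutual, 3), round(months_stock, 3))   # 23.674 31.729
```

At any given size, stock firms are estimated to take about 8 months longer to adopt, consistent with the significant stock coefficient and the non-significant interaction.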
Added Variable Plots
• Also called partial regression plots.
• Idea: Help you figure out the net effect of a
given predictor on the response, given
other variables in the model (related to
partial correlations).
• One plot for each predictor.
Added Variable Plots (2)
• To construct the plot, run two regressions:
o Regress Y on X1. The residuals represent the
variation in Y left to be explained after X1
is in the model.
o Suppose you want to consider adding X2.
X2 may be correlated with X1; remove this
variation by regressing X2 on X1. These
residuals represent the part of X2 not
explained by X1.
Added Variable Plots (3)
• Plot the residuals against each other; a
pattern suggests that X2 is an important
variable (after accounting for X1, the leftover
variation in Y is explained by what is left
of X2).
• Additionally, the shape of the pattern
suggests the type of relationship.
Added Variable Plots (SAS)
proc reg;
model y = x1;
output out=diag1 r=resid_y;
proc reg;
model x2 = x1;
output out=diag2 r=resid_x2;
data diag; merge diag1 diag2;
proc gplot data=diag;
plot resid_y = resid_x2;
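The same construction can be sketched in Python with simulated data (the names resid_y and resid_x2 mirror the SAS code above). A useful identity worth checking: the slope of the residual-on-residual regression equals the X2 coefficient from the full model:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)          # X2 correlated with X1
y = 1 + 2*x1 + 3*x2 + rng.normal(size=n)

def resid(v, X):
    # Residuals from regressing v on the columns of X
    return v - X @ np.linalg.lstsq(X, v, rcond=None)[0]

Z = np.column_stack([np.ones(n), x1])
resid_y = resid(y, Z)                        # variation in Y after X1
resid_x2 = resid(x2, Z)                      # part of X2 not explained by X1

# Slope of the added-variable plot (regression through the origin)
avp_slope = (resid_x2 @ resid_y) / (resid_x2 @ resid_x2)
full = np.linalg.lstsq(np.column_stack([np.ones(n), x1, x2]), y,
                       rcond=None)[0]
print(avp_slope, full[2])                    # the two agree
```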
Remedial Measures
• Needed for violations of assumptions (we’ve
discussed transformations)
• Needed when there are influential
observations
• We’ll mention several ideas, but not focus
on any of the details. See Chapter 11 for
more details.
Weighted Least Squares
• Remedial measure for non-constant error
variances.
• Regular regression estimates are still
unbiased, but no longer minimum variance
since observations of Y now have different
variances.
• Observations with larger variance get less
“weight” in the model, while more precise
observations are given more weight
Weighted Least Squares (2)
• The analysis begins by estimating weights;
some good procedures for doing this are
described in the text.
• Formulas are more complex (see, but
perhaps don't try too hard to understand,
p. 430).
• Once weights are obtained, you can use a
WEIGHT statement in the regression
procedure in SAS.
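A sketch of the weighted fit itself, in Python rather than SAS, with simulated data and weights assumed known (a real analysis must estimate them first):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200
x = rng.uniform(1, 10, n)
sd = 0.5 * x                                  # error sd grows with x
y = 2 + 3*x + rng.normal(0, sd)

X = np.column_stack([np.ones(n), x])
w = 1 / sd**2                                 # weight = inverse variance
# Weighted normal equations: (X'WX) b = X'Wy
WX = X * w[:, None]
b_wls = np.linalg.solve(X.T @ WX, X.T @ (w * y))
b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
print(b_wls, b_ols)   # both near (2, 3); WLS is the more precise one
```

Noisier (large-variance) observations get small weight, matching the intuition on the previous slide.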
Robust Regression
• Dampens the influence of outlying cases
• Several different types available, all have
slightly different pros and cons
• IRLS (iteratively re-weighted least squares)
is a common choice.
o Employ weights that vary inversely with
size of residual
o Might read the case study in section 11.3
if you are interested.
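A rough Python sketch of IRLS with Huber-type weights (one common choice; implementations differ in details). The data are simulated with a few planted outliers:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100
x = rng.uniform(0, 10, n)
y = 1 + 2*x + rng.normal(0, 1, n)
y[:5] += 30                                   # a few gross outliers

X = np.column_stack([np.ones(n), x])
b_ols = np.linalg.lstsq(X, y, rcond=None)[0]  # ordinary LS start
b = b_ols.copy()
for _ in range(20):
    r = y - X @ b
    s = np.median(np.abs(r)) / 0.6745         # robust scale (MAD)
    # Huber weights: full weight for small residuals, shrinking
    # inversely with |residual| beyond 1.345*s.
    w = np.minimum(1.0, 1.345 * s / np.abs(r).clip(min=1e-8))
    WX = X * w[:, None]
    b = np.linalg.solve(X.T @ WX, X.T @ (w * y))
print(b)   # close to the true (1, 2) despite the outliers
```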
Bootstrap
• Useful for evaluating precision in
nonstandard situations
• Example: To satisfy the assumptions of a
linear regression model for the physicians
data, we needed to do a log transformation.
o Everything was done on the log scale; if
we want to develop a prediction interval
for a county, we might "exponentiate" the
endpoints of the prediction interval on the
log scale.
Bootstrap (2)
• Example (cont)
o When we exponentiate, we have left the
scale where the confidence intervals are
valid (i.e., the assumptions no longer
apply).
o An alternative is to obtain bootstrap CIs.
• Bootstrapping is further described, with
some examples, in section 11.5.
Essence of Bootstrap
• From the original sample, take many (e.g.,
1000) bootstrap samples (there are several
different procedures for doing this).
• Estimate the value of the parameter(s) of
interest for each sample. These form an
empirical distribution.
• Use the empirical distribution to obtain a
confidence interval
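The steps above can be sketched in Python for a regression slope (simulated data; case resampling is one of the several resampling schemes, and the percentile method is one way to form the interval):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 80
x = rng.uniform(0, 10, n)
y = 1 + 2*x + rng.normal(0, 2, n)
X = np.column_stack([np.ones(n), x])

def slope(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

bhat = slope(X, y)                            # estimate on original sample
boot = np.empty(1000)
for i in range(1000):
    idx = rng.integers(0, n, n)               # resample cases with replacement
    boot[i] = slope(X[idx], y[idx])           # re-estimate on each sample

# Empirical distribution -> percentile-method 95% CI
lo, hi = np.percentile(boot, [2.5, 97.5])
print(bhat, (lo, hi))
```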
Autocorrelation / Time Series
• Error terms correlated over time (non-
independent).
• Time series is covered in STAT 520; we won't
discuss this topic further in STAT 512.
Upcoming in Lecture 19...
- Introduction to ANOVA and Design of
Experiments (15.1-15.3)