Lecture 18
Miscellaneous Topics in Multiple Regression
STAT 512
Spring 2011
Background Reading
KNNL: 8.1-8.5,10.1, 11, 12
Topic Overview
• Polynomial Models (8.1)
• Interaction Models (8.2)
• Qualitative Predictors (8.3-8.5)
• Added-Variable Plots (10.1)
• More Complex Remedial Measures (11)
• Correlated (non-independent) Data (12)
Polynomial Models
• Can include 2nd, 3rd, 4th, etc. order terms in
models.
• If more than one predictor, can also include
an interaction term.
• Useful for prediction; parameter
interpretation becomes more difficult.
• Usually anything more than 2nd or 3rd order
should be used with caution.
Polynomial Models (2)
• Variables X and X² will be correlated
• Intercorrelation can be reduced by centering
the variables first (subtracting the mean)
• Instead of X, use the centered variable x = X − X̄.
• Compute squares and interactions using the
transformed variables.
• The model is still just as useful for prediction
as before.
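A quick numerical sketch (Python with simulated data, not part of the lecture, which uses SAS) of how much centering reduces the intercorrelation:

```python
import numpy as np

# Simulated illustration: a positive predictor X is nearly collinear
# with its own square, but the centered version is not.
rng = np.random.default_rng(0)
X = rng.uniform(10, 20, size=200)

r_raw = np.corrcoef(X, X**2)[0, 1]        # correlation of X with X^2
x = X - X.mean()                          # centered: x = X - Xbar
r_centered = np.corrcoef(x, x**2)[0, 1]   # correlation of x with x^2

print(round(r_raw, 3), round(abs(r_centered), 3))
```

The raw correlation is close to 1, while the centered one is near 0 because the predictor here is roughly symmetric about its mean.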
Polynomial Models in SAS
• PROC STANDARD can be used to obtain
centered variables.
• Obtain higher-order terms by creating new
predictor variables from the centered variables
in a DATA step.
• Do the regression as usual, simply treating
terms like X1², X1X2, etc. as additional
predictors.
Model Building (Polynomial)
• Model building procedures that we’ve
discussed can be used.
• Generally include all lower-order terms in
the model (even if they test non-significant).
For example, if including X1³, you would also
include the X1² and X1 terms.
Polynomial Models (Drawbacks)
• Intercorrelation / Multicollinearity can still
be a major problem, despite centering.
• Uses additional degrees of freedom.
• Estimates may lack interpretative value.
Interaction Models
• Products of predictor variables incorporated
into the model in the same way as squares,
cubes, etc.
• Interaction means that the effect of one
predictor depends on the value of the other.
• The model Y = β0 + β1X1 + β2X2 + β3X1X2 + ε can be
rewritten as follows:
Y = β0 + (β1 + β3X2)X1 + β2X2 + ε
Y = β0 + β1X1 + (β2 + β3X1)X2 + ε
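As a tiny worked example with hypothetical coefficient values (not from the text), the first rewritten form shows the X1 slope changing with X2:

```python
# Hypothetical coefficients for Y = b0 + b1*X1 + b2*X2 + b3*X1*X2.
b0, b1, b2, b3 = 2.0, 1.0, 0.5, 0.8

def slope_in_x1(x2):
    # From the rewrite Y = b0 + (b1 + b3*X2)*X1 + b2*X2 + error,
    # the coefficient on X1 is b1 + b3*X2.
    return b1 + b3 * x2

print(slope_in_x1(0.0))   # 1.0: at X2 = 0 the X1 slope is just b1
print(slope_in_x1(2.0))   # 2.6: at X2 = 2 the X1 slope is b1 + 2*b3
```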
Interaction Models (2)
• Additive Effects – no interaction (lines will
be parallel)
• Reinforcing Effect – combined effect is
greater than simply adding them together
• Interference Effect – Higher values of one
variable suppress the effect of the other
Additive Model
[figure not included in transcript]
Reinforcing Effect
[figure not included in transcript]
Interference Effect
[figure not included in transcript]
Qualitative Predictors
• X1 takes values 0 and 1 corresponding to two
different groups or categories
• X2 is a continuous variable.
• Use this 0/1 coding along with an interaction
term in the following model:
Y = β0 + β1X1 + β2X2 + β3X1X2 + ε
• This is a convenient way of writing down two
separate SLR models for the two categories.
Qualitative Predictors (2)
• When X1 = 0, the model becomes
Y = β0 + β1(0) + β2X2 + β3(0)X2 + ε
  = β0 + β2X2 + ε
This is a SLR model for Y as a function of X2
with intercept β0 and slope β2.
• When X1 = 1, the model becomes
Y = β0 + β1(1) + β2X2 + β3(1)X2 + ε
  = (β0 + β1) + (β2 + β3)X2 + ε
This is a SLR model for Y as a function of X2
with intercept (β0 + β1) and slope (β2 + β3).
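A Python sketch with simulated data (not the insurance example) confirms that the single combined fit reproduces the two separate SLR fits exactly:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
x1 = (rng.random(n) < 0.5).astype(float)   # 0/1 group indicator
x2 = rng.uniform(0, 10, n)                 # continuous predictor
y = 3 + 2*x1 - 0.5*x2 + 0.3*x1*x2 + rng.normal(0, 0.5, n)

# One combined fit: Y = b0 + b1*X1 + b2*X2 + b3*X1*X2
X = np.column_stack([np.ones(n), x1, x2, x1 * x2])
b = np.linalg.lstsq(X, y, rcond=None)[0]

def slr(x, y):
    # Simple linear regression: returns (intercept, slope)
    A = np.column_stack([np.ones(len(x)), x])
    return np.linalg.lstsq(A, y, rcond=None)[0]

a0 = slr(x2[x1 == 0], y[x1 == 0])          # separate fit, group X1 = 0
a1 = slr(x2[x1 == 1], y[x1 == 1])          # separate fit, group X1 = 1

# Group 0 line has intercept b0, slope b2; group 1 has b0+b1, b2+b3.
print(np.allclose(a0, [b[0], b[2]]),
      np.allclose(a1, [b[0] + b[1], b[2] + b[3]]))
```

This equivalence is what makes the combined model attractive: identical fitted lines, but a single pooled variance estimate with more error degrees of freedom.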
Qualitative Predictors (3)
• We could run two separate SLRs, one for X1 = 0
and one for X1 = 1.
• But by modeling in this way, we use all the
data to get the variance estimate, increasing our
error degrees of freedom.
• Some useful tests in this situation include:
- H0: β1 = β3 = 0 is the hypothesis that the
regression lines are the same.
- H0: β1 = 0 is the hypothesis that the two
intercepts are equal.
- H0: β3 = 0 is the hypothesis that the two
slopes are equal.
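The "lines are the same" test is a general linear F-test comparing full and reduced models; a Python sketch on simulated data (not the insurance example, and SSE computed by hand rather than by SAS):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 40
x1 = np.repeat([0.0, 1.0], n // 2)          # 0/1 group indicator
x2 = rng.uniform(0, 10, n)
y = 5 + 4*x1 - 1.0*x2 + rng.normal(0, 1, n)  # true intercepts differ

def sse(X, y):
    # Error sum of squares from a least-squares fit
    resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return resid @ resid

X_full = np.column_stack([np.ones(n), x1, x2, x1 * x2])  # two lines
X_red = np.column_stack([np.ones(n), x2])                # one line
sse_f, sse_r = sse(X_full, y), sse(X_red, y)

# H0: b1 = b3 = 0 drops 2 parameters; full model has n - 4 error df.
F = ((sse_r - sse_f) / 2) / (sse_f / (n - 4))
print(F)   # a large F rejects H0: the two lines differ
```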
Example
• Y is the number of months it takes for an
insurance company to adopt an innovation.
• X1 is the type of firm (a qualitative or
categorical variable): X1 is 0 if it is a mutual
company and 1 if it is a stock company.
• X2 is the size of the firm (a continuous
variable)
• SAS code: insurance.sas
Example (2)
[figure not included in transcript]
Example (3)
*Create the interaction variable;
data insurance;
   set insurance;
   sizestock = size*stock;
run;
*Run the model and test whether the two lines are the same or different;
proc reg data=insurance;
   model months = stock size sizestock;
   sameline: test stock, sizestock;
run;
Analysis of Variance
Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              3       1504.41904     501.47301     45.49   <.0001
Error             16        176.38096      11.02381
Corrected Total   19       1680.80000

R-Square   0.8951
Adj R-Sq   0.8754
Example (4)
Test sameline Results for Dependent Variable months
Source        DF   Mean Square   F Value   Pr > F
Numerator      2     158.12584     14.34   0.0003
Denominator   16      11.02381
- The two lines are not the same.

Parameter Estimates
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept    1             33.83837          2.44065     13.86     <.0001
stock        1              8.13125          3.65405      2.23     0.0408
size         1             -0.10153          0.01305     -7.78     <.0001
sizestock    1          -0.00041714          0.01833     -0.02     0.9821
- The slopes are not significantly different,
but the intercepts are.
Example (5)
proc reg data=insurance;
   model months = stock size;
run;
Parameter Estimates
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept    1             33.87407          1.81386     18.68     <.0001
stock        1              8.05547          1.45911      5.52     <.0001
size         1             -0.10174          0.00889    -11.44     <.0001

For mutual firms (X1 = 0), the estimated regression line is:
Ŷ = 33.874 − 0.102 X2
For stock firms (X1 = 1), the estimated regression line is:
Ŷ = (33.874 + 8.055) − 0.102 X2 = 41.93 − 0.102 X2
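As a quick arithmetic check, we can plug a hypothetical firm size into the two estimated lines (the size value 100 is an assumed value, not from the lecture):

```python
# Hypothetical check: plug size = 100 into the two estimated lines.
months_mutual = 33.874 - 0.102 * 100             # mutual firms (X1 = 0)
months_stock = (33.874 + 8.055) - 0.102 * 100    # stock firms (X1 = 1)
print(round(months_mutual, 3), round(months_stock, 3))   # 23.674 31.729
```

At any given size, stock firms are estimated to take about 8 months longer to adopt, consistent with the significant stock coefficient and the non-significant interaction.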
Added Variable Plots
• Also called partial regression plots.
• Idea: Help you figure out the net effect of a
given predictor on the response, given
other variables in the model (related to
partial correlations).
• One plot for each predictor.
Added Variable Plots (2)
• To construct the plot, run two regressions:
o Regress Y on X1. The residuals represent the
variation in Y left to be explained after X1
is in the model.
o Suppose you want to consider adding X2.
X2 may be correlated with X1; remove this
variation by regressing X2 on X1. These
residuals represent the part of X2 not
explained by X1.
Added Variable Plots (3)
• Plot the residuals against each other; a
pattern suggests that X2 is an important
variable (after accounting for X1, the leftover
variation in Y is explained by what is left
of X2).
• Additionally, the shape of the pattern
suggests the type of relationship.
Added Variable Plots (SAS)
proc reg;
model y = x1;
output out=diag1 r=resid_y;
proc reg;
model x2 = x1;
output out=diag2 r=resid_x2;
data diag; merge diag1 diag2;
proc gplot data=diag;
plot resid_y = resid_x2;
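The same construction can be sketched in Python with simulated data (the names resid_y and resid_x2 mirror the SAS code above). A useful identity worth checking: the slope of the residual-on-residual regression equals the X2 coefficient from the full model:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)          # X2 correlated with X1
y = 1 + 2*x1 + 3*x2 + rng.normal(size=n)

def resid(v, X):
    # Residuals from regressing v on the columns of X
    return v - X @ np.linalg.lstsq(X, v, rcond=None)[0]

Z = np.column_stack([np.ones(n), x1])
resid_y = resid(y, Z)                        # variation in Y after X1
resid_x2 = resid(x2, Z)                      # part of X2 not explained by X1

# Slope of the added-variable plot (regression through the origin)
avp_slope = (resid_x2 @ resid_y) / (resid_x2 @ resid_x2)
full = np.linalg.lstsq(np.column_stack([np.ones(n), x1, x2]), y,
                       rcond=None)[0]
print(avp_slope, full[2])                    # the two agree
```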
Remedial Measures
• Needed for violations of assumptions (we’ve
discussed transformations)
• Needed when there are influential
observations
• We’ll mention several ideas, but not focus
on any of the details. See Chapter 11 for
more details.
Weighted Least Squares
• Remedial measure for non-constant error
variances.
• Regular regression estimates are still
unbiased, but no longer minimum variance
since observations of Y now have different
variances.
• Observations with larger variance get less
“weight” in the model, while more precise
observations are given more weight
Weighted Least Squares (2)
• The analysis begins by estimating weights;
some good procedures for doing this are
described in the text.
• Formulas are more complex (see, but
perhaps don't try too hard to understand,
p. 430).
• Once weights are obtained, you can use a
WEIGHT statement in the regression
procedure in SAS.
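A sketch of the weighted fit itself, in Python rather than SAS, with simulated data and weights assumed known (a real analysis must estimate them first):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200
x = rng.uniform(1, 10, n)
sd = 0.5 * x                                  # error sd grows with x
y = 2 + 3*x + rng.normal(0, sd)

X = np.column_stack([np.ones(n), x])
w = 1 / sd**2                                 # weight = inverse variance
# Weighted normal equations: (X'WX) b = X'Wy
WX = X * w[:, None]
b_wls = np.linalg.solve(X.T @ WX, X.T @ (w * y))
b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
print(b_wls, b_ols)   # both near (2, 3); WLS is the more precise one
```

Noisier (large-variance) observations get small weight, matching the intuition on the previous slide.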
Robust Regression
• Dampens the influence of outlying cases
• Several different types available, all have
slightly different pros and cons
• IRLS (iteratively re-weighted least squares)
is a common choice.
o Employ weights that vary inversely with
size of residual
o Might read the case study in section 11.3
if you are interested.
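A rough Python sketch of IRLS with Huber-type weights (one common choice; implementations differ in details). The data are simulated with a few planted outliers:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100
x = rng.uniform(0, 10, n)
y = 1 + 2*x + rng.normal(0, 1, n)
y[:5] += 30                                   # a few gross outliers

X = np.column_stack([np.ones(n), x])
b_ols = np.linalg.lstsq(X, y, rcond=None)[0]  # ordinary LS start
b = b_ols.copy()
for _ in range(20):
    r = y - X @ b
    s = np.median(np.abs(r)) / 0.6745         # robust scale (MAD)
    # Huber weights: full weight for small residuals, shrinking
    # inversely with |residual| beyond 1.345*s.
    w = np.minimum(1.0, 1.345 * s / np.abs(r).clip(min=1e-8))
    WX = X * w[:, None]
    b = np.linalg.solve(X.T @ WX, X.T @ (w * y))
print(b)   # close to the true (1, 2) despite the outliers
```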
Bootstrap
• Useful for evaluating precision in
nonstandard situations
• Example: To satisfy the assumptions of a
linear regression model for the physicians
data, we needed to do a log transformation.
o Everything was done on the log scale; if
we want to develop a prediction interval
for a county, we might "exponentiate" the
endpoints of the prediction interval on the
log scale.
Bootstrap (2)
• Example (cont)
o When we exponentiate, we have left the
scale where the confidence intervals are
valid (i.e., the assumptions no longer
apply).
o An alternative is to obtain bootstrap CIs.
• Bootstrapping is further described, with
some examples, in section 11.5.
Essence of Bootstrap
• From the original sample, take many (e.g.,
1000) bootstrap samples (there are several
different procedures for doing this).
• Estimate the value of the parameter(s) of
interest for each sample. These form an
empirical distribution.
• Use the empirical distribution to obtain a
confidence interval
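The steps above can be sketched in Python for a regression slope (simulated data; case resampling is one of the several resampling schemes, and the percentile method is one way to form the interval):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 80
x = rng.uniform(0, 10, n)
y = 1 + 2*x + rng.normal(0, 2, n)
X = np.column_stack([np.ones(n), x])

def slope(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

bhat = slope(X, y)                            # estimate on original sample
boot = np.empty(1000)
for i in range(1000):
    idx = rng.integers(0, n, n)               # resample cases with replacement
    boot[i] = slope(X[idx], y[idx])           # re-estimate on each sample

# Empirical distribution -> percentile-method 95% CI
lo, hi = np.percentile(boot, [2.5, 97.5])
print(bhat, (lo, hi))
```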
Autocorrelation / Time Series
• Error terms correlated over time (non-
independent).
• Time series is covered in STAT 520; we won't
discuss this topic further in STAT 512.
Upcoming in Lecture 19...
- Introduction to ANOVA and Design of
Experiments (15.1-15.3)