topic 17: interaction models. interaction models with several explanatory variables, we need to...
TRANSCRIPT
Topic 17: Interaction Models
Interaction Models
• With several explanatory variables, we need to consider the possibility that the effect of one variable depends on the value of another variable
• Special cases
–One binary variable (Y/N) and one continuous variable
–Two continuous variables
One binary variable and one continuous variable
• X1 takes values 0 and 1 corresponding to two different groups
• X2 is a continuous variable• Model: Y = β0 + β1X1 + β2X2 + β3X1X2 + e
• When X1 = 0 : Y = β0 + β2X2 + e• When X1 = 1 : Y = (β0 + β1)+ (β2 + β3) X2 + e
One binary and one continuous
• β0 is the intercept for Group 1
• β0+ β1 is the intercept for Group 2
• Similar relationship for slopes (β2 and β3)
• H0: β1 = β3 = 0 tests the hypothesis that the
regression lines are the same
• H0: β1 = 0 tests equal intercepts
• H0: β3 = 0 tests equal slopes
KNNL Example p316
• Y is number of months for an insurance company to adopt an innovation
• X1 is the size of the firm (a continuous variable
• X2 is the type of firm (a qualitative or categorical variable)
The question
• X2 takes the value 0 if it is a mutual fund firm and 1 if it is a stock fund firm
• We ask whether or not stock firms adopt the innovation slower or faster than mutual firms
• We ask the question across all firms, regardless of size
Plot the data
symbol1 v=M i=sm70 c=black l=1;symbol2 v=S i=sm70 c=black l=3;
proc sort data=a1; by stock size;proc gplot data=a1; plot months*size=stock;run;
months
0
10
20
30
40
size
0 100 200 300 400
stock 0 1M M M S S S
M
M
MM
M
M
M
M
M
M
S
S SS
S S
S S
SS
Two symbols on plot
Interaction effects
• Interaction expresses the idea that the effect of one explanatory variable on the response depends on another explanatory variable
• In the KNNL example, this would mean that the slope of the line depends on the type of firm
Are both lines the same?
• From scatterplot, looks like different intercepts but can use the test statement for formal assessment
Data a1; set a1; sizestock=size*stock;Proc reg data=a1; model months=size stock sizestock; test stock, sizestock;run;
Output
Test 1 Results for Dependent Variable months
Source DFMean
Square F Value Pr > FNumerator 2 158.12584 14.34 0.0003
Denominator 16 11.02381
Reject H0.There is a difference in the linear relationship across groups
Output•How are they different?
Parameter Estimates
Variable DFParameter
EstimateStandard
Error t Value Pr > |t|Intercept 1 33.83837 2.44065 13.86 <.0001
size 1 -0.10153 0.01305 -7.78 <.0001
stock 1 8.13125 3.65405 2.23 0.0408
sizestock 1 -0.00041714 0.01833 -0.02 0.9821
1. No difference in slopes assuming different intercepts
2. Potentially different intercepts assuming different slopes
Two parallel lines?
proc reg data=a1; model months=size stock;run;
OutputAnalysis of Variance
Source DFSum of
SquaresMean
Square F Value Pr > FModel 2 1504.4133 752.2066 72.50 <.0001
Error 17 176.38667 10.37569
Corrected Total 19 1680.8000
Root MSE 3.22113 R-Square 0.8951
Dependent Mean 19.40000 Adj R-Sq 0.8827
Coeff Var 16.60377
Output
Int for stock firms is 33.87+8.05 = 41.92Common slope is –0.10
Parameter Estimates
Variable DFParameter
EstimateStandard
Error t Value Pr > |t|Intercept 1 33.87407 1.81386 18.68 <.0001
size 1 -0.10174 0.00889 -11.44 <.0001
stock 1 8.05547 1.45911 5.52 <.0001
Plot the two fitted lines
symbol1 v=M i=rl c=black l=1;symbol2 v=S i=rl c=black l=3;
proc gplot data=a1; plot months*size=stock;run;
months
0
10
20
30
40
size
0 100 200 300 400
stock 0 1M M M S S S
M
M
MM
M
M
M
M
M
M
S
S S
The plot
Two continuous variables
• Y = β0 + β1X1 + β2X2 + β3X1X2 + e
• Can be rewritten as follows
• Y = β0 + (β1 + β3X2)X1 + β2X2 + e
• Y = β0 + β1X1 + (β2 + β3X1) X2 + e
• The coefficient of one explanatory variable depends on the value of the other explanatory variable
Last slide
• We went over KNNL 8.2 – 8.7
• We used programs Topic17.sas to generate the output for today