![Page 1: Analyzing Continuous and Categorical IVs Simultaneously](https://reader035.vdocuments.net/reader035/viewer/2022062314/5681389f550346895da05afd/html5/thumbnails/1.jpg)
Analyzing Continuous and Categorical IVs Simultaneously
Analysis of Covariance
![Page 2: Analyzing Continuous and Categorical IVs Simultaneously](https://reader035.vdocuments.net/reader035/viewer/2022062314/5681389f550346895da05afd/html5/thumbnails/2.jpg)
Skill Set
• When we model a single categorical and a single continuous variable, what do the main effects look like?
• What do the interactions look like?
• What is the meaning of each of the three b weights in such models?
• What is the sequence of tests used to analyze such data?
• Why should we avoid dichotomizing continuous IVs?
• What is the difference between ordinal and disordinal interactions?
• Why do we test for regions of significance of the difference between regression lines when we have an interaction?
![Page 3: Analyzing Continuous and Categorical IVs Simultaneously](https://reader035.vdocuments.net/reader035/viewer/2022062314/5681389f550346895da05afd/html5/thumbnails/3.jpg)
Mixed IVs
• Simplest example has 2 IVs
• 1 IV is categorical (e.g., Male, Female)
• 1 IV is continuous (e.g., MAT score)
– Keats:Shelley::Byron:Harley-Davidson
• DV is continuous, e.g., GPA in law school
• Have used ANOVA for categorical and regression for continuous
• Both are part of GLM. Many people call mixing categorical and continuous variables Analysis of Covariance (ANCOVA).
![Page 4: Analyzing Continuous and Categorical IVs Simultaneously](https://reader035.vdocuments.net/reader035/viewer/2022062314/5681389f550346895da05afd/html5/thumbnails/4.jpg)
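Example Data

| N | Sex | MAT | GPA | N | Sex | MAT | GPA |
|---|-----|-----|-----|---|-----|-----|-----|
| 1 | 1 | 51 | 3.7 | 21 | -1 | 47 | 2.72 |
| 2 | 1 | 53 | 3.28 | 22 | -1 | 53 | 3.62 |
| 3 | 1 | 52 | 3.79 | 23 | -1 | 51 | 3.45 |
| 4 | 1 | 50 | 3.23 | 24 | -1 | 51 | 3.78 |
| 5 | 1 | 54 | 3.58 | 25 | -1 | 46 | 3.14 |
| 6 | 1 | 50 | 3.34 | 26 | -1 | 48 | 2.89 |
| 7 | 1 | 52 | 3.05 | 27 | -1 | 51 | 3.36 |
| 8 | 1 | 56 | 3.78 | 28 | -1 | 51 | 3.05 |
| 9 | 1 | 49 | 3.23 | 29 | -1 | 53 | 3.65 |
| 10 | 1 | 52 | 3.16 | 30 | -1 | 55 | 3.61 |
| 11 | 1 | 50 | 3.46 | 31 | -1 | 50 | 3.45 |
| 12 | 1 | 51 | 3.47 | 32 | -1 | 51 | 3.43 |
| 13 | 1 | 49 | 3.73 | 33 | -1 | 52 | 3.56 |
| 14 | 1 | 54 | 3.63 | 34 | -1 | 50 | 3.14 |
| 15 | 1 | 48 | 3.09 | 35 | -1 | 49 | 3.19 |
| 16 | 1 | 48 | 3.18 | 36 | -1 | 49 | 3.32 |
| 17 | 1 | 53 | 3.58 | 37 | -1 | 50 | 3.06 |
| 18 | 1 | 53 | 3.26 | 38 | -1 | 52 | 3.47 |
| 19 | 1 | 48 | 3.11 | 39 | -1 | 44 | 3.07 |
| 20 | 1 | 47 | 3.22 | 40 | -1 | 52 | 3.66 |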
Note that there are 40 people here. Effect coding (1, -1) has been used to identify males vs. females. Doesn’t matter which is which (-1, 1) for coding purposes.
![Page 5: Analyzing Continuous and Categorical IVs Simultaneously](https://reader035.vdocuments.net/reader035/viewer/2022062314/5681389f550346895da05afd/html5/thumbnails/5.jpg)
Example Data Graph
[Figure: scatterplot titled "MAT & GPA", GPA (y-axis, 2.7 to 3.9) against MAT (x-axis, 42 to 57). Points are labeled 1 = female, 2 = male. Three lines are drawn: Male Regression Line, Female Regression Line, and Total Group Regression.]
What is the main story here?
![Page 6: Analyzing Continuous and Categorical IVs Simultaneously](https://reader035.vdocuments.net/reader035/viewer/2022062314/5681389f550346895da05afd/html5/thumbnails/6.jpg)
Group vs. Common Regression Coefficient
• Can have 1 common slope, $b_c$.
• Can have 2 group slopes, $b_F$ and $b_M$.
• Common slope is weighted average of group slopes:

$$b_c = \frac{\sum x_F^2\, b_F + \sum x_M^2\, b_M}{\sum x_F^2 + \sum x_M^2}$$

• Weight by SSX (here, MAT scores) for each group. Weight comes from variability in X and number of people in group.
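As a rough sketch of the weighting described on this slide, the Python below computes each group's slope and SSX and then combines them. The function names and the tiny data set are made up for illustration; they are not the GPA data.

```python
def ssx_and_slope(x, y):
    """Return (SSX, slope) for one group's simple regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    ssx = sum((xi - mean_x) ** 2 for xi in x)
    spxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return ssx, spxy / ssx


def common_slope(groups):
    """Average the group slopes, weighting each by that group's SSX."""
    parts = [ssx_and_slope(x, y) for x, y in groups]
    return sum(ssx * b for ssx, b in parts) / sum(ssx for ssx, _ in parts)


# Hypothetical toy data: group F has slope 1.0, group M has slope 0.5.
group_f = ([1, 2, 3], [1, 2, 3])
group_m = ([0, 2, 4, 6], [0, 1, 2, 3])
b_c = common_slope([group_f, group_m])
print(round(b_c, 4))
```

In this toy case group M has ten times the SSX of group F (20 vs. 2), so the common slope (12/22, about .55) sits much closer to .5 than to 1.0.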
![Page 7: Analyzing Continuous and Categorical IVs Simultaneously](https://reader035.vdocuments.net/reader035/viewer/2022062314/5681389f550346895da05afd/html5/thumbnails/7.jpg)
Telling the Story With Graphs (1)
[Figure: scatterplot titled "MAT & GPA", GPA (y-axis, 1 to 5) against MAT (x-axis, 42 to 57), with points labeled 1 and 2. Panel label: "No Story".]
Why is there nothing to tell here?
![Page 8: Analyzing Continuous and Categorical IVs Simultaneously](https://reader035.vdocuments.net/reader035/viewer/2022062314/5681389f550346895da05afd/html5/thumbnails/8.jpg)
Telling the Story (2)
[Figure: two scatterplots, each titled "MAT & GPA", GPA (y-axis, 2.7 to 3.9) against MAT (x-axis, 42 to 57), with points labeled 1 and 2.]
How does the graph tell us which variable is important?
![Page 9: Analyzing Continuous and Categorical IVs Simultaneously](https://reader035.vdocuments.net/reader035/viewer/2022062314/5681389f550346895da05afd/html5/thumbnails/9.jpg)
Telling the Story (3)
[Figure: two scatterplots titled "MAT & GPA", with points labeled 1 and 2. The first plots GPA (2.7 to 3.9) against MAT (42 to 57); the second plots GPA (1 to 5) against MAT (42 to 57) and is labeled "Interaction".]
What stories are being told in each of these graphs?
When the story is obvious, the graph tells it. But we need statistical tests when the results are not obvious, and when we want to persuade others (publish).
![Page 10: Analyzing Continuous and Categorical IVs Simultaneously](https://reader035.vdocuments.net/reader035/viewer/2022062314/5681389f550346895da05afd/html5/thumbnails/10.jpg)
Testing Sequence (1)
• Construct vectors X, G and XG.
– X is continuous
– G is group (categorical)
– XG is the product of the two; just multiply them.
• The intercept for the common group is a. Note the three b weights. The first (b1) tells the difference between groups. The second (b2) is the common slope. The third (b3) is the interaction (difference in group slopes). So there are two common terms (a, b2) and two difference terms (b1, b3).

$$Y' = a + b_1 G + b_2 X + b_3 GX$$
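The sketch below shows one way to build the X, G and XG vectors and estimate a and the three b weights by ordinary least squares. It uses only the Python standard library and solves the normal equations directly; a statistics package would normally do this step. The helper names and the small noise-free data set are assumptions made for illustration.

```python
def solve(matrix, rhs):
    """Solve matrix @ w = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_mixed_model(x, g, y):
    """Return [a, b1, b2, b3] for Y' = a + b1*G + b2*X + b3*G*X."""
    # Each row is (1, G, X, GX); GX is just the product of the two vectors.
    rows = [[1.0, gi, xi, gi * xi] for xi, gi in zip(x, g)]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical noise-free data built from known weights, so the fit should recover them.
x = [46, 48, 50, 52, 54, 47, 49, 51, 53, 55]
g = [1, 1, 1, 1, 1, -1, -1, -1, -1, -1]  # effect coding
y = [0.5 + 0.2 * gi + 0.06 * xi - 0.01 * gi * xi for xi, gi in zip(x, g)]
a, b1, b2, b3 = fit_mixed_model(x, g, y)
print(round(a, 3), round(b1, 3), round(b2, 3), round(b3, 3))
```

With real data the same call returns the estimates only; standard errors and t tests for each weight would still be needed for the testing sequence.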
![Page 11: Analyzing Continuous and Categorical IVs Simultaneously](https://reader035.vdocuments.net/reader035/viewer/2022062314/5681389f550346895da05afd/html5/thumbnails/11.jpg)
Testing Sequence (2)
• Estimate 3 slopes (and intercept).
• Examine R² for the model. If n.s., no story; quit. If R² is sig and large enough:
• Examine b3. If sig, there is an interaction; estimate separate regressions for the different groups.
• If b3 is not sig, re-estimate the model without XG. Examine b1 and b2.
![Page 12: Analyzing Continuous and Categorical IVs Simultaneously](https://reader035.vdocuments.net/reader035/viewer/2022062314/5681389f550346895da05afd/html5/thumbnails/12.jpg)
Testing Sequence (3)
The significance of the b weights tells the importance of the variables.

| Is b2 significant? (X, continuous) | b1 significant (G, categorical): Yes | b1 significant (G, categorical): No |
|---|---|---|
| Yes | Parallel slopes, different intercepts | Identical regressions |
| No | Mean diffs only; slopes are zero | Only possible with severe confounding; ambiguous story. |
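The testing sequence and the 2 x 2 table can be written as a small decision function. This is only a sketch: the function name and the True/False flags are assumptions, and the significance decisions themselves have to come from the fitted models.

```python
def tell_the_story(r2_sig, b3_sig, b1_sig=False, b2_sig=False):
    """Walk the testing sequence and return the conclusion as text.

    r2_sig and b3_sig come from the full model (G, X, GX).
    b1_sig and b2_sig come from the model re-estimated without GX.
    """
    if not r2_sig:
        return "No story; quit."
    if b3_sig:
        return "Interaction; estimate separate regressions for each group."
    if b1_sig and b2_sig:
        return "Parallel slopes, different intercepts."
    if b2_sig:
        return "Identical regressions."
    if b1_sig:
        return "Mean differences only; slopes are zero."
    return "Ambiguous story; only possible with severe confounding."


# Hypothetical pattern: model is significant, no interaction, only X matters.
print(tell_the_story(r2_sig=True, b3_sig=False, b1_sig=False, b2_sig=True))
```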
![Page 13: Analyzing Continuous and Categorical IVs Simultaneously](https://reader035.vdocuments.net/reader035/viewer/2022062314/5681389f550346895da05afd/html5/thumbnails/13.jpg)
Test Illustration (1)
R² = .44; p < .05
Y' = -.0389 + .75G + .0673X - .0146GX

| Term | Estimate | SE | t |
|---|---|---|---|
| G (b1; Sex) | .75 | .6856 | 1.0567 |
| X (b2; MAT) | .0673 | .0125 | 4.9786* |
| GX (b3; Int) | -.0146 | .0135 | -1.0831 |

Step 1. R² is large & sig.
Step 2. Slope for interaction (b3) is N.S. (low power test)
Step 3. Drop GX and re-estimate.
![Page 14: Analyzing Continuous and Categorical IVs Simultaneously](https://reader035.vdocuments.net/reader035/viewer/2022062314/5681389f550346895da05afd/html5/thumbnails/14.jpg)
Test Illustration (2)
R² = .42; p < .05
Y' = .1154 + .0045G + .0687X

| Term | Estimate | SE | t |
|---|---|---|---|
| G (b1; Sex) | .0045 | .0833 | .1365 |
| X (b2; MAT) | .0687 | .0135 | 5.0937* |
Step 4. Examine slopes (b weights). The only significant slope is for MAT. Conclusion: Identical regressions for Males and Females.
[Figure: scatterplot titled "MAT & GPA", GPA (y-axis, 2.7 to 3.9) against MAT (x-axis, 42 to 57), with points labeled 1 and 2 and the group regression lines.]
The slight difference in lines is due to sampling error.
![Page 15: Analyzing Continuous and Categorical IVs Simultaneously](https://reader035.vdocuments.net/reader035/viewer/2022062314/5681389f550346895da05afd/html5/thumbnails/15.jpg)
Second Illustration (1)
[Figure: scatterplot titled "MAT & GPA", GPA (y-axis, 2.0 to 4.0) against MAT (x-axis, 42 to 57). Points are labeled 1 = female, 2 = male.]
Suppose our data look like these. What story do you think they tell?
![Page 16: Analyzing Continuous and Categorical IVs Simultaneously](https://reader035.vdocuments.net/reader035/viewer/2022062314/5681389f550346895da05afd/html5/thumbnails/16.jpg)
Second Illustration (2)
R² = .72; p < .05
Y' = -11.54 + .8268G + .0643X - .0117GX

| Term | Estimate | SE | t | p |
|---|---|---|---|---|
| G (b1; Sex) | .8268 | .6627 | 1.2476 | .22 |
| X (b2; MAT) | .0643 | .0131 | 4.2947 | .0001 |
| GX (b3; Int) | -.0117 | .0131 | -.8945 | .3770 |

1. Is there any story to tell?
2. Is there an interaction?

R² = .72; p < .05
Y' = -.1805 + .2346G + .0655X

| Term | Estimate | SE | t | p |
|---|---|---|---|---|
| G (b1; Sex) | .2346 | .0320 | 7.34 | .0001 |
| X (b2; MAT) | .0655 | .0130 | 5.05 | .0001 |
What is the story? Does it agree with the graph?
![Page 17: Analyzing Continuous and Categorical IVs Simultaneously](https://reader035.vdocuments.net/reader035/viewer/2022062314/5681389f550346895da05afd/html5/thumbnails/17.jpg)
More Complex Designs
• With more complex designs, logic and sequence of tests remain the same.
• Categorical variables may have more than 2 levels
• We may have several continuous IVs
• If multiple categories, create multiple (G-1) interaction terms. If multiple Xs, create products for each. Test the terms as a block using hierarchical regression, where L is the larger model (with the block of terms) and S is the smaller model (without it):

$$F = \frac{(R_L^2 - R_S^2)/(df_L - df_S)}{(1 - R_L^2)/(N - df_L - 1)}$$
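A small sketch of the block test above, assuming df counts the predictors in each model. The function name is made up; the example call plugs in the rounded R² values from the first test illustration (.44 with GX, .42 without, N = 40), so the result is only approximate.

```python
def hierarchical_f(r2_large, r2_small, df_large, df_small, n):
    """F for the increment in R-squared when a block of terms is added."""
    numerator = (r2_large - r2_small) / (df_large - df_small)
    denominator = (1.0 - r2_large) / (n - df_large - 1)
    return numerator / denominator


# Full model has 3 predictors (G, X, GX); reduced model has 2 (G, X).
f_change = hierarchical_f(0.44, 0.42, df_large=3, df_small=2, n=40)
print(round(f_change, 2))
```

The F has (df_L - df_S) and (N - df_L - 1) degrees of freedom, here 1 and 36. With a single added term it is the square of that term's t, up to rounding in the reported R² values.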
![Page 18: Analyzing Continuous and Categorical IVs Simultaneously](https://reader035.vdocuments.net/reader035/viewer/2022062314/5681389f550346895da05afd/html5/thumbnails/18.jpg)
Categorizing Continuous IVs
• The median split (e.g., personality, stress, BEM sex-role scales).
• Don’t do this because:
– Loss of power and information: we treat IQs of 100 and 140 as identical.
– Loss of replication (median changes by sample)
– Arbitrary value of split: the “high stress” group may not be very stressed
• Some throw out middle people; this is also a problem because of range enhancement bias.
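The loss of information from a median split can be seen in a small simulation. This is a sketch with made-up parameters (an IQ-like predictor, a true standardized slope of .5, a fixed random seed); it compares the correlation of Y with the continuous X against the correlation of Y with the dichotomized X.

```python
import random


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def median_split(x):
    """Recode x as 1 (above the sample median) or 0 (at or below it)."""
    s = sorted(x)
    n = len(s)
    median = (s[n // 2 - 1] + s[n // 2]) / 2 if n % 2 == 0 else s[n // 2]
    return [1 if v > median else 0 for v in x]


def simulate(n=2000, slope=0.5, seed=1):
    """Return (r with continuous X, r with median-split X) for simulated data."""
    rng = random.Random(seed)
    x = [rng.gauss(100, 15) for _ in range(n)]  # IQ-like scores
    y = [slope * (v - 100) / 15 + rng.gauss(0, 1) for v in x]
    return pearson_r(x, y), pearson_r(median_split(x), y)


r_cont, r_split = simulate()
print(round(r_cont, 2), round(r_split, 2))
```

The dichotomized predictor correlates less strongly with Y than the continuous one, which is the loss of power the slide warns about.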
![Page 19: Analyzing Continuous and Categorical IVs Simultaneously](https://reader035.vdocuments.net/reader035/viewer/2022062314/5681389f550346895da05afd/html5/thumbnails/19.jpg)
Interactions
• Some research is aimed squarely at interactions, e.g., Aptitude Treatment Interaction (ATI) research. Learning styles, etc.
• Types of Interactions:
[Figure: three panels, each plotting Y (0 to 12) against X (0 to 10) with two regression lines. Panel titles: "No interaction", "Ordinal Interaction", "Disordinal Interaction".]
Implications?
![Page 20: Analyzing Continuous and Categorical IVs Simultaneously](https://reader035.vdocuments.net/reader035/viewer/2022062314/5681389f550346895da05afd/html5/thumbnails/20.jpg)
Regions of Significance
With a disordinal interaction, there must be a place where the treatments are equal (where the lines cross).

$$\text{Point of intersection}(X) = \frac{a_1 - a_2}{b_2 - b_1}$$

[Figure: two regression lines, Y = 4 + .3X and Y = 1.5 + .8X, plotted for X from 0 to 10 and Y from 0 to 12.]

The crossover is found by (a1-a2)/(b2-b1) or (4-1.5)/(.8-.3) = 2.5/.5 = 5, just where it appears to be on the graph.
Some places on X give equivalent effects. Other places show a benefit to one treatment or the other.
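A minimal sketch of the crossover calculation for the two lines on this slide, Y = 4 + .3X and Y = 1.5 + .8X. The function name is an assumption.

```python
def crossover(a1, b1, a2, b2):
    """X value where Y = a1 + b1*X and Y = a2 + b2*X intersect."""
    if b1 == b2:
        raise ValueError("Parallel lines never cross (no disordinal interaction).")
    return (a1 - a2) / (b2 - b1)


x_cross = crossover(a1=4.0, b1=0.3, a2=1.5, b2=0.8)
y_line1 = 4.0 + 0.3 * x_cross
y_line2 = 1.5 + 0.8 * x_cross
print(x_cross, y_line1, y_line2)
```

Both lines predict the same Y (5.5) at X = 5, which is what "the treatments are equal" means at the crossover.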
![Page 21: Analyzing Continuous and Categorical IVs Simultaneously](https://reader035.vdocuments.net/reader035/viewer/2022062314/5681389f550346895da05afd/html5/thumbnails/21.jpg)
Simultaneous Regions of Significance
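$$\text{Region} = \frac{-B \pm \sqrt{B^2 - AC}}{A}$$

$$A = \frac{-2F_{\alpha(2,\,N-4)}}{N-4}\,(ss_{res})\left(\frac{1}{\sum x_1^2} + \frac{1}{\sum x_2^2}\right) + (b_1 - b_2)^2$$

$$B = \frac{2F_{\alpha(2,\,N-4)}}{N-4}\,(ss_{res})\left(\frac{\bar{X}_1}{\sum x_1^2} + \frac{\bar{X}_2}{\sum x_2^2}\right) + (a_1 - a_2)(b_1 - b_2)$$

$$C = \frac{-2F_{\alpha(2,\,N-4)}}{N-4}\,(ss_{res})\left(\frac{N}{n_1 n_2} + \frac{\bar{X}_1^2}{\sum x_1^2} + \frac{\bar{X}_2^2}{\sum x_2^2}\right) + (a_1 - a_2)^2$$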
F is the tabled value. N is n1+n2 = total people.
![Page 22: Analyzing Continuous and Categorical IVs Simultaneously](https://reader035.vdocuments.net/reader035/viewer/2022062314/5681389f550346895da05afd/html5/thumbnails/22.jpg)
Disordinal Example (1)
[Figure: scatterplot titled "Disordinal Interaction Data", Test Score (y-axis, 20 to 100) against Learning Style (x-axis, 0 to 60). Points are labeled 1 and -1 by group.]
Hypothetical experiment in teaching Research Methods. Learning style: high scores indicate preference for spoken instruction. Two instruction methods: graphics intensive and spoken intensive.
N=40. X = learning style questionnaire score. G = method of instruction. DV is in-class test score.
![Page 23: Analyzing Continuous and Categorical IVs Simultaneously](https://reader035.vdocuments.net/reader035/viewer/2022062314/5681389f550346895da05afd/html5/thumbnails/23.jpg)
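Disordinal Example (2)

| R | Y (Test) | X (Learn Style) | G (Lect v. tutor) | GX (Int) |
|---|---|---|---|---|
| Y | 1 | | | |
| X | .22 | 1 | | |
| G | -.09 | .03 | 1 | |
| GX | .35 | .02 | .88 | 1 |
| M | 73.8 | 27.78 | 0 | .43 |
| SD | 15.06 | 15.14 | 1.01 | 31.94 |

| Source | df | SS | MS | F |
|---|---|---|---|---|
| Model | 3 | 8035.69 | 2678.56 | 119.53 |
| Error | 36 | 806.70 | 22.41 | |
| C Total | 39 | 8842.40 | | |

R² = .91

| Variable | Estimate | SE | t | p |
|---|---|---|---|---|
| Int | 67.09 | | | |
| G | -26.99 | 1.58 | -17.09 | .0001 |
| X | .227 | .05 | 4.54 | .0001 |
| GX | .917 | .05 | 18.33 | .0001 |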
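As a quick check on how the ANOVA summary hangs together, the sketch below recomputes the mean squares, F and R² from the sums of squares and degrees of freedom reported on this slide. The function name and dictionary keys are assumptions.

```python
def anova_summary(ss_model, ss_error, df_model, df_error):
    """Mean squares, F ratio, and R-squared from an ANOVA source table."""
    ms_model = ss_model / df_model
    ms_error = ss_error / df_error
    return {
        "ms_model": ms_model,
        "ms_error": ms_error,
        "f": ms_model / ms_error,
        "r2": ss_model / (ss_model + ss_error),
    }


# Values from the slide: Model SS = 8035.69 (df 3), Error SS = 806.70 (df 36).
summary = anova_summary(8035.69, 806.70, df_model=3, df_error=36)
print({name: round(value, 2) for name, value in summary.items()})
```

These reproduce the reported MS, F and R² values to rounding.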
![Page 24: Analyzing Continuous and Categorical IVs Simultaneously](https://reader035.vdocuments.net/reader035/viewer/2022062314/5681389f550346895da05afd/html5/thumbnails/24.jpg)
Disordinal Example (3)
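Group 1 data (G = 1, n1 = 20)

| R | Y | X |
|---|---|---|
| Y | 1 | |
| X | .95 | 1 |
| M | 72.4 | 28.2 |
| SD | 18.43 | 15.26 |

| Source | df | SS |
|---|---|---|
| Model | 1 | 5805.09 |
| Error | 18 | 651.71 |
| C Total | 19 | 6456.80 |

R² = .90

| Variable | Estimate | SE | t | p |
|---|---|---|---|---|
| Int | 40.10 | 2.88 | 13.9 | .0001 |
| X | 1.15 | .09 | 12.66 | .0001 |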
![Page 25: Analyzing Continuous and Categorical IVs Simultaneously](https://reader035.vdocuments.net/reader035/viewer/2022062314/5681389f550346895da05afd/html5/thumbnails/25.jpg)
Disordinal Example (4)
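Group 2 data (G = -1, n2 = 20)

| R | Y | X |
|---|---|---|
| Y | 1 | |
| X | -.97 | 1 |
| M | 75.2 | 27.35 |
| SD | 11.02 | 15.41 |

| Source | df | SS |
|---|---|---|
| Model | 1 | 2152.21 |
| Error | 18 | 154.99 |
| C Total | 19 | 2307.20 |

R² = .93

| Variable | Estimate | SE | t | p |
|---|---|---|---|---|
| Int | 94.09 | 1.36 | 69.03 | .0001 |
| X | -.69 | .04 | -15.81 | .0001 |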
![Page 26: Analyzing Continuous and Categorical IVs Simultaneously](https://reader035.vdocuments.net/reader035/viewer/2022062314/5681389f550346895da05afd/html5/thumbnails/26.jpg)
Disordinal Example (5)
Therefore, the regression with all terms included is: Y' = 67.09 - 26.99G + .23X + .92GX
The regression for the 1 group is: Y' = 40.1 + 1.15X
The regression for the -1 group is: Y' = 94.09 - .69X
To find the crossover point, we find (a1-a2)/(b2-b1), which in our case is (40.10-94.09)/(-.69-1.15) = (-53.99)/(-1.84) = 29.34.

Quantities needed for the regions of significance:
N = 40, n1 = 20, n2 = 20; Group 1 is coded 1, Group 2 is coded -1
F.05(2,36) = 3.26
SSres(tot) = 806.70, SSres(1) = 651.71, SSres(2) = 154.99
Note: SSres(tot) = SSres(1) + SSres(2)
$\sum x_1^2$ = 4424.48 (SD = 15.26; SS = SD² × (N-1) = 15.26 × 15.26 × 19, with N the group size of 20)
$\sum x_2^2$ = 4511.89 (SD = 15.41; SS = 15.41 × 15.41 × 19)
$\bar{X}_1$ = 28.2 and $\bar{X}_2$ = 27.35 (from the correlation tables)
a1 = 40.10, b1 = 1.15, a2 = 94.09, b2 = -.69
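With effect coding, the full equation already contains both group lines: substituting G = 1 or G = -1 gives intercept a + b1·G and slope b2 + b3·G. The sketch below checks this against the separately estimated group regressions, using the estimates reported in the tables (.227 and .917 for the X and GX weights). The function name is an assumption.

```python
def group_line(a, b1, b2, b3, g):
    """Intercept and slope of the regression line for the group coded g."""
    return a + b1 * g, b2 + b3 * g


# Full-model estimates from the example: Y' = 67.09 - 26.99G + .227X + .917GX
a, b1, b2, b3 = 67.09, -26.99, 0.227, 0.917
int_1, slope_1 = group_line(a, b1, b2, b3, g=1)    # group coded 1
int_2, slope_2 = group_line(a, b1, b2, b3, g=-1)   # group coded -1
print(round(int_1, 2), round(slope_1, 2))
print(round(int_2, 2), round(slope_2, 2))
```

The results agree with the group regressions (40.10 + 1.15X and 94.09 - .69X) to within rounding of the reported estimates.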
![Page 27: Analyzing Continuous and Categorical IVs Simultaneously](https://reader035.vdocuments.net/reader035/viewer/2022062314/5681389f550346895da05afd/html5/thumbnails/27.jpg)
Disordinal Example (6)
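$$A = \frac{-2(3.26)}{36}(806.70)\left(\frac{1}{4424.48} + \frac{1}{4511.89}\right) + (1.15 + .69)^2 = 3.32$$

$$B = \frac{2(3.26)}{36}(806.70)\left(\frac{28.2}{4424.48} + \frac{27.35}{4511.89}\right) + (40.10 - 94.09)(1.15 + .69) = -97.52$$

$$C = \frac{-2(3.26)}{36}(806.70)\left(\frac{40}{(20)(20)} + \frac{28.2^2}{4424.48} + \frac{27.35^2}{4511.89}\right) + (40.10 - 94.09)^2 = 2849.83$$

$$\text{Region} = \frac{97.52 \pm \sqrt{97.52^2 - (3.32)(2849.83)}}{3.32}$$

Therefore, our estimates are:
Lower 27.26
Middle 29.34
Upper 31.48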
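The worked calculation can be reproduced with a short function. This is a sketch of the A, B, C formulas as used in this example, fed with the values listed for the disordinal data; the function and argument names are assumptions, and the tabled F (3.26) is typed in rather than looked up.

```python
import math


def ss_from_sd(sd, n):
    """Sum of squares of X from its standard deviation: SD^2 * (n - 1)."""
    return sd ** 2 * (n - 1)


def regions_of_significance(f_crit, ss_res, n1, n2, ssx1, ssx2,
                            mean_x1, mean_x2, a1, b1, a2, b2):
    """Return (lower, upper) X bounds of the non-significant region."""
    n = n1 + n2
    k = 2.0 * f_crit / (n - 4) * ss_res
    big_a = -k * (1 / ssx1 + 1 / ssx2) + (b1 - b2) ** 2
    big_b = k * (mean_x1 / ssx1 + mean_x2 / ssx2) + (a1 - a2) * (b1 - b2)
    big_c = -k * (n / (n1 * n2) + mean_x1 ** 2 / ssx1
                  + mean_x2 ** 2 / ssx2) + (a1 - a2) ** 2
    discriminant = big_b ** 2 - big_a * big_c
    if discriminant < 0:
        raise ValueError("No real solution for the region bounds.")
    root = math.sqrt(discriminant)
    bounds = sorted(((-big_b - root) / big_a, (-big_b + root) / big_a))
    return bounds[0], bounds[1]


lower, upper = regions_of_significance(
    f_crit=3.26, ss_res=806.70, n1=20, n2=20,
    ssx1=ss_from_sd(15.26, 20), ssx2=ss_from_sd(15.41, 20),
    mean_x1=28.2, mean_x2=27.35,
    a1=40.10, b1=1.15, a2=94.09, b2=-0.69,
)
print(round(lower, 2), round(upper, 2))
```

Between the two bounds the treatments do not differ significantly; below the lower bound one treatment is better, and above the upper bound the other is.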
![Page 28: Analyzing Continuous and Categorical IVs Simultaneously](https://reader035.vdocuments.net/reader035/viewer/2022062314/5681389f550346895da05afd/html5/thumbnails/28.jpg)
Disordinal Example (7)
[Figure: scatterplot titled "Disordinal Interaction Data", Test Score (y-axis, 20 to 100) against Learning Style (x-axis, 0 to 60), with points labeled 1 and -1 by group and the "N.S. Region" marked.]