1
Psych 5510/6510
Chapter 13:
ANCOVA: Models with Continuous and Categorical Predictors
Part 1: Increasing Power in True Experimental Designs
Spring, 2009
2
ANCOVA: “The analysis of Covariance”
Originated in the analysis of experimental designs. The goal was to investigate the effects of the categorical variables while controlling some continuously measured variable, called a covariate.
3
Model Comparison Approach
In the model comparison approach we are simply putting both continuous and categorical variables into our model. Usually this will involve using categorical variables to code our independent variable (i.e. which experimental group the subject belongs in) and continuous variables to measure some other aspect of each subject (something that is not being manipulated by the experimenter, e.g. height or age).
4
Contexts
We will be looking at three contexts in which this will be useful:
1. Within a ‘true experimental’ design, where we can use this approach to increase the power of the design and to add sophistication to our model.
2. Within a ‘quasi-experimental’ or ‘static group’ design, where we can use this approach to control a confounding variable.
3. Within a correlational design, where we can introduce a categorical variable to better understand a continuous variable.
5
Context 1: True Experimental Designs
True experimental design: subjects are randomly divided into groups, the independent variable is then manipulated by the experimenter.
As the subjects are randomly assigned to groups, it is assumed that the group means start off being fairly equal. If, after the independent variable has been applied, a statistically significant difference between the group means is found it is interpreted as being the result of the independent variable.
6
‘Priming’ Example
60 subjects are randomly divided into two groups. Each subject is shown two words, a ‘priming’ word for 2 seconds, followed by a ‘test’ word. The perceptual threshold of the test word is measured. For Group 1, the priming word is similar in shape to the test word, for Group 2, the priming word is similar in meaning to the test word.
• IV: Type of prime (shape or meaning)
• DV: perceptual threshold
For the analysis contrast coding (1 and –1) was used to code the independent variable.
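As a small illustration of what contrast coding does, the sketch below regresses a made-up set of scores on a +1/-1 code. The scores and the helper `fit_line` are hypothetical (they are not the lecture data), and the sketch assumes Group 1 is coded +1 and Group 2 is coded -1. With equal group sizes the intercept should come out as the mean of the two group means and the slope as half the difference between them.

```python
def fit_line(x, y):
    """Ordinary least squares for one predictor; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical scores, three subjects per group (not the lecture data).
group1_scores = [3.0, 5.0, 4.0]   # mean = 4
group2_scores = [1.0, 2.0, 3.0]   # mean = 2

# Contrast code: +1 for Group 1, -1 for Group 2 (assumed assignment).
codes = [1.0] * len(group1_scores) + [-1.0] * len(group2_scores)
scores = group1_scores + group2_scores

b0, b1 = fit_line(codes, scores)
print("intercept:", b0)   # mean of the two group means
print("slope:", b1)       # half the difference between the group means
```

Applying the same arithmetic to the group means reported for this example (31.13 and 29.42) gives (31.13 + 29.42)/2 = 30.275 and (31.13 - 29.42)/2 = 0.855, which agree up to rounding with the regression coefficients in the SPSS output.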
7
Results: t Test for Independent Groups
Mean Group 1: 31.13
Mean Group 2: 29.42
t(58)=1.942, p=.057
8
Results: Linear Regression
Model Summary
Model   R      R Square   Adjusted R Square   Std. Error of the Estimate
1       .247   .061       .045                3.40635
Predictors: (Constant), Group

ANOVA
Source       Sum of Squares   df   Mean Square   F       Sig.
Regression   43.776           1    43.776        3.773   .057
Residual     672.985          58   11.603
Total        716.761          59
Predictors: (Constant), Group
Dependent Variable: Priming

Coefficients
Term         B        Std. Error   Beta   t        Sig.
(Constant)   30.271   .440                68.835   .000
Group        .854     .440         .247   1.942    .057
Dependent Variable: Priming

Ŷi = 30.27 + 0.854(Groupi)
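The F and R Square values in the output above can be recovered from the sums of squares using the model comparison quantities. The sketch below is only a check on the arithmetic; the function name `pre_and_f` is made up for illustration, SSE(C) is taken to be the Total sum of squares (the error of the intercept-only model) and SSE(A) the Residual sum of squares.

```python
def pre_and_f(sse_c, sse_a, pc, pa, n):
    """Proportional reduction in error and its F statistic.

    sse_c, sse_a: error sums of squares for Model C and Model A
    pc, pa:       number of parameters in Model C and Model A
    n:            number of observations
    """
    pre = (sse_c - sse_a) / sse_c
    f = (pre / (pa - pc)) / ((1 - pre) / (n - pa))
    return pre, f

# Model C: intercept only (1 parameter). Model A: intercept + Group (2 parameters).
pre, f = pre_and_f(sse_c=716.761, sse_a=672.985, pc=1, pa=2, n=60)
print("PRE:", round(pre, 3))
print("F:", round(f, 3))
```

PRE here is the same quantity SPSS labels R Square, and F should match the square of the t value for Group up to rounding.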
9
Interpretation
The results were not quite statistically significant, so we cannot conclude that the independent variable had an effect. Perhaps if the experiment had just a little more power we could have rejected the null hypothesis. In the previous semester we examined what influences power; here we will focus on the variance within the groups. If we can reduce the variance of the scores within the groups, then we can increase the power of the experiment.
10
Within-Group Variance
Group 1: Standard deviation=3.07
Group 2: Standard deviation=3.71
Is that a lot? Well, it’s hard to say, but we can think about reducing it. To do that we can ask the question, ‘Why do the scores differ within each group?’ For our purposes in this chapter we will refine the question to ‘What measurable attribute of the subjects might be correlated with the dependent variable (perceptual threshold)?’ Age comes to mind. There was a wide variety of ages within each group. If age is correlated with perceptual threshold, and the ages of the participants vary within each group, then age could account for some of the within-group variance.
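To make the link between within-group variance and the t statistic concrete, here is a minimal sketch that rebuilds the independent-groups t from the reported means and standard deviations. It assumes 30 subjects per group (the slides say 60 subjects in two groups but do not state the split), and the second call uses hypothetical smaller standard deviations purely to show the direction of the effect.

```python
import math

def pooled_t(mean1, mean2, sd1, sd2, n1, n2):
    """t statistic for two independent groups using the pooled variance."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se_diff

# Reported means and SDs; n = 30 per group is an assumption.
t_observed = pooled_t(31.13, 29.42, 3.07, 3.71, 30, 30)

# Same mean difference, hypothetical smaller within-group SDs.
t_less_noise = pooled_t(31.13, 29.42, 2.5, 2.5, 30, 30)

print("t with reported SDs:", round(t_observed, 3))
print("t with smaller SDs: ", round(t_less_noise, 3))
```

The first value should land close to the reported t(58)=1.942 (small differences come from rounding in the summary statistics). The second is larger even though the mean difference has not changed, which is the sense in which reducing within-group variance buys power.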
11
Is Age Correlated with Perceptual Threshold?
MODEL C: Ŷi = β0
MODEL A: Ŷi = β0 + β1Agei
ANOVA
Source       Sum of Squares   df   Mean Square   F        Sig.
Regression   113.783          1    113.783       11.790   .001
Residual     559.754          58   9.651
Total        673.536          59
Predictors: (Constant), Age
Dependent Variable: Y
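As a check on this table, the PRE and F for adding Age to the intercept-only model can be worked out directly from the sums of squares, taking SSE(C) as the Total row and SSE(A) as the Residual row:

```latex
\mathrm{PRE} = \frac{SSE(C) - SSE(A)}{SSE(C)} = \frac{113.783}{673.536} \approx .169,
\qquad
F_{1,58} = \frac{113.783 / 1}{559.754 / 58} = \frac{113.783}{9.651} \approx 11.79
```

So Age by itself accounts for roughly 17% of the error left by the simplest model.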
12
Adding Age to the Experiment
Up to now our approach has been:
MODEL C: Ŷi = β0
MODEL A: Ŷi = β0 + β1Groupi
H0: β1 = 0 Ha: β1 ≠ 0
Now we are going to move to:
MODEL C: Ŷi = β0 + β2Agei
MODEL A: Ŷi = β0 + β2Agei + β1Groupi
H0: β1 = 0 Ha: β1 ≠ 0
13
Adding Age to the Exp. (cont.)
MODEL A: Ŷi = β0 + β2Agei + β1Groupi
The mechanics of this are simple: we measure each subject’s age and add that variable to the model, along with the variable that uses a contrast code to indicate which group the subject is in.
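The two model comparisons can be tried out on simulated data. The sketch below is not the lecture data set: the sample, the effect sizes (0.3 per year of age, 1.0 for group) and the noise level are all invented, and the helpers `solve`, `sse` and `pre_and_f` are hypothetical names written out in plain Python so that nothing beyond the standard library is needed.

```python
import random

def solve(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def sse(predictors, y):
    """Fit y on an intercept plus the given predictor columns; return the SSE."""
    cols = [[1.0] * len(y)] + predictors
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in cols]
    b = solve(xtx, xty)
    fitted = [sum(bk * col[i] for bk, col in zip(b, cols)) for i in range(len(y))]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

def pre_and_f(sse_c, sse_a, pc, pa, n):
    """PRE and F for comparing Model A (pa parameters) with Model C (pc)."""
    pre = (sse_c - sse_a) / sse_c
    return pre, (pre / (pa - pc)) / ((1 - pre) / (n - pa))

# Simulated experiment (all numbers are made up for illustration).
random.seed(1)
n = 60
group = [1.0] * 30 + [-1.0] * 30                 # contrast code
age = [random.uniform(18, 60) for _ in range(n)]
y = [20 + 0.3 * a + 1.0 * g + random.gauss(0, 3) for a, g in zip(age, group)]

# Previous approach: add Group to the intercept-only model.
pre_no_age, f_no_age = pre_and_f(sse([], y), sse([group], y), 1, 2, n)
# New approach: add Group to a model that already contains Age.
pre_with_age, f_with_age = pre_and_f(sse([age], y), sse([age, group], y), 2, 3, n)

print("Group added to intercept only: PRE=%.3f F=%.2f" % (pre_no_age, f_no_age))
print("Group added to Age model:      PRE=%.3f F=%.2f" % (pre_with_age, f_with_age))
```

When the covariate really is related to Y and is spread similarly across the groups, the second comparison will usually show the larger PRE and F, though any single simulated sample can depart from that pattern by chance.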
14
Why This Helps (ANOVA)
If we think about this in terms of the t test for independent groups, what we are accomplishing is to remove from each group the variance that can be accounted for by knowing the subject’s age. If that reduces the variance in each group then the power of the experiment should increase.
15
Why This Helps (Regression)
But let’s think about this in terms of the model comparison approach.
Previous:
MODEL C: Ŷi = β0
MODEL A: Ŷi = β0 + β1Groupi
Now:
MODEL C: Ŷi = β0 + β2Agei
MODEL A: Ŷi = β0 + β2Agei + β1Groupi
If we are interested in the worthwhileness of adding the variable ‘Group’ to the model, why would adding it to a model that already contains ‘Age’ be better than adding it to a model that doesn’t contain ‘Age’?
16
Why This Helps (Regression)
To understand the explanation we need to note that the variables ‘Age’ and ‘Group’ are likely to be fairly non-redundant. Remember that redundancy can be thought of as how well you can use one variable to predict the other. We have randomly divided people into groups, so the groups probably have fairly similar distributions of ages; consequently we shouldn’t be able to use what group a person is in to predict their age, or vice versa. If the mean age in each group is the same, then Age and Group are uncorrelated (completely non-redundant). The mean ages in the two groups, however, will probably not be exactly the same, so there could be a small amount of redundancy.
17
Why This Helps (Regression)
In the following diagrams I show how adding X1 to a model of Y that already contains X2 is more powerful than without X2, but only if the predictor variables are not very redundant.
18
Note that while the amount of Y that can be explained by X1 is the same in both cases, the PRE is greater below:
19
The situation would be different if X1 and X2 were quite redundant:
20
Redundancy of Age and Group
The correlation between Age and Group is r=.007 (very low). If we square that to get the value of R² we get a value very close to zero, which would make the tolerance of Age and Group essentially 1. The redundancy doesn’t need to be this low for the covariate to add power, but I’m not complaining.
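Spelled out, with only two predictors in the model the tolerance of each is one minus the squared correlation between them:

```latex
r^2 = (.007)^2 = .000049,
\qquad
\text{tolerance} = 1 - r^2 = 1 - .000049 \approx .99995
```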
21
Results with Covariate (Age) Included
Overall Analysis

Model Summary
Model   R      R Square   Adjusted R Square   Std. Error of the Estimate
1       .469   .220       .192                3.13247
Predictors: (Constant), Age, Group

ANOVA
Source       Sum of Squares   df   Mean Square   F       Sig.
Regression   157.456          2    78.728        8.023   .001
Residual     559.305          57   9.812
Total        716.761          59
Predictors: (Constant), Age, Group
Dependent Variable: Priming
22
Results with Covariate (Age) Included
Ŷi = 21.65 + 0.864(Groupi) + .281(Agei)
Both Group (p=.0367) and Age (p=.001) are worthwhile when added last to a model that contains the other predictor. Without Age in the model (our previous analysis) Group was not significant. The tolerances (not shown above) are very close to 1.00, indicating that Age and Group have very little redundancy.

Coefficients
Term         B        Std. Error   Beta   t       Sig.   Zero-order   Partial   Part
(Constant)   21.651   2.564               8.443   .000
Age          .281     .083         .398   3.404   .001   .397         .411      .398
Group        .864     .404         .250   2.135   .037   .247         .272      .250
Dependent Variable: Priming
23
Summary Table
Source SS df MS F p PRE
Regression 157.46 2 78.73 8.023 .001 .220
Group 44.73 1 44.73 4.56 .037 .074
Age 113.70 1 113.70 11.59 .001 .169
Residual 559.31 57 9.81
Total 716.76 59
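The PRE and F entries for Group and Age in this table can be reproduced from the sums of squares. In the sketch below (function names are made up for illustration) each predictor is treated as added last: Model A is the full model, so SSE(A) is the Residual row, and SSE(C) is SSE(A) plus the SS listed for the predictor being tested.

```python
SSE_A = 559.31   # Residual SS of the full model (Group and Age)
MSE_A = 9.81     # Residual MS of the full model

def pre_added_last(ss_unique, sse_a=SSE_A):
    """PRE for a predictor added last: its SS over the error of the model without it."""
    sse_c = sse_a + ss_unique
    return ss_unique / sse_c

def f_added_last(ss_unique, df=1, mse_a=MSE_A):
    """F for a predictor added last: its mean square over the full-model MSE."""
    return (ss_unique / df) / mse_a

pre_group, f_group = pre_added_last(44.73), f_added_last(44.73)
pre_age, f_age = pre_added_last(113.70), f_added_last(113.70)

print("Group: PRE=%.3f F=%.2f" % (pre_group, f_group))
print("Age:   PRE=%.3f F=%.2f" % (pre_age, f_age))
```

These PRE values are also the squares of the partial correlations in the coefficients output (.272 for Group and .411 for Age).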
24
Summary
Without Age in the model the t value (from both the t test for independent groups as well as from the regression analysis) for the effect of Group (the independent variable) was 1.942, p=.057.
With Age in the model the t value for the effect of Group was 2.135, p=.037.
What happened? In terms of the t test for independent groups, including Age in the model removed the variance that could be accounted for by Age before the evaluation of the effect of Group, thus power was increased. In terms of model comparison, including Age in Model C lowered SSE(C) in a manner that was not redundant with Group, so the proportional reduction of error by adding Group was greater.
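One way to see where the larger t comes from is through the standard error of the Group coefficient, which depends on the mean square error of the model, the spread of the Group code, and the tolerance. The sketch below assumes 30 subjects coded +1 and 30 coded -1 (so the sum of squares of the code around its mean is 60); the function name is made up for illustration and the other numbers are taken from the two SPSS outputs.

```python
import math

def se_of_coefficient(mse, ss_x, tolerance=1.0):
    """Standard error of a regression coefficient.

    mse:       mean square error of the model containing the predictor
    ss_x:      sum of squared deviations of the predictor from its mean
    tolerance: 1 - R^2 of the predictor regressed on the other predictors
    """
    return math.sqrt(mse / (ss_x * tolerance))

SS_GROUP = 60.0  # assumption: equal groups of 30, codes +1 and -1

# Without Age: MSE = 11.603, Group is the only predictor (tolerance = 1).
se_without_age = se_of_coefficient(11.603, SS_GROUP)
# With Age: MSE = 9.812, tolerance = 1 - .007^2.
se_with_age = se_of_coefficient(9.812, SS_GROUP, tolerance=1 - 0.007 ** 2)

t_without_age = 0.854 / se_without_age
t_with_age = 0.864 / se_with_age

print("SE without Age: %.3f  t = %.3f" % (se_without_age, t_without_age))
print("SE with Age:    %.3f  t = %.3f" % (se_with_age, t_with_age))
```

The coefficient for Group barely moves (.854 versus .864); most of the gain in t comes from the drop in MSE, which shrinks the standard error from about .440 to about .404.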
25
Fancier Designs
With the tools we now have we can add covariates to any type of experimental design. For example, suppose we have two independent variables, ‘A’ and ‘B’, where A has two levels (contrast X1 can handle that) and B has three levels (contrasts X2 and X3 can handle that), we are also looking at the interaction of A and B (contrasts X4 and X5 can handle that), and we want to add two covariates, ‘Age’ (X6) and ‘Height’ (X7), to gain power. Then we regress Y on:
Ŷi = β0 + β1X1 + β2X2 + β3X3 + β4X4 + β5X5 + β6X6 + β7X7
If Age and Height are related to Y and are not very redundant with the contrasts that code the independent variables, then they will increase the power of the tests of those contrasts.
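A sketch of how the seven predictors for such a design might be built for one subject is given below. The particular contrast codes are an assumption (the slides do not say which codes are used for B): here X2 compares b1 with b2 and b3, X3 compares b2 with b3, and the interaction contrasts are the products X1·X2 and X1·X3. The names `A_CODES`, `B_CODES` and `design_row` are hypothetical.

```python
from itertools import product

# Assumed contrast codes (one possible orthogonal set).
A_CODES = {"a1": 1, "a2": -1}                              # X1
B_CODES = {"b1": (2, 0), "b2": (-1, 1), "b3": (-1, -1)}    # (X2, X3)

def design_row(a_level, b_level, age, height):
    """Return [X1, ..., X7] for one subject."""
    x1 = A_CODES[a_level]
    x2, x3 = B_CODES[b_level]
    x4 = x1 * x2          # A x B interaction, first contrast
    x5 = x1 * x3          # A x B interaction, second contrast
    return [x1, x2, x3, x4, x5, age, height]

# One row per cell of the 2 x 3 design (covariates set to 0 as placeholders).
cells = list(product(A_CODES, B_CODES))
rows = [design_row(a, b, age=0.0, height=0.0) for a, b in cells]
for cell, row in zip(cells, rows):
    print(cell, row[:5])

# Example subject: level a2 of A, level b3 of B, age 34, height 170 (made up).
print(design_row("a2", "b3", age=34.0, height=170.0))
```

With equal numbers of subjects in each cell, these five contrast columns are mutually orthogonal, so each one tests a separate question about the independent variables.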
26
A Powerful Tool
This procedure provides a very simple tool for increasing the power of a true experimental design.
1. Think of some reason why scores will differ within groups (e.g. age, income level, height, gender).
2. Measure that.
3. Test to see if that measure is significantly correlated with the dependent variable (i.e. regress the dependent variable on the measure).
4. If it is, add it to the model to increase the power of the test for the independent variable(s).
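Step 3 of the procedure above can be sketched as follows. The ages and thresholds are hypothetical, and `pearson_r` and `t_for_r` are illustrative helper names. The t value has n - 2 degrees of freedom and would be compared with a critical value from a t table (the standard library has no t distribution, so no p value is computed here).

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

def t_for_r(r, n):
    """t statistic (df = n - 2) for testing whether a correlation differs from 0."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Hypothetical covariate and dependent variable for five subjects.
age = [20.0, 30.0, 40.0, 50.0, 60.0]
threshold = [25.0, 27.0, 30.0, 31.0, 35.0]

r = pearson_r(age, threshold)
t = t_for_r(r, len(age))
print("r = %.3f, t(%d) = %.2f" % (r, len(age) - 2, t))
```

Testing the correlation this way is equivalent to regressing the dependent variable on the covariate alone: the squared t equals the F from that regression.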
27
Ways of Thinking About It
If we approach our original example from the perspective of a t test for independent groups, then our focus is on whether or not the independent variable (type of priming) had an effect on the dependent variable (perceptual threshold). Including the covariate of Age is simply a means of increasing the power of the experiment, but our focus remains on the effect of the independent variable.
28
Ways of Thinking About It
If we approach what we are doing from the model comparison approach, then our focus is on trying to model perceptual thresholds, and we are interested in whether it would be good to have both Age and Type of Priming be part of our model. Our analysis shows that both are worthwhile. I wonder if they interact? Wouldn’t it be interesting if the effect of ‘type of priming’ was different across ages? Let’s check it out. We’ll create another variable that is (Group)x(Age) to test for an interaction...
29
Interactive Model
MODEL C: Ŷi = β0 + β1Groupi + β2Agei
MODEL A: Ŷi = β0 + β1Groupi + β2Agei + β3GroupiAgei
PRE = (-.150)² = .02, p = .260

Coefficients
Term         B        Std. Error   Beta    t        Sig.   Zero-order   Partial   Part
(Constant)   22.432   2.648                8.471    .000
Age          .256     .085         .373    2.993    .004   .411         .371      .360
Group        2.895    2.648        .864    1.093    .279   -.029        .145      .132
GxA          -.097    .085         -.901   -1.139   .260   -.066        -.150     -.137
Dependent Variable: Y

Moving to an interactive model was not worthwhile, and we can see that including the interaction term made the effect of Group no longer statistically significant. Why? A look at the tolerances (not shown in this table) shows a great deal of redundancy between Group and the Group x Age interaction. So, let’s leave the interaction out of our model.
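The redundancy between Group and the Group x Age product is easy to reproduce. In the sketch below the ages are hypothetical and `pearson_r` is an illustrative helper. Because every age is positive, the product is simply +Age for one group and -Age for the other, so its sign always matches the Group code and the two variables end up strongly correlated.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical ages, the same six values in each group.
ages_per_group = [20.0, 25.0, 30.0, 35.0, 40.0, 45.0]
group = [1.0] * 6 + [-1.0] * 6            # contrast code
age = ages_per_group + ages_per_group
g_x_a = [g * a for g, a in zip(group, age)]   # the interaction predictor

r_group_age = pearson_r(group, age)
r_group_gxa = pearson_r(group, g_x_a)

print("r(Group, Age)         =", round(r_group_age, 3))
print("r(Group, Group x Age) =", round(r_group_gxa, 3))
print("upper bound on tolerance of Group:", round(1 - r_group_gxa ** 2, 3))
```

The last line is only an upper bound: the tolerance SPSS reports is based on regressing Group on all the other predictors, so it can be no larger than one minus this squared pairwise correlation.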
30
The Model
MODEL: Ŷi = β0 + β1Groupi + β2Agei
This is our best model of perceptual threshold so far. I wonder what variable to try next? We could think of another continuous variable that we could measure in our next study, or we might want to manipulate some independent variable and see if it adds to the model. The goal is to work towards a better and better model of perceptual threshold. This is the flavor of the model comparison approach.