Chapter 22: Analysis of Covariance (ANCOVA)

Lecture 15, April 17, 2007, Psychology 791

The Idea

● The basics

● Why we use Covariates

● Example Covariates

● Book terminology

● Additional Info

The Model

ANCOVA Example

Model Fitting

Concluding Remarks

The basics

■ Our ANCOVA model incorporates ideas that we learned in both regression and ANOVA.

■ This model takes the ANOVA model that we learned previously and adds a covariate.

■ The difference is that this covariate is a continuous variable, not a categorical variable.

■ In a sense, it is sort of like blocking.

■ The difference from blocking is that the covariate is continuous and usually not part of the experimental design.

Why we use Covariates

■ ANOVA models that contain covariates rely on the idea that covariates absorb error variance.

■ This is analogous to our blocking variable, in the sense that it absorbs some error variance.

■ We are controlling for some factor that we think might affect our response variable.

Some examples of Covariates

■ Covariates are continuous independent variables that might have an influence on our dependent variable.

■ For example, if we want to see the effect some study method has on students’ test results, some covariates might be:

◆ Some pre-test measure.

◆ Age.

◆ Intelligence.

■ By adding these to the model (usually one, sometimes more), we are absorbing some outside variance. So we can say: our results indicate that test scores are affected by study method, irrespective of intelligence.

■ A note of caution: as you add more and more covariates to the model, you absorb more and more error variance, but you also lose more and more degrees of freedom (one error degree of freedom per covariate). So you may inadvertently end up with a higher MSE.
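The trade-off in the last bullet can be written out. As a sketch, assume $n_T$ total observations, $r$ treatment levels, and $q$ covariates (the symbols $n_T$ and $q$ are notation introduced here, not the lecture's):

```latex
\mathrm{MSE} = \frac{\mathrm{SSE}}{n_T - r - q}
```

Each added covariate can only lower SSE or leave it unchanged, but it also lowers the denominator by one. MSE therefore falls only when the proportional drop in SSE is larger than the proportional drop in error degrees of freedom.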

Book terminology

■ Just so you don’t get confused, the book uses two words to describe these added variables.

◆ Concomitant: occurring or existing concurrently.

◆ Covariate.

■ Both mean the same thing. These are variables that exist and are collected concurrently with our other independent variables.

■ As opposed to a blocking variable, covariates are not part of some master research design; they are simply other IVs that are collected.

Some Additional Info on Covariates

■ As a note, what a covariate does is partial out the shared information related to that covariate (above and beyond what’s being accounted for by the other experimental factors).

■ If we control for intelligence, then your results (factor level means) can be interpreted as if everyone in the study had equal intelligence.

■ If we control for age, the results are interpreted as if all individuals had the same age.

■ Remember in regression, when we had multiple continuous IVs, each of our parameters was interpreted "while holding the others constant."

■ This is what you should think about when interpreting your results: the effect of the treatment holding the covariate constant.

Covariates...Continued

■ The covariate itself is thought to have an effect on the response variable, but we do not want it to be related to the treatment IN ANY WAY.

■ These covariates still need to be IVs, that is, independent of the other variables on the right side of the model.

■ They are usually collected prior to treatment (pretest), or are just something collected at the same time that is not expected to be related to the treatment (second test score).

■ When there is some relationship between your treatment and the covariate, you will not only be fitting the wrong model, but also interpreting the wrong results.

◆ But there are more appropriate models for cases where this occurs.

Model Formulation

■ It is amazing we got this far in lecture without seeing any Greek letters, but fear not, that is about to change.

■ We are going to take our regular ANOVA model and merge it with a regression model by adding a continuous variable.

◆ And of course, we are going to need the help of a new Greek letter: γ.

ANOVA Model

■ Here is the ANOVA model:

$$Y_{ij} = \mu_{\cdot} + \tau_i + \varepsilon_{ij}$$

■ Where:

■ $\mu_{\cdot}$ is a constant component common to all observations (mean or weighted mean)

■ $\tau_i$ is the effect of the $i$th factor level

■ $\varepsilon_{ij}$ are independent $N(0, \sigma^2)$

■ $i = 1, \ldots, r$; $j = 1, \ldots, n_i$

ANCOVA Model 1

■ Here is the model:

$$Y_{ij} = \mu_{\cdot} + \tau_i + \gamma X_{ij} + \varepsilon_{ij}$$

■ There is one difference in interpretation:

◆ Here, $\mu_{\cdot}$ is not the overall mean.

■ If we center our covariate, $\mu_{\cdot}$ will represent the grand mean.
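To see why centering changes the meaning of the constant, the uncentered model can be rewritten around the overall covariate mean $\bar{X}_{\cdot\cdot}$. This is a short algebraic sketch; the symbol $\mu^{*}_{\cdot}$ for the uncentered constant is introduced here only to keep the two constants apart:

```latex
\begin{aligned}
Y_{ij} &= \mu^{*}_{\cdot} + \tau_i + \gamma X_{ij} + \varepsilon_{ij} \\
       &= \underbrace{\left(\mu^{*}_{\cdot} + \gamma \bar{X}_{\cdot\cdot}\right)}_{\mu_{\cdot}}
          + \tau_i + \gamma\left(X_{ij} - \bar{X}_{\cdot\cdot}\right) + \varepsilon_{ij}
\end{aligned}
```

The slope $\gamma$ and the treatment effects $\tau_i$ are unchanged. Only the constant shifts, from a value that refers to $X = 0$ to one that refers to the average covariate value.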

ANCOVA model 2

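■ Here is the new model:

$$Y_{ij} = \mu_{\cdot} + \tau_i + \gamma (X_{ij} - \bar{X}_{\cdot\cdot}) + \varepsilon_{ij}$$

■ Where:

■ $\mu_{\cdot}$ is the overall mean

■ $\tau_i$ is the effect of the $i$th factor level

■ $\gamma$ is the regression coefficient for the relationship between $Y$ and $X$

■ $\varepsilon_{ij}$ are independent $N(0, \sigma^2)$

■ $i = 1, \ldots, r$; $j = 1, \ldots, n_i$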
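For a single factor and a single covariate, the least-squares estimates for this model have a closed form: the common slope is the pooled within-treatment slope, and each treatment mean is adjusted to the overall covariate mean. The sketch below illustrates that calculation in plain Python. The function name `fit_ancova` and the tiny data set are made up for illustration; they are not the lecture's data or code.

```python
# Sketch: least-squares estimates for the one-factor ANCOVA model with one
# covariate, using the closed-form pooled within-group slope.
# The data below are hypothetical and chosen so the answer is easy to verify.

def mean(values):
    return sum(values) / len(values)

def fit_ancova(groups):
    """groups maps a treatment label to a list of (x, y) pairs.

    Returns (gamma_hat, adjusted_means), where adjusted_means[label] is the
    estimated mean response for that treatment at the overall mean of X.
    """
    all_x = [x for pairs in groups.values() for x, _ in pairs]
    grand_x = mean(all_x)

    sxy = 0.0  # pooled within-group sum of cross-products
    sxx = 0.0  # pooled within-group sum of squares of X
    group_means = {}
    for label, pairs in groups.items():
        x_bar = mean([x for x, _ in pairs])
        y_bar = mean([y for _, y in pairs])
        group_means[label] = (x_bar, y_bar)
        sxy += sum((x - x_bar) * (y - y_bar) for x, y in pairs)
        sxx += sum((x - x_bar) ** 2 for x, _ in pairs)

    gamma_hat = sxy / sxx
    adjusted = {
        label: y_bar - gamma_hat * (x_bar - grand_x)
        for label, (x_bar, y_bar) in group_means.items()
    }
    return gamma_hat, adjusted

# Hypothetical data lying exactly on two parallel lines with slope 2:
# group "A": y = 1 + 2x, group "B": y = 4 + 2x.
example = {
    "A": [(1, 3), (2, 5), (3, 7)],
    "B": [(3, 10), (4, 12), (5, 14)],
}
gamma_hat, adjusted = fit_ancova(example)
print(gamma_hat)  # pooled slope
print(adjusted)   # adjusted treatment means at the grand mean of X
```

Because the made-up points lie exactly on parallel lines, the pooled slope recovers the common slope, and the adjusted means are the heights of the two lines at the overall covariate mean.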

Pattern?

■ I hope by now you are seeing a pattern in the model formulations.

■ This is why what we are using is called the GLM (general linear model).

■ It is general and can be adapted to many different situations.

Properties of ANCOVA

■ Most of the properties of ANCOVA are the same as for the ANOVA model, so let us concentrate on those that are different.

■ When you are asked to predict an observation, say $Y_{ij}$, what you would usually do in an ANOVA model is say that it is the treatment level mean $\mu_i$.

■ This, however, is not the case for the ANCOVA model, because the mean of each observation incorporates both the treatment and the covariate.

■ Hence, our new prediction for any observation is:

$$\hat{Y}_{ij} = \mu_{\cdot} + \tau_i + \gamma (X_{ij} - \bar{X}_{\cdot\cdot})$$

Plots of Treatment Effects

■ Thinking about the model formulation, each treatment can now be represented by a line, not just a point.

■ Now our $\tau_i$ describes not a single treatment mean, but the difference between these lines at any given value of $X$.

■ Given this, wouldn’t it be helpful if the distance between any two treatment lines was the same at every level of $X$?

◆ That way our $\tau_i$ would be the same at all levels of $X$...

■ Well, that is exactly the assumption that we are going to make about our model.
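The parallel-lines idea can be checked directly from the centered model. As a sketch, take two treatments $i$ and $k$ (the index $k$ is introduced here) and subtract their mean responses at the same covariate value $X$:

```latex
\begin{aligned}
E[Y \mid i, X] - E[Y \mid k, X]
  &= \left[\mu_{\cdot} + \tau_i + \gamma (X - \bar{X}_{\cdot\cdot})\right]
   - \left[\mu_{\cdot} + \tau_k + \gamma (X - \bar{X}_{\cdot\cdot})\right] \\
  &= \tau_i - \tau_k
\end{aligned}
```

Because both lines share the slope $\gamma$, the covariate terms cancel and the gap between the lines does not depend on $X$.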

Consistency of Slope Assumption

■ This assumption is crucial to our fitting of the model.

■ If these slopes are not the same, then the treatment effect $\tau_i$ for each level $i$ changes with the level of $X$.

■ We will see shortly that we are able to test this assumption by adding an interaction term to the model and testing its significance (to be continued...).
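To make the consequence of unequal slopes concrete, suppose each treatment had its own slope $\gamma_i$ (notation assumed here for illustration; the lecture's model has a single $\gamma$). The difference between two treatment lines would then be:

```latex
\begin{aligned}
E[Y \mid i, X] &= \mu_{\cdot} + \tau_i + \gamma_i (X - \bar{X}_{\cdot\cdot}) \\
E[Y \mid i, X] - E[Y \mid k, X]
  &= (\tau_i - \tau_k) + (\gamma_i - \gamma_k)(X - \bar{X}_{\cdot\cdot})
\end{aligned}
```

The second term vanishes only when $\gamma_i = \gamma_k$. Otherwise the treatment difference depends on where along $X$ it is evaluated, which is why the equal-slopes assumption matters for interpreting $\tau_i$.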

Generalizing the Model

■ The model can be adapted to more scenarios in a number of ways.

■ Here are a few:

◆ Non-constant X variables (random variable, does not have a linear relationship with Y).

◆ Non-linear relationship (adding a squared term to identify a quadratic relationship with Y).

◆ Adding multiple covariates (add more than one covariate...we just add it in the same way and start numbering our γ parameters).
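Two of these generalizations can be written in the same style as the centered model. The following is a sketch only; the numbered parameters $\gamma_1, \gamma_2$ and the extra covariate subscript are notation assumed here, following the "start numbering our γ parameters" remark:

```latex
\begin{aligned}
\text{Quadratic covariate:}\quad
Y_{ij} &= \mu_{\cdot} + \tau_i + \gamma_1 (X_{ij} - \bar{X}_{\cdot\cdot})
          + \gamma_2 (X_{ij} - \bar{X}_{\cdot\cdot})^2 + \varepsilon_{ij} \\
\text{Two covariates:}\quad
Y_{ij} &= \mu_{\cdot} + \tau_i + \gamma_1 (X_{ij1} - \bar{X}_{\cdot\cdot 1})
          + \gamma_2 (X_{ij2} - \bar{X}_{\cdot\cdot 2}) + \varepsilon_{ij}
\end{aligned}
```

In both cases the treatment part of the model is untouched; only the regression part grows.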

Matrix Formulation

■ How would our model look in matrix form?
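One possible answer, offered as a sketch and not as the form shown in lecture: assume $r = 3$ treatments with $n_i = 2$ observations each, and reference-cell (dummy) coding with the last treatment as the reference level, which is the coding reflected in the SAS parameter estimates where the last treatment is set to 0. Write $x_{ij} = X_{ij} - \bar{X}_{\cdot\cdot}$ for the centered covariate. The names $\beta_0, \beta_1, \beta_2$ are introduced here.

```latex
\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon},
\qquad
\begin{bmatrix} Y_{11} \\ Y_{12} \\ Y_{21} \\ Y_{22} \\ Y_{31} \\ Y_{32} \end{bmatrix}
=
\begin{bmatrix}
1 & 1 & 0 & x_{11} \\
1 & 1 & 0 & x_{12} \\
1 & 0 & 1 & x_{21} \\
1 & 0 & 1 & x_{22} \\
1 & 0 & 0 & x_{31} \\
1 & 0 & 0 & x_{32}
\end{bmatrix}
\begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \gamma \end{bmatrix}
+
\begin{bmatrix} \varepsilon_{11} \\ \varepsilon_{12} \\ \varepsilon_{21} \\ \varepsilon_{22} \\ \varepsilon_{31} \\ \varepsilon_{32} \end{bmatrix}
```

Under this coding, $\beta_0$ is the mean of the reference treatment at the average covariate value, $\beta_1$ and $\beta_2$ are the differences of treatments 1 and 2 from the reference, and the last column of the design matrix carries the covariate with its single slope $\gamma$.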

ANCOVA Example

■ To illustrate, let’s go through an example.

■ Imagine we have assigned a different math book to each of four different classrooms (our treatment variable $T$).

◆ We want to see if there is a difference in math learning among the four books.

■ We have selected 10 students at random from each class; each takes a 25-item math test following the course.

■ Because the students were not randomly placed into each class, we also have a rating of each student’s math ability (from 1-10) from the student’s teachers.

Data

* the library path below is cut off in the source slide;
libname ancova 'C:\Documents and Settings\Jonathan Templin\Desktop\Psych...';

proc means data=ancova.ancovaexample;
  var covariate;
run;

data ancova;
  set ancova.ancovaexample;
  mc_cov = covariate - 7.125; *mean centering the covariate;
run;

*running analysis without covariate...;
proc glm data=ancova;
  class treatment;
  model y = treatment / solution;
  means treatment;
  lsmeans treatment / pdiff cl adjust=tukey;
run;

*now running the analysis with the covariate;
proc glm data=ancova;
  class treatment;
  model y = treatment mc_cov / solution;
  means treatment;
  lsmeans treatment / pdiff cl adjust=tukey;
run; quit;

*plotting the different slopes;
symbol1 color=black interpol=rl value=1;
symbol2 color=black interpol=rl value=2;
symbol3 color=black interpol=rl value=3;
symbol4 color=black interpol=rl value=4;
proc gplot data=ancova;
  plot y * mc_cov = treatment;
run; quit;

Analysis Without the Covariate

The GLM Procedure

Dependent Variable: Y   Dependent Variable

                                 Sum of
Source             DF           Squares    Mean Square   F Value   Pr > F
Model               3        56.1000000     18.7000000      2.53   0.0723
Error              36       265.8000000      7.3833333
Corrected Total    39       321.9000000

R-Square   Coeff Var   Root MSE     Y Mean
0.174278    15.48279   2.717229   17.55000

Source       DF    Type III SS    Mean Square   F Value   Pr > F
treatment     3    56.10000000    18.70000000      2.53   0.0723

                                    Standard
Parameter         Estimate             Error   t Value   Pr > |t|
Intercept      17.40000000 B      0.85926325     20.25     <.0001
treatment 1    -1.60000000 B      1.21518174     -1.32     0.1963
treatment 2     0.50000000 B      1.21518174      0.41     0.6832
treatment 3     1.70000000 B      1.21518174      1.40     0.1704
treatment 4     0.00000000 B       .               .        .

Analysis Without the Covariate

The GLM Procedure
Least Squares Means
Adjustment for Multiple Comparisons: Tukey

                            LSMEAN
treatment      Y LSMEAN     Number
1            15.8000000          1
2            17.9000000          2
3            19.1000000          3
4            17.4000000          4

Least Squares Means for effect treatment
Pr > |t| for H0: LSMean(i)=LSMean(j)
Dependent Variable: Y

i/j         1         2         3         4
1                0.3244    0.0475    0.5586
2      0.3244              0.7574    0.9761
3      0.0475    0.7574              0.5082
4      0.5586    0.9761    0.5082

Analysis With the Covariate

The GLM Procedure

Dependent Variable: Y   Dependent Variable

                                 Sum of
Source             DF           Squares    Mean Square   F Value   Pr > F
Model               4       221.2274083     55.3068521     19.23   <.0001
Error              35       100.6725917      2.8763598
Corrected Total    39       321.9000000

R-Square   Coeff Var   Root MSE     Y Mean
0.687255    9.663723   1.695983   17.55000

Source       DF    Type III SS    Mean Square   F Value   Pr > F
treatment     3     65.0422659     21.6807553      7.54   0.0005
mc_cov        1    165.1274083    165.1274083     57.41   <.0001

                                    Standard
Parameter         Estimate             Error   t Value   Pr > |t|
Intercept      17.62793661 B      0.53716011     32.82     <.0001
treatment 1    -2.10652579 B      0.76140733     -2.77     0.0090
treatment 2     0.39869484 B      0.75858468      0.53     0.6025
treatment 3     1.39608452 B      0.75952673      1.84     0.0745
treatment 4     0.00000000 B       .               .        .
mc_cov          1.01305158        0.13370375      7.58     <.0001
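The effect of adding the covariate shows up in the two ANOVA tables: the error sum of squares drops from 265.8 to about 100.67 at the cost of one error degree of freedom. The short sketch below recomputes MSE, the overall F statistic, and R-square from the sums of squares and degrees of freedom copied from the SAS output. The helper name `summarize` is made up for this illustration.

```python
# Sketch: recompute summary quantities from the two ANOVA tables in the output.
# All sums of squares and degrees of freedom are copied from the SAS listings.

def summarize(ss_model, df_model, ss_error, df_error):
    ms_model = ss_model / df_model
    mse = ss_error / df_error
    return {
        "MSE": mse,
        "F": ms_model / mse,
        "R2": ss_model / (ss_model + ss_error),
    }

without_cov = summarize(56.1, 3, 265.8, 36)
with_cov = summarize(221.2274083, 4, 100.6725917, 35)

for name, result in [("without covariate", without_cov),
                     ("with covariate", with_cov)]:
    print(name, {key: round(value, 4) for key, value in result.items()})
```

The recomputed values agree with the printed Mean Square, F Value, and R-Square entries, which is a quick way to confirm how the table columns relate to one another.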

Analysis With the Covariate

The GLM Procedure
Least Squares Means
Adjustment for Multiple Comparisons: Tukey-Kramer

                            LSMEAN
treatment      Y LSMEAN     Number
1            15.5214108          1
2            18.0266314          2
3            19.0240211          3
4            17.6279366          4

Least Squares Means for effect treatment
Pr > |t| for H0: LSMean(i)=LSMean(j)
Dependent Variable: Y

i/j         1         2         3         4
1                0.0116    0.0003    0.0426
2      0.0116              0.5603    0.9523
3      0.0003    0.5603              0.2732
4      0.0426    0.9523    0.2732

Within Treatment Regression Lines

■ Using the output, specify the within treatment regression lines:

◆ Book 1:

◆ Book 2:

◆ Book 3:

◆ Book 4:
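One way to work this exercise is to combine the parameter estimates from the "Analysis With the Covariate" output: each book's intercept is the Intercept estimate plus that book's treatment estimate, and every book shares the single mc_cov slope. The sketch below does that arithmetic; the variable names are chosen for this illustration, and the estimates are copied from the output.

```python
# Sketch: within-treatment regression lines implied by the common-slope model.
# Estimates are copied from the "Analysis With the Covariate" SAS output.
# Treatment 4 is the reference level, so its offset is 0.

intercept = 17.62793661
offsets = {1: -2.10652579, 2: 0.39869484, 3: 1.39608452, 4: 0.0}
slope = 1.01305158  # common slope for mc_cov (the mean-centered covariate)

lines = {book: (intercept + offset, slope) for book, offset in offsets.items()}

for book, (b0, b1) in lines.items():
    print(f"Book {book}: Y-hat = {b0:.4f} + {b1:.4f} * mc_cov")
```

Because mc_cov is mean centered, each intercept is the predicted score at the average covariate value, so the intercepts should agree with the least squares means reported for the covariate model.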

Plot of Within Treatment Regression Lines

Determining if Model is Appropriate

■ There are five key issues that we need to look at when testing the validity of using the ANCOVA model:

1. Normality of error terms.

2. Equal error variance for treatment levels.

3. Equal slopes for different treatments (relates to covariate).

4. Linear relationship between Y and covariate.

5. Uncorrelated error terms.

Inferences

■ Just like when we had a blocking variable, most of the time the covariate itself is not of interest.

■ Unlike the blocking variable, however, we do want to make sure there is a significant linear relationship between the covariate and the response.

■ Since it isn’t part of the experimental design, we can take it out if it is not significant and add another.

■ Again, the only factor of importance is our treatment (testing the significance of our $\tau$s).

■ The null hypothesis and testing are exactly the same as for a one-factor study; the only thing that differs is the interpretation.

Testing Parallel Slopes

■ The assumption that is most crucial to test is the equality of slopes.

■ Think about what a regression model fits: we are fitting a line for each treatment group.

■ When we do not have an interaction term, each line necessarily has the same slope, since we are only estimating one slope parameter.

■ If we include an interaction term between the covariate and treatment, we are now fitting more than one slope parameter, meaning that each treatment group CAN have a different slope.

■ If this effect is not significant, we can think of it in the same way as an interaction in an ANOVA model: the lines are parallel, meaning that the slopes CAN differ, but we find no evidence that they DO.

Testing Parallel Slopes

■ So what does it mean when there is a significant interaction term?

■ This means that not only CAN the slopes differ, but they DO differ.

■ The model is in a sense fitting a different regression line for each level of your IV (treatment).

■ If this is in fact true, we really have no way of interpreting our $\tau$ parameters alone (we need the covariate to describe these terms).

*check for interaction of covariate with treatment;
proc glm data=ancova;
  class treatment;
  model y = treatment|mc_cov / solution;
  means treatment;
  lsmeans treatment / pdiff cl adjust=tukey;
run;

Analysis With the Covariate Interaction

                                 Sum of
Source             DF           Squares    Mean Square   F Value   Pr > F
Model               7       222.2335908     31.7476558     10.19   <.0001
Error              32        99.6664092      3.1145753
Corrected Total    39       321.9000000

R-Square   Coeff Var   Root MSE     Y Mean
0.690381    10.05593   1.764816   17.55000

Source              DF    Type III SS    Mean Square   F Value   Pr > F
treatment            3     62.8196556     20.9398852      6.72   0.0012
mc_cov               1    152.2793868    152.2793868     48.89   <.0001
mc_cov*treatment     3      1.0061824      0.3353941      0.11   0.9550

                                           Standard
Parameter                Estimate             Error   t Value   Pr > |t|
Intercept             17.61188811 B      0.56136703     31.37     <.0001
treatment 1           -2.05980478 B      0.79719034     -2.58     0.0145
treatment 2            0.42613272 B      0.79221483      0.54     0.5944
treatment 3            1.40932748 B      0.79182845      1.78     0.0846
treatment 4            0.00000000 B       .               .        .
mc_cov                 0.94172494 B      0.26944540      3.50     0.0014
mc_cov*treatment 1    -0.04020979 B      0.43655144     -0.09     0.9272
mc_cov*treatment 2     0.16244172 B      0.37079348      0.44     0.6643
mc_cov*treatment 3     0.10873377 B      0.37952080      0.29     0.7763
mc_cov*treatment 4     0.00000000 B       .               .        .
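The F value in the mc_cov*treatment row can be reproduced by comparing the common-slope model (reduced) with the interaction model (full), using only the error sums of squares and error degrees of freedom from the two outputs. The sketch below shows that comparison; the function name `general_linear_test` is chosen for this illustration.

```python
# Sketch: test for equal slopes by comparing a reduced and a full model.
# SSE and error df are copied from the two SAS outputs:
#   reduced = common-slope model, full = model with mc_cov*treatment.

def general_linear_test(sse_reduced, df_reduced, sse_full, df_full):
    numerator = (sse_reduced - sse_full) / (df_reduced - df_full)
    denominator = sse_full / df_full
    return numerator / denominator

f_star = general_linear_test(100.6725917, 35, 99.6664092, 32)
print(round(f_star, 2))
```

The result rounds to the F value of 0.11 printed for mc_cov*treatment. With a p-value of 0.9550, the output gives no evidence against parallel slopes in this example.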

Within Treatment Regression Lines

■ Using the interaction output, specify the within treatment regression lines:

◆ Book 1:

◆ Book 2:

◆ Book 3:

◆ Book 4:
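With the interaction in the model, each book gets its own intercept and its own slope. One way to work this exercise is to add each book's treatment estimate to the Intercept and each book's mc_cov*treatment estimate to the mc_cov slope, as sketched below. The variable names are chosen for this illustration, and the estimates are copied from the interaction output.

```python
# Sketch: within-treatment regression lines implied by the interaction model.
# Estimates are copied from the "Analysis With the Covariate Interaction"
# output. Treatment 4 is the reference level, so both of its offsets are 0.

intercept = 17.61188811
slope = 0.94172494
intercept_offsets = {1: -2.05980478, 2: 0.42613272, 3: 1.40932748, 4: 0.0}
slope_offsets = {1: -0.04020979, 2: 0.16244172, 3: 0.10873377, 4: 0.0}

lines = {
    book: (intercept + intercept_offsets[book], slope + slope_offsets[book])
    for book in intercept_offsets
}

for book, (b0, b1) in lines.items():
    print(f"Book {book}: Y-hat = {b0:.4f} + {b1:.4f} * mc_cov")
```

The four slopes come out close to one another, which is consistent with the non-significant interaction test in the output.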

Plot of Within Treatment Regression Lines

Concluding Remarks

■ So, in a sense, today’s lecture used about 90% of the information that we already know...the only difference is that now we have added one little tiny parameter to the model, that of the covariate.

■ Again, the covariate does nothing to change how the model is ESTIMATED; it only changes your INTERPRETATION of the parameters and the description of mean differences.

■ Hopefully, you are becoming more familiar with the GLM and more comfortable with tiny alterations made to the model.