anova: analysis of variance

Anthony J Greene 1 ANOVA: Analysis of Variance 1-way ANOVA

Upload: derek-levine

Post on 03-Jan-2016


TRANSCRIPT

Page 1: ANOVA: Analysis of Variance

Anthony J Greene 1

ANOVA: Analysis of Variance

1-way ANOVA

Page 2: ANOVA: Analysis of Variance

Anthony J Greene 2

ANOVA

I. What is Analysis of Variance
1. The F-ratio
2. Used for testing hypotheses among more than two means
3. As with the t-test, effect is measured in the numerator, error variance in the denominator
4. Partitioning the Variance

II. Different computational concerns for ANOVA
1. Degrees Freedom for Numerator and Denominator
2. No such thing as a negative value

III. Using Table B.4
IV. The Source Table
V. Hypothesis testing

Page 3: ANOVA: Analysis of Variance

Anthony J Greene 3

[Figure: three groups with means M1, M2, M3]

Page 4: ANOVA: Analysis of Variance

Anthony J Greene 4

ANOVA

• Analysis of Variance• Hypothesis testing for more than 2 groups

• For only 2 groups, t²(n) = F(1, n)

$$t = \frac{s_{effect}}{s_{error}} = \frac{M_1 - M_2}{s_M} \qquad F = \frac{s^2_{effect}}{s^2_{error}}$$
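The two-group relation t² = F can be checked numerically. The sketch below is a minimal pure-Python illustration, assuming a pooled-variance independent-samples t-test and the usual one-way F-ratio; the helper names (`pooled_t`, `one_way_f`) and the two small samples are made up for the demonstration and are not part of the slides.

```python
from math import sqrt

def mean(xs):
    return sum(xs) / len(xs)

def ss(xs):
    """Sum of squared deviations from the group mean."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)

def pooled_t(a, b):
    """Independent-samples t with pooled variance, df = n_a + n_b - 2."""
    df = len(a) + len(b) - 2
    pooled_var = (ss(a) + ss(b)) / df
    se = sqrt(pooled_var * (1 / len(a) + 1 / len(b)))
    return (mean(a) - mean(b)) / se

def one_way_f(*groups):
    """F = MS_between / MS_within for any number of groups."""
    scores = [x for g in groups for x in g]
    grand = mean(scores)
    k, n_total = len(groups), len(scores)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum(ss(g) for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

# Illustrative samples (hypothetical, chosen only for easy arithmetic)
a = [0, 1, 3, 0, 1]
b = [5, 8, 6, 4, 7]

t = pooled_t(a, b)
f = one_way_f(a, b)
print(t ** 2, f)  # the two values agree up to floating-point rounding
```

With two groups the numerator has 1 degree of freedom and the denominator shares the t-test's degrees of freedom, which is what the slide's t²(n) = F(1, n) expresses.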

Page 5: ANOVA: Analysis of Variance

Anthony J Greene 5

BASIC IDEA

• As with the t-test, the numerator expresses the differences among the dependent measure between experimental groups, and the denominator is the error.

• If the effect is enough larger than random error, we reject the null hypothesis.

$$F = \frac{\text{Between Treatment Variance}}{\text{Within Treatment Variance}} = \frac{\text{Effect } V}{\text{Random } V}$$

[Figure: Grp 1, Grp 2, Grp 3 with M1 = 1, M2 = 5, M3 = 1. Is the effect variability large compared to the random variability?]

Page 6: ANOVA: Analysis of Variance

Anthony J Greene 6

BASIC IDEA

• If the differences accounted for by the manipulation are low (or zero), then F ≈ 1, because the numerator then contains little besides error.

• If the effects are twice as large as the error, then F ≈ 3, which generally indicates an effect.

$$F = \frac{\text{Treatment Effect} + \text{Error}}{\text{Error}}$$

Page 7: ANOVA: Analysis of Variance

Anthony J Greene 7

Sources of Variance

Page 8: ANOVA: Analysis of Variance

Anthony J Greene 8

Why Is It Called Analysis of Variance?Aren’t We Interested In Means, Not Variance?

• Most statisticians do not know the answer to this question.

• If we’re interested in differences among means, why do an analysis of variance?

• The misconception is that it compares σ₁² to σ₂². It does not.

• The comparison is of effect variance (differences among group means) with random variance.

Page 9: ANOVA: Analysis of Variance

Anthony J Greene 9

Learning Under Three Temperature Conditions

T = ΣX is the treatment total; G = ΣT is the grand total.

Page 10: ANOVA: Analysis of Variance

Anthony J Greene 10

Computing the Sums of Squares

Page 11: ANOVA: Analysis of Variance

Anthony J Greene 11

How Variance is Partitioned

This simply disregards group membership and computes an overall SS. Variability between and within groups is included.

Keep in mind the general formula for SS:

$$SS = \sum X^2 - \frac{(\sum X)^2}{N}$$

$$SS_{Total} = \sum X^2 - \frac{G^2}{N}$$

[Figure: Grp 1, Grp 2, Grp 3 with M1 = 1, M2 = 5, M3 = 1]
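The general SS formula is the computational shortcut for the sum of squared deviations from the mean. A minimal sketch, assuming a small made-up list of scores (the function names are illustrative, not from the slides), shows that the two forms give the same number:

```python
def ss_definitional(xs):
    """SS as the sum of squared deviations from the mean."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs)

def ss_computational(xs):
    """SS as sum(X^2) - (sum X)^2 / N, the form used on the slide."""
    return sum(x * x for x in xs) - sum(xs) ** 2 / len(xs)

scores = [0, 1, 3, 1, 0]  # hypothetical scores for illustration
print(ss_definitional(scores), ss_computational(scores))  # 6.0 6.0
```

Applying the same function to all scores at once, ignoring group membership, gives SS_Total, since G is simply ΣX over every score.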

Page 12: ANOVA: Analysis of Variance

Anthony J Greene 12

How Variance is Partitioned

Imagine there were no individual differences at all. The SS for all scores would measure only the fact that there were group differences.

Keep in mind the general formula for SS:

$$SS = \sum X^2 - \frac{(\sum X)^2}{N}$$

$$SS_{Between} = \sum \frac{T^2}{n} - \frac{G^2}{N}$$

Grp 1 Grp 2 Grp 3
1 5 1
1 5 1
1 5 1
1 5 1
1 5 1

M1 = 1 M2 = 5 M3 = 1

T1 = 5 T2 = 25 T3 = 5

Page 13: ANOVA: Analysis of Variance

Anthony J Greene 13

How Variance is Partitioned

SS computed within a column removes the mean.

Thus summing the SSs for each column computes the overall variability except for the mean differences between groups.

Keep in mind the general formula for SS:

$$SS = \sum (X - M)^2$$

$$SS_{Within} = \sum SS$$

M1 = 1 M2 = 5 M3 = 1

Grp 1 Grp 2 Grp 3
0-1 4-5 1-1
1-1 3-5 2-1
3-1 6-5 2-1
1-1 3-5 0-1
0-1 4-5 0-1

Page 14: ANOVA: Analysis of Variance

Anthony J Greene 14

How Variance is Partitioned

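$$SS_{Total} = \sum X^2 - \frac{G^2}{N}$$

$$SS_{Between} = \sum \frac{T^2}{n} - \frac{G^2}{N}$$

$$SS_{Within} = \sum SS, \qquad SS = \sum (X - M)^2$$

M1 = 1 M2 = 5 M3 = 1

Grp 1 Grp 2 Grp 3
0-1 4-5 1-1
1-1 3-5 2-1
3-1 6-5 2-1
1-1 3-5 0-1
0-1 4-5 0-1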
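The three formulas can be applied side by side to confirm that the total variability splits into a between part and a within part. The sketch below is only an illustration: the scores are an assumption, read off the left-hand terms of the deviation entries on this slide (0, 1, 3, 1, 0 and so on), and the variable names are invented here.

```python
# Scores assumed from the "X - M" entries on the slide (left-hand terms)
groups = {
    "Grp 1": [0, 1, 3, 1, 0],
    "Grp 2": [4, 3, 6, 3, 4],
    "Grp 3": [1, 2, 2, 0, 0],
}

all_scores = [x for g in groups.values() for x in g]
N = len(all_scores)   # total number of scores
G = sum(all_scores)   # grand total

# SS_Total = sum(X^2) - G^2 / N, ignoring group membership
ss_total = sum(x * x for x in all_scores) - G ** 2 / N

# SS_Between = sum(T^2 / n) - G^2 / N, using only the treatment totals
ss_between = sum(sum(g) ** 2 / len(g) for g in groups.values()) - G ** 2 / N

# SS_Within = sum of each group's own SS
ss_within = sum(
    sum(x * x for x in g) - sum(g) ** 2 / len(g) for g in groups.values()
)

print(ss_total, ss_between, ss_within)
print(abs(ss_total - (ss_between + ss_within)) < 1e-9)  # partition holds
```

With these assumed scores ΣX² is 106, which matches the value shown later in the deck for the temperature example.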

Page 15: ANOVA: Analysis of Variance

Anthony J Greene 15

Computing Degrees Freedom

• df between is k-1, where k is the number of treatment groups (for the prior example, 3, since there were 3 temperature conditions)

• df within is N-k, where N is the total number of scores (the sum of the ns across groups). Recall that for a t-test with two independent groups, df was 2n-2: 2n was the total number of subjects, N, and 2 was the number of groups, k.

Page 16: ANOVA: Analysis of Variance

Anthony J Greene 16

Computing Degrees Freedom

Page 17: ANOVA: Analysis of Variance

Anthony J Greene 17

How Degrees Freedom Are Partitioned

N-1 = (N - k) + (k - 1)

N-1 = N - k + k – 1

Page 18: ANOVA: Analysis of Variance

Anthony J Greene 18

Partitioning The Sums of Squares

Page 19: ANOVA: Analysis of Variance

Anthony J Greene 19

Computing An F-Ratio

$$MS_{between} = \frac{SS_{between}}{df_{between}} \qquad MS_{within} = \frac{SS_{within}}{df_{within}} \qquad F = \frac{MS_{between}}{MS_{within}}$$
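As a small sketch of this step, the function below turns the two sums of squares into mean squares and an F-ratio. The function name and argument names are hypothetical; the numbers plugged in (SS between = 70, SS within = 24, k = 3, N = 15) are those of the St. John's Wort example later in the deck.

```python
def f_ratio(ss_between, ss_within, k, n_total):
    """Return (MS_between, MS_within, F) for a one-way ANOVA."""
    df_between = k - 1          # number of groups minus one
    df_within = n_total - k     # total scores minus number of groups
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within

ms_b, ms_w, f = f_ratio(ss_between=70, ss_within=24, k=3, n_total=15)
print(ms_b, ms_w, f)  # 35.0 2.0 17.5
```

Because both mean squares are sums of squares divided by positive degrees of freedom, F can never be negative.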

Page 20: ANOVA: Analysis of Variance

Anthony J Greene 20

Consult Table B-4

Take a standard normal distribution, square each value, and it looks like this

Page 21: ANOVA: Analysis of Variance

Anthony J Greene 21

Table B-4

Page 22: ANOVA: Analysis of Variance

Anthony J Greene 22

Two different F-curves

Page 23: ANOVA: Analysis of Variance

Anthony J Greene 23

ANOVA: Hypothesis Testing

Page 24: ANOVA: Analysis of Variance

Anthony J Greene 24

Basic Properties of F-Curves

Property 1: The total area under an F-curve is equal to 1.

Property 2: An F-curve starts at 0 on the horizontal axis and extends indefinitely to the right, approaching, but never touching, the horizontal axis as it does so.

Property 3: An F-curve is right skewed.

Page 25: ANOVA: Analysis of Variance

Anthony J Greene 25

Finding the F-value having area 0.05 to its right

Page 26: ANOVA: Analysis of Variance

Anthony J Greene 26

Assumptions for One-Way ANOVA

1. Independent samples: The samples taken from the populations under consideration are independent of one another.

2. Normal populations: For each population, the variable under consideration is normally distributed.

3. Equal standard deviations: The standard deviations of the variable under consideration are the same for all the populations.

Page 27: ANOVA: Analysis of Variance

Anthony J Greene 27

Learning Under Three Temperature Conditions

M1 = 1 M2 = 5 M3 = 1

Page 28: ANOVA: Analysis of Variance

Anthony J Greene 28

Learning Under Three Temperature Conditions

M1 = 1 M2 = 5 M3 = 1

Is the Effect Variability

Large

Compared to the Random Variability

Page 29: ANOVA: Analysis of Variance

Anthony J Greene 29

Learning Under Three Temperature Conditions


Page 30: ANOVA: Analysis of Variance

Anthony J Greene 30

Learning Under Three Temperature Conditions


Page 31: ANOVA: Analysis of Variance

Anthony J Greene 31

Learning Under Three Temperature Conditions

Page 32: ANOVA: Analysis of Variance

Anthony J Greene 32

Learning Under Three Temperature Conditions

Page 33: ANOVA: Analysis of Variance

Anthony J Greene 33

Learning Under Three Temperature Conditions


Page 34: ANOVA: Analysis of Variance

Anthony J Greene 34

Learning Under Three Temperature Conditions

ΣX² = 106

Nonzero squared scores: Grp 1: 1, 9, 1; Grp 2: 16, 9, 36, 9, 16; Grp 3: 1, 4, 4

Page 35: ANOVA: Analysis of Variance

Anthony J Greene 35

Learning Under Three Temperature Conditions


Page 36: ANOVA: Analysis of Variance

36

Learning Under Three Temperature Conditions


Page 37: ANOVA: Analysis of Variance

Calculating the F statistic

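$$SS_{total} = \sum X^2 - \frac{G^2}{N} = 46$$

$$SS_{between} = \sum \frac{T^2}{n} - \frac{G^2}{N} = 30$$

SStotal = SSbetween + SSwithin, so SSwithin = 46 − 30 = 16

$$F = \frac{MS_{between}}{MS_{within}} = \frac{SS_{between}/df_{between}}{SS_{within}/df_{within}} = \frac{30/2}{16/12} = \frac{15}{1.33} \approx 11.28$$

(Without rounding MSwithin to 1.33, F = 11.25.)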

Page 38: ANOVA: Analysis of Variance

Anthony J Greene 38

Distribution of the F-Statistic for One-Way ANOVA

Suppose the variable under consideration is normally distributed on each of k populations and that the population standard deviations are equal. Then, for independent samples from the k populations, the variable

$$F = \frac{MS_{between}}{MS_{within}} = \frac{MS_{treatment}}{MS_{error}}$$

has the F-distribution with df = (k – 1, n – k) if the null hypothesis of equal population means is true. Here n denotes the total number of observations.

Page 39: ANOVA: Analysis of Variance

Anthony J Greene 39

ANOVA Source Table

for a one-way analysis of variance

Page 40: ANOVA: Analysis of Variance

Anthony J Greene 40

The one-way ANOVA test for k population means (Slide 1 of 3)

Step 1 The null and alternative hypotheses are

H0: μ1 = μ2 = μ3 = … = μk

Ha: Not all the means are equal

Step 2 Decide on the significance level, α.

Step 3 Find the critical value of F, with df = (k - 1, N - k), where N is the total number of observations.

Page 41: ANOVA: Analysis of Variance

Anthony J Greene 41

The one-way ANOVA test for k population means (Slide 2 of 3)

Page 42: ANOVA: Analysis of Variance

Anthony J Greene 42

The one-way ANOVA test for k population means (Slide 3 of 3)

Step 4 Obtain the three sums of squares: SST (total), SSTR (treatment, i.e. between), and SSE (error, i.e. within)

Step 5 Construct a one-way ANOVA table:

Step 6 If the value of the F-statistic falls in the rejection region, reject H0; otherwise, do not reject H0.

Page 43: ANOVA: Analysis of Variance

Anthony J Greene 43

Post Hocs

•H0: μ1 = μ2 = μ3 = … = μk

•Rejecting H0 means that not all means are equal.

•Pairwise tests are required to determine which of the means are different.

•One problem arises for large k. For example, with k = 7 there are 21 pairs of means to compare. Post hoc tests are designed to reduce the likelihood of groupwise (familywise) Type I error.
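To see why many pairwise tests are a problem, the sketch below counts the pairs and estimates how the chance of at least one Type I error grows. The error estimate assumes the tests are independent, which pairwise comparisons on shared groups are not exactly, so it is a rough guide only; the function names are illustrative.

```python
from math import comb

def n_pairwise(k):
    """Number of distinct pairs among k group means: k(k - 1) / 2."""
    return comb(k, 2)

def familywise_error(alpha, m):
    """P(at least one Type I error) over m tests.

    Assumption: the m tests are independent, so this is an approximation
    for pairwise comparisons that share groups.
    """
    return 1 - (1 - alpha) ** m

m = n_pairwise(7)
print(m)  # 21
print(round(familywise_error(0.05, m), 2))
```

Under that assumption, running all 21 comparisons at α = 0.05 makes at least one false rejection more likely than not, which is the motivation for post hoc procedures.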

Page 44: ANOVA: Analysis of Variance

Anthony J Greene 44

Criterion for deciding whether or not to reject the null hypothesis

Page 45: ANOVA: Analysis of Variance

Anthony J Greene 45

One-Way ANOVA

control low dose high dose

0 1 5

1 3 8

3 4 6

0 1 4

1 1 7

A researcher wants to test the effects of St. John’s Wort, an over the counter, herbal anti-depressant. The measure is a scale of self-worth. The subjects are clinically depressed patients. Use α = 0.01

Page 46: ANOVA: Analysis of Variance

Anthony J Greene 46

One-Way ANOVA

control low dose high dose

0 1 5

1 3 8 G=45

3 4 6

0 1 4

1 1 7

T1=5 T2=10 T3=30

Compute the treatment totals, T, and the grand total, G

Page 47: ANOVA: Analysis of Variance

Anthony J Greene 47

One-Way ANOVA

control low dose high dose

0 1 5

1 3 8 G=45

3 4 6 N=15

0 1 4 k=3

1 1 7

T1=5 T2=10 T3=30

n1=5 n2=5 n3=5

Count n for each treatment, the total N, and k

Page 48: ANOVA: Analysis of Variance

Anthony J Greene 48

One-Way ANOVA

control low dose high dose

0 1 5

1 3 8 G=45

3 4 6 N=15

0 1 4 k=3

1 1 7

T1=5 T2=10 T3=30

n1=5 n2=5 n3=5

M1=1 M2 =2 M3 =6

Compute the treatment means

Page 49: ANOVA: Analysis of Variance

Anthony J Greene 49

One-Way ANOVA

control low dose high dose

0 1 5

1 3 8 G=45

3 4 6 N=15

0 1 4 k=3

1 1 7

T1=5 T2=10 T3=30

n1=5 n2=5 n3=5

M1=1 M2 =2 M3 =6

SS=6 SS=8 SS=10

Compute the treatment SSs

(0-1)² = 1

(1-1)² = 0

(3-1)² = 4

(0-1)² = 1

(1-1)² = 0

sum = 6

Page 50: ANOVA: Analysis of Variance

Anthony J Greene 50

One-Way ANOVA

control low dose high dose

0 1 5

1 3 8 G=45

3 4 6 N=15

0 1 4 k=3

1 1 7 ΣX² = 229

T1=5 T2=10 T3=30

n1=5 n2=5 n3=5

M1=1 M2 =2 M3 =6

SS=6 SS=8 SS=10

Compute all X²s and sum them

Page 51: ANOVA: Analysis of Variance

Anthony J Greene 51

One-Way ANOVA

control low dose high dose

0 1 5

1 3 8 G=45

3 4 6 N=15

0 1 4 k=3

1 1 7 ΣX² = 229

T1=5 T2=10 T3=30 SSTotal=94

n1=5 n2=5 n3=5

M1=1 M2 =2 M3 =6

SS=6 SS=8 SS=10

Compute SSTotal

SSTotal = ΣX² – G²/N = 229 – 45²/15 = 94

Page 52: ANOVA: Analysis of Variance

Anthony J Greene 52

One-Way ANOVA

control low dose high dose

0 1 5

1 3 8 G=45

3 4 6 N=15

0 1 4 k=3

1 1 7 ΣX² = 229

T1=5 T2=10 T3=30 SSTotal=94

n1=5 n2=5 n3=5 SSWithin=24

M1=1 M2 =2 M3 =6

SS1=6 SS2=8 SS3=10

Compute SSWithin

SSWithin = ΣSSi = 6 + 8 + 10 = 24

Page 53: ANOVA: Analysis of Variance

Anthony J Greene 53

One-Way ANOVA

control low dose high dose

0 1 5

1 3 8 G=45

3 4 6 N=15

0 1 4 k=3

1 1 7 ΣX² = 229

T1=5 T2=10 T3=30 SSTotal=94

n1=5 n2=5 n3=5 SSWithin=24

M1=1 M2 =2 M3 =6 d.f. Within=12

SS1=6 SS2=8 SS3=10 d.f. Between=2

d.f. Total=14

Determine d.f.s

d.f. Within=N-k

d.f. Between=k-1

d.f. Total=N-1

Note that (N-k)+(k-1)=N-1

Page 54: ANOVA: Analysis of Variance

Anthony J Greene 54

One-Way ANOVA

control low dose high dose

0 1 5

1 3 8 G=45

3 4 6 N=15

0 1 4 k=3

1 1 7 ΣX² = 229

T1=5 T2=10 T3=30 SSTotal=94

n1=5 n2=5 n3=5 SSWithin=24

MX1=1 MX2 =2 MX3 =6 d.f. Within=12

SS1=6 SS2=8 SS3=10 d.f. Between=2

d.f. Total=14

Ready to move it to a source table

Page 55: ANOVA: Analysis of Variance

Anthony J Greene 55

One-Way ANOVA

Compute the missing values

Source SS df MS F

Between 70 2

Within 24 12

Total 94 14

Page 56: ANOVA: Analysis of Variance

Anthony J Greene 56

One-Way ANOVA

Compute the missing values

Source SS df MS F

Between 70 2 35

Within 24 12 2

Total 94 14

Page 57: ANOVA: Analysis of Variance

Anthony J Greene 57

One-Way ANOVA

Compute the missing values

Source SS df MS F

Between 70 2 35 17.5

Within 24 12 2

Total 94 14
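The whole source table can be rebuilt from the raw scores. The following is a minimal sketch using the control, low dose and high dose scores listed on the earlier slides; the `source_table` function and its dictionary layout are hypothetical conveniences, not something the deck defines.

```python
def source_table(groups):
    """Build a one-way ANOVA source table from a list of score lists."""
    scores = [x for g in groups for x in g]
    N, k, G = len(scores), len(groups), sum(scores)

    ss_total = sum(x * x for x in scores) - G ** 2 / N
    ss_between = sum(sum(g) ** 2 / len(g) for g in groups) - G ** 2 / N
    ss_within = ss_total - ss_between  # partition of the total SS

    df_between, df_within = k - 1, N - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within

    return {
        "Between": (ss_between, df_between, ms_between, ms_between / ms_within),
        "Within": (ss_within, df_within, ms_within),
        "Total": (ss_total, N - 1),
    }

control = [0, 1, 3, 0, 1]
low_dose = [1, 3, 4, 1, 1]
high_dose = [5, 8, 6, 4, 7]

table = source_table([control, low_dose, high_dose])
print("Source SS df MS F")
for source, row in table.items():
    print(source, *row)
```

Obtaining SS within by subtraction is a design choice for brevity; summing the three treatment SSs (6 + 8 + 10) gives the same 24.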

Page 58: ANOVA: Analysis of Variance

Anthony J Greene 58

One-Way ANOVA

1. Compare your F of 17.5 with the critical value at (2, 12) degrees of freedom, α = 0.01: 6.93

2. Reject H0

Source SS df MS F

Between 70 2 35 17.5

Within 24 12 2

Total 94 14
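The tabled critical value can be checked without a table in the special case where the numerator has 2 degrees of freedom, as it does with three groups. The sketch below relies on the fact that for F(2, df2) the upper-tail area has the closed form (1 + 2F/df2)^(−df2/2); this shortcut is an addition for illustration and does not hold for other numerator df, where a table or a statistics library is needed. The function names are hypothetical.

```python
def f_upper_tail_df1_2(f, df2):
    """Area to the right of f under the F(2, df2) curve (numerator df = 2 only)."""
    return (1 + 2 * f / df2) ** (-df2 / 2)

def f_critical_df1_2(alpha, df2):
    """Value with area alpha to its right under F(2, df2), by inverting the tail."""
    return (df2 / 2) * (alpha ** (-2 / df2) - 1)

crit = f_critical_df1_2(0.01, 12)
print(round(crit, 2))                 # 6.93, matching the tabled value
print(17.5 > crit)                    # True: F falls in the rejection region
print(f_upper_tail_df1_2(17.5, 12))   # tail area beyond the observed F
```

The same function gives about 6.36 for (2, 15) degrees of freedom at α = 0.01.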

Page 59: ANOVA: Analysis of Variance

Anthony J Greene 59

One-Way ANOVA

Low Medium High
2 6 9
4 4 10
3 5 8
0 3 10
2 6 8
1 6 9

Students want to know if studying has an impact on a 10-point statistics quiz, so they divided into 3 groups: low studying (0-5hrs./wk), medium studying (6-15 hrs./wk) and high studying (16+ hours/week). At α=0.01, does the amount of studying impact quiz scores?

Page 60: ANOVA: Analysis of Variance

Anthony J Greene 60

One-Way ANOVA

low medium high
2 6 9
4 4 10 G=96
3 5 8
0 3 10
2 6 8
1 6 9

T1=12 T2=30 T3=54

Compute the treatment totals, T, and the grand total, G

Page 61: ANOVA: Analysis of Variance

Anthony J Greene 61

One-Way ANOVA

low medium high
2 6 9
4 4 10 G=96
3 5 8 N=18
0 3 10 k=3
2 6 8
1 6 9

T1=12 T2=30 T3=54

n1=6 n2=6 n3=6

Count n for each treatment, the total N, and k

Page 62: ANOVA: Analysis of Variance

Anthony J Greene 62

One-Way ANOVA

low medium high
2 6 9
4 4 10 G=96
3 5 8 N=18
0 3 10 k=3
2 6 8
1 6 9

T1=12 T2=30 T3=54

n1=6 n2=6 n3=6

M1=2 M2=5 M3=9

Compute the treatment means

Page 63: ANOVA: Analysis of Variance

Anthony J Greene 63

One-Way ANOVA

low medium high
2 6 9
4 4 10 G=96
3 5 8 N=18
0 3 10 k=3
2 6 8
1 6 9

T1=12 T2=30 T3=54

n1=6 n2=6 n3=6

M1=2 M2=5 M3=9

SS=10 SS=8 SS=4

Compute the treatment SSs

(2-2)² = 0

(4-2)² = 4

(3-2)² = 1

(0-2)² = 4

(2-2)² = 0

(1-2)² = 1

sum = 10

Page 64: ANOVA: Analysis of Variance

Anthony J Greene 64

One-Way ANOVA

low medium high
2 6 9
4 4 10 G=96
3 5 8 N=18
0 3 10 k=3
2 6 8
1 6 9

ΣX² = 682

T1=12 T2=30 T3=54

n1=6 n2=6 n3=6

M1=2 M2=5 M3=9

SS=10 SS=8 SS=4

Compute all X²s and sum them

Page 65: ANOVA: Analysis of Variance

Anthony J Greene 65

One-Way ANOVA

low medium high
2 6 9
4 4 10 G=96
3 5 8 N=18
0 3 10 k=3
2 6 8
1 6 9

ΣX² = 682

T1=12 T2=30 T3=54 SSTotal=170

n1=6 n2=6 n3=6

M1=2 M2=5 M3=9

SS=10 SS=8 SS=4

Compute SSTotal

SSTotal = ΣX² – G²/N = 682 – 96²/18 = 170

Page 66: ANOVA: Analysis of Variance

Anthony J Greene 66

One-Way ANOVA

low medium high
2 6 9
4 4 10 G=96
3 5 8 N=18
0 3 10 k=3
2 6 8
1 6 9

ΣX² = 682

T1=12 T2=30 T3=54 SSTotal=170

n1=6 n2=6 n3=6 SSWithin=22

M1=2 M2=5 M3=9

SS1=10 SS2=8 SS3=4

Compute SSWithin

SSWithin = ΣSSi = 10 + 8 + 4 = 22

Page 67: ANOVA: Analysis of Variance

Anthony J Greene 67

One-Way ANOVA

low medium high
2 6 9
4 4 10 G=96
3 5 8 N=18
0 3 10 k=3
2 6 8
1 6 9

ΣX² = 682

T1=12 T2=30 T3=54 SSTotal=170

n1=6 n2=6 n3=6 SSWithin=22

M1=2 M2=5 M3=9 d.f. Within=15

SS1=10 SS2=8 SS3=4 d.f. Between=2

d.f. Total=17

Determine d.f.s

d.f. Within=N-k

d.f. Between=k-1

d.f. Total=N-1

Note that (N-k)+(k-1)=N-1

Page 68: ANOVA: Analysis of Variance

Anthony J Greene 68

One-Way ANOVA

Fill in the values you have

Source SS df MS F

Between 2

Within 22 15

Total 170 17

Page 69: ANOVA: Analysis of Variance

Anthony J Greene 69

One-Way ANOVA

Compute the missing values

Source SS df MS F

Between 148 2 74 50.45

Within 22 15 1.47

Total 170 17

Page 70: ANOVA: Analysis of Variance

Anthony J Greene 70

One-Way ANOVA

1. Compare your F of 50.45 with the critical value at (2, 15) degrees of freedom, α = 0.01: 6.36

2. Reject H0

Source SS df MS F

Between 148 2 74 50.45

Within 22 15 1.47

Total 170 17
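The hand arithmetic for the studying example is worth checking from the raw scores. The sketch below follows the slide steps in order (T, n, M and SS per treatment, then the three sums of squares) and uses deviations from the means rather than the computational shortcut, as an independent check. The score columns are an assumption: they are reassembled from the listed values so that they match the stated totals T1 = 12, T2 = 30, T3 = 54; the variable names are illustrative.

```python
# Columns assumed from the listed scores and the stated treatment totals
groups = {
    "low": [2, 4, 3, 0, 2, 1],
    "medium": [6, 4, 5, 3, 6, 6],
    "high": [9, 10, 8, 10, 8, 9],
}

summary = {}
for name, xs in groups.items():
    T, n = sum(xs), len(xs)
    M = T / n
    SS = sum((x - M) ** 2 for x in xs)  # deviations from the treatment mean
    summary[name] = {"T": T, "n": n, "M": M, "SS": SS}
    print(name, summary[name])

scores = [x for xs in groups.values() for x in xs]
N, k = len(scores), len(groups)
grand_mean = sum(scores) / N

ss_total = sum((x - grand_mean) ** 2 for x in scores)
ss_within = sum(s["SS"] for s in summary.values())
ss_between = sum(s["n"] * (s["M"] - grand_mean) ** 2 for s in summary.values())

ms_between = ss_between / (k - 1)
ms_within = ss_within / (N - k)
F = ms_between / ms_within

print("SS total, between, within:", ss_total, ss_between, ss_within)
print("MS between, MS within, F:", ms_between, ms_within, F)
```

Here SS between is computed directly from the treatment means rather than by subtraction, so the check that SS between and SS within add up to SS total is a genuine test of the arithmetic.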