HYPOTHESIS TESTING: ABOUT MORE THAN TWO (K) RELATED POPULATIONS

Repeated Measures ANOVA

Often, one wants to administer the same test to the same subjects repeatedly over a period of time or under different circumstances.

In essence, one is interested in examining differences within each subject, for example, subjects' improvement over time.

Such designs are referred to as within-subjects designs or repeated measures designs.

For example, imagine that one wants to monitor the improvement of students' algebra skills over three months of instruction. A standardized algebra test is administered after one month (level 1 of the repeated measures factor), and comparable tests are administered after two months (level 2 of the repeated measures factor) and after three months (level 3 of the repeated measures factor). Thus, the repeated measures factor (Time) has three levels.

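\[
\begin{array}{c|ccccc|c}
\text{Subject} \backslash \text{Time} & 1 & \cdots & j & \cdots & t & \text{Mean} \\
\hline
1 & x_{11} & \cdots & x_{1j} & \cdots & x_{1t} & \bar{x}_{1.} \\
\vdots & \vdots & & \vdots & & \vdots & \vdots \\
i & x_{i1} & \cdots & x_{ij} & \cdots & x_{it} & \bar{x}_{i.} \\
\vdots & \vdots & & \vdots & & \vdots & \vdots \\
n & x_{n1} & \cdots & x_{nj} & \cdots & x_{nt} & \bar{x}_{n.} \\
\hline
\text{Mean} & \bar{x}_{.1} & \cdots & \bar{x}_{.j} & \cdots & \bar{x}_{.t} & \bar{x}_{..}
\end{array}
\]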

Repeated measures can occur in different ways:

• Repeated measures can be taken at different time points in a single group.

• Repeated measures can be taken at different time points in several groups.

Example: Temperatures of the forehead (in degrees Celsius), measured at 30-minute intervals in a single group of subjects, are given in the table.

Subject   time1   time2   time3   time4
1         30.9    30.7    30.9    30.9
2         31.9    31.6    31.6    31.7
3         31.3    31.1    31.0    31.3
4         32.1    31.0    31.7    31.3
5         30.9    31.2    30.5    30.8
6         31.3    31.7    31.4    31.2
7         31.3    31.8    31.8    31.7
8         32.1    33.0    31.7    31.5
9         30.3    30.9    30.8    30.6
10        32.2    32.1    32.2    32.4

H0: There is no difference between time periods.

Sources of variation   SS       df   MS      F      Sig.
Times                   0.178    3   0.059   0.57   0.64
Subjects               10.086    9   1.121
Error                   2.812   27   0.104

Since p = 0.64 > 0.05, H0 is not rejected: there is no evidence of a difference between time periods.
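The sums of squares in the ANOVA table can be reproduced by hand. The following is a minimal sketch in plain Python, assuming the standard one-way repeated measures partition (total = times + subjects + error). The function name `rm_anova` and the dictionary keys are illustrative choices, not part of the original slides.

```python
# Forehead temperatures (degrees Celsius): one row per subject, one column per time point.
data = [
    [30.9, 30.7, 30.9, 30.9],
    [31.9, 31.6, 31.6, 31.7],
    [31.3, 31.1, 31.0, 31.3],
    [32.1, 31.0, 31.7, 31.3],
    [30.9, 31.2, 30.5, 30.8],
    [31.3, 31.7, 31.4, 31.2],
    [31.3, 31.8, 31.8, 31.7],
    [32.1, 33.0, 31.7, 31.5],
    [30.3, 30.9, 30.8, 30.6],
    [32.2, 32.1, 32.2, 32.4],
]


def rm_anova(rows):
    """One-way repeated measures ANOVA (hypothetical helper, illustration only)."""
    n = len(rows)       # number of subjects
    t = len(rows[0])    # number of time points
    grand = sum(sum(row) for row in rows) / (n * t)

    time_means = [sum(row[j] for row in rows) / n for j in range(t)]
    subject_means = [sum(row) / t for row in rows]

    ss_total = sum((x - grand) ** 2 for row in rows for x in row)
    ss_time = n * sum((m - grand) ** 2 for m in time_means)
    ss_subj = t * sum((m - grand) ** 2 for m in subject_means)
    ss_error = ss_total - ss_time - ss_subj

    df_time, df_subj = t - 1, n - 1
    df_error = df_time * df_subj
    f_time = (ss_time / df_time) / (ss_error / df_error)

    return {
        "ss_time": ss_time, "ss_subj": ss_subj, "ss_error": ss_error,
        "df_time": df_time, "df_subj": df_subj, "df_error": df_error,
        "f_time": f_time,
    }


result = rm_anova(data)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

The F statistic is the ratio of the Times mean square to the Error mean square. The sketch stops at F and does not compute the significance level, which would require the F distribution with (3, 27) degrees of freedom.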

Example: Temperatures of the forehead (in degrees Celsius), measured at 30-minute intervals in two groups of subjects, are given in the table.

Subject   Group   time1   time2   time3   time4
1         1       30.90   30.70   30.90   30.90
2         1       31.90   31.60   31.60   31.70
3         1       31.30   31.10   31.00   31.30
4         1       32.10   31.00   31.70   31.30
5         1       30.90   31.20   30.50   30.80
6         1       31.30   31.70   31.40   31.20
7         1       31.30   31.80   31.80   31.70
8         1       32.10   33.00   31.70   31.50
9         1       30.30   30.90   30.80   30.60
10        1       32.20   32.10   32.20   32.40
11        2       31.50   30.60   30.80   31.00
12        2       31.20   31.20   31.10   31.30
13        2       31.30   31.30   31.50   31.40
14        2       30.40   30.80   30.40   30.20
15        2       30.70   30.90   30.90   30.90
16        2       29.80   30.80   30.90   30.80
17        2       31.40   32.00   31.70   31.60
18        2       30.90   32.40   31.80   31.90
19        2       31.10   31.30   31.20   31.20
20        2       31.30   31.50   31.60   31.70

Source of variation   df   SS       MS      F       Sig.
Groups                 1    1.275   1.275   1.336   0.263
Times                  3    0.410   0.137   1.412   0.249
Interaction            3    0.336   0.112   1.158   0.334
Subjects              18   17.176   0.954
Residual              54    5.231   0.097

Three pairs of hypotheses can be tested:

1. Hypothesis on groups

2. Hypothesis on time points

3. Hypothesis on interaction
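The three F tests can be traced back to the data with a short calculation. The sketch below assumes a balanced design (equal group sizes) and the usual split-plot partition: the Groups effect is tested against the Subjects (within groups) mean square, while Times and Interaction are tested against the Residual mean square. The names `mixed_anova` and `mean` are hypothetical helpers introduced for illustration.

```python
# Forehead temperatures (degrees Celsius) from the two-group table:
# group label -> one row per subject, one column per time point.
groups = {
    1: [
        [30.90, 30.70, 30.90, 30.90], [31.90, 31.60, 31.60, 31.70],
        [31.30, 31.10, 31.00, 31.30], [32.10, 31.00, 31.70, 31.30],
        [30.90, 31.20, 30.50, 30.80], [31.30, 31.70, 31.40, 31.20],
        [31.30, 31.80, 31.80, 31.70], [32.10, 33.00, 31.70, 31.50],
        [30.30, 30.90, 30.80, 30.60], [32.20, 32.10, 32.20, 32.40],
    ],
    2: [
        [31.50, 30.60, 30.80, 31.00], [31.20, 31.20, 31.10, 31.30],
        [31.30, 31.30, 31.50, 31.40], [30.40, 30.80, 30.40, 30.20],
        [30.70, 30.90, 30.90, 30.90], [29.80, 30.80, 30.90, 30.80],
        [31.40, 32.00, 31.70, 31.60], [30.90, 32.40, 31.80, 31.90],
        [31.10, 31.30, 31.20, 31.20], [31.30, 31.50, 31.60, 31.70],
    ],
}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def mixed_anova(groups):
    """Two-way mixed ANOVA for a balanced design (illustrative sketch)."""
    all_rows = [row for rows in groups.values() for row in rows]
    n_total = len(all_rows)
    t = len(all_rows[0])
    grand = mean(x for row in all_rows for x in row)

    ss_group = ss_subj = ss_cells = 0.0
    for rows in groups.values():
        g_mean = mean(x for row in rows for x in row)
        ss_group += len(rows) * t * (g_mean - grand) ** 2
        # Subjects are nested within groups, so deviations are from the group mean.
        ss_subj += t * sum((mean(row) - g_mean) ** 2 for row in rows)
        for j in range(t):
            cell_mean = mean(row[j] for row in rows)
            ss_cells += len(rows) * (cell_mean - grand) ** 2

    ss_time = n_total * sum(
        (mean(row[j] for row in all_rows) - grand) ** 2 for j in range(t)
    )
    ss_inter = ss_cells - ss_group - ss_time
    ss_total = sum((x - grand) ** 2 for row in all_rows for x in row)
    ss_resid = ss_total - ss_group - ss_subj - ss_time - ss_inter

    df_group = len(groups) - 1
    df_time = t - 1
    df_inter = df_group * df_time
    df_subj = n_total - len(groups)
    df_resid = df_subj * df_time

    ms_subj = ss_subj / df_subj
    ms_resid = ss_resid / df_resid
    return {
        "ss_group": ss_group, "ss_time": ss_time, "ss_inter": ss_inter,
        "ss_subj": ss_subj, "ss_resid": ss_resid,
        "df_subj": df_subj, "df_resid": df_resid,
        "f_group": (ss_group / df_group) / ms_subj,
        "f_time": (ss_time / df_time) / ms_resid,
        "f_inter": (ss_inter / df_inter) / ms_resid,
    }


anova = mixed_anova(groups)
for name, value in anova.items():
    print(f"{name}: {value:.3f}")
```

With unequal group sizes the simple subtraction used for the interaction term would no longer hold, so this sketch should not be reused for unbalanced data.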

[Figure: Estimated marginal means of forehead temperature (y-axis, 30.9 to 31.6) plotted against Time (x-axis, 1 to 4), with one line for each Group (1 and 2).]

Cochran’s Q Test

Cochran’s Q test extends the McNemar test to examine change in a dichotomous variable (0-1) across more than two observations. It is particularly appropriate when subjects are used as their own controls and a dichotomous outcome variable is measured across multiple time periods or under several types of conditions.

\[
Q = \frac{(k-1)\left[\, k \sum_{j=1}^{k} G_j^2 - \left( \sum_{j=1}^{k} G_j \right)^2 \right]}{k \sum_{i=1}^{N} L_i - \sum_{i=1}^{N} L_i^2}
\]

where

k: # of conditions or time periods

N: # of subjects

Gj: The total # of 1s in the jth column

Li: The total # of 1s in the ith row

Under H0, Q is approximately distributed as a chi-square with df = k-1. When Q is significant, post hoc tests are necessary to determine where the differences lie.

Example: The children in the anxiety reduction intervention groups were evaluated for the presence of a certain symptom before and after the intervention.

Presence of symptom

Subject Preint. Postint.

1 1 0

2 1 0

3 1 1

4 0 0

5 1 0

6 1 0

7 1 1

8 1 0

9 1 1

10 1 0

Is the clinical intervention effective in reducing the symptom?

Since the pretest-posttest measure is dichotomous and the data are paired, the McNemar test is appropriate for answering this question.
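The slides do not carry out the McNemar test for these data. As a hedged illustration, the sketch below applies the exact (binomial) version of the test to the ten pre/post pairs. The choice of the exact variant and of a two-sided alternative is an assumption made here, not something stated in the slides, and the variable names are illustrative.

```python
from math import comb

# Presence of symptom (1 = present, 0 = absent) for subjects 1 to 10.
pre = [1, 1, 1, 0, 1, 1, 1, 1, 1, 1]
post = [0, 0, 1, 0, 0, 0, 1, 0, 1, 0]

# Only discordant pairs carry information about change.
b = sum(1 for x, y in zip(pre, post) if x == 1 and y == 0)  # symptom disappeared
c = sum(1 for x, y in zip(pre, post) if x == 0 and y == 1)  # symptom appeared
n = b + c

# Exact two-sided p-value: under H0 each discordant pair is equally likely
# to fall in either direction, so the smaller count is Binomial(n, 0.5).
tail = sum(comb(n, i) for i in range(min(b, c) + 1)) * 0.5 ** n
p_value = min(1.0, 2 * tail)

print(f"b = {b}, c = {c}, exact two-sided p = {p_value:.5f}")
```

With only six discordant pairs the chi-square approximation to the McNemar statistic would be unreliable, which is why the exact form is used in this sketch.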

Suppose that one month after the end of the intervention program a third measurement is taken from all children. Is the proportion of children who present the symptom (yes/no) the same across all three data collection periods?

Since there are now three points of data collection, the McNemar test can no longer be used. Cochran’s Q test, however, would be appropriate.

H0: The proportion of “yes” responses with regard to the presence of the symptom is the same across all three time periods for those children.

Presence of symptom

Subj   Preint.   Postint.   One month later   Li
1      1         0          1                 2
2      1         0          0                 1
3      1         1          1                 3
4      0         0          0                 0
5      1         0          1                 2
6      1         0          1                 2
7      1         1          1                 3
8      1         0          0                 1
9      1         1          1                 3
10     1         0          0                 1
Gj     9         3          6                 18

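\[
Q = \frac{(k-1)\left[\, k \sum_{j=1}^{k} G_j^2 - \left( \sum_{j=1}^{k} G_j \right)^2 \right]}{k \sum_{i=1}^{N} L_i - \sum_{i=1}^{N} L_i^2}
  = \frac{(3-1)\left[\, 3(9^2 + 3^2 + 6^2) - 18^2 \right]}{3(18) - (2^2 + 1^2 + 3^2 + 0^2 + 2^2 + 2^2 + 3^2 + 1^2 + 3^2 + 1^2)}
  = \frac{108}{12} = 9
\]

\[
\chi^2_{(2,\,0.05)} = 5.99 < Q_{cal} = 9, \quad p < 0.05
\]

Reject H0.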
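The calculation of Q can be checked with a few lines of code. This is a minimal sketch that applies the Q formula directly to the 0/1 table. The function name `cochran_q` is an illustrative choice, and the critical value 5.99 is taken from the slide rather than computed.

```python
# Presence of symptom (1 = yes, 0 = no): one row per child,
# columns = preintervention, postintervention, one month later.
symptom = [
    [1, 0, 1],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
    [1, 0, 1],
    [1, 0, 1],
    [1, 1, 1],
    [1, 0, 0],
    [1, 1, 1],
    [1, 0, 0],
]


def cochran_q(rows):
    """Cochran's Q statistic for an N x k table of 0/1 outcomes."""
    k = len(rows[0])
    col_totals = [sum(row[j] for row in rows) for j in range(k)]  # G_j
    row_totals = [sum(row) for row in rows]                       # L_i
    numerator = (k - 1) * (k * sum(g * g for g in col_totals) - sum(col_totals) ** 2)
    denominator = k * sum(row_totals) - sum(l * l for l in row_totals)
    return numerator / denominator


q = cochran_q(symptom)
critical = 5.99  # chi-square critical value for df = 2, alpha = 0.05 (from the slide)
print(f"Q = {q:.2f}, reject H0: {q > critical}")
```

Rows in which all responses are identical (such as subject 4, all zeros) contribute nothing to the denominator, so they do not affect Q.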

Friedman Test

The Friedman test extends the Wilcoxon signed-ranks test to more than two time periods of data collection or conditions.

\[
F_r = \frac{12}{N k (k+1)} \sum_{j=1}^{k} R_j^2 - 3N(k+1)
\]

Where

Rj: The sum of the ranks for column j

N: The # of subjects

k: # of time periods or conditions

This statistic is distributed as a chi-square with df=k-1.

Multiple Comparisons Test

Once the overall Friedman test has been found to be significant, post hoc tests can compare the differences in mean ranks for all possible pairs to determine where the differences lie.

The null hypothesis of no difference in the mean ranks of the pair being examined is rejected if the absolute value of the difference is greater than a specified critical value. That is, we reject the null hypothesis if the following condition holds:

\[
|R_i - R_j| > z_{\alpha/k(k-1)} \sqrt{\frac{k(k+1)}{6N}}
\]

where

Ri: the mean rank for time or condition i

k: the number of time periods or conditions

N: the number of cases

Example: Suppose we had collected information concerning the 10 children’s anxiety levels not only at pretest and immediately following the anxiety reduction intervention but also just prior to the administration of the preoperative medication. What are the differences in the anxiety levels of the 10 children who took part in the intervention across the three time periods?

H0: there will be no differences among the median anxiety scores at preintervention, at postintervention, and at preoperative medication for the 10 children who took part in the intervention.

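Children’s anxiety levels and within-subject ranks

Subj     Preint   Postint   Preop   Rank (Preint)   Rank (Postint)   Rank (Preop)
1        7        5         6       3               1                2
2        4        4         4       2               2                2
3        3        5         5       3               1.5              1.5
4        3        6         4       1               3                2
5        6        3         4       3               1                2
6        7        3         6       3               1                2
7        6        5         5       3               1.5              1.5
8        7        5         6       3               1                2
9        5        4         4       3               1.5              1.5
10       7        5         6       3               1                2
Sum Rj                              27              14.5             18.5

Note: the ranks shown for subject 3 (3, 1.5, 1.5) do not match the scores shown (3, 5, 5), which would rank as 1, 2.5, 2.5. The rank sums use the ranks as given.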

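\[
F_r = \frac{12}{N k (k+1)} \sum_{j=1}^{k} R_j^2 - 3N(k+1)
    = \frac{12}{10(3)(3+1)} \left[ 27^2 + 14.5^2 + 18.5^2 \right] - 3(10)(3+1)
    = 8.15
\]

\[
\chi^2_{(2,\,0.05)} = 5.99 < F_r = 8.15, \quad p < 0.05
\]

Reject H0.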
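The arithmetic can be verified with a short sketch. It starts from the rank sums given in the table (27, 14.5, 18.5) rather than re-ranking the raw scores, and it applies no correction for ties, which matches the formula as written. The function name `friedman_from_rank_sums` is hypothetical.

```python
def friedman_from_rank_sums(rank_sums, n_subjects):
    """Friedman statistic F_r from column rank sums R_j (no tie correction)."""
    k = len(rank_sums)
    sum_sq = sum(r * r for r in rank_sums)
    return 12.0 / (n_subjects * k * (k + 1)) * sum_sq - 3 * n_subjects * (k + 1)


rank_sums = [27.0, 14.5, 18.5]  # preintervention, postintervention, preoperative
fr = friedman_from_rank_sums(rank_sums, n_subjects=10)
critical = 5.99  # chi-square critical value for df = 2, alpha = 0.05 (from the slide)
print(f"F_r = {fr:.2f}, reject H0: {fr > critical}")
```

A useful sanity check is that the rank sums must add up to N k (k+1) / 2, here 10 × 3 × 4 / 2 = 60, which they do.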

\[
z_{\alpha/k(k-1)} \sqrt{\frac{k(k+1)}{6N}} = 2.39 \sqrt{\frac{3(4)}{6(10)}} \approx 1.07
\]

Groups   |Ri - Rj|   Critical value   Decision
1-2      1.25        > 1.07           Reject H0
1-3      0.85        < 1.07           Do not reject H0
2-3      0.40        < 1.07           Do not reject H0
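The pairwise comparisons can be sketched as follows, using the rank sums from the example (27, 14.5, 18.5 for N = 10). Here z is computed from the standard normal quantile function instead of being read from a table, so the critical value may differ from a hand calculation in the second decimal place. The variable names are illustrative.

```python
from itertools import combinations
from math import sqrt
from statistics import NormalDist

rank_sums = {1: 27.0, 2: 14.5, 3: 18.5}  # condition -> sum of ranks
n, k, alpha = 10, 3, 0.05

# Mean rank per condition.
mean_ranks = {cond: total / n for cond, total in rank_sums.items()}

# Upper-tail z for alpha / (k(k-1)), as in the critical value formula.
z = NormalDist().inv_cdf(1 - alpha / (k * (k - 1)))
critical = z * sqrt(k * (k + 1) / (6 * n))

decisions = {}
for i, j in combinations(sorted(mean_ranks), 2):
    diff = abs(mean_ranks[i] - mean_ranks[j])
    decisions[(i, j)] = diff > critical
    verdict = "reject H0" if decisions[(i, j)] else "do not reject H0"
    print(f"{i}-{j}: |difference| = {diff:.2f}, critical = {critical:.2f}, {verdict}")
```

Only the preintervention versus postintervention comparison (1-2) exceeds the critical value, in agreement with the decision table.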
