Epidemiology 9509 proportions - misc

Epidemiology 9509
Principles of Biostatistics

Chapter 12 (continued again) - miscellaneous

John Koval

Department of Epidemiology and Biostatistics
University of Western Ontario

What is being covered

1. sample size and power

2. one-sided alternative hypotheses

3. relationship between tests

4. SAS Proc Freq - exact

planning studies

1. sample size

2. power

sample size - two samples

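interested in $\delta_\pi = \pi_1 - \pi_2$

since variance depends on mean, have to specify $\pi_1$, $\pi_2$

simple formula becomes more complex

$$n = \left( \frac{z_{\alpha/2}\sqrt{2\bar{\pi}(1-\bar{\pi})} + z_{\beta}\sqrt{\pi_1(1-\pi_1) + \pi_2(1-\pi_2)}}{ES} \right)^2$$

where $\bar{\pi} = \dfrac{\pi_1 + \pi_2}{2}$

$ES$ is the effect size, that is, $\delta_\pi$; the $\sigma$ from the earlier formula appears in the numerator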

sample size - effect size - example

usually $\alpha = 0.05$, $\beta = 0.20$ (power is 80%)

current "cure" rate is 50%; want to see if improvement is 10%, i.e. new rate is 60%

statistically, $\pi_1 = 0.60$, $\pi_2 = 0.50$

sample size example (continued)

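first $\bar{\pi} = \dfrac{0.50 + 0.60}{2} = 0.55$

and $ES = 0.10$

$$
\begin{aligned}
n &= \left( \frac{1.96\sqrt{2(0.55)(0.45)} + 0.842\sqrt{0.50(0.50) + 0.60(0.40)}}{0.10} \right)^2 \\
  &= \left( \frac{1.96\sqrt{0.495} + 0.842\sqrt{0.49}}{0.10} \right)^2 \\
  &= \left( \frac{1.3789 + 0.5894}{0.10} \right)^2 \\
  &= \left( \frac{1.9683}{0.10} \right)^2 \\
  &= 387.4528
\end{aligned}
$$

that is, 388 (close enough to 400)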
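The calculation above can be sketched in Python as a check on the arithmetic. This is a minimal sketch, not the course's own code: the function name `sample_size_two_proportions` is hypothetical, and it reads n as the size of each of the two samples (the usual reading of this formula, but an assumption here). It uses unrounded normal quantiles, so the raw value differs slightly from the hand calculation while the rounded-up answer is the same.

```python
import math
from statistics import NormalDist


def sample_size_two_proportions(pi1, pi2, alpha=0.05, beta=0.20):
    """Sample size n (assumed per group) from the slide's formula."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.960
    z_beta = NormalDist().inv_cdf(1 - beta)         # about 0.842
    pi_bar = (pi1 + pi2) / 2                        # average of the two proportions
    es = abs(pi1 - pi2)                             # effect size, delta_pi
    numerator = (z_alpha * math.sqrt(2 * pi_bar * (1 - pi_bar))
                 + z_beta * math.sqrt(pi1 * (1 - pi1) + pi2 * (1 - pi2)))
    return (numerator / es) ** 2


n_raw = sample_size_two_proportions(0.60, 0.50)
n_needed = math.ceil(n_raw)   # always round a sample size up
print(round(n_raw, 2), n_needed)
```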

power for difference in proportions

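for difference in means, power is

$$\Pr\left( Z_N > z_{\alpha/2} - \frac{|\delta_A|}{\sigma_{\bar{D}}} \right)$$

In the case of proportions, $\delta_A = \pi_1 - \pi_2$ and

$$\sigma_{\bar{D}} = \sqrt{\frac{\pi_1(1-\pi_1)}{n_1} + \frac{\pi_2(1-\pi_2)}{n_2}} = \sqrt{\frac{\pi_1(1-\pi_1) + \pi_2(1-\pi_2)}{n}}$$

for equal sample size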

example: power for difference in proportions

Say, for smoking-cold example, $\pi_1 = 0.65$, $\pi_2 = 0.45$, $n_1 = n_2 = n = 25$

Hence $\delta_A = \pi_1 - \pi_2 = 0.65 - 0.45 = 0.20$

and $\sigma_{\bar{D}} = \sqrt{\dfrac{0.65(0.35) + 0.45(0.55)}{25}} = 0.1378$

so that power is

$$
\begin{aligned}
\Pr\left( Z_N > z_{.025} - \frac{0.20}{0.1378} \right) &= \Pr(Z_N > 1.960 - 1.4510) \\
&= \Pr(Z_N > 0.509) \\
&= z(0.509) = 0.305 \quad \text{(by linear interpolation)}
\end{aligned}
$$

Hence power to find difference of 0.20 with samples of size 25 is 30%
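The power calculation above could be reproduced with a short Python sketch. The function name `power_two_proportions` is hypothetical, and the normal tail area comes from `statistics.NormalDist` rather than from table interpolation, so agreement with the hand result is only to about three decimals.

```python
import math
from statistics import NormalDist


def power_two_proportions(pi1, pi2, n, alpha=0.05):
    """Approximate power of the two-sided test with n subjects in each sample."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)                     # z_{alpha/2}
    sigma_d = math.sqrt((pi1 * (1 - pi1) + pi2 * (1 - pi2)) / n)      # sigma of the difference
    cutoff = z_alpha - abs(pi1 - pi2) / sigma_d
    return 1 - NormalDist().cdf(cutoff)                               # Pr(Z_N > cutoff)


power = power_two_proportions(0.65, 0.45, 25)
print(round(power, 3))
```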

One-sided alternatives

can use test statistic based on p1 − p2

(another possibility shown later)

test statistic is

$$\frac{p_1 - p_2}{\sqrt{p(1-p)\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

where $p_1$ and $p_2$ are the proportions in the two samples and $p$ is the overall proportion

$$p = \frac{X_1 + X_2}{n_1 + n_2}$$

example of one-sided test

$H_A: \pi_1 > \pi_2$

i.e. smokers more likely to get colds than non-smokers

$p_1 = 15/23 = 0.65217$, $p_2 = 10/22 = 0.45455$, and $p = 25/45 = 0.55555$

one-sided test (continued)

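$$
\begin{aligned}
p\text{-value} &= \Pr\left( Z_N > \frac{0.65217 - 0.45455}{\sqrt{0.55555(0.44444)\left(\frac{1}{23} + \frac{1}{22}\right)}} \right) \\
&= \Pr\left( Z_N > \frac{0.19762}{\sqrt{0.24691(0.08893)}} \right) \\
&= \Pr\left( Z_N > \frac{0.19762}{0.14818} \right) \\
&= \Pr(Z_N > 1.3336) = z(1.3336) \\
&= 0.0912 \quad \text{(by linear interpolation)}
\end{aligned}
$$

Hence, at $\alpha = 0.05$, cannot reject $H_0$.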
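As a check on the steps above, here is a minimal Python sketch of the one-sided test for $H_A: \pi_1 > \pi_2$. The function name `one_sided_z_test` is hypothetical; the counts 15 of 23 and 10 of 22 are the ones used on the slide.

```python
import math
from statistics import NormalDist


def one_sided_z_test(x1, n1, x2, n2):
    """Uncorrected z statistic and upper-tail p-value for H_A: pi1 > pi2."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)                          # overall (pooled) proportion
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))    # standard error under H_0
    z = (p1 - p2) / se
    return z, 1 - NormalDist().cdf(z)                  # Pr(Z_N > z)


z, p_value = one_sided_z_test(15, 23, 10, 22)
print(round(z, 4), round(p_value, 4))
```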

Continuity correction

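test statistic is

$$\frac{(p_1 - p_2) - \left(\dfrac{1}{2n_1} + \dfrac{1}{2n_2}\right)}{\sqrt{p(1-p)\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

$$
\begin{aligned}
p\text{-value} &= \Pr\left( Z_N > \frac{0.65217 - 0.45455 - \left(\frac{1}{46} + \frac{1}{44}\right)}{\sqrt{0.55555(0.44444)\left(\frac{1}{23} + \frac{1}{22}\right)}} \right) \\
&= \Pr\left( Z_N > \frac{0.15316}{0.14818} \right) \\
&= \Pr(Z_N > 1.0336) = z(1.0336) \\
&= 0.1506 \quad \text{(by linear interpolation)}
\end{aligned}
$$

Hence, at $\alpha = 0.05$, cannot reject $H_0$.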
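The continuity-corrected version can be sketched the same way. The function name `corrected_z_test` is hypothetical, and the sketch assumes $p_1 > p_2$, as in the example, so that the correction is subtracted from a positive difference.

```python
import math
from statistics import NormalDist


def corrected_z_test(x1, n1, x2, n2):
    """Continuity-corrected z and upper-tail p-value; assumes p1 > p2."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)                          # overall proportion
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    correction = 1 / (2 * n1) + 1 / (2 * n2)           # 1/46 + 1/44 in the example
    z = (p1 - p2 - correction) / se
    return z, 1 - NormalDist().cdf(z)


z_cc, p_cc = corrected_z_test(15, 23, 10, 22)
print(round(z_cc, 4), round(p_cc, 4))
```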

use of SY for one-sided alternatives

Recall that $S_Y$ was developed for two-sided alternatives

In calculating the p-value, we used the formula

$$p = \Pr(\chi^2_1 > x) = 2\Pr(Z_N > \sqrt{x}) \qquad (1)$$

we actually wanted

$$\Pr(\chi^2_1 > x) = \Pr(Z_N > \sqrt{x}) + \Pr(Z_N < -\sqrt{x})$$

but because $\Pr(Z_N > \sqrt{x}) = \Pr(Z_N < -\sqrt{x})$, we get equation (1)

for one-sided alternatives, use only one of $\Pr(Z_N > \sqrt{x})$ and $\Pr(Z_N < -\sqrt{x})$

for example, for one-sided alternative

$$p = \Pr(Z_N > \sqrt{S_Y})$$

example: use of SY for one-sided test

for two-sided alternative

$$p\text{-value} = \Pr(\chi^2_1 > 1.0683) = 2\Pr(Z_N > 1.0336)$$

for one-sided alternative

$$p\text{-value} = \Pr(Z_N > \sqrt{S_Y}) = \Pr(Z_N > 1.0336)$$

equivalent to the result using statistic $p_1 - p_2$...

Equivalence of tests

D'Agostino et al (pages 307-309) and other authors suggest a test of $H_0: \pi_1 = \pi_2$ using $p_1 - p_2$

Unnecessary

mathematically equivalent to $S_P$, the test of independence/association (see below)

Equivalence of tests II

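1. the proportions version

$$\frac{(p_1 - p_2)^2}{p(1-p)\left(\dfrac{1}{r_1} + \dfrac{1}{r_2}\right)}$$

2. the contingency table version

$$\sum_{i=1}^{2}\sum_{j=1}^{2} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$

3. the epidemiology version

$$\frac{(ad - bc)^2\, T}{c_1 c_2 r_1 r_2}$$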

Equivalence of tests III

1. the "unadjusted" versions of all three statistics are mathematically equivalent

2. the "continuity-corrected" versions of all three statistics are mathematically equivalent

Use whichever of the three you find easiest for your calculations

3. this applies to two-sided and one-sided alternative hypotheses
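The equivalence of the unadjusted versions can be checked numerically on the smoking-cold data. The sketch below is illustrative only: the function name `three_versions` is hypothetical, and the cell layout is an assumption (rows are the two samples, $a$ and $c$ are the counts with colds, $b$ and $d$ the counts without, so $r_1, r_2$ are row totals and $c_1, c_2$ column totals).

```python
def three_versions(a, b, c, d):
    """Unadjusted chi-square statistic for a 2x2 table, computed three ways."""
    r1, r2 = a + b, c + d          # row totals (the two sample sizes)
    c1, c2 = a + c, b + d          # column totals
    t = r1 + r2                    # grand total T

    # 1. proportions version
    p1, p2, p = a / r1, c / r2, c1 / t
    proportions = (p1 - p2) ** 2 / (p * (1 - p) * (1 / r1 + 1 / r2))

    # 2. contingency table version: sum of (O - E)^2 / E, with E = row * column / T
    observed = [[a, b], [c, d]]
    rows, cols = [r1, r2], [c1, c2]
    contingency = sum(
        (observed[i][j] - rows[i] * cols[j] / t) ** 2 / (rows[i] * cols[j] / t)
        for i in range(2) for j in range(2)
    )

    # 3. epidemiology version
    epidemiology = (a * d - b * c) ** 2 * t / (c1 * c2 * r1 * r2)
    return proportions, contingency, epidemiology


v1, v2, v3 = three_versions(15, 8, 10, 12)
print(v1, v2, v3)
```

All three values agree up to floating-point error, and each is the square of the uncorrected z statistic, 1.3336.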

Summary - tests with proportions in two populations

1. if you have two proportions, convert to contingency table

2. if possible, use computer to get exact test

3. if doing calculation by hand, use "epidemiological" version of Yates test

$$S_Y = \frac{T\left(|ad - bc| - T/2\right)^2}{c_1 c_2 r_1 r_2}$$

4. if two-sided alternative, calculate

$$p\text{-value} = \Pr(\chi^2_1 > S_P) = 2\Pr(Z_N > \sqrt{S_P})$$

5. if one-sided alternative, calculate

$$p\text{-value} = \Pr(Z_N > \sqrt{S_P})$$
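A hand calculation following this summary might look like the sketch below, applied to the Yates statistic $S_Y$ for the smoking-cold table. The function name `yates_statistic` and the cell layout (rows are the two samples; $a$, $c$ are the counts with colds) are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def yates_statistic(a, b, c, d):
    """'Epidemiological' form of the Yates continuity-corrected chi-square."""
    r1, r2 = a + b, c + d          # row totals
    c1, c2 = a + c, b + d          # column totals
    t = r1 + r2                    # grand total T
    return t * (abs(a * d - b * c) - t / 2) ** 2 / (c1 * c2 * r1 * r2)


s_y = yates_statistic(15, 8, 10, 12)
root = math.sqrt(s_y)
p_one_sided = 1 - NormalDist().cdf(root)   # Pr(Z_N > sqrt(S_Y))
p_two_sided = 2 * p_one_sided              # Pr(chi-square_1 > S_Y)
print(round(s_y, 4), round(root, 4), round(p_one_sided, 4), round(p_two_sided, 4))
```

The square root of $S_Y$ reproduces the continuity-corrected z value of 1.0336 obtained from $p_1 - p_2$.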

Summary - partial example

If you have two proportions, e.g., in one sample of 80 males, 20% smoke, while in another sample of 100 females, 30% smoke

then contingency table is

study of smoking and gender

                smoker
           Yes     No   Total
Female      30     70     100
Male        16     64      80
Total       46    134     180
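The conversion from reported proportions to counts is simple arithmetic; a small sketch follows. The function name `table_from_proportions` is hypothetical, and the sketch assumes each reported percentage corresponds to a whole number of subjects, so rounding only removes floating-point noise.

```python
def table_from_proportions(groups):
    """Turn {group: (sample size, proportion with the trait)} into count rows."""
    rows = {}
    for name, (n, proportion) in groups.items():
        yes = round(n * proportion)          # number with the trait
        rows[name] = (yes, n - yes, n)       # (yes, no, row total)
    # column totals across all groups
    rows["Total"] = tuple(sum(column) for column in zip(*rows.values()))
    return rows


table = table_from_proportions({"Female": (100, 0.30), "Male": (80, 0.20)})
for name, counts in table.items():
    print(name, *counts)
```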

SAS for large tables

for tables larger than 2x2, and exact computation

example comes from the textbook by D'Agostino: the distribution of three treatments among a sample of males and females.

3 treatments in two sexes

                     Treatment type
            Type A   Type B   Type C   Total
Females        147      435      256     838
Males          147      392      323     862
Total          294      827      579    1700

Program

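title "get exact pvalue for 2x3 table";
options ls=64 ps=24;

/* r = row (sex), c = column (treatment type), n = cell count */
data exact;
input r c n;
datalines;
1 1 147
1 2 435
1 3 256
2 1 147
2 2 392
2 3 323
;
run;

/* WEIGHT treats n as the cell frequency; EXACT requests Fisher's exact test */
proc freq ORDER=DATA;
weight n;
table r*c / CHISQ NOROW NOCOL NOPERCENT EXACT;
run;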

SAS for exact p-values - output

get exact pvalue for 2x3 table

The FREQ Procedure

Table of r by c

r          c
Frequency|      1|      2|      3|  Total
-----------------------------------------
        1|    147|    435|    256|    838
-----------------------------------------
        2|    147|    392|    323|    862
-----------------------------------------
Total    |    294|    827|    579|   1700

SAS for exact p-values - output (continued)

Statistics for Table of r by c

Statistic                      DF     Value      Prob
-----------------------------------------------------
Chi-Square                      2    9.6519    0.0080
Likelihood Ratio Chi-Square     2    9.6684    0.0080
Mantel-Haenszel Chi-Square      1    4.8042    0.0284
Phi Coefficient                      0.0753
Contingency Coefficient              0.0751
Cramer's V                           0.0753

Fisher's Exact Test
----------------------------------
Table Probability (P)    1.771E-05
Pr <= P                     0.0080

Sample Size = 1700
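The Chi-Square line of this output can be verified by hand or with a short sketch like the one below. It reproduces only the Pearson statistic and its p-value, not Fisher's exact test; the function name `pearson_chi_square` is hypothetical, and the closed-form tail probability used here is valid only because the table has 2 degrees of freedom.

```python
import math


def pearson_chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


stat, df = pearson_chi_square([[147, 435, 256], [147, 392, 323]])
# For df = 2 only, the chi-square upper tail is exp(-x / 2).
p_value = math.exp(-stat / 2)
print(round(stat, 4), df, round(p_value, 4))
```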
