Lecture 3: Sample Size and Power Estimation


Page 1: Lecture 3: Sample Size and Power Estimation

Copyright 2009, The Johns Hopkins University and Saifuddin Ahmed. All rights reserved. Use of these materials permitted only in accordance with license rights granted. Materials provided “AS IS”; no representations or warranties provided. User assumes all responsibility for use, and all liability related thereto, and must independently review all materials for accuracy and efficacy. May contain materials owned by others. User is responsible for obtaining permissions for use from third parties as needed.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this site.

Page 2: Lecture 3: Sample Size and Power Estimation

Methods in Sample Surveys 140.640

3rd Quarter, 2009

Sample Size and Power Estimation

Saifuddin Ahmed, PhD
Biostatistics Department

School of Hygiene and Public Health
Johns Hopkins University

Page 3: Lecture 3: Sample Size and Power Estimation

Sample size and Power

“When statisticians are not making their livings producing confidence intervals and p-values, they are often producing power calculations”

Newson, 2001

Page 4: Lecture 3: Sample Size and Power Estimation

“In planning of a sample survey, a stage is always reached at which a decision must be made about the size of the sample. The decision is important. Too large a sample implies a waste of resources, and too small a sample diminishes the utility of the results.”

Cochran, 1977

Page 5: Lecture 3: Sample Size and Power Estimation

Sample size estimation: Why?

• Provides validity for clinical trials and intervention studies – in fact, for any research study, even presidential election polls

• Assures that the intended study will have the desired power to correctly detect a (clinically meaningful) difference in the entity under study, if such a difference truly exists

Page 6: Lecture 3: Sample Size and Power Estimation

Sample size estimation

• ONLY two objectives:
  – Measure with a precision: precision analysis
  – Assure that the difference is correctly detected: power analysis

Page 7: Lecture 3: Sample Size and Power Estimation

First objective: measure with a precision

• Whenever we propose to estimate population parameters, such as the population mean, proportion, or total, we need to estimate them with a specified level of precision

• We want to specify a sample size that is sufficiently large to ensure a high probability that the error of estimation stays within the desired limits

Page 8: Lecture 3: Sample Size and Power Estimation

Stated mathematically:

• We want a sample size that ensures we can estimate a value, say p, from a sample which corresponds to the population parameter P.

• Since we cannot guarantee that p will equal P exactly, we allow some error.

• The error is limited to a certain extent; that is, it should not exceed some specified limit, say d.

Page 9: Lecture 3: Sample Size and Power Estimation

• We may express this as: p − P = ±d,

i.e., the difference between the estimated p and the true P is no greater than d (the allowable error, or margin of error).

• But how confident can we be that the p we obtain lies within ±d of P?

• In other words, we want to attach a confidence level, say 95%, to the error limit d.

That is, 1 − α = 95%. It is common practice to set the α-error at 5%.

Page 10: Lecture 3: Sample Size and Power Estimation

In probability terms, that is,

Prob{ −d ≤ p − P ≤ d } ≥ 1 − α

In English: we want our estimated proportion p to fall within ±d of the true P, and we want this to occur with probability 1 − α.

Page 11: Lecture 3: Sample Size and Power Estimation

From our basic statistics course, we know that we can construct a confidence interval for p by:

p ± z_{1−α/2} · se(p)

where z_{1−α/2} denotes a value on the abscissa of the standard normal distribution (from an assumption that the sample estimates are normally distributed) and se(p) = σ_p is the standard error.

Hence, we relate p ± d to these probabilities such that:

p ± d = p ± z_{1−α/2} · σ_p

d = z_{1−α/2} · σ_p = z_{1−α/2} · √[ p(1−p)/n ]

Page 12: Lecture 3: Sample Size and Power Estimation

If we square both sides,

d² = z²_{1−α/2} · p(1−p)/n

Solving for n:

n = z²_{1−α/2} · p(1−p) / d²

Page 13: Lecture 3: Sample Size and Power Estimation

For the above example (p = 0.4, d = 0.10):

n = (1.96)² · 0.4 · 0.6 / (0.10)² = 92.2 ≈ 93
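A quick arithmetic check in Stata, in the same di style used on the next slide:

. di 1.96^2*.4*.6/(.1^2)

which returns 92.1984, i.e., n ≈ 93.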

Page 14: Lecture 3: Sample Size and Power Estimation

Note that the sample size requirement is highest when p = 0.5. It is common practice to take p = 0.5 when no information about p is available, for a conservative estimate of the sample size.

As an example, with p = 0.5, d = 0.05 (5% margin of error), and α-error = 0.05:

n = (1.96)² · 0.5 · 0.5 / (0.05)² = 384.16 ≈ 385 (~400)

. di 1.96^2*.5*(1-.5)/(.05^2)
384.16
. di (invnorm(.05/2))^2*.5*(1-.5)/(.05^2)
384.14588

Page 15: Lecture 3: Sample Size and Power Estimation

Stata

. sampsi .5 .55, p(.5) onesample

Estimated sample size for one-sample comparison of proportion to hypothesized value

Test Ho: p = 0.5000, where p is the proportion in the population

Assumptions:

alpha = 0.0500 (two-sided)
power = 0.5000

alternative p = 0.5500

Estimated required sample size:

n = 385

Page 16: Lecture 3: Sample Size and Power Estimation

Sample Size Estimation for Relative Differences

If d is a relative difference, the absolute margin of error becomes d·p, and

n = t² · p(1−p) / (d·p)²

Consider a 10% change relative to p = 0.40 in the above example. Then the absolute margin is 0.40 × 0.10 = 0.04; that is, p may vary between 0.36 and 0.44. Now,

n = (1.96)² · 0.4 · 0.6 / (0.10 × 0.4)² = 576.24 ≈ 577

Note: the sample size calculation is very sensitive to d.
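The same figure can be checked with a di command:

. di 1.96^2*.4*.6/((.1*.4)^2)

which returns 576.24.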

Page 17: Lecture 3: Sample Size and Power Estimation

Sample Size for Continuous Data

n = t² · σ² / d²

Compared with the formula for a proportion, only the variance term changes: σ² replaces p(1 − p).
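As an illustration, assuming (purely for this sketch, not values from the lecture) a standard deviation of σ = 15 and a desired margin of error of d = 2:

. di 1.96^2*15^2/(2^2)

which returns 216.09, so about 217 observations would be needed.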

Page 18: Lecture 3: Sample Size and Power Estimation

Sources of variance information:

• Published studies (concerns: geographical, contextual, and time issues – external validity)
• Previous studies
• Pilot studies

Page 19: Lecture 3: Sample Size and Power Estimation

Study design and sample size

• Sample size estimation depends on the study design – as variance of an estimate depends on the study design

• The variance formula we just used is based on “simple random sampling” (SRS)

• In practice, the SRS strategy is rarely used
• Be aware of the study design

Page 20: Lecture 3: Sample Size and Power Estimation

Sample Size Under SRS Without Replacement

We know that under SRSWOR,

V(ȳ) = (σ²/n) · (N − n)/(N − 1)

so that

d = t_α · √[ p(1−p)/n · (N − n)/(N − 1) ]

So, under SRSWOR:

n = N·P(1 − P) / [ (N − 1)·D² + P(1 − P) ]

where D = d / t_α.

Page 21: Lecture 3: Sample Size and Power Estimation

For continuous data,

n = N·σ² / [ (N − 1)·D² + σ² ]

Page 22: Lecture 3: Sample Size and Power Estimation

Alternative Specification (in two stages):

This n, say n′, is estimated under simple random sampling with replacement (SRSWR). When sampling is without replacement, we adjust n by

n = n′ / (1 + n′/N)

or, equivalently,

1/n = 1/n′ + 1/N

(n is adjusted by the finite population correction factor, 1 − n/N).

Page 23: Lecture 3: Sample Size and Power Estimation

Example: For the exercise study, 93 samples are needed. Say the population size is 200. Under SRSWOR:

n = n′ / (1 + n′/N) = 93 / (1 + 93/200) = 93 / 1.465 = 63.48, so n = 64

A smaller sample size is needed when the population size is small, but the opposite is not true.
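A quick check of the adjustment in Stata:

. di 93/(1+93/200)

which returns 63.48, rounded up to 64.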

Page 24: Lecture 3: Sample Size and Power Estimation

Derivation (alternative two-stage formula):

Remember the relationship between V(ȳ) = S²/n under SRSWR and V(ȳ) = (S²/n)·(1 − n/N) under SRSWOR. Writing n′ = t²S²/d² for the SRSWR sample size and setting d² = t²·(S²/n)·(1 − n/N) under SRSWOR:

d²/t² = S²/n − S²/N
⇒ d²/(t²S²) = 1/n − 1/N
⇒ 1/n′ = 1/n − 1/N
⇒ 1/n = 1/n′ + 1/N
⇒ n = n′ / (1 + n′/N)

Page 25: Lecture 3: Sample Size and Power Estimation

Sample Size Based on Coefficient of Variation

• In the above, the sample size is derived from an absolute measure of variation, σ2.

• The coefficient of variation (CV) is a relative measure, in which the units of measurement cancel because the standard deviation is divided by the mean.

• Coefficient of variation is useful for comparison of variables.

Page 26: Lecture 3: Sample Size and Power Estimation

The coefficient of variation is defined as

C_y = S_y / Ȳ, and is estimated by c_y = s_y / ȳ

The coefficient of variation (CV) of the mean is

CV = SE(ȳ)/ȳ = (s_y/√n)/ȳ = (1/√n)·(s_y/ȳ)

So,

n = (1/CV²) · (s_y²/ȳ²)

For a proportion p,

n = (1/CV²) · p(1−p)/p² = (1/CV²) · (1−p)/p
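For example, assuming (illustrative values only, not from the lecture) p = 0.4 and a target CV of the estimate of 5%:

. di (1-.4)/(.4*.05^2)

which returns 600.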

Page 27: Lecture 3: Sample Size and Power Estimation

Caution about using coefficient of variation (CV)

• If mean of a variable is close to zero, CV estimate is large and unstable.

• Next, consider the CV for binomial variables. For a binary variable, the choice of P versus Q = 1 − P does not change P(1 − P), but it does change the CV. So the choice of P affects the sample size when the CV method is used.

Page 28: Lecture 3: Sample Size and Power Estimation

Cost considerations for sample size

How many samples can you afford to interview, given the budget constraints?

C(n) = cost of taking n samples
c0 = fixed cost
c1 = cost for each sample interview

Then, C(n) = c0 + c1 × n

Example:
C(n) = $10,000 – your budget for survey implementation
c0 = $3,000 – costs for interviewer training, questionnaire printing, etc.
c1 = $8.00 – cost for each sample interview

10000 = 3000 + 8 × n

So, n = 875
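Solving the budget identity in Stata:

. di (10000-3000)/8

which returns 875.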

Page 29: Lecture 3: Sample Size and Power Estimation

Objective 2: Issues of Power Calculation

POWER: The power of a test is the probability of rejecting the null hypothesis if it is incorrect.

TRICKS to REMEMBER:
R, T: Reject the null hypothesis if it is true – Type I error (alpha error) {one stick in R, T}
A, F: Accept the null hypothesis if it is false – Type II error (beta error) {two sticks in A, F}

POWER = 1 − Type II error

Power: Reject the null hypothesis if it is false.

Another way:

False Positive (YES instead of NO) ?

False Negative (NO instead of YES) ?

Page 30: Lecture 3: Sample Size and Power Estimation

[Figure: three pairs of normal curves (Ho = left curve, Ha = right curve; the area to the right of the critical line is the power), each with diff = .2 and alpha = .05:
N = 50: Power = .29
N = 200: Power = .81
N = 500: Power = .99]

Page 31: Lecture 3: Sample Size and Power Estimation

We take power into consideration when we test hypotheses. Example: consider the following study questions.

a) What proportion of pregnant women received antenatal care?
There is no hypothesis.

b) Did 80% of women receive antenatal care?
There is a hypothesis: to test whether the estimated value is greater than, less than, or equal to a pre-specified value.

c) Are women in the project (intervention) area more likely to utilize antenatal care, compared to the control area?
There is a hypothesis: to test that P1 is greater than P2. In terms of hypotheses:
Null hypothesis: Ho: P1 = P2, i.e., P1 − P2 = 0
Alternative hypothesis: Ha: P1 > P2 (one-sided)
Ha: P1 ≠ P2, i.e., P1 − P2 ≠ 0 (two-sided)

Page 32: Lecture 3: Sample Size and Power Estimation

Issues: one-sided vs. two-sided tests. One-sided: the sample size will be smaller. Two-sided: the sample size will be larger. Always prefer "two-sided" – it is almost mandatory in clinical trials. Why? Uncertainty in knowledge (a priori).

Page 33: Lecture 3: Sample Size and Power Estimation

How to incorporate "power" in sample size calculations?

1. Proportions:

n = (t_α + t_β)² · 2·p̄(1 − p̄) / d²,  where p̄ = (p1 + p2)/2 and d = p1 − p2

Note: n is for each group.

Alternative:

n1 = n2 = [ (Z_α + Z_β) / (2·arcsin√p1 − 2·arcsin√p2) ]²

Why? The arcsine transformation provides a normal approximation to proportions.
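A rough check of the first formula for the running example (p1 = .5, p2 = .55, two-sided α = .05, power = .80):

. di (1.96+.84)^2*2*.525*.475/(.05^2)

which returns about 1564 per group, in line with the n1 = n2 = 1565 that sampsi reports later with the nocontinuity option.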

Page 34: Lecture 3: Sample Size and Power Estimation

For continuous variables:

n = (Z_{1−α/2} + Z_β)² · 2s² / d²
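For illustration, assuming (values not from the lecture) a common standard deviation s = 10 and a detectable difference d = 5:

. di (1.96+.84)^2*2*10^2/(5^2)

which returns 62.72, so about 63 per group.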

Page 35: Lecture 3: Sample Size and Power Estimation

Values of Z_{1−α/2} and Z_β corresponding to specified values of significance level and power

Significance level    Two-sided    One-sided
1%                    2.576        2.326
5%                    1.960        1.645
10%                   1.645        1.282

Power    Z_β
80%      0.84
90%      1.282
95%      1.645
99%      2.326

Page 36: Lecture 3: Sample Size and Power Estimation

How to incorporate "power" in sample size calculations?

a) Proportions:

n = (z_{1−α/2} + z_{1−β})² · [var(difference)] / (p1 − p2)²

How do we estimate the variance of the difference?

Under the assumption of independence of p1 and p2, cov(p1, p2) = 0, so

var(p1 − p2) = var(p1) + var(p2) = σ²_{p1} + σ²_{p2}

If we also assume that p1 and p2 have a common variance, i.e., σ²_{p1} = σ²_{p2} = σ²_p, then

σ²_d = 2·σ²_p

Page 37: Lecture 3: Sample Size and Power Estimation

So,

n = (z_{1−α/2} + z_{1−β})² · [var(difference)] / (p1 − p2)²
  = (z_{1−α/2} + z_{1−β})² · 2·p̄·(1 − p̄) / (p1 − p2)²

where p̄ = (p1 + p2)/2, under the assumption of common variance.

Page 38: Lecture 3: Sample Size and Power Estimation

The sample size formula for testing two proportions under independence, without the assumption of common variance, is then:

n1 = n2 = (z_{1−α/2} + z_{1−β})² · [p1(1−p1) + p2(1−p2)] / (p1 − p2)²

Note that Fleiss (1981) suggested a more precise formula:

n′ = { z_{1−α/2}·√[2·p̄(1−p̄)] + z_{1−β}·√[p1(1−p1) + p2(1−p2)] }² / (p1 − p2)²,  where p̄ = (p1 + p2)/2

When n1 and n2 are not equal but related by a ratio, say r, the formula is:

n′ = { z_{1−α/2}·√[(r+1)·p̄(1−p̄)] + z_{1−β}·√[r·p1(1−p1) + p2(1−p2)] }² / [r·(p1 − p2)²]

The final formula (using the normal approximation with continuity correction; without the correction the attained power tends to be lower than expected) is:

n1 = (n′/4) · { 1 + √[ 1 + 2(r+1) / (n′·r·|p1 − p2|) ] }²,  with n2 = r·n1

Stata implements this formula in the sampsi command.
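As a check of the continuity-corrected formula against the Stata output shown on the surrounding slides (n′ = 1565 without the correction, r = 1, |p1 − p2| = .05):

. di (1565/4)*(1+sqrt(1+4/(1565*.05)))^2

which returns about 1604.7, matching the n1 = n2 = 1605 that sampsi reports with its default continuity correction.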

Page 39: Lecture 3: Sample Size and Power Estimation

Stata implementation

No hypothesis:
. sampsi .5 .55, p(.5) onesample
Estimated required sample size: n = 385

Study has a hypothesis, but compares with a hypothesized value:
. sampsi .5 .55, p(.8) onesample
Estimated sample size for one-sample comparison of proportion to hypothesized value
Estimated required sample size: n = 783

Study has a hypothesis, and compares two groups:
. sampsi .5 .55, p(.8)
Estimated sample size for two-sample comparison of proportions
n1 = 1605
n2 = 1605

. di 783/385
2.0337662
. di (1.96+.84)^2/1.96^2
2.0408163

Page 40: Lecture 3: Sample Size and Power Estimation

Stata implementation

. sampsi .5 .55, p(.8) nocontinuity

Estimated sample size for two-sample comparison of proportions

Test Ho: p1 = p2, where p1 is the proportion in population 1
and p2 is the proportion in population 2

Assumptions:

alpha = 0.0500 (two-sided)
power = 0.8000
p1 = 0.5000
p2 = 0.5500
n2/n1 = 1.00

Estimated required sample sizes:

n1 = 1565
n2 = 1565

. di 783*2
1566

In each group, the sample size is doubled.

Page 41: Lecture 3: Sample Size and Power Estimation

[Power graph in Stata: "Sample Size and Power for P1=.5 and P2=.55" – power (from .8 to 1) plotted against sample size n (roughly 1500 to 3500).]

Page 42: Lecture 3: Sample Size and Power Estimation

*Calculate and plot sample size by power from .8 to .99
***************************************************************************
args p1 p2 type

clear
set obs 20
gen n = .
gen power = .
local i 0

while `i' < _N {
    local i = `i' + 1
    local j = .79 + `i'/100
    quietly sampsi `p1' `p2', p(`j') `type'
    replace n = r(N_1) in `i'
    replace power = r(power) in `i'
}

noisily list power n

graph twoway line power n, t1("Sample Size and Power for P1=`p1' and P2=`p2' `type'")
*****************************************************************************

Save the above commands as a "do" file (e.g., sample_graph.do).

Execute the file, passing the two proportions (and any extra sampsi option) as arguments, e.g.:

run sample_graph .5 .55

Page 43: Lecture 3: Sample Size and Power Estimation

Sample size determination when the hypothesis is expressed as a "relative risk"

In epidemiological studies, the hypothesis is often expressed as a relative risk or odds ratio, e.g., H0: R = 1.

A sample size formula given in Donner (1983) for the relative risk (p. 202) is:

n = { Z_α·√[2·P̄(1 − P̄)] + Z_β·√[R·P_C(1 − R·P_C) + P_C(1 − P_C)] }² / [P_C(1 − R)]²

where P̄ = [P_C(1 + R)]/2 and R = P_E / P_C.
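A small numeric illustration of Donner's formula, assuming (values not from the lecture) P_C = 0.2 and R = 1.5, so that P_E = 0.3 and P̄ = 0.25:

. di (1.96*sqrt(2*.25*.75) + .84*sqrt(1.5*.2*(1-1.5*.2) + .2*(1-.2)))^2 / (.2*(1-1.5))^2

which returns about 293 per group; this is just the Fleiss formula evaluated at p1 = 0.2 and p2 = 0.3.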

Page 44: Lecture 3: Sample Size and Power Estimation

This is nothing but Fleiss' formula:

n′ = { z_{1−α/2}·√[2·p̄(1−p̄)] + z_{1−β}·√[P_E(1−P_E) + P_C(1−P_C)] }² / (P_C − P_E)²,  where p̄ = (P_E + P_C)/2

Note that R = P_E/P_C, so P_E = R·P_C.

Solution: replace every P_E with R·P_C and apply Fleiss' formula.

How Donner's formula was derived:

P̄ = (P_E + P_C)/2 = (R·P_C + P_C)/2 = [P_C(R + 1)]/2 = [P_C(1 + R)]/2

P_E(1 − P_E) + P_C(1 − P_C) = R·P_C(1 − R·P_C) + P_C(1 − P_C)
  = R·P_C − R²·P_C² + P_C − P_C²
  = P_C(R − R²·P_C + 1 − P_C)
  = P_C(1 + R − P_C(1 + R²))

and (P_C − P_E)² = (P_C − R·P_C)² = [P_C(1 − R)]²

Page 45: Lecture 3: Sample Size and Power Estimation

Sample size for odds-ratio (OR) estimates

It is convenient to do this in two stages:

1. Estimate P2 from the odds ratio (OR)
2. Apply the "proportion method" (of Fleiss)

Deriving P2 from the OR:

OR = [P2/(1 − P2)] / [P1/(1 − P1)] = (P2·Q1) / (P1·Q2)

OR·P1·Q2 = P2·Q1
OR·P1·(1 − P2) = P2·(1 − P1)
OR·P1 = P2·(1 − P1) + OR·P1·P2 = P2·(Q1 + OR·P1)

⇒ P2 = OR·P1 / (OR·P1 + Q1)
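A minimal sketch of the two stages in Stata, using the illustrative values that appear on the next slide (OR = 2, P1 = 0.25):

. local or = 2
. local p1 = .25
. local p2 = `or'*`p1'/(`or'*`p1' + (1-`p1'))
. di `p2'
. sampsi `p1' `p2', p(.8)

The di shows P2 = .4, and sampsi then gives the per-group sample size (165, as reported on the next slide).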

Page 46: Lecture 3: Sample Size and Power Estimation

An example

Suppose we want to detect an OR of 2, using a 1:1 ratio of cases to controls, in a population with an expected exposure proportion among non-cases of 0.25, while requiring α = 0.05 and power = 0.8.

How do we estimate the sample size?

EpiTable calculates m1 = m2 = 165 (total sample size = 330).

So, P1 = .25 and P2 = (2 × .25)/(2 × .25 + .75) = 0.4.

In Stata:

. sampsi .25 .40, p(.8)

Estimated required sample sizes:

n1 = 165n2 = 165

Page 47: Lecture 3: Sample Size and Power Estimation

Sample size determination for logistic regression models

Consider a logistic regression:

logit(p) = log[ p/(1 − p) ] = α + βx

We want to estimate the sample size needed to achieve a certain power for testing the null hypothesis Ho: β = 0. Recall that the test of the null hypothesis depends on the variance of the estimated β. In logistic regression, the effect size is expressed as a "log odds ratio" (η). Hsieh (1989) suggested the following formula for a one-sided test:

n = [ z_α + z_β·exp(−η²/4) ]² · (1 + 2·p̂·δ) / (p̂·η²)

where

δ = [ 1 + (1 + η²)·exp(5η²/4) ] / [ 1 + exp(−η²/4) ]

Page 48: Lecture 3: Sample Size and Power Estimation

Say you want to examine an effect size (log odds ratio) corresponding to an odds ratio of 1.5, i.e., log(1.5) = .40546511 ≈ 0.41. See the implementation of the formula in Stata:

. clear
. set obs 1
obs was 0, now 1
. *Enter "odds-ratio"
. gen p=0.10
. gen or=1.5
. gen beta=log(or)
. di beta
.4054651
. gen delta= (1+(1+beta^2)*exp(5*beta^2/4))/(1+exp(-beta^2/4))
. di delta
1.2399909
. di " n = " (1.645+1.282*exp(-beta^2/4))^2*(1+2*p*delta)/(p*beta^2)
n = 627.61987

So, n = 628 ≈ 630.

Sample size for multiple logistic regression: a multiple logistic regression requires a larger n to detect effects. Let R denote the multiple correlation between the independent variable of interest, X, and the other covariates. Then the sample size is n′ = n/(1 − R²). Say, if R = 0.25, then n′ = 630/(1 − 0.25²) = 672.
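The multiple-regression adjustment can be checked with a di command:

. di 630/(1-0.25^2)

which returns 672.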

Page 49: Lecture 3: Sample Size and Power Estimation

Stata’s add-on programs for sample size estimation

• STPOWER: Survival studies
• Sampsi_reg: Linear regression
• Sampclus: Cluster sampling
• ART: randomized trials with survival time or binary outcome
• XSAMPSI: Cross-over trials
• Samplesize: Graphical results
• MVSAMPSI: multivariate regression

Page 50: Lecture 3: Sample Size and Power Estimation

• STUDYSI: Comparative study with binary or time-to-event outcome

• SSKAPP: Kappa statistic measure of inter-rater agreement

• CACLSI: log-rank/binomial test

Page 51: Lecture 3: Sample Size and Power Estimation

Additional topics to be covered

• Sample allocation – stratified sampling
• Sample size corrected for design effect (DEFF)
• Optimal sample size per cluster
• Sample size for clusters
• Sample size and power for pre-post surveys in program evaluation