proportions and confidence intervals in biostatistics

Biostatistics

Lecture 9

Lecture 8 Review – Proportions and confidence intervals

• Calculation and interpretation of:
– sample proportion
– 95% confidence interval for population proportion

• Calculation and interpretation of:
– difference in sample proportions
– 95% confidence interval for difference in population proportions

Single proportion – Inference

Estimated proportion of vivax malaria (p) = 15/100 = 0.15

Standard error of p

$$\text{s.e.}(p) = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{0.15(1-0.15)}{100}} = 0.036$$

• 95% Confidence interval for π (population proportion)

Lower limit = p - 1.96×s.e.(p) = 0.079

Upper limit = p + 1.96×s.e.(p) = 0.221

Interpretation:

“We are 95% confident that the population proportion (π) of people with vivax malaria is between 0.079 and 0.221 (or between 7.9% and 22.1%).”
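As a check on the arithmetic above, here is a minimal Python sketch of the single-proportion calculation. The function name `proportion_ci` is a hypothetical helper, not something from the lecture. Working from the unrounded standard error gives limits of about 0.080 and 0.220, so the third decimal place differs slightly from the slide, which rounds the standard error to 0.036 first.

```python
import math

def proportion_ci(d, n, z=1.96):
    """Sample proportion d/n, its standard error, and the 95% CI limits
    (p - z*s.e., p + z*s.e.), as on the slide."""
    p = d / n
    se = math.sqrt(p * (1 - p) / n)
    return p, se, p - z * se, p + z * se

# Vivax malaria example: 15 cases among 100 people.
p, se, lower, upper = proportion_ci(15, 100)
print(f"p = {p:.3f}, s.e.(p) = {se:.3f}, 95% CI = {lower:.3f} to {upper:.3f}")
```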

Comparing two proportions: 2×2 table

• Proportion of all subjects experiencing outcome, p = d/n
• Proportion in exposed group, p1 = d1/n1
• Proportion in unexposed group, p0 = d0/n0

                      With outcome   Without outcome   Total
                      (diseased)     (disease-free)
Exposed (group 1)     d1             h1                n1
Unexposed (group 0)   d0             h0                n0
Total                 d              h                 n

Comparing two proportions: Example – TBM trial

Death during 9 months post start of treatment

Treatment group           Yes                 No     Total
Dexamethasone (group 1)   87 (p1 = 0.318)     187    274
Placebo (group 0)         112 (p0 = 0.413)    159    271
Total                     199                 346    545

Comparing two proportions - Inference

Example:- TBM trial

Estimate of difference in population proportions

= p1-p0 = -0.095

s.e.(p1-p0) = 0.041

95% CI for difference in population proportions (π1-π0):

-0.095 ± 1.96×0.041

-0.175 up to -0.015 OR -17.5% up to -1.5%

Interpretation:

“We are 95% confident that the difference in population proportions is between -17.5% (dexamethasone reduces the proportion of deaths by a large amount) and -1.5% (dexamethasone marginally reduces the proportion of deaths).”
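The TBM figures above can be reproduced with a short Python sketch. It assumes the standard error of the difference is the square root of the sum of the two squared standard errors, which is the form summarised later in this lecture. The helper name `risk_difference_ci` is hypothetical.

```python
import math

def risk_difference_ci(d1, n1, d0, n0, z=1.96):
    """Difference in proportions p1 - p0 with its standard error and 95% CI."""
    p1, p0 = d1 / n1, d0 / n0
    # s.e.(p1 - p0) = sqrt(s.e.(p1)^2 + s.e.(p0)^2)
    se = math.sqrt(p1 * (1 - p1) / n1 + p0 * (1 - p0) / n0)
    diff = p1 - p0
    return diff, se, diff - z * se, diff + z * se

# TBM trial: 87/274 deaths on dexamethasone, 112/271 on placebo.
diff, se, lower, upper = risk_difference_ci(87, 274, 112, 271)
print(f"p1 - p0 = {diff:.3f}, s.e. = {se:.3f}, 95% CI = {lower:.3f} to {upper:.3f}")
```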

Comparing two proportions (absolute difference):-

Risk difference

Example:- TBM trial

Outcome measure: Death during nine months following start of treatment.

Dexamethasone: p1 (incidence risk) = d1/n1 = 87/274 = 0.318

Placebo: p0 (incidence risk) = d0/n0 = 112/271 = 0.413

p1 – p0 (risk difference) = 0.318 – 0.413 = -0.095 (or -9.5%)

Lecture 9 – Measures of association

• 2×2 table (RECAP)
• Measures of association
– Risk difference
– Risk ratio
– Odds ratio
• Calculation & interpretation of confidence interval for each measure of association

2×2 table

• Proportion of all subjects experiencing outcome, p = d/n
• Proportion in exposed group, p1 = d1/n1
• Proportion in unexposed group, p0 = d0/n0

                      With outcome   Without outcome   Total
                      (diseased)     (disease-free)
Exposed (group 1)     d1             h1                n1
Unexposed (group 0)   d0             h0                n0
Total                 d              h                 n

2×2 table - Measures of association

• Different measures of association between outcome and exposure
• Can calculate confidence intervals and test statistics for each measure

Measure of effect             Formula
Risk difference (lecture 8)   p1 − p0
Risk ratio (relative risk)    p1 / p0
Odds ratio                    (d1/h1) / (d0/h0)

2×2 table – TBM trial example

Death during 9 months post start of treatment

Treatment group           Yes        No         Total      Incidence risk of death (p)   Odds of death
Dexamethasone (group 1)   87 (d1)    187 (h1)   274 (n1)   d1/n1 = 0.318                 d1/h1 = 0.465
Placebo (group 0)         112 (d0)   159 (h0)   271 (n0)   d0/n0 = 0.413                 d0/h0 = 0.704
Total                     199        346        545

• Risk difference = p1-p0 = 0.318 – 0.413 = -0.095 (or -9.5%)

• Risk ratio = p1/p0 = 0.318 / 0.413 = 0.77

• Odds ratio = (d1/h1) / (d0/h0) = 0.465 / 0.704 = 0.66
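The three measures can be computed directly from the four cell counts. The sketch below is illustrative only. The function name `measures_of_association` is an assumption, and the argument order (d1, h1, d0, h0) follows the table notation used in these slides.

```python
def measures_of_association(d1, h1, d0, h0):
    """Risk difference, risk ratio and odds ratio from a 2x2 table.

    d = number with the outcome, h = number without; group 1 is exposed,
    group 0 is unexposed.
    """
    n1, n0 = d1 + h1, d0 + h0
    p1, p0 = d1 / n1, d0 / n0
    return {
        "risk_difference": p1 - p0,
        "risk_ratio": p1 / p0,
        "odds_ratio": (d1 / h1) / (d0 / h0),
    }

# TBM trial counts.
m = measures_of_association(87, 187, 112, 159)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```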


2×2 table – Calculation of Odds Ratio

Commonly given formula for odds ratio: (a×d) / (b×c) = (87×159) / (187×112) = 0.66

Treatment group           Yes       No        Total
Dexamethasone (group 1)   87 (a)    187 (b)   274 (n1)
Placebo (group 0)         112 (c)   159 (d)   271 (n0)
Total                     199       346       545

2×2 table – Calculation of Odds Ratio

Odds ratio for not dying = (a×d) / (b×c) = (187×112) / (87×159) = 1.51 (= 1/0.66)

Treatment group           No        Yes       Total
Dexamethasone (group 1)   187 (a)   87 (b)    274 (n1)
Placebo (group 0)         159 (c)   112 (d)   271 (n0)
Total                     346       199       545

Differences in measures of association

• When there is no association between exposure and outcome,

– risk difference = 0

– risk ratio (RR) = 1

– odds ratio (OR) = 1

• Risk difference can be negative or positive

• RR & OR are always positive

• For rare outcomes, OR ~ RR

• OR is always further from 1 than corresponding RR

– If RR > 1 then OR > RR

– If RR < 1 then OR < RR
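The last two points can be seen numerically. This sketch compares RR and OR for the TBM trial, where death is a common outcome, with a rare-outcome table. The rare-outcome counts are hypothetical, invented purely for illustration, and the helper names are assumptions.

```python
def risk_ratio(d1, n1, d0, n0):
    return (d1 / n1) / (d0 / n0)

def odds_ratio(d1, n1, d0, n0):
    h1, h0 = n1 - d1, n0 - d0  # numbers without the outcome
    return (d1 / h1) / (d0 / h0)

# TBM trial counts from the slides (common outcome: roughly a third die).
tbm = (87, 274, 112, 271)
# Hypothetical rare-outcome counts, made up for illustration only.
rare = (10, 1000, 20, 1000)

for label, counts in [("TBM trial", tbm), ("rare outcome (made up)", rare)]:
    rr, or_ = risk_ratio(*counts), odds_ratio(*counts)
    print(f"{label}: RR = {rr:.3f}, OR = {or_:.3f}")
```

In both tables the OR lies further from 1 than the RR, but the gap is small when the outcome is rare.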

Interpretation of measures of association

• RR & OR < 1, associated with a reduced risk / odds (may be protective)

– RR = 0.8 (reduced risk of 20%)

• RR & OR > 1, associated with an increased risk / odds

– RR = 1.2 (increased risk of 20%)

• RR & OR – the further the ratio is from 1, the stronger the association between exposure and outcome (e.g. RR=3 indicates a stronger association than RR=2).

Inference

• Obtain a sample estimate, q, of the population parameter (e.g. difference in proportions)

• REMEMBER different samples would give different estimates of the population parameter (e.g. sample 1 q1, sample 2 q2, …)

• Derive:

– Standard error of q (i.e. s.e.(q))

– Confidence interval (i.e. q ± 1.96 × s.e.(q))

Ratios – Risk ratio (RR) or Odds ratio (OR)

• Usual confidence interval formula, q ± (1.96 × s.e.(q)), is problematic for ratios.

When q is close to zero and s.e.(q) is large, the calculated lower limit of the confidence interval may be negative, even though a ratio cannot be negative.
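A small numerical sketch of the problem. The counts are hypothetical (1/10 events versus 3/10), invented for illustration. The slides give no ratio-scale standard error, so this assumes the delta-method approximation s.e.(RR) ≈ RR × s.e.(log_e RR), with s.e.(log_e RR) taken from the formula used for the risk ratio in this lecture.

```python
import math

# Hypothetical small study, made up for illustration: 1/10 vs 3/10 events.
d1, n1, d0, n0 = 1, 10, 3, 10

rr = (d1 / n1) / (d0 / n0)
se_log_rr = math.sqrt(1 / d1 - 1 / n1 + 1 / d0 - 1 / n0)

# Assumption: delta-method approximation for the s.e. on the ratio scale.
se_rr = rr * se_log_rr
naive_lower = rr - 1.96 * se_rr
naive_upper = rr + 1.96 * se_rr

# Interval built on the log scale, then transformed back with exp().
log_lower = math.exp(math.log(rr) - 1.96 * se_log_rr)
log_upper = math.exp(math.log(rr) + 1.96 * se_log_rr)

print(f"RR = {rr:.2f}")
print(f"q +/- 1.96*s.e.(q): {naive_lower:.2f} to {naive_upper:.2f}")
print(f"via log scale:      {log_lower:.2f} to {log_upper:.2f}")
```

The first interval has a negative lower limit, whereas the log-scale interval stays positive by construction.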

Risk ratio (RR)

• Solution: calculate the logarithm of RR (log_e RR) and its standard error

$$\text{s.e.}(\log_e RR) = \sqrt{\frac{1}{d_1} - \frac{1}{n_1} + \frac{1}{d_0} - \frac{1}{n_0}}$$

95% CI for logarithm of RR:
Upper limit = log_e RR + 1.96 × s.e.(log_e RR)
Lower limit = log_e RR − 1.96 × s.e.(log_e RR)

95% CI for risk ratio (RR):
Upper limit = antilog (upper limit of CI for log_e RR)
Lower limit = antilog (lower limit of CI for log_e RR)

Log to the base e & antilog_e (exponential)

• ‘Natural logarithms’ use the mathematical constant, e, as their base, e = 2.71828… (1618 – Scottish mathematician John Napier)

• antilog_e(x) = exp(x) = e^x

e^1 = 2.718      log_e 2.718 = 1      antilog_e 1 = 2.718
e^2 = 7.389      log_e 7.389 = 2      antilog_e 2 = 7.389
e^3 = 20.086     log_e 20.086 = 3     antilog_e 3 = 20.086

10^1 = 10        log_10 10 = 1        antilog_10 1 = 10
10^2 = 100       log_10 100 = 2       antilog_10 2 = 100
10^3 = 1000      log_10 1000 = 3      antilog_10 3 = 1000
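The same table can be produced with Python's standard `math` module, where `math.exp` is the antilog to base e, `math.log` is the natural logarithm, and `math.log10` is the base-10 logarithm. This is just a quick illustration.

```python
import math

for x in (1, 2, 3):
    value = math.exp(x)  # antilog_e(x) = e**x
    print(f"e^{x} = {value:.3f}, log_e({value:.3f}) = {math.log(value):.0f}")
    power = 10 ** x      # antilog_10(x) = 10**x
    print(f"10^{x} = {power}, log_10({power}) = {math.log10(power):.0f}")
```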

Risk ratio = p1/p0 = 0.318 / 0.413 = 0.77

log_e RR = log_e(0.77) = -0.26

$$\text{s.e.}(\log_e RR) = \sqrt{\frac{1}{87} - \frac{1}{274} + \frac{1}{112} - \frac{1}{271}} = 0.11$$

95% CI for log_e RR: -0.48 up to -0.04

95% CI for RR: exp(-0.48) up to exp(-0.04) = 0.62 up to 0.96
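The hand calculation above can be sketched in Python as follows. The helper name `risk_ratio_ci` is hypothetical. With unrounded intermediate values the limits come out at about 0.614 and 0.961, which round to the 0.62 and 0.96 shown above.

```python
import math

def risk_ratio_ci(d1, n1, d0, n0, z=1.96):
    """Risk ratio with a 95% CI computed on the log scale and transformed back."""
    rr = (d1 / n1) / (d0 / n0)
    se_log = math.sqrt(1 / d1 - 1 / n1 + 1 / d0 - 1 / n0)
    log_rr = math.log(rr)
    lower = math.exp(log_rr - z * se_log)
    upper = math.exp(log_rr + z * se_log)
    return rr, se_log, lower, upper

# TBM trial: 87/274 deaths on dexamethasone, 112/271 on placebo.
rr, se_log, lower, upper = risk_ratio_ci(87, 274, 112, 271)
print(f"RR = {rr:.2f}, s.e.(log RR) = {se_log:.2f}, 95% CI = {lower:.3f} to {upper:.3f}")
```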


Using Stata

. csi 87 112 187 159

                 |   Exposed   Unexposed  |      Total
-----------------+------------------------+------------
           Cases |        87         112  |        199
        Noncases |       187         159  |        346
-----------------+------------------------+------------
           Total |       274         271  |        545
                 |                        |
            Risk |  .3175182    .4132841  |   .3651376
                 |                        |
                 |      Point estimate    |    [95% Conf. Interval]
                 |------------------------+------------------------
 Risk difference |        -.0957659       |   -.1762352   -.0152966
      Risk ratio |         .7682808       |    .6139856    .9613505
 Prev. frac. ex. |         .2317192       |    .0386495    .3860144
 Prev. frac. pop |         .1164974       |
                 +-------------------------------------------------
                               chi2(1) =     5.39  Pr>chi2 = 0.0202

Remember the warning about how the table is presented: Stata requires the outcome in rows and the exposure in columns.

Results are close to those obtained by hand.

Interpretation:

Dexamethasone was associated with an estimated 23% decrease in the risk of death during the 9 months after the start of treatment (estimated RR = 0.77).

We are 95% confident that the population risk ratio lies between 0.62 (decreased risk of 38%) and 0.96 (decreased risk of 4%).


95% confidence interval for Odds ratio (OR)

• Calculate the logarithm of OR (log_e OR) and its standard error.

Woolf’s formula:

$$\text{s.e.}(\log_e OR) = \sqrt{\frac{1}{d_1} + \frac{1}{h_1} + \frac{1}{d_0} + \frac{1}{h_0}}$$

95% CI for logarithm of OR:
Upper limit = log_e OR + 1.96 × s.e.(log_e OR)
Lower limit = log_e OR − 1.96 × s.e.(log_e OR)

95% CI for odds ratio (OR):
Upper limit = exp (upper limit of CI for log_e OR)
Lower limit = exp (lower limit of CI for log_e OR)

Odds ratio = (d1/h1) / (d0/h0) = 0.66

log_e OR = log_e(0.66) = -0.42

$$\text{s.e.}(\log_e OR) = \sqrt{\frac{1}{87} + \frac{1}{187} + \frac{1}{112} + \frac{1}{159}} = 0.18$$

95% CI for log_e OR: -0.77 up to -0.07

95% CI for OR: exp(-0.77) up to exp(-0.07) = 0.46 up to 0.93
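A minimal Python sketch of the Woolf calculation, using a hypothetical helper `odds_ratio_ci_woolf`. With unrounded intermediate values the interval is roughly 0.465 to 0.938, slightly different in the second decimal from the hand calculation above, which rounds log_e OR and its standard error before exponentiating.

```python
import math

def odds_ratio_ci_woolf(d1, h1, d0, h0, z=1.96):
    """Odds ratio with a 95% CI based on Woolf's s.e. of log(OR)."""
    odds_ratio = (d1 / h1) / (d0 / h0)
    se_log = math.sqrt(1 / d1 + 1 / h1 + 1 / d0 + 1 / h0)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log)
    upper = math.exp(log_or + z * se_log)
    return odds_ratio, se_log, lower, upper

# TBM trial counts: deaths and survivors in each arm.
odds_ratio, se_log, lower, upper = odds_ratio_ci_woolf(87, 187, 112, 159)
print(f"OR = {odds_ratio:.2f}, s.e.(log OR) = {se_log:.2f}, "
      f"95% CI = {lower:.3f} to {upper:.3f}")
```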


Using Stata

. csi 87 112 187 159, or

                 |   Exposed   Unexposed  |      Total
-----------------+------------------------+------------
           Cases |        87         112  |        199
        Noncases |       187         159  |        346
-----------------+------------------------+------------
           Total |       274         271  |        545
                 |                        |
            Risk |  .3175182    .4132841  |   .3651376
                 |                        |
                 |      Point estimate    |    [95% Conf. Interval]
                 |------------------------+------------------------
 Risk difference |        -.0957659       |   -.1762352   -.0152966
      Risk ratio |         .7682808       |    .6139856    .9613505
 Prev. frac. ex. |         .2317192       |    .0386495    .3860144
 Prev. frac. pop |         .1164974       |
      Odds ratio |         .6604756       |    .4652544     .937623 (Cornfield)
                 +-------------------------------------------------
                               chi2(1) =     5.39  Pr>chi2 = 0.0202

For the OR, by default Stata uses Cornfield’s formula for the confidence interval. You can request the Woolf formula with: csi 87 112 187 159, or woolf

Test statistic for Risk ratio (RR) & Odds ratio (OR)

Null hypothesis: population RR = 1 or population OR = 1

• For risk ratio:

$$z = \frac{\log_e RR - \log_e 1}{\text{s.e.}(\log_e RR)} = \frac{-0.26 - 0}{0.11} = -2.4$$

2-sided p-value = 0.016

• For odds ratio:

$$z = \frac{\log_e OR - \log_e 1}{\text{s.e.}(\log_e OR)} = \frac{-0.42 - 0}{0.18} = -2.3$$

2-sided p-value = 0.021
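Both tests can be sketched in Python with only the standard library. The two-sided p-value comes from the standard normal distribution via `math.erfc`. The helper name `log_ratio_z_test` is hypothetical. Note that this works from unrounded inputs, so the risk ratio test gives z ≈ -2.30 and p ≈ 0.021 rather than the -2.4 and 0.016 obtained above from the rounded values -0.26 and 0.11. The odds ratio results agree with the slide to the precision shown.

```python
import math

def log_ratio_z_test(estimate, se_log):
    """z statistic and two-sided p-value for H0: population ratio = 1."""
    z = (math.log(estimate) - math.log(1)) / se_log
    # Two-sided normal p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# TBM trial counts.
d1, h1, d0, h0 = 87, 187, 112, 159
n1, n0 = d1 + h1, d0 + h0

rr = (d1 / n1) / (d0 / n0)
se_log_rr = math.sqrt(1 / d1 - 1 / n1 + 1 / d0 - 1 / n0)
z_rr, p_rr = log_ratio_z_test(rr, se_log_rr)

odds = (d1 / h1) / (d0 / h0)
se_log_or = math.sqrt(1 / d1 + 1 / h1 + 1 / d0 + 1 / h0)  # Woolf
z_or, p_or = log_ratio_z_test(odds, se_log_or)

print(f"RR: z = {z_rr:.2f}, two-sided p = {p_rr:.3f}")
print(f"OR: z = {z_or:.2f}, two-sided p = {p_or:.3f}")
```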

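Comparing the outcome measure of two exposure groups (groups 1 & 0)

Outcome variable – data type: Numerical
• Population parameter: µ1 − µ0
• Estimate of population parameter from sample: $\bar{x}_1 - \bar{x}_0$
• Standard error: $\text{s.e.}(\bar{x}_1 - \bar{x}_0) = \sqrt{\text{s.e.}(\bar{x}_1)^2 + \text{s.e.}(\bar{x}_0)^2}$
• 95% confidence interval for population parameter: $(\bar{x}_1 - \bar{x}_0) \pm 1.96 \times \text{s.e.}(\bar{x}_1 - \bar{x}_0)$

Outcome variable – data type: Categorical
• Population parameter: π1 − π0
• Estimate of population parameter from sample: $p_1 - p_0$
• Standard error: $\text{s.e.}(p_1 - p_0) = \sqrt{\text{s.e.}(p_1)^2 + \text{s.e.}(p_0)^2}$
• 95% confidence interval for population parameter: $(p_1 - p_0) \pm 1.96 \times \text{s.e.}(p_1 - p_0)$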

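Comparing the outcome measure of two exposure groups (groups 1 & 0)

Categorical outcome: population risk ratio
• Estimate of population parameter from sample: p1/p0
• Standard error of log_e(parameter): $\text{s.e.}(\log_e RR) = \sqrt{\frac{1}{d_1} - \frac{1}{n_1} + \frac{1}{d_0} - \frac{1}{n_0}}$
• 95% confidence interval of log_e(population parameter): $\log_e RR \pm 1.96 \times \text{s.e.}(\log_e RR)$

Categorical outcome: population odds ratio
• Estimate of population parameter from sample: (d1/h1) / (d0/h0)
• Standard error of log_e(parameter): $\text{s.e.}(\log_e OR) = \sqrt{\frac{1}{d_1} + \frac{1}{h_1} + \frac{1}{d_0} + \frac{1}{h_0}}$
• 95% confidence interval of log_e(population parameter): $\log_e OR \pm 1.96 \times \text{s.e.}(\log_e OR)$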

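Calculation of p-values for comparing two groups

Numerical outcome
• Population parameter: µ1 − µ0
• Population parameter under null hypothesis: µ1 − µ0 = 0
• Test statistic: $z = \dfrac{\bar{x}_1 - \bar{x}_0}{\text{s.e.}(\bar{x}_1 - \bar{x}_0)}$

Categorical outcome: difference in proportions
• Population parameter: π1 − π0
• Population parameter under null hypothesis: π1 − π0 = 0
• Test statistic: $z = \dfrac{p_1 - p_0}{\text{s.e.}(p_1 - p_0)}$

Categorical outcome: risk ratio
• Population parameter: population risk ratio
• Population parameter under null hypothesis: population risk ratio = 1
• Test statistic: $z = \dfrac{\log_e(RR)}{\text{s.e.}(\log_e(RR))}$

Categorical outcome: odds ratio
• Population parameter: population odds ratio
• Population parameter under null hypothesis: population odds ratio = 1
• Test statistic: $z = \dfrac{\log_e(OR)}{\text{s.e.}(\log_e(OR))}$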

Comparing the outcome measure of two exposure groups (TBM trial: dexamethasone versus placebo)

All three rows are for a categorical outcome.

Population parameter          Value under        Estimate of population          95% confidence    Two-sided
                              null hypothesis    parameter from sample           interval          p-value
Population risk difference    0                  p1 − p0 = -0.095                -0.175, -0.015    0.020
Population risk ratio         1                  p1/p0 = 0.77                    0.62, 0.96        0.016
Population odds ratio         1                  (d1/h1) / (d0/h0) = 0.66        0.46, 0.93        0.021

Using Stata – p-value calculated using Chi-squared test

. csi 87 112 187 159, or

                 |   Exposed   Unexposed  |      Total
-----------------+------------------------+------------
           Cases |        87         112  |        199
        Noncases |       187         159  |        346
-----------------+------------------------+------------
           Total |       274         271  |        545
                 |                        |
            Risk |  .3175182    .4132841  |   .3651376
                 |                        |
                 |      Point estimate    |    [95% Conf. Interval]
                 |------------------------+------------------------
 Risk difference |        -.0957659       |   -.1762352   -.0152966
      Risk ratio |         .7682808       |    .6139856    .9613505
 Prev. frac. ex. |         .2317192       |    .0386495    .3860144
 Prev. frac. pop |         .1164974       |
      Odds ratio |         .6604756       |    .4652544     .937623 (Cornfield)
                 +-------------------------------------------------
                               chi2(1) =     5.39  Pr>chi2 = 0.0202

For the OR, by default Stata uses Cornfield’s formula for the confidence interval. You can request the Woolf formula with: csi 87 112 187 159, or woolf
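The chi-squared statistic and p-value reported by Stata can be reproduced by hand. The sketch below assumes the usual Pearson chi-squared statistic for a 2×2 table without continuity correction, which matches the 5.39 shown. The function name `chi2_2x2` is hypothetical. For one degree of freedom the p-value can be obtained from the normal distribution, so only the standard `math` module is needed.

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-squared statistic (no continuity correction) for the
    2x2 table [[a, b], [c, d]], with its 1-degree-of-freedom p-value."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, chi2 is the square of a standard normal z,
    # so P(chi2_1 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Rows: dexamethasone (87 died, 187 survived), placebo (112 died, 159 survived).
chi2, p_value = chi2_2x2(87, 187, 112, 159)
print(f"chi2(1) = {chi2:.2f}, Pr>chi2 = {p_value:.4f}")
```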

Lecture 9 - Objectives

• Calculate and interpret the measures of association and their confidence intervals and test statistics
– Risk difference
– Risk ratio
– Odds ratio

Thank You
