introduction to risk factors & measures of effect meg mccarron, cdc

Introduction to Risk Factors & Measures of Effect

Meg McCarron, CDC


Introduction to Risk Analysis

What is a risk analysis?

• The analysis of an association between a variable (e.g. underlying condition) and an outcome (e.g. death)

• Why do risk analysis?

• The probability of an outcome is often dependent on the interplay between a variety of factors

• Follow up on suggested associations observed in descriptive analysis (e.g. the elderly appear to die more frequently than healthy young adults; a risk analysis might tell you whether or not that is a true observation)

• Determine the severity of risk

• Identify significant risk factors

• Using this type of analysis we can measure risk ratio (RR), odds ratio (OR)

What is a risk factor?

A risk factor is a factor that is associated with increased chance of getting a disease.

In epidemiological terms: A risk factor is a variable (determinant) associated with an increased risk of disease or infection (outcome).

Example: Obesity (determinant/exposure) is associated with increased risk of heart attack (outcome)

When we measure risk factors we assess:
• Strength
• Direction
• Shape


Risk factors in SARI surveillance

• Information about a number of potential risk factors and outcomes is often recorded
  • e.g. Outcomes: death, influenza status
  • Risk factors: age, co-morbid conditions

• Surveillance data can be analyzed to increase the understanding of the association of risk factors with severe outcomes

• Surveillance data describing exposures allows analysis of associations without expensive in-depth studies


Is a risk factor the cause of a disease?

• Risk factors are correlational and not necessarily causal
• Correlation does not imply causation
• The statistical methods used do not consider the direction of effects
• For an effect to be causal the exposure must have occurred before the outcome
• e.g. young age does not cause measles (Morbillivirus causes measles), but young people are at greater risk because they are less likely to have developed immunity due to previous exposure or vaccination


The Correlation-Causation Problem

Somalia has many pirates, but low carbon emissions

How are risk factors/disease determinants identified?

• Individual-level data
• Two key variables
  • Outcome: e.g. influenza
  • Exposure: e.g. vaccination
• Should consider multiple risk factors
• Epidemiological study designs used to identify risk factors
  • Case-control
  • Cohort
• Surveillance data may approximate a cohort study
• Biological plausibility, e.g. age and influenza infection
• Exposure (risk factor) must occur prior to outcome (disease)

Types of variables

• Continuous
  • E.g. Age
• Categorical variables
  • Binary: e.g. gender, vaccination status
  • Ordinal: e.g. age group, socioeconomic status (SES)
  • Nominal/Categorical: e.g. geographic region
• Count
  • E.g. number of ILI symptoms

How are risk factors/disease determinants identified?

Clinical and epidemiological comparison of hospitalized SARI patients with and without laboratory-confirmed influenza, week 40/20xx to (current week)/20xx, Country X (NOTE: Numbers in table are not real and for example only)

Characteristics | Percent of influenza-negative SARI hospitalizations with selected demographic and epidemiological characteristics | Percent of SARI hospitalizations confirmed as influenza with selected demographic and epidemiological characteristics
Sex | Information available for N = 100 | Information available for N = 50
Male | 54/100 (54%) | 27/50 (54%)
Female | 46/100 (46%) | 23/50 (46%)
Sex unknown | 0 | 0
Chronic Medical Illnesses | Information available for N = 98 | Information available for N = 48
Number of cases with at least one of the chronic medical illnesses listed below * | 30/98 (31%) | 28/48 (58%)
Chronic respiratory disease | 15/98 (15%) | 20/48 (42%)
Asthma | 15/98 (15%) | 10/48 (21%)
Diabetes | 11/98 (11%) | 11/48 (23%)
Chronic cardiac disease | 5/98 (5%) | 5/48 (10%)
Chronic renal disease | 3/98 (3%) | 3/48 (6%)
Chronic liver disease | 4/98 (4%) | 4/48 (8%)
Chronic neurological impairment | 7/98 (7%) | 6/48 (13%)
Immune-compromised | 0/98 (0%) | 1/48 (2%)
Number of cases without any of the above chronic medical illnesses | 68/98 (69%) | 20/48 (42%)
Unknown if risk factors present | N=2 | N=2
Pregnancy status | Information available for N = 50 women | Information available for N = 23 women
Pregnancy in any trimester | 11/50 (22%) | 8/23 (35%)
Not pregnant | 39/50 (78%) | 15/23 (65%)
Pregnancy status unknown | N=0 | N=0
Obesity (or other conditions as determined by national priorities) | Information available for N = 90 | Information available for N = 35
Obese (BMI>30 or judged obese clinically) | 25/90 (28%) | 15/35 (43%)
Not obese (BMI<30 or not clinically judged obese) | 65/90 (72%) | 20/35 (57%)
Obesity status unknown | 10 | 15
Age-groups (years) | Information available for N = 100 | Information available for N = 48
0-1 | 40/100 (40%) | 10/48 (21%)
2-4 | 25/100 (25%) | 8/48 (17%)
5-14 | 10/100 (10%) | 10/48 (21%)
15-29 | 5/100 (5%) | 11/48 (23%)
30-64 | 5/100 (5%) | 8/48 (17%)
65+ | 15/100 (15%) | 1/48 (2%)
Age unknown | N=0 | N=2
Vaccination Status | Information available for N = 98 | Information available for N = 40
Received monovalent or trivalent vaccine during the current influenza season | 40/98 (41%) | 2/40 (5%)
Did not receive monovalent or trivalent vaccine during the current influenza season | 58/98 (59%) | 38/40 (95%)
Vaccination status unknown | N=2 | N=10
Oseltamivir/zanamivir (Tamiflu/Relenza) Use | Information available for N = 100 | Information available for N = 44
Received oseltamivir/zanamivir within 48 hours of symptom onset | 10/100 (10%) | 8/44 (18%)
Did not receive oseltamivir/zanamivir within 48 hours of symptom onset | 90/100 (90%) | 36/44 (82%)
Oseltamivir use unknown | N=0 | N=6
Median days from symptom onset to hospital admission | 4.0 days | 4.5 days


How are risk factors/disease determinants identified? (continued)

Clinical and epidemiological description of hospitalized SARI patients with laboratory-confirmed influenza, by outcome status, year x to year y, Country X (NOTE: Numbers in table are not real and for example only)

Hospitalised SARI cases with laboratory-confirmed influenza

Characteristics | Percent of hospitalized (non-ICU/non-severe) cases with selected demographic and epidemiological characteristics | Percent of severe (severe outcome/or died) cases with selected demographic and epidemiological characteristics
Sex | Information available for N = 100 | Information available for N = 30
Male | 54/100 (54%) | 15/30 (50%)
Female | 46/100 (46%) | 15/30 (50%)
Sex unknown | 0 | 0
Chronic Medical Illnesses | Information available for N = 98 | Information available for N = 28
Number of cases with at least one of the chronic medical illnesses listed below * | 30/98 (31%) | 19/28 (68%)
Chronic respiratory disease | 25/98 (25%) | 20/28 (71%)
Asthma | 15/98 (15%) | 4/28 (14%)
Diabetes | 11/98 (11%) | 54/28 (23%)
Chronic cardiac disease | 5/98 (5%) | 5/28 (18%)
Chronic renal disease | 3/98 (3%) | 3/28 (11%)
Chronic liver disease | 0/98 (0%) | 4/28 (14%)
Chronic neurological impairment | 3/98 (3%) | 7/28 (25%)
Immune-compromised | 0/98 (0%) | 1/28 (4%)
Number of cases without any of the above chronic medical illnesses | 68/98 (69%) | 9/28 (32%)
Unknown if risk factors present | N=2 | N=2
Pregnancy status | Information available for N = 50 women | Information available for N = 15 women
Pregnancy in any trimester | 11/50 (22%) | 10/15 (67%)
Not pregnant | 39/50 (78%) | 5/15 (33%)
Pregnancy status unknown | N=0 | N=0
Obesity (or other conditions as determined by national priorities) | Information available for N = 90 | Information available for N = 28
Obese (BMI>30 or judged obese clinically) | 23/90 (26%) | 19/28 (68%)
Not obese (BMI<30 or not clinically judged obese) | 66/90 (73%) | 9/28 (32%)
Obesity status unknown | 10 | 2
Age-groups (years) | Information available for N = 100 | Information available for N = 30
0-1 | 35/100 (35%) | 5/30 (17%)
2-4 | 30/100 (30%) | 2/30 (7%)
5-14 | 10/100 (10%) | 5/30 (17%)
15-29 | 4/100 (4%) | 3/30 (10%)
30-64 | 6/100 (6%) | 10/30 (33%)
65+ | 15/100 (15%) | 5/30 (17%)
Age unknown | N=0 | N=0
Vaccination Status | Information available for N = 98 | Information available for N = 30
Received monovalent or trivalent vaccine during the current influenza season | 20/98 (20%) | 2/30 (7%)
Did not receive monovalent or trivalent vaccine during the current influenza season | 78/98 (80%) | 28/30 (93%)
Vaccination status unknown | N=2 | N=0
Oseltamivir/zanamivir (Tamiflu/Relenza) Use | Information available for N = 100 | Information available for N = 27
Received oseltamivir/zanamivir within 48 hours of symptom onset | 15/100 (15%) | 2/27 (7%)
Did not receive oseltamivir/zanamivir within 48 hours of symptom onset | 85/100 (85%) | 25/27 (93%)
Oseltamivir use unknown | N=0 | N=3
Median days from symptom onset to hospital admission | 3.5 days | 7.5 days


Cohort study

• Follow people over time
• Collect data on their exposures (risks)
• Monitor their outcomes
• Compare risk of disease among exposed versus unexposed

[Figure: follow-up of participants 1 to 6 over time (axis 0 to 4); "D" marks participants who develop disease]

Example: cohort study

e.g. Risk of death among SARI admissions
• Outcome: death
• Risk factors: age, underlying conditions, influenza-positive
• Source population: all patients admitted with SARI, followed until death or discharge
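The cohort comparison can be sketched in a few lines of Python. The records below are hypothetical (they are not taken from any surveillance dataset), and the function name `risk_by_exposure` is only an illustration of how individual-level data might be summarized into a risk per exposure group and then a risk ratio.

```python
from collections import Counter

# Hypothetical individual-level records: (has_underlying_condition, died)
records = [
    (True, True), (True, False), (True, True), (True, False),
    (False, False), (False, False), (False, True), (False, False),
]


def risk_by_exposure(rows):
    """Return {exposure_status: risk of the outcome} for (exposed, outcome) pairs."""
    totals, events = Counter(), Counter()
    for exposed, outcome in rows:
        totals[exposed] += 1
        events[exposed] += int(outcome)
    return {group: events[group] / totals[group] for group in totals}


risks = risk_by_exposure(records)
# Risk ratio: risk among the exposed divided by risk among the unexposed.
risk_ratio = risks[True] / risks[False]
print(f"Risk exposed = {risks[True]:.2f}, unexposed = {risks[False]:.2f}, RR = {risk_ratio:.1f}")
```

With real surveillance data the same calculation would normally be run per risk factor, after excluding records where the exposure or outcome is unknown.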


Case control study

• Cases: people with disease
  • Deliberately over-selected
• Controls: people without disease
  • Represent exposure distribution of the source population
• Find out their exposure status
• Compare risk of exposure among diseased and non-diseased

[Figure: participants 1 to 6 plotted against time; "E" marks exposure and "D" marks disease]

Example: case-control study

Risk of influenza among vaccinated patients
• Cases: people with influenza
• Controls: people without influenza
• Outcome: influenza status
• Risk factors: vaccination status, age, underlying comorbidity


Statistical significance: is the association due to chance alone?

• A statistical test is used to assess if an association may be due to chance alone (random error)
• In statistics, a result is called statistically significant if it is unlikely to have occurred by chance alone, according to a pre-determined threshold probability, the significance level (e.g. α = 0.05).


Common statistical tests

Categorical data:
• Chi-square (χ²) test
• Fisher’s test
• McNemar’s test

Continuous data:
• T-test
• Wilcoxon rank-sum test
• ANOVA

These tests can tell if there’s a difference between groups but do not convey the size or direction of effects
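As a rough illustration of the last point, here is a minimal sketch of a Pearson chi-square test on a 2x2 table using only the Python standard library. The counts are hypothetical, the function name `chi_square_2x2` is an assumption made for this example, and no continuity correction is applied. Swapping the two rows reverses the direction of the association but leaves the test result unchanged.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: rows = exposed / unexposed, columns = outcome yes / no.
chi2, p = chi_square_2x2(20, 80, 10, 90)
print(f"chi-square = {chi2:.2f}, p = {p:.3f}")

# Same counts with the rows swapped: the association now points the other way,
# yet the test statistic and p-value are identical.
chi2_swapped, p_swapped = chi_square_2x2(10, 90, 20, 80)
print(chi2 == chi2_swapped and p == p_swapped)
```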

Common measures of association / effect

• Measure the size of an association (effect)
• Compare some measure of disease in exposed versus unexposed
• Absolute difference: Y1 - Y2
  • Risk difference
• Relative difference (ratio): Y1/Y2
  • Odds ratio
  • Risk ratio
  • Incidence rate ratio
  • Hazard ratio (survival data)
• Attributable risk
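The difference between an absolute and a relative measure can be seen with two hypothetical risks. In this sketch Y1 and Y2 are made-up values for the exposed and unexposed groups, and the function names are illustrative only.

```python
def absolute_difference(y1: float, y2: float) -> float:
    """Absolute measure: Y1 - Y2 (e.g. the risk difference)."""
    return y1 - y2


def relative_difference(y1: float, y2: float) -> float:
    """Relative measure: Y1 / Y2 (e.g. the risk ratio)."""
    return y1 / y2


# Hypothetical risks: 15% in the exposed group, 10% in the unexposed group.
y1, y2 = 0.15, 0.10
rd = absolute_difference(y1, y2)
rr = relative_difference(y1, y2)
print(f"Risk difference = {rd:.2f} (5 percentage points)")
print(f"Risk ratio = {rr:.1f} (risk is 1.5 times as high)")
```

The same pair of risks therefore reads as "5 percentage points higher" on the absolute scale and "50% higher" on the relative scale.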


Odds ratios

• Most common measure of association used in epidemiology
• Binary outcome
• Odds Ratio (OR): compares the odds of exposure among cases (people with disease) with controls (people without disease)
• Odds: ratio of the probability (p) of an event occurring versus it not occurring

Odds = p/(1-p)
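A small sketch of the odds formula, with hypothetical probabilities, shows how odds and probability relate. The helper names `odds` and `probability_from_odds` are assumptions for this example.

```python
def odds(p: float) -> float:
    """Convert a probability into odds: p / (1 - p)."""
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1 - p)


def probability_from_odds(odds_value: float) -> float:
    """Convert odds back into a probability: odds / (1 + odds)."""
    return odds_value / (1 + odds_value)


for p in (0.01, 0.20, 0.50):
    print(f"p = {p:.2f} -> odds = {odds(p):.3f}")
```

For rare events (small p) the odds are close to the probability itself, while for common events the two diverge.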

Calculation of the RR & OR

Exposure | Cases | Controls
Exposed | a | b
Unexposed | c | d

OR = (a/c) / (b/d)

• OR = 1: no association
• OR < 1: negative association (reduces risk)
• OR > 1: positive association (increases risk)


Example of OR Calculations

Outcome (Influenza patients that died)

Calculation of the RR & OR

Group | Died | Alive
Flu+ | 200 (a) | 150 (b)
Flu- | 50 (c) | 100 (d)

OR = (a/c) / (b/d) = (a*d) / (b*c)

OR = (200/50) / (150/100) = 2.7

Calculation of the RR & OR

Group | Died | Alive
Female | 200 (a) | 180 (b)
Male | 98 (c) | 100 (d)

OR = (200*100) / (180*98) = 1.1
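The two calculations above can be checked with a short sketch. The function name `odds_ratio` is illustrative; the cell labels a, b, c, d follow the 2x2 layout used in the slides.

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio for a 2x2 table with rows (a, b) and (c, d): (a*d) / (b*c)."""
    if b == 0 or c == 0:
        raise ValueError("cells b and c must be non-zero")
    return (a * d) / (b * c)


# Flu+ versus Flu- (died / alive), counts from the first example.
or_flu = odds_ratio(200, 150, 50, 100)
# Female versus male (died / alive), counts from the second example.
or_sex = odds_ratio(200, 180, 98, 100)

print(f"OR (Flu+ vs Flu-) = {or_flu:.1f}")
print(f"OR (female vs male) = {or_sex:.1f}")

# The two forms of the formula agree: (a/c) / (b/d) equals (a*d) / (b*c).
print(abs((200 / 50) / (150 / 100) - or_flu) < 1e-12)
```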

Confidence intervals

• OR is a point estimate
• Confidence interval (CI) is a measure of uncertainty around your point estimate
• CI is based on the standard error (SE)
  • Smaller SE = narrower confidence interval
• If CI includes 1, then not statistically significant
  • A wide CI is also a problem
• Usually use 95%CI

Exposure | Cases | Controls
Exposed | a | b
Unexposed | c | d

SE of ln(OR) = √(1/a + 1/b + 1/c + 1/d)

95%CI = e^(ln(OR) ± 1.96 * SE)


• OR=1.1
• 95%CI=1.01,1.4

Confidence intervals

e.g. 2007 Victorian surveillance data, adults, influenza B

Vaccination status | Flu+ | Flu-
Vaccinated | 44 (a) | 95 (b)
Unvaccinated | 205 (c) | 260 (d)

OR = (44/205) / (95/260) = 0.59

ln(OR) = ln(0.59) = -0.53

SE = √(1/44 + 1/95 + 1/205 + 1/260) = 0.205

95%CI upper limit (UL) = e^(-0.53 + 1.96*0.205) = e^(-0.13) = 0.88

95%CI lower limit (LL) = e^(-0.53 - 1.96*0.205) = e^(-0.93) = 0.39
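The same interval can be computed directly from the four cell counts. This is a minimal sketch of the standard log-scale (Wald) interval; the function name `odds_ratio_ci` is an assumption for this example. Run on the Victorian vaccination table, it reproduces the crude estimate quoted for these data, OR 0.59 with 95% CI 0.39 to 0.88.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Return (OR, lower, upper) using the SE of ln(OR) for a 2x2 table."""
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of ln(OR)
    lower = math.exp(math.log(or_) - z * se)
    upper = math.exp(math.log(or_) + z * se)
    return or_, lower, upper


# Vaccinated: 44 Flu+, 95 Flu-; unvaccinated: 205 Flu+, 260 Flu-.
or_, lower, upper = odds_ratio_ci(44, 95, 205, 260)
print(f"OR = {or_:.2f} (95% CI {lower:.2f}, {upper:.2f})")
```

Because the interval is symmetric on the log scale, it is not symmetric around the OR itself: the upper limit lies further from 0.59 than the lower limit does.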

Interpreting Results

• Size of the CI is an indicator of uncertainty
  • Wide CI = more uncertainty
  • Narrow CI = less uncertainty
• If CI includes 1, then not statistically significant
  • The observed effect could just be due to chance
• P-values are often used to convey statistical significance
  • The p-value for an OR is calculated from a chi-squared test
  • The p-value reference for a 95%CI is 5% or 0.05

P-values

The p-values help us to determine whether the difference between the two groups might be due to random variation

CI and p-values
• 95%CI=1.0, 2.3 indicates that the two-sided p-value for no association is about 0.05.
• 95%CI=0.9, 2.4 suggests p>0.05
• 95%CI=0.9, 2.4 indicates that the data are compatible with a two-fold higher risk (i.e. the interval includes 2)
• The p-value is a measure of the compatibility of the data and the null hypothesis
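The link between a 95%CI and the p-value can be illustrated by working backwards from the interval. This sketch assumes the interval was built symmetrically on the log scale (a Wald interval for a ratio measure), which the slides do not state explicitly; the function name `p_value_from_ci` is hypothetical.

```python
import math


def p_value_from_ci(lower: float, upper: float, z_crit: float = 1.96) -> float:
    """Approximate two-sided p-value for 'ratio = 1' recovered from a 95% CI."""
    log_estimate = (math.log(lower) + math.log(upper)) / 2  # midpoint on the log scale
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = log_estimate / se
    # Two-sided tail probability of a standard normal variable.
    return math.erfc(abs(z) / math.sqrt(2))


p_touching = p_value_from_ci(1.0, 2.3)   # lower limit sits exactly on 1
p_crossing = p_value_from_ci(0.9, 2.4)   # interval includes 1
print(f"95%CI 1.0, 2.3 -> p = {p_touching:.3f}")
print(f"95%CI 0.9, 2.4 -> p = {p_crossing:.3f}")
```

An interval whose limit just touches 1 corresponds to p of about 0.05, and an interval that crosses 1 corresponds to p above 0.05.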

Implementation of a statistical test

• We start with a research hypothesis
• State the relevant null (H0) and alternative (HA) hypotheses
  • H0: no effect (any observed effect is due to chance)
  • HA: an effect exists
• Decide which test is appropriate (see earlier list)
• Compute the test statistic and the associated p- (probability) value
• Compare the computed p-value to a reference p-value (usually 0.05) to decide whether to reject the null hypothesis
• If the p-value of the test is lower than the reference value, H0 is rejected
  • The effect is not likely to be due to chance

Example: Implementation of a statistical test

Influenza prevalence in hospitalized patients:
• Non-pregnant women: 100/1000 = 10%
• Pregnant women: 30/200 = 15%

Question: Is the influenza prevalence in hospitalized pregnant women different from that in non-pregnant women?

Hypothesis
• H0: p1 = p2 ; p1 - p2 = 0
• HA: p1 ≠ p2 ; p1 - p2 ≠ 0
• Reject H0 if p (test) is < α = 0.05

Test results:
• Z (test statistic): 2.08
• p value: 0.037
• 0.037 < 0.05 → Reject H0
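The slides do not say which test produced these results. As a sketch, assuming a pooled two-proportion z-test, the comparison of 30/200 with 100/1000 can be run with the standard library alone; the function name `two_proportion_z_test` is illustrative.

```python
import math


def two_proportion_z_test(x1: int, n1: int, x2: int, n2: int):
    """Pooled two-sided z-test for H0: p1 = p2. Returns (z, p_value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # common proportion under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value


# Pregnant women: 30 of 200 influenza-positive; non-pregnant women: 100 of 1000.
z, p = two_proportion_z_test(30, 200, 100, 1000)
alpha = 0.05
print(f"z = {z:.2f}, p = {p:.3f}")
print("Reject H0" if p < alpha else "Do not reject H0")
```

Under that assumption the p-value falls just below 0.05, so H0 is rejected, in line with the conclusion above.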

Example: factors associated with influenza-positive diagnosis among ILI patients

Variable | OR | p-value | 95% CI lower limit | 95% CI upper limit
Vaccinated | 0.54 | 0.02 | 0.32 | 0.89
Underlying condition | 1.20 | 0.47 | 0.72 | 2.00
Epi week | 1.04 | 0.01 | 1.01 | 1.08
Age group: <20 | ref | | |
Age group: 20-64 | 0.76 | 0.17 | 0.51 | 1.13
Age group: 65+ | 1.09 | 0.85 | 0.45 | 2.62

Adjusted OR=0.54 (95%CI=0.32,0.89)

Crude OR=0.59 (95%CI=0.39,0.88)

Summary

• A risk factor is a variable which increases (or decreases) the risk of an outcome
• We can assess the influence of risk factors using individual-level data from case-control and cohort studies
• The size of the effect can be measured by effect measures
  • Most common effect measure is the odds ratio
• The uncertainty of the effect can be measured by the confidence interval
• Whether an effect may be due to random error is indicated by the p-value and tested using a statistical test
• Multivariable methods can tell us how much influence one risk factor has compared with others
