
Some Statistics and Epidemiology for the AKT

We’ll try to cover

1. General tips

2. Types of study, the ‘evidence hierarchy’, what you get from each study type

3. Looking at numbers: calculations and data representation

4. Screening: qualities of test, more calculations

Scope

• What I learnt

• Hopefully will give an approach for most questions

• Not everything! (See AKT content guide)

• PasTest / OnExamination

Some General Tips

• A few key formulae - write them down!

• Stats more than anything - ‘RTQ2’

• Flag and return

• If presented with a random chart… (you don’t have to fully understand a graph to get the information you need)

• Flag and return… (don’t stress)

2. Looking at Evidence

“It is a truth universally acknowledged that a medical intervention justified by observational data must be in want of verification through a randomised controlled trial”

BMJ 2003;327:1459-61

Qualitative vs. Quantitative

• Focus on quantitative here

• Qualitative:
– Generate ‘informed assertions’ or hypotheses
– May inform planning of quantitative research
– Focus groups, interviews, questionnaires

Example questions?

Quantitative study types

• Try to be able to identify type of study from a description of the study

• Therefore know what it can and can’t tell you

• For any study, ask:
– What is the exposure?
– What is the outcome?
– What type of study is it?

Rates of liver cancer and HBV seroprevalence, by country.

Population Studies

• Population studies
– Examples?
– Uses
– Limitations

• Information from groups, not individuals

• Correlations

• Generate hypotheses, can’t test them

• Confounding

• The Ecological Fallacy

The Ecological Fallacy

US 2004 elections:
– Republicans won in the poorest 15 states
– Democrats won in 9 of the 11 wealthiest states
– So, are wealthy people more likely to vote for the Democrats?

NO! The incorrect assumption is that individuals from wealthy states are more likely to be wealthy themselves. Drawing conclusions about individuals from group-level data = the ecological fallacy

Individual Studies

• Case reports, case series, cross sectional

• Generate hypotheses, can’t test them

• Cross sectional study: exposure and outcome measured at the same time
– No temporal relationship
– e.g. BMI (outcome) and time spent exercising (exposure)

Analytical Studies

• Generally the most useful for us: they can answer questions (not just ask them)

• Observational
– No control over exposure

• Interventional
– Control over exposure

Observational Studies

• A type of analytical study

• Consider two types…

Women diagnosed with DVT are six times more likely to have a history of oral contraceptive usage

Case-control studies

• It’s all in the title…

• Take cases, find matched controls

• Exposure to risk factor is determined retrospectively

Case-control studies

Advantages
• Rare diseases
• Cheap and quick

Limitations
• Need matching
• Confounding
• Recall bias

Can calculate odds ratios

A study looking at new diagnoses of lung cancer in smokers vs. non-smokers.

Cohort studies

A type of observational, analytical study:
• Determine exposure e.g. measure tobacco use
• Follow up cohort over a set period of time e.g. 10 years
• Measure outcome e.g. new diagnoses of lung cancer

Cohort studies

Advantages
• Generate incidence
• Less danger of bias by poor selection of controls
• No recall bias
• Multiple exposures

Limitations
• Expensive
• Time consuming
• Loss to follow-up
• Rare conditions?

Can calculate relative risk

Rivaroxaban reduces the risk of ischaemic stroke over five years in patients with atrial fibrillation compared with placebo.

Interventional Studies

Basically a cohort study, but…

The difference is that the investigator intervenes:
• The investigator determines exposure
• Individuals are followed up for a period of time to determine the outcome.

Need to consider:
• Double blind versus single blind
• Placebo control
• Randomisation

RCT

Randomisation:
• Subjects are randomised into groups (drug vs. placebo, or drug A vs. drug B…)
• The aim is that the groups are identical in every way apart from exposure to the drug, to minimise confounders
• Minimise selection bias
• (Also crossover - added safeguard)

Control:
• The control group should be identical
• The control may receive a placebo intervention, which ideally looks and tastes the same

Blinding:
• The investigators don’t know who has received what when collecting and analysing data
• The subjects don’t know whether they have received placebo or not

RCTs

Advantages
• Minimise confounding and bias
• ‘Gold standard’

Limitations
• Extremely expensive
• Side effects may unblind
• Ethics

Can calculate relative risk (remember, an RCT is essentially a cohort study)

Meta-analysis

Not to be confused with a systematic review.

A meta-analysis:
• Aggregates data from multiple trials
• Complex analysis - probably less accessible…but generally analysed by people with less of an interest in deceiving you.
• ‘Gold standard’

Answering Questions

Using EBM in practice:

• Need a specific question:
– P (patient or population)
– I (intervention)
– C (comparison)
– O (outcome)

• Need to use a systematic means of searching for available evidence

• Need to be able to appraise the evidence
– Some useful tools if you have the time

• Or, get someone else to do it:
– CKS, Cochrane, BMJ Clinical Evidence

Answering Questions: The Evidence Hierarchy

Hierarchy of quality of evidence for quantitative studies
• I-1: Systematic review of RCTs
• I-2: RCT
• II-1: Cohort
• II-2: Case-control
• II-3: Uncontrolled experiment
• III: Expert committees, respected authorities
• IV: ‘Somebody once told me’, The Daily Mail, case reports, case series

BMJ 2003;327:1459-61

“Only two options exist. The first is that we accept that, under exceptional circumstances, common sense might be applied when considering the potential risks and benefits of interventions. The second is that we continue our quest for the holy grail of exclusively evidence based interventions and preclude parachute use outside the context of a properly conducted trial. The dependency we have created in our population may make recruitment of the unenlightened masses to such a trial difficult. If so, we feel assured that those who advocate evidence based medicine and criticise use of interventions that lack an evidence base will not hesitate to demonstrate their commitment by volunteering for a double blind, randomised, placebo controlled, crossover trial.”

3. Looking at Numbers

Looking at numbers

• Definitions

• Comparing risk between populations

• Looking at data: some other points

• Draw a table if you can!

A few definitions

• Rate
– Denominator = person-time at risk
– e.g. cases / 1000 at risk population / year

• Incidence

• Prevalence
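A minimal sketch of how these three measures differ, using the standard definitions (incidence counts new cases arising over a period among those at risk; prevalence counts everyone who has the condition at a point in time). The town, its numbers and the function names are all made up for illustration.

```python
def incidence_rate(new_cases, person_years_at_risk, per=1000):
    """New cases per `per` person-years at risk."""
    return new_cases / person_years_at_risk * per


def point_prevalence(existing_cases, population):
    """Proportion of the whole population who have the condition right now."""
    return existing_cases / population


# Hypothetical town of 10,600 people: 600 already have diabetes,
# the other 10,000 are at risk and are followed for one year,
# during which 50 of them are newly diagnosed.
rate = incidence_rate(new_cases=50, person_years_at_risk=10_000)
prev = point_prevalence(existing_cases=600, population=10_600)

print(f"Incidence rate: {rate:.1f} per 1000 person-years")  # 5.0
print(f"Prevalence at the start: {prev:.1%}")               # 5.7%
```

Incidence tells you how fast new cases appear; prevalence tells you how much disease is around, which is what matters for predictive values later on.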

Comparing risk between populations

• Quantifying risk is basis of measuring effect of an exposure or intervention

• Calculations depend on type of study:
– Risk: cohort
– Odds: case-control

• Once you have a measure of risk, you can compare between exposed and unexposed
– Risk ratio or Odds ratio

Risk

• Risk = proportion

• 6 out of 10 medical students are female

• The risk of being female is 0.6 if you are a medical student

Risk = affected / total
(in this case the exposure is being a medical student, the outcome is being female)

• In a case-control study, you don’t know the total exposed, so you can’t calculate risk.

• In a cohort study, you determine exposure at the start, so you can

Risk

• Remember, in a cohort study, you select according to exposure, and follow up to determine outcome

• Design a cohort study to look at car accidents in yellow and black cars
– Suggest that yellow cars prevent accidents
– Let’s suppose there are only two colours of car for simplicity

Risk

• Exposure: car being yellow

• Outcome: accident free after 1 year

• Plan: determine exposure at the start (record colour), follow up for one year and monitor for car accidents

• Draw a table
– 100 yellow cars, 10 had accidents
– 100 black cars, 20 had accidents

                           Outcome: no accidents at 1yr
                           Yes      No       Totals
Exposure:          Yes     90       10       100
being yellow       No      80       20       100
                   Totals  170      30       200

Risk

• We can calculate the risk in those exposed (yellow car) and those not exposed (black)

• Then we can calculate a ratio of the risks in the two groups

• Work out the formula…

Relative Risk

• The risk of being accident free at one year if you have a yellow car relative to if you have a black one = a ratio of the risks

RR = risk (exposed) / risk (unexposed)

• Or, how much more likely are you to be accident free if you drive a yellow car vs. a black one

• So work out the relative risk using our table

• If RR = 1, there is no difference in risk.


Relative Risk

• Risk in exposed = 90/100 = 0.9

• Risk in unexposed = 80/100 = 0.8

• Relative risk = 0.9/0.8 = 1.125

NB from these data you can also calculate an incidence, as mentioned earlier
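The same arithmetic as a short sketch, in case it helps to see the two formulae side by side. The function names are my own; the numbers are the yellow/black car figures.

```python
def risk(affected, total):
    """Risk = affected / total (a proportion between 0 and 1)."""
    return affected / total


def relative_risk(exposed_affected, exposed_total,
                  unexposed_affected, unexposed_total):
    """RR = risk (exposed) / risk (unexposed)."""
    return (risk(exposed_affected, exposed_total)
            / risk(unexposed_affected, unexposed_total))


# Car example: the 'outcome' being counted is accident free at 1 year.
risk_yellow = risk(90, 100)   # exposed
risk_black = risk(80, 100)    # unexposed
car_rr = relative_risk(90, 100, 80, 100)

print(risk_yellow, risk_black, round(car_rr, 3))  # 0.9 0.8 1.125
```

An RR above 1 means the outcome is more common in the exposed group; whether that is good or bad depends on whether the outcome is something you want.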

Absolute Risk Reduction

• ARR is a measure of the reduction in risk which an exposure causes

• This is basically what we want to know to decide whether something is effective
– Does it make a difference, and if so how much?

• The main use of ARR is to calculate the NNT

• Could you work out the formula knowing this information?

Absolute Risk Reduction

• Reduction in risk caused by an exposure

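ARR = Risk (exposed) - Risk (unexposed)

(Here the outcome is a good one, being accident free, so the exposed group has the higher ‘risk’. For a harmful outcome that the exposure prevents, subtract the other way round, unexposed minus exposed, so the reduction comes out positive.)

• In our example:
= Risk of accident free (yellow) - Risk of accident free (black)
= 0.9 - 0.8
= 0.1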

Number Needed to Treat

• NNT (or NNH - harm) is a useful intuitive number

• Just learn the formula:
NNT = 1/ARR

• In this case 1/0.1 = 10
– So you would need to paint 10 cars yellow to prevent one accident.
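A small sketch of ARR and NNT for the car example. Two things here are my own choices rather than anything from the slides: the function names, and rounding the NNT up to the next whole number (a common convention). Exact fractions are used because plain floating point gives 0.9 - 0.8 as slightly less than 0.1, which would push a rounded-up NNT to 11.

```python
from fractions import Fraction
import math


def arr_from_counts(exposed_affected, exposed_total,
                    unexposed_affected, unexposed_total):
    """Absolute difference in risk between the groups, as an exact fraction."""
    risk_exposed = Fraction(exposed_affected, exposed_total)
    risk_unexposed = Fraction(unexposed_affected, unexposed_total)
    return abs(risk_exposed - risk_unexposed)


def nnt_from_arr(arr):
    """NNT = 1 / ARR, rounded up to the next whole number."""
    return math.ceil(1 / arr)


car_arr = arr_from_counts(90, 100, 80, 100)
car_nnt = nnt_from_arr(car_arr)

print(float(car_arr), car_nnt)  # 0.1 10
```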

Another example

• A trial looking at effectiveness of nicotine gum on smoking cessation

• 6328 smokers given gum, 1149 stopped

• 8380 smokers given placebo, 893 stopped

• Calculate the NNT. You are allowed a calculator for this one!
– Table
– Write out formulae

                           Outcome: smoking cessation
                           Yes      No       Totals
Exposure:          Yes     1149     5179     6328
nicotine gum       No      893      7487     8380
                   Totals  2042     12666    14708
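If you want to check your answer, here is one way to work the nicotine gum numbers through. The variable names are mine, and rounding the NNT up to a whole person is a convention I am assuming rather than something stated above.

```python
from fractions import Fraction
import math

gum_quit, gum_total = 1149, 6328
placebo_quit, placebo_total = 893, 8380

# Risk = affected / total, for each arm of the trial
risk_gum = Fraction(gum_quit, gum_total)
risk_placebo = Fraction(placebo_quit, placebo_total)

# Quitting is a good outcome, so the difference is gum minus placebo
gum_arr = risk_gum - risk_placebo
gum_nnt = math.ceil(1 / gum_arr)   # NNT = 1 / ARR, rounded up

print(f"Quit rate on gum:     {float(risk_gum):.3f}")      # 0.182
print(f"Quit rate on placebo: {float(risk_placebo):.3f}")  # 0.107
print(f"ARR: {float(gum_arr):.3f}")                        # 0.075
print(f"NNT: {gum_nnt}")                                   # 14
```

So roughly 14 smokers would need to be given gum for one extra person to stop.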

Odds

• Remember, Relative Risk needs the total exposed, which you can’t know from a case-control study

• Odds and Odds Ratio give similar information for data from case-control studies

• Note that in rare conditions the OR approximates the RR (you can work it out if you want, I wouldn’t bother)

Odds

Odds = affected / unaffected

OR = odds (exposed) / Odds (unexposed)

Odds

• Consider a study looking retrospectively at asbestos exposure in patients with mesothelioma

• 100 cases of mesothelioma, 80 recalled asbestos exposure

• 100 controls, 50 recalled asbestos exposure

• Draw a table…

                           Outcome: mesothelioma
                           Yes      No       Totals
Exposure:          Yes     80       50       130
asbestos exposure  No      20       50       70
                   Totals  100      100      200

Odds

• Odds in exposed = 80/50 = 1.6

• Odds in unexposed = 20/50 = 0.4

• Odds ratio = 1.6 / 0.4 = 4

• I don’t find OR as intuitive as RR, and I think they are less useful
– So hopefully less likely to need to calculate them, but you might!
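A short sketch of the odds ratio calculation, followed by a quick check of the point that the OR approximates the RR when the outcome is rare. The function names and the second (rare outcome) dataset are invented for illustration; only the mesothelioma numbers come from the example.

```python
def odds(affected, unaffected):
    """Odds = affected / unaffected."""
    return affected / unaffected


def odds_ratio(exp_affected, exp_unaffected, unexp_affected, unexp_unaffected):
    """OR = odds (exposed) / odds (unexposed)."""
    return odds(exp_affected, exp_unaffected) / odds(unexp_affected, unexp_unaffected)


# Mesothelioma example: exposed 80 cases / 50 controls, unexposed 20 / 50
meso_or = odds_ratio(80, 50, 20, 50)
print(round(meso_or, 2))  # 4.0

# Hypothetical cohort with a rare outcome:
# exposed 20 cases in 10,000 people, unexposed 10 cases in 10,000 people
rare_rr = (20 / 10_000) / (10 / 10_000)
rare_or = odds_ratio(20, 9_980, 10, 9_990)
print(round(rare_rr, 3), round(rare_or, 3))  # 2.0 2.002
```

When almost nobody gets the disease, "unaffected" is nearly the same as "total", so odds and risk are nearly the same number.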

Looking at numbers

• Some other useful things
– P values and confidence intervals, chance and error
– Normal distribution and skewness
– (Mean, median and mode)
– (Standard deviation, standard error)

Results Interpretation

• The point of any study is to inform us about things in the real world

• We need to consider:
– Could the results be due to chance or do they show a real effect?
– Could error have affected the results?
– Can these results be applied to the real world, i.e. are they generalisable?

Results Interpretation

• All a study can tell you is how likely or unlikely things are:
– “This study shows X causes Y”
– What we actually mean is “it is very likely that X causes Y”

• How can this be measured?
– How do we know if differences are due to chance or due to a real effect?

Significance

• A significant result is one that is unlikely to be due to chance alone
– The P value is the probability of seeing results at least as extreme as those observed if chance alone were at work (i.e. if there were no real effect)
– p < 0.05 or p < 0.01 are generally taken to indicate significance

Significance

• The null hypothesis
– This is the assumption that there is no real effect, so any difference seen is due to chance alone
– If your p-value is less than your chosen level of significance (say 0.05) then you “reject the null hypothesis”
– This means you are saying that you don’t think the results are due to chance.

Significance

• Take a finding with a p < 0.01
– This means that, if there were no real effect, results at least this extreme would turn up less than 1 time in 100
– Or, if you repeated the same experiment 100 times when there was truly no effect, you would expect a result this extreme less than once by chance.
– It does not mean there is a 99% chance that the effect is real

• As your p value is less than your chosen level of significance, you reject the null hypothesis.
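To make the idea of a p value concrete, here is a rough sketch that puts a number on the yellow/black car result. It uses a pooled two-proportion z-test, which is one common choice and my assumption here, not something the slides specify (a real analysis might use a different test and get a slightly different answer). You won’t need to do this in the exam.

```python
import math


def two_proportion_p_value(events_a, n_a, events_b, n_b):
    """Two-sided p value from a pooled two-proportion z-test."""
    p_a, p_b = events_a / n_a, events_b / n_b
    # Under the null hypothesis both groups share one underlying proportion
    pooled = (events_a + events_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # erfc(|z| / sqrt(2)) is the two-sided tail area of the standard normal
    return math.erfc(abs(z) / math.sqrt(2))


# 90/100 yellow cars vs. 80/100 black cars accident free at 1 year
p_cars = two_proportion_p_value(90, 100, 80, 100)
print(f"p = {p_cars:.3f}")  # roughly 0.048, only just under 0.05
```

With only 100 cars in each group, a difference of 0.9 vs. 0.8 is only just ‘significant’ at the 0.05 level on this test.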

Significance

• P values depend on the amount of data available; this is also what gives a study its power (its ability to detect a real effect).

• The effect of chance in a small study is greater, hence for the same real effect it would tend to have a bigger p value.

• When planning a study it is necessary to do a power calculation to work out the sample size which is likely to produce a significant result
– Let’s not go there… (get an actual statistician)

Confidence Intervals

• A confidence interval gives a range for a value of interest (such as relative risk)
– Remember, if RR = 1 that means no effect

• As with p values, a CI is calculated with a chosen significance level (usually 0.05 - giving a 95% CI)

• If the CI for a ratio (RR or OR) crosses 1, the result is not significant

• A CI gives the largest and smallest effects that are likely given the data
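For the curious, a sketch of how a 95% CI for a relative risk can be calculated, using the nicotine gum trial numbers. The log-based method below is a standard textbook approximation that I am assuming for illustration; the slides don’t specify a method and you would not be asked to do this by hand.

```python
import math


def relative_risk_ci(exposed_affected, exposed_total,
                     unexposed_affected, unexposed_total, z=1.96):
    """Relative risk with an approximate 95% CI (log method)."""
    rr = (exposed_affected / exposed_total) / (unexposed_affected / unexposed_total)
    # Standard error of ln(RR)
    se_log_rr = math.sqrt(1 / exposed_affected - 1 / exposed_total
                          + 1 / unexposed_affected - 1 / unexposed_total)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


gum_rr, gum_lower, gum_upper = relative_risk_ci(1149, 6328, 893, 8380)
crosses_one = gum_lower <= 1 <= gum_upper

# Roughly: RR = 1.70, 95% CI 1.57 to 1.85
print(f"RR = {gum_rr:.2f}, 95% CI {gum_lower:.2f} to {gum_upper:.2f}")
print("Crosses 1 (not significant)?", crosses_one)  # False
```

The whole interval sits above 1, so the result is significant at the 0.05 level, and the interval is narrow because the trial is large.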

Forest Plot

• Used in meta-analyses

• Plot confidence intervals
– Which studies produced a significant result?
– Look at the effect on the values of having extra power from aggregated data!

Fig 3: Meta-analysis of studies showing impact of opiate substitution treatment in relation to HIV transmission in people who inject drugs, among all pooled studies and studies reporting only adjusted effect estimates.

MacArthur G J et al. BMJ 2012;345:bmj.e5945

©2012 by British Medical Journal Publishing Group

Error

• Results from your data may (will) differ from ‘real life’ - this is error.
– Random
– Systematic

Error - Random

• Random error occurs because we are only using a selection to get our information, not the whole population (sampling)
– Beta-error / Type II
– Alpha-error / Type I

Error - Random

• Beta / Type II
– Conclude there is no association when there is one
– Due to a small study, not enough power, sample size too small
– The danger is that we are not aware of risk factors or treatments (also less likely to get published)

Error - Random

• Alpha / Type I
– Conclude there is an association when there isn’t one (much worse)
– Minimised by using a small P-value as our chosen level of significance.
– If the risk of harm due to a treatment was very high, we might use a smaller significance level

• Chosen significance level = risk of alpha error when there is no real effect. 0.05 = 5% risk
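A rough simulation of what a 5% risk of alpha error looks like in practice. Everything here is an illustrative assumption: two groups of 100 are drawn with exactly the same true risk (so there is no real effect), each ‘study’ is tested with a pooled two-proportion z-test, and we count how often it comes out ‘significant’ anyway.

```python
import math
import random


def z_test_p_value(events_a, n_a, events_b, n_b):
    """Two-sided p value from a pooled two-proportion z-test."""
    pooled = (events_a + events_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # every subject had the same outcome: no evidence of a difference
    z = (events_a / n_a - events_b / n_b) / se
    return math.erfc(abs(z) / math.sqrt(2))


def false_positive_rate(true_risk=0.8, n=100, studies=2000, alpha=0.05, seed=1):
    """Fraction of simulated no-effect studies that still reach p < alpha."""
    rng = random.Random(seed)
    false_positives = 0
    for _ in range(studies):
        # Both groups share the same true risk, so any difference is chance
        group_a = sum(rng.random() < true_risk for _ in range(n))
        group_b = sum(rng.random() < true_risk for _ in range(n))
        if z_test_p_value(group_a, n, group_b, n) < alpha:
            false_positives += 1
    return false_positives / studies


fp_rate = false_positive_rate()
print(f"'Significant' results with no real effect: {fp_rate:.1%}")
```

The figure should land somewhere near 5%: about one in twenty studies of a useless intervention will look like it works.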

Error - Systematic

• Selection bias:
– Esp. case-control studies (cases and controls not comparable)
– If controls are selected using criteria related to risk factors under investigation
   • Selecting lung cancer controls from resp. clinic
– Rx: good study design and control selection

Error - Systematic

• Information bias:
– Mis-classification of disease or exposure status
– Recall bias is the main example

Error - Systematic

• Confounding
– Factor which is associated both with exposure and outcome, and isn’t taken into account
   • Consider occupation and lung cancer rates
   • Smoking may be more common in some occupations
– Rx: good study design, anticipating confounders and taking them into account

Validity

• Validity: extent to which a variable measures what it is supposed to measure
– Internal: study design, inferences about study population
– External: can the results be applied to non-study patients/populations (is the study generalisable?)
– NB statistical association ≠ causality

Skewness

Positive and negative skew in describing uni-modal data
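A quick way to see skew in numbers: in positively skewed data the long tail to the right drags the mean above the median, and in negatively skewed data the reverse. The two small datasets below are invented purely to show this.

```python
import statistics

positively_skewed = [1, 2, 2, 3, 3, 3, 4, 20]         # long tail to the right
negatively_skewed = [1, 17, 18, 18, 18, 19, 19, 20]   # long tail to the left

for name, data in [("positive skew", positively_skewed),
                   ("negative skew", negatively_skewed)]:
    print(name,
          "mean:", statistics.mean(data),
          "median:", statistics.median(data),
          "mode:", statistics.mode(data))
# positive skew: mean 4.75, median 3.0, mode 3
# negative skew: mean 16.25, median 18.0, mode 18
```

This is why the median is usually the better summary for skewed data: one extreme value moves the mean a long way but barely touches the median.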

4. Screening

Screening

• Not time to cover here properly

• Learn the Wilson and Jungner criteria
– Condition
– Test
– Treatment
– Programme

• Learn about screening test evaluation

Screening Test Evaluation

Again, having a descriptive definition will help you to work out the formula:

• Sensitivity

• Specificity

• Positive Predictive Value

• Negative Predictive Value

• Likelihood Ratios


• Sensitivity and specificity: refer to test

• Predictive values: refer to population
– Depend on disease prevalence

• Likelihood ratios: refer to individuals


• My advice:

• Draw a table (surprise!)
– Consider doing this at the start of the exam

• Be able to work out the formulae from the table and your descriptive definitions

                     TRUE (actual disease status)
                     Pos.              Neg.
TEST      Pos.       TRUE POSITIVE     FALSE POSITIVE
          Neg.       FALSE NEGATIVE    TRUE NEGATIVE

Remember: ‘positive’ or ‘negative’ refers to the test result

                     TRUE (actual disease status)
                     Pos.     Neg.     Totals
TEST      Pos.       A        B        A+B
          Neg.       C        D        C+D
          Totals     A+C      B+D

Sensitivity

• How good a test is at picking up cases
– What proportion of the cases of disease are picked up by the test?
– Cases of disease = A+C
– Picked up by the test (true positives) = A
– So…

Sensitivity = A / (A+C)


Specificity

• How specific is a positive test result to cases of the disease?
– A specific test would have a small proportion of false positives (B), which means it would have a high proportion of true negatives (D)…
– Out of all of those without the disease (B+D)
   • Remember, specificity is independent of prevalence
– So…

Specificity = D / (B+D)


PPV

• Chance that a test +ve actually has the disease
– Test +ve who actually have disease (true positives) = A
– All test +ves = A+B
– So…

PPV = A / (A+B)

Remember this depends on prevalence, which has important implications for screening


NPV

• Chance that a test -ve really doesn’t have the disease
– Test negatives who really don’t have disease (true negatives) = D
– All test -ves = C+D
– So…

NPV = D / (C+D)

Remember this depends on prevalence, which has important implications for screening
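A sketch pulling the four formulae together from the A/B/C/D table, and showing why prevalence matters for predictive values. The test characteristics (90% sensitive, 90% specific), the two populations and the function names are hypothetical, chosen only to illustrate the point.

```python
def screening_measures(a, b, c, d):
    """a = true positives, b = false positives,
    c = false negatives, d = true negatives."""
    return {
        "sensitivity": a / (a + c),
        "specificity": d / (b + d),
        "ppv": a / (a + b),
        "npv": d / (c + d),
    }


def table_from_prevalence(population, prevalence, sensitivity, specificity):
    """Build the A, B, C, D cells for a hypothetical screened population."""
    diseased = population * prevalence
    healthy = population - diseased
    a = diseased * sensitivity      # true positives
    c = diseased - a                # false negatives
    d = healthy * specificity       # true negatives
    b = healthy - d                 # false positives
    return a, b, c, d


# The same hypothetical test used in a high and a low prevalence population
common = screening_measures(*table_from_prevalence(10_000, 0.20, 0.9, 0.9))
rare = screening_measures(*table_from_prevalence(10_000, 0.01, 0.9, 0.9))

print(f"PPV at 20% prevalence: {common['ppv']:.2f}")  # 0.69
print(f"PPV at 1% prevalence:  {rare['ppv']:.2f}")    # 0.08
print(f"Sensitivity in both:   {common['sensitivity']:.2f}, {rare['sensitivity']:.2f}")
```

Sensitivity and specificity stay the same, but when the disease is rare most positive results are false positives, which is the key issue for screening programmes.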


Likelihood Ratios

• Useful when considering individuals

• +ve LR: how much more likely a positive test is in someone with the disease than in someone without it, i.e. how much a positive result raises the odds of disease

• Allow you to take into account pre test probability.
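The formulae aren’t given above, so as a hedged sketch using the standard definitions: LR+ = sensitivity / (1 - specificity), LR- = (1 - sensitivity) / specificity, and post-test odds = pre-test odds x LR. The function names and the 90%/90% test with a 1% pre-test probability are hypothetical.

```python
def positive_lr(sensitivity, specificity):
    """LR+ = sensitivity / (1 - specificity)."""
    return sensitivity / (1 - specificity)


def negative_lr(sensitivity, specificity):
    """LR- = (1 - sensitivity) / specificity."""
    return (1 - sensitivity) / specificity


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert probability to odds, multiply by the LR, convert back."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


lr_pos = positive_lr(0.9, 0.9)   # about 9
lr_neg = negative_lr(0.9, 0.9)   # about 0.11

# A patient with a 1% pre-test probability who tests positive
post_pos = post_test_probability(0.01, lr_pos)
print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.2f}")
print(f"Probability of disease after a positive test: {post_pos:.1%}")  # 8.3%
```

Even with an LR+ of 9, a patient who started at 1% only reaches about 8%: the pre-test probability matters as much as the test.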

So what to learn?

• AKT syllabus?
– At least read the list of terms with Wikipedia at your side

• Stuff covered here?

• Or key bits?
– Familiarity with study types
– How to calculate an NNT
– What P-values and ORs mean
– Wilson and Jungner criteria
– Screening test evaluation
– Mean/median/mode/SD and some graph types
