screening. outline screening basics evaluation of screening programs

Screening

http://bmj.bmjjournals.com/content/vol315/issue7107/images/large/gret07hr.f1.jpeg

Outline

• Screening basics

• Evaluation of screening programs

Where are we going today?

• Definition of screening?• Whether it is always beneficial?• Types of bias in screening?• Principles for the development of screening.

– The test: Validity, LR, Multiple testing, ROC curve, Kappa

– The disease;

• Evaluation of a screening program

Where are we?



– The disease;


Screening Basics

• What does “screening” mean?

• What do we screen for (objective)?

• What makes a disease an appropriate target for screening?

• What makes a test a good screening test?

Levels of Prevention(Mausner and Kramer 1985)

• Primary Prevention - Prevention of the occurrence of disease (reduce incidence of disease)

• Secondary Prevention - Early detection and prompt treatment of disease for cure, to slow progression, to prevent complications, or to limit disability (reduce prevalence of disease)

• Tertiary Prevention - Limitation of disability and rehabilitation where disease has already occurred and left residual damage

Natural History of Disease

The Pre-Clinical Phase (PCP) is• the period between when early detection

by screening is possible and when the clinical diagnosis would usually be made.

Pathology begins

Disease detectable Normal Clinical Presentation

Pre-Clinical Phase

Pre-clinical Phase

Objective and Definition

• Objective: to reduce mortality and/or morbidity by early detection and treatment.

• Definition: Asymptomatic individuals are classified as either unlikely or possibly having disease.

Trisha Greenhalgh, BMJ 1997;315:540-543 (30 August)


Where are we?



– The disease;


Principles for the development of screening;

1. The condition screened for is an important cause of morbidity, disability, or mortality.

2. The natural history of the disease is sufficiently well known.

3. The test must have high levels performance.4. The test must be acceptable to the target

population and their health care providers, and appropriate follow-up of positive findings must be ensured.

Sam Shapiro in Epidemiology and Health Services

Consequence of a screening test:

• Beneficence

• Non-beneficence

• Do harm;

Clofibrate in US

Labeling effect; Social psychology

Biases in assessing efficacy of screening

• Two major biases affect these data:

– lead time bias

– length bias

Where are we?



– The disease;


Lead Time

Lead time = amount of time by which diagnosis is advanced or made earlier

Pathology begins

Disease detectable Normal Clinical Presentation

Lead Time

Screen

Lead time bias

• We think early detection has increased survival– in fact all it has done is increase the time the

patient is aware of his disease!

– treatment could even hasten death and it might appear survival is longer post diagnosis!!

• Cannot just look at survival time post diagnosis.

Lead-time Bias

Length bias

Survival due to screening and treatment may be over rated because screening will tend to discover more slow-growing disease.

Length-time Bias

Biologic onset

Usual time of diagno-sis

Severe clinical illness (eg metasta-ses)

Death from the disease

First detect-able by screen-ing test

Suppose there are two subtypes of the disease:

Type 1: fast progression

Biologic onset

Usual time of diagnosis

Severe clinical illness (eg metastases)


First detectable by screening test

Type 2: slow progression

Biologic onset





Type 1

Biologic onset





Type 2

Length of time in pre-clinical phase longer in Type 2 than in Type 1

Biologic onset





Type 1

Biologic onset





Type 2

Periodic screening will tend to detect more of Type 2, as these have longer “exposure” in the critical interval for screening.

Biologic onset





Type 1

Biologic onset





Type 2

But look!! Type 2 individuals have a longer survival time from time of diagnosis than do Type 1.

• Without screening, suppose type 1 and type 2 were equal fractions of the population– average survival time is 50:50 mixture of the short

and long survival times.

• With screening, the screen-detected population has a higher fraction of type 2 (slow) individuals– mix will be proportional to ratio of the two intervals– suppose it is 70:30 in favor of long interval

• average survival time will be longer in screen detected individuals!

Length bias

Length bias

• Even if the treatment tended to be harmful and shorten life, because more longer interval individuals tend to be detected by screening, the screening program will appear to be effective!!

Where are we?



– The disease;



1. The test must have high levels performance.2. The condition screened for is an important cause

of morbidity, disability, or mortality. 3. The natural history of the disease is sufficiently

well known.4. The test must be acceptable to the target



Characteristics of Test

• Safety

• Cost

• Acceptability

• Validity

• Reliability

• Validity of test is shown by how well the test actually measures what it is supposed to measure. Validity is determined by the sensitivity and specificity of the test.

• Reliability is based on how well the test does in use over time - in its repeatability.


Characteristics of Test: Validity

• Sensitivity is the ability of a screening procedure to correctly identify those who have the disease-- the proportion of persons with the disease who have a positive test result

• Specificity is the ability of a screening procedure to correctly identify the percentage of those who do not have the disease--the proportion of persons without the disease who have a negative test result

Sensitivity and Specificity

True positive

A

False positive

B

False negative

C

True negative

D

Disease Present Absent

Positive

Negative

TestResult

Sensitivity = A / (A+C)

Specificity = D / (B+D)

Sensitivity and Specificity of Breast Cancer Screening Examination

True positive

A = 132

False positive

B = 983

False negative

C = 45

True negative

D = 63650

Breast CancerPresent Absent

Positive

Negative

Screening Test

Sensitivity=132/(132+45)=74.6%

Specificity=63650/(63650+983)=98.5%

Predictive Values

• Predictive Value Positive – Probability that a person actually has the disease given a positive screening test

• Predictive Value Negative – Probability that a person is actually disease-free given a negative screening test

Predictive Values

True positive

A

False positive

B

False negative

C

True negative

D

Disease Present Absent

Positive

Negative

TestResult

Sensitivity = A / (A+C)

Specificity = D / (B+D)

PPV = A / (A+B)

NPV = D / (C+D)

Predictive Values of Breast Cancer Screening Examination

True positive

A = 132

False positive

B = 983

False negative

C = 45

True negative

D = 63650

Breast CancerPresent Absent

Positive

Negative

Screening Test

Sensitivity=132/(132+45)=74.6%

Specificity=63650/(63650+983)=98.5%

PPV=132/(132+983)=11.8%

NPV=63650/(63650+45)=99.9%

Effect of Prevalence on Predictive Value Positive with Constant Sensitivity and Specificity

Prevalence PV+ (%) Sensitivity Specificity

(%) (%) (%)

0.1 1.8 90 95

1.0 15.4 90 95

5.0 48.6 90 95

50.0 94.7 90 95

Do not testDo not treat

Test and treat, based on the

basis of the result

Do not test

Get on the

treatmentPretest Probability of disease

Test-treatment threshold

Bayesian Approach (probabilities)

Prior ideaof effect

Study Results:effect size

Final Conclusion:Study + prior effect

PretestProbabilityof disease

TestResult

PosttestProbabilityof disease

Clinical Reasoning (probabilities)

0.3 13 64

Probability of Breast Ca (%)

Beforemammogram

Aftermammogram

AfterFNA

Sensitive test Specific test

Likelihood ratio of a positive test

• How much more likely is a positive test to be found in a person with the condition than in a person without it?

• sensitivity/ (l-specificity)

Likelihood Ratios

Disease

PresentAbsent

TestPositiveab

Negativecd

Se= a a + c

Sp= d b + d

LR+ = [a/a+c] [b/b+d] probability of + test in non-diseased

LR+=1 = no value

= probability of + test in diseased

Likelihood Ratios

Disease

PresentAbsent

TestPositiveab

Negativecd

Se= a a + c

Sp= d b + d

LR+ = [a/a+c] [b/b+d] probability of + test in non-diseased

LR+=1 = no value

= probability of + test in diseased

LR+>1 = valuable

Bayesian Approach

Prior ideaof effect

Study Results:effect size

Final Conclusion:Study + prior effect

PretestProbabilityof disease

TestResult

PosttestProbabilityof disease

PretestProb.of disease

LR+X = PosttestProb.of disease

Where are we?



– The disease;


Influence of prevalence on predictive values

0102030405060708090

100

2 10 25 50 75 90 98

PPV

NPV

Multiple testing

• Serial (Sequential)

• Simultaneous (Parallel)

ROC Curves

(PD- x CFP) / PD- x CFN)

Comparing two tests

Prior probability

Pos

teri

or p

roba

bili

ty

Where are we?



– The disease;


Reliability

Percent agreement

Cohen’s Kappa

• Reported in 1960

• Kappa corrects for the chance agreement that would be expected to occur if the 2 classifications were completely unrelated

N° = Observed number of agreement

Ne= Number of agreement expected to occur by chance alone

Varies from -1 to 1

K =N° - Ne

1 - Ne

Kappa

Population One (Prevalence = 0.05)Table for true positives

Observer A Positive Negative

Positive 36 9 45

Negative 4 1 5

40 10 50

Observer B

From Szklo and Nieto, 2000

Interpretation of Kappa

Below 0.0 Poor

0.00 - 0.20 Slight

0.21 - 0.41 Fair

0.42 - 0.60 Moderate

0.61 - 0.80 Substantial

0.81 - 1.00 Almost PerfectLandis & Koch (1977a)

Interpretation of Kappa

Where are we?

Where are we?



– The disease;• Evaluation of a screening program

Characteristics of Disease

• Serious

• Treatable

• Pre-clinical detectable period

• Early treatment is better than late

• Prevalent


• Why do we screen for serious disease?– why we screen for Phenyl ketonuria?

• Why do we screen for treatable disease?– why don’t we screen for pancreatic cancer?


• Why do we screen for diseases with a pre-clinical detectable period?– Ex: HIV vs. Legionella

• Why do we screen for diseases where early treatment is better than late?– Ex: breast cancer


• Why do we screen for prevalent disease?

– Prevalence affects predictive value

• Tayssachs

• Breast cancer


– The more prevalent a condition, the

fewer false positive tests there will be

– The less prevalent a condition, the fewer

false negative tests there will be

– No matter how good the test is!

Why do we test for prevalent diseases?

Although the combination ELISA/Western Blot test for HIV has extremely high sensitivity and specificity, predictive value is dependent on prevalence.

High risk population: prevalence = 40%

PPV = 0.985 NPV = 0.993

Low risk population: prevalence = 0.01%PPV = 0.0098 NPV = 0.999

Where are we?




• What are the costs of

– treatment for the disease? stage by stage

– a false negative? how are cases usually found? Does missing the case

on screen mean it is missed forever?

– false positive? risks of confirmatory test psychological risks / harms: worry, etc. labeling

Evaluation of Screening


• Safety

• Cost

• Acceptability

• Validity

• Reliability

Evaluation of Screening Outcomes

a) Study Designs• RCT

– Compare disease-specific cumulative mortality rate between those randomized (or not) to screening

– Eliminates confounding and lead time bias– But, problems of:

• Expense, time consuming, logistically difficult, ethical concerns, changing technology.


a) Study Designs• Observational Studies

– Cohort:• Compare disease-specific cumulative mortality rate

between those who choose (or not) to be screened

– Case-control:• Compare screening history between those with advanced

disease (or death) and healthy.

– Ecological:• Compare screening patterns and disease experience (both

incidence and mortality) between populations

Problems with Observational Studies

• Confounding due to health awareness - screenees are more healthy (selection bias)

• More susceptible to effects of lead-time bias and length-biased sampling

• Poor quality, often retrospective data

b) Measures of Effect of Screening Disease-specific Mortality Rate (MR)

the number of deaths due to disease

Total person-years experience

– The only gold-standard outcome measure for screening

– NOT affected by lead time,


NBSS1 (National Breast Screening Study), Canada 1980

– Age at entry: 40 to 49. – Randomization: Individual volunteers, with names entered

successively on allocation lists. Although criticisms of the randomization procedure have been made, a thorough independent review found no evidence of subversion and that subversion on a scale large enough to affect the results was unlikely.

– Sample size: 25,214 study) and 25,216 control. – Cause of death attribution: Death certificates, with review of

questionable cases by a blinded review panel. Also linked with the Canadian Mortality Data Base, Statistics Canada.

– Follow-up duration: 13 years. – Relative risk of breast cancer death, screening versus control

(95% CI): 0.97 (0.74-1.27).

Where are we?




Session objectives

• Screening basics

• Evaluation of screening programs

screening. outline screening basics evaluation of screening programs

Documents