mrcpsych08 - how to analyse diagnostic test studies (june08)

Critical Appraisal of Diagnostic Test Studies Alex J Mitchell Consultant in Liaison Psychiatry University of Leicester MRCPsych Teaching 2008

Upload: alex-j-mitchell

Post on 21-Jan-2015


DESCRIPTION

This is an educational talk/presentation on the science of diagnostic tests using examples from psychiatry. It was first presented for MRCPsych (Royal College of Psychiatrists UK) June 2008. Now updated in 2009...see newer version

TRANSCRIPT

Page 1: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

Critical Appraisal of Diagnostic Test Studies

Alex J Mitchell, Consultant in Liaison Psychiatry, University of Leicester

MRCPsych Teaching 2008

Page 2: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

Contents

• Importance of understanding diagnostic tests

• Statistics of diagnostic validity

• Examples

Page 3: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

Importance of understanding diagnostic tests

Page 4: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

What Is a Diagnostic Test in Psychiatry?

• CT/MRI
• CSF
• Blood tests, e.g. TFTs
• SCAN/SCID/PSE/MINI
• Neuropsychological Testing
• MMSE
• HADS/BDI/CESD?
• Clinical Judgement
• Self-report

Page 5: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

Why Is a HADS score not a diagnosis?

Page 6: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

Why Is a HADS score not a diagnosis?

1. No core features
2. No symptom ranking
3. No functional assessment
4. Duration unclear
5. What if items are missing?
6. Imprecise

Page 7: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

Defining Diagnostic Testing

INTENTION
• Screening
– The systematic application of a test or inquiry, to identify individuals at sufficient risk of a specific disorder to warrant further actions among those who have not sought medical help for that disorder
• Case-Finding
– The selected application of a test or inquiry, to identify individuals with a suspected disorder and exclude those without a disorder, usually in those who have sought medical help for that disorder

APPLICATION
• Targeted (High Risk)
– The highly selected application of a test or inquiry, to identify individuals at high risk of a specific disorder by virtue of known risk factors
• Routine Screening
– The systematic application of a test or inquiry, to individuals without a known disorder (or who have not sought medical help for that disorder)

Adapted from Department of Health. Annual report of the national screening committee. London: DoH, 1997.

Page 8: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

Aims of Detection

• Screening:
– Short; easy; some false +ve (low Sp, PPV), few false –ve (high Sens, NPV)
• Diagnosis (case-finding)
– Accurate; few false +ve or –ve
• Rating
– Simple, patient rated, correlates with QoL and other outcomes

Page 9: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

UK National Screening Committee Guidelines

• The condition should:
– Be an important health issue
– Have a well-understood history, with a detectable risk factor or disease marker
– Have cost-effective primary preventions implemented.
• The screening tool should:
– Be a valid tool with known cut-off
– Be acceptable to the public
– Have agreed diagnostic procedures.
• The treatment should:
– Be effective, with evidence of benefits of early intervention
– Have adequate resources
– Have appropriate policies as to who should be treated.
• The screening program should:
– Show evidence that the benefits of screening outweigh the risks
– Be acceptable to public and professionals
– Be cost effective (and have ongoing evaluation)
– Have quality-assurance strategies in place.

Adapted from: UK National Screening Committee, Criteria for appraising the viability, effectiveness and appropriateness of a screening programme. http://www.nsc.nhs.uk/pdfs/criteria.pdf

Page 10: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

Stage; Type; Purpose; Description

• Pre-clinical; Development; Development of the proposed tool or test
– Here the aim is to develop a screening method that is likely to help in the detection of the underlying disorder, either in a specific setting or in all settings. Issues of acceptability of the tool to both patients and staff must be considered in order for implementation to be successful.
• Phase I_screen; Diagnostic validity; Early diagnostic validity testing in a selected sample and refinement of tool
– The aim is to evaluate the early design of the screening method against a known (ideally accurate) standard known as the criterion reference. In early testing the tool may be refined, selecting the most useful aspects and deleting redundant aspects in order to make the tool as efficient (brief) as possible whilst retaining its value.
• Phase II_screen; Diagnostic validity; Diagnostic validity in a representative sample
– The aim is to assess the refined tool against a criterion (gold standard) in a real-world sample where the comparator subjects may comprise several competing conditions which may otherwise cause difficulty regarding differential diagnosis.
• Phase III_screen; Implementation; Screening RCT; clinicians using vs not using a screening tool
– This is an important step in which the tool is evaluated clinically in one group with access to the new method compared to a second group (ideally selected in a randomized fashion) who make assessments without the tool.
• Phase IV_screen; Implementation; Screening implementation studies using real-world outcomes
– In this last step the screening tool/method is introduced clinically but monitored to discover the effect on important patient outcomes such as new identifications, new cases treated and new cases entering remission.

Page 11: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

Theory of Diagnostic Tests

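[Figure: overlapping distributions of test results for non-depressed and depressed individuals (x-axis: test result; y-axis: number of individuals). A cut-off value divides the curves into true -ve, false -ve, false +ve and true +ve regions.]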

Page 12: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

Low Prevalence (Se Sp = same)

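[Figure: distributions of test results for non-depressed individuals and those with major depression (x-axis: test result; y-axis: number of individuals), split by a cut-off value. False +ve: LARGE. False –ve: SMALL.]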

Page 13: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

High Prevalence (Se Sp = same)

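[Figure: distributions of test results for non-depressed individuals and those with major or minor depression (x-axis: test result; y-axis: number of individuals), split by a cut-off value. False +ve: SMALL. False –ve: LARGE.]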
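The effect of prevalence on the two kinds of error can be sketched numerically. The example below is only an illustration: the sensitivity, specificity, sample size and prevalences are hypothetical values, not figures from the slides. It holds sensitivity and specificity fixed and counts the expected false positives and false negatives at a low and a high prevalence.

```python
def expected_errors(n, prevalence, sensitivity, specificity):
    """Expected false positives and false negatives when n people are tested."""
    cases = n * prevalence
    non_cases = n - cases
    false_positives = non_cases * (1 - specificity)  # non-cases who test positive
    false_negatives = cases * (1 - sensitivity)      # cases who test negative
    return false_positives, false_negatives


# Hypothetical test characteristics and prevalences, chosen only for illustration.
SENSITIVITY = 0.80
SPECIFICITY = 0.80

low_fp, low_fn = expected_errors(1000, 0.05, SENSITIVITY, SPECIFICITY)
high_fp, high_fn = expected_errors(1000, 0.60, SENSITIVITY, SPECIFICITY)

print(f"Low prevalence:  {low_fp:.0f} false +ve, {low_fn:.0f} false -ve")
print(f"High prevalence: {high_fp:.0f} false +ve, {high_fn:.0f} false -ve")
```

With these assumed values the false positives dominate at low prevalence and the false negatives dominate at high prevalence, even though the test itself has not changed.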

Page 14: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

Accuracy 2x2 Table

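Result | Depression PRESENT | Depression ABSENT | Predictive value
Test +ve | True +ve | False +ve | PPV
Test -ve | False -ve | True -ve | NPV
Summary | Sensitivity | Specificity | Prevalence

Result | Reference Standard Disorder Present | Reference Standard No Disorder | Predictive value
Test +ve | A | B | PPV = A / (A + B)
Test -ve | C | D | NPV = D / (C + D)
Total | Sn = A / (A + C) | Sp = D / (B + D) |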

Page 15: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

Can This Help establish a syndrome?

Page 16: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

Example: A Clear Disease [#1]

[Figure: distributions of test results for individuals with and without the disorder (x-axis: test result; y-axis: number of individuals), separated at a point of partial rarity. Regions labelled true -ve, false +ve, false -ve and true +ve.]

Page 17: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

Example: A Probable Syndrome [#2]

[Figure: distributions of MMSE cognitive scores for individuals with and without the disorder (x-axis: MMSE cognitive score; y-axis: number of individuals). Regions labelled true -ve, false +ve, false -ve and true +ve.]

Page 18: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

Example: A Normally Distributed Trait [#3]

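[Figure: a single normal distribution of MMSE cognitive scores covering individuals with and without the disorder (x-axis: MMSE cognitive score; y-axis: number of individuals). Regions labelled true -ve, false +ve, false -ve and true +ve.]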

Page 19: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

Example: Dementia

Disease? Syndrome? Trait?

Page 20: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

Huppert et al (2005) BMC Geriatrics

MMSE scores for dementia (n=72) and non-dementia (n=2735)

Page 21: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

Example: Depression

Disease / Syndrome / Trait

Page 22: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

Mitchell, Coyne et al (2008)

[Figure: Scores on the CES-D during Pregnancy, 3 and 12 months Post-partum in 947 Women. Time points: Early Pregnancy; 3 months Post-Partum; 12 months Post-Partum. Series: Healthy; Depressive Symptoms; Mild Depression; Moderate to Severe Depression.]

Page 23: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

PHQ9 Linear distribution

[Figure: distribution of PHQ9 scores (x-axis: score, zero to eighteen; y-axis: 0 to 35). Series: PHQ9 (Major Depression); PHQ9 (Minor Depression); PHQ9 (Non-Depressed).]

Baker-Glen, Mitchell et al (2008)

Page 24: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

Thompson et al (2001) n=18,414

[Figure: number of individuals (y-axis, 0 to 3000) at each score from zero to eighteen (x-axis).]

Page 25: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

Statistics of diagnostic tests

Page 26: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

Basic Measures of Accuracy

(a = true +ve, b = false +ve, c = false -ve, d = true -ve)

• Sensitivity (Se) = a/(a + c)
– A measure of accuracy defined as the proportion of patients with disease in whom the test result is positive
• Specificity (Sp) = d/(b + d)
– A measure of accuracy defined as the proportion of patients without disease in whom the test result is negative
• Positive Predictive Value (PPV) = a/(a + b)
– A measure of rule-in accuracy defined as the proportion of true positives among those with a positive screening result
• Negative Predictive Value (NPV) = d/(c + d)
– A measure of rule-out accuracy defined as the proportion of true negatives among those with a negative screening result
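As a minimal sketch, the four basic measures can be computed directly from the cells of a 2x2 table. The function name and the example counts are hypothetical; the cell labels follow the usual convention a = true +ve, b = false +ve, c = false -ve, d = true -ve.

```python
def basic_measures(a, b, c, d):
    """Basic accuracy measures from 2x2 cell counts.

    a = true positives, b = false positives,
    c = false negatives, d = true negatives.
    """
    return {
        "sensitivity": a / (a + c),
        "specificity": d / (b + d),
        "ppv": a / (a + b),
        "npv": d / (c + d),
        "prevalence": (a + c) / (a + b + c + d),
    }


# Hypothetical counts, for illustration only.
measures = basic_measures(a=30, b=10, c=20, d=140)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```

Note that sensitivity and specificity are calculated down the columns of the table (within those with or without the disorder), whereas PPV and NPV are calculated along the rows (within those testing positive or negative).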

Page 27: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

Summary Measures

• Youden's J
– Sensitivity + Specificity – 1
• Predictive Summary Index
– PPV + NPV – 1
• Overall accuracy
– (TP + TN) / (TP + FP + TN + FN)
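The three summary measures can be sketched in the same way. The counts below are hypothetical and the function name is an assumption made for this example.

```python
def summary_measures(tp, fp, fn, tn):
    """Youden's J, Predictive Summary Index and overall accuracy from 2x2 counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return {
        "youden_j": sensitivity + specificity - 1,
        "psi": ppv + npv - 1,
        "overall_accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts, for illustration only.
summary = summary_measures(tp=30, fp=10, fn=20, tn=140)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Youden's J is built from the column-wise measures and the Predictive Summary Index from the row-wise measures, so the two can differ considerably for the same table.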

Page 28: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

Reciprocal Measures

• Number Needed to Diagnose (NND)
– 1 / (Youden's J)
• Number Needed to Predict (NNP)
– 1 / (PSI)
• Number Needed to Screen (NNS)
– 1 / (FC – FiC), i.e. fraction correct minus fraction incorrect
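A short sketch of the reciprocal measures follows. It assumes that FC and FiC stand for the fraction correctly and incorrectly identified, so that FiC = 1 - FC; the function names and input values are illustrative only.

```python
import math


def reciprocal(index):
    """Reciprocal of an index; infinite when the index is zero (no benefit)."""
    return math.inf if index == 0 else 1 / index


def reciprocal_measures(youden_j, psi, fraction_correct):
    """NND, NNP and NNS from Youden's J, the PSI and the fraction correct.

    Assumes fraction incorrect = 1 - fraction correct.
    """
    fraction_incorrect = 1 - fraction_correct
    return {
        "nnd": reciprocal(youden_j),
        "nnp": reciprocal(psi),
        "nns": reciprocal(fraction_correct - fraction_incorrect),
    }


# Illustrative inputs only.
numbers = reciprocal_measures(youden_j=0.79, psi=0.77, fraction_correct=0.928)
for name, value in numbers.items():
    print(f"{name}: {value:.2f}")
```

A value close to 1 means that almost every application of the test yields a net benefit; a negative NNS arises when more people are misidentified than correctly identified.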

Page 29: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

Murphy JM, Berwick DM, Weinstein MC, Borus JF, Budman SH, Klerman GL (1987). Performance of screening and diagnostic tests: application of receiver operating characteristic (ROC) analysis. Arch Gen Psychiatry 44:550-555

Page 30: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

Accuracy 2x2 Table

Result | Depression PRESENT | Depression ABSENT | Predictive value
Test +ve | True +ve | False +ve | PPV
Test -ve | False -ve | True -ve | NPV
Summary | Sensitivity | Specificity | Prevalence

Page 31: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

Test vs Major Depression

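Result | Depression PRESENT | Depression ABSENT | Total
Test +ve | 500 | 1500 | 2000
Test -ve | 500 | 4500 | 5000
Total | 1000 | 6000 | 7000

Sensitivity 50% (500/1000)

Specificity 75% (4500/6000)

PPV 25% (500/2000)

NPV 90% (4500/5000)

Prevalence 14% (1000/7000)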

Page 32: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

Test vs Major + Min Depression

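Result | Depression PRESENT | Depression ABSENT | Total
Test +ve | 500 | 1500 | 2000
Test -ve | 500 | 500 | 1000
Total | 1000 | 2000 | 3000

Sensitivity 50% (500/1000)

Specificity 25% (500/2000)

PPV 25% (500/2000)

NPV 50% (500/1000)

Prevalence 33% (1000/3000)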

Page 33: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

Added Value

• Definition 1:
– The additional ability of a test to rule-in or rule-out compared with the baseline rate
– PPV minus prevalence
– NPV minus prevalence
• Definition 2:
– The additional ability of a test to rule-in or rule-out compared with the unassisted rate
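Definition 1 can be sketched as a gain over the baseline rate. The rule-in gain follows the slide directly (PPV minus prevalence). For the rule-out gain the slide also writes "NPV minus prevalence"; the sketch below instead assumes that the intended baseline for ruling out is the proportion without the disorder (1 - prevalence), which is an interpretation rather than something the slide states. The input values (PPV 0.41, NPV 0.97, prevalence 0.15) are those annotated on a later slide in this deck.

```python
def added_value(ppv, npv, prevalence):
    """Gain in rule-in and rule-out accuracy over the baseline rate.

    Assumption: the rule-out baseline is 1 - prevalence,
    i.e. the proportion of people without the disorder.
    """
    rule_in_gain = ppv - prevalence
    rule_out_gain = npv - (1 - prevalence)
    return rule_in_gain, rule_out_gain


rule_in_gain, rule_out_gain = added_value(ppv=0.41, npv=0.97, prevalence=0.15)
print(f"Rule-in added value:  {rule_in_gain:.2f}")
print(f"Rule-out added value: {rule_out_gain:.2f}")
```

Under this reading, a negative test that returns an NPV of 0.97 adds relatively little when 85% of people are already free of the disorder before testing.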

Page 34: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

[Figure: proportion (y-axis, 0.00 to 1.00) of all cases, depressed cases and non-depressed cases with each symptom. Symptoms shown: loss of energy; diminished drive; sleep disturbance; concentration/indecision; depressed mood; anxiety; diminished concentration; insomnia; diminished interest/pleasure; psychic anxiety; helplessness; worthlessness; hopelessness; somatic anxiety; thoughts of death; anger; excessive guilt; psychomotor change; indecisiveness; decreased appetite; psychomotor agitation; psychomotor retardation; decreased weight; lack of reactive mood; increased appetite; hypersomnia; increased weight. Series: All Case Proportion; Depressed Proportion; Non-Depressed Proportion.]

Mitchell, Zimmerman et al MIDAS Database. Psychol Med 2007 Submitted

Page 35: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

[Figure: added value of each symptom (y-axis, -0.10 to 0.50). Symptoms shown in alphabetical order: anger; anxiety; decreased appetite; decreased weight; depressed mood; diminished concentration; diminished drive; diminished interest/pleasure; excessive guilt; helplessness; hopelessness; hypersomnia; increased appetite; increased weight; indecisiveness; insomnia; lack of reactive mood; loss of energy; psychic anxiety; psychomotor agitation; psychomotor change; psychomotor retardation; sleep disturbance; somatic anxiety; thoughts of death; worthlessness. Series: Rule-In Added Value (PPV-Prev); Rule-Out Added Value (NPV-Prev).]

Page 36: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

Accuracy of Tests: Visual

[Figure: accuracy of tests shown on a probability scale (0%, 25%, 75%, 100%; very unlikely, unlikely, likely, very likely). Rows: 1 Question; 2 Questions; PHQ-2; WHO5 (1+3); Overall. Reference standards: CIDI (computer) Any Depression; CIDI (computer) Mj Depression. Sources: Henckel et al (2004) Eur Arch Psychiatry Clin Neurosci; Arroll B et al (2003) BMJ.]

Value labels on the figure, in extracted order: 3% - (37) - 63% = 60%; 3% - (16) - 32% = 29%; 3% - (16) - 32% = 29%; 10% - (22) - 50% = 54%; 32% - (37) - 96% = 64%.

Page 37: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

[Figure: post-test probability (y-axis, 0.00 to 1.00) against pre-test probability (x-axis, 0 to 1). Curves: Clinician Positive (Fallowfield et al, 2001); Clinician Negative (Fallowfield et al, 2001); HADS-D Positive (Meta-analysis); HADS-D Negative (Meta-analysis); Baseline Probability.]

Page 38: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

[Figure: post-test probability (y-axis, 0.00 to 1.00) against pre-test probability (x-axis, 0 to 1). Curves: Depression Present (Routine); Depression Absent (Routine); Depression Scales +ve (Median); Depression Scales -ve (Median); Prior Probability. Annotated at a prevalence of 0.15: PPV = 0.41, NPV = 0.97.]
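Curves in a pre-test versus post-test probability plot of this kind follow from Bayes' theorem. As a sketch, assuming a test with fixed sensitivity Se and specificity Sp applied at pre-test probability p (these symbols are introduced here for illustration and are not taken from the slide), the probability of the disorder after a positive and after a negative result is:

```latex
% p  = pre-test probability (prevalence)
% Se = sensitivity, Sp = specificity
% T+ / T- = positive / negative test result, D = disorder present
\begin{align*}
P(D \mid T^{+}) &= \mathrm{PPV}
  = \frac{Se \cdot p}{Se \cdot p + (1 - Sp)(1 - p)} \\[4pt]
P(D \mid T^{-}) &= 1 - \mathrm{NPV}
  = \frac{(1 - Se)\, p}{(1 - Se)\, p + Sp\,(1 - p)}
\end{align*}
```

Plotting the first expression against p gives the upper (test positive) curve and the second gives the lower (test negative) curve; the diagonal is the baseline where the test adds nothing.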

Page 39: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

Worked Examples of diagnostic tests

Page 40: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

PostStroke Mj Depression vs NonMj

• Clinicians' diagnosis using DSMIV vs SCAN/PSE

• Using the SCAN:
– 50 people with major depression
– 150 healthy people
– 50 with minor depression

Page 41: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

Clinicians using DSMIV

• Clinicians diagnosed 52 cases with Mj depression
• The specificity of DSMIV was 95%

• Q. What was the sensitivity?
• Q. What was the prevalence?
• Q. What was the PPV?
• Q. What was the % correctly identified per 100 screened?

Page 42: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

Test vs Major Depression

Result | Depression on SCAN | Depression ABSENT | Total
Test +ve (Clinician) | ? | ? | 52
Test -ve | ? | ? |
Total | 50 | 200 |

Sensitivity ??%

Specificity 95%

PPV ??%

NPV ??%

Prevalence ??%
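The missing cells can be filled in from the totals and the specificity. The sketch below works through the arithmetic using the figures given in the example (50 cases of major depression on SCAN, 200 without, 52 clinician positives, specificity 95%); the function name is hypothetical.

```python
def solve_worked_example(cases, non_cases, test_positives, specificity):
    """Fill in a 2x2 table from its margins and the specificity."""
    true_negatives = round(non_cases * specificity)
    false_positives = non_cases - true_negatives
    true_positives = test_positives - false_positives
    false_negatives = cases - true_positives
    total = cases + non_cases
    return {
        "tp": true_positives,
        "fp": false_positives,
        "fn": false_negatives,
        "tn": true_negatives,
        "sensitivity": true_positives / cases,
        "prevalence": cases / total,
        "ppv": true_positives / test_positives,
        "npv": true_negatives / (true_negatives + false_negatives),
        "correct_per_100": 100 * (true_positives + true_negatives) / total,
    }


result = solve_worked_example(cases=50, non_cases=200, test_positives=52, specificity=0.95)
for name, value in result.items():
    print(f"{name}: {value:.3g}")
```

The key step is that specificity fixes the number of true negatives (95% of 200 = 190), which leaves 10 false positives and therefore 42 true positives among the 52 clinician diagnoses.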

Page 43: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

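Columns: Symptom | Post-Stroke Depression by reference standard (n) | Post-Stroke Depression with symptom (n) | Sensitivity | No Post-Stroke Depression by reference standard (n) | Non-depressed stroke patients without symptom (n) | Specificity | PPV | NPV | Positive Utility Index | Negative Utility Index | Identification Index | NNS | NND | NNP

Persistent low mood | 50 | 45 | 0.90 | 200 | 184 | 0.92 | 0.74 | 0.97 | 0.66 | 0.90 | 83.20 | 1.20 | 1.22 | 1.41
Loss of interest | 50 | 48 | 0.96 | 200 | 156 | 0.78 | 0.52 | 0.99 | 0.50 | 0.77 | 63.20 | 1.58 | 1.35 | 1.96
Loss of drive | 50 | 40 | 0.80 | 200 | 120 | 0.60 | 0.33 | 0.92 | 0.27 | 0.55 | 28 | 3.57 | 2.50 | 3.90
Low energy | 50 | 49 | 0.98 | 200 | 20 | 0.10 | 0.21 | 0.95 | 0.21 | 0.10 | -44.80 | -2.23 | 12.50 | 6.01
Insomnia | 50 | 35 | 0.70 | 200 | 136 | 0.68 | 0.35 | 0.90 | 0.25 | 0.61 | 36.80 | 2.72 | 2.63 | 3.93
Poor appetite | 50 | 25 | 0.50 | 200 | 178 | 0.89 | 0.53 | 0.88 | 0.27 | 0.78 | 62.40 | 1.60 | 2.56 | 2.45
Suicidal thoughts | 50 | 2 | 0.04 | 200 | 196 | 0.98 | 0.33 | 0.80 | 0.01 | 0.79 | 58.40 | 1.71 | 50 | 7.32
Poor concentration | 50 | 28 | 0.56 | 200 | 114 | 0.57 | 0.25 | 0.84 | 0.14 | 0.48 | 13.60 | 7.35 | 7.69 | 11.93
Poor orientation | 50 | 10 | 0.20 | 200 | 164 | 0.82 | 0.22 | 0.80 | 0.04 | 0.66 | 39.20 | 2.55 | 50 | 46.92
Anger | 50 | 17 | 0.34 | 200 | 172 | 0.86 | 0.38 | 0.84 | 0.13 | 0.72 | 51.20 | 1.95 | 5 | 4.61
DSMIV algorithm | 50 | 42 | 0.84 | 200 | 190 | 0.95 | 0.81 | 0.96 | 0.68 | 0.91 | 85.60 | 1.17 | 1.27 | 1.30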
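The slide does not give formulas for the two utility indices. The tabulated values are consistent with Positive Utility Index = sensitivity x PPV and Negative Utility Index = specificity x NPV, and that is the assumption used in this sketch. It recomputes both for the "Persistent low mood" row (45 of 50 depressed patients with the symptom, 184 of 200 non-depressed patients without it); the function name is hypothetical.

```python
def utility_indices(tp, fn, tn, fp):
    """Positive and negative utility indices from 2x2 counts.

    Assumed definitions (not stated on the slide):
    UI+ = sensitivity * PPV, UI- = specificity * NPV.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return sensitivity * ppv, specificity * npv


# "Persistent low mood": 45 true +ve, 5 false -ve, 184 true -ve, 16 false +ve.
ui_positive, ui_negative = utility_indices(tp=45, fn=5, tn=184, fp=16)
print(f"Positive utility index: {ui_positive:.2f}")
print(f"Negative utility index: {ui_negative:.2f}")
```

Under the assumed definitions the results round to the tabulated 0.66 and 0.90 for this row.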

Page 44: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

Advanced Techniques

• sROC
• Real World Numbers
• NND; NNS
• Economics

Page 45: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

Measure; Basic Formula; Strengths and Weaknesses; Reciprocal Absolute Benefit; Reciprocal Absolute Benefit Formula

• Youden Index
– Basic formula: sensitivity + specificity – 1
– Strengths and weaknesses: relatively independent of prevalence; not clinically interpretable; requires application of criterion (gold) standard; does not assess ratio of false positives to negatives
– Reciprocal absolute benefit: Number Needed to Diagnose, NND = 1/Youden
• Predictive Summary Index
– Basic formula: PPV + NPV – 1
– Strengths and weaknesses: measures gain; clinically applicable; dependent on prevalence; places equal weight on rule-in and rule-out accuracy
– Reciprocal absolute benefit: Number Needed to Predict, NNP = 1/PSI
• Overall Accuracy (Fraction Correct)
– Basic formula: (TP + TN) / (TP + FP + TN + FN)
– Strengths and weaknesses: measures real number of correct identifications vs misidentifications; can be easily converted into a percentage; requires application of criterion (gold) standard
– Reciprocal absolute benefit: Number Needed to Screen, NNS = 1/Identification Index

Page 46: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

PPV DT Distress = 55%; PPV Other Methods 65%

Page 47: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)

ROC Plot

[Figure: ROC plot of sensitivity (y-axis, 0.00 to 1.00) against 1 - specificity (x-axis, 0.00 to 1.00). Points: Low Mood; DSMIV; Low mood & loss of interest.]

Page 48: MRCPsych08 - How To Analyse Diagnostic Test Studies (June08)