measures of diagnostic accuracy
DESCRIPTION
Statistical Methods for Analysis of Diagnostic Accuracy Studies. Jon Deeks, University of Birmingham, with acknowledgement to Hans Reitsma. Measures of diagnostic accuracy: positive and negative predictive values, sensitivity and specificity, likelihood ratios, area under the ROC curve.
![Page 1: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/1.jpg)
Statistical Methods for Analysis of Diagnostic Accuracy Studies
Jon Deeks, University of Birmingham
with acknowledgement to Hans Reitsma
![Page 2: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/2.jpg)
Measures of diagnostic accuracy
• Positive and negative predictive values
• Sensitivity and specificity
• Likelihood ratios
• Area under the ROC curve
• Diagnostic odds ratio
![Page 3: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/3.jpg)
Diagnostic accuracy studies
• Results from the index test are compared with the results obtained with the reference standard on the same subjects
• Accuracy refers to the degree of agreement between the results of the index test and those from the reference standard
![Page 4: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/4.jpg)
Basic Design
Series of patients
Index test
Reference standard
Cross-classification
![Page 5: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/5.jpg)
Clinical problem
• Diagnostic value of B-type natriuretic peptide (BNP) measurement
• Does BNP measurement distinguish between those with and without left ventricular dysfunction in the elderly?
• Smith et al. BMJ 2000; 320: 906.
![Page 6: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/6.jpg)
Anatomy of diagnostic study
• Target population: unscreened elderly
• Index test: BNP
• Target condition: LVSD
• Final diagnosis (reference standard): echocardiography – global and regional assessment of ventricular function including measurement of LV ejection fraction
![Page 7: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/7.jpg)
Our example
Elderly patients
BNP measurement
Echocardiography for LVSD
Cross-classification
![Page 8: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/8.jpg)
Results of BNP study
| BNP | LVSD present | LVSD absent | Total |
|---|---|---|---|
| >=18.7 | 11 (TP) | 50 (FP) | 61 |
| <18.7 | 1 (FN) | 93 (TN) | 94 |
| Total | 12 | 143 | 155 |
![Page 9: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/9.jpg)
Measures of test performance
| BNP | LVSD present | LVSD absent | Total |
|---|---|---|---|
| >=18.7 | 11 | 50 | 61 |
| <18.7 | 1 | 93 | 94 |
| Total | 12 | 143 | 155 |

• sensitivity = 11 / 12 = 92%, i.e. Pr(T+|D+)
• specificity = 93 / 143 = 65%, i.e. Pr(T-|D-)
![Page 10: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/10.jpg)
Measures of test performance
| BNP | LVSD present | LVSD absent | Total |
|---|---|---|---|
| >=18.7 | 11 | 50 | 61 |
| <18.7 | 1 | 93 | 94 |
| Total | 12 | 143 | 155 |

• positive predictive value = 11 / 61 = 18%, i.e. Pr(D+|T+)
• negative predictive value = 93 / 94 = 99%, i.e. Pr(D-|T-)
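The four measures on the last two slides can be reproduced directly from the 2x2 counts. The following is a minimal sketch in Python; the function name `accuracy_measures` and the dictionary keys are illustrative choices, not part of the original slides.

```python
# 2x2 counts from the BNP example (cut-off 18.7), as shown on the slides.
tp, fp = 11, 50   # BNP >= 18.7: LVSD present / LVSD absent
fn, tn = 1, 93    # BNP <  18.7: LVSD present / LVSD absent


def accuracy_measures(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV as proportions."""
    return {
        "sensitivity": tp / (tp + fn),  # Pr(T+|D+), column of diseased
        "specificity": tn / (tn + fp),  # Pr(T-|D-), column of non-diseased
        "ppv": tp / (tp + fp),          # Pr(D+|T+), row of test positives
        "npv": tn / (tn + fn),          # Pr(D-|T-), row of test negatives
    }


measures = accuracy_measures(tp, fp, fn, tn)
for name, value in measures.items():
    print(f"{name}: {value:.0%}")
```

Sensitivity and specificity are column proportions (conditioning on disease status), whereas the predictive values are row proportions (conditioning on the test result).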
![Page 11: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/11.jpg)
Sensitivity and specificity not directly affected by prevalence

| BNP | LVSD present | LVSD absent | Total |
|---|---|---|---|
| >=18.7 | 131 | 50 | 181 |
| <18.7 | 12 | 93 | 105 |
| Total | 143 | 143 | 286 |

• sensitivity = 131 / 143 = 92%
• specificity = 93 / 143 = 65%
![Page 12: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/12.jpg)
Predictive values directly affected by prevalence
| BNP | LVSD present | LVSD absent | Total |
|---|---|---|---|
| >=18.7 | 131 | 50 | 181 |
| <18.7 | 12 | 93 | 105 |
| Total | 143 | 143 | 286 |

• positive predictive value = 131 / 181 = 72%
• negative predictive value = 93 / 105 = 89%
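The dependence of predictive values on prevalence can be sketched by holding sensitivity and specificity fixed and varying the prevalence. The function `predictive_values` below is a hypothetical helper written for illustration; it assumes sensitivity and specificity really are constant across settings, which the next slide questions.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a given prevalence, assuming fixed sens/spec."""
    tp = sensitivity * prevalence                # true positives per person tested
    fn = (1 - sensitivity) * prevalence          # false negatives
    tn = specificity * (1 - prevalence)          # true negatives
    fp = (1 - specificity) * (1 - prevalence)    # false positives
    return tp / (tp + fp), tn / (tn + fn)


sens, spec = 11 / 12, 93 / 143                   # BNP example, cut-off 18.7

ppv_study, npv_study = predictive_values(sens, spec, 12 / 155)  # prevalence in the study
ppv_half, npv_half = predictive_values(sens, spec, 0.5)         # prevalence of 50%

print(f"prevalence 7.7%: PPV {ppv_study:.0%}, NPV {npv_study:.0%}")
print(f"prevalence 50%:  PPV {ppv_half:.0%}, NPV {npv_half:.0%}")
```

With the study prevalence the sketch returns the original 18% and 99%; at 50% prevalence it returns the 72% and 89% shown in the table above.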
![Page 13: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/13.jpg)
Do sensitivity and specificity vary with prevalence?
• Test performance is sometimes observed to be different in different settings, patient groups, etc.
• Occasionally attributed to differences in disease prevalence, but:
  – diseased and non-diseased spectrums differ as well.
• e.g. using a test in primary care and secondary care referrals
  – the diseased group are different (cases more difficult)
  – the non-diseased group are different (conditions more similar)
  – sensitivity may decrease, specificity certainly decreases
![Page 14: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/14.jpg)
Likelihood ratios
• Why likelihood ratios?
• Applicable in situations with more than 2 test outcomes
• Direct link from pre-test probabilities to post-test probabilities
![Page 15: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/15.jpg)
Likelihood ratios
• Information value of a test result expressed as likelihood ratio
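| BNP | LVSD present | LVSD absent | Total |
|---|---|---|---|
| >=18.7 | 11 | 50 | 61 |
| <18.7 | 1 | 93 | 94 |
| Total | 12 | 143 | 155 |

$$\mathrm{LR}{+} = \frac{\Pr(T+ \mid D+)}{\Pr(T+ \mid D-)} = \frac{11/12}{50/143} = 2.6$$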
![Page 16: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/16.jpg)
Likelihood Ratio of positive test
• How many times more often a positive test result occurs in persons with the target condition compared to those without it

$$\mathrm{LR}{+} = \frac{\Pr(T+ \mid D+)}{\Pr(T+ \mid D-)}$$
![Page 17: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/17.jpg)
Likelihood ratios
• Likelihood ratio of a negative test result
• How much less likely a negative test result is in persons with the target condition compared to those without the target condition

$$\mathrm{LR}{-} = \frac{\Pr(T- \mid D+)}{\Pr(T- \mid D-)}$$
![Page 18: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/18.jpg)
Likelihood ratios
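$$\mathrm{LR}{-} = \frac{\Pr(T- \mid D+)}{\Pr(T- \mid D-)} = \frac{1/12}{93/143} = 0.13$$

| BNP | LVSD present | LVSD absent | Total |
|---|---|---|---|
| >=18.7 | 11 | 50 | 61 |
| <18.7 | 1 | 93 | 94 |
| Total | 12 | 143 | 155 |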
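Both likelihood ratios follow from sensitivity and specificity, so they can be computed from the same 2x2 counts. A minimal sketch, with `likelihood_ratios` as an illustrative function name:

```python
def likelihood_ratios(tp, fp, fn, tn):
    """Return (LR+, LR-) from the cells of a 2x2 table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    lr_pos = sensitivity / (1 - specificity)   # Pr(T+|D+) / Pr(T+|D-)
    lr_neg = (1 - sensitivity) / specificity   # Pr(T-|D+) / Pr(T-|D-)
    return lr_pos, lr_neg


# BNP example, cut-off 18.7
lr_pos, lr_neg = likelihood_ratios(tp=11, fp=50, fn=1, tn=93)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```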
![Page 19: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/19.jpg)
Calculate likelihood ratios from column percentages
| BNP | LVSD present | LVSD absent | LR |
|---|---|---|---|
| >=18.7 | 91.67% | 34.97% | 2.62 |
| <18.7 | 8.33% | 65.03% | 0.13 |
| Total | 100% | 100% | |
![Page 20: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/20.jpg)
Interpreting likelihood ratios
• A LR=1 indicates no diagnostic value
• LR+ >10 are usually regarded as a ‘strong’ positive test result
• LR- <0.1 are usually regarded as a strong negative test result
• But it depends on what change in probability is needed to make a diagnosis
![Page 21: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/21.jpg)
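[Figure: effect of a test result with LR+ = 10 on the probability of disease. Values shown on the slide: pre-test 50%, post-test 92%; pre-test 10%, post-test 55%.]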
![Page 22: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/22.jpg)
Advantages of likelihood ratios
• Still useful when there are more than 2 test outcomes
![Page 23: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/23.jpg)
BNP is a continuous measurement
• Dichotomisation of BNP (high vs. low) means loss of information
• Higher values of BNP are more indicative of LVSD
![Page 24: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/24.jpg)
Results BNP study
| BNP | LVSD present | LVSD absent | Total |
|---|---|---|---|
| >=26.7 | 9 | 28 | 37 |
| 18.7-26.7 | 2 | 22 | 24 |
| <18.7 | 1 | 93 | 94 |
| Total | 12 | 143 | 155 |
![Page 25: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/25.jpg)
Likelihood ratios
• Stratum specific likelihood ratios in case of more than 2 test results
$$\mathrm{LR}(T = x) = \frac{\Pr(T = x \mid D+)}{\Pr(T = x \mid D-)}$$
![Page 26: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/26.jpg)
Compute LR from column percentages
| BNP | LVSD present | LVSD absent | LR |
|---|---|---|---|
| >=26.7 | 75% | 20% | 3.83 |
| 18.7-26.7 | 17% | 15% | 1.08 |
| <18.7 | 8% | 65% | 0.13 |
| Total | 100% | 100% | |
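Stratum-specific likelihood ratios are the ratio of the two column percentages in each row. The sketch below recomputes them from the raw counts of the three-category BNP table; the dictionary layout and the name `stratum_lrs` are assumptions made for illustration.

```python
# Counts per BNP stratum: (LVSD present, LVSD absent), from the BNP study table.
strata = {
    ">=26.7": (9, 28),
    "18.7-26.7": (2, 22),
    "<18.7": (1, 93),
}


def stratum_lrs(strata):
    """Return LR(T = x) = Pr(T = x | D+) / Pr(T = x | D-) for each stratum x."""
    n_diseased = sum(d for d, _ in strata.values())
    n_non_diseased = sum(nd for _, nd in strata.values())
    return {
        label: (d / n_diseased) / (nd / n_non_diseased)
        for label, (d, nd) in strata.items()
    }


lrs = stratum_lrs(strata)
for label, lr in lrs.items():
    print(f"BNP {label}: LR = {lr:.2f}")
```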
![Page 27: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/27.jpg)
Bayes’ rule
Post-test odds for disease = Pre-test odds for disease x Likelihood ratio
![Page 28: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/28.jpg)
Bayes’ rule
• Pre-test odds
  – chance of disease expressed as odds
  – example: if 2 out of 5 persons have the disease, probability = 2/5 and odds = 2/3 (2 with the disease for every 3 without)
![Page 29: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/29.jpg)
Bayes’ rule
• odds = probability / (1 – probability)
• probability = odds / (1 + odds)
$$\mathrm{Odds}(D) = \frac{\Pr(D)}{1 - \Pr(D)}$$

$$\Pr(D) = \frac{\mathrm{Odds}(D)}{1 + \mathrm{Odds}(D)}$$
![Page 30: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/30.jpg)
Bayes’ rule: patient with BNP >26.7
• Pre-test probability = 0.5
• Pre-test odds = 0.5 / (1-0.5) = 1
• LR(BNP >26.7) = 3.83
• Post-test odds = 1 x 3.83 = 3.83
• Post-test probability = 3.83 / (1+3.83) = 0.79
![Page 31: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/31.jpg)
Bayes’ rule: patient with BNP lower than 18.7
• Pre-test probability = 0.5
• Pre-test odds = 0.5 / (1-0.5) = 1
• LR(BNP <18.7) = 0.13
• Post-test odds = 1 x 0.13 = 0.13
• Post-test probability = 0.13 / (1+0.13) = 0.12
![Page 32: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/32.jpg)
Probability for LVSD after BNP
| BNP | LR | Post-test prob. (pre-test prob. 50%) |
|---|---|---|
| >=26.7 | 3.83 | 79% |
| 18.7-26.7 | 1.08 | 52% |
| <18.7 | 0.13 | 12% |
![Page 33: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/33.jpg)
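[Figure: pre-test probability of 50% mapped to post-test probabilities of 79%, 52% and 12% for the three BNP categories.]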
![Page 34: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/34.jpg)
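[Figure: pre-test probability of 5% mapped to post-test probabilities of 17%, 5% and 1% for the three BNP categories.]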
![Page 35: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/35.jpg)
Probability for LVSD after BNP
| BNP | LR | Post-test prob. (pre-test prob. 5%) | Post-test prob. (pre-test prob. 50%) |
|---|---|---|---|
| >=26.7 | 3.83 | 17% | 79% |
| 18.7-26.7 | 1.08 | 5% | 52% |
| <18.7 | 0.13 | 1% | 12% |
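The three steps of Bayes' rule (probability to odds, multiply by the likelihood ratio, odds back to probability) can be sketched as a single function. The name `post_test_probability` is illustrative; the likelihood ratios are the rounded stratum-specific values from the slides.

```python
def post_test_probability(pre_test_prob, lr):
    """Apply Bayes' rule on the odds scale and return the post-test probability."""
    pre_odds = pre_test_prob / (1 - pre_test_prob)   # odds = p / (1 - p)
    post_odds = pre_odds * lr                        # post-test odds = pre-test odds x LR
    return post_odds / (1 + post_odds)               # p = odds / (1 + odds)


bnp_lrs = {">=26.7": 3.83, "18.7-26.7": 1.08, "<18.7": 0.13}

for label, lr in bnp_lrs.items():
    low = post_test_probability(0.05, lr)    # low-risk patient
    high = post_test_probability(0.50, lr)   # high-risk patient
    print(f"BNP {label}: 5% -> {low:.0%}, 50% -> {high:.0%}")
```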
![Page 36: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/36.jpg)
Confidence intervals
• Sample uncertainty should be described for all statistics, using confidence intervals
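$$95\%\ \mathrm{CI} = \hat{\theta} \pm z_{\alpha/2} \times \mathrm{se}(\hat{\theta})$$

where
• $\hat{\theta}$ is the estimate of effect
• $z_{\alpha/2}$ is the Normal deviate (1.96 for 95% CI)
• + gives the upper limit, - gives the lower limit
• $\mathrm{se}(\hat{\theta})$ is the standard error of the estimate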
![Page 37: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/37.jpg)
Confidence Intervals for Proportions
• Sensitivity, specificity, positive and negative predictive values, and overall accuracy are all proportions
$$\hat{p} = \frac{r}{n}, \qquad \mathrm{se}(\hat{p}) = \sqrt{\frac{\hat{p}\,(1 - \hat{p})}{n}}$$
![Page 38: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/38.jpg)
Exact or Asymptotic CI?
• Asymptotic CI are approximations
• Inappropriate when
  – proportion is near 0% or near 100%
  – sample sizes are small
  (confidence intervals are not symmetric in these cases)
• Preferable to use Binomial exact methods
  – can be computed in many statistics packages
  – or refer to tables
![Page 39: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/39.jpg)
Comparison of Asymptotic and Exact Methods
95% confidence intervals

| r/n | p | Asymptotic | Exact |
|---|---|---|---|
| 0/20 | 0% | not calculable | (0% to 14%) |
| 1/20 | 5% | (-5% to 15%) | (0% to 25%) |
| 2/20 | 10% | (-3% to 23%) | (1% to 32%) |
| 3/20 | 15% | (-1% to 31%) | (3% to 38%) |
| 4/20 | 20% | (2% to 38%) | (6% to 44%) |
| 5/20 | 25% | (6% to 44%) | (9% to 49%) |
| 6/20 | 30% | (10% to 50%) | (12% to 54%) |
| 7/20 | 35% | (14% to 56%) | (15% to 59%) |
| 8/20 | 40% | (19% to 61%) | (19% to 64%) |
| 9/20 | 45% | (23% to 67%) | (23% to 68%) |
| 10/20 | 50% | (28% to 72%) | (27% to 73%) |
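The comparison above can be approximated with a short sketch. The asymptotic interval uses the standard error formula for a proportion; for the exact interval the sketch assumes the two-sided Clopper-Pearson construction, solved by bisection on the binomial distribution, since the slides do not say which exact method was used. The function names are illustrative.

```python
from math import comb, sqrt


def wald_ci(r, n, z=1.96):
    """Asymptotic (Normal approximation) interval: p +/- z * sqrt(p(1-p)/n)."""
    p = r / n
    half_width = z * sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


def binom_cdf(k, n, p):
    """Pr(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(f):
    """Find the root in [0, 1] of a function f that is increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(r, n, alpha=0.05):
    """Two-sided Clopper-Pearson interval (an assumed choice of exact method)."""
    # Lower limit: the p at which Pr(X >= r) equals alpha/2.
    lower = 0.0 if r == 0 else _bisect(lambda p: (1 - binom_cdf(r - 1, n, p)) - alpha / 2)
    # Upper limit: the p at which Pr(X <= r) equals alpha/2.
    upper = 1.0 if r == n else _bisect(lambda p: alpha / 2 - binom_cdf(r, n, p))
    return lower, upper


# r = 0 is skipped because the asymptotic interval collapses to zero width there.
for r in range(1, 11):
    a_lo, a_hi = wald_ci(r, 20)
    e_lo, e_hi = exact_ci(r, 20)
    print(f"{r}/20: asymptotic ({a_lo:.0%} to {a_hi:.0%}), exact ({e_lo:.0%} to {e_hi:.0%})")
```

The asymptotic limits fall below 0% for small r, which is the behaviour the table illustrates; the exact limits always stay within 0% to 100%.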
![Page 40: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/40.jpg)
Confidence Intervals for Ratios of Probabilities and Odds
• Likelihood ratios are ratios of probabilities
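$$RR = \frac{r_1 / n_1}{r_2 / n_2}, \qquad \mathrm{se}(\ln RR) = \sqrt{\frac{1}{r_1} + \frac{1}{r_2} - \frac{1}{n_1} - \frac{1}{n_2}}$$

• Odds ratios are ratios of odds

$$OR = \frac{r_1 / (n_1 - r_1)}{r_2 / (n_2 - r_2)}, \qquad \mathrm{se}(\ln OR) = \sqrt{\frac{1}{r_1} + \frac{1}{r_2} + \frac{1}{n_1 - r_1} + \frac{1}{n_2 - r_2}}$$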
![Page 41: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/41.jpg)
CIs for study
• Sensitivity = 92% (62%, 100%)
• Specificity = 65% (57%, 73%)
• PPV = 18% (9%, 30%)
• NPV = 99% (94%, 100%)
• LR(>=26.7) = 3.8 (2.4, 6.1)
• LR(18.7-26.7) = 1.1 (0.3, 4.1)
• LR(<18.7) = 0.13 (0.02, 0.84)
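The likelihood ratio intervals can be reproduced on the log scale, treating each likelihood ratio as a ratio of two probabilities. This is a minimal sketch: `lr_with_ci` is an illustrative name, with r1/n1 the count and total in the diseased column and r2/n2 those in the non-diseased column.

```python
from math import exp, log, sqrt


def lr_with_ci(r1, n1, r2, n2, z=1.96):
    """Likelihood ratio (r1/n1)/(r2/n2) with a log-scale confidence interval."""
    lr = (r1 / n1) / (r2 / n2)
    se_log = sqrt(1 / r1 - 1 / n1 + 1 / r2 - 1 / n2)   # standard error of ln(LR)
    return lr, exp(log(lr) - z * se_log), exp(log(lr) + z * se_log)


# (LVSD present, LVSD absent) counts per BNP stratum; column totals 12 and 143.
for label, (r1, r2) in {">=26.7": (9, 28), "18.7-26.7": (2, 22), "<18.7": (1, 93)}.items():
    lr, lower, upper = lr_with_ci(r1, 12, r2, 143)
    print(f"LR({label}) = {lr:.2f} ({lower:.2f}, {upper:.2f})")
```

Because the interval is symmetric on the log scale, it is asymmetric around the likelihood ratio itself and cannot fall below zero.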
![Page 42: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/42.jpg)
ROC-curve
• ROC stands for Receiver Operating Characteristic
• ROC-curve shows the pairs of sensitivity and specificity that correspond to various cut-off points for the continuous test result
![Page 43: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/43.jpg)
Continuous diagnostic test results
[Figure: distributions of test results in the non-diseased and diseased groups, divided by the diagnostic threshold into TN, FP, FN and TP. Specificity = 94%, sensitivity = 94%.]
![Page 44: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/44.jpg)
Heterogeneity in Threshold
[Figure: distributions of test results in the non-diseased and diseased groups, divided by the diagnostic threshold into TN, FP, FN and TP. Specificity = 99%, sensitivity = 71%.]
![Page 45: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/45.jpg)
Heterogeneity in Threshold
[Figure: distributions of test results in the non-diseased and diseased groups, divided by the diagnostic threshold into TN, FP, FN and TP. Specificity = 97%, sensitivity = 86%.]
![Page 46: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/46.jpg)
Heterogeneity in Threshold
[Figure: distributions of test results in the non-diseased and diseased groups, divided by the diagnostic threshold into TN, FP, FN and TP. Specificity = 94%, sensitivity = 94%.]
![Page 47: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/47.jpg)
Heterogeneity in Threshold
[Figure: distributions of test results in the non-diseased and diseased groups, divided by the diagnostic threshold into TN, FP, FN and TP. Specificity = 97%, sensitivity = 86%.]
![Page 48: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/48.jpg)
Heterogeneity in Threshold
[Figure: distributions of test results in the non-diseased and diseased groups, divided by the diagnostic threshold into TN, FP, FN and TP. Specificity = 71%, sensitivity = 99%.]
![Page 49: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/49.jpg)
Threshold effects
Increasing threshold increases specificity but decreases sensitivity
Decreasing threshold increases sensitivity but decreases specificity
[Figure: fetal fibronectin for predicting spontaneous birth, sensitivity plotted against specificity.]
![Page 50: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/50.jpg)
Change in cut-off value and effect on sens & spec

| Cut-off | Sensitivity | Specificity |
|---|---|---|
| 9999 | 0% | 100% |
| 26.7 | 75% | 80% |
| 19.8 | 83% | 70% |
| 18.7 | 92% | 65% |
| 0 | 100% | 0% |
![Page 51: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/51.jpg)
[Figure: ROC curve for BNP, sensitivity plotted against 1-specificity, with points marked at cut-offs 26.7, 19.8 and 18.7.]
![Page 52: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/52.jpg)
ROC curve
• Shows the effect of different cut-off values on sensitivity and specificity
• Better tests have curves that lie closer to the upper left corner
• Area under the ROC is a single measure of test performance (higher is better)
• Shape
  – RAW continuous data gives steps
  – GROUPED data gives straight sloping lines
  – FITTED ROC curves are smoothed.
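For grouped data, where the ROC curve is a series of straight sloping lines, the area under the curve can be approximated with the trapezoidal rule. The sketch below uses the rounded cut-off table from the BNP example; the resulting area is only an illustration based on those five points, not a figure reported in the study, and the function names are assumptions.

```python
# (cut-off, sensitivity, specificity) rows from the BNP cut-off table.
cutoffs = [
    (9999, 0.00, 1.00),
    (26.7, 0.75, 0.80),
    (19.8, 0.83, 0.70),
    (18.7, 0.92, 0.65),
    (0, 1.00, 0.00),
]


def roc_points(rows):
    """Return (1 - specificity, sensitivity) pairs ordered from left to right."""
    return sorted((1 - spec, sens) for _, sens, spec in rows)


def trapezoid_auc(points):
    """Area under straight lines joining consecutive ROC points."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


auc = trapezoid_auc(roc_points(cutoffs))
print(f"Trapezoidal AUC from grouped points: {auc:.2f}")
```

An area of 0.5 corresponds to the diagonal (no diagnostic value) and 1.0 to a curve passing through the upper left corner.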
![Page 53: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/53.jpg)
Variation in diagnostic threshold
At what level is a test result categorised as +ve, and how should the threshold be selected?
Threshold affects the performance of the test, as described by ROC curves and likelihood ratios
Depends on
• disease prevalence (affects +ve and -ve predictive values)
• relative costs of false positive and false negative misdiagnoses
• relative benefits of true positive and true negative diagnoses
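One way to see how prevalence and misdiagnosis costs interact is to pick the cut-off with the lowest expected misclassification cost per patient. The sketch below is a deliberate simplification: the cost values are hypothetical (a false negative is assumed to be 5 times as costly as a false positive), benefits of correct diagnoses are ignored, and the function names are illustrative.

```python
# (cut-off, sensitivity, specificity) rows from the BNP cut-off table.
cutoffs = [
    (9999, 0.00, 1.00),
    (26.7, 0.75, 0.80),
    (19.8, 0.83, 0.70),
    (18.7, 0.92, 0.65),
    (0, 1.00, 0.00),
]


def expected_cost(sens, spec, prevalence, cost_fn, cost_fp):
    """Expected misclassification cost per patient tested."""
    false_negative_rate = prevalence * (1 - sens)
    false_positive_rate = (1 - prevalence) * (1 - spec)
    return false_negative_rate * cost_fn + false_positive_rate * cost_fp


def best_cutoff(rows, prevalence, cost_fn, cost_fp):
    """Return the cut-off with the smallest expected cost."""
    return min(
        rows,
        key=lambda row: expected_cost(row[1], row[2], prevalence, cost_fn, cost_fp),
    )[0]


# Hypothetical costs: false negative = 5 units, false positive = 1 unit.
low_prev_choice = best_cutoff(cutoffs, prevalence=12 / 155, cost_fn=5, cost_fp=1)
high_prev_choice = best_cutoff(cutoffs, prevalence=0.5, cost_fn=5, cost_fp=1)
print(f"prevalence 7.7%: cut-off {low_prev_choice}")
print(f"prevalence 50%:  cut-off {high_prev_choice}")
```

Under these assumed costs the preferred cut-off moves from 26.7 at the study prevalence to 18.7 at 50% prevalence, because missed cases weigh more heavily when disease is common.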
![Page 54: Measures of diagnostic accuracy](https://reader034.vdocuments.net/reader034/viewer/2022051218/56815a6f550346895dc7d108/html5/thumbnails/54.jpg)
Workshop exercise – erratum
• Q16 page 8: Compute post-test probabilities for a high risk patient, pre-test prob = 50%
• Q19 page 10:

| MI or BNP | LVSD +ve | LVSD -ve |
|---|---|---|
| +ve | 36 | 63 |
| -ve | 4 | 23 |
| Total | 40 | 86 |