![Page 1: 7071](https://reader036.vdocuments.net/reader036/viewer/2022081801/55cf97b3550346d0339311ab/html5/thumbnails/1.jpg)
Validation of predictive regression models
Ewout W. Steyerberg, PhD
Clinical epidemiologist
Frank E. Harrell, PhD
Biostatistician
![Page 2: 7071](https://reader036.vdocuments.net/reader036/viewer/2022081801/55cf97b3550346d0339311ab/html5/thumbnails/2.jpg)
Personal background
Ewout Steyerberg: Erasmus MC, Rotterdam, the Netherlands
Frank Harrell: Health Evaluation Sciences,
Univ of Virginia, Charlottesville, VA, USA
“Validation of predictions from
regression models is of
paramount importance”
![Page 3: 7071](https://reader036.vdocuments.net/reader036/viewer/2022081801/55cf97b3550346d0339311ab/html5/thumbnails/3.jpg)
Learning objectives: knowledge of
common types of regression models
fundamental assumptions of regression
models
performance criteria of predictive
models
principles of different types of validation
![Page 4: 7071](https://reader036.vdocuments.net/reader036/viewer/2022081801/55cf97b3550346d0339311ab/html5/thumbnails/4.jpg)
Performance objectives
To be able to explain why validation is
necessary for predictive models
To be able to judge the adequacy of a
validation procedure
![Page 5: 7071](https://reader036.vdocuments.net/reader036/viewer/2022081801/55cf97b3550346d0339311ab/html5/thumbnails/5.jpg)
Predictive models provide quantitative estimates of an outcome, e.g.
Quality of life one year after surgery
Death at 30 days after surgery
Long term survival
![Page 6: 7071](https://reader036.vdocuments.net/reader036/viewer/2022081801/55cf97b3550346d0339311ab/html5/thumbnails/6.jpg)
Predictive models are often based on regression analysis
y ~ a + sum(bi*xi)
y: outcome variable
a: intercept
bi: regression coefficient i
xi: predictor variable i
i in [1,many], usually 2 to 20
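The formula above can be read as a small computation. The sketch below evaluates a + sum(bi*xi) for one patient; the function name and all numbers are hypothetical and chosen only to illustrate the arithmetic.

```python
def linear_predictor(intercept, coefficients, values):
    """Return a + sum(b_i * x_i) for a single patient."""
    if len(coefficients) != len(values):
        raise ValueError("need exactly one value per coefficient")
    return intercept + sum(b * x for b, x in zip(coefficients, values))


# Hypothetical intercept, coefficients, and predictor values (not from the slides).
a = 1.0
b = [0.5, -2.0]
x = [4.0, 1.0]
y_hat = linear_predictor(a, b, x)  # 1.0 + 0.5*4.0 + (-2.0)*1.0
print(y_hat)
```

For linear regression the result is the predicted outcome itself; for logistic and Cox regression it is a score on the log-odds or log-hazard scale.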
![Page 7: 7071](https://reader036.vdocuments.net/reader036/viewer/2022081801/55cf97b3550346d0339311ab/html5/thumbnails/7.jpg)
3 examples of regression
Quality of life one year after surgery:
continuous outcome, linear regression
Death at 30 days after surgery:
binary outcome, logistic regression
Long term survival:
time-to-outcome, Cox regression
![Page 8: 7071](https://reader036.vdocuments.net/reader036/viewer/2022081801/55cf97b3550346d0339311ab/html5/thumbnails/8.jpg)
Predictive models make assumptions
Distribution
Linearity of continuous variables
Additivity of effects
![Page 9: 7071](https://reader036.vdocuments.net/reader036/viewer/2022081801/55cf97b3550346d0339311ab/html5/thumbnails/9.jpg)
Example: a simple logistic regression model
30-day mortality ~ a + b1*sex + b2*age
Assumptions:
Distribution of 30-day mortality is binomial
Age has a linear effect
The effects of sex and age can be added
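A minimal sketch of how such a model turns sex and age into a predicted risk, and of what "linear" and "additive" mean in practice. The coefficient values and the coding sex = 1 are assumptions made up for illustration; they are not estimates from the slides.

```python
import math


def predicted_risk(a, b_sex, b_age, sex, age):
    """Predicted probability of 30-day mortality from a logistic model."""
    log_odds = a + b_sex * sex + b_age * age
    return 1.0 / (1.0 + math.exp(-log_odds))


def logit(p):
    """Log odds of a probability."""
    return math.log(p / (1.0 - p))


# Hypothetical coefficients (assumed values, for illustration only).
A, B_SEX, B_AGE = -7.0, 0.4, 0.07

risk_60 = predicted_risk(A, B_SEX, B_AGE, sex=1, age=60)
risk_80 = predicted_risk(A, B_SEX, B_AGE, sex=1, age=80)

# Linearity and additivity: 20 extra years add 20 * B_AGE to the log odds,
# and that shift is the same whatever the value of sex.
shift_sex1 = logit(risk_80) - logit(risk_60)
shift_sex0 = (logit(predicted_risk(A, B_SEX, B_AGE, sex=0, age=80))
              - logit(predicted_risk(A, B_SEX, B_AGE, sex=0, age=60)))
print(round(risk_60, 3), round(risk_80, 3))
```

Note that the effects are linear and additive on the log-odds scale, not on the probability scale.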
![Page 10: 7071](https://reader036.vdocuments.net/reader036/viewer/2022081801/55cf97b3550346d0339311ab/html5/thumbnails/10.jpg)
Assessing model assumptions
Examine model residuals
Perform specific tests
add nonlinear terms, e.g. age + age^2
add interaction terms, e.g. sex*age
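As a sketch of what adding these terms means: the extended model below has one extra coefficient for age squared and one for the sex-by-age product. If both extra coefficients are zero, it reduces to the simple model, which is what a test of the assumptions examines. All names and numbers are hypothetical.

```python
def extended_linear_predictor(a, b_sex, b_age, b_age2, b_inter, sex, age):
    """Linear predictor with an added nonlinear and an added interaction term."""
    return (a
            + b_sex * sex
            + b_age * age
            + b_age2 * age ** 2      # nonlinear term: age squared
            + b_inter * sex * age)   # interaction term: sex * age


# Hypothetical coefficients (assumed values, for illustration only).
a, b_sex, b_age = -7.0, 0.4, 0.07
sex, age = 1, 70

lp_simple = a + b_sex * sex + b_age * age
lp_no_extra = extended_linear_predictor(a, b_sex, b_age, 0.0, 0.0, sex, age)
lp_extra = extended_linear_predictor(a, b_sex, b_age, 0.0005, -0.01, sex, age)
print(lp_simple, lp_no_extra, lp_extra)
```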
![Page 11: 7071](https://reader036.vdocuments.net/reader036/viewer/2022081801/55cf97b3550346d0339311ab/html5/thumbnails/11.jpg)
Model assumptions and predictions
Better predictions if assumptions are met
Some violation inherent in empirical data
Evaluate predictions in new data
![Page 12: 7071](https://reader036.vdocuments.net/reader036/viewer/2022081801/55cf97b3550346d0339311ab/html5/thumbnails/12.jpg)
Evaluation of predictions
Calibration
average of predictions correct?
low and high predictions correct?
Discrimination
distinguish low risk from high risk
patients?
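Both criteria can be computed directly from predicted risks and observed outcomes. The sketch below compares the mean prediction with the observed event rate (the "average of predictions" question) and computes the area under the ROC curve as the share of event/non-event pairs ranked correctly. The function names and the five-patient data set are hypothetical.

```python
def mean_calibration(predictions, outcomes):
    """Return (mean predicted risk, observed event rate)."""
    n = len(predictions)
    return sum(predictions) / n, sum(outcomes) / n


def c_statistic(predictions, outcomes):
    """Area under the ROC curve.

    Share of (event, non-event) pairs in which the patient with the event
    received the higher prediction; tied predictions count as one half.
    """
    events = [p for p, y in zip(predictions, outcomes) if y == 1]
    non_events = [p for p, y in zip(predictions, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p_event in events:
        for p_non_event in non_events:
            if p_event > p_non_event:
                concordant += 1.0
            elif p_event == p_non_event:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Hypothetical toy data: predicted risks and observed deaths (1 = died).
preds = [0.05, 0.10, 0.20, 0.30, 0.35]
died = [0, 0, 1, 0, 1]

mean_pred, event_rate = mean_calibration(preds, died)
auc = c_statistic(preds, died)
print(mean_pred, event_rate, auc)
```

In this toy example the model discriminates fairly well (5 of 6 pairs ranked correctly) but underestimates risk on average (mean prediction 0.2 against an event rate of 0.4), which shows that the two criteria measure different things.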
![Page 13: 7071](https://reader036.vdocuments.net/reader036/viewer/2022081801/55cf97b3550346d0339311ab/html5/thumbnails/13.jpg)
Example: predicted probabilities
[Figure: actual 30-day mortality (y-axis, 0.0 to 0.4) plotted against predicted probability of 30-day mortality (x-axis, 0.0 to 0.4)]
Area under ROC: 0.77
Calibration: OK
![Page 14: 7071](https://reader036.vdocuments.net/reader036/viewer/2022081801/55cf97b3550346d0339311ab/html5/thumbnails/14.jpg)
3 types of validation
Apparent: performance on sample used to
develop model
Internal: performance on population
underlying the sample
External: performance on related but
slightly different population
![Page 15: 7071](https://reader036.vdocuments.net/reader036/viewer/2022081801/55cf97b3550346d0339311ab/html5/thumbnails/15.jpg)
Apparent validity
Easy to calculate
Results in optimistic performance
estimates
![Page 16: 7071](https://reader036.vdocuments.net/reader036/viewer/2022081801/55cf97b3550346d0339311ab/html5/thumbnails/16.jpg)
Apparent estimates optimistic since same data used for:
Definition of model structure:
e.g. selection and coding of variables
Estimation of model parameters:
e.g. regression coefficients
Evaluation of model performance:
e.g. calibration and discrimination
![Page 17: 7071](https://reader036.vdocuments.net/reader036/viewer/2022081801/55cf97b3550346d0339311ab/html5/thumbnails/17.jpg)
Internal validity
More difficult to calculate
Test model in new data, drawn at random from
the underlying population
![Page 18: 7071](https://reader036.vdocuments.net/reader036/viewer/2022081801/55cf97b3550346d0339311ab/html5/thumbnails/18.jpg)
Why internal validation?
Honest estimate of performance should
be obtained, at least for a population
similar to the development sample
Internally validated performance sets an
upper limit to what may be expected in
other settings (external validity)
![Page 19: 7071](https://reader036.vdocuments.net/reader036/viewer/2022081801/55cf97b3550346d0339311ab/html5/thumbnails/19.jpg)
External validity
Moderately easy to calculate when new
data are available
Test model in new data, different from
development population
![Page 20: 7071](https://reader036.vdocuments.net/reader036/viewer/2022081801/55cf97b3550346d0339311ab/html5/thumbnails/20.jpg)
Why external validation?
Various factors may differ from
development population, including
different selection of patients
different definitions of variables
different diagnostic or therapeutic
procedures
![Page 21: 7071](https://reader036.vdocuments.net/reader036/viewer/2022081801/55cf97b3550346d0339311ab/html5/thumbnails/21.jpg)
Internal validation techniques
Split-sample:
development / validation
Cross-validation:
alternating development / validation
extreme: n-1 develop / 1 validate
(‘jack-knife’)
Bootstrap
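The first two techniques differ only in how patients are divided between development and validation. A minimal sketch, using patient indices 0 to n-1; the function names, the seed, and the fold assignment are illustrative assumptions, not a prescribed procedure.

```python
import random


def split_sample(n, validation_fraction=0.5, seed=0):
    """Split-sample: one random division into development and validation."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    cut = int(round(n * (1.0 - validation_fraction)))
    return indices[:cut], indices[cut:]


def k_fold_indices(n, k, seed=0):
    """Cross-validation: yield (development, validation) index lists.

    Each patient is used for validation exactly once. Setting k = n gives
    the extreme case of n-1 patients to develop and 1 to validate.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        development = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield development, folds[i]


for development, validation in k_fold_indices(n=10, k=5):
    print(sorted(validation), "validated with a model from", len(development), "patients")
```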
![Page 22: 7071](https://reader036.vdocuments.net/reader036/viewer/2022081801/55cf97b3550346d0339311ab/html5/thumbnails/22.jpg)
Bootstrap is the preferred internal validation technique
bootstrap sample for model development:
n patients drawn with replacement
original sample for validation: n patients
difference: optimism
efficiency: development and validation on n
patients
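The procedure above can be sketched in plain Python. Everything specific here is an assumption for illustration: the simulated data (three predictors, only the first related to outcome), the crude gradient-ascent fit standing in for a real logistic regression routine, the choice of 50 bootstrap samples, and all function names.

```python
import math
import random


def risk(coefs, row):
    """Predicted probability; coefs[0] is the intercept."""
    lp = coefs[0] + sum(c * x for c, x in zip(coefs[1:], row))
    return 1.0 / (1.0 + math.exp(-lp))


def fit_logistic(rows, ys, steps=100, rate=0.3):
    """Crude logistic fit by gradient ascent (illustration only)."""
    coefs = [0.0] * (len(rows[0]) + 1)
    n = len(ys)
    for _ in range(steps):
        grad = [0.0] * len(coefs)
        for row, y in zip(rows, ys):
            err = y - risk(coefs, row)
            grad[0] += err
            for j, x in enumerate(row):
                grad[j + 1] += err * x
        coefs = [c + rate * g / n for c, g in zip(coefs, grad)]
    return coefs


def auc(coefs, rows, ys):
    """Area under the ROC curve of the model's predictions in (rows, ys)."""
    preds = [risk(coefs, row) for row in rows]
    events = [p for p, y in zip(preds, ys) if y == 1]
    others = [p for p, y in zip(preds, ys) if y == 0]
    wins = sum(1.0 if e > o else 0.5 if e == o else 0.0
               for e in events for o in others)
    return wins / (len(events) * len(others))


def bootstrap_validation(rows, ys, n_boot=50, seed=1):
    """Return (apparent AUC, estimated optimism, optimism-corrected AUC)."""
    rng = random.Random(seed)
    n = len(ys)
    apparent = auc(fit_logistic(rows, ys), rows, ys)
    differences = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # n patients, with replacement
        b_rows = [rows[i] for i in idx]
        b_ys = [ys[i] for i in idx]
        if sum(b_ys) in (0, n):
            continue  # AUC is undefined without both outcomes
        model = fit_logistic(b_rows, b_ys)          # develop in bootstrap sample
        differences.append(auc(model, b_rows, b_ys) - auc(model, rows, ys))
    if not differences:
        raise ValueError("no usable bootstrap samples")
    optimism = sum(differences) / len(differences)
    return apparent, optimism, apparent - optimism


# Simulated data (assumption): 80 patients, only the first predictor matters.
sim = random.Random(0)
rows = [[sim.gauss(0, 1) for _ in range(3)] for _ in range(80)]
ys = [1 if sim.random() < 1.0 / (1.0 + math.exp(-row[0])) else 0 for row in rows]

apparent, optimism, corrected = bootstrap_validation(rows, ys)
print(round(apparent, 3), round(optimism, 3), round(corrected, 3))
```

The key design point is that every modeling step is repeated inside each bootstrap sample, and each bootstrap model is then tested in the full original sample, so all n patients are used for both development and validation.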
![Page 23: 7071](https://reader036.vdocuments.net/reader036/viewer/2022081801/55cf97b3550346d0339311ab/html5/thumbnails/23.jpg)
Example: bootstrap results for logistic regression model
30-day mortality ~ a + b1*sex + b2*age
Apparent area under the ROC curve: 0.77
Mean area of 200 bootstrap samples: 0.772
Mean area of 200 tests in original: 0.762
Optimism in apparent performance: 0.01
Optimism-corrected area: 0.76
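The arithmetic behind these figures, written out with the numbers from this slide (AUC denotes the area under the ROC curve; the bars denote means over the 200 bootstrap models):

```latex
\[
\begin{aligned}
\text{optimism} &= \overline{\mathrm{AUC}}_{\text{bootstrap samples}}
                 - \overline{\mathrm{AUC}}_{\text{tests in original}}
                 = 0.772 - 0.762 = 0.010 \\
\text{optimism-corrected AUC} &= \mathrm{AUC}_{\text{apparent}} - \text{optimism}
                 = 0.77 - 0.01 = 0.76
\end{aligned}
\]
```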
![Page 24: 7071](https://reader036.vdocuments.net/reader036/viewer/2022081801/55cf97b3550346d0339311ab/html5/thumbnails/24.jpg)
External validation techniques
Temporal validation: same
investigators, validate in recent years
Spatial validation (other place): same
investigators, cross-validate in centers
Fully external: other investigators, other
centers
![Page 25: 7071](https://reader036.vdocuments.net/reader036/viewer/2022081801/55cf97b3550346d0339311ab/html5/thumbnails/25.jpg)
Example: external validity of logistic regression model
30-day mortality ~ a + b1*sex + b2*age
Apparent area in 785 patients: 0.77
Tested in 20,318 other patients: 0.74
Tested by other investigators: ?
![Page 26: 7071](https://reader036.vdocuments.net/reader036/viewer/2022081801/55cf97b3550346d0339311ab/html5/thumbnails/26.jpg)
Example: external validation
[Figure: actual 30-day mortality (y-axis, 0.0 to 0.4) plotted against predicted probability of 30-day mortality (x-axis, 0.0 to 0.4)]
Area under ROC: 0.74
Calibration: reasonable
![Page 27: 7071](https://reader036.vdocuments.net/reader036/viewer/2022081801/55cf97b3550346d0339311ab/html5/thumbnails/27.jpg)
Summary
Apparent validity gives an optimistic
estimate of model performance
Internal validity may be estimated by
bootstrapping
External validity should be determined
in other populations
![Page 28: 7071](https://reader036.vdocuments.net/reader036/viewer/2022081801/55cf97b3550346d0339311ab/html5/thumbnails/28.jpg)
Key references
tutorial and book on multivariable models (Harrell 1996: Stat Med 15: 361-87; Harrell: Regression modeling strategies, Springer 2001)
empirical evaluations of strategies (Steyerberg 2000: Stat Med 19: 1059-79)
internal validation (Steyerberg 2001: JCE 54: 774-81)
external validation (Justice 1999: Ann Intern Med 130: 515-24; Altman 2000: Stat Med 19: 453-73)
![Page 29: 7071](https://reader036.vdocuments.net/reader036/viewer/2022081801/55cf97b3550346d0339311ab/html5/thumbnails/29.jpg)
Links
Interactive text book on predictive modeling
http://www.neri.org/symptom/mockup/Chapter_8/
Harrell’s Regression modeling strategies
http://hesweb1.med.virginia.edu/biostat/rms/