introduction to modeling - mit opencourseware · 2020. 12. 31. · h 1: 100% cherry h 2: 75%...
TRANSCRIPT
Introduction to Modeling6.872/HST950
Why build Models?
• To predict (identify) something
• Diagnosis
• Best therapy
• Prognosis
• Cost
• To understand something
• Structure of model may correspond to structure of reality
Where do models come from?
•••
• Assumes uniform priors over all hypotheses in the space
• A-priori knowledge, expressed in
• Structure of the space of models
• • Adjustments to observed data
Pure induction from data Even so, need some “space” of models to explore Maximum A-posteriori Probability (MAP)
• Maximum Likelihood (ML)
An Example(Russell & Norvig)
• Surprise Candy Corp. makes two flavors of candy: cherry and lime
• Both flavors come in the same opaque wrapper
• Candy is sold in large bags, which have one of the following distributions of flavors, but are visually indistinguishable:
• h1: 100% cherry
• h2: 75% cherry, 25% lime
• h3: 50% cherry, 50% lime
• h4: 25% cherry, 75% lime
• h5: 100% lime
• Relative prevalence of these types of bags is (.1, .2, .4, .2, .1)
• As we eat our way through a bag of candy, predict the flavor of the next piece; actually a probability distribution.
Bayesian Learning
• Calculate the probability of each hypothesis given the data
• To predict the probability distribution over an unknown quantity, X,
• If the observations d are independent, then
• E.g., suppose the first 10 candies we taste are all lime
h1: 100% cherry
h2: 75% cherry, 25% lime
h3: 50% cherry, 50% lime Learning Hypotheses
h4: 25% cherry, 75% lime
h5: 100% lime and Predicting from Them
• (a) probabilities of hi after k lime candies; (b) prob. of next lime
• Image by MIT OpenCourseWare.
MAP prediction: predict just from most probable hypothesis
• After 3 limes, h5 is most probable, hence we predict lime
• Even though, by (b), it’s only 80% probable
0 2 4 6 8 10 0 2 4 6 8 100
0.2
0.4
0.6
0.8
1
0.4
0.5
0.6
0.7
0.8
0.9
1
Number of samples in d
a b
Number of samples in d
Post
erio
r pro
babi
lity
of h
ypot
hesi
s
Prob
abili
ty th
at n
ext c
andy
is li
me
P(h1 | d) P(h2 | d) P(h3 | d) P(h4 | d) P(h5 | d)
Observations
• Bayesian approach asks for prior probabilities on hypotheses!
• Natural way to encode bias against complex hypotheses: make their prior probability very low
• Choosing hMAP to maximize
• is equivalent to minimizing
• but as we know that entropy is a measure of information, these two terms are
• # of bits needed to describe the data given hypothesis
• # bits needed to specify the hypothesis
• Thus, MAP learning chooses the hypothesis that maximizes compression of the data; Minimum Description Length principle
• Regularization is similar to 2nd term—penalty for complexity
• Assuming uniform priors on hypotheses makes MAP yield hML, the maximum likelihood hypothesis, which maximizes
Learning More Complex Hypotheses
• Input:
• Set of cases, each of which includes
• numerous features: categorical labels, ordinals, continuous
• these correspond to the independent variables
• Output:
• For each case, a result, prediction, classification, etc.,corresponding to the dependent variable
• In regression problems, a continuous output
• a designated feature the model tries to predict
• In classification problems, a discrete output
• the category to which the case is assigned
• Task: learn function f(input)=output
• that minimizes some measure of error
Linear Regression
• General form of the function
• For each case:
• Find to minimize some function of over all
• e.g., mean squared error:
Logistic Regression
• Logistic function:
• E.g, how risk factors contribute to probability of death are the log odds ratios •
More sophisticated models
• Nearest Neighbor Methods
• Classification Trees
• Artificial Neural Nets
• Support Vector Machines
• Bayes Networks (much on this, later)
• Rough Sets, Fuzzy Sets, etc. (see 6.873/HST951 or other ML classes)
How?
• Given: pile of training data, all cases labeled with gold standard outcome
• Learn “best” model
• Gather new test data, also all labeled with outcomes
• Test performance of model on new test data
• Simple, no?
Simplest Example
• Relationship between a diagnostic conclusion and a diagnostic test
Test Positive
Test Negative
Disease Present
True Positive
False Negative TP+FN
Disease Absent
False Positive
True Negative FP+TN
TP+FP FN+TN
DefinitionsTest
Positive Test
Negative
Disease Present
True Positive
False Negative
TP+FN
Disease Absent
False Positive
True Negative
FP+TN
TP+FP FN+TN
Sensitivity (true positive rate): TP/(TP+FN)
False negative rate: 1-Sensitivity = FN/(TP+FN)
Specificity (true negative rate): TN/(FP+TN)
False positive rate: 1-Specificity = FP/(FP+TN)
Positive Predictive Value (PPV): TP/(TP+FP)
Negative Predictive Value (NPV): TN/(FN+TN)
Test Thresholds
+
-
FPFN
T
Wonderful Test
+
-
FPFN
T
Test Thresholds Change Trade-off between Sensitivity and Specificity
+
-
FPFN
T
Receiver Operator Characteristic TPR
(sensitivity) (ROC) Curve
0 FPR (1-specificity)1 0
1
T
TPR What makes a better test?
0 FPR (1-specificity)1
(sensitivity)
0
1
worthless
superb
OK
Need to explore many models
• Remember:
• training set => model
• model + test set => measure of performance
• But
• How do we choose the best family of models?
• How do we choose the important features?
• Models may have structural parameters
• Number of hidden units in ANN
• Max number of parents in Bayes Net
• Parameters (like the betas in LR), and meta-parameters
• Not legitimate to “try all” and report the best !!!!!!!!!!!!!!!!!!
The Lady Tasting Tea
• R.A. Fisher & the Lady
• B. Muriel Bristol claimed she prefers tea added to milk rather than milk added to tea
• Fisher was skeptical that she could distinguish
• Possible resolutions
• Reason about the chemistry of tea and milk
• Milk first: a little tea interacts with a lot of milk
• Tea first: vice versa
• Perform a “clinical trial”
• Ask her to determine order for a series of test cups
• Calculate probability that her answers could have occurred by chance guessing; if small, she “wins”
• ... Fisher’s Exact Test
• Significance testing
• Reject the null hypothesis (that it happened by chance) if its probability is < 0.1, 0.05, 0.01, 0.001, ..., 0.000001, ..., ????
How to deal with multiple testing
• Suppose Ms. Bristol had tried this test 100 times, and passed once.Would you be convinced of her ability to distinguish?
• Bonferroni correction: for n trials, insist on a p-value that is 1/n of what you would demand for a single trial
Cross-validationTraining Data “Real” Training Data
Validation Data
Test Data
• Any number of times
• Train on some subset of the training data
• Test on the remainder, called the validation set
• Choose best meta-parameters
• Train, with those meta-parameters, on all training data
• Test on Test data, once!
Aliferis lessons (part)
• Overfitting
• bias, variance, noise
• O = optimal possible model over all possible learners
• L = best model learnable by this learner
• A = actual model learned
• Bias = O - L (limitation of learning method or target model)
• Variance = L - A (error due to sampling of training cases)
• Compare against learning from randomly permuted data
• Curse of dimensionality
• Feature selection
• Dimensionality reduction
Causality
• Suppes, 1950’s
• Statistical association
• Temporal succession
• No confounders (!)
• hidden variables • A node, X, is conditionally independent
of all other nodes in the network given its Markov blanket: its parents, Ui, children,Yi, and children’s parents, Zi.
X
Y1 Yn
UnU1
Z1 Zn
Using MIMIC data to build predictive models
• Mortality • Caleb Hug’s 2009 PhD thesis: http://dspace.mit.edu/handle/1721.1/46690
• Comparison to SAPS II
• Daily Acuity Scores
• Real-time Acuity Scores
• Other outcomes
• Good
• Weaning from Ventilator
• Weaning from Intra-Aortic Balloon Pump
• Weaning from Vasopressors
• Bad
• Septic shock
• Hypotension
• Acute kidney injury
3.2. EXTRACTING THE VARIABLES 31
Systolic Blood Pressure
Hours Between SBP Measurements
Fre
quency
0 1 2 3 4
0e+
00
8e+
05
Glasgow Coma Scale
Hours Between GCS Observations
Fre
quency
0 5 10 15
0e+
00
2e+
05
Blood Urea Nitrogen
Hours Between BUN Measurements
Fre
quency
0 5 10 15 20 25 30
010000
INR
Hours Between INR Observations
Fre
quency
0 5 10 15 20 25 30
04000
8000
WBC Counts
Hours Between WBC Counts
Fre
quency
0 5 10 15 20 25 30
010000
FiO2 Settings
Hours Between FiO2Set Recordings
Fre
quency
0 5 10 15 20 25 30
060000
Figure 3-1: Observation frequency histograms
Cleaning the data—half the research time
• Missing values
• Some values are not measured for some clinical situations
• Failures in data capture process
• Episodically measured variables
• Unclear/undefined clinical states 0 10 20 30 40 0 5 10 15 20 25 30
• Imprecise timing of meds, ... Hours Between HCT Measurements Hours Between Art_BE Measurements
• Partially measured i/o
• Proxies: e.g., which ICU⇒what disease
• Derived variables: integrals, slopes, ranges, frequencies, etc.
• Transformed variables: square root, log, etc.
• Select subset of data with enough data!
Hematocrit
Fre
quency
0
4000
Arterial Base Excess
Fre
quency
0
10000
Figure by Hug, Caleb Wayne. "Detecting Hazardous Intensive Care PatientEpisodes Using Real-time Mortality Models." Massachusetts Institute of Technology, 2009.
3.4. PRELIMINARY DATASET 45
Figure 3-2: Histograms for demographic information
3.4. PRELIMINARY DATASET 47
Figure 3-4: SAPS II Variables (cont)
0
Descriptive look
Fre
quency
Fre
quency
Fre
quency
0
1000
2000
3000
0
500
1000
1500
0
1000
2000
3000
Patient Length of Stay Patient Age Histogram of Blood Urea Nitrogen (BUN) Histogram of White Blood Cell Counts
Fre
quency
Fre
quency
Fre
quency
50000
150000
0
50000
150000
0
50000
150000
20 40 60 80 100
n 13923
0 10 20 30 40 50
n 13923
0 50 100 150 0 10 20 30 40 50
n 2448468 n 2448468
Fre
quency
Fre
quency
Fre
quency
0
20000
60000
0
50000
150000
0
50000
150000
Fre
quency
Fre
quency
Fre
quency
0
4000
8000
12000
0
2000
6000
10000
0
200
400
600 missing 186042 missing 224258missing 0 missing 78
30.8 12.9mean mean5.02 63.5mean mean median 23 median 11.7median 2.67 median 65 std dev 24.3 std dev 6.5std dev 8.1 std dev 17
Length of Stay (days) Age (Years) BUN (mg/dL) WBC (1000/mm^3)
Patient Admit Weight Sex of Patients Histogram of Potassium Histogram of Sodium
50 100 150 200 250
n 13923
2 3 4 5 6 7 8 120 130 140 150 160
n 2448468 n 2448468n 13923
missing 56 missing 105388 missing 139134missing 1085 4.07 13981 mean meanmean
median 4 median 139median 78.5 std dev 0.534 std dev 4.76std dev 21.8
Female Male
Weight (kg) Sex Potassium (mEq/L) Sodium (mEq/L)
Service types for patients ICD9 Chronic Illnesses Histogram of Bicarbonate (CO2) Histogram of Bilirubin
10 20 30 40 50
n 2448468 n
missing
mean
median
std dev
2448468
1842053
4.01
1.1
7.46
13923
Other NSICU MSICU CSRU
13923
No Yes
n n missing 204650missing 0 missing 0 mean 24.6
median 24
std dev 4.72
0 5 10 15 20 25 30
ICD9 Chronic Illness? Service Type (metastatic carcinoma, hematologic malignancy, AIDS) Bicarbonate (mEq/L) Bilirubin (mg/dL)
Figure by Hug, Caleb Wayne. "Detecting Hazardous Intensive Care PatientEpisodes Using Real-time Mortality Models." Massachusetts Institute of Technology, 2009.
3.4. PRELIMINARY DATASET 49
a given patient. Some patients, for example, are recorded as having died over oneyear after their first ICU discharge. To limit such cases, one might define death as“died within the ICU or within 30 days of discharge”. This limit marks a patient asalive if he or she is still in the hospital 30 days after ICU discharge. If the patientis not in the hospital at this point, the hospital discharge status of the patient isused to indicate mortality: if the patient was discharged alive (censored) they aremarked as survived; otherwise they are marked as expired. Figure 3-6 illustratesthe change in mortality rate as patients stay in the ICU for longer periods of time.This figure includes both hospital mortality (i.e., the patient died at any point duringany recorded visit) and the within-30-days-of-ICU-discharge mortality. As the figureshows, the two mortality rates track each other closely for the first several days.However, it is clear that patients who stay in the ICU longer are more apt to remainin the hospital for a significant period of time before dying. For the remainder of thiswork, references to “mortality” indicate death in the ICU or within the following 30days.
Figure 3-6: Patient counts versus the number of days spent in the ICU (left) andmortality rate versus the number of days spent in the ICU (right). For each patient,only the first ICU stay of the first recorded hospital visit is considered. “ICU + 30 daymortality” excludes deaths that occur after long post-ICU discharge hospitalizations.If a patient leaves the hospital alive within this 30-day period, they are assumed tohave survived.
3.5. FINAL DATASET 51
Table 3.14: Final Dataset: Partial Patient Exclusions
Drop Rule Number of RowsIn the ICU for longer than seven days 728739Received limited treatment, including 198942
CMO (“comfort measures only”)DNR (“do not resuscitate”)DNI (“do not intubate”)“no CPR” or “other code”
Received hemodialysis or hemofiltration 139561
The motivation for these rules generally follows the reasoning for excluding entirepatients. For example, as Figure 3-6 indicates by plotting the mortality rate versusthe number of days spent in the ICU, most patients leave the ICU within seven days ofadmission. For patients that do not leave in this 7-day window, the 30-day mortalityrate starts to noticeably decrease as caregivers are able to successfully prolong thepatient’s life while the patient remains in a compromised state often dependent onvarious interventions.
3.5.2 Dataset Summary
The final dataset — after applying all of the exclusions mentioned above — is sum-marized in Table 3.15. In addition, Appendix B lists the 438 individual variableswith brief summary statistics. Figure 3-7 provides an updated version of Figure 3-6for the final dataset.
Outcomes0
2000
6000
10000
14000
Num
ber
of patients
Total Number of Patients Number of Hospital Mortalities Number of ICU + 30 days Mortalities
0 5 10 15 20 25 30 10
15
20
25
Mort
alit
y R
ate
(%
)
0 5 10 15 20 25 30
0
2000
6000
10000
14000
Number of Patients
Hospital Mortality
ICU + 30 days Mortality
0 5 10 15 20 25 30
Days in ICU Days in ICU
Table 3.15: Preprocessed Data
Number of Patients 10,066Number of Rows 1,044,982Number of Features 438
Figure by Hug, Caleb Wayne. "Detecting Hazardous Intensive Care PatientEpisodes Using Real-time Mortality Models." Massachusetts Institute of Technology, 2009.
4.3. OTHER SEVERITY OF ILLNESS SCORES 63
SAPS II
Table 4.1: SAPS II Variables Variable Max Points Age 18 Heart rate 11 Systolic BP 13 Body temperature 3 PaO2:FiO2 (if ventilated or continuous 11
positive airway pressure) Urinary output 11 Serum urea nitrogen level 10 WBC count 12 Serum potassium 3 Serum sodium level 5 Serum bicarbonate level 6 Bilirubin level 9 Glasgow Coma Scorea 26 Chronic diseases 17 Type of admission 8
aIf the patient is sedated, the estimated GCS prior to sedation
Figure by Hug, Caleb Wayne. "Detecting Hazardous Intensive Care PatientEpisodes Using Real-time Mortality Models." Massachusetts Institute of Technology, 2009.
68 CHAPTER 5. MORTALITY MODELS
0 5 10 15 20 25 30 35
0.7
00.8
00.9
0
Fold 5
Number of covariates
RO
C A
rea
orientation_max_i4/5 Train, n=18299
1/5 Val, n=4589
Figure 5-1: SDAS model selection. Sensitivity to number of covariates on each cross-validation fold. The covariate(s) from the simplest model are marked on the trainingcurves.
68 CHAPTER 5. MORTALITY MODELS
0 5 10 15 20 25 30 35
0.7
00.8
00.9
0
Fold 1
Number of covariates
RO
C A
rea
GCS_max_sq
4/5 Train, n=18307
1/5 Val, n=4581
0 5 10 15 20 25 30
0.7
00.8
00.9
0
Fold 2
Number of covariates
RO
C A
rea
GCS_max_sq
4/5 Train, n=18499
1/5 Val, n=4389
0 5 10 15 20 25 30 35
0.7
00.8
00.9
0
Fold 3
Number of covariates
RO
C A
rea
mechVent_mean_sq4/5 Train, n=18220
1/5 Val, n=4668
0 5 10 15 20 25 30
0.7
00.8
00.9
0
Fold 4
Number of covariates
RO
C A
rea
GCS_max_sq
4/5 Train, n=18227
1/5 Val, n=4661
Figure 5-1: SDAS model selection. Sensitivity to number of covariates on each cross-validation fold. The covariate(s) from the simplest model are marked on the trainingcurves.
5.2. DAILY ACUITY SCORES 69
probability of mortality (positive correlation) while a negative coefficient means lessrisk of mortality (negative correlation).
Model 5.1 includes a number of interesting covariates. By examining the Wald Zscores, however, it appears that the model can likely be improved. For example, thecontributions from the largest length of stay fluid balance (LOSBal max, Z = 3.94)and the largest 24 hour fluid balance (Bal24 max, Z = −3.68) may largely counteracteach other and the model may benefit from removing at least one of these variablesand possibly replacing it with an input variable (a mean hourly output for the day,OutputB 60 mean sqrt, is already included). Similarly, the meaningfulness of thepressD01 sd sq variable (that is, the squared standard deviation of the points marked1 following the first pressor infusion and marked 0 before any pressors) is questionableas it decreases risk for patients who have a pressor started in the middle of their firstday, but increases risk for patients who receive pressors early or late on their first day.
Examining all five folds from Figure 5-1, it is clear that there was negligible impacton the validation performance by reducing the number of covariates to about 25.Considering this, I took the top 25 covariates from the models for cross-validationfolds 1, 2, 3, and 5 (excluding fold 4) and created a model using these covariates.By performing backward elimination one last time on this model a final model wasselected. As done with each cross-validation fold in Figure 5-1, a plot of performanceversus the number of covariates in a given model was created. This plot is shown inFigure 5-2.
Figure 5-2: SDAS model selection (all development data)
The models in Figure 5-2 indicate that most of the performance was capturedwith about 35 inputs. This model was chosen for additional refinement. Upon exam-ination, it was found to contain a number of pressor-related
Training models—5-fold cross validation
Fold 1 Fold 2 Fold 5
RO
C A
rea
RO
C A
rea
0.7
0
0.8
0
0.9
0
0.7
0
0.8
0
0.9
0
GCS_max_sq
4/5 Train, n=18307
1/5 Val, n=4581 R
OC
Are
a
RO
C A
rea
Number of covariates Number of covariates Number of covariates
0.7
0
0.8
0
0.9
0
0.7
0
0.8
0
0.9
0
GCS_max_sq
4/5 Train, n=18499
1/5 Val, n=4389
RO
C A
rea
RO
C A
rea
0.7
0
0.8
0
0.9
0
0.7
0
0.8
0
0.9
0
orientation_max_i 4/5 Train, n=18299
1/5 Val, n=4589
0 5 10 15 20 25 30 35 0 5 10 15 20 25 30 0 5 10 15 20 25 30 35
SDAS Model SensitivityFold 3 Fold 4
mechVent_mean_sq 4/5 Train, n=18220
1/5 Val, n=4668
GCS_max_sq
4/5 Train, n=18227
1/5 Val, n=4661
GCS_max_sq
Training Data: n=22888 0 5 10 15 20 25 30 35 0 5 10 15 20 25 30
Number of covariates Number of covariates
0 10 20 30 40
Number of CovariatesFigure by Hug, Caleb Wayne. "Detecting Hazardous Intensive Care PatientEpisodes Using Real-time Mortality Models." Massachusetts Institute of Technology, 2009.
70 CHAPTER 5. MORTALITY MODELS
72 CHAPTER 5. MORTALITY MODELSMany univariate analysesModel 5.1 SDAS Model for Fold 2 with 30 Covariates Model 5.2 Final SDAS model
Obs Max Deriv Model L.R. d.f. P C Dxy Obs Max Deriv Model L.R.
20172 1e-09 5415.11 30 0 0.893 0.785 20130 3e-10 5619.28
Gamma Tau-a R2 Brier 0.787 0.176 0.439 0.076
Coef S.E. Wald Z P INR_mean_i -1.795e+00 1.423e-01 -12.61 0.0000 GCS_max_sq -7.485e-03 6.000e-04 -12.47 0.0000 OutputB_60_mean_sqrt -6.561e-02 6.885e-03 -9.53 0.0000 pacemkr_max -1.084e+00 1.183e-01 -9.16 0.0000 svCSRU_max -9.516e-01 1.208e-01 -7.88 0.0000 GCSrdv_mean -1.138e-01 1.528e-02 -7.45 0.0000 pressD01_mean_am -2.774e+00 3.893e-01 -7.13 0.0000 Platelets_Slope_1680_min -5.493e+00 8.615e-01 -6.38 0.0000 pressD01_sd_sq -5.085e+00 8.678e-01 -5.86 0.0000 sedatives_mean_sq -4.375e-01 8.455e-02 -5.17 0.0000 Bal24_max -4.493e-05 1.222e-05 -3.68 0.0002 CV_HRrng_max -3.267e-03 1.083e-03 -3.02 0.0026 Intercept 4.292e-01 4.085e-01 1.05 0.2934 Milrinone_perKg_min_sq 3.523e+00 1.113e+00 3.17 0.0015 LOSBal_max 2.247e-05 5.703e-06 3.94 0.0001 hrmVA_max 3.410e-01 6.767e-02 5.04 0.0000 MBPm.pr_min_am 1.904e+00 3.711e-01 5.13 0.0000 Mg_min_sq 1.067e-01 1.798e-02 5.93 0.0000 beta.Blocking_agent_mean_lam 2.418e-01 3.955e-02 6.11 0.0000 Na_mean_am 5.214e-02 8.415e-03 6.20 0.0000 mechVent_mean_sq 7.183e-01 1.047e-01 6.86 0.0000 RESP_mean_sq 9.226e-04 1.293e-04 7.13 0.0000 Platelets_mean_i 2.512e+01 3.512e+00 7.15 0.0000 Lasix_max_lam 2.550e-01 3.457e-02 7.38 0.0000 CO2_mean_i 2.038e+01 2.741e+00 7.43 0.0000 jaundiceSkin_mean_la 1.523e-01 2.014e-02 7.56 0.0000 hospTime_min_sqrt 6.860e-03 7.939e-04 8.64 0.0000 pressorSum.std_mean_sqrt 7.758e-01 7.225e-02 10.74 0.0000 SpO2.oor30.t_mean_sqrt 4.929e-01 4.095e-02 12.04 0.0000 BUNtoCr_min_sqrt 2.867e-01 2.323e-02 12.34 0.0000 Age_min_sq 2.258e-04 1.450e-05 15.57 0.0000
Gamma 0.798
Tau-a 0.177
R2 0.456
GCS_max_sq INR_mean_i pacemkr_max svCSRU_max RikerSAS_mean Platelets_Slope_1680_min urineByHr_mean_sqrt GCSrdv_mean GCSrng_min_am pressD01_mean_am CV_HRrng_max Insulin_sd_sq alloutput_max_la MetCarcinoma_min WBC_mean_am AIDS_min Intercept MBPm.pr_min_am HemMalig_min RESP_mean_sq hrmVA_max PaO2toFiO2_mean Na_mean_am Mg_min_sq ShockIdx_max Platelets_mean_i hospTime_min_sqrt day_min_sq jaundiceSkin_mean_la CO2_mean_i Lasix_max_lam beta.Blocking_agent_mean_lam Sympathomimetic_agent_min SpO2.oor30.t_mean_sqrt BUNtoCr_min_sqrt Age_min_sq
d.f. P C Dxy 35 0 0.898 0.797
Brier 0.074
Coef S.E. Wald Z P -0.0064668 5.032e-04 -12.85 0.0000 -1.8734049 1.458e-01 -12.85 0.0000 -0.9337190 1.179e-01 -7.92 0.0000 -0.9137522 1.250e-01 -7.31 0.0000 -0.3430971 5.151e-02 -6.66 0.0000 -5.8856843 8.839e-01 -6.66 0.0000 -0.0584113 9.453e-03 -6.18 0.0000 -0.0902717 1.552e-02 -5.82 0.0000 -0.0812232 1.459e-02 -5.57 0.0000 -1.6132643 3.005e-01 -5.37 0.0000 -0.0061979 1.216e-03 -5.10 0.0000 -2.1686950 4.372e-01 -4.96 0.0000 -0.0890330 2.265e-02 -3.93 0.0001 0.4468763 1.567e-01 2.85 0.0043 0.0147036 5.149e-03 2.86 0.0043 0.5954305 1.991e-01 2.99 0.0028 1.5314512 4.529e-01 3.38 0.0007 1.4601630 3.518e-01 4.15 0.0000 0.6032027 1.212e-01 4.98 0.0000 0.0006615 1.324e-04 5.00 0.0000 0.3520834 6.823e-02 5.16 0.0000 0.2672376 4.336e-02 6.16 0.0000 0.0549066 8.506e-03 6.45 0.0000 0.1173220 1.815e-02 6.46 0.0000 0.5742182 8.853e-02 6.49 0.0000
24.0719462 3.560e+00 6.76 0.0000 0.0057514 8.158e-04 7.05 0.0000 0.0170075 2.372e-03 7.17 0.0000 0.1469141 2.045e-02 7.18 0.0000
19.3845272 2.682e+00 7.23 0.0000 0.2523702 3.444e-02 7.33 0.0000 0.2918077 3.923e-02 7.44 0.0000 0.8576883 9.254e-02 9.27 0.0000 0.4059329 4.128e-02 9.83 0.0000 0.2829088 2.348e-02 12.05 0.0000 0.0002601 1.495e-05 17.40 0.0000
Figure by Hug, Caleb Wayne. "Detecting Hazardous Intensive Care PatientEpisodes Using Real-time Mortality Models." Massachusetts Institute of Technology, 2009.
5.2. DAILY ACUITY SCORES 73
Figure 5-3: SDAS ROC curve (development data). AUC = the area under the curve;n = the total number of available predictions used for curve; Missing = number ofmissing predictions.
Table 5.4: SDAS Hosmer-Lemeshow H risk deciles (all days)Died Survived
Decile Prob.Range Prob. Obs. Exp. Obs. Exp. Total1 [0.000203,0.00335) 0.002 2 4.2 2011 2008.8 20132 [0.003353,0.00682) 0.005 3 9.9 2010 2003.1 20133 [0.006825,0.01281) 0.010 11 19.2 2002 1993.8 20134 [0.012812,0.02277) 0.017 24 34.8 1989 1978.2 20135 [0.022771,0.03971) 0.031 53 61.5 1960 1951.5 20136 [0.039706,0.06691) 0.052 104 104.8 1909 1908.2 20137 [0.066911,0.11297) 0.088 198 176.7 1815 1836.3 20138 [0.112972,0.20128) 0.152 324 305.3 1689 1707.7 20139 [0.201280,0.40232) 0.285 610 574.7 1403 1438.3 201310 [0.402321,0.99876] 0.634 1239 1276.9 774 736.1 2013
χ2 = 24.47, d.f. = 8; p = 0.002
88 CHAPTER 5. MORTALITY MODELS
1!Specificity
Sensitivity
0.0 0.2 0.4 0.6 0.8 1.0
0.0
0.2
0.4
0.6
0.8
1.0
SDAS: Day 1
AUC
n
Missing
0.87
2434
584
1!Specificity
Sensitivity
0.0 0.2 0.4 0.6 0.8 1.0
0.0
0.2
0.4
0.6
0.8
1.0
SDAS: Day 2
AUC
n
Missing
0.877
2108
216
1!Specificity
Sensitivity
0.0 0.2 0.4 0.6 0.8 1.0
0.0
0.2
0.4
0.6
0.8
1.0
SDAS: Day 3
AUC
n
Missing
0.883
1389
137
1!Specificity
Sensitivity
0.0 0.2 0.4 0.6 0.8 1.0
0.0
0.2
0.4
0.6
0.8
1.0
SDAS: Day 4
AUC
n
Missing
0.882
928
85
1!Specificity
Sensitivity
0.0 0.2 0.4 0.6 0.8 1.0
0.0
0.2
0.4
0.6
0.8
1.0
SDAS: Day 5
AUC
n
Missing
0.863
662
59
Figure 5-7: SDAS ROC curves (validation data)
92 CHAPTER 5. MORTALITY MODELS
0.0 0.2 0.4 0.6 0.8 1.0
0.0
0.2
0.4
0.6
0.8
1.0
Predicted Probability
Actu
al P
rob
ab
ility
IdealLogistic calibrationNonparametricGrouped observations
Dxy C (ROC) R2 D U Q Brier InterceptSlope Emax
0.741 0.871 0.374 0.209 0.003 0.205 0.074!0.079 0.866 0.046
SDAS Day 1
0.0 0.2 0.4 0.6 0.8 1.0
0.0
0.2
0.4
0.6
0.8
1.0
Predicted Probability
Actu
al P
rob
ab
ility
IdealLogistic calibrationNonparametricGrouped observations
Dxy C (ROC) R2 D U Q Brier InterceptSlope Emax
0.755 0.877 0.376 0.205 0.005 0.200 0.072!0.132 0.829 0.066
DAS1
Figure 5-9: Calibration plots for SDAS, SDAS day 1, and DAS1 (validation data). Therelative frequencies for each predicted probability are indicated by the bars along thex-axis.
Evaluating the models
SDAS: All Days
Sensitivity
0.0
0.2
0.4
0.6
0.8
1.0
AUC n Missing
0.898 20130 2758
0.0 0.2 0.4 0.6 0.8 1.0
1!Specificity
Sensitivity
0.0 0.2 0.4 0.6 0.8 1.0
0.0
0.2
0.4
0.6
0.8
1.0
AUC n
0.876
Missing 8428 1164
1!Specificity
Devel Validation SDAS
0.0
0
.2
0.4
0
.6
0.8
1
.0
Actu
al P
rob
ab
ility
Ideal Logistic calibration Nonparametric Grouped observations
Dxy C (ROC) R2 D U Q Brier Intercept Slope Emax
0.752 0.876 0.385 0.223 0.006 0.217 0.077 !0.257 0.820 0.094
0.0 0.2 0.4 0.6 0.8 1.0 Predicted Probability
Figure by Hug, Caleb Wayne. "Detecting Hazardous Intensive Care PatientEpisodes Using Real-time Mortality Models." Massachusetts Institute of Technology, 2009.
Selected features for each day of ICU stay
82C
HA
PT
ER
5.
MO
RTA
LIT
YM
OD
ELS
1 2 3 4 5
!15
!10
!5
05
10
DAS model (day n)
Figure by Hug, Caleb Wayne. "Detecting Hazardous Intensive Care PatientEpisodes Using Real-time Mortality Models." Massachusetts Institute of Technology, 2009.
Wald
z
Intercept
MetCarcinoma_min
dopLg_mean_la !
HemMalig_min
Lasix_max_lam !
CO2_mean_i !
Platelets_mean_i
SpO2_mean_am
PaO2toFiO2_mean !
hrmVA_max
hospTime_min_sqrt
ShockIdx_mean_sq
Na_max_am
jaundiceSkin_mean_la
BUNtoCr_min_sqrt
Age_min_sq
GCS_max_sq
alloutput_max_sqrt
GCSrdv_mean
INR_mean_i
pacemkr_max
Insulin_mean_la
LactateM_sd_i !
+ AIDS_min ! MetCarcinoma_min Intercept + RESP_mean_sq HemMalig_min ! hrmVA_max ! Platelets_max_am ShockIdx_mean_sq ! hospTime_min_sqrt + Sympathomimetic_agent_min Na_max_am SpO2.oor30.t_mean_sqrt jaundiceSkin_mean_la + beta.Blocking_agent_mean_lam + Amiodarone_min_am ! Age_min_sq BUNtoCr_mean_sqrt
GCS_max_sq
INR_max_i
pacemkr_max !
GCSrdv_mean !
alloutput_min_sqrt
+ UrineOutB_max_sqrt !
+ CV_HRrng_max !
Insulin_mean_la !
MetCarcinoma_min ! Platelets_max_am + temp_mean_am jaundiceSkin_mean_la Intercept + Na_Slope_1680_mean !+ Mg_min_sq !+ InputB_mean_sqrt !+ SBPm.oor30.t_max_sq ! beta.Blocking_agent_mean_lam !+ MBPm.pr_min_am Na_max_am !+ HCTrdv_max !+ Lasix_perKg_max_sqrt ! Sympathomimetic_agent_min RESP_mean_sq ! hospTime_min_sqrt BUNtoCr_min_sqrt Age_min_sq SpO2.oor30.t_mean_sqrt
GCS_max_sq
alloutput_max_sqrt !
INR_max_i
+ GCSrng_mean_sq
+ svCSRU_max
+ pressD01_mean_am
+ Platelets_Slope_1680_min
Intercept temp_mean_am ! jaundiceSkin_mean_la !+ ShockIdx_max MBPm.pr_min_am hospTime_min_sqrt ! Platelets_mean_i !+ CO2_mean_i Sympathomimetic_agent_min ! SpO2.oor30.t_mean_sqrt BUNtoCr_mean_sqrt Age_min_sq
GCS_max_sq
Platelets_Slope_1680_min
svCSRU_max
+ RikerSAS_mean
GCSrng_mean_sq
INR_mean_i
pressD01_mean_am !
+ Insulin_sd_sq !
+ urineByHr_mean_sqrt !
+ HemMalig_min
+ Mg_min_sq
MBPm.pr_min_am
ShockIdx_max
+ InputOtherBloodB_mean_lam
BUNtoCr_mean_sqrt
CO2_mean_i
SpO2.oor30.t_mean_sqrt
Age_min_sq
GCS_max_sq
svCSRU_max
RikerSAS_mean
Platelets_Slope_1680_min
INR_mean_i
GCSrng_mean_sq
Intercept
Figu
re5-5:
Ran
kedcom
parison
ofDASn
inputs
overdays
n∈
{1,2
,3,4
,5}.In
put
nam
esare
ranked
(but
equally
spaced
)for
the
positive
Wald
Zscores
and
the
negative
Wald
Z
scoresfor
eachm
odel.
On
agiven
day,
variables
absen
tfrom
the
previou
sday
areprefi
xedw
ith“+
”,an
dvariab
lesab
sent
fromth
efollow
ing
day
suffixed
with
“-”.
MIT OpenCourseWarehttp://ocw.mit.edu
HST.950J / 6.872 Biomedical Computing Fall 2010
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.