

Critical Care and Resuscitation • Volume 18 Number 1 • March 2016

ORIGINAL ARTICLES

25

The ANZROD model: better benchmarking of ICU outcomes and detection of outliers

Eldho Paul, Michael Bailey, Jessica Kasza, David Pilcher

Comparative monitoring of outcomes and performance using risk-adjustment models is well established in critical care medicine.1 The Acute Physiology and Chronic Health Evaluation (APACHE),2-5 Simplified Acute Physiology Score6-9 and Mortality Probability Model10-13 are the main risk-adjustment models used in intensive care units globally. These models typically provide risk adjustment for observed mortality and facilitate benchmarking among ICUs for purposes of quality-of-care assessment.14,15

The Australian and New Zealand Intensive Care Society (ANZICS) Centre for Outcome and Resource Evaluation (CORE) (http://www.anzics.com.au/Pages/CORE/About-CORE.aspx) has been using the APACHE III-j model (“j” being the tenth iteration) for risk adjustment in Australian and New Zealand ICUs for over a decade.1,16 Mortality predictions estimated by the model are used to calculate standardised mortality ratios (SMRs), which are the ratios of observed deaths to predicted deaths in a given period of time. SMRs derived from the APACHE III-j model are used to benchmark patient outcomes and form the basis of reports on ICU performance in Australia and New Zealand. This model had good performance when it was implemented, but this has deteriorated over time,17 which may limit its use for benchmarking performance of Australian and New Zealand ICUs.

To improve risk adjustment and performance monitoring of Australian and New Zealand ICUs, the Australian and New Zealand Risk of Death (ANZROD) model was recently developed using locally derived data specifically tailored to the Australian and New Zealand ICU population.18 The ANZROD model has excellent discrimination and good calibration over the entire ICU population, as well as in specific patient subgroups. Because this model provides better adjustment for casemix variation in Australian and New Zealand ICUs, it has recently been introduced into performance monitoring in Australasian hospitals.

ABSTRACT

Objective: To compare the impact of the 2013 Australian and New Zealand Risk of Death (ANZROD) model and the 2002 Acute Physiology and Chronic Health Evaluation (APACHE) III-j model as risk-adjustment tools for benchmarking performance and detecting outliers in Australian and New Zealand intensive care units.

Methods: Data were extracted from the Australian and New Zealand Intensive Care Society Adult Patient Database for all ICUs that contributed data between 1 January 2010 and 31 December 2013. Annual standardised mortality ratios (SMRs) were calculated for ICUs using the ANZROD and APACHE III-j models. They were plotted on funnel plots separately for each hospital type, with ICUs above the upper 99.8% control limit considered as potential outliers with worse performance than their peer group. Overdispersion parameters were estimated for both models. Overall fit was assessed using the Akaike information criterion (AIC) and Bayesian information criterion (BIC). Outlier association with mortality was assessed using a logistic regression model.

Results: The ANZROD model identified more outliers than the APACHE III-j model during the study period. The numbers of outliers in rural, metropolitan, tertiary and private hospitals identified by the ANZROD model were 3, 2, 6 and 6, respectively; and those identified by the APACHE III-j model were 2, 0, 1 and 1, respectively. The degree of overdispersion was less for the ANZROD model compared with the APACHE III-j model in each year. The ANZROD model showed better overall fit to the data, with smaller AIC and BIC values than the APACHE III-j model. Outlier ICUs identified using the ANZROD model were more strongly associated with increased mortality.

Conclusion: The ANZROD model reduces variability in SMRs due to casemix, as measured by overdispersion, and facilitates more consistent identification of true outlier ICUs, compared with the APACHE III-j model.

Crit Care Resusc 2016; 18: 25-36

The ANZICS CORE uses funnel plots19,20 for monitoring performance of ICUs across Australasia. In these funnel plots, annual SMRs are plotted against the number of admissions for each site during a reporting period. Separate funnel plots are generated for each hospital type (rural, metropolitan, tertiary and private). The upper and lower control limits are constructed with 95% and 99% confidence intervals for SMRs, which are estimated around the observed mortality using exact limits for the F distribution.21 ICUs with SMRs that fall outside the upper 99% confidence interval are considered potential outliers. There has been concern that funnel plots of APACHE III-j SMRs may no longer identify true outliers, due to deteriorating performance17 of this model. The ANZROD model has been


shown to be better in predicting mortality in Australia and New Zealand,18 but the performance of ANZROD as a tool for risk adjustment in the comparison of ICU performance remains to be investigated. It was hypothesised that the newly developed, well calibrated and highly discriminatory ANZROD model would be a better risk-adjustment tool for routine monitoring of ICU performance using annual SMRs, and would therefore result in more accurate identification of outliers. Our aim was thus to assess the performance of the ANZROD model compared with the APACHE III-j model, for benchmarking intensive care outcomes and specifically for detecting potential outlier ICUs.

Table 1. Intensive care unit characteristics, by year of study

Characteristic | 2010 | 2011 | 2012 | 2013
Total admissions | 114 035 | 121 218 | 127 495 | 130 673
Admissions included in ANZROD model | 104 516 | 111 360 | 117 123 | 119 915
Admissions included in APACHE III-j model | 102 845 | 109 402 | 115 096 | 117 977

Number of ICUs, by hospital level
Rural | 37 | 37 | 38 | 36
Metropolitan | 30 | 31 | 33 | 33
Tertiary | 35 | 36 | 35 | 36
Private | 45 | 45 | 45 | 45
All ICUs | 147 | 149 | 151 | 150

Mean observed mortality by hospital level, % (range)
Rural | 10.3 (2.9–26.6) | 10.1 (3.0–24.1) | 8.9 (2.6–21.4) | 8.6 (2.0–22.1)
Metropolitan | 11.9 (5.1–19.5) | 10.7 (0–21.1) | 9.8 (0–23.2) | 9.6 (1.9–20.1)
Tertiary | 12.4 (5.2–23.0) | 12.2 (5.0–20.7) | 11.2 (3.3–18.8) | 10.6 (0.5–17.2)
Private | 4.3 (0–9.9) | 4.2 (0–12.4) | 4.8 (0–35.1) | 3.9 (0–20.0)
All ICUs | 9.3 (0–26.6) | 8.9 (0–24.1) | 8.4 (0–35.1) | 7.9 (0–22.1)

Mean ANZROD predicted mortality by hospital level, % (range)
Rural | 11.6 (5.9–25.3) | 11.1 (4.8–24.9) | 10.1 (3.5–22.2) | 9.5 (1.8–18.4)
Metropolitan | 12.7 (6.7–21.0) | 11.6 (0.2–23.8) | 11.0 (5.5–27.0) | 10.4 (4.9–21.5)
Tertiary | 12.4 (6.9–20.1) | 12.0 (5.0–20.7) | 11.3 (4.0–17.9) | 10.1 (1.4–18.0)
Private | 5.1 (0.9–14.9) | 4.4 (1.0–12.0) | 4.8 (0.7–20.6) | 4.0 (1.0–11.0)
All ICUs | 10.0 (0.9–25.3) | 9.4 (0.2–24.9) | 9.0 (0.7–27.0) | 8.2 (1.0–21.5)

Mean APACHE III-j predicted mortality by hospital level, % (range)
Rural | 13.8 (6.4–30.6) | 14.0 (6.1–36.4) | 12.9 (4.2–30.7) | 12.5 (2.3–27.1)
Metropolitan | 16.1 (9.8–28.3) | 15.9 (0.5–28.0) | 15.7 (7.4–31.0) | 14.8 (5.7–27.5)
Tertiary | 16.3 (10.5–26.5) | 16.0 (7.1–26.1) | 15.6 (5.7–24.0) | 14.6 (3.2–25.6)
Private | 6.9 (1.4–17.1) | 6.8 (2.1–13.9) | 7.1 (1.0–23.7) | 6.3 (1.6–15.4)
All ICUs | 12.8 (1.4–30.6) | 12.7 (0.5–36.4) | 12.4 (1.0–31.0) | 11.7 (1.6–27.5)

ANZROD = Australian and New Zealand Risk of Death. APACHE = Acute Physiology and Chronic Health Evaluation. ICU = intensive care unit.

Methods

Data were extracted from the ANZICS Adult Patient Database (APD), which collects individual admissions data from ICUs across Australia and New Zealand, and was started in 1990.22 This binational, high-quality dataset23 is one of the largest databases of its kind in the world, with over 1.5 million ICU admission episodes currently. Data are submitted on behalf of each ICU director, and each hospital allows subsequent use as appropriate under the ANZICS CORE standing procedures and in compliance with the ANZICS CORE terms of reference.

We included all data submissions to the ANZICS APD between 1 January 2010 and 31 December 2013. We excluded patients aged under 16 years, patients with a missing acute physiology score on ICU Day 1, those with missing hospital outcomes, and readmission episodes to an ICU within the same hospital stay. Additionally, the ANZROD model excluded patients admitted for palliative care and organ donation, in keeping with its methodology,18 whereas the APACHE III-j model excluded patients who had an ICU stay of < 4 hours and patients transferred from or to another ICU. All extracted data were de-identified, and the study was conducted with approval of the Monash University Human Research Ethics Committee.

All variables required for the ANZROD and APACHE III-j models were extracted, including age, chronic health variables, physiological measures required to calculate acute physiology scores, treatment limitations, sources of admission (ICU and hospital), lead time, elective surgery, ventilation status and diagnoses. Annual SMRs were calculated for each ICU by dividing total observed deaths by total predicted deaths, with predictions derived from the APACHE III-j model (APACHE III-j SMRs) or the ANZROD model (ANZROD SMRs). An SMR < 1.0 indicated that observed deaths were fewer than the number of deaths predicted by the model.
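The SMR arithmetic described above can be sketched in a few lines (a minimal illustration only; the function name and the admission risks are invented for the example):

```python
def smr(observed_deaths, predicted_risks):
    """Standardised mortality ratio: observed deaths divided by expected
    deaths, where expected deaths is the sum of the per-admission
    model-predicted probabilities of death."""
    expected_deaths = sum(predicted_risks)
    return observed_deaths / expected_deaths

# Hypothetical ICU-year: 3 deaths among 5 admissions whose predicted
# risks of death sum to 2.0 expected deaths.
risks = [0.10, 0.40, 0.70, 0.25, 0.55]
print(smr(3, risks))  # ~1.5: more deaths observed than the model predicted
```

An SMR below 1.0 would indicate fewer observed deaths than the model predicts for that casemix.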

Statistical analyses

Analyses were performed using SAS, version 9.4 (SAS Institute), or Stata, version 11 (StataCorp). Descriptive statistics were used to summarise ICU characteristics for each year of observation. Funnel plots were constructed by plotting annual ANZROD and APACHE III-j SMRs separately against the number of admissions for each ICU in each year. Separate funnel plots were made for rural, metropolitan, tertiary and private hospitals. The 95% and 99.8% control limits were drawn around the mean SMR to indicate "warning" and "alarm" signs. ICUs with SMRs that fell outside the upper 99.8% control limit were identified as potential outliers.19,20,24
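The shape of the funnel control limits can be sketched as follows. Note that this uses a simple normal approximation to the Poisson distribution of deaths, not the exact F-distribution limits used in the paper; the function name and counts are our own illustration:

```python
import math

def funnel_limits(expected_deaths, z):
    """Approximate control limits for the SMR of an on-target unit.
    Assuming observed deaths ~ Poisson(expected), Var(SMR) ~= 1/expected,
    so the limits are 1 +/- z / sqrt(expected). (The paper uses exact
    limits based on the F distribution; this is only a rough sketch.)"""
    half_width = z / math.sqrt(expected_deaths)
    return 1.0 - half_width, 1.0 + half_width

# z = 1.96 gives ~95% ("warning") limits; z = 3.09 gives ~99.8% ("alarm")
warn = funnel_limits(50, 1.96)
alarm = funnel_limits(50, 3.09)
print(warn, alarm)  # alarm limits are wider than warning limits
```

The funnel shape arises because the limits narrow as expected deaths (roughly, admissions) grow, so larger units must deviate less from SMR = 1 before flagging.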

We examined overdispersion within the ANZROD and APACHE III-j funnel plots in each year, using the method recommended by Spiegelhalter.25 The degree of overdispersion was estimated for both models by winsorising 5% of the data. Winsorising consists of shrinking in the extreme Z-scores to a selected percentile; this retains the same number of Z-scores but discounts the influence of outliers. The overdispersion parameter was estimated for each hospital type separately to facilitate comparisons across all hospital types.

For a sample of K units assumed to be on target, the overdispersion parameter (φ) may be estimated as:

φ = (1/K) Σ_{i=1}^{K} z_i², with z_i = (SMR_i − 1) / SE(SMR_i),

where SMR_i is the SMR of the ith ICU and SE(SMR_i) is the standard error of that SMR.
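A sketch of the winsorised estimate (the function and the exact winsorisation mechanics are our own illustration, not the authors' code):

```python
import math

def overdispersion_phi(smrs, std_errors, winsor_frac=0.05):
    """Spiegelhalter-style overdispersion estimate: Z-scores
    z_i = (SMR_i - 1) / SE_i are winsorised (the extremes are shrunk in
    to the winsor_frac and 1 - winsor_frac order statistics), then phi
    is the mean of the squared winsorised Z-scores."""
    z = sorted((s - 1.0) / se for s, se in zip(smrs, std_errors))
    k = len(z)
    shrink = int(math.ceil(k * winsor_frac))
    lo, hi = z[shrink], z[k - 1 - shrink]
    z_w = [min(max(v, lo), hi) for v in z]  # shrink in the extreme Z-scores
    return sum(v * v for v in z_w) / k
```

A value of φ near 1 indicates scatter consistent with chance alone; values well above 1 indicate variation between units beyond what the model's risk adjustment explains.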

We also assessed φ by comparing how well each of the ANZROD and APACHE III-j models fit the data. To compare the fit of the models, two separate logistic regression models were fitted, with observed mortality as the dependent variable and ANZROD or APACHE III-j predicted mortality as the sole independent variable. The Akaike information criterion (AIC)26 and the Bayesian information criterion (BIC)27 were estimated for each model to determine which model fit the data better. The AIC and the BIC are two popular measures for comparing maximum-likelihood models and can be viewed as measures that combine fit and complexity. They are defined as follows:

AIC = −2 ln(L) + 2p
BIC = −2 ln(L) + p ln(N)

where L is the maximised likelihood of the model, p is the number of parameters estimated and N is the number of observations.
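These definitions translate directly into code (the log-likelihood, parameter count and sample size below are invented for illustration):

```python
import math

def aic_bic(log_likelihood, n_params, n_obs):
    """AIC = -2 ln L + 2p ; BIC = -2 ln L + p ln N.
    Smaller values indicate better fit after penalising complexity;
    BIC penalises extra parameters more heavily once ln(N) > 2."""
    aic = -2.0 * log_likelihood + 2.0 * n_params
    bic = -2.0 * log_likelihood + n_params * math.log(n_obs)
    return aic, bic

# Hypothetical logistic model: maximised log-likelihood of -3300,
# 2 estimated parameters (intercept + predicted risk), 10 000 patients
aic, bic = aic_bic(-3300.0, 2, 10_000)
print(aic, bic)
```

Because both models here have the same number of parameters and are fitted to (nearly) the same patients, comparing AIC or BIC amounts to comparing likelihoods, which is why the two criteria agree in Table 3.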

AIC and BIC were compared separately for each hospital type and year of observation. The model with smaller AIC and BIC values was considered to provide a better fit to the data within each hospital type and year. Also, to assess the effect of the models' differing exclusion criteria, AIC, BIC and φ were re-estimated on a common dataset after applying both the ANZROD and APACHE III-j exclusions.

The ability of the ANZROD model to more accurately identify outlying hospitals, compared with the APACHE III-j model, was determined by comparing the association between outlier status and observed mortality separately for each model. Patients from outlying hospitals identified by the models were considered as outliers, and all other patients were considered as non-outliers. The association between outlier status and hospital mortality was determined using logistic regression analysis, with hospital mortality as the outcome variable and outlier status (yes v no) as the sole predictor variable. Odds ratios and 95% confidence intervals were estimated separately for each model. Goodness-of-fit of the models over time was assessed by calculating the Brier score and the area under the receiver operating characteristic curve.
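With a single binary predictor, the logistic-regression odds ratio equals the cross-product ratio of the 2 × 2 table of outlier status by survival, which can be computed directly with a Wald confidence interval on the log-odds scale. The counts below are invented for illustration:

```python
import math

def outlier_odds_ratio(deaths_out, survivors_out,
                       deaths_non, survivors_non, z=1.96):
    """Odds ratio for hospital death in outlier vs non-outlier ICUs,
    with a Wald 95% CI on the log-odds scale. With one binary predictor
    this matches the logistic-regression estimate."""
    odds_ratio = (deaths_out / survivors_out) / (deaths_non / survivors_non)
    se_log_or = math.sqrt(1.0 / deaths_out + 1.0 / survivors_out
                          + 1.0 / deaths_non + 1.0 / survivors_non)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts: 150 deaths / 850 survivors in outlier ICUs,
# 900 deaths / 9100 survivors elsewhere
print(outlier_odds_ratio(150, 850, 900, 9100))
```

An odds ratio above 1 with a confidence interval excluding 1, as the ANZROD outliers show in Table 4, indicates that flagged units genuinely carry higher risk-adjusted mortality.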

Results

The characteristics of ICUs contributing data to the APD are shown in Table 1. About 150 ICUs contributed data every year. The observed mortality declined from 9.3% in 2010 to 7.9% in 2013. For the same period, predicted mortality declined from 10% to 8.2% and 12.8% to 11.7% for ANZROD and APACHE III-j models, respectively.

Estimates of φ for each model showed that the degree of overdispersion was less for the ANZROD model (2.21 in 2010, 2.86 in 2011, 3.38 in 2012 and 3.16 in 2013) compared with the APACHE III-j model (4.51 in 2010, 4.30 in 2011, 5.88 in 2012 and 6.88 in 2013) in each year. The degree of overdispersion for each hospital type is shown


Figure 1. ANZROD and APACHE III-j funnel plots for rural hospitals

ANZROD = Australian and New Zealand Risk of Death. APACHE = Acute Physiology and Chronic Health Evaluation. A–D: ANZROD funnel plots, 2010–2013, respectively. E–H: APACHE III-j funnel plots, 2010–2013, respectively.


Table 2. Estimates of overdispersion (φ)* for ANZROD and APACHE III-j models, based on 5% winsorisation

Year | Hospital level | ANZROD φ | APACHE III-j φ
2010 | Rural | 1.00 | 2.19
2010 | Metropolitan | 1.00 | 1.87
2010 | Tertiary | 2.63 | 3.57
2010 | Private | 1.64 | 3.52
2011 | Rural | 3.07 | 1.93
2011 | Metropolitan | 2.08 | 2.15
2011 | Tertiary | 3.58 | 3.11
2011 | Private | 1.89 | 3.33
2012 | Rural | 2.31 | 4.12
2012 | Metropolitan | 2.58 | 2.10
2012 | Tertiary | 3.69 | 4.44
2012 | Private | 2.65 | 5.20
2013 | Rural | 2.08 | 6.33
2013 | Metropolitan | 2.05 | 1.53
2013 | Tertiary | 3.49 | 5.24
2013 | Private | 3.37 | 5.56

ANZROD = Australian and New Zealand Risk of Death. APACHE = Acute Physiology and Chronic Health Evaluation. * φ = overdispersion parameter (higher values indicate greater overdispersion).

Table 3. Akaike information criterion (AIC) and Bayesian information criterion (BIC) for ANZROD and APACHE III-j models*

Year | Hospital level | ANZROD AIC | ANZROD BIC | APACHE III-j AIC | APACHE III-j BIC
2010 | Rural | 6627 | 6642 | 6926 | 6942
2010 | Metropolitan | 8767 | 8782 | 9323 | 9339
2010 | Tertiary | 22 813 | 22 830 | 22 901 | 22 918
2010 | Private | 6220 | 6237 | 6554 | 6570
2010 | All hospitals | 44 747 | 44 766 | 45 969 | 45 989
2011 | Rural | 6860 | 6875 | 7264 | 7279
2011 | Metropolitan | 9310 | 9326 | 9771 | 9787
2011 | Tertiary | 23 787 | 23 805 | 24 385 | 24 403
2011 | Private | 6902 | 6919 | 6996 | 7013
2011 | All hospitals | 47 103 | 47 122 | 48 670 | 48 689
2012 | Rural | 7073 | 7088 | 7374 | 7390
2012 | Metropolitan | 9843 | 9859 | 10 078 | 10 094
2012 | Tertiary | 22 561 | 22 579 | 23 164 | 23 181
2012 | Private | 6591 | 6607 | 6737 | 6753
2012 | All hospitals | 46 363 | 46 383 | 47 599 | 47 618
2013 | Rural | 6529 | 6545 | 6783 | 6798
2013 | Metropolitan | 9480 | 9496 | 9795 | 9811
2013 | Tertiary | 23 566 | 23 584 | 24 043 | 24 060
2013 | Private | 6486 | 6502 | 6761 | 6778
2013 | All hospitals | 46 407 | 46 426 | 47 631 | 47 650

ANZROD = Australian and New Zealand Risk of Death. APACHE = Acute Physiology and Chronic Health Evaluation. * Smaller AIC and BIC values suggest better fit to the data within each hospital type and year.

Table 4. Association between ANZROD or APACHE III-j outlier (above 99.8% control limit) status and hospital mortality, by year of study

Year | ANZROD outlier odds ratio (95% CI) | P | APACHE III-j outlier odds ratio (95% CI) | P
2010 | 1.51 (1.27–1.80) | < 0.0001 | –* | –
2011 | 1.20 (1.10–1.32) | < 0.0001 | –* | –
2012 | 1.46 (1.36–1.58) | < 0.0001 | 1.15 (1.05–1.27) | 0.004
2013 | 1.25 (1.15–1.35) | < 0.0001 | 0.51 (0.34–0.76) | 0.001

ANZROD = Australian and New Zealand Risk of Death. APACHE = Acute Physiology and Chronic Health Evaluation. * No site fell above the 99.8% control limit.

in Table 2. The results were similar when the analysis was repeated on a common dataset after applying ANZROD and APACHE III-j exclusions.

Funnel plots for rural hospitals are shown in Figure 1. The number of outliers identified by the ANZROD and APACHE III-j models was the same in 2010, 2012 and 2013. In contrast, the ANZROD model labelled one unit as an outlier in 2011 that was not identified by the APACHE III-j model.

Metropolitan hospital funnel plots are shown in Figure 2. There were no outliers based on APACHE III-j SMRs. However, the ANZROD model identified one unit as an outlier in 2012 and 2013.

Funnel plots for tertiary hospitals are shown in Figure 3. The ANZROD model classified one hospital as an outlier in 2010 and 2013, and two hospitals in each of the intervening years. In contrast, the APACHE III-j model identified only one hospital, in 2012.

Funnel plots for private hospitals (Figure 4) showed that the ANZROD model identified more outliers than the APACHE III-j model. The ANZROD model recognised at least one outlier every year with the exception of 2011, and in 2013 it classified three hospitals as outliers. In contrast, the APACHE III-j model detected only one hospital as an outlier, in 2012. The distribution of predicted probabilities of death for both models across all hospital types is shown in kernel density plots (Figure 5) for each year.

Estimates of AIC and BIC for each model showed that the ANZROD model had smaller values than the APACHE III-j model in each year, suggesting a better fit of the ANZROD model to the data. The results were similar across all hospital types during the study period. Table 3 shows the AIC and BIC estimates for each hospital type, by year of study. Similar results were obtained when the analysis was repeated on a common dataset after applying the ANZROD and APACHE III-j exclusions.

The association between outlier status and hospital mortality is shown in Table 4 for each model. An ICU identified as an outlier using the ANZROD model showed a


Figure 2. ANZROD and APACHE III-j funnel plots for metropolitan hospitals

ANZROD = Australian and New Zealand Risk of Death. APACHE = Acute Physiology and Chronic Health Evaluation. A–D: ANZROD funnel plots, 2010–2013, respectively. E–H: APACHE III-j funnel plots, 2010–2013, respectively.


Figure 3. ANZROD and APACHE III-j funnel plots for tertiary hospitals

ANZROD = Australian and New Zealand Risk of Death. APACHE = Acute Physiology and Chronic Health Evaluation. A–D: ANZROD funnel plots, 2010–2013, respectively. E–H: APACHE III-j funnel plots, 2010–2013, respectively.


Figure 4. ANZROD and APACHE III-j funnel plots for private hospitals

ANZROD = Australian and New Zealand Risk of Death. APACHE = Acute Physiology and Chronic Health Evaluation. A–D: ANZROD funnel plots, 2010–2013, respectively. E–H: APACHE III-j funnel plots, 2010–2013, respectively.


stronger positive association with observed mortality for each year of study compared with the APACHE III-j model. Table 5 shows an assessment of the over-time fit of the models during the study period.

Discussion

We report the application of the ANZROD model for benchmarking ICU outcomes and routine monitoring of ICU performance in Australian and New Zealand ICUs. We provide updated information about the ability of this model to identify ICUs with a higher-than-expected risk-adjusted mortality, in comparison with the existing APACHE III-j model. Our study confirms that the ANZROD model reduces the effect of casemix variation and is better at identifying true outliers in Australasian hospitals.

Our study used SMRs derived from ANZROD and APACHE III-j risk prediction models to identify hospitals with a higher-than-expected risk-adjusted mortality. SMRs are the most commonly used quality indicator of the performance of an ICU.28-31 The SMR of an ICU may be affected by casemix.32 A large proportion of patients with low mortality risk can change SMRs generated by various prediction models

Table 5. Assessment of over-time fit of models during study period

Year | ANZROD Brier score | ANZROD AUROC | APACHE III-j Brier score | APACHE III-j AUROC
2010 | 0.059 | 0.903 | 0.065 | 0.895
2011 | 0.058 | 0.904 | 0.065 | 0.893
2012 | 0.054 | 0.909 | 0.060 | 0.898
2013 | 0.052 | 0.909 | 0.058 | 0.898

ANZROD = Australian and New Zealand Risk of Death. APACHE = Acute Physiology and Chronic Health Evaluation. AUROC = area under receiver operating characteristic curve.

Figure 5. Kernel density plots showing probability of death for ANZROD and APACHE III-j models, all hospital types

ANZROD = Australian and New Zealand Risk of Death. APACHE = Acute Physiology and Chronic Health Evaluation. A: 2010. B: 2011. C: 2012. D: 2013.


regardless of the performance of the ICU.29,33 As shown by the kernel density plots, this was not an issue in our study, as differences were small across all hospital types. Differences in the proportions of patients with various diagnoses may also affect the SMRs.33,34 The APACHE III-j model is currently more affected by casemix variation17 than the ANZROD model,18 which is better calibrated to Australian and New Zealand critical care admissions and thus more appropriate to use for monitoring performance.

Recent studies35-37 to identify unusual performance in Australian and New Zealand ICUs have employed advanced statistical methods, but their implementation into routine practice may not be feasible because of their complexity and prolonged computing time. To account for the deteriorating performance of the APACHE III-j model, the mortality models used in these methods recalibrated the APACHE III-j model and included additional covariates and many interaction terms. The use of the ANZROD model, a new prediction model specifically tailored to the Australasian ICU population, reduces the need for such additional complexity.

The ANZICS CORE is responsible for benchmarking ICU performance in Australasia, based on data submitted from ICUs. The ANZICS CORE outlier management program was established in 2008 and involves a stepwise process of analysis, in which each subsequent step is undertaken only if the previous one did not adequately explain the findings. Initially, analysis of data quality is undertaken, followed by assessment of the effect of casemix variation and, finally, if required, investigation of resources and staffing within the ICU.38 A poorly calibrated model such as the APACHE III-j model, which predicts mortality with variable success in different diagnostic groups, may lead to misidentification of outliers by underpredicting mortality in some groups. Similarly, a poor model may fail to identify some ICUs at all, through overestimation of mortality risk in some subgroups. In contrast, the well calibrated ANZROD model provides better outlier identification, with fewer false positive outliers and fewer missed true outliers. Our study provides evidence to support the use of the ANZROD model as the primary risk-adjustment model for identifying potential outlier ICUs.

Differences in how ICU admission diagnosis is included as a predictor may partly explain the variation in performance of the two models: the ANZROD model has 124 diagnostic groups, whereas the APACHE III-j model has 94. A second explanation might be differences in accounting for physiological abnormalities, which are relatively more important contributors in the ANZROD model than in the APACHE III-j model. Although the acute physiology score is used to account for severity of illness in the APACHE III-j model, the ANZROD model enhances its use by reweighting each of its components separately. In addition, a prediction model should accurately predict mortality and exclude as few patients as possible.39 Wunsch and colleagues40 previously showed that model exclusion criteria can have a profound impact on unit-level performance, altering crude hospital mortality for individual ICUs by up to 15%. In our study, patient exclusion criteria eliminated 8.2% of eligible admissions under the ANZROD model and 10.2% under the APACHE III-j model. However, when the effect of exclusions was assessed on a common dataset after applying both sets of exclusions, results were similar: the ANZROD model again showed smaller AIC and BIC values and less overdispersion than the APACHE III-j model.

Our results suggest that the outliers identified by the ANZROD model are more likely to be true outliers than those identified by the APACHE III-j model. In the absence of a gold standard, we compared the association between outlier status and hospital mortality (Table 4) to assess the ability of the models to identify true outlying hospitals. Not only did the ANZROD model show a stronger positive association with observed mortality, but the association was also consistent throughout the study period. In contrast, the APACHE III-j model provided varied results: although a positive association was shown in 2012, the model failed to identify any outliers in 2010 and 2011, and a negative association was found in 2013. The ANZROD model also had better calibration over time than the APACHE III-j model (Table 5). Our results support the hypothesis that the ANZROD model is better at identifying true outliers in Australasian hospitals.

Strengths and limitations

Strengths of this dataset include its wide coverage, making it highly representative of the ICU population, and its explicit definitions. Collection of raw data enables risk-adjustment models to be derived using standard algorithms across all units, allowing better comparability of risk-adjusted outcomes between units. Overdispersion within funnel plots was assessed by computing an overdispersion parameter and by comparing AIC and BIC; the ANZROD model showed less variation than the APACHE III-j model, irrespective of the method used.
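As a concrete illustration of the overdispersion parameter mentioned above, one common formulation (following Spiegelhalter's work on over-dispersed performance indicators, reference 25) is the mean squared z-score across units, with values near 1 consistent with chance variation alone and larger values indicating variation the risk model does not explain. A sketch under that formulation, with hypothetical per-ICU counts:

```python
def overdispersion_phi(observed, expected):
    """Estimate an overdispersion parameter phi as the mean squared
    z-score across units: phi = (1/I) * sum(z_i^2), where
    z_i = (O_i - E_i) / sqrt(E_i) under a Poisson approximation.
    phi ~ 1: variation consistent with chance; phi >> 1: overdispersion."""
    z_sq = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
    return sum(z_sq) / len(z_sq)

# Hypothetical per-ICU observed deaths and model-predicted deaths
obs = [30, 45, 52, 28, 61]
exp = [32.0, 40.0, 50.0, 35.0, 55.0]
phi = overdispersion_phi(obs, exp)
```

A model that adjusts better for casemix leaves less unexplained between-unit variation, and so yields a smaller phi, which is the sense in which the ANZROD model "showed less variation" above.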

Our analysis also had several limitations. Although data were extracted from the ANZICS APD, nearly 3% of eligible admissions could not be used because of missing data, and their impact on SMRs cannot be determined. Further, although the ANZROD model showed better performance, our results may not be generalisable outside Australasia, where patient casemix, processes of care, resources, and admission and discharge criteria differ. A further potential limitation is that prognostic models tend to overpredict mortality over time, so the performance of the ANZROD model in this cohort may not reflect future performance.



Conclusion

The SMRs derived using the ANZROD model can be used for benchmarking ICU outcomes and routine monitoring of ICU performance in Australasian hospitals. The ANZROD model reduces variability in SMRs due to casemix across all hospital types, and facilitates more consistent identification of true outlying ICUs than the APACHE III-j model when reviewing risk-adjusted mortality outcomes.
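The SMR itself is the ratio of observed to model-predicted deaths. The funnel plots in this study use exact limits based on the F distribution, but a common closed-form stand-in is Byar's approximation for the Poisson confidence interval on the observed count; a sketch of that approximation (function name and numbers are illustrative, not the paper's implementation):

```python
import math

def smr_with_ci(observed: int, expected: float, z: float = 1.96):
    """Standardised mortality ratio O/E with an approximate confidence
    interval from Byar's method for the Poisson count O.
    (The paper uses exact F-distribution limits; Byar's formula is a
    close closed-form approximation when O is not very small.)"""
    o, e = observed, expected
    lower = o * (1 - 1 / (9 * o) - z / (3 * math.sqrt(o))) ** 3 / e
    upper = (o + 1) * (1 - 1 / (9 * (o + 1)) + z / (3 * math.sqrt(o + 1))) ** 3 / e
    return o / e, lower, upper

# Hypothetical unit: 50 observed deaths against 60 model-predicted deaths
smr, lo, hi = smr_with_ci(50, 60.0)
```

On a funnel plot, a unit is flagged only when its SMR falls outside the control limits for its number of admissions, which protects small units from being flagged on chance variation alone.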

Acknowledgements
We thank David Harrison, Senior Statistician, Intensive Care National Audit and Research Centre, London, United Kingdom, for assistance and advice in preparation of our manuscript.

Competing interests
None declared.

Author details
Eldho Paul, PhD Candidate1
Michael Bailey, Associate Professor1
Jessica Kasza, Research Fellow2
David Pilcher, Intensivist1,3,4
1 Australian and New Zealand Intensive Care Research Centre, Department of Epidemiology and Preventive Medicine, School of Public Health and Preventive Medicine, Monash University, Melbourne, VIC, Australia.
2 Biostatistics Unit, Department of Epidemiology and Preventive Medicine, School of Public Health and Preventive Medicine, Monash University, Melbourne, VIC, Australia.
3 Australian and New Zealand Intensive Care Society Centre for Outcome and Resource Evaluation, Melbourne, VIC, Australia.
4 Department of Intensive Care Medicine, The Alfred Hospital, Melbourne, VIC, Australia.
Correspondence: [email protected]

References
1 Pilcher DV, Hoffman T, Thomas C, et al. Risk-adjusted continuous outcome monitoring with an EWMA chart: could it have detected excess mortality among intensive care patients at Bundaberg Base Hospital? Crit Care Resusc 2010; 12: 36-41.
2 Knaus WA, Zimmerman JE, Wagner DP, et al. APACHE — acute physiology and chronic health evaluation: a physiologically based classification system. Crit Care Med 1981; 9: 591-7.
3 Knaus WA, Draper EA, Wagner DP, Zimmerman JE. APACHE II: a severity of disease classification system. Crit Care Med 1985; 13: 818-29.
4 Knaus WA, Wagner DP, Draper EA, et al. The APACHE III prognostic system. Risk prediction of hospital mortality for critically ill hospitalized adults. Chest 1991; 100: 1619-36.
5 Zimmerman JE, Kramer AA, McNair DS, Malila FM. Acute Physiology and Chronic Health Evaluation (APACHE) IV: hospital mortality assessment for today's critically ill patients. Crit Care Med 2006; 34: 1297-310.
6 Le Gall JR, Loirat P, Alperovitch A, et al. A simplified acute physiology score for ICU patients. Crit Care Med 1984; 12: 975-7.
7 Le Gall JR, Lemeshow S, Saulnier F. A new Simplified Acute Physiology Score (SAPS II) based on a European/North American multicenter study. JAMA 1993; 270: 2957-63.
8 Metnitz PG, Moreno RP, Almeida E, et al. SAPS 3 — from evaluation of the patient to evaluation of the intensive care unit. Part 1: objectives, methods and cohort description. Intensive Care Med 2005; 31: 1336-44.
9 Moreno RP, Metnitz PG, Almeida E, et al. SAPS 3 — from evaluation of the patient to evaluation of the intensive care unit. Part 2: development of a prognostic model for hospital mortality at ICU admission. Intensive Care Med 2005; 31: 1345-55.
10 Lemeshow S, Teres D, Pastides H, et al. A method for predicting survival and mortality of ICU patients using objectively derived weights. Crit Care Med 1985; 13: 519-25.
11 Lemeshow S, Teres D, Klar J, et al. Mortality Probability Models (MPM II) based on an international cohort of intensive care unit patients. JAMA 1993; 270: 2478-86.
12 Lemeshow S, Klar J, Teres D, et al. Mortality probability models for patients in the intensive care unit for 48 or 72 hours: a prospective, multicenter study. Crit Care Med 1994; 22: 1351-8.
13 Higgins TL, Teres D, Copes WS, et al. Assessing contemporary intensive care unit outcome: an updated Mortality Probability Admission Model (MPM0-III). Crit Care Med 2007; 35: 827-35.
14 Zimmerman D. Benchmarking: measuring yourself against the best. Trustee 1999; 52: 22-3.
15 Zimmerman JE, Alzola C, Von Rueden KT. The use of benchmarking to identify top performing critical care units: a preliminary assessment of their policies and practices. J Crit Care 2003; 18: 76-86.
16 Duke GJ, Santamaria J, Shann F, et al. Critical care outcome prediction equation (COPE) for adult intensive care. Crit Care Resusc 2008; 10: 35-41.
17 Paul E, Bailey M, Van Lint A, Pilcher D. Performance of APACHE III over time in Australia and New Zealand: a retrospective cohort study. Anaesth Intensive Care 2012; 40: 980-94.
18 Paul E, Bailey M, Pilcher D. Risk prediction of hospital mortality for adult patients admitted to Australian and New Zealand intensive care units: development and validation of the Australian and New Zealand Risk of Death model. J Crit Care 2013; 28: 935-41.
19 Spiegelhalter D. Funnel plots for institutional comparison. Qual Saf Health Care 2002; 11: 390-1.
20 Spiegelhalter DJ. Funnel plots for comparing institutional performance. Stat Med 2005; 24: 1185-202.
21 Armitage P, Berry G, Matthews J. Statistical methods in medical research. 4th ed. Blackwell Science, 2002.
22 Moran JL, Bristow P, Solomon PJ, et al. Mortality and length-of-stay outcomes, 1993-2003, in the binational Australian and New Zealand intensive care adult patient database. Crit Care Med 2008; 36: 46-61.
23 Stow PJ, Hart GK, Higlett T, et al. Development and implementation of a high-quality clinical database: the Australian and New Zealand Intensive Care Society Adult Patient Database. J Crit Care 2006; 21: 133-41.
24 Seaton SE, Barker L, Lingsma HF, et al. What is the probability of detecting poorly performing hospitals using funnel plots? BMJ Qual Saf 2013; 22: 870-6.
25 Spiegelhalter DJ. Handling over-dispersion of performance indicators. Qual Saf Health Care 2005; 14: 347-51.
26 Akaike H. A new look at the statistical model identification. IEEE Trans Automat Cont 1974; 19: 716-23.
27 Schwarz G. Estimating the dimension of a model. Ann Statistics 1978; 6: 461-4.
28 Afessa B, Keegan MT, Hubmayr RD, et al. Evaluating the performance of an institution using an intensive care unit benchmark. Mayo Clin Proc 2005; 80: 174-80.
29 Beck DH, Smith GB, Taylor BL. The impact of low-risk intensive care unit admissions on mortality probabilities by SAPS II, APACHE II and APACHE III. Anaesthesia 2002; 57: 21-6.
30 Glance LG, Osler TM, Dick A. Rating the quality of intensive care units: is it a function of the intensive care unit scoring system? Crit Care Med 2002; 30: 1976-82.
31 Zimmerman JE, Shortell SM, Rousseau DM, et al. Improving intensive care: observations based on organizational case studies in nine intensive care units: a prospective, multicenter study. Crit Care Med 1993; 21: 1443-51.
32 Moreno R, Apolone G, Miranda DR. Evaluation of the uniformity of fit of general outcome prediction models. Intensive Care Med 1998; 24: 40-7.
33 Glance LG, Osler TM, Papadakos P. Effect of mortality rate on the performance of the Acute Physiology and Chronic Health Evaluation II: a simulation study. Crit Care Med 2000; 28: 3424-8.
34 Metnitz PG, Lang T, Vesely H, et al. Ratios of observed to expected mortality are affected by differences in case mix and quality of care. Intensive Care Med 2000; 26: 1466-72.
35 Kasza J, Moran JL, Solomon PJ; ANZICS Centre for Outcome and Resource Evaluation (CORE) of Australian and New Zealand Intensive Care Society (ANZICS). Evaluating the performance of Australian and New Zealand intensive care units in 2009 and 2010. Stat Med 2013; 32: 3720-36.
36 Moran JL, Solomon PJ; ANZICS Centre for Outcome and Resource Evaluation (CORE) of Australian and New Zealand Intensive Care Society (ANZICS). Fixed effects modelling for provider mortality outcomes: analysis of the Australia and New Zealand Intensive Care Society (ANZICS) Adult Patient Database. PLoS One 2014; 9: e102297.
37 Solomon PJ, Kasza J, Moran JL; ANZICS Centre for Outcome and Resource Evaluation (CORE) of Australian and New Zealand Intensive Care Society (ANZICS). Identifying unusual performance in Australian and New Zealand intensive care units from 2000 to 2010. BMC Med Res Methodol 2014; 14: 53.
38 Australian and New Zealand Intensive Care Society Centre for Outcome and Resource Evaluation. Annual report 2012–13. Melbourne: ANZICS CORE, 2014.
39 Kramer AA, Higgins TL, Zimmerman JE. Comparing observed and predicted mortality among ICUs using different prognostic systems: why do performance assessments differ? Crit Care Med 2015; 43: 261-9.
40 Wunsch H, Brady AR, Rowan K. Impact of exclusion criteria on case mix, outcome, and length of stay for the severity of disease scoring methods in common use in critical care. J Crit Care 2004; 19: 67-74.
