Improved hospital-level risk adjustment for surveillance of healthcare-associated bloodstream infections: a retrospective cohort study

Background To allow direct comparison of bloodstream infection (BSI) rates between hospitals for performance measurement, observed rates need to be risk adjusted according to the types of patients cared for by the hospital. However, attribute data on all individual patients are often unavailable and hospital-level risk adjustment needs to be done using indirect indicator variables of patient case mix, such as hospital level. We aimed to identify medical services associated with high or low BSI rates, and to evaluate the services provided by the hospital as indicators that can be used for more objective hospital-level risk adjustment.
Methods From February 2001 to December 2007, 1719 monthly BSI counts were available from 18 hospitals in Queensland, Australia. BSI outcomes were stratified into four groups: overall BSI (OBSI), Staphylococcus aureus BSI (STAPH), intravascular device-related S. aureus BSI (IVD-STAPH) and methicillin-resistant S. aureus BSI (MRSA). Twelve services were considered as candidate risk-adjustment variables. For OBSI, STAPH and IVD-STAPH, we developed generalized estimating equation Poisson regression models that accounted for autocorrelation in longitudinal counts. Due to a lack of autocorrelation, a standard logistic regression model was specified for MRSA.
Results Four risk services were identified for OBSI: AIDS (IRR 2.14, 95% CI 1.20 to 3.82), infectious diseases (IRR 2.72, 95% CI 1.97 to 3.76), oncology (IRR 1.60, 95% CI 1.29 to 1.98) and bone marrow transplants (IRR 1.52, 95% CI 1.14 to 2.03). Four protective services were also found. A similar but smaller group of risk and protective services was found for the other outcomes. Acceptable agreement between observed and fitted values was found for the OBSI and STAPH models but not for the IVD-STAPH and MRSA models. However, the IVD-STAPH and MRSA models successfully discriminated between hospitals with higher and lower BSI rates.
Conclusion The high model goodness-of-fit and the higher frequency of OBSI and STAPH outcomes indicated that hospital-specific risk adjustment based on medical services provided would be useful for these outcomes in Queensland. The low frequency of IVD-STAPH and MRSA outcomes indicated that development of a hospital-level risk score was a more valid method of risk adjustment for these outcomes.


Background
Healthcare-acquired infection (HAI) is a major contributor to patient morbidity and mortality [1], particularly bloodstream infections (BSI), which are expensive and difficult to treat [2]. Queensland Health has initiated a quality improvement programme, the Centre for Healthcare Related Infection Surveillance and Prevention (CHRISP), which undertakes standardized surveillance of HAI in public hospitals in Queensland.
To allow direct comparison of rates of HAI between hospitals, observed rates need to be risk adjusted according to the types of patients cared for by the hospital [3]. Without risk adjustment, hospitals might be penalized for high infection rates that arise due to the type of patients cared for rather than quality of patient care [4]. For surgical site infections, this involves risk adjusting individual patient outcomes according to measures of health status and surgical complexity [5]. However, for BSI, no individual data are collected by CHRISP on the general patient population, meaning that risk adjustment for hospital BSI rates has to be done indirectly, based on attributes of the hospital.
At present, expected BSI rates are crudely calculated for three hospital strata (called "levels") which roughly correspond to the size of, and types of services provided by, the facility. Level I hospitals are tertiary teaching hospitals, level II hospitals are large general hospitals and level III hospitals are smaller general hospitals. Level I hospitals tend to have higher rates of BSI than levels II and III hospitals, and crude risk adjustment based on hospital level allows for some of the between-hospital variation associated with patient case mix to be accounted for. However, CHRISP is seeking a more objective approach to risk adjustment based on hospital attributes (i.e. services provided) that are directly associated with BSI risk. The aim of the present study was to identify hospital services associated with high or low rates of BSI and to evaluate the services as indicators that can be used for improved hospital-level risk adjustment.

Methods
CHRISP was initiated in 2000 with joint funding from the Australian Government Department of Health and Ageing, and Queensland Health, the Queensland government public health service. Surveillance methods are described in detail elsewhere [6], but here we provide a brief description of CHRISP surveillance of BSI. BSI data collection commenced in February 2001 on a voluntary basis and involved the 10 largest public hospitals in Queensland. In May 2002, an additional 11 smaller general hospitals were included. HAI (including BSI) data were collected by infection control practitioners in each participating hospital using hand-held computing devices. Standard BSI definitions based on the United States National Nosocomial Infection Surveillance (NNIS) system definitions were used in all hospitals [7]. Patient de-identified data were transferred to an electronic surveillance software package, Electronic Infection Control Assessment Technology version 4.2 (eICAT, CHRISP, Brisbane, Australia), from which the data for this study were extracted. Ultimately, data were available from 5 level I, 10 level II and 6 level III public hospitals.

Statistical analysis
Four types of BSI were investigated: Overall BSI (OBSI); BSI caused by Staphylococcus aureus (STAPH); Intravascular-device-related S. aureus BSI (IVD-STAPH) and BSI caused by methicillin-resistant S. aureus (MRSA), with the latter two forming overlapping subsets of STAPH and STAPH forming a subset of OBSI. As the frequency of MRSA monthly counts (number of infections per month) was low, with only 6.3 percent of all MRSA events being multiple events in the same month, this outcome was dichotomized to presence or absence of infections in each hospital and month. All BSI data were collected at an aggregated hospital level every month.
Split-sample validation was employed in the analysis. The training dataset consisted of a retrospective cohort of hospital-level monthly counts, comprising almost six years (71 months) of longitudinal data, collected from February 2001 to December 2006. The validation dataset comprised one year of longitudinal data, collected from January to December 2007. Three level II hospitals with multiple periods of missing longitudinal data were removed prior to analysis. The remaining 18 hospitals also had 11.3 percent missing outcome data because not all of these hospitals had joined CHRISP and begun contributing data at the same time. However, these hospitals did not have missing data from the time they started contributing. The training dataset had a total of 1122 observations. Generalized estimating equation (GEE) Poisson regression models, typically used to compute population-averaged parameter estimates, were developed to identify risk and protective services for the OBSI, STAPH and IVD-STAPH outcomes. The total number of patient days per month was used as an exposure variable in the models to capture the activity level of the hospital in a particular month. We used the quasilikelihood under the independence model information criterion (QIC), which is analogous to the Akaike information criterion (AIC) [8] for likelihood-based models, to select a parsimonious model with the best fitting temporal autocorrelation structure. As for AIC, a lower QIC indicates a better trade-off between model complexity and fit [9][10][11]. For the dichotomous MRSA outcome, an independence logistic regression model was developed to identify risk factor services.
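For reference, the mean structure implied by a Poisson model with patient days as the exposure can be written as below. This is a sketch under stated assumptions rather than the exact specification fitted in the study: the symbols Y_it (BSI count for hospital i in month t), n_it (patient days), x_ik (indicator that hospital i provides service k) and beta_k are notation introduced here for illustration only.

```latex
\log \mathrm{E}[Y_{it}] = \log(n_{it}) + \beta_0 + \sum_{k=1}^{K} \beta_k x_{ik},
\qquad \mathrm{IRR}_k = \exp(\beta_k)
```

Under this form, exp(beta_k) is the ratio of expected rates per patient day between hospitals with and without service k, holding the other services fixed.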
Saturated models with the following 12 candidate medical services were fitted: acute renal dialysis, acquired immune deficiency syndrome (AIDS), alcohol/drugs, cardiac surgery, diabetes, hospice care, infectious diseases, intensive care, plastic surgery, obstetrics and maternity, oncology and bone marrow transplants. Five candidate medical services were excluded due to collinearity: acute spinal injury, burns, neurosurgery, obstetrics and intensive care. Collinearity arose for a medical service when there was minimal variation in that service across hospitals. For example, the intensive care service was collinear because it was offered by most hospitals and had a similar distribution across hospitals to the infectious diseases service. The general surgery service was also excluded because it was provided by all hospitals.
Parsimonious models were sought by dropping non-significant medical services using an α-level of 0.05. Parameter estimates for the GEE Poisson regression models were expressed in terms of incidence rate ratios (IRR) and 95% confidence intervals. Parameter estimates for the logistic regression model were expressed in terms of odds ratios (OR) and 95% confidence intervals.
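The conversion from a log-scale regression coefficient to a ratio with a 95% confidence interval can be sketched as follows. This is a minimal illustration assuming a Wald-type interval (estimate plus or minus 1.96 standard errors on the log scale); the function name and the numeric inputs are hypothetical and are not taken from the fitted models.

```python
import math


def ratio_with_ci(coef, se, z=1.96):
    """Exponentiate a log-scale coefficient and its Wald interval.

    Works the same way for an incidence rate ratio (Poisson model)
    and an odds ratio (logistic model).
    """
    if se < 0:
        raise ValueError("standard error must be non-negative")
    return (
        math.exp(coef),
        math.exp(coef - z * se),
        math.exp(coef + z * se),
    )


# Hypothetical coefficient and standard error, for illustration only.
ratio, lower, upper = ratio_with_ci(coef=0.5, se=0.1)
print(f"ratio {ratio:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Because the interval is built on the log scale, it is symmetric around the coefficient but asymmetric around the exponentiated ratio.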

Goodness of fit analysis
For the count outcomes, the concordance correlation was computed as a measure of agreement between the observed and fitted values [12][13][14]. High levels of agreement implied that the model's fitted values closely matched the observed values. Harrell's c-index was also derived, as a measure of discrimination between hospitals with higher or lower infection rates. For dichotomous outcomes, the c-index is equivalent to the area under the ROC curve (AUC).
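The agreement measure can be illustrated with a short calculation. The sketch below assumes that the measure intended is Lin's concordance correlation coefficient; the function name and the example values are hypothetical.

```python
def concordance_correlation(observed, fitted):
    """Lin's concordance correlation between two equal-length sequences.

    Unlike the Pearson correlation, it is reduced by any systematic
    shift or scale difference between observed and fitted values.
    """
    n = len(observed)
    if n != len(fitted) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_o = sum(observed) / n
    mean_f = sum(fitted) / n
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_f = sum((f - mean_f) ** 2 for f in fitted) / n
    cov = sum((o - mean_o) * (f - mean_f)
              for o, f in zip(observed, fitted)) / n
    return 2 * cov / (var_o + var_f + (mean_o - mean_f) ** 2)


# Hypothetical monthly counts and fitted values.
print(concordance_correlation([1, 2, 3], [1, 2, 3]))  # perfect agreement
print(concordance_correlation([1, 2, 3], [2, 3, 4]))  # constant over-prediction
```

In the second call the fitted values are perfectly correlated with the observed values but are consistently one unit too high, so the coefficient falls below 1.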
For the dichotomous MRSA outcome, a Hosmer-Lemeshow test with 10 groups was performed to assess the logistic regression model. A p-value greater than .05 indicated no statistical evidence of a poorly fitting model. Receiver operating characteristic (ROC) analysis was conducted and the AUC was computed. The AUC measured discrimination, which is the ability of the model to correctly predict the months with and without infections. An AUC of 0.5 represented a model that predicts no better than random guessing and an AUC of 1 represented a model that predicts perfectly.
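For a dichotomous outcome, the equivalence between the c-index and the AUC can be seen by counting concordant pairs directly. The following is a minimal sketch, assuming outcomes coded 1 (infection present) and 0 (absent); the function name and data are hypothetical.

```python
def c_index(outcomes, predicted):
    """Share of (event, non-event) pairs in which the event has the
    higher predicted probability; tied predictions count as one half."""
    events = [p for y, p in zip(outcomes, predicted) if y == 1]
    non_events = [p for y, p in zip(outcomes, predicted) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p_event in events:
        for p_non in non_events:
            if p_event > p_non:
                concordant += 1.0
            elif p_event == p_non:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Hypothetical hospital-months: outcome and predicted probability.
auc = c_index([0, 0, 1, 1], [0.10, 0.40, 0.35, 0.80])
print(auc)
```

A model that assigns the same probability to every month scores 0.5 under this definition, which corresponds to the random-guessing benchmark described above.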

Level re-classification based on risk scoring
For count outcomes where the models had a low concordance correlation, use of the regression model coefficients to calculate expected rates for direct hospital-level risk adjustment was not indicated. For these outcomes, and for the MRSA outcome, which occurred with a low frequency, an alternative risk-scoring approach [15] was explored. In this approach, a risk score that reflected the high and low risk services provided by a particular hospital was calculated by totaling the regression coefficients from the applicable medical services provided by that hospital. So, a hospital with an infectious diseases and cardiac surgery service would have a risk score based on the sum of the regression coefficients from those two services. Homogeneous subgroups of hospitals with similar risk scores were then identified and these groupings were used to reclassify the original hospital levels. To demonstrate the impact of re-classification, Bayesian shrinkage plots [16,17] were created using risk-adjusted rates calculated according to the original and re-classified hospital levels. Shrinkage estimators have been used extensively to derive better estimates of the true infection rates in hospitals. They minimize the mean squared error of parameter estimates between hospitals, adjust for variation in sample size and account for regression to the mean for individual hospitals. Statistical analyses were performed using Stata 10.1 software (StataCorp, College Station, TX, USA) and R 2.7.1 (R Core Development Team, Vienna, Austria).
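The risk-scoring step described above can be sketched as follows. This is an illustration under stated assumptions, not the scoring used in the study: the coefficient values and signs, the hospital labels and the cut points used to form levels are all hypothetical, and the study does not state here how the homogeneous subgroups were delimited.

```python
def risk_score(services_offered, coefficients):
    """Sum the model coefficients of the services a hospital provides.

    Services that are not in the final model contribute nothing.
    """
    return sum(coefficients.get(s, 0.0) for s in services_offered)


def assign_level(score, cut_points):
    """Map a score to level I, II or III using two assumed cut points."""
    upper, lower = cut_points
    if score >= upper:
        return "I"
    if score >= lower:
        return "II"
    return "III"


# Hypothetical log-scale coefficients (values and signs are made up).
coefficients = {
    "infectious diseases": 0.9,
    "cardiac surgery": 0.4,
    "obstetrics and maternity": -0.5,
}
hospitals = {
    "A": ["infectious diseases", "cardiac surgery"],
    "B": ["cardiac surgery", "obstetrics and maternity"],
    "C": ["obstetrics and maternity"],
}
levels = {}
for name, services in hospitals.items():
    score = risk_score(services, coefficients)
    levels[name] = assign_level(score, cut_points=(1.0, -0.25))
    print(name, round(score, 2), levels[name])
```

Summing coefficients on the log scale is equivalent to multiplying the corresponding rate or odds ratios, so the score orders hospitals by their model-implied relative risk.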

Results
The mean BSI rate per month by outcome type and mean number of patient days per month, stratified by original hospital levels are displayed in Table 1. The results indicated that level I hospitals had the highest rates of BSI and they were the busiest group of hospitals with the highest number of patient days per month. Level III hospitals tended to have a very low number of infections per month. Across all levels, the outcomes of IVD-STAPH and MRSA were infrequent relative to OBSI and STAPH. Plots of numbers of OBSI per month are presented for selected hospitals in Figure 1.

Regression models
The OBSI model is shown in Table 2. The c-index was 0.83 (95% CI .81 to .86), which indicated that the model had a high ability to discriminate. The highest OBSI rates among all hospitals were found in hospitals 7 and 10 (Figure 1); these two hospitals had all four risk services found by the GEE Poisson model. The risk for OBSI in these hospitals may be compounded as these four risk services were found together.
For the STAPH models, the parsimonious GEE Poisson model with an autoregressive structure of two lags is shown in Table 3. Three risk services and three protective services were found. The concordance correlation was 0.73 (95% CI .68 to .78). Thus there was a moderate level of agreement. The c-index was 0.82 (95% CI .78 to .86), indicating a high level of discrimination.
IVD-STAPH had very low monthly counts with a large proportion of zeroes (75.9%). The QIC results suggested an autoregressive structure of lag 2 was most suitable. The parsimonious GEE model with AR 2 correlation is shown in Table 4. Three risk services and one protective service were identified. The concordance correlation was 0.58 (95% CI .50 to .65), which indicated a low level of agreement, mainly due to the model being unable to predict a substantial number of observed zeroes. However, the c-index was moderately high at 0.78 (95% CI .72 to .85), which suggested sufficient ability to discriminate between lower and higher infection rates among hospitals.
MRSA had very low monthly counts with a large proportion of zeroes (79.7%) and a maximum monthly count of four events. The QIC suggested an independence structure adequately reflected the correlation structure. The parsimonious logistic regression model is shown in Table 5. Three risk services and one protective service were found. The Hosmer-Lemeshow goodness of fit test with 10 groups suggested the model fitted adequately (χ²(8) = 6.74, P = .565). The AUC suggested good discrimination between observed and fitted values (AUC = .81, exact 95% CI .79 to .83).
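The reported Hosmer-Lemeshow P-value can be checked from the test statistic and its degrees of freedom (number of groups minus two, i.e. 8). The sketch below uses the closed-form upper-tail probability of the chi-squared distribution, which holds only for even degrees of freedom; the function name is hypothetical.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """P(X > x) for a chi-squared variable with an even number of
    degrees of freedom: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!"""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires positive, even df")
    if x < 0:
        raise ValueError("statistic must be non-negative")
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


p_value = chi2_upper_tail_even_df(6.74, 8)
print(f"P = {p_value:.3f}")
```

With a statistic of 6.74 on 8 degrees of freedom this reproduces the reported P = .565 to three decimal places.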

Level re-classification based on risk scoring
Direct hospital-specific risk adjustment using the IVD-STAPH model was not recommended due to the low concordance correlation. Note, a GEE negative binomial model and a zero-inflated Poisson (ZIP) model [18] were also fitted for IVD-STAPH but resulted in similarly low concordance correlations. Therefore, the risk scoring approach was used for IVD-STAPH and MRSA. Table 6 demonstrates the calculation of the risk score, and subsequent hospital reclassification for MRSA. Figure 2 shows a Bayesian shrinkage plot for five years of MRSA surveillance data with risk adjustment by the original hospital levels. The two and three standard deviation boundaries (sigma control limits) can be used to identify which hospitals have significantly over or under-performed relative to their peers. Hospitals 1 and 12 performed significantly worse than average at the 2 and 3 sigma control limits. Figure 3 shows that when risk adjustment was performed using the re-classified levels (based on the risk score), hospital 12 remained an outlier but hospital 1 was clearly within the 2 and 3 sigma control limits. Thus hospital 1, which had been re-classified from level II to level I (Table 6), was under control. When deriving the shrinkage ratios for MRSA, the estimate of the between-hospital variation of the true rates was obtained. With the original levels, the variation between hospitals was 0.332. With the new levels, the variation decreased to 0.138.
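The role of the between-hospital variation in shrinkage can be illustrated with a generic normal-approximation estimator. This is a sketch under stated assumptions and not necessarily the estimator used for Figures 2 and 3: treating the reported values 0.332 and 0.138 as a between-hospital variance on the same scale as the observed value is an assumption, and the observed value, group mean and sampling variance below are hypothetical.

```python
def shrinkage_weight(between_variance, sampling_variance):
    """Weight placed on a hospital's own observed value
    (1 = no shrinkage, 0 = full shrinkage to the group mean)."""
    total = between_variance + sampling_variance
    if between_variance < 0 or sampling_variance < 0 or total == 0:
        raise ValueError("variances must be non-negative, not both zero")
    return between_variance / total


def shrunken_estimate(observed, group_mean, between_variance,
                      sampling_variance):
    """Pull the observed value towards the group mean."""
    w = shrinkage_weight(between_variance, sampling_variance)
    return group_mean + w * (observed - group_mean)


# Hypothetical hospital: observed value 2.0, group mean 1.0,
# assumed sampling variance 0.2.
for tau2 in (0.332, 0.138):
    est = shrunken_estimate(2.0, 1.0, tau2, 0.2)
    print(f"between-hospital variance {tau2}: estimate {est:.2f}")
```

Under this form, a smaller between-hospital variance gives less weight to each hospital's own observed value, so estimates within a level are drawn more strongly towards the level mean.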

Discussion
This study aimed to develop and evaluate risk adjustment for BSI rates at a hospital level, based on services provided by those hospitals. Risk adjustment is clearly necessary given the large differences in rates between hospital levels across all four infection outcomes (Table 1; Figure 1). Our results suggest that hospital-specific risk adjustment based on medical services provided is strongly recommended for OBSI and STAPH. Expected infection counts (calculated using patient day denominators) may be obtained directly from the risk-adjustment models. By contrast, hospital-level risk adjustment with a risk score approach is recommended for IVD-STAPH and MRSA in lieu of direct hospital-specific risk adjustment. These methods may be used to derive less biased observed-expected ratios of monthly BSI than the crude approach currently being used, where risk adjustment is based on the hospital level. CHRISP is currently implementing risk adjustment using the models presented in this report.
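The calculation of an expected count and an observed-expected ratio from a fitted Poisson model can be sketched as below. The intercept, coefficients, service list and counts are hypothetical and are not the estimates from Tables 2 to 4; the function names are introduced here for illustration.

```python
import math


def expected_count(patient_days, intercept, coefficients, services_offered):
    """Expected monthly count: patient days times the modelled rate.

    The rate is exp(intercept + sum of coefficients for the services
    the hospital provides); other services contribute nothing.
    """
    linear_predictor = intercept + sum(
        coefficients.get(s, 0.0) for s in services_offered
    )
    return patient_days * math.exp(linear_predictor)


def observed_expected_ratio(observed, expected):
    """Ratio above 1 means more infections than the model expects."""
    if expected <= 0:
        raise ValueError("expected count must be positive")
    return observed / expected


# Hypothetical model: baseline of 1 BSI per 1000 patient days and a
# service that doubles the rate.
coefs = {"oncology": math.log(2.0)}
expected = expected_count(10000, math.log(0.001), coefs, ["oncology"])
ratio_oe = observed_expected_ratio(observed=25, expected=expected)
print(round(expected, 2), round(ratio_oe, 2))
```

A ratio of this kind is the quantity that could be placed on the y-axis of a funnel or shrinkage plot.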
The risk-adjusted ratios may be implemented on the y-axis in funnel plots [19] and Bayesian shrinkage plots for continuous quality improvement. Shrinkage plots for MRSA demonstrate that hospital 1, reported as an outlier using the original hospital level classification, was found to be under control using our re-classified hospital levels. Table 6 indicated that the hospital, originally classified as a level II hospital, was reclassified as a level I hospital using the risk score based on the logistic regression model. This was because the hospital offered acute renal dialysis and infectious diseases services, which were the highest risk services found in the MRSA model. Hospital 1 actually had the highest risk score among all hospitals. It was also found that between-hospital variation in rate estimates was higher for the original, crudely adjusted values than for the new risk-adjusted values. This is further evidence to support the reclassification, as the new levels produced a more homogeneous group of true rates within each level, and demonstrates that the risk scoring approach has had a significant impact on the interpretation of observed MRSA rates.
AIDS, infectious diseases, oncology, renal dialysis, cardiac surgery and transplant services were found to be high risk services for BSI in one or more models. This is unsurprising given the compromised immune state of most patients cared for by AIDS, oncology and transplant services, and the large number of invasive procedures conducted in oncology, renal dialysis and cardiac surgery wards. Infectious disease services were highly collinear with intensive care services and the finding of infectious disease services as high risk could relate to the health status and number of invasive procedures experienced by patients in intensive care units and other collinear services. Another possibility is that hospitals with infectious disease units might perform better at BSI surveillance, with a higher probability of identifying and reporting BSI cases. This requires further investigation. We note that exclusion of the three level II hospitals with missing data potentially reduced the power of the statistical models to identify risk and protective services, and might have introduced an immeasurable source of bias.
Although hospital-level risk adjustment based on hospital services is a more objective and refined approach than that based on crude hospital levels, hospital services remain an indirect indicator of patient case-mix. Use of service-specific infection rates (which were not available in the current study) or attribute data of individual patients (also not available for the general patient population) would facilitate a more accurate and robust approach to risk adjustment. Further research will focus on developing risk-adjustment models that incorporate more sophisticated denominators such as central line-days to calculate BSI rates [20]. CHRISP is in the process of initiating a pilot of central line-day data collection within a major hospital. It is possible that variation in surveillance quality could contribute to observed variation in BSI rates between hospitals. We cannot capture this in our models, but if a hospital signals (i.e. has a higher than expected rate), an investigation should be conducted to determine whether the signal is a reporting artifact or the result of an infection control breakdown. While our validation results suggest that the models were robust over time, we do not have data available from another geographical area (e.g. another Australian state) for external validation purposes; this is something we wish to investigate in the future.

Conclusion
The results of the models are generalizable to the network of public hospitals in Queensland. While the estimates of the models themselves may not be generalizable to other healthcare systems with different patient case mixes and organization of medical services, the statistical methods of risk adjustment presented here are widely applicable to other healthcare systems that collect BSI surveillance data at an aggregated hospital level. The Australian government is currently mandating S. aureus BSI as a key performance indicator, and risk adjustment will be essential to ensure that hospitals that offer high risk services will not be unfairly penalized given the underlying propensity of their patients to develop BSI. Therefore, the imperative is great for more objective methods of risk adjustment such as the approach outlined in this report.