A few antibiotics can represent the total hospital antibiotic consumption

Background Appropriate antibiotic use has become an important issue. However, collecting data on the use of all antibiotics in a hospital is difficult without an advanced computerized system and dedicated staff. This paper examines whether 1–3 antibiotics can satisfactorily represent the total antibiotic consumption at the hospital level.
Methods We collected antibiotic data from six large university hospitals in Korea for some years between 2004 and 2012. Since the total antibiotics consist of a few chosen representative antibiotics and the rest, we used those chosen antibiotics along with additional variables constructed only from t (time): terms such as $t$, $t^2$ and $t^3$ to capture the time trend, and indicators of whether t belongs to each month to capture the monthly variations. The ordinary least squares method was used to explain the total antibiotic amount with these variables, and then the estimated model was employed to predict the use for 2013. To determine which antibiotics were the most representative in tracking general trends in antibiotic use over time, we tried various combinations of antibiotics to find the combination that minimized the 2013 prediction error.
Results We found that fluoroquinolones and aminoglycosides were the most representative, followed by beta-lactam/beta-lactamase inhibitors and 4th-generation and 3rd-generation cephalosporins. The mean prediction error over 12 months in 2013 with these few antibiotics was only 1–3% of the monthly antibiotic consumption amount.
Conclusions The total antibiotic consumption amount at the hospital level can be represented sufficiently by a few antibiotics, such as fluoroquinolones and aminoglycosides, which means that hospitals can save resources by tracing only the usage of those few antibiotics instead of the entire inventory. Since the choice of fluoroquinolones and aminoglycosides is based solely on our Korean data, other hospitals may follow the same modelling methodology to find their own representative antibiotics.
Electronic supplementary material The online version of this article (10.1186/s12879-018-3132-7) contains supplementary material, which is available to authorized users.


Background
Antimicrobial resistance is a worldwide problem, which poses a serious threat to global public health [1]. In 2011, to prevent further worsening of the problem, the World Health Organization (WHO) urged nations to be alert for antimicrobial resistance and called for urgent action to decrease antimicrobial consumption [2]. In accordance with the initiative, the Korean government launched a national action plan on antimicrobial resistance in 2016 [3].
As is well documented, the overuse/misuse of antibiotics has been recognized as a key factor for the emergence of antimicrobial-resistant organisms [4]. Inappropriate antibiotic use also causes extra medical expenses: unnecessary or duplicative antibiotic use in US hospitals led to an estimated $163 million in excess costs [5]. Hence, many experts have suggested establishing antimicrobial stewardship programmes in hospitals as well as in communities [1]. The first step to combat antibiotic abuse is finding out the severity of the problem. This calls for a proper measurement of antibiotic consumption [6], which helps to understand the epidemiology of antimicrobial resistance and provides hospitals with useful data to implement policies and guidelines about proper antibiotic usage [6].
To this end, we collected data on antibiotic prescriptions from six large Korean university hospitals with good computerized systems. Data collection was nevertheless difficult because there were many different types of antibiotics and no experienced or dedicated staff to collect the data. Data on the prescribing department, administration route and drug volume were readily available, but antibiotics were recorded by brand name rather than by ingredient name or antibiotic class name. Converting these records into a suitable and consistent form required extra effort.
The goal of this paper is to explore whether tracking only a few representative antibiotics is enough to determine the total antibiotic consumption at the hospital level. If so, considerable time and effort could be saved compared with keeping track of all antibiotic use. To this end, we built a statistical model in which 1-3 representative antibiotics are chosen to predict the total antibiotic consumption at the hospital level with an acceptably small prediction error.

Study design
We build a simple linear statistical model in which the total antibiotic consumption at the hospital level is explained by 1-3 representative antibiotics along with time and month dummy variables. The time and month dummy variables are "free", as they depend on the time index t only. Because the total amount consists of the representative antibiotics and the remaining (non-representative) ones, this modelling strategy amounts to explaining the remaining antibiotics using their correlations with the representative antibiotics as well as the time and month dummy variables. We estimate the model using the observations over 2004-2012 of six large university hospitals in Korea, one of which is the Hanyang University Seoul Hospital (HUS). The model's prediction capability is then evaluated for the ensuing year (2013), using the data from HUS.

Data source
We collected data on the total antibiotic prescriptions for inpatients and their total patient days in 2004, 2008 and 2012 from six university hospitals (4 tertiary and 2 secondary) in Korea: Hanyang University Seoul Hospital (758 beds), Chungbuk University Hospital (620 beds), Chonnam University Hospital (970 beds), Gyeongsang University Hospital (889 beds), Hanyang University Guri Hospital (578 beds), and Korea University Ansan Hospital (543 beds). In addition, we collected data from HUS for each year between 2004 and 2013 on the total antibiotic prescription records and the total patient days.
All data were extracted from the electronic billing system by the data processing department in each hospital.

Definitions
We define antibiotics as medications in class J01 of the Anatomical Therapeutic Chemical (ATC) classification, which includes neither antifungal agents nor anti-tuberculosis agents. Systemic agents with oral or parenteral administration routes are included, but topical agents are excluded. We convert the amount of each antibiotic class to defined daily doses (DDD) using the ATC classification of the WHO and then standardize per 1000 patient days [7].
Let $n_{ht}$ denote the patient days for hospital $h = 1, \ldots, 6$ and month $t$; $t$ ranges over $1, \ldots, 12$ (year 2004), $49, \ldots, 60$ (year 2008) and $97, \ldots, 108$ (year 2012) for the hospitals other than HUS, and over $1, \ldots, 108$ for HUS. Let $\mathrm{DDD}_{aht}$ denote the DDD for antibiotic $a$, hospital $h$ and time $t$. With '≡' standing for "defined as", let the 'all-hospital patient days at time $t$' and the 'all-hospital DDD for antibiotic $a$ at time $t$' be
$$n_t \equiv n_{1t} + \cdots + n_{6t} \quad \text{and} \quad \mathrm{DDD}_{at} \equiv \mathrm{DDD}_{a1t} + \cdots + \mathrm{DDD}_{a6t}.$$
Then, the all-hospital DDD per 1000 patient days for antibiotic $a$ at month $t$ is
$$X_{at} \equiv \frac{\mathrm{DDD}_{at}}{n_t} \times 1000.$$
Let $m$ ('m' for main) be the number of the main (i.e., representative) antibiotics; $m = 1$, 2 or 3 in this paper. Listing the main antibiotics first, the total antibiotic amount at time $t$ can be written as
$$Y_t \equiv \text{the main antibiotic amount} + \text{the others} = \sum_{a=1}^{m} X_{at} + \sum_{a=m+1}^{19} X_{at}. \qquad (1)$$
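The standardization defined above can be sketched in a few lines of Python. This is only an illustration of the formula for the all-hospital DDD per 1000 patient days; the function name and the input figures are hypothetical and are not taken from the study data.

```python
def ddd_per_1000_patient_days(ddd_by_hospital, patient_days_by_hospital):
    """All-hospital DDD per 1000 patient days for one antibiotic and one month."""
    if len(ddd_by_hospital) != len(patient_days_by_hospital):
        raise ValueError("need one DDD value and one patient-day count per hospital")
    ddd_at = sum(ddd_by_hospital)        # DDD_at = DDD_a1t + ... + DDD_a6t
    n_t = sum(patient_days_by_hospital)  # n_t = n_1t + ... + n_6t
    if n_t <= 0:
        raise ValueError("total patient days must be positive")
    return ddd_at / n_t * 1000.0         # X_at


# Hypothetical figures for one antibiotic class in one month (not from the paper).
ddd = [1200.0, 800.0, 1500.0, 1100.0, 700.0, 700.0]
days = [20000, 15000, 25000, 22000, 9000, 9000]
x_at = ddd_per_1000_patient_days(ddd, days)
```

Summing such values over all antibiotic classes for a given month gives the total amount for that month.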

Statistical Methodology
To achieve our goal of representing the total $Y_t$ with the main antibiotics, it is necessary to account for the remaining part $\sum_{a=m+1}^{19} X_{at}$ in (1) in a simple way. We achieved this by replacing the sum $\sum_{a=m+1}^{19} X_{at}$ with "free variables". If the free variables can represent $\sum_{a=m+1}^{19} X_{at}$ well enough, then we do not have to collect data on those remaining antibiotics.
We used three types of free variables to account for $\sum_{a=m+1}^{19} X_{at}$: (i) the time index $t$ to capture the trend, (ii) month dummies to capture the monthly variations, and (iii) some calendar time dummies to capture "structural breaks" (i.e., big events), if there are any. Since all three types are determined by $t$, no data collection is needed for them. We illustrate these three types next. Let $1[A] = 1$ if $A$ holds, and 0 otherwise.
Suppose we have $t = 1, \ldots, 17$ monthly observations over January 2004 to May 2005. First, use a polynomial function such as $\sum_{q=0}^{p} \alpha_q t^q$ to account for the trend, where the $\alpha_q$ are the parameters to be estimated using $(1, t, \ldots, t^p)$. Second, capture the monthly variations with the month dummies; e.g., the February dummy $1[t \in \text{February}]$ is to capture the February effect relative to the baseline January, where '∈' means "belonging to", and the March dummy $1[t \in \text{March}]$ is to capture the March effect relative to January. Third, there might be a big policy change, say, a crackdown on antibiotic abuse at $t = 6$ and onwards, in which case $1[6 \le t]$ can be used to account for the crackdown, which is a structural break.
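The three types of t-based variables in this illustration can be generated without any data collection. The sketch below is one way to do so; the function name, the cubic default (p = 3) and the break at t = 6 are illustrative assumptions that mirror the example in the text rather than the fitted models in the Additional file.

```python
def free_variables(t, degree=3, break_at=None):
    """Regressors determined by t alone: trend terms, month dummies, break dummy."""
    trend = [float(t) ** q for q in range(degree + 1)]  # 1, t, ..., t^p
    month = (t - 1) % 12 + 1                            # t = 1 is a January
    # Eleven dummies for February..December; January is the baseline.
    month_dummies = [1.0 if month == k else 0.0 for k in range(2, 13)]
    row = trend + month_dummies
    if break_at is not None:
        row.append(1.0 if t >= break_at else 0.0)       # 1[break_at <= t]
    return row


# t = 1..17 covers January 2004 to May 2005, with a hypothetical break at t = 6.
rows = [free_variables(t, degree=3, break_at=6) for t in range(1, 18)]
```

Each row would be placed next to the main antibiotic amounts for the same month to form the full set of regressors.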
Since the main antibiotics and t-based variables are in the model, whereas the other antibiotics are not, in essence, the omitted non-representative antibiotics are explained by their correlations with the main antibiotics and the t-based variables. After the model parameters are estimated using the time-series data up to $t = 108$ (December 2012), we then construct the predicted $Y_t$ for $t = 109, \ldots, 120$ (2013) for HUS using the estimated model; let $\hat{Y}_t$ denote the predicted value.
After model estimation using $t = 1, \ldots, 108$, predicting $\hat{Y}_t$ for $t = 1, \ldots, 108$ is "in-sample prediction", and predicting $\hat{Y}_t$ for $t = 109, \ldots, 120$ is "out-sample prediction". The out-sample prediction is used to pick the representative antibiotics, and the in-sample prediction is just to see how the chosen representative antibiotics perform in fitting the in-sample observations. Since we put more emphasis on predicting the future than on explaining the past, the out-sample prediction is our primary criterion to determine the representative antibiotics, whereas the in-sample prediction is secondary.
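The estimate-then-predict workflow can be sketched as follows. This is a minimal pure-Python illustration, not the code used in the study: the helper names are hypothetical, the normal equations are solved directly only to keep the example self-contained, and the data are synthetic and noise-free so that the out-sample predictions can be checked exactly.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("regressors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(X, y):
    """OLS coefficients from the normal equations (X'X) beta = X'y."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(X, beta):
    return [sum(b * x for b, x in zip(beta, row)) for row in X]


# Synthetic example: total = 100 + 0.5 * t + 5 * main (all values hypothetical).
main = {t: 50.0 + 10.0 * ((7 * t) % 5) for t in range(1, 37)}
total = {t: 100.0 + 0.5 * t + 5.0 * main[t] for t in range(1, 37)}
design = {t: [1.0, float(t), main[t]] for t in range(1, 37)}

in_sample = range(1, 25)    # estimation period
out_sample = range(25, 37)  # ensuing 12 months

beta = ols([design[t] for t in in_sample], [total[t] for t in in_sample])
y_hat_future = predict([design[t] for t in out_sample], beta)
y_future = [total[t] for t in out_sample]
```

In practice a statistics package would replace the hand-written solver, and the design rows would also carry the month and structural-break dummies.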
To explain how to select the 1-3 main antibiotics, suppose $m = 2$. For each candidate pair of main antibiotics, we obtain $\hat{Y}_{109}, \ldots, \hat{Y}_{120}$ for 2013 and its "mean prediction error":
$$\frac{1}{12} \sum_{t=109}^{120} \left| Y_t - \hat{Y}_t \right|. \qquad (2)$$
In other words, (2) is the average of the monthly absolute deviations for 2013 between the actual and predicted antibiotic uses in DDD/1000 patient days. The particular combination of two antibiotics minimizing (2) is the best choice.
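The selection rule, i.e. choosing the combination with the smallest mean prediction error, can be sketched as below. The function names, the candidate labels "A", "B", "C" and the `predict_with` callback are hypothetical; in a real application the callback would re-estimate the model for each combination and return its 12 out-sample predictions.

```python
from itertools import combinations


def mean_prediction_error(actual, predicted):
    """Average absolute deviation between actual and predicted monthly values."""
    if len(actual) != len(predicted):
        raise ValueError("series must have the same length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def best_combination(candidates, m, actual, predict_with):
    """Return the m-antibiotic combination minimizing the mean prediction error."""
    return min(
        combinations(candidates, m),
        key=lambda combo: mean_prediction_error(actual, predict_with(combo)),
    )


# Toy stand-in for model-based predictions: each pair misses by a fixed amount.
actual_2013 = [800.0 + i for i in range(12)]
toy_miss = {("A", "B"): 3.0, ("A", "C"): 1.5, ("B", "C"): 4.0}


def toy_predict(combo):
    miss = toy_miss[combo]
    return [a + miss * (-1) ** i for i, a in enumerate(actual_2013)]


best = best_combination(["A", "B", "C"], 2, actual_2013, toy_predict)
best_error = mean_prediction_error(actual_2013, toy_predict(best))
```

The same function applies to m = 1 or m = 3 by changing the second argument.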
We consider different values for $m$ because there is a trade-off between setting $m$ large and setting it small. If $m$ is large, say 10, we can trace the overall antibiotic consumption better, but then the representativeness will be worse; if $m$ is small, say 1, then the opposite happens. Between these extremes, 1-3 seem to be reasonable values, and for each chosen value of $m$, we try different combinations of antibiotics.
The model for the ordinary least squares (OLS) estimator, where the main antibiotic amount and the above "free" t-based variables collectively explain $Y_t$, is shown in Additional file 1: Tables S3 and S4; Table S3 uses all six hospitals' data, whereas Table S4 uses only the HUS data for the model estimation. In each table, the OLS estimates and their standard errors (SE) are provided. Dividing an estimate by its SE gives the 't-value' or 'z-score'. A t-value above 2 in absolute value indicates statistical significance at the 5% level, i.e., we set statistical significance at P < 0.05. $R^2$ shows the proportion of the $Y_t$ variation explained by all "regressors" (i.e., explanatory variables) jointly.
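The two summary quantities mentioned here are simple to compute once estimates, standard errors and fitted values are available. The sketch below uses hypothetical function names and made-up numbers; the threshold of 2 is the rule of thumb stated in the text, and the $R^2$ formula assumes a model with an intercept.

```python
def t_value(estimate, standard_error):
    """Estimate divided by its standard error."""
    return estimate / standard_error


def is_significant(estimate, standard_error, threshold=2.0):
    """Rule of thumb from the text: |t| above 2 suggests significance at about 5%."""
    return abs(t_value(estimate, standard_error)) > threshold


def r_squared(actual, fitted):
    """Share of the variation in the dependent variable explained by the regressors."""
    mean_y = sum(actual) / len(actual)
    ss_res = sum((y - f) ** 2 for y, f in zip(actual, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in actual)
    return 1.0 - ss_res / ss_tot


# Made-up numbers for illustration only.
r2 = r_squared([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
```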

Most prescribed antibiotics in the pooled data
Pooling all time-series data of the six hospitals into one big data set, Table 1 provides descriptive statistics for all six hospitals, including HUS, as well as for HUS alone; the unit for all numbers is DDD/1000 patient days. The average total antibiotic consumption of the six hospitals plus/minus the standard deviation (SD) is 864 ± 55.5. In contrast, monobactam, oxazolidinone and tigecycline were rarely used; there was even no use at all of these antibiotics for some months.

Out-sample prediction, representative antibiotics, and in-sample fitness
Table 2 shows the main antibiotics minimizing the mean prediction error (2). For example, the mean prediction error is 26.2 DDD/1000 patient days using only AG for m = 1, and it is 17.2 using AG and 4th CEP for m = 2, where the predictors were obtained with all six hospitals' data. In contrast, the mean prediction error is 20.7 DDD/1000 patient days using only FQ for m = 1, and it is 18.3 using FQ and AG for m = 2, where the predictors were obtained with only the HUS data.
In Table 2, when all six hospitals' data are used (the left half), AG does best (with the mean prediction error 26.2 when used alone), followed by FQ, 4th CEP, and BL-BLI. When only the HUS data are used, FQ or BL-BLI does best, followed by AG and 3rd CEP. Combining these findings, we may state that FQ is the most representative, followed by AG, BL-BLI, 4th CEP and 3rd CEP. The total number of observations is 288 = 180 + 108, where 180 = 36 months times 5 hospitals and 108 = 12 months times 9 years from HUS. Since HUS accounts for only 37.5% of these observations, the result based on the entire data set that AG is the best changes when only the HUS observations are used. The details on the OLS used for prediction are provided in the Additional file 1.
Define the "relative mean prediction error" as the mean prediction error multiplied by 100 and divided by the monthly antibiotic consumption amount in Table 1. Using the six hospitals' data, the relative mean prediction error for m = 1, 2, and 3 is, respectively, Judging from the m = 2 and 3 cases here and the rows for m = 2 and 3 in Table 2, although the predicted Y is for HUS, using all the hospital data is preferable to using only the HUS data to minimize the prediction error.
Figure 1 shows the out-sample prediction time-series plot and the difference between the observed and predicted values using all hospital data, and Fig. 2 shows the same using only the HUS data. The figures show that the predicted lines match the actual line (dotted) well, and the 95% confidence interval for the prediction error includes zero in almost all cases. Figure 3 presents the in-sample actual and fitted (m = 1, 2, 3) values for the six hospitals, and the fitness looks good; Fig. 4 does the same for HUS. Notice a large drop at t = 52 in Fig. 4, which prompted using $1[52 \le t]$ in the OLS for the HUS data.
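The arithmetic behind the observation counts and the relative mean prediction error can be checked with a short sketch. The mean prediction errors (26.2 and 17.2) and the pooled average (864) are the figures quoted in the text; using 864 as the denominator is an assumption about which Table 1 entry is meant, and the function name is hypothetical.

```python
def relative_mean_prediction_error(mean_prediction_error, monthly_consumption):
    """Mean prediction error as a percentage of monthly consumption."""
    return mean_prediction_error * 100.0 / monthly_consumption


# Assumption: the denominator is the pooled six-hospital average quoted in the text.
pooled_average = 864.0
rel_m1 = relative_mean_prediction_error(26.2, pooled_average)  # AG alone
rel_m2 = relative_mean_prediction_error(17.2, pooled_average)  # AG and 4th CEP

# Observation counts: 5 hospitals with 3 years of monthly data, HUS with 9 years.
n_other = 36 * 5
n_hus = 12 * 9
hus_share = n_hus / (n_other + n_hus)
```

Under that assumption, both percentages fall within the 1-3% range reported in the paper.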
The most prescribed antibiotics are not necessarily the most representative
Taking the most prescribed antibiotics (FQ, 1st CEP and 3rd CEP) as the three representative antibiotics, we redrew Figs Table 2.

Discussion
Collecting data on all antibiotics is a tedious and painstaking task in Korea, and this may be the case in other countries as well. We showed that it is possible to use only 1-3 representative antibiotics to track the total antibiotic consumption at the hospital level. Our main finding is that FQ and AG are the most representative, followed by BL-BLI, 4th CEP and 3rd CEP. Our mean prediction error is only 1-3% of the monthly antibiotic consumption amount, which is the average across the hospitals and years in our data. Whether or not these levels of prediction error are tolerable depends on how much we save in terms of time and money by not collecting data on the other antibiotics.
Although our representative antibiotics were selected, not based on their medical effectiveness, but based on how well they collectively represented the total antibiotics usage, the representative antibiotics also happened to be the most commonly prescribed for inpatients, except for 4th CEP. To better monitor antibiotic consumption in hospitals, one of the broad-spectrum antibiotics or antibiotics against multi-drug resistant pathogens (such as carbapenems) could be co-monitored with our representative antibiotics.
Fig. 1 Out-sample prediction for Hanyang University Seoul Hospital, 2013, using six hospital data sets. Abbreviations: 4th CEP, 4th-generation cephalosporins; AG, aminoglycosides; BL-BLI, beta-lactam/beta-lactamase inhibitors; CI, confidence interval
The overall antibiotic usage patterns in our data differ little from other studies in Korea. A single-centre study found that 3rd CEP was the most commonly prescribed antibiotic for hospitalized patients in Korea, followed by FQ, BL-BLI and 1st CEP [8]. Additionally, a population-based study showed that 3rd CEP was the most prescribed antibiotic for inpatients in Korea, followed by AG, 1st CEP and FQ [9]. These studies suggest that we might have found almost the same representative antibiotics had we analysed other Korean hospitals' data that are not in our data set.
The antibiotic usage pattern is different at various levels. For instance, in Italy and the UK, AG are not used as frequently as in Korea [10,11], which illustrates country-level differences; also, there are large differences in the consumption profiles for treatments of the same bacterial infection among European countries [12]. Even among the hospitals in the same country, large differences in antibiotic usage patterns exist; e.g., medium-sized, private and university hospitals use more antibiotics [13]; additionally, antibiotic usage patterns differ between small and large community hospitals in Korea [9]. Possible reasons for these differences are variations in bacterial epidemiology at hospital level, the medical staff's attitude towards prescribing antibiotics, antimicrobial stewardship programme effectiveness, etc. Hence, if possible, it would be ideal for each hospital to conduct a study of its own (as was done in this paper) to find its own representative antibiotics.
Fig. 2 Out-sample prediction for Hanyang University Seoul Hospital, 2013, using Hanyang University Seoul Hospital data only. Abbreviations: 3rd CEP, 3rd-generation cephalosporins; AG, aminoglycosides; BL-BLI, beta-lactam/beta-lactamase inhibitors; FQ, fluoroquinolones; CI, confidence interval
Fig. 3 In-sample prediction (2004, 2008, 2012) for six hospitals, using six hospitals' data. Abbreviations: 4th CEP, 4th-generation cephalosporins; AG, aminoglycosides; BL-BLI, beta-lactam/beta-lactamase inhibitors
The methodology we presented used basic statistics for predicting future time-series variables. It should not be too difficult for hospitals to tailor the methodology to meet their needs, finding a few representative antibiotics by using, e.g., different functions of t and different structural breaks at different times. Once the methodology is set, the hospital would then address the problem of selecting a few representative antibiotics, which is in fact more difficult than it looks; e.g., if three are to be chosen out of 20 antibiotics in total, there are 1140 possible combinations. In this case, despite many differences across countries and hospitals within the same country, our findings should be helpful in choosing antibiotics to consider first (it would be FQ, AG, BL-BLI, 4th CEP and 3rd CEP); of course, the most commonly prescribed antibiotics in the hospital would also make good candidates.
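The size of the search space quoted above follows from the binomial coefficient. The sketch below counts and enumerates the candidate combinations; the class labels are placeholders, not the antibiotic classes of the study.

```python
from itertools import combinations
from math import comb

n_classes = 20  # total number of antibiotics in the example from the text

# Number of ways to choose m representative antibiotics for m = 1, 2, 3.
counts = {m: comb(n_classes, m) for m in (1, 2, 3)}

# Enumerating the combinations explicitly, with placeholder labels.
labels = [f"class_{i}" for i in range(1, n_classes + 1)]
triples = list(combinations(labels, 3))
```

Each enumerated combination would be passed through the estimation and out-sample prediction steps, which is why restricting the candidate list first can save considerable effort.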
We attribute the structural break in Fig. 4 at HUS to a programme of pre-authorization of antibiotic use that started in 2008. The programme put restrictions on prescribing broad-spectrum antibiotics such as carbapenems, glycopeptides, oxazolidinone, polymyxin and tigecycline by requiring an extra approval step from the infectious disease department [14]. Additionally, the programme reinforced educating physicians on the appropriate use of antibiotics and collecting feedback after drug use.
As the HUS time-series data plot illustrates in Fig. 4, a structural break can move the intercept substantially. Ignoring it would result in large biases in the other estimates because the other estimates would be adjusted downward to account for the large drop in the intercept. Detecting large structural breaks from a time-series plot is relatively easy. Of course, if the break magnitudes are small, then they are hard to detect with the naked eye, but then they would not be called "breaks". Structural breaks might have to be incorporated using outside information such as announced law/regulation changes.
Fig. 4 In-sample prediction for Hanyang University Seoul Hospital, using Hanyang University Seoul Hospital data only. Abbreviations: 3rd CEP, 3rd-generation cephalosporins; AG, aminoglycosides; BL-BLI, beta-lactam/beta-lactamase inhibitors; FQ, fluoroquinolones
There are some notable limitations in our study. First, the six university hospitals were selected, not by any sampling principle, but by ease in data collection, in which sense our data may not be representative of the large university hospitals in Korea that would be our study population of interest. For five hospitals, we could gather only three years of data, which resulted in relatively larger standard errors than we would have liked. Second, the prediction performance was gauged using only one hospital's single-year data, and thus, using other hospital data or a longer time span of data may alter/qualify our findings. Third, we adopted a relatively simple ordinary least squares estimator to find the time trend and monthly variations; more statistically sophisticated models and approaches may refine and improve the prediction capability. Finally, we measured antibiotic consumption by DDD instead of days of therapy (DOT). According to a recent guideline for antibiotic stewardship programmes, DOT is preferred to DDD as a measure of antibiotic consumption [15]. However, we could not use DOT because only the total amount of antibiotic consumption per patient was available in five of the six hospitals.
As far as we are aware, our study is the first of its kind to look at the possibility of using only a few antibiotics to track the total antibiotic consumption at the hospital level. Hopefully, more studies will be done to save medical personnel's time and effort surrounding nonessential data collection, so that they can concentrate on more important healthcare activities.
Fig. 6 In-sample prediction (2004, 2008, 2012) for six hospitals with the most commonly used antibiotics (FQ, 3rd CEP, and 1st CEP). Abbreviations: FQ, fluoroquinolones; 3rd CEP, 3rd-generation cephalosporins; 1st CEP, 1st-generation cephalosporins
Fig. 7 In-sample prediction for Hanyang University Seoul Hospital with the most commonly used antibiotics (FQ, 3rd CEP, and 1st CEP). Abbreviations: FQ, fluoroquinolones; 3rd CEP, 3rd-generation cephalosporins; 1st CEP, 1st-generation cephalosporins

Conclusions
This study showed that the total antibiotic consumption at the hospital level can be represented sufficiently well by a few antibiotics. FQ and AG were the most representative in the sense of minimizing the mean prediction error, followed by BL-BLI, 4th CEP and 3rd CEP; the mean prediction error is only 1-3% of the monthly antibiotic consumption amount. Despite this positive finding, because our analysis is based solely on Korean data and because the medical environment/practice of each country and each hospital differs, other hospitals may follow a similar modelling strategy to find their own representative antibiotics instead of readily adopting the aforementioned antibiotics as the most representative.