Development and validation of a risk score to predict mortality during TB treatment in patients with TB-diabetes comorbidity

Background Making an accurate prognosis for mortality during tuberculosis (TB) treatment in TB-diabetes (TB-DM) comorbid patients remains a challenge for health professionals, especially in low TB prevalent populations, due to the lack of a standardized prognostic model. Methods Using de-identified data from TB-DM patients from Texas, who received TB treatment had a treatment outcome of completed treatment or died before completion, reported to the National TB Surveillance System from January 2010–December 2016, we developed and internally validated a mortality scoring system, based on the regression coefficients. Results Of 1227 included TB-DM patients, 112 (9.1%) died during treatment. The score used nine characteristics routinely collected by most TB programs. Patients were divided into three groups based on their score: low-risk (< 12 points), medium-risk (12–21 points) and high-risk (≥22 points). The model had good performance (with an area under the receiver operating characteristic (ROC) curve of 0.83 in development and 0.82 in validation), and good calibration. A practical mobile calculator app was also created (https://oaa.app.link/Isqia5rN6K). Conclusion Using demographic and clinical characteristics which are available from most TB programs at the patient’s initial visits, our simple scoring system had good performance and may be a practical clinical tool for TB health professionals in identifying TB-DM comorbid patients with a high mortality risk. Electronic supplementary material The online version of this article (10.1186/s12879-018-3632-5) contains supplementary material, which is available to authorized users.


Background
The effect of diabetes mellitus (DM) on the development and poor outcome of tuberculosis (TB) disease has been recognized for over a century [1]. While diabetes ranked 7th among the leading causes of death in 2015, TB has been recognized as a leading cause of the mortality due to an infectious disease [2,3]. With the global increase of obesity and type 2 diabetes, the combination of diabetes and tuberculosis (TB-DM) has posed an imminent public health threat and a challenge to TB control programs worldwide [4]. In the United State (US), the prevalence of diabetes has consistently increased from 0.93% in 1958 to 7.40% in 2015 with an estimate of 30.3 million people of all ages (9.4% of the US population) living with diabetes [5,6]. This increasing trend of diabetes morbidity in the US is concerning especially in US states (such as Texas) where both the TB and the DM prevalence are higher than the national average [7,8]. Given that TB-DM comorbid patients may have a mortality of 2-5 times higher than that of non-diabetic TB patients [9,10], more effective management strategies including the development of predictive models for TB mortality are urgently needed.
There are a growing number of prognostic models developed to predict mortality in patients with TB disease. Many characteristics such as older age, HIV co-infection, diabetes, alcohol abuse, malnutrition, hypoxemic respiratory failure, etc. have been identified as the risk factors for poor outcomes in TB patients [11][12][13][14][15]. However, these models were not specifically developed for patients with TB-DM comorbidity and used hospital-based data with variables that are not routinely collected by TB control programs. The populations of these models either did not include diabetic patients [11,12] or only included a small number of TB-DM patients [13][14][15].
The lack of a standardized prognostic system specifically developed for TB-DM patients poses a challenge for health care providers attempting to predict the risk of mortality during TB treatment in this high-risk group of patients. The present study aimed to develop and internally validate a prognostic scoring system using surveillance data with covariates which are routinely collected by most TB control program and available at the patient's initial visits for TB evaluation. This simple scoring system would be a practical tool helping quickly identify TB-DM patients having a high risk of death during TB treatment.

Methods
This retrospective cohort study used the de-identified data of all confirmed TB patients from the state of Texas reported to the Centers for Disease Control (CDC)'s National TB Surveillance System (NTSS) between January 2010 through December 2016 (both genotyped and non-genotyped), who satisfied the following inclusion criteria: (1) met the clinical case definition or was laboratory confirmed based on the CDC definition for a TB case [16]; (2) received TB treatment and had a documented outcome of either "completed" or "died". Patients having treatment outcomes other than "completed" or "died" (such as "adverse", "lost", "moved", "other", "refused", or "unknown") were excluded from the analyses.
Logistic regression modeling was used to determine prognostic factors associated with patient mortality. Variables with a p-value < 0.2 in the univariate analysis or considered as clinically significant were evaluated further in the multiple logistic regression. The variable section for the multiple logistic regression model was conducted according the Bayesian Modeling Averaging (BMA) method [17,18]. As our goal was to develop a model that could be used in the patient's initial visit when the Mtb biological confirmation is still not available, Mtb culture and genotype-related variables were not evaluated in the multivariable modeling. Model discrimination was determined by the area under the Receiver Operating Characteristic (ROC) curve (AUC). The best model was chosen based on the smallest Bayesian information criterion and highest AUC. The model's good calibration was determined by a non-significant Hosmer-Lemeshow's goodness of fit test.
Significant risk factors were assigned weighted-points that were proportional to their β regression coefficient values. A prognostic score was calculated for each individual patient in the cohort. The methodology of categorizing risk groups has been described elsewhere [19,20]. Briefly, the patients were categorized in three distinct groups of mortality risk: low (< 10% mortality), medium (10-20% mortality), and high risk (> 20% mortality). Internal validation was conducted using the bootstrap resampling method with 2000 replications. Model calibration was evaluated by the Hosmer-Lemeshow goodness-of-fit test. A non-significant p-value of the Hosmer-Lemeshow goodness-of-fit test indicates the model has a good calibration (predictive accuracy). The comparison of the AUC between models was conducted using the chi-square test. All analyses were performed with Stata MP14.2 (StataCorp LLC, College Station, TX). A p value < 0.05 was considered statistically significant.

Results
Between January 2010 and December 2016, 1400 TB-DM patients in Texas were reported in the National TB Surveillance System database. After excluding 173 patients who had an outcome other than "completed" or "died", 1227 TB-DM patients, who started the TB treatment and had a treatment outcome of completed treatment or died before completion, were included in the analysis, of whom 1115 completed TB treatment and 112 died (Fig. 1). Except for injecting-drug user (IDU) (p = 0.01), no other difference was found between the patients who were included in the analysis and those who were excluded (Additional file 1: Table S1).
The crude and adjusted associations between characteristics and mortality are presented in Table 1. Nine variables (age ≥ 65 years, being US-born, being homeless, IDU, having chronic kidney disease, TB meningitis, miliary TB, positive acid-fast bacilli (AFB) smear, and positive HIV status) were significant in the multiple logistic regression model and were included in the risk score development. The weighted points of risk factors were calculated using the linear transformation of the corresponding β coefficient (Table 2) [21]. A risk score was calculated for individual patients using the following formula: There were 776 (63.7%) low-risk, 233 (19.2%) medium risk and 208 (17.1%) high-risk patients with the mortality by risk group of 3.1, 12.9 and 27.9%, respectively. The final model had good discrimination in both development (AUC = 0.83 95% CI 0.79, 0.87) and bootstrap validation (AUC = 0.82 95% CI 0.78, 0.87) ( Table 3, Fig. 2). The model also had a good calibration with a non-significant Hosmer-Lemeshow chi-square of 4.54 (p = 0.81) and a small Brier score of 0.07 (Table 3). Patients in the medium-and high-risk groups had more than a four-and twelve-fold increased odds of mortality compared with patients in the low-risk group (Table 4). We also compared the performance of the current TB-DM specific model and that of our previously-published mortality predictive model, which included all confirmed TB patients who started TB treatment [19]. In TB-DM patients who were included in this study, the TB-DM specific model had a significantly higher discrimination power than that of its predecessor [AUC 0.83 (95% CI 0.79, 0.88) versus 0.76 (0.71, 0.82), p < 0.001] (data not shown).
The predicted probability of death during TB treatment can be calculated based on the intercept value (− 4.004594) of the final model and corresponding β coefficients of the variables included in the risk score, the predicted probability of death during TB treatment can be calculated from the following formula:

Online calculator
We have created a free online application for our risk score calculator, which can be downloaded from https:// oaa.app.link/Isqia5rN6K and usable on both android and iOS mobile devices (registration for a free account of OpenAsApp is required to access the calculator). The calculator app provides a risk score (in points), risk group (low, medium or high), and probability of death (%) during treatment for an individual patient.

Discussion
Using 7 years of TB surveillance data from the state of Texas, we have developed and internally validated a simple prognostic score to predict mortality during treatment of TB-DM patients using only nine variables, which are routinely collected by most TB control program at the patients' initial visits for TB evaluation before the availability of the Mtb culture. Having good discrimination and calibration together with the availability of a calculator mobile app, the scoring system would be a practical tool for clinicians and public health professionals to quickly identify the TB-DM patients who have a high mortality risk without waiting for the biological confirmation. Our scoring system classifies patients into three distinctive risk groups, which would be helpful for health care workers in allocating the appropriate medical support and follow-up resources. While patients of the low-risk group can be managed according to the routine protocol, TB-DM patients in the high-risk group would need more aggressive medical support. Although many of our prognostic model's components are unmodified characteristics, there are still multiple approaches that could be carried out to improve patient survival, especially patients in the high-risk group. Having better management of the plasma glucose level is among the important strategies to reduce the patient's mortality as compared with TB patients with controlled DM. TB patients with uncontrolled DM have more than 4 times the odds for death and 2 times the odds of non-conversion of sputum cultures after 2 months of intensive treatment [22]. Education on the negative impacts of DM on TB patients as well as guidelines for changes in diet and physical activity should be provided to patients and their family so that they can be more compliant with the treatment and actively contribute to the glucose control improvement [23]. More aggressive nutritional support would be necessary for high-risk patients who are residents of long-term care facilities as these patients are also prone to other potential risk factors for TB mortality such as old age and undernutritional condition [11]. Given that the combination antiretroviral therapy (cART) could reduce up to 68% TB-related deaths in TB/HIV co-infected patients [24], early initiation of cART could be considered in high-risk patients, who are also HIV positive, although cART is recommended to be started within 8 weeks of starting TB treatment if CD4+ level ≥ 50 cells/mm3 [25]. In our scoring system, a positive HIV status, having chronic kidney failure, TB meningitis, being IDU and age ≥ 65 years are strong predictors for mortality in TB-DM patients. These findings are consistent with current literature for TB patients in general [13,[26][27][28][29].
In a previous analysis using US multiple cause-ofdeath (MCOD) data from 1990 through 2006, Jung et al. observed that foreign-born patients had more than twice the TB-related death rate than that of US-born patients [30]. Meanwhile, our findings suggest that US-born TB-DM patients have more than twice the odds of death compared with foreign-born patients even after controlling for older age, homelessness, IDU, alcohol abuse and  [31]. The possible reasons leading to a higher mortality in US-born than foreign-born TB patients have been discussed elsewhere [19]. Briefly, foreign-born suspected TB patients might promptly receive the diagnosis and aggressive management than the US-born patients as foreign-birth has been recognized as a strong risk factor for TB disease by the Texas TB program [32]. Early detection of TB cases among foreign-born persons which occurs during immigration screening, contact investigation and targeted testing may also play a role in relatively lowering the mortality risk in foreign-born patients compare to US-born patients. Lastly, the significantly higher proportion of US-born TB cases in Texas compared to the national average (41.3% versus 31.4% in 2016, p < 0.001) suggests that some US-born patients may not be timely diagnosed, especially those patients who do not have a recent history of travelling to high TB burden countries [33]. The impact of DM-TB on patient mortality was reported inconsistently among studies in different populations [31,34]. In one of our previous studies using the surveillance data of all confirmed TB patients reported from the state of Texas between 2010 and 2016, the unadjusted association between mortality during treatment and diabetes was not significant (unadjusted odds ratio [OR] 1.04; 95% confidence interval [95% CI] 0.74, 1.47; p = 0.82) [19]. However, this unadjusted OR was obtained from only half of the study sample and might underestimate the impact of diabetes on the mortality. A more recent trend analysis using the entire data of the same population suggested that TB-DM patients had a higher mortality (10.3%) than non-DM patients (7.6%, p = 0.001) with nearly a 3-fold increase in the odds of overall death (adjusted OR 2.75; 95% CI 1.40, 5.39; p = 0.003) and death during TB treatment (adjusted OR 2.43; 95% CI 1.13, 5.23; p = 0.023). Additionally, while non-diabetic TB patients had a significant decrease in the mortality from 2010 to 2016, the mortality trend in TB-DM patients is unchanged overtime and consistently higher than that of non-diabetic patient (z = 3.05, p = 0.002) [35].
Although there is an increasing number of TB mortality risk scores being developed, we are not aware of a scoring system that is specific for the TB-DM population.  In the TB mortality scoring systems presented by Horita (2013) and Pefura-Yone (2017), TB-DM patients were not included [11,12]. In the prognostic models presented by Lui (2014), Nagai (2016), and Bastos (2016), only a small number of TB-DM patients were included in the study samples (ranged from n = 74 to n = 84) [13][14][15]. As these models used hospital-based data, many variables in these models such as respiratory failure requiring oxygen, serum albumin, activity of daily living, dehydration, hypoxemic respiratory failure, orientation disturbance, etc. are not routinely collected by TB programs. Using demographic and clinical characteristics in TB-DM patients, which are available for most TB program, our TB-DM mortality model is more practical and can be used in difference health care and public health settings. Although we have previously developed a mortality risk model including all patients staring TB treatment which has been shown to have a good overall diagnostic performance (AUC 0.82 in development and 0.80 in validation) in all presentations of TB in general [19], the model's AUC decreases to 0.76 in TB-DM patients. When the variable selection was specifically calibrated for TB-DM patients, our new TB-DM specific predictive model provided a more accurate prognosis in the TB-DM population.
Our risk model has several notable limitations. First, the study excluded 233 patients who died at the diagnosis or had a treatment outcome of other than "died" or "completed", which may be prone to potential misclassification bias. However, except for having a higher proportion of IDUs, the excluded patients had no other significant differences in the demographic and clinical characteristics compared with the included patients. Therefore, the misclassification bias, if any, would be minimal. Second, given our goal was using only the data routinely collected by the most TB programs, information regarding the diabetes treatment, lipid profiles and HbA1c were not evaluated in our model. Despite the lack of DM-specific variables, the model can still correctly discriminate the risk of mortality in most of the cases with an AUC of 0.83. Third, as our scoring system was developed in the US, external validation in settings of high TB burden would be necessary. Fourth, the use of surveillance data itself has some limitations. For example, certain self-reported data were obtained from interviewing TB patients which leads to a possibility that recall bias cannot be completely ruled out. Treatment failure or relapse were not well defined in the dataset. Treatment time and time to event data were not available and prevent us from performing more robust survival analyses. As our primary goal was to develop a predictive model for death during TB treatment, mortality prior to starting TB treatment has been excluded. Therefore, our findings may not reflect the overall mortality in TB-DM patients. Lastly, as Texas is one of the US states with a high TB burden, external validation in populations in different US states would also be needed.
Despite the limitations, there are several strengths making our prognostic score distinct: (1) using populationbased surveillance data of the entire state of Texas during 7 years; (2) including exclusively TB-DM patients with a large sample size; (3) using demographic and clinical characteristics which are routinely collected by most TB programs from the initial patient visits, our model can be used to identify the TB-DM with high mortality risk before having the biological confirmation for Mtb; (4) having good discrimination in both development and bootstrap internal validation (AUC = 0.83 and 082, respectively) and good calibration; and (5) providing a simple scoring system with the available of a mobile app for easily calculating the predicted probability of death during TB treatment.

Conclusions
Using demographic and clinical characteristics variables which are routinely collected by most TB programs from the initial patient visits, our simple scoring system can be used without waiting for the Mtb biological result and achieves good discrimination and calibration with the internal validation. With the free calculator app compatible with android and iOS mobile devices, the score would be a practical clinical tool for TB health professionals in identifying TB-DM comorbid patients who have high mortality risk so that appropriate approaches would be implemented to improve the patient outcomes.