Skip to main content

Predicting the risk of active pulmonary tuberculosis in people living with HIV: development and validation of a nomogram



Diagnosis of pulmonary tuberculosis (PTB) among people living with HIV (PLHIV) was challenging. The study aimed to develop and validated a simple, convenient screening model for prioritizing TB among PLHIV.


The study included eligible adult PLHIV participants who attended health care in Yunnan, China, from January 2016 to July 2019. Participants included before June 2018 were in the primary set; others were in the independent validation set. The research applied the least absolute shrinkage and selection operator regression to identify predictors associated with bacteriological confirmed PTB. The TB nomogram was developed by multivariate logistic regression. The C-index, receiver operating characteristic curve (ROC), the Hosmer–Lemeshow goodness of fit test (H–L), and the calibration curves were applied to evaluate and calibrate the nomogram. The developed nomogram was validated in the validation set. The clinical usefulness was assessed by cutoff analysis and decision curve analysis in the primary set.


The study enrolled 766 PLHIV, of which 507 were in the primary set and 259 in the validation set, 21.5% and 14.3% individuals were confirmed PTB in two sets, respectively. The final nomogram included 5 predictors: current CD 4 cell count, the number of WHO screen tool, previous TB history, pulmonary cavity, and smoking status (p < 0.05). The C-statistic was 0.72 (95% CI 0.66–0.77) in primary set and 0.68 (95% CI 0.58–0.75) in validation set, ROC performed better than other models. The nomogram calibration was good (H–L χ2 = 8.14, p = 0.15). The area under the decision curve (0.025) outperformed the existing models. The optimal cutoff for screening TB among PLHIV was the score of 100 (sensitivity = 0.93, specificity = 0.35).


The study developed and validated a discriminative TB nomogram among PLHIV in the moderate prevalence of TB and HIV. The easy-to-use and straightforward nomogram would be beneficial for clinical practice and rapid risk screening in resource-limited settings.

Peer Review reports


The prevalence of human immunodeficiency virus (HIV) and coinfection with tuberculosis (TB) was challenging to control the HIV and TB epidemic. TB was a critical opportunistic infection for people living with HIV (PLHIV). HIV-related TB had an incremental risk and derived disease burden globally [1]. The World Health Organization (WHO) estimated 815,000 TB incidence among HIV-positive people with 208,000 death in 2019, while the number was 14,000 and 2200 in China [2].

Regular screening for active pulmonary tuberculosis (PTB) in PLHIV, rapid PTB diagnosis are the key points for controlling the TB/HIV coinfection, also crucial to stop PTB transmission and reduce mortality among the susceptible population [3,4,5]. However, the diagnosis of PTB among PLHIV was particularly challenging. TB/HIV co-morbidity individuals might have atypical, non-specific clinical features and often a smear-negative disease [6]. WHO developed a symptom-based screening tool for intensifying PTB case detection in PLHIV, and it was recommended as routine work under the national TB program (NTP) to prioritize TB for isoniazid preventive therapy (IPT) in the low-resource setting [7, 8]. The WHO-recommended four symptom screen (W4SS) could identify PLHIV as having a high probability of PTB disease with one of four symptoms: current cough, fever, weight loss, or night sweat. The latest evidence of WHO guidelines showed that W4SS has relatively high sensitivity in adults and adolescents living with HIV, 83%, but low specificity, 38%. The sensitivity of W4SS among outpatients on ART was relatively low, 53%, indicating that W4SS alone would not be sufficient to detect TB among people in regular ART care. W4SS was relatively sensitive in outpatients not on ART (84%), indicating that W4SS is useful in finding people with TB among those starting HIV care, but the lack of specificity has implications for resources and rational use of diagnostic testing [9, 10].

The molecular test was applied for diagnosis of TB among PLHIV to improve the yield and speed of diagnosis, included the GeneXpert MTB/RIF assay (Cepheid, Sunnyvale, USA) (Xpert), and updated the Xpert Ultra (Ultra) [11,12,13,14]. WHO endorsed and recommended using the molecular WHO-recommended rapid diagnostics (mWRDs) for TB screening among inpatient HIV in medical wards where TB prevalence over 10% [15]. The challenge was Xpert equipment and reagent remains unavailable in primary health care units under resource-limited setting. Meanwhile, the symptom-based screening rule generated a large number of PLHIV without active PTB. The sequential screening strategy of W4SS algorithm then Xpert test was impractical. Thus, there was the need to predict the highest risk, prioritize the PTB screen, and sustain the resource.

Previous studies developed the multivariable prediction model (MPM) to calculate the probability of TB disease or improve TB case finding among PLHIV [16,17,18]. Although the researches were conducted in high TB and HIV prevalence locations, models could not validate in different TB/HIV prevalence settings. Furthermore, many predictors of the models were unavailable for resource-limited settings.

This study aimed to develop, validate, and assess a practical screening nomogram that integrated demographics, symptoms, and clinical factors. In addition, to determine whether the nomogram provides more accurate prediction and utility of the detection for bacteriological confirmed PTB among PLHIV in a moderate TB and HIV prevalence setting.


Study setting and participants

Yunnan province located southwest of China, shares a border with Myanmar, Laos, and Vietnam, with a moderate TB and HIV prevalence [19]. We developed and validated our nomogram applied the operational research data, a cross-sectional study that evaluates bacteriological confirmed PTB among PLHIV in Yunnan Provincial Hospital of Infectious Disease.

The study consecutively recruited eligible participants as a primary set between January 2016 and June 2018 in the study site. Then an independent set included eligible participants as a validation set between July 2018 and July 2019. The eligible participants in the primary and validate set were included or excluded using the same criteria, and they followed the identical investigation and laboratory testing.

The participants who met the inclusion criteria were recruited as eligible patients: inpatient or outpatient infected with HIV or diagnosed AIDS patients regardless of ART status and previously TB; age over 15 years; any symptomatic PLHIV, suitable for the TB screening and testing without critical or severe illness; and gave written informed consent to participate in the study. The exclusion criteria were as follows: refused to participate; did not give writing consent in the survey; current under anti-TB treatment; pregnant and maternal woman; abnormal mental condition, severe or emergency condition.

This study was conducted and reported by following the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) guideline developing, validating, or updating a prediction model [20]. TRIPOD checklist was displayed in Additional file 1: Table S1.

Study outcome

We defined the study outcome as bacteriological confirmed PTB and not PTB among PLHIV. Four sputum samples were requested when submitting a sputum test (two instant spot sputums, the third at night, and the fourth in the following morning). The bacteriological confirmed PTB was those PLHIV have any of the positivity of laboratory result: (1) Xpert MTB/RIF or (2) smear-positive or (3) M. tuberculosis culture-positive from any of the sputum samples. The laboratory procedure was stated in Additional file 2: Method S1.


Potential predictors associated with PTB among PLHIV were pre-specified based on epidemiological, clinical experience, and current literature review [21,22,23,24,25,26]. The factors included age, sex, body mass index (BMI, kg/m2), ART status (Pre-ART, On-ART), current CD 4 cell count (cells/ul), current cough (yes, no), fever (yes, no), unintentional weight loss (yes, no), night sweats (yes, no), hemoptysis (yes, no), tired (yes, no), the number of WHO screen tool of TB symptoms (current cough, fever, weight loss, and night sweats), previous TB history (yes, no), abnormal chest radiograph (yes, no), pulmonary cavity (yes, no), smoking status (never smoked, current smoking), alcohol use (never drank, current drinking). The following factors were self-reported by participants: age, TB symptoms, previous TB history, smoking status, and alcohol use; the rest were from the medical examination, test, or investigation.

Sample size calculation

The calculation of study sample size followed the method of developing a clinical prediction model [27, 28]. The parameters of C-index were set as 0.8, the TB/HIV prevalence was estimated as 0.2 [29], number of predictors was 17, the calculation resulted the minimum sample size was 752.

Statistical analysis

All statistical analysis was conducted with R software 3.6.2 ( The level of p < 0.05, two-sided was set as statistical significance.

Demographic and clinical characteristics were grouped based on the study outcome. The categorical variables were summarized as frequency and compared using the chi-square test and Fisher’s exact test as appropriate. The continuous variables were presented as the median and interquartile range (IQR).

Predictor selection

The least absolute shrinkage and selection operator (LASSO) method [30, 31], which suitable for the multiple variable selection and regression in high-dimension or highly correlated data, was applied to select the predictive features in the primary set. The 1-SE criteria defined the best LASSO parameter λ, which the regularized model that the ten-fold cross-validated error was within one standard error of the minimum. The optimal LASSO shrinkage parameter λ was in the 1-SE criteria with least number variables. The LASSO coefficient trace plot against variables determined the nonzero coefficients and variables analyzed in next model build step.

Development of the nomogram

The association between LASSO selected variables and the study outcome was assessed by univariate binary logistic regression. The multivariate prediction model was performed by using LASSO-selected variables. Based on the multivariate logistic regression results in the primary set, we built a quantitative and scoring tool of a nomogram to predict the PLHIV individual probability of PTB diagnosis. The scores of variables in nomogram were defined by ratio method. The score of variable i could be calculated as \({score}_{i}={\beta }_{i}\frac{100}{{\beta }_{max}}\), where the \({\beta }_{i}\) was regression coefficient of variable i, \({\beta }_{max}\) was the max \(\beta\) of all regression coefficients, the score of max \(\beta\) variable was set as 100. Total score of nomogram was defined as \({score}_{total}=\sum_{i=1}^{n}{score}_{i}\).

Model evaluation and calibration

Harrell’s C-index assessed the performance of the nomogram. The C-index represented the discriminative ability; the larger C-index, the more accurate the prediction was. Meanwhile, the sensitivity and specificity of nomogram were performed with receiver operating characteristic curve (ROC). The previous XPHACTOR model (predictors included: BMI, ART status, current CD 4 cell count, the number of WHO screen tool of TB symptoms) [17], and W4SS algorithm was compared the area under the curve (AUC) with the tuberculosis nomogram.

The Hosmer–Lemeshow goodness of fit test (H–L test) was applied for model calibration; a nonsignificant statistic of the H–L test implied the nomogram calibrated perfectly [32,33,34]. To assess the agreement between actual and predicted outcomes, we generated calibration curves to evaluate the nomogram. The in-sample and a bootstrapping validation (1000 bootstrap samples) were applied for depicted calibration curves.

Model validation

The independent validation was conducted in the validation set to test the generalization performance of the nomogram. The data of validation set were reintroduced into the previous established nomogram for internal validation. The performance of the nomogram in the validation set was assessed by C-index, ROC analysis, AUC comparison, H–L test, and calibration curves, as stated before.

Clinical utility

The cutoff analysis was performed to evaluate the applicability of the nomogram. The continuous scores and corresponding cutoffs were depicted with sensitivity and specificity of nomogram, aimed to find the optimal threshold for screening TB among PHILV. Decision curve analysis (DCA) was performed to evaluate the clinical usefulness of the nomogram. The clinical usefulness was defined as a net benefit (summing the true positives diagnosis and subtracting the weighted risk of false positives diagnosis) at sequential threshold probability. Thus, the balanced net benefit was the benefit of PLHIV with TB started empiric treatment (true positives) minus the harm of PHLHV without TB began empiric treatment (false positives) under various thresholds. We compared the net benefit of our nomogram to the previous XPHACTOR model and W4SS algorithm. A net benefit value of 0 meant no benefit of the model; higher values indicate more benefit [35, 36]. The benefit–cost ratio and area under the decision curve (AUDC) were used to evaluate the clinical utility of the risk model.

Sensitivity analysis

To test the stability and applicability of the model in different settings, we thus conducted sensitivity analysis based on the developed nomogram. By introducing and substituting variables in the model, we assessed the sensitive model with C-index, H–L test, and DCA analysis.


Study population and characteristics

In total, the study enrolled 766 PLHIV for investigation and examination, of which 66.2% (507/766) were in the primary set and 33.8% (259/766) in the validation set, respectively. The primary set identified 21.5% (109/507) bacteriological confirmed individuals and 78.5% (398/507) participants with negative test results; the proportion was 14.3% (37/259) and 85.7% (222/259) for the validation set.

PLHIV characteristics and the comparisons were shown in Table 1. The primary set consisted of 373 males and 134 females; the median age was 45 years. The validation set comprised 178 males and 81 females; the median age was 45 years. Ten of the factors in the primary set was significantly associated with bacteriological confirmed PTB, included the ART status, current CD 4 cell count, current cough, night sweats, unintentional weight loss, the number of WHO screen tool, previous TB history, abnormal chest radiograph, pulmonary cavity, and smoking status (p < 0.05).

Table 1 Baseline and clinical characteristics of primary and validation population

Predictor selection

The LASSO logistic regression model returned five nonzero coefficients while choosing the minimum criteria optimal shrinkage factor (Fig. 1). Based on the primary set, the collected 17 features were reduced to 5 predictors for the next step model development: current CD 4 cell count, the number of WHO screen tool, previous TB history, pulmonary cavity, and smoking status.

Fig. 1
figure 1

Feature selection applying the least absolute shrinkage and selection operator with binary logistic regression model. A LASSO shrinkage parameter λ selected by 10- fold cross-validation with minimum criteria. The Mean-Squared Error (red dots) and its 95% error bar (grey interval) was plotted against ln(λ) sequence. Vertical line shown the optimal value of λ selected by the minimum criteria (1-SE criteria: the regularized model that the cross-validated error is within one standard error of the minimum). A shrinkage factor λ of 0.041, with ln(λ) = − 3.19 was chosen for optimal parameter. B LASSO coefficient trace of features. LASSO coefficient trace was plotted against the ln(λ) sequence, vertical line was drawn by the optimal λ value selected in the first step, where result in 5 nonzero coefficients (current CD 4 cell count, the number of WHO screen tool of TB symptoms, pulmonary cavity, previous TB history and smoking status) with the minimum criteria. LASSO least absolute shrinkage and selection operator, MSE mean-squared error, TB tuberculosis

Development of prediction model

Table 2 summarized the development of the risk prediction model. The table showed that the predictors included current CD 4 cell count, the number of WHO screen tool, pulmonary cavity, previous TB history, and smoking status had a significant relationship with confirmed PTB (p < 0.05). All predictors were fitted into the multivariable binary logistic regression and shown as independent predictors. The TB nomogram that integrated all the above significant predictors for the primary set was developed and presented in Fig. 2A. The practical step-by-step TB risk calculator among PLHIV was showed in Fig. 2B. Additional file 3: Table S2 showed the calculation to define the points of variables.

Table 2 Binary logistic regression model for predicting bacteriological confirmed tuberculosis in people living with HIV
Fig. 2
figure 2

Tuberculosis nomogram and risk calculator among people with HIV/AIDS. A Tuberculosis nomogram for people with HIV/AIDS. B Tuberculosis risk calculator for people with HIV/AIDS. To use the nomogram in A, an individual PLHIV category located on each predictor axis, which corresponding to the Points axis overhead, a vertical line could be drawn to determine the score of each variable. The summation score of 5 variables located on the Total Points axis, a vertical line could be drawn downward to the Risk of TB axis to determine the likelihood of the bacteriologically confirmed tuberculosis in PLHIV. The risk calculation was followed the steps in B. First, define and choose the nomogram scores corresponding to the PLHIV individual category, then sum up all five assigned scores, and got the total points of PLHIV individual. Second, find the risk of bacteriologically confirmed tuberculosis in PLHIV in the right-hand side table corresponding the total points. PLHIV people with HIV/AID; TB, tuberculosis

The C-index of the PTB model among PLHIV in the primary set was 0.72 (95% CI, 0.66–0.77). The AUC of ROC for the nomogram (0.72) was bigger than XPHACTOR model (0.61) and W4SS algorithm (0.55, p < 0.05, Fig. 3A). The H–L test showed a nonsignificant statistic (χ2 = 8.14, p = 0.15), suggested that the nomogram performance was no departure from the ideal fit. The calibration plot showed that it along with the diagonal, though over prediction presented in the high-risk section (Fig. 4A).

Fig. 3
figure 3

Receiver operating characteristic area under the curve showing the discriminative performance of 3 different algorithms. A Predictive performance for the primary set. B Predictive performance for the validation set. The red line represents the ROC and AUC of tuberculosis nomogram, The green line represents the ROC and AUC of XPHACTOR study rule (included ART status, CD4 count, BMI, and the number of WHO symptoms), The blue line represents the ROC and AUC of W4SS algorithm. The vertical lines show the comparison between different algorithm AUCs, the red vertical lines with asterisk were AUC difference significant p < 0.05, the black vertical line was insignificant and p > 0.05. ROC receiver operating characteristic curve, AUC area under the curve, W4SS WHO-recommended four symptom screen

Fig. 4
figure 4

Calibration curve of the tuberculosis nomogram. A Calibration curve of the tuberculosis nomogram in the primary set. B Calibration curve of the tuberculosis nomogram in the validation set. The calibration curve represents the agreement between nomogram-predicted risk of confirmed TB and actual diagnosis outcome. The spike histogram at the top shows the distribution of predicted risks. The dash line in diagonal shows a perfectly-calibrated model. Red and black solid lines show the nomogram in-sample calibration outcome and the bias-corrected calibration by bootstrap resampling 1000 times respectively

Validation of the nomogram

The independent validation of nomogram performed in the validation set, the C-index of the PTB nomogram was 0.68 (95% CI, 0.58–0.75). The AUC of ROC for nomogram (0.68) was bigger than W4SS algorithm (0.59, p < 0.05), but it was identical with XPHACTOR model (p > 0.05, Fig. 3A). The calibration curve presented a favorable overall agreement between the observed and predicted probability of PTB (Fig. 4B).

Clinical utility

The optimal cutoff for screening TB among PLHIV was the score of 100 (Fig. 5A, sensitivity = 0.93, specificity = 0.35). More information of threshold analysis was in Additional file 4: Table S3. The decision curve analysis for the PTB nomogram among PLHIV was presented in Fig. 5B. The curve showed that the proposed nomogram provided the highest net benefit within the risk threshold range of 10% to 41% compared with the XPHACTOR rule, diagnosis-all, or diagnosis-none scheme. Cost to benefit ratio was from 1:9 to 2:3 in that threshold range. The area under the decision curve was 0.025, higher than the XPHACTOR rule (0.016), W4SS algorithm (0.005), diagnosis-all (0.014), or diagnosis-none (0.00) strategy, respectively.

Fig. 5
figure 5

Performance of cutoffs and decision curve analysis for tuberculosis diagnostic nomogram among people with HIV/AIDS. A The sensitive and specific performance of the tuberculosis nomogram at different cut-offs. B Decision curve analysis compares the standardized net benefit of different strategies. The clinical utility of four diagnostic strategies is compared by plotting the net benefit (y axis) for the threshold based on the risk of bacteriological confirmed tuberculosis (x axis). Net benefit calculated by summing the benefits (true positives) and subtracting the harms (false positives, weighting by the relative harm of withdraw treatment and an unnecessary treatment). The red line represents the developed nomogram yielded highest net benefit and area under decision curve for potential thresholds (ranging from 10 to 41%). The green line represents the net benefit of XPHACTOR study rule (included ART status, CD4 count, BMI, and the number of WHO symptoms). The blue line represents the net benefit of W4SS algorithm. The black line represents the assumption that all PLHIV diagnosis with confirmed TB. The grey line represents the assumption that no PLHIV diagnosis with confirmed TB. PLHIV people with HIV/AID, TB tuberculosis, W4SS WHO-recommended four symptom screen

Sensitivity analysis

The Additional file 5: Table S4 presented the sensitivity analysis results. The sensitive models with different predictors (by introducing ART status, and substituting CD4 count by ART status) presented no further discriminative information; performed the constant C-index and calibration H–L test results with the favorite nomogram. The AUDC presented no more extra benefit in sensitivity models.


We developed and validated the nomogram to calculate individualized risks, which estimated the active PTB among PLHIV under moderate TB/HIV prevalence. Although its rule was simple and the predictors were accessible, this integrated prediction model accurately discriminated PLHIV who had bacteriological confirmed PTB from those without PTB. The predicted and actual risks were concordances in both primary and validated sets. Furthermore, the nomogram had an additional benefit and outperformed the existing multivariate predict model, suggested the remarkable clinical utility in practice.

The potential impact of this study was the nomogram could be integrated into the disease control policy. The application was the nomogram as a tool to help routine TB control work. The health care provider (HCP) could rapidly calculate risk by hand and improve PTB case findings. In the constrained-resource setting, the bacteriology or molecular test for TB might be unavailable, or the long delay of laboratory test results reported, all of which might lead to the unfavorable consequence of the clinical decision and poor PTB treatment prognosis. The utility of the risk calculator was shortening the diagnosis delay and the time waiting for the TB laboratory tests results. The model predefined the highest probability of active PTB, which was considered the initiation of anti-TB treatment by its high sensitivity.

The significant advantage of the predictive nomogram was laid on its predictors, the PLHIV information contained current CD 4 cell count, the number of WHO screen tool, pulmonary cavity, previous TB history, and smoking status; which enhanced the nomogram accurate and was available and accessible for primary health care unit. The self-reported predictors of symptoms, previous TB history, and smoking status were conveniently collected by HCP. Medical test and examination of CD 4 cell count and Chest X-ray (CXR) was available in most resource-limited situations.

The LASSO method selected the combination of 5 characteristics for the nomogram and met the optimal criterion. Although ART status was an indicator in other clinical models [17, 18], our risk calculator excluded it. In the sensitivity analysis, we added the ART status as an additional predictor. The model with six predictors presented no further discriminative information compared with developed nomogram; The AUDC obtained no more extra benefit for ART status. Another model by substituting CD4 count with ART status performed worse discriminative and AUDC than the developed nomogram. Thus, the HCP could calculate the PTB risk with the above 5 predictors regardless of ART status. For the consideration that in some resource limited countries, CD4 testing was not systematically performed under NTP or the testing might be unavailable, substituting CD4 count by ART status might be an advantage model due to the slight loss of performance in sensitivity model (C-index, 0.70, 95%CI, 0.64–0.76).

The risk calculator integrated with the WHO screening tool could potentially pragmatically impact clinical practice. The latest evidences supported the standpoint that any TB screening tool used alone was insufficient to detect PTB and to excluded no-PTB in PLHIV, because either the low sensitivity or the low specificity of the existing tools [10]. The W4SS algorithm alone has relatively high sensitivity but low specificity (sensitivity = 83%, specificity = 38%) in adults and adolescents living with HIV [15]. A meta-analysis presented the W4SS addition with any abnormal CXR in people on-ART could improve sensitivity to 84.6% but decrease the specificity to 29.8% [37]. Our study supplied a measurement that integrated the WHO tool and other risk predictors, yielded more accurate discrimination and potentially impacts clinical practice. In the ROC and AUC analysis in primary and validation set, the yield of both sensitivity and specificity suggested that the nomogram not only could detect the high PTB risk PLHIV individual but also could be beneficial to rule out the active TB cases.

Another important contribution of this study is that in the cutoff analysis, we defined an optimal threshold score of TB nomogram (cutoff = 100). The interpretation of this would be PLHIV with CD4 count less than 100 cells/mm3 was enough to reach 100 points for the nomogram. This would mean that any PLHIV with any symptom and CD4 less than 100 cells/mm3 should be systematically investigated for TB. The result was consistent with a recent clinical trial [38]. The high number of PLHIV detected with TB by intensive case finding with urine lipoarabinomannan (LAM), Xpert and chest X-ray regardless on symptom among ART naïve PLHIV with CD4 < 100 cells /mm3. Despite the targets proposed by WHO for screening test was 90% for sensitivity and 70% for specificity, the nomogram performed better than other existing tools, which could be applicable for TB screening (sensitivity = 93%, specificity = 35%). There was other use of this nomogram, if the individual with low probability of PTB accompanied by latent TB infection (LTBI) diagnosis, suggested the potential eligibility for the isoniazid preventive therapy.

Previous studies have developed multivariate prediction models for identifying TB among PLHIV. The research by Balcha et al. developed a scoring system of PTB in adult PLHIV in Ethiopia [39]. The model predictors included cough, Karnofsky score, mid-upper arm circumference (MUAC), peripheral lymphadenopathy and hemoglobin. Though the area under curve (AUC) was 0.75, the system was never calibrated and validated. Thus, the generalization ability of the scoring system was unknown. The study by Boyles et al. in South Africa [18] developed an MPM and concluded the lack of response to empiric antibiotics therapy was a strong predictor of TB in PLHIV, C-reactive protein (CRP) adds predictive value only while the second visit and measured after anti-biotics (C-index, 0.75). The sample size was small, and no external validation was done yet. Another research was conducted by Hanifa et al. in South Africa [17]. They developed the XPHACTOR rule, which included ART status, CD4 count, BMI, and the number of WHO symptoms. The rule shown an acceptable C-index (0.75) and good calibration (H–L test p = 0.31). We thus did external validation applied the XPHACTOR algorithm to our primary set data. The MPM performance was shown an ordinary discriminative ability (C-index, 0.61) and adequately calibrated by the H–L test (χ2 = 4.37, p = 0.49). The deficient validation performance might attribute to the differential study populations and epidemiology risks. More evidences and external validations in different population were needed to verify the generalization and applicability of the XPHACTOR rule. Meta-analysis showed the BMI had a consistent log-linear association with TB incidence in the general population [40]. Higher BMI decreased the HIV-related TB incidence [41]. In our study, both in the primary and validated set, BMI was insignificantly associated with PTB incidence among PLHIV. The reason was that the exposure of high BMI contributed to reducing the PTB epidemic in high and middle-high socio-demographic index (SDI) countries recent years [42]. Meanwhile, the population malnutrition status was still constantly the main risk in the low and middle SDI countries. The demographic and geographical disequilibrium of risk distribution led to the different risk models.

The most important argument for applying the nomogram was based on the interpretation of the individual needs for further treatment or health care. Therefore, to sustain the clinical usefulness of the nomogram, we assessed whether the nomogram decision improved clinical outcomes. The novel method of decision curve analysis was applied to explore the consequence based on threshold probabilities. The indicator of high net benefit derived worthwhile clinical practice. The decision curve showed that if the risk threshold was defined between 10 and 41%, the current nomogram added more benefit than the XPHACTOR rule, W4SS algorithm, diagnosis-all, or diagnosis-none scheme.

The remarkable clinical utility of the nomogram lied in the accurate individualized quantification of the TB risk among PLHIV. By treating the TB risk as a continuum instead of the stratified risk into high-, middle-, or low-risk groups, the study presented TB diagnostic information that could assist the clinical decision-making. Meanwhile, the nomogram provided the clinician and patient with indispensable knowledge to promote the well-informed clinical decision in practice. While the risk threshold in the study was undefined, the nomogram performed better and yield more than existed model and algorithm at any risk threshold. Therefore, the model could be set as the standard for conventional clinic and disease control practice for the rapid screening of TB in the PLHIV, especially in implementing the large-scale active case finding strategy. In addition, PLHIV could revalue and examined the TB risk periodically by using the nomogram themselves. This would benefit the primary prevention for TB among HIV.

Our study has the strength that the well-designed cross-sectional study consecutively included the PLHIV; the representative samples allowed us to develop and validate the practical nomogram. In addition, the study was implemented and reported compliance with TRIPOD guidelines. Furthermore, the novel predictor selection method and decision curve analysis improved the power and interpretation of the research.

However, the study has the limitation that the outcome was based on laboratory bacteriological confirmed PTB, though the updated Ultra outperformed Xpert among PLHIV. The systematic review reported the substantially improved sensitivity for Ultra (90%) compared with Xpert (77%) in PLHIV [14]. This suggested the further study was needed to assess the nomogram by applying the new diagnostic tools. The nomogram did not incorporate some new point of care tests that could potentially have value in screening of TB, include urine LAM and CRP, and these diagnostic tests need further evaluation [15, 43, 44]. Bacteriologically confirmed TB represented only a fraction of all TB cases among PLHIV, and this fraction is even lower when PLHIV are severely immunosuppressed, a prospective cohort using a TB case validation by an independent review committee is needed in further study. Besides, the predictor was self-reported by participants, such as smoking habit and previous TB history, meanwhile, it did not consider of the quantity and duration for the smoking and drinking. The model would be imprecise with inaccurate exposure measurement. Additionally, in many sub-Saharan African countries with more constrained resource, CD4 and CXR test might unavailable, risk factors of TB were dissimilar to Asian, so more evidence was needed for the nomogram generalization in different population. By considering the relative low specificity, future sensitivity analysis should focus on that without CXR or identifying the proportion of patients who reach 100 points without CXR in limited resource countries.


The study developed and validated a TB nomogram among PLHIV; the simple rule integrated PLHIV’s information: CD 4 cell count, the number of WHO screen tool, pulmonary cavity, previous TB history, and smoking status. The nomogram could benefit in clinical and public health practice and TB intensified case finding strategy, and it should be prioritized for rapid TB screen in PLHIV in a moderate setting.

Availability of data and materials

All relevant data in this study are freely available in the manuscript as submitted and are included in summary tables and the results section of the manuscript. Dataset is available upon reasonable request to the corresponding author.



Antiretroviral therapy


Area under curve


Area under the decision curve


Body mass index


C-reactive protein


Chest X-ray


Decision curve analysis


Health care provider


Human immunodeficiency virus

H–L test:

Hosmer–Lemeshow goodness of fit test


Isoniazid preventive therapy


Interquartile range


The least absolute shrinkage and selection operator




Latent TB infection


Multivariable prediction model


Molecular WHO-recommended rapid diagnostics


Mid-upper arm circumference


National TB program


Receiver operating characteristic curve


People living with HIV


Pulmonary tuberculosis


Socio-demographic index




Xpert Ultra


World Health Organization


WHO-recommended four symptom screen


GeneXpert MTB/RIF assay


  1. Stanaway JD, Afshin A, Gakidou E, Lim SS, Abate D, Abate KH, et al. Global, regional, and national comparative risk assessment of 84 behavioural, environmental and occupational, and metabolic risks or clusters of risks for 195 countries and territories, 1990–2017: a systematic analysis for the Global Burden of Disease Stu. Lancet. 2018;392:1923–94.

    Article  Google Scholar 

  2. World Health Organization. Global tuberculosis report 2020. Geneva; 2020.

  3. Podlekareva DN, Efsen AMW, Schultze A, Post FA, Skrahina AM, Panteleev A, et al. Tuberculosis-related mortality in people living with HIV in Europe and Latin America: an international cohort study. Lancet HIV. 2016;3:e120–31.

    Article  PubMed  Google Scholar 

  4. Lisboa M, Fronteira I, Colove E, Nhamonga M, Martins MdRO. Time delay and associated mortality from negative smear to positive Xpert MTB/RIF test among TB/HIV patients: a retrospective study. BMC Infect Dis. 2019;19:18.

    Article  PubMed  PubMed Central  Google Scholar 

  5. Suthar AB, Lawn SD, del Amo J, Getahun H, Dye C, Sculier D, et al. Antiretroviral therapy for prevention of tuberculosis in adults with HIV: a systematic review and meta-analysis. PLoS Med. 2012;9: e1001270.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  6. Getahun H, Harrington M, O’Brien R, Nunn P. Diagnosis of smear-negative pulmonary tuberculosis in people with HIV infection or AIDS in resource-constrained settings: informing urgent policy changes. The Lancet. 2007;369:2042–9.

    Article  Google Scholar 

  7. Getahun H, Kittikraisak W, Heilig CM, Corbett EL, Ayles H, Cain KP, et al. Development of a standardized screening rule for tuberculosis in people living with HIV in resource-constrained settings: individual participant data meta-analysis of observational studies. PLoS Med. 2011;8: e1000391.

    Article  PubMed  PubMed Central  Google Scholar 

  8. Corbett E, Zezai A, Cheung YB, Bandason T, Dauya E, Munyati S, et al. Provider-initiated sympton screening for tuberculosis in Zimbabwe: diagnostic value and the effect of HIV status. Bull World Health Organ. 2010;88:13–21.

    Article  PubMed  PubMed Central  Google Scholar 

  9. World Health Organization. WHO operational handbook on tuberculosis Module 2: screening. Geneva: World Health Organization; 2021.

    Google Scholar 

  10. Dhana A, Hamada Y, Kengne AP, Kerkhoff AD, Rangaka MX, Kredo T, et al. Tuberculosis screening among ambulatory people living with HIV: a systematic review and individual participant data meta-analysis. Lancet Infect Dis. 2021.

    Article  PubMed  Google Scholar 

  11. Boehme CC, Nabeta P, Hillemann D, Nicol MP, Shenai S, Krapp F, et al. Rapid molecular detection of tuberculosis and rifampin resistance. N Engl J Med. 2010;363:1005–15.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  12. Steingart KR, Schiller I, Horne DJ, Pai M, Boehme CC, Dendukuri N. Xpert® MTB/RIF assay for pulmonary tuberculosis and rifampicin resistance in adults. Cochrane Database Syst Rev. 2014.

    Article  PubMed  PubMed Central  Google Scholar 

  13. Dorman SE, Schumacher SG, Alland D, Nabeta P, Armstrong DT, King B, et al. Xpert MTB/RIF Ultra for detection of Mycobacterium tuberculosis and rifampicin resistance: a prospective multicentre diagnostic accuracy study. Lancet Infect Dis. 2018;18:76–84.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  14. Horne DJ, Kohli M, Zifodya JS, Schiller I, Dendukuri N, Tollefson D, et al. Xpert MTB/RIF and Xpert MTB/RIF Ultra for pulmonary tuberculosis and rifampicin resistance in adults. Cochrane Database Syst Rev. 2019.

    Article  PubMed  PubMed Central  Google Scholar 

  15. World Health Organization. WHO consolidated guidelines on tuberculosis Module 2: Screening. Geneva: World Health Organization; 2021.

    Google Scholar 

  16. Balcha TT, Skogmar S, Sturegård E, Schön T, Winqvist N, Reepalu A, et al. A clinical scoring algorithm for determination of the risk of tuberculosis in HIV-infected adults: a cohort study performed at Ethiopian health centers. Open Forum Infect Dis. 2014;1:1–9.

    Article  Google Scholar 

  17. Hanifa Y, Fielding KL, Chihota VN, Adonis L, Charalambous S, Foster N, et al. A clinical scoring system to prioritise investigation for tuberculosis among adults attending HIV clinics in South Africa. PLoS ONE. 2017;12: e0181519.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  18. Boyles TH, Nduna M, Pitsi T, Scott L, Fox MP, Maartens G. A clinical prediction score including trial of antibiotics and C-reactive protein to improve the diagnosis of tuberculosis in ambulatory people with HIV. Open Forum Infect Dis. 2020;7:1–9.

    Article  CAS  Google Scholar 

  19. Health Commission of Yunnan Province. Overview of notifiable infectious diseases of Yunnan Province in 2020. Kunming; 2020.

  20. Collins GS, Reitsma JB, Altman DG, Moons KGM. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement. BMJ. 2015;350:g7594–g7594.

    Article  PubMed  Google Scholar 

  21. Mama M, Manilal A, Tesfa H, Mohammed H, Erbo E. Prevalence of pulmonary tuberculosis and associated factors among HIV positive patients attending antiretroviral therapy clinic at Arba Minch General Hospital Southern Ethiopia. Open Microbiol J. 2018;12:163–71.

    Article  PubMed  PubMed Central  Google Scholar 

  22. Liu E, Makubi A, Drain P, Spiegelman D, Sando D, Li N, et al. Tuberculosis incidence rate and risk factors among HIV-infected adults with access to antiretroviral therapy. AIDS. 2015;29:1391–9.

    Article  CAS  PubMed  Google Scholar 

  23. Karo B, Haas W, Kollan C, Gunsenheimer-Bartmeyer B, Hamouda O, Fiebig L. Tuberculosis among people living with HIV/AIDS in the German ClinSurv HIV Cohort: long-term incidence and risk factors. BMC Infect Dis. 2014;14:148.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  24. Chang CA, Meloni ST, Eisen G, Chaplin B, Akande P, Okonkwo P, et al. Tuberculosis incidence and risk factors among human immunodeficiency virus (HIV)-infected adults receiving antiretroviral therapy in a large HIV program in Nigeria. Open Forum Infect Dis. 2015;2:2633851.

    Article  Google Scholar 

  25. Kim YJ, Woo JH, Kim MJ, Park DW, Song JY, Kim SW, et al. Opportunistic diseases among HIV-infected patients: a multicenter-nationwide Korean HIV/AIDS cohort study, 2006 to 2013. Korean J Intern Med. 2016;31:953–60.

    Article  PubMed  PubMed Central  Google Scholar 

  26. Méda ZC, Sombié I, Sanon OWC, Maré D, Morisky DE, Chen Y-MA. Risk factors of tuberculosis infection among HIV/AIDS patients in Burkina Faso. AIDS Res Human Retrovir. 2013;29:1045–55.

    Article  Google Scholar 

  27. Riley RD, Van Calster B, Collins GS. A note on estimating the Cox-Snell R2 from a reported C statistic (AUROC) to inform sample size calculations for developing a prediction model with a binary outcome. Stat Med. 2021;40:859–64.

    Article  PubMed  Google Scholar 

  28. Riley RD, Ensor J, Snell KIE, Harrell FE, Martin GP, Reitsma JB, et al. Calculating the sample size required for developing a clinical prediction model. BMJ. 2020;368:m441.

    Article  PubMed  Google Scholar 

  29. Gao L, Zhou F, Li X, Jin Q. HIV/TB co-infection in mainland china: a meta-analysis. PLoS ONE. 2010;5: e10736.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  30. Tibshirani R. Regression shrinkage and selection via the lasso: a retrospective. J Roy Stat Soc B. 2011;73:273–82.

    Article  Google Scholar 

  31. Sauerbrei W, Royston P, Binder H. Selection of important variables and determination of functional form for continuous predictors in multivariable model building. Stat Med. 2007;26:5512–28.

    Article  PubMed  Google Scholar 

  32. Van Calster B, McLernon DJ, van Smeden M, Wynants L, Steyerberg EW. Calibration: the Achilles heel of predictive analytics. BMC Med. 2019;17:230.

    Article  PubMed  PubMed Central  Google Scholar 

  33. Alba AC, Agoritsas T, Walsh M, Hanna S, Iorio A, Devereaux PJ, et al. Discrimination and calibration of clinical prediction models. JAMA. 2017;318:1377.

    Article  PubMed  Google Scholar 

  34. Stevens RJ, Poppe KK. Validation of clinical prediction models: what does the “calibration slope” really measure? J Clin Epidemiol. 2020;118:93–9.

    Article  PubMed  Google Scholar 

  35. Vickers AJ, van Calster B, Steyerberg EW. A simple, step-by-step guide to interpreting decision curve analysis. Diagn Progn Res. 2019;3:18.

    Article  PubMed  PubMed Central  Google Scholar 

  36. Van Calster B, Wynants L, Verbeek JFM, Verbakel JY, Christodoulou E, Vickers AJ, et al. Reporting and interpreting decision curve analysis: a guide for investigators. Eur Urol. 2018;74:796–804.

    Article  PubMed  PubMed Central  Google Scholar 

  37. Hamada Y, Lujan J, Schenkel K, Ford N, Getahun H. Sensitivity and specificity of WHO’s recommended four-symptom screening rule for tuberculosis in people living with HIV: a systematic review and meta-analysis. The Lancet HIV. 2018;5:e515–23.

    Article  PubMed  Google Scholar 

  38. Blanc F-X, Badje AD, Bonnet M, Gabillard D, Messou E, Muzoora C, et al. Systematic or test-guided treatment for tuberculosis in HIV-infected adults. N Engl J Med. 2020;382:2397–410.

    Article  CAS  PubMed  Google Scholar 

  39. Balcha TT, Skogmar S, Sturegård E, Schön T, Winqvist N, Reepalu A, et al. A clinical scoring algorithm for determination of the risk of tuberculosis in HIV-infected adults: a cohort study performed at Ethiopian health centers. Open Forum Infect Dis. 2014;1:ofu095.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  40. Lonnroth K, Williams BG, Cegielski P, Dye C. A consistent log-linear relationship between tuberculosis incidence and body mass index. Int J Epidemiol. 2010;39:149–55.

    Article  PubMed  Google Scholar 

  41. Hanrahan CF, Golub JE, Mohapi L, Tshabangu N, Modisenyane T, Chaisson RE, et al. Body mass index and risk of tuberculosis and death. AIDS. 2010;24:1501–8.

    Article  PubMed  Google Scholar 

  42. Murray CJL, Aravkin AY, Zheng P, Abbafati C, Abbas KM, Abbasi-Kangevari M, et al. Global burden of 87 risk factors in 204 countries and territories, 1990–2019: a systematic analysis for the Global Burden of Disease Study 2019. The Lancet. 2020;396:1223–49.

    Article  Google Scholar 

  43. Shapiro AE, Hong T, Govere S, Thulare H, Moosa M-Y, Dorasamy A, et al. C-reactive protein as a screening test for HIV-associated pulmonary tuberculosis prior to antiretroviral therapy in South Africa. AIDS. 2018;32:1811–20.

    Article  CAS  PubMed  Google Scholar 

  44. Yoon C, Semitala FC, Atuhumuza E, Katende J, Mwebe S, Asege L, et al. Point-of-care C-reactive protein-based tuberculosis screening for people living with HIV: a diagnostic accuracy study. Lancet Infect Dis. 2017;17:1285–92.

    Article  PubMed  PubMed Central  Google Scholar 

Download references


We thank the tireless contributions of staffs in Yunnan Provincial Hospital of Infectious Disease, Family Health International office and other related health care workers in undertaking the study. We thank Chen-Yuan Chiang and Kai Wang for their support to this study.


The study was funded by the Yunnan health training project of high level talents (Grant No. H-2019027) and scientific research institutions set up in the Yunnan Provincial Department of Health (Grant No. 2016NS143). The funders had no role in study design, data collection and analysis, writing of the manuscript and decision to publish. Funding for the costs of publication in an open-access, peer-reviewed journal was supported by Yunnan Center for Disease Control and Prevention.

Author information

Authors and Affiliations



JC, TC, XY, HR, LL, XL, XY and QX participated in the field work and acquisition of data. TC, XY, HR and QX did the laboratory work. LX, LL and XL leaded the project and participated in study design. JC analysed and interpreted the data for the work. JC drafted the work. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Lin Xu.

Ethics declarations

Ethical approval and consent to participate

This study was reviewed and approved by the Yunnan AIDS Care Center Ethics Review board (No. 201601). All participants gave written informed consent to participate in this study. The consent form was approved as part of the ethics approval process. All methods were performed in accordance with the relevant guidelines and regulations required by the Yunnan AIDS Care Center Ethics Review board. The study was not included personal identification and all data in analyses were anonymized.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1. Table S1.

The TRIPOD checklist.

Additional file 2. Method S1.

The laboratory procedure of TB conformation in PLHIV.

Additional file 3. Table S2.

The calculation of the variables’ scores.

Additional file 4. Table S3.

The cutoff and threshold analysis of the tuberculosis nomogram.

Additional file 5. Table S4.

Sensitivity analysis results of the tuberculosis nomogram.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Chen, J., Li, L., Chen, T. et al. Predicting the risk of active pulmonary tuberculosis in people living with HIV: development and validation of a nomogram. BMC Infect Dis 22, 388 (2022).

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: