Diagnostic accuracy of the rapid urine lipoarabinomannan test for pulmonary tuberculosis among HIV-infected adults in Ghana–findings from the DETECT HIV-TB study

Background Rapid diagnostic tests are urgently needed to mitigate HIV-associated tuberculosis (TB) mortality. We evaluated the diagnostic accuracy of the rapid urine lipoarabinomannan (LAM) test for pulmonary TB and assessed the effect of a two-sample strategy. Methods HIV-infected adults eligible for antiretroviral therapy were prospectively enrolled from Korle-Bu Teaching Hospital in Ghana and followed for a minimum of 6 months. We applied the LAM test to urine collected as a spot sample and as an early morning sample. Diagnostic accuracy was analysed against a microbiological TB reference standard based on sputum culture and GeneXpert MTB/RIF results and against a composite reference standard including clinical follow-up data. Performance of sputum smear microscopy was included for comparison. Results Of 469 patients investigated for TB, the LAM test correctly identified 24/55 (44 %) of microbiologically confirmed TB cases. Sensitivity of the LAM test was positively associated with hospitalisation (67 %), Modified Early Warning Score > 4 (57 %) and subsequent death (71 %). LAM test specificity was 95 %, increasing to 98 % for the composite reference standard. A two-sample LAM test strategy did not improve test performance. Using concentrated sputum for Ziehl-Neelsen and fluorescence microscopy in combination yielded a sensitivity of 31/55 (56 %) that increased to 35/55 (64 %) when the LAM test was added. Surprisingly, nontuberculous mycobacteria were cultured in 34/469 (7 %) and associated with a positive LAM test (p = 0.008). Conclusions LAM test sensitivity was highest in patients with poor prognosis and subsequent death and did not increase with a two-sample strategy. A rigorous sputum microscopy strategy had superior sensitivity, but the simplicity of the LAM test holds operational possibilities as a TB screening method among severely sick patients.
Electronic supplementary material The online version of this article (doi:10.1186/s12879-015-1151-1) contains supplementary material, which is available to authorized users.


Background
Tuberculosis (TB) is the leading cause of death among HIV-infected individuals initiating antiretroviral therapy (ART) in sub-Saharan Africa [1]. Undiagnosed TB remains highly prevalent in this population, as identified in studies of intensified TB screening pre-ART [2][3][4]; the highest TB diagnostic yield is achieved by microbiological screening of all HIV-infected individuals regardless of presenting symptoms [5]. While culture-based diagnostics remain the gold standard for TB diagnosis, they are time-consuming and unavailable in many TB-endemic settings. The fully automated GeneXpert® MTB/RIF ("Xpert") assay can provide a diagnosis within 2 hours, but wide implementation of this assay is impeded by high costs and requirements for electricity and maintenance [6,7]. Often, TB diagnosis relies on symptom-based screening, chest x-ray and sputum smear microscopy for acid-fast bacilli (AFB), although these tools have demonstrated limited performance [3][8][9][10]. Recently, the DETERMINE TB-LAM Ag test ("LAM test"; Alere, Waltham, MA, USA) was commercialised for point-of-care TB diagnosis. It is a lateral flow immunochromatographic strip test developed to detect the mycobacterial lipoarabinomannan (LAM) antigen released from metabolically active or degrading mycobacteria [11]. It can be applied directly to urine and provides a result within 30 minutes. Previous LAM performance studies have reported great variation in test sensitivity and specificity [12][13][14][15][16][17][18][19]. A consensus paper on the LAM test highlights issues with its performance and appropriate target group and calls for further evaluations of the LAM test in diverse settings [20].
We undertook a prospective study in Ghana to evaluate the diagnostic accuracy of the LAM test for pulmonary TB among HIV-infected individuals eligible for ART. We assessed a two-sample strategy, similar to what is conventionally applied for sputum microscopy, and explored the trade-off between the increase in sensitivity and the reduction in specificity when using the LAM test as an add-on test to sputum smear microscopy.

Methods
Design
This is a cross-sectional diagnostic accuracy study with longitudinal follow-up of a minimum of 6 months.

Study setting and population
As part of the DETECT HIV-TB study, participants were recruited prospectively between January 2013 and March 2014 from the out- and inpatient departments of the Fevers Unit, Korle-Bu Teaching Hospital, Ghana's largest public hospital, situated in the capital city Accra. The Fevers Unit provides ART services including adherence counselling, medical care and laboratory services for HIV-infected individuals. The overall HIV epidemic in Ghana is moderate with a prevalence of 1.4 % in the general population [21]. In 2013, the TB Control Programme in Ghana conducted a national TB prevalence survey that found an overall adult TB prevalence of 356/100,000 (personal communication with the programme manager for the National TB Control Programme, Ghana, unpublished data). This is much higher than the WHO estimated prevalence of 71/100,000 and calls into question the estimated case detection rate of 88 %, which is likely to be an overestimate [22].
HIV-infected adults were consecutively enrolled into the study if ≥18 years and eligible for lifelong ART, i.e. severe and advanced HIV clinical disease (WHO clinical stage 3 or 4 disease), a blood CD4 cell count ≤350 cells/μl or pregnancy [23,24]. Patients who had received antituberculous treatment for more than 2 days within the 3 months before enrolment or who were unable to produce sputum or urine samples were excluded. Presence of TB-related symptoms was not used as an inclusion or exclusion criterion.
We used a standardized questionnaire to record demographic and clinical details including vital measures, height and weight, TB specific signs and symptoms including the WHO symptom screen with presence of more than one of the following symptoms: cough, fever, weight loss or night sweats [25] and previous history of TB. We calculated the Modified Early Warning Score (MEWS) for each participant as an indicator of illness severity. MEWS is a scoring system based on the level of deranged physiological parameters including systolic blood pressure, pulse rate, respiratory rate, body temperature and level of consciousness. A cut-off score > 4 has been considered predictive of poor outcomes [26]. Blood CD4 cell count was obtained for participants at enrolment. Full blood count and x-ray interpretation were obtained if available through routine laboratory investigations.
At enrolment, participants were requested to provide a spot urine specimen for LAM testing and one respiratory specimen for sputum smear microscopy, mycobacterial culture and Xpert assay. Participants were further asked to deliver an early morning urine and sputum sample within 7 days after enrolment. Medical records were reviewed from time of enrolment to a minimum of 6 months post-enrolment. If medical records were unavailable, an interview with the participant or relatives of the participant was conducted by phone. The following were recorded: vital status, lost to follow-up, transfer out, start on antituberculous treatment.

Ethics
Informed consent was obtained in writing from each participant prior to enrolment in the study. The study protocol was approved by the Ethical and Protocol Review Committee, University of Ghana Medical School (MS-Et./ M.4-P 3.3/2012-13) and evaluated by the Developing Country Committee of the Danish National Committee on Health Research Ethics (No. 1302133/1206169). Urine LAM test results were not used for treatment decision-making, but sputum microscopy, culture and Xpert results were communicated to the responsible clinicians.
We followed the Standards for the Reporting of Diagnostic accuracy studies (STARD) criteria [27]. Additional file 1 displays the STARD checklist (For more on STARD see http://www.stard-statement.org/).

Urine sample analysis
Urine samples were transported to the Department of Medical Microbiology within the Korle-Bu Teaching Hospital premises, where study staff applied the LAM test to fresh urine in accordance with the manufacturer's instructions. A volume of 60 μL of unprocessed urine was applied to the sample pad at the bottom of the test strip. The test result was read 25 to 35 minutes later and graded by comparing the test strip with a reference card. We used the original 2012 reference scale card that consisted of 5 colour intensity grades. The test band was graded as zero if no visual band appeared and graded 1 through 5 for a visualized band of equal intensity to those on the reference card. If a faint band was observed with intensity lower than the grade 1 cut-point it was recorded as "faint". Additional file 2 provides a figure of the reference card used and its interpretation. Each test was graded by two individual readers blinded to their counterpart's observations and to the patient identity through the use of anonymous study IDs. Reference standard results were not known at the time of LAM testing.

Sputum sample analysis
Sputum samples were analysed at the Reference TB Laboratory at Noguchi Memorial Institute for Medical Research or the TB laboratory at Korle-Bu Teaching Hospital. Samples were processed according to standardized protocols for mycobacterial microscopy and culture by trained laboratory technicians [28,29]. Sputum samples were decontaminated with N-acetyl-L-cysteine and sodium hydroxide. Smears of centrifuged sputum sediment were examined microscopically and graded for AFB using both the Ziehl-Neelsen (ZN) staining method and fluorescence microscopy of auramine O stained smears. After re-suspension in phosphate buffer the sputum sediment was cultured for mycobacteria using both solid Lowenstein-Jensen medium and the BACTEC mycobacteria growth indicator tube (MGIT) 960 system (BD Diagnostics, Sparks, MD, USA) and incubated for up to 8 and 6 weeks respectively. Positive cultures were re-assessed for AFB using ZN smear microscopy, and an anti-MPB64 antibody assay was used on AFB positive cultures to confirm the presence of M. tuberculosis. Once it became available at the Chest Clinic TB laboratory, the GeneXpert MTB/RIF assay (Cepheid, CA, USA) was also used to confirm TB; it was performed on either fresh sputum or sputum sediment according to the manufacturer's specifications.

Diagnostic classification for analysis
We defined a positive LAM test result as a test band with intensity equal to or greater than the grade 2 cut-point [14]. The first reading of the first sample was considered the study result and used for all data analysis except for analysis of inter-reader agreement and accuracy of a two-sample strategy.
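The positivity rule above can be written as a small function. This is only a sketch: the function name and the encoding of grades (0 for no band, the string "faint", integers 1 to 5) are assumptions made for illustration, not part of the study protocol.

```python
def lam_positive(grade):
    """Return True if a LAM band grade is positive at the grade 2 cut-point.

    Assumed encoding (hypothetical): 0 = no band, "faint" = band weaker
    than grade 1, integers 1-5 = grades on the 2012 reference card.
    """
    if grade == "faint":
        return False  # weaker than grade 1, so below the positivity threshold
    if grade not in (0, 1, 2, 3, 4, 5):
        raise ValueError(f"unknown LAM grade: {grade!r}")
    return grade >= 2


# Faint and grade 1 bands are visible on the strip but count as negative.
for g in (0, "faint", 1, 2, 5):
    print(g, lam_positive(g))
```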
In the absence of a single suitable reference standard for TB diagnosis in a population of severely immunocompromised HIV-infected individuals, we used the following composite TB case definition to categorize participants. "Confirmed TB": M. tuberculosis culture positive or Xpert positive in any of the sputum samples. "Possible TB": no positive culture or Xpert results for TB, but one of the following: sputum smear microscopy positive, i.e. smears graded as scanty, 1+, 2+ or 3+; a clinical-radiological picture highly suggestive of TB and started on antituberculous treatment within two months; a clinical diagnosis of active TB by a non-study clinician and started on treatment within two months; death within two months of enrolment reported to be due to TB per medical record. "Non-TB": not meeting the criteria for "Confirmed TB" or "Possible TB". Participants with growth of nontuberculous mycobacteria (NTM) and no positive cultures or Xpert results for M. tuberculosis were assigned to this group.
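The case definition can be sketched as a decision function. The field names below are hypothetical. One ordering choice is an interpretation rather than something the text states outright: participants with NTM growth and no M. tuberculosis confirmation are assigned to "Non-TB" before the "Possible TB" criteria are checked.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    # All field names are illustrative assumptions, not study variables.
    mtb_culture_positive: bool = False
    xpert_positive: bool = False
    smear_positive: bool = False              # scanty, 1+, 2+ or 3+
    clin_radiological_tb_treated: bool = False  # treated within two months
    clinical_tb_diagnosis_treated: bool = False  # non-study clinician, treated within two months
    tb_death_within_2_months: bool = False
    ntm_culture_positive: bool = False


def classify(p: Participant) -> str:
    """Assign a participant to one of the three composite categories."""
    if p.mtb_culture_positive or p.xpert_positive:
        return "Confirmed TB"
    if p.ntm_culture_positive:
        # Interpretation: NTM growth without M. tuberculosis confirmation
        # is classed as Non-TB even if the smear is AFB positive.
        return "Non-TB"
    if (p.smear_positive or p.clin_radiological_tb_treated
            or p.clinical_tb_diagnosis_treated or p.tb_death_within_2_months):
        return "Possible TB"
    return "Non-TB"


print(classify(Participant(smear_positive=True)))  # Possible TB
```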

Statistical analysis
Descriptive analysis was used to characterize the study population and reported with interquartile range (IQR) and standard deviation (SD) as appropriate. Kappa statistics, reported with the standard error (SE), were used to determine inter-reader agreement for LAM test results and agreement between test results. Accuracy measures (sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) and likelihood ratio (LR)) were calculated with 95 % confidence intervals (CI). In our primary analysis, we used a microbiological reference standard comparing "Confirmed TB" vs. participants with no positive culture or Xpert results. In the secondary analysis we used a composite reference standard for TB, combining "Confirmed TB" and "Possible TB" to calculate sensitivity and using "Non-TB" cases to calculate specificity. Figure 1 outlines the analysis groups. For subgroup analysis we stratified participants by: enrolment site (hospitalised patients vs. outpatients); CD4 cell count (CD4 < 100 cells/mm³ vs. CD4 ≥ 100 cells/mm³); MEWS (MEWS > 4 vs. MEWS ≤ 4); and vital status at 2 months (dead vs. alive). Sensitivity and specificity were compared across strata using the chi-square test or Fisher's exact test as appropriate. We determined diagnostic accuracy for the LAM test in combination with sputum smear microscopy and for the two-sample LAM test strategy. When assessing performance of a combination of tests, the result was considered positive if any of the tests was positive and negative if both tests were negative. McNemar's test was used to compare the sensitivities and specificities of two different tests. The cumulative probabilities of death were estimated by means of the Kaplan-Meier method and compared according to LAM test results with the log-rank test. Statistical significance was defined as a two-sided p-value less than 0.05 and all analyses were conducted using STATA™ version 13.1 software.
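The accuracy measures listed above follow directly from a 2x2 table. The sketch below is illustrative only: the analyses were run in STATA, and the text does not say which confidence interval method was used, so the Wilson score interval here is an assumption. The function names are hypothetical.

```python
import math


def wilson_ci(successes, n, z=1.96):
    """Wilson score interval for a proportion (assumed method, see lead-in)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


def accuracy(tp, fp, fn, tn):
    """Sensitivity, specificity, predictive values and likelihood ratios."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        # LR+ is undefined (infinite) when there are no false positives.
        "lr_pos": sens / (1 - spec) if spec < 1 else math.inf,
        "lr_neg": (1 - sens) / spec,
    }


# Sensitivity against the microbiological reference standard: 24 of 55
# confirmed TB cases were LAM positive.
low, high = wilson_ci(24, 55)
print(f"sensitivity {24 / 55:.0%}, 95% CI {low:.0%} to {high:.0%}")
```

The interval is wide because only 55 confirmed cases contribute to the sensitivity estimate, which is worth keeping in mind when reading the subgroup sensitivities.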

Results
Participants
In the study period, 571 HIV-infected adults were screened and 469 were eligible according to our inclusion criteria (Fig. 1). In total, 399 (85 %) were enrolled from the outpatient clinic and 70 (15 %) were hospitalised patients (Table 1). Participants produced a mean of 1.8 (SD 0.36) urine samples with two urine samples obtained from 396 (84 %) participants. Two sputum samples were collected from 371 (79 %) participants with a mean of 1.8 (SD 0.41) per participant.

Urine LAM test performance
All LAM tests provided a valid result with a visible control bar. The distribution of LAM test results across band intensity grades was: no band, 194 (41 %); faint band, 168 (36 %); grade 1, 62 (13 %); grade 2, 10 (2 %); grade 3, 10 (2 %); grade 4, 16 (4 %) and grade 5, 9 (2 %). Inter-rater agreement between the readers as to presence versus absence of a LAM test band with intensity at or above the grade 2 cut-point was 99.1 % (kappa 0.94; SE 0.05). Agreement as to [...] (Table 3), LAM test sensitivity increased significantly among hospitalised patients (p = 0.035), participants with MEWS > 4 (p = 0.008) and participants who died within 2 months of follow-up (p = 0.013). The increase in sensitivity among participants with low CD4 did not reach statistical significance. The inverse pattern was seen for test specificity, which was significantly lower in strata with a higher degree of disease severity or with death as the outcome.

Analysis 2-Composite reference standard for TB
When using the composite reference standard for TB, the overall LAM test specificity increased to 98 %.

Discussion
We evaluated the urine LAM strip test for TB diagnosis among ART-eligible adults in Ghana, where TB is reported as the most common AIDS-defining event and cause of death [30]. We found that the rapid urine LAM test could identify 44 % of confirmed TB cases with a specificity of 95 % and that a two-sample strategy for the LAM test did not improve sensitivity. The LAM test was easily performed and had high inter-reader reliability. Sensitivity increased to 67 % among hospitalised patients and was associated with death and a high clinical illness score (MEWS > 4); further, we found a tendency towards increased sensitivity among patients with lower CD4 cell counts. Our findings affirm that sensitivity of the LAM test is highest for the sickest patients and those at greatest risk of dying. This is consistent with findings by Lawn et al., who found increased LAM test sensitivity among patients with CD4 cell count < 100 cells/mm³, CRP > 200 mg/L, severe anaemia (<8.0 g/dL), advanced symptoms and subsequent death [31,32]. To our knowledge, four other studies have evaluated the urine LAM test among HIV-infected individuals irrespective of the presence of TB symptoms [12][13][14][15] and four other studies among HIV-infected TB suspects [16][17][18][19]. The lowest overall LAM test sensitivity, 25 %, was reported by Balcha et al. in a study that included HIV-infected individuals with CD4 < 350 regardless of TB symptoms [12]. The highest sensitivity of 50 % was reported for the grade 2 cut-point in a study among hospitalised TB suspects [18]. LAM test sensitivity was consistently highest among hospitalised populations and strata with lower CD4 cell count, suggesting that critically ill patients are the appropriate target group for LAM testing. No study has previously reported a correlation with MEWS, although this could be a simple and objective alternative to identify the target group for LAM testing.
MEWS was originally developed to detect critically ill patients at risk of catastrophic deterioration in high-income countries [26], but has since gained wider application in medical wards and intensive care units to predict hospital admission and mortality, also in resource-limited settings [33,34]. A great advantage of MEWS is that it is based on simple measures that are routinely collected as part of most clinical consultations. Our finding of a significantly higher risk of death for LAM-positive compared with LAM-negative TB patients further emphasises that the LAM test identifies TB patients with the greatest clinical need. Specificity of the LAM test is paramount for the utility of LAM as a screening test. Among subgroups with signs of critical illness or subsequent death, we found specificity considerably lower than the target of 95 % for new point-of-care tests for TB diagnosis [35]. Using the composite reference standard, however, increased overall specificity to 98 % and to 94 % in all subgroups with poor outcome. This probably reflects the well-known limitations of sputum culture and Xpert to detect TB among severely sick HIV-infected individuals. We sought to optimize our microbiological reference standard by performing two sputum cultures on both solid and liquid media, and by adding Xpert results. Despite these efforts, the microbiological reference standard leads to underestimation of specificity. This is similar to findings by Peter et al. where LAM test specificity increased from 78 % to 94 % when the composite reference standard was used instead of a microbiological reference standard [18]. Another key factor for test specificity is the positivity threshold for the LAM test as shown in previous studies [13,16,18]. There is consensus on using the grade 2 cut-point as the positivity threshold, and in 2014 the LAM test manufacturers changed the reference scale card, omitting the band corresponding to grade 1.
However, we report a large number of tests with band intensity fainter than the original grade 2 cut-point, as did another prospective LAM study [16]. It is important to acknowledge that in a clinical setting such visible bands, although fainter than the positivity threshold, could prompt clinicians to interpret the LAM test as positive and lead to overtreatment.
LAM is also a component of the NTM cell wall [36], but the possibility of NTM causing false positive LAM test results has been sparsely addressed. In a cohort of cystic fibrosis patients we previously reported that 2/23 (8.7 %) of NTM-infected patients were LAM-positive at a grade 2 cut-point, increasing to 9/23 (39.1 %) for the grade 1 cut-point [37]. In the context of HIV-infected individuals, 10 NTM culture positive cases have previously been described to be LAM positive [15,[38][39][40]. We found a positive LAM test for 5/34 (15 %) of NTM culture positives. The prevalence of NTM culture positivity was high in our study and associated with a positive LAM test. While this raises concern that NTM among HIV-infected individuals affect LAM test specificity, its importance needs to be evaluated in studies applying appropriate case definitions for NTM disease.
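The size of this sub-threshold group can be worked out from the band distribution reported in the Results. The short calculation below uses only those published counts; the variable names are illustrative.

```python
# Band intensity counts as reported in the Results (first reading, first sample).
counts = {"none": 194, "faint": 168, 1: 62, 2: 10, 3: 10, 4: 16, 5: 9}

total = sum(counts.values())

# Positive at the grade 2 cut-point: grades 2 to 5.
positive = sum(n for g, n in counts.items() if isinstance(g, int) and g >= 2)

# Visible band below the positivity threshold: faint or grade 1.
sub_threshold = counts["faint"] + counts[1]

print(f"positive at grade 2 cut-point: {positive}/{total} ({positive / total:.1%})")
print(f"visible but sub-threshold:     {sub_threshold}/{total} ({sub_threshold / total:.1%})")
```

On these counts, 45 of 469 tests (9.6 %) were positive at the grade 2 cut-point, while 230 of 469 (49.0 %) showed a visible band below it, roughly five sub-threshold bands for every positive result.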
A two-sample test strategy including an early morning sample has not been evaluated to date for the LAM test. We found that an additional sample did not increase overall performance of the LAM test and agreement between results for a spot and morning sample was high. A recent study explored a two-test strategy performed on the same sample and did not show any added diagnostic value of the second test [13].
Previous studies found that LAM test sensitivity was superior to that of sputum smear microscopy and found incremental diagnostic value in combining the two tests [12,16,19]. We found higher sputum smear microscopy sensitivity, which contrasts with other studies comparing microscopy with the LAM test. This could be due to our use of sputum concentration and fluorescence microscopy, both known to increase diagnostic sensitivity [41,42], and to the collection of two samples for the majority of participants. A similarly rigorous sputum sampling and microscopy methodology is not always achievable in a routine clinical setting, and the LAM test could be preferred for reasons other than higher diagnostic accuracy alone.
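The incremental yield of adding the LAM test to microscopy can be illustrated with the sensitivities given in the abstract (31/55 for combined microscopy, 35/55 with the LAM test added). Because a combination counts as positive if any component is positive, every microscopy-positive case stays positive, so the discordant pairs are 4 and 0. The exact McNemar p-value below is a worked illustration under that reasoning, not a result reported in the text, and the function names are hypothetical.

```python
from math import comb


def combined_positive(*test_results):
    """A combination is positive if any component test is positive."""
    return any(test_results)


def exact_mcnemar_p(b, c):
    """Two-sided exact (binomial) McNemar p-value from discordant pair counts."""
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
    return min(1.0, 2 * tail)


n_confirmed = 55
microscopy_pos = 31   # ZN plus fluorescence microscopy
combination_pos = 35  # microscopy plus LAM test

b = combination_pos - microscopy_pos  # cases detected by the LAM test only
c = 0                                 # cases lost by adding a test: none, by construction
p_value = exact_mcnemar_p(b, c)

print(f"additional cases: {b}/{n_confirmed}, exact McNemar p = {p_value}")
```

With only four discordant pairs the exact test gives p = 0.125, which shows how little power a comparison of paired sensitivities has at this sample size.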
The LAM test is simple to perform at the bedside and collection of urine is simple and poses minimal biohazard; all attractive features in settings with overburdened TB laboratories. Urine based diagnosis further holds a potential to improve TB diagnosis in children with one study of LAM test showing reasonable sensitivity in HIV-positive TB infected children [43]. A number of other biomarkers have been detected at increased levels in the urine from patients with active TB [44,45]. LAM remains among the most promising so far, but is limited by its modest sensitivity and use among HIV-infected individuals only.
The strengths of this study are prospective data collection, LAM testing on fresh urine and a minimum of 6 months' follow-up. We chose to enrol HIV-infected individuals initiating ART regardless of clinical presentation as this target group has been identified as particularly vulnerable to prevalent and undiagnosed TB. We had several limitations with regard to our reference standard, which did not include Xpert results for all participants or investigation of extrapulmonary samples that could have increased specificity further, especially among those with more advanced immunodeficiency. Moreover, we did not have capacity to perform sputum induction to improve sputum sample quality. Despite active follow-up through personal calls to participants and their relatives, loss to follow-up (LTFU) was high, but comparable to several other African HIV cohort studies [46]. Mortality is a frequent cause of LTFU [47,48] and mortality rates in our study could have been higher had we performed a more thorough follow-up with e.g. house visits. However, sensitivity analysis assuming that all participants LTFU had died did not change the association between LAM-positivity and mortality.