Test characteristics and potential impact of the urine LAM lateral flow assay in HIV-infected outpatients under investigation for TB and able to self-expectorate sputum for diagnostic testing

Background The commercially available urine LAM strip test, a point-of-care tuberculosis (TB) assay, requires evaluation in the primary care setting where it is most needed. There are currently inadequate data to guide implementation in TB and HIV-endemic settings.
Methods Adult HIV-infected outpatients with suspected pulmonary TB who were able to self-expectorate sputum, from four primary care clinics in South Africa, Zambia and Tanzania, underwent diagnostic evaluation [sputum smear microscopy, Xpert-MTB/RIF, and culture (reference standard)] as part of a prospective parent study. Urine LAM testing (grade-2 cut-point) was performed on archived samples. Performance characteristics of LAM alone or in combination with sputum-based diagnostics were evaluated. Potential impact on 2- and 6-month morbidity (TBscore), patient dropout rates, and prognosis (death/loss to follow-up) was evaluated.
Results Among 583 participants with suspected TB who were HIV-infected or refused testing, LAM sensitivity (95 % CI; n/N) overall and in the CD4 ≤ 100 cells/mm3 sub-group was 22.7 % (16.6-28.7; 41/181) and 30.4 % (17.1-43.7; 14/46), respectively. Overall specificity was 93.0 % (90.5-95.6; 361/388). Amongst culture-positive TB cases, adjunctive LAM testing did not improve the sensitivity of either sputum Xpert-MTB/RIF [78.2 % (69.8-86.7; 72/92) versus 76.1 % (67.4-84.8; 70/92), p = 0.7] or smear microscopy [56.2 % (45.9-66.5; 50/89) versus 43.8 % (33.5-54.1; 39/89), p = 0.1]. Clinic-based LAM, as an adjunct to either smear microscopy or Xpert MTB/RIF same-day testing, would neither have decreased patient dropout nor increased same-day treatment initiation in this clinical setting where same-day chest radiography was available. LAM positivity was associated with 6-month loss to follow-up/death (AOR 4.4; p = 0.002) but not with TBscore (at baseline or change in TBscore 2 months post-treatment) (p = 0.17).
Conclusions In African HIV-TB co-infected outpatients able to self-expectorate sputum, LAM had limited sensitivity even at low CD4 counts, and offered no significant incremental diagnostic yield over Xpert-MTB/RIF or smear microscopy. In primary care clinics with chest radiography and where empiric TB treatment is common, LAM seems unlikely to improve rates of same-day treatment initiation and patient dropout; however, the ability of LAM to identify patients at high risk of death or loss to follow-up may offer important prognostic value.
Electronic supplementary material The online version of this article (doi:10.1186/s12879-015-0967-z) contains supplementary material, which is available to authorized users.


Background
Of the estimated 8.6 million active tuberculosis (TB) cases globally, an estimated three million remained either undiagnosed or unreported [1]. Thus, tests and technologies that allow rapid, accurate, point-of-care (POC) diagnosis represent an unmet need and are projected to substantially reduce the global TB burden [2,3]. An increasing number of high TB/HIV burden settings are implementing frontline Xpert MTB/RIF testing for HIV-infected patients with suspected TB [4-6], although sputum smear microscopy remains the frontline TB diagnostic tool in the majority of resource-poor high burden settings. Used at the POC, Xpert MTB/RIF can decrease dropout rates (patients TB-positive but not returning to initiate treatment) and time-to-treatment initiation [5,7]. However, sensitivity in sputum is reduced in HIV-infected patients [8] and is also suboptimal in induced sputum samples [9]. These and other considerations, including infrastructure and electricity requirements, mean that countries remain interested in simple, low-cost, non-sputum-based POC diagnostic tools for both pulmonary and extrapulmonary TB diagnosis.
In 2013, the Alere Determine™ TB LAM Ag lateral flow strip test (Alere, USA, www.alerehiv.com; referred to as LAM from this point forward) became the first commercially available bedside urine test for TB diagnosis in HIV co-infected patients, with results available within 25 minutes using just 60 µl of urine [10]. To date, diagnostic accuracy studies of urine LAM have been largely single centre [11-13]. Tested patient populations have been heterogeneous (patients with extrapulmonary TB, those unable to spontaneously provide a sputum sample for TB diagnostic testing, or hospitalised patients) [14] and the incremental value of LAM over sputum smear microscopy or Xpert MTB/RIF has thus been variable [11-17]. Nevertheless, POC tests like urine LAM require evaluation in primary care settings, where they are likely to have most impact and where more than 90 % of the TB case load is first encountered. However, there are limited data to guide implementation in such settings where POC TB tests are most needed. Furthermore, the incremental value of LAM over tests like Xpert MTB/RIF, if any, has hardly been studied in HIV-endemic primary care settings. There are also limited published data about the potential impact of LAM on morbidity, same-day treatment initiation, patient dropout rates and prognosis [18,19].
In this study we sought to provide multicentre comparative accuracy data and extrapolated impact data for adjunctive LAM testing in the setting where the overwhelming majority of individuals with presumptive pulmonary TB present (out-patient primary care clinics). These data are needed to make a definitive recommendation about the use of LAM in this key patient population. We hypothesised that in out-patients able to self-expectorate sputum for sputum-based diagnostics, LAM would have very limited incremental utility. We therefore tested urine for LAM positivity in a cohort of 583 HIV-infected patients with suspected pulmonary TB who formed part of a parent randomised controlled trial comparing Xpert MTB/RIF with same-day smear microscopy in primary care clinics of three sub-Saharan African countries [5]. All participants provided two expectorated sputa and a urine sample.

Design and study population
This cross-sectional accuracy study was nested within a randomised, parallel-arm, multicentre trial evaluating the impact of point-of-treatment Xpert MTB/RIF testing compared with same-day smear microscopy. Patients were enrolled between 12  [20,21] (see online supplementary methods for further details), ii) the ability to spontaneously expectorate two spot sputum specimens with a volume of ≥1 ml each, and exclusion criteria included: i) failure to obtain informed consent and ii) initiation of anti-TB treatment in the previous 60 days. In this substudy, HIV-uninfected patients were excluded from the analysis. Patients refusing voluntary counselling and testing for HIV (3 %) were considered "positive" and included in the LAM analysis, as this would occur in routine clinical practice given the very high (>50 %) incidence of HIV co-infection amongst new TB cases in these endemic countries. A further detailed description of the RCT methodology, including a description of each primary care clinic site, is available with the published manuscript of the parent study [5].

Sample collection and processing
Each patient had at least two spot expectorated sputa collected sequentially at recruitment. Nurses visually inspected expectorated sputum samples and estimated the volume using standards of known volume. Patients randomised to the smear microscopy study arm received two same-day sputum smears for acid-fast bacilli, and one arbitrarily selected specimen also underwent culture. Patients in the Xpert MTB/RIF arm received nurse-performed clinic-based Xpert MTB/RIF testing on one specimen, and the other specimen was sent for culture. All patients were asked to provide a spontaneously voided urine sample (10-30 ml) into a sterile receptacle. Urine was transferred to the laboratory within 4 hours and frozen at -20 °C for later batch testing. Urine samples were collected in all parent study sites except Zimbabwe, where urine biobanking was not possible.

Clinical management and follow-up
Patients were enrolled and initially reviewed by research nursing staff. They were offered voluntary testing and counselling for HIV at recruitment, and received a chest radiograph while awaiting their rapid TB test result. TB-related morbidity was assessed at enrolment and during 2- and 6-month follow-up using the previously validated TBscore [22] (Additional file 1 Table S1 in the online supplement). Patients were referred to the local DOTS programme office at the same clinic for the initiation of anti-TB treatment if a positive result was obtained from smear microscopy, Xpert MTB/RIF, or TB culture. Smear- or Xpert MTB/RIF-negative patients were referred to clinical staff for review together with their chest radiographs, as part of the routine clinic workflow. The WHO guidelines for the treatment of smear-negative TB [21] are routinely used at each clinic. Doctors who were not part of the study team and who routinely visited each facility twice a week initiated the treatment of smear- or Xpert MTB/RIF-negative patients. Follow-up was conducted for all study patients by research staff at 2 and 6 months post randomisation (within a range of 14 days before and after both time points). Patients were considered lost to follow-up if they were not contactable despite multiple attempts at telephonic contact and tracing by a community healthcare worker.

Diagnostic test procedures
Liquid TB culture (Bactec MGIT; BD Microbiology Systems, USA) was performed in central laboratories on sputum decontaminated using N-acetyl-L-cysteine-NaOH. Front-loaded, same-day smear microscopy was performed on-site by a technician employed by the programme in a laboratory attached to the healthcare facility, except in South Africa where it was performed at a centralised laboratory. Fluorescence smear microscopy was performed on concentrated samples with auramine-O staining at all sites except Mbeya, Tanzania where concentrated samples underwent ZN-staining and light microscopy. Patients were classified as having smear-positive tuberculosis on the basis of at least one scanty smear (1-9 bacilli per 100 fields [1000× for light microscopy and 400× for fluorescence microscopy]).
LAM strip tests were performed by trained clinical and laboratory staff, according to the manufacturer's recommendations, on unprocessed thawed urine specimens that had been stored at -20 °C for no longer than 18 months. Two independent readers, blinded to reference test results and clinical outcomes, graded each strip (0-5 according to the colour band intensity) after 25-35 minutes using the pre-January 2014 manufacturer's reference card (see Additional file 1 Figure S1A in supplement). Where results between the two independent readers were discordant, a third consensus reader graded the LAM strip, and this consensus read was used in the analysis. Further details on the reading of LAM strip tests and the reference cards are provided in the online supplement. LAM strip results were neither provided to the clinicians caring for patients nor used for treatment decisions.

Statistical analyses
The reference standard for the primary analysis of diagnostic accuracy was a single sputum liquid culture for Mycobacterium tuberculosis. An additional analysis is provided in the online supplement with culture-negative clinical-TB cases considered reference standard positive, given the acknowledged limitation of a single sputum TB culture to diagnose TB in HIV co-infection. The study was powered (75-100 % at 95 % confidence interval) to detect differences in diagnostic accuracy between LAM, Xpert MTB/RIF and smear, alone or in combination, for HIV-infected patients. The study was underpowered to detect small differences in diagnostic accuracy measures amongst different CD4 strata. Descriptive statistics were used to characterise the study population. Diagnostic accuracy measures presented include sensitivity, specificity, positive and negative likelihood ratios (LR+, LR-) and positive and negative predictive values (PPV, NPV), all with 95 % confidence intervals (CI). The χ² test and Fisher's exact test with mid-P correction were used for comparisons between proportions, and the Mann-Whitney test was used to compare differences in TBscore. The potential impact of same-day LAM testing was assessed by assuming that LAM testing would be performed at the first point of clinic contact and that all LAM-positive patients would initiate anti-TB treatment immediately. χ² testing was used to compare same-day treatment initiation and treatment dropout proportions with and without the use of LAM in each study arm. Multivariable linear (for morbidity scores) and logistic (for mortality) regression analyses were performed. Sample size calculations were based on the primary outcome for the parent study [5] (http://clinicaltrials.gov/show/NCT01554384). Based on the prevalence of HIV and culture-positive TB cases in the parent study, there was adequate study power for the 95 % CI around LAM accuracy measures to be within a range of ±10 %.
Analyses were performed using OpenEpi (version 2.3.1) [23], and R (version 3.0) [24]. The study is reported in accordance with the STARD initiative recommendations [25].
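As an illustration of the comparison between proportions described above, the following is a minimal Python sketch of an uncorrected Pearson χ² test on a 2×2 table, applied to the adjunctive sensitivity counts given in the abstract (72/92 versus 70/92 for Xpert MTB/RIF with and without LAM; 50/89 versus 39/89 for smear microscopy). It is not the authors' OpenEpi or R code: the function names are hypothetical, and treating the two proportions as independent samples without continuity correction is an assumption made here for illustration.

```python
import math


def chi2_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square for the 2x2 table [[a, b], [c, d]].

    Returns the statistic and its p-value on 1 degree of freedom.
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


def compare_sensitivity(detected_with_lam, detected_without_lam, n_cases):
    """Compare cases detected with and without adjunctive LAM.

    Assumption: the two proportions are treated as unpaired samples.
    """
    return chi2_2x2(
        detected_with_lam, n_cases - detected_with_lam,
        detected_without_lam, n_cases - detected_without_lam,
    )


# Counts taken from the abstract (culture-positive TB cases in each arm).
xpert_stat, xpert_p = compare_sensitivity(72, 70, 92)
smear_stat, smear_p = compare_sensitivity(50, 39, 89)

print(f"Xpert +/- LAM: chi2 = {xpert_stat:.3f}, p = {xpert_p:.2f}")
print(f"Smear +/- LAM: chi2 = {smear_stat:.3f}, p = {smear_p:.2f}")
```

Under these assumptions the p-values round to the reported 0.7 and 0.1. Because both strategies were evaluated in the same patients, a paired method such as McNemar's test could also be argued for; the unpaired form is used here only because it matches the χ² comparison named in the text.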

Role of the funding source
Alere donated the LAM strip tests. However, neither the company nor the study sponsor had any role in study design, data collection, data analysis, data interpretation, or writing of the report. The corresponding author had full access to all the data in the study and had final responsibility for the decision to submit for publication.

Results
Study sites and population characteristics
Figure 1 provides the study profile. We enrolled 1095 patients with suspected TB who were able to provide ≥2 sputa and a spot urine sample, from four primary care clinics in South Africa (n = 419 in Cape Town, n = 193 in Durban), Zambia (n = 400) and Tanzania (n = 83). Further detailed descriptions of each primary care clinic have already been published with the parent study [5]. HIV infection was confirmed in 564/1095 (52 %) and testing was refused in 19/1095 (1 %); LAM test performance is considered in these two groups combined (N = 583). Of the 583 HIV-positive/status unknown patients with suspected TB, 14 (2 %) had either a contaminated culture or no available result, and 181 (31 %) had culture-positive TB. Table 1 provides the basic demographic and clinical characteristics of the HIV-infected patients of the study population stratified by study site.

LAM performance
Three (<1 %) LAM strip tests failed on the first attempt and required use of a second strip test to produce valid results. The third LAM strip reader was required in 151/583 (26 %) patients, and in 126/151 (83 %) cases this was for differences in grading bands between the grade 0 and 1 intensity. Thus, excluding grade 0/1 discordance, a third reader was required in only 25/583 (4 %) of patients. Additional file 1 Table S2 compares LAM diagnostic accuracy using the grade-2 versus grade-1 cut-point (pre-January 2014 reference card, Fig. 1A), showing the higher specificity and LR+ of the grade-2 cut-point. Interobserver agreement as to the presence versus absence of a test band of intensity grade 2 or higher was 97.3 % (kappa 0.87), while agreement regarding the presence versus absence of a test band of intensity grade 1 or higher was 90.9 % (kappa 0.77), p < 0.001. Based on these data and previously published work [14-16,26], and in accordance with the updated manufacturer's reference card (see Additional file 1 Figure S1B), the remaining diagnostic accuracy data in the results section are presented using the grade-2 LAM cut-point. Table 2 shows the diagnostic accuracy of LAM, Xpert MTB/RIF and smear microscopy amongst HIV-infected patients stratified by CD4 cell count (PPV, NPV, LR+ and LR- shown in Additional file 1 Table S3). Overall, LAM had sensitivity (95 % CI) and specificity (95 % CI) of 22.7 % (16.6-28.7) and 93.0 % (90.5-95.6), respectively. LAM specificity did not significantly increase when TB culture-negative patients diagnosed with clinical-TB
A sensitivity analysis was performed to investigate the impact of the 21 % (28/131) and 27 % (36/130) of non-TB patients in the smear and Xpert study arms, respectively, who were lost to follow-up and the 1.5 % (2/131) and 3 % (4/130) who were deceased. No differences in the diagnostic accuracy of LAM alone or with smear/Xpert were noted (data not shown).
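The overall accuracy figures reported above can be recomputed from the counts given in the abstract (41/181 culture-positive patients LAM-positive; 361/388 culture-negative patients LAM-negative). The Python sketch below is an illustration rather than the authors' analysis code. It assumes a normal-approximation (Wald) interval, which appears consistent with the published intervals but is not stated in the text, and the likelihood ratios and predictive values it derives have not been checked against the supplementary tables. All function and variable names are hypothetical.

```python
import math
from statistics import NormalDist


def wald_ci(successes, total, level=0.95):
    """Proportion with a normal-approximation (Wald) confidence interval."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, p - half_width, p + half_width


# Counts from the abstract: LAM (grade-2 cut-point) versus culture.
true_pos, culture_pos = 41, 181
true_neg, culture_neg = 361, 388
false_neg = culture_pos - true_pos   # culture-positive, LAM-negative
false_pos = culture_neg - true_neg   # culture-negative, LAM-positive

sens, sens_lo, sens_hi = wald_ci(true_pos, culture_pos)
spec, spec_lo, spec_hi = wald_ci(true_neg, culture_neg)

# Likelihood ratios and predictive values derived from the same 2x2 table.
lr_pos = sens / (1 - spec)
lr_neg = (1 - sens) / spec
ppv = true_pos / (true_pos + false_pos)
npv = true_neg / (true_neg + false_neg)

print(f"Sensitivity {sens:.3f} (95% CI {sens_lo:.3f} to {sens_hi:.3f})")
print(f"Specificity {spec:.3f} (95% CI {spec_lo:.3f} to {spec_hi:.3f})")
print(f"LR+ {lr_pos:.2f}, LR- {lr_neg:.2f}, PPV {ppv:.3f}, NPV {npv:.3f}")
```

With these counts the half-width of the sensitivity interval is roughly 6 percentage points, inside the ±10 % precision target described in the statistical methods.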
Potential impact on patient-important treatment outcomes of adding point-of-care LAM to sputum-based Xpert MTB/RIF or smear microscopy
In the parent study, the reason for treatment initiation, time-to-treatment and failure to initiate treatment in culture-positive TB patients ('dropout') were recorded (Fig. 2). The potential impact on these outcomes of adding point-of-care LAM (in the manner described in the methods section) is also shown in Fig. 2.
LAM as a potential prognostic marker
Table 4, LAM positivity was a predictor only of the combined outcome of lost-to-follow-up/death, and not of 6-month mortality or baseline TBscore. In addition, TB-related morbidity outcomes, as measured by the improvement in TBscore from enrolment to 2 and 6 months after initiation of effective anti-TB treatment, were similar amongst LAM strip positive versus negative patients [4 (3-5) versus 4 (2.5-5), p = 0.136] (online Additional file 1 Table S6).

Discussion
In this multi-centre out-patient study of patients with suspected PTB able to provide expectorated sputum for diagnostic testing, LAM had poor overall sensitivity, which did not significantly improve in patients with CD4 cell counts less than 100 cells/mm3. LAM was not able to improve on the diagnosis offered by either sputum-based Xpert MTB/RIF or smear microscopy alone. Due to this lack of incremental utility over sputum-based tools, LAM seems unlikely to be able to improve patient-important treatment outcomes in outpatients. Indeed, although LAM could offer some prognostic utility in identifying patients at high risk of death or loss to follow-up, there was no impact on patient dropout or morbidity. Adjunctive LAM testing may only increase same-day treatment initiation in clinic settings using sputum smear microscopy where same-day chest radiography is not available to guide empiric same-day treatment decisions. It may be argued that our study only considered outpatients able to expectorate sputum, and that LAM may offer important incremental value for patients unable to provide sputum for diagnostic testing. However, this study's patient population is the major subgroup presenting to primary care facilities for frontline testing, and although preliminary data suggest that this is not the subgroup most likely to benefit from LAM, data are nevertheless essential for clear recommendations to guide implementation across TB programs. Indeed, program directors continue to question why the only commercially available and affordable POC TB test is unavailable for frontline testing. The data outlined here offer valuable insights. Published data, albeit limited, indicate that LAM sensitivity is increased with higher circulating LAM levels, occurring with higher mycobacterial disease burden, extrapulmonary TB, lower CD4 cell count and WHO clinical stage 3 and 4 in out- and in-patient settings [11,27-34].
Moreover, in those studies LAM and sputum smear microscopy identified non-overlapping sub-groups of culture-positive TB, thereby offering additive diagnostic value [11-14, 16, 35]. By contrast, we found no incremental benefit of LAM. There are a number of possible explanations. Firstly, sputum-scarce TB, smear-negative TB, and extrapulmonary TB (EPTB) are more common in hospital-based and pre-ART screening cohorts, and thus smear microscopy sensitivity is more likely to be reduced, which in turn would increase the incremental benefit offered by LAM testing. Secondly, differences in sputum smear microscopy staining and concentration methodologies across different studies affect smear microscopy sensitivity [13]. Thirdly, in contrast to other LAM-based studies [11,12,14], our study did not offer sputum induction to improve sputum sampling, and thus those unable to spontaneously produce two spot sputa were excluded. Sputum induction was not offered because the pragmatic design of the parent study reflects the reality that sputum induction facilities remain unavailable in the majority of routine primary care clinic settings. Nevertheless, the inability of LAM testing to improve the diagnostic yield of a single sputum-based Xpert MTB/RIF, irrespective of declining CD4 cell count, is consistent with the findings of Lawn et al. and reflects the superior sensitivity of sputum-based Xpert MTB/RIF for the diagnosis of pulmonary TB [11,36,37].
In our initial study of LAM amongst hospitalised HIV-infected patients with advanced immunosuppression, we noted that test specificity and inter-reader agreement were optimised (>95 % for both) by use of an alternative grade-2 cut-point rather than the manufacturer's initially suggested grade-1 cut-point [16]. Based on these findings, the manufacturer's reference card has been updated as of January 2014 so that the first positive visual band corresponds to the grade-2 intensity band of the old reference card (see Fig. 1 of the online supplement, www.alere.com).
Independent of cut-point selection, suboptimal LAM specificity (<95 % as recommended by an expert committee for point-of-care TB testing [38]), particularly in countries north of South Africa, remains a concern. Indeed, we noted lower test specificity in Tanzania and Zambia compared to the two South African study sites. Reasons for this may include different degrees of the unavoidable misclassification bias associated with using a single sputum culture to classify TB in HIV-infected patients with advancing immunosuppression. However, in our secondary analysis excluding 'probable or clinical TB', the specificity in Zambia and Mbeya remained lower. Kroidl et al. found cross-contamination of the LAM ELISA from dust, soil and stool in Tanzania, and we have demonstrated cross-reacting LAM-like glycolipid antigens in Nocardia and Candida spp. [30,39]. Sterile collection of urine samples is essential, especially in countries north of South Africa, and in a recent Ugandan in- and out-patient study LAM specificity was 95 % [14].
Table 3 Morbidity (measured by TBscore)
LAM offered limited incremental diagnostic benefit over sputum-based diagnostics, in contrast to studies showing incremental benefit of LAM in hospitalised patients with sputum-scarce TB and EPTB [14,16] or those identifying patients with TB missed by empiric treatment initiation but identified by LAM [40]. Consequently, our study suggests LAM would have minimal potential impact on patient-important treatment outcomes. In fact, test specificity was significantly lower when combining Xpert MTB/RIF with LAM for both a culture and composite reference standard, with the potential to increase inappropriate treatment initiation. Thus, sputum-based diagnosis, especially where Xpert MTB/RIF is available, should be preferred in HIV-infected outpatients able to spontaneously provide sputa. However, LAM may potentially improve same-day treatment initiation in the clinic setting where only sputum smear microscopy is performed and no chest radiography facilities are available. In addition, LAM may still offer i) important added diagnostic benefit where the performance of sputum-based tests is reduced, such as in sputum-scarce TB, extrapulmonary TB, mycobacteremia [27], and/or renal TB [41], and ii) important prognostic and treatment monitoring utility [18].
This study had several limitations and strengths. It is the first large multicentre study in primary care practice, allowing for accurate evaluation of diagnostic accuracy across three sub-Saharan African countries. The design of the parent study offered a unique opportunity to evaluate the diagnostic accuracy of LAM in a well-defined outpatient population able to provide sputum for diagnostic testing and hence to estimate the potential impact of LAM when combined with either Xpert MTB/RIF or smear in this patient group. Misclassification bias was a potential problem in our study, as a single sputum culture can miss TB cases amongst HIV-infected patients. However, in an alternative analysis where TB culture-negative patients diagnosed as clinical TB and initiating treatment were considered reference standard positive, no significant difference in LAM specificity was noted and study conclusions were unaltered. In addition, a sensitivity analysis with lost-to-follow-up and deceased patients excluded from the non-TB group, i.e. considered unclassifiable, did not significantly alter diagnostic accuracy measures. LAM was performed on frozen rather than fresh samples, which could have reduced test sensitivity; however, meta-analysis data suggest no differences in diagnostic accuracy using frozen rather than fresh samples [42]. As LAM testing was performed on archived samples, impact data for adjunctive LAM are extrapolated. The study was not powered to detect small differences in sub-groups (CD4 strata, treatment dropouts), and thus small incremental benefits of LAM in these sub-groups may not have been detected. Likewise, the small number of study deaths limited power to examine LAM as a predictor of mortality in the multivariate analysis.
In conclusion, LAM strip testing had poor sensitivity amongst HIV-infected outpatients able to provide expectorated sputum for diagnostic testing. There was no incremental diagnostic benefit over either Xpert MTB/RIF or smear microscopy. If used as an adjunctive diagnostic tool in this setting, it is unlikely to impact patient-important treatment outcomes such as morbidity, patient dropout, or same-day treatment initiation, except in smear-microscopy-only clinics with no chest radiography. Potential gains need to be weighed against the likely increase in inappropriate 'false-positive' treatment. However, further impact-orientated studies focused on mortality and morbidity benefits in ill hospitalised patients are warranted.

Additional file
Additional file 1: Figure S1A. Pre-January 2014 LAM strip test manufacturer's reference card illustrating visual intensity grades 0-5. Figure S1B. January 2014 new LAM strip test manufacturer's reference card illustrating visual intensity grades 0-4. Table S1. Variables used to calculate the TBscore as defined by Wejse et al. (2008). Each patient was scored at baseline, 2 months and 6 months. Table S2. Comparative diagnostic accuracy of two LAM strip test grade cut-points (old reference card) in HIV-infected patients, stratified by CD4 cell count. Table S3. Additional diagnostic accuracy measures (likelihood ratios and predictive values) of LAM (grade-2 cut-point), sputum Xpert MTB/RIF or smear microscopy alone or in combination for culture-confirmed versus culture-negative pulmonary tuberculosis amongst HIV-infected (and refused testing) patients stratified by CD4 cell count (TB prevalence = 31 %). Table S4. Diagnostic accuracy of LAM (grade-2 cut-point) alone or in combination with either Xpert MTB/RIF or smear microscopy for culture-positive and clinical versus culture-negative pulmonary tuberculosis amongst HIV-infected (and refused testing) patients, stratified by CD4 cell count. Table S5A. Sensitivity of different diagnostic tests alone and in combination in HIV-infected (and refused testing) patients with culture-positive tuberculosis, stratified by study site. Table S5B. Specificity of different diagnostic tests alone or in combination in all patients culture-negative for tuberculosis, stratified by study site. Table S5C. Specificity of different diagnostic tests alone or in combination in HIV-infected patients culture-negative for tuberculosis and without clinical TB, stratified by study site. Table S6. Changes in 2-month and 6-month TB-related morbidity indices in patients treated for TB according to baseline culture status, stratified by LAM result.

Competing interests
Although Alere supplied the ELISA and LAM strip tests free of charge, the company had no role in the design and conduct of the study, analysis of the data or writing of the manuscript. The authors have no other interests to declare.

Acknowledgments
We are indebted to the patients who participated in this study. We thank the Health Directorate of the City of Cape Town, the Zambian Ministry of Health, the Kwa-Zulu Natal Provincial Department of Health, and the Tanzanian Ministry of Health and Social Welfare. We acknowledge the assistance of health facility staff at each site, and the assistance of the local institutional review boards.
The TB-NEAT study team