Performance of interferon-γ release assays in the diagnosis of confirmed active tuberculosis in immunocompetent children: a new systematic review and meta-analysis

Background Tuberculosis (TB) is a global public health problem, causing morbidity and mortality in adults and children. The most reliable diagnostic tools currently available are the in vivo Tuberculin Skin Test (TST) and the ex vivo Interferon-γ release assays (IGRAs). Several clinical, radiological, and bacteriological features make the detection of active (overt disease) TB in children difficult. Although recently developed immunological assays such as QuantiFERON-TB Gold In-Tube (QFT-IT) and T-SPOT®.TB are commonly used to identify active TB in adults, different evidence is required for diagnosis in children. The purpose of this study was to reassess the sensitivity and specificity of IGRAs in detecting microbiologically confirmed active TB in immunocompetent children. Methods A systematic review and meta-analysis of studies reporting on the diagnostic accuracy of tests for TB in immunocompetent children aged 0–18 years, with confirmation by positive M. tuberculosis cultures, were undertaken. Electronic databases were searched up to September 2015 and study quality assessment was performed using QUADAS-2. Results Fifteen studies were included in our meta-analysis. Results showed that there were no significant differences in sensitivity between TST (88.2 %, 95 % confidence interval [CI] 79.4–94.2 %), QFT-IT (89.6 %, 95 % CI 79.7–95.7 %) and T-SPOT (88.5 %, 95 % CI 80.4–94.1 %). However, both QFT-IT (95.4 %, 95 % CI 93.8–96.6 %) and T-SPOT (96.8 %, 95 % CI 94.2–98.5 %) have significantly higher specificity than TST (86.3 %, 95 % CI 83.9–88.6 %). Conclusions QFT-IT and T-SPOT have higher specificity than TST for detecting active TB cases in immunocompetent children.


Background
Tuberculosis (TB) is one of the most important global public health problems and one of the major causes of adult and childhood morbidity and mortality worldwide. In 2012, there were an estimated 530,000 TB cases (bacteriologically confirmed or clinically diagnosed) among children <15 years of age, approximately 6 % of the total number of 8.6 million cases. Among HIV-negative children, there were 74,000 TB-related deaths, approximately 8 % of the total number of 940,000 TB-related deaths among HIV-negative people [1].
In 2011, the trend in the pediatric TB notification rate showed a slight decline during the previous ten years from a peak of 5.7 in 2001. However, a number of countries, such as Bulgaria, Finland and Italy, have seen increasing trends during the same period [2]. Indeed, across Europe during the period 2000-2009, a decline or stabilization of trends was reported in high-incidence countries while low-incidence countries tended to report an increased incidence in pediatric TB.
In 2009, only 19.2 % of all childhood TB cases in Europe were confirmed by culture, a clear indication that TB diagnosis in children remains a major public health challenge [3]. Several clinical, radiological and bacteriological features (such as pauci-bacillary nature, atypical clinical signs, and a lower probability of bacteriological confirmation) make the detection of active TB in children difficult, often leading to the neglect of TB within pediatric populations [4].
As a result, the diagnosis of active disease in children often relies on a combination of contact history, clinical symptoms, and radiological findings, together with a consideration of the results of a Tuberculin Skin Test (TST) [5,6].
The most reliable diagnostic tools currently available for identifying TB infection are the in vivo TST and the ex vivo interferon-γ (IFN-γ) release assays (IGRAs). For almost 100 years, the TST was the main test of choice for identifying TB infection. This test measures an individual's response to a solution of Mycobacterium tuberculosis antigens and can produce false-positive and false-negative responses due to immunologic immaturity or cross-reactivity with mycobacteria not in the M. tuberculosis complex, vaccination with Bacille Calmette-Guérin (BCG), and other undetermined causes [7,8]. Within the past decades, however, new immunological assays have been developed: the QuantiFERON-TB Gold (QFT-G; Qiagen), the QuantiFERON-TB Gold In-Tube (QFT-IT; Qiagen), and the T-SPOT®.TB assay (Oxford Immunotec). QFT-G and QFT-IT measure the concentration of IFN-γ produced in whole blood by enzyme-linked immunosorbent assay (ELISA) [8,9]. T-SPOT measures the number of individual Mycobacterium-specific T cells secreting IFN-γ by the enzyme-linked immunosorbent spot (ELISPOT) assay [10,11].
In adults, a higher specificity of IGRAs compared with TST has been reported. The sensitivity for active TB ranges from 70 to 90 % and is lower in high TB incidence settings [12-15]. Thus, IGRAs are now included by the CDC in the recommended diagnostic algorithm for detection of TB in adults [16]. However, caution is recommended regarding their use in children [17].
A growing number of studies have compared TST and IGRAs for the detection of M. tuberculosis infection, a condition that may or may not progress to clinical disease and active (overt disease) TB in children. Studies have measured sensitivity in populations with active TB and in populations exposed to TB cases [18,19]. Six meta-analyses [6,20-24] have previously assessed IGRAs' sensitivity and specificity in children and reported largely different pooled estimates. These differences are due to the characteristics of the study populations and different inclusion/exclusion criteria (such as immunologic status, level of income, and concurrent infections). Two of these previous meta-analyses focused on either bacteriologically confirmed or clinically diagnosed TB cases [6,22], one included contacts with TB cases in addition to the previous two categories [20], another also included cases of latent TB [21] and one [23], although providing a sub-analysis on microbiologically confirmed cases, included studies for which it was not possible to clearly identify the methods used to confirm cases. In the last meta-analysis, which provided a sub-analysis including only microbiologically confirmed cases, the study population also included immunocompromised children [24]. Because of this heterogeneity, pooled estimates of sensitivity and specificity of IGRAs and TST have varied considerably. Using inclusion/exclusion criteria different from those of previous studies, the aim of our study was to reassess the sensitivity and the specificity of IGRAs (QFT-IT and T-SPOT.TB) versus TST in the detection of bacteriologically confirmed active TB in immunocompetent children aged 0-18 years.

Literature retrieval
An extensive search of the scientific literature was carried out by querying the PubMed, EMBASE and Cochrane Library electronic databases to identify articles published in English or Italian between January 1st 2003 and September 30th 2015. The following terms were used as keywords: "tuberculosis", "tuberculosis infection", or "tuberculosis disease"; "pediatrics" or "child*"; "Tuberculin Test"; "Interferon-gamma Release Tests", "QuantiFERON", "ELISpot", "QFT-IT", "QFT-2G", "IFN", "T-cell assays", "T-SPOT.TB test", "ESAT-6", "CFP10", or "RD1 antigens"; "Sensitivity"; and "Specificity". Further retrieval of grey literature was conducted by consulting Google Scholar and the websites of the World Health Organization (http://www.who.int/en/), the Centers for Disease Control and Prevention (http://www.cdc.gov/) and the National Institute for Health and Clinical Excellence (https://www.nice.org.uk/) for relevant unpublished studies and national and international guidelines. We integrated the electronic searches with manual searches, checking the reference lists of relevant articles to identify further studies.

Selection criteria
Potential studies were selected through consideration of the title and abstract by two researchers. Disagreements were resolved by a senior researcher. Full texts of eligible articles were read by two researchers to decide upon final inclusion.
The following inclusion criteria were used: only studies performed on healthy children from 0 to 18 years were considered eligible, and articles which included only adults or immunosuppressed children (such as HIV-positive patients) were excluded; we included only those studies focused on the sensitivity and specificity of IGRAs or TST in detecting confirmed active TB cases (defined as a child with active TB disease, confirmed by positive M. tuberculosis cultures); we included only those studies reporting sensitivity and specificity, or from which it was possible to calculate them; we included only articles that reported original data (reviews, case reports and editorials were excluded); and we included only those studies with ≥5 study subjects.

Quality assessment
Two independent researchers evaluated the validity of the selected studies using the Revised Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool [25]. This tool assesses the risk for bias and concerns regarding applicability in four domains: patient selection, index test, reference standard, and study flow and timing. The risk of bias was evaluated through the identification of specific questions and the development of guidance on items evaluation according to QUADAS-2 recommendations. The reviewers recorded and compared the answers given to each question.
Both reviewers analysed all articles in terms of the study population, index test, reference standard, setting, diagnostic pathway, target condition, and flow diagram. For each article, researchers independently recorded a score of "low risk of bias/low concerns regarding applicability," "high risk of bias/high concerns," or "unclear" for each question. All domains with at least one negative response scored "high risk of bias" (if the negative response regarded the risk for bias) or "high concerns regarding applicability" (if the negative response regarded the applicability), while domains with no negative responses but at least one unsure response scored "unclear". Domains with no negative and no unsure responses scored "low risk of bias/low concerns". All disagreements were resolved by consensus.
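The domain-level scoring rule described above can be written as a small function. The following is a minimal sketch, assuming that each signalling question is answered "yes" (favourable), "no" (negative) or "unclear"; the function name `rate_domain` and the example answers are hypothetical and are not taken from the QUADAS-2 tool or from the included studies.

```python
from typing import Iterable


def rate_domain(answers: Iterable[str]) -> str:
    """Rate one QUADAS-2 domain from its signalling-question answers.

    Rule sketched from the text: any negative answer gives "high",
    otherwise any unsure answer gives "unclear", otherwise "low".
    """
    cleaned = [a.strip().lower() for a in answers]
    unknown = set(cleaned) - {"yes", "no", "unclear"}
    if unknown:
        raise ValueError(f"unexpected answers: {sorted(unknown)}")
    if "no" in cleaned:
        return "high"
    if "unclear" in cleaned:
        return "unclear"
    return "low"


# Hypothetical answers for a single study, one list per domain.
example_study = {
    "patient selection": ["yes", "no", "yes"],
    "index test": ["yes", "unclear"],
    "reference standard": ["yes", "yes"],
    "flow and timing": ["yes", "yes", "yes"],
}
ratings = {domain: rate_domain(a) for domain, a in example_study.items()}
```

The same function would apply to both the risk-of-bias and the applicability questions, since the text describes the same rule for each.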

Data abstraction and data analysis
Data were extracted using a standardized form including the following information: authors, year of publication, journal, country, country TB burden, study design, age of the patients, sample size, TB diagnostic tests, and TST cut-off. For each study, children representing true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN) were defined using microbiological culture as the reference test. With respect to TST, patients were classified as positive or negative according both to the TST cut-off chosen by the authors of each paper and to all three TST cut-offs defined by the American Academy of Pediatrics (AAP) (>5 mm, >10 mm, >15 mm). The three cut-offs suggested by the AAP were applied to all patients of each study because it was not possible to classify patients into the risk groups defined by the AAP itself.
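As an illustration of this classification step, the sketch below tabulates TP, FN, FP and TN against culture for each of the three cut-offs and derives sensitivity and specificity. The induration values are invented for the example, the names `TwoByTwo` and `tabulate_tst` are hypothetical, and the strict ">" comparison simply follows the cut-offs as written above.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class TwoByTwo:
    tp: int = 0
    fn: int = 0
    fp: int = 0
    tn: int = 0

    @property
    def sensitivity(self) -> float:
        # Proportion of culture-positive children with a positive test.
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        # Proportion of culture-negative children with a negative test.
        return self.tn / (self.tn + self.fp)


def tabulate_tst(children: Iterable[Tuple[float, bool]], cutoff_mm: float) -> TwoByTwo:
    """children: (induration in mm, culture positive?) pairs."""
    table = TwoByTwo()
    for induration_mm, culture_positive in children:
        test_positive = induration_mm > cutoff_mm  # ">" as written in the text
        if culture_positive:
            if test_positive:
                table.tp += 1
            else:
                table.fn += 1
        else:
            if test_positive:
                table.fp += 1
            else:
                table.tn += 1
    return table


# Invented data: five culture-positive and five culture-negative children.
children = [(6, True), (11, True), (16, True), (18, True), (3, True),
            (0, False), (4, False), (7, False), (12, False), (0, False)]
tables = {cutoff: tabulate_tst(children, cutoff) for cutoff in (5, 10, 15)}
```

In this toy data set, raising the cut-off lowers sensitivity and raises specificity, which is the trade-off that motivates reporting all three cut-offs.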
Two authors independently extracted data from the papers and corroborated their findings. Pooled sensitivity and specificity of TST, QFT-IT and T-SPOT, with 95 % confidence intervals (CI), were calculated using the DerSimonian and Laird random effects model. Furthermore, summary receiver operating characteristic (sROC) curves with Area Under the Curve (AUC) were obtained on the basis of the Littenberg and Moses model. Meta-DiSc, version 1.4 (Hospital Ramón y Cajal, Madrid, Spain) [26] was used to perform the analysis. A value of 0.5 was added to all cells in studies where any cell was 0. Heterogeneity was assessed using the I² statistic. Pooled positive and negative likelihood ratios (LR+ and LR-) were obtained to assess the informative power of the three tests.
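For readers unfamiliar with the pooling step, the following is a minimal sketch of DerSimonian and Laird random-effects pooling of a proportion, with the 0.5 continuity correction and the I² statistic. It assumes pooling on the logit scale and applies the correction to the two cells used for each proportion; it is not claimed to reproduce the exact computations of Meta-DiSc, and the function name and example counts are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def expit(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def pool_dersimonian_laird(counts: List[Tuple[float, float]]) -> Dict[str, float]:
    """counts: (events, non_events) per study, e.g. (TP, FN) for sensitivity."""
    effects, variances = [], []
    for events, non_events in counts:
        if events == 0 or non_events == 0:
            # Continuity correction only for studies with an empty cell.
            events, non_events = events + 0.5, non_events + 0.5
        effects.append(math.log(events / non_events))    # logit of the proportion
        variances.append(1.0 / events + 1.0 / non_events)

    w = [1.0 / v for v in variances]                      # fixed-effect weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0       # between-study variance

    w_re = [1.0 / (v + tau2) for v in variances]          # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return {
        "pooled": expit(pooled),
        "ci_low": expit(pooled - 1.96 * se),
        "ci_high": expit(pooled + 1.96 * se),
        "tau2": tau2,
        "i2": i2,
    }


# Hypothetical (TP, FN) counts from four studies; the third has an empty cell.
example_tp_fn = [(18, 2), (9, 1), (12, 0), (20, 5)]
result = pool_dersimonian_laird(example_tp_fn)
```

Passing (TN, FP) pairs instead of (TP, FN) pairs would give a pooled specificity under the same assumptions.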
The included studies were undertaken in 11 countries, of which four (Lithuania, China, India and Uganda) [31,33,37,39,41] had a high TB burden. Among the studies considered in the analysis, six assessed both the sensitivity and specificity of IGRAs and TST [27,30,31,33,36,39]. The TST cut-off was set at 10 mm in nine studies, 5 mm in four studies, 15 mm in one study, and not defined in one. Regarding the IGRAs, five studies assessed both QFT-IT and T-SPOT.TB, seven only QFT-IT, and three only T-SPOT.TB. Characteristics of all included studies are given in Table 1.

Quality assessment
Results of the quality assessment are summarized in Table 2 and Fig. 2. Before disagreements were resolved, the reviewers' agreement on risk for bias and on concerns regarding applicability was 91.7 and 97.7 %, respectively. No study was considered at low risk for bias in all the domains, while all studies scored low in terms of concerns regarding applicability in all domains. The study of Sun Lin et al. [33] and that of Cruz et al. [35] were considered to be the most at risk of bias, being judged at high risk in each domain with the exception of the Index Test domain (Sun Lin et al.) and the Reference Standard domain (Cruz et al.). The studies of Detjen et al. [27], Kampmann et al. [29], Hansted et al. [31], and Chiappini et al. [36] were considered to be less at risk for bias, being judged at low risk in each domain with the exception of the Patient Selection domain (Detjen et al. [27], Kampmann et al. [29]) and the Index Test domain (Hansted et al. [31] and Chiappini et al. [23,36]).
In the Patient Selection domain (domain 1), five studies scored low risk for bias, one scored unclear risk (recruitment protocol not clearly stated), and nine scored high risk (sample of patients enrolled in a non-consecutive, nonrandom way or inappropriate exclusions not avoided). In the Index Test domain (domain 2), four studies had low risk for bias, in seven cases it was unclear whether the index test results were interpreted with or without knowledge of the results of the reference standard, and four scored high risk (the index test results interpreted with knowledge of the results of the reference standard). In the Reference Standard domain (domain 3), seven studies showed low risk for bias, six had unclear risk, and two were judged at high risk for bias. Indeed, in six studies, it was unclear if results of the reference standard were interpreted without knowledge of the index test, and in two cases reviewers judged that results of the reference standard were interpreted with knowledge of the index test. In the Flow and Timing domain (domain 4), seven studies scored low risk for bias, while seven were judged at high risk for bias, because not all patients recruited into the study were included in the analysis, and one scored unclear.
Diagnostic performance
TST (cut-off stated in the study), QFT-IT and T-SPOT TP, TN, FP, and FN for each study are reported in Table 3.

Discussion
Our study demonstrates that all three tests were highly accurate, as shown by the AUC. According to the confidence intervals of the pooled estimates, there are no significant differences in sensitivity among the three methodologies assessed: TST pooled sensitivity: 88.2 % (95 % CI 79.4–94.2 %); QFT-IT: 89.6 % (95 % CI 79.7–95.7 %); T-SPOT: 88.5 % (95 % CI 80.4–94.1 %). Moreover, with respect to the previously published meta-analyses, we have provided additional evidence of a higher specificity of QFT-IT and T-SPOT in bacteriologically confirmed active TB in immunocompetent children.
Since the sensitivity is similar, the improved specificity of QFT-IT and T-SPOT means that fewer healthy children are wrongly diagnosed as having active TB and incorrectly treated as such, which would expose them to two or three drugs for at least six months. This improved specificity also reduces the negative emotional impact of a false-positive result on the families of children.
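The practical size of this difference can be illustrated with simple arithmetic. The sketch below uses the pooled specificities reported in the abstract and a hypothetical group of 1000 children without TB; the function name is an assumption made for the example, and the resulting counts are illustrative rather than results of the meta-analysis.

```python
# Pooled specificities as reported in the abstract.
POOLED_SPECIFICITY = {"TST": 0.863, "QFT-IT": 0.954, "T-SPOT": 0.968}


def expected_false_positives(specificity: float, n_without_tb: int) -> float:
    """Expected number of positive results among children without TB."""
    return (1.0 - specificity) * n_without_tb


# Hypothetical cohort of 1000 children without TB.
per_1000 = {
    test: round(expected_false_positives(spec, 1000))
    for test, spec in POOLED_SPECIFICITY.items()
}
```

Under these assumptions, TST would be expected to give roughly three to four times as many false-positive results as either IGRA in the same group of children.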
The diagnosis of active TB in children is especially problematic as symptoms can be confused with those of common childhood diseases and sputum samples are [...]
Table 3 Results of TST, QFT-IT and T-SPOT.TB. Columns: Author, Year; TST (cut-off stated in each study): TP, TN, FP, FN; QFT-IT: TP, TN, FP, FN, IND; T-SPOT.TB: TP, TN, FP, FN.
[...] This means that, while the ratio of the odds of having a negative test result in a TB patient to the odds of the same result in a healthy child is similar for the three tests, the ratio of the odds of having a positive test result in a diseased patient to the odds of the same result in a healthy child is much higher using QFT-IT and T-SPOT than using TST. This makes these tests useful in clinical practice as they allow clinicians to make a diagnosis of active TB [42]. The improved specificity in healthy children confirms previous evidence [12,43,44], encouraging the primary use of QFT-IT or T-SPOT for case finding among healthy children and young patients [45]. These children may also fail to present for TST reading, as previously suggested by Lewinsohn et al. [8]. From a Public Health perspective, our results provide an opportunity to consider the use of these tests in screening too. In fact, even though all the tests we have assessed showed similar sensitivities, IGRAs, unlike TST, do not require a second visit to assess results, which may be problematic for large and specific populations [46]. Furthermore, IGRAs have been suggested to be more accurate than TST in immunocompetent people [47] and make it possible to distinguish individuals who have been previously vaccinated, which could represent an advantage for screening. In fact, IGRAs have already been used to screen children during the investigation of potentially exposed newborns in a Teaching Hospital [48] and the use of IGRAs in a "one step" approach has also been proposed in other contexts [49,50].
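The likelihood-ratio argument above can be made concrete with a short calculation. This sketch derives LR+ and LR- from the pooled sensitivities and specificities given in the abstract using the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. These derived values are illustrative only and need not equal the pooled likelihood ratios estimated in the meta-analysis; the 10 % pre-test probability is a hypothetical value chosen for the example.

```python
# (sensitivity, specificity) pooled estimates as reported in the abstract.
POOLED = {
    "TST": (0.882, 0.863),
    "QFT-IT": (0.896, 0.954),
    "T-SPOT": (0.885, 0.968),
}


def likelihood_ratios(sensitivity: float, specificity: float) -> tuple:
    """Return (LR+, LR-) from sensitivity and specificity."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


def post_test_probability(pre_test_probability: float, lr: float) -> float:
    """Update a pre-test probability with a likelihood ratio via odds."""
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * lr
    return post_odds / (1.0 + post_odds)


lrs = {test: likelihood_ratios(se, sp) for test, (se, sp) in POOLED.items()}

# Hypothetical pre-test probability of active TB of 10 %.
after_positive = {
    test: post_test_probability(0.10, lr_pos) for test, (lr_pos, _) in lrs.items()
}
```

With these inputs, LR- is of similar size for the three tests, whereas LR+ is several times larger for QFT-IT and T-SPOT than for TST, so a positive IGRA result shifts the probability of active TB further than a positive TST does.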

Limitations
Our study has a number of limitations. First, all the studies fulfilling our inclusion criteria considered small populations. There is a small number of published studies focused on children, especially those aged <5 years. In fact, caution should be exercised when considering the preferential use of IGRAs in immunocompetent children aged <5 years; a warning to this effect was added to the national guidelines in the United States in a recent update [51]. Another limitation is the heterogeneity of studies, particularly concerning different and specific age groups. We did not perform a sub-analysis according to TB burden (low versus high) because of the small number of studies we were able to include. For the same reason, a funnel plot was not used to investigate potential publication bias. The rate of indeterminate results (inadequate interferon-γ response to the positive control (PHA/mitogen) due to anergy, excessive interferon-γ in the negative control or, only for T-SPOT, insufficient cells, <250,000 cells/100 μl) among children, which is considered an important impediment to the use of IGRAs in clinical practice for children [51], was not available in all the included studies. Further research should evaluate the safety, social and ethical implications, organizational impact, and cost-effectiveness of IGRAs using a Health Technology Assessment approach.

Conclusions
QFT-IT and T-SPOT have a higher specificity than TST for detecting active TB cases in immunocompetent children, providing evidence to inform the choices available to clinicians. These tests may be used as complementary tests to support the clinical diagnosis of active TB and may also be considered as part of public health responses.