Diagnostic accuracy of tests to detect Hepatitis C antibody: a meta-analysis and review of the literature

Background Although direct-acting antivirals can achieve sustained virological response rates greater than 90% in persons infected with hepatitis C virus (HCV), at present the majority of HCV-infected individuals remain undiagnosed and therefore untreated. While a wide range of HCV serological tests is available, there is a lack of formal assessment of their diagnostic performance. We undertook a systematic review and meta-analysis to evaluate the diagnostic accuracy of available rapid diagnostic tests (RDTs) and laboratory-based enzyme immunoassays (EIAs) in detecting antibodies to HCV. Methods We used the PRISMA checklist and Cochrane guidance to develop our search protocol. The search strategy was registered in PROSPERO (CRD42015023567). The search focused on hepatitis C, diagnostic tests, and diagnostic accuracy within eight databases (MEDLINE, EMBASE, the Cochrane Central Register of Controlled Trials, Science Citation Index Expanded, Conference Proceedings Citation Index-Science, SCOPUS, Literatura Latino-Americana e do Caribe em Ciências da Saúde and WHO Global Index Medicus). Studies were included if they evaluated an assay to determine the sensitivity and specificity of HCV antibody (HCV Ab) in humans. Two reviewers independently extracted data and performed a quality assessment of the studies using the QUADAS tool. We pooled test estimates using the DerSimonian-Laird method, using the software R and RevMan 5.3. Results A total of 52 studies were identified that included 52,673 unique test measurements. Based on five studies, the pooled sensitivity and specificity of HCV Ab RDTs were 98% (95% CI 98-100%) and 100% (95% CI 100-100%) compared to an EIA reference standard. High HCV Ab RDT sensitivity and specificity were observed across screening populations (general population, high risk populations, and hospital patients) using different reference standards (EIA, nucleic acid testing, immunoblot).
There were insufficient studies to undertake subanalyses based on HIV co-infection. Oral HCV Ab RDTs also had excellent sensitivity and specificity compared to blood reference tests, at 94% (95% CI 93-96%) and 100% (95% CI 100-100%), respectively. Among studies that assessed individual oral RDTs, eight studies showed that OraQuick ADVANCE® had a slightly higher sensitivity (98%, 95% CI 97-98%) than the other oral brands (pooled sensitivity: 88%, 95% CI 84-92%). Conclusions RDTs, including oral tests, have excellent sensitivity and specificity compared to laboratory-based methods for HCV antibody detection across a wide range of settings. Oral HCV Ab RDTs had good sensitivity and specificity compared to blood reference standards. Electronic supplementary material The online version of this article (10.1186/s12879-017-2773-2) contains supplementary material, which is available to authorized users.


Background
Hepatitis C is a liver disease caused by the hepatitis C virus (HCV) that causes acute and chronic infection [1,2]. An estimated 71 million people had chronic hepatitis C infection worldwide in 2015 [3]. Viral hepatitis caused 1.34 million deaths in 2015, a number comparable to deaths caused by tuberculosis and higher than those caused by HIV [3]. The introduction of direct-acting antivirals (DAAs) has led to a sustained virological response (SVR) in greater than 90% of treated individuals [4,5]. DAAs are now recommended by the World Health Organization (WHO) [1] and many other HCV treatment guidelines [1]. DAAs will not only improve SVR rates but also may simplify HCV management algorithms and allow smaller health facilities to manage HCV-infected individuals [6]. Despite the availability of effective treatment, most HCV-infected individuals remain undiagnosed and untreated [7]. Left untreated, approximately 15-30% of individuals with chronic HCV infection progress to cirrhosis, leading to end-stage liver disease and hepatocellular carcinoma [1,2].
In February 2016 the WHO updated the guidelines for the screening, care, and treatment of persons with chronic hepatitis C infection [1]. These guidelines included recommendations on whom to screen for HCV and how to confirm HCV infection, but not which tests are optimal for initial screening. Advances in HCV detection technology create new opportunities for enhancing screening, referral, and treatment. Previous systematic reviews on HCV infection have focused on treatment response [8,9], clinical complications [10], and epidemiology [11,12]. Two previous systematic reviews on hepatitis C testing have focused on evaluating point-of-care tests compared to EIAs and other reference tests [13,14]. We have undertaken a further systematic review and meta-analysis to generate pooled sensitivity and specificity of rapid diagnostic tests used to detect HCV antibody (HCV Ab), and to inform the development of recommendations on serological testing in the 2017 WHO testing guidelines [15].

Research question
The main purpose of the review was to assess the diagnostic accuracy of available assays for detecting HCV Ab in persons identified for hepatitis C testing. The research question was structured in PICO format (i.e., population, intervention, comparisons and outcome). P: Persons identified for HCV testing; I: Rapid diagnostic tests and enzyme immunoassays for HCV Ab detection; C: 1) EIA (with a subanalysis based on the last 10 years); 2) NAT (nucleic acid testing); 3) immunoblot or similar assay; 4) a combination of 1, 2 and 3 above; O:

Search strategy and identification of studies
Search strategies were developed by a medical librarian with expertise in designing systematic review searches. Our search algorithm consisted of the following components: hepatitis C, diagnostic tests, and diagnostic accuracy. We searched MEDLINE (OVID interface, 1946 onwards), EMBASE (OVID interface, 1947 onwards), the Cochrane Central Register of Controlled Trials (Wiley interface, current issue), Science Citation Index Expanded (Web of Science interface, 1970 onwards), Conference Proceedings Citation Index-Science (Web of Science interface, 1990 onwards), SCOPUS (1960 onwards), Literatura Latino-Americana e do Caribe em Ciências da Saúde (LILACS) (BIREME interface) and WHO Global Index Medicus. The search was supplemented by searching for ongoing studies in WHO's International Clinical Trials Registry. The literature search was limited to English-language studies in human subjects that were available up to April 30th, 2015. In addition to searching databases, we contacted individual researchers and authors of major trials to ask whether any relevant manuscripts were in preparation or in press. The reference lists of published articles found in the above databases were searched for additional pertinent materials.
Study selection proceeded in three stages: 1) titles/abstracts were screened by a single reviewer according to standard inclusion and exclusion criteria; 2) full manuscripts were obtained and assessed for inclusion by two independent reviewers; 3) two independent reviewers extracted all data. Differences were resolved by a third independent reviewer.

Selection criteria
The inclusion criteria were the following: the primary purpose was HCV Ab test evaluation, the study reported sensitivity and specificity of HCV Ab test kits, and the study was published before May 2015. We included observational studies and randomised controlled trials (RCTs) that provided original data from patient specimens. Studies that reported only sensitivity or only specificity, conference abstracts, comments or review papers, panel studies, and studies that used reference assays only for positive samples were excluded. In this manuscript, a panel study refers to a laboratory evaluation that uses a series of blood samples with confirmed hepatitis C serostatus to assess the accuracy of a testing kit.

Data extraction
Information on the following variables was extracted from each individual study: first author, total sample size, country (and city) of sampling, sample type (oral fluid, finger prick, venous blood), point-of-care (POC, defined as being able to give a result within 60 min and having the results to guide clinical management in the same encounter), eligibility criteria, reference standard, manufacturer, raw cell numbers (true positives, false negatives, false positives, true negatives), antibody-antigen combo (yes or no), sources of funding, reported conflict of interest, and study population (general population, high risk population and hospitalized population). The high risk population groups include men who have sex with men, sex workers and their clients, transgender people, people who inject drugs, and prisoners and other incarcerated people [16]. The hospitalized population was defined as those admitted to a hospital for medical care or observation. We also verified whether assays evaluated in the studies were currently on the market (as of June 1st, 2017), and if this was the case, we also reported the available version of the testing kit (Table 1).

Assessment of methodological quality
Study quality was evaluated using the QUADAS-2 tool [17] and the STARD checklist [18]. QUADAS-2 includes domains to evaluate bias in the following categories: risk of bias (patient selection, index test, reference standard, flow, and timing) and applicability concerns (patient selection, index test, reference standard). The STARD checklist consists of 25 items and a flow diagram that authors can use to ensure that all relevant information is present.

Data analysis and synthesis

Data synthesis
Data were extracted to construct 2 × 2 tables. By comparing with reference standard results, the index test results were categorized as a true positive, a false positive, a false negative, or a true negative. Indeterminate test results were not included in pooled analyses.
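As a minimal sketch of how one such 2 × 2 table translates into accuracy estimates, the Python snippet below computes sensitivity and specificity with 95% confidence intervals. The function names, the example counts, and the choice of the Wilson score interval are illustrative assumptions; the review does not state which interval method was applied to individual studies.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (assumed CI method)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def accuracy_from_2x2(tp, fp, fn, tn):
    """Sensitivity and specificity of an index test against a reference standard.

    tp, fp, fn, tn are the four cells of the 2 x 2 table; indeterminate
    results are assumed to have been excluded beforehand.
    """
    return {
        "sensitivity": tp / (tp + fn),
        "sensitivity_ci": wilson_interval(tp, tp + fn),
        "specificity": tn / (tn + fp),
        "specificity_ci": wilson_interval(tn, tn + fp),
    }


if __name__ == "__main__":
    # Hypothetical study: 100 reference-positive and 200 reference-negative samples.
    print(accuracy_from_2x2(tp=95, fp=2, fn=5, tn=198))
```

The interval is computed separately for each measure because sensitivity uses only the reference-positive samples and specificity only the reference-negative ones.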

Statistical analysis
To estimate test accuracy, we calculated sensitivity and specificity for each study and pooled statistics, along with 95% confidence intervals [19]. We pooled test estimates using the DerSimonian-Laird method, a bivariate random-effects model. We did further subanalyses based on reference standard (EIA alone; NAT or immunoblot; EIA, NAT, or immunoblot), brand, sample type, and combination test. We performed all statistical analyses (including the assessment of heterogeneity with the Q test) using the software R and RevMan 5.3.
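The pooling step can be sketched as follows. This is a simplified, univariate illustration of the DerSimonian-Laird random-effects estimator applied to logit-transformed proportions (for example, per-study sensitivities), together with Cochran's Q; it is not the authors' R or RevMan code. The 0.5 continuity correction, the function names, and the example counts are assumptions made for illustration.

```python
import math


def logit_with_variance(events, total):
    """Logit of a proportion and its approximate variance.

    A 0.5 continuity correction is added to both cells (an assumption)
    so that studies with 100% sensitivity or specificity remain usable.
    """
    a = events + 0.5
    b = total - events + 0.5
    return math.log(a / b), 1.0 / a + 1.0 / b


def dersimonian_laird(estimates, variances):
    """Random-effects pooled estimate using the DerSimonian-Laird tau^2."""
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, estimates)) / sum_w
    # Cochran's Q statistic for heterogeneity.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, estimates)) / sum(re_weights)
    se = math.sqrt(1.0 / sum(re_weights))
    return pooled, se, tau2, q


def pool_proportions(counts, z=1.96):
    """Pool (events, total) pairs and back-transform to the proportion scale."""
    logits, variances = zip(*(logit_with_variance(e, n) for e, n in counts))
    pooled, se, tau2, q = dersimonian_laird(logits, variances)

    def expit(x):
        return 1.0 / (1.0 + math.exp(-x))

    return {
        "pooled": expit(pooled),
        "ci": (expit(pooled - z * se), expit(pooled + z * se)),
        "tau2": tau2,
        "q": q,
    }


if __name__ == "__main__":
    # Hypothetical (true positives, reference positives) from three studies.
    print(pool_proportions([(95, 100), (180, 200), (49, 50)]))
```

Pooling on the logit scale keeps the back-transformed confidence limits between 0 and 1, which matters when most studies report estimates close to 100%.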
There were insufficient data to undertake a subanalysis based on HIV co-infection or other co-infections.

Assessment of the quality of the studies
All studies used a cross-sectional or case-control design. The risk of bias in patient selection, index test, or reference standard was assessed using QUADAS-2 (Table 2). Among the included studies, 25 had at least one category that was considered high risk [19, 22, 25-28, 30, 31, 34, 36-39, 41, 45-50, 53, 55, 56, 58-62]. The risk of bias in patient selection usually came from a poor description of patient selection and clinical scenario. Bias in the index test was primarily due to a lack of reported blinding while reading test results. Bias in the reference standard was due to the use of multiple reference standards (EIA, NAT, and/or immunoblot). Bias in the flow and timing was primarily due to a lack of reported details.

Diagnostic accuracy
Overall clinical performance of assays
The 52 included studies contributed 127 data points from 52,273 unique test measurements. Some studies contributed additional data points by comparing the accuracy of two or more tests, reporting data from multiple study sites, or reporting the accuracy of a test in more than one type of specimen. The sample sizes of the included studies ranged from 37 to 17,894. Sensitivities of included studies ranged from 22 to 100%, and specificities ranged from 77 to 100%. The overall pooled sensitivity and specificity for all tests were 97% (95% CI: 97%-98%) and 99% (95% CI: 98%-99%) respectively. Figure 2 shows estimates of sensitivity and specificity from each study.

Manufacturers and accuracy of RDTs among included studies
Overall, 32 studies evaluated the accuracy of 30 different RDTs (Table 3). The most commonly evaluated test kit was the OraQuick ADVANCE® from OraSure Technologies. For the three studies that were conducted within the last 10 years [25,49,51], the total sample size was 12,992, with pooled sensitivity and specificity of 99% (95% CI 99%-100%) and 100% (95% CI 100%-100%), respectively.
GRADE approach (Grading of Recommendations, Assessment, Development and Evaluation) to assessing overall quality of evidence

GRADE for RDT versus EIA
HCV Ab RDTs showed sensitivity and specificity comparable to those of EIAs. Among the five studies that evaluated RDTs versus EIA, 15,943 samples were evaluated, and a moderate risk of bias was observed (Table 4), but there was a consistently high level of specificity. Since the unit of analysis varied among studies (Table 4), indirectness was observed. In addition, the overall strength of the pooled evaluation was moderate, with pooled sensitivity and specificity of 99% (95% CI 98%-100%) and 100% (95% CI 100%-100%), respectively. Assuming a pre-test probability of 5%, the post-test probability after a positive test result was 97%, and the post-test probability after a negative test result was 100%.

GRADE for oral RDT versus blood reference
Oral HCV Ab RDTs had sensitivity and specificity comparable to blood reference standards (Additional file 7). Across the 12 studies that evaluated oral RDTs versus a blood reference, 14,547 samples were evaluated. A moderate risk of bias was observed. Inconsistency was present for sensitivity, as the sensitivities of the included studies varied, but there was a consistently high level of specificity. Since the unit of analysis varied among the included studies (Table 4), indirectness was observed. In addition, the overall strength of the pooled evaluation was moderate, with pooled sensitivity and specificity of 94% (95% CI 93%-96%) and 100% (95% CI 100%-100%), respectively. Assuming a pre-test probability of 5%, the post-test probability after a positive test result was 94%, and the post-test probability after a negative test result was 100%.
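The post-test probabilities quoted in the GRADE summaries follow from Bayes' theorem applied to the pre-test probability and the test's sensitivity and specificity. The sketch below shows the calculation; the function name and the input values are illustrative assumptions. In particular, a specificity of exactly 100% would give a positive post-test probability of 100%, so the published figures were presumably derived from unrounded estimates, and the value of 0.997 used here is hypothetical.

```python
def post_test_probabilities(pre_test, sensitivity, specificity):
    """Return post-test probabilities for a positive and a negative result.

    The first value is the probability that HCV Ab is present given a
    positive test; the second is the probability that HCV Ab is absent
    given a negative test.
    """
    true_pos = pre_test * sensitivity
    false_pos = (1 - pre_test) * (1 - specificity)
    false_neg = pre_test * (1 - sensitivity)
    true_neg = (1 - pre_test) * specificity
    if true_pos + false_pos == 0 or true_neg + false_neg == 0:
        raise ValueError("inputs give no positive or no negative results")
    after_positive = true_pos / (true_pos + false_pos)
    after_negative = true_neg / (true_neg + false_neg)
    return after_positive, after_negative


if __name__ == "__main__":
    # 5% pre-test probability as in the text; 0.997 is a hypothetical
    # unrounded specificity chosen only for illustration.
    pos, neg = post_test_probabilities(0.05, sensitivity=0.94, specificity=0.997)
    print(f"after positive: {pos:.1%}, after negative: {neg:.1%}")
```

With these assumed inputs the positive post-test probability is roughly 94%, which illustrates how a specificity only slightly below 100% is enough to pull the figure noticeably below 100% when the pre-test probability is low.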

Discussion
There is a global need to expand HCV diagnostic testing. In this meta-analysis, we found that HCV Ab RDTs, including those using oral fluid, showed high overall sensitivity and specificity compared to laboratory-based EIAs. This extends the literature by including several new studies that were not included in prior reviews, as well as a sub-analysis focused on the use of RDTs with oral fluid. In addition, the evidence collected from this review was used to inform recommendations in the 2017 WHO guidelines on testing for hepatitis B and C [15]. The evidence for generally high levels of diagnostic accuracy across most brands from this systematic review and meta-analysis supported a strong recommendation for the use of HCV RDTs in WHO testing guidelines [15]. Our data suggest that RDTs can be used for HCV Ab detection in a wide range of clinical settings. For example, of the included studies, 17 were conducted among general populations, 20 among high risk populations, and 17 among hospitalized patients (two studies included two kinds of populations). High HCV Ab RDT sensitivity and specificity were observed across multiple populations (including general population, high risk populations, and hospital patients), which is consistent with previous systematic reviews [13,14,63]. The use of an EIA to detect HCV Ab followed    [64,65]. However, despite these recommendations, HCV Ab EIA assays have not been widely used because of the complexity of laboratory-based assays, long turnaround time, high cost and requirements for specialized apparatus and trained technicians [13]. To overcome these barriers, RDTs for HCV Ab screening were developed [66]. They obviate the need for multiple follow-up appointments, shorten wait times, and allow for the simplification and decentralization of testing (Additional file 8).
However, it is essential for policymakers, government officials, and health care practitioners engaged in HCV screening, care, and treatment to be aware that the performance of individual RDTs for detection of HCV Ab varies widely. Individual diagnostic accuracy for specific brands should be examined to ensure acceptable performance. Our data suggest that oral fluid RDTs have high sensitivity and specificity. This is consistent with other literature [67]. Tests that can be used with non-invasive samples allow testing to be decentralized further and can be used in outreach settings [68]. Our data suggest that oral tests have a slightly lower pooled sensitivity (94%, 95% CI: 93%-96%) than blood-based tests (98%, 95% CI: 97%-98%) but comparable specificity. Oral HCV Ab RDTs may be particularly useful in contexts where venepuncture is difficult, such as among people who inject drugs whose veins are difficult to access.
With the increasing availability of DAAs, countries are seeking testing kits with high sensitivity and specificity, in order to allow them to scale up HCV Ab screening, especially among at-risk populations. The advantages and disadvantages of EIAs and RDTs are well established [15]. Performance, cost, and accessibility need to be considered. Determining which tests to deploy at which level of the health care system and in which settings requires policymakers to consider the different attributes of laboratory-based EIAs versus blood-based or oral RDTs. Potential trade-offs include accepting slightly lower accuracy in exchange for greater uptake and acceptability of testing, provision of test results, and linkage to care. Each country needs to decide which trade-offs or compromises are acceptable, based not only on disease prevalence and the health care infrastructure but also on technical, socioeconomic, cultural, and behavioral considerations. For example, they need to be clear on whether it is acceptable to buy Test X, which is 10% less accurate than Test Y but is considerably cheaper, so that many more people can be tested. In addition, although oral RDTs are less accurate than blood-based RDTs, it may be that oral RDTs will be more acceptable for outreach testing and accessing at-risk populations and allow control programs to identify more HCV cases. In a low prevalence setting, even a test with 98% specificity can yield more false positive than true positive results. All these trade-offs can be modeled to give an estimate of the cost-effectiveness and potential impact of different strategies for HCV Ab screening.
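The point about low prevalence settings can be made concrete with a small calculation. The sketch below counts expected true and false positives in a screened cohort and finds the prevalence at which the two are equal; the function names, the cohort size, the 1% prevalence, and the 98% sensitivity are illustrative assumptions rather than values taken from the included studies.

```python
def expected_screen_counts(n_screened, prevalence, sensitivity, specificity):
    """Expected true and false positives when screening n_screened people."""
    with_antibody = n_screened * prevalence
    without_antibody = n_screened - with_antibody
    return {
        "true_positives": with_antibody * sensitivity,
        "false_positives": without_antibody * (1 - specificity),
    }


def break_even_prevalence(sensitivity, specificity):
    """Prevalence at which expected false positives equal true positives.

    Solves p * sensitivity = (1 - p) * (1 - specificity) for p; below this
    prevalence, false positives outnumber true positives.
    """
    false_positive_rate = 1 - specificity
    return false_positive_rate / (sensitivity + false_positive_rate)


if __name__ == "__main__":
    # Hypothetical screening programme: 10,000 people, 1% HCV Ab prevalence.
    counts = expected_screen_counts(10_000, 0.01, sensitivity=0.98, specificity=0.98)
    print({name: round(value) for name, value in counts.items()})
    print(f"break-even prevalence: {break_even_prevalence(0.98, 0.98):.1%}")
```

Under these assumptions, about 98 true positives are accompanied by about 198 false positives, and false positives outnumber true positives whenever prevalence falls below about 2%. This is why confirmatory testing matters most in low prevalence settings.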
Our review also underlines some of the common methodological problems encountered in evaluating diagnostic accuracy. Cross-sectional or case-control designs were used by all 52 included studies, introducing a potential risk of bias. These studies used a broad range of reference standards, which makes the pooled performance data less meaningful. Within the evaluation of diagnostic accuracy, even cross-sectional studies in patients with diagnostic uncertainty and direct comparison of test results with an appropriate reference standard can be considered high quality [69]. The majority of the included studies used convenience sampling. In this review, we excluded panel studies because they are not based on clinical settings and our purpose was to generate data that would be relevant to HCV Ab detection in clinical settings. Most studies that reported HIV or HBV co-infection reported the test performance of the kits only among all samples, rather than disaggregated diagnostic accuracy. There were insufficient data from two studies to undertake a subanalysis based on HIV co-infection. It may be important for policymakers to know the diagnostic accuracy of HCV Ab tests among individuals with co-infections, particularly HIV co-infection [70], and this requires further research among co-infected individuals.
Our study is subject to several limitations. First, we included studies conducted among the general population, hospital patients, and high risk populations. Diagnostic performance can be influenced by disease prevalence, and HCV prevalence is variable among these different populations [71,72]. Second, we detected substantial heterogeneity that could influence our confidence in the review findings [73], but addressed this problem through a series of sub-group stratified analyses. Third, about 20 brands of RDT kits were used in the included studies, and their performance varies considerably. This limited our ability to summarize the accuracy of different brands, with the exception of comparing OraQuick to other brands. Another concern is publication bias, as studies with poor test performance may be less likely to be published, leading to distorted estimates of accuracy [74]. Fourth, since not all HCV RDTs can be performed on oral fluid or capillary whole blood (some require plasma or serum), and some of them require a cold chain for storage and transport, the direct comparison between EIAs and RDTs in this meta-analysis is less meaningful. Fifth, we should note that not all test kits are still on the market and that versions of the tests included in this meta-analysis may have since changed. Finally, statistical heterogeneity was present, although this is common in meta-analyses of diagnostic studies. Additional research is important for understanding why the tests perform more poorly in certain populations or settings.

Conclusion
RDTs, including oral tests, have excellent sensitivity and specificity compared to laboratory-based methods for HCV antibody detection across a wide range of settings. National policymakers should take the performance, cost and accessibility of RDTs into consideration when selecting assays for use in their national testing algorithms.