High specificity of line-immunoassay based algorithms for recent HIV-1 infection independent of viral subtype and stage of disease

Background: Serologic testing algorithms for recent HIV seroconversion (STARHS) provide important information for HIV surveillance. We have shown that a patient's antibody reaction in a confirmatory line immunoassay (INNO-LIA™ HIV I/II Score, Innogenetics) provides information on the duration of infection. Here, we sought to further investigate the diagnostic specificity of various Inno-Lia algorithms and to identify factors affecting it.
Methods: Plasma samples of 714 selected patients of the Swiss HIV Cohort Study infected for longer than 12 months and representing all viral clades and stages of chronic HIV-1 infection were tested blindly by Inno-Lia and classified as either incident (up to 12 months) or older infection by 24 different algorithms. Of the total, 524 patients received HAART, 308 had HIV-1 RNA below 50 copies/mL, and 620 were infected by a non-B clade of HIV-1. Using logistic regression analysis we evaluated factors that might affect the specificity of these algorithms.
Results: HIV-1 RNA <50 copies/mL was associated with significantly lower reactivity to all five HIV-1 antigens of the Inno-Lia and impaired specificity of most algorithms. Among 412 patients either untreated or with HIV-1 RNA ≥50 copies/mL despite HAART, the median specificity of the algorithms was 96.5% (range 92.0-100%). The only factor that significantly promoted false-incident results in this group was age, with false-incident results increasing by about 3% per additional year. HIV-1 clade, HIV-1 RNA, CD4 percentage, sex, disease stage, and testing modalities exhibited no significance. Results were similar among 190 untreated patients.
Conclusions: The specificity of most Inno-Lia algorithms was high and not affected by HIV-1 variability, advanced disease and other factors promoting false-recent results in other STARHS. Specificity should be good in any group of untreated HIV-1 patients.


Background
Information on HIV incidence is necessary for monitoring the dynamics of the HIV epidemic in affected countries and assessing the effectiveness of preventive measures targeted at major risk populations. Consequently, serologic testing algorithms for recent HIV seroconversion (STARHS) have been developed. These tests make use of the fact that both the concentration and affinity of HIV antibodies during the first few months of HIV infection are lower than at later stages [1][2][3][4]. STARHS require a special assay of reduced sensitivity, hence they are also called 'detuned' assays. The reduced sensitivity renders these tests unsuitable for diagnosis of HIV infection and restricts their use to epidemiologic studies. For a systematic epidemiologic monitoring it would be advantageous if information on the proportion of recent infections could be gained prospectively and systematically from the tests used anyway to diagnose HIV infection.
We have shown that a patient's antibody reaction in a commercial line immunoassay, the Inno-Lia™ HIV I/II Score (Inno-Lia), provides information on the duration of infection similar to that of a commercial enzyme immunoassay (EIA) for STARHS, the so-called BED-EIA [5,6]. The Inno-Lia is a kind of second-generation Western blot and measures antibodies to different HIV antigens in a semi-quantitative way. The pattern and intensity of HIV-specific antibodies both evolve during the first weeks to months after infection. It is thus possible to define algorithms which, with a certain diagnostic sensitivity and specificity, recognize early and late antibody patterns. Based on the number of cases ruled recent by the Inno-Lia and the known values for sensitivity and specificity it is then possible to calculate the proportion of infections of up to 12 months duration in a group by using a simple formula [5]. As the Inno-Lia is a confirmatory HIV test, it is convenient to prospectively test all newly diagnosed patients and to notify the results to the respective health authority, which will calculate periodically the proportion of recent infections among the different transmission risk groups.
The diagnostic sensitivity and specificity of each algorithm are crucial for this method. If they are not correct, estimates of recent infections will not be accurate. We estimated these parameters for a total of 12 algorithms in a baseline study of newly diagnosed patients with HIV-1 infection of either less or more than 12 months duration, as judged by the treating physicians of these patients. The estimates for sensitivity resulting from this study varied between 20 and 50%, while specificity was between 92 and 100%. The algorithm that distinguished best between incident and older infection had a sensitivity of 50.3% and a specificity of 95.0% [5]. As the study was prospective, it was difficult to know whether the treating physician's judgment on the duration of each infection was correct. Follow-up information on the course of HIV-1 RNA and CD4+ T cell concentrations over time, which is sometimes necessary for differentiating between severe primary and advanced HIV infection, was not available at the time of diagnosis, and the reliability of the staging information in that first study is therefore open to question. For example, some of the patients classified as CDC stage B or C by the diagnosing physician, but ruled recent by the Inno-Lia algorithms, may actually have suffered from severe acute HIV infection [7][8][9]. Another well-known cause for a false classification is infection by non-B subtypes of HIV-1 [10,11]. Patients infected with non-B viruses may produce antibodies of reduced avidity to the subtype B antigens frequently employed in serologic tests, thus leading to false classification as recent. Similarly, the waning antibody titers to some HIV proteins in advanced immunodeficiency may lead to false classification as recent infection [12][13][14][15][16][17].
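To make concrete why the estimate depends so strongly on sensitivity and specificity, the following is a minimal sketch of a standard misclassification correction (often called the Rogan-Gladen estimator). It is an assumption that this resembles the "simple formula" of reference [5], which is not reproduced in this article. The function name and the counts (100 of 400 cases ruled recent) are hypothetical; the sensitivity of 50.3% and specificity of 95.0% are the baseline-study values quoted above.

```python
def corrected_recent_fraction(n_ruled_recent, n_total, sensitivity, specificity):
    """Correct the observed fraction of 'recent' results for test error.

    Assumed form (Rogan-Gladen); the formula in reference [5] may differ.
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    youden = sensitivity + specificity - 1.0
    if youden <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    observed = n_ruled_recent / n_total
    estimate = (observed + specificity - 1.0) / youden
    # Clamp to the valid range, since sampling noise can push the estimate outside [0, 1].
    return min(1.0, max(0.0, estimate))


# Hypothetical notification cohort: 100 of 400 new diagnoses ruled recent.
with_reported_spec = corrected_recent_fraction(100, 400, 0.503, 0.950)  # about 0.44
with_lower_spec = corrected_recent_fraction(100, 400, 0.503, 0.920)     # about 0.40
print(with_reported_spec, with_lower_spec)
```

In this hypothetical case, a specificity that is three percentage points lower than assumed shifts the estimated recent fraction by roughly four percentage points, which is why the specificity needs to be known accurately.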
Reliable information on the diagnostic performance of our method is thus still lacking, and the true diagnostic sensitivity and specificity of the Inno-Lia algorithms for recent infection still have to be established. Towards this goal, we have conducted two studies. One study, to be published elsewhere, will determine the diagnostic sensitivity in a cohort of patients diagnosed at the time of primary HIV infection. That study will also investigate the overall diagnostic performance of the algorithms and present a validation of the method in consecutive annual cohorts of HIV notifications.
In contrast, the goal of the present study was to further assess the specificity of the algorithms in HIV-1 patients known to have been infected for longer than 12 months and to identify factors that might influence the outcome of the algorithms. Of particular interest was the question whether there was an impact by the HIV-1 subtype or an advanced stage of disease.

Ethics statement
This study was conducted as a nested project in the framework of the Swiss HIV Cohort Study (SHCS; see http://www.shcs.ch ) [18]. The ethical committees of all participating institutions, i.e., the HIV outpatient clinics and laboratories of seven Swiss hospitals (the university hospitals of Basel, Berne, Geneva, Lausanne, Zurich, and the cantonal hospitals of Lugano and St Gallen), have approved the general study protocol, and all participating patients have given their written informed consent to the goals of the SHCS and its research projects, including this one.

Patients and samples
The study investigated a single plasma or serum specimen from a total of 714 patients of the SHCS. The patients and their specimens were selected for the study in spring 2008. All patients had been infected with HIV-1 for at least 12 months, as demonstrated by either a documented first positive HIV test or registration into the SHCS at least 12 months prior to sample date. The patients originated from 7 different SHCS treatment centers and represented all clinical stages and CD4+ strata. The HIV-1 subtype of all patients was known based on the recorded results of genetic resistance testing in the reverse transcriptase and protease regions of the pol gene. Patients were selected with the aim that all viral subtypes and circulating recombinant forms (CRF) were represented with 30 samples both in CDC stages A and B and with 40 samples in stage C. If there were more patients of a given subtype per stage, the patients required were selected randomly. If there were fewer, all available were selected. Of all but 12 patients, a plasma aliquot stored at -70°C was used for testing. For 12 patients a frozen serum sample stored at -20°C was used instead.

Serological differentiation of recent and older HIV-1 infection
All samples were number-coded and tested retrospectively, batch-wise by the Inno-Lia™ HIV I/II Score assay (Innogenetics, Ghent, Belgium). Testing was conducted in 7 different accredited laboratories including 6 HIV regional confirmatory laboratories commissioned by the Swiss Federal Office of Public Health (SFOPH) and the Swiss National Center for Retroviruses (SNCR), which serves as the national HIV reference laboratory and is also commissioned by the SFOPH. All 7 labs are accredited according to the international standard ISO/IEC 17025 by the governmental Swiss Accreditation Service SAS (see http://www.seco.admin.ch/sas/index.html?lang=en). All had participated already in the first study of Inno-Lia based recent infection assessment and were experienced with the test [5].
The Inno-Lia is a Western blot-like line immunoassay that measures antibodies against recombinant proteins or synthetic peptides of HIV-1 group M, HIV-1 group O, or HIV-2, which are coated as 7 discrete lines on a nylon strip with plastic backing. As each test strip also contains three quantitative internal standards, a semiquantitative ranking of the different antibody reactions is possible [19,20].
All assays were performed between Oct 2008 and Jan 2009 and involved 4 different lots of test kits. The manufacturer's 16-h sample incubation protocol was used for all tests. In 3 labs, on a total of 498 samples, testing was conducted on CE-marked Auto-Lia 48 or Autoblot 3000 test automats (both from Innogenetics). In 4 labs, on a total of 216 samples, testing was performed manually. Antibody reaction to each of the 7 HIV antigen bands present on the test strips (sgp120 [including group O peptides], gp41, p31, p24 and p17 of HIV-1, and sgp105 and gp36 of HIV-2) was assessed either visually (in three of the four labs that used manual testing on a total of 123 samples) or by the automated scanner-based LiRAS system (Innogenetics) (in 4 labs; 591 samples). Based on the three internal standards, which define reaction levels of 0.5 (+/-), 1 and 3 for each test strip, the antibody reaction to each HIV antigen was classified into one of six possible intensity scores (0, 0.5, 1, 2, 3, or 4).

Inno-Lia algorithms
Twenty-four algorithms (Algs) for recent HIV-1 infection were developed empirically by investigating which Inno-Lia antibody patterns were found at maximal frequency in a group of patients with less than 12 months of infection (= recent or incident infections) and at minimal frequency in a group of patients with ≥12 months duration of infection (= older infections). Twelve of the algorithms, Alg02 to Alg13, are as published [5]. The other 12 were developed more recently based on the same dataset [5]. All 24 algorithms were applied to the collected Inno-Lia data. Thus, each Inno-Lia result was classified by 24 algorithms as representing either a recent or older HIV-1 infection.
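As an illustration of how such a rule-based algorithm operates on the semi-quantitative band scores, the sketch below applies a single rule to a few samples and computes the resulting specificity. The rule uses only the term 'p31 = 0 AND p24 ≥ 2', which the Discussion quotes as a component of several algorithms; the complete algorithm definitions are in Table 2 and are not reproduced here. The function names and the four toy samples are hypothetical and are not study data.

```python
from typing import Callable, Dict, List

# Band scores per sample; allowed values are 0, 0.5, 1, 2, 3 or 4.
BandScores = Dict[str, float]


def example_rule(scores: BandScores) -> bool:
    """Return True if the sample is ruled 'recent'.

    Illustrative rule only: it implements the single term
    'p31 = 0 AND p24 >= 2', not any complete published algorithm.
    """
    return scores["p31"] == 0 and scores["p24"] >= 2


def specificity(samples: List[BandScores], rule: Callable[[BandScores], bool]) -> float:
    """Specificity among samples that are all known older infections.

    Every 'recent' call in such a panel is a false-recent result.
    """
    if not samples:
        raise ValueError("need at least one sample")
    false_recent = sum(1 for s in samples if rule(s))
    return 1.0 - false_recent / len(samples)


# Hypothetical band scores of four patients infected for more than 12 months.
toy_samples = [
    {"sgp120": 3, "gp41": 4, "p31": 2, "p24": 3, "p17": 2},
    {"sgp120": 2, "gp41": 3, "p31": 0, "p24": 3, "p17": 1},  # meets the rule: false-recent
    {"sgp120": 3, "gp41": 4, "p31": 1, "p24": 2, "p17": 0.5},
    {"sgp120": 1, "gp41": 3, "p31": 0, "p24": 1, "p17": 0},
]
print(specificity(toy_samples, example_rule))
```

Because all patients in a specificity panel are known to have older infections, the calculation needs no reference classification beyond the selection criterion itself.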

Data evaluation and statistics
The results of Inno-Lia testing and the clinical data of the SHCS were linked only after all testing was completed. Differences between means were analyzed by the nonparametric Mann-Whitney U test, differences in frequency by contingency tables and Fisher's exact test, and correlations by nonparametric Spearman's rank correlation. Predictors of result of Inno-Lia algorithms (incident or older infection) were evaluated by univariate and multivariate logistic regression analysis. Independents analyzed included person-related parameters (sex, age, time since registration into the SHCS), disease-related factors (CDC stage, CD4+ T-cell count and percentage, treatment status, duration of HAART, HIV-1 RNA concentration as by commercial RT-PCR assays from Roche), and testing modalities (type of specimen, storage duration, lot number of test kit, modes of testing and result evaluation, laboratory which stored the samples and performed the testing). All statistical analyses, as well as the classification of the Inno-Lia results by the 24 recent infection algorithms, were performed in the StatView 5.0 program for Macintosh (SAS Institute, Cary, North Carolina, U.S.A.).

Results
A total of 714 stored plasma or serum samples from patients who participated in the SHCS and had been infected by HIV-1 for at least 12 months were tested by the Inno-Lia HIV I/II Score assay, as described under Methods. The main epidemiological, virological and immunological characteristics of the patients are summarized in Table 1. Owing to the selection for non-B clade infections, which in our country are more frequent in women than in men, the two sexes were represented at about equal numbers. Roughly half of the patients were classified as CDC stage A, 22% were in stage B and 28% in stage C. Almost three quarters of the patients had received HAART for a median duration of 13.5 months, resulting in 308 patients who presented with an HIV-1 RNA concentration below 50 copies/mL. HIV-1 RNA among the 406 patients with HIV-1 RNA ≥50 copies/mL amounted to a median of 10^3.94 copies/mL. The majority of the patients (86.8%) were infected by non-B clades comprising a total of 15 different clades in addition to subtype B.

Influence of HAART
The group of patients receiving HAART at the time of testing had significantly lower concentrations of HIV-1 RNA than those untreated (10^1.59 copies/mL compared to 10^4.25 copies/mL; p < 0.0001, Mann-Whitney U test). They also had significantly less intense reactions in the Inno-Lia with respect to viral proteins sgp120 (p = 0.029), p31 (p < 0.0001), p24 (p = 0.0003) and p17 (p = 0.0013). In contrast, the intensity of antibodies to gp41 was similar in both treated and untreated patients (p = 0.17), and the minimal band intensity was 1.0 independent of the treatment status. There was also a strong association between viral load and band intensity (Figure 1). The 308 patients (including 6 who were treatment-naïve) with HIV-1 RNA <50 copies/mL had on average significantly lower intensity of all bands than the 406 patients with ≥50 copies/mL, particularly with respect to sgp120, p31 and p17 (p < 0.0001 for all three) and somewhat less with respect to gp41 (p = 0.003) and p24 (p = 0.008). The result of the Inno-Lia algorithms, recent or older infection, was affected likewise by treatment status and viral load (not shown). Multivariate logistic regression analysis combining HIV-1 RNA, treatment status and duration of HAART as independents showed, however, that the treatment status and the duration of HAART had no significant effect on the result of any algorithm. In contrast, the viral load level (more or less than 50 copies/mL) was a significant determinant for the outcome of all algorithms except Alg03, Alg03.1, Alg05 and Alg06 (data not shown).
Based on these first findings we determined the diagnostic specificity of the 24 Inno-Lia algorithms among the 190 treatment-naïve patients (including 6 patients with HIV-1 RNA <50 copies/mL) and the 222 patients with a viral load ≥50 copies/mL despite receiving HAART. Algorithm specificity among these 412 patients extended from 92.0% to 100%, with a median of 96.5% (Table 2). Perfect specificity (100%) was obtained with the single-band algorithms Alg03 and Alg03.1; Alg06 was least specific (92.0%). Specificity of the algorithms among the 190 HAART-naïve patients alone was similar (median 95.5%, range 93.2-100%).

Investigation of factors that affect algorithm specificity
Using logistic regression analysis, we sought to identify the factors that affected the result of the various algorithms in the total of the 714 patients. Alg03 could not be analyzed, as it was 100% specific. Results for the remaining 23 algorithms are summarized in Figures 2 and 3. There were predictors that promoted false-recent results and others which protected against these. Most of the effects were not distributed randomly, but were associated with distinct groups of algorithms.
In the univariate analysis (Figure 2), the strongest and most consistent predictors of algorithm result included the HIV-1 RNA level, the CD4+ T cell percentage (CD4%) or count, sex, HAART status, age, and CDC stage. HIV-1 RNA <50 copies/mL, CD4% or count, age, and receiving HAART promoted a recent infection result. Other promoting factors included, in decreasing order, testing in certain laboratories compared to the one taken as reference, a long duration of sample storage, or being infected with the circulating recombinant form (CRF) CRF01_AE. Conversely, HIV-1 RNA concentration (in log(copies/mL)), female sex or, for some algorithms, being in CDC stages B or C compared to A were factors that protected against a recent infection result. Other protective factors for some algorithms included being infected with CRF02_AG or subtypes A, C or D, or manual Inno-Lia testing. There were also some sporadic associations with the type or volume of the stored specimen or lot number of test kit. No associations were seen for duration of HAART, time since registration into the SHCS and mode of result scoring (visual versus automated).
The multivariate analysis of factors that affected algorithm specificity (Figure 3) was performed with all parameters that had shown at least one significant association in the univariate analysis. There was strong co-linearity between CD4 count and CD4%, the two parameters for HIV-1 RNA, as well as testing laboratory and mode of testing. We therefore excluded CD4 count and testing laboratory from the analysis. Regarding HIV-1 RNA, we excluded log(copies/mL) in favor of the statistically stronger level.
The multivariate analysis confirmed the importance of HIV-1 RNA, CD4%, sex and age. Specifically, for the 20 algorithms for which an effect of HIV-1 RNA was demonstrated, <50 copies/mL was associated with a roughly fivefold increase in false-recent results. Furthermore, for those algorithms in which age promoted a false-recent result, the testing of serum stored at -20°C instead of plasma stored at -70°C appeared to be a further promoting factor. There were only 12 serum samples, however, which limits the weight of this finding. Advanced clinical stage lost the protective effect seen in univariate analysis in all algorithms but one. Sample volume, duration of sample storage, and modes of testing and result evaluation retained no significance. Test kit lot #3 was again associated with a lower specificity when using Alg07 or, as a trend, Alg13.1. Close inspection of the data showed, however, that the great majority of the samples, namely 617 (86.4%), had been tested with lot #1. Only 32 specimens (4.5%) had been tested with lot #3, too few to permit any conclusions regarding possible variations in lot quality.

No influence by HIV clade
Compared to HIV-1 subtype B, infection with CRF01_AE remained significantly associated with an increased proportion of false-recent results by Alg02, Alg03.2, Alg07 and, as a trend, Alg17. Closer inspection showed, however, that this effect was restricted to patients receiving HAART. Similarly, the apparent protective effects of infections by CRF02_AG or subtypes A, C and D also turned out to be associated with HAART. With Alg04, e.g., there were 13 false-recent results among 62 treated patients infected with subtype B (21%), while the respective numbers for CRF02_AG were 4/72 (5.6%). Thus, among treated patients, those infected with CRF02_AG had a significantly lower risk for false-recent results than those infected with subtype B (p = 0.009). In contrast, among untreated patients, the proportions of false-recent results between CRF02_AG (1/24, 4%) and subtype B (4/32, 12.5%) differed less (p = 0.38). Again, only 5 of the false-recent results occurred among HAART-naïve patients. Similar relationships were found with respect to the apparent protective effects of subtypes A, C and D compared to B (not shown).
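The two Fisher's exact tests quoted for Alg04 can be recomputed from the counts given in the text. The sketch below uses only the Python standard library and defines the two-sided p-value as the summed probability of all tables that are no more likely than the observed one; it is an assumption that this convention matches the one used by the statistics program of the study. The function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are at most as probable as the observed table.
    Integer weights are compared, so no floating-point tolerance is needed.
    """
    row1, row2, col1 = a + b, c + d, a + c

    def weight(x):
        # Unnormalised probability of a table with x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x)

    low, high = max(0, col1 - row2), min(row1, col1)
    observed = weight(a)
    extreme = sum(weight(x) for x in range(low, high + 1) if weight(x) <= observed)
    return extreme / comb(row1 + row2, col1)


# Treated patients, Alg04: subtype B 13 of 62 false-recent, CRF02_AG 4 of 72.
p_treated = fisher_exact_two_sided(13, 62 - 13, 4, 72 - 4)
# Untreated patients, Alg04: CRF02_AG 1 of 24 false-recent, subtype B 4 of 32.
p_untreated = fisher_exact_two_sided(1, 24 - 1, 4, 32 - 4)
print(p_treated, p_untreated)
```

With only 5 false-recent results among the untreated patients, the second comparison has little power, which is consistent with the cautious wording of the text.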
In a next step to determine the relevance of the factors leading to false-recent results, we narrowed the analysis to those 412 patients who were either HAART-naïve or exhibited HIV-1 RNA ≥50 copies/mL despite receiving HAART (Figure 4). The analysis was further restricted to those independents which in Figure 3 had shown significant effects with at least two algorithms. Thus, CDC stage, HAART, sample volume, storage duration, modes of testing and scoring, and kit lot were no longer in the model. Alg03.1 had no false-recent result and could not be analyzed.

Age impairs algorithm specificity
The only variable that retained broad significance in this setting was age, which significantly promoted false-recent results in 6 algorithms and showed a trend in a further 8. On average, the rate of false-recent results increased with each additional year of age (mean odds ratio 1.037 across the algorithms). Even more than in Figure 3, the remaining weak trends for either promoting or protective effects are based on too few cases to be of any relevance. When finally focusing the investigation on the 190 HAART-naïve patients, univariate analysis revealed age as a factor that significantly promoted false-recent results in four algorithms and showed a trend in two further ones (Figure 5).

[Figure 5, caption fragment: Odds ratios of variables of particular interest and their 95% confidence intervals are shown in the text. HIV-1 RNA was used as a continuous, logarithmized parameter, and concentrations below the lower limit of detection were set to 1 copy/mL.]

CD4% showed additional weak protective effects with Alg07 and Alg09, while HIV-1 RNA showed further protective effects with Algs 07, 09, 10, 16, and 17. Age lost its effect with Alg06, but gained a new one with Alg10. Exclusion of the 6 cases with HIV-1 RNA <50 copies/mL led to the loss of all protective effects of HIV-1 RNA, while the effects of age and CD4% remained. This suggested that HIV-1 RNA <50 copies/mL promoted false-recent results also among untreated patients, while there was no effect among the higher concentrations. With regard to CD4%, close inspection of the data revealed no evidence for an association of low CD4% with low antibody intensities, and antibody intensities among patients in CDC stage C were on average higher than in stage A. Therefore, the weak effects of CD4% were not attributable to patients in advanced stage of disease. In cross-comparison of Figures 3, 4 and 5, age clearly promoted false-recent results in all groups.
Independent of the statistical significance in individual algorithms, the mean odds ratio for age among the algorithms differed little between the analyses of Figures 3, 4, and 5 and amounted to 1.021, 1.037 and 1.032, respectively, thus suggesting a relative increase in false-recent results of about 3% per additional year of life. An HIV-1 RNA below 50 copies/mL promoted false-recent results in both treated and untreated patients; above this level there was, however, no effect of the concentration. With respect to CD4%, the strongly promoting effect in Figure 3 was strictly associated with long-term, successful HAART, as it was no longer present when HIV-1 RNA was above 50 copies/mL or when patients were HAART-naïve (Figure 5). If the weak protective effects in this latter group are real, they were not attributable to patients in the most advanced stage of disease. All other factors, including HIV-1 clade, had no effect.
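A per-year odds ratio from a logistic regression compounds multiplicatively over an age difference, which the following minimal sketch illustrates. The three mean odds ratios are those given above; the 10-year age gap, the 5% baseline false-recent rate, and the function names are hypothetical choices made for illustration only.

```python
def odds_ratio_over_years(or_per_year, years):
    """Odds ratio for an age difference of `years`, given a per-year odds ratio."""
    return or_per_year ** years


def shift_probability(baseline_prob, odds_ratio):
    """Apply an odds ratio to a baseline probability and return the new probability."""
    if not 0.0 < baseline_prob < 1.0:
        raise ValueError("baseline_prob must lie strictly between 0 and 1")
    odds = baseline_prob / (1.0 - baseline_prob) * odds_ratio
    return odds / (1.0 + odds)


# Mean per-year odds ratios for age from the analyses of Figures 3, 4 and 5.
for or_per_year in (1.021, 1.037, 1.032):
    or_10y = odds_ratio_over_years(or_per_year, 10)
    # Hypothetical baseline: 5% false-recent results in the younger group.
    print(or_per_year, round(or_10y, 2), round(shift_probability(0.05, or_10y), 3))
```

For rare outcomes such as false-recent results, the odds ratio approximates the relative risk, which is why a per-year odds ratio of about 1.03 can be read as an increase of roughly 3% per year.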

Discussion
The principal goal of the study was to determine the specificity of more than 20 Inno-Lia algorithms developed for estimating the fraction of recent infections in cohorts of HIV-1 infected patients [5]. A second aim was to identify possible factors that impair the specificity of the algorithms. Of particular importance was whether non-B clades of HIV-1 or advanced stages of immune deficiency would lead to false-recent results. These investigations are a first step of an ongoing overall evaluation of this new method, and without full knowledge of the sensitivity of the algorithms and their overall performance, which are investigated in a separate study to be published elsewhere, no definite conclusions should be drawn as to the suitability of this method for assessment of the recent infection rate in a population.
In order to answer the questions addressed in the present study, we retrospectively tested frozen specimens from well-characterized patients of the SHCS [18]. All patients, 86.8% of them selected for infection with non-B clades of HIV-1 and 73.4% receiving HAART for a median duration of more than one year, were in the chronic stage of infection and had been infected for longer than 12 months (Table 1). These 714 patients clearly represented older HIV infections as by our definition [5] and provided suitable conditions for an analysis of factors that affected algorithm specificity.
The high specificity of the 24 Inno-Lia algorithms (Table 2) already indicated that HIV-1 clade could have but a small effect. This was confirmed by univariate and multivariate logistic regression analysis (Figures 2 and 3). Both showed that the non-B clades that were available at sufficiently high numbers, i.e., subtypes A, C, D, F, G and J, as well as CRF01_AE, CRF02_AG and CRF06_CPX, did not affect the specificity in a relevant manner. Apparent promoting effects of CRF01_AE for false-recent results in Algs 02, 03.2, 07 and 17 were upon closer inspection found to be restricted to patients receiving HAART. Similarly, apparent protective effects of CRF02_AG and subtypes A, C and D were also restricted to patients receiving HAART. Both types of effects lost significance when the analysis was restricted to patients with no or only incompletely effective HAART. We conclude that these effects were largely treatment-associated and will not exert a sizeable effect in newly diagnosed, untreated patients.
In contrast to virus clade, there were some parameters that affected the algorithms in a consistent and highly significant manner. One of these predictors was HIV-1 RNA <50 copies/mL, which was associated with a lower antibody intensity against all five HIV-1 antigens (Figure 1) and promoted false-recent results in most algorithms (Figure 3). HIV-1 RNA retained no significance when restricting the analysis to patients with no or incompletely effective HAART (Figure 4), thus confirming that only a very low or undetectable viral load would lead to false-recent results.
Our finding that HAART or, respectively, the low or undetectable viral load resulting from prolonged HAART was associated with a reduction in the concentration of HIV-specific antibodies in chronically infected patients is in contradiction to other reports. Although several studies have shown a delayed seroconversion, or partial seroreversion, in patients in whom HAART was started during acute infection or shortly thereafter [21][22][23][24][25], two studies of about 80 patients each found no reduction of HIV antibodies in chronically infected patients successfully treated with HAART for at least 5 years [24,25]. In contrast to these two studies and despite a shorter treatment duration, the average intensity of all five HIV antibody specificities in patients with <50 copies/mL HIV-1 RNA under HAART was significantly lower in the present study (Figure 1). This indicates a modest, but clear effect of HAART on antibody concentrations in chronically infected patients.
A high CD4% (or CD4+ count) was another factor strongly associated with false-recent results in the analysis of all 714 patients (Figure 3). This association is more difficult to understand and is probably the result of several superimposed effects. Further analysis revealed that patients who were receiving HAART and had higher CD4% than the median (≥21.4%) showed significant inverse correlations between CD4% and all antibodies except those to p24 (Spearman's rank correlation; p < 0.01 in all instances). In contrast, HAART-naïve patients or those with CD4% below the median did not exhibit such a correlation. In combination, these results may suggest that the association of high CD4% and low antibodies is an effect of HAART, whereby the patients that regain the highest CD4% are also those most likely to experience a decrease in their HIV-specific antibodies. The fact that CD4% had no significance in Figure 4 and even exhibited some protective effects in Figure 5 suggests that its promotion of false-recent results in Figure 3 is also a treatment-associated artifact. In HAART-naïve patients, a high CD4% may possibly protect against false-recent results in certain algorithms, but there was no indication that the patients with the most advanced disease were prone to false-recent results.
Sex, age, and testing serum instead of plasma were further frequent predictors of a false-recent result when investigated in the entire collective (Figure 3). Age promoted false-recent results in all algorithms that contained the term 'p31 = 0 AND p24 ≥ 2' (Algs 10 to 13.1, 15 and 16; see Table 2). Patients older than 35 years had twice as many false-recent results with Alg10 as the younger ones (9.9% vs. 4.6%, Fisher's exact test p < 0.01). They also exhibited a significantly lower mean intensity of p31 antibodies (p < 0.01). Age retained significance in the analyses of Figures 4 and 5 and exhibited similar average odds ratios between all analyzed groups. It is thus a factor that should also lead to some false-recent results in newly diagnosed patients. This finding fits the well-known age-dependent weakening of the antibody responses to viral antigens such as those present in viral vaccines [26][27][28][29].
Other factors including sex, using different lots of test kits, or testing serum instead of plasma, which appeared to affect the specificity of some algorithms when tested in all 714 patients (Figure 3), lost all significance when tested in HAART-naïve patients or those with a viral load ≥50 copies/mL despite HAART (Figure 4). As these factors are logically independent of HAART, they should also have no relevance when testing untreated patients.

Limitations
For assessment of specificity of the algorithms and of factors that may affect specificity, a cohort of untreated patients would have been optimal. HAART has been the standard of care for patients with a certain degree of immunodeficiency for more than a decade, however, and it was impossible to meet this goal. Only 190 patients were HAART-naïve and only 20 of them were in CDC stage C.
Nevertheless, we consider our results to be valid for newly diagnosed, untreated patients, for the following reasons: Since HAART reduces the viral load and because an undetectable viral load in turn is associated with weaker antibodies, the antibody reactions in untreated patients will be stronger, which should result in even fewer false-recent results than found here. As a matter of fact, when the 714 patients were stratified according to CDC stage, the individual antibody intensities, as well as their sum, were higher in the untreated patients in all stages. Similarly, when stratification was for CD4% higher or lower than the median, the untreated patients had higher antibody intensities, except for p24, but the sum of all antibodies was higher again. This illustrates that single band patterns that would promote a false-recent result are successfully 'diluted out' or counteracted by suitably defined combination algorithms. Of note, some combination algorithms, in particular Alg14, but also Algs 11 to 13.1, appeared to be affected very little by all investigated variables.
Nevertheless, we cannot exclude the possibility that other factors than those investigated here may affect the specificity of the method. It is thus advisable to predetermine the diagnostic performance of the test before transferring it to a new setting.

Conclusions
The present study shows that the specificity of more than 20 Inno-Lia algorithms for recent infection is high. The specificity was clearly impaired by increasing age and an HIV-1 RNA load below 50 copies/mL, but not by the HIV-1 clade. Other variables, including sex, CDC stage, HAART without effective virus control, modalities of testing and result evaluation, did not matter. Similarly, for most algorithms there was no evidence for impairment by low CD4%. Some algorithms remained largely unaffected by all variables. We therefore expect that these algorithms should have a high specificity in all possible settings of untreated HIV-1 infected patients. Provided that they also exhibit a good diagnostic sensitivity and good overall performance, which are both assessed in a different study, they might become valuable tools for monitoring the rate of recent HIV-1 infections among newly diagnosed patients.

List of abbreviations
STARHS: serologic testing algorithms for recent HIV seroconversion; Inno-Lia: the INNO-LIA™ HIV I/II Score test; SHCS: Swiss HIV Cohort Study; HAART: highly active antiretroviral therapy; Alg: algorithm; OR: odds ratio.