To the best of our knowledge, this is the first nationwide phylodynamic study examining the characteristics of an HIV-1 subtype B-driven epidemic in a population comprising over half of all HIV-positive patients diagnosed in the 13-year study period. The HIV-1 epidemic in Slovenia is mainly driven by homosexual intercourse and by local transmission: eighty percent of patients have an identified local transmission link (meaning that they were either infected by someone locally or infected someone locally). Several introductions of HIV-1 into Slovenia have been observed after 1986. Interestingly, a faster substitution rate was seen in Slovenian sequences, which suggests low transmission rates and a slowly growing epidemic.
The HIV-1 epidemic in Slovenia results from several introductions after 1986 which then propagated mainly through local transmissions
In order to understand the dynamics of HIV-1 subtype B transmission in Slovenia, we analyzed a dataset including fifty-two percent (223/427) of all newly diagnosed patients over the 13-year time span between 2000 and 2012. Since MSM are the most vulnerable population for HIV infection in Slovenia, male gender was over-represented (94% of all included patients). Indeed, a significant association between male gender and subtype B had already been determined in a previous study conducted in Slovenia [9].
The majority of infected patients belong to large transmission clusters: 65.5% of patients belonged to a transmission cluster together with 9 or more individuals. When smaller transmission clusters of 2–4 individuals are also included, 80.7% of Slovenian patients infected with a subtype B virus have a local transmission link. This is also the reason why few non-Slovenian sequences could be found when the BLAST search tool was used to select additional control sequences for the analysis. This supports the finding that, in Europe, infections are acquired predominantly from patients of the same country [27].
All the identified clusters were nested within subtype B control sequences selected from GenBank, indicating that the HIV epidemic in Slovenia results from several introductions of HIV into the country. tMRCA values obtained from the analysis of the complete dataset indicate that HIV-1 subtype B was introduced into Slovenia at different time points from the late 1980s onward. The two earliest-dated clusters of Slovenian sequences place the introduction of HIV into the country in 1986. This was also the year when the first AIDS patient was diagnosed and reported in Slovenia, indicating consistency between epidemiological data and Bayesian tMRCA estimates. Even then, most infections were found among MSM, as they are today; this vulnerable risk group thus probably imported HIV into Slovenia at different time points.
The HIV-1 epidemic in Slovenia is mainly driven by homosexual intercourse and through local transmission
In the current study, three factors were found to be significantly associated with patients belonging to a large cluster: the patient reporting Slovenia as the country in which HIV infection occurred, diagnosis of HIV after 2004, and no DRM detected. The first was expected, is consistent with the finding of large local transmission clusters mentioned above, and is in line with the study of Frentz et al. (2013) [27]. A possible explanation for the second is that the period around 2004 was when a substantial pool of HIV-infected patients had formed in Slovenia, thus making HIV acquisition within the country, as opposed to abroad, more probable. On the other hand, the predominance of more recent infections within clusters can also be explained by the process of cluster identification. The definition of a transmission cluster is still a topic of discussion and no consensus has been reached. In any case, since divergence accumulates quickly within a host after infection, and given the currently proposed definitions of a transmission cluster, transmission clusters with a higher number of recent infections will be preferentially selected to the detriment of those in which infections were acquired over a longer time period. In such recent infections, less divergence has accumulated; evolutionary distances will therefore be shorter and, as a result, the statistical support for the cluster (aLRT, bootstrap, posterior probability) will be higher. In this study, since evolutionary distances were not considered as a criterion for the definition of transmission clusters, we checked the age of each cluster by investigating its depth in years. The mean (and median) depth of our clusters was 18.6 (20.1) years, which is comparable with the mean (or median) depth found in other studies, ranging between 7 and 35 years [2,4,28-36].
The third finding showing statistical significance was that most patients with DRMs were not included in transmission clusters, nor did they have a transmission link. This could imply that most transmitted drug resistance (TDR) is imported, confirming the particular importance of HIV-1 drug resistance testing of newly-diagnosed treatment-naive patients infected abroad, as is the current national strategy. Alternatively, it could indicate a lower fitness of viruses harboring TDR, with lower transmissibility of such strains.
Interestingly, 40% of patients reporting that infection happened abroad did not have a transmission link, compared to only 15% among those reporting Slovenia as their country of infection. This may be due to sampling bias, since sampling in other countries, where the infection was acquired, might not be as substantial. However, this leaves 60% with a local transmission link found, even though the infection was supposedly acquired abroad. This could mean that a large proportion of patients infected abroad further transmit the disease locally. On the other hand, it is possible that these patients were not certain of their source of infection and had actually been infected in Slovenia. Uncertainty in patients’ reports of the most probable source of transmission, especially in populations that are sexually more active, has been observed in a study by Resik et al. (2007) examining two transmission networks in Cuba [37].
More than half of the large clusters comprised male individuals only and, furthermore, 2 clusters had individuals reporting only homosexual contact as their route of infection. However, the other 3 male-only clusters included individuals who reported heterosexual contact. Since no female individuals were found in these clusters, this could be explained by missing transmission links due to sampling or by individuals not disclosing their true sexual orientation due to stigma. The latter was described recently in a study by Hue et al. (2014), where a 1–11% misclassification of homosexuals as reported heterosexuals was noted [38]. Most of the heterosexual transmissions seem to be limited to small clusters, without further spread of the infection, with only a few in large clusters (possibly women infected by bisexual partners). All things considered, our results show that local HIV transmission in Slovenia is mainly driven by homosexual transmission clusters.
Discrepancies in the results obtained analyzing the complete dataset and smaller separate cluster datasets
To test tMRCA estimates based on Bayesian methods, the tMRCA for a particular clade was obtained using two different calibration strategies. First, the complete set of Slovenian sequences with corresponding control sequences drawn from GenBank was used for calibrating the dates; second, only cluster strains were used for calibration. For two of the clusters, the analysis using the entire dataset showed consistently earlier tMRCA estimates compared to the analysis using cluster strains only. For all other clusters, the analysis based on cluster sequences only did not reach convergence. The discrepancies in tMRCA found for the clusters with both estimates may be caused by dramatic fluctuations of evolutionary rate, for example because of bottleneck events or the introduction of HIV into populations with different epidemic characteristics. It has been noted before that even the relaxed clock model with an uncorrelated lognormal distribution, as used here, cannot cope with extreme differences in rates across branches. Wertheim et al. conducted an analysis on more divergent data (HIV group M), examining tMRCAs of different HIV subtypes, and concluded that heterotachy was responsible for this discrepancy. Relaxed clock analysis was able to detect rate changes in a separate clade, although the molecular clock model considerably underestimated the impact of this change. After applying different approaches to clarify these discrepancies, the authors concluded that none of the approaches was able to resolve them [39]. Our results further corroborate such findings. Interestingly, the 95% HPD intervals of the clusters' tMRCAs and substitution rates were wider in the full analysis than in the analyses with only the cluster strains, although one would generally expect more data to generate more confident results.
For 95.5% of patients included in the study, results of an incidence algorithm characterizing patients as having a recent infection (RI) or a long-standing infection (LSI) were available. A window period of 155 days was set as the cut-off for differentiating RI from LSI, meaning that when a patient is characterized as having an RI, the infection was acquired within the 155 days before sampling. When comparing these “BED” estimated times of infection with the tMRCAs obtained from Bayesian analysis, we found that the tMRCAs were earlier in both the full analysis and the separate analysis based on cluster sequences. One explanation for these earlier estimates lies in the definition of the MRCA: the Most Recent Common Ancestor corresponds to the strain that gave rise to the infections of that cluster. This strain should therefore have originated before the estimated time of infection, inside the body of the patient who transmitted it to one of the patients in the cluster. As such, it is to be expected that the estimated tMRCA will be earlier than the time of infection estimated by the BED algorithm. This agreement supports the credibility of the Bayesian tMRCA estimates for these clusters. However, these results should be interpreted with caution, since the BED test used in the incidence algorithm does not allow the timing of infection to be determined for an individual, due to variations in immune system response, so the stated intervals of the supposed timing of infection based on these data have limitations.
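The comparison described above can be illustrated with a minimal sketch. It assumes that an RI result bounds the infection date to the 155 days before sampling and that tMRCAs are expressed as decimal years; the function names and the example values in the usage note are hypothetical and are not taken from the study data.

```python
from datetime import date, timedelta

WINDOW_DAYS = 155  # RI/LSI cut-off stated in the text


def earliest_infection_date(sampling_date, recent):
    """Earliest infection date implied by an RI result; None for LSI.

    For LSI the test only says the infection is older than the window,
    so no lower bound on the infection date is available.
    """
    if recent:
        return sampling_date - timedelta(days=WINDOW_DAYS)
    return None


def decimal_year(d):
    """Convert a calendar date to a decimal year (e.g. 2010.5)."""
    start = date(d.year, 1, 1)
    end = date(d.year + 1, 1, 1)
    return d.year + (d - start).days / (end - start).days


def tmrca_precedes_infection(tmrca_year, sampling_date, recent):
    """True if the tMRCA is earlier than the earliest RI infection date.

    Returns None for LSI patients, where the comparison is undefined.
    """
    bound = earliest_infection_date(sampling_date, recent)
    if bound is None:
        return None
    return tmrca_year < decimal_year(bound)
```

For a hypothetical RI patient sampled on 4 June 2010, the window opens on 31 December 2009, so a hypothetical cluster tMRCA of 2005.3 would precede the infection interval, as expected from the definition of the MRCA.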
All in all, these results open discussion about the accuracy and interpretation of tMRCA estimates when analyzing large datasets that represent different epidemic settings, including bottleneck events and different transmission dynamics.
The faster substitution rate of Slovenian sequences suggests low transmission rates, a slowly growing epidemic consistent with the small HIV-1 effective population size in Slovenian clusters
In the study of Abecasis et al. (2009), the substitution rate of the pol region of subtype B was estimated at 0.001 substitutions/site/year [40]. Compared with this, the rate obtained for the major clusters of Slovenian sequences is more than 10-fold faster. This, together with the finding of a low epidemic growth rate, is in line with the observation that the evolutionary rate of HIV-1 slows down when the epidemic rate increases, i.e., in a slow epidemic with small numbers of transmissions, the evolutionary rate of the virus will be faster. The HIV epidemic in Slovenia is indeed still small, which explains and corroborates the finding of such a fast substitution rate in the Slovenian population of HIV-1 infected patients. It has been observed that if transmissions occur mostly from individuals in the early stage of infection, there is no major impact of the host immune system, so less selective pressure is applied and fewer mutations accumulate in the transmitted virus [41]. Taken together with the fast substitution rate observed here, this suggests that the Slovenian epidemic is probably driven mostly by individuals with an established long-lasting infection.
In order to determine trends in the HIV epidemic in Slovenia, analyses reconstructing population growth were additionally carried out on the two major clusters of Slovenian HIV sequences. In the Cluster 1 analysis, a rise in the effective population size was seen in 2003. Interestingly, in a previous analysis of the epidemic in 2006, sequences found in clusters were predominantly obtained from individuals infected during or after 2003, indicating that this period was important for the spread of HIV within the country [10]. As already mentioned, this was confirmed in the present study, since patients diagnosed with HIV after 2004 were found significantly more often in a large cluster. In fact, the HIV incidence in Slovenia in 2005 was more than 3-fold higher than in 2003 (35 vs. 11 newly diagnosed patients per 1,000,000 inhabitants; P < 0.05) [42].
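As a quick arithmetic check of the incidence comparison above, the fold change can be computed directly from the two rates quoted in the text. The helper name `fold_change` is hypothetical; only the two incidence values come from the paragraph.

```python
def fold_change(later, earlier):
    """Ratio of a later rate to an earlier rate."""
    if earlier <= 0:
        raise ValueError("earlier rate must be positive")
    return later / earlier


# Newly diagnosed patients per 1,000,000 inhabitants, as quoted in the text
INCIDENCE_2003 = 11
INCIDENCE_2005 = 35

ratio = fold_change(INCIDENCE_2005, INCIDENCE_2003)
print(f"2005 vs. 2003 incidence: {ratio:.2f}-fold")
```

The ratio is about 3.18, consistent with the statement that incidence in 2005 was more than 3-fold higher than in 2003. The sketch does not reproduce the significance test, which would require the underlying counts.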