For 506 newly diagnosed patients who consulted a university HIV clinic between 2001 and 2009, data on demographics, infection risk, and laboratory and clinical parameters were supplemented with information obtained from phylogenetic analysis of the HIV-1 sequences. The results revealed an epidemic characterized by high subtype heterogeneity and a high prevalence of non-B infections (40.3%). This heterogeneity and high non-B prevalence have characterized the HIV epidemic in Belgium since its beginning [28, 29]. Our results indicate that in our local cohort, even after years of co-existence of different subtypes, patients infected with subtype B and non-B viruses still represent highly distinct populations with regard to route of transmission (84.5% homosexual transmission for subtype B vs. 9.5% for non-B; p < 0.001), origin (95.3% Caucasian for subtype B vs. 33.7% for non-B; p < 0.001), gender (92.4% male for subtype B vs. 46.1% for non-B) and detection of other STIs (syphilis and/or Chlamydia infection in 59.9% for subtype B vs. 28.1% for non-B; p < 0.001).
In addition, more and larger transmission clusters were identified among subtype B infected patients than among non-B infected patients (19 clusters comprising 143 of the 302 patients infected with subtype B viruses vs. 7 clusters comprising 26 of the 204 patients with non-B infections; p < 0.001). These figures suggest that local onward transmission of subtype B virus contributes substantially to the epidemic. As 95.3% of the subtype B infected individuals are Caucasian and most are local citizens, the chance that their source patients or sexual partners are followed in the same hospital is higher than for the non-B infected patients, of whom 66% are foreigners and a substantial proportion were infected in their country of origin. This might in part explain the observations. Gifford et al. also showed that the majority of non-B infections in the UK in 2007 reflected separate introductions through travel and migration.
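As an illustration of the arithmetic behind these figures, the following minimal sketch recomputes the share of clustered patients per subtype group from the counts quoted above and runs a Pearson chi-square test on the resulting 2x2 table. The choice of test (chi-square without continuity correction) and the helper name `chi_square_2x2` are assumptions made for this example; the text does not state which test produced the reported p-value.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Hypothetical helper for illustration; returns (statistic, p_value).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = 0.0
    for observed, row, col in (
        (a, row1, col1),
        (b, row1, col2),
        (c, row2, col1),
        (d, row2, col2),
    ):
        expected = row * col / n
        chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Counts quoted in the text: clustered patients out of all patients per group.
b_clustered, b_total = 143, 302
nonb_clustered, nonb_total = 26, 204

share_b = b_clustered / b_total
share_nonb = nonb_clustered / nonb_total

chi2, p_value = chi_square_2x2(
    b_clustered,
    b_total - b_clustered,
    nonb_clustered,
    nonb_total - nonb_clustered,
)

print(f"Clustered, subtype B: {share_b:.1%}")
print(f"Clustered, non-B:     {share_nonb:.1%}")
print(f"chi2 = {chi2:.2f}, p = {p_value:.3g}")
```

Under these assumptions, roughly 47% of subtype B patients and 13% of non-B patients fall within a cluster, and the difference is significant at the p < 0.001 level, in line with the reported value.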
Importantly, this study shows a clear association between phylogenetic clustering, homosexual HIV transmission and infection with other STIs. A correlation between infection with closely related viruses, sexual risk and sexually transmitted infections has been shown before in a study performed on a mainly Caucasian and almost exclusively MSM population (91% MSM). Our study included female and male patients of several origins, infected with a variety of HIV subtypes and through several transmission routes.
The observation of one very large transmission cluster of 57 MSM, infected with genetically very similar viruses (mean genetic distance: 0.0108), is striking and alarming. The cluster contains patients diagnosed over the whole 9-year follow-up period, with 8 new inclusions in the last year. Members of this cluster are significantly younger than the rest of the population (p = 0.022) and have more Chlamydia (p = 0.013) and syphilis infections (p < 0.001).
Even after exclusion of this one cluster, the significant association between clustering, MSM infection and syphilis infection remains. Together, these observations indicate that MSM engaging in high-risk behavior constitute the most important source of local onward HIV transmission in our region. These findings corroborate the results of national surveillance programs in Belgium, which reveal a continuous increase in the number of new infections that is attributable to a large extent to increasing incidence in the MSM population (http://www.iph.fgov.be/).
Because of the retrospective nature of this study, only the available information on STI screening results could be used. Distinguishing between an STI acquired before and one acquired after the diagnosis of HIV-1 infection is often impossible. Prospective studies that check for ongoing STIs are necessary to further address the role of STIs in HIV-1 clustering.
Previous phylogenetic studies that concentrated mainly on subtype B infections (Brenner et al., 2007: 90% subtype B) or on patients of Caucasian origin (Yerly et al., 2001: 94% Caucasians) reported an association between clustering and primary HIV infection (PHI). This association was confirmed by our data, although the statistical support was weak (p = 0.017). However, a stronger association between clustering and higher CD4 counts (p = 0.002) supports the relation between clustering and an earlier stage of infection. In contrast to the studies cited above, the stage of infection was unknown for most of the patients included in our study. It is possible that our selection criterion for PHI patients created a bias toward the preferential inclusion of patients who present for regular screening. These individuals are most probably more aware of their risk-taking behavior, and the association seen between clustering and PHI might reflect an association between clustering and high-risk behavior.
One of the major difficulties in using phylogenetic analysis of HIV-1 sequences to identify clusters of onward transmission is the lack of well-defined criteria for cluster identification. Selection of clusters based on bootstrap values only, as done by some [31, 32], carries a risk of erroneously including subtype-specific clustering. On the other hand, bootstrap values can be misleadingly low for clusters with very short branch lengths due to high similarity of the viruses. Selection of clusters based on low genetic distance, as done by others [8, 9, 11, 12, 15, 34], will restrict the inclusion to transmission events that occurred within a short timeframe. Inclusion of reference sequences and the use of robust methods can help identify these problems. For our study, we opted for very strict criteria, based on bootstrap values and Bayesian probability, with an individual check of all clusters with a mean genetic distance >0.015. A drawback is that this might have caused an underestimation of the number of transmission clusters, but it prevents the inclusion of clusters with a low chance of being attributable to local transmission events.
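The combined criteria described above can be pictured as a simple triage rule. The sketch below is only an illustration, not the procedure actually used: the support thresholds `MIN_BOOTSTRAP` and `MIN_POSTERIOR` are hypothetical placeholders (this section gives no numeric cutoffs for them), the candidate clusters are invented, and only the 0.015 mean genetic distance threshold for individual checking comes from the text.

```python
from dataclasses import dataclass

# Hypothetical placeholder cutoffs: this section does not state the
# bootstrap or Bayesian probability thresholds that were applied.
MIN_BOOTSTRAP = 98.0   # percent, assumed for illustration
MIN_POSTERIOR = 0.99   # Bayesian posterior probability, assumed for illustration

# Taken from the text: clusters above this mean genetic distance
# were checked individually.
REVIEW_DISTANCE = 0.015


@dataclass
class CandidateCluster:
    name: str
    bootstrap: float       # bootstrap support of the cluster node (percent)
    posterior: float       # Bayesian posterior probability of the node
    mean_distance: float   # mean pairwise genetic distance within the cluster


def triage(cluster: CandidateCluster) -> str:
    """Return 'reject', 'manual_check' or 'accept' for a candidate cluster."""
    # Both support measures must pass; either one alone is not sufficient.
    if cluster.bootstrap < MIN_BOOTSTRAP or cluster.posterior < MIN_POSTERIOR:
        return "reject"
    # Well-supported but genetically diverse clusters get an individual check.
    if cluster.mean_distance > REVIEW_DISTANCE:
        return "manual_check"
    return "accept"


# Invented candidates, for illustration only.
candidates = [
    CandidateCluster("tight_cluster", bootstrap=100.0, posterior=1.0, mean_distance=0.010),
    CandidateCluster("diverse_cluster", bootstrap=99.0, posterior=1.0, mean_distance=0.030),
    CandidateCluster("weak_support", bootstrap=80.0, posterior=0.95, mean_distance=0.005),
]

for candidate in candidates:
    print(candidate.name, "->", triage(candidate))
```

Requiring both support measures makes the rule conservative, which matches the stated trade-off: some genuine transmission clusters may be missed, but clusters unlikely to reflect local transmission are kept out.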
The presence of mutations possibly associated with reduced sensitivity to antiretroviral drugs was observed in 6.5% of the patients analyzed. This figure is somewhat lower than the overall percentage reported for Belgium (9.5%) and the mean percentage for Western European countries (9%) [35, 36]. The observed higher prevalence of drug resistance mutations (DRM) in subtype B than in non-B infections confirms the results of others. Although investigating the distribution of drug-resistant virus in our population was not one of the major aims of this study, the available information provided some interesting insights into the transmission of these DRM strains, as it demonstrates the frequent transmission of viruses containing 215 revertant mutations. These mutations evolve from 215 resistance mutations after primary infection with resistant strains and are therefore assumed to indicate the latent presence of resistant variants. Highly sensitive sequencing techniques, however, failed to identify these resistant variants in the majority of patients in whom the revertant mutation was detected. These findings and our observations indicate that a substantial number of 215 revertants are the result of onward transmission of revertant strains and that their presence will have no or only limited influence on the virologic response to antiretroviral medication.