Persistence, clearance and reinfection regarding six high risk human papillomavirus types in Colombian women: a follow-up study

Background The design of new healthcare schemes which involve using molecular HPV screening means that both persistence and clearance data regarding the most prevalent types of HR-HPV occurring in cities in Colombia must be ascertained. Methods This study involved 219 HPV positive women in all of whom 6 types of HR-HPV had been molecularly identified and quantified; they were followed up for 2 years. The Kaplan-Meier survival function was used for calculating the time taken for the clearance of each type of HPV. The role of a group of independent variables concerning the time taken until clearance was evaluated using a Cox proportional-hazards regression model or parametric (log-logistic) methods when necessary. Regarding viral load, the Wilcoxon rank-sum test was used for measuring the difference of medians for viral load for each type, according to the state of infection (cleared or persistent). The Kruskal-Wallis test was used for evaluating the change in the women’s colposcopy findings at the start of follow-up and at the end of it (whether due to clearance or the end of the follow-up period). Results It was found that HPV-18 and HPV-31 types had the lowest probability of becoming cleared (1.76 and 2.75 per 100 patients/month rate, respectively). Women from Colombian cities other than Bogotá had a greater probability of being cleared if they had HPV-16 (HR 2.58: 1.51–4.4 95% CI) or HPV-58 (1.79 time ratio: 1.33-2.39 95% CI) infection. Regarding viral load, HPV-45-infected women having 1 × 10^6 to 9.99 × 10^9 viral copies had better clearance compared to those having greater viral loads (1.61 time ratio: 1.01-2.57 95% CI). Lower HPV-31 viral load values were associated with this type’s persistence and changes in colposcopy findings for HPV-16 gave the worst prognosis in women having low absolute load values.
Conclusions HPV infection clearance in this study was related to factors such as infection type, viral load and the characteristics of the cities from which the women came. Low viral load values would indicate viral persistence and a worse prognosis regarding a change in colposcopy findings.


Background
Persistent infection with high-risk human papillomavirus (HR-HPV) types is the main (but not the only) cause of developing pre-cancerous lesions or cervical cancer [1][2][3].
Several types of HPV considered oncogenic have been found to be responsible for around 90% of cancer cases worldwide [3], having high HR-HPV type prevalence rates (i.e. HPV-16, −31, −18, −33, −45 and −58) amongst Colombian women [4]. Our group is aware that multiple infections are of great importance for our population, since more than 40% prevalence has been reported in some regions of the country [4].
HPV infection has been described as a transient phenomenon worldwide, leading to high infection prevalence, even though most cases do not produce cervical lesions and those which do mostly involve low-grade lesions that regress spontaneously [5]. However, Colombia has a high cervical cancer incidence rate and high morbidity and mortality values for a type of cancer which is highly preventable [6,7].
Intrinsic viral determinants, such as the infecting viral type or viral load, have been identified as HPV persistence markers [1,8,9]; however, some socio-environmental factors also play an important role in this type of infection and its possible outcomes [5,10].
The present work seeks to describe both viral and host factors which could be intervening in the persistence and clearance of the most common HR-HPV types in Colombia. This study was mainly aimed at providing epidemiological information illustrating how HPV infection can become eliminated in a target population, bearing the state of the infection (single or multiple) and the most prevalent HR-HPV types in mind.

Study population and ethical considerations
Women eligible for the present study were voluntarily attending cervical screening consultations in three Colombian cities (Chaparral, Girardot and Bogotá) between April 2007 and March 2010.
The present study involved 3 cities: Bogotá, the country's capital, which has an urban population; Chaparral, located in the Tolima department and mainly inhabited by mestizos leading a sedentary lifestyle; and Girardot, in the Cundinamarca department, which has become a tourist destination due to its favourable weather and closeness to Bogotá. Data concerning Chaparral and Girardot were combined in a single category called "other city" to ensure a better analysis of women in this study, as both these cities are small, have similar climates and lie at less than 1,000 masl (Bogotá is at 2,600 masl).
As inclusion criteria, all women signed a written informed consent form and completed a questionnaire regarding their sociodemographic characteristics, sexual behaviour and risk factor data before undergoing a gynaecological examination and providing a cervical smear sample. The signature of a parent or guardian was required for females younger than 18 years old. Women who stated that they did not intend to move from their home cities for at least 2 years after the study began were included in the follow-up study.
Amongst exclusion criteria considered, women who had negative HPV results, those whose samples had little DNA in them (to ensure that all PCR assays were performed satisfactorily) or no amplification for the HMBS gene were not included. Women who were pregnant at base line and who had less than 3 months or more than 9 months until their next visit were also excluded from the study.
The Papanicolaou test was used for analysing samples and HPV-DNA detection; real-time PCR was used for selecting just HPV-positive women. This study was supervised and approved by the relevant ethics committees at Hospital de Engativá Nivel II (in Bogotá), Hospital San Juan Bautista (in Chaparral in the Tolima department) and Nuevo Hospital San Rafael (in Girardot in the Cundinamarca department).

HPV DNA collection, processing and detection by PCR amplification
Cervical samples were collected with a cytobrush and kept in 95% ethanol at 4°C [11]. DNA from these samples was purified using a commercial Quick Extract Solution kit, following the manufacturer's instructions. Samples were homogenised in 200 μL lysis buffer (10 mM Tris-HCl (pH 7.9), 0.45% Nonidet-P-40, 0.45% Tween 20 and 60 mg/mL proteinase K) and incubated for 6 min at 65°C, followed by 2 min at 92°C. Samples were centrifuged at 13,000 rpm for 10 min and the supernatant was removed and stored at 20°C.

Viral load determined by real-time PCR

Primers and probes
Specific primers for each viral type and for Homo sapiens hydroxymethylbilane synthase (HMBS) were synthesised according to a study published by Moberg et al. [14]. The primers described by Moberg et al. amplified the same region for HPV-33 and −58; a new set of primers aimed at the E7 region of these viral types was thus designed. Probes for each viral type and HMBS were designed for four parallel duplex real-time PCRs per patient, taking into account the types included in each reaction (Table 1), with support from Integrated DNA Technologies.

Cloning and sequencing
Processed samples were used as template for PCR (10 μL final volume), containing 0.5 U/μL Mango Taq DNA polymerase (Bioline), 1× Mango Taq Color reaction Buffer, 2 mM MgCl2, 250 nM dNTPs, 1 mM of each primer and DNase-free water to make up the reaction volume. The PCR protocol for each fragment consisted of initial denaturing for 5 min at 95°C, followed by 35 cycles of 30 s at 95°C, 20 s at corresponding melting temperature and 30 s at 72°C. A reaction containing DNA-free water was used as negative control. The amplicons so obtained were purified with a Wizard PCR preps kit (Promega), once their quality had been evaluated on 3.25% agarose gel. A TOPO TA cloning kit was used for ligation, followed by transformation in TOP10 E. coli cells (Invitrogen). Several clones which grew on selective LB plates with 50 μg/mL kanamycin were incubated overnight in LB broth at 37°C and 250 rpm. Recombinant plasmids were purified using an UltraClean mini plasmid prep kit (MO BIO laboratories, California, USA) and sequenced with an automatic ABI PRISM 310 Genetic Analyzer (PE Applied Biosystems, California, USA). Each insert's integrity was checked by aligning the products with the respective theoretical sequenced fragments of each gene using Clustal W software [15].

Real-time PCR
A NanoDrop 2000 (Thermo Scientific NanoDrop Products) was used for quantifying plasmid DNA and the DNA copy numbers were calculated by using the URI Genomics and Sequencing Center web site [16] (Table 1). Standardised RT-PCR assays with 10-fold serial plasmid dilutions (10^11 to 10^6 copies) gave a standard curve for each viral type and HMBS gene (−3.2 to −3.5 slope values). Samples were tested for HPV-16, HPV-18, HPV-31, HPV-33, HPV-45 and HPV-58. The human HMBS (hydroxymethylbilane synthase) gene was amplified in all samples to verify DNA integrity and calculate viral copy number per cell. PCR involved using a CFX96 Touch Real-Time PCR detection system which can detect 6 different fluorescent dyes; four real-time PCR reactions were carried out per sample, one for detecting HPV-16, a second for HPV-18 and −31, a third for HPV-33 and −45 and a fourth one for HPV-58 and HMBS.
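The conversion from a measured plasmid mass to a copy number can be sketched as follows. This is a minimal illustration, assuming the conversion commonly applied by web calculators of this kind (an average mass of about 650 g/mol per base pair of double-stranded DNA); the plasmid length and amount used in the usage note are hypothetical and are not taken from this study.

```python
AVOGADRO = 6.022e23        # molecules per mole
BP_MASS_G_PER_MOL = 650.0  # assumed average mass of one dsDNA base pair


def plasmid_copies(amount_ng: float, length_bp: int) -> float:
    """Estimate the number of dsDNA molecules contained in amount_ng nanograms."""
    if amount_ng < 0 or length_bp <= 0:
        raise ValueError("amount must be non-negative and length positive")
    grams = amount_ng * 1e-9
    moles = grams / (length_bp * BP_MASS_G_PER_MOL)
    return moles * AVOGADRO


def serial_dilution(start_copies: float, steps: int, factor: float = 10.0) -> list:
    """Copy numbers of a dilution series, starting from the most concentrated standard."""
    return [start_copies / factor ** i for i in range(steps)]
```

For example, `plasmid_copies(1.0, 4000)` gives roughly 2.3 × 10^8 copies for 1 ng of a hypothetical 4,000 bp plasmid, and `serial_dilution(1e11, 6)` lists the six 10-fold standards from 10^11 down to 10^6 copies.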
The HPV-16 PCR mix contained 1× reaction buffer, 1.  cervical sample and DNA-free water to complete 20 μL volume.
96-well plates were used for each run, including 6 standards for each viral type and HMBS, involving 10-fold plasmid dilutions (10^11 to 10^6 copy dynamic detection range) and a no template control to rule out DNA contamination.
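How a standard curve of this kind turns a quantification cycle (Cq) into a copy number can be sketched as follows. This is an illustrative least-squares fit, not the instrument software's own routine; the Cq values in the usage note are hypothetical. Under the usual relation E = 10^(−1/slope) − 1, slopes between −3.2 and −3.5 correspond to amplification efficiencies of roughly 105% to 93%.

```python
def fit_standard_curve(log10_copies, cq_values):
    """Least-squares fit of Cq = slope * log10(copies) + intercept."""
    n = len(log10_copies)
    if n != len(cq_values) or n < 2:
        raise ValueError("need at least two paired standards")
    mean_x = sum(log10_copies) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_copies)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_copies, cq_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amplification_efficiency(slope: float) -> float:
    """Efficiency as a fraction (1.0 means perfect doubling per cycle)."""
    return 10 ** (-1.0 / slope) - 1.0


def copies_from_cq(cq: float, slope: float, intercept: float) -> float:
    """Interpolate the copy number of an unknown sample from its Cq."""
    return 10 ** ((cq - intercept) / slope)
```

With hypothetical standards at log10 copies `[11, 10, 9, 8, 7, 6]` and Cq values `[8.7, 12.0, 15.3, 18.6, 21.9, 25.2]`, the fit returns a slope of about −3.3, and a sample with Cq 18.6 is read back as about 10^8 copies.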
The thermal cycling conditions for HPV-18, −31, −33, −45, −58 and HMBS consisted of initial denaturing for 5 min at 94°C, followed by 30 amplification cycles for 10 s at 94°C and 30 s at 53.7°C. Initial HPV-16 denaturing was followed by 30 PCR cycles for 30 s at 54°C and 30 s at 94°C.
Viral load values were given as absolute and normalised. The viral load was normalised to cellular DNA input amount, using the following formula: viral load (HPV copies/cell) = number of HPV copies / (number of HMBS copies / 2) [17].
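The normalisation formula can be written as a short function. This is a minimal sketch; the function name and the example values are illustrative only.

```python
def normalised_viral_load(hpv_copies: float, hmbs_copies: float) -> float:
    """HPV copies per cell, dividing HMBS copies by 2 as in the formula above."""
    if hmbs_copies <= 0:
        # Samples without HMBS amplification cannot be normalised.
        raise ValueError("HMBS copies must be positive")
    cells = hmbs_copies / 2.0
    return hpv_copies / cells
```

For instance, a hypothetical sample with 1 × 10^6 HPV copies and 2 × 10^4 HMBS copies would give `normalised_viral_load(1e6, 2e4)`, that is, 100 HPV copies per cell.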

Statistical analysis
Women who had had both a Pap-smear result and HPV-DNA detected by PCR and who fulfilled the follow-up inclusion criteria (at least 3 follow-ups, leaving 6 to 9 months between visits) were included in the analysis. Women were excluded where the HMBS gene was not amplified by RT-PCR. Analysis was based on type-specific HPV infection rather than on individual women, taking into account that multiple infection is common in the Colombian population [4].
Cox's multivariate regression model was used when calculating sample size; hazard ratios (HR) of at least 2 were thus considered, whenever they had a 5% significance level, 80% power, 0.55 standard deviation of tested covariates and 0.1 correlations between tested covariates. The probability of clearance was set at 0.7, according to previous reports [18,19]. Such suppositions required sample size of at least 86 women. STATA 12 stpower command was used for making the calculations.
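The arithmetic behind this sample size can be reproduced with a closed-form approximation for Cox regression that adjusts for correlation among covariates. This is a sketch under stated assumptions, not a transcription of the STATA routine: it treats the reported 0.1 as the squared multiple correlation (R²) term of the formula and rounds up only at the final step.

```python
import math
from statistics import NormalDist


def cox_required_events(hazard_ratio, sd, r2, alpha=0.05, power=0.80):
    """Approximate number of events needed to detect a given hazard ratio."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)
    log_hr = math.log(hazard_ratio)
    return (z_alpha + z_beta) ** 2 / (sd ** 2 * log_hr ** 2 * (1 - r2))


def cox_sample_size(hazard_ratio, sd, r2, event_prob, alpha=0.05, power=0.80):
    """Events divided by the probability of the event (here, clearance)."""
    events = cox_required_events(hazard_ratio, sd, r2, alpha, power)
    return math.ceil(events / event_prob)
```

With the values stated above, `cox_sample_size(2.0, 0.55, 0.1, 0.7)` requires about 60 events and returns 86 women, which agrees with the reported minimum sample size.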
Clearance was defined as at least two consecutive type-specific HPV DNA samples proving negative, such samples taken at 6-month intervals following a positive sample [20]. Persistence was defined as the identification of the same HPV type in baseline and follow-up samples [21]. The time taken for HR-HPV infection clearance was calculated in months (95% CI), estimated using the Kaplan-Meier survival function.
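The Kaplan-Meier estimate of time to clearance can be illustrated with a small pure-Python sketch. Here the "survival" probability is the probability of an infection still being present, clearance is the event, and infections that persist to the last visit are censored. The follow-up times in the usage note are hypothetical and are not data from this cohort.

```python
def kaplan_meier(times, cleared):
    """Return [(time, probability of still being infected)] at each clearance time."""
    if len(times) != len(cleared):
        raise ValueError("times and cleared must have the same length")
    survival = 1.0
    at_risk = len(times)
    curve = []
    for t in sorted(set(times)):
        events = sum(1 for ti, ci in zip(times, cleared) if ti == t and ci)
        leaving = sum(1 for ti in times if ti == t)  # cleared or censored at t
        if events:
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def median_clearance_time(curve):
    """First time at which the estimated probability of infection drops to 0.5 or less."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached during follow-up
```

For six hypothetical infections with follow-up times `[6, 6, 12, 12, 18, 24]` months and clearance flags `[True, False, True, True, False, False]`, the estimated probability of remaining infected is 5/6 at 6 months and 5/12 at 12 months, so the median clearance time is 12 months.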
The first step was evaluating each variable independently to assess their importance regarding clearance time (Table 2); those variables having a significance level of less than 0.2 in the univariate analysis were included in the multivariable models.
The independent variables included in the multivariable model were city, ethnicity, age of first sexual relationship, number of lifetime sexual partners, family planning method, coinfection and viral load (categorised as low, viral load being lower than 9.99E+5; middle, viral load between 1.00E+6 and 9.99E+9; and high, viral load being higher than 1.00E+10). Their relationship with time taken to clearance was evaluated using a Cox proportional hazards (PH) regression model, or parametric (log-logistic) methods when the PH assumption was violated. The PH assumption was evaluated graphically using log-log plots and with a PH test based on weighted residuals using Grambsch and Therneau tests [22]. The choice of parametric model was based on the Akaike information criterion (AIC) and Bayesian information criterion (BIC).
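The viral load categories can be expressed as a simple classification rule. This is a sketch: the stated limits leave small gaps (for example between 9.99E+5 and 1.00E+6), so the cut-offs below, at 1.00E+6 and 1.00E+10 copies, are an assumption about how boundary values were handled.

```python
LOW_UPPER = 1.00e6      # assumed boundary between low and middle loads
MIDDLE_UPPER = 1.00e10  # assumed boundary between middle and high loads


def viral_load_category(copies: float) -> str:
    """Classify an absolute viral load as 'low', 'middle' or 'high'."""
    if copies < 0:
        raise ValueError("viral load cannot be negative")
    if copies < LOW_UPPER:
        return "low"
    if copies < MIDDLE_UPPER:
        return "middle"
    return "high"
```

Under these assumed cut-offs, a load of 5 × 10^5 copies is classed as low, 9.99 × 10^9 as middle and 1 × 10^10 as high.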
Three categories were assigned to the variable "changes in colposcopy" (alike, improved and worsened) for evaluating changes in colposcopy findings between the start of follow-up and the end of it (whether due to clearance of the virus or not). The Kruskal-Wallis test was used for evaluating the difference in viral load for each viral type according to change in colposcopy findings, since the sample did not have a normal distribution. The Mann-Whitney test was also used for evaluating viral load according to the state of infection (persistent or cleared).
Categorical variable distribution amongst groups was assessed by Chi-squared test or Fisher's test, as appropriate. Median and interquartile ranges were used for quantitative variables, according to the data distribution. Incidence ratios were estimated using months of follow-up as denominator. A ≤ 0.05 p value was considered statistically significant; STATA 12 was used for all statistical analysis.
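A rate that uses months of follow-up as the denominator, expressed per 100 patient-months, can be computed as in the sketch below. The follow-up times in the usage note are hypothetical.

```python
def rate_per_100_patient_months(events: int, follow_up_months) -> float:
    """Events per 100 patient-months of follow-up."""
    total_months = sum(follow_up_months)
    if total_months <= 0:
        raise ValueError("total follow-up time must be positive")
    return 100.0 * events / total_months
```

For example, 3 clearance events among six hypothetical infections followed for `[24, 24, 12, 6, 18, 16]` months (100 patient-months in total) give a rate of 3.0 per 100 patient-months.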

Results
The present work has consolidated data concerning 219 women infected by several HR-HPV types; they became voluntarily incorporated into our follow-up study. All the women included guaranteed to attend a base-line visit and at least 3 follow-up visits with around 6 month difference (±3 months); 23.3% of the population being sampled managed to attend follow-up 4 (i.e. data became available from 5 visits).
Regarding   Table 2). Survival data were estimated for each type of HR-HPV, regardless of single or multiple infection, together with clearance time for each type (Figure 1). Clearance was greatest for HPV-33-infected women, followed by HPV-16-infected women, whilst fewer events per month occurred for HPV-18- and HPV-31-infected women (Table 3). Figure 2 clearly shows that HPV-18 infection was the most persistent; infection had not become resolved in 15 women 2 years later, followed by HPV-31 which was present in 4 women having positive identification for this type during each follow-up visit (infection remaining unresolved by the end of the study). Type-specific viral load values were also determined in this study; Table 3 gives both absolute and normalised viral load values for each type of infection at the start of the study. It is worth highlighting that those infected by HPV-31 had the highest viral load values, even when normalised by the number of cells, whilst HPV-16 gave the lowest viral load values.
Hazard ratios were calculated for two (HPV-16 and −18) of the 4 most prevalent viral types in molecular determination (i.e. HPV-16, −18, −45, −58), bearing in mind the most important variables in univariate models for each type (data not shown) (Table 4). Time ratios were calculated for the remaining types (HPV-45 and −58), since these did not meet the proportional hazards assumption for the aforementioned variables, and the variables could neither be re-categorised nor be assumed to be time-dependent. Regression was therefore modelled using a parametric model, bearing in mind the shape of the hazard function and the AIC and BIC. The log-logistic model gave the best fit for both types according to these criteria.
Multivariate model values showed that the probability of clearance when a woman had HPV-16 or HPV-58 infection became significantly increased in women from another city compared to women living in the capital. It is worth stating that both types belong to the same species (A9).
Regarding the types belonging to species A7, HPV-18 infection did not have a statistically significant association with any variable evaluated here. However, it was observed that the probability of clearance regarding HPV-45 became significantly increased when the absolute viral load for this type ranged from 1 × 10^6 to 9.99 × 10^9 compared to loads equal to or greater than 1 × 10^10.
It should be highlighted that single or multiple HR-HPV infection was not associated with time to clearance in the present sample, since coinfection values for any HPV type were not statistically significant in this model (Table 4).
When evaluating the medians for normalised viral load (per cell) for each type, according to the state of infection at the end of follow-up (cleared or persistent), it was observed that the median for HPV-31 type in the group of women where clearance was found (median = 332.5; IQR = 12,399.72) was greater than the median for those where this virus was not cleared (median = 9.4; IQR = 1,659.98, p = 0.0450). There were no differences in any of the groups regarding the medians for the other HR-HPV types.
The change in colposcopy findings was also evaluated concerning the result at the start of follow-up and the result at the moment of clearance, or at the end of follow-up (Figure 3). There were statistically significant differences regarding HPV-16 concerning absolute values for viral loads for each group, since the value for the group which became worse regarding diagnosis by colposcopy was lower (median = 89,300; IQR = 253,600) than that for the other groups (improved: median = 2.9 × 10^6; IQR = 1.1 × 10^7; alike: median = 2.9 × 10^6; IQR = 9.9 × 10^6) (p = 0.046). There were no differences in the rest of the types evaluated here regarding viral load according to change in colposcopy findings.
When evaluating normalised viral load according to the degree of HPV infection and colposcopy findings at the start and end of follow-up, it was found that median viral load for HPV-58 was greater for women who had a better prognosis (median = 3.98; IQR = 351.9) compared to those whose prognosis remained the same (alike) (median = 0.013; IQR = 0.68), only in the group of women having persistence for this virus (p = 0.012). The number of reinfections for each viral type was determined (Table 3); HPV-58 was the only viral type for which there were no reinfection events during follow-up.

Discussion
This study has provided detailed epidemiological data for six HR-HPV types present in a Colombian cohort. This has been the first study in Colombia (to the best of our knowledge) aimed at using real time quantification of DNA from the 6 most prevalent types of HR-HPV, giving absolute and normalised load values.
Findings concerning the virus' persistence for the 6 types included here have demonstrated that the risk of acquiring a later HPV infection becomes increased in women already infected by any type of HPV (regardless of complying with a phylogenetic relationship [23], mainly between high risk types [24,25]). Such coinfection could have been the result of immune system deficiency regarding clearance, thereby facilitating viral persistence at the infection site [25].
Regarding the state of infection (single or multiple) and its relationship to clearance time, this work did not reveal an important relationship between such aspects. Various studies have shown that type specific HPV clearance seems to occur regardless of coinfection in an immunocompetent population [26,27].
The highest clearance rates in our cohort were observed for HPV-33 and HPV-16. Previous reports have shown that HPV-16 clears later than other HR-HPV types [26,28,29], but in our study this type of infection displayed a more transient pattern. Our results are in agreement with previous studies showing that the majority of women with a type-specific infection are negative for that particular viral type after one year [19,30].
The reduced clearance rates observed for infections with HPV-18 and −31 types are particularly important bearing in mind that HPV-31 has been found at high prevalence in our country [4] and that HPV-18 has been detected in aggressive forms of cancer [31].
Although HPV-31 was not evaluated in the multivariable model due to the low sample size, it is worth noting that it displayed the highest viral load values and one of the lowest clearance rates. Future studies analysing more women infected with this viral type might improve understanding of the influence of viral load on the type-specific clearance process.
HPV-16, −18 and −58 viral load values did not have a clear relationship with clearance time. This has already been shown for a population from Bogotá in a study involving semi-quantitative identification of viral DNA [24]. The present study showed that, regardless of using a more sensitive technique, no relationship was established between viral load and clearance time for these types, not just in Bogotá, but also in other Colombian cities.
Another factor associated with clearance time was city of origin, showing that women infected by types from the A9 species became cleared more rapidly if they came from Girardot/Chaparral. It is supposed that these cities have factors related to sexual behaviour or cultural characteristics which were not measured in this study and which would have modulated such findings. When analysing the control arm of the large randomised PATRICIA study, it was found that region of origin was one of the behavioural determinants of clearance time, as North American women took less time to clearance than their European counterparts [20].
Regarding ethnicity, fine control was not used for obtaining it; thus, other ethnic characteristics which were not controlled in this study may have intervened in such marked association between city and clearance time. Another aspect concerned the women's nutritional state; most women were from low socio-economic strata. However, nutritional and/or feeding data was not controlled and may have provided more detailed characteristics concerning the population's idiosyncrasies. Previous studies have shown that women who consumed one or more servings of vegetables per day cleared their HPV infections more quickly than women who did not consume vegetables daily [32]. The intake of lower levels of micronutrients found in vegetables has been associated with increased persistence [33].
Another factor that was not measured in this study but that could influence HPV clearance is hygiene practices. In a cohort of university students, it has been shown that the use of tampons was associated with a reduced rate of HR-HPV clearance [32].
Very interesting data for three A9 species types (HPV-16, −31 and −58) were revealed when determining viral load according to the state of infection and colposcopy findings, since low viral load values (absolute or per cell) for these types were associated with greater lesion severity at the end of follow-up or with infection not being eliminated. In addition, intermediate viral load values for HPV-45 (A7 species) were associated with faster time to clearance, which may be a factor related to transient infection. Such results could have been due to immune system evasion mechanisms, since it has been reported that low HPV viral load values are related to persistent infection [34], and it could be suggested that higher viral loads are detected efficiently by the immune system and rapidly eliminated. This contradicts studies proposing that high viral loads facilitate persistence; specifically, it has been shown that HPV-16 viral loads in LSIL and HSIL were higher than in samples with no intraepithelial lesion or malignancy [35].
This work has several strengths, such as having compiled data from two important focuses of HPV infections (i.e. Girardot and Bogotá), determined viral load using the most sensitive technique for doing so and the percentage of multiple infections revealing an important Colombian populational characteristic. However, the study had difficulties in terms of follow-up times, since infection transience meant that shorter follow-up times than the ones established here may probably have led to obtaining more precise clearance and incidence values. Our information was limited to using prevalent high risk infections for analysing persistence and clearance of infection; the foregoing means that follow-up studies are needed to facilitate understanding the most prevalent epidemiological HPV patterns for Colombia.
Diagnosing HPV infection in clinical specimens has been widely accepted to date in Colombia; viral DNA identification in this type of sample has been included in the Obligatory Healthcare Plan, 2012. The following step must thus be to incorporate monitoring from the identification of HPV infection in cervical cancer control schemes. Such work thus contributes towards the search for a correct algorithm for defining HPV DNA screening since the time taken for most women to clear the virus must be determined for calculating the determinants of such scheme and be referred to regular monitoring [19].

Conclusions
Time to clearance in Colombian females infected by the most frequently occurring HR-HPV types in the sample population was not modulated by infection status (single or multiple). However, viral load played a role in the clearance of HPV-45 infection, and city of origin in the clearance of HPV-16 and −58 infection. Viral persistence and worsening of cytological findings were related to lower HPV-16, −31 and −58 viral loads. None of the women in our sample who eliminated HPV-58 were infected again by this viral type. Given that time to clearance was related to lesion development, such information should prove significant when designing HPV DNA primary screening in Colombian healthcare systems, as well as in developing countries.