Evaluation of using composite HPV genotyping assay results to monitor human papillomavirus infection burden through simulation

Background Researchers often group various HPV types into composite measures based on vaccine subtypes, oncogenic potential, or phylogenetic position. Composite prevalence estimates based on PCR genotyping assay results have been calculated to assess HPV infection burden and to monitor HPV vaccine effectiveness. While prevention and intervention strategies can be made based on these prevalence estimates, the discussion on how well these prevalence estimates measure the true underlying infection burdens is limited. Methods A simulation study was conducted to evaluate accuracy of using composite genotyping assay results to monitor HPV infection burden. Data were generated based on mathematical algorithms with prespecified type-specific infection burdens, assay sensitivity, specificity, and correlations between various HPV types. Estimated-to-true prevalence rate ratios and percent reduction of vaccine types were calculated. Results When “true” underlying type-specific infection burdens were prespecified as the reported prevalence in U.S. and genotyping assay with sensitivity and specificity (0.95, 0.95) was used, estimated-to-true infection prevalence ratios were 2.35, 2.29, 2.18, and 1.46, for the composite measures with 2 high-risk vaccine, 4 vaccine, 14 high-risk and 37 HPV types, respectively. Estimated-to-true prevalence ratios increased when prespecified “true” underlying infection burdens or assay specificity declined. When prespecified “true” type-specific infections of HPV 6, 11, 16 and 18 were reduced by 50%, the composite prevalence estimate of 4 vaccine types only decreased by 17% which is much lower than 48% reduction in the prespecified “true” composite prevalence. Conclusions Composite prevalence estimates calculated based on panels of genotyping assay results generally over-estimate the “true” underlying infection burdens and could under-estimate vaccine effectiveness. Analytical specificity of genotyping assay is as or more important than analytical sensitivity and should be considered in selecting assay to monitor HPV.


Background
Infection with human papillomaviruses (HPVs) can cause warts and various forms of carcinoma in the cervix, anus, vagina, vulva, head and neck in women and men [1]. More than 120 types of HPV have been identified according to DNA genomes [1,2]. Researchers have classified and grouped these HPV types by their association with a variety of clinical conditions (i.e., cervical cancer, warts), phylogenetic position, and types related to vaccines [1,3] to form composite measures. Polymerase chain reaction (PCR-based) DNA genotyping tests can detect the existence of small amount of virus and have been considered as the "gold standard" to detect infectious organisms [4,5]. Prevalence estimates based on one or a panel of PCR genotyping test results has been used for research purposes in numerous epidemiology/clinical studies to assess the burden of HPV infections [6][7][8][9][10] and vaccine effectiveness [11][12][13]. While prevention and intervention strategies can be made based on these prevalence estimates, research on how well these composite prevalence estimates measure the true underlying infection burden is limited.
Various factors could affect accuracy of composite prevalence estimates. Depending on the classification, the number of HPV types included in the composite measures could be different. Since HPV type-specific infections share the same risk factors (i.e., sexual lifestyle, age, etc.) and subjects with weaker immune systems are more likely to get infected or stay infected, HPV type-specific infections are likely to be correlated and could result in coinfection with more than one HPV type [14,15]. In another word, the probability of getting infected or detecting any given HPV type is greater among individuals who are currently positive for at least one other HPV type. In addition, the underlying infection prevalence of various HPV types can be different by study population (i.e., different age groups and geographic regions) or change over time [16,17]. Furthermore, more than 20 in-house or commercial HPV genotyping assays (e.g., Linear Array, INNO-LiPA) have been used to detect HPV infections [13, 16,18,19]. Analytical sensitivity and specificity of these genotyping assays can vary greatly for various reasons (e.g., primer set, reaction condition, laboratory techniques of personnel, etc.) [18][19][20] and result in different composite prevalence estimates.
Challenges to evaluate accuracy of prevalence estimates include not knowing the true values of underlying typespecific infection burdens, low feasibility to recruit subjects from various regions with various levels of infection burdens and to test numerous assays for comparisons. To overcome these challenges, we performed a Monte Carlo simulation study to evaluate composite prevalence estimates. The simulation approach allows us to have data with various levels of known prespecified "true" underlying type-specific infection burdens and genotyping test results based on the prespecified assay performance so the prevalence estimates calculated based on type-specific assay results can be compared to the pre-specified "true" underlying infection burden. Monte Carlo simulation approach has been widely used in statistics, physics, finance, economics and engineering to evaluate the impact of various factors on complex systems/processes and identify an optimal system design/process [21][22][23][24][25]. For this study, we are interested in examining accuracy of the composite prevalence estimates based on panels of PCR genotyping assay results and the impacts of various factors.

Simulation settings
A simulation study was conducted to evaluate composite prevalence estimates calculated based on panels of PCR genotyping assay results. Data were generated using mathematical algorithms extended from Lin et al. [26] with prespecified values of "true" underlying typespecific prevalence, PCR genotyping assay sensitivity and specificity and correlations among HPV types. The infectious statuses were assigned based on the prespecified type-specific prevalence; and the genotyping test results were obtained based on the type-specific infectious status, genotyping sensitivity and specificity. Each data set included the known "true" type-specific infection statuses and genotyping assay results so the prevalence estimates could be calculated to compare to the prespecified "true" infection burdens. Without loss of generality, the total number of subjects was set to be 4,000 and correlations between HPV types were set to be 0.05 which was the average value of 666 pairwise correlations of 37 HPV genotyping test results in the 2003-2006 National Health and Nutrition Examination Survey (NHANES). For each scenario, 500 data sets were generated. Mean and standard deviation of the "true" and estimated composite prevalence of 500 data sets were calculated. Mathematical algorithms related to the simulation setup are given in the Appendix.
Two different levels of infection burdens were considered. For the first setting, the "true" underlying typespecific infection burdens were prespecified as the reported 37 different HPV prevalence of females, aged [2003][2004][2005][2006], in the U.S. (Figure 1) [7]. For the second setting, the "true" underlying type-specific infection burdens were prespecified as the reported 45 different HPV prevalence of females, aged 14 or older, 2008-2009 in the Northwest Territories (NWT), Canada [9]. The reported type-specific prevalence in the NWT, Canada was generally lower than in the U.S. and the relative rates were also different ( Figure 1).
Prespecified "true" prevalence of each outcome measure is defined as the proportion of subjects with a positive infection status. For each subject, based on the pre-specified "true" type-specific infectious statuses, the "true" composite positive infection status of the four outcome measures is defined as having at least one HPV type-specific infection of the 37 HPV types (45 HPV in Canada), 14 high-risk types (22 high-risk in Canada), 4 vaccine types, and 2 high-risk vaccine types.
Prevalence estimate of each outcome measure is defined as the proportion of subjects with positive composite test results. For each subject, a positive test result of the four outcome measure is defined as having at least one positive of the 37 HPV (45 HPV in Canada), 14 high-risk (22 highrisk in Canada), 4 vaccine, and 2 high-risk vaccine typespecific results. To examine how well the composite prevalence estimate measures the "true" underlying composite prevalence, for each outcome measure, estimated-to-true prevalence ratio was calculated. A ratio greater than 1 means the composite prevalence estimate calculated based on a panel of genotyping assay results over-estimates the "true" underlying composite infection burden. The number of false positives exceeds the number of false negatives. In contrast, a ratio less than 1 means composite prevalence estimate calculated based on a panel of genotyping assay results underestimates the "true" underlying composite infection burden. The number of false negatives exceeds the number of false positives. The larger the ratio from 1, the less accurate the composite prevalence estimate is.
To assess the effects of analytical sensitivity and specificity of genotyping assay on composite prevalence estimates, the initial values of PCR genotyping assay sensitivity and specificity were set to be equal (0.95, 0.95) to reflect the well-performed Linear Array test and then the assay performance was varied. To examine the effect of genotyping assay specificity, assay sensitivity was held unchanged and specificity was reduced from 0.95 to 0.90, 0.85 and 0.80. Similarly, to examine the effect of genotyping assay sensitivity, assay specificity was held unchanged and sensitivity was reduced from 0.95 to 0.90, 0.85 and 0.80.
To examine how well the composite prevalence estimates measured intervention effectiveness, for demonstration purposes, assuming the pre-specified "true" type-specific prevalence of the 4 vaccine types (HPV 6,11,16 and 18) were reduced by 50%. Percent reductions of composite prevalence estimates of 4 vaccine types and 2 high-risk vaccine types (HPV 16,18) were calculated and compare to the reductions in the prespecified "true" underlying composite prevalence.
To examine the effects of levels of underlying typespecific infection burdens and number of HPV types in the composite measure on the composite prevalence estimates, prevalence estimates of the four outcome measures from the U.S. scenarios were compared to those estimates from the NWT, Canada scenarios. The reported type-specific HPV prevalence in the NWT, Canada was generally lower than in the U.S. and the relative rates were also different ( Figure 1). In the U.S. scenarios, there are 37 HPV types and 14 high-risk types. In the NWT, Canada, there are 45 HPV types and 22 high-risk types.
To examine the effects of correlations between different HPV types on composite prevalence estimates, the correlations between different HPV types were varied from 0, 0.05, 0.1, 0.2, 0.3 and 0.4. The "true" underlying type-specific infections were prespecified as the reported prevalence in the U.S. and type-specific assay sensitivity and specificity were set to be (0.95, 0.95). "True" composite prevalence, estimated composite prevalence and estimated-to-true prevalence ratios of 37 HPV, 14 highrisk HPV, 4 vaccine and 2 high-risk vaccine types were calculated. Table 1(a,b) show the results of estimated composite prevalence when the "true" type-specific infection burdens are prespecified as the reported prevalence in the U.S. When genotyping assay with sensitivity and specificity (0.95, 0.95) is used, composite prevalence estimates based on a panel of genotyping assay results generally overestimate prespecified "true" composite infection burden. As shown in Table 1, the estimated-to-true prevalence rate ratios are 2.35, 2.29, 2.18 and 1.46 for composite measures with 2, 4, 14 and 37 HPV types, respectively.  Pre-specified type-specific prevalence at baseline: When assay specificity is held unchanged and sensitivity is reduced, the results suggest that composite prevalence estimates are robust to decline of assay sensitivity. As shown in Table 1a, when sensitivity is decreased from 0.95, to 0.90, 0.85, 0.80, composite prevalence estimates do not change much and the estimated-to-true ratios remain similar. When genotyping assay sensitivity is held unchanged and specificity is decreased from 0.95, to 0.90, 0.85, 0.80, composite prevalence estimates increase and the estimated-to-true ratios become much larger ( Table 1b). The overestimating problem is alleviated when the number of HPV types in the composite measure becomes larger.

Results
Simulation results also suggest that the composite prevalence estimates could under-estimate vaccine effectiveness (Table 1a,b). When the prespecified "true" underlying type-specific prevalence of HPV 6, 11, 16 and 18 are reduced to 50% of the reported level and genotyping assay with sensitivity and specificity (0.95.0.95) is used, the prevalence estimates of 2 high-risk vaccine types and 4 vaccine types were only reduced by 17.2% and 16.7%, respectively, which are much lower than the 50% and 48% reduction in the prespecified infection burden. Table 2(a,b) shows the results of estimated composite prevalence when "true" type-specific prevalence is prespecified as the reported prevalence in the Northwest Territories (NWT), Canada. Results suggest that the magnitude of over-estimation is greater in the NWT, Canada than in the U.S. due to lower HPV infection burdens in Canada. In addition, increasing the number of HPV types in the composite measure does not help much to alleviate the over-estimating problem as in the U.S. scenarios.
The results of sensitivity analysis for correlations are given in Table 3. Results suggest that estimated-to-true prevalence ratios are robust to the change of correlations and the impact of correlations on the composite measures depends on the magnitude of correlations and the number of HPV types in the composite measures. The impact of correlations is limited when the number of HPV types in the composite measure is relatively small (e.g., 2, 4). Magnitude of decline in composite prevalence estimates increases with increasing number of HPV types in the composite measures (Table 3).

Discussion
Prevalence estimates based on one or panels of PCR genotyping assay results have been used to assess HPV infection burdens and monitor vaccine effectiveness [11,29]. Since true HPV infection burdens are often unknown, mathematical algorithms with prespecified infection burdens and assay performance were used to simulate various scenarios to evaluate these composite prevalence estimates.
PCR-based DNA genotyping tests can detect the existence of small amount of virus. The PCR process includes denaturation, annealing and extension in each PCR cycle. Each cycle approximately doubles the amount of target viral DNA. Although PCR process is labor-intensive and time consuming, PCR can theoretically produce one million copies from a single doublestranded DNA molecule after 30 cycles. Analysis of the amplification products can be done in different ways including gel electrophoresis, dot blot or line strip hybridization [30].
The initial values of PCR genotyping assay sensitivity and specificity were set to be equal to (0.95, 0.95) in this simulation study to reflect a well-performed Linear Array test. Roche Linear Array genotyping assay is the commercialized version of a PCR test which is designed to standardize the entire PCR process and to detect 37 HPV types. LA is a qualitative test and has been used for research purpose in numerous epidemiological/clinical studies. It is also the most widely-used assay by the labs of the World Health Organization (WHO) HPV labNet to monitor HPV vaccine effectiveness. When specimens are carefully handled and PCR procedures are strictly performed according to protocol, analytical sensitivity and specificity of Linear Array assay can reach (0.95, 0.95). The performance of in-house assays can have greater variation since each of the many steps of the PCR testing procedure can introduce important variability [18][19][20]. Factors associated with analytical sensitivity and specificity of PCR testing include primer selection, lab environment and reaction conditions, performance of the DNA polymerase used in the reaction, laboratory techniques of personnel and specimen acquisition, handling and storage, etc. [30].
Type-specific HPV infections are considered to be correlated in this simulation study because the risk factors of getting infected by various HPV types are similar and subjects with weaker immune systems are more likely to get infected and/or stay infected. Without loss of generality, the correlations between HPV types were set to be 0.05. This is the mean of 666 pairwise correlations calculated based on 37 Linear Array genotyping test results collected in the 2003-2006 National Health and Nutrition Examination Survey (NHANES). Although the values of pairwise correlations varied from 0 to 0.3, eighty-five percent of these pairwise correlations were somewhere between 0 and 0.1. In addition, the simulation results (Table 3) suggest the effect of correlations on estimated-to-true prevalence ratios is limited. The impact of correlations on the composite prevalence depends on the magnitude of correlations and the number of HPV types in the composite measure. The higher the    correlations between different HPV types, the more likely co-infections would occur. Therefore, the higher the co-infection rate, the lower the composite prevalence will be. When comparing results from the U.S. scenarios with those from the NWT, Canada, although results are generally consistent, the magnitude of over-estimation is more severe in the NWT, Canada scenarios. This is because the pre-specified "true" underlying type-specific infection burden is generally lower in the NWT, Canada and the chance of getting false positive increases. Assay specificity become even more important for getting accurate prevalence estimates. Unlike the US scenario, increasing number of HPV types in the composite measures does not always help to ease the overestimating problem in the NWT, Canada. It is because in the U.S. scenarios, the type-specific infection burdens of newly-added HPV types are at similar or higher levels than those already in the composite measure. In the NWT, Canada, the type-specific infection burdens of newly-added types are much lower than those already in the composite measure ( Figure 1), therefore, the magnitude of false positive rates increased is much greater.
In the context of monitoring HPV infection burden, the focus has been given to assay analytical sensitivity to detect the existence of HPV infections. There has been a tendency either to develop a new testing technique or to modify existing techniques to increase analytical sensitivity to detect HPV. Studies suggest that increasing analytical sensitivities of HPV detection has reveals that the HPV prevalence is higher than previously suggested [31,32]. In contrast to previous studies, the simulation results suggest that prevalence estimates based on PCR genotyping assay results generally overestimate the true infection burden; genotyping assay sensitivity has limited effect on the composite prevalence estimates; and the decline in specificity is more influential. When assay specificity declines, false positive rates increase and the problem of overestimating becomes more severe. Particularly, when underlying typespecific infection rates are low, for each HPV type, small reductions in genotyping assay specificity results in a high number of subjects with false positive results. Therefore, eliminating factors which might cause false positives (i.e., contamination introduced through reagents, laboratory disposables or equipment including carry-over contamination between tests or sample-to-sample contamination, etc.) to increase assay specificity is important.
Since the introduction of the HPV vaccine in 2006, more nations are now monitoring HPV infection as an earlier indication of vaccine effectiveness [11,12,29]. More than 20 types of in-house or commercial assays have been developed to detect HPV infections and the performance of these assays varies [13,16,18,20]. Also, true underlying infection burdens vary by geographic regions, age groups [16,17] and can change over time. We need to be aware that accuracy of prevalence estimates based on panels of genotyping assay results can vary by true underlying infection burden, genotyping assay performance and number of HPV types included in the composite measure. For geographic regions or subpopulations with relatively low infection burden, in general, prevalence estimates overestimate true underlying infection burden and could underestimate vaccine effectiveness. Also, we need to understand that the impact of genotyping assay specificity is as or more important than sensitivity and should be considered in selecting a genotyping assay to monitor HPV infections. Particularly, subjects with positive HPV detected by PCR in research studies generally are not informed or referred to have colposcopy, since persistent infection of HPV high-risk types is the pivotal event in the development of cervical cancer and most of HPV detected can be cleared without treatment in about two years. In addition, on the population level, from the point of view of infectiousness, population prevalence calculations based on PCR results could overestimate true infectiousness burden, since a very tiny amount of DNA detected by PCR is likely not infectious and may just represent a past infection. Furthermore, laboratory guidelines or policies leading to more standardized assay performance between different laboratories are necessary for combining data from different sites to estimate vaccine effectiveness or compare infection burden at various geographic regions.
In this study, we simulated various scenarios to evaluate composite prevalence estimates based on PCR genotyping assay results. Although it is not possible to consider all levels of infection burden or PCR genotyping assay performance, this simulation study is able to examine the impact of true infection burden and assay sensitivity and specificity on the accuracy of composite prevalence estimates. Estimated-to-true prevalence ratios were used to examine how well the prevalence estimates based on genotyping assay results measure the true underlying infection burden. One limitation is the ratio does not provide information to distinguish true and false positive rates. True and false positive rates depend on the type-specific infection burden and genotyping assay sensitivity and specificity. Although the simulation result suggest that increasing number of HPV types in the composite measure could improve the accuracy of composite prevalence, HPV types are grouped to form composite measures based on their association with a variety of clinical conditions, phylogenetic position or types related to vaccines and may not be varied. Crossreaction is not discussed in this manuscript since the chance of cross-reaction occurrence when applying PCR testing technique is relatively low. In addition, bias which can be introduced due to study design (i.e., sampling strategy, confounders) is not discussed in this manuscript. Having a good sample representing the target population is very important. Since PCR genotyping assay results have limited clinical utility, future studies can be conducted to investigate incorporating HPV clinical tests (i.e.,Digene HC2 or Cobas test) to monitor HPV infections.

Conclusions
Composite prevalence estimates calculated based on panels of genotyping assay results generally overestimate the true infection burden and could underestimate effectiveness. Analytical specificity of genotyping assay is as or more important than sensitivity and should be considered in selecting assay to monitor HPV.
The threshold values of p j can be expressed as functions of the true infection rate of the j-th type, δ j , and the threshold values of q j can be expressed as functions of the true type-specific prevalence of the j-th type, δ j , assay sensitivity, α, and specificity, β, as follows: where ΦΟ is the standard cumulative univariate normal distribution function. It is because Furthermore, α and p j , and q j (and thus α, β, and δ j ) also determine the correlation between Y ij and Z ij , since where φ 2 (y,z;ϱ) is the cumulative distribution function of bivariate normal random vector with means of zero, variance of 1 and correlation of ϱ. A numerical method is used to solve for ϱ with a given α, β and δ j .
To generate the data, for each type, we first identify the threshold values of p j and q j based the pre-specified values of δ j , α, β. Second, we identify correlation matrix ∑ k (where correlation between each pair of Y j and Z j is obtained by a numerical method). Multivariate normal data (Y i1 , Z i1 ,…Y ij , Z ij ,….,Y ik , Z ik ) are generated with mean zero and correlation matrix ∑ k . Then, for each HPV type, we use the pre-determined threshold values of p j and q j to dichotomize Y ij and Z ij to obtain D ij and T ij .

Competing interests
The author declares no competing interests.
Author's contribution CL conducted the simulation study and drafted the manuscript. Author read and approved the final manuscript.