Predictors of Epstein-Barr virus serostatus in young people in England

Background Epstein-Barr virus (EBV) is an important human pathogen which causes lifelong infection of >90% of people globally and is linked to infectious mononucleosis (arising from infection in the later teenage years) and several types of cancer. Vaccines against EBV are in development. In order to determine the most cost-effective public health strategy for vaccine deployment, setting-specific data on the age at EBV acquisition and risk factors for early infection are required. Such data are also important to inform mathematical models of EBV transmission that can determine the required target product profile of vaccine characteristics. We thus aimed to examine risk factors for EBV infection in young people in England, in order to improve our understanding of EBV epidemiology and guide future vaccination strategies.
Methods The Health Survey for England (HSE) is an annual, cross-sectional representative survey of households in England during which data are collected via questionnaires and blood samples. We randomly selected individuals who participated in the HSE 2002, aiming for 25 participants of each sex in each single year age group from 11 to 24 years. Stored samples were tested for EBV and cytomegalovirus (CMV) antibodies. We undertook descriptive and regression analyses of EBV seroprevalence and risk factors for infection.
Results Demographic data and serostatus were available for 732 individuals. EBV seroprevalence was strongly associated with age, increasing from 60.4% in 11–14 year olds throughout adolescence (68.6% in 15–18 year olds) and stabilising by early adulthood (93.0% in those aged 22–24 years). In univariable and multivariable logistic regression models, ethnicity was associated with serostatus (adjusted odds ratio for seropositivity among individuals of other ethnicity versus white individuals 2.33 [95% confidence interval 1.13–4.78]). Smoking was less strongly associated with EBV seropositivity.
Conclusions By the age of 11 years, EBV infection is present in over half the population, although age is not the only factor associated with serostatus. Knowledge of the distribution of infection in the UK population is critical for determining future vaccination policies, e.g. comparing general versus selectively targeted vaccination strategies.


Background
Epstein-Barr virus (EBV) is a herpesvirus that infects 90-95% of humans, causing lifelong infection [1,2]. EBV infection during childhood is generally asymptomatic; however, acquisition of EBV during adolescence or early adulthood often causes infectious mononucleosis (IM) [3], which can cause substantial morbidity during important educational periods in adolescents and young adults [4,5]. EBV is associated with 1% of global cancers, particularly Hodgkin's lymphoma, Burkitt's lymphoma, nasopharyngeal cancer and gastric cancer [6].
EBV infection is currently neither treatable nor preventable by vaccination; however, vaccine candidates are in development. In phase II trials, a first-generation vaccine administered to healthy seronegative volunteers aged 16-25 years demonstrated protection against IM but not EBV infection [7]. Second-generation vaccines elicited higher levels of antibody responses in animal models [8], and first-in-human trials are likely to begin soon. Mathematical modelling is essential to determine the effectiveness and cost-effectiveness of different vaccination strategies for reducing rates of EBV infection, IM, and EBV-associated cancers, taking into account factors such as vaccine efficacy, duration of protection and differing outcomes according to age at infection.
A greater understanding of EBV epidemiology, including the dynamics of EBV infection in different subpopulations, is necessary for the development of such models. EBV seroprevalence increases with age; 90-95% of people globally are infected by age 25, whilst 5-10% remain seronegative throughout life [9]. The best public health strategy for the deployment of an infection-preventing vaccine may vary between settings; infection appears to occur at younger ages in resource-limited countries and thus children will need to be vaccinated early [10][11][12]. However, if the duration of vaccine-induced protection is not lengthy, vaccinated individuals may become susceptible to natural infection at an age where the consequences of infection are more severe, for example leading to IM or cancer [13].
Additionally, sub-optimal vaccine coverage even of a vaccine with a long duration of protection will lead to a higher age at infection amongst those who remain unvaccinated. In such situations it may be better to delay vaccination until the pre-teenage years, targeting individuals who remain EBV seronegative. Alternatively, a vaccine protecting against IM and EBV-associated diseases (such as certain cancers) could be administered to older children as they approach adolescence, which may be effective even with a shorter duration of protection. After the licensing of vaccine candidates, strategic discussions will need to take place nationally and be informed by accurate national data on the epidemiology of EBV infection.
In the United Kingdom, EBV seroprevalence increases rapidly in very young children, reaching 21% and 51% by the age of two years in children of white and Pakistani ethnicity, respectively [14]. Another study showed that EBV seroprevalence then remained relatively constant, at around 55%, between the ages of five and 11 years [15]. EBV seroprevalence was estimated at 75% in university students at 19 years and 92% by the age of 22 years [16]. We recently published summary data on the seroprevalence of EBV in adolescents in England [13]; however, to date no study has investigated factors associated with seropositivity that could inform a targeted vaccination strategy.
Our aim was to investigate the sociodemographic and lifestyle factors, particularly age, associated with EBV serostatus in children and young adults in England, and to discuss the implications of our findings for future EBV vaccination policy.

Study population
The Health Survey for England (HSE) is an annual, cross-sectional, representative survey of households in England. Its methods are described in detail elsewhere [17]. For this study, and in order to parameterise a model of EBV transmission [13], we randomly selected individuals who participated in the 2002 HSE; 2002 was the most recent year in which survey participants gave consent for future studies to test their blood samples for blood-borne viruses. Our aim was to include 25 participants of each sex in each single year age group from 11 to 24 years, in order to fill a gap in the literature and capture the years at which infection is most likely to have clinical consequences. The participant IDs were selected randomly by the HSE; however, it was not possible at the time of sampling to determine whether the samples had already been used. As a result, more than 25 IDs were selected for each age-sex group to ensure there were sufficient samples for our analysis, and therefore there are not exactly 25 samples in each group (Additional file 1: Table S1).

Measuring seroprevalence of Epstein-Barr virus and cytomegalovirus infection
Stored blood serum samples collected between January 2002 and March 2003 were obtained from the HSE. Samples were posted to the laboratory within two days, where they were centrifuged, and the remaining serum was frozen and stored at −40 °C until analysis, which was completed in September 2017 [18].
EBV viral capsid antigen (VCA)-specific IgG and CMV-specific IgG were detected in serum samples using commercial ELISA kits obtained from EUROIMMUN, Germany (EI2791-9601-G, EI2570-9601G). Assays were performed according to the manufacturer's instructions and serum antibody concentrations were calculated using a standard curve. Data on the performance of the assays are detailed in Additional file 1: Table S2. Results were presented in relative units (RU/mL); samples with <16 RU/mL were considered negative, those with ≥16 to <22 RU/mL borderline, and those with ≥22 RU/mL positive. Borderline results from the EBV VCA IgG ELISA were subsequently reanalysed with an EBV immunoblot assay (EUROIMMUN, Germany, DY2790G), which revealed that all borderline serum samples (n = 5) had reactivity to alternative EBV antigens; they were therefore considered seropositive.
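The classification rule above can be sketched in a few lines of Python. This is an illustrative sketch only, not the laboratory's software: the function names are hypothetical, the borderline band is read as 16 to <22 RU/mL, and treating a non-reactive immunoblot as seronegative is an assumption (in this study all five borderline samples were immunoblot-reactive).

```python
from typing import Optional

NEGATIVE_BELOW = 16.0  # RU/mL; results below this are negative
POSITIVE_FROM = 22.0   # RU/mL; results at or above this are positive


def classify_elisa(ru_per_ml: float) -> str:
    """Map an ELISA result in RU/mL to negative, borderline or positive."""
    if ru_per_ml < NEGATIVE_BELOW:
        return "negative"
    if ru_per_ml < POSITIVE_FROM:
        return "borderline"
    return "positive"


def final_ebv_status(ru_per_ml: float,
                     immunoblot_reactive: Optional[bool] = None) -> str:
    """Resolve borderline ELISA results with an immunoblot result.

    Assumption: a non-reactive immunoblot would be read as negative;
    the paper only reports that all borderline samples were reactive.
    """
    call = classify_elisa(ru_per_ml)
    if call != "borderline":
        return call
    if immunoblot_reactive is None:
        return "unresolved"  # borderline and not yet retested
    return "positive" if immunoblot_reactive else "negative"
```

For example, `final_ebv_status(18.0, immunoblot_reactive=True)` resolves a borderline ELISA result to seropositive, mirroring how the five borderline samples were handled.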

Statistical analysis
Data were analysed in Stata version 15.0. We weighted our sample, using the svy commands in Stata, to be representative of the English population in 2002 with respect to age and sex, utilising data from the Office for National Statistics [19]. All stated percentages are weighted. Descriptive analyses of the study population were undertaken. ArcMap 10.3.1 was used to create a map of EBV seroprevalence by English Government Office Region [20].
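As a rough illustration of what weighting a sample to population age-sex totals involves, the sketch below computes simple post-stratification weights and a weighted prevalence in Python. It is not the Stata code used in the study; the function names, strata labels, records and population counts are hypothetical toy values, and the exact weighting scheme applied by the authors is not specified beyond the description above.

```python
from collections import Counter


def poststratification_weights(sample_strata, population_counts):
    """Return one weight per stratum: population share / sample share."""
    sample_counts = Counter(sample_strata)
    n_sample = sum(sample_counts.values())
    n_population = sum(population_counts.values())
    weights = {}
    for stratum, n in sample_counts.items():
        sample_share = n / n_sample
        population_share = population_counts[stratum] / n_population
        weights[stratum] = population_share / sample_share
    return weights


def weighted_prevalence(records, weights):
    """records: iterable of (stratum, is_seropositive) pairs."""
    total = sum(weights[stratum] for stratum, _ in records)
    positive = sum(weights[stratum] for stratum, pos in records if pos)
    return positive / total


# Hypothetical toy data (not the study's figures): the younger stratum is
# over-represented in the sample relative to the population.
records = [
    ("11-14", True), ("11-14", False), ("11-14", False), ("11-14", False),
    ("22-24", True), ("22-24", True),
]
population = {"11-14": 500, "22-24": 500}

weights = poststratification_weights([s for s, _ in records], population)
prevalence = weighted_prevalence(records, weights)
crude = sum(pos for _, pos in records) / len(records)
```

In this toy example the under-sampled older stratum is weighted up, so the weighted prevalence is higher than the crude proportion; the same logic explains why weighted percentages can differ slightly from raw counts.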
To investigate factors associated with being seropositive for EBV, we undertook logistic regression modelling. A causal inference framework was used to determine a priori factors to be included in multivariable models, from the available data collected in the HSE. We built two multivariable regression models.
A second 'adults-only' model was restricted to individuals aged ≥16 years, and additionally included factors for which data were only available for adults: smoking status (never smoked, current smoker, smoked in past) and occupational category from the National Statistics Socioeconomic classification (NS-SEC) [21]. The NS-SEC categorises occupations into higher managerial and professional roles (involving strategy/supervision), intermediate occupations (typically clerical, sales, service or technical positions which do not involve general planning or supervision), routine and manual occupations (involving basic labour), never worked or long-term unemployed, and other. We excluded individuals missing data on one or more variables.
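To make the odds ratios reported from such models easier to interpret, the following Python sketch computes an unadjusted odds ratio with a Wald 95% confidence interval from a 2x2 table. It is a simplified analogue only: the study's estimates come from survey-weighted multivariable logistic regression in Stata, and the function name and the counts below are hypothetical.

```python
import math


def odds_ratio_wald_ci(exposed_pos, exposed_neg,
                       unexposed_pos, unexposed_neg, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval from 2x2 counts."""
    odds_ratio = (exposed_pos * unexposed_neg) / (exposed_neg * unexposed_pos)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(
        1 / exposed_pos + 1 / exposed_neg
        + 1 / unexposed_pos + 1 / unexposed_neg
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts (not from the study): seropositive / seronegative
# among an exposed and an unexposed group.
or_est, ci_lower, ci_upper = odds_ratio_wald_ci(40, 10, 60, 30)
```

An adjusted odds ratio has the same interpretation but is estimated while holding the other covariates in the model fixed, which a 2x2 table cannot do.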
Planned sensitivity analyses investigated the impact of excluding CMV serostatus as a predictor of EBV serostatus, and the impact of classifying the originally indeterminate serological results as seronegative rather than seropositive.

Ethical approval
This study was approved by the University College London Research Ethics Committee (5683/002). The HSE obtained informed written consent for blood samples to be collected and stored for future analyses [17].

Results
Our study sample included 732 individuals aged 11-24 years, of whom 547 (74.6%) were EBV-seropositive. The characteristics of seropositive individuals are shown in Table 1.
Factors associated with EBV seropositivity were largely consistent between the univariable and multivariable models. Among adults, EBV seropositivity was higher among those who currently smoked (aOR 4.29 [2.13-8.65]) than among those who had never smoked. There was no evidence of associations between sex, BMI or occupational category and EBV serostatus.
In sensitivity analyses, we firstly excluded CMV serostatus as a predictor of EBV serostatus, and secondly we classed indeterminate serology results (n = 5) as seronegative rather than seropositive. Both sensitivity analyses showed results consistent with our main analyses (Additional file 1: Table S3, Table S4).
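The size of the second sensitivity analysis can be gauged with simple arithmetic on the counts reported above. The Python sketch below uses crude, unweighted proportions, so it will not reproduce the weighted percentages exactly; it assumes the 547 seropositive individuals include the five immunoblot-confirmed borderline samples, and the function name is hypothetical.

```python
def crude_seroprevalence(n_positive, n_total):
    """Unweighted seroprevalence as a percentage."""
    return 100.0 * n_positive / n_total


N_TOTAL = 732       # individuals with serostatus available
N_POSITIVE = 547    # classed as EBV-seropositive in the main analysis
N_BORDERLINE = 5    # borderline ELISA results counted as seropositive

main_estimate = crude_seroprevalence(N_POSITIVE, N_TOTAL)
sensitivity_estimate = crude_seroprevalence(N_POSITIVE - N_BORDERLINE, N_TOTAL)
shift = main_estimate - sensitivity_estimate  # in percentage points
```

Reclassifying five of 732 samples moves the crude estimate by less than one percentage point, which is in keeping with the sensitivity analysis giving results consistent with the main analysis.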

Discussion
The importance of EBV as a cancer-causing pathogen has generated international interest in developing an anti-infection vaccine [22]. The cost-effectiveness of different strategies to deploy such vaccines will vary from setting to setting and is dependent on the epidemiology of the infection. For example, EBV's association with IM means that vaccines that do not produce lifelong immunity may be better targeted towards subgroups which are likely to acquire infection in adolescence. In this observational study of factors associated with EBV seroprevalence among young people in England in 2002, we explored the distribution of seroprevalence by age and the sources of additional variability. We found a substantial increase in EBV seroprevalence with age among our sample population, associations with ethnicity and smoking, and a potential association with CMV seroprevalence.
A series of studies have demonstrated that EBV is generally acquired pre-adulthood, and that this varies between settings [12]. Our findings regarding smoking fit with the prevailing narrative that there is an association between EBV and socioeconomic status, rather than smoking being an independent risk factor [12]. Unfortunately, we did not have a good measure of socioeconomic status in our analysis; the NS-SEC does not account for familial socioeconomic status during childhood, which is probably more relevant to EBV seroprevalence than individual occupational status in young adults, and we were unable to measure socioeconomic status in children at all. We found that EBV prevalence varied substantially between regions of the UK in univariable analyses and in the whole-cohort model, but not in the adults-only model, suggesting confounding between region and socioeconomic status. There was also a strong association between EBV seropositivity and ethnicities other than white, in both univariable and multivariable models. This may be the result of different mixing patterns (as people of ethnic minorities are more likely to live in larger households), different feeding practices, or residual confounding of socioeconomic status. CMV is another herpesvirus which infects a high proportion of the population from a young age, [23] and has also been associated with EBV in other settings [24,25].
In England, EBV infects 55% of the population by the age of 12 [15]; i.e. prior to adolescence, when the risk of IM increases. Cost-effective deployment of a cheap, infection-preventing vaccine with a lifelong duration of protection could thus likely involve targeting the early years. However, future vaccines may produce a shorter duration of immunity, potentially delaying infection and resulting in an increasing incidence of IM (and IM-associated cancers). This could be compounded by suboptimal vaccine coverage increasing the average age at infection [26] and consequently potentially increasing rates of IM, similarly to how sub-optimal coverage of the MMR vaccine led to an increase in congenital rubella syndrome in Greece [27,28].
In such a scenario, targeted vaccine deployment to the social groups who acquire infection later (when the likelihood of IM is higher) might be considered, possibly with repeated dosing if required. Such targeting could be informed by the risk factors detected within this analysis, and data such as those presented here should be considered in conjunction with the characteristics of the vaccine available when determining what a vaccine policy should look like. If a vaccine was cheap and effective, then universal coverage would be appropriate. If the duration of protection was short, it may be prudent to give repeat doses of the vaccine to people who pick up the infection at the youngest age, which is linked to ethnicity and likely to socioeconomic status. The use of an expensive vaccine could be stratified on the basis of who is most likely to suffer EBV-related disease after infection, which we have studied separately [29].
The limitations of our work include the age of the data and the use of a cross-sectional study design, preventing determination of the temporality of the correlation between EBV and CMV infection. In our analysis, EBV seroprevalence was higher than CMV seroprevalence in all age groups, and both increased with age. We found that CMV was associated with EBV in univariable analyses, and in the adults-only model, but not in the whole-cohort multivariable model. As both EBV and CMV are associated with increasing age, particularly during adolescence, we would not expect an association between CMV and EBV to persist in the whole-cohort multivariable model. It is possible that as the association between age and EBV seroprevalence was less strong in the adults-only multivariable model (as EBV seroprevalence starts to saturate as people reach adulthood), there was enough of a residual effect that the association between EBV and CMV could be detected. Unfortunately, our sample size was not large enough to investigate the interactions between EBV, CMV and age in more detail. The association may result from shared genetic, immunological and/or sociodemographic risk factors, or one infection could increase susceptibility to the other. Longitudinal studies with serial testing are necessary to explore this association, and additional risk factors, in more detail.
We elected to measure IgG antibodies to the EBV VCA protein and whole CMV virus, as these antibodies are present in all infected individuals and persist for life. Although we did not test for IgM antibodies, and cannot exclude the possibility that some seronegative individuals may have been recently infected, we note that VCA-specific IgG and IgM antibodies usually appear contemporaneously [30] and therefore we would expect the number of such individuals in our study to be low.

Conclusions
Knowledge of the distribution of EBV infection among young population groups in England is critical for determining future vaccination policies, including the cost-effectiveness of general versus selective approaches. Data such as those presented here should be used together with detailed information on vaccine characteristics, the implications of remaining EBV-uninfected for life, the ramifications of delayed infection, and the financial costs of IM and EBV-associated cancers to inform such policies.