Changes in norovirus genotype diversity in gastroenteritis outbreaks in Alberta, Canada: 2012–2018

Background The emergence of norovirus genotype GII.4 variants has been associated with gastroenteritis pandemics worldwide, prompting molecular surveillance for early detection of novel strains. In this study, we aimed to analyze the outbreak activity of norovirus and characterize the norovirus strains circulating in Alberta between July 2012 and February 2018. Methods Stool samples from gastroenteritis outbreaks in Alberta were tested for norovirus at the Provincial Laboratory for Public Health using a multiplex real-time RT-PCR assay. The ORF1 and ORF2 genotypes of norovirus-positive samples were assigned based on phylogenetic analyses of partial polymerase and capsid sequences, respectively. Results A total of 530 norovirus outbreaks were identified. Between July 2012 and June 2017 there was a gradual decrease in the annual number of GII.4 outbreaks; however, outbreak numbers increased from June 2017 to February 2018. Four novel strains emerged: GII.17 Kawasaki in July 2014–June 2015, GII.P16/GII.4 Sydney in July 2015–June 2016, GII.P16/GII.2 and GII.P4 New Orleans/GII.4 Sydney in July 2016–June 2017. GII.Pe/GII.4 Sydney was the single predominant strain responsible for the majority (over 50%) of all norovirus outbreaks up to June 2015. Between June 2017 and February 2018, GII.P16/GII.4 Sydney was the leading strain, causing 63% of all norovirus outbreaks. Conclusions GII.4 remained the predominant capsid genotype, causing a large majority of the norovirus outbreaks in early 2018. An increase in genotype diversity was observed in recent years, characterized by a high circulation of non-GII.4 strains and GII.4 recombinants. Electronic supplementary material The online version of this article (10.1186/s12879-019-3792-y) contains supplementary material, which is available to authorized users.


Background
With advances in molecular testing, norovirus (NoV) is now recognized as the leading cause of morbidity and mortality from diarrhoeal disease across all ages [1]. Approximately 18% of all acute gastroenteritis around the world is caused by NoV, with an estimated economic burden of 60 billion US dollars per year [2,3]. Norovirus has a single-stranded positive-sense RNA genome of ~7.5 kb encompassing three open reading frames (ORFs). The three ORFs encode a non-structural polyprotein, the major capsid protein (VP1) and the minor capsid protein (VP2).
The NoV genome has high genetic variability at the P2 subdomain of VP1. P2 forms the outermost part of the virion and contains binding sites for histo-blood group antigens (HBGAs), which are attachment factors for human NoV strains. Norovirus can evade host immune responses by antigenic drift, through a process similar to influenza's epochal evolution [4]. Recombination is another mechanism of evolution for NoV that often occurs at the ORF1/ORF2 junction, further increasing the genetic diversity of the virus [5]. Seven NoV genogroups (GI to GVII) have been described to date, of which GI, GII and GIV can cause gastroenteritis in humans, with GIV being less common [6]. NoV genogroups are subclassified into genotypes; at least 14 ORF1-based and 9 ORF2-based genotypes have been described for GI, and 27 ORF1-based and 22 ORF2-based genotypes for GII [6].
NoV GII.4 has been the most common genotype in circulation worldwide since the mid-1990s. The emergence of new genetic clusters or variants of NoV GII.4 has been associated with gastroenteritis pandemics worldwide [7][8][9]. No global pandemic GII.4 strain has yet emerged after GII.4 Sydney. A novel cluster of GII.17 named Kawasaki became predominant in Asia during the 2014-2015 winter season [10], raising concern over a possible new global pandemic NoV genotype. Since its emergence in Asia, GII.17 Kawasaki has also been reported on other continents, although with lower prevalence [10].
It is still unknown why GII.4 has remained the predominant genotype for over two decades, but several features of this genotype support its enhanced circulation, including higher rates of evolution and a progressive accumulation of mutations that help evade host immune responses [11,12]. Moreover, pandemic GII.4 variants can bind to a wide set of HBGAs, a feature that favours virus transmission by providing NoV with a large pool of individuals genetically susceptible to infection [13].
The aim of the present study is to describe the outbreak activity of NoV from July 2012 to February 2018 in Alberta, Canada. We provide information on the norovirus genotypes in circulation and their relevance in outbreak settings. This study provides important data for vaccine development and enhances our understanding of norovirus disease burden.

Samples
Gastroenteritis outbreak investigations in Alberta were managed by public health officials in collaboration with the Provincial Laboratory for Public Health (ProvLab) [8]. Stool samples collected between July 2012 and February 2018 during outbreak investigations were tested at ProvLab for NoV genogroup I and genogroup II using a real-time RT-PCR assay [14]. A NoV-confirmed outbreak was defined as ≥2 epidemiologically linked cases with gastroenteritis and at least one sample testing positive for NoV. Because of the winter seasonality of NoV, data were analyzed in annual periods running from July to June of the following year.
Outbreak settings were classified into 6 different groups: 1) community long-term care, hospital long-term care, supportive living, and group home facilities; 2) hospital acute care; 3) food establishments, catering events, food shops, community functions, conferences and hotels; 4) day care centers; 5) other types of group residences, e.g. camp, dormitory; and 6) other settings including community shelters, community services, household, schools and cruise ships. Groups 1 and 2 were classified as healthcare-related settings, whereas groups 3 to 6 were classified as non-healthcare-related.
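The July-to-June grouping and the outbreak definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the season label format ("2012-2013") are hypothetical and do not come from the surveillance system itself.

```python
from datetime import date


def surveillance_season(collected: date) -> str:
    """Return the July-to-June season a collection date falls in.

    Assumption: a season is labelled by its start and end years,
    e.g. 1 July 2012 to 30 June 2013 is "2012-2013".
    """
    start_year = collected.year if collected.month >= 7 else collected.year - 1
    return f"{start_year}-{start_year + 1}"


def is_nov_confirmed(linked_cases: int, positive_samples: int) -> bool:
    """Apply the definition: >=2 epidemiologically linked cases with
    gastroenteritis and at least one NoV-positive sample."""
    return linked_cases >= 2 and positive_samples >= 1
```

With this labelling, a sample collected in February 2018 falls in the (partial) 2017-2018 season.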
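The grouping of settings can be expressed as a small lookup, shown here as a sketch. The short group descriptions and the function name are hypothetical abbreviations of the categories listed above, not identifiers used in the study.

```python
# Abbreviated labels for the six setting groups described in the text.
SETTING_GROUPS = {
    1: "long-term care, supportive living and group homes",
    2: "hospital acute care",
    3: "food establishments, catering, functions, conferences, hotels",
    4: "day care centers",
    5: "other group residences (e.g. camp, dormitory)",
    6: "other settings (shelters, services, household, schools, cruise ships)",
}

# Groups 1 and 2 are healthcare-related; groups 3 to 6 are not.
HEALTHCARE_GROUPS = frozenset({1, 2})


def is_healthcare_related(group: int) -> bool:
    """Return True for healthcare-related setting groups (1 and 2)."""
    if group not in SETTING_GROUPS:
        raise ValueError(f"unknown setting group: {group}")
    return group in HEALTHCARE_GROUPS
```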

Norovirus genotyping
For each NoV-positive outbreak, one NoV-positive sample was selected for genotyping. Briefly, the nucleic acid extract from each stool sample was subjected to reverse transcription (RT) with random primers and the resulting cDNA was PCR-amplified. Samples collected up to February 2017 were amplified in region C (ORF2) using primer pair G2SKF/G2SKR for genogroup II strains or primer pair G1SKF/G1SKR for genogroup I strains [15]. Samples with emergent or unassigned genotypes based on region C sequence analysis were further genotyped by targeting the 3' end of the polymerase gene using primers LV4282-99F [16] and COG2R [17]. Samples collected from March 2017 onwards were genotyped using a dual polymerase-capsid genotyping protocol based on a single PCR amplicon obtained with primer pair MON432/G1SKR for genogroup I strains and primer pair MON431/G2SKR for genogroup II strains [18]. All PCR products were subjected to Sanger sequencing and genotypes were assigned using the Norovirus Genotyping tool [19]. A large majority of strains from outbreaks occurring between July 2012 and June 2015 were left uncharacterized at ORF1. Retrospective characterization was not attempted because 69% of norovirus GII outbreaks in Alberta in that period were caused by a single ORF2 genotype, GII.4 Sydney, and because reports from North America [18] and from diverse countries on different continents [20] suggest that GII.Pe/GII.4 Sydney was the major strain in circulation worldwide during that time frame.
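The choice of primer pair by collection date and genogroup can be summarised as a lookup. The sketch below uses the primer names given in the text; the function name, the dictionary layout and the exact cut-over date of 1 March 2017 are assumptions made for illustration.

```python
from datetime import date

# Region C (ORF2-only) typing, used for samples up to February 2017.
REGION_C_PRIMERS = {"GI": ("G1SKF", "G1SKR"), "GII": ("G2SKF", "G2SKR")}

# Dual polymerase-capsid typing, used from March 2017 onwards.
DUAL_TYPING_PRIMERS = {"GI": ("MON432", "G1SKR"), "GII": ("MON431", "G2SKR")}

# Assumption: the protocol change is taken to start on 1 March 2017.
DUAL_TYPING_START = date(2017, 3, 1)


def genotyping_primers(collected: date, genogroup: str) -> tuple:
    """Return the (forward, reverse) primer pair for a sample."""
    if collected >= DUAL_TYPING_START:
        return DUAL_TYPING_PRIMERS[genogroup]
    return REGION_C_PRIMERS[genogroup]
```

The follow-up polymerase typing of emergent or unassigned strains (LV4282-99F with COG2R) is a separate step and is not modelled here.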

Phylogenetic analysis
Phylogenetic analyses of GII.17, GII.P16/GII.4 Sydney, GII.P4 New Orleans/GII.4 Sydney and GII.P16/GII.2 sequences obtained in our study were performed with MEGA 6.06 [21]. Maximum likelihood trees were constructed using the substitution model producing the lowest Bayesian Information Criterion scores, as calculated by the software. For all trees, branch significance was estimated based on 1000 bootstrap replicates.
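For reference, the Bayesian Information Criterion used to rank candidate substitution models has the standard form below. The symbols are not defined in the text and are introduced here as assumptions: k is the number of free parameters of the model (including branch lengths), n is the sample size (typically the number of sites in the alignment) and \hat{L} is the maximized likelihood. How the software counts k and n is not stated in the source.

```latex
\mathrm{BIC} = k \ln n - 2 \ln \hat{L}
```

The model with the lowest BIC is preferred because the k ln n term penalizes additional parameters that do not improve the likelihood enough.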

Cloning, expression and purification of recombinant P domain proteins
The capsid P domains of two GII.P16/GII.4 Sydney outbreak strains (AB-2016-26 and AB-2016-190) were amplified by RT-PCR using forward primer ACGCGGATCCTCAAGAACTAAACCATTCTCTGTCC and reverse primer ATAAGAATGCGGCCGCTTAGCAAAAGCAATCGCCACGGCAATCGCATACTGCACGTCTACGCCCCGTTCC and cloned into the pGEX-4T-1 vector (GST Gene Fusion System, GE Healthcare Life Sciences) between the BamHI and NotI sites. An RGD4C tag (CDCRGDCFC) was linked to the C terminus of the P domain for P particle formation [22]. The recombinant P domain protein was expressed in E. coli (BL21, DE3) with induction by 0.25 mM isopropyl-β-D-thiogalactopyranoside (IPTG) at room temperature (~21°C) overnight, as described elsewhere [22]. The glutathione S-transferase (GST)-P domain fusion protein was purified using Glutathione Sepharose 4 Fast Flow resin (GE Healthcare Life Sciences) according to the manufacturer's instructions. GST was removed from the target proteins by thrombin (GE Healthcare Life Sciences) cleavage, either on beads or in solution (phosphate buffered saline, PBS, pH 7.4).
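The primer design can be checked directly from the sequences. The sketch below, with the spaces introduced by line breaks removed, confirms that the forward primer carries a BamHI site (GGATCC), that the reverse primer carries a NotI site (GCGGCCGC), and that the reverse complement of the reverse primer encodes the RGD4C tag followed by a stop codon. The helper names are hypothetical; the codon table is the standard genetic code.

```python
FORWARD = "ACGCGGATCCTCAAGAACTAAACCATTCTCTGTCC"
REVERSE = ("ATAAGAATGCGGCCGCTTAGCAAAAGCAATCGCCACGGCAATCGCATAC"
           "TGCACGTCTACGCCCCGTTCC")

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str) -> str:
    """Translate complete codons in frame 0; '*' marks a stop codon."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


# The reverse primer anneals to the sense strand, so the coding
# sequence is read from its reverse complement.
sense = reverse_complement(REVERSE)
print("BamHI site in forward primer:", "GGATCC" in FORWARD)
print("NotI site in reverse primer:", "GCGGCCGC" in REVERSE)
print("Encoded peptide:", translate(sense))
```

In the translated peptide, the residues CDCRGDCFC appear immediately before the stop codon, upstream of the NotI site, which matches the tag described above.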

Saliva binding assay of P-domain proteins
The saliva-based binding assays were performed as previously described [13]. Briefly, boiled human saliva samples with known HBGA phenotypes, collected in Cincinnati, OH, United States, were diluted 1000-fold and used to coat 96-well microtiter plates (Dynex Immulon; Dynatech, Franklin, MA). After blocking with 5% non-fat milk in PBS, different concentrations of P-domain protein (15, 7.5 and 3.75 ng/μl) were added to the wells. The bound P proteins were detected using a guinea pig anti-NoV antiserum (1:3000), followed by the addition of HRP-conjugated goat anti-guinea pig IgG. The HRP activity was then measured with a TMB kit (Kirkegaard & Perry Laboratories, Gaithersburg, MD) and the OD450 values were read with an ELISA spectrum reader (Tecan, Durham, NC).

Statistical analysis
The proportions of NoV GI and GII outbreaks by setting were compared using the Chi-square exact test. The annual numbers of NoV-positive outbreaks occurring between July 2012 and June 2017 were compared to those occurring in the previous 5 years, from July 2007 to June 2012, using a one-tailed t-test, with significance set at p < 0.05.
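As an illustration of how two proportions can be compared, the sketch below computes a Pearson chi-square statistic for a 2x2 table without continuity correction. This is an assumption about the general approach and may differ from the exact procedure the authors ran in their statistics software; the function name is hypothetical, and the counts in the usage line are made-up placeholders, not study data.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple:
    """Pearson chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns (statistic, p_value). With one degree of freedom the
    upper-tail probability equals erfc(sqrt(statistic / 2)).
    No continuity correction is applied.
    """
    n = a + b + c + d
    statistic = n * (a * d - b * c) ** 2 / (
        (a + b) * (c + d) * (a + c) * (b + d)
    )
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Placeholder counts: rows are genogroups, columns are
# "in setting" vs "not in setting".
statistic, p_value = chi_square_2x2(30, 10, 20, 40)
print(f"chi-square = {statistic:.2f}, p = {p_value:.2g}")
```

For sparse tables an exact test is usually preferred over this large-sample approximation.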

Norovirus outbreaks: Annual activity and circulating genotypes
A total of 1572 gastroenteritis outbreak investigations were performed in Alberta between July 1st 2012 and the end of February 2018, of which 859 (54.6%) had specimens submitted to the ProvLab for laboratory testing. Norovirus was identified in 530 (61.7%) of all tested outbreaks. The monthly distribution of NoV-positive outbreaks peaked in the winter months (Fig. 1); however, spring peaks with higher activity than that of the winter months occurred in March 2014 and May 2016.
Compared to historical data, the annual numbers of NoV outbreaks between July 2012 and June 2017 were lower than those observed in the previous 5 years (July 2007 to June 2012 vs. July 2012 to June 2017, p = 0.0489, one-tailed t-test).
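The two percentages above use different denominators: the first is relative to all investigated outbreaks and the second to tested outbreaks only. A short check, using the counts from the text and a hypothetical helper name:

```python
INVESTIGATED = 1572   # all gastroenteritis outbreak investigations
TESTED = 859          # outbreaks with specimens submitted for testing
NOV_POSITIVE = 530    # tested outbreaks in which norovirus was identified


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


print(percent(TESTED, INVESTIGATED))   # share of outbreaks tested
print(percent(NOV_POSITIVE, TESTED))   # NoV positivity among tested
```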

GII.P4 New Orleans/GII.4 Sydney recombinant
The recombinant strain GII.P4 New Orleans/GII.4 Sydney was identified during the period July 2016-June 2017 (Fig. 3). The strain was responsible for four out of five GII.4 outbreaks from that period and was not observed after June 2017.

NoV outbreak settings
Between July 2012 and February 2018, the majority of outbreaks occurred in community long-term care, hospital long-term care, supportive living, and group homes facilities (63/83, 75.9% of all NoV GI and 346/440, 78.6% of all NoV GII outbreaks, respectively) and hospital

Discussion
From July 2012 to February 2018, the NoV outbreak activity in Alberta presented important changes in terms of overall disease burden and NoV genotype distribution as compared to historical data. Between GII.P17/GII. The GII.17 Kawasaki strain was first reported in Asia in September 2014, where it quickly became the predominant genotype during the 2014-2015 winter season, raising concern of a global pandemic [10]. Since its emergence, GII.17 Kawasaki has undergone diversification into at least three sub-clusters [10]. In Alberta, we identified two sub-clusters: the Kawasaki 308-like sub-cluster and the Kawasaki 323-like sub-cluster. Although both sub-clusters started to circulate in the province at timeframes similar to those reported in Asia [10], GII.17 Kawasaki had limited prevalence in Alberta, as also observed in other regions outside Asia, including Europe and the United States [10]. The factors limiting the transmission of GII.17 Kawasaki outside Asia are largely unknown. The GII.17 Kawasaki strains display a broad saliva HBGA-binding profile comparable to pandemic GII.4 strains [25], which emphasizes the potential of GII.17 as a pandemic strain.
The novel GII.P16/GII.4 Sydney recombinant was the most common strain among NoV outbreaks from July 2015 to June 2016. Although the prevalence of this recombinant decreased temporarily in July 2016-June 2017, its circulation increased sharply in the most recent period, July 2017 to February 2018, when GII.P16/GII.4 Sydney was again responsible for a large majority of outbreaks. The GII.P16/GII.4 Sydney recombinant also circulated in the United States and Germany during the 2015-2016 winter season [18,26]. Our analysis of capsid sequences from this recombinant collected at three different time points (2016, 2017 and 2018) demonstrated no unique amino acid substitutions compared to GII.Pe/GII.4 Sydney, its ORF2 parental strain. Furthermore, we observed that GII.P16/GII.4 Sydney can bind to saliva of secretor HBGA phenotypes but not to that of non-secretors, similar to GII.Pe/GII.4 Sydney strains. Our findings agree with a recent phylogenetic study [27] and suggest that amino acid substitutions outside ORF2, rather than antigenic change at the capsid gene, conferred an evolutionary advantage to GII.P16/GII.4 Sydney over its parental strain GII.Pe/GII.4 Sydney.
The GII.P4 New Orleans/GII.4 Sydney recombinant had low circulation between July 2016 and June 2017 and was no longer observed in the subsequent period. We hypothesize that the transmission of this recombinant in Alberta might have been restricted by its antigenic resemblance to two highly circulated strains, GII.4 New Orleans and GII.4 Sydney. GII.4 has been the predominant genotype worldwide among humans since the 1990s but had very limited circulation in the 1970s and 1980s [28,29]. It is possible that the decline in GII.4 NoV outbreaks observed between July 2015 and June 2017 is a consequence of herd immunity accumulated in the population after two decades of high prevalence of various GII.4 variants and of exhaustion of mutational sites in GII.4 Sydney. The GII.P16/GII.4 Sydney recombinant might have partially circumvented such factors by acquiring the ORF1 genes from the less widely circulating, and presumably fitter, GII.P16. Further surveillance data are still required to understand the turnover of NoV genotypes in humans.
The last recombinant that emerged in Alberta in the time period of the present study is GII.P16/GII.2. Although GII.P16/GII.2 outbreaks were reported in Asia as early as 2009, phylogenetic time-scale analyses performed by others suggest that the polymerase of the recent GII.P16/GII.2 strain is more closely related to that of GII.P16/GII.4 Sydney and carries amino acid substitutions that could have conferred on the novel GII.P16/GII.2 an evolutionary advantage [30]. GII.P16/GII.2 strains have also been reported in Germany, France, Japan, China and the United States since mid-2016 [18,30]. Notably, GII.P16/GII.2 was briefly the predominant strain in the province in July 2016-June 2017, but was replaced thereafter by GII.P16/GII.4 Sydney.

A limitation of our study is the limited collection of ORF1 data for the period between July 2012 and June 2015, which could have led us to miss novel or unusual recombinants. However, we believe the likelihood of missing important strains is low, since 69% of the norovirus GII outbreaks in Alberta within those years were caused by a single ORF2 genotype, GII.4 Sydney. Reports from North America [18] and from diverse countries on different continents [20] suggest that GII.Pe/GII.4 Sydney was the major strain in circulation worldwide during that time frame; thus, we believe that most, if not all, GII.4 Sydney strains from that period carried a GII.Pe polymerase.

Conclusions
In summary, we provided 6 years of systematic molecular surveillance data of norovirus outbreaks in Alberta and identified a GII.4 recombinant, GII.P16/GII.4 Sydney, as the most prevalent strain causing outbreaks in the province in early 2018. An important shift and increase in genotype diversity was noticed in recent years, which should be considered for vaccine development. Ongoing surveillance of the molecular epidemiology of NoV using a dual ORF1/ORF2 genotyping scheme is indispensable to better identify the disease burden of new emerging strains and understand their evolutionary pathways.