Norovirus GII.2[P16] strain in Shenzhen, China: a retrospective study

Background: Norovirus (NoV) is the main cause of non-bacterial acute gastroenteritis (AGE) outbreaks worldwide. From September 2015 through August 2018, 203 NoV outbreaks involving 2500 cases were reported to the Shenzhen Center for Disease Control and Prevention.
Methods: Faecal specimens from the 203 outbreaks were collected and epidemiological data were obtained through the AGE outbreak surveillance system in Shenzhen. Genotypes were determined by sequencing analysis. To gain a better understanding of the evolutionary characteristics of NoV in Shenzhen, molecular evolution and mutations were evaluated based on time-scale evolutionary phylogeny and amino acid mutations.
Results: Nine districts reported NoV outbreaks, and the reported outbreaks peaked from November to March. Among the 203 NoV outbreaks, 150 were sequenced successfully. Most of these outbreaks were associated with the NoV GII.2[P16] strain (45.3%, 92/203) and occurred in school settings (91.6%, 186/203). The evolutionary rates of the RdRp region and the VP1 sequence were 2.1 × 10⁻³ (95% HPD interval, 1.7 × 10⁻³ to 2.5 × 10⁻³) substitutions/site/year and 2.7 × 10⁻³ (95% HPD interval, 2.4 × 10⁻³ to 3.1 × 10⁻³) substitutions/site/year, respectively. The common ancestors of the GII.2[P16] strain from Shenzhen and GII.4 Sydney 2012[P16] diverged from 2011 to 2012. The common ancestors of the GII.2[P16] strain from Shenzhen and previous GII.2[P16] (2010–2012) diverged from 2003 to 2004. Amino acid mutation analysis showed that six substitutions (*77E, R750K, P845Q, H1310Y, K1546Q, T1549A) were found only in GII.4 Sydney 2012[P16] and the GII.2[P16] recombinant strain.
Conclusions: This study illustrates the molecular epidemiological patterns in Shenzhen, China, from September 2015 to August 2018 and provides evidence that the epidemic trend of the GII.2[P16] recombinant strain had weakened and that the non-structural proteins of the recombinant strain might have played a more significant role than VP1.
Supplementary Information The online version contains supplementary material available at 10.1186/s12879-021-06746-9.

and GIV infect humans. GI and GII are responsible for the majority of human diseases and can be further divided into nine (GI.1-GI.9) and 27 (GII.1-GII.27) genotypes based on the diversity of VP1 [4]. The full-length single-stranded RNA genome is approximately 7.5–7.7 kb, with three open reading frames (ORFs) [5]. The first 5 kb closest to the 5' end of the genome is ORF1, which encodes non-structural proteins, including N-terminal protein (P48), NTPase, 3A protein (P22), VPg (viral genomic junction protein), 3C-like protein (Pro) and RNA-dependent RNA polymerase (RdRp) [6]. These proteins are important for the replication of NoV. ORF2 is 1.6 kb in length and encodes the major structural protein VP1, which constitutes the main capsid structure and is responsible for the infectivity and antigenicity of NoV [7]. VP1 contains a well-conserved shell (S) domain and a protruding (P) domain, and the latter is divided into two subdomains, P1 and P2 [8]. Furthermore, the P2 region is considered a hypervariable part of the genome because it encodes the receptor-binding domain, which is responsible for histo-blood group antigen (HBGA) binding, and important epitopes targeted by antibodies that inhibit binding [9,10]. ORF3 is 0.6 kb and encodes the minor structural protein (VP2) [11].
Shenzhen is one of the most important cities in Guangdong Province. However, information about NoV outbreaks in this region is limited. This retrospective study aimed to determine the genotypic diversity of NoV strains in outbreaks and the genetic characteristics of the GII.2[P16] strain in Shenzhen, China, from September 2015 to August 2018.

The surveillance of NoV outbreaks
Faecal specimens from AGE outbreaks submitted to the Shenzhen Center for Disease Control and Prevention (Shenzhen CDC) by District Centers for Disease Control and Prevention (district-level CDCs) from September 2015 to August 2018 were obtained. District-level CDCs are responsible for conducting outbreak investigations, including providing epidemiological and clinical information. The Shenzhen CDC performs NoV detection and genotyping on the specimens. A NoV outbreak was defined as > 5 acute gastroenteritis cases within 3 days after exposure in a common setting, with > 2 samples (whole faecal, rectal swab, or vomitus) laboratory-confirmed as NoV.

Detection of NoV by real-time RT-PCR
For faecal specimen analysis, a 10% suspension was prepared by mixing 0.1 g stool with 1 mL phosphate-buffered saline (pH 7.2). Viral RNA was extracted from the clarified stool suspension using the Viral Nucleic Acid Extraction Kit II (Geneaid, China) and then examined by real-time reverse transcription polymerase chain reaction (real-time RT-PCR) using the Ag-Path Kit (Applied Biosystems, USA) with primers (Cog1F, Cog1R, Cog2F, and Cog2R) and TaqMan probes (Ring 1E and Ring 2) (Additional file 1: Table S1). The cycling conditions were described previously [18]. A negative control containing DEPC water and two positive controls containing RNA of NoV GI and GII were included in each experiment. Samples were scored as positive if the cycle threshold values were ≤ 40 and the positive and negative controls showed the expected values.
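The scoring rule above can be sketched in a few lines of Python. This is only an illustration, not the laboratory's actual software: the function names are hypothetical, and the reading of "expected values" (no amplification in the DEPC-water control, Ct ≤ 40 in both positive controls) is an assumption rather than something the text specifies.

```python
from typing import Optional, Sequence

CT_CUTOFF = 40.0  # cut-off stated in the text: Ct <= 40 is scored positive


def run_is_valid(negative_ct: Optional[float],
                 positive_cts: Sequence[Optional[float]]) -> bool:
    """Check the run controls.

    Assumption (not from the source): 'expected values' means the negative
    control shows no amplification (Ct is None) and every positive control
    amplifies with Ct <= CT_CUTOFF.
    """
    negative_ok = negative_ct is None
    positives_ok = all(ct is not None and ct <= CT_CUTOFF for ct in positive_cts)
    return negative_ok and positives_ok


def score_sample(ct: Optional[float], valid_run: bool) -> str:
    """Return 'positive', 'negative', or 'invalid' for one sample."""
    if not valid_run:
        return "invalid"  # controls failed, so the sample cannot be scored
    if ct is not None and ct <= CT_CUTOFF:
        return "positive"
    return "negative"
```

A sample with no amplification is passed as `None`; a run whose controls fail marks every sample as `invalid` rather than negative.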

Genotyping analysis
Genotypes were confirmed by BLAST and an automated online NoV genotyping tool offered by the Netherlands National Institute for Public Health and the Environment (RIVM, http://www.rivm.nl/mpf/norovirus/typingtool) [21].

Phylogenetic analysis of the RdRp region and VP1
To evaluate the evolution of the NoV GII.2[P16] strain in Shenzhen, the full-length RdRp region and VP1 sequences from this study were collected together with all full-length RdRp region and VP1 sequences available in GenBank as of September 2016. Phylogenetic trees were constructed using the Markov chain Monte Carlo (MCMC) method with a strict molecular clock in BEAST software v1.8.2. The best substitution models, selected in MEGA 6.0 using the BIC method [22], were TN93 (Tamura-Nei) + G (Gamma) for the RdRp region and TN93 (Tamura-Nei) + G (Gamma) + I (Invariable) for the VP1 sequence. MCMC chains were run for 1.0 × 10⁸ steps for the RdRp region sequences and 2.0 × 10⁸ steps for the VP1 sequences. Effective sample sizes greater than 200 were confirmed in Tracer. The final result was visualized using FigTree software v1.4.3.
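To give a feel for what a strict molecular clock implies, the following sketch converts a per-site rate into an expected number of substitutions. It is a back-of-the-envelope illustration, not part of the BEAST analysis: the function names are hypothetical, the rate is the VP1 estimate reported in the abstract, and the site count of 1600 is an approximation taken from the stated ORF2 length of 1.6 kb.

```python
def expected_substitutions(rate_per_site_per_year: float,
                           n_sites: int,
                           years: float) -> float:
    """Expected substitutions along one lineage under a strict clock
    (a single constant rate on every branch)."""
    return rate_per_site_per_year * n_sites * years


def pairwise_divergence(rate_per_site_per_year: float,
                        years_since_ancestor: float) -> float:
    """Expected substitutions per site between two contemporaneous strains
    whose common ancestor existed `years_since_ancestor` years ago.
    Both branches accumulate changes, hence the factor of 2."""
    return 2.0 * rate_per_site_per_year * years_since_ancestor


VP1_RATE = 2.7e-3   # substitutions/site/year, as reported in the abstract
VP1_SITES = 1600    # assumption: ORF2 is described as about 1.6 kb

if __name__ == "__main__":
    per_year = expected_substitutions(VP1_RATE, VP1_SITES, 1.0)
    print(f"Expected VP1 substitutions per lineage per year: {per_year:.2f}")
```

Under these assumptions a VP1 lineage would accumulate roughly four nucleotide substitutions per year, which is the kind of signal the time-scaled phylogeny uses to date common ancestors.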

Amino acid mutations of the non-structural region and VP1
To evaluate the impact of the intergenic recombination of the non-structural region and VP1, the amino acid mutations of the non-structural region and VP1 among different genotypes were analysed by MEGA 6.0.
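The comparison itself was done in MEGA 6.0; the sketch below only illustrates the underlying idea of listing substitutions between two aligned protein sequences in the `R750K` notation used in this paper. The function name and the toy sequences are hypothetical, and positions are counted as alignment columns, which is an assumption about numbering.

```python
from typing import List


def list_substitutions(reference: str, query: str,
                       first_position: int = 1) -> List[str]:
    """List amino acid substitutions between two aligned sequences.

    Each substitution is written as <reference residue><position><query residue>,
    e.g. 'R750K'. Positions are alignment columns starting at `first_position`.
    Columns with a gap ('-') in either sequence are skipped.
    """
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    substitutions = []
    for offset, (ref_aa, qry_aa) in enumerate(zip(reference, query)):
        if "-" in (ref_aa, qry_aa):
            continue  # insertion/deletion, not a substitution
        if ref_aa != qry_aa:
            substitutions.append(f"{ref_aa}{first_position + offset}{qry_aa}")
    return substitutions


if __name__ == "__main__":
    # Toy five-residue fragments, invented for illustration only.
    print(list_substitutions("MKRTP", "MKKTQ", first_position=748))
```

Running the same function over every genotype against a common reference, and keeping only the substitutions shared by the strains of interest, gives the kind of genotype-specific list reported in the results.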

Statistical analysis
Differences in GII.2 NoV detection rates across the dominant school settings (childcare centre, primary school, middle school) were compared using Fisher's exact test in SPSS Statistics software v.22.0, and a p-value less than 0.05 was considered statistically significant.
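For readers without SPSS, a comparison of this kind can be sketched in plain Python. The code below is a minimal implementation of the Freeman-Halton extension of Fisher's exact test for a 2 × k table (two outcomes by k settings); it is not the SPSS routine used in the study, the function name is hypothetical, and the counts in the usage example are invented placeholders rather than study data.

```python
from itertools import product
from math import comb
from typing import Sequence


def fisher_exact_2xk(row1: Sequence[int], row2: Sequence[int]) -> float:
    """Two-sided exact p-value for a 2 x k contingency table.

    With all margins fixed, the first row follows a multivariate
    hypergeometric distribution. The p-value is the total probability of
    every table that is no more likely than the observed one.
    """
    cols = [a + b for a, b in zip(row1, row2)]  # column totals
    n1 = sum(row1)                              # first-row total
    denom = comb(sum(cols), n1)

    def prob(first_row: Sequence[int]) -> float:
        num = 1
        for c, a in zip(cols, first_row):
            num *= comb(c, a)
        return num / denom

    p_obs = prob(row1)
    p_value = 0.0
    # Enumerate every first row compatible with the column totals.
    for cand in product(*(range(c + 1) for c in cols)):
        if sum(cand) != n1:
            continue
        p = prob(cand)
        if p <= p_obs * (1 + 1e-9):  # small tolerance for float ties
            p_value += p
    return p_value


if __name__ == "__main__":
    # Placeholder counts: GII.2 vs other genotypes in three school settings.
    gii2 = [20, 6, 3]
    other = [15, 9, 2]
    print(f"p = {fisher_exact_2xk(gii2, other):.3f}")
```

Full enumeration is practical only for small tables; for larger counts a statistics package is the better choice.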

Nucleotide sequence accession numbers
The datasets generated during the current study for the GII.

NoV outbreak settings and geographical locations
According to ten district-level CDCs, there were 203 NoV outbreaks in Shenzhen between September 2015 and August 2018. Most outbreaks were from Nanshan district (30.5%, 62/203) and no outbreak was from Yantian district (Fig. 1). Information on the outbreak size was reported for 197 (97.0%), ranging from 5 to 115 cases per outbreak (Table 1). Of the 203 outbreaks, 186 (91.6%, 186/203) occurred in school settings, with 17 (8.4%, 17/203) occurring in non-school settings (Table 2). Of the 186 outbreaks that occurred in school settings, 143 (76.9%, 143/186) occurred in child care centres. The reported outbreaks peaked in the cold season, especially from November to March (Fig. 2).

Genotype distribution and outbreak characteristics
For outbreaks caused by the GII.2 strain, most occurred in school settings: 73 (79.3%, 73/92) outbreaks occurred in child care centres, and the age distribution of GII.2 infection showed no significant differences (Fisher's exact test = 3.595, p = 0.177) across the dominant school settings (child care centre, primary school, middle school). Of the thirteen outbreaks caused by the GII.3 strains, most (84.6%, 11/13) also occurred in child care centres.

Phylogenetic analysis of the RdRp region and VP1 sequence of the GII.2[P16] strain
To examine strain evolution, 52 full-length RdRp regions of strain GII. 2 (Table 4).

Amino acid mutations of HBGA-binding and epitope sites of the GII.2[P16]
To explore the HBGA-binding profile, predicted epitopes and epitope A to E sites of the GII. 2 (2016-2018), from 1975 to 2018 were collected and aligned. Sequencing data revealed 29 parsimony-informative sites, but there were no mutations in the HBGA-binding profile, predicted epitopes or epitopes A to E of the GII.2[P16] strain (Additional file 1: Table S2).

Discussion
In this study, NoV-associated AGE outbreaks in Shenzhen, China, from September 2015 to August 2018 were analysed. A total of 203 NoV outbreaks were reported to the Shenzhen CDC. NoV infection was initially described as "winter vomiting disease" due to its seasonal characteristics [25]. Analysis of the monthly distribution also indicated that outbreaks in Shenzhen peaked from November to March. Previous studies have found a link between climate or weather and increased NoV abundance, and low absolute humidity provides ideal conditions for NoV persistence and transmission during cold months [26]. Indeed, NoV rapidly loses viability and infectivity with increasing temperature; therefore, NoV appears to be more stable in a cold climate and is thus transmitted more easily among people at cold times of the year [27,28]. The peaks in this study were in December, when Shenzhen began to become cold, and March, when the temperature began to turn warm, suggesting that seasonal changes in temperature affect NoV transmission. NoV outbreaks usually occur in hospitals, nursing homes, schools, childcare centres, hotels and other institutional settings [3]. A study in the United States reported 3960 NoV outbreaks between 2009 and 2013 and found that long-term care homes were the most frequent sites of NoV outbreaks [29]. Another study, by Qin et al. [30], showed that between 2006 and 2016 middle schools were the most important setting of NoV outbreaks in China, followed by primary schools. In this study, we classified the outbreak settings into 12 categories, and the results showed that most outbreaks occurred in childcare centres, followed by primary schools. This suggests that schools remain the most common setting for NoV outbreaks in Shenzhen, but that the current high incidence is occurring among younger children, under 6 years of age.
Given the monthly distribution of NoV outbreaks in Shenzhen, we suspect that the decrease in the number of NoV outbreaks in January and February is related to school holidays. When the scale of the outbreaks was analysed, the average number of people involved per outbreak in Shenzhen was nine, smaller than the 18 persons reported in the United States.
The results of sequence alignment showed that important sites of VP1, including the HBGA-binding profile and epitope-predicted sites, were not mutated. This suggests that the reason for the prevalence of NoV GII.2[P16] strains in the population differs from that of the previous pandemic NoV GII.4, whose spread was mainly due to changes in the capsid region that altered blocking antibody epitopes [36,37]. Parra et al. [37] analysed the GII.2 capsid sequences over a 40-year period and found only small differences, which agrees with our results, indicating that the GII.2 strain is more genetically stable than the GII.4 strain. At the same time, the lack of variation in the antigenic regions of the strains may also explain their short duration. These results indicate that a region other than VP1 contributes significantly to the prevalence of the GII.2[P16] strain [38], which may help to reveal the reasons for the GII.17[P17] epidemic that caused outbreaks of acute gastroenteritis in many countries in the winter of 2014-2015. Tohma et al. [39] summarized the reasons for the epidemic caused by GII.17[P17] and believed it to be related to the non-structural region. In this study, amino acid substitutions were found within the non-structural regions, including P48, NTPase, P22 and RdRp. These non-structural proteins play important roles in NoV replication, damaging host cells and promoting virus synthesis by interfering with intracellular protein transport and causing vesicle misorientation and Golgi disintegration [40-42].
The RdRp region can be divided into three highly conserved segments according to function and structure, namely the fingers, thumb, and palm subdomains, which can be organized into motifs A to G [43]. The amino acid mutations at non-structural protein sites of the GII.2[P16] recombinant strain suggest that the non-structural region may provide materials for virus replication, accelerate apoptosis in host cells and enhance fitness by changing the interaction mode. Another study also reported that the GII.2[P16] strain leads to a higher viral load in patients than GII.4[Pe] and GII.17[P17] [44]. However, not all changes in the non-structural region would cause epidemics. Tohma et al. calculated the amino acid substitution sites in the RdRp region of GII.2[P2] and found that the substitution rate of GII.P2 was higher than that of GII.P16 [45]. Regardless, NoV GII.2[P2] outbreaks have not resulted in pandemics, indicating that the RdRp region plays a crucial role in the GII.2[P16] epidemic.
This study showed that GII.2[P16] outbreaks have decreased in Shenzhen, although continuous surveillance to monitor genotypes is still necessary to identify new variants in a timely manner. The limitations of this study were as follows. First, genotyping was successful for only 150 (73.9%) of the 203 NoV outbreaks in our study. Second, our study lacked clinical information and epidemiological data for outbreaks. In future studies, epidemiological surveillance should be more comprehensive and molecular analysis for different NoV genotypes should be developed.