Molecular characterization of the viral structural gene of the first dengue virus type 2 outbreak in Hunan Province, inland China

Background: An unexpected dengue outbreak occurred in Hunan Province in 2018. This was the first dengue outbreak in this area of inland China and resulted in 172 infections. Methods: To verify the causative agent of this outbreak and characterize its genes, the structural protein C/prM/E genes of viruses isolated from local residents were sequenced, followed by mutation and phylogenetic analysis. Recombination, selection pressure, potential secondary structure, and three-dimensional structure analyses were also performed. Results: Phylogenetic analysis revealed that all epidemic strains were classified as the cosmopolitan DENV-2 genotype, closest to the Zhejiang strain (MH010629, 2017) and then the Malaysia strain (KJ806803, 2013). Compared with the DENV-2 standard strain (DENV-2SS), 151 base substitutions were found in the 89 isolate sequences, resulting in 20 nonsynonymous mutations, of which 17 were present in all samples (two in the capsid protein, six in prM/M, and nine in the envelope protein). Moreover, amino acid substitutions at positions 602 (E322: Q→H) and 670 (E390: N→S) may result in heightened virulence of the epidemic strains. One new DNA-binding site and five new protein binding sites were observed. Two polynucleotide-binding sites and seven protein binding sites were lost compared with DENV-2SS. Meanwhile, five changes were found in helix regions. The helical transmembrane and disordered regions showed minor changes. Protein tertiary structure prediction revealed that amino acid 429 of the E protein was switched from histidine (positively charged) to asparagine (neutral) in the 89 isolates. No recombination events or positive selection pressure sites were detected. To our knowledge, this study is the first gene analysis of the epidemic strain in the first dengue outbreak in Hunan Province, inland China. Conclusions: The causative agent is likely to have come from Zhejiang Province, a neighbouring province where dengue fever broke out in 2017. This study may help in understanding the intrinsic geographical relatedness of DENV-2 and contributes further to research on pathogenicity and vaccine development.
protein 1; NT: Nucleotides; PCR: Polymerase chain reaction; WHO: World Health Organization.

the same period of 2017, mainly occurring in countries such as Paraguay, Argentina, Bangladesh, Cambodia, India, Myanmar, Malaysia, Pakistan, Thailand, Yemen, and China, and mainly caused by the DENV-1 and DENV-2 serotypes. DF has become a serious public health problem in China: according to data provided by the Chinese Center for Disease Control (CCDC), 757,243 people have been infected in the past 42 years [7,8], largely in Hainan [9], Guangdong [10,11], Zhejiang [12,13], Fujian [14], Taiwan, and Yunnan [15-18]. In 2018, an unexpected, first-time dengue outbreak occurred in Hunan, an inland province of China. The first dengue fever case was reported on September 2. By October 6, 172 infected individuals had been confirmed as NS1-positive, with one death. Seventy-three cases were confirmed between September 8 and September 14, accounting for 76.04%. The female-to-male ratio among infected patients was 1.04:1 (49:47), and the average age was 49.5 years (range, 11 to 84 years). It should be noted that no dengue case was found in this area from 2000 to 2013, and there were only five imported cases during 2014-2017, with no local cases.
This was the first dengue outbreak in Hunan, an interior province of China. It provided an early warning that dengue fever has been gradually spreading inland from China's coastal and border regions and highlighted the urgent need to monitor the cross-border and cross-regional spread of dengue virus. The purpose of this paper was to verify the causative agent and analyze the molecular characteristics of the epidemic strain in this outbreak.

Methods
The geographic analysis of Hunan province and study design
The geographical distribution map of dengue fever in China over the years was made using Chinese mapping and drawing software. Blood samples were collected from patients at the two local hospitals responsible for the treatment of DENV patients (Qiyang People's Hospital and the Nongshan Hospital) during the 2018 dengue outbreak. The dengue fever epidemic situation in the areas surrounding Hunan province was also analyzed.

Dengue virus RNA extraction and identification
Viral RNA was extracted from 140 µl of serum using the QIAamp Viral RNA Mini Kit (Qiagen, Hilden, Germany; No. 52906) and then reverse transcribed into cDNA using the PrimeScript™ II 1st Strand cDNA Synthesis Kit (Takara Bio, Shiga, Japan; No. 6210A). Universal dengue virus primers and specific primers for the four serotypes (Table S1) were used for polymerase chain reaction (PCR) to identify the serotype. The reaction conditions in each 25-µl volume were denaturation at 95℃ for 5 min, followed by 30 cycles of denaturation at 95℃ for 30 s, annealing at 55℃ for 30 s, and elongation at 72℃ for 30 s, with a final elongation step at 72℃ for 7 min. The PCR products were then subjected to gel electrophoresis.

Primer design
Three synthetic oligonucleotide primer pairs, F1/R1, F2/R2, and F3/R3 (Table S2), were designed to amplify overlapping fragments that together span the 2,325 nucleotides of the entire structural protein-coding region of DENV-2. All primers were designed using SnapGene software (version 3.2.1), based on the Japan strain (GenBank accession no. M29095). All primers were synthesized and purified by Sangon Biotech Co., Ltd. (Shanghai, China).

Gene amplification and sequencing
PCR was performed with the following protocol (50-µl volume): denaturation at 95℃ for 3 min, followed by 30 cycles of denaturation at 95℃ for 10 s, annealing at 60℃ for 90 s, and elongation at 72℃ for 30 s, with a final elongation step at 72℃ for 7 min. The PCR products were confirmed by agarose gel electrophoresis and sequenced at Sangon Biotech Co., Ltd. (Shanghai, China). Both forward and reverse sequencing were done.
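As a quick plausibility check on the two cycling programs given in the Methods (the 25-µl typing PCR and the 50-µl structural gene PCR), the programmed hold times can simply be added up. The sketch below is illustrative only: the `Program` structure and the function name are hypothetical, and the totals ignore ramping between temperatures, which depends on the thermocycler.

```python
from dataclasses import dataclass
from typing import List, Tuple

Step = Tuple[int, int]  # (temperature in degrees C, hold time in seconds)


@dataclass
class Program:
    initial: Step        # initial denaturation
    cycle: List[Step]    # denaturation, annealing, elongation
    cycles: int          # number of repeats of `cycle`
    final: Step          # final elongation


def hold_time_seconds(p: Program) -> int:
    """Sum of programmed hold times; ramp times are not included."""
    per_cycle = sum(seconds for _, seconds in p.cycle)
    return p.initial[1] + p.cycles * per_cycle + p.final[1]


# Values taken from the two protocols in the text.
typing_pcr = Program((95, 300), [(95, 30), (55, 30), (72, 30)], 30, (72, 420))
structural_pcr = Program((95, 180), [(95, 10), (60, 90), (72, 30)], 30, (72, 420))

for name, prog in [("typing PCR", typing_pcr), ("structural gene PCR", structural_pcr)]:
    print(f"{name}: {hold_time_seconds(prog) / 60:.0f} min of programmed holds")
```

Under these assumptions the typing program holds for 57 min and the structural gene program for 75 min, before any ramping overhead.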

Sequence and phylogenetic analysis
Sequences of the structural protein genes (C/prM/E) were aligned with MEGA 7.0 and compared with 133 dengue virus (DENV) reference strains, including standard strains of the four serotypes (Table S3), which were retrieved from https://www.viprbrc.org. Phylogenetic analysis was performed in MEGA 7.0 using the maximum likelihood (ML) method with a bootstrap test of 1,000 replications.
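To illustrate what a bootstrap test with 1,000 replications does, the sketch below resamples alignment columns with replacement and counts how often each reference is the nearest sequence to a query. It is a simplified stand-in, not the ML tree search performed in MEGA: it uses a plain p-distance, and the sequence names and toy alignment are invented for illustration.

```python
import random
from typing import Dict


def p_distance(a: str, b: str) -> float:
    """Proportion of aligned sites at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def bootstrap_alignment(seqs: Dict[str, str], rng: random.Random) -> Dict[str, str]:
    """Draw alignment columns with replacement (one bootstrap replicate)."""
    n = len(next(iter(seqs.values())))
    cols = [rng.randrange(n) for _ in range(n)]
    return {name: "".join(s[c] for c in cols) for name, s in seqs.items()}


def nearest_neighbor_support(seqs: Dict[str, str], query: str,
                             replicates: int = 1000, seed: int = 0) -> Dict[str, float]:
    """Fraction of replicates in which each reference is closest to `query`.

    Ties are resolved in favour of the reference listed first.
    """
    rng = random.Random(seed)
    counts = {name: 0 for name in seqs if name != query}
    for _ in range(replicates):
        rep = bootstrap_alignment(seqs, rng)
        nearest = min(counts, key=lambda name: p_distance(rep[query], rep[name]))
        counts[nearest] += 1
    return {name: c / replicates for name, c in counts.items()}


# Toy alignment (hypothetical sequences, not data from this study).
toy = {
    "query": "ACGTACGTAC",
    "ref_A": "ACGTACGTAC",
    "ref_B": "TGCATGCATG",
}
support = nearest_neighbor_support(toy, "query")
print(support)
```

In a real analysis each replicate alignment would be used to rebuild a tree, and the support for a clade is the percentage of replicate trees that contain it.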

Molecular characteristics analysis
A total of 89 nucleotide sequences were assembled using BioEdit 7.1.3 (http://www.mbio.ncsu.edu/bioedit/bioedit.html) and then uploaded to GenBank through the Sequin application (version 15.50); BankIt was used to access the National Center for Biotechnology Information (NCBI) GenBank database (https://www.ncbi.nlm.nih.gov/genbank/). Next, mutations in the nucleotide sequences and translated amino acid sequences of the structural proteins of these 89 strains were analyzed with BioEdit and Molecular Evolutionary Genetics Analysis (MEGA) software version 7.0. The secondary structures of the structural proteins of the DENV-2 epidemic and reference strains were predicted with the PredictProtein server (https://www.predictprotein.org/).
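The mutation analysis described above was done in BioEdit and MEGA. As a rough illustration of the underlying idea, the following sketch compares two aligned, in-frame coding sequences codon by codon and separates synonymous from nonsynonymous changes using the standard genetic code. The function names and toy sequences are hypothetical; the code assumes unambiguous A/C/G/T bases with no gaps, and it counts changed codons rather than individual base substitutions.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_codon_changes(ref: str, query: str):
    """Return (synonymous codon positions, nonsynonymous changes).

    Positions are 1-based codon numbers; nonsynonymous entries are
    (position, reference amino acid, query amino acid).
    """
    if len(ref) != len(query) or len(ref) % 3 != 0:
        raise ValueError("sequences must be aligned and in frame")
    synonymous, nonsynonymous = [], []
    for i in range(0, len(ref), 3):
        ref_codon = ref[i:i + 3].upper()
        query_codon = query[i:i + 3].upper()
        if ref_codon == query_codon:
            continue
        ref_aa, query_aa = CODON_TABLE[ref_codon], CODON_TABLE[query_codon]
        position = i // 3 + 1
        if ref_aa == query_aa:
            synonymous.append(position)
        else:
            nonsynonymous.append((position, ref_aa, query_aa))
    return synonymous, nonsynonymous


def percent_identity(a: str, b: str) -> float:
    """Percentage of aligned positions that are identical."""
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


# Toy example (not study data): codon 2 changes silently, codon 3 changes Q to H.
reference = "ATGAACCAG"
isolate = "ATGAATCAC"
syn, nonsyn = classify_codon_changes(reference, isolate)
print(syn, nonsyn, round(percent_identity(reference, isolate), 2))
```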

Recombination and selection pressure analysis
The Genetic Algorithm for Recombination Detection (GARD) [19] on the Datamonkey [20] online server was used for automatic analysis of recombination events in the structural protein regions (775 codons) of the 130 DENV-2 reference sequences in Table S4 and the 89 epidemic strains in our study. The phylogenies server in the software was used for analysis of selection pressure. In this research, four methods, Fixed Effects Likelihood (FEL) [21], Internal Fixed Effects Likelihood (IFEL) [22], Mixed Effects Model of Evolution (MEME) [23], and Fast Unconstrained Bayesian Approximation (FUBAR) [24], were adopted to estimate site-specific selection pressure. A site was inferred to be under positive selection when at least three of the four methods gave ω > 1 (ω = β/α) with a p-value < 0.1 or a posterior probability (α < β) > 0.9.
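The consensus rule stated above (support from at least three of the four methods) can be written down as a small decision function. This is a minimal sketch under stated assumptions: the `SiteEstimate` record and function names are hypothetical, the numbers in the example are invented, and real Datamonkey output would first have to be parsed into this form.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class SiteEstimate:
    alpha: float                        # synonymous rate at the site
    beta: float                         # nonsynonymous rate at the site
    p_value: Optional[float] = None     # reported by FEL, IFEL, MEME
    posterior: Optional[float] = None   # Prob(alpha < beta), reported by FUBAR


def supports_positive_selection(est: SiteEstimate,
                                p_cutoff: float = 0.1,
                                posterior_cutoff: float = 0.9) -> bool:
    """True if omega = beta/alpha > 1 and the result passes its cutoff."""
    if not est.beta > est.alpha:        # same as omega > 1, also safe when alpha == 0
        return False
    by_p = est.p_value is not None and est.p_value < p_cutoff
    by_posterior = est.posterior is not None and est.posterior > posterior_cutoff
    return by_p or by_posterior


def is_positively_selected(results: Dict[str, SiteEstimate], min_methods: int = 3) -> bool:
    """Apply the consensus rule across methods for one codon site."""
    votes = sum(supports_positive_selection(r) for r in results.values())
    return votes >= min_methods


# Invented example values for two codon sites.
site_a = {
    "FEL": SiteEstimate(0.5, 1.8, p_value=0.04),
    "IFEL": SiteEstimate(0.6, 1.5, p_value=0.08),
    "MEME": SiteEstimate(0.5, 3.0, p_value=0.01),
    "FUBAR": SiteEstimate(0.7, 1.1, posterior=0.62),
}
site_b = {
    "FEL": SiteEstimate(1.2, 0.3, p_value=0.02),
    "IFEL": SiteEstimate(1.1, 0.4, p_value=0.03),
    "MEME": SiteEstimate(0.4, 2.5, p_value=0.05),
    "FUBAR": SiteEstimate(1.0, 0.2, posterior=0.01),
}
print(is_positively_selected(site_a), is_positively_selected(site_b))
```

A site supported by a single method only, as in the second example, does not meet the criterion.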

Results
The Geographic Analysis of Hunan Province and Study Design
The geographic relationships between Hunan and the DENV outbreak areas in China were analyzed first. The results showed that Hunan had become a central area of the DENV epidemic, surrounded by Yunnan, Guangdong, Guangxi, Hainan, Fujian, Zhejiang, and other dengue outbreak areas (Fig. 1). (The map in the figure was drawn by ourselves. Part of the data in the map was quoted from Zhao [25], and part was provided by the Centers for Disease Control and Prevention of Hunan Province.)
During the DENV outbreak in Qiyang County, Hunan, from September 2018, a total of 260 serum samples from fever patients were collected, and all of these cases were confirmed to be NS1-positive through colloidal gold testing. Of these, 96 DENV-positive serum samples were screened out from patients whose fever courses were shorter than 5 days. Seven strains were amplified in C6/36 cells for over 6 days to construct a viral seed library of Hunan DENV. Eighty-nine viral RNAs were successfully extracted directly from these serum samples, followed by sequencing of the DENV structural protein C/prM/E genes. Phylogenetic analysis, recombination and selection pressure analysis, and potential secondary structure prediction based on the structural gene sequences of the epidemic strains were performed to understand their genetic characteristics, potential source, and evolution. The study design and the disposition of study subjects are shown in Fig. 2.

Bases and amino acid mutations
Through amplification, three overlapping structural protein gene fragments were obtained for the 89 epidemic strains. After sequencing, the fragments were assembled; the coding nucleotide sequence was 2,325 nt long and encoded 775 amino acids. The homology between isolates was 99.7-100%, and the amino acid (AA) sequence of the E protein was highly conserved. The nucleotide and amino acid sequence identities between the 89 epidemic strains and DENV-2SS were 93.5% and 97.8%, respectively. Two hundred fifteen bases were mutated in the structural protein region of the epidemic strains, of which 195 were synonymous mutations and 20 were nonsynonymous mutations, leading ...

Potential secondary structure of the structural protein region
The protein secondary structures of the DENV-2 standard strain KM204118 and three randomly selected sequences (HNQY2018014, 021, and 028) from the 89 isolates were predicted. Compared with DENV-2SS, the Hunan epidemic strains had lost one nucleotide-binding site (site 6) and one DNA-binding site (site 18), as well as one protein binding region (sites 4 and 5) in the capsid protein (Fig. S1), while one new DNA-binding site (site 74) and two new protein binding sites (19 and 29) were observed in the isolates. Moreover, variations were found in the disordered region among the Hunan epidemic strains, DENV-2SS, and Zhejiang/2017 (Fig. S1). In the prM/M region, which contains 166 amino acids, the protein secondary structure of the epidemic strains was highly consistent with that of the Zhejiang strain (Fig. S2). However, compared with DENV-2SS, three protein binding regions disappeared in the Hunan epidemic strains (sites 122, 133, and 220), and one novel protein binding region emerged (site 144).
Additionally, one helical transmembrane region of the isolates visibly differed from that of DENV-2SS, and eight significant changes were observed in the buried and exposed regions, while no noticeable variation was found in the strand and helix regions (Fig. S2). In the 495-amino-acid E protein, three protein binding sites (584, 596, and 642) disappeared and one new protein binding site (377) appeared in the Hunan isolates; four considerable alterations were observed in the exposed and buried regions, and minor changes were found in the helical transmembrane and disordered regions (Fig. 5; sites 101, 102, 124, 141, 207, 290, 294, 553, 607, 636, 651, and 692-695). Nearly 70% of the changes occurred in the E protein. Nevertheless, compared with the Zhejiang 2017 strain, there was no significant change in the protein binding regions or polynucleotide-binding regions of the structural proteins (C, prM/M, and E) (Fig. 5, Fig. S1, and Fig. S2).

Possible three-dimensional structure of the structural protein E genes
The possible three-dimensional structures of the structural proteins of the representative epidemic strains (HNQY2018014, 021, and 028) were predicted and compared with those of DENV2-SS and the Zhejiang/2017 strain. Homology modeling revealed that the five strains had the same three-dimensional structure. In addition, binding sites were predicted with the 3DLigandSite ligand binding site prediction server. Four protein binding sites were observed in DENV-2SS (HIS429, ALA430, THR435, and GLY436) (Fig. 6E). The Hunan epidemic strains and the Zhejiang/2017 strain have the same binding sites (ASN429, THR435, and GLY436) (Fig. 6D). HNQY2018028 has two different binding sites (429 and 430) compared with DENV-2SS (Fig. 6) and one different binding site (429) compared with Zhejiang/2017.

Recombination and selection pressure analysis
RDP4 software was used to analyze potential recombination events among HNQY2018001-HNQY2018089 and other representative DENV-2 strains. Preliminary analysis showed no evidence of recombination events in these DENV-2 strains (p < 0.05). For the selection pressure analysis, the structural proteins of 202 strains were analyzed, including 113 representative DENV-2 strains and the 89 isolates. The MEME method identified the largest number of positively selected sites (n = 16). However, the FEL, IFEL, and FUBAR methods indicated that all 775 sites were under negative selection pressure (Table 1). Because no site showed significant evidence of positive selection in at least three of the four methods, no positively selected sites could be determined.

Discussion
In China, dengue fever mainly occurs in Guangdong, Hainan, Zhejiang, Fujian, Taiwan, Guangxi, and other coastal regions, or in Yunnan Province and its borders with South Asian countries. Only scattered cases have been reported in inland China, and no large-scale dengue epidemic had previously been reported in the inland area. Hunan is an inland province of China, located near 30 degrees north latitude, with a warm and humid climate from June to November that provides a natural environment for Aedes albopictus breeding. Hunan province is located near Guangdong, Guangxi, Zhejiang, and other areas with a high incidence of dengue fever. The total number of dengue infections in China in 2018 was 5,106, including 3,250 in Guangdong province, 217 sporadic cases in Zhejiang, and 172 cases in Hunan. This was the first dengue outbreak in Hunan, an interior province of China. It provided an early warning that dengue fever has gradually spread inland from China's coastal and border regions and highlighted the urgent need to monitor the cross-border and cross-regional spread of dengue virus.
In this study, we collected serum from 260 patients with fever in Qiyang county, Hunan province, and confirmed 96 NS1-positive cases. Virus from seven of the cases was amplified in C6/36 cell culture to preserve seed stocks. Viral RNA was extracted from the other 89 samples, and the structural protein gene sequences (HNQY2018001-089) were obtained by amplification of overlapping fragments covering 2,325 nucleotides. Phylogenetic tree analysis showed that all isolates belonged to the cosmopolitan DENV-2 genotype, formed a single cluster on the ML tree, and were closely related to the Zhejiang strain (2017, MH110588). They were also closely related to the following four strains: Malaysia (KJ806803, 2013), Bali (KT806318, 2014), Indonesia (KT781561, 2014), and the Philippines (KU517847, 2015). This result suggests that the DENV-2 epidemic in Hunan was possibly imported from Southeast Asian countries such as Malaysia, Indonesia, or the Philippines, either passing through Zhejiang first and then spreading to Hunan or arriving directly from these neighboring countries. Compared with the structural proteins C/prM/E of the standard strain, 17 amino acid substitutions occurred in all 89 epidemic strains. The prM-E protein is the main structural protein of flaviviruses and is related to virulence, host affinity, virus adsorption, penetration, and cell fusion [26]. Hydrophobic amino acids play an important role in maintaining the tertiary structure of proteins through hydrophobic interactions and may affect the virulence of the virus. Tamm et al. found that hydrophobic domains affect the virulence potential of the YadA protein of Yersinia enterocolitica [27]. Sainz et al. determined that single hydrophobic amino acids play an important role in transcriptional activation in vivo [28]. In our study, three hydrophilic amino acids in the coding region were replaced by more hydrophobic ones at positions 196 (M82: T→A), 262 (M148: H→Y), and 351 (E71: D→A).
In addition, a neutral amino acid became a basic amino acid at position 332 (E52: Q→H), and two positively charged amino acids were replaced, one by a negatively charged residue at position 406 (E126: K→E) and one by a neutral residue at position 429 (E139: H→N). These mutations have not been reported previously. Changes in the polarity or charge of amino acids may affect the functions of the prM and E proteins, but these speculations need further study to confirm. Moreover, Moreland et al. defined the region E295-E395 of the dengue virus E protein as domain III (EDIII), which has an immunoglobulin G (IgG)-like fold and plays an important role in mediating the fusion of the virus with the host receptor [29]. In this study, there were two changes in the EDIII domain, at positions 602 (E322: I→V) and 670 (E390: N→S). The mutation of E390 from N to S has been shown to enhance the replication ability of the virus [30], but the influence of the E322 mutation remains to be confirmed.
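The two numbering schemes used above (position in the 775-residue structural polyprotein versus position within each protein) can be related by subtracting the lengths of the preceding proteins. The sketch below assumes a capsid length of 114 residues, inferred here from the stated totals (775 residues overall, 166 in prM/M, 495 in E) rather than given explicitly in the text; the function name is hypothetical.

```python
# Protein order and lengths in the structural region. prM/M (166) and E (495)
# are stated in the text; C = 775 - 166 - 495 = 114 is an inferred assumption.
PROTEIN_LENGTHS = [("C", 114), ("M", 166), ("E", 495)]


def to_protein_numbering(position: int):
    """Map a 1-based polyprotein position to (protein label, position in protein)."""
    if position < 1:
        raise ValueError("position must be >= 1")
    offset = 0
    for label, length in PROTEIN_LENGTHS:
        if position <= offset + length:
            return label, position - offset
        offset += length
    raise ValueError("position lies beyond the structural region")


for pos in (196, 262, 332, 351, 406, 602, 670):
    label, local = to_protein_numbering(pos)
    print(f"{pos} -> {label}{local}")
```

With this assumption the mapping reproduces M82, M148, E52, E71, E126, E322, and E390 for the positions listed. Position 429 would map to E149 rather than E139, so one of those two numbers may warrant checking against the alignment.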
Changes in protein secondary structure may affect enzyme activity. Compared with the DENV-2 standard strain (KM204118), eight protein binding sites (4, 5, 122, 120, 133, 584, 596, and 642) and two polynucleotide-binding sites (6 and 18) were lost, while four new protein binding sites (19, 29, 144, and 377) and one new polynucleotide-binding site (74) emerged. Furthermore, approximately eight obvious changes occurred in the buried and exposed regions. All of these changes may diversify the protein structural domains and thereby influence protein function. Homology modeling and prediction of the possible 3D structure of the structural proteins showed that the epidemic strains and DENV2-SS had similar 3D structures and four predicted protein binding sites; notably, the residue at binding site 429 differed among them (DENV2-SS: His429; Zhejiang/2017 and Hunan epidemic strains: Asn429).
The analysis showed no recombination events among the Hunan epidemic strains and the 130 DENV-2 reference sequences and no distinct positively selected sites in the structural proteins, which contain 775 amino acids, suggesting that these structural protein-coding genes are conserved.

Conclusions
This study reported the characteristics of the structural protein genes of DENV-2 originating from the 2018 outbreak in Hunan, inland China. These data will benefit research on the 2018 outbreak and follow-up studies of DENV outbreaks in China and Southeast Asia. Our findings also showed that the transmission region of DENV has gradually spread from China's border and coastal areas to inland China. This provides a warning that the dengue fever epidemic in China has become increasingly serious and difficult to control and emphasizes the urgent need to monitor the cross-border spread of DENV.

Table 1. Selection pressure analysis of the structural protein of DENV-2 (n = 202) using FEL, IFEL, MEME, and FUBAR.

The study design and the disposition of study subjects. Two hundred and sixty patients who went to the hospital for fever were recruited in our study; among them, 96 cases were identified as dengue NS1-positive. Of these, serum samples were collected for virus amplification and viral RNA extraction. Phylogenetic analysis was then conducted to characterize the origin and prevalence of DENV in Qiyang, Hunan during 2018.

The gene and amino acid mutation site map of structural proteins of epidemic strains from Hunan (HNQY2018001-2018089) compared to the DV2 standard strain KM204118.1.

Figure 5
Prediction results of protein secondary structure of Hunan epidemic strains, DENV-2SS (KM204118) and Zhejiang/2017 (MG356770). Note: The red rhombus denotes the protein-binding region, the yellow dot denotes the DNA-binding region, and the purple dot denotes the RNA-binding region. Blue and red in the first line represent the strand and helix regions, respectively. Blue and yellow in the second line represent the exposed and buried regions, respectively. Purple in the third line indicates the helical transmembrane regions, and green in the fourth line represents the disordered regions.