The study was conducted at the Hospital for Tropical Diseases (HTD), Ho Chi Minh City, Vietnam, from June to December 2012, January to June 2014, and January to December 2016. HTD is a 650-bed tertiary care hospital for infectious diseases and a designated referral center for hepatitis patients from the southern provinces of Vietnam. All treatment-naïve chronic HBV patients attending the hepatitis outpatient department for viral load assays were eligible for the study. Systematically selected residual diagnostic samples from 2% of the patients (every 50th patient) in 2012 and 2014 and from 8% of the patients in 2016 were included in this study. Serum samples from the selected patients were stored at −86 °C until further analysis. Patient addresses (province, district, city, and ward), clinical chemistry data, and viral load data were collected from the hospital database. The geolocations of the patients were mapped with QGIS software version 2.18. The study was approved by the ethical review committee of the Hospital for Tropical Diseases (Approval No: SC/ND/12/14).
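The systematic selection described above (every 50th patient, i.e. 2%) can be illustrated with a minimal sketch. This is not the hospital's actual procedure: the function name `systematic_sample`, the choice of starting offset, and the patient identifiers are all hypothetical.

```python
def systematic_sample(patient_ids, interval, start=None):
    """Return every `interval`-th element of an ordered list.

    `start` is the 0-based index of the first selected patient. By default
    the last patient of the first block is taken (an assumption; the text
    does not say where the sampling started).
    """
    if interval < 1:
        raise ValueError("interval must be >= 1")
    if start is None:
        start = interval - 1
    return patient_ids[start::interval]


# Hypothetical attendance list of 1,000 patients in order of arrival.
patients = [f"P{i:04d}" for i in range(1, 1001)]
selected = systematic_sample(patients, interval=50)
sampling_fraction = len(selected) / len(patients)
print(len(selected), selected[:3], sampling_fraction)
```

With an interval of 50, the sampling fraction is 1/50 = 2%, which matches the figure given for 2012 and 2014. The interval used to reach 8% in 2016 is not stated in the text, so it is not modeled here.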
Viral DNA was extracted from 200 μL of plasma using the QIAamp viral DNA kit (QIAgen GmbH, Hilden, Germany) and eluted in 50 μL TBE. The HBV genome was amplified in 4 overlapping fragments (800 bp to 1.2 kb) using the P1-P2, P3-P4, P5-P6, and P7-P8 primer pairs (P1: 5′-TTT TTC ACC TCT GCC TAA TCA-3′; P2: 5′-TTG GGA TTG AAG TCC CAA TCT GG-3′; P3: 5′-GGG TCA CCT TAT TCT TGG-3′; P4: 5′-ATA ACT GAA AGC CAA ACA GTG GG-3′; P5: 5′-GTC TTC TTG GTT GTT CTT CTA C-3′; P6: 5′-GCA GCA CAG CCT AGC AGC CAT GG-3′; P7: 5′-CCA TAC TGC GGA ACT CCT AGC-3′; P8: 5′-CAA TGC TCA GGA GAC TCT AAG GC-3′) [17]. PCR was performed in 40 μL of buffer containing 50 mM Tris-HCl (pH 8.3), 50 mM KCl, 1.5 mM MgCl2, 200 mM deoxynucleoside triphosphates (dNTPs), 1 U of Taq DNA-Pwo Polymerase (Expand High Fidelity assay, Boehringer Mannheim), and 30 pmol of primers. The PCR was run for 35 cycles at 94 °C for 1 min, 58 °C for 1 min, and 72 °C for 1 min in a thermal cycler (ABI 9800) [18]. The PCR products were visualized by 1% agarose gel electrophoresis and stained with Nancy 520 DNA gel stain. The PCR products were purified using the QIAamp PCR product purification kit (QIAgen GmbH, Hilden, Germany). The eluted DNA was quantified by a fluorescence-based dsDNA quantification method using the Quant-iT dsDNA Assay Kit in a Qubit fluorometer (Invitrogen) and was sequenced either on an ABI 3100 system after a cycle sequencing reaction or on an Illumina MiSeq system. For the ABI 3100 system, DNA was sequenced from both ends and the consensus sequences of the overlapping fragments were used to construct the whole genome. For Illumina sequencing, the amplified fragments were pooled with an equal quantity of each individual PCR amplicon. One nanogram of pooled DNA from each sample was subjected to library preparation using the Nextera XT DNA sample preparation kit (Illumina, San Diego, CA, USA), in which each sample was assigned a unique barcode sequence using the Nextera XT Index Kit (Illumina).
Sequencing of the prepared library was carried out using the MiSeq reagent kit v2 (300 cycles, Illumina) on an Illumina MiSeq platform.
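The equal-quantity pooling of the four amplicons before library preparation comes down to simple arithmetic on the fluorometer readings. The sketch below assumes that "equal quantity" means equal mass (the text does not say whether mass or molar amounts were equalized). The function name `pooling_volumes`, the concentrations, and the target mass are hypothetical.

```python
def pooling_volumes(concentrations_ng_per_ul, mass_per_amplicon_ng):
    """Volume (in microliters) of each amplicon needed for an equal-mass pool."""
    volumes = {}
    for name, conc in concentrations_ng_per_ul.items():
        if conc <= 0:
            raise ValueError(f"non-positive concentration for {name}")
        # volume = mass / concentration
        volumes[name] = mass_per_amplicon_ng / conc
    return volumes


# Hypothetical fluorometer readings (ng/uL) for the four fragments of one sample.
readings = {"P1-P2": 20.0, "P3-P4": 10.0, "P5-P6": 40.0, "P7-P8": 25.0}
volumes = pooling_volumes(readings, mass_per_amplicon_ng=100.0)
for fragment, vol in volumes.items():
    print(f"{fragment}: {vol:.1f} uL")
```

The resulting pool would then be diluted so that 1 ng of DNA enters library preparation, as stated in the text.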
The Illumina FASTQ sequence files were assembled using the Geneious 8.0.5 software package (Biomatters Ltd, Auckland, New Zealand) with a reference-based mapping tool after primer sequence clipping (i.e., the consensus sequence was obtained by mapping the individual reads of each sample to a reference sequence). Finally, minor (sub-consensus) variants were screened using the SNP detection tool available in Geneious. A minimum variant frequency of 5% and 500-fold coverage were chosen as cut-off values.
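The two cut-off values above act as a simple filter on candidate variant calls. The following sketch shows that filter in isolation. It does not reproduce the Geneious SNP detection tool; the `Variant` record, its field names, and the example calls are hypothetical, and treating both thresholds as inclusive is an assumption.

```python
from dataclasses import dataclass

MIN_FREQUENCY = 0.05  # minimum variant frequency of 5% (from the text)
MIN_COVERAGE = 500    # 500-fold coverage (from the text)


@dataclass
class Variant:
    position: int   # nucleotide position in the reference
    ref: str        # reference base
    alt: str        # variant base
    coverage: int   # total reads covering the position
    alt_reads: int  # reads supporting the variant base

    @property
    def frequency(self):
        return self.alt_reads / self.coverage if self.coverage else 0.0


def passes_cutoffs(variant):
    """Keep a variant only if both coverage and frequency reach the cut-offs."""
    return variant.coverage >= MIN_COVERAGE and variant.frequency >= MIN_FREQUENCY


# Hypothetical calls for one sample.
calls = [
    Variant(1762, "A", "T", coverage=2000, alt_reads=300),  # 15% at 2000x: kept
    Variant(1764, "G", "A", coverage=400, alt_reads=200),   # coverage below 500x
    Variant(1899, "G", "A", coverage=1500, alt_reads=30),   # frequency of 2%
]
kept = [v for v in calls if passes_cutoffs(v)]
print([v.position for v in kept])
```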
Seventy well-characterized HBV whole-genome sequences (WGS) representing all genotypes and subgenotypes were downloaded from GenBank and, together with the HBV WGS from the current study, subjected to phylogenetic analysis. All complete genome sequences were aligned with MUSCLE from the Geneious package. The sequence alignments were then analyzed with jModelTest to identify the best model for phylogenetic analysis [19]. The suggested nucleotide substitution model (GTR + G + I) was subsequently used for phylogenetic analysis with RAxML v7.2.8 (available in the Geneious package). To assess the reliability of the phylogenetic tree, bootstrap resampling and reconstruction were carried out 100 times.
All sequences were analyzed for possible recombination with RDP4 v4.85 software [20]. Any recombination event detected by at least 5 of the 7 programs (RDP, Geneconv, Bootscan, Maxchi, Chimaera, Siscan and Topol) was considered a true recombination event. The RDP4 v4.85 default settings were used, except for Bootscan and Siscan, for which a window size of 300 bp and a step size of 30 bp were used. The prevalence of recombination, the recombination breakpoints (start and end points), the length of the recombinant fragments, and the locations of the recombination events were determined.
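The acceptance rule above (support from at least 5 of the 7 detection programs) can be written as a small check. This is only an illustration of the rule, not part of RDP4. The function name and the example events are hypothetical, and the program names are copied as they appear in the text.

```python
PROGRAMS = ("RDP", "Geneconv", "Bootscan", "Maxchi", "Chimaera", "Siscan", "Topol")
MIN_SUPPORT = 5  # at least 5 of the 7 programs (from the text)


def is_true_recombination(detected_by):
    """Return True if enough distinct programs support the event."""
    supporting = set(detected_by)
    unknown = supporting - set(PROGRAMS)
    if unknown:
        raise ValueError(f"unknown program(s): {sorted(unknown)}")
    return len(supporting) >= MIN_SUPPORT


# Hypothetical events with the programs that flagged each one.
events = {
    "event_1": ["RDP", "Geneconv", "Bootscan", "Maxchi", "Chimaera"],
    "event_2": ["RDP", "Maxchi", "Siscan"],
}
accepted = [name for name, progs in events.items() if is_true_recombination(progs)]
print(accepted)
```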
HBV reverse transcriptase (RT) regions were analyzed for the presence or absence of 42 potential nucleos(t)ide analogue (NA) resistance (NAr) mutations. These include primary and secondary drug resistance mutations (rt80, rt169, rt173, rt180, rt181, rt184, rt194, rt202, rt204, rt236, and rt250), putative NAr mutations (rt53, rt54, rt82, rt84, rt85, rt91, rt126, rt128, rt139, rt153, rt166, rt191, rt200, rt207, rt213, rt214, rt215, rt217, rt218, rt221, rt229, rt233, rt237, rt238, rt245, and rt256), and pretreatment mutations (rt38, rt124, rt134, rt139, rt224, and rt242), as described earlier [21].
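The three position lists above can be held as sets, which makes the total of 42 easy to verify: the lists contain 11, 26, and 6 entries, and rt139 appears in both the putative and the pretreatment list. The sketch below also shows how listed positions that differ between two aligned RT amino-acid sequences might be reported. The helper names and the toy sequences are hypothetical, and the sequences are assumed to be already aligned and numbered from rt1.

```python
# Positions copied from the text (amino-acid numbering within the RT domain).
PRIMARY_SECONDARY = {80, 169, 173, 180, 181, 184, 194, 202, 204, 236, 250}
PUTATIVE = {53, 54, 82, 84, 85, 91, 126, 128, 139, 153, 166, 191, 200, 207,
            213, 214, 215, 217, 218, 221, 229, 233, 237, 238, 245, 256}
PRETREATMENT = {38, 124, 134, 139, 224, 242}

CATEGORIES = {
    "primary_secondary": PRIMARY_SECONDARY,
    "putative": PUTATIVE,
    "pretreatment": PRETREATMENT,
}
ALL_POSITIONS = set().union(*CATEGORIES.values())


def categories_for(position):
    """Names of the lists that contain a given RT position."""
    return sorted(name for name, pos in CATEGORIES.items() if position in pos)


def changed_positions(reference_rt, sample_rt):
    """Listed positions (1-based) where two aligned RT sequences differ."""
    return sorted(p for p in ALL_POSITIONS
                  if p <= min(len(reference_rt), len(sample_rt))
                  and reference_rt[p - 1] != sample_rt[p - 1])


print(len(ALL_POSITIONS), categories_for(139))

# Toy sequences only: a uniform "reference" and a sample with two changes,
# one at a listed position (rt204) and one at an unlisted position (rt10).
reference = "A" * 344
sample = list(reference)
sample[204 - 1] = "V"
sample[10 - 1] = "G"
changed = changed_positions(reference, "".join(sample))
print(changed)
```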
The preS2/S1 sequences were analyzed for preS1 deletion, preS1 mutations (A2962G, C3026A/T, C2964A, and C3116T), preS2 start codon deletion, and preS2 mutations (T31C, T53C, A162G, and T531C/G). The S gene sequence was analyzed for mutations in the “a” determinant region (T116N, P120S/T, I/T126S/A, Q129H/R, M133L/T, K141E, P142S, D144E, and G145R) and other virulence-associated mutations (N3S, V184A, and S204R).
Mutations in the BCP (C1653T, T1674C/G, T1753V, A1762T, G1764A, C1766T, and T1768A) and the PC/core region (G1899A, C2002T, A2159G, A2189C, and G2203A/T) associated with HCC were also analyzed.
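Nucleotide mutation labels of the form used above (reference base, position, variant base or bases) can be parsed and checked against a genome sequence as sketched below. Several things here are assumptions rather than details from the text: the sequence is taken to be already in the same coordinate system as the labels, "V" is read as the IUPAC code for A, C, or G, the 3215-nt genome length is only a placeholder, and the helper names are hypothetical.

```python
import re

# Reference base(s), 1-based position, variant base(s); alternatives split by "/".
MUTATION_RE = re.compile(r"^([A-Z](?:/[A-Z])*)(\d+)([A-Z](?:/[A-Z])*)$")

# Partial IUPAC table for degenerate nucleotide codes (assumed interpretation).
IUPAC = {"R": "AG", "Y": "CT", "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT"}


def parse_mutation(label):
    """Split a label such as 'T1674C/G' into (ref bases, position, alt bases)."""
    match = MUTATION_RE.match(label)
    if match is None:
        raise ValueError(f"cannot parse mutation label: {label!r}")
    ref, pos, alt = match.groups()
    return set(ref.split("/")), int(pos), set(alt.split("/"))


def has_mutation(sequence, label):
    """True if the base at the labelled position is one of the variant bases."""
    _, pos, alts = parse_mutation(label)
    allowed = set()
    for base in alts:
        allowed.update(IUPAC.get(base, base))
    return sequence[pos - 1] in allowed


# Toy genome: placeholder bases with three positions set by hand.
genome = ["N"] * 3215
genome[1762 - 1] = "T"  # carries A1762T
genome[1764 - 1] = "G"  # reference base, so no G1764A
genome[1753 - 1] = "C"  # matches T1753V (V = A, C, or G)

for label in ("A1762T", "G1764A", "T1753V"):
    print(label, has_mutation(genome, label))
```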
All data (sociodemographic, biochemical, and virological) were recorded and analyzed with the Statistical Package for the Social Sciences (IBM SPSS version 23, NY, USA). Fisher’s exact test was used to compare nominal-scale variables and the Mann-Whitney U test to compare ordinal-scale variables. A P value < 0.05 was considered to indicate a statistically significant difference.
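For readers who want to see what the comparison of nominal variables involves, the sketch below computes a two-sided Fisher's exact test for a 2×2 table from the hypergeometric distribution, using only the Python standard library. It is an illustration, not the SPSS procedure used in the study. The counts are hypothetical, and the two-sided rule shown (summing all tables no more probable than the observed one) is one common convention. The Mann-Whitney U test is not sketched here.

```python
from math import comb

ALPHA = 0.05  # significance threshold stated in the text


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum tables at least as extreme; small tolerance guards against rounding.
    p_value = sum(prob(x) for x in range(lowest, highest + 1)
                  if prob(x) <= p_observed * (1 + 1e-7))
    return min(p_value, 1.0)


# Hypothetical counts: mutation present/absent in two patient groups.
p = fisher_exact_two_sided(8, 2, 1, 5)
print(f"P = {p:.4f}, significant: {p < ALPHA}")
```

For this hypothetical table the P value is 400/11440, about 0.035, which falls below the 0.05 threshold.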