Molecular characterization of hepatitis B virus in liver disease patients and asymptomatic carriers of the virus in Sudan

Background: Hepatitis B virus is hyperendemic in Sudan. Our aim was to molecularly characterize hepatitis B virus from Sudanese individuals, with and without liver disease, because genotypes play an important role in clinical manifestation and treatment management. Methods: Ninety-nine patients (30 asymptomatic, 42 cirrhotic, 15 with hepatocellular carcinoma, 7 with acute hepatitis and 5 with chronic hepatitis) were enrolled. The surface and basic core promoter/precore regions and the complete genome were sequenced. Results: The mean ± standard deviation age was 45.7 ± 14.8 years and the male to female ratio was 77:22. The median (interquartile range) hepatitis B virus DNA and alanine aminotransferase levels were 2.8 (2.2-4.2) log IU/ml and 30 (19-49) IU/L, respectively. Using three genotyping methods, 81/99 (82%) isolates could be genotyped. Forty-eight percent of the 99 patients were infected with genotype D, 24% with genotype E, 2% with putative D/E recombinants and 7% with genotype A. Patients infected with genotype E had a higher frequency of hepatitis B e antigen (HBeAg)-positivity and higher viral loads than patients infected with genotype D. Basic core promoter/precore region mutations, including G1896A in 37% of HBeAg-negative individuals, could account for HBeAg-negativity. Pre-S deletion mutants were found in genotypes D and E. Three isolates had the vaccine escape mutant sM133T. Conclusion: Sudanese hepatitis B virus carriers were mainly infected with genotypes D or E, with patients infected with genotype E having higher HBeAg-positivity and higher viral loads. This is the first study to molecularly characterize hepatitis B virus from liver disease patients in Sudan.


Background
Hepatitis B virus (HBV), the prototype member of the family Hepadnaviridae, is responsible for chronic infection of more than 240 million people worldwide [1], of whom 65 million reside in Africa [2].
Sudan is an African country with high HBV seroprevalence of greater than 8% HBsAg-positivity, ranging from 6.8% in central Sudan to 26% in southern Sudan [3][4][5]. HBV infection occurs in early childhood in southern Sudan, with the infection increasing with age in northern Sudan [3,6]. HBV was shown to cause 22% of fulminant hepatitis cases in Sudan [7], and 18.5% of Sudanese blood donors were exposed to the virus and 4% were infected at the time of donation [8]. Nine genotypes of HBV, A-I, with a distinct geographic distribution have been recognized [9][10][11]. A tenth genotype, J, has been proposed but was found only in one person [12]. Genotypes A, D and E circulate in Africa [2]. Genotype A prevails in southern, eastern and central Africa. Genotype D is the dominant genotype in northern Africa, whereas in western Africa genotype E predominates. Subgenotypes have also been identified within genotypes A and D [9,10].
Considering Sudan's unique position and the flux of people across its borders, it is important that the HBV genotypes prevailing in this country are determined. In a single study, in Sudanese asymptomatic blood donors, it was found that 57.5% were infected with genotype E, 40.5% with genotype D and 2% with subgenotype A2 [8]. Moreover, a diversity of genotypes are distributed in neighbouring countries [2]: genotype D in Egypt to the north [13], genotype E in the Democratic Republic of Congo to the west [14] and genotype A in Kenya [15,16] and Uganda [17] to the south. Knowledge of the genotypes prevailing in Sudanese, with and without liver disease, is important in treatment management, as well as disease prognosis because genotypes play a role in both of these aspects [18,19].
The Regional Committee for the World Health Organization (WHO) Eastern Mediterranean Region (EMR), to which Sudan belongs, urged member states to: "Improve the epidemiological surveillance systems, develop a hepatitis registry and implement serosurveys in order to produce reliable data to guide prevention and control measures and monitor impact of preventive strategies" [20].
Thus our objective was to molecularly characterize HBV from HBsAg-positive persons with known clinical status.

Serum samples
A cross-sectional, laboratory-based study was conducted. Ninety-nine sera were collected from HBsAg-positive patients referred to the hepatology and general medical clinics at IbnSina Hospital, Soba University Hospital and Khartoum Teaching Hospital in Khartoum State between August 2008 and March 2009. The clinical report forms (CRFs) were completed by qualified practitioners. Informed consent was obtained from each patient included in the study and the study protocol conforms to the ethical guidelines of the 1975 Declaration of Helsinki as reflected in a priori approval by the Human Ethics Committees of the University of the Witwatersrand and the University of Khartoum and the Ministry of Health of Sudan. Alanine aminotransferase (ALT) levels were determined (reference range 5-40 IU/L) [21] and samples stored at -20°C. Chronic carriers were infected for longer than six months and were classified as asymptomatic carriers (ASCs) if they had normal ALT or as chronic hepatitis (CH) patients if they had abnormal ALT. Acute hepatitis (AH) cases were diagnosed based on clinical presentation (symptoms and clinical presence of jaundice) plus high ALT and the presence of hepatitis B core IgM antibody (HBcAbIgM). Clinical and ultrasonographic evidence was used to diagnose cirrhotic (CR) and hepatocellular carcinoma (HCC) cases.
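The ALT-based split of chronic carriers described above can be written as a small rule. The sketch below is illustrative only: the function name and return labels are hypothetical, and it assumes that "abnormal ALT" means any value outside the 5-40 IU/L reference range, which the text does not state explicitly. The AH, CR and HCC diagnoses relied on clinical, serological and ultrasonographic evidence and are not covered.

```python
# Reference range for ALT quoted in the text (IU/L).
ALT_REFERENCE_RANGE = (5.0, 40.0)


def classify_chronic_carrier(alt_iu_per_l, months_infected):
    """Return 'ASC', 'CH', or None for a non-chronic infection.

    Assumption (not stated in the paper): ALT is 'abnormal' when it
    falls outside the reference range.
    """
    # Chronic carriage is defined as infection for longer than six months.
    if months_infected <= 6:
        return None
    low, high = ALT_REFERENCE_RANGE
    if low <= alt_iu_per_l <= high:
        return "ASC"  # asymptomatic carrier: normal ALT
    return "CH"       # chronic hepatitis: abnormal ALT


print(classify_chronic_carrier(30, 12))  # normal ALT, chronic infection
print(classify_chronic_carrier(85, 12))  # raised ALT, chronic infection
```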

DNA extraction
DNA was extracted from 200 μL of serum using QIAamp DNA mini blood kit (QIAGEN GmbH, Germany), according to manufacturer's instructions, and eluted in 100 μL of buffer.

Real-time PCR quantification of HBV DNA
PCR primers HBV-Taq1 and HBV-Taq2, covering a region of the S gene (321 to 401 from the EcoRI site), with a FAM/TAMRA-labelled TaqMan BS-1 probe [22] were used to quantify HBV DNA in an ABI 7500 Real Time PCR System (Applied Biosystems, Foster City, CA, USA). A serial dilution of cloned plasmid DNA containing a single genome of HBV DNA, with concentrations ranging from 2 × 10^1 to 2 × 10^11 IU/ml, was used as template to generate the standard curve. The second WHO International Standard for HBV Nucleic Acid Amplification Techniques (product code 97/750; National Institute for Biological Standards and Controls (NIBSC), Hertfordshire, UK), which has a final concentration of 10^6 IU/ml, was used as the internal standard. The standard curve, blank, positive and negative controls, and samples were all tested in duplicate. The measured IU/ml for each reaction was calculated using the Ct (cycle threshold) value of each PCR interpolated against the linear regression of the standard curve. The lower detection limit of our assay is ~20 IU/ml. The conversion formula IU = copies/4.7 was used [23][24][25].
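The interpolation step described above (Ct against the linear regression of the standard curve) can be sketched as follows. This is a minimal illustration, not the instrument software's calculation: the function names are hypothetical and the standard-curve Ct values are invented for the example. Only the conversion factor of 4.7 copies per IU comes from the text.

```python
def fit_standard_curve(log10_conc, ct_values):
    """Ordinary least-squares fit of Ct = slope * log10(conc) + intercept."""
    n = len(ct_values)
    mean_x = sum(log10_conc) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_conc)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_conc, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def ct_to_log10_iu(ct, slope, intercept):
    """Interpolate a sample Ct against the fitted standard curve."""
    return (ct - intercept) / slope


def copies_to_iu(copies):
    """Conversion used in the text: IU = copies / 4.7."""
    return copies / 4.7


# Hypothetical standards (log10 IU/ml) and made-up Ct readings.
standards_log10_iu = [2.0, 4.0, 6.0, 8.0]
standards_ct = [33.2, 26.6, 20.0, 13.4]

slope, intercept = fit_standard_curve(standards_log10_iu, standards_ct)
sample_log10_iu = ct_to_log10_iu(30.0, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, "
      f"sample={sample_log10_iu:.2f} log IU/ml")
```

Working in log10 units keeps the fit linear across the ten-log dilution range and yields viral loads directly in the log IU/ml units reported in the results.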

Polymerase chain reaction (PCR) and restriction fragment length polymorphism (RFLP) assay
The basic core promoter/precore (BCP/PC) region and complete S open reading frame (ORF) were amplified in a MyCycler™ thermocycler (Bio-Rad, Hercules, CA, USA) using Promega Taq DNA polymerase (Promega, Madison, WI). The BCP/PC region was amplified using a slight modification of the method described by Takahashi and colleagues [26]: primers 1606 (+) (1606-1625 from EcoRI site) and 1974 (−)  from EcoRI site) were used for the first round (denaturation 94°C for 1 min, annealing 55°C for 1 min, extension 72°C for 2 min, 40 cycles) and 1653(+) (1653-1672 from EcoRI site) and 1959(−)  from EcoRI site) for the second round, with the same cycling conditions as the first-round PCR [25,27]. A nested PCR was carried out to amplify the complete S ORF: primers 2410(+) (2410-2439 from EcoRI site) and 1314(−) (1314-1291 from EcoRI site) were used for the first round (denaturation 94°C for 1 min, annealing 66°C for 1 min, extension 72°C for 3 min, 40 cycles) and 2451 (+) (2451-2482 from EcoRI site) and 1280 (−) (1280-1254 from EcoRI site) for the second round (denaturation 94°C for 1 min, annealing 65°C for 1 min, extension 72°C for 3 min, 40 cycles) [27]. Another nested PCR reaction was carried out to amplify a short region of the S ( [28]. When neither the complete S ORF nor the short S could be amplified, an RFLP assay was used to determine the genotype of the HBV isolates. Primers P7 (256-278 from EcoRI site) and P8 (796-776 from EcoRI site) were used to amplify nucleotides 256-796 of the S region. The amplicon was then cleaved using restriction enzymes HinfI and Tsp509I, in separate reactions, to give the characteristic RFLP patterns for the different genotypes [29]. The complete genome was amplified using a single amplification method with primers P1 (1821-1841 from EcoRI site) and P2 (1825-1806 from EcoRI site), with the following modifications in the cycling conditions [30]: initial denaturation at 98°C for 1 min, 80°C for 30 sec followed by adding the polymerase mix, 98°C for 1 min, then 35 cycles of 98°C for 10 sec, 57°C for 30 sec and 72°C for 1 min.

Analysis for intergenotypic recombination
SimPlot 3.5.1©, an interactive 32-bit software program that plots distance (or similarity) versus position [35], was used to show recombination in isolate SDAC031 [33].
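The idea of plotting similarity against genome position can be illustrated with a simple sliding-window percent identity. This sketch is not SimPlot's implementation and does not perform bootscanning (which builds bootstrapped trees per window). The function name is hypothetical, the default window and step follow the values given in the Figure 3 legend (200 bp and 20 bp), and the input is assumed to be two pre-aligned sequences of equal length.

```python
def sliding_identity(query, reference, window=200, step=20):
    """Return (window midpoint, fraction identical) along an alignment.

    Assumes `query` and `reference` are already aligned and equal in
    length; gap handling is deliberately omitted in this sketch.
    """
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    points = []
    for start in range(0, len(query) - window + 1, step):
        q = query[start:start + window]
        r = reference[start:start + window]
        matches = sum(1 for a, b in zip(q, r) if a == b)
        points.append((start + window // 2, matches / window))
    return points


# Toy example: the query matches the reference only in its first half,
# mimicking a crossover between two parental sequences.
toy_query = "A" * 10 + "C" * 10
toy_reference = "A" * 20
profile = sliding_identity(toy_query, toy_reference, window=10, step=5)
print(profile)
```

A drop in identity to one reference accompanied by a rise in identity to another, at the same position, is what suggests a recombination breakpoint in such plots.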

Clinical and demographic characteristics
The 99 cases were classified into five clinical groups: HCC (n=15), CR (n=42), ASCs (n=30), AH (n=7) and CH (n=5). The HCC patients were significantly older than patients in the other groups (p<0.05) and the AH patients were significantly younger (p<0.05). The ALT levels were highest in the AH group (p<0.05) (Table 1). The median viral load of HBeAg-positive patients [6.99 (5.63-7.57) log IU/ml] was significantly higher than that of HBeAg-negative patients [2.55 (2.09-3.42) log IU/ml] (p<0.05). The viral loads and frequency of HBeAg-positivity did not differ between clinical groups.

HBV genotyping and phylogenetic analysis
In order to maximize the number of isolates genotyped, three methods were used sequentially (Figure 1), and 81 HBV isolates were thus successfully genotyped. Firstly, 53 of 99 were genotyped using amplification and sequencing of the complete S (2848-835 from EcoRI), followed by phylogenetic analysis (Figure 2); secondly, 17 of 46 were genotyped by amplification and sequencing of a short S fragment (520-704 from EcoRI), followed by phylogenetic analysis (trees not shown); and thirdly, 11 of 29 were genotyped by RFLP analysis.
The relatively longer amplicon of the S region of 2.1 kb in length, compared to the shorter regions used for the short S and RFLP amplifications, meant that not all samples could be amplified in the longer region successfully. 18.2% (18/99) could not be genotyped using any of the three methods. No sequences could be obtained for 10 of these isolates and for the remaining 8 only the BCP/PC region was sequenced, which is not sufficient to differentiate between genotypes D and E.
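The counts reported for the three sequential genotyping methods can be checked with a few lines of arithmetic. The sketch below only restates the numbers given in the text; the function and variable names are hypothetical.

```python
def run_cascade(total, successes):
    """Apply genotyping methods in order; each sees only the leftovers."""
    remaining = total
    rows = []
    for method, n_genotyped in successes:
        rows.append((method, n_genotyped, remaining))
        remaining -= n_genotyped
    return rows, remaining


# Counts taken from the text.
steps = [
    ("complete S sequencing", 53),
    ("short S sequencing", 17),
    ("RFLP", 11),
]
rows, ungenotyped = run_cascade(99, steps)
genotyped = 99 - ungenotyped

for method, n_genotyped, attempted in rows:
    print(f"{method}: {n_genotyped} of {attempted}")
print(f"genotyped: {genotyped}/99 ({100 * genotyped / 99:.1f}%)")
print(f"not genotyped: {ungenotyped}/99 ({100 * ungenotyped / 99:.1f}%)")
```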
The complete genome of four isolates was amplified: two (SDAC024 and SDAC031) belonged to genotype D and two (SDAC047 and SDAC125) to genotype E. Although the genotypes determined using the complete genome agreed with those determined using the S ORF, there was a discrepancy in the subgenotype classification of SDAC031. Using phylogenetic analysis of the complete S, it clustered as an outlier of subgenotype D4, whereas following complete genome phylogenetic analysis it clustered with D6 (Figure 2). Simplot analysis showed SDAC031 to be a recombinant of D6 and D4. By mapping the informative sites, we estimated the transition positions between D6 and D4 (Figure 3). The genotype E isolates (SDAC047 and SDAC125) clustered together when the complete S was compared, whereas they separated into different clades in the complete genome analysis (data not shown).

Comparison of patients infected with genotypes D and E
Because the majority of patients were infected with either genotype D or E, further analyses compared the clinical and virological features of patients infected with these genotypes. No significant difference was observed in gender, age, ALT or clinical group between patients infected with either genotype. Patients infected with genotype E showed a significantly higher frequency of HBeAg-positivity than patients infected with genotype D (p<0.05), and HBV DNA levels in patients infected with genotype E were significantly higher (p<0.05) (Table 2).

Molecular characterization of HBV isolates
Analysis of basic core promoter/precore (BCP/PC) region
Of the 70 isolates amplified and sequenced in the BCP/PC region, 62 amplified in the S region and thus had genotype assignments. Fifty-seven isolates belonged to either genotype D or E, and 5 to genotype A. Seven of 8 genotype D or E isolates
Figure 3. Graphs show bootscanning values of query sequence SDAC031 to representative sequences of subgenotypes D1 to D6 and outgroup, genotype F. Graphs were generated using Simplot version 3.5.1© with window size 200 bp, step size 20 bp, gap-strip off, 100 bootstrap replicates, Kimura transition/transversion ratio 2 and NEIGHBOR. Dashed vertical lines indicate breakpoints. The numbers of informative sites shared by the isolate with subgenotype D4 are shown above, and those shared with subgenotype D6 below, on the right-hand side of the four-member trees representing these sites. The genomic regions belonging to subgenotype D4 are shaded in grey.
Although generally the BCP/PC region cannot be used to determine specific genotypes, genotype A and its subgenotypes can be differentiated from genotypes D or E because of specific sequence characteristics. One genotype A isolate (SDAC108), from an HBeAg-positive patient, had A1762T/G1764A. From the BCP/PC sequences it was deduced that this isolate belonged to subgenotype A2, because it had GCAC at 1809-1812, 1858C and 1888G. The other four genotype A isolates were from HBeAg-negative individuals. These isolates, SDAC062, SDAC073, SDAC100 and SDAC113, belonged to subgenotype A1 because they had TCAT at 1809-1812, 1858C and 1888A. Of these, one isolate had a wild-type BCP/PC region, two had A1762T/G1764A (one together with G1862T) and the fourth isolate had G1862T alone. No significant association was found between the BCP/PC mutations and genotypes or clinical groups.
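The BCP/PC signature used above to separate subgenotypes A1 and A2 can be expressed as a simple lookup. This is a hedged sketch restricted to the three markers named in the text (the motif at 1809-1812 and the nucleotides at 1858 and 1888). The function name is hypothetical, and a real assignment would still rest on phylogenetic analysis rather than on these positions alone.

```python
def classify_subgenotype_a(motif_1809_1812, nt_1858, nt_1888):
    """Return 'A1', 'A2', or None when the pattern matches neither.

    Markers as described in the text:
      A1: TCAT at 1809-1812, 1858C, 1888A
      A2: GCAC at 1809-1812, 1858C, 1888G
    """
    motif = motif_1809_1812.upper()
    nt_1858 = nt_1858.upper()
    nt_1888 = nt_1888.upper()
    # Both subgenotype signatures described in the text carry 1858C.
    if nt_1858 != "C":
        return None
    if motif == "TCAT" and nt_1888 == "A":
        return "A1"
    if motif == "GCAC" and nt_1888 == "G":
        return "A2"
    return None


print(classify_subgenotype_a("GCAC", "C", "G"))  # pattern reported for SDAC108
print(classify_subgenotype_a("TCAT", "C", "A"))  # pattern reported for the A1 isolates
```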

Discussion
In Sudan, where HBV is hyperendemic, little is known about the genotypes and mutants of HBV in liver disease patients. Thus, HBV isolates from 99 patients belonging to five clinical groups were studied. Of the 81 samples that were successfully genotyped, close to 60% belonged to genotype D whereas the opposite was true in Sudanese blood donors, where genotype E prevailed (57%) [8]. HCC patients in this study, who were infected with either genotype D or E, were significantly older than ASC, CH, AH and CR patients, with the AH patients being significantly younger (Table 1). This differs from HCC patients infected with genotype A, who develop cancer at a significantly younger age [28]. In agreement with others [37,38], the HCC group had the highest median viral load. High viral loads have been implicated as a risk factor for HCC development [39].
Figure 4. Mutation distribution in the basic core promoter/precore region, at 11 loci of interest (1762, 1764, 1809-1812, 1814, 1815, 1858, 1888 and 1896), of 49 genotype D or E HBV isolates from HBeAg-negative patients. Boxed motif represents the wild-type of genotypes D or E. Data were obtained using the Mutation Reporter Tool [36]. Mutations are shaded in grey. The frequency of the mutational patterns did not differ between genotype D and E.
In agreement with the study in Sudanese blood donors [8], the predominant subgenotype of D was D1. Furthermore, following phylogenetic analysis of the complete S region, two isolates, which lacked the 33 nucleotide deletion, characteristic of genotype D, were genotype D/E recombinants [8,33,40]. Following phylogenetic analysis of the complete S, one isolate belonged to subgenotype A1, and from the BCP/PC sequence, a further four were deduced to belong to subgenotype A1 [41]. This is the first time that subgenotype A1 has been identified in Sudan whereas a previously sequenced genotype A isolate from Sudan belonged to A2 [8]. Subgenotype A1 circulates in southern and eastern Africa and Southern Asia and A2 is found in Northern and Central Europe and North America [42].
Discrepant results were obtained in the subgenotype assignment of SDAC031 when the complete S and complete genome were analyzed. The complete S clustered as an outlier of D4 whereas the complete genome clustered with D6. D4 and D6 are phylogenetically closely related (Figure 2). Further analysis using Simplot showed that it was a D6/D4 recombinant (Figure 3) and the majority of signature amino acids were of D6, with the exception of a number of amino acids, which were of D4 [33]. Geographically, D6 is distributed in the Maghreb and Madagascar, whereas D4 is found in the Americas and Australia. It is possible that D4 originated in Africa but has subsequently been replaced by other subgenotypes of D and that the recombinant is a remnant of the original strain/s.
The HBV genotypes show a distinct geographical distribution in Africa, with genotype D predominating in the North, genotype E in the West and genotype A in the South-East [9]. This is the first study to describe the co-circulation of genotypes D and E in liver disease patients and to allow the comparison of patients infected with these two genotypes. In agreement with a study carried out in Sudanese blood donors [8], viral loads were significantly higher in genotype E-infected patients than in genotype D-infected patients, and patients infected with genotype E showed a significantly higher frequency of HBeAg-positivity. The small size of HBeAg allows it to traverse the placenta and elicit HBe/HBcAg-specific T helper cell tolerance in utero [43]. Thus babies born to HBeAg-positive mothers have higher chronicity rates than those born to HBeAg-negative mothers [43]. The high frequency of HBeAg-positivity in mothers infected with genotype E would lead to its vertical transmission and explain the high prevalence and geographical restriction of this genotype to Africa and to African emigrants to other regions [14]. The higher HBeAg-positivity seen in individuals infected with genotype E could confer tolerance and less serious clinical manifestations than genotype D, where HBeAg-positivity was lower. This could explain why genotype E prevails in the Sudanese blood donors [8], whereas genotype D prevailed in the liver disease patients in the present study.
The majority of the HBeAg-negativity was a result of the classical G1896A mutation, which abolishes HBeAg expression [44] and occurs in genotype D or E but not A because the encapsidation signal secondary structure precludes this mutation in genotype A [45,46]. Other mutations, including the transcriptional A1762T/G1764A and translation initiation mutations, were responsible for HBeAg-negativity in a number of patients. Three isolates had TCAT instead of GCAC in the Kozak sequence preceding the precore initiation codon, which can affect HBeAg expression at the translational level [47]. This Kozak mutation occurred together with 1858C and 1888A (pattern 5, Figure 4), which are characteristics of subgenotype A1 [41]. One isolate had 1858C and 1888A alone. However, following phylogenetic analysis of the complete S region, these four isolates were found to belong to genotype D. It is possible that these patients were co-infected with genotypes D and A or with D/A recombinants. These possibilities can only be discriminated by complete genome cloning and sequencing.
Table footnotes: a, nucleotide or amino acid position relative to genotype D (GU456684); b, mutant pre-S2 start codon italicized; c, nucleotide or amino acid position relative to genotype E (AB205191).
Four different pre-S mutational patterns were identified (Table 3). In genotype E, pre-S2 deletions were found mainly in HCC patients, whereas in genotype D, the deletion mutants were from non-HCC patients. However, the numbers were too small to reach any firm conclusions. Pre-S deletion mutants were found in genotype E isolates from ASCs from Guinea [48] and Sudan [8]. Pre-S deletions/mutations affect the progression to serious liver disease in patients infected with either genotype B or C [49].
Six genotype E and three genotype D isolates had preS2F22L, which is a risk factor for HCC [50,51]. Interestingly, three genotype E isolates had the sM133T mutation, which could possibly compromise antibody neutralization and may represent potential vaccine escape mutants [52]. However, because these individuals were not vaccinated for HBV, this mutation may have emerged as a result of host immune pressure. The reverse transcriptase mutations rtA194T, rtV207M, rtS213T, rtV214A and rtS215Q detected in the present study are neither primary resistance mutations, nor have they ever been seen in overt resistance during therapy [53]. When mutations rtA194T, rtV207M, rtS213T, rtV214A and rtQ215S were tested in our sensitive and reliable in vitro resistance test system, the mutants showed no resistance to lamivudine (LMV), entecavir

Conclusion
This is the first study to molecularly characterize HBV from Sudanese liver disease patients, who were predominantly infected with genotypes D and E, allowing comparison of the effect of these genotypes on clinical manifestation in the same ethnic group.