A first assessment of the genetic diversity of Mycobacterium tuberculosis complex in Cambodia

Background Cambodia is among the 22 high-burden TB countries and has one of the highest rates of TB in South-East Asia. This study aimed to describe the genetic diversity among clinical Mycobacterium tuberculosis complex (MTC) isolates collected in Cambodia and to relate these findings to genetic diversity data from neighboring countries. Methods Using 24-loci VNTR genotyping and spoligotyping, we characterized 105 Mycobacterium tuberculosis clinical isolates collected between 2007 and 2008 in the region of Phnom-Penh, Cambodia; the sample was enriched in multidrug-resistant (MDR) isolates (n = 33). Results Classical spoligotyping confirmed that the East-African Indian (EAI) lineage is highly prevalent in this area (60% in the whole sample and 68% among non-MDR isolates). The Beijing lineage is also largely represented (30% in the whole sample, 21% among non-MDR isolates, OR = 4.51, CI95% [1.77, 11.51]), whereas the CAS lineage was absent. The 24-loci MIRU-VNTR typing scheme distinguished 90 patterns, with only 13 multi-isolate clusters covering 28 isolates. EAI strains could be clustered with only 8 VNTR loci combined with spoligotyping, which could serve as an efficient, easy and inexpensive genotyping standard for this family. Extended spoligotyping suggested relatedness of some unclassified "T1 ancestors" or "Manu" isolates with modern strains and provided finer resolution. Conclusions The genetic diversity of MTC in Cambodia is driven by the EAI and the Beijing families. We validate the usefulness of the extended spoligotyping format in combination with 8 VNTR loci for EAI isolates in this region.


Background
Tuberculosis (TB), caused by Mycobacterium tuberculosis complex (MTC), is one of the most important public health problems in the world. In 2008, WHO estimated 9.4 million incident cases of TB, 11.1 million prevalent cases, 1.3 million deaths among HIV-negative people and an additional 0.52 million TB deaths among HIV-positive people [1]. The largest number of new TB cases occurs in the South-East Asian Region, which accounted for 34% of incident cases in 2006 [2].
Cambodia is among the 22 high-burden TB countries, and has one of the highest rates of TB in South-East Asia [3]. The population of Cambodia is 13.4 million (Population Census of 2008). The new sputum smear positivity rate for Cambodia is estimated to be 220/100,000 inhabitants, whereas the TB incidence is estimated at around 500/100,000 inhabitants per year and the mortality rate at 94/100,000 inhabitants [4]. Phnom-Penh is the capital of Cambodia, with more than 1.3 million inhabitants (Population Census of 2008), which represents 10% of the country's population. The population density of Phnom-Penh is the highest in the country, with around 4,500 people per km².
Apart from studies done in neighboring countries (Vietnam, Thailand and Myanmar), nothing is known about the genetic diversity of MTC in Cambodia [5][6][7]. The goal of this study was to describe the genetic diversity of MTC in Cambodia. Multidrug-resistant TB (MDR-TB) is an important aspect of tuberculosis control [8,9]. A high level of resistance to isoniazid (Inh) and rifampin (Rif) was identified in Phnom Penh in patients co-infected with HIV, and at least 94 new MDR-TB cases were identified in Cambodia in 2009 [4].
We focused on a restricted population sample (n = 118), combining randomly drawn clinical isolates (n = 59) with clinical isolates enriched in MDR-TB (n = 59) according to drug susceptibility testing results. We report in this article the characterization of this collection of 118 clinical isolates as a first insight into the genetic population structure of MTC in Cambodia.

Results
Spoligotyping, classical 43 spacers format
M. tuberculosis complex clinical isolates were collected from 118 patients. DNA samples were extracted from subcultures by a thermolysis procedure. Patient demographic characteristics were as follows: a median age of 38 years, and a sex ratio close to 1 (Table 1). We obtained a 43-spacer spoligotyping result for 113 DNA samples. A total of 42 individual patterns were found, among which 27 patterns were unique, whereas 85 clinical isolates were distributed in 15 clusters containing 2 to 31 isolates (Table 2). Eighteen patterns were not previously reported in the SpolDB4 international database, fifteen of which presented the typical signature of the East African-Indian (EAI) family (absence of spacers 29-32, presence of spacer 33, absence of spacer 34) [10]. Two other previously unreported isolates (isolates Cam108 and Cam116) carried a T1-ancestor or "Manu" signature (all spacers present except 33 and/or 34) and one isolate harbored a very unusual pattern (only spacers 1-3 were present) [11]. Altogether, EAI is clearly the predominant genetic family in Cambodia. It totaled 67 clinical isolates (59% of successfully genotyped isolates) and as many as 56 of the 84 successfully typed non-MDR isolates (67%). The Beijing family encompassed 34 isolates (30% of the total successfully genotyped isolates) but only 18 of the non-MDR typed isolates (21%). In the Beijing lineage, 26 out of 34 (76%) isolates were resistant to at least one of the drugs tested (isoniazid, rifampin, streptomycin, ethambutol), compared with only 34% of EAI isolates. The weight of these two lineages differed significantly between non-MDR and MDR isolates (OR = 4.52; CI 95% [1.78, 11.51]). Other minor families were: the T (Modern) family (a total of 7 clinical isolates, or 6%), some U (unclassified, with likely "T1-ancestor" or "Manu" signatures) patterns (4 clinical isolates, or 3.5%) and 3 other undefined isolates, altogether representing only ∼10% of total isolates.
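The lineage signatures quoted above can be checked mechanically once a spoligotype is expressed as 43 presence/absence flags. The following is a minimal sketch, not the procedure used in this study: it assumes the common 15-digit octal convention (14 octal digits for spacers 1-42 plus a final binary digit for spacer 43), and the function names are hypothetical. It is applied to the octal code that this article quotes for SIT204 (737777777413771).

```python
def octal_to_spacers(octal_code: str) -> list[bool]:
    """Decode a 15-digit octal spoligotype into 43 presence flags (spacer 1 first)."""
    if len(octal_code) != 15:
        raise ValueError("expected a 15-digit octal code")
    # Assumed convention: each of the first 14 digits encodes three spacers.
    bits = "".join(format(int(digit, 8), "03b") for digit in octal_code[:14])
    last = octal_code[14]
    if last not in "01":
        raise ValueError("last digit must be 0 or 1 (spacer 43)")
    return [bit == "1" for bit in bits + last]


def absent_spacers(flags: list[bool]) -> list[int]:
    """Return the 1-based numbers of the spacers that are absent."""
    return [i + 1 for i, present in enumerate(flags) if not present]


def has_eai_signature(flags: list[bool]) -> bool:
    """EAI signature as stated in the text: 29-32 absent, 33 present, 34 absent."""
    return not any(flags[28:32]) and flags[32] and not flags[33]


def has_manu_signature(flags: list[bool]) -> bool:
    """'Manu' signature as stated in the text: all present except 33 and/or 34."""
    missing = set(absent_spacers(flags))
    return bool(missing) and missing <= {33, 34}


sit204 = octal_to_spacers("737777777413771")
print(absent_spacers(sit204))
print(has_eai_signature(sit204), has_manu_signature(sit204))
```

Such a check only tests the spacer positions named in the signature; assignment to a SIT or sublineage still requires a lookup in a reference database such as SpolDB4.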

Spoligotyping, extended 68 spacers format
The use of 25 additional spacers significantly improved the discriminatory power: 51 patterns (+21%) were distinguished, among which 36 were unique, and 77 isolates were distributed in a total of 15 clusters (Table 2). In the Beijing family (Spoligotyping-International-Type, SIT1), 2 subtypes were observed, one of which was a cluster of only 2 isolates. Within the EAI family, several subtypes were observed by 68-spacer spoligotyping (see Additional file 1 and Figure 1). Among the predominant clusters, two SIT signatures (SIT459, SIT204) were found to be prevalent and each split into two subtypes. Genotyping with the 25 extra spacers revealed that the "T1-ancestor" or "Manu" isolates all harbored the same signature (absence of spacers 54-61) as modern isolates from Principal Genetic Groups 2 and 3. The common ancestry of these "T1 ancestor" or "Manu" isolates with the "Latin-American and Mediterranean" (LAM), the "Haarlem" (H), the "Anglo-Saxon" (X), other less-defined "T-related" and some U (unknown) isolates is further supported by their clustering based on MLVA genotyping (see below and Figure 1).

MIRU-VNTR (Multiple Locus VNTR or MLVA) Typing
One hundred and five (105) clinical isolates out of 118 gave interpretable results. The QuB26 failure rate was the highest (no amplification for 10 isolates); for three isolates, we had no amplification of QuB4156 and for three others we had no amplification of Mtub21. For some isolates a single VNTR was missing. The internationally agreed 24-loci MIRU-VNTR typing scheme distinguished 13 clusters totaling 28 isolates (Table 2). Three isolates (Cam001, Cam033 and Cam042) which were unclassified by spoligotyping were similar to EAI isolates by MIRU-VNTR. In particular, isolate Cam033 (SIT405) differed by a single locus variation (SLV in MIRU02) from isolate Cam038, which belongs to the EAI2_Manilla subfamily. The discriminatory power of each MIRU-VNTR locus, for the whole sample and for the EAI or Beijing lineage respectively, was estimated by calculating the Hunter and Gaston Discriminatory Index (HGDI) (Table 3). Nine loci included in the standardized 24 VNTR scheme were uninformative for genotyping Beijing clinical isolates. For Beijing isolates, the most discriminatory loci were (in a decreasing

Combined Spoligotyping (68 spacers) and MIRU-VNTR analysis
The number of clusters (defined as 100% identical isolates on both 24 VNTR and spoligotyping) was low (n = 12) and represented a total of 26 clinical isolates (Figure 1).
Among these clusters, a single cluster of 4 clinical isolates (SIT48) was observed, whereas all others were microclusters of 2 clinical isolates. Interestingly, two of these microclusters were formed of MDR Beijing isolates. The most prevalent Spoligo-International-Types (SIT) were SIT204 and SIT459, both belonging to the EAI family. They were searched for in the genetic databases SITVIT2 http://www.pasteur-guadeloupe.fr and MIRU-VNTRplus to look for the geographical distribution of these two types [12,13]. The search within SITVIT2 of SIT204 (737777777413771), retrieved six clinical isolates , which differs by a single spacer from SIT139, the prototypic spoligotype from the "Hanoi strain" (also designated as EAI4_Vietnam) [10]. This confirms the relatedness to the EAI lineage of the SIT459 isolates included in our study. By using the 68-spacer spoligotyping scheme together with the 6 most discriminatory loci for EAI isolates (ETRD, Mtub39, ETRA, Mtub21, ETRB and QuB26), we obtained almost the same clustering result as with the 24 VNTR loci. Only two clusters of two EAI isolates were not distinguished using this combined scheme: isolates Cam111 and Cam118, which differ by MIRU40 only, and isolates Cam79 and Cam102, which differ by QuB11b only. Considering that isolates that differ by a single marker using MLVA-24 are as likely to represent a non-epidemiologically informative cluster as an epidemiologically informative one (due to genetic convergence on MIRU-VNTR), we suggest that these 6 VNTR loci combined with the spoligotyping scheme could be sufficient to genotype EAI strains in this part of the world for epidemiological purposes. Including two additional markers (MIRU40 and QuB11b), i.e. 8 VNTR loci, could further strengthen the clustering results.
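The effect of restricting the VNTR panel can be illustrated by grouping isolates on the spoligotype plus a chosen subset of loci. The sketch below is illustrative only: the locus names are those listed in the text, but the isolate names, spoligotype labels and repeat copy numbers are hypothetical and do not reproduce the study data.

```python
from collections import defaultdict

# Locus panels named in the text.
PANEL_6 = ("ETRD", "Mtub39", "ETRA", "Mtub21", "ETRB", "QuB26")
PANEL_8 = PANEL_6 + ("MIRU40", "QuB11b")


def cluster(isolates: dict, loci: tuple) -> list[list[str]]:
    """Group isolates that share the spoligotype and every selected VNTR locus.

    Only groups of two or more isolates are returned (clusters).
    """
    groups = defaultdict(list)
    for name, record in isolates.items():
        key = (record["spoligotype"],) + tuple(record["vntr"][locus] for locus in loci)
        groups[key].append(name)
    return sorted(sorted(members) for members in groups.values() if len(members) > 1)


# Hypothetical copy numbers, for illustration only.
base = {"ETRD": 4, "Mtub39": 3, "ETRA": 6, "Mtub21": 2,
        "ETRB": 2, "QuB26": 5, "MIRU40": 3, "QuB11b": 4}
isolates = {
    "iso_A": {"spoligotype": "S1", "vntr": dict(base)},
    "iso_B": {"spoligotype": "S1", "vntr": {**base, "MIRU40": 2}},  # differs at MIRU40 only
    "iso_C": {"spoligotype": "S1", "vntr": {**base, "ETRA": 7}},
    "iso_D": {"spoligotype": "S2", "vntr": dict(base)},
}

print(cluster(isolates, PANEL_6))  # iso_A and iso_B cluster together
print(cluster(isolates, PANEL_8))  # adding MIRU40 separates them
```

In this toy data set, the pair that differs at MIRU40 only is merged by the 6-locus panel and resolved by the 8-locus panel, which mirrors the situation described for isolates Cam111 and Cam118.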

Discussion
This study provides the first insight into the genetic diversity of Mycobacterium tuberculosis complex in Cambodia. Although our results are preliminary, because our sample mixed susceptible, mono-resistant and MDR clinical isolates and covered neither a sufficient period of time nor a sufficient number of clinical isolates, we show that the population structure of M. tuberculosis in Cambodia is dominated by the EAI and Beijing families and is depleted of the Central Asian (CAS) family. Indeed, these two families represent almost 90% of the total population (59-67% for EAI and 21-30% for Beijing, depending on whether the full set or only the susceptible set is taken to be representative of the actual diversity). Other modern families such as T, Haarlem and Latin-American and Mediterranean (LAM) are represented only sporadically.
Comparing our results with previous studies performed in the South and South-East Asian region (India, Bangladesh, Myanmar, Thailand and Vietnam), we observe that the CAS lineage is totally absent in Cambodia and Vietnam only [6,7,9,14-16]. Indonesia and Malaysia seem to be the lower south-eastern borders of the CAS lineage distribution in South-East Asia. Conversely, an increasing gradient of the EAI lineage from India to South-East Asia is observed (Bangladesh: EAI, 27%; CAS, 16%; Beijing, 33%; Myanmar: EAI, 48%; CAS, 5%; Beijing, 32%; Vietnam: EAI, 51%; Beijing, 32%; Cambodia: EAI, 60%; Beijing, 30%). In India, where the relative CAS/EAI distribution remains to be studied with more accuracy in relation to human populations, it was shown that the CAS lineage is predominant in the North (56%) and sporadic in the South (1%), while the EAI lineage proportion increases from 27% in the North to 89% in the South [17].
The MDR-TB clinical isolates were shown to belong to three lineages: Beijing, EAI and "T1-ancestor", also designated as "Manu". The distribution of non-MDR and MDR-TB clinical isolates within the lineages was significantly different: whereas Beijing strains represented only 21% of non-MDR isolates, this percentage increased to around 50% among MDR-TB clinical isolates. We also observed two Beijing clusters, each containing two perfectly identical MDR clinical isolates, whereas no such cases were found among EAI MDR strains. With the available data, the hypothesis of MDR-TB transmission could not be rejected. Drug resistance could thus be associated with the Beijing family, as already observed in Vietnam, Thailand and South Africa [6,9,18-22]. The exact features of MDR-TB in this region clearly warrant further studies.
Another goal of this study was to investigate the discriminatory power of alternative future genotyping schemes. Although 24-loci MIRU-VNTR typing (MLVA) is currently the most powerful genotyping method for M. tuberculosis, it has some technical limitations: in lower-resource countries, MIRU-VNTR can only be performed manually. As these countries often have a high TB prevalence, the work can be very time-consuming and tedious. 24-loci VNTR typing may also be run on capillary electrophoresis, thus achieving higher efficiency. Spoligotyping is a rapid first-line genotyping method, even when performed on a membrane, since it can analyze up to 40 samples per experiment in one day. When combined with a few highly discriminatory VNTR loci, spoligotyping can achieve a discriminatory power similar to that obtained with the 24-VNTR scheme, with a much lower workload and in much less time. In this study, a combination of spoligotyping with 8 VNTR loci (ETRD, Mtub39, ETRA, Mtub21, ETRB, QuB26, MIRU40 and QuB11b) achieved the same discriminatory power as the 24-loci MIRU-VNTR scheme for EAI strains (66 isolates). Thus, a typing scheme in Myanmar, Bangladesh, Laos and Cambodia could consist of spoligotyping to identify the main families, followed by a restricted 8-VNTR panel for EAI isolates, while the 24 VNTR panel would remain mandatory for the remaining MTC clinical isolates.

Conclusions
This study describes TB genetic diversity in Cambodia and provides preliminary information on drug-resistance features in this country. Our results support the absence of the CAS family in Cambodia, as in Vietnam.

Patient Characteristics
Samples were collected from January 2007 to October 2008 from inpatients and outpatients consulting in Phnom Penh and in provincial hospitals. Strains were obtained from fresh material: sputum (n = 93), bronchoaspirates (n = 5), stool (n = 5), gastric aspirates (n = 3), lymph node aspirates (n = 3), cerebrospinal fluid (n = 2), urine (n = 1), blood culture (n = 1) and other (n = 5). The characteristics of the 118 patients are shown in Table 1. The mean age of the patients was 37.66 years. There were 58 men and 56 women (no information for 4 patients), giving a male-to-female ratio of 1.03, slightly higher than the sex ratio of the region of Phnom-Penh (0.9).

Mycobacterium tuberculosis isolation and identification, drug susceptibility patterns
No formal informed consent was obtained for this study since the study involved clinical isolates obtained during routine diagnostic work. Sputum decontamination was performed by the sodium-lauryl-sulfate method with direct examination by the auramine technique followed by culture on Löwenstein-Jensen and classical identification using the niacin test [23].

Genotyping Methods
DNA samples were sent by express carrier to the Institut Pasteur and an aliquot was transferred to the Institute of Genetics and Microbiology to be genotyped. Both spoligotyping and MIRU-VNTR were used as standard genotyping procedures. Spoligotyping: The PCR was performed according to the described protocol with a total of 20 cycles [24]. Spoligotyping was performed using the microbead-based Luminex® 200 platform [25,26]. In addition to the classical 43-spacer spoligotyping, an extended spoligotyping assay with 68 spacers, using 25 extra spacers, was performed [27,28]. Family assignments were performed by searching the SpolDB4 database [10]. The corresponding naming was used. As a reminder, EAI is sometimes called "Indo-Oceanic"; Beijing, "East-Asian"; CAS, "East-African-Indian"; and modern Principal Genetic Group 2 and 3 strains, "Euro-American lineage" [29].

Statistical and clustering analysis
The discriminatory power of the different typing methods was estimated by computing the numerical index of discrimination, the Hunter-Gaston Discriminatory Index (HGDI), as described previously [36]. Markers with missing data were not included in the calculation of the HGDI. The Recent Transmission Index (RTI) was computed using the (n-1) method [37]. The dendrogram was drawn using UPGMA (Unweighted Pair Group Method using Arithmetic averages) as implemented in BioNumerics version 5.1 software (Applied Maths, Sint-Martens-Latem, Belgium), following the methods described in the user's manual and with missing data coded as "0".
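Both indices depend only on the sizes of the groups of identical genotypes. The sketch below uses the standard definitions, HGDI = 1 - Σ n_j(n_j - 1) / (N(N - 1)) and RTI(n-1) = (clustered isolates - number of clusters) / N, and applies them to the combined spoligotyping and 24-VNTR result reported above (one cluster of 4 isolates and eleven clusters of 2). It assumes, as a labeled simplification, that N = 105 isolates were typed by both methods, so that 79 isolates are unique; the function names are hypothetical.

```python
def hgdi(type_sizes: list[int]) -> float:
    """Hunter-Gaston index: probability that two isolates drawn at random
    without replacement have different genotypes."""
    n = sum(type_sizes)
    if n < 2:
        raise ValueError("at least two isolates are required")
    return 1.0 - sum(s * (s - 1) for s in type_sizes) / (n * (n - 1))


def rti_n_minus_1(type_sizes: list[int]) -> float:
    """Recent Transmission Index, (n-1) method: each cluster is assumed to
    contain one source case, which is not counted as recent transmission."""
    n = sum(type_sizes)
    clustered = sum(s for s in type_sizes if s > 1)
    clusters = sum(1 for s in type_sizes if s > 1)
    return (clustered - clusters) / n


# One cluster of 4, eleven clusters of 2, and (assumption) 79 unique isolates.
sizes = [4] + [2] * 11 + [1] * 79
print(f"N = {sum(sizes)}")
print(f"HGDI = {hgdi(sizes):.4f}")
print(f"RTI(n-1) = {rti_n_minus_1(sizes):.3f}")
```

If the true number of isolates typed by both methods differs from 105, both values change slightly, so the printed figures should be read as an illustration of the formulas rather than as results of the study.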
To compare the proportion of each lineage among MDR and non-MDR strains, odds ratios were computed in an Excel® spreadsheet.
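The same calculation can be sketched outside a spreadsheet. The example below is an assumption-laden reconstruction, not the authors' worksheet: the 2×2 table is rebuilt from the counts given in the Results (34 Beijing isolates, 18 of them non-MDR; 67 EAI isolates, 56 of them non-MDR), with MDR counts obtained by subtraction, and the confidence interval uses the Woolf (log-odds) approximation, which the article does not name. The function name is hypothetical.

```python
import math


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio (a*d)/(b*c) with a Woolf (logit) confidence interval.

    a, b: exposed group with / without the outcome
    c, d: reference group with / without the outcome
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    return odds_ratio, math.exp(log_or - z * se_log), math.exp(log_or + z * se_log)


# Counts reconstructed from the Results section (MDR = total - non-MDR; assumption).
beijing_total, beijing_non_mdr = 34, 18
eai_total, eai_non_mdr = 67, 56

beijing_mdr = beijing_total - beijing_non_mdr  # 16
eai_mdr = eai_total - eai_non_mdr              # 11

odds_ratio, lower, upper = odds_ratio_ci(beijing_mdr, beijing_non_mdr, eai_mdr, eai_non_mdr)
print(f"OR = {odds_ratio:.3f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

Under these assumptions the sketch gives values close to the odds ratio and interval reported in the Results, which suggests, without proving it, that a comparable Beijing versus EAI table was used.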