Genetic diversity and molecular epidemiology of multidrug-resistant Mycobacterium tuberculosis in Minas Gerais State, Brazil

Background We aimed to characterize the genetic diversity of drug-resistant Mycobacterium tuberculosis (MTb) clinical isolates and investigate the molecular epidemiology of multidrug-resistant (MDR) tuberculosis from Minas Gerais State, Brazil. Methods One hundred and four MTb clinical isolates were assessed by IS6110-RFLP, 24-locus mycobacterial interspersed repetitive units variable-number tandem repeats (MIRU-VNTR), TB-SPRINT (simultaneous spoligotyping and rifampicin-isoniazid drug-resistance mutation analysis) and 3R-SNP-typing (analysis of single-nucleotide polymorphisms in the genes involved in replication, recombination and repair functions). Results Fifty-seven different IS6110-RFLP patterns were found, among which 50 had unique patterns and 17 were grouped into seven clusters. The discriminatory index (Hunter and Gaston, HGDI) for RFLP was 0.9937. Ninety-nine different MIRU-VNTR patterns were found, 95 of which had unique patterns and nine isolates were grouped into four clusters. The major allelic diversity index in the MIRU-VNTR loci ranged from 0.6568 to 0.7789. The global HGDI for MIRU-VNTR was 0.9991. Thirty-two different spoligotyping profiles were found: 16 unique patterns (n = 16) and 16 clustered profiles (n = 88). The HGDI for spoligotyping was 0.9009. The spoligotyped clinical isolates were phylogenetically classified into Latin-American Mediterranean (66.34 %), T (14.42 %), Haarlem (5.76 %), X (1.92 %), S (1.92 %) and U (unknown profile; 8.65 %). Among the U isolates, 77.8 % were classified further by 3R-SNP-typing as 44.5 % Haarlem and 33.3 % LAM, while the 22.2 % remaining were not classified. Among the 104 clinical isolates, 86 were identified by TB-SPRINT as MDR, 12 were resistant to rifampicin only, one was resistant to isoniazid only, three were susceptible to both drugs, and two were not successfully amplified by PCR. A total of 42, 28 and eight isolates had mutations in rpoB positions 531, 526 and 516, respectively. 
Correlating the cluster analysis with the patient data did not suggest recent transmission of MDR-TB. Conclusions Although our results do not suggest strong transmission of MDR-TB in Minas Gerais (using a classical 100 % MDR-TB identical isolates cluster definition), use of a smoother cluster definition (>85 % similarity) does not allow us to fully eliminate this possibility; hence, around 20–30 % of the isolates we analyzed might be MDR-TB transmission cases.


Background
Multidrug-resistant (MDR) tuberculosis (TB) is an increasingly serious global public health threat that requires robust, efficient and quick actions to improve control and limit the spread of drug-resistant clinical isolates. TB isolates with resistance to isoniazid (INH) and rifampin (RIF), defined as MDR-TB, are prone to the sequential accumulation of resistance-conferring mutations in the drug target genes [1].
Brazil ranks sixteenth among the world's 22 countries with high TB burdens; here, the TB prevalence is 92,000 cases, the incidence rate is 35.4 per 100,000 per year, and the mortality rate is 4.9 per 100,000 of the population according to World Health Organization estimates [2]. Minas Gerais State has the fourth lowest TB incidence (17.9 per 100,000 per year) in Brazil [3]. Although the current prevalence of primary MDR-TB is relatively low in Brazil, it has the potential to become a major public health issue, because resistance to more than one drug has been shown to be strongly associated with household contact, as suggested in a recent study conducted in Amazonia [4].
Mycobacterium tuberculosis complex (MTBC) genotyping methods have been widely used for investigating epidemics involving MDR-TB [5]. These methods help to define the recent transmission factors for MDR-TB isolates and enable better control programs to be initiated to avoid MDR-TB expansion at local or global population levels.
Among the various genetic markers available for studying genetic polymorphism in drug-resistant M. tuberculosis (MTb), restriction fragment length polymorphism (RFLP) analysis of the IS6110 insertion sequence is a "gold standard" for MTBC typing [5][6][7]; this method has been used widely to identify and investigate TB transmission and re-infection rates, and was used at the end of the 1990s to identify cross-contamination in laboratories [7].
However, the IS6110-RFLP technique has many disadvantages in routine practice: it is laborious, requires trained staff, relies on microgram quantities of purified DNA, and has poor discriminatory power when applied to isolates with low IS6110 copy numbers [5,6,8]. The method based on mycobacterial interspersed repetitive units variable-number tandem repeats (MIRU-VNTR) has progressively replaced IS6110-RFLP. MIRU-VNTR, a very powerful technique, provides adequate discrimination between MTb clinical isolates and is comparable to IS6110-RFLP in terms of its accuracy for estimating TB outbreaks and for use in phylogenetic investigations [5][8][9][10][11][12][13]. MIRU-VNTR analysis also generates readily comparable numerical values, a useful feature for interlaboratory studies; hence, over the last 10 years it has come into standard use in TB research [9], although it may progressively be replaced by whole genome sequencing (WGS) [14].
Spacer oligonucleotide typing (spoligotyping), another well-established technique introduced in 1996, is based on the polymorphisms found in the clustered regularly interspaced short palindromic repeats (CRISPR) of MTb. Spoligotyping detects the presence or absence of 43 spacer sequences in the CRISPR region of MTb [5,15,16]. Spoligotyping data can be expressed in an octal or binary format. Additionally, spoligotyping requires only small amounts of crude or purified DNA, and its results are as portable as those of MIRU-VNTR, so the data are easily shared between laboratories. Spoligotyping has provided highly informative results on the phylogeographic distribution of genotypic diversity in MTb. Furthermore, spoligotyping in combination with MIRU-VNTR has excellent discriminatory power for cluster analysis of tubercle bacilli, making it a valuable tool for the epidemiology and evolutionary biology of MTb [17][18][19]. However, because the discriminatory power of spoligotyping is generally inferior to that of IS6110-based RFLP, it cannot be used alone for molecular epidemiology studies [20][21][22].
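The relationship between the binary and octal spoligotype formats can be illustrated with a minimal sketch. It assumes the usual convention in which the first 42 spacers are read as 14 triplets, each written as one octal digit, and spacer 43 is written alone as 0 or 1; the function name is hypothetical and is not part of any typing software used in this study.

```python
def spoligo_binary_to_octal(binary: str) -> str:
    """Convert a 43-character binary spoligotype (1 = spacer present)
    into its 15-digit octal code."""
    if len(binary) != 43 or set(binary) - {"0", "1"}:
        raise ValueError("expected exactly 43 characters, each '0' or '1'")
    # Spacers 1-42: fourteen triplets, each read as one octal digit.
    digits = [str(int(binary[i:i + 3], 2)) for i in range(0, 42, 3)]
    # Spacer 43 stands alone and is written as 0 or 1.
    digits.append(binary[42])
    return "".join(digits)


# Beijing-type signature: spacers 1-34 absent, spacers 35-43 present.
beijing = "0" * 34 + "1" * 9
print(spoligo_binary_to_octal(beijing))  # 000000000003771
```

The octal form carries exactly the same information as the binary form, which is why either can be exchanged between laboratories without loss.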
The use of single-nucleotide polymorphisms (SNPs) as markers of genetic variation for phylogenetic analysis has been described in many studies [23][24][25][26][27][28]. Because SNPs offer important advantages for high-throughput analyses, they are markers of choice in genetics research, as are genomic regions of deletion [29]. Although MTBC structural genes exhibit very low levels of polymorphism among strains [30][31][32][33], higher polymorphism levels were recently found in several genes, including those involved in replication, recombination and repair functions (3R genes) [34]. Indeed, the high-throughput 3R-SNP-typing method can classify undefined spoligotype signatures, making it an efficient, easy-to-use tool for evolutionary studies on MTBC clinical isolates [23].
For genetic characterization of TB drug resistance, molecular detection tests currently search for known mutations in different TB-specific target genes [33]. For MDR-TB, the most frequent mutations associated with RIF and INH resistance can be assessed by sequencing, line-probe assays or other methods [28,29]. RIF resistance is mainly (95 %) caused by mutations in the 81-bp rifampin resistance-determining region (RRDR) of the rpoB gene [35,36]. INH resistance is often caused by mutations in katG (codon 315), in the inhA promoter sequence (positions −15 and −8), and in other genes [35,37]. Phenotypic culture-based drug susceptibility testing (DST), however, remains the gold standard for diagnosis of MDR-TB [38]. Tuberculosis-spoligo-rifampin-isoniazid typing (TB-SPRINT), a 59-plex microbead-based, high-throughput DNA array method, provides simultaneous spoligotyping and mutation analysis of the most common resistance-associated SNPs for RIF (rpoB RRDR direct and indirect coverage) and INH resistance (katG, inhA) [36][38][39][40]. This method will soon be replaced by an improved 77-plex version (TB-SPRINT-plus) capable of identifying mutations in target genes conferring resistance to some second-line drugs (Molina et al., unpublished observations).
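The logic of a genotypic resistance call can be sketched as follows. This is only an illustration of the principle (a RIF marker plus an INH marker gives an MDR call), not the TB-SPRINT software; the marker labels and the function name are hypothetical, and the marker sets are limited to the positions named in this article.

```python
# Hypothetical labels for the resistance-associated positions named in the text.
RIF_MARKERS = {"rpoB_516", "rpoB_526", "rpoB_531"}   # rpoB RRDR codons
INH_MARKERS = {"katG_315", "inhA_-15", "inhA_-8"}    # katG codon, inhA promoter


def classify_resistance(mutated_positions):
    """Return a genotypic call from the mutated positions found in one isolate."""
    found = set(mutated_positions)
    rif = bool(found & RIF_MARKERS)
    inh = bool(found & INH_MARKERS)
    if rif and inh:
        return "MDR"
    if rif:
        return "RIF-resistant only"
    if inh:
        return "INH-resistant only"
    return "no resistance mutation detected"


print(classify_resistance({"rpoB_531", "katG_315"}))  # MDR
print(classify_resistance({"rpoB_526"}))              # RIF-resistant only
```

An isolate without any detected marker is not necessarily susceptible, since resistance can arise from mutations outside the targeted positions; this is one reason why phenotypic DST remains the reference.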
Here, we aimed to investigate the genetic profile of MDR-TB clinical isolates from Minas Gerais, a Brazilian state, using the following four molecular techniques: IS6110-RFLP, MIRU-VNTR, TB-SPRINT and 3R-SNP-typing.

Clinical isolates and drug susceptibility testing
One hundred and four MDR-TB clinical isolates were collected from 2008 to 2013 in Minas Gerais, Brazil, each one corresponding to a unique TB patient. All the isolates were obtained by culture of respiratory samples and represent distinct local laboratories in Minas Gerais. The isolates were referred to the Ezequiel Dias Foundation (FUNED) for culture, identification and drug susceptibility testing. Samples were transferred to the Research Laboratory of Mycobacteria of the Faculty of Medicine of the Federal University of Minas Gerais (UFMG) where they now belong to the MTb clinical isolate collection. The samples are representative of MDR-TB in Minas Gerais. All the clinical isolates were from patients diagnosed with MDR-TB by the BACTEC™ MGIT™ 960 System [41]. Demographic data were obtained from the Information System on Diseases of Compulsory Declaration of Brazil (otherwise known as SINAN).

Genomic DNA extraction
MTb genomic DNA was extracted from mycobacterial colonies subcultured on Löwenstein-Jensen (LJ) medium. One loopful of mycobacterial colonies was collected in a tube containing 500 μL of TE buffer (10 mM Tris-Cl, 1 mM EDTA) and then incubated at 80°C for 60 min. Lysozyme 10 mg/mL (70 μL) was added to each tube, followed by incubation at 65°C for 15 min with occasional mixing, after which 5 M NaCl (100 μL) and 10 % cetyltrimethylammonium bromide (CTAB) (100 μL) were added to each sample. After adding 70 μL of 10 % SDS and 6 μL of proteinase K (10 mg/mL) the samples were vortexed briefly and then incubated at 65°C for 15 min. Chloroform/isoamyl alcohol (24:1 v/v) (700 μL) was added to each tube, and the solution was centrifuged for 20 min at 12,000 rpm at 4°C in a microcentrifuge. The supernatant was transferred to a new 1.5 mL microcentrifuge tube, 450 μL of ice-cold isopropanol was added, and the tube was inverted 20 times to precipitate the nucleic acids. Samples were incubated overnight at −20°C and then centrifuged at 12,000 rpm for 30 min at 4°C in a microcentrifuge, after which the supernatant was discarded. The pellet was air-dried for 2 h and then resuspended in 60 μL of TE buffer (10 mM Tris-Cl, 1 mM EDTA).

IS6110-RFLP
DNA samples were typed by IS6110-RFLP analysis in accordance with the standardized protocol described by van Embden et al. [7] and van Soolingen et al. [42]. The reference strain used was Mt 14323.

MIRU-VNTR
The standard 24 MIRU-VNTR loci method [18] was performed based on agarose gel electrophoresis. The simplex PCR product size was determined as previously reported [43].

TB-SPRINT
High-throughput TB-SPRINT was performed at the Institute of Genetics and Microbiology at the University Paris-Sud, France, on a Luminex 200™ flow cytometry device (Luminex Corp, Austin, TX) as previously described, using a microbead-based DNA array method [44][45][46]. The TB-SPRINT analysis was performed according to the standardized protocol recommended by Gomgnimbou et al. [37].

3R-SNP typing
The 3R-SNP typing was performed as described by Abadia et al. [23]. This seven gene multiplex-PCR method uses primers designed on the dual-priming oligonucleotide principle, which has been shown to strongly increase the mutated to wild-type signal ratio [47].

Bioinformatic cluster analysis
All results (except IS6110-RFLP) were entered into Excel® spreadsheets and then transferred to BioNumerics™ software version 6.6 (Applied Maths, Sint-Martens-Latem, Belgium). IS6110-RFLP fingerprints were digitized and compared using the Dice coefficient and the unweighted pair group method using average linkage (UPGMA) according to the manufacturer's instructions [44]. MIRU-VNTR data were analyzed using the categorical coefficient and UPGMA [45]. TB-SPRINT and 3R-SNP-typing data were analyzed in BioNumerics™ using the Jaccard index and UPGMA [37]. Spoligotyping data were also analyzed using the minimum spanning tree (MST) method, as shown in Fig. 1. A composite data set for the four methods mentioned above and a composite dendrogram were also built (Fig. 2) [46]. Cluster definition was based either on identical patterns across the above four methods (tighter definition) or on a similarity threshold of >85 % (smoother definition) [48,49]. The recent transmission index was determined using the n and (n minus 1) methods [50,51].
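The three similarity coefficients named above can be written out in a few lines. This is a minimal sketch of the textbook definitions, not the BioNumerics™ implementation (which, for band-based data, also applies position tolerance and optimization settings); the function names and the example profiles are hypothetical.

```python
def dice(a, b):
    """Dice coefficient between two sets of band positions."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def jaccard(a, b):
    """Jaccard index between two sets of binary characters scored as present."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def categorical(p, q):
    """Fraction of loci carrying the same allele in two equal-length profiles."""
    if len(p) != len(q):
        raise ValueError("profiles must cover the same loci")
    return sum(x == y for x, y in zip(p, q)) / len(p)


# Hypothetical band sets sharing three of four bands.
print(dice({1, 2, 3, 4}, {1, 2, 3, 5}))      # 0.75
print(jaccard({1, 2, 3, 4}, {1, 2, 3, 5}))   # 0.6
# Hypothetical 24-locus profiles differing at three loci.
print(categorical([2] * 24, [2] * 21 + [3] * 3))  # 0.875
```

For a 24-locus profile taken alone, a >85 % categorical similarity therefore corresponds to profiles differing at no more than three loci (21/24 = 87.5 %, whereas 20/24 = 83.3 %).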
The VNTR allelic diversity index (h) was used to evaluate the allelic diversity of the various VNTR loci. The value of h was calculated using the formula described by Selander et al. [52]. The discriminatory power of each typing method was also computed and compared by the Hunter-Gaston discriminatory index (HGDI) [53,54]. The HGDI was calculated using the discriminatory power calculator available at http://insilico.ehu.es/mini_tools/discriminatory_power/index.php.
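Both indices can be computed directly from type or allele counts. The sketch below assumes the commonly stated forms, HGDI = 1 − Σ n_j(n_j − 1) / [N(N − 1)] and h = (1 − Σ x_i²) · n/(n − 1); it is not the online calculator used in this study, and the cluster-size split in the example is hypothetical, because individual cluster sizes are not reported here.

```python
from collections import Counter


def hgdi(types):
    """Hunter-Gaston discriminatory index from one type label per isolate."""
    counts = Counter(types).values()
    n = sum(counts)
    if n < 2:
        raise ValueError("at least two isolates are required")
    return 1 - sum(c * (c - 1) for c in counts) / (n * (n - 1))


def allelic_diversity(alleles):
    """Allelic diversity h at one locus, with the n/(n - 1) correction."""
    counts = Counter(alleles).values()
    n = sum(counts)
    if n < 2:
        raise ValueError("at least two isolates are required")
    return (1 - sum((c / n) ** 2 for c in counts)) * n / (n - 1)


# Hypothetical split: 50 unique patterns plus 17 isolates in seven clusters.
sizes = [1] * 50 + [2, 2, 2, 2, 2, 3, 4]
labels = [i for i, size in enumerate(sizes) for _ in range(size)]
print(len(labels), round(hgdi(labels), 4))
```

With the correction factor, the two formulas are algebraically identical, so h at a locus is simply the HGDI computed on the alleles observed at that locus.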
To correlate the molecular and patient data, an analysis of basic demographic patient data (city and address) and family data (mother's name) was performed.

Ethics statement
The study was approved by the Ethics Committee of the Federal University of Minas Gerais (number 122.941; CAAE 06611912.8.0000.5149). Isolates from this study were obtained by culturing stock clinical isolates.

IS6110-RFLP typing
The IS6110 copy number of each isolate was assessed from the number of bands hybridizing with the probe. Of the 104 clinical isolates, 67 (65 %) yielded IS6110-RFLP fingerprints. The majority of them (94.03 %) had multiple IS6110 copies (7–15). This high degree of IS6110 polymorphism is in accordance with the results observed in drug-susceptible clinical isolates and suggests a low rate of recent MDR-TB transmission [55][56][57][58][59][60]. The low frequency of isolates with low IS6110 copy numbers (5.97 %) supports the excellent discriminatory power of IS6110-RFLP in our setting. Thirty-seven isolates (35 %) lacked IS6110-RFLP profiles, probably because of insufficient DNA quantity. Fifty-seven different IS6110-RFLP patterns were identified among the 67 typed isolates: 50 were unique, and 17 isolates were found in seven clusters (HGDI = 0.9937).

MIRU-VNTR
The 104 isolates were successfully typed and 99 different MIRU-VNTR patterns were found. Among these patterns, 95 were unique, and nine isolates belonged to four clusters (HGDI = 0.9991). Lineage assignment was performed by MIRU-VNTRplus best-match labeling using 24 global MIRU-VNTR loci and the following six main lineages/sublineages were observed: Cameroon The allelic diversity of each MIRU-VNTR locus in our setting was evaluated and classified as highly (HGDI >0.6), moderately (0.3 < HGDI < 0.6) or poorly discriminative (HGDI <0.3) [52], as summarized in Table 1. The highest allelic diversity indexes were for Qub 26, Mtub 04, MIRU 26, MIRU 16, Qub 11, and MIRU 10. The allelic diversity index was low (h ≤ 0.3) for five of the 24 loci. As supported by this study and others, partial MIRU-VNTR genotyping could be sufficient to define epi-linked clusters after first-line, high-throughput spoligotyping [63]. Ali et al. described seven loci with the highest discriminatory level that could be used preferentially to investigate possible transmission events [61].

TB-SPRINT typing
All the clinical isolates were successfully typed by spoligotyping and were classified phylogenetically into five lineages and 11 sublineages, as shown in Table 2. Thirty-two different spoligotyping patterns were found: 16 were unique, and 88 isolates were grouped into 16 clusters (HGDI = 0.9009). A minimum spanning tree (MST) was built (Fig. 1). The only lineage represented is lineage 4 (Euro-American), with a majority of LAM, T and H and a minority of S and X2. The two major sublineages (LAM: n = 69 and T: n = 15) are in central positions of the MST (Fig. 1). They show tight links between patterns and represent the major proportion of the clinical isolates in Minas Gerais State (n = 84/104 or 80.76 % of the clinical isolates). The continuous transmission of MTb in certain settings is strongly affected by the prevailing population structure of tubercle bacilli [63], which induces the predominance of a homogeneous group, such as the LAM family in South America and the Beijing family in Asia [19][57][64][65][66]. LAM, T and H clinical isolates are more likely to become MDR in Brazil [67,68].
A further characterization of the RIF-INH typing scheme (16-plex)

3R-SNP-typing
For some spoligotyping patterns (n = 9), it was not possible to assign their lineages/sublineages. To assign such isolates (called "U" isolates in SpolDB4 and/or SITVIT-WEB), a 3R-SNP scheme was used successfully to solve 77.8 % of the cases [23]. The patterns found were Haarlem (n = 4) and LAM (n = 3), and these signatures were confirmed by MIRU-VNTR typing. Only two samples remained unclassified.
Out of 104 isolates, the 3R-SNP-typing method allowed us to find mutations in 94 (90.38 %) isolates associated with four specific genotype families: LAM (n = 67), Haarlem (n = 12), X (n = 2), T2 (n = 1) and an unknown lineage (n = 12). The ten isolates remaining (9.62 %) did not amplify successfully. Use of the 3R-SNP-based method helped to clarify the infra-specific taxonomy of our sampling, thus improving our confidence in the evolutionary analysis of our data [23,71].

Comparison of the discriminatory powers of the genotyping methods
The discriminatory powers of the IS6110-RFLP, MIRU-VNTR, TB-SPRINT (spoligotyping and RIF-INH-typing) and 3R-SNP methodologies are shown in Table 3. Use of a combination of different techniques is important for improved epidemiological and phylogeographical interpretation of molecular results [17,18,51,61,62]. MIRU-VNTR has the highest discriminatory power, followed by IS6110-RFLP and TB-SPRINT. 3R-SNP-typing has a lower discriminatory power because only seven SNPs were used in the current format.

Molecular epidemiology in the Minas Gerais State
Among the 104 clinical isolates, 71 displayed a low similarity index (<85 %) and 33 a high similarity index (>85 %). Twelve clusters without any obvious epidemiological link were observed. According to the Recent Transmission Index [50,51], computed by either the n or the (n minus 1) method (that is, with or without subtracting one index case from each cluster), and using the smoother cluster definition (>85 %), the maximum transmission rate of MDR-TB in Minas Gerais State would be (33 minus 12)/104, or 20 %, using the (n minus 1) method, and 33/104, or 31 %, using the n method. However, under a 100 % identity cluster definition, no cluster was found, which suggests that no cases of MDR-TB transmission occurred in Minas Gerais State. Our results clearly call for an extended classical "stone in the pond" epidemiological analysis of the 12 suspected clusters, for which MDR-TB transmission remains likely [72].
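The arithmetic behind the two estimates can be checked with a short sketch. The function name and its parameters are hypothetical; the inputs are the counts given above (33 clustered isolates, 12 clusters, 104 isolates in total).

```python
def recent_transmission_index(clustered, clusters, total, subtract_index_case=True):
    """Recent transmission index.

    (n minus 1) method: (clustered isolates - number of clusters) / total.
    n method:           clustered isolates / total.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    numerator = clustered - clusters if subtract_index_case else clustered
    return numerator / total


rti_n_minus_1 = recent_transmission_index(33, 12, 104)
rti_n = recent_transmission_index(33, 12, 104, subtract_index_case=False)
print(f"{rti_n_minus_1:.1%} {rti_n:.1%}")  # 20.2% 31.7%
```

The (n minus 1) method treats one case per cluster as the source rather than as a recently transmitted case, which is why it gives the lower bound of the two estimates.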
Our results are possibly explained by the fact that MDR-TB in Minas Gerais is individually acquired (i.e., there are no primary MDR-TB cases). An alternative explanation is that many cases of MDR-TB were missed, but this seems unlikely because the sampling is representative of MDR-TB cases in Minas Gerais. The situation for Minas Gerais differs from that of other Brazilian studies [67,73] in that the other studies did not use all the techniques used herein, which may have increased the discriminatory power of our analysis. This discordance could also be explained by the global differences in TB prevalence between different regions of Brazil.
When looking more closely at the geographical origin of the samples from Minas Gerais, we could identify only one factor that made us suspicious of an epidemiological link (in eight patients from the same city). Our results suggest that clustered genotypes indicative of recent MDR-TB transmission should be interpreted with caution, unless direct evidence of epidemiological links between clustered cases can be demonstrated [74].
Despite the low number of samples, this collection of MTBC isolates is likely to be representative of the confirmed MDR-TB cases in Minas Gerais State. One limitation of the present study, however, is the absence of clinical epidemiological data. Also, contact tracing for individual patients could not be performed.
Future long-term studies are necessary to identify the possible risk factors for the emergence of drug resistance and/or treatment failure. Additionally, longitudinal studies in regions of Brazil with a high incidence of MDR-TB are now urgently needed.

Conclusions
To sum up, use of four different discriminant genotyping techniques (IS6110-RFLP, MIRU-VNTR, TB-SPRINT and 3R-SNP-typing) provided useful data for phylogenetic evaluation and fine taxonomic characterization of MDR-TB clinical isolates from Minas Gerais State, Brazil. The most common MDR-TB isolates belonged to the LAM lineage and approximately two thirds of them did not provide evidence for recent transmission of MDR-TB. Our data indicate that MDR-TB in Minas Gerais State is caused by clinical isolates that were not transmitted in recent years or that the outbreak is driven by individually acquired resistance and endogenous reactivation. This situation contrasts with the findings from other Brazilian studies, which all reported a high transmission rate for MDR-TB. Such an important issue requires locally-adapted solutions and state-specific control measures in Brazil. Continuous surveillance of MDR-TB transmission could be improved by introduction of new diagnostic tools and epidemiological research using WGS methods.

Competing interests
The authors declare that they have no competing interests.