Inferring patient to patient transmission of Mycobacterium tuberculosisfrom whole genome sequencing data
- Josephine M Bryant†1,
- Anita C Schürch†2, 3, 4,
- Henk van Deutekom5,
- Simon R Harris1,
- Jessica L de Beer2,
- Victor de Jager3, 6,
- Kristin Kremer2,
- Sacha A F T van Hijum3, 6, 7,
- Roland J Siezen3, 6,
- Martien Borgdorff5, 8,
- Stephen D Bentley1,
- Julian Parkhill1Email author and
- Dick van Soolingen2, 9
© Bryant et al; licensee BioMed Central Ltd. 2013
Received: 31 January 2013
Accepted: 19 February 2013
Published: 27 February 2013
Mycobacterium tuberculosis is characterised by limited genomic diversity, which makes the application of whole genome sequencing particularly attractive for clinical and epidemiological investigation. However, in order to confidently infer transmission events, an accurate knowledge of the rate of change in the genome over relevant timescales is required.
We attempted to estimate a molecular clock by sequencing 199 isolates from epidemiologically linked tuberculosis cases, collected in the Netherlands spanning almost 16 years.
Multiple analyses support an average mutation rate of ~0.3 SNPs per genome per year. However, all analyses revealed a very high degree of variation around this mean, making the confirmation of links proposed by epidemiology, and inference of novel links, difficult. Despite this, in some cases, the phylogenetic context of other strains provided evidence supporting the confident exclusion of previously inferred epidemiological links.
This in-depth analysis of the molecular clock revealed that it is slow and variable over short time scales, which limits its usefulness in transmission studies. However, the superior resolution of whole genome sequencing can provide the phylogenetic context to allow the confident exclusion of possible transmission events previously inferred via traditional DNA fingerprinting techniques and epidemiological cluster investigation. Despite the slow generation of variation even at the whole genome level we conclude that the investigation of tuberculosis transmission will benefit greatly from routine whole genome sequencing.
KeywordsMycobacterium tuberculosis Molecular clock Whole genome sequencing Transmission Epidemiology
The global burden of tuberculosis remains enormous, resulting in 1.4 million deaths, and 8.7 million new cases in 2011 . Understanding transmission dynamics, and the host and pathogen factors that can affect the spread of tuberculosis, is vital for understanding epidemiology and for effective outbreak management. In order to achieve this, the accurate identification of transmission events through epidemiological and molecular techniques is essential.
In the last two decades several DNA fingerprinting methods for Mycobacterium tuberculosis were developed that reveal different types of DNA polymorphisms in the genome . The first typing method developed used IS6110 restriction fragment length polymorphism (RFLP) analysis, but variable number of tandem repeat (VNTR) typing is currently recognised as the gold standard [3, 4]. In the Netherlands, a country with a low incidence of tuberculosis (6.5 per 100,000 people), epidemiological links between patients are considered “confirmed” if two conditions are met: 1) two M. tuberculosis isolates have an identical RFLP or VNTR profile and 2) interviews suggest that both patients could have been in contact with each other. Genomic changes that are independent of sites monitored by RFLP/VNTR typing, however, are unaccounted for. For some questions concerning the disease, the application of RFLP and VNTR typing is limited. For example Niemann et al. found that two M. tuberculosis isolates had the same RFLP type but differed by 130 SNPs when whole genome sequencing was used . Likewise, isolates with identical DNA fingerprints may not always be epidemiologically linked. For example Gardy et al. found that VNTR typing data suggested a tuberculosis outbreak in Canada was clonal, whereas whole genome sequencing data revealed that there were in fact two concurrent outbreaks . Moreover, RFLP and VNTR typing may suffer from a slow genetic turn over that reduces their usefulness when epidemiological clusters grow over extended time periods. In such case the distinction between spread from primary, secondary, tertiary, etc., sources in the cluster becomes impossible. Whole genome sequencing allows investigation into genomic evolution and transmission at the highest resolution. This level of information can help refine conclusions drawn from traditional fingerprinting methods. The application of sequencing to a large outbreak in Harlingen, the Netherlands, provided a higher resolution picture than could be inferred from typing alone, and reduced the possible number of infection routes for all patients, which agreed with epidemiological investigations .
Due to its slow growth and limited genomic diversity, routine application of clinical whole genome sequencing is particularly attractive for M. tuberculosis. This will make the determination of both antibiotic resistance and transmission events more rapid . However, in order to confidently conclude transmission, knowledge of the mutation rate is required. Whole genome sequencing has demonstrated that M. tuberculosis is characterized by low genomic diversity  suggesting that its mutation rate is low. However, very few studies have attempted to make an accurate measurement of mutation rate. Ford et al. showed using whole genome sequencing, that the mutation rate of M. tuberculosis during active infection of macaques is approximately 0.39 (0.16-0.80 95% CI) single nucleotide polymorphisms per genome per year . However, this was based on only 15 isolates collected from four infections. More recently, an estimate of 0.5 SNPs/genome/year has been made on the basis of within and between patient sampling of 93 patients from a larger study in the UK . In order to better understand the mutation rate in M. tuberculosis during transmission, the genomes of 199 isolates were sequenced. These included isolates from 185 patients with known epidemiological links, and another 14 patient isolates from the same RFLP clusters but with no evident epidemiological link. Here we assess the usefulness of the molecular clock and the phylogenetic resolution provided by whole genome sequencing to both refute and affirm epidemiological inference.
Contact tracing was carried out as part of routine handling by the Municipal Health Service in Amsterdam, and performed following the stone-in-the-pond principle . Contacts were examined in groups, located in concentric circles around the source case, until the observed prevalence of tuberculosis infection corresponds with the expected prevalence. Patients whose isolates had an identical IS6110 RFLP fingerprint were extensively interviewed. The data of these interviews were combined with the contact tracing results, and anonymised for this analysis; therefore data protection permission and ethical approval were not required.
IS6110 restriction fragment length polymorphism (RFLP) typing was performed for all isolates in the study. Spoligotyping, 24-locus variable number of tandem repeats (VNTR) typing were performed on some isolates by the standardized methods detailed elsewhere [3, 13, 14].
Strains of 42 IS6110 RFLP clusters that contained isolates of epidemiologically linked TB patients were selected for whole genome sequencing. Patient data was anonymised and un-identifiable, therefore ethical approval and informed consent were not required. As control strains, the source case isolate and the isolate of the fifth patient in a transmission chain of the Harlingen cluster that were part of an earlier WGS study  were sequenced. Using 454-sequencing and comparative analysis, these two genome sequences had been determined to differ by four SNPs . Also, four multidrug resistant (MDR) isolates of patients diagnosed in Estonia, of which two were linked by transmission, were sequenced. These isolates are representative of cluster EU0051  (de Beer et al., manuscript in preparation) that is one of the largest European MDR –clusters to date. DNA was isolated according to the previously published protocol .
Library construction and sequencing
Libraries were constructed in 18 pools of isolates with each isolate uniquely tagged. Each pool was subjected to paired-end sequencing on a single lane of the Illumina Genome Analyzer GAIIx platform. Thirty-three of the isolates were sequenced with a read length of 76 bases and the remaining 166 with a read length of 108 bases. All raw sequence data used in this work have been deposited in the European Nucleotide Archive under Study Accession Number: ERP000111.
Reads were mapped to a corrected version of the H37Rv reference  using the program SMALT ; and a combination of samtools and bcftools  were used to call bases as part of an in-house pipeline. Appropriate filters were used to reduce the number of false positive SNP calls to a level estimated to be less than 1 SNP per genome. These included at least 75% of high quality mapped reads on each strand agreeing with the call, a base quality score of at least 50 and a mapping quality score of at least 30. Reads which did not map uniquely were discarded, which means repetitive regions are avoided. Mapping and SNP calling were also carried out independently at the Center of Molecular and Biomolecular Informatics (CMBI), Radboud University, using RoVar (Robust Variant detection in genome sequences using Next Generation Data from various platforms). RoVar is available upon request from the authors (https://trac.nbic.nl/rovar/, V.C.L. de Jager, B.A.M. Renckens, R.J. Siezen, and S.A.F.T. van Hijum, unpublished). The mapping results were compared using the epidemiological linked pairs as a test set. Most SNPs were found to agree except those found in regions flanking insertions. This problem was avoided by only calling deletions in the final mapping. As short insertions and deletions are difficult to call in general, only SNPs were considered for all subsequent analysis. The repetitive and GC rich PE/PPE gene family are known to be problematic for analysis with short read data. Most variants in these regions were filtered out due to the quality controls mentioned above. However, to completely account for the possibility of unreliable SNPs called in these regions, the mutation rate analyses were repeated with these regions removed (Additional file 1). It should be noted that this made no impact on the conclusions drawn.
A maximum likelihood tree was constructed based on variable positions using RAxML . Homoplasic SNPs (indicated by identical SNPs that occurred on independent branches of the tree) were inspected manually to discount any that were likely to be the result of incorrect mapping. Path-O-Gen was used to plot root to tip distance against time . This program uses linear regression to root trees with date information at the position that is most compatible with the assumption of the presence of a molecular clock.
Whole genome phylogeny and microevolution between epidemiologically linked pairs
Convergent evolution of drug resistance
Eight of the epidemiologically linked pairs had different drug susceptibility phenotypes for streptomycin and isoniazid. In five of the pairs, a resistant phenotype was found in the primary case isolate, and in three of the pairs it was found in the secondary (see Additional file 3). However no SNPs in genes known or suspected to confer drug resistance were identified to differ between the pairs. In addition minority variants and InDels (insertions and deletions) were also investigated, but no additional variants could be detected. Although this could indicate an undetected mechanism or variant which determines resistance, it may also be due to inaccuracies in the phenotypic resistance tests.
In total, 16 non-synonymous SNPs were found to be homoplasic (Additional file 4). Five of these found in genes rpoB, gyrA , rrs, katG  and embB  have previously been associated with drug resistance. Two of the other homoplasic SNPs occurred in genes thought to be involved in pathogenicity: ino1 and opcA. One possibility is that convergent evolution could be a result of recombination. It has recently been suggested that recombination is frequent and widespread within the M. tuberculosis species , which is counter to a widespread belief that it is strictly clonal. We analysed our data using a recombination detection program  and found no evidence of recombination in our dataset.
Instead, it is more likely that homoplasy has occurred in these genes due to a high selective pressure for traits such as antibiotic resistance. This suggests that the other 11 homoplasic SNPs with no ascribed function deserve further investigation, as they may possibly represent previously un-described pathogenicity or antibiotic-resistance genes conferring selective advantages or potential compensatory mutations .
Deriving a molecular clock
In order to make a judgment about whether direct transmission has taken place using whole genome sequencing data, knowledge of the mutation rate is required. Here we have attempted to estimate this using the epidemiologically linked pairs. Estimates were calculated using SNPs accumulated in the secondary case in each of the linked pairs. SNPs found only in the primary case isolate were excluded as these are likely to represent either variation in the source host population that is not present in the transmitted population, or SNPs generated via laboratory passage. SNPs conferring drug resistance were excluded (n = 7), as these are likely to be subject to strong selection. In addition three pairs were excluded based on the phylogenetic evidence discussed below. The mutation rate was estimated per pair, with sources of error such as diagnostic delay and false positive and false negative SNP calls taken into account (See Additional file 5). The average estimates were wide ranging (0.4-17) with a mean of 5.37 SNPs per genome per year (see Additional file 5). No association between higher estimates and patient factors including sex, country of birth, treatment and drug susceptibility of the isolates was found (Fishers exact test P values 0.11 - 1).
In summary, three methods agreed on an average rate of ~0.3 SNPs per genome per year which is remarkably similar to that estimated by Ford et al.  using the tuberculosis macaque infection model. However, the correlation co-efficient was consistently poor (maximum r2 value of 0.6 in one case) and the level of variation observed at the isolate level was high. This is likely due to the very low rate and stochastic nature of SNP accumulation, and indicates that this estimate needs to be used with caution.
Using the structure of the phylogeny to exclude transmission
One pair from the East Asian linage had a particularly large SNP difference of 149 (Figure 6c). The suspected source case patient lived in the same street as the suspected secondary case patient. It is however unclear if they were in direct contact with each other. Both isolates shared an IS6110 RLFP pattern but their 24-locus VNTR pattern differed in 6 loci. With no detectable evidence of recombination or SNPs in possible hypermutator genes, we examined the sequencing data more closely and we were able to detect two independent deletions in the strains. The large deletion of part of the pks1 gene found in the source isolate was found in another East Asian strain, Beijing 30, in a previous study . This suggests that these non-epidemiologically linked isolates share a more recent common ancestor than the ancestor of epidemiologically linked pair. This evidence along with the large SNP difference means we can be confident in excluding the possibility of recent direct transmission. In the absence of whole genome sequencing, the clear genetic separation of these isolates would have been un-detected.
Identifying novel transmission events
Strikingly 82 pairs of these non-linked isolates, of the sample RFLP type, had a SNP difference of zero. This suggests that amongst these pairs there may be previously un-detected transmission events. The range of date intervals between these pairs ranged from zero days to almost 5 years, which further demonstrates the low and variable mutation rate. In the absence of epidemiological evidence, and the low and variable mutation rate observed, we cannot confidently assess whether direct transmission has occurred in these cases, however this information would provide valuable evidence in a clinical setting, informing further investigation and contact tracing.
In summary, the low mutation rate of M. tuberculosis means that even at the highest resolution provided by whole genome sequencing it is still difficult to confidently affirm the inferences of transmission made by traditional epidemiological techniques. This means it is very difficult to determine transmission inclusively. However, whole genome sequencing does in some cases allow us to exclude direct transmission, by using the phylogenetic context provided by other strains. Not only does whole genome sequencing provide the inter-cluster differentiation provided by current typing methods, but it also achieves intra-cluster resolution which can be used to inform epidemiological investigation.
At the cluster level we were able to estimate an average mutation rate of 0.3 SNPs per genome per year, which is remarkably close to estimates made by Ford et al. . This confirms the extremely low rate of accumulation of variation that characterizes M. tuberculosis, which is approximately 3 times and 44 times slower than that observed in Escherichia coli and Staphylococcus aureus, respectively . It is possible that lack of coverage and mapping to repetitive regions and members of PE and PPE gene families may have resulted in a lower estimated mutation rate. However, it is worth noting that only 5% of the genome on average was unrepresented in this analysis so any effects on the estimated mutation rate are likely to be negligible.
We were unable to detect a clock-like signal at a larger phylogenetic scale (at the lineage level), probably reflecting the different processes of fixation and substitution having variable influences on different parts of the evolutionary history. However, at the intra-cluster level and between the epidemiologically linked pairs, we observed a large level of variation around the mean, which is in contrast to observations of some other bacteria [38, 39]. There are a variety of factors that may have contributed to this variability. Latency is common in tuberculosis infection and could result in considerable discrepancies in the apparent rate of mutation over time. However, Ford et al. recently showed that the mutation rate during latency in macaques was similar to that during active infection . Within host selection for factors such as drug resistance may also result in variation in the accumulation of SNPs over time, as observed for a single patient . However, we propose that the most important factor is probably the low mutation rate itself, meaning over short time scales only a weak signal of a molecular clock can be detected.
This variability means that although a molecular clock may be detectable over longer time frames, it is only an aggregate measure and should be used with extreme caution when applying it to infer local transmission or date recent evolutionary events. Furthermore, while mutation rate can be used to strengthen or exclude epidemiological links, it cannot be used alone to infer direct transmission, particularly due its slow rate. We found no evidence of hyper-mutation in our dataset, and this has not been reported in clinical M. tuberculosis isolates to our knowledge. However it is possible that treatment may impose selection pressures on isolates that could affect the observed rate of fixation, and this should be considered.
In a recent investigation of tuberculosis transmission chains in the UK , an estimate of 0.5 SNPs per genome per year was derived using whole genome sequences from both within-patient and within-household longitudinal sampling. This estimate was based on the number of SNPs accumulated between the first and last sequenced isolate from each patient or transmission chain. Interestingly, the estimate for within-patient mutation rate was identical to our estimate of 0.3 SNPs per genome per year. The higher between-patient rate they estimated may reflect the multiple sources of error when using the inferred epidemiological links to derive mutation rates, which we observed in our dataset. They used this estimate to set a cut-off of ≤5 SNPs for cases less than three years apart. Our more in-depth analysis of the molecular clock reveals a high level of variation which suggests that using a simple cut-off may not be entirely accurate for confirming transmission, but that the phylogenetic context provided by deep sampling of clusters may be more informative.
This dataset reveals that over transmission time scales, the molecular clock of M. tuberculosis is both slow and variable, indicating that genetic distance alone cannot be used to confidently infer transmission. However we propose future transmission studies of M. tuberculosis will benefit from whole genome sequencing through the increased resolution it provides. In addition we demonstrate that deep sampling of a phylogenetic cluster will provide the context to allow exclusion of possible transmission events. The establishment of whole-genome databases will further enhance the possibility to compare samples to exclude or propose transmission.
This work was funded by the Wellcome Trust (grant number 098051) and the Strategical Research (SOR) fund of the RIVM (S/230136/01/GA). JMB is also funded by the Medical Research Council. We would also like to thank Alison Mather (Wellcome Trust, Sanger Institute) for comments on the manuscript.
- WHO: Global tuberculosis report 2012. 2012, Geneva, Switzerland: World Health Organisation, WHO Press
- Van Soolingen D: Molecular epidemiology of tuberculosis and other mycobacterial infections: main methodologies and achievements. J Intern Med. 2001, 249 (1): 1-26. 10.1046/j.1365-2796.2001.00772.x.View ArticlePubMed
- Supply P, Allix C, Lesjean S, Cardoso-Oelemann M, Rusch-Gerdes S, Willery E, Savine E, de Haas P, van Deutekom H, Roring S: Proposal for standardization of optimized mycobacterial interspersed repetitive unit-variable-number tandem repeat typing of mycobacterium tuberculosis. J Clin Microbiol. 2006, 44 (12): 4498-4510. 10.1128/JCM.01392-06.PubMed CentralView ArticlePubMed
- de Beer JL, Kremer K, Kodmon C, Supply P, van Soolingen D: First worldwide proficiency study on variable-number tandem-repeat typing of Mycobacterium tuberculosis complex strains. J Clin Microbiol. 2012, 50 (3): 662-669. 10.1128/JCM.00607-11.PubMed CentralView ArticlePubMed
- Niemann S, Koser CU, Gagneux S, Plinke C, Homolka S, Bignell H, Carter RJ, Cheetham RK, Cox A, Gormley NA: Genomic diversity among drug sensitive and multidrug resistant isolates of mycobacterium tuberculosis with identical DNA fingerprints. PLoS One. 2009, 4 (10): e7407-10.1371/journal.pone.0007407.PubMed CentralView ArticlePubMed
- Gardy JL, Johnston JC, Ho Sui SJ, Cook VJ, Shah L, Brodkin E, Rempel S, Moore R, Zhao Y, Holt R: Whole-genome sequencing and social-network analysis of a tuberculosis outbreak. N Engl J Med. 2011, 364 (8): 730-739. 10.1056/NEJMoa1003176.View ArticlePubMed
- Schurch AC, Kremer K, Daviena O, Kiers A, Boeree MJ, Siezen RJ, van Soolingen D: High-resolution typing by integration of genome sequencing data in a large tuberculosis cluster. J Clin Microbiol. 2010, 48 (9): 3403-3406. 10.1128/JCM.00370-10.PubMed CentralView ArticlePubMed
- Koser CU, Ellington MJ, Cartwright EJ, Gillespie SH, Brown NM, Farrington M, Holden MT, Dougan G, Bentley SD, Parkhill J: Routine use of microbial whole genome sequencing in diagnostic and public health microbiology. PLoS Pathog. 2012, 8 (8): e1002824-10.1371/journal.ppat.1002824.PubMed CentralView ArticlePubMed
- Schurch AC, Kremer K, Kiers A, Daviena O, Boeree MJ, Siezen RJ, Smith NH, van Soolingen D: The tempo and mode of molecular evolution of mycobacterium tuberculosis at patient-to-patient scale. Infect Genet Evol. 2010, 10 (1): 108-114. 10.1016/j.meegid.2009.10.002.View ArticlePubMed
- Ford CB, Lin PL, Chase MR, Shah RR, Iartchouk O, Galagan J, Mohaideen N, Ioerger TR, Sacchettini JC, Lipsitch M: Use of whole genome sequencing to estimate the mutation rate of Mycobacterium tuberculosis during latent infection. Nat Genet. 2011, 43 (5): 482-486. 10.1038/ng.811.PubMed CentralView ArticlePubMed
- Walker TM, Ip CLC, Harrell RH, Evans JT, Kapatai G, Dedicoat MJ, Eyre DW, Wilson DJ, Hawkey PM, Crook DW: Whole-genome sequencing to delineate mycobacterium tuberculosis outbreaks: a retrospective observational study. Lancet Infect Dis. 2013, 13 (2): 137-146. 10.1016/S1473-3099(12)70277-3.PubMed CentralView ArticlePubMed
- Veen J: Microepidemics of tuberculosis: the stone-in-the-pond principle. Tuber Lung Dis. 1992, 73 (2): 73-76. 10.1016/0962-8479(92)90058-R.View ArticlePubMed
- van Embden JD, Cave MD, Crawford JT, Dale JW, Eisenach KD, Gicquel B, Hermans P, Martin C, McAdam R, Shinnick TM: Strain identification of mycobacterium tuberculosis by DNA fingerprinting: recommendations for a standardized methodology. J Clin Microbiol. 1993, 31 (2): 406-409.PubMed CentralPubMed
- Kamerbeek J, Schouls L, Kolk A, van Agterveld M, van Soolingen D, Kuijper S, Bunschoten A, Molhuizen H, Shaw R, Goyal M: Simultaneous detection and strain differentiation of mycobacterium tuberculosis for diagnosis and epidemiology. J Clin Microbiol. 1997, 35 (4): 907-914.PubMed CentralPubMed
- Devaux I, Kremer K, Heersma H, Van Soolingen D: Clusters of multidrug-resistant mycobacterium tuberculosis cases, europe. Emerg Infect Dis. 2009, 15 (7): 1052-1060. 10.3201/eid1507.080994.PubMed CentralView ArticlePubMed
- Van Soolingen D, De Haas P, Kremer K: Restriction fragment length polymorphism typing of mycobacteria. Mycobacterium tuberculosis protocols. Edited by: Parish T, Stoker NG. 2001, Totowa NJ: Humana Press Inc, 165-203. 12View Article
- Casali N, Nikolayevskyy V, Balabanova Y, Ignatyeva O, Kontsevaya I, Harris SR, Bentley SD, Parkhill J, Nejentsev S, Hoffner SE: Microevolution of extensively drug-resistant tuberculosis in Russia. Genome Res. 2012, 22 (4): 735-745. 10.1101/gr.128678.111.PubMed CentralView ArticlePubMed
- SMALT v0.5.8: http://www.sanger.ac.uk/resources/software/smalt/,
- Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R: The sequence alignment/Map format and SAMtools. Bioinformatics. 2009, 25 (16): 2078-2079. 10.1093/bioinformatics/btp352.PubMed CentralView ArticlePubMed
- Stamatakis A: RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics. 2006, 22 (21): 2688-2690. 10.1093/bioinformatics/btl446.View ArticlePubMed
- Path-O-Gen: http://tree.bio.ed.ac.uk/software/pathogen/,
- Gagneux S, Small PM: Global phylogeography of mycobacterium tuberculosis and implications for tuberculosis product development. Lancet Infect Dis. 2007, 7 (5): 328-337. 10.1016/S1473-3099(07)70108-1.View ArticlePubMed
- Ramaswamy S, Musser JM: Molecular genetic basis of antimicrobial agent resistance in mycobacterium tuberculosis: 1998 update. Tubercle and lung disease: the official journal of the International Union against Tuberculosis and Lung Disease. 1998, 79 (1): 3-29. 10.1054/tuld.1998.0002.View Article
- Takiff HE, Salazar L, Guerrero C, Philipp W, Huang WM, Kreiswirth B, Cole ST, Jacobs WR, Telenti A: Cloning and nucleotide sequence of Mycobacterium tuberculosis gyrA and gyrB genes and detection of quinolone resistance mutations. Antimicrob Agents Chemother. 1994, 38 (4): 773-780. 10.1128/AAC.38.4.773.PubMed CentralView ArticlePubMed
- Maus CE, Plikaytis BB, Shinnick TM: Molecular analysis of cross-resistance to capreomycin, kanamycin, amikacin, and viomycin in mycobacterium tuberculosis. Antimicrob Agents Chemother. 2005, 49 (8): 3192-3197. 10.1128/AAC.49.8.3192-3197.2005.PubMed CentralView ArticlePubMed
- Heym B, Alzari PM, Honore N, Cole ST: Missense mutations in the catalase-peroxidase gene, katG, are associated with isoniazid resistance in mycobacterium tuberculosis. Mol Microbiol. 1995, 15 (2): 235-245. 10.1111/j.1365-2958.1995.tb02238.x.View ArticlePubMed
- Sreevatsan S, Stockbauer KE, Pan X, Kreiswirth BN, Moghazeh SL, Jacobs WR, Telenti A, Musser JM: Ethambutol resistance in mycobacterium tuberculosis: critical role of embB mutations. Antimicrob Agents Chemother. 1997, 41 (8): 1677-1681.PubMed CentralPubMed
- Movahedzadeh F, Smith DA, Norman RA, Dinadayala P, Murray-Rust J, Russell DG, Kendall SL, Rison SC, McAlister MS, Bancroft GJ: The mycobacterium tuberculosis ino1 gene is essential for growth and virulence. Mol Microbiol. 2004, 51 (4): 1003-1014. 10.1046/j.1365-2958.2003.03900.x.View ArticlePubMed
- Jiang X, Zhang W, Gao F, Huang Y, Lv C, Wang H: Comparison of the proteome of isoniazid-resistant and -susceptible strains of mycobacterium tuberculosis. Microbial drug resistance. 2006, 12 (4): 231-238. 10.1089/mdr.2006.12.231.View ArticlePubMed
- Namouchi A, Didelot X, Schock U, Gicquel B, Rocha EP: After the bottleneck: genome-wide diversification of the mycobacterium tuberculosis complex by mutation, recombination, and natural selection. Genome Res. 2012, 22 (4): 721-734. 10.1101/gr.129544.111.PubMed CentralView ArticlePubMed
- Croucher NJ, Harris SR, Fraser C, Quail MA, Burton J, van der Linden M, McGee L, von Gottberg A, Song JH, Ko KS: Rapid pneumococcal evolution in response to clinical interventions. Science. 2011, 331 (6016): 430-434. 10.1126/science.1198545.PubMed CentralView ArticlePubMed
- Comas I, Borrell S, Roetzer A, Rose G, Malla B, Kato-Maeda M, Galagan J, Niemann S, Gagneux S: Whole-genome sequencing of rifampicin-resistant mycobacterium tuberculosis strains identifies compensatory mutations in RNA polymerase genes. Nat Genet. 2011, 44 (1): 106-110. 10.1038/ng.1038.PubMed CentralView ArticlePubMed
- Brown T, Nikolayevskyy V, Velji P, Drobniewski F: Associations between mycobacterium tuberculosis strains and phenotypes. Emerg Infect Dis. 2010, 16 (2): 272-280. 10.3201/eid1602.091032.PubMed CentralView ArticlePubMed
- Krishnan N, Malaga W, Constant P, Caws M, Tran TH, Salmons J, Nguyen TN, Nguyen DB, Daffe M, Young DB: Mycobacterium tuberculosis lineage influences innate immune response and virulence and is associated with distinct cell envelope lipid profiles. PLoS One. 2011, 6 (9): e23870-10.1371/journal.pone.0023870.PubMed CentralView ArticlePubMed
- Sun G, Luo T, Yang C, Dong X, Li J, Zhu Y, Zheng H, Tian W, Wang S, 3rd Barry CE: Dynamic population changes in mycobacterium tuberculosis during acquisition and fixation of drug resistance in patients. J Infect Dis. 2012, 206 (11): 1726-1733.View Article
- Tsolaki AG, Gagneux S, Pym AS, de la Salmoniere YO G, Kreiswirth BN, Van Soolingen D, Small PM: Genomic deletions classify the beijing/W strains as a distinct genetic lineage of mycobacterium tuberculosis. J Clin Microbiol. 2005, 43 (7): 3185-3191. 10.1128/JCM.43.7.3185-3191.2005.PubMed CentralView ArticlePubMed
- Didelot X, Bowden R, Wilson DJ, Peto TE, Crook DW: Transforming clinical microbiology with bacterial genome sequencing. Nat Rev Genet. 2012, 13 (9): 601-612.View ArticlePubMed
- Mutreja A, Kim DW, Thomson NR, Connor TR, Lee JH, Kariuki S, Croucher NJ, Choi SY, Harris SR, Lebens M: Evidence for several waves of global transmission in the seventh cholera pandemic. Nature. 2011, 477 (7365): 462-465. 10.1038/nature10392.PubMed CentralView ArticlePubMed
- Harris SR, Feil EJ, Holden MT, Quail MA, Nickerson EK, Chantratita N, Gardete S, Tavares A, Day N, Lindsay JA: Evolution of MRSA during hospital transmission and intercontinental spread. Science. 2010, 327 (5964): 469-474. 10.1126/science.1182395.PubMed CentralView ArticlePubMed
- The pre-publication history for this paper can be accessed here:http://www.biomedcentral.com/1471-2334/13/110/prepub
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.