  • Research article
  • Open access

Validation of genotype cluster investigations for Mycobacterium tuberculosis: application results for 44 clusters from four heterogeneous United States jurisdictions



Tracking the dissemination of specific Mycobacterium tuberculosis (Mtb) strains using genotyped Mtb isolates from tuberculosis patients is a routine public health practice in the United States. The present study proposes a standardized cluster investigation method to identify epidemiologic-linked patients in Mtb genotype clusters. The study also attempts to determine the proportion of epidemiologic-linked patients the proposed method would identify beyond the outcome of the conventional contact investigation.


The study population included Mtb culture positive patients from Georgia, Maryland, Massachusetts and Houston, Texas. Mtb isolates were genotyped by CDC’s National TB Genotyping Service (NTGS) from January 2006 to October 2010. Mtb cluster investigations (CLIs) were conducted for patients whose isolates matched exactly by spoligotyping and 12-locus MIRU-VNTR. CLIs were carried out in four sequential steps: (1) Public Health Worker (PHW) Interview, (2) Contact Investigation (CI) Evaluation, (3) Public Health Records Review, and (4) CLI TB Patient Interviews. Comparison between patients whose links were identified through the study’s CLI interviews (Step 4) and patients whose links were identified earlier in CLI (Steps 1–3) was conducted using logistic regression.


Forty-four clusters were randomly selected from the four study sites (401 patients in total). Epidemiologic links were identified for 189/401 (47 %) study patients in a total of 201 linked patient-pairs. The numbers of linked patients identified in each CLI step were: Step 1 - 105/401 (26.2 %), Step 2 - 15/388 (3.9 %), Step 3 - 41/281 (14.6 %), and Step 4 - 28/119 (23.5 %). Among the 189 linked patients, 28 (14.8 %) were not identified in previous CI. No epidemiologic links were identified in 13/44 (30 %) clusters.


We validated a standardized and practical method to systematically identify epidemiologic links among patients in Mtb genotype clusters, which can be integrated into the TB control and prevention programs in public health settings. The CLI interview identified additional epidemiologic links that were not identified in previous CI. One-third of the clusters showed no epidemiologic links despite being extensively investigated, suggesting that some improvement in the interviewing methods is still needed.



Tuberculosis (TB) contact investigation (CI) is a disease control strategy that performs a crucial role in understanding the most relevant epidemiologic factors influencing TB transmission between individuals [1]. In addition to CI, tracking the dissemination of specific Mycobacterium tuberculosis (Mtb) strains in populations is an important tool used to understand TB transmission dynamics [2]. For over 20 years, investigators have been discovering and utilizing genetic elements of the Mtb genome as molecular genotype markers [3]. The Mtb genotyping methodologies include utilizing the direct repeat locus-based spacer oligonucleotide typing (spoligotyping) [4, 5] and mycobacterial interspersed repetitive unit-variable number of tandem repeat (MIRU-VNTR) typing [6]. These genotyping techniques have been routinely used by the Centers for Disease Control and Prevention (CDC) since 2004 [7].

United States (US) public health departments evaluate persons having known contact with infectious TB patients to identify and treat individuals for whom TB transmission results in active TB disease or latent TB infection (LTBI). Because of the difficulty in identifying and assessing all individuals potentially infected by a given TB patient, CIs provide an incomplete picture of TB transmission. The investigation of TB transmission has been enhanced with the application of Mtb genotyping [8]. When Mtb genotyping is conducted routinely on all or nearly all Mtb isolates from a given jurisdiction, persons with isolates that have the same genotype are termed “clustered” and are suspected of being involved in recent transmission. Individuals with Mtb isolates that have unique genotypes are termed “non-clustered”. TB development in these persons is considered to be due to reactivation of previously acquired LTBI, recent transmission from someone whose isolate was not genotyped, transmission from a person outside the 3-year surveillance time window or geographic area, or relapse of a prior episode of TB disease [9].

Genotypic data can facilitate the detection of previously unsuspected transmission [10–12]. Furthermore, when TB patients are identified as epidemiologic-linked through CI, TB transmission can be confirmed or refuted by matching (concordant) or discrepant (discordant) genotypes, respectively [13]. Due to issues concerning the discriminatory power of the genotyping techniques used [8, 14, 15], as well as the endemic level of a genotype in a jurisdiction [16], it cannot be assumed that TB patients with matching genotypes result from the same chain of transmission. However, transmission between TB patients with matching genotypes can be verified by detecting epidemiologic linkages, which include timing, interactions, or relationships among the persons [17]. Epidemiologic investigations of TB patients having genotypically matched Mtb isolates can uncover transmission venues and epidemiologic links between persons not identified by routine CI [11]. Public health investigators refer to these additional efforts as cluster investigations (CLI). The current study implements a standardized process for conducting CLI systematically and validates the application of this process to a set of randomly selected Mtb clusters identified in public health settings.



Mtb culture-positive patients from four study sites reported to the CDC from January 2006 to October 2010, whose Mtb isolates were genotyped by CDC’s National TB Genotyping Service (NTGS), were evaluated for Mtb clustering. Study sites, Georgia (GA), Maryland (MD), Massachusetts (MA) and Houston (HOU), Texas, were members of the Tuberculosis Epidemiologic Studies Consortium, a consortium of US sites funded by CDC to conduct TB epidemiologic research [18]. All sites except Texas evaluated TB patients for Mtb clustering in counties throughout the state. In Texas, only TB patients reported in the City of Houston jurisdiction (HOU) were evaluated. The study was approved by Institutional Review Boards at CDC and each study site.

TB cluster selection process

Mtb isolates from all patients were characterized by NTGS using spoligotyping and 12-locus MIRU-VNTR (MIRU12). Each unique combination of spoligotype and MIRU12 results is assigned a “PCRType” [19]. Clusters were defined as two or more TB patients with the same PCRType in a given public health jurisdiction (county or HOU) during the study period. Clusters were eligible for random sampling if the cluster consisted of at least three TB patients residing in the same public health jurisdiction whose TB cases were reported between January 1, 2006 and the time of cluster evaluation. Eligible clusters for each of the four sites (see above) were assigned to three priority groups (low, medium, and high priority) based on their calculated log-likelihood ratio (LLR: <1.00, 1.00-5.79, and ≥5.80, respectively) associated with the public health cluster priorities [20, 21]. After reviewing the geospatial scores [21], initial expert panel rankings, and cluster investigation findings, the CDC statistician and the expert panel determined the LLR cut-points that were associated with high-, medium-, and low-priority clusters in our surveillance data. Clusters were then randomly selected from each group. In total, 44 clusters (11 per site) were selected for further investigation. Details of sample size considerations and the sample selection strategy are provided in Additional file 1.
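The cluster definition and priority grouping described above can be sketched in a few lines of Python. This is only an illustration: the record fields (`patient_id`, `jurisdiction`, `pcrtype`), the function names, and the sample records are hypothetical, the study-period filter is omitted, and the LLR values themselves are taken as given rather than computed.

```python
from collections import defaultdict

def find_clusters(patients, min_size=2):
    """Group patients by (jurisdiction, PCRType) and keep groups of at least min_size."""
    groups = defaultdict(list)
    for p in patients:
        groups[(p["jurisdiction"], p["pcrtype"])].append(p["patient_id"])
    return {key: ids for key, ids in groups.items() if len(ids) >= min_size}

def priority_group(llr):
    """Map a log-likelihood ratio to the priority groups stated in the text.

    Cut-points: <1.00 low, 1.00-5.79 medium, >=5.80 high.
    Values between 5.79 and 5.80 are treated as medium (an assumption).
    """
    if llr < 1.00:
        return "low"
    if llr < 5.80:
        return "medium"
    return "high"

# Hypothetical records, for illustration only.
patients = [
    {"patient_id": "A1", "jurisdiction": "County X", "pcrtype": "PCR-A"},
    {"patient_id": "A2", "jurisdiction": "County X", "pcrtype": "PCR-A"},
    {"patient_id": "A3", "jurisdiction": "County X", "pcrtype": "PCR-A"},
    {"patient_id": "B1", "jurisdiction": "County Y", "pcrtype": "PCR-A"},
    {"patient_id": "B2", "jurisdiction": "County Y", "pcrtype": "PCR-A"},
    {"patient_id": "C1", "jurisdiction": "County X", "pcrtype": "PCR-B"},
]

clusters = find_clusters(patients)              # clusters: two or more patients
eligible = find_clusters(patients, min_size=3)  # eligible for sampling: three or more
```

Note that the same PCRType in two jurisdictions yields two separate clusters, which mirrors how three PCRTypes were investigated in more than one study jurisdiction.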

Epidemiologic links

An epidemiologic link was defined as a relationship between two TB cases within a cluster who were determined to have likely shared air space while at least one of the cases had active TB disease. Epidemiologic links were considered definite if two cases named each other as a contact or were identified as having been in the same place at the exact same time; probable if the cases were in the same place in the same timeframe (same week); and possible if the cases were in the same place possibly at the same time (month or season). A homogeneous attribute was defined as a single epidemiologic characteristic describing all patients in a given cluster.
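As a rough illustration, the three linkage strengths defined above can be written as a small decision rule. The function name, its arguments, and the coding of the time window as strings are assumptions made for this sketch and are not taken from the study instruments.

```python
def link_strength(named_each_other, same_place, time_window):
    """Classify an epidemiologic link between two cases.

    named_each_other: True if the two cases named each other as a contact.
    same_place: True if both cases were placed at the same location.
    time_window: 'exact', 'week', 'month', 'season', or None (hypothetical coding).
    Returns 'definite', 'probable', 'possible', or None when no link is supported.
    """
    if named_each_other or (same_place and time_window == "exact"):
        return "definite"
    if same_place and time_window == "week":
        return "probable"
    if same_place and time_window in ("month", "season"):
        return "possible"
    return None
```

For example, two patients who attended the same church during the same season, without naming each other, would be coded as a possible link under this sketch.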

Cluster investigation

Beginning in late 2009, TB surveillance data for all subjects were obtained using the Report of Verified Case of Tuberculosis (RVCT) [22] through collaborations with local public health staff. The subjects were part of the selected clusters in the data routinely collected by the CDC’s National Tuberculosis Surveillance System.

In coordination with local TB programs, CLIs for each selected cluster were conducted to determine whether TB patients in a given cluster had epidemiologic-linkages. Patients in selected clusters who were identified after cluster selection occurred were also investigated. A study protocol was developed whereby CLIs were carried out in a stepwise fashion (Fig. 1 and Additional file 2-CLI Instruments):

Fig. 1 Steps for cluster investigation

  • Step 1: Public health worker interview

    Public health workers (public health supervisors, case managers, disease intervention specialists or contact investigators) for each clustered TB patient were contacted by study staff and asked whether they were aware of any epidemiologic links between the patients. If epidemiologic links were identified between two or more patients in the cluster, that information was documented. The public health workers could use any available documents as mental reminders during the interview.

  • Step 2: Contact investigation evaluation

    In coordination with local TB program staff, contact investigation records of patients in clusters were collected and reviewed to determine whether epidemiologic linkages to other TB patients were identified during the routine CI. For each patient evaluated, the number of contacts evaluated and the number of contacts with newly identified LTBI, previously diagnosed LTBI and active TB disease were documented.

    Public health worker interviews and contact investigation evaluations were carried out on each study patient except when no public health worker could be contacted or when contact investigations were not done. After each single epidemiologic link was established, investigations routinely continued to explore additional epidemiologic links between a patient and other patients in a given cluster.

  • Step 3: Review of public health records

    If no epidemiologic links were identified in Steps 1 and 2, TB patients’ public health records which contain documentation of any intake or follow-up patient interviews conducted by the health department were reviewed to determine whether there were documented epidemiologic links to other TB patient(s), and whether location-based relationships existed between patients in the same cluster (e.g. residential, social, or medical settings).

  • Step 4: CLI TB patient interviews

    If no epidemiologic links were identified between patients in a cluster from CLI Steps 1–3, patients were contacted and interviewed after verbal consent was obtained using a pilot-tested interview instrument (Additional file 2), beginning with the most recently diagnosed subject. The interview instrument was designed to facilitate identification of epidemiologic linkages to other TB patients. For every epidemiologic link identified, estimated dates of symptom onset, relationship between patients, the most frequent patient-pair setting where transmission may have occurred, and the CLI step where the link was identified were documented.

    Epidemiologic links were investigated only if both patient-isolates were genotyped. CLI study instruments (Additional file 2) contained items designed to collect details of the study patients’ frequently visited locations, which could be evaluated as possible venues for transmission.
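The stepwise logic of the four CLI steps can be sketched as follows. This is a simplified reading of the protocol, not the study's own tooling: each step is represented by a hypothetical callable that returns the list of linked patient identifiers it finds, Steps 1 and 2 are always attempted, and Steps 3 and 4 run only when the earlier steps found nothing.

```python
def run_cli(patient, step1, step2, step3, step4):
    """Run the CLI steps for one patient and return {step number: links found}.

    step1..step4 are hypothetical callables (PHW interview, CI record
    evaluation, public health records review, patient interview); each takes
    a patient identifier and returns a list of linked patient identifiers.
    """
    links = {1: step1(patient), 2: step2(patient)}  # both carried out for each patient
    if not links[1] and not links[2]:
        links[3] = step3(patient)                   # only if Steps 1-2 found no link
        if not links[3]:
            links[4] = step4(patient)               # only if Steps 1-3 found no link
    return links

# Toy example: nothing is found until the patient interview.
no_link = lambda patient: []
found_in_interview = run_cli("P1", no_link, no_link, no_link, lambda patient: ["P7"])
found_by_phw = run_cli("P2", lambda patient: ["P3"], no_link, no_link, no_link)
```

Recording the step number alongside each link is what allows the later comparison of links found in Step 4 with links found in Steps 1–3.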

Data management and analysis

Study data were entered into a Microsoft Access 2003 (Redmond, WA) database by site staff and merged for analysis by the data coordinating center at the Texas site. National summary data on study PCRTypes (number of patients and the number of states reporting the given genotype) were provided by the CDC. To summarize the characteristics of study clusters, patients in a given cluster were compared to all other study patients by select demographic and behavioral characteristics and two-sided P-values were calculated. Clusters associated with at least one epidemiologic link were compared to those without identified epidemiologic link by demographic, behavioral, clinical and genotypic variables.

Comparison between patients whose linkage was identified through the study’s CLI interviews (Step 4) and patients whose linkage was identified earlier in CLI (Steps 1–3) was conducted using univariate and multiple logistic regression. Statistical analyses were conducted using Stata/SE 13.1 (StataCorp LP, College Station, TX).
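For a single binary characteristic, a univariate logistic regression gives the same odds ratio as the cross-product of the corresponding 2×2 table, and its Wald interval matches the usual log-scale (Woolf) interval. The sketch below uses that shortcut in plain Python instead of the Stata models used in the study; the counts are invented purely to show the arithmetic and are not study data.

```python
import math

def odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95 % CI from a 2x2 table.

    a: characteristic present, link identified in Step 4
    b: characteristic present, link identified in Steps 1-3
    c: characteristic absent,  link identified in Step 4
    d: characteristic absent,  link identified in Steps 1-3
    """
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of the log odds ratio
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper

# Hypothetical counts, for illustration only.
estimate, lower, upper = odds_ratio(10, 20, 18, 140)
```

An interval that excludes 1 corresponds to a significant association at roughly the 5 % level; the multiple regression reported in Table 3 additionally adjusts for the other characteristics, which this shortcut does not.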


From 2006 to 2010, there were 62,642 reported TB cases in the US, reflecting a TB case rate of 4.1 cases/100,000 [23]. During the same time period, study site jurisdictions reported the following number of TB cases: MD-1239, MA-1208, GA-2291 and HOU-1315, corresponding to an average TB rate of 4.4, 3.7, 4.8 and 12.5 cases/100,000, respectively [23]. The proportion of Mtb culture-positive patients that were genotyped during the study period was 82.1 % for the US and 98.5, 89.9, 84.4 and 85.8 % for MD, MA, GA and HOU, respectively [19]. From a pool of 132 eligible clusters (MD-25, MA-23, GA-35, HOU-49), 44 clusters (11 clusters from each site) corresponding to 38 distinct PCRTypes were randomly selected for investigation. Three PCRTypes (PCR00002, PCR00016 and PCR00017) were investigated in more than one study jurisdiction (Table 1). Most of the PCRTypes were of Euro-American (L4) or East Asian (L2) lineage (n = 29 and n = 7, respectively), but one PCRType each was identified of East African Indian (L3) and Indo-Oceanic (L1) lineages. PCRTypes found in the study were also seen nationally with a distribution range from one to 46 states. Three PCRTypes were seen in no US state other than that associated with the study site during the study period: PCR06732 (GA), PCR04837 (TX) and PCR04846 (TX).

Table 1 Study genotypes

A total of 401 study patients in the 44 selected clusters were evaluated by the CLI method. Median cluster size was six (range 3–33); HOU clusters tended to be larger than those from other sites (median 10 vs. 6, p = 0.024). Nineteen clusters (43 %) had only US-born patients and eight clusters (18 %) contained only foreign-born patients (Table 2). Certain single epidemiologic profiles describing all patients in a given cluster were identified for specific clusters (Table 2, “Homogeneous attribute” column).

Table 2 Characteristics of study clusters

In 401 study patients, 189 (47 %) patients were identified with epidemiologic links in a total of 201 linked patient-pairs (Fig. 2), of which 132 (66 %) were definite linkage strength, 27 (13 %) were probable and 42 (21 %) were possible epidemiologic links. Screening by a PHW (Step 1) identified 105/401 (26.2 %) linked patients. Among 388 study patients having contact investigation (CI) records available, CI record review (Step 2) found 15 (3.9 %) linked patients. Patients without CI records (n = 63, 16.2 %) were associated with having only extrapulmonary TB manifestations, or being homeless or injection drug users (p < 0.05). A total of 3893 contacts were evaluated with a median of five contacts per patient. CI outcomes included 687 (17.6 %) individuals with LTBI, 286 (7.3 %) with prior LTBI, 81 (2.1 %) with active TB disease and the remaining 2839 having no TB history, but associated with the study patient. In reviewing the public health records (Step 3) of patients with no epidemiologic link found in Steps 1–2, 41/281 (14.6 %) linked patients were identified. CLI interviews (Step 4) were completed on 30 % (119/401) of patients with 28/119 (23.5 %) linked patients found (Fig. 2). Among patients who did not have CLI interviews, 27.0 % (76/282) had epidemiologic link(s) that had already been identified through Steps 1–3. The patients had decreased odds for CLI interviews if they were homeless (p < 0.001), male (p < 0.001) or age 65 or older and increased odds for CLI interviews if they were diagnosed after 2008 (p < 0.001) or were from the MD site (p = 0.047) (data not shown).
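The step-specific yields reported above can be re-derived directly from the stated numerators and denominators; the short check below uses only figures from the text (the variable names are ours).

```python
# (linked patients found, patients evaluated) for each CLI step, as reported.
step_yield = {
    "Step 1": (105, 401),
    "Step 2": (15, 388),
    "Step 3": (41, 281),
    "Step 4": (28, 119),
}

# Percentage yield per step, rounded to one decimal place.
percentages = {step: round(100 * found / evaluated, 1)
               for step, (found, evaluated) in step_yield.items()}

total_linked = sum(found for found, _ in step_yield.values())

# Share of all linked patients first identified by the CLI interview (Step 4).
step4_share = round(100 * step_yield["Step 4"][0] / total_linked, 1)
```

The four numerators sum to the 189 linked patients, and the 28 patients identified in Step 4 make up 14.8 % of them.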

Fig. 2 Patient enrollment and study procedures for cluster investigation

Among 201 linked patient-pairs, 188 (93.5 %) pairs had concordant PCRTypes and 13 (6.5 %) pairs had discordant PCRTypes (66 and 62 % with definite linkage strength, respectively; p = 0.7). All of the 13 linked patient-pairs with discordant PCRTypes had discordant MIRU12 patterns (median of three discordant loci), while seven also had discordant spoligotypes. These 13 genotypically discordant, but epidemiologic-linked, patient-pairs were excluded from further consideration because their high level of discordance suggested that the linked patient-pairs were not part of the same transmission chain. The 188 linked patient-pairs with concordant PCRTypes corresponded to only 179 of the 401 study patients (45 %) having epidemiologic links because 75 patients had more than one link identified.
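Because one patient can appear in several linked pairs, the number of linked patients is not simply twice the number of pairs. A minimal sketch of the counting, using made-up pair identifiers rather than study data:

```python
from collections import Counter

def summarize_pairs(pairs):
    """Return (distinct linked patients, patients appearing in more than one pair)."""
    appearances = Counter(patient for pair in pairs for patient in pair)
    multi_linked = sum(1 for n in appearances.values() if n > 1)
    return len(appearances), multi_linked

# Hypothetical pairs: P1 is linked to both P2 and P3.
pairs = [("P1", "P2"), ("P1", "P3"), ("P4", "P5")]
n_patients, n_multi = summarize_pairs(pairs)
```

Here three pairs involve five distinct patients, one of whom has more than one link; the same bookkeeping explains how 188 concordant pairs correspond to 179 patients.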

Specific transmission venues were identified for some clusters. Among 19 clusters with at least three pairs of epidemiologic-linked patients (Clusters 01, 02, 08, 10, 11, 12, 13, 14, 16, 20, 23, 24, 28, 29, 30, 31, 36, 39 and 42), 11 (57.9 %) had at least 50 % of their total epidemiologic links associated with a specific venue: four with homeless shelters (Clusters 01, 12, 23 and 36), three with drug houses (Clusters 16, 30 and 42), two with churches (Clusters 14 and 31), one with a bar (Cluster 39) and another with a social club (Cluster 24). Over 90 % of epidemiologic links identified for Clusters 01, 23 and 36 were associated with homeless shelter transmission venues. All epidemiologic links identified for Cluster 42 were associated with a drug house venue and seven of the eight (88 %) epidemiologic links identified for Cluster 24 were associated with a social club transmission venue. Seven (37 %) of the 19 clusters with at least three pairs of epidemiologic-linked patients were mainly (≥50 %) associated with household or non-household close social transmission venues (Clusters 02, 08, 10, 13, 20, 28, and 29). Among the 16 epidemiologic-linked pairs of the remaining cluster (Cluster 11), seven (44 %) were associated with homeless shelters and four (25 %) were associated with a church.
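The "at least 50 % of links at one venue" criterion used above can be sketched as a simple tally. The counts for Clusters 24 and 11 come from the text; the label "other" for the links whose venue is not stated, and the function name, are placeholders introduced for this example.

```python
from collections import Counter

def dominant_venue(venues, threshold=0.5):
    """Return (venue, share) if the most common venue reaches the threshold,
    otherwise (None, share of the most common venue)."""
    venue, count = Counter(venues).most_common(1)[0]
    share = count / len(venues)
    return (venue, share) if share >= threshold else (None, share)

# Cluster 24: seven of eight links at a social club.
cluster_24 = ["social club"] * 7 + ["other"]
# Cluster 11: 16 links, seven at homeless shelters and four at a church.
cluster_11 = ["homeless shelter"] * 7 + ["church"] * 4 + ["other"] * 5

venue_24 = dominant_venue(cluster_24)
venue_11 = dominant_venue(cluster_11)
```

Under this rule Cluster 24 is assigned to the social club, while Cluster 11 has no single venue reaching the 50 % threshold.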

There was substantial variability by cluster in terms of the proportion of patients with identified epidemiologic links (Table 2, “Epi-linked” column), ranging from 0 to 100 %. No epidemiologic links were identified for patients in 13/44 (30 %) clusters, despite having all four CLI steps completed on 36 % of the 77 patients in these clusters. The number of clusters having all black patients was significantly lower in the 13 clusters without epidemiologic links than in those with epidemiologic links [2 (15.4 %) versus 15 (48.4 %), p = 0.040]. No difference in the number of clusters having 100 % foreign-born patients was seen between the two groups (data not shown).

Twenty-five percent of epidemiologic links from HOU were identified through CLI. Meanwhile, epidemiologic links from MA and MD had lower odds of being identified by CLI TB patient interviews than linkages from other sites (Table 3; p = 0.004 and p = 0.036, respectively). All epidemiologic links with a household transmission setting and/or involving relatives were identified earlier than Step 4, while workplace and church transmission settings were associated with identification through CLI TB patient interviews in Step 4 (p = 0.032 and p = 0.046, respectively). Epidemiologic links involving a black TB patient had higher odds of being identified by early investigation steps (p = 0.036). Epidemiologic links including Asians or patients with extrapulmonary TB were associated with identification through CLI TB patient interviews in univariate analysis (p < 0.001 and p = 0.033, respectively); these associations became non-significant in multivariate results. Definite (strength) epidemiologic links had decreased odds for identification through interviews (p < 0.007) (Table 3). All epidemiologic links identified for clusters 14, 19, and 43 were identified by CLI TB patient interviews and over 50 % of links identified for clusters 31 and 39 were identified by CLI TB patient interviews (Data not shown).

Table 3 Characteristics associated with epidemiologic links being identified by CLI TB patient interviews (Step 4) versus earlier in CLI (Steps 1–3)


Contact investigation of individuals who had contact with TB patients is a cornerstone of public health TB control [1]. However, limitations of the concentric circle approach to contact investigations have been highlighted by reports of TB transmission not found through traditional contact investigation methods [11, 23–26]. In our study, a considerable number of additional linked patients (n = 28; 14.8 % of all identified patients with at least one link to another person in the cluster) found in the CLI interview were not identified through the previous CI (Fig. 2).

Molecular epidemiologic data suggest that routine contact investigations, targeting household, work, and school contacts, commonly miss other locations where infectious TB patients spend time and transmit disease, especially leisure or social settings [11, 17, 27, 28]. The CDC contact investigation guidelines [1] recommend collecting information on potential transmission settings during the patient CI interview. In the absence of named TB patient contacts, location-based information on possible transmission venues collected routinely during patient interviews can be useful in establishing relationships between genotypically linked TB patients [11].

By looking for homogeneity within a cluster using routinely collected surveillance data, we were able to generate characteristic profiles for many clusters. These cluster-specific epidemiologic profiles provided hints into potential transmission venue types for given clusters and provided insight into questions to ask, or locations to look for while seeking epidemiologic linkages during CLI steps.

CLI steps were prioritized to minimize resources required to uncover epidemiologic links by first asking health department staff who were directly involved in the TB patient’s care if they were aware of links to other patients (Step 1). When applied in a local health department context, existing knowledge of clusters or patient relationships is available through communication with a case manager, disease intervention specialist, or contact investigator (public health workers). Existing contact investigation records were then reviewed for documented links (Step 2). The next investigation step, entailing review and evaluation of public health records, added a more time-intensive and analytic component to investigations (Step 3). Finally, the most resource intensive step was patient re-interviews (Step 4). The analysis of the CLI step where epidemiologic links were determined (Table 3) demonstrated various scenarios where CLI interviews had added utility compared to earlier investigative steps. Higher odds of epidemiologic links were found in association with workplace, when patient-pairs resided in different zip codes within the same jurisdiction and Asian or African American patients (Table 3). Although we found 11 study participants having unknown epidemiologic links through contact investigation review (Step 2), we do not know how many contacts with active TB had epidemiologic links because a contact with active TB might be involved in more than one epidemiologic link.

Limitations to this study include the possibility of not including all patients in a potential genotype cluster given genotype coverage during the study period (especially for GA and HOU), the inability to locate and obtain consent from patients for re-interviews and exclusion of clinically defined and non-genotyped culture-positive patients with epidemiologic links to patients in study clusters. In addition, the infectious period of each patient was not considered. Although beyond the scope of this study, including non-genotyped patients may show a more complete picture of cluster transmission dynamics. Furthermore, NTGS transitioned from using spoligotype and 12-locus MIRU-VNTR (MIRU12) to spoligotype and 24-locus MIRU-VNTR (MIRU24) in 2009 to increase the discriminatory power of MIRU-VNTR [8, 14, 15]. Since this study was initiated in 2009 and included cases from previous years, the cluster definition and selection process had to be based on spoligotype and MIRU12. Given the varying numbers of epidemiologic links identified by different study sites, interview style (although standardized) may have influenced subjective and qualitative outcomes. Despite the variation of results seen between sites, one of our study outcomes was to provide additional high-risk TB contacts identified by CLIs. In resource-limited jurisdictions where local funding and resources may not be enough to launch cluster investigations, information on high-risk contacts that were missed by the initial CIs is still helpful for evaluation purposes and to help TB programs improve their conventional CI techniques. Lastly, recall bias could not be ruled out, especially in patients who were diagnosed with TB many years before their cluster investigation interview was conducted.

Public health departments need to develop strategies and focus resources to prioritize and investigate clusters that may be of public health concern. An initial step in these investigations should be to evaluate clusters using readily available data. Many data elements needed to investigate clusters in specific jurisdictions are now available to TB control personnel routinely and electronically through the Tuberculosis Genotyping Information Management System [19]. Additionally, the 2009 expansion of the RVCT includes up to two state case numbers for TB patients epidemiologic-linked to the reported patient [22], so health departments can easily assess clusters for epidemiologic links. If the transmission dynamics are poorly understood and the cluster continues to grow, additional resources should be devoted to the CLI, including abstracting public health records of clustered patients and interviewing the TB patients to find epidemiologic linkages between patients beyond those identified by the health department. As we found in this study, re-interviewing patients in a cluster (Step 4), especially when no epidemiologic links have been identified, can facilitate the identification of transmission venues and locations that are crucial in interrupting ongoing transmission and cluster growth. Further study on improving the interviewing methods may be needed to increase the detection rate of epidemiologic links in Mtb genotype clusters. In addition, CI record review (Step 2) found only 15 (3.9 %) linked patients, which points to a need for better tools and training for contact investigations, an essential component of TB control programs.

Despite the continuing decline in US TB rates leading to a decrease of funding for public health activities for TB control, the elimination goals established in 1989 [29] remain unmet. With the recent leveling rates of TB [30], an interruption of the Mtb transmission by implementing the expanded and efficient CIs and CLIs would be critical for the success of TB control and prevention programs in the US.


We validated a practical method to systematically identify tuberculosis epidemiologic links that can be integrated into routine TB control and prevention programs in public health settings. Re-interviewing patients in a cluster can identify additional epidemiologic links that were not found in the previous CLI steps. Improvement of the interview methods and effective contact investigation trainings may be needed as no epidemiologic links were identified in one-third of the Mtb genotype clusters.



CDC: Centers for Disease Control and Prevention

CI: Contact investigation

CLI: Cluster investigation

LTBI: Latent TB infection

MIRU-VNTR: Mycobacterial interspersed repetitive unit-variable number of tandem repeat

Mtb: Mycobacterium tuberculosis

NTGS: National TB Genotyping Service

PHW: Public Health Worker

RVCT: Report of verified case of tuberculosis


  1. National Tuberculosis Controllers Association, Centers for Disease Control and Prevention (CDC). Guidelines for the investigation of contacts of persons with infectious tuberculosis. Recommendations from the National Tuberculosis Controllers Association and CDC. MMWR Recomm Rep. 2005;54(RR-15):1–47.

  2. Mathema B, Kurepina NE, Bifani PJ, Kreiswirth BN. Molecular epidemiology of tuberculosis: current insights. Clin Microbiol Rev. 2006;19:658–85.

  3. Crawford JT. Genotyping in contact investigations: a CDC perspective. Int J Tuberc Lung Dis. 2003;7(12 Suppl 3):S453–7.

  4. Groenen PMA, Bunschoten AE, van Soolingen D, van Embden JDA. Nature of DNA polymorphism in the direct repeat cluster of Mycobacterium tuberculosis; application for strain differentiation by a novel typing method. Mol Microbiol. 1993;10:1057–65.

  5. Kamerbeek J, Schouls L, Kolk A, van Agterveld M, van Soolingen D, Kuijper S, et al. Simultaneous detection and strain differentiation of Mycobacterium tuberculosis for diagnosis and epidemiology. J Clin Microbiol. 1997;35:907–14.

  6. Supply P, Allix C, Lesjean S, Cardoso-Oelemann M, Rüsch-Gerdes S, Willery E, et al. Proposal for standardization of optimized mycobacterial interspersed repetitive unit-variable-number tandem repeat typing of Mycobacterium tuberculosis. J Clin Microbiol. 2006;44:4498–510.

  7. National TB Controllers Association/CDC Advisory Group on Tuberculosis Genotyping. Guide to the application of genotyping to tuberculosis prevention and control. Atlanta: US Department of Health and Human Services, CDC; 2004. Accessed 05 June 2015.

  8. Kato-Maeda M, Metcalfe JZ, Flores L. Genotyping of Mycobacterium tuberculosis: application in epidemiologic studies. Future Microbiol. 2011;6:203–16.

  9. Moonan PK, Ghosh S, Oeltmann JE, Kammerer JS, Cowan LS, Navin TR. Using genotyping and geospatial scanning to estimate recent Mycobacterium tuberculosis transmission, United States. Emerg Infect Dis. 2012;18:458–65.

  10. Malamadze N, González IM, Oemig T, Isiadinso I, Rembert D, McCauley MM, et al. Unsuspected recent transmission of tuberculosis among high-risk groups: implications of universal tuberculosis genotyping in its detection. Clin Infect Dis. 2005;40:366–73.

  11. Cronin WA, Golub JE, Lathan MJ, Mukasa LN, Hooper N, Razeq JH, et al. Molecular epidemiology of tuberculosis in a low- to moderate-incidence state: are contact investigations enough? Emerg Infect Dis. 2002;8:1271–9.

  12. Daley CL, Kawamura LM. The role of molecular epidemiology in contact investigations: a US perspective. Int J Tuberc Lung Dis. 2003;7(12 Suppl 3):S458–62.

  13. Bennett DE, Onorato IM, Ellis BA, Crawford JT, Schable B, Byers R, et al. DNA fingerprinting of Mycobacterium tuberculosis isolates from epidemiologically linked case pairs. Emerg Infect Dis. 2002;8:1224–9.

  14. Oelemann MC, Diel R, Vatin V, Haas W, Rüsch-Gerdes S, Locht C, et al. Assessment of an optimized mycobacterial interspersed repetitive-unit-variable-number tandem-repeat typing system combined with spoligotyping for population-based molecular epidemiology studies of tuberculosis. J Clin Microbiol. 2007;45:691–7.

  15. Allix-Béguec C, Harmsen D, Weniger T, Supply P, Niemann S. Evaluation and strategy for use of MIRU-VNTRplus, a multifunctional database for online analysis of genotyping data and phylogenetic identification of Mycobacterium tuberculosis complex isolates. J Clin Microbiol. 2008;46:2692–9.

  16. Teeter LD, Ha NP, Ma X, Wenger J, Cronin WA, Musser JM, et al. Evaluation of large genotypic Mycobacterium tuberculosis clusters: contributions from remote and recent transmission. Tuberculosis (Edinb). 2013;93(Suppl):S38–46.

  17. Klovdahl AS, Graviss EA, Yaganehdoost A, Ross MW, Wanger A, Adams GJ, et al. Networks and tuberculosis: an undetected community outbreak involving public places. Soc Sci Med. 2001;52:681–94.

  18. Katz D, Albalak R, Wing JS, Combs V, Tuberculosis Epidemiologic Studies Consortium. Setting the agenda: a new model for collaborative tuberculosis epidemiologic research. Tuberculosis (Edinb). 2007;87:1–6.

  19. Ghosh S, Moonan PK, Cowan L, Grant J, Kammerer S, Navin TR. Tuberculosis genotyping information management system: enhancing tuberculosis surveillance in the United States. Infect Genet Evol. 2012;12:782–8.

  20. Lindquist S, Allen S, Field K, Ghosh S, Haddad MB, Narita M, Oren E. Prioritizing tuberculosis clusters by genotype for public health action, Washington, USA. Emerg Infect Dis. 2013;19:493–5.

  21. Kammerer JS, et al. Using statistical methods and genotyping to detect tuberculosis outbreaks. Int J Health Geogr. 2013;12:15.

  22. Centers for Disease Control and Prevention. CDC tuberculosis surveillance data training. Report of Verified Case of Tuberculosis (RVCT) instruction manual. 2009. Accessed 05 June 2015.

  23. US Department of Health and Human Services, Centers for Disease Control and Prevention. Online Tuberculosis Information System (OTIS). Accessed 05 June 2015.

  24. Weis SE, Pogoda JM, Yang Z, Cave MD, Wallace C, Kelley M, et al. Transmission dynamics of tuberculosis in Tarrant county, Texas. Am J Respir Crit Care Med. 2002;166:36–42.

  25. Mitruka K, Blake H, Ricks P, Miramontes R, Bamrah S, Chee C, et al. A tuberculosis outbreak fueled by cross-border travel and illicit substances: Nevada and Arizona. Public Health Rep. 2014;129:78–85.

  26. Bloss E, Newbill K, Peto H, Rice MJ, Ainsworth G, Travnicek R, et al. Challenges and opportunities in a tuberculosis outbreak investigation in southern Mississippi, 2005–2007. South Med J. 2011;104:731–5.

  27. Yaganehdoost A, Graviss EA, Ross MW, Adams GJ, Ramaswamy S, Wanger A, et al. Complex transmission dynamics of clonally related virulent Mycobacterium tuberculosis associated with barhopping by predominantly human immunodeficiency virus-positive gay men. J Infect Dis. 1999;180:1245–51.

  28. Kammerer JS, McNabb SJ, Becerra JE, Rosenblum L, Shang N, Iademarco MF, et al. Tuberculosis transmission in nontraditional settings: a decision-tree approach. Am J Prev Med. 2005;28:201–7.

  29. Dowdle WR, Centers for Disease Control (CDC). A strategic plan for the elimination of tuberculosis in the United States. MMWR. 1989;38 Suppl 3:1–25.

  30. World TB Day — March 24, 2016. MMWR Morb Mortal Wkly Rep 2016;65:273. doi:


Acknowledgements

The authors thank the Tuberculosis Epidemiologic Studies Consortium study staff and collaborators at the Georgia, Maryland, Massachusetts and Texas sites, as well as CDC collaborators.


Funding

This project was funded by the Division of TB Elimination, National Center for HIV/AIDS, Viral Hepatitis, STD, and TB Prevention, Centers for Disease Control and Prevention, through a task order announced and managed by the Tuberculosis Epidemiologic Studies Consortium (TBESC). The task order is designated as Task Order 26.

Availability of data and materials

The datasets analyzed during the current study are owned by the US Centers for Disease Control and Prevention (CDC) and are therefore not publicly available. The data are, however, available from the authors upon reasonable request and with the permission of the CDC.

Authors’ contributions

All authors have read and approved the final version of the manuscript. LDT: concept of study, study design, acquisition of data, data analysis and writing/revising the manuscript; PV: acquisition of data, data analysis and writing/revising the manuscript; DTN: revising the manuscript; JT: acquisition of data and writing/revising the manuscript; SS: acquisition of data and writing/revising the manuscript; SG: concept of study, data analysis and writing/revising the manuscript; SK: concept of study, study design, acquisition of data, data analysis and writing/revising the manuscript; RM: concept of study, study design, data analysis and writing/revising the manuscript; WAC: concept of study, study design, acquisition of data, data analysis and writing/revising the manuscript; EAG: concept of study, study design, acquisition of data, data analysis and writing/revising the manuscript.

Competing interests

The authors declare that they have no competing interests.

Consent for publication

Not applicable.

Ethics approval and consent to participate

The study was approved by the Institutional Review Boards (IRBs) at CDC and at each study site. IRB approvals at the non-Houston sites permitted verbal consent from subjects invited to be interviewed, and consent was documented on a patient tracking form. Written informed consent was obtained from all subjects in Houston, Texas, as required by the Texas Department of State Health Services IRB (Additional file 1 - Methods technical details).


Disclaimer

The findings and conclusions in this manuscript are those of the authors and do not necessarily represent the official position of the Centers for Disease Control and Prevention or the US Department of Health and Human Services.

Author information


Corresponding author

Correspondence to Edward A. Graviss.

Additional files

Additional file 1:

Methods section technical details. (DOCX 18 kb)

Additional file 2:

Cluster Investigation Instruments. (DOC 117 kb)

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article

Teeter, L.D., Vempaty, P., Nguyen, D.T.M. et al. Validation of genotype cluster investigations for Mycobacterium tuberculosis: application results for 44 clusters from four heterogeneous United States jurisdictions. BMC Infect Dis 16, 594 (2016).
