We used spoligotyping to analyze M. tuberculosis strains isolated in different parts of Venezuela. Although the collection is best described as a convenience sample, without proportionate geographic or temporal distribution, it likely reflects the prevalence of strains in the entire country, because the spoligotypes are generally the same as those found in a previous study that used proportional nationwide sampling to determine the prevalence of drug resistance. Approximately 6500 cases of tuberculosis are reported per year in Venezuela, so for the ten years during which strains were collected, the sample represents only ~2% of registered cases, but 5% for Caracas and a larger percentage for Valencia and Carabobo state, of which it is the capital.
As in other studies in South America [18, 22, 23], the LAM family predominates, accounting for 53% of all strains, and 60% if, as suggested by the SNP results, SIT 605 is also considered LAM. Although 300 different spoligotype patterns were found, only six accounted for almost 50% of the isolates, and these patterns were the most common in nearly all the regions sampled. In contrast, 201 patterns were found in only one isolate, and 173 (57.7%) of the 300 patterns were not present in SpolDB4. Nine of the 127 patterns that are represented in SpolDB4 have been found exclusively in Venezuela, and a tenth, SIT 605, has been isolated outside of Venezuela only in New York, in two immigrants from the neighboring country of Colombia. SIT 605 is perhaps the most surprising finding in this study. The other common spoligotypes, such as SIT 17 and SIT 93, contain a number of distinct genotypes when analyzed by 24-locus MIRU-VNTR. SIT 605, in contrast, appears to represent a large clonal cluster that is geographically centered around the city of Valencia and the state of Carabobo, but encompasses almost all the SIT 605 strains examined, including the two isolated in New York. The previous study of the molecular epidemiology of M. tuberculosis strains in Venezuela found several isolates with SIT 605, but because that study was based on spoligotyping without geographic data, the focal and clonal nature of this genotype could not have been detected. We propose that the SIT 605 strains be termed the "Carabobo" genotype.
Phylogeographic analyses of spoligotypes have identified a number of regionally originating patterns or clades whose isolates have similar IS6110 RFLP patterns and appear to represent strains with a common lineage. The most notorious example is SIT 1, or Beijing, which presumably originated in Asia and has spread widely. Other examples are the SIT 33/F11/LAM clade first described in South Africa, the SIT 60/F15/LAM4 KZN clade associated with XDR-TB in KwaZulu-Natal, South Africa, the SIT 61/LAM10 Cameroon clade, SIT 59/LAM11/ZWE/MERU, SIT 21/CAS1-Kili, SIT 26/CAS1-Delhi, SIT 41/LAM7-Tur, and SIT 19/Manila/EAI2 [10, 27]. These clades are not homogeneous, but include several related MIRU-VNTR and IS6110 RFLP patterns, some of which may comprise more localized clonal clusters. However, the fact that these spoligotype families have a wide distribution, and that some are associated with MDR outbreaks, suggests that they have selective advantages over other strains in their ability to cause disease and be transmitted. SIT 17, and perhaps SIT 93, might similarly be considered successful clades, as they are responsible for 18.6% and 9.9%, respectively, of tuberculosis cases in Venezuela, and have been found in several other countries.
There are also reports of spoligotype patterns that are geographically limited and appear to be clonal by MIRU/RFLP analyses, such as the SIT 2643 Haarlem3 strain in Paraguay, but the number of isolates of these clonal genotypes is generally small. An examination of SpolDB4 shows several SITs with more than 10 isolates limited to one country, or with a single additional isolate in a separate country, perhaps from an emigrant (for example, SITs 210, 339, 1258, 1329, 1457, 1518, and 1898). Similarly, in Additional file 2 there are, besides SIT 605, other spoligotype patterns that could represent clonal strains because they are found only in particular regions of Venezuela, for example the putative LAM2 strain in position 18 (Figure 3). Several of these are currently being analyzed by 24-locus MIRU-VNTR. It is nevertheless striking that the SIT 605/SIT 1698 Carabobo genotype is the most common spoligotype in a state with a population greater than 2 million, accounting for 15% of all isolates, while it is relatively uncommon in the other regions we sampled. Of only nine isolates obtained from the state of Aragua, which borders Carabobo, one was SIT 605, so it is possible that this strain is also common in adjacent areas. As the spoligotype for the Carabobo strain evolved with the loss of spacers 31–32 and 37–40, other elements of the genome could have independently evolved to confer an unknown selective advantage over other local M. tuberculosis genotypes, but unlike SIT 17, there were no epidemiologic correlations suggesting higher virulence. At least six SIT 605 patients were current or former inmates of a particular prison in Carabobo state, and it appears that other SIT 605 patients had contact with people who had been incarcerated in that prison, so this institution could play a role in the local dissemination of the genotype, as described for other prison-associated clonal outbreaks [22, 28, 29].
Although at least one SIT 605 isolate was found to be multidrug-resistant, the great majority of SIT 605 isolates were pan-sensitive. Our study found very few Beijing isolates, some of them lethal MDR strains (National Tuberculosis Program, unpublished data), but fortunately they do not appear to have been transmitted extensively within the Venezuelan population. At least one of the Beijing strains was isolated from a Peruvian national.
The comparisons of epidemiologic and clinical parameters revealed that in the indigenous Warao population in Delta Amacuro, tuberculosis occurs at younger ages (mean of 32.6 years) than in the rest of the population (mean of 38–39 years), and unlike the 71% male predominance in the cities, the male:female distribution is nearly equal (54% male). This pattern is consistent with very active spread within this isolated Amerindian community, which has the highest TB incidence in Venezuela and among the lowest life expectancies. The data from the largely Amerindian tuberculosis population in the Amazonas states suggest a similar, but less marked, trend, with an average age of 36.8 years and 64% of patients male. More controversial are the findings suggesting that SIT 17 may be more actively transmitted, because patients with isolates belonging to the SIT 17 cluster tended to be younger and were more likely to have AFB-positive specimens. Low patient age is considered characteristic of strains being actively transmitted. In contrast, SIT 53 may be less virulent, because patients with this spoligotype tended to be older and may have more AFB-negative sputa. However, the AFB trend did not reach statistical significance, and the older age, while statistically significant, was based on only 21 SIT 53 patients and needs to be confirmed in subsequent studies.
These associations raise a number of questions. If SIT 53 were really less virulent, why is it still the sixth most common spoligotype, causing 4% (53) of cases? Could it have been a very common strain in the past that is now more prone to latency and reactivation than to person-to-person transmission, and will its prevalence decrease over time? SIT 93 is the second most common spoligotype, found in 10% of all isolates, but why is it not also associated with clinical parameters suggesting virulence and transmission? Moreover, preliminary MIRU-VNTR analysis suggests that all of the most common spoligotype clusters, except for SIT 605, contain several different strain genotypes. Therefore, for the epidemiologic associations with SIT 17 or SIT 53 to make sense, it must be assumed that the strains with these spoligotypes derived from a common ancestor with genetic characteristics that were maintained even as the MIRU-VNTR patterns evolved, as seems true for globally dispersed lineages such as Beijing. Finally, it must be recalled that clinical data were not available for many of the patients whose isolates comprised this study, so although all data were collected before spoligotyping results were known, the epidemiologic associations with SIT 17 could be subject to biases due to regional differences in patient admission or data recording. We attempted to reduce this possibility through regional stratification, and found that the associations for SIT 17 persisted, although they did not always reach statistical significance. If subsequent studies with more complete data confirm that the SIT 17 or SIT 53 spoligotypes, or the SIT 605 Carabobo genotype, are associated with particular clinical disease characteristics, the challenge will be to identify the molecular basis for these apparent differences.