Variation in Mycobacterium tuberculosis population structure in Iran: a systematic review and meta-analysis

Background Acquiring comprehensive insight into the dynamics of Mycobacterium tuberculosis (Mtb) population structure is an essential step toward adopting effective tuberculosis (TB) control strategies and improving therapeutic methods and vaccines. Accordingly, we performed this systematic review and meta-analysis to determine the overall prevalence of Mtb genotypes/sublineages in Iran.
Methods We carried out a comprehensive literature search using the international databases MEDLINE and Scopus as well as Iranian databases. Articles published until April 2020 were selected based on the PRISMA flow diagram. The overall prevalence of the Mtb genotypes/sublineages in Iran was determined using the random effects or fixed effect model. The metafor R package and MedCalc software were employed for performing this meta-analysis.
Results We identified 34 studies for inclusion in this study, containing 8329 clinical samples. Based on the pooled prevalence of the Mtb genotypes, NEW1 (21.94, 95% CI: 16.41–28.05%), CAS (19.21, 95% CI: 14.95–23.86%), EAI (12.95, 95% CI: 7.58–19.47%), and T (12.16, 95% CI: 9.18–15.50%) were characterized as the dominant circulating genotypes in Iran. West African (L5/6), Cameroon, TUR and H37Rv were identified as the genotypes with the lowest prevalence in Iran (< 2%). The highest pooled prevalence rates of multidrug-resistant strains were related to Beijing (2.52, 95% CI) and CAS (1.21, 95% CI).
Conclusions This systematic review showed that Mtb populations are genetically diverse in Iran, and further studies are needed to gain better insight into the national diversity of Mtb populations and their drug resistance patterns.

Different studies have shown that genomic differences among MTBC lineages or sublineages can affect the clinical and epidemiological characteristics of TB infection [5][6][7][8]. In recent decades, some Mycobacterium tuberculosis (Mtb) lineages/sublineages have attracted wide attention due to features such as transmission potential, pathogenic properties and association with drug resistance [9,10]. Lineages 2 and 4 are widely distributed and seem to have a higher pathogenic power than geographically restricted lineages [2,11,12]. In West and South Asia, a sharp increase has been documented in the circulation of certain sublineages, such as NEW-1 (Lineage 4) and CAS (Lineage 3) strains, that are prone to emerging as resistant clones [13][14][15]. This increase seems to be more important in Iran, which has a national average TB rate of 14 cases per 100,000 population, owing to the influx of Afghan refugees and population growth [1]. Accordingly, acquiring comprehensive insight into the dynamics of Mtb population structure is an essential step toward adopting effective TB control strategies and improving therapeutic methods and vaccines. Therefore, the current systematic review and meta-analysis was conducted to determine (1) the overall prevalence of Mtb genotypes/sublineages and (2) the dominant multidrug-resistant (MDR) Mtb genotypes in TB patients in Iran.

Study protocol
The meta-analysis followed the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines [16]. The study protocol was registered in the PROSPERO database (CRD42020186561).

Search strategy and selection criteria
To evaluate the diversity of Mtb isolates in Iran, a comprehensive literature search was conducted using the international electronic databases MEDLINE and Scopus as well as Iranian databases. English-language studies published until April 2020 were retrieved using the following keywords: "Mycobacterium tuberculosis", "tuberculosis", "molecular typing", "genetic diversity", "genotyping" and "Iran" combined with the Boolean operators "OR", "AND" and "NOT" in the Title/Abstract/Keywords field. Additional keywords such as "lineage" combined with "Mycobacterium tuberculosis" were used to avoid missing any articles. Similar strategies using Persian keywords were used to find relevant Persian original articles in Iranian databases, such as Scientific Information Database (www.sid.ir), Irandoc (www.irandoc.ac.ir), Magiran (www.magiran.com), and Iranmedex (www.iranmedex.com).
The titles and abstracts of all the identified articles were reviewed for eligibility, and the full texts of potentially relevant articles were then screened.
The inclusion criteria were: 1) studies reporting the prevalence of Mtb genotypes among TB patients, 2) studies presenting data from Iran irrespective of the publication year, and 3) studies using spoligotyping, mycobacterial interspersed repetitive unit-variable number tandem repeat (MIRU-VNTR) typing or whole-genome sequencing (WGS) for genotyping. The exclusion criteria were: 1) studies only presenting prevalence data on Mtb genotypes among drug-resistant Mtb isolates, 2) studies providing incomplete data, 3) studies published as meta-analyses and systematic reviews, 4) studies not in English or Persian, 5) studies limited to a single genotype, 6) studies that lacked genotyping data, and 7) studies that were not related to human TB molecular epidemiology. Data screening was performed by two reviewers independently.

Data extraction and quality assessment
Data were extracted from the studies meeting our inclusion criteria. The following items were recorded: first author's name, year of publication, study area, molecular techniques, genotype, number of genotypes, total sample size, MDR genotype, sample type and nationality.
We evaluated the methodological quality of the included studies according to the items defined in the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) checklist, using the pre-defined criteria presented in Table 1. This checklist covers various methodological aspects. The maximum quality score was 32, and articles with scores below 18 were excluded from this study [51]. Data extraction and quality assessment were also carried out by two reviewers independently.

Statistical analysis
Pooled proportions and 95% CIs were used to assess the prevalence of the genotypes in the pulmonary tuberculosis (PTB) and extrapulmonary tuberculosis (EPTB) samples. A generalized linear mixed model (random intercept logistic regression) was used to estimate the pooled prevalence [52]. The heterogeneity of prevalence between the included studies was tested with Cochran's Q test and quantified with the I² index [53]. The Clopper-Pearson method was used to calculate the proportion and its confidence interval in the individual studies. A continuity correction of 0.5 was applied in studies with zero cell frequencies [54]. The pooled proportion, as an overall prevalence of the genotypes, was derived by the random effects model because of significant heterogeneity between the individual studies. Publication bias was tested by Egger's linear regression test and Begg's test (P < 0.05 was set as the significance level for publication bias) [55]. All the statistical analyses were performed using the metafor R package and MedCalc software.

Table 1 Characteristics of 34 included studies in this meta-analysis
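
The per-study interval, the continuity correction, and the heterogeneity statistics described in the statistical analysis can be illustrated with a minimal Python sketch. It is not the authors' analysis: the study fitted a generalized linear mixed model with the metafor R package, whereas this sketch substitutes the simpler DerSimonian-Laird inverse-variance estimator on logit-transformed proportions. The function names (`clopper_pearson`, `logit_and_variance`, `pool_random_effects`) and the example counts are hypothetical and chosen only for illustration.

```python
import math


def clopper_pearson(events, total, alpha=0.05):
    """Exact (Clopper-Pearson) binomial confidence interval for one study.

    Solved by bisection on the binomial tail probabilities, so only the
    standard library is needed. Very large totals may overflow floats.
    """
    def pmf(k, p):
        return math.comb(total, k) * p ** k * (1.0 - p) ** (total - k)

    def upper_tail(p):  # P(X >= events), increasing in p
        return sum(pmf(k, p) for k in range(events, total + 1))

    def lower_tail(p):  # P(X <= events), decreasing in p
        return sum(pmf(k, p) for k in range(0, events + 1))

    def bisect(f, target, increasing):
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2.0
            if (f(mid) < target) == increasing:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2.0

    lower = 0.0 if events == 0 else bisect(upper_tail, alpha / 2.0, True)
    upper = 1.0 if events == total else bisect(lower_tail, alpha / 2.0, False)
    return lower, upper


def logit_and_variance(events, total, cc=0.5):
    """Logit of a proportion and its approximate variance.

    The continuity correction `cc` is added only when a cell count is zero.
    """
    non_events = total - events
    if events == 0 or non_events == 0:
        events, non_events = events + cc, non_events + cc
    return math.log(events / non_events), 1.0 / events + 1.0 / non_events


def pool_random_effects(studies):
    """Pool (events, total) pairs with the DerSimonian-Laird estimator.

    Assumption: this stands in for the GLMM used in the paper.
    Returns the pooled prevalence, its 95% CI, Cochran's Q, I2 (%) and tau2.
    """
    transformed = [logit_and_variance(e, n) for e, n in studies]
    ys = [y for y, _ in transformed]
    vs = [v for _, v in transformed]
    w = [1.0 / v for v in vs]
    fixed = sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, ys))  # Cochran's Q
    df = len(studies) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-study variance
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in vs]
    mu = sum(wi * yi for wi, yi in zip(w_re, ys)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    def expit(x):
        return 1.0 / (1.0 + math.exp(-x))

    return {
        "prevalence": expit(mu),
        "ci": (expit(mu - 1.96 * se), expit(mu + 1.96 * se)),
        "Q": q,
        "I2": i2,
        "tau2": tau2,
    }


if __name__ == "__main__":
    # Invented counts (genotype isolates, total isolates); not data from the review.
    example = [(12, 80), (30, 150), (5, 60), (0, 25)]
    print(pool_random_effects(example))
    print(clopper_pearson(12, 80))
```

Pooling on the logit scale keeps the back-transformed estimate and its interval inside 0 to 1. The GLMM approach used in the paper avoids the normal approximation for within-study variances, so its estimates may differ from this sketch, particularly when genotype counts are small.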

Search results and studies' characteristics
A total of 316 articles were identified by the primary search strategy, of which 34 articles met the eligibility criteria and were included in this study (Fig. 1). [...] West African (L5/6), Cameroon, TUR and H37Rv were identified as genotypes with the lowest prevalence in Iran (< 2%). The forest plots of some of the genotypes (i.e., Beijing, CAS, and EAI) are shown in Fig. 2. In addition, the highest pooled prevalence of MDR strains was found in the Beijing (2.52, 95% CI) and CAS (1.21, 95% CI) genotypes (Table 2).

Publication bias
We observed significant heterogeneity across the studies based on the I² index, with a few exceptions (Table 2). However, publication bias was not significant based on the results of Egger's linear regression test and Begg's test.
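
As a rough illustration of the Egger's linear regression test named above, the following Python sketch regresses each study's standardized effect (effect divided by its standard error) on its precision (one over the standard error) and reports the intercept, whose departure from zero suggests funnel-plot asymmetry. The function name `egger_regression` and the input values are hypothetical. The p-value is omitted because the standard library has no t distribution; the returned t statistic would be compared against a t distribution with k - 2 degrees of freedom.

```python
import math


def egger_regression(effects, std_errors):
    """Ordinary least squares fit of effect/SE on 1/SE (Egger's regression).

    Returns a dict with the intercept, slope, standard error of the
    intercept, t statistic for the intercept, and degrees of freedom.
    """
    k = len(effects)
    if k < 3 or k != len(std_errors):
        raise ValueError("need at least three studies with matching inputs")
    z = [y / s for y, s in zip(effects, std_errors)]  # standardized effects
    x = [1.0 / s for s in std_errors]                 # precisions
    mean_x = sum(x) / k
    mean_z = sum(z) / k
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must not all be equal")
    slope = sum((xi - mean_x) * (zi - mean_z) for xi, zi in zip(x, z)) / sxx
    intercept = mean_z - slope * mean_x
    rss = sum((zi - intercept - slope * xi) ** 2 for xi, zi in zip(x, z))
    residual_var = rss / (k - 2)
    se_intercept = math.sqrt(residual_var * (1.0 / k + mean_x ** 2 / sxx))
    # A perfect fit leaves no residual variance, so t is undefined there.
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return {
        "intercept": intercept,
        "slope": slope,
        "se_intercept": se_intercept,
        "t": t_stat,
        "df": k - 2,
    }


if __name__ == "__main__":
    # Invented logit prevalences and standard errors; not data from the review.
    result = egger_regression([0.1, 0.3, 0.6, 1.0], [0.05, 0.1, 0.2, 0.4])
    print(result)
```

Begg's test, which the study also applied, is a rank correlation between effect sizes and their variances and is not covered by this sketch.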

Discussion
Based on the pooled data, all MTBC lineages except lineages 7 and 8 were found in Iran, which reflects the high diversity of MTBC strains. The phylogeographical population structure of the MTBC stems from the interplay between different factors such as human migration, geography, genetic drift and host-pathogen interaction [4,5,56]. Iran is the main host country for Afghan refugees, but the main factor contributing to the formation of MTBC lineage phylogeography in Iran has not been identified. A study of global variation in MTBC strains showed that the prevalence of lineage 2, 3 and 4 strains may be increasing in West Asia, while the prevalence of lineage 1 is declining [15]. The summary of Mtb strain diversity in Iran, based on families/sublineages, showed that NEW1 (L4) [...] [11]. Movement of strains with people from these regions may explain the presence of these genotypes in Iran. In addition, the emergence of CAS as one of the prevalent Mtb subpopulations in Iran may reflect the pathogenic properties of this genotype. In a recent study, the global proportion of MDR in the CAS population was estimated at 30.63% [57]. In our study, based on the pooled prevalence of MDR genotypes, CAS (1.21%) was one of the dominant genotypes. This finding reflects the need for better understanding and monitoring of this subpopulation.
Despite the global dissemination of the Beijing genotype as a prototype of lineage 2, it had a low prevalence in our geographical region. However, the highest pooled [...] [58]. The low prevalence of the Beijing genotype compared to other genotypes in Iran may be explained by the prevalent Beijing sublineage, which affects its pathobiological properties and epidemiological dynamics. Further studies are warranted to identify the distribution pattern of the Beijing sublineages in Iran, which can improve the management of their infection. The dominance of NEW1 as a specialist sublineage of the Euro-American lineage (L4) in Iran was not unexpected. Some evidence has shown that Iran is the probable origin of this family/sublineage, which may reflect ecological adaptation in this subpopulation [59]. It is noteworthy that the NEW1 genotype is prone to MDR [13]. The pooled prevalence of MDR in NEW1 was 0.8% (95% CI). However, the overall MDR estimates may be less representative of the target population, because in some of the included studies drug susceptibility testing results were not reported by genotype, which may lead to variation in the final results. Other sublineages of lineage 4, such as T, Haarlem, Uganda and S, were also observed in varying proportions. This distribution pattern of lineage 4 subtypes in Iran may be explained by the effect of human migration and by the genetic and phenotypic characteristics of each sublineage.
In addition, we observed that the lineage 5/6 subtype had the lowest prevalence in our geographical region. Because these strains are geographically restricted [2], we can only speculate that human migration is the determinant of this distribution. A limitation of this study is that most of the included studies were conducted in Tehran (the capital of Iran). Thus, our findings may not be completely representative of the overall prevalence of different Mtb populations in Iran. In addition, most of the included studies were based on spoligotyping and MIRU-VNTR typing, whereas WGS provides superior resolution compared with these PCR-based genotyping methods for identifying diversity in Mtb strains.

Conclusions
In summary, this systematic review showed that Mtb populations are genetically diverse in Iran and that the NEW1 (L4) and West African (L5/6) genotypes had the highest and lowest pooled prevalence rates, respectively. This type of evidence can contribute to better clinical and epidemiological management of Mtb infections. Further in-depth studies are also needed to gain deeper insight into the national diversity of Mtb populations and their drug resistance patterns.