Introduction

Tuberculosis (TB) is an ancient disease that continues to be a public health challenge1. Globally, 30 high TB-burden countries accounted for 86–90% of the estimated global incidence. Ethiopia is listed as one of the 30 countries with the highest prevalence of TB and human immunodeficiency virus (HIV) co-infected TB patients2. In 2022, the National TB Program of Ethiopia reported a total of 156,000 TB cases, with an estimated incidence rate of 126 cases per 100,000 population2.

TB is caused by nine lineages of Mycobacterium tuberculosis (Mtb), of which lineages 1, 2, 3, and 4 have contributed significantly to the global TB epidemic3,4. Phylogeographic research shows that the geographic distribution of the Mtb lineages varies4. Lineages 1 and 3 have been recorded primarily in Asia and East Africa, while lineages 2 and 4 occur worldwide. The remaining lineages (5, 6, 7, 8, and 9) are restricted to Africa4,5,6,7. In addition, the Mtb lineages vary in transmission success and disease phenotype5. Compared with ancient lineages, "modern" lineages such as 2, 3, and 4 are spreading more successfully8,9,10. Sublineage analysis of lineage 4 also identifies specific sublineages with varying geographic distributions8. A similar variation has been seen in lineage 3 clades6.

Ethiopia continues to have a high burden of both TB and TB/HIV, despite an annual decline in TB incidence since 199011. Information about the molecular epidemiology of Mtb would support the efforts of the National TB Program. This study uses Mtb isolates obtained from the nationwide Drug Resistance Survey (DRS) to evaluate the population structure and spatial distribution of Mtb spoligotypes and lineages. We also assess hotspot areas for the lineages.

Materials and methods

Study setting and participant enrollment

Ethiopia is a country in East Africa bordered by six countries: Sudan, Eritrea, Djibouti, Somalia, Kenya, and South Sudan12. Administratively, the country is divided into four levels: regions, zones, woredas (districts), and kebeles (wards). The present study used Mtb isolates from the DRS, which were collected from all regions. The DRS was carried out at 32 health facilities between November 2011 and June 2013 and targeted smear-positive TB patients. A total of 1785 smear-positive TB patients were included. Ninety-seven percent of the isolates (n = 1735) were available for spoligotyping.

Spoligotyping

All available isolates underwent spoligotyping. Spoligotyping was performed according to the manufacturer's instructions using a commercially available kit (Qiagen and Sigma)13,14. Each run included positive controls (H37Rv and M. bovis) and a negative control (water). Data were entered twice, and any inconsistent readings were cross-checked. An international shared type (SIT) was assigned using the SITVIT2/MIRU-VNTRplus or SpolLineages databases. Lineages were extracted from SpolLineages15. Isolates sharing the same spoligotype pattern were defined as "cluster" spoligotypes16.
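The cluster definition used here can be illustrated with a minimal sketch. It assumes each spoligotype is stored as a 43-character binary string (1 = spacer present, 0 = spacer absent); the function names and the toy isolates are hypothetical and are not part of the study's workflow or data.

```python
from collections import Counter


def cluster_spoligotypes(patterns):
    """Group isolates by identical spoligotype pattern.

    patterns: dict mapping isolate ID -> 43-character binary string.
    Returns (clusters, unique): clusters maps every pattern shared by two
    or more isolates to its cluster size; unique lists unshared patterns.
    """
    for isolate, pattern in patterns.items():
        if len(pattern) != 43 or set(pattern) - {"0", "1"}:
            raise ValueError(f"invalid spoligotype for {isolate!r}")
    counts = Counter(patterns.values())
    clusters = {p: n for p, n in counts.items() if n >= 2}
    unique = [p for p, n in counts.items() if n == 1]
    return clusters, unique


def clustered_proportion(patterns):
    """Fraction of isolates whose pattern is shared with another isolate."""
    clusters, _ = cluster_spoligotypes(patterns)
    return sum(clusters.values()) / len(patterns)


# Hypothetical toy isolates, not study data.
example = {
    "iso1": "1" * 43,
    "iso2": "1" * 43,
    "iso3": "1" * 20 + "0" * 4 + "1" * 19,
    "iso4": "0" * 3 + "1" * 40,
}
clusters, unique = cluster_spoligotypes(example)
```

In this toy set, two of the four isolates share one pattern and form a single cluster, while the other two patterns are unique.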

Spatial analysis

The geocodes of the health facilities were obtained from the Centers for Disease Control and Prevention of Ethiopia. We used ArcGIS (version 10.8) for the spatial analysis, treating each health facility as a single unit. The Global Moran’s Index was applied to assess the spatial distribution of lineages, and the Getis-Ord Gi statistic was used to identify hotspots. A fixed distance band with the default threshold distance, based on Euclidean distance, was used for the spatial analysis.
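As an illustration of the Global Moran's Index with a fixed distance band, the following sketch computes the statistic from facility coordinates and a per-facility value such as a lineage proportion. It is not the ArcGIS implementation: binary weights without row standardization, planar (projected) coordinates, and all names and toy values are assumptions made for the example.

```python
import math


def distance_band_weights(points, threshold):
    """Binary weights: w[i][j] = 1 when facilities i and j (i != j)
    lie within the Euclidean threshold distance, otherwise 0."""
    n = len(points)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(points[i], points[j]) <= threshold:
                w[i][j] = 1.0
    return w


def global_morans_i(values, w):
    """Global Moran's I = (n / S0) * sum_ij(w_ij * d_i * d_j) / sum_i(d_i^2),
    where d_i is the deviation of value i from the mean and S0 = sum of weights."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in w)
    denominator = sum(d * d for d in dev)
    if s0 == 0 or denominator == 0:
        raise ValueError("no neighbours within the band or constant values")
    numerator = sum(w[i][j] * dev[i] * dev[j]
                    for i in range(n) for j in range(n))
    return (n / s0) * (numerator / denominator)


# Hypothetical facilities on a line, one distance unit apart.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
w = distance_band_weights(points, threshold=1.0)
clustered = global_morans_i([1, 1, 0, 0], w)    # similar values adjacent
alternating = global_morans_i([1, 0, 1, 0], w)  # dissimilar values adjacent
```

Positive values indicate that similar values lie near each other, negative values indicate dispersion, and the expected value under spatial randomness is -1/(n - 1). Significance testing (the z-score and p-value) is omitted from this sketch.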

Statistical analysis

Data were captured and analyzed using SPSS v20 (IBM SPSS Statistics 20). The distribution of lineages and predominant spoligotypes (n > 20)9 by drug resistance profile was evaluated using Fisher’s exact test. Bivariate and multivariable logistic regression models were used to assess the association of the prevalent lineages (lineages 3 and 4, 96.5%) with demographic, clinical, drug resistance, and location profiles. Variables with a p-value ≤ 0.2 in the bivariate analysis were entered into the multivariable logistic regression models. For the logistic regression analysis, the region whose proportions of lineages 3 and 4 were closest to the national average was selected as the reference. A p-value of < 0.05 in Fisher’s exact test and multivariable logistic regression was considered statistically significant.

Ethical approval

This study obtained ethical approval from the Ethiopian Public Health Institute (SERO-59-5-2016) and the Addis Ababa University Ethics Committee Institutional Review Board (SF/MCMB/702/08/2016). We used stored isolates. No personal identifiers were collected; the DRS enrolled patients using unique survey identification numbers.

Results

Study isolates

A total of 91% (1579/1735) of the isolates, including 68 MDR/RR and 58 INH-resistant Mtb isolates, showed interpretable spoligotype patterns, and each one came from a unique person. Both spoligotype and drug susceptibility results were available for 1402 isolates. The isolates were classified into 283 spoligotype patterns, and SITs were retrieved for 88% (1393/1579) of the isolates. Ninety percent of the isolates (1423/1579) were grouped into 127 clustered spoligotype patterns with cluster sizes of 2–247 (Supplementary Table S1).
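The reported counts can be cross-checked with simple arithmetic. The sketch below uses only the numbers stated in this paragraph and assumes that every non-clustered isolate carries a pattern seen only once; the variable names are illustrative.

```python
# Counts as reported in the text above.
isolates_typed = 1735
interpretable = 1579
sit_assigned = 1393
clustered_isolates = 1423
clustered_patterns = 127
total_patterns = 283


def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


interpretable_pct = pct(interpretable, isolates_typed)   # 91.0
sit_pct = pct(sit_assigned, interpretable)               # 88.2
clustered_pct = pct(clustered_isolates, interpretable)   # 90.1

# Assumption: each non-clustered isolate has a pattern seen only once.
singleton_patterns = interpretable - clustered_isolates  # 156
implied_total_patterns = clustered_patterns + singleton_patterns
```

Under that assumption, the 127 clustered patterns plus 156 singleton patterns give the 283 patterns reported.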

Lineages and spoligotypes identified

The distribution of the lineages and predominant spoligotypes (n > 20) in MDR/RR and INH-resistant TB is displayed in Table 1. The five most prevalent spoligotypes (SIT149, SIT53, SIT25, SIT37, and SIT26) accounted for 42.8% of all isolates under investigation. Furthermore, SIT149, SIT53, and SIT21 accounted for 57.3% of MDR/RR and 51.7% of INH-resistant TB. The percentage of MDR/RR and INH-resistant TB varied significantly among the predominant spoligotypes. Lineage information was retrieved for 94.6% (1494/1579) of the spoligotypes (10). M. bovis made up 0.13% of the lineages. The Mtb spoligotypes were grouped into four lineages: 1 (1.8%), 3 (25.9%), 4 (70.6%), and 7 (1.6%). The majority of the MDR/RR and INH-resistant TB belonged to lineage 4, followed by lineage 3. The percentage of MDR/RR and INH-resistant TB did not differ significantly among lineages (Table 1).

Table 1 Distribution of the lineages and predominant spoligotypes in rifampicin- and/or isoniazid-resistant tuberculosis.

Spatial distribution of Mtb lineages

The proportions of the lineages varied across regions (Table 2). Dire Dawa was used as the reference since its proportions of lineage 3 (28.6%) and lineage 4 (71.4%) were closest to the national averages. The bivariate analysis showed that, relative to lineage 4, lineage 3 was less likely to be found in Oromia but more likely to be found in Gambella, Southern Nations Nationalities and Peoples Region (SNNPR), and Tigray. Only Gambella, SNNPR, and Tigray, however, displayed significant differences on multivariable analysis (Table 3). Accordingly, lineage 3 was significantly more frequent than lineage 4 in TB patients from Gambella (AOR = 4.37, P < 0.001) and Tigray (AOR = 3.44, P = 0.001), and lineage 4 was significantly more frequent than lineage 3 in patients from SNNPR (AOR = 1.97, P = 0.026).

Table 2 Proportion of the four lineages by regions.
Table 3 Logistic regression analysis of lineage 3 compared with lineage 4 by demographic, clinical variable, drug resistance and regions.
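To make the reference-region choice and the odds ratios easier to follow, here is a minimal sketch. It computes crude (unadjusted) odds ratios with Wald 95% confidence intervals, which is a simpler stand-in for the adjusted odds ratios that come from the multivariable model. The region labels, counts, and function names are hypothetical and are not study data.

```python
import math


def pick_reference(counts):
    """Return the region whose lineage 3 share (among lineage 3 + 4
    isolates) is closest to the pooled share across all regions.

    counts: dict mapping region -> (lineage 3 count, lineage 4 count).
    """
    total3 = sum(l3 for l3, _ in counts.values())
    total4 = sum(l4 for _, l4 in counts.values())
    pooled = total3 / (total3 + total4)
    return min(counts,
               key=lambda r: abs(counts[r][0] / sum(counts[r]) - pooled))


def crude_odds_ratio(l3_region, l4_region, l3_ref, l4_ref, z=1.96):
    """Crude odds ratio of lineage 3 versus lineage 4 in a region relative
    to the reference region, with a Wald confidence interval."""
    if min(l3_region, l4_region, l3_ref, l4_ref) <= 0:
        raise ValueError("all four counts must be positive")
    odds_ratio = (l3_region * l4_ref) / (l4_region * l3_ref)
    se = math.sqrt(1 / l3_region + 1 / l4_region + 1 / l3_ref + 1 / l4_ref)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical counts, not study data: (lineage 3, lineage 4) per region.
counts = {"Region A": (30, 70), "Region B": (60, 40), "Region C": (10, 90)}
reference = pick_reference(counts)
odds_ratio, lower, upper = crude_odds_ratio(*counts["Region B"],
                                            *counts[reference])
```

An odds ratio above 1 with a confidence interval that excludes 1 indicates that lineage 3 is overrepresented relative to lineage 4 in that region compared with the reference.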

Hotspot analysis of Mtb lineages

The Global Moran's I test revealed statistically significant spatial autocorrelation in the distribution of lineage 7 (Table 4). The hotspot analysis is displayed in Fig. 1. Lineage 4 showed hotspots in southern Ethiopia, although these did not reach the 95% confidence level. Lineage 1 hotspots were identified in eastern Ethiopia, while lineage 7 hotspots were identified in the northern and western parts of Ethiopia.

Table 4 Global spatial autocorrelation results of Mtb lineages.
Figure 1 Geographical distribution of hot and cold spots of Mycobacterium tuberculosis lineages with confidence interval.
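The way hot and cold spots receive confidence levels can be sketched as follows. The example assumes the Gi* form of the Getis-Ord statistic (each facility counts as its own neighbour), binary fixed-distance-band weights, and planar coordinates, and it bins z-scores at the usual 90%, 95%, and 99% cut-offs. It is a simplified stand-in rather than the ArcGIS implementation, and all names and values are hypothetical.

```python
import math


def getis_ord_gi_star(values, points, threshold):
    """Return one Gi* z-score per facility using binary distance-band weights."""
    n = len(values)
    mean = sum(values) / n
    variance = max(0.0, sum(v * v for v in values) / n - mean ** 2)
    s = math.sqrt(variance)
    if s == 0:
        raise ValueError("values are constant; Gi* is undefined")
    z_scores = []
    for i in range(n):
        # Weight 1 for every facility within the band, including i itself.
        w = [1.0 if math.dist(points[i], points[j]) <= threshold else 0.0
             for j in range(n)]
        sum_w = sum(w)
        sum_w2 = sum(x * x for x in w)
        numerator = sum(wj * v for wj, v in zip(w, values)) - mean * sum_w
        denominator = s * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
        z_scores.append(numerator / denominator if denominator > 0 else 0.0)
    return z_scores


def confidence_bin(z):
    """+/-3 = 99%, +/-2 = 95%, +/-1 = 90% confidence, 0 = not significant.
    Positive bins are hot spots; negative bins are cold spots."""
    for cutoff, level in ((2.58, 3), (1.96, 2), (1.65, 1)):
        if abs(z) >= cutoff:
            return level if z > 0 else -level
    return 0


# Hypothetical facilities on a line with high values at one end.
points = [(float(x), 0.0) for x in range(5)]
z = getis_ord_gi_star([10, 10, 0, 0, 0], points, threshold=1.0)
bins = [confidence_bin(score) for score in z]
```

A facility is flagged as a hot spot when it and its neighbours within the distance band have jointly higher values than expected from the overall mean; a hot spot below the 95% level corresponds to a bin of 1 in this scheme.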

Demographic factors associated with dominant lineages

The bivariate analysis of the prominent lineages (lineages 3 and 4) showed that, compared to lineage 4, lineage 3 was more likely to be associated with male, HIV-positive, and retreatment TB cases (Table 3). Multivariable analysis, however, revealed that only male sex exhibited a significant association. Compared with female patients, male TB patients had higher odds of having lineage 3 than lineage 4 (AOR = 1.37, P = 0.016).

Discussion

This study reports on the population structure and spatial distribution of Mtb using isolates collected for the Drug Resistance Survey. Our study showed that four of the nine lineages were circulating in Ethiopia, with lineages 3 and 4 being the major lineages, including among MDR/RR and INH-resistant TB. Furthermore, the three predominant spoligotypes (SIT149, SIT53, SIT21) account for a noteworthy proportion of MDR/RR (57.8%) and INH-resistant (52%) TB, whilst the five predominant spoligotypes (SIT149, SIT53, SIT25, SIT37, and SIT26) account for a substantial portion of overall TB cases (42.8%). The high share of these spoligotypes suggests their possible clonal expansion in the country. Furthermore, the spatial analysis reveals that the distribution of the lineages varies by region, which is essential knowledge for improving collaborative planning of TB program activities between the regions of Ethiopia and among neighboring countries.

Examining Mtb strain diversity and distribution allows for a better understanding of TB transmission dynamics and the identification of highly transmissible genotypes6,7,8,9. Certain lineages and spoligotypes have been more prevalent in particular locations; for instance, T2/Uganda II17, T3ETH18, and EAI2-Manila19 have been reported in Uganda, Ethiopia, and the Philippines, respectively. Furthermore, 42–55% of the spoligotypes reported can be attributed to 4–5 predominant spoligotypes18,20,21,22. In line with prior studies, our results indicated that 42.8% of the isolates investigated belonged to the five dominant SITs. Additionally, we found that the three predominant spoligotypes accounted for a major share of MDR/RR (57.8%) and INH-resistant (52%) TB isolates. Similarly, 60.8% (n = 146/240) of MDR cases in India and 58.2% (n = 100/134) in Zambia were from the three predominant spoligotypes21,22. This might be the outcome of the competitive fitness of the strains and host–pathogen co-evolution, which increase the likelihood that local strains will spread among patient groups within the same country5,23.

The population structure of the Mtb lineages is unique to each country, and the distribution of lineages within a country also varies4,7,17,25. In the present study, the proportions of lineages 3 and 4 varied among regions: lineage 3 was significantly overrepresented relative to lineage 4 in Gambella and Tigray, whereas lineage 4 was significantly overrepresented relative to lineage 3 in SNNPR. Our results are consistent with reports of spatially varied TB lineages within a country. In South Africa, the two prominent lineages, lineages 1 and 4, show spatial heterogeneity across provinces25. In Uganda, the Uganda II family is found primarily in the south-west zone compared with other zones17. Furthermore, our data also reflect the dominant lineages in the neighboring countries, such as Sudan (lineage 3), which borders Gambella, and Kenya (lineage 4), which borders the SNNPR26,27.

Lineage 7 is one of the geographically restricted lineages and is reported almost exclusively in Ethiopia18,28. We identified lineage 7 hotspots in the northern and western parts of Ethiopia. Prior studies report that lineage 7 is more common in north Ethiopia (13–15.6%) than in other parts of the country (< 0.6%)19,28,29,30, which is consistent with our findings. However, we found that lineage 7 accounted for 0–1.9% of isolates in the west of Ethiopia even though, to the best of our knowledge, lineage 7 has not previously been documented in this region; this suggests a need for further research. In addition to lineage 7 being reported almost only in Ethiopia, a recent study revealed that its reduced protein abundance may contribute to its slower growth and less virulent phenotype10. This might restrict the transmission of lineage 7 to certain locations, even though the lineage is known for its host–pathogen co-evolution in Ethiopian TB patients. Our findings also indicated a hotspot for lineage 1 in eastern Ethiopia. Previous studies from the eastern part of Ethiopia show that lineage 1 made up 4.7–8.4% of the Mtb population, whereas a multicenter analysis found that lineage 1 made up 1.1%18,29,31. Furthermore, studies employing sizable data sets revealed that lineage 1/EAI was significantly more common in Somalia (33.63%), which borders the eastern part of Ethiopia32. It is possible that local and cross-border transmission in the eastern region of Ethiopia accounts for the increased occurrence of lineage 1.

Gender disparity has been reported in the overall TB burden: the majority (55%) of the global burden, as well as more than 50% of TB in Ethiopia, occurs in males2. An experimental study showed that males acquire Mtb infection far earlier than females owing to differences in B cell follicle growth between the sexes33. Our results showed that, relative to lineage 4, lineage 3 was more common in male than in female TB patients (AOR = 1.37), which may be partially attributed to male susceptibility. Although lineage 4 was the predominant lineage in our findings, no gender disparity was indicated for it. The variation may be due to additional underlying patient risk factors, which calls for more carefully controlled research.

This study has limitations. We used spoligotyping to describe the proportion of predominant spoligotypes, which may overestimate the number of strains in the same category because spoligotyping has limited discriminatory power. Although we included health facilities from every region, the samples collected from each region are not representative of the entire region.

Conclusion

This study reported the population structure of Mtb using samples from all regions of Ethiopia. Our study showed that the Mtb population comprises lineages 1, 3, 4, and 7, with lineages 3 and 4 accounting for the majority of cases of INH-resistant TB, MDR/RR TB, and overall TB. The five predominant spoligotypes account for a significant portion of all TB cases (42.8%), while the three predominant spoligotypes are responsible for a notable percentage of MDR/RR (57.8%) and INH-resistant (52%) TB. This might result from a relative transmissibility advantage of those spoligotypes, which would influence drug-resistant TB as well as the total TB burden. The lineage variation by region and the similarity with neighboring countries suggest both local and cross-border spread of TB, which is likely to be the result of the bacterial genetic background of the lineages and/or human movement.