Transmission and drug-resistance of tuberculosis in Luodian revealed by whole genome sequencing based molecular epidemiology study

Mei Liu, Affiliated Hospital of Medical University College, https://orcid.org/0000-0003-2435-8384

Peng Xu, Zunyi Medical University

Xingwei Liao, Hospital of Luodian County

Qing Li, Affiliated Hospital of Zunyi Medical University

Wei Chen, Guizhou Center for Disease Control and Prevention

Qian Gao, Fudan University School of Basic Medical Sciences

Nana Li, Affiliated Hospital of Zunyi Medical University

Tao Luo ([email protected]), Sichuan University West China School of Basic Medical Sciences and Forensic Medicine

Ling Chen ([email protected]), Affiliated Hospital of Zunyi Medical University

Research Article

Keywords:

Posted Date: May 6th, 2020

DOI: https://doi.org/10.21203/rs.3.rs-26012/v1

License: This work is licensed under a Creative Commons Attribution 4.0 International License.

Abstract

BACKGROUND

Tuberculosis (TB), caused by Mycobacterium tuberculosis (MTB), remains a severe public health problem globally. Guizhou has the fourth highest reported rate of pulmonary TB in China. Understanding the current status of the TB epidemic and distinguishing disease caused by recent infection from disease caused by remote infection are key to formulating effective prevention and control strategies. However, such data are limited in Guizhou. In this study, we aimed to investigate the transmission and drug-resistance profiles of TB in Luodian, a resource-limited area with the highest TB incidence in Guizhou, China.

METHODS

From 22 May 2018 to 21 April 2019, individuals with a positive MTB culture were enrolled, and all of them completed a standardized interview. Whole genome sequencing was performed on the MTB isolates. The prevalence of MTB genotypes, the genomic cluster rate, and drug-resistance-conferring mutations were analyzed based on the sequencing data.

RESULTS

A total of 107 cases were enrolled, of which 64.5% were male, and the median age of the patients was 51 years (interquartile range, 40–65 years). Of the patients, 84% were new cases and 16% were retreated cases. All but three cases came from nine towns, and 55.1% of cases were from Longping and Bianyang. The phylogenetic tree showed that 53.3% of strains were Lineage 2 (Beijing genotype) and 46.7% were Lineage 4 (Euro-American genotype). Among Lineage 2 strains, 66.7% were modern Beijing. Seven clusters with genomic distances within 12 single nucleotide variants (SNVs) were identified. The clusters included 14 strains, giving a cluster rate of 13.1%. The distance between clustered cases ranged from 2.1 to 71 kilometers (km), with a median of 14 km (interquartile range, 2.8–38 km). The cases in two of the clusters came from the same town. Based on gene mutations associated with drug resistance, we predicted that 4.8% of strains were resistant to isoniazid, 3.7% to rifampicin, and 3.7% to streptomycin, and only one strain (0.9%) was multidrug resistant (MDR).

CONCLUSIONS

The study found high transmission and a low rate of drug resistance in Luodian, and sublineages of the modern Beijing branch have recently expanded there. This work may also serve as a genomic baseline for studying the evolution and spread of MTB in Guizhou.

Introduction

Tuberculosis (TB), caused by Mycobacterium tuberculosis (MTB), remains a severe global public health problem and is the leading cause of death from a single pathogen. In 2018, 10 million people fell ill with TB worldwide, and China had the second largest number of new TB cases after India, accounting for 9% of the global burden [1]. According to the fifth national tuberculosis epidemiological survey, incidence varied geographically across China, and the western region had the highest prevalence of pulmonary TB [2]. Guizhou province is located in the southwest of China and has a low level of socioeconomic development. Guizhou has the fourth highest reported rate of pulmonary TB in China, estimated at 102.5 per 100,000 population (the national average was 55.6 per 100,000) [3]. Understanding the current status of the tuberculosis epidemic in a region and distinguishing tuberculosis caused by recent infection from that caused by remote infection are the basis for formulating effective prevention and control strategies [4]. However, such data are limited in Guizhou province.

Genotyping of MTB has served as an essential tool for investigating key issues in TB epidemiology, for example to assess recent transmission, identify risk factors associated with recent transmission, and detect outbreaks. Recently, whole genome sequencing (WGS) has come to be considered the ultimate tool for genotyping MTB strains because it provides about 90% of the genomic information [5, 6]. Compared with conventional genotyping methods, it offers higher resolution and is phylogenetically informative when identifying outbreaks [7] and tracing transmission [6].

To estimate the recent transmission of pulmonary TB (PTB) in Guizhou, we conducted a retrospective study of pulmonary tuberculosis in Luodian county, which has a reported rate of 271.1 per 100,000 and is the region with the most severe tuberculosis epidemic in Guizhou [8].

Materials and Methods

Setting and study population

Luodian county is located in the south of Guizhou province and has a population of around 353,000. A passive case-finding strategy was used to identify symptomatic suspected pulmonary TB cases according to the guidelines of the Chinese National Center for Disease Control and Prevention (CDC) [9]. All suspected TB cases were referred to the county's only comprehensive TB-designated hospital and were asked to provide three sputum samples (morning, night, and spot) before starting treatment. All samples were examined by smear microscopy with Ziehl-Neelsen staining, and smear-positive samples were then cultured on Löwenstein-Jensen medium. From 22 May 2018 to 21 April 2019, individuals with a positive sputum culture were enrolled, and all of them completed a standardized interview.

Whole genome sequencing

Genomic DNA was extracted with the cetyltrimethylammonium bromide (CTAB)-lysozyme method [10] and sent for whole genome sequencing. A 300 base pair (bp) paired-end library was constructed for each DNA sample. Sequencing was carried out on the Illumina HiSeq 2000 with 100 or 115 cycles, with an expected coverage of 100-fold. To analyze single nucleotide polymorphisms (SNPs), we mapped reads to the H37Rv reference genome (GenBank accession: AL123456.3) with Bowtie using non-gapped alignments and identified SNPs as previously described [11]. All homozygous and heterozygous calls, and SNPs with coverage higher than three, were called with SAMtools/BCFtools. SNPs in PE/PPE genes and in drug-resistance-associated genes were excluded. Using an in-house Perl script, the SNP lists for individual strains were combined into a single nonredundant list, and the corresponding base calls were recovered for each strain. The concatenated sequences of the strains were used to generate a maximum likelihood (ML) phylogeny with MEGA5 [12]. To test the reproducibility of WGS and data processing, two strains were sequenced and analyzed twice (Figure 2). A genomic cluster was defined as strains with a genomic difference of 12 SNPs or fewer, and strains differing from all others by more than 12 SNPs were considered unique [13].
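The cluster definition above can be illustrated with a minimal Python sketch. It is not the authors' in-house pipeline: the function names, the toy sequences, and the use of single-linkage grouping (a strain joins a cluster when it is within the threshold of at least one member) are assumptions made for illustration, and ambiguous or missing base calls are not handled.

```python
from itertools import combinations

SNP_THRESHOLD = 12  # cluster definition used in this study


def snp_distance(seq_a, seq_b):
    """Count positions at which two equal-length concatenated SNP sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def genomic_clusters(sequences, threshold=SNP_THRESHOLD):
    """Group strains by single linkage (an assumption) using a union-find structure.

    Returns only groups with two or more strains; all other strains are unique.
    """
    parent = {name: name for name in sequences}

    def find(x):
        # Follow parent links to the root, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(sequences, 2):
        if snp_distance(sequences[a], sequences[b]) <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in sequences:
        groups.setdefault(find(name), set()).add(name)
    return [members for members in groups.values() if len(members) > 1]


# Toy data (hypothetical): S1 and S2 differ at 3 sites, S3 differs from both by 20 or more.
toy = {
    "S1": "A" * 40,
    "S2": "C" * 3 + "A" * 37,
    "S3": "A" * 20 + "G" * 20,
}
print(genomic_clusters(toy))
```

With pairwise clusters, as found in this study, single linkage and stricter linkage rules give the same result; the choice only matters when a cluster has three or more strains.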

Drug resistance was predicted for rifampicin (rpoB RRDR), isoniazid (katG, inhA, and its promoter), streptomycin (rpsL and gidB), ethambutol (embA, embB, and embC), pyrazinamide (pncA and rpsA), fluoroquinolones (gyrA and gyrB), and second-line injectable drugs (rrs, eis promoter, and tlyA), based on mutations reported to be highly associated with drug resistance (http://www.tbdreamdb.com) [14].
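A minimal sketch of this kind of mutation-based prediction is shown below. The lookup table is an illustrative assumption that contains only the high-confidence mutations named in the Results of this paper; a real analysis would load a full curated catalogue, and the function names are hypothetical.

```python
# Illustrative catalogue (assumption): only the mutations reported in this paper.
RESISTANCE_MUTATIONS = {
    ("katG", "Ser315Thr"): "isoniazid",
    ("katG", "Ser315Asn"): "isoniazid",
    ("rpoB", "Ser450Leu"): "rifampicin",
    ("rpsL", "Lys43Arg"): "streptomycin",
    ("rpsL", "Lys88Arg"): "streptomycin",
}


def predict_resistance(mutations):
    """Return the set of drugs a strain is predicted to resist.

    `mutations` is an iterable of (gene, amino-acid change) tuples;
    mutations that are not in the catalogue are ignored.
    """
    return {RESISTANCE_MUTATIONS[m] for m in mutations if m in RESISTANCE_MUTATIONS}


def is_mdr(drugs):
    """MDR is defined as resistance to at least isoniazid and rifampicin."""
    return {"isoniazid", "rifampicin"} <= set(drugs)


# Hypothetical strain carrying two catalogued mutations and one uncatalogued one.
strain = [("katG", "Ser315Thr"), ("rpoB", "Ser450Leu"), ("gyrA", "Glu21Gln")]
profile = predict_resistance(strain)
print(sorted(profile), "MDR" if is_mdr(profile) else "not MDR")
```

Keeping the catalogue as data rather than code makes it straightforward to swap in a larger mutation list without touching the prediction logic.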

Data collection and analysis

Patients’ demographic and clinical characteristics were collected from the national TB registration system and the Luodian hospital clinical system. Data were analyzed using R (version 3.2.1). Fisher’s exact test was used to assess factors possibly associated with genomic clustering and genotype. P < 0.05 was considered statistically significant.
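For readers who want to reproduce this kind of test outside R, the following is a minimal Python sketch of a two-sided Fisher's exact test for a 2x2 table, written with the standard library only. It is an illustration, not the authors' analysis script; the example counts are the gender-by-clustering counts from Table 1 (clustered: 6 male, 8 female; unique: 63 male, 30 female).

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # The small relative tolerance guards against floating-point ties.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Rows: clustered, unique; columns: male, female (counts from Table 1).
p_gender = fisher_exact_two_sided(6, 8, 63, 30)
print(f"Fisher's exact test, gender vs. clustering: p = {p_gender:.3f}")
```

For these counts the p-value is above 0.05, in line with the statement in the Results that no factor was significantly associated with clustering.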

Results

Demographic characteristics of PTB cases

From 22 May 2018 to 21 April 2019, 116 MTB clinical strains were isolated and 112 were successfully sequenced. After excluding five cases because of low data quality, 107 cases were enrolled (Figure 1). Of the enrolled patients, 64.5% (69/107) were male, and the median age was 51 years (interquartile range, 40-65 years). Almost all patients were local residents; one was from a neighboring county, and detailed information was missing for two. Two of the 107 cases were high school students, 97 were farmers, and occupation was not recorded for eight. Among the 106 patients with a recorded TB history, 84.0% (89/106) were new cases and 16.0% (17/106) were retreated cases. All local cases came from nine areas, and Longping and Bianyang together accounted for 55.1% of cases (Table 1).

Basic description of phylogeny tree and clustering by WGS

To describe the genetic structure of these strains, we constructed a maximum likelihood phylogenetic tree based on the SNP differences between each pair of the 107 strains (Figure 2). Of the strains, 53.3% (57/107) were Lineage 2 (Beijing genotype) and 46.7% (50/107) were Lineage 4 (Euro-American genotype). Among Lineage 2 strains, 66.7% (38/57) were modern Beijing (Figure 2).

In tuberculosis molecular epidemiology, the cluster rate is used as an indicator of recent transmission; here we calculated the genomic cluster rate. Using the cluster definition of a genetic distance of 12 SNPs or fewer, we identified seven clusters. The clusters included 14 strains, giving a cluster rate of 13.1% (14/107). The genetic difference between strains in each cluster ranged from 0 to 11 SNPs: four pairs differed by fewer than 5 SNPs, and three pairs differed by 8, 9, and 11 SNPs, respectively.
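The cluster-rate arithmetic can be checked with a short Python sketch. The function name is hypothetical; the inputs are the figures reported above (seven clusters, which must all be pairs because they hold 14 strains in total, out of 107 strains).

```python
def cluster_rate(cluster_sizes, total_strains):
    """Proportion of all strains that belong to any genomic cluster."""
    if any(size < 2 for size in cluster_sizes):
        raise ValueError("a cluster must contain at least two strains")
    return sum(cluster_sizes) / total_strains


# Seven clusters holding 14 strains in total, so each cluster is a pair.
sizes = [2] * 7
rate = cluster_rate(sizes, 107)
print(f"{sum(sizes)} clustered strains, cluster rate = {rate:.1%}")
# prints "14 clustered strains, cluster rate = 13.1%"
```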

The distance between clustered cases ranged from 2.1 to 71 kilometers, with a median of 14 kilometers (interquartile range, 2.8-38 kilometers). The cases in two of the clusters came from the same town (Figure 1). To further understand the factors related to genomic clustering, we analyzed the relationship between clustering and bacteriologic, demographic, clinical, and geographic factors. However, none of the associations was statistically significant (Supplementary file).
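The text does not state how the distances between clustered cases were measured. One common option, sketched here in Python as an assumption, is the great-circle (haversine) distance between residence coordinates. The coordinates below are hypothetical placeholders, not study data.

```python
from math import asin, cos, radians, sin, sqrt
from statistics import median

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    h = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


# Hypothetical residence coordinates (lat, lon) for the two cases in each cluster.
cluster_pairs = [
    ((25.43, 106.75), (25.45, 106.76)),
    ((25.43, 106.75), (25.60, 106.90)),
    ((25.30, 106.60), (25.70, 107.00)),
]
distances = [haversine_km(*a, *b) for a, b in cluster_pairs]
print(f"median distance between clustered cases: {median(distances):.1f} km")
```

Straight-line distance ignores roads and terrain, so in a mountainous county it should be read as a lower bound on travel distance.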

To determine whether the genotypes of strains in Luodian differed from those of strains isolated in other areas, we constructed a phylogenetic tree based on WGS data from our strains together with strains collected from other parts of China [15]. Almost all Luodian strains were interspersed with strains from other regions. However, within the modern Beijing branch, there were three sublineages containing 22 strains exclusively from Luodian. In one sublineage, the genetic distances between strains were very small, suggesting a potential expansion in the Luodian area in recent decades (Figure 3).

Drug-resistance related gene mutation

China is one of the countries with a high burden of drug-resistant TB. In most counties of Guizhou province, including Luodian, TB therapy relies mainly on empirical treatment because clinical resources are limited. Understanding the regional level of drug resistance is therefore important for clinical treatment. To assess the rate of TB drug resistance in Luodian, we examined the genetic drug-resistance profile based on the WGS data. After checking 16 drug-resistance-related genes, we identified 39 strains carrying 54 mutations in seven resistance-conferring genes (Supplementary file). However, only 13 mutations in ten strains were highly associated with drug resistance: Ser315Thr (n=4) and Ser315Asn (n=1) in katG, Ser450Leu (n=4) in rpoB, and Lys43Arg (n=1) and Lys88Arg (n=3) in rpsL. Based on these mutations, the drug-resistance profiles of all strains were predicted as follows: 4.8% (5/107) were resistant to isoniazid, 3.7% (4/107) to rifampicin, and 3.7% (4/107) to streptomycin, but only 0.9% (1/107) were multidrug resistant (MDR) (Figure 2). We further checked the genotypes of these ten drug-resistant strains: four belonged to Lineage 4, four to ancient Beijing, and two to modern Beijing.

Discussion

Our results showed that MTB strains isolated from Luodian were less likely to belong to the Beijing family (53.3%) than those from central, southern, or eastern China (66.0% in Hunan and Fujian [16], 84.1% in Shenzhen [17], 79.4% in Shandong [18]). The prevalence of the Beijing family was similar to that in neighboring provinces (53.9% in Sichuan [19], 55.3% in Guangxi [20], 59.6% in Yunnan [21]) and in the neighboring city of Zunyi (48.0%) in Guizhou province [22]. Our results also revealed that among Beijing strains, the modern Beijing sublineage (66.7%) was more dominant than the ancient Beijing sublineage (33.3%), which was close to data from Shenzhen (modern Beijing 61.5%) [17].

We found that 13.1% (14/107) of the pulmonary TB cases in Luodian county had genomic-clustered strains over a period of less than one year. The cluster rate was lower than in other studies, including one conducted in Shenzhen over five years (25.2%, 105/417) [17] and a 3-year population-based study from Spain (33.5%, 109/325) [23]. The cluster rate in the current study is likely to be underestimated for the following reasons. First, we could not include all incident PTB cases. Because of limited resources, only smear-positive samples were sent to the higher-level hospital for culture, and this test detects only about 18.8% of active TB cases [24]. In addition, the culture-positive rate was influenced by the preservation of samples and their transfer to another hospital. Second, the study lasted only one year, whereas molecular epidemiology studies of TB usually sample for two years or more [25, 26]. Nevertheless, we still found seven genomic clusters containing 14 strains. Although we did not identify any risk factor related to genomic clustering, probably because of the short study period, we conclude that the true transmission rate of TB in Luodian is likely much higher, and further study is needed to investigate TB transmission in this area more comprehensively.

By reconstructing the phylogenetic tree together with strains from other regions of China, we found three local sublineages of the modern Beijing branch that contained strains exclusively from Luodian. This finding warrants caution because the modern Beijing genotype is increasingly prevalent globally and has caused outbreaks in different locations worldwide [27]; it has also been found to be significantly associated with recurrent TB [28], linked to multidrug-resistant (MDR) TB [29], and more likely to be highly virulent than ancient strains [30].

Based on gene mutations conferring anti-TB drug resistance, we found that 4.8% of strains were resistant to isoniazid, 3.7% to rifampicin, and 3.7% to streptomycin; only 0.9% were MDR, and no strain was resistant to any second-line drug. These data indicate that drug resistance in Luodian is at a very low level compared with Guizhou province as a whole, where 16.3% of strains were resistant to isoniazid, 13.5% to rifampicin, and 15.9% to streptomycin, and 9.4% were MDR [31]. This discrepancy may arise because TB patients in Luodian prefer conservative treatment to anti-TB chemotherapy, as most of the population in this region belongs to ethnic minorities with their own traditional living habits.

Conclusion

This retrospective genomic epidemiological study provides a preliminary map of the transmission and drug-resistance profile of PTB in Luodian, a resource-limited county in Guizhou, China. We found that 13.1% of patients with PTB had genomic-clustered MTB strains, which suggests that transmission is ongoing in this area, and further expansion of the modern Beijing branch should be monitored. This work may also serve as a genomic baseline for studying the evolution and spread of MTB in Guizhou.

Abbreviations

TB: tuberculosis; MTB: Mycobacterium tuberculosis; MDR: multidrug resistant; WGS: whole genome sequencing; PTB: pulmonary TB; CDC: Center for Disease Control and Prevention; CTAB: cetyltrimethylammonium bromide; bp: base pair; SNPs: single nucleotide polymorphisms; ML: maximum likelihood; km: kilometer.

Declarations

Acknowledgements

The authors would like to thank Tian Tian and ZhengQin Yang of Luodian hospital for collecting data from the clinical system and Xuefeng Fu for transferring samples. Special thanks are extended to the participants enrolled in this study.

Authors’ contributions

LC and PX designed the study; XL and QL collected data; ML, TL, and PX managed and analyzed data; QG and WC controlled the quality of data analysis; ML drafted the manuscript; PX and LC edited the manuscript. All authors interpreted the results, revised the report, and approved the final version.

Funding

This project was funded by the National Natural Science Foundation of China (No. 81760003; No. 31700130), the Science and Technology Program in Guizhou (No. [2019]1465), the Project from the Health and Family Planning Commission of Guizhou Province (No. gzwjkj2018-1-024), and the Guizhou Province Governor for Special Study of Clinical Application (No. Qian co he chengguo [2020]4Y012). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Availability of data and materials

Data sharing is not applicable to this article as no datasets were generated or analyzed during the study.

Ethics approval and consent to participate

The project proposal was approved by the Institutional Review Board of Zunyi Medical University, Guizhou, China. Written informed consent was obtained from all participants once they agreed to take part in the study.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Author details

1 Key Laboratory of Medical Molecular Virology, School of Basic Medical Sciences, Shanghai Public Health Clinical Center, Fudan University, Shanghai, 200032, China.

2 Key Laboratory of Infectious Disease & Biosafety, Institute of Life Sciences, Zunyi Medical University, No.6 West Xuefu Road, Xinpu District, Zunyi, Guizhou Province, 563000, China.

3 Department of Infectious Diseases, Hospital of Luodian County, No.96 Jiefang East Road, Luodian 550100, Guizhou, China.

4 Department of Respiratory Medicine, Affiliated Hospital of Zunyi Medical University, No.149 Dalian Road, Zunyi 563000, Guizhou, China.

5 Department of TB Control, Center of Disease Control and Prevention, 550004, Guizhou, China.

6 Department of Pathogenic Biology, West China School of Basic Medical Sciences & Forensic Medicine, Sichuan University, No.17 People's South Road, Chengdu 610041, China.

References

1. World Health Organization. Global tuberculosis report 2019. Geneva: World Health Organization; 2019.
2. Technical Guidance Group of the Fifth National TB Epidemiological Survey; The Office of the Fifth National TB Epidemiological Survey. The fifth national tuberculosis epidemiological survey in 2010. Chin J Antituberc. 2012;34(08):485-508.
3. Chen W, Zhang H, Du X, Li T, Zhao Y. Characteristics and Morbidity of the Tuberculosis Epidemic — China, 2019. China CDC Weekly. 2020;2:181-4.
4. Yang C, Shen X, Peng Y, Lan R, Zhao Y, Long B, et al. Transmission of Mycobacterium tuberculosis in China: a population-based molecular epidemiologic study. Clin Infect Dis. 2015;61(2):219-27.
5. Torok ME, Reuter S, Bryant J, Koser CU, Stinchcombe SV, Nazareth B, et al. Rapid whole-genome sequencing for investigation of a suspected tuberculosis outbreak. J Clin Microbiol. 2013;51(2):611-4.
6. Roetzer A, Diel R, Kohl TA, Rückert C, Nübel U, Blom J, et al. Whole genome sequencing versus traditional genotyping for investigation of a Mycobacterium tuberculosis outbreak: a longitudinal molecular epidemiological study. PLoS Med. 2013;10(2):e1001387.
7. Gardy JL, Johnston JC, Ho Sui SJ, Cook VJ, Shah L, Brodkin E, et al. Whole-genome sequencing and social-network analysis of a tuberculosis outbreak. N Engl J Med. 2011;364(8):730-9.
8. Chen H, Chen P, Yang J, Yuan W, Lei S, Li Y. Analysis of tuberculosis epidemics in Guizhou Province between 2005 and 2012. Modern Prev Med. 2015;42(02):342-4 (in Chinese).
9. Wang L, Zhang H, Ruan Y, Chin DP, Xia Y, Cheng S, et al. Tuberculosis prevalence in China, 1990–2010; a longitudinal analysis of national survey data. Lancet. 2014;383(9934):2057-64.
10. Larsen MH, Biermann K, Tandberg S, Hsu T, Jacobs WR. Genetic manipulation of Mycobacterium tuberculosis. Curr Protoc Microbiol. 2007:10A. 12.11-21.
11. Luo T, Yang C, Peng Y, Lu L, Sun G, Wu J, et al. Whole-genome sequencing to detect recent transmission of Mycobacterium tuberculosis in settings with a high burden of tuberculosis. Tuberculosis. 2014;94(4):434-40.
12. Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S. MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol. 2011;28(10):2731-9.
13. Yang C, Luo T, Shen X, Wu J, Gan M, Xu P, et al. Transmission of multidrug-resistant Mycobacterium tuberculosis in Shanghai, China: a retrospective observational study using whole-genome sequencing and epidemiological investigation. Lancet Infect Dis. 2017;17(3):275-84.
14. Sandgren A, Strong M, Muthukrishnan P, Weiner BK, Church GM, Murray MB. Tuberculosis drug resistance mutation database. PLoS Med. 2009;6(2):e2.
15. Zhang H, Li D, Zhao L, Fleming J, Lin N, Wang T, et al. Genome sequencing of 161 Mycobacterium tuberculosis isolates from China identifies genes and intergenic regions associated with drug resistance. Nat Genet. 2013;45(10):1255-60.
16. Zhao L, Li M, Liu H, Xiao T, Li G, Zhao X, et al. Beijing genotype of Mycobacterium tuberculosis is less associated with drug resistance in south China. Int J Antimicrob Agents. 2019;54(6):766-70.
17. Jiang Q, Liu Q, Ji L, Li J, Zeng Y, Meng L, et al. Citywide Transmission of Multidrug-resistant Tuberculosis Under China’s Rapid Urbanization: A Retrospective Population-based Genomic Spatial Epidemiological Study. Clin Infect Dis. 2019. pii: ciz790.
18. Li X, Xu P, Shen X, Qi L, DeRiemer K, Mei J, Gao Q. Non-Beijing strains of Mycobacterium tuberculosis in China. J Clin Microbiol. 2011;49:392-5.
19. Deng J, Liu H, Wang B, Dong H, Zhang Z, Zhao X, et al. Spoligotyping of Mycobacterium tuberculosis in the South of Sichuan Province. Chin J Zoon. 2015;31(12):1116-9 (in Chinese).
20. Liu F, Liu Z, Wang X, Zhao X, Dong H, Liu J, et al. Genotyping study of 208 Mycobacterium tuberculosis clinical isolates from with Spoligotyping. Chin J Zoon. 2007;23(12):1226-30 (in Chinese).
21. Chen L, Yang X, Ma L, Yang H, Chen J, Ru H, et al. Analysis of the distribution of Beijing genotype strains of Mycobacterium tuberculosis in parts of Yunnan province. J Patho Biolo. 2015;10(09):825-9 (in Chinese).
22. Chen L, Li N, Liu Z, Liu M, Lv B, Wang J, et al. Genetic diversity and drug susceptibility of Mycobacterium tuberculosis isolates from Zunyi, one of the highest-incidence-rate areas in China. J Clin Microbiol. 2012;50(3):1043-7.
23. Xu Y, Cancino-Muñoz I, Torres-Puente M, Villamayor LM, Borrás R, Borrás-Máñez M, Bosque M, Camarena JJ, Colomer-Roig E, Colomina J, et al. High-resolution mapping of tuberculosis transmission: Whole genome sequencing and phylogenetic modelling of a cohort from Valencia Region, Spain. PLoS Med. 2019;16(10):e1002961.
24. Chen H, Yang J, Yuan W, Song Q, Chen W, Chen Z, Chen H. Analysis of the result of epidemiological survey on tuberculosis in Guizhou Province. Modern Prev Med. 2013;40(07):1214-5 (in Chinese).
25. Yang C, Lu L, Warren JL, Wu J, Jiang Q, Zuo T, et al. Internal migration and transmission dynamics of tuberculosis in Shanghai, China: an epidemiological, spatial, genomic analysis. Lancet Infect Dis. 2018;18(7):788-95.
26. Guerra-Assuncao JA, Crampin AC, Houben RM, Mzembe T, Mallard K, Coll F, et al. Large-scale whole genome sequencing of M. tuberculosis provides insights into transmission in a high prevalence area. eLife. 2015;4. doi: 10.7554/eLife.05166
27. Liu Q, Ma A, Wei L, Pang Y, Wu B, Luo T, et al. China's tuberculosis epidemic stems from historical expansion of four strains of Mycobacterium tuberculosis. Nat Ecol Evol. 2018;2(12):1982-92.
28. Hang NTL, Maeda S, Keicho N, Thuong PH, Endo H. Sublineages of Mycobacterium tuberculosis Beijing genotype strains and unfavorable outcomes of anti-tuberculosis treatment. Tuberculosis. 2015;95(3):336-42.
29. Nieto Ramirez LM, Ferro BE, Diaz G, Anthony RM, de Beer J, van Soolingen D. Genetic profiling of Mycobacterium tuberculosis revealed "modern" Beijing strains linked to MDR-TB from Southwestern Colombia. PLoS One. 2020;15(4):e0224908.
30. Ribeiro SCM, Gomes LL, Amaral EP, Andrade MRM, Almeida FM, Rezende AL, et al. Mycobacterium tuberculosis Strains of the Modern Sublineage of the Beijing Family Are More Likely To Display Increased Virulence than Strains of the Ancient Sublineage. J Clin Microb. 2014;52(7):2615-24.
31. He Y, Yuan W, Chen H. An analysis on the epidemic characteristics of tuberculosis drug resistance in Guizhou province. Stud Trace Elem Health. 2018;35(4):50-2 (in Chinese).

Tables

Table 1. Characteristics of genomic-clustered and unique cases of pulmonary Tuberculosis and genotype of MTB in Luodian, China

Values are No. (%). "/" indicates not applicable.

| Characteristic | L2 (n=57) | L4 (n=50) | Clustered (n=14) | Unique (n=93) | Total |
|---|---|---|---|---|---|
| Bacteriologic factor | | | | | |
| Genotype (N=107) | | | | | |
| Modern Beijing | 38 (66.7) | / | 8 | 30 (32.3) | 38 (35.5) |
| Ancient Beijing | 19 (33.3) | / | 0 | 19 (20.4) | 19 (17.8) |
| Non-Beijing | / | 50 (100.0) | 6 (42.9) | 44 (47.3) | 50 (46.7) |
| Demographic factor | | | | | |
| Gender (N=107) | | | | | |
| Male | 34 (59.6) | 35 (70.0) | 6 (42.9) | 63 (67.7) | 69 (64.5) |
| Female | 23 (40.4) | 15 (30.0) | 8 (57.1) | 30 (32.3) | 38 (35.5) |
| Age, years (N=107) | | | | | |
| 15-24 | 3 (5.3) | 5 (10.0) | 1 (7.1) | 7 (7.5) | 8 (7.5) |
| 25-34 | 8 (14.0) | 5 (10.0) | 2 (14.3) | 11 (11.8) | 13 (12.1) |
| 35-44 | 8 (14.0) | 4 (8.0) | 2 (14.3) | 10 (10.8) | 12 (11.2) |
| 45-54 | 12 (21.1) | 15 (30.0) | 2 (14.3) | 25 (26.9) | 27 (25.2) |
| ≥55 | 26 (45.6) | 21 (42.0) | 7 (50.0) | 40 (43.0) | 47 (43.9) |
| Occupation (N=105) | | | | | |
| Student | 1 (1.8) | 2 (4.0) | 1 (7.1) | 2 (2.2) | 3 (2.9) |
| Farmer | 54 (98.2) | 48 (96.0) | 13 (92.9) | 89 (97.8) | 102 (97.1) |
| Clinic factor | | | | | |
| TB history (N=106) | | | | | |
| New case | 49 (87.5) | 40 (80.0) | 11 (78.6) | 78 (84.7) | 89 (84.0) |
| Retreated case | 7 (12.5) | 10 (20.0) | 3 (21.4) | 14 (15.2) | 17 (16.0) |
| Geographic factor | | | | | |
| Living town (N=105) | | | | | |
| Bianyang | 14 (25.9) | 15 (30.0) | 2 (14.3) | 27 (29.0) | 29 (27.1) |
| Fengting Zhen | 5 (9.2) | 2 (4.0) | 1 (7.1) | 6 (6.5) | 7 (6.5) |
| Fengting Xiang | 1 (1.9) | 1 (2.0) | 1 (7.1) | 1 (1.1) | 2 (1.9) |
| Hongshuihe | 1 (1.9) | 6 (12.0) | 1 (7.1) | 6 (6.5) | 7 (6.5) |
| Longping | 20 (37.0) | 10 (20.0) | 5 (35.7) | 25 (26.9) | 30 (28.0) |
| Luokun | 7 (13.0) | 2 (4.0) | 2 (14.3) | 7 (7.5) | 9 (8.4) |
| Maojing | 1 (1.9) | 5 (10.0) | 1 (7.1) | 5 (5.4) | 6 (5.8) |
| Moyang | 5 (9.2) | 8 (16.0) | 1 (7.1) | 12 (12.9) | 13 (12.1) |
| Muyin | 0 | 1 (2.0) | 0 | 1 (1.1) | 1 (0.9) |

Figures

Figure 1

A. Sample enrollment and study flowchart. B. Map of the study site in Guizhou, China (the study site is located in the south of Guizhou province, which lies in the southwest of China). Note: The designations employed and the presentation of the material on this map do not imply the expression of any opinion whatsoever on the part of Research Square concerning the legal status of any country, territory, city or area or of its authorities, or concerning the delimitation of its frontiers or boundaries. This map has been provided by the authors.

Figure 2

Phylogenetic tree of 107 MTB strains from Luodian. A maximum likelihood phylogenetic tree was constructed based on the SNP differences between each pair of the 107 MTB strains. Lineage 2 (blue and purple lines) and Lineage 4 (yellow lines) genotypes were found in the phylogenetic tree. The Lineage 2 strains diverged into two sublineages, ancient Beijing (blue lines) and modern Beijing (purple lines). Four types of information are shown to the right of the phylogenetic tree. The first is the town where the case lived, distinguished by color. The second is the strain ID; two strains (LD18069 and LD18134) were sequenced and analyzed twice to test the reproducibility of WGS and data processing. The third is the cluster information. The fourth is the drug-resistance mutations detected in our strains.

Figure 3

Phylogenetic tree of 107 strains from Luodian and 161 strains from other areas of China. Strains with purple lines and text were collected from Luodian, while strains in blue were from other parts of China, for which data were previously published [15]. The yellow area indicates the modern Beijing branch.

Supplementary Files

This is a list of supplementary files associated with this preprint.

Supplementary.docx
