Quick viewing(Text Mode)

GSMN-ML- a Genome Scale Metabolic Network Reconstruction of the Obligate Human Pathogen Mycobacterium Leprae

GSMN-ML- a Genome Scale Metabolic Network Reconstruction of the Obligate Human Pathogen Mycobacterium Leprae

bioRxiv preprint doi: https://doi.org/10.1101/819508; this version posted October 25, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available for use under a CC0 license.

1 Title: GSMN-ML- a genome scale metabolic network

2 reconstruction of the obligate human pathogen

3 Mycobacterium leprae 4 Running title: Reconstruction of Mycobacterium leprae genome 5 scale metabolic network 6 Khushboo Borah1, Jacque-Lucca Kearney2, Ruma Banerjee3, Pankaj Vats3, 7 Huihai Wu1, Sonal Dahale3, Sunitha Manjari K3, Rajendra Joshi3, Bhushan 8 Bonde4, Olabisi Ojo5#, Ramanuj Lahiri6, Diana L. Williams6# and Johnjoe 9 McFadden1* 10 1Faculty of Health and Medical Sciences, University of Surrey, Guildford, GU2 7XH, 11 UK 12 2Robens Centre for Public and Environmental Health, University of Surrey, GU27XH, 13 UK 14 3HPC-Medical and Bioinformatics Applications Group, Centre for Development of 15 Advanced Computing, Pune University Campus, Pune-411007, INDIA 16 4Head of Innovation Development, IT-Early Solutions, UCB Pharma, Slough, SL1 3WE. 17 5 Department of Biological Sciences, Albany state university, GA 31705, USA 18 6National Hansen’s Disease Program, Health Resources and Services Administration, 19 Baton Rouge, LA 70803, USA. #Formerly with NHDP 20 *Corresponding author: [email protected] 21 Abstract 22 Leprosy, caused by Mycobacterium leprae, has plagued humanity for thousands 23 of years and continues to cause morbidity, disability and stigmatization in two 24 to three million people today. Although effective treatment is available, the 25 disease incidence has remained approximately constant for decades so new 26 approaches, such as vaccine or new drugs, are urgently needed for control. 27 Research is however hampered by the pathogen’s obligate intracellular lifestyle 28 and the fact that it has never been grown in vitro. Consequently, despite the 29 availability of its complete genome sequence, fundamental questions regarding 30 the biology of the pathogen, such as its metabolism, remain largely unexplored. 31 In order to explore the metabolism of the leprosy bacillus with a long-term aim 32 of developing a medium to grow the pathogen in vitro, we reconstructed an in 33 silico genome scale metabolic model of the bacillus, GSMN-ML. The model was

1 bioRxiv preprint doi: https://doi.org/10.1101/819508; this version posted October 25, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available for use under a CC0 license.

34 used to explore the growth and biomass production capabilities of the pathogen 35 with a range of nutrient sources, such as amino acids, , glycerol and 36 metabolic intermediates. We also used the model to analyze RNA-seq data 37 from M. leprae grown in mouse foot pads, and performed Differential Producibility 38 Analysis (DPA) to identify metabolic pathways that appear to be active during 39 intracellular growth of the pathogen, which included pathways for central carbon 40 metabolism, co-factor, lipids, amino acids, nucleotides and cell wall synthesis. 41 The GSMN-ML model is thereby a useful in silico tool that can be used to 42 explore the metabolism of the leprosy bacillus, analyze functional genomic 43 experimental data, generate predictions of nutrients required for growth of the 44 bacillus in vitro and identify novel drug targets. 45 46 Author Summary 47 Mycobacterium leprae, the obligate human pathogen is uncultivable in axenic 48 growth medium, and this hinders research on this pathogen, and the 49 pathogenesis of leprosy. The development of novel therapeutics relies on the 50 understanding of growth, survival and metabolism of this bacterium in the host, 51 the knowledge of which is currently very limited. Here we reconstructed a 52 metabolic network of M. leprae- GSMN-ML, a powerful in silico tool to study 53 growth and metabolism of the leprosy bacillus. We demonstrate the application 54 of GSMN-ML to identify the metabolic pathways, and metabolite classes that 55 M. leprae utilizes during intracellular growth. 56 57 Key words 58 Mycobacterium leprae; metabolism; metabolic network; genome scale model; 59 flux balance analysis; differential producibility analysis 60 61 Introduction 62 The mycobacterial genus includes two of the greatest scourges of humanity: 63 Mycobacterium tuberculosis and M. leprae, responsible for tuberculosis (TB) and 64 leprosy respectively. Leprosy is one of the oldest plagues of mankind yet 65 remains prevalent in developing countries. In 1981, the World Health 66 Organization (WHO) recommended multidrug treatment (MDT) of disease with

2 bioRxiv preprint doi: https://doi.org/10.1101/819508; this version posted October 25, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available for use under a CC0 license.

67 dapsone, rifampicin and clofazimine. Implementation of MDT achieved a 98% 68 cure rate leading to reduction of prevalence by a remarkable 45% within three 69 years. Yet, although MDT reduced global prevalence of leprosy, annual new 70 case detection has remained fairly constant over the last decade [1]. Problems 71 with control include the extremely lengthy drug treatment complicated by 72 reactions that can lead to nerve damage and worsening of symptoms. Relapse 73 of infection can also occur after treatment if the MDT therapies fail to clear 74 drug resistant [2]. Additionally, new cases occurring in children, reported 75 at 9.2% in 2013, have not reduced, indicating that active transmission continues 76 [1], [2]. According to the WHO reports, 210,758 new cases of leprosy were 77 detected in 136 countries in 2015 with the highest number of cases recorded 78 in the developing nations of South-east Asia [3]. New approaches are clearly 79 needed for case-finding, diagnosis, treatment and tackling social stigma of this 80 devastating disease [3], [4]. 81 One of the most striking facts about leprosy is how different it is from the 82 disease caused by its related pathogen, M. tuberculosis. Both M. tuberculosis 83 and M. leprae are intracellular pathogens that replicate primarily in macrophages 84 but whereas the TB bacillus appears to primarily infect macrophages (and 85 possibly also dendritic cells), the leprosy bacillus replicates in many cell types 86 but particularly Schwann cells and endothelial cells [5, 6]. Possibly as a 87 consequence of their differing cell tropism, tuberculosis is predominantly a 88 pulmonary inflammatory infection (although extra-pulmonary infections are 89 common) characterized by strong cellular immune reactions; whereas leprosy 90 causes both disseminated infections and localized infections [7], [8]. The 91 replicative capabilities of the two pathogens are also different, with M. 92 tuberculosis being able to grow both intracellularly and extracellularly; whereas 93 M. leprae is an obligate intracellular pathogen. Additionally, M. tuberculosis is 94 able to grow in vitro, in minimal medium containing a single carbon source 95 such as glycerol, a nitrogen source, such as ammonium ions and a few 96 essential mineral ions [9], whereas M. leprae has never been grown in axenic 97 media, even with the addition of various growth supplements [10]. This latter 98 feature severely hampers most biological studies of the pathogen. The replication 99 rate of the pathogens also differs markedly. Although both are very slow growing 100 organisms, M. tuberculosis replicates with a doubling time of about a day in 3 bioRxiv preprint doi: https://doi.org/10.1101/819508; this version posted October 25, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available for use under a CC0 license.

101 optimal conditions whereas the doubling time of M. leprae in infected mouse is 102 10-14 days [11], [12]. The reasons for these differences in growth, virulence 103 characteristics and cell tropism still remain unknown. 104 A major advances in the understanding the pathogenesis of both TB and leprosy 105 was achieved with the publication of their complete genome sequences, first 106 with M. tuberculosis in 1998 [13] and then M. leprae in 2001 [10]. Analysis of 107 the M. leprae genome sequence revealed that the pathogen had undergone 108 massive gene decay with only 1604 predicted open reading frames compared 109 to 3959 in M. tuberculosis. The reduced genome is a consequence of multiple 110 mutations in coding sequences leading to gene decay, and the accumulation of 111 1116 pseudogenes and thereby loss of their associated functions [12], [14]. The 112 loss of so many genes raised the obvious possibility that the inability of M. 113 leprae to grow in axenic culture is due to mutational loss of biosynthetic function 114 and consequent reliance on importing complex nutrients from the host which 115 are not available in artificial media, as has been found for several other 116 intracellular parasites. However, analysis of the genome sequence of M. leprae 117 demonstrated that the most of the anabolic capability of the pathogen, relative 118 to M. tuberculosis, seems to be intact. The pathogen appears to have retained 119 complete systems for synthesis of all amino acids, except for methionine 120 and lysine [10]. The biosynthesis of purines, pyrimidines, nucleosides, 121 nucleotides, vitamins and cofactors is also impaired [10]. Gene deletion and 122 decay does however appear to have eliminated many important catabolic 123 systems and respiratory systems. For example, as previously reported Cole et 124 al., 2001, the aerobic respiratory chain of M. leprae is truncated due to loss 125 of several genes in the NADH oxidase operon, nuoA-N, potentially eliminating 126 the ability of the pathogen to produce ATP from the oxidation of NADH. Yet 127 central carbon metabolism appears to be intact in the pathogen with an intact 128 glycolytic pathway, pentose pathway (PPP), gluconeogenesis and the 129 tricarboxylic acid (TCA) cycle. Thus, despite the availability of the complete 130 genome sequence of M. leprae, the inability of the pathogen to grow in axenic 131 culture remains a mystery. Also, the substrates utilized during growth of M. 132 leprae in vivo, remain unclear. Recent research has also revealed intriguing 133 capabilities of the leprosy bacillus to manipulate its host cell including its ability 134 to dedifferentiate and reprogram its host Schwann cell to a stem-cell-like state 4 bioRxiv preprint doi: https://doi.org/10.1101/819508; this version posted October 25, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available for use under a CC0 license.

135 in which the pathogen can proliferate more efficiently and which likely promotes 136 its dissemination through the body [15]. 137 138 The availability of full genome sequences does however allow reconstruction of 139 genome-scale metabolic reaction networks for micro-organisms. These not only 140 instantiate current knowledge of the metabolism of particular organisms; but may 141 be used to analyze and integrate data, particularly functional genomic data, to 142 generate novel predictions that may be subjected to experimental validation. 143 Genome-scale metabolic networks have been constructed for many microbes 144 including M. tuberculosis and Mycobacterium bovis where they have been used 145 to identify novel metabolic pathways and examine their growth and likely 146 substrate utilization within their host cells [16], [17], [18], [19]. In this study we 147 constructed the first genome-scale network of M. leprae, GSMN-ML. We used 148 the network to examine the organism’s catabolic and anabolic capabilities to 149 gain insight into why the pathogen is unable to grow in vitro. We also used 150 GSMN-ML to interrogate a RNA-seq dataset obtained from ex vivo grown M. 151 leprae to probe the pathogen’s metabolic response to its host environment and 152 analyzed the data using differential producibility analysis (DPA), a bioinformatic 153 tool that translates transcriptomic data to predictions of metabolic responses 154 [20]. 155 156 Materials and methods 157 Ethics statement 158 All animal studies were performed under the scientific protocol number A-105 159 reviewed and approved by the National Hansen’s Disease Programs (NHDP) 160 Institutional Animal Care and Use Committee, and were conducted in strict 161 accordance with all state and federal laws in adherence with PHS policy and 162 as outlined in The Guide to Care and Use of Laboratory Animals, Eighth

163 Edition. Mice were euthanized by CO2 asphyxiation followed by cervical 164 dislocation. 165 M. leprae culture and growth conditions 166 Athymic nude mice (Envigo) were infected by inoculating 3 x 107 M. leprae, 167 strain Thai-53, in each hind footpad. At 4-5 months post inoculation mice were

5 bioRxiv preprint doi: https://doi.org/10.1101/819508; this version posted October 25, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available for use under a CC0 license.

168 euthanized and footpad tissues removed. These tissues were either fixed in 169 70% ethanol for at least 48 h or minced and homogenized immediately for 170 purification of viable M. leprae [21]: in vivo M. leprae. Viable bacteria were 171 then also added to axenic medium 7H9 containing Caesitone (0.1% w/v), BSA 172 (0.5% w/v), dextrose (0.75% w/v) and ampicillin (50µg/ml), and incubated at 33°C,

173 5% O2 for 48h. Following incubation, bacteria were pelleted and fixed in 70% 174 ethanol: in vitro M. leprae. After fixation for at least 48h, ethanol was removed 175 and M. leprae were treated with 0.1N NaOH to remove mouse tissues. Bacteria 176 were adjusted to 2 x 109/ml and 1 ml aliquots were pelleted and resuspended 177 in 1 ml Trizol and stored at -200C. M. leprae in ethanol-fixed tissues were 178 also purified, treated with NaOH, resuspended in Trizol and stored at -200C. 179 180 RNA Purification and RNASeq Analysis 181 Bacterial lysis and RNA purification was performed using FastPrep-24 vertical 182 homogenizer (MP BioMedicals), and FastPrep Lysing Matrix B tubes. After 183 DNase treatment, an aliquot of sample RNA was converted to cDNA and 184 analyzed for the presence of M. leprae esxA expression and DNA contamination 185 [21]. Ribosomal RNA was depleted from total RNA preparations using Ribo- 186 Zero rRNA Removal Kit according to manufacturer’s recommendations, and RNA 187 quantity and quality was determined using NanoDrop 2000 and Agilent 2100 188 BioAnalyzer. Sequencing samples were prepared using the SOLiD® Total RNA- 189 Seq Kit ( Life Technologies, PN4445374) using 100 ng ribosome RNA-depleted 190 and barcoded t-RNA (SOLiD™ RNA Barcoding Kit, Module 1-16 (Life 191 Technologies, PN 4427046) according to the manufacturer’s protocol. Emulsion 192 PCR and SOLiD sequencing, 75 base pairs single direction, were performed 193 for the SOLiD™ 5500 System (Life Technologies, Thermo Fisher Scientific). 194 Sequencing analysis was done using Bacterial RNA-seq Analysis Kit 1.0 on 195 Maverix Analytic Platform (Maverix Biomics Inc.). Raw sequencing reads from 196 SOLiD sequencing platform that were converted into FASTQ file format were 197 quality checked for potential sequencing issues and contaminants using FastQC 198 (FastQC). Adapter sequences, primers, Ns, and reads with quality score below 199 13 were trimmed using fastq-mcf of ea-utils [22] and Trimmomatic [23]. Reads 200 with a remaining length of < 20 bp after trimming were discarded. Pre-processed

6 bioRxiv preprint doi: https://doi.org/10.1101/819508; this version posted October 25, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available for use under a CC0 license.

201 reads were mapped to the Mycobacterium leprae TN genome (RefSeq Accession 202 Number: NC_002677) using EDGE-pro [24]. Read coverage on forward and 203 reverse strands for genome browser [25] visualization was computed using 204 BEDtools [26], SAMtools [27] and UCSC Genome Browser utilities [23]. Read 205 counts for RefSeq genes generated by EDGE-pro [25] were normalized across 206 all samples and then used for differential expression analysis using DEseq [28]. 207 Significant differentially expressed genes were determined by adjusted P-value

208 with a threshold of 0.05. Fold change (log2) between samples were hierarchically 209 clustered using Pearson correlation. 210 211 Construction of GSMN-ML 212 The updated M. tuberculosis genome scale metabolic network (GSMN-TB_2) 213 [16] created for H37Rv strain (supplementary File S1) was the basis of 214 reconstruction of M. leprae metabolic model GSMN-ML (Supplementary File S3). 215 The reconstruction of the network followed the methods as described previously 216 [9], [16]. M. leprae (RefSeq ID: NC_002677) functional genes were identified 217 using the online GenBank tool [29], and compared to the same list used for 218 the development of the GSMN-TB of M. tuberculosis using the bi-directional 219 best hit criteria to search for orthologs [30]. Using M. tuberculosis H37Rv 220 (RefSeq ID: NC_000962) as a template, the metabolic network was reconstructed 221 using Pathway Tools (Version 24.0) [31]. Non-functional pathways in, and gene 222 essentiality data for M. leprae were identified and the model refined to remove 223 any non-homologous pathways, identified by literature search and metabolic 224 pathway data from previous mutagenesis study [32]. Surrey-FBA was used in 225 the study for the flux balance analysis to further refine the model by iteration 226 [33]. Single genes, and the respective pathways were removed and the model’s 227 feasibility re-assessed. Only edits which did not render the model infeasible 228 were used in the final product. If found to be essential, transport reactions 229 were introduced for their product(s), so that the inactivation of these genes will 230 be compensated by uptake of the predicted metabolites from the artificial media 231 (presumed to be obtained from host in vivo). The full list of reactions, for 232 GSMN-ML and GSMN-TB_2 models can be found in supplementary files S1 233 and S3 respectively.

7 bioRxiv preprint doi: https://doi.org/10.1101/819508; this version posted October 25, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available for use under a CC0 license.

234 Flux balance analysis (FBA) 235 FBA simulations in this study were also performed with the SurreyFBA software 236 as described previously [9], [33]. Simulations were conducted under steady-state 237 conditions, using biomass production as the objective function to obtain the 238 steady state theoretical fluxes through the reactions in the network that 239 maximised biomass production. For FBA, substrate inputs for carbon and 240 nitrogen sources were specified in the test medium that was used to constrain 241 the simulations. External metabolites, except in the cases of those identified as 242 being rate-limiting, were considered to be freely available and disposable. Rate- 243 limiting external nutrients are bound by upper and lower flux rate limits, as 244 described in the model’s set-up. FVA analysis were performed as described in 245 Gevorgyan et al., 2011 to obtain the upper and lower flux limits through a 246 particular reaction [33]. Gene essentiality predictions were tested using 247 SurreyFBA software. The maximal theoretical growth rate of each in silico gene 248 knock out was calculated by removing single genes from the network and 249 performing FBA linear programming as described previously [9], [16], [33]. 250 251 Differential producibility analysis (DPA) 252 DPA seeks to translate transcriptomic signatures into metabolic fluxes using 253 FBA to identify metabolites or metabolic pathways most affected by the changes 254 in gene transcription and is described in detail by Bonde et al., 2011 [20]. 255 RNA-seq data from M. leprae, strain Thai-53 cultivated in mice foot pads were 256 used for DPA. DPA utilized a FBA-based metabolite expression analysis using 257 glucose or glycerol as the carbon source input to GSMN-ML to calculate the 258 maximal theoretical flux towards each metabolite in the network. For every 259 metabolite, reactions involved were then ranked based on the expression values, 260 with those ≥ 2 being highly expressed and those < 2 lowly expressed. A 261 median value was assessed, creating a matrix of values for each metabolite. 262 The same process is repeated with down-regulated reactions and the result is 263 two lists of the most differentially produced metabolites for a given experimental 264 environment. Statistical significance of the DPA is assessed by the following: 265 ≥푂, o푟 266 ≤푂, 푡ℎ =1

8 bioRxiv preprint doi: https://doi.org/10.1101/819508; this version posted October 25, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available for use under a CC0 license.

267 표푡ℎ =0 268 퐸=푆/푁 269 where RUm = median expected value of up-regulated metabolite 270 DUm = median expected value of down-regulated metabolite 271 O = observed value in real dataset 272 S = arbitrary score value 273 N = number of datasets 274 E = E-value 275 A significance threshold of E <0.15 was decided for further analysis 276 277 Results 278 Construction of GSMN-ML genome scale metabolic network 279 Our approach was to start with our previously-constructed metabolic model of 280 M. tuberculosis, GSMN-TB [16] and step-by-step, replace all M. tuberculosis 281 genes with orthologues from M. leprae. We used an updated version of GSMN- 282 TB_2, which includes reactions for cholesterol catabolism (Supplementary File 283 S1). We adjusted the biomass composition to reflect what is known about the 284 macromolecular composition of M. leprae, for example, removing M. tuberculosis- 285 specific components and adding the phenolic glycolipid (PGL1) which is a major 286 antigen in leprosy. This entailed adding additional reactions for PGL1 synthesis. 287 Also, M. leprae lacks N-glycosylated muramic acid in its peptidoglycan; and 288 glycine is substituted in place of L-alanine [34], so this was reflected in the in 289 silico peptidoglycan composition. Mannosyl β-1 phosphodolichol (MPD) of M. 290 tuberculosis is a polyketide known to invoke strong T-cell response and pks12 291 [35], which is required for formation of MPD is a pseudogene in M. leprae 292 (ML1437c) and therefore, the corresponding biosynthesis reactions were removed 293 from the model. A small number of additional metabolic reactions that are found 294 in M. leprae but absent in M. tuberculosis, were added to the network. These 295 included ML2177, a putative uridine phosphorylase involved in nucleotide 296 salvage, highlighting the previously described difference between nucleotide 297 salvage in M. leprae compared to M. tuberculosis [36]; and ML0247, a putative 298 arsenate reductase arsC. Using the template of reactions available for M. 299 tuberculosis H37Rv in GSMN-TB_2 (856 reactions catalysed by 729 ), 300 we were able to transfer 713 reactions to construct the M. leprae model using

9 bioRxiv preprint doi: https://doi.org/10.1101/819508; this version posted October 25, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available for use under a CC0 license.

301 gene orthology. However, only 396 of orthologues appeared to be functional in 302 M. leprae, the remainders were pseudogenes. 21 additional reactions were 303 added to the M. leprae model based on manual literature curation, and the 304 resulting M. leprae network encoded a total of 417 genes encoding metabolic 305 enzymes. This reduced the number of functional enzymes (nearly half the 306 number as compared to M. tuberculosis), which is a direct consequence of the 307 massive genome reduction as well as pseudogene formation in M. leprae. 308 309 Incorporation of pseudogenes and orphan reactions facilitated in 310 silico growth of GSMN-ML 311 We applied flux balance analysis (FBA) to investigate the ability of the in silico 312 M. leprae model to synthesize biomass from basic nutrients, starting from the 313 minimal nutrient requirements (supplementary File S2) needed to grow M. 314 tuberculosis in vitro. GSMN-ML was however unable to synthesize biomass from 315 the minimal requirements that are needed by GSMN-TB. To address this problem 316 we reintroduced reactions that were removed in constructing GSMN-ML from 317 GSMN-TB (because genes encoding these reactions were pseudogenes in M. 318 leprae) until we obtained a feasible network. This resulted in the introduction 319 of several orphan reactions that appeared to be essential in M. leprae, but 320 whose function is not encoded by any known functional M. leprae gene. We 321 then adopted two approaches to minimize the number of these GSMN-ML 322 orphan reactions. First, we performed bioinformatic analysis to identify any 323 hypothetical gene that could complement a function provided by an essential 324 orphan reaction. If such a gene was found, it was added to the network. 325 However, in many cases, no putative gene could be identified to potentially 326 encode the function of the orphan gene. We then investigated whether the 327 orphan reaction could be made non-essential by opening a transport gate for 328 a metabolite that was potentially imported from the intracellular environment of 329 M. leprae. If that was the case then we opened that transport gate to allow 330 import and removed that orphan reaction from the network. Lastly, essential 331 orphan reactions that could not be complemented by a hypothetical gene whose 332 product could not be feasibly obtained from the host were left as functional 333 orphan reactions. This operation led to the generation of a feasible network,

10 bioRxiv preprint doi: https://doi.org/10.1101/819508; this version posted October 25, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available for use under a CC0 license.

334 GSMN-ML that included eight orphan transport reactions (Table 1). A full list of 335 network reactions and the medium components used for this analysis is included 336 in supplementary files S2 and S3 respectively. 337 The following potentially imported nutrients that were identified by this procedure 338 were formate, L,L-2,6-diaminopimelate, lysine, methionine, tetrahydrofolate, 339 coenzyme-B (COB-I), pantothenate and protoporphyrinogen (Table 1). The 340 requirement for methionine and lysine biosynthesis is caused by the genes 341 metC encoding O-acetyl-L-homoserine sulfhydrylase (ML0683c) and dapB that 342 encodes dihydrodipicolinate reductase (ML1527) being pseudogenes in M. leprae. 343 The requirement for L,L-2,6-diaminopimelate (DAPIM), an intermediate for lysine 344 synthesis and one of the principle substrates for peptidoglycan biosynthesis, is 345 due to the absence of diaminopimelate decarboxylase (EC: 4.1.1.20) in M. 346 leprae (Table 1). The requirement for tetrahydrofolate, formate and COB-I is 347 due to many genes involved in their synthesis being pseudogenes in M. leprae. 348 The requirement for pantothenate, the substrate for coenzyme-A (CoA) synthesis, 349 is due to M. leprae lacking 2-dehydropantoate 2-reductase (EC: 1.1.1.169) that 350 is needed to synthesize CoA from . The requirement for 351 protoporphyrinogen is a consequence of M. leprae’s inability to synthesise 352 porphyrin, and thereby any hemoproteins, from L-glutamate due to the absence 353 of the hemN gene, coding for oxygen-independent coproporphyrinogen III 354 oxidase. The import flux of these nutrients was fixed to the minimum needed 355 for maximum biomass production (see supplementary File S3 and Figure 1). 356 357 Substrate utilization by GSMN-ML 358 A major finding of the genome sequencing project by Cole et al., 2001 was 359 that M. leprae has lost many catabolic genes and pathways [10]. To investigate 360 this in more detail we investigated the ability GSMN-ML to utilize various carbon 361 and nitrogen substrates for biomass production and compared it to GSMN-TB_2 362 (supplementary File S4). The results, (Figure 2A) shows that, perhaps 363 surprisingly, M. leprae has retained the ability to utilize many carbon sources; 364 yet, in comparison with M. tuberculosis, has lost the ability to utilize acetate 365 and glycolate due to pseudogenization of acetate , phosphate acetate 366 and acetyl-coA synthase. M. leprae has also lost almost the entire

11 bioRxiv preprint doi: https://doi.org/10.1101/819508; this version posted October 25, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available for use under a CC0 license.

367 pathway involved in cholesterol utilization and cannot thereby utilize host-derived 368 cholesterol [37], which appears to be a growth substrate for M. tuberculosis in 369 vivo. Catabolism of amino acids, such as valine, threonine and isoleucine as 370 carbon sources was also impaired in M. leprae (Figure 2B). To explore the 371 nitrogen requirements of GSMN-ML, FBA was performed with five different single 372 nitrogen sources: ammonia, nitrate, urea, glutamate and glutamine (Figure 2C) 373 and with glucose as the primary carbon source, comparing to GSMN-TB_2. 374 Both models were able to utilize glutamate and glutamine to generate 375 approximately equal biomass. The GSMN-TB_2 model was able to generate 376 31.3% more BIOMASS using ammonia as compared to GSMN-ML, and unlike 377 M. leprae, was able to grow on urea as a single nitrogen source. Neither of 378 the models could generate biomass using nitrate as nitrogen source. These 379 data suggest that glutamate or glutamine is the most likely nitrogen source for 380 M. leprae. 381 382 DPA to predict intracellular metabolism of M. leprae from 383 transcriptome data 384 Directly measuring metabolism is very difficult, particularly for intracellular 385 bacteria. It is however relatively straightforward to obtain gene expression profiles 386 of intracellular bacteria and, from that, discover whether key metabolic genes 387 are up- or down-regulated in a certain condition. However, single genes may 388 contribute to several different metabolic pathways; and most metabolic pathways 389 involve many genes, which may show contradictory expression patterns. To 390 provide a systems-wide insight into intracellular metabolism, we previously 391 developed Differential Producibility Analysis (DPA) that uses a genome scale 392 model to associate metabolites with system-wide analysis of gene expression 393 patterns [20]. The first step in DPA is to identify all genes (the metabolite 394 producibility gene list) that are required (in the genome-scale model) for 395 production of every metabolite, given a set of nutrients inputs. The next step 396 in DPA is to interrogate gene expression data to identify whether the genes 397 that contribute to the production of a particular metabolite tend to be associated 398 with either up-regulated or down-regulated genes in the test condition. Note that 399 metabolites may appear in both lists if, for example, one set of genes associated 400 with their production is found to be up-regulated and another set is found to

12 bioRxiv preprint doi: https://doi.org/10.1101/819508; this version posted October 25, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available for use under a CC0 license.

401 be down-regulated. In our previous study, we applied DPA to gene expression 402 data obtained from both E. coli and M. tuberculosis, including M. tuberculosis 403 replicating within macrophages [20]. In this study, we applied the same 404 approach to investigate metabolic changes associated with in vivo growth of M. 405 leprae. 406 Since the nutrients used by M. leprae in vivo are unknown, we had to perform 407 the first step of DPA with putative intracellular nutrients. Enzymes for catabolism 408 of hexoses and glycerol have been detected in M. leprae isolated from armadillos 409 [38, 39], prompting us to perform the DPA growth simulations with both 410 (independently) glucose and glycerol. RNA was extracted and immediately 411 processed from in vivo-grown viable M. leprae harvested and from mouse 412 footpads [21] (see Methodology for details). The control condition was M. leprae 413 from mouse footpads, as above, but then incubated in axenic culture for 48 414 hours under conditions (see Methodology) where the bacilli remain viable but 415 do not replicate. In vivo up-regulated genes were defined as being the product 416 of an enzyme transcribed from a gene with a fold change >2 in the RNA-Seq 417 data. In vivo down-regulated genes are defined as genes with a fold change 418 <0.5 in the RNA-Seq data (supplementary File S5). These expression lists were 419 then cross-checked against the metabolite genes to identify metabolite whose 420 production (genes encoding the enzymes involved in their production) is 421 significantly associated with the up-regulated gene list, or is significantly 422 associated with the down-regulated gene list. The top 100 metabolites associated 423 with up- or down-regulated genes for both carbon sources is presented in 424 supplementary File S6. For up-regulated genes, 67 and 54 metabolites were 425 shared between GSMN-ML using glucose and glycerol as carbon sources (Figure 426 3A, B). Pathways and areas of metabolism likely to be affected by up and 427 down-regulated genes in M. leprae during growth on the two carbon sources 428 are illustrated in Figure 3C-F. Metabolites required for synthesis of amino acids 429 comprised the major category of metabolites predominantly associated with up- 430 regulated genes in both glucose and glycerol; whereas metabolites involved in 431 synthesis were mostly associated with upregulated on glucose 432 simulations (Figure 3C, D). Metabolites involved in lipid synthesis were the 433 major family of metabolites identified as being associated with down-regulated 434 genes for growth on glucose (Figure 3E). On glycerol, metabolites for co-factor 13 bioRxiv preprint doi: https://doi.org/10.1101/819508; this version posted October 25, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available for use under a CC0 license.

435 biosynthesis were down-regulated on glycerol (Figure 3F). Several metabolites 436 involved in cell wall synthesis of M. leprae’s cell wall were associated with up- 437 regulated genes only during glycerol utilisation simulations. This is likely because 438 cell wall components, such as arabinogalactan-decaprenyl phosphate [40], and 439 d-alanyl-d-alanine, a peptide-glycan precursor [41], fatty-acid derived mycolates 440 and arabinogalactan-peptidoglycan molecules are derived directly from glycerol 441 metabolism. 442 This pattern of gene regulation of M. leprae in response to its in vivo 443 environment appears to be very different from that of intracellular M. tuberculosis 444 [20]. For example, in M. tuberculosis 25% of metabolites associated with 445 phospholipid synthesis were associated with upregulated genes compared to 446 only 1% associated with down-regulated genes [20]; whereas, in M. leprae, for 447 glycerol and glucose-based simulations, a similar fraction of around 7% of 448 metabolites involved in phospholipid synthesis were found to be associated with 449 both up and down-regulated genes. Also, in M. tuberculosis, there was a clear 450 tendency for metabolites involved in lipid metabolism (other than phospholipids) 451 to be associated with up-regulated genes, whereas, for M. leprae, they were 452 mostly associated with down-regulated genes. This pattern was reversed for 453 amino acid metabolism, with a clear association with down-regulated genes in 454 M. tuberculosis (25% compared with 15% associated with up-regulated genes) 455 [20], whereas, in M. leprae, 18% of metabolites involved in amino acid synthesis 456 were associated with up-regulated genes and only 9% associated with down- 457 regulated genes. Comparing the two pathogens, the adaptation of M. tuberculosis 458 to its intracellular environment appears to be associated with an up-regulation 459 of lipid synthesis and down-regulation of amino acid synthesis; whereas the 460 reverse appears to be true for in vivo M. leprae. 461 462 To assess the importance of M. leprae genes, i.e., whether a gene is essential 463 or non-essential for growth on glycerol and glucose, we used GSMN-ML for 464 gene knockout predictions. We calculated the growth rate when a gene in the 465 GSMN-ML network was removed. We calculated gene essentiality (GS) ratios- 466 the ratio of the growth rate of the knock out to that of wildtype for each gene 467 in the network (supplementary Files S7, S8). Figure 4, 5 shows the central 468 carbon network genes and the GS ratios. A gene was considered essential if

14 bioRxiv preprint doi: https://doi.org/10.1101/819508; this version posted October 25, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available for use under a CC0 license.

469 the GS ratio was < 0.01. We found 21 genes involved in glycolysis, 470 gluconeogenesis, TCA cycle and PPP to be essential central metabolic genes 471 for growth of M. leprae on both glucose and glycerol. We compared the GS 472 ratios for the central metabolic network genes with that of M. tuberculosis, by 473 performing GS analysis of GSMN-TB_2 using glucose or glycerol as the sole 474 carbon source. Around 64-66% genes of the total genes in the network were 475 predicted to be essential in GSMN-ML compared to ~28% in GSMN-TB_2. The 476 higher percentage of essential metabolic genes in M. leprae is likely a 477 consequence of the loss of non-essential genes due to deletion or 478 pseudogenization during evolution of M. leprae from it progenitor. 479 480 Discussion 481 In this study, we constructed GSMN-ML, a genome scale metabolic model of 482 M. leprae. Interestingly, the model predicted that the pathogen would be unable 483 to grow without the addition of supplements (Table 1) that the pathogen probably 484 obtains from the host in vivo. Although we predict that these supplements are 485 required for in vitro growth of M. leprae, it they are likely to be present in many 486 of the media that have been tested in attempts to grow M. leprae in vitro. So 487 it seems likely that additional, as yet unidentified, vital nutrients are needed for 488 in vitro growth of the pathogen. Nonetheless, Table 1 does at least provide a 489 minimal nutrient requirement that could be further investigated. Host-derived 490 lipids are a well-established carbon source in mycobacterial infection such as 491 M. tuberculosis [42-44]. Although there is no direct evidence of M. leprae using 492 host cell lipids as a nutrient source, M. leprae infection did induce changes in 493 lipid metabolism in the host Schwann cells and macrophages, making them 494 lipid-rich foamy cells [45-47]. Also, lipid droplet biogenesis in the host provides 495 growth advantage to M. leprae by providing lipid nutrient pools to the pathogen 496 [46-48]. Our FBA-based simulations indicated that carbon sources triacylglycerol 497 and phosphotidyl choline provided the highest biomass production for in silico 498 growth on GSMN-ML, suggesting these, or similar lipids, as potential carbon 499 sources for M. leprae. In comparison to M. tuberculosis, GSMN-ML failed to 500 use either amino acids or a range of other lipid-derived metabolites as carbon 501 sources. Similarly, and in contrast to M. tuberculosis, GSMN-ML was incapable

15 bioRxiv preprint doi: https://doi.org/10.1101/819508; this version posted October 25, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available for use under a CC0 license.

502 of in silico growth with cholesterol as a carbon source (Figure 2A). This is in 503 agreement with previous studies that indicate that, despite retaining an ability 504 to oxidize cholesterol, M. leprae is unable to use cholesterol as an energy or 505 carbon source [37]. This is consistent with the genome studies, which indicate 506 that evolutionary gene-decay in M. leprae has resulted in the loss of the mce4 507 operon and that encodes for active transport system for cholesterol [10], [37], 508 as well as many genes required for cholesterol catabolism. Despite being part 509 of a major component to mycobacterial cell walls [48], GSMN-ML also failed to 510 utilize galactose; indicating that galactose-cell wall components are likely 511 synthesised by the Leloir pathway in the leprosy bacillus from UDP-glucose, 512 rather than from galactose directly [10], [11]. Amino acid utilisation in M. leprae 513 and M. tuberculosis were also significantly different. While M. leprae retains the 514 ability to biosynthesize amino acids, catabolic pathways are degraded, accounting 515 for why the pathogen cannot utilize amino acids as carbon sources (Figure 2B) 516 [10]. This deficit is largely explained by the absence of the urease operon in 517 M. leprae; also accounting for the inability of GSMN-ML to utilize urea as the 518 nitrogen source (Figure 2C). 519 Our FBA analysis of GSMN-ML indicates that M. leprae is able to utilize both 520 glucose and glycerol as carbon sources, consistent with evidence that glucose 521 and/or glucose-derived metabolites were used as nutrients by M. leprae during 522 in vivo growth [38], [49]. The finding may also be relevant to the observation 523 that glucose metabolism of host Schwann cells is perturbed by M. leprae 524 infection [49]. DPA analysis of gene expression data identified clear differences 525 in intracellular metabolism of M. leprae compared to intracellular M. tuberculosis, 526 particularly in lipid and amino acid metabolism. These differences may contribute 527 to the very different virulence properties of these two related pathogens. Lastly, 528 our analysis confirmed the expected higher degree of gene essentiality in M. 529 leprae compared to M. tuberculosis. This is likely to be associated with a lower 530 degree of metabolic robustness in the leprosy pathogen, compared to the TB 531 pathogen; a vulnerability that might provide novel opportunities for new drugs. 532 533 Conclusion

16 bioRxiv preprint doi: https://doi.org/10.1101/819508; this version posted October 25, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available for use under a CC0 license.

534 Despite intensive efforts, M. leprae has never been grown in the laboratory, 535 severely hampering research on this important pathogen. In this study, we 536 describe the construction of an in silico genome scale metabolic model of M. 537 leprae GSMN-ML that is able to simulate growth. The model was used to 538 devise a base medium for efforts to grow the pathogen in the laboratory. We 539 also used the model to interrogate gene expression data from in vivo-grown M. 540 leprae and identified several metabolic differences between that and in vitro- 541 incubated cells. We also found evidence for major differences in the intracellular 542 metabolism of M. leprae compared to the related pathogen, M. tuberculosis. 543 The availability of the GSMN-ML model provides a new tool for the further 544 exploration of the biology of this important pathogen that could lead to 545 development of new approaches for control of leprosy. 546 547 Acknowledgement 548 We thank BBSRC grant: BB/L022869/1, MRC/Newton Fund: MR/M026434/1, 549 BBSRC: India Partnering Award IPA1825 and Bioinformatics Resources and 550 Applications Facility (BRAF) of Government of India for funding. We thank National 551 Institute of Allergy and Infectious Diseases (NIAID) Interagency Agreement IAA 552 15006-004-00000 for the provision of viable M. leprae for this study. 553 554 References 555 1) Chaptini C, Marshman G. Leprosy: a review on elimination, reducing the disease 556 burden, and future research. Leprosy Rev. 2015;86: 307-15. PMID: 26964426. 557 558 2) Malathi M, Thappa DM. Fixed-duration therapy in leprosy: limitations and 559 opportunities. Indian J of Dermatology 2013;58: 93-100. PMID: 23716796. 560 561 3) World Health Organisation. Weekly epidemiological record 206;91: 405–420. 562 563 4) Salgado CG, Barreto JG, da Silva MB, Frade MAC, Spencer JS. What do we 564 actually know about leprosy worldwide? The Lancet Infectious Diseases 2016;16: 565 778. PMID: 27352757. 566 567 5) Spierings E, De Boer T, Zulianello L, Ottenhoff TH. The role of Schwann cells, 568 T cells and Mycobacterium leprae in the immunopathogenesis of nerve damage 569 in leprosy. Leprosy Review 2000;71: 121-9. PMID: 11201869. 570 571

17 bioRxiv preprint doi: https://doi.org/10.1101/819508; this version posted October 25, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available for use under a CC0 license.

572 6) Madigan CA, Cambier CJ, Kelly-Scumpia KM, Scumpia PO, Cheng TY, et al. 573 A Macrophage Response to Mycobacterium leprae Phenolic Glycolipid Initiates 574 Nerve Damage in Leprosy. Cell 2017;170: 973-985.e10. PMID: 28841420. 575 576 7) Suzuki K, Akama T, Kawashima A, Yoshihara A, Yotsu RR et al. Current status 577 of leprosy: epidemiology, basic science and clinical perspectives. The Journal of 578 Dermatology 2012;39: 121-9. PMID: 21973237. 579 580 8) Guinto RS, Doull JA, de Guia I. Mortality of persons with leprosy prior to 581 sulfone therapy, Cordova and Talisay, Cebu, Philippines. International journal of 582 leprosy 1954;22: 273-84. PMID: 13232775. 583 584 9) Lofthouse EK, Wheeler PR, Beste DJ, Khatri BL, Wu H, Mendum TA, Kierzek 585 A, McFadden J. Systems-based approaches to probing metabolic variation within 586 the Mycobacterium tuberculosis complex. PLoS One 2013;8: e75913. PMID: 587 24098743. 588 589 10) Cole ST, Eiglmeier K, Parkhill J, James KD, Thomson NR et al. Massive gene 590 decay in the leprosy bacillus. Nature 2001;409: 1007-11. PMID: 11234002. 591 592 11) Wheeler P Leprosy – clues about the biochemistry of Mycobacterium leprae and 593 its host-dependency from the genome. World Journal of Microbiology and 594 Biotechnology 2003;19: 1–16. 595 596 12) Pattyn SR. The problem of cultivation of Mycobacterium leprae. A review with 597 criteria for evaluating recent experimental work. Bulletin of World Health 598 Organisation 2003;49: 403–410. PMID: 4212439. 599 600 13) Cole ST, Brosch R, Parkhill J, Garnier T, Churcher C et al. Deciphering the 601 biology of Mycobacterium tuberculosis from the complete genome sequence. 602 Nature 1998; 393: 537-44. 603 604 14) Gómez-Valero L, Rocha EP, Latorre A, Silva FJ. Reconstructing the ancestor 605 of Mycobacterium leprae: The dynamics of gene loss and genome reduction. 606 Genome Research 2007;17: 1178–1185. PMID: 17623808. 607 608 15) Masaki T, Qu J, Cholewa-Waclaw J, Burr K, Raaum R, Rambukkana A 609 Reprogramming adult schwann cells to stem cell-like cells by leprosy bacilli 610 promotes dissemination of infection. Cell 2013;152: 51–67. PMID: 23332746 611 612 16) Beste DJ, Hooper T, Stewart G, Bonde B, Avignone-Rossa C, Bushell ME, 613 Wheeler P, Klamt S, Kierzek AM, McFadden J. GSMN-TB: a web-based genome- 614 scale network model of Mycobacterium tuberculosis metabolism. Genome Biology 615 2007;8: R89. PMID: 17521419 616 617 17) Jamshidi N, Palsson BØ Investigating the metabolic capabilities of Mycobacterium 618 tuberculosis H37Rv using the in silico strain iNJ661 and proposing alternative 619 drug targets. BMC Systems Biology 2007;1: 26. PMID: 17555602 620

18 bioRxiv preprint doi: https://doi.org/10.1101/819508; this version posted October 25, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available for use under a CC0 license.

621 18) Rienksma RA, Suarez-Diez M, Spina L, Schaap PJ, Martins dos Santos VA. 622 Systems-level modeling of mycobacterial metabolism for the identification of new 623 (multi-)drug targets. Seminars in Immunology 2014;26: 610-22. PMID: 25453232. 624 625 19) Kavvas ES, Seif Y, Yurkovich JT, Norsigian C, Poudel S, Greenwald WW, 626 Ghatak S, Palsson BØ, Monk JM.Updated and standardized genome-scale 627 reconstruction of Mycobacterium tuberculosis H37Rv, iEK1011, simulates flux 628 states indicative of physiological conditions. BMC Systems Biology 2018; 12: 25. 629 PMID: 29499714 630 631 20) Bonde BK, Beste DJ, Laing E, Kierzek AM, McFadden J. Differential Producibility 632 Analysis (DPA) of Transcriptomic Data with Metabolic Networks: Deconstructing 633 the Metabolic Response of M. tuberculosis. PLOS Computational Biology 2011;7: 634 e1002060. PMID: 21738454 635 636 21) Davis GL, Ray NA, Lahiri R, Gillis TP, Krahenbuhl JL, Williams DL et al. 637 Molecular assays for determining Mycobacterium leprae viability in tissues of 638 experimentally infected mice. PLoS Neglected Tropical Disease 2013;7: e2404. 639 PMID: 24179562 640 641 22) Aronesty E. Comparison of sequencing utility programs. The Open Bioinformatics 642 Journal 2013;7: 1-8. 643 644 23) Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina 645 sequence data. Bioinformatics 2014;30: 2114-20. PMID: 24695404 646 647 24) Magoc T, Wood D, Salzberg SL. EDGE-pro: Estimated degree of gene 648 expression in prokaryotic genomes. Evolutionary Bioinformatics 2013;9: 127-36. 649 PMID: 23531787 650 25) Meyer LR, Zweig AS, Hinrichs AS, Karolchik D, Kuhn RM, Wong M, et al. The 651 UCSC genome browser database: extensions and updates. Nucleic Acids 652 Research 2013;41: D64-9. PMID: 23155063 653 26) Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing 654 genomic features. Bioinformatics 2010;26: 841-2. PMID: 20110278 655 27) Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N et al. The 656 Sequence alignment/map (SAM) format and SAMtools. Bioinformatics 2009;25: 657 2078-9. PMID: 19505943 658 28) Anders S, Huber W. Differential expression analysis for sequence count data. 659 Genome Biology 2010;11: R106. PMID: 20979621 660 661 29) Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Sayers EW. GenBank. 662 Nucleic Acids Research 2011;39: D32-7. PMID: 23193287 663 664 30) Tatusov RL, Koonin EV, Lipman DJ. A genomic perspective on protein families. 665 Science 1997;278: 631-7. PMID: 9381173 666 667 31) Karp PD, Paley S, Romero P. The pathway tools software. Bioinformatics 668 2002;18: S225-S232. PMID: 12169551 669

19 bioRxiv preprint doi: https://doi.org/10.1101/819508; this version posted October 25, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available for use under a CC0 license.

670 32) Sassetti CM, Boyd DH, Rubin EJ. Genes required for mycobacterial growth 671 defined by high density mutagenesis. Molecular Microbiology 2003;48: 77-84. 672 PMID: 12657046 673 33) Gevorgyan A, Bushell ME, Avignone-Rossa C, Kierzek AM. SurreyFBA: a 674 command line tool and graphics user interface for constraint-based modeling of 675 genome-scale metabolic reaction networks. Bioinformatics 2011;27: 433-434. 676 PMID: 21148545 677 678 34) Bhamidi S, Scherman MS, Jones V, Crick DC, Belisle JT, Brennan PJ, McNeil 679 MR. Detailed structural and quantitative analysis reveals the spatial organization 680 of the cell walls of in vivo grown Mycobacterium leprae and in vitro grown 681 Mycobacterium tuberculosis. The Journal of Biological Chemistry 2011;286: 23168 682 –23177. PMID: 21555513 683 684 35) Matsunaga I, Bhatt A, Young DC, Cheng TY, Eyles SJ, Besra GS et al. 685 Mycobacterium tuberculosis pks12 Produces a novel polyketide presented by 686 CD1c to T Cells. The journal of experimental medicine 2004; 200: 1559–1569. 687 PMID: 15611286 688 689 36) Dawes SS, Mizrahi V. DNA metabolism in Mycobacterium leprae. Leprosy Review 690 2001;72: 408-14. PMID: 11826477 691 692 693 37) Marques MA, Berrêdo-Pinho M, Rosa TL, Pujari V, Lemes RM, Lery LM et al. 694 The essential role of cholesterol metabolism in the intracellular survival of 695 Mycobacterium leprae is not coupled to central carbon metabolism and energy 696 production. Journal of bacteriology 2015;197: 3698-707. PMID: 26391209 697 698 38) Wheeler PR. Catabolic pathways for glucose, glycerol and 6-phosphogluconate 699 in Mycobacterium leprae grown in armadillo tissues. Journal of General 700 Microbiology 1983;129: 1481-95. PMID: 6352857 701 702 39) Franzblau SG, Harris EB. Biophysical Optima for metabolism of Mycobacterium 703 leprae. Journal of Clinical Microbiology 1988;26: 1124-1129. PMID: 3290244 704 705 40) Kaur D, Brennan PJ, Crick DC. Decaprenyl diphosphate synthesis in 706 Mycobacterium tuberculosis. Journal of Bacteriology 2004;186: 7564-7570. PMID: 707 15516568 708 709 41) Doan TTN, Kim JK, Tran HT, Cha SS, Chung KM, Huynh KH et al. Crystal 710 structures of d-alanine-d-alanine from Xanthomonas oryzae pv. oryzae 711 alone and in complex with nucleotides. Archives of Biochemistry and Biophysics 712 2014;545: 92-99. PMID: 24440607 713 714 715 42) Norman E1, De Smet KA, Stoker NG, Ratledge C, Wheeler PR, Dale JW. Lipid 716 synthesis in mycobacteria: characterization of the biotin carboxyl carrier protein 717 genes from Mycobacterium leprae and M. tuberculosis. Journal of Bacteriology 718 1994;176: 2525-31. PMID: 7909542 719

20 bioRxiv preprint doi: https://doi.org/10.1101/819508; this version posted October 25, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available for use under a CC0 license.

720 43) Pandey AK, Sassetti CM. Mycobacterial persistence requires the utilization of 721 host cholesterol. Proceedings of the National Academy of Sciences of the United 722 States of America 2008;105: 4376-80. PMID: 18334639 723 724 44) Beste DJ, Nöh K, Niedenführ S, Mendum TA, Hawkins ND, Ward JL, Beale 725 MH, Wiechert W, McFadden J. 13C-Flux spectral analysis of host-pathogen 726 metabolism reveals a mixed diet for intracellular Mycobacterium tuberculosis. 727 Chemistry and Biology 2013;20: 1012–1021. PMID: 23911587 728 729 45) Cruz D, Watson AD, Miller CS, Montoya D, Ochoa MT, Sieling PA et al. Host- 730 derived oxidized phospholipids and HDL regulate innate immunity in human 731 leprosy. The Journal of Clinical Investigation 2008;118: 2917-28. PMID: 18636118 732 733 46) Mattos KA, Oliveira VG, D'Avila H, Rodrigues LS, Pinheiro RO, Sarno EN et 734 al. TLR6-driven lipid droplets in Mycobacterium leprae-infected Schwann cells: 735 immunoinflammatory platforms associated with bacterial persistence. Journal of 736 immunology 2011; 187: 2548-58. PMID: 21813774 737 738 47) Elamin AA, Stehr M, Singh M. Lipid droplets and Mycobacterium leprae infection. 739 Journal of Pathogens 2012;361374. PMID: 23209912 740 741 48) Mattos KA, D'Avila H, Rodrigues LS, Oliveira VG, Sarno EN, Atella GC et al 742 Lipid droplet formation in leprosy: Toll-like receptor-regulated organelles involved 743 in eicosanoid formation and Mycobacterium leprae pathogenesis. Journal of 744 Leukocyte Biology 2010;87: 371-84. PMID: 19952355 745 746 49) Medeiros RC, Girardi KD, Cardoso FK, Mietto BS, Pinto TG, Gomez LS et al. 747 Subversion of Schwann cell glucose metabolism by Mycobacterium leprae. The 748 journal of Biological Chemistry 2016; 291: 21375-21387. PMID: 27555322 749 750 751 Main Figure legends: 752 Figure 1: Influence of opening transport gates for essential in silico medium 753 supplements on predicted biomass production for the M. leprae GSMN 754 model. Media component uptake was limited to maximal value of 1 mmol g- 755 1dryweight h-1. The biomass flux rate for GSMN-ML was calculated with respect 756 to the media components 757 758 Figure 2: In silico prediction of biomass production for various substrates 759 of M. leprae compared to the M. tuberculosis GSMN models. Carbon 760 substrate utilization profiles of M. leprae GSMN-ML vs. M. tuberculosis GSMN- 761 TB_2. Biomass production is shown for both the models on (A) various carbon 762 substrates and (B) amino acids as carbon sources (C) various nitrogen sources. 763 Biomass production flux was calculated on the various substrates using FBA.

21 bioRxiv preprint doi: https://doi.org/10.1101/819508; this version posted October 25, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available for use under a CC0 license.

764 Abbreviations for amino acids are alanine (ALA), glycine (GLY), arginine (ARG), 765 asparagine (ASPN) aspartic acid (ASP), cysterine (CYS), glutamic acid (GLU), 766 glutamine (GLUN), histidine (HIS), isoleucine (ILE), leucine (LEU), lysine (LYS), 767 methionine (MET), phenylalanine (PHE), proline (PRO), serine (SER), threonine 768 (THR), tryptophan (TRP), tyrosine (TYR) and valine (VAL). 769 770 Figure 3: DPA analysis of gene expression patterns to identify metabolic 771 processes associated with up-regulated (A) and down-regulated M. leprae 772 genes (B) in vivo. Venn diagram compares metabolites identified as associated 773 with up-regulated (C & D) or down-regulated (E and F) genes using either 774 glucose (C & E) or glycerol (D & F) as carbon source in DPA. 775 776 Figure 4: Central metabolic network of GSMN-ML. Reactions and their 777 respective annotated genes (highlighted in blue) are shown for Glycolysis, 778 Gluconeogenesis, PPP, TCA cycle, Glyoxylate shunt and anaplerotic CO2 779 fixation. Abbreviations for metabolites are: G6P (glucose-6-phosphate), F6P 780 (fructose-6-phosphate), FDP (fructose-1,6-bisphosphate), G3P (D-glyceraldehyde- 781 3-phosphate), DHAP (dihydroxyacetone-phosphate), 3PG (3-phosphoglycerate), 782 2PG (2-phosphoglycerate), PEP (phosphoenolpyruvate), PYR (pyruvate), ACCOA 783 (acetyl-CoA), OA (oxaloacetate), MAL (malate), FUM (fumarate), SUCC 784 (succinate), SUCCOA (succinyl-CoA), SUCCSAL (succinate_semialdehyde), AKG 785 (α-ketoglutarate), ICIT (isocitrate), CIT (citrate), GLX (glyoxylate), D6PGL (D- 786 gluconolactone-6-phosphate), D6PGC (6-phospho-D-gluconate), R5P (D-ribose-5- 787 phosphate), RL5P (D-ribulose-5-phosphate), X5P (D-xylulose-5-phosphate), E4P 788 (D-erythrose-4-phosphate) and S7P (D-sedoheptulose-7-phosphate). 789 790 Figure 5: Gene essentiality analysis of M. leprae and M. tuberculosis 791 utilizing glucose and glycerol as carbon sources. Heat map showing the 792 gene essentiality (GS) predictions of central carbon metabolic genes during 793 growth of M. leprae on glucose and glycerol. Simulations were performed in 794 SurreyFBA platform and a gene was considered essential if the GS ratio < 795 0.01. Essentiality is indicated by single colour gradient with the maximum GS ratio of 796 1 (green) and minimum GS ratio (red). 797 22 bioRxiv preprint doi: https://doi.org/10.1101/819508; this version posted October 25, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available for use under a CC0 license.

798 Main Table: 799 Table 1: Essential in silico medium supplements and their transport 800 gates in GSMN-ML. 801 802 Supplementary information 803 Supplementary File S1. Metabolic network of GSMN-TB_2. The network file 804 includes metabolite lists, pathways and reactions for M. tuberculosis and the 805 problem file (p-file) including the list of medium supplements and the upper and 806 lower limits of each medium component uptake used for nutrient utilization 807 analysis. The network was constructed using SurreyFBA software. 808 809 Supplementary File S2. In silico medium supplement composition. This 810 composition was used to test and incorporate pseudogenes and orphan reactions 811 that facilitated the growth of GSMN-ML. The p-file includes the list of medium 812 supplements and the upper and lower limits of each medium component uptake 813 used for nutrient utilization analysis. 814 815 Supplementary File S3. Metabolic network of GSMN-ML. The network file 816 includes metabolite lists, pathways and reactions for M. leprae and the problem 817 file (p-file) including the list of medium supplements and the upper and lower 818 limits of each medium component uptake used for nutrient utilization analysis. 819 The network was constructed using SurreyFBA software. 820 821 Supplementary File S4. P-files used in SurreyFBA to generate growth and 822 biomass production with GSMN-ML and GSMN-TB_2 model. The p-files 823 include the list of nutrients and their uptake rates- upper and lower limits. 824 825 Supplementary File S5. RNA-seq data for M. leprae isolated from mouse foot 826 pads. The file includes expression data and fold changes in expression of M. 827 leprae freshly harvested vs. 48h incubation in axenic medium. 828 829 Supplementary File S6. List of up-regulated and down-regulated metabolites 830 identified using DPA analysis. Metabolites are listed for GSMN-ML and RNA-

23 bioRxiv preprint doi: https://doi.org/10.1101/819508; this version posted October 25, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available for use under a CC0 license.

831 seq analysis using glucose and glycerol as sole carbon sources. The p-files 832 include the list of nutrients and their uptake rates- upper and lower limits. 833 834 Supplementary File S7. Flux balance analysis of GSMN-ML on glucose and 835 glycerol as carbon sources and gene essentiality predictions. File shows 836 flux variability analysis and gene essentiality analysis for the model. Two media 837 p-files was set for this test. One is to use glycerol as only carbon source and 838 the other is to use glycerol as only carbon source and the other necessary 839 molecules. Table 'FVA' shows results for all reactions with maximised biomass. 840 Table 'media' shows media settings, those substrates whose lower-bound < 0 841 are allowed to be uptaken. Table 'metabolite_name' shows metabolite IDs and 842 their names. Table 'KO_gene' shows gene essentiality analysis results on the 843 media set in 'Media'. GRateKO: growth rate when the gene knocked out. Ratio: 844 calculated as GRateKO/GRateWT(wildType). Genes are considered as essential 845 If Ratio < 0.01 (marked red). 846 847 Supplementary File S8. Flux balance analysis of GSMN-TB_2 on glucose 848 and glycerol as carbon sources and gene essentiality predictions. File 849 shows flux variability analysis and gene essentiality analysis for the model. Two 850 media p-files was set for this test. One is to use glycerol as only carbon 851 source and the other is to use glycerol as only carbon source and the other 852 necessary molecules. Table 'FVA' shows results for all reactions with maximised 853 biomass. Table 'media' shows media settings, those substrates whose lower- 854 bound < 0 are allowed to be uptaken. Table 'metabolite_name' shows metabolite 855 IDs and their names. Table 'KO_gene' shows gene essentiality analysis results 856 on the media set in 'Media'. GRateKO: growth rate when the gene knocked 857 out. Ratio: calculated as GRateKO/GRateWT(wildType). Genes are considered 858 as essential If Ratio < 0.01 (marked red). 859 860 861 862 863 864

24 bioRxiv preprint doi: https://doi.org/10.1101/819508; this version posted October 25, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available for use under a CC0 license.

865 Table 1: Metabolite Transport gate L, L-diaminopimelate R827 Lysine R834 Methionine R835 Formate R806 Tetrahydrofolate R936 Pantothenate R939 COB-I R937 Protoporphyrinogen R938 866 867 Medium supplements for growth of GSMN-ML. Transport gates for the metabolites shown 868 are included in supplementary File S3. 869

25 bioRxiv preprint doi: https://doi.org/10.1101/819508; this version posted October 25, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available for use under a CC0 license. bioRxiv preprint doi: https://doi.org/10.1101/819508; this version posted October 25, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available for use under a CC0 license. bioRxiv preprint doi: https://doi.org/10.1101/819508; this version posted October 25, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available for use under a CC0 license. bioRxiv preprint doi: https://doi.org/10.1101/819508; this version posted October 25, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available for use under a CC0 license. bioRxiv preprint doi: https://doi.org/10.1101/819508; this version posted October 25, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available for use under a CC0 license.