Metabolic Route Computation in Organism Communities
Markus Krummenacker*, Mario Latendresse and Peter D Karp

Krummenacker et al. Microbiome (2019) 7:89
https://doi.org/10.1186/s40168-019-0706-6

SOFTWARE Open Access

Abstract

Background: Microbiomes are complex aggregates of organisms, each of which has its own extensive metabolic network. A variety of metabolites are exchanged between the microbes. The challenge we address is understanding the overall metabolic capabilities of a microbiome: through what series of metabolic transformations can a microbiome convert a starting compound to an ending compound?

Results: We developed an efficient software tool to search for metabolic routes that include metabolic reactions from multiple organisms. The metabolic network for each organism is obtained from BioCyc, where the network was inferred from the annotated genome. The tool searches for optimal metabolic routes that minimize the number of reactions in each route, maximize the number of atoms conserved between the starting and ending compounds, and minimize the number of organism switches. The tool pre-computes the reaction sets found in each organism from BioCyc to facilitate fast computation of the reactions defined in a researcher-specified organism set. The generated routes are depicted graphically, and for each reaction in a route, the tool lists the organisms that can catalyze that reaction. We present solutions for three route-finding problems in the human gut microbiome: (1) production of indoxyl sulfate, (2) production of trimethylamine N-oxide (TMAO), and (3) synthesis and degradation of autoinducers. The optimal routes computed by our multi-organism route-search (MORS) tool for indoxyl sulfate and TMAO were the same as routes reported in the literature.

Conclusions: Our tool quickly found plausible routes for the discussed multi-organism route-finding problems.
The routes shed light on how diverse organisms cooperate to perform multi-step metabolic transformations. Our tool enables scientists to consider multiple alternative routes and identifies the organisms responsible for each reaction.

Keywords: Route search, Metabolic network, Microbiome, BioCyc, Pathway tools

*Correspondence: [email protected]
SRI International, 333 Ravenswood Ave., Menlo Park, CA 94025, USA

© The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Background

Microbiomes harbor a multitude of different microbes that are living together in close contact. These microbes are interacting with each other synergistically, competitively, and antagonistically, by various mechanisms. One key interaction is the exchange of metabolites. To understand the functional capabilities and dynamics of a microbiome, knowing how exchanged metabolites hold together the microbiome’s overall metabolic network is necessary.

For example, indoxyl sulfate is derived from the breakdown of L-tryptophan by colon microbes, involving also the human host. It is an extensively studied uremic solute in humans, which has been implicated in toxicity among patients with kidney disease [1]. Clearance by hemodialysis appears to be limited, because indoxyl sulfate is mostly protein bound and shows limited diffusion across hemodialysis membranes. One problem is identifying the microbes in the human gut that participate in the synthesis of this toxic metabolite, and via which reactions and enzymes. Such results could potentially lead to clinical interventions. The sheer number of microbes and reactions potentially involved presents challenges for identifying relevant targets.

We have developed a software tool called Multi Organism Route Search (MORS) to propose plausible biosynthetic routes between researcher-supplied starting and ending (goal) metabolites, where such routes can span multiple microbes and other organisms, including the human host. MORS finds linear reaction sequences that convert the specified start metabolite to the specified final metabolite. Such a series of consecutive biosynthetic reactions is called a route. A key use case for MORS is exploratory searching for implicated organisms and reactions, when a goal metabolite is given, which may have been found in a metabolomics experiment.

Implementation

During more than two decades, we have developed the BioCyc website [2] and its underlying software called Pathway Tools [3]. In the latest release 22.6, BioCyc publishes 14,560 metabolic organism databases, mostly bacterial. Additionally, 9 pan-genome databases are provided. Most databases were computationally generated; approximately two dozen received varying levels of expert human curation.

We introduced a single-organism Metabolic Route Search tool (called RouteSearch) in 2014 [4]. RouteSearch is designed to find optimal metabolic routes, according to a set of criteria, which a researcher interactively specifies and explores. Generally, longer routes are considered less optimal. Another optimality criterion is to retain as many atoms as possible from the start metabolite, such that they still are present in the goal metabolite. Inferring the retained atoms is enabled by pre-computed atom-mappings [5] between the metabolites of the reactions that are obtained from MetaCyc.

MORS extends the single-organism RouteSearch to enable route searches that utilize reactions from an arbitrary number of organism databases in the BioCyc collection. Thus, one new feature is to enable the user to select a set of organisms to consider in a particular search. An organism set can be selected in several ways: by searching for individual organism by name, by browsing alphabetical lists of organism names, by selecting organisms from the NCBI taxonomy [6], and by selecting organisms based on metadata recorded by the genome-sequencing project, which can include the Human Microbiome Project [7] (HMP) defined body site, in which the organism is found. Another new feature is the minimization of organism switches needed for completing a route, as described below.

To make MORS practical, finding an efficient way to perform route searches across an arbitrary subset of the 14,560 organism databases in BioCyc was essential. Our solution exploits the special role that our MetaCyc [2] database plays. MetaCyc is our master database that aims to cover the universe of chemical reactions [...] BioCyc organism databases by the PathoLogic algorithm [8]. Therefore, the vast majority of reactions within BioCyc databases are also present in MetaCyc, and the same unique identifier assigned to a reaction R in MetaCyc is assigned to R in every other BioCyc database in which it occurs.

The exceptions are transport reactions that are inferred by our Transport Inference Parser [9] based on gene annotations, which can create novel reactions not in MetaCyc. Additionally, some manually curated organism databases contain new reactions that are not present in MetaCyc. To accommodate these differences, we constructed a new database called MetaRoute, into which we first copied all the metabolic reactions from MetaCyc. Then, we imported into MetaRoute the extra reactions that were present in other organism databases but not in MetaCyc. Thus, MetaRoute spans all metabolic reactions found in BioCyc. However, MetaRoute lacks transport reactions and MORS does not currently use transport reactions, because most genome annotations fail to annotate the substrates of significant numbers of transporters. If transport were required by MORS whenever adjacent reactions were catalyzed by different organisms, then generation of many valid routes would be prevented. Furthermore, MORS operates in a compartment-agnostic manner, meaning reactions are not segregated into separate compartments. Unsegregated metabolites have been used before in multi-organism investigations, e.g., [10, 11].

To speed the execution of MORS, we pre-compute a 2D binary array whose rows are reaction IDs in MetaRoute and whose columns are BioCyc organism IDs. At the intersection of a particular reaction ID and a particular organism ID, the bit is either set if this reaction is in that organism or unset otherwise. This array enables quickly determining all the reactions that a given organism in BioCyc contains, without even having to open and load that particular database. Constructing this array requires opening all BioCyc organism databases and takes several hours of processing time. A separate, pre-computed database contains genomic metadata for each BioCyc organism, so we can rapidly obtain the list of organisms for a specific HMP body site, for example. Combined, these pre-computed data enable efficiently finding the union of all reaction IDs expected for the set of organisms the researcher has selected.

MORS provides an additional property that is minimized during route searches, in addition to the RouteSearch costs of lost atoms and of the number of reactions in the route. MORS also minimizes switching of organisms in a route. A switching
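
The pre-computed reaction-by-organism bit array described above can be illustrated with a minimal Python sketch. It is not the MORS implementation, whose code is not shown in the text. The function names, the choice of one Python integer per reaction row as a bit mask, and all reaction and organism IDs are hypothetical and serve only to show how such an array might yield the union of reaction IDs for a selected organism set, and the organisms that contain a given reaction.

```python
from typing import Dict, Iterable, List, Set


def build_bit_rows(reaction_ids: List[str],
                   organism_ids: List[str],
                   organism_reactions: Dict[str, Set[str]]) -> Dict[str, int]:
    """Return one integer bit mask per reaction (one row of the array).

    Bit j of a row is set when organism_ids[j] contains that reaction.
    """
    rows = {rxn: 0 for rxn in reaction_ids}
    for col, org in enumerate(organism_ids):
        for rxn in organism_reactions.get(org, ()):
            if rxn in rows:  # skip reactions absent from the master list
                rows[rxn] |= 1 << col
    return rows


def reactions_for_organisms(rows: Dict[str, int],
                            organism_ids: List[str],
                            selected: Iterable[str]) -> Set[str]:
    """Union of all reaction IDs present in any selected organism."""
    col_of = {org: col for col, org in enumerate(organism_ids)}
    mask = 0
    for org in selected:
        mask |= 1 << col_of[org]
    return {rxn for rxn, bits in rows.items() if bits & mask}


def organisms_for_reaction(rows: Dict[str, int],
                           organism_ids: List[str],
                           rxn: str) -> List[str]:
    """List the organisms whose bit is set in the row of one reaction."""
    bits = rows[rxn]
    return [org for col, org in enumerate(organism_ids) if (bits >> col) & 1]


# Hypothetical toy data: three organisms and four reactions.
organism_ids = ["ORG1", "ORG2", "ORG3"]
reaction_ids = ["RXN-A", "RXN-B", "RXN-C", "RXN-D"]
organism_reactions = {
    "ORG1": {"RXN-A", "RXN-B"},
    "ORG2": {"RXN-B", "RXN-C"},
    "ORG3": {"RXN-D"},
}

rows = build_bit_rows(reaction_ids, organism_ids, organism_reactions)
print(sorted(reactions_for_organisms(rows, organism_ids, ["ORG1", "ORG2"])))
print(organisms_for_reaction(rows, organism_ids, "RXN-B"))
```

In this sketch the expensive step is `build_bit_rows`, which visits every organism once, mirroring the one-time construction cost mentioned in the text. Later queries only combine bit masks and never reopen an organism database.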
