Models and Algorithms for Metabolic Networks: Elementary Modes and Precursor Sets
Vicente Acuña
To cite this version:
Vicente Acuña. Models and algorithms for metabolic networks: elementary modes and precursor sets. Algorithme et structure de données [cs.DS]. Université Claude Bernard - Lyon I, 2010. Français. tel-00850705

HAL Id: tel-00850705
https://tel.archives-ouvertes.fr/tel-00850705
Submitted on 8 Aug 2013

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

Order no.: 218-2008
Year 2009

Thesis presented to Université Claude Bernard - Lyon 1 for the degree of Doctorate (decree of 7 August 2006), publicly defended on 4 June 2010 by

Vicente Acuña

Models and algorithms for metabolic networks: elementary modes and precursor sets

Thesis supervisor: Marie-France Sagot
Thesis co-supervisor: Christian Gautier

Jury:
Pierluigi Crescenzi, Reviewer
Khaled Elbassioni, Reviewer
Christian Gautier, Supervisor
Dominique Perrin, Examiner
Marie-France Sagot, Supervisor
Alain Viari, Chair

UNIVERSITÉ CLAUDE BERNARD - LYON 1

President of the University: Prof. L. COLLET
Vice-President of the Scientific Council: Prof. J. F. MORNEX
Vice-President of the Board of Directors: Prof. G. ANNAT
Vice-President of the Council for Studies and University Life: Prof. D. SIMON
Secretary General: Mr. G. GAY

HEALTH COMPONENTS

UFR de Médecine Lyon-Est – Claude Bernard, Director: Prof. J. ETIENNE
UFR de Médecine Lyon-Sud – Charles Mérieux, Director: Prof. F-N. GILLY
UFR d’Odontologie, Director: Prof. D. BOURGEOIS
Institut des Sciences Pharmaceutiques et Biologiques, Director: Prof. F. LOCHER
Institut des Sciences et Techniques de Réadaptation, Director: Prof. Y. MATILLON
Département de Formation et Centre de Recherche en Biologie Humaine, Director: Prof. P. FARGE

SCIENCE AND TECHNOLOGY COMPONENTS

Faculté des Sciences et Technologies, Director: Prof. F. GIERES
UFR Sciences et Techniques des Activités Physiques et Sportives, Director: Mr. C. Collignon
Observatoire de Lyon, Director: Mr. B. Guiderdoni
Institut des Sciences et des Techniques de l’Ingénieur de Lyon, Director: Prof. J. LIETO
Institut Universitaire de Technologie A, Director: Prof. C. COULET
Institut Universitaire de Technologie B, Director: Prof. R. LAMARTINE
Institut de Science Financière et d’Assurances, Director: Prof. J-C. AUGROS

Abstract

In this PhD thesis, we present some algorithms and complexity results for two general problems that arise in the analysis of a metabolic network: the search for elementary modes of a network and the search for minimal precursor sets.

Elementary modes are a common tool in the study of the cellular characteristics of a metabolic network. An elementary mode can be seen as a minimal set of reactions that can work in steady state independently of the rest of the network. It has therefore served as a mathematical model for the possible metabolic pathways of a cell. Computing elementary modes is not trivial. We show that some problems, like checking consistency of a network, finding one elementary mode or checking that a set of reactions constitutes a cut, are easy problems, giving polynomial algorithms based on linear programming (LP) formulations.
We also prove the hardness of central problems like finding a minimum size elementary mode, finding an elementary mode containing two given reactions, counting the number of elementary modes or finding a minimum reaction cut. On the enumeration problem, we show that enumerating all elementary modes containing one given reaction cannot be done in polynomial total time unless P=NP. This result gives an indication of the complexity of enumerating all the elementary modes.

The search for precursor sets is motivated by discovering which external metabolites are sufficient to allow the production of a given set of target metabolites. In contrast with previous proposals, we present a new approach which is the first to formally consider the use of cycles in the production of the target. We present a polynomial algorithm to decide whether a set is a precursor set of a given target. We also show that, given a target set, finding a minimal precursor set is easy but finding a precursor set of minimum size is NP-hard. We further show that finding a solution with minimum size internal supply is NP-hard. We give a simple characterisation of precursor sets by the existence of hyperpaths between the solutions and the target. If we consider the enumeration of all the minimal precursor sets of a given target, we find that this problem cannot be solved in polynomial total time unless P=NP. Despite this result, we present two algorithms that have good performance for medium-size networks.

Contents

Introduction
1 Some Basic Mathematical Definitions
  1.1 Graphs, digraphs and hypergraphs
    1.1.1 Graph and digraph definitions
    1.1.2 Graphical representation and labels
    1.1.3 Walks, paths, cycles and Hamiltonian cycles
    1.1.4 Adjacency and incidence matrix
    1.1.5 Induced subgraph, bipartite graphs and trees
    1.1.6 Directed hypergraphs
  1.2 Hitting set
  1.3 Boolean functions
    1.3.1 Monotone Boolean functions
2 Basic Concepts of Time Complexity Analysis
  2.1 Defining a problem
    2.1.1 Decision problems
    2.1.2 Optimisation problems
    2.1.3 Enumeration and counting problems
  2.2 Analysis of algorithms
    2.2.1 Input size
    2.2.2 Worst case analysis
    2.2.3 Asymptotic analysis
  2.3 Complexity classes of decision problems
    2.3.1 The class P
    2.3.2 The class NP
    2.3.3 Reducibility among problems
    2.3.4 NP-complete problems
  2.4 Complexity classes of optimisation problems
    2.4.1 The classes PO and NPO
    2.4.2 NP-hard optimisation problems
    2.4.3 Approximation algorithms
    2.4.4 The classes APX and APX-hard
  2.5 Complexity of counting solutions
    2.5.1 #P and #P-complete
  2.6 Complexity of enumerating all the solutions
    2.6.1 Time delay, incremental time and total time
3 Metabolic Networks
  3.1 Entities involved in metabolism
    3.1.1 Biochemical reactions and metabolites
    3.1.2 Enzymes and genes
    3.1.3 Metabolism regulation
    3.1.4 Reconstructing a metabolic network
  3.2 Modelling metabolic networks
    3.2.1 Graph and hypergraphs models
    3.2.2 Including stoichiometry
    3.2.3 Assuming steady state
4 Complexity of Computing Elementary Modes
  4.1 Modelling metabolic network in steady state
    4.1.1 Definitions
    4.1.2 Relation between elementary modes and extreme rays
    4.1.3 Reversibility of reactions
  4.2 Checking consistency of the stoichiometric matrix
  4.3 Finding elementary modes
    4.3.1 Finding an elementary mode
    4.3.2 Finding elementary modes with support containing a given set of reactions
    4.3.3 Finding the shortest elementary modes
  4.4 Counting elementary modes
  4.5 Enumerating elementary modes
    4.5.1 Enumerating elementary modes with a given reaction in its support
    4.5.2 Analysis of the complexity result
    4.5.3 Case when all reactions are reversible
  4.6 Reaction cuts
    4.6.1 Finding minimal and minimum reaction cuts
  4.7 Proof of Theorem 4.11 and Theorem 4.15
5 Modelling Precursor Sets in Metabolic Networks
  5.1 Definitions and Characterisations
    5.1.1 Modelling a metabolic network
    5.1.2 Forward propagation
    5.1.3 Definition of precursor sets considering cycles
    5.1.4 Alternative characterisation of precursor set
    5.1.5 Maximal target
    5.1.6 Hyperpaths from sources to the target
    5.1.7 Precursor cut set
  5.2 Complexity results
    5.2.1 Deciding if a set of sources is a precursor set
    5.2.2 Finding a minimal and a minimum precursor sets
    5.2.3 Enumerating all minimal precursor sets
6 Algorithms to Enumerate All Minimal Precursor Sets
  6.1 Preprocessing the network
  6.2 The replacement tree