A Systems-Level Investigation of the Metabolism of Dehalococcoides mccartyi and the Associated Microbial Community

by

Mohammad Ahsanul Islam

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy

Department of Chemical Engineering and Applied Chemistry University of Toronto

© Copyright by Mohammad Ahsanul Islam 2014


Abstract

Dehalococcoides mccartyi are a group of strictly anaerobic bacteria important for the detoxification of man-made chloro-organic solvents, most of which are ubiquitous, persistent, and often carcinogenic groundwater pollutants. These bacteria exclusively conserve energy for growth from a pollutant detoxification reaction through a novel metabolic process termed organohalide respiration. However, this energy harnessing process is not well elucidated at the level of D. mccartyi metabolism. Also, the underlying reasons behind their robust and rapid growth in mixed consortia as compared to their slow and inefficient growth in pure isolates are unknown. To obtain better insight into D. mccartyi physiology and metabolism, a detailed pan-genome-scale constraint-based mathematical model of metabolism was developed. The model highlighted the energy-starved nature of these bacteria, which is probably linked to their slow growth in isolates. The model also provided a useful framework for subsequent analysis and visualization of high-throughput transcriptomic data of D. mccartyi. Apart from confirming expression of the majority of genes of these bacteria, this analysis helped review the annotations of metabolic genes. Revised annotations of two such metabolic genes (NADP+-isocitrate dehydrogenase and phosphomannose isomerase) were then experimentally verified. Finally, growth experiments were performed with a D. mccartyi-containing anaerobic mixed enrichment culture to explore the effects of exogenous vitamin omission from the growth medium on D. mccartyi and the associated microbial community. The experiments showed how nutritional requirements of these bacteria changed the composition and dynamics of their associated microbial community. Overall, a systems-level approach was used in this research to obtain a fundamental and critical understanding of the metabolism and physiology of D. mccartyi in isolates, as well as in the microbial communities they naturally inhabit. The results presented in this thesis, therefore, will help design effective strategies for future bioremediation efforts by D. mccartyi.


Acknowledgements

I would like to take this opportunity to thank many fascinating people who helped me complete this very long but exciting journey! It was, indeed, a life changing experience!

First of all, my sincere thanks go to my supervisors, Dr. Radhakrishnan (Krishna) Mahadevan and Dr. Elizabeth A. Edwards, both of whom are extremely caring mentors, knowledgeable scholars, and great personalities. It was their contagious enthusiasm and passion about any scientific matter, tireless inquisitive minds, and critical thinking that shaped me as a researcher during my stay at UofT. They provided me with all kinds of support and help over these many years, without which this thesis wouldn’t have been possible. I’m eternally grateful to both Krishna and Elizabeth for giving me the rare opportunity to work in the amazing world of microbial metabolism.

I would like to thank all past and present members of both LMSE (Laboratory for Metabolic Systems Engineering) and EdLab, including Jiao, Karthik, Bahareh, Nadeera, Nick B, Peter, Kai, Laurence, Nik, Fahime, Srinath, Pratish, Sarat, Victor, Chris, Ariel, Alison, Eve, Roya, Anna, Winnie, Max, Jennifer, Laura, Marie, Alfredo, Torsten, Jine Jine, Cheryl, Ivy, Shuiquan, Fei, Wendy, Luz, Sarah, and Olivia. These are the people with whom I shared some of the most treasured moments of my life. You are truly the most talented, enthusiastic, and beloved coworkers that I have ever worked with.

I also want to express my humble gratitude to my committee members, Dr. Nicholas J. Provart and Dr. Emma R. Master. Their guidance, support, critical comments, and careful directions were some of the most influential factors that worked behind materializing this thesis. I feel fortunate to have these powerful scientific minds and fantastic personalities as my committee members. Likewise, I express my gratitude to Dr. David S. Guttman and Dr. Boris Steipe for teaching me bioinformatics, without which I wouldn’t be able to even start my research.

Thanks to Dr. Alexander F. Yakunin and Dr. Alexei Savchenko for sharing your expertise in enzymology and proteomics with someone like me who is so naïve in those areas. Thanks also to Dr. Melanie Duhamel, the BioZone Assistant Director and Project Manager, for being a friend and “oracle” of KB-1 related challenges, as well as of monetary issues. Also, thank you to Endang Susilawati (Susie) and Angelika Duffy, the BioZone Lab Manager and Assistant Lab Manager. Susie’s motherly touch made my office-stay feel like home-stay, and Angelika’s warm support helped me steer through difficult times more easily. I am also very grateful to Anatoli Tchigvintsev and Greg Brown, the two knowledgeable technicians and biochemists, without whom my enzyme work would only be a dream!

Thank you to Leticia Gutierrez and Gorette Silva for being so supportive in solving administrative matters, and for always greeting me with cordial smiles. Thanks to Pauline Martini and Joan Chen at the Chemical Engineering Graduate office for making the “bureaucratic part” of my graduate life easy, and for providing essential information on all scholarship and funding related matters. Thanks so much to Julie Mendonca for keeping my head straight about payroll and tax related issues. Thanks also to Daniel Tomchyshyn and Weijun Gao, the Departmental and BioZone computer geeks, for providing IT support and solving all my IT problems.

I am truly indebted to my parents, Shamsun Nahar and Shamsul Alam, for bringing me into this spectacular world of microbes. It was their lifelong teaching, their simple yet powerful philosophies about life in general, and their intrinsic novel qualities that shaped me as a human being. Without their love, devotion, encouragement, and continuous prayers, I couldn’t have imagined coming this far. I am also extremely grateful to my parents-in-law, Shamima Ali and Mazed Ali, for their love and prayers, and most importantly, for giving me their daughter, Rutba, my wife and my eternal love. Rutba is an inspirational teacher, guide, and philosopher without whom I couldn’t even have imagined embarking on this PhD journey. So, equal credit goes to Rutba for the successful completion of this journey, and I dedicate this thesis to her, my best friend with whom I’m looking forward to spending many more fun-filled years. The acknowledgements would remain incomplete if I didn’t mention the most precious gift of my life, Manar, my daughter. Her smile and the warmth of her calling “Baba” refresh me every single day.

Finally, I want to thank my funding agencies for their generous support during my graduate study: Government of Ontario, Genome Canada, Ontario Genomics Institute, Natural Sciences and Engineering Research Council of Canada, US Department of Defense Strategic Environmental Research and Development Program, and the University of Toronto.

Table of Contents

Abstract
Acknowledgements
Table of Contents
List of Tables
List of Figures
List of Appendices
List of Non-Standard Abbreviations Used
Chapter 1: Introduction
1.1. Motivation
1.2. Research objectives
1.3. Thesis outline
1.4. Statement of authorship and publication status
Chapter 2: General overview
2.1. Systems biology
2.2. Modeling microbial metabolism
2.3. Constraint-based reconstruction and analysis (COBRA) approach
2.4. Metabolic network reconstruction procedures
2.5. Determination of biomass composition and maintenance energy
2.6. Model validation and refinement
2.7. Flux balance analysis
2.8. Energy conservation in microbes
2.9. Chlorinated xenobiotics and
2.10. Dehalococcoides bacteria
2.11. The KB-1 microbial community
Chapter 3: Characterizing the metabolism of Dehalococcoides with a constraint-based model
3.1. Abstract
3.2. Introduction
3.3. Materials and methods
3.3.1. Dehalococcoides pan-genome
3.3.2. Reconstructing the metabolic network of Dehalococcoides
3.3.3. Estimation of biomass composition and maintenance energy requirements
3.3.4. In silico analysis of Dehalococcoides metabolism
3.4. Results and discussion
3.4.1. Dehalococcoides metabolic network
3.4.1.1. Pan-metabolic-genes of Dehalococcoides
3.4.1.2. Features of the Reconstructed Metabolic Network of Dehalococcoides
3.4.2. Model-based simulations of Dehalococcoides physiology
3.4.2.1. Exploring the central metabolism of Dehalococcoides
3.4.2.2. CO2-fixation by Dehalococcoides
3.4.3. Energy conservation process of Dehalococcoides
3.4.4. Implications of the incomplete cobalamin synthesis pathway in Dehalococcoides
3.4.5. Does carbon or energy limit the in silico growth of Dehalococcoides?
3.5. Conclusions
Chapter 4: New insight into Dehalococcoides mccartyi metabolism from a model-integrated systems-level analysis of D. mccartyi transcriptomes
4.1. Abstract
4.2. Introduction
4.3. Materials and methods
4.3.1. Identification of D. mccartyi genes from KB-1 shotgun microarray data
4.3.2. Dehalococcoides mccartyi strain 195 microarray data
4.3.3. Operon prediction for Dehalococcoides mccartyi genomes
4.3.4. Microarray data analysis and visualization
4.4. Results and discussion
4.4.1. Principal component analysis of strain 195 and KB-1 Dhc microarray data
4.4.2. Improved identification and confirmation of D. mccartyi genes
4.4.3. Confirmation of hypothetical proteins in strain 195 and KB-1 Dhc genomes
4.4.4. Confirmation of metabolic genes in strain 195 and KB-1 Dhc genomes
4.4.5. Clustering of microarray data and operon predictions
4.4.7. Analysis of strain 195 QT cluster 2
4.4.8. Analysis of strain 195 QT cluster 6
4.5. Conclusions
Chapter 5: Model-assisted prediction and experimental characterization of isocitrate dehydrogenase and phosphomannose isomerase from Dehalococcoides mccartyi strain KB-1
5.1. Abstract
5.2. Introduction
5.3. Materials and methods
5.3.1. Bacterial culture, reagents and chemicals
5.3.2. Gene cloning and overexpression of the selected genes in E. coli
5.3.3. Purification of the overexpressed recombinant proteins
5.3.4. Enzymatic assays for the purified recombinant proteins
5.4. Results and discussion
5.4.1. Biochemical activities and kinetic parameters of KB1_0495 (DmIDH) and KB1_0553 (DmPMI)
5.4.2. Sequence homology and phylogenetic analyses of DmIDH and DmPMI sequences
5.4.3. Structure based analysis of DmIDH and DmPMI sequences
5.5. Conclusions
Chapter 6: Role of exogenous vitamin omission on the growth and community dynamics of a Dehalococcoides mccartyi-containing anaerobic mixed microbial community
6.1. Abstract
6.2. Introduction
6.3. Materials and methods
6.3.1. Chemicals and analytical procedures
6.3.2. Preparation of exogenous vitamins and resazurin free KB-1 cultures
6.3.3. Preparation of diluted KB-1 cultures for the time-course experiment
6.3.4. DNA collection and extraction
6.3.5. Quantitative PCR (qPCR) primers and method
6.3.6. Analysis of qPCR data
6.4. Results
6.4.1. Dechlorination and growth of washed KB-1 cultures cultivated in different growth media
6.4.2. Community composition of washed KB-1 cultures cultivated in different growth media
6.4.3. Dechlorination and growth of diluted KB-1 cultures cultivated in different growth media
6.5. Discussion
6.6. Conclusions
Chapter 7: Summary, conclusions, and future work
7.1. Summary
7.2. Conclusions
7.3. Future work
References
Appendices


List of Tables

Table 3.1. General features of Dehalococcoides metabolic network (iAI549)
Table 3.2. Composition of the in silico minimal medium of Dehalococcoides
Table 3.3. Comparison of various in silico genome-scale models with iAI549
Table 4.1. Strain 195 genes identified in functionally enriched clusters and associated inferred annotations
Table 5.1. Kinetic parameters of DmIDH and DmPMI from D. mccartyi strain KB-1
Table 6.1. Composition of different mineral media used in this study
Table 6.2. Description of different KB-1 cultures used in this study
Table 6.3. Estimated TCE dechlorination and ethene production rates of different washed KB-1 cultures


List of Figures

Figure 1.1. Schematic representation of the relationship between different thesis objectives.
Figure 2.1. Steps involved in developing a genome-scale constraint-based metabolic model by COBRA approach.
Figure 2.2. Metabolic network reconstruction procedure.
Figure 2.3. Anaerobic reductive dechlorination of chlorinated ethenes to benign ethene and higher chlorinated benzenes to less toxic lower chlorinated benzenes
Figure 2.4. Schematic representation of the microbial interactions between different community members in the KB-1 community.
Figure 3.1. Composition of the Dehalococcoides pan-genome.
Figure 3.2. Distribution of dispensable and unique metabolic genes in different Dehalococcoides strains.
Figure 3.3. The reconstructed TCA-cycle and CO2 fixation pathway of Dehalococcoides.
Figure 3.4. Analysis of the citrate synthase (CS) reaction on Dehalococcoides growth.
Figure 3.5. A Tentative Scheme for D. mccartyi electron transport chain (ETC).
Figure 3.6. Reconstructed cobalamin biosynthesis pathway of Dehalococcoides.
Figure 3.7. Influence of cobalamin on the growth rate and yield of Dehalococcoides.
Figure 3.8. Effect of carbon and energy sources on the growth yield of Dehalococcoides.
Figure 4.1. Principal component analysis (PCA) of the array data for strain 195 and KB-1 Dhc samples.
Figure 4.2. Hypothetical proteins of (A) strain 195 and (B) KB-1 Dhc with proteomic and transcriptomic evidence.
Figure 4.3. Proteomic and transcriptomic evidence for the hypothetical proteins of strain 195 reannotated in the D. mccartyi metabolic model.
Figure 4.4. Proteomic and transcriptomic evidence for the hypothetical proteins of KB-1 Dhc reannotated in the D. mccartyi metabolic model.
Figure 4.5. Expression of reductive dehalogenase homologous (rdhA) genes.
Figure 4.6. Functional enrichment analysis of QT clusters for (A) strain 195 and (B) KB-1 Dhc array data.
Figure 4.7. Analysis of two functionally enriched strain 195 QT clusters.
Figure 5.1. Effects of pH and substrate concentrations on the rate of DmIDH.
Figure 5.2. Effects of pH and substrate concentrations on the rate of DmPMI.
Figure 5.3. Mannosylglycerate (MG) biosynthesis pathway in D. mccartyi.
Figure 5.4. Sequence homology network for DmIDH (KB1_0495) and DmPMI (KB1_0553).
Figure 5.5. Phylogenetic analysis of DmIDH protein sequence.
Figure 5.6. Phylogenetic analysis of DmPMI protein sequence.
Figure 5.7. Structure-based multiple sequence alignment (MSA) of DmIDH.
Figure 5.8. Structure-based multiple sequence alignment (MSA) of DmPMI
Figure 6.1. Genealogy of KB-1 cultures used in this study.
Figure 6.2. TCE dechlorination profiles of different washed KB-1 cultures.
Figure 6.3. Community composition of different washed KB-1 cultures
Figure 6.4. Community composition of different washed KB-1 cultures in terms of absolute cell numbers.
Figure 6.5. Dechlorination profiles for the time-course experiment of diluted KB-1 cultures.
Figure 6.6. TCE dechlorination rates, ethene production rates, and D. mccartyi cell numbers for diluted KB-1 cultures.


List of Appendices

Appendix A: Supplemental information for Chapter 3
Table A1. Overall Macromolecular Composition of a Dehalococcoides Cell
Table A2. Protein Composition of 1 Gram of Dehalococcoides Cell
Table A3. DNA Composition of 1 Gram of Dehalococcoides Cell
Table A4. RNA Composition of 1 Gram of Dehalococcoides Cell
Table A5. Lipid Composition of 1 Gram of Dehalococcoides Cell
Table A6. Composition of Cofactors and Other Soluble Pools of 1 Gram of Dehalococcoides Cell
Table A7. Experimental Growth Yields of Various Dehalococcoides Cultures
Table A8. Experimental Growth Rates of Various Dehalococcoides Cultures
Table A9. Experimental Decay Rates of Different Anaerobes
Table A10. Energy Cost for Processing and Polymerization of Macromolecules (GAM) of a Typical Bacterial Cell
Table A11. Standard Gibbs Free Energies for Different Dechlorination Reactions
Table A12. Theoretical ATP/e- and H+/e- Ratios of Reductive Dechlorination by Dehalococcoides
Table A13. Experimental Values of Corrinoid Content of Various Anaerobes
Table A14. Growth Rate Simulations with and without the Citrate Synthase (CS) Reaction in the TCA-cycle
Table A15. List of tables containing information for Dehalococcoides metabolic model, iAI549
Supplemental Text
Dehalococcoides Biomass Synthesis Reaction
Calculation of Dehalococcoides Cell Composition
Calculation of NGAM and GAM Parameters of iAI549
Calculation of Theoretical Maximum Energy Transfer Efficiency (ATP/e-) and Proton Translocation Stoichiometry (H+/e- ratio) of Dehalococcoides Electron Transport Chain (ETC)
Detailed procedures for developing the Dehalococcoides pan-genome and model
Figure A1. Steps involved in developing the pan-genome
Figure A2. Steps involved in developing the core-genome
Figure A3. Steps involved in developing the unique-genome
Figure A4. Steps involved in developing the dispensable-genome
Figure A5. Reconstructed Wood-Ljungdahl pathway for Dehalococcoides.
Figure A6. Distribution of metabolic genes in different subsystems of iAI549
Figure A7. Distribution of gene-associated model reactions in different subsystems of iAI549
Appendix B: Supplemental information for Chapter 4
Table B1: List of supplemental tables for chapter 4
Figure B1. Workflow for Analyzing Pre-Processed KB-1 Microarray Data.
Figure B2. Workflow for Analyzing Pre-Processed Strain 195 Microarray Data.
Figure B3. Distribution of Strain 195 Gene Expression Intensities for 27 Samples.
Figure B4. Distribution of KB-1 Dhc Gene Expression Intensities for 33 Samples.
Figure B5. Visualization of gene expression data on the Dehalococcoides mccartyi metabolic network.
Appendix C: Supplemental information for Chapter 5
Identification of KB1_0495 (DmIDH) and KB1_0553 (DmPMI)
Figure C1. Orthologous gene neighborhood analysis for DmIDH (KB1_0495) and DmPMI (KB1_0553)
Figure C2. Orthologous gene neighborhood analysis for the 3-isopropylmalate dehydrogenase (IPMDH) from D. mccartyi (cbdbA804, DET0826, DhcVS_730, DehaBAV1_0745, DehalGT_0706).
Appendix D: Supplemental information for Chapter 6
Table D1. Previously developed (Duhamel, 2005, Waller, 2010) qPCR primer-sets used in this study
Figure D1. qPCR results for Acetobacterium OTU for 1:10 dilution of KB-1 samples
Figure D2. Ratios of total bacterial and archaeal cell numbers tracked by individual OTU primers to general bacterial and archaeal primers.
Figure D3. Electron balance profiles for the time-course experiment of diluted KB-1 cultures.
Appendix E: Genome-scale constraint-based metabolic modeling of Moorella thermoacetica
Abstract
Introduction
Materials and methods
Automated generation of a draft metabolic model for
Determination of biomass compositions
Curation of the draft Moorella thermoacetica metabolic model
Figure E1. Steps involved in curating the M. thermoacetica draft metabolic model.
Results and discussion
General features of the reconstructed metabolic network of M. thermoacetica
Figure E2. Subsystems of the Moorella thermoacetica metabolic model, iAI517.
Figure E3. Reconstructed metabolic network of Moorella thermoacetica.
Figure E4. Reconstructed Wood-Ljungdahl (W-L) pathway of CO2-fixation for M. thermoacetica.
Model-based simulations of M. thermoacetica metabolism
Figure E5. In silico growth profile of Moorella thermoacetica on different substrates using the metabolic model, iAI517.
Figure E6. In silico ATP generation profile of Moorella thermoacetica on different substrates using the metabolic model, iAI517.
Conclusions
Table E1. General features of Moorella thermoacetica metabolic reconstruction
Table E2. Overall cellular composition of a Moorella thermoacetica cell
Table E3. Protein composition of a Moorella thermoacetica cell
Table E4. DNA composition of a Moorella thermoacetica cell
Table E5. RNA composition of a Moorella thermoacetica cell
Table E6. Fatty acid composition of a Moorella thermoacetica cell
Table E7. Lipid composition of a Moorella thermoacetica cell
Table E8. Cell wall composition of a Moorella thermoacetica cell
Table E9. Ions and metabolites of a Moorella thermoacetica cell
Moorella thermoacetica biomass equation


List of Non-Standard Abbreviations Used

BLAST: Basic local alignment search tool
MCA: Metabolic control analysis
HCM: Hybrid cybernetic modeling
COBRA: Constraint-based reconstruction and analysis
FBA: Flux balance analysis
GAM: Growth associated maintenance parameter
NGAM: Non-growth associated maintenance parameter
SLP: Substrate level phosphorylation
ETP: Electron transport phosphorylation
DNAPL: Dense non-aqueous phase liquid
PCE: Tetrachloroethene
TCE: Trichloroethene
cDCE: cis-1,2-Dichloroethene
VC: Vinyl chloride
HCB: Hexachlorobenzene
PeCB: Pentachlorobenzene
TeCB: Tetrachlorobenzene
TCB: Trichlorobenzene
DCB: Dichlorobenzene
TCEM: Trichloroethene and methanol
cDCEM: cis-1,2-Dichloroethene and methanol
VCM: Vinyl chloride and methanol
VCH: Vinyl chloride and hydrogen
NA: Not amended
qPCR: Quantitative polymerase chain reaction
OTU: Operational taxonomic unit


Chapter 1: Introduction

1.1. Motivation

Halogenated organic compounds or organohalides, i.e., organofluorides, organochlorides, organobromides, and organoiodides, are abundant in nature. They originate primarily from three sources: geogenic, biogenic, and anthropogenic (Gribble, 1992, Gribble, 2010, Öberg, 2002, Gribble, 2012). Geogenic organohalides are generated mainly from oceanic sea-salt spray, volcanic eruptions, forest and grass fires, sediments, and soil, while biogenic organohalides are the results of biological synthesis of these compounds in bacteria, fungi, plants, marine plants, marine sponges, corals, insects, and mammals including humans (Gribble, 1992, Gribble, 2010). Thus, both geogenic and biogenic organohalides are naturally produced compounds, and the vast majority of these compounds, except a few such as dioxins produced by forest fires, are non-toxic, non-persistent, biodegradable, and rather useful compounds (Gribble, 1992, Gribble, 2010, Öberg, 2002).

In contrast, the vast majority of anthropogenic organohalides generated by chemical synthesis are toxic, persistent, and recalcitrant to biodegradation (Öberg, 2002, Leys et al., 2013). Also, it is their toxic, persistent, less reactive, less flammable, and less corrosive properties that made such synthetic organohalides, including tetrachloroethene (PCE), trichloroethene (TCE), pentachlorobenzene (PeCB), hexachlorobenzene (HCB), dioxins, and polychlorinated biphenyls (PCB), very useful and popular for extensive commercial and industrial use (McCarty, 2010, Doherty, 2000a, Doherty, 2000b). They were widely used as degreasers in cleaning metal parts, dry cleaning agents, solvents, paints, pharmaceuticals, pesticides, fungicides, and as ingredients and intermediates in chemical synthesis of many useful compounds (Doherty, 2000a, Doherty, 2000b, Doherty, 2012, Lohman, 2002). However, past uncontrolled disposal and handling practices, together with their toxicity and structural stability, have made them ubiquitous and persistent xenobiotics (ATSDR, 2013, McCarty, 2001, McCarty, 2010, Petrisor and Wells, 2008). Because they contaminate soil, sediments, and groundwater aquifers, their presence poses a significant threat to human health and the environment.


Despite being persistent and recalcitrant to biodegradation, chlorinated xenobiotics undergo both abiotic and biotic transformations (McCarty, 2001, McCarty, 2010, Leys et al., 2013). However, partial transformations of these compounds by abiotic and biotic processes produce intermediates that are even more toxic than the parent compounds (Öberg, 2002, McCarty, 2010); for instance, partial degradation of PCE and TCE generates vinyl chloride (VC), a known human carcinogen that causes a rare form of liver cancer (Vianna et al., 1981). Nevertheless, only an anaerobic biological process, termed organohalide respiration, is capable of complete biodegradation of PCE and TCE to non-toxic ethene by the reductive dechlorination reaction (Freedman and Gossett, 1989, Holliger et al., 1993). This novel and natural process is catalyzed by reductive dehalogenase enzymes, mostly those of Dehalococcoides mccartyi, a group of tiny, strictly anaerobic, organohalide-respiring bacteria (Smidt and de Vos, 2004, Tas et al., 2010, Leys et al., 2013). The fact that D. mccartyi harness energy for growth only from the reductive dechlorination reaction during organohalide respiration has made these bacteria unique and immensely important for bioremediation applications (Holliger et al., 1993, Holliger et al., 1998b, Adrian, 2009). In other words, faster growth of D. mccartyi is directly linked to more rapid bioremediation of chlorinated xenobiotics from nature. However, growth of these specialized bacteria is significantly slower in pure cultures than in mixed cultures (McCarty, 2010, Adrian et al., 1998, Adrian et al., 2000a, Adrian et al., 2000b). Also, dechlorination activities of these bacteria are more robust in consortia than in isolates (Duhamel et al., 2002, Duhamel, 2005). Thus, in order to better apply organohalide respiration for the bioremediation of chlorinated pollutants, we need to have a detailed understanding of the unusual metabolism of D. mccartyi and the microbial community to which they are intricately linked.

Recent advances in high-throughput experimental technologies such as genome sequencing have facilitated detailed and systems-level studies of microbial metabolism (Heinemann and Sauer, 2010, Pagani et al., 2012). Such studies usually involve in silico mathematical modeling of cells and simulating their integrated behavior in the context of cellular physiology (Covert et al., 2001, Di Ventura et al., 2006). One of the pivotal aspects of such systematic analysis and interpretation of data is the reconstruction of cellular networks, the collection and visualization of all physiologically relevant cellular processes (Ideker et al., 2001, Chuang et al., 2010). Unlike most cellular networks, such as regulatory, signaling, and protein-protein interaction networks, the interaction topology of metabolic networks is well established since metabolism is fairly conserved across different branches of life (Peregrin-Alvarez et al., 2009, Fani and Fondi, 2009). Most importantly, metabolic interactions can be described quantitatively with the help of cellular biochemical reactions, which essentially represent the inviolable physicochemical constraints of a cell (Reed et al., 2006a, Price et al., 2004). Hence, a metabolic network can be interrogated and analyzed by mathematical tools in order to predict the metabolic fate and phenotypic behavior of a cell (Price et al., 2004, Lewis et al., 2012, Pfau et al., 2011). These useful properties of modeling microbial metabolism have led to a very large number of research initiatives in this field (Baumann, 2011, Patil et al., 2004, Mardinoglu et al., 2013, Thiele et al., 2013, Kim et al., 2012). Systems-level analyses of microbial metabolism have given researchers an improved understanding of such critical aspects of microbial physiology as gene transcription, protein translation, enzyme kinetics, and metabolic regulation (Liu et al., 2010, Oberhardt et al., 2009). This enhanced understanding of microbial physiology, in turn, allows researchers to manipulate the metabolic capability of these microbes for achieving desired goals in medical, industrial, food, and environmental biotechnology sectors (McCloskey et al., 2013, Oberhardt et al., 2009, Patil et al., 2004, Lovley, 2003, Mahadevan et al., 2011).
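The quantitative, constraint-based description of a metabolic network mentioned above is commonly written as a linear program. The following is a minimal sketch in conventional flux balance notation; the symbols S, v, c, and the flux bounds are standard choices assumed here for illustration, not notation taken from this thesis.

```latex
% S: m x n stoichiometric matrix (m metabolites, n reactions)  [assumed notation]
% v: vector of n reaction fluxes
% c: objective weights, e.g. 1 for a biomass reaction and 0 elsewhere
\begin{aligned}
\max_{v}\quad & c^{\mathsf{T}} v \\
\text{subject to}\quad & S\,v = 0, \\
& v_{\min} \le v \le v_{\max}
\end{aligned}
```

The equality constraint expresses steady-state mass balance for every metabolite, while the bounds encode reaction reversibility and measured or assumed uptake limits.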

1.2. Research objectives

The overall goal of this research is to investigate the fundamental characteristics of the unusual metabolism of the specialist bacteria Dehalococcoides mccartyi, both in isolates and in microbial communities. This goal was pursued through a systems-level approach that combined computational and experimental studies of the physiology and metabolism of D. mccartyi. In silico studies included the construction of a detailed mathematical model of D. mccartyi metabolism, followed by integration of the model with high-throughput transcriptomic data obtained from gene-expression microarray experiments on these bacteria. Experimental studies comprised the biochemical characterization of two metabolic genes (isocitrate dehydrogenase and phosphomannose isomerase) from D. mccartyi strain KB-1, and growth experiments with an in-house D. mccartyi-containing anaerobic mixed enrichment culture, KB-1. Thus, the overall goal of this research was accomplished by pursuing four specific objectives:


1. Construction of a detailed constraint-based systemic mathematical model of D. mccartyi metabolism;

2. Improved annotation and elucidation of D. mccartyi genes and metabolism through the model-integrated analysis of high-throughput transcriptomic data;

3. Biochemical characterization of isocitrate dehydrogenase and phosphomannose isomerase genes of D. mccartyi; and

4. Exploration of the influence of exogenous vitamin B12 (cobalamin) removal from the growth medium on a D. mccartyi-containing anaerobic dechlorinating microbial community.

1.3. Thesis outline

Chapter 1: Introduction

This chapter describes the motivation behind the research by defining the problems and challenges, states research objectives, and gives an outline of the contents of all chapters included in this thesis.

Chapter 2: General overview

This chapter presents a general overview and background information on some of the most important underlying basic concepts and topics of the research projects described throughout this thesis, including different concepts of systems-level modeling of microbial metabolism, essential concepts of genome-scale constraint-based modeling, theory of microbial energy conservation, and a general introduction to chloro-organic xenobiotics contamination and their remediation strategies by D. mccartyi metabolism in pure isolates and in mixed microbial communities.


Chapter 3: Characterizing the metabolism of Dehalococcoides with a constraint-based model

There are four results chapters in this thesis, and chapter 3 describes the in silico experiments conducted to achieve the first research objective. In particular, this chapter details the construction of a pan-genome-scale constraint-based mathematical model of D. mccartyi metabolism. The D. mccartyi pan-genome and subsequent metabolic network were developed from the publicly accessible genome sequences of D. mccartyi strains 195, CBDB1, BAV1, and VS, as well as from relevant information on the organisms’ physiology and metabolism from published literature and biological databases. In silico growth and metabolism of D. mccartyi were simulated with the flux balance analysis approach using the reconstructed metabolic network as a modeling framework. The model was used for predicting the influence of carbon and energy sources, as well as the availability of cobalamin in the growth medium on D. mccartyi growth rate and yield. The annotations of all genes in D. mccartyi genomes were also reviewed or corrected bioinformatically during the construction of the pan-genome-scale metabolic network.

Chapter 4: New insight into Dehalococcoides mccartyi metabolism from a model-integrated systems-level analysis of D. mccartyi transcriptomes

This chapter also describes the in silico experiments and analyses performed to achieve the second research objective of this thesis. Using the pan-genome-scale metabolic network and model developed in chapter 3 as a common platform, transcriptomic data from the gene-expression microarray experiments of pure and mixed cultures of D. mccartyi were analyzed in this chapter. Various bioinformatic and statistical analyses of the transcriptomic data were performed to shed light on different metabolic processes, including the poorly understood mechanism of energy conservation in D. mccartyi. Transcriptomic data, together with available proteomic data, were also used to confirm transcription and expression of the majority of genes in D. mccartyi genomes, as well as to review or improve their annotations. Finally, this meta-analysis generated experimentally testable hypotheses regarding the function of some hypothetical proteins and metabolic genes of these environmentally important bacteria.

Chapter 5: Model-assisted prediction and experimental characterization of isocitrate dehydrogenase and phosphomannose isomerase from Dehalococcoides mccartyi strain KB-1

This chapter is one of two experimental chapters in this thesis, and it describes both the computational and enzymology techniques used to achieve the third research objective. In particular, chapter 5 illustrates one of many applications of systems-level modeling of microbial metabolism by describing the detailed experimental characterization of two metabolic genes, isocitrate dehydrogenase and phosphomannose isomerase, of D. mccartyi strain KB-1. Annotations of these two putative metabolic genes, one of which was originally annotated as a hypothetical protein, were reviewed in detail during the construction of the D. mccartyi metabolic network and model, as well as during the model-integrated bioinformatic analyses of transcriptomic data obtained from microarray experiments. The genes were heterologously expressed in E. coli, the overexpressed recombinant proteins were then purified, and the biochemical activity of the purified recombinant proteins was tested with appropriate enzymatic assays. The results confirmed the presence of two novel metabolic genes in D. mccartyi, and highlighted the importance of the revised gene annotations presented during the construction of the D. mccartyi metabolic model.

Chapter 6: Role of exogenous vitamin omission on the growth and community dynamics of a Dehalococcoides mccartyi-containing anaerobic mixed microbial community

This is the other experimental chapter of this thesis, and it describes the microbiological techniques used to pursue the fourth and final research objective. This chapter explores whether D. mccartyi, the vitamin B12-auxotrophic bacteria, can survive without the addition of exogenous vitamin mixtures, including vitamin B12, to their growth medium. The growth experiments were conducted with KB-1, a D. mccartyi-containing anaerobic and dechlorinating mixed microbial enrichment culture. KB-1 growth on trichloroethene and methanol was monitored in different growth media with various combinations of exogenous vitamins and with no exogenous vitamins. D. mccartyi growth was inferred from the gas chromatographic analysis of degradation products, as well as from the calculation of dechlorination rates. Also, the influence of different growth media on the KB-1 community composition was identified using the quantitative PCR (qPCR) technique. Finally, the differences in dechlorination and growth rates were analyzed by conducting growth experiments with diluted KB-1 cultures in media with and without any exogenous vitamins. D. mccartyi growth in these diluted cultures was also verified using qPCR.

Chapter 7: Summary, conclusions, and future work

This chapter summarizes the main findings from different research projects presented in this thesis, and describes their novelty, significance, and impact on D. mccartyi metabolism and physiology, as well as on the bioremediation application of these organisms, overall. New prospects for future research initiatives based on the research presented in this thesis are also described briefly.

1.4. Statement of authorship and publication status

Chapter 3: Characterizing the metabolism of Dehalococcoides with a constraint-based model

Authors: M. Ahsanul Islam1, Elizabeth A. Edwards1, and Radhakrishnan Mahadevan1

Contributions: EAE and RM conceived of the ideas and designed the experiments. MAI performed the experiments and analyzed the data. MAI wrote the manuscript with input from all co-authors.

Affiliations: 1-Department of Chemical Engineering and Applied Chemistry, University of Toronto, Toronto, Ontario, Canada


Reference to publication: PLoS Computational Biology 2010, 6(8): e1000887. doi:10.1371/journal.pcbi.1000887

Chapter 4: New insight into Dehalococcoides mccartyi metabolism from a model-integrated systems-level analysis of D. mccartyi transcriptomes

Authors: M. Ahsanul Islam1, Alison S. Waller2, Laura A. Hug3, Nicholas J. Provart4, Elizabeth A. Edwards1, and Radhakrishnan Mahadevan1

Contributions: MAI conceived of the ideas and designed the experiments in consultation with NJP, EAE, and RM. ASW generated the KB-1 transcriptomic data. LAH generated the draft genome of D. mccartyi in KB-1. MAI analyzed the data and wrote the manuscript with input from all co-authors.

Affiliations: 1-Department of Chemical Engineering and Applied Chemistry, University of Toronto, Toronto, Ontario, Canada, 2-European Molecular Biology Laboratory (EMBL), Heidelberg, Germany, 3-Department of Earth and Planetary Science, University of California, Berkeley, USA, 4-Department of Cell and Systems Biology, University of Toronto, Toronto, Ontario, Canada

Conditional acceptance: PLOS ONE

Chapter 5: Model-assisted prediction and experimental characterization of isocitrate dehydrogenase and phosphomannose isomerase from Dehalococcoides mccartyi strain KB-1

Authors: M. Ahsanul Islam, Anatoli Tchigvintsev, Veronica Yim, Alexei Savchenko, Alexander F. Yakunin, Elizabeth A. Edwards, and Radhakrishnan Mahadevan

Contributions: MAI, AS, AFY, EAE, and RM conceived of the ideas. MAI and AFY designed the experiments. VY prepared the clones, and MAI and AT performed the experiments. MAI characterized the enzymes and wrote the manuscript with input from all co-authors.


Affiliations: 1-Department of Chemical Engineering and Applied Chemistry, University of Toronto, Toronto, Ontario, Canada

In preparation for: Journal of Bacteriology

Chapter 6: Role of exogenous vitamin omission on the growth and community dynamics of a Dehalococcoides mccartyi-containing anaerobic mixed microbial community

Authors: M. Ahsanul Islam1, Radhakrishnan Mahadevan1, and Elizabeth A. Edwards1

Contributions: MAI, RM, and EAE designed the experiments. MAI performed the experiments, analyzed the data, and wrote the manuscript with input from all co-authors.

Affiliations: 1-Department of Chemical Engineering and Applied Chemistry, University of Toronto, Toronto, Ontario, Canada

In preparation for: Applied and Environmental Microbiology


Figure 1.1. Schematic representation of the relationship between different thesis objectives.


Chapter 2: General overview

2.1. Systems biology

A “systems” concept is relatively new in biology, at least in comparison to the physical sciences or engineering (Wiki_System, 2013). In 1824, the French physicist Nicolas Léonard Sadi Carnot first used the “system” concept for studying thermodynamics (Wiki_Sadi_Carnot, 2013), while in biology, it was not until 1934 that the Austrian biologist Karl Ludwig von Bertalanffy used “systems” theory for describing an organism’s growth over time by a simple mathematical model (Wiki_Bertalanffy, 2013, Wiki_System, 2013). More recently, the advent of various “omics” data, such as genomics, transcriptomics, proteomics, and metabolomics, generated from high-throughput biological experiments has led to a new era in modern biology: a data-rich environment that is completely unfamiliar to data-poor classical biology. The high-throughput experimental technologies, including genome sequencing, microarray, and mass spectrometry, not only generated a plethora of data but also significantly broadened the horizon of our knowledge about biological systems. However, the study of this data-rich modern biology requires a more holistic approach than the reductionist approach of classical biology, and this requirement ultimately gave rise to systems biology (Chuang et al., 2010). Thus, systems biology is the integrated study of a biological system, all of its components, and their intricate relationships, rather than the individual study of the system components (Chuang et al., 2010, Ideker et al., 2001, Peitsch and de Graaf, 2013).

The whole new paradigm of systems biology uses genome, proteome, and metabolome-scale data, as well as model systems for accelerating predictive and hypothesis-driven research; however, such model systems are first required to be validated by the detailed single component experiments and literature from classical biology. Nevertheless, the ability of systems biology to integrate and study myriad biological data generated from both innovative system-wide experiments and computational approaches has shown its promise in many research areas, including gene expression analysis, network biology, signal transduction, pathway-based biomarkers and analysis, and metabolic pathways (Chuang et al., 2010, Peitsch and de Graaf, 2013). Systems biology research in metabolic pathways has been especially facilitated by the exponential growth of biological databases of microbial genome sequences, both in size and numbers (Pagani et al., 2012, Robbins, 1994, Stein, 2003). In silico mathematical modeling of microbial cells and simulating their integrated behavior in the context of cellular physiology is an organized and useful way of interpreting these biological data (Karr et al., 2012, Durot et al., 2009, Hyduke et al., 2013, McCloskey et al., 2013).

2.2. Modeling microbial metabolism

From a biochemical engineering point of view, microbial cells, or cells in general, can be regarded as biological chemical process plants, where hundreds of enzyme-catalyzed biochemical reactions operate to achieve a specific cellular objective such as cell growth. All of these biochemical reactions, categorized as energy-producing (catabolic reactions), and energy-consuming or biosynthetic precursor producing (anabolic reactions), jointly constitute the cellular process called metabolism (Madigan et al., 2010, Nelson and Cox, 2006, Todar, 2012). Thus, metabolism is considered the “driving engine” of a cell (Buchakjian and Kornbluth, 2010), and the effort to model microbial metabolism, as well as cell growth, is fairly old. As mentioned before, the earliest mathematical model for cell growth as a function of time was described by Karl Ludwig von Bertalanffy in 1934 (Wiki_Bertalanffy, 2013). Then, Jacques Monod, in 1949, presented a more systematic model for cellular growth based on empirical results, in which he described the relationship among substrates, products, and biomass by a single hyperbolic equation (Monod, 1949).
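As a hedged illustration of such a hyperbolic relation, the sketch below evaluates the commonly cited Monod form, in which the specific growth rate rises with substrate concentration and saturates at a maximum value. The function name, argument names, and parameter values are assumptions chosen for illustration; they are not taken from this thesis or from Monod (1949).

```python
def monod_growth_rate(substrate, mu_max, k_s):
    """Specific growth rate from the hyperbolic Monod relation.

    substrate : limiting substrate concentration (same units as k_s)
    mu_max    : maximum specific growth rate (1/h)
    k_s       : half-saturation constant
    """
    if substrate < 0:
        raise ValueError("substrate concentration must be non-negative")
    return mu_max * substrate / (k_s + substrate)


# Illustrative (assumed) parameters: when substrate equals k_s,
# the growth rate is half of mu_max.
mu_half = monod_growth_rate(substrate=2.0, mu_max=0.5, k_s=2.0)
```

At very high substrate concentrations the returned value approaches `mu_max`, which is the saturation behavior that makes the relation hyperbolic.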

As the simplified models of cell growth were more like a correlation than a real model, the first attempt at detailed mathematical modeling of an individual cell was described by Shuler et al. (Shuler et al., 1979). Shuler and coworkers developed a single-cell computer model of Escherichia coli by incorporating the formation of biomass precursors, cellular macromolecules, and intracellular metabolites as lumped reactions, called “pseudo-chemical” reactions (Shuler et al., 1979). The model was later refined to include cellular energy requirements, and model predictions were validated with experimental data under glucose limiting conditions (Domach et al., 1984). However, these relatively simplified single-cell models failed to address the complexity of metabolic networks that usually arises from the multitude and regulation of metabolic reactions in a cell. These limitations were overcome in more systematic and detailed modeling approaches, such as “metabolic control analysis” (MCA) (Kacser and Burns, 1973, Heinrich and Rapoport, 1974) and “cybernetic modeling” (Ramkrishna, 1983).

Developed in the 1970s (Kacser and Burns, 1973, Heinrich and Rapoport, 1974), MCA is a mathematical modeling approach for understanding the control that the enzymes involved in a particular metabolic pathway exert on metabolic flux and intermediate metabolite concentrations (Wildermuth, 2000). It is essentially a sensitivity analysis of the metabolic network of a cell by perturbing metabolic fluxes and metabolite concentrations, the two system variables, for identifying rate-limiting enzymatic reactions in a pathway (Fell, 1992). Although MCA quantifies metabolic regulation to some extent, its use of sensitivity coefficients is more representative of microbial enzymatic kinetic competition than cellular regulation (Patnaik, 2001). Cellular regulatory effects on metabolic networks were addressed in a goal-oriented modeling approach called “cybernetic modeling”. Developed by Ramkrishna and coworkers (Ramkrishna and Song, 2012, Ramkrishna, 1983), the cybernetic approach is based on the premise that biological systems can manipulate their metabolism in response to environmental changes, such as the availability of nutrients, in order to maximize a particular objective. This modeling framework was later extended to develop hybrid cybernetic models (HCM) (Kim et al., 2008) and lumped hybrid cybernetic models (L-HCM) (Song and Ramkrishna, 2010) by including large numbers of metabolic reactions in the models. However, none of these frameworks is suitable for incorporating genome-scale information because estimation of model parameters for solving such models, even for a small-scale model like E. coli central metabolism, is a formidable challenge (Ramkrishna and Song, 2012). Moreover, both MCA and cybernetic approaches require a large amount of experimentally measured information such as enzyme kinetics data, and this requirement makes them especially inconvenient to use with systems-level data.

The linear programming-based approach for mathematical modeling of metabolism, called the constraint-based reconstruction and analysis (COBRA) (Becker et al., 2007, Schellenberger et al., 2011, Thiele and Palsson, 2010a), is a simple yet powerful method that can incorporate genome-scale information and identify optimal flux distributions of a large metabolic network with minimum experimental information (Lewis et al., 2012). The COBRA approach is mainly based on flux balance analysis (FBA), which was primarily developed by Varma and Palsson in the 1990s (Varma and Palsson, 1994). Since the COBRA framework was extensively used for developing the models of microbial metabolism in this thesis, all steps involved are briefly described in the following sections.

2.3. Constraint-based reconstruction and analysis (COBRA) approach

Genome-scale constraint-based modeling of microbial metabolism, also known as the COBRA approach, is an iterative model building process that requires extensive genomic, biochemical, and physiological information about an organism (Reed et al., 2006a, Thiele and Palsson, 2010a, Thiele and Palsson, 2010b, Feist et al., 2009, Palsson, 2006). Such a model never becomes complete but evolves with the evolution of knowledge about the organism of interest. Figure 2.1 is a schematic illustration of the steps involved in constructing a genome-scale model. The first step in the development of such a detailed model is the reconstruction of a genetically, genomically, and biochemically characterized metabolic network, which forms the base for mathematical analysis of the model. Once a highly curated metabolic network is constructed, the biomass composition of the organism has to be determined, followed by representation of all biomass components in the form of a biomass synthesis reaction. Then, additional physiological information such as the cellular maintenance energy in the form of ATP requirements has to be incorporated in the model, so that the model can be used to predict the growth rate and by-product secretion patterns of the organism (Palsson, 2006, Becker et al., 2007, Feist et al., 2009, Lewis et al., 2012, Schellenberger et al., 2011, Thiele and Palsson, 2010a). Finally, the model has to be validated against experimental data, such as the experimental growth rate or substrate uptake rate of the model organism, by simulating its metabolism using flux balance analysis (FBA), a mathematical modeling technique used for quantitatively simulating microbial metabolism (Thiele and Palsson, 2010a, Varma and Palsson, 1994).


Figure 2.1. Steps involved in developing a genome-scale constraint-based metabolic model by the COBRA approach. After reconstructing the metabolic network, and estimating the biomass composition and ATP requirements for cellular maintenance of the organism of interest, FBA is used to integrate all of this information for simulating the metabolism of the organism.

2.4. Metabolic network reconstruction procedures

Metabolism is a collective process of all cellular biochemical reactions, and it drives the physiological processes of a cell (Madigan et al., 2010, Nelson and Cox, 2006, Todar, 2012). Thus, the backbone of a genome-wide in silico metabolic model of a microorganism is the reconstructed network of all biochemical reactions taking place during its metabolism. Such an integrated representation of genes, proteins, and their interactions is required to be well curated in order to enhance the power of the model for predicting the metabolic phenotype of the microbe (Covert et al., 2001, Feist et al., 2009, Francke et al., 2005). In recent years, extensive research in this field has generated a number of algorithms for automatically reconstructing the metabolic network of an organism from its annotated genome sequence (Arakawa et al., 2006, DeJongh, 2007, Karp et al., 2002, Pinney et al., 2005, Sun and Zeng, 2004, Henry et al., 2010, Hung et al., 2010). However, automatically generated networks are not free from inconsistencies, such as missing reactions that are required to generate essential biomass precursors, or unwanted reactions that are not supported by the organism’s genome. Hence, they require extensive and laborious manual curation in order to be incorporated in the genome-scale model. Various steps involved in the network reconstruction process are briefly shown in Figure 2.2.

The network reconstruction process starts with the annotation of a sequenced genome of the organism to be modeled, and this has been facilitated by the recent advances in high-throughput genome sequencing technology (Mohamed and Syed, 2013, Metzker, 2010, MacLean et al., 2009). Currently, several thousand completely sequenced and annotated genomes are available on the World Wide Web and can be downloaded from the websites of different biological databases, such as GOLD (http://www.genomesonline.org/cgi-bin/GOLD/index.cgi), KEGG (http://www.genome.jp/kegg), JGI-IMG (http://img.jgi.doe.gov), JCVI-CMR (http://cmr.jcvi.org/tigr-scripts/CMR/CmrHomePage.cgi), EMBL-EBI (http://www.ebi.ac.uk/genomes/), and NCBI (http://www.ncbi.nlm.nih.gov/genome). The annotation of a sequenced genome is the assignment of functions to genes or gene products based on sequence similarity or homology with molecules or proteins of known functions available in biological databases (Koonin and Galperin, 2003). In rare cases, gene annotations are verified with mRNA, protein, or enzyme-level experimental results.


Figure 2.2. Metabolic network reconstruction procedure. Steps involved in constructing and curating a reconstructed metabolic network of an organism from its annotated genome sequence, and available biochemical and experimental data are illustrated.

The next step in the reconstruction process is the identification of genes with defined metabolic functions, which are then verified by identifying their homologs in other well characterized and extensively studied organisms, such as Escherichia coli, Bacillus subtilis, and Saccharomyces cerevisiae, with the help of the sequence alignment tool BLAST (Altschul et al., 1997). Subsequently, confidence levels are assigned based on the degree of sequence identity or bi-directional best BLAST hits. In addition, these genes are also evaluated on the basis of both gene order or conserved synteny, as well as phylogenetic analysis with the updated versions of biochemical databases, including KEGG (Kanehisa and Goto, 2000), UniProt (Apweiler et al., 2012), SWISSPROT (Boeckmann et al., 2003), IMG (Markowitz et al., 2007), and PDB (Berman et al., 2000).
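The bi-directional best hit criterion mentioned above can be sketched in a few lines. The example below is a minimal illustration that assumes BLAST results have already been parsed into dictionaries of bit scores; the gene identifiers, scores, and function names are hypothetical and do not come from the thesis.

```python
def best_hits(scores):
    """Map each query gene to its highest-scoring subject gene.

    scores: dict mapping (query, subject) -> bit score.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward, reverse):
    """Pairs (a, b) where b is the best hit of a and a is the best hit of b."""
    fwd = best_hits(forward)
    rev = best_hits(reverse)
    return sorted((a, b) for a, b in fwd.items() if rev.get(b) == a)


# Hypothetical scores: organism of interest vs. a reference organism
forward = {("dmc_001", "eco_icd"): 250, ("dmc_001", "eco_leuB"): 120,
           ("dmc_002", "eco_icd"): 90}
# Hypothetical scores for the reverse search
reverse = {("eco_icd", "dmc_001"): 245, ("eco_leuB", "dmc_003"): 200}

pairs = reciprocal_best_hits(forward, reverse)
```

Here only the pair `("dmc_001", "eco_icd")` survives, because `dmc_002` also hits `eco_icd` but is not that gene's best hit in the reverse direction.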

After identification of the genes in terms of the organism’s metabolic context, the next step is to obtain the metabolic reactions, which requires different levels of detailed biochemical information (Feist et al., 2009, Reed et al., 2006a, Thiele and Palsson, 2010a). First, substrate specificity for an enzyme has to be determined from biochemical literature of the organism to be modeled, as well as from the updated biochemical databases such as BRENDA (Schomburg et al., 2004) and ENZYME (Gasteiger et al., 2003), and the links to relevant publications therein. Secondly, after determining the molecular and charged formulae of participating metabolites, stoichiometric coefficients for the biochemical reactions to be incorporated in the network are estimated by balancing the products and substrates on both sides of a reaction. Afterwards, the information regarding thermodynamic considerations or reaction directionality has to be incorporated, followed by localization of reactions and proteins in cellular compartments (Feist et al., 2009, Reed et al., 2006a, Thiele and Palsson, 2010a). All of the aforementioned information is available in metabolic databases, including MetaCyc (Caspi et al., 2012), BioCyc (Caspi et al., 2012), KEGG (Kanehisa et al., 2011), SEED (DeJongh et al., 2007), and PubChem (Bolton et al., 2008, Wang et al., 2012). Although the cellular localization of reactions is very important and challenging for eukaryotic organisms, the task is relatively easy for prokaryotic organisms containing only one compartment.
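The balancing step can be checked mechanically once the formula and charge of each metabolite are fixed. The sketch below is a minimal illustration using the isocitrate dehydrogenase reaction; the protonation states and charges shown are assumed values of the kind commonly used in metabolic reconstructions, and the function and variable names are hypothetical.

```python
from collections import defaultdict


def imbalance(reaction, formulas, charges):
    """Return the net elemental and charge imbalance of a reaction.

    reaction: dict metabolite -> stoichiometric coefficient
              (negative for substrates, positive for products).
    An empty dict means the reaction is mass- and charge-balanced.
    """
    totals = defaultdict(int)
    for met, coeff in reaction.items():
        for element, count in formulas[met].items():
            totals[element] += coeff * count
        totals["charge"] += coeff * charges[met]
    return {key: value for key, value in totals.items() if value != 0}


# Assumed formulas and charges (illustrative protonation states)
formulas = {
    "isocitrate":     {"C": 6, "H": 5, "O": 7},
    "nadp":           {"C": 21, "H": 25, "N": 7, "O": 17, "P": 3},
    "2-oxoglutarate": {"C": 5, "H": 4, "O": 5},
    "co2":            {"C": 1, "O": 2},
    "nadph":          {"C": 21, "H": 26, "N": 7, "O": 17, "P": 3},
}
charges = {"isocitrate": -3, "nadp": -3, "2-oxoglutarate": -2,
           "co2": 0, "nadph": -4}

# isocitrate + NADP+ -> 2-oxoglutarate + CO2 + NADPH
icdh = {"isocitrate": -1, "nadp": -1, "2-oxoglutarate": 1, "co2": 1, "nadph": 1}
balanced_result = imbalance(icdh, formulas, charges)

# Omitting CO2 exposes the missing carbon and oxygen atoms
broken = {m: c for m, c in icdh.items() if m != "co2"}
broken_result = imbalance(broken, formulas, charges)
```

A non-empty result points directly at the elements, or the charge, that still need to be reconciled before the reaction is added to the network.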

Once the metabolic reactions are defined, the next and most critical step in the network reconstruction process is the pathway analysis for finding and filling gaps in a metabolic network. This process is known as network debugging or network curation (Feist et al., 2009, Thiele and Palsson, 2010a). Although a number of algorithms (Green and Karp, 2004, Herrgård et al., 2006, Hung et al., 2010, Notebaart et al., 2006, Reed et al., 2006b, Kumar et al., 2007) are available for automated curation of a metabolic network, manual curation is still necessary. The manual curation step is very laborious and cumbersome yet essential for generating an accurate reconstruction, which can then be used as a scaffold for developing a genome-scale in silico model. The metabolic reconstruction generated from a genome annotation alone has several pitfalls, including incorrect substrate specificity, reaction reversibility, and cofactor usage, treatment of enzyme subunits as separate enzymes, and missing reactions that have no assigned ORFs (Durot et al., 2009, Feist et al., 2009, Reed et al., 2006a, Thiele and Palsson, 2010a). This is because genome annotations of an organism are usually based on sequence similarity or homology with known database proteins without having any experimental or biochemical evidence for that gene product, reaction or function prediction (Devos and Valencia, 2001, Reed et al., 2006a). In addition, the metabolic capability demonstrated by the reconstructed network has to be consistent with the known physiology of the organism. Hence, knowledge about the organism’s physiology, especially in the context of metabolic capabilities, is crucial for a curated reconstructed network.

Typically, gaps in a metabolic network are generated by the so-called “dead metabolites” or “blocked metabolites”, i.e., metabolites that are either only consumed or only produced in the network (Feist et al., 2009, Schellenberger et al., 2011, Thiele and Palsson, 2010a). The dead metabolites usually originate from the presence or absence of certain pathways/reactions affecting the availability of substrates for other reactions (Francke et al., 2005, Kumar et al., 2007). So, the missing reactions associated with such metabolites have to be identified from previously mentioned reaction databases, as well as from the analysis of major metabolic pathways, such as glycolysis, the TCA cycle, or amino acid biosynthesis pathways, that are essential for generating biomass precursors of the organism. Once the reactions are identified and relevant gaps are fixed, the next step is to look for the genes encoding the enzymes catalyzing the missing reactions in other organisms. Subsequently, the homologs of these genes in the organism to be modeled have to be identified using reciprocal BLAST analyses. This process, in turn, leads to reannotations of existing ORFs, or to the addition of new genes to the reconstructed network.
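A simple way to flag candidate dead metabolites is to scan each row of the stoichiometric matrix for metabolites that can only be produced or only be consumed. The following is a minimal sketch under the assumption that reaction reversibility is known; the toy network, names, and function are hypothetical, and dedicated gap-finding algorithms cited above are considerably more thorough.

```python
import numpy as np


def dead_end_metabolites(S, metabolites, reversible):
    """List metabolites that can only be produced or only be consumed.

    S          : (m x n) stoichiometric matrix, rows are metabolites
    reversible : length-n booleans, True if the reaction runs both ways
    """
    S = np.asarray(S, dtype=float)
    reversible = np.asarray(reversible, dtype=bool)
    dead = []
    for i, name in enumerate(metabolites):
        row = S[i]
        # A reversible reaction can act as producer and as consumer
        can_be_produced = np.any(row > 0) or np.any((row < 0) & reversible)
        can_be_consumed = np.any(row < 0) or np.any((row > 0) & reversible)
        if not (can_be_produced and can_be_consumed):
            dead.append(name)
    return dead


# Toy network: uptake -> A, A -> B, B -> C (nothing consumes C)
metabolites = ["A", "B", "C"]
S = [[1, -1, 0],
     [0, 1, -1],
     [0, 0, 1]]
dead = dead_end_metabolites(S, metabolites, reversible=[False, False, False])
```

In this toy case `C` is reported, which would prompt a search for a missing consuming or export reaction.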

2.5. Determination of biomass composition and maintenance energy

Once a highly curated metabolic network of the organism to be modeled is reconstructed, it is necessary to know the demands on the metabolic system in terms of biomass synthesis, as well as maintenance energy requirements. Hence, this information is incorporated in the model as a biomass synthesis reaction in the next step of the model building process (Becker et al., 2007, Feist et al., 2009, Schellenberger et al., 2011, Thiele and Palsson, 2010a, Varma and Palsson, 1994). Usually, the composition of cellular macromolecules, such as proteins, nucleic acids, polysaccharides, lipids, fatty acids, and cofactors, is estimated from detailed physiological experiments on the organism of interest. However, amino acid, DNA, and RNA compositions can also be estimated from the organism’s genome sequence (Roberts et al., 2010), and detailed experimental biomass compositions of several well studied organisms from Eukarya, Bacteria, and Archaea are available (Duarte et al., 2004, Feist et al., 2006, Mahadevan et al., 2006, Neidhardt et al., 1990, Oh et al., 2007). Thus, in the absence of organism-specific experimental data, the distribution and composition of remaining biomass precursors can be approximated and modified from other organisms’ data in accordance with the physiology and morphology of the organism to be modeled.
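As a rough illustration of estimating amino acid composition from sequence data, the sketch below computes residue fractions over a set of predicted protein sequences. It assumes every protein contributes equally, that is, it ignores expression levels, which is a simplification; the sequences and names are hypothetical.

```python
from collections import Counter


def amino_acid_fractions(protein_sequences):
    """Mole fraction of each amino acid across a set of protein sequences."""
    counts = Counter()
    for seq in protein_sequences:
        counts.update(seq.upper())
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no residues found")
    return {aa: n / total for aa, n in sorted(counts.items())}


# Hypothetical two-protein "proteome" in one-letter code
toy_proteome = ["MKKA", "MA"]
fractions = amino_acid_fractions(toy_proteome)
```

The resulting fractions would still need to be converted to mmol per gram of dry weight, using the measured or assumed protein content of the cell, before entering a biomass synthesis reaction.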

In addition to the biomass composition, the other key component of the biomass synthesis reaction is the maintenance energy requirement of a microbial cell. The maintenance energy refers to the energy required for a microbial cell to perform functions that are not directly growth related, or not involved in synthesizing new materials for a microbial cell (Pirt, 1965, Pirt, 1982, Russell and Cook, 1995). Cells produce energy in the form of ATP from catabolic reactions, and utilize these ATP molecules for biosynthetic anabolic reactions. However, not all ATP from catabolism is consumed by anabolic reactions, and cells require ATP energy for many functions not otherwise captured in the metabolic model, such as the polymerization of macromolecules, turnover of cellular amino acid pools, active movement or motility, and active substrate and ion transport; this cellular ATP requirement is termed maintenance energy (Neidhardt et al., 1990). Cellular maintenance energy requirements can be of two types: growth-associated maintenance energy (GAM), and non-growth-associated maintenance energy (NGAM) (Neidhardt et al., 1990). GAM is variable and accounts for the energy related to assembly and polymerization of macromolecules (i.e., proteins, DNA, RNA, lipids, and polysaccharides), while the constant NGAM corresponds to the ATP energy required for maintaining the integrity of a microbial cell (Pirt, 1965, Pirt, 1982, Russell and Cook, 1995). Methods for estimating both types of maintenance energy have been developed and described in literature (Neijssel et al., 1996, Pirt, 1965, Pirt, 1982).
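One common way of separating the two contributions, sketched here under stated assumptions, is a Pirt-type linear fit of the ATP production rate against the growth rate: the slope is read as GAM and the intercept as NGAM. The data points below are synthetic and chosen only to make the arithmetic transparent; they are not measurements from this thesis, and the function name is hypothetical.

```python
import numpy as np


def fit_maintenance(growth_rates, atp_rates):
    """Linear fit q_ATP = GAM * mu + NGAM.

    Returns (GAM, NGAM): slope in mmol ATP/gDW and
    intercept in mmol ATP/gDW/h.
    """
    gam, ngam = np.polyfit(growth_rates, atp_rates, deg=1)
    return gam, ngam


# Synthetic chemostat-style data generated from q_ATP = 50 * mu + 5
mu = np.array([0.1, 0.2, 0.3, 0.4])            # growth rate, 1/h
q_atp = np.array([10.0, 15.0, 20.0, 25.0])     # mmol ATP/gDW/h
gam, ngam = fit_maintenance(mu, q_atp)
```

With real data the points scatter around the line, so the fitted values carry uncertainty that is worth propagating into any later growth predictions.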


2.6. Model validation and refinement

The final step in developing a genome-scale in silico model of microbial metabolism is the validation of the model with organism-specific experimental data, and analysis of the model predictions for probable refinement of the metabolic network, as well as the overall model. This process can be accomplished by predicting the growth and by-product secretion patterns of the organism under conditions that were not used to estimate the model parameters, and comparing the predictions with experimental data. Such experimental data can be obtained from large-scale high-throughput physiological techniques, called phenotype micro-arrays (Bochner et al., 2001, Atanasova and Druzhinina, 2010, Oh et al., 2007). In addition, isotope labeling experiments such as 13C-based metabolic flux analysis can also be used for evaluating metabolic networks and validating genome-scale models (Tang et al., 2009a, Tang et al., 2012, Sauer, 2006). Moreover, the developed model can be used to generate experimentally testable hypotheses regarding the cellular physiology, as well as the metabolic capability in terms of the genotype-phenotype relationship of an organism, by utilizing the flux balance analysis (FBA) technique. In fact, this technique is at the core of constraint-based genome-scale modeling; thus, it is discussed in detail in the following section.

2.7. Flux balance analysis

Flux balance analysis (FBA) is an established mathematical modeling approach that has been extensively used for quantitatively simulating cellular metabolism using genome-scale metabolic models. Essentially, FBA is a mathematical framework that is used to interrogate the reconstructed metabolic network of an organism to predict its cellular behavior, or metabolic phenotype, under certain physicochemical constraints, such as mass balance constraints, energy balance constraints, and flux limitations or bounds constraints (Price et al., 2004, Orth et al., 2010, Edwards et al., 2002, Bonarius et al., 1997, Kauffman et al., 2003). Thus, FBA calculates the flow of metabolites through a metabolic network and predicts the growth or by-product secretion pattern of an organism. The fundamental principle upon which FBA is based is the law of conservation of mass (Edwards et al., 2002, Kauffman et al., 2003, Palsson, 2006).


Extensive literature describing the formulation, as well as the implementation in metabolic modeling is available on FBA (Price et al., 2004, Orth et al., 2010, Edwards et al., 2002, Bonarius et al., 1997, Kauffman et al., 2003, Gianchandani et al., 2010, Raman and Chandra, 2009, Palsson, 2006, Varma and Palsson, 1994, Lewis et al., 2012), and formulation of the method is briefly described in the following section.

Let us assume that the concentration of a particular metabolite (x_i) in a metabolic reaction network is influenced by various reaction fluxes (v_j). A material/mass balance around each metabolite in a metabolic network results in the dynamic mass balance equation of the form:

$$\frac{dx_i}{dt} = v_{syn} - v_{deg} - (v_{use} \pm v_{trans}) \tag{1}$$

where v_syn and v_deg refer to the synthesis and degradation fluxes of the metabolite x_i, v_use is the cellular maintenance flux, and v_trans is the uptake or secretion flux of the metabolite x_i; v_use can be determined from cellular compositions as discussed previously, while v_trans can be measured experimentally. Thus, equation (1) takes the form:

$$\frac{dx_i}{dt} = v_{syn} - v_{deg} - b_i \tag{2}$$

where b_i is the total output of x_i from the defined metabolic system. Equation (2) can also be represented by a single matrix equation of the form:

$$\frac{dX}{dt} = S \cdot v - b \tag{3}$$

where, X is an m x 1 dimensional matrix of m metabolites within the cell, v is the n x 1 dimensional matrix of n fluxes through n number of metabolic reactions, S is the m x n dimensional stoichiometric matrix representing the entire metabolic network, and b is the matrix of known metabolic demands or exchange fluxes.
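
As a small numerical illustration of equation (3), the sketch below evaluates dX/dt = S·v − b for a hypothetical three-metabolite, two-reaction chain (A → B → C). The network, the flux values, and the helper name `mass_balance` are assumptions made purely for illustration and are not taken from the model described in this thesis.

```python
# Hypothetical toy network (illustration only): R1: A -> B, R2: B -> C.
# Rows of S are metabolites (A, B, C); columns are reactions (R1, R2).
S = [
    [-1.0,  0.0],  # A is consumed by R1
    [ 1.0, -1.0],  # B is produced by R1 and consumed by R2
    [ 0.0,  1.0],  # C is produced by R2
]

# b holds the net output of each metabolite from the system:
# negative = net uptake (A), positive = net secretion (C).
b = [-1.0, 0.0, 1.0]


def mass_balance(S, v, b):
    """Return dX/dt = S.v - b for every metabolite (equation 3)."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) - b_i
            for row, b_i in zip(S, b)]


balanced = mass_balance(S, [1.0, 1.0], b)    # both reactions run at the same rate
unbalanced = mass_balance(S, [1.0, 0.5], b)  # R2 is slower than R1

print("balanced:  ", balanced)
print("unbalanced:", unbalanced)
```

With equal fluxes every derivative is zero, which corresponds to the steady-state condition; when R2 is slower than R1, metabolite B accumulates and the assumed secretion of C can no longer be met.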


Because metabolic transients are typically much faster than cellular growth and process dynamics (Vallino and Stephanopoulos, 1993), a steady-state condition can be assumed and equation (3) reduces to:

$$S \cdot v = b \tag{4}$$

If exchange fluxes are known and incorporated in the v matrix, equation (4) can also be written as:

$$S \cdot v = 0 \tag{5}$$

Equation (5) simply states that the synthesis fluxes of a metabolite must be balanced by the degradation fluxes over long time intervals. Since the number of reaction fluxes usually exceeds the number of metabolites or mass balances, equation (5) constitutes an underdetermined system resulting in a plurality of solutions, or feasible flux distributions. Although infinite in number, these solutions are constrained by mass balances included in the S matrix, forming a bounded solution space. Thus, a linear programming (LP) problem can be formulated and solved for a particular objective such as the maximization of cellular growth. Additional constraints such as a_i ≤ v_i ≤ b_i, where a_i and b_i represent the lower and upper bounds of the corresponding reaction fluxes (v_i), in addition to the stoichiometric constraints represented by equation (5), are required to solve the LP problem. The optimal flux distribution represented by the solution of the LP problem is essentially the metabolic phenotype of the microbe at the conditions described by the constraints.
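
The LP formulation described above can be sketched with a generic solver. The example below is a minimal illustration that assumes SciPy's `scipy.optimize.linprog` is available; the four-reaction toy network, its bounds, and the choice of v3 as the "growth" objective are hypothetical and are not part of the model developed in this thesis.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (illustration only), internal metabolites A and B:
#   v1: uptake -> A      v2: A -> B
#   v3: B -> biomass     v4: A -> by-product
S = np.array([
    [1.0, -1.0,  0.0, -1.0],  # mass balance on A
    [0.0,  1.0, -1.0,  0.0],  # mass balance on B
])

# Flux bounds a_i <= v_i <= b_i; uptake is capped at 10 (arbitrary units).
bounds = [(0.0, 10.0), (0.0, 1000.0), (0.0, 1000.0), (0.0, 1000.0)]

# linprog minimizes c.v, so maximizing v3 means minimizing -v3.
c = np.array([0.0, 0.0, -1.0, 0.0])

# Steady-state constraint S.v = 0 (equation 5) enters as an equality.
result = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]), bounds=bounds)

fluxes = result.x      # one optimal flux distribution
growth = -result.fun   # maximized value of v3

print("solver succeeded:", result.success)
print("optimal fluxes:", np.round(fluxes, 6))
print("maximal v3:", round(float(growth), 6))
```

In this toy case the uptake bound is the only limiting constraint, so the optimum routes all substrate to v3. A genome-scale model uses the same formulation, only with a much larger S matrix and organism-specific bounds.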

2.8. Energy conservation in microbes

The metabolic diversity of microorganisms is vast, and this is especially true for their energy metabolism. In biological systems, adenosine triphosphate (ATP) is the universal molecular currency for storing and exchanging biological energy (Madigan et al., 2010, Nelson and Cox, 2006). Irrespective of the types of substrate used for metabolism, energy is transformed and ultimately conserved in the form of the energy-rich pyrophosphate bond of ATP in all forms of life (Thauer et al., 1977). Chemotrophic microbes usually produce ATP during catabolism by one of two energy-conserving mechanisms, substrate level phosphorylation (SLP) and electron transport phosphorylation (ETP), both of which involve a combination of various electron-donating and electron-accepting redox reactions (Kröger et al., 2002, Kröger et al., 1992, Thauer et al., 1977). In SLP or fermentation, ATP is generated from adenosine diphosphate (ADP) and inorganic phosphate (Pi) in a reaction coupled to the exergonic conversion of an organic compound, in the absence of a terminal electron acceptor (Thauer et al., 1977, Madigan et al., 2010, http://textbookofbacteriology.net/index.html, 2008, Unden et al., 2013). The synthesis of ATP through ETP is more evolved and observed widely in aerobic, anaerobic, and photosynthetic microorganisms (Thauer et al., 1977, Unden et al., 2013). In ETP or respiration, the synthesis of ATP is coupled to redox reactions mediated by electron carriers in an electron transport chain via a “chemiosmotic mechanism”, in the presence of a terminal electron acceptor (Kröger et al., 2002, Kröger et al., 1992, Thauer et al., 1977, Unden et al., 2013).

The chemiosmotic mechanism, proposed by the British biochemist Peter Mitchell (Mitchell, 1961, Mitchell, 1972), deals with a proton motive force or PMF (Δp), which consists of a pH gradient (ΔpH) and an electrochemical potential difference (Δψ) and is generated by the flow of electrons from an electron donor to an electron acceptor through a membrane-bound electron transport chain. The generated PMF is then utilized for synthesizing ATP from ADP and Pi by the action of a membrane-potential-driven enzyme complex, called ATP synthase or ATPase (Thauer et al., 1977, Mitchell, 1961, Mitchell, 1972, Unden et al., 2013). Mitchell originally described the mechanism for aerobic bacteria which use oxygen as the terminal electron acceptor, and the process is known as oxidative phosphorylation. However, the generation of ATP via the chemiosmotic mechanism by chemotrophic anaerobes is known as electron transport phosphorylation (ETP), or anaerobic respiration, where fumarate, nitrate, nitrite, sulfate, polysulfide, organohalides, carbon dioxide, and metal oxides instead of oxygen are used as terminal electron acceptors (Unden et al., 2013, Thauer et al., 1977, Kröger et al., 2002, Kröger et al., 1992).


2.9. Chlorinated xenobiotics and reductive dechlorination

Chlorinated organic solvents, including both chlorinated aliphatic hydrocarbons (e.g., PCE, TCE, and VC) and chlorinated aromatic hydrocarbons (e.g., HCB, TCB, and MCB), are highly volatile, less corrosive, less reactive, less flammable, and capable of effectively dissolving a wide range of organic compounds (Doherty, 2000a, Doherty, 2000b, Doherty, 2012, Lohman, 2002). These properties have made them very popular for widespread commercial and industrial use as cleaning and degreasing agents, refrigerants, extraction agents, ingredients for making adhesives, industrial paints, paint strippers, varnishes, lubricants, fungicides, herbicides, and pesticides in industries ranging from dry cleaning, electronics, defense, and automotive to pharmaceutical, textile and agriculture sectors, for more than 30 years (Petrisor and Wells, 2008, Doherty, 2000a, Doherty, 2000b, Lohman, 2002). However, such extensive use and past careless handling and disposal practices, together with the lack of awareness of harmful effects on human health and the environment, have made chlorinated solvents the most widespread xenobiotics or man-made contaminants of groundwater and soil (Doherty, 2000a, Doherty, 2000b, Doherty, 2012, Lohman, 2002). Trichloroethene (TCE) was found in at least 852 of 1430 while trichlorobenzene (TCB) was identified in 1699 USEPA (US Environmental Protection Agency) National Priorities List (NPL) Superfund sites (ATSDR, 2013). A 1995 Health Canada study (Canada and Limited, 1995) also found tetrachloroethene (PCE) and TCE to be the most common groundwater contaminants at Canadian sites, while TCE was the most prevalent contaminant worldwide (in 3% of surface water and 19% of groundwater samples) as per a 1989 estimate by USEPA (Petrisor and Wells, 2008).
Being sparingly soluble and denser than water, chlorinated solvents tend to sink in water and create a separate layer called dense non-aqueous phase liquids (DNAPLs) in soil and subsurface liquids (McCarty, 2001, McCarty, 2010). Due to their persistent nature, such DNAPLs ultimately hit the groundwater level and contaminate it by creating DNAPL plumes (McCarty, 2001, McCarty, 2010).

In spite of their environmental persistence, chlorinated solvents can be degraded by both biotic and abiotic processes to form intermediates that are even more harmful than the original higher chlorinated compounds. For instance, partial biological transformation of TCE to produce cis-1,2-dichloroethene (cDCE) and vinyl chloride (VC) was first reported in 1985 (Parsons and Lage, 1985, Vogel and McCarty, 1985); VC is a known human carcinogen (Maltoni and Lefemine, 1975, Maltoni and Cotti, 1988) as its exposure causes a rare form of liver cancer, angiosarcoma (Vianna et al., 1981), while TCE has been listed as a carcinogen recently (ATSDR_TCE_PCE, 2010). High levels of TCE exposure also cause nervous system effects and lung damage, and PCE is a suspected carcinogen as it has been linked to leukemia and other defects in children with indirect exposure to PCE (ATSDR_TCE_PCE, 2010). Pentachlorobenzene (PeCB) and hexachlorobenzene (HCB) are reported to show hepatocarcinogenic activity (Thomas et al., 1998), while mono- and dichlorobenzenes are relatively less toxic and can cause defects in skin, hyperpigmentation, osteoporosis, and other diseases primarily in children (Gustafson et al., 2000).

In 1989, it was first reported (Freedman and Gossett, 1989) that not only PCE and TCE but also VC could be biologically transformed to completely non-hazardous ethene by an anaerobic process termed reductive dechlorination. Although TCE was earlier reported to degrade by aerobic co-metabolic processes (Wilson and Wilson, 1985, McCarty et al., 1998), the anaerobic reductive dechlorination was shown to be linked to the organisms’ growth (Holliger et al., 1993, Freedman and Gossett, 1989). In a co-metabolic process, microbes cannot couple the energy of the dechlorination reaction to their growth because the transformation of chlorinated compounds is a fortuitous modification by cofactors or enzymes which are catalyzing other reactions (Haggbloom and Bossert, 2003, El Fantroussi et al., 1998). However, during the anaerobic reductive dechlorination process, microbes can conserve the energy of the dechlorination reaction by generating ATP through electron transport phosphorylation and the chemiosmotic mechanism using chlorinated solvents as terminal electron acceptors, and can couple this energy to their growth (Smidt and de Vos, 2004, Tas et al., 2010, Holliger et al., 1998b, Leys et al., 2013); hence, the process is also known as dehalorespiration, or organohalide respiration, and is catalyzed by reductive dehalogenase enzymes of some anaerobic bacteria (Figure 2.3) (Smidt and de Vos, 2004, Tas et al., 2010, Holliger et al., 1998b, Leys et al., 2013, Futagami et al., 2008).


Figure 2.3. Anaerobic reductive dechlorination of chlorinated ethenes to benign ethene and higher chlorinated benzenes to less toxic lower chlorinated benzenes

Although organohalide respiration is catalyzed by reductive dehalogenase (RDase) enzymes encoded by reductive dehalogenase homologous (rdh) genes, the exact mechanism of the reductive dechlorination reaction is yet to be known because the structure of a purified RDase protein has not been determined yet. However, the RDase enzymes purified and characterized to date contain a corrinoid protein such as cobalamin and two iron-sulfur clusters as cofactors (Schumacher et al., 1997, Miller et al., 1997, Miller et al., 1998, Neumann et al., 1996, Neumann et al., 2002, Magnuson et al., 2000, Magnuson et al., 1998, Adrian et al., 2007b, Krajmalnik-Brown et al., 2004, Müller et al., 2004), except the 3-chlorobenzoate dehalogenase of Desulfomonile tiedjei, which contains a heme group instead of a corrinoid protein (Ni et al., 1995). The involvement of a corrinoid cofactor makes the reductive dechlorination reaction a novel type of corrinoid-dependent reaction (Banerjee and Ragsdale, 2003, Holliger et al., 1998b). Krasotkina and coworkers (Krasotkina et al., 2001) proposed two working models regarding the mechanism of the reductive dechlorination reaction of chloro-aromatic compounds. The first model described the formation of different intermediates with cob(I)alamin of corrinoid as addition reactions, while the second one categorized the formation of intermediates as radical reactions (Banerjee and Ragsdale, 2003, Krasotkina et al., 2001). The first mechanism was purely theoretical, while there was some experimental evidence for the latter one (Glod et al., 1997, Holliger et al., 1998b).

2.10. Dehalococcoides bacteria

The biological transformation of VC, the known human carcinogen (Vianna et al., 1981), to completely non-toxic ethene by anaerobic reductive dechlorination was first reported in 1989 (Freedman and Gossett, 1989). This finding attracted renewed interest from researchers in this topic, and led to the isolation of many anaerobic bacteria, including Desulfitobacterium sp. strain PCE 1 (Gerritse et al., 1996), Sulfurospirillum multivorans (Scholzmuramatsu et al., 1995), and Dehalobacter restrictus (Holliger et al., 1998a), capable of transforming PCE and TCE partially to VC, but not to ethene, by reductive dechlorination. In 1997, the anaerobic bacterium Dehalococcoides ethenogenes strain 195, capable of degrading PCE completely to ethene, was first isolated (Maymó-Gatell et al., 1997). Since then, a number of Dehalococcoides strains, including strains CBDB1 (Adrian et al., 2000b), BAV1 (He et al., 2003), FL2 (He et al., 2005), GT (Sung et al., 2006), VS (Cupples et al., 2003), MB (Cheng and He, 2009), and ANAS1 and ANAS2 (Lee et al., 2011), have been isolated from geographically diverse contaminated sites. Recently, the genus and species of Dehalococcoides have been defined, and all isolates are now known as strains of Dehalococcoides mccartyi (Löffler et al., 2012). Phylogenetically, members of D. mccartyi belong to the Chloroflexi, a not very well characterized bacterial phylum of green non-sulphur bacteria (Löffler et al., 2012). Despite being named as coccoids, these bacteria are fairly small with a flattened disc-shaped morphology, having a diameter of approximately 0.3 ~ 1 µm, and a thickness of 0.1 ~ 0.2 μm with occasional cellular appendages and biconcave indentations (Adrian et al., 2000b, Löffler et al., 2012). Unlike a typical bacterial cell wall, the D. mccartyi cell wall closely resembles archaeal S-layer-like proteins (Maymó-Gatell et al., 1997, Adrian et al., 2000b, Löffler et al., 2012); hence, they are neither Gram-positive nor Gram-negative bacteria.


More than 98% sequence identity in their 16S rRNA gene sequences renders the strains of D. mccartyi strikingly similar and makes their isolation very difficult (Ritalahti et al., 2006, Löffler et al., 2012); nonetheless, they are very diverse in terms of their metabolic capabilities, as demonstrated by the wide range of chloro-organics they can respire. Strains 195 and CBDB1 can dechlorinate both chloro-aliphatics and chloro-aromatics, including the potent human carcinogens VC, TCE, and dioxins (Adrian et al., 2007a, Bunge et al., 2003), while the rest of the strains respire chloro-aliphatics only (Löffler et al., 2012). During organohalide respiration, all strains use chlorinated organics as terminal electron acceptors, hydrogen as the sole electron donor or energy source, and acetate as the carbon source (Löffler et al., 2012, Adrian, 2009). Notably, not all steps involved in the dechlorination of chloro-aliphatics by D. mccartyi are energy conserving. For example, during the dechlorination of higher chlorinated ethenes by strains 195 and FL2, the vinyl chloride (VC) to ethene step is co-metabolic (He et al., 2005, Maymó-Gatell et al., 1997), while for strain BAV1, the PCE and TCE degradation steps are co-metabolic, but the cDCE and VC steps are growth related (He et al., 2003). Based on the 16S rRNA gene sequence similarity of D. mccartyi isolates, and other mixed culture and environmental samples, Hendrickson et al. (2002) divided the Dehalococcoides phylotype into 3 subgroups: Cornell, Pinellas and Victoria. While strains 195, MB, ANAS1, and ANAS2 belong to the Cornell subgroup, the other isolates except strain VS belong to the Pinellas subgroup; strain VS is the lone isolate from the Victoria subgroup (Hendrickson et al., 2002, Löffler et al., 2012).

Organohalide respiration or reductive dechlorination is the only known metabolic process by which D. mccartyi conserve energy for growth (Löffler et al., 2012, Futagami et al., 2008, Adrian, 2009, Leys et al., 2013). Thus, the difference in metabolizing various organohalides by D. mccartyi strains can be attributed to the presence of multiple copies of non-identical but homologous rdh genes (Hölscher et al., 2004) because the RDase enzymes, encoded by the rdh genes, catalyze the reductive dechlorination reaction in D. mccartyi. Genome sequences of multiple D. mccartyi strains (Kube et al., 2005, McMurdie et al., 2009, Seshadri et al., 2005) revealed the presence of an unusually large number of rdh genes in each strain, ranging from 10 in strain BAV1 to 36 in strain VS (Hug et al., 2013, McMurdie et al., 2009). Interestingly, the majority of these rdh genes are located in two variable regions of the genomes called high plasticity regions, along with other insertion sequences and repeated elements (McMurdie et al., 2009, Kube et al., 2005, Seshadri et al., 2005). Apart from these differences, the core genome of sequenced D. mccartyi strains is remarkably similar and conserved (Ahsanul Islam et al., 2010, McMurdie et al., 2009). Although the substrate ranges of only five RDase enzymes have been experimentally characterized so far (Adrian et al., 2007b, Krajmalnik-Brown et al., 2004, Magnuson et al., 2000, Magnuson et al., 1998, Müller et al., 2004), it is hypothesized that these bacteria probably degrade a wide variety of chlorinated pollutants as growth-supporting terminal electron acceptors due to the large number of rdh genes in the genomes (Kube et al., 2005, Seshadri et al., 2005).

2.11. The KB-1 microbial community

KB-1 is an anaerobic dechlorinating mixed microbial culture, maintained and enriched in the Edwards lab for more than 16 years, and originated from the soil and groundwater of a TCE-contaminated site in Southern Ontario (Duhamel et al., 2002, Edwards and Cox, 1997). Using only methanol as the electron donor, KB-1 can dechlorinate PCE, TCE, cDCE and VC to ethene by using them as electron acceptors. Although KB-1 is a mixed microbial community, past studies (Duhamel et al., 2004, Duhamel, 2005, Waller, 2010) have identified dominant organisms in the community belonging to four major phylotypes: dechlorinators, acetogens, methanogens, and fermenters (Figure 2.4); however, dechlorinators, such as Dehalococcoides and Geobacter, are the largest microbial populations, comprising more than 60% of the KB-1 community (Duhamel et al., 2004, Duhamel, 2005, Waller, 2010). Both Dehalococcoides and Geobacter are active dechlorinating organisms in the community, while other members mainly play supporting roles by providing essential substrates for the dechlorinators; for instance, methanol is converted to acetate and hydrogen by acetogens (Duhamel and Edwards, 2006, Duhamel et al., 2002, Duhamel, 2005), and these two metabolites are the carbon and energy sources for Dehalococcoides. Growth and dechlorination activities by Dehalococcoides are reported to be faster and more robust in the KB-1 community than in pure cultures (Duhamel, 2005, Hug, 2012, Waller, 2010), which indicates the presence of some beneficial interactions between the community members. Moreover, the presence of multiple organisms provides functional redundancies in KB-1, which likely play a crucial role in the robustness of dechlorination activity by this microbial consortium as compared to pure cultures of Dehalococcoides (Duhamel and Edwards, 2006, Duhamel et al., 2002). A metagenome of the KB-1 community has recently been sequenced, and a composite genome of two very similar Dehalococcoides strains was identified (Hug, 2012). A variety of KB-1 culture is also used in commercial bioremediation applications, especially for purposes in more than 180 contaminated sites around the world (Nicholson, 2010).

Figure 2.4. Schematic representation of the microbial interactions between different community members in the KB-1 community. Only major KB-1 phylotypes are shown in the figure.


Chapter 3: Characterizing the metabolism of Dehalococcoides with a constraint-based model

3.1. Abstract

Dehalococcoides strains respire a wide variety of chloro-organic compounds and are important for the bioremediation of toxic, persistent, carcinogenic, and ubiquitous ground water pollutants. In order to better understand their metabolism and optimize their application, we have developed a pan-genome-scale metabolic network and constraint-based metabolic model of Dehalococcoides. The pan-genome was constructed from publicly available complete genome sequences of Dehalococcoides mccartyi strains CBDB1, 195, BAV1, and VS. We found that the Dehalococcoides pan-genome consisted of 1118 core genes (shared by all), 457 dispensable genes (shared by some), and 486 unique genes (found in only one genome). The model included 549 metabolic genes that encoded 356 proteins catalyzing 497 gene-associated model reactions. Of these 497 reactions, 477 were associated with core metabolic genes, 18 with dispensable genes, and 2 with unique genes. This study, in addition to analyzing the metabolism of an environmentally important phylogenetic group on a pan-genome scale, provides valuable insights into Dehalococcoides metabolic limitations, low growth yields, and energy conservation. The model also provides a framework to anchor and compare disparate experimental data, as well as to give insights on the physiological impact of “incomplete” pathways, such as the TCA-cycle, CO2 fixation, and cobalamin biosynthesis pathways. The model, referred to as iAI549, highlights the specialized and highly conserved nature of Dehalococcoides metabolism, and suggests that the evolution of Dehalococcoides species is driven by electron acceptor availability.


3.2. Introduction

Genome sequencing has enabled the characterization of biological systems in a more comprehensive manner. Recent research in bioinformatics and systems biology has resulted in the development of numerous systematic approaches for the analysis of cellular physiology that have been reviewed elsewhere (Covert et al., 2008, Medini et al., 2008, Reed et al., 2006a, Young et al., 2008). However, constraint-based reconstruction and analysis (COBRA), a mathematical framework for integrating sequence data with a plethora of experimental ‘omics’ data has been shown to be successful in the genome-wide analysis of cellular physiology (Becker et al., 2007, Becker and Palsson, 2008, Feist et al., 2009). In addition, this approach has also been utilized to explore the metabolic potential, as well as the gene essentiality analysis of several organisms across different kingdoms of life (Heinemann et al., 2005, Joyce and Palsson, 2008, Kim and Lee, 1999, Nookaew et al., 2008, Schilling et al., 2002, Teusink et al., 2006); however, the COBRA approach has not yet been implemented for Dehalococcoides, or any other known dechlorinating bacterium.

Using acetate as a carbon source and hydrogen as an electron donor, the small, disc-shaped anaerobic bacteria Dehalococcoides are capable of dehalogenating a variety of halogenated organic compounds as electron acceptors, many of which are problematic ground water pollutants (El Fantroussi et al., 1998, Haggbloom and Bossert, 2003, Holliger et al., 1998b, Smidt and de Vos, 2004). Dehalococcoides mccartyi strain 195 (strain 195) is the first member of this phylogenetic branch that was grown as an isolate (Maymó-Gatell et al., 1997). Subsequently, a number of Dehalococcoides strains were isolated: strain CBDB1 (Adrian et al., 2000b), strain BAV1 (He et al., 2003), strain FL2 (He et al., 2005), strain GT (Sung et al., 2006), and strain VS (Cupples et al., 2003). The strains respire through a membrane-bound electron transport chain (ETC) (Hölscher et al., 2003, Jayachandran et al., 2004, Nijenhuis and Zinder, 2005), which is incompletely defined. Reductive dehalogenases (RDases), encoded by reductive dehalogenase homologous (rdh) genes, are pivotal membrane-associated enzymes of the ETC (Hölscher et al., 2003, Jayachandran et al., 2004, Nijenhuis and Zinder, 2005). Genome sequencing has revealed the presence of multiple non-identical putative rdh genes in each strain (Hölscher et al., 2004, Kube et al., 2005, Seshadri et al., 2005, Waller et al., 2005). Since these microbes respire chlorinated pollutants by the RDase-catalyzed reductive dechlorination reaction, rdh genes determine a significant part of Dehalococcoides’ phenotypes. Functional characterization of only 5 of the over 190 rdh genes reveals that cobalamin, a corrinoid compound, is an essential cofactor for the corresponding RDases (Adrian et al., 2007b, Cupples et al., 2004, Krajmalnik-Brown et al., 2004, Magnuson et al., 2000, Magnuson et al., 1998). Hydrogenase (H2ase) is another key enzyme of the Dehalococcoides ETC (Jayachandran, 2004, Kube et al., 2005, Nijenhuis and Zinder, 2005, Seshadri et al., 2005). Interestingly, the genomes of Dehalococcoides strains encode 5 different types of H2ases: membrane-bound hup, ech, hyc, hym, and cytoplasmic vhu (Kube et al., 2005, Morris et al., 2007, Morris et al., 2006, Seshadri et al., 2005). The presence of multiple types of H2ases clearly emphasizes the importance of H2 in their energy metabolism (Adrian et al., 2000a, Adrian et al., 2000b, He et al., 2005, He et al., 2003, Maymó-Gatell et al., 1997). This multiplicity of H2ases and RDases further highlights redundancy in the organisms’ energy conservation process that may ensure a rapid and efficient response of their energy metabolism towards changing growth conditions (Meyer, 2007, Vignais et al., 2001).

In addition to RDase and H2ase, the ETC likely requires an in vivo electron carrier to mediate electron transport between these enzymes. Previous studies have shown that the reductive dechlorination reaction requires an in vivo electron donor of redox potential (E0’) ≤ -360 mV (Hölscher et al., 2003, Nijenhuis and Zinder, 2005), similar to other dechlorinating bacteria (Holliger et al., 1998b, Krasotkina et al., 2001, Miller et al., 1997). The cob(II)alamin of the corrinoid cofactor in the RDase enzyme is reduced to cob(I)alamin during the reductive dechlorination reaction, which necessitates a low-potential donor because the redox potential (E0’) of the Co(II)/Co(I) couple is between -500 and -600 mV (Banerjee and Ragsdale, 2003, Holliger et al., 1998b, Krasotkina et al., 2001). While quinones such as menaquinone or ubiquinone could act as electron carriers in anaerobes (Kröger et al., 2002, Louie and Mohn, 1999, Schumacher and Holliger, 1996), experimental evidence suggests this is not the case in Dehalococcoides (Jayachandran et al., 2004, Nijenhuis and Zinder, 2005). Moreover, the redox potentials for quinones (menaquinone ox/red E0’ = -70 mV and ubiquinone ox/red E0’ = +113 mV (Thauer et al., 1977)) are not compatible with the RDases’ requirement of a low-potential donor. Furthermore, cytochrome b, a typical donor for the quinones to participate in the redox reactions of anaerobic ETCs (Dross et al., 1992, Menon, 1992), appears to be absent in the genomes of Dehalococcoides (Kube et al., 2005, Seshadri et al., 2005). However, the genomes have ferredoxin, an iron-sulphur protein, which can act as the low-potential donor for RDases because ferredoxin is the most electronegative electron carrier yet found in the bacterial ETCs (Bruschi and Guerlesquin, 1988, Eisenstein and Wang, 1969, Miller et al., 1997, Sterner, 2001, Thauer et al., 1977, Valentine and Wolfe, 1963, Valentine, 1964).
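
The incompatibility of the quinones with a low-potential acceptor can be made concrete with the standard relation ΔG°’ = −n·F·ΔE0’. The sketch below is only an illustration: it uses the quinone potentials and the -360 mV threshold quoted above, takes -500 mV (the upper end of the quoted Co(II)/Co(I) range) as an assumed acceptor potential, and assumes a one-electron transfer. The function name `delta_g_kj` is hypothetical.

```python
FARADAY = 96485.0  # Faraday constant in C/mol


def delta_g_kj(e_donor_mv, e_acceptor_mv, n_electrons=1):
    """Standard free energy change (kJ/mol) for transferring n electrons
    from a donor couple to an acceptor couple: dG = -n * F * (E_acc - E_don)."""
    delta_e_volts = (e_acceptor_mv - e_donor_mv) / 1000.0
    return -n_electrons * FARADAY * delta_e_volts / 1000.0


DONOR_THRESHOLD_MV = -360.0   # required donor potential quoted in the text
CO_COUPLE_MV = -500.0         # assumed: upper end of the Co(II)/Co(I) range

# Quinone midpoint potentials quoted in the text (Thauer et al., 1977).
carriers = {"menaquinone": -70.0, "ubiquinone": 113.0}

results = {}
for name, e0 in carriers.items():
    meets_threshold = e0 <= DONOR_THRESHOLD_MV
    dg = delta_g_kj(e0, CO_COUPLE_MV)
    results[name] = (meets_threshold, dg)
    print(f"{name}: E0' = {e0:+.0f} mV, meets threshold: {meets_threshold}, "
          f"dG0' to Co(II)/Co(I) = {dg:+.1f} kJ/mol")
```

A positive ΔG°’ indicates that electron transfer from the carrier to the Co(II)/Co(I) couple is thermodynamically unfavourable under standard conditions, which is the sense in which the quinones are "not compatible" with the RDases' requirement.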

Although Dehalococcoides are capable of harnessing free energy from the RDase-catalyzed exergonic reductive dechlorination reactions by coupling them to ATP generation for growth (Holliger et al., 1998b, Smidt and de Vos, 2004), their pure culture growth is much less robust than their growth in mixed cultures (Adrian et al., 2000a, Cupples et al., 2004, Duhamel and Edwards, 2007); even in mixed cultures, their growth yield is not as high as that predicted from the free energy of reductive dechlorination (Jayachandran, 2004, Jayachandran et al., 2004). Thus, in order to better understand dechlorination metabolism, and given that the Dehalococcoides genomes sequenced to date are more than 85% identical at the amino acid level (Krajmalnik-Brown et al., 2006, Morris et al., 2006), we developed a pan-genome-scale constraint-based in silico metabolic model of Dehalococcoides. The model was constructed from the complete genome sequences of 4 geographically distinct strains: strain CBDB1 from the Saale river near Jena, Germany (Adrian et al., 1998, Nowak et al., 1996), strain BAV1 from Oscoda, Michigan, USA (He et al., 2002, Lendvay et al., 2003), strain 195 from a wastewater treatment plant in Ithaca, New York, USA (Distefano et al., 1991, Freedman and Gossett, 1989, Maymó-Gatell et al., 1997), and strain VS from Victoria, Texas, USA (Cupples et al., 2004, Cupples et al., 2003). Although the model comprises multiple genomes, it analyzed the outcome of metabolic genes only. Also, it did not include information about cellular regulation due to the lack of adequate knowledge about Dehalococcoides regulatory networks. Nonetheless, the model was primarily used to investigate the intrinsic metabolic limitations, in addition to addressing open questions regarding Dehalococcoides physiology, such as the incomplete nature of various metabolic pathways, and attendant implications on metabolism and growth. We also identified the environmental conditions from the model simulations that resulted in faster in silico growth of Dehalococcoides. Furthermore, the constraint-based model, along with the comparative analysis of 4 genomes, clarifies both similarities and differences among the strains in terms of their core metabolism and other biosynthetic processes, leading to an improved understanding of metabolism and evolution in Dehalococcoides.

3.3. Materials and methods

3.3.1. Dehalococcoides pan-genome

In order to develop the pan-genome of Dehalococcoides, we obtained strain CBDB1 genome sequence from JCVI (http://cmr.jcvi.org/tigr-scripts/CMR/CmrHomePage.cgi) while strain 195 and strain BAV1 genome sequences were downloaded from the IMG database (http://img.jgi.doe.gov/cgi-bin/pub/main.cgi). Strain VS genome sequence was obtained from Alfred Spormann at Stanford University, CA. The genome sequences were compared using OrthoMCL (Li et al., 2003), a widely accepted method for finding orthologs across different genomes (Chen et al., 2007). OrthoMCL is based on reciprocal best BLAST hit (RBH), but recognizes co-orthologous groups using a Markov graph clustering (MCL) algorithm (Van Dongen, 2000). The Dehalococcoides pan-genome was developed following a previously described approach (Tettelin et al., 2005, Tettelin et al., 2008) outlined in Figures A1-A4 in Appendix A and in the following section.
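
The reciprocal best hit idea underlying this comparison can be illustrated with a few lines of code. This is a simplified sketch and not OrthoMCL itself, which additionally normalizes scores and applies MCL clustering to recover co-orthologous groups; the gene names, the similarity scores, and the function names below are all hypothetical.

```python
def best_hits(scores):
    """Map each query gene to its highest-scoring subject gene.

    scores: dict keyed by (query, subject) with a similarity score as value.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return gene pairs (a, b) that are each other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical BLAST-like scores between genome A and genome B.
a_vs_b = {("a1", "b1"): 250.0, ("a1", "b2"): 80.0,
          ("a2", "b2"): 190.0, ("a3", "b2"): 60.0}
b_vs_a = {("b1", "a1"): 245.0, ("b2", "a2"): 185.0, ("b2", "a3"): 55.0}

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

Here gene a3 finds b2 as its best hit, but b2 prefers a2, so a3 is not reported as an ortholog; such one-sided cases are where a clustering step can recover additional (co-)orthologs.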

First, we identified putative orthologs between a reference genome and a subject genome, which were selected arbitrarily from the 4 genomes compared. The analysis was conducted with OrthoMCL, keeping the parameters of the algorithm at their default settings. Subsequently, those genes that were present only in subject genome 1 were identified and combined with the reference genome to create the augmented genome 1 (Figure A1 in Appendix A). Then, the augmented genome 1 was compared and analyzed with subject genome 2 as described above to construct the augmented genome 2. The pan-genome was obtained by comparing the augmented genome 2 and subject genome 3. The number of genes in a pan-genome was reported to depend on both the order of genomes analyzed and the reference genome (Tettelin et al., 2005); hence, we constructed 6 pan-genomes for 6 different genome-order combinations. Of these 6 pan-genomes, we selected the one with the highest number of genes (2061) as the Dehalococcoides pan-genome in order to capture the entire gene repertoire of Dehalococcoides species (Muzzi et al., 2007). We also identified the core, dispensable, and unique genomes for the Dehalococcoides pan-genome by modifying the previously described methods (Medini et al., 2008, Tettelin et al., 2005), as detailed in the supplemental text in Appendix A.
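The iterative augmentation described above can be illustrated with a minimal Python sketch. It is not the OrthoMCL workflow: the ortholog test is passed in as a function, and the toy test used here (matching a hypothetical family label after the "|" in each gene name) is an assumption made only so the example runs. The enumeration of genome orders (all permutations of the subject genomes for a fixed reference) is likewise an assumption and need not correspond to the six combinations used in this work.

```python
from itertools import permutations


def build_pan_genome(reference, subjects, has_ortholog):
    """Augment the reference with subject genes that lack an ortholog so far."""
    augmented = list(reference)
    for subject in subjects:
        # Genes present only in the subject genome are appended,
        # giving the next "augmented genome".
        new_genes = [g for g in subject if not has_ortholog(g, augmented)]
        augmented.extend(new_genes)
    return augmented


def largest_pan_genome(reference, subjects, has_ortholog):
    """Build one pan-genome per subject order and keep the largest one."""
    candidates = (
        build_pan_genome(reference, order, has_ortholog)
        for order in permutations(subjects)
    )
    return max(candidates, key=len)


def same_family(gene, genome):
    """Toy ortholog test (hypothetical): genes match if their family labels match."""
    family = gene.split("|")[1]
    return any(other.split("|")[1] == family for other in genome)


# Hypothetical toy genomes; labels are "<genome>|<family>".
reference = ["ref|f1", "ref|f2"]
subjects = [["s1|f2", "s1|f3"], ["s2|f3", "s2|f4"]]
pan = largest_pan_genome(reference, subjects, same_family)
print(len(pan), sorted({g.split("|")[1] for g in pan}))
```

With a real ortholog caller the result can depend on the order of comparison, which is why the sketch keeps the order loop explicit rather than taking a simple set union.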

3.3.2. Reconstructing the metabolic network of Dehalococcoides

The pan-genome was used to reconstruct the pan-genome-scale metabolic network, and the constraint-based model of Dehalococcoides metabolism was developed from this reconstruction. Since the strains of Dehalococcoides share a high degree of sequence identity, we arbitrarily chose the strain CBDB1 genome as a reference and constructed the metabolic network from its annotated genome sequence (Kube et al., 2005), publications regarding its physiology, and various genomic and biochemical databases (Feist et al., 2009). Then, we added to the reconstructed network the other metabolic genes from the pan-genome that were missing from the strain CBDB1 genome. Five gene correspondence tables for the four genomes were prepared (Tables S3-S7 in Table A15 in Appendix A) to facilitate gene identification and cross-reference regardless of the genome of interest. We developed and manually curated the reconstructed network using the procedures described previously (Covert et al., 2001, Feist et al., 2009, Francke et al., 2005, Reed et al., 2006a, Thiele and Palsson, 2010a) with the SimPheny platform (Genomatica Inc., San Diego, CA). Since genome annotations are error prone (Devos and Valencia, 2001), annotated genes of strain CBDB1, as well as the pan-genome genes with defined metabolic functions, were verified by identifying their homologs in other well characterized organisms, including Escherichia coli, Bacillus subtilis, Geobacter sulfurreducens, and Saccharomyces cerevisiae, with BLAST (Altschul et al., 1997). Subsequently, confidence levels were assigned based on the degree of sequence identities or reciprocal best BLAST hits. Dehalococcoides genes, for instance, having > 40% amino acid sequence identity with homologs in the protein databases (SWISSPROT (Boeckmann et al., 2003), IMG (Markowitz et al., 2012), PDB (Berman et al., 2000), GO (Ashburner et al., 2000)) were given a confidence level of 3, and genes with > 30% and < 30% identity were assigned confidence levels of 2 and 1, respectively. In addition, these genes were also evaluated on the basis of gene order or conserved synteny (Markowitz et al., 2012), along with phylogenetic analysis with updated versions of biological databases, such as UniProt (Apweiler et al., 2012), IMG, GO, and PDB. Afterwards, both elementally and charge balanced biochemical reactions were assigned to the genes to create the gene-protein-reaction (GPR) associations (Reed et al., 2006a). These reactions were further verified by biochemical literature as well as enzyme databases, such as KEGG (Kanehisa et al., 2011), BRENDA (Chang et al., 2009), MetaCyc (Caspi et al., 2012), and ENZYME (Bairoch, 2000). In some instances, genes required for some biosynthetic reactions essential for producing all the precursor metabolites for cell biomass were not identified. Such reactions (21 in number, detailed in Table S1 in Table A15 in Appendix A) were added to the reconstructed network as non-gene associated reactions.
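The identity-based confidence levels mentioned above (> 40% gives level 3, > 30% gives level 2, < 30% gives level 1) can be written as a small Python function. This is only a sketch of the identity rule: the function name is hypothetical, the treatment of the boundary values (exactly 30% or 40%) is an assumption because the text does not specify it, and the reciprocal best BLAST hit, synteny, and phylogenetic evidence also used in the curation are not represented.

```python
def confidence_level(percent_identity):
    """Map amino acid sequence identity (%) to an annotation confidence level."""
    if percent_identity > 40:
        return 3
    if percent_identity > 30:
        return 2
    # Assumption: values of exactly 30% fall into the lowest level.
    return 1


# Identities quoted elsewhere in this chapter for putative TCA-cycle genes.
for identity in (26, 31, 33):
    print(identity, confidence_level(identity))
```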

3.3.3. Estimation of biomass composition and maintenance energy requirements

The biomass composition (dry basis) of 1 gram of Dehalococcoides cells was calculated from various published and experimental data, and expressed in mmol (millimoles)/g DCW (dry cell weight) (Tables A1-A6 in Appendix A). Due to the lack of detailed experimental data on the cellular composition of Dehalococcoides, the weight fractions of protein, lipid, carbohydrate, soluble pools and ions of the cell were estimated from the published genome-scale model of Methanosarcina barkeri (Feist et al., 2006). We chose to use data from the model of M. barkeri, an archaeon, because Dehalococcoides cells are enclosed by an archaeal S-layer-like protein instead of a typical bacterial cell wall (Adrian et al., 2000b, He et al., 2003, Maymó-Gatell et al., 1997). The weight percent of DNA was estimated from the cell morphology, length of the genome sequence (Borodina et al., 2005), and molar mass of the DNA, while the weight percent of RNA was calculated from the experimental data on a Dehalococcoides-containing mixed microbial culture (see supplemental text in Appendix A for details). In addition, the detailed composition of each macromolecule, as well as the composition of cofactors and other soluble pools and ions, is presented in Tables A1-A6 in Appendix A. The distribution of amino acids, nucleotides and cofactors in the biomass was calculated from the data reported previously (Neidhardt et al., 1990, Pramanik and Keasling, 1997), while the weight fractions of different fatty acids were estimated from White et al. (2005). These compositions were then integrated into the model as a biomass synthesis reaction, BIO_DHC_DM_61 (see the supplemental text in Appendix A for additional details).

Maintenance energy accounts for the ATP requirements of cellular processes, such as turnover of the amino acid pools, polymerization of cellular macromolecules, and ion transport, which are not included in the biomass synthesis reaction (Pirt, 1965, Pirt, 1982, Russell and Cook, 1995). These ATP requirements can be either growth associated (GAM), i.e., related to assembly and polymerization of macromolecules (e.g., proteins, DNA, etc.), or non-growth associated (NGAM), i.e., related to maintaining the membrane potential for keeping cellular integrity (Neijssel et al., 1996, Pirt, 1965, Pirt, 1982). Due to the lack of experimental chemostat data required for calculating both maintenance parameters (Varma and Palsson, 1994), the NGAM for a Dehalococcoides cell (1.8 mmol ATP·gDCW⁻¹·h⁻¹) was calculated from the experimental decay rate (0.09 day⁻¹) (Cupples et al., 2003) and the average of pure-culture experimental growth yields (0.69 g DCW/eeq; Table A7 in Appendix A) following the procedures described previously (Pirt, 1965, Russell and Cook, 1995). The GAM was estimated by regression analysis, using an initial estimate of 26 mmol ATP/g DCW for a typical bacterial cell (Table A9 in Appendix A) (Neidhardt et al., 1990). The initial estimate of GAM and the calculated NGAM were then used to simulate (using flux balance analysis, described below) the average of reported pure-culture experimental growth rates (0.014 h⁻¹; Table A8 in Appendix A). A GAM of 61 mmol ATP/g DCW gave the best prediction of the experimental growth rate.
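As a rough illustration of how a decay rate and a growth yield combine into a maintenance requirement, the Python sketch below uses one common form of the Pirt relation, m = b / Y, where b is the decay rate and Y the growth yield. Whether exactly this form was used here is an assumption; the full calculation is in Appendix A. Converting the result to ATP units needs an ATP yield per electron equivalent, which is not stated in this section, so `atp_per_eeq` is left as a hypothetical parameter with no value supplied.

```python
def maintenance_coefficient(decay_rate_per_day, yield_gdcw_per_eeq):
    """Pirt-type maintenance coefficient m = b / Y, in eeq per gDCW per hour."""
    decay_rate_per_hour = decay_rate_per_day / 24.0
    return decay_rate_per_hour / yield_gdcw_per_eeq


def ngam_mmol_atp(m_eeq_per_gdcw_h, atp_per_eeq):
    """Convert m to mmol ATP per gDCW per hour.

    atp_per_eeq (mol ATP per eeq) is a hypothetical input; it is not given
    in this section.
    """
    return m_eeq_per_gdcw_h * 1000.0 * atp_per_eeq


# Decay rate (0.09 per day) and average yield (0.69 gDCW/eeq) from the text.
m = maintenance_coefficient(0.09, 0.69)
print(f"m = {m:.5f} eeq per gDCW per hour")
```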

3.3.4. In silico analysis of Dehalococcoides metabolism

Flux Balance Analysis (FBA) relies on the imposition of a series of constraints, including stoichiometric mass balance constraints derived from the metabolic network, thermodynamic reversibility constraints, and any available enzyme capacity constraints (Price et al., 2004, Reed et al., 2003, Reed et al., 2006a). The imposition of these constraints leads to a linear optimization (Linear Programming, LP) problem formulated to maximize a cellular objective function such as the growth rate. Hence, the biomass synthesis reaction is assumed to be the objective function to be maximized to solve the LP problem in SimPheny. In addition, a number of reversible reactions were added to the network for exchanging external metabolites, such as acetate (ac), chloride (Cl⁻), carbon dioxide (CO2), and sulphate (SO4²⁻), to represent the in silico minimal medium (Table 3.2) for Dehalococcoides. Cobalamin is essential for Dehalococcoides growth, but they are unable to synthesize it de novo; hence, they salvage cobalamin from the medium. In order to analyze whether cobalamin flux can limit Dehalococcoides growth, we performed a robustness analysis on the cobalamin exchange reaction for different weight fractions of cobalamin in the biomass. We also simulated growth rates by incorporating all the pathways required for de novo cobalamin synthesis in iAI549 to analyze the cost of cobalamin synthesis and its effect on Dehalococcoides growth. Finally, to identify whether the growth of Dehalococcoides was carbon or energy limited, growth simulations were conducted by varying acetate fluxes and energy transfer efficiencies, since acetate and H2 are the carbon and energy sources of these microbes, respectively (Adrian et al., 2000b, He et al., 2003, Maymó-Gatell et al., 1997). Energy transfer efficiencies were calculated by normalizing the ATP fluxes to the maximum ATP that could be generated from H2 based on the Gibbs free energy of H2 oxidation and the energetic cost of ATP synthesis (mol ATP/mol H2) (see Table A12 and supplemental text in Appendix A for additional details). The constraint set used to simulate Dehalococcoides growth is listed in Table S18 in Table A14 in Appendix A, and the SBML file for the reconstructed network (iAI549) is presented in Table A14.
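For reference, the optimization problem and the efficiency normalization described above can be written compactly as follows. The symbols are assumptions introduced here for illustration and are not taken from the thesis: S is the stoichiometric matrix, v the vector of reaction fluxes, v_bio the flux through the biomass synthesis reaction, v^min and v^max the reversibility and capacity bounds, and Y^max the maximum ATP yield per mole of H2.

```latex
\begin{aligned}
\max_{v}\quad & v_{\mathrm{bio}} \\
\text{subject to}\quad & S\,v = 0, \\
& v_j^{\min} \le v_j \le v_j^{\max} \quad \text{for every reaction } j,
\end{aligned}
\qquad\qquad
\eta = \frac{v_{\mathrm{ATP}}}{Y^{\max}_{\mathrm{ATP/H_2}}\; v_{\mathrm{H_2}}}
```

Under this reading, an irreversible reaction has a lower bound of zero, an exchange reaction for a medium component has a negative lower bound, and the energy transfer efficiency η is the fraction of the thermodynamically available ATP that the simulated flux distribution actually produces.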

3.4. Results and discussion

3.4.1. Dehalococcoides metabolic network

3.4.1.1. Pan-metabolic-genes of Dehalococcoides

The concept of a pan-genome was first investigated by Tettelin and colleagues for 8 isolates of the common human pathogen Streptococcus agalactiae (Tettelin et al., 2005). While pan-genome analyses for other organisms have been reported (Tettelin et al., 2008), no such analysis has been performed to date for any dechlorinating bacterium, or any other microbe of bioremediation importance. In addition, most of the reported pan-genome analyses were conducted on pathogenic isolates for designing vaccines by assessing their virulence evolution and diversity (Tettelin, 2009). Here, we developed the Dehalococcoides pan-genome from the complete sequenced genomes of four Dehalococcoides strains. Method details are provided in the materials and methods, and also in the supplemental text (Figures A1-A4) in Appendix A. The pan-genome comprises 2061 genes (Figure 3.1). Of these 2061 genes, 1118 genes are in the core, 457 are dispensable, and 486 are unique (Figure 3.1). The genes are further classified as metabolic, non-metabolic, and hypothetical based on information obtained from the literature and various biochemical databases, such as SWISSPROT, UniProt, IMG, and PDB. We defined metabolic genes as those that are exclusively related to metabolic processes such as carbon and energy metabolism of Dehalococcoides. Genes that are involved in DNA repair and metabolism, as well as those encoding putative transposable elements and insertion elements (Kube et al., 2005, Seshadri et al., 2005), are classified as non-metabolic. Putative genes with a non-specific metabolic function or genes without any function or annotation are categorized as hypothetical (Figure 3.1).
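The core, dispensable, and unique partition can be illustrated with a short Python sketch based on the usual definitions: a gene family is core if it occurs in every strain, unique if it occurs in exactly one, and dispensable otherwise. These definitions are an assumption here, since the thesis states that the published methods were modified (see Appendix A), and the family labels in the example are hypothetical placeholders rather than entries from the actual pan-genome.

```python
from collections import Counter

STRAINS = ("195", "CBDB1", "BAV1", "VS")


def classify_families(presence, n_strains=len(STRAINS)):
    """Label each gene family as core, dispensable, or unique."""
    labels = {}
    for family, strains in presence.items():
        count = len(set(strains))
        if count == n_strains:
            labels[family] = "core"
        elif count == 1:
            labels[family] = "unique"
        else:
            labels[family] = "dispensable"
    return labels


# Hypothetical presence table: family -> strains in which it was found.
presence = {
    "family_A": {"195", "CBDB1", "BAV1", "VS"},
    "family_B": {"CBDB1", "BAV1"},
    "family_C": {"195"},
}
labels = classify_families(presence)
print(labels)
print(Counter(labels.values()))
```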

Figure 3.1. Composition of the Dehalococcoides pan-genome. Core, dispensable, and unique genomes are represented by blue, green, and orange, respectively. Genes in these genomes are also categorized as metabolic (spotted pattern), non-metabolic (plain), and hypothetical (grid pattern) on the basis of various bioinformatic analyses (see text for details).

Most of the metabolic genes (413 out of 549) were found in the core genome, while only a small number were identified in the dispensable (75) and unique (61) genomes. The abundance of core metabolic genes in the pan-genome indicates that the central metabolism of Dehalococcoides is very well conserved across strains, since core genes are shared by all. We further categorized the metabolic genes in the dispensable and unique genomes based on both function and strain (Figure 3.2). Clearly, the majority of the differences among the strains (45 out of the 75 dispensable genes and 47 out of the 61 unique genes) are due to the rdh genes (Figure 3.2). In addition, only strain 195 has nitrogen fixing genes and associated transporters related to the nitrogen fixation process. As a result of these genes, together with unique rdhs, strain 195 has the most unique genes of the 4 genomes compared. Due to the presence of a suite of multiple non-identical rdh genes, each strain metabolizes a unique set of specific chlorinated substrates (Adrian et al., 2000b, Bunge et al., 2003, Morris et al., 2007). Hence, the differences in rdh genes largely define the strain specific phenotypes of Dehalococcoides.

Figure 3.2. Distribution of dispensable and unique metabolic genes in different Dehalococcoides strains. Colors are assigned to further categorize the genes according to their function identified from annotation and verified by different bioinformatic analyses. Each color except black signifies the presence of a corresponding metabolic gene, while black indicates the absence of the corresponding gene. Genes belonging to amino acid metabolism, lipid metabolism and nucleotide metabolism are small in number; hence, they are included in the ‘other’ category. This heat map essentially describes the differences among Dehalococcoides strains in terms of their metabolic genes.

Though there were differences in rdh genes, most of these were found in the dispensable genome (Figure 3.2 and Table A15 in Appendix A) while only 9 rdhs (5 rdhA and 4 rdhB genes with >35% amino acid sequence identity) were shared by all strains and found in the core genome (Table A15 in Appendix A). Presence of the majority of rdh genes in the dispensable genome further supports the hypothesis that they were acquired through lateral gene transfer events (Krajmalnik-Brown et al., 2006, Kube et al., 2005, McMurdie et al., 2009).

3.4.1.2. Features of the Reconstructed Metabolic Network of Dehalococcoides

The reconstructed metabolic network of Dehalococcoides, denoted as iAI549 according to the established naming convention (Reed et al., 2003), accounted for 549 open reading frames (ORFs) or protein coding genes (27% of the total 2061 genes). Metabolic genes were identified from the genome annotations, which were verified with various bioinformatic analyses (see Materials and Methods). In addition, we annotated or revised the annotation for 70 ORFs based on information obtained from different biochemical databases (Table S2 in Table A15 in Appendix A provides a full list of reannotated genes). General features of the Dehalococcoides metabolic network (iAI549) are provided in Table 3.1. iAI549 includes 518 model reactions and 549 metabolites, where 497 reactions are gene associated and 21 (4%) are non-gene associated (Table 3.1). The non-gene associated reactions (Table A15 in Appendix A) were added in order to fill gaps in the reconstructed network based on simulations. Although no gene associations were identified for these reactions, we provided a list of core hypothetical genes (Table A15 in Appendix A) which potentially could contain genes associated with these reactions and are prime candidates for further biochemical testing. The network also comprises 36 exchange reactions, including one demand reaction called the biomass synthesis reaction (BIO_DHC_DM_61), to facilitate the transport of various metabolites into and out of the cell. The composition of the in silico minimal medium is shown in Table 3.2, while the detailed composition of BIO_DHC_DM_61 is available in the supplemental text in Appendix A. We further categorized the genes and reactions of iAI549 into 7 different functional categories or subsystems based on the associated metabolic pathways (Figures A6 and A7 in Appendix A). The differences among the strains are mainly observed in the energy metabolism category, which includes 51 dispensable and 54 unique metabolic genes, and most of these are rdhs (Figure A6 in Appendix A). However, almost all the reactions of iAI549 (96% of the total 518) are core, which again indicates that the basic central metabolism of Dehalococcoides is strictly conserved (Figure A7 in Appendix A). Although a number of dispensable metabolic genes are found in different subsystems, most of these genes are actually paralogs of the core metabolic genes. This relationship explains why, for example, there are 13 dispensable genes in the transport subsystem and 3 genes each in lipid and nucleotide metabolism, but no corresponding dispensable reactions (Figure A7 in Appendix A). Since rdhs were found in the core, unique and dispensable genomes, we assigned the reductive dechlorination reaction as a core reaction. Therefore, the truly unique metabolic reactions of iAI549 are the nitrogen fixing reaction (EC-1.18.6.1) and the molybdate (required for synthesizing the cofactor for the nitrogenase) transport reaction (TC-3.A.1.8), which belong to strain 195 only.

Table 3.1. General features of Dehalococcoides metabolic network (iAI549)

Genes
  Total number of genes                           2061
  Number of included genes                        549
  Number of excluded genes                        1512

Proteins
  Total number of proteins                        356

Intra-system Reactions
  Total number of model reactions                 518
  Gene associated model reactions                 497
  Non-gene associated model reactions             21

Exchange Reactions
  Total number of exchange reactions              36
  Input-output reactions                          35
  Demand reactions                                1

Metabolites
  Total number of metabolites                     549
  Number of extracellular metabolites             31
  Number of intracellular biomass metabolites     110

Table 3.2. Composition of the in silico minimal medium of Dehalococcoides

Exchange reaction                   Abbreviation  Equation
Acetate exchange                    EX_ac(e)      ac <==>
Vitamin B12 or cobalamin exchange   EX_cbl1(e)    cbl1 <==>
Chloride exchange                   EX_cl(e)      cl <==>
Carbon dioxide exchange             EX_co2(e)     co2 <==>
Proton exchange                     EX_h(e)       h <==>
Hydrogen exchange                   EX_h2(e)      h2 <==>
Water exchange                      EX_h2o(e)     h2o <==>
Dichlorobenzene exchange            EX_dcb(e)     dcb <==>
Ethene exchange                     EX_etl(e)     etl <==>
Tetrachloroethene exchange          EX_pce(e)     pce <==>
Hexachlorobenzene exchange          EX_hcb(e)     hcb <==>
Ammonium exchange                   EX_nh4(e)     nh4 <==>
Inorganic phosphate exchange        EX_pi(e)      pi <==>
Sulphate exchange                   EX_so4(e)     so4 <==>

Table 3.3. Comparison of various in silico genome-scale models with iAI549

In silico models                            iAI549         iRM588               iAF692        iAF1260    iYO844
(Organisms)                                 (D. mccartyi)  (G. sulfurreducens)  (M. barkeri)  (E. coli)  (B. subtilis)
Total reactions                             518            522                  619           2077       1020
Amino acid metabolism                       139            119                  150           198        207
Cofactor and prosthetic group biosynthesis  102            100                  153           162        83
Nucleotide metabolism                       83             58                   75            155        123
Lipid metabolism                            81             93                   46            522        126
Central carbon metabolism                   41             64                   72            252        196
Energy metabolism                           40             37                   41            90         41
Transport                                   32             51                   82            698        244

Furthermore, we compared iAI549 to a number of in silico genome-scale models of other Bacteria and Archaea (Table 3.3): iAF1260 for Escherichia coli (Feist et al., 2007), iYO844 for Bacillus subtilis (Oh et al., 2007), iRM588 for Geobacter sulfurreducens (Mahadevan et al., 2006), and iAF692 for Methanosarcina barkeri (Feist et al., 2006). We found that iAI549 had the lowest number of total reactions because of the limited scope of Dehalococcoides’ metabolism. In addition, these numbers also suggest that facultative anaerobes (E. coli and B. subtilis) are more versatile in their lifestyle and metabolism compared to obligate anaerobes (Dehalococcoides, Geobacter and Methanosarcina). These differences are further supported by the presence of a high number of transporters in iAF1260 and iYO844 compared to the presence of only 32 transporters in iAI549 (Table 3.3). A large number of reactions of iAI549 are involved in amino acid metabolism, since the genes for de novo synthesis of all the amino acids except methionine are present in the genomes (Kube et al., 2005, Seshadri et al., 2005). Also, iAI549 comprises only 41 reactions for the central carbon metabolism (glycolysis, gluconeogenesis, TCA-cycle, pentose phosphate pathway, and carbohydrate metabolism) compared to 252 reactions in iAF1260 (Table 3.3); an incomplete TCA-cycle and an inactive glycolysis pathway explain this low number for iAI549. Since Dehalococcoides lack a typical bacterial cell wall (Adrian et al., 2000b, He et al., 2003, Maymó-Gatell et al., 1997), iAI549 has only 81 reactions for the lipid metabolism category. Furthermore, the cofactor and prosthetic group biosynthesis category comprises 102 reactions in iAI549 compared to 162 reactions in iAF1260 (Table 3.3) because the pathways for synthesizing vitamin B12 and quinones are predicted to be incomplete in Dehalococcoides (Kube et al., 2005, Seshadri et al., 2005).
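A quick arithmetic check of Table 3.3 can be sketched in Python: for each model, the seven subsystem counts should add up to the reported total number of reactions. The values below are transcribed from Table 3.3 in the row order amino acid, cofactor and prosthetic group, nucleotide, lipid, central carbon, energy, and transport; the dictionary layout and function name are illustrative choices, not part of the thesis.

```python
# Reaction counts from Table 3.3; subsystem order follows the table rows.
TABLE_3_3 = {
    "iAI549": {"total": 518, "subsystems": [139, 102, 83, 81, 41, 40, 32]},
    "iRM588": {"total": 522, "subsystems": [119, 100, 58, 93, 64, 37, 51]},
    "iAF692": {"total": 619, "subsystems": [150, 153, 75, 46, 72, 41, 82]},
    "iAF1260": {"total": 2077, "subsystems": [198, 162, 155, 522, 252, 90, 698]},
    "iYO844": {"total": 1020, "subsystems": [207, 83, 123, 126, 196, 41, 244]},
}


def subsystems_match_total(entry):
    """Return True if the subsystem counts sum to the reported total."""
    return sum(entry["subsystems"]) == entry["total"]


for model, entry in TABLE_3_3.items():
    print(model, sum(entry["subsystems"]), subsystems_match_total(entry))
```

The same table also gives the transport share directly, for example 32 of 518 reactions in iAI549 against 698 of 2077 in iAF1260.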

3.4.2. Model-based simulations of Dehalococcoides physiology

3.4.2.1. Exploring the central metabolism of Dehalococcoides

The reconstructed network for glycolysis, gluconeogenesis, the TCA-cycle and the pentose phosphate pathway of iAI549 highlighted some of the key limitations of Dehalococcoides central metabolism. Although putative genes for glycolysis and gluconeogenesis were identified, no gene for a glucose or fumarate transporter was found in any of the genomes, explaining the inability of Dehalococcoides to use glucose or fumarate as a carbon source. The TCA-cycle of Dehalococcoides (Figure 3.3A) is incomplete, as previously reported (Kube et al., 2005, Seshadri et al., 2005). We could identify putative genes for 2-oxoglutarate synthase and succinyl-CoA synthetase (with 26% amino acid sequence identity to the Methanococcus jannaschii gene), and fumarate reductase/succinate dehydrogenase (with 31-33% amino acid sequence identity to the E. coli gene), but we could not find a gene encoding the citrate synthase (CS) in Dehalococcoides. In a scenario without CS, carbon assimilation could occur using a reductive TCA-cycle. However, the biosynthetic formation of citrate by Dehalococcoides ethenogenes strain 195 was recently demonstrated using 13C-labeled isotopomer experiments, although the gene encoding the putative Re-type CS enzyme was not identified (Tang et al., 2009b). The two Dehalococcoides genes that are most similar to the only biochemically characterized Re-type CS gene from Clostridium kluyveri DSM555 (Li et al., 2007) are annotated as isopropyl malate and homocitrate synthase; however, these genes share only 27% amino acid sequence identity with the CS gene from C. kluyveri. Hence, further experiments are required to establish the role of these genes, as well as the aforementioned putative TCA-cycle genes in Dehalococcoides. Nonetheless, these isotope labeling studies suggest the formation of 2-oxoglutarate from citrate through the oxidative branch of the TCA-cycle.

In order to analyze the effect of the presence of the CS reaction on Dehalococcoides growth, we conducted growth simulations with and without this reaction in iAI549 (Figure 3.4). Only a subtle difference in the growth rate (0.0137 h⁻¹ vs. 0.014 h⁻¹) and yield (0.72 gDCW/eeq vs. 0.71 gDCW/eeq) was observed (Figures 3.4A, 3.4B, and Table A14 in Appendix A). Hence, regardless of whether the TCA-cycle is oxidative (Figure 3.4B) or reductive (Figure 3.4A), the fact that it is incomplete explains why Dehalococcoides are unable to use acetate as their energy source. Interestingly, iAI549 has one anaplerotic reaction, pyruvate carboxylase (PC), which produces oxaloacetate from pyruvate (Figures 3.3A, 3.4A, and 3.4B). Generally, anaplerotic reactions generate intermediates of a TCA-cycle, but in the absence of a CS reaction, PC is essentially the sole pathway for producing oxaloacetate in the TCA-cycle of iAI549.

3.4.2.2. CO2-fixation by Dehalococcoides

Analysis of iAI549 also revealed the presence of a carbon fixation step via the pyruvate-ferredoxin oxidoreductase or pyruvate synthase (POR) enzyme encoded by 4 putative Dehalococcoides genes (gene numbers 181, 182, 183, 184; Table S13 in Table A15 in Appendix A). Anaerobes such as Geobacter sulfurreducens and Methanosarcina barkeri are also reported to utilize this step in their central metabolism (Bock et al., 1996, Mahadevan et al., 2006). POR is essential for the in silico growth of Dehalococcoides using iAI549 since it is the only pathway for producing pyruvate from acetate (Figures 3.3A, 3.4A, and 3.4B). Growth simulations of iAI549 further predict that 33% of the total moles of carbon fixed into the biomass is from extracellular CO2 via POR, and the balance (67%) is from extracellular acetate through acetyl-CoA synthetase (Figure 3.3B), clearly highlighting the important requirement for extracellular CO2 in addition to acetate as a carbon source for Dehalococcoides.
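The 67%/33% split follows from the carbon stoichiometry of pyruvate synthesis: POR joins the two-carbon acetyl group of acetyl-CoA (derived from acetate) with one carbon from CO2 to give the three-carbon pyruvate. The Python sketch below reproduces only this stoichiometric argument; it is not a simulation of iAI549, and the function and constant names are illustrative.

```python
from fractions import Fraction

ACETYL_CARBONS = 2  # carbons contributed by the acetyl group of acetyl-CoA
CO2_CARBONS = 1     # carbon contributed by CO2 in the POR reaction


def pyruvate_carbon_shares(acetyl_carbons=ACETYL_CARBONS, co2_carbons=CO2_CARBONS):
    """Return the molar carbon fractions of pyruvate from acetate and from CO2."""
    total = acetyl_carbons + co2_carbons
    return Fraction(acetyl_carbons, total), Fraction(co2_carbons, total)


from_acetate, from_co2 = pyruvate_carbon_shares()
print(f"from acetate: {float(from_acetate):.0%}, from CO2: {float(from_co2):.0%}")
```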

Moreover, the presence of both POR and carbon-monoxide dehydrogenase (CODHr) enzymes encoded by 4 putative genes of iAI549 (gene numbers 170, 171, 172, 174; Table S13 in Table A15 in Appendix A) initially suggested that the Wood-Ljungdahl pathway (Wood and Ljungdahl, 1991) of CO2 fixation might be active in Dehalococcoides. However, the absence of several key enzyme encoding genes, such as those for the methylenetetrahydrofolate reductase and a methyltransferase in the folate-dependent branch of the Wood-Ljungdahl pathway (Drake et al., 2008, Müller, 2003, Ragsdale, 2008), indicated that the pathway was incomplete in Dehalococcoides (Figure A5 in Appendix A). All of these observations are consistent with the carbon labeling studies by Tang et al. (2009b).

Figure 3.3. The reconstructed TCA-cycle and CO2 fixation pathway of Dehalococcoides. The arrows show the directionality of the reactions. (A) Grey: citrate synthase gene currently not identified in iAI549, but the pathway was suggested to be present in the carbon isotope labeling study (Tang et al., 2009b); Orange: pathways for which homologous putative genes (~30% amino acid sequence identity) were tentatively identified in Dehalococcoides, but are suggested to be absent by the carbon isotope labeling study (Tang et al., 2009b); Red: pathways for which putative genes are confirmed to be present by both iAI549 and the carbon isotope labeling study (Tang et al., 2009b). In all cases, the TCA-cycle of Dehalococcoides is not closed, which explains their inability to use acetate as an energy source. (B) Dehalococcoides’ requirement of CO2 in addition to acetate for their in silico growth. The numbers are flux values in mmol·gDCW⁻¹·h⁻¹. During pyruvate synthesis, Dehalococcoides require 67% carbon (molar basis) from acetate and 33% (molar basis) from CO2. Thus, Dehalococcoides fix carbon via the pyruvate-ferredoxin oxidoreductase or pyruvate synthase (POR) pathway.

Figure 3.4. Analysis of the citrate synthase (CS) reaction on Dehalococcoides growth. (A) In the absence of the CS reaction, the TCA-cycle operates reductively via succinyl-CoA synthetase and 2-oxoglutarate synthase for producing biomass precursors for Dehalococcoides to grow. (B) The oxidative TCA-cycle operates when the CS reaction is present, but succinyl-CoA synthetase and 2-oxoglutarate synthase are absent, as suggested by the carbon isotope labeling experiment (Tang et al., 2009b). However, Dehalococcoides growth remains almost unchanged with and without the CS reaction (0.0137 h⁻¹ vs. 0.014 h⁻¹), as represented by the flux values obtained from the growth simulations of iAI549.

3.4.3. Energy conservation process of Dehalococcoides

Dehalococcoides strains respire through a membrane-bound electron transport chain (ETC) (Hölscher et al., 2003, Jayachandran et al., 2004, Nijenhuis and Zinder, 2005), which is incompletely defined. In addition to RDase and hydrogenase (H2ase) enzymes, the ETC of Dehalococcoides requires an in vivo electron carrier to mediate electron transport between H2ase and RDase. The reductive dechlorination reaction requires an in vivo electron donor of redox potential (E0’) ≤ -360 mV (Hölscher et al., 2003, Jayachandran et al., 2004, Nijenhuis and Zinder, 2005), similar to other dechlorinating bacteria (Holliger et al., 1998b, Krasotkina et al., 2001, Miller et al., 1997). The cob(II)alamin of the corrinoid cofactor in the RDase enzyme is reduced to cob(I)alamin during the reductive dechlorination reaction, which necessitates a low-potential donor because the redox potential (E0’) of the Co(II)/Co(I) couple is between -500 and -600 mV (Banerjee and Ragsdale, 2003, Holliger et al., 1998b, Krasotkina et al., 2001). While quinones, such as menaquinone or ubiquinone, could act as electron carriers in anaerobes (Kröger et al., 2002, Louie and Mohn, 1999, Schumacher and Holliger, 1996), experimental evidence suggests this is not the case in Dehalococcoides (Jayachandran et al., 2004, Nijenhuis and Zinder, 2005). Moreover, the half reaction potentials for quinones (menaquinone ox/red E0’ = -70 mV, ubiquinone ox/red E0’ = +113 mV (Thauer et al., 1977)) are not compatible with the RDases’ requirement of a donor of E0’ ≤ -360 mV.

Therefore, we hypothesize that ferredoxin could be a low-potential electron donor for the RDase of Dehalococcoides because it is the most electronegative electron carrier yet found in the bacterial ETCs (Bruschi and Guerlesquin, 1988, Valentine, 1964). Various redox potentials have been reported for bacterial ferredoxins, including -417 mV at pH 7.55 for Clostridium pasteurianum (Valentine and Wolfe, 1963), -398 and -367 mV in the range of pH 6.13 to 7.41 for C. pasteurianum (Eisenstein and Wang, 1969, Thauer et al., 1977), -445 mV at pH 7 for Dehalospirillum multivorans (Miller et al., 1997), and -453 mV at pH 8 for Thermotoga maritima (Sterner, 2001). While these experimental data illustrate the differences in ferredoxin potential across microbes, they also support the putative role of ferredoxin as a low-potential electron carrier in the Dehalococcoides ETC. Furthermore, there was strong genomic evidence that the sequences of rdh genes contained two iron-sulfur cluster binding motifs, which are the characteristic motifs for bacterial ferredoxins (Hölscher et al., 2004). So far, the genomes of Dehalococcoides have 6 putative ferredoxin-encoding genes, but no gene was identified for a b-type cytochrome. Miller and colleagues (Miller et al., 1997) described a mechanism for the ETC of D. multivorans involving both H2ase and RDase enzymes, in which they proposed “reverse electron transport” and the requirement of both a low-potential and a high-potential electron carrier for the ETC. Recently, Thauer et al. (2008) suggested that the energy conservation process of methanogens without cytochromes (a system similar to Dehalococcoides) used a flavin-based “electron bifurcation” system where an endergonic reaction was driven by the energy from an exergonic reaction that took place simultaneously. A similar bifurcation mechanism was also proposed for the trimeric [Fe]-only H2ase of T. maritima (Schut and Adams, 2009). Based on the literature and considering the lack of information on the Dehalococcoides ETC, we propose the following simplified mechanism of energy conservation for its ETC (Figure 3.5).
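The redox argument in this section amounts to comparing each candidate carrier's potential with the RDase requirement of a donor at E0’ ≤ -360 mV. The Python sketch below applies that comparison to the values quoted in the text; the list layout, labels, and function name are illustrative, and the comparison ignores the pH differences between the reported measurements.

```python
RDASE_DONOR_LIMIT_MV = -360  # required donor potential: E0' <= -360 mV

# (carrier, reported E0' in mV), values as quoted in the text.
CARRIERS = [
    ("menaquinone", -70),
    ("ubiquinone", 113),
    ("ferredoxin, C. pasteurianum, pH 7.55", -417),
    ("ferredoxin, C. pasteurianum, pH 6.13-7.41 (upper)", -367),
    ("ferredoxin, C. pasteurianum, pH 6.13-7.41 (lower)", -398),
    ("ferredoxin, D. multivorans, pH 7", -445),
    ("ferredoxin, T. maritima, pH 8", -453),
]


def is_compatible_donor(potential_mv, limit_mv=RDASE_DONOR_LIMIT_MV):
    """A carrier qualifies if its potential is at or below the limit."""
    return potential_mv <= limit_mv


compatible = [name for name, e0 in CARRIERS if is_compatible_donor(e0)]
for name, e0 in CARRIERS:
    print(f"{name}: {e0} mV, compatible = {is_compatible_donor(e0)}")
```

On these numbers every reported ferredoxin passes the threshold and both quinones fail it, which is the basis for the ferredoxin hypothesis above.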

Figure 3.5. A tentative scheme for the D. mccartyi electron transport chain (ETC). Dehalococcoides grow by conserving the free energy of the reductive dechlorination reaction (RCl + FdRed + 2H⁺ → RH + FdOx + Cl⁻ + H⁺) through the membrane-bound ETC. During this process, the donor H2 likely reduces the putative electron carrier oxidized ferredoxin (FdOx), and the reduced ferredoxin (FdRed) transfers 2e⁻ to the terminal electron acceptors, chlorinated ethenes or benzenes (RCl), via cob(II)alamin to produce lower chlorinated compounds or ethene (RH) and hydrogen chloride (HCl). Reduction of ferredoxin and of the electron acceptors is catalyzed by the H2ase and RDase enzymes, respectively, and 2 protons (H⁺) are consumed from the cytoplasm during the reductive dechlorination reaction. Hence, the proton translocation stoichiometry of the Dehalococcoides ETC is 2H⁺/2e⁻ or 1 H⁺/e⁻.

We assumed that the H2ase of Dehalococcoides reduced ferredoxin (FdOx) in a similar process as described for M. barkeri (Deppenmeier, 2002, Deppenmeier, 2004, Hedderich, 2004, Thauer et al., 2008). Subsequently, the reduced ferredoxin (FdRed) transferred 2 electrons to the terminal electron acceptors such as chloroethenes or chlorobenzenes (RCl) via cob(II)alamin, and cob(II)alamin was reduced to cob(I)alamin while RCl was reduced to lower chlorinated compounds or ethenes (RH). Alternatively, the endergonic reduction of ferredoxin (FdOx) with H2 could be coupled to the exergonic reduction of RCl with reduced ferredoxin (FdRed), in which the latter reaction was catalyzed by the RDase in a similar manner as the electron bifurcation scheme. This might be possible because a corrinoid protein, like a flavo-protein, could also be a site for electron bifurcation (R. K. Thauer, personal communication). In either case, we assumed the uptake of two protons (2H⁺) from the cytoplasm during the transfer of two electrons (2e⁻) from the donor H2 to the acceptor RCl, resulting in a net proton translocation stoichiometry of 1 H⁺ per e⁻ (Figure 3.5).

3.4.4. Implications of the incomplete cobalamin synthesis pathway in Dehalococcoides

Cobalamin or vitamin B12 is essential for RDase activity; however, the pathway for producing cobalamin is incomplete in Dehalococcoides (Kube et al., 2005, Seshadri et al., 2005) (Figure 3.6). The complete de novo biosynthesis (aerobic or anaerobic) of vitamin B12 requires around 30 genes (Warren et al., 2002), of which only 18 are identified in Dehalococcoides. Seven (7) of these genes belong to the “anaerobic” pathway while 2 are found to be involved in the “aerobic” pathway of cobalamin biosynthesis. Several key enzyme-encoding genes required for precorrin ring formation, cobalt insertion, and methylation were not found in Dehalococcoides genomes (Figure 3.6). However, 7 genes of iAI549 (3 core, 1 dispensable, and 3 unique genes: 161, 162, 163, 433, 524, 525, 526; Tables S3-S6 in Table A15 in Appendix A) that encode a putative cobalamin transporter were identified, indicating that Dehalococcoides could take up vitamin B12 from the medium in the form of either cobinamide or cobalamin (Escalante-Semerena, 2007). In fact, vitamin B12 has been shown to be required for the growth of pure cultures, and its addition to the medium has been reported to enhance the growth rate of Dehalococcoides (He et al., 2007).

Figure 3.6. Reconstructed cobalamin biosynthesis pathway of Dehalococcoides. Dashed orange lines indicate the cell membrane, grey lines indicate missing pathways, and red lines indicate existing pathways for which putative genes were identified in Dehalococcoides during the reconstruction of iAI549. Arrows denote the directionality of the reactions. Since the genomes encode a putative cobalamin transporter, Dehalococcoides may salvage vitamin B12 from the environment in the form of either cobinamide or cobalamin, as indicated by the ‘cobinamide transport’ and ‘cobalamin transport’ reactions in the figure. Adenosylcobalamin, which is the end product of the entire pathway, is a biomass constituent and is assumed to take part in Dehalococcoides cell formation.

Therefore, to examine the influence of cobalamin on the growth of Dehalococcoides, we conducted growth simulations (Figure 3.7) for two scenarios using iAI549: 1) Dehalococcoides growth rate as a function of the weight fraction of cobalamin in the biomass and the cobalamin salvage rate from the medium (Figure 3.7A), and 2) Dehalococcoides growth yield assuming it could synthesize its own cobalamin (i.e., adding to iAI549 all the reactions required for de novo cobalamin synthesis) compared to the yield when B12 is salvaged from the medium (Figure 3.7B). Predictably, the growth rate decreases to zero at low cobalamin salvage rates (Figure 3.7A). Also, the cobalamin salvage rate at which metabolism becomes limited by vitamin B12 is a strong function of the cobalamin fraction in the biomass, which has never been experimentally measured for Dehalococcoides (Figure 3.7A). From the second simulation, it is clear that the energetic cost of synthesizing cobalamin de novo is small, since the predicted yields with and without a cobalamin synthesis pathway are almost identical (Figure 3.7B). Only if one assumes a biomass cobalamin fraction 10 times higher than the maximum reported is a small (4%) reduction in the growth yield (from 0.72 gDCW/eeq to 0.69 gDCW/eeq) predicted as a penalty for synthesizing cobalamin de novo (Figure 3.7B and Table A13 in Appendix A). This low synthesis cost, together with the fact that cobalamin is essential yet its synthesis pathway is incomplete in Dehalococcoides, suggests that Dehalococcoides might have evolved syntrophically with cobalamin secreters and never faced significant evolutionary pressure to acquire a complete cobalamin synthesis pathway in their genomes.
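
The dependence of growth rate on salvage rate and biomass cobalamin fraction described above follows from a steady-state mass balance: if each gram of biomass contains a fixed amount of cobalamin, the salvage flux must equal that amount times the growth rate. The following Python sketch illustrates only this balance and the 4% yield penalty quoted in the text. It is not iAI549; the function names, the growth ceiling set by other constraints, and all numerical inputs other than the two yields are hypothetical.

```python
def max_growth_rate(salvage_cap, b12_fraction, mu_other_limit):
    """Upper bound on growth rate (1/h) imposed by a cobalamin mass balance.

    salvage_cap    : maximum cobalamin salvage flux, mmol/gDCW/h (hypothetical)
    b12_fraction   : cobalamin content of the biomass, mmol/gDCW (hypothetical)
    mu_other_limit : growth rate permitted by all other constraints, 1/h
    """
    if b12_fraction <= 0:
        # No cobalamin demand, so cobalamin cannot be limiting.
        return mu_other_limit
    # Steady state: salvage flux = b12_fraction * growth rate.
    return min(mu_other_limit, salvage_cap / b12_fraction)


def relative_yield_loss(yield_ref, yield_new):
    """Fractional decrease of yield_new relative to yield_ref."""
    return (yield_ref - yield_new) / yield_ref


# Hypothetical salvage caps: growth is zero without salvage and rises
# linearly until another constraint takes over.
for cap in (0.0, 1e-6, 5e-6):
    print(cap, max_growth_rate(cap, b12_fraction=1e-4, mu_other_limit=0.02))

# Yield penalty quoted in the text (0.72 -> 0.69 gDCW/eeq), in percent.
print(round(100 * relative_yield_loss(0.72, 0.69), 1))
```

In this simplified picture, a larger biomass cobalamin fraction raises the salvage rate needed to escape cobalamin limitation, which matches the trend reported for Figure 3.7A.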

Figure 3.7. Influence of cobalamin on the growth rate and yield of Dehalococcoides. (A) Growth rate of Dehalococcoides is simulated as a function of both the cobalamin salvage rate and the cobalamin fraction in the biomass equation. It shows the role of cobalamin in limiting the growth rate of Dehalococcoides. The cobalamin uptake or salvage rate at which Dehalococcoides growth becomes limited increases with increasing cobalamin fraction in the biomass. (B) The cost of de novo cobalamin synthesis in terms of Dehalococcoides growth yield is compared (see text for details). The predicted yield of Dehalococcoides with and without the de novo cobalamin synthesis pathway remains almost identical for the reported maximum cobalamin fraction in the biomass. Even with a 10-fold increase of the cobalamin fraction in the biomass, the predicted yield decreased by only 4% (from 0.72 gDCW/eeq to 0.69 gDCW/eeq), indicating the low cost of de novo cobalamin synthesis.

3.4.5. Does carbon or energy limit the in silico growth of Dehalococcoides?

Growth of Dehalococcoides is more rapid in mixed microbial communities than in pure cultures (Adrian et al., 1998, Cupples et al., 2003, Duhamel and Edwards, 2007, Duhamel et al., 2002), although the reasons for this discrepancy are not entirely clear. The difference in reported growth yields between pure and mixed cultures is more significant (p = 0.0005 at 95% confidence level) than the difference in reported growth rates (p = 0.05 at 95% confidence level) (Tables A7 and A8 in Appendix A). Thus, to examine the growth-limiting conditions, we simulated Dehalococcoides growth yields (Figure 3.8A) under two different conditions: 1) allowing unlimited flux of amino acids in the medium at a hydrogen flux of 10 mmol.gDCW-1.h-1 (equivalent to the dechlorination rate obtained from average pure-culture growth yields and rates; Tables A7 and A8 in Appendix A), and 2) doubling the hydrogen flux (20 mmol.gDCW-1.h-1) without allowing any amino acid flux in the medium (Figure 3.8A). The first condition mimics a carbon-rich environment while the second one represents an energy-rich situation. The model predicts that adding an unlimited amount of any or all of the amino acids to the growth medium (obviating the need for the cell to synthesize these amino acids) increased the growth yield by a maximum of 55% (1.13 gDCW/eeq) compared to the case with no amino acids in the medium (0.72 gDCW/eeq) (Figure 3.8A). However, doubling only the hydrogen flux enhanced the growth yield by 65% (from 0.72 gDCW/eeq to 1.19 gDCW/eeq) (Figure 3.8A).

To further analyze this aspect of energy limitation, we simulated in silico growth yields of Dehalococcoides as a function of both acetate flux (carbon availability) and energy flux, represented by the energy transfer efficiency (Figure 3.8B). This analysis shows that the growth of Dehalococcoides is energy-limited but not carbon-limited, since growth yield increases in proportion to the energy transfer efficiency regardless of acetate flux. Moreover, the simulations also reveal that growth of Dehalococcoides in a pure culture corresponds to an energy transfer efficiency of only 30% (green arrow) compared to 65% (red arrow) in a mixed culture (Figure 3.8B). These simulations point towards the electron flux from hydrogen to the RDase as the rate-limiting step, which is somehow more efficient in mixed cultures. It is possible that interspecies hydrogen transfer, such as occurs in a mixed culture, is more direct than the supply of hydrogen through the medium (as for pure cultures). Electrons supplied from an electrode that was polarized to a very low potential were shown to stimulate Dehalococcoides metabolism (Aulenta et al., 2009), possibly illustrating such an effect; if true, these results suggest a mechanism for the enhanced growth of Dehalococcoides and for faster dechlorination of pollutants.
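
The distinction between energy-limited and carbon-limited growth can be illustrated with a two-constraint toy model in which growth is capped both by ATP supply (proportional to the energy transfer efficiency and the hydrogen flux) and by acetate supply. The Python sketch below is a simplified illustration under stated assumptions, not a reproduction of the iAI549 simulations; the function name and every stoichiometric coefficient are hypothetical placeholders.

```python
def toy_growth_yield(efficiency, h2_flux, acetate_flux,
                     atp_per_h2=1.0, atp_per_gdcw=50.0, acetate_per_gdcw=20.0):
    """Growth yield (gDCW per mmol H2) in a two-constraint toy model.

    efficiency       : fraction of the theoretical ATP actually conserved
    h2_flux          : H2 uptake, mmol/gDCW/h
    acetate_flux     : acetate uptake cap, mmol/gDCW/h
    atp_per_h2       : theoretical ATP per H2 (hypothetical value)
    atp_per_gdcw     : ATP needed per gram biomass, mmol/gDCW (hypothetical)
    acetate_per_gdcw : acetate needed per gram biomass, mmol/gDCW (hypothetical)
    """
    mu_energy = efficiency * atp_per_h2 * h2_flux / atp_per_gdcw  # ATP-limited
    mu_carbon = acetate_flux / acetate_per_gdcw                   # carbon-limited
    mu = min(mu_energy, mu_carbon)  # the tighter constraint sets the growth rate
    return mu / h2_flux


# While the energy constraint is the binding one, yield scales with
# efficiency and does not respond to additional acetate.
for acetate in (5.0, 50.0):
    print(acetate,
          toy_growth_yield(0.30, 10.0, acetate),
          toy_growth_yield(0.65, 10.0, acetate))
```

With these placeholder coefficients the acetate constraint is never binding, so the yield at 65% efficiency exceeds the yield at 30% efficiency by the ratio 0.65/0.30, whatever the acetate flux.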

Figure 3.8. Effect of carbon and energy sources on the growth yield of Dehalococcoides. (A) The experimental growth yield of Dehalococcoides in the minimal medium (0.69 gDCW/eeq) is compared with increased growth yields achieved by allowing unlimited fluxes of all amino acids at a H2 flux of 10 mmol.gDCW-1.h-1 (corresponding to the experimental dechlorination rate), as well as by doubling the H2 flux (20 mmol.gDCW-1.h-1). It shows that unlimited flux of amino acids (carbon source) increased the in silico growth yield of Dehalococcoides by 55%, whereas doubling the H2 flux (electron donor or energy source) alone enhanced the yield by 65%. (B) Analysis of the energy-limited growth of Dehalococcoides. Since the growth yield of Dehalococcoides varies linearly with the energy transfer efficiency, their yield can be improved by increasing the flux of their energy source or electron donor to generate more ATP per electron. However, the variation in acetate fluxes has no effect on growth yields. Red and green arrows show growth yields and corresponding efficiencies for Dehalococcoides growth in mixed and pure cultures, respectively. ‘MM’ = minimal medium; ‘Tyr’ = tyrosine; ‘Glu’ = glutamate; ‘Gln’ = glutamine; ‘Gly’ = glycine; ‘Ala’ = alanine; ‘Thr’ = threonine; ‘Asp’ = aspartate; ‘All AA’ = all amino acids; ‘2X H2 flux’ = 20 mmol H2.gDCW-1.h-1.

As described earlier, experimental studies clearly illustrate the favorable growth of Dehalococcoides in syntrophic microbial consortia compared to their isolated pure cultures. This points towards the existence of some undefined beneficial metabolic interactions among the consortia members. Although iAI549 simulations suggested more efficient electron transfer and energy utilization in a mixed culture, this result requires further experimental validation. Because these microbes harness energy for their growth from reductive dechlorination reactions, their increased growth is expected to accelerate the bioremediation process. Hence, the current challenge is to understand the reason behind their favorable growth in the mixed microbial communities prevailing in their natural habitat. A genome-scale metabolic model of a syntrophic community of dechlorinating bacteria, where Dehalococcoides are the dominant members, can therefore be useful for understanding the factors influencing their growth. This information may also help to develop a defined bacterial community with enhanced bioremediation capability, as well as strategies for exploiting these microbes for effective bioremediation of contaminated sites around the world.

3.5. Conclusions

Although genome-scale constraint-based models are available for several microbes from all three domains of life, iAI549 is the first such endeavor for dechlorinating bacteria. This constraint-based flux balance model is consistent with the specialized nature of Dehalococcoides metabolism. The model supports the idea that evolution of the chlorinated compound-specific rdh genes conferred the strain-specific metabolic phenotype on Dehalococcoides. In addition to cataloguing significant metabolic similarities among Dehalococcoides strains, the model also provides valuable insights regarding physiological and metabolic bottlenecks of these microbes. Reconstructed central metabolic pathways, for example, identified underlying reasons for Dehalococcoides’ requirement of a separate energy source in addition to a carbon source for growth, as well as a carbon fixation step. Also, growth simulations revealed the energy-limited rather than carbon- or cobalamin-limited growth of these organisms. In the process of developing the model, detailed tables of metabolic gene correspondences among 4 genomes, reannotations based on pathway analysis, and intrinsic kinetic and stoichiometric parameters were developed for the user community. We also created lists of core hypothetical genes and non-gene-associated model reactions; these lists will be useful for designing enzyme assays for functional annotation of the hypothetical genes. Finally, and most importantly, this pan-genome-scale metabolic model now provides a common and scalable framework as well as a knowledgebase, which can be used for visualization and interpretation of various omics-scale data from transcriptomics, proteomics and metabolomics for any Dehalococcoides strain; such analysis will further our understanding of these environmentally important organisms so that the outcome of bioremediation can be improved.

Chapter 4: New insight into Dehalococcoides mccartyi metabolism from a model-integrated systems-level analysis of D. mccartyi transcriptomes

4.1. Abstract

Organohalide respiration, mediated by Dehalococcoides mccartyi, is a useful bioremediation process that transforms ground water pollutants and known human carcinogens such as trichloroethene and vinyl chloride into benign ethenes. Successful application of this process depends on the fundamental understanding of the respiration and metabolism of D. mccartyi. To better elucidate D. mccartyi metabolism and physiology, we analyzed available transcriptomic data for a pure isolate (Dehalococcoides mccartyi strain 195) and a mixed microbial consortium (KB-1) using the pan-genome-scale metabolic model for D. mccartyi developed in chapter 3. The transcriptomic data, together with available proteomic data helped confirm transcription and expression of the majority of D. mccartyi genes. We also identified functionally enriched important clusters (13 for strain 195 and 11 for KB-1) of co-expressed metabolic genes using the quality threshold clustering algorithm and information from the model. A composite genome of two highly similar D. mccartyi strains from the KB-1 metagenome sequence was constructed, and operon prediction was conducted for this composite genome and other single genomes. This operon analysis, together with clustering analysis of transcriptomic data helped generate experimentally testable hypotheses regarding the function of a number of hypothetical proteins and the poorly understood mechanism of energy conservation in D. mccartyi. Overall, this study shows how an organism’s metabolic model can be used as a platform to analyze and visualize transcriptomic data for obtaining improved understanding of the unusual metabolism of an environmentally important but difficult to grow microorganism.

4.2. Introduction

Obligate anaerobes such as Dehalococcoides mccartyi support growth and metabolism by conserving energy from an unusual respiratory metabolic process termed organohalide respiration (Holliger et al., 1998b, Smidt and de Vos, 2004, Tas et al., 2010). The hallmark of this important biological process lies in the detoxification of halogenated xenobiotics such as trichloroethene and vinyl chloride (known human carcinogens and groundwater pollutants), as well as tetrachloroethene, chlorobenzenes, dioxins, and polychlorinated biphenyls (Adrian et al., 2000b, Bunge et al., 2003, He et al., 2003, Maymó-Gatell et al., 1997). However, optimized use of this natural and effective bioremediation process is hampered by the lack of detailed knowledge about D. mccartyi metabolism, both in pure cultures and in the mixed microbial communities they normally inhabit. Although some of the genes and enzymes involved in organohalide respiration have been identified and characterized (Adrian et al., 2007b, Jayachandran et al., 2004, Müller et al., 2004, Nijenhuis and Zinder, 2005), the mechanism of the respiratory chain and its components, as well as functional annotations of ~50% of D. mccartyi genes, are yet to be determined (Kube et al., 2005, Seshadri et al., 2005). Due to the difficulty of expressing genes heterologously and the lack of a genetic system in D. mccartyi (Löffler et al., 2012), experimental studies on characterization and manipulation of genes and enzymes of these bacteria are challenging. Hence, most studies to date have primarily focused on the identification and characterization of reductive dehalogenase homologous (rdh) genes, and their respective enzymes’ cofactors and substrate ranges (Adrian et al., 2007b, Krajmalnik-Brown et al., 2004, Lee et al., 2006, Magnuson et al., 2000, Magnuson et al., 1998, Müller et al., 2004).

Recently, a number of isotope labeling studies concerning D. mccartyi metabolism have discussed the genes and enzymes of some key metabolic processes, including the TCA-cycle, and amino acid transport and metabolism (Marco-Urrea et al., 2011, Tang et al., 2009b, Zhuang et al., 2011). In addition, sequencing of multiple D. mccartyi genomes (Kube et al., 2005, McMurdie et al., 2009, Seshadri et al., 2005) enabled the construction of a detailed pan-genome-scale constraint-based model of metabolism, which revealed their energy-starved nature, as well as depicted the overall metabolic landscape of D. mccartyi (Ahsanul Islam et al., 2010). Also, a number of proteomic studies (Lee et al., 2012, Morris et al., 2007, Morris et al., 2006, Tang et al., 2013) have provided important information on some metabolic genes and processes, including nitrogen fixation and carbon metabolism of D. mccartyi. Apart from these metabolic studies, data from systems-wide high-throughput experimental studies such as whole genome microarrays are available for Dehalococcoides mccartyi strain 195 (formerly Dehalococcoides ethenogenes strain 195) (Johnson et al., 2008, Johnson et al., 2009, Lee et al., 2011). A shotgun metagenome microarray study on KB-1, a D. mccartyi-containing anaerobic mixed microbial community, has been published recently (Waller et al., 2012, Waller, 2010). While these studies obtained expression data for all genes, each study focused on analyzing the expression of specific genes involved in, for instance, reductive dechlorination and energy conservation, in cobalamin (vitamin B12) biosynthesis pathway, or phage related genes. None of these studies, however, focused on the analysis of overall D. mccartyi metabolism using genome-wide transcriptomic data. Also, no integrated analysis of the available transcriptomic and proteomic data with the published pan-genome-scale metabolic model of these bacteria has been conducted yet. Such a systemic analysis of the “omics” data can be useful to glean a more comprehensive understanding of the metabolic processes of D. mccartyi, as well as to verify the presence of sequenced genes in their genomes as most genes have only weak bioinformatic evidence.

Here, we analyzed the published transcriptomic data for a pure culture, Dehalococcoides mccartyi strain 195 (from here on, strain 195) (Johnson et al., 2008, Johnson et al., 2009) and a mixed culture, KB-1 (Waller, 2010, Waller et al., 2012) using our previously developed pan-genome-scale D. mccartyi metabolic model (Ahsanul Islam et al., 2010) as a guide. A composite genome of two highly similar D. mccartyi strains in KB-1 (from here on, KB-1 Dhc) was constructed from the publicly available KB-1 metagenome sequences (http://img.jgi.doe.gov/cgi-bin/m/main.cgi) and subsequently used for analyzing D. mccartyi-specific transcriptomic data from the KB-1 community arrays (Waller, 2010, Waller et al., 2012). This model-guided study of transcriptomic data, together with available proteomic data, confirmed the transcription and expression of the majority of genes in the strain 195 and KB-1 Dhc genomes. In addition, we specifically examined and visualized the expression of some metabolic genes and hypothetical proteins, as well as their putative annotations proposed during the metabolic modeling study. Then, operon analysis for the KB-1 Dhc genome and for other single-strain genomes of D. mccartyi, including strains 195, CBDB1, and GT, was conducted. The transcriptomic data were further analyzed with the quality threshold (QT) clustering algorithm and functional enrichment analysis, which provided valuable insight on the poorly understood mechanism of energy conservation in these bacteria. Moreover, these bioinformatic analyses of transcriptomic data, along with operon analysis, helped suggest putative functions for at least five hypothetical proteins of strain 195. Thus, our analysis provides a guide for selecting and screening some of the hypothetical proteins in D. mccartyi genomes, which can aid future targeted proteomic work to increase our knowledge on the physiology and biochemistry of these useful bacteria.

4.3. Materials and methods

4.3.1. Identification of D. mccartyi genes from KB-1 shotgun microarray data

Pre-processed and normalized transcriptomic data for the KB-1 community were collected from a shotgun microarray study of 33 KB-1 samples (Waller, 2010, Waller et al., 2012). Details of array construction methods, experimental conditions, and array data normalization techniques were described elsewhere (Waller, 2010, Waller et al., 2012). RNA was collected from KB-1 cultures amended pairwise with and without chlorinated acceptors for a given donor (methanol or hydrogen). These RNA samples included combinations with methanol only (M) compared to the same cultures with trichloroethene and methanol (TCEM), cis-1,2-dichloroethene and methanol (cDCEM), vinyl chloride and methanol (VCM), and vinyl chloride and hydrogen (VCH) (Waller, 2010, Waller et al., 2012). In these experiments, a TCE-grown culture was first purged of all chlorinated substrates and starved for 4 days, prior to amendment with electron donors and acceptors. In addition, arrays were also interrogated with RNA from cultures after being starved (i.e., not amended) for 4 days (NA) and one sample from a culture kept anaerobic, but starved for 1 year (“Starved”) (Waller, 2010, Waller et al., 2012). In total, 33 independent RNA samples and corresponding array data were used for principal component analysis (PCA). Although the KB-1 mixed microbial community mainly comprises dechlorinators, methanogens, acetogens, and fermenters (Duhamel and Edwards, 2006, Duhamel et al., 2004, Edwards and Cox, 1997, Waller, 2010), D. mccartyi are the dominant members that detoxify toxic chlorinated solvents (Duhamel and Edwards, 2006, Duhamel et al., 2004, Edwards and Cox, 1997, Waller, 2010). In addition, only D. mccartyi-specific array data can be integrated with the pan-genome-scale metabolic model (Ahsanul Islam et al., 2010); hence, only those genes and the corresponding array data were analyzed in this study. The data were extracted from KB-1 arrays and nucleotide sequences following a simple workflow (Figure B1 in Appendix B). First, all array sequences were aligned against the non-redundant nucleotide database (“nt”) from NCBI (http://www.ncbi.nlm.nih.gov/nuccore) with BLAST (blastn) (Altschul et al., 1997) for identifying their species level identity. Sequences that matched to a database D. mccartyi genome as the best hit with > 85% identity at the nucleotide level were chosen as D. mccartyi genes. Next, all array sequences were compared to the NCBI non-redundant protein database (“nr”) (http://www.ncbi.nlm.nih.gov/protein) with BLAST (blastx) (Altschul et al., 1997) for identifying their annotations. Since D. mccartyi genomes are very similar (Ahsanul Islam et al., 2010, Hug et al., 2011, Kube et al., 2005, Seshadri et al., 2005), only sequences that matched to the database D. mccartyi genes with > 95% identity at the amino acid level were retained for subsequent analyses. Finally, KB-1 array nucleotide sequences were compared to the draft composite genome of KB-1 Dhc as constructed from the KB-1 metagenome (Hug, 2012). Afterwards, results from all three analyses were compared, and only consensus array sequences and corresponding intensity data were selected as KB-1 Dhc array data (Figure B1 in Appendix B). Out of a total of 26,186 sequences on the shotgun array, 1,162 consensus sequences were identified as D. mccartyi. Subsequently, the data were analyzed with the QT clustering algorithm (Heyer et al., 1999) followed by mapping to the D. mccartyi metabolic model (Ahsanul Islam et al., 2010) for conducting functional enrichment analysis of the clusters (Mahadevan et al., 2008, Tavazoie et al., 1999, Huang et al., 2009) (Figure B1 in Appendix B).
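
The three-way consensus step of this workflow can be sketched as a set intersection. The Python example below is an illustration only: the `Hit` record, the function names, the probe identifiers, and the way BLAST results are represented are all hypothetical, and only the two identity cutoffs (> 85% for blastn, > 95% for blastx) come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """Best BLAST hit for one array sequence (hypothetical record layout)."""
    probe_id: str
    best_hit_organism: str
    percent_identity: float


def passing_ids(hits, min_identity, organism_keyword="Dehalococcoides"):
    """Probe IDs whose best hit is the target organism above the cutoff."""
    return {h.probe_id for h in hits
            if organism_keyword in h.best_hit_organism
            and h.percent_identity > min_identity}


def consensus_probes(blastn_hits, blastx_hits, composite_genome_ids):
    """Probes supported by all three analyses described in the text."""
    nt_ids = passing_ids(blastn_hits, 85.0)   # nucleotide-level cutoff
    aa_ids = passing_ids(blastx_hits, 95.0)   # amino-acid-level cutoff
    return nt_ids & aa_ids & set(composite_genome_ids)


# Made-up example: only "p1" passes all three filters.
blastn = [Hit("p1", "Dehalococcoides mccartyi 195", 99.0),
          Hit("p2", "Dehalococcoides mccartyi CBDB1", 80.0),
          Hit("p3", "Geobacter lovleyi", 97.0)]
blastx = [Hit("p1", "Dehalococcoides mccartyi 195", 98.0),
          Hit("p2", "Dehalococcoides mccartyi CBDB1", 96.0)]
print(consensus_probes(blastn, blastx, ["p1", "p2", "p4"]))
```

Requiring agreement among all three analyses is conservative: a sequence is dropped as soon as any one line of evidence fails, which suits a shotgun array built from mixed-culture DNA.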

4.3.2. Dehalococcoides mccartyi strain 195 microarray data

Pre-processed and normalized transcriptomic data for Dehalococcoides mccartyi (formerly Dehalococcoides ethenogenes) strain 195 were obtained from published literature (Johnson et al., 2008, Johnson et al., 2009) and the NCBI GEO database (http://www.ncbi.nlm.nih.gov/geo/). In total, microarray data for 9 experimental conditions and 27 samples were analyzed, where each condition comprised 3 biological replicates. Experimental conditions include the growth of strain 195 in 5 phases of its growth curve: early exponential (EE), late exponential (LE), transition (TR), early stationary (ES), and late stationary (LS) (Johnson et al., 2008). Arrays were also generated for RNA samples collected from the cultures growing in strain 195 medium with the addition of: high and low concentrations of vitamin B12 (HighB12 and LowB12), filter sterilized supernatant of ANAS (ANASspent), and ANAS mineral medium (ANASmedium) (Johnson et al., 2009). ANAS is an anaerobic and TCE-dechlorinating mixed methanogenic microbial community that was enriched from the contaminated sites of Alameda Naval Air Station (Richardson et al., 2002). Of the total 1,579 array sequences, 1,560 non-duplicate sequences and corresponding array data from 27 samples were further analyzed following a workflow (Figure B2 in Appendix B). After PCA, array data for all samples were mapped to the D. mccartyi metabolic model for identifying metabolic genes, followed by clustering of genes with the QT clustering algorithm (Heyer et al., 1999). Then functional enrichment analysis was performed by calculating enrichment p-values for metabolic genes in each cluster with the hypergeometric distribution (Figure B2 in Appendix B).
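
An enrichment p-value based on the hypergeometric distribution is the probability of observing at least the given number of genes from one model subsystem in a cluster drawn at random from all analyzed genes. The Python sketch below shows this upper-tail calculation using only the standard library. The function name and the example counts are hypothetical, and the exact conventions used in the thesis (for example, how the background gene set was defined) are not specified here.

```python
from math import comb


def hypergeom_enrichment_p(total_genes, subsystem_genes, cluster_size, overlap):
    """P(X >= overlap) for a hypergeometric draw without replacement.

    total_genes     : number of genes in the background set
    subsystem_genes : background genes belonging to the subsystem
    cluster_size    : number of genes in the cluster
    overlap         : cluster genes that belong to the subsystem
    """
    upper = min(subsystem_genes, cluster_size)
    denominator = comb(total_genes, cluster_size)
    numerator = sum(
        comb(subsystem_genes, i)
        * comb(total_genes - subsystem_genes, cluster_size - i)
        for i in range(overlap, upper + 1)
    )
    return numerator / denominator


# Hypothetical counts: 500 metabolic genes, 20 in one subsystem,
# and a cluster of 10 genes containing 4 from that subsystem.
p = hypergeom_enrichment_p(500, 20, 10, 4)
print(p, p <= 0.05)
```

A cluster would be called functionally enriched for a subsystem when this value falls at or below the cutoff used in this chapter (p ≤ 0.05).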

4.3.3. Operon prediction for Dehalococcoides mccartyi genomes

Operon predictions for both KB-1 Dhc and strain 195 were performed using the procedure described in Bergman et al. (2007). As per the procedure, we randomly chose 27 diverse bacterial genomes (Table S17 in Table B1 in Appendix B) from different branches of the bacterial phylogeny for constructing the barcode. The barcode was generated by identifying homologs of strain 195 and KB-1 Dhc genes in the chosen bacterial genomes. Subsequently, the intergenic distance for each gene was calculated from the positional information of genes in the genome. Intergenic distance and strand location, as well as the barcode information, were then used for calculating the posterior probability of each gene being operonic or not operonic. If the probability value of a gene was ≥ 0.5, it was assigned as an operonic gene; otherwise, it was not considered operonic (Appendix B: Table S15 in Table B1). A similar procedure was followed for identifying operon structures of strains CBDB1 and GT (Appendix B: Table S15 in Table B1).
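
Two of the simpler steps in this procedure, computing intergenic distances from gene coordinates and applying the 0.5 cutoff to posterior probabilities, can be sketched as follows. This Python example does not reproduce the posterior calculation of Bergman et al. (2007); the posteriors are taken as given inputs, and the `Gene` record, locus names, coordinates, and probabilities are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    """Minimal gene record (hypothetical layout); coordinates are 1-based."""
    locus: str
    start: int
    end: int
    strand: str


def intergenic_distances(genes):
    """For each adjacent gene pair: (locus_a, locus_b, gap in bp, same strand).

    A negative gap means the two genes overlap.
    """
    ordered = sorted(genes, key=lambda g: g.start)
    return [(a.locus, b.locus, b.start - a.end - 1, a.strand == b.strand)
            for a, b in zip(ordered, ordered[1:])]


def classify_operonic(posteriors, cutoff=0.5):
    """Label a gene operonic when its posterior probability is >= cutoff."""
    return {locus: p >= cutoff for locus, p in posteriors.items()}


# Made-up coordinates and posteriors, for illustration only.
genes = [Gene("g1", 1, 100, "+"), Gene("g2", 121, 300, "+"),
         Gene("g3", 500, 800, "-")]
print(intergenic_distances(genes))
print(classify_operonic({"g1": 0.91, "g2": 0.50, "g3": 0.12}))
```

Short gaps between genes on the same strand are the usual signature of co-transcription, which is why both quantities enter the probability model.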

4.3.4. Microarray data analysis and visualization

Clustering analysis and heat map visualization of transcriptomic data were conducted with MeV: MultiExperiment Viewer (Saeed et al., 2003), an open-source software for analyzing and visualizing microarray gene expression data. First, the array data were mapped to the D. mccartyi metabolic model for identifying metabolic genes and classifying them according to the model subsystems. Next, the quality threshold (QT) clustering algorithm (Heyer et al., 1999), with Spearman’s rank correlation coefficient as the distance metric (Usadel et al., 2009), was used for clustering the gene expression data. The number of clusters generated by QT clustering depends on two parameters: cluster diameter and minimum cluster size; thus, the cluster diameter threshold and the minimum cluster size were set to 0.06 and 7, respectively, to obtain very stringent QT clusters. These stringent cut-offs also ensured that the co-expressed or co-transcribed clusters formed were not very large and were potentially more meaningful. Using both subsystem and clustering information, hypergeometric p-values were calculated for each QT cluster to identify functionally enriched, i.e., overrepresented (p ≤ 0.05), clusters (Mahadevan et al., 2008, Tavazoie et al., 1999, Huang et al., 2009). Subsequently, hierarchical clustering (Eisen et al., 1998) was used for further analysis of some functionally enriched important QT clusters. Absolute intensity values were used for representing whether a gene was highly expressed/transcribed (“on”) or not highly expressed/not transcribed (“off”) in heat maps. The frequency distribution of intensity values (Figures B3 and B4 in Appendix B) showed that the majority of strain 195 and KB-1 Dhc genes were expressed above intensity values of 800 and 100, respectively. Hence, we set a threshold intensity of 800 (< 800 = “off”, > 800 = “on”) for strain 195 data and 100 (< 100 = “off”, > 100 = “on”) for KB-1 Dhc arrays for representation in heat maps.
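
The QT clustering step described above can be sketched in plain Python. This is an illustrative re-implementation following the general description of Heyer et al. (1999), not the MeV code used in this work. It assumes the distance between two expression profiles is 1 minus their Spearman rank correlation (an assumption about how the 0.06 diameter is measured), and the function names and toy profiles are hypothetical.

```python
def ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman_distance(x, y):
    """1 - Spearman rho; undefined (division by zero) for constant profiles."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return 1.0 - cov / (sx * sy)


def distance_matrix(profiles):
    return [[spearman_distance(a, b) for b in profiles] for a in profiles]


def qt_cluster(dist, diameter, min_size):
    """Greedy QT clustering on a precomputed distance matrix."""
    remaining = set(range(len(dist)))
    clusters = []
    while remaining:
        best = []
        for seed in sorted(remaining):
            candidate = [seed]
            pool = remaining - {seed}
            while pool:
                # Add the gene that keeps the candidate's diameter smallest.
                nxt = min(sorted(pool),
                          key=lambda j: max(dist[j][m] for m in candidate))
                if max(dist[nxt][m] for m in candidate) > diameter:
                    break
                candidate.append(nxt)
                pool.remove(nxt)
            if len(candidate) > len(best):
                best = candidate
        if len(best) < min_size:
            break  # the remaining genes stay unclustered
        clusters.append(sorted(best))
        remaining -= set(best)
    return clusters


# Toy profiles: three rising and two falling expression patterns.
profiles = [[1, 2, 3, 4], [2, 4, 6, 9], [0, 1, 5, 6], [4, 3, 2, 1], [9, 7, 2, 0]]
print(qt_cluster(distance_matrix(profiles), diameter=0.06, min_size=2))
```

Unlike k-means, QT clustering does not require the number of clusters in advance: the diameter bounds how dissimilar cluster members may be, and the minimum size decides when to stop, which is why tightening either parameter yields fewer and tighter clusters.
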
Relative or normalized gene expression intensities were calculated using the formula: normalized intensity value = [(absolute intensity value) - (mean of absolute intensity values in a row)] / [standard deviation of absolute intensity values in a row]. Thus, normalized intensities depicted the highest and lowest expression of any gene across all samples. Principal component analysis of all array data was performed with MATLAB (The Mathworks Inc.), and the metabolic network of D. mccartyi was visualized with Cytoscape (Smoot et al., 2011).
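
The row-wise normalization and the “on”/“off” thresholding described in this section can be sketched in a few lines of Python. The function names and example intensities are hypothetical. The sketch assumes the sample standard deviation (the text does not say whether the sample or population form was used) and treats an intensity exactly equal to the threshold as “off”, a case the text leaves unspecified.

```python
from statistics import mean, stdev


def normalize_row(intensities):
    """Row-wise z-score: (value - row mean) / row standard deviation.

    Uses the sample standard deviation, which is an assumption.
    """
    m = mean(intensities)
    s = stdev(intensities)
    return [(x - m) / s for x in intensities]


def on_off(intensities, threshold):
    """Label each absolute intensity 'on' (> threshold) or 'off' (otherwise)."""
    return ["on" if x > threshold else "off" for x in intensities]


# Hypothetical intensities for one gene across three samples.
row = [400.0, 900.0, 1400.0]
print(normalize_row(row))
print(on_off(row, threshold=800))   # strain 195 cutoff from the text
print(on_off(row, threshold=100))   # KB-1 Dhc cutoff from the text
```

The two views answer different questions: absolute thresholds indicate whether a gene is expressed at all, while row-normalized values show in which samples its expression is highest or lowest.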

4.4. Results and discussion

4.4.1. Principal component analysis of strain 195 and KB-1 Dhc microarray data

Principal component analysis (PCA) is a useful statistical method to identify underlying trends of a high-dimensional data set such as microarray data by reducing its dimensionality and extracting important information (Clark and Ma’ayan, 2011, Gehlenborg et al., 2010, Hotelling, 1933). PCA was performed for strain 195 and KB-1 Dhc array data to analyze their dimensionality and variability (Figure 4.1). In total, data for 27 strain 195 samples under 9 conditions (Figure 4.1A) and 33 KB-1 Dhc samples under 7 conditions (Figure 4.1B) (Johnson et al., 2008, Johnson et al., 2009, Waller, 2010, Waller et al., 2012) were analyzed by PCA. Strain 195 samples (Figure 4.1A) were collected from parallel triplicate cultures during sequential dechlorination of trichloroethene (TCE) at 5 time points: Early Exponential (EE), Late Exponential (LE), Transition (TR), Early Stationary (ES), and Late Stationary (LS), in high and low vitamin B12 concentrations (HighB12 and LowB12), and in two different growth media with higher nutrient contents (ANASmedium and ANASspent) (Johnson et al., 2008, Johnson et al., 2009). ANAS is an enrichment culture of a D. mccartyi-containing methanogenic mixed microbial community (Richardson et al., 2002, West et al., 2008), and array experiments were conducted with strain 195 in the ANAS mineral medium (ANASmedium), as well as in the filter sterilized supernatant of ANAS culture (ANASspent) (Johnson et al., 2009). The PCA-plot (Figure 4.1A) shows good agreement between triplicate samples for the corresponding conditions, indicating that the biological replicates behaved consistently in the array experiments.
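
To make the dimensionality-reduction idea concrete, the Python sketch below computes the first principal component of a small data matrix by power iteration on the sample covariance matrix. It is a teaching illustration only: the analysis in this chapter was performed with MATLAB, the function name and toy data are hypothetical, and a pure-Python covariance matrix would be impractically slow for arrays with more than a thousand genes.

```python
def first_principal_component(rows, iterations=200):
    """Return (direction, scores) of the first principal component.

    rows : list of samples, each a list of feature (gene) values.
    Power iteration starts from a vector of ones, so it fails if that
    vector happens to be orthogonal to the leading component.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix (p x p).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(p)]
           for a in range(p)]
    v = [1.0] * p
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Projection of each centered sample onto the component.
    scores = [sum(c[j] * v[j] for j in range(p)) for c in centered]
    return v, scores


# Toy data lying on the line y = 2x: one component captures all variance.
direction, scores = first_principal_component([[1, 2], [2, 4], [3, 6], [4, 8]])
print(direction, scores)
```

In a PCA plot such as Figure 4.1, each sample is placed at its scores on the first few components, so replicate samples that behave consistently appear close together.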

The samples used for extracting RNA to interrogate KB-1 Dhc arrays were comparisons of mainly two growth conditions: one with and one without a chlorinated electron acceptor (Waller, 2010, Waller et al., 2012); specifically, KB-1 cultures grown with trichloroethene and methanol (TCEM) were compared to cultures grown with methanol (M) only. Other conditions tested included cis-1,2-dichloroethene and methanol (cDCEM), vinyl chloride and methanol (VCM), and vinyl chloride and hydrogen (VCH). These samples were also compared to samples that were not amended with any substrates for 4 days (NA), and for 1 year (“Starved”) (Figure 4.1B).

Although methanol is supplied to the KB-1 community as the electron donor, it is fermented to H2, which is the direct electron donor for D. mccartyi strains in KB-1 (Duhamel and Edwards, 2006, Waller, 2010). RNA for the cDCEM and starved conditions was arrayed only once while multiple biological replicates for other conditions were analyzed (TCEM: 3 samples, VCM: 10 samples, VCH: 2 samples, M: 11 samples, and NA: 5 samples). PCA showed high dimensionality in KB-1 Dhc array data (Figure 4.1B), which primarily stemmed from the type of array technology (shotgun “spotted” DNA array) and the experimental approach used (sample collection for only one time point 4 hours after substrate addition), as well as the inherent variability of working with a mixed microbial culture.

Strain 195 arrays were short oligonucleotide-based Affymetrix microarrays (Johnson et al., 2008, Johnson et al., 2009), whereas KB-1 Dhc arrays were shotgun “spotted” DNA microarrays in which the DNA probes were generated from PCR-amplified shotgun clones of mixed culture DNA (Waller, 2010, Waller et al., 2012). Spotted arrays, in general, produce noisier data with higher dimensionality than oligonucleotide arrays due to the nature of the spotting procedure and the use of different fluorescent dyes (Cy3 and Cy5) in target preparation. These factors can also contribute to the cross hybridization of different cell populations to the same array (Allison et al., 2006, Lee and Saeed, 2007, Schulze and Downward, 2001). Moreover, transcriptomic data for KB-1 Dhc were extracted from the shotgun metagenome microarray experiments of KB-1 (see Materials and methods for details), an anaerobic mixed microbial community that mainly includes dechlorinators, acetogens, methanogens, and fermenters (Duhamel and Edwards, 2006, Duhamel et al., 2004, Edwards and Cox, 1997, Waller, 2010); hence, unlike studies with pure cultures such as strain 195, growth of D. mccartyi strains in KB-1 is always associated with the interactions of other organisms in the consortium. These interactions are further complicated by the fact that KB-1 contains at least two D. mccartyi strains: one grows preferentially on TCE, and the other grows on cDCE and VC (Duhamel et al., 2004). Also, the KB-1 biological replicates were sampled from various experiments conducted separately and at different times, but always consisting of transfers of the same parent culture. Thus, in addition to the type of array and experimental design, the intricate and subtle interactions of microbes in the mixed community are responsible for the observed high dimensionality of KB-1 Dhc array data (Figure 4.1B).

Figure 4.1. Principal component analysis (PCA) of the array data for strain 195 and KB-1 Dhc samples. (A) Array data for pure culture strain 195 included triplicate biological replicates that were clustered together for each experimental condition by PCA. All samples were used for subsequent data analysis. (B) D. mccartyi-specific array data for biological replicates of KB-1 mixed culture demonstrated variability owing to array type, experimental design, and complex interactions of organisms in the community. Subsequent data analyses, therefore, were conducted with the expression values of all 33 biological replicates. “EE” = early exponential phase, “LE”= late exponential phase, “TR” = transition phase, “ES” = early stationary phase, “LS” = late stationary phase, “HighB12” = higher concentration of vitamin B12 in the medium, “LowB12” = lower concentration of vitamin B12 in the medium, “ANASspent” = ANAS supernatant added medium, “ANASmedium” = growth medium of ANAS cultures, “TCEM” = trichloroethene and methanol, “cDCEM” = cis 1,2-dichloroethene and methanol, “VCM” = vinyl chloride and methanol, “VCH” = vinyl chloride and hydrogen, “M” = methanol only, “NA” = not amended.

4.4.2. Improved identification and confirmation of D. mccartyi genes

Of the total 1560 putative genes in the strain 195 genome, only 3 were biochemically characterized: DET0079 (tceA) (Magnuson et al., 2000), DET0318 (pceA) (Magnuson et al., 1998), and DET1363 (mgsD) (Empadinhas et al., 2004). However, none of the 1162 putative genes of KB-1 Dhc was biochemically characterized. Due to the lack of biochemical evidence for the majority of genes in strain 195 and KB-1 Dhc genomes, available high-throughput experimental data such as proteomics (Lee et al., 2012, Morris et al., 2007, Tang et al., 2013) and transcriptomics (Johnson et al., 2008, Johnson et al., 2009, Waller et al., 2012) data can be used to identify and support the existence of these putative genes, if not their functions, in the genomes. Previous proteomic studies (Lee et al., 2012, Morris et al., 2007, Tang et al., 2013) identified only 718 strain 195 and 20 KB-1 Dhc genes (Appendix B: Tables S1 and S2 in Table B1). However, the transcriptomic data for both organisms, analyzed in this study, showed high expression (see Materials and methods for how gene expression cut-off values were chosen to determine “on” and “off” genes) of 925 strain 195 genes and 257 KB-1 Dhc genes in all samples. Apart from these genes, only 229 and 34 genes from strain 195 and KB-1 Dhc were found to be “off” or not transcribed in any sample, and the remaining genes (406 of strain 195 and 871 of KB-1 Dhc) showed high expression in at least one sample (Appendix B: Tables S1 and S2 in Table B1). Thus, the majority (~60%) of genes were transcribed in all strain 195 samples, while the majority (~75%) of KB-1 Dhc genes showed high expression in at least one sample. Overall, the existence of more strain 195 genes was supported by both proteomic and transcriptomic evidence as compared to KB-1 Dhc genes. We further discuss the proteomic and transcriptomic evidence for hypothetical proteins and metabolic genes in the following sections.
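
The three-way grouping of genes used above (highly expressed in all samples, in at least one sample, or in none) can be sketched as follows. The cut-off value, gene names, and intensities are hypothetical; the actual cut-offs are described in Materials and methods. The gene counts at the end are those reported in the text and are included only to show how the quoted percentages follow from them.

```python
def classify_genes(expression, cutoff):
    """Group genes by how many samples exceed an expression cut-off.

    `expression` maps a gene name to a list of intensities, one per sample.
    """
    classes = {}
    for gene, values in expression.items():
        on_flags = [value >= cutoff for value in values]
        if all(on_flags):
            classes[gene] = "on in all samples"
        elif any(on_flags):
            classes[gene] = "on in at least one sample"
        else:
            classes[gene] = "off in all samples"
    return classes

# Hypothetical normalized intensities for three genes across four samples.
toy_expression = {
    "geneA": [9.1, 8.7, 9.4, 8.9],
    "geneB": [3.2, 7.8, 2.9, 3.1],
    "geneC": [1.0, 2.2, 1.7, 0.9],
}
toy_classes = classify_genes(toy_expression, cutoff=5.0)  # cutoff is an assumed value

# Gene counts reported in the text for the two data sets.
strain195_counts = {"on_all": 925, "on_some": 406, "off_all": 229}
kb1_counts = {"on_all": 257, "on_some": 871, "off_all": 34}
strain195_fraction_on_all = strain195_counts["on_all"] / sum(strain195_counts.values())
kb1_fraction_on_some = kb1_counts["on_some"] / sum(kb1_counts.values())
```

The two fractions computed at the end correspond to the approximate 60% and 75% majorities stated above.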

4.4.3. Confirmation of hypothetical proteins in strain 195 and KB-1 Dhc genomes

Hypothetical proteins or genes with unknown functions constitute ~33% (523) of the strain 195 genome and ~22% (264) of the KB-1 Dhc genome, the latter being a draft genome. Analysis of transcriptomic data (Figure 4.2) revealed high expression of 243 (Appendix B: Table S3 in Table B1) and 56 (Appendix B: Table S5 in Table B1) hypothetical proteins of strain 195 and KB-1 Dhc in all samples, respectively. Notably, 96 of the 243 strain 195 hypotheticals were also detected in previous proteomic studies (Figure 4.2A and Appendix B: Table S3 in Table B1), while none of the 56 KB-1 Dhc hypotheticals has proteomic evidence (Figure 4.2B and Appendix B: Table S5 in Table B1). Thus, the existence of these hypothetical proteins was supported by either proteomic or transcriptomic data or both. However, the majority of hypothetical proteins of both genomes (280 of strain 195 and 208 of KB-1 Dhc) were not highly expressed or “on” in all samples, which also included 14 strain 195 (Figure 4.2A and Appendix B: Table S4 in Table B1) and 2 KB-1 Dhc (Figure 4.2B and Appendix B: Table S6 in Table B1) hypotheticals detected in the proteomic studies. From these lists, only 116 and 6 hypothetical proteins of strain 195 (Appendix B: Table S4 in Table B1) and KB-1 Dhc (Appendix B: Table S6 in Table B1) were found to be “off” or not highly expressed in all samples, and the remaining hypotheticals (164 of strain 195 and 202 of KB-1 Dhc) showed high expression in at least one sample; thus, they can probably be considered as true hypothetical proteins.

Figure 4.2. Hypothetical proteins of (A) strain 195 and (B) KB-1 Dhc with proteomic and transcriptomic evidence. Hypothetical proteins that are highly expressed (“on”) in all samples are represented by “blue color”, and those that are not highly expressed (“on”) in all samples are represented by “orange color”. Also, hypothetical proteins with both proteomic and transcriptomic evidence are represented by “plain blue and orange colors”, and with only transcriptomic evidence are represented by “grid patterned blue and orange colors”.

4.4.4. Confirmation of metabolic genes in strain 195 and KB-1 Dhc genomes

Metabolic genes from the transcriptomic data were identified by mapping them to the manually curated pan-genome-scale metabolic model for D. mccartyi (Ahsanul Islam et al., 2010) (see Materials and methods, and Appendix B: Figures B1 and B2). As expected, more metabolic genes (467) were identified for strain 195 than for the composite genome of KB-1 Dhc (429) (Appendix B: Tables S7 and S8 in Table B1) because the latter one was a draft genome. Of the 467 putative metabolic genes of strain 195, 314 genes were highly expressed or “on” in all samples, 93 were “on” in at least one sample, and 60 were “off” or not transcribed in all samples. Also, the majority (305) of these metabolic genes (Appendix B: Table S7 in Table B1) were detected in previous proteomic studies (Lee et al., 2012, Morris et al., 2007). Thus, the presence of at least 412 metabolic genes in strain 195 genome was supported by either proteomic or transcriptomic evidence. On the contrary, only 10 out of 429 metabolic genes of KB-1 Dhc were identified in proteomic studies (Morris et al., 2007, Tang et al., 2013) (Appendix B: Table S8 in Table B1); however, 101 of these genes were found to be “on” in all samples, 317 were “on” in at least one sample, and only 11 putative metabolic genes were “off” in all 33 KB-1 Dhc samples (Appendix B: Table S8 in Table B1). Thus, the presence of 418 metabolic genes of KB-1 Dhc, including 10 with proteomic evidence, is supported by transcriptomic data. Most importantly, analysis of transcriptomic data for the hypothetical proteins reannotated during the metabolic modeling study (Ahsanul Islam et al., 2010) showed high expression of 13 strain 195 (Figure 4.3) and 11 KB-1 Dhc hypothetical proteins (Figure 4.4) in at least one sample. 
Because their presence is supported by either proteomic or transcriptomic evidence or both, these hypothetical proteins are good candidates for future biochemical experiments as their proposed functions can serve as valuable hypotheses to be tested experimentally.

Figure 4.3. Proteomic and transcriptomic evidence for the hypothetical proteins of strain 195 reannotated in the D. mccartyi metabolic model. Transcriptomic evidence for the reannotated hypothetical proteins is presented as heat maps while proteomic evidence is obtained from literature (Lee et al., 2012, Morris et al., 2007). Proposed functions and the metabolic pathways in which the hypothetical proteins were involved in the metabolic model are also shown in the table.

Further analysis of the transcriptomic data for metabolic genes identified the presence of more rdhA genes, which are involved in the energy conserving reductive dechlorination reaction, for KB-1 Dhc (20 rdhAs) than for strain 195 (17 rdhAs) (Figure 4.5, and Appendix B: Tables S9 and S10 in Table B1), and 7 of those were homologous to strain 195 rdhA genes (Figure 4.5A). The KB-1 rdhA genes included homologs of the characterized pceA (Magnuson et al., 1998) and vcrA genes (Hug, 2012, Waller, 2010, Müller et al., 2004); however, probes for homologs of other characterized rdhAs such as bvcA (Krajmalnik-Brown et al., 2004) and tceA (Magnuson et al., 2000) were not present in KB-1 Dhc shotgun arrays. A recent proteomic study (Tang et al., 2013) of KB-1 identified 5 rdhAs: vcrA (KB1_1502), bvcA (KB1_6), tceA (KB1_1037), RdhA5 (KB1_0072), and RdhA1 (KB1_0054). In total, 6 out of 20 KB-1 rdhAs were highly expressed in all samples while only one rdhA gene (KB1_1570) was “off” in all samples (Figures 4.5A and 4.5B). Most importantly, a total of 12 KB-1 rdhAs were transcribed even in the starved condition (Figures 4.5A and 4.5B), indicating that the genes were not strictly regulated by the presence of chlorinated substrates. This notion is further evident from the rdhA expression profiles (Figures 4.5A and 4.5B), which do not show any major difference between the samples with chlorinated solvents and those without. All the M and NA samples showed very similar expression patterns.

Among the strain 195 rdhA genes, only 2 (DET1559 and DET0079, tceA) out of 17 were highly transcribed or “on” in all samples (Figures 4.5A and 4.5B); tceA was transcribed because TCE was used as the electron acceptor in all samples, but the expression of DET1559 seemed to be constitutive as noted previously (Adrian et al., 2007b, Morris et al., 2007, Morris et al., 2006). Also DET1545, similar to previous studies (Rahm and Richardson, 2008a, Rahm and Richardson, 2008b), was highly transcribed even in the stationary phase when the substrate concentration was low (Figure 4.5A).

Figure 4.4. Proteomic and transcriptomic evidence for the hypothetical proteins of KB-1 Dhc reannotated in the D. mccartyi metabolic model. Transcriptomic evidence for the reannotated hypothetical proteins is presented as heat maps while proteomic evidence is obtained from literature (Lee et al., 2012, Morris et al., 2007). Proposed functions and the metabolic pathways in which the hypothetical proteins were involved in the metabolic model are also shown in the table.

Figure 4.5. Expression of reductive dehalogenase homologous (rdhA) genes. Absolute intensities of (A) homologous and (B) non-homologous rdhA genes of strain 195 and KB-1 Dhc are illustrated as heat maps. For Strain 195 data, the characterized genes, tceA and pceA (Magnuson et al., 2000, Magnuson et al., 1998), and DET1559 were highly expressed as previously reported (Rahm and Richardson, 2008a, Rahm and Richardson, 2008b). DET1545 and its homolog in KB-1 Dhc, KB1_0072, were expressed at highest levels in late stationary or unamended conditions (to see this more clearly, refer to absolute values of intensities provided in Appendix B: Tables S9 and S10 in Table B1). For KB-1 Dhc rdhA genes, identifiers in parenthesis are provided for cross-referencing as they were used in other studies (Morris et al., 2006, Rahm and Richardson, 2008a, Waller, 2010). Although vcrA and pceA homologs were found, bvcA and tceA homologs were unfortunately not identified as probes in the KB-1 Dhc shotgun arrays. Note that 12 out of 20 rdhAs from KB-1 Dhc were found to be “on” even in the “Starved” condition.

We further visualized the expression of all metabolic genes by overlaying both data sets on the D. mccartyi metabolic network (Appendix B: Figure B5). The reconstructed network (genes, reactions and metabolites) was organized using the organic layout algorithm (http://docs.yworks.com/yfiles/doc/developers-guide/smart_organic_layouter.html) in Cytoscape (Smoot et al., 2011), and genes and reactions were colored (Appendix B: Figure B5A) according to their functional categories in the D. mccartyi model (Ahsanul Islam et al., 2010) to obtain a better topological view of the network. Clearly, genes and reactions involved in the energy metabolism category are very important for D. mccartyi as they formed three distinct clusters in the network (orange colored nodes). Comparison of the absolute intensities of only metabolic genes from both arrays revealed the highest number of highly transcribed (“on”) genes in “ANASspent” and “TCEM” samples (Appendix B: Figures B5B and B5D), while the lowest number of metabolic genes was “on” in “LS” and “Starved” conditions (Appendix B: Figures B5C and B5E) of strain 195 and KB-1 Dhc, respectively. Interestingly, high expression of the majority of metabolic genes (358 or 77% in strain 195 and 209 or 61% in KB-1 Dhc), including 6 and 12 rdhA genes (Figures 4.5A and 4.5B) in “LS” and “Starved” conditions (Appendix B: Tables S11 and S12 in Table B1), suggests that D. mccartyi metabolism remains active even when the organisms are not growing. This constitutive gene expression under non-growth conditions further indicates that perhaps many of the metabolic genes in D. mccartyi are essential or housekeeping genes.

4.4.5. Clustering of microarray data and operon predictions

In addition to confirming the existence of sequenced genes in strain 195 and KB-1 Dhc genomes, we also analyzed both transcriptomic data sets with the quality threshold (QT) clustering algorithm (Heyer et al., 1999) for identifying clusters of co-expressed or co-transcribed genes (Hanson et al., 2009). QT clustering is an unsupervised algorithm that, in addition to finding co-expressed gene clusters, ensures the quality of formed clusters by applying quality thresholds such as maximum cluster diameter and minimum cluster size (Heyer et al., 1999). Using very stringent cut-offs for QT clustering (see Materials and methods), we obtained 30 QT clusters of 7 – 31 genes for strain 195 and 26 QT clusters of 7 – 35 genes for KB-1 Dhc (Appendix B: Tables S13 and S14 in Table B1). D. mccartyi genes were categorized in 7 different model subsystems (i.e., functional categories) based on their involvement in different metabolic pathways in the previously developed metabolic model (Ahsanul Islam et al., 2010). We used these functional classifications for identifying metabolic genes in each QT cluster (see Materials and methods, and Appendix B: Figures B1 and B2). Furthermore, hypothetical proteins and genes without any particular annotations or predicted functions were categorized as “unknown function” while genes involved in regulation, DNA repair, replication and recombination were classified as “non-metabolic function”. Subsequently, functional enrichment analysis (Huang et al., 2009) (Figure 4.6) was performed for all QT clusters, and enrichment p-values were calculated using the hypergeometric distribution method (Huang et al., 2009). Enrichment analysis essentially selects a subset of genes from a larger gene list, in which genes having similar metabolic functions, or genes involved in the same metabolic pathway have a higher likelihood or enriched potential to be selected as a group (Huang et al., 2009, Mahadevan et al., 2008, Tavazoie et al., 1999). It also helps assess whether the observed frequency of genes from a particular functional category in a cluster could have arisen by chance (Huang et al., 2009, Mahadevan et al., 2008, Tavazoie et al., 1999). We obtained 13 and 11 functionally enriched, i.e., overrepresented clusters (p < 0.05) for strain 195 and KB-1 Dhc, respectively (Figures 4.6A and 4.6B).
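
The general idea of QT clustering can be sketched as follows: for every gene, a candidate cluster is grown greedily while its diameter stays within a threshold, the largest candidate is kept, its genes are removed, and the procedure repeats until no candidate reaches the minimum size. This is a minimal sketch under stated assumptions: the distance measure (1 minus the Pearson correlation), the threshold values, and the toy profiles are hypothetical and are not the settings used in this study (see Materials and methods).

```python
import math

def pearson_distance(a, b):
    """Distance between two expression profiles, defined here as 1 - Pearson r."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return 1.0 - cov / math.sqrt(var_a * var_b)

def qt_cluster(profiles, max_diameter, min_size):
    """Greedy quality-threshold clustering of a dict {gene: profile}."""
    dist = {(g, h): pearson_distance(profiles[g], profiles[h])
            for g in profiles for h in profiles}
    remaining = set(profiles)
    clusters = []
    while remaining:
        best = []
        for seed in sorted(remaining):
            candidate, pool = [seed], remaining - {seed}
            while pool:
                # Pick the gene whose addition keeps the cluster diameter smallest.
                gene = min(sorted(pool),
                           key=lambda x: max(dist[(x, m)] for m in candidate))
                if max(dist[(gene, m)] for m in candidate) > max_diameter:
                    break
                candidate.append(gene)
                pool.remove(gene)
            if len(candidate) > len(best):
                best = candidate
        if len(best) < min_size:
            break  # no remaining candidate satisfies the minimum cluster size
        clusters.append(sorted(best))
        remaining -= set(best)
    return clusters

# Hypothetical expression profiles across four samples.
toy_profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0], "g2": [2.0, 4.0, 6.0, 8.5], "g3": [1.0, 2.1, 2.9, 4.2],
    "g4": [4.0, 3.0, 2.0, 1.0], "g5": [8.0, 6.2, 4.0, 2.0],
    "g6": [1.0, 5.0, 1.0, 5.0],
}
toy_clusters = qt_cluster(toy_profiles, max_diameter=0.1, min_size=2)
```

In this toy case the rising profiles and the falling profiles form two separate clusters, while the oscillating profile remains unclustered because it cannot satisfy the diameter threshold with any other gene.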

We further predicted the operon structures of the strain 195 genome and the composite genome of KB-1 Dhc with a published operon prediction algorithm (Bergman et al., 2007). This algorithm was chosen because of its improved prediction capability for any newly sequenced genome and ease of implementation as it does not require any experimental data (Bergman et al., 2007). Since operons are sets of multiple co-transcribed genes forming a single mRNA sequence (Jacob and Monod, 1961), they encode proteins of similar metabolic or regulatory functions; hence, this information, together with co-expressed gene clusters, can be used to infer functions for hypothetical proteins and proteins with unknown functions (Aravind, 2000, Hanson et al., 2009, Overbeek et al., 1999). Of the 1589 and 1614 total genes in the genome of strain 195 and in the contigs from D. mccartyi strains in KB-1, 1251 (79%) and 984 (61%) were identified to be part of an operon (i.e., operonic) comprising 348 and 318 multigene operon pairs, respectively (Appendix B: Table S15 in Table B1). Due to the low number (61%) of predicted operonic genes for KB-1 Dhc, we tested the prediction capability of the algorithm by applying it to two other publicly accessible and complete D. mccartyi genomes, strains CBDB1 and GT, that share high nucleotide similarity and gene synteny with KB-1 Dhc (Hug et al., 2011). Strain CBDB1 contains 79% (1150 of 1457) operonic genes consisting of 333 multigene operon pairs while strain GT has 295 such operon pairs comprising 78% (1119 of 1432) of genes in the genome (Appendix B: Table S15 in Table B1). Our operon predictions for strains 195 and CBDB1 (79% for each) are comparable to the publicly available results for those genomes (71% and 76%) in the DOOR database (Mao et al., 2009) (Appendix B: Table S15 in Table B1). The operon prediction result for the composite genome of D. mccartyi strains in KB-1 was lower because only a draft genome assembled from the KB-1 metagenome is available, and contig breaks can disrupt operons.

Figure 4.6. Functional enrichment analysis of QT clusters for (A) strain 195 and (B) KB-1 Dhc array data. Genes in each QT cluster were categorized according to the subsystems or functional categories of the D. mccartyi metabolic model. Next, enrichment p-values were calculated using the hypergeometric distribution for each QT cluster to identify which clusters were enriched with genes from a particular subsystem. This analysis identified 13 and 11 clusters of co-expressed genes for strain 195 and KB-1 Dhc, which were significantly overrepresented by genes from specific functional categories. Such functionally enriched clusters are shaded in red (p ≤ 0.05) while black (No gene) indicates the absence of a gene from the corresponding subsystems, and green represents non-significant p-values (p > 0.05) for the clusters.

4.4.6. Functionally enriched QT clusters

Functional enrichment analysis (Mahadevan et al., 2008, Tavazoie et al., 1999, Huang et al., 2009) for the QT clusters of strain 195 and KB-1 Dhc was performed to obtain better insight into the contents of each co-expressed cluster. Although each QT cluster contains important information, functionally enriched clusters emphasize that the presence of genes from a certain functional category is statistically significant, and potentially all genes in the cluster might be related to similar functions, or involved in similar metabolic pathways (see Appendix B: Tables S13 and S14 in Table B1 for a list of all QT clusters and genes). These clusters are, therefore, useful in predicting and analyzing the functions of hypothetical proteins within them. Of the 13 and 11 functionally enriched QT clusters of strain 195 and KB-1 Dhc, some are enriched for more than one functional category (Figures 4.6A and 4.6B). This multiple enrichment situation indicates that genes belonging to the enriched categories are probably functionally related, or may be regulated by common regulators. Since D. mccartyi are organohalide respiring microbes, QT clusters enriched for genes involved in energy metabolism, such as hydrogenases, reductive dehalogenases, and proton translocating NADH-dehydrogenases, are very important. Also, QT clusters enriched for genes from multiple functional categories, including genes with unknown functions, are interesting as the co-expressed metabolic genes probably help annotate the hypothetical proteins. Thus, further analysis of two functionally enriched QT clusters (Figure 4.7) is described in the following sections and summarized in Appendix B: Table S16 in Table B1.
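
The hypergeometric enrichment p-value underlying this analysis can be illustrated with a minimal sketch. It computes the probability of observing at least a given number of genes from one functional category in a cluster, assuming the cluster is a random draw without replacement from all genes. The function name and the category size and hit count in the example are hypothetical; only the genome size (1560 genes) and the cluster size (25 genes, as for cluster 2 of strain 195) are taken from the text.

```python
from math import comb

def hypergeom_enrichment_p(total_genes, category_genes, cluster_size, hits):
    """Upper-tail hypergeometric probability P(X >= hits).

    total_genes:    genes in the genome (population size)
    category_genes: genes in the genome assigned to the functional category
    cluster_size:   genes in the QT cluster (number of draws)
    hits:           genes in the cluster assigned to the functional category
    """
    upper = min(category_genes, cluster_size)
    favourable = sum(
        comb(category_genes, k) * comb(total_genes - category_genes, cluster_size - k)
        for k in range(hits, upper + 1)
    )
    return favourable / comb(total_genes, cluster_size)

# Hypothetical example: 8 of 25 cluster genes fall in a category of 100 genes.
p_example = hypergeom_enrichment_p(
    total_genes=1560, category_genes=100, cluster_size=25, hits=8
)
is_enriched = p_example < 0.05
```

A cluster would be called functionally enriched for the category when this p-value falls below the chosen significance level.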

4.4.7. Analysis of strain 195 QT cluster 2

Cluster 2 of strain 195 comprises 25 genes and is overrepresented by genes from central carbon metabolism, nucleotide metabolism, and of unknown function (Figure 4.6A). The absolute gene-expression profile (Figure 4.7A) shows that genes in this cluster have similar expression patterns with higher expression in “HighB12” and “ANASspent” conditions. However, the relative gene-expression profile (Figure 4.7B) indicates that the genes were most highly and lowly transcribed in “ANASspent” and “LS” conditions, respectively. Since genes in this cluster are mostly growth related, as suggested by the enrichment of genes from central carbon metabolism and nucleotide metabolism categories, higher gene expression likely indicates faster growth of strain 195. Also, the growth medium supplemented with the filter sterilized supernatant of ANAS cultures (i.e., ANASspent) probably had the highest nutrient content (Richardson et al., 2002, Johnson et al., 2009) as compared to the rest of the conditions; hence, higher transcription of genes in the “ANASspent” condition (Figure 4.7B) was likely due to the favorable growth of strain 195. The lowest concentration of substrate and nutrient in the “LS” condition caused slow growth of strain 195 (Johnson et al., 2008) and was possibly responsible for the lowest gene transcription (Figure 4.7B). In the metabolic modeling study (Ahsanul Islam et al., 2010), the central metabolic genes (DET0509 and DET0742) of this cluster were suggested to be involved in glycolysis/gluconeogenesis and sugar metabolism to produce precursors for cell membrane biogenesis (Kanehisa et al., 2011, Markowitz et al., 2009, Nelson and Cox, 2006). DET0509 (hypothetical protein) was annotated as a putative bifunctional phosphoglucose isomerase (EC: 5.3.1.9)/phosphomannose isomerase (EC: 5.3.1.8) during extensive curation of the D. mccartyi metabolic model (Ahsanul Islam et al., 2010) (Table 4.1 and Appendix B: Table S16 in Table B1). Thus, its inclusion in a cluster enriched for central carbon metabolism genes further supports its annotation. Similarly, two other operonic hypothetical proteins, DET0591 and DET0592 (Figure 4.7B and Table 4.1), of this cluster are probably involved in sugar or carbohydrate metabolism because they clustered closer to the central metabolic genes (DET0509 and DET0742) during hierarchical clustering (Figure 4.7B). Moreover, two other genes (DET0590: glyceraldehyde-3-phosphate dehydrogenase and DET0593: enolase) of this operon (Markowitz et al., 2009) are also involved in sugar metabolism (Kanehisa et al., 2011). In fact, DET0592 is 58% identical at the amino acid level to the biochemically characterized maltose-6-phosphate glucosidase (EC: 3.2.1.122) of Fusobacterium mortiferum (Thompson et al., 1995) in SWISSPROT (Boeckmann et al., 2003) and PDB (Berman et al., 2000); hence, it was annotated as a putative maltose-6-phosphate glucosidase involved in carbohydrate metabolism (Kanehisa et al., 2011) (Table 4.1 and Appendix B: Table S16 in Table B1).

The cluster also includes three putative lipid metabolism genes that are members of the same operon: DET0369, DET0371 and DET0372 (Table 4.1). DET0369 (EC: 1.17.7.1) and DET0371 (EC: 1.1.1.267) are involved in isoprenoid biosynthesis using the non-mevalonate pathway (Brammer et al., 2011, Kanehisa et al., 2011, Kemp et al., 2002, Ramsden et al., 2009) while DET0372 (phosphatidate cytidylyltransferase, EC: 2.7.7.41) takes part in the metabolism of glycerophospholipids (Kanehisa et al., 2011, Markowitz et al., 2009), which are the main structural components of biological cell membranes (Nelson and Cox, 2006) (Figure 4.7B and Table 4.1). Two operonic transporter genes (DET0417 and DET0418) were proposed to be putative L-glutamine transporters during the previous modeling study (Ahsanul Islam et al., 2010); however, clustering of DET0418 closer to DET0518 (Figure 4.7B) suggests that both transporter genes probably encode methionine transporters. This is because DET0518 was annotated in the modeling study (Ahsanul Islam et al., 2010) as a putative methylthioribose-1-phosphate isomerase (EC: 5.3.1.23), which is involved in methionine metabolism (Kanehisa et al., 2011, Markowitz et al., 2009). Intriguingly, the close hierarchical clustering of a putative methionine transporter (DET0418) with a gene involved in glycerophospholipid metabolism (DET0372) (Figure 4.7B and Table 4.1) suggests a potential relationship between amino acid transport and lipid metabolism. A recent isotope labelling study (Zhuang et al., 2011), indeed, showed that strain 195 incorporated methionine from the external medium during growth and dechlorination. Thus, QT clustering analysis of transcriptomic data, along with functional enrichment analysis and operon predictions, helped annotate hypothetical proteins, or propose new annotations for previously annotated genes of strain 195.

Figure 4.7. Analysis of two functionally enriched strain 195 QT clusters. Two functionally enriched and interesting QT clusters (clusters 2 and 6) of strain 195 transcriptomic data were further analyzed by the hierarchical clustering algorithm as represented by the dendrograms in (B) and (D). Absolute gene expression intensities of the clusters are plotted in (A) and (C) while relative or normalized gene expression intensities (Materials and methods) are presented as heat maps in (B) and (D). The height of the dendrograms represents the similarity of gene expression patterns and is measured by the Spearman’s rank correlation coefficient (SCC). Genes whose names are in green or orange are part of an operon, but orange further indicates that multiple genes from the same operon are present in the cluster.
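
The Spearman’s rank correlation coefficient used as the similarity measure in the dendrograms can be illustrated with a minimal sketch. Expression values are first converted to ranks (tied values share their mean rank), and the Pearson correlation of the ranks is then computed. Converting the coefficient into a distance as 1 minus SCC is an assumption made here for illustration, as are the toy profiles.

```python
import math

def average_ranks(values):
    """Return 1-based ranks of `values`; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(a, b):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    ranks_a, ranks_b = average_ranks(a), average_ranks(b)
    n = len(a)
    mean_a, mean_b = sum(ranks_a) / n, sum(ranks_b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ranks_a, ranks_b))
    sd_a = math.sqrt(sum((x - mean_a) ** 2 for x in ranks_a))
    sd_b = math.sqrt(sum((y - mean_b) ** 2 for y in ranks_b))
    return cov / (sd_a * sd_b)

# Hypothetical expression profiles of two genes across four conditions.
gene_x = [10.0, 20.0, 40.0, 80.0]
gene_y = [1.0, 3.0, 2.0, 9.0]
scc = spearman(gene_x, gene_y)
distance = 1.0 - scc  # assumed conversion of similarity to a clustering distance
```

Because the coefficient depends only on ranks, two genes whose expression rises and falls across the same conditions are scored as similar even if their absolute intensities differ greatly.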

4.4.8. Analysis of strain 195 QT cluster 6

Another important QT cluster of strain 195, overrepresented by genes involved in energy metabolism (Figure 4.6A), is cluster 6 comprising 15 genes. Absolute (Figure 4.7C) and relative (Figure 4.7D) gene expression profiles of this cluster showed high and low transcription of genes in “LS” and “ANASspent” conditions, respectively, a scenario opposite to the previously described QT cluster 2. This difference in relative gene expression profiles suggests that strain 195 needs to generate energy by reductive dechlorination to maintain cellular integrity (Pirt, 1965, Pirt, 1982, Russell and Cook, 1995) even though the cells are not growing in the “LS” condition. It also supports the notion of growth-decoupled reductive dechlorination by strain 195 (Maymó-Gatell et al., 1997, Seshadri et al., 2005). Genes in this cluster are mainly involved in energy metabolism, specifically genes present in the respiratory chain of strain 195, including 2 rdhA and 2 rdhB genes (DET0318 (pceA), DET0319, DET1558, and DET1559) (Table 4.1 and Figure 4.7D). Interestingly, DET0318, a biochemically characterized tetrachloroethene (PCE) rdhA (pceA) gene (Magnuson et al., 1998), was not transcribed in “ANASspent” and “ANASmedium” conditions though it was the most highly transcribed gene during the growth of strain 195 in its own medium (Figure 4.7C). ANAS cultures were not reported to degrade PCE (Lee et al., 2006, Richardson et al., 2002), and the supernatant, as well as the growth medium of ANAS, might contain nutrients that possibly inhibited pceA gene expression.

The cluster also contains a putative flavodoxin gene (DET1501) that is 33% identical at the amino acid level with the biochemically characterized flavodoxin from Desulfovibrio vulgaris strain Hildenborough (Curley and Voordouw, 1988) in SWISSPROT and PDB (Table 4.1 and Figure 4.7D). Flavodoxins are small electron transfer proteins containing a single flavin mononucleotide (FMN) molecule that usually participates in low potential redox reactions (Biel et al., 1996, Sancho, 2006). Thus, the presence of a putative flavodoxin (DET1501) with rdh genes in a co-expressed QT cluster enriched for energy metabolism genes indicates its potential involvement in the reductive dechlorination process, as well as in D. mccartyi respiration (Figure 4.7D, Table 4.1 and Appendix B: Table S16 in Table B1). This hypothesis is further corroborated by the fact that a low potential electron donor is required to continue the reductive dechlorination process (Holliger et al., 1998b, Hölscher et al., 2003). Recently, a flavin mediated “electron bifurcation” mechanism has been reported for anaerobic microorganisms (Herrmann et al., 2008, Thauer et al., 2008), in which an endergonic reaction is driven by the energy from a simultaneously occurring exergonic reaction. The mechanism of the D. mccartyi electron transport chain (ETC) is still unknown; however, probable involvement of a flavodoxin, together with reductive dehalogenases, in the ETC suggests the possibility of electron bifurcation during the reductive dechlorination process. Also, the inclusion in this cluster of DET0320 and DET1500, two putative transcriptional regulators based on their homology (46% sequence identity with E. coli K12) in SWISSPROT, IMG, PDB, and EBI InterProScan (Quevillon et al., 2005) databases, suggests their likely involvement in regulating energy conservation processes and reductive dehalogenation, as has been suggested previously (Kube et al., 2005, Seshadri et al., 2005) (Table 4.1 and Appendix B: Table S16 in Table B1). Clustering of similar energy metabolism genes was also observed for KB-1 Dhc transcriptomic data (Appendix B: Table S16 in Table B1).

Table 4.1. Strain 195 genes identified in functionally enriched clusters and associated inferred annotations

Locus Tag | Operon ID | Cluster No | Gene No | Primary Annotation | Revised Annotation in the Model | Suggested New Annotation from This Study | Model Subsystem
--- | --- | --- | --- | --- | --- | --- | ---
DET0509 | 106 | 2 | 113 | hypothetical protein | putative bifunctional phosphoglucose/phosphomannose isomerase | Retain previous annotation | Central Carbon Metabolism
DET0742 | 160 | 2 | 192 | triosephosphate isomerase | Retain previous annotation | Retain previous annotation | Central Carbon Metabolism
DET0369 | 84 | 2 | 57 | 1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate synthase | Retain previous annotation | Retain previous annotation | Lipid Metabolism
DET0371 | 84 | 2 | 58 | 1-deoxy-D-xylulose 5-phosphate reductoisomerase | Retain previous annotation | Retain previous annotation | Lipid Metabolism
DET0372 | 84 | 2 | 59 | phosphatidate cytidylyltransferase | Retain previous annotation | Retain previous annotation | Lipid Metabolism
DET0417 | 91 | 2 | 79 | amino acid ABC transporter; ATP-binding protein | putative methionine transporter | putative glutamine transporter | Transport
DET0418 | 91 | 2 | 80 | amino acid ABC transporter; permease protein | putative methionine transporter | putative glutamine transporter | Transport
DET0518 | 108 | 2 | 118 | translation initiation factor, putative, putative | methylthioribose-1-phosphate isomerase | Retain previous annotation | Amino Acid Metabolism
DET0591 | 125 | 2 |  | hypothetical protein | hypothetical protein | putative carbohydrate esterase | Central Carbon Metabolism
DET0592 | 125 | 2 |  | hypothetical protein | hypothetical protein | ...hosphate glucosidase | Central Carbon Metabolism
DET0318 | 71 | 6 | 19 | reductive dehalogenase, putative | tetrachloroethene reductive dehalogenase | Retain previous annotation | Energy Metabolism
DET0319 | 71 | 6 | 446 | reductive dehalogenase anchoring protein, putative | tetrachloroethene reductive dehalogenase anchoring protein | Retain previous annotation | Energy Metabolism
DET0320 | 71 | 6 |  | hypothetical protein | hypothetical protein | putative transcriptional regulator/activator | Non-metabolic
DET1558 | 326 | 8 | 523 | reductive dehalogenase anchoring protein, putative | Retain previous annotation | Retain previous annotation | Energy Metabolism
DET1559 | 326 | 8 | 425 | reductive dehalogenase, putative | Retain previous annotation | Retain previous annotation | Energy Metabolism
DET1500 | 310 | 8 |  | hypothetical protein | hypothetical protein | putative transcriptional regulator/activator | Non-metabolic
DET1501 | 310 | 8 |  | flavodoxin | flavodoxin | Retain previous annotation | Energy Metabolism

Although gene expression microarrays are genome-wide, high-throughput experimental studies that catalogue the global transcriptional changes of an organism, they cannot provide deterministic information such as the activity of genes and enzymes, or their involvement in specific metabolic processes. Hence, this information alone cannot unravel the activity of metabolic genes or the metabolism of an organism. However, if transcriptomic data are analyzed together with detailed metabolic information, such as the pan-genome-scale metabolic model discussed in this study, they can provide useful insight into the function of metabolic genes, as well as hypothetical proteins. Such integrated analysis can also be instrumental in shedding light on poorly understood physiological processes of difficult-to-culture organisms like D. mccartyi. That being said, the transcriptomic experiments and data analyzed in this study were not designed specifically to capture changes in the expression pattern of metabolic genes; for instance, D. mccartyi were either growing or not growing in all experimental conditions, and no specific metabolic perturbations, such as the lack of an essential nutrient or vitamin, were imposed on them during their growth. Moreover, absolute expression intensities of the array data, rather than differential gene expression analysis, were used in our study due to the variability of the array design and array data sources. Hence, future microarray experiments designed to perturb and catalogue metabolic changes in D. mccartyi will be useful for advancing our fundamental understanding of the physiology and metabolism of these environmentally important and difficult-to-culture microbes.

4.5. Conclusions

Due to the lack of a genetic system and the associated challenges of growing pure isolates of D. mccartyi in defined mineral media, detailed biochemical studies concerning their physiology and metabolism are limited. This study analyzed and visualized curated transcriptomic data for strain 195 and D. mccartyi strains in KB-1 (KB-1 Dhc) from various experiments while leveraging our previously developed D. mccartyi metabolic model. Using available transcriptomic and proteomic data from previous studies, we confirmed the expression of the majority of hypothetical proteins and metabolic genes in the strain 195 and KB-1 Dhc genomes. We identified a number of high quality clusters for both data sets that provided improved understanding of the genes (such as flavodoxin and rdhs) involved in the yet unknown mechanism of the energy conserving respiratory chain of these organisms. Clustering and functional enrichment analyses of the transcriptomic data highlighted that lipid metabolism (more specifically, cell membrane biogenesis) and the function of transporters were very important for D. mccartyi. Operon analysis, as well as the quality threshold clustering of transcriptomic data, provided additional confidence in prior reannotations or new function predictions for a number of genes, including 5 hypothetical proteins. Since hypothetical proteins constitute a major portion of any sequenced genome, predicting their function is a significant challenge, and all relevant clues are welcome. Also, predicted annotations for the hypothetical proteins can serve as a guide in designing future experiments for biochemical characterization of these genes. Finally, this meta-analysis clearly shows that the integrated study of high-throughput transcriptomic data with the pan-genome-scale metabolic model for D. mccartyi can advance our knowledge of the fundamental details of the physiology and metabolism of these difficult-to-grow yet environmentally important anaerobes. This enhanced knowledge of metabolism, in turn, will be beneficial for the optimal use of these bacteria in elucidating global halogen cycles and developing effective strategies for the bioremediation of sites contaminated with chlorinated pollutants around the world.

Chapter 5: Model-assisted prediction and experimental characterization of isocitrate dehydrogenase and phosphomannose isomerase from Dehalococcoides mccartyi strain KB-1

5.1. Abstract

Proteins of unknown or non-specific function and hypothetical proteins constitute ~50% of the genomes of Dehalococcoides mccartyi, a group of environmentally important, strictly anaerobic bacteria. Previous genome-scale modeling and transcriptomic studies on D. mccartyi metabolism led to the review or reannotation of over 80 genes, including some hypothetical proteins. These reannotations were based on various bioinformatic analyses, such as sequence homology analysis, operon analysis, and phylogenetic profiling. Here, we experimentally characterized two of those reannotated genes from D. mccartyi strain KB-1 to verify the proposed annotations: 1) an NADP+-dependent isocitrate dehydrogenase (KB1_0495), and 2) a bifunctional phosphoglucose/phosphomannose isomerase (KB1_0553). The original annotation for KB1_0495 was an NAD+-dependent isocitrate dehydrogenase, and KB1_0553 was originally annotated as a hypothetical protein/SIS protein. KB1_0495, also denoted as DmIDH, showed activity primarily with NADP+ as a cofactor, while only phosphomannose isomerase activity was identified and confirmed for KB1_0553, also denoted as DmPMI. Bioinformatic analysis of their sequences suggested that they belong to novel enzyme families within their respective enzyme superfamilies. Thus, the biochemical characterization of KB1_0495 (DmIDH) and KB1_0553 (DmPMI) highlights the importance of gene reannotations from metabolic modeling and transcriptomic studies as valuable hypotheses that, if tested, can enhance our knowledge about the physiology and biochemistry of the organism of interest.

5.2. Introduction

Dehalococcoides mccartyi, a group comprising some of the smallest free-living organisms, are important for their unique niche specialty: the detoxification of ubiquitous and stable ground water pollutants, such as anthropogenic chlorinated ethenes and benzenes, into benign or less toxic compounds (Adrian et al., 2007a, Adrian et al., 2000b, He et al., 2003, Löffler et al., 2012, Maymó-Gatell et al., 1997). Only these strictly anaerobic bacteria are capable of harnessing energy for growth from the complete detoxification of the known human carcinogens trichloroethene (TCE) and vinyl chloride (VC) (Guha et al., 2012, TEACH, 2011) into benign ethene. This energy-conserving metabolic process, termed organohalide respiration, is catalyzed by reductive dehalogenases, the respiratory enzyme system of D. mccartyi. Although organohalide respiration is useful for the bioremediation of toxic chloro-organic solvents, the process is slow because D. mccartyi grow more slowly in pure isolates than in mixed microbial communities, their natural habitats (Adrian et al., 1998, Adrian et al., 2000b, Ahsanul Islam et al., 2010). Thus, a fundamental understanding of D. mccartyi metabolism, including the genes and enzymes involved in metabolic processes, is essential for their successful application to bioremediation. So far, numerous systems-level studies on D. mccartyi metabolism, including the construction of a pan-genome-scale metabolic model and various transcriptomic and proteomic analyses (Johnson et al., 2008, Johnson et al., 2009, Lee et al., 2012, Morris et al., 2007, Morris et al., 2006), have shed light on key metabolic processes and the genes involved.

Although D. mccartyi metabolism is well studied, only a few metabolic genes have been experimentally characterized. These include an Re-citrate synthase (Marco-Urrea et al., 2011) involved in the TCA-cycle, a bifunctional mannosylglycerate synthase/phosphatase with a potential role in osmotic stress adaptation (Empadinhas et al., 2004), and five reductive dehalogenases involved in respiration and energy conservation (Adrian et al., 2007b, Krajmalnik-Brown et al., 2004, Magnuson et al., 2000, Magnuson et al., 1998, Müller et al., 2004, Tang et al., 2013). Also, the activity of hydrogenases was experimentally characterized in D. mccartyi strains 195 (Nijenhuis and Zinder, 2005) and CBDB1 (Jayachandran et al., 2004). Apart from these functionally characterized genes, experimental studies on D. mccartyi genes are largely lacking. In addition, genome sequences of these bacteria revealed the presence of ~50% hypothetical proteins and proteins with unknown or non-specific functions in their genomes (Kube et al., 2005, Seshadri et al., 2005). However, the primary gene annotations, including the annotations for more than 80 metabolic genes, were reviewed or corrected during the construction and manual curation of the D. mccartyi metabolic model (Chapter 3) (Ahsanul Islam et al., 2010). Also, a recent system-wide study (Ahsanul Islam et al., 2013) on D. mccartyi transcriptomes (Chapter 4) led to proposed putative functions for 5 hypothetical proteins, in addition to providing additional confidence in reannotated genes from the modeling study. One of these 5 hypothetical proteins was proposed to be a putative bifunctional phosphoglucose isomerase (PGI; EC 5.3.1.9)/phosphomannose isomerase (PMI; EC 5.3.1.8) due to its inclusion in a co-expressed gene cluster enriched for central carbon metabolic genes (Ahsanul Islam et al., 2013). The same annotation was proposed during the construction of the D. mccartyi metabolic model (Ahsanul Islam et al., 2010). Another metabolic gene that was primarily annotated as a putative NAD+-dependent isocitrate dehydrogenase (IDH) (EC 1.1.1.41) was reannotated as a putative NADP+-dependent IDH (EC 1.1.1.42) during the modeling study (Ahsanul Islam et al., 2010). However, neither of these proposed gene annotations was supported by experimental evidence.

In this study, we report the biochemical characterization of the aforementioned putative IDH (KB1_0495) and PGI/PMI (KB1_0553) from D. mccartyi strain KB-1. Although IDH is an important TCA-cycle enzyme catalyzing the formation of 2-oxoglutarate and CO2 with NAD+ or NADP+ as a cofactor (Kanehisa et al., 2011, Madigan et al., 2010, Nelson and Cox, 2006), the physiological role of a bifunctional PGI/PMI in D. mccartyi is unclear. PGI, in general, plays a central role in sugar metabolism via glycolysis and gluconeogenesis in all three domains of life (Hansen et al., 2004, Hansen, 2004, Nelson and Cox, 2006), while PMI helps to produce precursors for cell wall components, glycoproteins, glycolipids, and storage polysaccharides (Quevillon et al., 2005, Rajesh et al., 2012, Hansen, 2004). However, glycolysis is inactive in D. mccartyi, and they also lack a typical bacterial cell wall (Löffler et al., 2012). Moreover, all metabolic genes in the D. mccartyi metabolic model (Ahsanul Islam et al., 2010) were categorized into three classes based on the quality and types (i.e., biochemical or bioinformatic) of evidence supporting the gene annotations: low, medium, and high confidence genes. We chose to functionally characterize a medium confidence gene (KB1_0495) and a low confidence gene (KB1_0553) because (1) biochemical evidence for these genes is non-existent, and annotations without experimental evidence are simply hypotheses, (2) bioinformatic evidence for the genes is either limited (represented by medium confidence for KB1_0495) or insufficient (represented by low confidence for KB1_0553) in the model, and (3) correctness of their proposed annotations/hypotheses will support, or at least add confidence to, the other gene reannotations/hypotheses presented in the modeling and transcriptomic studies (Ahsanul Islam et al., 2010, Ahsanul Islam et al., 2013). We, therefore, heterologously expressed KB1_0495 and KB1_0553 in E. coli and tested the biochemical activities of the purified recombinant proteins. We characterized IDH activity in KB1_0495 (also denoted as DmIDH) with NADP+ as the main cofactor, while only PMI activity was identified and confirmed for KB1_0553 (also denoted as DmPMI). Analyses of their physiological roles indicated the potential involvement of DmIDH in energy metabolism and of DmPMI in compatible solute production to help D. mccartyi adapt to osmotic stress. We also compared the predicted secondary structures of both enzymes with the crystal structures of their closest homologs, and conducted bioinformatic analyses of their sequences. These analyses revealed their novelty and suggested that they might be part of new enzyme families within their respective enzyme superfamilies.

5.3. Materials and methods

5.3.1. Bacterial culture, reagents and chemicals

Genomic DNA (gDNA) was collected from KB-1, a D. mccartyi-containing anaerobic mixed culture growing on TCE and methanol, following the procedure described previously (Duhamel and Edwards, 2006, Duhamel, 2005, Waller, 2010). The PCR primers for amplifying strain KB-1 gDNA were synthesized by Integrated DNA Technologies (Coralville, IA, USA). Luria Broth (LB) and Terrific Broth (TB) powder were purchased from EMD Chemicals (Gibbstown, NJ, USA), and the Bradford assay reagent from Bio-Rad (Hercules, CA, USA). Lysozyme, proteinase K, agarose, glycerol, ampicillin, kanamycin, SDS, and IPTG were obtained from BioShop (Burlington, ON, Canada), and all other chemicals were purchased from Sigma-Aldrich (St. Louis, MO, USA) at greater than 98% purity. Nickel-nitrilotriacetic acid (Ni-NTA) resin and the QIAquick PCR purification kit were purchased from Qiagen (Mississauga, ON, Canada), while the In-Fusion PCR cloning kit was purchased from Clontech (Palo Alto, CA, USA). The commercially available kits were used according to the manufacturers’ instructions.

5.3.2. Gene cloning and overexpression of the selected genes in E. coli

The selected genes (KB1_0495 and KB1_0553) were PCR-amplified using KB-1 gDNA and PCR primers containing the restriction sites for BamHI and NdeI, and were cloned into the modified pET-15b vector (Novagen, Madison, WI, USA) containing a 5’ N-terminal hexahistidine tag (6xHis-tag) and an ampicillin resistance gene as described previously (Zhang et al., 2001). In the modified vector, the tobacco etch virus protease cleavage site replaced the thrombin cleavage site, and a double stop codon was introduced downstream from the BamHI site (Zhang et al., 2001). These vectors were subsequently transformed into E. coli BL21 (DE3) Gold strain (Stratagene, La Jolla, CA, USA) for overexpression of the targeted fused genes. The cells were grown aerobically in 1 liter flasks containing LB medium at 37°C and ~220 rpm until the OD600 reached around 1.0 (in approximately 3 hours). Expression of the cloned genes was induced by the addition of 100 mg IPTG. The cells were then harvested the following day by centrifugation at 7,500 rpm and 4°C for 20 min.

5.3.3. Purification of the overexpressed recombinant proteins

The overexpressed, fused 6xHis-tagged proteins were purified using immobilized metal ion affinity chromatography (IMAC) (Hochuli, 1990, Hochuli et al., 1988, Porath, 1992, Porath et al., 1975) as described previously (Zhang et al., 2001). Briefly, the harvested E. coli cells were suspended in the binding buffer and disrupted by sonication. The sonicated cell lysates were clarified by centrifugation at 23,500 rpm and 4°C for 45 minutes. Afterwards, the supernatant was loaded on a column containing Ni-NTA resin and washed with 200 mL of wash buffer to remove proteins non-specifically bound to the resin. The 6xHis-tagged proteins were eluted using an elution buffer with increasing concentrations of imidazole. Purified proteins were frozen in liquid nitrogen if they were collected with a high yield (≥ 50 mg/liter of culture) and homogeneity (≥ 95%) as described previously (Zhang et al., 2001). SDS-PAGE (sodium dodecyl sulfate polyacrylamide gel electrophoresis) followed by staining with Coomassie Brilliant Blue R 250 was performed according to standard procedures (Laemmli, 1970) to check the expression level and purity of the targeted proteins.

5.3.4. Enzymatic assays for the purified recombinant proteins

The isocitrate dehydrogenase (IDH) activity of KB1_0495 was determined by a standard assay (Steen et al., 1998), in which the enzyme converted D-isocitric acid to 2-oxoglutarate and CO2 using NADP+ or NAD+ as a cofactor according to the following equation:

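\[ \text{isocitrate} + \mathrm{NADP^{+}}\ (\text{or } \mathrm{NAD^{+}}) \xrightarrow{\ \text{IDH}\ } \text{2-oxoglutarate} + \mathrm{CO_2} + \mathrm{NADPH}\ (\text{or } \mathrm{NADH}) \]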

The standard 1 mL assay contained 50 mM tricine-hydrochloride (Tris-HCl) buffer (pH 7.5), 0.3 mM NADP+, 1 mM D-isocitric acid, and 10 mM MgCl2 (Steen et al., 1998). The reaction was started by adding 1 µg of purified protein to the reaction mixture, and formation of the product, 2-oxoglutarate, was inferred by measuring NADPH spectrophotometrically at 340 nm and 30ºC. The IDH activity of KB1_0495 was also measured using NAD+ instead of NADP+ as a cofactor.
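The conversion from an absorbance slope at 340 nm to a specific activity can be sketched as follows. This is a minimal illustration, not the procedure used in the thesis: it assumes a 1 cm path length and the standard molar absorption coefficient of NAD(P)H at 340 nm (6.22 mM^-1 cm^-1). The function name `specific_activity` and the example slope are hypothetical.

```python
# Molar absorption coefficient of NADPH/NADH at 340 nm (standard literature value).
EPSILON_340_PER_MM_CM = 6.22  # mM^-1 cm^-1


def specific_activity(delta_a340_per_min, assay_volume_ml, protein_mg, path_length_cm=1.0):
    """Return specific activity in umol min^-1 mg^-1 from an absorbance slope."""
    if protein_mg <= 0 or path_length_cm <= 0:
        raise ValueError("protein mass and path length must be positive")
    # Beer-Lambert law: concentration change (mM, i.e. umol/mL) per minute.
    # abs() lets the same function serve assays that consume NADH (falling A340).
    rate_mm_per_min = abs(delta_a340_per_min) / (EPSILON_340_PER_MM_CM * path_length_cm)
    # Total umol converted per minute in the cuvette, normalized by enzyme mass.
    umol_per_min = rate_mm_per_min * assay_volume_ml
    return umol_per_min / protein_mg


# Hypothetical slope (not a thesis measurement): 0.0622 A340/min, 1 mL assay, 1 ug enzyme.
example = specific_activity(0.0622, assay_volume_ml=1.0, protein_mg=0.001)
print(f"{example:.1f} umol/min/mg")
```

Because one NADPH is formed per isocitrate oxidized, the NADPH rate can stand in for the rate of 2-oxoglutarate formation.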

Both phosphoglucose isomerase (PGI) and phosphomannose isomerase (PMI) activities were tested for KB1_0553, but only PMI activity was identified and confirmed using two standard assays (Hansen et al., 2004) described by the following equations:

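Assay 1:

\[ \mathrm{M6P} \xrightarrow{\ \text{PMI}\ } \mathrm{F6P} \]

\[ \mathrm{F6P} + \mathrm{NADH} \xrightarrow{\ \text{M1PDH}\ } \mathrm{M1P} + \mathrm{NAD^{+}} \]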

where M6P refers to mannose-6-phosphate, F6P refers to fructose-6-phosphate, M1P refers to mannitol-1-phosphate, and M1PDH is mannitol-1-phosphate-5-dehydrogenase (EC 1.1.1.17).

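Assay 2:

\[ \mathrm{M6P} \xrightarrow{\ \text{PMI}\ } \mathrm{F6P} \]

\[ \mathrm{F6P} \xrightarrow{\ \text{PGI}\ } \mathrm{G6P} \]

\[ \mathrm{G6P} + \mathrm{NADP^{+}} \xrightarrow{\ \text{G6PDH}\ } \mathrm{6PG15L} + \mathrm{NADPH} \]

where G6P refers to glucose-6-phosphate, 6PG15L refers to 6-phospho-D-glucono-1,5-lactone, and G6PDH is glucose-6-phosphate dehydrogenase (EC 1.1.1.49).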

Reaction mixtures for assay 1 contained 100 mM Tris-HCl (pH 7.5), 0.5 mM NADH, 10 mM M6P, and 10 µL of M1PDH purified from E. coli (Novotny et al., 1984), and those for assay 2 contained 100 mM Tris-HCl (pH 7.5), 0.5 mM NADP+, 10 mM M6P, 1.1 U of G6PDH from yeast (Sigma-Aldrich, St. Louis, MO), and 1 U of PGI from yeast (Sigma-Aldrich, St. Louis, MO). Formation of the product, F6P, was determined from the oxidation of NADH in assay 1 and from the reduction of NADP+ in assay 2. In both instances, product formation was inferred from the absorbance measured with a spectrophotometer at 340 nm and 30ºC. The kinetic parameters, Km and Vmax, for both KB1_0495 and KB1_0553 were determined by nonlinear curve fitting of the Michaelis-Menten enzyme kinetics model using the software GraphPad Prism v 5.0 (GraphPad Software, Inc., La Jolla, CA).
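The nonlinear fit of the Michaelis-Menten model can be illustrated with a short, dependency-free sketch. It is not the GraphPad Prism procedure used in the thesis; it is a stand-in that exploits the fact that, for a fixed Km, the least-squares Vmax has a closed form, so only Km needs a one-dimensional search. The function names, the Km search bounds, and the synthetic data are all assumptions made for illustration.

```python
import math


def mm_rate(s, vmax, km):
    """Michaelis-Menten rate v = Vmax * S / (Km + S)."""
    return vmax * s / (km + s)


def best_vmax(substrate, rates, km):
    """For a fixed Km the model is linear in Vmax, so Vmax has a closed form."""
    f = [s / (km + s) for s in substrate]
    return sum(v * fi for v, fi in zip(rates, f)) / sum(fi * fi for fi in f)


def sse(substrate, rates, km):
    """Sum of squared residuals at the best Vmax for this Km."""
    vmax = best_vmax(substrate, rates, km)
    return sum((v - mm_rate(s, vmax, km)) ** 2 for s, v in zip(substrate, rates))


def fit_michaelis_menten(substrate, rates, km_low=1e-4, km_high=100.0, iterations=100):
    """Least-squares fit: golden-section search over log(Km), closed-form Vmax.

    Assumes the residual profile has a single minimum between km_low and km_high.
    """
    phi = (math.sqrt(5) - 1) / 2
    a, b = math.log(km_low), math.log(km_high)
    for _ in range(iterations):
        c = b - phi * (b - a)
        d = a + phi * (b - a)
        if sse(substrate, rates, math.exp(c)) < sse(substrate, rates, math.exp(d)):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    km = math.exp((a + b) / 2)
    return best_vmax(substrate, rates, km), km


# Synthetic, noise-free example (not thesis data): rates generated from Vmax = 21.3, Km = 0.11.
substrate_mm = [0.01, 0.02, 0.05, 0.1, 0.2, 0.5, 1.0, 2.0]
rates = [mm_rate(s, 21.3, 0.11) for s in substrate_mm]
vmax_fit, km_fit = fit_michaelis_menten(substrate_mm, rates)
print(f"Vmax = {vmax_fit:.2f}, Km = {km_fit:.3f}")
```

With real, noisy triplicate data, a dedicated fitting package would also be needed to obtain the standard errors of the parameters; this sketch returns point estimates only.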

5.4. Results and discussion

5.4.1. Biochemical activities and kinetic parameters of KB1_0495 (DmIDH) and KB1_0553 (DmPMI)

The heterologously expressed and purified recombinant enzymes of D. mccartyi strain KB-1 (KB1_0495 and KB1_0553) were tested for biochemical activities (Figures 5.1 and 5.2), and the results are reported in Table 5.1. The first enzyme, encoded by KB1_0495 (DmIDH), was tested for IDH activity with a standard assay (see Materials and methods). The pH optimum (7.5) of DmIDH was determined by measuring its catalytic activity in the pH range of 6 – 8.5 at 30 ºC (Figure 5.1A). Subsequently, the catalytic activity of the enzyme was measured at the optimum pH for various concentrations of the substrate, D-isocitric acid (Figure 5.1B), with NADP+ as the cofactor. These data were used to calculate the maximum enzyme velocity (Vmax) and the Michaelis constant (Km) (Table 5.1) from the non-linear regression analysis of the Michaelis-Menten enzyme kinetics model (Figure 5.1B). The enzyme also showed activity with NAD+ as the cofactor, but the activity was much lower than that with NADP+ (Table 5.1). This finding confirmed the annotation of DmIDH as an NADP+-dependent isocitrate dehydrogenase proposed during the metabolic modeling study (Ahsanul Islam et al., 2010). Kinetic parameters for DmIDH were also estimated using both NADP+ (Figure 5.1C) and NAD+ (Figure 5.1D) as substrates (Table 5.1).

The estimated kinetic parameters for DmIDH are comparable to those of similar bacterial and archaeal enzymes in the BRENDA database (Chang et al., 2009) (Table 5.1). Notably, the faster activity and higher efficiency of DmIDH with NADP+ relative to NAD+ are probably linked to its physiological roles in D. mccartyi. DmIDH is an important TCA-cycle enzyme that, in addition to producing the critical biosynthetic precursor 2-oxoglutarate, is involved in energy metabolism by recycling the NADP+/NADPH cofactor pair, an essential cellular energy currency. In fact, bacteria that grow on acetate and have a complete TCA-cycle generate 90% of the NADPH required for biosynthetic pathways using IDH (Dean and Golding, 1997, Steen et al., 1998). Thus, DmIDH plays a crucial role in the metabolism of D. mccartyi, which require acetate for growth (Adrian et al., 2000b, Maymó-Gatell et al., 1997) but possess an incomplete TCA-cycle (Kube et al., 2005, Marco-Urrea et al., 2011, Seshadri et al., 2005, Tang et al., 2009b).
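The cofactor preference can be made concrete with a small calculation of catalytic efficiency (kcat/Km) for each cofactor. The sketch below uses the rounded kcat and Km values transcribed from Table 5.1, so its results may differ slightly from the efficiencies reported there, which were presumably computed from unrounded fits. The dictionary layout and function name are illustrative assumptions.

```python
# Rounded values transcribed from Table 5.1 (kcat in s^-1, Km in mM).
dmidh = {
    "NADP+": {"kcat": 14.6, "km": 0.027},
    "NAD+": {"kcat": 0.22, "km": 0.12},
}


def catalytic_efficiency(kcat, km):
    """Return kcat/Km in mM^-1 s^-1."""
    return kcat / km


eff = {name: catalytic_efficiency(p["kcat"], p["km"]) for name, p in dmidh.items()}
# Ratio of efficiencies: how strongly the enzyme favors NADP+ over NAD+.
preference = eff["NADP+"] / eff["NAD+"]

for name, value in eff.items():
    print(f"{name}: kcat/Km = {value:.1f} mM^-1 s^-1")
print(f"NADP+ preference: about {preference:.0f}-fold")
```

Either way the ratio is computed, the efficiency with NADP+ exceeds that with NAD+ by more than two orders of magnitude, which is the basis for describing DmIDH as NADP+-dependent.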

The other purified enzyme, encoded by KB1_0553 (DmPMI), was tested for both PGI and PMI activities using two standard assays (see Materials and methods). Although PGI activity (glucose-6-phosphate, G6P ↔ fructose-6-phosphate, F6P) was tested in both directions, the enzyme showed no activity with either G6P or F6P as the substrate. Then, PMI activity (mannose-6-phosphate, M6P ↔ fructose-6-phosphate, F6P) was tested in the direction of F6P formation using the two assays described in the Materials and methods. After confirming the activity on M6P, the pH optimum (7.5) of DmPMI was identified by measuring its catalytic activity in the pH range of 6.5 – 8.5 (Figure 5.2A). Then, its activity was further measured at the optimum pH while varying the concentration of the substrate, M6P (Figure 5.2B). These data were used for non-linear regression analysis of the Michaelis-Menten enzyme kinetics model to calculate the maximum enzyme velocity (Vmax) and the Michaelis constant (Km) (Figure 5.2B and Table 5.1). We also determined the turnover number (kcat) and efficiency (kcat/Km) for both DmIDH and DmPMI, as reported in Table 5.1.
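The relationship between the fitted Vmax and the derived quantities can be written out explicitly. This is a sketch under stated assumptions: one active site per subunit, and a subunit molecular mass M expressed in kDa. M is not given in this section and is introduced here only as a symbol for illustration.

```latex
\[
k_{cat}\,[\mathrm{s^{-1}}]
  = \frac{V_{max}\,[\mathrm{\mu mol\,min^{-1}\,mg^{-1}}] \times M\,[\mathrm{mg\,\mu mol^{-1}}]}
         {60\,[\mathrm{s\,min^{-1}}]},
\qquad
\text{efficiency} = \frac{k_{cat}}{K_m}\,[\mathrm{mM^{-1}\,s^{-1}}]
\]
```

The units work because 1 kDa equals 1 mg per µmol. As an example of the second expression, the DmIDH values for NADP+ in Table 5.1 give 14.6 / 0.027 ≈ 540 mM^-1 s^-1. Read in reverse, the reported Vmax and kcat pairs (for example, 21.3 and 15 for isocitrate) would correspond to M of roughly 42 kDa under these assumptions; that figure is a back-calculation, not a value stated in the text.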

Table 5.1. Kinetic parameters of DmIDH and DmPMI from D. mccartyi strain KB-1

Kinetic Parameters | DmIDH (isocitrate) | Bacterial and Archaeal IDHs (isocitrate) in BRENDA (Range and Median) | DmIDH (NADP+) | Bacterial and Archaeal IDHs (NADP+) in BRENDA (Range and Median) | DmIDH (NAD+) | Bacterial and Archaeal IDHs (NAD+) in BRENDA (Range and Median) | DmPMI (M6P) | Bacterial and Archaeal PMIs (M6P) in BRENDA (Range and Median)
--- | --- | --- | --- | --- | --- | --- | --- | ---
pH optimum | 7.5 | 6 – 9; 8 | 7.5 | 6 – 9; 8 | 7.5 | 6 – 9; 8 | 7.5 | 6.5 – 8.5; 7
Vmax (µmoles.min-1.mg-1) | 21.3 ± 2 | 0.04 – 3884; 109 | 20.8 ± 2 | 0.04 – 3884; 109 | 0.32 ± 0.05 | 0.04 – 3884; 109 | 1.19 ± 0.2 | 0.01 – 890; 2.1
Km (mM) | 0.11 ± 0.01 | 0.0057 – 0.13; 0.0255 | 0.027 ± 0.02 | 0.0024 – 19.6; 0.0315 | 0.12 ± 0.04 | 0.144 – 18.6; 4.15 | 0.89 ± 0.09 | 0.06 – 23.12; 1.25
kcat (s-1) | 15 ± 1 | 0.0005 – 31.4; 0.005 | 14.6 ± 1 | 0.0003 – 37.4; 4 | 0.22 ± 0.03 | 0.0007 – 2.11; 1.37 | 0.84 ± 0.1 | 0.076 – 1371; 11
kcat/Km (mM-1.s-1) | 139.6 ± 1 | Not reported | 540 ± 1 | Not reported | 2 ± 1 | Not reported | 0.96 ± 0.3 | 13 – 6685; 109

The physiological role of DmPMI in D. mccartyi is unclear because cells of these bacteria lack a typical bacterial cell wall. Instead, they have an archaeal S-layer-like protein and cell membranes (Adrian et al., 2000b, He et al., 2003, Maymó-Gatell et al., 1997, Löffler et al., 2012); thus, DmPMI is likely involved in cell membrane biogenesis in these bacteria. It may also be involved in the mannosylglycerate (MG) biosynthesis pathway (Figure 5.3) because a bifunctional mannosyl-3-phosphoglycerate synthase (EC 2.4.1.217)/phosphatase (EC 3.1.3.70) enzyme encoded by the DET1363 (mgsD) gene was biochemically characterized from D. mccartyi strain 195 (Empadinhas et al., 2004). MG is an unusual compatible solute that helps thermophilic or hyperthermophilic bacteria and archaea adjust to osmotic stress and heat (Empadinhas, 2011, Santos and da Costa, 2002). Also, in vitro studies established its role in protecting proteins against thermal denaturation (Borges et al., 2002, Ramos et al., 1997). Although the physiological role of MG in mesophilic D. mccartyi is unclear, it was hypothesized to be involved in cellular osmotic stress adaptation in these bacteria (Hendrickson et al., 2002, Empadinhas et al., 2004, Smidt and de Vos, 2004). However, this suggested physiological role is a non-essential or specialized function rather than a major biological function, which probably explains why DmPMI has relatively slower catalytic activity and lower efficiency than similar enzymes in BRENDA (Table 5.1).

Figure 5.1. Effects of pH and substrate concentrations on the rate of DmIDH. (A) Catalytic activities of DmIDH with the substrate D-isocitric acid were plotted for the pH range of 6 – 8.5, and the highest activity was identified at pH 7.5. Rate dependence of DmIDH on the concentrations of the substrate D-isocitric acid (with NADP+ as the cofactor), NADP+, and NAD+ is shown in (B), (C), and (D), respectively. The data were fitted by non-linear regression analysis of the Michaelis-Menten enzyme kinetics model to identify the kinetic parameters (Vmax, Km, kcat, and kcat/Km) for DmIDH. Error bars represent the standard deviation of triplicate samples.

Figure 5.2. Effects of pH and substrate concentrations on the rate of DmPMI. (A) Catalytic activities of DmPMI with substrate mannose-6-phosphate were plotted for the pH range of 6 – 8.5, and the highest activity was identified at pH 7.5. (B) Rate dependence of DmPMI with change in substrate mannose-6-phosphate concentrations is shown. The data were fitted with the non-linear regression analysis of the Michaelis-Menten enzyme kinetics model to identify the kinetic parameters (Vmax, Km, kcat, and kcat/Km) for DmPMI. Error bars represent standard deviation of triplicate samples.

Figure 5.3. Mannosylglycerate (MG) biosynthesis pathway in D. mccartyi. Proposed genes and enzymes involved in the biosynthesis pathway of the compatible solute MG in D. mccartyi are shown. Gene locus names of homologous genes in different D. mccartyi genomes encoding the enzymes in each step are shown in parentheses. Biochemically characterized enzymes are highlighted in red font. Phosphomannose isomerase (EC 5.3.1.8) and the bifunctional mannosyl-3-phosphoglycerate synthase (EC 2.4.1.217)/phosphatase (EC 3.1.3.70) were characterized from strain KB-1 (KB1_0553, DmPMI) and strain 195 (DET1363, mgsD), respectively. The two other genes were identified in D. mccartyi genomes during the metabolic modeling study (Ahsanul Islam et al., 2010).

5.4.2. Sequence homology and phylogenetic analyses of DmIDH and DmPMI sequences

The sequence homology analysis of DmIDH and DmPMI protein sequences (Figure 5.4) revealed the remarkably conserved nature of DmIDH across various domains of life (Figure 5.4A) as compared to DmPMI (Figure 5.4B). Being a TCA-cycle enzyme, DmIDH was found to be > 40% identical at the amino acid level with eukaryotic, archaeal and bacterial IDHs, while > 90% identity was observed only within the Chloroflexi phylum (Figure 5.4A). Although DmPMI was > 90% identical with other PMIs from Chloroflexi, it showed < 30% amino acid sequence identity with the majority of its homologs (Figure 5.4B). This difference in sequence conservation between the two enzymes was also observed in their phylogenetic analysis (Figures 5.5 and 5.6). In addition to the members of Chloroflexi, DmIDH showed close relationship to and (Figure 5.5). The maximum likelihood (ML) tree of DmIDH (Figure 5.5) further showed its separate clustering from the previously described (Steen et al., 1997) subfamilies I and II of biochemically characterized IDHs; subfamily II includes eukaryotic IDHs, and subfamily I comprises archaeal and bacterial NADP+-dependent homodimeric IDHs (Steen et al., 1997). Also, no DmIDH homolog was identified in subfamily III, which mainly comprises NAD+-dependent IDHs from eukaryotes (Steen et al., 1997). Thus, DmIDH is likely part of a new subfamily that also includes bacterial IDHs from Planctomycetes, , and Cyanobacteria (Figure 5.5).
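The identity thresholds quoted above can be illustrated with a short sketch that scores a pairwise alignment and bins the result. This is not the BLASTP calculation used in the thesis: the denominator chosen here (alignment columns with no gap in either sequence) is an assumption, and BLASTP reports identities over its own aligned region. The function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        matches += a == b
    return 100.0 * matches / compared if compared else 0.0


def edge_class(identity):
    """Bin an identity value using the thresholds quoted in the text."""
    if identity > 90:
        return "> 90%"
    if identity > 40:
        return "> 40%"
    if identity < 30:
        return "< 30%"
    return "30-40%"


# Toy aligned fragments (hypothetical, not real IDH or PMI sequences).
toy_identity = percent_identity("MKV-LAT", "MKVQLGT")
print(f"{toy_identity:.1f}% identity, class {edge_class(toy_identity)}")
```

In the toy pair, six columns are compared and five match, which falls in the "> 40%" bin.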

DmPMI, on the other hand, was identified to be closely related to , in addition to the members of Chloroflexi; within Chloroflexi, it showed a closer relationship to D. mccartyi strain BAV1 than to any other strain (Figure 5.6). However, no homolog of DmPMI was identified within the three types of PMIs characterized and described so far (Proudfoot et al., 1994). Type I PMIs are from eukarya and bacteria, while type II PMIs are bifunctional and include both bacterial and archaeal PMIs (Proudfoot et al., 1994, Schmidt et al., 1992). The well-studied but not experimentally characterized PMI from the bacterium Rhizobium meliloti has been categorized as a type III PMI (Proudfoot et al., 1994, Schmidt et al., 1992). Among the biochemically characterized enzymes, the closest homologs of DmPMI were identified to be archaeal bifunctional PGI/PMIs (Figure 5.6). The archaeal bifunctional PGI/PMIs represent a novel enzyme family within the PGI superfamily (Hansen et al., 2004, Hansen, 2004), but they are not part of the cupin superfamily that includes all characterized PMIs (Dunwell et al., 2000, Dunwell, 2001). Thus, DmPMI from D. mccartyi strain KB-1 likely represents a novel class of bacterial PMIs, and the members may include homologs from Actinobacteria, Firmicutes, and (Figure 5.6). Notably, the bootstrap values for the DmPMI ML tree (Figure 5.6) are relatively small. Such small bootstrap values, however, may originate from the less conserved nature of DmPMI and its homologous sequences.
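The link between weak sequence conservation and low bootstrap support can be illustrated with a simplified resampling sketch. It is not the PhyML procedure used for Figures 5.5 and 5.6: a real phylogenetic bootstrap rebuilds an ML tree for every replicate, whereas this stand-in only asks how often one sequence remains the single closest match to a query after alignment columns are resampled with replacement. All names and the toy alignment are hypothetical.

```python
import random


def resample_columns(alignment, rng):
    """Draw alignment columns with replacement (one bootstrap replicate)."""
    length = len(next(iter(alignment.values())))
    picks = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in picks) for name, seq in alignment.items()}


def identity(a, b):
    """Fraction of identical positions between two equal-length sequences."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def bootstrap_support(alignment, query, candidate, replicates=100, seed=0):
    """Fraction of replicates in which `candidate` is the single closest sequence to `query`."""
    rng = random.Random(seed)
    others = [name for name in alignment if name != query]
    wins = 0
    for _ in range(replicates):
        rep = resample_columns(alignment, rng)
        scores = {name: identity(rep[query], rep[name]) for name in others}
        best = max(scores.values())
        winners = [name for name, score in scores.items() if score == best]
        wins += winners == [candidate]  # ties do not count as support
    return wins / replicates


# Toy alignment (hypothetical sequences): "near" matches the query, "far" shares no residue.
toy = {"query": "MKVLATGHER", "near": "MKVLATGHER", "far": "CCCCCCCCCC"}
support = bootstrap_support(toy, "query", "near")
print(f"support for query-near grouping: {support:.2f}")
```

When the competing sequences are all similarly distant from the query, as with poorly conserved homologs, the winner changes from replicate to replicate and the support fraction drops, which mirrors the explanation offered for the small bootstrap values in the DmPMI tree.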


Figure 5.4. Sequence homology network for DmIDH (KB1_0495) and DmPMI (KB1_0553). Homologous protein sequences of DmIDH and DmPMI in archaea, bacteria, and eukaryota, identified by BLASTP (Altschul et al., 1997) from UniProt (Apweiler et al., 2012), are shown as nodes, and sequence identities of DmIDH and DmPMI homologs are shown as edges in (A) and (B), respectively. The majority of DmIDH homologs were > 40% identical at the amino acid level (indicated by blue edges), while the majority of DmPMI homologs showed < 30% amino acid sequence identity with the DmPMI sequence (indicated by grey edges).


Figure 5.5. Phylogenetic analysis of DmIDH protein sequence. The maximum likelihood (ML) tree for DmIDH and its homologous protein sequences was constructed with the PhyML (Guindon et al., 2010) plugin in Geneious (http://www.geneious.com/). Protein sequences were mined from UniProt (Apweiler et al., 2012) and aligned with the MUSCLE (Edgar, 2004) plugin in Geneious. The ML tree was then constructed under the WAG (Whelan and Goldman, 2001) model of amino acid substitution with 100 bootstrap resamplings. Bootstrap values are shown as branch labels, and the biochemically characterized genes are marked by asterisks (*). Organism names are colored according to different kingdoms (orange = archaea, green = bacteria, and purple = eukaryota).


Figure 5.6. Phylogenetic analysis of DmPMI protein sequence. The maximum likelihood (ML) tree for DmPMI and its homologous protein sequences was constructed with the PhyML (Guindon et al., 2010) plugin in Geneious (http://www.geneious.com/). Protein sequences were mined from UniProt (Apweiler et al., 2012) and aligned with the MUSCLE (Edgar, 2004) plugin in Geneious. The ML tree was then constructed under the WAG (Whelan and Goldman, 2001) model of amino acid substitution with 100 bootstrap resamplings. Bootstrap values are shown as branch labels, and the biochemically characterized genes are marked by asterisks. Organism names are colored according to different kingdoms (orange = archaea and green = bacteria).

5.4.3. Structure based analysis of DmIDH and DmPMI sequences

Although DmIDH is not a member of subfamily I of the bacterial and archaeal NADP+-dependent IDHs, as mentioned before, it showed significant sequence identity with the biochemically characterized, crystallized, and well-studied IDH sequences from Archaeoglobus fulgidus (44% sequence identity) (Steen et al., 1997, Stokke et al., 2007) and E. coli (39% sequence identity) (Hurley et al., 1991) in SWISSPROT (Boeckmann et al., 2003). Multiple sequence alignment (MSA) of these and other experimentally characterized DmIDH homologs from SWISSPROT identified 66 completely conserved residues in the DmIDH sequence (Figure 5.7). These conserved residues included all residues involved in substrate and coenzyme binding in A. fulgidus (Steen et al., 1997, Stokke et al., 2007) and E. coli (Hurley et al., 1991), suggesting a similar reaction mechanism for DmIDH. Also, the signature motif of both isocitrate and isopropylmalate dehydrogenases (IDH and IPMDH) (Prosite id: PS00470) was identified in DmIDH (indicated by a black box in Figure 5.7) using ScanProsite (de Castro et al., 2006) in the PROSITE database (Sigrist et al., 2012). However, only IDH activity was tested and confirmed for DmIDH because of its presence in an operon containing other putative TCA-cycle genes in D. mccartyi genomes (Figure C1 in Appendix C) (Markowitz et al., 2012). Moreover, IPMDH is involved in L-leucine biosynthesis (Kanehisa et al., 2011), and D. mccartyi genomes harbor a putative IPMDH gene (KB1_0839, cbdbA804, DET0826, DhcVS_730, DehaBAV1_0745, and DehalGT_0706) located in the L-leucine biosynthesis operon (Figure C2 in Appendix C) (Markowitz et al., 2012). Comparison of the predicted secondary structure of DmIDH (Figure 5.7) with the crystal structures of A. fulgidus and E. coli IDHs (Stokke et al., 2007, Hurley et al., 1991) showed that most helix and strand regions were conserved among them in spite of the presence of some subtle structural differences. For instance, DmIDH has 12 α-helices and 11 β-strands (Figure 5.7), whereas the A. fulgidus IDH has 18 α-helices and 16 β-strands (Stokke et al., 2007) and the E. coli IDH has 13 α-helices and 12 β-strands (Hurley et al., 1991).


Figure 5.7. Structure-based multiple sequence alignment (MSA) of DmIDH. MSA of DmIDH protein sequence and its homologous, biochemically characterized sequences are shown. Protein sequences were mined from the SWISSPROT curated database (Boeckmann et al., 2003), and MSA was performed by ClustalX (Larkin et al., 2007). Completely conserved residues are marked by asterisks while residues reported to be important for substrate and coenzyme binding (Hurley et al., 1991, Stokke et al., 2007) are marked by blue and red boxes, respectively. Black box indicates signature motif for isocitrate/isopropylmalate dehydrogenases as identified by ScanProsite (de Castro et al., 2006). Protein secondary structure was predicted by PredictProtein (Rost and Sander, 1993, Rost, 1994), and indicated by bars (α-helices) and arrows (β-strands). Protein accession numbers (UniProt) are: A. fulgidus (O29610), C. noboribetus (P96318), S. aureus (P99167), B. subtilis (P39126), E. coli (P08200), C. maris (P41560), B. taurus (P41563), and H. sapiens (P50213).


Figure 5.8. Structure-based multiple sequence alignment (MSA) of DmPMI. MSA of DmPMI protein sequence and its homologous, manually reviewed/biochemically characterized sequences are shown. Protein sequences were mined from the SWISSPROT curated database (Boeckmann et al., 2003), and MSA was performed by ClustalX (Larkin et al., 2007). Completely conserved residues are marked by asterisks, while residues reported to be important for substrate binding and catalysis (Hansen et al., 2004, Swan et al., 2004) are marked by red boxes. Blue boxes indicate two signature motifs for bifunctional PGI/PMI protein family (Hansen et al., 2004), and black box indicates Pfam (Punta et al., 2012) SIS (sugar isomerase) domain identified by ScanProsite (de Castro et al., 2006). Protein secondary structure was predicted by PredictProtein (Rost and Sander, 1993, Rost, 1994), and indicated by bars (α-helices) and arrows (β-strands). Protein accession numbers (UniProt) are: C. bescii (Q44407), T. volcanium (Q978F3), T. acidophilum (Q9HIC2), A. aeolicus (O66954), S. tokodaii (Q96YC2), S. acidocaldarius (Q4JCA7), S. solfataricus (Q97WE5), P. aerophilum (Q8ZWV0), and A. pernix (Q9YE01).

As mentioned before, no homolog of DmPMI was identified in the three types of PMIs identified and characterized so far (Proudfoot et al., 1994). Nonetheless, the closest homologs of DmPMI, among the biochemically characterized enzymes, were identified to be archaeal bifunctional PGI/PMIs from Thermoplasma acidophilum (27% sequence identity), Pyrobaculum aerophilum (24% sequence identity), and Aeropyrum pernix (23% sequence identity) in SWISSPROT. MSA of these experimentally characterized or reviewed DmPMI homologs identified only 20 completely conserved residues in DmPMI (Figure 5.8), indicating that it is less conserved than DmIDH. In addition to two signature motifs (marked by blue boxes in Figure 5.8) of archaeal PGI/PMIs (Hansen et al., 2004), a SIS (sugar isomerase) domain (marked by a black box in Figure 5.8) was identified in the DmPMI sequence. The residues proposed to be important for substrate binding and catalysis in archaeal PGI/PMIs (Hansen et al., 2004, Swan et al., 2004) are also conserved in DmPMI (red boxes in Figure 5.8), with a few important exceptions; these include Ser-154, Pro-341, Ile-342, Ser-103, Thr-60, and Arg-152 in the DmPMI sequence. Among these important residues, the most critical change was detected at residue position 154 (Ser-154) in DmPMI, where a serine (S) replaces the arginine (R) found in other archaeal PGI/PMIs (Figure 5.8). This residue is one of the four most crucial residues reported to be responsible for the catalytic activities of archaeal PGI/PMIs (Hansen et al., 2004, Jeffery et al., 2000). Hence, these changes in the DmPMI sequence are likely responsible for its inability to function as PGI. A similar conclusion was drawn from the comparison of the predicted secondary structure of DmPMI (12 α-helices and 9 β-strands) (Figure 5.8) with the crystal structure of PaPGI/PMI from P. aerophilum (Swan et al., 2004). This comparison further indicated that the presence of a serine (Ser-154) in DmPMI in place of an arginine (Arg-135 in PaPGI/PMI) was likely responsible for the enzyme's PGI inactivity. This is probably because the arginine (R) at this position in PaPGI/PMI stabilizes the formation of crucial enediol intermediates during the PGI catalytic activity of the enzyme on G6P or F6P (Hansen et al., 2004, Seeholzer, 1993, Swan et al., 2004). However, the mechanism for PMI catalytic activity of DmPMI is likely similar to that of PaPGI/PMI because the presence of a threonine (Thr-291) in PaPGI/PMI is key to its PMI activity (Swan et al., 2004), and the equivalent residue in DmPMI is an isoleucine (Ile-342) (Figure 5.8).

In summary, the two characterized enzymes, DmIDH and DmPMI, of D. mccartyi strain KB-1 likely have reaction mechanisms similar to the previously characterized enzymes although they appear to constitute novel isocitrate dehydrogenase and phosphomannose isomerase enzyme families. This result suggests that perhaps DmIDH and DmPMI were acquired by D. mccartyi through lateral gene transfer events having evolved from their ancestors while retaining similar enzymatic activities. Owing to its potential involvement in major physiological roles such as energy metabolism in D. mccartyi, DmIDH showed faster catalytic activity with NADP+ as a cofactor, while DmPMI is likely involved in non-essential physiological roles such as osmotic stress adjustment in these bacteria.

5.5. Conclusions

Although D. mccartyi are environmentally important for their bioremediation capability, detailed biochemical studies on their genes are limited. This scarcity of biochemical information is a major challenge for the fundamental understanding of their metabolism and physiology, and this lack of detailed knowledge can ultimately hamper their successful application to bioremediation. Thus, we report in this study the detailed biochemical characterization of two metabolic genes, including one originally annotated as a hypothetical protein. Primary annotations of the genes encoding the characterized enzymes were either non-specific or incorrect, and they were revised or reannotated, along with other D. mccartyi genes, based on our previously published metabolic modeling and transcriptomic studies. This study, therefore, illustrates the utility of metabolic modeling and bioinformatic approaches as important tools for hypothesis generation regarding gene functions. Furthermore, the proposed annotations can be helpful in designing future biochemical experiments for characterizing other D. mccartyi gene functions. Such fundamental understanding of the physiology and biochemistry of these useful and valuable organisms will continue to increase their use in future bioremediation efforts.


Chapter 6: Role of exogenous vitamin omission on the growth and community dynamics of a Dehalococcoides mccartyi-containing anaerobic mixed microbial community

6.1. Abstract

Detoxification of chlorinated organic solvents, including known human carcinogens trichloroethene and vinyl chloride, by Dehalococcoides mccartyi is an important and useful bioremediation process. Reductive dehalogenases, the respiratory enzymes of these specialized bacteria, catalyze this bioremediation process, termed organohalide respiration. Vitamin B12 or cobalamin is an essential cofactor for reductive dehalogenases, but D. mccartyi strains are vitamin B12 auxotrophs. Hence, both pure and mixed cultures of these bacteria are grown in the presence of exogenous vitamins, including vitamin B12, in the growth medium. However, mixed cultures of D. mccartyi include microbes that are natural producers of corrinoids such as vitamin B12, and they can fulfill D. mccartyi's corrinoid requirements. To investigate if D. mccartyi can grow and dechlorinate in a mixed culture without the supply of exogenous vitamin B12, we conducted growth experiments in both regular (vitamin-added) and vitamin-free growth media with a D. mccartyi-containing anaerobic mixed culture, KB-1. The experimental results showed no substantial change in dechlorination rates between the cultures grown with or without the exogenous vitamins in the growth medium. Therefore, KB-1 community members are capable of fulfilling not only vitamin B12 but also other possible vitamin requirements of D. mccartyi. Using qPCR analysis, interesting shifts in community compositions were identified in vitamin-free cultures, presumably reflecting the growth of vitamin-producers in those cultures. Sustained growth of a mixed microbial community without the addition of any exogenous vitamins in the growth medium provides a unique, environmentally relevant model of a self-sustaining anaerobic microbial community in which to study interspecies nutrient exchange.


6.2. Introduction

Chlorinated organic solvents such as chloroethene and chlorobenzene congeners are ubiquitous and persistent ground water xenobiotics, or man-made pollutants (USEPA, 2013, USGS, 2013). These toxic compounds, including known human carcinogens trichloroethene (TCE) and vinyl chloride (VC) (ATSDR, 2013, TEACH, 2011), can be decontaminated by a natural biological process termed organohalide respiration (Holliger et al., 1998b, Smidt and de Vos, 2004). During this process, the chlorinated xenobiotics are converted to either benign ethenes or less toxic lower chlorinated compounds by the reductive dechlorination reaction catalyzed by the reductive dehalogenase enzymes of Dehalococcoides mccartyi, a group of strictly anaerobic and non-pathogenic bacteria (Löffler et al., 2012, Tas et al., 2010). Organohalide respiration is the only known metabolic process by which these bacteria conserve energy for growth by coupling the reduction of chloro-organics to ATP generation (Löffler et al., 2012, Smidt and de Vos, 2004, Tas et al., 2010). Thus, slow growth of these bacteria, usually observed in pure cultures (Adrian et al., 2000b, Löffler et al., 2012, Maymó-Gatell et al., 1997), is a major obstacle for studying detailed anaerobic microbial physiology. However, D. mccartyi grow faster in syntrophic mixed microbial communities, their natural habitats, than in pure cultures (Ahsanul Islam et al., 2010, Duhamel and Edwards, 2007). This improved growth is possibly linked to some unknown beneficial interactions that exist between the community members, and one such beneficial interaction is hypothesized to be the supply of vitamin B12, or cobalamin, the essential cofactor for the reductive dehalogenase enzymes (Smidt and de Vos, 2004, Tas et al., 2010, Banerjee and Ragsdale, 2003, Krasotkina et al., 2001, Neumann et al., 1996, Miller et al., 1998), from the natural producers found in the community.

The necessity of vitamin B12 for the reductive dehalogenase enzymes makes it essential for the survival and growth of D. mccartyi. However, the complete pathway for de novo biosynthesis of this corrinoid compound is absent in the genomes of D. mccartyi strains (Ahsanul Islam et al., 2010, Kube et al., 2005, McMurdie et al., 2009, Seshadri et al., 2005); hence, it is added to the growth medium of both pure and mixed cultures of these bacteria (Adrian et al., 2000b, Duhamel et al., 2002, Löffler et al., 2012, Maymó-Gatell et al., 1997). Recent experimental studies (Schipp et al., 2013, Yan et al., 2013, Yan et al., 2012, Yi et al., 2012) showed that D. mccartyi could not only transport vitamin B12 but also salvage other corrinoid precursors from the growth medium to remodel or assemble their required corrinoid compounds. These studies further revealed the ability of these bacteria to use only specific types of corrinoids, in which the 5,6-dimethylbenzimidazole (DMB) was attached as a lower α-ligand to the cobamides (Yan et al., 2013, Yan et al., 2012, Yi et al., 2012); examples of such cobalamins include adenosylcobalamin, hydroxycobalamin, and cyanocobalamin. Notably, these corrinoids can be produced by only certain acetogens such as Acetobacterium woodii (Stupperich et al., 1988, Stupperich et al., 1990), although acetogens and methanogens in general can synthesize different types of corrinoids de novo (Stupperich and Kräutler, 1988, Stupperich et al., 1990). Previous experimental studies (He et al., 2007, Johnson et al., 2009), as well as in silico growth simulations of D. mccartyi with the metabolic model (Chapter 3) (Ahsanul Islam et al., 2010), showed enhanced growth of these bacteria as isolates with increased concentrations of exogenous vitamin B12 in the medium. However, the ability of D. mccartyi to grow and dechlorinate in mixed microbial communities without the presence of vitamin B12 in the medium is unknown. Also, the influence of this vital nutrient, if there is any, on the microbial community structure and dynamics in a D. mccartyi-containing mixed community has yet to be explored. To investigate these issues related to interspecies vitamin B12 transfer, we conducted growth experiments with KB-1, a D. mccartyi-containing mixed enrichment culture comprising dechlorinators, acetogens, methanogens, and fermenters (Duhamel and Edwards, 2006, Duhamel, 2005, Hug, 2012, Waller, 2010), with and without various vitamins in the medium.

The KB-1 culture is typically maintained in a medium containing a mixture of 11 different exogenous vitamins, including vitamin B12, as well as the redox indicator, resazurin (Edwards and Cox, 1997, Edwards and Grbic-Galic, 1994). We decided to 1) remove vitamin B12 from the medium because its presence would obviate the need for any natural vitamin B12 producers in KB-1, and 2) remove resazurin from the medium because it might interfere with later analysis of vitamin B12 produced endogenously in the community. Therefore, we made transfers of the KB-1 culture into vitamin B12-free medium, and determined the stability of dechlorination activity under this condition. Since vitamins other than vitamin B12 might also be transferred between the microbes in KB-1, we also prepared completely vitamin-free cultures and compared their dechlorination activities with cultures where only vitamin B12 was omitted. We tracked any changes in the community using qPCR (quantitative polymerase chain reaction), and tested the robustness of the vitamin-free cultures by transferring them into new medium.

6.3. Materials and methods

6.3.1. Chemicals and analytical procedures

All chemicals were purchased from Sigma-Aldrich (St. Louis, MO, USA) at greater than 98% purity, unless otherwise stated. Chlorinated ethenes, methane, and ethene were routinely analyzed by injecting a 300 µL headspace sample with a 0.5 mL gas syringe (VICI precision sampling Inc., Baton Rouge, LA) onto a Hewlett-Packard 5890 Series II gas chromatograph (GC) fitted with a GSQ column (30 m by 0.53 mm ID PLOT column; J&W Scientific) and a flame ionization detector. The oven temperature was programmed to hold at 50°C for 1 min, then to increase to 190°C at 60°C/min and hold at 190°C for 4.6 min. Calibration of cis-1,2-dichloroethene (cDCE) and trichloroethene (TCE) was performed with aqueous external standards prepared gravimetrically from a concentrated methanolic stock solution, and vinyl chloride (VC) was added to these external standards via a gastight syringe as described previously (Duhamel and Edwards, 2006, Duhamel et al., 2004, Duhamel et al., 2002, Duhamel, 2005). A gas-mix (80% N2 and 20% CO2; Praxair, Inc., Danbury, CT, USA) was used as the anaerobic purge gas, and a 1% mixture of ethene and methane (Scotty II, Alltech Associates, Inc., Deerfield, IL, USA) was utilized as the analytical standard for GC.

6.3.2. Preparation of exogenous vitamins and resazurin free KB-1 cultures

A detailed description of all mineral media and the KB-1 cultures used in this study is presented in Tables 6.1 and 6.2. Also, the genealogy, including the treatment, maintenance, and transfer history of all cultures, is briefly depicted in Figure 6.1. All defined mineral media (Table 6.1) were modified from the original KB-1 mineral medium (Edwards and Grbic-Galic, 1994). The three types of medium used were: 1) regular medium that includes all vitamins (RG), 2) medium with all vitamins except vitamin B12 (NB), and 3) medium with no exogenous vitamins (NV). The media were also free from the resazurin indicator present in the original KB-1 medium (Edwards and Grbic-Galic, 1994). All KB-1 subcultures (Table 6.2 and Figure 6.1) were derived from a 4 liter (L) bottle of TCE and methanol fed parent KB-1 culture (Duhamel et al., 2002) maintained in the original KB-1 medium. Approximately 800 mL of this culture was first centrifuged anaerobically in four 200 mL centrifuge tubes at 8000 rpm and 4°C for 15 min, and the supernatant was then removed inside the glove box (Coy Lab, Grass Lake, MI). Then, cell pellets were resuspended in 200 mL NV medium followed by centrifuging and discarding the supernatant as before. Following this procedure, the cell pellets were washed two more times with the NV medium. Finally, ~25 mL NV medium was added to the washed cell pellets in each of the four centrifuge tubes, and the suspensions were combined to create 100 mL of concentrated KB-1 cell suspension. Afterwards, the "1st Wash" triplicate parallel cultures (RG-1, NB-1, and NV-1 in Figure 6.1 and Table 6.2) were set up using a 10% inoculum (vol/vol) from this concentrated culture into 250 mL (approximately 100 mL liquid and 150 mL headspace) mininert-capped glass bottles containing RG, NB, and NV media. After 242 days, the "2nd Wash" cultures (RG-2, NB-2, and NV-2 in Figure 6.1 and Table 6.2) were prepared from the "1st Wash" cultures by removing a 10 mL aliquot from each culture, and centrifuging and resuspending 3 times in the NV medium before reinoculating in 90 mL of fresh medium of the respective type (Figure 6.1). This process was repeated a third time after 114 days to create the "3rd Wash" cultures (RG-3, NB-3, and NV-3 in Figure 6.1 and Table 6.2). All cultures were fed 18 µL of 5:1 electron equivalent ratio (eeq) of TCE to methanol every two weeks; this was equivalent to about 62 µmoles of TCE and resulted in a liquid TCE concentration of approximately 50 mg/L in the bottles.
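The stated liquid concentration can be checked with a simple equilibrium partitioning estimate. The sketch below is only illustrative: it takes the 62 µmoles of TCE, 100 mL of liquid, and 150 mL of headspace from the text, and assumes a dimensionless Henry's constant for TCE of about 0.4 (a typical room-temperature value that is not stated in this chapter) and a TCE molar mass of 131.39 g/mol. The function name is hypothetical.

```python
# Illustrative check of the liquid-phase TCE concentration after one feeding.
# ASSUMPTION: dimensionless Henry's constant (C_gas / C_liquid) of ~0.4 for TCE;
# this value is not given in the text and varies with temperature.
TCE_MOLAR_MASS_G_PER_MOL = 131.39


def liquid_tce_mg_per_l(tce_umol, liquid_ml, headspace_ml, henry_dimensionless):
    """Liquid TCE concentration (mg/L) at gas-liquid equilibrium in a closed bottle."""
    total_mg = tce_umol * 1e-6 * TCE_MOLAR_MASS_G_PER_MOL * 1e3
    liquid_l = liquid_ml / 1000.0
    headspace_l = headspace_ml / 1000.0
    # Mass balance: total = C_liquid * V_liquid + (H * C_liquid) * V_headspace
    return total_mg / (liquid_l + henry_dimensionless * headspace_l)


estimated = liquid_tce_mg_per_l(62.0, 100.0, 150.0, henry_dimensionless=0.4)
print(f"Estimated liquid TCE concentration: {estimated:.1f} mg/L")
```

Under these assumptions the estimate is close to the approximately 50 mg/L quoted above; a different Henry's constant would shift the result accordingly.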


Table 6.1. Composition of different mineral media used in this study

Modified Mineral Medium (MM) | MM Abbreviation | Composition (1 Liter MM)

Regular medium | RG | Phosphate buffer (KH2PO4: 27.2 g, K2HPO4: 34.8 g), Salt solution (NH4Cl: 53.5 g, CaCl2.6H2O: 7.0 g, FeCl2.4H2O: 2.0 g), Trace minerals (H3BO3: 0.3 g, ZnCl2: 0.1 g, Na2MoO4.2H2O: 0.1 g, NiCl2.6H2O: 0.75 g, MnCl2.4H2O: 1.0 g, CuCl2.2H2O: 0.1 g, CoCl2.6H2O: 1.5 g, Na2SeO3: 0.02 g, Al2(SO4)3.18H2O: 0.1 g), Magnesium sulfate solution (MgSO4.7H2O: 62.5 g), Saturated bicarbonate (NaHCO3: 20 g), Vitamin mix (Biotin: 0.02 g, Folic acid: 0.02 g, Pyridoxine HCl: 0.1 g, Riboflavin: 0.05 g, Thiamine: 0.05 g, Nicotinic acid: 0.05 g, Pantothenic acid: 0.05 g, Para-amino benzoic acid: 0.05 g, Cyanocobalamin (vitamin B12): 0.05 g, Thioctic (lipoic) acid: 0.05 g, Coenzyme M: 1.0 g), Amorphous ferrous sulfide ((NH4)2Fe(SO4)2.6H2O: 39.2 g, Na2S.9H2O: 24.0 g)

Exogenous vitamin B12-free medium | NB | Contains all minerals and nutrients as in the Regular medium except the Cyanocobalamin (vitamin B12)

All exogenous vitamins-free medium | NV | Contains all minerals as in the Regular medium except the Vitamin mix

Table 6.2. Description of different KB-1 cultures used in this study

KB-1 Cultures | Abbreviation | Description
Regular 1st Wash | RG-1 | 1st Wash KB-1 cultures inoculated into the regular medium containing all vitamins
Regular 2nd Wash | RG-2 | 2nd Wash KB-1 cultures inoculated into the regular medium containing all vitamins
Regular 3rd Wash | RG-3 | 3rd Wash KB-1 cultures inoculated into the regular medium containing all vitamins
No Vitamin B12 1st Wash | NB-1 | 1st Wash KB-1 cultures inoculated into the medium containing all vitamins except vitamin B12
No Vitamin B12 2nd Wash | NB-2 | 2nd Wash KB-1 cultures inoculated into the medium containing all vitamins except vitamin B12
No Vitamin B12 3rd Wash | NB-3 | 3rd Wash KB-1 cultures inoculated into the medium containing all vitamins except vitamin B12
No Vitamins 1st Wash | NV-1 | 1st Wash KB-1 cultures inoculated into the medium containing no vitamins
No Vitamins 2nd Wash | NV-2 | 2nd Wash KB-1 cultures inoculated into the medium containing no vitamins
No Vitamins 3rd Wash | NV-3 | 3rd Wash KB-1 cultures inoculated into the medium containing no vitamins
Regular 3rd Wash KB-1 into Regular medium | RG-3+RG | 3rd Wash KB-1 cultures maintained in the regular medium containing all vitamins were diluted to a similar medium
No Vitamin B12 3rd Wash KB-1 into No Vitamin B12 medium | NB-3+NB | 3rd Wash KB-1 cultures maintained in the vitamin B12-free medium were diluted to a similar medium
No Vitamin B12 3rd Wash KB-1 into Regular medium | NB-3+RG | 3rd Wash KB-1 cultures maintained in the vitamin B12-free medium were diluted to the regular medium containing all vitamins
No Vitamins 3rd Wash KB-1 into No Vitamins medium | NV-3+NV | 3rd Wash KB-1 cultures maintained in the medium containing no vitamins were diluted to a similar medium
No Vitamins 3rd Wash KB-1 into No Vitamin B12 medium | NV-3+NB | 3rd Wash KB-1 cultures maintained in the medium containing no vitamins were diluted to the vitamin B12 free medium
No Vitamins 3rd Wash KB-1 into Regular medium | NV-3+RG | 3rd Wash KB-1 cultures maintained in the medium containing no vitamins were diluted to the regular medium containing all vitamins

6.3.3. Preparation of diluted KB-1 cultures for the time-course experiment

To determine how well the cultures established in different media recover after dilution, we designed a dilution time-course experiment. In total, 18 parallel subcultures (six treatments, each in triplicate) were prepared using a 2% inoculum (2 mL of culture into 98 mL of medium) from the "3rd Wash" KB-1 cultures. Cultures were inoculated into a medium of the same or richer vitamin content than the original cultures. These new subcultures were labelled according to the names of the cultures from which they were derived and the growth medium into which they were inoculated (Table 6.2): RG-3+RG, NB-3+NB, NB-3+RG, NV-3+NV, NV-3+NB, and NV-3+RG. All diluted cultures were set up in 250 mL (approximately 100 mL liquid and 150 mL headspace) mininert-capped glass bottles and fed 18 µL of 5:1 electron equivalent ratio (eeq) of TCE to methanol only once.


Figure 6.1. Genealogy of KB-1 cultures used in this study. The origin, treatment, and transfer history, as well as the different medium conditions in which the KB-1 cultures used in this study were grown, are briefly shown in this figure. The colors represent different treatments. Triplicate parallel subcultures were set up for each treatment and medium condition. All cultures were refed as electron acceptors were consumed, approximately every two to four weeks. Detailed descriptions of the different growth media and cultures are presented in Tables 6.1 and 6.2.

6.3.4. DNA collection and extraction

Genomic DNA (gDNA) was extracted from all KB-1 cultures following the procedures described previously (Duhamel, 2005, Waller, 2010). Total gDNA from the "1st, 2nd, and 3rd Wash" cultures was extracted after maintaining them for 563, 321, and 207 days, respectively (Figure 6.1); 8 months later, gDNA was again extracted from the "2nd and 3rd Wash" cultures only. gDNA from the 2% transfer subcultures was extracted at 6 time points (day 0, day 6, day 11, day 17, day 24, and day 31).

6.3.5. Quantitative PCR (qPCR) primers and method

Quantitative PCR (qPCR) was used for identifying the community composition of all KB-1 cultures. In total, 6 bacterial (Dehalococcoides, Geobacter, Bacteroidetes, Acetobacterium, Sporomusa, Spirochaetes) and 4 archaeal (Methanomethylovorans, Methanomicrobiales, Methanosaeta, Methanosarcina) operational taxonomic units (OTU) were tracked in all KB-1 cultures using the previously published qPCR primer sets (Duhamel, 2005, Waller, 2010) (Table D1 in Appendix D). Individual gDNA samples were diluted 1:10 to test for possible inhibitor effects, and this dilution, in general, resulted in higher copy numbers in qPCR reactions. Plasmid DNAs containing the 16S rRNA gene sequence of the KB-1 OTUs of interest were prepared as described previously (Zila, 2011). Standard curves for each OTU were generated for concentrations from 10^8 copies/µL to 10^1 copies/µL by serial dilutions of concentrated plasmid DNAs containing 10^10 copies/µL. qPCR reactions were set up in duplicate 20 µL wells containing 10 µL of SsoFast EvaGreen Supermix (Bio-Rad Laboratories Inc., Hercules, CA, USA), 0.5 µL of forward (10 µM) and 0.5 µL of reverse (10 µM) primers, 7 µL of UV-treated dH2O, and 2 µL of gDNA of interest. The reactions were conducted and analyzed using a Bio-Rad CFX96 Real-Time System and C1000 Thermal Cycler (Bio-Rad Laboratories Inc., Hercules, CA, USA) with the following protocol: initial denaturation for 2 min at 98°C, 40 cycles of denaturation at 98°C for 5 s, annealing at the specific primer annealing temperature (Table D1 in Appendix D) for 10 s, and extension at 65°C for 5 s, followed by a final melting curve analysis from 65 to 95°C measuring fluorescence at every 0.5°C (Waller, 2010, Zila, 2011).
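To illustrate how a serial-dilution standard curve is typically used, the following sketch fits Cq against log10(copies) by ordinary least squares and converts the slope into an amplification efficiency with the common relation E = 10^(-1/slope) - 1. This is a generic illustration, not the algorithm implemented in the Bio-Rad software; the Cq values are synthetic and the function names are hypothetical.

```python
import math


def fit_standard_curve(copies, cq_values):
    """Least-squares fit of Cq = slope * log10(copies) + intercept."""
    x = [math.log10(c) for c in copies]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(cq_values) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, cq_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amplification_efficiency(slope):
    """Efficiency as a fraction (1.0 means 100%, i.e. doubling every cycle)."""
    return 10 ** (-1.0 / slope) - 1.0


def copies_from_cq(cq, slope, intercept):
    """Invert the standard curve to estimate copies/uL in a reaction."""
    return 10 ** ((cq - intercept) / slope)


# Synthetic standards spanning 10^8 to 10^1 copies/uL (values are made up).
standards = [10.0 ** k for k in range(8, 0, -1)]
synthetic_cq = [38.0 - 3.32 * math.log10(c) for c in standards]

slope, intercept = fit_standard_curve(standards, synthetic_cq)
print(f"slope={slope:.2f}, efficiency={amplification_efficiency(slope):.1%}")
```

A slope near -3.32 corresponds to an efficiency close to 100%, which is the range usually targeted when setting the fluorescence threshold.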

6.3.6. Analysis of qPCR data

Calibration/standard curves for qPCR were generated from serial dilutions of concentrated plasmid DNAs containing 16S rRNA gene sequence of a KB-1 OTU of interest. Gene copy numbers in plasmid DNAs were calculated using the following equation (Ritalahti et al., 2006):

\[
\text{Gene copy number (copies/}\mu\text{L)} = \frac{\text{plasmid DNA concentration (ng/}\mu\text{L)} \times 10^{-9} \times 6.023 \times 10^{23}}{660 \times \text{total plasmid size (bp)}}
\]

Concentration of plasmid DNAs was measured with a NanoDrop ND-1000 Spectrophotometer (Thermo Scientific, Wilmington, DE, USA) at 260 nm absorbance. Total plasmid size was calculated from the length of the vector pCR2.1-TOPO (3900 base pairs) provided in the TOPO10 cloning kit (Invitrogen Corp., Grand Island, NY) and from the inserted bacterial fragment of 1500 bp, or archaeal fragment of 1100 bp (Waller, 2010, Zila, 2011). Assuming 1 gene copy per plasmid molecule and an average molecular weight of 660 g/mol per base pair of double-stranded DNA, gene copy numbers were estimated using Avogadro's constant (6.023 × 10^23) and the above equation, in which the factor of 10^-9 converts ng to g.
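The plasmid copy-number calculation can be written as a short function. This is a sketch of the arithmetic only: the constants are those stated in the text, while the 10 ng/µL concentration in the example is a made-up input and the function name is hypothetical.

```python
AVOGADRO = 6.023e23           # molecules per mole, value used in the text
BP_MASS_G_PER_MOL = 660.0     # average mass of one double-stranded base pair
VECTOR_BP = 3900              # pCR2.1-TOPO vector length stated in the text
BACTERIAL_INSERT_BP = 1500
ARCHAEAL_INSERT_BP = 1100


def plasmid_copies_per_ul(conc_ng_per_ul, plasmid_size_bp):
    """Gene copies per uL, assuming one gene copy per plasmid molecule."""
    grams_per_ul = conc_ng_per_ul * 1e-9  # ng -> g
    return grams_per_ul * AVOGADRO / (BP_MASS_G_PER_MOL * plasmid_size_bp)


# Hypothetical example: a 10 ng/uL stock of a plasmid carrying a bacterial insert.
bacterial_plasmid_bp = VECTOR_BP + BACTERIAL_INSERT_BP
example_copies = plasmid_copies_per_ul(10.0, bacterial_plasmid_bp)
print(f"{example_copies:.2e} copies/uL")
```

For this hypothetical input the result is on the order of 10^9 copies/µL, which would then be serially diluted to build the standard curve.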

Amplification efficiencies for a qPCR run were calculated using the method of Pfaffl (Pfaffl, 2001) implemented in Bio-Rad CFX Manager 2.1 software (Bio-Rad Laboratories Inc., Hercules, CA). The quantification cycle, Cq (also known as the threshold cycle, Ct) for all qPCR runs was calculated with the software by manually setting a threshold fluorescence value (Relative Fluorescence Unit, RFU) of 601.235, which resulted in an optimal amplification efficiency close to 100% for most runs (Figure D1 in Appendix D). Gene copies in each reaction mix were automatically calculated from the standard curves corresponding to the Cq value. Finally, gene copy numbers in each sample were calculated using the following equation:

\[
\text{Gene copy number (copies/mL)} = \frac{\text{Gene copies in each reaction mix (copies/}\mu\text{L)} \times 10\ \text{(dilution factor)} \times \text{Volume of eluted DNA (}\mu\text{L)}}{\text{Volume of sample used for extracting DNA (mL)}}
\]
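As a worked illustration of this back-calculation, the sketch below converts a per-reaction result into copies per mL of culture. The dilution factor of 10 follows the 1:10 dilution of gDNA described in the methods; the numerical inputs (10^4 copies/µL, 50 µL elution, 10 mL sample) are hypothetical, as is the function name.

```python
def sample_copies_per_ml(copies_per_ul_reaction, elution_volume_ul,
                         sample_volume_ml, dilution_factor=10):
    """Scale qPCR copies/uL of diluted template DNA to copies/mL of culture."""
    if sample_volume_ml <= 0:
        raise ValueError("sample volume must be positive")
    return (copies_per_ul_reaction * dilution_factor * elution_volume_ul
            / sample_volume_ml)


# Hypothetical inputs: 1e4 copies/uL measured, DNA eluted in 50 uL, 10 mL of culture extracted.
example_sample_copies = sample_copies_per_ml(1e4, elution_volume_ul=50.0,
                                             sample_volume_ml=10.0)
print(f"{example_sample_copies:.1e} copies/mL")
```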

6.4. Results

6.4.1. Dechlorination and growth of washed KB-1 cultures cultivated in different growth media

All washed KB-1 cultures were maintained on TCE and methanol (Materials and methods), and their dechlorination profiles (Figure 6.2) showed similar trends in all media within the respective treatments. The cultures grown in the RG medium containing all exogenous vitamins (Figures 6.2A, 6.2D, and 6.2G) were positive controls. Dechlorination profiles for both vitamin B12- omitted (NB) (Figures 6.2B, 6.2E, and 6.2H) and all vitamins-omitted (NV) (Figures 6.2C, 6.2F, and 6.2I) cultures were similar to the positive controls (Figures 6.2A, 6.2D, and 6.2G); this indicates the NV cultures in all vitamins-free media were growing and dechlorinating at rates similar to the RG and NB cultures grown in exogenous vitamin-added media. The estimated maximum TCE dechlorination and ethene production rates (Table 6.3) also confirmed this outcome; the rates were calculated when the cultures were dechlorinating at a steady rate. Nonetheless, a few subtle differences were observed between dechlorination profiles of the 3rd Wash cultures. For instance, methane concentration (0.45 mmoles/bottle) in the NV-3 culture (Figure 6.2I) was more than two times higher than that in the NB-3 (0.2 mmoles/bottle) (Figure 6.2H) and RG-3 (0.18 mmoles/bottle) (Figure 6.2G) cultures. Although VC accumulation was initially noted in these cultures, it was completely dechlorinated around day 116 (Figure 6.2G and 6.2H). Thus, a complete dechlorination of TCE to ethene was eventually observed in both exogenous vitamin-added and vitamin-free KB-1 cultures.


Figure 6.2. TCE dechlorination profiles of different washed KB-1 cultures. Trichloroethene (TCE) dechlorination profiles of different washed KB-1 cultures maintained in three different growth media are shown: (A), (D), and (G) are profiles of the cultures grown in the RG medium; (B), (E), and (H) are profiles of the cultures grown in the NB medium; and (C), (F), and (I) are profiles of the cultures grown in NV medium. See Tables 6.1 and 6.2 for a detailed description of different media and cultures. Error bars represent standard deviations of triplicate cultures.

Table 6.3. Estimated TCE dechlorination and ethene production rates of different washed KB-1 cultures

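| Rate | Medium conditions | 1st Wash Cultures | 2nd Wash Cultures | 3rd Wash Cultures |
|---|---|---|---|---|
| TCE dechlorination rate (µmoles/L/hr) | RG | 0.69 ± 0.01 | 0.73 ± 0.01 | 0.74 ± 0.03 |
| TCE dechlorination rate (µmoles/L/hr) | NB | 0.69 ± 0.01 | 0.74 ± 0.02 | 0.74 ± 0.01 |
| TCE dechlorination rate (µmoles/L/hr) | NV | 0.69 ± 0.02 | 0.74 ± 0.01 | 0.73 ± 0.01 |
| Ethene production rate (µmoles/L/hr) | RG | 0.65 ± 0.03 | 0.55 ± 0.02 | 0.57 ± 0.02 |
| Ethene production rate (µmoles/L/hr) | NB | 0.66 ± 0.05 | 0.60 ± 0.01 | 0.60 ± 0.05 |
| Ethene production rate (µmoles/L/hr) | NV | 0.66 ± 0.01 | 0.60 ± 0.03 | 0.65 ± 0.03 |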

6.4.2. Community composition of washed KB-1 cultures cultivated in different growth media

KB-1 is a syntrophic mixed microbial consortium that mainly includes four major groups of phylotypes: dechlorinators (Dehalococcoides, Geobacter), acetogens (Acetobacterium, Sporomusa, Spirochaetes), methanogens (Methanomethylovorans, Methanomicrobiales, Methanosaeta, Methanosarcina), and fermenters (Bacteroidetes) (Duhamel and Edwards, 2006, Duhamel et al., 2002, Duhamel, 2005). Thus, to track the change in microbial community composition and population dynamics in the KB-1 cultures used in this study, qPCR was used to enumerate absolute cell numbers (Materials and methods) of the aforementioned OTUs, as well as the total bacterial and archaeal OTUs (Appendix D: Figure D2). Then, the relative abundance of an OTU was calculated from the ratio of cell numbers of that OTU to the sum of all OTUs tracked, and the results are presented in Figure 6.3. The absolute cell numbers of all OTUs tracked are shown in Figure 6.4.
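The relative-abundance calculation described above (cell numbers of one OTU divided by the sum over all tracked OTUs) can be sketched as follows. The function name and the cell numbers in the example are hypothetical placeholders chosen for illustration; they are not measurements from this study.

```python
def relative_abundance(cell_numbers):
    """Return each OTU's share (in percent) of the sum of all tracked OTUs.

    cell_numbers: dict mapping an OTU name to its absolute cell number
                  (e.g. cells/mL from qPCR).
    """
    total = sum(cell_numbers.values())
    if total <= 0:
        raise ValueError("total cell number must be positive")
    return {otu: 100.0 * n / total for otu, n in cell_numbers.items()}


# Hypothetical cell numbers, for illustration only.
counts = {
    "Dehalococcoides": 6.0e7,
    "Sporomusa": 3.0e7,
    "Methanomicrobiales": 1.0e7,
}
for otu, pct in relative_abundance(counts).items():
    print(f"{otu}: {pct:.1f}%")
```

Because the denominator is the sum of the tracked OTUs only, an OTU's percentage can fall even when its absolute cell number rises, which is why the text reports both quantities.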


Figure 6.3. Community composition of different washed KB-1 cultures. Relative proportions of the ten most representative OTUs of KB-1 cultures, identified by qPCR using the previously published primer sets (Duhamel, 2005, Waller, 2010) and the genomic DNA (gDNA) of triplicate samples, are shown as a percentage of all OTUs tracked in a sample. gDNA from the “1st Wash” triplicate parallel cultures was combined to create one sample for each condition for the qPCR reactions, and the community composition is shown in (A). gDNA for the “2nd and 3rd Wash” cultures was collected from triplicate samples, and the percentage abundances of different OTUs are shown in (B) and (C) for each replicate. (D) and (E) show the community compositions of the same cultures 8 months later. See Tables 6.1 and 6.2 for a detailed description of different media and cultures.

The “1st Wash” KB-1 cultures were dominated by three major OTUs: Dehalococcoides, Sporomusa, and Methanomicrobiales (Figures 6.3A, 6.4A, and 6.4B). Although the Dehalococcoides population was more than 60% of the total culture, the Sporomusa and Methanomicrobiales percentages varied between the three cultures (Figure 6.3A). Apart from these OTUs, a small proportion of Spirochaetes and Bacteroidetes was also noticed in these cultures (Figures 6.4A and 6.4B). The same dominant OTUs were identified in the “2nd Wash” cultures, although their proportions (Figures 6.3B and 6.3D) and especially cell numbers (Figures 6.4A and 6.4C) were different from the previous treatment. For instance, the proportion of the Dehalococcoides population decreased by about 10% in the “2nd Wash” cultures (Figure 6.3B), but the absolute cell numbers actually increased by an order of magnitude (Figure 6.4A). Sporomusa represented almost 20% to 40% of the populations in the “2nd Wash” cultures (Figure 6.3B), up from the “1st Wash”, because their absolute cell numbers increased by two orders of magnitude (Figure 6.4A). Methanomicrobiales cell numbers also increased by the same order of magnitude in all “2nd Wash” cultures (Figure 6.4B). Furthermore, Spirochaetes and Bacteroidetes OTUs increased by several orders of magnitude in these cultures, specifically in the cultures containing the NV medium (Figure 6.4A).


Figure 6.4. Community composition of different washed KB-1 cultures in terms of absolute cell numbers. The absolute cell numbers of the ten most representative OTUs of washed KB-1 cultures, identified by qPCR using the previously published primer sets (Duhamel, 2005, Waller, 2010) and the gDNA of triplicate samples, are shown for bacterial (A), (C), and archaeal (B), (D) OTUs. (C) and (D) show the qPCR results for only the “2nd and 3rd Wash” cultures, 8 months after the first survey round [(A), (B)]. Cultures belonging to a particular washing and centrifuging treatment are separated by rectangles. See Tables 6.1 and 6.2 for a detailed description of different media and cultures. Error bars represent standard deviations of triplicate cultures, except for “1st Wash” cultures, where error bars represent standard deviations of triplicate samples.

The most striking shift in population dynamics and the change in community composition were observed in the NV-3 cultures (Figures 6.3C, 6.4A, and 6.4B). This change occurred due to the growth of two OTUs, Acetobacterium and Methanosaeta, in these cultures, along with the presence of previously observed OTUs. The increase in Acetobacterium cells by four orders of magnitude was quite remarkable since they were present in very low numbers in previous treatments (Figures 6.3C and 6.4A). Equally interesting was the emergence of the Methanosaeta OTU in these cultures, as it was almost non-existent previously (Figures 6.3C and 6.4B). While the Spirochaetes and Bacteroidetes cell numbers increased slightly in the NV-3 cultures, the Sporomusa population dropped 10-fold as compared to the RG-3 and NB-3 cultures (Figure 6.4A). Also noteworthy was the complete dominance of the Dehalococcoides population (almost 98%) in the 1st bottle of the RG-3 cultures (Figure 6.3C). To assess the stability of the shift in population dynamics of various KB-1 cultures, another qPCR survey of the “2nd and 3rd Wash” cultures was conducted 8 months later (Figures 6.3D, 6.3E, 6.4C, and 6.4D). The absolute cell numbers of different OTUs remained almost unchanged (Figures 6.4C and 6.4D), but subtle changes were noticed in their relative proportions (Figures 6.3D and 6.3E). The most notable change occurred in the 1st bottle of the RG-3 cultures (Figure 6.3E), where the bloom of Bacteroidetes, Sporomusa, and Methanomicrobiales OTUs made the community more diverse than before (Figure 6.3C).


6.4.3. Dechlorination and growth of diluted KB-1 cultures cultivated in different growth media

Aliquots from the actively dechlorinating “3rd Wash” cultures were inoculated into six different growth media (Materials and methods). Dechlorination profiles of these cultures are shown in Figure 6.5. The TCE dechlorination profiles were similar in all cases, but the VC dechlorination profiles were quite different; in particular, VC accumulation was noted in the NV-3+NV and NV-3+RG cultures until day 60 (Figures 6.5D and 6.5F). However, addition of methanol, the electron donor for KB-1 (arrows in Figures 6.5D and 6.5F), led to the complete dechlorination of VC.


Figure 6.5. Dechlorination profiles for the time-course experiment of diluted KB-1 cultures. Trichloroethene (TCE), cis-1,2-dichloroethene (cDCE), and vinyl chloride (VC) dechlorination profiles of 50X (2%; vol/vol) diluted KB-1 cultures in six different growth media are shown. (A) shows profiles of positive controls; (B) and (C) are profiles of the cultures derived from the NB-3 cultures, and (D), (E), and (F) are profiles of the cultures derived from the NV-3 cultures. See Tables 6.1 and 6.2 for a detailed description of different media and cultures. Methanol amendment is represented by arrows in (D) and (F). Error bars represent standard deviations of triplicate cultures.

We further quantified growth characteristics of these diluted cultures (Figure 6.6) by calculating the TCE dechlorination and ethene production rates (Figure 6.6A), as well as by tracking the Dehalococcoides cell numbers with qPCR (Figure 6.6B). The fastest rates (1.4 µmole TCE/L/hr or 8.4 µeeq TCE/L/hr and 0.6 µmole ethene/L/hr or 3.6 µeeq ethene/L/hr) were observed in the NB-3+RG cultures, while the NV-3+NV cultures showed the slowest rates (0.2 µmole TCE/L/hr or 1.2 µeeq TCE/L/hr and 0.06 µmole ethene/L/hr or 0.36 µeeq ethene/L/hr) (Figure 6.6A). However, a similar trend in the increase of Dehalococcoides cell numbers was noticed in all cultures (Figure 6.6B).
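The paired rate values quoted above (for example, 1.4 µmole/L/hr alongside 8.4 µeeq/L/hr) are all consistent with a conversion factor of 6 electron equivalents per mole, which matches three two-electron dechlorination steps from TCE to ethene. The sketch below reproduces that arithmetic; the factor of 6 is inferred here from the reported numbers rather than stated explicitly in the text, and the function and constant names are hypothetical.

```python
# Assumption: 3 dechlorination steps (TCE -> cDCE -> VC -> ethene),
# each transferring 2 electrons, i.e. 6 electron equivalents per mole.
EEQ_PER_MOLE = 6


def umol_to_ueeq(rate_umol_per_l_hr, eeq_per_mole=EEQ_PER_MOLE):
    """Convert a rate in umol/L/hr to micro electron equivalents per L per hr."""
    return rate_umol_per_l_hr * eeq_per_mole


# Rates quoted in the paragraph above (umol/L/hr).
reported = {
    "NB-3+RG TCE dechlorination": 1.4,
    "NB-3+RG ethene production": 0.6,
    "NV-3+NV TCE dechlorination": 0.2,
    "NV-3+NV ethene production": 0.06,
}
for label, rate in reported.items():
    print(f"{label}: {rate} umol/L/hr = {umol_to_ueeq(rate):.2f} ueeq/L/hr")
```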


Figure 6.6. TCE dechlorination rates, ethene production rates, and D. mccartyi cell numbers for diluted KB-1 cultures. Trichloroethene (TCE) dechlorination rates, ethene production rates, and D. mccartyi absolute cell numbers in corresponding growth media are shown in (A) and (B). D. mccartyi cell numbers were calculated by qPCR using the previously published primer sets (Duhamel, 2005, Waller, 2010), and the gDNA of triplicate samples collected from the six diluted cultures and time points. See Tables 6.1 and 6.2 for a detailed description of different media and cultures. Error bars represent standard deviations of triplicate samples.

6.5. Discussion

Vitamin B12 or cobalamin is an essential corrinoid compound for the growth and dechlorination activity of Dehalococcoides in KB-1 (Smidt and de Vos, 2004, Tas et al., 2010, Banerjee and Ragsdale, 2003, Krasotkina et al., 2001, Kräutler et al., 2003). Although these bacteria are known vitamin B12 auxotrophs (Löffler et al., 2012), KB-1 comprises natural corrinoid producers, such as methanogens (DiMarco et al., 1990, Gorris and van der Drift, 1994, Mazumder et al., 1987), acetogens (Stupperich et al., 1988), and Geobacter (Yan et al., 2012). Thus, Dehalococcoides in KB-1 are able to grow and dechlorinate without vitamin B12 addition in the growth medium only if they obtain it from other community members. Similar dechlorination profiles and unchanged dechlorination rates of all washed transfers of KB-1 cultures into NB and NV media (Figure 6.2 and Table 6.3) indicate that Dehalococcoides were scavenging essential corrinoids from their neighbors. A recent experimental study on D. mccartyi strain CBDB1 (Schipp et al., 2013) identified the capability of these bacteria to produce all necessary cofactors and vitamins de novo except vitamin B12, biotin, and thiamine, which they obtained from their growth medium. Thus, the dechlorination activity of the NV-1, NV-2, and NV-3 cultures (Figure 6.2 and Table 6.3) in the vitamin-free growth medium indicates that, in addition to vitamin B12, Dehalococcoides in KB-1 also obtained other necessary vitamins from their associated community members.

Vitamin B12 or complete vitamin omission caused no significant change in dechlorination rates. Moreover, when only vitamin B12 was omitted, no significant change in major community members was observed. Only in the cultures not receiving any vitamins at all did the community structure change. The fact that omitting vitamin B12 did not change the community indicates that one of the other dominant taxa, either Sporomusa or Methanomicrobiales, was able to provide a sufficient amount and the appropriate form of vitamin B12 to Dehalococcoides. However, Sporomusa produce corrinoids, such as p-cresolylcobamide and phenolylcobamide (Stupperich and Konle, 1993, Stupperich et al., 1988, Stupperich et al., 1990), that are different from 5,6-dimethylbenzimidazolyl (DMB)-cobamide (vitamin B12 or cobalamin), the corrinoid required by Dehalococcoides (Yan et al., 2013, Yan et al., 2012, Yi et al., 2012). Given that Sporomusa has been shown not to support Dehalococcoides growth without the addition of DMB (Yan et al., 2013), it follows that Methanomicrobiales may indeed be the source of DMB and vitamin B12 for Dehalococcoides in the NB cultures. Of course, it is possible that some of the low-abundance organisms produce sufficient vitamin B12 for Dehalococcoides, but the most likely scenario is that it comes from a combination of sources, including Methanomicrobiales and Sporomusa.

In the NV-3 culture, a consistent community shift was seen: in addition to previously observed Dehalococcoides, Sporomusa, Spirochaetes, Bacteroidetes, and Methanomicrobiales populations, the emergence of Acetobacterium and Methanosaeta OTUs in these cultures was intriguing because they were almost undetected in other cultures (Figures 6.3 and 6.4). This community change in the NV-3 cultures was possibly related to the lack of all vitamins in the medium. Although both Acetobacterium and Sporomusa are acetogens, they produce structurally different corrinoid compounds (Stupperich and Konle, 1993, Stupperich et al., 1988). Sporomusa ovata produces p-cresolylcobamide and phenolylcobamide, two unusual corrinoids containing aromatic nucleotides instead of the nitrogen-containing heterocyclic nucleotides usually found in vitamin B12 (cobalamin), the corrinoid produced by Acetobacterium woodii (Stupperich and Konle, 1993, Stupperich et al., 1988, Stupperich et al., 1990). p-cresolylcobamide is essential for growth and catabolic conversion of methanol to acetate by the Sporomusa species (Stupperich et al., 1988), while vitamin B12 from the Acetobacterium OTU is essential for Dehalococcoides to grow and dechlorinate toxic chloro-organics (Schipp et al., 2013, Yan et al., 2013, Yi et al., 2012). Thus, the production and requirement of two different types of corrinoids in KB-1 possibly influenced the population abundance of Sporomusa and Acetobacterium OTUs in the NV-3 cultures. Moreover, the ability of Acetobacterium to produce H2 as a byproduct of acetogenesis (Drake et al., 2008, Winter and Wolfe, 1980) might increase the partial pressure of H2 in these cultures; this increase, in turn, likely contributed to increased Methanosaeta cell numbers because these acetoclastic methanogens have a higher threshold for H2 partial pressure than other methanogens in KB-1 (Barber et al., 2011, Thauer et al., 2008). Thus, the abundance of the Acetobacterium population was probably linked to increased Methanosaeta cells in the NV-3 cultures. Hence, the community composition of the NV-3 cultures differed the most from that of all other washed KB-1 cultures.

The H2 partial pressure in the KB-1 anoxic environment is most likely low due to its use by multiple KB-1 OTUs, and this probably helped the Methanomicrobiales OTU to dominate in all washed KB-1 cultures. Apart from Methanomicrobiales, other methanogens such as Methanomethylovorans, Methanosarcina, and Methanosaeta in KB-1 contain cytochromes (Thauer et al., 2008), and methanogens without cytochromes have lower thresholds for H2 partial pressure than methanogens with cytochromes (Thauer et al., 2008). This implies that the Methanomicrobiales OTU has a competitive advantage over other methanogens in the KB-1 environment. Hence, they became the most abundant archaeal OTU in all washed KB-1 cultures. Their dominance also fits the KB-1 anoxic environment, which is congenial to Dehalococcoides growth; these bacteria likewise do not possess cytochromes (Ahsanul Islam et al., 2010, Kube et al., 2005, Löffler et al., 2012, Seshadri et al., 2005). Moreover, the increased Methanomicrobiales population can also be attributed to their corrinoid-producing capability (DiMarco et al., 1990, Gorris and van der Drift, 1994, Mazumder et al., 1987), although they produce corrinoids (DiMarco et al., 1990, Stupperich and Kräutler, 1988) that differ from the cobalamin corrinoids required by Dehalococcoides. Recent experimental studies on Dehalococcoides (Yan et al., 2013, Yi et al., 2012) identified the ability of these bacteria to salvage not only vitamin B12, but also other corrinoid precursors or non-functional corrinoids to assemble or remodel their required corrinoids. Thus, the large abundance of the Methanomicrobiales OTU in actively dechlorinating washed KB-1 cultures suggests that Dehalococcoides are likely capable of using the corrinoids synthesized by methanogens in KB-1.

In addition to corrinoids, Dehalococcoides require acetate and H2 for carbon and energy. The ability of Sporomusa (Cord-Ruwisch et al., 1988, Stupperich and Konle, 1993), Spirochaetes, and Bacteroidetes (Leadbetter et al., 1999, Salmassi and Leadbetter, 2003, Rey et al., 2010) to produce these metabolites was possibly responsible for their increased presence in the washed KB-1 cultures. Another noticeable feature of these treated cultures was the decline in Geobacter, Methanomethylovorans, and Methanosarcina populations, although they were abundant in the original KB-1 (Duhamel, 2005, Hug, 2012, Waller, 2010, Zila, 2011). The Geobacter cell numbers probably decreased because all washed KB-1 cultures were maintained on TCE: Geobacter obtain less energy for growth from the dechlorination of TCE to cDCE than from PCE to cDCE, and they were reported to dechlorinate PCE and TCE only to cDCE (Amos et al., 2007, Duhamel and Edwards, 2007, Waller, 2010). Moreover, the omission of the resazurin indicator from all growth media potentially played a role in decreasing Geobacter cell numbers. Since it is a redox dye, resazurin can act as an electron shuttle (Ishii et al., 2008) for electron-loving Geobacter cells (Lovley et al., 2011), offering them unknown competitive advantages in KB-1. The Methanomethylovorans and Methanosarcina cell numbers were low, possibly because they were outcompeted by the Methanomicrobiales population in the KB-1 environment. This environment, as previously discussed, seemed to be especially favorable to these particular methanogens. Overall, the NV-3 cultures showed the most significant change in community composition and variety of organisms in the community, which potentially can lead to the creation of a robust and self-sustaining dechlorinating consortium of the kind usually observed in nature.

The vitamin-omission experiment with the washed KB-1 cultures was unable to identify the beneficial effect of readily available exogenous vitamins in the growth medium on Dehalococcoides. The time-course experiment with 50X (2%, vol/vol) diluted KB-1 cultures, in contrast, showed that the vitamin-added growth medium helped Dehalococcoides achieve a faster dechlorination rate (Figure 6.6A), although growth of these bacteria was similar in all six growth conditions (Figure 6.6B). The beneficial effect of a vitamin-added growth medium was further evident from the fastest dechlorination rate achieved by the NB-3 cultures inoculated in all vitamins-added medium (NB-3+RG). In fact, both the NB-3+NB and NB-3+RG cultures were faster than the NV-3 cultures cultivated in three different media: NV-3+NV, NV-3+NB, and NV-3+RG (Figure 6.6A). The NB-3 cultures were maintained in a medium containing all exogenous vitamins except vitamin B12 (Table 6.1), while the NV-3 cultures were grown in a vitamin-free medium. Although both were inoculated in a vitamin-added medium, the NV-3 cultures showed slower dechlorination rates than the NB-3 cultures, indicating that the latter benefitted more from vitamin addition than the NV-3 cultures. These results also suggest that apart from vitamin B12, other exogenous vitamins, including biotin and thiamine, are required for Dehalococcoides to dechlorinate fast. Moreover, the NV-3+NV and NV-3+RG cultures were electron donor limited until day 69, as evident from VC accumulation in these cultures (Figure 6.5), and this could also contribute to their slower dechlorination activities. However, this discrepancy in TCE dechlorination and ethene production rates was observed immediately after transferring the cultures to a vitamin-added medium; eventually, both vitamin-added and vitamin-free cultures were dechlorinating at similar rates after multiple feedings, as described before.

Thus, we showed that KB-1, a Dehalococcoides-containing, anaerobic, mixed microbial consortium, grew in an exogenous vitamin-free growth medium, and that this medium did not influence its dechlorination activity. However, the vitamin-omitted medium triggered an interesting shift in microbial populations in the KB-1 consortium. Due to the faster growth of Acetobacterium and Methanosaeta populations, as well as the presence of Dehalococcoides, Sporomusa, Spirochaetes, Bacteroidetes, and Methanomicrobiales OTUs, the NV-3 KB-1 cultures grown in the vitamin-free medium showed the largest change in microbial community composition. We also showed that the vitamin-added growth medium helped the diluted KB-1 cultures to dechlorinate faster than the cultures grown in vitamin-free media, although this difference was no longer noticeable later.

6.6. Conclusions

Growth experiments with KB-1 confirmed that Dehalococcoides, in a mixed microbial community, were capable of growing by dechlorinating chlorinated ethenes without any exogenous vitamins, including vitamin B12, in the growth medium. Also, the addition or omission of exogenous vitamins did not significantly influence their steady-state dechlorination rates or their growth rates. However, the recovery of dechlorination activity after transfer was slower in cultures lacking all vitamins, though full activity comparable to regular medium cultures was restored eventually.


Chapter 7: Summary, conclusions, and future work

7.1. Summary

The overall goal of this research was to investigate fundamental characteristics of the unusual metabolism of specialist bacteria Dehalococcoides mccartyi, both in isolates and in communities with other microbes. This ultimate goal was achieved by pursuing four specific objectives:

Objective 1: Construction of a detailed constraint-based systemic mathematical model of D. mccartyi metabolism

A pan-genome-scale constraint-based mathematical model of D. mccartyi metabolism was constructed from publicly available genome sequences of four D. mccartyi strains: 195, CBDB1, BAV1, and VS, as well as from published literature and biological databases describing their physiology. First, the D. mccartyi pan-genome and subsequent metabolic network were developed from the genome sequences. Then, in silico growth and metabolism of the organism were simulated with the flux balance analysis approach. This mathematical modeling approach employed the pan-genome-scale reconstructed metabolic network of D. mccartyi for mathematically interrogating, analyzing, and then quantitatively simulating the organisms’ metabolism. The model determined the optimal flux distributions, representing the maximum reaction rates, of all metabolic reactions under the conditions of maximum experimental growth rates. The modeling study was also very useful in identifying and analyzing various metabolic bottlenecks, as well as the energy-starved nature of these bacteria.
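For readers unfamiliar with the approach, flux balance analysis is conventionally posed as a linear program. The formulation below is the standard textbook form, given here only as a reminder; the symbols (S for the stoichiometric matrix, v for the flux vector, c for the objective weights, and the bound vectors) are generic notation assumed for this sketch and are not necessarily the notation used elsewhere in this thesis.

```latex
\[
\begin{aligned}
\max_{v}\quad & c^{\mathsf{T}} v \\
\text{subject to}\quad & S\,v = 0, \\
& v_{\min} \le v \le v_{\max}
\end{aligned}
\]
```

Here the equality constraint expresses the steady-state mass balance on every metabolite, the bounds encode reaction reversibility and measured uptake limits, and c typically selects the biomass (growth) reaction as the objective.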

Objective 2: Improved annotation and elucidation of D. mccartyi genes and metabolism through model-integrated analysis of high-throughput transcriptomic data

High-throughput transcriptomic data from gene-expression microarray experiments with pure and mixed cultures of D. mccartyi were analyzed and visualized using the organisms’ metabolic model as a framework. Various bioinformatic analyses such as clustering and functional enrichment analyses of the data were performed using the functional categories of metabolic genes from the model, and these analyses led to the identification of some co-expressed gene clusters important for D. mccartyi metabolism. Model-integrated data analysis also shed light on the poorly understood mechanism of energy conservation, as well as the genes and enzymes potentially involved in the respiratory process of these bacteria. Most importantly, integration of systems-level information such as transcriptomic data with the pan-genome-scale D. mccartyi metabolic model helped review the gene annotations, in addition to generating experimentally testable hypotheses regarding the functions of D. mccartyi hypothetical proteins. Experimental verification of such hypotheses will be useful for both elucidating D. mccartyi metabolism and improving functional annotations of their genes.

Objective 3: Biochemical characterization of isocitrate dehydrogenase and phosphomannose isomerase genes of D. mccartyi strain KB-1

Two metabolic genes from D. mccartyi strain KB-1, an NADP+-dependent isocitrate dehydrogenase and a phosphomannose isomerase, were biochemically characterized. The first gene was previously annotated as an isocitrate dehydrogenase with specificity for NAD+ only, while the second gene was annotated as a hypothetical protein/SIS domain protein in D. mccartyi genomes. Their annotations were reviewed or revised during the construction of the D. mccartyi metabolic model and subsequent model-integrated analysis of transcriptomic data from microarray experiments. The putative genes were cloned into E. coli using the genomic DNA from KB-1; then the recombinant proteins were overexpressed heterologously in E. coli, and finally, the purified recombinant proteins were tested for biochemical activities using appropriate enzymatic assays.

Objective 4: Exploring the influence of exogenous vitamin B12 (cobalamin) removal from the growth medium on a D. mccartyi-containing anaerobic dechlorinating microbial community

Growth and dechlorination activity of D. mccartyi in an anaerobic mixed microbial community, KB-1, were monitored in cultures established with and without any exogenous vitamins, including vitamin B12, in the growth medium. KB-1 cultures were set up in growth media containing all exogenous vitamins, all exogenous vitamins except vitamin B12, and no exogenous vitamins. Although no substantial change in steady-state dechlorination or growth rates was observed between these cultures, community composition did change for the KB-1 cultures grown in the medium containing no vitamins at all. These no-vitamin cultures contained a greater number of organisms at high abundance that were previously present in very low numbers in the original KB-1 culture. However, after a 2% transfer, initial dechlorination rates were found to be slower for the KB-1 cultures grown without vitamins in some cases, particularly when transferred to the vitamin-free medium.

7.2. Conclusions

The D. mccartyi metabolic model was the first report of such a detailed mathematical modeling effort for any organohalide-respiring bacterium. In fact, it is the only pan-genome-scale, species-level constraint-based metabolic model available for any organism, even though more than 100 genome-scale constraint-based models are available for organisms from bacteria, eukarya, and archaea. The model served as a comprehensive knowledgebase for D. mccartyi physiology and metabolism, and proved useful for elucidating some notable metabolic bottlenecks of these specialized bacteria, including their requirement of separate carbon and energy sources, their inability to use organic substrates (e.g., glucose, acetate, and fumarate) as electron donors, and the essentiality of cobalamin supply in their growth medium. The model also predicted that energy sources or electron donors, rather than carbon sources and/or electron acceptors, were the limiting substrates for D. mccartyi growth, revealing the energy-starved nature of the inefficient metabolism of these bacteria. The model simulations further suggested efficient interspecies hydrogen transfer as a potential reason for the faster growth of D. mccartyi in communities than in isolates. Most importantly, the model can be used as a common platform for visualizing and elucidating disparate high-throughput experimental “omics” data for any strain of D. mccartyi because of its pan-genome nature.

The systemic analysis of D. mccartyi metabolism by integrating high-throughput transcriptomic data with the metabolic model was very useful in gaining new insight into the unusual metabolism of these anaerobic bacteria. This analysis also helped review or revise the annotations of D. mccartyi genes, as experimental biochemical evidence supporting the suggested functions for the majority of genes of these bacteria is unavailable. Most notably, bioinformatic analyses, such as QT clustering, operon predictions, and functional enrichment analyses of the transcriptomic data using functional information for metabolic genes from the model, suggested putative functions for a number of hypothetical proteins, which currently constitute a large portion of D. mccartyi genomes. These suggested annotations could serve as valuable hypotheses for guiding and designing future biochemical experiments for unraveling the actual functions of D. mccartyi hypothetical proteins. Such endeavors, in turn, can significantly enhance our knowledge about the physiology and metabolism of these important bacteria.

Two such metabolic genes, whose putative annotations were reviewed or corrected during the modeling and transcriptomic analyses, were biochemically characterized from D. mccartyi strain KB-1. The characterized NADP+-dependent isocitrate dehydrogenase is a critical TCA-cycle enzyme, while the characterized phosphomannose isomerase, an enzyme usually involved in sugar metabolism and cell wall biogenesis, is possibly involved in osmotic stress adaptation in D. mccartyi. The gene encoding the former enzyme was annotated as an isocitrate dehydrogenase with specificity for the NAD+ cofactor only, whereas the gene encoding the latter enzyme was annotated as a hypothetical protein/SIS domain protein in D. mccartyi genomes. Thus, the experimental proof of their suggested functions indicated that other such putative annotations for D. mccartyi hypothetical proteins could be verified experimentally for increasing the knowledgebase of functionally characterized D. mccartyi genes. Sequence alignment, sequence homology, and phylogenetic analysis of gene sequences for both enzymes identified their novelty, as they were found to be different from the previously characterized isocitrate dehydrogenases and phosphomannose isomerases from other organisms.

Growth experiments with a D. mccartyi-containing anaerobic mixed microbial community, KB-1, in the presence and absence of exogenous vitamins in the growth medium showed that these bacteria obtained essential vitamins, including vitamin B12, from other microbes in the community. Most importantly, omitting all exogenous vitamins from, or adding them to, the growth medium did not affect steady-state growth and dechlorination rates of D. mccartyi, as evident from the unchanged dechlorination rates of all KB-1 cultures in long-term cultivations in different growth media. However, the microbial community shifted slightly to include more abundant acetogens and methanogens in the cultures without addition of any exogenous vitamins in the growth medium. Since acetogens provide carbon and energy sources for D. mccartyi in KB-1 by converting methanol, the electron donor supplied to KB-1, their growth can help to create a self-sustaining and robust dechlorinating microbial community. Such a community will be very useful in the bioremediation of chlorinated solvent contaminated sites around the world.

7.3. Future work

Based on the analyses of results from all research projects presented in this thesis, outlines for some future research initiatives are briefly described in the following sections:

An expanded version of the D. mccartyi metabolic model

Detailed models of microbial metabolism and physiology, such as genome-scale constraint-based models, are never complete because they change with the addition of new knowledge about the microbes of interest. The metabolic model for D. mccartyi was developed in 2010 using the information from available published literature and biochemical databases regarding their physiology, as well as the genome sequences of four D. mccartyi strains. Since then, the volume of published literature describing D. mccartyi physiology and metabolism has increased significantly. Also, a new complete genome sequence of D. mccartyi strain GT and a draft composite genome sequence of two very similar D. mccartyi strains from the KB-1 metagenome sequence are publicly available. Inclusion of these newly sequenced genomes, as well as the integration of information from recently published literature and updated biochemical databases, will certainly generate an expanded metabolic model reflecting the known biochemistry and physiology of D. mccartyi. Such an expanded model verified by recent biochemical and physiological information will be very useful for researchers in the bioremediation community, as well as for the scientific community in general.


Modeling the sequential reductive dechlorination reaction with the D. mccartyi metabolic model

The D. mccartyi metabolic model in its current form includes the reductive dechlorination reaction as a one-step conversion of tetrachloroethene (PCE) to ethene, or hexachlorobenzene (HCB) to dichlorobenzene (DCB). However, the actual reductive dechlorination of higher chlorinated compounds such as PCE and HCB comprises four sequential redox reactions, each of which replaces one chlorine atom with a hydrogen atom. These reactions successively generate trichloroethene (TCE), cis-1,2-dichloroethene (cDCE), vinyl chloride (VC), and ethene from PCE, or pentachlorobenzene (PeCB), tetrachlorobenzene (TeCB), trichlorobenzene (TCB), and dichlorobenzene (DCB) from HCB. Thus, the simplified one-step reductive dechlorination reaction included in the metabolic model is not completely consistent with the known physiology of D. mccartyi. Moreover, the rates of the four redox reactions differ, and the reactions were reported to be catalyzed by different reductive dehalogenase enzymes. The four redox reactions can be included in the metabolic model by converting the current static flux balance model into a dynamic one. This modification will accommodate all reactions, gene-reaction associations, and reaction rates, ultimately leading to a model that is more consistent with actual D. mccartyi physiology. Such a physiologically consistent model will be more useful than the current one for exploring the unknown interactions that are hypothesized to exist in D. mccartyi-containing mixed microbial communities.
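To illustrate why resolving the individual steps matters, the following minimal sketch treats the PCE-to-ethene pathway as four sequential first-order reactions and integrates them over time. It is not the thesis model: the function name, the first-order rate law, the rate constants, and the initial concentration are all hypothetical values chosen only to show that unequal step rates cause intermediates such as VC to accumulate transiently before ethene is formed.

```python
# Illustrative sketch only: PCE -> TCE -> cDCE -> VC -> ethene as four
# sequential steps. Rate constants and concentrations are hypothetical,
# not measured values, and first-order kinetics is an assumption.

SPECIES = ["PCE", "TCE", "cDCE", "VC", "ethene"]


def simulate_chain(initial_pce, rate_constants, dt=0.01, t_end=50.0):
    """Integrate the four-step chain with explicit Euler steps.

    rate_constants[i] is the first-order constant for converting
    SPECIES[i] into SPECIES[i + 1]. Returns final concentrations
    as a dict keyed by species name.
    """
    if len(rate_constants) != len(SPECIES) - 1:
        raise ValueError("expected one rate constant per dechlorination step")
    conc = [initial_pce] + [0.0] * (len(SPECIES) - 1)
    steps = int(round(t_end / dt))
    for _ in range(steps):
        # The flux through each step depends only on its own substrate.
        flux = [k * c for k, c in zip(rate_constants, conc[:-1])]
        for i, f in enumerate(flux):
            conc[i] -= f * dt      # substrate consumed
            conc[i + 1] += f * dt  # product formed (one Cl replaced by one H)
    return dict(zip(SPECIES, conc))


# Hypothetical constants: each step is slower than the one before it.
final = simulate_chain(initial_pce=1.0, rate_constants=[1.0, 0.8, 0.5, 0.2])
```

In a dynamic flux balance setting, the fixed rate constants used here would be replaced by uptake bounds on the corresponding reductive dehalogenase reactions, so that each step could carry its own gene-reaction association.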

A constraint-based dynamic metabolic model for a synthetic KB-1 community

Growth and dechlorination activities of D. mccartyi are slower and less robust in pure isolates than in mixed microbial communities such as KB-1. Although the enhanced activities exhibited by D. mccartyi in these communities point toward the existence of beneficial interactions between the microbes, these interactions and their nature have not yet been identified. With the construction of detailed metabolic models for D. mccartyi and M. thermoacetica in this thesis (Appendix E), and the availability of similar models for other major phylotypes in KB-1, it is now possible to develop a detailed constraint-based dynamic mathematical model of a reduced or simplified KB-1 community comprising D. mccartyi, G. sulfurreducens, M. thermoacetica, and M. barkeri. Such a constraint-based community model will be very useful for exploring and identifying the unknown beneficial interactions that likely exist in a D. mccartyi-containing syntrophic microbial community like KB-1. These results, in turn, can help researchers develop effective and practical strategies for the accelerated bioremediation by D. mccartyi of sites contaminated with chloro-organic xenobiotics.
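One common way to write such a dynamic constraint-based community model is the static-optimization form of dynamic flux balance analysis, sketched below. The notation is an assumption made for illustration and is not taken from this thesis: for organism k, S^(k) is its stoichiometric matrix, v^(k) its flux vector, c^(k) its objective vector, X^(k) its biomass concentration, and v_ex,j^(k) its exchange flux for the shared extracellular metabolite j with concentration C_j.

```latex
\begin{aligned}
&\text{for each organism } k:\quad
  \max_{v^{(k)}} \; \mu^{(k)} = c^{(k)\top} v^{(k)} \\
&\text{subject to}\quad S^{(k)} v^{(k)} = 0, \qquad
  v^{(k)}_{\min}(C) \le v^{(k)} \le v^{(k)}_{\max}(C), \\
&\frac{dX^{(k)}}{dt} = \mu^{(k)} X^{(k)}, \qquad
  \frac{dC_j}{dt} = \sum_{k} v^{(k)}_{\mathrm{ex},j}\, X^{(k)} .
\end{aligned}
```

In this form the organisms interact only through the shared metabolite pool C, so a metabolite produced by one member (for example, hydrogen or acetate) relaxes the uptake bounds of another at the next time step.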

Biochemical characterization of reannotated D. mccartyi genes

As with other sequenced genomes, a large portion (~50%) of D. mccartyi genes either encode hypothetical proteins or are putative genes with non-specific annotations. Based on various bioinformatic analyses of gene sequences during the metabolic model construction and the model-integrated analysis of transcriptomic data from D. mccartyi gene expression experiments, revised annotations for a list of >80 metabolic genes, including 9 hypothetical proteins, are presented in Tables S11-S12 in Table B1 in Appendix B of this thesis. The annotation of one of the 9 hypothetical proteins has already been biochemically verified, as presented in chapter 6 of this thesis. The rest of the suggested annotations can therefore serve as valuable hypotheses for the corresponding gene functions, and help guide researchers in designing appropriate biochemical experiments for identifying the functions of D. mccartyi hypothetical proteins and other genes. Such experimental initiatives will greatly enhance our knowledge of the physiology and metabolism of D. mccartyi, which are immensely important for bioremediation applications.

Pyrotag sequencing of different KB-1 cultures described in this thesis

Three different KB-1 cultures described in this thesis were developed to determine whether D. mccartyi could grow and dechlorinate without the addition of any exogenous vitamins, including vitamin B12, to the growth medium. Although dechlorination rates remained unchanged, removal of exogenous vitamins from the growth medium changed the community composition, as identified by qPCR surveys of the KB-1 cultures. The original KB-1 culture, from which the subcultures used in this thesis were derived, was a complex community of at least 30 different microbial OTUs. Although we were able to track the major and dominant OTUs with the qPCR technique, the full diversity and abundance of many other OTUs in the KB-1 cultures remained undiscovered. Pyrotag sequencing, or pyrosequencing, is a powerful high-throughput experimental technique that is capable of qualitatively determining the community structure and full diversity of complex microbial communities. Thus, applying this sequencing method to all KB-1 cultures derived and described in this thesis will identify, in great detail, the effect of removing or adding exogenous vitamins on the community structure. This type of analysis will be useful for better understanding the community dynamics and the roles played by different organisms in the community during the organohalide respiration of chlorinated xenobiotics by D. mccartyi in KB-1.


References

ADRIAN, L. (2009) ERC-group microflex: microbiology of Dehalococcoides-like Chloroflexi. Reviews in Environmental Science and Bio/Technology, 8, 225-229. ADRIAN, L., HANSEN SK, FUNG JM, GÖRISCH H & ZINDER, S. H. (2007a) Growth of Dehalococcoides strains with chlorophenols as electron acceptors. Environmental Science & Technology, 41, 2318-2323. ADRIAN, L., MANZ, W., SZEWZYK, U. & GORISCH, H. (1998) Physiological characterization of a bacterial consortium reductively dechlorinating 1,2,3 - and 1,2,4- trichlorobenzene. Applied and Environmental Microbiology, 64, 496-503. ADRIAN, L., RAHNENFÜHRER, J., GOBOM, J. & HÖLSCHER, T. (2007b) Identification of a chlorobenzene reductive dehalogenase in Dehalococcoides sp. strain CBDB1. Applied and Environmental Microbiology, 73, 7717-24. ADRIAN, L., SZEWZYK, U. & GORISCH, H. (2000a) Bacterial growth based on reductive dechlorination of trichlorobenzenes. Biodegradation 11, 73-81. ADRIAN, L., SZEWZYK, U., WECKE, J. & GÖRISCH, H. (2000b) Bacterial dehalorespiration with chlorinated benzenes. Nature, 408, 580-583. AHSANUL ISLAM, M., EDWARDS, E. A. & MAHADEVAN, R. (2010) Characterizing the metabolism of Dehalococcoides with a constraint-based model. PLoS Computational Biology, 6, e1000887. doi:10.1371/journal.pcbi.1000887. AHSANUL ISLAM, M., WALLER, A. S., HUG, L. A., PROVART, N. J., EDWARDS, E. A. & MAHADEVAN, R. (2013) New Insights into Dehalococcoides mccartyi Metabolism from a Model-Integrated System-Level Analysis of D. mccartyi Transcriptomes. PLoS ONE. ALLISON, D. B., CUI, X., PAGE, G. P. & SABRIPOUR, M. (2006) Microarray data analysis: from disarray to consolidation and consensus. Nature Reviews Genetics, 7, 55-65. ALTSCHUL, S. F., MADDEN, T. L., SCHÄFFER, A. A., ZHANG, J., ZHANG, Z., MILLER, W. & LIPMAN, D. J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res, 25, 3389-3402. AMOS, B. K., SUNG, Y., FLETCHER, K., GENTRY, T., WU, W., CRIDDLE, C., ZHOU, J. & LÖFFLER, F. 
(2007) Detection and quantification of Geobacter lovleyi strain SZ: implications for bioremediation at tetrachloroethene and uranium-impacted sites. Applied and Environment Microbiology, 73, 6898-6904. APWEILER, R., JESUS MARTIN, M., O'ONOVAN, C., MAGRANE, M., ALAM-FARUQUE, Y. & ANTUNES, R., ET AL (2012) Reorganizing the protein space at the Universal Protein Resource (UniProt). Nucleic Acids Research, 40, D71-75. ARAKAWA, K., YAMADA, Y., SHINODA, K., NAKAYAMA, Y. & TOMITA, M. (2006) GEM system: automatic prototyping of cell-wide metabolic pathway models from genomes. BMC Bioinformatics, 7, 168. ARAVIND, L. (2000) Guilt by association: contextual information in genome analysis. Genome Research, 10, 1074-7. ASHBURNER, M., BALL, C., BLAKE, J., BOTSTEIN, D., BUTLER, H., CHERRY, J., DAVIS, A., DOLINSKI, K., DWIGHT, S., EPPIG, J., HARRIS, M., HILL, D., ISSEL- TARVER, L., KASARSKIS, A., LEWIS, S., MATESE, J., RICHARDSON, J.,

161

RINGWALD, M., RUBIN, G. & SHERLOCK, G. (2000) Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nature Genetics 25, 25-29. ATANASOVA, L. & DRUZHININA, I. (2010) Global nutrient profiling by Phenotype MicroArrays: a tool complementing genomic and proteomic studies in conidial fungi. Journal of Zhejiang University Science B, 11, 151-168. ATSDR (2013) Agency for Toxic Substances and Disease Registry (http://www.atsdr.cdc.gov). ATSDR_TCE_PCE (2010) http://www.atsdr.cdc.gov/sites/lejeune/tce_pce.html. AULENTA, F., CANOSA, A., REALE, P., ROSSETTI, S., PANERO, S. & MAJONE, M. (2009) Microbial reductive dechlorination of trichloroethene to ethene with electrodes serving as electron donors without the external addition of redox mediators. Biotechnology and Bioengineering, 103, 85-91. AZIZ, R. K., BARTELS, D., BEST, A. A., DEJONGH, M., DISZ, T., EDWARDS, R. A., FORMSMA, K., GERDES, S., GLASS, E. M., KUBAL, M., MEYER, F., OLSEN, G. J., OLSON, R., OSTERMAN, A. L., OVERBEEK, R. A., MCNEIL, L. K., PAARMANN, D., PACZIAN, T., PARRELLO, B., PUSCH, G. D., REICH, C., STEVENS, R., VASSIEVA, O., VONSTEIN, V., WILKE, A. & ZAGNITKO, O. (2008) The RAST Server: rapid annotations using subsystems technology. BMC Genomics, 9, 75. BAIROCH, A. (2000) The ENZYME database in 2000. Nucleic Acids Research 28, 304-305. BANERJEE, R. & RAGSDALE, S. W. (2003) The many faces of vitamin B12: catalysis by cobalamin-dependent enzymes. Annual Review of Biochemistry, 72, 209-247. BARBER, R. D., ZHANG, L., HARNACK, M., OLSON, M., KAUL, R., INGRAM-SMITH, C. & SMITH, K. (2011) Complete genome sequence of Methanosaeta concilii, a specialist in aceticlastic methanogenesis. Journal of Bacteriology, 193, 3668-3669. BAUMANN, K. (2011) Systems biology: Scaling in flies. Nat Rev Mol Cell Biol, 12, 767-767. BECKER, S. A., FEIST, A. M., MO, M. L., HANNUM, G., PALSSON, B. O. & HERRGARD, M. J. 
(2007) Quantitative prediction of cellular metabolism with constraint-based models: the COBRA Toolbox. Nature Protocols 2, 727-738. BECKER, S. A. & PALSSON, B. O. (2008) Context-specific metabolic networks are consistent with experiments. PLoS Computational Biology, 4, e1000082. BEDARD, D. L., RITALAHTI, K. M. & LOFFLER, F. E. (2007) The Dehalococcoides population in sediment-free mixed cultures metabolically dechlorinates the commercial mixture Aroclor 1260. Applied and Environment Microbiology, 73, 2513-2521. BERGMAN, N. H., PASSALACQUA, K. D., HANNA, P. C. & QIN, Z. S. (2007) Operon prediction for sequenced bacterial genomes without experimental information. Applied and Environment Microbiology, 73, 846-854. BERMAN, H. M., WESTBROOK, J., FENG, Z., GILLILAND, G., BHAT, T., WEISSIG, H., SHINDYALOV, I. & BOURNE, P. (2000) The Protein Data Bank. Nucleic Acids Research 28, 235-242. BIEL, S., KLIMMEK, O., GROSS, R. & KRÖGER, A. (1996) Flavodoxin from Wolinella succinogenes. Archives of Microbiology, 166, 122-127. BOCHNER, B. R., GADZINSKI, P. & PANOMITROS, E. (2001) Phenotype microarrays for high-throughput phenotypic testing and assay of gene function. Genome Research, 11, 1246-1255. BOCK, A. K., KUNOW, J., GLASEMACHER, J. & SCHÖNHEIT, P. (1996) Catalytic properties, molecular composition and sequence alignments of pyruvate: ferredoxin

162

oxidoreductase from the methanogenic archaeon Methanosarcina barkeri (strain Fusaro). European Journal of Biochemistry, 237, 35-44. BOECKMANN, B., BAIROCH, A., APWEILER, R., BLATTER, M.-C., ESTREICHER, A., GASTEIGER, E., MARTIN, M. J., MICHOUD, K., O'DONOVAN, C., PHAN, I., PILBOUT, S. & SCHNEIDER, M. (2003) The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003. Nucleic Acids Research, 31, 365-370. BOLTON, E. E., WANG, Y., THIESSEN, P. A. & BRYANT, S. H. (2008) PubChem: integrated platform of small molecules and biological activities. Chapter 12: Annual Reports in Computational Chemistry BONARIUS, H. P. J., SCHMID, G. & TRAMPER, J. (1997) Flux analysis of underdetermined metabolic networks: the quest for the missing constraints. Trends in Biotechnology, 15, 308-314. BORGES, N., RAMOS, A., RAVEN, N., SHARP, R. & SANTOS, H. (2002) Comparative study of the thermostabilizing properties of mannosylglycerate and other compatible solutes on model enzymes. Extremophiles, 6, 209-216. BORODINA, I., KRABBEN, P. & NIELSEN, J. (2005) Genome-scale analysis of Streptomyces coelicolor A3(2) metabolism. Genome Research 15, 820-829. BRAMMER, L. A., SMITH, J. M., WADE, H. & MEYERS, C. F. (2011) 1-Deoxy-D-xylulose 5-phosphate synthase catalyzes a novel random sequential mechanism. Journal of Biological Chemistry, 286, 36522-31. BRUSCHI, M. & GUERLESQUIN, F. (1988) Structure, function and evolution of bacterial ferredoxins. FEMS (Federation of European Microbiological Societies) Microbiology Reviews, 4, 155-175. BUCHAKJIAN, M. R. & KORNBLUTH, S. (2010) The engine driving the ship: metabolic steering of cell proliferation and death. Nat Rev Mol Cell Biol, 11, 715-727. BUNGE, M., ADRIAN, L., KRAUS, A., OPEL, M., LORENZ, W. G., ANDREESEN, J. R., GÖRISCH, H. & LECHNER, U. (2003) Reductive dehalogenation of chlorinated dioxins by an anaerobic bacterium. Nature, 421, 357-360. CANADA, H. & LIMITED, R. B. E. (1995) Survey of and Occurrences in Canadian Groundwater. 
CASPI, R., ALTMAN, T., DREHER, K., FULCHER, C., SUBHRAVETI, P., KESELER, I., KOTHARI, A., KRUMMENACKER, M., LATENDRESSE, M., MUELLER, L., ONG, Q., PALEY, S., PUJAR, A., SHEARER, A., TRAVERS, M., WEERASINGHE, D., ZHANG, P. & KARP, P. (2012) The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of pathway/genome databases. Nucleic Acids Research 36, D742-753. CHANG, A., SCHEER, M., GROTE, A., SCHOMBURG, I. & SCHOMBURG, D. (2009) BRENDA, AMENDA and FRENDA the enzyme information system: new content and tools in 2009. Nucleic Acids Research 37, D588-592. CHEN, F., MACKEY, A., VERMUNT, J. & ROOS, D. (2007) Assessing performance of orthology detection strategies applied to eukaryotic genomes. PLoS ONE 2, e383. CHENG, D. & HE, J. (2009) Isolation and characterization of "Dehalococcoides" sp. strain MB, which dechlorinates tetrachloroethene to trans-1,2-dichloroethene. Applied and Environmental Microbiology, 75, 5910-5918. CHUANG, H. Y., HOFREE, M. & IDEKER, T. (2010) A decade of systems biology. Annual Review of Cell and Developmental Biology, 26, 721-744.

163

CLARK, N. R. & MA’AYAN, A. (2011) Introduction to statistical methods to analyze large data sets: Principal components analysis. Science Signalling, 4, tr3. CORD-RUWISCH, R., SEITZ, H.-J. & CONRAD, R. (1988) The capacity of hydrogenotrophic anaerobic bacteria to compete for traces of hydrogen depends on the redox potential of the terminal electron acceptor. Archives of Microbiology, 149, 350-357. COVERT, M. W., SCHILLING, C. H., FAMILI, I., EDWARDS, J. S., GORYANIN, I., SELKOV, E. & PALSSON, B. O. (2001) Metabolic modeling of microbial strains in silico. Trends in Biochemical Sciences 26, 179-186. COVERT, M. W., XIAO, N., CHEN, T. & KARR, J. R. (2008) Integrating metabolic, transcriptional regulatory and signal transduction models in Escherichia coli. Bioinformatics, 24, 2044-2050. CUPPLES, A. M., SPORMANN, A. & MCCARTY, P. (2004) Vinyl chloride and cis- Dichloroethene dechlorination kinetics and microorganism growth under substrate limiting conditions. Environmental Science & Technology, 38, 1102-1107. CUPPLES, A. M., SPORMANN AM & MCCARTY, P. L. (2003) Growth of a Dehalococcoides-like microorganism on vinyl chloride and cis-dichloroethene as electron acceptors as determined by competitive PCR. Applied and Environment Microbiology, 69, 953-959. CURLEY, G. P. & VOORDOUW, G. (1988) Cloning and sequencing of the gene encoding flavodoxin from Desulfovibrio vulgaris Hildenborough. FEMS (Federation of European Microbiological Societies) Microbiology Letters, 49, 295-299. DE CASTRO, E., SIGRIST, C. J., GATTIKER, A., BULLIARD, V., LANGENDIJK- GENEVAUX, P. S., GASTEIGER, E., BAIROCH, A. & HULO, N. (2006) ScanProsite: detection of PROSITE signature matches and ProRule-associated functional and structural residues in proteins. Nucleic Acids Research, 34, W362-365. DEAN, A. M. & GOLDING, G. (1997) Protein engineering reveals ancient adaptive replacements in isocitrate dehydrogenase. 
Proceedings of the National Academy of Sciences of the United States of America, 94, 3104-3109. DEJONGH, M., ET AL (2007) Toward the automated generation of genome-scale metabolic networks in the SEED. BMC Bioinformatics, 8, 139. DEJONGH, M., FORMSMA, K., BOILLOT, P., GOULD, J., RYCENGA, M. & BEST, A. (2007) Toward the automated generation of genome-scale metabolic networks in the SEED. BMC Bioinformatics, 8, 139. DEPPENMEIER, U. (2002) Redox-driven proton translocation in methanogenic Archaea. Cellular and Molecular Life Sciences 59, 1513-1533. DEPPENMEIER, U. (2004) The membrane-bound electron transport system of Methanosarcina species. Journal of Bioenergetics and Biomembranes 36, 55-64. DEVOS, D. & VALENCIA, A. (2001) Intrinsic errors in genome annotation. Trends in Genetics 17, 429-431. DI VENTURA, B., LEMERLE, C., MICHALODIMITRAKIS, K. & SERRANO, L. (2006) From in vivo to in silico biology and back. Nature, 443, 527-533. DIEKERT, G. & WOHLFARTH, G. (1994) Metabolism of homoacetogens. Antonie Van Leeuwenhoek, 66, 209-221. DIMARCO, A. A., BOBIK, T. A. & WOLFE, R. S. (1990) Unusual coenzymes of methanogenesis. Annual Review of Biochemistry, 59.

164

DISTEFANO, T. D., GOSSETT, J. & ZINDER, S. H. (1991) Reductive dechlorination of high concentrations of tetrachloroethene to ethene by an anaerobic enrichment culture in the absence of methanogenesis Applied and Environment Microbiology, 57, 2287-2292. DOHERTY, R. E. (2000a) A History of the Production and Use of Carbon Tetrachloride, Tetrachloroethylene, Trichloroethylene and 1,1,1-Trichloroethane in the United States: Part 1-- Historical Background; Carbon Tetrachloride and Tetrachloroethylene. Environmental Forensics, 1, 69-81. DOHERTY, R. E. (2000b) A History of the Production and Use of Carbon Tetrachloride, Tetrachloroethylene, Trichloroethylene and 1,1,1-Trichloroethane in the United States: Part 2--Trichloroethylene and 1,1,1-Trichloroethane. Environmental Forensics, 1, 83-93. DOHERTY, R. E. (2012) The Manufacture, Use, and Supply of Chlorinated Solvents in the United States During World War II. Environmental Forensics, 13, 7-26. DOLFING, J. & HARRISON, B. K. (1992) Gibbs free energy of formation of halogenated aromatic compounds and their potential role as electron acceptors in anaerobic environments. Environmental Science & Technology, 26, 2213-2218. DOLFING, J. & JANSSEN, D. B. (1994) Estimates of Gibbs free energies of formation of chlorinated aliphatic compounds. Biodegradation, 5, 21-28. DOMACH, M. M., LEUNG, S. K., CAHN, R. E., COCKS, G. G. & SHULER, M. L. (1984) Computer model for glucose-limited growth of a single cell of Escherichia coli B/r-A. Biotechnology and Bioengineering, 26, 1140-1140. DRAKE, H. L. (1994) Acetogenesis, acetogenic bacteria, and the acetyl-CoA "Wood/Ljungdahl" pathway: past and current perspectives. IN DRAKE, H. L. (Ed.) Acetogenesis. Chapman & Hall. DRAKE, H. L. & DANIEL, S. L. (2004) Physiology of the thermophilic acetogen Moorella thermoacetica. Research in Microbiology, 155, 869-883. DRAKE, H. L., GOSSNER, A. S. & DANIEL, S. L. (2008) Old acetogens, new light. Annals of the New York Academy of Sciences, 1125, 100-128. 
DROSS, F., GEISLER, V., LENGER, R., THEIS, F. & KRAFFT, T., ET AL. (1992) The quinone-reactive Ni/Fe-hydrogenase of Wolinella succinogenes. European Journal of Biochemistry, 206, 93-102. DUARTE, N. C., HERRGÅRD, M. J. & PALSSON, B. O. (2004) Reconstruction and validation of Saccharomyces cerevisiae iND750, a fully compartmentalized genome-scale metabolic model. Genome Research, 14, 1298-1309. DUHAMEL, M. & EDWARDS, E. (2007) Growth and yields of dechlorinators, acetogens, and methanogens during reductive dechlorination of chlorinated ethenes and dihaloelimination of 1, 2-dichloroethane. Environmental Science & Technology, 41, 2303-2310. DUHAMEL, M. & EDWARDS, E. A. (2006) Microbial composition of chlorinated ethene- degrading cultures dominated by Dehalococcoides. FEMS (Federation of European Microbiological Societies) Microbiology Ecology, 58, 538-549. DUHAMEL, M., MO, K. & EDWARDS, E. A. (2004) Characterization of a highly enriched Dehalococcoides-containing culture that grows on vinyl chloride and trichloroethene. Applied and Environment Microbiology, 70, 5538-45. DUHAMEL, M., WEHR, S. D., YU, L., H, R., SEEPERSAD, D., DWORATZEK, S., COX, E. E. & EDWARDS, E. A. (2002) Comparison of anaerobic dechlorinating enrichment

165

cultures maintained on tetrachloroethene, trichloroethene, cis-dichloroethene and vinyl chloride. Water Research, 36, 4193-4202. DUHAMEL, M. A. (2005) Community structure and dynamics of anaerobic chlorinated ethene- degrading enrichment cultures. Chemical Engineering and Applied Chemistry. Toronto, University of Toronto. DUNWELL, J. M., CULHAM A, CARTER CE, SOSA-AGUIRRE CR, GOODENOUGH PW (2001) Evolution of functional diversity in the cupin superfamily. Trends in Biochemical Sciences, 26, 740-746. DUNWELL, J. M., KHURI, S. & GANE, P. (2000) Microbial relatives of the seed storage proteins of higher plants: conservation of structure and diversification of function during evolution of the cupin superfamily. Microbiology and Molecular Biology Reviews, 64, 153-179. DUROT, M., BOURGUIGNON, P.-Y. & SCHACHTER, V. (2009) Genome-scale models of bacterial metabolism: reconstruction and applications. FEMS Microbiology Reviews, 33, 164-190. EDGAR, R. C. (2004) MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics, 5, 113. EDWARDS, E. A. & COX, E. (1997) Field and laboratory studies of sequential anaerobic– aerobic chlorinated solvent biodegradation. In situ and on-site bioremediation. Fourth International Symposium on In Situ and On-Site Bioreclamation, New Orleans, LA, Columbus, OH: Battelle Press. EDWARDS, E. A. & GRBIC-GALIC, D. (1994) Anaerobic degradation of toluene and o-xylene by a methanogenic consortium. Applied and Environment Microbiology, 60, 313-322. EDWARDS, J. S., COVERT, M. & PALSSON, B. (2002) Metabolic modelling of microbes: the flux-balance approach. Environmental Microbiology, 4, 133-140. EISEN, M. B., SPELLMAN, P. T., BROWN, P. O. & BOTSTEIN, D. (1998) Cluster analysis and display of genome-wide expression patterns. Proceedings of the National Academy of Sciences of the United States of America, 95, 14863-8. EISENSTEIN, K. K. & WANG, J. H. (1969) Conversion of light to chemical free energy. 
Journal of Biological Chemistry 244, 1720-1728. EL FANTROUSSI, S., NAVEAU, H. & AGATHOS, S. N. (1998) Anaerobic dechlorinating bacteria. Biotechnology Progress, 14, 167-188. EMPADINHAS, N., ALBUQUERQUE, L., COSTA, J., ZINDER, S., SANTOS, M., SANTOS, H. & DA COSTA, M. (2004) A gene from the mesophilic bacterium Dehalococcoides ethenogenes encodes a novel mannosylglycerate synthase. Journal of Bacteriology, 186, 4075-4084. EMPADINHAS, N., DA COSTA MS (2011) Diversity, biological roles and biosynthetic pathways for sugar-glycerate containing compatible solutes in bacteria and archaea. Environmental Microbiology, 13, 2056-2077. ESCALANTE-SEMERENA, J. C. (2007) Conversion of cobinamide into adenosylcobamide in bacteria and archaea. Journal of Bacteriology, 189, 4555-4560. FANI, R. & FONDI, M. (2009) Origin and evolution of metabolic pathways. Physics of Life Reviews, 6, 23-52. FEIST, A. M., HENRY, C., S, , REED, J., L, , KRUMMENACKER, M., JOYCE, A., R, , KARP, P., D, , BROADBELT, L. J., HATZIMANIKATIS, V. & PALSSON, B. O. (2007) A genome-scale metabolic reconstruction for Escherichia coli K-12 MG1655 that

166

accounts for 1260 ORFs and thermodynamic information. Molecular Systems Biology, 3, 121. FEIST, A. M., HERRGARD, M. J., THIELE, I., REED, J. L. & PALSSON, B. O. (2009) Reconstruction of biochemical networks in microorganisms. Nature Reviews: Microbiology 7, 129-143. FEIST, A. M., SCHOLTEN, J., PALSSON, B., BROCKMAN, F. & IDEKER, T. (2006) Modelling methanogenesis with a genome-scale metabolic reconstruction of Methanosarcina barkeri Molecular Systems Biology, 2, 1-14. FELL, D. A. (1992) Metabolic control analysis: a survey of its theoretical and experimental development. Biochemical Journal, 286, 313-330. FONTAINE, F. E., PETERSON, W. H., MCCOY, E., JOHNSON, M. J. & RITTER, G. J. (1942) A new type of glucose fermentation by Clostridium thermoaceticum. Journal of Bacteriology, 43, 701-715. FRANCKE, C., SIEZEN, R. & TEUSINK, B. (2005) Reconstructing the metabolic network of a bacterium from its genome. Trends in Microbiology, 13, 550-558. FREEDMAN, D. L. & GOSSETT, J. M. (1989) Biological reductive dechlorination of tetrachloroethylene and trichloroethylene to ethylene under methanogenic conditions. Applied and Environment Microbiology, 55, 2144-2151. FUCHS, G. (2011) Alternative Pathways of Carbon Dioxide Fixation: Insights into the Early Evolution of Life? Annual Review of Microbiology, 65, 631-658. FUTAGAMI, T., GOTO, M. & FURUKAWA, K. (2008) Biochemical and genetic bases of dehalorespiration. The Chemical Record, 8, 1-12. GASTEIGER ET AL (2003) ExPASY: The proteomics server for in-depth protein knowledge and analysis. Nucleic Acids Research, 31, 3784-3788. GEHLENBORG, N., O'DONOGHUE, S. I., BALIGA, N. S., GOESMANN, A., HIBBS, M. A., KITANO, H., KOHLBACHER, O., NEUWEGER, H., SCHNEIDER, R., TENENBAUM, D. & GAVIN, A. C. (2010) Visualization of omics data for systems biology. Nature Methods, 7, S56-68. GERRITSE, J., RENARD, V., GOMES, T. M. P., LAWSON, P. A., COLLINS, M. D. & GOTTSCHAL, J. C. (1996) Desulfitobacterium sp. 
strain PCE1, an anaerobic bacterium that can grow by reductive dechlorination of tetrachloroethene or ortho-chlorinated phenols. Archives of Microbiology, 165, 132-140. GIANCHANDANI, E. P., CHAVALI, A. K. & PAPIN, J. A. (2010) The application of flux balance analysis in systems biology. Wiley Interdisciplinary Reviews: Systems Biology and Medicine, 2, 372-382. GLOD, G., BRODMANN, U., ANGST, W., HOLLIGER, C. & SCHWARZENBACH, R. (1997) Cobalamin-mediated reduction of cis- and trans-dichloroethene, 1, 1- dichloroethene, and vinyl chloride in homogenous aqueous solution: reaction kinetics and mechanistic considerations. Environmental Science & Technology, 31, 3154-3160. GORRIS, L. G. & VAN DER DRIFT, C. (1994) Cofactor contents of methanogenic bacteria reviewed. Biofactors, 4, 139-145. GREEN, M. & KARP, P. (2004) A Bayesian method for identifying missing enzymes in predicted metabolic pathway databases. BMC Bioinformatics, 5, 76. GREEN, M. L. & KLEIN, T. (2002) A multidomain TIGR/olfactomedin protein family with conserved structural similarity in the N-terminal region and conserved motifs in the C- terminal region. Molecular and Cellular Proteomics, 1, 394-403.

167

GRIBBLE, G. W. (1992) Naturally Occurring Organohalogen Compounds--A Survey. Journal of Natural Products, 55, 1353-1395. GRIBBLE, G. W. (2010) Naturally Occurring Organohalogen Compounds - A Comprehensive Update, Springer-Verlag Vienna. GRIBBLE, G. W. (2012) Occurrence of halogenated alkaloids. Alkaloids Chem Biol, 71, 1-165. GUHA, N., LOOMIS, D., GROSSE, Y., LAUBY-SECRETAN, B., EL GHISSASSI, F., BOUVARD, V., BENBRAHIM-TALLAA, L., BAAN, R., MATTOCK, H., STRAIF, K. & GROUP, I. A. F. R. O. C. M. W. (2012) Carcinogenicity of trichloroethylene, tetrachloroethylene, some other chlorinated solvents, and their metabolites. Lancet Oncology, 13, 1192-1193. GUINDON, S., DUFAYARD, J., LEFORT, V., ANISIMOVA, M., HORDIJK, W. & GASCUEL, O. (2010) New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Systematic Biology, 59, 307-321. GUSTAFSON, D. L., LONG, M. E., THOMAS, R. S., BENJAMIN, S. A. & YANG, R. S. H. (2000) Comparative Hepatocarcinogenicity of Hexachlorobenzene, Pentachlorobenzene, 1,2,4,5-Tetrachlorobenzene, and 1,4-Dichlorobenzene: Application of a Medium-Term Liver Focus Bioassay and Molecular and Cellular Indices. Toxicological Sciences, 53, 245-252. HAGGBLOOM, M. M. & BOSSERT, I. D. (2003) Halogenated organic compounds-a global perspective. IN HAGGBLOOM, M. M., BOSSERT ID, WULDER MA (Ed.) Microbial Processes and Environmental Applications. Boston, Kluwer Academic Publishers. HANSEN, T., URBANKE C, SCHÖNHEIT P (2004) Bifunctional phosphoglucose/phosphomannose isomerase from the hyperthermophilic archaeon Pyrobaculum aerophilum. Extremophiles, 8, 507-512. HANSEN, T., WENDORFF, D. & SCHÖNHEIT, P. (2004) Bifunctional phosphoglucose/phosphomannose isomerases from the archaea Aeropyrum pernix and Thermoplasma acidophilum constitute a novel enzyme family within the phosphoglucose isomerase superfamily. Journal of Biological Chemistry, 279, 2262-2272. HANSON, A. D., PRIBAT, A., WALLER, J. C. & DE CRÉCY-LAGARD, V. 
(2009) 'Unknown' proteins and 'orphan' enzymes: the missing half of the engineering parts list -- and how to find it. Biochemical Journal, 425, 1-11. HAROLD, F. M. & MALONEY, P. (1996) Energy transduction by ion currents. IN NEIDHARDT, F. C., CURTISS R, INGRAHAM JL (Ed.) Escherichia coli and Salmonella: cellular and molecular biology. Washington, D. C., ASM Press. HE, J., HOLMES, V. F., LEE, P. K. & ALVAREZ-COHEN, L. (2007) Influence of vitamin B12 and cocultures on the growth of Dehalococcoides isolates in defined medium. Applied and Environment Microbiology, 73, 2847-53. HE, J., RITALAHTI, K. M., YANG, K. L., KOENIGSBERG, S. S. & LÖFFLER, F. E. (2003) Detoxification of vinyl chloride to ethene coupled to growth of an anaerobic bacterium. Nature, 424, 62-65. HE, J., SUNG, Y., DOLLHOPF, M., FATHEPURE, B., TIEDJE, J. & LÖFFLER, F. (2002) Acetate versus hydrogen as direct electron donors to stimulate the microbial reductive dechlorination process at chloroethene-contaminated sites. Environmental Science & Technology, 36, 2945-3952. HE, J., SUNG, Y., KRAJMALNIK-BROWN, R., RITALAHTI, K. & LÖFFLER, F. (2005) Isolation and characterization of Dehalococcoides sp. strain FL2, a trichloroethene

168

(TCE)- and 1,2- dechloroethene-respiring anaerobe. Environmental Microbiology, 7, 1442-1450. HEDDERICH, R. (2004) Energy-converting [NiFe] hydrogenases from archaea and extremophiles: ancestors of complex I. Journal of Bioenergetics and Biomembranes 36, 65-75. HEINEMANN, M., KÜMMEL, A., RUINATSCHA, R. & PANKE, S. (2005) In silico genome- scale reconstruction and validation of the Staphylococcus aureus metabolic network. Biotechnology and Bioengineering, 92, 850-864. HEINEMANN, M. & SAUER, U. (2010) Systems biology of microbial metabolism. Current Opinion in Microbiology, 13, 337-343. HEINRICH, R. & RAPOPORT, T. A. (1974) A Linear Steady-State Treatment of Enzymatic Chains. European Journal of Biochemistry, 42, 89-95. HENDRICKSON, E., PAYNE, J., YOUNG, R., STARR, M., PERRY, M., FAHNESTOCK, S., ELLIS, D. & EBERSOLE, R. C. (2002) Molecular analysis of Dehalococcoides 16S ribosomal DNA from chloroethene-contaminated sites throughout North America and Europe. Applied and Environment Microbiology, 68, 485-495. HENRY, C. S., DEJONGH, M., BEST, A. A., FRYBARGER, P. M., LINSAY, B. & STEVENS, R. L. (2010) High-throughput generation, optimization and analysis of genome-scale metabolic models. Nature Biotechnology, 28, 977-982. HERRGÅRD, M. J., FONG, S. & PALSSON, B. (2006) Identification of genome-scale metabolic network models using experimentally measured flux profiles. PLoS Computational Biology, 2, e72. doi:10.1371/journal.pcbi.0020072. HERRMANN, G., JAYAMANI, E., MAI, G. & BUCKEL, W. (2008) Energy conservation via electron-transferring flavoprotein in anaerobic bacteria. Journal of Bacteriology, 190, 784-91. HEYER, L. J., KRUGLYAK, S. & YOOSEPH, S. (1999) Exploring expression data: identification and analysis of coexpressed genes. Genome Research, 9, 1106-15. HOCHULI, E. (1990) Purification of recombinant proteins with metal chelate adsorbent. Genetic Engineering (N Y), 12, 87-98. HOCHULI, E., BANNWARTH, W., DÖBELI, H., GENTZ, R. & STÜBER, D. 
(1988) Genetic approach to facilitate purification of recombinant proteins with a novel metal chelate adsorbent. Nature Biotechnology, 6, 1321-1325. HOFFMANN, B., OBERHUBER, M., STUPPERICH, E., BOTHE, H., BUCKEL, W., KONRAT, R. & KRÄUTLER, B. (2000) Native corrinoids from Clostridium cochlearium are adeninylcobamides: spectroscopic analysis and identification of pseudovitamin B(12) and factor A. Journal of Bacteriology, 182, 4773-4782. HOLLIGER, C., HAHN, D., HARMSEN, H., LUDWIG, W., SCHUMACHER, W., TINDALL, B., VAZQUEZ, F., WEISS, N. & ZEHNDER, A. J. B. (1998a) Dehalobacter restrictus gen. nov. and sp. nov., a strictly anaerobic bacterium that reductively dechlorinates tetra- and trichloroethene in an anaerobic respiration. Archives of Microbiology, 169, 313-321. HOLLIGER, C., SCHRAA, G., STAMS, A. J. & ZEHNDER, A. J. (1993) A highly purified enrichment culture couples the reductive dechlorination of tetrachloroethene to growth. Applied and Environmental Microbiology, 59, 2991-2997. HOLLIGER, C., WOHLFARTH, G. & DIEKERT, G. (1998b) Reductive dechlorination in the energy metabolism of anaerobic bacteria. FEMS (Federation of European Microbiological Societies) Microbiology Reviews, 22, 383-398.

HOLMES, V. F., HE, J., LEE, P. & ALVAREZ-COHEN, L. (2006) Discrimination of multiple Dehalococcoides strains in a trichloroethene enrichment by quantification of their reductive dehalogenase genes. Applied and Environmental Microbiology, 72, 5877-5883. HÖLSCHER, T., GÖRISCH, H. & ADRIAN, L. (2003) Reductive dehalogenation of chlorobenzene congeners in cell extracts of Dehalococcoides sp. strain CBDB1. Applied and Environmental Microbiology, 69, 2999-3001. HÖLSCHER, T., KRAJMALNIK-BROWN, R., RITALAHTI, K. M., VON WINTZINGERODE, F., GÖRISCH, H., LÖFFLER, F. & ADRIAN, L. (2004) Multiple nonidentical reductive-dehalogenase-homologous genes are common in Dehalococcoides. Applied and Environmental Microbiology, 70, 5290-5297. HOTELLING, H. (1933) Analysis of complex statistical variables into principal components. Journal of Educational Psychology, 24, 498-520. HTTP://DOCS.YWORKS.COM/YFILES/DOC/DEVELOPERS-GUIDE/SMART_ORGANIC_LAYOUTER.HTML. HTTP://TEXTBOOKOFBACTERIOLOGY.NET/INDEX.HTML (2008) Todar's Online Textbook of Bacteriology. HTTP://WWW.GENEIOUS.COM/ Geneious v 6.1.5 created by Biomatters. Available from http://www.geneious.com/. HUANG, D. W., SHERMAN, B. & LEMPICKI, R. (2009) Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Research, 37, 1-13. HUG, L. A. (2012) A metagenome-based examination of dechlorinating enrichment cultures: Dehalococcoides and the role of the non-dechlorinating microorganisms. Cell and Systems Biology. Toronto, University of Toronto. HUG, L. A., MAPHOSA, F., LEYS, D., LÖFFLER, F. E., SMIDT, H., EDWARDS, E. A. & ADRIAN, L. (2013) Overview of organohalide-respiring bacteria and a proposal for a classification system for reductive dehalogenases. Philosophical Transactions of the Royal Society B: Biological Sciences, 368. HUG, L. A., SALEHI, M., NUIN, P., TILLIER, E. R. & EDWARDS, E. A.
(2011) Design and verification of a pangenome microarray oligonucleotide probe set for Dehalococcoides spp. Applied and Environmental Microbiology, 77, 5361-9. HUNG, S. S., WASMUTH, J., SANFORD, C. & PARKINSON, J. (2010) DETECT-- a Density Estimation Tool for Enzyme ClassificaTion and its application to Plasmodium falciparum. Bioinformatics, 26, 1690-1698. HURLEY, J. H., DEAN, A., KOSHLAND, D. J. & STROUD, R. M. (1991) Catalytic mechanism of NADP(+)-dependent isocitrate dehydrogenase: implications from the structures of magnesium-isocitrate and NADP+ complexes. Biochemistry (Washington), 30, 8671-8678. HYDUKE, D. R., LEWIS, N. E. & PALSSON, B. O. (2013) Analysis of omics data with genome-scale models of metabolism. Molecular BioSystems, 9, 167-174. IDEKER, T., GALITSKI, T. & HOOD, L. (2001) A new approach to decoding life: systems biology. Annual Review of Genomics & Human Genetics, 2, 343-372. ISHII, S., SHIMOYAMA, T., HOTTA, Y. & WATANABE, K. (2008) Characterization of a filamentous biofilm community established in a cellulose-fed microbial fuel cell. BMC Microbiology, 8, 6.

JACOB, F. & MONOD, J. (1961) Genetic regulatory mechanisms in the synthesis of proteins. Journal of Molecular Biology, 3, 318-356. JAYACHANDRAN, G. (2004) Physiological and enzymatic studies of respiration in Dehalococcoides species strain CBDB1. Berlin, Technische Universität Berlin, Germany. JAYACHANDRAN, G., GÖRISCH, H. & ADRIAN, L. (2003) Dehalorespiration with hexachlorobenzene and pentachlorobenzene by Dehalococcoides sp. strain CBDB1. Archives of Microbiology, 180, 411-416. JAYACHANDRAN, G., GÖRISCH, H. & ADRIAN, L. (2004) Studies on hydrogenase activity and chlorobenzene respiration in Dehalococcoides sp. strain CBDB1. Archives of Microbiology, 182, 498-504. JEFFERY, C. J., BAHNSON, B., CHIEN, W., RINGE, D. & PETSKO, G. (2000) Crystal structure of rabbit phosphoglucose isomerase, a glycolytic enzyme that moonlights as neuroleukin, autocrine motility factor, and differentiation mediator. Biochemistry (Washington), 39, 955-964. JOHNSON, D. R., BRODIE, E. L., HUBBARD, A. E., ANDERSEN, G. L., ZINDER, S. H. & ALVAREZ-COHEN, L. (2008) Temporal transcriptomic microarray analysis of Dehalococcoides ethenogenes strain 195 during the transition into stationary phase. Applied and Environmental Microbiology, 74, 2864-72. JOHNSON, D. R., NEMIR, A., ANDERSEN, G. L., ZINDER, S. H. & ALVAREZ-COHEN, L. (2009) Transcriptomic microarray analysis of corrinoid responsive genes in Dehalococcoides ethenogenes strain 195. FEMS (Federation of European Microbiological Societies) Microbiology Letters, 294, 198-206. JOYCE, A. R. & PALSSON, B. (2008) Predicting gene essentiality using genome-scale in silico models. IN OSTERMAN, A. L. & GERDES, S. Y. (Eds.) Microbial Gene Essentiality: Protocols and Bioinformatics. Humana Press. KACSER, H. & BURNS, J. A. (1973) The control of flux. Symp Soc Exp Biol, 27, 65-104. KANEHISA, M. & GOTO, S. (2000) KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Research, 28, 27-30. KANEHISA, M., GOTO, S., SATO, Y., FURUMICHI, M. & TANABE, M.
(2011) KEGG for integration and interpretation of large-scale molecular data sets. Nucleic Acids Research, Epub: Nov 10. KARADAGLI, F. & RITTMANN, B. E. (2005) Kinetic characterization of Methanobacterium bryantii M.o.H. Environmental Science & Technology, 39, 4900-4905. KARP, P. D., PALEY, S. & ROMERO, P. (2002) The pathway tools software. Bioinformatics, 18, S225-32. KARR, J. R., SANGHVI, J. C., MACKLIN, D. N., GUTSCHOW, M. V., JACOBS, J. M., BOLIVAL, B., ASSAD-GARCIA, N., GLASS, J. I. & COVERT, M. W. (2012) A Whole-Cell Computational Model Predicts Phenotype from Genotype. Cell, 150, 389-401. KAUFFMAN, K. J., PRAKASH, P. & EDWARDS, J. S. (2003) Advances in flux balance analysis. Current Opinion in Biotechnology, 14, 491-496. KEMP, L. E., BOND, C. S. & HUNTER, W. N. (2002) Structure of 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase: an essential enzyme for isoprenoid biosynthesis and target for antimicrobial drug development. Proceedings of the National Academy of Sciences of the United States of America, 99, 6591-6596.

KHERSONSKY, O., ROODVELDT, C. & TAWFIK, D. (2006) Enzyme promiscuity: evolutionary and mechanistic aspects. Current Opinion in Chemical Biology, 10, 498-508. KIM, H.-S. & LEE, J.-Y. (1999) Molecular bioremediation: metabolic engineering for biodegradation of recalcitrant pollutants. IN LEE, S. Y. & PAPOUTSAKIS, E. T. (Eds.) Metabolic Engineering. New York, Marcel Dekker, Inc. KIM, J. I., VARNER, J. D. & RAMKRISHNA, D. (2008) A hybrid model of anaerobic E. coli GJT001: Combination of elementary flux modes and cybernetic variables. Biotechnology Progress, 24, 993-1006. KIM, T. Y., SOHN, S. B., KIM, Y. B., KIM, W. J. & LEE, S. Y. (2012) Recent advances in reconstruction and applications of genome-scale metabolic models. Current Opinion in Biotechnology, 23, 617-623. KOONIN, E. V. & GALPERIN, M. (2003) Genome annotation and analysis. Sequence - Evolution - Function: Computational Approaches in Comparative Genomics. Boston: Kluwer Academic. KRAJMALNIK-BROWN, R., HÖLSCHER, T., THOMSON, I. N., SAUNDERS, F. M., RITALAHTI, K. M. & LÖFFLER, F. E. (2004) Genetic identification of a putative vinyl chloride reductase in Dehalococcoides sp. strain BAV1. Applied and Environmental Microbiology, 70, 6347-51. KRAJMALNIK-BROWN, R., SUNG, Y., RITALAHTI, K. M., SAUNDERS, F. M. & LÖFFLER, F. E. (2006) Environmental distribution of the trichloroethene reductive dehalogenase gene (tceA) suggests lateral gene transfer among Dehalococcoides. FEMS Microbiology Ecology, 59, 206-214. KRASOTKINA, J., WALTERS, T., MARUYA, K. A. & RAGSDALE, S. W. (2001) Characterization of the B12- and iron-sulfur-containing reductive dehalogenase from Desulfitobacterium chlororespirans. Journal of Biological Chemistry, 276, 40991-40997. KRÄUTLER, B., FIEBER, W., OSTERMANN, S., FASCHING, M., ONGANIA, K.-H., GRUBER, K., KRATKY, C., MIKL, C., SIEBERT, A. & DIEKERT, G. (2003) The cofactor of tetrachloroethene reductive dehalogenase of Dehalospirillum multivorans is norpseudo-B12, a new type of a natural corrinoid.
Helvetica Chimica Acta, 86, 3698-3716. KRÖGER, A., BIEL, S., SIMON, J., GROSS, R., UNDEN, G. & LANCASTER, C. (2002) Fumarate respiration of Wolinella succinogenes: enzymology, energetics and coupling mechanism. Biochimica et Biophysica Acta, 1553, 23-38. KRÖGER, A., GEISLER, V., LEMMA, E., THEIS, F. & LENGER, R. (1992) Bacterial fumarate respiration. Archives of Microbiology, 158, 311-314. KUBE, M., BECK, A., ZINDER, S. H., KUHL, H., REINHARDT, R. & ADRIAN, L. (2005) Genome sequence of the chlorinated compound-respiring bacterium Dehalococcoides species strain CBDB1. Nature Biotechnology, 23, 1269-73. KUMAR, V. S., DASIKA, M. & MARANAS, C. (2007) Optimization based automated curation of metabolic reconstructions. BMC Bioinformatics, 8, 212. LAEMMLI, U. K. (1970) Cleavage of structural proteins during the assembly of the head of bacteriophage T4. Nature (London), 227, 680-685. LARKIN, M. A., BLACKSHIELDS, G., BROWN, N., CHENNA, R., MCGETTIGAN, P., MCWILLIAM, H., VALENTIN, F., WALLACE, I., WILM, A., LOPEZ, R.,

THOMPSON, J., GIBSON, T. & HIGGINS, D. (2007) Clustal W and Clustal X version 2.0. Bioinformatics (Oxford), 23, 2947-2948. LEADBETTER, J., SCHMIDT, T., GRABER, J. & BREZNAK, J. (1999) Acetogenesis from H2 plus CO2 by spirochetes from termite guts. Science, 283, 686-689. LEE, N. H. & SAEED, A. I. (2007) Microarrays: an overview. Methods in Molecular Biology, 353, 265-300. LEE, P. K., CHENG, D., HU, P., WEST, K. A., DICK, G. J., BRODIE, E. L., ANDERSEN, G. L., ZINDER, S. H., HE, J. & ALVAREZ-COHEN, L. (2011) Comparative genomics of two newly isolated Dehalococcoides strains and an enrichment using a genus microarray. ISME Journal, 5, 1014-24. LEE, P. K., DILL, B., LOUIE, T., SHAH, M., VERBERKMOES, N., ANDERSEN, G., ZINDER, S. & ALVAREZ-COHEN, L. (2012) Global transcriptomic and proteomic responses of Dehalococcoides ethenogenes strain 195 to fixed nitrogen limitation. Applied and Environmental Microbiology, 78, 1424-1436. LEE, P. K., JOHNSON, D. R., HOLMES, V. F., HE, J. & ALVAREZ-COHEN, L. (2006) Reductive dehalogenase gene expression as a biomarker for physiological activity of Dehalococcoides spp. Applied and Environmental Microbiology, 72, 6161-8. LENDVAY, J. M., LÖFFLER, F. E., DOLLHOPF, M., AIELLO, M. R., DANIELS, G., FATHEPURE, B. Z., GEBHARD, M., HEINE, R., HELTON, R., SHI, J., KRAJMALNIK-BROWN, R., MAJOR, C. L., JR., BARCELONA, M. J., PETROVSKIS, E., HICKEY, R., TIEDJE, J. M. & ADRIAENS, P. (2003) Bioreactive barriers: bioaugmentation and biostimulation for chlorinated solvent remediation. Environmental Science & Technology, 37, 1422-1431. LEWIS, N. E., NAGARAJAN, H. & PALSSON, B. O. (2012) Constraining the metabolic genotype-phenotype relationship using a phylogeny of in silico methods. Nat Rev Micro, 10, 291-305. LEYS, D., ADRIAN, L. & SMIDT, H. (2013) Organohalide respiration: microbes breathing chlorinated molecules. Philosophical Transactions of the Royal Society B: Biological Sciences, 368. LI, F., HAGEMEIER, C., SEEDORF, H., GOTTSCHALK, G. & THAUER, R.
(2007) Re-citrate synthase from Clostridium kluyveri is phylogenetically related to homocitrate synthase and isopropylmalate synthase rather than to Si-citrate synthase. Journal of Bacteriology, 189, 4299-4304. LI, L., STOECKERT, C. J. & ROOS, D. (2003) OrthoMCL: Identification of ortholog groups for eukaryotic genomes. Genome Research, 13, 2178-2189. LIU, L., AGREN, R., BORDEL, S. & NIELSEN, J. (2010) Use of genome-scale metabolic models for understanding microbial physiology. FEBS Letters, 584, 2556-2564. LÖFFLER, F. E., YAN, J., RITALAHTI, K. M., ADRIAN, L., EDWARDS, E. A., KONSTANTINIDIS, K. T., MÜLLER, J. A., FULLERTON, H., ZINDER, S. H. & SPORMANN, A. M. (2012) Dehalococcoides mccartyi gen. nov., sp. nov., obligate organohalide-respiring anaerobic bacteria, relevant to halogen cycling and bioremediation, belong to a novel bacterial class, classis nov., within the phylum Chloroflexi. International Journal of Systematic and Evolutionary Microbiology, ePub Apr 27. LOHMAN, J. H. (2002) A History of Dry Cleaners and Sources of Solvent Releases from Dry Cleaning Equipment. Environmental Forensics, 3, 35-58.

LOUBIERE, P. & LINDLEY, N. D. (1991) The use of acetate as additional co-substrate improves methylotrophic growth of the acetogenic anaerobe Eubacterium limosum when CO2 fixation is rate limiting. Journal of General Microbiology, 137, 2247-2251. LOUIE, T. M. & MOHN, W. W. (1999) Evidence for a chemiosmotic model of dehalorespiration in Desulfomonile tiedjei DCB-1. Journal of Bacteriology, 181, 40-46. LOVLEY, D. R. (2003) Cleaning up with genomics: applying molecular biology to bioremediation. Nat Rev Micro, 1, 35-44. LOVLEY, D. R., UEKI, T., ZHANG, T., MALVANKAR, N., SHRESTHA, P., FLANAGAN, K., AKLUJKAR, M., BUTLER, J., GILOTEAUX, L., ROTARU, A., HOLMES, D., FRANKS, A., ORELLANA, R., RISSO, C. & NEVIN, K. P. (2011) Geobacter: the microbe electric's physiology, ecology, and practical applications. Advances in Microbial Physiology, 59, 1-100. MACLEAN, D., JONES, J. D. G. & STUDHOLME, D. J. (2009) Application of 'next-generation' sequencing technologies to microbial genetics. Nat Rev Micro, 7, 287-296. MADIGAN, M. T., MARTINKO, J. M. & PARKER, J. (2010) Brock biology of microorganisms, Benjamin Cummings; 13th Edition. MAGNUSON, J. K., ROMINE, M. F., BURRIS, D. R. & KINGSLEY, M. T. (2000) Trichloroethene reductive dehalogenase from Dehalococcoides ethenogenes: sequence of tceA and substrate range characterization. Applied and Environmental Microbiology, 66, 5141-7. MAGNUSON, J. K., STERN, R. V., GOSSETT, J. M., ZINDER, S. H. & BURRIS, D. R. (1998) Reductive dechlorination of tetrachloroethene to ethene by a two-component enzyme pathway. Applied and Environmental Microbiology, 64, 1270-5. MAHADEVAN, R., BOND, D., BUTLER, J., ESTEVE-NUÑEZ, A., COPPI, M., PALSSON, B., SCHILLING, C. & LOVLEY, D. (2006) Characterization of metabolism in the Fe(III)-reducing organism Geobacter sulfurreducens by constraint-based modeling. Applied and Environmental Microbiology, 72, 1558-1568. MAHADEVAN, R., PALSSON, B. O. & LOVLEY, D. R.
(2011) In situ to in silico and back: elucidating the physiology and ecology of Geobacter spp. using genome-scale modelling. Nat Rev Micro, 9, 39-50. MAHADEVAN, R., YAN, B., POSTIER, B., NEVIN, K. P., WOODARD, T. L., O'NEIL, R., COPPI, M. V., METHÉ, B. A. & KRUSHKAL, J. (2008) Characterizing regulation of metabolism in Geobacter sulfurreducens through genome-wide expression data and sequence analysis. OMICS: A Journal of Integrative Biology, 12, 33-59. MALTONI, C. & COTTI, G. (1988) Carcinogenicity of Vinyl Chloride in Sprague-Dawley Rats after Prenatal and Postnatal Exposure. Annals of the New York Academy of Sciences, 534, 145-159. MALTONI, C. & LEFEMINE, G. (1975) Carcinogenicity bioassays of vinyl chloride: current results. Annals of the New York Academy of Sciences, 246, 195-218. MAO, F., DAM, P., CHOU, J., OLMAN, V. & XU, Y. (2009) DOOR: a database for prokaryotic operons. Nucleic Acids Research, 37, D459–D463. MARCO-URREA, E., PAUL, S., KHODAVERDI, V., SEIFERT, J., VON BERGEN, M., KRETZSCHMAR, U. & ADRIAN, L. (2011) Identification and characterization of a Re-citrate synthase in Dehalococcoides strain CBDB1. Journal of Bacteriology, 193, 5171-5178.

MARDINOGLU, A., GATTO, F. & NIELSEN, J. (2013) Genome-scale modeling of human metabolism – a systems biology approach. Biotechnology Journal, n/a-n/a. MARKOWITZ, V. M., CHEN, I.-M. A., PALANIAPPAN, K., CHU, K., SZETO, E., GRECHKIN, Y., RATNER, A., ANDERSON, I., LYKIDIS, A., MAVROMATIS, K., IVANOVA, N. N. & KYRPIDES, N. C. (2009) The integrated microbial genomes system: an expanding comparative analysis resource. Nucleic Acids Research, doi:10.1093/nar/gkp887, 1-9. MARKOWITZ, V. M., CHEN, I., PALANIAPPAN, K., CHU, K., SZETO, E., GRECHKIN, Y., RATNER, A., JACOB, B., HUANG, J., WILLIAMS, P., HUNTEMANN, M., ANDERSON, I., MAVROMATIS, K., IVANOVA, N. & KYRPIDES, N. (2012) IMG: the Integrated Microbial Genomes database and comparative analysis system. Nucleic Acids Research, 40, D115-122. MARKOWITZ, V. M., SZETO, E., PALANIAPPAN, K., GRECHKIN, Y., CHU, K., CHEN, I.- M. A., DUBCHAK, I., ANDERSON, I., LYKIDIS, A., MAVROMATIS, K., IVANOVA, N. N. & KYRPIDES, N. C. (2007) The integrated microbial genomes (IMG) system in 2007: data content and analysis tool extensions. Nucleic Acids Research, [Epub ahead of print]. MAYMÓ-GATELL, X., CHIEN, Y., GOSSETT, J. M. & ZINDER, S. H. (1997) Isolation of a bacterium that reductively dechlorinates tetrachloroethene to ethene. Science, 276, 1568- 1571. MAZUMDER, T. K., NISHIO, N., FUKUZAKI, S. & NAGAI, S. (1987) Production of extracellular vitamin B-12 compounds from methanol by Methanosarcina barkeri. Applied Microbiology and Biotechnology, 26, 511-516. MCCARTY, P. L. (2001) Strategies for in situ bioremediation of chlorinated solvent contaminated groundwater. Groundwater Quality: Natural and Enhanced Restoration of Groundwater Pollution. Sheffield, UK. MCCARTY, P. L. (2010) Groundwater Contamination by Chlorinated Solvents: History, Remediation Technologies and Strategies. IN STROO, H. F. & WARD, C. H. (Eds.) In Situ Remediation of Chlorinated Solvent Plumes. Springer New York. MCCARTY, P. L., GOLTZ, M. N., HOPKINS, G. D., DOLAN, M. 
E., ALLAN, J. P., KAWAKAMI, B. T. & CARROTHERS, T. J. (1998) Full-Scale Evaluation of In Situ Cometabolic Degradation of Trichloroethylene in Groundwater through Toluene Injection. Environmental Science & Technology, 32, 88-100. MCCLOSKEY, D., PALSSON, B. O. & FEIST, A. M. (2013) Basic and applied uses of genome-scale metabolic network reconstructions of Escherichia coli. Mol Syst Biol, 9, 661. MCMURDIE, P. J., BEHRENS, S. F., MÜLLER, J. A., GÖKE, J., RITALAHTI, K. M., WAGNER, R., GOLTSMAN, E., LAPIDUS, A., HOLMES, S., LÖFFLER, F. E. & SPORMANN, A. M. (2009) Localized plasticity in the streamlined genomes of vinyl chloride respiring Dehalococcoides. PLoS Genetics, 5, e1000714. doi:10.1371/journal.pgen.1000714. MEDINI, D., SERRUTO, D., PARKHILL, J., RELMAN, D. A., DONATI, C., MOXON, R., FALKOW, S. & RAPPUOLI, R. (2008) Microbiology in the post-genomic era. Nature Reviews: Microbiology 6, 419-430.

MENON, A. L., MORTENSON, L. E. & ROBSON, R. L. (1992) Nucleotide sequences and genetic analysis of hydrogen oxidation (hox) genes in Azotobacter vinelandii. Journal of Bacteriology, 174, 4549-4557. METZKER, M. L. (2010) Sequencing technologies - the next generation. Nature Reviews Genetics, 11, 31-46. MEYER, J. (2007) [FeFe] hydrogenases and their evolution: a genomic perspective. Cellular and Molecular Life Sciences, 64, 1063-1084. MILLER, E., WOHLFARTH, G. & DIEKERT, G. (1997) Studies on tetrachloroethene respiration in Dehalospirillum multivorans. Archives of Microbiology, 166, 379-387. MILLER, E., WOHLFARTH, G. & DIEKERT, G. (1998) Purification and characterization of the tetrachloroethene reductive dehalogenase of strain PCE-S. Archives of Microbiology, 169, 497-502. MITCHELL, P. (1961) Coupling of phosphorylation to electron and hydrogen transfer by a chemi-osmotic type of mechanism. Nature, 191, 144-148. MITCHELL, P. (1972) Chemiosmotic coupling in energy transduction: a logical development of biochemical knowledge. Journal of Bioenergetics, 3, 5-24. MOHAMED, S. & SYED, B. A. (2013) Commercial prospects for genomic sequencing technologies. Nat Rev Drug Discov, 12, 341-342. MONOD, J. (1949) The Growth of Bacterial Cultures. Annual Review of Microbiology, 3, 371-394. MORRIS, R. M., FUNG, J. M., RAHM, B. G., ZHANG, S., FREEDMAN, D. L., ZINDER, S. & RICHARDSON, R. E. (2007) Comparative proteomics of Dehalococcoides spp. reveals strain-specific peptides associated with activity. Applied and Environmental Microbiology, 73, 320-326. MORRIS, R. M., SOWELL, S., BAROFSKY, D., ZINDER, S. & RICHARDSON, R. (2006) Transcription and mass-spectroscopic proteomic studies of electron transport oxidoreductases in Dehalococcoides ethenogenes. Environmental Microbiology, 8, 1499-1509. MÜLLER, J. A., ROSNER, B. M., VON ABENDROTH, G., MESHULAM-SIMON, G., MCCARTY, P. L. & SPORMANN, A. M. (2004) Molecular identification of the catabolic vinyl chloride reductase from Dehalococcoides sp.
strain VS and its environmental distribution. Applied and Environmental Microbiology, 70, 4880-8. MÜLLER, V. (2003) Energy conservation in acetogenic bacteria. Applied and Environmental Microbiology, 69, 6345-6353. MUZZI, A., MASIGNANI, V. & RAPPUOLI, R. (2007) The pan-genome: towards a knowledge-based discovery of novel targets for vaccines and antibacterials. Drug Discovery Today, 12, 429-439. NEIDHARDT, F. C., INGRAHAM, J. & SCHAECHTER, M. (1990) Physiology of the bacterial cell: a molecular approach, Sunderland, Massachusetts, Sinauer Associates, Inc. NEIJSSEL, O. M., MATTOS, M. & TEMPEST, D. (1996) Growth yield and energy distribution. IN NEIDHARDT, F. C. (Ed.) Escherichia coli and Salmonella: cellular and molecular biology. Washington D C, ASM Press. NELSON, D. L. & COX, M. M. (2006) Lehninger principles of biochemistry, W. H. Freeman and Company. NEUMANN, A., SIEBERT, A., TRESCHER, T., REINHARDT, S., WOHLFARTH, G. & DIEKERT, G. (2002) Tetrachloroethene reductive dehalogenase of Dehalospirillum

multivorans: substrate specificity of the native enzyme and its corrinoid cofactor. Archives of Microbiology, 177, 420-426. NEUMANN, A., WOHLFARTH, G. & DIEKERT, G. (1996) Purification and characterization of tetrachloroethene reductive dehalogenase from Dehalospirillum multivorans. Journal of Biological Chemistry, 271, 16515-16519. NI, S., FREDRICKSON, J. K. & XUN, L. (1995) Purification and characterization of a novel 3-chlorobenzoate-reductive dehalogenase from the cytoplasmic membrane of Desulfomonile tiedjei DCB-1. Journal of Bacteriology, 177, 5135-5139. NICHOLSON, J. (2010) Bioaugmentation, the case for using it at contaminated sites. HazMat Management. NIJENHUIS, I. & ZINDER, S. H. (2005) Characterization of hydrogenase and reductive dehalogenase activities of Dehalococcoides ethenogenes strain 195. Applied and Environmental Microbiology, 71, 1664-7. NOOKAEW, I., JEWETT, M., MEECHAI, A., THAMMARONGTHAM, C., LAOTENG, K., CHEEVADHANARAK, S., NIELSEN, J. & BHUMIRATANA, S. (2008) The genome-scale metabolic model iIN800 of Saccharomyces cerevisiae and its validation: a scaffold to query lipid metabolism. BMC Systems Biology, 2, 71. NOTEBAART, R., VAN ENCKEVORT, F., FRANCKE, C., SIEZEN, R. & TEUSINK, B. (2006) Accelerating the reconstruction of genome-scale metabolic networks. BMC Bioinformatics, 7, 296. NOVOTNY, M. J., REIZER, J., ESCH, F. & SAIER, M. J. (1984) Purification and properties of D-mannitol-1-phosphate dehydrogenase and D-glucitol-6-phosphate dehydrogenase from Escherichia coli. Journal of Bacteriology, 159, 986-990. NOWAK, J., KIRSCH, N., HEGEMANN, W. & STAN, H. (1996) Total reductive dechlorination of chlorobenzenes to benzene by a methanogenic mixed culture isolated from Saale river sediment. Applied Microbiology and Biotechnology, 45, 700-709. ÖBERG, G. (2002) The natural chlorine cycle -- fitting the scattered pieces. Applied Microbiology and Biotechnology, 58, 565-581. OBERHARDT, M. A., PALSSON, B. O. & PAPIN, J. A.
(2009) Applications of genome-scale metabolic reconstructions. Mol Syst Biol, 5. OH, Y. K., PALSSON, B., PARK, S., SCHILLING, C. & MAHADEVAN, R. (2007) Genome-scale reconstruction of metabolic network in Bacillus subtilis based on high-throughput phenotyping and gene essentiality data. Journal of Biological Chemistry, 282, 28791-28799. ORTH, J. D., THIELE, I. & PALSSON, B. O. (2010) What is flux balance analysis? Nat Biotech, 28, 245-248. OVERBEEK, R., BEGLEY, T., BUTLER, R. M., CHOUDHURI, J. V., CHUANG, H. Y., COHOON, M., ET AL. (2005) The subsystems approach to genome annotation and its use in the project to annotate 1000 genomes. Nucleic Acids Research, 33, 5691-5702. OVERBEEK, R., FONSTEIN, M., D’SOUZA, M., PUSCH, G. D. & MALTSEV, N. (1999) The use of gene clusters to infer functional coupling. Proceedings of the National Academy of Sciences of the United States of America, 96, 2896-2901. PAGANI, I., LIOLIOS, K., JANSSON, J., CHEN, I. M. A., SMIRNOVA, T., NOSRAT, B., MARKOWITZ, V. M. & KYRPIDES, N. C. (2012) The Genomes OnLine Database (GOLD) v.4: status of genomic and metagenomic projects and their associated metadata. Nucleic Acids Research, 40, D571-D579.

PALSSON, B. O. (2006) Systems biology: properties of reconstructed networks, Cambridge University Press. PARSONS, F. & LAGE, G. B. (1985) Chlorinated organics in simulated groundwater environments. Journal (American Water Works Association), 77, 52-59. PATIL, K. R., ÅKESSON, M. & NIELSEN, J. (2004) Use of genome-scale microbial models for metabolic engineering. Current Opinion in Biotechnology, 15, 64-69. PATNAIK, P. R. (2001) Microbial metabolism as an evolutionary response: the cybernetic approach to modeling. Critical Reviews in Biotechnology, 21, 155-175. PEITSCH, M. C. & DE GRAAF, D. (2013) A decade of Systems Biology: where are we and where are we going to? Drug Discovery Today, S1359-6446(13)00165-7. doi: 10.1016/j.drudis.2013.06.002. PEREGRIN-ALVAREZ, J., SANFORD, C. & PARKINSON, J. (2009) The conservation and evolutionary modularity of metabolism. Genome Biology, 10, R63. PETRISOR, I. G. & WELLS, J. T. (2008) Tracking chlorinated solvents in the environment. Environmental Forensics. The Royal Society of Chemistry. PFAFFL, M. W. (2001) A new mathematical model for relative quantification in real-time RT– PCR. Nucleic Acids Research, 29, e45. PFAU, T., CHRISTIAN, N. & EBENHÖH, O. (2011) Systems approaches to modelling pathways and networks. Briefings in Functional Genomics, 10, 266-279. PIERCE, E., XIE, G., BARABOTE, R., SAUNDERS, E., HAN, C., DETTER, J., RICHARDSON, P., BRETTIN, T., DAS, A., LJUNGDAHL, L. & RAGSDALE, S. (2008) The complete genome sequence of Moorella thermoacetica (f. Clostridium thermoaceticum). Environmental Microbiology, 10, 2550-2573. PINNEY, J. W., SHIRLEY, M. W., MCCONKEY, G. A. & WESTHEAD, D. R. (2005) metaSHARK: software for automated metabolic network prediction from DNA sequence and its applications to the genomes Plasmodium falciparum and Eimeria tenella. Nucleic Acids Research, 33, 1399-1409. PIRT, S. J. (1965) The maintenance energy of bacteria in growing cultures. 
Proceedings of the Royal Society Biological Sciences Series B, 163, 224-31. PIRT, S. J. (1982) Maintenance energy: a general model for energy-limited and energy-sufficient growth. Archives of Microbiology, 133, 300-302. PORATH, J. (1992) Immobilized metal ion affinity chromatography. Protein Expression and Purification, 3, 263-281. PORATH, J., CARLSSON, J., OLSSON, I. & BELFRAGE, G. (1975) Metal chelate affinity chromatography, a new approach to protein fractionation. Nature, 258, 598-599. PRAMANIK, J. & KEASLING, J. D. (1997) Stoichiometric model of Escherichia coli metabolism: incorporation of growth-rate dependent biomass composition and mechanistic energy requirements. Biotechnology and Bioengineering, 56, 398-421. PRICE, N. D., REED, J. L. & PALSSON, B. O. (2004) Genome-scale models of microbial cells: evaluating the consequences of constraints. Nature Reviews: Microbiology, 2, 886-897. PROUDFOOT, A. E., TURCATTI, G., WELLS, T. N., PAYTON, M. A. & SMITH, D. J. (1994) Purification, cDNA cloning and heterologous expression of human phosphomannose isomerase. European Journal of Biochemistry, 219, 415-423. PUNTA, M., COGGILL, P. C., EBERHARDT, R. Y., MISTRY, J., TATE, J., BOURSNELL, C., PANG, N., FORSLUND, K., CERIC, G., CLEMENTS, J., HEGER, A., HOLM, L.,

SONNHAMMER, E. L. L., EDDY, S. R., BATEMAN, A. & FINN, R. D. (2012) The Pfam protein families database. Nucleic Acids Research, 40, D290-301. QUEVILLON, E., SILVENTOINEN, V., PILLAI, S., HARTE, N., MULDER, N., APWEILER, R. & LOPEZ, R. (2005) InterProScan: protein domains identifier. Nucleic Acids Research, 33, W116-W120. RAGSDALE, S. W. (2008) Enzymology of the Wood-Ljungdahl pathway of acetogenesis. Annals of the New York Academy of Sciences, 1125, 129-136. RAHM, B. G. & RICHARDSON, R. E. (2008a) Correlation of respiratory gene expression levels and pseudo-steady-state PCE respiration rates in Dehalococcoides ethenogenes. Environmental Science & Technology, 42, 416-421. RAHM, B. G. & RICHARDSON, R. E. (2008b) Dehalococcoides' gene transcripts as quantitative bioindicators of tetrachloroethene, trichloroethene, and cis-1,2-dichloroethene dehalorespiration rates. Environmental Science & Technology, 42, 5099-5105. RAJESH, T., SONG, E., KIM, J., LEE, B., KIM, E., PARK, S., KIM, Y., YOO, D., PARK, H., CHOI, Y., KIM, B. & YANG, Y. (2012) Inactivation of phosphomannose isomerase gene abolishes sporulation and antibiotic production in Streptomyces coelicolor. Applied Microbiology and Biotechnology, 93, 1685-1693. RAMAN, K. & CHANDRA, N. (2009) Flux balance analysis of biological systems: applications and challenges. Briefings in Bioinformatics, 10, 435-449. RAMKRISHNA, D. (1983) A cybernetic perspective of microbial growth. Foundations of Biochemical Engineering. American Chemical Society. RAMKRISHNA, D. & SONG, H.-S. (2012) Dynamic models of metabolism: Review of the cybernetic approach. AIChE Journal, 58, 986-997. RAMOS, A., RAVEN, N., SHARP, R. J., BARTOLUCCI, S., ROSSI, M., CANNIO, R., LEBBINK, J., VAN DER OOST, J., DE VOS, W. M. & SANTOS, H. (1997) Stabilization of Enzymes against Thermal Stress and Freeze-Drying by Mannosylglycerate. Applied and Environmental Microbiology, 63, 4020-4025. RAMSDEN, N. L., BUETOW, L., DAWSON, A., KEMP, L.
A., ULAGANATHAN, V., BRENK, R., KLEBE, G. & HUNTER, W. N. (2009) A structure-based approach to ligand discovery for 2C-methyl-D-erythritol-2,4-cyclodiphosphate synthase: a target for antimicrobial therapy. Journal of Medicinal Chemistry, 52, 2531-42. REED, J. L., FAMILI, I., THIELE, I. & PALSSON, B. O. (2006a) Towards multidimensional genome annotation. Nature Reviews: Genetics, 7, 130-141. REED, J. L., PATEL, T. R., CHEN, K. H., JOYCE, A. R., APPLEBEE, M. K., HERRING, C. D., BUI, O. T., KNIGHT, E. M., FONG, S. S. & PALSSON, B. O. (2006b) Systems approach to refining genome annotation. Proceedings of the National Academy of Sciences, 103, 17480-17484. REED, J. L., VO, T., SCHILLING, C. & PALSSON, B. (2003) An expanded genome-scale model of Escherichia coli K-12 (iJR904 GSM/GPR). Genome Biology, 4, R54. REY, F., FAITH, J., BAIN, J., MUEHLBAUER, M., STEVENS, R., NEWGARD, C. & GORDON, J. (2010) Dissecting the in vivo metabolic potential of two human gut acetogens. Journal of Biological Chemistry, 285, 22082-22090. RICHARDSON, R. E., BHUPATHIRAJU, V. K., SONG, D. L., GOULET, T. A. & ALVAREZ-COHEN, L. (2002) Phylogenetic characterization of microbial communities that


reductively dechlorinate TCE based upon a combination of molecular techniques. Environmental Science & Technology, 36, 2652-62. RITALAHTI, K. M., AMOS, B. K., SUNG, Y., WU, Q., KOENIGSBERG, S. S. & LÖFFLER, F. E. (2006) Quantitative PCR targeting 16S rRNA and reductive dehalogenase genes simultaneously monitors multiple Dehalococcoides strains Applied and Environment Microbiology, 72, 2765-2774. RITTMANN, B. E. & MCCARTY, P. L. (2001) Environmental biotechnology: principles and applications, New York, McGraw-Hill. ROBBINS, R. J. (1994) Biological databases: A new scientific literature. Publishing Research Quarterly, 10, 3-27. ROBERTS, S., GOWEN, C., BROOKS, J. P. & FONG, S. (2010) Genome-scale metabolic analysis of Clostridium thermocellum for bioethanol production. BMC Systems Biology, 4, 31. ROST, B. & SANDER, C. (1993) Prediction of protein secondary structure at better than 70% accuracy. Journal of Molecular Biology, 232, 584-599. ROST, B., SANDER C (1994) Combining evolutionary information and neural networks to predict protein secondary structure. Proteins, 19, 55-72. RUSSELL, J. B. & COOK, G. M. (1995) Energetics of bacterial growth: balance of anabolic and catabolic reactions. Microbiological Reviews, 59, 48-62. SAEED, A. I., SHAROV, V., WHITE, J., LI, J., LIANG, W., BHAGABATI, N., BRAISTED, J., KLAPA, M., CURRIER, T., THIAGARAJAN, M., STURN, A., SNUFFIN, M., REZANTSEV, A., POPOV, D., RYLTSOV, A., KOSTUKOVICH, E., BORISOVSKY, I., LIU, Z., VINSAVICH, A., TRUSH, V. & QUACKENBUSH, J. (2003) TM4: a free, open-source system for microarray data management and analysis. Biotechniques, 34, 374-378. SALMASSI, T. & LEADBETTER, J. (2003) Analysis of genes of tetrahydrofolate-dependent metabolism from cultivated spirochaetes and the gut community of the termite Zootermopsis angusticollis. Microbiology, 149, 2529-2537. SANCHO, J. (2006) Flavodoxins: sequence, folding, binding, function and beyond. Cellular and Molecular Life Sciences, 63, 855-864. SANTOS, H. 
& DA COSTA, M. (2002) Compatible solutes of organisms that live in hot saline environments. Environmental Microbiology, 4, 501-509. SARGENT, M. G. (1975) Control of cell length in Bacillus subtilis. Journal of Bacteriology, 123, 7-19. SAUER, U. (2006) Metabolic networks in motion: 13C-based flux analysis. Mol Syst Biol, 2. SCHELLENBERGER, J., PARK, J., CONRAD, T. & PALSSON, B. (2010) BiGG: a Biochemical Genetic and Genomic knowledgebase of large scale metabolic reconstructions. BMC Bioinformatics, 11, 213. SCHELLENBERGER, J., QUE, R., FLEMING, R., THIELE, I., ORTH, J., FEIST, A., ZIELINSKI, D., BORDBAR, A., LEWIS, N., RAHMANIAN, S., KANG, J., HYDUKE, D. & PALSSON, B. (2011) Quantitative prediction of cellular metabolism with constraint-based models: the COBRA Toolbox v2.0. Nature Protocols, 6, 1290-1307. SCHILLING, C. H., COVERT, M., FAMILI, I., CHURCH, G., EDWARDS, J. & PALSSON, B. (2002) Genome-scale metabolic model of Helicobacter pylori 26695. Journal of Bacteriology, 184, 4582-4593.


SCHIPP, C. J., MARCO-URREA, E., KUBLIK, A., SEIFERT, J. & ADRIAN, L. (2013) Organic cofactors in the metabolism of Dehalococcoides mccartyi strains. Philosophical Transactions of the Royal Society of London B Biological Sciences, 368, 20120321. SCHMIDT, M., ARNOLD, W., NIEMANN, A., KLEICKMANN, A. & PÜHLER, A. (1992) The Rhizobium meliloti pmi gene encodes a new type of phosphomannose isomerase. Gene (Amsterdam), 122, 35-43. SCHOLZMURAMATSU, H., NEUMANN, A., MESSMER, M., MOORE, E. & DIEKERT, G. (1995) Isolation and characterization of Dehalospirillum multivorans gen. nov., sp. nov., a tetrachloroethene-utilizing, strictly anaerobic bacterium. Archives of Microbiology, 163, 48-56. SCHOMBURG ET AL (2004) BRENDA: the enzyme database. Nucleic Acids Research, 32, D431-D433. SCHULZE, A. & DOWNWARD, J. (2001) Navigating gene expression using microarrays: a technology review. Nature Cell Biology, 3, E190-5. SCHUMACHER, W. & HOLLIGER, C. (1996) The proton/electron ratio of the menaquinone-dependent electron transport from dihydrogen to tetrachloroethene in "Dehalobacter restrictus". Journal of Bacteriology, 178, 2328-2333. SCHUMACHER, W., HOLLIGER, C., ZEHNDER, A. J. B. & HAGEN, W. R. (1997) Redox chemistry of cobalamin and iron-sulfur cofactors in the tetrachloroethene reductase of Dehalobacter restrictus. FEBS Letters, 409, 421-425. SCHUT, G. J. & ADAMS, M. (2009) The iron-hydrogenase of Thermotoga maritima utilizes ferredoxin and NADH synergistically: a new perspective on anaerobic hydrogen production. Journal of Bacteriology, 191, 4451-4457. SEEHOLZER, S. H. (1993) Phosphoglucose isomerase: a ketol isomerase with aldol C2-epimerase activity. Proceedings of the National Academy of Sciences of the United States of America, 90, 1237-1241. SESHADRI, R., ADRIAN, L., FOUTS, D. E., EISEN, J. A., PHILLIPPY, A. M., METHE, B. A., WARD, N. L., NELSON, W. C., DEBOY, R. T., KHOURI, H. M., KOLONAY, J. F., DODSON, R. J., DAUGHERTY, S. C., BRINKAC, L. M., SULLIVAN, S. 
A., MADUPU, R., NELSON, K. E., KANG, K. H., IMPRAIM, M., TRAN, K., ROBINSON, J. M., FORBERGER, H. A., FRASER, C. M., ZINDER, S. H. & HEIDELBERG, J. F. (2005) Genome sequence of the PCE-dechlorinating bacterium Dehalococcoides ethenogenes. Science, 307, 105-8. SHULER, M. L., LEUNG, S. & DICK, C. C. (1979) A mathematical model for the growth of single bacterial cell. Annals of the New York Academy of Sciences, 326, 35-52. SIGRIST, C. J., DE CASTRO, E., CERUTTI, L., CUCHE, B., HULO, N., BRIDGE, A., BOUGUELERET, L. & XENARIOS, I. (2012) New and continuing developments at PROSITE. Nucleic Acids Research, 41, D344-347. SIMMONDS, A. (2007) Dechlorination rates in KB-1, a commercial trichloroethylene degrading bacterial culture. Chemical Engineering and Applied Chemistry. Toronto, University of Toronto. SMIDT, H. & DE VOS, W. M. (2004) Anaerobic microbial dehalogenation. Annual Review of Microbiology, 58, 43-73. SMOOT, M. E., ONO, K., RUSCHEINSKI, J., WANG, P. L. & IDEKER, T. (2011) Cytoscape 2.8: new features for data integration and network visualization. Bioinformatics (Oxford), 27, 431-2.


SONG, H.-S. & RAMKRISHNA, D. (2010) Prediction of metabolic function from limited data: Lumped hybrid cybernetic modeling (L-HCM). Biotechnology and Bioengineering, 106, 271-284. STEEN, I. H., LIEN, T. & BIRKELAND, N. (1997) Biochemical and phylogenetic characterization of isocitrate dehydrogenase from a hyperthermophilic archaeon, Archaeoglobus fulgidus. Archives of Microbiology, 168, 412-420. STEEN, I. H., MADSEN, M., BIRKELAND, N.-K. & LIEN, T. (1998) Purification and characterization of a monomeric isocitrate dehydrogenase from the sulfate-reducing bacterium Desulfobacter vibrioformis and demonstration of the presence of a monomeric enzyme in other bacteria. FEMS Microbiology Letters, 160, 75-79. STEIN, L. D. (2003) Integrating biological databases. Nature Reviews Genetics, 4, 337-345. STERNER, R. (2001) Ferredoxin from Thermotoga maritima. Methods in Enzymology, 334, 23- 30. STOKKE, R., KARLSTRÖM, M., YANG, N., LEIROS, I., LADENSTEIN, R., BIRKELAND, N. & STEEN, I. (2007) Thermal stability of isocitrate dehydrogenase from Archaeoglobus fulgidus studied by crystal structure analysis and engineering of chimers. Extremophiles, 11, 481-493. STUPPERICH, E., EISINGER, H.-J. & SCHURR, S. (1990) Corrinoids in anaerobic bacteria. FEMS (Federation of European Microbiological Societies) Microbiology Letters, 87, 355-360. STUPPERICH, E., EISINGER, H. J. & KRÄUTLER, B. (1988) Diversity of corrinoids in acetogenic bacteria. P-cresolylcobamide from Sporomusa ovata, 5-methoxy-6- methylbenzimidazolylcobamide from Clostridium formicoaceticum and vitamin B12 from Acetobacterium woodii. European Journal of Biochemistry, 172, 459-464. STUPPERICH, E. & KONLE, R. (1993) Corrinoid-Dependent Methyl Transfer Reactions Are Involved in Methanol and 3,4-Dimethoxybenzoate Metabolism by Sporomusa ovata. Applied and Environment Microbiology, 59, 3110-3116. STUPPERICH, E. & KRÄUTLER, B. 
(1988) Pseudo vitamin B12 or 5-hydroxybenzimidazolyl- cobamide are the corrinoids found in methanogenic bacteria. Archives of Microbiology, 149, 268-271. SUN, J. & ZENG, A. P. (2004) IdentiCS - identification of coding sequence and in silico reconstruction of the metabolic network directly from unannotated lo-coverage bacterial genome sequence. BMC Bioinformatics, 5, 112. SUNG, Y., RITALAHTI, K. M., APKARIAN, R. P. & LÖFFLER, F. E. (2006) Quantitative PCR confirms purity of strain GT, a novel trichloroethene-to-ethene-respiring Dehalococcoides isolate. Applied and Environment Microbiology, 72, 1980-1987. SWAN, M. K., HANSEN, T., SCHÖNHEIT, P. & DAVIES, C. (2004) A novel phosphoglucose isomerase (PGI)/phosphomannose isomerase from the crenarchaeon Pyrobaculum aerophilum is a member of the PGI superfamily: structural evidence at 1.16-A resolution. Journal of Biological Chemistry, 279, 39838-39845. TANG, J. K.-H., YOU, L., BLANKENSHIP, R. E. & TANG, Y. J. (2012) Recent advances in mapping environmental microbial metabolisms through 13C isotopic fingerprints. Journal of the Royal Society Interface, 9, 2767-2780. TANG, S., CHAN, W., FLETCHER, K., SEIFERT, J., LIANG, X., LÖFFLER, F., EDWARDS, E. & ADRIAN, L. (2013) Functional characterization of reductive dehalogenases by


using blue native polyacrylamide gel electrophoresis. Applied and Environment Microbiology, 79, 974-981. TANG, Y. J., MARTIN, H. G., MYERS, S., RODRIGUEZ, S., BAIDOO, E. E. K. & KEASLING, J. D. (2009a) Advances in analysis of microbial metabolic fluxes via 13C isotopic labeling. Mass Spectrometry Reviews, 28, 362-375. TANG, Y. J., YI, S., ZHUANG, W. Q., ZINDER, S. H., KEASLING, J. D. & ALVAREZ- COHEN, L. (2009b) Investigation of carbon metabolism in Dehalococcoides ethenogenes strain 195 by use of isotopomer and transcriptomic analyses. Journal of Bacteriology, 191, 5224-31. TAS, N., EEKERT, M. H. A. V., VOS, W. M. D. & SMIDT, H. (2010) The little bacteria that can - diversity, genomics and ecophysiology of 'Dehalococcoides' spp. in contaminated environments. Microbial Biotechnology, 3, 389-402. TATUSOV, R. L., GALPERIN, M., NATALE, D. & KOONIN, E. (2000) The COG database: a tool for genome-scale analysis of protein functions and evolution. Nucleic Acids Research, 28, 33-36. TAVAZOIE, S., HUGHES, J. D., CAMPBELL, M. J., CHO, R. J. & CHURCH, G. M. (1999) Systematic determination of genetic network architecture. Nature Genetics, 22, 281-5. TEACH (2011) http://www.epa.gov/teach/teachsummaries.html. TETTELIN, H. (2009) The bacterial pan-genome and reverse vaccinology. Genome Dynamics 6, 35-47. TETTELIN, H., MASIGNANI, V., CIESLEWICZ, M., DONATI, C., MEDINI, D., WARD, N., ANGIUOLI, S., CRABTREE, J., JONES, A., DURKIN, A., DEBOY, R., DAVIDSEN, T., MORA, M., SCARSELLI, M., MARGARIT, Y., ROS, I., PETERSON, J., HAUSER, C., SUNDARAM, J., NELSON, W., MADUPU, R., BRINKAC, L., DODSON, R., ROSOVITZ, M., SULLIVAN, S., DAUGHERTY, S., HAFT, D., SELENGUT, J., GWINN, M., ZHOU, L., ZAFAR, N., KHOURI, H., RADUNE, D., DIMITROV, G., WATKINS, K., O'CONNOR, K., SMITH, S., UTTERBACK, T., WHITE, O., RUBENS, C., GRANDI, G., MADOFF, L., KASPER, D., TELFORD, J., WESSELS, M., RAPPUOLI, R. & FRASER, C. 
(2005) Genome analysis of multiple pathogenic isolates of Streptococcus agalactiae: Implications for the microbial "pan-genome". Proceedings of the National Academy of Sciences, USA, 102, 13950-13955. TETTELIN, H., RILEY, D., CATTUTO, C. & MEDINI, D. (2008) Comparative genomics: the bacterial pan-genome. Current Opinion in Microbiology, 12, 472-477. TEUSINK, B., WIERSMA, A., MOLENAAR, D., FRANCKE, C., DE VOS, W., SIEZEN, R. & SMID, E. (2006) Analysis of growth of Lactobacillus plantarum WCFS1 on a complex medium using a genome-scale metabolic model. Journal of Biological Chemistry, 281, 40041-40048. THAUER, R. K., JUNGERMANN, K. & DECKER, K. (1977) Energy conservation in chemotrophic anaerobic bacteria. Bacteriological Reviews, 41, 100-180. THAUER, R. K., KASTER, A.-K., SEEDORF, H., BUCKEL, W. & HEDDERICH, R. (2008) Methanogenic archaea: ecologically relevant differences in energy conservation. Nature Reviews: Microbiology, 6, 579-591. THIELE, I. & PALSSON, B. O. (2010a) A protocol for generating a high-quality genome-scale metabolic reconstruction. Nat. Protocols, 5, 93-121. THIELE, I. & PALSSON, B. O. (2010b) Reconstruction annotation jamborees: a community approach to systems biology. Mol Syst Biol, 6.


THIELE, I., SWAINSTON, N., FLEMING, R. M. T., HOPPE, A., SAHOO, S., AURICH, M. K., HARALDSDOTTIR, H., MO, M. L., ROLFSSON, O., STOBBE, M. D., THORLEIFSSON, S. G., AGREN, R., BOLLING, C., BORDEL, S., CHAVALI, A. K., DOBSON, P., DUNN, W. B., ENDLER, L., HALA, D., HUCKA, M., HULL, D., JAMESON, D., JAMSHIDI, N., JONSSON, J. J., JUTY, N., KEATING, S., NOOKAEW, I., LE NOVERE, N., MALYS, N., MAZEIN, A., PAPIN, J. A., PRICE, N. D., SELKOV SR, E., SIGURDSSON, M. I., SIMEONIDIS, E., SONNENSCHEIN, N., SMALLBONE, K., SOROKIN, A., VAN BEEK, J. H. G. M., WEICHART, D., GORYANIN, I., NIELSEN, J., WESTERHOFF, H. V., KELL, D. B., MENDES, P. & PALSSON, B. O. (2013) A community-driven global reconstruction of human metabolism. Nat Biotech, 31, 419-425. THOMAS, R. S., GUSTAFSON, D. L., POTT, W. A., LONG, M. E., BENJAMIN, S. A. & YANG, R. S. (1998) Evidence for hepatocarcinogenic activity of pentachlorobenzene with intralobular variation in foci incidence. Carcinogenesis, 19, 1855-1862. THOMPSON, J., GENTRY-WEEKS, C. R., NGUYEN, N. Y., FOLK, J. E. & ROBRISH, S. A. (1995) Purification from Fusobacterium mortiferum ATCC 25557 of a 6-phosphoryl-O- alpha-D-glucopyranosyl: 6-phosphoglucohydrolase that hydrolyzes maltose 6-phosphate and related phospho-alpha-D-glucosides. Journal of Bacteriology, 177, 2505-12. TODAR, K. (2012) Todar's Online Textbook of Bacteriology (http://textbookofbacteriology.net/index.html). UNDEN, G., EDITORS-IN-CHIEF: WILLIAM, J. L. & LANE, M. D. (2013) Energy Transduction in Anaerobic Bacteria. Encyclopedia of Biological Chemistry. Waltham, Academic Press. USADEL, B., OBAYASHI, T., MUTWIL, M., GIORGI, F. M., BASSEL, G. W., TANIMOTO, M., CHOW, A., STEINHAUSER, D., PERSSON, S. & PROVART, N. J. (2009) Co- expression tools for plant biology: opportunities for hypothesis generation and caveats. Plant, Cell & Environment, 32, 1633-51. USEPA (2013) http://clu- in.org/contaminantfocus/default.focus/sec/Dense_Nonaqueous_Phase_Liquids_%28DNA PLs%29/cat/Overview/. 
USGS (2013) http://toxics.usgs.gov/definitions/dnapl_def.html. VALENTINE, R. C. (1964) Bacterial ferredoxin Microbiology and Molecular Biology Reviews, 28, 497-517. VALENTINE, R. C. & WOLFE, R. (1963) Role of ferredoxin in the metabolism of molecular hydrogen. Journal of Bacteriology 85, 1114-1120. VALLINO, J. & STEPHANOPOULOS, G. (1993) Metabolic flux distributions in Corynebacterium glutamicum during growth and lysine overproduction. Biotechnology and Bioengineering, 41, 633-646. VAN DONGEN, S. (2000) Graph Clustering by Flow Simulation. PhD Thesis. The Dutch National Research Institute for Mathematics and Computer Science., University of Utrecht, The Netherlands. VARMA, A. & PALSSON, B. O. (1994) Metabolic flux balancing: basic concepts, scientific and practical use. Nature Biotechnology 12, 994-998. VIANNA, N. J., BRADY, J. & HARPER, P. (1981) Angiosarcoma of the liver: a signal lesion of vinyl chloride exposure. Environ Health Perspect, 41, 207-210.


VIGNAIS, P. M., BILLOUD, B. & MEYER, J. (2001) Classification and phylogeny of hydrogenases. FEMS Microbiology Reviews 25, 455-501. VOGEL, T. M. & MCCARTY, P. L. (1985) Biotransformation of tetrachloroethylene to trichloroethylene, dichloroethylene, vinyl chloride, and carbon dioxide under methanogenic conditions. Applied and Environment Microbiology, 49, 1080-1085. WALLER, A. S. (2010) Molecular investigation of chloroethene reductive dehalogenation by the mixed microbial community KB1 (http://hdl.handle.net/1807/19106). Chemical Engineering and Applied Chemistry. Toronto, University of Toronto. WALLER, A. S., HUG, L. A., MO, K., RADFORD, D. R., MAXWELL, K. L. & EDWARDS, E. A. (2012) Transcriptional analysis of a Dehalococcoides-containing microbial consortium reveals prophage activation. Applied and Environmental Microbiology, 78, 1178-1186. WALLER, A. S., KRAJMALNIK-BROWN, R., LÖFFLER, F. & EDWARDS, E. (2005) Multiple reductive-dehalogenase-homologous genes are simultaneously transcribed during dechlorination by Dehalococcoides-containing cultures. Applied and Environment Microbiology, 71, 8257-8264. WANG, Y., XIAO, J., SUZEK, T. O., ZHANG, J., WANG, J., ZHOU, Z., HAN, L., KARAPETYAN, K., DRACHEVA, S., SHOEMAKER, B. A., BOLTON, E., GINDULYTE, A. & BRYANT, S. H. (2012) PubChem's BioAssay Database. Nucleic Acids Research, 40, D400-D412. WARREN, M. J., RAUX, E., SCHUBERT, H. L. & ESCALANTE-SEMERENA, J. C. (2002) The biosynthesis of adenosylcobalamin (vitamin B12). Natural Product Reports, 19, 390- 412. WEST, K. A., JOHNSON, D. R., HU, P., DESANTIS, T. Z., BRODIE, E. L., LEE, P. K., FEIL, H., ANDERSEN, G. L., ZINDER, S. H. & ALVAREZ-COHEN, L. (2008) Comparative genomics of "Dehalococcoides ethenogenes" 195 and an enrichment culture containing unsequenced "Dehalococcoides" strains. Applied and Environmental Microbiology, 74, 3533-40. WHELAN, S. & GOLDMAN, N. 
(2001) A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach. Molecular Biology and Evolution, 18, 691-699. WHITE, D. C., GEYER, R., PEACOCK, A., HEDRICK, D., KOENIGSBERG, S., SUNG, Y., HE, J. & LÖFFLER, F. (2005) Phospholipid furan fatty acids and Ubiquinone-8: lipid biomakers that may protect Dehalococcoides strains from free radicals. Applied and Environmental Microbiology 71, 8426-8433. WIKI_BERTALANFFY (2013) http://en.wikipedia.org/wiki/Ludwig_von_Bertalanffy. WIKI_SADI_CARNOT (2013) http://en.wikipedia.org/wiki/Nicolas_Leonard_Sadi_Carnot. WIKI_SYSTEM (2013) http://en.wikipedia.org/wiki/System. WILDERMUTH, M. (2000) Metabolic control analysis: biological applications and insights. Genome Biology, 1, reviews1031.1 - reviews1031.5. WILSON, J. T. & WILSON, B. H. (1985) Biotransformation of trichloroethylene in soil. Applied and Environment Microbiology, 49, 242-243. WINTER, J. U. & WOLFE, R. (1980) Methane formation from fructose by syntrophic associations of Acetobacterium woodii and different strains of methanogens. Archives of Microbiology, 124, 73-79.


WOOD, H. G. & LJUNGDAHL, L. (1991) Autotrophic character of acetogenic bacteria. IN SHIVELY, J. M. & BARTON, L. L. (Eds.) Variations in Autotrophic Life. San Diego, CA, Academic Press. YAMAMOTO, K., MURAKAMI, R. & TAKAMURA, Y. (1998) Isoprenoid quinone, cellular fatty acid composition and diaminopimelic acid isomers of newly classified thermophilic anaerobic gram-positive bacteria. FEMS (Federation of European Microbiological Societies) Microbiology Letters, 161, 351-358. YAN, J., IM, J., YANG, Y. & LÖFFLER, F. E. (2013) Guided cobalamin biosynthesis supports Dehalococcoides mccartyi reductive dechlorination activity. Philosophical Transactions of the Royal Society of London B Biological Sciences, 368, 20120320. YAN, J., RITALAHTI, K. M., WAGNER, D. D. & LÖFFLER, F. E. (2012) Unexpected specificity of interspecies cobamide transfer from Geobacter spp. to organohalide-respiring Dehalococcoides mccartyi strains. Applied and Environmental Microbiology, doi: 10.1128/AEM.01535-12. YI, S., SETH, E., MEN, Y., STABLER, S., ALLEN, R., ALVAREZ-COHEN, L. & TAGA, M. (2012) Versatility in corrinoid salvaging and remodeling pathways supports corrinoid-dependent metabolism in Dehalococcoides mccartyi. Applied and Environmental Microbiology, 78, 7745-7752. YOUNG, J. D., HENNE, K. L., MORGAN, J. A., KONOPKA, A. E. & RAMKRISHNA, D. (2008) Integrating cybernetic modeling with pathway analysis provides a dynamic, systems-level description of metabolic control. Biotechnology and Bioengineering, 100, 542-559. ZHANG, R. G., SKARINA, T., KATZ, J., BEASLEY, S., KHACHATRYAN, A., VYAS, S., ARROWSMITH, C., CLARKE, S., EDWARDS, A., JOACHIMIAK, A. & SAVCHENKO, A. (2001) Structure of Thermotoga maritima stationary phase survival protein SurE: a novel acid phosphatase. Structure, 9, 1095-1106. ZHUANG, W. Q., YI, S., FENG, X., ZINDER, S. H., TANG, Y. J. & ALVAREZ-COHEN, L. 
(2011) Selective utilization of exogenous amino acids by Dehalococcoides ethenogenes strain 195 and the effects on growth and dechlorination activity. Applied and Environmental Microbiology, Sep 2, 2011. ZILA, A. (2011) A molecular study of field bioaugmentation using the KB-1® mixed microbial consortium: the application of real-time PCR in analyzing population dynamics (http://www.beem.utoronto.ca/sites/www.beem.utoronto.ca/files/Zila%202011%20MEng%20thesis.pdf). Chemical Engineering and Applied Chemistry. Toronto, University of Toronto.


Appendices

Appendix A: Supplemental information for Chapter 3

Table A1. Overall Macromolecular Composition of a Dehalococcoides Cell^a

Protein                  63%
RNA                      16%
DNA                      12%
Lipid                    5%
Carbohydrate             1%
Soluble pools and ions   3%
Total                    100%

^a Assumption based on iAF692 (Methanosarcina barkeri model) (Feist et al., 2006). The DNA content is higher than that of M. barkeri because Dehalococcoides cells are disc-shaped and smaller than M. barkeri cells (Duhamel and Edwards, 2007).

Table A2. Protein Composition of 1 Gram of Dehalococcoides Cell (Neidhardt et al., 1990, Pramanik and Keasling, 1997)

Amino acid        Content (mol%)   Content (mmol/g DCW)
L-Alanine         9.58             0.5588
L-Arginine        5.52             0.3220
L-Asparagine      4.5              0.2625
L-Aspartate       4.5              0.2625
L-Cysteine        1.72             0.1003
L-Glutamine       4.91             0.2864
L-Glutamate       4.91             0.2864
Glycine           11.45            0.6679
L-Histidine       1.78             0.1038
L-Isoleucine      5.45             0.3179
L-Leucine         8.45             0.4929
L-Lysine          6.4              0.3733
L-Methionine      2.88             0.1680
L-Phenylalanine   3.47             0.2024
L-Proline         4.15             0.2421
L-Serine          4.05             0.2363
L-Threonine       4.73             0.2759
L-Tryptophan      1.07             0.0624
L-Tyrosine        2.59             0.1511
L-Valine          7.89             0.4603
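The two numeric columns of Table A2 appear to be related by a single scaling factor. The following minimal sketch assumes a total amino acid content of 5.8333 mmol per g DCW, which is the protein monomer content listed in Table A10; the function and variable names are illustrative and are not part of iAI549.

```python
# Convert amino acid mole percentages (Table A2) into mmol/g DCW.
# Assumption: total amino acid content is 5.8333 mmol/g DCW (value in Table A10).
TOTAL_AMINO_ACIDS_MMOL_PER_GDCW = 5.8333


def mol_percent_to_mmol_per_gdcw(mol_percent, total=TOTAL_AMINO_ACIDS_MMOL_PER_GDCW):
    """Return the content in mmol/g DCW for a given mole percentage."""
    return mol_percent / 100.0 * total


# A few rows of Table A2 (mol%), used here only as an example.
mol_percent = {
    "L-Alanine": 9.58,
    "L-Arginine": 5.52,
    "Glycine": 11.45,
    "L-Tryptophan": 1.07,
}

content = {name: mol_percent_to_mmol_per_gdcw(p) for name, p in mol_percent.items()}

for name, value in content.items():
    print(f"{name}: {value:.4f} mmol/g DCW")
```

Under this assumption the computed values agree with the tabulated contents to four decimal places for the rows shown.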


Table A3. DNA Composition of 1 Gram of Dehalococcoides Cell (Markowitz et al., 2012)

dNTPs   Content (mol%)   Content (mmol/g DCW)
dATP    95.5             0.0955
dGTP    84.7             0.0847
dCTP    84.7             0.0847
dTTP    95.5             0.0955

Table A4. RNA Composition of 1 Gram of Dehalococcoides Cell^a

rNTPs   Content (mol%)   Content (mmol/g DCW)
ATP     26.19            0.1289
GTP     32.22            0.1586
CTP     20.00            0.1063
UTP     21.59            0.0985

^a Elizabeth A. Edwards (personal communication)

Table A5. Lipid Composition of 1 Gram of Dehalococcoides Cell (White et al., 2005)

Lipids                                     Content (mol%)   Content (mmol/g DCW)
Dodecanoic acid (C12:0)                    0.91             0.0019
Tetradecanoic acid (C14:0)                 7.44             0.0154
Hexadecanoic acid (C16:0)                  41.85            0.0865
Octadecanoic acid (C18:0)                  18.19            0.0376
Eicosanoic acid (C20:0)                    0.49             0.0010
Oleic acid (18:1w9c)                       0.35             0.0007
10-R-Methylhexadecanoic acid (10Me16:0)    22.8             0.0471
Dodecanoic acid (C12:0)                    0.91             0.0019

Table A6. Composition of Cofactors and Other Soluble Pools of 1 Gram of Dehalococcoides Cell (Feist et al., 2006)

Components                 Content (mmol/g DCW)
Putrescine                 0.0262
Homospermidine             0.0047
Acetyl-CoA                 0.0001
CoA                        0.000006
NAD                        0.0022
NADH                       0.0001
NADP                       0.0001
NADPH                      0.0004
Succinyl-CoA               0.000003
AMP                        0.0010
ADP                        0.002
ATP                        0.004
5,6,7,8-tetrahydrofolate   0.0001
Adenosylcobalamin          0.0047
Glycogen                   0.0154

Table A7. Experimental Growth Yields of Various Dehalococcoides Cultures

Dehalococcoides     Electron    Yield        Yield         Yield         Yield          Yield          Reference
culture             acceptor^a  (g protein/  (copy/µmol    (copy/µmol    (gDCW/mol      (gDCW/eeq)
                                mol Cl)      ethene)       Cl)^b         Cl)^c

Pure cultures

Strain CBDB1        HCB         2.1          -             9.13 x 10^7   1.11           0.55           (Jayachandran et al., 2003)
Strain CBDB1        PeCB        2.9          -             1.26 x 10^8   1.53           0.77           (Jayachandran et al., 2003)
Strain CBDB1        2,3-DCP     1.73         -             7.52 x 10^7   0.91           0.46           (Adrian et al., 2007a)
Strain 195          PCE         4.8          -             2.9 x 10^8    2.53           1.27           (Maymó-Gatell et al., 1997)
Strain 195          2,3-DCP     -            -             8.30 x 10^7   1.01           0.50           (Adrian et al., 2007a)
Strain BAV1         VC          -            -             6.30 x 10^7   0.76           0.38           (He et al., 2003)
Strain FL2          TCE         -            -             7.80 x 10^7   0.95           0.47           (He et al., 2005)
Strain FL2          cDCE        -            -             8.40 x 10^7   1.02           0.51           (He et al., 2005)
Strain FL2          trans-DCE   -            -             8.10 x 10^7   0.98           0.49           (He et al., 2005)
Strain GT           VC          -            -             2.50 x 10^8   3.03           1.52           (Sung et al., 2006)
Strain GT           TCE         -            9.30 x 10^8   3.10 x 10^8   3.76           1.88           (Sung et al., 2006)

Average (± standard deviation)                                           1.38 (± 1.03)  0.69 (± 0.51)

Mixed cultures

VS enrichment       VC          -            -             5.20 x 10^8   6.31           3.16           (Cupples et al., 2003)
KB1/VC enrichment   VC          -            -             5.60 x 10^8   6.80           3.40           (Duhamel et al., 2004)
KB1/VC enrichment   TCE         -            -             3.60 x 10^8   4.37           2.19           (Duhamel et al., 2004)
ANAS enrichment     VC          -            -             1.30 x 10^7   0.16           0.08           (Holmes et al., 2006)
ANAS enrichment     DCE         -            -             1.2 x 10^7    0.15           0.07           (Holmes et al., 2006)
ANAS enrichment     TCE         -            -             1.4 x 10^7    0.17           0.08           (Holmes et al., 2006)
JN culture          PCB         -            -             9.25 x 10^8   11.23          5.61           (Bedard et al., 2007)
KB1/TCE enrichment  1,2-DCA     -            3.20 x 10^8   1.60 x 10^8   1.94           1.94           (Duhamel and Edwards, 2007)
KB1/TCE enrichment  VC          -            2.90 x 10^8   2.90 x 10^8   3.52           1.76           (Duhamel and Edwards, 2007)
KB1/TCE enrichment  cDCE        -            3.50 x 10^8   1.75 x 10^8   2.12           1.06           (Duhamel and Edwards, 2007)

Average (± standard deviation)^e                                         4.18 (± 2.05)  2.25 (± 0.88)

^a Short forms for electron acceptors are: HCB, Hexachlorobenzene; PeCB, Pentachlorobenzene; PCB, Polychlorinated biphenyls; 2,3-DCP, 2,3-Dichlorophenol; PCE, Tetrachloroethene; VC, Vinyl chloride; TCE, Trichloroethene; cDCE, cis-1,2-dichloroethene; trans-DCE, trans-1,2-dichloroethene; 1,2-DCA, 1,2-Dichloroethane.

^b A conversion factor of 2.3 x 10^-14 g protein cell^-1 is used to convert the numbers in g protein to copy (Adrian et al., 2007a).

^c Copy numbers are converted to gram dry cell weight (gDCW) by assuming cylindrical shape, 0.5 µm diameter, 0.2 µm thickness and 70% water content of a Dehalococcoides cell as well as 1 copy of the 16S rRNA gene per genome or per cell.

^d Bold numbers are yield values cited in the literature.

^e Average and standard deviation of mixed cultures are calculated without including the yields of the ANAS and JN cultures, since those are outliers. ANAS yields are based on long-term experiments where dechlorination and growth may be uncoupled (Holmes et al., 2006). The JN yield is likely to be inaccurate due to difficulties in measuring PCB concentration.
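The unit conversions described in footnotes b and c of Table A7 can be sketched as follows. The protein content per cell, the cell dimensions and the water content are taken from those footnotes. The wet cell density of 1.03 g/cm^3 is an assumption made here for illustration only, since the footnotes do not state a density, and all function names are hypothetical.

```python
import math

# Values taken from the footnotes of Table A7.
PROTEIN_PER_CELL_G = 2.3e-14   # g protein per cell (footnote b)
DIAMETER_UM = 0.5              # cell diameter in µm (footnote c)
THICKNESS_UM = 0.2             # cell thickness in µm (footnote c)
WATER_FRACTION = 0.70          # water content of a cell (footnote c)

# Assumption (not stated in the footnotes): wet cell density in g/cm^3.
CELL_DENSITY_G_PER_CM3 = 1.03


def protein_yield_to_copies(g_protein_per_mol_cl):
    """g protein/mol Cl -> 16S rRNA gene copies/µmol Cl (one copy per cell)."""
    cells_per_mol = g_protein_per_mol_cl / PROTEIN_PER_CELL_G
    return cells_per_mol / 1e6  # 1 mol = 1e6 µmol


def dry_weight_per_cell():
    """Dry weight in g of one cylindrical cell."""
    volume_um3 = math.pi * (DIAMETER_UM / 2) ** 2 * THICKNESS_UM
    volume_cm3 = volume_um3 * 1e-12  # 1 µm^3 = 1e-12 cm^3
    return volume_cm3 * CELL_DENSITY_G_PER_CM3 * (1 - WATER_FRACTION)


def copies_to_gdcw_per_mol(copies_per_umol_cl):
    """copies/µmol Cl -> gDCW/mol Cl."""
    return copies_per_umol_cl * 1e6 * dry_weight_per_cell()


# Example: strain CBDB1 grown on HCB (2.1 g protein/mol Cl in Table A7).
copies = protein_yield_to_copies(2.1)
gdcw = copies_to_gdcw_per_mol(copies)
print(f"{copies:.3g} copies/µmol Cl")
print(f"{gdcw:.2f} gDCW/mol Cl")
```

With the assumed density, the example lands close to the tabulated values for this row (9.13 x 10^7 copy/µmol Cl and 1.11 gDCW/mol Cl); a different density would shift the dry weight proportionally.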

Table A8. Experimental Growth Rates of Various Dehalococcoides Cultures

Dehalococcoides     Electron    Growth rate    Growth rate      Reference
culture             acceptor^a  (d^-1)         (h^-1)

Pure cultures

Strain CBDB1        2,3-DCP     0.41           0.017            (Adrian et al., 2007a)
Strain 195          PCE         1.26           0.053^b          (Karadagli and Rittmann, 2005)
Strain BAV1         VC          0.32           0.013            (He et al., 2003)
Strain GT           VC          0.35           0.014            (Sung et al., 2006)
Strain FL2          VC          0.29           0.012            (He et al., 2005)

Average (± standard deviation)                 0.014 (± 0.002)

Mixed cultures

VS enrichment       TCE         0.35           0.015            (Cupples et al., 2004)
VS enrichment       cDCE        0.46           0.019            (Cupples et al., 2004)
VS enrichment       VC          0.49           0.020            (Cupples et al., 2004)
KB1/VC enrichment   TCE         0.33           0.014            (Cupples et al., 2004)
KB1/VC enrichment   cDCE        0.44           0.018            (Cupples et al., 2004)
KB1/VC enrichment   VC          0.42           0.018            (Cupples et al., 2004)

Average (± standard deviation)                 0.017 (± 0.003)

^a Short forms for electron acceptors are: 2,3-DCP, 2,3-Dichlorophenol; VC, Vinyl chloride; PCE, Tetrachloroethene; TCE, Trichloroethene; cDCE, cis-1,2-dichloroethene.

^b The growth rate calculation was not substantiated; hence, this value was not used in calculating the average.
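The averages in Table A8 can be checked with a short calculation. The sketch below takes the larger value in each row as the growth rate in d^-1, divides by 24 to obtain h^-1, and leaves out strain 195 as stated in footnote b. Use of the sample standard deviation (n - 1 in the denominator) is an assumption, chosen because it is consistent with the tabulated values.

```python
import statistics

HOURS_PER_DAY = 24.0

# Growth rates in d^-1 from Table A8; strain 195 is left out (footnote b).
pure_per_day = [0.41, 0.32, 0.35, 0.29]
mixed_per_day = [0.35, 0.46, 0.49, 0.33, 0.44, 0.42]


def summarize_per_hour(rates_per_day):
    """Return (mean, sample standard deviation) of the rates in h^-1."""
    per_hour = [rate / HOURS_PER_DAY for rate in rates_per_day]
    return statistics.mean(per_hour), statistics.stdev(per_hour)


pure_mean, pure_sd = summarize_per_hour(pure_per_day)
mixed_mean, mixed_sd = summarize_per_hour(mixed_per_day)

print(f"pure cultures:  {pure_mean:.3f} (± {pure_sd:.3f}) h^-1")
print(f"mixed cultures: {mixed_mean:.3f} (± {mixed_sd:.3f}) h^-1")
```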

Table A9. Experimental Decay Rates of Different Anaerobes

Organism                                          Decay rate (d^-1)   Reference
Dehalococcoides sp. strain VS (during growth)     0.05                (Cupples et al., 2003)
Dehalococcoides sp. strain VS (no growth)         0.09                (Cupples et al., 2003)
Methanobacterium bryantii                         0.088               (Karadagli and Rittmann, 2005)
Typical for anaerobes                             0.02                (Rittmann and McCarty, 2001)

Table A10. Energy Cost for Processing and Polymerization of Macromolecules (GAM) of a Typical Bacterial Cell (Neidhardt et al., 1990)

Process                     µmol ATP/µmol   mmol/g DCW   mmol ATP/g DCW

Protein                                     5.8333
  Activation                4.0000                       23.3332
  mRNA synthesis            0.2000                       1.1667
  Proofreading              0.1000                       0.5833
  Assembly/modification     0.0060                       0.1399
RNA                                         0.4923
  Discarding segments       0.3800                       0.1871
  Modification              0.0200                       0.0098
DNA                                         0.3604
  Unwinding helix           1.0000                       0.3604
  Proofreading              0.3600                       0.1297
  Discontinuous synthesis   0.0060                       0.0022
  Negative supercoiling     0.0050                       0.0018
  Methylation               0.0010                       0.0004
Total cost                                               25.9145

Table A11. Standard Gibbs Free Energies for Different Dechlorination Reactions

Electron donor   Electron acceptor     Product               Reaction ΔG0' (kJ/mol)   Reference
Hydrogen         Tetrachloroethene     Trichloroethene       -175.31                  (Dolfing and Janssen, 1994, Rittmann and McCarty, 2001)
Hydrogen         Trichloroethene       Dichloroethene        -166.31                  (Dolfing and Janssen, 1994, Rittmann and McCarty, 2001)
Hydrogen         Dichloroethene        Chloroethene          -145.71                  (Dolfing and Janssen, 1994, Rittmann and McCarty, 2001)
Hydrogen         Chloroethene          Ethene                -151.35                  (Dolfing and Janssen, 1994, Rittmann and McCarty, 2001)
Hydrogen         Hexachlorobenzene     Pentachlorobenzene    -171.40                  (Dolfing and Harrison, 1992)
Hydrogen         Pentachlorobenzene    Tetrachlorobenzene    -164.07                  (Dolfing and Harrison, 1992)
Hydrogen         Tetrachlorobenzene    Trichlorobenzene      -164.3                   (Dolfing and Harrison, 1992)
Hydrogen         Trichlorobenzene      Dichlorobenzene       -152.77                  (Dolfing and Harrison, 1992)

Average ΔG0'                                                 -161.40
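As a quick check, the average in Table A11 is the arithmetic mean of the eight tabulated free energies. The dictionary keys below are illustrative labels only.

```python
import statistics

# Standard Gibbs free energies (kJ/mol) from Table A11, with H2 as electron donor.
delta_g0_prime = {
    "Tetrachloroethene -> Trichloroethene": -175.31,
    "Trichloroethene -> Dichloroethene": -166.31,
    "Dichloroethene -> Chloroethene": -145.71,
    "Chloroethene -> Ethene": -151.35,
    "Hexachlorobenzene -> Pentachlorobenzene": -171.40,
    "Pentachlorobenzene -> Tetrachlorobenzene": -164.07,
    "Tetrachlorobenzene -> Trichlorobenzene": -164.3,
    "Trichlorobenzene -> Dichlorobenzene": -152.77,
}

average_delta_g0_prime = statistics.mean(list(delta_g0_prime.values()))
print(f"Average: {average_delta_g0_prime:.2f} kJ/mol")
```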

Table A12. Theoretical ATP/e- and H+/e- Ratios of Reductive Dechlorination by Dehalococcoides

Average    ATP/e-     Proton          H+/e-      H+/e-      (Assumed/Maximum)     ATP/e-
ΔG0'       ratio      translocation/  ratio      ratio      x 100 (energy         (assumed)
(kJ/mol)   (maximum)  mole ATP        (maximum)  (assumed)  transfer efficiency)

161.40     1.64       3               4.92       5          100                   1.64
161.40     1.64       3               4.92       4          80                    1.30
161.40     1.64       3               4.92       3          60                    1.00
161.40     1.64       3               4.92       2          40                    0.66
161.40     1.64       3               4.92       1          20                    0.33
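One way to relate the columns of Table A12 is sketched below. It treats the maximum H+/e- ratio as the product of the maximum ATP/e- ratio and the number of protons translocated per mole of ATP. Two points are assumptions rather than statements from the table: the efficiency is computed against the maximum ratio rounded to the nearest integer (5), and the assumed ATP/e- ratio is taken as the efficiency times the maximum ATP/e- ratio. Under these assumptions the results agree with the tabulated ATP/e- values only to within about 0.02.

```python
ATP_PER_ELECTRON_MAX = 1.64  # maximum ATP/e- ratio (Table A12)
PROTONS_PER_ATP = 3          # protons translocated per mole ATP (Table A12)

# Maximum H+/e- ratio implied by the two values above.
h_per_e_max = ATP_PER_ELECTRON_MAX * PROTONS_PER_ATP


def assumed_case(h_per_e_assumed):
    """Return (efficiency in %, ATP/e-) for an assumed H+/e- ratio.

    Assumption: efficiency is measured against the maximum H+/e- ratio
    rounded to the nearest integer, which reproduces the tabulated percentages.
    """
    efficiency = h_per_e_assumed / round(h_per_e_max) * 100.0
    atp_per_e = efficiency / 100.0 * ATP_PER_ELECTRON_MAX
    return efficiency, atp_per_e


cases = {h: assumed_case(h) for h in (5, 4, 3, 2, 1)}
for h, (efficiency, atp_per_e) in cases.items():
    print(f"H+/e- = {h}: efficiency = {efficiency:.0f}%, ATP/e- = {atp_per_e:.2f}")
```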

Table A13. Experimental Values of Corrinoid Content of Various Anaerobes

Organism                                  Corrinoid content      Corrinoid     Dehalococcoides yield   Dehalococcoides yield   Reference
                                          (literature values)    content       prediction by iAI549    prediction by iAI549
                                                                 (mmol/gDCW)   during corrinoid        during de novo
                                                                               salvage from the        corrinoid synthesis
                                                                               medium (gDCW/eeq)       (gDCW/eeq)

Clostridium cochlearium                   30 nmol/g wet mass     0.00015       0.713                   0.713                   (Hoffmann et al., 2000)
Acetobacterium woodii                     650 nmol/g dry mass    0.00065       0.713                   0.713                   (Stupperich et al., 1988)
Clostridium formicoaceticum               950 nmol/g dry mass    0.00095       0.713                   0.713                   (Stupperich et al., 1988)
Sporomusa ovata                           3100 nmol/g dry mass   0.0031        0.713                   0.710                   (Stupperich et al., 1988)
Methanosarcina barkeri (used in iAI549)   -                      0.0047        0.713                   0.709                   (Gorris and van der Drift, 1994)
10X Methanosarcina barkeri                -                      0.047         0.707                   0.676                   Assumption

Table A14. Growth Rate Simulations with and without the Citrate Synthase (CS) Reaction in the TCA-cycle

Exchange reactions                    Flux values without the CS   Flux values with the CS
                                      reaction (mmol/gDCW.h)       reaction (mmol/gDCW.h)
Acetate exchange, EX_ac(e)            0.1820                       0.1943
Cobalamin exchange, EX_cbl1(e)        0.0001                       0.0001
Carbon dioxide exchange, EX_co2(e)    0.1741                       0.1383
Hydrogen exchange, EX_h2(e)           10.0000                      10.0000
Chloride exchange, EX_Cl(e)           9.6067                       9.6793

Parameters                            Without the CS reaction      With the CS reaction
Growth rate                           0.014 h^-1                   0.0137 h^-1
Growth yield                          0.72 gDCW/eeq                0.71 gDCW/eeq
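The growth yields in Table A14 appear to follow from the growth rate and the chloride exchange flux. The sketch below is one plausible reading, not a calculation stated in the text: it assumes that two electron equivalents are consumed per chloride released, so that the yield is the growth rate divided by twice the chloride flux. The function name is hypothetical.

```python
# Assumption: reductive dechlorination consumes 2 electron equivalents
# per chloride ion released.
ELECTRONS_PER_CHLORIDE = 2


def growth_yield(growth_rate_per_h, chloride_flux_mmol_per_gdcw_h):
    """Return the growth yield in gDCW/eeq.

    growth_rate_per_h: specific growth rate in h^-1
    chloride_flux_mmol_per_gdcw_h: chloride exchange flux in mmol/gDCW.h
    """
    eeq_per_gdcw_h = chloride_flux_mmol_per_gdcw_h * ELECTRONS_PER_CHLORIDE / 1000.0
    return growth_rate_per_h / eeq_per_gdcw_h


yield_without_cs = growth_yield(0.014, 9.6067)
yield_with_cs = growth_yield(0.0137, 9.6793)
print(f"without CS: {yield_without_cs:.2f} gDCW/eeq")
print(f"with CS:    {yield_with_cs:.2f} gDCW/eeq")
```

Because the growth rate without the CS reaction is reported to only two significant figures, the recomputed yield for that case can differ from the tabulated value in the second decimal place.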

Table A15. List of tables containing information for Dehalococcoides metabolic model, iAI549

Table Name                                                                                                            File location
Table S1. List of non-gene associated reactions included in iAI556                                                    doi:10.1371/journal.pcbi.1000887.s002
Table S2. List of reannotated genes of different Dehalococcoides strains                                              doi:10.1371/journal.pcbi.1000887.s002
Table S3. Gene correspondence for core genes mapped to reactions of iAI556                                            doi:10.1371/journal.pcbi.1000887.s002
Table S4. Gene correspondence for dispensable genes mapped to new reactions of iAI556                                 doi:10.1371/journal.pcbi.1000887.s002
Table S5. Gene correspondence for dispensable genes mapped to reactions of iAI556 already present in core             doi:10.1371/journal.pcbi.1000887.s002
Table S6. Gene correspondence for unique genes mapped to new reactions of iAI556                                      doi:10.1371/journal.pcbi.1000887.s002
Table S7. Gene correspondence for unique genes mapped to reactions of iAI556 already present in core                  doi:10.1371/journal.pcbi.1000887.s002
Table S8. Detailed list of proteins of iAI556                                                                         doi:10.1371/journal.pcbi.1000887.s002
Table S9. List of reactions of iAI556 associated with core genes (core reactions)                                     doi:10.1371/journal.pcbi.1000887.s002
Table S10. List of reactions of iAI556 associated with dispensable genes (dispensable reactions)                      doi:10.1371/journal.pcbi.1000887.s002
Table S11. List of reactions of iAI556 associated with unique genes (unique reactions)                                doi:10.1371/journal.pcbi.1000887.s002
Table S12. Detailed list of metabolites of iAI556                                                                     doi:10.1371/journal.pcbi.1000887.s002
Table S13. Core reductive dehalogenase homologous (rdh) genes of iAI556                                               doi:10.1371/journal.pcbi.1000887.s002
Table S14. Dispensable reductive dehalogenase homologous (rdh) genes of iAI556                                        doi:10.1371/journal.pcbi.1000887.s002
Table S15. Unique reductive dehalogenase homologous (rdh) genes of iAI556                                             doi:10.1371/journal.pcbi.1000887.s002
Table S16. List of Core Hypothetical Genes of Dehalococcoides Pan-genome                                              doi:10.1371/journal.pcbi.1000887.s002
Table S17. List of Exchange Reactions of iAI549                                                                       doi:10.1371/journal.pcbi.1000887.s002
Table S18. Set of Constraints for Simulating Dehalococcoides Growth using iAI549                                      doi:10.1371/journal.pcbi.1000887.s002

Supplemental Text

Dehalococcoides Biomass Synthesis Reaction

The detailed macromolecular composition of one (1) gram of Dehalococcoides cells, presented in Tables A1-A6, and the GAM (61 mmol ATP/gDCW) have been included in iAI549 as a biomass synthesis reaction: BIO_DHC_DM_61.

Considering a basis of 1 gram dry cell weight, the biomass synthesis equation is defined as:

0.0001 mmol Acetyl-CoA + 0.0047 mmol Adenosylcobalamin + 0.5588 mmol L-alanine + 0.001 mmol AMP + 0.3320 mmol L-arginine + 0.2625 mmol L-asparagine + 0.2625 mmol L-aspartate + 61 mmol ATP + 0.000006 mmol Coenzyme A + 0.1063 mmol CTP + 0.1003 mmol L-cysteine + 0.0955 mmol dATP + 0.0847 mmol dCTP + 0.0019 mmol Dodecanoic acid + 0.0847 mmol dGTP + 0.0955 mmol dTTP + 0.0471 mmol 10-R-Methylhexadecanoic acid + 0.2684 mmol L-glutamine + 0.2684 mmol L-glutamate + 0.6679 mmol Glycine + 0.0154 mmol Glycogen + 0.1586 mmol GTP + 61 mmol H2O + 0.0865 mmol Hexadecanoic acid + 0.1038 mmol L-histidine + 0.0047 mmol Homospermidine + 0.001 mmol Eicosanoic acid + 0.3179 mmol L-isoleucine + 0.4929 mmol L-leucine + 0.3733 mmol L-lysine + 0.1680 mmol L-methionine + 0.0022 mmol NAD + 0.0001 mmol NADH + 0.0001 mmol NADP + 0.0004 mmol NADPH + 0.0376 mmol Octadecanoic acid + 0.0007 mmol Oleic acid + 0.0347 mmol L-phenylalanine + 0.0415 mmol L-proline + 0.0262 mmol Putrescine + 0.0405 mmol L-serine + 0.000003 mmol Succinyl-CoA + 0.0001 mmol 5,6,7,8-tetrahydrofolate + 0.2759 mmol L-threonine + 0.0624 mmol L-tryptophan + 0.0154 mmol Tetradecanoic acid + 0.1511 mmol L-tyrosine + 0.0985 mmol UTP + 0.4603 mmol L-valine → 61 mmol ADP + 61 mmol H+ + 61 mmol Inorganic Phosphate

Calculation of Dehalococcoides Cell Composition

The shape of a Dehalococcoides cell is reported to be cylindrical (Duhamel and Edwards, 2007).

Diameter of one Dehalococcoides cell = 0.5 µm (Duhamel and Edwards, 2007)

Thickness of one Dehalococcoides cell = 0.2 µm

Hence, the volume of one Dehalococcoides cell is

$$V = \pi r^2 h = \pi \left(\frac{D}{2}\right)^2 h = \pi \left(\frac{0.5}{2}\right)^2 (0.2) = 0.0393\ \mu\text{m}^3$$

Assume that the cell density is close to that of water, 1.03 g/ml.

Therefore, mass of one Dehalococcoides cell = $1.03\ \text{g/ml} \times 3.93 \times 10^{-2}\ \mu\text{m}^3 \times 10^{-12}\ \text{ml}/\mu\text{m}^3 = 4.05 \times 10^{-14}$ g

Typically, a bacterial cell is 70% water (Neidhardt et al., 1990).

Hence, dry mass of one Dehalococcoides cell = $4.05 \times 10^{-14} \times 0.3$ g = $1.21 \times 10^{-14}$ g

Length of Dehalococcoides DNA (roughly) = $1.4 \times 10^{6}$ base pairs (bp) (Kube et al., 2005)

Assume that the average molecular mass of a nucleotide pair (1 bp) = 666 g/mol

Hence, the molar mass of a Dehalococcoides genome = $1.4 \times 10^{6} \times 666$ g/mol

Since 1 mole contains $6.023 \times 10^{23}$ molecules, the mass of one Dehalococcoides genome (one DNA molecule) is

$$\frac{1.4 \times 10^{6} \times 666}{6.023 \times 10^{23}}\ \text{g} = 1.55 \times 10^{-15}\ \text{g}$$

So, the percentage of DNA in 1 gram dry cell mass is

$$\frac{1.55 \times 10^{-15}}{1.21 \times 10^{-14}} \times 100\% = 12.75\%$$

The amount of RNA in a 50 ml culture = 50 µg (Elizabeth A. Edwards, personal communication), so 1 ml of culture contains 1 µg of RNA.

Also, 1 ml of a similar culture contains $1 \times 10^{7}$ to $5 \times 10^{8}$ Dehalococcoides cells (Elizabeth A. Edwards, personal communication).

Assume that 1 ml of culture contains $5 \times 10^{8}$ Dehalococcoides cells.

Hence, the dry mass of $5 \times 10^{8}$ cells = $5 \times 10^{8} \times 1.21 \times 10^{-14}$ g = $6.05 \times 10^{-6}$ g

So, $6.05 \times 10^{-6}$ g of cells contains $1 \times 10^{-6}$ g of RNA.

Therefore, the percentage of RNA in 1 gram dry cell mass is

$$\frac{1 \times 10^{-6}}{6.05 \times 10^{-6}} \times 100\% = 16.53\%$$
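The arithmetic above can be retraced with a short script. This is only a sketch that reuses the inputs quoted in this section; the variable names are illustrative and not part of the original work. Because it carries unrounded intermediates, its DNA and RNA percentages differ from the hand-rounded values in the text by less than 0.1 percentage point.

```python
import math

# Inputs quoted in the text above
diameter_um = 0.5          # cell diameter (Duhamel and Edwards, 2007)
thickness_um = 0.2         # cell thickness
density_g_per_ml = 1.03    # assumed cell density
dry_fraction = 0.3         # a typical cell is 70% water
genome_bp = 1.4e6          # approximate genome length (Kube et al., 2005)
bp_mass_g_per_mol = 666.0  # assumed average molar mass per base pair
avogadro = 6.023e23        # value used in the text
rna_g_per_ml = 1e-6        # 50 µg RNA in 50 ml of culture
cells_per_ml = 5e8         # upper end of the reported range

ML_PER_UM3 = 1e-12         # 1 µm^3 = 1e-12 ml

# Cylinder volume, wet mass, and dry mass of a single cell
volume_um3 = math.pi * (diameter_um / 2) ** 2 * thickness_um
wet_mass_g = density_g_per_ml * volume_um3 * ML_PER_UM3
dry_mass_g = wet_mass_g * dry_fraction

# Mass of one genome copy and the resulting DNA share of dry mass
genome_mass_g = genome_bp * bp_mass_g_per_mol / avogadro
dna_percent = 100 * genome_mass_g / dry_mass_g

# RNA share of dry mass from culture-level measurements
rna_percent = 100 * rna_g_per_ml / (cells_per_ml * dry_mass_g)

print(f"volume   = {volume_um3:.4f} um^3")
print(f"dry mass = {dry_mass_g:.3e} g")
print(f"DNA      = {dna_percent:.2f} %")
print(f"RNA      = {rna_percent:.2f} %")
```

Changing `cells_per_ml` to the lower end of the reported range shows how sensitive the RNA estimate is to the assumed cell count.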

Since experimental data for estimating the percentage contents of the other components of a Dehalococcoides cell were not available, the corresponding estimates from the published Methanosarcina barkeri model (Feist et al., 2006), which included protein, lipid, carbohydrate, and soluble pools and ions, were used in this model.

To determine the amounts of the individual components of the macromolecules of a Dehalococcoides cell, physiological data from various published models of different microorganisms (Feist et al., 2006, Mahadevan et al., 2006, Neidhardt et al., 1990, Pramanik and Keasling, 1997) were used. The contents of different fatty acids were calculated from White et al. (2005).


Calculation of NGAM and GAM Parameters of iAI549

Non-growth associated maintenance (NGAM) and growth associated maintenance (GAM) parameters for Dehalococcoides were estimated using published data (Adrian et al., 2007a, Cupples et al., 2004, Cupples et al., 2003, Duhamel et al., 2004, He et al., 2003, He et al., 2005, Jayachandran et al., 2003, Simmonds, 2007, Sung et al., 2006), the maintenance equation (Pirt, 1965, Pirt, 1982, Russell and Cook, 1995), and simulations in SimPheny™.

The non-growth associated ATP maintenance is given by

$$m = \frac{b}{Y_G}$$

where

b = specific maintenance rate or decay rate (d$^{-1}$)

$Y_G$ = true growth yield or yield without maintenance (g DCW/eeq)

Assuming $Y_G = Y$ = observed growth yield for Dehalococcoides bacteria,

$$m = \frac{b}{Y}$$

Using the pure culture growth yield, Y = 0.69 gDCW/eeq (Table A7), and b = 0.09 d$^{-1}$ (Table A9), the calculated NGAM for Dehalococcoides bacteria is

$$m = \frac{0.09 \times 1 \times 1000}{0.69 \times 24 \times 3} = 1.8\ \text{mmol ATP/gDCW·h}$$
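As a quick numerical check, the NGAM expression can be evaluated directly. This sketch copies the numbers from the worked equation; the factors 1 (numerator) and 3 (denominator) are taken verbatim from that equation, and no interpretation of them is assumed here beyond the obvious unit conversions (1000 mmol/mol, 24 h/d).

```python
# Values quoted in the text
b_per_day = 0.09             # decay rate, d^-1 (Table A9)
yield_gdcw_per_eeq = 0.69    # observed pure-culture yield (Table A7)

MMOL_PER_MOL = 1000
HOURS_PER_DAY = 24

# Factors copied as-is from the worked equation in the text
numerator_factor = 1
denominator_factor = 3

ngam = (b_per_day * numerator_factor * MMOL_PER_MOL) / (
    yield_gdcw_per_eeq * HOURS_PER_DAY * denominator_factor
)

print(f"NGAM = {ngam:.2f} mmol ATP/gDCW.h")
```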

Calculation of Theoretical Maximum Energy Transfer Efficiency (ATP/e-) and Proton Translocation Stoichiometry (H+/e- ratio) of Dehalococcoides Electron Transport Chain (ETC)

The theoretical maximum ATP/e- ratio, (ηATP/ηe)max, can be determined from the following equation (Kröger et al., 2002):

$$\left(\frac{\eta_{ATP}}{\eta_{e}}\right)_{max} = \frac{\Delta E_0'\, F}{\Delta G_P'} \qquad (1)$$

where F is the Faraday constant (96,500 J/mol·V), ΔE0’ is the difference in standard redox potential between the electron donor and acceptor, and ΔGP’ is the free energy of the phosphorylation reaction at pH 7 and physiological conditions.

Since $\Delta G_0' = -nF\,\Delta E_0'$, it follows that

$$\left(\frac{\eta_{ATP}}{\eta_{e}}\right)_{max} = \frac{-\Delta G_0'}{n\, \Delta G_P'} \qquad (2)$$

where n is the number of electrons transferred in the reaction.

ΔGP’ at physiological conditions can be calculated from the free energy of the phosphorylation reaction at standard conditions and pH 7 (ΔG0,p’) using the following equation:

$$\Delta G_P' = \Delta G_{0,p}' + RT \ln\left(\frac{[ATP]}{[ADP][P_i]}\right) \qquad (3)$$

where ΔG0,p’ = 32 kJ/mol (Thauer et al., 1977), R is the universal gas constant (8.314 J/mol·K), and T is the absolute temperature, 298.15 K at 25 ºC.

Assuming that the concentrations of ATP and ADP are equal and that the concentration of Pi is 1 mM, then the calculated value of ΔGP’ using equation (3) is 49.12 kJ/mol.
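The value of ΔGP’ can be reproduced from equation (3) with the stated assumptions. The sketch below uses only the constants given in the text; the variable names are illustrative.

```python
import math

R = 8.314            # universal gas constant, J/mol.K
T = 298.15           # absolute temperature, K (25 degrees C)
dG0p_j = 32_000.0    # standard free energy of phosphorylation, J/mol

# Assumptions stated in the text: [ATP] = [ADP] and [Pi] = 1 mM
atp_over_adp = 1.0
pi_molar = 1e-3

# Equation (3): dGp = dG0p + RT ln([ATP] / ([ADP][Pi]))
dGp_j = dG0p_j + R * T * math.log(atp_over_adp / pi_molar)
dGp_kj = dGp_j / 1000

print(f"dGp' = {dGp_kj:.2f} kJ/mol")
```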

The average standard free energy change for dechlorination was found to be ΔG0’ = -161.40 kJ/mol (Table A11). Therefore, the theoretical maximum ATP/e- from equation (2), with n = 2, is:

$$\left(\frac{\eta_{ATP}}{\eta_{e}}\right)_{max} = \frac{161.40}{2 \times 49.12} = 1.64$$

Assuming that 3 H+ are translocated across the cell membrane per ADP phosphorylated (Harold and Maloney, 1996), the theoretical maximum H+/e- of the dechlorination process is 4.92 (Table A12), which means that the H+/e- should be either 5 or 4.


Since the ATP/e- value (0.33) corresponding to 1 H+/e- (Table 30) was found to be in agreement with the experimental ATP/e- value of 0.6 mol ATP/mol Cl- (Loubiere and Lindely, 1991, Miller et al., 1997, Tang et al., 2009b), the proton translocation stoichiometry of the Dehalococcoides ETC was chosen as 1 H+/e-.
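The chain of numbers in this subsection can be retraced in a few lines. This is a sketch using the values quoted above; the dictionary of candidate stoichiometries is an illustrative addition that simply divides each integer H+/e- by the assumed 3 H+ per ATP.

```python
dG0_dechlor_kj = -161.40   # average standard free energy of dechlorination (Table A11)
n_electrons = 2            # electrons transferred, as used in the worked equation
dGp_kj = 49.12             # free energy of phosphorylation from equation (3)
h_per_atp = 3              # H+ translocated per ATP (Harold and Maloney, 1996)

# Equation (2): theoretical maximum ATP per electron
atp_per_e_max = -dG0_dechlor_kj / (n_electrons * dGp_kj)

# Corresponding theoretical maximum H+ per electron
h_per_e_max = atp_per_e_max * h_per_atp

# ATP/e- implied by each integer H+/e- stoichiometry
atp_per_e_for = {h: h / h_per_atp for h in range(1, 6)}

print(f"max ATP/e- = {atp_per_e_max:.2f}")
print(f"max H+/e-  = {h_per_e_max:.2f}")
for h, atp in atp_per_e_for.items():
    print(f"{h} H+/e- -> {atp:.2f} ATP/e-")
```

Carrying the unrounded ATP/e- forward gives an H+/e- maximum slightly above the 4.92 reported in the text, which multiplies the rounded 1.64 by 3.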

Detailed procedures for developing the Dehalococcoides pan-genome and model

Figure A1. Steps involved in developing the pan-genome

The concept of the pan-genome was first investigated by Tettelin and colleagues for 8 isolates of the common human pathogen Streptococcus agalactiae (Tettelin et al., 2005). A pan-genome, which catalogues the entire gene repertoire of a bacterial species, has three parts: the core-genome, the dispensable-genome, and the unique-genome. The core-genome includes the genes shared by all strains, while the genes that are present in two or more strains, but not all, are included in the dispensable-genome. The unique-genome contains those genes that are present in only one strain (Medini et al., 2008, Muzzi et al., 2007). The procedures for developing the Dehalococcoides pan-genome, as well as its core, unique and dispensable parts, are shown sequentially in Figures A1, A2, A3, and A4. Using the strain BAV1 genome as the reference, we first identified the putative orthologs between strain BAV1 and strain CBDB1 with OrthoMCL (Li et al., 2003), keeping OrthoMCL’s parameters at their default settings. Then, we sorted out the genes that were present only in the CBDB1 genome and combined them with the BAV1 genome to obtain augmented-genome 1 (Figure A1). This was then compared and analyzed with the strain 195 genome as before to construct augmented-genome 2.

Figure A2. Steps involved in developing the core-genome

Subsequent comparative analysis between augmented-genome 2 and the strain VS genome resulted in the pan-genome. Since the order in which genomes are compared changes the total number of genes in a pan-genome (Tettelin et al., 2005), we obtained 6 pan-genomes for 6 different combinations; the one with the highest number of genes (2061) was chosen as the Dehalococcoides pan-genome.

The next step was to develop the core-genome, and the procedures are shown in Figure A2. Using the same reference as for the pan-genome (the strain BAV1 genome), core-genome 1 was created by comparing the reference genome with subject-genome 1 (i.e., the strain CBDB1 genome). Since core genes are shared by all strains, only orthologous genes between the reference genome and subject-genome 1 were included in core-genome 1 (Figure A2). The same procedure was followed for generating core-genome 2 and, eventually, the core-genome of Dehalococcoides, where subject-genomes 2 and 3 were the genomes of strain 195 and strain VS, respectively.

Figure A3. Steps involved in developing the unique-genome

Successively, the unique-genome for Dehalococcoides was developed following the steps illustrated in Figure A3. As before, the comparative analysis by OrthoMCL between genome-1 and genome-2 produced unique-genome 1, which comprised genes found in genome-1 only. Further comparison and sorting out of unique genes between unique-genome 1 and genome-3 resulted in unique-genome 2, and the final comparative analysis of genome-4 with unique-genome 2 generated a unique-genome. This unique-genome included the genes that were found only in genome-1. Because each genome has unique genes, we developed 4 unique-genomes following the aforementioned steps; combining these 4 unique-genomes ultimately resulted in the Dehalococcoides unique-genome. Once the pan, core, and unique genomes were developed, the dispensable-genome of Dehalococcoides was obtained by combining the core and unique genomes and subtracting the result from the pan-genome (Figure A4).

Figure A4. Steps involved in developing the dispensable-genome
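The relationship between the four gene sets can be illustrated with plain set operations. This is a simplified sketch, not the OrthoMCL-based procedure used here: it assumes every gene has already been assigned to an ortholog group, and the group identifiers (`g1` ... `g7`) and strain contents are hypothetical toy data. With exact set membership the result does not depend on genome order, unlike the sequential pairwise comparisons described above.

```python
from collections import Counter

# Hypothetical toy data: each strain maps to the ortholog groups it carries
genomes = {
    "BAV1": {"g1", "g2", "g3", "g5"},
    "CBDB1": {"g1", "g2", "g4", "g5"},
    "195": {"g1", "g2", "g3", "g6"},
    "VS": {"g1", "g2", "g7"},
}

# Pan-genome: every ortholog group seen in any strain
pan = set().union(*genomes.values())

# Core-genome: groups shared by all strains
core = set.intersection(*genomes.values())

# Unique-genome: groups present in exactly one strain
strain_counts = Counter(g for genes in genomes.values() for g in genes)
unique = {g for g, n in strain_counts.items() if n == 1}

# Dispensable-genome: pan-genome minus (core + unique)
dispensable = pan - (core | unique)

print("pan:", sorted(pan))
print("core:", sorted(core))
print("unique:", sorted(unique))
print("dispensable:", sorted(dispensable))
```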

After the pan-genome and all of its parts were built, the pan-metabolic-network was developed and curated following the procedure described previously in the literature (Covert et al., 2001, Feist et al., 2009, Francke et al., 2005, Reed et al., 2006a, Thiele and Palsson, 2010a); this network finally generated the “pan-genome-scale” metabolic network for Dehalococcoides. Since the metabolic network of strain CBDB1 was reconstructed in an earlier work, only the metabolic genes missing from the strain CBDB1 genome were included from the Dehalococcoides pan-genome to develop the pan-metabolic-network. A gene correspondence among the four genomes was prepared to facilitate gene identification regardless of the genome of interest. This, in turn, mapped the genes from other strains to strain CBDB1 genes; thus, the same gene-protein-reaction (GPR) associations are applicable for all genes in the pan-genome-scale metabolic model.

Figure A5. Reconstructed Wood-Ljungdahl pathway for Dehalococcoides. Grey lines indicate missing pathways and red lines indicate existing pathways, the genes of which were identified in the genomes of Dehalococcoides during the reconstruction of iAI549. The arrows denote the directionality of the reactions.

[Bar chart. Legend: Core genes, Dispensable genes, Unique genes. Horizontal axis: Number of Genes (0 to 180). Bar labels by subsystem: Transport 46, 8, 6; Energy metabolism 76, 51, 54; Central carbon metabolism 31, 7; Nucleotide metabolism 53, 11; Lipid metabolism 32, 3; Cofactor and prosthetic group biosynthesis 62, 8; Amino acid metabolism 123, 5.]

Figure A6. Distribution of metabolic genes in different subsystems of iAI549

[Bar chart. Legend: Core reactions, Dispensable reactions, Unique reactions. Horizontal axis: Number of Reactions (0 to 125). Bar labels by subsystem: Transport 26, 1; Energy metabolism 37, 1; Central carbon metabolism 36, 5; Nucleotide metabolism 83; Lipid metabolism 81; Cofactor and prosthetic group biosynthesis 91, 10; Amino acid metabolism 122, 3.]

Figure A7. Distribution of gene-associated model reactions in different subsystems of iAI549

Appendix B: Supplemental information for Chapter 4

Table B1: List of supplemental tables for chapter 4

Table Name | Table Location
Table S1. Proteomic and Transcriptomic Evidence for All Genes in Strain 195 Genome | Islam_M_A_PhDThesis_022014_Additional File 1.xlsx
Table S2. Proteomic and Transcriptomic Evidence for All Genes in KB-1 Dhc Genome | Islam_M_A_PhDThesis_022014_Additional File 1.xlsx
Table S3. Strain 195 Hypothetical Proteins Highly Expressed or "On" (≥ 800) in All Samples | Islam_M_A_PhDThesis_022014_Additional File 1.xlsx
Table S4. Strain 195 Hypothetical Proteins Not Highly Expressed or Not "On" (< 800) in All Samples | Islam_M_A_PhDThesis_022014_Additional File 1.xlsx
Table S5. KB-1 Dhc Hypothetical Proteins Highly Expressed or "On" (≥ 100) in All Samples | Islam_M_A_PhDThesis_022014_Additional File 1.xlsx
Table S6. KB-1 Dhc Hypothetical Proteins Not Highly Expressed or Not "On" (< 100) in All Samples | Islam_M_A_PhDThesis_022014_Additional File 1.xlsx
Table S7. Proteomic and Transcriptomic Evidence for Strain 195 Metabolic Genes | Islam_M_A_PhDThesis_022014_Additional File 1.xlsx
Table S8. Proteomic and Transcriptomic Evidence for KB-1 Dhc Metabolic Genes | Islam_M_A_PhDThesis_022014_Additional File 1.xlsx
Table S9. Expression of rdhA Genes of Strain 195 | Islam_M_A_PhDThesis_022014_Additional File 1.xlsx
Table S10. Expression of rdhA Genes of KB-1 Dhc | Islam_M_A_PhDThesis_022014_Additional File 1.xlsx
Table S11. "ON" (Absolute Intensity ≥ 800) Metabolic Genes of Strain 195 in Late Stationary (No Growth) Condition | Islam_M_A_PhDThesis_022014_Additional File 1.xlsx
Table S12. "ON" (Absolute Intensity ≥ 100) Metabolic Genes of KB-1 Dhc in Starved (No Growth) Condition | Islam_M_A_PhDThesis_022014_Additional File 1.xlsx
Table S13. Quality Threshold (QT) Clusters of Strain 195 Transcriptomic Data | Islam_M_A_PhDThesis_022014_Additional File 1.xlsx
Table S14. Quality Threshold (QT) Clusters of KB-1 Dhc Transcriptomic Data | Islam_M_A_PhDThesis_022014_Additional File 1.xlsx
Table S15. Operon Prediction Results for Dehalococcoides mccartyi Genomes | Islam_M_A_PhDThesis_022014_Additional File 1.xlsx
Table S16. Selection of Genes Identified in Functionally Enriched Significant Clusters and Associated Inferred Annotations | Islam_M_A_PhDThesis_022014_Additional File 1.xlsx
Table S17. List of Genomes Used for Operon Prediction | Islam_M_A_PhDThesis_022014_Additional File 1.xlsx

Figure B1. Workflow for Analyzing Pre-Processed KB-1 Microarray Data. This figure describes the detailed steps involved in curating and reconciling Dehalococcoides mccartyi-specific gene expression data from the shotgun metagenome microarray of the KB-1 mixed microbial community. Subsequently, QT clustering and functional enrichment analyses of the data were conducted to identify interesting clusters of metabolic genes.


Figure B2. Workflow for Analyzing Pre-Processed Strain 195 Microarray Data. This figure illustrates the detailed steps involved in curating and analyzing transcriptomic data from the pure culture Dehalococcoides mccartyi strain 195.


Figure B3. Distribution of Strain 195 Gene Expression Intensities for 27 Samples. Histograms of absolute gene expression intensities for strain 195 are plotted using the median values of the array data for triplicate samples under 9 experimental conditions.


Figure B4. Distribution of KB-1 Dhc Gene Expression Intensities for 33 Samples. Histograms of absolute gene expression intensities for KB-1 Dhc are plotted for all 33 samples analyzed in this study.


Figure B5. Visualization of gene expression data on the Dehalococcoides mccartyi metabolic network. (A) The pan-genome-scale D. mccartyi metabolic network is organized by the organic layout algorithm in Cytoscape, where the genes and reactions involved in energy metabolism form 3 distinct clusters. The exchange reactions, which represent the in silico growth medium and are different from transporters, are organized toward the periphery of the network. Expression of only the metabolic genes from both arrays was visualized by overlaying the intensities on the network for the conditions in which the highest and lowest numbers of genes were “on”. The highest number of metabolic genes was “on” in the (B) “ANASspent” and (D) “TCEM” conditions, while the lowest number of such genes was found in the (C) “LS” and (E) “Starved” samples for strain 195 and KB-1 Dhc, respectively. “Not Examined” genes were those for which no array data were available because no corresponding homologs were found in either data set.


Appendix C: Supplemental information for Chapter 5

Identification of KB1_0495 (DmIDH) and KB1_0553 (DmPMI)

The gene encoding a putative isocitrate dehydrogenase (IDH) enzyme in D. mccartyi genomes (cbdbA408, DET0450, DehaBAV_0427, DhcVS_392, DehalGT_0391, and KB1_0495) (Markowitz et al., 2012) was primarily annotated as an NAD+-dependent isocitrate dehydrogenase (EC. 1.1.1.41) (Markowitz et al., 2012, Kube et al., 2005, Seshadri et al., 2005). The annotation of this gene is also not very specific in biochemical databases, including COG (Tatusov et al., 2000), TIGR Pfam (Green and Klein, 2002), and EBI Pfam (Punta et al., 2012), where it belongs to the isocitrate/isopropylmalate dehydrogenase protein family. However, during the construction and extensive manual curation of the D. mccartyi metabolic model (Ahsanul Islam et al., 2010), the gene was annotated as an IDH with specificity for NADP+ only (EC. 1.1.1.42). This annotation was assigned based on a rigorous analysis of the gene sequence with various bioinformatic tools during the metabolic model construction (Ahsanul Islam et al., 2010). Also, the orthologous gene neighborhood analysis (Figure C1) (Markowitz et al., 2012) — the operon structure for the putative D. mccartyi IDH gene in orthologous genomes — shows that the gene is located in an operon in all D. mccartyi genomes with other putative TCA-cycle genes, including a propionyl-CoA carboxylase, large and small subunits of a putative aconitase, a putative malate dehydrogenase, and a putative fumarate hydratase (alpha and beta subunits) (Figure 5.1A). Due to the non-specific nature of annotations in biochemical databases and considering the promiscuity of enzyme functions (Khersonsky et al., 2006), we biochemically characterized the putative D. mccartyi IDH (DmIDH) from strain KB-1 (KB1_0495).

The second selected gene of D. mccartyi strain KB-1 (KB1_0553) was annotated as a hypothetical protein/SIS domain protein in D. mccartyi genomes (cbdbA472, DET0509, DehaBAV1_0485, DhcVS_0450, and DehalGT_0448) (Markowitz et al., 2012). Some biochemical databases, such as COG (Tatusov et al., 2000) and EBI Pfam (Punta et al., 2012), annotated the gene as a putative glucose-6-phosphate isomerase, while SEED (Overbeek et al., 2005), TIGR Pfam (Green and Klein, 2002), and IMG (Markowitz et al., 2012) annotated it as a bifunctional phosphoglucose isomerase (PGI; EC 5.3.1.9)/phosphomannose isomerase (PMI; EC 5.3.1.8). This gene was also identified as a putative PGI/PMI during the construction and extensive manual curation of the D. mccartyi metabolic model (Ahsanul Islam et al., 2010); however, the gene annotation was given very low confidence during model construction due to the lack of biochemical evidence and insufficient bioinformatic evidence (Ahsanul Islam et al., 2010). Notably, the ortholog of KB1_0553 from D. mccartyi strain 195 (DET0509) was found in a coexpressed gene cluster enriched (i.e., overrepresented) with central carbon metabolism genes, specifically sugar metabolism genes, during the clustering analysis of D. mccartyi transcriptomes (Chapter 4). In addition, the orthologous gene neighborhood analysis (Markowitz et al., 2012) shows that, in all D. mccartyi genomes, the gene is located in an operon upstream of a putative bifunctional phosphoglucomutase/phosphomannomutase (Figure C1B). Hence, the putative hypothetical protein from D. mccartyi in KB-1 (KB1_0553) was biochemically characterized to identify its catalytic activities.


Figure C1. Orthologous gene neighborhood analysis for DmIDH (KB1_0495) and DmPMI (KB1_0553). (A) Genes that are orthologous to DmIDH in all D. mccartyi genomes (cbdbA408, DET0450, DehaBAV_0427, DhcVS_392, DehalGT_0391) are shown in red. Annotations of all genes in the operon (marked by a red circle) containing DmIDH are (from left to right): propionyl-CoA carboxylase, aconitase-large subunit, aconitase-small subunit, isocitrate dehydrogenase, malate dehydrogenase, hypothetical protein, fumarate hydratase-alpha subunit, fumarate hydratase-beta subunit, and HIT domain protein (Markowitz et al., 2012). (B) Genes that are orthologous to DmPMI in all D. mccartyi genomes (cbdbA472, DET0509, DehaBAV1_0485, DhcVS_0450, DehalGT_0448) are shown in red. The other gene in the operon (marked by a red circle) with DmPMI is a putative bifunctional phosphoglucomutase/phosphomannomutase (Markowitz et al., 2012).


Figure C2. Orthologous gene neighborhood analysis for the 3-isopropylmalate dehydrogenase (IPMDH) from D. mccartyi (cbdbA804, DET0826, DhcVS_730, DehaBAV1_0745, DehalGT_0706). Genes that are orthologous to IPMDH are shown in red. Annotations of all genes in the operon (marked by a red circle) containing the IPMDH are (from left to right): 2-isopropylmalate synthase, leuB: 3-isopropylmalate dehydrogenase, leuD: 3-isopropylmalate dehydratase-small subunit, leuC: 3-isopropylmalate dehydratase-large subunit, membrane protein, 2-isopropylmalate synthase, ilvC: ketol-acid reductoisomerase, ilvN: acetolactate synthase-small subunit, ilvB: acetolactate synthase-large subunit, ilvD: dihydroxy-acid dehydratase, putative translation factor.


Appendix D: Supplemental information for Chapter 6

Table D1. Previously developed (Duhamel, 2005, Waller, 2010) qPCR primer-sets used in this study

Organism | Primer name | Primer sequence (5’-3’) | Annealing temperature, Tm (ºC)
Dehalococcoides | Dhc 1f | GATGAACGCTAGCGGCG | 60
Dehalococcoides | Dhc 264r | CCTCTCAGACCAGCTACCGATCGAA | 60
Acetobacterium | Aceto 572f | GGCTCAACCGGTGACATGCA | 59
Acetobacterium | Aceto 784r | ACTGAGTCTCCCCAACACCT | 59
Bacteroides | Bact 215f | TACGGAATGGCTCGCGTGAC | 67
Bacteroides | Bact 415r | ATAGGGCCGCCTTCCTGCAC | 67
Geobacter | Geo 73f | CTTGCTCTTTCATTTAGTGG | 59
Geobacter | Geo 485r | AAGAAAACCGGGTATTAACC | 59
Spirochaetes | Spi 67f | CGCAGCAATGCGCTGAGAGC | 69
Spirochaetes | Spi 287r | AGGCCGGCTACCCATCATCG | 69
Sporomusa | Sporo 168f | TAGAGATGGGTCTGCGTCTG | 59
Sporomusa | Sporo 367r | TCGTCCCAAACGACAGAGCT | 59
Methanomethylovorans | Mvorans 166f | AAAGCTTTTGTGCCTAAGGA | 59
Methanomethylovorans | Mvorans 413r | ATGGACAGCCAACATAGGAT | 59
Methanomicrobiales | Mbiales 471f | ACTATTACTGGGCTTAAAGC | 59
Methanomicrobiales | Mbiales 754r | ACCGATACACCTAACGCGCA | 59
Methanosaeta | Msaeta 170f | TGCATCGAGATTTAAAGCTC | 59
Methanosaeta | Msaeta 390r | TGTAACCTGGCACTCGAGGT | 59
Methanosarcina | Msarcina 180f | ATGCGTAAAATGGATTCGTC | 59
Methanosarcina | Msarcina 511r | TAGACCCAATAATCACGATC | 59
General Bacteria | BAC 1055F | ATGGYTGTCGTCAGCT | 55
General Bacteria | BAC 1392R | ACGGGCGGTGTAC | 55
General Archaea | ARCH-787F | ATTAGATACCCGBGTAGTCC | 59
General Archaea | ARCH-1059R | GCCATGCACCWCCTCT | 59

Figure D1. qPCR results for Acetobacterium OTU for 1:10 dilution of KB-1 samples


Figure D2. Ratios of total bacterial and archaeal cell numbers tracked by individual OTU primers to general bacterial and archaeal primers. Total bacterial and archaeal OTU cell numbers of the different KB-1 cultures used in this study, tracked by individual OTU primers and by general bacterial and archaeal primers, are presented as ratios for qPCR surveys conducted in January 2012 (A), (B) and September 2012 (C), (D).


Figure D3. Electron balance profiles for the time-course experiment of diluted KB-1 cultures. Dechlorination profiles of electron acceptors ethene, methane and electron donor methanol, calculated in meeq (milli-electron equivalents)/bottle, are shown for 50x (2%; vol/vol) diluted KB-1 subcultures in different media conditions: (A) KB-1 subcultures previously maintained in media with all exogenous vitamins added are transferred to similar growth media (“Regular KB-1+All Vitamin Mix Medium”); (B) KB-1 subcultures previously maintained in media with all exogenous vitamins except vitamin B12 are transferred to similar media (“B12 Free KB-1+B12 Free Vitamin Mix Medium”) and (C) to media with all exogenous vitamins including vitamin B12 (“B12 Free KB-1+All Vitamin Mix Medium”); and (D) KB-1 cultures previously maintained in media free of all exogenous vitamins are transferred to similar media (“Vitamin Free KB-1+No Vitamin Mix Medium”), (E) to media with all exogenous vitamins except vitamin B12 (“Vitamin Free KB-1+B12 Free Vitamin Mix Medium”), and (F) to media with all exogenous vitamins including vitamin B12 (“Vitamin Free KB-1+All Vitamin Mix Medium”). Methanol amendment is represented by arrows in (D) and (F). Error bars represent the standard deviations of triplicate samples.


Appendix E: Genome-scale constraint-based metabolic modeling of Moorella thermoacetica

Abstract

Acetogens are obligate anaerobes capable of conserving energy and assimilating carbon into biomass through one of the simplest and most unique microbial inorganic carbon fixation pathways: the reductive acetyl-CoA pathway, or the Wood-Ljungdahl (W-L) pathway. Although acetogens can use different electron donors and acceptors due to their metabolic diversity, the main feature of the W-L pathway involves the reductive synthesis of acetyl-CoA from CO2 and H2 (acetogenesis), or from the fermentation of sugars (homoacetogenesis) by acetogens. Because of their phenotypic and metabolic diversity, acetogens play vital roles in many syntrophic and symbiotic microbial habitats; one such example is KB-1, a syntrophic microbial consortium important for the bioremediation of toxic chlorinated xenobiotics. In order to better understand the metabolism of acetogens and their potential roles in KB-1, a constraint-based genome-scale metabolic model of a well-studied acetogen, Moorella thermoacetica, has been developed from its genome sequence and physiological information. The biochemical network of M. thermoacetica was automatically reconstructed by the RAST annotation server, and manually curated using information from published literature and biochemical databases. Subsequently, the Model SEED framework was used to construct the metabolic model, while FBA (Flux Balance Analysis) was used for simulating the metabolism of M. thermoacetica. The model, also known as iAI517, comprises 517 metabolic genes, 812 reactions, and 815 metabolites. Of the 812 model reactions, 777 are gene-associated reactions while the rest are non-gene associated reactions. The model, in addition to simulating both autotrophic and heterotrophic growth of M. thermoacetica, also revealed degeneracy in the TCA-cycle, a common characteristic of anaerobic metabolism. In addition to rendering a total picture of the organism’s metabolism, such models are useful for generating experimentally testable hypotheses regarding its physiology.

Introduction

Acetogenesis is a microbial metabolic process for acetate formation from carbon dioxide (CO2) and hydrogen (H2) through a metabolic pathway called the reductive acetyl-CoA pathway, also known as the Wood-Ljungdahl (W-L) pathway (Drake, 1994, Drake et al., 2008). Unlike other prokaryotic CO2-fixing pathways, the reductive acetyl-CoA pathway is the simplest and the only linear pathway to synthesize acetyl-CoA from CO2 without requiring any recycled intermediates (Drake and Daniel, 2004, Fuchs, 2011). Although many organisms, including methanogens, sulfate reducing bacteria, and methanogenic euryarchaeota, use this unique autotrophic pathway for CO2-fixation or acetate oxidation, only acetogens, a group of obligate anaerobic bacteria, use it for simultaneous assimilation of CO2 into cell-biomass and ATP generation for conserving energy by converting acetyl-CoA to acetate (Drake and Daniel, 2004, Fuchs, 2011). Acetogens are also capable of conserving energy and producing acetate as an exclusive fermentation product from sugars, such as glucose, fructose, and xylose, via the W-L pathway by a process termed homoacetogenesis (Fontaine et al., 1942, Diekert and Wohlfarth, 1994). Thus, acetogens can conserve energy by both electron transport phosphorylation and substrate level phosphorylation (Drake et al., 2008). This versatility in energy conservation processes, together with their capability of using a wide range of organic and inorganic compounds as electron donors and acceptors, enables them to play critical roles in many diverse ecosystems, such as soils, sediments, sludge, and intestinal tracts of termites and humans (Drake, 1994, Drake and Daniel, 2004, Drake et al., 2008).

Due to their metabolic diversity, acetogens play pivotal roles in anaerobic dechlorinating microbial communities such as KB-1. In KB-1, acetogens convert methanol to acetate and H2, the carbon and energy sources for the main dechlorinating bacteria in the community, D. mccartyi (Duhamel and Edwards, 2006, Duhamel et al., 2004, Duhamel et al., 2002). Being natural producers of vitamin B12 (Stupperich et al., 1988, Stupperich et al., 1990), acetogens possibly also supply vitamin B12 to D. mccartyi in KB-1, because D. mccartyi are vitamin B12 auxotrophs even though this nutrient is essential for their growth and dechlorination activity (Yan et al., 2013, Yan et al., 2012). Considering these crucial roles, it is very important to understand the metabolism of acetogens in detail; this knowledge, in turn, will be very useful in designing and developing a robust and self-sustaining dechlorinating microbial community for bioremediation applications. Since constraint-based genome-scale models are very useful for characterizing the metabolism of an organism (Lewis et al., 2012, Feist et al., 2009, Thiele and Palsson, 2010a), we developed such a detailed metabolic model for Moorella thermoacetica, a well-studied and versatile anaerobic bacterium capable of growing by both acetogenesis and homoacetogenesis.

M. thermoacetica is a low GC content, Gram-positive bacterium belonging to a family within the diverse bacterial phylum Firmicutes (Pierce et al., 2008, Drake and Daniel, 2004). This spore-forming, rod-shaped bacterium is 2.8 µm long and 0.4 µm in diameter, and it can grow in a minimal medium with the supply of only nicotinic acid and trace metals (Drake and Daniel, 2004). M. thermoacetica uses H2, CO, sugars, alcohols, organic acids and methoxylated aromatic compounds as electron donors, and CO2, nitrate, nitrite, thiosulfate, and dimethylsulfoxide as electron acceptors (Drake, 1994, Drake and Daniel, 2004, Drake et al., 2008). Originally isolated from horse manure, this bacterium is capable of growing both autotrophically using H2/CO2 and heterotrophically by sugar fermentation (Drake et al., 2008). Using the genome sequence of M. thermoacetica (Pierce et al., 2008) as a backbone, an initial automated metabolic network of the organism was constructed by the RAST annotation server (Aziz et al., 2008). Subsequently, a draft constraint-based model of M. thermoacetica was developed using the Model SEED platform (Henry et al., 2010) in the SEED database (Overbeek et al., 2005). The draft model was then extensively curated by manually incorporating physiological information on M. thermoacetica from published literature and biochemical databases. In silico growth of M. thermoacetica was simulated with the model in a minimal medium, and the difference in energy conservation processes was also explored by model simulations.

Materials and methods

Automated generation of a draft metabolic model for Moorella thermoacetica

The reconstructed metabolic network and a draft metabolic model for M. thermoacetica were automatically generated using the RAST (Rapid Annotation using Subsystem Technology) annotation server (Aziz et al., 2008) and the Model SEED platform (Henry et al., 2010) in SEED (Overbeek et al., 2005), a comprehensive web-based environment for performing comparative genomic analyses and developing highly curated genomic data. The Model SEED platform is a semi-automated pipeline for generating, optimizing, and analyzing genome-scale metabolic models of microorganisms (Henry et al., 2010). Using this pipeline, the assembled genome sequence of M. thermoacetica was first uploaded to the RAST server, which annotated the genome automatically and imported it into the SEED environment. Subsequently, a preliminary reconstructed network of M. thermoacetica was generated from the RAST annotations by incorporating all spontaneous reactions, as well as a template biomass reaction from the SEED database. This preliminary metabolic network included network gaps, which were identified and filled during the auto-completion stage by adding the necessary intracellular and transport reactions to generate an analysis-ready model. This analysis-ready draft model was capable of producing all biomass precursor metabolites and of predicting the growth rate of M. thermoacetica using the flux balance analysis (FBA) approach (Varma and Palsson, 1994). However, the draft model required extensive manual curation because it included a template biomass reaction from the SEED database, and this reaction was not consistent with the actual biomass composition of M. thermoacetica.
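The optimization that FBA solves can be written compactly. The block below is the standard textbook formulation, not a statement of the exact solver settings used for this model; the symbols S (stoichiometric matrix), v (flux vector), and the lower and upper flux bounds are notation introduced here for illustration.

```latex
\begin{aligned}
\max_{v} \quad & v_{\mathrm{biomass}} \\
\text{subject to} \quad & S\,v = 0, \\
& v_j^{\min} \le v_j \le v_j^{\max} \quad \text{for every reaction } j .
\end{aligned}
```

The equality constraint expresses the steady-state mass balance for every metabolite, and the bounds encode reaction reversibility and the composition of the growth medium. An analysis-ready model is one for which this problem has a solution with a non-zero biomass flux.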

Determination of biomass compositions

The detailed biomass composition of M. thermoacetica, including the amounts of all cellular macromolecules and their building blocks (amino acids, DNA, RNA, fatty acids, and lipids), was compiled from its genome sequence, biochemical databases, and published literature describing its cellular composition and physiology (Tables E2–E9). The overall percentage composition of macromolecules in an M. thermoacetica cell was assumed to be similar to that of a Bacillus subtilis cell and was taken from the genome-scale B. subtilis model (Oh et al., 2007). The rationale behind this assumption is that both M. thermoacetica and B. subtilis are Gram-positive bacteria with comparable cell size and cell morphology (Drake and Daniel, 2004, Sargent, 1975). The detailed composition of amino acids, DNA, and RNA was calculated from the genome sequence (Pierce et al., 2008), while the fatty acid composition was obtained from published literature on M. thermoacetica physiology (Yamamoto et al., 1998). The compositions of the cell wall and of ions and metabolites were taken from the B. subtilis model (Oh et al., 2007). All compositions were calculated on the basis of 1 g dry cell weight.
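The conversion from genome-derived amino acid counts to biomass coefficients can be sketched as follows. The formula used here (prevalence multiplied by the protein mass fraction and divided by the prevalence-weighted average molecular weight) is an assumption inferred from the values in Tables E2 and E3, since the text does not state it explicitly; the function name `biomass_coefficients` is hypothetical.

```python
# Amino acid data from Table E3: (count over all ORFs, molecular weight in g/mol)
AMINO_ACIDS = {
    "ala-L": (79490, 89.05), "arg-L": (52645, 175.11), "asn-L": (23361, 132.05),
    "asp-L": (34673, 132.04), "cys-L": (8138, 121.02), "glu-L": (51805, 146.05),
    "gln-L": (27224, 146.07), "gly": (65648, 75.03), "his-L": (13400, 155.07),
    "ile-L": (46347, 131.09), "leu-L": (86082, 131.09), "lys-L": (33094, 147.11),
    "met-L": (17155, 149.05), "phe-L": (26267, 165.08), "pro-L": (38383, 115.06),
    "ser-L": (33503, 105.04), "thr-L": (38340, 119.06), "trp-L": (8681, 204.09),
    "tyr-L": (23295, 181.07), "val-L": (58640, 117.08),
}

# Protein mass fraction of dry cell weight (Table E2: 52.85% w/w)
PROTEIN_FRACTION = 0.5285


def biomass_coefficients(composition, mass_fraction):
    """Return mmol of each monomer per g dry cell weight (assumed formula)."""
    total_count = sum(count for count, _ in composition.values())
    # Prevalence-weighted average molecular weight of the monomers (g/mol)
    avg_mw = sum(count * mw for count, mw in composition.values()) / total_count
    # Total mmol of monomers contained in the macromolecule pool of 1 gDCW
    total_mmol = mass_fraction / avg_mw * 1000.0
    return {
        name: count / total_count * total_mmol
        for name, (count, _) in composition.items()
    }


coeffs = biomass_coefficients(AMINO_ACIDS, PROTEIN_FRACTION)
for name in ("ala-L", "leu-L", "cys-L"):
    print(f"{name}: {coeffs[name]:.4f} mmol/gDCW")
```

Under this assumed formula, the computed values for alanine, leucine, and cysteine agree with the mmol/gDCW column of Table E3 to four decimal places.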

Curation of the draft Moorella thermoacetica metabolic model

The draft metabolic model of M. thermoacetica was first checked for the consistency of the metabolites and reactions it included. The reaction and metabolite identifiers of the draft model differed from those used in published genome-scale models, including the D. mccartyi metabolic model developed in this thesis, because the Model SEED platform follows its own naming convention. Hence, the names of all reactions and metabolites were converted to the BiGG database (Schellenberger et al., 2010) notation to make the draft model consistent with other curated and published genome-scale models (Figure E1). This conversion also helped identify parts of the metabolic network that needed further curation. Next, the biomass composition of M. thermoacetica, estimated from published literature and the genome sequence of the organism, was included in the model. After the correct biomass equation was incorporated, FBA was used to test whether the model could generate all biomass precursors in a minimal growth medium for M. thermoacetica. Once the model was capable of generating all biomass precursors in the minimal medium, both autotrophic growth on H2/CO2 and heterotrophic growth using glucose, fructose, and xylose as substrates were simulated for M. thermoacetica.
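The identifier conversion step can be illustrated with a minimal sketch. The mapping below covers only five metabolites and is meant as an illustration, not as the conversion table used in this work; the function name `convert_reaction` and the example reaction are hypothetical.

```python
# Illustrative SEED -> BiGG-style metabolite ID map (a handful of entries only;
# a real conversion table would cover every metabolite in the model).
SEED_TO_BIGG = {
    "cpd00001": "h2o",
    "cpd00002": "atp",
    "cpd00008": "adp",
    "cpd00009": "pi",
    "cpd00067": "h",
}


def convert_reaction(stoich, id_map, compartment="c"):
    """Rename metabolites in a {seed_id: coefficient} reaction.

    Returns the converted reaction and a list of IDs that had no mapping,
    so that unmapped metabolites can be curated by hand.
    """
    converted, unmapped = {}, []
    for seed_id, coeff in stoich.items():
        if seed_id in id_map:
            converted[f"{id_map[seed_id]}[{compartment}]"] = coeff
        else:
            unmapped.append(seed_id)
    return converted, unmapped


# Hypothetical example: ATP hydrolysis written with SEED compound IDs
atp_hydrolysis = {
    "cpd00002": -1, "cpd00001": -1,
    "cpd00008": 1, "cpd00009": 1, "cpd00067": 1,
}
converted, unmapped = convert_reaction(atp_hydrolysis, SEED_TO_BIGG)
print(converted)
print("unmapped:", unmapped)
```

Collecting the unmapped identifiers separately reflects the point made above: the conversion doubles as a check that flags metabolites and reactions needing manual attention.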

Figure E1. Steps involved in curating the M. thermoacetica draft metabolic model. All steps are described in the text.

Results and discussion

General features of the reconstructed metabolic network of M. thermoacetica

The reconstructed metabolic network of M. thermoacetica, denoted iAI517 according to the established naming convention (Reed et al., 2003), comprises 517 metabolic genes, representing 21% of the genes in its 2.6 Mbp genome (Table E1). In total, there are 812 biochemical reactions in the network, of which 777 are gene-associated and 35 are non-gene-associated; the non-gene-associated reactions are either spontaneous reactions such as diffusion reactions (18 in total) or reactions added to the network during the gap-filling procedure (17 in total). The estimated composition of the biomass precursors of an M. thermoacetica cell was included in the network as a biomass demand reaction, which is essentially an exchange reaction. The network also comprises 56 exchange reactions, which cover the components of the in silico minimal growth medium and transporters. In order to analyze the metabolic processes of M. thermoacetica, reactions in the network were further categorized according to the metabolic pathways, also known as model subsystems, in which they are involved. For instance, reactions involved in glycolysis/gluconeogenesis, the TCA cycle, pentose phosphate pathways, and carbohydrate metabolism were referred to as “central carbon metabolism” reactions, while reactions not involved in any particular pathway were placed in the “other” category. This classification of metabolic reactions led to the identification of 10 model subsystems in the M. thermoacetica metabolic model.
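The reaction counts quoted above and in Table E1 can be cross-checked with simple bookkeeping. This is a minimal sketch using only the numbers reported in the text; the dictionary keys are names chosen here for illustration.

```python
# Reaction counts reported for iAI517 in the text and in Table E1
counts = {
    "total_reactions": 812,
    "gene_associated": 777,
    "non_gene_associated": 35,
    "spontaneous": 18,
    "gap_filled": 17,
    "input_output_exchanges": 56,
    "biomass_demand": 1,
    "total_exchanges": 57,
}

# Each check compares a reported total with the sum of its reported parts
checks = {
    "model reactions": counts["gene_associated"] + counts["non_gene_associated"]
    == counts["total_reactions"],
    "non-gene-associated reactions": counts["spontaneous"] + counts["gap_filled"]
    == counts["non_gene_associated"],
    "exchange reactions": counts["input_output_exchanges"] + counts["biomass_demand"]
    == counts["total_exchanges"],
}

for name, ok in checks.items():
    print(f"{name}: {'consistent' if ok else 'INCONSISTENT'}")
```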

Classification of reactions into metabolic pathways revealed that the subsystem “amino acid metabolism” has the highest number of reactions, followed by the “fatty acid metabolism” and “cofactor metabolism” subsystems (Figure E2). In order to obtain a better topological view of the model subsystems, the entire reconstructed metabolic network was visualized (Figure E3) with Cytoscape (Smoot et al., 2011), an open-source bioinformatics software platform for integrating and visualizing systemic data. Genes, metabolites, and reactions of the M. thermoacetica metabolic network were organized using the organic layout algorithm (http://docs.yworks.com/yfiles/doc/developers-guide/smart_organic_layouter.html) in Cytoscape. This representation shows that exchange reactions are located near the periphery of the network, consistent with their role in moving metabolites into and out of the system boundary. Also, reactions involved in lipid and fatty acid metabolism clustered differently from reactions in other subsystems (Figure E3). Lipid metabolism reactions mainly produce precursors for cell wall biogenesis, such as lipoteichoic acids and peptidoglycan, while reactions in fatty acid metabolism are mainly step-wise chain elongation reactions. Because of this step-wise nature, the fatty acid metabolism reactions clustered closely in the network. Reactions in other subsystems, especially those in amino acid metabolism and central carbon metabolism, are widely distributed throughout the network. This scattered distribution is likely due to their involvement in producing precursor metabolites for various metabolic pathways (Figure E3). The complete Wood-Ljungdahl (W-L) pathway, used for autotrophic growth of M. thermoacetica, was also reconstructed from the M. thermoacetica genome sequence (Figure E4).

Figure E2. Subsystems of the Moorella thermoacetica metabolic model, iAI517. Metabolic reactions were assigned to model subsystems according to their metabolic functions or the metabolic pathways in which they are involved.

Figure E3. Reconstructed metabolic network of Moorella thermoacetica. The network was organized by the organic layout algorithm in Cytoscape. Reactions involved in different metabolic pathways or model subsystems were colored according to the subsystem names.

Figure E4. Reconstructed Wood-Ljungdahl (W-L) pathway of CO2-fixation for M. thermoacetica.

Model-based simulations of M. thermoacetica metabolism

Moorella thermoacetica is a versatile acetogen capable of growing both autotrophically and heterotrophically using a multitude of organic and inorganic compounds as electron donors and acceptors. Unlike D. mccartyi, M. thermoacetica can use its carbon source for generating energy for growth. It uses H2 and CO2 for autotrophic growth and ferments sugars, such as glucose, xylose, and fructose, for heterotrophic growth (Drake and Daniel, 2004, Pierce et al., 2008); in both cases, it employs the Wood-Ljungdahl (W-L) reductive acetyl-CoA pathway for producing acetate, although the former mode of growth is called acetogenesis and the latter is known as homoacetogenesis (Diekert and Wohlfarth, 1994, Drake and Daniel, 2004, Drake et al., 2008). Thus, we examined whether the model was able to simulate the growth of M. thermoacetica in silico using H2, CO2, glucose, xylose, and fructose as substrates. The growth simulation results (Figure E5) show that autotrophic growth of M. thermoacetica is very slow compared to its growth on sugars. We also calculated the amount of ATP produced in the simulations on each growth substrate to determine whether the slow growth was related to a lower availability of energy.

Figure E5. In silico growth profile of Moorella thermoacetica on different substrates using the metabolic model, iAI517.

Figure E6. In silico ATP generation profile of Moorella thermoacetica on different substrates using the metabolic model, iAI517.

Model simulations indeed showed that autotrophic growth was associated with lower ATP production, and hence with slower growth of the bacterium, while ATP production was highest when M. thermoacetica was growing on glucose (Figure E6). These results agree with the known physiology of the bacterium: during homoacetogenic growth on glucose, M. thermoacetica gains 4 moles of ATP in total by substrate-level phosphorylation (2 moles from glycolysis of glucose to pyruvate, and 2 moles from the acetyl phosphate to acetate step [Figure E4]). The scenario is different for autotrophic growth. When M. thermoacetica grows autotrophically on H2 and CO2 using the W-L pathway, one mole of ATP is consumed during the formate to 10-formyltetrahydrofolate step and one mole of ATP is produced during the formation of acetate from acetyl phosphate (Figure E4). In the absence of any net ATP gain from the W-L pathway, M. thermoacetica has to conserve energy by proton-gradient-driven respiration through the chemiosmotic mechanism, which produces only 1 or 2 moles of ATP via the proton-dependent ATP synthase (Müller, 2003). Thus, the lower ATP yield of respiration compared with glucose fermentation is responsible for the slower growth of M. thermoacetica on H2 and CO2.
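The ATP accounting in the preceding paragraph can be laid out as a short tally. This sketch only restates the per-step values given in the text (it is not an output of the iAI517 simulations), and the variable names are chosen here for illustration.

```python
# ATP gained (+) or consumed (-) by substrate-level phosphorylation (SLP),
# using the per-step values stated in the text.
glucose_slp = {
    "glycolysis (glucose -> pyruvate)": +2,
    "acetyl phosphate -> acetate": +2,
}
autotrophic_slp = {
    "formate -> 10-formyltetrahydrofolate": -1,
    "acetyl phosphate -> acetate": +1,
}

# ATP from the proton-dependent ATP synthase during autotrophic growth:
# the text gives a range of 1 to 2 moles.
chemiosmotic_atp_range = (1, 2)

net_glucose = sum(glucose_slp.values())
net_autotrophic_slp = sum(autotrophic_slp.values())
autotrophic_total_range = tuple(
    net_autotrophic_slp + atp for atp in chemiosmotic_atp_range
)

print("Net ATP by SLP on glucose:", net_glucose)
print("Net ATP by SLP on H2/CO2:", net_autotrophic_slp)
print("Total ATP on H2/CO2 (SLP + chemiosmosis):", autotrophic_total_range)
```

The tally makes the comparison explicit: even at the upper end of the chemiosmotic range, the autotrophic ATP gain stays below the gain from glucose fermentation.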

Conclusions

Construction of a genome-scale constraint-based metabolic model of Moorella thermoacetica, and subsequent growth simulations by the model identified the differences in energy conservation processes during the autotrophic and heterotrophic growth of this acetogen.

Although M. thermoacetica grows autotrophically only on H2 and CO2, it can use a variety of organic and inorganic compounds, including methanol, ethanol, formate, oxalate, pyruvate, lactate, nitrate, and nitrite, as electron donors and acceptors during homoacetogenic fermentative growth. Growth of this bacterium can now be analyzed with all of these substrates, in addition to the ones already analyzed in this study. Most importantly, the construction of iAI517 now enables researchers in the bioremediation community to develop a genome-scale community model for the KB-1 community by combining it with the models available for the other dominant members of the community. Such detailed modeling of microbial metabolism at the community level will be beneficial for understanding microbial interactions in the community, as well as for exploiting microbial communities in bioremediation and other biotechnology applications.

Table E1. General features of Moorella thermoacetica metabolic reconstruction

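| Category | Feature | Value |
|---|---|---|
| Genes | Total Number of Protein-Coding Genes | 2523 |
| Genes | Genes Included in the Model | 517 (21%) |
| Genes | Genes Excluded from the Model | 2006 (79%) |
| Intra-System Reactions | Total Number of Model Reactions | 812 |
| Intra-System Reactions | Gene-Associated Model Reactions | 777 |
| Intra-System Reactions | Non-Gene Associated Model Reactions | 35 |
| Exchange Reactions | Total Number of Exchange Reactions | 57 |
| Exchange Reactions | Input-Output Reactions | 56 |
| Exchange Reactions | Biomass Demand Reaction | 1 |
| Metabolites | Total Number of Metabolites | 809 |
| Metabolites | Number of Extracellular Metabolites | 57 |
| Metabolites | Number of Biomass Metabolites | 64 |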

Table E2. Overall cellular composition of a Moorella thermoacetica cell (Oh et al., 2007)

| Metabolite Class | % w/w composition |
|---|---|
| Protein | 52.85 |
| DNA | 2.60 |
| RNA | 6.55 |
| Lipid | 7.60 |
| Lipoteichoic acid | 3.04 |
| Cell wall components | 22.42 |
| Ion and metabolite | 4.94 |
| Sum | 100.00 |

Table E3. Protein composition of a Moorella thermoacetica cell (Pierce et al., 2008)

| Amino Acid | Count (all ORFs) † | % Prevalence | MW (g/mol) | %P*MW | % (by weight) | mmol/gDCW |
|---|---|---|---|---|---|---|
| Alanine (A) | 79490.00 | 10.37 | 89.05 | 923.89 | 7.23 | 0.4288 |
| Arginine (R) | 52645.00 | 6.87 | 175.11 | 1203.21 | 9.41 | 0.2840 |
| Asparagine (N) | 23361.00 | 3.05 | 132.05 | 402.63 | 3.15 | 0.1260 |
| Aspartic acid (D) | 34673.00 | 4.53 | 132.04 | 597.55 | 4.67 | 0.1870 |
| Cysteine (C) | 8138.00 | 1.06 | 121.02 | 128.54 | 1.01 | 0.0439 |
| Glutamate (E) | 51805.00 | 6.76 | 146.05 | 987.52 | 7.72 | 0.2795 |
| Glutamine (Q) | 27224.00 | 3.55 | 146.07 | 519.02 | 4.06 | 0.1469 |
| Glycine (G) | 65648.00 | 8.57 | 75.03 | 642.88 | 5.03 | 0.3541 |
| Histidine (H) | 13400.00 | 1.75 | 155.07 | 271.21 | 2.12 | 0.0723 |
| Isoleucine (I) | 46347.00 | 6.05 | 131.09 | 792.99 | 6.20 | 0.2500 |
| Leucine (L) | 86082.00 | 11.24 | 131.09 | 1472.84 | 11.52 | 0.4644 |
| Lysine (K) | 33094.00 | 4.32 | 147.11 | 635.43 | 4.97 | 0.1785 |
| Methionine (M) | 17155.00 | 2.24 | 149.05 | 333.73 | 2.61 | 0.0925 |
| Phenylalanine (F) | 26267.00 | 3.43 | 165.08 | 565.95 | 4.43 | 0.1417 |
| Proline (P) | 38383.00 | 5.01 | 115.06 | 576.42 | 4.51 | 0.2071 |
| Serine (S) | 33503.00 | 4.37 | 105.04 | 459.32 | 3.59 | 0.1807 |
| Threonine (T) | 38340.00 | 5.00 | 119.06 | 595.79 | 4.66 | 0.2068 |
| Tryptophan (W) | 8681.00 | 1.13 | 204.09 | 231.24 | 1.81 | 0.0468 |
| Tyrosine (Y) | 23295.00 | 3.04 | 181.07 | 550.53 | 4.31 | 0.1257 |
| Valine (V) | 58640.00 | 7.65 | 117.08 | 896.09 | 7.01 | 0.3163 |
| Sum | 766171 | 100 | 2736.31 | 12786.79 | 100 | |

Table E4. DNA composition of a Moorella thermoacetica cell (Pierce et al., 2008)

| DNA | % Prevalence | MW (g/mol) | %P*MW | % (by weight) | mmol/gDCW |
|---|---|---|---|---|---|
| dATP | 22.105 | 487.00 | 10765.15 | 22.30 | 0.0119 |
| dCTP | 27.895 | 462.99 | 12915.09 | 26.75 | 0.0150 |
| dGTP | 27.895 | 503.00 | 14031.07 | 29.06 | 0.0150 |
| dTTP | 22.105 | 477.99 | 10565.95 | 21.89 | 0.0119 |
| Sum | 100 | | 48277.27 | 100 | |

Table E5. RNA composition of a Moorella thermoacetica cell (Pierce et al., 2008)

| RNA | Count (all ORFs) | % Prevalence | MW (g/mol) | %P*MW | % (by weight) | mmol/gDCW |
|---|---|---|---|---|---|---|
| ATP | 582554.0 | 22.2 | 503.00 | 11146.68 | 22.49 | 0.0293 |
| CTP | 732124.0 | 27.9 | 478.98 | 13339.86 | 26.91 | 0.0368 |
| GTP | 734515.0 | 27.9 | 518.99 | 14501.25 | 29.25 | 0.0369 |
| UTP | 579591.0 | 22.0 | 479.97 | 10582.29 | 21.35 | 0.0291 |
| Sum | 2628784 | 100 | | 49570.07 | 100 | |

Table E6. Fatty acid composition of a Moorella thermoacetica cell (Yamamoto et al., 1998)

| Fatty Acid Components | % (by weight) | Normalized % w/w | MW (g/mol) | Averaged MW (g/mol) | % molar composition (mmol/gDCW) |
|---|---|---|---|---|---|
| C14:0 normal | 1.4 | 1.4156 | 227.4 | 3.2190 | 0.0127 |
| C15:0 branched (iso) | 39.7 | 40.1416 | 241.4 | 96.9017 | 0.3812 |
| C15:0 normal | 2.4 | 2.4267 | 241.4 | 5.8580 | 0.0230 |
| C16:0 normal | 21 | 21.2335 | 255.4 | 54.2305 | 0.2133 |
| C16:1 | 1.9 | 1.9211 | 253.4 | 4.8681 | 0.0191 |
| C17:0 branched (anteiso) | 28 | 28.3114 | 269.4 | 76.2709 | 0.3000 |
| C18:0 normal | 3.3 | 3.3367 | 283.5 | 9.4595 | 0.0372 |
| C18:1 | 1.2 | 1.2133 | 281.5 | 3.4155 | 0.0134 |
| Sum | 98.9 | 100 | 2053.4 | 254.2235 | 1 |
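The normalization behind Table E6 can be reproduced with a few lines of arithmetic. This is a minimal sketch assuming that the "Normalized % w/w" column rescales the reported weight percentages to sum to 100 and that the "Averaged MW" column is the normalized percentage multiplied by the molecular weight; both assumptions are inferred from the tabulated values.

```python
# Fatty acid data from Table E6: (reported % by weight, molecular weight in g/mol)
FATTY_ACIDS = {
    "C14:0 normal": (1.4, 227.4),
    "C15:0 branched (iso)": (39.7, 241.4),
    "C15:0 normal": (2.4, 241.4),
    "C16:0 normal": (21.0, 255.4),
    "C16:1": (1.9, 253.4),
    "C17:0 branched (anteiso)": (28.0, 269.4),
    "C18:0 normal": (3.3, 283.5),
    "C18:1": (1.2, 281.5),
}

# The reported percentages do not add up to 100, so rescale them
total_pct = sum(pct for pct, _ in FATTY_ACIDS.values())
normalized = {
    name: pct / total_pct * 100.0 for name, (pct, _) in FATTY_ACIDS.items()
}

# Weight-averaged molecular weight of the fatty acid pool (g/mol)
avg_mw = sum(normalized[name] * mw / 100.0 for name, (_, mw) in FATTY_ACIDS.items())

print(f"Sum of reported percentages: {total_pct:.1f}")
print(f"Normalized C15:0 iso: {normalized['C15:0 branched (iso)']:.4f} % w/w")
print(f"Averaged MW: {avg_mw:.2f} g/mol")
```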

Table E7. Lipid composition of a Moorella thermoacetica cell (Oh et al., 2007)

| Components (id) | Content % (w/w) * | MW (g/mol) | %P*MW | % (by weight) | mmol/gDCW |
|---|---|---|---|---|---|
| Lipoteichoic acid (n=24), linked, glucose substituted | 19 | 8445.3175 | 1604.6103 | 22.6795 | 0.0007 |
| Lipoteichoic acid (n=24), linked, N-acetyl-D-glucosamine | 19 | 9430.5895 | 1791.8120 | 25.3254 | 0.0006 |
| Lipoteichoic acid (n=24), linked, D-alanine substituted | 40 | 6692.1895 | 2676.8758 | 37.8348 | 0.0018 |
| Lipoteichoic acid (n=24), linked, unsubstituted | 22 | 4553.9335 | 1001.8654 | 14.1603 | 0.0015 |
| Sum | 100.0 | | 7075.2 | 100.0 | |

Table E8. Cell wall composition of a Moorella thermoacetica cell (Oh et al., 2007)

| Components (id) | Content % (w/w) * | MW (g/mol) | %P*MW | % (mole) | mmol/gDCW |
|---|---|---|---|---|---|
| Peptidoglycan subunit (peptido_MT) | 45 | 991 | 445.9500 | 6.2786 | 0.1018 |
| Glycerol teichoic acid (n=45), unlinked, unsubstituted (gtca1_45_MT) | 11.9 | 7373.6 | 877.4584 | 12.3539 | 0.0036 |
| Glycerol teichoic acid (n=45), unlinked, D-ala substituted (gtca2_45_MT) | 11.9 | 11382.8 | 1354.5532 | 19.0711 | 0.0023 |
| Glycerol teichoic acid (n=45), unlinked, glucose substituted (gtca3_45_MT) | 11.9 | 14688 | 1747.8720 | 24.6087 | 0.0018 |
| Minor teichoic acid (acetylgalactosamine glucose phosphate, n=30) (tcam_MT) | 19.3 | 13869.6 | 2676.8328 | 37.6877 | 0.0031 |
| Sum | 100.0 | | 7102.7 | 100.0 | |

Table E9. Ions and metabolites of a Moorella thermoacetica cell (Oh et al., 2007)

| Components (id) | Content % (w/w) * | MW (g/mol) | %P*MW | % (by weight) | mmol/gDCW |
|---|---|---|---|---|---|
| K | 86 | 39.1 | 33.626 | 5.3396 | 1.0865 |
| Mg | 7.7 | 24.3 | 1.8711 | 0.2971 | 0.1565 |
| Fe(+3) | 0.6 | 55.9 | 0.3354 | 0.0533 | 0.0053 |
| Ca | 0.4 | 40.1 | 0.1604 | 0.0255 | 0.0049 |
| Phosphate | 4.3 | 96 | 4.128 | 0.6555 | 0.0221 |
| Diphosphate | 0.5 | 174.9 | 0.8745 | 0.1389 | 0.0014 |
| Menaquinol 7 | 1 | 651 | 6.51 | 1.0337 | 0.0008 |
| 10-Formyltetrahydrofolate | 1 | 471.4 | 4.714 | 0.7485 | 0.0010 |
| NAD | 61.9 | 662.4 | 410.0256 | 65.1089 | 0.0462 |
| AMP | 9.3 | 345.2 | 32.1036 | 5.0978 | 0.0133 |
| ATP | 8.7 | 503.2 | 43.7784 | 6.9517 | 0.0085 |
| ADP | 6.3 | 424.2 | 26.7246 | 4.2437 | 0.0073 |
| CMP | 1.9 | 321.2 | 6.1028 | 0.9691 | 0.0029 |
| NADP | 4 | 740.4 | 29.616 | 4.7028 | 0.0027 |
| CTP | 1.5 | 479.1 | 7.1865 | 1.1412 | 0.0015 |
| GMP | 1.1 | 361.2 | 3.9732 | 0.6309 | 0.0015 |
| GTP | 1.3 | 519.1 | 6.7483 | 1.0716 | 0.0012 |
| CDP | 0.6 | 400.2 | 2.4012 | 0.3813 | 0.0007 |
| NADPH | 0.9 | 741.4 | 6.6726 | 1.0596 | 0.0006 |
| GDP | 0.5 | 440.2 | 2.201 | 0.3495 | 0.0006 |
| Sum | 100.0 | | 629.8 | 100.0 | |

Moorella thermoacetica biomass equation

109 h2o[c] + 109 atp[c] + 0.0462 nad[c] + 0.0027 nadp[c] + 0.2795 glu-L[c] + 0.0006 nadph[c] + 0.1469 gln-L[c] + 0.0133 amp[c] + 6e-006 coa[c] + 0.0291 utp[c] + 0.0925 met-L[c] + 0.1807 ser-L[c] + 0.0006 gdp[c] + 0.0369 gtp[c] + 0.0015 gmp[c] + 0.187 asp-L[c] + 0.4288 ala-L[c] + 0.1785 lys-L[c] + 0.0029 cmp[c] + 0.0007 cdp[c] + 0.284 arg-L[c] + 0.0368 ctp[c] + 0.126 asn-L[c] + 0.0468 trp-L[c] + 0.1417 phe-L[c] + 0.1257 tyr-L[c] + 0.0439 cys-L[c] + 0.3541 gly[c] + 0.001 10fthf[c] + 0.4644 leu-L[c] + 0.0199 datp[c] + 0.0723 his-L[c] + 0.3163 val-L[c] + 0.2071 pro-L[c] + 0.2068 thr-L[c] + 0.015 dgtp[c] + 0.0199 dttp[c] + 0.25 ile-L[c] + 0.015 dctp[c] + 0.0034 fe3[c] + 0.7063 k[c] + 0.1017 mg2[c] + 0.0008 mqn8[c] + 0.0032 ca2[c] + 0.011 d12dg_MT[c] + 0.0084 m12dg_MT[c] + 0.0066 t12dg_MT[c] + 0.0177 pgly_MT[c] + 0.0005 cdlp_MT[c] + 0.0022 lysylpgly_MT[c] + 0.0562 psetha_MT[c] + 0.0007 lipo1_24_MT[c] + 0.0006 lipo2_24_MT[c] + 0.0018 lipo3_24_MT[c] + 0.0015 lipo4_24_MT[c] + 0.0036 gtca1_45_MT[c] + 0.0023 gtca2_45_MT[c] + 0.0018 gtca3_45_MT[c] + 0.0031 tcam_MT[c] + 0.1018 peptido_MT[c] --> Biomass[c] + 0.0009 ppi[c] + 109 pi[c] + 109 h[c] + 109 adp[c]
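A reaction string in this notation can be turned into a stoichiometry table with a small parser. The sketch below is a hypothetical helper, not part of the original modeling workflow, and it uses an abbreviated version of the biomass equation (only a few of its terms) to keep the example short.

```python
def parse_side(side):
    """Parse 'coeff met + coeff met + ...' into {metabolite: coefficient}.

    A term without an explicit coefficient (e.g. 'Biomass[c]') counts as 1.
    """
    terms = {}
    for term in side.split(" + "):
        parts = term.split()
        if len(parts) == 2:
            coeff, met = float(parts[0]), parts[1]
        else:
            coeff, met = 1.0, parts[0]
        terms[met] = terms.get(met, 0.0) + coeff
    return terms


def parse_reaction(equation, arrow="-->"):
    """Split an irreversible reaction string into reactant and product dicts."""
    left, right = equation.split(arrow)
    return parse_side(left.strip()), parse_side(right.strip())


# Abbreviated biomass equation: a subset of the terms listed above
equation = (
    "109 h2o[c] + 109 atp[c] + 0.4288 ala-L[c] + 0.4644 leu-L[c] + "
    "6e-006 coa[c] --> Biomass[c] + 0.0009 ppi[c] + 109 pi[c] + "
    "109 h[c] + 109 adp[c]"
)
reactants, products = parse_reaction(equation)

# The ATP hydrolysis embedded in the biomass reaction should be balanced:
# the same number of ATP consumed as ADP and phosphate released.
atp_balanced = reactants["atp[c]"] == products["adp[c]"] == products["pi[c]"]
print("ATP consumed per gDCW (mmol):", reactants["atp[c]"])
print("ATP hydrolysis balanced:", atp_balanced)
```

Parsing the equation this way makes it straightforward to compare individual coefficients against the composition tables or to confirm that the ATP hydrolysis term is internally consistent.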