The enigmatic Calvin cycle of chemoautotrophic bacterial symbionts

Citation Dmytrenko, Oleg. 2018. The enigmatic Calvin cycle of chemoautotrophic bacterial symbionts. Doctoral dissertation, Harvard University, Graduate School of Arts & Sciences.

Citable link http://nrs.harvard.edu/urn-3:HUL.InstRepos:41129192

Terms of Use This article was downloaded from Harvard University’s DASH repository, and is made available under the terms and conditions applicable to Other Posted Material, as set forth at http://nrs.harvard.edu/urn-3:HUL.InstRepos:dash.current.terms-of-use#LAA

The enigmatic Calvin cycle of chemoautotrophic bacterial symbionts

A dissertation presented

by

Oleg Dmytrenko

to

The Department of Organismic and Evolutionary Biology

in partial fulfillment of the requirements

for the degree of

Doctor of Philosophy

in the subject of

Biology

Harvard University

Cambridge, Massachusetts

April 2018

© 2018 Oleg Dmytrenko All rights reserved.

Dissertation Advisor: Professor Colleen M. Cavanaugh Oleg Dmytrenko

The enigmatic Calvin cycle of chemoautotrophic bacterial symbionts

ABSTRACT

Symbiosis is a major driving force of biological diversity. Organisms within mutualistic symbiotic associations are capable of occupying new ecological niches that would otherwise have been inaccessible to the individual partners. Symbioses between chemoautotrophic bacteria and marine invertebrates are an example of such partnerships, in which the hosts benefit from organic carbon supplied by their symbiotic bacteria, while the symbionts profit from a steady supply of reduced inorganic compounds and electron acceptors sequestered and delivered to them by their hosts. These symbioses are able, for instance, to create lush oases of life surrounding hydrothermal vents in the deep sea amid otherwise barren surroundings. Perhaps the most enigmatic feature shared by all chemoautotrophic symbionts within the class Gammaproteobacteria is the lack of a gene encoding fructose 1,6-bisphosphatase (FBPase), an enzyme which in bacteria catalyzes two essential reactions in the Calvin-Benson-Bassham (Calvin) carbon fixation cycle. Yet chemoautotrophic bacterial symbionts are not only able to fix CO2 using the Calvin cycle, but are among the most prolific primary producers in the ocean. It has been hypothesized that a glycolytic pyrophosphate-dependent phosphofructokinase (PPi-PFK), acting in reverse, can perform the function of the missing FBPase in these bacteria. To test this hypothesis, in my thesis I investigated the ability of PPi-PFK from the symbionts of Solemya velum, a coastal protobranch bivalve, to perform the biochemical function of the missing FBPase. I detected high expression of the symbiont PPi-PFK-encoding gene and high reverse PPi-PFK activity in the symbiont-containing tissue of the host. The recombinant enzyme from the S. velum symbiont had the highest specificity for the reverse reaction compared to other bacterial PPi-PFKs and higher catalytic efficiency than many bacterial FBPases. By recreating the symbiont-like Calvin cycle in a closely related free-living purple gammaproteobacterium, Allochromatium vinosum, I demonstrated that in the absence of FBPase its function in the cycle can be performed by PPi-PFK. The shift from FBPase to PPi-PFK in A. vinosum came at the cost of reduced growth and decreased adaptability but offered an improvement in thermodynamic efficiency, potentially owing to the high-energy pyrophosphate generated by PPi-PFK in the Calvin cycle. Data presented in my thesis show that the selection of PPi-PFK over FBPase took place in all lineages of gammaproteobacterial chemoautotrophic symbionts. My results also demonstrate that PPi-PFK can perform the biochemical function of FBPase and may have become specifically adapted to this function in the symbionts. The feasibility of a Calvin cycle which uses PPi-PFK instead of FBPase was demonstrated in A. vinosum. The observed physiological changes accompanying the shift from FBPase to PPi-PFK in this bacterium suggest that such a transition could be advantageous to the symbionts. Living in a relatively constant and isolated host environment, these specialist bacteria may be selected for the thermodynamic efficiency which accompanies PPi-PFK use. Free-living bacterial generalists, on the other hand, would be severely disadvantaged by the associated decline in growth rate and adaptability, as they would become less fit to outgrow their competition and slower at adapting to fluctuations in environmental conditions. A proposed link between PPi-PFK reverse activity in the Calvin cycle and a sulfur oxidation pathway in chemoautotrophic symbionts may explain why a shift to PPi-PFK has not occurred in photoautotrophic symbionts and plastids, which obtain their energy from light instead of sulfide. These results advance our understanding of the key metabolic processes and evolutionary forces responsible for the origin and maintenance of chemoautotrophic symbioses.

TABLE OF CONTENTS

Title page

Copyright

Abstract

Table of Contents

Acknowledgements

Introduction

Chapter 1 The genome of the intracellular bacterium of the coastal bivalve, Solemya velum: a blueprint for thriving in and out of symbiosis

Chapter 2 The “missing enzyme” in the enigmatic Calvin cycle of chemoautotrophic bacterial symbionts

Chapter 3 The enigmatic Calvin cycle of chemoautotrophic bacterial symbionts deciphered

Conclusion

Appendix 1 Supplementary material for Chapter 1

Appendix 2 Supplementary material for Chapter 2

Appendix 3 Supplementary material for Chapter 3

ACKNOWLEDGEMENTS

I would like to thank Colleen Cavanaugh and all the current and former members of the Cavanaugh lab for their help and support throughout my PhD years. I am grateful for their guidance in developing my research ideas and methodologies, sharing many fun moments in the laboratory, and, of course, commiserating. In particular I would like to thank Colleen for trusting me with finding a research project. I am very grateful for her continuous enthusiasm and support of my research and uncompromising scientific rigor. I would like to thank Kristina Fontanez and Guus Roeselers for getting me started with stimulating projects in the laboratory early in my graduate career. Finally, I am thankful to Alicja Kunikowska and Daniel Utter for providing invaluable feedback on my thesis chapters.

I deeply appreciate the time, commitment, support, and guidance from my committee members, Edward DeLong, Peter Girguis, Hopi Hoekstra, and Christopher Marx. The extensive expertise and wisdom they brought to our meetings were tremendously helpful in refining my research direction, contextualizing data, and exploring new ways of answering nascent questions. I am deeply indebted to Peter Girguis for his scientific and career advice, for letting me use his lab equipment, and for helping me navigate the graduate program. I would like to thank Christopher Marx for helping me take my first steps in reverse genetics. I owe big thanks to Edward DeLong for letting me into his lab and providing funding and resources to study gene expression in the Solemya velum symbionts. I am incredibly thankful to Hopi Hoekstra for joining my committee well into my PhD and bringing along valuable evolutionary and genetic insights.

I would also like to thank everyone who has helped me learn and develop new experimental techniques and methodologies. I am particularly grateful to Frank Stewart, who supervised my S. velum symbiont transcriptome study in the laboratory of Edward DeLong at Massachusetts Institute of Technology. Molecular genetic experiments with Allochromatium vinosum were made possible in large part due to generous advice as well as bacterial strains provided by Christiane Dahl and Renate Zigann. I would also like to thank Dipti Nayak, Nicole De Nisco, Paige Swanson, and Anna Wang for sharing with me their plasmid stocks. A great many experimental measurements in my thesis were made possible thanks to Matthew Meselson, who has generously given me access to his laboratory. Stable isotope experiments would not have been possible without advice from Wiebke Mohr, Tiantian Tang, and Daniel Hoer. Sequencing and analysis of the S. velum symbiont genome was a large team effort which came to fruition thanks to contributions from Shelbi Russell, Wesley Loo, Kristina Fontanez, Li Liao, Guus Roeselers, Irene Newton, Frank Stewart, John Eppley, Tanja Woyke, Jenna Morgan Lang, Raghav Sharma, Dongying Wu, and Jonathan Eisen.

A great many people, including Chris Preheim, Lydia Carmosino, and Elena Kramer, deserve thanks for guiding me through the graduate program and making it such a positive and memorable experience. I would especially like to thank Elena Kramer, Peter Girguis, and Rebecca Chetham for securing funding I needed to complete my thesis research. For excellent teaching experience in their courses I would like to express my gratitude to Joshua Sanes, Jeff Lichtman, Maryellen Ruvolo, Pardis Sabeti, Hopi Hoekstra, and Andrew Berry. For administrative, technical, and moral support I am particularly indebted to Madeleine Marino, Nikki Hughes, Bridget Power, Jason Green, and Kendall Winters.

I owe big thanks to Shelbi Russell, Guus Roeselers, Chris Baker, Mark Comerford, and Alicja Kunikowska for helping me collect specimens of Solemya velum.

I greatly appreciate being part of the Quincy House community throughout my time at Harvard. The students, the tutors, and the resident deans, Lee and Deb Gehrke, enriched my life in myriad ways as I served as a resident tutor in Quincy.

I am infinitely grateful to my family for their love, incredible patience, understanding, and support throughout my graduate studies. I admire my wife, Alicja Kunikowska, for her inquisitive mind, impeccable work ethic, and sense of humor. Thank you for being there for me. My sister, Olga Dmytrenko, has been an unwavering source of support and a tireless travel companion. My father, Volodymyr Dmytrenko, has taught me to persevere even under the toughest of circumstances. My mother, Svitlana Dmytrenko, has always encouraged my scientific interests, shared my curiosity for biology and chemistry, taught me the value of hard work, and has been a true inspiration. Without all of you I would have never made it this far.

INTRODUCTION

Symbiosis was first defined by the botanist and mycologist Anton de Bary in 1878 as "the living together of differently named organisms" (de Bary 1878). This original definition of symbiosis included a wide range of associations as far apart as parasitism and mutualism, but in contemporary literature the term is most commonly used to describe an interaction that benefits both the host and the symbiont and persists over their lifetimes. Symbioses have been instrumental in the evolution of eukaryotic cells: according to the endosymbiotic theory (Mereschkowsky 1905; Sagan 1967; de Duve 2007), mitochondria and plastids originated from a purple non-sulfur bacterium (John & Whatley 1975; Gray et al. 1999; Andersson et al. 2003; Cavalier-Smith 2006) and a cyanobacterium (Cavalier-Smith 1982; McFadden & van Dooren 2004; Bhattacharya et al. 2007), respectively. Symbiosis is one of the major driving forces of biological diversity on Earth. It allows its partners to occupy otherwise inaccessible ecological niches. Associations between plants and bacteria are able to colonize terrestrial environments poor in ammonium and nitrate, the biologically available forms of nitrogen (Gibson et al. 2008). For ruminants and termites, partnerships with cellulose-digesting bacteria enable access to limited nutrients through feeding on plants (Krause et al. 2013; Brune 2014). Symbioses between chemolithoautotrophic bacteria and marine invertebrates colonize habitats in the deep sea, creating oases of life in contrast to their barren, food-limited benthic surroundings (Cavanaugh et al. 2013).

Chemoautotrophic symbioses

Symbioses between chemoautotrophic bacteria and marine invertebrates are ubiquitous in environments featuring gradients of reduced inorganic molecules such as sulfide (Felbeck et al. 1981; Cavanaugh 1983), methane (Cavanaugh et al. 1992), or hydrogen (Petersen et al. 2011) and oxygen (Dubilier et al. 2008; Cavanaugh et al. 2013). Chemoautotrophic symbionts, which primarily belong to the class Gammaproteobacteria, oxidize reduced sulfur compounds with oxygen and use this energy to fix inorganic carbon into biomass, primarily through the Calvin-Benson-Bassham (Calvin) cycle (Stewart et al. 2005; Childress & Girguis 2011). The resulting organic carbon is supplied to the host either through secretion (Fisher & Childress 1986; Bright et al. 2000; Ponsard et al. 2013) or digestion of the symbionts (Fisher & Childress 1992). Some symbionts also supply their hosts with nitrogenous compounds (Lee et al. 1999). The reliance on symbionts for nutrition is so prominent that many host organisms do not have their own gut and are incapable of filter-feeding (Reid & Bernard 1980; Krueger et al. 1992; Cavanaugh et al. 2013). Symbiont hosts have evolved behavioral, physiological, and biochemical adaptations for capturing energy substrates and electron acceptors and delivering them to the symbionts. Some invertebrates, for example Solemya velum, build Y-shaped burrows in reduced coastal sediments and position themselves at the junction to gain access to the overlying oxygenated water and to sulfide from below (Stanley 1970; Cavanaugh 1983; Roeselers & Newton 2012). Certain symbiotic hosts, including nematodes and ciliates, migrate along the gradient of electron acceptors and donors (Polz et al. 2000). Some hosts also possess hemoglobin molecules which are capable of reversibly binding sulfide and oxygen for delivery to the symbionts (Doeller et al. 1988; Hourdez & Weber 2005; Bailly & Vinogradov 2005; Flores et al. 2005). Since the discovery of hydrothermal vents (Corliss & Ballard 1977; Corliss et al. 1979) and, subsequently, of chemoautotrophic symbioses at the Galápagos Rift (Cavanaugh et al. 1981; Felbeck et al. 1981), basic principles governing these partnerships between bacteria and eukaryotes have been established. Yet the life cycle of the bacterial symbionts, their recruitment by the host, and communication between the partners remain largely unexplored. In comparison with our understanding of non-symbiotic model organisms, such as Escherichia coli, Caenorhabditis elegans, Danio rerio, Drosophila melanogaster, or Mus musculus, we have barely begun.

The study of chemoautotrophic symbionts is severely limited by our current inability to grow symbiotic bacteria in pure culture outside of the host and to maintain symbiont-free hosts. Tools for genetically manipulating symbionts and their hosts are still to be developed. Thus, chemoautotrophic symbioses have been primarily studied using indirect methods. Location and morphology of symbionts are investigated using light (Ponsard et al. 2013; Eichinger et al. 2014), confocal (Bettencourt et al. 2014; Volland et al. 2018), and electron microscopy (Conway, Howes, et al. 1992; Gros et al. 2012; Klose et al. 2016). Measurements of enzymatic activity in tissue extracts and characterization of recombinant enzymes have provided biochemical evidence for metabolic processes involved in the functioning of symbioses. One of the first enzymes detected in chemoautotrophic symbionts was ribulose 1,5-bisphosphate carboxylase/oxygenase (RuBisCO) (Felbeck et al. 1981; Cavanaugh 1983), which catalyzes the key CO2 incorporation step in the Calvin cycle. Activity of enzymes such as ATP sulfurylase (SAT) was detected in symbionts inferred to derive ATP and reducing equivalents from sulfide oxidation (Felbeck et al. 1981; Chen et al. 1987; Fisher et al. 1993). Methanol dehydrogenase activity, central to methane-oxidizing metabolism, was detected in cell-free extracts containing methylotrophic symbionts of Bathymodiolus platifrons (Barry et al. 2002). In comparison, heterologous expression and subsequent characterization of symbiont genes in E. coli has been less common (Millikan et al. 1999; Schwedock et al. 2004). Using pulse-chase tracer experiments, it has been possible to postulate metabolic pathways and metabolite fluxes occurring within symbioses (Felbeck 1983; Felbeck & Turner 1995; Volland et al. 2018). Analysis of stable carbon, nitrogen, and sulfur isotopes has been a powerful tool in uncovering metabolic networks and trophic levels (Rau et al. 1990; Conway, Capuzzo, et al. 1992; Lee & Childress 1994; Robinson et al. 2003). Experimental manipulations of symbionts have involved controlled laboratory incubations of whole organisms (Girguis et al. 2002), preparation of cell-free extracts (Vacelet et al. 1996; Lee et al. 1999), symbiont enrichments (Scott & Cavanaugh 2007), and measurements of select metabolites (Liao et al. 2013). PCR amplification and sequencing of symbiont and host genes has become commonplace (Laue & Nelson 1994; Robinson et al. 1998; Stewart et al. 2008; Russell et al. 2017), and is now more frequently superseded by whole genome and metagenome sequencing (Woyke et al. 2006; Newton et al. 2007; Robidart et al. 2008; Dmytrenko et al. 2014). Genomic studies have been complemented by transcriptomics (Stewart et al. 2011; Sanders et al. 2013; Seston et al. 2016), proteomics (Markert et al. 2011), and metabolomics (Kleiner et al. 2012), allowing direct investigation of predicted functional capabilities. In recent years a vast increase in sequence data from diverse symbiotic associations has occurred. Analysis of these data brought about a surge in hypotheses about the function and activity of chemoautotrophic symbioses. For example, based on metagenomic data, multiple toxin-like genes have been hypothesized in the symbionts of deep-sea Bathymodiolus (Sayavedra et al. 2015) and recycling of host urea has been proposed in the symbionts of Olavius algarvensis (Woyke et al. 2006). However, the majority of these hypotheses remain untested due to the lack of suitable tools which can be robustly applied to symbiotic systems.

Novel Calvin cycle in chemoautotrophic bacterial symbionts

In my thesis I set out to identify a key hypothesis in the field of chemoautotrophic symbiosis using genomic and transcriptomic data. Next, I developed a means of testing this hypothesis in a way that overcame the limitations imposed by our current inability to grow and genetically manipulate symbionts in pure culture. Finally, I interpret the combined results in the context of our current understanding of the evolution and functioning of chemoautotrophic symbioses.

As a model symbiotic system, I chose to study the symbiosis between a protobranch bivalve, Solemya velum, and its gammaproteobacterial chemoautotrophic endosymbiont (Cavanaugh 1983), which is closely related to other chemoautotrophic symbionts and a number of well-characterized free-living bacteria, including Allochromatium vinosum (Weissgerber et al. 2011). The S. velum symbiosis is one of the best studied chemoautotrophic symbioses, with well-characterized physiology and ecology (Stewart & Cavanaugh 2006; Scott & Cavanaugh 2007; Russell & Cavanaugh 2017). Using energy from the oxidation of sulfide, the symbionts are known to fix CO2 using ribulose 1,5-bisphosphate carboxylase/oxygenase (RuBisCO), the key enzyme in the Calvin cycle, and are thought to feed their host with the resulting organic carbon (Cavanaugh 1983; Conway & McDowell Capuzzo 1991; Scott & Cavanaugh 2007).

In Chapter 1 of my thesis the genome of the S. velum symbiont was analyzed and compared to the genomes of other sequenced symbionts and closely related free-living bacteria. In this analysis the extent of genome reduction, typical of many intracellular symbionts, was evaluated. Genes specific to the symbiotic lifestyle were identified. Metabolic pathways and cellular processes were inferred from sequence data. The S. velum symbiont genome (2.7 Mb) was comparable in size to the genomes of many free-living bacteria, had high GC content (51%), and carried a large number of mobile genetic elements, which are less common in obligate vertically transmitted intracellular bacteria (Newton & Bordenstein 2011). Unlike symbionts from oligotrophic environments, the symbionts of S. velum contained genes which encoded the complete TCA and glyoxylate cycles, DMSO and urea reductases, and a highly branched electron transport chain. Just like the first sequenced chemoautotrophic symbiont, that of Calyptogena magnifica (Newton et al. 2008), and other gammaproteobacterial chemoautotrophic symbionts sequenced to date, the symbiont of S. velum lacked a gene for fructose 1,6-bisphosphatase (FBPase), an enzyme which catalyzes two essential reactions in the Calvin cycle. It was hypothesized that in the S. velum symbionts a bidirectional pyrophosphate-dependent phosphofructokinase (PPi-PFK) can perform the function of the missing FBPase when operating in reverse, a possibility which was originally proposed for the symbiont of C. magnifica (Newton et al. 2008).
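The substitution hypothesized above can be written out as reaction equations. This is a sketch under a stated assumption: the text does not name the two FBPase reactions, so they are taken here to be the dephosphorylation of fructose 1,6-bisphosphate and of sedoheptulose 1,7-bisphosphate, as is typical of bifunctional bacterial FBPases. The last line is the standard reversible PPi-PFK reaction written in the reverse (FBPase-like) direction.

```latex
% Requires \usepackage{amsmath}
\begin{align*}
\text{FBPase:}\quad
  & \text{fructose 1,6-bisphosphate} + \mathrm{H_2O}
    \longrightarrow \text{fructose 6-phosphate} + \mathrm{P_i} \\
  & \text{sedoheptulose 1,7-bisphosphate} + \mathrm{H_2O}
    \longrightarrow \text{sedoheptulose 7-phosphate} + \mathrm{P_i}
    \quad \text{(assumed second reaction)} \\
\text{PPi-PFK (reverse):}\quad
  & \text{fructose 1,6-bisphosphate} + \mathrm{P_i}
    \rightleftharpoons \text{fructose 6-phosphate} + \mathrm{PP_i}
\end{align*}
```

Written this way, the difference is that FBPase releases inorganic phosphate by hydrolysis, whereas reverse PPi-PFK transfers the phosphoryl group to phosphate and yields pyrophosphate.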

In Chapter 2 of my thesis, transcriptional activity in the S. velum symbionts was examined, with a focus on the genes involved in sulfur oxidation and carbon fixation through the Calvin cycle, including pfp, which encodes PPi-PFK. Next, PPi-PFK activity in the symbiont-containing cell-free extracts was measured, and purified recombinant PPi-PFK was characterized. High pfp transcription and high reverse PPi-PFK enzymatic activity were detected in the symbiont-containing gill tissue of S. velum. Purified PPi-PFK had high specificity for the reverse reaction and higher catalytic efficiency than most bacterial FBPases. Finally, a multi-gene time-calibrated Bayesian phylogeny was constructed to investigate the presence or absence of PPi-PFK and FBPase in extant chemoautotrophic symbionts and free-living bacteria and to infer their ancestral states. Ancestral state reconstruction showed that the shift from FBPase to PPi-PFK occurred in the evolutionary histories of all analyzed chemoautotrophic symbionts. Together, these data support the hypothesis that PPi-PFK can perform the biochemical function of FBPase in the S. velum symbionts and may be essential to their evolution and maintenance.
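For readers unfamiliar with the kinetic terms used above, the following minimal sketch shows how catalytic efficiency (kcat/Km) and a simple reverse-over-forward preference ratio could be computed from Michaelis-Menten parameters. The class and function names are hypothetical, the numbers are invented for illustration and are not measurements from this thesis, and the thesis may define "specificity for the reverse reaction" differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KineticParams:
    """Michaelis-Menten parameters for one enzyme and one reaction direction."""
    kcat_per_s: float  # turnover number, s^-1
    km_mM: float       # Michaelis constant, mM


def catalytic_efficiency(p: KineticParams) -> float:
    """Return kcat/Km in s^-1 mM^-1."""
    if p.km_mM <= 0:
        raise ValueError("Km must be positive")
    return p.kcat_per_s / p.km_mM


def reverse_preference(forward: KineticParams, reverse: KineticParams) -> float:
    """Ratio of reverse to forward catalytic efficiency (values > 1 favor reverse)."""
    return catalytic_efficiency(reverse) / catalytic_efficiency(forward)


# Hypothetical values for illustration only (assumption, not thesis data).
forward = KineticParams(kcat_per_s=50.0, km_mM=0.5)    # F6P + PPi -> FBP + Pi
reverse = KineticParams(kcat_per_s=200.0, km_mM=0.25)  # FBP + Pi -> F6P + PPi

if __name__ == "__main__":
    print("forward kcat/Km:", catalytic_efficiency(forward))
    print("reverse kcat/Km:", catalytic_efficiency(reverse))
    print("reverse/forward:", reverse_preference(forward, reverse))
```

Comparing kcat/Km rather than kcat alone accounts for substrate affinity, which is why it is the usual basis for statements such as "higher catalytic efficiency than most bacterial FBPases".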

Chapter 3 investigated the ability of PPi-PFK to perform the function of FBPase in the Calvin cycle. Owing to the limitations of working with uncultured chemoautotrophic symbionts, the symbiont-like Calvin cycle was reconstructed in a closely related free-living purple sulfur bacterium, Allochromatium vinosum. To study the physiological changes associated with the loss of FBPase and the use of PPi-PFK during autotrophic growth, an anaerobic bioreactor was built which continuously monitored cell growth, sulfide oxidation, pH, temperature, and light intensity. CO2 fixation rates, total protein concentrations, and ATP levels of the cultures were also measured. The obtained data showed that the shift from FBPase to PPi-PFK during autotrophic growth using the Calvin cycle is associated with a decrease in growth and adaptability, but offers a significant increase in thermodynamic efficiency. These results provide further evidence for the alternative Calvin cycle hypothesized in chemoautotrophic symbionts and offer insights into the associated energy-saving mechanisms potentially coupled to sulfur metabolism.

Taken together, these results demonstrate that PPi-PFK can catalyze the same reactions as FBPase in the Calvin cycle of sulfur-oxidizing bacteria, particularly in chemoautotrophic symbionts. These findings suggest that the proposed function of PPi-PFK in the Calvin cycle may be essential to the origin and maintenance of chemoautotrophic symbionts. Furthermore, the adapted experimental approach illustrates the feasibility of experimentally testing hypotheses which originate from sequence data of uncultured microorganisms by applying molecular genetics in closely-related free-living bacteria.

References

Andersson, S. et al., 2003. On the origin of mitochondria: a genomics perspective. Philosophical Transactions of the Royal Society of London Series B-Biological Sciences, 358(1429), pp.165–177.

Bailly, X. & Vinogradov, S., 2005. The sulfide binding function of hemoglobins: relic of an old biosystem? Journal of Inorganic Biochemistry, 99(1), pp.142–150.

Barry, J. et al., 2002. Methane-based symbiosis in a mussel, Bathymodiolus platifrons, from cold seeps in Sagami Bay, Japan. Invertebrate Biology, 121(1), pp.47–54.

Bettencourt, R. et al., 2014. Site-related differences in gene expression and bacterial densities in the mussel Bathymodiolus azoricus from the Menez Gwen and Lucky Strike deep-sea hydrothermal vent sites. Fish & Shellfish Immunology, 39(2), pp.343–353.

Bhattacharya, D. et al., 2007. How do endosymbionts become organelles? Understanding early events in plastid evolution. BioEssays, 29(12), pp.1239–1246.

Bright, M., Keckeis, H. & Fisher, C., 2000. An autoradiographic examination of carbon fixation, transfer and utilization in the Riftia pachyptila symbiosis. Marine Biology, 136(4), pp.621–632.

Brune, A., 2014. Symbiotic digestion of lignocellulose in termite guts. Nature Reviews Microbiology, 12(3), pp.168–180.

Cavalier-Smith, T., 2006. Origin of mitochondria by intracellular enslavement of a photosynthetic purple bacterium. Proceedings of the Royal Society B: Biological Sciences, 273(1596), pp.1943–1952.

Cavalier-Smith, T., 1982. The origins of plastids. Biological Journal of the Linnean Society, 17(3), pp.289–306.

Cavanaugh, C.M., 1983. Symbiotic chemoautotrophic bacteria in marine invertebrates from sulphide-rich habitats. Nature, 302, pp.58–61.

Cavanaugh, C.M. et al., 1981. Prokaryotic cells in the hydrothermal vent tube worm Riftia pachyptila Jones: possible chemoautotrophic symbionts. Science, 213(4505), pp.340–342.

Cavanaugh, C.M., Wirsen, C. & Jannasch, H., 1992. Evidence for methylotrophic symbionts in a hydrothermal vent mussel (Bivalvia: Mytilidae) from the Mid-Atlantic Ridge. Applied and Environmental Microbiology, 58(12), pp.3799–3803.

Cavanaugh, C.M. et al., 2013. Marine chemosynthetic symbioses. In The Prokaryotes. Berlin Heidelberg: Springer Berlin Heidelberg, pp. 579–607.

Chen, C., Rabourdin, B. & Hammen, C., 1987. The effect of hydrogen sulfide on the metabolism of Solemya velum and enzymes of sulfide oxidation in gill tissue. Comparative Biochemistry and Physiology Part B: Biochemistry and Molecular Biology, 88(3), pp.949–952.

Childress, J.J. & Girguis, P.R., 2011. The metabolic demands of endosymbiotic chemoautotrophic metabolism on host physiological capacities. The Journal of experimental biology, 214(2), pp.312–325.

Conway, N. & McDowell Capuzzo, J., 1991. Incorporation and utilization of bacterial lipids in the Solemya velum symbiosis. Marine Biology, 108(2), pp.277–291.

Conway, N., Capuzzo, M. & Judith, E., 1992. High taurine levels in the Solemya velum symbiosis. Comparative Biochemistry and Physiology Part B: Biochemistry and Molecular Biology, 102(1), pp.175–185.

Conway, N., Howes, B., et al., 1992. Characterization and site description of Solemya borealis (Bivalvia; Solemyidae), another bivalve-bacteria symbiosis. Marine Biology, 112(4), pp.601–613.

Corliss, J. et al., 1979. Submarine thermal springs on the Galápagos Rift. Science, 203(4385), pp.1073–1083.

Corliss, J.B. & Ballard, R.D., 1977. Oases of life in the cold abyss. National Geographic Magazine, 152(4), pp.441–452.

de Bary, A., 1878. Die Erscheinung der Symbiose. Verlag von Karl J. Trübner, Strassburg. English translation in Oulhen, N., Schulz, B. J., Carrier, T. J. (2016). English translation of Heinrich Anton de Bary’s 1878 speech, ‘Die Erscheinung der Symbiose’ (‘De la symbiose’). Symbiosis, 69, pp.131–139.

de Duve, C., 2007. The origin of eukaryotes: a reappraisal. Nature Reviews Genetics, 8(5), pp.395–403.

Dmytrenko, O. et al., 2014. The genome of the intracellular bacterium of the coastal bivalve, Solemya velum: a blueprint for thriving in and out of symbiosis. BMC Genomics, 15(924), pp.1–20.

Doeller, J. et al., 1988. Gill hemoglobin may deliver sulfide to bacterial symbionts of Solemya velum (Bivalvia, Mollusca). The Biological Bulletin, 175(3), pp.388–396.

Dubilier, N., Bergin, C. & Lott, C., 2008. Symbiotic diversity in marine animals: the art of harnessing chemosynthesis. Nature Reviews Microbiology, 6(10), pp.725–740.

Eichinger, I. et al., 2014. Symbiont-driven sulfur crystal formation in a thiotrophic symbiosis from deep-sea hydrocarbon seeps. Environmental Microbiology Reports, 6(4), pp.364–372.

Felbeck, H., 1983. Sulfide oxidation and carbon fixation by the gutless clam Solemya reidi: an animal-bacteria symbiosis. Journal of Comparative Physiology, 152(1), pp.3–11.

Felbeck, H. & Turner, P., 1995. CO2 transport in catheterized hydrothermal vent tubeworms, Riftia pachyptila (Vestimentifera). Journal Of Experimental Zoology, 272(2), pp.95–102.

Felbeck, H., Childress, J.J. & Somero, G.N., 1981. Calvin-Benson cycle and sulphide oxidation enzymes in animals from sulphide-rich habitats. Nature, 293(5830), pp.291–293.

Fisher, C. & Childress, J., 1992. Organic carbon transfer from methanotrophic symbionts to the host hydrocarbon-seep mussel. Symbiosis, 12(3), pp.221–235.

Fisher, C. & Childress, J., 1986. Translocation of fixed carbon from symbiotic bacteria to host tissues in the gutless bivalve Solemya reidi. Marine Biology, 93(1), pp.59–68.

Fisher, C.R. et al., 1993. The Co-occurrence of Methanotrophic and Chemoautotrophic Sulfur-Oxidizing Bacterial Symbionts in a Deep-sea Mussel. Marine Ecology, 14(4), pp.277–289.

Flores, J.F. et al., 2005. Sulfide binding is mediated by zinc ions discovered in the crystal structure of a hydrothermal vent tubeworm hemoglobin. Proceedings of the National Academy of Sciences of the United States of America, 102(8), pp.2713–2718.

Gibson, K., Kobayashi, H. & Walker, G., 2008. Molecular determinants of a symbiotic chronic infection. Annual Review of Genetics, 42, pp.413–441.

Girguis, P.R. et al., 2002. Effects of metabolite uptake on proton-equivalent elimination by two species of deep-sea vestimentiferan tubeworm, Riftia pachyptila and Lamellibrachia cf luymesi: proton elimination is a necessary adaptation to sulfide-oxidizing chemoautotrophic symbionts. The Journal of experimental biology, 205(19), pp.3055–3066.

Gray, M.W., Burger, G. & Lang, B.F., 1999. Mitochondrial evolution. Science, 283(5407), pp.1476–1481.

Gros, O. et al., 2012. Plasticity of symbiont acquisition throughout the life cycle of the shallow-water tropical lucinid Codakia orbiculata (Mollusca: Bivalvia). Environmental Microbiology, 14(6), pp.1584–1595.

Hourdez, S. & Weber, R.E., 2005. Molecular and functional adaptations in deep-sea hemoglobins. Journal of Inorganic Biochemistry, 99(1), pp.130–141.

John, P. & Whatley, F.R., 1975. Paracoccus denitrificans and the evolutionary origin of the mitochondrion. Nature, 254(5500), pp.495–498.

Kleiner, M. et al., 2012. Metaproteomics of a gutless marine worm and its symbiotic microbial community reveal unusual pathways for carbon and energy use. Proceedings of the National Academy of Sciences of the United States of America, 109(19), pp.1173–1182.

Klose, J. et al., 2016. Trophosome of the deep-sea tubeworm Riftia pachyptila inhibits bacterial growth. S. Duperron, ed. PLoS ONE, 11(1), p.e0146446.

Krause, D.O. et al., 2013. Board-invited review: Rumen microbiology: leading the way in microbial ecology. Journal of Animal Science, 91(1), pp.331–341.

Krueger, D.M., Gallager, S. & Cavanaugh, C.M., 1992. Suspension feeding on phytoplankton by Solemya velum, a symbiont-containing clam. Marine Ecology Progress Series, 86(2), pp.145–151.

Laue, B.E. & Nelson, D.C., 1994. Characterization of the gene encoding the autotrophic ATP sulfurylase from the bacterial endosymbiont of the hydrothermal vent tubeworm Riftia pachyptila. Journal of Bacteriology, 176(12), pp.3723–3729.

Lee, R., Robinson, J. & Cavanaugh, C.M., 1999. Pathways of inorganic nitrogen assimilation in chemoautotrophic bacteria-marine invertebrate symbioses: expression of host and symbiont glutamine synthetase. The Journal of Experimental Biology, 202, pp.289–300.

Lee, R.W. & Childress, J.J., 1994. Assimilation of inorganic nitrogen by marine invertebrates and their chemoautotrophic and methanotrophic symbionts. Applied and Environmental Microbiology, 60(6), pp.1852–1858.

Liao, L. et al., 2013. Characterizing the plasticity of nitrogen metabolism by the host and symbionts of the hydrothermal vent chemoautotrophic symbioses Ridgeia piscesae. Molecular Ecology, 23(6), pp.1544–1557.

Markert, S. et al., 2011. Status quo in physiological proteomics of the uncultured Riftia pachyptila endosymbiont. Proteomics, 11(15), pp.3106–3117.

McFadden, G.I. & van Dooren, G.G., 2004. Evolution: red algal genome affirms a common origin of all plastids. Current Biology, 14(13), pp.R514–6.

Mereschkowsky, C., 1905. Über Natur und Ursprung der Chromatophoren im Pflanzenreiche. Biol. Centralbl., 25, pp.593–604. English translation in Martin, W. & Kowallik, K.V., 1999. Annotated English translation of Mereschkowsky’s 1905 paper “Über Natur und Ursprung der Chromatophoren im Pflanzenreiche.” European Journal of Phycology, 34, pp.287–295.

Millikan, D.S., Felbeck, H. & Stein, J.L., 1999. Identification and characterization of a flagellin gene from the endosymbiont of the hydrothermal vent tubeworm Riftia pachyptila. Applied and Environmental Microbiology, 65(7), pp.3129–3133.

Newton, I. et al., 2007. The Calyptogena magnifica chemoautotrophic symbiont genome. Science, 315(5814), pp.998–1000.

Newton, I., Girguis, P.R. & Cavanaugh, C.M., 2008. Comparative genomics of vesicomyid clam (Bivalvia: Mollusca) chemosynthetic symbionts. BMC Genomics, 9(1), pp.1–13.

Newton, I.L.G. & Bordenstein, S.R., 2011. Correlations Between Bacterial Ecology and Mobile DNA. Current Microbiology, 62(1), pp.198–208.

Petersen, J.M. et al., 2011. Hydrogen is an energy source for hydrothermal vent symbioses. Nature, 476(7359), pp.176–180.

Polz, M. et al., 2000. When bacteria hitch a ride. ASM News, 66(9), pp.531–539.

Ponsard, J. et al., 2013. Inorganic carbon fixation by chemosynthetic ectosymbionts and nutritional transfers to the hydrothermal vent host-shrimp Rimicaris exoculata. The ISME Journal, 7(1), pp.96–109.

Rau, G. et al., 1990. δ13C, δ15N and δ18O of Calyptogena phaseoliformis (bivalve mollusc) from the Ascension Fan-Valley near Monterey, California. Deep-Sea Research Part A. Oceanographic Research Papers, 37(11), pp.1669–1676.

Reid, R. & Bernard, F., 1980. Gutless bivalves. Science, 208(4444), p.609.

Robidart, J. et al., 2008. Metabolic versatility of the Riftia pachyptila endosymbiont revealed through metagenomics. Environmental Microbiology, 10(3), pp.727–737.

Robinson, J. et al., 2003. Kinetic isotope effect and characterization of form II RubisCO from the chemoautotrophic endosymbionts of the hydrothermal vent tubeworm Riftia pachyptila. Limnology and Oceanography, 48(1), pp.48–54.

Robinson, J., Stein, J. & Cavanaugh, C.M., 1998. Cloning and sequencing of a form II ribulose-1,5-bisphosphate carboxylase/oxygenase from the bacterial symbiont of the hydrothermal vent tubeworm Riftia pachyptila. Journal of Bacteriology, 180(6), p.1596.

Roeselers, G. & Newton, I.L.G., 2012. On the evolutionary ecology of symbioses between chemosynthetic bacteria and bivalves. Applied Microbiology and Biotechnology, 94(1), pp.1–10.


Russell, S.L. & Cavanaugh, C.M., 2017. Intrahost Genetic Diversity of Bacterial Symbionts Exhibits Evidence of Mixed Infections and Recombinant Haplotypes. Molecular Biology and Evolution, 34(11), pp.2747–2761.

Russell, S.L., Corbett-Detig, R.B. & Cavanaugh, C.M., 2017. Mixed transmission modes and dynamic genome evolution in an obligate animal–bacterial symbiosis. The ISME Journal, 11, pp.1359–1371.

Sagan, L., 1967. On the origin of mitosing cells. Journal of Theoretical Biology, 14(3), pp.225–274.

Sanders, J.G. et al., 2013. Metatranscriptomics reveal differences in in situ energy and nitrogen metabolism among hydrothermal vent snail symbionts. The ISME Journal, 7(8), pp.1556–1567.

Sayavedra, L. et al., 2015. Abundant toxin-related genes in the genomes of beneficial symbionts from deep-sea hydrothermal vent mussels. eLife, 4, pp.1–39.

Schwedock, J. et al., 2004. Characterization and expression of genes from the RubisCO gene cluster of the chemoautotrophic symbiont of Solemya velum: cbbLSQO. Archives of Microbiology, 182(1), pp.18–29.

Scott, K.M. & Cavanaugh, C.M., 2007. CO2 uptake and fixation by endosymbiotic chemoautotrophs from the bivalve Solemya velum. Applied and Environmental Microbiology, 73(4), pp.1174–1179.

Seston, S.L. et al., 2016. Metatranscriptional response of chemoautotrophic Ifremeria nautilei endosymbionts to differing sulfur regimes. Frontiers in Microbiology, 7, p.1074.

Stanley, S.M., 1970. Relation of Shell Form to Life Habits of the Bivalvia (Mollusca). In Relation of shell form to life habits of the bivalvia (mollusca). Geological Society of America Memoirs. Geological Society of America, pp. 119–121.

Stewart, F.J. & Cavanaugh, C.M., 2006. Bacterial endosymbioses in Solemya (Mollusca: Bivalvia)—model systems for studies of symbiont–host adaptation. Antonie van Leeuwenhoek, 90(4), pp.343–360.

Stewart, F.J. et al., 2011. Metatranscriptomic analysis of sulfur oxidation genes in the endosymbiont of Solemya velum. Frontiers in Microbiology, 2, pp.1–10.

Stewart, F.J., Newton, I. & Cavanaugh, C.M., 2005. Chemosynthetic endosymbioses: adaptations to oxic-anoxic interfaces. TRENDS in Microbiology, 13(9), pp.439–448.

Stewart, F.J., Young, C.R. & Cavanaugh, C.M., 2008. Lateral symbiont acquisition in a maternally transmitted chemosynthetic clam endosymbiosis. Molecular Biology and Evolution, 25(4), pp.673–687.

Vacelet, J. et al., 1996. Symbiosis between methane-oxidizing bacteria and a deep-sea carnivorous cladorhizid sponge. Marine Ecology Progress Series, 145(1-3), pp.77–85.


Volland, J.-M. et al., 2018. NanoSIMS and tissue autoradiography reveal symbiont carbon fixation and organic carbon transfer to giant ciliate host. The ISME Journal, 3, p.2393.

Weissgerber, T. et al., 2011. Complete genome sequence of Allochromatium vinosum DSM 180(T). Standards in Genomic Sciences, 5(3), pp.311–330.

Woyke, T. et al., 2006. Symbiosis insights through metagenomic analysis of a microbial consortium. Nature, 443(7114), pp.950–955.


CHAPTER 1

The genome of the intracellular bacterium of the coastal bivalve, Solemya velum: a blueprint for thriving in and out of symbiosis

Oleg Dmytrenko1, Shelbi L Russell1, Wesley T Loo1, Kristina M Fontanez2, Li Liao3, Guus Roeselers4, Raghav Sharma5, Frank J Stewart5, Irene LG Newton6, Tanja Woyke7, Dongying Wu8, Jenna Morgan Lang8, Jonathan A Eisen8 and Colleen M Cavanaugh1

1Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts, United States of America.

2Department of Civil and Environmental Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America.

3SOA Key Laboratory for Polar Science, Polar Research Institute of China, Shanghai, China.

4Microbiology and Systems Biology Group, TNO, Utrecht, The Netherlands.

5School of Biology, Georgia Institute of Technology, Atlanta, Georgia, United States of America.

6Department of Biology, Indiana University, Bloomington, Indiana, United States of America.

7DOE Joint Genome Institute, Walnut Creek, California, United States of America.

8UC Davis Genome Center, Davis, California, United States of America.

(as published in BMC Genomics)


Dmytrenko et al. BMC Genomics 2014, 15:924 http://www.biomedcentral.com/1471-2164/15/924

RESEARCH ARTICLE Open Access

The genome of the intracellular bacterium of the coastal bivalve, Solemya velum: a blueprint for thriving in and out of symbiosis

Oleg Dmytrenko1, Shelbi L Russell1, Wesley T Loo1, Kristina M Fontanez2, Li Liao3, Guus Roeselers4, Raghav Sharma5, Frank J Stewart5, Irene LG Newton6, Tanja Woyke7, Dongying Wu8, Jenna Morgan Lang8, Jonathan A Eisen8* and Colleen M Cavanaugh1*

Abstract

Background: Symbioses between chemoautotrophic bacteria and marine invertebrates are rare examples of living systems that are virtually independent of photosynthetic primary production. These associations have evolved multiple times in marine habitats, such as deep-sea hydrothermal vents and reducing sediments, characterized by steep gradients of oxygen and reduced chemicals. Due to difficulties associated with maintaining these symbioses in the laboratory and culturing the symbiotic bacteria, studies of chemosynthetic symbioses rely heavily on culture-independent methods. The symbiosis between the coastal bivalve, Solemya velum, and its intracellular symbiont is a model for chemosynthetic symbioses given its accessibility in intertidal environments and the ability to maintain it under laboratory conditions. To better understand this symbiosis, the genome of the S. velum endosymbiont was sequenced.

Results: Relative to the genomes of obligate symbiotic bacteria, which commonly undergo erosion and reduction, the S. velum symbiont genome was large (2.7 Mb), GC-rich (51%), and contained a large number (78) of mobile genetic elements. Comparative genomics identified sets of genes specific to the chemosynthetic lifestyle and necessary to sustain the symbiosis. In addition, a number of inferred metabolic pathways and cellular processes, including heterotrophy, branched electron transport, and motility, suggested that besides the ability to function as an endosymbiont, the bacterium may have the capacity to live outside the host.

Conclusions: The physiological dexterity indicated by the genome substantially improves our understanding of the genetic and metabolic capabilities of the S. velum symbiont and the breadth of niches the partners may inhabit during their lifecycle.
Keywords: Symbiosis, Chemosynthesis, Sulfur oxidation, Respiratory flexibility, H+/Na+ -membrane cycles, Calvin cycle, Pyrophosphate-dependent phosphofructokinase, Heterotrophy, Motility, Mobile genetic elements

* Correspondence: [email protected]; [email protected]
8 UC Davis Genome Center, 451 East Health Sciences Drive, Davis, CA 95616-8816, USA
1 Department of Organismic and Evolutionary Biology, Harvard University, 16 Divinity Avenue, 4081 Biological Laboratories, Cambridge, MA 02138, USA
Full list of author information is available at the end of the article

© 2014 Dmytrenko et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Background

Symbiosis is one of the major driving forces of evolutionary adaptation. Chloroplasts and mitochondria are examples of ancient symbiotic partnerships which played key roles in the emergence and diversification of eukaryotic life on Earth [1]. Bacteria have been found in symbioses with organisms as diverse as plants, insects, marine invertebrates, and protists [2-5], expanding metabolic capabilities of the partners and allowing them to occupy otherwise unavailable ecological niches. Despite the ubiquity of such mutualistic associations and their importance to health and the environment, studies of many host-associated microorganisms have been complicated by difficulties in both the maintenance of symbiotic organisms in culture and the inability to genetically manipulate them. However, progress in culture-independent techniques has allowed for rapid advances in understanding symbiosis diversity, evolution, genetics, and physiology [6-8].

Symbioses between chemoautotrophic bacteria and invertebrates are ubiquitous in reducing marine habitats, such as deep-sea hydrothermal vents and coastal sediments. In these environments, the symbiotic bacteria derive energy by oxidizing reduced inorganic molecules (e.g., sulfide) and fix carbon dioxide for biomass production. Their hosts have evolved behavioral, physiological, and biochemical adaptations for capturing and delivering the required electron donors and acceptors to the symbionts. In return, these invertebrates obtain their nutrition from bacterial chemosynthesis [5,9].

The association between Solemya velum and its endosymbionts is one of the best-described chemoautotrophic symbioses. The host, a protobranch bivalve, lives in coastal nutrient-rich sediments where it builds Y-shaped burrows that span the oxic-anoxic interface, allowing access to both reduced inorganic sulfur as an energy source and oxygen for use as a terminal oxidant [10]. The symbionts, which constitute a single 16S rRNA phylotype of γ-proteobacteria [11], are localized to specialized epithelial cells (bacteriocytes) in the gills, separated from the cytoplasm by a peribacterial membrane. Using energy from the oxidation of sulfide, the symbionts fix CO2 via the Calvin-Benson-Bassham Cycle [12,13]. Primary production in the symbionts sustains the host, which has only a rudimentary gut and cannot effectively filter-feed [14,15]. Many key properties of this symbiosis still remain to be characterized, including the exchange of metabolites and signals between the symbiont and the host and the mechanism of symbiont acquisition at each new host generation (i.e., symbiont transmission mode).

The mode by which S. velum acquires its symbionts has important implications for understanding symbiont genome evolution. Symbiont-specific genes have been amplified from the host ovarian tissue of both S. velum and its congener, S. reidi [16,17], raising the hypothesis that symbionts are transmitted maternally (vertically) between successive host generations via the egg. Vertical transmission has also been inferred in deep-sea clams of the Vesicomyidae [18,19], in which symbionts have a reduced genome size (1.2 Mb) and appear to be obligately associated with their host [20-23]. In vesicomyid symbioses, host and symbiont phylogenies are largely congruent, a pattern consistent with vertical symbiont transmission [24]. Nonetheless, instances of lateral symbiont movement among some vesicomyids have been inferred based on decoupling of symbiont and host evolutionary trajectories [25], bringing diverse symbiont strains into contact and creating opportunities for symbiont genome evolution via recombination [26,27]. In the Solemyidae, on the other hand, symbionts of different Solemya species are scattered across phylogenetic clades (i.e., polyphyly), indicating distinct evolutionary origins relative to the monophyly of the hosts [5,28]. A preliminary analysis was unable to definitively resolve the extent of genetic coupling between the S. velum host and its symbionts in populations along the southern [26]. These patterns may be the result of a physical decoupling of symbiont and host lineages, possibly due to lateral symbiont transmission between hosts.

It is therefore possible that transmission in solemyid symbioses, as in vesicomyids, involves a combination of both vertical passage through the maternal germ line and lateral acquisition of symbionts from the environment or other co-occurring host individuals. Such a mixed transmission mode could strongly impact symbiont genome evolution by creating opportunities for lateral gene transfer, relieving the constraints of genetic bottlenecks imposed by strict vertical transmission [29,30], and imposing selective pressures for the maintenance of diverse functions in the symbiont genome that would mediate survival outside the host. The genome of the S. velum symbiont will provide insights into the transmission mode of this symbiont, define a framework for examining its physiological adaptations, and supply a reference sequence for future studies of the ecology and evolution of solemyid symbionts.

Here we present an analysis of the genome from the S. velum symbiont. First, genes that encode core metabolic functions are discussed. Emphasis is placed on bioenergetics, autotrophy, heterotrophy, and nitrogen metabolism, which indicate metabolic potential beyond strict chemolithoautotrophy. Genes encoding cellular functions that pertain to the symbiotic lifestyle are also analyzed. A special focus is on the processes, such as membrane transport, sensing, and motility, that may be involved in interactions of the symbiont with the host and the environment. Wherever appropriate, the gene content is compared to that of free-living and host-associated bacteria, in particular the intracellular chemosynthetic symbionts of the vesicomyid clams, Calyptogena magnifica [22] and Calyptogena okutanii [20], the vestimentiferan tubeworms, Riftia pachyptila [31] and Tevnia jerichonana [32], the scaly-foot snail, Chrysomallon squamiferum [33], and the marine oligochaete worm, Olavius algarvensis [34]. This comprehensive analysis defines the S. velum symbiont as a metabolically versatile bacterium adapted to living inside the host but also potentially capable of survival on the outside. It informs attempts to culture the symbionts and generates multiple intriguing hypotheses that now await experimental validation.

Results and discussion

General genome features

The genome of the S. velum symbiont consists of 10 non-overlapping scaffolds, totaling 2,702,453 bp, with an average G + C content of 51%. The three largest scaffolds (1.21 Mb, 0.89 Mb, 0.54 Mb) contain 97.8% of the total


genomic sequence and 98.4% of the predicted genes (Additional file 1: Table S1). Assembly of the scaffolds into a closed genome was prevented by stretches of single nucleotides or groups of a few nucleotides repeated up to 70 times that could not be spanned. However, the high depth of sequence coverage and the presence of all 31 core bacterial phylogenetic gene markers [35] suggest that most gene-coding regions were detected in the analysis. Nevertheless, as the genome is not closed, a definitive list of all symbiont genes could not be made.

An overview of the S. velum symbiont genome compared to selected symbiotic and free-living γ-proteobacteria, including other thiotrophs, is presented in Table 1. Briefly, 90.7% of the genome sequence encodes 2,757 genes, on average 885 bp long. Of these, 2,716 (98.5%) are protein-coding. Function was predicted for 1,988 (72.1%) of all the genes, while 769 (27.9%) were identified as encoding hypothetical proteins. A total of 382 genes (13.8%) have one or more paralogs in the genome, with the largest paralogous group encoding transposases associated with mobile elements. The genome contains a single ribosomal RNA (rRNA) operon and 38 transfer RNAs (tRNA) corresponding to the 20 standard proteinogenic amino acids. Due to wobble base-pairing [36], tRNAs for each given amino acid can pair with any codon in the genome for that amino acid (Additional file 2: Table S2).

A model of the symbiont cell based on functional predictions is presented in Figure 1 (see Additional file 3: Table S3 for the list of the corresponding gene products). When grouped into COG categories [37], the largest number of genes within the genome of the S. velum symbiont was associated with metabolism of coenzymes, transcription, posttranscriptional modification of proteins, cell division, DNA replication, and energy metabolism (Figure 2). Based on a BLASTN [38] search against the NCBI-nr database analyzed by MEGAN [39], 1,735 of the genes in the genome were assigned to γ-proteobacteria, mainly other sulfur-oxidizing symbionts (197 genes) and bacteria from the order of Chromatiales (184 genes). Among the genes within γ-proteobacteria, 897 could not be assigned to a lower-level taxon in the NCBI taxonomy. 37 genes had the closest matches to eukaryotes and 6 to archaea. No taxa could be assigned to 29 genes, while 212 genes had no hits in the NCBI-nr database (Figure 3). The majority of the sequences designated as “eukaryotic” were hypothetical and produced low percent amino acid identity matches in the BLASTN search.

Metabolic functions

Chemolithotrophy

The S. velum symbiont, and chemoautotrophic symbionts in general, are remarkable in their ability to support almost all the metabolic needs of their metazoan hosts with energy derived from thiotrophy. Present genome data illustrate the ability of the S. velum symbiont to oxidize both hydrogen sulfide and thiosulfate via diverse pathways, in agreement with previous measurements of symbiont gene expression [40] and in vitro experiments showing that both substrates can stimulate carbon fixation in the symbiont [10,13]. The S. velum symbiont genes involved in the oxidation of reduced sulfur species are most closely related to those of the purple sulfur γ-proteobacterium, Allochromatium vinosum (Figure 4), in which the genetic components and the biochemical mechanisms of sulfur metabolism have been well characterized [41].

Periplasmic sulfide and thiosulfate oxidation

In the periplasm of the S. velum symbiont, sulfide, thiosulfate, and, possibly, elemental sulfur, may be oxidized for energy by the Sox system, which is represented in the genome (Figure 4). The encoded SoxYZAXB, flavocytochrome c dehydrogenase (FccAB), and type I and IV sulfide-quinone reductases (Sqr) potentially reduce cytochromes c and quinones, which along the course of the electron-transport chain translate into membrane-ion gradients, NADH, and ATP, ultimately fueling biosynthetic and other energy-requiring cellular processes, including autotrophy (Figure 1). In A. vinosum and the green sulfur bacterium, Chlorobium tepidum, SoxYZ, SoxAX, and SoxB proteins participate in the formation of transient sulfur deposits as intermediates during sulfur oxidation [43]. In fact, sulfur deposits are common to all known sulfur-oxidizing bacteria (SOB) which, like the S. velum symbiont, lack SoxCD sulfur dehydrogenase (Figure 4) [44], including the symbionts of the hydrothermal vent tubeworm, R. pachyptila [31,45], and the clam, C. magnifica [22,46]. Microscopically-detectable intracellular or extracellular sulfur has not been observed either in the symbiont-containing gills of S. velum or directly within the symbionts (Cavanaugh, unpublished observation). Absence of sulfur deposits may be attributed to a very rapid consumption of any available reduced sulfur substrate. This agrees with the fact that the S. velum symbionts have the highest known carbon fixation rate, and, hence, demand for energy, of all the studied chemosynthetic symbionts, i.e., 65 μmol min−1 g of protein−1 [13] compared to 0.45 μmol min−1 g of protein−1 of the next highest rate measured in the symbionts of R. pachyptila [47]. Alternatively, in the S. velum symbiont intermediate sulfur may be stored in a chemical form that is not easily observed microscopically.

Cytoplasmic sulfide oxidation

Energy-generating oxidation of sulfide to sulfite may be catalyzed in the cytoplasm of the S. velum symbiont by the reverse-acting dissimilatory sulfite reductase (rDsr) pathway (Figure 1). All of the enzymes and accessory proteins required for this pathway are encoded in a dsrABEFHCMKLJOPNRS


Table 1 General genome features of the S. velum symbiont in comparison to other γ-proteobacteria

The comparison includes genomes of the chemosynthetic symbionts of R. pachyptila, C. magnifica, and C. okutanii; free-living sulfur-oxidizers, T. crunogena and A. vinosum; an α-proteobacterial aphid symbiont, B. aphidicola; a symbiont of psyllids (the smallest sequenced genome), Carsonella ruddii, and enterobacterium E. coli. *NCBI Accession PRJNA16744 and PRJNA72967.

Solemya velum endosymbiont (Symbiont)
Size, Mb: 2.70
G + C%: 51.0
ORFs: 2757
Average ORF length, bp: 885
Percent coding: 90.7
rRNA operons (16S-23S-5S): 1
tRNA genes: 38
Proteins with predicted function: 1988
Hypothetical and uncharacterized conserved proteins: 769
ORFs in paralogous families: 382
Pseudogenes: 0
Sigma factors: 9
Mobile elements: 78
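As a quick consistency check on the general genome features, the following Python sketch recomputes the gene-fraction percentages from the counts quoted for the S. velum symbiont genome. The variable names are illustrative only, and the coding-density estimate assumes non-overlapping genes of exactly the mean length, so it can only approximate the reported value. The recomputed fractions agree with the quoted ones to within about 0.1 percentage point.

```python
# Counts quoted in the text and Table 1 for the S. velum symbiont genome.
GENOME_SIZE_BP = 2_702_453
TOTAL_GENES = 2_757
MEAN_GENE_LENGTH_BP = 885

gene_counts = {
    "protein-coding": 2_716,
    "predicted function": 1_988,
    "hypothetical": 769,
    "in paralogous families": 382,
}


def percent(part: float, whole: float) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Fraction of all genes falling into each reported class.
gene_fractions = {name: percent(n, TOTAL_GENES) for name, n in gene_counts.items()}

# Rough coding density (assumption: genes do not overlap and all have the mean length).
approx_coding_percent = percent(TOTAL_GENES * MEAN_GENE_LENGTH_BP, GENOME_SIZE_BP)

if __name__ == "__main__":
    for name, value in gene_fractions.items():
        print(f"{name}: {value}% of genes")
    print(f"approximate coding density: {approx_coding_percent}%")
```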


Figure 1 Predicted model of the S. velum symbiont cell. The diagram, based on the gene annotation of the symbiont genome, depicts key functional systems and metabolic pathways: sulfur oxidation, electron transport, ATP synthases, CO2-fixation via the Calvin Cycle, gluconeogenesis, polyglucose synthesis, glycolysis, TCA and glyoxylate cycles, synthesis of amino acids, fatty acids, lipids, isoprenoids via non-mevalonate pathway, and the cell wall, solute transporters, protein secretion systems, and the type IV pilus. Different protein categories are color-coded and the individual subunits indicated by shape symbols. The direction of substrate transport across the membrane is shown with arrows. Components of the electron transport chain are arranged from the lowest to the highest electronegativity of the electron donors (blue) and acceptors (red). The corresponding electronegativity values are listed next to the respective enzymes. Enzymes shared between glycolysis, gluconeogenesis, and the Calvin cycle are designated in green. Enzymes unique to these pathways are designated in red. Enzymes shared between the Calvin cycle and the pentose phosphate cycle are designated in blue. Amino acids which may be essential for the host are designated in red. Speculated pathways are designated with a question mark. The abbreviations used, the respective full gene product names, and the corresponding NCBI protein ID references are listed in Additional file 3: Table S3.

operon (Figure 4). While multiple homologues of dsrC were identified outside the dsr operon, these genes did not encode the two conserved C-terminal cysteines required for the protein to function [48,49]. The DsrC enzyme likely mediates transfer of electrons from sulfite reductase, DsrAB, to a transmembrane electron transport complex DsrKMJOP, an entry point for electrons derived from cytoplasmic oxidation of sulfur into the electron transport chain [50]. rDsr may be the key energy-generating pathway in the symbiont, as sulfide has a six-fold higher effect on carbon fixation in the S. velum symbiosis [13] compared to thiosulfate oxidized by the Sox pathway.

Sulfite oxidation

Sulfite generated by rDsr may be further oxidized to sulfate in the cytoplasm by a sequential action of APS reductase (AprABM) and an ATP-generating ATP sulfurylase (Sat) (Figures 1 and 4). Identification of the respective genes agrees with measured Apr and Sat activity in the symbiont-containing S. velum tissue [51]. Sulfate generated in this pathway may be exported from the cytoplasm via a sulfate-bicarbonate antiporter SulP (Figure 1). While electrons obtained from the oxidation of sulfide, thiosulfate, and, possibly, elemental sulfur by Sox and rDsr are shuttled into the electron transport chain, energy obtained from the oxidation of sulfite is immediately available in the form of ATP.

Figure 2 Comparison of the COG categories between the S. velum symbiont and selected symbiotic and free-living bacteria. The percentage of genes in each category is normalized to the percentage of those COG categories in the genome of E. coli K12 DH1, ATCC 33849. *NCBI accession PRJNA16744.
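One plausible reading of the normalization described in the Figure 2 legend is a ratio of percentages: the share of genes in a COG category in a given genome divided by the share of that category in the E. coli reference. The Python sketch below illustrates this reading only. The function names and the gene counts are hypothetical and are not taken from the study, and the authors' exact procedure may differ.

```python
from typing import Dict


def cog_percentages(counts: Dict[str, int]) -> Dict[str, float]:
    """Convert per-category gene counts into percentages of all categorized genes."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no genes assigned to any COG category")
    return {category: 100.0 * n / total for category, n in counts.items()}


def normalize_to_reference(genome: Dict[str, int],
                           reference: Dict[str, int]) -> Dict[str, float]:
    """Ratio of a genome's COG percentages to those of a reference genome.

    A value above 1 means the category is relatively enriched compared with
    the reference; categories absent from the reference are skipped.
    """
    genome_pct = cog_percentages(genome)
    reference_pct = cog_percentages(reference)
    return {
        category: genome_pct[category] / reference_pct[category]
        for category in genome_pct
        if reference_pct.get(category, 0.0) > 0.0
    }


# Hypothetical counts for three COG categories (C: energy production,
# J: translation, L: replication and repair); not data from the paper.
example_genome = {"C": 150, "J": 100, "L": 50}
example_reference = {"C": 100, "J": 100, "L": 200}
example_ratios = normalize_to_reference(example_genome, example_reference)
```

With these made-up counts, category C makes up 50% of the example genome but 25% of the reference, so its normalized value is 2.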


Figure 3 Taxa assigned to the genes in the S. velum symbiont genome. The insert chart shows the breakdown of the genes by taxa within the class of γ-proteobacteria (62.9%). The unassigned genes have not been assigned a lower taxon in this analysis. The unclassified genes have not been further classified in the NCBI taxonomy. All the taxa are mutually exclusive.
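The taxonomic breakdown can be cross-checked against the gene counts quoted in the text. The Python sketch below uses only those counts and the total of 2,757 genes. The groups labelled "other" are simple remainders computed here and are not itemized in the text, so their labels are assumptions made for illustration.

```python
TOTAL_GENES = 2_757

# Top-level assignments quoted in the text.
top_level = {
    "gamma-proteobacteria": 1_735,
    "eukaryotes": 37,
    "archaea": 6,
    "no taxon assigned": 29,
    "no hits": 212,
}

# Breakdown within the gamma-proteobacteria quoted in the text.
within_gamma = {
    "sulfur-oxidizing symbionts": 197,
    "Chromatiales": 184,
    "no lower-level taxon": 897,
}


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


top_level_percent = {name: percent(n, TOTAL_GENES) for name, n in top_level.items()}

# Remainders not itemized in the text (labels are illustrative assumptions).
other_top_level = TOTAL_GENES - sum(top_level.values())
other_gamma = top_level["gamma-proteobacteria"] - sum(within_gamma.values())

if __name__ == "__main__":
    for name, value in top_level_percent.items():
        print(f"{name}: {value}%")
    print(f"genes in other top-level groups: {other_top_level}")
    print(f"gamma-proteobacterial genes in other lower-level taxa: {other_gamma}")
```

The recomputed share for the γ-proteobacteria matches the 62.9% given in the Figure 3 legend.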

Bioenergetics

The S. velum symbiont is thought to harvest energy from reduced sulfur oxidation with oxygen. Interestingly, its genome also encodes other respiratory pathways suggestive of diverse metabolic strategies. Based on the gene content, the symbiont may utilize multiple electron donors such as hydrogen, pyruvate, malate, succinate, and formate, and use alternative electron acceptors such as nitrate and dimethyl sulfoxide (DMSO). Furthermore, unlike any chemosynthetic symbiont studied to date, the S. velum symbiont contains genes that may allow it to preferentially establish H+ and Na+ electrochemical membrane gradients during each step of respiration and to selectively utilize them for ATP synthesis, solute transport, and pH control. This high degree of respiratory flexibility encoded in the S. velum symbiont genome suggests that this bacterium is adapted to a highly variable environment.

Rnf complexes

The versatile electron transport chain of the S. velum symbiont may utilize electron donors like ferredoxins, which have a redox potential as negative as −500 mV [52], compared, for example, to −400 mV of S2O3^2− and −270 mV of H2S. The reversible oxidation of ferredoxins coupled to the reduction of NAD+ in the S. velum symbiont may be catalyzed by the H+ and/or Na+-motive Rnf complexes (Figure 1) encoded in the genome by two complete rnfABCDGE (rnf1) and rnfBCDGEA (rnf2) operons. The organization of these genes in the operons is conserved with other bacteria, suggesting that these clusters did not arise from duplication. Previously, only Azotobacter vinelandii and Desulfobacterium autotrophicum were known to harbor two rnf operons [52]. Based on the presence of genes for pyruvate:ferredoxin oxidoreductase located between rnfB2 and rnfC2, pyruvate may serve as an electron donor for at least one of the Rnf complexes. In general, rnf genes are distributed mainly among obligate and facultative anaerobes, including many pathogens that colonize oxygen-limited host tissues [52]. Together with the fact that ferredoxins play a key role in anaerobic metabolism [53], this suggests that the S. velum symbiont, as well as other sequenced chemosynthetic symbionts, which all contain rnf genes, may be capable of facultative anaerobiosis.
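The redox potentials quoted for the Rnf complexes can be turned into approximate free-energy changes with the standard relation ΔG°' = −nFΔE°'. The Python sketch below does this for the transfer of two electrons to NAD+. The NAD+/NADH midpoint potential of −320 mV is a textbook value assumed here and is not given in the text, the function name is illustrative, and the results ignore actual intracellular concentrations, so they indicate only the direction and rough size of each step.

```python
FARADAY_C_PER_MOL = 96_485.0  # Faraday constant in coulombs per mole of electrons

# Donor midpoint potentials quoted in the text, in millivolts.
DONOR_POTENTIALS_MV = {
    "ferredoxin": -500.0,
    "thiosulfate": -400.0,
    "hydrogen sulfide": -270.0,
}

# Assumption: textbook midpoint potential of the NAD+/NADH couple (not from the text).
NAD_POTENTIAL_MV = -320.0


def delta_g_kj_per_mol(donor_mv: float, acceptor_mv: float, electrons: int = 2) -> float:
    """Standard free-energy change (kJ/mol) for moving electrons from donor to acceptor.

    Negative values mean the transfer is energetically downhill.
    """
    delta_e_volts = (acceptor_mv - donor_mv) / 1000.0
    return -electrons * FARADAY_C_PER_MOL * delta_e_volts / 1000.0


delta_g_to_nad = {
    donor: delta_g_kj_per_mol(potential, NAD_POTENTIAL_MV)
    for donor, potential in DONOR_POTENTIALS_MV.items()
}

if __name__ == "__main__":
    for donor, value in delta_g_to_nad.items():
        direction = "downhill" if value < 0 else "uphill"
        print(f"{donor} -> NAD+: {value:.1f} kJ/mol ({direction})")
```

Under these assumptions, electron transfer from ferredoxin to NAD+ is downhill, leaving energy that an ion-motive complex such as Rnf could conserve as a membrane gradient, whereas direct reduction of NAD+ by sulfide is uphill and would require energy input.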


Figure 4 Comparison of the sulfur oxidation genes between the S. velum symbiont and other SOB. (a) Presence of genes involved in chemotrophic sulfur oxidation in the symbionts of S. velum, other sulfur-oxidizing bacteria and archaea, and sulfate reduction in D. autotrophicum, which is included for comparison. Genes encoding pathways for reverse-acting dissimilatory sulfur-oxidation (rDsr) (Dsr in D. autotrophicum) and periplasmic sulfur-oxidation (Sox), as well as auxiliary proteins, are listed. Numbers of gene homologs in each organism are designated with color. Presence of extra- or intracellular sulfur deposits, i.e., globules, in each organism, as obtained from literature, is indicated with hollow circles. The abbreviations used, the respective full gene product names, and the corresponding NCBI protein ID references in the genome of the S. velum symbiont are listed in Additional file 3: Table S3. (b) Presence of signal sequences and transmembrane domains in the sulfur-oxidation genes of the S. velum symbiont, followed by the list of organisms with the closest known homologs to those genes and their respective BLASTP % identities (Avi - Allochromatium vinosum, Sup05 - uncultivated oxygen minimum zone microbe [42], Sli - Sideroxydans lithotrophicus, Thia - Thiocapsa marina, Uncul - uncultured organism, Tsul - Thioalkalivibrio sulfidiphilus, Eper - R. pachyptila endosymbiont).


Hydrogenases Hydrogen is another highly electronegative reductant (−420 mV) that the S. velum symbiont may harness for the reduction of the quinone and the NAD+ cellular pools (Figure 1). Hydrogen oxidation is suggested by the presence in the symbiont genome of hup and hox2 operons encoding an uptake and a bidirectional hydrogenase, respectively. The two subunits of the symbiont [Ni-Fe]-uptake hydrogenase, HupSL, are most similar in amino acid sequence to the HupS and HupL proteins from the symbionts of the tubeworms R. pachyptila and T. jerichona (73% and 78% identity for the S and L subunits, respectively), the sulfur bacterium Thiocapsa roseopersicina (68% and 74%), and the symbionts of the scaly-foot snail, C. squamiferum (50% and 53%). In T. roseopersicina, HupSL has been experimentally demonstrated to reduce quinones of the respiratory chain with H2 [54,55]. Unlike all the other γ-proteobacteria containing HupSL, the hup operon in the S. velum symbiont does not encode the di-heme cytochrome b, which is necessary to link H2 oxidation to quinone reduction in the cellular membrane [56]. However, a [Ni-Fe] hydrogenase cytochrome b homolog was found on a different genomic scaffold. Though this discordant gene organization is unlike that in other H2 oxidizers, it is possible that the identified cytochrome b may act in tandem with HupSL to enable H2 oxidation.
Apart from potentially reducing the respiratory quinone pool with H2, the symbiont, by means of a bidirectional hydrogenase, may produce H2 by oxidizing NADH. The S. velum symbiont Hox2FUYH enzyme complex is most similar in amino acid sequence (63-66%) to the recently characterized NAD+-reducing [Ni-Fe]-hydrogenase from T. roseopersicina, which can operate in reverse, generating H2 when the high reduction state of the dinucleotide pool is growth-limiting [57]. As H2 concentrations available to the S. velum symbiont have not been measured, it is unknown whether H2 oxidation contributes to primary production to the degree that has been recently demonstrated in a hydrothermal vent symbiosis [58].

Primary ion pumps NADH (−320 mV), potentially derived from oxidation of H2 or heterotrophic metabolism (see Heterotrophy) in the S. velum symbiont, could be converted into an electrochemical gradient by two NADH:quinone oxidoreductases. The genome of the symbiont encodes the conventional H+-translocating quinone-reducing NADH dehydrogenase (NdhABCDEFGHIJKLMN), a homolog of the mitochondrial Complex I, as well as an alternative Na+-translocating NADH dehydrogenase (NqrABCDEF) (Figure 1). While Complex I is ubiquitous in bacteria, Nqr is found mainly in pathogenic and marine species [59]. Among symbiotic bacteria, nqr genes have so far been described only in Buchnera spp., an obligate endosymbiont of aphids [60]. The S. velum symbiont may be able to switch between Complex I and Nqr, preferentially generating either H+ or Na+ electrochemical gradients. Thus, depending on the cellular requirements, the symbiont may synthesize ATP (see ATP synthases) and regulate pH (see Ion gradient driven transporters) independently from each other.

Quinone reductases Apart from electron donors such as sulfur and NADH, the S. velum symbiont may be able to directly reduce its quinone pool with a number of other substrates. This is suggested by the presence of genes encoding malate:quinone oxidoreductase (Mqo), succinate dehydrogenase (SdhCDAB), homologous to Complex II in mitochondria, and formate dehydrogenase-O (FdoGHI) (Figure 1). This is the first report of FdoGHI in a chemosynthetic symbiont genome. In E. coli this enzyme, which is common to facultative anaerobes [61], is used in formate-dependent oxygen respiration, allowing the bacteria to rapidly adapt to shifts from aerobiosis to anaerobiosis [62]. The presence of FdoGHI is additional evidence that the S. velum symbiont may be capable of facultative anaerobiosis (see Rnf complexes).
The genome-encoded quinol-cytochrome-c oxidoreductase (bc1, Complex III homologue) potentially links oxidation of quinols to the generation of a proton membrane gradient and the reduction of terminal electron acceptors (Figure 1), discussed next.

Terminal oxygen reductases Similar to most aerobic and microaerophilic bacteria, the genome of the S. velum symbiont encodes three types of H+-motive terminal oxygen reductases (Figure 1), which suggests a capacity to respire O2 over a wide range of concentrations. The genome contains a ccoNOQP operon encoding a cbb3 cytochrome oxidase, which is known to function at nanomolar O2 concentrations in the nitrogen-fixing plant symbiont Bradyrhizobium japonicum [63] and in the microaerophilic human pathogens Campylobacter jejuni, Helicobacter pylori, and Neisseria meningitidis [64]. The genome also encodes an aa3 cytochrome oxidase (CoxAB), which is thought to function primarily under atmospheric oxygen concentrations [65] and is the only terminal oxidase in the symbionts of the bivalves C. magnifica [22] and C. okutanii [20]. The third terminal oxidase identified in the symbiont genome is a cydAB-encoded quinol oxidase, which is thought to oxidize quinols instead of cytochromes. CydAB may operate when an excess of reductants, potentially coming from the host, limits metabolic turnover and a redox balance needs to be achieved [66]. The observed diversity of terminal oxygen reductases indicates that the supply of oxygen to the symbionts may fluctuate over time or between free-living and symbiotic stages, necessitating adjustments in respiratory metabolism.
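The midpoint potentials quoted in this section for ferredoxin (−500 mV), H2 (−420 mV), S2O3²⁻ (−400 mV), NADH (−320 mV), and H2S (−270 mV) can be compared with the standard relation ΔG = −nFΔE. The Python sketch below is an illustration, not part of the original analysis: it assumes a two-electron transfer to NAD+ (so two one-electron ferredoxins per NAD+) and treats the quoted values as standard midpoint potentials. The function and variable names are hypothetical.

```python
# Illustrative sketch: free energy of electron transfer from each donor to NAD+.
# Assumptions (not from the source): n = 2 electrons per transfer, standard conditions.

FARADAY_KJ_PER_V_MOL = 96.485  # Faraday constant in kJ per volt per mole of electrons

# Midpoint potentials in mV, as quoted in the text
DONOR_POTENTIALS_MV = {
    "ferredoxin": -500,
    "H2": -420,
    "S2O3(2-)": -400,
    "H2S": -270,
}
NAD_POTENTIAL_MV = -320  # NAD+/NADH couple, as quoted in the text


def delta_g_kj_per_mol(donor_mv, acceptor_mv, n_electrons=2):
    """Return the free-energy change (kJ/mol) for moving n electrons
    from a donor couple to an acceptor couple: dG = -n * F * dE."""
    delta_e_volts = (acceptor_mv - donor_mv) / 1000.0
    return -n_electrons * FARADAY_KJ_PER_V_MOL * delta_e_volts


def rank_donors_against_nad():
    """List donors from most to least favorable for NAD+ reduction."""
    results = [
        (name, delta_g_kj_per_mol(mv, NAD_POTENTIAL_MV))
        for name, mv in DONOR_POTENTIALS_MV.items()
    ]
    return sorted(results, key=lambda item: item[1])


if __name__ == "__main__":
    for name, dg in rank_donors_against_nad():
        direction = "favorable" if dg < 0 else "unfavorable"
        print(f"{name:>10s} -> NAD+: {dg:+6.1f} kJ/mol ({direction})")
```

Under these assumptions, only donors more negative than −320 mV can reduce NAD+ directly, which is consistent with ferredoxin and H2 being discussed as NAD+ reductants while reduced sulfur compounds such as H2S feed electrons in at the quinone or cytochrome level.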


Alternative terminal reductases When oxygen is limited or unavailable, potentially either through competition for oxygen with the host or if the symbionts find themselves in the anoxic sediment that surrounds the burrow, the S. velum symbiont may be capable of using terminal electron acceptors other than oxygen. Although it is unknown whether the symbiont-containing gill bacteriocytes ever become anaerobic, the presence of genes for periplasmic NO3⁻ reductase (napFDAGHBC) suggests that symbiont energy generation may involve electron transfer to nitrate, which is available in the porewater surrounding S. velum at concentrations of ~1-10 μM ([67], in preparation). The structure of the symbiont napFDAGHBC operon is consistent with that of enteric bacteria that are thought to use Nap for effectively scavenging nitrate during anaerobic growth under nitrate-limited conditions (5 μM) [68]. The symbiont genome also encodes a DMSO reductase (dmsABC), which suggests an ability to respire dimethyl sulfoxide (DMSO), a breakdown product of dimethylsulfoniopropionate (DMSP) produced, for example, by marine algae. DMSO is available at nanomolar concentrations in the coastal eutrophic environments inhabited by S. velum [69], and dms genes are common to many marine sediment-dwelling bacteria, e.g., Beggiatoa and Shewanella [70,71].

ATP synthases Based on the genome data, both H+ and Na+ membrane gradients, established along the course of the electron transport chain during respiration, may drive ATP synthesis in the S. velum symbiont via either H+- or Na+-dependent ATP synthases (Figure 1). The H+-specificity of the F0F1-type ATP synthase is suggested by the presence of two characteristic transmembrane helices within the c subunit. In contrast, an A0A1-type ATP synthase detected in the genome contains the characteristic Na+-binding PXXXQ motif I and ES motif II in the rotor subunit K. While proton-translocating ATP synthases are predominant in bacteria, Na+-coupled ATP synthesis driven by respiration has recently been recognized in some marine and pathogenic species [72,73]. To our knowledge, this is the first report of a Na+-translocating ATP synthase in a chemosynthetic symbiont.

Ion gradient driven transporters Cellular roles of the H+ and Na+ gradients in the S. velum symbiont appear to extend beyond ATP synthesis. Besides ATP synthases, the genome encodes diverse Na+:substrate symporters and numerous Na+:H+ antiporters, including the multisubunit MrpEFGBBCDD complex (Figure 1). These transporters, together with ATP synthases and respiratory ion pumps, may establish and consume simultaneous transmembrane gradients of protons and sodium ions in the symbiont [72]. These parallel cycles of H+ and Na+ would allow the S. velum symbiont to synthesize ATP and maintain pH homeostasis via two separate mechanisms.

Carbon metabolism
Autotrophic carbon fixation, fueled chiefly by sulfur oxidation, is the principal process in the S. velum symbiont, supplying both the symbiont and the host with organic carbon [14]. While previous studies focused primarily on RuBisCO [10,74], the key enzyme of the Calvin cycle for CO2 fixation and the most highly expressed gene in the symbiont [40], our current analysis identified genes that encode catalytic components required for CO2 fixation and storage, including the pyrophosphate-dependent phosphofructokinase, which has been hypothesized to enable a more energy-efficient variant of the cycle [22,75-77]. Furthermore, the genome of the S. velum symbiont contains the gene for α-ketoglutarate dehydrogenase, the key enzyme of the tricarboxylic acid (TCA) cycle, suggesting that the symbiont can respire organic carbon and may not be obligately autotrophic.

Autotrophy The genome of the S. velum symbiont encodes a version of the Calvin cycle which appears to be prevalent in chemosynthetic symbionts but may also operate in a few free-living bacteria [75-77]. In these organisms, genes for fructose 1,6-bisphosphatase and sedoheptulose 1,7-bisphosphatase, which process obligate intermediates in the cycle, are absent. Instead, the role of the missing enzymes may be performed by a single reversible pyrophosphate-dependent phosphofructokinase (PPi-PFK), the gene for which was identified in the genome of the S. velum symbiont (Figure 1). The ability of this enzyme to dephosphorylate fructose 1,6-bisphosphate and sedoheptulose 1,7-bisphosphate in vitro was demonstrated for the PPi-PFK from Methylococcus capsulatus [75], which shares 73% amino acid sequence identity with the homologue from the S. velum symbiont. Notably, during dephosphorylation this enzyme generates pyrophosphate, which bears a high-energy phosphate bond, unlike the orthophosphate liberated by fructose 1,6-bisphosphatase and sedoheptulose 1,7-bisphosphatase. In M. capsulatus [75] and in the chemosynthetic symbionts of R. pachyptila [76] and the oligochaete O. algarvensis [77], it was proposed that the pyrophosphate produced this way could be converted into a proton gradient by a membrane-bound proton-pumping pyrophosphatase (V-type H+-PPase) co-encoded with the PPi-PFK. This proton gradient could then be used for ATP synthesis. Compared to the classical Calvin cycle [78], this mechanism may allow bacteria to spend up to 9.25% less energy on CO2 fixation [77]. Judging from the similar gene content, this version of the cycle may also be at work in the symbionts of the vent clams C. magnifica [22] and C. okutanii [20]. Apart from the membrane-bound V-type H+-PPase, the S. velum symbiont


genome also encodes a soluble pyrophosphatase (PPase) immediately upstream of the PPi-PFK gene. The PPase cannot convert the energy of pyrophosphate into a proton gradient but, by controlling the availability of pyrophosphate, may serve to regulate the catalytic direction of the PPi-PFK, which may also participate in glycolysis as a kinase. This additional PPase suggests that it may be important for the S. velum symbiont to control the direction of its carbon flux to a higher degree than what has been seen in other chemosynthetic symbionts.

Carbon Flux Carbon fixed by the S. velum symbiont may be stored as polyglucose or fed into catabolic and anabolic reactions (Figure 1). The overall direction of the metabolic carbon flux in the symbiont can be controlled by at least two putative mechanisms. First, the reversible PPi-PFK, participating in the Calvin cycle as discussed above, may also phosphorylate fructose 6-phosphate during glycolysis. PPi-PFK appears to be the only enzyme encoded in the genome that could catalyze both the forward and the reverse reactions. The directionality of the catalysis may depend on the concentration of pyrophosphate and the other substrates of the enzyme in the cytoplasm [79], since this PPi-PFK is likely nonallosteric [75]. Second, the two encoded glyceraldehyde 3-phosphate dehydrogenases, GapA and GapB, may be specific to glycolysis and the Calvin cycle/gluconeogenesis, respectively, by analogy to the homologous enzymes in Staphylococcus aureus [80]. In the symbiont genome, gapB is adjacent to the gene for transketolase, an enzyme in the Calvin cycle, further suggesting that these two Gap proteins may play a role in directing the carbon flux toward either glycolysis or the Calvin cycle and gluconeogenesis. The symbionts of C. magnifica, C. okutanii, R. pachyptila, T. jerichona, and the scaly-foot snail possess just a single gap gene, which has a much higher amino acid sequence identity to gapB than to gapA from the S. velum symbiont. In light of the above evidence, the symbiont of S. velum appears to be distinct from other chemosynthetic symbionts in placing a stronger emphasis on controlling the direction of its carbon flux.

Heterotrophy The S. velum symbiont is the third chemosynthetic symbiont, along with the γ3-symbiont of O. algarvensis [34] and the intracellular γ-proteobacterial symbionts of the scaly-foot snail [33], known to encode all of the enzymes required for the complete TCA cycle, and, therefore, could oxidize organic carbon for energy (Figure 1). All of the other sequenced chemosynthetic symbionts lack genes for α-ketoglutarate dehydrogenase and citrate synthase, which suggests their obligate autotrophy [81].
Furthermore, genes for the glyoxylate bypass of the TCA cycle, encoding isocitrate lyase and malate synthase, were also found in the genome of the S. velum symbiont (Figure 1). These enzymes could allow the symbiont to grow on various carbon sources, including acetate and other two-carbon compounds [82], or rapidly replenish intermediates of biosynthetic reactions. The presence of the glyoxylate bypass and the TCA cycle suggests that the symbiont may be a facultative mixo- or heterotroph. The adaptive role of having both heterotrophic pathways, however, is unclear, and may relate either to the intracellular conditions specific to this particular symbiosis or to the yet unconfirmed host-free existence of the symbiont.

Nitrogen metabolism
Ammonia, abundant in the sediment where S. velum burrows, is the main form of nitrogen assimilated by the symbiosis [83]. It has been suggested that the symbionts incorporate ammonia into biomass, which is then transferred to the host ([67] in preparation), a process which has been described for the chemosynthetic symbionts of the hydrothermal vent tubeworm Ridgeia piscesae [84]. The presence of assimilatory nitrogen pathways in the S. velum symbiont genome corroborates this hypothesis.

Nitrogen assimilation Extracellular ammonia may be imported by the symbiont via specific AmtB transporters and incorporated into glutamate and glutamine, which serve as amino group donors for the other nitrogen-containing compounds in the cell (Figure 1). The S. velum symbiosis comes in contact with ammonia concentrations of 20–100 μM in its coastal environment ([67] in preparation). Thus, it is not surprising that, unlike the chemosynthetic symbionts found at nitrate-rich (40 μM) hydrothermal vents [85,86], the S. velum symbiont lacks nar genes for nitrate reductases capable of assimilatory nitrate reduction [32,87-89]. Assimilation of ammonia has been previously demonstrated in the gills of S. velum, but was initially ascribed to the activity of the host glutamine synthetase (GS) [88]. The present analysis identified glnA, the gene that encodes GS, in the genome of the symbiont. A preliminary transcriptional study showed glnA to be one of the fifty most highly transcribed genes in the symbiont [40]. The biosynthetic pathways reconstructed on the basis of gene content suggest that the symbiont has the ability to make all of the 20 proteinogenic amino acids. The amino acid prototrophy of the symbiont is in keeping with its proposed role in providing most, if not all, of the host’s nutrition [14,15].

Urea metabolism Host urea may serve as an additional source of assimilatory nitrogen for the S. velum symbiont. The identified ureHABCEFG operon encodes a cytoplasmic urease UreABC, which can hydrolyze urea, releasing


ammonia that may be re-utilized by the symbiont. Urea can enter the bacterial cell by passive diffusion [90], but under nitrogen starvation the symbiont may be able to take it up more rapidly via an ABC transporter, UrtABCDE, encoded directly upstream of the ure genes. Among chemosynthetic symbionts, urease genes have been previously described only in the γ-symbionts from the marine oligochaete worm O. algarvensis [34,77], which, like S. velum, lives in coastal sediments. The sequenced chemosynthetic symbionts from hydrothermal vents lack urease genes, even though some of their host organisms, for instance R. pachyptila [91], are known to produce urea. This discrepancy may be accounted for by the fact that in coastal sediments urea is also present outside the host in the pore water ([67] in preparation).

Taurine synthesis The S. velum symbiont may also provide its host with nitrogenous osmoregulants, such as the non-proteinogenic amino acid taurine [92]. In the host tissues, taurine accounts for up to 70% of the total free amino acids and shows an isotopic composition (δ13C, δ15N, δ34S) suggestive of symbiont origin [93]. Synthesis of taurine may be accomplished by the two homologs of the reversible taurine dioxygenase (TauD) encoded in the symbiont genome. Taurine could be actively secreted to the host by the TauABCD ABC transporter, the genes for which were found to contain a conserved binding domain for sulfonate, characteristic of the taurine molecule. Since taurine synthesis requires sulfite [94], one of the final intermediates in sulfur oxidation, this pathway could serve to dispose of SO3²⁻ and, thus, to drive forward sulfur oxidation in the S. velum symbiont, benefiting both the host and the symbiont.

Membrane-associated functions
The diversity of membrane-associated functions encoded in the genome of the S. velum symbiont suggests that the symbiont is fully autonomous of its host in this aspect of its physiology. Other bacteria, which, like the symbiont, are thought to be obligately intracellular [17], have lost genes required for the production of a cellular envelope, transport of solutes across the plasma membrane, sensing of the extracellular environment, as well as motility. These bacteria instead rely on their hosts to perform these functions or no longer require them.

Production of cellular envelope
The S. velum symbiont appears capable of synthesizing and assembling a cytoplasmic membrane, a peptidoglycan layer (PG), and an outer membrane populated by lipopolysaccharides (LPS), which constitute a cellular envelope. While these abilities are typical of the free-living γ-proteobacteria, two aspects in particular stand out in the context of the symbiotic life-style. First, given the identified genes for the biosynthesis of fatty acids, the symbiont may build components of its plasma membrane mostly from cis-vaccenic acid (18:1ω7) (Figure 1). According to a previous analysis of lipid composition in S. velum [95], this unsaturated fatty acid and its derivatives are the main constituents of cellular membranes in the symbiont and the host alike. Furthermore, the isotopic signature of the host’s lipids indicates that they are bacterial in origin [95]. Second, the identified genes for the synthesis of lipopolysaccharides (Figure 1) suggest that the symbiont may be able to assemble the LPS structures that are known to be sufficient for growth of E. coli [96]. Most intracellular symbionts that live within a host-derived membrane, like the S. velum symbiont [10], lack LPS biosynthetic genes and are unable to replicate on their own [97]. However, the symbionts which have the genes to synthesize LPS tend to either live directly in the cytoplasm [97] and have to make their own cellular envelope or, like the symbionts of R. pachyptila [98], exist extracellularly for part of their life. Therefore, the symbiont of S. velum may not only be able to make a fully functional cellular envelope and supply some of its components to its host, but may also be capable of living outside the bacteriocytes.

Membrane transport
Transporters The number of transporters encoded in the genome of the S. velum symbiont exceeds what has been found in other intracellular bacteria (Table 2). The diversity of genes for solute transport (Figure 1) suggests that the symbiont has an extensive chemical communication with its environment. The S. velum symbiont may use these transporters to import metabolic substrates and enzyme cofactors and export products of its biosynthesis to sustain the physiology of the host. It is known that fixed organic carbon is transferred from the symbiont to the host within minutes [99], which suggests a transport mechanism, since direct digestion of symbionts by host cells would likely take hours to days [100]. Such transport could be accomplished by exporters of amino acids (EamA), carboxylates (CitT), and fatty acids (FadLD), all of which are encoded in the genome. Moreover, some of the importers found in the genome may also act as exporters, depending on the cellular environment [101]. Thus, the S. velum symbiont maintains a repertoire of transporters that may, on the one hand, negotiate diverse chemical exchanges with the environment and, on the other hand, allow it to provide nutrients to the host without being digested.

Multi-drug efflux pumps The S. velum symbiont genome contains at least five sets of genes encoding multi-drug efflux pumps (AcrAB-TolC), suggesting the ability to expel


Table 2 Comparison of extracellular transport genes in the S. velum symbiont, other symbiotic and free-living bacteria
Columns: Organism; Lifestyle; Transporter gene ratio to S. velum endosymbiont; Genome size (Mb); Total number of genes involved in transport; Transporter genes per Mb of genome; ATP-dependent transporters; Secondary transporters; Phosphotransferase systems; Ion channels; Unclassified transporters; Protein secretion systems; Outer membrane transporters
1.80 4.50 0.21 404 1.10 89.8 0.73 48 138 2.43 43.6 163 141 15 67.1 12 30 38 10 0 58 6 1 0 46 1 51 10 0 3 5 35 19 0.89 0.43 3.67 2.20 199 97 54.2 44.1 81 32 52 52 5 0 8 10 7 3 14 6 32 25 0.76 3.30 171 51.8 56 60 0 6 2 16 31 0.39 2.00 88 44.0 31 34 0 3 2 3 15 1.00 2.7 224 75.2 100 70 1 5 5 17 26 OIS 0.13 0.69 28 40.6 11 10 3 1 0 0 3 OIS OIS 0.15 0.07 OIS 1.02 OIS 0.64 0.12 34 0.11 0.71 16 33.3 0.70 27 25.0 16 25 38.0 5 35.7 10 7 3 9 0 12 14 5 2 3 1 0 3 2 0 0 2 0 0 3 0 0 2 0 3 0 FIS* 2.47 7.75 553 71.4 281 203 7 18 2 13 29 OIS* 0.14 1.16 32 27.6 18 6 0 3 1 0 4 parasite symbiont symbiont Free-living Free-living Free-living Free-living Free-living OIS/parasite 0.21 1.06 48 45.3 19 28 0 0 1 8 0 Intracellular Extracellular Commensal Commensal 1.58 2.82 4.64 5.92 354 632 76.3 106.8 74 160 235 336 29 44 13 17 2 4 3 37 35 34 Intracellular methanotroph sulfate reducer sulfide oxidizer sulfide oxidizer sulfide oxidizer APS Bath MJ11 wSim XCL-2 GWSS OIS 0.03 0.25 7 28.0 4 2 0 0 0 0 1 kp342 ACN14a FIS 1.05 7.50 236 31.5 114 106 0 12 1 1 4 MadridE Buchnera DSM 180 floridanus glossinidia DSM 1251 C. okutanii Wolbachia DSM 11347 Baumannia denitrificans C. magnifica Sulfurimonas Vibrio fischeri Blochmannia cicadellinicola K-12-MG1655 Methylococcus Thiomicrospira pipientis Escherichia coli endosymbiont endosymbiont aphidicola Wigglesworthia bv. Viciae 3841 capsulatus Allochromatium endosymbiont Solemya velum Ca. crunogena R. leguminosarum vinosum Thermodesulfovibrio Frankia alni Sulcia muelleri Rickettsia prowazekii Klebsiella pneumoniae yellowstonii
OIS - obligate intracellular symbiont; FIS - free-living intracellular symbiont.
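The two derived columns of Table 2 appear to follow directly from the raw counts: transporter genes per Mb is the total number of transport genes divided by genome size, and the ratio column is the total divided by the S. velum symbiont total (224). The short Python sketch below illustrates this reading. The pairing of values used here (4.64 Mb with 354 genes and 5.92 Mb with 632 genes, the two rows labeled Commensal) is an assumption inferred from the arithmetic, and the function names are hypothetical.

```python
# Illustrative sketch of how the derived columns of Table 2 can be recomputed.
# Assumption: the two "Commensal" rows pair 4.64 Mb with 354 genes and
# 5.92 Mb with 632 genes; this pairing is inferred, not stated in the text.

S_VELUM_TRANSPORT_GENES = 224  # total transport genes listed for the S. velum symbiont


def genes_per_mb(transport_genes, genome_size_mb):
    """Transporter gene density: transport genes per Mb of genome."""
    return transport_genes / genome_size_mb


def ratio_to_s_velum(transport_genes):
    """Transport gene count relative to the S. velum symbiont."""
    return transport_genes / S_VELUM_TRANSPORT_GENES


if __name__ == "__main__":
    commensal_rows = [(4.64, 354), (5.92, 632)]  # (genome size in Mb, transport genes)
    for size_mb, genes in commensal_rows:
        print(
            f"{size_mb} Mb, {genes} genes: "
            f"{genes_per_mb(genes, size_mb):.1f} genes/Mb, "
            f"ratio {ratio_to_s_velum(genes):.2f}"
        )
```

Normalizing by genome size matters for this comparison because the genomes in the table span more than an order of magnitude in size, so raw gene counts alone would mostly reflect genome size.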


host-derived antimicrobial agents. A comparable genetic capacity for the AcrAB-TolC efflux system has been found in bacteria, such as the plant symbiont Rhizobium leguminosarum, that have a free-living stage, but not in bacteria that are obligately intracellular (Table 2, ATP-dependent transporters). The plant host of R. leguminosarum manipulates the cellular fate of its symbionts using antimicrobial-like peptide factors [102]. As a result, R. leguminosarum undergoes cell elongation and genome replication but loses its ability to divide. Only a small number of R. leguminosarum cells remain vegetative [103]. A very similar morphological differentiation of the symbiont has been observed in S. velum [104]. Assuming the bivalve host also uses peptide factors to control its symbiont, the S. velum symbiont may rely on the efflux pumps to maintain a small undifferentiated population in the bacteriocytes for transmission to future host generations.

Sensory mechanisms and motility
The S. velum symbiont appears well equipped to sense extracellular chemical changes, consistent with its inferred ability to maintain a complex chemical exchange with the environment. Over forty transmembrane chemoreceptors are encoded in the genome of the symbiont. Almost half of them have one or more conserved PAS domains and therefore may play a role in sensing oxygen levels and redox potentials. To relay sensory information, the majority of the receptors contain either a diguanylate cyclase (GGDEF) or a histidine kinase (HisKA) signaling domain. Movement and surface attachment using type IV pili, known as twitching motility, are the processes that may be regulated by chemosensory signal transduction in the S. velum symbiont (Figure 1). For example, in the genome of the symbiont a chimeric gene containing PAS, GGDEF, and cyclic-diguanylate receptor (EAL) domains is co-located with pilEY1XWVT genes required to assemble a functional pilus. Furthermore, the symbiont genome contains pilGIJ-cheAW genes, which encode a transmembrane chemotaxis sensor protein, HisKA, and a DNA-binding response regulator, and are known to control twitching motility in other bacteria [105]. The symbiont may use the contractile pili to direct its movement in the environment with regard to chemical gradients, and, potentially, also rely on the same mechanism to find and colonize new hosts.

Mobile genetic elements
The S. velum symbiont genome contains two major types of mobile elements, integrative and conjugative elements (ICEs) and insertion sequences (IS). The genome contains 25 insertions from 12 different ICE families (Table 3) as well as 53 copies of four different IS elements (Table 4). In total, these elements comprise 2.6% of the genome. No gene interruptions were associated with these elements. This large number and diversity of mobile elements suggest that this bacterium may come into contact with other bacterial lineages more often than expected for most vertically transmitted intracellular symbionts. Indeed, the abundance of mobile genetic elements in bacterial genomes has been shown to correlate with ecological niche. While there is considerable overlap between the amounts of mobile elements hosted by free-living and facultative intracellular bacteria, obligate intracellular bacteria that undergo faithful vertical transmission consistently have few or no mobile elements [106].
Two hypothesized life and evolutionary history scenarios may explain the observed mobile element content in the S. velum symbiont. One of them is a relatively recent shift to intracellularity, resulting in an expansion of mobile elements [107,108]. Alternatively, the symbionts may undergo regular or occasional horizontal transmissions between hosts and at that time encounter opportunities for recombination between strains. For example, sporadic episodes of horizontal transmission in the primarily maternally transmitted insect symbiont, Wolbachia, have resulted in the acquisition and maintenance of novel mobile

Table 3 ICE mobile genetic elements in the S. velum symbiont genome
ICE element    Copies    Length, bp
ICEVchLao1     1         834
ICEVchBan7     1         432
ICEVchBan9     2         429, 888
ICEVchInd5     1         282
ICEVchMex1     1         561
ICEVflind1     2         405, 729
ICEPalBan1     1         1389
ICEPdaSpa1     5         300, 387, 622, 939, 3568
ICESpuPO1      3         549, 627, 648
ICEPmiUSA1     1         1290

Table 4 Insertion sequence mobile genetic elements in the S. velum symbiont genome
Family/Element    Copies    Length, bp    Terminal inverted repeats
IS30              30        1071          ATTCAA
IS3/IS407         18        1219          CCCCCA/CCCCCAA(C/T)AAGT
IS30              1         900           CAACCGTTTCAAT
IS5/IS5           1         1638          ACCCAAGGTA
IS481             1         1271          GAGACATCATTTACA
IS30              1         1137          TGATGTACGGGTCCGA
Unknown           1         1848          CCCCTTCG
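The copy counts in Table 4 can be cross-checked against the 53 IS copies reported in the text, and the same rows give a rough estimate of how much sequence the insertion sequences contribute. The Python sketch below is an illustration only: it assumes every copy has the full length listed for its row, and the class and function names are hypothetical.

```python
# Illustrative sketch: tallying the insertion sequences listed in Table 4.
# Assumption: each copy is full length, i.e. has the length listed for its row.

from typing import NamedTuple


class InsertionSequence(NamedTuple):
    family: str
    copies: int
    length_bp: int


# Rows transcribed from Table 4 (terminal inverted repeats omitted)
TABLE_4 = [
    InsertionSequence("IS30", 30, 1071),
    InsertionSequence("IS3/IS407", 18, 1219),
    InsertionSequence("IS30", 1, 900),
    InsertionSequence("IS5/IS5", 1, 1638),
    InsertionSequence("IS481", 1, 1271),
    InsertionSequence("IS30", 1, 1137),
    InsertionSequence("Unknown", 1, 1848),
]


def total_copies(rows):
    """Total number of IS copies across all rows."""
    return sum(row.copies for row in rows)


def total_length_bp(rows):
    """Total sequence length contributed by all IS copies, in bp."""
    return sum(row.copies * row.length_bp for row in rows)


def genome_fraction(rows, genome_length_bp):
    """Fraction of a genome of the given length occupied by the IS copies."""
    return total_length_bp(rows) / genome_length_bp


if __name__ == "__main__":
    print("IS copies:", total_copies(TABLE_4))
    print("IS sequence, bp:", total_length_bp(TABLE_4))
```

Passing the assembled genome length to `genome_fraction` gives the share contributed by insertion sequences alone; the 2.6% figure in the text also includes the ICEs of Table 3.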


elements [109,110]. In fact, horizontal transmission or host-switching has likely occurred in the history of symbionts of bivalves [111], including members of the genus Solemya, as 16S rRNA phylogenetic analyses show that these symbionts do not comprise a monophyletic clade [5,11]. Additionally, many of the genes in the S. velum symbiont genome are most closely related to disparate bacterial taxa (Figure 3), suggesting that horizontal gene transfer may have occurred in the past. These preliminary lines of evidence support the hypothesis that horizontal symbiont transmission has occurred. However, more information is needed about the distribution and relationships of the mobile elements among intra-host and inter-host S. velum symbiont populations before these hypotheses can be differentiated.

Conclusions
Many of the features commonly encoded in the genomes of chemosynthetic symbionts were observed in the genome of the S. velum symbiont alongside an array of genes unique to this bacterium. Potential adaptations to the symbiotic lifestyle, such as a more energy-efficient version of the Calvin cycle, were shared with the other sequenced chemosynthetic symbionts. The genes that set the S. velum symbiont apart from the others were those that encoded the TCA and the glyoxylate cycles, DMSO reductase and urease, as well as the highly branched electron transport chain. These functions may relate to the fact that the S. velum symbiosis lives in eutrophic sediment, unlike the oligotrophic environments inhabited by other chemosynthetic symbioses, e.g., those of R. pachyptila, C. magnifica, and O. algarvensis.
The S. velum symbiont has long been considered to be vertically transmitted [17], but our genomic analyses are inconsistent with predictions based on other vertically transmitted obligately intracellular bacteria. The S. velum symbiont’s genetic repertoire is replete with genes for chemosynthesis, heterotrophy, bioenergetics, nitrogen metabolism, cell maintenance, motility, communication, and exchange with the environment. Thus, with regard to the functional gene content, but also the genome size and GC composition, the genome is more similar to those of free-living sulfur-oxidizing bacteria (Table 1). Furthermore, the genome contains mobile elements in numbers comparable to those reported for horizontally transmitted obligately intracellular bacteria. These divergent lines of evidence suggest that the evolutionary life history of the S. velum symbiont may be more complicated than previously hypothesized. This could include, but may not be limited to, an opportunistic generalist lifestyle, a facultative symbiosis with a mixed transmission mode, or a very recent obligate association with the host for this clade of bacteria, potentially on a path to a new type of cellular organelle.

Methods
Specimen collection and DNA preparation
S. velum individuals were collected by the staff of the Marine Resource Center of the Marine Biological Laboratory (MBL), Woods Hole, MA from reducing sediment of shallow eelgrass beds near Naushon Island, Woods Hole, MA in 2006, 2007, and 2012. The collection was performed in accordance with a state collecting permit issued by the Division of Marine Fisheries and in compliance with all local, regional and federal regulations, including the Convention on Biological Diversity and the Convention on the Trade in Endangered Species of Wild Fauna and Flora. The excised gills were macerated in the laboratory using a Dounce homogenizer in 5 ml of 0.2 μm filtered seawater (FSW) per bivalve. Homogenates were passed through 100 μm and 5 μm nylon filters (Small Parts Inc. #CMN-0105-A and CMN-0005-A) and centrifuged at 5,000 × g for 5 minutes at 4°C. The pellet was resuspended in FSW, pelleted, and resuspended in 1x TAE buffer. 50 g molten 2% agarose (SeaKem® #50152) in 1x TAE was added to make plugs for genomic DNA extraction. The hardened plugs were treated with DNase I (0.25U/50 μl) at 37°C for 10 minutes and then equilibrated in TE buffer for 30 minutes at room temperature. Agarose plugs were further processed using the CHEF Mammalian Genomic DNA Plug Kit from Bio-Rad Laboratories (#170-3591) according to the manufacturer’s instructions. The protocol for pulsed-field gel electrophoresis (PFGE) and isolation of the bacterial chromosomes from the agarose plugs was adapted from Gil [112].

Genome sequencing and assembly
Genomic bacterial DNA was sequenced at the Institute for Genomic Research (TIGR), the Joint Genome Institute (JGI), and the University of California, Davis, using a diversity of sequencing technologies. Two Sanger libraries of 3–4 Kb and 10–12 Kb insert sizes were constructed as previously described [113]. Sequencing of these Sanger libraries resulted in 110,187 reads with an N50 of 969 bp and an average coverage depth of 8x. Subsequently, using Roche 454 technology, 387,143 sequencing reads with an N50 of 207 bp and an average coverage depth of 13x were obtained. Then, 25,635,107 Illumina sequencing reads were generated. The Illumina sequences were 35 bp long and had an average coverage depth of 150x. These Sanger, Roche 454, and Illumina sequences were assembled using the Paracel Genome Assembler (Paracel Inc., Pasadena, CA) into 68 contigs. Next, symbiont DNA was sequenced using Pacific Biosciences (PacBio) technology, resulting in 150,000 reads with an N50 of 4,966 bp and 9x coverage depth. The insertion and deletion (indel) errors, typical of the PacBio data [114], were reduced from 4% to 0.2% with Illumina paired-end sequences (500x coverage)

Dmytrenko et al. BMC Genomics 2014, 15:924 Page 16 of 20 http://www.biomedcentral.com/1471-2164/15/924

using PacBioToCA program [115] available as a part of SMRT Analysis software package version 1.4 distributed by Pacific Biosciences [116]. The error correction step also removed any PacBio sequences of the host origin, which, given the abundance of the symbionts in the gill tissue, had Illumina coverage below 5x. The Illumina data used for the error correction were generated as part of a different study and came from a specimen obtained at a different location (Point Judith, RI) than the rest of the genomic data. Due to the extent of the intra-species genomic sequence variation across geographic localities (Russell et al., in preparation), these Illumina data could not be used to supplement the genome assembly but were sufficient to correct the majority of sequencing indel errors in the PacBio reads. The error-corrected 54,684 PacBio sequences with N50 of 1,409 bp were used to connect the previous 68 genomic contigs into 30 larger scaffolds using the Automated Hybrid Assembly (AHA) module of SMRT Analysis. The resulting 7 gaps within the scaffolds, 2,272 bp in total, were then filled in with the PacBio error-corrected sequences using the PBJelly software tool [117], reducing the number of gaps to 4 and the total gap length to 100 bp. After discarding 20 of the smallest low coverage (2-9x) scaffolds that contained mostly eukaryotic genes (>65%), identified as described below, only 10 scaffolds were retained as a part of the symbiont genome.

Sequence analysis
Open reading frames (ORFs) on S. velum symbiont scaffolds were predicted using Glimmer [118], Prodigal [119], and GeneMarkS [120]. The software parameters used to perform these analyses are listed in Additional file 4: Table S4. Once identified, the ORFs were translated into protein-coding sequences and queried against the UniProt Reference Clusters (UniRef90) (20 November 2013) [121], National Center for Biotechnology Information non-redundant (NCBI-nr) (4 November 2013) [122], and M5 non-redundant (M5-nr) (27 January 2014) [123] databases for functional annotation using BLASTP (e-value cutoff 0.001) [38]. UniRef90 gene entries sharing the highest percent identity with the query and NCBI-nr and M5-nr entries with the highest bit score match to the query were retained for annotation. Genes predicted by two or more methods (redundant) were considered the same and collapsed into a single entry if they shared the same start and stop position, orientation, and similar functional annotations. Non-redundant entries (i.e., gene predictions unique to a given software) were also retained. Finally, the above predictions and annotations were reconciled with the genes predicted and annotated through the Integrated Microbial Genomes Expert Review (IMG-ER) pipeline [124]. Selected origins of replication were verified by Ori-Finder [125]. The genes in the genome were assigned taxa in the NCBI taxonomy based on the BLASTN [38] searches (-best_hit_overhang 0.25, -best_hit_score_edge 0.05, -evalue 0.0001) against the NCBI-nr database (8 July 2014) computed with MEGAN 5.4.3 (maximum number of matches per read 100; LCA parameters: minimal support 5, minimal score 35, top percent 10) [39]. Selected promoters were identified with BPROM [126]. Signal peptides and transmembrane domains were predicted using SignalP 3.0 Server and TMHMM, respectively [127]. The Genomic Utility for Automated Comparison (GUAC) Python script (Additional file 5) was developed to inform comparative analyses of gene content across multiple genomes, in particular genes involved in sulfur-oxidation (Figure 4) and extracellular transport (Table 2). The GUAC software first identified those target genes in the genomes of interest that were annotated with unambiguous gene symbols (e.g. soxA). Next, using amino acid sequences of these genes as queries, BLASTP searched for homologous sequences in the remaining target genomes (default cut-off values: bit score 50, identity 30%, alignment length over the source sequence 40%). These sequences were aligned using ClustalW [128]. The alignments were used to manually verify the results (e.g., based on known conserved domains, etc.). Mobile genetic elements were detected by type. Insertion sequences were found using OASIS [129]. Integrative conjugative elements and plasmid as well as phage sequences were identified by BLASTN [38] searches against the ICEberg [130] and ACLAME [131] databases, respectively (cut-off values: 250 nucleotides alignment length and 90% identity). To determine whether mobile genetic elements interrupted open reading frames, the nucleotide regions before and after each element were concatenated and aligned to the NCBI-nr sequences using BLASTN.

Availability of supporting data
This genome project has been deposited at DDBJ/EMBL/GenBank under the accession JRAA00000000. The version described in this paper is version JRAA01000000.

Additional files
Additional file 1: Table S1. Length [bp], GC%, percentage of the total base pairs, and the number of genes in the scaffolds which constitute the genome of the S. velum symbiont.
Additional file 2: Table S2. tRNA genes and the codon frequencies in the genome of the S. velum symbiont.
Additional file 3: Table S3. Gene product names used in Figures 1 and 4, the corresponding NCBI protein ID reference numbers, and EC/TC numbers.
Additional file 4: Table S4. Parameters of the gene prediction software.
Additional file 5: Genomic Utility for Automated Comparison (GUAC). A Python script developed to inform comparative analyses of gene content across multiple genomes.
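
The default GUAC cut-offs given under Sequence analysis (bit score 50, identity 30%, alignment length over the source sequence 40%) can be illustrated with a minimal Python sketch. This is not the GUAC script from Additional file 5: the names `Hit` and `filter_hits` are hypothetical, the input is assumed to be BLASTP tabular output with the twelve standard columns, and "alignment length over the source sequence" is interpreted here as the percentage of the query sequence covered by the alignment.

```python
from typing import Dict, Iterable, List, NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    identity: float   # percent identity reported by BLASTP
    coverage: float   # percent of the query covered by the alignment
    bit_score: float


def filter_hits(
    rows: Iterable[str],
    query_lengths: Dict[str, int],
    min_bit_score: float = 50.0,   # cut-off values quoted in the Methods
    min_identity: float = 30.0,
    min_coverage: float = 40.0,
) -> List[Hit]:
    """Keep tabular BLASTP hits that pass all three cut-offs.

    Assumed column order (hypothetical input format): qseqid, sseqid,
    pident, length, mismatch, gapopen, qstart, qend, sstart, send,
    evalue, bitscore.
    """
    kept = []
    for row in rows:
        fields = row.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip blank or malformed lines
        query, subject = fields[0], fields[1]
        identity = float(fields[2])
        qstart, qend = int(fields[6]), int(fields[7])
        bit_score = float(fields[11])
        # Query coverage: aligned query span relative to full query length.
        coverage = 100.0 * (qend - qstart + 1) / query_lengths[query]
        if (bit_score >= min_bit_score
                and identity >= min_identity
                and coverage >= min_coverage):
            kept.append(Hit(query, subject, identity, coverage, bit_score))
    return kept
```

Hits that survive such a filter would still need the alignment and manual verification steps described in the Methods; the thresholds only narrow the candidate list.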

Competing interests
The authors declare that they have no competing interests.

Authors’ contributions
OD performed the DNA isolation and the final genome assembly, developed the Python GUAC script, carried out and coordinated the sequence analysis and the manual annotation, and drafted the manuscript. SLR, WTL, KMF, LL, and GR participated in the sequence analysis, the manual annotation, and the drafting of the manuscript. FJS carried out the DNA isolation, coordinated and participated in the gene prediction and the automated annotation. RS performed the gene prediction and the automated annotation. ILGN carried out the DNA isolation and participated in the sequence analysis. TW and JAE coordinated and participated in the genome sequencing, the initial genome assembly, and the preliminary gene prediction and annotation. DW and JML performed the initial genome assembly, gene prediction, and annotation. CMC and JAE conceived of the study, participated in its design and coordination, and helped draft the manuscript. All authors read and approved the final manuscript.

Acknowledgments
This work was funded by grant 0412205 of the US National Science Foundation (NSF) and was made possible with the generous support of the U.S. Department of Energy Joint Genome Institute (JGI). The work conducted by JGI was supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231. We would like to express special thanks to Grace Pai for creating Sanger sequencing libraries and Shannon Smith and Terry Utterback for coordinating sequencing at TIGR.

Author details
1 Department of Organismic and Evolutionary Biology, Harvard University, 16 Divinity Avenue, 4081 Biological Laboratories, Cambridge, MA 02138, USA.
2 Department of Civil and Environmental Engineering, Massachusetts Institute of Technology, 15 Vassar Street, Cambridge, MA 02139, USA.
3 SOA Key Laboratory for Polar Science, Polar Research Institute of China, Shanghai 200136, China.
4 Microbiology & Systems Biology Group, TNO, Utrechtseweg 48, Zeist, Utrecht 3704HE, The Netherlands.
5 School of Biology, Georgia Institute of Technology, Atlanta, GA 30332-0230, USA.
6 Department of Biology, Indiana University, 1001 East 3rd Street, Jordan Hall, Bloomington, IN 47405, USA.
7 DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA.
8 UC Davis Genome Center, 451 East Health Sciences Drive, Davis, CA 95616-8816, USA.

Received: 3 April 2014 Accepted: 23 September 2014 Published: 23 October 2014

References
1. Sagan L: On the origin of mitosing cells. J Theor Biol 1967, 14:225–274.
2. Gonzalez A, Clemente JC, Shade A, Metcalf JL, Song S, Prithiviraj B, Palmer BE, Knight R: Our microbial selves: what ecology can teach us. EMBO Rep 2011, 12:775–784.
3. Dilworth MJ, James EK, Sprent JI: Nitrogen-Fixing Leguminous Symbioses. Kluwer Academic Pub; 2008.
4. Clark EL, Karley AJ, Hubbard SF: Insect endosymbionts: manipulators of insect herbivore trophic interactions? Protoplasma 2010, 244:25–51.
5. Cavanaugh CM, McKiness Z, Newton I, Stewart FJ: Marine chemosynthetic symbioses. In The Prokaryotes - Prokaryotic Biology and Symbiotic Associations. 3rd edition. Edited by Rosenberg E. 2013:579–607.
6. Toft C, Andersson SGE: Evolutionary microbial genomics: insights into bacterial host adaptation. Nat Rev Genet 2010, 11:465–475.
7. Woyke T, Tighe D, Mavromatis K, Clum A, Copeland A, Schackwitz W, Lapidus A, Wu D, McCutcheon JP, McDonald BR, Moran NA, Bristow J, Cheng J-F: One bacterial cell, one complete genome. PLoS One 2010, 5:1–8.
8. Kamke J, Sczyrba A, Ivanova N, Schwientek P, Rinke C, Mavromatis K, Woyke T, Hentschel U: Single-cell genomics reveals complex carbohydrate degradation patterns in poribacterial symbionts of marine sponges. ISME J 2013, 7:2287–2300.
9. Dubilier N, Bergin C, Lott C: Symbiotic diversity in marine animals: the art of harnessing chemosynthesis. Nat Rev Micro 2008, 6:725–740.
10. Cavanaugh CM: Symbiotic chemoautotrophic bacteria in marine invertebrates from sulphide-rich habitats. Nature 1983, 302:58–61.
11. Eisen JA, Smith SW, Cavanaugh CM: Phylogenetic relationships of chemoautotrophic bacterial symbionts of Solemya velum say (Mollusca: Bivalvia) determined by 16S rRNA gene sequence analysis. J Bacteriol 1992, 174:3416–3421.
12. Cavanaugh CM, Abbott M, Veenhuis M: Immunochemical localization of ribulose-1,5-bisphosphate carboxylase in the symbiont-containing gills of Solemya velum (Bivalvia: Mollusca). P Natl Acad Sci USA 1988, 85:7786–7789.
13. Scott KM, Cavanaugh CM: CO2 uptake and fixation by endosymbiotic chemoautotrophs from the bivalve Solemya velum. Appl Environ Microb 2007, 73:1174–1179.
14. Conway N, Capuzzo J, Fry B: The role of endosymbiotic bacteria in the nutrition of Solemya velum: evidence from a stable isotope analysis of endosymbionts and host. Limnol Oceanogr 1989, 34:249–255.
15. Krueger DM, Gallager S, Cavanaugh CM: Suspension feeding on phytoplankton by Solemya velum, a symbiont-containing clam. Mar Ecol-Prog Ser 1992, 86:145–151.
16. Cary SC: Vertical transmission of a chemoautotrophic symbiont in the protobranch bivalve, Solemya reidi. Mol Mar Biol Biotechnol 1994, 3:121–130.
17. Krueger DM, Gustafson RG, Cavanaugh CM: Vertical transmission of chemoautotrophic symbionts in the bivalve Solemya velum (Bivalvia: Protobranchia). Biol Bull 1996, 190:195–202.
18. Peek A, Vrijenhoek R, Gaut B: Accelerated evolutionary rate in sulfur-oxidizing endosymbiotic bacteria associated with the mode of symbiont transmission. Mol Biol Evol 1998, 15:1514.
19. Hurtado LA, Mateos M, Lutz RA, Vrijenhoek RC: Coupling of bacterial endosymbiont and host mitochondrial genomes in the hydrothermal vent clam Calyptogena magnifica. Appl Environ Microb 2003, 69:2058–2064.
20. Kuwahara H, Yoshida T, Takaki Y, Shimamura S, Nishi S, Harada M, Matsuyama K, Takishita K, Kawato M, Uematsu K: Reduced genome of the thioautotrophic intracellular symbiont in a deep-sea clam, Calyptogena okutanii. Curr Biol 2007, 17:881–886.
21. Kuwahara H, Takaki Y, Yoshida T, Shimamura S, Takishita K, Reimer JD, Kato C, Maruyama T: Reductive genome evolution in chemoautotrophic intracellular symbionts of deep-sea Calyptogena clams. Extremophiles 2008, 12:365–374.
22. Newton I, Woyke T, Auchtung T, Dilly G, Dutton R, Fisher M, Fontanez K, Lau E, Stewart FJ, Richardson P: The Calyptogena magnifica chemoautotrophic symbiont genome. Science 2007, 315:998–1000.
23. Newton I, Girguis PR, Cavanaugh CM: Comparative genomics of vesicomyid clam (Bivalvia: Mollusca) chemosynthetic symbionts. BMC Genomics 2008, 9:585.
24. Peek A, Feldman R, Lutz R, Vrijenhoek R: Cospeciation of chemoautotrophic bacteria and deep sea clams. Proc Natl Acad Sci U S A 1998, 95:9962.
25. Stewart FJ, Young CR, Cavanaugh CM: Lateral symbiont acquisition in a maternally transmitted chemosynthetic clam endosymbiosis. Mol Biol Evol 2008, 25:673–687.
26. Stewart FJ, Young C, Cavanaugh CM: Evidence for homologous recombination in intracellular chemosynthetic clam symbionts. Mol Biol Evol 2009, 26:1391–1404.
27. Stewart FJ, Baik AHY, Cavanaugh CM: Genetic subdivision of chemosynthetic endosymbionts of Solemya velum along the Southern New England coast. Appl Environ Microb 2009, 75:6005–6007.
28. Krueger DM, Cavanaugh CM: Phylogenetic diversity of bacterial symbionts of Solemya hosts based on comparative sequence analysis of 16S rRNA genes. Appl Environ Microb 1997, 63:91.
29. Moran NA: Accelerated evolution and Muller’s rachet in endosymbiotic bacteria. Proc Natl Acad Sci U S A 1996, 93:2873–2878.
30. Wu M, Sun LV, Vamathevan J, Riegler M, Deboy R, Brownlie JC, McGraw EA, Martin W, Esser C, Ahmadinejad N, Wiegand C, Madupu R, Beanan MJ, Brinkac LM, Daugherty SC, Durkin AS, Kolonay JF, Nelson WC, Mohamoud Y, Lee P, Berry K, Young MB, Utterback T, Weidman J, Nierman WC, Paulsen IT, Nelson KE, Tettelin H, O’Neill SL, Eisen JA: Phylogenomics of the reproductive parasite Wolbachia pipientis wMel: a streamlined genome overrun by mobile genetic elements. PLoS Biol 2004, 2:E69.
31. Robidart J, Bench S, Feldman R, Novoradovsky A, Podell S, Gaasterland T, Allen E, Felbeck H: Metabolic versatility of the Riftia pachyptila endosymbiont revealed through metagenomics. Environ Microbiol 2008, 10:727–737.
32. Gardebrecht A, Markert S, Sievert SM, Felbeck H, Thürmer A, Albrecht D, Wollherr A, Kabisch J, Le Bris N, Lehmann R, Daniel R, Liesegang H, Hecker M, Schweder T: Physiological homogeneity among the endosymbionts of
Riftia pachyptila and Tevnia jerichonana revealed by proteogenomics. ISME J 2012, 6:766–776.
33. Nakagawa S, Shimamura S, Takaki Y, Suzuki Y, Murakami S-I, Watanabe T, Fujiyoshi S, Mino S, Sawabe T, Maeda T, Makita H, Nemoto S, Nishimura S-I, Watanabe H, Watsuji T-O, Takai K: Allying with armored snails: the complete genome of gammaproteobacterial endosymbiont. ISME J 2014, 8:40–51.
34. Woyke T, Teeling H, Ivanova NN, Huntemann M, Richter M, Gloeckner FO, Boffelli D, Anderson IJ, Barry KW, Shapiro HJ, Szeto E, Kyrpides NC, Mussmann M, Amann R, Bergin C, Ruehland C, Rubin EM, Dubilier N: Symbiosis insights through metagenomic analysis of a microbial consortium. Nature 2006, 443:950–955.
35. Wu M, Eisen JA: A simple, fast, and accurate method of phylogenomic inference. Genome Biol 2008, 9:1–11.
36. Murphy FV, Ramakrishnan V: Structure of a purine-purine wobble base pair in the decoding center of the ribosome. Nat Struct Mol Biol 2004, 11:11251–11252.
37. Tatusov RL, Koonin EV, Lipman DJ: A genomic perspective on protein families. Science 1997, 278:631–637.
38. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ: Basic local alignment search tool. J Mol Biol 1990, 215:403–410.
39. Huson DH, Mitra S, Ruscheweyh H-J, Weber N, Schuster SC: Integrative analysis of environmental sequences using MEGAN4. Genome Res 2011, 21:1552–1560.
40. Stewart FJ, Dmytrenko O, DeLong E: Metatranscriptomic analysis of sulfur oxidation genes in the endosymbiont of Solemya velum. Frontiers Microbiol 2011, 2:1–10.
41. Frigaard N-U, Dahl C: Sulfur metabolism in phototrophic sulfur bacteria. Adv Microb Physiol 2009, 54:103–200.
42. Walsh DA, Zaikova E, Howes CG, Song YC, Wright JJ, Tringe SG, Tortell PD, Hallam SJ: Metagenome of a versatile chemolithoautotroph from expanding oceanic dead zones. Science 2009, 326:578–582.
43. Dahl C, Prange A: Bacterial sulfur globules: occurrence, structure and metabolism. In Inclusions in Prokaryotes Microbiology Monographs, Volume 1. 2006:21–51.
44. Friedrich C, Bardischewsky F, Rother D, Quentmeier A, Fischer J: Prokaryotic sulfur oxidation. Curr Opin Microbiol 2005, 8:253–259.
45. Fisher C, Childress J, Arp A, Brooks J, Distel D, Favuzzi J, Macko S, Newton A, Powell M, Somero G, Soto T: Physiology, morphology, and biochemical composition of Riftia pachyptila at Rose Garden in 1985. Deep-Sea Res 1988, 35:1745–1758.
46. Vetter RD: Elemental sulfur in the gills of three species of clams containing chemoautotrophic symbiotic bacteria: a possible inorganic energy storage compound. Mar Biol 1985, 88:33–42.
47. Childress JJ, Girguis PR: The metabolic demands of endosymbiotic chemoautotrophic metabolism on host physiological capacities. J Exp Biol 2011, 214:312–325.
48. Cort JR, Selan U, Schulte A, Grimm F, Kennedy MA, Dahl C: Allochromatium vinosum DsrC: Solution-state NMR structure, redox properties, and interaction with DsrEFH, a protein essential for purple sulfur bacterial sulfur oxidation. J Mol Biol 2008, 382:692–707.
49. Oliveira TF, Vonrhein C, Matias PM, Venceslau SS, Pereira IAC, Archer M: Purification, crystallization and preliminary crystallographic analysis of a dissimilatory DsrAB sulfite reductase in complex with DsrC. J Struct Biol 2008, 164:236–239.
50. Ghosh W, Dam B: Biochemistry and molecular biology of lithotrophic sulfur oxidation by taxonomically and ecologically diverse bacteria and archaea. FEMS Microbiol Rev 2009, 33:999–1043.
51. Chen C, Rabourdin B, Hammen C: The effect of hydrogen sulfide on the metabolism of Solemya velum and enzymes of sulfide oxidation in gill tissue. Comp Biochem Physiol B Biochem Mol Biol 1987, 88:949–952.
52. Biegel E, Schmidt S, González JM, Müller V: Biochemistry, evolution and physiological function of the Rnf complex, a novel ion-motive electron transport complex in prokaryotes. Cell Mol Life Sci 2011, 68:613–634.
53. Bruschi M, Guerlesquin F: Structure, function and evolution of bacterial ferredoxins. FEMS Microbiol Rev 1988, 4:155–175.
54. Kovács KL, Kovács AT, Maróti G, Mészáros LS, Balogh J, Latinovics D, Fülöp A, Dávid R, Dorogházi E, Rákhely G: The hydrogenases of Thiocapsa roseopersicina. Biochem Soc Trans 2005, 33:61–63.
55. Burgdorf T, Lenz O, Buhrke T, van der Linden E, Jones A, Albracht S, Friedrich B: [NiFe]-hydrogenases of Ralstonia eutropha H16: Modular enzymes for oxygen-tolerant biological hydrogen oxidation. J Mol Microb Biotech 2005, 10:181–196.
56. Vignais PM, Billoud B: Occurrence, classification, and biological function of hydrogenases: an overview. Chem Rev 2007, 107:4206–4272.
57. Maroti J, Farkas A, Nagy IK, Maroti G, Kondorosi E, Rakhely G, Kovacs KL: A second soluble hox-type nife enzyme completes the hydrogenase set in Thiocapsa roseopersicina BBS. Appl Environ Microbiol 2010, 76:5113–5123.
58. Petersen JM, Zielinski FU, Pape T, Seifert R, Moraru C, Amann R, Hourdez S, Girguis PR, Wankel SD, Barbe V, Pelletier E, Fink D, Borowski C, Bach W, Dubilier N: Hydrogen is an energy source for hydrothermal vent symbioses. Nature 2011, 476:176–180.
59. Bogachev AV, Verkhovsky MI: Na+-translocating NADH: quinone oxidoreductase: progress achieved and prospects of investigations. Biochem (Moscow) 2005, 70:143–149.
60. Shigenobu S, Watanabe H, Hattori M, Sakaki Y, Ishikawa H: Genome sequence of the endocellular bacterial symbiont of aphids Buchnera sp. APS. Nature 2000, 407:81–86.
61. Pickering BS, Oresnik IJ: Formate-dependent autotrophic growth in Sinorhizobium meliloti. J Bacteriol 2008, 190:6409.
62. Benoit S, Abaibou H, Mandrand-Berthelot M-A: Topological analysis of the aerobic membrane-bound formate dehydrogenase of Escherichia coli. J Bacteriol 1998, 180:6625.
63. Preisig O, Zufferey R, Thony-Meyer L, Appleby C, Hennecke H: A high-affinity cbb3-type cytochrome oxidase terminates the symbiosis-specific respiratory chain of Bradyrhizobium japonicum. J Bacteriol 1996, 178:1532.
64. Pitcher RS, Watmough NJ: The bacterial cytochrome cbb3 oxidases. Biochim Biophys Acta Bioenerg 2004, 1655:388–399.
65. Nunoura T, Sako Y, Wakagi T, Uchida A: Regulation of the aerobic respiratory chain in the facultatively aerobic and hyperthermophilic archaeon Pyrobaculum oguniense. Microbiol (Reading, Engl) 2003, 149:673–688.
66. Otten MF, Stork DM, Reijnders WN, Westerhoff HV, Van Spanning RJ: Regulation of expression of terminal oxidases in Paracoccus denitrificans. Eur J Biochem 2001, 268:2486–2497.
67. Krueger DM, Roeselers G, Sigman D, Cavanaugh CM: Nitrogen nutrition in the symbiosis Solemya velum. in preparation.
68. Potter LC, Millington P, Griffiths L, Thomas GH, Cole JA: Competition between Escherichia coli strains expressing either a periplasmic or a membrane-bound nitrate reductase: does Nap confer a selective advantage during nitrate-limited growth? Biochem J 1999, 344(Pt 1):77–84.
69. Zemmelink H, Houghton L, Sievert S, Frew N, Dacey J: Gradients in dimethylsuffide, dimethylsulfoniopropionate, dimethylsulfoxide, and bacteria near the sea surface. Mar Ecol-Prog Ser 2005, 295:33–42.
70. Mussmann M, Hu FZ, Richter M, de Beer D, Preisler A, Jorgensen BB, Huntemann M, Gloeckner FO, Amann R, Koopman WJH, Lasken RS, Janto B, Hogg J, Stoodley P, Boissy R, Ehrlich GD: Insights into the genome of large sulfur bacteria revealed by analysis of single filaments. PLoS Biol 2007, 5:1923–1937.
71. McCrindle SL, Kappler U, McEwan AG: Microbial dimethylsulfoxide and trimethylamine-N-oxide respiration. Adv Microb Physiol 2005, 50:147–198.
72. Häse CC, Fedorova ND, Galperin MY, Dibrov PA: Sodium ion cycle in bacterial pathogens: evidence from cross-genome comparisons. Microbiol Mol Biol Rev 2001, 65:353–370.
73. Mulkidjanian AY, Dibrov P, Galperin MY: The past and present of sodium energetics: may the sodium-motive force be with you. Biochim Biophys Acta 2008, 1777:985–992.
74. Robinson J, Cavanaugh CM: Expression of form I and form II Rubisco in chemoautotrophic symbioses: implications for the interpretation of stable carbon isotope values. Limnol Oceanogr 1995, 40:1496–1502.
75. Reshetnikov AS, Rozova ON, Khmelenina VN, Mustakhimov II, Beschastny AP, Murrell JC, Trotsenko YA: Characterization of the pyrophosphate-dependent 6-phosphofructokinase from Methylococcus capsulatus Bath. FEMS Microbiol Lett 2008, 288:202–210.
76. Markert S, Gardebrecht A, Felbeck H, Sievert SM, Klose J, Becher D, Albrecht D, Thürmer A, Daniel R, Kleiner M, Hecker M, Schweder T: Status quo in physiological proteomics of the uncultured Riftia pachyptila endosymbiont. Proteomics 2011, 11:3106–3117.
77. Kleiner M, Wentrup C, Lott C, Teeling H, Wetzel S, Young J, Chang Y-J, Shah M, VerBerkmoes NC, Zarzycki J, Fuchs G, Markert S, Hempel K, Voigt B, Becher D, Liebeke M, Lalk M, Albrecht D, Hecker M, Schweder T, Dubilier N:
Metaproteomics of a gutless marine worm and its symbiotic microbial community reveal unusual pathways for carbon and energy use. Proc Natl Acad Sci 2012, 109:E1173–E1182.
78. Bassham J, Benson A, Calvin M: The path of carbon in photosynthesis. J Biol Chem 1950, 185:781–787.
79. Fenton A, Paricharttanakul N, Reinhart G: Identification of substrate contact residues important for the allosteric regulation of phosphofructokinase from Eschericia coli. Biochemistry 2003, 42:6453–6459.
80. Purves J, Cockayne A, Moody PCE, Morrissey JA: Comparison of the regulation, metabolic functions, and roles in virulence of the glyceraldehyde-3-phosphate dehydrogenase homologues gapA and gapB in Staphylococcus aureus. Infect Immun 2010, 78:5223–5232.
81. Wood AP, Aurikko JP, Kelly DP: A challenge for 21st century molecular biology and biochemistry: what are the causes of obligate autotrophy and methanotrophy? FEMS Microbiol Rev 2004, 28:335–352.
82. Han SO, Inui M, Yukawa H: Effect of carbon source availability and growth phase on expression of Corynebacterium glutamicum genes involved in the tricarboxylic acid cycle and glyoxylate bypass. Microbiology 2008, 154:3073–3083.
83. Lee R, Thuesen E, Childress J: Ammonium and free amino acids as nitrogen sources for the chemoautotrophic symbiosis Solemya reidi Bernard (Bivalvia: Protobranchia). J Exp Mar Biol Ecol 1992, 158:75–91.
84. Liao L, Wankel SD, Wu M, Cavanaugh CM, Girguis PR: Characterizing the plasticity of nitrogen metabolism by the host and symbionts of the hydrothermal vent chemoautotrophic symbioses Ridgeia piscesae. Mol Ecol 2013.
85. Lee RW, Childress JJ: Assimilation of inorganic nitrogen by marine invertebrates and their chemoautotrophic and methanotrophic symbionts. Appl Environ Microb 1994, 60:1852–1858.
86. Bourbonnais A, Lehmann MF, Butterfield DA, Juniper SK: Subseafloor nitrogen transformations in diffuse hydrothermal vent fluids of the Juan de Fuca Ridge evidenced by the isotopic composition of nitrate and ammonium. Geochem Geophys Geosyst 2012, 13:1–23.
87. Hentschel U, Felbeck H: Nitrate respiration in the hydrothermal vent tubeworm Riftia pachyptila. Nature 1993, 366:338–340.
88. Lee R, Robinson J, Cavanaugh CM: Pathways of inorganic nitrogen assimilation in chemoautotrophic bacteria-marine invertebrate symbioses: expression of host and symbiont glutamine synthetase. J Exp Biol 1999, 202(Pt 3):289–300.
89. Girguis PR, Lee RW, Desaulniers N, Childress JJ, Pospesel M, Felbeck H, Zal F: Fate of nitrate acquired by the tubeworm Riftia pachyptila. Appl Environ Microbiol 2000, 66:2783–2790.
90. Beckers G, Bendt AK, Kramer R, Burkovski A: Molecular identification of the urea uptake system and transcriptional analysis of urea transporter and urease-encoding genes in Corynebacterium glutamicum. J Bacteriol 2004, 186:7645.
91. De Cian M, Regnault M, Lallier FH: Nitrogen metabolites and related enzymatic activities in the body fluids and tissues of the hydrothermal vent tubeworm Riftia pachyptila. J Exp Biol 2000, 203:2907–2920.
92. Joyner JL, Peyer SM, Lee RW: Possible roles of sulfur-containing amino acids in a chemoautotrophic bacterium-mollusc symbiosis. Biol Bull 2003, 205:331–338.
93. Conway N, Howes B, McDowell Capuzzo J, Turner R, Cavanaugh CM: Characterization and site description of Solemya borealis (Bivalvia; Solemyidae), another bivalve-bacteria symbiosis. Mar Biol 1992, 112:601–613.
94. Eichhorn E, van der Ploeg JR, Kertesz MA, Leisinger T: Characterization of alpha-ketoglutarate-dependent taurine dioxygenase from Escherichia coli. J Biol Chem 1997, 272:23031–23036.
95. Conway N, McDowell Capuzzo J: Incorporation and utilization of bacterial lipids in the Solemya velum symbiosis. Mar Biol 1991, 108:277–291.
96. Karow M, Georgopoulos C: Isolation and characterization of the Escherichia coli msbB gene, a multicopy suppressor of null mutations in the high-temperature requirement gene htrB. J Bacteriol 1992, 174:702–710.
97. Moran N, McCutcheon J, Nakabachi A: Genomics and evolution of heritable bacterial symbionts. Annu Rev Genet 2008, 42:165–190.
98. Nussbaumer AD, Fisher CR, Bright M: Horizontal endosymbiont transmission in hydrothermal vent tubeworms. Nature 2006, 441:345–348.
99. Cavanaugh CM: Symbiosis of chemoautotrophic bacteria and marine invertebrates. In PhD Thesis. Cambridge, MA, USA: Harvard University, Department of Organismic and Evolutionary Biology; 1985.
100. Fisher C, Childress J: Organic carbon transfer from methanotrophic symbionts to the host hydrocarbon-seep mussel. Symbiosis 1992, 12:221–235.
101. Saurin W, Hofnung M, Dassa E: Getting in or out: early segregation between importers and exporters in the evolution of ATP-binding cassette (ABC) transporters. J Mol Evol 1999, 48:22–41.
102. van de Velde W, Zehirov G, Szatmari A, Debreczeny M, Ishihara H, Kevei Z, Farkas A, Mikulass K, Nagy A, Tiricz H: Plant peptides govern terminal differentiation of bacteria in symbiosis. Science 2010, 327:1122–1125.
103. Paau AS, Bloch CB, Brill WJ: Developmental fate of Rhizobium meliloti bacteroids in alfalfa nodules. J Bacteriol 1980, 143:1480–1490.
104. Stewart FJ, Cavanaugh CM: Bacterial endosymbioses in Solemya (Mollusca: Bivalvia)—model systems for studies of symbiont–host adaptation. Antonie Van Leeuwenhoek 2006, 90:343–360.
105. Whitchurch CB, Leech AJ, Young MD, Kennedy D, Sargent JL, Bertrand JJ, Semmler ABT, Mellick AS, Martin PR, Alm RA, Hobbs M, Beatson SA, Huang B, Nguyen L, Commolli JC, Engel JN, Darzins A, Mattick JS: Characterization of a complex chemosensory signal transduction system which controls twitching motility in Pseudomonas aeruginosa. Mol Microbiol 2004, 52:873–893.
106. Newton ILG, Bordenstein SR: Correlations between bacterial ecology and mobile DNA. Curr Microbiol 2011, 62:198–208.
107. Plague GR, Dunbar HE, Tran PL, Moran NA: Extensive proliferation of transposable elements in heritable bacterial symbionts. J Bacteriol 2008, 190:777–779.
108. Gil R, Latorre A, Moya A: Evolution of prokaryote-animal symbiosis from a genomics perspective. In Microbiology Monographs, Volume 19. Berlin, Heidelberg: Springer Berlin Heidelberg; 2010:207–233.
109. Cordaux R, Pichon S, Ling A, Pérez P, Delaunay C, Vavre F, Bouchon D, Grève P: Intense transpositional activity of insertion sequences in an ancient obligate endosymbiont. Mol Biol Evol 2008, 25:1889–1896.
110. Chafee ME, Funk DJ, Harrison RG, Bordenstein SR: Lateral phage transfer in obligate intracellular bacteria (Wolbachia): verification from natural populations. Mol Biol Evol 2010, 27:501–505.
111. Roeselers G, Newton ILG: On the evolutionary ecology of symbioses between chemosynthetic bacteria and bivalves. Appl Microbiol Biotechnol 2012, 94:1–10.
112. Gil R, Sabater-Muñoz B, Latorre A, Silva FJ, Moya A: Extreme genome reduction in Buchnera spp.: toward the minimal genome needed for symbiotic life. Proc Natl Acad Sci U S A 2002, 99:4454–4458.
113. Wu D, Daugherty SC, Van Aken SE, Pai GH, Watkins KL, Khouri H, Tallon LJ, Zaborsky JM, Dunbar HE, Tran PL, Moran NA, Eisen JA: Metabolic complementarity and genomics of the dual bacterial symbiosis of sharpshooters. PLoS Biol 2006, 4:1079–1092.
114. Eid J, Fehr A, Gray J, Luong K, Lyle J, Otto G, Peluso P, Rank D, Baybayan P, Bettman B, Bibillo A, Bjornson K, Chaudhuri B, Christians F, Cicero R, Clark S, Dalal R, Dewinter A, Dixon J, Foquet M, Gaertner A, Hardenbol P, Heiner C, Hester K, Holden D, Kearns G, Kong X, Kuse R, Lacroix Y, Lin S, et al: Real-time DNA sequencing from single polymerase molecules. Science 2009, 323:133–138.
115. Koren S, Schatz MC, Walenz BP, Martin J, Howard JT, Ganapathy G, Wang Z, Rasko DA, McCombie WR, Jarvis ED, Phillippy AM: Hybrid error correction and de novo assembly of single-molecule sequencing reads. Nat Biotechnol 2012.
116. Pacific Biosciences. [http://www.pacb.com]
117. English AC, Richards S, Han Y, Wang M, Vee V, Qu J, Qin X, Muzny DM, Reid JG, Worley KC, Gibbs RA: Mind the Gap: Upgrading Genomes with Pacific Biosciences RS Long-Read Sequencing Technology. PLoS ONE 2012, 7:e47768.
118. Delcher AL, Bratke KA, Powers EC, Salzberg SL: Identifying bacterial genes and endosymbiont DNA with Glimmer. Bioinformatics 2007, 23:673–679.
119. Hyatt D, Chen G-L, LoCascio PF, Land ML, Larimer FW, Hauser LJ: Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 2010, 11:1–11.
120. Besemer J, Lomsadze A, Borodovsky M: GeneMarkS: a self-training method for prediction of gene starts in microbial genomes. Implications for finding sequence motifs in regulatory regions. Nucleic Acids Res 2001, 29:2607.
33

Dmytrenko et al. BMC Genomics 2014, 15:924 Page 20 of 20 http://www.biomedcentral.com/1471-2164/15/924

121. Suzek BE, Huang H, McGarvey P, Mazumder R, Wu CH: UniRef: comprehensive and non-redundant UniProt reference clusters. Bioinformatics 2007, 23:1282–1288. 122. Tatusova T, Ciufo S, Fedorov B, O’Neill K, Tolstoy I: RefSeq microbial genomes database: new representation and annotation strategy. Nucleic Acids Res 2014, 42:D553–D559. 123. Wilke A, Harrison T, Wilkening J, Field D, Glass EM, Kyrpides N, Mavrommatis K, Meyer F: The M5nr: a novel non-redundant database containing pro- tein sequences and annotations from multiple sources and associated tools. BMC Bioinformatics 2012, 13:1–5. 124. Standard operating procedure for the annotations of genomes and metagenomes submitted to the integrated microbial genomes expert review (IMG-ER) system. [http://img.jgi.doe.gov/w/doc/img_er_ann.pdf] 125. Gao F, Zhang C-T: Ori-Finder: A web-based system for finding oriCs in unannotated bacterial genomes. BMC Bioinformatics 2009, 9:1–6. 126. Bprom. [http://www.softberry.com] 127. Emanuelsson O, Brunak S, von Heijne G, Nielsen H: Locating proteins in the cell using TargetP, SignalP and related tools. Nat Protoc 2007, 2:953–971. 128. Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliam H, Valentin F, Wallace IM, Wilm A, Lopez R, Thompson JD, Gibson TJ, Higgins DG: Clustal W and Clustal X version 2.0. Bioinformatics 2007, 23:2947–2948. 129. Robinson DG, Lee M-C, Marx CJ: OASIS: an automated program for global investigation of bacterial and archaeal insertion sequences. Nucleic Acids Res 2012, 40:e174. 130. Bi D, Xu Z, Harrison EM, Tai C, Wei Y, He X, Jia S, Deng Z, Rajakumar K, Ou H-Y: ICEberg: a web-based resource for integrative and conjugative elements found in Bacteria. Nucleic Acids Res 2012, 40(Database issue):D621–D626. 131. Leplae R, Lima-Mendez G, Toussaint A: ACLAME: a CLAssification of Mobile genetic Elements, update 2010. Nucleic Acids Res 2010, 38(Database issue):D57–D61.

doi:10.1186/1471-2164-15-924 Cite this article as: Dmytrenko et al.: The genome of the intracellular bacterium of the coastal bivalve, Solemya velum: a blueprint for thriving in and out of symbiosis. BMC Genomics 2014 15:924.


CHAPTER 2

The "missing enzyme" in the enigmatic Calvin cycle of chemoautotrophic bacterial symbionts

Oleg Dmytrenko1, Frank J. Stewart2, Daniel R. Utter1, Colleen M. Cavanaugh1

1Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts, United States of America.

2School of Biology, Georgia Institute of Technology, Atlanta, Georgia, United States of America.


Abstract

Sulfur-oxidizing gammaproteobacterial symbionts of marine invertebrates fix CO2 via the Calvin-Benson-Bassham (Calvin) cycle despite lacking the gene for fructose 1,6-bisphosphatase (FBPase). Here we investigated whether the reversible pyrophosphate-dependent phosphofructokinase (PPi-PFK) from the symbiont of the bivalve Solemya velum can perform the biochemical function of the missing FBPase. We detected high expression of the symbiont PPi-PFK-encoding gene and high reverse PPi-PFK activity in the symbiont-containing tissue of the host. Compared to other bacterial PPi-PFKs, the recombinant enzyme had the highest specificity for the reverse reaction, and its catalytic efficiency was higher than that of many bacterial FBPases. Using ancestral state reconstruction, we demonstrated that the selection of PPi-PFK over FBPase occurred in all lineages of gammaproteobacterial chemoautotrophic symbionts. Our findings support the hypothesis that PPi-PFK can perform the biochemical function of FBPase and suggest that PPi-PFK may play an important role in the evolution and maintenance of chemoautotrophic symbioses.


Introduction

Most of life on Earth thrives on biomass produced by autotrophic carbon fixation. Of the six known autotrophic carbon fixation pathways, the Calvin-Benson-Bassham (Calvin) cycle (Bassham et al. 1953) is the most widespread and is responsible for over 90% of primary production (Raven 2009; Schwander et al. 2016). The Calvin cycle, found in plants, protists, and bacteria, utilizes the enzyme ribulose 1,5-bisphosphate carboxylase/oxygenase (RuBisCO) in the key CO2 incorporation step and relies on twelve auxiliary enzymatic reactions to regenerate its metabolic intermediates (Singer et al. 1952; Bar-Even et al. 2012; Raven 2013; Erb & Zarzycki 2018). Enzymes which catalyze the auxiliary reactions may be structurally unrelated but are functionally equivalent in different organisms (Martin & Schnarrenberger 1997). The Calvin cycle of endocellular gammaproteobacterial symbionts of marine invertebrates is perhaps the most enigmatic. These symbiotic bacteria lack genes for fructose 1,6-bisphosphatase (FBPase, EC 3.1.3.11), an enzyme which catalyzes one of the essential auxiliary reactions in the cycle (Martin & Schnarrenberger 1997). Discovery and characterization of a functionally equivalent enzyme which performs the function of the missing FBPase in the symbionts may uncover a previously unknown variant of the Calvin cycle and would shed light on the metabolism and, potentially, the evolution of chemoautotrophic symbioses.

Chemoautotrophic symbionts are some of the most prolific primary producers known (Lutz et al. 1994). These bacteria harness energy by oxidizing reduced inorganic compounds, such as sulfide (Felbeck et al. 1981; Cavanaugh 1983), methane (Cavanaugh et al. 1992; Barry et al. 2002), or hydrogen (Petersen et al. 2011). To capture and deliver oxygen and electron donors to the symbionts, their hosts evolved a number of behavioral, physiological, and biochemical adaptations (Doeller et al. 1988; Polz et al. 2000; Flores et al. 2005). In return, the symbionts provide their eukaryotic partners with organic carbon obtained by fixing CO2 into biomass using RuBisCO (Felbeck 1981; Cavanaugh 1983; Polz et al. 1992; Nelson & Hagen 1995; Fiala-Medioni et al. 2002). These bacteria-host associations have repeatedly evolved in a wide range of taxa, allowing both partners to occupy otherwise inhospitable environments, from hydrothermal vents to anoxic coastal sediments (Cavanaugh et al. 2013; Dubilier et al. 2008).

The majority of sulfur-oxidizing symbionts are gammaproteobacteria from a number of different clades. They exhibit a range of transmission modes (Nussbaumer et al. 2006; Stewart et al. 2009; Russell et al. 2017) and diverse means of nutrient transfer to the host (Lee et al. 1999; Sanders et al. 2013), and have varying genome sizes (Newton et al. 2007; Dmytrenko et al. 2014; Nakagawa et al. 2014). A striking commonality among gammaproteobacterial chemoautotrophic symbionts, on the other hand, is the consistent lack of the fbp gene encoding FBPase. In bacteria this enzyme performs essential auxiliary reactions in the Calvin cycle by dephosphorylating fructose 1,6-bisphosphate (FBP) and sedoheptulose 1,7-bisphosphate (SBP) to fructose 6-phosphate (F6P) and sedoheptulose 7-phosphate (S7P), respectively (Gerbling et al. 1986; Yoo & Bowien 1995). There are four known types of bacterial FBPases: types I, II, III, and V. Type IV has so far been identified only in archaea (Rashid et al. 2002). Type I is the most widely distributed in nature, being the primary FBPase in Escherichia coli and the majority of bacterial species, including autotrophs which rely on the Calvin cycle for carbon fixation (Hines et al. 2007). Some bacteria, such as E. coli and Bacillus methanolicus, also possess FBPase type II, encoded by the glpX gene (Donahue et al. 2000; Brown et al. 2009; Stolzenberger et al. 2013). In some other bacteria, for example, Corynebacterium glutamicum, GlpX is the only known FBPase (Rittmann et al. 2003). Type III was first described in a Gram-positive bacterium, Bacillus subtilis, and is generally rare (Fujita et al. 1998). Type V is predominantly archaeal, but was also found in at least one thermophilic bacterium, Aquifex aeolicus (Rashid et al. 2002). Similar to FBPase type I, GlpX is promiscuous and can dephosphorylate FBP as well as SBP (Gerbling et al. 1986; Stolzenberger et al. 2013). Promiscuity of other bacterial FBPases, which are primarily confined to bacteria without the Calvin cycle, has not been tested to our knowledge. In eukaryotes FBPases are unable to dephosphorylate SBP (Teich et al. 2007). Instead, this reaction, which is specific to the Calvin cycle, is catalyzed by sedoheptulose 1,7-bisphosphatase (SBPase). Without FBPase (and, in eukaryotes, SBPase), Calvin cycle intermediates cannot be regenerated, disrupting CO2 fixation.

To account for the absence of fbp in chemoautotrophic symbionts, it has been hypothesized that these bacteria co-opt a pyrophosphate-dependent phosphofructokinase (PPi-PFK, EC 2.7.1.90) to perform the function of the missing FBPase (Newton et al. 2007; S. Markert et al. 2011; Kleiner et al. 2012; Dmytrenko et al. 2014). The first PPi-PFK was described in a protist, Entamoeba histolytica, alongside the conventional and more common ATP-dependent phosphofructokinase (ATP-PFK, EC 2.7.1.11) (Reeves et al. 1974). Later PPi-PFKs were also discovered in bacteria (O'Brien et al. 1975), plants (Carnal & Black 1979), and archaea (Siebers et al. 1998). Being the less common of the two enzymes, PPi-PFK is primarily found in anaerobic organisms, which have a reduced capacity for ATP synthesis and may benefit from using PPi instead of the more costly ATP (Mertens 1991). Both ATP-PFK and PPi-PFK are thought to participate in glycolysis. However, unlike ATP-PFK, which is virtually irreversible, PPi-PFK is reversible under physiological conditions and, therefore, may take part in gluconeogenesis, an ability which was first demonstrated by complementing FBPase deficiency in E. coli with PPi-PFK from Propionibacterium freudenreichii (Kemp & Tripathi 1993). PPi-PFKs are promiscuous for FBP and SBP, a property which could allow these enzymes to participate in the Calvin cycle (Reshetnikov et al. 2008). When operating in reverse, PPi-PFK dephosphorylates FBP/SBP to F6P/S7P, analogous to FBPase, with the difference that PPi-PFK also yields pyrophosphate (PPi) (Heinonen 2001). Accumulation of PPi may inhibit the reverse reaction of PPi-PFK and could make the forward reaction more favorable (Stitt 1989; Theodorou & Plaxton 1996; Frese et al. 2014). To drive PPi-PFK in the reverse direction, a continuous removal of pyrophosphate would be required. PPi can be consumed by a number of enzymes, including inorganic pyrophosphatase (PPase, EC 3.6.1.1), sodium-translocating PPase (Na+-PPase, EC 3.6.1.1), proton-pumping PPase (H+-PPase, EC 3.6.1.1), ATP sulfurylase (SAT, EC 2.7.7.4), or pyruvate phosphate dikinase (EC 2.7.9.1) (van Alebeek & Keltjens 1994; Heinonen 2001; Serrano et al. 2007; Parey et al. 2013). The Gibbs standard free energy change (∆fGº) due to PPi hydrolysis by PPase is -22 kJ/mol, which may increase the equilibrium constant (K') of the reverse reaction by 10³ to 10⁴-fold (Heinonen 2001). Aside from PPi removal, H+/Na+-PPases may create electrochemical gradients, which can be consumed, for example, by ATP synthases to produce ATP (Serrano et al. 2007; Biegel & V. Müller 2011). SAT, an enzyme in the sulfur oxidation pathway of chemoautotrophic symbionts and free-living sulfur-oxidizing bacteria, transfers PPi to adenosine 5'-phosphosulfate (APS), making SO4²⁻ as well as ATP (Felbeck et al. 1981; C. Chen et al. 1987; Polz et al. 1992; Laue & Nelson 1994; Fiala-Medioni et al. 2002; Parey et al. 2013). Thus, it appears that pyrophosphate removal may be integral to the reverse PPi-PFK activity. Furthermore, it would not only prevent inhibition by PPi and make the reverse PPi-PFK reaction more favorable in vivo, but could also offer energy savings in the form of ATP synthesis, either indirectly through the action of H+/Na+-PPases or directly using SAT.
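The size of the shift in K' quoted above can be checked with a short calculation. The sketch below assumes that coupling PPi hydrolysis to the reverse reaction simply adds -22 kJ/mol to the standard free energy change, and that the temperature is 25℃; both are illustrative assumptions, and the function name is hypothetical.

```python
import math

R = 8.314462618  # gas constant, J mol^-1 K^-1
T = 298.15       # assumed temperature (25 degrees C), in K


def equilibrium_shift(delta_g_kj_per_mol, temperature_k=T):
    """Fold change in K' caused by an added free energy term (kJ/mol).

    Uses K' = exp(-deltaG / RT), so a negative deltaG raises K'.
    """
    return math.exp(-delta_g_kj_per_mol * 1000.0 / (R * temperature_k))


# Free energy contributed by PPi hydrolysis, as quoted in the text.
fold = equilibrium_shift(-22.0)
print(f"fold change in K': {fold:.0f} (10^{math.log10(fold):.2f})")
```

Under these assumptions the factor comes out at roughly 7 × 10³, which falls inside the 10³ to 10⁴ range cited from Heinonen (2001).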

The physiological significance of PPi-PFK was evaluated in a chemoautotrophic symbiont of the coastal bivalve, Solemya velum, by analyzing transcription of the PPi-PFK-encoding gene, pfp (Gene ID 31575776), in the context of total gene expression. This intracellular bacterium, housed within the gill tissue of S. velum, belongs to the class Gammaproteobacteria and is one of the best studied chemoautotrophic symbionts (Stewart & Cavanaugh 2006). To analyze gene expression, a previously published dataset of S. velum symbiont transcripts (Stewart et al. 2011) was combined with new data and reevaluated in the context of the recently sequenced symbiont genome (Dmytrenko et al. 2014). This approach improved the initial transcriptome analysis, which focused primarily on sulfur oxidation genes, and extended transcription-based functional predictions to other aspects of the symbiont's physiology, such as carbon metabolism. Our transcriptomic results were compared to previous metaproteomic studies of chemoautotrophic symbionts of Riftia pachyptila (S. Markert et al. 2011) and Olavius algarvensis (Kleiner et al. 2012), which found significant levels of PPi-PFK protein in these bacteria. Analysis of pfp expression in the symbiont of S. velum was further complemented by measurements of PPi-PFK activity in the symbiont-containing gill and the symbiont-free foot tissue of the host.

To determine the ability of PPi-PFK from the S. velum symbiont to perform the biochemical function commonly carried out in the Calvin cycle by FBPase, symbiont pfp was expressed in E. coli, and the recombinant enzyme was purified and characterized. Recombinant PPi-PFK and FBPase from a closely related purple sulfur bacterium, Allochromatium vinosum, were also included in this characterization. A. vinosum is capable of using light as the primary energy source and reduced sulfur compounds as a source of electrons (Imhoff 2005; Weissgerber et al. 2011). Under photolithoautotrophic conditions this bacterium fixes CO2 via the Calvin cycle. Being metabolically versatile, A. vinosum is also capable of growing photoorganoheterotrophically using, for example, acetate and malate. A comparison between PPi-PFK, which may be used in the Calvin cycle of the S. velum symbiont, and the enzymes of a closely related autotrophic bacterium which does not lack FBPase offered valuable insights into the physiological function of the symbiont PPi-PFK. This comparative biochemical analysis was further expanded by including previously characterized bacterial PPi-PFKs (Pfleiderer & Klemme 1980; Deng et al. 1999; Ronimus et al. 1999; Ding et al. 1999; Reshetnikov et al. 2008; Frese et al. 2014) and FBPases (Kelley-Loughnane et al. 2002; Brown et al. 2009; Myung et al. 2010).


The phylogeny of PPi-PFKs and ATP-PFKs has been studied in diverse organisms (Mertens 1991; Michels et al. 1997; Reshetnikov et al. 2008; Frese et al. 2014; S. B. Le et al. 2017). These two enzymes share common ancestry, but their evolutionary histories are complex and riddled with horizontal gene transfer events and point mutations which are able to change specificity from ATP to PPi and vice versa (Chi & Kemp 2000; M. Müller et al. 2001; Bapteste et al. 2003). While a number of bacteria are known to have both PPi-PFK and ATP-PFK, the majority have only one or the other enzyme (Roberton & Glucina 1982; Bapteste et al. 2003). The lack of FBPase combined with the presence of PPi-PFK has been investigated only sporadically and is thought to be primarily confined to chemoautotrophic symbionts (Newton et al. 2007; Reshetnikov et al. 2008; B. Markert et al. 2014; Kleiner et al. 2012; Dmytrenko et al. 2014). To establish whether this pattern holds widely in chemoautotrophic gammaproteobacterial symbionts, we surveyed all available symbiont genomes and a wide range of genomic sequences from closely related free-living autotrophic and heterotrophic bacteria. Using ancestral state reconstruction, events of FBPase loss and PPi-PFK gain were investigated in the evolutionary histories of these symbiotic and non-symbiotic bacteria.

The results of this study demonstrate that PPi-PFK from a chemoautotrophic symbiont is capable of catalyzing biochemical reactions commonly performed by FBPase in the Calvin cycle. PPi-PFK likely plays an important role in the symbiont metabolism, as high pfp expression and high reverse PPi-PFK activity were detected in the symbiont-containing gill tissue of S. velum. The recombinant symbiont PPi-PFK had the highest specificity for the reverse reaction among bacterial PPi-PFKs and higher catalytic efficiency than many bacterial FBPases. PPi removal was essential to PPi-PFK reverse activity. In the symbionts, PPi can be consumed by a number of enzymes, such as H+/Na+-PPases or SAT. Their activity may not only make the reverse PPi-PFK activity more favorable in vivo but could also offer additional benefits in the form of ATP synthesis. Using ancestral state reconstruction we determined that the shift from FBPase to PPi-PFK occurred in the evolutionary histories of all analyzed gammaproteobacterial symbionts, suggesting that this genotype may be essential to the origin and maintenance of chemoautotrophic symbioses. This observation agrees with the demonstrated biochemical ability of PPi-PFK to participate in the Calvin cycle, a function to which this enzyme may have specifically adapted in symbiotic bacteria.

Results

Selection of PPi-PFK over FBPase occurred in the evolutionary histories of all sequenced chemoautotrophic symbionts

Like the S. velum symbiont, all 11 other chemoautotrophic symbionts sequenced to date contained genes for PPi-PFK and lacked FBPase and ATP-PFK genes (Figure 2.1). To assess whether this genotype was inherited or independently derived, an ancestral state reconstruction was performed. Prior to predicting ancestral states, a Bayesian phylogeny based on 15 genes was created for all the sequenced chemoautotrophic symbionts and their most closely related free-living bacteria with complete or nearly complete genomes. The resulting time-calibrated maximum clade credibility (MCC) tree represented the best-resolved phylogeny of chemoautotrophic symbionts to date. Presence or absence of the four genes of interest, PPi-PFK, ATP-PFK, FBPase, and RuBisCO (a Calvin cycle marker), was mapped to the tips of the tree.

Gammaproteobacteria which possess PPi-PFK and RuBisCO and lack ATP-PFK and FBPase formed two disparate monophyletic clades composed primarily of chemoautotrophic symbionts. The only exceptions were the free-living bacteria Sedimenticola sp. SIP G1, Sedimenticola selenatireducens DSM17993, Thioglobus singularis EF1 (SUP05), and Methylococcus capsulatus Bath, which are known for their overall high similarity to chemoautotrophic symbionts (Ward et al. 2004; Walsh et al. 2009; Carlström et al. 2015; Flood et al. 2015).

[Figure 2.1 image: phylogenetic tree. Legend entries: PPi-PFK, ATP-PFK, FBPase, RuBisCO, Not detected. Scale bar: 500 My.]

Figure 2.1. Multi-gene time-calibrated Bayesian phylogeny of chemosynthetic symbionts (yellow) and closely related free-living bacteria. Support values are listed at the nodes. Presence or absence of PPi-PFK, ATP-PFK, FBPase, and RuBisCO genes is mapped at the tips of the tree. Inferred ancestral states are labeled at the nodes. Significant ancestral states are marked with an asterisk (*).


The most recent common ancestor (MRCA) of each symbiont clade unequivocally had PPi-PFK and RuBisCO and lacked FBPase, while the MRCA of the two clades possessed PPi-PFK and RuBisCO. However, the ancestral state of FBPase in this last shared common ancestor could not be conclusively determined, as a Bayes factor (BF) comparing the reconstruction probabilities with and without FBPase at the MRCA of the two clades produced an insignificant result (log BF < 2).
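The decision rule behind the log BF < 2 statement can be illustrated as follows. This is a minimal sketch, assuming that the Bayes factor is formed from the log marginal likelihoods of two reconstructions (FBPase fixed as present versus absent at the node) and compared on the natural-log scale; the likelihood values and function names are hypothetical and are not results of this study.

```python
def log_bayes_factor(log_ml_a, log_ml_b):
    """Log Bayes factor of model A over model B from log marginal likelihoods."""
    return log_ml_a - log_ml_b


def is_supported(log_bf, threshold=2.0):
    """True when the log Bayes factor exceeds the significance threshold."""
    return log_bf > threshold


# Hypothetical log marginal likelihoods for the two node constraints.
log_ml_with_fbpase = -1520.4
log_ml_without_fbpase = -1521.1

log_bf = log_bayes_factor(log_ml_with_fbpase, log_ml_without_fbpase)
print(f"log BF = {log_bf:.2f}, supported: {is_supported(log_bf)}")
```

With these illustrative numbers the log BF stays below 2, so neither state would be preferred, which is the situation described for FBPase at the MRCA of the two clades.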

pfp and the Calvin cycle genes are among the most highly expressed in the S. velum symbiont

Total RNA from the symbiont-containing gill tissue of S. velum was analyzed to evaluate the metabolic potential of this chemoautotrophic bacterial symbiont. For this purpose two cDNA libraries, enriched and unenriched in the symbiont mRNA transcripts, were sequenced (Appendix 2). At least 3.3% of the sequences, which corresponded to mRNA and tRNA (Appendix 2 Figure A2.3), aligned to the symbiont genome. The highest number of mRNA transcripts mapped to the genes involved in housekeeping, carbon metabolism, and sulfur oxidation (Figure 2.2). Gene expression was uniformly distributed among the four largest contigs which comprise the genome of the S. velum symbiont (Dmytrenko et al. 2014). sirA (Gene ID 31577136, 2.88% transcripts kb⁻¹), which encodes a known virulence response regulator (Lawhon et al. 2002; Teplitski et al. 2003), was the most highly expressed gene in the symbiont, followed by the ribosomal protein gene rpmJ (Gene ID 31577289, 1.68% transcripts kb⁻¹). cbbL (Gene ID 31576636, 1.71% transcripts kb⁻¹), encoding the RuBisCO large subunit (Schwedock et al. 2004), had the third highest expression level and was the most highly transcribed gene in the Calvin cycle (Figure 2.3). Among the sulfur oxidation genes, dsrH (Gene ID 31575343, 1.2% transcripts kb⁻¹) and dsrC (Gene ID 31575342, 0.87% transcripts kb⁻¹) had the highest expression levels.


Figure 2.2. Gene expression in the S. velum symbiont. The circular insert depicts transcription across ten contigs which constitute the genome of the symbiont (Dmytrenko et al. 2014). From outside to the center: genome contigs (Mb); average transcriptional levels not normalized to gene length; genes on forward strand (dark blue); genes on reverse strand (light blue); Calvin cycle genes (red; pgk and gapA lie between tktA and fbp and are not shown); sulfur oxidation genes (yellow, including the dsrABEFHCMKLJOPNRS operon); tRNA genes (purple); rRNA genes (brown). Expression of rRNA genes is not shown. The bar graph shows the most highly expressed protein-coding genes in the S. velum symbiont with their transcription normalized to gene length (in kilobases). Data from rRNA enriched and unenriched cDNA datasets are presented for each gene. The Calvin cycle genes (red) and sulfur oxidation genes (yellow) are shown. Expression values and NCBI gene IDs for the respective genes are listed in Appendix 2 Table A2.2.


Figure 2.3. Proposed Calvin cycle in the S. velum symbiont. Circle areas are proportional to average gene expression (percentage of transcripts per kb of gene length) of the corresponding genes. NCBI gene IDs for the respective genes are listed in Appendix 2 Table A2.2.

These genes, known to encode cytoplasmic sulfur carrier proteins (Stockdreher et al. 2014), are part of the highly transcribed dsrABEFHCMKLJOPNRS operon in the S. velum symbiont. Our analysis showed that genes encoding glycolytic enzymes such as pyruvate kinase (Gene ID 31576039, 0.1% transcripts kb⁻¹), phosphoglycerate mutase (Gene ID 31575769, 0.044% transcripts kb⁻¹), or enolase (Gene ID 31575165, 0.026% transcripts kb⁻¹) had low expression. In contrast, transcriptional levels of pfp (Gene ID 31575776, 0.23% transcripts kb⁻¹) were high and comparable to those of the Calvin cycle genes, such as those encoding glyceraldehyde 3-phosphate dehydrogenase (gapA, Gene ID 31576041, 0.27% transcripts kb⁻¹) and phosphoribulokinase (prkA, Gene ID 31576527, 0.37% transcripts kb⁻¹).
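The expression unit used throughout this section (percentage of transcripts per kb of gene length) can be sketched as below. The exact normalization applied to the sequencing data is not spelled out here, so the sketch assumes that each gene's share of all mapped mRNA reads is divided by its length in kilobases; the gene names, read counts, and lengths are hypothetical.

```python
def percent_transcripts_per_kb(read_counts, gene_lengths_bp):
    """Return each gene's share of mapped reads (%) divided by its length in kb."""
    total_reads = sum(read_counts.values())
    return {
        gene: 100.0 * count / total_reads / (gene_lengths_bp[gene] / 1000.0)
        for gene, count in read_counts.items()
    }


# Hypothetical read counts and gene lengths (bp), for illustration only.
counts = {"geneA": 600, "geneB": 300, "geneC": 100}
lengths = {"geneA": 2000, "geneB": 500, "geneC": 1000}

expression = percent_transcripts_per_kb(counts, lengths)
for gene, value in sorted(expression.items(), key=lambda item: -item[1]):
    print(f"{gene}: {value:.1f}% transcripts per kb")
```

Dividing by gene length matters for the comparison made above: in this toy example geneB receives half as many reads as geneA, yet ranks higher once its shorter length is taken into account.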


Enzymatic PPi-PFK activity is present in the symbiont-containing tissue

PPi-PFK activity, measured as PPi formation from FBP and PO4³⁻, was detected in cell-free extracts (CFE) of S. velum symbiont-containing gill tissue (Table 2.1), but was absent from the foot CFE (Appendix 2 Table A2.3). No PPi-PFK activity was observed when proteins were denatured by boiling or in the absence of PO4³⁻. Sugars other than FBP, such as fructose and F6P, did not trigger PPi formation with PO4³⁻. Rates of PPi formation were dependent on concentrations of FBP and PO4³⁻. The highest rates of approximately 27 nmol PPi min⁻¹ mg protein⁻¹ were measured for 2.5-5 mM FBP and 20-25 mM PO4³⁻, with the exception of the 2.5 mM FBP and 25 mM PO4³⁻ substrate combination (Table 2.1). FBPase activity was detected in the foot CFE (data not shown).

Table 2.1. PPi-PFK activity (nmol PPi min⁻¹ mg total protein⁻¹) in S. velum gill tissue cell-free extracts as a function of FBP and PO4³⁻ concentrations. Measurements were performed at pH 7.5 and at 25℃. Standard deviations from three biological replicates are shown.

| PO4³⁻ [mM] | FBP 2.5 mM | FBP 5 mM | FBP 10 mM |
|---|---|---|---|
| 10 | 20.5±0.8 | 20.7±0.6 | 21.0±0.9 |
| 20 | 27.7±2.4 | 27.5±1.8 | 25.2±0.9 |
| 25 | 25.3±1.9 | 27.2±0.4 | 25.0±0.9 |

Recombinant PPi-PFK from the S. velum symbiont is pyrophosphate-dependent and bidirectional

Recombinant symbiont PPi-PFK and PPase and A. vinosum PPi-PFK and FBPase proteins were purified close to homogeneity (Figure 2.4). The sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) analyses were in agreement with the predicted sizes of 47.55 kDa and 22.7 kDa for the symbiont PPi-PFK and PPase and 47.47 kDa and 39.33 kDa for A. vinosum PPi-PFK and FBPase, respectively.

Figure 2.4. SDS-PAGE analysis of the recombinant S. velum symbiont (A) PPi-PFK and (B) PPase and A. vinosum (C) PPi-PFK and (D) FBPase. The proteins were His-tagged at the N-terminal ends and analyzed during each purification step.

The purified PPi-PFK from the S. velum symbiont was pyrophosphate-dependent and unable to use ATP as substrate (Figure 2.5). A high rate of FBP formation (104±2.5 U/mg) was observed in the presence of 5 mM PPi. In comparison, with an equimolar amount of ATP only a background level of activity (1±0.07 U/mg) was recorded.

Kinetic parameters of the symbiont PPi-PFK were determined by measuring product formation at various substrate combinations in forward and reverse reactions (Figure 2.6; Appendix 2 Tables A2.4 and A2.5) and under different pH and temperature conditions (Figure 2.7). The forward reaction reached its highest initial velocity at 7.5 mM F6P and 5 mM PPi (Figure 2.6 A, Appendix 2 Table A2.4). Higher substrate concentrations were not tested as velocities plateaued at these values. The highest initial velocities of the reverse reaction occurred between 2.5 mM and 5 mM FBP and 10 mM and 25 mM phosphate (Figure 2.6 B, Appendix 2 Table A2.5). Substrate concentrations above 5 mM FBP and 25 mM PO4³⁻ were inhibitory. Overall, the symbiont PPi-PFK had higher reaction velocities in the reverse than in the forward reaction. The pH optimum of the enzyme was observed between 7.5 and 8 (Figure 2.7 A). The temperature optimum was recorded between 55℃ and 65℃ (Figure 2.7 B).

Figure 2.5. S. velum symbiont PPi-PFK forward reaction activity with either 5 mM PPi or ATP. The assay was initiated by adding the source of phosphates (ATP or PPi) to a reaction mixture containing 7.5 mM F6P and the recombinant PPi-PFK. Enzyme activity was measured by converting FBP reaction product by fructose 1,6-bisphosphate aldolase (EC 4.1.2.13) and triosephosphate isomerase (EC 5.3.1.1) to dihydroxyacetone phosphate (DHAP), with a subsequent reduction of DHAP by glycerophosphate dehydrogenase (EC 1.1.1.8) in a reaction which consumes NADH. Measurements were performed at pH 7.5 and at 25℃. Error bars show standard deviations from three replicate measurements performed using the same enzyme preparation.
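The coupled assay described in the caption above reports FBP formation indirectly through NADH consumption, so converting an absorbance trace into U/mg takes a few steps. The sketch below assumes detection at 340 nm with the standard NADH extinction coefficient of 6.22 mM⁻¹ cm⁻¹, a 1 cm path length, and complete coupling, so that each FBP yields two DHAP and therefore consumes two NADH. The numerical inputs and the function name are hypothetical.

```python
NADH_EXTINCTION_MM = 6.22  # mM^-1 cm^-1 at 340 nm (standard value for NADH)


def specific_activity(delta_a340_per_min, reaction_volume_ml, enzyme_mg,
                      path_length_cm=1.0, nadh_per_fbp=2.0):
    """Specific activity in U/mg (micromol FBP formed per min per mg enzyme)."""
    # Beer-Lambert law: absorbance change -> NADH concentration change (mM/min).
    nadh_mm_per_min = delta_a340_per_min / (NADH_EXTINCTION_MM * path_length_cm)
    # mM equals micromol per mL, so scaling by volume gives micromol/min.
    nadh_umol_per_min = nadh_mm_per_min * reaction_volume_ml
    # Two NADH are oxidized per FBP when aldolase, TPI and GPDH couple fully.
    fbp_umol_per_min = nadh_umol_per_min / nadh_per_fbp
    return fbp_umol_per_min / enzyme_mg


# Hypothetical reading: 0.622 absorbance units per min, 1 mL assay, 0.5 microgram enzyme.
activity = specific_activity(0.622, reaction_volume_ml=1.0, enzyme_mg=0.0005)
print(f"{activity:.1f} U/mg")
```

The factor of two is the step most easily overlooked: omitting it would double the apparent activity of the enzyme.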


Figure 2.6. Initial velocities of the recombinant symbiont PPi-PFK in (A) forward and (B) reverse reactions at different substrate concentrations calculated within the linear range of each substrate combination (Appendix 2 Tables A2.4 and A2.5). Measurements were performed at pH 7.5 and at 25℃. Color legend shows initial velocities in U mg⁻¹ of recombinant PPi-PFK.

Figure 2.7. Initial velocities of the symbiont PPi-PFK in the reverse reaction under different (A) pH and (B) temperature conditions measured at 5 mM FBP and 20 mM PO4³⁻. Error bars show standard deviations from three replicate measurements.


Figure 2.8. Influence of pyrophosphate on the reverse reaction of the symbiont recombinant PPi-PFK with and without PPase across a range of FBP concentrations. Measurements were performed at pH 7.5 and at 25℃. Error bars show standard deviations from three replicate measurements.

Inhibition of the S. velum symbiont PPi-PFK by PPi can be attenuated by PPase

Pyrophosphate acted as a competitive inhibitor for FBP in the reverse reaction (Figure 2.8, Appendix 2 Table A2.2). The inhibition constant of PPi (KiPPi) was 0.381±0.079 mM. PPi inhibition was alleviated by a pyrophosphate-consuming enzyme, such as the recombinant PPase (Km 0.18±0.02 mM) encoded in the same operon as the symbiont PPi-PFK. Addition of 1.75 U PPase to the reverse reaction negated the inhibitory effects of 1.0 mM PPi.
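The competitive inhibition reported above can be illustrated with the textbook relation Km,app = Km (1 + [I]/Ki). The following is a minimal sketch, not the fitting procedure used in this study: it applies that relation to the Km for FBP and the KiPPi reported here and prints the predictions next to the apparent Km values measured at each PPi concentration (Table 2.2). The function and variable names are illustrative assumptions.

```python
# Textbook competitive inhibition: Km_app = Km * (1 + [I] / Ki).
# Km and Ki are the values reported in this chapter; all names are illustrative.

KM_FBP_MM = 0.150   # Km for FBP without added PPi (Table 2.2)
KI_PPI_MM = 0.381   # inhibition constant of PPi reported in the text

# Apparent Km for FBP (mM) measured at each PPi concentration (mM), Table 2.2.
MEASURED_KM_APP = {0.05: 0.184, 0.125: 0.175, 0.25: 0.265, 0.5: 0.326, 1.0: 0.469}


def apparent_km(km_mm, inhibitor_mm, ki_mm):
    """Apparent Km expected under purely competitive inhibition."""
    return km_mm * (1.0 + inhibitor_mm / ki_mm)


for ppi_mm, measured in MEASURED_KM_APP.items():
    predicted = apparent_km(KM_FBP_MM, ppi_mm, KI_PPI_MM)
    print(f"{ppi_mm:5.3f} mM PPi: predicted {predicted:.3f} mM, measured {measured:.3f} mM")
```

Under this simple model Vmax is unchanged by the inhibitor, so only the apparent Km is compared.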

Figure 2.9. Catalytic efficiencies (Ef (Ceccarelli et al. 2008)) of PPi-PFKs (reverse reaction, solid lines) and FBPases (dash-dotted lines) from select bacteria across a range of FBP concentrations. Enzymes marked with an asterisk (*) were characterized in this study.


Table 2.2. Kinetic properties of the recombinant symbiont PPi-PFK and PPase and A. vinosum PPi-PFK and FBPase. Standard errors of the mean (SEM) for three replicate measurements are shown.

Enzyme                      Substrate  Et (µmol)  kcat (/sec)  SEM    Km (mM)  SEM    Vmax (U/mg)  kcat/Km (/sec mM)  Additional conditions
PPi-PFK, S. velum symbiont  FBP        0.000017   72.0         1.37   0.150    0.01   185.5        479
PPi-PFK, S. velum symbiont  PO4³⁻      0.000017   75.1         1.10   1.232    0.08   193.3        61
PPi-PFK, S. velum symbiont  PO4³⁻      0.000017   78.0         1.07   1.246    0.07   201.0        63                 + PPase
PPi-PFK, S. velum symbiont  F6P        0.000021   53.7         1.30   0.276    0.03   107.6        194
PPi-PFK, S. velum symbiont  PPi        0.000021   41.6         0.35   0.005    0.00   103.1        7805
PPi-PFK, S. velum symbiont  FBP        0.000017   60.7         0.96   0.184    0.01   156.4        330                0.05 mM PPi
PPi-PFK, S. velum symbiont  FBP        0.000017   58.6         1.13   0.175    0.01   150.9        335                0.125 mM PPi
PPi-PFK, S. velum symbiont  FBP        0.000017   66.5         1.20   0.265    0.05   171.3        251                0.25 mM PPi
PPi-PFK, S. velum symbiont  FBP        0.000017   61.3         0.29   0.326    0.02   157.9        188                0.5 mM PPi
PPi-PFK, S. velum symbiont  FBP        0.000017   66.4         0.31   0.469    0.02   171.1        142                1.0 mM PPi
PPi-PFK, S. velum symbiont  FBP        0.000017   66.4         0.72   0.178    0.01   181.0        373                1.0 mM PPi+PPase
PPase, S. velum symbiont    PPi        0.000079   560.0        41.12  0.107    0.04   158.1        5234
PPi-PFK, A. vinosum         FBP        0.000040   29.1         0.71   0.129    0.02   76.4         226
PPi-PFK, A. vinosum         PO4³⁻      0.000040   34.8         0.54   0.678    0.06   88.1         51
FBPase, A. vinosum          FBP        0.000051   12.8         3.85   0.060    0.003  15.7         211
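As a cross-check of Table 2.2, the kcat/Km column can be recomputed from the tabulated kcat and Km values. The sketch below does this for a few rows and also derives two quantities used later in the chapter: the fold loss in kcat/Km at 1.0 mM PPi and the ratio of the Km for PO4³⁻ to the Km for PPi. Row labels and variable names are illustrative, and because the inputs are rounded, the results can differ from the table in the last digit.

```python
# Recompute kcat/Km (per sec per mM) from rounded kcat and Km values in Table 2.2.
# Row labels are illustrative; the numbers are copied from the table.

ROWS = {
    "S. velum PPi-PFK, FBP": (72.0, 0.150),
    "S. velum PPi-PFK, FBP, 1.0 mM PPi": (66.4, 0.469),
    "S. velum PPi-PFK, FBP, 1.0 mM PPi+PPase": (66.4, 0.178),
    "A. vinosum PPi-PFK, FBP": (29.1, 0.129),
    "A. vinosum FBPase, FBP": (12.8, 0.060),
}


def specificity_constant(kcat_per_sec, km_mm):
    """kcat/Km in units of /sec mM."""
    return kcat_per_sec / km_mm


kcat_over_km = {label: specificity_constant(kcat, km) for label, (kcat, km) in ROWS.items()}

# Fold loss of kcat/Km caused by 1.0 mM PPi in the reverse reaction.
fold_loss = kcat_over_km["S. velum PPi-PFK, FBP"] / kcat_over_km["S. velum PPi-PFK, FBP, 1.0 mM PPi"]

# Ratio of Km values for PO4 (1.232 mM) and PPi (0.005 mM) for the symbiont PPi-PFK.
km_ratio_po4_to_ppi = 1.232 / 0.005

for label, value in kcat_over_km.items():
    print(f"{label}: {value:.0f} /sec mM")
print(f"fold loss at 1.0 mM PPi: {fold_loss:.1f}")
print(f"Km(PO4)/Km(PPi): {km_ratio_po4_to_ppi:.0f}")
```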

S. velum symbiont PPi-PFK has high catalytic efficiency

Catalytic efficiency of PPi-PFK from the S. velum symbiont was compared to efficiencies of other bacterial PPi-PFKs and FBPases (Figure 2.9). To compare different enzymes acting on the same substrate, their efficiency functions (Ef) were estimated for a range of substrate concentrations (Ceccarelli et al. 2008). EfFBP was calculated based on measured kinetic values for the symbiont PPi-PFK (KmFBP 0.15±0.01 mM, kcatFBP 72±1.37 sec⁻¹) as well as A. vinosum PPi-PFK (KmFBP 0.13±0.02 mM, kcatFBP 29.1±0.71 sec⁻¹) and FBPase (KmFBP 0.06±0.003 mM, kcatFBP 12.8±3.85 sec⁻¹) (Table 2.2). Parameters for other enzymes included in the comparison were taken from literature, as listed in Materials and Methods. For the Ef calculation, PPi-PFKs were assumed to act in concert with PPi-removing enzymes and thus be irreversible.

Symbiont PPi-PFK had high catalytic efficiency in the reverse reaction (Figure 2.9). The enzyme was 1.2 times more efficient than FBPase from A. vinosum at low substrate concentrations (< 10 µM) and became over twofold more efficient as concentrations increased.

Among the FBPases from closely related bacteria, only the E. coli FBPase (Kelley-Loughnane et al. 2002) displayed higher efficiency than the symbiont PPi-PFK at low substrate concentrations. Compared to FBPases, PPi-PFKs exhibited overall higher efficiencies at high substrate concentrations, as FBPases rapidly dropped in performance above 10 µM FBP.

Catalytic efficiency of the symbiont PPi-PFK decreased threefold upon addition of 1 mM PPi. This loss was almost entirely recovered in the presence of PPase.

Discussion

This study demonstrates for the first time that PPi-PFK from a chemoautotrophic symbiont is capable of performing the biochemical function of FBPase, which is not encoded in its genome. Absence of FBPase in chemoautotrophic symbionts which fix CO2 via the Calvin cycle has been enigmatic since sequencing of the first symbiont genome, that of the symbiont of the deep-sea clam Calyptogena magnifica (Newton et al. 2007). While it has been hypothesized that PPi-PFK may be able to take over for the missing enzyme in this and other sequenced symbionts, for example, those found in association with the siboglinid worm R. pachyptila (S. Markert et al. 2011), the oligochaete O. algarvensis (Kleiner et al. 2012), and the protobranch bivalve S. velum (Dmytrenko et al. 2014), the function of PPi-PFK in these bacteria was predicted based on sequence only and had not been experimentally validated.


Either loss of FBPase or gain of PPi-PFK occurred in the evolutionary history of all chemoautotrophic gammaproteobacterial symbionts sequenced to date (Figure 2.1), which suggests a strong association between these occurrences and the emergence of a symbiotic lifestyle. As more chemoautotrophic symbionts are sequenced, it will become more evident whether the last common ancestor of all chemoautotrophic symbionts possessed only PPi-PFK, or whether FBPase loss occurred independently during each symbiosis event.

Genome-wide transcriptional analysis of the S. velum symbiont provided valuable insights into the metabolic potential of this bacterium, in particular with regard to pfp and its hypothesized role in the Calvin cycle (Figure 2.2). Judging from the observed patterns of gene expression, sulfur oxidation and carbon fixation are the two key processes in the symbiont metabolism, in agreement with our current understanding of chemoautotrophic symbionts (Felbeck et al. 1981; Cavanaugh 1983; Stewart & Cavanaugh 2006; Cavanaugh et al. 2013). The observed transcriptional levels of sulfur oxidation genes mostly agreed with our preliminary analysis of gene expression data carried out without the reference genome sequence (Stewart et al. 2011). Mapping transcripts to the annotated genomic contigs (Dmytrenko et al. 2014) improved predictions of gene expression, in particular for genes present in the genome in multiple copies, for example dsrC. cbbL, which encodes the RuBisCO large subunit, was the third most highly expressed gene in the S. velum symbiont (Figure 2.2). Other Calvin cycle genes followed closely (Figure 2.3). Among them, expression of gapA and prkA was comparable in magnitude to that of pfp, in line with the hypothesized role of PPi-PFK in the cycle. Since PPi-PFK is commonly considered to be a glycolytic enzyme (Mertens 1991), we also compared pfp expression to that of genes involved in glycolysis, for example, pyruvate kinase and enolase, whose expression was approximately 50% to 90% lower than that of pfp. In fact, if pfp were glycolytic, it would have been the most highly expressed gene in this pathway. pfp transcription in the S. velum symbiont agreed with high levels of PPi-PFK protein detected in other chemoautotrophic symbionts, those of R. magnifica (S. Markert et al. 2011) and O. algarvensis (Kleiner et al. 2012), suggesting that the enzyme plays an important role in metabolism of these autotrophic bacteria. Finally, our discussion of the transcriptome would not be complete without commenting on the most highly expressed gene in the S. velum symbiont (Figure 2.2). sirA encodes a response regulator, which is known to increase expression of virulence genes and decrease expression of motility genes in the pathogenic gammaproteobacterium Salmonella enterica (Lawhon et al. 2002; Teplitski et al. 2003). High expression of sirA in the symbiont suggests an exciting possibility that this gene may be similarly involved in "infection" of the S. velum host with symbiotic bacteria.

As predicted by pfp expression, we were able to detect PPi-PFK enzymatic activity in the S. velum gill cell-free extracts (Table 2.1). PPi-PFK activity was quantified by measuring PPi formation in the presence of FBP and PO4³⁻. Detecting PPi instead of F6P allowed us to separate activity of PPi-PFK from that of FBPase, an enzyme present in the foot cell-free extracts and, therefore, also likely found in the host tissue surrounding the symbionts. Because of the potential eukaryotic FBPase activity in the extracts, which would consume some of the FBP substrate without producing PPi, and the fact that the symbionts contribute only a portion of the total protein in the gill tissue, the measured PPi-PFK activity (0.028±0.007 U mg protein⁻¹, Table 2.1) was likely underestimated. Given that there are up to 2.6 × 10⁹ symbiont cells per gram of wet gill tissue (Cavanaugh 1983; Mitchell & Cavanaugh 1983) and assuming 155 fg of protein per bacterial cell (Cox 2004), the PPi-PFK activity values were corrected to account for proteins from the symbiont only (2.11±0.06 U mg protein⁻¹, Appendix 2 Table A2.6). The actual symbiont PPi-PFK activity in cell-free extracts could be higher, as it may be partially masked by activity of the host FBPase. To our knowledge, this is the first report of reverse PPi-PFK activity in cell-free extracts. These data suggest that PPi-PFK-specific activity in the symbiont-containing tissue of S. velum could account for the missing FBPase. However, we are not able to entirely rule out the possibility that another, as yet unidentified, enzyme is at work.
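The arithmetic behind the symbiont-only correction can be sketched as follows. The cell density and per-cell protein content are the values cited above; the total protein content of the gill tissue is not stated in this section (the correction itself is in Appendix 2 Table A2.6), so the sketch only derives the symbiont protein per gram of tissue and the symbiont protein fraction implied by the reported measured and corrected activities. All names are illustrative, and the implied fraction is a back-calculation under these assumptions, not a measured value.

```python
# Symbiont protein per gram of wet gill tissue, from the values cited in the text.
CELLS_PER_G_GILL = 2.6e9      # upper estimate of symbiont cells per g wet gill
PROTEIN_FG_PER_CELL = 155.0   # assumed protein per bacterial cell (Cox 2004)
FG_PER_MG = 1e12              # femtograms per milligram


def symbiont_protein_mg_per_g(cells_per_g, fg_per_cell):
    """Symbiont protein (mg) per gram of wet gill tissue."""
    return cells_per_g * fg_per_cell / FG_PER_MG


def corrected_activity(measured_u_per_mg, symbiont_fraction):
    """Scale activity per mg total protein to activity per mg symbiont protein."""
    return measured_u_per_mg / symbiont_fraction


symbiont_mg = symbiont_protein_mg_per_g(CELLS_PER_G_GILL, PROTEIN_FG_PER_CELL)

# Fraction of extract protein attributed to symbionts, implied by the two
# reported activities (0.028 and 2.11 U per mg protein).
implied_fraction = 0.028 / 2.11

print(f"symbiont protein: {symbiont_mg:.3f} mg per g wet gill")
print(f"implied symbiont share of extract protein: {implied_fraction:.2%}")
print(f"corrected activity: {corrected_activity(0.028, implied_fraction):.2f} U per mg")
```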

For further study, the symbiont PPi-PFK was expressed in E. coli, purified, and characterized. Substrate concentration dependencies observed with the recombinant enzyme (Figure 2.6, Appendix 2 Tables A2.4, A2.5) closely approximated the PPi-PFK activity profile observed in the gill cell-free extracts (Table 2.1). This strengthened our prior conclusion that the cell-free extract (CFE) measurements reflected activity of the PPi-PFK enzyme. In agreement with sequence-based function prediction, the purified PPi-PFK was pyrophosphate-dependent and unable to utilize ATP as substrate (Figure 2.5). Low activity detected in the presence of ATP was likely due to residual PPi from storage buffer. Like other PPi-PFKs, the symbiont enzyme was reversible. However, the much higher specific activity (Vmax) of the symbiont PPi-PFK in the reverse reaction suggests that the forward reaction is less favored (Table 2.3). The observed ratio of 1.7 between the reverse and the forward Vmax is significantly (p<0.0001) higher than the ratios calculated for other bacterial PPi-PFKs (O'Brien et al. 1975; Mertens et al. 1989; Ladror et al. 1991; Ding et al. 1999; Reshetnikov et al. 2008; Frese et al. 2014). Such strong preference for the reverse reaction supports our hypothesis that PPi-PFK may perform the role of the missing FBPase in the Calvin cycle of the S. velum symbiont.
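The reverse-to-forward Vmax ratios discussed above can be recomputed from the Vmax values listed in Table 2.3. The sketch below covers only the enzymes for which both values are tabulated; the dictionary layout and names are illustrative, and ratios recomputed from rounded values may differ slightly from those printed in the table.

```python
# (reverse Vmax, forward Vmax) in U per mg, copied from Table 2.3.
VMAX_U_PER_MG = {
    "S. velum symbiont": (185.5, 107.6),
    "M. capsulatus": (9.0, 7.6),
    "R. rubrum": (24.2, 20.0),
    "X. campestris": (59.0, 58.0),
    "P. freudenreichii": (232.0, 258.0),
    "S. thermophila": (239.0, 438.0),
    "D. thermophilum": (0.6, 6.2),
}

# Ratio above 1 means the reverse (FBP-consuming) reaction has the higher Vmax.
vmax_ratios = {species: rev / fwd for species, (rev, fwd) in VMAX_U_PER_MG.items()}

for species, ratio in sorted(vmax_ratios.items(), key=lambda item: item[1], reverse=True):
    print(f"{species:18s} {ratio:.2f}")
```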

The highest symbiont PPi-PFK reverse reaction velocities were observed at pH values between 7.5 and 8.0 and at temperatures from 55℃ to 65℃ (Figure 2.7). This pH optimum is in agreement with an earlier study which reported the highest CO2 fixation rates in the symbionts at pH 8.0 (Scott & Cavanaugh 2007). It is not rare for enzymes from mesophilic bacteria to have high temperature optima between 50℃ and 65℃ (Wang et al. 2016; Saggu & Mishra 2017; Saxena et al. 2018). The high PPi-PFK temperature optimum implies that the enzyme may have stable tertiary and quaternary structures (Kaneko et al. 2005), which agrees with the observed overall stability of the enzyme during handling and long-term storage. While the symbionts are unlikely to experience such high temperatures in situ, during summer months they are subjected to large temperature fluctuations in intertidal environments and may face temperatures in excess of 30℃ at low tide (Kaplan et al. 1977).

Table 2.3. Comparison of Km and Vmax from bacterial PPi-PFK enzymes.

Species            kDa   Km FBP [mM]  Km PO4 [mM]  Vmax FBP (U mg⁻¹)  Km F6P [mM]  Km PPi [mM]  Vmax F6P (U mg⁻¹)  pH reverse  Tm reverse  pH forward  Tm forward  Vmax reverse/forward  Reference
S. velum symbiont  47.6  0.150        1.232        185.5              0.276        0.005        107.6              7.5-8.0     55-60       -           -           1.72                  This study
A. vinosum         47.5  0.130        0.678        76.4               -            -            -                  -           -           -           -           -                     This study
M. capsulatus      44.7  0.360        8.690        9.0                2.270        0.027        7.6                -           -           7.0         30          1.18                  (Reshetnikov et al. 2008)
R. rubrum          40.0  0.020        0.820        24.2               0.380        0.025        20.0               8.6         -           7.2         -           1.21                  (Pfleiderer & Klemme 1980)
X. campestris      44.7  0.024        2.500        59.0               0.202        0.041        58.0               -           -           6.8         40          1.00                  (Frese et al. 2014)
P. freudenreichii  43.2  0.051        0.600        232.0              0.100        0.069        258.0              7.0-7.4     -           7.5         -           0.90                  (O'Brien et al. 1975; Mertens et al. 1989; Ladror et al. 1991)
S. thermophila     61.0  0.038        0.400        239.0              0.240        0.110        438.0              7.0-7.5     -           5.0-6.4     >55         0.55                  (Ronimus et al. 1999)
D. thermophilum    37.4  2.900        4.300        0.6                0.228        0.022        6.2                7.0-7.5     -           5.7-6.3     -           0.10                  (Ding et al. 1999)
T. maritima        46.5  -            -            -                  0.980        0.067        203.0              5.6-6.8     -           5.6-5.8     -           -                     (Ding et al. 2001)
B. burgdorferi     62.0  -            -            -                  0.109        0.015        82.9               -           -           6.4-7.2     -           -                     (Deng et al. 1999)

Symbiont PPi-PFK can be substrate- and product-inhibited (Figures 2.6 and 2.8). In particular, substrate inhibition occurred in the reverse reaction above 5 mM FBP and 25 mM PO4³⁻, similar to other PPi-PFKs (Frese et al. 2014). Furthermore, pyrophosphate, which is a substrate in the PPi-PFK forward reaction, acted as a strong competitive inhibitor of FBP in the reverse reaction (Figure 2.8). This is not surprising, given an almost 250-times higher affinity of PPi-PFK for PPi compared to PO4³⁻ (Table 2.2). While the inhibitory effects of PPi have been previously described for plant PPi-PFKs (Stitt 1989; Theodorou & Plaxton 1996), this is the first measurement of PPi inhibition for a bacterial PPi-PFK to date. Bacterial cells on average contain 0.5-1.5 mM PPi (Heinonen & Drake 1988; Bornefeld 1981; J. Chen et al. 1990), while in obligate methanotrophs PPi concentration can reach 5 mM (Y. Trotsenko & Shishkina 1990; Y. A. Trotsenko et al. 2008). At 1 mM, pyrophosphate reduced catalytic efficiency of the symbiont PPi-PFK reverse reaction by more than 75% (Figure 2.9). To overcome the effects of PPi inhibition, the symbiont may employ PPi-consuming enzymes encoded in its genome, such as PPase, H+/Na+-PPases, or SAT (Dmytrenko et al. 2014). By removing pyrophosphate these enzymes may mitigate the inhibitory effect of PPi and make the reverse reaction more favorable. Since PPi-PFKs are readily reversible (Mertens 1991; Kemp & Tripathi 1993), under in situ equilibrium conditions PPi removal could change the direction of PPi-PFK catalysis. Energetic favorability of the forward reaction may further decrease due to the high free energy change associated with PPi hydrolysis (∆fGº = -22 kJ/mol) (Biegel & V. Müller 2011). In our study we have shown that under phosphate saturation conditions, when the concentration of PPi formed in the reverse reaction is negligible, addition of PPase did not noticeably affect PPi-PFK kinetics (Table 2.2, Appendix 2 Figure A2.4). This suggests that, under these conditions, the rate of the reverse reaction is limited by the diffusion of products from the active site. It is noteworthy that the addition of PPase to the assay improves the signal-to-noise ratio and may be recommended as a stabilizing component in future experiments (Appendix 2 Figure A2.5).

Our data suggest that PPi-PFK reverse activity in the chemoautotrophic symbionts may be dependent on PPi removal. To study the effects of PPi removal we used an inorganic pyrophosphatase, which does not couple PPi hydrolysis to any less thermodynamically favorable reaction. However, in the S. velum symbiont expression of the PPase-encoding gene was relatively low (0.053% transcripts kb⁻¹). Other PPases, such as Na+-PPase (0.046% transcripts kb⁻¹) and H+-PPases (0.023% transcripts kb⁻¹), also did not show high levels of transcription. In chemoautotrophic symbionts of R. magnifica (S. Markert et al. 2011) and O. algarvensis (Kleiner et al. 2012) H+/Na+-PPases have been hypothesized to couple hydrolysis of PPi produced by PPi-PFK to generation of H+/Na+ electrochemical gradients which can be used for ATP production by ATP synthases. Approximately 10 molecules of PPi could in this way be used to translocate 10 H+/Na+ (Serrano et al. 2007), which may then be used to generate 3 molecules of ATP (Hinkle 2005). This mechanism could reduce the total ATP cost of the Calvin cycle by approximately 10% (Kleiner et al. 2012). However, given low transcription of the H+/Na+-PPase-encoding genes in the S. velum symbiont, an alternative mechanism of pyrophosphate removal may be at play. We hypothesize that ATP sulfurylase (SAT) may instead be responsible for the majority of PPi removal in the symbiont of S. velum. SAT is the final enzyme in sulfur oxidation to SO4²⁻ in many chemoautotrophic symbionts and free-living sulfur-oxidizing bacteria (Dahl et al. 2013; Parey et al. 2013). High SAT activity and expression of the associated genes have been previously reported in symbiont-containing tissues of diverse chemoautotrophic symbioses (Felbeck et al. 1981; Felbeck 1981; Fisher & Hand 1984; C. Chen et al. 1987; Polz et al. 1992; Fiala-Medioni et al. 2002; Boutet et al. 2011). In the transcriptome of the S. velum symbiont, sat was among the most highly expressed genes involved in sulfur oxidation and was the most transcribed among the genes which encode PPi-consuming enzymes (Figure 2.2). SAT acts by transferring PPi to APS, generating SO4²⁻ and ATP (Parey et al. 2013). If SAT activity were coupled to removal of PPi, potentially produced by PPi-PFK in the Calvin cycle, two molecules of ATP would be made per each round of CO2 fixation. Thus, SAT would not only drive reverse PPi-PFK activity by preventing PPi inhibition and increasing the equilibrium constant of the reverse reaction, but could also reduce the energetic cost of carbon fixation from 3 to 1 ATP per CO2 fixed.

In the absence of pyrophosphate inhibition, the symbiont PPi-PFK was more catalytically efficient at converting FBP into F6P than many characterized bacterial PPi-PFKs and FBPases (Figure 2.9). PPi inhibition markedly reduced Ef (Eisenthal et al. 2007; Ceccarelli et al. 2008) of the enzyme but was readily reversed with the help of PPase. E. coli FBPase is more efficient than the symbiont PPi-PFK at low substrate concentrations but is rapidly surpassed by PPi-PFK when substrate concentrations increase past 18 µM. In general, FBPases tend to perform better at lower and worse at higher substrate concentrations compared to PPi-PFKs. This may be attributed to the fact that, unlike PPi-PFKs, FBPases are virtually irreversible (Mertens 1991). PPi-PFKs, on the other hand, tend to perform relatively well in both forward and reverse reactions (Table 2.3) (O'Brien et al. 1975; Mertens et al. 1989; Ladror et al. 1991; Ding et al. 1999; Reshetnikov et al. 2008; Frese et al. 2014), which may come at the cost of overall efficiency. The majority of PPi-PFKs included in this analysis are thought to act as glycolytic enzymes. For example, PPi-PFK is present in A. vinosum alongside FBPase. While this PPi-PFK may under certain conditions act in reverse, the primary role of the enzyme is likely limited to glycolysis. Our analysis predicted that in ideal conditions of substrate saturation and no PPi inhibition, reverse activity of PPi-PFK from A. vinosum would be comparable to that of FBPase. However, in vivo this PPi-PFK would likely exhibit much lower catalytic activity. Since A. vinosum can use FBPase in its Calvin cycle, there is no evolutionary pressure for PPi-PFK to improve its reverse catalysis in this bacterium. In the case of chemoautotrophic symbionts, on the other hand, the hypothesized role of PPi-PFK in the Calvin cycle may be exerting selection pressure to improve catalytic efficiency of the reverse reaction, which agrees with our data.

The high prevalence of PPi-PFKs and the lack of FBPases in chemoautotrophic symbionts suggest that PPi-PFKs play an important role in metabolism of the symbionts. The potential shift from FBPases to PPi-PFKs in these bacteria must have been driven by advantages associated with this evolutionary change. It has been previously hypothesized that PPi-PFK may perform the biochemical function of FBPase in the Calvin cycle of chemoautotrophic symbionts. In this study we have shown that PPi-PFK from the S. velum symbiont not only is capable of performing the hypothesized catalysis, but also has evolved a higher specificity for the reverse reaction compared to other, potentially glycolytic, PPi-PFKs (Table 2.3). The potential use of PPi-PFK in the Calvin cycle of chemoautotrophic symbionts may be directly coupled to sulfur oxidation through enzymatic activity of SAT. This enzyme consumes PPi and generates ATP in the final step of sulfur oxidation to SO4²⁻. Thus, PPi removal required for PPi-PFK reverse activity may have an added advantage of ATP synthesis. This process may reduce the overall energetic cost of the Calvin cycle and could have led to the prevalence of PPi-PFKs and a potential loss of FBPases in chemoautotrophic symbionts. The proposed coupling between sulfur oxidation and carbon fixation through a concerted action of PPi-PFK and SAT may explain why this evolutionary change is confined to chemoautotrophic symbionts and has not found its way to photoautotrophic symbionts and plastids.

Materials and methods

Specimen collection and bacterial cultures

S. velum protobranch bivalves were collected from an intertidal mud flat at Bluff Hill Cove, Point Judith Pond, Rhode Island, over the period between 2011 and 2016. For DNA and RNA extractions the specimens were dissected in the field. Gill tissue was stored in RNAlater™ (Thermo Fisher Scientific, Waltham, MA) at 4°C, and processed within 2 days. For measuring enzyme activity, live specimens were transported to the lab in continuously aerated chilled sea water within 2 hours of collection. Gill and foot tissues were dissected, weighed, frozen in liquid N2, and stored at -80°C. Freezing did not decrease enzymatic activity compared to fresh samples (data not shown).

Bacterial strains and plasmids used in this study are listed in Supplementary file 1, Table

S6. and A. vinosum DSM 180 Rif50 (Lubbe et al. 2006) culture was generously provided by

Christiane Dahl (Universität Bonn, Germany). A. vinosum cultures were grown in RCV medium (Weaver et al. 1975) in anaerobic vials under constant illumination (60 W). Culture stock was stored in 10% dimethyl sulfoxide (DMSO) at -80°C.


Phylogenetic analysis and ancestral state reconstruction

To determine whether the absence of FBPase and the presence of PPi-PFK in chemoautotrophic symbionts were inherited or independently derived, their ancestral states were reconstructed. A Bayesian phylogeny was generated using sequences from chemoautotrophic symbionts and closely related free-living bacteria. In this phylogeny 73 taxa were included, with 68 gammaproteobacterial taxa as the ingroup and 5 alphaproteobacterial taxa as an outgroup. Only taxa with complete or nearly complete genomes were used. The minimal combination of DNA and amino acid sequences which provided adequate support at all nodes included 16S rRNA genes and 14 phylogenetically conserved proteins (AtpA, ClpX, DnaE, DnaK, InfB, MurA, RplF, RplV, RplW, RpoA, RpoB, RpsC, RpsK, SecY (Wu et al. 2013)). A time-calibrated phylogeny from these 15 concatenated sequences was generated with BEAST version 1.8.2 (Drummond et al. 2012). Partitions and substitution models were chosen based on the results of a PartitionFinder (version 1.1.1) analysis (Lanfear et al. 2012) (16S: single partition of symmetric + gamma + invariant sites; protein sequences: 3 partitions, each LG + gamma + invariant sites (S. Q. Le & Gascuel 2008)). Each partition was run under a log-normal relaxed clock model. The tree model (Speciation: Yule process (Gernhard et al. 2008)) was shared across all partitions and time-calibrated using normally distributed priors describing the outgroup and ingroup MRCA with means of 1.9 and 1.7 billion years before present, respectively (Sheridan et al. 2003; Battistuzzi et al. 2004). Analysis was run for 20 million generations, sampling every 10,000 generations, producing a maximum of 2,000 trees per run prior to burn-in. The resulting phylogeny was the maximum clade credibility (MCC) tree from a set of 5,393 trees obtained by combining 3 stably converged independent runs. Only 2 nodes had support below 1, with values of 0.9983 and 0.9996.
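The tree counts quoted above follow from the chain length and sampling interval. The short sketch below reproduces that arithmetic and back-calculates the share of sampled trees that must have been discarded to arrive at the combined set. The burn-in fraction is not stated in this section, so the derived value is only an implied figure under the assumption that all discarded trees were removed as burn-in; variable names are illustrative.

```python
# Sampling arithmetic for the BEAST runs described above.
GENERATIONS = 20_000_000   # chain length per run
SAMPLE_EVERY = 10_000      # sampling interval in generations
RUNS = 3                   # independent runs that were combined
COMBINED_TREES = 5_393     # size of the combined post-burn-in tree set

trees_per_run = GENERATIONS // SAMPLE_EVERY   # maximum trees per run before burn-in
max_total_trees = trees_per_run * RUNS
discarded_trees = max_total_trees - COMBINED_TREES
implied_discard_fraction = discarded_trees / max_total_trees

print(f"trees per run before burn-in: {trees_per_run}")
print(f"trees discarded across runs: {discarded_trees}")
print(f"implied discarded fraction: {implied_discard_fraction:.1%}")
```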

At all nodes, ancestral states were reconstructed for the four proteins of interest: ATP-PFK, PPi-PFK, FBPase, and RuBisCO. Since genome annotations were poor or missing for many of the organisms of interest, presence or absence of each of the four proteins as tip states was determined based on genomic BLAST version 2.6.0+ (Altschul et al. 1990). Well-annotated sequences were chosen to search for gene homologs in the genomes of interest. The resulting high-quality hits were then added to the initial query and used in a repeated BLAST search until no more targets could be identified. Ancestral states for each gene were estimated using BayesTraits version 2 (Pagel et al. 2004) under a MultiState evolution model and a Markov Chain Monte Carlo run for 5,010,000 iterations (10,000 iteration burn-in) with an exponential hyperprior. Instead of using a single MCC tree for the ancestral state reconstruction, a set of 5,392 trees was used to reflect the combined uncertainties of the phylogeny and the reconstruction. The probabilities plotted on the tree represent the median of the posterior distribution. To quantify the probability of a given predicted genotype at an ancestral node, Bayes factors (BF) were used to compare the difference in marginal likelihoods between the genotype pair.
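To make the Bayes factor comparison concrete, the sketch below uses a common convention in which the log Bayes factor is twice the difference between the log marginal likelihoods of the two competing reconstructions. Both this convention and the interpretation thresholds are assumptions that should be checked against the BayesTraits manual, since the exact procedure is not detailed in this section. The marginal likelihood values are hypothetical and serve only to show the calculation.

```python
def log_bayes_factor(log_ml_a, log_ml_b):
    """Log Bayes factor of model A over model B: 2 * (logML_A - logML_B)."""
    return 2.0 * (log_ml_a - log_ml_b)


def interpret(log_bf):
    """Conventional verbal scale for a log Bayes factor (assumed thresholds)."""
    if log_bf < 2:
        return "weak"
    if log_bf < 5:
        return "positive"
    if log_bf <= 10:
        return "strong"
    return "very strong"


# Hypothetical log marginal likelihoods for two genotypes fixed at one node.
log_ml_ppi_pfk_only = -40.0
log_ml_fbpase_present = -43.5

log_bf = log_bayes_factor(log_ml_ppi_pfk_only, log_ml_fbpase_present)
print(f"log BF = {log_bf:.1f} ({interpret(log_bf)} support)")
```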

Transcriptome sequencing and analysis

RNA from the S. velum symbiont-containing gill tissue was extracted from a single specimen using miRNeasy Mini Kit (Qiagen, Hilden, Germany). To enrich for symbiont transcripts, host mRNA was depleted using Ambion MICROBEnrich™ kit (Thermo Fisher Scientific, Waltham, MA). Afterwards, half of the RNA was used directly for sequencing. The other half was further enriched in the symbiont mRNA by depleting 16S (primers Sv_16SF1 + Sv_16SR1T7) and 23S (primers Sv_23SF1 + Sv_23SR1T7) symbiont rRNA and 18S (primers SV_18SF1_53 + SV_18SR1T7-53) and 28S (primers Sv_28F1 + Sv_28S1T7) host rRNA transcripts with custom species-specific oligonucleotide probes synthesized using the specified primers (Appendix 2 Table A2.7). rRNA depletion was performed according to the procedure adapted from Stewart et al. (2010). RNA amplification, cDNA synthesis, and sequencing were done as described in Stewart et al. (2011).

Artificially duplicated reads which arose during pyrosequencing were identified with CD-HIT software version 4.7 (W. Li & Godzik 2006). Reads which had 100% nucleotide identity, a length difference of no more than 1 nucleotide, and identical first 3 nucleotides were discarded and only unique non-duplicates retained for further analysis (Appendix 2 Table A2.1). Next, reads corresponding to the symbiont 5S, 16S, and 23S rRNA, mitochondrial 12S and 16S rRNA, and host 18S and 28S rRNA were removed with Bowtie 2 version 2.3.2 (Langmead et al. 2009) using local alignment (--local --very-sensitive-local) to the species-specific sequences. The remaining reads were mapped to the genomes of the S. velum symbiont (NCBI Accession number JRAA01000001) (Dmytrenko et al. 2014) and the host mitochondria (NCBI Accession Number NC_017612.1) (Plazzi et al. 2013) with Bowtie 2 using end-to-end alignment (--end-to-end --sensitive --qc-filter). These mapped reads were processed with Samtools version 1.5 (H. Li et al. 2009) and visualized in Circos version 0.69-6 (Krzywinski et al. 2009). Highly expressed genes in the genome of the S. velum symbiont were identified using HTSeq-Count version 0.8.0 (Anders et al. 2015), normalized to gene length and sample size, and visualized in Python 3.6 Matplotlib version 2.1.0 (Hunter 2007). The remaining reads (Appendix 2 Table A2.1) were queried with BLASTN version 2.6.0+ (Altschul et al. 1990) against the NCBI nucleotide database (11 November 2017). The resulting hits were analyzed in MEGAN version 6.10.5 (Huson et al. 2007).
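The duplicate-read criteria given at the start of this paragraph (100% identity, length difference of at most 1 nucleotide, identical first 3 nucleotides) can be illustrated with a simplified filter. This is not the CD-HIT algorithm; it is a sketch that assumes duplicates share the same 5' start, so that a duplicate is either identical to a retained read or equal to it minus the last base. The function name and the example reads are hypothetical.

```python
def remove_artificial_duplicates(reads):
    """Keep one representative per group of artificial duplicates.

    A read is treated as a duplicate when it is identical to a kept read,
    or when a kept read equals it plus exactly one trailing base. Matching
    from the 5' end also guarantees identical first 3 nucleotides.
    """
    kept = set()
    unique_reads = []
    # Process the longest reads first so they become the representatives.
    for read in sorted(reads, key=len, reverse=True):
        one_base_longer = (read + base for base in "ACGTN")
        if read in kept or any(candidate in kept for candidate in one_base_longer):
            continue
        kept.add(read)
        unique_reads.append(read)
    return unique_reads


example_reads = ["ACGTACGT", "ACGTACG", "ACGTACGT", "ACGTAC", "TTGACA"]
print(remove_artificial_duplicates(example_reads))
```

In the example, the 6-nucleotide read is retained because it differs in length by 2 from the only retained representative that it matches.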

PPi-PFK activity in S. velum cell-free extracts

To obtain a sufficient amount of cell-free extract for measuring enzyme activity, tissues from multiple individuals (2-3) were pooled. Using a pre-chilled glass dounce homogenizer, symbiont-containing gill and symbiont-free foot tissue samples were macerated in 1:12 w/v ice-cold extraction buffer (pH 7.0) containing 50 mM Tris-HCl, 5 mM MgCl2, 1 mM KCl, 3 mM NH4Cl, 5 mM DTT, and 2x ProteaseArrest™ (G-Biosciences, Saint Louis, MO). Soluble cell content was released by subjecting the homogenates to a series of 3x10 sec sonication bursts with a probe sonicator (Sonifier 250, Branson, Danbury, CT) at output level 2. Samples were kept on ice-salt slurry (8:1 w/w) during sonication and transferred to ice between treatments. Cell lysis was monitored microscopically. The sonicated homogenates were centrifuged at 4°C for 30 min at 20,000 x g to pellet cellular debris. Supernatant containing cell-free extract was immediately used for measuring enzyme activity. Total protein in the soluble cell-free fraction was determined through CB-Protein Assay with bovine serum albumin (BSA) as a standard (G-Biosciences, Saint Louis, MO).

PPi-PFK activity in crude cell-free extracts was measured through a method adapted from Heinonen (1981), modified to a 96-well plate format. The assay reaction contained 50 mM Tris-HCl, 5 mM MgCl2, 1 mM KCl, 3 mM NH4Cl, 5 mM dithiothreitol (DTT), 2.5-10 mM FBP, and 90 µg cell-free extract at pH 7.5. PPi formation, indicative of the reverse PPi-PFK activity, was initiated by the addition of PO4³⁻ to a final concentration of 10-25 mM. Every 20 sec, 200 µl of the reaction mixture were transferred into a new well containing 66 µl of 20% trichloroacetic acid (TCA) to terminate the reaction. A total of 6 time points were sampled. Afterwards, the TCA was neutralized with 15 µl of 5 M NaOH. To precipitate PPi, 31 µl of 2 mM CaCl2 were added into the samples followed by 18.5 µl of 1 M KF. The reactions were thoroughly mixed, incubated at RT for 15 min, and centrifuged at 3,220 x g for 5 min. The supernatant (295 µl), containing unreacted PO4³⁻, was discarded to prevent interference with PPi detection. Precipitate in the remaining 35 µl was dissolved with 165 µl of 0.4 M H2SO4. Freshly prepared colorimetric reagent (200 µl), containing 32 mM (NH4)6Mo7O24, 0.5 M H2SO4, and 71 mM trimethylamine, was added to each well. After incubating for 15 min, the plate was centrifuged as above. The supernatant (200 µl) was transferred into new wells containing 6 µl of 2.5 M H2SO4 and centrifuged again under the same conditions. The final supernatant (200 µl) was transferred to a transparent polystyrene 96-well plate (Greiner Bio-One, Frickenhausen, Germany) and mixed with 13 µl of 1 M 2-mercaptoethanol. After 6 min, absorbance was measured at 700 nm with a Tecan Infinite m200 spectrophotometer (Tecan, Männedorf, Switzerland). Three biological replicates and three experimental replicates were measured per experimental condition. The concentration of PPi was determined using a standard curve.
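The time-course sampling described above (six points taken every 20 sec) lends itself to estimating an initial rate by linear regression. The sketch below shows one way specific activity could be derived from such data, assuming that 1 U corresponds to 1 µmol of PPi formed per minute and that PPi concentrations have already been read off the standard curve. The PPi values and the protein concentration are hypothetical, and the calculation actually used in this study is not described in this section.

```python
def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def specific_activity(times_sec, ppi_mm, protein_mg_per_ml):
    """Specific activity in U per mg protein (1 U = 1 µmol PPi per min, assumed).

    ppi_mm is the PPi concentration in the reaction (mM, i.e. µmol per ml).
    """
    rate_umol_per_ml_per_min = least_squares_slope(times_sec, ppi_mm) * 60.0
    return rate_umol_per_ml_per_min / protein_mg_per_ml


# Hypothetical time course: six samples taken every 20 sec.
times_sec = [0, 20, 40, 60, 80, 100]
ppi_mm = [0.000, 0.001, 0.002, 0.003, 0.004, 0.005]
protein_mg_per_ml = 0.1   # hypothetical protein concentration in the reaction

activity = specific_activity(times_sec, ppi_mm, protein_mg_per_ml)
print(f"specific activity: {activity:.3f} U per mg protein")
```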

Cloning, expression, and purification of recombinant proteins from S. velum symbionts and A. vinosum

S. velum symbiont-containing DNA was isolated from the gill tissue using DNeasy Blood & Tissue kit (Qiagen, Hilden, Germany). DNA from A. vinosum mid log-phase culture was purified using E.N.Z.A. Bacteria DNA kit (Omega Bio-Tek, Norcross, GA). Genes encoding PPi-PFK (primers Sv_pfp_1F_NdeI + Sv_pfp_1257_SacI) and PPase (primers Sv_ppase_1F_NdeI + Sv_ppase_549R_XhoI) from the S. velum symbiont and PPi-PFK (primers Av_pfp_1F_NdeI + Av_pfp_1254R_SacI) and FBPase (primers Av_fbp_1F_NdeI + Av_fbp_1014R_SacI) from A. vinosum were PCR-amplified with the specified primers listed in Supplementary file 1, Table S6 using Q5 high-fidelity polymerase (NEB, Ipswich, MA). PCR products were digested either with NdeI and SacI or NcoI and XhoI restriction enzymes (NEB, Ipswich, MA), purified, and cloned into the digested expression vector pET28a+ (EMD Biosciences, San Diego, CA). This procedure introduced in-frame sequences encoding histidine tags (His6) at the 5' end of each cloned gene.

Ligated products were transferred into E. coli BL21(DE3) (Thermo Fisher Scientific, Waltham, MA) and maintained using kanamycin (Kan). Inserts were verified by PCR and DNA sequencing (DF/HCC DNA Resource Core, Boston, MA). To overexpress recombinant proteins, 200 ml of LB medium containing Kan was inoculated with an overnight culture of E. coli BL21(DE3) bearing one of the expression plasmids. Cultures were grown at 37°C with shaking at 180 rpm until OD600 reached 0.4-0.8 (referred to as Uninduced SDS-PAGE fractions). Protein expression was then induced with 1 mM IPTG for 3 hours. Afterwards, cells were pelleted by centrifugation at 4,000 x g for 20 min at 4°C. Cell pellets were frozen in liquid N2 and stored at -80°C.

To purify His6-tagged proteins, 500 mg of thawed E. coli BL21(DE3) cell pellets containing the recombinant enzymes were lysed in 10 ml of xTractor buffer (Takara Bio USA, Ann Arbor, MI) with 125 µl of LongLife PELB Lysozyme and 200 µl of 100x ProteaseArrest (G-Biosciences, Saint Louis, MO). Lysate was centrifuged at 12,000 x g for 20 min at 4°C and the resulting supernatant collected (referred to as Induced SDS-PAGE fractions). Subsequent protein purification was carried out at 4°C using His60 Ni-IDA resin columns and buffers supplied by Takara Bio USA (Ann Arbor, MI). After columns were washed with equilibration buffer, 5 ml of the starting samples were applied to the resin and incubated for 1 hour with gentle shaking. Unbound lysates were collected (referred to as Unbound SDS-PAGE fractions) and the procedure was repeated for the remaining 5 ml. Next, columns were washed with 10 ml of equilibration buffer (referred to as Wash 1 SDS-PAGE fractions), followed by 10 ml of wash buffer (referred to as Wash 2 SDS-PAGE fractions). Bound proteins were eluted in two 1 ml elution fractions in elution buffer containing 300 mM imidazole (referred to as Elute 1 and Elute 2 SDS-PAGE fractions). Eluted protein fractions were transferred into enzyme-specific storage buffers (referred to as Storage 1 and Storage 2 SDS-PAGE fractions) using Amicon Ultra-4 3K centrifugal devices (Millipore, Billerica, MA). PPi-PFK storage buffer (pH 7.0) contained 10 mM Tris-acetate, 0.1 mM ethylenediaminetetraacetic acid (EDTA), 0.5 mM DTT, 17 mM KCl, 1 mM MgCl2, 1 mM FBP, and 50% (v/v) glycerol. Storage buffer (pH 7.0) for FBPase consisted of 8 mM KPO4, 1 mM EDTA, 1 mM DTT, 17 mM NaCl, and 1 mM FBP. PPase storage buffer (pH 8.0) included 20 mM Tris-HCl, 0.1 mM ZnCl2, 1 mM MgCl2, 100 mM KCl, and 1 mM DTT. Purified PPi-PFKs and PPases were stored at -20°C. FBPase was frozen in liquid N2 and stored at -80°C. Protein concentrations in individual fractions were measured using the CB-Protein Assay with BSA as a standard (G-Biosciences, Saint Louis, MO).

Individual protein fractions were analyzed by SDS-PAGE on precast 12% (w/v) acrylamide gels (Bio-Rad, Hercules, CA) with PAGEmark Unstained Marker protein ladder (G-Biosciences, Saint Louis, MO) following the standard procedure (Laemmli 1970). 30 µg of protein from the Uninduced, Induced, Unbound, Wash 1, and Wash 2 fractions and 3 µg of protein from the Elute 1, Elute 2, Storage 1, and Storage 2 fractions were analyzed. On the gel, proteins were visualized with Oriole™ fluorescent gel stain (Bio-Rad, Hercules, CA).

Characterization of recombinant S. velum symbiont and A. vinosum enzymes

Activities of the purified PPi-PFK and FBPase enzymes were measured in coupled enzyme assays adapted from Alves et al. (1994). Assays were performed in a 300 µl volume on µclear™ 96-well plates (Greiner Bio-One, Frickenhausen, Germany). The forward PPi-PFK reaction was assayed in 50 mM Tris-HCl (pH 7.5), 5 mM MgCl2, 1 mM KCl, 3 mM NH4Cl, 5 mM DTT, 0.15 mM NADH, 0.05-7.5 mM F6P, 1.3 U fructose 1,6-bisphosphate aldolase (EC 4.1.2.13), 10 U triosephosphate isomerase (EC 5.3.1.1), 1.7 U α-glycerophosphate dehydrogenase (EC 1.1.1.8), 400-500 ng of purified recombinant PPi-PFK, and 0.01-5 mM PPi. FBPase activity and the reverse PPi-PFK reaction were assayed in 50 mM Tris-HCl (pH 7.5), 5 mM MgCl2, 1 mM KCl, 3 mM NH4Cl, 5 mM DTT, 0.4 mM NADP+, 0.01-10 mM FBP, 1.75 U phosphoglucose isomerase (EC 5.3.1.9), 1.75 U glucose 6-phosphate dehydrogenase (EC 1.1.1.49), 400-500 ng of purified enzyme, and 0.5-100 mM K2HPO4. For each measurement, 100 µl of the assay components equilibrated to the required temperature were added. The reaction was initiated by the addition of 200 µl PPi for the forward and 200 µl K2HPO4 for the reverse activity assay. Reaction progress was monitored at 340 nm in a Tecan Infinite m200 spectrophotometer (Tecan, Männedorf, Switzerland) for up to 5 min at 25°C, unless stated otherwise. The pH dependence of the enzymes was measured in PIPES-based (pH 6.0-7.5, adjusted with 1 M KOH) and Tris-based (pH 7.5-9.0, adjusted with 1 M HCl) activity buffers with 20 mM PO4³⁻ and 5 mM FBP as substrates. Temperature dependence between 20°C and 80°C in 5°C increments was assayed in the reverse reaction in pre-equilibrated Tris-HCl buffer with 20 mM PO4³⁻ and 5 mM FBP as substrates using a temperature-controlled Tecan spectrophotometer (Tecan, Männedorf, Switzerland) and a water bath.
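The assay set-up implies a fixed dilution: 100 µl of assay components plus 200 µl of initiating substrate gives the 300 µl reaction volume. The sketch below works through the corresponding arithmetic. It assumes, which the text does not state explicitly, that the listed concentrations are final concentrations in the 300 µl reaction; the function and variable names are illustrative only.

```python
ASSAY_VOLUME_UL = 300.0       # total reaction volume stated in the text
COMPONENTS_VOLUME_UL = 100.0  # pre-equilibrated assay components
INITIATOR_VOLUME_UL = 200.0   # PPi (forward) or K2HPO4 (reverse)


def stock_concentration(final_mm, added_volume_ul, total_volume_ul=ASSAY_VOLUME_UL):
    """Stock concentration (mM) needed to reach final_mm after dilution (C1*V1 = C2*V2)."""
    return final_mm * total_volume_ul / added_volume_ul


# Assumed example: lowest and highest final PPi concentrations in the forward assay.
ppi_stock_low = stock_concentration(0.01, INITIATOR_VOLUME_UL)   # about 0.015 mM
ppi_stock_high = stock_concentration(5.0, INITIATOR_VOLUME_UL)   # about 7.5 mM

# Components delivered in the 100 µl mix are diluted threefold, e.g. 0.15 mM NADH.
nadh_in_mix = stock_concentration(0.15, COMPONENTS_VOLUME_UL)    # about 0.45 mM
```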

Substrate inhibition and the regulatory role of PPase in the symbiont PPi-PFK reverse reaction were studied in the presence of 0.05-1.0 mM PPi, 0.01-5 mM FBP, and 20 mM PO4³⁻ substrates with and without 1.75 U PPase. Since a sufficient amount of the S. velum symbiont PPase could not be purified for this experiment, a commercial PPase from E. coli (I5907, Sigma-Aldrich) was used. All measurements were carried out at least in triplicate.

Concentrations of NADH, used as a reporter of PPi-PFK activity in the coupled-enzyme assay for the forward reaction, were determined from a standard curve. The same standard curve was also applied to estimate concentrations of NADPH, used as a reporter in the reverse assay, since NADH and NADPH have the same extinction coefficient and both exhibit an absorption maximum at 340 nm (Bergmeyer 1975).
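To illustrate how an absorbance trace translates into an enzyme rate, the sketch below takes the slope of A340 over the linear part of a time course and converts it with the slope of a nucleotide standard curve. All numbers are placeholders, and the names are hypothetical. It also assumes the usual stoichiometry of the coupling enzymes: two NADH oxidized per FBP formed in the aldolase, triosephosphate isomerase, and glycerophosphate dehydrogenase couple (forward assay), and one NADPH formed per F6P in the phosphoglucose isomerase and glucose 6-phosphate dehydrogenase couple (reverse assay).

```python
def least_squares_slope(xs, ys):
    """Slope of the least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Placeholder standard-curve slope: A340 units per nmol NAD(P)H in the well.
A340_PER_NMOL = 0.004

# Placeholder forward-assay trace: NADH is consumed, so A340 falls.
times_s = [0, 30, 60, 90, 120]
a340 = [0.900, 0.876, 0.852, 0.828, 0.804]

NADH_PER_PRODUCT = 2.0  # assumed: 2 NADH per FBP (use 1.0 for the reverse assay)
ENZYME_UG = 0.45        # placeholder, within the 400-500 ng used per assay

delta_a_per_min = abs(least_squares_slope(times_s, a340)) * 60.0
nadh_nmol_per_min = delta_a_per_min / A340_PER_NMOL
product_nmol_per_min = nadh_nmol_per_min / NADH_PER_PRODUCT

# nmol min^-1 µg^-1 is numerically equal to µmol min^-1 mg^-1 (U/mg).
specific_activity = product_nmol_per_min / ENZYME_UG
```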

PPase activity was determined using a colorimetric assay based on van Alebeek & Keltjens (1994). The reaction was carried out in assay buffer containing 100 mM Tris-HCl (pH 7.5), 2 mM MgCl2, 5 mM DTT, and 300 ng recombinant PPase from the S. velum symbiont. The reaction was started by adding 0.05-1.0 mM of PPi to the total volume of 210 µl. Over 100 sec, at 20 sec intervals, 30 µl aliquots of the enzyme reaction were transferred into a colorimetric reagent containing 1% (NH4)6Mo7O24, 0.83 M H2SO4, and 8% FeSO4. After incubating samples for 32 min, absorbance was measured at 660 nm in a polystyrene 96-well plate (Greiner Bio-One, Frickenhausen, Germany) using a Tecan Infinite m200 spectrophotometer (Tecan, Männedorf, Switzerland). PO4³⁻ concentrations were estimated using a standard curve.

Initial velocities were calculated in the linear range of catalytic reactions at different substrate concentrations. Kinetic constants were determined using the nonlinear least-squares Levenberg–Marquardt fitting algorithm. The inhibition constant (Ki) was calculated in GraphPad Prism version 7.0 (GraphPad Software, La Jolla, CA).
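For readers unfamiliar with the fitting step, the sketch below shows a small Levenberg–Marquardt fit of initial velocities. It assumes a plain Michaelis–Menten rate law, v = Vmax[S]/(KM + [S]); the text does not specify the rate equation or the software used for this fit, so the model, the function names, the starting guesses, and the noise-free synthetic data are all illustrative assumptions.

```python
def michaelis_menten(s, vmax, km):
    """Michaelis-Menten rate at substrate concentration s."""
    return vmax * s / (km + s)


def sum_squared_error(substrate, rates, vmax, km):
    return sum((v - michaelis_menten(s, vmax, km)) ** 2
               for s, v in zip(substrate, rates))


def fit_michaelis_menten(substrate, rates, vmax0, km0, max_iter=100):
    """Levenberg-Marquardt estimate of (Vmax, KM) from initial velocities."""
    vmax, km = vmax0, km0
    damping = 1e-3
    sse = sum_squared_error(substrate, rates, vmax, km)
    for _ in range(max_iter):
        # Accumulate J^T J (a11, a12, a22) and J^T r (g1, g2).
        a11 = a12 = a22 = g1 = g2 = 0.0
        for s, v in zip(substrate, rates):
            r = v - michaelis_menten(s, vmax, km)
            d_vmax = s / (km + s)                # d(model)/d(Vmax)
            d_km = -vmax * s / (km + s) ** 2     # d(model)/d(KM)
            a11 += d_vmax * d_vmax
            a12 += d_vmax * d_km
            a22 += d_km * d_km
            g1 += d_vmax * r
            g2 += d_km * r
        # Damped normal equations with Marquardt scaling of the diagonal.
        m11 = a11 * (1.0 + damping)
        m22 = a22 * (1.0 + damping)
        det = m11 * m22 - a12 * a12
        if det == 0.0:
            break
        step_vmax = (g1 * m22 - a12 * g2) / det
        step_km = (m11 * g2 - a12 * g1) / det
        new_vmax, new_km = vmax + step_vmax, km + step_km
        new_sse = sum_squared_error(substrate, rates, new_vmax, new_km)
        if new_sse < sse:
            # Accept the step and move toward Gauss-Newton behaviour.
            vmax, km, sse = new_vmax, new_km, new_sse
            damping /= 10.0
        else:
            # Reject the step and move toward gradient descent.
            damping *= 10.0
    return vmax, km


# Synthetic, noise-free data generated with Vmax = 10 and KM = 0.5 (placeholders).
substrate_mm = [0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 7.5]
rates = [michaelis_menten(s, 10.0, 0.5) for s in substrate_mm]

vmax_fit, km_fit = fit_michaelis_menten(substrate_mm, rates, vmax0=max(rates), km0=1.0)
```

With real measurements the fit would be run on replicate velocities, and a substrate-inhibition term would be needed wherever inhibition is observed.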

To derive kcat values for the S. velum symbiont and A. vinosum PPi-PFKs, the enzymes were considered to be homodimeric, by analogy to the structure of PPi-PFK from Borrelia burgdorferi, which is the most closely related PPi-PFK with a published crystal structure (Weissgerber et al. 2011). By analogy to E. coli FBPase (Kelley-Loughnane et al. 2002), the A. vinosum FBPase was assumed to be a homotetramer. The symbiont PPase was regarded as a homohexamer by analogy to the most closely related homologue with a resolved crystal structure, that from E. coli (Kankare et al. 1994).
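The role of the assumed oligomeric state can be made concrete with a short calculation. The sketch below converts a specific activity into a turnover number using the oligomer mass. The input values are placeholders rather than measurements from this work, the names are hypothetical, and the text does not state whether kcat was expressed per oligomer or per subunit, so both are computed.

```python
def kcat_per_second(vmax_umol_min_mg, subunit_mass_kda, n_subunits):
    """Turnover number per oligomer (s^-1) from a specific activity.

    1 kDa corresponds to 1 mg per µmol, so Vmax (µmol min^-1 mg^-1)
    multiplied by the oligomer mass (kDa) gives turnovers per minute.
    """
    oligomer_mass_kda = subunit_mass_kda * n_subunits
    return vmax_umol_min_mg * oligomer_mass_kda / 60.0


# Oligomeric states assumed in the text.
SUBUNITS = {"PPi-PFK": 2, "FBPase": 4, "PPase": 6}

# Placeholder inputs: neither value is taken from the dissertation.
example_vmax = 20.0         # µmol min^-1 mg^-1
example_subunit_kda = 45.0  # kDa per subunit

kcat_dimer = kcat_per_second(example_vmax, example_subunit_kda, SUBUNITS["PPi-PFK"])
kcat_per_subunit = kcat_dimer / SUBUNITS["PPi-PFK"]
```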

Catalytic efficiencies (Ef) were calculated using Equation 1 from Ceccarelli et al. (2008).

[Equation 1: efficiency function Ef as defined in Ceccarelli et al. (2008)]

KM - Michaelis constant

kcat - rate constant

k'cat - rate constant for the reverse reaction

kdif - rate for a diffusion-controlled process (10⁹ M⁻¹ s⁻¹)

Ke - equilibrium constant

ϴ - reversibility of the reaction (1 – reversible, 0 – irreversible)

Only enzymes with published KmFBP and kcatFBP values were included in this analysis. In particular, kinetic data for the Thermotoga maritima FBPase were taken from Myung et al. (2010). Catalytic parameters for E. coli FBPase were obtained from Kelley-Loughnane et al. (2002). The E. coli type II FBPases, GlpX and YggF, were described by Brown et al. (2009). Kinetic data for PPi-PFK from Xanthomonas campestris were documented in Frese et al. (2014).

References

Altschul, S.F. et al., 1990. Basic local alignment search tool. Journal of Molecular Biology, 215, pp.403–410.

Alves, A.M. et al., 1994. Enzymes of glucose and methanol metabolism in the actinomycete Amycolatopsis methanolica. Journal of Bacteriology, 176(22), pp.6827–6835.

Anders, S., Pyl, P.T. & Huber, W., 2015. HTSeq--a Python framework to work with high-throughput sequencing data. Bioinformatics, 31(2), pp.166–169.

Bapteste, E., Moreira, D. & Philippe, H., 2003. Rampant horizontal gene transfer and phospho-donor change in the evolution of the phosphofructokinase. Gene, 318, pp.185–191.

Bar-Even, A., Noor, E. & Milo, R., 2012. A survey of carbon fixation pathways through a quantitative lens. Journal of Experimental Botany, 63(6), pp.2325–2342.

Barry, J. et al., 2002. Methane-based symbiosis in a mussel, Bathymodiolus platifrons, from cold seeps in Sagami Bay, Japan. Invertebrate Biology, 121(1), pp.47–54.

Bassham, J.A. et al., 1953. The path of carbon in photosynthesis. xxi. The cyclic regeneration of carbon dioxide acceptor. Journal of the American Chemical Society, 76(7), pp.1760–1770.

Battistuzzi, F.U., Feijao, A. & Hedges, S.B., 2004. A genomic timescale of prokaryote evolution: insights into the origin of methanogenesis, phototrophy, and the colonization of land. BMC Evolutionary Biology, 4(44), pp.1–14.

Bergmeyer, H.U., 1975. Neue Werte für die molaren Extinktions-Koeffizienten von NADH und NADPH zum Gebrauch im Routine-Laboratorium. Zeitschrift fur klinische Chemie und klinische Biochemie, 13(11), pp.507–508.

Biegel, E. & Müller, V., 2011. A Na+-translocating pyrophosphatase in the acetogenic bacterium Acetobacterium woodii. The Journal of Biological Chemistry, 286(8), pp.6080–6084.

Bornefeld, T., 1981. Is light-dependent formation of inorganic pyrophosphate in Anacystis a photosynthetic process? Archives of Microbiology, 129(5), pp.371–373.

Boutet, I. et al., 2011. Conjugating effects of symbionts and environmental factors on gene expression in deep-sea hydrothermal vent mussels. BMC Genomics, 12(530), pp.1–13.

Brown, G. et al., 2009. Structural and biochemical characterization of the type II fructose-1,6-bisphosphatase GlpX from Escherichia coli. The Journal of Biological Chemistry, 284(6), pp.3784–3792.

Carlström, C.I. et al., 2015. Phenotypic and genotypic description of Sedimenticola selenatireducens strain CUZ, a marine (per)chlorate-respiring gammaproteobacterium, and its close relative the chlorate-respiring Sedimenticola strain NSS. R.E. Parales, ed. Applied and Environmental Microbiology, 81(8), pp.2717–2726.

Carnal, N.W. & Black, C.C., 1979. Pyrophosphate-dependent 6-phosphofructokinase, a new glycolytic enzyme in pineapple leaves. Biochemical and Biophysical Research Communications, 86(1), pp.20–26.

Cavanaugh, C.M., 1983. Symbiotic chemoautotrophic bacteria in marine invertebrates from sulphide-rich habitats. Nature, 302, pp.58–61.

Cavanaugh, C.M., Wirsen, C. & Jannasch, H., 1992. Evidence for methylotrophic symbionts in a hydrothermal vent mussel (Bivalvia: Mytilidae) from the Mid-Atlantic Ridge. Applied and Environmental Microbiology, 58(12), pp.3799–3803.

Cavanaugh, D.C.M. et al., 2013. Marine chemosynthetic symbioses. In The Prokaryotes. Berlin Heidelberg: Springer Berlin Heidelberg, pp. 579–607.

Ceccarelli, E.A., Carrillo, N. & Roveri, O.A., 2008. Efficiency function for comparing catalytic competence. Trends in Biotechnology, 26(3), pp.117–118.

Chen, C., Rabourdin, B. & Hammen, C., 1987. The effect of hydrogen sulfide on the metabolism of Solemya velum and enzymes of sulfide oxidation in gill tissue. Comparative Biochemistry and Physiology Part B: Biochemistry and Molecular Biology, 88(3), pp.949–952.

Chen, J. et al., 1990. Pyrophosphatase is essential for growth of Escherichia coli. Journal of Bacteriology, 172(10), pp.5686–5689.

Chi, A. & Kemp, R.G., 2000. The primordial high energy compound: ATP or inorganic pyrophosphate? The Journal of Biological Chemistry, 275(46), pp.35677–35679.

Cox, R.A., 2004. Quantitative relationships for specific growth rates and macromolecular compositions of Mycobacterium tuberculosis, Streptomyces coelicolor A3(2) and Escherichia coli B/r: an integrative theoretical approach. Microbiology, 150, pp.1413–1426.

Dahl, C. et al., 2013. Sulfite oxidation in the purple sulfur bacterium Allochromatium vinosum: identification of SoeABC as a major player and relevance of SoxYZ in the process. Microbiology, 159(Pt 12), pp.2626–2638.

Deng, Z.H. et al., 1999. Expression, characterization, and crystallization of the pyrophosphate-dependent phosphofructo-1-kinase of Borrelia burgdorferi. Archives of Biochemistry and Biophysics, 371(2), pp.326–331.

Ding, Y.H., Ronimus, R.S. & Morgan, H.W., 1999. Purification and properties of the pyrophosphate-dependent phosphofructokinase from Dictyoglomus thermophilum Rt46 B.1. Extremophiles, 3(2), pp.131–137.

Dmytrenko, O. et al., 2014. The genome of the intracellular bacterium of the coastal bivalve, Solemya velum: a blueprint for thriving in and out of symbiosis. BMC Genomics, 15(924), pp.1–20.

Doeller, J. et al., 1988. Gill hemoglobin may deliver sulfide to bacterial symbionts of Solemya velum (Bivalvia, Mollusca). The Biological Bulletin, 175(3), pp.388–396.

Donahue, J.L. et al., 2000. Purification and characterization of glpX-encoded fructose 1,6-bisphosphatase, a new enzyme of the glycerol 3-phosphate regulon of Escherichia coli. Journal of Bacteriology, 182(19), pp.5624–5627.

Drummond, A.J. et al., 2012. Bayesian Phylogenetics with BEAUti and the BEAST 1.7. Molecular Biology and Evolution, 29(8), pp.1969–1973.

Dubilier, N., Bergin, C. & Lott, C., 2008. Symbiotic diversity in marine animals: the art of harnessing chemosynthesis. Nature Reviews Microbiology, 6(10), pp.725–740.

Eisenthal, R., Danson, M.J. & Hough, D.W., 2007. Catalytic efficiency and kcat/KM: a useful comparator? Trends in Biotechnology, 25(6), pp.247–249.

Erb, T.J. & Zarzycki, J., 2018. A short history of RubisCO: the rise and fall (?) of Nature's predominant CO2 fixing enzyme. Current Opinion in Biotechnology, 49, pp.100–107.

Felbeck, H., 1981. Chemoautotrophic potential of the hydrothermal vent tube worm, Riftia pachyptila Jones (Vestimentifera). Science, 213(4505), pp.336–338.

Felbeck, H., Childress, J.J. & Somero, G.N., 1981. Calvin-Benson cycle and sulphide oxidation enzymes in animals from sulphide-rich habitats. Nature, 293(5830), pp.291–293.

Fiala-Medioni, A. et al., 2002. Ultrastructural, biochemical, and immunological characterization of two populations of the mytilid mussel Bathymodiolus azoricus from the Mid-Atlantic Ridge: evidence for a dual symbiosis. Marine Biology, 141(6), pp.1035–1043.

Fisher, M.R. & Hand, S.C., 1984. Chemoautotrophic symbionts in the bivalve Lucina floridana from seagrass beds. The Biological Bulletin, 167(2), pp.445–459.

Flood, B.E., Jones, D.S. & Bailey, J.V., 2015. Sedimenticola thiotaurini sp. nov., a sulfur-oxidizing bacterium isolated from salt marsh sediments, and emended descriptions of the genus Sedimenticola and Sedimenticola selenatireducens. International Journal of Systematic and Evolutionary Microbiology, 65(8), pp.2522–2530.

Flores, J.F. et al., 2005. Sulfide binding is mediated by zinc ions discovered in the crystal structure of a hydrothermal vent tubeworm hemoglobin. Proceedings of the National Academy of Sciences of the United States of America, 102(8), pp.2713–2718.

Frese, M. et al., 2014. Characterization of the pyrophosphate-dependent 6-phosphofructokinase from Xanthomonas campestris pv. campestris. Archives of Biochemistry and Biophysics, 546, pp.53–63.

Fujita, Y. et al., 1998. Identification and expression of the Bacillus subtilis fructose-1,6-bisphosphatase gene (fbp). Journal of Bacteriology, 180(16), pp.4309–4313.

Gerbling, K.P., Steup, M. & Latzko, E., 1986. Fructose 1,6-bisphosphatase form B from Synechococcus leopoliensis hydrolyzes both fructose and sedoheptulose bisphosphate. Plant Physiology, 80(3), pp.716–720.

Gernhard, T., Hartmann, K. & Steel, M., 2008. Stochastic properties of generalised Yule models, with biodiversity applications. Journal of Mathematical Biology, 57(5), pp.713–735.

Heinonen, J., 2001. Biological role of inorganic pyrophosphate, Norwell, MA: Kluwer Academic Publishers.

Heinonen, J., Honkasalo, S. & Kukko, E., 1981. A method for the concentration and for the colorimetric determination of nanomoles of inorganic pyrophosphate. Analytical Biochemistry, 117, pp.293–300.

Heinonen, J.K. & Drake, H.L., 1988. Comparative assessment of inorganic pyrophosphate and pyrophosphatase levels of Escherichia coli, Clostridium pasteurianum, and Clostridium thermoaceticum. FEMS Microbiology Letters, 52(3), pp.205–208.

Hines, J.K., Fromm, H.J. & Honzatko, R.B., 2007. Structures of activated fructose-1,6-bisphosphatase from Escherichia coli. The Journal of Biological Chemistry, 282(16), pp.11696–11704.

Hinkle, P.C., 2005. P/O ratios of mitochondrial oxidative phosphorylation. Biochimica et Biophysica Acta, 1706(1-2), pp.1–11.

Hunter, J.D., 2007. Matplotlib: A 2D Graphics Environment. Computing in Science & Engineering, 9(3), pp.90–95.

Huson, D. et al., 2007. MEGAN analysis of metagenomic data. Genome Research, 17(3), p.377.

Imhoff, J.F., 2005. Family I. Chromatiaceae Bavendamm 1924, 125AL emend. Imhoff 1984b, 339. In Bergey's manual of systematic bacteriology. New York, NY: Springer, pp. 3–40.

Kaneko, H., Minagawa, H. & Shimada, J., 2005. Rational Design of Thermostable Lactate Oxidase by Analyzing Quaternary Structure and Prevention of Deamidation. Biotechnology Letters, 27(22), pp.1777–1784.

Kankare, J. et al., 1994. The structure of E. coli soluble inorganic pyrophosphatase at 2.7 Å resolution. Protein Engineering, 7(7), pp.823–830.

Kaplan, W., Valiela, I. & Teal, J.M., 1977. Denitrification in a salt marsh. Microbial Ecology, 3, pp.193–204.

Kelley-Loughnane, N. et al., 2002. Purification, kinetic studies, and homology model of Escherichia coli fructose-1,6-bisphosphatase. Biochimica et Biophysica Acta, 1594(1), pp.6–16.

Kemp, R.G. & Tripathi, R.L., 1993. Pyrophosphate-dependent phosphofructo-1-kinase complements fructose 1,6-bisphosphatase but not phosphofructokinase deficiency in Escherichia coli. Journal of Bacteriology, 175(17), pp.5723–5724.

Kleiner, M. et al., 2012. Metaproteomics of a gutless marine worm and its symbiotic microbial community reveal unusual pathways for carbon and energy use. Proceedings of the National Academy of Sciences of the United States of America, 109(19), pp.1173–1182.

Krzywinski, M. et al., 2009. Circos: an information aesthetic for comparative genomics. Genome Research, 19(9), pp.1639–1645.

Ladror, U.S. et al., 1991. Cloning, sequencing, and expression of pyrophosphate-dependent phosphofructokinase from Propionibacterium freudenreichii. The Journal of Biological Chemistry, 266(25), pp.16550–16555.

Laemmli, U.K., 1970. Cleavage of structural proteins during the assembly of the head of bacteriophage T4. Nature, 227(5259), pp.680–685.

Lanfear, R. et al., 2012. PartitionFinder: Combined Selection of Partitioning Schemes and Substitution Models for Phylogenetic Analyses. Molecular Biology and Evolution, 29(6), pp.1695–1701.

Langmead, B. et al., 2009. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biology, 10(R25), pp.1–10.

Laue, B.E. & Nelson, D.C., 1994. Characterization of the gene encoding the autotrophic ATP sulfurylase from the bacterial endosymbiont of the hydrothermal vent tubeworm Riftia pachyptila. Journal of Bacteriology, 176(12), pp.3723–3729.

Lawhon, S.D. et al., 2002. Intestinal short-chain fatty acids alter Salmonella typhimurium invasion gene expression and virulence through BarA/SirA. Molecular microbiology, 46(5), pp.1451–1464.

Le, S.B. et al., 2017. 6-Phosphofructokinase and ribulose-5-phosphate 3-epimerase in methylotrophic Bacillus methanolicus ribulose monophosphate cycle. Applied Microbiology and Biotechnology, 101, pp.4185–4200.

Le, S.Q. & Gascuel, O., 2008. An improved general amino acid replacement matrix. Molecular Biology and Evolution, 25(7), pp.1307–1320.

Lee, R., Robinson, J. & Cavanaugh, C.M., 1999. Pathways of inorganic nitrogen assimilation in chemoautotrophic bacteria-marine invertebrate symbioses: expression of host and symbiont glutamine synthetase. The Journal of Experimental Biology, 202, pp.289–300.

Li, H. et al., 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics, 25(16), pp.2078–2079.

Li, W. & Godzik, A., 2006. Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics, 22(13), pp.1658–1659.

Lubbe, Y.J. et al., 2006. Siro(haem)amide in Allochromatium vinosum and relevance of DsrL and DsrN, a homolog of cobyrinic acid a,c-diamide synthase, for sulphur oxidation. FEMS Microbiology Letters, 261(2), pp.194–202.

Lutz, R.A. et al., 1994. Rapid growth at deep-sea vents. Nature, 371(6499), pp.663–664.

Markert, B. et al., 2014. Characterization of two transketolases encoded on the chromosome and the plasmid pBM19 of the facultative ribulose monophosphate cycle methylotroph Bacillus methanolicus. BMC Microbiology, 14(7), pp.1–11.

Markert, S. et al., 2011. Status quo in physiological proteomics of the uncultured Riftia pachyptila endosymbiont. Proteomics, 11(15), pp.3106–3117.

Martin, W. & Schnarrenberger, C., 1997. The evolution of the Calvin cycle from prokaryotic to eukaryotic chromosomes: a case study of functional redundancy in ancient pathways through endosymbiosis. Current Genetics, 32(1), pp.1–18.

Mertens, E., 1991. Pyrophosphate-dependent phosphofructokinase, an anaerobic glycolytic enzyme? FEBS Letters, 285(1), pp.1–5.

Mertens, E., Van Schaftingen, E. & Müller, M., 1989. Presence of a fructose-2,6-bisphosphate-insensitive pyrophosphate: fructose-6-phosphate phosphotransferase in the anaerobic protozoa Tritrichomonas foetus, Trichomonas vaginalis and Isotricha prostoma. Molecular and Biochemical Parasitology, 37(2), pp.183–190.

Michels, P. et al., 1997. The glycosomal ATP-dependent phosphofructokinase of Trypanosoma brucei must have evolved from an ancestral pyrophosphate-dependent enzyme. European Journal of Biochemistry, 250(3), pp.698–704.

Mitchell, T.A. & Cavanaugh, C.M., 1983. Numbers of symbiotic bacteria in the gill tissue of the bivalve Solemya velum Say. The Biological Bulletin, (165), p.521.

Müller, M. et al., 2001. Presence of prokaryotic and eukaryotic species in all subgroups of the PPi-dependent group II phosphofructokinase protein family. Journal of Bacteriology, 183(22), pp.6714–6716.

Myung, S., Wang, Y. & Zhang, Y.H.P., 2010. Fructose-1,6-bisphosphatase from a hyper-thermophilic bacterium Thermotoga maritima: Characterization, metabolite stability, and its implications. Process Biochemistry, 45(12), pp.1882–1887.

Nakagawa, S. et al., 2014. Allying with armored snails: the complete genome of gammaproteobacterial endosymbiont. The ISME Journal, 8(1), pp.40–51.

Nelson, D. & Hagen, K., 1995. Physiology and biochemistry of symbiotic and free-living chemoautotrophic sulfur bacteria. Integrative and Comparative Biology, 35(2), pp.91–101.

Newton, I. et al., 2007. The Calyptogena magnifica chemoautotrophic symbiont genome. Science, 315(5814), pp.998–1000.

Nussbaumer, A.D., Fisher, C.R. & Bright, M., 2006. Horizontal endosymbiont transmission in hydrothermal vent tubeworms. Nature, 441(7091), pp.345–348.

O'Brien, W.E., Bowien, S. & Wood, H.G., 1975. Isolation and characterization of a pyrophosphate-dependent phosphofructokinase from Propionibacterium shermanii. The Journal of Biological Chemistry, 250(22), pp.8690–8695.

Pagel, M., Meade, A. & Barker, D., 2004. Bayesian Estimation of Ancestral Character States on Phylogenies. Systematic Biology, 53(5), pp.673–684.

Parey, K. et al., 2013. Structural, biochemical and genetic characterization of dissimilatory ATP sulfurylase from Allochromatium vinosum. PLoS ONE, 8(9), pp.1–9.

Petersen, J.M. et al., 2011. Hydrogen is an energy source for hydrothermal vent symbioses. Nature, 476(7359), pp.176–180.

Pfleiderer, C. & Klemme, J.H., 1980. [Pyrophosphate-dependent D-fructose-6-phosphate-phosphotransferase in Rhodospirillaceae (author's transl)]. Zeitschrift fur Naturforschung. Section C, Biosciences, 35(3-4), pp.229–238.

Plazzi, F., Ribani, A. & Passamonti, M., 2013. The complete mitochondrial genome of Solemya velum (Mollusca: Bivalvia) and its relationships with Conchifera. BMC Genomics, 14(409), pp.1–23.

Polz, M. et al., 2000. When bacteria hitch a ride. ASM News, 66(9), pp.531–539.

Polz, M.F. et al., 1992. Chemoautotrophic, sulfur-oxidizing symbiotic bacteria on marine nematodes: morphological and biochemical characterization. Microbial Ecology, 24(3), pp.313–329.

Rashid, N. et al., 2002. A novel candidate for the true fructose-1,6-bisphosphatase in Archaea. The Journal of Biological Chemistry, 277(34), pp.30649–30655.

Raven, J.A., 2009. Contributions of anoxygenic and oxygenic phototrophy and chemolithotrophy to carbon and oxygen fluxes in aquatic environments. Aquatic Microbial Ecology, 56(2-3), pp.177–192.

Raven, J.A., 2013. Rubisco: still the most abundant protein of Earth? New Phytologist, 198(1), pp.1–3.

Reeves, R.E. et al., 1974. Pyrophosphate:D-fructose 6-phosphate 1-phosphotransferase. A new enzyme with the glycolytic function of 6-phosphofructokinase. The Journal of Biological Chemistry, 249(24), pp.7737–7741.

Reshetnikov, A.S. et al., 2008. Characterization of the pyrophosphate-dependent 6-phosphofructokinase from Methylococcus capsulatus Bath. FEMS Microbiology Letters, 288(2), pp.202–210.

Rittmann, D. et al., 2003. Fructose-1,6-bisphosphatase from Corynebacterium glutamicum: expression and deletion of the fbp gene and biochemical characterization of the enzyme. Archives of Microbiology, 180(4), pp.285–292.

Roberton, A.M. & Glucina, P.G., 1982. Fructose 6-phosphate phosphorylation in Bacteroides species. Journal of Bacteriology, 150(3), pp.1056–1060.

Ronimus, R.S., Morgan, H.W. & Ding, Y.H.R., 1999. Phosphofructokinase activities within the order Spirochaetales and the characterisation of the pyrophosphate-dependent phosphofructokinase from Spirochaeta thermophila. Archives of Microbiology, 172(6), pp.401–406.

Russell, S.L., Corbett-Detig, R.B. & Cavanaugh, C.M., 2017. Mixed transmission modes and dynamic genome evolution in an obligate animal–bacterial symbiosis. The ISME Journal, 11, pp.1359–1371.

Saggu, S.K. & Mishra, P.C., 2017. Characterization of thermostable alkaline proteases from Bacillus infantis SKS1 isolated from garden soil. PLoS ONE, 12(11), pp.1–18.

Sanders, J.G. et al., 2013. Metatranscriptomics reveal differences in in situ energy and nitrogen metabolism among hydrothermal vent snail symbionts. The ISME Journal, 7(8), pp.1556–1567.

Saxena, H. et al., 2018. Characterization of a thermostable endoglucanase from Cellulomonas fimi ATCC484. Biochemistry and Cell Biology, 96(1), pp.68–76.

Schwander, T. et al., 2016. A synthetic pathway for the fixation of carbon dioxide in vitro. Science, 354(6314), pp.900–904.

Schwedock, J. et al., 2004. Characterization and expression of genes from the RubisCO gene cluster of the chemoautotrophic symbiont of Solemya velum: cbbLSQO. Archives of Microbiology, 182(1), pp.18–29.

Scott, K.M. & Cavanaugh, C.M., 2007. CO2 uptake and fixation by endosymbiotic chemoautotrophs from the bivalve Solemya velum. Applied and Environmental Microbiology, 73(4), pp.1174–1179.

Serrano, A. et al., 2007. H+-PPases: yesterday, today and tomorrow. IUBMB Life, 59(2), pp.76–83.

Sheridan, P.P., Freeman, K.H. & Brenchley, J.E., 2003. Estimated minimal divergence times of the major bacterial and archaeal phyla. Geomicrobiology Journal, 20(1), pp.1–14.

Siebers, B., Klenk, H. & Hensel, R., 1998. PPi-dependent phosphofructokinase from Thermoproteus tenax, an archaeal descendant of an ancient line in phosphofructokinase evolution. Journal of Bacteriology, 180(8), pp.2137–2143.

Singer, S.J. et al., 1952. The proteins of green leaves. IV. A high molecular weight protein comprising a large part of the cytoplasmic proteins. The Journal of Biological Chemistry, 197(1), pp.233–239.

Stewart, F.J. & Cavanaugh, C.M., 2006. Bacterial endosymbioses in Solemya (Mollusca: Bivalvia)—model systems for studies of symbiont–host adaptation. Antonie van Leeuwenhoek, 90(4), pp.343–360.

Stewart, F.J. et al., 2009. Evidence for homologous recombination in intracellular chemosynthetic clam symbionts. Molecular Biology and Evolution, 26(6), pp.1391–1404.

Stewart, F.J. et al., 2011. Metatranscriptomic analysis of sulfur oxidation genes in the endosymbiont of Solemya velum. Frontiers in Microbiology, 2, pp.1–10.

Stewart, F.J., Ottesen, E.A. & Delong, E.F., 2010. Development and quantitative analyses of a universal rRNA-subtraction protocol for microbial metatranscriptomics. The ISME Journal, 4(7), pp.896–907.

Stitt, M., 1989. Product inhibition of potato-tuber pyrophosphate:fructose-6-phosphate phosphotransferase by phosphate and pyrophosphate. Plant Physiology, 89(2), pp.628–633.

Stockdreher, Y. et al., 2014. New Proteins Involved in Sulfur Trafficking in the Cytoplasm of Allochromatium vinosum. Journal of Biological Chemistry, 289(18), pp.12390–12403.

Stolzenberger, J. et al., 2013. Characterization of fructose 1,6-bisphosphatase and sedoheptulose 1,7-bisphosphatase from the facultative ribulose monophosphate cycle methylotroph Bacillus methanolicus. Journal of Bacteriology, 195(22), pp.5112–5122.

Teich, R. et al., 2007. Origin and distribution of Calvin cycle fructose and sedoheptulose bisphosphatases in plantae and complex algae: A single secondary origin of complex red plastids and subsequent propagation via tertiary endosymbioses. Protist, 158(3), pp.263–276.

Teplitski, M., Goodier, R.I. & Ahmer, B., 2003. Pathways leading from BarA/SirA to motility and virulence gene expression in Salmonella. Journal of Bacteriology, 185(24), pp.7257–7265.

Theodorou, M.E. & Plaxton, W.C., 1996. Purification and characterization of pyrophosphate-dependent phosphofructokinase from phosphate-starved Brassica nigra suspension cells. Plant Physiology, 112(1), pp.343–351.

Trotsenko, Y. & Shishkina, V., 1990. Studies on Phosphate-Metabolism in Obligate Methanotrophs. In FEMS Microbiology Reviews. pp. 267–271.

Trotsenko, Y.A., Murrell, J.C. & Gadd, G.M., 2008. Metabolic aspects of aerobic obligate methanotrophy. Advances in Applied Microbiology, 63, pp.183–229.

van Alebeek, G. & Keltjens, J.T., 1994. Purification and characterization of inorganic pyrophosphatase from Methanobacterium thermoautotrophicum (strain ΔH). Biochimica et Biophysica Acta, 1206(2), pp.231–239.

Walsh, D.A. et al., 2009. Metagenome of a versatile chemolithoautotroph from expanding oceanic dead zones. Science, 326(5952), pp.578–582.

Wang, B. et al., 2016. Characterization of a novel highly thermostable esterase from the Gram-positive soil bacterium Streptomyces lividans TK64. Biotechnology and Applied Biochemistry, 63(3), pp.334–343.

Ward, N. et al., 2004. Genomic insights into methanotrophy: the complete genome sequence of Methylococcus capsulatus (Bath). N.A. Moran, ed. PLoS Biology, 2(10), p.e303.

Weaver, P.F., Wall, J.D. & Gest, H., 1975. Characterization of Rhodopseudomonas capsulata. Archives of Microbiology, 105(3), pp.207–216.

Weissgerber, T. et al., 2011. Complete genome sequence of Allochromatium vinosum DSM 180(T). Standards in Genomic Sciences, 5(3), pp.311–330.

Wu, D., Jospin, G. & Eisen, J.A., 2013. Systematic identification of gene families for use as “markers” for phylogenetic and phylogeny-driven ecological studies of bacteria and archaea and their major subgroups. PLoS ONE, 8(10), pp.1–11.

Yoo, J.-G. & Bowien, B., 1995. Analysis of the cbbF genes from Alcaligenes eutrophus that encode fructose-1,6-/sedoheptulose-1,7-bisphosphatase. Current Microbiology, 31(1), pp.55–61.

CHAPTER 3

The enigmatic Calvin cycle of chemoautotrophic bacterial symbionts deciphered

Oleg Dmytrenko1, Alicja J. Kunikowska2, Colleen M. Cavanaugh1

1Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts, United States of America.

2Klinikum Rechts der Isar der Technischen Universität München, Munich, Germany.

Abstract

Autotrophic CO2 fixation is the main source of organic carbon on Earth. Virtually all primary productivity, from bacteria to higher plants, is carried out by a conserved set of enzymatic reactions which constitute the Calvin-Benson-Bassham (Calvin) cycle. Chemoautotrophic gammaproteobacterial endosymbionts of marine invertebrates are some of the most prolific primary producers which use the Calvin cycle. However, these bacteria lack a gene for fructose bisphosphatase (FBPase), a key enzyme in this CO2 fixation pathway. Since the sequencing of the first symbiont genome, it has remained unknown how the Calvin cycle operates in these bacteria without FBPase. This was partially due to our inability to culture and genetically manipulate chemoautotrophic symbionts. By reconstructing the symbiont-like Calvin cycle in a free-living, closely related purple sulfur gammaproteobacterium, Allochromatium vinosum, we have for the first time demonstrated that in the absence of FBPase its function in the cycle can be performed by a reversible pyrophosphate-dependent phosphofructokinase (PPi-PFK), previously hypothesized to participate in CO2 fixation. The shift from FBPase to PPi-PFK came at the cost of reduced growth and decreased adaptability but, at the same time, offered an improvement in thermodynamic efficiency potentially due to an increase in the metabolism of pyrophosphate, which could be generated by PPi-PFK acting in the Calvin cycle. Using this experimental approach we have not only demonstrated a novel energy-efficient variant of the Calvin cycle hypothesized in chemoautotrophic symbionts, but also shown the feasibility of experimentally testing metabolic hypotheses postulated based on sequence data from uncultured symbiotic microorganisms.

Introduction

The vast majority of known bacteria remain uncultured (Pace 2009; Robertson et al. 2013) and are not easily amenable to culture-independent experimental manipulation. This severely limits the possibilities for studying the physiology, function, and activity of uncultured bacteria such as chemoautotrophic endosymbionts of marine invertebrates (Cavanaugh et al. 2013). Inferences made from DNA and RNA sequence data have helped advance understanding of the symbionts and inform future research directions. Multiple insightful hypotheses have been proposed based on sequence data, many of which await experimental validation. One of the most notable such hypotheses in the field of chemoautotrophic symbioses traces back to the first sequenced symbiont genome, that of the intracellular gammaproteobacterium which colonizes gills of the deep-sea vent giant clam, Calyptogena magnifica (Newton et al. 2007). The gene encoding fructose bisphosphatase (FBPase), an enzyme which catalyzes essential reactions in the Calvin cycle, has not been found in the genome of this or any other gammaproteobacterial symbiont sequenced to date. Despite the absence of FBPase, the symbionts are able to fix CO2 using ribulose 1,5-bisphosphate carboxylase oxygenase (RuBisCO), the key enzyme in the Calvin cycle (Felbeck et al. 1981; Cavanaugh 1983; Robinson et al. 1998; Singer et al. 1952; Erb & Zarzycki 2018). The resulting organic carbon feeds their hosts in exchange for sulfide and oxygen sequestered and delivered to the bacteria (Fisher & Childress 1992; Polz et al. 2000; Hourdez & Weber 2005; Scott & Cavanaugh 2007). Chemoautotrophic symbionts are, in fact, among the most prolific primary producers in the ocean, capable of supporting some of the fastest known growth rates among marine invertebrates (Lutz et al. 1994). It has been hypothesized that the function of the missing FBPase in the symbionts may be performed by a pyrophosphate-dependent phosphofructokinase (PPi-PFK) acting in reverse (Newton et al. 2007; Markert et al. 2007; Kleiner et al. 2012; Dmytrenko et al. 2014). A comprehensive survey of bacterial genomes reveals that this genetic trait is confined to two disparate monophyletic clades dominated by symbionts within gammaproteobacteria (Dmytrenko et al. 2018), suggesting a potential link between the evolution of chemoautotrophic symbioses and the shift from FBPase to PPi-PFK in the Calvin cycle.

[Figure 3.1 image: diagram of the Calvin cycle with enzymes labeled RuBisCO, PGK, GAPDH, TPI, FBA, FBPase/PPi-PFK, TK, RPE, RPI, and PRK.]

Figure 3.1. Hypothesized Calvin cycle in A. vinosum featuring interchangeable FBPase and PPi-PFK activity. The reactions catalyzed by PPi-PFK are shown in red. Enzyme names and their corresponding locus tags in the A. vinosum genome are listed in Appendix 3 Table A3.1.

The Calvin cycle is the primary biological CO2 fixation pathway found in bacteria, algae, and higher plants, and is responsible for over 90% of primary production (Raven 2009; Berg 2011; Schwander et al. 2016). It carries out carbon incorporation into biomass through a synergistic action of thirteen evolutionarily conserved reactions catalyzed by enzymes, which could be orthologous, paralogous, or structurally unrelated to each other in different species (Figure 3.1) (Martin & Schnarrenberger 1997). RuBisCO combines CO2 with ribulose bisphosphate (RBP) to make two 3-phosphoglycerates, which are then converted into triose phosphates. They, in turn, may serve as precursors of most other organic carbon molecules. The remaining reactions of the Calvin cycle regenerate RBP from triose phosphates for the next round of CO2 fixation and supply intermediates to other cellular pathways (Sato & Atomi 2010; Bar-Even et al. 2012). One of the enzymes involved in RBP regeneration is FBPase, which in bacteria dephosphorylates fructose 1,6-bisphosphate (FBP) and sedoheptulose 1,7-bisphosphate (SBP) to fructose 6-phosphate (F6P) and sedoheptulose 7-phosphate (S7P), respectively (Gerbling et al. 1986; Yoo & Bowien 1995). Under physiological conditions these reactions are irreversible. Conversion of FBP to F6P via FBPase is also part of gluconeogenesis. Dephosphorylation of SBP, on the other hand, is specific to the Calvin cycle and in eukaryotes is catalyzed by a separate enzyme, sedoheptulose bisphosphatase (SBPase), which has no affinity for FBP (Teich et al. 2007). Without FBPase activity, the RBP CO2 acceptor could not be regenerated, stalling out the Calvin cycle.
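The place of the two dephosphorylation steps in RBP regeneration can be illustrated with a small bookkeeping sketch. It uses the textbook reaction sequence that turns five triose phosphates into three pentose phosphates (the share of regeneration that corresponds to three fixed CO2), which is an assumption drawn from general biochemistry rather than from measurements in this chapter. The function names are hypothetical, and the "PPi-PFK" branch encodes the hypothesized reverse reaction, not a demonstrated flux.

```python
from collections import Counter


def regeneration_reactions(dephosphorylation):
    """Return the regeneration reactions as (substrates, products) pairs.

    `dephosphorylation` selects which enzyme removes the phosphate from
    FBP and SBP: "FBPase" (hydrolysis) or "PPi-PFK" (hypothesized reverse
    reaction that transfers the phosphate onto Pi, forming PPi).
    """
    if dephosphorylation == "FBPase":
        fbp_step = ({"FBP": 1, "H2O": 1}, {"F6P": 1, "Pi": 1})
        sbp_step = ({"SBP": 1, "H2O": 1}, {"S7P": 1, "Pi": 1})
    elif dephosphorylation == "PPi-PFK":
        fbp_step = ({"FBP": 1, "Pi": 1}, {"F6P": 1, "PPi": 1})
        sbp_step = ({"SBP": 1, "Pi": 1}, {"S7P": 1, "PPi": 1})
    else:
        raise ValueError("expected 'FBPase' or 'PPi-PFK'")
    return [
        ({"GAP": 1, "DHAP": 1}, {"FBP": 1}),            # aldolase (FBA)
        fbp_step,                                        # FBP -> F6P
        ({"F6P": 1, "GAP": 1}, {"E4P": 1, "Xu5P": 1}),  # transketolase (TK)
        ({"E4P": 1, "DHAP": 1}, {"SBP": 1}),            # aldolase (FBA)
        sbp_step,                                        # SBP -> S7P
        ({"S7P": 1, "GAP": 1}, {"R5P": 1, "Xu5P": 1}),  # transketolase (TK)
    ]


def net_balance(reactions):
    """Sum all reactions; negative counts are consumed, positive are produced."""
    total = Counter()
    for substrates, products in reactions:
        for metabolite, n in substrates.items():
            total[metabolite] -= n
        for metabolite, n in products.items():
            total[metabolite] += n
    # Intermediates cancel out and are dropped from the summary.
    return {m: n for m, n in total.items() if n != 0}


if __name__ == "__main__":
    for variant in ("FBPase", "PPi-PFK"):
        print(variant, net_balance(regeneration_reactions(variant)))
```

Under these assumptions, both variants convert three GAP and two DHAP into two xylulose 5-phosphate and one ribose 5-phosphate. The FBPase route releases two Pi, whereas the PPi-PFK route consumes two Pi and yields two PPi per three CO2 fixed.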

Chemoautotrophic symbionts which do not encode FBPase in their genomes may instead use a bidirectional PPi-PFK, an enzyme thought to operate as a kinase in glycolysis using pyrophosphate (PPi) as a phosphoryl group donor, unlike the more common virtually irreversible ATP-dependent PFK (Mertens 1991). PPi-PFK, which shares common ancestry with ATP-PFK (Bapteste et al. 2003), was first discovered in Entamoeba histolytica (Reeves et al. 1974) and later found in bacteria (O'Brien et al. 1975), plants (Carnal & Black 1979), and archaea (Siebers et al. 1998). It is unclear why some organisms have one or the other version of PFK, as most do not simultaneously carry genes for both enzymes (Mertens 1991; Dmytrenko et al. 2018). Similar to bacterial FBPases, bacterial PPi-PFKs are known to be promiscuous for FBP and SBP in the reverse reaction (Reshetnikov et al. 2008), which suggests that these enzymes could be interchangeable in the Calvin cycle. Unlike FBPase, PPi-PFK generates PPi in the reverse reaction and consumes PPi when it acts as a kinase in the forward (glycolytic) direction. Bacteria have on average high PPi content (0.5 to 1.5 mM), which is produced, for instance, during syntheses of DNA, RNA, proteins, and polysaccharides (Heinonen & Drake 1988; Bornefeld 1981; J. Chen et al. 1990). At these concentrations PPi is inhibitory to the reverse PPi-PFK activity (Anon 1989; Anon 1996). For example, in the case of PPi-PFK from a chemoautotrophic symbiont of the coastal bivalve Solemya velum, 1 mM PPi reduces catalytic efficiency of the reverse reaction by more than threefold and makes the forward reaction more favorable (Dmytrenko et al. 2018). To prevent this inhibition and allow PPi-PFK to function in reverse, a low concentration of cellular PPi has to be maintained. This can be accomplished by the action of diverse PPi-consuming enzymes, such as inorganic pyrophosphatase (PPase) (Josse 1966; Klemme & Gest 1971; van Alebeek & Keltjens 1994; Jeon & Ishikawa 2005; Hoelzle et al. 2010), proton pumping PPase (H+-PPase) (Nyrén et al. 1984; Ordaz et al. 1992; Schultz & Baltscheffsky 2003; Serrano et al. 2004), or ATP sulfurylase (SAT) (Parey et al. 2013). As H+-PPase hydrolyzes PPi, this membrane-bound enzyme creates an electrochemical proton gradient which can be used for ATP synthesis. SAT activity can also lead to ATP formation as a result of transferring PPi to adenosine 5'-phosphosulfate (APS) in a final step of sulfur oxidation to sulfate (SO₄²⁻). The above enzymes, detected in chemoautotrophic symbionts (Felbeck et al. 1981; C. Chen et al. 1987; Fisher et al. 1993; Laue & Nelson 1994) and identified in their genomes and proteomes (Kleiner et al. 2012; Markert et al. 2011; Dmytrenko et al. 2014), could reduce the available pool of PPi, stimulating PPi-PFK reverse activity and concomitantly increasing cellular ATP content. The need for PPi removal suggests that the use of PPi-PFK in the Calvin cycle may affect not only CO2 fixation, but the overall energy balance and physiology of the cells.
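For reference, the reactions discussed in this paragraph can be written out explicitly. The block below is a summary sketch in standard biochemical notation, with P_i denoting orthophosphate. The first two lines show PPi-PFK in the reverse (Calvin cycle) direction, and the last two show soluble PPase and SAT as PPi sinks. H+-PPase catalyzes the same hydrolysis as PPase coupled to proton translocation, whose stoichiometry is not stated in the text and is therefore omitted.

```latex
\begin{align*}
\text{FBP} + \text{P}_i &\rightleftharpoons \text{F6P} + \text{PP}_i && \text{(PPi-PFK, reverse direction)}\\
\text{SBP} + \text{P}_i &\rightleftharpoons \text{S7P} + \text{PP}_i && \text{(PPi-PFK, reverse direction)}\\
\text{PP}_i + \text{H}_2\text{O} &\rightarrow 2\,\text{P}_i && \text{(PPase)}\\
\text{APS} + \text{PP}_i &\rightleftharpoons \text{ATP} + \text{SO}_4^{2-} && \text{(SAT)}
\end{align*}
```

Written this way, it is apparent that removing PPi through either sink pulls the first two equilibria toward F6P and S7P, which is the direction required for RBP regeneration.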

Since chemoautotrophic symbionts are yet to be cultured outside of their host and no molecular genetics tools are available to study their gene function, a genetically tractable model bacterium is needed to experimentally test the hypothesized ability of PPi-PFK to replace FBPase in the Calvin cycle. Using a free-living bacterium could overcome the limitations of working with the symbionts and provide a controlled experimental system free from potential influences of the host. Allochromatium vinosum is well suited for this purpose. It is a free-living 2.0 µm x 2.5-6.0 µm rod-shaped anoxygenic facultatively photolithoautotrophic purple sulfur gammaproteobacterium, which is closely related to chemoautotrophic symbionts (Imhoff 2005; Dubilier et al. 2008; Dmytrenko et al. 2014). Tools for manipulative genetics have been developed for A. vinosum (Pattaragulwanit & Dahl 1995; Lubbe et al. 2006) and successfully applied to the study of sulfur metabolism (Sander et al. 2006; Dahl et al. 2013; Stockdreher et al. 2014). Sequencing of the A. vinosum genome (Weissgerber et al. 2011) has expanded opportunities for applying the available genetic tools to studying other aspects of its physiology, such as carbon metabolism. A. vinosum is able to grow photolithoautotrophically by fixing CO2 via the Calvin cycle with energy obtained from light, using reduced sulfur compounds as electron donors (Imhoff 2005). A. vinosum is metabolically versatile and can utilize sulfide, elemental sulfur, polysulfides, thiosulfate, and sulfite. It also grows photoorganoheterotrophically, e.g., on acetate, malate, or pyruvate, providing an opportunity for creating and maintaining knockout mutants which are deficient in CO2 fixation. In its genome, A. vinosum encodes FBPase and PPi-PFK, which have both been previously purified and shown to dephosphorylate FBP to F6P (Dmytrenko et al. 2018). By selectively inactivating genes for FBPase (fbp) and PPi-PFK (pfp) in A. vinosum, we have examined their role in CO2 fixation and physiology. In particular, by creating the symbiont-like ∆fbp knockout, we tested the hypothesized ability of the remaining PPi-PFK to replace FBPase in the Calvin cycle. Effects of the shift from FBPase to PPi-PFK in A. vinosum were investigated by measuring growth and CO2 fixation rates in the wild type (WT) and the knockouts. To quantify energy savings potentially associated with PPi-PFK use in the Calvin cycle, we assessed sulfur consumption rates (as a proxy for reducing equivalents) and ATP levels in the culture.

In this study we recreated the hypothesized Calvin cycle from chemoautotrophic symbionts in A. vinosum and demonstrated that pfp is sufficient and, in the absence of fbp, essential for CO2 fixation and growth. Our results provided compelling evidence for a novel Calvin cycle variant, potentially deciphering one of the biggest conundrums in chemoautotrophic symbiosis. Our data yielded important insights into the physiological changes associated with the shift from FBPase to PPi-PFK, including an increase in thermodynamic efficiency at the cost of adaptability and growth rate. This observation may have direct implications for understanding the metabolic changes potentially associated with the evolution of chemoautotrophic symbioses. Finally, our experimental approach demonstrates the feasibility of testing hypotheses based on sequence data from uncultured symbiotic bacteria using molecular genetics in closely related, experimentally tractable organisms.

Results

fbp and pfp genes were knocked out in A. vinosum

In A. vinosum, fbp and pfp genes were deleted by double-crossover homologous recombination (Figure 3.2). In the process, both genes were substituted in frame with antibiotic resistance genes, aphA and aacC1, respectively. These markers were placed under control of the Pfbp and Ppfp promoters to prevent potential polar effects on downstream genes. The success rate of obtaining A. vinosum ∆fbp knockouts was 100%. Double-crossover ∆pfp mutants constituted 14% of the screened colonies. In the case of ∆fbp ∆pfp mutants, made as an fbp knockout in the ∆pfp background, 7% of over one hundred tested colonies carried both deletions.

[Figure 3.2 image: (A) fbp deletion and (B) pfp deletion. Each panel shows the WT locus, the HR template in pCM433 (fbpL::aphA::fbpR or pfpL::aacC1::pfpR), and the resulting knockout locus, alongside PCR gels of the fbp and pfp loci. Lanes include ladder, ∆fbp::aphA, ∆pfp::aacC1, ∆fbp::aphA ∆pfp::aacC1, WT, and negative control; ladder marks at 0.5, 1.0, 2.0, and 3.0 kb.]

Figure 3.2. Construction and PCR analysis of (A) fbp and (B) pfp gene knockouts in A. vinosum. fbp and pfp genes (purple) were replaced by homologous recombination with promoterless aphA kanamycin resistance (green) and aacC1 gentamicin resistance (orange) genes. The double crossover homologous recombination (HR) products fused the aphA and aacC1 antibiotic resistance genes with the Pfbp and Ppfp promoters. To create the ∆fbp ∆pfp double knockout, aacC1 with a constitutive gentamicin promoter was used.

Either pfp or fbp is sufficient, and at least one is essential, for autotrophic growth in A. vinosum

To investigate the effects of fbp and pfp knockouts on the physiology of A. vinosum, bacteria were grown in a controlled anaerobic bioreactor (Figure 3.3). pH, sulfide concentration, temperature, and illumination were continuously monitored and kept constant. Optical density of the culture was measured throughout growth.

The A. vinosum ∆fbp and ∆pfp knockouts maintained their ability to grow photolithoautotrophically in the presence of light with sulfide as a source of electrons and supplemented bicarbonate as the sole source of dissolved inorganic carbon (DIC) (Figure 3.4). Similar to the WT, the ∆pfp knockout entered exponential growth phase approximately 40 hours after inoculation. A. vinosum ∆fbp had a significantly longer lag, starting to grow after, on average, 100 hours. Deletion of both genes completely abolished autotrophic growth even though the cells remained metabolically active and consumed sulfide throughout incubation (Appendix 3 Figure A3.1).

A. vinosum WT, ∆fbp, and ∆pfp were able to grow photoorganoheterotrophically in a medium containing acetate, malate, and thiosulfate (Figure 3.5). Continuous sulfide feeding and pH control, necessary to maintain autotrophic growth, were not used. Under heterotrophic conditions, WT and ∆pfp exhibited similar growth. A. vinosum ∆fbp followed a comparable growth dynamic, plateauing, however, at a lower optical density. At approximately OD690 0.75 all strains underwent a diauxic shift. No significant lag in growth was observed between the three strains.

Figure 3.3. Bioreactor setup for growing A. vinosum with automated pH control, sulfide feeding, and optical density measurements. Cultures were incubated in the presence of 0.3-0.5 mM sulfide, at pH 7.0, 30℃, and 42,000 Lux (400-700 nm). Peristaltic pumps are not shown. (Fishing person is included as homage to Melvin Calvin (Wilson & Calvin 1955)).

Figure 3.4. Photoautotrophic growth of A. vinosum WT, ∆fbp, ∆pfp, and ∆fbp ∆pfp. OD690 was measured every 10 min for the WT, ∆fbp, ∆pfp, and every 30 min for the ∆fbp ∆pfp cultures. Shaded areas around mean values for each strain indicate standard error of the mean (SEM) (WT N=2, ∆fbp N=3, ∆pfp N=3).
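The shaded bands and error bars in the figures of this chapter report the standard error of the mean (SEM). As a reminder of what that quantity is, the sketch below computes a mean and SEM from replicate readings using only the Python standard library. The function name and the example readings are hypothetical and do not reproduce data from this study.

```python
import math
import statistics


def mean_and_sem(values):
    """Return (mean, SEM) for replicate measurements.

    SEM is the sample standard deviation (n - 1 in the denominator)
    divided by the square root of the number of replicates.
    """
    if len(values) < 2:
        raise ValueError("at least two replicates are needed for an SEM")
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


if __name__ == "__main__":
    # Hypothetical OD690 readings from three replicate cultures at one time point.
    readings = [0.40, 0.50, 0.60]
    print(mean_and_sem(readings))
```

With only two or three replicates per strain, as in the experiments here, the SEM is itself a rough estimate and is best read as an indication of spread rather than a precise interval.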

The A. vinosum ∆fbp ∆pfp knockout was unable to grow heterotrophically unless fructose or glucose was present in the medium (Figure 3.6). Growth commenced 120 and 420 hours after inoculation when the cultures were supplemented with fructose and glucose, respectively. Compared to glucose, an equimolar amount of fructose yielded an overall higher cell density. Other sugars, such as sucrose, rhamnose, and glucuronate, were unable to complement the mutant phenotype even after over two months of incubation.

Under autotrophic and heterotrophic conditions A. vinosum WT showed comparable growth rates (Figure 3.7). Single knockout mutations did not affect growth rates on malate and acetate. With DIC as the sole carbon source, A. vinosum ∆fbp grew on average 26% slower than the WT and 20% slower than the ∆pfp knockout. No significant difference in growth rates between the WT and ∆pfp was observed. The ∆fbp ∆pfp mutant was able to grow neither autotrophically nor heterotrophically. Supplementation with fructose or glucose enabled growth of the double knockout at 30% of the WT rate.
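Growth rate comparisons of this kind are often made by fitting a line to the logarithm of optical density over an interval of exponential growth. The sketch below shows one such calculation under that assumption. It is not necessarily the procedure used to produce Figure 3.7, and the function names and example values are hypothetical.

```python
import math


def specific_growth_rate(times_h, od_values):
    """Least-squares slope of ln(OD) against time, in units of 1/hour."""
    n = len(times_h)
    if n != len(od_values) or n < 2:
        raise ValueError("need two or more paired time and OD readings")
    logs = [math.log(od) for od in od_values]
    t_mean = sum(times_h) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times_h)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times_h, logs))
    return sxy / sxx


def doubling_time(mu_per_h):
    """Doubling time in hours for a specific growth rate in 1/hour."""
    return math.log(2) / mu_per_h


def percent_slower(rate, reference_rate):
    """How much slower `rate` is than `reference_rate`, in percent."""
    return 100.0 * (reference_rate - rate) / reference_rate


if __name__ == "__main__":
    # Hypothetical, noise-free readings generated with mu = 0.09 per hour.
    times = [0, 2, 4, 6, 8]
    ods = [0.1 * math.exp(0.09 * t) for t in times]
    mu = specific_growth_rate(times, ods)
    print(mu, doubling_time(mu), percent_slower(0.74 * mu, mu))
```

Fitting on a log scale weights early and late readings more evenly than fitting raw optical density, which is one reason it is a common choice when the interval is known to be exponential.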

Figure 3.5. Photoheterotrophic growth of A. vinosum WT, ∆fbp, ∆pfp. OD690 was measured every 10 min. Shaded areas around mean values for each strain indicate SEM (N=2).

A. vinosum ∆pfp and ∆fbp knockouts do not affect autotrophic CO2 fixation rates

To study the effects of ∆fbp and ∆pfp mutations on CO2 fixation, which in A. vinosum is primarily carried out by the Calvin cycle (Weissgerber, Sylvester, et al. 2014; Weissgerber, Watanabe, et al. 2014; T. Tang et al. 2017), CO2 fixation rates in A. vinosum WT and knockout strains were measured under autotrophic and heterotrophic conditions using 13C-labeled DIC (Figure 3.8). No significant difference was observed in carbon fixation rates among the WT, ∆fbp, and ∆pfp grown autotrophically. In the heterotrophic medium, 90-99% lower CO2 fixation rates were measured. A. vinosum WT and ∆pfp fixed equivalent amounts of carbon during heterotrophy. ∆fbp and ∆fbp ∆pfp incorporated 13C label at 50% and 90% lower rates than the WT, respectively.

Figure 3.6. Growth of A. vinosum ∆fbp ∆pfp in heterotrophic medium supplemented with sugars. OD690 of the cultures was measured approximately every 12 hours. Error bars around mean values indicate SEM (N=3).

A. vinosum ∆fbp has significantly reduced rates of sulfide consumption

The hypothesized PPi-PFK use in the Calvin cycle and the associated production of high energy phosphate, i.e., PPi, may affect sulfur metabolism of A. vinosum under autotrophic conditions. The effects of fbp and pfp gene loss on sulfide consumption through anaerobic sulfur oxidation were quantified by monitoring sulfide concentration in the cultures. Sulfide consumption rates (nmol S-1 mg protein-1 min-1) were calculated based on changes in sulfide concentration in the bioreactor throughout autotrophic growth of the A. vinosum WT, ∆fbp, and ∆pfp cultures (Appendix 3 Figure A3.2). No significant difference in sulfide consumption rates was observed between A. vinosum WT and ∆pfp. Rates decreased by at least 75% in the bioreactor cultures containing ∆fbp. As cell densities increased over time, sulfide consumption rates decreased, while the general relationship between consumption rates by different strains remained comparable throughout growth.

Figure 3.7. Growth rates of A. vinosum WT, ∆fbp, ∆pfp, and ∆fbp ∆pfp under heterotrophic and autotrophic conditions measured within linear range (100 min for autotrophic and heterotrophic and 36 hours for heterotrophic ∆fbp ∆pfp cultures). For heterotrophic growth, rates before (darker hue) and after (lighter hue) diauxic shift are shown. Error bars around mean values indicate SEM (autotrophic WT N=2, ∆fbp N = 3, ∆pfp N = 3; heterotrophic N=2; heterotrophic ∆fbp ∆pfp N=3). ****p<0.05 by ANOVA with Fisher's Least Significant Difference (LSD) test.

A. vinosum WT and ∆fbp strains have comparable ATP levels

PPi, potentially generated by PPi-PFK acting in the Calvin cycle, may be converted into ATP through a number of potential mechanisms. Here we measured ATP levels in A. vinosum WT, ∆fbp, and ∆pfp at regular intervals throughout autotrophic growth (Figure 3.9). During most of the exponential growth WT and ∆fbp had comparable amounts of ATP (nmol ATP mg protein-1). However, at the same optical densities, A. vinosum ∆pfp exhibited significantly lower ATP levels. In all cultures the amount of ATP per protein decreased as cell densities increased.

Figure 3.8. CO2 fixation rates of A. vinosum WT, ∆fbp, ∆pfp, and ∆fbp ∆pfp under autotrophic and heterotrophic conditions measured during exponential growth. Heterotrophic ∆fbp ∆pfp culture was supplemented with fructose. Error bars indicate SEM (WT N=2, ∆fbp N=3, ∆pfp N=3; replicates=2). ****p<0.05 by ANOVA with LSD test.

Figure 3.9. Amount of ATP throughout growth of A. vinosum WT, ∆fbp, and ∆pfp under autotrophic conditions. Error bars around mean values indicate SEM (WT N=2, ∆fbp N=3, ∆pfp N=3; replicates=3).

Discussion

The Calvin cycle is carried out by a set of enzymes which perform conserved substrate conversion steps enabling CO2 incorporation into biomass (Figure 3.1) (Martin & Schnarrenberger 1997). Our study for the first time experimentally demonstrates a variation of the cycle in which two of the steps, dephosphorylation of FBP and SBP to F6P and S7P, may be catalyzed not by FBPase but by PPi-PFK. The observed physiological changes which accompany this substitution elucidate possible selective advantages which may have led to the evolution of PPi-PFK use in the Calvin cycle of chemoautotrophic symbionts, bacteria in which this departure from the canonical pathway is thought to occur (Newton et al. 2007; Markert et al. 2007; Kleiner et al. 2012; Dmytrenko et al. 2014).

Experimental investigation into the potential role of PPi-PFK in CO2 fixation was carried out in A. vinosum, a genetically tractable facultative photolithoautotrophic bacterium (Pattaragulwanit & Dahl 1995), which is closely related to chemoautotrophic symbionts. fbp and pfp genes were knocked out by replacing their entire protein-coding sequences in frame with selectable antibiotic markers while retaining the endogenous promoters (Figure 3.2). This mutagenesis strategy helped ensure that the observed knockout phenotypes reflect the effects of gene loss and, by extension, absence of either FBPase or PPi-PFK enzyme activity. Deletion of individual genes, especially in the case of the symbiont-like ∆fbp mutation, did not reduce overall viability of A. vinosum. A double knockout, on the other hand, resulted in glucose/fructose auxotrophy, consistent with the predicted functions of fbp and pfp.

The ability of single, but not double, A. vinosum ∆fbp and ∆pfp knockouts to grow autotrophically with CO2 as the sole carbon source suggests that fbp and pfp complement each other: either of them is sufficient, and at least one is essential, for autotrophic growth (Figure 3.4). This observation is in agreement with a parallel study which demonstrated that purified A. vinosum FBPase and PPi-PFK enzymes are capable of catalyzing the same essential reaction in the Calvin cycle, namely the dephosphorylation of FBP to F6P (Dmytrenko et al. 2018). The measured growth rates (Figure 3.7, Appendix 3 Table A3.2) are well within the range of the doubling times (7-10 hours) previously reported for A. vinosum (Weissgerber et al. 2013). Prior to inoculation into the autotrophic environment within a bioreactor, all cultures were grown in a heterotrophic medium until approximately mid-log phase. Following the transfer, A. vinosum WT, ∆fbp, and ∆pfp entered an apparent lag in growth (Figure 3.4), during which changes in gene expression and protein synthesis necessary to accommodate a shift in the growth mode from heterotrophy to autotrophy likely took place. When transferred from heterotrophic to autotrophic medium, A. vinosum is known to upregulate expression of, among others, the Calvin cycle genes encoding RuBisCO, phosphoglycerate kinase, transketolase, and phosphoribulokinase (T. Tang et al. 2017; Weissgerber, Sylvester, et al. 2014; Fuller et al. 1961) as well as downregulate carbon storage regulator A (CsrA) (Weissgerber, Sylvester, et al. 2014), a global posttranscriptional regulator protein known to negatively control activity of gluconeogenic enzymes such as FBPase in E. coli (Revelles et al. 2013; Timmermans & Van Melderen 2009). Following the transfer into the autotrophic environment, transcription and the corresponding protein levels also increase for flavocytochrome c (FccAB), sulfide:quinone oxidoreductases (SqrD and SqrF), and the Dsr system (Weissgerber et al. 2013; Weissgerber, Sylvester, et al. 2014) involved in sulfide oxidation. Duration of the lag, approximately 40 hours, did not differ between the WT and the ∆pfp knockout. In the case of the ∆fbp mutant the lag was significantly longer, lasting approximately 100 hours. This delay in autotrophic growth of ∆fbp could be explained by slow initiation of pfp expression, driven by what is likely a regulated promoter and influenced by an alternative start codon, GTG, associated with lower levels of translational initiation (Kozak 1999). Additionally, after the PPi-PFK protein is synthesized, it may not be fully capable of complementing the FBPase deficiency until the cytoplasmic PPi concentration, usually high in bacteria (Bornefeld 1981; Heinonen & Drake 1988; J. Chen et al. 1990; Heinonen & Heinonen 2001), is sufficiently reduced, for example, through the action of PPases or SAT.

Unlike the single knockouts, the A. vinosum ∆fbp ∆pfp double knockout fails to grow when transferred from the heterotrophic medium supplemented with fructose into a minimal autotrophic environment. The transferred bacterial cells, however, remain metabolically active, as evidenced by a continuous, although very slow, rate of sulfide consumption in the culture throughout incubation (40 days) (Appendix 3 Figure A3.1). This sulfide consumption in the bioreactor cannot be accounted for by abiotic processes and is likely due to bacterial metabolism. In the absence of both fbp and pfp genes A. vinosum thus appears to lose its ability to fix CO2 into biomass and, as a result, to produce organic carbon required for growth. However, it retains the capacity to oxidize sulfide and persist in a vegetative state.

Separate deletion of fbp and pfp in A. vinosum has no major effect on the ability of this bacterium to grow heterotrophically, primarily on acetate and malate (Figure 3.5). Under these conditions no lag in growth occurred for A. vinosum WT, ∆fbp, or ∆pfp when the bioreactor was inoculated with pre-cultures grown in the medium of the same composition. As growth progressed, a distinctive diauxic pattern shared by all three strains became apparent. The diauxic shift, followed by a brief lag phase, occurred around OD690 0.75. After the shift, the growth rates decreased (Figure 3.5, Appendix 3 Table A3.2), signaling that the cells have likely switched from a more preferred carbon substrate to a less preferred one (Monod 1947). The initial faster growth phase was short, suggesting that the more preferred substrate may have been acetate, present in a lower concentration (2 mM) compared to a potentially less favored malate (21 mM), and, therefore, likely producing a shorter burst of growth. Acetate is more reduced than malate (McKinlay & Harwood 2010) and yields a higher standard free energy change in the first step of acetate metabolism catalyzed by acetyl-CoA synthetase (-20 kJ mole-1) (van Rossum et al. 2016) compared to -8 kJ mole-1 for malic enzyme operating in the decarboxylation direction (Kunkee 1967). Once cell densities in the bioreactor reach OD690 of approximately 1.5, A. vinosum ∆fbp starts to show a faster decrease in growth rate than the WT or the ∆pfp knockout, eventually plateauing at a lower optical density. Knowing that A. vinosum ∆fbp is unable to shift from organic to inorganic carbon in the medium as rapidly as the other two strains (Figure 3.4), it is likely that the WT and ∆pfp reach stationary phase at a higher OD690 by more readily consuming the background amount of DIC that may be present in the medium. This observed succession of growth phases suggests that A. vinosum may preferentially first consume acetate, then malate, and finally DIC and that the loss of the FBPase enzyme due to ∆fbp mutation impairs its ability to rapidly adapt to a change from organic to inorganic sources of carbon.

A. vinosum which carries the ∆fbp ∆pfp knockout is a fructose/glucose auxotroph (Figure 3.6). The double mutant completely loses its ability to grow on minimal autotrophic (Figure 3.4) and heterotrophic media (Figure 3.6). Supplementation with fructose or glucose reverses the mutant phenotype and partially restores growth. The glc-/fru- auxotrophy of the double knockout agrees with the predicted metabolic roles of FBPase and PPi-PFK in A. vinosum. Deletion of fbp and pfp appears to deprive the bacterium of its ability to make sugars, required for synthesis of cellular components, from acetate or malate via gluconeogenesis. In the case of CO2 fixation, these mutations would stall out the Calvin cycle by preventing regeneration of the cycle intermediates. To complement the auxotrophy, glucose or fructose must be taken up from the medium through a yet unidentified transferase system, since A. vinosum does not encode any of the known sugar transporters in its genome (Weissgerber et al. 2011). Uptake and utilization of fructose and glucose likely occur through a mechanism specific to the type of sugar molecule, as only two of the five tested saccharides were able to complement the ∆fbp and ∆pfp loss of function mutations. The implicated fructose and glucose transport and metabolism genes do not appear to be constitutively expressed in A. vinosum, instead being potentially induced after discrete periods of time since inoculation. Growth with fructose commences earlier than in glucose-supplemented cultures and reaches higher optical density, suggesting that fructose more readily complements the mutation and may be a more preferred substrate overall. These findings not only provide strong support for the hypothesis regarding the potential role of PPi-PFK in gluconeogenesis and the Calvin cycle, but also supply new insights into metabolism of A. vinosum, which has been previously reported incapable of utilizing either fructose or glucose (Imhoff 2005).

Incorporation of the 13C-label from DIC into the biomass of A. vinosum ∆fbp and ∆pfp testifies to the unimpaired ability of these knockouts to fix CO2 in comparison to the WT (Figure 3.8). Lack of significant differences in the rates measured under autotrophic conditions implies that the enzymatic reactions of the Calvin cycle potentially affected by the two mutations may not be rate-limiting and that FBPase and PPi-PFK, which under substrate saturation and in the absence of inhibition have similar catalytic efficiencies (Dmytrenko et al. 2018), operate in a favorable cellular environment throughout autotrophic growth. Although rates of growth during autotrophy and heterotrophy are not significantly different (Figure 3.7), CO2 assimilation in heterotrophic medium was on average only 10% of the autotrophic level (Figure 3.8), which could be attributed to RuBisCO activity in the Calvin cycle (T. Tang et al. 2017) used to recycle reducing equivalents during growth on reduced carbon substrates (McKinlay & Harwood 2010).

Most of the observed CO2 fixation, however, could be explained primarily by anaplerotic carbon fixation via phosphoenolpyruvate carboxylase, phosphoenolpyruvate carboxykinase, and pyruvate carboxylase (K.-H. Tang et al. 2011; Weissgerber, Sylvester, et al. 2014; Weissgerber, Watanabe, et al. 2014; T. Tang et al. 2017). CO2 fixation in A. vinosum measured under heterotrophic conditions agrees with the data reported for other heterotrophic bacteria such as Roseobacter denitrificans (K.-H. Tang et al. 2009) and eukaryotic algae (Cassar & Laws 2007), but is lower than the previously estimated 29% for A. vinosum (T. Tang et al. 2017) and 25% for the cyanobacterium Synechocystis sp. (Yang et al. 2002). A further reduction of the CO2 fixation rate in the ∆fbp mutant to only 5% of the WT autotrophic level suggests that at least half of carbon fixation under heterotrophic conditions could occur via the Calvin cycle (Figure 3.8). This agrees with an earlier observation that the ∆fbp knockout cannot rapidly switch from heterotrophy to autotrophy (Figure 3.4), which would require ready use of the Calvin cycle, and corroborates a prior supposition that this mutant is unable to fully utilize background CO2 present in heterotrophic medium (Figure 3.5).
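The estimate that at least half of heterotrophic carbon fixation may proceed via the Calvin cycle follows from the two percentages quoted above. As a rough worked calculation, write R for the heterotrophic CO2 fixation rate expressed as a fraction of the WT autotrophic rate (our notation, not used elsewhere in this chapter), and assume that the drop between the WT and the ∆fbp mutant reflects only Calvin cycle flux that depends on FBPase:

```latex
\[
\frac{R_{\mathrm{WT}} - R_{\Delta fbp}}{R_{\mathrm{WT}}}
  = \frac{0.10 - 0.05}{0.10}
  = 0.5
\]
```

Under this assumption the value is a lower bound, since PPi-PFK may still sustain some Calvin cycle flux in the ∆fbp mutant.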

Measurements of sulfide consumption in A. vinosum bioreactor cultures provided a means of estimating the effects of fbp and pfp gene loss on sulfide metabolism during autotrophic growth (Appendix 3 Figure A3.2). Concomitant energy metabolism was evaluated by quantifying ATP throughout the incubations (Figure 3.9). Measured sulfide consumption rates fall within the range previously reported for A. vinosum (Weissgerber et al. 2013; Dahl et al. 2013) and other sulfur bacteria, such as Prosthecochloris aestuarii (Takashima et al. 2000). The highest sulfide consumption rates are observed in the WT strain, together with the highest overall ATP content, which agrees with the published ATP values for A. vinosum (Miović & Gibson 1971; van Gemerden & Beeftink 1978). Deletion of pfp does not significantly decrease sulfide consumption (Appendix 3 Figure A3.2) but leads to a reduction of ATP level (85% of the WT) (Figure 3.9). On the other hand, growth of A. vinosum ∆fbp is characterized by a significantly curtailed consumption of sulfide, down to almost 25% of the WT rates during the log-phase (Appendix 3 Figure A3.2). Despite these pronounced differences in rates of sulfide consumption, the ATP content of the ∆fbp knockout is almost level with the WT values, particularly during exponential growth (Figure 3.9). This high ATP content of A. vinosum ∆fbp could be attributed to potential pyrophosphate hydrolysis, for example, by H+-PPases (Nyrén et al. 1984; Ordaz et al. 1992; Schultz & Baltscheffsky 2003; Serrano et al. 2004) and SAT activity, which makes ATP with PPi and APS in the final step of sulfide oxidation to SO₄²⁻ (Parey et al. 2013). A. vinosum has genetic capacity for at least two sulfite (HSO₃⁻) oxidizing enzymes, SAT and a membrane-bound polysulfide reductase-like iron-sulfur molybdoprotein (SoeABC) (Dahl et al. 2013). SoeABC is the major HSO₃⁻ oxidizing enzyme in this bacterium while SAT is thought to be secondary. We hypothesize that in A. vinosum ∆fbp, PPi, potentially produced by the reverse PPi-PFK activity in the Calvin cycle, may stimulate SAT activity and channel HSO₃⁻ oxidation to SO₄²⁻ through this enzyme instead of SoeABC. ATP produced by SAT in this reaction could be used in the Calvin cycle. During each round of CO2 fixation PPi-PFK may produce two molecules of PPi by dephosphorylating FBP and SBP. If these PPi molecules were to be converted into ATP by SAT, the overall ATP cost of fixing one molecule of CO2 would drop from three to one ATP. For A. vinosum, which produces ATP using a light-driven cyclic electron flow (Brune 1989), this would mean a decline in the overall ATP demand under autotrophic conditions, making the electron transport chain overreduced (Pott & Dahl 1998; Dahl et al. 2005; Frigaard & Dahl 2009). Such conditions are known to exert back pressure on the cyclic electron flow system and lower the rate of sulfur oxidation. This hypothesis agrees with our ATP and sulfide consumption data and is consistent with reverse PPi-PFK activity in the Calvin cycle. Removal of PPi through the action of H+-PPases would have a similar but likely less pronounced effect on sulfide consumption.
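The ATP bookkeeping behind this hypothesis can be written out explicitly. The sketch below is illustrative only: it uses the figures stated in the text (three ATP per CO2 fixed, two PPi released per round by PPi-PFK) and assumes, as the hypothesis does, that SAT converts each PPi into one ATP. The function name and the `atp_per_ppi` parameter are ours and are introduced only to show how partial PPi recycling would change the result.

```python
def net_atp_cost_per_co2(atp_cost=3.0, ppi_per_round=2.0, atp_per_ppi=1.0):
    """Net ATP spent per CO2 fixed when PPi is recycled into ATP.

    atp_cost      -- ATP consumed per CO2 without recycling (3, as stated in the text)
    ppi_per_round -- PPi released by PPi-PFK from FBP and SBP (2, as stated in the text)
    atp_per_ppi   -- assumed ATP yield per PPi via SAT (1.0 = full recycling)
    """
    if not 0.0 <= atp_per_ppi <= 1.0:
        raise ValueError("atp_per_ppi must lie between 0 and 1")
    return atp_cost - ppi_per_round * atp_per_ppi


# Full recycling reproduces the drop from three ATP to one ATP.
print(net_atp_cost_per_co2())                 # 1.0
# No recycling (e.g. PPi hydrolysed without ATP synthesis) leaves the cost at three.
print(net_atp_cost_per_co2(atp_per_ppi=0.0))  # 3.0
```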

By recreating in A. vinosum the Calvin cycle proposed in uncultured chemoautotrophic symbionts, we demonstrated that either fbp or, more notably, pfp alone is sufficient for CO2 fixation and growth, and that at least one of the two is required, confirming the hypothesized ability of PPi-PFK to replace FBPase. A. vinosum ∆fbp, the knockout which can only use PPi-PFK, grows at a reduced rate on CO2 as the sole carbon source. This reduction in growth may be associated with the physiological changes to the cellular milieu necessary to accommodate PPi-PFK use in the Calvin cycle. These changes are likely centered around removal of PPi, which is inhibitory to reverse PPi-PFK activity. A similar rationale could explain why A. vinosum ∆fbp is considerably slower at adapting to changes in culture conditions. On the other hand, this symbiont-like mutant does not exhibit reduced CO2 fixation ability or diminished ATP content, potentially due to ATP synthesis coupled to PPi removal. These observations suggest that while the loss of fbp would be deleterious to free-living generalists, which need to adapt rapidly to a changing environment and must attain the fastest possible growth to outcompete other bacteria for resources, the shift from FBPase to PPi-PFK could be advantageous to specialists, particularly those that live as symbionts in a relatively constant environment, face little competition, and are therefore selected for thermodynamic efficiency over growth rate and adaptability. It is in these bacteria that the PPi-PFK-utilizing variant of the Calvin cycle was initially hypothesized. Thus, these data not only establish that PPi-PFK may replace FBPase during autotrophic growth on CO2 as the only carbon source, but also provide novel insights into the physiological aspects of this evolutionary adaptation, which may have facilitated the origin and maintenance of chemoautotrophic symbiosis.

Materials and methods

Bacterial strains and plasmids

The symbiont-like Calvin cycle was recreated in the purple sulfur bacterium A. vinosum using standard protocols in molecular genetics. The A. vinosum DSM 180T Rif50 spontaneous rifampicin-resistant mutant (Lubbe et al. 2006) and the Escherichia coli S17-1 plasmid donor strain (Simon et al. 1983) were provided by Christiane Dahl (Universität Bonn) (Appendix 3 Table A3.3). Plasmids pCM184, pCM351 (Marx & Lidstrom 2002), and pCM433 (Marx 2008) were obtained from Christopher Marx (University of Idaho).

To delete fbp and pfp in A. vinosum, the genes were replaced in-frame by homologous recombination with antibiotic selection markers aphA and aacC1, respectively (Figure 3.2). To avoid polar effects on downstream genes, which could influence mutant phenotypes, the strong constitutive promoters of the antibiotic resistance cassettes were omitted in favor of the native Pfbp and Ppfp promoters. To achieve expression of aacC1 from Ppfp, the start codon of aacC1 was modified from ATG to GTG. Unlike aphA expressed from Pfbp, the pfp promoter did not initiate expression of aacC1 in E. coli, suggesting an A. vinosum specific mode of transcriptional regulation. Plasmids carrying templates for recombination were introduced into A. vinosum through conjugation with the E. coli S17-1 donor. Successful transconjugants were selected either with kanamycin (∆fbp::aphA) or gentamicin (∆pfp::aacC1). Post conjugation, E. coli was eliminated from plates with rifampicin, to which the A. vinosum acceptor strain was resistant. Double-crossover recombinant knockouts were obtained by additionally supplementing selection medium with sucrose. A. vinosum harboring single crossover products were sucrose-sensitive due to the presence of the sacB gene encoding levansucrase from Bacillus subtilis on the allelic exchange plasmids. The primers and plasmids used to create the knockout mutants are listed in Appendix 3 Tables A3.3 and A3.4. Construction of the plasmids, conjugation, and mutant isolation are detailed in Appendix 3 Supplementary Methods.

Growth conditions

For routine propagation, A. vinosum wild type (WT) and the knockout mutant strains were grown anaerobically in modified liquid heterotrophic RCV medium (pH 7.0) adapted from Weaver et al. (1975), containing 21 mM malate and 2 mM acetate as carbon sources and 0.8 mM sodium thiosulfate as a reducing agent and a source of sulfur. The medium was filter-sterilized, supplemented with 50 µg/ml rifampicin to avoid contamination, and distributed into 9 ml gas-tight glass vials. To obtain single colonies, A. vinosum was plated on RCV medium containing 1% Phytagel (Sigma Aldrich). Prior to inoculation the plates were stored overnight under an oxygen-free atmosphere. The inoculated plates were incubated in GasPak™ BBL™ jars (BD).

To study growth kinetics and the effects of fbp and pfp gene loss, A. vinosum was cultured either under photolithoautotrophic or photoorganoheterotrophic conditions in liquid Pfennig's medium (Imhoff 2006), with DIC as the sole carbon source (18 mM), or RCV medium, respectively. For this purpose the bacteria were grown in a bioreactor built in-house from a 500 ml spinner flask (Bellco Glass) (Figure 3.3). Optical density (OD) of the cultures was continuously monitored at 690 nm by circulating the medium from the main flask through a gas-tight glass cuvette mounted inside a UV-1601 spectrophotometer (Shimadzu). For measuring heterotrophic growth kinetics, the bioreactor was inoculated with liquid mid-log phase RCV pre-cultures started from single colonies. For photoautotrophic growth, the pre-cultures were first collected on 0.45 µm sterile filters and then resuspended in Pfennig's medium to avoid transferring residual organic carbon from RCV medium into the bioreactor. Starting OD690 for each bioreactor growth experiment was approximately 0.07. Bacteria were grown under constant illumination of approximately 42,000 lux (400-700 nm) by placing the cultures between two incandescent light bulbs (60 W), which kept the cultures at 30°C. Throughout photoautotrophic growth, sulfide concentration was maintained between 0.3 and 0.5 mM and pH was kept at 7.0.

For auxotrophic growth experiments RCV medium was supplemented with 1% w/v of either fructose, glucose, sucrose, rhamnose, or glucuronic acid. All growth experiments were repeated two to four times. Significant differences between growth rates of the WT and the knockout mutants were identified using ANOVA with a Fisher's Least Significant Difference (LSD) post hoc test. Media composition, culture conditions, bioreactor setup, antibiotic concentrations, and sampling procedures for ATP and protein determination are detailed in Appendix 3 Supplementary Methods.

CO2 fixation rates

To compare the effects of FBPase and PPi-PFK use on carbon fixation in the Calvin cycle of A. vinosum, CO2 fixation in the bioreactor cultures was determined by measuring the rate at which 13C labeled DIC (Cambridge Isotope Laboratories) was incorporated into biomass. NaH13CO3 was added to autotrophic and heterotrophic cultures during mid-log phase to a final bicarbonate 13C/12C ratio of 0.17. Cultures were sampled (5 ml) in duplicate at regular time intervals for a total duration of 5 hours and collected on 25 mm GF/F glass microfiber filters (Whatman). The filtrate was fumed with HCl for 12 h followed by lyophilization for 24 h at -50°C and pressure below 0.040 mBar in a FreeZone 2.5 freeze dry system (Labconco). Carbon stable isotope composition of the samples was analyzed in stable isotope facilities at Boston University; The Center, Marine Biological Laboratory, Woods Hole; and the Center for Stable Isotopes at the University of New .


For calculating 13C dissolved inorganic carbon (DIC) incorporation rates, the mass balance equation was adapted from Montoya et al. (1996):

\[
A_{PC_f}\,[PC_f] = A_{PC_{control}}\,[PC_{control}] + A_{CO_2}\,[PC_\Delta]
\]

where A equals atom% of particulate carbon (PC; biomass carbon) at the end of incubation (f) and start/natural abundance (control), or of the DIC pool (ACO2); [PCf] equals concentration/amount of PC at end of incubation, [PCcontrol] stands for concentration/amount of PC at start of incubation, and [PCΔ] represents concentration/amount of newly formed PC during incubation, equal to new carbon biomass. To calculate carbon fixation rates (newly formed carbon biomass), the equation was solved for the relative ratio of newly formed biomass as a function of total biomass:

\[
\frac{A_{PC_f} - A_{PC_{control}}}{A_{CO_2} - A_{PC_{control}}} = \frac{[PC_\Delta]}{[PC_f]}
\]

To determine the absolute carbon fixation rate, the equation was solved for [PCΔ]. The reported rates were calculated per min per mg of total protein. Significant differences between CO2 fixation rates of the WT and the knockout mutants were identified using ANOVA with a Fisher's Least Significant Difference (LSD) test.
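The mass balance above translates directly into a short calculation. The following is a minimal sketch, not the script used in this study: the function names are ours, the conversion from a 13C/12C ratio to atom% uses the standard definition atom% = 100 R / (1 + R), and the numbers in the usage example other than the 0.17 ratio are hypothetical.

```python
def atom_percent_from_ratio(ratio_13c_12c):
    """Convert a 13C/12C ratio R into atom% 13C, i.e. 100 * R / (1 + R)."""
    return 100.0 * ratio_13c_12c / (1.0 + ratio_13c_12c)


def new_carbon_fraction(a_pc_final, a_pc_control, a_co2):
    """Fraction of final biomass carbon that is newly fixed, [PC_delta] / [PC_f]."""
    enrichment_available = a_co2 - a_pc_control
    if enrichment_available <= 0:
        raise ValueError("the DIC pool must be enriched relative to the control")
    return (a_pc_final - a_pc_control) / enrichment_available


def fixation_rate(a_pc_final, a_pc_control, a_co2, pc_final, minutes, protein_mg):
    """Newly fixed carbon per minute per mg protein (units follow pc_final)."""
    pc_delta = new_carbon_fraction(a_pc_final, a_pc_control, a_co2) * pc_final
    return pc_delta / (minutes * protein_mg)


# Atom% of the DIC pool for the 13C/12C ratio of 0.17 given in the text.
a_co2 = atom_percent_from_ratio(0.17)
# Hypothetical biomass values: 1.1 atom% at the start, 2.0 atom% after 60 min.
rate = fixation_rate(2.0, 1.1, a_co2, pc_final=500.0, minutes=60.0, protein_mg=0.8)
```

Solving for the fraction first keeps the relative and absolute results separate, mirroring the two steps described in the text.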

Sulfide consumption rates

The effects of fbp and pfp gene loss on sulfur metabolism were monitored by measuring rates of sulfide consumption throughout each autotrophic growth experiment. Sulfide concentration was continuously monitored in the bioreactor using an embedded sulfide electrode (Weiss Research) connected to a Chemcadet mV controller (Cole-Parmer). The mV output from the controller was captured with a Yocto-milliVolt-RX-BNC precision voltmeter (Yoctopuce) and recorded using a Raspberry Pi 3 (Raspberry Pi Foundation) running a custom Python script. The recorded data reflected cyclical changes in sulfide concentration due to bacterial consumption (≥0.3 mM) and intermittent automated supplementation with sulfide (≤0.5 mM).

Sulfide consumption rates were calculated from the mV measurements for each incubation experiment in Python 3.6 using NumPy 1.13.3 and Pandas 0.21.0. Briefly, mV data was converted into nmol of sulfide using a standard curve obtained with Cline spectrophotometric method (Cline 1969). Next, the highest sulfide consumption rate was determined in each consumption/supplementation cycle. The rate values were then adjusted for protein concentration in the bioreactor cultures. Finally, the rates from replicate experiments were averaged and plotted in Matplotlib version 2.1.0 (Hunter 2007).
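The steps just listed can be sketched as follows. This is an illustration under stated assumptions rather than the analysis script itself, which is not reproduced here: it uses only the Python standard library instead of NumPy and Pandas, assumes a linear standard curve (the functional form is not given in the text), detects a new cycle at every upward jump in sulfide, and takes the steepest decline between consecutive samples as the highest rate in a cycle. All function and parameter names are hypothetical.

```python
from statistics import mean


def mv_to_nmol(mv, slope, intercept):
    """Assumed linear standard curve: nmol sulfide = slope * mV + intercept."""
    return slope * mv + intercept


def split_cycles(sulfide, jump=0.0):
    """Return lists of sample indices, one list per consumption cycle.

    A new cycle starts whenever sulfide rises by more than `jump`
    relative to the previous sample, i.e. when sulfide was resupplied.
    """
    cycles, current = [], [0]
    for i in range(1, len(sulfide)):
        if sulfide[i] - sulfide[i - 1] > jump:
            cycles.append(current)
            current = [i]
        else:
            current.append(i)
    cycles.append(current)
    # A rate needs at least two samples.
    return [c for c in cycles if len(c) >= 2]


def max_rate_per_cycle(times_min, sulfide_nmol, protein_mg, jump=0.0):
    """Highest consumption rate in each cycle, in nmol per min per mg protein."""
    rates = []
    for cycle in split_cycles(sulfide_nmol, jump):
        steepest = max(
            (sulfide_nmol[a] - sulfide_nmol[b]) / (times_min[b] - times_min[a])
            for a, b in zip(cycle, cycle[1:])
        )
        rates.append(steepest / protein_mg)
    return rates


def average_replicates(replicate_rates):
    """Mean of the per-cycle rates pooled from replicate experiments."""
    return mean(r for rates in replicate_rates for r in rates)


# Two hypothetical cycles sampled once per minute, 2 mg protein in the culture.
trace = [500, 450, 380, 500, 470, 410]
print(max_rate_per_cycle([0, 1, 2, 3, 4, 5], trace, protein_mg=2.0))  # [35.0, 30.0]
```

With noisy electrode data the `jump` threshold would need to be set above the noise level, or the trace smoothed first, so that small fluctuations are not mistaken for sulfide additions.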

Quantification of ATP

ATP content of the autotrophic bioreactor A. vinosum cultures was quantified throughout growth between OD690 of 0.5 and 2.5 using BacTiter-Glo Microbial Cell Viability Assay (Promega). First, culture samples stored at -80°C were thawed for approximately 10 min on ice. Next, aliquots (25 µl) were transferred in duplicate to an opaque-walled 96 well plate (Greiner Bio-One) and combined with 25 µl of the ATP reagent. To remove bubbles, the plates were centrifuged at 1,000 x g for 1 min. Luminescence in the samples was recorded using a Tecan Infinite m200 spectrophotometer under automatic attenuation, integration time 1,000 ms, and settle time 150 ms. The amount of ATP was determined with a standard curve. At least 3 experimental replicates were analyzed per sampled OD point.
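Reading ATP off a standard curve amounts to fitting a calibration line to the standards and inverting it for each sample. The sketch below assumes a linear relation between luminescence and ATP amount over the range of the standards; the text does not specify the form of the curve, and the function names and example values are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def atp_from_luminescence(rlu, slope, intercept):
    """Invert the calibration line to get the ATP amount for a reading."""
    return (rlu - intercept) / slope


# Hypothetical standards: ATP amount (arbitrary units) vs. luminescence (RLU).
standards_atp = [0.0, 1.0, 2.0, 3.0]
standards_rlu = [10.0, 110.0, 210.0, 310.0]
slope, intercept = fit_line(standards_atp, standards_rlu)

# Average duplicate wells before converting, as samples were plated in duplicate.
duplicate_wells = [150.0, 170.0]
atp = atp_from_luminescence(sum(duplicate_wells) / len(duplicate_wells), slope, intercept)
print(atp)  # 1.5
```

The same fit-and-invert pattern would apply to any assay read against a linear standard curve.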

Protein determination

Total protein concentration of A. vinosum cultures between OD690 0.5 and 2.5 was measured using a Coomassie Dye based assay for use in rate calculations (CO2 fixation and sulfide consumption) and quantification of ATP content. To release cellular proteins, the bacteria were disrupted via probe sonication (Sonifier 250, Branson). Previously collected frozen A. vinosum cell pellets were thawed on ice and resuspended in cold 50 mM Tris-HCl buffer (pH 7.5) containing 200 mM NaCl. These bacterial suspensions were subjected to six rounds of sonication lasting 15 sec each at output level 3. During sonication sample tubes were kept on ice-salt slurry (8:1 w/w) and transferred to ice between the treatments for approximately 3 min. Cell lysis was monitored microscopically. Protein concentrations were determined for three samples per time point in four technical replicates using CB-Protein Assay with bovine serum albumin (BSA) as a standard (G-Biosciences). Absorbance at 595 nm was quantified with a Tecan Infinite m200 spectrophotometer.

Acknowledgements

This work was possible due to generous financial support from the Department of Organismic and Evolutionary Biology, Harvard University. We are indebted to Christiane Dahl, Christopher Marx, Michael Madigan, Dipti Nayak, and Anna Wang for helpful suggestions and assistance with obtaining bacterial cultures and plasmids.

References

Anon, 1989. Product inhibition of potato-tuber pyrophosphate:fructose-6-phosphate phosphotransferase by phosphate and pyrophosphate. Plant Physiology, 89(2), pp.628–633.

Anon, 1996. Purification and characterization of pyrophosphate-dependent phosphofructokinase from phosphate-starved Brassica nigra suspension cells. Plant Physiology, 112(1), pp.343–351.

Bapteste, E., Moreira, D. & Philippe, H., 2003. Rampant horizontal gene transfer and phospho-donor change in the evolution of the phosphofructokinase. Gene, 318, pp.185–191.

Bar-Even, A., Noor, E. & Milo, R., 2012. A survey of carbon fixation pathways through a quantitative lens. Journal of Experimental Botany, 63(6), pp.2325–2342.

Berg, I.A., 2011. Ecological aspects of the distribution of different autotrophic CO2 fixation pathways. Applied and Environmental Microbiology, 77(6), pp.1925–1936.


Bornefeld, T., 1981. Is light-dependent formation of inorganic pyrophosphate in Anacystis a photosynthetic process? Archives of Microbiology, 129(5), pp.371–373.

Brune, D.C., 1989. Sulfur oxidation by phototrophic bacteria. Biochimica et Biophysica Acta, 975(2), pp.189–221.

Carnal, N.W. & Black, C.C., 1979. Pyrophosphate-dependent 6-phosphofructokinase, a new glycolytic enzyme in pineapple leaves. Biochemical and Biophysical Research Communications, 86(1), pp.20–26.

Cassar, N. & Laws, E.A., 2007. Potential contribution of β-carboxylases to photosynthetic carbon isotope fractionation in a marine diatom. Phycologia, 46(3), pp.307–314.

Cavanaugh, C.M., 1983. Symbiotic chemoautotrophic bacteria in marine invertebrates from sulphide-rich habitats. Nature, 302, pp.58–61.

Cavanaugh, C.M. et al., 2013. Marine chemosynthetic symbioses. In The Prokaryotes. Berlin Heidelberg: Springer Berlin Heidelberg, pp. 579–607.

Chen, C., Rabourdin, B. & Hammen, C., 1987. The effect of hydrogen sulfide on the metabolism of Solemya velum and enzymes of sulfide oxidation in gill tissue. Comparative Biochemistry and Physiology Part B: Biochemistry and Molecular Biology, 88(3), pp.949–952.

Chen, J. et al., 1990. Pyrophosphatase is essential for growth of Escherichia coli. Journal of Bacteriology, 172(10), pp.5686–5689.

Cline, J., 1969. Spectrophotometric determination of hydrogen sulfide in natural waters. Limnology and Oceanography, 14(3), pp.454–458.

Dahl, C. et al., 2005. Novel genes of the dsr gene cluster and evidence for close interaction of Dsr proteins during sulfur oxidation in the phototrophic sulfur bacterium Allochromatium vinosum. Journal of Bacteriology, 187(4), pp.1392–1404.

Dahl, C. et al., 2013. Sulfite oxidation in the purple sulfur bacterium Allochromatium vinosum: identification of SoeABC as a major player and relevance of SoxYZ in the process. Microbiology, 159(Pt 12), pp.2626–2638.

Dmytrenko, O. et al., 2014. The genome of the intracellular bacterium of the coastal bivalve, Solemya velum: a blueprint for thriving in and out of symbiosis. BMC Genomics, 15(924), pp.1–20.

Dmytrenko, O. et al., 2018. The “missing enzyme” in the enigmatic Calvin cycle of chemosynthetic symbionts. In preparation.

Dubilier, N., Bergin, C. & Lott, C., 2008. Symbiotic diversity in marine animals: the art of harnessing chemosynthesis. Nature Reviews Microbiology, 6(10), pp.725–740.

Erb, T.J. & Zarzycki, J., 2018. A short history of RubisCO: the rise and fall (?) of Nature's predominant CO2 fixing enzyme. Current Opinion in Biotechnology, 49, pp.100–107.


Felbeck, H., Childress, J.J. & Somero, G.N., 1981. Calvin-Benson cycle and sulphide oxidation enzymes in animals from sulphide-rich habitats. Nature, 293(5830), pp.291–293.

Fisher, C. & Childress, J., 1992. Organic carbon transfer from methanotrophic symbionts to the host hydrocarbon-seep mussel. Symbiosis, 12(3), pp.221–235.

Fisher, C. et al., 1993. The co-occurrence of methanotrophic and chemoautotrophic sulfur-oxidizing bacterial symbionts in a deep-sea mussel. Marine Ecology, 14(4), pp.277–289.

Frigaard, N.-U. & Dahl, C., 2009. Sulfur metabolism in phototrophic sulfur bacteria. Advances in Microbial Physiology, Vol 51, 54, pp.103–200.

Fuller, R.C. et al., 1961. Carbon metabolism in Chromatium. The Journal of Biological Chemistry, 236(7), pp.2140–2149.

Gerbling, K.P., Steup, M. & Latzko, E., 1986. Fructose 1,6-bisphosphatase form B from Synechococcus leopoliensis hydrolyzes both fructose and sedoheptulose bisphosphate. Plant Physiology, 80(3), pp.716–720.

Heinonen, J.K. & Drake, H.L., 1988. Comparative assessment of inorganic pyrophosphate and pyrophosphatase levels of Escherichia coli, Clostridium pasteurianum, and Clostridium thermoaceticum. FEMS Microbiology Letters, 52(3), pp.205–208.

Heinonen, J.K. & Heinonen, J., 2001. Biological role of inorganic pyrophosphate, Norwell, MA: Kluwer Academic Publishers.

Hoelzle, K. et al., 2010. Inorganic pyrophosphatase in uncultivable hemotrophic mycoplasmas: identification and properties of the enzyme from Mycoplasma suis. BMC Microbiology, 10(194), pp.1–8.

Hourdez, S. & Weber, R.E., 2005. Molecular and functional adaptations in deep-sea hemoglobins. Journal of Inorganic Biochemistry, 99(1), pp.130–141.

Hunter, J.D., 2007. Matplotlib: A 2D Graphics Environment. Computing in Science & Engineering, 9(3), pp.90–95.

Imhoff, J.F., 2005. Family I. Chromatiaceae Bavendamm 1924, 125AL emend. Imhoff 1984b, 339. In Bergey's manual of systematic bacteriology. New York, NY: Springer, pp. 3–40.

Imhoff, J.F., 2006. The Chromatiaceae. In M. Dworkin et al., eds. The Prokaryotes. Berlin Heidelberg: Springer New York, pp. 846–873.

Jeon, S.-J. & Ishikawa, K., 2005. Characterization of the Family I inorganic pyrophosphatase from Pyrococcus horikoshii OT3. Archaea, 1(6), pp.385–389.

Josse, J., 1966. Constitutive inorganic pyrophosphatase of Escherichia coli. The Journal of Biological Chemistry, 241(9), pp.1938–1947.


Kleiner, M. et al., 2012. Metaproteomics of a gutless marine worm and its symbiotic microbial community reveal unusual pathways for carbon and energy use. Proceedings of the National Academy of Sciences of the United States of America, 109(19), pp.1173–1182.

Klemme, J.H. & Gest, H., 1971. Regulatory properties of an inorganic pyrophosphatase from the photosynthetic bacterium Rhodospirillum rubrum. Proceedings of the National Academy of Sciences of the United States of America, 68(4), pp.721–725.

Kozak, M., 1999. Initiation of translation in prokaryotes and eukaryotes. Gene, 234(2), pp.187–208.

Kunkee, R.E., 1967. Malo-lactic fermentation. Advances in applied microbiology, 9, pp.235–279.

Laue, B.E. & Nelson, D.C., 1994. Characterization of the gene encoding the autotrophic ATP sulfurylase from the bacterial endosymbiont of the hydrothermal vent tubeworm Riftia pachyptila. Journal of Bacteriology, 176(12), pp.3723–3729.

Lubbe, Y.J. et al., 2006. Siro(haem)amide in Allochromatium vinosum and relevance of DsrL and DsrN, a homolog of cobyrinic acid a,c-diamide synthase, for sulphur oxidation. FEMS Microbiology Letters, 261(2), pp.194–202.

Lutz, R.A. et al., 1994. Rapid growth at deep-sea vents. Nature, 371(6499), pp.663–664.

Markert, S. et al., 2007. Physiological proteomics of the uncultured endosymbiont of Riftia pachyptila. Science, 315(5809), pp.247–250.

Markert, S. et al., 2011. Status quo in physiological proteomics of the uncultured Riftia pachyptila endosymbiont. Proteomics, 11(15), pp.3106–3117.

Martin, W. & Schnarrenberger, C., 1997. The evolution of the Calvin cycle from prokaryotic to eukaryotic chromosomes: a case study of functional redundancy in ancient pathways through endosymbiosis. Current Genetics, 32(1), pp.1–18.

Marx, C.J., 2008. Development of a broad-host-range sacB-based vector for unmarked allelic exchange. BMC Research Notes, 1(1), pp.1–8.

Marx, C.J. & Lidstrom, M.E., 2002. Broad-host-range cre-lox system for antibiotic marker recycling in gram-negative bacteria. Biotechniques, 33(5), pp.1062–1067.

McKinlay, J.B. & Harwood, C.S., 2010. Carbon dioxide fixation as a central redox cofactor recycling mechanism in bacteria. Proceedings of the National Academy of Sciences of the United States of America, 107(26), pp.11669–11675.

Mertens, E., 1991. Pyrophosphate-dependent phosphofructokinase, an anaerobic glycolytic enzyme? Febs Letters, 285(1), pp.1–5.

Miović, M.L. & Gibson, J., 1971. Nucleotide pools in growing Chromatium strain D. Journal of Bacteriology, 108(2), pp.954–956.


Monod, J., 1947. The phenomenon of enzymatic adaptation and its bearing on problems of genetics and cellular differentiation. Growth, 11(4), pp.223–289.

Montoya, J.P. et al., 1996. A simple, high-precision, high-sensitivity tracer assay for N2 fixation. Applied and Environmental Microbiology, 62(3), pp.986–993.

Newton, I. et al., 2007. The Calyptogena magnifica chemoautotrophic symbiont genome. Science, 315(5814), pp.998–1000.

Nyrén, P., Hajnal, K. & Baltscheffsky, M., 1984. Purification of the membrane-bound proton-translocating inorganic pyrophosphatase from Rhodospirillum rubrum. Biochimica et Biophysica Acta, 766(3), pp.630–635.

O'Brien, W.E., Bowien, S. & Wood, H.G., 1975. Isolation and characterization of a pyrophosphate-dependent phosphofructokinase from Propionibacterium shermanii. The Journal of Biological Chemistry, 250(22), pp.8690–8695.

Ordaz, H. et al., 1992. Thermostability and activation by divalent-cations of the membrane-bound inorganic pyrophosphatase of Rhodospirillum rubrum. International Journal of Biochemistry, 24(10), pp.1633–1638.

Pace, N.R., 2009. Mapping the tree of life: progress and prospects. Microbiology and molecular biology reviews : MMBR, 73(4), pp.565–576.

Parey, K. et al., 2013. Structural, biochemical and genetic characterization of dissimilatory ATP sulfurylase from Allochromatium vinosum. PLoS ONE, 8(9), pp.1–9.

Pattaragulwanit, K. & Dahl, C., 1995. Development of a genetic system for a purple sulfur bacterium: conjugative plasmid transfer in Chromatium vinosum. Archives of Microbiology, 164(3), pp.217–222.

Polz, M. et al., 2000. When bacteria hitch a ride. ASM News, 66(9), pp.531–539.

Pott, A.S. & Dahl, C., 1998. Sirohaem sulfite reductase and other proteins encoded by genes at the dsr locus of Chromatium vinosum are involved in the oxidation of intracellular sulfur. Microbiology, 144(Pt 7), pp.1881–1894.

Raven, J.A., 2009. Contributions of anoxygenic and oxygenic phototrophy and chemolithotrophy to carbon and oxygen fluxes in aquatic environments. Aquatic Microbial Ecology, 56(2-3), pp.177–192.

Reeves, R.E. et al., 1974. Pyrophosphate:D-fructose 6-phosphate 1-phosphotransferase. A new enzyme with the glycolytic function of 6-phosphofructokinase. The Journal of Biological Chemistry, 249(24), pp.7737–7741.

Reshetnikov, A.S. et al., 2008. Characterization of the pyrophosphate-dependent 6-phosphofructokinase from Methylococcus capsulatus Bath. FEMS Microbiology Letters, 288(2), pp.202–210.


Revelles, O. et al., 2013. The carbon storage regulator (Csr) system exerts a nutrient-specific control over central metabolism in Escherichia coli strain Nissle 1917. PLoS ONE, 8(6), pp.1–12.

Robertson, C.E. et al., 2013. Culture-independent analysis of aerosol microbiology in a metropolitan subway system. Applied and Environmental Microbiology, 79(11), pp.3485–3493.

Robinson, J., Stein, J. & Cavanaugh, C.M., 1998. Cloning and sequencing of a form II ribulose-1,5-bisphosphate carboxylase/oxygenase from the bacterial symbiont of the hydrothermal vent tubeworm Riftia pachyptila. Journal of Bacteriology, 180(6), p.1596.

Sander, J., Engels-Schwarzlose, S. & Dahl, C., 2006. Importance of the DsrMKJOP complex for sulfur oxidation in Allochromatium vinosum and phylogenetic analysis of related complexes in other prokaryotes. Archives of Microbiology, 186(5), pp.357–366.

Sato, T. & Atomi, H., 2010. Microbial inorganic carbon fixation. Wiley Online Library, pp.1–12.

Schultz, A. & Baltscheffsky, M., 2003. Properties of mutated Rhodospirillum rubrum H+-pyrophosphatase expressed in Escherichia coli. Biochimica et Biophysica Acta (BBA) - Bioenergetics, 1607(2-3), pp.141–151.

Schwander, T. et al., 2016. A synthetic pathway for the fixation of carbon dioxide in vitro. Science, 354(6314), pp.900–904.

Scott, K.M. & Cavanaugh, C.M., 2007. CO2 uptake and fixation by endosymbiotic chemoautotrophs from the bivalve Solemya velum. Applied and Environmental Microbiology, 73(4), pp.1174–1179.

Serrano, A. et al., 2004. Proton-pumping inorganic pyrophosphatases in some archaea and other extremophilic prokaryotes. Journal of Bioenergetics and Biomembranes, 36(1), pp.127–133.

Siebers, B., Klenk, H. & Hensel, R., 1998. PPi-dependent phosphofructokinase from Thermoproteus tenax, an archaeal descendant of an ancient line in phosphofructokinase evolution. Journal of Bacteriology, 180(8), pp.2137–2143.

Simon, R., Priefer, U. & Puhler, A., 1983. A broad host range mobilization system for in vivo genetic engineering: transposon mutagenesis in gram negative bacteria. Nature Biotechnology, 1, pp.784–791.

Singer, S.J. et al., 1952. The proteins of green leaves. IV. A high molecular weight protein comprising a large part of the cytoplasmic proteins. The Journal of Biological Chemistry, 197(1), pp.233–239.

Stockdreher, Y. et al., 2014. New proteins involved in sulfur trafficking in the cytoplasm of Allochromatium vinosum. Journal of Biological Chemistry, 289(18), pp.12390–12403.


Takashima, T., Nishiki, T. & Konishi, Y., 2000. Anaerobic oxidation of dissolved hydrogen sulfide in continuous culture of the phototrophic bacterium Prosthecochloris aestuarii. Journal Of Bioscience And Bioengineering, 89(3), pp.247–251.

Tang, K.-H. et al., 2009. Carbohydrate metabolism and carbon fixation in Roseobacter denitrificans OCh114. PLoS ONE, 4(10), pp.1–12.

Tang, K.-H., Tang, Y.J. & Blankenship, R.E., 2011. Carbon metabolic pathways in phototrophic bacteria and their broader evolutionary implications. Frontiers in Microbiology, 2(165), pp.1–23.

Tang, T. et al., 2017. Geochemically distinct carbon isotope distributions in Allochromatium vinosum DSM 180T grown photoautotrophically and photoheterotrophically. Geobiology, 15(2), pp.324–339.

Teich, R. et al., 2007. Origin and distribution of Calvin cycle fructose and sedoheptulose bisphosphatases in plantae and complex algae: A single secondary origin of complex red plastids and subsequent propagation via tertiary endosymbioses. Protist, 158(3), pp.263–276.

Timmermans, J. & Van Melderen, L., 2009. Conditional essentiality of the csrA gene in Escherichia coli. Journal of Bacteriology, 191(5), pp.1722–1724.

van Alebeek, G. & Keltjens, J.T., 1994. Purification and characterization of inorganic pyrophosphatase from Methanobacterium thermoautotrophicum (strain ΔH). Biochimica et Biophysica Acta, 1206(2), pp.231–239.

van Gemerden, H. & Beeftink, H.H., 1978. Specific rates of substrate oxidation and product formation in autotrophically growing Chromatium vinosum cultures. Archives of Microbiology, 119(2), pp.135–143.

van Rossum, H.M. et al., 2016. Engineering cytosolic acetyl-coenzyme A supply in Saccharomyces cerevisiae: Pathway stoichiometry, free-energy conservation and redox-cofactor balancing. Metabolic engineering, 36, pp.99–115.

Weaver, P.F., Wall, J.D. & Gest, H., 1975. Characterization of Rhodopseudomonas capsulata. Archives of Microbiology, 105(3), pp.207–216.

Weissgerber, T. et al., 2011. Complete genome sequence of Allochromatium vinosum DSM 180(T). Standards in Genomic Sciences, 5(3), pp.311–330.

Weissgerber, T. et al., 2013. Genome-wide transcriptional profiling of the purple sulfur bacterium Allochromatium vinosum DSM 180T during growth on different reduced sulfur compounds. Journal of Bacteriology, 195(18), pp.4231–4245.

Weissgerber, T., Sylvester, M., et al., 2014. A comparative quantitative proteomic study identifies new proteins relevant for sulfur oxidation in the purple sulfur bacterium Allochromatium vinosum. Applied and Environmental Microbiology, 80(7), pp.2279–2292.


Weissgerber, T., Watanabe, M., et al., 2014. Metabolomic profiling of the purple sulfur bacterium Allochromatium vinosum during growth on different reduced sulfur compounds and malate. Metabolomics, 10(6), pp.0–19.

Wilson, A.T. & Calvin, M., 1955. The photosynthetic cycle - CO2 dependent transients. Journal of the American Chemical Society, 77(22), pp.5948–5957.

Yang, C., Hua, Q. & Shimizu, K., 2002. Metabolic flux analysis in Synechocystis using isotope distribution from 13C-labeled glucose. Metabolic engineering, 4(3), pp.202–216.

Yoo, J.-G. & Bowien, B., 1995. Analysis of the cbbF genes from Alcaligenes eutrophus that encode fructose-1,6-/sedoheptulose-1,7-bisphosphatase. Current Microbiology, 31(1), pp.55–61.


CONCLUSION

Symbioses between eukaryotes and bacteria are subject to strong evolutionary forces which favor their maintenance and benefit the individual partners. In my thesis I investigated a potential adaptation to a symbiotic lifestyle which appears to have occurred in all chemoautotrophic gammaproteobacterial symbionts of marine invertebrates. These phylogenetically disparate bacteria form symbioses with diverse eukaryotic hosts (Dubilier et al. 2008; Cavanaugh et al. 2013). Chemoautotrophic symbionts range widely in their genetic repertoire and genome size. For example, some of the smallest known genomes among autotrophic bacteria belong to the symbionts of the deep-sea clams Calyptogena okutanii (1.0 Mb) (Kuwahara et al. 2007) and Calyptogena magnifica (1.2 Mb) (Newton et al. 2007). In contrast, the genomes of the Solemya velum symbiont (2.7 Mb) (Dmytrenko et al. 2014) and the Riftia pachyptila symbiont (3.5 Mb) (Robidart et al. 2008; Gardebrecht et al. 2012) are similar in size to those of free-living bacteria, for instance, Thiomicrospira crunogena (2.4 Mb) (Scott et al. 2006) and Thiobacillus denitrificans (2.9 Mb) (Beller et al. 2006). However, regardless of the differences among chemoautotrophic symbionts, all those belonging to the class Gammaproteobacteria appear to lack one gene, namely fbp, encoding fructose 1,6-bisphosphatase (FBPase). In bacteria this enzyme performs two essential reactions in the Calvin cycle, dephosphorylating fructose 1,6-bisphosphate (FBP) and sedoheptulose 1,7-bisphosphate (SBP) to fructose 6-phosphate (F6P) and sedoheptulose 7-phosphate (S7P), respectively (Gerbling et al. 1986; Yoo & Bowien 1995). The presence of fbp in the genomes of almost all of their closest non-symbiotic relatives, demonstrated in Chapter 2, suggests a strong association between the lack of fbp and a symbiotic lifestyle.

Despite an apparent lack of FBPase, these symbionts are able to fix CO2 with RuBisCO (Felbeck et al. 1981; Cavanaugh 1983; Robinson et al. 1998; Singer et al. 1952; Erb & Zarzycki 2018). It is possible that the missing enzyme is supplied by the host. Such an adaptation, however, is unlikely to have occurred independently in phylogenetically diverse lineages of chemoautotrophic symbionts and in hosts as different as siboglinid tubeworms (Markert et al. 2007) and coastal protobranch bivalves (Dmytrenko et al. 2014). A more parsimonious hypothesis, which I have investigated in my thesis, proposes that FBPase activity may be substituted by catalysis performed by a pyrophosphate-dependent phosphofructokinase (PPi-PFK).

Having confirmed the absence of fbp in the genome of the S. velum symbiont in Chapter 1 of my thesis, I investigated expression of the PPi-PFK-encoding gene, pfp, in the symbiont as part of Chapter 2. In bacteria, PPi-PFK is thought to be a glycolytic enzyme operating primarily in the forward direction (Mertens 1991; Frese et al. 2014). Transcriptional analysis of the S. velum symbiont revealed low expression levels across genes encoding glycolytic enzymes, such as pyruvate kinase, phosphoglycerate mutase, and enolase. In contrast, transcript levels of pfp were high and comparable to those of the Calvin cycle genes, including those encoding glyceraldehyde 3-phosphate dehydrogenase and phosphoribulokinase. Expression of pfp correlated with high PPi-PFK activity in the symbiont-containing gill tissue. Furthermore, the recombinant PPi-PFK was unique among known bacterial PPi-PFKs in having higher specificity for the reverse over the forward reaction and higher catalytic efficiency than a number of bacterial FBPases. Taken together, these results suggest that in the symbionts PPi-PFK may not operate primarily in glycolysis. Such high transcriptional and enzymatic activities, and the higher propensity for the reverse reaction, are more in line with the hypothesized role of PPi-PFK in the Calvin cycle. Further, pyrophosphate (PPi), produced by PPi-PFK during dephosphorylation of FBP to F6P, inhibited the enzyme. Bacteria on average have a high cellular PPi content, ranging from 0.5 to 1.5 mM (Heinonen & Drake 1988; Bornefeld 1981; Chen et al. 1990). At 1 mM PPi, PPi-PFK activity is reduced by more than 75% and the forward reaction may be favored. Thus, reverse PPi-PFK activity in the symbionts depends on PPi removal. Additionally, hydrolysis of PPi increases the equilibrium constant (K') of the reverse reaction by 10³- to 10⁴-fold (Heinonen 2001). PPi can be consumed by a number of enzymes encoded in the genome of the S. velum symbiont, for example, ATP sulfurylase (SAT), inorganic pyrophosphatase (PPase), sodium-translocating PPase (Na+-PPase), or proton-pumping PPase (H+-PPase) (Dmytrenko et al. 2014). In the case of inorganic PPase, PPi is broken down to phosphate without coupling the hydrolysis to any other reaction (van Alebeek & Keltjens 1994). If PPi is consumed by H+/Na+-PPases, an electrochemical gradient is generated by transferring one H+/Na+ per PPi into the periplasm (Serrano et al. 2007). Approximately 10 H+/Na+ can drive the synthesis of 3 molecules of ATP by ATP synthase (Hinkle 2005). In contrast, SAT activity produces one ATP per PPi (Parey et al. 2013). Among the possibilities above, the latter may be the most favorable mechanism of PPi removal in the symbionts, as shown below.

$$\mathrm{P_i + FBP \xrightarrow{\ \text{PPi-PFK}\ } F6P + PP_i}$$

$$\mathrm{PP_i + APS \xrightarrow{\ \text{SAT}\ } ATP + SO_4^{2-}}$$
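
The PPi-removal routes discussed above can be compared with a short calculation. The following Python sketch is illustrative only: it takes the stoichiometries quoted in the text at face value (one H+/Na+ translocated per PPi, roughly 10 H+/Na+ per 3 ATP, one ATP per PPi for SAT), and it converts the 10³- to 10⁴-fold change in K' into a free-energy difference using the standard relation ΔΔG°' = −RT ln(fold change). The temperature of 25 °C and all function and variable names are assumptions made for this example and are not taken from the thesis.

```python
import math

# Stoichiometries quoted in the text (treated as exact for this sketch).
IONS_PER_PPI = 1.0       # H+/Na+ translocated per PPi by H+/Na+-PPase
IONS_PER_3_ATP = 10.0    # ~10 H+/Na+ drive the synthesis of 3 ATP
R = 8.314                # gas constant, J mol^-1 K^-1
T = 298.15               # assumed temperature (25 degrees C), in K


def atp_per_ppi():
    """Return ATP recovered per PPi consumed for each removal route."""
    return {
        "soluble PPase": 0.0,  # hydrolysis is not coupled to any reaction
        "H+/Na+-PPase": IONS_PER_PPI * 3.0 / IONS_PER_3_ATP,
        "SAT": 1.0,            # one ATP per PPi
    }


def delta_g_shift_kj(fold_change):
    """Free-energy change (kJ/mol) equivalent to a fold change in K'."""
    return -R * T * math.log(fold_change) / 1000.0


if __name__ == "__main__":
    for route, atp in atp_per_ppi().items():
        print(f"{route}: {atp:.1f} ATP per PPi")
    for fold in (1e3, 1e4):
        print(f"{fold:.0e}-fold in K' ~ {delta_g_shift_kj(fold):.1f} kJ/mol")
```

Under these assumptions SAT recovers roughly three times more ATP per PPi than a membrane-bound PPase, and the quoted shift in K' corresponds to roughly 17 to 23 kJ/mol, which is consistent with the argument that PPi removal is what makes the reverse reaction favorable.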

Among the S. velum symbiont genes known to encode PPi-consuming enzymes, sat is the most highly transcribed. Furthermore, high SAT activity and protein levels have previously been reported in symbiont-containing tissues of numerous invertebrate hosts, including S. velum (Felbeck et al. 1981; Felbeck 1981; Fisher & Hand 1984; Chen et al. 1987; Polz et al. 1992; Fiala-Medioni et al. 2002; Markert et al. 2007; Kleiner et al. 2012). This suggests that PPi may couple dephosphorylation of FBP and sedoheptulose 1,7-bisphosphate by PPi-PFK in the Calvin cycle to sulfide oxidation. This coupling would drive reverse PPi-PFK activity by preventing product inhibition and increasing the equilibrium constant of the reverse reaction. The resulting PPi could be consumed by SAT, driving sulfide oxidation to completion and generating two molecules of ATP per round of the Calvin cycle. This would reduce the energetic cost of carbon fixation from three ATP molecules to one per CO2 molecule fixed.
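
The energetic bookkeeping in the paragraph above can be written out explicitly. This minimal sketch only restates the text's figures: a cost of three ATP per CO2 fixed, and two ATP recovered through SAT. It assumes, as an interpretation of the text, that the two ATP correspond to one PPi released at each of the two PPi-PFK steps (FBP and SBP dephosphorylation) and that both figures refer to the same unit of the cycle. The names used are illustrative.

```python
# Figures quoted in the text; names are hypothetical, for illustration only.
ATP_COST_PER_CO2 = 3                       # conventional Calvin cycle cost
PPI_RELEASING_STEPS = ("FBP -> F6P", "SBP -> S7P")  # both via PPi-PFK
ATP_PER_PPI_VIA_SAT = 1                    # SAT yields one ATP per PPi


def net_atp_cost(cost=ATP_COST_PER_CO2,
                 steps=PPI_RELEASING_STEPS,
                 atp_per_ppi=ATP_PER_PPI_VIA_SAT):
    """Net ATP cost after crediting ATP regenerated from PPi by SAT."""
    recovered = len(steps) * atp_per_ppi
    return cost - recovered


if __name__ == "__main__":
    print(f"net ATP cost per CO2 fixed: {net_atp_cost()}")
```

Setting `atp_per_ppi` to 0 reproduces the conventional cost, which corresponds to the case where PPi is hydrolyzed by a soluble PPase without energy conservation.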

Coupling between the PPi-PFK reverse reaction and SAT activity, in the direction of ATP synthesis and sulfate production, may explain why the hypothesized shift from FBPase to PPi-PFK is specific to chemoautotrophic sulfur-oxidizing symbionts. Such an adaptation has not occurred, for example, in photosynthetic symbionts or plastids, which have evolved from cyanobacteria. Organelles and photosynthetic symbiotic bacteria rely on FBPase and sedoheptulose 1,7-bisphosphatase (SBPase) in their Calvin cycle (Martin & Schnarrenberger 1997). Unlike chemoautotrophic symbionts, they do not obtain energy from sulfide oxidation and thus may not have an efficient energy-generating mechanism to co-opt for the PPi removal required for PPi-PFK activity in the Calvin cycle. However, PPi-PFKs are widely distributed among plants, including Zea mays (Mertens 1991), and their role in CO2 fixation remains enigmatic.

In Chapter 3 of my thesis I investigated the ability of PPi-PFK to replace FBPase in the Calvin cycle and thus to support CO2 fixation. Due to the genetic intractability of chemoautotrophic symbionts, I recreated the symbiont-like Calvin cycle in A. vinosum, a close but free-living relative of the chemoautotrophic symbiont of S. velum. Using deletion mutagenesis in this bacterium it was demonstrated that, in the absence of fbp, PPi-PFK encoded by pfp is essential and sufficient for carbon fixation and growth. The shift from FBPase to PPi-PFK in A. vinosum was associated with a reduction in growth rate and adaptability but not in carbon fixation. The loss of FBPase also led to a significant decrease in sulfide oxidation rates. Despite this decline in sulfide consumption, ATP levels did not change. These observations agree with the proposed coupling between PPi-PFK reverse activity and SAT activity, which may consume PPi generated by PPi-PFK to make ATP in the final step of sulfur oxidation to sulfate. For A. vinosum, which generates ATP using a light-driven cyclic electron flow (Brune 1989), this means a decrease in ATP demand and a resulting back pressure on cyclic electron flow, which is known to reduce sulfide oxidation rates (Pott & Dahl 1998; Dahl et al. 2005; Frigaard & Dahl 2009). This model fits the observed high ATP levels, the decline in sulfide oxidation, and the unaltered rates of carbon fixation in the A. vinosum FBPase-deficient mutant. However, cellular adjustments may be necessary to accommodate the increase in PPi derived from reverse PPi-PFK activity. In addition, PPi-PFK forward activity, which may be required for certain anaplerotic purposes, diminishes. These changes could account for the decreased growth rate of the ∆fbp mutant. Since chemoautotrophic symbionts are confined to their intracellular environment, a decline in growth would be of no consequence, as long as CO2 fixation rates remain unchanged. An increase in thermodynamic efficiency due to ATP production, on the other hand, would be of great advantage, as these bacteria feed not only themselves but also their much more energetically demanding hosts. Taken together, these data support the hypothesized ability of PPi-PFK to replace FBPase in the Calvin cycle of sulfur-oxidizing bacteria. The results presented in my thesis propose a mechanism that highly favors a shift from FBPase to PPi-PFK in chemoautotrophic but not in photoautotrophic symbionts, plastids, or free-living bacteria.

Future directions

The indicated ability of PPi-PFK to replace FBPase in the Calvin cycle paves the way for investigating the role of PPi in the energy metabolism of chemoautotrophic symbionts, in particular with regard to SAT activity. The potential role of H+/Na+-PPases in PPi recycling offers another attractive line of inquiry. Finally, these findings may have applications in industrial sequestration of CO2.


References

Beller, H. et al., 2006. The genome sequence of the obligately chemolithoautotrophic, facultatively anaerobic bacterium Thiobacillus denitrificans. Journal of Bacteriology, 188(4), pp.1473–1488.

Bornefeld, T., 1981. Is light-dependent formation of inorganic pyrophosphate in Anacystis a photosynthetic process? Archives of Microbiology, 129(5), pp.371–373.

Brune, D.C., 1989. Sulfur oxidation by phototrophic bacteria. Biochimica et Biophysica Acta, 975(2), pp.189–221.

Cavanaugh, C.M., 1983. Symbiotic chemoautotrophic bacteria in marine invertebrates from sulphide-rich habitats. Nature, 302, pp.58–61.

Cavanaugh, C.M. et al., 2013. Marine chemosynthetic symbioses. In The Prokaryotes. Berlin Heidelberg: Springer Berlin Heidelberg, pp. 579–607.

Chen, J. et al., 1990. Pyrophosphatase is essential for growth of Escherichia coli. Journal of Bacteriology, 172(10), pp.5686–5689.

Dahl, C. et al., 2005. Novel genes of the dsr gene cluster and evidence for close interaction of Dsr proteins during sulfur oxidation in the phototrophic sulfur bacterium Allochromatium vinosum. Journal of Bacteriology, 187(4), pp.1392–1404.

Dmytrenko, O. et al., 2014. The genome of the intracellular bacterium of the coastal bivalve, Solemya velum: a blueprint for thriving in and out of symbiosis. BMC Genomics, 15(924), pp.1–20.

Dubilier, N., Bergin, C. & Lott, C., 2008. Symbiotic diversity in marine animals: the art of harnessing chemosynthesis. Nature Reviews Microbiology, 6(10), pp.725–740.

Erb, T.J. & Zarzycki, J., 2018. A short history of RubisCO: the rise and fall (?) of Nature's predominant CO2 fixing enzyme. Current Opinion in Biotechnology, 49, pp.100–107.

Felbeck, H., Childress, J.J. & Somero, G.N., 1981. Calvin-Benson cycle and sulphide oxidation enzymes in animals from sulphide-rich habitats. Nature, 293(5830), pp.291–293.

Frese, M. et al., 2014. Characterization of the pyrophosphate-dependent 6-phosphofructokinase from Xanthomonas campestris pv. campestris. Archives of Biochemistry and Biophysics, 546, pp.53–63.

Frigaard, N.-U. & Dahl, C., 2009. Sulfur metabolism in phototrophic sulfur bacteria. Advances in Microbial Physiology, 54, pp.103–200.

Gardebrecht, A. et al., 2012. Physiological homogeneity among the endosymbionts of Riftia pachyptila and Tevnia jerichonana revealed by proteogenomics. The ISME Journal, 6(4), pp.766–776.

Gerbling, K.P., Steup, M. & Latzko, E., 1986. Fructose 1,6-bisphosphatase form B from Synechococcus leopoliensis hydrolyzes both fructose and sedoheptulose bisphosphate. Plant Physiology, 80(3), pp.716–720.

Heinonen, J., 2001. Biological role of inorganic pyrophosphate, Norwell, MA: Kluwer Academic Publishers.

Heinonen, J.K. & Drake, H.L., 1988. Comparative assessment of inorganic pyrophosphate and pyrophosphatase levels of Escherichia coli, Clostridium pasteurianum, and Clostridium thermoaceticum. FEMS Microbiology Letters, 52(3), pp.205–208.

Hinkle, P.C., 2005. P/O ratios of mitochondrial oxidative phosphorylation. Biochimica et Biophysica Acta, 1706(1-2), pp.1–11.

Kleiner, M. et al., 2012. Metaproteomics of a gutless marine worm and its symbiotic microbial community reveal unusual pathways for carbon and energy use. Proceedings of the National Academy of Sciences of the United States of America, 109(19), pp.1173–1182.

Kuwahara, H. et al., 2007. Reduced genome of the thioautotrophic intracellular symbiont in a deep-sea clam, Calyptogena okutanii. Current biology, 17(10), pp.881–886.

Markert, S. et al., 2007. Physiological proteomics of the uncultured endosymbiont of Riftia pachyptila. Science, 315(5809), pp.247–250.

Martin, W. & Schnarrenberger, C., 1997. The evolution of the Calvin cycle from prokaryotic to eukaryotic chromosomes: a case study of functional redundancy in ancient pathways through endosymbiosis. Current Genetics, 32(1), pp.1–18.

Mertens, E., 1991. Pyrophosphate-dependent phosphofructokinase, an anaerobic glycolytic enzyme? FEBS Letters, 285(1), pp.1–5.

Newton, I. et al., 2007. The Calyptogena magnifica chemoautotrophic symbiont genome. Science, 315(5814), pp.998–1000.

Parey, K. et al., 2013. Structural, biochemical and genetic characterization of dissimilatory ATP sulfurylase from Allochromatium vinosum. PLoS ONE, 8(9), pp.1–9.

Pott, A.S. & Dahl, C., 1998. Sirohaem sulfite reductase and other proteins encoded by genes at the dsr locus of Chromatium vinosum are involved in the oxidation of intracellular sulfur. Microbiology, 144(7), pp.1881–1894.

Robidart, J. et al., 2008. Metabolic versatility of the Riftia pachyptila endosymbiont revealed through metagenomics. Environmental Microbiology, 10(3), pp.727–737.

Robinson, J., Stein, J. & Cavanaugh, C.M., 1998. Cloning and sequencing of a form II ribulose-1,5-bisphosphate carboxylase/oxygenase from the bacterial symbiont of the hydrothermal vent tubeworm Riftia pachyptila. Journal of Bacteriology, 180(6), p.1596.

Scott, K.M. et al., 2006. The genome of deep-sea vent chemolithoautotroph Thiomicrospira crunogena XCL-2. PLoS Biology, 4(12), pp.2196–2212.


Serrano, A. et al., 2007. H+-PPases: yesterday, today and tomorrow. IUBMB Life, 59(2), pp.76–83.

Singer, S.J. et al., 1952. The proteins of green leaves. IV. A high molecular weight protein comprising a large part of the cytoplasmic proteins. The Journal of Biological Chemistry, 197(1), pp.233–239.

van Alebeek, G. & Keltjens, J.T., 1994. Purification and characterization of inorganic pyrophosphatase from Methanobacterium thermoautotrophicum (strain ΔH). Biochimica et Biophysica Acta, 1206(2), pp.231–239.

Yoo, J.-G. & Bowien, B., 1995. Analysis of the cbbF genes from Alcaligenes eutrophus that encode fructose-1,6-/sedoheptulose-1,7-bisphosphatase. Current Microbiology, 31(1), pp.55–61.


APPENDIX 1

Supplementary material for Chapter 1:

The genome of the intracellular bacterium of the coastal bivalve, Solemya velum: a blueprint for thriving in and out of symbiosis


Table S1. Length [bp], GC%, percentage of the total base pairs, and the number of genes in the scaffolds which constitute the genome of the S. velum symbiont.

Scaffold | Length (bp) | % GC | % of Total bp | No. Genes
SV_sym_Scaffold_1 | 1213831 | 51.5 | 44.92 | 1232
SV_sym_Scaffold_2 | 892555 | 50.7 | 33.03 | 927
SV_sym_Scaffold_3 | 537613 | 50.9 | 19.89 | 557
SV_sym_Scaffold_4 | 28016 | 48.7 | 1.04 | 25
SV_sym_Scaffold_5 | 7777 | 46.3 | 0.29 | 1
SV_sym_Scaffold_6 | 7618 | 43.5 | 0.28 | 7
SV_sym_Scaffold_7 | 3806 | 43.2 | 0.14 | 1
SV_sym_Scaffold_8 | 3773 | 40.0 | 0.14 | 2
SV_sym_Scaffold_9 | 3752 | 46.6 | 0.14 | 2
SV_sym_Scaffold_10 | 3712 | 42.8 | 0.14 | 3
Total | 2702453 | 51.0 | |
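
As a quick consistency check on Table S1, the following Python sketch recomputes the total length, each scaffold's share of the total, and the length-weighted GC content from the values listed in the table. The numbers are copied from the table. The variable and function names are illustrative, and reading the 51.0 in the Total row as a length-weighted GC average is an assumption of this example.

```python
# (scaffold, length in bp, % GC, reported % of total bp), copied from Table S1
SCAFFOLDS = [
    ("SV_sym_Scaffold_1", 1213831, 51.5, 44.92),
    ("SV_sym_Scaffold_2", 892555, 50.7, 33.03),
    ("SV_sym_Scaffold_3", 537613, 50.9, 19.89),
    ("SV_sym_Scaffold_4", 28016, 48.7, 1.04),
    ("SV_sym_Scaffold_5", 7777, 46.3, 0.29),
    ("SV_sym_Scaffold_6", 7618, 43.5, 0.28),
    ("SV_sym_Scaffold_7", 3806, 43.2, 0.14),
    ("SV_sym_Scaffold_8", 3773, 40.0, 0.14),
    ("SV_sym_Scaffold_9", 3752, 46.6, 0.14),
    ("SV_sym_Scaffold_10", 3712, 42.8, 0.14),
]
REPORTED_TOTAL_BP = 2702453
REPORTED_GC = 51.0


def summarize(scaffolds):
    """Return total bp, per-scaffold % of total, and length-weighted % GC."""
    total = sum(length for _, length, _, _ in scaffolds)
    shares = {name: 100.0 * length / total for name, length, _, _ in scaffolds}
    weighted_gc = sum(length * gc for _, length, gc, _ in scaffolds) / total
    return total, shares, weighted_gc


if __name__ == "__main__":
    total, shares, gc = summarize(SCAFFOLDS)
    print(f"total bp: {total} (reported {REPORTED_TOTAL_BP})")
    print(f"length-weighted GC: {gc:.1f}% (reported {REPORTED_GC}%)")
    for name, _, _, reported in SCAFFOLDS:
        print(f"{name}: {shares[name]:.2f}% (reported {reported}%)")
```

The three largest scaffolds together account for almost 98% of the assembled sequence, so the genome-wide GC content is dominated by their values.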


Table S2. tRNA genes and the codon frequencies in the genome of the S. velum symbiont.

Codons:

Codon | AA | Frequency
TGA | * | 1.63
TAA | * | 1.327
TAG | * | 0.637
GCA | A | 30.143
GCC | A | 24.733
GCT | A | 16.719
GCG | A | 16.595
TGC | C | 5.296
TGT | C | 5.114
GAT | D | 35.186
GAC | D | 23.155
GAA | E | 34.885
GAG | E | 33.656
TTC | F | 19.054
TTT | F | 18.404
GGT | G | 27.277
GGC | G | 26.427
GGA | G | 12.83
GGG | G | 7.449
CAT | H | 12.058
CAC | H | 11.907
ATC | I | 30.673
ATT | I | 23.225
ATA | I | 7.266
AAG | K | 24.794
AAA | K | 21.606
CTG | L | 41.235
CTC | L | 20.935
CTT | L | 17.412
TTG | L | 12.268
TTA | L | 5.172
CTA | L | 4.593
ATG | M | 27.247
AAC | N | 18.269
AAT | N | 17.187
CCG | P | 14.93
CCA | P | 9.846
CCT | P | 9.477
CCC | P | 8.879
CAG | Q | 27.882
CAA | Q | 10.404
CGC | R | 18.466
CGT | R | 18.353
AGG | R | 5.975
AGA | R | 5.232
CGA | R | 5.103
CGG | R | 4.279
AGC | S | 13.341
TCA | S | 12.66
AGT | S | 10.56
TCG | S | 9.407
TCC | S | 8.889
TCT | S | 7.926
ACC | T | 18.477
ACA | T | 14.328
ACT | T | 9.67
ACG | T | 9.083
GTC | V | 22.01
GTT | V | 20.633
GTG | V | 16.267
GTA | V | 10.521
TGG | W | 12.983
TAC | Y | 14.765
TAT | Y | 13.288

tRNA genes:

AA | Codon
Ala A | TGC
Cys C | GCA
Asp D | GTC
Glu E | TTC
Glu E | TTC
Phe F | GAA
Gly G | GCC
Gly G | TCC
His H | GTG
Ile I | GAT
Lys K | TTT
Leu L | CAG
Leu L | CAA
Leu L | TAA
Leu L | GAG
Leu L | TAG
Met M | CAT
Met M | CAT
Met M | CAT
Asn N | GTT
Pro P | CGG
Pro P | TGG
Pro P | GGG
Gln Q | TTG
Gln Q | CTG
Arg R | CCT
Arg R | TCT
Arg R | CCG
Arg R | ACG
Ser S | TGA
Ser S | GGA
Ser S | GCT
Thr T | GGT
Thr T | TGT
Val V | GAC
Val V | TAC
Trp W | CCA
Tyr Y | GTA


Table S3. Gene product names used in Figure 1 and Figure 4, the corresponding NCBI protein ID reference numbers, and EC/TC numbers.

Product | Full name | Protein ID | EC/TC Number

Electron transport chain

Sulfur Oxidation
SoxA | Heterodimeric c-type cytochrome complex SoxAX, subunit A | JV46_24690 | Unavailable
SoxX | Heterodimeric c-type cytochrome complex SoxAX, subunit X | JV46_24720 | Unavailable
SoxY | Sulfur carrier protein SoxYZ, subunit Y | JV46_24710 | Unavailable
SoxZ | Sulfur carrier protein SoxYZ, subunit Z | JV46_24700 | Unavailable
SoxB | Sulfate thiol-esterase SoxB | JV46_27210 | 3.1.3.5
FccA | Flavocytochrome c dehydrogenase FccAB, subunit A | JV46_05270 | Unavailable
FccB | Flavocytochrome c dehydrogenase FccAB, subunit B | JV46_05260 | Unavailable
Sqr | Sulfide-quinone reductase Sqr | JV46_19710 | 1.8.5.4
rDsrA | Reverse-operating cytoplasmic dissimilatory sulfite reductase DsrAB, subunit A | JV46_15520 | 1.8.1.2
rDsrB | Reverse-operating cytoplasmic dissimilatory sulfite reductase DsrAB, subunit B | JV46_15510 | 1.8.1.2
rDsrE | Hexameric sulfur relay protein rDsrEFH, subunit E | JV46_15500 | 2.8.1.-
rDsrF | Hexameric sulfur relay protein rDsrEFH, subunit F | JV46_15490 | Unavailable
rDsrH | Hexameric sulfur relay protein rDsrEFH, subunit H | JV46_15480 | Unavailable
rDsrC | Persulfide carrier to DsrAB, rDsrC | JV46_15470 | 2.8.1.-
rDsrM | Transmembrane electron transport complex rDsrKMJOP, subunit M | JV46_15460 | Unavailable
rDsrK | Transmembrane electron transport complex rDsrKMJOP, subunit K | JV46_15450 | Unavailable
rDsrL | Transmembrane electron transport complex rDsrKMJOP, subunit L | JV46_15440 | Unavailable
rDsrJ | Transmembrane electron transport complex rDsrKMJOP, subunit J | JV46_15430 | Unavailable
rDsrO | Transmembrane electron transport complex rDsrKMJOP, subunit O | JV46_15420 | Unavailable
rDsrP | Transmembrane electron transport complex rDsrKMJOP, subunit P | JV46_15410 | Unavailable
rDsrN | Dsr protein of unknown function, DsrN | JV46_15400 | Unavailable
rDsrR | Dsr protein of unknown function, DsrR | JV46_15390 | Unavailable
rDsrS | Putative posttranscriptional regulator of the rdsr operon | JV46_15380 | Unavailable
AprA | Adenosine phosphosulphate reductase AprABM, subunit A | JV46_07790 | 1.8.99.2
AprB | Adenosine phosphosulphate reductase AprABM, subunit B | JV46_07780 | 1.8.99.2
AprM | Adenosine phosphosulphate reductase AprABM, subunit M | JV46_07770 | 1.8.99.2
Sat | ATP-generating ATP sulfurylase | JV46_21260 | 2.7.7.4
SulP1 | Sulfate:bicarbonate antiporter SulP | JV46_19680 | Unavailable
SulP2 | Sulfate:bicarbonate antiporter SulP | JV46_24990 | Unavailable

Primary Ion Pumps
RnfA1 | Electron transport complex, RnfABCDGE type, subunit A | JV46_10780 | Unavailable
RnfB1 | Electron transport complex, RnfABCDGE type, subunit B | JV46_10790 | Unavailable
RnfC1 | Electron transport complex, RnfABCDGE type, subunit C | JV46_10800 | Unavailable
RnfD1 | Electron transport complex, RnfABCDGE type, subunit D | JV46_10810 | Unavailable
RnfG1 | Electron transport complex, RnfABCDGE type, subunit G | JV46_10820 | Unavailable
RnfE1 | Electron transport complex, RnfABCDGE type, subunit E | JV46_10830 | Unavailable
RnfB2 | Electron transport complex, RnfBCDGEA type, subunit B | JV46_16970 | Unavailable
RnfB3 | Electron transport complex, RnfBCDGEA type, subunit B | JV46_16960 | Unavailable
RnfC2 | Electron transport complex, RnfBCDGEA type, subunit C | JV46_16930 | Unavailable
RnfD2 | Electron transport complex, RnfBCDGEA type, subunit D | JV46_16920 | Unavailable
RnfG2 | Electron transport complex, RnfBCDGEA type, subunit G | JV46_16910 | Unavailable
RnfE2 | Electron transport complex, RnfBCDGEA type, subunit E | JV46_16900 | Unavailable
RnfA2 | Electron transport complex, RnfBCDGEA type, subunit A | JV46_16890 | Unavailable
NdhA | NADH dehydrogenase, NADH:quinone oxidoreductase NdhABCDEFGHIJKLMN, subunit A | JV46_20270 | 1.6.5.3
NdhB | NADH dehydrogenase, NADH:quinone oxidoreductase NdhABCDEFGHIJKLMN, subunit B | JV46_20280 | 1.6.5.3
NdhC | NADH dehydrogenase, NADH:quinone oxidoreductase NdhABCDEFGHIJKLMN, subunit C | JV46_20290 | 1.6.5.3
NdhD | NADH dehydrogenase, NADH:quinone oxidoreductase NdhABCDEFGHIJKLMN, subunit D | JV46_20300 | 1.6.5.3
NdhE | NADH dehydrogenase, NADH:quinone oxidoreductase NdhABCDEFGHIJKLMN, subunit E | JV46_20310 | 1.6.5.3
NdhF | NADH dehydrogenase, NADH:quinone oxidoreductase NdhABCDEFGHIJKLMN, subunit F | JV46_20320 | 1.6.5.3
NdhG | NADH dehydrogenase, NADH:quinone oxidoreductase NdhABCDEFGHIJKLMN, subunit G | JV46_20340 | 1.6.5.3
NdhH | NADH dehydrogenase, NADH:quinone oxidoreductase NdhABCDEFGHIJKLMN, subunit H | JV46_20350 | 1.6.5.3
NdhI | NADH dehydrogenase, NADH:quinone oxidoreductase NdhABCDEFGHIJKLMN, subunit I | JV46_20360 | 1.6.5.3
NdhJ | NADH dehydrogenase, NADH:quinone oxidoreductase NdhABCDEFGHIJKLMN, subunit J | JV46_20370 | 1.6.5.3
NdhK | NADH dehydrogenase, NADH:quinone oxidoreductase NdhABCDEFGHIJKLMN, subunit K | JV46_20380 | 1.6.5.3
NdhL | NADH dehydrogenase, NADH:quinone oxidoreductase NdhABCDEFGHIJKLMN, subunit L | JV46_20390 | 1.6.5.3
NdhM | NADH dehydrogenase, NADH:quinone oxidoreductase NdhABCDEFGHIJKLMN, subunit M | JV46_20400 | 1.6.5.3
NdhN | NADH dehydrogenase, NADH:quinone oxidoreductase NdhABCDEFGHIJKLMN, subunit N | JV46_20410 | 1.6.5.3

Hydrogenases
HupS | [Ni-Fe]-uptake hydrogenase HupSL, subunit S | JV46_26680 | 1.12.5.1
HupL | [Ni-Fe]-uptake hydrogenase HupSL, subunit L | JV46_26720 | 1.12.5.1
Hox2F | Bidirectional hydrogenase H2FUYH, subunit 2F | JV46_10640 | 1.12.1.2
Hox2U | Bidirectional hydrogenase H2FUYH, subunit 2U | JV46_10650 | 1.12.1.2
Hox2Y | Bidirectional hydrogenase H2FUYH, subunit 2Y | JV46_10660 | 1.12.1.2
Hox2H | Bidirectional hydrogenase H2FUYH, subunit 2H | JV46_10670 | 1.12.1.2
Hox2W | Maturation protease of Hox2H, Hox2W | JV46_10680 | Unavailable
NqrA | Na+-translocating NADH:ubiquinone oxidoreductase NqrABCDEF, subunit A | JV46_13720 | 1.6.5.8
NqrB | Na+-translocating NADH:ubiquinone oxidoreductase NqrABCDEF, subunit B | JV46_13710 | 1.6.5.8
NqrC | Na+-translocating NADH:ubiquinone oxidoreductase NqrABCDEF, subunit C | JV46_13700 | 1.6.5.8
NqrD | Na+-translocating NADH:ubiquinone oxidoreductase NqrABCDEF, subunit D | JV46_13690 | 1.6.5.8
NqrE | Na+-translocating NADH:ubiquinone oxidoreductase NqrABCDEF, subunit E | JV46_13680 | 1.6.5.8


Table S3 (Continued).

NqrF | Na+-translocating NADH:ubiquinone oxidoreductase NqrABCDEF, subunit F | JV46_13670 | 1.6.5.8
OadG | Na+-translocating oxaloacetate decarboxylase OadGAB, subunit G | JV46_14870 | 4.1.1.3
OadA | Na+-translocating oxaloacetate decarboxylase OadGAB, subunit A | JV46_14860 | 4.1.1.3
OadB1 | Na+-translocating oxaloacetate decarboxylase OadGAB, subunit B1 | JV46_14850 | 4.1.1.3
OadB2 | Na+-translocating oxaloacetate decarboxylase OadGAB, subunit B2 | JV46_14840 | 4.1.1.3

Quinone Reductases
FdoG | Formate dehydrogenase-O FdoGHI, subunit G | JV46_19080 | 1.2.1.2
FdoH | Formate dehydrogenase-O FdoGHI, subunit H | JV46_19090 | 1.2.1.2
FdoI | Formate dehydrogenase-O FdoGHI, subunit I | JV46_19100 | 1.2.1.2

Quinone Oxidases
QcoA | Quinol:cytochrome-c oxidoreductase bc1, subunit A | JV46_24140 | 1.10.2.2
QcoB | Quinol:cytochrome-c oxidoreductase bc1, subunit B | JV46_24170 | 1.10.2.2
QcoC | Quinol:cytochrome-c oxidoreductase bc1, subunit C | JV46_24180 | 1.10.2.2

Terminal reductases
CcoN | cbb3-type cytochrome c oxidase CcoNOQP, subunit N | JV46_10130 | 1.9.3.1
CcoO | cbb3-type cytochrome c oxidase CcoNOQP, subunit O | JV46_10150 | 1.9.3.1
CcoQ | cbb3-type cytochrome c oxidase CcoNOQP, subunit Q | JV46_10160 | 1.9.3.1
CcoP | cbb3-type cytochrome c oxidase CcoNOQP, subunit P | JV46_10170 | 1.9.3.1
CoxA | aa3-type cytochrome c oxidase CoxAB, subunit A | JV46_24840 | 1.9.3.1
CoxB | aa3-type cytochrome c oxidase CoxAB, subunit B | JV46_24930 | 1.9.3.1
CydA | ba3-type cytochrome c oxidase CydAB, subunit A | JV46_05320 | 1.9.3.1
CydB | ba3-type cytochrome c oxidase CydAB, subunit B | JV46_05330 | 1.9.3.1
DmsA | Dimethylsulfoxide reductase DmsABC, subunit A | JV46_27580 | 1.8.5.3
DmsB | Dimethylsulfoxide reductase DmsABC, subunit B | JV46_27590 | 1.8.5.3
DmsC | Dimethylsulfoxide reductase DmsABC, subunit C | JV46_27600 | 1.8.5.3
NapF | Periplasmic nitrate reductase napFDAGHBC, subunit F | JV46_19590 | 1.7.99.4
NapD | Periplasmic nitrate reductase napFDAGHBC, subunit D | JV46_19600 | 1.7.99.4
NapA | Periplasmic nitrate reductase napFDAGHBC, subunit A | JV46_19610 | 1.7.99.4
NapG | Periplasmic nitrate reductase napFDAGHBC, subunit G | JV46_19620 | 1.7.99.4
NapH | Periplasmic nitrate reductase napFDAGHBC, subunit H | JV46_19630 | 1.7.99.4
NapB | Periplasmic nitrate reductase napFDAGHBC, subunit B | JV46_19640 | 1.7.99.4
NapC | Periplasmic nitrate reductase napFDAGHBC, subunit C | JV46_19650 | 1.7.99.4
NirB | Assimilatory nitrite reductase, subunit B, large | JV46_10450 | 1.7.1.4
NirD | Assimilatory nitrite reductase, subunit D, small | JV46_10460 | 1.7.1.4

Mrp Antiporter
MrpE | Na+:H+ antiporter MrpEFGBBCDD, subunit E | JV46_13210 | Unavailable
MrpF | Na+:H+ antiporter MrpEFGBBCDD, subunit F | JV46_13200 | Unavailable
MrpG | Na+:H+ antiporter MrpEFGBBCDD, subunit G | JV46_13190 | Unavailable
MrpB1 | Na+:H+ antiporter MrpEFGBBCDD, subunit B1 | JV46_13180 | Unavailable
MrpB2 | Na+:H+ antiporter MrpEFGBBCDD, subunit B2 | JV46_13170 | Unavailable
MrpC | Na+:H+ antiporter MrpEFGBBCDD, subunit C | JV46_13160 | Unavailable
MrpD1 | Na+:H+ antiporter MrpEFGBBCDD, subunit D1 | JV46_13140 | Unavailable
MrpD2 | Na+:H+ antiporter MrpEFGBBCDD, subunit D2 | JV46_13120 | Unavailable

ATP Synthases
Atpf0I | F0F1-type ATP synthase, F0 subunit I | JV46_17360 | 3.6.3.14
Atpf0A | F0F1-type ATP synthase, F0 subunit A | JV46_17370 | 3.6.3.14
Atpf0C | F0F1-type ATP synthase, F0 subunit C | JV46_17380 | 3.6.3.14
Atpf0B | F0F1-type ATP synthase, F0 subunit B | JV46_17390 | 3.6.3.14
Atpf1D | F0F1-type ATP synthase, F1 subunit delta | JV46_17400 | 3.6.3.14
Atpf1A | F0F1-type ATP synthase, F1 subunit alpha | JV46_17410 | 3.6.3.14
Atpf1G | F0F1-type ATP synthase, F1 subunit gamma | JV46_17420 | 3.6.3.14
Atpf1B | F0F1-type ATP synthase, F1 subunit beta | JV46_17430 | 3.6.3.14
Atpf1E | F0F1-type ATP synthase, F1 subunit epsilon | JV46_17450 | 3.6.3.14
AtpA0D | A0A1-type ATP synthase, A0 subunit D | JV46_14270 | 3.6.3.15
AtpA0B | A0A1-type ATP synthase, A0 subunit B | JV46_14280 | 3.6.3.15
AtpA0A | A0A1-type ATP synthase, A0 subunit A | JV46_14290 | 3.6.3.15
AtpA0F | A0A1-type ATP synthase, A0 subunit F | JV46_14310 | 3.6.3.15
AtpA1K | A0A1-type ATP synthase, A1 subunit K | JV46_14320 | 3.6.3.15
AtpA0I | A0A1-type ATP synthase, A1 subunit I | JV46_14330 | 3.6.3.15

Type IV Pilus
PilA1 | Type IV pilus assembly major pilin protein PilA1 | JV46_15650 | Unavailable
PilA2 | Type IV pilus assembly major pilin protein PilA2 | JV46_15660 | Unavailable
PilB | Type IV pilus assembly major pilin protein PilB | JV46_21670 | Unavailable
PilC | Type IV pilus assembly major pilin protein PilC | JV46_21680 | Unavailable
PilD | Type IV pilus assembly major pilin protein PilD | JV46_21690 | 3.4.23.43
PilE1 | Type IV pilus prepilin-type N-terminal cleavage PilE1 | JV46_13730 | Unavailable
PilY1 | Type IV pilus assembly tip-associated adhesin PilY1-like protein PilY1 | JV46_13740 | Unavailable
PilX | Type IV pilus assembly protein PilX | JV46_13810 | Unavailable
PilW | Type IV pilus assembly protein PilW | JV46_13850 | Unavailable
PilV | Type IV pilus modification protein PilV | JV46_13890 | Unavailable
PilE2 | Type IV pilus prepilin-type N-terminal cleavage/methylation domain PilE2 | JV46_25290 | Unavailable
PilX | Type IV pilus assembly protein PilX | JV46_25310 | Unavailable
PilW | Type IV pilus assembly protein PilW | JV46_25320 | Unavailable
FimU | Type IV pilus pilin protein FimU | JV46_25330 | Unavailable


Table S3 (Continued).
PilP Type IV pilus assembly protein PilP JV46_23590 Unavailable
PilQ Type IV pilus secretin protein PilQ JV46_23600 Unavailable
PilU Type IV pilus assembly protein PilU JV46_21550 Unavailable

Calvin Cycle
CbbL ribulose 1,5-bisphosphate carboxylase large subunit CbbL JV46_07630 4.1.1.39
CbbS ribulose 1,5-bisphosphate carboxylase small subunit CbbS JV46_07620 4.1.1.39
CbbP Phosphoribulokinase JV46_10890 2.7.1.19
TK Transketolase JV46_03550 2.2.1.1
RPE Ribulose-phosphate 3-epimerase JV46_16690 5.1.3.1
RPI Ribose 5-phosphate isomerase JV46_17960 5.3.1.6

Gluconeogenesis
PpsA Phosphoenolpyruvate synthase JV46_19560 2.7.9.2
GapB Glyceraldehyde-3-phosphate dehydrogenase GapB JV46_03540 1.2.1.12
PK Pyruvate kinase JV46_03510 2.7.1.40

Polyglucose biosynthesis
PGM1 Phosphoglucomutase 1 JV46_12920 5.4.2.2
UDP UDP-glucose pyrophosphorylase JV46_24670 2.7.7.9
GS Glycogen synthases, ADP-glucose type JV46_06580 2.4.1.21
GBE Glycogen branching enzyme JV46_06590 2.4.1.18
GT 4-alpha-glucanotransferase JV46_06600 2.4.1.25
PYGL Glycogen phosphorylase JV46_06570 2.4.1.1

Glycolysis
GK Glucokinase JV46_10890 2.7.1.19
GPI Glucose-6-phosphate isomerase JV46_18220 5.3.1.9
PPi-PFK Pyrophosphate-dependent phosphofructokinase JV46_23230 2.7.1.90
FBPA Fructose-bisphosphate aldolase JV46_03500 4.1.2.13
TPI Triosephosphate isomerase JV46_20250 5.3.1.1
GapA Glyceraldehyde-3-phosphate dehydrogenase GapA JV46_25990 1.2.1.12
PGK Phosphoglycerate kinase JV46_03520 2.7.2.3
PGM2 Phosphoglycerate mutase 2 JV46_23670 5.4.2.1
Eno Enolase JV46_21020 4.2.1.11
PK Pyruvate kinase JV46_03510 2.7.1.40
PDH1 Pyruvate dehydrogenase 1 JV46_14530 2.3.1.12
PDH2 Pyruvate dehydrogenase 2 JV46_14520 2.3.1.12

TCA Cycle
CS Citrate synthase JV46_23370 2.3.3.1
ACO1 Aconitase 1 JV46_20010 4.2.1.3
ACO2 Aconitase 2 JV46_12860 4.2.1.3
ICD Isocitrate dehydrogenase JV46_14480 1.1.1.42
OGDH 2-oxoglutarate dehydrogenase JV46_13970 1.8.1.4
SCS1 Succinyl-CoA synthetase 1 JV46_15630 6.2.1.5
SCS2 Succinyl-CoA synthetase 2 JV46_15640 6.2.1.5
SdhC Succinate dehydrogenase SdhCDAB, subunit C JV46_18230 1.3.5.1
SdhD Succinate dehydrogenase SdhCDAB, subunit D JV46_18260 1.3.5.1
SdhA Succinate dehydrogenase SdhCDAB, subunit A JV46_18270 1.3.5.1
SdhB Succinate dehydrogenase SdhCDAB, subunit B JV46_18420 1.3.5.1
FumC Fumarase JV46_13940 4.2.1.2
Mqo Malate:quinone oxidoreductase Mqo JV46_27010 1.1.5.4

Glyoxylate Cycle
ICL Isocitrate lyase JV46_21990 4.1.3.1
MLS Alanine-glyoxylate aminotransferase JV46_07950 2.6.1.44
ME Malic enzyme JV46_23450 1.1.1.40

Fatty Acid Biosynthesis (FAB)
FabF 3-ketoacyl-ACP synthase II JV46_14750 2.3.1.179
PlsX Phosphate:acyl-[ACP] acyltransferase JV46_14800 2.3.1.15
Acc1 Acetyl-CoA carboxylase 1 JV46_14890 6.4.1.2
Acc2 Acetyl-CoA carboxylase 2 JV46_09790 6.4.1.2
Acc3 Acetyl-CoA carboxylase 3 JV46_26480 6.4.1.2
FabH 3-ketoacyl-ACP synthase III JV46_14790 2.3.1.180
FabG 3-ketoacyl-ACP reductase JV46_14770 1.1.1.100
ACP Acyl-carrier protein JV46_14760 Unavailable
FabA 3-hydroxyacyl-[ACP] dehydratase JV46_14940 Unavailable
FabI Enoyl-ACP reductase [NADH] JV46_13910 1.3.1.9
FabZ 3-hydroxydecanoyl-[ACP] dehydratase JV46_09760 4.2.1.60
FabD Malonyl CoA-ACP transacylase JV46_14780 2.3.1.39

Phospholipid synthesis
PlsB G3P-acyltransferase JV46_25110 2.3.1.15
CdsA CDP-diglyceride synthetase JV46_15020 2.7.7.41
PssA Phosphatidylserine synthase JV46_19530 4.1.1.65
Psd Phosphatidylserine decarboxylase JV46_07980 4.1.1.65
PlsC 1-acyl-G3P-acyltransferase JV46_23260 2.3.1.51

Non-mevalonate Pathway
Dxs 1-Deoxy-D-xylulose 5-phosphate synthase JV46_27320 2.2.1.7
IspC 1-Deoxy-D-xylulose 5-phosphate reductoisomerase JV46_15010 1.1.1.267


Table S3 (Continued).
IspD 4-diphosphocytidyl 2-C-methyl D-erythritol synthase JV46_15020 2.7.7.41
IspE 4-diphosphocytidyl 2-C-methyl D-erythritol kinase JV46_16240 2.7.1.148
IspF 2C-methyl D-erythritol 2,4-cyclodiphosphate synthase JV46_21050 4.6.1.12
IspG 4-hydroxy 3-methylbut 2-enyl diphosphate synthase JV46_06430 1.17.4.3
IspH 4-hydroxy 3-methylbut 2-enyl diphosphate reductase JV46_25680 1.17.1.2

HMG-CoA reductase pathway
GGGPPS Geranylgeranyl pyrophosphate synthase JV46_09930 2.5.1.30
FPPS Farnesyl-pyrophosphate synthase JV46_26060 2.5.1.10

Cell Wall Biosynthesis
GlmU Glucosamine-1-phosphate N-acetyltransferase GlmU JV46_17460 2.3.1.157
MurA MurABCDE, subunit A - UDP-N-acetylglucosamine 1-carboxyvinyltransferase JV46_15790 2.5.1.7
MurB MurABCDE, subunit B - UDP-N-acetylmuramate dehydrogenase JV46_15800 1.1.1.158
MurC MurABCDE, subunit C - UDP-N-acetylmuramate-L-alanine ligase JV46_15810 6.3.2.8
MurD MurABCDE, subunit D - UDP-N-acetylmuramoylalanine-D-glutamate ligase JV46_15850 6.3.2.9
MurE MurABCDE, subunit E - UDP-N-acetylmuramoyl-L-alanyl-D-glutamate:(L)-meso-2,6-diaminoheptanedioate gamma-ligase (ADP-forming) JV46_15890 6.3.2.13
Ddl D-alanine--D-alanine ligase Ddl JV46_15790 6.3.2.4
MurF UDP-N-acetylmuramoyl-tripeptide-D-alanyl-D-alanine ligase JV46_15880 6.3.2.10
MraY UDP-N-acetylmuramyl pentapeptide phosphotransferase JV46_13470 2.7.8.13
MurG UDP-N-acetylglucosamine--N-acetylmuramyl-(pentapeptide) JV46_15830 2.4.1.227
MtgA Monofunctional biosynthetic peptidoglycan transglycosylase JV46_19010 2.4.1.-
MrcA Penicillin-binding protein 1A JV46_23660 2.4.1.-
MrcB Penicillin-binding protein 1B JV46_27250 2.4.1.129
MrdA Penicillin-binding protein 2 JV46_11130 2.4.1.129
PbpB Penicillin-binding protein 2 JV46_15900 2.4.1.129
DacA Penicillin-binding protein 6 JV46_11170 3.4.16.4
DacB D-alanyl-D-alanine carboxypeptidase JV46_16060 3.4.16.4
MviN Integral membrane protein MviN JV46_26370 Unavailable
KpsF KpsF family protein JV46_14380 Unavailable
KdsA 3-deoxy-8-phosphooctulonate synthase JV46_21010 2.5.1.55
KdsB 3-deoxy-manno-octulosonate cytidylyltransferase JV46_20860 2.7.7.38
LpxA Acyl-[acyl-carrier-protein]--UDP-N-acetylglucosamine O-acyltransferase JV46_14930 2.3.1.129
LpxC UDP-3-O-[3-hydroxymyristoyl] N-acetylglucosamine deacetylase JV46_15720 3.5.1.108
LpxD UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase JV46_14950 2.3.1.191
LpxH UDP-2,3-diacylglucosamine hydrolase JV46_10860 3.6.1.54
LpxB Lipid-A-disaccharide synthase JV46_14920 2.4.1.182
LpxK Tetraacyldisaccharide 4'-kinase JV46_20840 2.7.1.130
KdtA 3-deoxy-D-manno-octulosonic-acid transferase JV46_29080 Unavailable
HtrB Lipid A biosynthesis lauroyl/palmitoleoyl acyltransferase JV46_23880 2.3.1.-
MsbB Lauroyl/myristoyl acyltransferase JV46_27370 2.3.1.-

Taurine Synthesis
TauD Taurine dioxygenase JV46_12240 1.14.11.17

ABC Transporters
YadH ABC-type polysaccharide/polyol phosphate export systems YadHG, subunit H JV46_14430 Unavailable
YadG ABC-type polysaccharide/polyol phosphate export systems YadHG, subunit G JV46_14440 Unavailable
MdlB ABC-type multidrug transport system MdlB JV46_11990 Unavailable
MacA RND family efflux transporter MacAB, subunit A JV46_04810 Unavailable
MacB RND family efflux transporter MacAB, subunit B JV46_04780 Unavailable
SalX ABC-type antimicrobial peptide transport system SalXY, subunit X JV46_04800 Unavailable
SalY ABC-type antimicrobial peptide transport system SalXY, subunit Y JV46_04790 Unavailable
Ttg2A ABC-type transport system involved in resistance to organic solvents Ttg2ACD, subunit A JV46_19850 Unavailable
Ttg2C ABC-type transport system involved in resistance to organic solvents Ttg2ACD, subunit C JV46_19860 Unavailable
Ttg2D ABC-type transport system involved in resistance to organic solvents Ttg2ACD, subunit D JV46_16460 Unavailable
CcmA ABC-type heme exporter protein CcmABCD, subunit A JV46_18340 Unavailable
CcmB ABC-type heme exporter protein CcmABCD, subunit B JV46_18350 Unavailable
CcmC ABC-type heme exporter protein CcmABCD, subunit C JV46_18360 Unavailable
CcmD ABC-type heme exporter protein CcmABCD, subunit D JV46_18370 Unavailable
DppC ABC-type oligopeptide transport systems DppCBBAF, subunit C JV46_03680 Unavailable
DppB1 ABC-type oligopeptide transport systems DppCBBAF, subunit B1 JV46_03740 Unavailable
DppB2 ABC-type oligopeptide transport systems DppCBBAF, subunit B2 JV46_03790 Unavailable
DppA ABC-type oligopeptide transport systems DppCBBAF, subunit A JV46_03890 Unavailable
DppF ABC-type oligopeptide transport systems DppCBBAF, subunit F JV46_20940 Unavailable
SmoK ABC-type sorbitol/mannitol transporter SmoKGFE, subunit K JV46_03940 Unavailable
SmoG ABC-type sorbitol/mannitol transporter SmoKGFE, subunit G JV46_03950 Unavailable
SmoF ABC-type sorbitol/mannitol transporter SmoKGFE, subunit F JV46_03960 Unavailable
SmoE ABC-type sorbitol/mannitol transporter SmoKGFE, subunit E JV46_03970 Unavailable
LivK ABC-type amino acid/amide transporter LivKHMG, subunit K JV46_04880 3.A.1.4.-
LivH ABC-type amino acid/amide transporter LivKHMG, subunit H JV46_04870 3.A.1.4.-
LivM ABC-type amino acid/amide transporter LivKHMG, subunit M JV46_04860 3.A.1.4.-
LivG ABC-type amino acid/amide transporter LivKHMG, subunit G JV46_04850 3.A.1.4.-
LivF ABC-type amino acid/amide transporter LivKHMG, subunit F JV46_04840 3.A.1.4.-
AapJ ABC-type amino acid transporter AapJQMP, subunit J JV46_28250 3.A.1.4.-
AapQ ABC-type amino acid transporter AapJQMP, subunit Q JV46_28260 3.A.1.4.-
AapM ABC-type amino acid transporter AapJQMP, subunit M JV46_28270 3.A.1.4.-


Table S3 (Continued).
AapP ABC-type amino acid transporter AapJQMP, subunit P JV46_28280 3.A.1.4.-
TauA ABC-type taurine transporter TauACB, subunit A JV46_26820 3.6.3.36
TauC ABC-type taurine transporter TauACB, subunit C JV46_26830 3.6.3.36
TauB ABC-type taurine transporter TauACB, subunit B JV46_26850 3.6.3.36
UrtA ABC-type urea transporter UrtABCDE, subunit A JV46_28500 Unavailable
UrtB ABC-type urea transporter UrtABCDE, subunit B JV46_28510 Unavailable
UrtC ABC-type urea transporter UrtABCDE, subunit C JV46_28520 Unavailable
UrtD ABC-type urea transporter UrtABCDE, subunit D JV46_28530 Unavailable
UrtE ABC-type urea transporter UrtABCDE, subunit E JV46_28540 Unavailable
TupC ABC-type cobalamin/Fe3+-siderophore transporter TupCBA, subunit C JV46_12830 Unavailable
TupB ABC-type cobalamin/Fe3+-siderophore transporter TupCBA, subunit B JV46_12840 Unavailable
TupA ABC-type cobalamin/Fe3+-siderophore transporter TupCBA, subunit A JV46_12850 Unavailable
PstB ABC-type phosphate transporter PstBACS, subunit B JV46_16720 3.A.1.7.1
PstA ABC-type phosphate transporter PstBACS, subunit A JV46_16730 3.A.1.7.1
PstC ABC-type phosphate transporter PstBACS, subunit C JV46_16740 3.A.1.7.1
PstS1 ABC-type phosphate transporter PstBACS, subunit S1 JV46_16750 3.A.1.7.1
PstS2 ABC-type phosphate transporter PstBACS, subunit S2 JV46_05400 3.A.1.7.1
ModC ABC-type molybdenum transporter ModCBA, subunit C JV46_17050 Unavailable
ModB ABC-type molybdenum transporter ModCBA, subunit B JV46_17060 Unavailable
ModA1 ABC-type molybdenum transporter ModCBA, subunit A1 JV46_17070 Unavailable
ModA2 ABC-type molybdenum transporter ModCBA, subunit A2 JV46_27160 Unavailable
ZnuB ABC-type Mn2+/Zn2+ transporter ZnuBCA, subunit B JV46_05790 Unavailable
ZnuC ABC-type Mn2+/Zn2+ transporter ZnuBCA, subunit C JV46_05800 Unavailable
ZnuA ABC-type Mn2+/Zn2+ transporter ZnuBCA, subunit A JV46_05810 Unavailable
FhuD ABC-type hemin transporter FhuDBC, subunit D JV46_07880 3.6.3.34
FhuB ABC-type hemin transporter FhuDBC, subunit B JV46_12940 3.6.3.34
FhuC ABC-type hemin transporter FhuDBC, subunit C JV46_12930 3.6.3.34
AfuC ABC-type spermidine/putrescine transporter AfuCBA, subunit C JV46_11970 3.6.3.30
AfuB ABC-type spermidine/putrescine transporter AfuCBA, subunit B JV46_11980 3.6.3.30
AfuA ABC-type spermidine/putrescine transporter AfuCBA, subunit A JV46_12000 3.6.3.30

Porins
PhoE Outer membrane protein PhoE JV46_26360 Unavailable
OmpA1 Outer membrane protein OmpA1 JV46_21440 Unavailable
OmpA2 Outer membrane protein OmpA2 JV46_16990 Unavailable
OmpC Outer membrane protein OmpC JV46_22220 Unavailable

Ion Channels
EriC Chloride channel protein EriC JV46_16700 Unavailable
AmtB1 ammonium transporter AmtB1 JV46_23120 Unavailable
AmtB2 ammonium transporter AmtB2 JV46_19350 Unavailable

Feo Transporter
FeoA Ferrous iron (Fe2+) transporter FeoAB, subunit A JV46_09850 Unavailable
FeoB Ferrous iron (Fe2+) transporter FeoAB, subunit B JV46_09860 Unavailable

P-Type ATPases
ActP P-type probable sodium:solute symporter ActP JV46_13540 Unavailable
CopA1 P-type copper/silver transporter CopA1 JV46_20160 Unavailable
CopA2 P-type copper/silver transporter CopA2 JV46_28240 Unavailable
ZntA P-type heavy metal transporter ZntA JV46_05290 Unavailable
MgtA P-type magnesium transporter MgtA JV46_26960 Unavailable

Sec System
SecA Protein translocase SecABDEFGY-YidC-YajC-Ffh-FtsY, subunit A JV46_15700 Unavailable
SecB Protein translocase SecABDEFGY-YidC-YajC-Ffh-FtsY, subunit B JV46_23180 Unavailable
SecD Protein translocase SecABDEFGY-YidC-YajC-Ffh-FtsY, subunit D JV46_03750 Unavailable
SecE Protein translocase SecABDEFGY-YidC-YajC-Ffh-FtsY, subunit E JV46_16180 Unavailable
SecF Protein translocase SecABDEFGY-YidC-YajC-Ffh-FtsY, subunit F JV46_03730 Unavailable
SecG Protein translocase SecABDEFGY-YidC-YajC-Ffh-FtsY, subunit G JV46_20260 Unavailable
SecY Protein translocase SecABDEFGY-YidC-YajC-Ffh-FtsY, subunit Y JV46_07170 Unavailable
YidC Protein translocase SecABDEFGY-YidC-YajC-Ffh-FtsY, subunit YidC JV46_17160 Unavailable
YajC Protein translocase SecABDEFGY-YidC-YajC-Ffh-FtsY, subunit YajC JV46_03760 Unavailable
Ffh Protein translocase SecABDEFGY-YidC-YajC-Ffh-FtsY, subunit Ffh JV46_04040 Unavailable
FtsY Protein translocase SecABDEFGY-YidC-YajC-Ffh-FtsY, subunit FtsY JV46_25480 Unavailable

Tat System
TatA Twin arginine-targeting protein translocase TatABC, subunit A JV46_18620 Unavailable
TatB Twin arginine-targeting protein translocase TatABC, subunit B JV46_18630 Unavailable
TatC Twin arginine-targeting protein translocase TatABC, subunit C JV46_18640 Unavailable

Type II Secretion System
GspC Type II protein secretion system GspDSCFGHIJKLMEO, subunit C JV46_18850 Unavailable
GspD Type II protein secretion system GspDSCFGHIJKLMEO, subunit D JV46_18860 Unavailable
GspE1 Type II protein secretion system GspDSCFGHIJKLMEO, subunit E1 JV46_18870 Unavailable
GspE2 Type II protein secretion system GspDSCFGHIJKLMEO, subunit E2 JV46_18880 Unavailable
GspF Type II protein secretion system GspDSCFGHIJKLMEO, subunit F JV46_18890 Unavailable
GspG Type II protein secretion system GspDSCFGHIJKLMEO, subunit G JV46_18910 Unavailable
GspH Type II protein secretion system GspDSCFGHIJKLMEO, subunit H JV46_18920 Unavailable
GspI Type II protein secretion system GspDSCFGHIJKLMEO, subunit I JV46_18930 Unavailable


Table S3 (Continued).
GspJ Type II protein secretion system GspDSCFGHIJKLMEO, subunit J JV46_18940 Unavailable
GspK Type II protein secretion system GspDSCFGHIJKLMEO, subunit K JV46_18950 Unavailable
GspL Type II protein secretion system GspDSCFGHIJKLMEO, subunit L JV46_18960 Unavailable
GspM Type II protein secretion system GspDSCFGHIJKLMEO, subunit M JV46_18970 Unavailable
GspN Type II protein secretion system GspDSCFGHIJKLMEO, subunit N JV46_18980 Unavailable
GspO Type II protein secretion system GspDSCFGHIJKLMEO, subunit O JV46_21690 Unavailable

Long-chain Fatty Acid Transporter
FadL Long-chain fatty acid transporter FadLD, subunit L JV46_22000 Unavailable
FadD Long-chain fatty acid transporter FadLD, subunit D JV46_11770 Unavailable

Tol System
TolQ Cell division and transport-associated protein TolQ (TC 2.C.1.2.1) JV46_27660 Unavailable
TolR Biopolymer transport protein JV46_27670 Unavailable
TolA TolA protein JV46_27680 Unavailable
TolB tol-pal system beta propeller repeat protein TolB JV46_27690 Unavailable

TonB Complex
ExbB1 Biopolymer transport protein ExbBD, subunit B1 JV46_20820 Unavailable
ExbB2 Biopolymer transport protein ExbBD, subunit B2 JV46_08350 Unavailable
ExbD1 Biopolymer transport protein ExbBD, subunit D1 JV46_20830 Unavailable
ExbD2 Biopolymer transport protein ExbBD, subunit D2 JV46_08360 Unavailable
TonB1 Outer membrane transport energization protein TonB1 JV46_15870 Unavailable
TonB2 Outer membrane transport energization protein TonB2 JV46_08370 Unavailable
BtuB Outer membrane cobalamin receptor protein BtuB JV46_25710 Unavailable

Multidrug Efflux Pump
AcrA1 Multidrug efflux pump AcrAB, subunit A1 JV46_09130 Unavailable
AcrB1 Multidrug efflux pump AcrAB, subunit B1 JV46_09140 Unavailable
AcrA2 Multidrug efflux pump AcrAB, subunit A2 JV46_11370 Unavailable
AcrB2 Multidrug efflux pump AcrAB, subunit B2 JV46_07710 Unavailable
AcrB3 Multidrug efflux pump AcrAB, subunit B3 JV46_03460 Unavailable
AcrA3 Multidrug efflux pump AcrAB, subunit A3 JV46_03470 Unavailable
AcrA4 Multidrug efflux pump AcrAB, subunit A4 JV46_21950 Unavailable
AcrB4 Multidrug efflux pump AcrAB, subunit B4 JV46_21960 Unavailable
AcrB5 Multidrug efflux pump AcrAB, subunit B5 JV46_12380 Unavailable
AcrA5 Multidrug efflux pump AcrAB, subunit A5 JV46_12390 Unavailable
AcrB6 Multidrug efflux pump AcrAB, subunit B6 JV46_13300 Unavailable
AcrA6 Multidrug efflux pump AcrAB, subunit A6 JV46_13310 Unavailable
AcrB7 Multidrug efflux pump AcrAB, subunit B7 JV46_26640 Unavailable
AcrA7 Multidrug efflux pump AcrAB, subunit A7 JV46_22530 Unavailable
TolC Type I secretion outer membrane protein TolC JV46_17820 Unavailable

TRAP Transporters
DctM1 TRAP-type C4-dicarboxylate transporter DctPQM, subunit M1 JV46_07020 Unavailable
DctQ1 TRAP-type C4-dicarboxylate transporter DctPQM, subunit Q1 JV46_07030 Unavailable
DctP1 TRAP-type C4-dicarboxylate transporter DctPQM, subunit P1 JV46_07040 Unavailable
DctP2 TRAP-type C4-dicarboxylate transporter DctPQM, subunit P2 JV46_11320 Unavailable
DctQ2 TRAP-type C4-dicarboxylate transporter DctPQM, subunit Q2 JV46_11330 Unavailable
DctM2 TRAP-type C4-dicarboxylate transporter DctPQM, subunit M2 JV46_11340 Unavailable
DctM3 TRAP-type C4-dicarboxylate transporter DctPQM, subunit M3 JV46_12640 Unavailable
DctQ3 TRAP-type C4-dicarboxylate transporter DctPQM, subunit Q3 JV46_12650 Unavailable
DctP3 TRAP-type C4-dicarboxylate transporter DctPQM, subunit P3 JV46_12660 Unavailable
DctM4 TRAP-type C4-dicarboxylate transporter DctPQM, subunit M4 JV46_14070 Unavailable
DctQ4 TRAP-type C4-dicarboxylate transporter DctPQM, subunit Q4 JV46_14080 Unavailable
DctP4 TRAP-type C4-dicarboxylate transporter DctPQM, subunit P4 JV46_14090 Unavailable
DctM5 TRAP-type C4-dicarboxylate transporter DctPQM, subunit M5 JV46_25940 Unavailable
DctM6 TRAP-type C4-dicarboxylate transporter DctPQM, subunit M6 JV46_25950 Unavailable
DctQ5 TRAP-type C4-dicarboxylate transporter DctPQM, subunit Q5 JV46_25960 Unavailable
DctP5 TRAP-type C4-dicarboxylate transporter DctPQM, subunit P5 JV46_25970 Unavailable

Secondary Transporters
TrkA K+ transporter TrkAH, subunit A JV46_29030 Unavailable
TrkH K+ transporter TrkAH, subunit H JV46_29040 Unavailable
CitT1 Di- and tricarboxylate transporter CitT1 JV46_03580 Unavailable
CitT2 Di- and tricarboxylate transporter CitT2 JV46_17850 Unavailable
ZupT Divalent heavy-metal cations transporter ZupT JV46_27760 Unavailable
MgtE Mg2+ transporter MgtE JV46_25230 Unavailable
NptA Na/Pi cotransporter NptA JV46_05390 Unavailable
KefB1 Sodium/proton antiporter KefB1 JV46_24480 Unavailable
KefB2 Sodium/proton antiporter KefB2 JV46_28430 Unavailable
KefB3 Sodium/proton antiporter KefB3 JV46_09560 Unavailable
NhaP Na+/H+ and K+/H+ antiporter NhaP JV46_14160 Unavailable
LysE Putative threonine efflux protein LysE JV46_05240 Unavailable
FieF Cation diffusion facilitator FieF JV46_17540 Unavailable


Table S4. Parameters of the gene prediction software.

Parameter Value Explanation

GeneMarkS
Parameters specified:
--gcode 11
--shape circular
--prok
Default parameters:
--order 2 Markov chain order. Default = 2
--motif 1 Default = 1
--width 6 Default = 6
--prestart 26 Length of sequence upstream of translation initiation site that includes motif. Default = 26
--identity 0.99 Identity level for termination of iterations. Default = 0.99
--maxitr 10 Maximum number of iterations. Default = 10
--fixmotif Motif is located at a fixed position with respect to start
--offover Overlap allowed by default
--strand Strand to predict gene in. Default both

Prodigal
Parameters specified:
-g 11 Translation table. Default = 11
Default parameters:
-c Genes not allowed to run off edges
-n Not used, hence did not bypass Shine-Dalgarno trainer
-m Not used, hence allowed genes to be built across N's

Glimmer
Parameters specified:
-n No header
-t 1,15 Genes with entropy score less than 1.15 will be considered
-z 11 GenBank translation table used
-f Consider the option g
-g 60 Minimum gene length to n nucleotides. Does not include the bases in the stop codon
-A Codon list default – ATG, GTG, TTG
Circular genome used, probability of all start codons considered equal
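The "Parameters specified" rows of Table S4 map directly onto command-line arguments. The following is a minimal sketch of that mapping, not the commands actually run for this work: the helper `build_args`, the executable names `gmsn.pl` and `prodigal`, and the input file `genome.fasta` are all assumptions introduced for illustration; only the flags and values come from the table.

```python
# Hypothetical helper: turn flag/value pairs from Table S4 into an argv list.
# Executable names and the input file name are assumptions, not from the source.

def build_args(executable, params, positional=()):
    """Return an argv-style list; a value of None marks a bare flag."""
    argv = [executable]
    for flag, value in params.items():
        argv.append(flag)
        if value is not None:
            argv.append(str(value))
    argv.extend(positional)
    return argv


# Flags and values exactly as listed under "Parameters specified" in Table S4.
GENEMARKS_SPECIFIED = {"--gcode": 11, "--shape": "circular", "--prok": None}
PRODIGAL_SPECIFIED = {"-g": 11}

# "genome.fasta" is a placeholder input; how each tool takes its input
# should be checked against that tool's own documentation.
genemarks_cmd = build_args("gmsn.pl", GENEMARKS_SPECIFIED, ["genome.fasta"])
prodigal_cmd = build_args("prodigal", PRODIGAL_SPECIFIED)

print(" ".join(genemarks_cmd))
print(" ".join(prodigal_cmd))
```

Keeping the specified parameters in one dictionary per tool makes it easy to record them alongside the results, which is the purpose Table S4 serves here.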


Genomic Utility for Automated Comparison (GUAC)
File: /Users/oleg/Downloads/8987655971254236_add5.cvs. Saved: 9/24/14, 00:37:02. Printed for: Oleg
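The BLAST post-processing step of this utility keeps a hit only when its bit score, percent identity, and alignment coverage all meet user-set cut-offs (defaults 50, 30, and 40). A minimal sketch of that filter follows, assuming the standard 12-column tabular BLAST output (`-outfmt 6`); the names `Hit`, `parse_hit`, and `passes` are hypothetical, and the example line is made up. As in the listing, the subject end coordinate (column 10) stands in for the subject length when computing coverage.

```python
from typing import NamedTuple, Optional


class Hit(NamedTuple):
    query: str
    subject: str
    percent_id: float
    percent_alignment: float
    bit_score: float


def parse_hit(line: str) -> Optional[Hit]:
    """Parse one line of 12-column BLAST tabular output (-outfmt 6)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 12:
        return None
    alignment_length = int(fields[3])
    subject_end = int(fields[9])  # length proxy, mirroring the listing
    return Hit(
        query=fields[0],
        subject=fields[1],
        percent_id=float(fields[2]),
        percent_alignment=alignment_length * 100 / subject_end,
        bit_score=float(fields[11]),
    )


def passes(hit: Hit, bit_score: float = 50, per_id: float = 30,
           per_align: float = 40) -> bool:
    """Apply the three cut-offs; defaults follow the listing's option parser."""
    return (hit.bit_score >= bit_score
            and hit.percent_alignment >= per_align
            and hit.percent_id >= per_id)


# Made-up example line, not taken from any real BLAST run.
example = "111\t222\t45.5\t200\t100\t3\t1\t210\t1\t400\t1e-30\t150\n"
hit = parse_hit(example)
print(hit, passes(hit))
```

Splitting on tabs instead of matching one regular expression per column keeps the column positions explicit and avoids re-scanning each line several times.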

# Oleg Dmytrenko 1 August 2011

import sys
import os
import glob
import re
import shutil
import optparse

##### Functions #####

def get_aa_sequence(geneNAME, fileNAME, geneID):
    fileNAME += '.fasta'
    fileNAME.join('') # No effect: str.join() returns a new string, which is discarded here
    outputNAME = geneNAME
    outputNAME += '.fasta'
    output = ''
    output += './Gene aa sequences are here/'
    output += outputNAME
    dirname = 'Gene aa sequences are here'
    if not os.path.isdir('./' + dirname + '/'): # This part creates an output directory if it does not exist yet.
        os.mkdir('./' + dirname + '/')
    output = open(output, 'a')
    for file in glob.iglob(fileNAME):
        found = 0
        for line in open(file):
            if (line == '\n' and found == 1):
                found = 0
                output.write('\n')
            if (('>' in line and re.search(geneID, line, re.A) != None) or found == 1):
                found = 1
                output.write(line)

gene_set = set()
def clustal_W(org, gene, id, mark):
    if id not in gene_set:
        gene_set.add(id)
        org += '.fasta'
        org.join('') # No effect: str.join() returns a new string, which is discarded here
        outputNAME = gene
        outputNAME += '.fasta'
        output = ''
        output += './Clustal analysis results/'
        output += outputNAME
        dirname = 'Clustal analysis results'


        if not os.path.isdir('./' + dirname + '/'): # This part creates an output directory if it does not exist yet.
            os.mkdir('./' + dirname + '/')
        output = open(output, 'a')
        for file in glob.iglob(org):
            found = 0
            for line in open(file):
                if (line == '\n' and found == 1):
                    found = 0
                    output.write('\n')
                if '>' in line and re.search(id, line, re.A) != None:
                    new_line = re.findall('>(.*)', line, re.A)
                    new_new_line = ''
                    new_new_line += '>'
                    new_new_line += mark
                    new_new_line += str(new_line[0])
                    new_new_line += '\n'
                    new_new_line.join('')
                    found = 1
                    output.write(new_new_line)
                elif found == 1:
                    output.write(line)

def findall_genes_ID(ID): # Finds all the genes as defined in the gene ID .query file and returns a dictionary with gene IDs as keys and genome names as lists of values
    List = []
    gene_ID_list = []
    for filename in glob.iglob('*.txt'):
        for line in open(filename):
            # Earlier variant of the pattern below:
            # ('[^-]' + '[\t]?'+ '[^a-z]' + ID + '[\t]?' + '[^-dependent][^-regulated][^-accessory]', line, re.I|re.A)
            # Make it not search combinations which are part of a word
            result = re.findall('[^-a-zA-Z]' + '[\t]?' + ID + '[\t]?' + '[^-a-zA-Z]', line, re.I|re.A)
            clean_result = re.findall(ID, str(result), re.I|re.A)
            if result != []:
                mystring=clean_result[0]
                check = re.findall(r'\t([a-zA-Z]{3,5}\d?)\t', line, re.A)
                numbered = re.findall('\t([a-zA-Z]{3,4}[0-9])\t', line, re.A)
                if len(check) >= 1:
                    if len(ID) == 3:
                        newID = ID.lower()
                        if newID != check[0]:
                            continue
                        else:
                            gene_ID = re.findall(r'^(\d+)', line, re.A)
                            gene_ID = gene_ID[0]
                            gene_ID_list.append(gene_ID)
                            shortfilename = filename[:-4]
                            get_aa_sequence(ID, shortfilename, gene_ID)
                            clustal_W(shortfilename, ID, gene_ID, '00_')


                    elif len(ID) == 4:
                        newID = ID[:-1].lower()+ID[-1:].capitalize()
                        if newID != check[0]: # removes erroneous matches
                            continue
                        else:
                            gene_ID = re.findall(r'^(\d+)', line, re.A)
                            gene_ID = gene_ID[0]
                            gene_ID_list.append(gene_ID)
                            shortfilename = filename[:-4]
                            get_aa_sequence(ID, shortfilename, gene_ID)
                            clustal_W(shortfilename, ID, gene_ID, '00_')
                    elif len(numbered) == 1:
                        newID = ID[:-2].lower()+ID[-2:].capitalize()
                        if newID != check[0]: # removes erroneous matches
                            continue
                        else:
                            gene_ID = re.findall(r'^(\d+)', line, re.A)
                            gene_ID = gene_ID[0]
                            gene_ID_list.append(gene_ID)
                            shortfilename = filename[:-4]
                            get_aa_sequence(ID, shortfilename, gene_ID)
                            clustal_W(shortfilename, ID, gene_ID, '00_')
                elif len(check) == 0: # if gene ID is not easily identifiable
                    second_search = re.findall('\t{2}([^\t]*)', line, re.I|re.A)
                    third_search = re.findall(r'[^\w]([\w]{3,5}\d?)', str(second_search), re.I|re.A)
                    capital = re.compile('[A-Z]', re.A)
                    lower = re.compile('[a-z]', re.A)
                    for item in third_search[:]: # iterates over a copy of the list
                        if capital.match(item, 1) != None:
                            third_search.remove(item)
                        elif lower.match(item, 3) != None:
                            third_search.remove(item)
                    if len(third_search) >= 2:
                        choice = ''
                        while not choice == 'y' and not choice == 'n':
                            print('Does this search result correspond to the ID query', '\033[0;31m' ,ID, '\033[1;m', '?')
                            new_line = line.replace('\n', '')
                            print(new_line)
                            choice = input('Type y/n and press Return: ')
                            print('\n')
                        if choice == 'n':
                            continue
                        elif choice == 'y':
                            gene_ID = re.findall(r'^(\d+)', line, re.A)
                            gene_ID = gene_ID[0]


                            gene_ID_list.append(gene_ID)
                            shortfilename = filename[:-4]
                            get_aa_sequence(ID, shortfilename, gene_ID)
                            clustal_W(shortfilename, ID, gene_ID, '00_')
                    else:
                        gene_ID = re.findall(r'^(\d+)', line, re.A)
                        gene_ID = gene_ID[0]
                        gene_ID_list.append(gene_ID)
                        shortfilename = filename[:-4]
                        get_aa_sequence(ID, shortfilename, gene_ID)
                        clustal_W(shortfilename, ID, gene_ID, '00_')

def find_COG(ID, file_name):
    list = ''
    count = 0
    for line in open(file_name):
        if str(ID) in line:
            if 'COG_category' in line:
                COG = []
                COG = re.findall(r'(\[\w\]\s\w+.+)\t\t', line, re.A)
                if COG:
                    if count == 0:
                        list += COG[0]
                        count += 1
                    elif count != 0:
                        list += ', '
                        list += COG[0]
    return(list)

def find_COG_number(ID, file_name):
    list = ''
    count = 0
    for line in open(file_name):
        if str(ID) in line:
            if 'COG' in line:
                COG_number = []
                COG_number = re.findall(r'(COG\d+)\t', line, re.A)
                if COG_number:
                    if count == 0:
                        list += COG_number[0]
                        count += 1
                    elif count != 0:
                        list += ', '
                        list += COG_number[0]
    return(list)



def find_name(ID, file_name):
    list = ''
    count = 0
    for line in open(file_name):
        if str(ID) in line:
            if 'Product_name' in line:
                Name = []
                Name = re.findall('Product_name\t\t(.+)\t', line, re.A)
                if Name:
                    if count == 0:
                        list += Name[0]
                        count += 1
                    elif count != 0:
                        list += ', '
                        list += Name[0]
    return(list)

def find_EC(ID, file_name):
    list = ''
    count = 0
    for line in open(file_name):
        if str(ID) in line:
            if 'EC:' in line:
                EC = []
                EC = re.findall('EC:(.+)]\t', line, re.A)
                if EC:
                    if count == 0:
                        list += EC[0]
                        count += 1
                    elif count != 0:
                        list += ', '
                        list += EC[0]
    return(list)

def find_KEGG(ID, file_name):
    list = ''
    count = 0
    for line in open(file_name):
        if str(ID) in line:
            if 'KO' in line:
                KO = []
                KO = re.findall(r'KO:(K\d+)', line, re.A)
                if KO:
                    if count == 0:


                        list += KO[0]
                        count += 1
                    elif count != 0:
                        list += ', '
                        list += KO[0]
    return(list)


##### Main #####

if os.path.isdir('./BLAST Output is Here/'):
    shutil.rmtree('./BLAST Output is Here/') # Removes the directory and everything else down the tree

if os.path.isdir('./Gene aa sequences are here/'):
    shutil.rmtree('./Gene aa sequences are here/')

if os.path.isdir('./Results are here/'):
    shutil.rmtree('./Results are here/')

if os.path.isdir('./Clustal analysis results/'):
    shutil.rmtree('./Clustal analysis results/')


parser = optparse.OptionParser()
parser.add_option('-b', dest='bit_score', type='float', help=('Type -b cut-off bit score. Default == 50'))
parser.add_option('-i', dest='per_id', type='float', help=('Type -i cut-off % identity. Default == 30'))
parser.add_option('-a', dest='per_align', type='float', help=('Type -a cut-off % of alignment length over query length. Default == 40'))
parser.set_defaults(bit_score=50, per_id=30, per_align=40)
(options, args) = parser.parse_args()

for filename in glob.iglob('*.query'): # This for-loop creates a list of all the genes (LG) in the query file in the order they are listed in the file.
    LG =[]
    for line in open(filename):
        if line[0] == '\n':
            break
        elif line[-1] != '\n':
            LG.append(line)
        elif line[-1] == '\n':
            length = len(line) - 1
            line = (line[:length])
            LG.append(line)

LO = []
for filename in glob.iglob('*.txt'): # This for-loop creates a list of all the organisms' genomes (LO) in the target directory in alphabetical order.
    LO.append(filename[:-4])


LO.sort()

Dictionary = {}
genome_gene_ID_dict = {} # Dictionaries for the memory module
gene_score_dict = {}
gene_ID_gene_symbol_dict = {}

outFILE = 'genome_comparison_table.xls'
outDIR = './Results are here/'
outDIR += outFILE
outDIR.join('')
dirname = 'Results are here'

if not os.path.isdir('./' + dirname + '/'):
    os.mkdir('./' + dirname + '/')

print(' \n***Finding genes and the corresponding amino acid sequences*** \n', '\033[0;32mThis module may require your input to resolve ambiguous matches. Wait until you see the next green message\033[1;m \n' )

for ID in LG:
    findall_genes_ID(ID) # This is where the search dictionary is returned

for O in LO:
    genome_gene_ID_dict[O] = set()

dirname = 'BLAST Output is Here'
if not os.path.isdir('./' + dirname + '/'):
    os.mkdir('./' + dirname + '/')

print('***Performing BLAST search for genes not identified in annotation*** \n'
      '***And identifying the best alignments in the target genomes*** \n', '\033[0;32mHave a coffee\033[1;m \n')
for O in LO:
    for G in LG:
        DB = []
        makeDB = []
        DB.append(O)
        DB.append('.fasta.psq')
        makeDB.append(O)
        makeDB.append('.fasta')
        makeDB = ''.join(makeDB) # Joins the list elements into a single string
        DB = ''.join(DB)
        if not (DB in glob.iglob('*.psq')): # Creates BLAST protein databases using existing aa .fasta files for whole genomes
            imput = []
            imput.append('makeblastdb -in ')
            imput.append(makeDB)
            imput.append(' -dbtype prot')


            imput = ''.join(imput)
            os.system (imput)
        else:
            full_path = []
            full_path.append('./Gene\ aa\ sequences\ are\ here/')
            full_path.append(G)
            full_path.append('.fasta')
            full_path = ''.join(full_path)
            check_path = []
            check_path.append('Gene aa sequences are here/')
            check_path.append(G)
            check_path.append('.fasta')
            check_path = ''.join(check_path)
            blast_query = []
            blast_query.append('blastp -db ')
            blast_query.append(makeDB)
            blast_query.append(' -outfmt 6')
            blast_query.append(' -query ')
            blast_query.append(full_path)
            blast_query = ''.join(blast_query)
            blast_output = []
            analysis_input = []
            blast_output.append(' -out ')
            output_folder = []
            output_folder = ('./BLAST\ Output\ is\ Here/')
            analysis_output_folder = []
            analysis_output_folder = ('./BLAST Output is Here/')
            blast_output.append(output_folder)
            blast_output.append(O)
            analysis_input.append(analysis_output_folder)
            analysis_input.append(O)
            blast_output.append('_')
            analysis_input.append('_')
            blast_output.append(G)
            analysis_input.append(G)
            blast_output.append('.out')
            analysis_input.append('.out')
            blast_output = ''.join(blast_output)
            analysis_input = ''.join(analysis_input)
            blast_input = []
            blast_input.append(blast_query)
            blast_input.append(blast_output)
            blast_input = ''.join(blast_input)
            if os.path.exists('./' + check_path): #Makes sure no ugly writing appears in the terminal if there are no aa sequences
                os.system(blast_input)


            if glob.iglob(analysis_input) != None:
                for blast_out in glob.iglob(analysis_input):
                    for line in open(blast_out):
                        bit_score = re.findall('(\d+\.?\d?)$', line, re.A)
                        query = re.findall('^(\d+)\t\d+', line, re.A)
                        subject = re.findall('^\d+\t(\d+)', line, re.A) #subject gene object ID e.g. 643529311
                        percent_id = re.findall ('^\d+\t\d+\t(\d+?\.?\d+?)\t', line, re.A) #percent identity
                        alignment_length = re.findall('^\d+\t\d+\t\d+?\.?\d+?\t(\d+)', line, re.A) #length of the alignment
                        q_length = re.findall('^\d+\t\d+\t\d+?\.?\d+?\t\d+\t\d+\t\d+\t\d+\t(\d+)', line, re.A) #length of the query sequence
                        s_length = re.findall('^\d+\t\d+\t\d+?\.?\d+?\t\d+\t\d+\t\d+\t\d+\t\d+\t\d+\t(\d+)', line, re.A) #length of the subject sequence
                        percent_alignment = 0
                        percent_alignment = int(alignment_length[0])*100/int(s_length[0]) #percent of the alignment over the subject length
                        score_list = []
                        score_list.append(float(bit_score[0]))
                        score_list.append(percent_alignment)
                        score_list.append(float(percent_id[0]))
                        score_list.append(int(query[0]))
                        if float(bit_score[0]) >= options.bit_score and percent_alignment >= options.per_align and float(percent_id[0]) >= options.per_id:
                            gene_object_ID = int(subject[0])
                            if gene_object_ID not in gene_ID_gene_symbol_dict:
                                gene_ID_gene_symbol_dict.setdefault(gene_object_ID, G)
                                gene_score_dict.setdefault(gene_object_ID, score_list)
                                genome_gene_ID_dict.setdefault(O, set()).add(gene_object_ID)
                            elif gene_object_ID in gene_ID_gene_symbol_dict:
                                if float(bit_score[0]) > float(gene_score_dict[gene_object_ID][0]):
                                    gene_score_dict[gene_object_ID] = score_list
                                    del gene_ID_gene_symbol_dict [gene_object_ID]
                                    gene_ID_gene_symbol_dict.setdefault(gene_object_ID, G)

for G in LG: #creates {Dictionary} with gene symbols (keys) and 0's as values for each organism
    list = []
    for O in LO:
        list.append(0)
    Dictionary[G] = list

for O in LO: #Updates {Dictionary} based on the BLAST search
    for gene_ID in genome_gene_ID_dict[O]:
        current_index = LO.index(O)
        templist = Dictionary[gene_ID_gene_symbol_dict[gene_ID]]
        templist[current_index] += 1
        Dictionary[gene_ID_gene_symbol_dict[gene_ID]] = templist

table = open(outDIR, 'a') #Prints the table header
for item in LO:
    table.write('\t' + str(item))


table.write('\n')

for gene_item in LG: #Print {Dictionary} into a table
    table.write(str(gene_item))
    for cell in Dictionary[gene_item]:
        table.write('\t' + str(cell))
    table.write('\n')

print('***Finding information about the identified genes*** \n')

outORGdir = './Results are here/'

for O in LO:
    for ID in genome_gene_ID_dict[O]:
        if ID in gene_ID_gene_symbol_dict:
            symbol = gene_ID_gene_symbol_dict[ID]
            culstal_id = str(ID)
            clustal_W(O, symbol, culstal_id, '01_')
            outORGfile = ''
            outORGfile += outORGdir
            outORGfile += O
            outORGfile += '_'
            outORGfile += symbol
            outORGfile += '.xls'
            outORGfile.join('')
            check = []
            check.append('Results are here/')
            check.append(O)
            check.append('_')
            check.append(symbol)
            check.append('.xls')
            check = ''.join(check)
            if not os.path.exists('./' + check):
                table = open(outORGfile, 'a')
                table.write('gene symbol' + '\t' + 'bit score' + '\t' + '% alignment' + '\t' + '% identity' + '\t' + 'query' + '\t' + 'genome ID' + '\t'+ 'gene product name' '\t' + 'EC number' + '\t'+ 'COG category' + '\t' +'COG number' + '\t' + 'KEGG category' + '\t' + 'alternative names' + '\t' + 'function' + '\t' + 'notes' + '\n')
                table.write(str(symbol))
            table = open(outORGfile, 'a')
            search_file_name = ''
            search_file_name += O
            search_file_name += '.info'
            search_file_name.join('')
            table.write('\t' + str(gene_score_dict[ID][0]) + '\t')
            table.write(str(gene_score_dict[ID][1]) + '\t')
            table.write(str(gene_score_dict[ID][2]) + '\t')
            table.write(str(gene_score_dict[ID][3]) + '\t')


            table.write(str(ID) + '\t')
            Name = []
            Name = find_name(ID, search_file_name)
            if Name:
                table.write(str(Name) + '\t')
            elif not Name:
                table.write('\t')
            EC = []
            EC = find_EC(ID, search_file_name)
            if EC:
                table.write(str(EC) + '\t')
            elif not EC:
                table.write('\t')
            COG = []
            COG = find_COG(ID, search_file_name)
            if COG:
                table.write(str(COG) + '\t')
            elif not COG:
                table.write('\t')
            COG_number = []
            COG_number = find_COG_number(ID, search_file_name)
            if COG_number:
                table.write(str(COG_number) + '\t')
            elif not COG_number:
                table.write('\t')
            KEGG = []
            KEGG = find_KEGG(ID, search_file_name)
            if KEGG:
                table.write(str(KEGG) + '\t' + '\n')
            elif not KEGG:
                table.write('\t' + '\n')

print('***Performing ClustalW analysis to check for misidentified genes. \n'
      'See alignments in \"Clustal analysis results\" directory. \n'
      'If needed, rerun GUAC with adjusted bit score, % identity, and % alignment cut off values. *** \n')

for G in LG:
    full_path = []
    full_path.append('./Clustal analysis results/')
    full_path.append(G)
    full_path.append('.fasta')
    full_path = ''.join(full_path)
    #print(full_path)
    if os.path.exists(full_path):
        clustal_query = []


        clustal_query.append('clustalw2')
        clustal_query.append(' -infile=')
        clustal_query.append('./Clustal\ analysis\ results/')
        clustal_query.append(G)
        clustal_query.append('.fasta')
        clustal_query.append(' -align')
        clustal_query.append(' -outfile=')
        clustal_query.append('./Clustal\ analysis\ results/')
        clustal_query.append(G)
        clustal_query.append('.aln')
        clustal_query = ''.join(clustal_query)
        os.system(clustal_query)
        clustal_tree_query = []
        clustal_tree_query.append('clustalw2')
        clustal_tree_query.append(' -infile=')
        clustal_tree_query.append('./Clustal\ analysis\ results/')
        clustal_tree_query.append(G)
        clustal_tree_query.append('.fasta')
        clustal_tree_query.append(' -tree')
        clustal_tree_query.append(' -outfile=')
        clustal_tree_query.append('./Clustal\ analysis\ results/')
        clustal_tree_query.append(G)
        clustal_tree_query.append('.ph')
        clustal_tree_query = ''.join(clustal_tree_query)
        os.system(clustal_tree_query)

print('***Analysis complete! Output in \"Results are here\" directory*** \n')
sys.exit()
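The hit-filtering step of the listing can also be sketched without regular expressions. The following is a minimal illustration and is not part of GUAC: it assumes BLAST tabular output (-outfmt 6) with the default twelve columns, and the names parse_hit and Hit as well as the example line are hypothetical. Unlike the listing, it expresses alignment coverage relative to a query length supplied by the caller, because the default tabular output does not report sequence lengths.

```python
from typing import NamedTuple, Optional


class Hit(NamedTuple):
    """One BLAST hit that passed all three cut-offs (hypothetical helper type)."""
    query: str
    subject: str
    percent_id: float
    percent_alignment: float
    bit_score: float


def parse_hit(line: str, query_length: int,
              min_bit_score: float = 50.0,
              min_percent_id: float = 30.0,
              min_percent_alignment: float = 40.0) -> Optional[Hit]:
    """Parse one line of default -outfmt 6 output; return None if a cut-off fails.

    Default columns: qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore.
    """
    fields = line.rstrip('\n').split('\t')
    if len(fields) < 12:
        return None  # malformed or empty line
    percent_id = float(fields[2])
    alignment_length = int(fields[3])
    bit_score = float(fields[11])
    # Coverage of the query by the alignment, in percent (assumed definition).
    percent_alignment = alignment_length * 100 / query_length
    if (bit_score >= min_bit_score
            and percent_id >= min_percent_id
            and percent_alignment >= min_percent_alignment):
        return Hit(fields[0], fields[1], percent_id, percent_alignment, bit_score)
    return None


# Hypothetical example line; the IDs and values are made up for illustration.
example = '643529311\t650716001\t45.2\t210\t100\t5\t1\t205\t3\t212\t1e-50\t180\n'
hit = parse_hit(example, query_length=300)
```

The default cut-offs mirror the defaults of the -b, -i, and -a options in the listing. Splitting on tabs avoids the dependence on purely numeric sequence identifiers that the regular expressions in the listing have.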


APPENDIX 2

Supplementary material for Chapter 2:

The “missing enzyme” in the enigmatic Calvin cycle of chemoautotrophic bacterial symbionts


Sequencing of the symbiont-enriched and unenriched mRNA

Approximately 1.6 million cDNA sequencing reads were obtained from RNA unenriched in the symbiont mRNA transcripts and over 1.35 million from the enriched RNA (Appendix 2 Table A2.1). The majority of sequences were rRNA transcripts. mRNA enrichment decreased the share of rRNA reads by only 3.7 percentage points for the symbiont and 5.4 percentage points for the host, while leading to the fragmentation of transcripts (Appendix 2 Figure A2.1) and loss of some non-rRNA sequences (Appendix 2 Table A2.1). Host mRNA removal was effective since less than 1% of non-rRNA reads were identified as eukaryotic. Roughly 0.4% of the reads were mitochondrial. Cytochrome c oxidase subunit I (COX1), a respiratory electron transport chain protein, was the most highly transcribed mitochondrial protein-coding gene (Appendix 2 Figure A2.2). Of the bacterial reads, 0.1% could not be referenced to the genome of the symbiont and may originate from the gill surface-associated microbial community (Appendix 2 Table A2.1).
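The 3.7 and 5.4 figures quoted for the rRNA reads appear to be differences between the percentage columns of Appendix 2 Table A2.1, that is, changes in each category's share of all reads rather than relative changes in read counts. The short sketch below reproduces that arithmetic from the read counts in the table; the function and variable names are illustrative only.

```python
# Read counts taken from Appendix 2 Table A2.1.
TOTAL_READS = {'unenriched': 1_591_449, 'enriched': 1_350_648}
RRNA_READS = {
    'symbiont rRNA': {'unenriched': 415_701, 'enriched': 303_155},
    'host rRNA': {'unenriched': 549_041, 'enriched': 392_671},
}


def percent_of_total(reads: int, total: int) -> float:
    """Share of all reads in a library, in percent."""
    return reads * 100 / total


def share_drop(category: str) -> float:
    """Drop in a category's share of all reads after enrichment, in percentage points."""
    counts = RRNA_READS[category]
    before = percent_of_total(counts['unenriched'], TOTAL_READS['unenriched'])
    after = percent_of_total(counts['enriched'], TOTAL_READS['enriched'])
    return before - after


for category in RRNA_READS:
    print(f'{category}: {share_drop(category):.1f} percentage points')
```

Run as written, the loop prints 3.7 percentage points for the symbiont rRNA and 5.4 for the host rRNA.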


Figure A2.1. Length distribution of the cDNA sequencing reads from the symbiont-containing gill tissue of S. velum.


Figure A2.2. Gene expression across the mitochondrial genome of S. velum. From outside to the center: mitochondrial DNA (Mb); gene expression per nucleotide, with the symbiont unenriched cDNA in blue and symbiont-enriched cDNA in grey; genes (green) with the corresponding gene names.


Figure A2.3. Transcriptional activity of tRNA genes in the S. velum symbiont.


Figure A2.4. Initial reverse reaction velocities of the symbiont PPi-PFK without and with 0.15U PPase over a range of phosphate concentrations. Measurements were performed at pH 7.5 and 25℃. Standard deviations from three replicate measurements are shown.


Figure A2.5. Symbiont PPi-PFK activity (A) without PPase and (B) with 0.15U PPase at different phosphate concentrations with 5 mM FBP. Measurements were performed at pH 7.5 and 25℃. Standard deviations from three replicate measurements are shown.


Table A2.1. Number of transcripts from S. velum symbiont-containing gill tissue unenriched and enriched in the symbiont mRNA. Transcript reads were mapped to the symbiont genome (Dmytrenko et al. 2014) or the mitochondrial genome of the host (Plazzi et al. 2013). Unmapped reads were queried with BLASTN against the NCBI nucleotide database. Data marked in bold were used in gene expression analysis (Figures 2.2 and 2.3).

                                        cDNA unenriched       cDNA symbiont enriched
Read types                              # of reads   %        # of reads   %
Total                                   1,591,449    100.0    1,350,648    100.0
Unique non-duplicates                   1,184,432    74.4     884,578      65.5
Duplicates                              768,731      26       581,423      35
5S, 16S, 23S symbiont rRNA              415,701      26.1     303,155      22.4
18S, 28S host rRNA                      549,041      34.5     392,671      29.1
12S, 16S mitochondrial rRNA             62,019       3.9      46,462       3.4
S. velum mitochondrial mRNA and tRNA    7,038        0.4      5,770        0.4
S. velum symbiont genomic mRNA & tRNA   53,513       3.4      45,101       3.3
Non-symbiont bacterial RNA              1,073        0.1      1,412        0.1
Eukaryotic RNA                          10,085       0.6      11,180       0.8
Unassigned                              85,962       5.4      78,827       5.8
Mean read length (bp)                   316                   176


Table A2.2. Most highly-expressed genes in the S. velum symbiont.

                                cDNA unenriched   cDNA symbiont enriched
Rank  Gene name   Gene ID       (% reads kb-1)    (% reads kb-1)
1     sirA        31577136      2.755             2.829
2     rpmJ        31577289      1.984             1.267
3     rbcL        31576636      1.709             1.703
4     dsrE        31577137      1.346             0.970
5     dsrH        31575343      1.300             1.076
6     hp          31575728      1.204             1.638
7     dsrC        31575342      0.911             0.821
8     rpsM        31576581      0.868             0.663
9     hp          31576035      0.664             0.525
10    rpmG        31576708      0.635             0.401
11    porin_4     31577251      0.618             0.570
12    ompA        31577180      0.617             0.733
13    aprM        31575927      0.602             0.590
14    rbcS        31576635      0.545             0.381
15    ripA        31575730      0.543             0.610
16    groES       31575479      0.542             0.478
17    HU          31577016      0.541             0.571
18    atpE        31575643      0.538             0.477
19    rpmB        31576709      0.525             0.504
20    rplU        31576796      0.499             0.367
21    pilA        31575370      0.476             0.462
22    rpmD        31576584      0.469             0.381
23    hp          31575856      0.464             0.472
24    fba         31576038      0.462             0.566
25    glgC        31576501      0.456             0.535
26    rpmA        31576797      0.456             0.356
27    HSP70       31576502      0.445             0.573
28    raiA        31576650      0.415             0.380
29    hp          31576685      0.411             0.395
30    rpsS        31576598      0.394             0.437
31    aprB        31575928      0.393             0.386
32    rplV        31576597      0.383             0.418
33    nuoA        31575069      0.382             0.300
34    dsrE        31575345      0.381             0.451
35    dsrF        31575344      0.381             0.397
36    rpsG        31576606      0.375             0.366
37    prkA        31576527      0.372             0.368
38    dsrA        31575347      0.365             0.372
39    cytC        31576414      0.351             0.628
40    dsrB        31575346      0.349             0.394
41    rpmF        31575278      0.335             0.264
42    rpmI        31574931      0.334             0.340
43    atpI        31577205      0.325             0.351
44    sat         31574706      0.323             0.358
45    rbr         31575960      0.322             0.298
46    hp          31576722      0.321             0.358
47    rpsO        31575117      0.303             0.193
48    hp          31576151      0.303             0.239
49    acpP        31575273      0.302             0.222


Table A2.2 (Continued).

                                cDNA unenriched   cDNA symbiont enriched
Rank  Gene name   Gene ID       (% reads kb-1)    (% reads kb-1)
50    adk         31574970      0.298             0.330
51    rpsL        31576607      0.298             0.301
52    rpsK        31576580      0.292             0.280
53    rpsE        31576585      0.292             0.302
54    HTH_ARSR    31575768      0.286             0.266
55    iscU        31576121      0.285             0.274
56    haem_bdg    31576879      0.281             0.251
57    soxX        31576917      0.268             0.341
58    rplN        31576592      0.261             0.236
59    ndh         31575070      0.258             0.211
60    rplM        31575738      0.255             0.320
61    trp         31576756      0.253             0.294
62    gapA        31576041      0.251             0.285
63    rplC        31576602      0.250             0.311
64    dsrM        31575341      0.245             0.307
65    rpsF        31575104      0.244             0.295
66    RHOD        31575767      0.241             0.201
67    rplR        31576586      0.241             0.240
68    dsrL        31575339      0.239             0.258
69    atpF        31575644      0.237             0.233
70    rplW        31576600      0.232             0.229
71    dsrO        31575337      0.227             0.226
72    rpmH        31575606      0.225             0.227
73    rpsH        31576588      0.223             0.216
74    rpmC        31576594      0.223             0.259
75    rplB        31576599      0.216             0.185
76    HU_like     31575491      0.215             0.247
77    dsrJ        31575338      0.214             0.246
78    rplP        31576595      0.213             0.251
79    pfp         31575776      0.213             0.250
80    rpsQ        31576593      0.212             0.153
81    rplF        31576587      0.211             0.239
82    rpsU        31577001      0.208             0.255
83    dsrK        31575340      0.206             0.254
84    ALP_like    31575631      0.206             0.170
85    rpsJ        31576603      0.204             0.135
86    rpsD        31576579      0.203             0.189
87    ips         31576661      0.202             0.225
88    soxZ        31576915      0.198             0.211
89    hp          31576311      0.198             0.230
90    rplX        31576591      0.197             0.240
91    yfsF        31576658      0.197             0.308
92    rplO        31576583      0.195             0.144
93    rpsN        31576589      0.194             0.170
94    cmk         31575847      0.187             0.122
95    rpsC        31576596      0.187             0.207
96    rplD        31576601      0.182             0.177
97    hp          31576627      0.177             0.236
98    secB        31575766      0.175             0.146
99    fccA        31576290      0.174             0.124
100   ihfA        31574935      0.173             0.144
101   rpe         31575542      0.173             0.112


Table A2.2 (Continued).

                                cDNA unenriched   cDNA symbiont enriched
Rank  Gene name   Gene ID       (% reads kb-1)    (% reads kb-1)
102   yceD        31575279      0.172             0.158
103   GH57N_APU   31576500      0.171             0.173
104   ccmD        31577268      0.170             0.074
105   hp          31575962      0.170             0.159
106   rpsP        31576137      0.163             0.180
107   rplJ        31575435      0.161             0.133
108   arsC        31574866      0.161             0.171
109   pcm         31576727      0.159             0.148
110   rplE        31576590      0.158             0.159
111   nuoE        31575073      0.158             0.222
112   hp          31576090      0.157             0.156
113   yhbY        31576076      0.157             0.138
114   rplT        31574932      0.154             0.111
115   groEL       31575480      0.153             0.160
116   dsrC        31575352      0.152             0.075
117   cbbQ        31576634      0.151             0.191
118   yciL        31576070      0.151             0.149
119   dsrR        31575334      0.151             0.113
120   tktA        31576042      0.150             0.178
121   rplK        31575437      0.150             0.166
122   tpiA        31575063      0.150             0.155
123   nuoI        31575077      0.150             0.118
124   trpD        31576757      0.150             0.151
125   nuoC        31575071      0.149             0.144
126   fusA        31576605      0.148             0.155
127   ndhD        31575072      0.148             0.169
128   COG2847     31575100      0.148             0.142
129   mlrA        31574936      0.146             0.089
130   secY        31576582      0.145             0.110
131   iscA        31576120      0.142             0.123
132   hfq         31575310      0.141             0.156
133   soxY        31576916      0.139             0.142
134   aprA        31575929      0.138             0.152
135   RNase_P     31575607      0.138             0.133
136   nfs         31576122      0.137             0.138
137   napC        31576472      0.137             0.114
138   malQ        31576499      0.136             0.159
139   porin       31576867      0.132             0.120
140   fixQ        31576689      0.132             0.135
141   rplQ        31576577      0.132             0.112
142   glgBE       31576498      0.131             0.100
143   ppiA        31575874      0.131             0.108
144   fixN        31576687      0.131             0.111
145   secE        31575439      0.130             0.186
146   rimM        31576136      0.130             0.148
147   DUF4426     31575412      0.130             0.124
148   aroK        31575761      0.129             0.168
149   yccA        31574763      0.129             0.142
150   glnK        31575364      0.129             0.122
151   pgk         31576040      0.128             0.143
152   rpsB        31575300      0.125             0.127


Table A2.3. Controls of PPi-PFK activity (nmol PPi min-1 mg total protein-1) in the cell-free extracts (CFE) of S. velum gill and foot tissue. The reactions were carried out with 5 mM FBP and 20 mM PO4^3-, unless otherwise stated. Standard deviations from three biological replicates are shown. Measurements were performed at pH 7.5 and at 25℃.

            FBP        Without PO4^3-   5 mM F6P   5 mM Fru   Boiled gill CFE   Foot CFE
Activity    27.5±1.8   0.00±0.0         0.00±0.0   0.00±0.0   0.00±0.0          0.00±0.0


Table A2.4. Initial velocities of the symbiont PPi-PFK forward reaction (µmol min-1 mg protein-1) at different substrate concentrations. Measurements were performed at pH 7.5 and at 25℃. Standard deviations from three measurements are shown.

            PPi [mM]
F6P [mM]    0.01       0.025      0.5         2.5         5
0.05        11.7±0.9   21.2±0.2   23.1±3.0    19.5±2.6    18.5±1.0
0.1         14.5±0.6   23.0±2.7   35.5±4.0    30.2±3.3    30.90±1.0
0.5         33.6±1.6   38.3±4.5   60.2±1.9    65.9±2.3    65.8±6.4
2.5         40.4±1.8   68.4±5.0   86.9±7.2    85.5±6.9    98.2±9.4
7.5         66.9±3.6   85.7±2.8   101.6±3.4   102.1±0.9   104.0±2.5


Table A2.5. Initial velocities of the symbiont PPi-PFK reverse reaction (µmol min-1 mg protein-1) at different substrate concentrations. Measurements were performed at pH 7.5 and at 25℃. Standard deviations from three measurements are shown.

            PO4^3- [mM]
FBP [mM]    0.5        1           5           10          20          25          50
0.01        4.7±0.6    8.2±0.7     11.8±0.8    11.7±1.2    9.3±0.3     6.7±0.3     3.1±0.1
0.025       17.9±1.7   20.6±3.0    34.0±2.2    23.9±1.9    20.7±1.4    17.4±1.0    12.4±1.0
0.05        16.5±1.0   25.4±2.7    51.0±0.2    52.3±2.7    43.1±8.8    44.0±3.7    24.5±3.6
0.1         23.5±0.9   38.5±3.3    73.1±1.8    82.4±1.1    78.1±4.8    71.1±3.2    49.1±1.5
2.5         41.8±0.9   78.9±4.9    151.8±4.8   186.0±4.7   184.8±7.9   183.6±5.2   162.2±4.2
5.0         58.5±5.9   82.4±2.5    158.6±4.7   173.0±5.5   187.1±2.3   187.3±4.5   174.7±3.5
10.0        41.0±1.7   85.5±7.73   124.2±1.9   145.4±3.1   166.7±3.2   158.2±1.1   182.8±3.6


Table A2.6. PPi-PFK activity (µmol PPi min-1 mg symbiont protein-1) in S. velum gill tissue cell-free protein extracts at different substrate concentrations from Table 2.1 estimated per bacterial protein.

               FBP [mM]
PO4^3- [mM]    2.5         5           10
10             1.56±0.04   1.58±0.07   1.60±0.09
20             2.11±0.29   2.10±0.25   1.92±0.17
25             1.94±0.12   2.08±0.07   1.92±0.04


Table A2.7. Bacterial strains, plasmids, and primers used in this study.

Name | Relevant properties | Reference

Strains
A. vinosum DSM 180 Rif50 | RifR; spontaneous rifampicin-resistant mutant of A. vinosum DSM 180 | (Lubbe et al. 2006)
E. coli BL21(DE3) | F-, ompT, hsdSB (rB-, mB-), gal, dcm (DE3) | ThermoFisher

Plasmids
pET28a+ | KanR expression vector | EMD Biosciences

Primers
Sv_16SF1 | CGCTGGCGGTATGCTTAAC | This study
Sv_16SR1T7 | GCCAGTGAATTGTAATACGACTCACTATAGGGCGGTGTGTACAAGGCC | This study
SV_18SF1_53 | TGCTTGTCTCAAAGATTAAGCA | This study
SV_18SR1T7-53 | GCCAGTGAATTGTAATACGACTCACTATAGGGAACAGTCCGAGGATGTC | This study
Sv_23SF1 | CAAGTGAATAAGCGTACACGG | This study
Sv_23SR1T7 | GCCAGTGAATTGTAATACGACTCACTATAGGGCAATTAGTATCGGTTAGC | This study
Sv_28F1 | GCATATCACTAAGCGGAGGA | This study
Sv_28S1T7 | GCCAGTGAATTGTAATACGACTCACTATAGGGTAAAACTAACCTGTCTCACG | This study
Sv_pfp_1F_NdeI | ATTGCATCATATGAGTGCAAAAAACGCATT | This study
Sv_pfp_1257_SacI | TTAGACGAGCTCTTACAGCTCGAAGTCTTCCA | This study
Sv_ppase_1F_NdeI | ATTGCATCATATGAATCTGGATAAAGTCACCGC | This study
Sv_ppase_549R_XhoI | TTAGACCTCGAGTCAGAAAGCGGGCTTCTC | This study
Av_pfp_1F_NdeI | ATTGCATCATATGTCAGCCAAGAACGC | This study
Av_pfp_1254R_SacI | TTAGACGAGCTCTTACAGCTCAAAGGCGC | This study
Av_fbp_1F_NdeI | ATTGCATCATATGCACAACGGTACCAG | This study
Av_fbp_1014R_SacI | TTAGACGAGCTCTCAATCCGGCTGATGATACC | This study


APPENDIX 3

Supplementary material for Chapter 3:

The enigmatic Calvin cycle of chemoautotrophic bacterial symbionts deciphered


Supplementary Methods

Bacterial strains and plasmids

Templates for recombination were created by PCR amplifying approximately 500 bp fragments immediately upstream and downstream of the target genes and fusing them in the same order to an antibiotic selection marker using primers from Appendix 3 Table A3.4. To amplify the pfp left and right flanks, primers pfpNdeLF-pfpaacC1LR and pfpaacC534RF-pfpXhoIRR were used, respectively. The fbp flanks were amplified with primers fbpBglIILF-fbpaphALR and fbpaphARF-fbp367NdeIRR. The aphA1 kanamycin resistance (KmR) marker was chosen for fbp deletion in A. vinosum. This antibiotic resistance gene was PCR amplified from plasmid pCM184 using primers aphA1F and aphA816R. The promoterless aacC1 gentamicin resistance (GmR) gene for the single pfp knockout was PCR amplified from pCM351 with primers aacCSDF and aacC534R. Inactivation of both fbp and pfp made A. vinosum slow-growing. Because of this slow growth and the high number of false positives with gentamicin selection, an aacC1 gene with a constitutive gentamicin promoter was used to generate the double ∆fbp ∆pfp mutant in the ∆fbp genetic background. The flanking regions and the corresponding antibiotic markers were assembled using fusion PCR. The three PCR products were combined with the two outermost primers (pfpNdeILF-pfpXhoIRR and fbpBglIILF-fbp699NdeIRR) and PCR amplified for 25 cycles (98°C for 10 sec, 65°C for 15 sec, 72°C for 62.5 sec). The resulting fusion products were purified, digested with NdeI and XhoI restriction enzymes for the pfpL-aacC1-pfpR amplicon and with BglII and NdeI for the fbpL-aphA-fbpR PCR product, and ligated into the digested plasmid pCM433 using T4 DNA ligase (NEB) at 16°C overnight. The resulting ligation products were directly introduced into E. coli S17-1 by electroporation using standard protocols (Sambrook & Russell 2001). E. coli cells containing the desired plasmid constructs were selected on antibiotic Luria-Bertani (LB) plates. E. coli carrying plasmid pCM433 fbpL::aphA1::fbpR grew on LB containing kanamycin (Km, 50 µg/ml), which indicated that in E. coli the aphA1 gene was expressed from the fbp promoter (Pfbp) present in the fbp left flank. Plasmid pCM433 pfpL::aacC1::pfpR could not be selected with gentamicin (Gm, 10 µg/ml), suggesting that the A. vinosum pfp promoter (Ppfp) is not recognized in E. coli. Instead, plasmid selection in E. coli was carried out on tetracycline (10 µg/ml). Colonies that grew on selective plates were checked for plasmids by PCR and sequencing. In this study, Q5 high-fidelity polymerase (NEB) was used for cloning. For verification and sequencing, PCR reactions were carried out with OneTaq polymerase (NEB).

Once confirmed, the resulting allelic exchange plasmids were introduced into A. vinosum through conjugation. A. vinosum (1 ml) was harvested during mid-log phase (approximate optical density (OD) at 690 nm of 1.4) by centrifugation at 9,300 g for 5 min at room temperature (RT). The pellets were washed twice in 500 µl RCV medium (see below). Following the wash, cells were resuspended in 500 µl of RCV. Fresh E. coli S17-1 colonies containing allelic exchange plasmids were scraped from the plates and resuspended in 3 ml RCV. A volume containing 4x10^8 E. coli donor cells (assuming OD600 0.1 = 10^8 cells/ml) was mixed with A. vinosum contained in 500 µl RCV. This mixture was centrifuged at 9,300 g for 5 min at RT. The pellet was resuspended in 50 ml RCV and pipetted onto sterile 0.45 µm nitrocellulose filters (Millipore) placed on non-selective RCV agarose plates. The plates were incubated for 4 to 10 days at 30°C anaerobically under light. Next, bacteria on the filters were resuspended in 1 ml sterile RCV and plated on RCV Phytagel plates (see below) containing antibiotics. To select against E. coli, 50 µg/ml rifampicin (Rif) was used. ∆fbp::aphA recombinants were selected on 10 µg/ml Km. ∆pfp::aacC1 mutants were identified on plates containing 5 µg/ml Gm. To obtain only double-crossover recombinant knockouts, A. vinosum single-crossover mutants with an integrated allelic exchange plasmid were selected against using 10% w/v sucrose. Sucrose is toxic to cells containing the sacB gene found on plasmid pCM433 (Marx 2008) and has been effectively used for counter-selection in A. vinosum (Grimm et al. 2011). During selection, NaCl was omitted from RCV medium as it has been reported to interfere with the selection process in diverse bacteria (Kunst & Rapoport 1995; Logue et al. 2009; Suckow et al. 2011). A. vinosum colonies, which grew on average 10 days after plating, were restreaked multiple times and screened by PCR and sequencing using primers from Appendix 3 Table A3.4.
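The donor-cell volume used for conjugation follows from the stated conversion between OD600 and cell density. A minimal sketch of that calculation is given below; it assumes that cell density scales linearly with OD600 over the measured range, and the function name and the example reading are hypothetical.

```python
CELLS_PER_ML_AT_OD_0_1 = 1e8   # stated assumption: OD600 of 0.1 = 10^8 cells/ml
TARGET_DONOR_CELLS = 4e8       # E. coli donor cells per conjugation


def donor_volume_ml(od600: float) -> float:
    """Volume (ml) of E. coli suspension holding the target number of donor cells."""
    if od600 <= 0:
        raise ValueError('OD600 must be positive')
    # Linear scaling of cell density with OD600 (an assumption of this sketch).
    cells_per_ml = od600 / 0.1 * CELLS_PER_ML_AT_OD_0_1
    return TARGET_DONOR_CELLS / cells_per_ml


# Hypothetical reading: a suspension measured at OD600 = 2.0.
print(f'{donor_volume_ml(2.0):.2f} ml')
```

For a suspension at OD600 = 2.0 this gives 0.20 ml. Dense suspensions may need to be diluted before measurement for the linear assumption to hold.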

Growth conditions

RCV liquid medium consisted of 3 solutions. Solution A was prepared by dissolving 60 g malate, 24 g NH4Cl, 4 g MgSO4 x 7H2O, 1.4 g CaCl2 x 2H2O, and 20 ml SL12 solution (Overmann et al. 1992) in 1000 ml 18.2 Mohm H2O. SL12 solution contained 3 g EDTA-Na2 x 2H2O, 1.1 g FeSO4, 300 mg H3BO3, 190 mg CoCl2 x 6H2O, 50 mg MnCl2 x 4H2O, 42 mg ZnCl2, 24 mg NiCl2 x 6H2O, 18 mg Na2MoO4 x 2H2O, and 2 mg CuCl2 dissolved in this order in 1000 ml 18.2 Mohm H2O. The pH of the solution was adjusted to 2-3 with HCl. SL12 was filter-sterilized and stored in the dark at 4°C. Feeding solution B contained 1.55 g NaSH x H2O in 50 ml 18.2 Mohm H2O degassed with N2. Solution A was sterilized through a 0.2 µm filter and stored at 4°C in the dark. Solution B was autoclaved in half-full crimp top vials sealed with a butyl rubber stopper. To prepare the medium, 27.5 ml of solution A, 275 mg of yeast extract, and 990 mg of NaOH were added to 500 ml of 18.2 Mohm H2O. The pH of the medium was brought to 7.0 with NaOH. Afterwards, 32.4 ml of 180 mM KPO4 buffer were added together with 1,165 µl of 1M sodium acetate and 10% thiosulfate. The medium was filter-sterilized (0.2 µm) and bubbled with N2 under sterile conditions for 60 min. Then, 1,514 µl of the feeding solution B were added. The medium was aliquoted into 9 ml anaerobic vials and stored in the dark at least 12 hours prior to inoculation. Background levels of sulfide in the feeding solution made the medium anaerobic. To prevent contamination, all cultures were grown in the presence of 15 mg/ml Rif.
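Final concentrations in the assembled medium follow from a simple dilution, C_final = C_stock x V_stock / V_total. The sketch below applies this to the phosphate buffer as an example. It rests on a simplifying assumption that is not stated in the recipe: the total volume is taken as the sum of the water, solution A, and the buffer, and the small acetate, thiosulfate, and feeding solution additions are ignored. The function name is illustrative.

```python
from typing import Sequence


def final_concentration_mm(stock_mm: float, stock_ml: float,
                           other_volumes_ml: Sequence[float]) -> float:
    """Simple dilution: C_final = C_stock * V_stock / V_total."""
    total_ml = stock_ml + sum(other_volumes_ml)
    if total_ml <= 0:
        raise ValueError('total volume must be positive')
    return stock_mm * stock_ml / total_ml


# 32.4 ml of 180 mM phosphate buffer added to 500 ml water + 27.5 ml solution A.
# Smaller additions are ignored here (an assumption of this sketch).
phosphate_mm = final_concentration_mm(180.0, 32.4, [500.0, 27.5])
print(f'approximate phosphate concentration: {phosphate_mm:.1f} mM')
```

Under these assumptions the phosphate concentration comes to roughly 10.4 mM. Including the smaller additions would lower the estimate slightly.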


To grow A. vinosum strains on plates, RCV medium was supplemented with 1% Phytagel (Sigma Aldrich) and 85 mM NaCl to aid gelation. The medium containing Phytagel and solution A was autoclaved. Filter-sterilized solutions of 1M sodium acetate and 10% thiosulfate, feeding solution B, and 180 mM phosphate buffer pH 7.0, all pre-warmed to 42°C, were added when the autoclaved solution had cooled down to approximately 62°C. This was done to avoid precipitation of salts at higher temperatures. When 1.5% agarose was used instead of Phytagel, sodium acetate, thiosulfate, and feeding solution B were excluded. Antibiotics were added when the assembled medium had cooled down to 55°C. Once solidified, the plates were stored overnight under an oxygen-free atmosphere prior to inoculation. The inoculated plates were incubated in GasPak™ BBL™ jars (BD) between two 60W incandescent lightbulbs placed 20 cm away from the surface of the jars. The lightbulbs maintained plate temperature at approximately 30°C.

To measure growth of A. vinosum under heterotrophic conditions, liquid RCV medium was prepared in a 500 ml spinner flask (Bellco Glass) with two 45 mm side arms and a 70 mm center neck. One side arm was closed with a butyl rubber stopper (Ochs) and fitted with two needles. These needles were capped with air-tight stopcocks with luer lock valves connected to 0.2 µm sterile filters for degassing. The needles were also used for withdrawing culture samples from the bioreactor. The second arm of the spinner flask was fitted with an air-tight sampler which continuously circulated medium through an attached glass 1 mm cuvette using a peristaltic pump (Teledyne ISCO), flow rate 10x50. Flexible tubing and O-rings of the sampler were made from Viton™ fluoroelastomer (Chemours). Rigid tubing running into the bioreactor and the glass cuvette was made from polyether ether ketone (PEEK). The butyl rubber stopper holding the tubes in the cuvette was sealed using Marine-Tex epoxy (ITW Engineered Polymers). The connections between the PEEK and Viton™ tubing as well as the insertion points of the PEEK tubing into the flask through a custom laser-cut butyl rubber gasket, held in place by an open top screw cap (Corning), were secured with Swagelok stainless steel fittings (Swagelok) coated on the inside with fluoroelastomer. The center neck of the flask was closed with a screw cap lined with a butyl rubber gasket. All of the connections between glass and rubber were sealed with silicone grease (Cole-Parmer). This setup withstood multiple rounds of autoclaving at 121°C for 20 min. OD of the culture was monitored at 690 nm using a UV-1601 spectrophotometer (Shimadzu) every 10 minutes, automated and recorded with UVProbe software (Shimadzu). The culture was gently stirred with a polytetrafluoroethylene (PTFE)-coated stirrer. Heterotrophic growth experiments were performed at least in duplicate.

To measure growth kinetics of A. vinosum ∆fbp ∆pfp, cultures were grown in RCV medium in 9 ml gas-tight vials. Since the double mutant did not grow on bicarbonate, malate, or acetate, these cultures were supplemented with 1% w/v of either D-fructose, D-glucose, sucrose, rhamnose, or glucuronic acid. OD at 690 nm was measured directly in the vials daily for over 2 months. The experiment was carried out at least in triplicate.

To study growth of A. vinosum cultures under autotrophic conditions, bacteria were inoculated into Pfennig's medium (Imhoff 2006). Sulfide and bicarbonate served as sole energy and carbon sources, respectively. To prepare the base of the medium, 0.33 g KCl, 0.33 g MgCl2 x 6H2O, 0.43 g CaCl2 x 2H2O, 0.33 g NH4Cl, 0.33 g KH2PO4, and 1 ml SL12 solution were dissolved in 900 ml 18.2 Mohm H2O and filter-sterilized (0.2 µm). Solutions of 17.85 mM NaHCO3 and 713.5 mM NaSH x H2O, 100 ml each, were prepared in 18.2 Mohm H2O degassed with N2 for 45 min, sealed with butyl rubber stoppers in half-filled crimp top vials, and autoclaved for 15 min (liquid cycle). For adjusting pH of the medium during growth, 0.5 M HCl and 0.5 M NaOH were filter-sterilized and degassed with sterile N2 for 45 min.

Growth of A. vinosum cultures under autotrophic conditions was carried out in a pH-controlled bioreactor with feedback-controlled sulfide feeding. The setup was analogous to the bioreactor for heterotrophic growth, with some notable differences. The spinner flask had a larger center neck (100 mm) to fit double junction pH (Cole-Parmer) and sulfide (Weiss Research) electrodes, a temperature sensor (Omega), as well as lines for titrating acid and base. Prior to assembling the medium inside the bioreactor, the spinner flask with the tubing for measuring OD, pH electrode, temperature sensor, acid and base feeding lines, and degassing/sampling needles was autoclaved. The sulfide electrode was sterilized in 7.5% H2O2 stabilized with 0.85% H3PO4, followed by UV treatment, and installed into the autoclaved bioreactor. The filter-sterilized base medium was added and bubbled with N2 for 2 hours. Next, 17.85 mM bicarbonate (100 ml) was added aseptically. pH was automatically adjusted to 7.0 using an Apex controller (Neptune). The sulfide electrode was connected to a Chemcadet mV controller (Cole-Parmer). The mV output from the controller was recorded using a Yocto-milliVolt-RX-BNC precision voltmeter (Yoctopuce) connected to a Raspberry Pi 3 (Raspberry Pi Foundation). The assembled medium was supplemented with 0.25 mM sulfide and kept overnight with gentle stirring.

Prior to inoculation, the sulfide electrode was calibrated by adding known amounts of sulfide in 0.05 mM increments into the bioreactor and quantifying the amount with the Cline method (Cline 1969). Briefly, 1 ml samples were combined with an equal volume of 5.2% Zn-acetate to sequester sulfide. Aliquots (270 µl) were next incubated with 30 µl of the Cline reagent, containing 0.5 g N,N-dimethyl-p-phenylenediamine sulfate and 0.75 g FeCl3 x 6H2O in 25 ml 50% cool HCl, for 30 min in a 96-well plate (Greiner Bio-One) in triplicate. Absorption was measured at 670 nm using a Tecan Infinite m200 spectrophotometer (Tecan). The amount of sulfide in the samples was determined using a standard curve.
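The standard-curve step can be illustrated with a minimal Python sketch. It is not the analysis code used in this study: the function names and all absorbance values are hypothetical, a linear response at 670 nm is assumed, and the standards are assumed to have been processed exactly like the samples, so that the Zn-acetate dilution cancels out.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def sulfide_from_absorbance(a670_replicates, slope, intercept):
    """Convert the mean A670 of replicate wells to a sulfide concentration (mM)."""
    mean_a670 = sum(a670_replicates) / len(a670_replicates)
    return (mean_a670 - intercept) / slope


# Hypothetical standards in 0.05 mM increments with invented absorbances.
standards_mM = [0.0, 0.05, 0.10, 0.15, 0.20, 0.25]
standards_a670 = [0.02, 0.12, 0.22, 0.32, 0.42, 0.52]

slope, intercept = fit_line(standards_mM, standards_a670)

# Hypothetical triplicate wells from one bioreactor sample.
sample_mM = sulfide_from_absorbance([0.31, 0.32, 0.33], slope, intercept)
print(f"slope={slope:.3f} A670/mM, intercept={intercept:.3f}, sample={sample_mM:.3f} mM")
```

Pairing each concentration obtained this way with the electrode mV reading taken at the same time would give the calibration points for the sulfide electrode.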

To inoculate Pfennig's medium, a pre-culture in mid-log phase was used. The inoculum was obtained by seeding liquid RCV medium with single colonies of A. vinosum. To prevent carry-over of dissolved organic carbon into the autotrophic medium, the pre-culture was collected on a 0.45 µm filter. Bacteria on the filter were suspended in 20 ml Pfennig's medium aliquoted from the bioreactor into a sterile crimp top vial sparged with N2. This bacterial suspension was used to inoculate the bioreactor to an OD690 of 0.07. Following inoculation, the sulfide-feeding line was connected to one of the two needles leading into the bioreactor. When the sulfide concentration in the bioreactor fell below approximately 0.3 mM, the Chemcadet controller engaged a peristaltic pump (Teledyne ISCO), flow rate 1x20, which dispensed sulfide until the concentration inside the bioreactor increased to no more than 0.5 mM. OD690 of the culture was measured every 10 minutes (30 minutes for A. vinosum ∆fbp ∆pfp) until cultures entered stationary growth phase. Light intensity (42,000 Lux, 400-700 nm) was monitored with a Yocto-Light-V3 (Yoctopuce). Prior to inoculation, all cultures were verified by PCR and sequencing. Autotrophic growth experiments were performed at least in duplicate.
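The feedback logic of the sulfide feed described above amounts to on/off control with two thresholds (hysteresis). In this study it was performed by the Chemcadet controller hardware. The Python sketch below only illustrates the logic: the class name and the sequence of readings are hypothetical, and the electrode signal is assumed to have already been converted from mV to mM.

```python
class SulfideFeedController:
    """Two-threshold (hysteresis) on/off control of a sulfide feed pump."""

    def __init__(self, low_mM=0.3, high_mM=0.5):
        if low_mM >= high_mM:
            raise ValueError("low_mM must be below high_mM")
        self.low_mM = low_mM    # pump switches on below this concentration
        self.high_mM = high_mM  # pump switches off once this is reached
        self.pump_on = False

    def update(self, sulfide_mM):
        """Return the pump state after one concentration reading."""
        if sulfide_mM < self.low_mM:
            self.pump_on = True
        elif sulfide_mM >= self.high_mM:
            self.pump_on = False
        # Between the two thresholds the previous state is kept.
        return self.pump_on


# Hypothetical readings (mM): consumption, feeding, then consumption again.
readings = [0.45, 0.35, 0.29, 0.35, 0.45, 0.50, 0.45, 0.31, 0.28]
controller = SulfideFeedController()
states = [controller.update(r) for r in readings]
print(states)
```

Keeping the previous state between the thresholds prevents the pump from switching rapidly on and off around a single set point, which is consistent with the consumption and supplementation cycles shown in Figure A3.1.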

To measure protein and ATP concentrations, samples were collected at regular OD690 intervals (0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2.0, 2.25, and 2.0). For protein quantification, 1 ml of culture was centrifuged at 13,000 g for 2 min. The pellets were frozen in liquid N2 and stored at -80°C. Aliquots (125 µl) for ATP analysis were frozen immediately following collection and kept at -80°C until analyzed.

Freezer stock of A. vinosum was kept at -80°C in 10% DMSO.

Measuring CO2 fixation rates

CO2 fixation in the bioreactor cultures was determined using 13C-labeled bicarbonate (Cambridge Isotope Laboratories). During autotrophic growth, when the cultures were in log phase, NaH13CO3 was added to a final bicarbonate 13C/12C ratio of 0.17. At regular time intervals (0, 15, 30, 45, 75, 120, 180, 240, and 300 min), 5 ml of culture were filtered in duplicate on GF/F 25 mm glass microfiber filters (Whatman). For heterotrophic conditions, CO2 fixation experiments were carried out in 9 ml vials. Samples were collected at 0, 15, 120, 240, and 300 min.
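As a worked conversion, the 13C/12C ratio of 0.17 stated above corresponds to the following 13C atom fraction of the bicarbonate pool. This assumes that the ratio R refers to the total pool after label addition and that the atom fraction is defined as 13C/(13C + 12C):

$$\frac{{}^{13}C}{{}^{13}C + {}^{12}C} = \frac{R}{1 + R} = \frac{0.17}{1.17} \approx 0.145$$

That is, roughly 14.5 atom% 13C under these assumptions.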


Prior to filtration, GF/F filters were baked at 450°C overnight to remove any residual organic carbon. Filters containing bacterial pellets were fumed with HCl for approximately 12 hours, HCl being changed after 6 hours. During fuming, the filters were kept on a PTFE surface cleaned with methanol. Isotopic signature and CO2 concentration of culture medium prior to adding 13C label were analyzed. Filters and liquid samples were analyzed by stable isotope facilities at , Woods Hole Marine Laboratory, and University of New Mexico.

Samples for protein determination were collected at each sampling time point and analyzed as described above.

For calculating 13C dissolved inorganic carbon (DIC) incorporation rates, the mass balance equation was adapted from Montoya (1996):

$$(A_{PC_f})([PC_f]) = (A_{PC_{control}})([PC_{control}]) + (A_{CO_2})([PC_\Delta])$$

where A equals atom% of particulate carbon (PC; biomass carbon) at the end of incubation (f) and start/natural abundance (control), or of the DIC pool (ACO2); [PCf] equals concentration/amount of PC at end of incubation, [PCcontrol] stands for concentration/amount of PC at start of incubation, and [PCΔ] represents concentration/amount of newly formed PC during incubation, equal to new carbon biomass. To calculate carbon fixation rates (newly formed carbon biomass), the equation was solved for the relative ratio of newly formed biomass as a function of total biomass.

$$\frac{(A_{PC_f}) - (A_{PC_{control}})}{(A_{CO_2}) - (A_{PC_{control}})} = \frac{[PC_\Delta]}{[PC_f]}$$

To determine the absolute carbon fixation rate, the equation was solved for [PCΔ]. The reported rates were calculated per min per mg of total protein.
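The calculation can be sketched in Python as follows. This is an illustration under stated assumptions, not the code used in this study: the function names, the unit (µg C), and every numerical input are hypothetical, the atom% values are passed as fractions, and [PCf] is assumed to equal [PCcontrol] plus [PCΔ].

```python
def atom_fraction_from_ratio(ratio_13c_12c):
    """Convert a 13C/12C ratio to a 13C atom fraction, 13C/(13C + 12C)."""
    return ratio_13c_12c / (1.0 + ratio_13c_12c)


def new_carbon_fraction(a_pc_final, a_pc_control, a_co2):
    """Fraction of final particulate carbon that was newly fixed, [PCdelta]/[PCf]."""
    enrichment_of_source = a_co2 - a_pc_control
    if enrichment_of_source <= 0:
        raise ValueError("DIC pool must be enriched relative to the control")
    return (a_pc_final - a_pc_control) / enrichment_of_source


def fixation_rate(a_pc_final, a_pc_control, a_co2, pc_final, minutes, protein_mg):
    """Newly fixed carbon per min per mg protein (same carbon unit as pc_final)."""
    pc_new = new_carbon_fraction(a_pc_final, a_pc_control, a_co2) * pc_final
    return pc_new / (minutes * protein_mg)


# Hypothetical inputs: atom fractions, 200 ug C on the filter, 60 min, 0.5 mg protein.
fraction = new_carbon_fraction(a_pc_final=0.0244, a_pc_control=0.011, a_co2=0.145)
rate = fixation_rate(0.0244, 0.011, 0.145, pc_final=200.0, minutes=60.0, protein_mg=0.5)
print(f"new carbon fraction={fraction:.3f}, rate={rate:.3f} ug C per min per mg protein")
```

With these invented inputs, one tenth of the final biomass carbon is newly fixed. The rate depends directly on the measured enrichment of the DIC pool, which is why the isotopic signature of the medium was analyzed before adding the label.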


Figure A3.1. Cycles of sulfide consumption and automated supplementation in A. vinosum ∆fbp ∆pfp autotrophic bioreactor culture.


Figure A3.2. Sulfide consumption rates of A. vinosum WT, ∆fbp, and ∆pfp under autotrophic conditions. Shaded areas around mean values indicate SEM (WT N=2, ∆fbp N=3, ∆pfp N=3).


Table A3.1. Calvin cycle enzymes and the corresponding locus tags in the genome of A. vinosum.

| Abbreviation | Name (EC number) | Locus tag |
|---|---|---|
| RuBisCO | Ribulose-bisphosphate carboxylase (EC:4.1.1.39) | Alvin_1365, Alvin_1366 |
| PRK | Phosphoribulokinase (EC:2.7.1.19) | Alvin_0562 |
| PGK | Phosphoglycerate kinase (EC:2.7.2.3) | Alvin_0314 |
| GAPDH | Glyceraldehyde 3-phosphate dehydrogenase (EC:1.2.1.12) | Alvin_0315 |
| FBA | Fructose bisphosphate aldolase, class II (EC:4.1.2.13) | Alvin_0312 |
| FBPase | Fructose 1,6-bisphosphatase (EC:3.1.3.11) | Alvin_0677 |
| PPi-PFK | Phosphofructokinase (EC:2.7.1.11) | Alvin_2908 |
| TK | Transketolase (EC:2.2.1.1) | Alvin_0316 |
| TPI | Triosephosphate isomerase (EC:5.3.1.1) | Alvin_2432 |
| RPI | Ribose 5-phosphate isomerase (EC:5.3.1.6) | Alvin_2900 |
| RPE | Phosphopentose epimerase (EC:5.1.3.1) | Alvin_0272 |


Table A3.2. Growth rates (OD690/min) of A. vinosum in autotrophic and heterotrophic media.

| Medium | WT | ∆fbp | ∆pfp | ∆fbp ∆pfp |
|---|---|---|---|---|
| Autotrophic | 0.00114±0.00009 | 0.00085±0.00004 | 0.00106±0.00007 | 0.000003±0.00000 |
| Heterotrophic | | | | |
| Before diauxic shift | 0.00112±0.00001 | 0.00112±0.00001 | 0.00107±0.00009 | |
| After diauxic shift | 0.00100±0.00001 | 0.00093±0.00001 | 0.00099±0.00009 | |
| Fructose | | | | 0.00032±0.00004 |
| Glucose | | | | 0.00036±0.00004 |
| Sucrose | | | | 0.00000±0.00001 |
| Rhamnose | | | | 0.00000±0.00000 |
| Glucuronate | | | | 0.00000±0.00000 |
| No sugar | | | | 0.00000±0.00000 |


Table A3.3. Bacterial strains and plasmids used in this study.

| Strains and Plasmids | Relevant properties | Source |
|---|---|---|
| Allochromatium vinosum | | |
| Rif50 | DSM 180T RifR; spontaneous rifampicin-resistant mutant | (Lubber et al. 2006) |
| ∆fbp | RifR KmR (∆fbp::aphA) | This study |
| ∆pfp | RifR GmR (∆pfpA::aacC1) | This study |
| ∆fbp ∆pfp | RifR KmR GmR (∆fbp::aphA) (∆pfp::aacC1) | This study |
| Escherichia coli | | |
| E. coli S17-1 | 294 (recA thi pro hsdR- M+) TpR SmR [RP4-2-Tc::Mu-Km::Tn7] | (Simon et al. 1983) |
| Plasmids | | |
| pCM184 | ApR, KmR, TcR; broad-host range cre-lox allelic exchange vector | (Marx & Lindstrom 2002) |
| pCM351 | ApR, GmR, TcR; broad-host range cre-lox allelic exchange vector | (Marx & Lindstrom 2002) |
| pCM433 | ApR, CmR, TcR; broad-host-range sacB-based allelic exchange vector | (Marx 2008) |
| pCM433 fbpL::aphA::fbpR | ApR, CmR, TcR, KmR; sacB-based vector for in-frame deletion of fbp | This study |
| pCM433 pfpL::aacC1::pfpR | ApR, CmR, TcR, GmR; sacB-based vector for in-frame deletion of pfp | This study |


Table A3.4. Primers used in this study.

| Primer | Sequence | Source |
|---|---|---|
| aacCGTG1F | GTGTTACGCAGCAGCAAC | This study |
| aacC534R | TTAGGTGGCGGTACTTGG | This study |
| pfpNdeILF | ATTGCACATATGTACCAAGACCTATGTGCGTC | This study |
| pfpaacC1LR | GTTGCTGCTGCGTAACACGATGATGTCCTCGATTCGTTTC | This study |
| pfpaacC534RF | CCAAGTACCGCCACCTAAGGATTGAGATTATCAAGGTCAATTTGG | This study |
| pfpXhoIRR | TTAGACCTCGAGCCCTTGTAGAGCTTGTCGAT | This study |
| aphA1F | ATGAGCCATATTCAACGGGA | This study |
| aphA816R | TTAGAAAAACTCATCGAGCATCA | This study |
| fbpBglIILF | TTAGACAGATCTTCATGAAGGATTGGCCGATC | This study |
| fbpaphALR | TCCCGTTGAATATGGCTCATACTGCTCCTCGTCTCGAAT | This study |
| fbpaphARF | TGATGCTCGATGAGTTTTTCTAAGGCGAGGGTCTGCGAG | This study |
| fbp367NdeIRR | ATTGCACATATGTAACGGGGTGCATTGAGGAA | This study |
| pfpEcGTG1F | GTGATTAAGAAAATCGGTGTGTTGA | This study |
| pfpEcSD963R | TGCTGCTCCATAACATCAAACTTAATACAGTTTTTTCGCGCAGT | This study |
| aacCSDF | GTTTGATGTTATGGAGCAGCA | This study |
| pfpXhoILF | ATTGCACTCGAGTACCAAGACCTATGTGCGTC | This study |
| pfpEcpfpLR | TCAACACACCGATTTTCTTAATCACGATGATGTCCTCGATTCGTTTC | This study |
| pfpSacIRR | TTAGACGAGCTCCCCTTGTAGAGCTTGTCGAT | This study |
| pfp_Vin_LF_verify | TGTCTCATGAGCGGATACAT | This study |
| pCMR | GCTTGAGCGTGACAATCA | This study |
| AVpfp240F | CTACAAGCTCAAGAGTCTGGAA | This study |
| AVpfp1155R | GATGAGCGGGATCAGATACTG | This study |
| AV_fbp_1F_NdeI | ATTGCATCATATGCACAACGGTACCAG | This study |
| AV_fbp_1014R_SacI | TTAGACGAGCTCTCAATCCGGCTGATGATACC | This study |
| AV_pfp_chrom_F | GGTCACATCCATTTCGACAG | This study |
| AV_pfp_chrom_R | ACCCGATCCGTTCTTAGAAC | This study |
| 1720F_AV_chrom | CGTCACGTCACACATTTCTA | This study |
| fbp2557chromR | GTTTCATGCAGGTCTCCTTA | This study |
| 27F | AGAGTTTGATCMTGGCTCAG | (Field et al. 1997) |
| 1492R | TACGGYTACCTTGTTACGACTT | (Field et al. 1997) |


References

Cline, J., 1969. Spectrophotometric determination of hydrogen sulfide in natural waters. Limnology and Oceanography, 14(3), pp.454–458.

Grimm, F., Franz, B. & Dahl, C., 2011. Regulation of dissimilatory sulfur oxidation in the purple sulfur bacterium Allochromatium vinosum. Frontiers in Microbiology, 2(51), pp.1–11.

Imhoff, J.F., 2006. The Chromatiaceae. In M. Dworkin et al., eds. The Prokaryotes. Berlin Heidelberg: Springer New York, pp. 846–873.

Kunst, F. & Rapoport, G., 1995. Salt stress is an environmental signal affecting degradative enzyme synthesis in Bacillus subtilis. Journal of Bacteriology, 177(9), pp.2403–2407.

Logue, C.-A., Peak, I.R.A. & Beacham, I.R., 2009. Facile construction of unmarked deletion mutants in Burkholderia pseudomallei using sacB counter-selection in sucrose-resistant and sucrose-sensitive isolates. Journal of Microbiological Methods, 76(3), pp.320–323.

Marx, C.J., 2008. Development of a broad-host-range sacB-based vector for unmarked allelic exchange. BMC Research Notes, 1(1), pp.1–8.

Montoya, J.P. et al., 1996. A simple, high-precision, high-sensitivity tracer assay for N2 fixation. Applied and Environmental Microbiology, 62(3), pp.986–993.

Overmann, J., Fischer, U. & Pfennig, N., 1992. A new purple sulfur bacterium from saline littoral sediments, Thiorhodovibrio winogradskyi gen. nov. and sp. nov. Archives of Microbiology, 157, pp.329–335.

Sambrook, J. & Russell, D.W., 2001. Molecular Cloning: A Laboratory Manual, 3rd ed., Cold Spring Harbor Laboratory Press.

Suckow, G., Seitz, P. & Blokesch, M., 2011. Quorum sensing contributes to natural transformation of Vibrio cholerae in a species-specific manner. Journal of Bacteriology, 193(18), pp.4914–4924.
