The Harvard community has made this article openly available.

The enigmatic Calvin cycle of chemoautotrophic bacterial symbionts

A dissertation presented


Oleg Dmytrenko


The Department of Organismic and Evolutionary Biology

in partial fulfillment of the requirements

for the degree of

Doctor of Philosophy

in the subject of


Harvard University Cambridge, Massachusetts

April 2018

© 2018 Oleg Dmytrenko All rights reserved.

Dissertation Advisor: Professor Colleen M. Cavanaugh Oleg Dmytrenko

The enigmatic Calvin cycle of chemoautotrophic bacterial symbionts


Symbiosis is a major driving force of biological diversity. Organisms within mutualistic symbiotic associations are capable of occupying new ecological niches which would otherwise have been inaccessible to the individual partners. Symbioses between chemoautotrophic bacteria and marine invertebrates are an example of such partnerships, in which hosts benefit from organic carbon supplied by their symbiotic bacteria, while the symbionts profit from a steady supply of reduced inorganic compounds and electron acceptors sequestered and delivered to them by their hosts. These symbioses are able to, for instance, create lush oases of life surrounding hydrothermal vents in the deep sea amid otherwise barren surroundings. Perhaps the most enigmatic feature shared by all chemoautotrophic symbionts within the class Gammaproteobacteria is the lack of a gene encoding fructose 1,6-bisphosphatase (FBPase), an enzyme which in bacteria catalyzes two essential reactions in the Calvin-Benson-Bassham (Calvin) carbon fixation cycle. Yet chemoautotrophic bacterial symbionts are not only able to fix CO2 using the Calvin cycle, but are among the most prolific primary producers in the ocean. It has been hypothesized that a glycolytic pyrophosphate-dependent phosphofructokinase (PPi-PFK), acting in reverse, can perform the function of the missing FBPase in these bacteria. To test this hypothesis, in my thesis I investigated the ability of PPi-PFK from the symbionts of the coastal protobranch bivalve Solemya velum to perform the biochemical function of the missing FBPase. I detected high expression of the symbiont PPi-PFK-encoding gene and high reverse PPi-PFK activity in the symbiont-containing tissue of the host. The recombinant enzyme from the S. velum symbiont had the highest specificity for the reverse reaction compared to other bacterial PPi-PFKs and higher catalytic efficiency than many bacterial FBPases. By recreating the symbiont-like Calvin cycle in a closely related free-living purple sulfur gammaproteobacterium, Allochromatium vinosum, I demonstrated that in the absence of FBPase its function in the cycle can be performed by PPi-PFK. The shift from FBPase to PPi-PFK in A. vinosum came at the cost of reduced growth and decreased adaptability but offered an improvement in thermodynamic efficiency, potentially owing to the high-energy pyrophosphate generated by PPi-PFK in the Calvin cycle. Data presented in my thesis show that the selection of PPi-PFK over FBPase took place in all lineages of gammaproteobacterial chemoautotrophic symbionts. My results also demonstrate that PPi-PFK can perform the biochemical function of FBPase and may have become specifically adapted to this function in the symbionts. The feasibility of a Calvin cycle which uses PPi-PFK instead of FBPase was demonstrated in A. vinosum. The observed physiological changes accompanying the shift from FBPase to PPi-PFK in this bacterium suggest that such a transition could be advantageous to the symbionts. Living in a relatively constant and isolated host environment, these specialist bacteria may be selected for the thermodynamic efficiency which accompanies PPi-PFK use. Free-living bacterial generalists, on the other hand, would be severely disadvantaged by the associated decline in growth rate and adaptability, as they would become less fit to outgrow their competition and slower at adapting to fluctuations in environmental conditions. A proposed link between PPi-PFK reverse activity in the Calvin cycle and a sulfur oxidation pathway in chemoautotrophic symbionts may explain why a shift to PPi-PFK has not occurred in photoautotrophic symbionts and plastids, which obtain their energy from light instead of sulfide. These results advance our understanding of the key metabolic processes and evolutionary forces responsible for the origin and maintenance of chemoautotrophic symbioses.




I would like to thank Colleen Cavanaugh and all the current and former members of the

Cavanaugh lab for their help and support throughout my PhD years. I am grateful for their guidance in developing my research ideas and methodologies, sharing many fun moments in the laboratory, and, of course, commiserating. In particular I would like to thank Colleen for trusting me with finding a research project. I am very grateful for her continuous enthusiasm and support of my research and uncompromising scientific rigor. I would like to thank Kristina

Fontanez and Guus Roeselers for getting me started with stimulating projects in the laboratory early in my graduate career. Finally, I am thankful to Alicja Kunikowska and Daniel Utter for providing invaluable feedback on my thesis chapters.

I deeply appreciate the time, commitment, support, and guidance from my committee members, Edward DeLong, Peter Girguis, Hopi Hoekstra, and Christopher Marx. The extensive expertise and wisdom they brought to our meetings was tremendously helpful in refining my research direction, contextualizing data, and exploring new ways of answering nascent questions. I am deeply indebted to Peter Girguis for his scientific and career advice, for letting me use his lab equipment, and helping me navigate the graduate program. I would like to thank

Christopher Marx for helping me take my first steps in reverse . I owe big thanks to

Edward DeLong for letting me into his lab and providing funding and resources to study gene expression in the Solemya velum symbionts. I am incredibly thankful to Hopi Hoekstra for joining my committee well into my PhD and bringing along valuable evolutionary and genetic insights.

I would like to also thank everyone who has helped me learn and develop new experimental techniques and methodologies. I am particularly grateful to Frank Stewart who supervised my S. velum symbiont transcriptome study in the laboratory of Edward DeLong at

Massachusetts Institute of Technology. Molecular genetic experiments with Allochromatium


vinosum were made possible in large part due to generous advice as well as bacterial strains provided by Christiane Dahl and Renate Zigann. I would also like to thank Dipti Nayak, Nicole

De Nisco, Paige Swanson, and Anna Wang for sharing their plasmid stocks with me. A great many experimental measurements in my thesis were made possible thanks to Matthew

Meselson, who has generously given me access to his laboratory. Stable isotope experiments would not have been possible without advice from Wiebke Mohr, Tiantian Tang, and Daniel

Hoer. Sequencing and analysis of the S. velum symbiont genome was a large team effort which came to fruition thanks to contributions from Shelbi Russell, Wesley Loo, Kristina Fontanez, Li

Liao, Guus Roeselers, Irene Newton, Frank Stewart, John Eppley, Tanja Woyke, Jenna Morgan

Lang, Raghav Sharma, Donhying Wu, and Jonathan Eisen.

A great many people, including Chris Preheim, Lydia Carmosino, and Elena Kramer, deserve thanks for guiding me through the graduate program and making it such a positive and memorable experience. I would especially like to thank Elena Kramer, Peter Girguis, and

Rebecca Chetham for securing funding I needed to complete my thesis research. For excellent teaching experience in their courses I would like to express my gratitude to Joshua Sanes, Jeff

Lichman, Maryellen Ruvolo, Pardis Sabeti, Hopi Hoekstra, and Andrew Berry. For administrative, technical, and moral support I am particularly indebted to Madeleine Marino,

Nikki Hughes, Bridget Power, Jason Green, and Kendall Winters.

I owe big thanks to Shelbi Russell, Guus Roeselers, Chris Baker, Mark Comerford, and

Alicja Kunikowska for helping me collect specimens of Solemya velum.

I greatly appreciate having been part of the Quincy House community throughout my time at

Harvard. The students, the tutors, and the resident deans, Lee and Deb Gehrke, enriched my life in myriad ways as I served as a resident tutor in Quincy.

I am infinitely grateful to my family for their love, incredible patience, understanding, and support throughout my graduate studies. I admire my wife, Alicja Kunikowska, for her inquisitive


mind, impeccable work ethic, and sense of humor. Thank you for being there for me. My sister,

Olga Dmytrenko, has been an unwavering source of support and a tireless travel companion.

My father, Volodymyr Dmytrenko, has taught me to persevere even under the toughest of circumstances. My mother, Svitlana Dmytrenko, has always encouraged my scientific interests, shared my curiosity for biology and chemistry, taught me the value of hard work, and has been a true inspiration. Without all of you I would have never made it this far.



Symbiosis was first defined by the botanist and mycologist Anton de Bary in 1878 as "the living together of differently named organisms" (de Bary 1878). This original definition of symbiosis included a wide range of associations as far apart as parasitism and mutualism, but in contemporary literature the term is most commonly used to describe an interaction which benefits both the host and the symbiont and persists over their lifetime. Symbioses have been instrumental in the evolution of eukaryotic cells according to the endosymbiotic theory of the origin of mitochondria and plastids (Mereschkowsky 1905; Sagan 1967; de Duve 2007) from a purple non-sulfur bacterium (John & Whatley 1975; Gray et al. 1999; Andersson et al. 2003; Cavalier-Smith 2006) and a cyanobacterium (Cavalier-Smith 1982; McFadden & van Dooren 2004; Bhattacharya et al. 2007), respectively. Symbiosis is one of the major driving forces of biological diversity on Earth. It allows its partners to occupy otherwise inaccessible ecological niches. Associations between plants and bacteria are able to colonize terrestrial environments poor in ammonium and nitrate, the biologically available forms of nitrogen (Gibson et al. 2008). For ruminants and termites, partnerships with cellulose-digesting bacteria enable access to limited nutrients through feeding on plants (Krause et al. 2013; Brune 2014). Symbioses between chemolithoautotrophic bacteria and marine invertebrates colonize habitats in the deep sea, creating oases of life in contrast to their barren, food-limited benthic surroundings (Cavanaugh et al. 2013).

Chemoautotrophic symbioses

Symbioses between chemoautotrophic bacteria and marine invertebrates are ubiquitous in environments featuring gradients of reduced inorganic molecules such as sulfide (Felbeck et al. 1981; Cavanaugh 1983), methane (Cavanaugh et al. 1992), or hydrogen (Petersen et al. 2011) and oxygen (Dubilier et al. 2008; Cavanaugh et al. 2013). Chemoautotrophic symbionts, which primarily belong to the class Gammaproteobacteria, oxidize reduced sulfur compounds with oxygen and use this energy to fix inorganic carbon into biomass primarily through the Calvin-Benson-Bassham (Calvin) cycle (Stewart et al. 2005; Childress & Girguis 2011). The resulting organic carbon is supplied to the host either through secretion (Fisher & Childress

1986; Bright et al. 2000; Ponsard et al. 2013) or digestion of the symbionts (Fisher & Childress

1992). Some symbionts also supply their hosts with nitrogenous compounds (Lee et al. 1999).

The reliance on symbionts for nutrition is so prominent that many host organisms do not have their own gut and are incapable of filter-feeding (Reid & Bernard 1980; Krueger et al. 1992; Cavanaugh et al. 2013). Symbiont hosts have evolved behavioral, physiological, and biochemical adaptations for capturing energy substrates and electron acceptors and delivering them to the symbionts. Some invertebrates, for example, Solemya velum, build Y-shaped burrows in reduced coastal sediments and position themselves at the junction to gain access to the overlying oxygenated water and sulfide from below (Stanley 1970; Cavanaugh 1983;

Roeselers & Newton 2012). Certain of symbiotic , nematodes, and ciliates migrate along the gradient of electron acceptors and donors (Polz et al. 2000). Some hosts also possess hemoglobin molecules which are capable of reversibly binding sulfide and oxygen for delivery to the symbionts (Doeller et al. 1988; Hourdez & Weber 2005;

Bailly & Vinogradov 2005; Flores et al. 2005). Since the discovery of hydrothermal vents

(Corliss & Ballard 1977; Corliss et al. 1979) and chemoautotrophic symbioses subsequently at the Galápagos Rift (Cavanaugh et al. 1981; Felbeck et al. 1981), basic principles governing these partnerships between bacteria and eukaryotes have been established. Yet, the life cycle of the bacterial symbionts, their recruitment by the host, and communication between the partners remain primarily unexplored. In comparison with our understanding of non-symbiotic


model organisms, such as Escherichia coli, Caenorhabditis elegans, Danio rerio, Drosophila melanogaster, or Mus musculus, we have barely begun.

The study of chemoautotrophic symbionts is severely limited by our current inability to grow symbiotic bacteria in pure culture outside of the host and to maintain symbiont-free hosts.

Tools for genetically manipulating symbionts and their hosts are still to be developed. Thus, chemoautotrophic symbioses have been primarily studied using indirect methods. Location and morphology of symbionts are investigated using light (Ponsard et al. 2013; Eichinger et al.

2014), confocal (Bettencourt et al. 2014; Volland et al. 2018), and electron microscopy

(Conway, Howes, et al. 1992; Gros et al. 2012; Klose et al. 2016). Measurements of enzymatic activity in tissue extracts and characterization of recombinant enzymes have provided biochemical evidence for metabolic processes involved in the functioning of symbioses. One of the first enzymes detected in chemoautotrophic symbionts was ribulose 1,5-bisphosphate carboxylase-oxygenase (RuBisCO) (Felbeck et al. 1981; Cavanaugh 1983), which catalyzes the key CO2 incorporation step in the Calvin cycle. Activity of enzymes such as ATP sulfurylase (SAT) was detected in symbionts inferred to derive ATP and reducing equivalents from sulfide oxidation (Felbeck et al. 1981; Chen et al. 1987; Fisher et al. 1993). Methanol dehydrogenase activity, central to methane-oxidizing metabolism, was detected in cell-free extracts containing methylotrophic symbionts of Bathymodiolus platifrons (Barry et al. 2002). In comparison, heterologous expression and subsequent characterization of symbiont genes in E. coli has been less common (Millikan et al. 1999; Schwedock et al. 2004). Using pulse-chase tracer experiments, it has been possible to postulate metabolic pathways and metabolite fluxes occurring within symbioses (Felbeck 1983; Felbeck & Turner 1995; Volland et al. 2018).

Analysis of stable carbon, nitrogen, and sulfur isotopes has been a powerful tool in uncovering metabolic networks and trophic levels (Rau et al. 1990; Conway, Capuzzo, et al. 1992; Lee &

Childress 1994; Robinson et al. 2003). Experimental manipulations of symbionts have involved


controlled laboratory incubations of whole organisms (Girguis et al. 2002), preparation of cell-free extracts (Vacelet et al. 1996; Lee et al. 1999), symbiont enrichments (Scott & Cavanaugh 2007), and measurements of select metabolites (Liao et al. 2013). PCR amplification and sequencing of symbiont and host genes have become commonplace (Laue & Nelson 1994;

Robinson et al. 1998; Stewart et al. 2008; Russell et al. 2017), now more frequently superseded by whole genome and metagenome sequencing (Woyke et al. 2006; Newton et al. 2007;

Robidart et al. 2008; Dmytrenko et al. 2014). Genomic studies have been complemented by transcriptomics (Stewart et al. 2011; Sanders et al. 2013; Seston et al. 2016), proteomics

(Markert et al. 2011), and metabolomics (Kleiner et al. 2012), allowing direct investigation of predicted functional capabilities. In recent years, sequence data from diverse symbiotic associations have increased vastly. Analysis of these data brought about a surge in hypotheses about the function and activity of chemoautotrophic symbioses. For example, based on metagenomic data, multiple toxin-like genes have been hypothesized in the symbionts of deep-sea Bathymodiolus (Sayavedra et al. 2015) and recycling of host urea has been proposed in the symbionts of Olavius algarvensis (Woyke et al. 2006). However, the majority of these hypotheses remain untested due to a lack of suitable tools which can be robustly applied to symbiotic systems.

Novel Calvin cycle in chemoautotrophic bacterial symbionts

In my thesis I set out to identify a key hypothesis in the field of chemoautotrophic symbiosis using genomic and transcriptomic data. Next, I developed a means of testing this hypothesis in a way that overcame the limitations imposed by our current inability to grow and genetically manipulate symbionts in pure culture. Finally, I interpret the combined results in the context of our current understanding of the evolution and functioning of chemoautotrophic symbioses.

As a model symbiotic system, I chose to study the symbiosis between a protobranch bivalve, Solemya velum, and its gammaproteobacterial chemoautotrophic endosymbiont

(Cavanaugh 1983), which is closely related to other chemoautotrophic symbionts and a number of well-characterized free-living bacteria, including Allochromatium vinosum (Weissgerber et al. 2011). The S. velum symbiosis is one of the best studied chemoautotrophic symbioses, with well-characterized physiology and ecology (Stewart & Cavanaugh 2006; Scott & Cavanaugh 2007;

Russell & Cavanaugh 2017). Using energy from the oxidation of sulfide, the symbionts are known to fix CO2 using ribulose 1,5-bisphosphate carboxylase oxygenase (RuBisCO), the key enzyme in the Calvin cycle, and are thought to feed their host with the resulting organic carbon

(Cavanaugh 1983; Conway & McDowell Capuzzo 1991; Scott & Cavanaugh 2007).

In Chapter 1 of my thesis the genome of the S. velum symbiont was analyzed and compared to the genomes of other sequenced symbionts and closely related free-living bacteria. In this analysis the extent of genome reduction, typical of many intracellular symbionts, was evaluated.

Genes specific to the symbiotic lifestyle were identified. Metabolic pathways and cellular processes were inferred from sequence data. The S. velum symbiont genome (2.7 Mb) was comparable in size to the genomes of many free-living bacteria, had high GC content (51%), and carried a large number of mobile genetic elements, which are less common in obligate vertically-transmitted intracellular bacteria (Newton & Bordenstein 2011). Unlike symbionts from oligotrophic environments, the symbionts of S. velum contained genes which encoded the complete TCA and glyoxylate cycles, DMSO and urea reductases, and a highly-branched electron transport chain. Just like the first sequenced chemoautotrophic symbiont of

Calyptogena magnifica (Newton et al. 2008) and other gammaproteobacterial chemoautotrophic symbionts sequenced to date, the symbiont of S. velum lacked a gene for fructose 1,6-bisphosphatase (FBPase), an enzyme which catalyzes two essential reactions in the Calvin cycle. It was hypothesized that in the S. velum symbionts a bidirectional pyrophosphate-dependent phosphofructokinase (PPi-PFK) can perform the function of the missing FBPase when operating in reverse, a possibility which was originally proposed for the symbiont of C. magnifica (Newton et al. 2008).

In Chapter 2 of my thesis, transcriptional activity in the S. velum symbionts was examined, with a focus on the genes involved in sulfur oxidation and carbon fixation through the Calvin cycle, including pfp, which encodes PPi-PFK. Next, PPi-PFK activity in the symbiont-containing cell-free extracts was measured, and purified recombinant PPi-PFK was characterized. High pfp transcription and high reverse PPi-PFK enzymatic activity were detected in the symbiont-containing gill tissue of S. velum. Purified PPi-PFK had high specificity for the reverse reaction and higher catalytic efficiency than most bacterial FBPases. Finally, a multi-gene time-calibrated Bayesian phylogeny was constructed to investigate the presence or absence of PPi-PFK and FBPase in extant chemoautotrophic symbionts and free-living bacteria and to infer their ancestral states. Ancestral state reconstruction showed that the shift from FBPase to PPi-PFK occurred in the evolutionary histories of all analyzed chemoautotrophic symbionts. Together, these data support the hypothesis that PPi-PFK can perform the biochemical function of FBPase in the S. velum symbionts and may be essential to their evolution and maintenance.

Chapter 3 investigated the ability of PPi-PFK to perform the function of FBPase in the Calvin cycle. Owing to the limitations of working with uncultured chemoautotrophic symbionts, the symbiont-like Calvin cycle was reconstructed in a closely related free-living purple sulfur bacterium, Allochromatium vinosum. To study the physiological changes associated with the loss of FBPase and the use of PPi-PFK during autotrophic growth, an anaerobic bioreactor was built which continuously monitored cell growth, sulfide oxidation, pH, temperature, and light intensity. CO2 fixation rates, total protein concentrations, and ATP levels of the cultures were also measured. The obtained data showed that the shift from FBPase to PPi-PFK during autotrophic growth using the Calvin cycle is associated with a decrease in growth and adaptability, but offers a significant increase in thermodynamic efficiency. These results provide further evidence for the alternative Calvin cycle hypothesized in chemoautotrophic symbionts and offer insights into the associated energy-saving mechanisms potentially coupled to sulfur metabolism.

Taken together, these results demonstrate that PPi-PFK can catalyze the same reactions as FBPase in the Calvin cycle of sulfur-oxidizing bacteria, particularly in chemoautotrophic symbionts. These findings suggest that the proposed function of PPi-PFK in the Calvin cycle may be essential to the origin and maintenance of chemoautotrophic symbionts. Furthermore, the adopted experimental approach illustrates the feasibility of experimentally testing hypotheses which originate from sequence data of uncultured microorganisms by applying molecular genetics in closely related free-living bacteria.


The "missing enzyme" in the enigmatic Calvin cycle of chemoautotrophic bacterial symbionts

Oleg Dmytrenko1, Frank J. Stewart2, Daniel R. Utter1, Colleen M. Cavanaugh1

1Department of Organismic and Evolutionary Biology, Harvard University, Cambridge,

Massachusetts, United States of America.

2School of Biology, Georgia Institute of Technology, Atlanta, Georgia, United States of America.



Sulfur-oxidizing gammaproteobacterial symbionts of marine invertebrates fix CO2 via the Calvin-Benson-Bassham (Calvin) cycle despite the absence of the gene for fructose 1,6-bisphosphatase (FBPase). Here we investigated the ability of the reversible pyrophosphate-dependent phosphofructokinase (PPi-PFK) from the symbionts of the bivalve Solemya velum to perform the biochemical function of the missing FBPase. We detected high expression of the symbiont PPi-PFK-encoding gene and high reverse PPi-PFK activity in the symbiont-containing tissue of the host. Compared to other bacterial PPi-PFKs, the recombinant enzyme had the highest specificity for the reverse reaction and higher catalytic efficiency than many bacterial FBPases. Using ancestral state reconstruction, we demonstrated that the selection of PPi-PFK over FBPase occurred in all lineages of gammaproteobacterial chemoautotrophic symbionts. Our findings support the hypothesis that PPi-PFK can perform the biochemical function of FBPase and suggest that PPi-PFK may play an important role in the evolution and maintenance of chemoautotrophic symbioses.



Most of life on Earth thrives on biomass produced by autotrophic carbon fixation. Out of six known autotrophic carbon fixation pathways, the Calvin-Benson-Bassham (Calvin) cycle (Bassham et al. 1953) is the most ubiquitous and is responsible for over 90% of primary production (Raven 2009; Schwander et al. 2016). The Calvin cycle, found in plants, protists, and bacteria, utilizes the enzyme ribulose 1,5-bisphosphate carboxylase oxygenase (RuBisCO) in the key CO2 incorporation step and relies on twelve auxiliary enzymatic reactions to regenerate its metabolic intermediates (Singer et al. 1952; Bar-Even et al. 2012; Raven 2013; Erb & Zarzycki 2018). Enzymes which catalyze the auxiliary reactions may be structurally unrelated but are functionally equivalent in different organisms (Martin & Schnarrenberger 1997). The Calvin cycle of endocellular gammaproteobacterial symbionts of marine invertebrates is perhaps the most enigmatic. These symbiotic bacteria lack genes for fructose 1,6-bisphosphatase (FBPase), an enzyme which catalyzes one of the essential auxiliary reactions in the cycle (Martin & Schnarrenberger 1997). Discovery and characterization of a functionally equivalent enzyme which performs the function of the missing FBPase in the symbionts may uncover a previously unknown variant of the Calvin cycle and would shed light on the metabolism and, potentially, the evolution of chemoautotrophic symbioses.

Chemoautotrophic symbionts are some of the most prolific primary producers known (Lutz et al. 1994). These bacteria harness energy by oxidizing reduced inorganic compounds, such as sulfide (Felbeck et al. 1981; Cavanaugh 1983), methane (Cavanaugh et al. 1992; Barry et al.

2002), or hydrogen (Petersen et al. 2011). To capture and deliver oxygen and electron donors to the symbionts, their hosts evolved a number of behavioral, physiological, and biochemical adaptations (Doeller et al. 1988; Polz et al. 2000; Flores et al. 2005). In return, the symbionts provide their eukaryotic partners with organic carbon obtained by fixing CO2 into biomass using

RuBisCO (Felbeck 1981; Cavanaugh 1983; Polz et al. 1992; Nelson & Hagen 1995; Fiala-


Medioni et al. 2002). These bacteria-host associations have repeatedly evolved in a wide range of taxa, allowing both partners to occupy otherwise inhospitable environments, from hydrothermal vents to anoxic coastal sediments (Cavanaugh et al. 2013; Dubilier et al. 2008).

The majority of sulfur-oxidizing symbionts are gammaproteobacteria from a number of different clades. They exhibit a range of transmission modes (Nussbaumer et al. 2006; Stewart et al.

2009; Russell et al. 2017), diverse means of nutrient transfer to the host (Lee et al. 1999;

Sanders et al. 2013), and have varying genome sizes (Newton et al. 2007; Dmytrenko et al.

2014; Nakagawa et al. 2014). A striking commonality among gammaproteobacterial chemoautotrophic symbionts, on the other hand, is the consistent lack of the fbp gene encoding

FBPase. In bacteria this enzyme performs essential auxiliary reactions in the Calvin cycle by dephosphorylating fructose 1,6-bisphosphate (FBP) and sedoheptulose 1,7-bisphosphate (SBP) to fructose 6-phosphate (F6P) and sedoheptulose 7-phosphate (S7P), respectively (Gerbling et al. 1986; Yoo & Bowien 1995). There are four known types of bacterial FBPases, types I, II, III, and V. Type IV has been so far only identified in archaea (Rashid et al. 2002). Type I is the most widely distributed in nature, being the primary FBPase in Escherichia coli and the majority of bacterial species, including autotrophs which rely on the Calvin cycle for carbon fixation

(Hines et al. 2007). Some bacteria, such as E. coli and Bacillus methanolicus, also possess

FBPase type II, encoded by the glpX gene (Donahue et al. 2000; Brown et al. 2009;

Stolzenberger et al. 2013). In some other bacteria, for example, Corynebacterium glutamicum,

GlpX is the only known FBPase (Rittmann et al. 2003). Type III was first described in a Gram-positive bacterium, Bacillus subtilis, and is generally rare (Fujita et al. 1998). Type V is predominantly archaeal, but was also found in at least one thermophilic bacterium, Aquifex aeolicus (Rashid et al. 2002). Similar to FBPase type I, GlpX is promiscuous and can dephosphorylate FBP as well as SBP (Gerbling et al. 1986; Stolzenberger et al. 2013).

Promiscuity of other bacterial FBPases, which are primarily confined to bacteria without the


Calvin cycle, to our knowledge, has not been tested. In eukaryotes FBPases are unable to dephosphorylate SBP (Teich et al. 2007). Instead, this reaction, which is specific to the Calvin cycle, is catalyzed by sedoheptulose 1,7-bisphosphatase (SBPase). Without FBPase (and, in eukaryotes, SBPase), Calvin cycle intermediates cannot be regenerated, disrupting CO2 fixation.

To account for the absence of fbp in chemoautotrophic symbionts, it has been hypothesized that these bacteria co-opt a pyrophosphate-dependent phosphofructokinase (PPi-PFK) to perform the function of the missing FBPase (Newton et al. 2007; S. Markert et al. 2011; Kleiner et al. 2012; Dmytrenko et al. 2014). The first PPi-PFK was described in a protist, Entamoeba histolytica, alongside the conventional and more common ATP-dependent phosphofructokinase (ATP-PFK) (Reeves et al. 1974). Later, PPi-PFKs were also discovered in bacteria (O'Brien et al. 1975), plants (Carnal & Black 1979), and archaea (Siebers et al. 1998). Being the less common of the two enzymes, PPi-PFK is primarily found in anaerobic organisms, which have a reduced capacity for ATP synthesis and may benefit from using PPi instead of the more costly ATP (Mertens 1991). Both ATP-PFK and PPi-PFK are thought to participate in glycolysis. However, unlike ATP-PFK, which is virtually irreversible, PPi-PFK is reversible under physiological conditions and, therefore, may take part in gluconeogenesis, an ability which was first demonstrated by complementing FBPase deficiency in E. coli with PPi-PFK from Propionibacterium freudenreichii (Kemp & Tripathi 1993). PPi-PFKs are promiscuous for FBP and SBP, a property which could allow these enzymes to participate in the Calvin cycle (Reshetnikov et al. 2008). When operating in reverse, PPi-PFK dephosphorylates FBP/SBP to F6P/S7P, analogous to FBPase, with the difference that PPi-PFK also yields pyrophosphate (PPi) (Heinonen 2001). Accumulation of PPi may inhibit the reverse reaction of PPi-PFK and could make the forward reaction more favorable (Stitt 1989; Theodorou & Plaxton 1996; Frese et al. 2014). To drive PPi-PFK in the reverse direction, a


continuous removal of pyrophosphate would be required. PPi can be consumed by a number of enzymes, including inorganic pyrophosphatase (PPase), sodium-translocating PPase (Na+-PPase), proton-pumping PPase (H+-PPase), ATP sulfurylase (SAT), or pyruvate phosphate dikinase (van Alebeek & Keltjens 1994; Heinonen 2001; Serrano et al. 2007; Parey et al. 2013). The Gibbs standard free energy change (∆fGº) due to PPi hydrolysis by PPase is -22 kJ/mol, which may increase the equilibrium constant (K') of the reverse reaction by 10³- to 10⁴-fold (Heinonen 2001). Aside from PPi removal, H+/Na+-PPases may create electrochemical gradients, which can be consumed, for example, by ATP synthases to produce ATP (Serrano et al. 2007; Biegel & V. Müller 2011). SAT, an enzyme in the sulfur oxidation pathway of chemoautotrophic symbionts and free-living sulfur-oxidizing bacteria, transfers PPi to adenosine 5'-phosphosulfate (APS), making SO₄²⁻ as well as ATP (Felbeck et al. 1981; C. Chen et al. 1987; Polz et al. 1992; Laue & Nelson 1994; Fiala-Medioni et al. 2002; Parey et al. 2013). Thus, it appears that pyrophosphate removal may be integral to the reverse PPi-PFK activity. Furthermore, it would not only prevent product inhibition and make the reverse PPi-PFK reaction more favorable in vivo, but could offer energy savings in the form of ATP synthesis, either indirectly through the action of H+/Na+-PPases or directly using SAT.
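The size of the thermodynamic pull exerted by PPi hydrolysis can be checked with a short calculation. The sketch below is illustrative only and is not part of the original analysis: it assumes that coupling a reaction to PPi hydrolysis multiplies its equilibrium constant by exp(-∆G°/RT), and it assumes a temperature of 25 °C (298.15 K), which is not stated in the text. The function name equilibrium_shift_factor is hypothetical.

```python
import math

R = 8.314462618  # molar gas constant, J/(mol*K)


def equilibrium_shift_factor(delta_g_kj_per_mol: float,
                             temperature_k: float = 298.15) -> float:
    """Return exp(-dG/RT): the factor by which coupling to a reaction with
    standard free energy change dG multiplies an equilibrium constant.

    The default temperature (25 degrees C) is an assumption for illustration.
    """
    delta_g_j_per_mol = delta_g_kj_per_mol * 1000.0  # kJ/mol -> J/mol
    return math.exp(-delta_g_j_per_mol / (R * temperature_k))


if __name__ == "__main__":
    # -22 kJ/mol is the value quoted in the text for PPi hydrolysis by PPase.
    factor = equilibrium_shift_factor(-22.0)
    print(f"shift factor: {factor:.0f} (10^{math.log10(factor):.2f})")
```

Under these assumptions the factor falls between 10³ and 10⁴, consistent with the range quoted above; a higher assumed temperature gives a somewhat smaller factor.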

Physiological significance of PPi-PFK was evaluated in a chemoautotrophic symbiont of the coastal bivalve Solemya velum by analyzing transcription of the PPi-PFK-encoding gene, pfp (Gene ID 31575776), in the context of total gene expression. This intracellular bacterium, housed within the gill tissue of S. velum, belongs to the class Gammaproteobacteria and is one of the best-studied chemoautotrophic symbionts (Stewart & Cavanaugh 2006). To analyze gene expression, a previously published dataset of S. velum symbiont transcripts (Stewart et al. 2011) was combined with new data and reevaluated in the context of the recently sequenced symbiont genome (Dmytrenko et al. 2014). This approach improved the initial transcriptome analysis, which focused primarily on sulfur oxidation genes, and extended transcription-based functional predictions to other aspects of symbiont physiology, such as carbon metabolism.

Our transcriptomic results were compared to previous metaproteomic studies of chemoautotrophic symbionts of Riftia pachyptila (S. Markert et al. 2011) and Olavius algarvensis (Kleiner et al. 2012), which found significant levels of PPi-PFK protein in these bacteria. Analysis of pfp expression in the symbiont of S. velum was further complemented by measurements of PPi-PFK activity in the symbiont-containing gill and the symbiont-free foot tissue of the host.

To determine the ability of PPi-PFK from the S. velum symbiont to perform the biochemical function commonly carried out in the Calvin cycle by FBPase, symbiont pfp was expressed in E. coli, and the recombinant enzyme was purified and characterized. Recombinant PPi-PFK and FBPase from a closely related purple sulfur bacterium, Allochromatium vinosum, were also included in this characterization. A. vinosum is a facultative , capable of using light as the primary energy source and reduced sulfur compounds as a source of electrons (Imhoff 2005; Weissgerber et al. 2011). Under photolithoautotrophic conditions this bacterium fixes CO2 via the Calvin cycle. Being metabolically versatile, A. vinosum is also capable of growing photoorganoheterotrophically using, for example, acetate and malate. A comparison between PPi-PFK, which may be used in the Calvin cycle of the S. velum symbiont, and an enzyme from a closely related autotrophic bacterium which does not lack FBPase offered valuable insights into the physiological function of the symbiont PPi-PFK. This comparative biochemical analysis was further expanded by including previously characterized bacterial PPi-PFKs (Pfleiderer & Klemme 1980; Deng et al. 1999; Ronimus et al. 1999; Ding et al. 1999; Reshetnikov et al. 2008; Frese et al. 2014) and FBPases (Kelley-Loughnane et al. 2002; Brown et al. 2009; Myung et al.



The phylogeny of PPi-PFKs and ATP-PFKs has been studied in diverse organisms (Mertens 1991; Michels et al. 1997; Reshetnikov et al. 2008; Frese et al. 2014; S. B. Le et al. 2017). These two enzymes share common ancestry, but their evolutionary histories are complex and riddled with horizontal gene transfer events and point mutations which are able to change specificity from ATP to PPi and vice versa (Chi & Kemp 2000; M. Müller et al. 2001; Bapteste et al. 2003). While a number of bacteria are known to have both PPi-PFK and ATP-PFK, the majority have only one enzyme or the other (Roberton & Glucina 1982; Bapteste et al. 2003).

Lack of FBPase and presence of PPi-PFK have only been sporadically investigated and are thought to be primarily confined to chemoautotrophic symbionts (Newton et al. 2007; Reshetnikov et al. 2008; B. Markert et al. 2014; Kleiner et al. 2012; Dmytrenko et al. 2014). To establish whether this pattern holds widely in chemoautotrophic gammaproteobacterial symbionts, we surveyed all available symbiont genomes and a wide range of genomic sequences from closely related free-living autotrophic and heterotrophic bacteria. Using ancestral state reconstruction, events of FBPase loss and PPi-PFK gain were investigated in the evolutionary histories of these symbiotic and non-symbiotic bacteria.

The results of this study demonstrate that PPi-PFK from a chemoautotrophic symbiont is capable of catalyzing biochemical reactions commonly performed by FBPase in the Calvin cycle. PPi-PFK likely plays an important role in the symbiont metabolism, as high pfp expression and high reverse PPi-PFK activity were detected in the symbiont-containing gill tissue of S. velum. The recombinant symbiont PPi-PFK had the highest specificity for the reverse reaction among bacterial PPi-PFKs and higher catalytic efficiency than many bacterial FBPases. PPi removal was essential to PPi-PFK reverse activity. Removal of PPi in the symbionts can be performed by a number of enzymes, such as H+/Na+-PPases or SAT. Their activity may not only make the reverse PPi-PFK activity more favorable in vivo but could also offer additional benefits in the form of ATP synthesis. Using ancestral state reconstruction we determined that the shift from FBPase to PPi-PFK occurred in the evolutionary histories of all analyzed gammaproteobacterial symbionts, suggesting that this genotype may be essential to the origin and maintenance of chemoautotrophic symbioses. This observation agrees with the demonstrated biochemical ability of PPi-PFK to participate in the Calvin cycle, a function to which this enzyme may have specifically adapted in symbiotic bacteria.


Selection of PPi-PFK over FBPase occurred in the evolutionary histories of all sequenced chemoautotrophic symbionts

Analogous to the S. velum symbiont, all 11 other chemoautotrophic symbionts sequenced to date contained genes for PPi-PFK and lacked FBPase and ATP-PFK genes (Figure 2.1). To assess whether this genotype was inherited or independently derived, an ancestral state reconstruction was performed. Prior to predicting ancestral states, a Bayesian phylogeny based on 15 genes was created for all the sequenced chemoautotrophic symbionts and their most closely related free-living bacteria with complete or nearly complete genomes. The resulting time-calibrated maximum clade credibility (MCC) tree represented the best-resolved phylogeny of chemoautotrophic symbionts to date. Presence or absence of the four genes of interest, PPi-PFK, ATP-PFK, FBPase, and RuBisCO (a Calvin cycle marker), was mapped to the tips of the tree.

Gammaproteobacteria which possess PPi-PFK and RuBisCO and lack ATP-PFK and FBPase formed two disparate monophyletic clades composed primarily of chemoautotrophic symbionts. The only exceptions were the free-living bacteria Sedimenticola sp. SIP G1, Sedimenticola selenatireducens DSM17993, Thioglobus singularis EF1 (SUP05), and Methylococcus capsulatus Bath, known for their overall high similarity to chemoautotrophic symbionts (Ward et al. 2004; Walsh et al. 2009; Carlström et al. 2015; Flood et al. 2015).


[Figure 2.1 image: phylogenetic tree. Legend: PPi-PFK, ATP-PFK, FBPase, RuBisCO, not detected. Scale bar: 500 My.]

Figure 2.1. Multi-gene time-calibrated Bayesian phylogeny of chemosynthetic symbionts (yellow) and closely related free-living bacteria. Support values are listed at the nodes. Presence or absence of PPi-PFK, ATP-PFK, FBPase, and RuBisCO genes is mapped at the tips of the tree. Inferred ancestral states are labeled at the nodes. Significant ancestral states are marked with an asterisk (*).


The most recent common ancestor (MRCA) of each symbiont clade unequivocally had PPi-PFK and RuBisCO and lacked FBPase, while the MRCA of the two clades possessed PPi-PFK and RuBisCO. However, the ancestral state of FBPase in the last shared common ancestor could not be conclusively determined, as a Bayes factor (BF) ratio comparing reconstruction probability with and without FBPase at the MRCA of the two clades produced an insignificant result (log BF < 2).

pfp and the Calvin cycle genes are among the most highly expressed in the S. velum symbiont

Total RNA from the symbiont-containing gill tissue of S. velum was analyzed to evaluate the metabolic potential of this chemoautotrophic bacterial symbiont. For this purpose two cDNA libraries, enriched and unenriched in the symbiont mRNA transcripts, were sequenced (Appendix 2). At least 3.3% of the sequences, which corresponded to mRNA and tRNA (Appendix 2 Figure A2.3), aligned to the symbiont genome. The highest number of mRNA transcripts mapped to the genes involved in housekeeping, carbon metabolism, and sulfur oxidation (Figure 2.2). Gene expression was uniformly distributed among the four largest contigs which comprise the genome of the S. velum symbiont (Dmytrenko et al. 2014). sirA (Gene ID 31577136, 2.88% transcripts kb-1), which encodes a known virulence response regulator (Lawhon et al. 2002; Teplitski et al. 2003), was the most highly expressed gene in the symbiont, followed by the ribosomal protein gene rpmJ (Gene ID 31577289, 1.68% transcripts kb-1). cbbL (Gene ID 31576636, 1.71% transcripts kb-1), encoding the RuBisCO large subunit (Schwedock et al. 2004), had the third highest expression level and was the most transcribed gene in the Calvin cycle (Figure 2.3). Among the sulfur oxidation genes, dsrH (Gene ID 31575343, 1.2% transcripts kb-1) and dsrC (Gene ID 31575342, 0.87% transcripts kb-1) had the highest expression levels.


Figure 2.2. Gene expression in the S. velum symbiont. The circular insert depicts transcription across ten contigs which constitute the genome of the symbiont (Dmytrenko et al. 2014). From outside to the center: genome contigs (Mb); average transcriptional levels not normalized to gene length; genes on forward strand (dark blue); genes on reverse strand (light blue); Calvin cycle genes (red; pgk and gapA are located between tktA and fbp and are not shown); sulfur oxidation genes (yellow, including dsrABEFHCMKLJOPNRS operon); tRNA genes (purple); rRNA genes (brown). Expression of rRNA genes is not shown. The bar graph shows the most highly expressed protein-coding genes in the S. velum symbiont with their transcription normalized to gene length (in kilobases). Data from rRNA enriched and unenriched cDNA datasets are presented for each gene. The Calvin cycle genes (red) and sulfur oxidation genes (yellow) are shown. Expression values and NCBI gene IDs for the respective genes are listed in Appendix 2 Table A2.2.


Figure 2.3. Proposed Calvin cycle in the S. velum symbiont. Circle areas are proportional to average gene expression (percentage of transcripts per kb of gene length) of the corresponding genes. NCBI gene IDs for the respective genes are listed in Appendix 2 Table A2.2.

These genes, known to encode cytoplasmic sulfur carrier proteins (Stockdreher et al. 2014), are part of the highly transcribed dsrABEFHCMKLJOPNRS operon in the S. velum symbiont. Our analysis showed that genes encoding glycolytic enzymes such as pyruvate kinase (Gene ID 31576039, 0.1% transcripts kb-1), phosphoglycerate mutase (Gene ID 31575769, 0.044% transcripts kb-1), or enolase (Gene ID 31575165, 0.026% transcripts kb-1) had low expression. In contrast, transcriptional levels of pfp (Gene ID 31575776, 0.23% transcripts kb-1) were high and comparable to those of the Calvin cycle genes, such as those encoding glyceraldehyde 3-phosphate dehydrogenase (gapA, Gene ID 31576041, 0.27% transcripts kb-1) and phosphoribulokinase (prkA, Gene ID 31576527, 0.37% transcripts kb-1).
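Expression values in this section are given as a percentage of transcripts per kilobase of gene length. A minimal sketch of one way such a length-normalized value could be computed is shown below; the function name, the read counts, and the gene length are hypothetical, and the exact normalization procedure used for the published values is the one described in Appendix 2, not this code.

```python
def percent_transcripts_per_kb(read_count, total_reads, gene_length_bp):
    """Share of all mapped transcripts assigned to a gene (in percent),
    divided by the gene length in kilobases.

    Hypothetical helper for illustration only.
    """
    if total_reads <= 0 or gene_length_bp <= 0:
        raise ValueError("total_reads and gene_length_bp must be positive")
    percent_of_transcripts = 100.0 * read_count / total_reads
    return percent_of_transcripts / (gene_length_bp / 1000.0)


# Made-up numbers: 3,450 of 1,000,000 mapped reads on a 1,500 bp gene.
example = percent_transcripts_per_kb(3450, 1_000_000, 1500)
print(f"{example:.2f}% transcripts kb-1")
```

Normalizing by length keeps long genes from appearing more highly expressed simply because they collect more reads.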


Enzymatic PPi-PFK activity is present in the symbiont-containing tissue

PPi-PFK activity, measured as PPi formation from FBP and PO4³⁻, was detected in cell-free extracts (CFE) of S. velum symbiont-containing gill tissue (Table 2.1), but was absent from the foot CFE (Appendix 2 Table A2.3). No PPi-PFK activity was observed when proteins were denatured by boiling or in the absence of PO4³⁻. Sugars other than FBP, such as fructose and F6P, did not trigger PPi formation with PO4³⁻. Rates of PPi formation were dependent on concentrations of FBP and PO4³⁻. The highest rates of approximately 27 nmol PPi min-1 mg protein-1 were measured for 2.5-5 mM FBP and 20-25 mM PO4³⁻, with the exception of the 2.5 mM FBP and 25 mM PO4³⁻ substrate combination (Table 2.1). FBPase activity was detected in the foot CFE (data not shown).

Table 2.1. PPi-PFK activity (nmol PPi min-1 mg total protein-1) in S. velum gill tissue cell-free extracts as a function of FBP and PO4³⁻ concentrations. Measurements were performed at pH 7.5 and at 25℃. Standard deviations from three biological replicates are shown.

                 FBP [mM]
PO4³⁻ [mM]       2.5         5           10
10               20.5±0.8    20.7±0.6    21.0±0.9
20               27.7±2.4    27.5±1.8    25.2±0.9
25               25.3±1.9    27.2±0.4    25.0±0.9

Recombinant PPi-PFK from the S. velum symbiont is pyrophosphate-dependent and bidirectional

Recombinant symbiont PPi-PFK and PPase and A. vinosum PPi-PFK and FBPase proteins were purified close to homogeneity (Figure 2.4). The sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) analyses were in agreement with the predicted sizes of 47.55 kDa and 22.7 kDa for the symbiont PPi-PFK and PPase and 47.47 kDa and 39.33 kDa for A. vinosum PPi-PFK and FBPase, respectively.


Figure 2.4. SDS-PAGE analysis of the recombinant S. velum symbiont (A) PPi-PFK and (B) PPase and A. vinosum (C) PPi-PFK and (D) FBPase. The proteins were His-tagged at the N-terminal ends and analyzed during each purification step.

The purified PPi-PFK from the S. velum symbiont was pyrophosphate-dependent and unable to use ATP as substrate (Figure 2.5). A high rate of FBP formation (104±2.5 U/mg) was observed in the presence of 5 mM PPi. In comparison, with an equimolar amount of ATP only a background level of activity (1±0.07 U/mg) was recorded.

Kinetic parameters of the symbiont PPi-PFK were determined by measuring product formation at various substrate combinations in forward and reverse reactions (Figure 2.6; Appendix 2 Tables A2.4 and A2.5) and under different pH and temperature conditions (Figure 2.7). The forward reaction reached its highest initial velocity at 7.5 mM F6P and 5 mM PPi (Figure 2.6 A, Appendix 2 Table A2.4). Higher substrate concentrations were not tested as velocities plateaued at these values. The highest reverse initial reaction velocities occurred between 2.5 mM and 5 mM FBP and 10 mM and 25 mM phosphate (Figure 2.6 B, Appendix 2 Table A2.5). Substrate concentrations above 5 mM FBP and 25 mM PO4³⁻ were inhibitory. Overall, the symbiont PPi-PFK had higher reaction velocities in the reverse than in the forward reaction. The pH optimum of the enzyme was observed between 7.5 and 8 (Figure 2.7 A). The temperature optimum was recorded between 55℃ and 65℃ (Figure 2.7 B).

Figure 2.5. S. velum symbiont PPi-PFK forward reaction activity with either 5 mM PPi or ATP. The assay was initiated by adding the source of phosphates (ATP or PPi) to a reaction mixture containing 7.5 mM F6P and the recombinant PPi-PFK. Enzyme activity was measured by converting the FBP reaction product to dihydroxyacetone phosphate (DHAP) using fructose 1,6-bisphosphate aldolase and triosephosphate isomerase, with a subsequent reduction of DHAP by glycerophosphate dehydrogenase in a reaction which consumes NADH. Measurements were performed at pH 7.5 and at 25℃. Error bars show standard deviations from three replicate measurements performed using the same enzyme preparation.


Figure 2.6. Initial velocities of the recombinant symbiont PPi-PFK in (A) forward and (B) reverse reactions at different substrate concentrations calculated within the linear range of each substrate combination (Appendix 2 Tables A2.4 and A2.5). Measurements were performed at pH 7.5 and at 25℃. Color legend shows initial velocities in units mg-1 of recombinant PPi-PFK.

Figure 2.7. Initial velocities of the symbiont PPi-PFK in the reverse reaction under different (A) pH and (B) temperature conditions measured at 5 mM FBP and 20 mM PO4³⁻. Error bars show standard deviations from three replicate measurements.


Figure 2.8. Influence of pyrophosphate on the reverse reaction of the symbiont recombinant PPi-PFK with and without PPase across a range of FBP concentrations. Measurements were performed at pH 7.5 and at 25℃. Error bars show standard deviations from three replicate measurements.

Inhibition of the S. velum symbiont PPi-PFK by PPi can be attenuated by PPase

Pyrophosphate acted as a competitive inhibitor for FBP in the reverse reaction (Figure 2.8, Appendix 2 Table A2.2). The inhibition constant of PPi (KiPPi) was 0.381±0.079 mM. The effect of PPi inhibition was alleviated by a pyrophosphate-consuming enzyme, such as the recombinant PPase (Km 0.18±0.02 mM) encoded in the same operon as the symbiont PPi-PFK. Addition of 1.75 U PPase to the reverse reaction negated the inhibitory effects of 1.0 mM PPi.
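For a competitive inhibitor, the classical Michaelis-Menten treatment predicts that kcat is unchanged while the apparent Km rises to Km(1 + [I]/Ki). The sketch below applies that textbook relation to the KmFBP of 0.150 mM from Table 2.2 and the KiPPi of 0.381 mM reported above, and prints the prediction next to the Km values listed in Table 2.2 for each PPi concentration. It is an illustration under the assumption of purely competitive inhibition, not the fitting procedure used in this study, and the function name is hypothetical.

```python
def apparent_km(km_mm, inhibitor_mm, ki_mm):
    """Apparent Km under purely competitive inhibition: Km * (1 + [I]/Ki)."""
    return km_mm * (1.0 + inhibitor_mm / ki_mm)


KM_FBP_MM = 0.150   # symbiont PPi-PFK, no PPi added (Table 2.2)
KI_PPI_MM = 0.381   # inhibition constant of PPi reported in the text

# PPi concentration (mM) -> Km for FBP listed in Table 2.2 (mM)
measured_km = {0.05: 0.184, 0.125: 0.175, 0.25: 0.265, 0.5: 0.326, 1.0: 0.469}

predicted_km = {
    ppi: apparent_km(KM_FBP_MM, ppi, KI_PPI_MM) for ppi in measured_km
}

for ppi, km_obs in measured_km.items():
    print(f"PPi {ppi:>5} mM: predicted Km {predicted_km[ppi]:.3f} mM, "
          f"Table 2.2 Km {km_obs:.3f} mM")
```

The predicted and tabulated values follow the same upward trend with PPi concentration, which is the qualitative signature of competitive inhibition.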

Figure 2.9. Catalytic efficiencies (Ef (Ceccarelli et al. 2008)) of PPi-PFKs (reverse reaction, solid lines) and FBPases (dash-dotted lines) from select bacteria across a range of FBP concentrations. Enzymes marked with an asterisk (*) were characterized in this study.


Table 2.2. Kinetic properties of the recombinant symbiont PPi-PFK and PPase and A. vinosum PPi-PFK and FBPase. Standard error of the mean (SEM) for three replicate measurements is shown.

Substrate   Et (µmol)   kcat (/sec)   SEM     Km (mM)   SEM     Vmax (U/mg)   kcat/Km (/sec mM)   Additional conditions

PPi-PFK, S. velum symbiont
FBP         0.000017    72.0          1.37    0.150     0.01    185.5         479
PO4³⁻       0.000017    75.1          1.10    1.232     0.08    193.3         61
PO4³⁻       0.000017    78.0          1.07    1.246     0.07    201.0         63                  + PPase
F6P         0.000021    53.7          1.30    0.276     0.03    107.6         194
PPi         0.000021    41.6          0.35    0.005     0.00    103.1         7805
FBP         0.000017    60.7          0.96    0.184     0.01    156.4         330                 0.05 mM PPi
FBP         0.000017    58.6          1.13    0.175     0.01    150.9         335                 0.125 mM PPi
FBP         0.000017    66.5          1.20    0.265     0.05    171.3         251                 0.25 mM PPi
FBP         0.000017    61.3          0.29    0.326     0.02    157.9         188                 0.5 mM PPi
FBP         0.000017    66.4          0.31    0.469     0.02    171.1         142                 1.0 mM PPi
FBP         0.000017    66.4          0.72    0.178     0.01    181.0         373                 1.0 mM PPi + PPase

PPase, S. velum symbiont
PPi         0.000079    560.0         41.12   0.107     0.04    158.1         5234

PPi-PFK, A. vinosum
FBP         0.000040    29.1          0.71    0.129     0.02    76.4          226
PO4³⁻       0.000040    34.8          0.54    0.678     0.06    88.1          51

FBPase, A. vinosum
FBP         0.000051    12.8          3.85    0.060     0.003   15.7          211

S. velum symbiont PPi-PFK has high catalytic efficiency

Catalytic efficiency of PPi-PFK from the S. velum symbiont was compared to efficiencies of other bacterial PPi-PFKs and FBPases (Figure 2.9). To compare different enzymes acting on the same substrate, their efficiency functions (Ef) were estimated for a range of substrate concentrations (Ceccarelli et al. 2008). EfFBP was calculated based on measured kinetic values for the symbiont PPi-PFK (KmFBP 0.15±0.01 mM, kcatFBP 72±1.37 sec-1) as well as A. vinosum PPi-PFK (KmFBP 0.13±0.02 mM, kcatFBP 29.1±0.71 sec-1) and FBPase (KmFBP 0.06±0.003 mM, kcatFBP 12.8±3.85 sec-1) (Table 2.2). Parameters for other enzymes included in the comparison were taken from the literature, as listed in Materials and Methods. For the Ef calculation, PPi-PFKs were assumed to act in concert with PPi-removing enzymes and thus be irreversible.

Symbiont PPi-PFK had high catalytic efficiency in the reverse reaction (Figure 2.9). The enzyme was 1.2 times more efficient than FBPase from A. vinosum at low substrate concentrations (< 10 µM) and became over twofold more efficient as concentrations increased. Among the FBPases from closely related bacteria, only the E. coli FBPase (Kelley-Loughnane et al. 2002) displayed higher efficiency than the symbiont PPi-PFK at low substrate concentrations. Compared to FBPases, PPi-PFKs exhibited overall higher efficiencies at high substrate concentrations, as the performance of FBPases dropped rapidly above 10 µM FBP. Catalytic efficiency of the symbiont PPi-PFK decreased threefold upon addition of 1 mM PPi. This loss was almost entirely recovered in the presence of PPase.


This study demonstrates for the first time that PPi-PFK from a chemoautotrophic symbiont is capable of performing the biochemical function of FBPase, which is not encoded in its genome. Absence of FBPase in chemoautotrophic symbionts which fix CO2 via the Calvin cycle has been enigmatic since sequencing of the first symbiont genome, that of the deep-sea clam Calyptogena magnifica (Newton et al. 2007). While it has been hypothesized that PPi-PFK may be able to take over for the missing enzyme in this and other sequenced symbionts, for example, those found in association with the siboglinid worm R. pachyptila (S. Markert et al. 2011), the oligochaete O. algarvensis (Kleiner et al. 2012), and the protobranch bivalve S. velum (Dmytrenko et al. 2014), the function of PPi-PFK in these bacteria was predicted based on sequence only and had not been experimentally validated.


Either loss of FBPase or gain of PPi-PFK occurred in the evolutionary history of all chemoautotrophic gammaproteobacterial symbionts sequenced to date (Figure 2.1), which suggests a strong association between these occurrences and the emergence of a symbiotic lifestyle. As more chemoautotrophic symbionts are sequenced, it will become more evident whether the last common ancestor of all chemoautotrophic symbionts possessed only PPi-PFK, or whether FBPase loss occurred independently during each symbiosis event.

Genome-wide transcriptional analysis of the S. velum symbiont provided valuable insights into the metabolic potential of this bacterium, in particular with regard to pfp and its hypothesized role in the Calvin cycle (Figure 2.2). Judging from the observed patterns of gene expression, sulfur oxidation and carbon fixation are the two key processes in the symbiont metabolism, in agreement with our current understanding of chemoautotrophic symbionts (Felbeck et al. 1981; Cavanaugh 1983; Stewart & Cavanaugh 2006; Cavanaugh et al. 2013). The observed transcriptional levels of sulfur oxidation genes mostly agreed with our preliminary analysis of gene expression data carried out without the reference genome sequence (Stewart et al. 2011). Mapping transcripts to the annotated genomic contigs (Dmytrenko et al. 2014) improved predictions of gene expression, in particular for genes which are present in the genome in multiple copies, for example dsrC. cbbL, which encodes the RuBisCO large subunit, was the third most highly expressed gene in the S. velum symbiont (Figure 2.2). Other Calvin cycle genes followed closely (Figure 2.3). Among them, expression of gapA and prkA was comparable in magnitude to that of pfp, in line with the hypothesized role of PPi-PFK in the cycle. Since PPi-PFK is commonly considered to be a glycolytic enzyme (Mertens 1991), we also compared pfp expression to that of genes involved in glycolysis, for example, pyruvate kinase and enolase, whose expression was approximately 50% to 90% lower than that of pfp. In fact, if pfp were glycolytic, it would have been the most highly expressed gene in this pathway. pfp transcription in the S. velum symbiont agreed with high levels of PPi-PFK protein detected in other chemoautotrophic symbionts, those of R. pachyptila (S. Markert et al. 2011) and O. algarvensis (Kleiner et al. 2012), suggesting that the enzyme plays an important role in metabolism of these autotrophic bacteria. Finally, our discussion of the transcriptome would not be complete without commenting on the most highly expressed gene in the S. velum symbiont (Figure 2.2). sirA encodes a response regulator, which is known to increase expression of virulence genes and decrease expression of motility genes in the pathogenic gammaproteobacterium Salmonella enterica (Lawhon et al. 2002; Teplitski et al. 2003). High expression of sirA in the symbiont suggests an exciting possibility that this gene may be similarly involved in "infection" of the S. velum host with symbiotic bacteria.

As predicted by pfp expression, we were able to detect PPi-PFK enzymatic activity in the S. velum gill cell-free extracts (Table 2.1). PPi-PFK activity was quantified by measuring PPi formation in the presence of FBP and PO4³⁻. Detecting PPi instead of F6P allowed us to separate activity of PPi-PFK from that of FBPase, an enzyme present in the foot cell-free extracts and, therefore, also likely found in the host tissue surrounding the symbionts. Because of the potential eukaryotic FBPase activity in the extracts, which would consume some of the FBP substrate without producing PPi, and the fact that the symbionts contribute only a portion of the total protein in the gill tissue, the measured PPi-PFK activity (0.028±0.007 U mg protein-1, Table 2.1) was potentially underestimated. Given that there are up to 2.6 x 10⁹ symbiont cells per gram of wet gill tissue (Cavanaugh 1983; Mitchell & Cavanaugh 1983) and assuming 155 fg of protein per bacterial cell (Cox 2004), the PPi-PFK activity values were corrected to account for proteins from the symbiont only (2.11±0.06 U mg protein-1, Appendix 2 Table A2.6). The actual symbiont PPi-PFK activity in cell-free extracts could be higher, as it may be partially masked by activity of the host FBPase. To our knowledge this is the first report of reverse PPi-PFK activity in cell-free extracts. These data suggest that PPi-PFK-specific activity in the symbiont-containing tissue of S. velum could account for the missing FBPase. However, we are not able to entirely rule out the possibility of another, yet unidentified enzyme being at work.
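The arithmetic behind this correction can be sketched as follows. The symbiont density and per-cell protein content are the values quoted above; the total protein content of gill tissue is not stated in this section, so the sketch only back-calculates the symbiont protein fraction implied by the two reported activities. Variable names are hypothetical, and the actual correction is documented in Appendix 2 Table A2.6.

```python
SYMBIONT_CELLS_PER_G_TISSUE = 2.6e9   # upper estimate quoted in the text
PROTEIN_PER_CELL_FG = 155.0           # assumed protein per bacterial cell
FG_TO_MG = 1e-12                      # 1 fg = 1e-15 g = 1e-12 mg

# Symbiont protein per gram of wet gill tissue (mg/g).
symbiont_protein_mg_per_g = (
    SYMBIONT_CELLS_PER_G_TISSUE * PROTEIN_PER_CELL_FG * FG_TO_MG
)

MEASURED_ACTIVITY = 0.028    # U per mg total protein (reported)
CORRECTED_ACTIVITY = 2.11    # U per mg symbiont protein (reported)

# Fraction of total protein attributed to symbionts that is implied by
# the two reported values (back-calculated, not taken from the source).
implied_fraction = MEASURED_ACTIVITY / CORRECTED_ACTIVITY

# Total protein per gram of tissue consistent with that fraction (mg/g).
implied_total_protein_mg_per_g = symbiont_protein_mg_per_g / implied_fraction

print(f"symbiont protein: {symbiont_protein_mg_per_g:.3f} mg per g tissue")
print(f"implied symbiont share of total protein: {implied_fraction:.2%}")
print(f"implied total protein: {implied_total_protein_mg_per_g:.1f} mg per g tissue")
```

Under these assumptions the symbionts account for about 0.4 mg of protein per gram of wet tissue, a small share of the total, which is why the corrected specific activity is so much higher than the measured one.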

For further study, the symbiont PPi-PFK was expressed in E. coli, purified, and characterized. Substrate concentration dependencies observed with the recombinant enzyme (Figure 2.6, Appendix 2 Tables A2.4, A2.5) closely approximated the PPi-PFK activity profile observed in the gill cell-free extracts (Table 2.1). This strengthened our prior conclusion that the CFE measurements reflected activity of the PPi-PFK enzyme. In agreement with sequence-based function prediction, the purified PPi-PFK was pyrophosphate-dependent and unable to utilize ATP as substrate (Figure 2.5). Low activity detected in the presence of ATP was likely due to residual PPi from storage buffer. Like other PPi-PFKs, the symbiont enzyme was reversible. However, the much higher specific activity (Vmax) of the symbiont PPi-PFK in the reverse reaction suggests that the forward reaction is less favored (Table 2.3). The observed 1.7 ratio of the reverse over the forward Vmax is significantly (p<0.0001) higher than the ratios calculated for other bacterial PPi-PFKs (O'Brien et al. 1975; Mertens et al. 1989; Ladror et al. 1991; Ding et al. 1999; Reshetnikov et al. 2008; Frese et al. 2014). Such strong preference for the reverse reaction supports our hypothesis that PPi-PFK may perform the role of the missing FBPase in the Calvin cycle of the S. velum symbiont.
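The reverse-to-forward Vmax ratios discussed here can be recomputed directly from the Vmax values listed in Table 2.3. The short sketch below does only that division; the dictionary layout and variable names are choices made for this illustration, and it does not reproduce the significance test reported above.

```python
# Species -> (Vmax reverse, Vmax forward), both in U mg-1, from Table 2.3.
vmax_u_per_mg = {
    "S. velum symbiont": (185.5, 107.6),
    "M. capsulatus": (9.0, 7.6),
    "R. rubrum": (24.2, 20.0),
    "X. campestris": (59.0, 58.0),
    "P. freudenreichii": (232.0, 258.0),
    "S. thermophila": (239.0, 438.0),
    "D. thermophilum": (0.6, 6.2),
}

# Ratio above 1 means the enzyme is faster in the reverse
# (FBP-dephosphorylating) direction than in the forward direction.
ratios = {
    species: reverse / forward
    for species, (reverse, forward) in vmax_u_per_mg.items()
}

for species, ratio in sorted(ratios.items(), key=lambda item: -item[1]):
    print(f"{species:<20} {ratio:.2f}")
```

Sorting by ratio places the S. velum symbiont enzyme at the top of the list, in line with the comparison made in the text.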

The highest symbiont PPi-PFK reverse reaction velocities were observed at pH values between 7.5 and 8.0 and at temperatures from 55℃ to 65℃ (Figure 2.7). This pH optimum is in agreement with an earlier study which reported the highest CO2 fixation rates in the symbionts at pH 8.0 (Scott & Cavanaugh 2007). It is not rare for enzymes from mesophilic bacteria to have high temperature optima between 50℃ and 65℃ (Wang et al. 2016; Saggu & Mishra 2017; Saxena et al. 2018). The high PPi-PFK temperature optimum implies that the enzyme may have stable tertiary and quaternary structures (Kaneko et al. 2005), which agrees with the observed overall stability of the enzyme during handling and long-term storage. While the symbionts are unlikely to experience such high temperatures in situ, during summer months they are subjected to large temperature fluctuations in intertidal environments and may face temperatures in excess of 30℃ at low tide (Kaplan et al. 1977).

Table 2.3. Comparison of Km and Vmax from bacterial PPi-PFK enzymes. Km values are in mM and Vmax values in U mg-1; rev, reverse reaction; fwd, forward reaction.

Species            kDa    Km FBP   Km PO4   Vmax rev   Km F6P   Km PPi   Vmax fwd   pH rev    Tm rev   pH fwd    Tm fwd   Vmax rev/fwd   Reference
S. velum symbiont  47.6   0.150    1.232    185.5      0.276    0.005    107.6      7.5-8.0   55-60    -         -        1.72           This study
A. vinosum         47.5   0.130    0.678    76.4       -        -        -          -         -        -         -        -              This study
M. capsulatus      44.7   0.360    8.690    9.0        2.270    0.027    7.6        -         -        7.0       30       1.18           (Reshetnikov et al. 2008)
R. rubrum          40.0   0.020    0.820    24.2       0.380    0.025    20.0       8.6       -        7.2       -        1.21           (Pfleiderer & Klemme 1980)
X. campestris      44.7   0.024    2.500    59.0       0.202    0.041    58.0       -         -        6.8       40       1.00           (Frese et al. 2014)
P. freudenreichii  43.2   0.051    0.600    232.0      0.100    0.069    258.0      7.0-7.4   -        7.5       -        0.90           (O'Brien et al. 1975; Mertens et al. 1989; Ladror et al. 1991)
S. thermophila     61.0   0.038    0.400    239.0      0.240    0.110    438.0      7.0-7.5   -        5.0-6.4   >55      0.55           (Ronimus et al. 1999)
D. thermophilum    37.4   2.900    4.300    0.6        0.228    0.022    6.2        7.0-7.5   -        5.7-6.3   -        0.10           (Ding et al. 1999)
T. maritima        46.5   -        -        -          0.980    0.067    203.0      5.6-6.8   -        5.6-5.8   -        -              (Ding et al. 2001)
B. burgdorferi     62.0   -        -        -          0.109    0.015    82.9       -         -        6.4-7.2   -        -              (Deng et al. 1999)

Symbiont PPi-PFK can be substrate- and product-inhibited (Figures 2.6 and 2.8). In particular, substrate inhibition occurred in the reverse reaction above 5 mM FBP and 25 mM PO4^3-, similar to other PPi-PFKs (Frese et al. 2014). Furthermore, pyrophosphate, which is a substrate in the PPi-PFK forward reaction, acted as a strong competitive inhibitor of FBP in the reverse reaction (Figure 2.8). This is not surprising, given the almost 250-times higher affinity of PPi-PFK for PPi compared to PO4^3- (Table 2.2). While the inhibitory effects of PPi have been previously described for plant PPi-PFKs (Stitt 1989; Theodorou & Plaxton 1996), this is the first measurement of PPi inhibition for a bacterial PPi-PFK to date. Bacterial cells on average contain 0.5-1.5 mM PPi (Heinonen & Drake 1988; Bornefeld 1981; J. Chen et al. 1990), while in obligate methanotrophs PPi concentration can reach 5 mM (Y. Trotsenko & Shishkina 1990; Y. A. Trotsenko et al. 2008). At 1 mM, pyrophosphate reduced the catalytic efficiency of the symbiont PPi-PFK reverse reaction by more than 75% (Figure 2.9). To overcome the effects of PPi inhibition, the symbiont may employ PPi-consuming enzymes encoded in its genome, such as PPase, H+/Na+-PPases, or SAT (Dmytrenko et al. 2014). By removing pyrophosphate, these enzymes may mitigate the inhibitory effect of PPi and make the reverse reaction more favorable. Since PPi-PFKs are readily reversible (Mertens 1991; Kemp & Tripathi 1993), under in situ equilibrium conditions PPi removal could change the direction of PPi-PFK catalysis.

Energetic favorability of the forward reaction may further decrease due to the high free energy change associated with PPi hydrolysis (∆fGº = -22 kJ/mol) (Biegel & V. Müller 2011). In our study we have shown that under phosphate saturation conditions, when the concentration of PPi formed in the reverse reaction is negligible, addition of PPase did not noticeably affect PPi-PFK kinetics (Table 2.2, Appendix 2 Figure A2.4). This suggests that, under these conditions, the rate of the reverse reaction is limited by the diffusion of products from the active site. It is noteworthy that the addition of PPase to the assay improves the signal-to-noise ratio and may be recommended as a stabilizing component in future experiments (Appendix 2 Figure A2.5).
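The competitive inhibition of the reverse reaction by PPi discussed above can be written in the standard textbook form. This is a sketch rather than the exact model fitted in this study: it assumes saturating phosphate, treats FBP as the varied substrate, and uses $K_i$ for the PPi inhibition constant, whose value is not restated here.

```latex
v = \frac{V_{max}\,[\mathrm{FBP}]}
         {K_m^{\mathrm{FBP}}\left(1 + \dfrac{[\mathrm{PP_i}]}{K_i}\right) + [\mathrm{FBP}]}
```

In this form PPi raises the apparent Km for FBP without changing Vmax, which is why PPi removal would be expected to restore reverse activity mainly at sub-saturating FBP concentrations.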

Our data suggest that PPi-PFK reverse activity in the chemoautotrophic symbionts may be dependent on PPi removal. To study the effects of PPi removal we used an inorganic pyrophosphatase, which does not couple PPi hydrolysis to any less thermodynamically favorable reaction. However, in the S. velum symbiont, expression of the PPase-encoding gene was relatively low (0.053% transcripts kb-1). Other PPases, such as Na+-PPase (0.046% transcripts kb-1) and H+-PPases (0.023% transcripts kb-1), also did not show high levels of transcription. In chemoautotrophic symbionts of R. magnifica (S. Markert et al. 2011) and O. algarvensis (Kleiner et al. 2012) H+/Na+-PPases have been hypothesized to couple hydrolysis of

PPi produced by PPi-PFK to generation of H+/Na+ electrochemical gradients which can be used for ATP production by ATP synthases. Approximately 10 molecules of PPi could in this way be


used to translocate 10 H+/Na+ (Serrano et al. 2007), which may then be used to generate 3 molecules of ATP (Hinkle 2005). This mechanism could reduce the total ATP cost of the Calvin cycle by approximately 10% (Kleiner et al. 2012). However, given low transcription of the

H+/Na+-PPase-encoding genes in the S. velum symbiont, an alternative mechanism of pyrophosphate removal may be at play. We hypothesize that ATP sulfurylase (SAT) may instead be responsible for the majority of PPi removal in the symbiont of S. velum. SAT is the final enzyme in sulfur oxidation to SO4^2- in many chemoautotrophic symbionts and free-living sulfur-oxidizing bacteria (Dahl et al. 2013; Parey et al. 2013). High SAT activity and expression of the associated genes have been previously reported in symbiont-containing tissues of diverse chemoautotrophic symbioses (Felbeck et al. 1981; Felbeck 1981; Fisher & Hand 1984; C. Chen et al. 1987; Polz et al. 1992; Fiala-Medioni et al. 2002; Boutet et al. 2011). In the transcriptome of the S. velum symbiont, sat was among the most highly expressed genes involved in sulfur oxidation and was the most transcribed among the genes which encode PPi-consuming enzymes (Figure 2.2). SAT acts by transferring PPi to APS, generating SO4^2- and ATP (Parey et al. 2013). If SAT activity were coupled to removal of PPi, potentially produced by PPi-PFK in the Calvin cycle, two molecules of ATP would be made per round of CO2 fixation. Thus, SAT would not only drive reverse PPi-PFK activity by preventing substrate inhibition and increasing the equilibrium constant of the reverse reaction, but could also reduce the energetic cost of carbon fixation from 3 to 1 ATP per CO2 fixed.

In the absence of pyrophosphate inhibition, the symbiont PPi-PFK was more catalytically efficient at converting FBP into F6P than many characterized bacterial PPi-PFKs and FBPases

(Figure 2.9). PPi inhibition markedly reduced Ef (Eisenthal et al. 2007; Ceccarelli et al. 2008) of the enzyme but was readily reversed with the help of PPase. E. coli FBPase is more efficient than the symbiont PPi-PFK at low substrate concentrations but becomes rapidly superseded by

PPi-PFK when substrate concentrations increase past 18 µM. In general, FBPases tend to


perform better at lower and worse at higher substrate concentrations compared to PPi-PFKs.

This may be attributed to the fact that, unlike PPi-PFKs, FBPases are virtually irreversible

(Mertens 1991). PPi-PFKs, on the other hand, tend to perform relatively well in both forward and reverse reactions (Table 2.3) (O'Brien et al. 1975; Mertens et al. 1989; Ladror et al. 1991; Ding et al. 1999; Reshetnikov et al. 2008; Frese et al. 2014), which may come at the cost of overall efficiency. The majority of PPi-PFKs included in this analysis are thought to act as glycolytic enzymes. For example, PPi-PFK is present in A. vinosum alongside FBPase. While this PPi-PFK may under certain conditions act in reverse, the primary role of the enzyme is likely limited to glycolysis. Our analysis predicted that under ideal conditions of substrate saturation and no PPi inhibition, reverse activity of PPi-PFK from A. vinosum would be comparable to that of FBPase. However, in vivo this PPi-PFK would likely exhibit much lower catalytic activity. Since A. vinosum can use FBPase in its Calvin cycle, there is no evolutionary pressure for PPi-PFK to improve its reverse catalysis in this bacterium. In the case of chemoautotrophic symbionts, on the other hand, the hypothesized role of PPi-PFK in the Calvin cycle may be exerting selection pressure to improve catalytic efficiency of the reverse reaction, which agrees with our data.

The high prevalence of PPi-PFKs and the lack of FBPases in chemoautotrophic symbionts suggest that PPi-PFKs play an important role in metabolism of the symbiont. The potential shift from FBPases to PPi-PFKs in these bacteria must have been driven by advantages associated with this evolutionary change. It has been previously hypothesized that

PPi-PFK may perform the biochemical function of FBPase in the Calvin cycle of chemoautotrophic symbionts. In this study we have shown that PPi-PFK from the S. velum symbiont is not only capable of performing the hypothesized catalysis, but has also evolved a higher specificity for the reverse reaction compared to other, potentially glycolytic, PPi-PFKs (Table 2.3). The potential use of PPi-PFK in the Calvin cycle of chemoautotrophic symbionts may be directly coupled to sulfur oxidation through enzymatic activity of SAT. This enzyme consumes PPi and generates ATP in the final step of sulfur oxidation to SO4^2-. Thus, PPi removal required for PPi-PFK reverse activity may have an added advantage of ATP synthesis. This process may reduce the overall energetic cost of the Calvin cycle and could have led to the prevalence of PPi-PFKs and a potential loss of FBPases in chemoautotrophic symbionts. The proposed coupling between sulfur oxidation and carbon fixation through a concerted action of PPi-PFK and SAT may explain why this evolutionary change is confined to chemoautotrophic symbionts and has not found its way to photoautotrophic symbionts and plastids.

Materials and methods

Specimen collection and bacterial cultures

S. velum protobranch bivalves were collected from an intertidal mud flat at Bluff Hill

Cove, Point Judith Pond, Rhode Island, over the period between 2011 and 2016. For DNA and

RNA extractions the specimens were dissected in the field. Gill tissue was stored in RNAlater™ (Thermo Fisher Scientific, Waltham, MA) at 4°C, and processed within 2 days. For measuring enzyme activity, live specimens were transported to the lab in continuously aerated chilled sea water within 2 hours of collecting. Gill and foot tissues were dissected, weighed, frozen in liquid

N2, and stored at -80°C. Freezing did not decrease enzymatic activity compared to fresh samples (data not shown).

Bacterial strains and plasmids used in this study are listed in Supplementary file 1, Table S6. The A. vinosum DSM 180 Rif50 (Lubbe et al. 2006) culture was generously provided by

Christiane Dahl (Universität Bonn, Germany). A. vinosum cultures were grown in RCV medium

(Weaver et al. 1975) in anaerobic vials under constant illumination (60W). Culture stock was stored in 10% dimethyl sulfoxide (DMSO) at -80°C.


Phylogenetic analysis and ancestral state reconstruction

To determine whether the absence of FBPase and the presence of PPi-PFK in chemoautotrophic symbionts were inherited or independently derived, their ancestral states were reconstructed. A Bayesian phylogeny was generated using sequences from chemoautotrophic symbionts and closely related free-living bacteria. In this phylogeny 73 taxa were included, with

68 gammaproteobacterial taxa as the ingroup and 5 alphaproteobacterial taxa as an outgroup.

Only taxa with complete or nearly-complete genomes were used. The minimal combination of

DNA and amino acid sequences which provided adequate support at all nodes included 16S rRNA genes and 14 phylogenetically conserved proteins (AtpA, ClpX, DnaE, DnaK, InfB, MurA,

RplF, RplV, RplW, RpoA, RpoB, RpsC, RpsK, SecY (Wu et al. 2013)). A time-calibrated phylogeny from these 15 concatenated sequences was generated with BEAST version 1.8.2

(Drummond et al. 2012). Partitions and substitution models were chosen based on the results of a PartitionFinder (version 1.1.1) analysis (Lanfear et al. 2012) (16S: single partition of symmetric

+ gamma + invariant sites; protein sequences: 3 partitions, each LG + gamma + invariant sites

(S. Q. Le & Gascuel 2008)). Each partition was run under a log-normal relaxed clock model.

The tree model (Speciation: Yule process (Gernhard et al. 2008)) was shared across all partitions and time-calibrated using normally distributed priors describing the outgroup and ingroup MRCA with means of 1.9 and 1.7 billion years before present, respectively (Sheridan et al. 2003; Battistuzzi et al. 2004). Analysis was run for 20 million generations, sampling every

10,000 generations, producing a maximum of 2,000 trees per run prior to burn-in. The resulting phylogeny represented the maximum clade credibility (MCC) tree from a set of 5,393 trees obtained by combining 3 stably converged independent runs. Only 2 nodes had support below 1, with values of 0.9983 and


At all nodes, ancestral states were reconstructed for the four proteins of interest: ATP-

PFK, PPi-PFK, FBPase, and RuBisCO. Since genome annotations were poor or missing for


many of the organisms of interest, presence or absence of each of the four proteins as tip states was determined based on genomic BLAST version 2.6.0+ (Altschul et al. 1990). Well-annotated sequences were chosen to search for gene homologs in the genomes of interest. The resulting high-quality hits were added afterwards to the initial query and used in a repeated BLAST search until no more targets could be identified. Ancestral states for each gene were estimated using BayesTraits version 2 (Pagel et al. 2004) under a MultiState evolution model and a

Markov Chain Monte Carlo run for 5,010,000 iterations (10,000 iteration burn-in) with an exponential hyperprior. Instead of using a single MCC tree for the ancestral state reconstruction, a set of 5,392 trees was used to reflect the combined uncertainties of the phylogeny and the reconstruction. The probabilities plotted on the tree represent the median of the posterior distribution. To quantify the probability of a given predicted genotype at an ancestral node,

Bayes factors (BF) were used to compare the difference in marginal likelihoods between the genotype pair.
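For reference, the Bayes factor comparison described above is commonly expressed on a log scale. The form below follows the convention usually applied with BayesTraits output and is an assumption here, since the text does not state which scaling was used. $L_1$ and $L_2$ denote the marginal likelihoods of the two competing genotypes at a given node.

```latex
\log \mathrm{BF} = 2\left(\log L_{1} - \log L_{2}\right)
```

Under this convention, larger positive values indicate stronger support for the first genotype over the second, and values above roughly 2 are typically read as positive evidence.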

Transcriptome sequencing and analysis

RNA from the S. velum symbiont-containing gill tissue was extracted from a single specimen using the miRNeasy Mini Kit (Qiagen, Hilden, Germany). To enrich for symbiont transcripts, host mRNA was depleted using the Ambion MICROBEnrich™ kit (Thermo Fisher

Scientific, Waltham, MA). Afterwards, half of the RNA was used directly for sequencing. The other half was further enriched in the symbiont mRNA by depleting 16S (primers Sv_16SF1 +

Sv_16SR1T7) and 23S (primers Sv_23SF1 + Sv_23SR1T7) symbiont rRNA and 18S (primers

SV_18SF1_53 + SV_18SR1T7-53) and 28S (primers Sv_28F1 + Sv_28S1T7) host rRNA transcripts with custom species-specific oligonucleotide probes synthesized using the specified primers (Appendix 2 Table A2.7). rRNA depletion was performed according to the procedure


adapted from Stewart et al. (2010). RNA amplification, cDNA synthesis, and sequencing were done as described in Stewart et al. (2011).

Artificially duplicated reads which arose during pyrosequencing were identified with CD-HIT software version 4.7 (W. Li & Godzik 2006). Reads which had 100% nucleotide identity, a length difference of no more than 1 nucleotide, and identical first 3 nucleotides were discarded and only unique non-duplicates retained for further analysis (Appendix 2 Table A2.1). Next, reads corresponding to the symbiont 5S, 16S, and 23S rRNA, mitochondrial 12S and 16S rRNA, and host 18S and 28S rRNA were removed with Bowtie 2 version 2.3.2 (Langmead et al. 2009) using local alignment (--local --very-sensitive-local) to the species-specific sequences. The remaining reads were mapped to the genomes of the S. velum symbiont (NCBI Accession Number JRAA01000001) (Dmytrenko et al. 2014) and the host mitochondria (NCBI Accession Number NC_017612.1) (Plazzi et al. 2013) with Bowtie 2 using end-to-end alignment (--end-to-end --sensitive --qc-filter). These mapped reads were processed with Samtools version 1.5 (H.

Li et al. 2009) and visualized in Circos version 0.69-6 (Krzywinski et al. 2009). Highly expressed genes in the genome of the S. velum symbiont were identified using HTSeq-Count version 0.8.0

(Anders et al. 2015), normalized to gene length and sample size, and visualized in Python 3.6 with Matplotlib version 2.1.0 (Hunter 2007). The remaining reads (Appendix 2 Table A2.1) were queried with BLASTN version 2.6.0+ (Altschul et al. 1990) against the NCBI nucleotide database (11 November 2017). The resulting hits were analyzed in MEGAN version 6.10.5

(Huson et al. 2007).
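The normalization of read counts to gene length and sample size mentioned above can be sketched as follows. This is a minimal illustration that assumes the reported unit (% transcripts kb-1) means the percentage of all counted reads assigned to a gene, divided by the gene length in kilobases. The function name, gene names, and counts are hypothetical and do not come from the study.

```python
def percent_transcripts_per_kb(counts, lengths_bp):
    """Normalize raw read counts to sample size and gene length.

    counts     -- dict of gene name -> raw read count (e.g. from HTSeq-Count output)
    lengths_bp -- dict of gene name -> gene length in base pairs
    Returns a dict of gene name -> percent of all counted reads per kb of gene.
    """
    total_reads = sum(counts.values())
    normalized = {}
    for gene, n_reads in counts.items():
        percent_of_sample = 100.0 * n_reads / total_reads
        length_kb = lengths_bp[gene] / 1000.0
        normalized[gene] = percent_of_sample / length_kb
    return normalized


# Hypothetical example: two genes with equal expression per unit length.
example_counts = {"geneA": 50, "geneB": 150}
example_lengths = {"geneA": 1000, "geneB": 3000}
example_result = percent_transcripts_per_kb(example_counts, example_lengths)
print(example_result)
```

Dividing by length prevents long genes from appearing more highly expressed simply because they collect more reads.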

PPi-PFK activity in S. velum cell-free extracts

To obtain a sufficient amount of cell-free extract for measuring enzyme activity, tissues from multiple individuals (2-3) were pooled. Using a pre-chilled glass Dounce homogenizer, symbiont-containing gill and symbiont-free foot tissue samples were macerated in 1:12 w/v ice-cold extraction buffer (pH 7.0) containing 50 mM Tris-HCl, 5 mM MgCl2, 1 mM KCl, 3 mM NH4Cl, 5 mM DTT, and 2x ProteaseArrest™ (G-Biosciences, Saint Louis, MO). Soluble cell content was released by subjecting the homogenates to a series of 3 x 10 sec sonication bursts with a probe sonicator (Sonifier 250, Branson, Danbury, CT) at output level 2. Samples were kept on an ice-salt slurry (8:1 w/w) during sonication and transferred to ice between treatments. Cell lysis was monitored microscopically. The sonicated homogenates were centrifuged at 4°C for 30 min at 20,000 x g to pellet cellular debris. Supernatant containing cell-free extract was immediately used for measuring enzyme activity. Total protein in the soluble cell-free fraction was determined with the CB-Protein Assay using bovine serum albumin (BSA) as a standard (G-Biosciences, Saint Louis, MO).

PPi-PFK activity in crude cell-free extracts was measured through a method adapted from Heinonen (1981), modified to a 96-well plate format. The assay reaction contained 50 mM

Tris-HCl, 5 mM MgCl2, 1 mM KCl, 3 mM NH4Cl, 5 mM dithiothreitol (DTT), 2.5-10 mM FBP, and

90 µg cell-free extract at pH 7.5. PPi formation, indicative of the reverse PPi-PFK activity, was initiated by the addition of PO4^3- to a final concentration of 10-25 mM. Every 20 sec, 200 µl of the reaction mixture were transferred into a new well containing 66 µl of 20% trichloroacetic acid (TCA) to terminate the reaction. A total of 6 time points were sampled. Afterwards, the TCA was neutralized with 15 µl of 5 M NaOH. To precipitate PPi, 31 µl of 2 mM CaCl2 were added to the samples followed by 18.5 µl of 1 M KF. The reactions were thoroughly mixed, incubated at room temperature for 15 min, and centrifuged at 3,220 x g for 5 min. The supernatant (295 µl), containing unreacted PO4^3-, was discarded to prevent interference with PPi detection. Precipitate in the remaining 35 µl was dissolved with 165 µl of 0.4 M H2SO4. Freshly-prepared colorimetric reagent (200 µl), containing 32 mM (NH4)6Mo7O24, 0.5 M H2SO4, and 71 mM trimethylamine, was added to each well. After incubating for 15 min, the plate was centrifuged as above. The supernatant (200 µl) was transferred into new wells containing 6 µl of 2.5 M H2SO4 and


centrifuged again under the same conditions. The final supernatant (200 µl) was transferred to a transparent polystyrene 96-well plate (Greiner Bio-One, Frickenhausen, Germany) and mixed with 13 µl of 1 M 2-mercaptoethanol. After 6 min, absorbance was measured at 700 nm with a

Tecan Infinite m200 spectrophotometer (Tecan, Männedorf, Switzerland). Three biological replicates and three experimental replicates were measured per experimental condition. The concentration of PPi was determined using a standard curve.
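The conversion of the sampled absorbances into a rate of PPi formation can be sketched as below. This is an illustration under stated assumptions, not the procedure actually scripted for this study: it assumes a linear standard curve of absorbance against PPi amount and a linear time course over the six time points. All function names and numerical values are hypothetical.

```python
def linear_fit(x, y):
    """Ordinary least-squares line through (x, y); returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def ppi_formation_rate(times_s, absorbances, std_amounts, std_absorbances):
    """Rate of PPi formation (amount per second) from a sampled time course.

    The standard curve maps absorbance at 700 nm back to PPi amount;
    the rate is the slope of PPi amount against sampling time.
    """
    std_slope, std_intercept = linear_fit(std_amounts, std_absorbances)
    ppi = [(a - std_intercept) / std_slope for a in absorbances]
    rate, _ = linear_fit(times_s, ppi)
    return rate


# Hypothetical standard curve: A700 = 0.05 + 0.01 * PPi (nmol).
standards_nmol = [0.0, 10.0, 20.0, 40.0]
standards_a700 = [0.05, 0.15, 0.25, 0.45]

# Hypothetical time course: six samples taken every 20 s.
times = [20, 40, 60, 80, 100, 120]
a700 = [0.07, 0.09, 0.11, 0.13, 0.15, 0.17]

rate_nmol_per_s = ppi_formation_rate(times, a700, standards_nmol, standards_a700)
print(rate_nmol_per_s)
```

Dividing such a rate by the amount of protein in the assay would give a specific activity comparable between gill and foot extracts.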

Cloning, expression, and purification of recombinant proteins from S. velum symbionts and A. vinosum

S. velum symbiont-containing DNA was isolated from the gill tissue using DNeasy Blood

& Tissue kit (Qiagen, Hilden, Germany). DNA from A. vinosum mid log-phase culture was purified using E.N.Z.A. Bacteria DNA kit (Omega Bio-Tek, Norcross, GA). Genes encoding PPi-PFK (primers Sv_pfp_1F_NdeI + Sv_pfp_1257_SacI) and PPase (primers Sv_ppase_1F_NdeI + Sv_ppase_549R_XhoI) from the S. velum symbiont and PPi-PFK (primers Av_pfp_1F_NdeI + Av_pfp_1254R_SacI) and FBPase (primers Av_fbp_1F_NdeI + Av_fbp_1014R_SacI) from A. vinosum were PCR-amplified with the specified primers listed in Supplementary file 1, Table S6 using Q5 high-fidelity polymerase (NEB, Ipswich, MA). PCR products were digested either with

NdeI and SacI or NcoI and XhoI restriction enzymes (NEB, Ipswich, MA), purified, and cloned into the digested expression vector pET28a+ (EMD Biosciences, San Diego, CA). This procedure introduced in-frame sequences encoding histidine tags (His6) at the 5' end of each cloned gene.

Ligated products were transferred into E. coli BL21(DE3) (Thermo Fisher Scientific,

Waltham, MA) and maintained using kanamycin (Kan). Inserts were verified by PCR and DNA sequencing (DF/HCC DNA Resource Core, Boston, MA). For overexpressing recombinant proteins, 200 ml of LB medium containing Kan was inoculated with an overnight culture of E.


coli BL21(DE3) bearing one of the expression plasmids. Cultures were grown at 37°C with shaking at 180 rpm until OD600 reached 0.4-0.8 (referred to as Uninduced SDS-PAGE fractions).

Protein expression was then induced with 1 mM IPTG for 3 hours. Afterwards, cells were pelleted by centrifugation at 4,000 x g for 20 min at 4°C. Cell pellets were frozen in liquid N2 and stored at -80°C.

To purify His6-tagged proteins, 500 mg of thawed E. coli BL21(DE3) cell pellets containing the recombinant enzymes were lysed in 10 ml of xTractor buffer (Takara Bio USA,

Ann Arbor, MI) with 125 µl of LongLife PELB Lysozyme and 200 µl of 100x ProteaseArrest (G-

Biosciences, Saint Louis, MO). Lysate was centrifuged at 12,000 x g for 20 min at 4°C and the resulting supernatant collected (referred to as Induced SDS-PAGE fractions). Subsequent protein purification was carried out at 4°C using His60 Ni-IDA resin columns and buffers supplied by Takara Bio USA (Ann Arbor, MI). After columns were washed with equilibration buffer, 5 ml of the starting samples were applied to the resin and incubated for 1 hour with gentle shaking. Unbound lysates were collected (referred to as Unbound SDS-PAGE fractions) and the procedure was repeated for the remaining 5 ml. Next, columns were washed with 10 ml of equilibration buffer (referred to as Wash 1 SDS-PAGE fractions), followed by 10 ml of wash buffer (referred to as Wash 2 SDS-PAGE fractions). Bound proteins were eluted in two 1 ml elution fractions in elution buffer containing 300 mM imidazole (referred to as Elute 1 and Elute

2 SDS-PAGE fractions). Eluted protein fractions were transferred into enzyme-specific storage buffers (referred to as Storage 1 and Storage 2 SDS-PAGE fractions) using Amicon Ultra-4 3K centrifugal devices (Millipore, Billerica, MA). PPi-PFK storage buffer (pH 7.0) contained 10 mM

Tris-acetate, 0.1 mM ethylenediaminetetraacetic acid (EDTA), 0.5 mM DTT, 17 mM KCl, 1 mM

MgCl2, 1 mM FBP, and 50% (v/v) glycerol. Storage buffer (pH 7.0) for FBPase consisted of 8 mM KPO4, 1 mM EDTA, 1 mM DTT, 17 mM NaCl, and 1 mM FBP. PPase storage buffer (pH 8.0) included 20 mM Tris-HCl, 0.1 mM ZnCl2, 1 mM MgCl2, 100 mM KCl, and 1 mM DTT. Purified PPi-PFKs and PPases were stored at -20°C. FBPase was frozen in liquid N2 and stored at -80°C. Protein concentrations in individual fractions were measured using the CB-Protein Assay with BSA as a standard (G-Biosciences, Saint Louis, MO).

Individual protein fractions were analyzed by SDS-PAGE on precast 12% (w/v) acrylamide gels (Bio-Rad, Hercules, CA) with PAGEmark Unstained Marker protein ladder (G-

Biosciences, Saint Louis, MO) following the standard procedure (Laemmli 1970). 30 µg of protein from the Uninduced, Induced, Unbound, Wash 1, and Wash 2 fractions and 3 µg of protein from the Elute 1, Elute 2, Storage 1, and Storage 2 fractions were analyzed. Proteins in the gel were visualized with Oriole™ fluorescent gel stain (Bio-Rad, Hercules, CA).

Characterization of recombinant S. velum symbiont and A. vinosum enzymes

Activities of the purified PPi-PFK and FBPase enzymes were measured in coupled enzyme assays adapted from Alves et al. (1994). Assays were performed in 300 µl volume on

µclear™ 96-well plates (Greiner Bio-One, Frickenhausen, Germany). Forward PPi-PFK reaction was assayed in 50 mM Tris-HCl (pH 7.5), 5 mM MgCl2, 1 mM KCl, 3 mM NH4Cl, 5 mM DTT, 0.15 mM NADH, 0.05-7.5 mM F6P, 1.3 U fructose 1,6-bisphosphate aldolase (EC, 10 U triosephosphate isomerase (EC, 1.7 U α-glycerophosphate dehydrogenase (EC, 400-500 ng of purified recombinant PPi-PFK, and 0.01-5 mM PPi. FBPase activity and reverse PPi-PFK reaction were assayed in 50 mM Tris-HCl (pH 7.5), 5 mM MgCl2, 1 mM KCl, 3 mM NH4Cl, 5 mM DTT, 0.4 mM NADP+, 0.01-10 mM FBP, 1.75 U phosphoglucose isomerase (EC, 1.75 U glucose 6-phosphate dehydrogenase (EC, 400-500 ng of purified enzyme, and 0.5-100 mM K2HPO4. For each measurement, 100 µl of the assay components equilibrated to the required temperature were added. The reaction was initiated by the addition of 200 µl PPi for the forward and 200 µl K2HPO4 for the reverse activity assay. Reaction progress was monitored at 340 nm in a Tecan Infinite m200 spectrophotometer (Tecan,


Männedorf, Switzerland) for up to 5 min at 25°C, unless stated otherwise. pH-dependence of the enzymes was measured in PIPES- (pH 6.0-7.5, adjusted with 1 M KOH) and Tris-based (pH 7.5-9.0, adjusted with 1 M HCl) activity buffers with 20 mM PO4^3- and 5 mM FBP substrates. Temperature-dependence between 20°C and 80°C in 5°C increments was assayed in the reverse reaction in pre-equilibrated Tris-HCl buffer with 20 mM PO4^3- and 5 mM FBP substrates using a temperature-controlled Tecan spectrophotometer (Tecan, Männedorf, Switzerland) and a water bath.

Substrate inhibition and the regulatory role of PPase in the symbiont PPi-PFK reverse reaction were studied in the presence of 0.05-1.0 mM PPi, 0.01-5 mM FBP, and 20 mM PO4^3- substrates with and without 1.75 U PPase. Since a sufficient amount of the S. velum symbiont PPase could not be purified for this experiment, a commercial PPase from E. coli (I5907, Sigma-Aldrich) was used. All measurements were carried out at least in triplicate.

Concentrations of NADH, used as a reporter of PPi-PFK activity in coupled-enzyme assay for the forward reaction, were determined from a standard curve. The same standard curve was also applied to estimate concentrations of NAD(P)H, used as a reporter in the reverse assay, since NADH and NAD(P)H have the same extinction coefficient and both exhibit a maximum absorption peak at 340 nm (Bergmeyer 1975).

PPase activity was determined using a colorimetric assay based on van Alebeek & Keltjens (1994). The reaction was carried out in assay buffer containing 100 mM Tris-HCl (pH 7.5), 2 mM MgCl2, 5 mM DTT, and 300 ng recombinant PPase from the S. velum symbiont. The reaction was started by adding 0.05-1.0 mM of PPi to the total volume of 210 µl. For 100 sec at 20 sec intervals, 30 µl aliquots of the enzyme reaction were transferred into a colorimetric reagent containing 1% (NH4)6Mo7O24, 0.83 M H2SO4, and 8% FeSO4. After incubating samples for 32 min, absorbance was measured at 660 nm in a polystyrene 96-well plate (Greiner Bio-One, Frickenhausen, Germany) using a Tecan Infinite m200 spectrophotometer (Tecan, Männedorf, Switzerland). PO4^3- concentrations were estimated with the use of a standard curve.

Initial velocities were calculated in the linear range of catalytic reactions at different substrate concentrations. Kinetic constants were determined using the nonlinear least-squares Levenberg-Marquardt fitting algorithm. The Ki inhibition constant was calculated in GraphPad Prism version 7.0 (GraphPad Software, La Jolla, CA).
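As an illustration of how kinetic constants can be obtained by nonlinear least squares, the sketch below fits the Michaelis-Menten equation with a small, self-contained Levenberg-Marquardt loop in pure Python. It is not the software used in this study. The function names are hypothetical, and the synthetic, noise-free data are generated from the S. velum symbiont values in Table 2.3 (Vmax 185.5 U mg-1, Km 0.150 mM for FBP) purely to show that the fit recovers known parameters.

```python
def michaelis_menten(s, vmax, km):
    """Michaelis-Menten rate at substrate concentration s."""
    return vmax * s / (km + s)


def sum_squared_error(substrate, velocity, vmax, km):
    return sum((v - michaelis_menten(s, vmax, km)) ** 2
               for s, v in zip(substrate, velocity))


def fit_michaelis_menten(substrate, velocity, iterations=200):
    """Levenberg-Marquardt fit of (Vmax, Km); hypothetical helper for illustration."""
    vmax = max(velocity)                         # crude starting guess for Vmax
    km = sorted(substrate)[len(substrate) // 2]  # a mid-range concentration for Km
    damping = 1e-3
    error = sum_squared_error(substrate, velocity, vmax, km)
    for _ in range(iterations):
        # Accumulate J^T J (a11, a12, a22) and J^T r (g1, g2) for the two parameters.
        a11 = a12 = a22 = g1 = g2 = 0.0
        for s, v in zip(substrate, velocity):
            residual = v - michaelis_menten(s, vmax, km)
            d_vmax = s / (km + s)
            d_km = -vmax * s / (km + s) ** 2
            a11 += d_vmax * d_vmax
            a12 += d_vmax * d_km
            a22 += d_km * d_km
            g1 += d_vmax * residual
            g2 += d_km * residual
        # Marquardt damping scales the diagonal of J^T J.
        m11 = a11 * (1.0 + damping)
        m22 = a22 * (1.0 + damping)
        det = m11 * m22 - a12 * a12
        if det == 0.0:
            break
        step_vmax = (g1 * m22 - a12 * g2) / det
        step_km = (m11 * g2 - a12 * g1) / det
        new_vmax, new_km = vmax + step_vmax, km + step_km
        # Accept the step only if Km stays positive and the error decreases.
        if new_km > 0.0 and sum_squared_error(substrate, velocity, new_vmax, new_km) < error:
            vmax, km = new_vmax, new_km
            error = sum_squared_error(substrate, velocity, vmax, km)
            damping /= 10.0
        else:
            damping *= 10.0
            if damping > 1e12:  # no further improvement is possible
                break
    return vmax, km


# Synthetic, noise-free data (mM substrate, U mg-1 rates) for illustration only.
concentrations = [0.01, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0]
rates = [michaelis_menten(s, 185.5, 0.150) for s in concentrations]
fitted_vmax, fitted_km = fit_michaelis_menten(concentrations, rates)
print(fitted_vmax, fitted_km)
```

With real measurements the data would contain noise and, at high FBP or phosphate, substrate inhibition, so a model with an additional inhibition term may be more appropriate than the plain Michaelis-Menten form used here.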

To derive kcat values for the S. velum symbiont and A. vinosum PPi-PFKs, the enzymes were considered to be homodimeric, by analogy to the structure of PPi-PFK from Borrelia burgdorferi, which is the closest related PPi-PFK with a published crystal structure (Weissgerber et al. 2011). By analogy to E. coli FBPase (Kelley-Loughnane et al. 2002), the A. vinosum FBPase was assumed to be a homotetramer. The symbiont PPase was regarded as a homohexamer by analogy to the closest related homologue with a resolved crystal structure, from E. coli (Kankare et al. 1994).
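The conversion from a specific activity to a turnover number can be sketched as follows. This is a hedged illustration: it assumes Vmax is expressed in U mg-1 (µmol min-1 mg-1) and that the molecular mass refers to a single subunit. Whether kcat in this study was expressed per subunit or per oligomer is not stated here, so the `subunits` argument and the function name are assumptions made for the example.

```python
def kcat_per_second(vmax_u_per_mg, subunit_kda, subunits=1):
    """Turnover number (s-1) from specific activity.

    1 U = 1 µmol of product per minute, and a protein of 1 kDa
    weighs 1 mg per µmol, so Vmax (U mg-1) x mass (kDa) gives min-1.
    Multiplying by the number of subunits expresses kcat per oligomer.
    """
    per_minute = vmax_u_per_mg * subunit_kda * subunits
    return per_minute / 60.0


# S. velum symbiont PPi-PFK, reverse reaction (values from Table 2.3).
kcat_subunit = kcat_per_second(185.5, 47.6)            # per 47.6 kDa subunit
kcat_dimer = kcat_per_second(185.5, 47.6, subunits=2)  # per assumed homodimer
print(round(kcat_subunit, 1), round(kcat_dimer, 1))
```

The oligomeric state therefore changes kcat by a constant factor only, but that factor carries through to any catalytic efficiency derived from it.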

Catalytic efficiencies (Ef) were calculated using Equation 1 from Ceccarelli et al. (2008).

[Equation 1: efficiency function Ef of Ceccarelli et al. (2008), expressed in terms of the parameters defined below]

KM - Michaelis constant

kcat - rate constant

k'cat - rate constant for the reverse reaction

kdif - rate constant for a diffusion-controlled process (10^9 M^-1 s^-1)

Ke - equilibrium constant

ϴ - reversibility of the reaction (1 – reversible, 0 – irreversible)

Only enzymes with published KmFBP and kcatFBP values were included in this analysis. In particular, kinetic data for the Thermotoga maritima FBPase were taken from Myung et al.


(2010). Catalytic parameters for E. coli FBPase were obtained from Kelley-Loughnane et al.

(2002). The E. coli type II FBPases, GlpX and YggF, were described by Brown et al. (2009).

Kinetic data for PPi-PFK from Xanthomonas campestris were documented in Frese et al.


The enigmatic Calvin cycle

of chemoautotrophic bacterial symbionts deciphered

Oleg Dmytrenko1, Alicja J. Kunikowska2, Colleen M. Cavanaugh1

1Department of Organismic and Evolutionary Biology, Harvard University, Cambridge,

Massachusetts, United States of America.

2Klinikum Rechts der Isar der Technischen Universität München, Munich, Germany.



Autotrophic CO2 fixation is the main source of organic carbon on Earth. Virtually all primary productivity, from bacteria to higher plants, is carried out by a conserved set of enzymatic reactions which constitute the Calvin-Benson-Bassham (Calvin) cycle. Chemoautotrophic gammaproteobacterial endosymbionts of marine invertebrates are some of the most prolific primary producers which use the Calvin cycle. However, these bacteria lack a gene for fructose bisphosphatase (FBPase), a key enzyme in this CO2 fixation pathway. Since sequencing of the first symbiont genome it has remained unknown how the Calvin cycle operates in these bacteria without FBPase. This was partially due to our inability to culture and genetically manipulate chemoautotrophic symbionts. By reconstructing the symbiont-like Calvin cycle in a free-living, closely related purple sulfur gammaproteobacterium, Allochromatium vinosum, we have for the first time demonstrated that in the absence of FBPase its function in the cycle can be performed by a reversible pyrophosphate-dependent phosphofructokinase (PPi-PFK), previously hypothesized to participate in CO2 fixation. The shift from FBPase to PPi-PFK came at the cost of reduced growth and decreased adaptability but, at the same time, offered an improvement in thermodynamic efficiency, potentially due to an increase in the metabolism of pyrophosphate, which could be generated by PPi-PFK acting in the Calvin cycle. Using this experimental approach we have not only demonstrated a novel energy-efficient variant of the Calvin cycle hypothesized in chemoautotrophic symbionts, but also shown the feasibility of experimentally testing metabolic hypotheses postulated based on sequence data from uncultured symbiotic microorganisms.



The vast majority of known bacteria remain uncultured (Pace 2009; Robertson et al.

2013) and are not easily amenable to culture-independent experimental manipulation. This severely limits the possibilities for studying the physiology, function, and activity of uncultured bacteria such as chemoautotrophic endosymbionts of marine invertebrates (Cavanaugh et al.

2013). Inferences made from DNA and RNA sequence data have helped advance understanding of the symbionts and inform future research directions. Multiple insightful hypotheses have been proposed based on sequence data, many of which await experimental validation. One of the most notable such hypotheses in the field of chemoautotrophic symbioses traces back to the first sequenced symbiont genome, that of the intracellular gammaproteobacterium which colonizes gills of the deep-sea vent giant clam, Calyptogena magnifica (Newton et al. 2007). The gene encoding fructose bisphosphatase (FBPase), an enzyme which catalyzes essential reactions in the Calvin cycle, has not been found in the genome of this or any other gammaproteobacterial symbiont sequenced to date. Despite the absence of FBPase, the symbionts are able to fix CO2 using ribulose 1,5-bisphosphate carboxylase oxygenase (RuBisCO), the key enzyme in the Calvin cycle (Felbeck et al. 1981;

Cavanaugh 1983; Robinson et al. 1998; Singer et al. 1952; Erb & Zarzycki 2018). The resulting organic carbon feeds their hosts in exchange for sulfide and oxygen sequestered and delivered to the bacteria (Fisher & Childress 1992; Polz et al. 2000; Hourdez & Weber 2005; Scott &

Cavanaugh 2007). Chemoautotrophic symbionts are, in fact, among the most prolific primary producers in the ocean, capable of supporting some of the fastest known growth rates among marine invertebrates (Lutz et al. 1994). It has been hypothesized that the function of the missing

FBPase in the symbionts may be performed by a pyrophosphate-dependent phosphofructokinase (PPi-PFK) acting in reverse (Newton et al. 2007; Markert et al. 2007;

Kleiner et al. 2012; Dmytrenko et al. 2014). A comprehensive survey of bacterial genomes


reveals that this genetic trait is confined to two disparate monophyletic clades dominated by symbionts within gammaproteobacteria (Dmytrenko et al. 2018), suggesting a potential link between the evolution of chemoautotrophic symbioses and the shift from FBPase to PPi-PFK in the Calvin cycle.


[Figure 3.1: Calvin cycle pathway diagram. Enzyme labels shown: RuBisCO, PGK, GAPDH, TPI, FBA, FBPase/PPi-PFK, TK, RPE, PRK.]

Figure 3.1. Hypothesized Calvin cycle in A. vinosum featuring interchangeable FBPase and PPi-PFK activity. The reactions catalyzed by PPi-PFK are shown in red. Enzyme names and their corresponding locus tags in the A. vinosum genome are listed in Appendix 3 Table A3.1.

The Calvin cycle is the primary biological CO2 fixation pathway found in bacteria, algae, and higher plants, which is responsible for over 90% of primary production (Raven 2009; Berg

2011; Schwander et al. 2016). It incorporates carbon into biomass through the concerted action of thirteen evolutionarily conserved reactions catalyzed by enzymes that may be orthologous, paralogous, or structurally unrelated to each other in different species

(Figure 3.1) (Martin & Schnarrenberger 1997). RuBisCO combines CO2 with ribulose bisphosphate (RBP) to make two 3-phosphoglycerates, which are then converted into triose phosphates. They, in turn, may serve as precursors of most other organic carbon molecules.


The remaining reactions of the Calvin cycle regenerate RBP from triose phosphates for the next round of CO2 fixation and supply intermediates to other cellular pathways (Sato & Atomi 2010;

Bar-Even et al. 2012). One of the enzymes involved in RBP regeneration is FBPase, which in bacteria dephosphorylates fructose 1,6-bisphosphate (FBP) and sedoheptulose 1,7-bisphosphate (SBP) to fructose 6-phosphate (F6P) and sedoheptulose 7-phosphate (S7P), respectively (Gerbling et al. 1986; Yoo & Bowien 1995). Under physiological conditions these reactions are irreversible. Conversion of FBP to F6P via FBPase is also part of gluconeogenesis.

Dephosphorylation of SBP, on the other hand, is specific to the Calvin cycle and in eukaryotes is catalyzed by a separate enzyme, sedoheptulose bisphosphatase (SBPase), which has no affinity for FBP (Teich et al. 2007). Without FBPase activity, the RBP CO2 acceptor could not be regenerated, stalling out the Calvin cycle.

Chemoautotrophic symbionts which do not encode FBPase in their genomes may instead use a bidirectional PPi-PFK, an enzyme thought to operate as a kinase in glycolysis using pyrophosphate (PPi) as a phosphoryl group donor, unlike the more common virtually irreversible ATP-dependent PFK (Mertens 1991). PPi-PFK, which shares common ancestry with

ATP-PFK (Bapteste et al. 2003), was first discovered in Entamoeba histolytica (Reeves et al.

1974) and later found in bacteria (O'Brien et al. 1975), plants (Carnal & Black 1979), and archaea (Siebers et al. 1998). It is unclear why some organisms have one or the other version of PFK, as most do not simultaneously carry genes for both enzymes (Mertens 1991;

Dmytrenko et al. 2018). Similar to bacterial FBPases, bacterial PPi-PFKs are known to be promiscuous for FBP and SBP in the reverse reaction (Reshetnikov et al. 2008), which suggests that these enzymes could be interchangeable in the Calvin cycle. Unlike FBPase, PPi-PFK generates PPi in the reverse reaction and consumes PPi when PPi-PFK acts as a kinase in the forward (glycolytic) direction. Bacteria have on average high PPi content (0.5 to 1.5 mM), which is produced, for instance, during syntheses of DNA, RNA, proteins, and polysaccharides


(Heinonen & Drake 1988; Bornefeld 1981; J. Chen et al. 1990). At these concentrations PPi is inhibitory to the reverse PPi-PFK activity (Anon 1989; Anon 1996). For example, in the case of

PPi-PFK from a chemoautotrophic symbiont of the coastal bivalve Solemya velum, 1 mM PPi reduces the catalytic efficiency of the reverse reaction by more than threefold and makes the forward reaction more favorable (Dmytrenko et al. 2018). To prevent this product inhibition and allow PPi-PFK to function in reverse, a low cellular PPi concentration has to be maintained.

This can be accomplished by the action of diverse PPi-consuming enzymes, such as inorganic pyrophosphatase (PPase) (Josse 1966; Klemme & Gest 1971; van Alebeek & Keltjens 1994;

Jeon & Ishikawa 2005; Hoelzle et al. 2010), proton pumping PPase (H+-PPase) (Nyrén et al.

1984; Ordaz et al. 1992; Schultz & Baltscheffsky 2003; Serrano et al. 2004), or ATP sulfurylase

(SAT) (Parey et al. 2013). As H+-PPase hydrolyzes PPi, this membrane-bound enzyme creates an electrochemical proton gradient which can be used for ATP synthesis. SAT activity can also lead to ATP formation as a result of transferring PPi to adenosine 5'-phosphosulfate (APS) in the final step of sulfur oxidation to sulfate (SO₄²⁻). The above enzymes, detected in chemoautotrophic symbionts (Felbeck et al. 1981; C. Chen et al. 1987; Fisher et al. 1993; Laue

& Nelson 1994) and identified in their genomes and proteomes (Kleiner et al. 2012; Markert et al. 2011; Dmytrenko et al. 2014), could reduce the available pool of PPi, stimulating PPi-PFK reverse activity and concomitantly increasing cellular ATP content. The need for PPi removal suggests that the use of PPi-PFK in the Calvin cycle may affect not only CO2 fixation, but the overall energy balance and physiology of the cells.
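The reactions discussed above can be written out side by side. The block below is a summary sketch using standard textbook stoichiometries, with charges and protons omitted. It is not taken from the chapter's figures, and it assumes the amsmath package for `align*` and `\text`.

```latex
\begin{align*}
\text{FBPase:}\quad
  & \mathrm{FBP + H_2O \longrightarrow F6P + P_i} \\
  & \mathrm{SBP + H_2O \longrightarrow S7P + P_i} \\
\text{PPi-PFK (reverse direction):}\quad
  & \mathrm{FBP + P_i \rightleftharpoons F6P + PP_i} \\
  & \mathrm{SBP + P_i \rightleftharpoons S7P + PP_i} \\
\text{PPase:}\quad
  & \mathrm{PP_i + H_2O \longrightarrow 2\,P_i} \\
\text{SAT:}\quad
  & \mathrm{APS + PP_i \rightleftharpoons ATP + SO_4^{2-}}
\end{align*}
```

Written this way, the contrast is easier to see: FBPase releases orthophosphate by hydrolysis, whereas the reverse PPi-PFK reaction conserves the phosphoryl group as PPi, which PPase or SAT can then remove.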

Since chemoautotrophic symbionts are yet to be cultured outside of their host and no molecular genetics tools are available to study their gene function, a genetically tractable model bacterium is needed to experimentally test the hypothesized ability of PPi-PFK to replace

FBPase in the Calvin cycle. Using a free-living bacterium could overcome the limitations of working with the symbionts and provide a controlled experimental system free from potential


influences of the host. Allochromatium vinosum is well suited for this purpose. It is a free-living

2.0 µm x 2.5-6.0 µm rod-shaped anoxygenic facultatively photolithoautotrophic purple sulfur gammaproteobacterium, which is closely related to chemoautotrophic symbionts (Imhoff 2005;

Dubilier et al. 2008; Dmytrenko et al. 2014). Tools for manipulative genetics have been developed for A. vinosum (Pattaragulwanit & Dahl 1995; Lubbe et al. 2006) and successfully applied to the study of sulfur metabolism (Sander et al. 2006; Dahl et al. 2013; Stockdreher et al. 2014). Sequencing of the A. vinosum genome (Weissgerber et al. 2011) has expanded opportunities for applying the available genetic tools to studying other aspects of its physiology, such as carbon metabolism. A. vinosum is able to grow photolithoautotrophically by fixing CO2 via the Calvin cycle with energy obtained from light using reduced sulfur as electron donors

(Imhoff 2005). A. vinosum is metabolically versatile and can utilize sulfide, elemental sulfur, polysulfides, thiosulfate, and sulfite. It also grows photoorganoheterotrophically, e.g., on acetate, malate, or pyruvate, providing an opportunity for creating and maintaining knockout mutants which are deficient in CO2 fixation. In its genome, A. vinosum encodes FBPase and

PPi-PFK, which have both been previously purified and shown to dephosphorylate FBP to F6P

(Dmytrenko et al. 2018). By selectively inactivating genes for FBPase (fbp) and PPi-PFK (pfp) in

A. vinosum, we have examined their role in CO2 fixation and physiology. In particular, by creating the symbiont-like ∆fbp knockout, we tested the hypothesized ability of the remaining

PPi-PFK to replace FBPase in the Calvin cycle. Effects of the shift from FBPase to PPi-PFK in

A. vinosum were investigated by measuring growth- and CO2-fixation rates in the wild type (WT) and the knockouts. To quantify energy savings potentially associated with PPi-PFK use in the

Calvin cycle, we assessed sulfide consumption rates, as a proxy for reducing equivalents, and ATP levels in the culture.

In this study we recreated the hypothesized Calvin cycle from chemoautotrophic symbionts in A. vinosum and demonstrated that pfp is sufficient and, in the absence of fbp,


essential for CO2 fixation and growth. Our results provided compelling evidence for a novel

Calvin cycle variant, potentially deciphering one of the biggest conundrums in chemoautotrophic symbiosis. Our data yielded important insights into the physiological changes associated with the shift from FBPase to PPi-PFK, including an increase in thermodynamic efficiency at the cost of adaptability and growth rate. This observation may have direct implications for understanding the metabolic changes potentially associated with the evolution of chemoautotrophic symbioses.

Finally, our experimental approach demonstrates the feasibility of testing hypotheses based on sequence data from uncultured symbiotic bacteria using molecular genetics in closely-related experimentally tractable organisms.

Results

fbp and pfp genes were knocked out in A. vinosum

In A. vinosum, fbp and pfp genes were deleted by double-crossover homologous recombination (Figure 3.2). In the process, fbp and pfp were replaced in frame with the antibiotic resistance genes aphA and aacC1, respectively. These markers were placed under the control of the Pfbp and Ppfp promoters to prevent potential polar effects on downstream genes. The success rate of obtaining A. vinosum ∆fbp knockouts was 100%. Double-crossover ∆pfp mutants constituted 14% of the screened colonies. In the case of ∆fbp ∆pfp mutants, made as an fbp knockout in the ∆pfp background, 7% of over one hundred tested colonies carried both deletions.


[Figure 3.2: (A) fbp deletion and (B) pfp deletion. Each panel shows the WT locus, the pCM433 homologous recombination (HR) template (fbpL::aphA::fbpR or pfpL::aacC1::pfpR), the resulting knockout locus, and a PCR gel with lanes for ∆fbp::aphA, ∆pfp::aacC1, ∆fbp::aphA ∆pfp::aacC1, WT, and a negative control next to a ladder (0.5, 1.0, 2.0, and 3.0 kb).]

Figure 3.2. Construction and PCR analysis of (A) fbp and (B) pfp gene knockouts in A. vinosum. fbp and pfp genes (purple) were replaced by homologous recombination with promoterless aphA kanamycin resistance (green) and aacC1 gentamicin resistance (orange) genes. The double crossover homologous recombination (HR) products fused the aphA and aacC1 antibiotic resistance genes with Pfbp and Ppfp promoters. To create ∆fbp ∆pfp double knockout, aacC1 with a constitutive gentamicin promoter was used.


Either pfp or fbp is sufficient and essential for autotrophic growth in A. vinosum

To investigate the effects of fbp and pfp knockouts on the physiology of A. vinosum, bacteria were grown in a controlled anaerobic bioreactor (Figure 3.3). pH, sulfide concentration, temperature, and illumination were continuously monitored and kept constant. Optical density of the culture was measured throughout growth.

The A. vinosum ∆fbp and ∆pfp knockouts maintained their ability to grow photolithoautotrophically in the presence of light, with sulfide as the source of electrons and supplemented bicarbonate as the sole source of dissolved inorganic carbon (DIC) (Figure 3.4).

Similarly to the WT, ∆pfp knockout entered exponential growth phase approximately 40 hours after inoculation. A. vinosum ∆fbp had a significantly longer lag, starting to grow after, on average, 100 hours. Deletion of both genes completely abolished autotrophic growth even though the cells remained metabolically active and consumed sulfide throughout incubation

(Appendix 3 Figure A3.1).

A. vinosum WT, ∆fbp, and ∆pfp were able to grow photoorganoheterotrophically in a medium containing acetate, malate, and thiosulfate (Figure 3.5). Continuous sulfide feeding and pH control, necessary to maintain autotrophic growth, were not used. Under heterotrophic conditions, WT and ∆pfp exhibited similar growth. A. vinosum ∆fbp followed a comparable growth dynamic, plateauing, however, at a lower optical density. At approximately OD690 0.75 all strains underwent a diauxic shift. No significant lag in growth was observed between the three strains.


Figure 3.3. Bioreactor setup for growing A. vinosum with automated pH control, sulfide feeding, and optical density measurements. Cultures were incubated in the presence of 0.3-0.5 mM sulfide, at pH 7.0, 30℃, and 42,000 Lux (400-700 nm). Peristaltic pumps are not shown. (The fishing person is included as an homage to Melvin Calvin (Wilson & Calvin 1955)).


Figure 3.4. Photoautotrophic growth of A. vinosum WT, ∆fbp, ∆pfp, and ∆fbp ∆pfp. OD690 was measured every 10 min for the WT, ∆fbp, ∆pfp, and every 30 min for the ∆fbp ∆pfp cultures. Shaded areas around mean values for each strain indicate standard error of the mean (SEM) (WT N=2, ∆fbp N=3, ∆pfp N=3).

The A. vinosum ∆fbp ∆pfp knockout was unable to grow heterotrophically unless fructose or glucose was present in the medium (Figure 3.6). Growth commenced 120 and 420 hours after inoculation when the cultures were supplemented with fructose and glucose, respectively.

Compared to glucose, an equimolar amount of fructose yielded an overall higher cell density. Other sugars, such as sucrose, rhamnose, and glucuronate were unable to complement the mutant phenotype even after over two months of incubation.

Under autotrophic and heterotrophic conditions A. vinosum WT showed comparable growth rates (Figure 3.7). Single knockout mutations did not affect growth rates on malate and


acetate. With DIC as the sole carbon source, A. vinosum ∆fbp grew on average 26% slower than the WT and 20% slower than the ∆pfp knockout. No significant difference in growth rates between the WT and ∆pfp was observed. The ∆fbp ∆pfp mutant was able to grow neither autotrophically nor heterotrophically. Supplementation with fructose or glucose enabled growth of the double knockout at 30% of the WT rate.
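The rate comparisons above can be illustrated with a small calculation. The sketch below assumes exponential growth and estimates a specific growth rate as the least-squares slope of ln(OD) against time. This section does not spell out the chapter's own fitting procedure, so the function names and the OD readings are hypothetical illustrations rather than the author's data or method.

```python
import math


def specific_growth_rate(times_h, od_values):
    """Least-squares slope of ln(OD) versus time, in units of 1/h."""
    logs = [math.log(od) for od in od_values]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs))
    return sxy / sxx


def doubling_time(mu_per_h):
    """Doubling time in hours for a specific growth rate mu (1/h)."""
    return math.log(2) / mu_per_h


def percent_slower(mu_reference, mu_test):
    """How much slower mu_test is than mu_reference, in percent."""
    return 100.0 * (mu_reference - mu_test) / mu_reference


# Hypothetical OD690 readings for a culture that doubles every 8 hours.
times = [0, 2, 4, 6, 8]
ods = [0.1 * 2 ** (t / 8) for t in times]
mu_example = specific_growth_rate(times, ods)

print(f"mu = {mu_example:.4f} 1/h, doubling time = {doubling_time(mu_example):.1f} h")
# A strain growing at 74% of the reference rate is 26% slower.
print(f"{percent_slower(1.0, 0.74):.0f}% slower")
```

The 8-hour doubling time in the example was chosen only because it falls within the 7-10 hour range the chapter cites for A. vinosum.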

Figure 3.5. Photoheterotrophic growth of A. vinosum WT, ∆fbp, ∆pfp. OD690 was measured every 10 min. Shaded areas around mean values for each strain indicate SEM (N=2).

A. vinosum ∆pfp and ∆fbp knockouts do not affect autotrophic CO2 fixation rates

To study the effects of ∆fbp and ∆pfp mutations on CO2 fixation, which in A. vinosum is primarily carried out by the Calvin cycle (Weissgerber, Sylvester, et al. 2014; Weissgerber,

Watanabe, et al. 2014; T. Tang et al. 2017), CO2 fixation rates in A. vinosum WT and knockout strains were measured under autotrophic and heterotrophic conditions using 13C-labeled DIC

(Figure 3.8). No significant difference was observed in carbon fixation rates among the WT,


∆fbp, and ∆pfp grown autotrophically. In the heterotrophic medium, 90-99% lower CO2 fixation rates were measured. A. vinosum WT and ∆pfp fixed equivalent amounts of carbon during heterotrophy. ∆fbp and ∆fbp ∆pfp incorporated 13C label at 50% and 90% lower rate than that of the WT, respectively.

Figure 3.6. Growth of A. vinosum ∆fbp ∆pfp in heterotrophic medium supplemented with sugars. OD690 of the cultures was measured approximately every 12 hours. Error bars around mean values indicate SEM (N=3).

A. vinosum ∆fbp has significantly reduced rates of sulfide consumption

The hypothesized PPi-PFK use in the Calvin cycle and the associated production of high energy phosphate, i.e., PPi, may affect sulfur metabolism of A. vinosum under autotrophic conditions. The effects of ∆fbp and ∆pfp gene loss on sulfide consumption through anaerobic sulfur oxidation were quantified by monitoring sulfide concentration in the cultures. Sulfide consumption rates (nmol S-1 mg protein-1 min-1) were calculated based on changes in sulfide


concentration in the bioreactor throughout autotrophic growth of the A. vinosum WT, ∆fbp, and

∆pfp cultures (Appendix 3 Figure A3.2). No significant difference in sulfide consumption rates was observed between A. vinosum WT and ∆pfp. Rates decreased by at least 75% in the bioreactor cultures containing ∆fbp. As cell densities increased over time, sulfide consumption rates decreased, while the general relationship between consumption rates by different strains remained comparable throughout growth.
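As a rough illustration of how a protein-normalized consumption rate can be derived from a change in sulfide concentration, consider the sketch below. The culture volume, protein mass, and time window are hypothetical, and the calculation ignores sulfide added by the feed during the window. It is an assumption-laden sketch, not the procedure used to generate Figure A3.2.

```python
def sulfide_consumption_rate(conc_start_mM, conc_end_mM, minutes,
                             volume_mL, protein_mg):
    """Return sulfide consumed in nmol per mg protein per minute.

    1 mM equals 1 umol per mL, so (mM * mL) gives umol; * 1000 gives nmol.
    """
    consumed_nmol = (conc_start_mM - conc_end_mM) * volume_mL * 1000.0
    return consumed_nmol / (protein_mg * minutes)


def percent_of_reference(rate, reference_rate):
    """Express a rate as a percentage of a reference (e.g. WT) rate."""
    return 100.0 * rate / reference_rate


# Hypothetical window: sulfide falls from 0.5 to 0.3 mM in 10 min
# in a 1000 mL culture containing 50 mg of total protein.
wt_rate = sulfide_consumption_rate(0.5, 0.3, 10, 1000, 50)

# A strain consuming sulfide 75% more slowly retains 25% of that rate.
slow_rate = wt_rate * (1 - 0.75)

print(f"reference: {wt_rate:.0f} nmol/mg/min")
print(f"reduced:   {slow_rate:.0f} nmol/mg/min "
      f"({percent_of_reference(slow_rate, wt_rate):.0f}% of reference)")
```

The 0.3 and 0.5 mM values reuse the sulfide range given in the Figure 3.3 caption. Everything else is made up for the example.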

Figure 3.7. Growth rates of A. vinosum WT, ∆fbp, ∆pfp, and ∆fbp ∆pfp under heterotrophic and autotrophic conditions measured within linear range (100 min for autotrophic and heterotrophic and 36 hours for heterotrophic ∆fbp ∆pfp cultures). For heterotrophic growth, rates before (darker hue) and after (lighter hue) diauxic shift are shown. Error bars around mean values indicate SEM (autotrophic WT N=2, ∆fbp N = 3, ∆pfp N = 3; heterotrophic N=2; heterotrophic ∆fbp ∆pfp N=3). ****p<0.05 by ANOVA with Fisher's Least Significant Difference (LSD) test.

A. vinosum WT and ∆fbp strains have comparable ATP levels

PPi, potentially generated by PPi-PFK acting in the Calvin cycle, may be converted into ATP through a number of potential mechanisms. Here we measured ATP levels in A. vinosum WT, ∆fbp, and ∆pfp at regular intervals throughout autotrophic growth (Figure 3.9). During most of the exponential growth phase, WT and ∆fbp had comparable amounts of ATP (nmol ATP mg protein⁻¹). However, at the same optical densities, A. vinosum ∆pfp exhibited significantly lower ATP levels. In all cultures the amount of ATP per unit protein decreased as cell densities increased.

Figure 3.8. CO2 fixation rates of A. vinosum WT, ∆fbp, ∆pfp, and ∆fbp ∆pfp under autotrophic and heterotrophic conditions measured during exponential growth. Heterotrophic ∆fbp ∆pfp culture was supplemented with fructose. Error bars indicate SEM (WT N=2, ∆fbp N=3, ∆pfp N=3; replicates=2). ****p<0.05 by ANOVA with LSD test.


Figure 3.9. Amount of ATP throughout growth of A. vinosum WT, ∆fbp, and ∆pfp under autotrophic conditions. Error bars around mean values indicate SEM (WT N=2, ∆fbp N=3, ∆pfp N=3; replicates=3).


The Calvin cycle is carried out by a set of enzymes which perform conserved substrate conversion steps enabling CO2 incorporation into biomass (Figure 3.1) (Martin &

Schnarrenberger 1997). Our study for the first time experimentally demonstrates a variation of the cycle in which two of the steps, dephosphorylation of FBP and SBP to F6P and S7P, may be catalyzed not by FBPase but by PPi-PFK. The observed physiological changes which accompany this substitution elucidate possible selective advantages which may have led to the evolution of PPi-PFK use in the Calvin cycle of chemoautotrophic symbionts, bacteria in which this departure from the canonical pathway is thought to occur (Newton et al. 2007; Markert et al.

2007; Kleiner et al. 2012; Dmytrenko et al. 2014).


Experimental investigation into the potential role of PPi-PFK in CO2 fixation was carried out in A. vinosum, a genetically tractable facultative photolithoautotrophic bacterium

(Pattaragulwanit & Dahl 1995), which is closely related to chemoautotrophic symbionts. fbp and pfp genes were knocked out by replacing their entire protein-coding sequences in frame with selectable antibiotic markers while retaining the endogenous promoters (Figure 3.2). This mutagenesis strategy maximally ensured that the observed knockout phenotypes reflect the effects of gene loss and, by extension, absence of either FBPase or PPi-PFK enzyme activity.

Deletion of individual genes, especially in the case of the symbiont-like ∆fbp mutation, did not reduce overall viability of A. vinosum. A double knockout, on the other hand, resulted in glucose/fructose auxotrophy, consistent with the predicted functions of fbp and pfp.

The ability of single—but not double—A. vinosum ∆fbp and ∆pfp knockouts to grow autotrophically with CO2 as the sole carbon source suggests that fbp and pfp complement each other and either of them is sufficient and essential for autotrophic growth (Figure 3.4). This observation is in agreement with a parallel study which demonstrated that purified A. vinosum

FBPase and PPi-PFK enzymes are capable of catalyzing the same essential reaction in the

Calvin cycle, namely the dephosphorylation of FBP to F6P (Dmytrenko et al. 2018). The measured growth rates (Figure 3.7, Appendix 3 Table A3.2) are well within the range of the doubling times (7-10 hours) previously reported for A. vinosum (Weissgerber et al. 2013). Prior to inoculation into autotrophic environment within a bioreactor all cultures were grown in a heterotrophic medium until approximately mid-log phase. Following the transfer A. vinosum WT,

∆fbp, and ∆pfp entered an apparent lag in growth (Figure 3.4), during which the changes in gene expression and protein synthesis necessary to accommodate a shift in the growth mode from heterotrophy to autotrophy likely took place. When transferred from heterotrophic to autotrophic medium, A. vinosum is known to upregulate expression of, among others, the Calvin cycle genes encoding RuBisCO, phosphoglycerate kinase, transketolase, and phosphoribulokinase


(T. Tang et al. 2017; Weissgerber, Sylvester, et al. 2014; Fuller et al. 1961) as well as downregulate carbon storage regulator A (CsrA) (Weissgerber, Sylvester, et al. 2014), a global posttranscriptional regulator protein known to negatively control activity of gluconeogenic enzymes such as FBPase in E. coli (Revelles et al. 2013; Timmermans & Van Melderen 2009).

Following the transfer into autotrophic environment, transcription and the corresponding protein levels also increase for flavocytochrome c (FccAB), sulfide:quinone oxidoreductases (SqrD and

SqrF), and the Dsr system (Weissgerber et al. 2013; Weissgerber, Sylvester, et al. 2014) involved in sulfide oxidation. Duration of the lag, approximately 40 hours, did not differ between the WT and the ∆pfp knockout. In the case of the ∆fbp mutant the lag was significantly longer, lasting approximately 100 hours. This delay in autotrophic growth of ∆fbp could be explained by slow initiation of pfp transcription, driven by what is likely a regulated promoter, and by an alternative start codon, GTG, associated with lower levels of translation initiation

(Kozak 1999). Additionally, after the PPi-PFK protein is synthesized, it may not be fully capable of complementing the FBPase deficiency until the cytoplasmic PPi concentration, usually high in bacteria (Bornefeld 1981; Heinonen & Drake 1988; J. Chen et al. 1990; Heinonen & Heinonen

2001), is sufficiently reduced, for example, through the action of PPases or SAT.

Unlike the single knockouts, the A. vinosum ∆fbp ∆pfp fails to grow when transferred from the heterotrophic medium supplemented with fructose into minimal autotrophic environment. The transferred bacterial cells, however, remain metabolically active, as evidenced by continuous, although very slow, rate of sulfide consumption in the culture throughout incubation (40 days) (Appendix 3 Figure A3.1). This sulfide consumption in the bioreactor cannot be accounted for by abiotic processes and is likely due to bacterial metabolism. In the absence of both fbp and pfp genes A. vinosum thus appears to lose its ability to fix CO2 into biomass and as a result, produce organic carbon required for growth. However, it retains capacity to oxidize sulfide and persist in a vegetative state.


Separate deletion of fbp and pfp in A. vinosum has no major effect on the ability of this bacterium to grow heterotrophically, primarily on acetate and malate (Figure 3.5). Under these conditions no lag in growth occurred for A. vinosum WT, ∆fbp, or ∆pfp when the bioreactor was inoculated with pre-cultures grown in medium of the same composition. As growth progressed, a distinctive diauxic pattern shared by all three strains became apparent. The diauxic shift, followed by a brief lag phase, occurred around OD690 0.75. After the shift, the growth rates decreased (Figure 3.5, Appendix 3 Table A3.2), signaling that the cells had likely switched from a more preferred carbon substrate to a less preferred one (Monod 1947). The initial faster growth phase was short, suggesting that the more preferred substrate may have been acetate, present in a lower concentration (2 mM) compared to a potentially less favored malate (21 mM), and, therefore, likely producing a shorter burst of growth. Acetate is more reduced than malate (McKinlay & Harwood 2010) and yields a higher standard free energy change in the first step of acetate metabolism catalyzed by acetyl-CoA synthetase (-20 kJ mol⁻¹) (van Rossum et al. 2016) compared to -8 kJ mol⁻¹ for malic enzyme operating in the decarboxylation direction (Kunkee 1967). Once cell densities in the bioreactor reach OD690 of approximately 1.5, A. vinosum ∆fbp starts to show a faster decrease in growth rate than the WT or the ∆pfp knockout, eventually plateauing at a lower optical density. Knowing that A. vinosum

∆fbp is unable to shift from organic to inorganic carbon in the medium as rapidly as the other two strains (Figure 3.4), it is likely that the WT and ∆pfp reach stationary phase at a higher

OD690 by more readily consuming the background amount of DIC that may be present in the medium. This observed succession of growth phases suggests that A. vinosum may preferentially first consume acetate, then malate, and finally DIC and that the loss of the

FBPase enzyme due to ∆fbp mutation impairs its ability to rapidly adapt to a change from organic to inorganic sources of carbon.


A. vinosum which carries the ∆fbp ∆pfp knockout is a fructose/glucose auxotroph (Figure

3.6). The double mutant completely loses its ability to grow on minimal autotrophic (Figure 3.4) and heterotrophic media (Figure 3.6). Supplementation with fructose or glucose reverses the mutant phenotype and partially restores growth. The glucose/fructose auxotrophy of the double knockout agrees with the predicted metabolic roles of FBPase and PPi-PFK in A. vinosum. Deletion of fbp and pfp appears to deprive the bacterium of its ability to make the sugars required for synthesis of cellular components from acetate or malate via gluconeogenesis. In the case of CO2 fixation, these mutations would stall the Calvin cycle by preventing regeneration of the cycle intermediates. To complement the auxotrophy, glucose or fructose must be taken up from the medium through an as yet unidentified transferase system, since A. vinosum does not encode any of the known sugar transporters in its genome (Weissgerber et al. 2011). Uptake and utilization of fructose and glucose likely occur through a mechanism specific to the type of sugar molecule, as only two of the five tested saccharides were able to complement the ∆fbp and ∆pfp loss-of-function mutations. The implicated fructose and glucose transport and metabolism genes do not appear to be constitutively expressed in A. vinosum, instead being potentially induced after discrete periods of time following inoculation. Growth with fructose commences earlier than in glucose-supplemented cultures and reaches a higher optical density, suggesting that fructose more readily complements the mutation and may be a more preferred substrate overall. These findings not only provide strong support for the hypothesis regarding the potential role of PPi-PFK in gluconeogenesis and the Calvin cycle, but also supply new insights into the metabolism of

A. vinosum, which has been previously reported incapable of utilizing either fructose or glucose

(Imhoff 2005).

Incorporation of the 13C-label from DIC into the biomass of A. vinosum ∆fbp and ∆pfp testifies to the unimpaired ability of these knockouts to fix CO2 in comparison to the WT (Figure

3.8). Lack of significant differences in the rates measured under autotrophic conditions implies


that the enzymatic reactions of the Calvin cycle potentially affected by the two mutations may not be rate-limiting and that FBPase and PPi-PFK, which under substrate saturation and in the absence of inhibition have similar catalytic efficiencies (Dmytrenko et al. 2018), operate in a favorable cellular environment throughout autotrophic growth. Although rates of growth during autotrophy and heterotrophy are not significantly different (Figure 3.7), CO2 assimilation in heterotrophic medium was on average only 10% of the autotrophic level (Figure 3.8), which could be attributed to RuBisCO activity in the Calvin cycle (T. Tang et al. 2017) used to recycle reducing equivalents during growth on reduced carbon substrates (McKinlay & Harwood 2010).

Most of the observed CO2 fixation, however, could be explained primarily by anaplerotic carbon fixation via phosphoenolpyruvate carboxylase, phosphoenolpyruvate carboxykinase, and pyruvate carboxylase (K.-H. Tang et al. 2011; Weissgerber, Sylvester, et al. 2014; Weissgerber,

Watanabe, et al. 2014; T. Tang et al. 2017). CO2 fixation in A. vinosum measured under heterotrophic conditions agrees with the data reported for other heterotrophic bacteria such as

Roseobacter denitrificans (K.-H. Tang et al. 2009) and eukaryotic algae (Cassar & Laws 2007), but is lower than the previously estimated 29% for A. vinosum (T. Tang et al. 2017) and 25% for cyanobacterium Synechocystis sp. (Yang et al. 2002). A further reduction of the CO2 fixation rate in the ∆fbp mutant to only 5% of the WT autotrophic level suggests that at least half of carbon fixation under heterotrophic conditions could occur via the Calvin cycle (Figure 3.8). This agrees with an earlier observation that the ∆fbp knockout cannot rapidly switch from heterotrophy to autotrophy (Figure 3.4), which would require ready use of the Calvin cycle, and corroborates a prior supposition that this mutant is unable to fully utilize background CO2 present in heterotrophic medium (Figure 3.5).
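The percentages quoted for heterotrophic CO2 fixation are expressed against two different baselines: the WT heterotrophic rate and the WT autotrophic rate. The short sketch below converts between them using only the rounded figures given in the text, with the WT autotrophic rate normalized to 1. The helper names are illustrative.

```python
WT_AUTOTROPHIC = 1.0  # normalized reference rate


def lowered(rate, percent_lower):
    """Rate after a reduction of `percent_lower` percent."""
    return rate * (1.0 - percent_lower / 100.0)


def percent_lower_than(reference, value):
    """By what percentage `value` falls below `reference`."""
    return 100.0 * (reference - value) / reference


# WT under heterotrophy: about 10% of the autotrophic level.
wt_hetero = 0.10 * WT_AUTOTROPHIC
# Knockouts under heterotrophy, relative to WT under heterotrophy.
fbp_hetero = lowered(wt_hetero, 50)      # delta-fbp: 50% lower than WT
double_hetero = lowered(wt_hetero, 90)   # delta-fbp delta-pfp: 90% lower

for name, rate in [("WT", wt_hetero),
                   ("delta-fbp", fbp_hetero),
                   ("delta-fbp delta-pfp", double_hetero)]:
    drop = percent_lower_than(WT_AUTOTROPHIC, rate)
    print(f"{name}: {100 * rate:.0f}% of autotrophic WT ({drop:.0f}% lower)")
```

Under these rounded inputs, the heterotrophic rates span 90% to 99% below the autotrophic WT level, and the ∆fbp value comes out at 5% of that level, in line with the figures quoted in the text.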

Measurements of sulfide consumption in A. vinosum bioreactor cultures provided a means of estimating the effects of fbp and pfp gene loss on sulfide metabolism during autotrophic growth (Appendix 3 Figure A3.2). Concomitant energy metabolism was evaluated by


quantifying ATP throughout the incubations (Figure 3.9). Measured sulfide consumption rates fall within the range previously reported for A. vinosum (Weissgerber et al. 2013; Dahl et al.

2013) and other sulfur bacteria, such as Prosthecochloris aestuarii (Takashima et al. 2000). The highest sulfide consumption rates are observed in the WT strain, together with the highest overall ATP content, which agrees with the published ATP values for A. vinosum (Miović &

Gibson 1971; van Gemerden & Beeftink 1978). Deletion of pfp does not significantly decrease sulfide consumption (Appendix 3 Figure A3.2) but leads to a reduction of ATP level (85% of the

WT) (Figure 3.9). On the other hand, growth of A. vinosum ∆fbp is characterized by a significantly curtailed consumption of sulfide, down to almost 25% of the WT rates during the log phase (Appendix 3 Figure A3.2). Despite these pronounced differences in rates of sulfide consumption, the ATP content of the ∆fbp knockout is almost level with the WT values, particularly during exponential growth (Figure 3.9). This high ATP content of A. vinosum ∆fbp could be attributed to potential pyrophosphate hydrolysis, for example, by H+-PPases (Nyrén et al. 1984; Ordaz et al. 1992; Schultz & Baltscheffsky 2003; Serrano et al. 2004) and SAT activity, which makes ATP with PPi and APS in the final step of sulfide oxidation to SO₄²⁻ (Parey et al. 2013). A. vinosum has genetic capacity for at least two sulfite (HSO₃⁻) oxidizing enzymes, SAT and a membrane-bound polysulfide reductase-like iron-sulfur molybdoprotein (SoeABC) (Dahl et al. 2013). SoeABC is the major HSO₃⁻ oxidizing enzyme in this bacterium while SAT is thought to be secondary. We hypothesize that in A. vinosum ∆fbp, PPi, potentially produced by the reverse PPi-PFK activity in the Calvin cycle, may stimulate SAT activity and channel HSO₃⁻ oxidation to SO₄²⁻ through this enzyme instead of SoeABC. ATP produced by SAT in this reaction could be used in the Calvin cycle. During each round of CO2 fixation PPi-PFK may produce two molecules of PPi by dephosphorylating FBP and SBP. If these PPi molecules were to be converted into ATP by SAT, the overall ATP cost of fixing one molecule of CO2 would drop from three to one ATP. For A. vinosum, which produces ATP using a light-driven cyclic electron


flow (Brune 1989), this would mean a decline in the overall ATP demand under autotrophic conditions, making the electron transport chain overreduced (Pott & Dahl 1998; Dahl et al. 2005;

Frigaard & Dahl 2009). Such conditions are known to exert back pressure on the cyclic electron flow system and lower the rate of sulfur oxidation. This hypothesis agrees with our ATP and sulfide consumption data and is consistent with reverse PPi-PFK activity in the Calvin cycle.

Removal of PPi through the action of H+-PPases would have a similar but likely less pronounced effect on sulfide consumption.

By recreating in A. vinosum the Calvin cycle proposed in uncultured chemoautotrophic symbionts, we demonstrated that either fbp or, more notably, pfp is essential and sufficient for

CO2 fixation and growth, ascertaining the hypothesized ability of PPi-PFK to replace FBPase. A. vinosum ∆fbp, the knockout which can only use PPi-PFK, grows at a reduced rate on CO2 as the sole carbon source. This reduction in growth may be associated with the physiological changes to the cellular milieu necessary to accommodate PPi-PFK use in the Calvin cycle.

These changes are likely centered around removal of PPi, which is inhibitory to reverse PPi-PFK activity. A similar rationale could explain why A. vinosum ∆fbp is considerably slower at adapting to changes in culture conditions. On the other hand, this symbiont-like mutant does not exhibit reduced CO2 fixation ability or diminished ATP content, potentially due to ATP synthesis coupled to PPi removal. These observations suggest that while the loss of fbp would be deleterious to free-living generalists, which need to adapt rapidly to a changing environment and must attain the most rapid growth to outpace other bacteria in the competition for resources, the shift from FBPase to PPi-PFK could be advantageous to specialists, particularly those that live as symbionts in a relatively constant environment, face little competition, and are therefore selected for thermodynamic efficiency over growth rate and adaptability. It is in these bacteria that the PPi-PFK-utilizing variant of the Calvin cycle was initially hypothesized. Thus, these data do not only establish that PPi-PFK may replace FBPase during


autotrophic growth on CO2 as the only carbon source. They also provide novel insights into the physiological aspects of this evolutionary adaptation, which may have facilitated the origin and maintenance of chemoautotrophic symbiosis.

Materials and methods

Bacterial strains and plasmids

The symbiont-like Calvin cycle was recreated in the purple sulfur bacterium A. vinosum using standard protocols in molecular genetics. A spontaneous rifampicin-resistant mutant, A. vinosum DSM 180T Rif50 (Lubbe et al. 2006), and the Escherichia coli S17-1 plasmid donor strain (Simon et al. 1983) were provided by Christiane Dahl (Universität Bonn) (Appendix 3

Table A3.3). Plasmids pCM184, pCM351 (Marx & Lidstrom 2002), and pCM433 (Marx 2008) were obtained from Christopher Marx (University of Idaho).

To delete fbp and pfp in A. vinosum, the genes were replaced in-frame by homologous recombination with antibiotic selection markers aphA and aacC1, respectively (Figure 3.2). To avoid polar effects on downstream genes, which could influence mutant phenotypes, the strong constitutive promoters of the antibiotic resistance cassettes were omitted in favor of Pfbp and

Ppfp native promoters. To achieve expression of aacC1 from Ppfp, the start codon of aacC1 was modified from ATG to GTG. Unlike aphA expressed from Pfbp, the pfp promoter did not initiate expression of aacC1 in E. coli, suggesting an A. vinosum specific mode of transcriptional regulation. Plasmids carrying templates for recombination were introduced into A. vinosum through conjugation with E. coli S17-1 donor. Successful transconjugants were selected either with kanamycin (∆fbp::aphA) or gentamicin (∆pfp::aacC1). Post conjugation, E. coli was eliminated from plates with rifampicin, to which the A. vinosum acceptor strain was resistant.

Double-crossover recombinant knockouts were obtained by additionally supplementing selection medium with sucrose. A. vinosum harboring single crossover products were sucrose-


sensitive due to the presence of sacB gene encoding levansucrase from Bacillus subtilis on the allelic exchange plasmids. The primers and plasmids used to create the knockout mutants are listed in Appendix 3 Tables A3.3 and A3.4. Construction of the plasmids, conjugation, and mutant isolation are detailed in Appendix 3 Supplementary Methods.

Growth conditions

For routine propagation, A. vinosum wild type (WT) and the knockout mutant strains were grown anaerobically in modified liquid heterotrophic RCV medium (pH 7.0) adapted from

Weaver (1975), containing 21 mM malate and 2 mM acetate as carbon sources and 0.8 mM sodium thiosulfate as a reducing agent and a source of sulfur. The medium was filter-sterilized, supplemented with 50 µg/ml rifampicin to avoid contamination, and distributed into 9 ml gas-tight glass vials. To obtain single colonies, A. vinosum was plated on RCV medium containing

1% Phytagel (Sigma Aldrich). Prior to inoculation the plates were stored overnight under oxygen-free atmosphere. The inoculated plates were incubated in GasPakTM BBLTM jars (BD).

To study growth kinetics and the effects of fbp and pfp gene loss, A. vinosum was cultured either under photolithoautotrophic or photoorganoheterotrophic conditions in liquid

Pfennig's medium (Imhoff 2006), with DIC as the sole carbon source (18 mM), or RCV medium, respectively. For this purpose the bacteria were grown in a bioreactor built in-house from a 500 ml spinner flask (Bellco Glass) (Figure 3.3). Optical density (OD) of the cultures was continuously monitored at 690 nm by circulating the medium from the main flask through a gas-tight glass cuvette mounted inside UV-1601 spectrophotometer (Shimadzu). For measuring heterotrophic growth kinetics, the bioreactor was inoculated with liquid mid-log phase RCV pre-cultures started from single colonies. For photoautotrophic growth, the pre-cultures were first collected on 0.45 µm sterile filters and then resuspended in Pfennig's medium to avoid transferring residual organic carbon from RCV medium into the bioreactor. Starting OD690 for


each bioreactor growth experiment was approximately 0.07. Bacteria were grown under constant illumination of approximately 42,000 Lux (400-700 nm) by placing the cultures between

2 incandescent light bulbs (60W), which kept the cultures at 30°C. Throughout photoautotrophic growth, sulfide concentration was maintained between 0.3 and 0.5 mM and pH was kept at 7.0.

For auxotrophic growth experiments RCV medium was supplemented with 1% w/v of either fructose, glucose, sucrose, rhamnose, or glucuronic acid. All growth experiments were repeated two to four times. Significant differences between growth rates of the WT and the knockout mutants were identified using ANOVA with a Fisher's Least Significant Difference (LSD) post hoc test. Media composition, culture conditions, bioreactor setup, antibiotic concentrations, and sampling procedures for ATP and protein determination are detailed in Appendix 3

Supplementary Methods.

CO2 fixation rates

To compare the effects of FBPase and PPi-PFK use on carbon fixation in the Calvin cycle of A. vinosum, CO2 fixation in the bioreactor cultures was determined by measuring the rate at which 13C labeled DIC (Cambridge Isotope Laboratories) was incorporated into biomass.

NaH13CO3 was added to autotrophic and heterotrophic cultures during mid-log phase to a final bicarbonate 13C/12C ratio of 0.17. Cultures were sampled (5 ml) in duplicate at regular time intervals for a total duration of 5 hours and collected on 25 mm GF/F glass microfiber filters

(Whatman). The filtrate was fumed with HCl for 12 h followed by lyophilization for 24 h at -50°C and pressure below 0.040 mBar in FreeZone 2.5 freeze dry system (Labconco). Carbon stable isotope composition of the samples was analyzed in stable isotope facilities at Boston

University; The Center, Marine Biological Laboratory, Woods Hole; and the Center for Stable Isotopes at the University of New .


For calculating 13C dissolved inorganic carbon (DIC) incorporation rates, the mass balance equation was adapted from Montoya (1996):

\[
(A_{PC_f})([PC_f]) = (A_{PC_{control}})([PC_{control}]) + (A_{CO_2})([PC_\Delta])
\]

where A equals atom% of particulate carbon (PC; biomass carbon) at the end of incubation (f) and start/natural abundance (control), or of the DIC pool (ACO2); [PCf] equals concentration/amount of PC at end of incubation, [PCcontrol] stands for concentration/amount of PC at start of incubation, and [PCΔ] represents concentration/amount of newly formed PC during incubation, equal to new carbon biomass. To calculate carbon fixation rates (newly formed carbon biomass), the equation was solved for the relative ratio of newly formed biomass as a function of total biomass.

\[
\frac{(A_{PC_f}) - (A_{PC_{control}})}{(A_{CO_2}) - (A_{PC_{control}})} = \frac{[PC_\Delta]}{[PC_f]}
\]

To determine the absolute carbon fixation rate, the equation was solved for [PCΔ]. The reported rates were calculated per min per mg of total protein. Significant differences between CO2 fixation rates of the WT and the knockout mutants were identified using ANOVA with a Fisher's

Least Significant Difference (LSD) test.
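The mass balance above can be sketched in a few lines of Python. This is an illustrative sketch, not the script used in this work: the function names, the example numbers, and the conversion of the 13C/12C ratio to atom% via R/(1+R) are assumptions introduced here for clarity.

```python
def atom_percent_from_ratio(ratio_13c_12c):
    """Convert a 13C/12C ratio R into atom% 13C = 100 * R / (1 + R)."""
    return 100.0 * ratio_13c_12c / (1.0 + ratio_13c_12c)


def new_carbon_fraction(a_pc_final, a_pc_control, a_co2):
    """Return [PC_delta] / [PC_f], the newly formed share of final biomass carbon.

    Implements (A_PCf - A_PCcontrol) / (A_CO2 - A_PCcontrol),
    with all A values given in atom% 13C.
    """
    if a_co2 == a_pc_control:
        raise ValueError("DIC pool must be enriched relative to the control")
    return (a_pc_final - a_pc_control) / (a_co2 - a_pc_control)


def fixation_rate(a_pc_final, a_pc_control, a_co2, pc_final, minutes, protein_mg):
    """Newly formed PC per min per mg protein (carbon units follow pc_final)."""
    pc_delta = new_carbon_fraction(a_pc_final, a_pc_control, a_co2) * pc_final
    return pc_delta / (minutes * protein_mg)


# Hypothetical inputs for illustration only (not measurements from this study).
a_dic = atom_percent_from_ratio(0.17)  # labeled DIC pool, about 14.5 atom%
rate = fixation_rate(a_pc_final=2.5, a_pc_control=1.1, a_co2=a_dic,
                     pc_final=400.0, minutes=300.0, protein_mg=0.5)
print(f"DIC pool: {a_dic:.2f} atom% 13C")
print(f"Fixation rate: {rate:.4f} units C per min per mg protein")
```

Because [PCf] = [PCcontrol] + [PCΔ], the fraction returned by `new_carbon_fraction` satisfies the original mass balance, which provides a quick consistency check on any implementation.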

Sulfide consumption rates

The effects of fbp and pfp gene loss on sulfur metabolism were monitored by measuring rates of sulfide consumption throughout each autotrophic growth experiment. Sulfide concentration was continuously monitored in the bioreactor using an embedded sulfide electrode

(Weiss Research) connected to Chemcadet mV controller (Cole-Parmer). The mV output from the controller was captured with Yocto-milliVolt-RX-BNC precision voltmeter (Yoctopuce) and recorded using Raspberry Pi3 (Raspberry Pi Foundation) running a custom Python script. The


recorded data reflected cyclical changes in sulfide concentration due to bacterial consumption

(≥0.3 mM) and intermittent automated supplementation with sulfide (≤0.5 mM).

Sulfide consumption rates were calculated from the mV measurements for each incubation experiment in Python 3.6 using NumPy 1.13.3 and Pandas 0.21.0. Briefly, mV data was converted into nmol of sulfide using a standard curve obtained with Cline spectrophotometric method (Cline 1969). Next, the highest sulfide consumption rate was determined in each consumption/supplementation cycle. The rate values were then adjusted for protein concentration in the bioreactor cultures. Finally, the rates from replicate experiments were averaged and plotted in Matplotlib version 2.1.0 (Hunter 2007).
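The rate calculation described above can be sketched as follows. This is a minimal illustration in plain Python rather than the NumPy/Pandas script used in this work. It starts from sulfide values that have already been converted from mV, because the form of the standard curve is not given here. The function names, the `jump` threshold used to detect supplementation, and the example trace are hypothetical.

```python
def split_cycles(times, sulfide, jump):
    """Split a trace into consumption cycles at each supplementation event.

    A supplementation is assumed wherever sulfide rises by more than `jump`
    between two consecutive readings.
    """
    cycles, start = [], 0
    for i in range(1, len(sulfide)):
        if sulfide[i] - sulfide[i - 1] > jump:
            if i - start >= 2:  # need at least two points to compute a rate
                cycles.append((times[start:i], sulfide[start:i]))
            start = i
    if len(sulfide) - start >= 2:
        cycles.append((times[start:], sulfide[start:]))
    return cycles


def max_consumption_rate(times, sulfide):
    """Steepest decline between consecutive readings, as a positive rate."""
    return max(
        (sulfide[i - 1] - sulfide[i]) / (times[i] - times[i - 1])
        for i in range(1, len(sulfide))
    )


# Hypothetical trace: time in min, sulfide in nmol (not data from this study).
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
sulfide = [500.0, 460.0, 400.0, 350.0, 300.0, 500.0, 430.0, 390.0]
protein_mg = 2.0  # hypothetical protein content of the culture

per_cycle = [max_consumption_rate(t, s)
             for t, s in split_cycles(times, sulfide, jump=50.0)]
normalized = [rate / protein_mg for rate in per_cycle]  # nmol per min per mg
print(per_cycle, normalized)
```

With real electrode data, consecutive-point slopes are sensitive to noise, so a smoothing window or a short linear fit within each cycle would likely be preferable to raw differences.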

Quantification of ATP

ATP content of the autotrophic bioreactor A. vinosum cultures was quantified throughout growth between OD690 of 0.5 and 2.5 using BacTiter-Glo Microbial Cell Viability Assay

(Promega). First, culture samples stored at -80°C were thawed for approximately 10 min on ice.

Next, aliquots (25 µl) were transferred in duplicate to an opaque-walled 96 well plate (Greiner

Bio-One) and combined with 25 µl of the ATP reagent. To remove bubbles, the plates were centrifuged at 1,000 x g for 1 min. Luminescence in the samples was recorded using a Tecan

Infinite m200 spectrophotometer under automatic attenuation, integration time 1,000 ms, and settle time 150 ms. The amount of ATP was determined with a standard curve. At least 3 experimental replicates were analyzed per sampled OD point.

Protein determination

Total protein concentration of A. vinosum cultures between OD690 0.5 and 2.5 was measured using a Coomassie Dye based assay for use in rate calculations (CO2 fixation and sulfide consumption) and quantification of ATP content. To release cellular proteins, the bacteria


were disrupted via probe sonication (Sonifier 250, Branson). Previously collected frozen A. vinosum cell pellets were thawed on ice and resuspended in cold 50 mM Tris-HCl buffer (pH

7.5) containing 200 mM NaCl. These bacterial suspensions were subjected to six rounds of sonication lasting 15 sec each at output level 3. During sonication sample tubes were kept on ice-salt slurry (8:1 w/w) and transferred to ice between the treatments for approximately 3 min.

Cell lysis was monitored microscopically. Protein concentrations were determined for three samples per time point in four technical replicates using CB-Protein Assay with bovine serum albumin (BSA) as a standard (G-Biosciences). Absorbance at 595 nm was quantified with a

Tecan Infinite m200 spectrophotometer.


This work was possible due to generous financial support from the Department of Organismic and Evolutionary Biology, Harvard University. We are indebted to Christiane Dahl, Christopher

Marx, Michael Madigan, Dipti Nayak, and Anna Wang for helpful suggestions and assistance with obtaining bacterial cultures and plasmids.

Symbioses between eukaryotes and bacteria are subject to strong evolutionary forces which favor their maintenance and benefit the individual partners. In my thesis I investigated a potential adaptation to a symbiotic lifestyle which appears to have occurred in all chemoautotrophic gammaproteobacterial symbionts of marine invertebrates. These phylogenetically disparate bacteria form symbioses with diverse eukaryotic hosts (Dubilier et al.

2008; Cavanaugh et al. 2013). Chemoautotrophic symbionts range widely in their genetic repertoire and genome size. For example, some of the smallest known genomes among autotrophic bacteria belong to the symbionts of deep-sea clams Calyptogena okutanii (1.0 Mb)

(Kuwahara et al. 2007) and Calyptogena magnifica (1.2 Mb) (Newton et al. 2007). In contrast, the genome of Solemya velum symbiont (2.7 Mb) (Dmytrenko et al. 2014) and the symbiont of

Riftia pachyptila (3.5 Mb) (Robidart et al. 2008; Gardebrecht et al. 2012) are similar in size to the genomes of free-living bacteria, for instance, Thiomicrospira crunogena (2.4 Mb) (Scott et al.

2006) and Thiobacillus denitrificans (2.9 Mb) (Beller et al. 2006). However, regardless of the differences among chemoautotrophic symbionts, all those belonging to the class Gammaproteobacteria appear to lack one gene, namely fbp, encoding fructose 1,6-bisphosphatase (FBPase). In bacteria this enzyme performs two essential reactions in the Calvin cycle, dephosphorylating fructose 1,6-bisphosphate (FBP) and sedoheptulose 1,7-bisphosphate (SBP) to fructose 6-phosphate (F6P) and sedoheptulose 7-phosphate (S7P), respectively (Gerbling et al. 1986; Yoo & Bowien 1995). Presence of fbp in the genomes of almost all of their closest non-symbiotic relatives, demonstrated in Chapter 2, suggests a strong association between the lack of fbp and a symbiotic lifestyle.

Despite an apparent lack of FBPase, these symbionts are able to fix CO2 with RuBisCO

(Felbeck et al. 1981; Cavanaugh 1983; Robinson et al. 1998; Singer et al. 1952; Erb & Zarzycki

2018). It is possible that the missing enzyme may be supplied by the host. Such adaptation,


however, is unlikely to have occurred independently in phylogenetically diverse lineages of chemoautotrophic symbionts and hosts as different as siboglinid tubeworms (Markert et al.

2007) and coastal protobranch bivalves (Dmytrenko et al. 2014). A more parsimonious hypothesis, which I have investigated in my thesis, proposes a potential substitution of FBPase activity with enzymatic catalysis performed by a pyrophosphate-dependent phosphofructokinase


Having confirmed the absence of fbp in the genome of the S. velum symbiont in Chapter 1 of my thesis, I turned to investigate expression of the PPi-PFK encoding gene, pfp, in the symbiont as part of Chapter 2. In bacteria, PPi-PFK is thought to be a glycolytic enzyme primarily operating in the forward direction (Mertens 1991; Frese et al. 2014). Transcriptional analysis of the S. velum symbiont revealed low expression levels across all genes encoding glycolytic enzymes, such as pyruvate kinase, phosphoglycerate mutase, or enolase. In contrast, transcriptional levels of pfp were high and comparable to those of the Calvin cycle genes, including glyceraldehyde 3-phosphate dehydrogenase and phosphoribulokinase. Expression of pfp correlated with high PPi-PFK activity in the symbiont-containing gill tissue. Furthermore, recombinant PPi-PFK was unique among known bacterial PPi-PFKs in having higher specificity for the reverse over the forward reaction and higher catalytic efficiency than a number of bacterial FBPases. Taken together, these results suggest that in the symbionts PPi-PFK may not be primarily operating in glycolysis. Such high transcriptional and enzymatic activities and the higher propensity for the reverse reaction are more in line with the hypothesized role of PPi-PFK in the Calvin cycle. Further, pyrophosphate (PPi), produced by PPi-PFK during dephosphorylation of fructose 1,6-bisphosphate (FBP) to fructose 6-phosphate (F6P), inhibited the enzyme. Bacteria on average have high cellular PPi content, ranging from 0.5 to 1.5 mM

(Heinonen & Drake 1988; Bornefeld 1981; Chen et al. 1990). At 1 mM PPi, PPi-PFK activity is reduced by more than 75% and the forward reaction may be favored. Thus, reverse PPi-PFK


activity in the symbionts is dependent on PPi removal. Additionally, hydrolysis of PPi increases the equilibrium constant (K') of the reverse reaction by 10^3- to 10^4-fold (Heinonen 2001). PPi can be consumed by a number of enzymes encoded in the genome of the S. velum symbiont, for example,

ATP sulfurylase (SAT), inorganic pyrophosphatase (PPase), sodium-translocating PPase (Na+-PPase), or proton-pumping PPase (H+-PPase) (Dmytrenko et al. 2014). In the case of inorganic PPase, PPi is broken down to phosphate without coupling the hydrolysis to any other reaction (van Alebeek & Keltjens 1994). If PPi is consumed by H+/Na+-PPases, an electrochemical gradient is generated by transferring one H+/Na+ per PPi into the periplasm (Serrano et al. 2007). Approximately 10 H+/Na+ can generate 3 molecules of ATP via ATP synthase (Hinkle 2005). In contrast, SAT activity produces one ATP per PPi (Parey et al. 2013).

Among the possibilities above, the latter may be the most favorable mechanism of PPi removal in the symbionts, shown below.



Among the S. velum symbiont genes known to encode PPi consuming enzymes, sat is the most highly transcribed. Furthermore, high SAT activity and protein levels have been previously reported in symbiont-containing tissues of numerous invertebrate hosts, including S. velum (Felbeck et al. 1981; Felbeck 1981; Fisher & Hand 1984; Chen et al. 1987; Polz et al.

1992; Fiala-Medioni et al. 2002; Markert et al. 2007; Kleiner et al. 2012). This suggests that PPi may couple dephosphorylation of FBP and sedoheptulose 1,7-bisphosphate by PPi-PFK in the

Calvin cycle to sulfide oxidation. This coupling would drive reverse PPi-PFK activity by preventing product inhibition and increasing the equilibrium constant of the reverse reaction. The resulting PPi could be consumed by SAT, driving sulfide oxidation to completion and generating


two molecules of ATP per each round of the Calvin cycle. This would reduce the energetic cost of carbon fixation from three to one ATP molecules per CO2 molecule fixed.
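The energetic argument can be made explicit with a short calculation. The sketch below takes the stoichiometries quoted in this section at face value: no ATP recovered by soluble PPase, one H+/Na+ translocated per PPi with approximately 10 ions per 3 ATP, and one ATP per PPi for SAT. It also follows the text's accounting of three ATP and two PPi per CO2 fixed. The dictionary keys and function name are hypothetical and introduced only for illustration.

```python
# ATP recovered per PPi consumed, from the stoichiometries quoted in the text.
ATP_PER_PPI = {
    "soluble PPase": 0.0,        # hydrolysis not coupled to any other reaction
    "H+/Na+-PPase": 1 * 3 / 10,  # 1 ion per PPi; ~10 ions yield 3 ATP
    "SAT": 1.0,                  # one ATP per PPi
}

BASE_ATP_PER_CO2 = 3  # ATP cost per CO2 with FBPase, as stated in the text
PPI_PER_CO2 = 2       # PPi from FBP and SBP dephosphorylation (text's accounting)


def net_atp_cost(route):
    """Net ATP cost per CO2 fixed if all PPi is consumed through `route`."""
    return BASE_ATP_PER_CO2 - PPI_PER_CO2 * ATP_PER_PPI[route]


for route in ATP_PER_PPI:
    print(f"{route}: {net_atp_cost(route):.1f} ATP per CO2")
```

Under these assumptions only the SAT route brings the cost down to one ATP per CO2, whereas ion-pumping PPases recover roughly a third as much ATP per PPi, which is consistent with SAT being proposed as the most favorable route.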

Coupling between the PPi-PFK reverse reaction and the SAT activity–in the direction of

ATP synthesis and sulfate production–may explain why the hypothesized shift from FBPase to

PPi-PFK is specific to chemoautotrophic sulfur oxidizing symbionts. Such an adaptation has not occurred, for example, in photosynthetic symbionts or plastids, which have evolved from cyanobacteria. Organelles and photosynthetic symbiotic bacteria rely on FBPase and sedoheptulose 1,7-bisphosphatase (SBPase) in their Calvin cycle (Martin & Schnarrenberger

1997). They do not obtain energy from sulfide oxidation, like chemoautotrophic symbionts, and thus may not have an efficient energy generating mechanism to co-opt for PPi removal required for PPi-PFK activity in the Calvin cycle. However, PPi-PFKs are widely distributed among plants, including Zea mays (Mertens 1991), and their role in CO2 fixation remains enigmatic.

In Chapter 3 of my thesis I investigated the ability of PPi-PFK to replace FBPase in the

Calvin cycle and thus to support CO2 fixation. Due to the genetic intractability of chemoautotrophic symbionts, I recreated the symbiont-like Calvin cycle in A. vinosum, a close but free-living relative of the chemoautotrophic symbiont of S. velum. Using deletion mutagenesis in this bacterium it was demonstrated that, in the absence of fbp, PPi-PFK encoded by pfp is essential and sufficient for carbon fixation and growth. The shift from FBPase to PPi-PFK in A. vinosum was associated with a reduction in growth rate and adaptability but not in carbon fixation. The loss of FBPase also led to a significant decrease in sulfide oxidation rates. Despite this decline in sulfide consumption, ATP levels did not change. These observations agree with the proposed coupling between PPi-PFK reverse activity and SAT activity, which may consume PPi generated by PPi-PFK to make ATP in the final step of sulfur oxidation to sulfate. For A. vinosum, which generates ATP using a light-driven cyclic electron flow (Brune 1989), this means a decrease in ATP demand and a resulting back pressure on


cyclic electron flow, which is known to reduce sulfide oxidation rates (Pott & Dahl 1998; Dahl et al. 2005; Frigaard & Dahl 2009). This model fits the observed high ATP levels, the decline in sulfide oxidation, and the unaltered rates of carbon fixation in the A. vinosum FBPase-deficient mutant. However, cellular adjustments may be necessary to accommodate the increase in PPi derived from reverse PPi-PFK activity. Besides, PPi-PFK forward activity, which may be required for certain anaplerotic purposes, diminishes. These changes could account for the decreased growth rate of the ∆fbp mutant. Since chemoautotrophic symbionts are confined to their intracellular environment, a decline in growth would be of no consequence, as long as CO2 fixation rates remain unchanged. An increase in thermodynamic efficiency due to ATP production, on the other hand, would be of great advantage, as these bacteria feed not only themselves but also their much more energetically demanding hosts. Taken together, these data support the hypothesized ability of PPi-PFK to replace FBPase in the Calvin cycle of sulfur oxidizing bacteria. The results presented in my thesis propose a mechanism that highly favors a shift from FBPase to PPi-PFK in chemoautotrophic but not in photoautotrophic symbionts, plastids, or free-living bacteria.

Future directions

The indicated ability of PPi-PFK to replace FBPase in the Calvin cycle paves the way to investigate the role of PPi in the energy metabolism of chemoautotrophic symbionts, in particular with regard to SAT activity. The potential role of H+/Na+-PPases in PPi recycling offers another attractive line of inquiry. Finally, these findings may have potential applications for industrial sequestration of CO2.



Supplementary material for Chapter 1:

The genome of the intracellular bacterium of the coastal bivalve, Solemya velum: a blueprint for thriving in and out of symbiosis


Table S1. Length [bp], GC%, percentage of the total base pairs, and the number of genes in the scaffolds which constitute the genome of the S. velum symbiont.

Scaffold              Length (bp)   % GC   % of Total bp   No. Genes
SV_sym_Scaffold_1         1213831   51.5           44.92        1232
SV_sym_Scaffold_2          892555   50.7           33.03         927
SV_sym_Scaffold_3          537613   50.9           19.89         557
SV_sym_Scaffold_4           28016   48.7            1.04          25
SV_sym_Scaffold_5            7777   46.3            0.29           1
SV_sym_Scaffold_6            7618   43.5            0.28           7
SV_sym_Scaffold_7            3806   43.2            0.14           1
SV_sym_Scaffold_8            3773   40.0            0.14           2
SV_sym_Scaffold_9            3752   46.6            0.14           2
SV_sym_Scaffold_10           3712   42.8            0.14           3
Total                     2702453   51.0


Table S2. tRNA genes and the codon frequencies in the genome of the S. velum symbiont.

Codons:                      tRNA genes:
Codon   AA   Frequency       AA         Codon
TGA     *     1.63           Ala   A    TGC
TAA     *     1.327          Cys   C    GCA
TAG     *     0.637          Asp   D    GTC
GCA     A    30.143          Glu   E    TTC
GCC     A    24.733          Glu   E    TTC
GCT     A    16.719          Phe   F    GAA
GCG     A    16.595          Gly   G    GCC
TGC     C     5.296          Gly   G    TCC
TGT     C     5.114          His   H    GTG
GAT     D    35.186          Ile   I    GAT
GAC     D    23.155          Lys   K    TTT
GAA     E    34.885          Leu   L    CAG
GAG     E    33.656          Leu   L    CAA
TTC     F    19.054          Leu   L    TAA
TTT     F    18.404          Leu   L    GAG
GGT     G    27.277          Leu   L    TAG
GGC     G    26.427          Met   M    CAT
GGA     G    12.83           Met   M    CAT
GGG     G     7.449          Met   M    CAT
CAT     H    12.058          Asn   N    GTT
CAC     H    11.907          Pro   P    CGG
ATC     I    30.673          Pro   P    TGG
ATT     I    23.225          Pro   P    GGG
ATA     I     7.266          Gln   Q    TTG
AAG     K    24.794          Gln   Q    CTG
AAA     K    21.606          Arg   R    CCT
CTG     L    41.235          Arg   R    TCT
CTC     L    20.935          Arg   R    CCG
CTT     L    17.412          Arg   R    ACG
TTG     L    12.268          Ser   S    TGA
TTA     L     5.172          Ser   S    GGA
CTA     L     4.593          Ser   S    GCT
ATG     M    27.247          Thr   T    GGT
AAC     N    18.269          Thr   T    TGT
AAT     N    17.187          Val   V    GAC


Table S2 (Continued).

Codons:                      tRNA genes:
Codon   AA   Frequency       AA         Codon
CCG     P    14.93           Val   V    TAC
CCA     P     9.846          Trp   W    CCA
CCT     P     9.477          Tyr   Y    GTA
CCC     P     8.879
CAG     Q    27.882
CAA     Q    10.404
CGC     R    18.466
CGT     R    18.353
AGG     R     5.975
AGA     R     5.232
CGA     R     5.103
CGG     R     4.279
AGC     S    13.341
TCA     S    12.66
AGT     S    10.56
TCG     S     9.407
TCC     S     8.889
TCT     S     7.926
ACC     T    18.477
ACA     T    14.328
ACT     T     9.67
ACG     T     9.083
GTC     V    22.01
GTT     V    20.633
GTG     V    16.267
GTA     V    10.521
TGG     W    12.983
TAC     Y    14.765
TAT     Y    13.288


Table S3. Gene product names used in Figure 1 and Figure 4, the corresponding NCBI protein ID reference numbers, and EC/TC numbers. Product Full name Protein ID EC/TC Number Electron transport chain Sulfur Oxidation SoxA Heterodimeric c-type cytochrome complex SoxAX, subunit A JV46_24690 Unavailable SoxX Heterodimeric c-type cytochrome complex SoxAX, subunit X JV46_24720 Unavailable SoxY Sulfur carrier protein SoxYZ, subunit Y JV46_24710 Unavailable SoxZ Sulfur carrier protein SoxYZ, subunit Z JV46_24700 Unavailable SoxB Sulfate thiol-esterase SoxB JV46_27210 FccA Flavocytochrome c dehydrogenase FccAB, subunit A JV46_05270 Unavailable FccB Flavocytochrome c dehydrogenase FccAB, subunit B JV46_05260 Unavailable Sqr Sulfide-quinone reductase Sqr JV46_19710 rDsrA Reverse-operating cytoplasmic dissimilatory sulfite reductase DsrAB, subunit A JV46_15520 rDsrB Reverse-operating cytoplasmic dissimilatory sulfite reductase DsrAB, subunit B JV46_15510 rDsrE Hexameric sulfur relay protein rDsrEFH, subunit E JV46_15500 2.8.1.- rDsrF Hexameric sulfur relay protein rDsrEFH, subunit F JV46_15490 Unavailable rDsrH Hexameric sulfur relay protein rDsrEFH, subunit H JV46_15480 Unavailable rDsrC Persulfide carrier to DsrAB, rDsrC JV46_15470 2.8.1.- rDsrM Transmembrane electron transport complex rDsrKMJOP, subunit M JV46_15460 Unavailable rDsrK Transmembrane electron transport complex rDsrKMJOP, subunit K JV46_15450 Unavailable rDsrL Transmembrane electron transport complex rDsrKMJOP, subunit L JV46_15440 Unavailable rDsrJ Transmembrane electron transport complex rDsrKMJOP, subunit J JV46_15430 Unavailable rDsrO Transmembrane electron transport complex rDsrKMJOP, subunit O JV46_15420 Unavailable rDsrP Transmembrane electron transport complex rDsrKMJOP, subunit P JV46_15410 Unavailable rDsrN Dsr protein of unknown function, DsrN JV46_15400 Unavailable rDsrR Dsr protein of unknown function, DsrR JV46_15390 Unavailable rDsrS Putative posttranscriptional regulator of the rdsr operon 
JV46_15380 Unavailable AprA Adenosine phosphosulphate reductase AprABM, subunit A JV46_07790 AprB Adenosine phosphosulphate reductase AprABM, subunit B JV46_07780 AprM Adenosine phosphosulphate reductase AprABM, subunit M JV46_07770 Sat ATP-generating ATP sulfurylase JV46_21260 SulP1 Sulfate:bicarbonate antiporter SulP JV46_19680 Unavailable SulP2 Sulfate:bicarbonate antiporter SulP JV46_24990 Unavailable Primary Ion Pumps RnfA1 Electron transport complex, RnfABCDGE type, subunit A JV46_10780 Unavailable RnfB1 Electron transport complex, RnfABCDGE type, subunit B JV46_10790 Unavailable RnfC1 Electron transport complex, RnfABCDGE type, subunit C JV46_10800 Unavailable RnfD1 Electron transport complex, RnfABCDGE type, subunit D JV46_10810 Unavailable RnfG1 Electron transport complex, RnfABCDGE type, subunit G JV46_10820 Unavailable RnfE1 Electron transport complex, RnfABCDGE type, subunit E JV46_10830 Unavailable RnfB2 Electron transport complex, RnfBCDGEA type, subunit B JV46_16970 Unavailable RnfB3 Electron transport complex, RnfBCDGEA type, subunit B JV46_16960 Unavailable RnfC2 Electron transport complex, RnfBCDGEA type, subunit C JV46_16930 Unavailable RnfD2 Electron transport complex, RnfBCDGEA type, subunit D JV46_16920 Unavailable RnfG2 Electron transport complex, RnfBCDGEA type, subunit G JV46_16910 Unavailable RnfE2 Electron transport complex, RnfBCDGEA type, subunit E JV46_16900 Unavailable RnfA2 Electron transport complex, RnfBCDGEA type, subunit A JV46_16890 Unavailable NdhA NADH dehydrogenase, NADH:quinone oxidoreductase NdhABCDEFGHIJKLMN, subunit A JV46_20270 NdhB NADH dehydrogenase, NADH:quinone oxidoreductase NdhABCDEFGHIJKLMN, subunit B JV46_20280 NdhC NADH dehydrogenase, NADH:quinone oxidoreductase NdhABCDEFGHIJKLMN, subunit C JV46_20290 NdhD NADH dehydrogenase, NADH:quinone oxidoreductase NdhABCDEFGHIJKLMN, subunit D JV46_20300 NdhE NADH dehydrogenase, NADH:quinone oxidoreductase NdhABCDEFGHIJKLMN, subunit E JV46_20310 NdhF NADH dehydrogenase, 
NADH:quinone oxidoreductase NdhABCDEFGHIJKLMN, subunit F JV46_20320 NdhG NADH dehydrogenase, NADH:quinone oxidoreductase NdhABCDEFGHIJKLMN, subunit G JV46_20340 NdhH NADH dehydrogenase, NADH:quinone oxidoreductase NdhABCDEFGHIJKLMN, subunit H JV46_20350 NdhI NADH dehydrogenase, NADH:quinone oxidoreductase NdhABCDEFGHIJKLMN, subunit I JV46_20360 NdhJ NADH dehydrogenase, NADH:quinone oxidoreductase NdhABCDEFGHIJKLMN, subunit J JV46_20370 NdhK NADH dehydrogenase, NADH:quinone oxidoreductase NdhABCDEFGHIJKLMN, subunit K JV46_20380 NdhL NADH dehydrogenase, NADH:quinone oxidoreductase NdhABCDEFGHIJKLMN, subunit L JV46_20390 NdhM NADH dehydrogenase, NADH:quinone oxidoreductase NdhABCDEFGHIJKLMN, subunit M JV46_20400 NdhN NADH dehydrogenase, NADH:quinone oxidoreductase NdhABCDEFGHIJKLMN, subunit N JV46_20410 Hydrogenases HupS [Ni-Fe]-uptake hydrogenase HupSL, subunit S JV46_26680 HupL [Ni-Fe]-uptake hydrogenase HupSL, subunit L JV46_26720 Hox2F Bidirectional hydrogenase H2FUYH, subunit 2F JV46_10640 Hox2U Bidirectional hydrogenase H2FUYH, subunit 2U JV46_10650 Hox2Y Bidirectional hydrogenase H2FUYH, subunit 2Y JV46_10660 Hox2H Bidirectional hydrogenase H2FUYH, subunit 2H JV46_10670 Hox2W Maturation protease of Hox2H, How2W JV46_10680 Unavailable NqrA Na+-translocating NADH:ubiquinone oxidoreductase NqrABCDEF, subunit A JV46_13720 NrqB Na+-translocating NADH:ubiquinone oxidoreductase NqrABCDEF, subunit B JV46_13710 NqrC Na+-translocating NADH:ubiquinone oxidoreductase NqrABCDEF, subunit C JV46_13700 NqrD Na+-translocating NADH:ubiquinone oxidoreductase NqrABCDEF, subunit D JV46_13690 NqrE Na+-translocating NADH:ubiquinone oxidoreductase NqrABCDEF, subunit E JV46_13680


Table S3 (Continued). NqrF Na+-translocating NADH:ubiquinone oxidoreductase NqrABCDEF, subunit F JV46_13670 OadG Na+-translocating oxaloacetate decarboxylase OadGAB, subunit G JV46_14870 OadA Na+-translocating oxaloacetate decarboxylase OadGAB, subunit A JV46_14860 OadB1 Na+-translocating oxaloacetate decarboxylase OadGAB, subunit B1 JV46_14850 OadB2 Na+-translocating oxaloacetate decarboxylase OadGAB, subunit B2 JV46_14840 Quinone Reductases FdoG Formate dehydrogenase-O FdoGHI, subunit G JV46_19080 FdoH Formate dehydrogenase-O FdoGHI, subunit H JV46_19090 FdoI Formate dehydrogenase-O FdoGHI, subunit I JV46_19100 Quinone Oxidases QcoA Quinol:cytochrome-c oxidoreductase bc1, subunit A JV46_24140 QcoB Quinol:cytochrome-c oxidoreductase bc1, subunit B JV46_24170 QcoC Quinol:cytochrome-c oxidoreductase bc1, subunit C JV46_24180 Terminal reductases

CcoN cbb3-type cytochrome c oxidase CcoNOQP, subunit N JV46_10130
CcoO cbb3-type cytochrome c oxidase CcoNOQP, subunit O JV46_10150
CcoQ cbb3-type cytochrome c oxidase CcoNOQP, subunit Q JV46_10160
CcoP cbb3-type cytochrome c oxidase CcoNOQP, subunit P JV46_10170
CoxA aa3-type cytochrome c oxidase CoxAB - subunit A JV46_24840
CoxB aa3-type cytochrome c oxidase CoxAB - subunit B JV46_24930
CydA ba3-type cytochrome c oxidase CydAB - subunit A JV46_05320
CydB ba3-type cytochrome c oxidase CydAB - subunit B JV46_05330
DmsA Dimethylsulfoxide reductase DmsABC, subunit A JV46_27580
DmsB Dimethylsulfoxide reductase DmsABC, subunit B JV46_27590
DmsC Dimethylsulfoxide reductase DmsABC, subunit C JV46_27600
NapF Periplasmic nitrate reductase napFDAGHBC, subunit F JV46_19590
NapD Periplasmic nitrate reductase napFDAGHBC, subunit D JV46_19600
NapA Periplasmic nitrate reductase napFDAGHBC, subunit A JV46_19610
NapG Periplasmic nitrate reductase napFDAGHBC, subunit G JV46_19620
NapH Periplasmic nitrate reductase napFDAGHBC, subunit H JV46_19630
NapB Periplasmic nitrate reductase napFDAGHBC, subunit B JV46_19640
NapC Periplasmic nitrate reductase napFDAGHBC, subunit C JV46_19650
NirB Assimilatory nitrite reductase, subunit B, large JV46_10450
NirD Assimilatory nitrite reductase, subunit D, small JV46_10460
Mrp Antiporter
MrpE Na+:H+ antiporter MrpEFGBBCDD, subunit E JV46_13210 Unavailable
MrpF Na+:H+ antiporter MrpEFGBBCDD, subunit F JV46_13200 Unavailable
MrpG Na+:H+ antiporter MrpEFGBBCDD, subunit G JV46_13190 Unavailable
MrpB1 Na+:H+ antiporter MrpEFGBBCDD, subunit B1 JV46_13180 Unavailable
MrpB2 Na+:H+ antiporter MrpEFGBBCDD, subunit B2 JV46_13170 Unavailable
MrpC Na+:H+ antiporter MrpEFGBBCDD, subunit C JV46_13160 Unavailable
MrpD1 Na+:H+ antiporter MrpEFGBBCDD, subunit D1 JV46_13140 Unavailable
MrpD2 Na+:H+ antiporter MrpEFGBBCDD, subunit D2 JV46_13120 Unavailable
ATP Synthases

Atpf0I F0F1-type ATP synthase, F0 subunit I JV46_17360

Atpf0A F0F1-type ATP synthase, F0 subunit A JV46_17370

Atpf0C F0F1-type ATP synthase, F0 subunit C JV46_17380

Atpf0B F0F1-type ATP synthase, F0 subunit B JV46_17390

Atpf1D F0F1-type ATP synthase, F1 subunit delta JV46_17400

Atpf1A F0F1-type ATP synthase, F1 subunit alpha JV46_17410

Atpf1G F0F1-type ATP synthase, F1 subunit gamma JV46_17420

Atpf1B F0F1-type ATP synthase, F1 subunit beta JV46_17430

Atpf1E F0F1-type ATP synthase, F1 subunit epsilon JV46_17450

AtpA0D A0A1-type ATP synthase, A0 subunit D JV46_14270

AtpA0B A0A1-type ATP synthase, A0 subunit B JV46_14280

AtpA0A A0A1-type ATP synthase, A0 subunit A JV46_14290

AtpA0F A0A1-type ATP synthase, A0 subunit F JV46_14310

AtpA1K A0A1-type ATP synthase, A1 subunit K JV46_14320

AtpA0I A0A1-type ATP synthase, A1 subunit I JV46_14330
Type IV Pilus
PilA1 Type IV pilus assembly major pilin protein PilA1 JV46_15650 Unavailable
PilA2 Type IV pilus assembly major pilin protein PilA2 JV46_15660 Unavailable
PilB Type IV pilus assembly major pilin protein PilB JV46_21670 Unavailable
PilC Type IV pilus assembly major pilin protein PilC JV46_21680 Unavailable
PilD Type IV pilus assembly major pilin protein PilD JV46_21690
PilE1 Type IV pilus prepilin-type N-terminal cleavage PilE1 JV46_13730 Unavailable
PilY1 Type IV pilus assembly tip-associated adhesin PilY1-like protein PilY1 JV46_13740 Unavailable
PilX Type IV pilus assembly protein PilX JV46_13810 Unavailable
PilW Type IV pilus assembly protein PilW JV46_13850 Unavailable
PilV Type IV pilus modification protein PilV JV46_13890 Unavailable
PilE2 Type IV pilus prepilin-type N-terminal cleavage/methylation domain PilE2 JV46_25290 Unavailable
PilX Type IV pilus assembly protein PilX JV46_25310 Unavailable
PilW Type IV pilus assembly protein PilW JV46_25320 Unavailable
FimU Type IV pilus pilin protein FimU JV46_25330 Unavailable


Table S3 (Continued).
PilP Type IV pilus assembly protein PilP JV46_23590 Unavailable
PilQ Type IV pilus secretin protein PilQ JV46_23600 Unavailable
PilU Type IV pilus assembly protein PilU JV46_21550 Unavailable
Calvin Cycle
CbbL ribulose 1,5-bisphosphate carboxylase large subunit CbbL JV46_07630
CbbS ribulose 1,5-bisphosphate carboxylase small subunit CbbS JV46_07620
CbbP Phosphoribulokinase JV46_10890
TK Transketolase JV46_03550
RPE Ribulose-phosphate 3-epimerase JV46_16690
RPI Ribose 5-phosphate isomerase JV46_17960
Gluconeogenesis
PpsA Phosphoenolpyruvate synthase JV46_19560
GapB Glyceraldehyde-3-phosphate dehydrogenase GapB JV46_03540
PK Pyruvate kinase JV46_03510
Polyglucose biosynthesis
PGM1 Phosphoglucomutase 1 JV46_12920
UDP UDP-glucose pyrophosphorylase JV46_24670
GS Glycogen synthases, ADP-glucose type JV46_06580
GBE Glycogen branching enzyme JV46_06590
GT 4-alpha-glucanotransferase JV46_06600
PYGL Glycogen phosphorylase JV46_06570
Glycolysis
GK Glucokinase JV46_10890
GPI Glucose-6-phosphate isomerase JV46_18220
PPi-PFK Pyrophosphate-dependent phosphofructokinase JV46_23230
FBPA Fructose-bisphosphate aldolase JV46_03500
TPI Triosephosphate isomerase JV46_20250
GapA Glyceraldehyde-3-phosphate dehydrogenase GapA JV46_25990
PGK Phosphoglycerate kinase JV46_03520
PGM2 Phosphoglycerate mutase 2 JV46_23670
Eno Enolase JV46_21020
PK Pyruvate kinase JV46_03510
PDH1 Pyruvate dehydrogenase 1 JV46_14530
PDH2 Pyruvate dehydrogenase 2 JV46_14520
TCA Cycle
CS Citrate synthase JV46_23370
ACO1 Aconitase 1 JV46_20010
ACO2 Aconitase 2 JV46_12860
ICD Isocitrate dehydrogenase JV46_14480
OGDH 2-oxoglutarate dehydrogenase JV46_13970
SCS1 Succinyl-CoA synthetase 1 JV46_15630
SCS2 Succinyl-CoA synthetase 2 JV46_15640
SdhC Succinate dehydrogenase SdhCDAB, subunit C JV46_18230
SdhD Succinate dehydrogenase SdhCDAB, subunit D JV46_18260
SdhA Succinate dehydrogenase SdhCDAB, subunit A JV46_18270
SdhB Succinate dehydrogenase SdhCDAB, subunit B JV46_18420
FumC Fumarase JV46_13940
Mqo Malate:quinone oxidoreductase Mqo JV46_27010
Glyoxylate Cycle
ICL Isocitrate lyase JV46_21990
MLS Alanine-glyoxylate aminotransferase JV46_07950
ME Malic enzyme JV46_23450
Fatty Acid Biosynthesis (FAB)
FabF 3-ketoacyl-ACP synthase II JV46_14750
PlsX Phosphate:acyl-[ACP] acyltransferase JV46_14800
Acc1 Acetyl-CoA carboxylase 1 JV46_14890
Acc2 Acetyl-CoA carboxylase 2 JV46_09790
Acc3 Acetyl-CoA carboxylase 3 JV46_26480
FabH 3-ketoacyl-ACP synthase III JV46_14790
FabG 3-ketoacyl-ACP reductase JV46_14770
ACP Acyl-carrier protein JV46_14760 Unavailable
FabA 3-hydroxyacyl-[ACP] dehydratase JV46_14940 Unavailable
FabI Enoyl-ACP reductase [NADH] JV46_13910
FabZ 3-hydroxydecanoyl-[ACP] dehydratase JV46_09760
FabD Malonyl CoA-ACP transacylase JV46_14780
Phospholipid synthesis
PlsB G3P-acyltransferase JV46_25110
CdsA CDP-diglyceride synthetase JV46_15020
PssA Phosphatidylserine synthase JV46_19530
Psd Phosphatidylserine decarboxylase JV46_07980
PlsC 1-acyl-G3P-acyltransferase JV46_23260
Non-mevalonate Pathway
Dxs 1-Deoxy-D-xylulose 5-phosphate synthase JV46_27320
IspC 1-Deoxy-D-xylulose 5-phosphate reductoisomerase JV46_15010


Table S3 (Continued).
IspD 4-diphosphocytidyl 2-C-methyl D-erythritol synthase JV46_15020
IspE 4-diphosphocytidyl 2-C-methyl D-erythritol kinase JV46_16240
IspF 2C-methyl D-erythritol 2,4-cyclodiphosphate synthase JV46_21050
IspG 4-hydroxy 3-methylbut 2-enyl diphosphate synthase JV46_06430
IspH 4-hydroxy 3-methylbut 2-enyl diphosphate reductase JV46_25680
HMG-CoA reductase pathway
GGGPPS Geranylgeranyl pyrophosphate synthase JV46_09930
FPPS Farnesyl-pyrophosphate synthase JV46_26060
Cell Wall Biosynthesis
GlmU Glucosamine-1-phosphate N-acetyltransferase GlmU JV46_17460
MurA MurABCDE, subunit A - UDP-N-acetylglucosamine 1-carboxyvinyltransferase JV46_15790
MurB MurABCDE, subunit B - UDP-N-acetylmuramate dehydrogenase JV46_15800
MurC MurABCDE, subunit C - UDP-N-acetylmuramate-L-alanine ligase JV46_15810
MurD MurABCDE, subunit D - UDP-N-acetylmuramoylalanine-D-glutamate ligase JV46_15850
MurE MurABCDE, subunit E - UDP-N-acetylmuramoyl-L-alanyl-D-glutamate:(L)-meso-2,6-diaminoheptanedioate gamma-ligase (ADP-forming) JV46_15890
Ddl D-alanine--D-alanine ligase Ddl JV46_15790
MurF UDP-N-acetylmuramoyl-tripeptide-D-alanyl-D-alanine ligase JV46_15880
MraY UDP-N-acetylmuramyl pentapeptide phosphotransferase JV46_13470
MurG UDP-N-acetylglucosamine--N-acetylmuramyl-(pentapeptide) JV46_15830
MtgA Monofunctional biosynthetic peptidoglycan transglycosylase JV46_19010 2.4.1.-
MrcA Penicillin-binding protein 1A JV46_23660 2.4.1.-
MrcB Penicillin-binding protein 1B JV46_27250
MrdA Penicillin-binding protein 2 JV46_11130
PbpB Penicillin-binding protein 2 JV46_15900
DacA Penicillin-binding protein 6 JV46_11170
DacB D-alanyl-D-alanine carboxypeptidase JV46_16060
MviN Integral membrane protein MviN JV46_26370 Unavailable
KpsF KpsF family protein JV46_14380 Unavailable
KdsA 3-deoxy-8-phosphooctulonate synthase JV46_21010
KdsB 3-deoxy-manno-octulosonate cytidylyltransferase JV46_20860
LpxA Acyl-[acyl-carrier-protein]--UDP-N-acetylglucosamine O-acyltransferase JV46_14930
LpxC UDP-3-O-[3-hydroxymyristoyl] N-acetylglucosamine deacetylase JV46_15720
LpxD UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase JV46_14950
LpxH UDP-2,3-diacylglucosamine hydrolase JV46_10860
LpxB Lipid-A-disaccharide synthase JV46_14920
LpxK Tetraacyldisaccharide 4'-kinase JV46_20840
KdtA 3-deoxy-D-manno-octulosonic-acid transferase JV46_29080 Unavailable
HtrB Lipid A biosynthesis lauroyl/palmitoleoyl acyltransferase JV46_23880 2.3.1.-
MsbB Lauroyl/myristoyl acyltransferase JV46_27370 2.3.1.-
Taurine Synthesis
TauD Taurine dioxygenase JV46_12240
ABC Transporters
YadH ABC-type polysaccharide/polyol phosphate export systems YadHG, subunit H JV46_14430 Unavailable
YadG ABC-type polysaccharide/polyol phosphate export systems YadHG, subunit G JV46_14440 Unavailable
MdlB ABC-type multidrug transport system MdlB JV46_11990 Unavailable
MacA RND family efflux transporter MacAB, subunit A JV46_04810 Unavailable
MacB RND family efflux transporter MacAB, subunit B JV46_04780 Unavailable
SalX ABC-type antimicrobial peptide transport system SalXY, subunit X JV46_04800 Unavailable
SalY ABC-type antimicrobial peptide transport system SalXY, subunit Y JV46_04790 Unavailable
Ttg2A ABC-type transport system involved in resistance to organic solvents Ttg2ACD, subunit A JV46_19850 Unavailable
Ttg2C ABC-type transport system involved in resistance to organic solvents Ttg2ACD, subunit C JV46_19860 Unavailable
Ttg2D ABC-type transport system involved in resistance to organic solvents Ttg2ACD, subunit D JV46_16460 Unavailable
CcmA ABC-type heme exporter protein CcmABCD, subunit A JV46_18340 Unavailable
CcmB ABC-type heme exporter protein CcmABCD, subunit B JV46_18350 Unavailable
CcmC ABC-type heme exporter protein CcmABCD, subunit C JV46_18360 Unavailable
CcmD ABC-type heme exporter protein CcmABCD, subunit D JV46_18370 Unavailable
DppC ABC-type oligopeptide transport systems DppCBBAF, subunit C JV46_03680 Unavailable
DppB1 ABC-type oligopeptide transport systems DppCBBAF, subunit B1 JV46_03740 Unavailable
DppB2 ABC-type oligopeptide transport systems DppCBBAF, subunit B2 JV46_03790 Unavailable
DppA ABC-type oligopeptide transport systems DppCBBAF, subunit A JV46_03890 Unavailable
DppF ABC-type oligopeptide transport systems DppCBBAF, subunit F JV46_20940 Unavailable
SmoK ABC-type sorbitol/mannitol transporter SmoKGFE, subunit K JV46_03940 Unavailable
SmoG ABC-type sorbitol/mannitol transporter SmoKGFE, subunit G JV46_03950 Unavailable
SmoF ABC-type sorbitol/mannitol transporter SmoKGFE, subunit F JV46_03960 Unavailable
SmoE ABC-type sorbitol/mannitol transporter SmoKGFE, subunit E JV46_03970 Unavailable
LivK ABC-type amino acid/amide transporter LivKHMG, subunit K JV46_04880 3.A.1.4.-
LivH ABC-type amino acid/amide transporter LivKHMG, subunit H JV46_04870 3.A.1.4.-
LivM ABC-type amino acid/amide transporter LivKHMG, subunit M JV46_04860 3.A.1.4.-
LivG ABC-type amino acid/amide transporter LivKHMG, subunit G JV46_04850 3.A.1.4.-
LivF ABC-type amino acid/amide transporter LivKHMG, subunit F JV46_04840 3.A.1.4.-
AapJ ABC-type amino acid transporter AapJQMP, subunit J JV46_28250 3.A.1.4.-
AapQ ABC-type amino acid transporter AapJQMP, subunit Q JV46_28260 3.A.1.4.-
AapM ABC-type amino acid transporter AapJQMP, subunit M JV46_28270 3.A.1.4.-


Table S3 (Continued).
AapP ABC-type amino acid transporter AapJQMP, subunit P JV46_28280 3.A.1.4.-
TauA ABC-type taurine transporter TauACB, subunit A JV46_26820
TauC ABC-type taurine transporter TauACB, subunit C JV46_26830
TauB ABC-type taurine transporter TauACB, subunit B JV46_26850
UrtA ABC-type urea transporter UrtABCDE, subunit A JV46_28500 Unavailable
UrtB ABC-type urea transporter UrtABCDE, subunit B JV46_28510 Unavailable
UrtC ABC-type urea transporter UrtABCDE, subunit C JV46_28520 Unavailable
UrtD ABC-type urea transporter UrtABCDE, subunit D JV46_28530 Unavailable
UrtE ABC-type urea transporter UrtABCDE, subunit E JV46_28540 Unavailable
TupC ABC-type cobalamin/Fe3+-siderophore transporter TupCBA, subunit C JV46_12830 Unavailable
TupB ABC-type cobalamin/Fe3+-siderophore transporter TupCBA, subunit B JV46_12840 Unavailable
TupA ABC-type cobalamin/Fe3+-siderophore transporter TupCBA, subunit A JV46_12850 Unavailable
PstB ABC-type phosphate transporter PstBACS, subunit B JV46_16720 3.A.1.7.1
PstA ABC-type phosphate transporter PstBACS, subunit A JV46_16730 3.A.1.7.1
PstC ABC-type phosphate transporter PstBACS, subunit C JV46_16740 3.A.1.7.1
PstS1 ABC-type phosphate transporter PstBACS, subunit S1 JV46_16750 3.A.1.7.1
PstS2 ABC-type phosphate transporter PstBACS, subunit S2 JV46_05400 3.A.1.7.1
ModC ABC-type molybdenum transporter ModCBA, subunit C JV46_17050 Unavailable
ModB ABC-type molybdenum transporter ModCBA, subunit B JV46_17060 Unavailable
ModA1 ABC-type molybdenum transporter ModCBA, subunit A1 JV46_17070 Unavailable
ModA2 ABC-type molybdenum transporter ModCBA, subunit A2 JV46_27160 Unavailable
ZnuB ABC-type Mn2+/Zn2+ transporter ZnuBCA, subunit B JV46_05790 Unavailable
ZnuC ABC-type Mn2+/Zn2+ transporter ZnuBCA, subunit C JV46_05800 Unavailable
ZnuA ABC-type Mn2+/Zn2+ transporter ZnuBCA, subunit A JV46_05810 Unavailable
FhuD ABC-type hemin transporter FhuDBC, subunit D JV46_07880
FhuB ABC-type hemin transporter FhuDBC, subunit B JV46_12940
FhuC ABC-type hemin transporter FhuDBC, subunit C JV46_12930
AfuC ABC-type spermidine/putrescine transporter AfuCBA, subunit C JV46_11970
AfuB ABC-type spermidine/putrescine transporter AfuCBA, subunit B JV46_11980
AfuA ABC-type spermidine/putrescine transporter AfuCBA, subunit A JV46_12000
Porins
PhoE Outer membrane protein PhoE JV46_26360 Unavailable
OmpA1 Outer membrane protein OmpA1 JV46_21440 Unavailable
OmpA2 Outer membrane protein OmpA2 JV46_16990 Unavailable
OmpC Outer membrane protein OmpC JV46_22220 Unavailable
Ion Channels
EriC Chloride channel protein EriC JV46_16700 Unavailable
AmtB1 ammonium transporter AmtB1 JV46_23120 Unavailable
AmtB2 ammonium transporter AmtB2 JV46_19350 Unavailable
Feo Transporter
FeoA Ferrous iron (Fe2+) transporter FeoAB, subunit A JV46_09850 Unavailable
FeoB Ferrous iron (Fe2+) transporter FeoAB, subunit B JV46_09860 Unavailable
P-Type ATPases
ActP P-type probable sodium:solute symporter ActP JV46_13540 Unavailable
CopA1 P-type copper/silver transporter CopA1 JV46_20160 Unavailable
CopA2 P-type copper/silver transporter CopA2 JV46_28240 Unavailable
ZntA P-type heavy metal transporter ZntA JV46_05290 Unavailable
MgtA P-type magnesium transporter MgtA JV46_26960 Unavailable
Sec System
SecA Protein translocase SecABDEFGY-YidC-YajC-Ffh-FtsY, subunit A JV46_15700 Unavailable
SecB Protein translocase SecABDEFGY-YidC-YajC-Ffh-FtsY, subunit B JV46_23180 Unavailable
SecD Protein translocase SecABDEFGY-YidC-YajC-Ffh-FtsY, subunit D JV46_03750 Unavailable
SecE Protein translocase SecABDEFGY-YidC-YajC-Ffh-FtsY, subunit E JV46_16180 Unavailable
SecF Protein translocase SecABDEFGY-YidC-YajC-Ffh-FtsY, subunit F JV46_03730 Unavailable
SecG Protein translocase SecABDEFGY-YidC-YajC-Ffh-FtsY, subunit G JV46_20260 Unavailable
SecY Protein translocase SecABDEFGY-YidC-YajC-Ffh-FtsY, subunit Y JV46_07170 Unavailable
YidC Protein translocase SecABDEFGY-YidC-YajC-Ffh-FtsY, subunit YidC JV46_17160 Unavailable
YajC Protein translocase SecABDEFGY-YidC-YajC-Ffh-FtsY, subunit YajC JV46_03760 Unavailable
Ffh Protein translocase SecABDEFGY-YidC-YajC-Ffh-FtsY, subunit Ffh JV46_04040 Unavailable
FtsY Protein translocase SecABDEFGY-YidC-YajC-Ffh-FtsY, subunit FtsY JV46_25480 Unavailable
Tat System
TatA Twin arginine-targeting protein translocase TatABC, subunit A JV46_18620 Unavailable
TatB Twin arginine-targeting protein translocase TatABC, subunit B JV46_18630 Unavailable
TatC Twin arginine-targeting protein translocase TatABC, subunit C JV46_18640 Unavailable
Type II Secretion System
GspC Type II protein secretion system GspDSCFGHIJKLMEO, subunit C JV46_18850 Unavailable
GspD Type II protein secretion system GspDSCFGHIJKLMEO, subunit D JV46_18860 Unavailable
GspE1 Type II protein secretion system GspDSCFGHIJKLMEO, subunit E1 JV46_18870 Unavailable
GspE2 Type II protein secretion system GspDSCFGHIJKLMEO, subunit E2 JV46_18880 Unavailable
GspF Type II protein secretion system GspDSCFGHIJKLMEO, subunit F JV46_18890 Unavailable
GspG Type II protein secretion system GspDSCFGHIJKLMEO, subunit G JV46_18910 Unavailable
GspH Type II protein secretion system GspDSCFGHIJKLMEO, subunit H JV46_18920 Unavailable
GspI Type II protein secretion system GspDSCFGHIJKLMEO, subunit I JV46_18930 Unavailable


Table S3 (Continued).
GspJ Type II protein secretion system GspDSCFGHIJKLMEO, subunit J JV46_18940 Unavailable
GspK Type II protein secretion system GspDSCFGHIJKLMEO, subunit K JV46_18950 Unavailable
GspL Type II protein secretion system GspDSCFGHIJKLMEO, subunit L JV46_18960 Unavailable
GspM Type II protein secretion system GspDSCFGHIJKLMEO, subunit M JV46_18970 Unavailable
GspN Type II protein secretion system GspDSCFGHIJKLMEO, subunit N JV46_18980 Unavailable
GspO Type II protein secretion system GspDSCFGHIJKLMEO, subunit O JV46_21690 Unavailable
Long-chain Fatty Acid Transporter
FadL Long-chain fatty acid transporter FadLD, subunit L JV46_22000 Unavailable
FadD Long-chain fatty acid transporter FadLD, subunit D JV46_11770 Unavailable
Tol System
TolQ Cell division and transport-associated protein TolQ (TC 2.C.1.2.1) JV46_27660 Unavailable
TolR Biopolymer transport protein JV46_27670 Unavailable
TolA TolA protein JV46_27680 Unavailable
TolB tol-pal system beta propeller repeat protein TolB JV46_27690 Unavailable
TonB Complex
ExbB1 Biopolymer transport protein ExbBD, subunit B1 JV46_20820 Unavailable
ExbB2 Biopolymer transport protein ExbBD, subunit B2 JV46_08350 Unavailable
ExbD1 Biopolymer transport protein ExbBD, subunit D1 JV46_20830 Unavailable
ExbD2 Biopolymer transport protein ExbBD, subunit D2 JV46_08360 Unavailable
TonB1 Outer membrane transport energization protein TonB1 JV46_15870 Unavailable
TonB2 Outer membrane transport energization protein TonB2 JV46_08370 Unavailable
BtuB Outer membrane cobalamin receptor protein BtuB JV46_25710 Unavailable
Multidrug Efflux Pump
AcrA1 Multidrug efflux pump AcrAB, subunit A1 JV46_09130 Unavailable
AcrB1 Multidrug efflux pump AcrAB, subunit B1 JV46_09140 Unavailable
AcrA2 Multidrug efflux pump AcrAB, subunit A2 JV46_11370 Unavailable
AcrB2 Multidrug efflux pump AcrAB, subunit B2 JV46_07710 Unavailable
AcrB3 Multidrug efflux pump AcrAB, subunit B3 JV46_03460 Unavailable
AcrA3 Multidrug efflux pump AcrAB, subunit A3 JV46_03470 Unavailable
AcrA4 Multidrug efflux pump AcrAB, subunit A4 JV46_21950 Unavailable
AcrB4 Multidrug efflux pump AcrAB, subunit B4 JV46_21960 Unavailable
AcrB5 Multidrug efflux pump AcrAB, subunit B5 JV46_12380 Unavailable
AcrA5 Multidrug efflux pump AcrAB, subunit A5 JV46_12390 Unavailable
AcrB6 Multidrug efflux pump AcrAB, subunit B6 JV46_13300 Unavailable
AcrA6 Multidrug efflux pump AcrAB, subunit A6 JV46_13310 Unavailable
AcrB7 Multidrug efflux pump AcrAB, subunit B7 JV46_26640 Unavailable
AcrA7 Multidrug efflux pump AcrAB, subunit A7 JV46_22530 Unavailable
TolC Type I secretion outer membrane protein TolC JV46_17820 Unavailable
TRAP Transporters
DctM1 TRAP-type C4-dicarboxylate transporter DctPQM, subunit M1 JV46_07020 Unavailable
DctQ1 TRAP-type C4-dicarboxylate transporter DctPQM, subunit Q1 JV46_07030 Unavailable
DctP1 TRAP-type C4-dicarboxylate transporter DctPQM, subunit P1 JV46_07040 Unavailable
DctP2 TRAP-type C4-dicarboxylate transporter DctPQM, subunit P2 JV46_11320 Unavailable
DctQ2 TRAP-type C4-dicarboxylate transporter DctPQM, subunit Q2 JV46_11330 Unavailable
DctM2 TRAP-type C4-dicarboxylate transporter DctPQM, subunit M2 JV46_11340 Unavailable
DctM3 TRAP-type C4-dicarboxylate transporter DctPQM, subunit M3 JV46_12640 Unavailable
DctQ3 TRAP-type C4-dicarboxylate transporter DctPQM, subunit Q3 JV46_12650 Unavailable
DctP3 TRAP-type C4-dicarboxylate transporter DctPQM, subunit P3 JV46_12660 Unavailable
DctM4 TRAP-type C4-dicarboxylate transporter DctPQM, subunit M4 JV46_14070 Unavailable
DctQ4 TRAP-type C4-dicarboxylate transporter DctPQM, subunit Q4 JV46_14080 Unavailable
DctP4 TRAP-type C4-dicarboxylate transporter DctPQM, subunit P4 JV46_14090 Unavailable
DctM5 TRAP-type C4-dicarboxylate transporter DctPQM, subunit M5 JV46_25940 Unavailable
DctM6 TRAP-type C4-dicarboxylate transporter DctPQM, subunit M6 JV46_25950 Unavailable
DctQ5 TRAP-type C4-dicarboxylate transporter DctPQM, subunit Q5 JV46_25960 Unavailable
DctP5 TRAP-type C4-dicarboxylate transporter DctPQM, subunit P5 JV46_25970 Unavailable
Secondary Transporters
TrkA K+ transporter TrkAH, subunit A JV46_29030 Unavailable
TrkH K+ transporter TrkAH, subunit H JV46_29040 Unavailable
CitT1 Di- and tricarboxylate transporter CitT1 JV46_03580 Unavailable
CitT2 Di- and tricarboxylate transporter CitT2 JV46_17850 Unavailable
ZupT Divalent heavy-metal cations transporter ZupT JV46_27760 Unavailable
MgtE Mg2+ transporter MgtE JV46_25230 Unavailable
NptA Na/Pi cotransporter NptA JV46_05390 Unavailable
KefB1 Sodium/proton antiporter KefB1 JV46_24480 Unavailable
KefB2 Sodium/proton antiporter KefB2 JV46_28430 Unavailable
KefB3 Sodium/proton antiporter KefB3 JV46_09560 Unavailable
NhaP Na+/H+ and K+/H+ antiporter NhaP JV46_14160 Unavailable
LysE Putative threonine efflux protein LysE JV46_05240 Unavailable
FieF Cation diffusion facilitator FieF JV46_17540 Unavailable


Table S4. Parameters of the gene prediction software.

Parameter     Value            Explanation

GeneMarkS
Parameters specified:
--gcode       11
--shape       circular
--prok
Default parameters:
--order       2                Markov chain order. Default = 2
--motif       1                Default = 1
--width       6                Default = 6
--prestart    26               Length of seq. upstream of translation initiation site that includes motif. Default = 26
--identity    0.99             Identity level for termination of iterations. Default = 0.99
--maxitr      10               Maximum no of iterations. Default = 10
--fixmotif                     Motif is located at a fixed position with respect to start
--offover                      Overlap allowed by default
--strand                       Strand to predict gene in. Default both

Prodigal
Parameters specified:
-g            11               Translation table. Default = 11
Default parameters:
-c                             Genes not allowed to run off edges
-n                             Not used, hence did not bypass Shine-Dalgarno trainer
-m                             Not used, hence allowed genes to be built across n's

Glimmer
Parameters specified:
-n                             No header
-t            1,15             Genes with entropy score less than 1.15 will be considered
-z            11               GenBank translation table used
-f                             Consider the option g
-g            60               Minimum gene length to n nucleotides. Does not include the bases in the stop codon
-A            ATG, GTG, TTG    Codon list default
–                              Circular genome used, probability of all start codons considered equal
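As a minimal sketch of how the explicitly specified options in Table S4 could be kept in one place and turned into command-line arguments, the following Python fragment may help. Only the option names and values for GeneMarkS and Prodigal are taken from the table; the dictionary layout and the helper name `to_argument_list` are illustrative assumptions, and no executable names or input files are implied. Glimmer is left out because its rows are partly garbled in the table.

```python
# Option names and values are copied from Table S4 ("Parameters specified").
# The data layout and the helper below are hypothetical, for illustration only.
SPECIFIED_OPTIONS = {
    "GeneMarkS": [("--gcode", "11"), ("--shape", "circular"), ("--prok", None)],
    "Prodigal": [("-g", "11")],
}


def to_argument_list(options):
    """Flatten (flag, value) pairs into a list of command-line arguments.

    Flags without a value (value is None) are emitted on their own.
    """
    arguments = []
    for flag, value in options:
        arguments.append(flag)
        if value is not None:
            arguments.append(value)
    return arguments


if __name__ == "__main__":
    for tool, options in SPECIFIED_OPTIONS.items():
        print(tool, " ".join(to_argument_list(options)))
```

Keeping the arguments as a list rather than a single string makes it straightforward to append an input file and pass the result to a process launcher without shell quoting.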


Genomic Utility for Automated Comparison (GUAC). Saved: 9/24/14, 00:37:02

# Oleg Dmytrenko 1 August 2011

import sys
import os
import glob
import re
import shutil
import optparse

##### Functions #####

def get_aa_sequence(geneNAME, fileNAME, geneID):
    fileNAME += '.fasta'
    fileNAME.join('') # Operation for a string, not a list; lists are joined with ''.join(...) in the main section
    outputNAME = geneNAME
    outputNAME += '.fasta'
    output = ''
    output += './Gene aa sequences are here/'
    output += outputNAME
    dirname = 'Gene aa sequences are here'
    if not os.path.isdir('./' + dirname + '/'): # This part creates an output directory if it does not exist yet.
        os.mkdir('./' + dirname + '/')
    output = open(output, 'a')
    for file in glob.iglob(fileNAME):
        found = 0
        for line in open(file):
            if (line == '\n' and found == 1):
                found = 0
                output.write('\n')
            # The search call below is reconstructed; it is garbled in the printed listing.
            if (('>' in line and re.search(geneID, line, re.A) != None) or found == 1):
                found = 1
                output.write(line)

gene_set = set()
def clustal_W(org, gene, id, mark):
    if id not in gene_set:
        gene_set.add(id)
        org += '.fasta'
        org.join('') # Operation for a string, not a list; lists are joined with ''.join(...) in the main section
        outputNAME = gene
        outputNAME += '.fasta'
        output = ''
        output += './Clustal analysis results/'
        output += outputNAME
        dirname = 'Clustal analysis results'



        if not os.path.isdir('./' + dirname + '/'): # This part creates an output directory if it does not exist yet.
            os.mkdir('./' + dirname + '/')
        output = open(output, 'a')
        for file in glob.iglob(org):
            found = 0
            for line in open(file):
                if (line == '\n' and found == 1):
                    found = 0
                    output.write('\n')
                # The search call below is reconstructed; it is garbled in the printed listing.
                if '>' in line and re.search(id, line, re.A) != None:
                    new_line = re.findall('>(.*)', line, re.A)
                    new_new_line = ''
                    new_new_line += '>'
                    new_new_line += mark
                    new_new_line += str(new_line[0])
                    new_new_line += '\n'
                    new_new_line.join('')
                    found = 1
                    output.write(new_new_line)
                elif found == 1:
                    output.write(line)

def findall_genes_ID(ID): #Finds all the genes as defined in the gene ID .query file and returns a dictionary with gene ID's as keys and genome names as lists of values
    List = []
    gene_ID_list = []
    for filename in glob.iglob('*.txt'):
        for line in open(filename):
            result = re.findall('[^-a-zA-Z]' + '[\t]?' + ID + '[\t]?' + '[^-a-zA-Z]', line, re.I|re.A) # ('[^-]' + '[\t]?'+ '[^a-z]' + ID + '[\t]?' + '[^-dependent][^-regulated][^-accessory]', line, re.I|re.A) Make it not search combinations which are part of a word
            clean_result = re.findall(ID, str(result), re.I|re.A)
            if result != []:
                mystring=clean_result[0]
                check = re.findall('\t([a-zA-Z]{3,5}\d?)\t', line, re.A)
                numbered = re.findall('\t([a-zA-Z]{3,4}[0-9])\t', line, re.A)
                if len(check) >= 1:
                    if len(ID) == 3:
                        newID = ID.lower()
                        if newID != check[0]:
                            continue
                        else:
                            gene_ID = re.findall('^(\d+)', line, re.A)
                            gene_ID = gene_ID[0]
                            gene_ID_list.append(gene_ID)
                            shortfilename = filename[:-4]
                            get_aa_sequence(ID, shortfilename, gene_ID)
                            clustal_W(shortfilename, ID, gene_ID, '00_')



                    elif len(ID) == 4:
                        newID = ID[:-1].lower()+ID[-1:].capitalize()
                        if newID != check[0]: #removes erroneous matches
                            continue
                        else:
                            gene_ID = re.findall('^(\d+)', line, re.A)
                            gene_ID = gene_ID[0]
                            gene_ID_list.append(gene_ID)
                            shortfilename = filename[:-4]
                            get_aa_sequence(ID, shortfilename, gene_ID)
                            clustal_W(shortfilename, ID, gene_ID, '00_')
                    elif len(numbered) == 1:
                        newID = ID[:-2].lower()+ID[-2:].capitalize()
                        if newID != check[0]: #removes erroneous matches
                            continue
                        else:
                            gene_ID = re.findall('^(\d+)', line, re.A)
                            gene_ID = gene_ID[0]
                            gene_ID_list.append(gene_ID)
                            shortfilename = filename[:-4]
                            get_aa_sequence(ID, shortfilename, gene_ID)
                            clustal_W(shortfilename, ID, gene_ID, '00_')
                elif len(check) == 0: #if gene ID is not easily identifiable
                    second_search = re.findall('\t{2}([^\t]*)', line, re.I|re.A)
                    third_search = re.findall(r'[^\w]([\w]{3,5}\d?)', str(second_search), re.I|re.A)
                    capital = re.compile('[A-Z]', re.A)
                    lower = re.compile('[a-z]', re.A)
                    for item in third_search[:]: #iterates over a copy of the list
                        if capital.match(item, 1) != None:
                            third_search.remove(item)
                        elif lower.match(item, 3) != None:
                            third_search.remove(item)
                    if len(third_search) >= 2:
                        choice = ''
                        while not choice == 'y' and not choice == 'n':
                            print('Does this search result correspond to the ID query', '\033[0;31m' ,ID, '\033[1;m', '?')
                            new_line = line.replace('\n', '')
                            print(new_line)
                            choice = input('Type y/n and press Return: ')
                            print('\n')
                        if choice == 'n':
                            continue
                        elif choice == 'y':
                            gene_ID = re.findall('^(\d+)', line, re.A)
                            gene_ID = gene_ID[0]



                            gene_ID_list.append(gene_ID)
                            shortfilename = filename[:-4]
                            get_aa_sequence(ID, shortfilename, gene_ID)
                            clustal_W(shortfilename, ID, gene_ID, '00_')
                    else:
                        gene_ID = re.findall('^(\d+)', line, re.A)
                        gene_ID = gene_ID[0]
                        gene_ID_list.append(gene_ID)
                        shortfilename = filename[:-4]
                        get_aa_sequence(ID, shortfilename, gene_ID)
                        clustal_W(shortfilename, ID, gene_ID, '00_')

def find_COG(ID, file_name):
    list = ''
    count = 0
    for line in open(file_name):
        if str(ID) in line:
            if 'COG_category' in line:
                COG = []
                COG = re.findall('(\[\w\]\s\w+.+)\t\t', line, re.A)
                if COG:
                    if count == 0:
                        list += COG[0]
                        count += 1
                    elif count != 0:
                        list += ', '
                        list += COG[0]
    return(list)

def find_COG_number(ID, file_name):
    list = ''
    count = 0
    for line in open(file_name):
        if str(ID) in line:
            if 'COG' in line:
                COG_number = []
                COG_number = re.findall('(COG\d+)\t', line, re.A)
                if COG_number:
                    if count == 0:
                        list += COG_number[0]
                        count += 1
                    elif count != 0:
                        list += ', '
                        list += COG_number[0]
    return(list)
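The two lookup functions above, find_COG and find_COG_number, differ only in the keyword they require on a line and in the regular expression they apply. As a hedged illustration (not part of the original GUAC listing), the shared logic could be sketched as a single helper. The name `collect_matches` and the sample annotation lines are hypothetical, and the helper takes an iterable of lines instead of a file name so that it can be tried without input files.

```python
import re


def collect_matches(gene_id, lines, keyword, pattern):
    """Return the first regex capture from every line that mentions both
    gene_id and keyword, joined with ', ' (as the find_* functions do)."""
    hits = []
    for line in lines:
        if str(gene_id) in line and keyword in line:
            found = re.findall(pattern, line, re.A)
            if found:
                hits.append(found[0])
    return ', '.join(hits)


if __name__ == "__main__":
    # Made-up annotation lines, for illustration only.
    sample_lines = [
        "640001\tCOG0057\tfirst annotation\n",
        "640001\tCOG0126\tsecond annotation\n",
        "640002\tCOG9999\tanother gene\n",
    ]
    print(collect_matches(640001, sample_lines, 'COG', r'(COG\d+)\t'))
```

With a file on disk, the same helper could be called as `collect_matches(ID, open(file_name), 'COG', r'(COG\d+)\t')`, since an open file iterates over its lines.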




def find_name(ID, file_name):
    list = ''
    count = 0
    for line in open(file_name):
        if str(ID) in line:
            if 'Product_name' in line:
                Name = []
                Name = re.findall('Product_name\t\t(.+)\t', line, re.A)
                if Name:
                    if count == 0:
                        list += Name[0]
                        count += 1
                    elif count != 0:
                        list += ', '
                        list += Name[0]
    return(list)

def find_EC(ID, file_name):
    list = ''
    count = 0
    for line in open(file_name):
        if str(ID) in line:
            if 'EC:' in line:
                EC = []
                EC = re.findall('EC:(.+)]\t', line, re.A)
                if EC:
                    if count == 0:
                        list += EC[0]
                        count += 1
                    elif count != 0:
                        list += ', '
                        list += EC[0]
    return(list)

def find_KEGG(ID, file_name):
    list = ''
    count = 0
    for line in open(file_name):
        if str(ID) in line:
            if 'KO' in line:
                KO = []
                KO = re.findall('KO:(K\d+)', line, re.A)
                if KO:
                    if count == 0:



                        list += KO[0]
                        count += 1
                    elif count != 0:
                        list += ', '
                        list += KO[0]
    return(list)


##### Main #####

if os.path.isdir('./BLAST Output is Here/'):
    shutil.rmtree('./BLAST Output is Here/') #Removes directory and everything else down the tree

if os.path.isdir('./Gene aa sequences are here/'):
    shutil.rmtree('./Gene aa sequences are here/')

if os.path.isdir('./Results are here/'):
    shutil.rmtree('./Results are here/')

if os.path.isdir('./Clustal analysis results/'):
    shutil.rmtree('./Clustal analysis results/')


parser = optparse.OptionParser()
parser.add_option('-b', dest='bit_score', type='float', help=('Type -b cut-off bit score. Default == 50'))
parser.add_option('-i', dest='per_id', type='float', help=('Type -i cut-off % identity. Default == 30'))
parser.add_option('-a', dest='per_align', type='float', help=('Type -a cut-off % of alignment length over query length. Default == 40'))
parser.set_defaults(bit_score=50, per_id=30, per_align=40)
(options, args) = parser.parse_args()

for filename in glob.iglob('*.query'): #This for-loop creates a list of all the genes (LG) in the query file in the order they are listed in the file.
    LG =[]
    for line in open(filename):
        if line[0] == '\n':
            break
        elif line[-1] != '\n':
            LG.append(line)
        elif line[-1] == '\n':
            length = len(line) - 1
            line = (line[:length])
            LG.append(line)

LO = []
for filename in glob.iglob('*.txt'): #This for-loop creates a list of all the organisms' genomes (LO) in the target directory in alphabetical order.
    LO.append(filename[:-4])
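The three cut-offs defined by the command-line options above (bit score, % identity, and % of alignment length over query length) are meant to be applied to BLAST hits. The following is a minimal sketch of such a filter, assuming the default 12-column tabular output that `-outfmt 6` produces (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore). The function name `passes_cutoffs`, the use of "greater than or equal" comparisons, and the explicit `query_length` argument are assumptions for illustration; the query length is not part of the default tabular output and would have to come from the query FASTA file.

```python
def passes_cutoffs(hit_line, query_length,
                   bit_score=50.0, per_id=30.0, per_align=40.0):
    """Check one tab-separated BLAST hit against the three cut-offs.

    Assumes the default -outfmt 6 column order; query_length is the
    length of the query sequence in residues (hypothetical argument).
    """
    fields = hit_line.rstrip('\n').split('\t')
    percent_identity = float(fields[2])    # pident
    alignment_length = int(fields[3])      # length
    hit_bit_score = float(fields[11])      # bitscore
    percent_aligned = 100.0 * alignment_length / query_length
    return (hit_bit_score >= bit_score
            and percent_identity >= per_id
            and percent_aligned >= per_align)


if __name__ == "__main__":
    # Made-up hit line, for illustration only.
    example = "q1\ts1\t45.0\t120\t60\t2\t1\t120\t5\t124\t1e-30\t110.0"
    print(passes_cutoffs(example, query_length=200))
```

Passing the thresholds as keyword arguments with the same defaults as the option parser keeps the filter usable both from the script and on its own.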



271 LO.sort()
272
273 Dictionary = {}
274 genome_gene_ID_dict = {} #Dictionaries for the memory module
275 gene_score_dict = {}
276 gene_ID_gene_symbol_dict = {}
277
278 outFILE = 'genome_comparison_table.xls'
279 outDIR = './Results are here/'
280 outDIR += outFILE
281 outDIR.join('')
282 dirname = 'Results are here'
283
284 if not os.path.isdir('./' + dirname + '/'):
285     os.mkdir('./' + dirname + '/')
286
287 print(' \n***Finding genes and the corresponding amino acid sequences*** \n', '\033[0;32mThis module may require your input to resolve ambiguous matches. Wait until you see the next green message\033[1;m \n' )
288
289 for ID in LG:
290     findall_genes_ID(ID) #This is where the search dictionary is returned
291
292 for O in LO:
293     genome_gene_ID_dict[O] = set()
294
295 dirname = 'BLAST Output is Here'
296 if not os.path.isdir('./' + dirname + '/'):
297     os.mkdir('./' + dirname + '/')
298
299 print('***Performing BLAST search for genes not identified in annotation*** \n'
300       '***And identifying the best alignments in the target genomes*** \n', '\033[0;32mHave a coffee\033[1;m \n')
301 for O in LO:
302     for G in LG:
303         DB = []
304         makeDB = []
305         DB.append(O)
306         DB.append('.fasta.psq')
307         makeDB.append(O)
308         makeDB.append('.fasta')
309         makeDB = ''.join(makeDB) #Operation to join lists not strings, compare to line 13
310         DB = ''.join(DB)
311         if not (DB in glob.iglob('*.psq')): #Creates BLAST protein databases using existing aa .fasta files for whole genomes
312             imput = []
313             imput.append('makeblastdb -in ')
314             imput.append(makeDB)
315             imput.append(' -dbtype prot')



316             imput = ''.join(imput)
317             os.system (imput)
318         else:
319             full_path = []
320             full_path.append('./Gene\ aa\ sequences\ are\ here/')
321             full_path.append(G)
322             full_path.append('.fasta')
323             full_path = ''.join(full_path)
324             check_path = []
325             check_path.append('Gene aa sequences are here/')
326             check_path.append(G)
327             check_path.append('.fasta')
328             check_path = ''.join(check_path)
329             blast_query = []
330             blast_query.append('blastp -db ')
331             blast_query.append(makeDB)
332             blast_query.append(' -outfmt 6')
333             blast_query.append(' -query ')
334             blast_query.append(full_path)
335             blast_query = ''.join(blast_query)
336             blast_output = []
337             analysis_input = []
338             blast_output.append(' -out ')
339             output_folder = []
340             output_folder = ('./BLAST\ Output\ is\ Here/')
341             analysis_output_folder = []
342             analysis_output_folder = ('./BLAST Output is Here/')
343             blast_output.append(output_folder)
344             blast_output.append(O)
345             analysis_input.append(analysis_output_folder)
346             analysis_input.append(O)
347             blast_output.append('_')
348             analysis_input.append('_')
349             blast_output.append(G)
350             analysis_input.append(G)
351             blast_output.append('.out')
352             analysis_input.append('.out')
353             blast_output = ''.join(blast_output)
354             analysis_input = ''.join(analysis_input)
355             blast_input = []
356             blast_input.append(blast_query)
357             blast_input.append(blast_output)
358             blast_input = ''.join(blast_input)
359             if os.path.exists('./' + check_path): #Makes sure no ugly writing appears in the terminal if there are no aa sequences
360                 os.system(blast_input)



361             if glob.iglob(analysis_input) != None:
362                 for blast_out in glob.iglob(analysis_input):
363                     for line in open(blast_out):
364                         bit_score = re.findall('(\d+\.?\d?)$', line, re.A)
365                         query = re.findall('^(\d+)\t\d+', line, re.A)
366                         subject = re.findall('^\d+\t(\d+)', line, re.A) #subject gene object ID e.g. 643529311
367                         percent_id = re.findall ('^\d+\t\d+\t(\d+?\.?\d+?)\t', line, re.A) #percent identity
368                         alignment_length = re.findall('^\d+\t\d+\t\d+?\.?\d+?\t(\d+)', line, re.A) #length of the alignment
369                         q_length = re.findall('^\d+\t\d+\t\d+?\.?\d+?\t\d+\t\d+\t\d+\t\d+\t(\d+)', line, re.A) #length of the query sequence
370                         s_length = re.findall('^\d+\t\d+\t\d+?\.?\d+?\t\d+\t\d+\t\d+\t\d+\t\d+\t\d+\t(\d+)', line, re.A) #length of the subject sequence
371                         percent_alignment = 0
372                         percent_alignment = int(alignment_length[0])*100/int(s_length[0]) #percent of the alignment over the subject length
373                         score_list = []
374                         score_list.append(float(bit_score[0]))
375                         score_list.append(percent_alignment)
376                         score_list.append(float(percent_id[0]))
377                         score_list.append(int(query[0]))
378                         if float(bit_score[0]) >= options.bit_score and percent_alignment >= options.per_align and float(percent_id[0]) >= options.per_id:
379                             gene_object_ID = int(subject[0])
380                             if gene_object_ID not in gene_ID_gene_symbol_dict:
381                                 gene_ID_gene_symbol_dict.setdefault(gene_object_ID, G)
382                                 gene_score_dict.setdefault(gene_object_ID, score_list)
383                                 genome_gene_ID_dict.setdefault(O, set()).add(gene_object_ID)
384                             elif gene_object_ID in gene_ID_gene_symbol_dict:
385                                 if float(bit_score[0]) > float(gene_score_dict[gene_object_ID][0]):
386                                     gene_score_dict[gene_object_ID] = score_list
387                                     del gene_ID_gene_symbol_dict [gene_object_ID]
388                                     gene_ID_gene_symbol_dict.setdefault(gene_object_ID, G)
389
390 for G in LG: #creates {Dictionary} with gene symbols (keys) and 0's as values for each organism
391     list = []
392     for O in LO:
393         list.append(0)
394     Dictionary[G] = list
395
396 for O in LO: #Updates {Dictionary} based on the BLAST search
397     for gene_ID in genome_gene_ID_dict[O]:
398         current_index = LO.index(O)
399         templist = Dictionary[gene_ID_gene_symbol_dict[gene_ID]]
400         templist[current_index] += 1
401         Dictionary[gene_ID_gene_symbol_dict[gene_ID]] = templist
402
403 table = open(outDIR, 'a') #Prints the table header
404 for item in LO:
405     table.write('\t' + str(item))



406 table.write('\n')
407
408 for gene_item in LG: #Print {Dictionary} into a table
409     table.write(str(gene_item))
410     for cell in Dictionary[gene_item]:
411         table.write('\t' + str(cell))
412     table.write('\n')
413
414 print('***Finding information about the identified genes*** \n')
415
416 outORGdir = './Results are here/'
417
418 for O in LO:
419     for ID in genome_gene_ID_dict[O]:
420         if ID in gene_ID_gene_symbol_dict:
421             symbol = gene_ID_gene_symbol_dict[ID]
422             culstal_id = str(ID)
423             clustal_W(O, symbol, culstal_id, '01_')
424             outORGfile = ''
425             outORGfile += outORGdir
426             outORGfile += O
427             outORGfile += '_'
428             outORGfile += symbol
429             outORGfile += '.xls'
430             outORGfile.join('')
431             check = []
432             check.append('Results are here/')
433             check.append(O)
434             check.append('_')
435             check.append(symbol)
436             check.append('.xls')
437             check = ''.join(check)
438             if not os.path.exists('./' + check):
439                 table = open(outORGfile, 'a')
440                 table.write('gene symbol' + '\t' + 'bit score' + '\t' + '% alignment' + '\t' + '% identity' + '\t' + 'query' + '\t' + 'genome ID' + '\t'+ 'gene product name' '\t' + 'EC number' + '\t'+ 'COG category' + '\t' +'COG number' + '\t' + 'KEGG category' + '\t' + 'alternative names' + '\t' + 'function' + '\t' + 'notes' + '\n')
441                 table.write(str(symbol))
442             table = open(outORGfile, 'a')
443             search_file_name = ''
444             search_file_name += O
445             search_file_name += '.info'
446             search_file_name.join('')
447             table.write('\t' + str(gene_score_dict[ID][0]) + '\t')
448             table.write(str(gene_score_dict[ID][1]) + '\t')
449             table.write(str(gene_score_dict[ID][2]) + '\t')
450             table.write(str(gene_score_dict[ID][3]) + '\t')



451             table.write(str(ID) + '\t')
452             Name = []
453             Name = find_name(ID, search_file_name)
454             if Name:
455                 table.write(str(Name) + '\t')
456             elif not Name:
457                 table.write('\t')
458             EC = []
459             EC = find_EC(ID, search_file_name)
460             if EC:
461                 table.write(str(EC) + '\t')
462             elif not EC:
463                 table.write('\t')
464             COG = []
465             COG = find_COG(ID, search_file_name)
466             if COG:
467                 table.write(str(COG) + '\t')
468             elif not COG:
469                 table.write('\t')
470             COG_number = []
471             COG_number = find_COG_number(ID, search_file_name)
472             if COG_number:
473                 table.write(str(COG_number) + '\t')
474             elif not COG_number:
475                 table.write('\t')
476             KEGG = []
477             KEGG = find_KEGG(ID, search_file_name)
478             if KEGG:
479                 table.write(str(KEGG) + '\t' + '\n')
480             elif not KEGG:
481                 table.write('\t' + '\n')
482
483 print('***Performing ClustalW analysis to check for misidentified genes. \n'
484       'See alignments in \"Clustal analysis results\" directory. \n'
485       'If needed, rerun GUAC with adjusted bit score, % identity, and % alignment cut off values. *** \n')
486
487 for G in LG:
488     full_path = []
489     full_path.append('./Clustal analysis results/')
490     full_path.append(G)
491     full_path.append('.fasta')
492     full_path = ''.join(full_path)
493     #print(full_path)
494     if os.path.exists(full_path):
495         clustal_query = []



496         clustal_query.append('clustalw2')
497         clustal_query.append(' -infile=')
498         clustal_query.append('./Clustal\ analysis\ results/')
499         clustal_query.append(G)
500         clustal_query.append('.fasta')
501         clustal_query.append(' -align')
502         clustal_query.append(' -outfile=')
503         clustal_query.append('./Clustal\ analysis\ results/')
504         clustal_query.append(G)
505         clustal_query.append('.aln')
506         clustal_query = ''.join(clustal_query)
507         os.system(clustal_query)
508         clustal_tree_query = []
509         clustal_tree_query.append('clustalw2')
510         clustal_tree_query.append(' -infile=')
511         clustal_tree_query.append('./Clustal\ analysis\ results/')
512         clustal_tree_query.append(G)
513         clustal_tree_query.append('.fasta')
514         clustal_tree_query.append(' -tree')
515         clustal_tree_query.append(' -outfile=')
516         clustal_tree_query.append('./Clustal\ analysis\ results/')
517         clustal_tree_query.append(G)
518         clustal_tree_query.append('.ph')
519         clustal_tree_query = ''.join(clustal_tree_query)
520         os.system(clustal_tree_query)
521
522 print('***Analysis complete! Output in \"Results are here\" directory*** \n')
523 sys.exit()
524
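The listing reads each BLAST result line with a series of regular expressions. As a minimal sketch of an alternative, and not part of GUAC itself, the same tabular output (-outfmt 6) can be split on tabs and addressed by column name. The function name filter_hits and the two example lines are hypothetical and only illustrate the idea; the cut-off defaults mirror the -b and -i options above. The alignment-length cut-off is left out here because the default twelve columns do not include sequence lengths.

```python
import csv
import io

# Column names of BLAST+ tabular output (-outfmt 6) in their default order.
BLAST_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def filter_hits(handle, min_bits=50.0, min_identity=30.0):
    """Yield hits (as dicts) that pass the bit-score and % identity cut-offs."""
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) != len(BLAST_COLUMNS):
            continue  # skip blank or malformed lines
        hit = dict(zip(BLAST_COLUMNS, row))
        if float(hit["bitscore"]) >= min_bits and float(hit["pident"]) >= min_identity:
            yield hit


# Hypothetical two-line result, used only to exercise the function.
example = io.StringIO(
    "101\t643529311\t45.20\t200\t100\t3\t1\t200\t5\t204\t1e-50\t180\n"
    "101\t643529312\t25.00\t80\t60\t0\t1\t80\t1\t80\t0.5\t30.4\n"
)
hits = list(filter_hits(example))
print([hit["sseqid"] for hit in hits])
```

Naming the columns makes it harder to confuse an alignment coordinate with a sequence length, which is easy to do when counting tab-separated groups in a regular expression.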



Supplementary material for Chapter 2:

The “missing enzyme” in the enigmatic Calvin cycle of chemoautotrophic bacterial symbionts


Sequencing of the symbiont-enriched and unenriched mRNA

Approximately 1.6 million cDNA sequencing reads were obtained from RNA unenriched in the symbiont mRNA transcripts and over 1.35 million from the enriched RNA (Appendix 2 Table A2.1). The majority of sequences were rRNA transcripts. mRNA enrichment decreased the share of rRNA reads by only 3.7 percentage points for the symbiont and 5.4 percentage points for the host, while leading to the fragmentation of transcripts (Appendix 2 Figure A2.1) and loss of some non-rRNA sequences (Appendix 2 Table A2.1). Host mRNA removal was effective, since less than 1% of non-rRNA reads were identified as eukaryotic. Roughly 0.4% of the reads were mitochondrial. Cytochrome c oxidase subunit I (COX1), a respiratory electron transport chain protein, was the most highly transcribed mitochondrial protein-coding gene (Appendix 2 Figure A2.2). Of the bacterial reads, 0.1% could not be referenced to the genome of the symbiont and may originate from the gill surface-associated microbial community (Appendix 2 Table A2.1).


Figure A2.1. Length distribution of the cDNA sequencing reads from the symbiont-containing gill tissue of S. velum.


Figure A2.2. Gene expression across the mitochondrial genome of S. velum. From outside to the center: mitochondrial DNA (Mb); gene expression per nucleotide, with the symbiont unenriched cDNA in blue and symbiont-enriched cDNA in grey; genes (green) with the corresponding gene names.


Figure A2.3. Transcriptional activity of tRNA genes in the S. velum symbiont.


Figure A2.4. Initial reverse reaction velocities of the symbiont PPi-PFK without and with 0.15 U PPase over a range of phosphate concentrations. Measurements were performed at pH 7.5 and 25°C. Standard deviations from three replicate measurements are shown.


Figure A2.5. Symbiont PPi-PFK activity (A) without PPase and (B) with 0.15 U PPase at different phosphate concentrations with 5 mM FBP. Measurements were performed at pH 7.5 and 25°C. Standard deviations from three replicate measurements are shown.


Table A2.1. Number of transcripts from S. velum symbiont-containing gill tissue unenriched and enriched in the symbiont mRNA. Transcript reads were mapped to the symbiont genome (Dmytrenko et al. 2014) or the mitochondrial genome of the host (Plazzi et al. 2013). Unmapped reads were queried with BLASTN against the NCBI nucleotide database. Data marked in bold were used in gene expression analysis (Figures 2.2 and 2.3).

                                         cDNA unenriched        cDNA symbiont enriched
Read types                               # of reads    %        # of reads    %
Total                                    1,591,449     100.0    1,350,648     100.0
Unique non-duplicates                    1,184,432     74.4     884,578       65.5
Duplicates                               768,731       26       581,423       35
5S, 16S, 23S symbiont rRNA               415,701       26.1     303,155       22.4
18S, 28S host rRNA                       549,041       34.5     392,671       29.1
12S, 16S mitochondrial rRNA              62,019        3.9      46,462        3.4
S. velum mitochondrial mRNA and tRNA     7,038         0.4      5,770         0.4
S. velum symbiont genomic mRNA & tRNA    53,513        3.4      45,101        3.3
Non-symbiont bacterial RNA               1,073         0.1      1,412         0.1
Eukaryotic RNA                           10,085        0.6      11,180        0.8
Unassigned                               85,962        5.4      78,827        5.8
Mean read length (bp)                    316                    176
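The rRNA percentages quoted in the text can be recomputed from the read counts in Table A2.1. The short sketch below does only that arithmetic; the variable and function names are illustrative and are not taken from the dissertation's analysis scripts.

```python
# Read counts from Table A2.1 as (unenriched, symbiont enriched).
totals = (1_591_449, 1_350_648)
rrna_reads = {
    "symbiont rRNA": (415_701, 303_155),
    "host rRNA": (549_041, 392_671),
}


def percent(count, total):
    """Share of `count` in `total`, in percent."""
    return 100.0 * count / total


for name, (unenriched, enriched) in rrna_reads.items():
    before = percent(unenriched, totals[0])
    after = percent(enriched, totals[1])
    # The difference is in percentage points, not a relative change.
    print(f"{name}: {before:.1f}% -> {after:.1f}% ({before - after:.1f} points lower)")
```

The differences come out at about 3.7 and 5.4 percentage points, matching the figures given for the symbiont and host rRNA.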


Table A2.2. Most highly-expressed genes in the S. velum symbiont.

Rank  Gene name   Gene ID    cDNA unenriched   cDNA symbiont enriched
                             (% reads kb⁻¹)    (% reads kb⁻¹)
1     sirA        31577136   2.755             2.829
2     rpmJ        31577289   1.984             1.267
3     rbcL        31576636   1.709             1.703
4     dsrE        31577137   1.346             0.970
5     dsrH        31575343   1.300             1.076
6     hp          31575728   1.204             1.638
7     dsrC        31575342   0.911             0.821
8     rpsM        31576581   0.868             0.663
9     hp          31576035   0.664             0.525
10    rpmG        31576708   0.635             0.401
11    porin_4     31577251   0.618             0.570
12    ompA        31577180   0.617             0.733
13    aprM        31575927   0.602             0.590
14    rbcS        31576635   0.545             0.381
15    ripA        31575730   0.543             0.610
16    groES       31575479   0.542             0.478
17    HU          31577016   0.541             0.571
18    atpE        31575643   0.538             0.477
19    rpmB        31576709   0.525             0.504
20    rplU        31576796   0.499             0.367
21    pilA        31575370   0.476             0.462
22    rpmD        31576584   0.469             0.381
23    hp          31575856   0.464             0.472
24    fba         31576038   0.462             0.566
25    glgC        31576501   0.456             0.535
26    rpmA        31576797   0.456             0.356
27    HSP70       31576502   0.445             0.573
28    raiA        31576650   0.415             0.380
29    hp          31576685   0.411             0.395
30    rpsS        31576598   0.394             0.437
31    aprB        31575928   0.393             0.386
32    rplV        31576597   0.383             0.418
33    nuoA        31575069   0.382             0.300
34    dsrE        31575345   0.381             0.451
35    dsrF        31575344   0.381             0.397
36    rpsG        31576606   0.375             0.366
37    prkA        31576527   0.372             0.368
38    dsrA        31575347   0.365             0.372
39    cytC        31576414   0.351             0.628
40    dsrB        31575346   0.349             0.394
41    rpmF        31575278   0.335             0.264
42    rpmI        31574931   0.334             0.340
43    atpI        31577205   0.325             0.351
44    sat         31574706   0.323             0.358
45    rbr         31575960   0.322             0.298
46    hp          31576722   0.321             0.358
47    rpsO        31575117   0.303             0.193
48    hp          31576151   0.303             0.239
49    acpP        31575273   0.302             0.222
50    adk         31574970   0.298             0.330
51    rpsL        31576607   0.298             0.301
52    rpsK        31576580   0.292             0.280
53    rpsE        31576585   0.292             0.302
54    HTH_ARSR    31575768   0.286             0.266
55    iscU        31576121   0.285             0.274
56    haem_bdg    31576879   0.281             0.251
57    soxX        31576917   0.268             0.341
58    rplN        31576592   0.261             0.236
59    ndh         31575070   0.258             0.211
60    rplM        31575738   0.255             0.320
61    trp         31576756   0.253             0.294
62    gapA        31576041   0.251             0.285
63    rplC        31576602   0.250             0.311
64    dsrM        31575341   0.245             0.307
65    rpsF        31575104   0.244             0.295
66    RHOD        31575767   0.241             0.201
67    rplR        31576586   0.241             0.240
68    dsrL        31575339   0.239             0.258
69    atpF        31575644   0.237             0.233
70    rplW        31576600   0.232             0.229
71    dsrO        31575337   0.227             0.226
72    rpmH        31575606   0.225             0.227
73    rpsH        31576588   0.223             0.216
74    rpmC        31576594   0.223             0.259
75    rplB        31576599   0.216             0.185
76    HU_like     31575491   0.215             0.247
77    dsrJ        31575338   0.214             0.246
78    rplP        31576595   0.213             0.251
79    pfp         31575776   0.213             0.250
80    rpsQ        31576593   0.212             0.153
81    rplF        31576587   0.211             0.239
82    rpsU        31577001   0.208             0.255
83    dsrK        31575340   0.206             0.254
84    ALP_like    31575631   0.206             0.170
85    rpsJ        31576603   0.204             0.135
86    rpsD        31576579   0.203             0.189
87    ips         31576661   0.202             0.225
88    soxZ        31576915   0.198             0.211
89    hp          31576311   0.198             0.230
90    rplX        31576591   0.197             0.240
91    yfsF        31576658   0.197             0.308
92    rplO        31576583   0.195             0.144
93    rpsN        31576589   0.194             0.170
94    cmk         31575847   0.187             0.122
95    rpsC        31576596   0.187             0.207
96    rplD        31576601   0.182             0.177
97    hp          31576627   0.177             0.236
98    secB        31575766   0.175             0.146
99    fccA        31576290   0.174             0.124
100   ihfA        31574935   0.173             0.144
101   rpe         31575542   0.173             0.112
102   yceD        31575279   0.172             0.158
103   GH57N_APU   31576500   0.171             0.173
104   ccmD        31577268   0.170             0.074
105   hp          31575962   0.170             0.159
106   rpsP        31576137   0.163             0.180
107   rplJ        31575435   0.161             0.133
108   arsC        31574866   0.161             0.171
109   pcm         31576727   0.159             0.148
110   rplE        31576590   0.158             0.159
111   nuoE        31575073   0.158             0.222
112   hp          31576090   0.157             0.156
113   yhbY        31576076   0.157             0.138
114   rplT        31574932   0.154             0.111
115   groEL       31575480   0.153             0.160
116   dsrC        31575352   0.152             0.075
117   cbbQ        31576634   0.151             0.191
118   yciL        31576070   0.151             0.149
119   dsrR        31575334   0.151             0.113
120   tktA        31576042   0.150             0.178
121   rplK        31575437   0.150             0.166
122   tpiA        31575063   0.150             0.155
123   nuoI        31575077   0.150             0.118
124   trpD        31576757   0.150             0.151
125   nuoC        31575071   0.149             0.144
126   fusA        31576605   0.148             0.155
127   ndhD        31575072   0.148             0.169
128   COG2847     31575100   0.148             0.142
129   mlrA        31574936   0.146             0.089
130   secY        31576582   0.145             0.110
131   iscA        31576120   0.142             0.123
132   hfq         31575310   0.141             0.156
133   soxY        31576916   0.139             0.142
134   aprA        31575929   0.138             0.152
135   RNase_P     31575607   0.138             0.133
136   nfs         31576122   0.137             0.138
137   napC        31576472   0.137             0.114
138   malQ        31576499   0.136             0.159
139   porin       31576867   0.132             0.120
140   fixQ        31576689   0.132             0.135
141   rplQ        31576577   0.132             0.112
142   glgBE       31576498   0.131             0.100
143   ppiA        31575874   0.131             0.108
144   fixN        31576687   0.131             0.111
145   secE        31575439   0.130             0.186
146   rimM        31576136   0.130             0.148
147   DUF4426     31575412   0.130             0.124
148   aroK        31575761   0.129             0.168
149   yccA        31574763   0.129             0.142
150   glnK        31575364   0.129             0.122
151   pgk         31576040   0.128             0.143
152   rpsB        31575300   0.125             0.127


Table A2.3. Controls of PPi-PFK activity (nmol PPi min⁻¹ mg total protein⁻¹) in the cell-free extracts (CFE) of S. velum gill and foot tissue. The reactions were carried out with 5 mM FBP and 20 mM PO4³⁻, unless otherwise stated. Standard deviations from three biological replicates are shown. Measurements were performed at pH 7.5 and at 25°C.

            FBP        Without PO4³⁻   5 mM F6P   5 mM Fru   Boiled gill CFE   Foot CFE
Activity    27.5±1.8   0.00±0.0        0.00±0.0   0.00±0.0   0.00±0.0          0.00±0.0


Table A2.4. Initial velocities of the symbiont PPi-PFK forward reaction (µmol min⁻¹ mg protein⁻¹) at different substrate concentrations. Measurements were performed at pH 7.5 and at 25°C. Standard deviations from three measurements are shown.

            PPi [mM]
F6P [mM]    0.01        0.025       0.5         2.5         5
0.05        11.7±0.9    21.2±0.2    23.1±3.0    19.5±2.6    18.5±1.0
0.1         14.5±0.6    23.0±2.7    35.5±4.0    30.2±3.3    30.90±1.0
0.5         33.6±1.6    38.3±4.5    60.2±1.9    65.9±2.3    65.8±6.4
2.5         40.4±1.8    68.4±5.0    86.9±7.2    85.5±6.9    98.2±9.4
7.5         66.9±3.6    85.7±2.8    101.6±3.4   102.1±0.9   104.0±2.5
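As an illustration of how a table of initial velocities like this one can be summarized, the sketch below applies a Hanes-Woolf linearization ([S]/v against [S]) to the 5 mM PPi column of Table A2.4. This is an assumption-laden example, not the kinetic analysis used in the dissertation: it treats the data as simple Michaelis-Menten kinetics at a single fixed PPi concentration and ignores the reported standard deviations. The function name hanes_woolf is hypothetical.

```python
# Forward-reaction velocities at 5 mM PPi (last column of Table A2.4).
f6p_mM = [0.05, 0.1, 0.5, 2.5, 7.5]
velocity = [18.5, 30.9, 65.8, 98.2, 104.0]  # umol min^-1 mg protein^-1


def hanes_woolf(substrate, rate):
    """Fit [S]/v = [S]/Vmax + Km/Vmax by least squares; return (Vmax, Km)."""
    ys = [s / v for s, v in zip(substrate, rate)]
    n = len(substrate)
    mean_x = sum(substrate) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in substrate)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(substrate, ys))
    slope = sxy / sxx                    # equals 1 / Vmax
    intercept = mean_y - slope * mean_x  # equals Km / Vmax
    return 1.0 / slope, intercept / slope


vmax, km = hanes_woolf(f6p_mM, velocity)
print(f"apparent Vmax: {vmax:.0f} umol/min/mg, apparent Km: {km:.2f} mM F6P")
```

A nonlinear fit that weights each point by its standard deviation would generally be preferable; the linearization is used here only because it needs no external libraries.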


Table A2.5. Initial velocities of the symbiont PPi-PFK reverse reaction (µmol min⁻¹ mg protein⁻¹) at different substrate concentrations. Measurements were performed at pH 7.5 and at 25°C. Standard deviations from three measurements are shown.

            PO4³⁻ [mM]
FBP [mM]    0.5         1           5           10          20          25          50
0.01        4.7±0.6     8.2±0.7     11.8±0.8    11.7±1.2    9.3±0.3     6.7±0.3     3.1±0.1
0.025       17.9±1.7    20.6±3.0    34.0±2.2    23.9±1.9    20.7±1.4    17.4±1.0    12.4±1.0
0.05        16.5±1.0    25.4±2.7    51.0±0.2    52.3±2.7    43.1±8.8    44.0±3.7    24.5±3.6
0.1         23.5±0.9    38.5±3.3    73.1±1.8    82.4±1.1    78.1±4.8    71.1±3.2    49.1±1.5
2.5         41.8±0.9    78.9±4.9    151.8±4.8   186.0±4.7   184.8±7.9   183.6±5.2   162.2±4.2
5.0         58.5±5.9    82.4±2.5    158.6±4.7   173.0±5.5   187.1±2.3   187.3±4.5   174.7±3.5
10.0        41.0±1.7    85.5±7.73   124.2±1.9   145.4±3.1   166.7±3.2   158.2±1.1   182.8±3.6


Table A2.6. PPi-PFK activity (µmol PPi min⁻¹ mg symbiont protein⁻¹) in S. velum gill tissue cell-free protein extracts at different substrate concentrations from Table 2.1 estimated per bacterial protein.

              FBP [mM]
PO4³⁻ [mM]    2.5          5            10
10            1.56±0.04    1.58±0.07    1.60±0.09
20            2.11±0.29    2.10±0.25    1.92±0.17
25            1.94±0.12    2.08±0.07    1.92±0.04


Table A2.7. Bacterial strains, plasmids, and primers used in this study.




Supplementary material for Chapter 3:

The enigmatic Calvin cycle of chemoautotrophic bacterial symbionts deciphered


Supplementary Methods

Bacterial strains and plasmids

Templates for recombination were created by PCR amplifying approximately 500 bp fragments immediately upstream and downstream of the target genes and fusing them in the same order to an antibiotic selection marker using primers from Appendix 3 Table A3.4. To amplify the pfp left and right flanks, primers pfpNdeLF-pfpaacC1LR and pfpaacC534RF-pfpXhoIRR were used, respectively. The fbp flanks were amplified with primers fbpBglIILF-fbpaphALR and fbpaphARF-fbp367NdeIRR. The aphA1 kanamycin resistance (KmR) marker was chosen for fbp deletion in A. vinosum. This antibiotic resistance gene was PCR amplified from plasmid pCM184 using primers aphA1F and aphA816R. The promoterless aacC1 gentamicin resistance (GmR) gene for the single pfp knockout was PCR amplified from pCM351 with primers aacCSDF and aacC534R. Inactivation of both fbp and pfp made A. vinosum slow-growing. Because of that, and due to a high number of false positives with gentamicin selection, the aacC1 gene with a constitutive gentamicin promoter was used to generate a double ∆fbp ∆pfp mutant in the ∆fbp genetic background. The flanking regions and the corresponding antibiotic markers were assembled using fusion PCR. Three PCR products were combined with the two outermost primers (pfpNdeILF-pfpXhoIRR and fbpBglIILF-fbp699NdeIRR) and PCR amplified for 25 cycles (98°C for 10 sec, 65°C for 15 sec, 72°C for 62.5 sec). The resulting fusion products were purified, digested with NdeI and XhoI restriction enzymes for the pfpL-aacC1-pfpR amplicon and with BglII and NdeI for the fbpL-aphA-fbpR PCR product, and ligated into the digested plasmid pCM433 using T4 DNA ligase (NEB) at 16°C overnight. The resulting ligation products were directly introduced into E. coli S17-1 by electroporation using standard protocols (Sambrook & Russell 2001). E. coli cells containing the desired plasmid constructs were selected on antibiotic Luria-Bertani (LB) plates. E. coli carrying plasmid pCM433 fbpL::aphA1::fbpR grew on LB containing kanamycin (Km, 50 µg/ml), which indicated that in E. coli the aphA1 gene was expressed from the fbp promoter (Pfbp) present in the fbp left flank. Plasmid pCM433 pfpL::aacC1::pfpR could not be selected with gentamicin (Gm, 10 µg/ml), suggesting that the A. vinosum pfp promoter (Ppfp) is not recognized in E. coli. Instead, plasmid selection in E. coli was carried out on tetracycline (10 µg/ml). Colonies that grew on selective plates were checked for plasmids by PCR and sequencing. In this study, Q5 high-fidelity polymerase (NEB) was used for cloning purposes. For verification and sequencing, PCR reactions were carried out with OneTaq polymerase


Once confirmed, the resulting allelic exchange plasmids were introduced into A. vinosum through conjugation. A. vinosum (1 ml) was harvested during mid-log phase (approximate optical density (OD) at 690 nm of 1.4) by centrifugation at 9,300 g for 5 min at room temperature (RT). The pellets were washed twice in 500 µl RCV medium (see below). Following the wash, cells were resuspended in 500 µl of RCV. Fresh E. coli S17-1 colonies containing allelic exchange plasmids were scraped from the plates and resuspended in 3 ml RCV. A volume containing 4×10⁸ E. coli donor cells (assuming OD600 0.1 = 10⁸ cells/ml) was mixed with the A. vinosum cells contained in 500 µl RCV. This mixture was centrifuged at 9,300 g for 5 min at RT. The pellet was resuspended in 50 ml RCV and pipetted onto sterile 0.45 µm nitrocellulose filters (Millipore) placed on non-selective RCV agarose plates. The plates were incubated for 4 to 10 days at 30°C anaerobically under light. Next, bacteria on the filters were resuspended in 1 ml sterile RCV and plated on RCV Phytagel plates (see below) containing antibiotics. To select against E. coli, 50 µg/ml rifampicin (Rif) was used. ∆fbp::aphA recombinants were selected on 10 µg/ml Km. ∆pfp::aacC1 mutants were identified on plates containing 5 µg/ml Gm. To obtain only double-crossover recombinant knockouts, A. vinosum single-crossover mutants with an integrated allelic exchange plasmid were selected against using 10% w/v sucrose. Sucrose is toxic to cells containing the sacB gene found on the pCM433 plasmid (Marx 2008) and has been used effectively for counter-selection in A. vinosum (Grimm et al. 2011). During selection, NaCl was omitted from RCV medium as it has been reported to interfere with the selection process in diverse bacteria (Kunst & Rapoport 1995; Logue et al. 2009; Suckow et al. 2011). A. vinosum colonies, which grew on average 10 days after plating, were restreaked multiple times and screened by PCR and sequencing using primers from Appendix 3 Table
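The donor-cell volume used for conjugation follows from the stated conversion between optical density and cell number. As a small worked sketch, taking that conversion to be OD600 0.1 = 10⁸ cells/ml and the target to be 4×10⁸ donor cells, the required volume can be computed as below. The function name and the example OD reading are hypothetical, and the linear OD-to-cell relationship is an assumption that only holds over a limited OD range.

```python
def donor_volume_ml(od600, cells_needed=4e8, cells_per_ml_per_od=1e9):
    """Volume (ml) of a donor suspension that contains `cells_needed` cells.

    cells_per_ml_per_od = 1e9 encodes the assumed conversion
    OD600 0.1 = 1e8 cells/ml (linear scaling with OD).
    """
    if od600 <= 0:
        raise ValueError("OD600 must be positive")
    return cells_needed / (od600 * cells_per_ml_per_od)


# Hypothetical reading: a suspension at OD600 0.8 needs 0.5 ml.
print(donor_volume_ml(0.8))
```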


Growth conditions

RCV liquid medium consisted of 3 solutions. Solution A was prepared by dissolving 60 g malate, 24 g NH4Cl, 4 g MgSO4 x 7H2O, 1.4 g CaCl2 x 2H2O, and 20 ml SL12 solution (Overmann et al. 1992) in 1000 ml 18.2 Mohm H2O. SL12 solution contained 3 g EDTA-Na2 x 2H2O, 1.1 g FeSO4, 300 mg H3BO3, 190 mg CoCl2 x 6H2O, 50 mg MnCl2 x 4H2O, 42 mg ZnCl2, 24 mg NiCl2 x 6H2O, 18 mg Na2MoO4 x 2H2O, and 2 mg CuCl2 dissolved in this order in 1000 ml 18.2 Mohm H2O. The pH of the solution was adjusted to 2-3 with HCl. SL12 was filter-sterilized and stored in the dark at 4°C. Feeding solution B contained 1.55 g NaSH x H2O in 50 ml 18.2 Mohm H2O degassed with N2. Solution A was sterilized through a 0.2 µm filter and stored at 4°C in the dark. Solution B was autoclaved in half-full crimp top vials sealed with a butyl rubber stopper. To prepare the medium, 27.5 ml of solution A, 275 mg of yeast extract, and 990 mg of NaOH were added to 500 ml of 18.2 Mohm H2O. The pH of the medium was brought to 7.0 with NaOH. Afterwards, 32.4 ml of 180 mM KPO4 buffer were added together with 1,165 µl of 1M sodium acetate and 10% thiosulfate. The medium was filter-sterilized (0.2 µm) and bubbled with N2 under sterile conditions for 60 min. Then, 1,514 µl of the feeding solution B were added. The medium was aliquoted into 9 ml anaerobic vials and stored in the dark at least 12 hours prior to inoculation. Background levels of sulfide in the feeding solution made the medium anaerobic. To prevent contamination, all cultures were grown in the presence of 15 mg/ml Rif.


To grow A. vinosum strains on plates, RCV medium was supplemented with 1% Phytagel (Sigma Aldrich) and 85 mM NaCl2 to aid gelation. The medium containing Phytagel and solution A was autoclaved. Filter-sterilized and pre-warmed to 42°C solutions of 1M sodium acetate and 10% thiosulfate, feeding solution B, and 180 mM phosphate buffer pH 7.0 were added when the autoclaved solution cooled down to approximately 62°C. This was done to avoid precipitation of salts at higher temperatures. When 1.5% agarose was used instead of Phytagel, sodium acetate, thiosulfate, and feeding solution B were excluded. Antibiotics were added when the assembled medium cooled down to 55°C. Once solidified, the plates were stored overnight under oxygen-free atmosphere prior to inoculation. The inoculated plates were incubated in GasPak™ BBL™ jars (BD) between two 60W incandescent lightbulbs placed 20 cm away from the surface of the jars. The lightbulbs maintained plate temperature at approximately 30°C.

To measure growth of A. vinosum under heterotrophic conditions, liquid RCV medium was prepared in a 500 ml spinner flask (Bellco Glass) with two 45 mm side arms and a 70 mm center neck. One side arm was closed with a butyl rubber stopper (Ochs) and fitted with two needles. These needles were capped with air-tight stopcocks with luer lock valves connected to 0.2 µm sterile filters for degassing. The needles were also used for withdrawing culture samples from the bioreactor. The second arm of the spinner flask was fitted with an air-tight sampler which continuously circulated medium through an attached glass 1 mm cuvette using a peristaltic pump (Teledyne ISCO), flow rate 10x50. Flexible tubing and O-rings of the sampler were made from Viton™ fluoroelastomer (Chemours). Rigid tubing running into the bioreactor and the glass cuvette were made from polyether ether ketone (PEEK). The butyl rubber stopper holding the tubes in the cuvette was sealed using Marine-Tex epoxy (ITW Engineered Polymers). The connections between the PEEK and Viton™ tubing as well as the insertion points of the PEEK tubing into the flask through a custom laser-cut butyl rubber gasket, held in place by an open top screw cap (Corning), were secured with Swagelok stainless steel fittings (Swagelok) coated on the inside with fluoroelastomer. The center neck of the flask was closed with a screw cap lined with a butyl rubber gasket. All of the connections between glass and rubber were sealed with silicone grease (Cole-Parmer). This setup withstood multiple rounds of autoclaving at 121°C for 20 min. OD of the culture was monitored at 690 nm using a UV-1601 spectrophotometer (Shimadzu) every 10 minutes, automated and recorded with UVProbe software (Shimadzu). The culture was gently stirred with a polytetrafluoroethylene- (PTFE) coated stirrer. Heterotrophic growth experiments were performed at least in duplicate.
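OD readings recorded at fixed intervals, as described above, are commonly converted to a specific growth rate by fitting a straight line to ln(OD) over the exponential phase. The sketch below shows that calculation on synthetic readings generated inside the code (one every 10 minutes, constructed to double every 6 hours). These numbers are not measurements from this study, and the function name growth_rate is hypothetical.

```python
import math


def growth_rate(times_h, od_values):
    """Least-squares slope of ln(OD) against time, in units of 1/h."""
    logs = [math.log(v) for v in od_values]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_l = sum(logs) / n
    stt = sum((t - mean_t) ** 2 for t in times_h)
    stl = sum((t - mean_t) * (l - mean_l) for t, l in zip(times_h, logs))
    return stl / stt


# Synthetic exponential-phase data: 37 readings, 10 minutes apart,
# starting at OD 0.1 and doubling every 6 h.
times = [i / 6 for i in range(37)]
readings = [0.1 * 2 ** (t / 6.0) for t in times]

mu = growth_rate(times, readings)
doubling_time_h = math.log(2) / mu
print(f"mu = {mu:.4f} per h, doubling time = {doubling_time_h:.2f} h")
```

With real data, only the readings from the exponential phase should be passed to the fit, since lag and stationary phases flatten the slope.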

To measure growth kinetics of A. vinosum ∆fbp ∆pfp, cultures were grown in RCV medium in 9 ml gas-tight vials. Since the double mutant did not grow on bicarbonate, malate, or acetate, these cultures were supplemented with 1% w/v of either D-fructose, D-glucose, sucrose, rhamnose, or glucuronic acid. OD at 690 nm was measured directly in the vials daily for over 2 months. The experiment was carried out at least in triplicate.

To study growth of A. vinosum cultures under autotrophic conditions, bacteria were inoculated into Pfennig's medium (Imhoff 2006). Sulfide and bicarbonate served as sole energy and carbon sources, respectively. To prepare the base of the medium, 0.33 g KCl, 0.33 g MgCl2 x 6H2O, 0.43 g CaCl2 x 2H2O, 0.33 g NH4Cl, 0.33 g KH2PO4, and 1 ml SL12 solution were dissolved in 900 ml 18.2 Mohm H2O and filter-sterilized (0.2 µm). Solutions of 17.85 mM NaHCO3 and 713.5 mM NaSH x H2O, 100 ml each, were prepared in 18.2 Mohm H2O degassed with N2 for 45 min, sealed with butyl rubber stoppers in half-filled crimp top vials, and autoclaved for 15 min (liquid cycle). For adjusting pH of the medium during growth, 0.5 M HCl and 0.5 M NaOH were filter-sterilized and degassed with sterile N2 for 45 min.
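Final concentrations after combining a stock with the base medium follow from simple dilution (C1·V1 = C2·V2). The sketch below is illustrative only: it assumes that the full 100 ml of the bicarbonate stock is combined with the 900 ml base and that volumes are additive. The function name is hypothetical.

```python
def final_concentration_mM(stock_mM, stock_ml, base_ml):
    """Concentration after mixing `stock_ml` of stock into `base_ml` of base."""
    total_ml = stock_ml + base_ml
    if total_ml <= 0:
        raise ValueError("total volume must be positive")
    return stock_mM * stock_ml / total_ml


# 100 ml of the 17.85 mM NaHCO3 stock into 900 ml of base medium.
print(final_concentration_mM(17.85, 100, 900))
```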

Growth of A. vinosum cultures under autotrophic conditions was carried out in a pH-controlled bioreactor with feedback-controlled sulfide feeding. The setup was analogous to the bioreactor for heterotrophic growth, with some notable differences. The spinner flask had a larger center neck (100 mm) to fit double junction pH (Cole-Parmer) and sulfide (Weiss Research) electrodes, a temperature sensor (Omega), as well as lines for titrating acid and base. Prior to assembling the medium inside the bioreactor, the spinner flask with the tubing for measuring OD, pH electrode, temperature sensor, acid and base feeding lines, and degassing/sampling needles was autoclaved. The sulfide electrode was sterilized in 7.5% H2O2 stabilized with 0.85% H3PO4, followed by UV treatment, and installed into the autoclaved bioreactor. The filter-sterilized base medium was added and bubbled with N2 for 2 hours. Next, 17.85 mM bicarbonate (100 ml) was added aseptically. The pH was automatically adjusted to 7.0 using an Apex controller (Neptune). The sulfide electrode was connected to a Chemcadet mV controller (Cole-Parmer). The mV output from the controller was recorded using a Yocto-milliVolt-RX-BNC precision voltmeter (Yoctopuce) connected to a Raspberry Pi3 (Raspberry Pi Foundation). The assembled medium was supplemented with 0.25 mM sulfide and kept overnight with gentle stirring.

Prior to inoculation, the sulfide electrode was calibrated by adding known amounts of sulfide in 0.05 mM increments into the bioreactor and quantifying the amount with the Cline method (Cline 1969). Briefly, 1 ml samples were combined with an equal volume of 5.2% Zn-acetate to sequester sulfide. Aliquots (270 µl) were then incubated in triplicate with 30 µl of the Cline reagent, containing 0.5 g N,N-dimethyl-p-phenylenediamine sulfate and 0.75 g FeCl3·6H2O in 25 ml of cool 50% HCl, for 30 min in a 96-well plate (Greiner Bio-One). Absorbance was measured at 670 nm using a Tecan Infinite M200 spectrophotometer (Tecan). The amount of sulfide in the samples was determined using a standard curve.
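The standard-curve step can be illustrated with a minimal sketch. It assumes a linear relationship between absorbance at 670 nm and sulfide concentration, and that standards and samples were processed identically so that no separate dilution correction is needed. All numerical values and function names below are hypothetical and serve only to show the arithmetic; they are not measurements from this study.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sulfide_from_absorbance(a670, slope, intercept):
    """Invert the standard curve to get sulfide concentration (mM)."""
    return (a670 - intercept) / slope


# Hypothetical standards: known sulfide additions (mM) and their A670 readings.
standards_mM = [0.0, 0.05, 0.10, 0.15, 0.20, 0.25]
standards_a670 = [0.02, 0.12, 0.22, 0.32, 0.42, 0.52]

slope, intercept = fit_line(standards_mM, standards_a670)

# Hypothetical triplicate wells for one sample; the mean absorbance is used.
triplicate_a670 = [0.30, 0.32, 0.34]
mean_a670 = sum(triplicate_a670) / len(triplicate_a670)
sample_mM = sulfide_from_absorbance(mean_a670, slope, intercept)
print(f"sulfide = {sample_mM:.3f} mM")
```

Pairing each Cline-derived concentration with the electrode mV reading taken at the same time would then give the electrode calibration; the functional form of that second fit is not stated in the text and is left out here.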

To inoculate Pfennig's medium, a pre-culture in mid-log phase was used. The inoculum was obtained by seeding liquid RCV medium with single colonies of A. vinosum. To prevent carry-over of dissolved organic carbon into the autotrophic medium, the pre-culture was collected on a 0.45 µm filter. Bacteria on the filter were suspended in 20 ml Pfennig's medium aliquoted from the bioreactor into a sterile crimp top vial sparged with N2. This bacterial suspension was used to inoculate the bioreactor to an OD690 of 0.07. Following inoculation, the sulfide-feeding line was connected to one of the two needles leading into the bioreactor. When the sulfide concentration in the bioreactor fell below approximately 0.3 mM, the Chemcadet controller engaged a peristaltic pump (Teledyne ISCO), flow rate 1x20, which dispensed sulfide until the concentration inside the bioreactor increased to no more than 0.5 mM. OD690 of the culture was measured every 10 minutes (30 minutes for A. vinosum ∆fbp ∆pfp) until cultures entered stationary growth phase. Light intensity (42,000 Lux, 400-700 nm) was monitored with a Yocto-Light-V3 (Yoctopuce). Prior to inoculation, all cultures were verified by PCR and sequencing.

Autotrophic growth experiments were performed at least in duplicate.
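The feedback-controlled sulfide feeding described above behaves like a two-threshold (hysteresis) on/off controller. The sketch below illustrates that logic only; it is not the Chemcadet controller's firmware or configuration. The thresholds of 0.3 and 0.5 mM come from the text, while the consumption and dosing amounts per step, and all function names, are hypothetical.

```python
LOW_SETPOINT_MM = 0.3   # pump engages below this concentration (from the text)
HIGH_SETPOINT_MM = 0.5  # pump stops once this concentration is reached (from the text)


def update_pump(sulfide_mM, pump_on):
    """Return the new pump state for the current sulfide reading."""
    if sulfide_mM < LOW_SETPOINT_MM:
        return True
    if sulfide_mM >= HIGH_SETPOINT_MM:
        return False
    return pump_on  # between thresholds: keep the previous state


def simulate(start_mM, steps, consumption_mM=0.02, dosing_mM=0.05):
    """Simulate consumption and dosing; per-step amounts are hypothetical."""
    conc = start_mM
    pump_on = False
    trace = []
    for _ in range(steps):
        pump_on = update_pump(conc, pump_on)
        conc = max(conc - consumption_mM, 0.0)  # bacterial sulfide consumption
        if pump_on:
            # Dosing is capped so the concentration never exceeds 0.5 mM.
            conc = min(conc + dosing_mM, HIGH_SETPOINT_MM)
        trace.append((round(conc, 3), pump_on))
    return trace


trace = simulate(start_mM=0.4, steps=60)
for conc, pump_on in trace[:12]:
    print(conc, "pump on" if pump_on else "pump off")
```

Using two thresholds rather than one avoids rapid pump cycling around a single setpoint, which is consistent with the repeated consumption and supplementation cycles shown in Figure A3.1.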

To measure protein and ATP concentrations, samples were collected at regular OD690 intervals (0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2.0, 2.25, and 2.0). For protein quantification, 1 ml of culture was centrifuged at 13,000 × g for 2 min. The pellets were frozen in liquid N2 and stored at -80°C. Aliquots (125 µl) for ATP analysis were frozen immediately following collection and kept at -80°C until analyzed.

Freezer stock of A. vinosum was kept at -80°C in 10% DMSO.

Measuring CO2 fixation rates

CO2 fixation in the bioreactor cultures was determined using 13C-labeled bicarbonate (Cambridge Isotope Laboratories). During autotrophic growth, when the cultures were in log phase, NaH13CO3 was added to a final bicarbonate 13C/12C ratio of 0.17. At regular time intervals (0, 15, 30, 45, 75, 120, 180, 240, and 300 min) 5 ml of culture were filtered in duplicate on GF/F 25 mm glass microfiber filters (Whatman). For heterotrophic conditions, CO2 fixation experiments were carried out in 9 ml vials. Samples were collected at 0, 15, 120, 240, and 300 min.


Prior to filtration, GF/F filters were baked at 450°C overnight to remove any residual organic carbon. Filters containing bacterial pellets were fumed with HCl for approximately 12 hours, HCl being changed after 6 hours. During fuming, the filters were kept on a PTFE surface cleaned with methanol. Isotopic signature and CO2 concentration of culture medium prior to adding 13C label were analyzed. Filters and liquid samples were analyzed by stable isotope facilities at , Woods Hole Marine Laboratory, and University of New Mexico.

Samples for protein determination were collected at each sampling time point and analyzed as described above.

For calculating 13C dissolved inorganic carbon (DIC) incorporation rates, the mass balance equation was adapted from Montoya (1996):

$$A_{PC_f}\,[PC_f] = A_{PC_{control}}\,[PC_{control}] + A_{CO_2}\,[PC_\Delta]$$

where $A$ equals atom% of particulate carbon (PC; biomass carbon) at the end of incubation ($f$) and at the start/natural abundance (control), or of the DIC pool ($A_{CO_2}$); $[PC_f]$ equals the concentration/amount of PC at the end of incubation, $[PC_{control}]$ stands for the concentration/amount of PC at the start of incubation, and $[PC_\Delta]$ represents the concentration/amount of newly formed PC during incubation, equal to new carbon biomass. To calculate carbon fixation rates (newly formed carbon biomass), the equation was solved for the relative ratio of newly formed biomass as a function of total biomass:

$$\frac{A_{PC_f} - A_{PC_{control}}}{A_{CO_2} - A_{PC_{control}}} = \frac{[PC_\Delta]}{[PC_f]}$$

To determine the absolute carbon fixation rate, the equation was solved for $[PC_\Delta]$. The reported rates were calculated per min per mg of total protein.
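The calculation can be sketched as follows, under the assumption that total biomass at the end of incubation is the sum of the starting and newly formed biomass ([PCf] = [PCcontrol] + [PCΔ]), which is what allows the mass balance to be rearranged into the ratio form. The function names and all input values are hypothetical and chosen only to illustrate the arithmetic; the units of PC follow whatever units the isotope facility reports.

```python
def atom_percent_from_ratio(ratio_13c_12c):
    """Convert a 13C/12C ratio to atom% 13C."""
    return 100.0 * ratio_13c_12c / (1.0 + ratio_13c_12c)


def new_biomass_fraction(a_pc_final, a_pc_control, a_co2):
    """[PC_delta] / [PC_f] from the rearranged mass balance."""
    return (a_pc_final - a_pc_control) / (a_co2 - a_pc_control)


def fixation_rate(a_pc_final, a_pc_control, a_co2, pc_final, minutes, protein_mg):
    """Newly formed PC per min per mg total protein (units follow pc_final)."""
    pc_new = new_biomass_fraction(a_pc_final, a_pc_control, a_co2) * pc_final
    return pc_new / (minutes * protein_mg)


# DIC pool enrichment for the 13C/12C ratio of 0.17 given in the text.
a_co2 = atom_percent_from_ratio(0.17)

# Hypothetical inputs: biomass near natural 13C abundance (about 1.1 atom%)
# at the start, and an enrichment at the end that corresponds to 10% new biomass.
a_pc_control = 1.1
a_pc_final = a_pc_control + 0.1 * (a_co2 - a_pc_control)

fraction = new_biomass_fraction(a_pc_final, a_pc_control, a_co2)
rate = fixation_rate(a_pc_final, a_pc_control, a_co2,
                     pc_final=200.0, minutes=60.0, protein_mg=0.5)
print(f"DIC atom% = {a_co2:.2f}, new biomass fraction = {fraction:.3f}, rate = {rate:.3f}")
```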


Figure A3.1. Cycles of sulfide consumption and automated supplementation in A. vinosum ∆fbp ∆pfp autotrophic bioreactor culture.


Figure A3.2. Sulfide consumption rates of A. vinosum WT, ∆fbp, and ∆pfp under autotrophic conditions. Shaded areas around mean values indicate SEM (WT N=2, ∆fbp N=3, ∆pfp N=3).


Table A3.1. Calvin cycle enzymes and the corresponding locus tags in the genome of A. vinosum.

Abbreviation  Name (EC number)                                   Locus tag
RuBisCO       Ribulose-bisphosphate carboxylase (EC:             Alvin_1365, Alvin_1366
PRK           Phosphoribulokinase (EC:                           Alvin_0562
PGK           Phosphoglycerate kinase (EC:                       Alvin_0314
GAPDH         Glyceraldehyde 3-phosphate dehydrogenase (EC:      Alvin_0315
FBA           Fructose bisphosphate aldolase, class II (EC:      Alvin_0312
FBPase        Fructose 1,6-bisphosphatase (EC:                   Alvin_0677
PPi-PFK       Phosphofructokinase (EC:                           Alvin_2908
TK            Transketolase (EC:                                 Alvin_0316
TPI           Triosephosphate isomerase (EC:                     Alvin_2432
RPI           Ribose 5-phosphate isomerase (EC:                  Alvin_2900
RPE           Phosphopentose epimerase (EC:                      Alvin_0272


Table A3.2. Growth rates (OD690/min) of A. vinosum in autotrophic and heterotrophic media.

Medium                    WT                 ∆fbp               ∆pfp               ∆fbp ∆pfp
Autotrophic               0.00114±0.00009    0.00085±0.00004    0.00106±0.00007    0.000003±0.00000
Heterotrophic
  Before diauxic shift    0.00112±0.00001    0.00112±0.00001    0.00107±0.00009
  After diauxic shift     0.00100±0.00001    0.00093±0.00001    0.00099±0.00009
  Fructose                                                                         0.00032±0.00004
  Glucose                                                                          0.00036±0.00004
  Sucrose                                                                          0.00000±0.00001
  Rhamnose                                                                         0.00000±0.00000
  Glucuronate                                                                      0.00000±0.00000
  No sugar                                                                         0.00000±0.00000


Table A3.3. Bacterial strains and plasmids used in this study.

Strains and Plasmids          Relevant properties                                                    Source
Allochromatium vinosum
  Rif50                       DSM 180T RifR; spontaneous rifampicin-resistant mutant                 (Lubber et al. 2006)
  ∆fbp                        RifR KmR (∆fbp::aphA)                                                  This study
  ∆pfp                        RifR GmR (∆pfpA::aacC1)                                                This study
  ∆fbp ∆pfp                   RifR KmR GmR (∆fbp::aphA) (∆pfp::aacC1)                                This study
Escherichia coli
  E. coli S17-1               294 (recA thi pro hsdR- M+) TpR SmR [RP4-2-Tc::Mu-Km::Tn7]             (Simon et al. 1983)
Plasmids
  pCM184                      ApR, KmR, TcR; broad-host-range cre-lox allelic exchange vector        (Marx & Lindstrom 2002)
  pCM351                      ApR, GmR, TcR; broad-host-range cre-lox allelic exchange vector        (Marx & Lindstrom 2002)
  pCM433                      ApR, CmR, TcR; broad-host-range sacB-based allelic exchange vector     (Marx 2008)
  pCM433 fbpL::aphA::fbpR     ApR, CmR, TcR, KmR; sacB-based vector for in-frame deletion of fbp     This study
  pCM433 pfpL::aacC1::pfpR    ApR, CmR, TcR, GmR; sacB-based vector for in-frame deletion of pfp     This study




