Research Collection

Doctoral Thesis

Elucidation of the metabolic network topology in Methylobacterium extorquens AM1 and its operation under pure and mixed substrate conditions

Author(s): Peyraud, Rémi

Publication Date: 2011

Permanent Link: https://doi.org/10.3929/ethz-a-007208824

Rights / License: In Copyright - Non-Commercial Use Permitted


Diss. ETH N°19955

ELUCIDATION OF THE METABOLIC NETWORK TOPOLOGY IN METHYLOBACTERIUM EXTORQUENS AM1 AND ITS OPERATION UNDER PURE AND MIXED SUBSTRATE CONDITIONS

DISSERTATION

Submitted to

ETH ZURICH

For the degree of

DOCTOR OF SCIENCES

by

REMI PEYRAUD

Master of Science, Université Paul Sabatier

Date of birth: 07 March 1981; nationality: French

Accepted on the recommendation of

Prof. Dr. Julia Vorholt
Prof. Dr. Jean-Charles Portais
Prof. Dr. Uwe Sauer

2011

Table of contents

Abstract

Résumé

Chapter I: Introduction
I.1. Methylotrophic organisms and their habitat
I.2. Application of methylotrophs in a methanol-based white-biotechnology
I.3. Metabolic pathways and enzyme diversity in methylotrophs
I.4. Connecting metabolic pathways: assessing emergent properties of complex networks
I.5. Elucidation of cell physiology by Metabolic Flux Analysis (MFA)
I.6. Aims

Chapter II: Glyoxylate regeneration in M. extorquens AM1: Operation of the ethylmalonyl-CoA pathway
II.1. Abstract
II.2. Demonstration of the ethylmalonyl-CoA pathway in M. extorquens by using 13C metabolomics
II.3. Supporting Information

Chapter III: Genome analysis of Methylobacterium extorquens
III.1. Abstract
III.2. Genome sequence analysis of M. extorquens AM1 and DM4: A reference blueprint to investigate microbial metabolism of C1 compounds from natural and industrial sources
III.3. Supporting Information
III.4. Comparison of the metabolic capacity of Methylobacterium species

Chapter IV: Elucidation of network topology at system level in M. extorquens AM1 and its operation under pure methylotrophic growth
IV.1. Abstract
IV.2. Genome-scale reconstruction and system level investigation of the metabolic network of M. extorquens AM1
IV.3. Supporting Information

Chapter V: Co-consumption of methanol and succinate by Methylobacterium extorquens AM1
V.1. Abstract
V.2. Co-consumption of methanol plus succinate by M. extorquens AM1
V.3. Supporting Information

Chapter VI: General discussion
VI.1. Additional properties of the EMCP. A new strategy for autotrophy?
VI.2. Yield and limitation of M. extorquens AM1 during methylotrophic growth. Genome-scale model point of view
VI.3. Optimal energetic balance in M. extorquens AM1 during methylotrophic growth from Genome-Scale model and enzymology point of view.
VI.4. Flexibility in energetic balance in M. extorquens AM1 during methylotrophy avoids formaldehyde accumulation.
VI.5. Pre-adaptation in M. extorquens AM1?

Acknowledgement

Curriculum Vitae

Abstract

In nature, methylotrophy, i.e. the ability of microorganisms to use reduced compounds without carbon-carbon bonds (C1 compounds) such as methane and methanol, is a significant biological and geochemical process involved in the global carbon cycle. The biochemistry of methylotrophic microorganisms was intensively studied in the past. M. extorquens AM1, a gram-negative Alphaproteobacterium found abundantly and ubiquitously on leaf surfaces, has become a model organism to study methylotrophy. Up to now, system-level investigations of its methylotrophic capacity were missing. In the following work, the overall network topology of M. extorquens AM1 has been elucidated, and the physiology of the bacterium during growth on pure methanol as well as on methanol plus succinate was investigated.

i) First, the long-standing question of which metabolic pathway regenerates glyoxylate in the isocitrate lyase-negative serine cycle methylotroph M. extorquens AM1 was solved using a combination of approaches: metabolomics was used to identify the CoA-ester intermediates involved in the pathway leading to glyoxylate formation; monitoring the dynamic incorporation of 13C label into these CoA-ester intermediates demonstrated the sequence of reactions through the pathway; and a 13C steady-state labeling experiment was used to quantify the glyoxylate-generating fluxes. This combined analysis demonstrated the operation of the ethylmalonyl-CoA pathway (EMCP), an alternative to the glyoxylate cycle for C1 assimilation as well as for C2 assimilation, as demonstrated in Rhodobacter sphaeroides for acetate assimilation.

ii) Genome information is valuable to infer the presence of known metabolic pathways. The genomes of two methylotrophs, M. extorquens AM1 and M. extorquens DM4, were annotated, which revealed that strain AM1 has a large genome of 6.88 Mb encoding 6759 genes. This large genome size indicates that the strain requires a large number of genomic traits in its environment, in accordance with its free-living and facultative methylotrophic lifestyle. In addition, it possesses numerous insertion elements, suggesting potential genomic plasticity and the involvement of horizontal gene transfer mechanisms in its evolution. Sequencing and annotation of two M. extorquens strains opened the opportunity to reveal the traits common to these bacteria as well as the traits specific to strain AM1, such as the methylamine utilization cluster. The genome sequence provides valuable information on which to build system-level investigations of methylotrophy.

iii) A metabolic network reconstruction of M. extorquens AM1 was performed. The reconstructed metabolic network is composed of 1139 reactions and 977 metabolites, associated with 911 genes via a gene-protein-reaction association network. Accurate predictions of the growth rate under methylotrophy were obtained using Flux Balance Analysis (FBA), validating the reconstruction. Robustness analysis of the network using Minimal Cut Sets revealed a high number of fragile points (50% of the reactions involved in central metabolism) under methylotrophic growth conditions. This result predicts that strong selection pressure operates under environmental conditions to maintain the methylotrophic capacity. In addition, a metabolic mode in which M. extorquens could perform CO2 fixation using the EMCP was identified, providing the proof of principle that the EMCP could be a new strategy for autotrophy. Finally, 13C Metabolic Flux Analysis (13C-MFA) was performed to quantify the in vivo metabolic fluxes in M. extorquens AM1 growing in the presence of methanol. The flux distribution reveals the operation of substrate cycles around a dense sub-network, which are potential points for rerouting fluxes during metabolic adaptation to new carbon sources.

iv) The metabolic strategy of facultative microorganisms for handling two carbon sources in culture has often been reported to result in a diauxic shift. Nonetheless, there is evidence that co-consumption, i.e. simultaneous utilization of substrates, may also occur. M. extorquens AM1 was grown with methanol in addition to succinate to assess its metabolic strategy under mixed substrate conditions. A metabolic segregation in terms of pathway utilization and metabolic functions supplied was observed. Indeed, methanol was used as energy source and for the biosynthesis of C1 units, whereas succinate was the main carbon source (97%). The rationale for the observed segregation was investigated by comparing in silico predictions with in vivo observations. The results indicate that each substrate is used in accordance with its most appropriate metabolic role.

Résumé

In nature, methylotrophy, defined as the ability of a microorganism to use reduced one-carbon (C1) compounds such as methane or methanol, is a biological and geochemical process of importance for the carbon cycle. Consequently, and with the additional aim of using them in biotechnology, the biochemistry of methylotrophic organisms has been studied in detail in the past. Thus M. extorquens AM1, a gram-negative Alphaproteobacterium that is abundant and ubiquitous on plant surfaces, has become a model organism for studying methylotrophy. However, a system-level study of the metabolism of this bacterium had not been carried out so far. In the work presented below, the overall topology of the metabolic network of M. extorquens AM1 was elucidated, and the physiology of the bacterium was studied under pure methylotrophic growth conditions as well as under mixed substrate conditions, methanol plus succinate.

i) First, the question, unresolved for 50 years, of which metabolic pathway is involved in glyoxylate regeneration in methylotrophs such as M. extorquens AM1, which possess the serine cycle but lack the glyoxylate cycle, was resolved using a combination of experimental approaches. Metabolomics was used to identify the CoA esters involved as intermediates in the pathway of glyoxylate formation; the dynamics of carbon-13 incorporation was then followed through these metabolites to demonstrate their sequence of formation along the pathway; and finally a carbon-13 steady-state labeling experiment made it possible to quantify the fluxes of glyoxylate formation. This combined analysis demonstrates the operation of the ethylmalonyl-CoA pathway, an alternative to the glyoxylate cycle for the assimilation of one-carbon compounds and also of two-carbon compounds, as demonstrated in Rhodobacter sphaeroides during acetate assimilation.

ii) The information contained in the genome of living organisms makes it possible to infer the presence of the known metabolic pathways available to them. The genomes of two methylotrophs, M. extorquens AM1 and M. extorquens DM4, were annotated, which reveals that strain AM1 has a large genome of 6.88 Mb encoding 6759 genes. This large size indicates that the strain needs a large number of genomic traits to adapt to its environment, in agreement with its free-living lifestyle and its facultative methylotrophy. In addition, it possesses a large number of insertion elements, suggesting a potential plasticity of its genome and the involvement of horizontal gene transfer mechanisms in its evolution. Sequencing and annotation of the two strains made it possible to identify the metabolic traits common to the bacterium M. extorquens as well as the traits specific to strain AM1, such as a gene cluster involved in methylamine utilization. The genome sequence is an important resource for subsequent system-level studies of methylotrophy.

iii) The metabolic network of M. extorquens AM1 was reconstructed; the reconstruction describes, in a formal model, the state of knowledge on the metabolism of the bacterium. The reconstructed metabolic model is composed of 1139 reactions and 977 metabolites, associated with 911 genes via a gene-protein-reaction association network. An accurate prediction of the growth rate under methylotrophic conditions was obtained by flux balance analysis, validating the network reconstruction. Subsequently, a robustness analysis of the network by the minimal cut set method was carried out and reveals a large number of fragile points (50% of the reactions involved in central metabolism) under methylotrophic growth conditions. This result indicates that a strong selection pressure must operate in the environment to maintain the methylotrophic capacity. In addition, a metabolic mode by which M. extorquens AM1 can fix CO2 using the EMCP was identified, which provides a proof of principle that the EMCP can be a new strategy for autotrophy. Finally, a carbon-13 metabolic flux analysis was carried out to quantify the in vivo metabolic fluxes in M. extorquens AM1 growing on methanol. The flux distribution obtained reveals the operation of substrate cycles around a dense sub-network, which represents a potential point for rerouting fluxes during adaptation to new carbon sources.

iv) The strategy of facultative microorganisms for using two carbon sources available in the medium has often been found to consist of a diauxic shift, i.e. sequential utilization of the substrates. Nevertheless, studies suggest that co-consumption, i.e. simultaneous utilization of the substrates, can also occur. M. extorquens AM1 was therefore cultivated in the presence of methanol in addition to succinate to evaluate its metabolic strategy under mixed substrate conditions. A segregation of metabolism in terms of pathway utilization and of the metabolic functions supplied was observed. Methanol is used as energy source and specifically supplies the biosynthesis of one-carbon units, whereas succinate is used as the main carbon source (97%). The explanation for the observed segregation was investigated by comparing in silico predictions with in vivo observations. The results show that the substrates are used in accordance with their most appropriate metabolic utilization.

CHAPTER I

Introduction

I.1. Methylotrophic organisms and their habitat

Methylotrophs are defined as microorganisms able to use reduced compounds without carbon-carbon bonds, e.g. methane (CH4), methanol (CH3OH), formaldehyde (HCHO), or methylamine (CH3NH2), as sole source of carbon and energy [1]. They are able to oxidize these reduced carbon sources to carbon dioxide (CO2). During this process methylotrophs generate the energy required for maintenance and growth, i.e. building organic matter. From the oxidation of methanol with water (H2O) to CO2, up to 6 electrons are released (Fig. 1).

Fig. 1: Carbon oxidation level, number of electrons available and redox potential of C1 substrates. Each redox potential corresponds to the couple formed by the compound, together with H2, and the compound of next lower carbon oxidation level; for instance, in the case of methanol: HCHO + H2 = CH3OH. These values are taken from Maden [2]. The redox potential of the NAD(P)+/NAD(P)H+H+ couple is -320 mV; thus one redox equivalent can be obtained from the oxidation of formaldehyde to formic acid and another from the oxidation of formic acid to CO2. The redox potential of methanol oxidation to formaldehyde is lower, and a cytochrome cL is used as electron acceptor by the methanol dehydrogenase of M. extorquens AM1.

These electrons can be used to generate redox equivalents (NAD(P)H) and/or can flow through the electron transfer chain to the final electron acceptor, e.g. O2. The energy released during the oxido-reduction steps down to the final electron acceptor is converted into a proton motive force, which is then used by chemiosmosis to phosphorylate ADP, generating ATP. Part of this ATP is used, together with redox equivalents, to generate carbon-carbon linkages de novo. Therefore, from a simple one-carbon (C1) molecule, methylotrophs can biosynthetically produce the carbon skeleton of complex molecules like sugars, amino acids, and fatty acids. They share the property of generating their carbon-carbon bonds de novo with autotrophs, which use inorganic carbon dioxide (CO2). Nevertheless, the oxidation level of the carbon of CO2 is +IV, which excludes the possibility of generating energy from the carbon precursor. Hence autotrophs use light or reduced inorganic compounds such as reduced sulfur as energy sources. In contrast, most methylotrophs use the reduced C1 source as both carbon and energy source (Fig. 2). Intriguingly, some methylotrophs possess the capability to assimilate carbon from CO2 using the Calvin–Benson–Bassham (CBB) cycle [3] or the serine cycle [4, 5]; both pathways are detailed below (section I.3).
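The electron counts discussed above follow directly from the oxidation level of the carbon atom. The following minimal sketch makes that bookkeeping explicit; the dictionary and function names are illustrative, and the oxidation levels are the standard values for these compounds (only the +IV of CO2 and the 6 electrons of methanol are quoted in the text).

```python
# Oxidation level of the single carbon atom in each C1 compound.
# Standard textbook values; the text itself only states +IV for CO2.
CARBON_OXIDATION_LEVEL = {
    "CH4": -4,    # methane
    "CH3OH": -2,  # methanol
    "HCHO": 0,    # formaldehyde
    "HCOOH": 2,   # formic acid
    "CO2": 4,     # carbon dioxide
}


def electrons_to_co2(compound: str) -> int:
    """Electrons released when the carbon of `compound` is fully oxidized to CO2."""
    return CARBON_OXIDATION_LEVEL["CO2"] - CARBON_OXIDATION_LEVEL[compound]


if __name__ == "__main__":
    for name in CARBON_OXIDATION_LEVEL:
        print(f"{name}: {electrons_to_co2(name)} electrons available")
```

With these values, methanol gives 6 electrons, in line with the text, while CO2 gives none, which is why autotrophs need a separate energy source.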

Fig. 2: Global view of metabolic processes in methylotrophs. A: Metabolic strategies of Ribulose Monophosphate Pathway (RuMP) methylotrophs assimilating carbon exclusively from the reduced C1 substrate. B: Metabolic strategies of Calvin–Benson–Bassham (CBB) methylotrophs assimilating carbon exclusively at the oxidation level of CO2. C: Metabolic strategies of serine cycle methylotrophs assimilating carbon from the reduced C1 substrate and CO2.

The methylotrophic bacteria discovered so far belong to the Alpha-, Beta- and Gammaproteobacteria and to the gram-positive bacteria [1, 6]. They were isolated or identified from a broad range of ecological niches, e.g. lake sediments [7], soils [8, 9], plant rhizosphere [10], and phyllosphere [10, 11], where they are part of many important biological and biogeochemical processes, e.g. the global carbon cycle and, indirectly, nitrogen and sulfur cycling [12]. They are adapted to a broad range of metabolic lifestyles and were classified according to their metabolic capabilities and metabolic pathways [1, 6], i.e. obligate versus facultative methylotrophs, heterotrophic versus autotrophic methylotrophs, aerobic versus anaerobic methylotrophs, and Type I (using the Ribulose Monophosphate Pathway (RuMP) [13] for C1 assimilation) versus Type II (using the serine cycle). Such genetic and biochemical diversity may explain the ecological competitiveness of methylotrophs.

Among them, Methylobacterium extorquens AM1 (formerly Pseudomonas AM1) is a gram-negative Alphaproteobacterium isolated 50 years ago by Large and Quayle [14]. It became a model organism to study methylotrophy owing to the intensive biochemical and genetic studies carried out since its isolation [1, 15]. M. extorquens species are abundant in nature on the surface of leaves [16], i.e. the phyllosphere, where they benefit from the methanol released by plants [17] and where they are involved in its degradation [18]. They were also isolated from the rhizosphere, highlighting their capacity for adaptation [10].

Fig. 3: Electron microscopy picture of bacteria on the surface of an Arabidopsis thaliana leaf (picture taken from Delmotte et al., 2011).

Methanol is a byproduct of plant cell wall biosynthesis and is released in substantial amounts during plant growth [19]. Localization of methanol emission to the stomata was suggested by using a Pichia pastoris biosensor carrying gfp fused to a methanol-inducible promoter [18]. Accordingly, the emission rate is correlated with stomatal opening and leads to a transient availability of methanol, with a high peak in the morning when the stomata open, a lower release during the day and scarcity during the night [19]. Even if this production of methanol is only transient on the leaf surface, the emission rate monitored by Huve and coworkers suggests that methanol is the major carbon source quantified so far on leaf surfaces. Indeed, they monitored a rate of around 100 mol∙m-2(leaf surface)∙h-1, whereas the amount of sugar accumulating by leaching from the cuticle was measured to be in the 10 mol∙m-2 range [20]. Nevertheless, like many methylotrophs, M. extorquens is a facultative microorganism. It can also use organic acids as carbon and energy sources, e.g. oxalate, acetate, pyruvate and succinate. These organic acids are suspected to be a complement for the bacteria when methanol is not available [17]. In addition, alternative carbon sources may become important when the leaves fall to the ground; the organisms may then adapt to these new environmental conditions by using organic acids produced from leaf degradation. Thus, the metabolic capabilities of methylotrophs were demonstrated to be based on a number of specialized (i.e. methylotrophic) as well as "classical" metabolic pathways.

I.2. Application of methylotrophs in a methanol-based white-biotechnology.

Methylotrophic organisms are receiving more and more attention in biotechnology for the conversion of one-carbon (C1) substrates into biomass, chemicals, and materials [21]. More particularly, they represent true opportunities in the rising concept of the "methanol economy" put forward by Nobel prize laureate George Olah, which refers to the potential of methanol for addressing major biotechnological challenges regarding the replacement of fossil sources, environmental safety, and sustainable development [22, 23]. Up to now, industrial biotechnology, which produces fuels and chemicals through microbial processes as an alternative to industrial chemical processes, has been based on sugar as carbon feedstock. In addition, ethanol and biodiesel produced from food crops result in reduced direct utilization of fossil resources and a neutral emission of CO2. However, the production of this carbon feedstock and of these fuels, which derive from plant photosynthesis, requires extensive land use owing to the low efficiency of photosynthesis, i.e. only 1% of the solar energy is converted into biomass. As a result, extended parts of the earth's surface such as forests, savannahs and grasslands, especially in Southeast Asia and the Americas, are increasingly converted into croplands for sugarcane and soybean production. For instance, in Malaysia, oil palm production for biodiesel is responsible for 87% of the deforestation [23]. In consequence, the increasing crop surfaces and the industrial monocultures of plants have a devastating impact on the ecology of large areas of the earth, with a drastic reduction of biodiversity and the release of huge quantities of CO2 into the atmosphere due to deforestation. In addition, this intensive land utilization competes with cropland for food production.
Hence, sugar-based biotechnology on the one hand competes with human nutrition, owing to the limited land availability and the growing demand for food under the pressure of a growing world population, and on the other hand will expand owing to an enhanced demand for fuel and chemicals driven by industrial development in many countries. As a result, the prices of foods such as sugar have increased concomitantly with oil prices in recent years [24], with many deep impacts on human society. Hence, an alternative carbon feedstock for fuel and chemicals production is in demand.

Methanol-based biotechnology could represent a valuable alternative to the chemical industry as well as to sugar-based biotechnology. Methanol is a readily available raw chemical, with 46 million tons produced worldwide in 2006 [21]. It is produced on a large scale from synthesis gas (syngas, containing CO and H2) obtained from incomplete combustion of natural gas. A promising feature of methanol for the sustainable development of methanol-based biotechnology is that it can be produced from renewable resources such as biogas, wood biomass, glycerol, and especially municipal waste [21].

Methanol biotechnology at industrial scale has already been attempted, for instance for the production of single cell protein in the 1970s [25, 26]; however, the use of methanol in white biotechnology is still limited. So far, only a limited number of products, such as serine [27], glutamate [28], and [29], have been obtained with methylotrophs, generally in proof-of-principle studies rather than at industrial scale; see Schrader et al. for a more exhaustive list of chemicals produced [21]. More specifically, while Methylobacterium has been used for the production of compounds such as serine [27], it also shows interesting potential as a bioplastic factory [30]. Indeed, Methylobacterium strains were used and improved to produce various polyhydroxyalkanoates (PHA) such as polyhydroxybutyrate (PHB) [30-32]. Methylobacterium can also provide a source of pigments, based on its capacity to synthesize the carotenoids responsible for its characteristic pink color [33].

One of the limitations of using methylotrophs is the lack of understanding of their central as well as global metabolism. Indeed, only recently, and as part of this thesis (see chapter II), an essential pathway for C1 assimilation, the ethylmalonyl-CoA pathway, was elucidated; it is tightly interconnected with PHB biosynthesis. In addition, a metabolic network reconstruction is missing; it would provide a valuable tool for rational strain design in silico and for developing engineering strategies. Other limiting factors are the few tools available for genetic modification [34] and the only partial genome sequence and annotation [15].

I.3. Metabolic pathways and enzyme diversity in methylotrophs.

Methylotrophic organisms are able to dissimilate and assimilate C1 substrates without carbon-carbon bonds. At first sight, one may think that from a simple C1 molecule like methanol only a few possibilities of biochemical transformation could operate to generate CO2 and new multi-carbon compounds. Nevertheless, biochemical studies revealed a wide diversity of processes supporting methylotrophy. Indeed, for each single chemical reaction or metabolic pathway, at least two alternative mechanisms were discovered in different microorganisms (Fig. 4). The following paragraphs introduce each metabolic pathway or enzymatic step found in methylotrophic bacteria that is presented in Fig. 4. Additional strategies, especially for the utilization of other reduced C1 compounds such as methane, are described in more detail in the recently published review by Ludmila Chistoserdova [35]. The broad diversity of metabolic processes discovered in methylotrophs supports the use of a module concept, in which a similar metabolic function is fulfilled by different but equivalent mechanisms. Even more surprisingly, several alternative modules were quite often found in the same organism (Fig. 4, Table 1). This modularity of the metabolic network of methylotrophs supports their broad versatility and their colonization of diverse ecological niches.

Fig. 4: Diversity of metabolic processes (modules) in methylotrophs. Acronyms of the metabolic pathways and enzymes are: MxaF, large subunit of periplasmic methanol dehydrogenase (MxaFI); Mdhc, cytoplasmic methanol dehydrogenase; XoxF, methanol dehydrogenase and/or formaldehyde dehydrogenase, MxaF-like protein; Mau, methylamine dehydrogenase; NMGP, N-methylglutamate pathway; Dcm, dichloromethane dehalogenase; Cmu, chloromethane degradation; H4MPT, H4MPT (tetrahydromethanopterin)-linked pathway for formaldehyde oxidation; FdhA, NAD-dependent formaldehyde dehydrogenase; GSH, glutathione-linked formaldehyde oxidation; PurU, 10-formyl H4F (tetrahydrofolate) hydrolase; FtfL, formate-tetrahydrofolate ligase; FolD, methylene-tetrahydrofolate dehydrogenase and methenyl-H4F cyclohydrolase; MtdA, methylene-tetrahydrofolate dehydrogenase; RuMP, Ribulose monophosphate cycle; CBB, Calvin–Benson–Bassham cycle; ox PP pathway, oxidative pentose phosphate pathway. Dashed line: metabolic process under debate. Blue module: oxidation process; pink module: assimilation process; yellow module: mixed process. Figure modified from [35].

Table 1. Examples of occurrence of methylotrophy metabolic modules in major methylotrophs. Table taken from [34]

Strain  MDH    XoxF   Mau    NMGP   H4MPT  GSH    PurU   FolD   MtdA   FtfL   Serine EMCP   ICL    RuMP   CBB
V4      -      +      -      -      -      -      -      +      -      +      -      -      -      -      +
NC10    +      +      -      -      +      -      +      +      -      -      -      -      +      -      +
M.c.    +      +      -      -      +      -      -      -      +      +      +      -      -      +      +
OB3b    +      +      -      -      +      -      -      -      +      +      +      +      -      -      -
M.s.    +      +      -      -      +      -      -      +      -      +      +      +      -      -      +
M.e.    +      +      +      -      +      -      -      -      +      +      +      +      -      -      -
P.d.    +      +      +      -      -      +      +      +      -      +      -      +      -      -      +
H.d.    +      +      +      -      +      -      -      +      -      +      +      +      +      -      -
2181    -      +      -      -      -      -      -      +      -      +      -      -      -      +      -
M.f.    +      +      +      +      +      -      +      +      -      +      -      -      +      +      -
M.m.    -      +      +      -      +      -      +      +      -      -      -      -      +      +      -
M.p.    +      +      -      -      +      -      +      +      +      +      +      -      +      -      +
M.t.    +      +      +      +      +      -      +      +      -      +      -      -      +      +      -
B.p.    -      +      -      +      +      +      +      +      -      -      +      -      +      -      +
R.p.    -      +      -      -      -      +      -      +      -      +      +      +      +      -      +
F.p.    -      +      -      -      +      -      -      +      -      +      +      -      +      -      +
G.b.    +      +      -      -      +      -      -      -      +      +      +      +      +      -      -

Catabolic processes are highlighted in blue; amphibolic (methylene-H4F-handling) modules are highlighted in yellow; and assimilation modules are highlighted in pink. Module acronyms are defined in the legend of Fig. 4. V4, Methylacidiphilum infernorum; NC10, Candidatus Methylomirabilis oxyfera; M.c., Methylococcus capsulatus; OB3b, Methylosinus trichosporium; M.s., Methylocella silvestris; M.e., Methylobacterium extorquens; P.d., Paracoccus denitrificans; H.d., Hyphomicrobium denitrificans; 2181, Methylophilales strain 2181; M.f., Methylobacillus flagellatus; M.m., Methylotenera mobilis; M.p., Methylibium petroleiphilum; M.t., Methylophaga thiooxidans; B.p., Burkholderia phymatum; R.p., Ruegeria pomeroyi; F.p., Fulvimarina pelagi; G.b., Granulibacter bethesdensis.
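To illustrate how the module concept of Table 1 can be queried, the sketch below encodes the presence/absence pattern of the table as one string per organism and lists the organisms that combine given modules. The names `MODULES`, `PRESENCE` and `organisms_with` are hypothetical helpers introduced here for illustration; only the +/- data come from Table 1.

```python
# Column order of Table 1.
MODULES = ["MDH", "XoxF", "Mau", "NMGP", "H4MPT", "GSH", "PurU", "FolD",
           "MtdA", "FtfL", "Serine", "EMCP", "ICL", "RuMP", "CBB"]

# One character per module, in the column order above ('+' present, '-' absent).
PRESENCE = {
    "V4":   "-+-----+-+----+",
    "NC10": "++--+-++----+-+",
    "M.c.": "++--+---+++--++",
    "OB3b": "++--+---++++---",
    "M.s.": "++--+--+-+++--+",
    "M.e.": "+++-+---++++---",
    "P.d.": "+++--+++-+-+--+",
    "H.d.": "+++-+--+-++++--",
    "2181": "-+-----+-+---+-",
    "M.f.": "+++++-++-+--++-",
    "M.m.": "-++-+-++----++-",
    "M.p.": "++--+-+++++-+-+",
    "M.t.": "+++++-++-+--++-",
    "B.p.": "-+-+++++--+-+-+",
    "R.p.": "-+---+-+-++++-+",
    "F.p.": "-+--+--+-++-+-+",
    "G.b.": "++--+---+++++--",
}


def has_module(organism: str, module: str) -> bool:
    """True if Table 1 marks `module` as present ('+') in `organism`."""
    return PRESENCE[organism][MODULES.index(module)] == "+"


def organisms_with(required=(), absent=()):
    """Organisms (in table order) having all `required` and none of the `absent` modules."""
    return [
        org for org in PRESENCE
        if all(has_module(org, m) for m in required)
        and not any(has_module(org, m) for m in absent)
    ]


if __name__ == "__main__":
    # Serine cycle methylotrophs that rely on the EMCP and lack isocitrate lyase.
    print(organisms_with(required=("Serine", "EMCP"), absent=("ICL",)))
```

Queried this way, the table returns OB3b, M.s. and M.e. as the serine cycle organisms that carry the EMCP but lack isocitrate lyase (ICL).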

A. Oxidation (catabolic) enzymes and pathways.

Methanol dehydrogenase (MxaF, XoxF, Mdhc). Methanol is oxidized to formaldehyde by methanol dehydrogenases (MDH) [1]. Several kinds of methanol dehydrogenases are present in methylotrophs. M. extorquens possesses a periplasmic enzyme, which also catalyzes formaldehyde oxidation and which uses cytochrome cL as electron acceptor and pyrroloquinoline quinone (PQQ) as prosthetic group (see also Fig. 5). It is a heterotetrameric enzyme (α2β2, MxaFI) and its crystal structure was solved [36, 37]. There are several homologues of the catalytic subunit MxaF of the MDH, such as XoxF. This periplasmic enzyme, which is also present in M. extorquens, catalyzes methanol as well as formaldehyde oxidation, like most methanol dehydrogenases [38]. It was found by proteomics to be highly expressed by methylotrophs on plant surfaces [16], and methanol dehydrogenase activity was recently reported to be induced by La3+ [39]. Cytoplasmic NAD-dependent methanol dehydrogenases (Mdhc) can be found in gram-positive methylotrophs [40].

Fig. 5: Chemical reactions catalyzed by M. extorquens species in C1 oxidation (catabolic) pathways.

Methylamine oxidation (Mau, NMGP). Methylamine can be oxidized to formaldehyde by methylamine dehydrogenase (gram-negative bacteria) or methylamine oxidase (gram-positive bacteria) [41]. Methylamine dehydrogenase of M. extorquens AM1 (Mau) was purified and characterized in 1968 [42]. The enzyme is localized in the periplasm and uses amicyanin as electron acceptor, from which the electrons are transferred to cytochrome. The associated genes are localized in a 9 genes cluster (mauFBEDACJGLMN) [43]. An alternative pathway, the N-methylglutamate pathway (NMGP), generates N-methylglutamate as an intermediate by condensation of the methyl group with L-glutamate [44, 45]. The methyl group can then be transferred to the tetrahydrofolate cofactor, forming 5,10-methylene-tetrahydrofolate [46].

Halogenated-methane oxidation (Cmu, Dcm). Some methylotrophic strains can also use halogenated methane compounds. In Methylobacterium species, the first steps of chloromethane degradation are carried out by methyltransferases (CmuA and CmuB), which transfer the methyl group in two steps to the tetrahydrofolate (THF) cofactor [47]. Dichloromethane is oxidized to formaldehyde by dichloromethane dehalogenase (Dcm), using glutathione as cofactor, via the unstable intermediate S-chloromethyl-glutathione [48].

Linear formaldehyde oxidation to formate (H4MPT, FdhA, XoxF, GSH). Divers strategies are used by methylotrophs to oxidize formaldehyde. Such diversity could be explained by the need to keep formaldehyde at low concentration in the cell due to its toxicity. Toxicity of formaldehyde is due to formation of cross-linking between DNA and proteins. As mentioned above, the formaldehyde may either be formed in the cytoplasm or as the case for M. extorquens AM1 in the periplasm and then probably diffuse inside the cytoplasm. Formaldehyde can be converted directly to formate by NAD-dependent formaldehyde dehydrogenase (FdhA) [49] or periplasmic alcohol dehydrogenases with broad substrate specificities (like MxaF and XoxF, see previously) [38] or by cofactor-dependent pathways. The tetrahydromethanopterin (H4MPT)-dependent pathway discovered in M. extorquens AM1 is widespread in methylotrophs and was shown to cross the bacterial/archaeal boundaries [50]. The enzymes of this pathway are localized in the cytoplasm. The first step of the pathway is formaldehyde condensation with the cofactor catalyzed by the formaldehyde activating enzyme Fae [51] (see Fig. 5). Then, the methylene-tetrahydromethhanopterin is reduced to methenyl- tetrahydrometanopterin via methylene-tetrahydrometanopterin dehydrogenase (MtdB) or the methylene-tetrahydrofolate/tetrahydromethanopterin dehydrogenase (MtdA). Both enzymes are catalyzing the oxidation using NADP+ as electron acceptor, whereas only MtdB can perform oxidation using NAD+ [52, 53]. The following step is the hydratation of methenyl- tetrahydromethanopterin catalyzed by methenyl-tetrahydromethanopterin cyclohydrolase generating 5-formyltetrahydromethanopterin [54]. The C1-unit of 5- formyltetrahydromethanopterin is then transferred to the cofactor methanofuran and subsequently hydrolysed to released formate by the formylmethanofuran-tetrahydromethanopterin formyltransferase/hydrolase complex [55, 56]. 
Autotrophic methylotrophs such as Rhodobacter sphaeroides use the glutathione (GSH)-dependent pathway [57, 58].

Formate dehydrogenase (FDHc and FDHp). Formate can be oxidized to CO2 by cytoplasmic formate dehydrogenase (FDHc) as well as by periplasmic formate dehydrogenase (FDHp). The first type is NAD+-dependent, whereas the second is cytochrome-dependent. Several formate dehydrogenases can occur in the same microorganism: M. extorquens AM1, for instance, possesses four distinct enzymes, and at least one of them must be present to allow methylotrophic growth, which shows that the enzymes are in principle redundant [59].

B. Oxidation and assimilation (amphibolic) pathways.

Tetrahydrofolate pathway (FolD, MtdA, PurU, FtfL). The tetrahydrofolate pathway activates formate and reduces it to 5,10-methylene-tetrahydrofolate. The pathway can operate in the oxidative direction and/or, depending on the enzymes involved, in the reductive direction. It also supplies C1 precursors for biomass biosynthesis. In M. extorquens AM1, the first step in the reductive direction is the condensation of formate with tetrahydrofolate by the reversible formate-tetrahydrofolate ligase (FtfL) [60] (Fig. 6). Subsequently, 10-formyl-tetrahydrofolate is dehydrated to 5,10-methenyl-tetrahydrofolate by methenyl-tetrahydrofolate cyclohydrolase (Fch). Finally, 5,10-methenyl-tetrahydrofolate is reduced to 5,10-methylene-tetrahydrofolate by methylene-tetrahydrofolate dehydrogenase (MtdA). Some Methylobacterium species possess alternative enzymes such as FolD, a bifunctional enzyme with methylene-tetrahydrofolate dehydrogenase and methenyl-tetrahydrofolate cyclohydrolase activities, or PurU, an irreversible 10-formyl-tetrahydrofolate hydrolase [61]. Together with methylene-tetrahydrofolate reductase (MetF), FolD and PurU are specifically involved in the oxidation of chloromethane in Methylobacterium [47].

Fig. 6: Chemical reactions of the tetrahydrofolate pathway catalyzed by M. extorquens for C1 dissimilation or assimilation.

C. Carbon assimilation (anabolic) pathways.

The serine cycle. This cycle was discovered by Quayle and co-workers in M. extorquens AM1 after the isolation of the strain 50 years ago [1]. 14C-labeling experiments and enzyme assays allowed the reaction sequence of this pathway to be elucidated [4, 5, 62]. Carbon is assimilated at the level of 5,10-methylene-tetrahydrofolate, which condenses with glycine to form L-serine in a reaction catalyzed by serine hydroxymethyltransferase (Fig. 7). L-serine is then converted sequentially to hydroxypyruvate, D-glycerate, 2-phospho-D-glycerate, and phosphoenolpyruvate by L-serine-glyoxylate aminotransferase, hydroxypyruvate reductase, glycerate kinase, and enolase, respectively. Subsequently, phosphoenolpyruvate carboxylase condenses phosphoenolpyruvate with CO2 to generate oxaloacetate. This C4 compound is then reduced to (S)-malate by malate dehydrogenase. (S)-malate is subsequently cleaved to glyoxylate and acetyl-CoA by an apparent malate synthase activity that is catalyzed by two enzymes, malyl-CoA synthetase and malyl-CoA lyase. Overall, one acetyl-CoA is thus produced from one C1 unit and one CO2, whereas glyoxylate is aminated to refill the glycine pool and close the cycle. This last reaction is catalyzed by L-serine-glyoxylate aminotransferase. The cycle becomes unbalanced when intermediates leave it for biosynthetic purposes, because no glyoxylate is then formed to refill the glycine pool. Hence, a pathway that supplies the glyoxylate pool and balances the serine cycle has to operate. In some serine cycle methylotrophs, glyoxylate is generated by oxidation of acetyl-CoA via the glyoxylate cycle [63], using the key enzyme isocitrate lyase (ICL). Other serine cycle methylotrophs lack isocitrate lyase and therefore require an alternative pathway. In this work, this pathway was shown to be the ethylmalonyl-CoA pathway [64]; it is described in detail in Chapter II.

Fig. 7: Chemical reactions of the serine cycle catalyzed by M. extorquens for C1 assimilation. The net product of the cycle from one C1 and one CO2 is one acetyl-CoA.
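The net conversion stated for the serine cycle (one C1 unit plus one CO2 giving one acetyl-CoA) can be checked by simple carbon bookkeeping. The following Python sketch is only an illustration: it counts carbon atoms, ignores all cofactors (tetrahydrofolate, CoA, ATP, NAD(P)H) and amino group transfer, and uses metabolite names and a lumped malyl-CoA synthetase/lyase step that are simplifications introduced here, not notation taken from the thesis.

```python
from collections import Counter

# Carbon atoms per metabolite (simplified; cofactors are not tracked).
CARBONS = {
    "C1_unit": 1,          # methylene group carried by tetrahydrofolate
    "CO2": 1,
    "glycine": 2,
    "glyoxylate": 2,
    "acetyl_CoA": 2,       # acetyl moiety only
    "serine": 3,
    "hydroxypyruvate": 3,
    "glycerate": 3,
    "2PG": 3,
    "PEP": 3,
    "oxaloacetate": 4,
    "malate": 4,
}

# Reaction -> {metabolite: coefficient}; negative means consumed.
REACTIONS = {
    "serine hydroxymethyltransferase": {"C1_unit": -1, "glycine": -1, "serine": 1},
    "serine-glyoxylate aminotransferase": {
        "serine": -1, "glyoxylate": -1, "hydroxypyruvate": 1, "glycine": 1},
    "hydroxypyruvate reductase": {"hydroxypyruvate": -1, "glycerate": 1},
    "glycerate kinase": {"glycerate": -1, "2PG": 1},
    "enolase": {"2PG": -1, "PEP": 1},
    "PEP carboxylase": {"PEP": -1, "CO2": -1, "oxaloacetate": 1},
    "malate dehydrogenase": {"oxaloacetate": -1, "malate": 1},
    "malyl-CoA synthetase + lyase": {"malate": -1, "glyoxylate": 1, "acetyl_CoA": 1},
}


def net_conversion(reactions):
    """Sum all reactions once and keep only metabolites with a net change."""
    total = Counter()
    for stoichiometry in reactions.values():
        for metabolite, coefficient in stoichiometry.items():
            total[metabolite] += coefficient
    return {m: c for m, c in total.items() if c != 0}


def is_carbon_balanced(stoichiometry):
    """True if the carbon atoms consumed equal the carbon atoms produced."""
    return sum(CARBONS[m] * c for m, c in stoichiometry.items()) == 0


net = net_conversion(REACTIONS)
print(net)
print(all(is_carbon_balanced(s) for s in REACTIONS.values()))
```

In this simplified picture glycine and glyoxylate cancel out, which is exactly why withdrawing any intermediate for biosynthesis leaves the glycine pool unreplenished and calls for a separate glyoxylate-regenerating route.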

The ribulose bisphosphate pathway (Calvin-Benson-Bassham (CBB) cycle). In this pathway, carbon is assimilated at the oxidation level of CO2 [1, 3, 65, 66]. The two key enzymes of this cycle are phosphoribulokinase, which produces ribulose-1,5-bisphosphate (a C5 compound) by phosphorylation of ribulose-5-phosphate, and ribulose bisphosphate carboxylase, which condenses one molecule of CO2 (C1) with the ribulose-1,5-bisphosphate (C5) formed previously and releases two molecules of phosphoglycerate (C3) by cleavage. The C3 compounds are then used either as precursors to build cell biomass or, after a reduction step and rearrangement of the carbon skeletons via the pentose phosphate pathway, to refill the ribulose-5-phosphate pool so that a new molecule of CO2 can be assimilated.

The ribulose monophosphate pathway (RuMP). In this pathway, carbon is assimilated at the oxidation level of formaldehyde [1, 13, 67]. Because formaldehyde is at a higher energetic level than CO2, its condensation with a C5 compound requires neither prior activation of the C5 compound by ATP nor a reduction step for C5 regeneration, in contrast to the ribulose bisphosphate pathway. In addition, the electrons provided by formaldehyde can be dissimilated via the oxidative pentose phosphate pathway (ox. PPp). The two key enzymes of this pathway are hexulose phosphate synthase, which condenses ribulose-5-phosphate with formaldehyde to form hexulose-6-phosphate, and hexulose phosphate isomerase, which converts hexulose-6-phosphate into fructose-6-phosphate. Fructose-6-phosphate can then be dissimilated via the oxidative pentose phosphate pathway or assimilated via the glycolytic pathway.

I.4. Connecting metabolic pathways: assessing emergent properties of complex networks.

The metabolic pathways described above have been studied and characterized in depth at various levels, including individual enzymes and the reactions they catalyze, the encoding genes, mutant phenotypes, and metabolite identification. However, relatively little was known at the beginning of this PhD work about how they operate together to perform methylotrophy, or about the emergent properties of their interplay. Assembling metabolic pathways and enzyme-catalyzed reactions yields a network of reactions that connects metabolites to each other via a series of biochemical transformations. This metabolic network describes the molecular processes by which microorganisms synthesize their complex and diverse constituents. The structure of the metabolic network, i.e. its topology, therefore carries information on the ability of a microorganism to grow on the substrates available in its environment. Thus, the material basis of life and ultimately of death, i.e. the transformation of inorganic matter into living organic matter and the production of energy, can be addressed by elucidating the structure of the metabolic network. However, the high complexity of metabolic networks makes assembling biochemical knowledge on metabolism a challenge for biochemists. Metabolic networks can contain more than 1000 reactions; for instance, current knowledge of the model bacterium Escherichia coli yields a network of 2077 reactions [68]. Such networks have many degrees of freedom, so that several paths through the network can lead from an entry reaction to an exit reaction. A network with over 100 degrees of freedom gives rise to a combinatorial explosion that cannot be computed with the computers available so far. Therefore, the entire set of possible solutions through a complex network is not yet accessible. Nevertheless, mathematical methods have been developed to address specific properties that emerge at the network level.
They are based on the conversion of biochemical knowledge into a stoichiometric matrix N of dimension m x r, where m is the number of metabolites and r the number of reactions. Assuming that metabolism is at steady state, i.e. that the concentrations of its components are constant, an r-dimensional flux vector V must satisfy N x V = 0. A flux vector can then be calculated by Flux Balance Analysis (FBA), which finds a single solution using linear programming [69]. The solution is obtained by optimizing an objective function, and therefore depends on the objective chosen [70], for example maximization of the growth rate. Another method, Elementary Flux Mode (EFM) analysis, calculates the entire set of non-decomposable routes through a metabolic network [71, 72] but, as explained above, is only suitable for metabolic networks with few degrees of freedom. Analyzing the relationship between the structure and the function of metabolic networks allows key emergent properties of the latter to be identified [72]. Among these properties are flexibility, which corresponds to the capacity of the network to adopt different metabolic states and derives from its degrees of freedom, and robustness, which corresponds to the capacity of the network to sustain metabolic function under genetic, chemical, or environmental perturbation [73]. These properties are not evident from the individual components of the network because they arise from the interplay between them. Flexibility and robustness drive the capacity of living organisms to adapt to different environments. Hence, the regulation of the metabolic network that supports metabolic reprogramming during adaptation can be inferred from network structure analysis [72]. The formalization of the metabolic network as a mathematical model at genome scale is called genome-scale metabolic reconstruction [74, 75] and was first performed for Haemophilus influenzae [76]. During this process, knowledge on metabolism is collected from available genomic, biochemical, genetic, and physiological studies. The information is then assembled into a coherent Gene to Protein to Reaction (GPR) association network [77] that describes the metabolism.
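The steady-state condition N x V = 0 and the notion of degrees of freedom can be made concrete on a very small example. The sketch below is a hypothetical four-reaction toy network (not a model of M. extorquens AM1), written with one row per metabolite and one column per reaction so that the product with a flux vector gives one balance per metabolite. It checks whether candidate flux vectors are steady states and computes the degrees of freedom as the number of reactions minus the rank of N. Flux Balance Analysis would go one step further and use a linear programming solver to pick, among all steady states, the one that optimizes a chosen objective; that step is not implemented here.

```python
from fractions import Fraction

# Hypothetical toy network, for illustration only:
#   v1: -> A       (substrate uptake)
#   v2: A -> B
#   v3: A ->       (by-product excretion)
#   v4: B ->       (biomass formation)
REACTIONS = ["v1", "v2", "v3", "v4"]
N = [
    [1, -1, -1, 0],   # balance of metabolite A
    [0, 1, 0, -1],    # balance of metabolite B
]


def is_steady_state(matrix, fluxes):
    """True if matrix x fluxes = 0 for every metabolite (row)."""
    return all(
        sum(coeff * flux for coeff, flux in zip(row, fluxes)) == 0
        for row in matrix
    )


def rank(matrix):
    """Matrix rank by Gaussian elimination with exact rational arithmetic."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    found = 0  # number of pivots found so far
    for col in range(n_cols):
        pivot = next(
            (i for i in range(found, len(rows)) if rows[i][col] != 0), None
        )
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for i in range(found + 1, len(rows)):
            factor = rows[i][col] / rows[found][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[found])]
        found += 1
        if found == len(rows):
            break
    return found


degrees_of_freedom = len(REACTIONS) - rank(N)

print(is_steady_state(N, [10, 10, 0, 10]))  # all substrate ends up in biomass
print(is_steady_state(N, [10, 4, 6, 4]))    # part of the substrate is excreted
print(is_steady_state(N, [10, 10, 0, 5]))   # B would accumulate
print(degrees_of_freedom)
```

Even this tiny network admits infinitely many steady states, since two fluxes can be chosen freely. This is why additional information, either an optimization objective or experimental measurements, is needed to single out one flux distribution.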

I.5. Elucidation of cell physiology by Metabolic Flux Analysis (MFA)

Matter and energy are channeled through the reaction network in order to supply biological functions, e.g. growth, maintenance, formation of storage compounds, motion, production of signaling molecules (communication), and detoxification (stress protection). Quantification of metabolic fluxes through the network therefore provides a direct observation of cell physiology. It links biological functions, i.e. the outcome, with the molecular processes involved, i.e. how the outcome is achieved. However, elucidating the in vivo operation of metabolic fluxes is challenging. It requires, first, knowledge of the metabolic network structure and, second, a large number of measurements to resolve the many degrees of freedom of the network. Metabolic Flux Analysis (MFA) was developed to assess the physiological state of a metabolic network [78, 79]. To elucidate the metabolic flux distribution, experimental flux values are collected, including the substrate uptake rate, the oxygen consumption rate, and the growth rate; these values are used as constraints to solve for the remaining fluxes. If the number of informative measured fluxes exceeds the number of degrees of freedom of the network, the actual state of the network can be established according to their sensitivity to the measurements [80]. One of the main measurements required when growth is investigated is the biomass composition of the cell. Quantitative information is needed on the organic matter composing the organism as well as on the energy associated with its production, especially the cost of assembling macromolecules. The rate of production of biomass components can be calculated from the doubling time of the organism. The maintenance energy, which corresponds to the minimal energy required by the cell to maintain functions not linked to growth, such as osmoregulation, can be calculated from substrate consumption when the growth rate is zero. These data are critical for investigating growth as well as the survival state.
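Two of the calculations mentioned above can be sketched in a few lines of Python. The first converts a doubling time into a specific growth rate (mu = ln 2 / t_d, valid for exponential growth) and scales a biomass composition by it to obtain synthesis rates. The second estimates the substrate consumption at zero growth by extrapolating a straight line through measured pairs of growth rate and specific substrate uptake rate; this linear relationship is an assumption made here for illustration, and all numerical values and function names are hypothetical rather than data from this work.

```python
import math


def specific_growth_rate(doubling_time_h):
    """Specific growth rate mu (1/h) for exponential growth: mu = ln(2) / t_d."""
    if doubling_time_h <= 0:
        raise ValueError("doubling time must be positive")
    return math.log(2) / doubling_time_h


def synthesis_rates(mu, composition):
    """Formation rate of each biomass component (amount per g biomass per h)."""
    return {name: mu * content for name, content in composition.items()}


def uptake_at_zero_growth(growth_rates, uptake_rates):
    """Intercept of the least-squares line q_s = slope * mu + intercept."""
    n = len(growth_rates)
    mean_mu = sum(growth_rates) / n
    mean_qs = sum(uptake_rates) / n
    sxx = sum((mu - mean_mu) ** 2 for mu in growth_rates)
    sxy = sum(
        (mu - mean_mu) * (qs - mean_qs)
        for mu, qs in zip(growth_rates, uptake_rates)
    )
    slope = sxy / sxx
    return mean_qs - slope * mean_mu


# Hypothetical numbers, chosen only to exercise the functions.
mu = specific_growth_rate(4.0)                                 # t_d = 4 h
rates = synthesis_rates(mu, {"protein": 0.5, "RNA": 0.1})      # g per g biomass
m_s = uptake_at_zero_growth([0.05, 0.10, 0.20], [1.0, 1.5, 2.5])

print(round(mu, 4))
print({name: round(rate, 4) for name, rate in rates.items()})
print(round(m_s, 3))
```

The synthesis rates obtained in this way act as fixed drain fluxes on the network, and the extrapolated uptake at zero growth gives an estimate of the substrate that is spent on maintenance rather than on growth.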

A commonly used experimental method to elucidate undetermined metabolic fluxes is 13C Metabolic Flux Analysis (13C-MFA). This method tracks the fate of 13C carbon isotopes, or of 15N if nitrogen metabolism is investigated, through the network by monitoring their incorporation into metabolites by mass spectrometry (MS) or nuclear magnetic resonance (NMR). The method was developed by Szyperski in 1995 [81] and became the standard for experimental elucidation of in vivo metabolic flux distributions. The in vivo metabolic flux distribution has been investigated previously in M. extorquens AM1 growing on methanol [82]; however, the flux calculations were based on a reduced central network with, as is now known, partially incorrect assumptions on the pathway leading to glyoxylate generation, and on a biomass composition that was corrected during this work. Taken together, these points made a new 13C-MFA necessary.
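As a minimal illustration of how label incorporation is quantified from MS data, the sketch below computes the mean fractional 13C enrichment of a metabolite from its mass isotopomer distribution (the intensities of M+0, M+1, ..., M+n for a molecule with n carbon atoms). It assumes that the intensities have already been corrected for naturally occurring isotopes, which is not shown; the function name and the example intensities are hypothetical.

```python
def fractional_enrichment(mass_isotopomers):
    """Mean 13C enrichment per carbon from intensities of M+0 ... M+n.

    The list must contain n + 1 values for a metabolite with n carbon atoms.
    Values may be raw intensities or fractions; they are normalized here.
    """
    n_carbons = len(mass_isotopomers) - 1
    total = sum(mass_isotopomers)
    if n_carbons < 1 or total <= 0:
        raise ValueError("need at least M+0 and M+1 with a positive total")
    labeled_carbons = sum(i * x for i, x in enumerate(mass_isotopomers))
    return labeled_carbons / (n_carbons * total)


# Hypothetical intensities for a two-carbon metabolite such as glycine:
# 20 % unlabeled, 50 % singly labeled, 30 % doubly labeled.
print(fractional_enrichment([200, 500, 300]))
```

Following such enrichments over time after a switch to labeled substrate is one way to see which metabolites receive label first, which is the kind of information a short-term labeling experiment exploits.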

I.6. Aims

M. extorquens AM1 became a model organism for the study of methylotrophy owing to the intensive biochemical and genetic studies carried out since its isolation 50 years ago [1, 15]. Despite these intensive studies on methylotrophs, no genome-scale metabolic model of this class of organisms was available at the beginning of this work. The first aim of this work was to elucidate the metabolic network topology of M. extorquens AM1 and to generate a first genome-scale metabolic model of this methylotroph. This model was then used to investigate the operation of the network under pure and mixed substrate conditions by 13C-MFA.

Information on the pathways and enzymes of central metabolism as well as whole-genome annotation are crucial prerequisites for reconstructing the metabolic network of M. extorquens AM1 and analyzing its operation. Sequencing of the complete genome, which is necessary for metabolic network reconstruction, was initiated in 1998, and a preliminary annotation of central metabolism based on the unfinished genome sequence followed in 2003 [15]. The complete genome sequence was published by an international consortium in 2009, during the course of this work [83] (see also Chapter III). The knowledge on enzymes and metabolic pathways was summarized above, and almost all the enzymes have been identified and characterized in the bacterium. However, the pathway resulting in glyoxylate regeneration in M. extorquens AM1 was still unresolved at the beginning of this thesis.

Fig. 8: Serine cycle balancing by acetyl-CoA to glyoxylate converting pathways. The operation of the ethylmalonyl-CoA pathway in the isocitrate lyase-negative M. extorquens AM1 was demonstrated during the course of this thesis work.

Evidence was available that a metabolic pathway supplying the serine cycle with glyoxylate is present in M. extorquens AM1 [84]. In some methylotrophs, glyoxylate is formed by the glyoxylate cycle via the key enzyme isocitrate lyase [63]. However, this enzyme is missing in M. extorquens AM1 [85], indicating that an alternative to the glyoxylate cycle exists (Fig. 8). The identity of the pathway balancing the serine cycle was thus a long-standing question dating back to the discovery of the serine cycle. Based on evidence from mutant phenotypes, reactions, and metabolites, two hypotheses were proposed more recently; both are based on a sequence of coenzyme A (CoA) thioesters and on the conversion of acetyl-CoA to glyoxylate. They are the glyoxylate regeneration cycle proposed for M. extorquens AM1 [86] and the ethylmalonyl-CoA pathway (EMCP) proposed for growth of Rhodobacter sphaeroides in the presence of acetate [64]. The operation of the EMCP in M. extorquens AM1 under methylotrophic growth conditions was demonstrated during the course of this work and closed the serine cycle. The knowledge on the operation of the central pathways together with the genomic information was used to build the genome-scale (GS) network and to analyze in silico the emergent properties of the network structure at the system level, including the robustness of the methylotrophic mode. Another objective of this work was the quantification of metabolic fluxes by 13C-MFA during growth on methanol. The metabolic flux distribution, i.e. the "in vivo" enzyme activities, established through the network under pure substrate conditions provided valuable information for reinterpreting the data collected so far, e.g. enzyme activities as well as transcriptome, proteome, and metabolome data. In addition, this information allowed carbon and energy to be balanced through the genome-scale network model, revealing strategies of pathway usage and the optimal or suboptimal state of the network, and identifying potential metabolic bottlenecks. This information can be valuable for further strain optimization in biotechnological applications. Furthermore, the knowledge on growth physiology was used here to investigate the putative co-consumption of substrates by M. extorquens AM1. Indeed, there is evidence that co-consumption, i.e. the simultaneous use of different carbon sources, can be a more suitable strategy than a diauxic shift for microorganisms to survive and grow in an environment containing diverse carbon and energy sources at low concentration [87, 88]. The main argument put forward for simultaneous carbon utilization is that, in an environment where cells are exposed to oligotrophy, no single carbon source might be sufficient to sustain growth. In addition, the cost of metabolic reprogramming during diauxic shifts between the different available substrates would be too high in comparison with the available resources. Taken together, this thesis work aimed at an in-depth investigation of the metabolic network of M. extorquens AM1 at the system level.

References

1. Anthony, C., The Biochemistry of Methylotrophs. 1982, London: Academic Press.
2. Maden, B.E., Tetrahydrofolate and tetrahydromethanopterin compared: functionally distinct carriers in C1 metabolism. Biochem J, 2000. 350 Pt 3: p. 609-29.
3. Bassham, J.A., Photosynthetic carbon metabolism. Proc Natl Acad Sci U S A, 1971. 68(11): p. 2877-82.
4. Large, P.J., D. Peel, and J.R. Quayle, Microbial growth on C(1) compounds. 3. Distribution of radioactivity in metabolites of methanol-grown Pseudomonas AM1 after incubation with [14C]methanol and [14C]bicarbonate. Biochem J, 1962. 82(3): p. 483-8.
5. Large, P.J., D. Peel, and J.R. Quayle, Microbial growth on C(1) compounds. 4. Carboxylation of phosphoenolpyruvate in methanol-grown Pseudomonas AM1. Biochem J, 1962. 85(1): p. 243-50.
6. Chistoserdova, L., M.G. Kalyuzhnaya, and M.E. Lidstrom, The expanding world of methylotrophic metabolism. Annu Rev Microbiol, 2009. 63: p. 477-99.
7. Kalyuzhnaya, M.G., et al., Novel methylotrophic isolates from Lake Washington sediment and description of a new species in the genus Methylotenera, Methylotenera versatilis sp. nov. Int J Syst Evol Microbiol, 2011.
8. Kolb, S., Aerobic methanol-oxidizing bacteria in soil. FEMS Microbiol Lett, 2009. 300(1): p. 1-10.
9. Radajewski, S., et al., Stable-isotope probing as a tool in microbial ecology. Nature, 2000. 403(6770): p. 646-649.
10. Schauer, S. and U. Kutschera, Methylotrophic bacteria on the surfaces of field-grown sunflower plants: a biogeographic perspective. Theory Biosci, 2008. 127(1): p. 23-9.
11. Knief, C., L. Frances, and J.A. Vorholt, Competitiveness of diverse Methylobacterium strains in the phyllosphere of Arabidopsis thaliana and identification of representative models, including M. extorquens PA1. Microb Ecol, 2010. 60(2): p. 440-52.
12. Singh, B.K., et al., Microorganisms and climate change: terrestrial feedbacks and mitigation options. Nat Rev Microbiol, 2010. 8(11): p. 779-90.
13. Kemp, M.B. and J.R. Quayle, Microbial growth on C1 compounds. Uptake of [14C]formaldehyde and [14C]formate by methane-grown Pseudomonas methanica and determination of the hexose labelling pattern after brief incubation with [14C]methanol. Biochem J, 1967. 102(1): p. 94-102.
14. Peel, D. and J.R. Quayle, Microbial growth on C1 compounds. I. Isolation and characterization of Pseudomonas AM 1. Biochem J, 1961. 81: p. 465-9.
15. Chistoserdova, L., et al., Methylotrophy in Methylobacterium extorquens AM1 from a genomic point of view. J Bacteriol, 2003. 185(10): p. 2980-7.
16. Delmotte, N., et al., Community proteogenomics reveals insights into the physiology of phyllosphere bacteria. Proc Natl Acad Sci U S A, 2009. 106(38): p. 16428-33.
17. Sy, A., et al., Methylotrophic metabolism is advantageous for Methylobacterium extorquens during colonization of Medicago truncatula under competitive conditions. Appl Environ Microbiol, 2005. 71(11): p. 7245-52.
18. Abanda-Nkpwatt, D., et al., Molecular interaction between Methylobacterium extorquens and seedlings: growth promotion, methanol consumption, and localization of the methanol emission site. J Exp Bot, 2006. 57(15): p. 4025-32.
19. Huve, K., et al., Simultaneous growth and emission measurements demonstrate an interactive control of methanol release by leaf expansion and stomata. J Exp Bot, 2007. 58(7): p. 1783-93.
20. Fiala, V., et al., Occurrence of soluble carbohydrates on the phylloplane of maize (Zea mays L.): variations in relation to leaf heterogeneity and position on the plant. New Phytologist, 1990. 115(4): p. 609-615.
21. Schrader, J., et al., Methanol-based industrial biotechnology: current status and future perspectives of methylotrophic bacteria. Trends Biotechnol, 2009. 27(2): p. 107-15.
22. Olah, G.A., Beyond oil and gas: the methanol economy. Angew Chem Int Ed Engl, 2005. 44(18): p. 2636-9.
23. Olah, G.A., A. Goeppert, and G.K.S. Prakash, Beyond oil and gas: the methanol economy, Second, Updated and Enlarged Edition. 2009: Wiley-VCH.
24. Proceedings of the High Level Conference on World Food Security: The Challenges of Climate Change and Bioenergy. Soaring Food Prices: Facts, Perspectives, Impacts and Actions Required (HLC/08/INF/1). 2008. Rome.
25. Windass, J.D., et al., Improved conversion of methanol to single-cell protein by Methylophilus methylotrophus. Nature, 1980. 287(5781): p. 396-401.
26. Westlake, R., Large-Scale Continuous Production of Single Cell Protein. Chemie Ingenieur Technik, 1986. 58(12): p. 934-937.
27. Sirirote, P., T. Yamane, and S. Shimizu, L-Serine Production from Methanol and Glycine with an Immobilized Methylotroph. Journal of Fermentation Technology, 1988. 66(3): p. 291-297.
28. Schendel, F.J., et al., Production of glutamate using wild type Bacillus methanolicus. 1997, Regents of the University of Minnesota: USA.
29. Schendel, F.J., R.S. Hanson, and R. Dillingham, Production of lysine using salt tolerant, methanol utilizing Bacillus. 1999, University of Minnesota, USA.
30. Hofer, P., P. Vermette, and D. Groleau, Introducing a new bioengineered bug: Methylobacterium extorquens tuned as a microbial bioplastic factory. Bioeng Bugs, 2011. 2(2): p. 71-9.
31. Kim, P., J.H. Kim, and D.K. Oh, Improvement in cell yield of Methylobacterium sp by reducing the inhibition of medium components for poly-beta-hydroxybutyrate production. World Journal of Microbiology & Biotechnology, 2003. 19(4): p. 357-361.
32. Bourque, D., Y. Pomerleau, and D. Groleau, High cell density production of poly-beta-hydroxybutyrate (PHB) from methanol by Methylobacterium extorquens: Production of high-molecular-mass PHB. Applied Microbiology and Biotechnology, 1995. 44(3-4): p. 367-376.
33. Van Dien, S.J., et al., Genetic characterization of the carotenoid biosynthetic pathway in Methylobacterium extorquens AM1 and isolation of a colorless mutant. Appl Environ Microbiol, 2003. 69(12): p. 7563-6.
34. Marx, C.J. and M.E. Lidstrom, Development of improved versatile broad-host-range vectors for use in methylotrophs and other Gram-negative bacteria. Microbiology, 2001. 147(Pt 8): p. 2065-75.
35. Chistoserdova, L., Modularity of methylotrophy, revisited. Environ Microbiol, 2011.
36. Ghosh, M., et al., The refined structure of the quinoprotein methanol dehydrogenase from Methylobacterium extorquens at 1.94 A. Structure, 1995. 3(2): p. 177-87.
37. Williams, P.A., et al., The atomic resolution structure of methanol dehydrogenase from Methylobacterium extorquens. Acta Crystallogr D Biol Crystallogr, 2005. 61(Pt 1): p. 75-9.
38. Schmidt, S., et al., Functional investigation of methanol dehydrogenase-like protein XoxF in Methylobacterium extorquens AM1. Microbiology, 2010. 156(Pt 8): p. 2575-86.
39. Hibi, Y., et al., Molecular structure of La(3+)-induced methanol dehydrogenase-like protein in Methylobacterium radiotolerans. J Biosci Bioeng, 2011. 111(5): p. 547-9.
40. Arfman, N., et al., Properties of an NAD(H)-containing methanol dehydrogenase and its activator protein from Bacillus methanolicus. Eur J Biochem, 1997. 244(2): p. 426-33.
41. Levering, P.R., et al., Arthrobacter P1, a fast growing versatile methylotroph with amine oxidase as a key enzyme in the metabolism of methylated amines. Arch Microbiol, 1981. 129(1): p. 72-80.
42. Eady, R.R. and P.J. Large, Purification and properties of an amine dehydrogenase from Pseudomonas AM1 and its role in growth on methylamine. Biochem J, 1968. 106(1): p. 245-55.
43. Chistoserdov, A.Y., et al., Genetic organization of the mau gene cluster in Methylobacterium extorquens AM1: complete nucleotide sequence and generation and characteristics of mau mutants. J Bacteriol, 1994. 176(13): p. 4052-65.
44. Netrusov, A.I., [NAD-dependent N-methylglutamate dehydrogenase--new enzyme metabolizing methylamine in methylotrophs]. Mikrobiologiia, 1975. 44(3): p. 552-4.
45. Jorns, M.S. and L.B. Hersh, N-Methylglutamate synthetase. Substrate-flavin hydrogen transfer reactions probed with deazaflavin mononucleotide. J Biol Chem, 1975. 250(10): p. 3620-8.
46. Latypova, E., et al., Genetics of the glutamate-mediated methylamine utilization pathway in the facultative methylotrophic beta-proteobacterium Methyloversatilis universalis FAM5. Mol Microbiol, 2010. 75(2): p. 426-39.
47. Vannelli, T., et al., A corrinoid-dependent catabolic pathway for growth of a Methylobacterium strain with chloromethane. Proc Natl Acad Sci U S A, 1999. 96(8): p. 4615-20.
48. Leisinger, T., et al., Microbes, enzymes and genes involved in dichloromethane utilization. Biodegradation, 1994. 5(3-4): p. 237-48.
49. Ando, M., et al., Formaldehyde dehydrogenase from Pseudomonas putida. Purification and some properties. J Biochem, 1979. 85(5): p. 1165-72.
50. Chistoserdova, L., et al., C1 transfer enzymes and coenzymes linking methylotrophic bacteria and methanogenic Archaea. Science, 1998. 281(5373): p. 99-102.
51. Vorholt, J.A., et al., Novel formaldehyde-activating enzyme in Methylobacterium extorquens AM1 required for growth on methanol. J Bacteriol, 2000. 182(23): p. 6645-50.
52. Vorholt, J.A., Cofactor-dependent pathways of formaldehyde oxidation in methylotrophic bacteria. Arch Microbiol, 2002. 178(4): p. 239-49.
53. Hagemeier, C.H., et al., Characterization of a second methylene tetrahydromethanopterin dehydrogenase from Methylobacterium extorquens AM1. Eur J Biochem, 2000. 267(12): p. 3762-9.
54. Pomper, B.K., et al., A methenyl tetrahydromethanopterin cyclohydrolase and a methenyl tetrahydrofolate cyclohydrolase in Methylobacterium extorquens AM1. Eur J Biochem, 1999. 261(2): p. 475-80.
55. Pomper, B.K., et al., Generation of formate by the formyltransferase/hydrolase complex (Fhc) from Methylobacterium extorquens AM1. FEBS Lett, 2002. 523(1-3): p. 133-7.
56. Pomper, B.K. and J.A. Vorholt, Characterization of the formyltransferase from Methylobacterium extorquens AM1. Eur J Biochem, 2001. 268(17): p. 4769-75.
57. Barber, R.D., M.A. Rott, and T.J. Donohue, Characterization of a glutathione-dependent formaldehyde dehydrogenase from Rhodobacter sphaeroides. J Bacteriol, 1996. 178(5): p. 1386-93.
58. Barber, R.D. and T.J. Donohue, Function of a glutathione-dependent formaldehyde dehydrogenase in Rhodobacter sphaeroides formaldehyde oxidation and assimilation. Biochemistry, 1998. 37(2): p. 530-7.
59. Chistoserdova, L., et al., Identification of a fourth formate dehydrogenase in Methylobacterium extorquens AM1 and confirmation of the essential role of formate oxidation in methylotrophy. J Bacteriol, 2007. 189(24): p. 9076-81.
60. Marx, C.J., et al., Purification of the formate-tetrahydrofolate ligase from Methylobacterium extorquens AM1 and demonstration of its requirement for methylotrophic growth. J Bacteriol, 2003. 185(24): p. 7169-75.
61. Nagy, P.L., et al., Formyltetrahydrofolate hydrolase, a regulatory enzyme that functions to balance pools of tetrahydrofolate and one-carbon tetrahydrofolate adducts in Escherichia coli. J Bacteriol, 1995. 177(5): p. 1292-8.
62. Large, P.J. and J.R. Quayle, Microbial growth on C(1) compounds. 5. Enzyme activities in extracts of Pseudomonas AM1. Biochem J, 1963. 87(2): p. 386-96.
63. Kornberg, H.L. and H.A. Krebs, Synthesis of cell constituents from C2-units by a modified tricarboxylic acid cycle. Nature, 1957. 179(4568): p. 988-91.
64. Erb, T.J., et al., Synthesis of C5-dicarboxylic acids from C2-units involving crotonyl-CoA carboxylase/reductase: the ethylmalonyl-CoA pathway. Proc Natl Acad Sci U S A, 2007. 104(25): p. 10631-6.
65. Bassham, J.A., The control of photosynthetic carbon metabolism. Science, 1971. 172(3983): p. 526-34.
66. Quayle, J.R. and D.B. Keech, Carbon dioxide and formate utilization by formate-grown Pseudomonas oxalaticus. Biochim Biophys Acta, 1958. 29(1): p. 223-5.
67. Kemp, M.B. and J.R. Quayle, Microbial growth on C1 compounds. Incorporation of C1 units into allulose phosphate by extracts of Pseudomonas methanica. Biochem J, 1966. 99(1): p. 41-8.
68. Feist, A.M., et al., A genome-scale metabolic reconstruction for Escherichia coli K-12 MG1655 that accounts for 1260 ORFs and thermodynamic information. Mol Syst Biol, 2007. 3: p. 121.
69. Fell, D.A. and J.R. Small, Fat synthesis in adipose tissue. An examination of stoichiometric constraints. Biochem J, 1986. 238(3): p. 781-6.
70. Schuetz, R., L. Kuepfer, and U. Sauer, Systematic evaluation of objective functions for predicting intracellular fluxes in Escherichia coli. Mol Syst Biol, 2007. 3: p. 119.
71. Schuster, S., T. Dandekar, and D.A. Fell, Detection of elementary flux modes in biochemical networks: a promising tool for pathway analysis and metabolic engineering. Trends Biotechnol, 1999. 17(2): p. 53-60.
72. Stelling, J., et al., Metabolic network structure determines key aspects of functionality and regulation. Nature, 2002. 420(6912): p. 190-3.
73. Edwards, J.S. and B.O. Palsson, Robustness analysis of the Escherichia coli metabolic network. Biotechnol Prog, 2000. 16(6): p. 927-39.
74. Oberhardt, M.A., B.O. Palsson, and J.A. Papin, Applications of genome-scale metabolic reconstructions. Mol Syst Biol, 2009. 5: p. 320.
75. Covert, M.W., et al., Metabolic modeling of microbial strains in silico. Trends Biochem Sci, 2001. 26(3): p. 179-86.
76. Edwards, J.S. and B.O. Palsson, Systems properties of the Haemophilus influenzae Rd metabolic genotype. J Biol Chem, 1999. 274(25): p. 17410-6.
77. Thiele, I. and B.O. Palsson, A protocol for generating a high-quality genome-scale metabolic reconstruction. Nat Protoc, 2010. 5(1): p. 93-121.
78. Christensen, B. and J. Nielsen, Metabolic network analysis. A powerful tool in metabolic engineering. Adv Biochem Eng Biotechnol, 2000. 66: p. 209-31.
79. Lee, K., et al., Metabolic flux analysis: a powerful tool for monitoring tissue function. Tissue Eng, 1999. 5(4): p. 347-68.
80. Savinell, J.M. and B.O. Palsson, Optimal selection of metabolic fluxes for in vivo measurement. I. Development of mathematical methods. J Theor Biol, 1992. 155(2): p. 201-14.
81. Szyperski, T., Biosynthetically directed fractional 13C-labeling of proteinogenic amino acids. An efficient analytical tool to investigate intermediary metabolism. Eur J Biochem, 1995. 232(2): p. 433-48.
82. Van Dien, S.J., T. Strovas, and M.E. Lidstrom, Quantification of central metabolic fluxes in the facultative methylotroph Methylobacterium extorquens AM1 using 13C-label tracing and mass spectrometry. Biotechnol Bioeng, 2003. 84(1): p. 45-55.
83. Vuilleumier, S., et al., Methylobacterium genome sequences: a reference blueprint to investigate microbial metabolism of C1 compounds from natural and industrial sources. PLoS One, 2009. 4(5): p. e5584.
84. Salem, A.R. and J.R. Quayle, Mutants of Pseudomonas AM1 that require glycollate or glyoxylate for growth on methanol or ethanol. Biochem J, 1971. 124(5): p. 74P.
85. Dunstan, P.M., C. Anthony, and W.T. Drabble, Microbial metabolism of C 1 and C 2 compounds. The involvement of glycollate in the metabolism of ethanol and of acetate by Pseudomonas AM1. Biochem J, 1972. 128(1): p. 99-106.
86. Korotkova, N., et al., Glyoxylate regeneration pathway in the methylotroph Methylobacterium extorquens AM1. J Bacteriol, 2002. 184(6): p. 1750-8.
25 87. Egli, T., and C. A. Mason. Mixed substrates and mixed cultures. Butterworth-Heinemann, Boston, 1991. p 173-201 88. Egli, T. The ecological and physiological significance of the growth of heterotrophic microorganisms with mixtures of substrates. Plenum Press, New York, 1995. p 305-386


CHAPTER II

Glyoxylate regeneration in M. extorquens AM1: Operation of the ethylmalonyl-CoA pathway

Rémi Peyraud, Patrick Kiefer, Philipp Christen, Stephane Massou, Jean-Charles Portais, and Julia A. Vorholt

Published in: Peyraud R, Kiefer P, Christen P, Massou S, Portais J-C, and Vorholt JA Demonstration of the ethylmalonyl-CoA pathway by using 13C metabolomics. Proc Natl Acad Sci U S A, 106(12):4846-51 (2009) Contribution by RP: Design of the study, experimental work, data analysis, writing the manuscript.

Abstract

The assimilation of one-carbon (C1) compounds, such as methanol, by serine cycle methylotrophs requires the continuous regeneration of glyoxylate. Instead of the glyoxylate cycle, this process is achieved by a not yet established pathway where CoA thioesters are known to play a key role. We applied state-of-the-art metabolomics and 13C metabolomics strategies to demonstrate how glyoxylate is generated during methylotrophic growth in the isocitrate lyase-negative methylotroph Methylobacterium extorquens AM1. High-resolution mass spectrometry showed the presence of CoA thioesters specific to the recently proposed ethylmalonyl-CoA pathway. The operation of this pathway was demonstrated by short-term 13C-labeling experiments, which allowed determination of the sequence of reactions from the order of label incorporation into the different CoA derivatives. Analysis of 13C positional enrichment in glycine by NMR was consistent with the predicted labeling pattern as a result of the operation of the ethylmalonyl-CoA pathway and the unique operation of the latter for glyoxylate generation during growth on methanol. The results also revealed that 2 molecules of glyoxylate were regenerated in this process. This work provides a complete pathway for methanol assimilation in the model methylotroph M. extorquens AM1 and represents an important step toward the determination of the overall topology of its metabolic network. The operation of the ethylmalonyl-CoA pathway in M. extorquens AM1 has major implications for the physiology of these methylotrophs and their role in nature, and it also provides a common ground for C1 and C2 compound assimilation in isocitrate lyase-negative bacteria.

Introduction

Methylotrophic bacteria are organisms capable of using reduced carbon compounds, such as methanol or methane, as sole sources of carbon and energy, and they play a key role in carbon cycling in their environment. They also represent promising organisms in biotechnology for the conversion of one-carbon (C1) substrates to value-added products [1]. The elucidation of the mechanisms enabling Methylobacterium extorquens AM1, one of the most studied methylotrophs, to grow on reduced C1 compounds has been a longstanding goal, and although great progress has been made [2-5], it is still not fully achieved. A key point has been to understand how the bacterium incorporates C1 units into cell material. The serine cycle was elucidated in this organism during the early 1960s by Quayle and coworkers [6-9]. The assimilation of C1 units by this pathway requires continuous regeneration of glyoxylate from acetyl-CoA and can be achieved, in principle, via the well-known glyoxylate cycle [10]. However, Dunstan and coworkers [11-14] showed in 1972 and 1973 that M. extorquens AM1 lacks the key enzyme of the glyoxylate cycle, isocitrate lyase, but has an alternative route involving oxidation of acetate to glyoxylate that functions during growth on both C1 and C2 compounds. Also, other organisms, including the photosynthetic Rhodobacter sphaeroides, are known to require an alternative to the glyoxylate cycle when growing on C2 substrates or on substrates that are converted into acetyl-CoA to enter central metabolism [15-18]. Recent studies, including mutant analyses, gene predictions, enzyme assays, and metabolite studies in M. extorquens AM1, have led to the observation that a complex sequence of CoA thioester derivatives is involved in glyoxylate regeneration, resulting in the hypothesis of the so-called glyoxylate regeneration cycle (GRC) [19, 20] [Fig. 1 and supporting information (SI) Table S1]. According to this pathway, a C5 compound, methylsuccinyl-CoA, is formed from the condensation of 2 acetyl-CoA molecules plus 1 CO2 and is decarboxylated twice in a process similar to valine degradation. The specific intermediates of the GRC are isobutyryl-CoA, methacrylyl-CoA, and hydroxyisobutyryl-CoA, and the result is the formation of propionyl-CoA. Subsequently, propionyl-CoA is transformed to malate, from which 1 glyoxylate and 1 acetyl-CoA are generated [20]. More recently, a second hypothesis, the ethylmalonyl-CoA pathway (EMCP), was proposed from studies of C2 assimilation pathways in R. sphaeroides [21-23]. This pathway (Fig. 1 and Table S1) includes the formation of methylsuccinyl-CoA, which is further converted to methylmalyl-CoA, from which both glyoxylate and propionyl-CoA are released by cleavage [22]. The propionyl-CoA can then be converted to C4 compounds and assimilated as cell material [23].


Fig. 1. Pathways proposed for glyoxylate regeneration in isocitrate lyase-negative bacteria. The reactions that are specific to the GRC [20] or to the EMCP [23] are indicated. For designations of genes and enzymes, see Table S1. Metabolite numbers are according to Fig. 2.

The 2 pathways mentioned above are still hypothetical, and none has been firmly demonstrated to operate in vivo. They differ strikingly in terms of carbon balance and, therefore, overall carbon yield for methylotrophic growth. The GRC includes a net decarboxylation step, whereas the ethylmalonyl-CoA pathway includes net carboxylation steps. This makes the second pathway more efficient in terms of carbon assimilation and has important implications with regard to the physiology of these methylotrophs and their actual biotechnological potential. In this work, we combined state-of-the-art metabolomics and 13C metabolomics strategies to examine the pathway of glyoxylate regeneration occurring in M. extorquens AM1 under methylotrophic conditions. The development of an original liquid chromatography high-resolution mass spectrometry (LC-HRMS) method allowed for the identification of almost all intermediates of the ethylmalonyl-CoA pathway. Detailed and conclusive information regarding the operation of the ethylmalonyl-CoA pathway as the predominant process for glyoxylate formation was obtained from 13C-labeling experiments in which kinetic isotopomer profiles collected by LC-HRMS during short-term 13C-labeling experiments were combined with steady-state isotopomer distributions measured by NMR.

Results

Identification of CoA Thioesters by Liquid Chromatography-Mass Spectrometry (LC-MS). The pathways proposed for the conversion of acetyl-CoA to glyoxylate in isocitrate lyase-negative bacteria involve CoA thioesters as key intermediates; however, the nature and number of these metabolites differ for the GRC and the ethylmalonyl-CoA pathway (Fig. 1). To determine which reactions occur in M. extorquens AM1 during methylotrophic growth, the presence of CoA thioesters in methanol-grown cells of M. extorquens AM1 was analyzed (Fig. 2 and Table S2). To this end, an LC-HRMS method for the detection of possibly occurring CoA thioesters, including the 15 CoA derivatives that have been considered in the different hypothetical pathways [20, 23], was developed. The intermediates specific to the GRC, including isobutyryl-CoA, methacrylyl-CoA, and β-hydroxyisobutyryl-CoA [20], could not be detected. In contrast, the 2 specific intermediates of the ethylmalonyl-CoA pathway, mesaconyl-CoA and β-methylmalyl-CoA [23], were identifiable in the cell extracts of methanol-grown cells.


Fig. 2. LC-MS analysis of CoA thioesters occurring in cell extracts of M. extorquens AM1 during growth on methanol. 1, malyl-CoA; 2, succinyl-CoA; 3, methylmalonyl-CoA; 4, β-methylmalyl-CoA; 5, CoA; 6, mesaconyl-CoA; 7, methylsuccinyl-CoA; 8, ethylmalonyl-CoA; 9, acetyl-CoA; 10, 3-hydroxybutyryl-CoA; 11, propionyl-CoA; 12, crotonyl-CoA; and 13, butyryl-CoA.

Short-Term 13C-Labeling Experiments. Short-term labeling experiments were performed to further investigate the sequence of reactions leading to glyoxylate formation. These experiments were carried out with [1-13C]acetate, because this metabolite is converted to acetyl-CoA, the initial metabolite from which glyoxylate is formed during assimilation of C1 and C2 compounds. In addition, the fate of the C1 carbon of acetate differs between the 2 proposed pathways for glyoxylate regeneration [20, 23] (Fig. 3). Therefore, both the kinetics of label incorporation into pathway intermediates and the nature of the generated isotopomers provide discriminative information. Furthermore, the isotopomer pattern should be consistent in all CoA thioesters participating in the same linear sequence of reactions.


Fig. 3. Predicted fate of carbons from [1-13C]acetate to propionyl-CoA. (A) GRC. (B) EMCP. Black triangles indicate carbon derived from the first carbon of acetate; gray circles, carbon derived from CO2.

Incorporation of the 13C label into each CoA thioester during the 13C acetate-labeling experiment was monitored by LC-HRMS over time (Fig. 4) from cells grown in the presence of 12C-enriched methanol. As expected, acetyl-CoA was the first CoA metabolite in which the label was detected. The label was subsequently found in (in order) 3-hydroxybutyryl-CoA, ethylmalonyl-CoA, methylsuccinyl-CoA, and mesaconyl-CoA, immediately followed by propionyl-CoA. The percentage of 13C incorporated into the 3 C5 CoA derivatives was quite similar. This finding, as well as the observation that the incorporation of the label into the C3 CoA thioester propionyl-CoA was essentially indistinguishable from that of the 2 C5 CoA thioesters, methylsuccinyl-CoA and mesaconyl-CoA, indicates that no loss of labeled carbon occurred. After propionyl-CoA, methylmalonyl-CoA was found to be labeled, followed by succinyl-CoA. Their isotopomer profiles were similar to those of metabolites in the early steps of the pathway, indicating that they are all connected through the same sequence of reactions. After 10 s of incubation, the percentage of 13C label incorporated into ethylmalonyl-CoA did not increase further, unlike other CoA metabolites (Fig. 4). Further investigations are needed to understand this observation. However, the mass isotopomer distribution (relative proportions of M0, M+1, M+2, etc.) of ethylmalonyl-CoA remained constant over time, suggesting that the process by which the label was transferred from [13C]acetate to this metabolite did not change during the incubation period. The mass isotopomers of a few CoA intermediates were difficult to quantify because of small pool sizes. Therefore, replicate samples collected at the same time points were pooled and concentrated. The concentrated samples indicated that crotonyl-CoA was labeled similarly to 3-hydroxybutyryl-CoA and before butyryl-CoA and ethylmalonyl-CoA. Butyryl-CoA was found to be labeled more than ethylmalonyl-CoA and similarly to methylsuccinyl-CoA. β-Methylmalyl-CoA was labeled before the other C5 compounds mentioned above and with a high level of M+2, which implies an incorporation of label from mesaconyl-CoA plus an entry of label from glyoxylate due to the reversibility of the reaction catalyzed by L-malyl-CoA/β-methylmalyl-CoA lyase [22, 24]. Taken together, the label incorporation in CoA thioesters is in agreement with the sequence of reactions shown in Fig. 1 and suggests the operation of the ethylmalonyl-CoA pathway [23].

Fig. 4. Kinetics of 13C label incorporation in CoA thioesters after addition of [1-13C]acetate to [12C]methanol-grown M. extorquens AM1 cells. L1-Ac represents the percent of 13C label incorporated in a given metabolite, normalized to the maximal number of carbon atoms received from the first carbon of acetate. Results are mean values ± SDs from 3 independent biological replicates.

Mass Isotopomers of Propionyl-CoA. Because the fate of carbon atoms derived from [1-13C]acetate is different in the 2 proposed pathways, the generated propionyl-CoA molecules do not receive the same number of carbon atoms; i.e., doubly labeled M+2 propionyl-CoA would be formed according to the ethylmalonyl-CoA pathway (Fig. 3), whereas only singly labeled M+1 propionyl-CoA would be formed according to the GRC as a result of carbon loss as CO2. Examination of the evolution of the mass isotopomer distribution of propionyl-CoA during [1-13C]acetate-labeling experiments showed increases of M+2 until 24.1% ± 2.6% at 90 s of incubation time. To determine whether a decarboxylation step operates between methylsuccinyl-CoA and propionyl-CoA, the proportion of M+2 isotopomers in methylsuccinyl-CoA and propionyl-CoA during the entire labeling experiment was compared (Fig. 5). The proportion was found to be the same, indicating that no decarboxylation reaction occurred, as would have been observed if the GRC was operating under our experimental conditions.

Fig. 5. Comparison of the time course evolution of M+2 isotopomeric fractions in propionyl-CoA (E) and methylsuccinyl-CoA (F) during [1-13C]acetate labeling experiments carried out with methanol-grown M. extorquens AM1 cells. The parallel development of the M+2 fraction in the 2 metabolites indicated that no loss of carbon occurred in the process by which methylsuccinyl-CoA was converted to propionyl-CoA.

Analysis of Glycine Isotopomers Generated from [13C]Methanol Under Pure Methylotrophic Conditions. To examine the contribution of the ethylmalonyl-CoA pathway to glyoxylate regeneration under pure methylotrophic conditions, a model of glyoxylate metabolism was built up to simulate the theoretical fate of carbon in the central metabolic network (serine cycle, glyoxylate regeneration, and citric acid cycle) of M. extorquens AM1 from [13C]methanol and CO2 at natural abundance. This model was used to calculate the steady-state isotopomer distribution in glyoxylate expected to occur when each pathway (either GRC or EMCP) was considered separately (for information on underlying considerations and results, see Fig. S1). By operation of the GRC [20], the 2 major isotopomers are expected to be 12C1-13C2 and 13C1-12C2, and they should represent 41.1% and 33.3%, respectively, of all glyoxylate isotopomers (Table 1). According to the ethylmalonyl-CoA pathway [23], the 2 major isotopomers are expected to be 12C1-13C2 and 12C1-12C2, and their steady-state proportions should represent 57.1% and 39.1%, respectively, of all glyoxylate isotopomers (Table 1). Taken together, the results of these simulations revealed that the 2 pathways result in strongly discriminative labeling patterns in glyoxylate.

Notably, Large et al. [7] in 1962 used 14C-labeling strategies to investigate the serine cycle in M. extorquens. They found disproportionate labeling, in which 92.5% of the label from [14C]methanol incorporated into glycine was recovered in the C2 position. Although these results found almost half a century ago were not interpreted in that way, they are consistent with the operation of the ethylmalonyl-CoA pathway [23] (Table 1). This seminal study, however, did not provide detailed measurement of the various isotopomers from which conclusive information regarding the operation of the 2 pathways could be obtained.

Table 1. Steady-state distribution of 13C labeling in glycine in M. extorquens AM1 during methylotrophic growth on [13C]methanol.

              Predicted data according to Experimental data
Isotopomer    GRC           EMCP          This study      Large et al. [7]
Isotopomer distribution (% of total glycine isotopomers)
12C1-12C2     18.4          39.1          32.3 ± 0.7
13C1-12C2     33.3          2.8           2 ± 1
12C1-13C2     41.1          57.1          60.1 ± 2.9
13C1-13C2     7.2           1             5.6 ± 2
Label recovery (% of total label in glycine)
C1            45.6          6.2           10.4 ± 4.0*     7.5
C2            54.4          93.8          89.6 ± 6.6*     92.5

Experimental data were obtained from NMR data recorded on proteinogenic amino acids of M. extorquens AM1 growing on [13C]methanol and 5% CO2 at natural abundance. The distribution of glycine isotopomers expected from the pure operation of either the GRC [20] or EMCP [23] was calculated from a mathematical model taking into account the stoichiometry and the carbon atom transition of each process. The percentages of 13C label recovered in the 2 carbon positions of glycine (Experimental data) were compared with the values of 14C-label recovery in intracellular glycine measured by Large et al. [7] and with the predicted values. *Correlated value.
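The label-recovery rows of Table 1 follow arithmetically from the isotopomer rows: the label at a position is the sum of the isotopomers carrying 13C there, expressed as a share of the label at both positions. The sketch below is only an illustration of that arithmetic, not the model used in the study; the dictionary layout and the function name `label_recovery` are assumptions introduced here. Because the tabulated percentages are rounded, the results agree with the table to within about 0.1 percentage point.

```python
# Glycine positional isotopomers (% of total) as listed in Table 1.
# Keys are (label at C1, label at C2), with 0 = 12C and 1 = 13C.
TABLE_1 = {
    "GRC": {(0, 0): 18.4, (1, 0): 33.3, (0, 1): 41.1, (1, 1): 7.2},
    "EMCP": {(0, 0): 39.1, (1, 0): 2.8, (0, 1): 57.1, (1, 1): 1.0},
    "This study": {(0, 0): 32.3, (1, 0): 2.0, (0, 1): 60.1, (1, 1): 5.6},
}


def label_recovery(isotopomers):
    """Return the share (%) of the total 13C label found at C1 and at C2."""
    # Fractional enrichment of a position = sum of isotopomers labeled there.
    c1 = sum(pct for (at_c1, _), pct in isotopomers.items() if at_c1)
    c2 = sum(pct for (_, at_c2), pct in isotopomers.items() if at_c2)
    total = c1 + c2
    return 100.0 * c1 / total, 100.0 * c2 / total


for name, distribution in TABLE_1.items():
    c1, c2 = label_recovery(distribution)
    print(f"{name}: C1 {c1:.1f}%  C2 {c2:.1f}%")
```

For the experimental column, the intermediate sums are the fractional enrichments of C1 and C2 (2 + 5.6 and 60.1 + 5.6).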

To obtain more detailed labeling information, we analyzed by NMR the positional isotopomers of glycine. Data were generated during steady-state 13C-labeling experiments carried out with 13C-enriched methanol and CO2 at natural abundance, so that the labeling of free glycine (and glyoxylate) could be measured from the more abundant proteinogenic glycine. The 4 positional isotopomers of glycine were measured by combining 2D ZQF-TOCSY and 2D-HSQC experiments [25, 26] (Table S3 and Fig. S2). The results obtained for 3 biological replicates are shown in Table 1.

The fractional enrichments (percentages of label in the carbon position) of C1 and C2 were 7.6% and 65.7%, respectively, indicating that 89.6% ± 6.6% of the 13C label was recovered in the C2 position in our experiments. The latter value is closely similar to the 92.5% determined by Large et al. [7] in 1962. The steady-state distribution of [13C]isotopomers in glycine (Table 1) was used to determine the metabolic origin of glyoxylate in M. extorquens AM1 under pure methylotrophic conditions. The relative flux distributions calculated are displayed in Fig. 6. The steady-state positional isotopomers of glycine were consistent with the almost unique operation of the ethylmalonyl-CoA pathway (flux relative to methanol uptake: 25% ± 1%), because the contribution of the GRC, if any, was within the limits of experimental errors (0% ± 2%). The apparent discrepancy between the observed isotopic pattern of glyoxylate (Table 1) and the theoretical expectations for the pure operation of the ethylmalonyl-CoA pathway could be explained by the contribution of 13C-CO2 produced by methanol oxidation. The culture was aerated with 5% CO2 to remove the 13C-CO2 produced by the bacteria. Under this condition, 6% of the CO2 was found to come from methanol oxidation. Moreover, all isotopomer constraints were satisfied in the calculations, suggesting that if any other pathway contributed to glyoxylate formation, it would result in the same labeling patterns as those induced by the metabolic pathways considered. The data shown in Fig. 6 allowed us to calculate the contribution of the various pathways to glyoxylate biosynthesis. For each turn of the serine cycle, 25% of glyoxylate molecules were regenerated by direct cleavage of methylmalyl-CoA, whereas the remaining 75% were obtained from malate. Of these 75%, ~50% corresponded to malate recycled in the serine cycle, and the remaining 25% were generated from the propionyl-CoA molecules generated in the ethylmalonyl-CoA pathway. This indicated that the proportion of glyoxylate molecules regenerated from propionyl-CoA (25%) was in the range of that generated directly from the cleavage of methylmalyl-CoA (25%). These data were consistent with the vast majority of propionyl-CoA molecules generated in the ethylmalonyl-CoA pathway being used for the regeneration of glyoxylate.


Fig. 6. Metabolic origin of glyoxylate in M. extorquens AM1 during growth on methanol. The contributions of central metabolic pathways to glyoxylate biosynthesis were calculated from the positional isotopomers of glycine measured by NMR. Results are expressed relative to methanol uptake, set arbitrarily to 1.0, and confidence intervals are given within brackets. TCA indicates tricarboxylic acid cycle; EMCP, ethylmalonyl-CoA pathway [23]; and GRC, glyoxylate regeneration cycle [20].

Discussion

The present study shows that glyoxylate regeneration in M. extorquens AM1 occurs by the ethylmalonyl-CoA pathway [23]. This conclusion was drawn from: (i) examination of the CoA thioesters present in cells during growth on methanol, (ii) kinetics of label incorporation in these CoA thioesters during short 13C-labeling experiments, and (iii) steady-state 13C-labeling experiments providing quantitative information with regard to the metabolic origin of glyoxylate carbon atoms, which essentially rule out a contribution of the GRC. The original LC-MS method developed here for direct analysis of the CoA thioesters proved to be critical to demonstrate the occurrence of specific intermediates of the ethylmalonyl-CoA pathway and to determine the sequence of reactions. It provides an analytical platform for the analysis of a large number of CoA thioesters, which are of broad interest not only for the study of methylotrophy but also for other metabolic purposes, such as CO2 assimilation in autotrophs [27], fatty acid and polyketide metabolisms, and bioremediation/degradation of aromatic compounds and hydrocarbons [28, 29]. The present work also emphasizes the complementarities of MS and NMR to resolve the topology of complex metabolic networks from 13C-labeling experiments. The present work demonstrates that the combination of the 2 methods is highly valuable, because the (increasing) sensitivity of mass spectrometers allows short-term labeling experiments to be carried out, resulting in dynamic information on a given metabolic network, whereas NMR is unique in providing detailed positional labeling information from which metabolic pathways can be directly identified and quantified. The sequence of reactions that converts acetyl-CoA into glyoxylate observed in M. extorquens AM1 in vivo is in agreement with that proposed for R. sphaeroides [23], where crotonyl-CoA is converted into ethylmalonyl-CoA by the crotonyl-CoA reductase/carboxylase [23]. The data do not rule out that this conversion could occur also via butyryl-CoA [20] and its carboxylation by propionyl-CoA carboxylase (Fig. 1), although the recent demonstration that crotonyl-CoA reductase/carboxylase is present in methanol-grown M. extorquens makes the former process more likely [23]. In any case, the pathway converts 2 acetyl-CoA molecules into glyoxylate and succinyl-CoA, and the carbon balance of the process is:

2 acetyl-CoA + 2 CO2 + 2 H = 1 glyoxylate⁻ + H⁺ + 1 succinyl-CoA

This balance highlights the occurrence of 2 net carboxylation steps in the ethylmalonyl-CoA pathway, which is different not only from the GRC, where carbon loss would occur, but also from the classical glyoxylate cycle [10], where no carboxylation occurs. The operation of the ethylmalonyl-CoA pathway has major implications for C1 assimilation in M. extorquens AM1 in terms of carbon balance and metabolic organization. The carbon balance of the whole process of C1 assimilation depends on the behavior of the succinyl-CoA molecule generated along with glyoxylate in the ethylmalonyl-CoA pathway. Erb et al. [23] proposed that succinyl-CoA was directly incorporated into cell material; however, the labeling data collected here for pure methylotrophic growth conditions indicate that succinyl-CoA is used to regenerate glyoxylate. Moreover, the positional isotopomers of glycine measured by NMR were consistent with a significant contribution of this process to glyoxylate regeneration, because the proportion of glyoxylate molecules regenerated from propionyl-CoA was calculated to be similar to that released by the cleavage of methylmalyl-CoA. These results indicate that not only 1 but 2 molecules of glyoxylate are regenerated by the ethylmalonyl-CoA pathway in M. extorquens AM1, with the following carbon balance:

1 acetyl-CoA + 2 CO2 = 2 glyoxylate⁻ + H⁺ + CoASH
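The carbon bookkeeping behind the two balances given in this section can be checked with a few lines of code. This is a minimal sketch, not part of the original analysis: it counts only the carbons of the acyl moiety of each CoA thioester (an assumption made here, since the CoA carrier is not built from the substrates), and the names `CARBONS`, `carbon_total`, and `is_carbon_balanced` are hypothetical.

```python
# Carbon atoms counted per molecule. Only the acyl part of each CoA thioester
# is counted; the CoA carrier, protons, and reducing equivalents carry none.
CARBONS = {
    "acetyl-CoA": 2,
    "succinyl-CoA": 4,
    "glyoxylate": 2,
    "CO2": 1,
    "H": 0,
    "H+": 0,
    "CoASH": 0,
}


def carbon_total(side):
    """Sum the carbon atoms on one side of a balance given as {species: coefficient}."""
    return sum(coeff * CARBONS[species] for species, coeff in side.items())


def is_carbon_balanced(substrates, products):
    """Return True when both sides of the balance carry the same number of carbons."""
    return carbon_total(substrates) == carbon_total(products)


# Balance yielding 1 glyoxylate and 1 succinyl-CoA from 2 acetyl-CoA.
ONE_GLYOXYLATE = (
    {"acetyl-CoA": 2, "CO2": 2, "H": 2},
    {"glyoxylate": 1, "H+": 1, "succinyl-CoA": 1},
)
# Balance yielding 2 glyoxylate when succinyl-CoA is used for regeneration.
TWO_GLYOXYLATE = (
    {"acetyl-CoA": 1, "CO2": 2},
    {"glyoxylate": 2, "H+": 1, "CoASH": 1},
)

for label, (lhs, rhs) in (("1 glyoxylate", ONE_GLYOXYLATE), ("2 glyoxylate", TWO_GLYOXYLATE)):
    print(label, carbon_total(lhs), carbon_total(rhs), is_carbon_balanced(lhs, rhs))
```

In both cases 2 of the carbons on the substrate side come from CO2, which is the net carboxylation discussed in the text.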

The demonstration of the ethylmalonyl-CoA pathway closes the serine cycle during methylotrophic growth, a problem that has been unsolved since 1963 [9]. The mechanism of C1 assimilation resulting from our observations (Fig. 6) shows that 2 molecules of glyoxylate can be regenerated at the same time. It also provides the molecular basis for the explanation of labeling data showing that roughly half the biomass carbon comes from CO2 during methylotrophic growth conditions under laboratory conditions [6, 30].

Fig. 7. Comparison of C1 assimilation pathways in isocitrate lyase-negative (ICL-; A), proposed from this study, and isocitrate lyase-positive (ICL+; B) serine cycle methylotrophs. Note that CO2 is derived from methanol, and therefore the overall carbon balance is the same in the 2 organisms: 3 methanol → 1 C3.

Therefore, the oxidation of methanol into CO2 occurring in the initial steps of methanol utilization appears to be critical not only for energy purposes but also for carbon assimilation. A comparison of the carbon balance in ICL-negative and ICL-positive serine cycle methylotrophs (Fig. 7) reveals that the former organisms have a higher CO2 fixation ability than the latter, although the energetic cost is likely to be higher as well. The high efficiency of carbon recovery during methylotrophic growth has major implications for the physiology of these widespread organisms and for their role in carbon cycling in their environment, and it poses questions as to what extent CO2 is recycled from endogenously formed methanol or the atmosphere under natural conditions.

Materials and Methods

Reagents, Medium Composition, and Culture Conditions. [13C]methanol (99%) and [12C]methanol (99.9%) were purchased from Cambridge Isotope Laboratories; D2O (99.8% and 99.97%) was purchased from Eurisotop. All other chemicals were purchased from Sigma. Acetonitrile, formic acid, and ammonium used for HPLC solvents were of LC-MS grade. M. extorquens AM1 was grown on minimal medium containing 1.62 g/L NH4Cl, 0.2 g/L MgSO4, 2.21 g/L K2HPO4, 1.25 g/L NaH2PO4·2H2O, and the following trace elements: 15 mg/L Na2EDTA·2H2O, 4.5 mg/L ZnSO4·7H2O, 0.3 mg/L CoCl2·6H2O, 1 mg/L MnCl2·4H2O, 1 mg/L H3BO3, 2.5 mg/L CaCl2, 0.4 mg/L Na2MoO4·2H2O, 3 mg/L FeSO4·7H2O, and 0.3 mg/L CuSO4·5H2O. All batch cultures were carried out in a 500-mL bioreactor (Infors-HT) at 28 °C and at 1000 rpm. The pH was kept constant at 7.0 by addition of 1 M NH4OH. For the purpose of short-term 13C-labeling experiments, cells were grown in 300 mL of medium containing 0.5% 13C-depleted methanol and were aerated with compressed air at 0.15 L/min. Initial optical density (OD600) was 0.2, and sampling was performed between ODs 2.8 and 3.0. Cultures carried out for the purpose of steady-state labeling experiments were grown in 400 mL of medium containing [13C]methanol and aerated with synthetic air containing 5% naturally labeled CO2. To keep the fraction of dissolved 13C-CO2 produced from methanol below 2%, the aeration rate was increased as follows: 0.2 L/min until OD 0.6, 0.4 L/min until OD 1, and 0.6 L/min until sampling. Initial OD was 0.01, and cells were harvested at around OD 1.5.

Sampling and Extraction of CoA Thioesters. A total of 1 mL of culture corresponding to 0.6 mg of cell dry weight was directly transferred into 4.5 mL of 95% acetonitrile at -20 °C containing 25 mM formic acid for quenching. To provide instantaneous quenching of metabolic activity, the sample was added into the quenching solution on a Vortex. Cells were disrupted by 3 sonication steps (30 s, 23 kHz) by using a Soniprep 150 device (Sanyo) and carried out in a cooling bath (T < -10 °C), with 30 s between each treatment. After the addition of 20 mL of ice-cold H2O, the sample was chilled with liquid nitrogen. Frozen samples were stored at -20 °C until freeze drying. Subsequently, 300 µL of an ice-cold, 25 mM ammonium formate buffer (pH 3.5, 2% MeOH) was added. The suspension was centrifuged (14,000 × g, 2 min, -5 °C), and the supernatant was filtered through a Sartorius Minisart filter (pore size 0.2 µm) before analysis.

LC-MS Analysis. Analyses were performed with a Rheos 2200 HPLC system (Flux Instruments) coupled to an LTQ Orbitrap mass spectrometer (Thermo Fisher Scientific), equipped with an electrospray ionization probe. CoA esters were separated with a C18 analytical column (Gemini 150 × 2.0 mm, particle size 3 µm; Phenomenex) at a flow rate of 220 µL·min-1. Injection volume was 10 µL. Solvent A was 50 mM formic acid adjusted to pH 8.1 with NH4OH, and solvent B was methanol. The following gradient of B was applied: 0 min, 5%; 1 min, 5%; 10 min, 23%; 20 min, 80%; 22 min, 80%. MS analysis was done in the negative FTMS mode at a resolution of 15,000 (m/z = 400) to determine mass isotopomer distribution patterns and at a resolution of 60,000 (m/z = 400) to identify CoA thioesters and to detect potential mass peak overlapping problems. Sheath gas flow rate was 40, aux gas flow rate was 10, tube lens was -90 V, capillary voltage was -4 V, and ion spray voltage was -4.7 kV. For the identification of CoA thioesters in cell extracts, chromatograms were analyzed for the presence of [M-H]+ ions corresponding to the exact mass expected from the 15 CoA thioesters potentially involved in glyoxylate regeneration, with a mass tolerance of 5 ppm. The number of carbon atoms was validated by using additional extracts from M. extorquens cells grown on [13C]methanol.
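The 5 ppm mass tolerance used for identification translates into an m/z window around each expected exact mass. The sketch below shows one way to apply such a criterion; it is not the vendor software's procedure, the function names are hypothetical, and the m/z value of 800.0 is an arbitrary illustrative number rather than the mass of any CoA thioester.

```python
def ppm_error(measured_mz, expected_mz):
    """Signed deviation of a measured m/z from the expected value, in ppm."""
    return (measured_mz - expected_mz) / expected_mz * 1e6


def mz_window(expected_mz, tolerance_ppm=5.0):
    """Lower and upper m/z bounds accepted at the given ppm tolerance."""
    delta = expected_mz * tolerance_ppm * 1e-6
    return expected_mz - delta, expected_mz + delta


def within_tolerance(measured_mz, expected_mz, tolerance_ppm=5.0):
    """True when the measured m/z lies inside the tolerance window."""
    return abs(ppm_error(measured_mz, expected_mz)) <= tolerance_ppm


# Arbitrary illustrative value, not a real CoA thioester mass.
expected = 800.0
low, high = mz_window(expected)
print(f"accepted window: {low:.4f} to {high:.4f}")
print(within_tolerance(800.003, expected))  # deviation of 3.75 ppm
print(within_tolerance(800.005, expected))  # deviation of 6.25 ppm
```

At m/z 800 the window is only ±0.004, which is why high-resolution measurements are needed for this kind of assignment.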

Short-Term 13C-Labeling Experiments. Incubation of cells with labeled acetate was realized in 2-mL Eppendorf tubes containing 50 µL of 1.05 M [1-13C]acetate. To start, 1 mL of culture growing on [12C]methanol was quickly added with a syringe in the Eppendorf tube, and the sample was vortexed. A final acetate concentration of 50 mM was chosen to reach the same concentration as methanol. The acetate homogenization time in the final sample was found to be 1.1 ± 0.2 seconds. After various incubation times, the solution was added to the quenching solution, and the CoA thioesters were analyzed as explained above. The efficiency of the quenching process was evaluated by examining label incorporation into CoA thioesters in a culture sample injected in the quenching solution containing [1-13C]acetate without an incubation period and revealing no incorporation of label.
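The stated final acetate concentration follows from a simple dilution. As a quick check, reading the stock volume as 50 µL of 1.05 M acetate mixed with 1 mL of culture (the microliter unit is an assumption where the extracted text lost the prefix), the calculation can be sketched as below; the function name is hypothetical.

```python
def final_concentration_mM(stock_molar, stock_uL, sample_uL):
    """Concentration (mM) after mixing a stock solution with a sample volume."""
    total_uL = stock_uL + sample_uL
    return 1000.0 * stock_molar * stock_uL / total_uL


# 50 µL of 1.05 M [1-13C]acetate plus 1 mL (1000 µL) of culture.
acetate_mM = final_concentration_mM(1.05, 50.0, 1000.0)
print(f"final acetate concentration: {acetate_mM:.1f} mM")
```

The slightly odd stock concentration of 1.05 M compensates for the 5% volume increase, giving 50 mM in the final 1.05 mL.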

Calculations of Normalized 13C-Label Incorporation. The incorporation of 13C label in each CoA ester during short-term 13C-labeling experiments was calculated from the analysis of the corresponding isotopic cluster in the mass spectra. The data were corrected for naturally occurring isotopes, the contribution of which was determined from the analysis of samples collected just before the addition of [1-13C]acetate. Results are expressed as percent of 13C atoms incorporated in the molecule. Because the different CoA esters do not receive the same number of 13C atoms from [1-13C]acetate, results were normalized according to the maximum number of 13C atoms that can be received from [1-13C]acetate in the considered molecule ("normalized fractional labeling," L1-Ac):

$$L_{1\text{-}Ac} = \frac{\sum_{i=1}^{n} M_i \cdot i}{n}$$

where Mi is the proportion of the mass isotopomer corresponding to molecules having incorporated i 13C atoms from [1-13C]acetate, and n is the maximum number of [13C]carbon atoms that can be incorporated into the molecule from [1-13C]acetate (note that for some compounds, n is different according to GRC or EMCP).
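The normalized fractional labeling defined above can be computed directly from a corrected mass isotopomer distribution. The following is a minimal sketch under stated assumptions: the input is already corrected for naturally occurring isotopes and given in percent, the function name is hypothetical, and the example distribution is invented for illustration rather than taken from the measurements.

```python
def normalized_fractional_labeling(mass_isotopomers, n):
    """Compute L_1-Ac (%) from mass isotopomer proportions M0, M1, ... (%).

    mass_isotopomers[i] is the proportion of molecules carrying i 13C atoms
    from [1-13C]acetate; n is the maximum number of such atoms the molecule
    can receive.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if len(mass_isotopomers) < n + 1:
        raise ValueError("need proportions for M0 through Mn")
    # Weighted sum of labeled isotopomers, divided by the maximal label count.
    return sum(i * mass_isotopomers[i] for i in range(1, n + 1)) / n


# Hypothetical distribution for a metabolite that can receive up to 2 labels:
# 60% unlabeled, 30% singly labeled, 10% doubly labeled.
print(normalized_fractional_labeling([60.0, 30.0, 10.0], n=2))
```

Dividing by n puts metabolites that can receive different numbers of label atoms on a common 0 to 100% scale, which is what allows their time courses to be compared in one plot.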

NMR Analyses. All 1D and 2D NMR spectra were recorded on a Bruker Avance II 500-MHz spectrometer using a 5-mm z-gradient BBI probe head. The data were acquired and processed by using TOPSPIN 1.3 software (Bruker). The temperature was 298 K. The 1D 1H spectra were acquired by using a 30° pulse, a 5,000-Hz sweep width, and a 3.27-s acquisition time. A total of 128 scans were recorded, and the relaxation delay between scans was 10 s. The proteinogenic amino acid sample was prepared as described previously [25] and modified as explained in the SI Text. Positional isotopomers of proteinogenic glycine were measured from (i) the analysis of carbon–carbon couplings in 2D [1H-13C] HSQC spectra and (ii) the analysis of heteronuclear 1H,13C couplings in 2D ZQF-TOCSY spectra [25, 26]. Peak deconvolution was performed with the software GOSA-fit (www.biol-log.biz/). The value of the glycine 2JH2C1 coupling constant was determined from the 1D 1H analysis of 0.45 M [U-13C, 15N]glycine performed with and without COOH decoupling.

Flux Calculations. For the purpose of flux calculations, a reaction network describing C1 metabolism in M. extorquens AM1, including simplified reactions for biomass formation, was designed. This model described the stoichiometry of the reactions as well as the transitions of carbon atoms. Biomass requirements were obtained from Van Dien and Lidstrom [31]. Relative flux distributions were calculated from the positional isotopomers of glycine collected by NMR using the 13C-Flux software developed by Wiechert [32]. Results are expressed as molar fluxes relative to the rate of methanol uptake (set arbitrarily to 1.0).
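Expressing fluxes relative to methanol uptake amounts to dividing every flux by the uptake flux. The sketch below only illustrates this normalization and is not the 13C-Flux procedure; the reaction names and flux values are hypothetical placeholders, and `relative_fluxes` is a name introduced here.

```python
def relative_fluxes(absolute_fluxes, reference="methanol_uptake"):
    """Express every flux relative to the reference flux (reference becomes 1.0)."""
    ref = absolute_fluxes[reference]
    if ref <= 0:
        raise ValueError("reference flux must be positive")
    return {name: value / ref for name, value in absolute_fluxes.items()}

# Hypothetical absolute fluxes in arbitrary units, for illustration only
example = {"methanol_uptake": 12.0, "serine_cycle": 6.0, "co2_release": 3.0}
rel = relative_fluxes(example)
print(rel)
```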

Acknowledgements

We thank T. Erb (University of Freiburg, Freiburg, Germany) for the generous gift of several CoA standards, and S. Sokol (Institut National des Sciences Appliquées, Toulouse, France) for fruitful discussions and support in metabolic simulations. This work was supported by Eidgenössische Technische Hochschule Zurich Research Grant ETH-25 08–2. Evonik Degussa GmbH and North Rhine-Westphalia, cofinanced by the European Union, are acknowledged for supporting the development of LC-MS analytics. The Swiss Academy of Engineering Science (SATW) and the Centre Français pour l'Accueil et les Echanges Internationaux (Egide) supported the work with a travel grant (Germaine de Staël program). The work carried out at the Laboratory for BioSystems & Process Engineering (Toulouse, France) was supported by grants from the Agence Nationale de la Recherche, the Région Midi-Pyrénées, and the European Regional Development Fund (ERDF).

References

1. Schrader, J., et al., Methanol-based industrial biotechnology: current status and future perspectives of methylotrophic bacteria. Trends Biotechnol, 2009. 27(2): p. 107-15.
2. Anthony, C., The Biochemistry of Methylotrophs. 1982, London: Academic Press.
3. Chistoserdova, L., et al., C1 transfer enzymes and coenzymes linking methylotrophic bacteria and methanogenic Archaea. Science, 1998. 281(5373): p. 99-102.
4. Chistoserdova, L., et al., Methylotrophy in Methylobacterium extorquens AM1 from a genomic point of view. J Bacteriol, 2003. 185(10): p. 2980-7.
5. Vorholt, J.A., Cofactor-dependent pathways of formaldehyde oxidation in methylotrophic bacteria. Arch Microbiol, 2002. 178(4): p. 239-49.
6. Large, P.J., D. Peel, and J.R. Quayle, Microbial growth on C1 compounds. II. Synthesis of cell constituents by methanol- and formate-grown Pseudomonas AM 1, and methanol-grown Hyphomicrobium vulgare. Biochem J, 1961. 81: p. 470-80.
7. Large, P.J., D. Peel, and J.R. Quayle, Microbial growth on C(1) compounds. 3. Distribution of radioactivity in metabolites of methanol-grown Pseudomonas AM1 after incubation with [C]methanol and [C]bicarbonate. Biochem J, 1962. 82(3): p. 483-8.
8. Large, P.J., D. Peel, and J.R. Quayle, Microbial growth on C(1) compounds. 4. Carboxylation of phosphoenolpyruvate in methanol-grown Pseudomonas AM1. Biochem J, 1962. 85(1): p. 243-50.
9. Large, P.J. and J.R. Quayle, Microbial growth on C(1) compounds. 5. Enzyme activities in extracts of Pseudomonas AM1. Biochem J, 1963. 87(2): p. 386-96.
10. Kornberg, H.L. and H.A. Krebs, Synthesis of cell constituents from C2-units by a modified tricarboxylic acid cycle. Nature, 1957. 179(4568): p. 988-91.
11. Dunstan, P.M., C. Anthony, and W.T. Drabble, Microbial metabolism of C1 and C2 compounds. The involvement of glycollate in the metabolism of ethanol and of acetate by Pseudomonas AM1. Biochem J, 1972. 128(1): p. 99-106.
12. Dunstan, P.M., C. Anthony, and W.T. Drabble, Microbial metabolism of C1 and C2 compounds. The role of glyoxylate, glycollate and acetate in the growth of Pseudomonas AM1 on ethanol and on C1 compounds. Biochem J, 1972. 128(1): p. 107-15.
13. Dunstan, P.M. and C. Anthony, Microbial growth on C-1 and C2 compounds: the metabolism of acetate to glycine in Pseudomonas AM1. Biochem J, 1972. 130(1): p. 31P.
14. Dunstan, P.M. and C. Anthony, Microbial metabolism of C1 and C2 compounds. The role of acetate during growth of Pseudomonas AM1 on C1 compounds, ethanol and beta-hydroxybutyrate. Biochem J, 1973. 132(4): p. 797-801.
15. Kornberg, H.L. and J. Lascelles, The formation of isocitratase by the Athiorhodaceae. J Gen Microbiol, 1960. 23: p. 511-7.
16. Albers, H. and G. Gottschalk, Acetate metabolism in Rhodopseudomonas gelatinosa and several other Rhodospirillaceae. Arch Microbiol, 1976. 111(1-2): p. 45-9.
17. Han, L. and K.A. Reynolds, A novel alternate anaplerotic pathway to the glyoxylate cycle in streptomycetes. J Bacteriol, 1997. 179(16): p. 5157-64.
18. Claassen, P.A.M. and A.J.B. Zehnder, Isocitrate lyase activity in Thiobacillus versutus grown anaerobically on acetate and nitrate. J Gen Microbiol, 1986. 132: p. 3179-85.
19. Korotkova, N., L. Chistoserdova, and M.E. Lidstrom, Poly-beta-hydroxybutyrate biosynthesis in the facultative methylotroph Methylobacterium extorquens AM1: identification and mutation of gap11, gap20, and phaR. J Bacteriol, 2002. 184(22): p. 6174-81.
20. Korotkova, N., et al., Glyoxylate regeneration pathway in the methylotroph Methylobacterium extorquens AM1. J Bacteriol, 2002. 184(6): p. 1750-8.
21. Alber, B.E., et al., Study of an alternate glyoxylate cycle for acetate assimilation by Rhodobacter sphaeroides. Mol Microbiol, 2006. 61(2): p. 297-309.
22. Meister, M., et al., L-malyl-coenzyme A/beta-methylmalyl-coenzyme A lyase is involved in acetate assimilation of the isocitrate lyase-negative bacterium Rhodobacter capsulatus. J Bacteriol, 2005. 187(4): p. 1415-25.
23. Erb, T.J., et al., Synthesis of C5-dicarboxylic acids from C2-units involving crotonyl-CoA carboxylase/reductase: the ethylmalonyl-CoA pathway. Proc Natl Acad Sci U S A, 2007. 104(25): p. 10631-6.
24. Hacking, A.J. and J.R. Quayle, Purification and properties of malyl-coenzyme A lyase from Pseudomonas AM1. Biochem J, 1974. 139(2): p. 399-405.
25. Massou, S., et al., Application of 2D-TOCSY NMR to the measurement of specific 13C-enrichments in complex mixtures of 13C-labeled metabolites. Metab Eng, 2007. 9(3): p. 252-7.
26. Massou, S., et al., NMR-based fluxomics: quantitative 2D NMR methods for isotopomers analysis. Phytochemistry, 2007. 68(16-18): p. 2330-40.
27. Berg, I.A., et al., A 3-hydroxypropionate/4-hydroxybutyrate autotrophic carbon dioxide assimilation pathway in Archaea. Science, 2007. 318(5857): p. 1782-6.
28. Kniemeyer, O., et al., Anaerobic oxidation of short-chain hydrocarbons by marine sulphate-reducing bacteria. Nature, 2007. 449(7164): p. 898-901.
29. Boll, M., G. Fuchs, and J. Heider, Anaerobic oxidation of aromatic compounds and hydrocarbons. Curr Opin Chem Biol, 2002. 6(5): p. 604-11.
30. Crowther, G.J., G. Kosaly, and M.E. Lidstrom, Formate as the Main Branchpoint for Methylotrophic Metabolism in Methylobacterium extorquens AM1. J Bacteriol, 2008.
31. Van Dien, S.J. and M.E. Lidstrom, Stoichiometric model for evaluating the metabolic capabilities of the facultative methylotroph Methylobacterium extorquens AM1, with application to reconstruction of C(3) and C(4) metabolism. Biotechnol Bioeng, 2002. 78(3): p. 296-312.
32. Wiechert, W., et al., A universal framework for 13C metabolic flux analysis. Metab Eng, 2001. 3(3): p. 265-83.


Supporting Information

SI Text

Theoretical Fate of Carbon in the Central Metabolism of Methylobacterium extorquens AM1. The steady-state 13C labeling experiment was designed by simulating the theoretical labeling of glycine obtained through the model of the central metabolic network (serine cycle, glyoxylate regeneration, and citric acid cycle) of M. extorquens AM1 from [13C]methanol and CO2 at natural abundance. A simplified explanation of how the labeling patterns produced by the glyoxylate regeneration cycle (GRC) and the ethylmalonyl-CoA pathway (EMCP) differ is given below and illustrated in Fig. S1.

According to the glyoxylate regeneration cycle (1), 1 succinyl-CoA is generated from 2 acetyl-CoA molecules plus 1 CO2 and contains predominantly 2 carbon atoms coming from methanol and 2 from CO2. The conversion of succinyl-CoA into malate and its cleavage into glyoxylate plus acetyl-CoA occurs via the formation of a symmetrical intermediate, namely succinate, which distributes the label equally between the C1 and C2 positions of glyoxylate. By operation of the GRC (1), the 2 major isotopomers are expected to be 12C1-13C2 and 13C1-12C2, and they should represent 41.1% and 33.3%, respectively, of all glyoxylate isotopomers (Table 1).

According to the ethylmalonyl-CoA pathway (2), glyoxylate is generated by cleavage of methylmalyl-CoA into propionyl-CoA and glyoxylate, but a second glyoxylate molecule can potentially be generated from succinyl-CoA. The glyoxylate directly generated by cleavage of methylmalyl-CoA is predominantly singly labeled (expected isotopomer: 12C1-13C2). The glyoxylate molecule generated from succinyl-CoA can have 2 different isotopic patterns: one (12C1-13C2) is singly labeled, and the other (12C1-12C2) is unlabeled. By operation of the EMCP, the 2 major isotopomers are expected to be 12C1-13C2 and 12C1-12C2, and their steady-state proportions should represent 57.1% and 39.1%, respectively, of all glyoxylate isotopomers.

Extraction of Proteinogenic Amino Acids. Proteinogenic amino acids were extracted as described previously (3) for Escherichia coli, with the following adaptations for M. extorquens AM1. An entire steady-state culture (400 mL) was centrifuged at 4,600 × g, washed with 0.1 M NaCl solution, and centrifuged again. The cells were disrupted by 3 cycles of freezing (liquid N2 for 10 s) and thawing (10 °C for 2 min). The treated cells were resuspended in 20 mM Tris-HCl (pH 7.6) and disrupted again by bead-beating with 0.1-mm zirconia/silica beads. After removal of the beads by brief centrifugation, the final extraction step was performed by sonication at 4 °C: 3 times (30 s, 23 kHz) with breaks of 30 s. The cell debris was removed by ultracentrifugation (33,000 × g, 30 min, 4 °C), and the proteins were precipitated with ethanol (70% final) and centrifuged (33,000 × g, 30 min, 4 °C). The pellet was suspended in 6 M HCl and hydrolyzed for 14 h at 107 °C. The acid was removed by evaporation, and labile protons were exchanged 3 times with deuterium by successive resuspensions in 2 mL of 99.8% D2O followed by lyophilization. The sample was finally resuspended in 600 µL of 99.97% D2O.

1. Korotkova N, Chistoserdova L, Kuksa V, Lidstrom ME (2002) Glyoxylate regeneration pathway in the methylotroph Methylobacterium extorquens AM1. J Bacteriol 184:1750–1758.
2. Erb TJ, et al. (2007) Synthesis of C5-dicarboxylic acids from C2-units involving crotonyl-CoA carboxylase/reductase: The ethylmalonyl-CoA pathway. Proc Natl Acad Sci USA 104:10631–10636.
3. Massou S, Nicolas C, Letisse F, Portais JC (2007) Application of 2D-TOCSY NMR to the measurement of specific 13C-enrichments in complex mixtures of 13C-labeled metabolites. Metab Eng 9:252–257.

Fig. S1. Predicted fate of methanol and CO2 carbons during methylotrophic growth. (Left) GRC. (Right) EMCP. Symbols indicate carbon derived from methanol (gray triangles), carbon derived from or released as CO2 (gray circles), and carbon of unspecified origin (open triangles).


Fig. S2. Steady-state distribution of glycine isotopomers in [13C]methanol-grown cells of M. extorquens AM1. (A) Typical 1D 13C NMR pseudospectrum of glycine C2 extracted from the 2D 1H-13C HSQC NMR analysis of proteinogenic amino acids. (B) Typical 1D 1H spectrum of glycine H2 protons extracted from the 2D ZQF-TOCSY NMR analysis of proteinogenic amino acids. (C) 13C1 enrichment determination via the 2JH2C1 couplings observed in the 1D 1H spectrum of glycine H2.

Table S1. Summary of enzymes and their corresponding genes proposed for glyoxylate formation according to the GRC and the EMCP

| N° | Reaction name | Gene identified | Ref. | Gene essential | Protein identified | Ref. | Reaction measured | Ref. |
|---|---|---|---|---|---|---|---|---|
| A | β-Ketothiolase | phaA* | (1) | - | - | | +*,† | (1, 2) |
| B | Acetoacetyl-CoA reductase | phaB* | (1) | + | - | | +*,† | (1, 2) |
| C | (R)-3-Hydroxybutyryl-CoA dehydratase | croR* | (3) | + | - | | +* | (1, 3) |
| D | Crotonyl-CoA reductase | ccr*,† | (3, 4) | + | +† | (4) | +† | (4, 5) |
| E | Butyryl-CoA carboxylase | pccA*, pccB* | (3) | + | - | | +* | (3) |
| F | Crotonyl-CoA reductase/carboxylase | ccr† | (3, 4) | + | +† | (4) | +† | (4, 5) |
| G | Ethylmalonyl-CoA epimerase | epm(epi)*,† | (6, 7) | + | +† | (7) | +† | (7) |
| H | Ethylmalonyl-CoA mutase | meaA(ecm)*,† | (6, 7) | + | +† | (7) | +† | (7) |
| I | Methylsuccinyl-CoA decarboxylase | - | | | - | | - | |
| J | Isobutyryl-CoA dehydrogenase | ibd2* | (3) | + | - | | - | |
| K | Methacrylyl-CoA hydratase | meaC* | (8) | + | - | | - | |
| L | (2S)-Methylsuccinyl-CoA dehydrogenase | ibd2† | (2, 3, 8) | + | +† | (2) | - | |
| M | Mesaconyl-CoA hydratase | meaC(mch)*,† | (2, 8, 9) | + | +† | (2, 9) | +† | (9) |
| N | β-Methylmalyl-CoA lyase | mcl1*,† | (6, 10) | + | +*,† | (10, 11) | +*,† | (10, 11) |
| O | Propionyl-CoA carboxylase | pccA*, pccB* | (3) | + | - | | +* | (3) |
| P | Methylmalonyl-CoA epimerase | epm(epi)*,† | (6, 7) | + | +† | (7) | +† | (7) |
| Q | Methylmalonyl-CoA mutase | mcmA*, mcmB* | (12) | + | +*,† | (2, 12) | +* | (12) |

Letters indicating the different reactions are according to Fig. 1. *Data from M. extorquens AM1. †Data from R. sphaeroides. (Gene name), synonym in R. sphaeroides.

1. Korotkova N, Chistoserdova L, Kuksa V, Lidstrom ME (2002) J Bacteriol 184:6174–6181.
2. Alber BE, Spanheimer R, Ebenau-Jehle C, Fuchs G (2006) Mol Microbiol 61:297–309.
3. Korotkova N, Chistoserdova L, Kuksa V, Lidstrom ME (2002) J Bacteriol 184:1750–1758.
4. Erb TJ, et al. (2007) Proc Natl Acad Sci USA 104:10631–10636.
5. Han L, Reynolds KA (1997) J Bacteriol 179:5157–5164.
6. Chistoserdova L, Chen SW, Lapidus A, Lidstrom ME (2003) J Bacteriol 185:2980–2987.
7. Erb TJ, Retey J, Fuchs G, Alber BE (2008) J Biol Chem 283:32283–32293.
8. Korotkova N, Lidstrom ME, Chistoserdova L (2005) J Bacteriol 187:1523–1526.
9. Zarzycki J, et al. (2008) J Bacteriol 190:1366–1374.
10. Meister M, Saum S, Alber BE, Fuchs G (2005) J Bacteriol 187:1415–1425.
11. Hacking AJ, Quayle JR (1974) Biochem J 139:399–405.
12. Korotkova N, Lidstrom ME (2004) J Biol Chem 279:13652–13658.

Table S2. Identification of CoA thioesters by stable isotope assignment and spiking of standards using LC-HRMS

| CoA ester | Formula | [M0-H]- theor. mass | [M0-H]- observed mass | [MU-H]- theor. mass | [MU-H]- observed mass | Cn | Confirmation by spiking | Occurrence in GRC | Occurrence in EMCP | Detection in MeOH-grown cells |
|---|---|---|---|---|---|---|---|---|---|---|
| Acetyl- | C23H38N7O17P3S | 808.1185 | 808.1182 | 831.1957 | 831.1946 | 23 | + | + | + | + |
| Propionyl- | C24H40N7O17P3S | 822.1341 | 822.1350 | 846.2147 | 846.2127 | 24 | + | + | + | + |
| Crotonyl- | C25H40N7O17P3S | 834.1341 | 834.1338 | 859.2180 | 859.2143 | 25 | + | + | + | + |
| Methacrylyl- | C25H40N7O17P3S | 834.1341 | 834.1324 | 859.2158 | n.d. | 25 | + | + | - | - |
| Butyryl- | C25H42N7O17P3S | 836.1498 | 836.1501 | 861.2337 | 861.2291 | 25 | + | + | - | + |
| Isobutyryl- | C25H42N7O17P3S | 836.1498 | 836.1502 | 861.2337 | n.d. | 25 | + | + | - | (+)* |
| Acetoacetyl- | C25H40N7O18P3S | 850.1291 | 850.1292 | 875.2129 | n.d. | 25 | + | + | + | - |
| β-Hydroxybutyryl- | C25H42N7O18P3S | 852.1447 | 852.1451 | 877.2286 | 877.2266 | 25 | + | + | + | + |
| β-Hydroxyisobutyryl- | C25H42N7O18P3S | 852.1447 | n.d. | 877.2286 | n.d. | 25 | -† | + | - | - |
| Succinyl- | C25H40N7O19P3S | 866.1240 | 866.1239 | 891.2078 | 891.2058 | 25 | + | + | + | + |
| Methylmalonyl- | C25H40N7O19P3S | 866.1240 | 866.1246 | 891.2078 | 891.2049 | 25 | + | + | + | + |
| Malyl- | C25H40N7O20P3S | 882.1189 | 882.1155 | 907.2028 | 907.2024 | 25 | + | + | + | + |
| Mesaconyl- | C26H40N7O19P3S | 878.1240 | 878.1234 | 904.2112 | 904.2109 | 26 | + | - | + | + |
| Ethylmalonyl- | C26H42N7O19P3S | 880.1396 | 880.1397 | 906.2269 | 906.2260 | 26 | + | + | + | + |
| Methylsuccinyl- | C26H42N7O19P3S | 880.1396 | 880.1366 | 906.2269 | 906.2234 | 26 | + | + | + | + |
| β-Methylmalyl- | C26H42N7O20P3S | 896.1334 | 896.1315 | 922.2218 | 922.2209 | 26 | + | - | + | + |

CoA thioesters proposed to be involved in the glyoxylate regeneration pathway of M. extorquens AM1 according to the GRC and the EMCP. Cn indicates the number of carbon atoms in the molecule; M0, monoisotopic mass peak of the compound at natural isotopic abundance; MU, mass peak of the uniformly 13C-labeled compound; n.d., not detected. *Not always present. †No standard available.
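The two checks behind Table S2 can be illustrated with a short calculation: the deviation of an observed mass from the theoretical mass in ppm (to compare against the 5 ppm tolerance), and the mass expected for the uniformly 13C-labeled ion, which is the monoisotopic mass plus Cn times the 13C-12C mass difference. The sketch below uses the acetyl-CoA row of the table; the function names are introduced here for illustration, and the 13C-12C mass difference of 1.003355 Da is a standard value, not taken from the text.

```python
C13_C12_MASS_DIFF = 1.003355  # Da, mass difference between 13C and 12C

def ppm_error(theoretical, observed):
    """Signed deviation of an observed m/z from the theoretical m/z, in ppm."""
    return (observed - theoretical) / theoretical * 1e6

def uniformly_labeled_mass(m0, n_carbons):
    """Expected m/z of the fully 13C-labeled singly charged ion with n_carbons carbons."""
    return m0 + n_carbons * C13_C12_MASS_DIFF

# Acetyl-CoA values from Table S2: [M0-H]- theoretical and observed, Cn = 23
acetyl_ppm = ppm_error(808.1185, 808.1182)
acetyl_mu = uniformly_labeled_mass(808.1185, 23)
print(f"deviation: {acetyl_ppm:.2f} ppm, within tolerance: {abs(acetyl_ppm) <= 5}")
print(f"expected [MU-H]-: {acetyl_mu:.4f}")
```

Matching the observed shift between M0 and MU to a whole number of carbons is what allows Cn, and hence the formula, to be validated from extracts of [13C]methanol-grown cells.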


CHAPTER III

Genome analysis of Methylobacterium extorquens

Stéphane Vuilleumier, Ludmila Chistoserdova, Ming-Chun Lee, Françoise Bringel, Aurélie Lajus, Yang Zhou, Benjamin Gourion, Valérie Barbe, Jean Chang, Stéphane Cruveiller, Carole Dossat, Will Gillett, Christelle Gruffaz, Eric Haugen, Edith Hourcade, Ruth Levy, Sophie Mangenot, Emilie Muller, Thierry Nadalig, Marco Pagni, Christian Penny, Rémi Peyraud, David G. Robinson, David Roche, Zoé Rouy, Channakhone Saenampechek, Grégory Salvignol, David Vallenet, Zaining Wu, Christopher J. Marx, Julia A. Vorholt, Maynard V. Olson, Rajinder Kaul, Jean Weissenbach, Claudine Médigue, Mary E. Lidstrom

Published in: Vuilleumier S, Chistoserdova L, Lee MC, Bringel F, Lajus A, Zhou Y, Gourion B, Barbe V, Chang J, Cruveiller S, Dossat C, Gillett W, Gruffaz C, Haugen E, Hourcade E, Levy R, Mangenot S, Muller E, Nadalig T, Pagni M, Penny C, Peyraud R, Robinson DG, Roche D, Rouy Z, Saenampechek C, Salvignol G, Vallenet D, Wu Z, Marx CJ, Vorholt JA, Olson MV, Kaul R, Weissenbach J, Médigue C, Lidstrom ME. Methylobacterium Genome Sequences: A reference blueprint to investigate microbial metabolism of C1 compounds from natural and industrial sources. PLoS ONE, 4(5): e5584.

Contribution by RP: Manual expert annotation of genes involved in metabolism of the two strains, comparative genomic analyses, and contribution to writing the manuscript.

Abstract

Background: Methylotrophy describes the ability of organisms to grow on reduced organic compounds without carbon-carbon bonds. The genomes of two pink-pigmented facultative methylotrophic bacteria of the Alpha-proteobacterial genus Methylobacterium, the reference species Methylobacterium extorquens strain AM1 and the dichloromethane-degrading strain DM4, were compared.

Methodology/Principal Findings: The 6.88 Mb genome of strain AM1 comprises a 5.51 Mb chromosome, a 1.26 Mb megaplasmid and three plasmids, while the 6.12 Mb genome of strain DM4 features a 5.94 Mb chromosome and two plasmids. The chromosomes are highly syntenic and share a large majority of genes, while plasmids are mostly strain-specific, with the exception of a 130 kb region of the strain AM1 megaplasmid which is syntenic to a chromosomal region of strain DM4. Both genomes contain large sets of insertion elements, many of them strain-specific, suggesting an important potential for genomic plasticity. Most of the genomic determinants associated with methylotrophy are nearly identical, with two exceptions that illustrate the metabolic and genomic versatility of Methylobacterium. A 126 kb dichloromethane utilization (dcm) gene cluster is essential for the ability of strain DM4 to use DCM as the sole carbon and energy source for growth and is unique to strain DM4. The methylamine utilization (mau) gene cluster is only found in strain AM1, indicating that strain DM4 employs an alternative system for growth with methylamine. The dcm and mau clusters represent two of the chromosomal genomic islands (AM1: 28; DM4: 17) that were defined. The mau cluster is flanked by mobile elements, but the dcm cluster disrupts a gene annotated as chelatase and for which we propose the name "island integration determinant" (iid).

Conclusion/Significance: These two genome sequences provide a platform for intra- and interspecies genomic comparisons in the genus Methylobacterium, and for investigations of the adaptive mechanisms which allow bacterial lineages to acquire methylotrophic lifestyles.

Introduction

Pink-pigmented facultative methylotrophs of the genus Methylobacterium are ubiquitous in soil, air and water environments [1]. The common trait of all Methylobacterium species is the ability to grow on one or several reduced one-carbon (C1) compounds other than methane, most prominently methanol, which is a major volatile organic compound emitted by vegetation [2]. Accordingly, strains of Methylobacterium are often found in association with plants, either involved in bona fide symbioses as endophytes, or as epiphytes on leaf surfaces [3-7]. The potential of strains from this genus to provide biotechnological products of high added value has attracted sustained scientific attention [8,9]. Of all Methylobacterium strains, M. extorquens strain AM1 (formerly Pseudomonas AM1, Methylobacterium sp. AM1) is the best studied, and has served as a model organism for over four decades. It was first isolated in 1960 in Oxford, England, as an airborne contaminant growing on methylamine [10]. It was then used as a workhorse to characterize the serine cycle for assimilation of the C1-unit of methylene tetrahydrofolate, a central intermediate in methylotrophic metabolism, and more recently the ethylmalonyl-CoA pathway for glyoxylate regeneration [8,11-13] (Fig. 1). Enzymatic systems for oxidation of both methanol [14,15] and methylamine [16], which involve the use of specific cofactors pyrroloquinoline quinone (PQQ) and tryptophylquinone (TTQ), respectively [17], were characterized in strain AM1. Bacterial tetrahydromethanopterin (H4MPT)-dependent enzymes, now known to occur in most methylotrophs [18,19] but originally thought to be unique to archaeal methanogens [20], were also first demonstrated in this strain. In Methylobacterium, the H4MPT-dependent pathway has been shown to play a major role in both energy generation and protection of cells from formaldehyde poisoning [21]. As to the analogous tetrahydrofolate (H4F)-linked pathway that involves two enzymes encoded by mtdA and fch, also first discovered in this organism [22,23], its major role in assimilatory metabolism was recently identified, in supplying C1 units into the serine cycle [24,25] (Fig. 1).


Fig. 1. Central pathways for carbon conversion in Methylobacterium during methylotrophic growth. Full lines, H4MPT-dependent pathway, H4F-dependent pathway, serine cycle and ethylmalonyl-CoA pathway for glyoxylate regeneration; broken line, tricarboxylic acid cycle reactions (2-oxoglutarate dehydrogenase activity (and a dissimilatory TCA cycle) are not essential for methylotrophic growth [93,94]). Key pathway outputs [13] used for carbon assimilation (biomass production) are shown in bold italics. Genes involved in the serine cycle, the TCA cycle and in the ethylmalonyl-CoA pathway are indicated. Genes given on the same line and not separated by hyphens are closely associated on the chromosome. Genes and their arrangement on the chromosome are strongly conserved in strains AM1 and DM4 (see Suppl. Table S1), except the mau cluster for methylamine utilization and the dcm cluster for dichloromethane utilization which are unique to strain AM1 and strain DM4, respectively.

Draft genome data for M. extorquens AM1 have been available since 2003 [11] and have enabled transcriptomic and proteomic approaches (see e.g. [26,27]), as well as metabolomic studies (see [24,28,29]). Combined with the large complement of genetic tools developed for Methylobacterium (see e.g. [30,31]), this has established M. extorquens AM1 as a model for systems-level investigations. Methylobacterium strain DM4 was isolated from industrial wastewater sludge in Switzerland, as part of efforts to characterize microorganisms able to degrade the organohalogenated pollutant dichloromethane (DCM) [32]. Unlike methanol and methylamine, which are mainly produced naturally, DCM is better known as a synthetic compound [33,34]. Rated as potentially carcinogenic for humans and the most highly produced chlorinated organic compound (http://www.eurochlor.org/solvents), DCM is highly volatile (b.p. 38°C) and water-soluble, making it a widespread contaminant in the environment [35]. Aerobic methylotrophic bacteria capable of using DCM as the sole source of carbon and energy [36] express high levels of DCM dehalogenase, which transforms DCM into formaldehyde and two molecules of HCl [37]. The genotoxic effects of DCM in both mammals [38] and bacteria [39] are due to a short-lived intermediate in the enzymatic transformation of DCM to formaldehyde [40]. Growth with DCM, as the main trait that distinguishes strain DM4 from M. extorquens AM1, has led to its classification as a separate Methylobacterium species [41]. Based on 16S rRNA gene sequence and DNA-DNA relatedness, it was recently proposed that strain DM4 should be reclassified as M. extorquens [42]. The primary objective of this work was to define a fully assembled and annotated reference genomic blueprint for Methylobacterium, to assist future experimental investigations of methylotrophic metabolism by global approaches.
We report here complete genomic sequences of strains AM1 and DM4, and describe the genomic make-up and potential for genomic plasticity that underlie the extensive capacity of Methylobacterium for physiological adaptation to methylotrophic lifestyles. Availability of complete genomic sequences of the two strains also provides the opportunity to define the conserved complements of genes associated with methylotrophy, and to investigate the differences between the two strains associated with strain-specific adaptations.

Results and discussion

Genomic structure. The genome of M. extorquens AM1 totals 6.88 Mb and consists of five replicons: a chromosome of 5.51 Mbp (Acc. No. CP001510), a megaplasmid of 1.26 Mbp (CP001511), and three plasmids (25 kb (CP001512), 38 kb (CP001513), and 44 kb (CP001514)), with an average GC content of 68.5% (Table 1, Fig. 2). The genome of strain DM4 is somewhat smaller (6.12 Mb), and features only three replicons: a chromosome of 5.94 Mbp (Acc. No. FP103042) and two plasmids (141 kb (FP103043) and 38 kb (FP103044)), with an average GC content of 68.0% (Table 1, Fig. 2). Based on their sizes and the relative distribution of sequencing reads for each replicon, plasmids p1META and p2META are predicted to be present at 2-3 copies per strain AM1 genome, the replicon p3META at 1-2 copies per genome, and the megaplasmid at one copy per genome. Predicted copy numbers of 0.4-0.5 and 0.6-0.7 per genome were obtained for DM4 plasmids p1METDI and p2METDI, respectively. By convention, the origin of both chromosomes was set upstream of the dnaA gene, as no GC skew was observed to help predict the initiation and termination of replication [43]. The chromosomes of the two strains are remarkably similar in both gene content and synteny (Fig. 2, Fig. 3). Of the M. extorquens AM1 chromosomal genes, 85% have full-length homologs (higher than 30% identity) on the chromosome of strain DM4. Of these, 89% have homologs at higher than 95% identity, underlining the orthologous nature of most genes in the two strains. Ribosomal genes (23S, 16S, and 5S) are identical in all five copies of the ribosomal operon of strains AM1 and DM4. The intergenic spacer between the 16S and 23S genes has the same length in all five copies of the ribosomal operon within a strain, but its length differs markedly between the two strains (905 nt in M. extorquens AM1, 602 nt in strain DM4). These data confirm the recent suggestion mentioned above [42] that strain DM4 belongs to the species M. extorquens.

Table 1. Genome statistics for M. extorquens AM1 and DM4

| | AM1 chromosome | AM1 megaplasmid | AM1 plasmid p1META1 | AM1 plasmid p2META1 | AM1 plasmid p3META1 | AM1 total/average | DM4 chromosome | DM4 plasmid p1METDI | DM4 plasmid p2METDI | DM4 total/average |
|---|---|---|---|---|---|---|---|---|---|---|
| Size (bp) | 5511322 | 1261460 | 44195 | 37858 | 24943 | 6879778 | 5943769 | 141504 | 38579 | 6123852 |
| GC (%) | 68.7 | 67.7 | 67.9 | 65.3 | 66.9 | 68.5 a | 68.1 | 65.3 | 63.7 | 68.0 a |
| Repeat regions b (%) | 8.3 | 7.9 | 0.2 | 2.7 | 0.3 | 8.0 a | 9.3 | 1.6 | 0 | 9.1 a |
| Genes | 5315 | 1318 | 46 | 45 | 35 | 6759 | 5859 | 137 | 41 | 6037 |
| Protein-coding genes | 5227 | 1312 | 46 | 45 | 35 | 6665 | 5771 | 137 | 41 | 5949 |
| Average length CDS (bp) | 905.3 | 846.9 | 822.7 | 693.2 | 536.6 | 891.7 a | 888.2 | 878.4 | 821.9 | 887.2 a |
| Intergenic (bp) | 178.1 | 167.7 | 163.1 | 229.2 | 160.6 | 176.4 a | 180.4 | 309.6 | 202.6 | 183.4 a |
| Coding density (%) | 84.2 | 84.1 | 78.9 | 73.1 | 69.8 | 84.0 a | 83.6 | 70.6 | 77.7 | 83.3 a |
| rRNA operons | 5 | 0 | 0 | 0 | 0 | 5 | 5 | 0 | 0 | 5 |
| tRNA | 57 | 6 | 0 | 0 | 0 | 63 | 58 | 0 | 0 | 58 |
| Insertion elements (IS): total length (%) | 2.4 | 7.6 | 22.0 | 27.5 | 15.5 | 3.7 a | 1.6 | 21.5 | 10.0 | 2.1 a |
| Insertion elements (IS): intact IS c | 93 (19) | 41 (25) | 3 (3) | 4 (4) | 1 (1) | 142 (39) | 54 (27) | 15 (15) | 2 (2) | 71 (42) |
| Insertion elements (IS): partial IS c | 8 (7) | 22 (13) | 1 (1) | 1 (1) | 0 | 32 (19) | 17 (11) | 5 (5) | 1 (1) | 23 (15) |
| MITEs | 1 | 3 | 0 | 0 | 0 | 4 | 8 | 0 | 0 | 8 |

a Average
b Defined by the algorithm Nosferatu as implemented in Mage [45]
c Number of IS elements (number of IS types in brackets)
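The genome totals quoted in the text can be recomputed from the replicon sizes in Table 1. The sketch below is only a consistency check; the dictionary layout and function names are introduced here for illustration.

```python
# Replicon sizes in bp, taken from Table 1
REPLICONS = {
    "AM1": {"chromosome": 5511322, "megaplasmid": 1261460,
            "p1META1": 44195, "p2META1": 37858, "p3META1": 24943},
    "DM4": {"chromosome": 5943769, "p1METDI": 141504, "p2METDI": 38579},
}

def genome_size_bp(strain):
    """Total genome size as the sum of all replicons of a strain."""
    return sum(REPLICONS[strain].values())

def chromosome_fraction(strain):
    """Fraction of the genome carried by the chromosome."""
    return REPLICONS[strain]["chromosome"] / genome_size_bp(strain)

for strain in REPLICONS:
    total = genome_size_bp(strain)
    print(strain, total, "bp =", round(total / 1e6, 2), "Mb;",
          f"{chromosome_fraction(strain):.1%} on the chromosome")
```

The sums reproduce the totals of 6,879,778 bp (6.88 Mb) for strain AM1 and 6,123,852 bp (6.12 Mb) for strain DM4 given in the table.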


Fig. 2. Schematic representation of the 8 circular replicons in the genomes of Methylobacterium extorquens strains AM1 (top) and DM4 (bottom). Successive circles from inside to outside: GC skew; GC deviation (with values exceeding +/- 2SD indicated in red); rRNA (pink); tRNA (green); IS elements (brown); all genes coloured according to functional class (COG); methylotrophy genes (blue, see Suppl. Table 1); strain-specific genes (yellow, except genes predicted to be of foreign origin, in red); genomic islands (green, see Table 3). Plasmids are not shown to scale.

The distribution of functional categories according to the COG classification ([44], Table 2) was as expected for a free-living proteobacterium with a versatile lifestyle and the observed genome size. COG functional class assignments are more frequent and diverse for chromosomal genes than for plasmid genes (Table 2). No significant differences in functional classes were evident between AM1 and DM4 chromosomes, except for the larger proportion of genes associated with recombination, replication and repair in strain AM1, a reflection of the larger set of IS elements in that strain (Table 1, and see below).

Fig. 3. Overall synteny between M. extorquens AM1 and DM4. The linearized replicons were aligned and visualized by Lineplot in Mage. Syntenic relationships comprising at least 8 genes are indicated by violet and blue lines for genes found on the same strand or on opposite strands, respectively. IS elements (pink), ribosomal operons (blue) and tRNAs (green) are also indicated.

The Mage annotation platform [45] and Alien Hunter [46] were used to detect genes and genome regions found in one strain but not in the other. Several unique chromosomal regions, termed genomic islands and ranging from a few genes to hundreds of genes, which represent approximately 632 kb (11.5%) and 1,054 kb (17.7%) of the chromosome for strains AM1 and DM4, respectively, were defined (Table 3). With the exception of the dcm and mau gene clusters (see below), few of these islands appear to encode functions important for central metabolism or methylotrophy. One remarkable genomic island in strain AM1 (Table 3) contains a hypothetical gene of unknown function of 47.5 kb (META1_2412) encoding a 15,831 residue-long repeat-rich polypeptide (Pfam PF00353 (hemolysin-type calcium-binding region); PF05594 (haemagglutinin, bacterial); COG3210 (large exoproteins involved in heme utilization or adhesion); and COG2931 (RTX toxins and related Ca2+-binding proteins)). This gene product, if expressed, would represent one of the largest proteins known in biology [47].
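The percentages and gene length quoted above follow from simple arithmetic, which the sketch below reproduces using the chromosome sizes from Table 1. The function names are introduced here for illustration, and counting one stop codon in the gene length is an assumption.

```python
def percent_of_chromosome(island_kb, chromosome_bp):
    """Share of the chromosome (in %) covered by genomic islands."""
    return 100.0 * island_kb * 1000 / chromosome_bp

def coding_length_kb(residues, include_stop=True):
    """Length in kb of a gene encoding a polypeptide of the given size."""
    codons = residues + (1 if include_stop else 0)
    return codons * 3 / 1000

am1_islands = percent_of_chromosome(632, 5511322)    # strain AM1
dm4_islands = percent_of_chromosome(1054, 5943769)   # strain DM4
giant_gene = coding_length_kb(15831)                 # META1_2412

print(round(am1_islands, 1), round(dm4_islands, 1), round(giant_gene, 1))
```

The results agree with the reported 11.5% and 17.7% island coverage and with the 47.5 kb length of the gene encoding the 15,831-residue polypeptide.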

Extra-chromosomal replicons are highly strain-specific and show little similarity in size, gene content or synteny with each other. However, an approximately 130 kb region of the AM1 megaplasmid is globally syntenic to a region of similar length in the chromosome of strain DM4 (Fig. 3). Plasmids encode mostly proteins of currently unknown function (Table 2) or proteins associated with plasmid-related functions. Exceptions include a cation efflux system on plasmid p1META1 (p1META1_0021/p1META1_0022); a cluster of copper resistance genes on plasmid p2META1 (p2META1_0029/p2META1_0030); a truncated luxI gene (p1META1_0049) recently shown to be essential for the operation of two bona fide, chromosomally-located luxI genes, and encoding two acyl homoserine lactone synthases [48]; and UmuDC systems involved in SOS DNA repair. Unlike in strain DM4, which has two complete copies of umuDC on its chromosome (METDI0144/METDI0143 and METDI4328/METDI4329), a complete umuDC system in AM1 is only found on the megaplasmid (META2_0643/META2_0644), while a truncated copy of umuC is found on the chromosome (META1_4790).

Table 2. Functional classes in M. extorquens AM1 and DM4 replicons a

Columns: Class - Description; Strain AM1: Chrom., megaplas., p1, p2, p3, Class %, Proc. a %; Strain DM4: chrom., p1, p2, Class %, Proc. a %

D - Cell cycle control, cell division, chromosome partitioning 34 7 1 1 1 0.66 16.28 38 3 2 0.72 17.48
M - Cell wall/membrane/envelope biogenesis 240 30 1 1 4.08 245 4 4.18
N - Cell motility 126 16 3 2.18 121 1 2.05
O - Posttranslational modification, protein turnover, chaperones 164 32 2 2.97 201 2 3.41
T - Signal transduction mechanisms 266 34 1 1 4.53 304 1 1 5.14
U - Intracellular trafficking, secretion, and vesicular transport 35 8 1 0.66 32 3 0.59
V - Defense mechanisms 66 13 1 1.20 80 2 1.38

B - Chromatin structure and dynamics 3 0.05 14.12 4 0.07 13.28
J - Translation, ribosomal structure and biogenesis 188 14 1 3.05 199 1 3.36
K - Transcription 205 46 2 2 2 3.86 243 5 2 4.20
L - Replication, recombination and repair 292 161 11 11 3 7.17 290 39 7 5.65

C - Energy production and conversion 262 26 1 4.34 24.50 303 3 2 5.18 27.39
E - Amino acid transport and metabolism 423 31 1 1 6.84 468 12 8.07
F - Nucleotide transport and metabolism 79 11 1.35 80 1.34
G - Carbohydrate transport and metabolism 141 9 2.25 144 1 2.44
H - Coenzyme transport and metabolism 124 7 1 1.98 123 1 2.08
I - Lipid transport and metabolism 164 11 2.63 174 2 2 2.99
P - Inorganic ion transport and metabolism 200 35 3 1 3.59 220 2 3.73
Q - Secondary metabolites biosynthesis, transport and catabolism 89 12 1 1.53 92 1 1.56

R - General function prediction only 355 40 2 5.96 11.24 388 1 2 6.57 12.12
S - Function unknown 312 35 2 3 5.28 327 2 1 5.55

CDS with at least one COG hit 3768 578 30 22 10 4408 66.15 4076 84 21 4181 70.27
Total CDS 5227 1311 46 45 35 6664 5772 137 41 5950

a Only the first COG hit of each CDS is considered (CDS may be associated with several COGs)
b Processes: Cellular processes and signaling (D,M,N,O,T,U,V); information storage and processing (B,J,K,L); metabolism (C,E,F,G,H,I,P,Q); and poorly characterized (R,S).
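The percentages in Table 2 can be cross-checked against the CDS counts. The following Python sketch is an illustration only and not part of the original analysis. It assumes that each percentage is the row total summed over all replicons divided by the total CDS of the strain, an assumption that reproduces the printed values for the rows used here. The dictionary layout and the function name are hypothetical.

```python
# Cross-check of selected Table 2 percentages (illustrative sketch).
# Assumption: percentage = CDS summed over all replicons / total CDS of the strain.

# Total CDS per strain, summed over the replicons listed in Table 2.
TOTAL_CDS = {"AM1": 5227 + 1311 + 46 + 45 + 35, "DM4": 5772 + 137 + 41}

# Per-replicon CDS counts copied from Table 2 for two example rows.
ROWS = {
    ("AM1", "D"): [34, 7, 1, 1, 1],
    ("DM4", "D"): [38, 3, 2],
    ("AM1", "COG hit"): [3768, 578, 30, 22, 10],
    ("DM4", "COG hit"): [4076, 84, 21],
}


def percent_of_total(strain: str, row: str) -> float:
    """Return the share (in %) of a row's CDS among all CDS of the strain."""
    return 100.0 * sum(ROWS[(strain, row)]) / TOTAL_CDS[strain]


if __name__ == "__main__":
    for strain, row in ROWS:
        print(f"{strain} {row}: {percent_of_total(strain, row):.2f}%")
```

Under this assumption, class D and the "CDS with at least one COG hit" row agree with the percentages printed in the table to two decimal places.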

Table 3. Unique regions in M. extorquens AM1 and DM4 chromosomes

Columns: start CDS (META1_); start (nt); end CDS (META1_); end (nt); length (bp); [Left border][Inside][Right border] a; IS; Features (proposed role); CDS total; CDS unique (%); CDS AH (%)

Methylobacterium extorquens AM1
0035 37982 0046 51448 13467 [none][GC][IS] 1 14 11 (78.6) 0
tRNA/0058 61782 0073 80559 18778 [tRNA][int-AH][none] 1 17 16 (94.1) 5 (29.4)
0149 156863 tRNA/0165 174721 17859 [none][int-AH-mob][tRNA] 0 16 13 (81.3) 0
tRNA/0241 261958 0273 289656 27699 [tRNA][GC][IS] 4 (1) 34 25 (73.5) 0
1078 1126013 1115 1164815 38802 [IS][GC-mob-tRNA-tRNA][IS] 9 (1) 45 19 (42.2) 4 (8.9)
1226 1283284 1257 1308115 24832 [none][GC][tRNA] 0 sulfur metabolism 33 29 (87.9) 0
tRNA/1555 1629314 1614 1691767 62454 [tRNA][none][IS] 2 59 53 (89.8) 0
tRNA/1825 1911338 1924 1995805 84467 [tRNA][GC][int] 6 (1) copper resistance 100 82 (82.0) 0
2402 2475649 2415/tRNA 2538027 62379 [none][GC][tRNA] 0 giant META1_2412 gene 13 13 (100) 0
2573 2704934 2598 2734385 29452 [none][GC][none] 0 sulfur metabolism 25 14 (56.0) 0
2622 2753794 2652 2781501 27708 [none][GC][none] 0 metal transport 31 31 (100) 0
2657 2787350 META1_2681 2808325 20976 [none][GC][integrase] 0 22 22 (100) 0
2687 2814961 2697 2823774 8814 [none][GC][none] 0 11 10 (90.9) 0
2755 2879429 2817 2939813 60385 [IS][tRNA-AH][tRNA] 3 d included mau gene cluster 63 50 (79.4) 10 (15.9)
tRNA/3934 4056642 4086 4163355 106714 [tRNA][AH][int] 15 (6) phage-related 148 145 (98.0) 25 (16.9)
4747 4874344 4792/tRNA 4901372 27028 [integrase][GC][tRNA] 4 phage-related 43 41 (95.3) 6 (14.0)


Methylobacterium extorquens DM4
0036 37946 0052 51383 13438 [Mite][int - GC][IS] 2 d (0) 16 13 (81.3) 11 (68.8)
0137 137916 0147/tRNA 146122 8207 [none_Mite][AH][tRNA] 2 d (1) SOS repair (umuCD) 8 0 4 (50.0)
tRNA/0225 232397 0336 321727 89331 [tRNA][AH][none] 6 (2) sensors/regulators, transport; carbon metabolism 109 62 (56.9) 27 (24.8)
0345 328235 0426 390700 62466 [none][GC][int] 1 (0) Beta-lactamase-like domain repeat region 78 57 (73.1) 12 (15.4)
0707 673123 0725 683063 9941 [int][AH][none] 2 (1) Beta-lactamase-like domain repeat region 17 11 (64.7) 11 (64.7)
0736 689280 0748 702730 13451 [none][AH][IS] 2 (1) efflux determinant 14 13 (92.9) 13 (92.9)
0769 717385 0782 730568 13184 [none][GC][none] 0 carbon metabolism 13 10 (76.9) 0
0786 733379 0825/tRNA 759448 26070 [none][AH][tRNA] 4 d (3) gene decay region 39 34 (87.2) 28 (71.8)
tRNA/0840 774084 0935 848713 74630 [tRNA][int-mob(3)-int-AH][none] 9 d (4) contains putative 5-formyl-H4F cyclo-ligase 93 37 (39.8) 30 (32.3)
1157 1065633 1204 1108451 42819 [int][int-int-AH][none] 2 (1) 45 25 (55.6) 19 (42.2)
1209 1114522 1243 1136772 22251 [none][GC][none] 0 34 26 (76.5) 10 (29.4)
1275 1161654 1336 1215594 53941 [none][GC][none] 7 (5) 62 21 (33.9) 19 (30.6)
1341 1219900 1375 1252518 32619 [none][GC-AH][none] 2 d (1) copper resistance 36 5 (13.9) 3 (8.3)
1382 1257095 1424 1290282 33188 [none][AH][IS] 6 (1) metal resistance 41 35 (85.4) 16 (39.0)
1586 1452662 1650 1512578 59917 [none][GC][none] 10 d (7) 66 54 (81.8) 45 (68.2)
1656 1520241 1683 1548181 27941 [none][GC][none] 0 putative sulfur compound transport/amidase region 28 26 (92.9) 9 (32.1)
1860 1741524 1886 1767524 26001 [none][GC][none] 2 (2) carbon utilisation, transport, molybdopterin-related 25 25 (100) 0

Methylobacterium extorquens DM4 (continued)
1917 1800884 1956 1830642 29759 [none][GC][none] 6 d (2) carbon utilisation, transport, -related 36 27 (75.0) 18 (50.0)
tRNA/2330 2227107 2333 2244312 17206 [tRNA][none][IS] 1 (1) 4 4 (100) 0
2348 2263919 2380 2300423 36505 [none][GC-AH][none] 2 (2) 33 27 (81.8) 6 (18.2)
2551 2460336 2682 2587254 126919 [none][mob-tRNA-int-AH][none] 7 (1) dcm region 129 127 (98.4) 127 (98.4)
tRNA/3361 3293493 3383 3308143 14651 [tRNA][tRNA-int-AH][int] 1 (1) 23 19 (82.6) 19 (82.6)
4329 4266670 4356 4283719 17050 [none][AH][int] 3 (0) 27 18 (66.7) 10 (37.0)
4367 4291119 4487 4393994 102876 [none][AH][none] 6 d (5) nitrogen metabolism, urease-like operon 122 108 (88.5) 59 (48.4)
4495 4400485 4514 4415276 14792 [IS][AH][IS] 4 (2) 20 16 (80.0) 16 (80.0)
tRNA/4746 4641081 4776 4668491 27411 [tRNA][int-mob-tRNA-AH][none] 0 efflux determinant 30 26 (86.7) 14 (46.7)
tRNA/5356 5288637 5390 5336994 48358 [tRNA][int][none] 0 virulence determinant 35 32 (91.4) 22 (62.9)
tRNA/5552 5522861 5566 5532025 9165 [tRNA][int-AH][none] 2 (2) 16 15 (93.8) 15 (93.8)

a int: integrase; mob: mobility determinant; GC: region with atypical GC content; AH: region rich in genes detected by Alien Hunter; IS: Insertion Sequence; MITE: Miniature Inverted Repeat Transposable Element
b no homolog with >80% identity / 0.8 minLrap value in the chromosome of the compared strain
c as detected by Alien Hunter ([46], see Materials and Methods)
d including one putative MITE
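How the columns of Table 3 relate to each other can be illustrated with a short consistency check. The Python sketch below is an illustration only, not part of the original analysis. It uses the first three AM1 rows and assumes that the length is counted inclusively (end - start + 1) and that the percentage of unique CDS is rounded to one decimal place. The `Region` type and the function names are hypothetical.

```python
# Consistency check for the first three AM1 rows of Table 3 (illustrative sketch).
from typing import NamedTuple


class Region(NamedTuple):
    start_nt: int      # start coordinate (nt)
    end_nt: int        # end coordinate (nt)
    length_bp: int     # reported length (bp)
    cds_total: int     # reported number of CDS in the region
    cds_unique: int    # reported number of unique CDS
    unique_pct: float  # reported percentage of unique CDS


# Values copied from the first three AM1 rows of Table 3.
ROWS = [
    Region(37982, 51448, 13467, 14, 11, 78.6),
    Region(61782, 80559, 18778, 17, 16, 94.1),
    Region(156863, 174721, 17859, 16, 13, 81.3),
]


def length_matches(region: Region) -> bool:
    """Assumed convention: both border coordinates are included in the length."""
    return region.end_nt - region.start_nt + 1 == region.length_bp


def unique_pct_matches(region: Region, tol: float = 0.051) -> bool:
    """Compare the recomputed percentage with the reported, rounded value."""
    recomputed = 100.0 * region.cds_unique / region.cds_total
    return abs(recomputed - region.unique_pct) <= tol


if __name__ == "__main__":
    for region in ROWS:
        print(region.start_nt, length_matches(region), unique_pct_matches(region))
```

The tolerance is slightly larger than half a rounding unit so that values such as 13/16 = 81.25%, printed as 81.3, are accepted.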

Comparative genomics of aerobic methylotrophy. Methylotrophy can be envisioned in terms of the assembly of discrete metabolic modules, each responsible for a specific metabolic task, which in combination define pathways for methylotrophic metabolism, several variants of which have been well characterized [11,49].

The Methylobacterium blueprint. In Methylobacterium, the currently recognized methylotrophy genes and modules are found exclusively on the chromosomes of strains AM1 and DM4 (Fig. 2, Suppl. Table 1). Common genes associated with methylotrophy inventoried in Suppl. Table 1 display at least 95% identity at the protein level (99.1% average), with complete synteny between the two strains [11]. Several methylotrophy genes are found as singletons, including several cases of genes that encode different subunits of the same enzyme (e.g. mcmAB, pccAB, see Suppl. Table 1). Nevertheless, a majority of methylotrophy genes are found in large clusters. Only two known methylotrophy gene clusters are not shared between the two strains (Suppl. Table 1, and see below): the dcm (dichloromethane degradation) gene region present only in strain DM4, and the mau gene cluster encoding methylamine dehydrogenase and accessory functions in strain AM1. One large multi-operon cluster (49.3 kb) contains genes for most of the serine cycle enzymes and most of the PQQ biosynthesis functions [17], as well as genes for H4MPT-linked reactions, H4MPT biosynthesis and H4F biosynthesis. It also contains genes encoding a homolog of methanol dehydrogenase (XoxFJG) of still unknown function, often found near genes involved in C1 metabolism [51,52] and recently suggested to be involved in formaldehyde metabolism in the photosynthetic bacterium Rhodobacter sphaeroides [53].

Comparison of gene sets for methylotrophy in fully sequenced genomes. A steadily increasing number of genomes of methylotrophic microorganisms has been sequenced, assembled and annotated. We limit our comparative analysis of known genetic determinants and modules of methylotrophy (Suppl. Table 2) to completed, manually annotated and officially published methylotroph genomes (listed in Suppl. Table 3), six of which belong to the phylum Proteobacteria and one to the phylum Verrucomicrobia. Methylococcus capsulatus represents Gamma-proteobacterial methanotrophs [54]. Methylibium petroleiphilum [55], Methylobacillus flagellatus [56] and Methylophilales strain HTCC2181 [57] represent two different orders within Beta-proteobacteria (Burkholderiales and Methylophilales). Silicibacter pomeroyi, although not reported to grow methylotrophically, is an Alpha-proteobacterium of the family Rhodobacteraceae capable of degrading methylated sulfur compounds [58]. Granulibacter bethesdensis is an emerging human pathogen of the family Acetobacteraceae within Alpha-proteobacteria [59] reported to grow on methanol [60]. Finally, strain V4 (Candidatus "Methylacidiphilum infernorum") represents the recently discovered group of thermophilic and acidophilic methanotrophs of the phylum Verrucomicrobia [61].


Methanol utilization. The mxa gene cluster encoding the classic methanol dehydrogenase is nearly identical (over 99% identity at the protein level) between strains AM1 and DM4 and very similar in the genomes of M. capsulatus, M. flagellatus, G. bethesdensis and several other proteobacterial methylotrophs [62]. This conservation of both gene sequence and gene synteny suggests that the mxa gene cluster was most likely disseminated via lateral transfer among methylotrophs of different subclasses of Proteobacteria. This notwithstanding, no similar gene clusters are recognizable in the genomes of the other four organisms discussed here. M. petroleiphilum features a gene cluster encoding an alternative methanol dehydrogenase (Mdh2; [62]) with little homology to either mxaF or xoxF. The gene xoxF is found in all of the genomes discussed here except that of S. pomeroyi but, as discussed elsewhere [62], the Xox system is unlikely to be responsible for aerobic methanol oxidation. The genes responsible for methanol oxidation by Methylophilales HTCC2181 and strain V4 remain unknown, suggesting the existence of other, yet unidentified systems for methanol dissimilation.

Methylamine utilization. The mau gene cluster encoding the canonical system for methylamine utilization was characterized for a large part in strain AM1, and the genome of M. flagellatus [56] contains a mau gene cluster very similar to it. The main difference is that the gene for amicyanin, the electron acceptor from methylamine dehydrogenase in strain AM1, is replaced in M. flagellatus by a gene for azurin, an analogous copper-containing electron acceptor protein. The mau cluster was not found in the genomes of the other methylotrophs discussed here that were shown or assumed to grow with methylamine, including strain DM4 (Suppl. Table 2). Thus, as yet uncharacterized genetic determinants are responsible for methylamine utilization in most methylotrophs, including strain DM4 in particular.

H4MPT-dependent formaldehyde oxidation. Tetrahydromethanopterin (H4MPT)-dependent formaldehyde oxidation is the main pathway for both energy generation and formaldehyde detoxification in M. extorquens, and is therefore essential for methylotrophy in this organism [21,63,64]. First defined in M. extorquens AM1, this pathway has also been described in a variety of other bacteria, including members of phyla in which methylotrophy has not yet been demonstrated, such as Planctomycetes [18,65,66]. Phylogenetic analysis suggests that this pathway is one of the most ancient in the context of methylotrophic metabolism. However, it is dispensable in M. flagellatus [67], and is absent in some other methylotrophs. S. pomeroyi possesses an alternative glutathione-dependent (FlhA/FghA) system for oxidation of formaldehyde similar to that of P. denitrificans [68] and R. sphaeroides [53]. No formaldehyde oxidation systems were identified in the genomes of Methylophilales HTCC2181 or Verrucomicrobia strain V4.


Conversion of formate to CO2. M. extorquens strains possess four different functional formate dehydrogenases (FDH) for the final step of energy generation from carbon oxidation [69]. The other methylotrophs included in our analysis also encode one or several FDH homologs (Suppl. Table 2), but only one, FDH2, is consistently detected. These observations suggest that formate oxidation, as a transformation ubiquitous to life, does not strictly qualify as a methylotrophy-specific reaction, and may thus involve analogous [50] enzymatic systems.

C1 assimilation via methylene tetrahydrofolate and the serine cycle. The serine cycle is essential for carbon assimilation in Methylobacterium and comprises reactions specific to methylotrophy as well as reactions involved in multicarbon metabolism (Fig. 1, see also [8,11]). Genes involved in the serine cycle can be ascribed to two categories on the basis of mutational analysis [11]: methylotrophy-specific genes (glyA, sga, hpr, gck, ppc, mtkAB and mcl), and genes which are essential under non-methylotrophic growth conditions (eno and mdh). Recent evidence [24,25] and mutant analyses [23,70-73] suggest that genes for the C1 transfer pathway linked to H4F (mtdA, fch and ftfL) are specifically involved in assimilatory metabolism in Methylobacterium. Six methylotrophy-specific serine cycle genes, along with mtdA and fch, belong to gene clusters associated with methylotrophy on the chromosomes of strains AM1 and DM4 (Fig. 4), while the three remaining genes (glyA, gck and ftfL) are not part of methylotrophy gene clusters and are located elsewhere on the chromosome. As exemplified here for M. petroleiphilum [55], Beta-proteobacterial methylotrophs may also employ the serine cycle for C1 carbon assimilation [55,74]. Like that of the Alpha-proteobacterium S. pomeroyi, the genome of M. petroleiphilum contains a single gene cluster encoding all required functions of the serine cycle (Fig. 4). In S. pomeroyi, however, the organisation of this gene cluster is quite different from that of Methylobacterium, Granulibacter bethesdensis and M. petroleiphilum (Fig. 4), and the cluster contains tandem genes for two distantly related bifunctional methylene-H4F dehydrogenase/methenyl-H4F cyclohydrolase (FolD) enzymes instead of the isofunctional mtdA/fch genes found in the other genomes discussed here. Moreover, the hpr, gck and sga genes inferred from the genomic context display only modest sequence identity with Methylobacterium prototypes (Fig. 4), further suggesting that the serine cycle in S. pomeroyi belongs to an independent evolutionary lineage. Extending the analysis to methylotrophic organisms able to grow with methane, the Gamma-proteobacterial methanotroph M. capsulatus also harbors serine cycle gene homologs in its genome, including the mtdA/fch pair, but few of them are clustered (Fig. 4). However, the gene for one key enzyme of the serine cycle, the methylotrophy-specific phosphoenolpyruvate carboxylase, is missing [56], consistent with the extensive biochemical studies demonstrating that the main pathway for C1 assimilation in Methylococcus capsulatus is the RuMP pathway.


C1 assimilation and the ethylmalonyl-CoA pathway for glyoxylate regeneration. The assimilation of C1 units by the serine cycle requires the regeneration of glyoxylate from acetyl-CoA. It has been a long-standing puzzle how strain AM1 achieves this given that it lacks activity of isocitrate lyase, the key enzyme of the classical glyoxylate regeneration pathway [8]. Indeed, the corresponding gene was not detected in the Methylobacterium genome. Glyoxylate regeneration via the recently elucidated ethylmalonyl-CoA pathway [12] was demonstrated in strain AM1 [13], and the corresponding genes were identified [11,12]. The genomes of the other bacteria compared here present a contrasting picture in this respect. The genome of S. pomeroyi also contains a complete set of the genes for the ethylmalonyl-CoA pathway (not shown), and as in M. extorquens, these genes are not clustered on the chromosome. In M. petroleiphilum and G. bethesdensis, however, the genes for the key enzymes of the ethylmalonyl-CoA pathway [11] are missing (not shown), but genes thought to encode the isocitrate lyase shunt are present instead [55]. In M. capsulatus, neither the ethylmalonyl-CoA pathway nor the isocitrate lyase shunt appears to be encoded within the genome [56], consistent with the operation of the RuMP pathway as the predominant pathway for C1 assimilation in M. capsulatus.

Fig. 4. Clustering and conservation of serine cycle and other genes important for methylotrophic metabolism in sequenced methylotrophic bacteria. Sequences were retrieved from Genbank and visualized using CLC Sequence Viewer 5 (www.clcbio.com). Chromosome sequence positions are indicated, as well as the percent identity at the protein level with Methylobacterium prototypes (nd: not detectable). Formate tetrahydrofolate ligase/formyl-tetrahydrofolate synthetase (ftfL, black); serine hydroxymethyltransferase (glyA, pink); serine glyoxylate aminotransferase (sga, yellow); hydroxypyruvate reductase (hprA, red); glycerate kinase (gck, purple); phosphoenolpyruvate carboxylase (ppc, orange); malyl-CoA lyase/β-methylmalyl-CoA lyase (mcl, dark green); malate thiokinase (mtkA/mtkB, light green); NAD(P)-dependent methylene-tetrahydromethanopterin/methylene-tetrahydrofolate dehydrogenase (mtdA, dark blue); methenyl tetrahydrofolate cyclohydrolase (fch, light blue); bifunctional methylene-tetrahydrofolate dehydrogenase/methenyl-tetrahydrofolate cyclohydrolase (folD, grey); transcriptional regulator (pale green); other (white); tRNA (black rectangles).

Transcriptional regulation of carbon assimilation in methylotrophic metabolism. The gene for the global serine cycle regulator in Methylobacterium (QscR, a LysR-type regulator homologous to CbbR) is essential for methylotrophic growth. QscR activates transcription of the clustered serine cycle genes as well as of glyA, and negatively regulates its own transcription [75], but its gene is not located in the proximity of known serine cycle genes in the genome. However, genes for several probable regulators of unknown function are found near serine cycle genes in all methylotrophic bacteria discussed here, including Methylobacterium (Fig. 4).

Analysis of IS elements uncovers a significant potential for genome plasticity in Methylobacterium. Methylobacterium genomes display an IS content comparable to that of other microbial genomes [76], but with a clear differential distribution of highly diverse IS elements in AM1 and DM4 (Fig. 5, Suppl. Table 4). In AM1, 39 different IS types (defined by a 95% amino acid identity threshold), belonging to 14 IS families (defined as broad groupings of related elements in ISfinder [77]), were detected, compared to 42 IS types belonging to 14 IS families in DM4. Overall diversity of IS types is higher in DM4, but the total number of IS elements in AM1 is twice as high as in DM4 (Table 1). A total of 71 intact and 23 partial IS elements were detected in strain DM4, representing about 2% of the genome (Table 1). With 9 and 7 copies, respectively, ISMex15 and ISMex17 were the two most abundant IS elements in this strain. In comparison, strain AM1 featured 142 intact and 32 partial IS elements, representing 3.7% of the genome (Table 1). ISMex1, ISMex2 and ISMex3 of AM1, present in 37, 16 and 23 intact copies, respectively, are the most abundant IS elements identified; with average pairwise nucleotide differences between copies of only 0.01%, 0.11% and 0.03%, respectively, they may have undergone recent expansion. In addition, one miniature inverted-repeat transposable element (MITE), MiniMdi3, was detected in both strains (Suppl. Table 4). This element (~400 bp) is related to ISMdi3 but lacks the transposase gene. Few studies so far have identified the presence of both non-autonomous and autonomous transposable elements in the same bacterial genome (see e.g. Out of a total of 70 IS types identified in this work, only 11 IS types are shared between the two strains (Suppl. Table 4, intact IS). IS5 and IS110 are the most abundant shared IS families, each family featuring 5 to 7 different types of IS (Fig. 5). This suggests that substantial IS loss and/or acquisition has occurred during the relatively short period of time since the two strains emerged from a common ancestor.
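The IS counts quoted above are mutually consistent, as the following Python sketch illustrates. It is a worked check only, not part of the original analysis, and the variable and function names are hypothetical. The number of distinct IS types across both strains follows from inclusion-exclusion, and the factor of two between the strains holds for the intact copies.

```python
# Arithmetic check of the IS inventory quoted in the text (illustrative sketch).

# IS types per strain and types shared between the strains, as given in the text.
TYPES_AM1 = 39
TYPES_DM4 = 42
TYPES_SHARED = 11

# Intact and partial IS copies per strain, as given in the text.
INTACT = {"AM1": 142, "DM4": 71}
PARTIAL = {"AM1": 32, "DM4": 23}


def distinct_types(n_a: int, n_b: int, n_shared: int) -> int:
    """Inclusion-exclusion: types found in either strain, counted once."""
    return n_a + n_b - n_shared


def total_copies(strain: str) -> int:
    """Intact plus partial IS copies for one strain."""
    return INTACT[strain] + PARTIAL[strain]


if __name__ == "__main__":
    print("distinct IS types:", distinct_types(TYPES_AM1, TYPES_DM4, TYPES_SHARED))
    print("intact copy ratio AM1/DM4:", INTACT["AM1"] / INTACT["DM4"])
    print("all copies:", total_copies("AM1"), total_copies("DM4"))
```

Only 11 of the 39 AM1 types and 11 of the 42 DM4 types are shared, which is the basis for the inference of substantial IS loss or acquisition since divergence.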

Fig. 5. IS family distribution of intact ISs in M. extorquens AM1 and DM4. The bar length shows the total intact IS copy number of each IS family in DM4 (right) and AM1 (left) (see Suppl. Table 4). Differently colored regions represent different replicons: blue – DM4 chromosome, cyan – DM4 plasmid p1METDI, green – DM4 plasmid p2METDI, pink – AM1 chromosome, orange – AM1 megaplasmid, dark yellow – AM1 plasmid p1META1, light yellow – AM1 plasmid p2META1, light green – AM1 plasmid p3META1. Open circles and squares represent the numbers of different types of ISs within each family in AM1 and DM4, respectively.

The distribution of IS element localization within each genome displays clear-cut, non-random features. Plasmids harbor a higher density of IS elements than the chromosomes. IS elements account for over 20% of the length of the DM4 plasmid p2METDI and of the AM1 plasmids p1META1 and p2META1. Similarly, IS elements comprise about 8% of the length of the AM1 megaplasmid (Table 1), a significantly higher proportion than in the chromosome (χ2 test, p<0.0001). Moreover, several IS families are significantly over-represented on particular replicons. For example, all 16 copies of ISMex2, an IS element belonging to the IS481 family that is specific to strain AM1, are found on its chromosome, while all 5 copies of the IS elements belonging to the IS110 family are on the megaplasmid. In contrast, 13 out of the 14 copies of IS elements of this group in DM4 are located on the chromosome. For some IS elements, however, a more homogeneous distribution was noted. For example, the Tn3 family element ISMex22 unique to strain AM1 is found in one copy per replicon. Transposition immunity was described for this type of IS element [79], suggesting the occurrence of transposition saturation in this case.

The observed non-random IS density across replicons may be due to one or more of three potential causes: (1) biased transposition rates of different IS types across replicons, such as local hopping or plasmid specificity; (2) biased selective effects of transposition events, such as over-representation in regions with a high density of genes with little or no selective value, such as plasmids or IS elements themselves; or (3) insufficient time for reaching equilibrium, e.g. for IS elements acquired via recent plasmid-mediated transmission. A second pattern in the distribution of IS locations was noted within each replicon. IS elements are over-represented by 7-fold and 39-fold in chromosomal regions unique to AM1 and DM4, respectively, relative to the regions shared between the two strains (χ2 test, p<0.0001; also see Fig. 2). These could represent regions with fewer essential genes and therefore relaxed selection against DNA insertions. Alternatively, these could have been IS-rich regions dating back to the common ancestor of these two strains. This would have then led to increased rates of deletion between two co-directional copies of the same IS element, causing these IS-rich regions to be lost more frequently.
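The kind of comparison behind a fold over-representation and its χ2 test can be sketched as follows. This Python example is an illustration only: the text does not state how the test was set up, so the sketch assumes a goodness-of-fit test with one degree of freedom in which IS copies are expected to be distributed in proportion to region length. The counts and lengths in the usage example are hypothetical placeholders, not values from this work, and the function names are likewise hypothetical.

```python
# Fold enrichment and 1-df chi-square test for IS density in two regions
# (illustrative sketch; the test design is an assumption, not the authors' method).
import math


def fold_enrichment(n_a: int, len_a: float, n_b: int, len_b: float) -> float:
    """Ratio of IS density (copies per unit length) in region A over region B."""
    return (n_a / len_a) / (n_b / len_b)


def chi_square_uniform_density(n_a: int, len_a: float, n_b: int, len_b: float):
    """Test the null hypothesis that IS copies are spread in proportion to length.

    Returns (chi2, p) for a goodness-of-fit test with one degree of freedom.
    """
    total_n = n_a + n_b
    total_len = len_a + len_b
    expected_a = total_n * len_a / total_len
    expected_b = total_n * len_b / total_len
    chi2 = (n_a - expected_a) ** 2 / expected_a + (n_b - expected_b) ** 2 / expected_b
    # For one degree of freedom, the chi-square survival function is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical placeholder numbers: 30 IS copies in 600 kb of strain-specific
    # sequence versus 40 IS copies in 4900 kb of shared sequence.
    print("fold:", fold_enrichment(30, 600, 40, 4900))
    print("chi2, p:", chi_square_uniform_density(30, 600, 40, 4900))
```

With real data, the inputs would be the IS copy numbers (or IS-covered base pairs) and the lengths of the unique and shared regions taken from Tables 1 and 3.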

IS elements linked to methylotrophy. The two strain-specific methylotrophy regions containing mau (in AM1) and dcm (in DM4) gene clusters (Table 3) are closely associated with IS elements. In strain DM4, genes dcmR and dcmA are embedded within several overlapping IS elements ([80], Table 3 and see below, Fig. 6). In strain AM1, the mau cluster (12 kb) lies between 2 copies of ISMex15 (~30kb), as part of a larger (approx. 66 kb) gene cluster unique to this strain (Table 3). This suggests that such methylotrophy-associated gene clusters may be prone to lateral gene transfer and/or deletion. Indeed, it has been shown recently that the presence of the mau gene cluster is variable in closely related environmental strains of Methylotenera, a betaproteobacterial methylotroph [81]. This phenomenon may be involved in the emergence of new ecotypes of methylotrophs.


Fig. 6. dcm region of strain DM4. All functional annotations are putative except for the DCM dehalogenase gene and its upstream regulator (bold). Highlighted are genes for putative enzymes (red), regulators (orange), transporters (yellow), proteins involved in DNA modification (blue), transposases (cyan), proteins involved in plasmid functions (green), and gene fragments (grey), with hypothetical and conserved hypothetical proteins left in white. The interrupted chelatase family gene (hashed), defined here as "island integration determinant", flanks the 126 kb dcm island.

The genomic island for DCM utilization: a new type of mobility determinant? Unlike methylamine and methanol, which are produced naturally in large amounts [2,82], DCM is produced naturally at low levels only [33], and presumably occurs in significant concentrations in the environment due to industrial production. The dcm genomic island unique to strain DM4, which carries the dcmA gene encoding the DCM dehalogenase required for growth of Methylobacterium with DCM, is located on the chromosome (Table 3, Fig. 6), just 20 genes downstream of the large conserved 49 kb methylotrophy gene cluster (Fig. 2, Suppl. Table 1). This 126 kb DNA region, of markedly different GC content (60.5%) from the genome average, was most likely acquired by horizontal transfer. The sequences upstream and downstream of the unique dcm region are in complete synteny between the genomes of strains DM4 and AM1. The integration point of the dcm region features the 5'-end and 3'-end remains of a "chelatase-like" gene (COG0606, predicted ATPase with chaperone activity). Although most currently known genomic islands are located at the 3' end of a tRNA locus, other genes serving as integration sites have been described, such as the glr (glutamate racemase) gene of the Helicobacter pylori pathogenicity island [82]. Clues on the mode of integration of the dcm region within the Methylobacterium chromosomal framework were obtained by a more detailed analysis. The first CDS within the dcm region encodes a putative recombinase. Arrangements of non-overlapping 5' and 3' fragments of such a "chelatase" gene bordering an internal DNA fragment beginning with a recombinase gene are also evident in three other published complete genomes (Table 4). Additional DNA motifs associated with such structures include 5-20 bp direct repeats and palindromic sequences located immediately up- and downstream of the 5'- and 3'-fragments of the disrupted gene, respectively (Table 4). DNA sequences encoding "chelatase" homologs are often apparent pseudogenes, partial sequences, or sequences containing one or several internal stop codons, suggesting that such sequences may have experienced insertion and subsequent excision of DNA fragments. It is tempting to speculate that such sequences represent novel determinants of genome plasticity, and we propose the term "island integration determinant" (iid) to describe them. The dcm region features only few genes that can be associated with confidence with methylotrophic metabolism (Fig. 6). The majority of the genes within this region (74/128, 58%) encode hypothetical or conserved hypothetical proteins (compared to the chromosomal average of 41.1% and plasmid average of 43.8% for such proteins, Table 1). Several genes of the dcm region are interrupted by IS elements (e.g. a glutathione S-transferase METDI2660/2663), or are present in truncated form (e.g. a DNA helicase METDI2648). Many CDS seem associated with DNA modification, stability and mobility. Moreover, 7 IS elements were identified in this region, with 4 in close proximity to dcmA [80].
The structural elements of a bona fide repABC operon encoding plasmid replication and maintenance function with its counter-transcribed small RNA in divergent orientation upstream of repC, and a palindromic 16 nt sequence GTTCTCAGCTGAGAAC fitting the par binding site consensus sequence [83] upstream of repA, were also found within the dcm region. The 8 kb region centered around repABC displays extensive synteny with several rhizobial plasmids and with several regions on the chromosome of Nitrobacter hamburgensis X14 [84]. This suggests that part or all of the dcm region may have once existed as an extrachromosomal element and contributed to the spread of the metabolic capacity to degrade DCM in the environment. Nevertheless, introduction of the dcmA gene into strain AM1, with expression of active DCM dehalogenase at high levels, failed to enable growth on DCM [85]. Thus, specific adaptations are required beyond the presence of DCM dehalogenase to enable Methylobacterium to grow with this compound [36]. Additional genetic determinants needed for growth with DCM remain to be discovered, and the availability of genomic sequences will facilitate experimental efforts towards identifying them.
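In this context, "palindromic" means that a DNA sequence is identical to its own reverse complement. The Python sketch below illustrates this for the 16 nt sequence upstream of repA quoted above. It is a minimal example, not part of the original analysis, and the function names are hypothetical.

```python
# Reverse-complement palindrome check for DNA motifs (illustrative sketch).

# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(seq: str) -> bool:
    """A DNA palindrome reads the same on both strands in the 5'->3' direction."""
    return seq.upper() == reverse_complement(seq)


if __name__ == "__main__":
    par_site = "GTTCTCAGCTGAGAAC"  # 16 nt sequence upstream of repA (from the text)
    print(par_site, is_palindromic(par_site))
```

The same function relates the two arms of the DM4 palindrome listed in Table 4: the reverse complement of ATTCCCCACCTT is AAGGTGGGGAAT.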

Conclusions

The assembled and complete genome sequences of two strains representing the pink-pigmented facultative methylotrophs of the genus Methylobacterium reveal extensive genome-wide homology and gene synteny. Genomic determinants of methylotrophy are quasi-identical between the two strains, with the exception of the methylamine utilization cluster unique to strain AM1 and of the DCM utilization cluster unique to strain DM4. Still, the two strains differ in genome size and number of replicons, and feature a set of strain-specific genes, mostly of unknown function. The large number and extensive diversity of IS elements in Methylobacterium genomes, along with the often clustered organization of genes for utilization of C1 compounds, suggest that genome rearrangements and horizontal gene transfer, most often associated with IS elements, represent key mechanisms of Methylobacterium evolution relating to growth-supporting nutrients and environmental conditions. The co-linearity of the two genomes and the absence of substantial large-scale sequence rearrangements are all the more striking in this context, and may indicate that purifying selection sets strong constraints against major alterations of the genome structure in Methylobacterium, despite the long laboratory history of the two strains, usually grown with different carbon sources (methanol for strain AM1 and DCM for strain DM4). These two genome sequences thus afford a refined picture of the potential of Methylobacterium for physiological flexibility and adaptation to specific environmental constraints within a conserved genomic framework, and provide the basis for renewed, systems-level experimental investigations.

Table 4. Island integration determinants (iid) associated with genomic islands in completed microbial genomes

| Characteristic | M. extorquens DM4 | M. extorquens PA1 | Mesorhizobium loti | Nitrobacter hamburgensis X14 |
|---|---|---|---|---|
| Genome accession number | XXXXXX | NC_010172 | NC_002678 | NC_007964 |
| Disrupted CDS (a): 5'-end fragment | METDI2550 (100) | Mext1904 (99) | Mll4733 (70) | Nham144 (75) |
| Disrupted CDS (a): insertion position | 329 | 215 | 214 | 214 |
| Disrupted CDS (a): 3'-end fragment | METDI2684 (99) | Mext1923 (98) | Mll4667 (70) | Nham130 (73) |
| Associated recombinase (Rec) (b) | METDI2551 (+, 100) | Mext1905 (+, 28) | Mlr4668 (+, 28) | Nham143 (+, 28) |
| Genomic island organization (c) | DR-DR2-PAL-Rec-[Insert]-PAL/DR2-DR | DR/PAL-Rec-[Insert]-PAL/DR | DR/PAL1-[Insert]-Rec-PAL2/DR | DR-PAL-Rec-[Insert]-PAL/DR |
| Direct repeat (DR) sequence | GAACC (DR); AA[T,A]AGA (DR2) | TGCTGATGA | G[C,G]CACAAT[C,G]T[G,C]CT | CATCA[C,T]TTGCTGA |
| Palindromic (PAL) sequence | ATTCCCCACCTT>X<AAGGTGGGGAAT | ATGACGTGGCCTATT>X<AATAGGCCACGTCAT | GCCACAATCT>GCTA<ACATTGTGGC (PAL1); GTGCAGTATTAAA>X<TTTAATACGGCACA (PAL2) | GTGATGCTACATTAA>X<TTAATGTAGCATCAC |
| Associated genomic island: size (kb) | 126.4 | 18.4 | 50.5 | 17 |
| Associated genomic island: %GC (d) | 60.5 (68.1) | 69.7 (68.2) | 56.2 (62.8) | 58.2 (61.7) |
| Associated genomic island: number of CDS | 127 | 18 | 65 | 13 |
| Associated genomic island: proposed role | DCM degradation | Arsenite resistance | unknown | unknown |

(a) CDS number, % identity at protein level and genomic island insertion position relative to the intact island integration determinant CDS META1_1797 of M. extorquens AM1.
(b) CDS number, orientation relative to the upstream island integration element and % identity at protein level with the recombinase of strain DM4.
(c) DR: direct repeat; PAL: palindrome; Rec: recombinase. Slashes indicate sequence overlap; > and < indicate end and beginning of mirror palindromic sequences. In M. loti, the two palindromic sequence pairs are only separated by three and four bases, respectively. Overlapping segments of direct repeat and palindromic sequences are indicated in bold or underlined (5'-end or 3'-end of the genomic island, respectively). Bases between square brackets indicate alternative bases in the corresponding motif.
(d) Genome %GC content given in brackets.

Materials and Methods

Sequencing, assembly, and validation of the genome of M. extorquens AM1. Sequence data were obtained by whole genome shotgun sequencing as previously described [86]. BigDye terminator chemistry and capillary DNA sequencers (model 3700, Applied Biosystems) were used. Randomly picked, blunt end-cloned, small insert pUC19 vector-based plasmids (average insert size ~3 kb) were sequenced at both ends using universal forward and reverse sequencing primers, according to standard protocols established at the University of Washington Genome Center (UWGC). In addition, a large insert fosmid library was constructed from Sau3A partially digested genomic DNA cloned in BamHI-digested pFOS1 vector. About 1,920 randomly picked fosmid clones were end-sequenced and the data pooled with the small insert shotgun sequence data. Sequence data were assembled and visualized using Phred/Phrap/Consed software (www.phrap.com). Sequence quality and assembly were improved by carrying out several rounds of experiments designed by the Autofinish tool in Consed [87]. Manual finishing involved (a) use of specialized sequencing chemistries to sequence difficult regions; (b) PCR amplification and sequencing of specific targeted regions; (c) transposon mutagenesis of over 110 small insert clones followed by sequencing to fix misassembled or difficult-to-assemble regions; and (d) shotgun sequencing of 58 targeted fosmid clones to fix long-range misassemblies in the assembled genome. The consensus sequences from the transposon-mutagenized small insert clones and the shotgun-sequenced fosmid clones were used as backbones in the main genome assembly to resolve misassembled regions. The final strain AM1 genome assembly contained a total of 132,942 sequence reads, as well as the backbones from 58 fosmids and over 110 transposon-mutagenized small insert clones, and was validated by two independent methods. 
The gross-scale, long-range validity of the genome assembly was established by pulsed-field gel electrophoresis, with complete agreement between the virtual and experimentally determined fingerprint patterns of the final assembled genome, either by single restriction enzyme digestion with PmeI or SwaI or by double digestion with a mixture of PmeI and SwaI (data not shown). For kb-scale validation of the genome assembly, fingerprint data were generated from 1,673 of the paired-end-sequenced fosmid clones by digestion with three independent restriction enzymes, FspI, NcoI and SphI. The fosmid paired-end sequences and experimentally derived fingerprint data were used for assembly validation by comparison with the virtual fingerprint patterns from the assembled genome, using the SeqTile software tools developed for this purpose at UWGC [86]. The fosmid paired-end reads anchored each clone to a unique position in the genome, while the fingerprint data were used to compare experimentally derived fingerprints with the sequence-derived virtual patterns. Complete correspondence between the virtual and experimentally derived fingerprint patterns of the genome in the three restriction enzyme domains of FspI, NcoI and SphI was observed, thus validating the genome assembly.
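The principle of comparing virtual and experimental fingerprints can be illustrated with a short sketch. This is not the SeqTile implementation; the function names and the 5% size tolerance are assumptions made for illustration only. The recognition sites and cut offsets are the standard ones for the five enzymes named above. For a circular sequence, sites spanning the origin are ignored in this simplified version.

```python
import re

# Recognition site and cut offset (bases from the site start) per enzyme.
ENZYMES = {
    "PmeI": ("GTTTAAAC", 4),
    "SwaI": ("ATTTAAAT", 4),
    "FspI": ("TGCGCA", 3),
    "NcoI": ("CCATGG", 1),
    "SphI": ("GCATGC", 5),
}


def cut_positions(seq, enzymes):
    """Sorted cut coordinates for the given enzymes (single or double digest)."""
    cuts = set()
    for name in enzymes:
        site, offset = ENZYMES[name]
        # A lookahead pattern also reports overlapping occurrences of the site.
        for match in re.finditer(f"(?={site})", seq):
            cuts.add(match.start() + offset)
    return sorted(cuts)


def virtual_digest(seq, enzymes, circular=False):
    """Sorted fragment lengths predicted from the sequence."""
    seq = seq.upper()
    cuts = cut_positions(seq, enzymes)
    if not cuts:
        return [len(seq)]
    if circular:
        frags = [b - a for a, b in zip(cuts, cuts[1:])]
        frags.append(len(seq) - cuts[-1] + cuts[0])  # fragment across the origin
    else:
        bounds = [0] + cuts + [len(seq)]
        frags = [b - a for a, b in zip(bounds, bounds[1:])]
    return sorted(frags)


def fingerprints_match(virtual, observed, rel_tol=0.05):
    """Compare fragment sizes pairwise; rel_tol is a hypothetical gel tolerance."""
    if len(virtual) != len(observed):
        return False
    pairs = zip(sorted(virtual), sorted(observed))
    return all(abs(v - o) <= rel_tol * v for v, o in pairs)
```

A double digest is obtained by passing both enzyme names, for example `virtual_digest(genome, ["PmeI", "SwaI"], circular=True)`, where `genome` is a placeholder for the assembled sequence string.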

Genome sequencing, assembly and validation of the genome of strain DM4. The complete sequence of the genome of strain DM4 was obtained using three different libraries. Genomic DNA was fragmented by mechanical shearing, and 3 kb (A) and 10 kb (B) inserts were cloned into plasmid vectors pNAV (a derivative of pcDNA2.1, Invitrogen) and pCNS (a pSU18 derivative), respectively. In addition, a large insert BAC library (25 kb inserts, C) was constructed from Sau3A partially digested total DNA by cloning into pBeloBAC11. Plasmid DNAs were purified and end-sequenced (79,200 (A), 27,648 (B) and 13,056 (C) paired match end-reads, respectively) using dye-terminator chemistry on ABI3730 sequencers. Assembly was performed as described [88] with the Phred/Phrap/Consed software package (www.phrap.com). An additional 2,170 sequences from selected clones were used in the finishing phase of assembly.

Genome annotation and bioinformatic analysis. Coding sequences were predicted using the AMIGene (Annotation of Microbial Genomes) software [89] and then submitted to automatic functional annotation using the set of tools listed in [45]. Putative orthology relationships between the two genomes were defined by gene pairs satisfying either the Bidirectional Best Hit criterion [90] or an alignment threshold (at least 40% sequence identity over at least 80% of the length of the smaller protein). These relationships were subsequently used to search for conserved gene clusters (synteny groups) among several bacterial genomes using an algorithm based on an exact graph-theoretical approach [91]. This method allowed for multiple correspondences between genes, and for detection of paralogy relationships, gene fusions, and chromosomal rearrangements (inversion, insertion/deletion). The 'gap' parameter, representing the maximum number of consecutive genes that are not involved in a synteny group, was set to five. Manual validation of automatic annotations was performed in a relational database (MethylobacScope, https://www.genoscope.cns.fr/agc/mage/wwwpkgdb/Login/log.php?pid=26) using the MaGe web interface [45], which allows graphic visualization of the annotations enhanced by a synchronized representation of synteny groups in other genomes chosen for comparison. Genomes were checked for the presence of genes without homologs in the parent genome using thresholds of 80% sequence identity at the protein level and 80% of the length of the shorter homolog (minLrap 0.8). Chromosomal genes of potentially foreign origin were detected using Alien Hunter [46]. Potential genomic islands were searched for with the RGP (Region of Genomic Plasticity) tool of the MaGe web-based interface [45] based on synteny breaks between compared genomes, and the predicted regions were then checked manually. Only regions larger than 8 kb are reported here. 
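The two orthology criteria described above can be sketched as follows. This is an illustrative reconstruction, not the MaGe code: the `Alignment` record, the use of an alignment score to rank hits, and the definition of coverage as aligned length divided by the length of the shorter protein are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alignment:
    query: str          # protein ID in genome A (or B for the reverse search)
    subject: str        # protein ID in the other genome
    identity: float     # percent identity over the aligned region
    aligned_len: int    # number of aligned residues
    query_len: int
    subject_len: int
    score: float        # alignment score used to rank hits (assumed)


def passes_threshold(a, min_identity=40.0, min_coverage=0.8):
    """At least 40% identity over at least 80% of the shorter protein."""
    shorter = min(a.query_len, a.subject_len)
    return a.identity >= min_identity and a.aligned_len / shorter >= min_coverage


def best_hits(alignments):
    """Map each query to its highest-scoring subject."""
    best = {}
    for a in alignments:
        if a.query not in best or a.score > best[a.query].score:
            best[a.query] = a
    return {query: a.subject for query, a in best.items()}


def putative_orthologs(a_vs_b, b_vs_a):
    """Pairs (A, B) that are bidirectional best hits or pass the threshold."""
    forward, reverse = best_hits(a_vs_b), best_hits(b_vs_a)
    pairs = {(q, s) for q, s in forward.items() if reverse.get(s) == q}
    pairs |= {(a.query, a.subject) for a in a_vs_b if passes_threshold(a)}
    return pairs
```

Because the threshold criterion is applied independently of the best-hit criterion, one gene can be paired with several partners, which is consistent with the multiple correspondences mentioned in the text.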
IS annotations were done with in-house computational tools (Robinson, Lee, Marx, unpublished) that incorporated IScan [92], followed by manual validation based on ISfinder [77]. IS elements were given names of type "ISMex3", with "Mex" (for M. extorquens) and "Mdi" (for Methylobacterium degrading dichloromethane) indicating strains AM1 or DM4, respectively. The same type name was used for both strains for IS elements with >95% identity in protein sequence. An intact copy was defined as a sequence whose length was at least 99% of the length of the longest copy detected, and a partial IS was defined as a >500 bp fragment with >80% DNA identity to an intact copy.
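The IS classification rules stated above translate into a few comparisons. The sketch below is illustrative only; the function names and the "unclassified" label for sequences meeting neither definition are assumptions, not part of the in-house tools.

```python
def classify_is_copy(length_bp, longest_copy_bp, dna_identity_pct):
    """Classify one IS copy using the length and identity cut-offs from the text.

    length_bp:        length of the copy being classified
    longest_copy_bp:  length of the longest copy detected for this IS type
    dna_identity_pct: percent DNA identity of the copy to an intact copy
    """
    if length_bp >= 0.99 * longest_copy_bp:
        return "intact"
    if length_bp > 500 and dna_identity_pct > 80.0:
        return "partial"
    return "unclassified"  # hypothetical label; the text does not name this case


def share_type_name(protein_identity_pct):
    """IS elements from AM1 and DM4 share a type name above 95% protein identity."""
    return protein_identity_pct > 95.0
```

For example, a 900 bp fragment with 92% DNA identity to a 1,400 bp intact copy would be reported as a partial IS.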

Acknowledgments

Elizabeth Skovran, Sandro Roselli, Romain Lang and David Lalaouna are thanked for participation in the annotation work.

References

1. Lidstrom ME (2006) Aerobic methylotrophic prokaryotes. In: Dworkin M, Falkow S, Rosenberg E, Schleifer K-H, Stackebrandt E, editors. The Prokaryotes, Vol 2: Ecophysiology and Biochemistry. New York: Springer-Verlag. pp. 618-634.
2. Galbally IE, Kirstine W (2002) The production of methanol by flowering plants and the global cycle of methanol. J Atmos Chem 43: 195-229.
3. Jourand P, Giraud E, Bena G, Sy A, Willems A, et al. (2004) Methylobacterium nodulans sp. nov., for a group of aerobic, facultatively methylotrophic, legume root-nodule-forming and nitrogen-fixing bacteria. Int J Syst Evol Microbiol 54: 2269-2273.
4. Lidstrom ME, Chistoserdova L (2002) Plants in the pink: Cytokinin production by Methylobacterium. J Bacteriol 184: 1818-1818.
5. Sy A, Timmers ACJ, Knief C, Vorholt JA (2005) Methylotrophic metabolism is advantageous for Methylobacterium extorquens during colonization of Medicago truncatula under competitive conditions. Appl Environ Microbiol 71: 7245-7252.
6. Van Aken B, Yoon JM, Schnoor JL (2004) Biodegradation of nitro-substituted explosives 2,4,6-trinitrotoluene, hexahydro-1,3,5-trinitro-1,3,5-triazine, and octahydro-1,3,5,7-tetranitro-1,3,5-tetrazocine by a phytosymbiotic Methylobacterium sp. associated with poplar tissues (Populus deltoides x nigra DN34). Appl Environ Microbiol 70: 508-517.
7. Abanda-Nkpwatt D, Musch M, Tschiersch J, Boettner M, Schwab W (2006) Molecular interaction between Methylobacterium extorquens and seedlings: growth promotion, methanol consumption, and localization of the methanol emission site. J Exp Bot 57: 4025-4032.
8. Anthony C (1982) The Biochemistry of Methylotrophs. London: Academic Press.
9. Schrader J, Schilling M, Holtmann D, Sell D, Villela Filho M, et al. (in press) Methanol-based industrial biotechnology: current status and future perspectives of methylotrophic bacteria. Trends Biotechnol.
10. Peel D, Quayle JR (1961) Microbial growth on C1 compounds. I. Isolation and characterization of Pseudomonas AM1. Biochem J 81: 465-469.
11. Chistoserdova L, Chen SW, Lapidus A, Lidstrom ME (2003) Methylotrophy in Methylobacterium extorquens AM1 from a genomic point of view. J Bacteriol 185: 2980-2987.
12. Erb TJ, Berg IA, Brecht V, Muller M, Fuchs G, et al. (2007) Synthesis of C-5-dicarboxylic acids from C-2-units involving crotonyl-CoA carboxylase/reductase: The ethylmalonyl-CoA pathway. Proc Natl Acad Sci USA 104: 10631-10636.
13. Peyraud R, Kiefer P, Christen P, Massou S, Portais J-C, et al. Demonstration of the ethylmalonyl-CoA pathway using 13C metabolomics. Proc Natl Acad Sci USA under revision.
14. Afolabi PR, Mohammed F, Amaratunga K, Majekodunmi O, Dales SL, et al. (2001) Site-directed mutagenesis and X-ray crystallography of the PQQ-containing quinoprotein methanol dehydrogenase and its electron acceptor, cytochrome cL. Biochemistry 40: 9799-9809.
15. Williams PA, Coates L, Mohammed F, Gill R, Erskine PT, et al. (2005) The atomic resolution structure of methanol dehydrogenase from Methylobacterium extorquens. Acta Crystallogr Sect D-Biol Crystallogr 61: 75-79.
16. Chistoserdov AY, Chistoserdova LV, McIntire WS, Lidstrom ME (1994) Genetic organization of the mau gene cluster in Methylobacterium extorquens AM1: complete nucleotide sequence and generation and characteristics of mau mutants. J Bacteriol 176: 4052-4065.
17. Davidson VL (2001) Pyrroloquinoline quinone (PQQ) from methanol dehydrogenase and tryptophan tryptophylquinone (TTQ) from methylamine dehydrogenase. Adv Protein Chem 58: 95-140.
18. Chistoserdova L, Jenkins C, Kalyuzhnaya MG, Marx CJ, Lapidus A, et al. (2004) The enigmatic Planctomycetes may hold a key to the origins of methanogenesis and methylotrophy. Mol Biol Evol 21: 1234-1241.
19. Vorholt JA, Chistoserdova L, Stolyar SM, Thauer RK, Lidstrom ME (1999) Distribution of tetrahydromethanopterin-dependent enzymes in methylotrophic bacteria and phylogeny of methenyl tetrahydromethanopterin cyclohydrolases. J Bacteriol 181: 5750-5757.
20. Chistoserdova L, Vorholt JA, Thauer RK, Lidstrom ME (1998) C-1 transfer enzymes and coenzymes linking methylotrophic bacteria and methanogenic Archaea. Science 281: 99-102.
21. Vorholt JA, Marx CJ, Lidstrom ME, Thauer RK (2000) Novel formaldehyde-activating enzyme in Methylobacterium extorquens AM1 required for growth on methanol. J Bacteriol 182: 6645-6650.
22. Vorholt JA, Chistoserdova L, Lidstrom ME, Thauer RK (1998) The NADP-dependent methylene tetrahydromethanopterin dehydrogenase in Methylobacterium extorquens AM1. J Bacteriol 180: 5351-5356.
23. Pomper BK, Vorholt JA, Chistoserdova L, Lidstrom ME, Thauer RK (1999) A methenyl tetrahydromethanopterin cyclohydrolase and a methenyl tetrahydrofolate cyclohydrolase in Methylobacterium extorquens AM1. Eur J Biochem 261: 475-480.
24. Marx CJ, Van Dien SJ, Lidstrom ME (2005) Flux analysis uncovers key role of functional redundancy in formaldehyde metabolism. PLoS Biol 3: 244-253.
25. Crowther GJ, Kosály G, Lidstrom ME (2008) Formate as the main branch point for methylotrophic metabolism in Methylobacterium extorquens AM1. J Bacteriol 190: 5057-5062.
26. Okubo Y, Skovran E, Guo XF, Sivam D, Lidstrom ME (2007) Implementation of microarrays for Methylobacterium extorquens AM1. OMICS 11: 325-340.
27. Bosch G, Skovran E, Xia Q, Wang T, Taub F, et al. (2008) Comprehensive proteomics of Methylobacterium extorquens AM1 metabolism under single carbon and nonmethylotrophic conditions. Proteomics 8: 3494-3505.
28. Guo XF, Lidstrom ME (2008) Metabolite profiling analysis of Methylobacterium extorquens AM1 by comprehensive two-dimensional gas chromatography coupled with time-of-flight mass spectrometry. Biotechnol Bioeng 99: 929-940.
29. Kiefer P, Portais J-C, Vorholt JA (2008) Quantitative metabolome analysis using liquid chromatography-high-resolution mass spectrometry. Anal Biochem 382: 94-100.
30. Marx C (2008) Development of a broad-host-range sacB-based vector for unmarked allelic exchange. BMC Research Notes 1: 1.
31. Marx CJ, Lidstrom ME (2001) Development of improved versatile broad-host-range vectors for use in methylotrophs and other Gram-negative bacteria. Microbiology 147: 2065-2075.
32. Gälli R, Leisinger T (1985) Specialized bacterial strains for the removal of dichloromethane from industrial waste. Conservation and Recycling 8: 91-100.
33. Khalil MAK, Moore RM, Harper DB, Lobert JM, Erickson DJ, et al. (1999) Natural emissions of chlorine-containing gases: Reactive Chlorine Emissions Inventory. J Geophys Res-Atmos 104: 8333-8346.
34. McCulloch A, Aucott ML, Graedel TE, Kleiman G, Midgley PM, et al. (1999) Industrial emissions of trichloroethene, tetrachloroethene, and dichloromethane: Reactive Chlorine Emissions Inventory. J Geophys Res-Atmos 104: 8417-8427.
35. Keith LH, Telliard WA (1979) Priority pollutants I - a perspective view. Environ Sci Technol 13: 416-423.
36. Vuilleumier S (2002) Coping with a halogenated one-carbon diet: aerobic dichloromethane-mineralising bacteria. In: Reineke W, Agathos S, editors. Biotechnology for the environment, Focus on Biotechnology Series. Dordrecht: Kluwer Academic Publishers. pp. 105-131.
37. Vuilleumier S, Ivoš N, Dean M, Leisinger T (2001) Sequence variation in dichloromethane dehalogenases/glutathione S-transferases. Microbiology 147: 611-619.
38. Starr TB, Matanoski G, Anders MW, Andersen ME (2006) Workshop overview: Reassessment of the cancer risk of dichloromethane in humans. Toxicological Sciences 91: 20-28.
39. Gisi D, Leisinger T, Vuilleumier S (1999) Enzyme-mediated dichloromethane toxicity and mutagenicity of bacterial and mammalian dichloromethane-active glutathione S-transferases. Arch Toxicol 73: 71-79.
40. Kayser MF, Vuilleumier S (2001) Dehalogenation of dichloromethane by dichloromethane dehalogenase/glutathione S-transferase leads to the formation of DNA adducts. J Bacteriol 183: 5209-5212.
41. Doronina NV, Trotsenko YA, Tourova TP, Kuznetsov BB, Leisinger T (2000) Methylopila helvetica sp. nov. and Methylobacterium dichloromethanicum sp. nov. - Novel aerobic facultatively methylotrophic bacteria utilizing dichloromethane. Syst Appl Microbiol 23: 210-218.
42. Kato Y, Asahara M, Arai D, Goto K, Yokota A (2005) Reclassification of Methylobacterium chloromethanicum and Methylobacterium dichloromethanicum as later subjective synonyms of Methylobacterium extorquens and of Methylobacterium lusitanum as a later subjective synonym of Methylobacterium rhodesianum. J Gen Appl Microbiol 51: 287-299.
43. Necsulea A, Lobry JR (2007) A new method for assessing the effect of replication on DNA base composition asymmetry. Mol Biol Evol 24: 2169-2179.
44. Tatusov RL, Fedorova ND, Jackson JD, Jacobs AR, Kiryutin B, et al. (2003) The COG database: an updated version includes eukaryotes. BMC Bioinformatics 4: 41.
45. Vallenet D, Labarre L, Rouy Z, Barbe V, Bocs S, et al. (2006) MaGe: a microbial genome annotation system supported by synteny results. Nucleic Acids Res 34: 53-65.
46. Vernikos GS, Parkhill J (2006) Interpolated variable order motifs for identification of horizontally acquired DNA: revisiting the Salmonella pathogenicity islands. Bioinformatics 22: 2196-2203.
47. Fukuda N, Granzier HL (2005) Titin/connectin-based modulation of the Frank-Starling mechanism of the heart. J Muscle Res Cell Motil 26: 319-323.
48. Nieto-Peñalver CG, Cantet F, Morin D, Haras D, Vorholt JA (2006) A plasmid-borne truncated luxI homolog controls quorum-sensing systems and extracellular carbohydrate production in Methylobacterium extorquens AM1. J Bacteriol 188: 7321-7324.
49. Chistoserdova L, Kalyuzhnaya MG, Lidstrom ME (2005) C1-transfer modules: from genomics to ecology. ASM News 71: 521-528.
50. Galperin MY, Walker DR, Koonin EV (1998) Analogous enzymes: independent inventions in enzyme evolution. Genome Res 8: 779-790.
51. Chistoserdova L, Lidstrom ME (1997) Molecular and mutational analysis of a DNA region separating two methylotrophy gene clusters in Methylobacterium extorquens AM1. Microbiology 143: 1729-1736.
52. Denef VJ, Patrauchan MA, Florizone C, Park J, Tsoi TV, et al. (2005) Growth substrate- and phase-specific expression of biphenyl, benzoate, and C-1 metabolic pathways in Burkholderia xenovorans LB400. J Bacteriol 187: 7996-8005.
53. Wilson SM, Gleisten MP, Donohue TJ (2008) Identification of proteins involved in formaldehyde metabolism by Rhodobacter sphaeroides. Microbiology 154: 296-305.
54. Ward N, Larsen O, Sakwa J, Bruseth L, Khouri H, et al. (2004) Genomic insights into methanotrophy: The complete genome sequence of Methylococcus capsulatus (Bath). PLoS Biol 2: 1616-1628.
55. Kane SR, Chakicherla AY, Chain PSG, Schmidt R, Shin MW, et al. (2007) Whole-genome analysis of the methyl tert-butyl ether-degrading beta-proteobacterium Methylibium petroleiphilum PM1. J Bacteriol 189: 1931-1945.
56. Chistoserdova L, Lapidus A, Han C, Goodwin L, Saunders L, et al. (2007) Genome of Methylobacillus flagellatus, molecular basis for obligate methylotrophy, and polyphyletic origin of methylotrophy. J Bacteriol 189: 4020-4027.
57. Giovannoni SJ, Hayakawa DH, Tripp HJ, Stingl U, Givan SA, et al. (2008) The small genome of an abundant coastal ocean methylotroph. Environmental Microbiol 10: 1771-1782.
58. Moran MA, Buchan A, Gonzalez JM, Heidelberg JF, Whitman WB, et al. (2004) Genome sequence of Silicibacter pomeroyi reveals adaptations to the marine environment. Nature 432: 910-913.
59. Greenberg DE, Porcella SF, Zelazny AM, Virtaneva K, Sturdevant DE, et al. (2007) Genome sequence analysis of the emerging human pathogenic acetic acid bacterium Granulibacter bethesdensis. J Bacteriol 189: 8727-8736.
60. Greenberg DE, Porcella SF, Stock F, Wong A, Conville PS, et al. (2006) Granulibacter bethesdensis gen. nov., sp. nov., a distinctive pathogenic acetic acid bacterium in the family Acetobacteraceae. Int J Syst Evol Microbiol 56: 2609-2616.
61. Hou S, Makarova KS, Saw JH, Senin P, Ly BV, et al. (2008) Complete genome sequence of the extremely acidophilic methanotroph isolate V4, "Methylacidiphilum infernorum", a representative of the bacterial phylum Verrucomicrobia. Biol Direct 3: 26.
62. Kalyuzhnaya MG, Hristova KR, Lidstrom ME, Chistoserdova L (2008) Characterization of a novel methanol dehydrogenase in representatives of Burkholderiales: Implications for environmental detection of methylotrophy and evidence for convergent evolution. J Bacteriol 190: 3817-3823.
63. Chistoserdova L, Vorholt J, Thauer R, Lidstrom M (1998) C1 transfer enzymes and coenzymes linking methylotrophic bacteria and methanogenic Archaea. Science 281: 99-102.
64. Marx CJ, Chistoserdova L, Lidstrom ME (2003) Formaldehyde-detoxifying role of the tetrahydromethanopterin-linked pathway in Methylobacterium extorquens AM1. J Bacteriol 185: 7160-7168.
65. Bauer M, Lombardot T, Teeling H, Ward NL, Amann R, et al. (2004) Archaea-like genes for C-1-transfer enzymes in Planctomycetes: Phylogenetic implications of their unexpected presence in this phylum. J Mol Evol 59: 571-586.
66. Marx CJ, Miller JA, Chistoserdova L, Lidstrom ME (2004) Multiple formaldehyde oxidation/detoxification pathways in Burkholderia fungorum LB400. J Bacteriol 186: 2173-2178.
67. Chistoserdova L, Gomelsky L, Vorholt JA, Gomelsky M, Tsygankov YD, et al. (2000) Analysis of two formaldehyde oxidation pathways in Methylobacillus flagellatus KT, a ribulose monophosphate cycle methylotroph. Microbiology 146: 233-238.
68. Ras J, van Ophem PW, Reijnders WN, van Spanning RJ, Duine JA, et al. (1995) Isolation, sequencing, and mutagenesis of the gene encoding NAD- and glutathione-dependent formaldehyde dehydrogenase (GD-FALDH) from Paracoccus denitrificans, in which GD-FALDH is essential for methylotrophic growth. J Bacteriol 177: 247-251.
69. Chistoserdova L, Crowther GJ, Vorholt JA, Skovran E, Portais JC, et al. (2007) Identification of a fourth formate dehydrogenase in Methylobacterium extorquens AM1 and confirmation of the essential role of formate oxidation in methylotrophy. J Bacteriol 189: 9076-9081.
70. Chistoserdova LV, Lidstrom ME (1994) Genetics of the serine cycle in Methylobacterium extorquens AM1: identification, sequence, and mutation of three new genes involved in C1 assimilation, orf4, mtkA, and mtkB. J Bacteriol 176: 7398-7404.
71. Marx CJ, O'Brien BN, Breezee J, Lidstrom ME (2003) Novel methylotrophy genes of Methylobacterium extorquens AM1 identified by using transposon mutagenesis including a putative dihydromethanopterin reductase. J Bacteriol 185: 669-673.
72. Marx CJ, Lidstrom ME (2004) Development of an insertional expression vector system for Methylobacterium extorquens AM1 and generation of null mutants lacking mtdA and/or fch. Microbiology 150: 9-19.
73. Marx CJ, Laukel M, Vorholt JA, Lidstrom ME (2003) Purification of the formate-tetrahydrofolate ligase from Methylobacterium extorquens AM1 and demonstration of its requirement for methylotrophic growth. J Bacteriol 185: 7169-7175.
74. Kalyuzhnaya MG, De Marco P, Bowerman S, Pacheco CC, Lara JC, et al. (2006) Methyloversatilis universalis gen. nov., sp nov., a novel taxon within the Betaproteobacteria represented by three methylotrophic isolates. Int J Syst Evol Microbiol 56: 2517-2522.
75. Kalyuzhnaya MG, Lidstrom ME (2005) QscR-mediated transcriptional activation of serine cycle genes in Methylobacterium extorquens AM1. J Bacteriol 187: 7511-7517.
76. Siguier P, Filée J, Chandler M (2006) Insertion sequences in prokaryotic genomes. Curr Op Microbiol 9: 526-531.
77. Siguier P, Perochon J, Lestrade L, Mahillon J, Chandler M (2006) ISfinder: the reference centre for bacterial insertion sequences. Nucleic Acids Res 34: D32-D36.
78. Chen Y, Zhou F, Li G, Xu Y (2008) A recently active miniature inverted-repeat transposable element, chunjie, inserted into an operon without disturbing the operon structure in Geobacter uraniireducens Rf4. Genetics 179: 2291-2297.
79. Chandler M, Mahillon J (2002) Insertion sequences revisited. In: Craig NL, Craigie R, Gellert M, Lambowitz AM, editors. Mobile DNA II. Washington DC: ASM Press. pp. 305-366.
80. Schmid-Appert M, Zoller K, Traber H, Vuilleumier S, Leisinger T (1997) Association of newly discovered IS elements with the dichloromethane utilization genes of methylotrophic bacteria. Microbiology 143: 2557-2567.
81. Kalyuzhnaya MG, Lapidus A, Ivanova N, Copeland AC, McHardy AC, et al. (2008) High-resolution metagenomics targets specific functional types in complex microbial communities. Nat Biotech 26: 1029-1034.
82. Neff JC, Holland EA, Dentener FJ, McDowell WH, Russell KM (2002) The origin, composition and rates of organic nitrogen deposition: a missing piece of the nitrogen cycle? Biogeochemistry 57/58: 99-136.
83. Cevallos MA, Cervantes-Rivera R, Gutierrez-Rios RM (2008) The repABC plasmid family. Plasmid 60: 19-37.
84. Starkenburg SR, Larimer FW, Stein LY, Klotz MG, Chain PSG, et al. (2008) Complete genome sequence of Nitrobacter hamburgensis X14 and comparative genomic analysis of species within the genus Nitrobacter. Appl Environ Microbiol 74: 2852-2863.
85. Kayser MF, Ucurum Z, Vuilleumier S (2002) Dichloromethane metabolism and C1 utilization genes in Methylobacterium strains. Microbiology 148: 1915-1922.
86. Rohmer L, Fong C, Abmayr S, Wasnick M, Freeman TJL, et al. (2007) Comparison of Francisella tularensis genomes reveals evolutionary events associated with the emergence of human pathogenic strains. Genome Biol 8: R102.
87. Gordon D, Desmarais C, Green P (2001) Automated finishing with Autofinish. Genome Res 11: 614-625.
88. Vallenet D, Nordmann P, Barbe V, Poirel L, Mangenot S, et al. (2008) Comparative analysis of Acinetobacters: Three genomes for three lifestyles. PLoS ONE 3: e1805.
89. Bocs S, Cruveiller S, Vallenet D, Nuel G, Medigue C (2003) AMIGene: Annotation of MIcrobial genes. Nucleic Acids Res 31: 3723-3726.
90. Overbeek R, Fonstein M, D'Souza M, Pusch GD, Maltsev N (1999) The use of gene clusters to infer functional coupling. Proc Natl Acad Sci USA 96: 2896-2901.
91. Boyer F, Morgat A, Labarre L, Pothier J, Viari A (2005) Syntons, metabolons and interactons: an exact graph-theoretical approach for exploring neighbourhood between genomic and functional data. Bioinformatics 21: 4209-4215.
92. Wagner A, Lewis C, Bichsel M (2007) A survey of bacterial insertion sequences using IScan. Nucleic Acids Res 35: 5284-5293.

Supporting Information

Supplementary Table 1. Methylotrophy genes in M. extorquens AM1 and DM4

| Gene | Function | AM1 | DM4 | Identity (%) | C1 spec. |
|---|---|---|---|---|---|
| pccB | Propionyl-CoA carboxylase, beta subunit | META1_0172 | METDI0155 | 99.8 | M |
| ccr | Crotonyl-CoA carboxylase/reductase | META1_0178 | METDI0161 | 100.0 | M |
| ecm | Ethylmalonyl-CoA mutase | META1_0180 | METDI0163 | 99.6 | M |
| meaB | Methylmalonyl-CoA mutase accessory protein | META1_0188 | METDI0171 | 99.7 | M |
| fdh3A | Cytochrome-linked formate dehydrogenase, alpha subunit | META1_0303 | METDI0454 | 99.8 | M |
| fdh3B | Cytochrome-linked formate dehydrogenase, beta subunit | META1_0304 | METDI0455 | 100.0 | M |
| fdh3C | Cytochrome-linked formate dehydrogenase, gamma subunit | META1_0305 | METDI0456 | 99.4 | R |
| ftfL | Formyl-tetrahydrofolate ligase | META1_0329 | METDI0483 | 99.5 | M |
| qscR | Serine cycle transcriptional regulator | META1_0756 | METDI1125 | 99.1 | M |
| epi | Ethylmalonyl-CoA/methylmalonyl-CoA epimerase | META1_0839 | METDI1550 | 100.0 | M |
| meaD | ATP:cob(I)alamin adenosyltransferase | META1_1433 | METDI2206 | 100.0 | M |
| mdh | Malate dehydrogenase | META1_1537 | METDI2311 | 99.7 | E |
| sga | Serine glyoxylate aminotransferase | META1_1726 | METDI2478 | 100.0 | M |
| hpr | Hydroxypyruvate reductase | META1_1727 | METDI2479 | 99.7 | M |
| mtdA | Bifunctional methylene-H4MPT/methylene-H4F dehydrogenase | META1_1728 | METDI2480 | 100.0 | M |
| fch | Methenyl-H4F cyclohydrolase | META1_1729 | METDI2481 | 100.0 | M |
| mtkA | Malate thiokinase, alpha subunit | META1_1730 | METDI2482 | 99.7 | M |
| mtkB | Malate thiokinase, beta subunit | META1_1731 | METDI2483 | 100.0 | M |
| ppc | PEP carboxylase | META1_1732 | METDI2484 | 99.7 | M |
| mcl | Malyl-CoA lyase/beta-malyl-CoA lyase | META1_1733 | METDI2485 | 100.0 | M |
| | Hypothetical protein | META1_1734 | METDI2486 | 100.0 | NA |
| | Putative carboxymethylenebutenolidase | META1_1735 | METDI2487 | 96.2 | |
| pcbD | Putative pterin-4-alpha-carbinolamine dehydratase | META1_1736 | METDI2488 | 99.2 | NA |
| | ABC type transporter, substrate-binding subunit | META1_1737 | METDI2489 | 100.0 | |
| | ABC type transporter, permease subunit | META1_1738 | METDI2490 | 98.8 | |
| | ABC type transporter, ATP-binding subunit | META1_1739 | METDI2491 | 99.6 | |
| xoxF | Homolog of mxaF | META1_1740 | METDI2492 | 100.0 | |
| xoxG | Cytochrome c | META1_1741 | METDI2493 | 100.0 | |
| xoxJ | Homolog of mxaJ | META1_1742 | METDI2494 | 98.9 | |
| folK | 2-amino-4-hydroxy-6-hydroxymethyl-7,8-dihydropterin pyrophosphokinase | META1_1743 | METDI2495 | 98.7 | NA |
| folB | Dihydroneopterin aldolase | META1_1744 | METDI2496 | 99.2 | NA |
| folP | Dihydropteroate synthase | META1_1745 | METDI2497 | 99.3 | E |
| | Hypothetical protein | META1_1746 | METDI2498 | 99.2 | |
| | Hypothetical protein | META1_1747 | METDI2499 | 99.1 | |
| pqqE | PQQ synthesis | META1_1748 | METDI2500 | 99.5 | M |
| pqqC/D | PQQ synthesis | META1_1749 | METDI2501 | 98.1 | M |
| pqqB | PQQ synthesis | META1_1750 | METDI2502 | 98.7 | M |
| pqqA | PQQ synthesis | META1_1751 | METDI2503 | 100.0 | |
| mxbM | Transcriptional regulator | META1_1752 | METDI2504 | 100.0 | M |
| mxbD | Sensor kinase | META1_1753 | METDI2505 | 99.6 | M |
| | Hypothetical protein | META1_1754 | METDI2506 | 98.5 | |
| fhcC | Formyltransferase/hydrolase complex gamma subunit | META1_1755 | METDI2507 | 97.7 | M |
| fhcD | Formyltransferase/hydrolase complex delta subunit | META1_1756 | METDI2508 | 99.7 | M |
| fhcA | Formyltransferase/hydrolase complex alpha subunit | META1_1757 | METDI2509 | 99.8 | M |
| fhcB | Formyltransferase/hydrolase complex beta subunit | META1_1758 | METDI2510 | 99.7 | M |
| mptG | Ribofuranosylaminobenzene 5'-phosphate synthase | META1_1760 | METDI2512 | 99.1 | M |
| mtdB | Methylene-H4MPT dehydrogenase | META1_1761 | METDI2513 | 100.0 | M |
| orfY | Unknown | META1_1762 | METDI2514 | 95.7 | M |
| mch | Methenyl-H4MPT cyclohydrolase | META1_1763 | METDI2515 | 99.7 | M |
| orf5 | H4MPT biosynthesis | META1_1764 | METDI2516 | 99.3 | M |
| orf7 | Unknown | META1_1765 | METDI2517 | 99.3 | M |
| fae | Formaldehyde activating enzyme | META1_1766 | METDI2518 | 100.0 | M |
| orf17 | Unknown | META1_1767 | METDI2519 | 96.8 | M |
| orf9 | H4MPT biosynthesis | META1_1768 | METDI2520 | 97.7 | M |
| | Homolog of mxaE | META1_1770 | METDI2522 | 98.9 | |
| | Homolog of mxaD | META1_1771 | METDI2523 | 99.4 | |
| | Homolog of mxaD | META1_1772 | METDI2524 | 98.9 | |
| orf19 | H4MPT biosynthesis | META1_1773 | METDI2525 | 98.0 | M |
| orf20 | H4MPT biosynthesis | META1_1774 | METDI2526 | 98.9 | M |
| orf21 | H4MPT biosynthesis | META1_1775 | METDI2527 | 95.6 | M |
| orf22 | H4MPT biosynthesis | META1_1776 | METDI2528 | 97.0 | M |
| dcmR | Transcriptional repressor | not present | METDI2655 | - | M |
| dcmA | Dichloromethane dehalogenase | not present | METDI2656 | - | M |
| fdh4B | Protein associated with expression of formate dehydrogenase 4 | META1_2093 | METDI2873 | 98.0 | R |
| fdh4A | Formate dehydrogenase 4 | META1_2094 | METDI2874 | 98.7 | R |
| msd | Methylsuccinyl-CoA dehydrogenase | META1_2223 | METDI3005 | 99.1 | M |
| folE | GTP cyclohydrolase | META1_2264 | METDI3046 | 100.0 | NA |
| pqqF | PQQ synthesis | META1_2330 | METDI3110 | 100.0 | M |
| pqqG | PQQ synthesis | META1_2331 | METDI3111 | 100.0 | M |
| mcmB | Methylmalonyl-CoA mutase, beta subunit | META1_2390 | METDI3170 | 97.5 | M |
| mauF | Unknown | META1_2769 | not present | - | M |
| mauB | Methylamine dehydrogenase large subunit | META1_2770 | not present | - | M |
| mauE | Essential for small subunit maturation | META1_2771 | not present | - | M |
| mauD | Essential for small subunit maturation | META1_2772 | not present | - | M |
| mauA | Methylamine dehydrogenase small subunit | META1_2773 | not present | - | M |
| mauC | Amicyanin | META1_2774 | not present | - | M |
| mauJ | Unknown | META1_2775 | not present | - | |
| mauG | Unknown | META1_2776 | not present | - | M |
| mauL | Unknown | META1_2777 | not present | - | M |
| mauM | Ferredoxin | META1_2778 | not present | - | |
| mauN | Ferredoxin | META1_2779 | not present | - | |
| folA | Dihydrofolate reductase | META1_2852 | METDI3418 | 97.6 | NA |
| fumC | Fumarase C | META1_2857 | METDI3423 | 99.4 | E |
| gck | Glycerate kinase | META1_2944 | METDI3513 | 98.9 | M |
| eno | Enolase | META1_2984 | METDI3551 | 99.8 | E |
| pccA | Propionyl-CoA carboxylase, alpha subunit | META1_3203 | METDI3767 | 99.3 | M |
| glyA | Serine hydroxymethyltransferase | META1_3384 | METDI3959 | 100.0 | M |
| croR | Crotonase | META1_3675 | METDI4248 | 100.0 | M |
| phaR | PHB synthesis regulator/acetyl-CoA flux | META1_3699 | METDI4271 | 97.5 | M |
| phaA | Beta-ketothiolase | META1_3700 | METDI4272 | 100.0 | M |
| phaB | Acetoacetyl-CoA reductase | META1_3701 | METDI4273 | 99.6 | M |
| sdhC | Succinate dehydrogenase, cytochrome b556 subunit | META1_3859 | METDI4591 | 98.5 | E |
| sdhD | Succinate dehydrogenase, membrane anchor subunit | META1_3860 | METDI4592 | 100.0 | E |
| sdhA | Succinate dehydrogenase, flavoprotein subunit | META1_3861 | METDI4593 | 99.7 | E |
| sdhB | Succinate dehydrogenase, iron-sulfur subunit | META1_3863 | METDI4595 | 99.6 | E |
| mcd | Mesaconyl-CoA hydratase | META1_4153 | METDI4744 | 99.1 | M |
| pabB | para-aminobenzoic acid synthase component | META1_4284 | METDI4895 | 98.7 | E |
| dmrA | Dihydromethanopterin reductase | META1_4312 | METDI4922 | 100.0 | M |
| mxaB | Transcriptional regulator | META1_4525 | METDI5131 | 99.6 | M |
| mxaH | Unknown | META1_4526 | METDI5132 | 96.9 | NA |
| mxaE | Unknown | META1_4527 | METDI5133 | 96.5 | M |
| mxaD | Unknown | META1_4528 | METDI5134 | 98.9 | |
| mxaL | Essential for Ca2+ insertion into MDH | META1_4529 | METDI5135 | 97.9 | M |
| mxaK | Essential for Ca2+ insertion into MDH | META1_4530 | METDI5136 | 97.6 | M |
| mxaC | Essential for Ca2+ insertion into MDH | META1_4531 | METDI5137 | 98.6 | M |
| mxaA | Essential for Ca2+ insertion into MDH | META1_4532 | METDI5138 | 96.8 | M |
| mxaS | Unknown | META1_4533 | METDI5139 | 99.7 | M |
| mxaR | Unknown | META1_4534 | METDI5140 | 99.7 | M |

mxaI 
Methanol dehydrogenase, small subunit META1_4535 METDI5141 99.0 M mxaG Cytochrome c550 META1_4536 METDI5142 99.0 M mxaJ MxaJ, possible chaperone META1_4537 METDI5143 99.0 M mxaF Methanol dehydrogenase, large subunit META1_4538 METDI5145 100.0 M fdh2C Molybdenum-dependent formate dehydrogenase, gamma subunit META1_4846 METDI5437 100.0 R fdh2B Molybdenum-dependent formate dehydrogenase, beta subunit META1_4847 METDI5438 98.7 R fdh2A Molybdenum-dependent formate

88 dehydrogenase, alpha subunit META1_4848 METDI5439 99.7 R fdh2D Molybdenum-dependent formate dehydrogenase, delta subunit META1_4849 METDI5440 100.0 R folC Dihydrofolate synthase META1_4888 METDI5480 99.3 NA mxcQ Sensor kinase META1_4896 METDI5488 99.0 M mxcE Transcriptional regulator META1_4897 METDI5489 100.0 M tenA Involved in transcriptional regulation META1_4898 METDI5490 97.4 fdh1B Tungsten-dependent formate dehydrogenase, beta subunit META1_5031 METDI5632 99.7 R fdh1A Tungsten-dependent formate dehydrogenase, alpha subunit META1_5032 METDI5633 99.9 R mcmA Methylmalonyl-CoA mutase, alpha subunit META1_5251 METDI5851 99.7 M ______Genes transcribed from the leading strand are highlighted in grey. NA, no mutant available; M, methylotrophy- specific; E, essential for both methylotrophic and non-methylotrophic growth; R, redundant function.

Supplementary Table 2. Methylotrophy enzymes and modules deduced from complete genomic sequences of methylotrophs

Pathway/function | M. e. | G. b. | S. p. | M. p. | M. f. | M. sp. | M. c. | strain V4
Methane utilization | - | - | - | - | - | - | + | +
Methanol utilization | + | + | - | + | + | ? | + | ?
Methylamine utilization | + | - | - | - | + | - | - | -
H4MPT-dependent C1 conversion pathway | + | + | - | + | + | - | + | -
Formaldehyde oxidation
MtdA/Fch (H4MPT-dependent) | + | + | - | + | - | - | + | -
FolD (H4F-dependent) | - | - | + | - | + | + | - | +
FlhA/FghA (glutathione-dependent) | - | - | + | - | - | - | - | -
Conversion of formate to CO2
Formate dehydrogenase (FDH1) | + | - | + | + | - | - | + | -
Formate dehydrogenase (FDH2) | + | + | - | + | + | + | + | +
Formate dehydrogenase (FDH3) | + | - | + | + | - | - | - | -
Formate dehydrogenase (FDH4) | + | + | - | - | + | - | - | -
Ribulose monophosphate pathway | - | - | - | - | + | + | + | -
Calvin-Benson-Bassham pathway | - | - | - | + | - | - | + | +
Serine cycle | + | + | + | + | - | - | partial | -
Isocitrate lyase | - | + | - | + | - | - | - | -
Ethylmalonyl-CoA pathway | + | - | + | - | - | - | - | -
Tricarboxylic acid cycle | + | + | + | + | - | - | + | +

Supplementary Table 3. Methylotrophic bacteria with published genome sequences included in comparative analyses

Organism | Phylum | Class | Genome size (Mb) | Growth on methanol | Growth on methylamine | Growth on DCM | Acc. No. | Reference
M. extorquens AM1/DM4 | Proteobacteria | Alpha | 6.9/6.1 | + | + | - | | This work
Granulibacter bethesdensis | Proteobacteria | Alpha | 2.7 | + | N.D. | N.D. | NC_008343 | [59,60]
Silicibacter pomeroyi | Proteobacteria | Alpha | 5.5 | N.D. | N.D. | N.D. | NC_003911 | [58]
Methylibium petroleiphilum | Proteobacteria | Beta | 5.5 | + | - | N.D. | NC_008825 | [55]
Methylobacillus flagellatus | Proteobacteria | Beta | 2.9 | + | + | N.D. | NC_007947 | [56]
Methylophilales sp. HTCC2181 | Proteobacteria | Beta | 1.9 | + | - | N.D. | NZ_AAUX00000000 | [57]
Methylococcus capsulatus | Proteobacteria | Gamma | 3.1 | + | - | N.D. | NC_002977 | [54]
Strain V4 | Verrucomicrobia | | 2.4 | + | - | N.D. | NC_010794 | [61]

N.D., not detected

Supplementary Table 4. Characteristics of IS elements in Methylobacterium extorquens

IS type | Length (bp) | IS family | DM4 intact | DM4 partial | AM1 intact | AM1 partial
ISMex1 | 1200 | IS3 | 0 | 0 | 37 | 1
ISMex2 | 1041 | IS481 | 0 | 0 | 16 | 0
ISMex3 | 1399 | IS256 | 2 | 0 | 23 | 0
ISMex4 | 1615 | IS1380 | 0 | 0 | 8 | 0
ISMex5 | 1283 | IS3 | 0 | 4 | 4 | 5
ISMex6 | 1535 | ISNCY | 0 | 0 | 2 | 0
ISMex7 | 1285 | IS3 | 0 | 0 | 1 | 0
ISMex8 | 2597 | IS21 | 0 | 0 | 3 | 0
ISMex9 | 1384 | IS110 | 0 | 0 | 1 | 1
ISMex10 | 1762 | ISL3 | 2 | 0 | 4 | 0
ISMex11 | 1227 | IS3 | 3 | 1 | 2 | 3
ISMex12 | 1659 | IS110 | 0 | 1 | 1 | 1
ISMex13 | 2359 | IS21 | 0 | 0 | 4 | 0
ISMex14 | 1381 | IS256 | 0 | 0 | 2 | 0
ISMex15 | 836 | IS5 | 9 | 1 | 2 | 1
ISMex16 | 1511 | IS3 | 1 | 0 | 3 | 2
ISMex17 | 1132 | IS110 | 7 | 1 | 1 | 0
ISMex18 | 1173 | IS630 | 1 | 0 | 1 | 2
ISMex19 | 851 | IS5 | 0 | 0 | 1 | 0
ISMex20 | 2321 | IS200/IS605 | 0 | 0 | 1 | 0
ISMex21 | 1046 | IS630 | 2 | 0 | 2 | 0
ISMex22 | 3864 | Tn3 | 0 | 0 | 5 | 1
ISMex23 | 1548 | IS200/IS605 | 0 | 0 | 1 | 0
ISMex24 | 1678 | ISL3 | 0 | 0 | 2 | 0
ISMex25 | 887 | IS6 | 0 | 0 | 1 | 1
ISMex26 | 1872 | ISL3 | 0 | 0 | 1 | 2
ISMex27 | 2493 | IS21 | 0 | 0 | 1 | 0
ISMex28 | 1523 | IS5 | 0 | 0 | 1 | 0
ISMex29 | 1334 | IS110 | 1 | 0 | 1 | 0
ISMex30 | 1047 | IS630 | 1 | 1 | 1 | 0
ISMex31 | 1081 | IS110 | 1 | 0 | 1 | 0
ISMex32 | 873 | IS5 | 0 | 0 | 1 | 0
ISMex33 | 1318 | IS3 | 0 | 1 | 1 | 0
ISMex34 | 636 | IS200/IS605 | 0 | 0 | 1 | 0
ISMex35 | 1560 | IS5 | 0 | 0 | 1 | 0
ISMex36 | 1475 | IS1182 | 0 | 0 | 1 | 0
ISMex37 | 1510 | IS3 | 0 | 0 | 1 | 2
ISMex38 | 3843 | Tn3 | 0 | 0 | 1 | 0
ISMex39 | 2440 | IS21 | 0 | 0 | 1 | 0
IS1354 | 1431 | IS256 | 2 | 0 | 0 | 0
IS1355 | 970 | IS5 | 1 | 0 | 0 | 0
IS1357 | 1229* | IS701 | 0 | 1 | 0 | 0
ISMdi1 | 1064 | IS630 | 3 | 1 | 0 | 1
ISMdi2 | 1691 | ISL3 | 1 | 0 | 0 | 1
ISMdi3 | 1246 | IS3 | 4 | 4 | 0 | 4
ISMdi4 | 1063 | IS481 | 2 | 3 | 0 | 0
ISMdi5 | 1279 | IS3 | 1 | 1 | 0 | 1
ISMdi6 | 1476 | IS1182 | 1 | 0 | 0 | 0
ISMdi7 | 2568 | IS21 | 2 | 0 | 0 | 1
ISMdi8 | 1006 | IS481 | 2 | 0 | 0 | 1
ISMdi9 | 1625 | IS110 | 1 | 0 | 0 | 0
ISMdi10 | 1437 | IS110 | 1 | 0 | 0 | 0
ISMdi11 | 1894 | IS481 | 1 | 0 | 0 | 0
ISMdi12 | 1361 | IS110 | 1 | 0 | 0 | 0
ISMdi13 | 1416 | IS256 | 1 | 0 | 0 | 0
ISMdi14 | 1341 | IS110 | 2 | 0 | 0 | 0
ISMdi15 | 1889 | IS200/IS605 | 1 | 0 | 0 | 0
ISMdi16 | 1271 | IS3 | 1 | 0 | 0 | 0
ISMdi17 | 890 | IS5 | 1 | 0 | 0 | 0
ISMdi18 | 999 | IS1595 | 1 | 1 | 0 | 0
ISMdi19 | 1099 | IS3 | 1 | 0 | 0 | 0
ISMdi20 | 1062 | IS630 | 1 | 1 | 0 | 0
ISMdi21 | 942 | IS5 | 1 | 0 | 0 | 0
ISMdi22 | 724 | IS5 | 1 | 0 | 0 | 0
ISMdi23 | 1065 | IS481 | 1 | 0 | 0 | 0
ISMdi24 | 855 | IS5 | 1 | 0 | 0 | 0
ISMdi25 | 1601 | IS30 | 1 | 0 | 0 | 0
ISMdi26 | 867 | IS5 | 1 | 1 | 0 | 1
ISMdi27 | 2464 | IS21 | 1 | 0 | 0 | 0
ISMdi28 | 1410 | IS701 | 1 | 0 | 0 | 0
ISMdi29 | 1939 | ISNCY | 1 | 0 | 0 | 0
MiniMdi3 | 416‡ | | 8 | 0 | 4 | 0

* Fragment length.
‡ Maximum length. Actual lengths vary from 382 to 416.


Comparison of the metabolic capacity of Methylobacterium species.

Short report

Sequencing of 6 Methylobacterium strains. The genome sequencing and annotation of Methylobacterium extorquens AM1 (AM1) and DM4 (DM4), described earlier in Chapter III, were accompanied by the sequencing of 6 more Methylobacterium strains. The aim of this study is to obtain a broader view of this genus of plant-colonizing bacteria and to identify the specific and common genetic traits associated with plant colonization. The 6 additionally sequenced bacteria are: M. extorquens PA1 (PA1), a plant epiphyte; M. extorquens CM4 (CM4), a chloromethane-degrading strain; M. populi BJ001 (POP), an endophyte of poplar trees; M. nodulans ORS2060 (NOD) and M. sp. 4-46 (SP), two nodulating species; and M. radiotolerans JCM 2831 (RAD), isolated from rice seed. Sequencing was performed as described for strains AM1 and DM4, but without expert annotation.

Transfer of the genome-scale model of M. extorquens AM1. In order to identify the common metabolic capacities of the 8 Methylobacterium strains, a model of M. extorquens AM1 (iRP911, see Chapter IV for the reconstruction), composed of 1139 reactions assigned to 132 metabolic pathways, was transferred to each genome as follows. Metabolic reactions were assigned to each microorganism via the gene-to-protein-to-reaction (GPR) network of the AM1 model and via gene homologies (30 % identity and 80 % match length) relative to the M. extorquens AM1 genome. This methodology is expected to ensure a high level of confidence in the transferred reactions, because they come from a manually curated genome annotation and a manually curated network reconstruction. Nevertheless, only common features are accessible, because the reaction transfer is a "one-direction process" from AM1 to the other strains: only the already known metabolic capacities of AM1 are considered. This methodology is promising for fast metabolic reconstruction in new organisms. A complete reconstruction cannot be achieved by this process alone, because specific traits of the newly sequenced bacteria are not identified and because only reactions with an associated gene in the GPR are transferred. However, it will significantly improve the draft reconstruction and refinement processes when closely related organisms are available.
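The transfer rule described above can be illustrated with a minimal sketch. It assumes that each reaction's GPR is stored as a list of alternative gene sets (isoenzymes), each set being the subunits of one enzyme, and that a homology search has already produced one best hit per AM1 gene. The data layout, the names `Hit`, `has_homolog` and `transfer_reactions`, and the toy gene and reaction identifiers are hypothetical; only the two thresholds come from the text.

```python
from dataclasses import dataclass

# Thresholds stated in the text; everything else in this sketch is illustrative.
MIN_IDENTITY = 30.0   # percent identity
MIN_COVERAGE = 80.0   # percent match length


@dataclass(frozen=True)
class Hit:
    """Best homology hit of an AM1 gene in a target genome (hypothetical record)."""
    identity: float
    coverage: float


def has_homolog(gene, hits):
    """True if the AM1 gene has a hit that passes both thresholds."""
    hit = hits.get(gene)
    return hit is not None and hit.identity >= MIN_IDENTITY and hit.coverage >= MIN_COVERAGE


def transfer_reactions(gpr, hits):
    """Return the AM1 reactions that can be assigned to the target genome.

    gpr maps a reaction to a list of alternative gene sets. A reaction is
    transferred when every gene of at least one set has a homolog (assumed
    AND/OR logic). Reactions without any gene association are never transferred.
    """
    transferred = set()
    for reaction, gene_sets in gpr.items():
        if any(genes and all(has_homolog(g, hits) for g in genes) for genes in gene_sets):
            transferred.add(reaction)
    return transferred


# Toy example with made-up identifiers.
toy_gpr = {
    "R_complex": [{"g1", "g2"}],    # two-subunit enzyme
    "R_isoenzymes": [{"g3"}, {"g4"}],  # two alternative enzymes
    "R_no_gene": [],                # no gene association in the GPR
}
toy_hits = {"g1": Hit(95.0, 100.0), "g2": Hit(29.9, 100.0), "g4": Hit(45.0, 85.0)}
toy_transferred = transfer_reactions(toy_gpr, toy_hits)
```

In this toy case only `R_isoenzymes` is transferred: one subunit of `R_complex` falls just below the identity threshold, and `R_no_gene` has no gene to test, which mirrors the restriction noted in the text.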

Comparison of the metabolic capacity of representatives of the genus Methylobacterium. Table 1 shows the degree of completeness of individual pathways for each strain, i.e. the number of reactions assigned to the pathway in a Methylobacterium strain divided by the number of reactions of that pathway in the M. extorquens AM1 model. The Methylobacterium core metabolism is composed of the reactions that are common to all Methylobacterium strains; 91 % of the reactions in the M. extorquens AM1 model are common to all strains. From a metabolic capacity point of view, the 4 M. extorquens strains (AM1, PA1, CM4, DM4) are closely related to each other, with 99% completion. Interestingly, M. populi shares the same high level of conservation (99%). M. nodulans is missing reactions in bacteriochlorophyll a and carotenoid biosynthesis. Among the metabolic capacities that were not transferred (83 reactions), 8 reactions are essential for growth on methanol in M. extorquens AM1. They are required for amino acid and nucleotide biosynthesis and appear to be missing in RAD, SP and NOD. 3 non-essential reactions are missing from the category of central metabolism. They correspond to methylamine utilization (only present in M. extorquens AM1 and CM4), phosphate acetyltransferase (missing in POP and RAD; note, however, that acetyl-CoA synthetase is present), and malate:quinone oxidoreductase (missing in SP and NOD; note, however, that malate dehydrogenase is present). Not all transporters are conserved across the 8 genomes analyzed: malate transport is only present in M. extorquens AM1, a formate transporter appears to be absent in Methylobacterium sp. 4-46, and a gluconate transporter could not be identified in M. radiotolerans and M. nodulans. There are missing capacities for glucose, gluconate-like and galactose oxidation processes in the periplasmic and extracellular spaces. Several interconversions between sugars, such as those between mannose, gulose and galactose, are not found in Methylobacterium sp. 4-46.
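The two quantities used in this comparison, pathway completion and core metabolism, can be written down in a few lines. The sketch below assumes that the reference model is available as a mapping from pathway name to a set of reaction identifiers and that each strain is represented by the set of reactions transferred to it. The function names and the toy data are hypothetical.

```python
def pathway_completion(model_pathways, strain_reactions):
    """Fraction of each reference pathway's reactions that was assigned to a strain.

    model_pathways: pathway name -> set of reaction ids in the reference model
    strain_reactions: set of reaction ids assigned to the strain
    Returns values between 0 (no reaction) and 1 (complete pathway).
    """
    return {
        pathway: len(reactions & strain_reactions) / len(reactions)
        for pathway, reactions in model_pathways.items()
        if reactions  # skip empty pathways to avoid dividing by zero
    }


def core_metabolism(reactions_by_strain):
    """Reactions found in every strain (set intersection)."""
    reaction_sets = list(reactions_by_strain.values())
    return set.intersection(*reaction_sets) if reaction_sets else set()


# Toy reference model and strains with made-up reaction identifiers.
toy_model = {
    "pathway_A": {"r1", "r2", "r3", "r4"},
    "pathway_B": {"r5", "r6"},
}
toy_strains = {
    "reference": {"r1", "r2", "r3", "r4", "r5", "r6"},
    "strain_X": {"r1", "r2", "r3", "r5", "r6"},
    "strain_Y": {"r1", "r2", "r3", "r4", "r5"},
}
completion_X = pathway_completion(toy_model, toy_strains["strain_X"])
toy_core = core_metabolism(toy_strains)
core_fraction = len(toy_core) / len(toy_strains["reference"])
```

Here `strain_X` reaches a completion of 0.75 for `pathway_A` and 1.0 for `pathway_B`, and the core contains 4 of the 6 reference reactions. The percentage of reference reactions shared by all strains is obtained in the same way as `core_fraction`.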

Missing methanol dehydrogenase in M. sp. 4-46. All genes for methylotrophy (fae, mtdA, mtdB, fch, fdh) are found in Methylobacterium sp. 4-46 except mxaF. Methylotrophic capacity is nevertheless assigned by the reaction transfer because of homology (>30%) with 2 xoxF-like genes in sp. 4-46. Further investigation is required to determine whether Methylobacterium sp. 4-46 is able to use methanol.

Discussion

Methylobacterium species share 91% of the metabolic network of M. extorquens AM1. The strains are almost identical in central metabolism and differ in the metabolic periphery, such as degradation pathways and substrate transporters. The reactions that were not transferred are identified as specific traits of M. extorquens AM1 for adaptation to its specific ecological niches.

Figure 1: Pathway completion of the 8 Methylobacterium strains calculated from the M. extorquens AM1 model (iRP911). Completion of each pathway was calculated from the number of reactions in the M. extorquens AM1 model that can be assigned to each other strain via the GPR association network and gene homology with at least 30% identity and 80% length match. Values range from 1 (complete) to 0 (no reactions). The core metabolism is composed of the reactions found in all strains.


CHAPTER IV

Elucidation of network topology at system level and its operation under pure methylotrophic growth

Rémi Peyraud, Kathrin Schneider, Patrick Kiefer, Stéphane Massou, Julia A. Vorholt, Jean-Charles Portais

Peyraud R, Schneider K, Kiefer P, Massou S, Vorholt JA, and Portais J-C. Genome-scale reconstruction and system level investigation of the metabolic network of Methylobacterium extorquens AM1. BMC Systems Biology, under revision.

Contribution by RP: Design of the study, metabolic network reconstruction and modeling, in silico analysis (EFM, MCS, FBA), NMR and MS measurements, data treatment and analysis, writing the manuscript.

Abstract

Background: Methylotrophs are microorganisms that play a key role in biogeochemical processes, especially the global carbon cycle, and that have gained interest for biotechnological purposes. Significant progress has been made in recent years in the biochemistry, genetics, genomics, and physiology of methylotrophic bacteria, showing that methylotrophy is a phenomenon much more widespread and versatile than initially assumed. Despite such progress, our current understanding of the biochemistry and metabolism of these bacteria is still incomplete. System-level descriptions of methylotrophic metabolism are currently lacking, and very little is known about the network-scale organization and properties of methylotrophy, and how the methylotrophic behavior emerges from this organization, especially in facultative organisms.

Results: In this work, we report on the integrated, system-level investigation of the metabolic network of the facultative methylotroph Methylobacterium extorquens AM1, a valuable model of methylotrophic bacteria. The genome-scale reconstruction of its metabolic network shows that the central carbon metabolism of the bacterium is a mosaic of classical and specialized pathways enabling growth on C1 and multicarbon (C2 to C4) compounds. The sub-network operating during methylotrophic growth was identified from both in silico and experimental investigations, including 13C-fluxomics. The core of this metabolism is organized as a highly unusual series of metabolic pathways (serine cycle, ethylmalonyl-CoA pathway, TCA cycle, anaplerotic processes), tightly embedded into one another and operating as an entity to achieve C1 assimilation during methylotrophic growth. This complex metabolic machinery is structurally fragile but allows efficient utilization of C1 compounds via highly specialized pathways, and it is versatile enough around a flexible backbone of C2/C3/C4 inter-conversions to allow switching to other carbon sources.

Conclusions: This work emphasizes that the metabolism of M. extorquens AM1 is adapted to its lifestyle not only in terms of enzymatic equipment, but also in terms of network-level structure and regulation. It suggests that the metabolism of the bacterium has evolved both structurally and functionally towards an efficient but transitory utilization of methanol.

Introduction

Methylotrophs are microorganisms able to grow on reduced C1 compounds such as methane and methanol as sole source of carbon and energy. Methylotrophy has gained increasing interest over the past decade for both basic and applied purposes, since methanol can be produced from diverse renewable sources and represents a valuable feedstock for biotechnological applications [1, 2]. The recent progress in the biochemistry, genetics, genomics, and physiology of methylotrophic bacteria has shown that methylotrophy is a phenomenon much more widespread and versatile than initially assumed [3]. The phylogenetic distribution of methylotrophy is broad and spans a range of phyla and genera [3]. Methylotrophic bacteria are adapted to a broad range of lifestyles and ecological niches (soil, water sediments, plant roots, phyllosphere), and play roles in many important biological or biogeochemical processes like the global carbon cycle. Methylotrophy encompasses a broad range of metabolic capabilities or behaviors that were used to classify methylotrophs into obligate vs facultative, heterotrophic vs autotrophic, aerobic vs anaerobic, etc. Such genetic and biochemical diversity may explain the ecological competitiveness of methylotrophs.

From the biochemical point of view, methylotrophy relies on specialized pathways (C1 pathways) ensuring the fulfillment of all growth requirements. Energetic requirements are fulfilled by dissimilation pathways, which involve a series of three basic steps: i) the oxidation of primary C1 substrates (methanol, methylamine), typically into the toxic intermediate formaldehyde, ii) the oxidation of the latter compound into formate, and iii) formate oxidation into carbon dioxide (CO2). The assimilation of C1-units can be achieved by different mechanisms starting from either formaldehyde (ribulose-monophosphate pathway, RuMP), CO2 (Calvin-Benson-Bassham cycle, CBB) or methylene-tetrahydrofolate (Me-THF) + CO2 (serine cycle). Novel enzymes, biochemical mechanisms, and metabolic pathways have been discovered, resulting in a more complete description of methylotrophic pathways and their diversity. Interestingly, these findings do not modify our current view of the general organization of C1 pathways, but show that many more alternative enzymes or pathways than initially assumed carry out each of the basic steps of methylotrophy. Moreover, a growing number of newly discovered methylotrophs have been investigated, showing that different combinations of the various pathways can be found in nature, and leading to the concept of a modular metabolism [4].

Despite considerable progress made in the understanding of the biochemistry and physiology of methylotrophy, a complete, system-level description of methylotrophic metabolism is currently lacking. The comprehensive understanding of bacterial physiology requires detailed knowledge of the complete metabolic potential of the studied organism, but such knowledge is currently missing and no genome-scale metabolic network has been established for any methylotrophic bacterium. In consequence, little is currently known about the network-scale organization of methylotrophy, the specific properties of methylotrophic networks, and how the methylotrophic behavior emerges from this organization, especially in facultative organisms.

Among the recently sequenced organisms, the Alphaproteobacterium Methylobacterium extorquens AM1 is a major model of methylotrophic bacteria. This facultative methylotroph is able to grow on C1 compounds but also on multicarbon (C2-C4) compounds. It is part of the abundant population of bacteria systematically found in the phyllosphere [5, 6], where they benefit from plant-derived methanol [7, 8]. The biochemistry, genetics, and physiology of this bacterium has been extensively investigated, and

has allowed the discovery of major methylotrophic pathways, including the serine cycle [9-11], the tetrahydromethanopterin (H4MPT)-dependent pathway for formaldehyde oxidation [12, 13], and of a number of novel enzymes or enzyme functions (pyrroloquinoline quinone (PQQ)-dependent methanol dehydrogenase, formaldehyde activating enzyme, methylene-H4MPT dehydrogenases coupled to pyridine nucleotides, and formyl-methanofuran hydrolase). The pathway for methanol assimilation in M. extorquens AM1 was recently completed with the discovery of the ethylmalonyl-CoA pathway (EMCP), an alternative to the glyoxylate cycle for the synthesis of glyoxylate, which encompasses an unusual series of 12 reactions where intermediates are all Coenzyme A (CoA) esters [14, 15]. Finally, the genome of this bacterium was recently sequenced and annotated [16], providing invaluable information for genome-scale investigations.

In this work, we report on the integrated, system-level investigation of the metabolic network of the facultative methylotroph M. extorquens AM1, as a valuable model of methylotrophic bacteria. To gain a comprehensive understanding of the biochemistry and physiology of this bacterium, we first evaluated its complete metabolic potential by compiling current biochemical knowledge and genome annotation data into a comprehensive genome-scale representation of the metabolic network. We then performed both in silico and experimental investigations to identify the subnetwork operating during methylotrophic growth, in order to analyze structurally and functionally the system-level organization of methylotrophy, with an emphasis on the organization of central carbon metabolism.

Results

Metabolic network reconstruction. The genome-scale (GS) metabolic network of M. extorquens AM1 was reconstructed according to recommended guidelines [17]. The details of the process are given in Materials and Methods and are schematically shown in Fig. S1. Briefly, the GS metabolic network was obtained by integrating relevant information collected from i) genome annotation [16], ii) published physiological, genetic and biochemical studies in M. extorquens AM1 and closely related organisms, iii) biochemical information contained in relevant databases [18-20], and iv) complementary investigations (biomass quantification), followed by intensive refinement (Table S1, S2). The chemical composition of M. extorquens AM1 cells was determined experimentally or taken from available literature (Table 1 and Table S3) and used to define the biosynthetic needs and corresponding pathways.

Table 1. Chemical content and physiological parameters of M. extorquens AM1 cells growing on methanol.

Macromolecule | % Cell Dry Weight | ± | Data source | Organism source
Protein | 59.13 | 2.11 | This study | M. extorquens AM1
Carbohydrate | 16.43 | 1.09 | This study | M. extorquens AM1
Rhamnose (polymer) | 8.92 | 0.92 | This study | M. extorquens AM1
Glucose (polymer) | 5.62 | 0.52 | This study | M. extorquens AM1
Trehalose | 1.22 | 0.20 | This study | M. extorquens AM1
Glucosamine (polymer) | 0.09 | 0.67 | This study | M. extorquens AM1
RNA | 8.20 | 0.68 | This study | M. extorquens AM1
Fatty acid | 4.95 | 0.29 | This study | M. extorquens AM1
DNA | 3.00 | - | Neidhart et al.; GC content: Vuilleumier et al. (2009) | E. coli
PHB | 2.36 | 0.05 | This study | M. extorquens AM1
Polyamine | 0.40 | - | Neidhart et al. | E. coli
Carotenoid | 0.023 | - | Konovalova et al. (2007) | M. extorquens AM1
Intracellular metabolites | 2.64 | - | Kiefer et al. (2008); Guo et al. (2006); Guo et al. (2007); Vorholt et al. (1998); Crowther et al. (2008) | M. extorquens AM1
Inorganic ions | 1.01 | - | Neidhart et al. | E. coli
Cofactors | 0.22 | - | Neidhart et al. | E. coli
SUM | 98.36

Physiological parameters | Value | ± | Units | Data sources | Organism source
Growth rate | 0.168 | 0.003 | h-1 | This study | M. extorquens AM1
Specific methanol uptake rate | 15.0 | 0.25 | mmol.g-1.h-1 | This study | M. extorquens AM1
Specific proton production rate | 0.22 | 0.01 | mmol.g-1.h-1 | This study | M. extorquens AM1
Growth-associated ATP maintenance | 59.81 | - | mmol.g-1 | Neidhart et al. | E. coli
Macromolecular building costs | 26.65 | - | mmol.g-1 | This study | M. extorquens AM1
Non-growth-associated ATP maintenance | 9.5 | - | mmol.g-1.h-1 | Rokem et al. (1978) | Methylobacterium

The refined reconstruction was converted into a mathematical model using CellNetAnalyzer [21], and the network was checked for self-consistency and curated to allow biosynthesis of all cell components from each of the 12 carbon sources established for M. extorquens (Table S4). In addition, flux balance analysis (FBA) was used to calculate the theoretical maximum growth rate enabled by the reconstructed network for each carbon source. The calculated values were close to available published data (Table S4), showing the consistency of the network with experimental observations. Last, the capability of the GS network to explain the oxidation of carbon compounds [22] was validated (Table S4).
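For reference, FBA is commonly posed as a linear program over the steady-state flux space. The formulation below is the generic textbook form, not a transcription of the model settings used here; the symbols S, v, c and the bounds are notation introduced for this illustration.

```latex
\begin{aligned}
\max_{v}\quad & c^{\mathsf{T}} v \\
\text{subject to}\quad & S\,v = 0, \\
& v_j^{\min} \le v_j \le v_j^{\max} \quad \text{for every reaction } j .
\end{aligned}
```

Here S is the stoichiometric matrix (metabolites in rows, reactions in columns), v is the vector of reaction fluxes, and c selects the objective, typically the biomass-forming reaction. Under these assumptions, a theoretical maximum growth rate for a given carbon source is obtained by fixing the upper bound of the corresponding uptake flux and maximizing the biomass flux.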

The final GS network (iRP911) contained 1139 unique reactions and 977 metabolites, and was based on a gene-to-protein-to-reaction (GPR) association network that included 911 genes encoding 761 proteins (Table 2, Table S2, S3). The confidence in the network information was established by scoring the evidence currently available for each reaction [17]. The confidence scores ranged from 0 (lowest) to 4 (highest), with the latter score being assigned to a reaction with direct evidence for both gene product function and biochemical reaction (Table 2). The average confidence score over the final network was 2.1.

Table 2. Properties of the genome-scale (iRP911) and methylotrophic network reconstructed for M. extorquens AM1, and evidence for reactions included in the genome-scale model.

Properties | GS network | % | Methylotrophic network | % of GS network
Biochemically unique reactions | 1139 | | 717 | 62.9%
Reversible reactions | 578 | 50.7% | 340 | 58.8%
Metabolites | 977 | | 722 | 73.9%
Genes | 911 | | 706 | 77.5%
Enzymes | 761 | | 595 | 78.2%
Protein complexes | 83 | 10.9 | 65 | 78.3%

Confidence score of reactions (GS network) | Evidence | Number | % | Data sources
4 | Experimental evidence for enzyme activity | 54 | 4.7% | Biochemical data
4 | Spontaneous reaction | 15 | 1.3% |
3 | Experimental evidence for gene function | 28 | 2.5% | Genetic data
2 | Genome annotation | 856 | 75.2% | Genomic data
2 | Evidence from physiology (Transport) | 127 | 11.2% | Physiological data
1 | Hypothetical reaction required for modeling | 59 | 5.2% | Modeling data
2.1 | Average confidence score of the network
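The average confidence score in Table 2 can be checked as a reaction-weighted mean of the individual scores. The short sketch below uses only the counts listed in the table; the variable names are illustrative.

```python
# (confidence score, number of reactions) pairs copied from Table 2
confidence_counts = [
    (4, 54),    # experimental evidence for enzyme activity
    (4, 15),    # spontaneous reaction
    (3, 28),    # experimental evidence for gene function
    (2, 856),   # genome annotation
    (2, 127),   # evidence from physiology (transport)
    (1, 59),    # hypothetical reaction required for modeling
]

total_reactions = sum(count for _, count in confidence_counts)
weighted_sum = sum(score * count for score, count in confidence_counts)
average_score = weighted_sum / total_reactions

print(total_reactions, round(average_score, 1))
```

The counts add up to the 1139 reactions of the GS network, and the weighted mean rounds to 2.1, in agreement with the value reported in the table.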

Main features of the genome-scale metabolic network of M. extorquens AM1. M. extorquens AM1 is a facultative methylotroph able to utilize a relatively narrow range of substrates. The ability to grow on C3 and C4 organic acids, e.g. lactate, pyruvate, succinate or malate, relies on the occurrence of classical metabolic pathways, which include the tricarboxylic acid (TCA) cycle, anaplerotic pathways, gluconeogenesis, the pentose-phosphate pathway (PPP), and the Entner-Doudoroff (ED) pathway (Fig. 1). Growth on C1 compounds relies on specific metabolic pathways that were resolved for this bacterium in the past 50 years [4, 13]. The first step is the oxidation of primary C1 substrates, e.g. methanol, to formate via methanol dehydrogenase and the H4MPT-dependent C1 pathway. Formate is a key branch point that directs the flow of carbon between dissimilation and assimilation. Dissimilation is achieved by oxidation of formate into CO2. The assimilation of C1-units requires the conversion of formate into Me-THF, since the spontaneous condensation of formaldehyde with THF was demonstrated to be not significant [23]. In the serine cycle, Me-THF is condensed with a C2 compound, generated from glyoxylate, to build C3 compounds like 2-phosphoglycerate and phosphoenolpyruvate (PEP), which are further carboxylated into oxaloacetate (OAA) and other C4 intermediates. The continuous operation of the serine cycle requires the operation of the recently discovered EMCP, in which both glyoxylate regeneration and CO2 fixation take place. These pathways are tightly embedded into each other, and the consequences of such organization will be detailed later. The C2 compounds used as carbon source enter this metabolism at the level of the EMCP, from which they feed the central pathways.

Fig. 1 - Central carbon metabolism of M. extorquens AM1. Precursors of biomass components are labeled with an asterisk (*). Some metabolites were duplicated on the map for clearer visualization, and are indicated with ―. Cofactors used as substrates or products are indicated in blue and red, respectively. GA3P, glyceraldehyde-3-phosphate; 3PG, 3-phosphoglycerate; 2PG, 2-phosphoglycerate; PEP, phosphoenol-pyruvate; OAA, oxaloacetate; 3HBCOA, 3-hydroxy-butyryl-Coenzyme A; E4P, D-erythrose-4-phosphate; AKG, α-ketoglutarate; SED7P, sedoheptulose-7-phosphate; cyt c red, reduced cytochrome c. Dotted arrows represent uncertain reactions.

M. extorquens also possesses the ability to oxidize 26 additional compounds [22]. These compounds include a significant number of sugars, mainly pentoses. The reconstruction data suggest that this capability is due to the occurrence of soluble sugar dehydrogenases able to oxidize a wide range of pentoses and other sugars [24], but no assimilation processes were identified from the genome annotation. More surprising is the inability of M. extorquens AM1 to grow on hexoses or hexose derivatives, since all relevant transport and glycolytic processes, i.e. PPP and ED pathway, were found in the network.

The detailed examination of the biosynthetic pathways included in the GS network indicated incomplete lipopolysaccharide (LPS) biosynthesis. The pathways for keto-deoxyoctulosonate and lipid A synthesis and assembly were found, but the genes encoding the enzymes classically involved in heptose biosynthesis and in sugar incorporation onto the lipid A were missing. These observations suggest the occurrence of an unusual LPS structure in M. extorquens AM1. Consistently, no heptose or galactose was detected from the hydrolysis of cell material (see above), though significant contents of rhamnose (8.9 ± 0.9%) and glucose (5.6 ± 0.5%) were found. The GS network also included the biosynthetic pathways for carotenoids and bacteriochlorophyll A. The biosynthetic pathway of bacteriochlorophyll was complete, and the enzymes of photosystem I were also present. Several degradation pathways are missing in the GS network of M. extorquens, including the degradation of amino acids – e.g. , , - and nucleotides. This is in agreement with physiological data showing that the occurrence of these compounds in the cultivation medium did not result in detectable metabolic activity.

The GS metabolic reconstruction showed that M. extorquens possesses a respiratory chain with alternative systems for electron inputs and outputs (Fig. S2). A great variety of potential electron donors could be identified, including a significant number of soluble periplasmic dehydrogenases transferring electrons to cytochrome c, such as methanol dehydrogenase, formaldehyde dehydrogenase, and the already-mentioned soluble sugar dehydrogenase. Three terminal oxidases were also present, including two ubiquinol oxidases and one cytochrome c oxidase, suggesting that M. extorquens can develop an aerobic metabolism at different oxygen levels. Besides oxygen, nitrate might represent a potential alternative electron acceptor.

Organization of central metabolic pathways. C1 assimilation ensures the conversion of the C1-unit into precursor metabolites and involves a large number of metabolic pathways: C1 pathways, serine cycle, EMCP, TCA cycle, gluconeogenesis, and anaplerotic reactions (Fig. 1). They are connected by overlapping metabolites and enzyme reactions. The most central processes are interconnected cycles (serine cycle, TCA cycle) or pathways (EMCP, anaplerotic reactions). The serine cycle shares common reactions with gluconeogenesis (enolase), with the EMCP (malyl-CoA ligase, malyl-CoA lyase), with the TCA cycle (malate dehydrogenase), and with amino acid metabolism (serine hydroxymethyltransferase). The EMCP also shares reactions with polyhydroxybutyrate (PHB) biosynthesis (acetyl-CoA C-acetyltransferase, acetoacetyl-CoA reductase), the TCA cycle (succinate dehydrogenase, fumarase), and fatty acid degradation (hydroxybutyryl-CoA (HBCOA) dehydratase, hydroxybutyryl-CoA dehydrogenase). The overall picture of M. extorquens central metabolism that emerges from these observations is an unusual series of metabolic pathways and cycles that are tightly embedded into one another and operate almost as an entity.

The C3 (2PG, pyruvate and PEP) and C4 (OAA and malate) intermediates play a critical role in the overall network organization. They represent the branching points of the three main central metabolic pathways, i.e. the serine cycle, the EMCP and the TCA cycle, and of anaplerotic processes. Hence, they can be generated by different metabolic routes [25]. Accordingly, seven different reactions allow the inter-conversion of the five compounds (Fig. 1). These reactions include processes inter-converting i) C3 into C3 (enolase, PEP synthase, pyruvate kinase), ii) C4 into C4 (malate dehydrogenase), iii) C3 into C4 (PEP carboxylase (PEPCL)), and iv) C4 into C3 (PEP carboxykinase (PEPCK), malic enzyme). Moreover, the C3/C4 inter-conversions include either reversible reactions (2PG/PEP and OAA/malate inter-conversion) or irreversible but opposite reactions (PEP/pyruvate and PEP/OAA inter-conversions). The result is a dense sub-network of reactions that provides alternative pathways for the same conversion [25] and allows the occurrence of potential substrate cycles [26].
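The density of this C3/C4 sub-network can be made concrete by writing the seven reactions as a small directed graph and enumerating routes between two intermediates. The sketch below is illustrative only: the edge directions follow the reversibility stated above together with the usual textbook direction of each enzyme (for example, malic enzyme taken as malate to pyruvate), which is an assumption rather than information from the model files, and the function name `simple_routes` is hypothetical.

```python
# Directed edges (substrate, product, enzyme); reversible enzymes appear in both directions.
EDGES = [
    ("2PG", "PEP", "enolase"),
    ("PEP", "2PG", "enolase"),
    ("pyruvate", "PEP", "PEP synthase"),
    ("PEP", "pyruvate", "pyruvate kinase"),
    ("OAA", "malate", "malate dehydrogenase"),
    ("malate", "OAA", "malate dehydrogenase"),
    ("PEP", "OAA", "PEP carboxylase"),
    ("OAA", "PEP", "PEP carboxykinase"),
    ("malate", "pyruvate", "malic enzyme"),  # assumed direction: C4 into C3
]


def simple_routes(source, target, edges=EDGES):
    """Enumerate all routes from source to target that visit no metabolite twice.

    Each route is returned as the ordered list of enzymes it uses.
    """
    routes = []

    def walk(node, visited, enzymes):
        if node == target:
            routes.append(enzymes)
            return
        for substrate, product, enzyme in edges:
            if substrate == node and product not in visited:
                walk(product, visited | {product}, enzymes + [enzyme])

    walk(source, {source}, [])
    return routes


compounds = {m for s, p, _ in EDGES for m in (s, p)}
enzymes = {e for _, _, e in EDGES}
pep_to_pyruvate = simple_routes("PEP", "pyruvate")
pyruvate_to_pep = simple_routes("pyruvate", "PEP")
```

With these assumptions, PEP can reach pyruvate either directly (pyruvate kinase) or via OAA and malate (PEP carboxylase, malate dehydrogenase, malic enzyme), while pyruvate returns to PEP through PEP synthase. The pair pyruvate kinase / PEP synthase is one example of a potential substrate cycle.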

Identification of the sub-network operating during methylotrophic growth. The functional structure of the metabolic network operating during pure methylotrophic growth of M. extorquens, i.e. growth on methanol as sole source of carbon and energy, was established by determining the sub-network of the GS metabolic model that includes all the reactions operating on methanol, hereafter referred to as the 'methylotrophic network' (Fig. S1). The identification of methylotrophic and non-methylotrophic reactions was based on both theoretical and experimental considerations, including i) physiological parameters, ii) genetic and biochemical data, and iii) omics data (transcriptomics, proteomics, metabolomics), followed by extensive refinement. Each reaction was evaluated against all of these criteria, and the decision to include or exclude a particular reaction was based on the resulting score (Table S5); the decision therefore always rests on multiple criteria. This reduction step was a prerequisite for the in silico analysis of the methylotrophic network, because the GS network has too many degrees of freedom (158) for the computation of elementary flux modes. In addition, the reduction step allows ruling out some false results of network topology analysis, such as network states that cannot occur in the cell because of regulatory processes and/or that occur only under different environmental conditions. Supporting Fig. 1 shows a diagram of the reduction process, which is detailed below.

i) Methylotrophic growth of M. extorquens. The methylotrophic network was determined here to account for conditions where M. extorquens cells were grown exponentially in a minimal medium containing methanol as sole added source of carbon and energy, NH4Cl as nitrogen source, and salts, as described in Materials and Methods. Such growth conditions allowed three levels of reduction of the GS metabolic model. First, the processes (transport and biochemical reactions) associated with the utilization of compounds (e.g. carbon sources, nitrogen sources, etc.) included in the GS network but not present in the medium were removed. Second, the pathways and transport systems associated with metabolic end-products included in the GS network but not detected experimentally during methylotrophic growth were removed. To this end, quantitative 1H-NMR analysis of culture supernatants collected after methylotrophic growth indicated that no by-product accumulated in the medium to detectable levels, allowing the removal of 187 reactions from the GS model. Third, it was assumed that no biomass degradation occurred in exponentially growing cells, resulting in a further simplification of the network by removing the biochemical pathways specifically involved in the degradation or salvage of macromolecular components. This simplification resulted in the removal of 213 biochemical reactions. Some reactions associated with macromolecule degradation could potentially be involved in other metabolic processes, such as cofactor biosynthesis or recycling of anabolic by-products. A total of 13 of these reactions appeared to be relevant for growth on methanol and were kept in the methylotrophic network.

ii) Genetic and biochemical data. Literature data were used to further substantiate the methylotrophic network. More particularly, the phenotypes of gene deletion mutants were used to support the reduction process. From the currently available literature, a total of 47 genes were shown to be essential during growth on methanol. Some of these genes encode multifunctional enzymes, so that a total of 51 biochemical reactions were associated with the 47 essential genes.
All reactions catalyzed by monofunctional enzymes (42 reactions) were kept in the methylotrophic model. For multifunctional enzymes, it could not be determined at this stage which reaction(s) was (were) responsible for essentiality, and other considerations were applied before deciding on their inclusion in or exclusion from the methylotrophic network. In total, 2 reactions were excluded on the basis of genetic data. The enzyme assays available in the literature were used to confirm the presence of biochemical reactions during methylotrophic growth, as well as their differential activities under non-methylotrophic conditions. Additional biochemical information, such as in vitro assays of particular reactions (e.g. spontaneous reactions) or biochemical data from other microorganisms, was used to validate the occurrence of reactions. In total, 13 reactions were excluded on the basis of biochemical data.

iii) Omics data. The next reduction step was based on the comparison of omics data, including transcriptomics (Bosch et al, 2008), proteomics (Bosch et al, 2008), and metabolomics data (Guo & Lidstrom, 2006; Guo & Lidstrom, 2008; Kiefer et al, 2010; Kiefer et al, 2008), collected for both methylotrophically and non-methylotrophically grown M. extorquens cells. The molecular components corresponding to each type of omics data (e.g. proteins for proteomics data) were kept in the methylotrophic network when they were identified in methanol-grown cells and/or when their content was significantly higher (at least twofold) than in non-methanol-grown cells, i.e. succinate-grown cells. The scores of the components identified from transcriptomics (differential expression), proteomics (spectral counting, differential expression) and metabolomics (identification) were assigned to their corresponding biochemical reactions via the GPR associations. Taken together, the omics data contributed to confirming the occurrence of 175 reactions and to excluding 296 reactions. The final methylotrophic network contained 717 reactions and 722 metabolites, associated with 706 genes (Table 1, Table S5), and included approximately two thirds of the components of the GS network. To validate the topology of the methylotrophic network, non-methylotrophic reactions were constrained to zero in the stoichiometric model of M. extorquens metabolism, and the reduced model was used to simulate growth performance on methanol. FBA simulations showed that the network supports a theoretical maximal growth rate of 0.20 h-1, which is consistent with experimental values [27]. Moreover, this value was close to the maximal growth rate calculated with the GS network (0.21 h-1), indicating that no significant growth capacity was lost during the reduction of the GS network and suggesting that about one third of the total metabolic potential of the bacterium is not required for growth on methanol.
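The omics criterion described above (detection in methanol-grown cells and/or an at least twofold higher content than in succinate-grown cells) can be written as a small rule. This is only a sketch under stated assumptions: the gene and reaction names and the evidence values are hypothetical, and combining the components of a reaction with "at least one supported component" is an assumed simplification, since the actual scoring scheme (Table S5) is not reproduced here.

```python
from typing import Dict, List, Optional, Tuple

FOLD_THRESHOLD = 2.0  # "at least twofold" relative to succinate-grown cells

# Evidence per component: (detected on methanol?, methanol/succinate fold change)
Evidence = Tuple[Optional[bool], Optional[float]]


def component_supported(detected: Optional[bool], fold_change: Optional[float]) -> bool:
    """Apply the keep rule to one gene product or metabolite."""
    if detected:
        return True
    return fold_change is not None and fold_change >= FOLD_THRESHOLD


def reactions_supported(
    gpr: Dict[str, List[str]], evidence: Dict[str, Evidence]
) -> Dict[str, bool]:
    """Map component-level support to reactions through the GPR association.

    Assumption: a reaction is supported if any associated component is supported.
    """
    return {
        reaction: any(
            component_supported(*evidence.get(component, (None, None)))
            for component in components
        )
        for reaction, components in gpr.items()
    }


# Hypothetical toy data, for illustration only
gpr = {"R_example_1": ["geneA", "geneB"], "R_example_2": ["geneC"]}
evidence = {"geneA": (None, 3.1), "geneB": (False, 0.8), "geneC": (False, 1.2)}

supported = reactions_supported(gpr, evidence)
print(supported)
```

In this toy case R_example_1 is retained because geneA is more than twofold higher on methanol, whereas R_example_2 has no supporting omics evidence and would have to be decided on the other criteria.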

Dissimilation capabilities of the methylotrophic network. The capability of methylotrophs to use methanol as sole energy and carbon source relies on the occurrence of both dissimilatory and assimilatory pathways, which fulfill the energetic and biosynthetic requirements, respectively. The reconstruction of the metabolic network of M. extorquens gave the opportunity to analyze the system-level organization of the two types of processes in this model methylotroph. Elementary flux mode (EFM) analysis, a powerful tool to analyze the functionality of metabolic networks from their topology [28], and FBA simulations were carried out. Dissimilation and assimilation processes were first analyzed separately. Dissimilation processes were defined here as processes resulting in the net conversion of methanol into CO2 and allowing energy conservation. The main dissimilation route is known to be the stepwise oxidation of methanol to CO2 using dedicated C1 pathways [4, 12]. This process involves the periplasmic oxidation of methanol into formaldehyde, which is further oxidized to CO2 in the cytoplasm (see Fig. 1). In this process one reduced cytochrome C and two nicotinamide adenine dinucleotide (phosphate) (NAD(P)H) are generated, assuming that the pyridine nucleotide-dependent formate dehydrogenase (FDH) operates. This 'cytoplasmic' route can fulfill both adenosine triphosphate (ATP) and redox requirements at the same time. If cytochrome C and NADH are reoxidized by the most effective oxidative phosphorylation mechanisms, a maximal yield of 5 ATP/methanol is predicted (Table 3). Additional potential routes for methanol dissimilation within the methylotrophic network were identified from the in silico investigations (Table 3). A periplasmic route of formaldehyde oxidation can be predicted if the methanol-dehydrogenase-like enzyme XoxF acts together with a periplasmic formate dehydrogenase [29]. Such a route would not generate NAD(P)H and hence could only fulfill ATP requirements, albeit with a reduced maximal yield (3 ATP/methanol). Another alternative to generate CO2 from methanol would be via the initial formation of multi-carbon compounds, which could then be completely oxidized to CO2 via decarboxylation reactions. Such potential pathways would require the formation of C2 (acetyl-CoA) to C6 compounds that can be further oxidized to CO2 in the TCA cycle or via the various decarboxylation reactions found in central pathways, and they are not efficient for energy conservation (Table 3). These indirect dissimilation routes are predicted from in silico investigations but are unlikely to operate during methylotrophy from an energetic point of view; however, they could represent the main energy conservation mechanisms during utilization of C2 and other multicarbon sources.

Table 3. Dissimilatory processes in the methylotrophic network. Types and number of dissimilatory EFMs detected in the methylotrophic network. For each type of dissimilation process, the theoretical maximum yields in ATP, NADH or NADPH are given in mol·mol(methanol)-1.

max ATP   max NADH   max NADPH   Dissimilation process
5         2          2           MeOH -> CO2 (cytoplasmic FDH)
3         0          0           MeOH -> CO2 (periplasmic FDH)
1         0.5        0.5         MeOH + CO2 (Ser cycle) -> Acetyl-CoA -> TCA cycle -> 2 CO2
1         0.5        0.5         MeOH + n CO2 (Ser cycle + EMCP) -> other C2s, C3s, etc -> central pathways -> n+1 CO2

Biosynthesis precursor                    C   EFMs   Max. molar yield   Max. carbon yield   Min. EFM length   EMCP utilisation
5,10-methylenetetrahydrofolate (Me-THF)   1   2018   1.00               1.00                20                93%
acetyl-CoA                                2   2440   0.45               0.91                35                95%
glycine                                   2   2054   0.42               0.84                62                100%
L-serine                                  3   2162   0.29               0.88                62                100%
D-glyceraldehyde-3-phosphate (GA3P)       3   2592   0.27               0.81                54                100%
phosphoenolpyruvate (PEP)                 3   3390   0.32               0.97                53                100%
pyruvate (PYR)                            3   3065   0.33               1.00                50                100%
oxaloacetate (OAA)                        4   5366   0.32               1.29                51                100%
(R)-3-hydroxybutanoyl-CoA (3HBCOA)        4   2789   0.21               0.83                37                92%
D-erythrose-4-phosphate (E4P)             4   6576   0.20               0.81                63                100%
α-ketoglutarate                           5   3806   0.20               1.02                54                100%
D-ribose-5-phosphate                      5   4663   0.16               0.81                61                100%
D-glucose-6-phosphate                     6   2592   0.14               0.68                58                100%

(C: number of carbon atoms in the precursor.)

Assimilatory processes. The consequences of the particular organization of primary C1 assimilation in M. extorquens AM1 were analyzed by examining the processes allowing the conversion of methanol into each of the 13 key carbon precursors, including C1 (Me-THF), C2 (acetyl-CoA, glycine), C3 (L-serine, pyruvate, PEP, glyceraldehyde-3-phosphate), C4 (OAA, D-erythrose-4-phosphate (E4P), 3HBCOA), C5 (α-ketoglutarate, D-ribose-5-phosphate) and C6 (D-glucose-6-phosphate (G6P)). The number of EFMs ranged from 2018 to 6576 for the various carbon precursors (Table 3, Additional file 8). The serine cycle was involved in all assimilatory EFMs except for those producing Me-THF, which can also be generated directly in the C1 pathways. This observation was consistent with the key role of the serine cycle in methanol assimilation. The EMCP was involved in 93%, 95%, and 92% of the EFMs generating Me-THF, acetyl-CoA, and (R)-3-HBCOA, respectively. For all other carbon precursors, including the serine cycle intermediates glycine and L-serine, all assimilatory EFMs required the EMCP. These data emphasized the critical role of the EMCP (12 reactions), in addition to the C1 pathways (10 reactions) and the serine cycle (9 reactions), in methanol assimilation. Hence, the initial steps of C1 assimilation require the consecutive and obligatory operation of a large number of reactions. The minimal EFM length, representing the smallest number of reactions needed to convert methanol into each carbon precursor, was calculated for each of the 13 carbon precursors. The conversion of methanol into C3 compounds required at least 50 reactions, and the minimal number of reactions required to convert methanol into E4P was 63. Even for Me-THF and acetyl-CoA, the minimal EFM lengths were high (20 and 35, respectively). These data indicated that primary assimilation, i.e. the conversion of methanol into carbon precursors, is a particularly complex process in M. extorquens AM1.
Despite the complexity of methanol assimilation, the carbon precursors are produced from methanol with carbon yields that are similar to those observed on glucose for species such as E. coli and C. glutamicum.
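The two yield columns of the precursor table appear to be related through the number of carbon atoms of each precursor. The sketch below assumes, without confirmation from the text, that the carbon yield equals the molar yield multiplied by the number of carbons of the precursor and divided by the number of carbons of the substrate (one for methanol). The rows are copied from the table for a few precursors; the function name and the tolerance are illustrative choices.

```python
# (precursor, carbon atoms, max. molar yield, max. carbon yield) from the table
ROWS = [
    ("Me-THF", 1, 1.00, 1.00),
    ("acetyl-CoA", 2, 0.45, 0.91),
    ("pyruvate", 3, 0.33, 1.00),
    ("OAA", 4, 0.32, 1.29),
    ("alpha-ketoglutarate", 5, 0.20, 1.02),
]

# Molar yields are rounded to two decimals, so small deviations are expected.
TOLERANCE = 0.03


def carbon_yield(molar_yield, precursor_carbons, substrate_carbons=1):
    """Carbon yield (mol C in product per mol C in substrate); assumed definition."""
    return molar_yield * precursor_carbons / substrate_carbons


deviations = {}
for name, carbons, y_molar, y_carbon_table in ROWS:
    y_carbon = carbon_yield(y_molar, carbons)
    deviations[name] = abs(y_carbon - y_carbon_table)
    print(f"{name}: computed {y_carbon:.2f}, tabulated {y_carbon_table:.2f}")
```

For these rows the computed and tabulated values agree within rounding. Carbon yields above 1, as for OAA, are possible under this definition because part of the precursor carbon can originate from CO2 rather than from methanol.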

Fig. 2 - Structural properties of the EMCP compared to the ICL variant. A. EFM (#122591) with the optimal biomass yield among the EFMs in which biomass carbon is derived exclusively from CO2. B. FBA simulation of the optimal growth rate as a function of a fixed proportion of NADPH/NADH produced by the methylene-H4MPT dehydrogenases MtdA and MtdB in the EMCP and ICL variants.

Interdependencies of dissimilatory and assimilatory processes. The energetic efficiency of dissimilation processes determines the amount of energy available for assimilation and hence is a major determinant of methylotrophic growth. The complete set of EFMs (152872) of the methylotrophic network was analyzed to investigate the relationships between dissimilation and assimilation processes. The EFMs were classified according to their biomass yields, and then according to the various types of dissimilatory processes (Fig. 3). In the EFM with the optimal biomass yield (0.42 g·g-1), 70% of methanol was directly oxidized via C1 pathways and the remainder was used for assimilation. In a significant number of assimilatory EFMs, methanol was entirely oxidized to CO2 via the C1 pathways (Fig. 3), meaning that no reduced carbon entered the assimilatory pathways and indicating that biomass could be fully generated from CO2. The highest biomass yield that could be obtained by such a process was 0.283 g·g-1 (EFM number 122591). In this EFM (Fig. 2A), carbon fixation is achieved by a process involving both the EMCP and the serine cycle. The process starts in the EMCP, where two glyoxylate molecules are generated from one acetyl-CoA and two CO2. The two glyoxylate molecules enter the serine cycle to produce two glycine molecules. One glycine is converted by the glycine cleavage complex into one CO2 and one Me-THF. The latter compound allows the conversion of the second glycine molecule into serine, which is used in subsequent steps of the serine cycle, allowing the regeneration of the initial acetyl-CoA molecule and, through the operation of the entire mechanism, the formation of all carbon precursors needed for biosynthetic purposes. The overall carbon balance is 2 CO2 -> 1 glyoxylate. As this process requires the release of Me-THF via the glycine decarboxylase complex, it represents a distinct feature compared to the classical operation of the serine cycle. The carbon yield of the CO2-assimilation process is significantly lower than that observed for methanol assimilation (40% vs 62%). The ATP needs are about twice as high (7.2 vs 3.8 mol·mol(carbon assimilated)-1), and the redox needs are two to three times higher. Such high energetic costs can be covered by methanol oxidation, but the overall CO2 assimilation process is much less favorable than methanol assimilation. This CO2 assimilation process has not been reported so far and is a direct consequence of the capability of the EMCP to ensure CO2 fixation.
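The overall balance of 2 CO2 per glyoxylate can be checked by summing lumped, carbon-only stoichiometries of the steps described above. This is an illustrative sketch: cofactors, CoA and nitrogen are ignored, acetyl-CoA is counted as two carbons, and the last step lumps the remaining serine cycle reactions (from serine, through carboxylation of PEP, to the cleavage of malyl-CoA) into a single conversion, which is an assumption based on the standard description of the serine cycle rather than on the EFM itself.

```python
from collections import Counter

# Carbon atoms per metabolite (acetyl-CoA counted as its acetyl moiety)
CARBONS = {"acetyl-CoA": 2, "CO2": 1, "glyoxylate": 2,
           "glycine": 2, "Me-THF": 1, "serine": 3}

# (step, number of turns, stoichiometry); negative = consumed, positive = produced
STEPS = [
    ("EMCP", 1, {"acetyl-CoA": -1, "CO2": -2, "glyoxylate": 2}),
    ("glyoxylate -> glycine", 2, {"glyoxylate": -1, "glycine": 1}),
    ("glycine cleavage", 1, {"glycine": -1, "CO2": 1, "Me-THF": 1}),
    ("glycine + Me-THF -> serine", 1, {"glycine": -1, "Me-THF": -1, "serine": 1}),
    # Lumped remainder of the serine cycle (assumed stoichiometry)
    ("serine + CO2 -> glyoxylate + acetyl-CoA", 1,
     {"serine": -1, "CO2": -1, "glyoxylate": 1, "acetyl-CoA": 1}),
]


def carbon_balanced(stoichiometry):
    """True if the carbon atoms consumed equal the carbon atoms produced."""
    return sum(CARBONS[m] * coeff for m, coeff in stoichiometry.items()) == 0


def net_conversion(steps):
    """Sum all steps and drop metabolites whose net coefficient is zero."""
    net = Counter()
    for _, turns, stoichiometry in steps:
        for metabolite, coeff in stoichiometry.items():
            net[metabolite] += turns * coeff
    return {m: c for m, c in net.items() if c != 0}


all_balanced = all(carbon_balanced(s) for _, _, s in STEPS)
net = net_conversion(STEPS)
print(all_balanced, net)
```

With these lumped steps every intermediate cancels, acetyl-CoA is regenerated, and the net conversion is the consumption of two CO2 for one glyoxylate, in line with the balance stated in the text.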

Substitution of the ethylmalonyl-CoA pathway by the glyoxylate cycle. The recently discovered EMCP is an alternative to the classical glyoxylate cycle for the biosynthesis of glyoxylate from acetyl-CoA in organisms lacking isocitrate lyase (ICL) [14, 15, 30]. To compare the metabolic properties conferred by the EMCP with those of the glyoxylate cycle, we generated a variant of the methylotrophic network lacking the EMCP but possessing the glyoxylate cycle. This was done by setting the flux from crotonyl-CoA to propionyl-CoA to zero and by adding the ICL reaction. Malate synthase, the enzyme of the glyoxylate cycle that catalyzes the condensation of acetyl-CoA and glyoxylate into malate, was not added since M. extorquens can use a combination of two enzymes to achieve the same reaction [31], as also described in R. sphaeroides [32]. As expected, the glyoxylate cycle was essential for growth on methanol and was found in all assimilatory EFMs. The maximal biomass yield predicted for the ICL variant (0.41 g·g-1) was similar to that predicted for the EMCP variant (0.42 g·g-1). To obtain such maximal growth, the proportion of methanol oxidized in the ICL variant was smaller (61% vs 70%) and the NADPH requirements were lower than in the EMCP variant, showing a higher energetic efficiency of the ICL variant. In contrast to the EMCP variant, the ICL variant was not found to allow complete biomass formation from CO2.

Fig. 3 - EFM analysis of the balance between dissimilation and assimilation in EMCP and ICL variants. The biomass-forming EFMs were calculated and sorted according to the biomass yield (from bottom to top, blue color scale). For each EFM, the fluxes through the main dissimilatory processes (see text for details) were extracted and plotted separately. Fluxes were expressed relative to the rate of methanol uptake and were plotted using a color scale (black to green). First lane: cytoplasmic FDH, fdh(c); second lane: periplasmic FDH, fdh(p); third lane: TCA oxidation of acetyl-CoA, TCAc; fourth lane: other dissimilation processes. A: EFMs calculated with the methylotrophic network, which contains the ethylmalonyl-CoA pathway (EMCP). B: EFMs calculated with the network variant where the EMCP was replaced by the glyoxylate cycle (ICL variant).

To investigate the potential role of the EMCP in redox balancing, we compared the capability of the two metabolic network variants to respond to varying NADPH production levels. FBA simulations were performed in which the amount of NADPH produced by the Me-H4MPT dehydrogenase MtdB [33], which can use either NADH or NADPH, was varied from 0 to 15.0 mmol·g-1·h-1 (0 to 100% of MtdB flux).
For both the EMCP and ICL variants, the absence of NADPH production in the C1 pathways can be compensated by other NADPH-producing systems, i.e. the PPP or malic enzyme. The theoretical maximal growth rate increased with NADPH production until a maximum was reached, which represents the optimal balance between NADPH production and growth. Maximal growth for the EMCP variant was obtained at a higher NADPH production level than for the ICL variant (9.0 vs 7.7 mmol·g-1·h-1), in agreement with the higher redox demand identified previously. The EMCP variant was able to maintain higher growth rates when NADPH production was further increased. This capability was correlated with a significant increase of the EMCP flux, which was accompanied by a truncation of the serine cycle. Rather than being converted to OAA, PEP is converted by pyruvate kinase into pyruvate, which is further converted by pyruvate dehydrogenase into acetyl-CoA, which enters the EMCP. This pathway generates ATP (via pyruvate kinase) and releases NADH (via pyruvate dehydrogenase), resulting in a transhydrogenase-like mechanism in which redox equivalents are transferred from NADPH to NAD+, in addition to the transfer to CO2.

Fragility of the methylotrophic network. Robustness is an inherent property of a metabolic network and is defined as the capability of this network to operate when one or more reactions are removed. The robustness of the methylotrophic network was analyzed using minimal cut sets (MCSs), which correspond to minimal combinations (singlets, pairs, triplets, etc.) of reactions whose removal blocks the operation of a target metabolic function [34]. The identification of all MCSs in a metabolic network allows the calculation of the fragility coefficient (FC) of each reaction. The FC of a reaction represents the probability that the metabolic system fails to achieve the target function when the reaction is removed. This approach was applied to analyze the robustness of the methylotrophic network using growth as the target metabolic function. A significant number of reactions (391) were found to have an FC of 1 and are therefore predicted to be essential (Additional file 9). Among these 391 reactions, 30% are catalyzed by multiple enzymes, indicating that enzyme redundancy is used substantially to limit metabolic fragility in M. extorquens AM1. Most of the essential reactions (279) were found in biosynthetic pathways, which is consistent with studies performed with other organisms. The FCs of reactions found in central metabolism spanned a wide range of values and were distributed heterogeneously among metabolic pathways (Fig. 4). Some processes, such as the C1 and carbohydrate pathways and the C3/C4 interconversions, had low FCs and hence were predicted to be robust parts of the metabolism. Most other parts of the central metabolism had high FCs and hence were predicted to be fragile. Of the 84 reactions of the central carbon metabolism, 40 were found to be essential. The essential reactions were concentrated in the primary assimilation processes (Fig. 4).
Most reactions of the serine cycle (67%) and all reactions of the EMCP (100%) were predicted to be essential for methylotrophic growth, indicating that the assimilation processes are highly fragile. The same observation holds true for gluconeogenesis.

Fig. 4 - Structural fragility of the methylotrophic network. Prediction of reaction essentiality from minimal cut set (MCS) analysis and comparison with experimental mutant phenotypes. The fragility coefficient (FC) calculated from MCS analysis is given for each reaction of M. extorquens central metabolism, and ranges from 0 (fully dispensable reaction) to 1 (essential reaction). Boxes next to reactions represent enzymes. When available, the phenotype of the mutant lacking the gene encoding a particular enzyme is given by a color code (see legend). The occurrence of alternate reactions that are not displayed on the map is indicated by a star.

Because a particular reaction can be catalyzed by one or several enzymes, the essentiality of a reaction does not necessarily mean that the removal, by gene deletion, of one particular enzyme will be lethal. Among the 40 essential reactions found in central pathways, 29 were catalyzed by a single enzyme. Of the corresponding genes, 19 have been studied experimentally, and the deletion of 95% of them was shown to be lethal for methylotrophic growth [35] (Additional file 10). For an essential reaction with multiple enzymes, one isoenzyme can compensate for the lack of the other(s). Genes encoding isoenzymes are therefore predicted to be non-essential. Mutant analysis showed, however, that 4 of the 11 essential reactions with multiple enzymes found in primary assimilation processes involved genes whose deletion was lethal for growth on methanol. This observation suggests that the products of these genes play an essential role during growth on methanol that cannot be compensated by the other potential enzymes catalyzing the same reactions, or that they are regulated differently, or both. Taken together, these data showed that the main processes of methanol assimilation are highly fragile in M. extorquens AM1.
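To illustrate how fragility coefficients follow from a set of MCSs, the sketch below uses one commonly cited definition, namely the reciprocal of the average size of all MCSs in which a reaction participates, with a value of 0 for reactions that appear in no MCS. Whether exactly this formula was used here is an assumption. The reactions, cut sets and gene names are hypothetical placeholders, and the gene-level rule (a gene is predicted essential only if it encodes the sole enzyme of an essential reaction) paraphrases the reasoning of the paragraph above.

```python
from fractions import Fraction


def fragility_coefficients(reactions, minimal_cut_sets):
    """FC = 1 / (average size of the MCSs containing the reaction); 0 if none."""
    coefficients = {}
    for reaction in reactions:
        sizes = [len(mcs) for mcs in minimal_cut_sets if reaction in mcs]
        # 1 / (sum(sizes) / len(sizes)) equals len(sizes) / sum(sizes)
        coefficients[reaction] = (
            Fraction(len(sizes), sum(sizes)) if sizes else Fraction(0)
        )
    return coefficients


def predicted_essential_genes(coefficients, enzymes_by_reaction):
    """Genes encoding the only enzyme of a reaction with FC = 1."""
    genes = set()
    for reaction, fc in coefficients.items():
        enzymes = enzymes_by_reaction.get(reaction, [])
        if fc == 1 and len(enzymes) == 1:
            genes.add(enzymes[0])
    return genes


# Hypothetical toy network, for illustration only
reactions = ["r1", "r2", "r3", "r4", "r5", "r6", "r7"]
minimal_cut_sets = [{"r1"}, {"r2"}, {"r3", "r4"}, {"r4", "r5", "r6"}]
enzymes_by_reaction = {"r1": ["geneA"], "r2": ["geneB1", "geneB2"], "r3": ["geneC"]}

fc = fragility_coefficients(reactions, minimal_cut_sets)
essential_genes = predicted_essential_genes(fc, enzymes_by_reaction)

for reaction in reactions:
    print(reaction, fc[reaction])
print(sorted(essential_genes))
```

In the toy example r1 and r2 are both essential reactions (each forms a cut set of size one), but only geneA is predicted to be essential, because r2 has two isoenzymes. A mutant phenotype that contradicts such a prediction points to isoenzymes that are not functionally equivalent.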

Distribution of metabolic fluxes during methylotrophic growth. The distribution of metabolic fluxes during methylotrophic growth was determined using 13C-metabolic flux analysis (13C-MFA). A first investigation of metabolic fluxes during methylotrophic growth of M. extorquens AM1 was published in 2003 [36]. New insights, such as the operation of the EMCP [15] and the biomass composition (this study), called for a new metabolic flux analysis of M. extorquens AM1 during growth on methanol. For this purpose, a series of new 13C-methanol labeling experiments were performed, and the isotopic information was monitored by both mass spectrometry and two-dimensional nuclear magnetic resonance (2D-NMR) [37] (Additional files 11, 12). This analytical combination has been shown to provide information critical for the resolution of central pathways in M. extorquens [15]. The flux distribution obtained from these investigations is displayed in Fig. 5 and listed in Additional file 13, and the fitting accuracy and sensitivity analysis are listed in Additional files 12, 13, 14, 15. The flux data indicated that 12.7 ± 0.58 mmol·g-1·h-1 of methanol, i.e. 84% of the methanol consumed (15.10 ± 0.60 mmol·g-1·h-1), was directly oxidized to CO2 within the C1 pathways. A release of CO2 was also observed within biosynthetic pathways (2.6% of consumed methanol) and central metabolism (5.4%). The CO2 releases in central metabolism were due to substantial fluxes through malic enzyme (0.36 mmol·g-1·h-1) and PEPCK (0.26 mmol·g-1·h-1), but they were not associated with dissimilation processes. Indeed, the TCA cycle contributed only 1.1% of total CO2 release and operated in an incomplete and anabolic mode. The flux of C1 assimilation via Me-THF was 2.4 ± 0.02 mmol·g-1·h-1, which represents 16% of methanol uptake. The flux data clearly showed the central role of the serine cycle in distributing carbon flows throughout the entire metabolic network to fulfill the requirements in carbon precursors. About 20% (0.50 mmol·g-1·h-1) was directed towards gluconeogenesis and carbohydrate pathways, and 30% was routed to the formation of pyruvate and TCA cycle intermediates. The release of acetyl-CoA by the serine cycle was significant (1.80 mmol·g-1·h-1). The major part (1.40 mmol·g-1·h-1) was recycled back to the serine cycle by the EMCP, and the remainder was used for anabolic purposes. The fixation of CO2 occurring within central pathways was calculated from the difference between CO2-utilizing and CO2-releasing fluxes, and was 2.44 mmol·g-1·h-1. This value was similar to the rate of methanol assimilation via Me-THF (2.39 mmol·g-1·h-1). These data were consistent with the observation that 50% of the biomass carbon is derived from CO2 [23, 38]. Such a carbon balance can be obtained if two molecules of glyoxylate are generated per turn of the EMCP [15], which requires that the propionyl-CoA generated in this pathway is not directly used for anabolic purposes but is converted into glyoxylate. Accordingly, the replenishment of the glyoxylate pool from propionyl-CoA was almost identical to the direct release of glyoxylate within the EMCP (0.70 ± 0.02 mmol·g-1·h-1). No flux was found through the glycine cleavage complex, indicating that the CO2 assimilation mechanism identified from EFM analysis was not operating during pure methylotrophy under the chosen cultivation conditions.

Fig. 5 - Distribution of fluxes in the central metabolic network of M. extorquens AM1 upon methylotrophic growth. Carbon fluxes were calculated from 183 isotopomer measurements (NMR + MS data) collected during steady-state growth of M. extorquens AM1 on [13C]-methanol. Fluxes are given in mmol·g-1·h-1, with standard deviations given below flux values. Exchange fluxes through reversible reactions are given within brackets. The width of the arrows is proportional to the flux value.

Surprisingly, a glycolytic flux through the Entner-Doudoroff (ED) pathway was observed. This flux (0.08 ± 0.02 mmol·g-1·h-1) was low compared to the rate of formate assimilation but was needed to fit the labeling data and represented about 14% of total pyruvate synthesis. This observation indicated that some carbon atoms were recycled through gluconeogenesis and glycolysis during methylotrophic growth. The flux data provided valuable information regarding the C3/C4 pathways. The synthesis of pyruvate, which is required for various anabolic purposes, was proposed earlier to proceed via the conversion of PEP into pyruvate by pyruvate kinase [39]. The flux data showed that pyruvate was synthesized by three different routes, namely pyruvate kinase, the ED pathway and malic enzyme. The main flux was carried by malic enzyme (0.36 vs 0.13 mmol·g-1·h-1). Moreover, PEP synthase, which catalyzes the reaction opposite to pyruvate kinase, is active and its flux (0.13 mmol·g-1·h-1) is higher than that of the latter enzyme. This observation indicated the occurrence of a substrate cycle between PEP and pyruvate due to the parallel activity of pyruvate kinase and PEP synthase, in which 68% of pyruvate is recycled. Three additional substrate cycles were observed in this part of the metabolism.
Two of them were related to C3/C4 inter-conversions: i) PEP/OAA cycling via PEPCL and PEPCK (13% of PEP recycled), and ii) PEP/malate/pyruvate cycling via PEPCL, malate dehydrogenase, malic enzyme and PEP synthase (4% of PEP recycled via malate and pyruvate). The fourth substrate cycle was observed between malate and (acetyl-CoA + glyoxylate). It relies on the reversible activity of malyl-CoA lyase [40] and on the parallel operation of malate-CoA ligase and malyl-CoA thioesterase. In the serine cycle, malate-CoA ligase activates malate to malyl-CoA, from which glyoxylate and acetyl-CoA are released. Malyl-CoA thioesterase catalyzes the opposite conversion, the hydrolysis of malyl-CoA to malate. This enzyme is thought to play a key role during growth on multicarbon compounds, but its activity is not expected during methylotrophic growth. Indeed, the enzyme activity is down-regulated during methylotrophic growth [41]. However, a significant activity of this enzyme is still detected during growth on methanol [41], and represents 22% of the activity found on acetate. Such a level of activity is likely to be sufficient to maintain a flux through the reaction, thereby resulting in substrate cycling. The flux data show not only that the latter cycle operates in M. extorquens AM1 during growth on methanol, but also that the extent of recycling is significant (32%). The energetic cost of metabolite recycling within the four above-mentioned processes was calculated from the flux data to be 1.28 mmol ATP·g-1·h-1, with the most expensive one (0.83 mmol ATP·g-1·h-1) being the malate/(acetyl-CoA + glyoxylate) cycle. Overall, however, this energetic cost was almost negligible and represented only 4.4% of the total production of ATP (29.3 mmol ATP·g-1·h-1).
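Several of the percentages quoted above can be recomputed directly from the reported fluxes. The short sketch below does this for four of them; all numerical values are copied from the text, whereas the variable names and the way the CO2 share of assimilated carbon is estimated (net CO2 fixation divided by the sum of net CO2 fixation and Me-THF assimilation) are assumptions made for illustration.

```python
# Fluxes copied from the text, in mmol·g-1·h-1
METHANOL_UPTAKE = 15.10
DIRECT_OXIDATION = 12.7
ME_THF_ASSIMILATION = 2.39
NET_CO2_FIXATION = 2.44

# ATP figures copied from the text, in mmol ATP·g-1·h-1
SUBSTRATE_CYCLE_ATP = 1.28
MALATE_CYCLE_ATP = 0.83
ATP_REFERENCE = 29.3  # the 29.3 figure against which the cycling cost is compared


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


oxidized = percent(DIRECT_OXIDATION, METHANOL_UPTAKE)
assimilated = percent(ME_THF_ASSIMILATION, METHANOL_UPTAKE)
# Assumed estimate of the share of assimilated carbon that originates from CO2
co2_share = percent(NET_CO2_FIXATION, NET_CO2_FIXATION + ME_THF_ASSIMILATION)
cycling_cost = percent(SUBSTRATE_CYCLE_ATP, ATP_REFERENCE)
malate_cycle_share = percent(MALATE_CYCLE_ATP, SUBSTRATE_CYCLE_ATP)

print(f"methanol oxidized directly: {oxidized:.1f}%")
print(f"methanol assimilated via Me-THF: {assimilated:.1f}%")
print(f"assimilated carbon from CO2: {co2_share:.1f}%")
print(f"ATP spent in substrate cycles: {cycling_cost:.1f}%")
print(f"share of the malate cycle in that cost: {malate_cycle_share:.1f}%")
```

The recomputed values are consistent with the rounded figures in the text (about 84%, 16%, 50% and 4.4%), and they show that the malate/(acetyl-CoA + glyoxylate) cycle accounts for roughly two thirds of the ATP spent in substrate cycling.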

The total demand in NADPH during methylotrophic growth was calculated from the fluxes through NADPH-utilizing reactions and from biomass requirements, and was 5.56 mmol·g-1·h-1. The demand was mainly due to formate assimilation (2.33 mmol·g-1·h-1), biosynthetic requirements (1.80 mmol·g-1·h-1), and operation of the EMCP (1.43 mmol·g-1·h-1). The 13C-flux data also showed that the two NADPH-forming reactions found within central carbon pathways, isocitrate dehydrogenase and G6P dehydrogenase, contributed only negligibly (below 5%) to the total NADPH production. Hence, most of the NADPH is generated during formaldehyde oxidation. From these data it can be calculated that the production of NADPH in the C1 pathways should be 5.33 mmol·g-1·h-1 to close the NADPH balance. To evaluate the ATP balance, flux variability analysis and FBA simulations were performed in which the methylotrophic network was constrained with the 13C-flux data, biomass requirements, experimental rates of growth and methanol uptake, and maintenance energy (Supplemental Tables S12 & S13). Because the respiratory mechanisms by which the reduced cofactors (NADH, cytochrome C) generated in the C1 pathways are reoxidized are not clearly established in M. extorquens, the ATP balance cannot be firmly established from the flux data. Nevertheless, the simulations showed that, if dissimilation proceeds via the cytoplasmic, NADH-dependent route at maximal ATP efficiency (Table 3), then the total production of ATP (44 mmol·g-1·h-1) would be in large excess compared to the requirements (29.3 mmol·g-1·h-1).
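The NADPH and ATP balances above reduce to simple bookkeeping, which the following sketch reproduces. The numbers are copied from the paragraph; the dictionary keys and variable names are illustrative, and the ATP comparison only restates the best-case scenario of the simulation (cytoplasmic, NADH-dependent dissimilation at maximal ATP efficiency), not a measured value.

```python
# NADPH demand by process, in mmol·g-1·h-1 (values from the text)
NADPH_DEMAND = {
    "formate assimilation": 2.33,
    "biosynthesis": 1.80,
    "EMCP": 1.43,
}
C1_PATHWAY_NADPH = 5.33  # production needed in the C1 pathways to close the balance

# ATP figures, in mmol·g-1·h-1 (values from the text)
ATP_PRODUCTION_BEST_CASE = 44.0
ATP_REQUIREMENT = 29.3

total_nadph_demand = sum(NADPH_DEMAND.values())
# Remainder that must come from central pathways (isocitrate and G6P dehydrogenases)
nadph_from_central_pathways = total_nadph_demand - C1_PATHWAY_NADPH
central_share = 100.0 * nadph_from_central_pathways / total_nadph_demand

atp_surplus = ATP_PRODUCTION_BEST_CASE - ATP_REQUIREMENT
atp_ratio = ATP_PRODUCTION_BEST_CASE / ATP_REQUIREMENT

print(f"total NADPH demand: {total_nadph_demand:.2f}")
print(f"NADPH from central pathways: {nadph_from_central_pathways:.2f} ({central_share:.1f}%)")
print(f"best-case ATP surplus: {atp_surplus:.1f} (ratio {atp_ratio:.2f})")
```

The three demand terms add up to the stated 5.56 mmol·g-1·h-1, the remainder left for the central pathways is about 4% of the total (consistent with "below 5%"), and in the best case ATP production exceeds the requirements by roughly 50%.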

Discussion

The genome-scale metabolic network reconstructed in this work offers an integrated view of the current metabolic knowledge of the methylotrophic bacterium M. extorquens AM1. It provides new insights into the biochemistry of this organism and reveals the network-scale organization of metabolic processes as well as a first evaluation of the complete metabolic potential of this bacterium. The metabolic reconstruction made it possible to draw a complete and detailed picture of the central carbon metabolism of the bacterium, which appears as a mosaic of classical (TCA cycle, anaplerotic processes, gluconeogenesis, PPP, ED pathway) and specialized (C1 pathways, serine cycle, EMCP) pathways, enabling growth on C1 and multicarbon (C2 to C4) compounds. The core of the central metabolism is organized as a highly unusual series of metabolic cycles tightly embedded within one another and operating as an entity to achieve C1 assimilation during methylotrophic growth. The ability to assimilate C1 compounds relies on a complex metabolic machinery, in which the initial steps – from methanol to biomass precursors – require a particularly high number of reactions (e.g. at least 36 reactions to obtain acetyl-CoA). The entire process is strongly reductive and energy-consuming. Most of these reactions require enzymes and cofactors that are specific to C1 – or C2 – growth, and must be biosynthesized for the specific purpose of C1 or C2 utilization. Hence, the energetic costs for the biosynthesis and maintenance of this machinery are likely to be substantial for the bacteria. In addition, the network-scale analysis reveals that C1 assimilation is structurally fragile. Like specialist metabolic networks [42], the core metabolism of M. extorquens is characterized by a low level of alternative pathways but a high fraction of reactions that are essential for methylotrophic growth (almost 50%). In such networks robustness usually arises from enzyme (and genetic) redundancy, where multiple isoenzymes can catalyze essential reactions [42, 43]. The genetic or biochemical deficiency in one isoenzyme can be compensated by another one. Accordingly, the percentage of multiple genes encoding essential reactions in specialized metabolic networks is much higher than in generalist – or flexible – organisms such as E. coli, B. subtilis or S. cerevisiae (>30% vs 10% redundancy, respectively) [43]. The high degree of redundancy (28%) observed in M. extorquens confirms the specialized metabolism of this methylotroph. Furthermore, among the essential reactions with multiple genes, a significant number of particular genes were shown experimentally to be essential for growth on methanol, suggesting that the alternative gene(s) have different functions or regulation, and hence are not functionally redundant. Therefore, the number of genes essential for methylotrophy is high, and the risk that a gene mutation results in loss of methylotrophic capacity is elevated. This could explain, in part, the successes in the identification of such genes in the last two decades [35]. The observations emerging from metabolic network analysis are highly consistent with evolution experiments in which a significant number of clones collected after prolonged cultivation (1500 generations) on succinate lost their methylotrophic capacity [27]. Both studies suggest that a selection pressure is required to maintain methylotrophy in M. extorquens, indicating that the bacterium frequently encounters methanol in its natural environment and that its usage provides a critical advantage in terms of ecological competitiveness.
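
The notion of genetic redundancy used above can be made concrete with a small gene-to-reaction model. The following is a minimal sketch under stated assumptions: each reaction is described by a list of alternative enzymes (isoenzymes), each enzyme by the set of genes it needs, and all gene and reaction names are hypothetical placeholders rather than entries of the reconstructed network.

```python
from typing import Dict, FrozenSet, List, Set

# A rule is a list of alternatives (isoenzymes, logical OR); each alternative
# is the set of genes that must all be present (enzyme complex, logical AND).
GPR = List[FrozenSet[str]]


def reaction_is_active(gpr: GPR, deleted_genes: Set[str]) -> bool:
    """True if at least one alternative enzyme retains all of its genes."""
    return any(not (alternative & deleted_genes) for alternative in gpr)


def single_gene_essentials(gprs: Dict[str, GPR],
                           essential_reactions: Set[str]) -> Set[str]:
    """Genes whose individual deletion inactivates an essential reaction."""
    essentials: Set[str] = set()
    for reaction in essential_reactions:
        for gene in set().union(*gprs[reaction]):
            if not reaction_is_active(gprs[reaction], {gene}):
                essentials.add(gene)
    return essentials


# Hypothetical toy network (placeholder names, not from the reconstruction).
toy_gprs: Dict[str, GPR] = {
    "R_single": [frozenset({"geneA"})],
    "R_isoenzymes": [frozenset({"geneB1"}), frozenset({"geneB2"})],
    "R_complex": [frozenset({"geneC", "geneD"})],
}

essential = single_gene_essentials(toy_gprs, set(toy_gprs))
# Fraction of reactions encoded by more than one alternative enzyme.
redundant_fraction = sum(len(rule) > 1 for rule in toy_gprs.values()) / len(toy_gprs)

print(sorted(essential))
print(f"{redundant_fraction:.0%} of reactions have isoenzymes")
```

In this toy case only the reaction with two isoenzymes tolerates any single-gene deletion. As noted in the text, such a purely structural count overestimates robustness whenever the alternative genes differ in function or regulation.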

The metabolic reconstruction data also indicate that a dense network of C3/C4 inter-conversions plays a critical role as a branch-point connecting the specialized and classical pathways. Indeed, seven reactions interconnect three C3 (2PG, PEP, pyruvate) and two C4 (malate, OAA) intermediates, thereby strongly embedding the serine cycle, the TCA cycle, anaplerotic processes, and gluconeogenesis (via 2PG). Such a topology allows a wide range of alternative metabolic routes and provides metabolic flexibility. Van Dien et al. [25, 39] have emphasized the role of these processes in the metabolism of multi-carbon compounds. During growth on C4 compounds such as succinate, a functional TCA cycle is required and the generation of acetyl-CoA is ensured by pyruvate dehydrogenase. Hence, the conversion of C4 compounds into pyruvate is critical and can be achieved by redundant routes. During methylotrophic growth – in which a complete TCA cycle does not operate – the C3/C4 inter-conversions primarily ensure the opposite conversion of C3 intermediates into C4 intermediates in the serine cycle. They also provide alternative metabolic routes for such conversions, as observed with the significant conversion of PEP into pyruvate via PEPCL, malate dehydrogenase, and malic enzyme, though the role of this pathway is still unclear.

The most striking feature is, however, the occurrence of substantial substrate cycling within the C3/C4 inter-conversions during growth on methanol. In addition, substrate cycling was also observed between C4 (malate) and C2 (glyoxylate + acetyl-CoA), at the branch-point between the serine cycle, the TCA cycle, the EMCP and anaplerosis. Substrate cycles result from the simultaneous operation of opposite, non-reversible reactions or processes, at the expense of energy. They can represent adaptation mechanisms allowing fast switching of metabolic processes from one direction to the opposite one [26]. The nature of the substrate cycles observed during methylotrophic growth indicated that the entire set of reactions from PEP or pyruvate to acetyl-CoA and glyoxylate was operating as a fully reversible process. As mentioned above, these processes are the branching-point of the specialized – i.e. serine cycle, EMCP – and generalist – i.e. TCA cycle, anaplerotic processes, and gluconeogenesis – pathways. They are also the starting point of a large number of biosynthetic routes, and the entry point of the utilization pathways of all C1, C2 and C3/C4 carbon sources used by the bacterium. Taken together, these data suggest that during pure methylotrophic growth the occurrence of substrate cycling provides flexibility between specialized and general pathways, thereby allowing fast switching of the metabolism from methanol to more favorable substrates.

Though M. extorquens AM1 is considered to be a methylotroph but not an autotroph, the in silico investigations revealed a potential carbon autotrophic mechanism in this bacterium, which relies on the unique property of the EMCP to enable CO2 fixation. The CO2-fixation mechanism involves a cyclic operation of both the EMCP and the serine cycle to generate one glyoxylate from two CO2. It can potentially operate independently of methanol assimilation in M. extorquens, but was not observed during methylotrophic growth in our investigations. Because the genome of M. extorquens contains the complete information for a photosynthetic machinery, it is tempting to speculate that this bacterium may operate in a photoautotrophic mode. However, there is currently no experimental evidence of such behavior in M. extorquens. The role, benefit and conservation of this pathway in M. extorquens and other organisms remain unclear. The EMCP is more complex and energy-demanding than the glyoxylate cycle, though it provides a higher carbon balance for assimilation. In photosynthetic methanol utilizers, carbon dioxide functions as an electron sink for the excess electrons in methanol [44]. It was recently proposed that CO2 fixation could represent a central redox cofactor recycling mechanism in bacteria [45]. The potential role of the EMCP in such a mechanism was recently shown from investigations of CBB mutants defective for the reductive PPP in acetate-grown R. sphaeroides [46]. Although further investigations are required to determine the actual physiological benefits of the EMCP in serine cycle methylotrophs, our investigations show that this pathway can potentially act as a redox-balancing mechanism or as an autotrophic pathway in M. extorquens during methylotrophic growth.


Conclusions

The unusual organization of the central carbon metabolism of M. extorquens AM1 allows efficient utilization of C1 compounds via highly specialized – and fragile – pathways, but is versatile enough around a flexible backbone of C2/C3/C4 inter-conversions to allow switching to other carbon sources. These observations showed that the bacterium maintains active metabolic processes that are not needed for methanol utilization but allow adaptation to other carbon sources. This hypothesis is consistent with the observation that methanol is produced by plants and released in the morning [47, 48]. This work emphasizes that the metabolism of the bacterium is adapted to its lifestyle not only in terms of enzymatic equipment, but also in terms of network-level structure and regulation. It suggests that the metabolism of the bacterium is adapted both structurally and functionally to an efficient but transitory utilization of methanol. This work also illustrates that the combination of GS network modeling and experimental approaches provides novel insights into the biochemistry and physiology of methylotrophic bacteria, which could be extended to obligate methylotrophs and to the comparison of serine cycle versus RuMP- and CBB-utilizing methylotrophs.

Materials and Methods

Network reconstruction. The genome-scale (GS) metabolic network of M. extorquens AM1 was reconstructed using procedures recommended for the generation of high-quality reconstructions [17]. The detailed procedure is given in Supporting Information Fig. 1. Briefly, the process of metabolic reconstruction included the following steps:

1. Generating a draft reconstruction. The genome information was extracted from the MicroScope database [49] on December 9th, 2009. Genes annotated for metabolic functions were selected and assembled with biochemical information collected from literature. The data were completed using metabolic databases, mainly MetaCyc [20] and KEGG [18]. During the reconstruction process, systematic Blast searches and interrogation of metabolic databases were performed to refine weak or missing genome annotation information and to include newly published data. In some cases this process led to the re-annotation of genes, and the publicly available annotation was corrected accordingly.

2. Refining the reconstruction. The reconstruction was refined from all genetic, biochemical, and physiological data available for M. extorquens and related species, from metabolic (MetaCyc, KEGG) and transporter (TCDB, http://www.tcdb.org/) databases, and from in-house expertise on methylotrophy (Additional file 1). Metabolite information such as the name, molecular formula and metabolic database identifiers of compounds was included in the network, and refined from available metabolomics data for 157 metabolites [50]. Neutral formulas obtained from PubChem (http://pubchem.ncbi.nlm.nih.gov/) were used to validate reaction stoichiometry (mass balancing), including proton balance.

3. Generation of a Gene-to-Protein-to-Reaction (GPR) association network. The GPR association was designed to describe explicitly all the relationships between molecular species and functional activities. Specific identifiers were assigned to enzymes, reactions and metabolites.

4. Gap-filling. A draft metabolic map, drawn using the software Cytoscape [51], was used as the starting point for the gap-filling process. Gaps were identified from stand-alone reactions or metabolites, and from missing connections in essential metabolic processes. Spontaneous reactions, and reactions or transports without associated genes in the M. extorquens genome, were added according to metabolic (MetaCyc, KEGG) and transport (TCDB) databases.

5. Conversion of the reconstruction into computational format. The reconstruction was loaded into the software CellNetAnalyzer [21]. The consistency of the reconstructed network was evaluated from in silico investigations (modeling) and from the ability of the network to explain growth on the most studied carbon sources, namely methanol and succinate. Experimental information [22] was used to identify substrate utilization. Exchange reactions – i.e. exchange with the environment – were finally added, corresponding to known substrate usage and minimal medium composition.

6. Quality assessment. The quality of the reconstructed network was determined according to [17] by assigning a confidence score to each individual reaction, depending on the evidence for the presence of the reaction, with the highest score given to experimentally demonstrated reactions and the lowest score given to gap-filling reactions.

The detailed list of reactions, metabolites, and other network components, and the GPR association network are given in Additional files 2 and 3. The model in SBML format can be found in Additional file 16.
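
The mass-balancing check mentioned in the refinement step can be illustrated as follows. This is a minimal sketch, not the procedure actually used for the reconstruction: it assumes simple neutral formulas without brackets or charges, and it uses the textbook hydration of fumarate to malate as an illustrative reaction; the function names are ours.

```python
import re
from collections import Counter
from typing import Dict


def parse_formula(formula: str) -> Counter:
    """Count atoms in a simple formula such as 'C4H6O5' (no brackets, no charge)."""
    counts: Counter = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def element_imbalance(stoichiometry: Dict[str, int],
                      formulas: Dict[str, str]) -> Dict[str, int]:
    """Return the net atom count per element (products minus substrates).

    Substrates carry negative coefficients and products positive ones.
    An empty result means that the reaction is mass-balanced.
    """
    net: Dict[str, int] = {}
    for metabolite, coefficient in stoichiometry.items():
        for element, count in parse_formula(formulas[metabolite]).items():
            net[element] = net.get(element, 0) + coefficient * count
    return {element: n for element, n in net.items() if n != 0}


# Neutral formulas for an illustrative reaction (fumarate + H2O -> malate).
formulas = {"fumarate": "C4H4O4", "water": "H2O", "malate": "C4H6O5"}

balanced = element_imbalance({"fumarate": -1, "water": -1, "malate": 1}, formulas)
# Same reaction with water forgotten: the check reports the missing atoms.
unbalanced = element_imbalance({"fumarate": -1, "malate": 1}, formulas)

print(balanced)
print(unbalanced)
```

The second call reports an excess of two hydrogens and one oxygen on the product side, which is the kind of discrepancy that points to a missing water or proton in a draft reaction.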

Determination of the chemical composition of cells. M. extorquens AM1 was grown in fed-batch mode in mineral medium containing methanol as sole carbon and energy source, as described in [15]. For cell dry weight (CDW) determination, 30 ml of culture were centrifuged in a 50 ml Falcon tube, washed with de-ionized water and dried to constant weight at 80 °C. Falcon tubes were incubated for several days at 80 °C prior to use. For other measurements, cells were harvested by centrifugation at 5000 g for 5 min. Cell pellets were frozen in liquid nitrogen and stored at -20 °C until analysis.

i) Lipid content. Whole-cell hydrolysis with subsequent acid methylation of fatty acids was carried out as described in [30] with slight modifications. Cells (10-20 mg CDW) were hydrolyzed with 4 ml of 15% NaOH (w/v) in methanol/water (1:1, v/v) for 30 min at 100 °C. Fatty acid methyl esters (FAMEs) were obtained by addition of 8 ml 6 M HCl/methanol (13:11, v/v) and incubation for 2.5 hrs at 80 °C. An internal fatty acid standard (3 mg C15:0) was added before hydrolysis for quantification purposes. The methylation yield was measured from the addition of a FAME standard (3 mg C19:0 FAME) after the methylation step. FAMEs were extracted with 5 ml hexane/methyl-tert-butyl ether (1:1, v/v) and washed with 6 ml 1% NaOH in water (w/v). Extracted FAMEs were analyzed by gas chromatography with flame ionization detection (GC-FID) (Agilent Technologies 6850 with 7683B Series injector and FID detector) on an HP-5 column, length 30 m, I.D. 0.25 mm, film 0.25 µm (Agilent Technologies). Helium was the carrier gas with a column flow of 2.4 ml/min; detector temperature was set to 300 °C, and inlet temperature to 250 °C. A temperature gradient was run from 190 °C to 260 °C at 5 °C per min. A sample volume of 1 µl was injected with a split ratio of 30.

ii) Protein content. Total proteins were quantified by the Biuret method [52], using bovine serum albumin (2 mg/ml) as standard. This method is independent of protein composition [53]. Cells were hydrolyzed in 0.75 ml 1 N NaOH (1-2 mg/ml CDW) at 100 °C for 5 min. After addition of 0.25 ml of 2.5% CuSO4 (w/v), samples were centrifuged and absorption was measured at 550 nm. The amino acid composition of proteins was determined following hydrolysis in 6 M HCl at 110 °C for 22 hours under argon. Samples were dried and derivatized using the AccQ-Tag™ Ultra derivatization chemistry (Waters Corp., Milford, MA, USA) according to the manufacturer's instructions. Amino acid derivatives were separated by UPLC (Waters Corp., Milford, MA, USA) using the AccQ-Tag™ Ultra standard hydrolysate conditions. Amino acid derivatives were detected by UV absorbance.

iii) Carbohydrate content. The carbohydrate content was measured after hydrolysis of the entire cell pellet. A two-step derivatization was used to convert carbohydrates into oxime trimethylsilyl derivatives [54], which were analyzed by GC-FID. For glucose and rhamnose quantification, cells (1-3 mg CDW) were directly subjected to 200 µl 2 M HCl at 80 °C for 4 hrs; for glucosamine quantification, 4 M HCl was used for 16 hrs. Carbohydrates were stable under these conditions. After neutralization, 50 ml of 25 mM lactose solution was added as an internal standard. Samples were vacuum-dried and derivatized for 40 min with 150 µl 0.5 M hydroxylamine·HCl in pyridine at 80 °C. After addition of 110 µl N,O-bis(trimethylsilyl)trifluoroacetamide (BSTFA), samples were incubated for another 20 min. Separation and quantification of the derivatives were performed by GC-FID as described under lipids, except that the column flow was set to 2.7 ml/min and the temperature gradient was run from 160 °C to 310 °C at 7 °C per min.

iv) Polyhydroxybutyrate (PHB) content. The measurement of the PHB content was performed according to [55, 56]. Cell pellets (3-4 mg) were lyophilized and subjected to acid methanolysis with 3% H2SO4 (v/v) in methanol/chloroform 1:1 (v/v) for 2.5 hrs at 100 °C. Benzoic acid was added as internal standard prior to methanolysis. Methyl-hydroxybutyryl monomers were extracted after addition of water (20% v/v) and vigorous mixing. The organic phase was analyzed by GC-FID with a DB-WAX column, length 15 m, I.D. 0.32 mm, film 0.5 µm. Column flow was set to 1.8 ml/min, detector temperature to 270 °C and inlet temperature to 240 °C. A sample volume of 1 µl was injected with a split ratio of 2. The temperature gradient was run from 90 °C to 230 °C at 40 °C/min.

v) DNA content. The DNA content was calculated from that in E. coli [57], using appropriate corrections to account for the size of the M. extorquens genome and for its growth rate on methanol.

vi) RNA content. The RNA content was determined from the amount of ribose released after acidic hydrolysis (2 M HCl for 2 hrs), assuming that all ribose was derived from RNA. The hydrolysis yield was determined from commercial RNA and data were corrected accordingly. Ribose was quantified as described under iii).

vii) Polyamine content. The polyamine content was calculated from that in E. coli. The occurrence of putrescine in methanol-grown M. extorquens cells was also verified by GC-MS-MS.

viii) Carotenoid content. Data were taken from [58]. Based on their experimentally determined chemical properties, the carotenoids were assumed to be spirilloxanthin-like.

ix) Content in intracellular metabolites. Data, which included both the nature and amounts of metabolites, were taken from [50, 59], [60, 61], [62], and [23]. The amounts of Coenzyme A thioesters were determined by P. Kiefer (unpublished data). The contents in tetrahydromethanopterin and related cofactors were obtained from [23, 62]. The contents in other cofactors were calculated from those in E. coli [57], using appropriate corrections.

x) Inorganic ions. The amounts of inorganic ions were calculated from those in E. coli [57], using appropriate corrections.

The complete details of the biomass composition of M. extorquens, as they result from the above investigations or calculations, are given in Additional file 4.

Cultivation and labeling experiment. Batch cultivations (three biological replicates) of M. extorquens AM1 were carried out at 28 °C in minimal medium, in a bioreactor (Infors-HT, Bottmingen, Switzerland), as described previously [15]. Cultivations carried out for the purpose of steady-state [13C]-methanol experiments were performed as in [15]. The cultivations were aerated with 5% naturally labeled CO2 to remove the 13CO2 produced by the bacteria from [13C]-methanol. Under this condition, only 4.6 ± 0.4% of total CO2 was found to derive from [13C]-methanol oxidation.

In silico calculations. The metabolic network, containing 1139 reactions (m) and 977 metabolites (n), was converted into a mathematical model corresponding to an m × n matrix defining the stoichiometric coefficients of the reactions. Calculations of steady-state fluxes were performed using the software CellNetAnalyzer [21] and Matlab (Mathworks, Inc.). Flux Balance Analysis (FBA) calculations were performed using various objective functions, as indicated in the text. Although the genome annotation revealed the occurrence of a potential photosynthetic machinery, all calculations were performed assuming that no photosynthesis operated, since no phototrophic behavior has been reported for M. extorquens. Calculation of EFMs in the methylotrophic network was performed using the solver EFMTool [63]. EFMs were calculated assuming no maintenance energy. To be compared with experimental data, the biomass yields (Y) obtained for the EFMs were calculated as follows:

With max: theoretical maximum growth rate (calculated to be 0.201 for the methylotrophic network from FBA simulations); i: growth rate of the EFMi (= 0.1); qsi: substrate uptake rate of -1 -1 the EFMi, in mmol·g ·h ; NGAMs: corresponding substrate uptake to fulfill non-growth -1 -1 associated maintenance energy, i.e. 1.9 mmol·g ·h ; MWs: molecular weight of the substrate. Minimal cut sets [34] were calculated on the calculated EFMs, using biomass production as target function.

13C Metabolic Flux Analysis. The distribution of metabolic fluxes during methylotrophic growth was determined from 13C-labeling data collected during steady-state growth of M. extorquens on [13C]-methanol. The distribution of 13C-isotopomers in metabolites can be accurately determined by NMR or MS [37, 64, 65], alone or in combination [15]. Here the 13C-isotopomers of proteinogenic amino acids were measured by both methods. NMR spectra were recorded as described in [15], and LC-MS analyses were performed using a Rheos 2200 HPLC system (Flux Instruments) coupled to an LTQ Orbitrap mass spectrometer (Thermo Fisher Scientific) equipped with an electrospray ionization probe. The amino acids were separated on a pHILIC column (150 × 2.0 mm, particle size 5 µm; Sequant, Umea, Sweden), following a procedure described previously [59]. A total of 193 isotopomer data – including 137 NMR data plus 56 MS data – were collected (Additional file 12). The metabolic network considered for flux calculations contained 65 reactions – including 7 reversible reactions – describing M. extorquens central metabolism, according to the topology of the methylotrophic network (Additional file 17). Flux calculations were performed using the software 13C-Flux [66], which uses both mass balances and carbon atom transitions to describe the metabolic network. The methanol uptake rate and the requirements in biomass precursors, determined from data in Additional files 4 and 11, were used as constraints. The confidence on the measured fluxes was determined using the sensitivity analysis module of 13C-Flux. Results were expressed as absolute fluxes in mmol·g⁻¹·h⁻¹ ± standard deviation.
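
NMR and MS report labeling information at different levels of detail: positional isotopomers in the first case, mass isotopomers (number of labeled carbons) in the second. The sketch below shows how the two are related for a single carbon backbone. It is an illustration under stated assumptions, not part of the 13C-Flux software: isotopomers are encoded as strings of '0' (12C) and '1' (13C), and the fractions are invented for the example.

```python
from typing import Dict, List


def mass_isotopomer_distribution(isotopomers: Dict[str, float]) -> List[float]:
    """Collapse positional isotopomer fractions into M+0 ... M+n fractions."""
    n_carbons = len(next(iter(isotopomers)))
    distribution = [0.0] * (n_carbons + 1)
    for pattern, fraction in isotopomers.items():
        # The mass shift equals the number of 13C atoms in the pattern.
        distribution[pattern.count("1")] += fraction
    return distribution


def mean_enrichment(distribution: List[float]) -> float:
    """Average fraction of 13C per carbon position."""
    n_carbons = len(distribution) - 1
    return sum(k * f for k, f in enumerate(distribution)) / n_carbons


# Invented isotopomer fractions for a three-carbon fragment (sum to 1).
example_isotopomers = {"000": 0.1, "100": 0.2, "011": 0.3, "111": 0.4}

mid = mass_isotopomer_distribution(example_isotopomers)
enrichment = mean_enrichment(mid)

print([round(f, 3) for f in mid])
print(round(enrichment, 3))
```

Several positional isotopomers can map onto the same mass isotopomer, which is why the two types of measurement carry complementary information when used together.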

Acknowledgements

We thank Philipp Christen for cultivation of M. extorquens AM1 in bioreactors. We thank Birgit Roth Zgraggen of the Functional Genomics Center Zurich for performing amino acid quantification. This work was supported by ETH Zurich, Research Grant ETH-25 08–2. The Swiss Academy of Engineering Science (SATW) and the Centre Français pour l'Accueil et les Echanges Internationaux (Egide) supported the work with a travel grant (Germaine de Staël program). The work carried out at the LISBP (Toulouse, France) was supported by the Région Midi-Pyrénées, the European Regional Development Fund (ERDF), the French Ministry for Higher Education & Research, the SICOVAL, and the Réseau RMN Midi-Pyrénées.

References

1. Schrader, J., et al., Methanol-based industrial biotechnology: current status and future perspectives of methylotrophic bacteria. Trends Biotechnol, 2009. 27(2): p. 107-15.
2. Brautaset, T., et al., Bacillus methanolicus: a candidate for industrial production of amino acids from methanol at 50 degrees C. Appl Microbiol Biotechnol, 2007. 74(1): p. 22-34.
3. Chistoserdova, L., M.G. Kalyuzhnaya, and M.E. Lidstrom, The expanding world of methylotrophic metabolism. Annu Rev Microbiol, 2009. 63: p. 477-99.
4. Chistoserdova, L., Modularity of methylotrophy, revisited. Environ Microbiol, 2011.
5. Delmotte, N., et al., Community proteogenomics reveals insights into the physiology of phyllosphere bacteria. Proc Natl Acad Sci U S A, 2009. 106(38): p. 16428-33.
6. Corpe, W.A. and S. Rheem, Ecology of the methylotrophic bacteria on living leaf surfaces. FEMS Microbiology Letters, 1989. 62(4): p. 243-249.
7. Sy, A., et al., Methylotrophic metabolism is advantageous for Methylobacterium extorquens during colonization of Medicago truncatula under competitive conditions. Appl Environ Microbiol, 2005. 71(11): p. 7245-52.
8. Abanda-Nkpwatt, D., et al., Molecular interaction between Methylobacterium extorquens and seedlings: growth promotion, methanol consumption, and localization of the methanol emission site. J Exp Bot, 2006. 57(15): p. 4025-32.
9. Large, P.J., D. Peel, and J.R. Quayle, Microbial growth on C(1) compounds. 3. Distribution of radioactivity in metabolites of methanol-grown Pseudomonas AM1 after incubation with [14C]methanol and [14C]bicarbonate. Biochem J, 1962. 82(3): p. 483-8.
10. Large, P.J., D. Peel, and J.R. Quayle, Microbial growth on C(1) compounds. 4. Carboxylation of phosphoenolpyruvate in methanol-grown Pseudomonas AM1. Biochem J, 1962. 85(1): p. 243-50.
11. Large, P.J. and J.R. Quayle, Microbial growth on C(1) compounds. 5. Enzyme activities in extracts of Pseudomonas AM1. Biochem J, 1963. 87(2): p. 386-96.
12. Chistoserdova, L., et al., C1 transfer enzymes and coenzymes linking methylotrophic bacteria and methanogenic Archaea. Science, 1998. 281(5373): p. 99-102.
13. Vorholt, J.A., Cofactor-dependent pathways of formaldehyde oxidation in methylotrophic bacteria. Arch Microbiol, 2002. 178(4): p. 239-49.
14. Erb, T.J., et al., Synthesis of C5-dicarboxylic acids from C2-units involving crotonyl-CoA carboxylase/reductase: the ethylmalonyl-CoA pathway. Proc Natl Acad Sci U S A, 2007. 104(25): p. 10631-6.
15. Peyraud, R., et al., Demonstration of the ethylmalonyl-CoA pathway by using 13C metabolomics. Proc Natl Acad Sci U S A, 2009. 106(12): p. 4846-51.
16. Vuilleumier, S., et al., Methylobacterium genome sequences: a reference blueprint to investigate microbial metabolism of C1 compounds from natural and industrial sources. PLoS One, 2009. 4(5): p. e5584.
17. Thiele, I. and B.O. Palsson, A protocol for generating a high-quality genome-scale metabolic reconstruction. Nat Protoc, 2010. 5(1): p. 93-121.
18. Kanehisa, M. and S. Goto, KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res, 2000. 28(1): p. 27-30.
19. Saier, M.H., Jr., C.V. Tran, and R.D. Barabote, TCDB: the Transporter Classification Database for membrane transport protein analyses and information. Nucleic Acids Res, 2006. 34(Database issue): p. D181-6.
20. Caspi, R., et al., The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of pathway/genome databases. Nucleic Acids Res, 2010. 38(Database issue): p. D473-9.
21. Klamt, S., J. Saez-Rodriguez, and E.D. Gilles, Structural and functional analysis of cellular networks with CellNetAnalyzer. BMC Syst Biol, 2007. 1: p. 2.
22. Knief, C., L. Frances, and J.A. Vorholt, Competitiveness of diverse Methylobacterium strains in the phyllosphere of Arabidopsis thaliana and identification of representative models, including M. extorquens PA1. Microb Ecol, 2010. 60(2): p. 440-52.
23. Crowther, G.J., G. Kosaly, and M.E. Lidstrom, Formate as the Main Branchpoint for Methylotrophic Metabolism in Methylobacterium extorquens AM1. J Bacteriol, 2008.
24. Southall, S.M., et al., Soluble aldose sugar dehydrogenase from Escherichia coli: a highly exposed active site conferring broad substrate specificity. J Biol Chem, 2006. 281(41): p. 30650-9.
25. Van Dien, S.J., et al., Reconstruction of C(3) and C(4) metabolism in Methylobacterium extorquens AM1 using transposon mutagenesis. Microbiology, 2003. 149(Pt 3): p. 601-9.
26. Portais, J.C. and A.M. Delort, Carbohydrate cycling in micro-organisms: what can (13)C-NMR tell us? FEMS Microbiol Rev, 2002. 26(4): p. 375-402.
27. Lee, M.C., H.H. Chou, and C.J. Marx, Asymmetric, bimodal trade-offs during adaptation of Methylobacterium to distinct growth substrates. Evolution, 2009. 63(11): p. 2816-30.
28. Schuster, S., T. Dandekar, and D.A. Fell, Detection of elementary flux modes in biochemical networks: a promising tool for pathway analysis and metabolic engineering. Trends Biotechnol, 1999. 17(2): p. 53-60.
29. Schmidt, S., et al., Functional investigation of methanol dehydrogenase-like protein XoxF in Methylobacterium extorquens AM1. Microbiology, 2010. 156(Pt 8): p. 2575-86.
30. Sasser, M., Identification of bacteria through fatty acid analysis, in Methods in Phytobacteriology, Z. Klement, K. Rudolph, and D.C. Sands, Editors. 1990, Akademiai Kiado: Budapest. p. 199-204.
31. Okubo, Y., et al., Alternative route for glyoxylate consumption during growth on two-carbon compounds by Methylobacterium extorquens AM1. J Bacteriol, 2010. 192(7): p. 1813-23.
32. Erb, T.J., et al., The apparent malate synthase activity of Rhodobacter sphaeroides is due to two paralogous enzymes, (3S)-Malyl-coenzyme A (CoA)/β-methylmalyl-CoA lyase and (3S)-Malyl-CoA thioesterase. J Bacteriol, 2010. 192(5): p. 1249-58.
33. Hagemeier, C.H., et al., Characterization of a second methylene tetrahydromethanopterin dehydrogenase from Methylobacterium extorquens AM1. Eur J Biochem, 2000. 267(12): p. 3762-9.
34. Klamt, S. and E.D. Gilles, Minimal cut sets in biochemical reaction networks. Bioinformatics, 2004. 20(2): p. 226-34.
35. Chistoserdova, L., et al., Methylotrophy in Methylobacterium extorquens AM1 from a genomic point of view. J Bacteriol, 2003. 185(10): p. 2980-7.
36. Van Dien, S.J., T. Strovas, and M.E. Lidstrom, Quantification of central metabolic fluxes in the facultative methylotroph Methylobacterium extorquens AM1 using 13C-label tracing and mass spectrometry. Biotechnol Bioeng, 2003. 84(1): p. 45-55.
37. Massou, S., et al., NMR-based fluxomics: quantitative 2D NMR methods for isotopomers analysis. Phytochemistry, 2007. 68(16-18): p. 2330-40.
38. Large, P.J., D. Peel, and J.R. Quayle, Microbial growth on C1 compounds. II. Synthesis of cell constituents by methanol- and formate-grown Pseudomonas AM1, and methanol-grown Hyphomicrobium vulgare. Biochem J, 1961. 81: p. 470-80.
39. Van Dien, S.J. and M.E. Lidstrom, Stoichiometric model for evaluating the metabolic capabilities of the facultative methylotroph Methylobacterium extorquens AM1, with application to reconstruction of C(3) and C(4) metabolism. Biotechnol Bioeng, 2002. 78(3): p. 296-312.
40. Hacking, A.J. and J.R. Quayle, Purification and properties of malyl-coenzyme A lyase from Pseudomonas AM1. Biochem J, 1974. 139(2): p. 399-405.
41. Smejkalova, H., T.J. Erb, and G. Fuchs, Methanol assimilation in Methylobacterium extorquens AM1: demonstration of all enzymes and their regulation. PLoS One, 2010. 5(10).
42. Mahadevan, R., B.O. Palsson, and D.R. Lovley, In situ to in silico and back: elucidating the physiology and ecology of Geobacter spp. using genome-scale modelling. Nat Rev Microbiol, 2011. 9(1): p. 39-50.
43. Mahadevan, R. and D.R. Lovley, The degree of redundancy in metabolic genes is linked to mode of metabolism. Biophys J, 2008. 94(4): p. 1216-20.
44. Quayle, J.R. and N. Pfennig, Utilization of methanol by rhodospirillaceae. Arch Microbiol, 1975. 102(3): p. 193-8.
45. McKinlay, J.B. and C.S. Harwood, Carbon dioxide fixation as a central redox cofactor recycling mechanism in bacteria. Proc Natl Acad Sci U S A, 2010. 107(26): p. 11669-75.
46. Laguna, R., F.R. Tabita, and B.E. Alber, Acetate-dependent photoheterotrophic growth and the differential requirement for the Calvin-Benson-Bassham reductive pentose phosphate cycle in Rhodobacter sphaeroides and Rhodopseudomonas palustris. Arch Microbiol, 2010.
47. Fall, R. and A.A. Benson, Leaf methanol - The simplest natural product from plants. Trends in Plant Science, 1996. 1(9): p. 296-301.
48. Huve, K., et al., Simultaneous growth and emission measurements demonstrate an interactive control of methanol release by leaf expansion and stomata. J Exp Bot, 2007. 58(7): p. 1783-93.
49. Vallenet, D., et al., MicroScope: a platform for microbial genome annotation and comparative genomics. Database (Oxford), 2009. 2009: p. bap021.
50. Kiefer, P., N. Delmotte, and J.A. Vorholt, Nanoscale Ion-Pair Reversed-Phase HPLC-MS for Sensitive Metabolome Analysis. Anal Chem, 2010.
51. Shannon, P., et al., Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res, 2003. 13(11): p. 2498-504.
52. Herbert, D., P.J. Phipps, and R.E. Strange, Chemical analysis of microbial cells, in Methods in Microbiology, J.R. Norris and D.W. Ribbons, Editors. 1971, Academic Press: London and New York. p. 209-344.
53. Sapan, C.V., R.L. Lundblad, and N.C. Price, Colorimetric protein assay techniques. Biotechnology and Applied Biochemistry, 1999. 29: p. 99-108.
54. Kiefer, P., E. Heinzle, and C. Wittmann, Influence of glucose, fructose and sucrose as carbon sources on kinetics and stoichiometry of lysine production by Corynebacterium glutamicum. J Ind Microbiol Biotechnol, 2002. 28(6): p. 338-43.
55. Braunegg, G., B. Sonnleitner, and R.M. Lafferty, Rapid Gas-Chromatographic Method for Determination of Poly-Beta-Hydroxybutyric Acid in Microbial Biomass. European Journal of Applied Microbiology and Biotechnology, 1978. 6(1): p. 29-37.
56. Jan, S., et al., Study of Parameters Affecting Poly(3-Hydroxybutyrate) Quantification by Gas-Chromatography. Analytical Biochemistry, 1995. 225(2): p. 258-263.
57. Neidhardt, F.C., Chemical composition of Escherichia coli, in Escherichia coli and Salmonella: Cellular and Molecular Biology, F.C. Neidhardt, R. Curtiss, J.L. Ingraham, E.C.C. Lin, K.B. Low, B. Magasanik, et al., Editors. 1996, American Society for Microbiology Press: Washington, D.C. p. 3-6.
58. Konovalova, H.M., S.O. Shylin, and P.V. Rokytko, [Characteristics of carotinoids of methylotrophic bacteria of Methylobacterium genus]. Mikrobiol Z, 2007. 69(1): p. 35-41.
59. Kiefer, P., J.C. Portais, and J.A. Vorholt, Quantitative metabolome analysis using liquid chromatography-high-resolution mass spectrometry. Anal Biochem, 2008. 382(2): p. 94-100.
60. Guo, X. and M.E. Lidstrom, Physiological analysis of Methylobacterium extorquens AM1 grown in continuous and batch cultures. Arch Microbiol, 2006. 186(2): p. 139-49.
61. Guo, X. and M.E. Lidstrom, Metabolite profiling analysis of Methylobacterium extorquens AM1 by comprehensive two-dimensional gas chromatography coupled with time-of-flight mass spectrometry. Biotechnol Bioeng, 2008. 99(4): p. 929-40.
62. Vorholt, J.A., et al., The NADP-dependent methylene tetrahydromethanopterin dehydrogenase in Methylobacterium extorquens AM1. J Bacteriol, 1998. 180(20): p. 5351-6.
63. Terzer, M. and J. Stelling, Large-scale computation of elementary flux modes with bit pattern trees. Bioinformatics, 2008. 24(19): p. 2229-35.
64. Massou, S., et al., Application of 2D-TOCSY NMR to the measurement of specific (13)C-enrichments in complex mixtures of 13C-labeled metabolites. Metab Eng, 2007. 9(3): p. 252-7.
65. Kiefer, P., et al., Determination of carbon labeling distribution of intracellular metabolites from single fragment ions by ion chromatography tandem mass spectrometry. Anal Biochem, 2007. 360(2): p. 182-8.
66. Wiechert, W., et al., A universal framework for 13C metabolic flux analysis. Metab Eng, 2001. 3(3): p. 265-83.


Supporting Information

Supporting information, in particular the network model, is available on the attached CD.


Fig. S1 – Workflow of the reconstruction and reduction processes. The reconstruction workflow followed the protocol recommended for the generation of high-quality reconstructions (Thiele & Palsson, 2010).


Table S1 – List of reactions of the reconstruction (iRP911) -> CD attached

Table S2 – List of metabolites of the reconstruction (iRP911) -> CD attached

Table S3 – Detailed biomass composition -> CD attached

Table S4 – Table of substrate usage by M. extorquens AM1 from experimentally observed phenotype and Flux Balance Analysis using the genome scale network (iRP911)

Compound | Biolog oxidation | Included as exchange flux in the model (1) | Growth (+/-) and experimental growth rate | Theoretical maximum growth rate (iRP911) | Capability to support maintenance energy | Data source
Carbon sources
formaldehyde | n.d. | 1 | 0.347 | 0.36 | + | Vorholt et al. (2000)
Succinic acid | 0.33 | 1 | 0.18-0.22 | 0.195 | + | Knief et al. (2010); Lee et al. (2009)
methanol | n.d. | 1 | 0.17-0.19 | 0.213 | + | This study; Lee et al. (2009)
Pyruvic acid | 0.69 | 1 | + | 0.166 | + | Knief et al. (2010); Van Dien (2003); Salem et al. (1973)
Formic acid | 0.5 | 1 | + | 0.055 | + | Knief et al. (2010); Lee et al. (2009)
Oxalic acid | 0.23 | 1 | + | 0.054 | + | Knief et al. (2010)
Acetic acid | 0.15 | 1 | + | 0.175 | + | Knief et al. (2010); Lee et al. (2009)
L-lactate | n.d. | 1 | + | 0.209 | + | Salem et al. (1973)
1,2-propanediol | n.d. | 1 | + | 0.246 | + | Bolbot and Anthony (1980)
ethanol | n.d. | 1 | + | 0.267 | + | Lee et al. (2009)
ethylamine | n.d. | 1 | + | 0.267 | + | Okubo et al. (2010)
methylamine | n.d. | 1 | + | 0.213 | + | Lee et al. (2009); Okubo et al. (2010)
Dihydroxyacetone | 1 | 1 | n.d. | 0.228 | + | Knief et al. (2010)
L-Malic acid | 0.84 | 1 | n.d. | 0.142 | + | Knief et al. (2010)
L-Lyxose | 0.81 | 1 | n.d. | 0 | + | Knief et al. (2010)
D,L-Malic acid | 0.7 | 1 | n.d. | 0.142 | + | Knief et al. (2010)
D-Ribose | 0.59 | 1 | n.d. | 0 | + | Knief et al. (2010)
Ethanolamine | 0.53 | 1 | n.d. | 0.255 | + | Knief et al. (2010)
5-Keto-D-gluconic acid | 0.46 | 1 | n.d. | 0.178 | + | Knief et al. (2010)
Fumaric acid | 0.45 | 1 | n.d. | 0.142 | + | Knief et al. (2010)
D-Arabinose | 0.43 | 1 | n.d. | 0 | + | Knief et al. (2010)
2-Deoxy-D-ribose | 0.37 | 1 | n.d. | 0 | + | Knief et al. (2010)
D-Xylose | 0.31 | 1 | n.d. | 0 | + | Knief et al. (2010)
L-Arabinose | 0.29 | 1 | n.d. | 0 | + | Knief et al. (2010)
Acetoacetic acid | 0.22 | 1 | n.d. | 0.192 | + | Knief et al. (2010)
3-0--D-Galactopyranosyl-D-arabinose | 0.15 | 1 | n.d. | 0 | + | Knief et al. (2010)
Capric acid | 0.13 | 1 | n.d. | 0.29 | + | Knief et al. (2010)
-Keto-glutaric acid | 0.11 | 1 | n.d. | 0.156 | + | Knief et al. (2010)
Propionic acid | 0.1 | 1 | n.d. | 0.206 | + | Knief et al. (2010)
L-tartaric acid | 0 | 1 | n.d. | 0.166 | + | Knief et al. (2010)
L- | 0 | 1 | n.d. | 0.138 | + | Knief et al. (2010)
D-citric acid | 0 | 1 | n.d. | 0.147 | + | Knief et al. (2010)
glycolic acid | 0 | 1 | n.d. | 0.128 | + | Knief et al. (2010)
D-gluconic acid | 0 | 1 | n.d. | 0.197 | + | Knief et al. (2010)
putrescine | 0 | 1 | n.d. | 0 | - | Knief et al. (2010)
L- | 0 | 1 | n.d. | 0.173 | + | Knief et al. (2010)
-D-glucose | 0 | 1 | n.d. | 0.206 | + | Knief et al. (2010)
CO2 | n.d. | 1 | n.d. | 0 | - |
Methane | n.d. | 0 | - | n.d. | n.d. | Knief et al. (2010)
Oxalomalic acid | 0.32 | 0 | n.d. | n.d. | n.d. | Knief et al. (2010)
Sorbic acid | 0.28 | 0 | n.d. | n.d. | n.d. | Knief et al. (2010)
Other compounds
Bromo-succinic acid | 0.19 | 0 | n.d. | n.d. | n.d. | Knief et al. (2010)
D-Glucosamine | 0.17 | 0 | n.d. | n.d. | n.d. | Knief et al. (2010)
Methyl pyruvate | 0.14 | 0 | n.d. | n.d. | n.d. | Knief et al. (2010)
Palatinose | 0.14 | 0 | n.d. | n.d. | n.d. | Knief et al. (2010)
D-Tagatose | 0.1 | 0 | n.d. | n.d. | n.d. | Knief et al. (2010)
Pectin | 0.1 | 0 | n.d. | n.d. | n.d. | Knief et al. (2010)
D-Psicose | 0.07 | 0 | n.d. | n.d. | n.d. | Knief et al. (2010)
2,3-Butanone | 0.06 | 0 | n.d. | n.d. | n.d. | Knief et al. (2010)
Dulcitol | 0.06 | 0 | n.d. | n.d. | n.d. | Knief et al. (2010)
Glucuronamide | 0.05 | 0 | n.d. | n.d. | n.d. | Knief et al. (2010)
-Hydroxy-butyric acid | 0.05 | 0 | n.d. | n.d. | n.d. | Knief et al. (2010)
L-Ornithine | 0.04 | 0 | n.d. | n.d. | n.d. | Knief et al. (2010)
-Cyclodextrin | 0.04 | 0 | n.d. | n.d. | n.d. | Knief et al. (2010)
D-Raffinose | 0.03 | 0 | n.d. | n.d. | n.d. | Knief et al. (2010)
Glycyl-L- | 0.03 | 0 | n.d. | n.d. | n.d. | Knief et al. (2010)
-Keto-butyric acid | 0.03 | 0 | n.d. | n.d. | n.d. | Knief et al. (2010)
-Phenylethylamine | 0.03 | 0 | n.d. | n.d. | n.d. | Knief et al. (2010)
Gentiobiose | 0.02 | 0 | n.d. | n.d. | n.d. | Knief et al. (2010)
3-Hydroxy-2-butanone | 0.01 | 0 | n.d. | n.d. | n.d. | Knief et al. (2010)
D-Fucose | 0.01 | 0 | n.d. | n.d. | n.d. | Knief et al. (2010)
D-Lactic acid methyl ester | 0.01 | 0 | n.d. | n.d. | n.d. | Knief et al. (2010)
L-Pyroglutamic acid | 0.01 | 0 | n.d. | n.d. | n.d. | Knief et al. (2010)
N-Acetyl-D-glucosamine | 0.01 | 0 | n.d. | n.d. | n.d. | Knief et al. (2010)
Tyramine | 0.01 | 0 | n.d. | n.d. | n.d. | Knief et al. (2010)
-D-Allose | 0.01 | 0 | n.d. | n.d. | n.d. | Knief et al. (2010)
SUM (66 compounds) | 49 | 38 | 12 | 29 | 36 |


Fig. S2 – Electron flow through the metabolic network of Methylobacterium extorquens AM1. The schemes show the reactions involved in electron flow in Methylobacterium extorquens AM1 as they appear in the network reconstruction (iRP911). Details on the reactions (identifiers R-XXXX) can be found in Additional file 2.

Table S5 – Methylotrophic network identification -> CD attached
List of criteria and corresponding results for each reaction leading to their exclusion or inclusion into the methylotrophic network.

Table S6 – EFMs analysis and connectivity of biomass precursors biosynthesis. -> CD attached

Fig. S3 – Reaction essentiality in the methylotrophic network. The graph displays the number of reactions identified as essential, essential but with a redundant enzyme or an alternate reaction that was reduced during the reduction process, dispensable (FC < 1), blocked (not used by any elementary mode: dead-end reactions), and non-methylotrophic (set to 0 after the reduction process). Reaction essentiality analysis was performed using Minimal Cut Set (MCS) calculation (Klamt & Gilles, 2004) on the set of EFMs (Schuster et al, 1999) allowing biomass production from methanol. Reactions with a Fragility Coefficient (FC) (Klamt & Gilles, 2004), calculated from the MCSs, of 1 were identified as essential. Reactions linked via the gene-to-protein-to-reaction association to genes with an experimentally demonstrated mutant phenotype are colored red for a lethal phenotype and blue for a non-lethal phenotype; black indicates that no mutant data are available. The accuracy of the model prediction is indicated above the bar for each class of reaction.
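The essentiality criterion used in Fig. S3 can be illustrated with a minimal Python sketch. It assumes that minimal cut sets are the minimal hitting sets of the EFMs producing biomass, and that the fragility coefficient of a reaction is the reciprocal of the average size of the MCSs containing it (Klamt & Gilles, 2004). The toy EFMs and the reaction names R1 to R5 are hypothetical and unrelated to iRP911, and the brute-force enumeration is only practical for very small networks.

```python
from itertools import combinations

def minimal_cut_sets(efms):
    """Brute-force minimal hitting sets of the EFMs (toy-sized inputs only)."""
    reactions = sorted(set().union(*efms))
    found = []
    for size in range(1, len(reactions) + 1):
        for cand in map(frozenset, combinations(reactions, size)):
            if any(m <= cand for m in found):
                continue  # contains a smaller cut set, so it is not minimal
            if all(cand & efm for efm in efms):
                found.append(cand)  # hits every EFM, so it blocks the target
    return found

def fragility_coefficient(reaction, mcs):
    """Reciprocal of the mean size of the MCSs that contain the reaction."""
    sizes = [len(m) for m in mcs if reaction in m]
    return len(sizes) / sum(sizes) if sizes else 0.0

# Hypothetical EFMs (sets of active reactions), all producing biomass.
efms = [frozenset({"R1", "R2", "R3"}), frozenset({"R1", "R4", "R5"})]
mcs = minimal_cut_sets(efms)
essential = [r for r in ["R1", "R2", "R3", "R4", "R5"]
             if fragility_coefficient(r, mcs) == 1.0]
print(essential)
```

In this toy case only the reaction shared by both EFMs reaches FC = 1; reactions that can be bypassed through the other mode obtain FC < 1 and would be classified as dispensable.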

Table S7 – Table of published mutant phenotypes and associated genes and reactions. -> CD attached

Table S8 – Growth parameters of the 3 replicates during 13C-methanol labeling experiments

Microorganism: Methylobacterium extorquens; Strain: AM1 (wild type); Gas: air with 5% CO2, 12C (98.1%)

Replicate | Substrate | Substrate concentration (mM) | Medium | Gas (CO2, 12C) | Correlation factor (g(CDW).L-1.OD-1) | Start OD | 2σ | Final OD at harvest | µ (h-1) | 2σ | Substrate at harvest | qsubstrate (mmol.g-1.h-1) | 2σ | Yield biomass (g.g-1)
Biological replicate 1 | 13C meoh | 129.3 | minimal medium | 5.00% | 0.27 | 0.028 | 0.002 | 1.14 | 0.087 | 0.002 | 50.0 | 16.17 | 0.59 | 0.175
Biological replicate 2 | 13C meoh | 122.6 | minimal medium | 5.00% | 0.27 | 0.034 | 0.005 | 2.21 | 0.105 | 0.002 | 23.0 | 15.75 | 0.68 | 0.219
Biological replicate 3 | 13C meoh | 126.1 | minimal medium | 5.00% | 0.27 | 0.022 | 0.002 | 2.75 | 0.121 | 0.005 | 23.6 | 15.10 | 0.60 | 0.266

Table S9 – Fitting of the isotopomers data collected during 13C-methanol labeling experiments -> CD attached

Table S10 – Flux distributions and sensitivity analysis. -> CD attached
Only C1 assimilation was considered during flux calculation because of the large difference in magnitude between C1 dissimilation and assimilation. The measured methanol uptake rate was taken into account subsequently.


Fig. S4 – Quality of isotopomer fitting. Comparison of experimental and calculated isotopomer values for the three biological replicates. The isotopomer data include both LC-MS and 2D-NMR (HSQC and TOCSY) data. Flux calculation and fitting were performed using the software 13CFLUX (Wiechert et al, 2001). A) Experimental values (+/- standard deviation) plotted against theoretical values. B) Residuals of the calculated data.


Fig. S5 – Flux variability in the 3 biological replicates. Comparison of the flux distributions obtained for the three biological replicates. The flux calculation and the sensitivity analysis were performed using the software 13CFLUX (Wiechert et al, 2001). The fluxes were normalized to the flux of entry of the C1 units into central metabolism (SHMT: serine hydroxymethyltransferase). Flux distributions were found to be similar, except for slight changes in the C3/C4 interconversions (pyruvate kinase (PK), pyruvate phosphate dikinase (PPDK), malic enzyme (ME) and phosphoenolpyruvate carboxykinase (PEPCK)) and in the Entner-Doudoroff pathway.

Table S11 – Ftbl file describing the network used for flux calculation. -> CD attached



CHAPTER V

Co-consumption of methanol and succinate by Methylobacterium extorquens AM1

Rémi Peyraud, Patrick Kiefer, Philipp Christen, Jean-Charles Portais, Julia A. Vorholt

Manuscript in preparation for publication: Peyraud R, Kiefer P, Christen P, Portais J-C, and Vorholt JA

Contribution by RP: Design of the study, experimental work, data treatment and analysis, writing the manuscript.

Abstract

Methylobacterium extorquens AM1 is a facultative methylotrophic Alphaproteobacterium that has been studied intensively under purely methylotrophic as well as purely heterotrophic growth conditions. Here, we investigated the metabolism of this model methylotroph under mixed substrate conditions, i.e. in the presence of methanol in addition to succinate. We found that both substrates were co-metabolized, whereby two thirds of the carbon conversion rate derived from succinate and one third from methanol, relative to mol carbon. 13C-methanol labeling and liquid chromatography mass spectrometry analyses revealed a segregation of the fate of carbon from the two substrates. Methanol was primarily oxidized to CO2 for energy generation. However, a smaller part of the methanol entered biosynthesis via reactions specific to the one-carbon carrier tetrahydrofolate. Succinate, on the other hand, was primarily used to provide precursor metabolites for bulk biomass production. This work opens new perspectives on the role of methylotrophy when several substrates are available simultaneously, a situation that prevails under environmental conditions.

Introduction

Bacteria often live in environments containing limited but diverse substrates [1]. One such habitat is the phyllosphere, where facultative methylotrophic bacteria are ubiquitous and abundant [2, 3]. These methylotrophic bacteria belong to the genus Methylobacterium and are known to metabolize methanol as well as a limited number of alternative carbon substrates such as organic acids and alcohols. Plant leaf surfaces release diverse carbon sources, essentially sugars and organic acids in low amounts (µM range) [4-6], which are heterogeneously distributed and are the result of leaching through the cuticle [7]. Besides these metabolites, volatile carbon substrates such as methanol are also available in the phyllosphere as a result of plant cell wall metabolism. Methanol is released transiently, with an emission peak in the morning when the stomata open [8]. There is evidence that methanol is consumed by Methylobacterium and contributes to the epiphytic fitness of the organisms [5, 9]. However, besides methanol, additional carbon sources were suggested to be relevant for the ability to colonize plant surfaces in situ [9]. M. extorquens AM1 is a model methylotrophic organism, and a number of novel enzymes and pathways involved in methanol dissimilation and assimilation were shown to operate in this organism [10-12]. In the past, a number of specific proteins involved in methylotrophy were identified by comparing methylotrophic growth conditions (i.e. methanol as sole source of carbon and energy) with multicarbon growth conditions (i.e. succinate as sole source of carbon and energy). These investigations comprise transcriptomics [13], proteomics [14], and metabolomics analyses [15, 16]. The central metabolism of M. extorquens AM1 is complex and includes 85 biochemical

reactions and is strongly reprogrammed between the two conditions. Indeed, large metabolic modules such as the tetrahydromethanopterin-dependent oxidation pathway, the serine cycle, and the ethylmalonyl-CoA pathway are essential during growth on methanol but dispensable as operating units during growth on organic acids such as succinate or pyruvate (although individual enzymes may still be required). Conversely, pathways like the TCA cycle and pyruvate dehydrogenase, which provide energy during growth on organic acids, are not required during C1 growth [17]. Nevertheless, these multiple metabolic pathways remain strongly interconnected around the nodes formed by the C2-C3-C4 metabolites (acetyl-CoA, phosphoenolpyruvate, pyruvate, oxaloacetate and malate). To ensure the capacity to switch to new substrates, the network around these nodes is highly flexible [17]. Recently, the transition from succinate to methanol growth was analyzed in a comprehensive study using complementary systems-level approaches, which revealed the stability of core metabolites although 100 genes were found to be up-regulated upon the substrate switch [18]. This reorganization of a large part of the central metabolism of M. extorquens AM1 between growth on each substrate raises the question of how cells adapt to the mixed substrate conditions that the bacteria naturally encounter in their environment. One strategy of microorganisms to handle mixed substrate conditions is the sequential consumption of carbon sources. Sequential utilization may favor the consumption of a first substrate that allows the highest growth yield, together with repression of the expression of genes encoding proteins for the specific metabolization of a second substrate (i.e. catabolite repression). Once the first substrate is exhausted, a subsequent growth phase is initiated that allows the use of a second substrate and involves adaptation to the new growth conditions.
For instance, Escherichia coli is able to use both glucose and lactose; however, when both are present in the medium, glucose is consumed first and lactose afterwards. The molecular mechanism of this metabolic shift was proposed by Jacob and Monod 50 years ago; it is based on the regulation (induction and/or repression) of the expression of specific genes involved in lactose utilization and on their organization in an operon (lac operon) [19, 20]. Diauxic shifts were studied in microorganisms growing in batch culture in the laboratory and led to fundamental discoveries on enzyme regulation. However, there is rising evidence that co-consumption, i.e. the simultaneous consumption of several carbon sources, could be a suitable strategy for microorganisms in environments where diverse substrates are available at low concentrations [21]. In this study, we investigated M. extorquens AM1 under mixed substrate conditions, methanol plus succinate, to elucidate whether diauxie and/or co-consumption occurs.

Results


Fig. 1. Monitoring of growth parameters of an M. extorquens AM1 batch culture with 60 mM methanol plus 15 mM succinate. A. Monitoring of optical density (OD) at 600 nm, methanol consumption and succinate consumption. B. Monitoring of oxygen partial pressure (pO2), acid pumping in mL, and 13C and 12C CO2 production in the exhaust gas. Sampling of intracellular metabolites was performed at three time points, which are indicated in A: mid-co-consumption phase (Sampling time 1, S1), end of the co-consumption phase (Sampling time 2, S2), and transition phase (Sampling time 3, S3).

Growth characterization of M. extorquens AM1 on methanol plus succinate. Diauxic growth is often observed when bacteria are exposed to two substrates as a consequence of catabolite

repression. In order to assess how M. extorquens AM1 deals with succinate in addition to methanol, growth experiments were performed in batch cultures on minimal medium with equivalent C-mol amounts of methanol and succinate, i.e. 60 mM and 15 mM, respectively. Measurement of methanol and succinate revealed that both substrates were metabolized simultaneously during the exponential growth phase (Fig. 1 A). Consumption rates of both substrates under mixed substrate conditions were lower than their respective rates under pure substrate conditions: succinate utilisation dropped by 34%, whereas methanol utilisation dropped by 70% (Table 1). Notably, the two conversion rates taken together, with a carbon consumption rate of 17.4 mmol·g-1·h-1, corresponded to a C-mol consumption similar to that under either pure condition (Table 1). The relatively higher contribution of succinate to

Table 1. Growth parameters of M. extorquens AM1 cells growing in batch-culture in minimal medium with 120 mM methanol, or 15 mM succinate, or 60 mM methanol plus 15 mM succinate.

Growth parameters | succinate (15 mM) | 2* | succinate (15 mM) + methanol (60 mM) | 2* | methanol (120 mM) | 2*
Growth rate (h-1) | 0.20 | 0.01 | 0.18 | 0.01 | 0.17 | 0.01
succinate uptake rate (C-mmol.g-1.h-1) | 18.9 | 1.8 | 12.5 | 1.9 | |
methanol uptake rate (C-mmol.g-1.h-1) | | | 4.9 | 0.2 | 16.2 | 0.2
CO2 production rate (mmol.g-1.h-1) | 8.3 | 1.3 | 7.2** | | 7.7 | 0.3
Biomass Yield (g.g-1) | 0.36 | 0.02 | 0.35 | 0.01 | 0.32 |

Labelling | Succinate (12C) + methanol (13C) | 2*
12C CO2 production rate (mmol.g-1.h-1) | 3.0 |
13C CO2 production rate (mmol.g-1.h-1) | 4.2 |
13C/12C ratio | 1.4 | 0.2

* Standard deviations of 3 biological replicates. ** 2 biological replicates.

growth under mixed substrate conditions (about 72% of the C-mol consumed) shows that succinate was the predominant substrate when both substrates were simultaneously available. The growth rate under the latter conditions (0.18 ± 0.01 h-1) was similar to that on succinate (0.20 ± 0.01 h-1) and on methanol (0.17 ± 0.01 h-1). The yield obtained was 0.35, which is more characteristic of succinate growth (0.36 g(CDW)·g(substrate)-1, compared to 0.32 under methylotrophic growth conditions). Growth ceased once succinate was entirely consumed and resumed after a transition phase of around 1.5 h. In order to follow the metabolic fate of methanol and succinate, we performed the experiment with 13C-labeled methanol (> 99%) and succinate at natural abundance of 13C (1.1 %). Determination of


Fig. 2. Average 13C labeling of intracellular metabolites measured by LC-MS during growth of M. extorquens AM1 upon co-consumption of 13C (> 99%) methanol and natural abundance (1.1% 13C) succinate. Metabolite quenching, extraction and measurements were performed specifically for each class of metabolite, i.e. amino acids, polar compounds, and coenzyme A thioesters, as described in material and methods. Average 13C labeling (black): sample collected during the mid-co-consumption phase (Sampling time 1, see Fig. 1); (gray): sample collected at the end of the co-consumption phase (Sampling time 2, see Fig. 1).

13CO2/12CO2 production in the exhaust gas of the bioreactor revealed that almost all of the methanol was converted to CO2 (Fig. 1B and Table 1), indicating that methanol was used in a catabolic process. CO2 was also produced from succinate, but in a lower amount than from methanol: 24 % of the succinate was dissimilated. The ratio of 13C to 12C CO2 production was found to be 1.4 ± 0.2 and was stable over time, from 5 h of cultivation until all succinate was depleted. The 12C CO2 production then ceased, but the dissimilation of methanol continued and, in consequence, 13C CO2 production became exclusive.
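The carbon balance behind these statements can be retraced from the rates in Table 1. The short Python sketch below is only an illustrative recalculation; it assumes that all 12C CO2 derives from succinate and all 13C CO2 from methanol (natural abundance and CO2 from the gas supply are neglected), and the variable names are chosen here for illustration.

```python
# Rates taken from Table 1 (C-mmol.g-1.h-1 for uptake, mmol.g-1.h-1 for CO2).
q_succ_pure, q_succ_mixed = 18.9, 12.5   # succinate uptake
q_meoh_pure, q_meoh_mixed = 16.2, 4.9    # methanol uptake
co2_12c, co2_13c = 3.0, 4.2              # CO2 production, mixed condition

total_c = q_succ_mixed + q_meoh_mixed            # overall carbon uptake
succ_share = q_succ_mixed / total_c              # succinate share of carbon
succ_drop = 1 - q_succ_mixed / q_succ_pure       # drop vs. pure succinate
meoh_drop = 1 - q_meoh_mixed / q_meoh_pure       # drop vs. pure methanol

# Assumption: 12CO2 comes from succinate only, 13CO2 from methanol only.
ratio_13c_12c = co2_13c / co2_12c
frac_13c_in_co2 = co2_13c / (co2_13c + co2_12c)
succ_dissimilated = co2_12c / q_succ_mixed
meoh_oxidized = co2_13c / q_meoh_mixed

print(f"total carbon uptake: {total_c:.1f} C-mmol.g-1.h-1")
print(f"succinate share: {succ_share:.0%}, drops: {succ_drop:.0%} / {meoh_drop:.0%}")
print(f"13C/12C CO2 ratio: {ratio_13c_12c:.1f}, 13C in CO2: {frac_13c_in_co2:.0%}")
print(f"succinate dissimilated: {succ_dissimilated:.0%}, methanol oxidized: {meoh_oxidized:.0%}")
```

Under these assumptions the recalculation is consistent with the reported 17.4 mmol·g-1·h-1 total, the 72% succinate share, the 34% and 70% drops, the 13C/12C ratio of 1.4 and the 24% of succinate dissimilated.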

Incorporation of 13C methanol into amino acids and selected metabolites during mixed substrate conditions revealed by LC-MS. As outlined above, the majority of methanol was found to be catabolized to CO2. In order to address the question whether methanol carbon was assimilated into


Fig. 3. Selected mass isotopomer distributions of central metabolites, measured by LC-MS, displayed on the central metabolic network of M. extorquens AM1 during co-consumption of 13C (> 99%) methanol and natural abundance (1.1% 13C) succinate. Precursors in central metabolism of the measured amino acids, or metabolites measured directly, are indicated in boxes. Box colors correspond to substrate-specific carbon incorporation: orange, from succinate; blue, from methanol; green, from both. Mass isotopomer data correspond to the sample collected during the mid-co-consumption phase (Sampling time 1, see Fig. 1).

biomass at all and if so, into which metabolites, we performed intracellular metabolite analysis by LC-MS. Sampling for metabolites was performed at two different time points during culture growth with 13C methanol and naturally labeled succinate. The first sampling of the intracellular

metabolites was performed in the middle of the first exponential growth phase and the second just at the complete depletion of succinate in the medium; a third sampling was performed later, 90 min after succinate depletion, to monitor the extent of methanol assimilation. The average 13C labeling (AL13C) at the first two time points was found to be identical, indicating that the metabolism was stable until the complete depletion of succinate (Fig. 2). The AL13C values determined ranged from 1% to 26%. Note that an AL13C of 1.1% means that all carbon atoms originate from succinate and an AL13C of 99% means that all carbon atoms originate from methanol. It can be concluded that succinate is the dominating carbon source for biosynthesis during co-consumption. However, remarkable differences in the 13C fraction of metabolites were observed. Whereas the AL13C of most amino acids was about 2.0 %, phenylalanine (6 %), tyrosine (6 %), and in particular methionine (19 %) showed increased AL13C values (Fig. 2). Analysis of the mass isotopomer distribution of methionine, the most conspicuous case, revealed that 90% of the methionine contained one labeled carbon (Fig. 3 and supplementary Fig. S1). Methionine biosynthesis is known to derive from aspartate plus one C1 unit from 5-methyltetrahydrofolate. Since no label incorporation was found in aspartate, 5-methyltetrahydrofolate must be the origin of the 13C incorporation. As mentioned, phenylalanine, tyrosine and hexose-phosphate also harbored a small but significant incorporation of carbon from methanol, which resulted in mass shifts of plus 1 and plus 2. The biosynthesis of these three compounds derives from metabolites of gluconeogenesis. Four metabolites can supply gluconeogenesis from succinate and methanol: oxaloacetate, pyruvate, glycine, and 5,10-methylenetetrahydrofolate. Because no significant labeling was found in alanine (derivative of pyruvate), aspartate (derivative of oxaloacetate) and glycine, the only pathway that can lead to such label incorporation is the condensation of 5,10-methylenetetrahydrofolate with glycine into serine via the first step of the serine cycle. Indeed, once a singly labeled C3 unit (phosphoglycerate) is generated, it can be incorporated into phosphoenolpyruvate, the precursor of phenylalanine and tyrosine, and/or via gluconeogenesis into hexose-phosphate.
Analysis of their mass isotopomer fractions indicates that 20% of the 2-phosphoglycerate is generated via the first step of the serine cycle, i.e. condensation of glycine with 5,10-methylenetetrahydrofolate (Fig. 4). This observation indicates that serine is produced at least partially (> 20%) from condensation of glycine with C1 compounds. Besides methionine, two more metabolites whose synthesis involves incorporation of C1 precursors showed a significant increase in AL13C values: pantothenate (11.3 %) and AMP (26.7 %). Biosynthesis of pantothenic acid involves incorporation of one 5,10-methylenetetrahydrofolate. In adenine biosynthesis, two formyl-THF and one CO2 are incorporated into the purine part of the molecule.
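The 20% estimate for the labeled C3 precursor can be checked against the hexose-phosphate mass isotopomer fractions reported in Fig. 4 (0.63, 0.32 and 0.05 for M0, M1 and M2). The following minimal Python sketch treats hexose-phosphate as the random combination of two C3 units, each carrying one 13C atom with probability p; natural 13C abundance is neglected, which is a simplifying assumption made here, and the function name is illustrative.

```python
from math import comb

def recombined_mid(p_labeled, n_units=2):
    """Fractions M0..Mn when n units each carry one 13C with probability p."""
    return [comb(n_units, k) * p_labeled**k * (1 - p_labeled)**(n_units - k)
            for k in range(n_units + 1)]

predicted = recombined_mid(0.20)   # 20% of the C3 units singly labeled
measured = [0.63, 0.32, 0.05]      # M0, M1, M2 reported in Fig. 4

for name, pred, meas in zip(["M0", "M1", "M2"], predicted, measured):
    print(f"{name}: predicted {pred:.2f}, measured {meas:.2f}")
```

With p = 0.20 the binomial prediction (0.64, 0.32, 0.04) lies within the reported measurement uncertainty of about 0.02 for each fraction.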

To see whether the observed AL13C can be explained by incorporation of C1 compounds originating from


Fig. 4. Prediction of the mass isotopomer distribution of hexose-phosphate depending on the M1 mass isotopomer fraction in the C3 precursor of gluconeogenesis. Black: predicted values; red: measured values. The measured mass isotopomer fractions of hexose-phosphate were 0.63 ± 0.017 for M0, 0.32 ± 0.019 for M1, 0.05 ± 0.020 for M2, and 0.002 ± 0.020 for M3. These values correspond to the probabilistic recombination of two C3 units from gluconeogenesis, of which 20% have incorporated one 13C carbon and 80% carry no label. This result indicates that a significant flux through the first steps of the serine cycle is operating and generates 20% of the 2-phosphoglycerate, even though the main flux (80%) comes from the opposite direction, i.e. gluconeogenesis.

methanol, AL13C values were calculated assuming that i. tetrahydrofolate-activated C1 precursors were mainly made from methanol (90% 13C, in accordance with the labeling found in methionine), ii. CO2 was 58% 13C, and iii. all other carbon atoms originated from naturally labeled succinate (1.1 % 13C). Calculated AL13C values are 18.8 % (methionine), 11.0 % (pantothenic acid) and 24.6 % (AMP). The calculated AL13C values are very similar to the measured ones, indicating that significant incorporation of 13C is limited to metabolites that require C1 precursors for their formation. In order to validate the approach and demonstrate enhanced assimilation of methanol into intracellular metabolites after succinate depletion, additional samples were taken 90 minutes after succinate was consumed. Indeed, 13C incorporation increased in all amino acids (supplementary Fig. S2), indicating that cells started to use methanol as a carbon source in all biosynthetic processes.
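The calculated AL13C values can be retraced as a weighted average over the carbon atoms of each metabolite. The Python sketch below uses the three labeling assumptions stated in the text; the carbon counts (5 for methionine, 9 for pantothenic acid, 10 for AMP) are taken from the molecular formulas, and the function and constant names are illustrative rather than part of the original analysis.

```python
NATURAL = 1.1    # % 13C of carbons derived from succinate
C1_UNIT = 90.0   # % 13C assumed for tetrahydrofolate-bound C1 units
CO2 = 58.0       # % 13C assumed for CO2

def average_labeling(n_carbons, n_c1=0, n_co2=0):
    """Average 13C labeling (%) from the origin of each carbon atom."""
    n_other = n_carbons - n_c1 - n_co2
    return (n_c1 * C1_UNIT + n_co2 * CO2 + n_other * NATURAL) / n_carbons

predicted = {
    "methionine": average_labeling(5, n_c1=1),         # aspartate + one C1 unit
    "pantothenic acid": average_labeling(9, n_c1=1),   # one methylene-THF
    "AMP": average_labeling(10, n_c1=2, n_co2=1),      # two formyl-THF + one CO2
}
for name, value in predicted.items():
    print(f"{name}: {value:.1f} %")
```

These values agree with the calculated 18.8 %, 11.0 % and 24.6 % to within rounding, and can be compared with the measured 19 %, 11.3 % and 26.7 %.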

Incorporation of 13C methanol into CoA esters and their precursors during mixed substrate conditions. The ethylmalonyl-CoA pathway is crucial in providing anaplerotic support to the serine cycle by glyoxylate regeneration during growth on methanol [11, 22]. In order to monitor the operation of the ethylmalonyl-CoA pathway during growth on methanol plus succinate, we performed CoA ester extractions at the time points indicated above. To remove matrix effects of the medium, an online desalting protocol was developed, allowing LC-MS analysis of samples with relatively high salt concentrations.

Four out of 12 CoA esters involved in the ethylmalonyl-CoA pathway, as well as free coenzyme A, could be detected. Remarkably, the key intermediates crotonyl-CoA and ethylmalonyl-CoA were not detectable. Free coenzyme A as well as all the CoA ester derivatives that were quantifiable exhibited substantial label incorporation (Fig. 2). The highest AL13C was found for free coenzyme A (22%), followed by acetyl-CoA (20%). Finally, all CoA esters of C2 and C4 organic acids had AL13C values of about 18 %. In addition, all CoA esters showed very similar mass isotopomer distributions (Fig. 3). The decrease of the average labeling with increasing number of carbon atoms in the organic acid residue indicates that the labeled carbon atoms are located in the coenzyme A part rather than in the esterified acids. Since C4 β-hydroxybutyryl-CoA and C4 methylmalonyl-CoA showed very similar AL13C and since no significant change in the MID of methylmalonyl-CoA due to incorporation of 13C-labeled CO2 via the ethylmalonyl-CoA pathway

(note that CO2 in the reactor is labeled to 58%) was observed, and since intermediates were missing in the cell extract, it is very likely that the ethylmalonyl-CoA pathway was not operating during co-metabolism of succinate and methanol. To elucidate the origin of the 13C labeling in the coenzyme A moiety, we calculated, as before, the theoretical AL13C value expected from C1 precursors according to the pathway of coenzyme A biosynthesis (supplementary Fig. S3). The carbon atoms of coenzyme A derive from one molecule of AMP, one molecule of pantothenic acid, and carbons 2 and 3 of cysteine, which derive from carbons 2 and 3 of serine. Calculated AL13C values considering the measured

AL13C of AMP and pantothenic acid should be 18.4 % if 20% of the serine has incorporated a 13C carbon at position 3, whereas it should be 21.9% if 90% of the serine has incorporated one 13C carbon (labeling state of the C1 precursor). The measured AL13C of coenzyme A is 22.4 %, indicating that a high proportion of serine (likely all serine) has incorporated a 13C carbon from C1 units. These results indicate that serine is mainly produced from glycine and methylene-tetrahydrofolate during co-consumption of methanol plus succinate.
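The two scenarios for coenzyme A can be approximated in the same way. The Python sketch below assumes 21 carbon atoms in coenzyme A (10 from AMP, 9 from pantothenic acid, 2 from the serine-derived moiety), uses the measured AL13C of AMP and pantothenic acid, and treats the carbon derived from serine C2 as unlabeled (1.1 %); the exact treatment of natural abundance in the original calculation is not stated, so small deviations are to be expected.

```python
def coa_average_labeling(al_amp, al_pantothenate, serine_c3_enrichment,
                         natural=1.1):
    """Average 13C labeling (%) of coenzyme A from its three building blocks."""
    carbons_amp, carbons_pan = 10, 9
    c2 = natural               # derived from serine C2 (glycine), unlabeled
    c3 = serine_c3_enrichment  # derived from serine C3 (C1 unit), % 13C
    total = carbons_amp * al_amp + carbons_pan * al_pantothenate + c2 + c3
    return total / (carbons_amp + carbons_pan + 2)

low = coa_average_labeling(26.7, 11.3, serine_c3_enrichment=20.0)
high = coa_average_labeling(26.7, 11.3, serine_c3_enrichment=90.0)
print(f"20% scenario: {low:.1f} %, 90% scenario: {high:.1f} %")
```

Under these assumptions the 90% scenario reproduces the reported 21.9 %, and the 20% scenario comes out within about 0.2 percentage points of the reported 18.4 %; only the former is close to the measured 22.4 %.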

Discussion

In this study we elucidated the metabolism of M. extorquens AM1 cultures during growth on the mixed substrates methanol and succinate. We found simultaneous utilization of the substrates during the exponential growth phase, whereby succinate conversion contributed about 72% of the overall carbon conversion; after total depletion of succinate, a transition phase to methylotrophic growth conditions followed. Interestingly, the 13C-labeling experiment showed that during co-consumption of methanol and succinate the two substrates are partitioned to supply specific metabolic pathways and functions. Methanol was mainly used to fulfill the energy requirement. Only a small but significant amount of methanol carbon (in the order of 7%) ended up in biomass. This carbon could be specifically attributed to biosynthetic pathways linked to one-carbon metabolism, such as purine

150 biosynthesis from tetrahydrofolate derivatives. On the other hand, under mixed substrate conditions succinate was used in particular to fulfill the carbon requirement of the cell, and to energy generation. As succinate was the principle carbon source that was converted (relative to mol carbon) it can be regarded as the main substrate and methanol as an additive substrate used to supply energy and the C1-unit requirement. Apparently, methanol insertion into the serine cycle is reduced as is CO2 assimilation via phosphoenolpyruvate carboxylase and the ethylmalonyl-CoA pathway which appear to be blocked. At the enzyme activity level Dunstan and co-worker [23] observed upon co-consumption of methanol and succinate that the methanol dehydrogenase activity was detectable and not catabolically repressed. These results indicate in accordance with ours results, that cells are able to consume methanol during mixed substrate condition. The observed anabolic repression of the enzymes activities of the first step of serine cycle [23] is also in-line with the results presented in this study and may represent the separation point underlying the metabolic segregation under mixed substrate conditions. Indeed, the serine-glyoxylate aminotransferase activity dropped by a factor 10 under mixed substrate condition compared with methanol growth, but was still significant, higher than under pure succinate growth.

Table 2. Production yield, in mol/C-mol(substrate), of specific metabolites from succinate or methanol, calculated from biochemical knowledge of the M. extorquens AM1 metabolic network.

                        Yield (mol/C-mol substrate)
Compound                methanol    succinate    ratio
5,10-methylene-THF      1           0.65         1.54
glycine                 0.49        0.42         1.17
oxaloacetate            0.33        0.29         1.14
pyruvate                0.35        0.30         1.17
Energetic
ATP                     5           2.96         1.69
NADH                    2           1.45         1.38
NADPH                   2           1.50         1.33

ATP production was calculated considering an optimal respiratory chain (P/O = 2).
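The ratio column of Table 2 is the methanol yield divided by the succinate yield. The following minimal sketch recomputes it from the tabulated values; the dictionary layout and the names `YIELDS` and `yield_ratio` are illustrative choices, not part of the original analysis.

```python
# Yields in mol product per C-mol substrate, copied from Table 2
# as (methanol, succinate) pairs.
YIELDS = {
    "5,10-methylene-THF": (1.0, 0.65),
    "glycine": (0.49, 0.42),
    "oxaloacetate": (0.33, 0.29),
    "pyruvate": (0.35, 0.30),
    "ATP": (5.0, 2.96),
    "NADH": (2.0, 1.45),
    "NADPH": (2.0, 1.50),
}


def yield_ratio(methanol: float, succinate: float) -> float:
    """Return the methanol/succinate yield ratio (hypothetical helper)."""
    if succinate <= 0:
        raise ValueError("succinate yield must be positive")
    return methanol / succinate


# Ratios rounded to two decimals, as in the table.
ratios = {name: round(yield_ratio(m, s), 2) for name, (m, s) in YIELDS.items()}

for name, ratio in ratios.items():
    print(f"{name:20s} {ratio:.2f}")
```

A ratio above 1 means that methanol is the more efficient source per C-mol; the largest ratios are obtained for the C1 precursor and for ATP.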

The results obtained in this study indicate that the relative contribution of succinate and methanol to the overall carbon utilization is maintained until succinate is entirely consumed, suggesting that, at least at the flux level, no preadaptation to the assimilation of methanol as sole carbon source, controlled by the succinate concentration, occurs. Classical diauxic growth associated with catabolite repression was intensively studied in the past [20], and it was postulated to operate in order to select the substrate supporting the higher growth yield. Only few studies have reported on co-consumption [24], and few have addressed the underlying carbon distribution [25, 26]. Wendisch and coworkers [25] elucidated metabolic fluxes during co-consumption in Corynebacterium glutamicum and demonstrated that specific metabolic pathways were regulated differently compared with growth when only one substrate was present. Carvalho and coworkers [26] found in Mycobacterium tuberculosis that substrates are partitioned into specific parts of metabolism during co-consumption. We tried to identify a potential rationale for the cell to segregate substrates to supply specific metabolic functions, and calculated from biochemical knowledge of M. extorquens AM1 the yield for the production of key metabolites, in mol(molecules)/C-mol(substrate), from methanol or succinate (Table 2). In accordance with the observed segregation, these calculations showed that methanol is more efficient than succinate for generating C1 precursors, whereas for all the other metabolic precursors the efficiencies are close. With regard to energetics, methanol appears more favorable for producing reducing equivalents for aerobic respiration than the equivalent C-mol of succinate. Therefore, the physiological state of the cell, with substrate segregation by metabolic function, appears to correspond to a rational utilization of the available resources. This study indicates that M. extorquens AM1 is able to perform co-consumption of succinate and methanol, a strategy that could be useful and efficient for the cell with regard to its environmental conditions on the leaf surface, i.e. low, transient, diverse, and heterogeneously located substrate availability.

Materials and Methods

Chemicals. [13C]methanol (99%) was purchased from Cambridge Isotope Laboratories; all other chemicals were purchased from Sigma (St. Louis, MO, USA). Acetonitrile, formic acid, and ammonium used for HPLC solvents were of LC-MS grade.

Medium composition, culture conditions, and growth parameters measurement. The minimal medium used to grow M. extorquens AM1 contained 1.62 g/L NH4Cl, 0.2 g/L MgSO4, 2.21 g/L K2HPO4, 1.25 g/L NaH2PO4·2H2O, and the following trace elements: 15 mg/L Na2EDTA·2H2O, 4.5 mg/L ZnSO4·7H2O, 3 mg/L CoCl2·6H2O, 1 mg/L MnCl2·4H2O, 1 mg/L H3BO3, 2.5 mg/L CaCl2, 0.4 mg/L Na2MoO4·2H2O, 3 mg/L FeSO4·7H2O, and 0.3 mg/L CuSO4·5H2O. Batch cultures were carried out in a 500-mL bioreactor (Infors-HT) at 28 °C and 1000 rpm, aerated with compressed air at 0.1 L/min. The pH was kept constant at 7.0 by addition of 1 M NH4OH or HCl. Cells were grown in 400 mL of medium containing a mixture of 60 mM methanol plus 15 mM succinic acid (same C-mole amount of each carbon source). The partial pressure of dissolved oxygen was monitored using polarographic oxygen sensors (InPro 6800, Mettler-Toledo). Methanol concentration was determined by GC-flame ionization detection (GC-FID) (GC 6850, Agilent Technologies; column: DB-Wax, J&W Scientific). Succinate concentration was determined by HPLC-UV-DAD (column: Phenomenex Rezex ROA-organic acid H+, 7.8 mm) using tartaric acid as internal standard. The 13C enrichment of CO2 in the exhaust gas was measured using two infrared sensors (BCP-CO2, BlueSens), one sensitive to 12CO2 and the other sensitive to 13CO2. Calibration of each sensor and the specific correction of the 12C and 13C signals were performed as recommended by the company. Cell dry weight (CDW) was determined for each substrate growth condition (methanol, methanol plus succinate, succinate); the results of 7 cultures were not statistically different, and an overall CDW value was found to be 0.269 ± 0.013 (2).

Sampling, quenching, and extraction of intracellular metabolites. Sampling and quenching of CoA esters were performed as follows: a volume of 1 mL of culture was directly injected into 4.5 mL of cold (-20 °C) acetonitrile acidified with 0.1 M formic acid on a Vortex [11]. Extraction was performed by incubating the sample for 15 min on ice; samples were subsequently freeze-dried and stored at -20 °C until analysis. Prior to analysis, dried samples were dissolved in 300 µL of 25 mM ammonium formate buffer (pH 3.5, 2% MeOH). The suspension was centrifuged (14,000 g, 2 min, -5 °C), and the supernatant was filtered through a Sartorius Minisart filter (pore size 0.2 µm) before analysis. Sampling of amino acids and central metabolites was carried out as described previously [16]. In brief, a volume of 1 mL of culture was sampled by fast filtration and washed with 5 mL of medium with a 90% reduced salt concentration. The filters (RC Sartorius Minisart, pore size 0.2 µm) were directly transferred into vessels containing 8 mL of boiling water for quenching and extraction. The extracts were cooled on ice and filtered through an RC Sartorius Minisart filter (pore size 0.2 µm) prior to chilling with liquid nitrogen. All samples were lyophilized immediately and stored at -20 °C. The dried sample was dissolved in 100 µL of double-distilled water and diluted 30/70 (v/v) with acetonitrile prior to analysis.

LC-MS analysis. LC-MS analyses were performed using a Rheos 2200 HPLC system (Flux Instruments) coupled to an LTQ Orbitrap mass spectrometer (Thermo Fisher Scientific) equipped with an electrospray ionization probe. CoA-thioesters were separated using the procedure described earlier [11], with slight modifications. To perform online desalting of samples prior to LC separation, two C18 analytical columns (Gemini 50 x 2.0 mm and 100 x 2.0 mm, particle size 3 µm; Phenomenex, Torrance, CA, USA) were used. The flow rate was 220 µL/min. Solvent A was 50 mM formic acid adjusted to pH 8.1 with NH4OH and solvent B was methanol. The injection volume was 10 µL. For online desalting, samples were loaded on the 50 x 2.0 mm C18 column and washed on column for 5 min with 100% solvent A. During desalting, the short column was connected to waste via a 6-port valve and the 100 x 2.0 mm column was equilibrated with solvent A by an additional pump. After desalting, both columns were connected in series and the following gradient of B was applied to separate CoA esters: 5 min, 5%; 15 min, 23%; 25 min, 80%; 27 min, 80%. The LC-MS system was equilibrated for 6 min at initial elution conditions between two successive analyses. The LC was coupled to the mass spectrometer. Sheath gas flow rate was 40, auxiliary gas flow rate was 30, tube lens was 80 V, capillary voltage was 35 V, and ion spray voltage was 4.3 kV. MS analysis was done in the positive FTMS mode at a resolution of 60,000 (m/z 400). Polar intracellular metabolites were separated on a pHILIC column (150 × 2.0 mm, particle size 5 µm; Sequant, Umea, Sweden), following a procedure described previously [16]. Separation of phosphorylated hexoses was not achieved; thus the data given are an average over them.
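The gradient for CoA ester separation is specified as time/percentage breakpoints. As a minimal sketch, the composition at an arbitrary time can be obtained by interpolation between those breakpoints. Linear ramps between breakpoints and 0% B during the 5 min desalting step (100% solvent A) are assumptions of this sketch, and the names `GRADIENT` and `percent_b` are hypothetical.

```python
# Gradient breakpoints (time in min, % solvent B) as listed in the text.
GRADIENT = [(5.0, 5.0), (15.0, 23.0), (25.0, 80.0), (27.0, 80.0)]


def percent_b(t_min: float) -> float:
    """Return % solvent B at time t_min.

    Assumptions: linear ramps between breakpoints, 0 % B during the
    desalting step before the first breakpoint (100 % solvent A),
    and a constant final composition after the last breakpoint.
    """
    if t_min < GRADIENT[0][0]:
        return 0.0
    if t_min >= GRADIENT[-1][0]:
        return GRADIENT[-1][1]
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise ValueError("time outside gradient definition")


if __name__ == "__main__":
    for t in (2, 5, 10, 20, 26):
        print(f"{t:>4} min: {percent_b(t):.1f} % B")
```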

Data analysis. The incorporation of 13C label into metabolites during the 13C-labeling experiment was calculated from the analysis of the mass isotopomer distribution (MID) in the mass spectra. The resolution used, 60,000 (m/z 400), allowed separation of carbon, nitrogen and oxygen mass isotopomers; therefore only carbon MIDs were considered, and correction for naturally occurring isotopes of the other elements was not required. The standard deviation (STD) of the measurements was considered to be at least 2%, which is higher than that found over the 3 technical replicates (average STD in amino acids: 0.53%), to account for the systematic error resulting from the limited linearity of the LTQ-Orbitrap, evaluated to be lower than 2% in the intensity range considered. The average 13C carbon labeling, AL13C, was calculated as follows:

\[ \mathrm{AL}^{13}\mathrm{C} = \frac{1}{n} \sum_{i=0}^{n} i \cdot H_i \]

where n is the number of carbon atoms and H_i is the relative abundance of the mass fraction containing i 13C atoms.
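As an illustration of the average-labeling calculation, the sketch below computes AL13C from a carbon mass isotopomer distribution. It assumes that AL13C is the abundance-weighted mean number of 13C atoms divided by the number of carbon atoms n, and that the distribution may need to be renormalized to sum to 1; the function name `average_labeling` and the example values are hypothetical.

```python
from typing import Sequence


def average_labeling(mid: Sequence[float]) -> float:
    """Average 13C labeling of a metabolite from its carbon MID.

    mid[i] is the relative abundance of the mass fraction carrying
    i 13C atoms, so a metabolite with n carbons has n + 1 entries.
    The MID is renormalized to sum to 1 (an assumption of this sketch).
    """
    n = len(mid) - 1
    if n < 1:
        raise ValueError("MID must cover at least one carbon atom")
    total = sum(mid)
    if total <= 0:
        raise ValueError("MID abundances must sum to a positive value")
    weighted = sum(i * h for i, h in enumerate(mid))
    return weighted / (n * total)


if __name__ == "__main__":
    # Hypothetical three-carbon metabolite: M0, M1, M2, M3 abundances.
    example_mid = [0.70, 0.20, 0.10, 0.00]
    print(f"AL13C = {100 * average_labeling(example_mid):.1f} %")
```

A fully unlabeled metabolite gives 0 and a fully labeled one gives 1, which provides a quick sanity check on measured distributions.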

Acknowledgments

This work was supported by ETH Zurich, Research Grant ETH-25 08–2.

References

1. Egli, T. and C.A. Mason, Mixed substrates and mixed cultures. Biotechnology, 1991. 18: p. 173-201.
2. Corpe, W.A. and S. Rheem, Ecology of the methylotrophic bacteria on living leaf surfaces. FEMS Microbiology Letters, 1989. 62(4): p. 243-249.
3. Delmotte, N., et al., Community proteogenomics reveals insights into the physiology of phyllosphere bacteria. Proc Natl Acad Sci U S A, 2009. 106(38): p. 16428-33.
4. Lindow, S.E. and M.T. Brandl, Microbiology of the phyllosphere. Appl Environ Microbiol, 2003. 69(4): p. 1875-83.
5. Abanda-Nkpwatt, D., et al., Molecular interaction between Methylobacterium extorquens and seedlings: growth promotion, methanol consumption, and localization of the methanol emission site. J Exp Bot, 2006. 57(15): p. 4025-32.
6. Miller, W.G., et al., Biological sensor for sucrose availability: relative sensitivities of various reporter genes. Appl Environ Microbiol, 2001. 67(3): p. 1308-17.
7. Leveau, J.H. and S.E. Lindow, Appetite of an epiphyte: quantitative monitoring of bacterial sugar consumption in the phyllosphere. Proc Natl Acad Sci U S A, 2001. 98(6): p. 3446-53.
8. Huve, K., et al., Simultaneous growth and emission measurements demonstrate an interactive control of methanol release by leaf expansion and stomata. J Exp Bot, 2007. 58(7): p. 1783-93.
9. Sy, A., et al., Methylotrophic metabolism is advantageous for Methylobacterium extorquens during colonization of Medicago truncatula under competitive conditions. Appl Environ Microbiol, 2005. 71(11): p. 7245-52.
10. Chistoserdova, L., et al., C1 transfer enzymes and coenzymes linking methylotrophic bacteria and methanogenic Archaea. Science, 1998. 281(5373): p. 99-102.
11. Peyraud, R., et al., Demonstration of the ethylmalonyl-CoA pathway by using 13C metabolomics. Proc Natl Acad Sci U S A, 2009. 106(12): p. 4846-51.
12. Large, P.J. and J.R. Quayle, Microbial growth on C(1) compounds. 5. Enzyme activities in extracts of Pseudomonas AM1. Biochem J, 1963. 87(2): p. 386-96.
13. Okubo, Y., et al., Implementation of microarrays for Methylobacterium extorquens AM1. Omics, 2007. 11(4): p. 325-40.
14. Bosch, G., et al., Comprehensive proteomics of Methylobacterium extorquens AM1 metabolism under single carbon and nonmethylotrophic conditions. Proteomics, 2008. 8(17): p. 3494-505.
15. Guo, X. and M.E. Lidstrom, Metabolite profiling analysis of Methylobacterium extorquens AM1 by comprehensive two-dimensional gas chromatography coupled with time-of-flight mass spectrometry. Biotechnol Bioeng, 2008. 99(4): p. 929-40.
16. Kiefer, P., J.C. Portais, and J.A. Vorholt, Quantitative metabolome analysis using liquid chromatography-high-resolution mass spectrometry. Anal Biochem, 2008. 382(2): p. 94-100.
17. Van Dien, S.J., et al., Reconstruction of C(3) and C(4) metabolism in Methylobacterium extorquens AM1 using transposon mutagenesis. Microbiology, 2003. 149(Pt 3): p. 601-9.
18. Skovran, E., et al., A systems biology approach uncovers cellular strategies used by Methylobacterium extorquens AM1 during the switch from multi- to single-carbon growth. PLoS One, 2010. 5(11): p. e14091.
19. Jacob, F. and J. Monod, Genetic regulatory mechanisms in the synthesis of proteins. J Mol Biol, 1961. 3: p. 318-56.
20. Beckwith, J.R., Regulation of the lac operon. Recent studies on the regulation of lactose metabolism in Escherichia coli support the operon model. Science, 1967. 156(3775): p. 597-604.
21. Egli, T., The ecological and physiological significance of the growth of heterotrophic microorganisms with mixtures of substrates. Advances in Microbial Ecology, Vol 14, 1995. 14: p. 305-386.
22. Erb, T.J., et al., Synthesis of C5-dicarboxylic acids from C2-units involving crotonyl-CoA carboxylase/reductase: the ethylmalonyl-CoA pathway. Proc Natl Acad Sci U S A, 2007. 104(25): p. 10631-6.
23. Dunstan, P.M., C. Anthony, and W.T. Drabble, Microbial metabolism of C1 and C2 compounds. The role of glyoxylate, glycollate and acetate in the growth of Pseudomonas AM1 on ethanol and on C1 compounds. Biochem J, 1972. 128(1): p. 107-15.
24. Ihssen, J. and T. Egli, Global physiological analysis of carbon- and energy-limited growing Escherichia coli confirms a high degree of catabolic flexibility and preparedness for mixed substrate utilization. Environ Microbiol, 2005. 7(10): p. 1568-81.
25. Wendisch, V.F., et al., Quantitative determination of metabolic fluxes during coutilization of two carbon sources: comparative analyses with Corynebacterium glutamicum during growth on acetate and/or glucose. J Bacteriol, 2000. 182(11): p. 3088-96.
26. de Carvalho, L.P., et al., Metabolomics of Mycobacterium tuberculosis reveals compartmentalized co-catabolism of carbon substrates. Chem Biol, 2010. 17(10): p. 1122-31.


Supporting Information


SI Fig. S1. Mass isotopomer distribution of central metabolites of M. extorquens AM1, measured by LC-MS, upon co-consumption of 13C (> 99%) methanol and natural abundance (1.1% 13C) succinate. Mass isotopomer data in black correspond to samples collected during the mid-co-consumption phase (sampling time 1, see Fig. 1), and data in red to samples collected at the end of the co-consumption phase (sampling time 2).


SI Fig. S2. Average 13C labeling of central metabolites of M. extorquens AM1 cells, measured by LC-MS, during co-consumption of 13C (> 99%) methanol and natural abundance (1.1% 13C) succinate. Black: samples collected during the mid-co-consumption phase (sampling time 1, see Fig. 1); grey: samples collected 1.5 h after the complete depletion of succinate (sampling time 3, see Fig. 1).


SI Fig. S3. Scheme of coenzyme A biosynthesis in M. extorquens AM1. Identified 13C carbon entry points are indicated by colored circles. Red, C1 precursor from the tetrahydrofolate pathway; yellow, CO2; green, C3 carbon of serine.


Chapter VI

Discussion

The work presented here provides a system-level investigation of the metabolism of the facultative methylotroph M. extorquens AM1. Chapters II and III laid the basis for the description of the complex metabolic network of M. extorquens AM1. The results presented therein led to the demonstration of a novel pathway essential during methylotrophy, i.e. the ethylmalonyl-CoA pathway, an alternative to the well-known glyoxylate cycle for C1 as well as C2 metabolism. In addition, the genome sequencing of M. extorquens AM1 provided the list of genes present in the bacterium. Genome annotation together with biochemical information was used to build a global metabolic model of the strain, describing in mathematical form the knowledge collected so far on the microorganism (chapter IV). This model was validated by the accurate prediction of the known physiology of M. extorquens AM1. Then, intrinsic properties of the metabolic network structure were studied in silico and revealed a high fragility of methylotrophic metabolism, indicating that a high selection pressure operates in nature to maintain the methylotrophic capacity. The network structure analysis revealed an original metabolic mode that constitutes the first proof of principle that the ethylmalonyl-CoA pathway can be a new strategy of CO2 fixation. Furthermore, this model provides a valuable tool to support further system-level investigations by the methylotrophic community, including for biotechnological applications. The model was used in a 13C-metabolic flux analysis experiment to elucidate the physiology of M. extorquens AM1 growing on methanol and on methanol plus succinate. The last study (described in chapter V) revealed the intriguing strategy of M. extorquens to deal with two carbon sources simultaneously. This behavior is unexpected, since diauxic shift is the classically described behavior of microorganisms growing under laboratory conditions with two substrates. This observation supports earlier evidence that co-consumption could be a better strategy for microorganisms growing in environments with low but diverse carbon sources. Indeed, the results obtained during the overall study illustrate the value of a systems biology approach in microbiology. More specifically, it supports new discoveries in the field of methylotrophy and bacterial ecology, addressing fundamental questions such as the adaptation and evolution of microorganisms in the context of their environment.

VI.1 Additional properties of the EMCP. A new strategy for autotrophy?

During this PhD work, the operation of the pathway for glyoxylate regeneration during methylotrophy, i.e. the ethylmalonyl-CoA pathway, was demonstrated, a question which had remained unsolved for half a century since the discovery of the serine cycle. The breakthrough which led to this discovery came from the genetic and biochemical analysis performed by Mary Lidstrom and coworkers [1], in which key genes, reactions and metabolites involved in this pathway were identified. This work was followed by enzymatic studies performed by Tobias Erb and coworkers in Rhodobacter sphaeroides, which led to the characterization of 4 key enzymes involved in the pathway [2-5]; they were the first to formulate the later demonstrated hypothesis of the reaction sequence of the EMCP [2]. They provided a deep understanding of the biochemistry of the pathway, in which stereochemistry plays a key role in the CoA-thioester chemistry and reaction mechanism. This work also proposed an evolutionary scenario for the appearance of the crucial carboxylation step in the ethylmalonyl-CoA pathway, based on amino acid sequence analysis as well as on the reaction mechanism, in which the crotonyl-CoA carboxylase/reductase (Ccr) was found to derive from enoyl-CoA reductases through evolution. They therefore discussed that the carboxylase/reductase may have emerged from the former reductase and that the reductase activity, having a much lower catalytic efficiency than the carboxylase/reductase activity, can be considered an evolutionary relic [4]. Finally, the ethylmalonyl-CoA pathway appears to be an original and complex alternative to the glyoxylate cycle [6], which likewise allows the oxidation of the acetyl moiety of acetyl-CoA into glyoxylate. Both pathways are essential in C1 (serine cycle) as well as in C2 metabolism, but the mechanism of the ethylmalonyl-CoA pathway diverges strongly from that of the glyoxylate cycle. Indeed, 2 molecules of CO2 are assimilated via the ethylmalonyl-CoA pathway and 12 steps are required in the EMCP, whereas the glyoxylate cycle requires only 1 enzyme, isocitrate lyase, in addition to the TCA cycle. Such a complex (12 steps) and fragile process, in comparison with the glyoxylate cycle (1 step), seems unlikely to have been selected through evolution unless it was selected for additional reasons.
Accordingly, a search of genomic databases for the presence of the glyoxylate cycle as well as of key enzymes of the ethylmalonyl-CoA pathway in 1215 fully sequenced bacteria revealed that the glyoxylate cycle is more widespread, occurring in 142 bacterial genera (34% of the 413 genera sequenced), whereas the ethylmalonyl-CoA pathway is significantly less widespread, occurring in 32 bacterial genera (8% of the 413 genera sequenced) [5]. Nevertheless, a significant number of bacterial genera were found to harbor the ethylmalonyl-CoA pathway, indicating its ecological relevance and, further, that it could provide an additional benefit for the bacteria compared with the glyoxylate cycle. Indeed, as presented in chapter IV, the EMCP was shown in silico to allow the organism to generate all carbon from CO2, a potential additional function of the pathway which is not provided by the glyoxylate cycle. It is tempting to speculate that the EMCP may allow photoautotrophic growth. Indeed, the in silico analysis revealed a metabolic mode whereby the EMCP combined with the serine cycle and the glycine cleavage complex is able to perform biomass biosynthesis from CO2 only. Nevertheless, in the simulation methanol was still used as energy source, and, possibly due to the higher efficiency of the carbon assimilation strategy from methanol and to the down-regulation of the glycine cleavage complex observed at the proteomic and fluxomic levels during methylotrophy, the simulated mode is unlikely to operate in the presence of methanol and has never been reported experimentally.

Two things are required in M. extorquens AM1 to support autotrophy independently of methanol as energy source. First, the glycine cleavage complex, the EMCP and the serine cycle have to be up-regulated at the same time to allow a CO2 fixation cycle. Secondly, an alternative energy source to methanol (electron donor) has to be provided to the cell. At the current state of knowledge, no condition has been identified in which the 3 pathways are up-regulated at the same time. Nevertheless, concerning the second requirement, photosystem I was found to be present in M. extorquens AM1, which could support the use of light energy to generate proton motive force and then ATP independently of methanol. In addition, a NADPH-ferredoxin oxidoreductase is present in the genome and could allow photon energy to be used to transfer electrons from the periplasmic space (cytochrome) to NADP+, generating the NADPH required for biosynthesis. An electron source is then required to supply cytochrome c and provide NADPH redox equivalents. Interestingly, the Sox system (sulfur oxidizing system) is present in the genome and is known to be active in M. extorquens AM1 [7] under aerobic conditions; it could supply the cell with electrons from thiosulfate. Taken together, this information outlines a condition in which M. extorquens AM1 could be able to grow photolithoautotrophically using light, thiosulfate and CO2. The theoretical growth of M. extorquens AM1 under this condition was predicted with the genome-scale model and found to be feasible (data not shown). Under aerobic conditions the EMCP alone is sufficient to reduce and assimilate CO2, whereas under anaerobic conditions CO2 was found to be assimilated via formate dehydrogenase combined with the EMCP. It will be interesting in the future to cultivate M. extorquens AM1 under the above-defined conditions, anaerobically and aerobically, in order to find out whether the ethylmalonyl-CoA pathway can support autotrophy. Interestingly, Rhodospirillum rubrum and Rhodobacter sphaeroides were found to grow in the absence of the Calvin-Benson-Bassham cycle (deletion mutants) using thiosulfate as electron donor and light as energy source under anaerobic conditions [8]. Since the EMCP is known to be present in Rhodobacter sphaeroides, this pathway could effectively support photolithoautotrophy. One may then speculate that these bacteria will soon be shown to grow on CO2 as sole carbon source using the ethylmalonyl-CoA pathway.

An additional benefit for bacteria of using the ethylmalonyl-CoA pathway in place of the glyoxylate cycle was discovered recently. Indeed, photoheterotrophic growth with acetate was recently described for a Rhodobacter sphaeroides deletion mutant of the Calvin-Benson-Bassham reductive pentose phosphate cycle, whereas the corresponding deletion mutant of Rhodopseudomonas palustris, an ICL-variant bacterium, cannot grow [9]. The authors proposed that the rescue mechanism is the regeneration of the NADPH cofactor, overproduced during photoheterotrophic growth with acetate (anaerobic conditions), by reduction of CO2 via the carboxylation/reduction step of the EMCP. The in silico prediction performed with the genome-scale model of M. extorquens AM1 corroborates that the EMCP can sustain higher growth in the case of NADPH overproduction compared with an ICL variant. Hence, the first additional property of the ethylmalonyl-CoA pathway compared with the glyoxylate cycle, and shared with the CBB cycle [10], is central redox cofactor recycling, as demonstrated in anoxygenic phototrophic bacteria [9].

VI.2. Yield and limitation of M. extorquens AM1 during methylotrophic growth. Genome-scale perspective

Achieving the metabolic network reconstruction of M. extorquens AM1 allowed a first comparison between the physiology of this organism and predictions at the system level. Indeed, predicting how assimilation and dissimilation requirements are balanced within metabolism is a valuable achievement for biochemists, because it confronts biochemical and genomic knowledge with physiologically observed parameters. Hence, yield and cell efficiency can be addressed with respect to environmental competitiveness, rational selection through evolution, and biotechnological utilization. The ability to predict the yield of an organism accurately indicates that a close to complete description of the metabolic processes has been achieved. Anthony in 1982 [11] reported metabolic balancing for diverse types of methylotrophs and a cross-comparison of their different metabolic strategies (serine cycle, RuMP cycle, and RuBP cycle). He identified two key questions for predicting the yield accurately: first, the P/O ratio, which corresponds to the moles of ATP produced per mole of oxygen consumed and depends on the efficiency of the electron transfer chain; and secondly, the number of redox equivalents, i.e. NAD(P)H, generated per mole of formaldehyde oxidized. Assuming a high P/O ratio (3 from NADH), he predicted for serine cycle methylotrophs a yield of 0.44 to 0.56 g(CDW)·g-1(methanol), considering one or two NADH produced per formaldehyde oxidized, respectively. These values were at the upper limit of, or higher than, the physiologically measured yields of 0.30 to 0.45 g(CDW)·g-1(methanol) [12]. The prediction obtained with the genome-scale model, using a P/O ratio of 2 from NADH and the quantified biomass, resulted in a maximal yield of 0.42 g(CDW)·g-1(methanol), which is close to the highest measured yield (0.45) and lower than the value predicted by Anthony (0.44). The experimentally observed yield of 0.35 g(CDW)·g-1(methanol) indicates that M. extorquens AM1 is not growing at maximal efficiency under the laboratory conditions used in this thesis work. Indeed, a maximal growth rate of 0.20 h-1 is predicted whereas 0.17 ± 0.1 h-1 was observed. Therefore, these results indicate a limitation during methylotrophic growth, whereas the prediction for succinate (0.195 h-1) is much closer to the observed value of 0.20 ± 0.1 h-1. A simple explanation could be that one element of the medium is limiting during growth on methanol. Indeed, Patrick Kiefer recently found that a cobalt limitation occurred in previous experiments. This cobalt limitation specifically affected the ethylmalonyl-CoA pathway and was demonstrated by using metabolomics [13]. Another explanation could be that formaldehyde, a central intermediate but a toxic compound for the cell, induces a stress response and thereby a growth reduction.
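The gap between prediction and observation can be expressed as the fraction of the predicted value that is reached experimentally. The following is a minimal sketch using only the values quoted above; the names `CASES` and `fraction_of_prediction` are illustrative, and the reported experimental uncertainties are not propagated.

```python
# (predicted, observed) pairs quoted in the text above.
CASES = {
    "methanol yield, g(CDW)/g(methanol)": (0.42, 0.35),
    "methanol growth rate, 1/h": (0.20, 0.17),
    "succinate growth rate, 1/h": (0.195, 0.20),
}


def fraction_of_prediction(predicted: float, observed: float) -> float:
    """Return observed/predicted (hypothetical helper, no error propagation)."""
    if predicted <= 0:
        raise ValueError("predicted value must be positive")
    return observed / predicted


fractions = {
    name: fraction_of_prediction(pred, obs) for name, (pred, obs) in CASES.items()
}

for name, frac in fractions.items():
    print(f"{name}: observed is {100 * frac:.0f} % of the prediction")
```

Under these numbers, both the yield and the growth rate on methanol fall roughly 15% short of the model optimum, while growth on succinate matches the prediction within a few percent.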

Considering only biochemical processes, Anthony predicted in 1982 that serine cycle methylotrophs are NADH-limited if only 1 NADH is produced during formaldehyde oxidation. However, elucidation of the tetrahydromethanopterin-dependent formaldehyde oxidation pathway in M. extorquens AM1 [14] revealed that 2 redox equivalents are produced in the oxidation of formaldehyde to CO2. Thus, according to the genome-scale (GS) model of M. extorquens AM1, serine cycle methylotrophs appear not to be specifically ATP- or NAD(P)H-limited, but "energy limited" in a wider sense. Energy limitation is classically considered to be associated with ATP limitation, whereas here both energy and redox equivalents are limiting. Indeed, because the formaldehyde oxidation step can produce NADPH or NADH, and NADH can subsequently be used to produce ATP, the cell can fulfill both requirements; it is therefore the number of electrons that can be harvested from the donor that is limiting, rather than the nature of the redox cofactor to which they are transferred. Consequently, as methanol is the energy source, serine cycle methylotrophs are substrate-limited. Only one step in the optimal flux distribution could be optimized with respect to the energy of the harvested electrons: methanol dehydrogenase, which transfers electrons to cytochrome c, at a lower energy level than a transfer to NADH. Therefore, the use of an NADH-producing methanol dehydrogenase would increase the yield. Furthermore, considering that C1 oxidation is close to the optimal electron harvesting capacity and that the cells are energy-limited, an increase in growth rate could be achieved only by increasing the methanol consumption rate. Indeed, methanol dehydrogenase appears to be the main target for flux improvement in biotechnology, with regard to both growth rate and yield. One explanation for the cell maintaining substrate-limited growth controlled by methanol dehydrogenase could be to avoid an accumulation of formaldehyde, which could have toxic effects.

VI.3. Optimal energetic balance in M. extorquens AM1 during methylotrophy. Genome-Scale and enzymology perspectives.

The theoretical considerations outlined above indicate that the metabolism of M. extorquens is not specifically limited by ATP or NADPH. Therefore, ATP and NADPH production through the central metabolism must be accurately balanced. Indeed, an unbalanced production of ATP and NADPH would generate an excess of one cofactor and a deficit of the other, the latter becoming the limiting factor whereas the former would have to be dissipated. This balance is orchestrated in M. extorquens AM1 via formaldehyde oxidation. Indeed, the optimal flux prediction during methylotrophy (Chapter IV) indicates that the formaldehyde oxidation step is not used exclusively to produce either NADPH or NADH (ATP), but operates at an optimal NADH/NADPH ratio of 0.66. Thus, the accurate energy balance needed to support optimal growth requires a tightly regulated NADH/NADPH flux ratio through the methylene-tetrahydromethanopterin (H4MPT) dehydrogenases. The reaction is catalyzed by two enzymes, the NAD(P)-dependent methylene-tetrahydromethanopterin dehydrogenase (MtdB) [15] in addition to the NADPH-dependent methylenetetrahydrofolate dehydrogenase (MtdA) [16]. Catalyzing this step, which is crucial for energetics and consequently for growth, with two enzymes represents an ideal system to add flexibility in the control of the NADH/NADPH ratio. Indeed, MtdA, which is NADPH-specific and also catalyzes the corresponding NADPH-dependent tetrahydrofolate reaction involved in C1 assimilation, allows NADPH production to be regulated in accordance with the biomass precursor supply. Hence, a decrease in the enzyme activity will at the same time decrease the NADPH requirement for biosynthesis and the NADPH produced during C1 oxidation. The properties of both enzymes, MtdA and MtdB, allow enzyme activities (fluxes) in vivo to be predicted by considering the known metabolite concentrations in the cell. Indeed, considering the methylene-tetrahydromethanopterin, NAD+ and NADP+ concentrations experimentally determined in M. extorquens AM1 cells [15, 17, 18], and the kinetic properties (Km, Vmax) of both enzymes MtdB and MtdA [16], the ratio of NADH/NADPH produced should be 0.48 (Fig. 1). Those calculations were performed considering the following equation for each enzyme and redox cofactor, in accordance with the kinetic model of Marx et al. [19]:

Enzyme activity as a function of NAD(P)+ cofactor concentration is displayed in Fig. 1. Interestingly, if one assumes that the enzymes operate at maximum activity relative to cofactor concentration, this ratio is quite robust to changes in NAD(P)+ concentration (more than 2-fold).

Fig. 1. Predicted methylene-H4MPT dehydrogenase activity (MtdB plus MtdA) from kinetic parameters as a function of intracellular NAD(P)+ concentration. Black circles: NADP+-dependent reaction; white circles: NAD+-dependent reaction. The experimentally determined intracellular concentrations of NAD+ and NADP+ are indicated by dashed lines.
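The rate equation referred to above is not reproduced in this text. As an illustration only, the kind of calculation described (activities predicted from Km, Vmax and intracellular concentrations) can be sketched with a generic two-substrate Michaelis-Menten rate law. The functional form, the function names and all numerical values below are assumptions made for this sketch; they are not the expression or the parameters used for Fig. 1, and the sketch ignores competition between NAD+ and NADP+ at MtdB.

```python
def two_substrate_rate(vmax, s, km_s, cofactor, km_cofactor):
    """Rate of an enzyme saturable by both its C1 substrate and its cofactor.

    Assumed generic form (not taken from the thesis):
        v = vmax * s / (km_s + s) * cofactor / (km_cofactor + cofactor)
    """
    return vmax * (s / (km_s + s)) * (cofactor / (km_cofactor + cofactor))


def nadh_nadph_ratio(nad_rates, nadp_rates):
    """Ratio of total NADH-producing to total NADPH-producing activity."""
    return sum(nad_rates) / sum(nadp_rates)


# Hypothetical placeholder values in arbitrary units. They are NOT the
# measured parameters of refs [15-18]; they only make the sketch runnable.
methylene_h4mpt = 0.5
nad, nadp = 2.0, 0.5

mtdb_nad = two_substrate_rate(vmax=10.0, s=methylene_h4mpt, km_s=0.05,
                              cofactor=nad, km_cofactor=0.2)
mtdb_nadp = two_substrate_rate(vmax=10.0, s=methylene_h4mpt, km_s=0.05,
                               cofactor=nadp, km_cofactor=0.02)
mtda_nadp = two_substrate_rate(vmax=5.0, s=methylene_h4mpt, km_s=0.02,
                               cofactor=nadp, km_cofactor=0.03)

# MtdB contributes to both pools, MtdA only to the NADPH pool.
ratio = nadh_nadph_ratio([mtdb_nad], [mtdb_nadp, mtda_nadp])
print(f"NADH/NADPH production ratio (hypothetical inputs): {ratio:.2f}")
```

With measured concentrations and kinetic constants substituted for the placeholders, the same structure would yield a predicted ratio comparable to the value discussed in the text.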

Hence, the formaldehyde oxidation flux through the enzymes (MtdB, MtdA) should generate 67% NADPH and 33% NADH, whereas the optimal values are 60% and 40%, respectively. Interestingly, the enzyme parameters seem to be set to provide an NADH/NADPH production ratio suitable for optimal growth. The close agreement between enzyme properties and optimal flux distribution suggests that the study of the metabolic network structure, when the biomass composition is known, could potentially predict enzyme kinetic parameters.
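The percentages above follow directly from the NADH/NADPH ratios: for a ratio r, the NADPH share is 1/(1 + r) and the NADH share is r/(1 + r). The short sketch below applies this conversion to the two ratios quoted in the text (0.48 and 0.66); the function name is introduced here for illustration only. The results agree with the quoted percentages to within rounding.

```python
def cofactor_shares(nadh_per_nadph):
    """Split an NADH/NADPH production ratio into (NADPH share, NADH share)."""
    if nadh_per_nadph < 0:
        raise ValueError("ratio must be non-negative")
    nadph_share = 1.0 / (1.0 + nadh_per_nadph)
    return nadph_share, 1.0 - nadph_share


# Ratios quoted in the text: kinetic prediction and optimal flux distribution.
for label, ratio in [("kinetic prediction", 0.48), ("optimal flux", 0.66)]:
    nadph, nadh = cofactor_shares(ratio)
    print(f"{label}: {nadph:.1%} NADPH, {nadh:.1%} NADH")
```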

VI.4. Flexibility in energetic balance in M. extorquens AM1 during methylotrophy avoids formaldehyde accumulation

The importance of the methylene-tetrahydromethanopterin dehydrogenase activity in generating NADPH and NADH was pointed out in the previous section. Since the properties of the catalyzing enzymes (MtdA and MtdB) seem to be suitable for optimal growth on methanol, the question arises of how the electron flow is directed when cells are not growing at the optimal growth rate. To keep NADPH and NADH production balanced in accordance with the growth rate, and to avoid an excess of one of the pyridine nucleotides, the cell could be expected to decrease the methanol uptake rate in proportion to the growth rate. Notably, however, methanol consumption is stable at about 15 to 16 mmol∙g-1∙h-1, independent of whether the measured growth rate is 0.09 h-1 or 0.17 h-1. A slower growth rate (0.09 to 0.12 h-1) was observed, for instance, as a consequence of the high CO2 concentration in the medium and of cobalt limitation during the 13C-MFA experiments; NADPH would then be overproduced if the NADH/NADPH ratio were fixed. Calculations using the genome-scale model, with the fluxes determined by the 13C-MFA experiments as constraints and NADPH production set equal to the growth requirements, reveal an apparent NADH/NADPH ratio of 1.79. The rest of the formaldehyde oxidation flux can be assumed to be used for NADH generation.
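The logic of the apparent ratio can be sketched as follows: NADPH production is pinned to the growth demand, and the remainder of the formaldehyde oxidation flux is assigned to NADH. The sketch below is a simplification under stated assumptions, not the genome-scale calculation: the oxidation flux is normalized to 1 and held constant, and the NADPH demand is assumed to scale linearly with growth rate from a hypothetical 60% share at 0.17 h-1.

```python
def apparent_ratio(oxidation_flux, nadph_demand):
    """NADH/NADPH ratio when NADPH production equals the growth demand
    and the remaining formaldehyde oxidation flux is assigned to NADH."""
    if not 0 < nadph_demand <= oxidation_flux:
        raise ValueError("demand must be positive and not exceed the flux")
    return (oxidation_flux - nadph_demand) / nadph_demand


# Hypothetical relative fluxes (assumptions, not measured values):
# oxidation flux normalized to 1; NADPH share of 60% at 0.17 h-1,
# scaled linearly with the growth rate.
reference_growth, reference_share = 0.17, 0.60
for growth in (0.17, 0.12, 0.09):
    demand = reference_share * growth / reference_growth
    print(f"mu = {growth:.2f} h-1 -> ratio {apparent_ratio(1.0, demand):.2f}")
```

Under these assumptions the ratio rises above 1 at the slower growth rates, which is the same direction as the apparent ratio reported above.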

Fig. 2. Scheme of the electron harvesting reactions during methanol oxidation. MxaFI, methanol dehydrogenase; XoxF, MxaF homologue; Fdh, formate dehydrogenase; MtdB, NAD(P)+-dependent methylene-tetrahydromethanopterin dehydrogenase; MtdA, NADP+-dependent methylene-tetrahydrofolate dehydrogenase; cyt c, cytochrome c. Dashed arrows: collapsed reaction steps.

Three mechanisms have been identified that allow the cell to decrease NADPH production at this key step of energy metabolism (Fig. 2). First, decreasing the amount of MtdA, the enzyme catalyzing only the NADP+-dependent oxidation, reduces NADPH production in step with reduced C1 assimilation. Secondly, even if the enzymes are saturated with respect to NAD(P)+ concentration, NADP+ is not far from becoming limiting. Indeed, considering the mass action of substrate and product, a decrease of NADP+ by a factor of 10 would reduce NADPH production by a factor of 2. Therefore, NADPH production in excess of the growth requirement will reduce the pool size of NADP+, which in turn could start to reduce the overflow (Fig. 1). Thirdly, MtdB is subject to noncompetitive inhibition by NADPH [16]; this inhibition reduces both the NAD+- and the NADP+-dependent reactions and may play a role in controlling and avoiding NADPH overflow. Combined, these three mechanisms can lead to a reduction of both NADP+- and NAD+-dependent formaldehyde oxidation. Consequently, given the stable methanol oxidation, this regulation should lead to an accumulation of toxic formaldehyde. Thus, an alternative, NAD(P)H-independent formaldehyde-oxidizing mechanism could be in place that might be involved in formaldehyde detoxification if the methylene-tetrahydromethanopterin dehydrogenase activity decreases. A potential candidate is XoxF, a methanol dehydrogenase-like enzyme (MxaF homologue) that also has formaldehyde oxidation activity. This enzyme is also periplasmic and suspected to be cytochrome c-dependent [20]. Interestingly, growth of the XoxF deletion mutant is reduced by 33% relative to the wild type, and the mutant also consumes less methanol (38%). This may suggest that the XoxF deletion mutant is unable to decouple methanol oxidation from growth rate, which would be in line with a putative role of XoxF in carrying the overflow required for efficient formaldehyde oxidation. In addition, just after a starvation phase, the deletion mutant has lost the capability to quickly convert a high amount of methanol and formaldehyde to formate coupled with high O2 consumption [20]. Indeed, just after starvation, when the cells are not ready to grow and therefore consume less NADPH, XoxF might be required to oxidize high levels of methanol and formaldehyde. The proposal that XoxF provides an alternative route for formaldehyde oxidation that avoids NADPH overproduction is, however, tentative.

The cell has three more mechanisms to deal with overproduction of redox cofactors (NADPH or NADH). The first concerns the respiratory chain. The NADH:ubiquinone oxidoreductase complexes I and II of M. extorquens AM1 can be expected to pump 4 or 0 protons, respectively; furthermore, the bacterium is predicted to contain two different terminal oxidases, Cyt o and Cyt d, pumping 4 and 2 protons, respectively. This flexibility in the electron transfer chain can avoid overproduction of ATP by reducing the P/O ratio. Secondly, there is the membrane-bound transhydrogenase, which allows the transfer of electrons from NADH to NADPH.
Even though this enzyme is predicted not to be used during methylotrophic growth, based on its down-regulation at the transcriptomic and proteomic levels in comparison with growth in the presence of succinate, it could compensate for an imbalance in the NADH/NADPH production ratio of MtdA and MtdB in case of NADH overproduction. Finally, a putative NADPH dehydrogenase (old yellow enzyme-like protein; META1_2244), identified during the reconstruction process, could allow regeneration of NADP+ by oxidizing NADPH directly with O2 in case of excess NADPH.
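The effect of a shrinking NADP+ pool and of noncompetitive inhibition by NADPH, both discussed earlier in this section, can be illustrated with a simple saturation model. The sketch assumes plain Michaelis-Menten dependence on the cofactor and a classical noncompetitive inhibition term; the starting pool of 8 x Km and the inhibitor level are hypothetical values chosen so that a tenfold drop in NADP+ halves the rate, as stated in the text. They are not measured parameters.

```python
def relative_rate(cofactor, km, inhibitor=0.0, ki=float("inf")):
    """Fraction of Vmax for a Michaelis-Menten enzyme with optional
    noncompetitive inhibition (which lowers the apparent Vmax)."""
    saturation = cofactor / (km + cofactor)
    return saturation / (1.0 + inhibitor / ki)


km = 1.0                  # arbitrary unit, hypothetical
nadp_start = 8.0 * km     # assumed near-saturating pool (8 x Km)

before = relative_rate(nadp_start, km)
after = relative_rate(nadp_start / 10.0, km)
print(f"tenfold drop in NADP+: rate falls by a factor of {before / after:.2f}")

# Noncompetitive inhibition with the inhibitor present at its Ki
# (hypothetical level): the rate is halved at any cofactor concentration.
inhibited = relative_rate(nadp_start, km, inhibitor=1.0, ki=1.0)
print(f"inhibitor at Ki: rate falls by a factor of {before / inhibited:.2f}")
```

In this model the halving occurs exactly when the starting concentration is 8 x Km; a more saturated enzyme would respond less to the same tenfold drop.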

Fig. 3. Variability analysis of the electron flow system (C1 oxidation and electron transfer chain). MDH, methanol dehydrogenase; Fae, formaldehyde activating enzyme; Mtd, methylene-tetrahydromethanopterin dehydrogenase; Fdh, formate dehydrogenase; Fadh, formaldehyde dehydrogenase; OYE, old yellow enzyme, i.e. NADPH oxidoreductase. Flux variability analysis was performed using the methylotrophy GS network with a methanol uptake rate of 15.1 mmol∙g-1∙h-1 and a growth rate of 0.17 h-1 as constraints.

In conclusion, the formaldehyde node in the network of M. extorquens AM1 appears to be the key branch point of energetics, routing electrons to NADH or NADPH. It requires fine regulation to avoid overproduction of one or the other cofactor, which could in turn decrease the formaldehyde oxidation flux. A reduction of the formaldehyde consumption flux despite stable formaldehyde production by the methanol dehydrogenase would result in accumulation of this toxic compound. Nevertheless, the metabolic network is very flexible in carrying out formaldehyde oxidation (Fig. 3): it can compensate for lower cytoplasmic oxidation with periplasmic oxidation, adapt the routing of electron flow to the growth requirements, and dissipate excess electrons.
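The flexibility of the electron transfer chain mentioned in this section can be made concrete by enumerating the possible routes from NADH to O2. The sketch below uses the proton counts quoted in the text for the two NADH:ubiquinone oxidoreductases and the two terminal oxidases. Summing them per route is a simplification introduced here: it assumes the counts refer to the same number of electrons and ignores any other proton-translocating step.

```python
from itertools import product

# Proton counts as quoted in the text for each alternative component.
nadh_dehydrogenases = {"complex I": 4, "complex II": 0}
terminal_oxidases = {"Cyt o": 4, "Cyt d": 2}

# Total protons pumped for every dehydrogenase/oxidase combination.
routes = {
    (dh, ox): nadh_dehydrogenases[dh] + terminal_oxidases[ox]
    for dh, ox in product(nadh_dehydrogenases, terminal_oxidases)
}

for (dh, ox), protons in sorted(routes.items(), key=lambda kv: -kv[1]):
    print(f"{dh} + {ox}: {protons} H+")
```

The spread between the most and least efficient routes illustrates how the P/O ratio, and hence ATP production per NADH, could be lowered without changing the NADH oxidation flux.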

VI.5. Pre-adaptation in Methylobacterium extorquens AM1?

Adaptation of microorganisms to changing environmental conditions requires metabolic flexibility and accurate network regulation to orchestrate metabolic reprogramming. If the habitat of a microorganism is a recurrently changing environment, evolution will shape the biological network to suit these changing conditions and to increase fitness. The phyllosphere, i.e. the plant surface, is an environment where epiphytes are exposed to diverse carbon substrates that are heterogeneously localized and transiently available [21-23]. Measurements of methanol emission by plants show a methanol spike in the early morning when the stomata open, whereas a lower, basal amount is released over the day and none during the night (closed stomata) [21]. Therefore, one may expect that M. extorquens species have adapted to deal with this changing environment. In addition, M. extorquens species were isolated not only from the phyllosphere but also from the rhizosphere [24], two environments that differ even more in terms of physical constraints, i.e. desiccation, light and temperature, and perhaps in substrate diversity and availability. Several lines of evidence for this adaptation capacity were found during this thesis work. First, the genome sequencing of M. extorquens AM1 [25] revealed a large genome (6.88 Mb) and a high number of genes (6759 open reading frames). This size corresponds to that of free-living bacteria with a versatile metabolism required for environmental adaptation [26]. In contrast, intracellular bacteria, for instance, which depend on and are adapted to the narrow environment of their host, tend to have fewer genes, in the range of 2000. The second line of evidence, which is partly determined by the number of genes, is the number of metabolic reactions contained in the reconstructed genome-scale metabolic network. In total, 1139 reactions were assembled in the model. During methylotrophic growth, around one third are predicted to be essential, most of them being involved in the generation of all biomass constituents of the cell from mineral medium; the second third is the flexible part of the network within the methylotrophic mode, resulting from alternate pathways; and the last third is apparently not used under methylotrophic conditions, i.e. blocked reactions. This last part represents a high number of enzymes, pointing towards a metabolic versatility that potentially allows the use of different substrates. However, even if other metabolic modes are encountered by M. extorquens AM1 in its environment, there is evidence, from the fragility analysis presented in chapter VI, that it regularly encounters a methylotrophic lifestyle. However, only a narrow range of substrates can be used as carbon sources. Thus, M.
extorquens AM1 appears to be a highly specialized organism with some generalist bacterial features. Hence, the metabolic adaptation capacity is dedicated to the utilization of a narrow range of carbon sources and seems restricted to organic acids, whereas the generalist E. coli, for instance, can potentially use up to 174 carbon sources [27]. Nevertheless, adaptation to this narrow range of carbon sources requires quite strong metabolic reprogramming. The best investigated metabolic adaptation of M. extorquens AM1 under laboratory conditions is the ability to grow on methanol, a C1 compound, as well as on succinate, a multicarbon (C4) compound. Indeed, the biological network of M. extorquens AM1 was found to be strongly reorganized between growth on the two substrates at the transcriptome [28], proteome [29], and metabolome levels [30, 31]. In addition, Skovran and coworkers [32] investigated the reorganization of the biological network upon a switch from succinate to methanol. Surprisingly, they report that the methanol dehydrogenase is significantly active during growth on succinate, indicating that no strong catabolic repression is operating. It was proposed that the cells are constantly ready to metabolize methanol, at least in a catabolic way, whereas adaptation to assimilation was observed to require several minutes. In two other studies [33, 34], a quite significant catabolic repression of the methanol dehydrogenase activity was observed, by at least a factor of 10. One explanation of this discrepancy could be that the first study was performed in a chemostat (dilution rate 0.163 h-1), i.e. where the cells are substrate-limited, whereas the other studies were performed in batch, i.e. without substrate limitation. Therefore, substrate limitation could be a signal for pre-adaptation of the cell to encounter methanol. Two hypotheses could explain this putative and conditional pre-adaptation to switching from non-methylotrophic to methylotrophic conditions. First, the regulation of the metabolic network could have evolved to suit a diurnal cycle of methanol production by plants. Therefore, to benefit from the relatively high methanol spike that occurs in the morning [21], the cell could pre-adapt to methanol metabolization. Consistent with the hypothesis that M. extorquens AM1 is pre-adapted to use methanol, Schmidt et al. [20] mimicked a methanol spike after a starvation phase and found that cells are ready to consume methanol quickly. An identical experiment using other carbon sources would be required to determine whether the cells are pre-adapted to consume methanol specifically or whether they are simply ready to consume any substrate appearing in the environment. Another type of adaptation of the regulatory network of M. extorquens AM1 could be to perform co-consumption when different carbon sources are available. This would be a metabolic strategy used by microorganisms to deal with the low amounts of carbon sources encountered in the environment [35], e.g. the surface of plants for M. extorquens. The experiments described in chapter V show co-metabolization of methanol and succinate. Interestingly, a specific regulation of enzyme activities under mixed conditions was observed in 1972 by Dunstan and co-workers [33]. The authors observed a repression of the serine cycle based on enzyme activity. This result is in accordance with the observations from the 13C-labeling experiments performed during this work. Predictions using the GS network model of M. extorquens AM1 suggest that the use of the two substrates methanol and succinate corresponds to a rational utilization.
However, additional studies with other substrates and microorganisms are required to validate this principle of mixed-substrate growth and, furthermore, to determine whether the observed regulation evolved in accordance with environmental conditions. Understanding the advantage to a microorganism of performing co-consumption opens new challenges in studying the adaptation and evolution of microbial metabolic strategies in relation to their environment. Indeed, optimization of fitness on specific carbon sources is considered a result of evolution, whereas the capacity to use several substrates, or to switch quickly from one to another, could also be beneficial. Currently, only steady-state metabolism can be addressed by model prediction and experimental flux quantification methods; developing modeling methods to capture the dynamic state of metabolism would therefore be of great interest for understanding metabolic adaptation processes. One challenging step towards this goal would be to define conceptually an objective function that takes into account all the costs associated with metabolic reprogramming, e.g. protein synthesis and dynamic regulation, together with the associated quantitative parameters. Then, the metabolic efficiency of a diauxic shift versus co-consumption, as well as the tradeoff between metabolic robustness and efficiency of the switch, could be addressed with regard to microbial ecology.
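The partition of the network into essential, flexible and blocked reactions described earlier in this section can be illustrated with flux ranges of the kind produced by a flux variability analysis. The classification rule below is an assumption made for this sketch (a reaction is called essential if its range excludes zero, blocked if its range is zero, and flexible otherwise); it is not necessarily the criterion used with the genome-scale model, and the reaction names and ranges are hypothetical.

```python
from collections import Counter


def classify_reactions(flux_ranges, tol=1e-9):
    """Label each reaction from its (minimum, maximum) feasible flux."""
    labels = {}
    for name, (lower, upper) in flux_ranges.items():
        if abs(lower) <= tol and abs(upper) <= tol:
            labels[name] = "blocked"      # can never carry flux
        elif lower > tol or upper < -tol:
            labels[name] = "essential"    # must always carry flux
        else:
            labels[name] = "flexible"     # may or may not carry flux
    return labels


# Hypothetical toy ranges (arbitrary units), not model output.
toy_ranges = {
    "R1_methanol_oxidation": (10.0, 10.0),
    "R2_biomass_precursor": (1.5, 1.5),
    "R3_route_a": (0.0, 6.0),
    "R4_route_b": (0.0, 6.0),
    "R5_other_substrate_uptake": (0.0, 0.0),
    "R6_reverse_only": (-2.0, -0.5),
}

labels = classify_reactions(toy_ranges)
counts = Counter(labels.values())
print(dict(counts))
```

Applied to the flux ranges of all 1139 reactions under methylotrophic constraints, a rule of this kind would give the three classes whose approximate sizes are discussed in the text.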

References

1. Korotkova, N., et al., Glyoxylate regeneration pathway in the methylotroph Methylobacterium extorquens AM1. J Bacteriol, 2002. 184(6): p. 1750-8.
2. Erb, T.J., et al., Synthesis of C5-dicarboxylic acids from C2-units involving crotonyl-CoA carboxylase/reductase: the ethylmalonyl-CoA pathway. Proc Natl Acad Sci U S A, 2007. 104(25): p. 10631-6.
3. Erb, T.J., et al., Ethylmalonyl-CoA mutase from Rhodobacter sphaeroides defines a new subclade of coenzyme B12-dependent acyl-CoA mutases. J Biol Chem, 2008: p. M805527200.
4. Erb, T.J., et al., Carboxylation mechanism and stereochemistry of crotonyl-CoA carboxylase/reductase, a carboxylating enoyl-thioester reductase. Proc Natl Acad Sci U S A, 2009. 106(22): p. 8871-6.
5. Erb, T.J., G. Fuchs, and B.E. Alber, (2S)-Methylsuccinyl-CoA dehydrogenase closes the ethylmalonyl-CoA pathway for acetyl-CoA assimilation. Mol Microbiol, 2009. 73(6): p. 992-1008.
6. Kornberg, H.L. and H.A. Krebs, Synthesis of cell constituents from C2-units by a modified tricarboxylic acid cycle. Nature, 1957. 179(4568): p. 988-91.
7. Anandham, R., et al., Thiosulfate Oxidation and mixotrophic growth of Methylobacterium goesingense and Methylobacterium fujisawaense. J Microbiol Biotechnol, 2009. 19(1): p. 17-22.
8. Wang, X., H.V. Modak, and F.R. Tabita, Photolithoautotrophic growth and control of CO2 fixation in Rhodobacter sphaeroides and Rhodospirillum rubrum in the absence of ribulose bisphosphate carboxylase-oxygenase. J Bacteriol, 1993. 175(21): p. 7109-14.
9. Laguna, R., F.R. Tabita, and B.E. Alber, Acetate-dependent photoheterotrophic growth and the differential requirement for the Calvin-Benson-Bassham reductive pentose phosphate cycle in Rhodobacter sphaeroides and Rhodopseudomonas palustris. Arch Microbiol, 2010.
10. McKinlay, J.B. and C.S. Harwood, Carbon dioxide fixation as a central redox cofactor recycling mechanism in bacteria. Proc Natl Acad Sci U S A, 2010. 107(26): p. 11669-75.
11. Anthony, C., The Biochemistry of Methylotrophs. 1982, London: Academic Press.
12. Rokem, J.S., I. Goldberg, and R.I. Mateles, Maintenance requirements for bacteria growing on C1-compounds. Biotechnol Bioeng, 1978. 20(10): p. 1557-64.
13. Kiefer, P., et al., Metabolite profiling uncovers plasmid-induced cobalt limitation under methylotrophic growth conditions. PLoS One, 2009. 4(11): p. e7831.
14. Chistoserdova, L., et al., C1 transfer enzymes and coenzymes linking methylotrophic bacteria and methanogenic Archaea. Science, 1998. 281(5373): p. 99-102.
15. Vorholt, J.A., et al., The NADP-dependent methylene tetrahydromethanopterin dehydrogenase in Methylobacterium extorquens AM1. J Bacteriol, 1998. 180(20): p. 5351-6.
16. Hagemeier, C.H., et al., Characterization of a second methylene tetrahydromethanopterin dehydrogenase from Methylobacterium extorquens AM1. Eur J Biochem, 2000. 267(12): p. 3762-9.
17. Crowther, G.J., G. Kosaly, and M.E. Lidstrom, Formate as the Main Branchpoint for Methylotrophic Metabolism in Methylobacterium extorquens AM1. J Bacteriol, 2008.
18. Guo, X. and M.E. Lidstrom, Physiological analysis of Methylobacterium extorquens AM1 grown in continuous and batch cultures. Arch Microbiol, 2006. 186(2): p. 139-49.
19. Marx, C.J., S.J. Van Dien, and M.E. Lidstrom, Flux analysis uncovers key role of functional redundancy in formaldehyde metabolism. PLoS Biol, 2005. 3(2): p. e16.
20. Schmidt, S., et al., Functional investigation of methanol dehydrogenase-like protein XoxF in Methylobacterium extorquens AM1. Microbiology, 2010. 156(Pt 8): p. 2575-86.
21. Huve, K., et al., Simultaneous growth and emission measurements demonstrate an interactive control of methanol release by leaf expansion and stomata. J Exp Bot, 2007. 58(7): p. 1783-93.
22. Abanda-Nkpwatt, D., et al., Molecular interaction between Methylobacterium extorquens and seedlings: growth promotion, methanol consumption, and localization of the methanol emission site. J Exp Bot, 2006. 57(15): p. 4025-32.
23. Lindow, S.E. and M.T. Brandl, Microbiology of the phyllosphere. Appl Environ Microbiol, 2003. 69(4): p. 1875-83.
24. Schauer, S. and U. Kutschera, Methylotrophic bacteria on the surfaces of field-grown sunflower plants: a biogeographic perspective. Theory Biosci, 2008. 127(1): p. 23-9.
25. Vuilleumier, S., et al., Methylobacterium genome sequences: a reference blueprint to investigate microbial metabolism of C1 compounds from natural and industrial sources. PLoS One, 2009. 4(5): p. e5584.
26. Cases, I., V. de Lorenzo, and C.A. Ouzounis, Transcription regulation and environmental adaptation in bacteria. Trends Microbiol, 2003. 11(6): p. 248-53.
27. Feist, A.M., et al., A genome-scale metabolic reconstruction for Escherichia coli K-12 MG1655 that accounts for 1260 ORFs and thermodynamic information. Mol Syst Biol, 2007. 3: p. 121.
28. Okubo, Y., et al., Implementation of microarrays for Methylobacterium extorquens AM1. Omics, 2007. 11(4): p. 325-40.
29. Bosch, G., et al., Comprehensive proteomics of Methylobacterium extorquens AM1 metabolism under single carbon and nonmethylotrophic conditions. Proteomics, 2008. 8(17): p. 3494-505.
30. Guo, X. and M.E. Lidstrom, Metabolite profiling analysis of Methylobacterium extorquens AM1 by comprehensive two-dimensional gas chromatography coupled with time-of-flight mass spectrometry. Biotechnol Bioeng, 2008. 99(4): p. 929-40.
31. Kiefer, P., J.C. Portais, and J.A. Vorholt, Quantitative metabolome analysis using liquid chromatography-high-resolution mass spectrometry. Anal Biochem, 2008. 382(2): p. 94-100.
32. Skovran, E., et al., A systems biology approach uncovers cellular strategies used by Methylobacterium extorquens AM1 during the switch from multi- to single-carbon growth. PLoS One, 2010. 5(11): p. e14091.
33. Dunstan, P.M., C. Anthony, and W.T. Drabble, Microbial metabolism of C1 and C2 compounds. The role of glyoxylate, glycollate and acetate in the growth of Pseudomonas AM1 on ethanol and on C1 compounds. Biochem J, 1972. 128(1): p. 107-15.
34. Smejkalova, H., T.J. Erb, and G. Fuchs, Methanol assimilation in Methylobacterium extorquens AM1: demonstration of all enzymes and their regulation. PLoS One, 2010. 5(10).
35. Egli, T., The ecological and physiological significance of the growth of heterotrophic microorganisms with mixtures of substrates. Advances in Microbial Ecology, Vol 14, 1995. 14: p. 305-386.

Acknowledgment

This thesis was more than a job, a course of study or even a research project; for me it was an adventure. I would like to take the opportunity to thank the people who contributed to this work and supported me during that time.

First, my supervisor Prof. Julia Vorholt, for her support, for pushing me further than I thought I could go and for giving me freedom in my research.

Jean-Charles Portais, my co-supervisor, who introduced me to the fascinating world of metabolism and the smart 13C-Metabolic Flux experiments.

Prof. Uwe Sauer, for his active participation during my PhD committee meetings and for agreeing to evaluate this work.

Special thanks also to Patrick Kiefer. It was such a pleasure to do science with you! I will always remember that metabolomics, like cooking, should not be too salty!

Many thanks to Philipp Christen, who ran so many bioreactors for me.

For Kathrin Schneider, thanks for the valuable work on the biomass quantification. It was a pleasure to discuss M. extorquens metabolism and human life in general.

To Stephane Massou, thanks for your support in NMR measurement and analysis. I would have liked to spend time playing with the NMR with you.

I address special thanks to Sergueï Sokol, who supported me with the mathematical aspects.

I would like to thank the entire team in Zürich, and in particular Gerd, Benjamin, Nathanaël and Judith, for their availability to help with metabolomics sampling.

Thanks also to the team in Toulouse for the enthusiastic discussions about metabolomics, fluxomics, and metabolic systems. Pierre, Edern, Brice… I believe we still have a few bars left to visit in Paris…

To my dear Benjamin Gourion, Anne Francez-Charlot, Andreas Kaczmarczyk, and Anne Idkowiak, 4 shining hearts in the Zürich night. Thanks for your strong critical ability, which is so valuable in science!

Nathanaël and Claudia, for discussing methylotroph ecology and the behavior of M. extorquens AM1 on leaves. It is both happy and sad to realize what a good time we had together in Zürich now that it is time to leave…

Thanks to the nice people I met in Maine: Tobias Erb and Birgit Albert, who discussed many aspects of the ethylmalonyl-CoA pathway with me; and the Genome people, among them Stéphane Vuilleumier, Ludmila Chistoserdova, and Chris Marx.

I would like to thank my family and friends for their support.

Finally, I thank Emilie, you supported me so much during this PhD. I would not have been able to achieve this without you.

"The greatness of a profession is perhaps, above all, to unite men: there is only one true luxury, and that is human relationships." Antoine de Saint-Exupéry ... Toulouse-Zürich connection completed...

PEYRAUD Rémi

E-mail: [email protected]
Phone number (lab): +044 6323325
Lab address: Institut f. Mikrobiologie, HCI F 431, Wolfgang-Pauli-Str. 10, 8093 Zürich, SWITZERLAND
Date of Birth: 07 March 1981
Nationality: French
Language skills: French, English

CURRICULUM VITAE

EDUCATION AND QUALIFICATIONS

2007-2011 PhD student Swiss Federal Institute of Technology - ETH Zürich Metabolic pathways elucidation and System Biology of Methylotroph.

2006-2007 Master (year 2) University Paul Sabatier Structure/Function of Macromolecule and life process

2005-2006 Master (year 1) University Paul Sabatier SVS Biochemistry and Biotechnology

2004-2005 Licence University Paul Sabatier Biochemistry and Molecular Biology

WORK AND INTERNSHIP

July 2007 Engineer in structural biology IPBS(CNRS) Toulouse

Nov. 2006 Master (2) INTERNSHIP IPBS(CNRS) Toulouse Biochemical and structural characterization of a lectin in order to use it as nano-carrier.

Oct. 2005 Master (1) INTERNSHIP INSA Toulouse Adaptation of E. coli central metabolism to carbon nutrition.

ORAL COMMUNICATION AND POSTERS

Oral communication: May. 2011 Réseau Français de Métabolomique et Fluxomique Paris (FR) "Network-scale investigation of methanol metabolism in Methylobacterium extorquens upon mixed methylotrophy"

Sept. 2009 International Methylobacterium Meeting Zürich (CH) "Flux analysis upon methanol growth of M. extorquens AM1"

Jul. 2008 Gordon Research Conference Lewiston (USA) GRC: Molecular Basis of Microbial One-Carbon Metabolism "Demonstration of the pathway leading to glyoxylate generation using 13C metabolomics"

Posters: Jul. 2010 Gordon Research Conference Lewiston (USA) GRC: Molecular Basis of Microbial One-Carbon Metabolism "Methylotrophy from a system perspective: genome-wide reconstruction and functional analysis of the metabolism of Methylobacterium extorquens."

Mar. 2010 Systems Biology of Microorganisms Paris (FR) "Methylotrophy from a system perspective: genome-wide reconstruction and functional analysis of the metabolism of Methylobacterium extorquens."

Jul. 2008 Gordon Research Conference Lewiston (USA) GRC: Molecular Basis of Microbial One-Carbon Metabolism "Demonstration of the pathway leading to glyoxylate generation using 13C metabolomics"

PUBLICATION LIST

"Demonstration of the ethylmalonyl-CoA pathway by using 13C metabolomics." Peyraud R, Kiefer P, Christen P, Massou S, Portais JC, Vorholt JA. Proc Natl Acad Sci USA. 2009 106(12):4846-51

"Methylobacterium genome sequences: a reference blueprint to investigate microbial metabolism of C1 compounds from natural and industrial sources." Vuilleumier S, Chistoserdova L, Lee MC, Bringel F, Lajus A, Zhou Y, Gourion B, Barbe V, Chang J, Cruveiller S, Dossat C, Gillett W, Gruffaz C, Haugen E, Hourcade E, Levy R, Mangenot S, Muller E, Nadalig T, Pagni M, Penny C, Peyraud R, Robinson DG, Roche D, Rouy Z, Saenampechek C, Salvignol G, Vallenet D, Wu Z, Marx CJ, Vorholt JA, Olson MV, Kaul R, Weissenbach J, Médigue C, Lidstrom ME. PLoS One. 2009 4(5): e5584

In preparation “Genome-scale reconstruction and system level investigation of the metabolic network of Methylobacterium extorquens AM1” Peyraud R, Schneider K, Kiefer P, Massou S, Vorholt JA, Portais JC. BMC Systems Biology, under revision.

“Co-consumption of methanol plus succinate by Methylobacterium extorquens AM1” Peyraud R, Kiefer P, Christen P, Portais JC, Vorholt JA. In Preparation

“Metabolic characterization of the facultative methylotroph Methylobacterium extorquens AM1 on acetate” Schneider K, Peyraud R, Kiefer P, Christen P, Delmotte N, Massou S, Portais JC, Vorholt JA. In Preparation

AWARD

Publication Award - Institute of Microbiology (ETH) 2009
