Journal of Theoretical Biology 407 (2016) 303–317


Nature lessons: The whitefly bacterial endosymbiont is a minimal amino acid factory with unusual energetics

Jorge Calle-Espinosa a, Miguel Ponce-de-Leon a, Diego Santos-Garcia b, Francisco J. Silva c,d, Francisco Montero a,*, Juli Peretó c,e,**

a Departamento de Bioquímica y Biología Molecular I, Facultad de Ciencias Químicas, Universidad Complutense de Madrid, Ciudad Universitaria, Madrid 28045, Spain
b Department of Entomology, The Hebrew University of Jerusalem, Israel
c Institute for Integrative Systems Biology (I2SysBio), Universitat de València-CSIC, C/José Beltrán 2, Paterna 46980, Spain
d Departament de Genètica, Universitat de València, Spain
e Departament de Bioquímica i Biologia Molecular, Universitat de València, Spain

HIGHLIGHTS

A GSM model (iJC86) of Candidatus Portiera aleyrodidarum is constructed.
ATP production pathways in Candidatus Portiera aleyrodidarum are dissected in silico.
ATP production is associated with the production of carotenoids and/or amino acids.

ARTICLE INFO

Article history: Received 14 March 2016; Received in revised form 8 July 2016; Accepted 18 July 2016; Available online 26 July 2016

Keywords: Portiera aleyrodidarum; Elementary flux modes; Essential amino acids; Carotenoids

ABSTRACT

Reductive genome evolution is a universal phenomenon observed in endosymbiotic bacteria in insects. As the genome reduces its size and irreversibly loses coding genes, the functionalities of the cell system, including the energetic processes, become more restricted. Several energetic pathways can also be lost. How do these reduced metabolic networks sustain the energy needs of the system? Among the bacteria with reduced genomes, Candidatus Portiera aleyrodidarum, obligate endosymbiont of whiteflies, represents an extreme case since it lacks several key mechanisms for ATP generation. Thus, to analyze the cell energetics in this system, a genome-scale metabolic model of this endosymbiont was constructed, and its energy production capabilities were dissected using stoichiometric analysis. Our results suggest that energy generation is coupled to the synthesis of essential amino acids and carotenoids, crucial metabolites in the symbiotic association. A deeper insight showed that ATP production via carotenoid synthesis is also connected with amino acid production. This unusual association of energy production with anabolism suggests that, although minimized, endosymbiont metabolic networks maintain a remarkable biosynthetic potential.

© 2016 Elsevier Ltd. All rights reserved.

* Corresponding author.
** Corresponding author at: Institute for Integrative Systems Biology (I2SysBio), Spain.
E-mail addresses: [email protected] (F. Montero), [email protected] (J. Peretó).
http://dx.doi.org/10.1016/j.jtbi.2016.07.024

1. Introduction

Nutritionally driven symbioses between insects and one or more intracellular, maternally inherited bacteria (endosymbionts) are widespread (Baumann, 2005). These mutualistic associations allow insects to colonize novel ecological niches with unbalanced nutritional sources. The extreme difficulty of culturing obligate endosymbionts outside of their hosts has limited the study of the interactions between symbionts and hosts mainly to comparative genomics (Moya et al., 2008). Obligate endosymbionts usually have highly reduced genomes (<1000 kb, <600 genes), which contain a conserved core of housekeeping genes, and genes for the biosynthesis of amino acids and vitamins that are essential for the host (Baumann, 2005). The synthesis of these essential compounds requires, in many cases, metabolic complementation with the host or with other coexisting endosymbionts (McCutcheon and Moran, 2012). Although the nutritional role of obligate endosymbionts is well defined, other important aspects of the endosymbiont's biochemistry are less known. In spite of the fact that the metabolic networks of endosymbionts are relatively small, they are significantly complex and systems biology approaches are needed to further understand the physiology of these microorganisms. Earlier research approaches used techniques derived from graph theory and stoichiometric analysis. In particular, these techniques have been used to infer plausible events of metabolic complementation (Belda et al., 2012; Cottret et al., 2010; González-Domenech et al., 2012; Ponce-de-León et al., 2013; Thomas et al., 2009), to determine the fragilities of the metabolic networks (Belda et al., 2012; González-Domenech et al., 2012; Ponce-de-León et al., 2013; Thomas et al., 2009), to reveal characteristics of the transit from free-life to endosymbiosis (Belda et al., 2012; González-Domenech et al., 2012) and to further the understanding of [...] management in these systems (González-Domenech et al., 2012; Thomas et al., 2009).

However, to the best of our knowledge, it is still unclear to what extent endosymbionts depend upon their hosts for the energy necessary to support cellular processes, and no previous work (neither theoretical nor experimental) has focused on this aspect of the endosymbiont's biology. Specifically, can these bacteria be energetically independent from their host? And, if this is the case, how do they operate to perform such a task? The answer to these questions is important not only for an understanding of insect-bacteria symbiosis, but also in order to set the limits of genome reduction properly, a matter of major importance in fields such as synthetic biology.

Among the known endosymbionts, Candidatus Portiera aleyrodidarum (hereafter referred to as Portiera), hosted in the whitefly bacteriocyte (Thao and Baumann, 2004), displays a key characteristic that makes it a good candidate for a study in this direction: it contains the set of genes coding for cytochrome oxidase, NADH dehydrogenase, and ATP synthase, and has the most reduced genome among endosymbionts with these traits (Moran and Bennett, 2014). This feature suggests that Portiera is potentially energetically autonomous even though the endosymbiont metabolic network, in obligate cooperation with the host, is involved in the synthesis of essential amino acids and carotenoids (Luan et al., 2015; Rao et al., 2015; Santos-Garcia et al., 2015, 2012; Upadhyay et al., 2015). The results presented in this paper support the hypothesis that Portiera is capable of synthesizing its own ATP from the resources provided by the whitefly, through the utilization of a limited number of pathways related to the synthesis of well-defined amino acids and/or carotenoids. Moreover, the involvement of carotenoid synthesis in ATP production suggests that this pathway is a potential point for the insect to regulate amino acid production. Thus, the characterization of Portiera's metabolism demonstrates the richness of the energetics of endosymbionts.

2. Background

The structure of any metabolic network can be represented by its corresponding stoichiometric matrix N, where each element n_ij corresponds to the stoichiometric coefficient of metabolite m_i in reaction r_j. Additionally, along with N several constraints are needed to properly assess the properties of the associated metabolic network: the thermodynamic constraints, which prevent the irreversible reactions from taking negative flux values; the lower and upper bounds imposed over each reaction flux to represent specific environments (e.g. amount of available nutrients) or enzymatic/transporter maximum capacities; and the specific kinetic laws that govern each reaction of the network.

Unfortunately, the above-mentioned kinetic laws are often unknown. However, metabolism is characterized by usually fast reactions and high turnover of metabolites when compared to regulatory events and thus, a pseudo-stationary state (i.e. the metabolite concentrations are assumed constant) can be assumed as an approximation. Taking all restrictions into account, except the specific kinetic laws which are unknown, the set of all possible flux distributions (i.e. vectors of reaction fluxes), the so-called flux space (F), is defined as follows:

$$F = \left\{ \vec{v} \in \mathbb{R}^{n} : \frac{d\vec{c}}{dt} = N \cdot \vec{v} = \vec{0},\ \ \beta_j \le v_j \le \alpha_j\ \ \forall j \in J,\ \ \beta_j = 0\ \ \forall j \in Irr \right\} \qquad (1)$$

where v is the flux distribution, c is the vector of metabolite concentrations, β_j and α_j correspond to the lower and upper bound of reaction r_j, respectively; J is the set of reaction indexes; and Irr is the set of the irreversible reaction indexes.

Elementary flux modes (EFMs) are a special set of flux distributions which fulfil the constraints in Eq. (1) and are non-decomposable, that is, no reaction can be deleted from an EFM and still obtain a valid (non-trivial) steady-state flux distribution. The most interesting property of these EFMs is that their combinations (convex or linear depending on the existence of irreversibilities) can generate the complete flux space (i.e. all the flux distributions compatible with the stationary state and the irreversibility constraints). Accordingly, if a property is present in every EFM, and this property is conserved after combinations of these EFMs, it can be extrapolated to every possible flux distribution and thus, a proper analysis of these EFMs can lead to a better understanding of the capabilities of the related metabolic network. Unfortunately, the number of EFMs increases in a combinatorial way with the number of reactions. In consequence, only small metabolic networks are suitable for a complete EFM analysis in practice. However, when it is possible, this kind of analysis is one of the most exhaustive and reliable achievable.

3. Results

3.1. Genome-scale metabolic reconstruction

Metabolic reconstruction was based on the annotated pan-genome, derived from six previously published different strains of Portiera (Sloan and Moran, 2012 and references therein). The pan-genome contains a total of 280 genes, of which 160 were annotated as coding genes. A first draft of the metabolic network was generated using the SEED (Overbeek et al., 2005) pipeline. The draft obtained (id: Seed1206109.3.1165) included 136 internal reactions, 14 exchange fluxes and one biomass equation. Although the SEED pipeline includes a step for gap-filling the missing reactions, which are essential for the synthesis of all biomass precursors, the outputted model was not able to predict biomass formation. Moreover, 95% of the reactions were found to be blocked. Thus, in order to obtain a consistent model, a detailed manual curation was performed using the unconnected modules approach (Ponce-de-León et al., 2013) (see Appendix A for details).

During the curation a total of 13 enzymatic activities, all related to amino acid biosynthesis and the pentose phosphate pathway (PPP) and coded by 10 genes, mostly contributed by the host (Luan et al., 2015; Santos-Garcia et al., 2015; Upadhyay et al., 2015), were added to the model (Table 1). A total of 39 exchange fluxes were also incorporated in the model, which account for the incorporation of glucose, pyruvate and/or oxaloacetate, α-ketoglutarate (which can also be produced in Portiera through glutamate deamination), geranylgeranyl diphosphate, the amino acids not synthesized by Portiera and small molecules (e.g. ammonia) from the insect; and the exportation of the amino acids produced by Portiera and the associated by-products (Table 2). Because no nucleotide or [...] is produced in Portiera, they were included neither in the biomass equation nor as exchange fluxes. To assess

Table 1
Orphan reactions of the iJC86 model of Portiera's metabolism.

Reaction | Gene | Pathway
histidinol dehydrogenase | hisD | histidine biosynthesis
histidinal dehydrogenase | hisD | histidine biosynthesis
phosphoenolpyruvate synthetase ▲ | pps | chorismate biosynthesis
phosphoenolpyruvate carboxykinase ▲ | pck | chorismate biosynthesis
transaldolase | talB | pentose phosphate pathway
ribose-5P-isomerase | rpiA/B | pentose phosphate pathway
phosphogluconolactonase | spontaneous/pgl | pentose phosphate pathway
ribulose-5P-3-epimerase * | rpe | pentose phosphate pathway
glucose-6P-isomerase * | pgi | pentose phosphate pathway
leucine transaminase | ilvE | BCA biosynthesis
isoleucine transaminase | ilvE | BCA biosynthesis
valine transaminase | ilvE | BCA biosynthesis
carbonic anhydrase | yadF/cynT/spontaneous | arginine biosynthesis

Reactions in bold could be substituted by a proper complementation with the host. Reactions labelled with a (*) are not essential for biomass production by iJC86. At least one of the reactions labelled with a (▲) must be active for biomass production by iJC86. BCA, branched-chain amino acid.
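As an illustration of the flux space defined in Eq. (1) of the Background section, the following minimal sketch checks whether a candidate flux vector satisfies the steady-state condition N·v = 0 together with the flux bounds. The three-reaction toy network, the bound values and the function name `in_flux_space` are assumptions made for illustration only; they are not part of the iJC86 model.

```python
def in_flux_space(N, v, lower, upper, tol=1e-9):
    """Return True if v fulfils N.v = 0 and lower_j <= v_j <= upper_j (Eq. 1)."""
    # Steady state: every metabolite balance (one row of N) must sum to zero.
    steady = all(
        abs(sum(n_ij * v_j for n_ij, v_j in zip(row, v))) <= tol for row in N
    )
    # Bounds: irreversible reactions are encoded with a lower bound of zero.
    bounded = all(
        lo - tol <= v_j <= up + tol for v_j, lo, up in zip(v, lower, upper)
    )
    return steady and bounded


# Hypothetical toy network with metabolites A and B (rows) and three
# irreversible reactions (columns): r0: -> A, r1: A -> B, r2: B ->
N = [
    [1, -1, 0],   # balance of A
    [0, 1, -1],   # balance of B
]
lower = [0.0, 0.0, 0.0]     # irreversibility: beta_j = 0
upper = [10.0, 10.0, 10.0]  # arbitrary capacity, chosen for the example

print(in_flux_space(N, [2.0, 2.0, 2.0], lower, upper))     # balanced and within bounds
print(in_flux_space(N, [2.0, 1.0, 1.0], lower, upper))     # A accumulates
print(in_flux_space(N, [-1.0, -1.0, -1.0], lower, upper))  # violates irreversibility
```

In this toy case the only EFM is the linear route r0, r1, r2 carried at equal flux; any admissible flux distribution is a non-negative multiple of it, up to the capacity bound.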

Table 2
Characteristics of the iJC86 model of Portiera's metabolism.

Product | Inputs | Outputs
leucine | D-glucose, pyruvate/oxaloacetate, ADP | α-ketoglutarate, AMP
valine | D-glucose, pyruvate/oxaloacetate, ADP | α-ketoglutarate, AMP
isoleucine | D-glucose, pyruvate/oxaloacetate, aspartate, ADP | α-ketoglutarate, AMP
tryptophan | D-glucose, pyruvate/oxaloacetate, ADP, serine | α-ketoglutarate, AMP
phenylalanine | D-glucose, pyruvate/oxaloacetate, ADP | α-ketoglutarate, AMP
threonine | D-glucose, aspartate, ADP | α-ketoglutarate, AMP
lysine | D-glucose, pyruvate/oxaloacetate, aspartate, ADP | succinate, α-ketoglutarate, AMP
arginine | aspartate, glutamine, ornithine, ADP | fumarate, AMP
histidine | D-glucose, glutamine, ADP | AICAR, α-ketoglutarate, AMP
β-carotene | geranylgeranyl diphosphate | -
BIOMASS | D-glucose, pyruvate/oxaloacetate | fumarate, succinate, α-ketoglutarate
 | alanine, proline, glycine, serine, glutamine, glutamate, asparagine, aspartate, cysteine, methionine, tyrosine, ornithine | -
 | ubiquinone-8/menaquinone-8, lipoamide, coenzyme A, NAD⁺, NADP⁺, TPP | -
 | nitrogenous bases and derivatives, lipids | AICAR, AMP

The first block displays the inputs and outputs needed for β-carotene and each amino acid to be produced. All the compounds in this block are essential for the whitefly. The second block contains those compounds (excluding ions) which are essential for the network to be functional and produce biomass. TPP, thiamine pyrophosphate; AICAR, 5-amino-1-ribofuranosyl-imidazole-4-carboxamide.

the essentiality of the orphan reactions and the exchange fluxes, an in-silico knock-out analysis was performed (see Section 5.3). The result showed that all exchange fluxes were essential for every amino acid and β-carotene to be produced, and that only ribulose-phosphate 3-epimerase (EC 5.1.3.1) and glucose 6-phosphate isomerase (EC 5.3.1.9) do not significantly affect the predicted growth (Fig. A.10; labelled with * in Table 1). However, these activities are required to uncouple the biosynthesis of phenylalanine and tryptophan (see Section 3.3 and Fig. 3). In contrast, the synthesis of phosphoenolpyruvate (PEP) and biomass production depend on either one of two orphan reactions: PEP synthase (EC 2.7.9.2) or PEP carboxykinase (EC 4.1.1.49; labelled with ▲ in Table 1), although the uptake of this metabolite from the host cannot be ruled out. Regarding the synthesis of branched-chain amino acids, BCA (Figs. A.6 and A.7), the behaviour of the model does not change whether the endosymbiont or the host performs the transamination step. Finally, the last step of histidine biosynthesis, catalysed by histidinol dehydrogenase (EC 1.1.1.23), may be carried out by the host even though it is included in the endosymbiont network (Fig. A.2).

As a consequence of the manual curation, a flux-consistent (i.e. every reaction can carry flux in the stationary state) genome-scale model of Portiera predicting biomass formation, named iJC86, was obtained (see Supplementary Table 1).

3.2. Assessing the metabolic capabilities of Portiera

According to iJC86 in-silico predictions, Portiera is able to produce β-carotene and nine amino acids that are essential for the whitefly (see Table 2). Although the majority of enzyme-coding genes for these pathways have been found in the endosymbiont genome, some of the reactions are orphans. These orphan reactions include the transamination step at the end of the BCA biosynthesis, as well as the last two steps in the histidine pathway (Table 1). The stoichiometric analysis of iJC86 suggested that certain intermediate metabolites of these pathways must be exchanged between the endosymbiont and its host. In view of these observations, different scenarios of metabolic complementation are proposed (Fig. 1).

The first one is related to the exchange of glutamate and α-ketoglutarate. The whitefly provides glutamate to Portiera, which spends it on several transamination reactions. The α-ketoglutarate produced as a by-product can either be used as a precursor in the lysine biosynthesis pathway or transported to the insect. A second metabolic complementation event appears as a consequence of the by-product formation of 5-amino-1-ribofuranosyl-imidazole-4-carboxamide (AICAR) during histidine biosynthesis. Although AICAR is also a precursor in the de novo synthesis of purine bases, Portiera lacks the genes associated with this pathway. Accordingly, AICAR might be exported to the bacteriocyte cytoplasm, where it could be used by the host metabolism.

The synthesis of β-carotene also involves metabolic collaboration between Portiera and the whitefly. According to the model, the biosynthetic precursor geranylgeranyl diphosphate should be imported from the bacteriocyte and converted to β-carotene with the concomitant reduction of ubiquinone to ubiquinol, which is reoxidized by a membrane-associated ubiquinol oxidase (EC 1.10.3.10), generating a proton-motive force. The last complementation suggested by the model involves the bacteriocyte tricarboxylic acid (TCA) cycle, a pathway absent in the endosymbiont. In Portiera, the synthesis of lysine and arginine entails the production of succinate and fumarate, respectively. However, since the endosymbiont lacks any reaction able to

Fig. 1. Metabolic complementations suggested by the iJC86 model. Exchange of glutamate and α-ketoglutarate: the endosymbiont uses glutamate for transamination reactions and the produced α-ketoglutarate is exported to the bacteriocyte. Exchange of AICAR and purine bases (and their derivatives): the endosymbiont generates AICAR as a by-product of histidine biosynthesis and exports it to the bacteriocyte, where it is recycled into purine bases for the use of both the endosymbiont and its host. β-carotene production: the whitefly provides geranylgeranyl diphosphate for the synthesis of β-carotene. This process requires the reduction of quinone to quinol, which is then oxidized by the respiratory chain. Exportation of succinate and fumarate: the synthesis of lysine and arginine involves the production of succinate and fumarate, respectively, compounds that must be consumed by the host. gg-PPi, geranylgeranyl diphosphate; Isocit, isocitrate; Cit, citrate; Oxaloac, oxaloacetate; Mal, malate; Fum, fumarate; Succ, succinate; Succ-CoA, succinyl-CoA; α-ketoglu, α-ketoglutarate; AICAR, 5-amino-1-ribofuranosyl-imidazole-4-carboxamide; PRPP, phosphoribosyl pyrophosphate; Orn, ornithine; GS, glutamine synthetase; GOGAT, glutamine:α-ketoglutarate aminotransferase.

consume these compounds, a plausible scenario is that both metabolites are exported to the bacteriocyte's cytoplasm.

3.3. Analysis of energy-production pathways through elementary flux modes analysis

One remarkable feature of Portiera's metabolism is that it lacks several canonical energy-conserving pathways, namely glycolysis, a complete TCA cycle, and fatty-acid oxidation. Furthermore, only a fragmented PPP and the production of succinyl-CoA and acetyl-CoA are retained in the endosymbiont. This striking characteristic is shared with other endosymbionts of equal or smaller genome size (e.g. Carsonella, Sulcia, Uzinura, Tremblaya and Hodgkinia, see Moran and Bennett, 2014). Because Portiera can synthesize the same number of amino acids as endosymbionts with larger genomes, the characterization of its energetics could serve as a model for the study of the endosymbionts harbouring the most reduced genomes with equal or more limited biosynthetic capabilities. To achieve this goal, a general method based on elementary flux modes (EFMs) analysis was developed.

Briefly, an EFM contains a minimal and unique set of enzymatic reactions that can support cellular functions at steady state and, more importantly, any metabolic behaviour under this assumption can be expressed as a combination of these EFMs. Thanks to this property, the evaluation of the EFMs associated with the production of energy, referred to as energy-producing EFMs, allows the characterization of every energetic pathway in the model.

Definition I. An energetic EFM (enEFM) of a metabolic network is an EFM that produces a net amount of energy in the form of ATP.

The objective now is to make the information contained in the enEFMs comprehensible and non-redundant. To achieve this goal, the characterization of the enEFMs was based on the main metabolites produced by the metabolism under study:

Definition II. A product selection of a metabolic network is the set of compounds produced by this network considered in a particular study.

Thus, the product selection chosen for this study (see Section 3.4) is composed of leucine, isoleucine, valine, threonine, histidine, histidinol, arginine, lysine, phenylalanine, tryptophan and β-carotene.

The main idea of the analysis proposed below is that, if every enEFM produces at least one of these compounds, ATP generation must also be associated with the corresponding synthesis pathways. But how can the specific pathways responsible for ATP production be determined? To answer this question the following definitions are set:

Definition III. The production of an EFM is the subset of the product selection that is synthesized in this EFM.

Definition IV. The order of an enEFM is the cardinal of its production.

To illustrate Definitions III and IV, an enEFM whose exchange fluxes are all zero except those corresponding to carbon dioxide, leucine and histidine is considered. Carbon dioxide is not part of the product selection (see the example associated with Definition II) and thus, the production of this enEFM is the set leucine/histidine. Because the cardinal (i.e. number of elements) of this production is two, this is also the order of the enEFM.

Definition V. A subset C of the product selection is minimal and energetic if, and only if, it is the production of an order k enEFM and no enEFM producing a subset of C exists with an order less than k.

Continuing with the exemplification, and attending to

Supplementary Table 2, β-carotene is produced alone in one enEFM. Thus, because no order zero enEFM exists, β-carotene is a minimal energetic set of cardinal one. In contrast, the set arginine/leucine/histidine is not a minimal energetic set because order two enEFMs exist which produce leucine and histidine, a subset of the set under consideration. Precisely, the set leucine/histidine is minimal and energetic because no order one enEFM exists which produces either leucine or histidine.

Definitions III to V allow a first insight into the energetics by obtaining the smallest ATP-producing modules regarding a set of products of interest. That is, for ATP to be generated, at least the production (amino acids and/or β-carotene in the previous example) of one of the minimal energetic sets must also be synthesized.

Three important considerations have to be taken into account when discussing the physiological implications of a set of compounds being minimal and energetic. First, several enEFMs can be associated with a particular minimal energetic set if different reactions are employed in the synthesis of the compounds in the set. This means that the energetic properties corresponding to a minimal energetic set may not be unique and all the possibilities need to be taken into account. A good example of this situation is displayed in Table 3, in which the enEFMs whose productions are the leucine/phenylalanine set can be grouped in three classes according to their energetic properties (this case is thoroughly discussed in Section 3.4). Second, the ATP generation associated with a minimal energetic set can be concentrated in the reactions leading to a particular compound in the set. An example of this situation (discussed in detail further in this section) is the leucine/histidinol set, in which the reactions responsible for NADH (and thus ATP) generation are linked exclusively to leucine synthesis, while histidinol production provides the NADPH. Exploring the idiosyncrasy of each particular minimal energetic set requires a deep knowledge of the network from which it was obtained. However, trying to fragment the corresponding enEFM through cofactors or similar metabolites can be a good strategy to unveil the specific role of the synthesis of each compound of the minimal energetic set in its enEFM.

The third consideration, closely related to the second one, is that those compounds not included in any minimal energetic set can still be energetic in the sense that their synthesis pathway, provided certain compounds obtained along the production of its escorts are available, generates ATP. To distinguish whether a subset of a product selection is energetic but not minimal, or is a composition of the energy-producing pathways associated with a collection (possibly one) of minimal energetic sets and the ATP-dependent biosynthesis of other compounds, the following criteria are set:

Definition VI. A set of reactions whose influence is considered as non-informative (e.g. water or CO2 transport) is described as a trivial set.

Definition VII. An ATP-decoupling of a certain enEFM is the collection of EFMs obtained from a model composed of the original enEFM's reactions, with their corresponding directions, plus the trivial set, and a reaction that freely recycles ADP to ATP.

Such ATP-decoupling produces the original order-k enEFM plus trivial variations of this enEFM, and possibly several EFMs of lower order (k' < k).

Definition VIII. A subset of the product selection is defined as energetic in a certain order k enEFM if its production cannot be obtained by the union of productions of a collection of order k' < k EFMs obtained after ATP-decoupling this enEFM.

The basis of these criteria is that if the synthesis of a particular set of compounds is coupled to an energetic set just through the ATP produced during such synthesis, this coupling is lost if a free ATP source is added to the metabolic model. Moreover, enEFMs can appear as the coupling of minimal energetic sets through metabolites of minimal interest in a particular context (e.g. water). By defining a set of irrelevant reactions these trivial couplings can be avoided.

But how can the exploration of the ATP-decoupled enEFM help us determine whether this enEFM provides essential information regarding the energetics of the network? The answer depends on whether the production of such enEFM can be recovered from the union of lower order EFM productions that results from its ATP-decoupling. If it can, then the enEFM is not as simple as possible and therefore it does not provide elementary information about the network energetics. If it cannot, then it must be kept in a comprehensive description of the network energetics.

In general, an enEFM is ATP-decoupled into two kinds of EFMs: those dependent on the free ADP recycling reaction (thus energy-dependent), and those independent of this reaction (and therefore non energy-dependent). The former, i.e. the energy-dependent EFMs, by definition, do not provide information regarding energy production in the network and can be ignored.

As to the non energy-dependent EFMs, they can either produce

Table 3
Properties of the enEFMs related to the energetic sets of the iJC86 model of Portiera's metabolism.

Order | Productions | Ratio | ΔATP | ΔNADH | ATPcons | H⁺ gap | Trans. | hisD | pps | pck | talB | rpiA/B | rpe | pgi
one | beta-Carotene | 1 | 2.5 | 0 | 0 | 0 | 5 | - | - | - | - | - | - | -
two | leu-L, his-L | 2:1 | 2.33 | 2 | 1 | 1.667 | 8 | + | - | - | - | + | - | -
two | phe-L, leu-L | 1:3 | 1.81 | 1.5 | 1 | 0.25 | 6 | - | - | + | + | + | + | +
two | phe-L, leu-L | 1:3 | 1.56 | 1.5 | 1 | 1.25 | 7.5 | - | + | - | + | + | + | +
two | leu-L, histidinol | 2:1 | 1.42 | 1.33 | 1 | 0.667 | 7.667 | - | - | - | - | + | - | -
two | his-L, lys-L | 1:1 | 0.5 | 1.5 | 2 | 2.5 | 8.5 | + | - | - | - | + | - | -
two | val-L, his-L | 2:1 | 0.17 | 0.67 | 1 | 1 | 7.333 | + | - | - | - | + | - | -

For each order (first column), the corresponding amino acid sets (β-carotene for order one) are displayed in the second column, while the associated properties are listed in the next ones. The properties listed are net ATP produced (ΔATP); net NADH production (ΔNADH); ATP consumed (ATPcons); proton gap (H⁺ gap) and number of transported molecules (Trans.). Data are based on the production of 1 arbitrary flux unit of amino acid. The requirements for each enEFM regarding the presence (+) or absence (-) of a variant-specific gene are indicated as follows: requirement of histidinol dehydrogenase (hisD); pyruvate-dependent phosphoenolpyruvate production (pps); oxaloacetate-dependent phosphoenolpyruvate production (pck); requirement of transaldolase B (talB); requirement of ribose-5-phosphate isomerase (rpiA/B); requirement of D-ribulose-5-phosphate epimerase (rpe); requirement of D-glucose-6-phosphate isomerase (pgi). The number of enEFMs associated with each energetic amino acid set and each particular property are displayed in the last column. For every second order EFM the contribution of each amino acid is displayed (see Ratio column).
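The set-based Definitions III to V and the ATP-decoupling criterion of Definition VIII can be sketched in a few lines of code. The sketch below is illustrative only: the exchange-flux dictionaries, their numerical values and the sign convention (positive flux means export) are assumptions, and the function names are hypothetical. The Definition VIII check follows the worked examples of Section 3.3, in which a production counts as energetic when it cannot be recovered from the union of lower-order productions.

```python
# Product selection listed in Section 3.3.
PRODUCT_SELECTION = frozenset({
    "leucine", "isoleucine", "valine", "threonine", "histidine", "histidinol",
    "arginine", "lysine", "phenylalanine", "tryptophan", "beta-carotene",
})


def production(exchange_fluxes, selection=PRODUCT_SELECTION, tol=1e-9):
    """Definition III: selected compounds exported by an EFM (flux > 0 assumed = export)."""
    return frozenset(
        m for m, flux in exchange_fluxes.items() if m in selection and flux > tol
    )


def order(prod):
    """Definition IV: cardinal of the production."""
    return len(prod)


def minimal_energetic_sets(productions):
    """Definition V: productions of enEFMs with no lower-order enEFM producing a subset."""
    prods = set(productions)
    # `p < c` is the proper-subset test, so an empty production (order zero)
    # would rule out every other set, as in the text.
    return {c for c in prods if not any(p < c for p in prods)}


def is_energetic_in(prod, decoupled_productions):
    """Definition VIII: prod is not the union of lower-order decoupled productions."""
    lower = [p for p in decoupled_productions if p < prod]
    return frozenset().union(*lower) != prod


# Hypothetical enEFMs, described only by their non-zero exchange fluxes.
enefm_exchanges = [
    {"beta-carotene": 1.0},
    {"leucine": 2.0, "histidine": 1.0, "CO2": 3.0},
    {"arginine": 1.0, "leucine": 2.0, "histidine": 1.0},
]
prods = [production(e) for e in enefm_exchanges]
minimal = minimal_energetic_sets(prods)

# ATP-decoupling outcomes taken from the two examples discussed in Section 3.3.
arg_leu_his_energetic = is_energetic_in(
    prods[2], [frozenset({"arginine"}), prods[1], prods[2]]
)
leu_his_energetic = is_energetic_in(prods[1], [prods[1]])
```

With these inputs, carbon dioxide is ignored because it is outside the product selection, the minimal energetic sets are β-carotene and leucine/histidine, and the arginine/leucine/histidine set is not energetic in its enEFM whereas leucine/histidine is.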

Fig. 2. Representation of the energetics of Portiera according to the iJC86 model. The only zero order enEFM of the iJC86 model involves the synthesis of β-carotene. In this pathway the geranylgeranyl diphosphate is oxidized by quinones, which are then incorporated into the respiratory chain in the form of quinol, generating ATP through oxidative phosphorylation. Every non-zero order enEFM is associated with the production of at least two amino acids, one associated with the production of NADPH (phenylalanine or histidine/histidinol) and the other with its consumption (leucine, valine or lysine). The number of ATP molecules generated (red)/consumed (blue) per molecule of amino acid or amount of β-carotene produced is displayed.

net ATP (and thus are enEFMs in the original model) or not (i.e. ATP consumption and production coincide). Thus, the former have to be analyzed by the same methods until no further decomposition is possible. When all the original model's enEFMs have been analyzed in this way, every essential energetic pathway of the network will have been identified.

To end this section, let us consider a particular enEFM whose production is arginine/leucine/histidine as an example to illustrate the criteria established in Definition VIII. The ATP-decoupling of this enEFM results in, e.g., three EFMs whose productions are arginine, leucine/histidine and arginine/leucine/histidine. Because in this case the set arginine/leucine/histidine can be obtained from the first two EFMs, no coupling between the three amino acids exists that cannot be explained by those of the arginine and leucine/histidine EFMs, an energetic association, or the reactions of the trivial set. In consequence, the arginine/leucine/histidine set cannot be considered energetic in the enEFM under study (note that it can be energetic in other enEFMs).

On the other hand, let us suppose that the ATP-decoupling of a certain enEFM whose production is leucine/histidine yields only one EFM, identical to the original enEFM. Thus, it can be assured that a coupling between at least one leucine and one histidine synthesis pathway exists which cannot be explained by an energetic association or via reactions of the trivial set. Therefore, the set leucine/histidine is an energetic set which provides non-trivial information regarding the energetics of the metabolism under study.

3.4. Dissecting Portiera's energetics

A detailed study of the energetics of Portiera was performed by analyzing the elementary flux modes (EFMs) of the iJC86 model according to Section 3.3. Although the model presented in Section 3.1 was built disregarding metabolic complementations implying phosphorylated and/or high molecular weight metabolites, the energetics of the system was studied taking into account every possibility compatible with the synthesis of all amino acids and β-carotene. In particular, the effects of the presence/absence of the orphan reactions displayed in Table 1 were considered, with the exception of those that can be spontaneous and the transamination step of the branched chain amino acids. Note that no difference in energetics can be induced by this absence. In addition, exchange fluxes corresponding to phosphoenolpyruvate and all the sugars involved in the PPP were considered (Section 5.4). The new model considering all these possibilities is referred to hereafter as the iJC86v model.

The iJC86v model accounts for 1,496,762 EFMs, of which only 10,680 are enEFMs. To completely characterise the energetics associated to these enEFMs, each one was assigned five properties: net ATP production; net NADH production; H+ gap (net cytosolic H+ consumption not considering the cytosolic H+ lost by pumping reactions); ATP consumed; and the number of molecules transported through the membrane (see Supplementary Table 2 for a complete list of the enEFMs and their properties). The relation between these properties is deduced and discussed in Appendix B.

After applying the criteria established in Definition VIII, a total of 3154 informative enEFMs associated to 28 energetic sets were found in the iJC86v model. These 3154 enEFMs can be grouped in 1968 classes considering their energetic properties, and the presence/absence of the orphan reactions and exchange fluxes considered in the study (Supplementary Table 3).

The analysis of the 3154 informative enEFMs is divided in two stages. First, the enEFMs which exchange neither phosphoenolpyruvate nor phosphorylated sugars of the PPP are studied. As can be deduced, these enEFMs define the energetics of the original iJC86 model if the export of histidinol is allowed. Once the energetics of the iJC86 model is characterized, the new possibilities in the iJC86v model are discussed. The characteristics of the iJC86 model's enEFMs are given in Table 3, and a representation of the associated energetics is displayed in Fig. 2.

Only one enEFM of order one was found among those not dependent on the PPP. This enEFM comprises the synthesis of β-carotene from the geranylgeranyl diphosphate imported from the host (Fig. 15 in Appendix A). In this pathway four double bonds are formed, each of which requires the concomitant reduction of a quinone to quinol. These four quinol molecules are incorporated and reoxidized in the respiratory chain, and 2.5 molecules of ATP per β-carotene synthesized may be produced.

While only one enEFM of order one has been found, a total of seven order two enEFMs were identified. These enEFMs involve four distinct pairs (five if histidinol is considered) of amino acids (see Table 3): leucine/histidine; valine/histidine; lysine/histidine; and phenylalanine/leucine.
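As an illustration of the bookkeeping used in this section, the following minimal sketch shows how flux modes could be screened by net ATP production, order and ATP per product. It assumes that an enEFM is simply an EFM with positive net ATP and that the order is the number of distinct exported products; the class name, the function names and all numbers except the 2.5 ATP per β-carotene quoted above are hypothetical and are not taken from the iJC86 model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FluxMode:
    """Hypothetical summary of one elementary flux mode (EFM)."""
    name: str
    atp_produced: float   # ATP generated by the mode
    atp_consumed: float   # ATP spent by the mode
    products: tuple       # (product name, molecules exported) pairs

    @property
    def net_atp(self) -> float:
        return self.atp_produced - self.atp_consumed


def is_energetic(mode: FluxMode, tol: float = 1e-9) -> bool:
    """Assumed criterion: an enEFM is an EFM with positive net ATP."""
    return mode.net_atp > tol


def order(mode: FluxMode) -> int:
    """Assumed reading of 'order': number of distinct exported products."""
    return len({name for name, amount in mode.products if amount > 0})


def atp_per_product(mode: FluxMode) -> float:
    """Net ATP divided by the total number of exported product molecules."""
    exported = sum(amount for _, amount in mode.products if amount > 0)
    return mode.net_atp / exported


# Toy modes: only the beta-carotene yield (2.5 ATP) comes from the text.
modes = [
    FluxMode("carotene", 2.5, 0.0, (("beta-carotene", 1.0),)),
    FluxMode("balanced", 3.0, 3.0, (("arginine", 1.0),)),
    FluxMode("pair", 6.0, 2.0, (("leucine", 1.0), ("histidine", 1.0))),
]
energetic = [m for m in modes if is_energetic(m)]
```

In this toy set the "balanced" mode is discarded because its ATP consumption and production coincide, while the other two modes are kept and can then be compared by order and by ATP per product.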
These pairs can be classified in two groups: histidine (or histidinol) containing pairs, and phenylalanine/leucine pairs.

Of the three histidine-containing pairs, only the leucine/histidine pair generates energy in the absence of the histidinol oxidoreductase activity (leu-L/histidinol pair in Table 3). In addition, only the amount of ATP generated by the synthesis of leucine/histidine is significant: 2.33 mol of ATP (1.42 if the histidinol oxidoreductase is discarded) per mol of leucine/histidine produced, versus 0.17 in the case of valine/histidine and 0.5 in the case of lysine/histidine. This suggests that, in the histidine-containing pairs group, only the set involving histidine and leucine can be considered energetic with reasonable certainty. The possibility of a certain amount of active transport (i.e. the incorporation or export of particular metabolites implies an ATP cost) suggests that the other histidine-containing pairs (which have a rather small net ATP production) are probably not energetic but, at least, cheap to produce.

Since the synthesis of phosphoenolpyruvate (essential precursor in aromatic amino acid production pathways) relies on either pyruvate or oxaloacetate, the four phenylalanine/leucine enEFMs can be classified into two groups: pyruvate-dependent or oxaloacetate-dependent. The reactions involved in each case are:

\[ \text{pyruvate} + \text{ATP} + \text{H}_2\text{O} \rightarrow \text{phosphoenolpyruvate} + \text{AMP} + \text{H}^+ + \text{P}_i \]

\[ \text{oxaloacetate} + \text{ATP} + \text{H}^+ \rightarrow \text{phosphoenolpyruvate} + \text{ADP} + \text{CO}_2 \]

Because the iJC86 model assumes that the recycling of AMP to ADP depends on the host, regarding energetics there are two significant differences between the two groups of phenylalanine/leucine enEFMs. Firstly, the usage of pyruvate involves more transport processes compared to oxaloacetate due to the recycling of AMP in the insect cytoplasm. Secondly, while the phosphoenolpyruvate synthesis via pyruvate generates one cytosolic H+, if oxaloacetate is used instead, one cytosolic H+ is consumed. These differences explain why distinct enEFMs were identified, since the usage of oxaloacetate/pyruvate implies different reactions.

Another important property of the phenylalanine/leucine producing enEFMs is that they all depend on the presence of two non-essential orphan reactions. The absence of either D-glucose-6-phosphate isomerase or D-ribulose-5-phosphate-3-epimerase entails the coupling of phenylalanine and tryptophan production (Fig. 3). Because tryptophan does not appear in any of the selected enEFMs after applying Definition VIII, this coupling could seriously restrict not only the phenylalanine supply to the whitefly, but also the energetics of the endosymbiont. If we focus on the viability of the enEFMs studied above, oxaloacetate-dependent ones produce 1.81 ATP per mol of amino acid while pyruvate-dependent ones produce 1.56 ATP per mol of amino acid. In this case the difference is slight, and both enEFMs can be considered energy-producing pathways.

Fig. 3. Graph representing the coupling between phenylalanine and tryptophan biosynthesis in the iJC86 model of Portiera metabolism if D-ribulose-5-phosphate-3-epimerase and D-glucose-6-phosphate isomerase are not included as orphan reactions. Metabolites are represented with their abridged name and reactions are represented as softened rectangles containing the corresponding EC number. ru5p-D_c, ribulose-5-phosphate; g3p_c, glyceraldehyde-3-phosphate; s7p_c, sedoheptulose-7-phosphate; f6p_c, fructose-6-phosphate; e4p_c, erythrose-4-phosphate; xu5p-D_c, xylulose-5-phosphate.

As has been shown, the energetic properties of phenylalanine/leucine pairs in the iJC86 model are significantly affected by the way phosphoenolpyruvate is produced. As can be seen in Supplementary Table 3, if this molecule is assumed to be provided directly by the insect, neither phosphoenolpyruvate synthetase nor phosphoenolpyruvate carboxykinase is needed any longer. Moreover, the net ATP production associated to the pair is increased to 2.06 ATP per mol of amino acid, making this pathway a great alternative to the synthesis of both β-carotene (2.5 ATP per mol of β-carotene) and the leucine/histidine pair (2.33 ATP per mol of amino acid). An additional effect of allowing phosphoenolpyruvate import is that tryptophan appears in two energetic sets of order three: phenylalanine/tryptophan/leucine and tryptophan/leucine/histidine. However, the amount of ATP per mol of amino acid produced is negligible (0.38 and 0.08 respectively) compared to the amount of transported molecules (7 and 8 respectively), and the structure of the enEFMs is easily explained through the arrangement of carbon fragments obtained after glucose decarboxylation (and thus NADPH production) in the oxidative branch of the PPP.

When the 1957 energy-producing pathways that require the exchange of phosphorylated sugars are coarsely considered, two interesting facts arise from their comparison to the previously discussed enEFMs. First of all, several new energetic sets of one amino acid appear (Supplementary Table 3): leucine, lysine, valine and histidine. These energetic sets are associated to a total of 168 enEFMs. However, if the specific active orphan reactions and exchanges, and the number of transported molecules are ignored, only 16 classes of enEFMs are found (Supplementary Table 4). Moreover, the differences between the production pathways of each amino acid are restricted to the ATP consumed in its synthesis. This fact, along with the inspection of Figs. 2 and 3, 6 and 10 in Appendix A, shows that the appearance of these order one enEFMs and their differences regarding net ATP production can be explained by the NADPH involved in these pathways.

Leucine, valine and lysine consume NADPH in their synthesis, and the origin of this NADPH is necessarily the glucose decarboxylation (no other reaction can produce it in this system). If no phosphorylated sugar interchange is allowed, the carbon skeletons produced after glucose decarboxylation cannot be excreted without processing. However, this processing implies their integration into two possible synthesis pathways: the chorismate synthesis pathway via erythrose phosphate or the histidine synthesis pathway via ribose phosphate. Thus, the production of either leucine, valine or lysine is coupled with the synthesis of either histidine or any of the chorismate-dependent amino acids (phenylalanine and tryptophan). However, if the exchange of phosphorylated sugars is allowed, glucose is decarboxylated, NADPH is produced concomitantly and the remaining skeleton can be excreted directly. In consequence, the order one enEFMs displayed in Supplementary Table 4 appear. As can be easily deduced, the differences in the ATP consumed in these enEFMs arise from the different ways in which the glucose carbon skeleton is manipulated in the PPP before its excretion.

On the other hand, histidine production requires ribose phosphate, a sugar synthesized from ribulose phosphate, the direct product of glucose decarboxylation. Thus, histidine production generates NADPH in excess. Because the redox balance must be maintained in the cell, this NADPH needs to be recycled by NADPH-dependent routes like those of leucine, valine and lysine synthesis. If a direct input of phosphorylated sugars to the endosymbiont is allowed, alternative ways to produce ribose phosphate bypass the NADPH generation. The emergence of order one enEFMs with histidine as their production is, therefore, explained. Finally, and as in the case of leucine, lysine and valine, the differences in the ATP consumed in these enEFMs come from the different ways in which the PPP is operating.

In line with the previous reasoning, the net ATP per amino acid produced in several of the new order one enEFMs is higher than in their order two counterparts when the exchange of neither phosphoenolpyruvate nor phosphorylated sugars of the PPP is allowed (iJC86 model): leucine generates between 2.38 and 3.5 ATP per amino acid, lysine between 0.25 and 1, valine between 0.06 and 0.25, and histidine between 0.25 and 0.75. According to these numbers, the synthesis of leucine and perhaps lysine and histidine can be considered true engines in Portiera's metabolism if the all-passive transport assumption does not hold and a certain ATP cost is attached to substrate incorporation and/or product excretion.

The second interesting fact regarding the enEFMs associated to phosphorylated sugar exchanges is that two new amino acids appear in the energetic sets of the iJC86v model: isoleucine and threonine. Isoleucine always appears associated to either histidine, phenylalanine and histidine, or tryptophan and leucine, while threonine appears in only one enEFM in association to tryptophan and leucine. As discussed previously in this section, and excluding leucine, the production of each of the isoleucine and threonine companions implies NADPH generation. In addition, isoleucine and threonine production requires NADPH (Figs. 7 and 8 in Appendix A). Accordingly, it can be deduced that these enEFMs arise from the coupling of NADPH-consuming amino acids (isoleucine, threonine and leucine) and NADPH-producing ones (histidine, phenylalanine and tryptophan). Because the net ATP generation of these enEFMs is small (between 0.05 and 1 ATP), the reason why these enEFMs do not appear in the iJC86 model is almost surely the ATP cost associated to glucose phosphorylation (note that the import and export of the proper PPP sugars can make this step unnecessary, Fig. 10 in Appendix A) or to the fate of the glycerol phosphate produced during tryptophan biosynthesis (Fig. 13 in Appendix A).

Finally, the existence of 82 enEFMs in the iJC86v model which produce significantly more ATP than the maximum ATP produced by an enEFM in the iJC86 model can be attributed either to the decoupling of NADPH synthesis from amino acid synthesis pathways or to the avoidance of either glucose phosphorylation, glycerol phosphate management in tryptophan production or NADPH production in histidine-producing enEFMs.

Although these results are interesting by themselves, a more detailed evaluation of the enEFMs associated to iJC86v energetics is needed if the impact of the several PPP configurations on ATP generation is to be addressed. To this end, the different enEFMs were grouped in six C-classes (C0, C3, C4, C5, C6 and C7) considering the maximum number of carbons among the exchanged PPP intermediates.

When order one enEFMs are considered, all classes except C0 account for the same energetic sets: leucine, lysine, valine and histidine (note that the architecture of their synthesis has been previously discussed in this section). Moreover, the maximum ATP yield for each amino acid is exactly the same for C4, C5, C6 and C7, and almost the same for C3 when considering leucine, whose synthesis, as can be seen in Supplementary Table 3, is the most energy-producing.

If enEFMs of order two are considered, the outlook changes remarkably. First of all, not every order two energetic set is found in every class. More importantly, the maximum ATP yield associated to each set varies depending on its type. The phenylalanine-containing pairs, with the exception of the phenylalanine/histidine pair, achieve the maximum energy yield in C3. On the other hand, the histidine-containing ones (with the exception of the isoleucine/histidine pair), as expected, reach their optimum ATP yield when ribose phosphate import is allowed (C5, C6 and C7). Finally, the pairs isoleucine/histidine and tryptophan/lysine appear only in C7 and yield small ATP quantities, while the energy yield of the pair tryptophan/leucine is optimal in this class, although similar yields are obtained in C3, C5 and C6.

There is little interest in characterising order three and four enEFMs. The reason is that these enEFMs only produce more than 0.7 ATP per amino acid if leucine is part of the energetic set, and only in classes C5, C6 and C7. This means that, to obtain worse energy yields than in lower order enEFMs, high molecular weight sugars need to be exchanged. Thus, it can be concluded that, even if all pentose phosphate sugars are allowed to be exchanged freely, these enEFMs would probably contribute little to energy production.

3.5. Analysis of different Portiera strains

As explained in Section 3.1, the pangenome on which the iJC86 and iJC86v models rely comes from the genomes of six different strains: PAD (host: Aleurodicus dispersus), PLF (host: Aleurodicus floccissimus), PTV (host: Trialeurodes vaporariorum), TV (host: Trialeurodes vaporariorum), PBTB (host: Bemisia tabaci B) and PBTQ (host: Bemisia tabaci Q). These strains can be classified in three groups according to their metabolic genes (see Table 4): (1) PAD and PLF, (2) PTV and TV, and (3) PBTB and PBTQ. In particular, strains PAD and PLF account for exactly the same metabolic genes as the pangenome. In consequence, all the results obtained and commented in Section 3.4 regarding the iJC86v model immediately apply to these strains. The remaining ones were assigned to a particular variant of iJC86v for their metabolic characterization (Section 5.4): PTV and TV to TV-iJC86v and PBTB and PBTQ to BQ-iJC86v. Then, the energetics (Supplementary Table 3) and how the orphan reactions of these models influence it were studied as for iJC86v in Section 3.4.

Table 4
Differences in the metabolic genes present in six Portiera strains.

Gene   PAD   PLF   PTVA   TV    PBTQ   PBTB   Product
hisE   +     +     -      -     +      +      Phosphoribosyl-ATP pyrophosphatase
tktA   +     +     -      -     -      -      Transketolase
dapB   +     +     +      +     ps     ps     Dihydrodipicolinate reductase
lysA   +     +     +      +     -      -      Diaminopimelate decarboxylase
argH   +     +     +      +     ps     ps     Argininosuccinate
dapF   +     +     +      +     -      -      Diaminopimelate epimerase

The presence (+) or absence (-, ps if present as a pseudogene) of the metabolic genes exclusive of PAD (host: Aleurodicus dispersus), PLF (host: Aleurodicus floccissimus), PTV (host: Trialeurodes vaporariorum), TV (host: Trialeurodes vaporariorum), PBTB (host: Bemisia tabaci B) and PBTQ (host: Bemisia tabaci Q) are shown.

Regarding the TV-iJC86v model, 168,314 EFMs were obtained, 465 of which were enEFMs. After applying the criteria established in Definition VIII, only 15 energetic productions associated to 167 enEFMs (grouped in 121 classes considering their energetic properties and the presence/absence of the orphan reactions and exchange fluxes considered in the study) were found. Then again, of the 145,109 EFMs of the BQ-iJC86v model, 987 are enEFMs. From these enEFMs, 219 (grouped in 173 classes) are associated to the 18 energetic productions that satisfy Definition VIII. To compare the energetic properties of the six strains considered, the ATP yield of the enEFMs associated to energetic sets of their corresponding models was plotted (Fig. 4) after classifying them according to their C-class (Section 3.4) and to their order.

A first inspection of Fig. 4 shows that, compared to iJC86v, TV-iJC86v and BQ-iJC86v present considerably fewer enEFMs independently of the order, C-class (with the exception of the C0-class) and ATP yield considered. This is especially dramatic in the C3 and C4 classes, where no enEFM is found which produces an appreciable amount of ATP molecules per amino acid. This enEFM decrease is especially notable in those related to phenylalanine and tryptophan production. Essentially, every enEFM of this kind is lost, with the exception of a few related to tryptophan biosynthesis. Moreover, while in iJC86v almost the maximum ATP yield can be obtained by allowing the transport of just glycerol-3-phosphate (with the appropriate orphan reactions), similar values are reached in TV-iJC86v and BQ-iJC86v only when phosphorylated sugars of five or more carbons are allowed to be exchanged. According to this, the flexibility and efficiency of the energetics in TV-iJC86v and BQ-iJC86v are defined essentially by the non-phenylalanine enEFMs displayed in Table 3 if no high molecular weight phosphorylated sugars are allowed to be interchanged.

Because the major changes in the energetics of TV-iJC86v and BQ-iJC86v with respect to that of iJC86v are almost identical, they can be attributed to the absence of the transketolase activity. However, other differences arise when the remaining metabolic genes are considered. TV-iJC86v also lacks the phosphoribosyl-ATP pyrophosphatase. If this activity is carried out by the insect, the proton associated to the histidinol phosphate hydrolysis is generated in the host cytoplasm, and not in the endosymbiont's. This fact implies an increase of 0.25 in the number of ATPs produced per histidine and the unlocking of several enEFMs which are otherwise not compatible with the stationary state due to the generation of the above mentioned proton. Thus, if the phosphoribosyl-ATP pyrophosphatase activity is transferred to the insect, the endosymbiont energetics benefit, especially considering the consequences of the transketolase activity removal and that every order two enEFM in Table 3 which does not produce phenylalanine requires histidine or histidinol production to be operative.

On the other hand, the BQ-iJC86v model lacks several genes related to lysine and arginine synthesis. As can be seen when inspecting those enEFMs exclusive of this model (Supplementary Table 3), and because arginine is not part of any energetic set, only

the alterations associated to the lysine synthesis pathway affect the network's energetics. In particular, if the dihydrodipicolinate reductase, diaminopimelate decarboxylase and diaminopimelate epimerase activities are now carried out in the insect cytoplasm, the production of one lysine molecule now costs one less NADPH (which entails the variation in the ratios of lysine producing enEFMs, Supplementary Table 3) and produces 0.5 less ATP (note that the consumption of the two protons associated to these activities is now performed in the insect). However, because several non-lysine producing enEFMs have better ATP yields, these alterations might have either no or minor effects on network performance.

Fig. 4. Comparison of the energetics of the iJC86v, TV-iJC86v and BQ-iJC86v strain-specific models of Portiera's metabolism. The energetic efficiency (abscissa) in number of ATPs per molecule of production (Definition III) of each enEFM (square marks) associated to an energetic production is plotted for each model. Purple squares represent exclusive enEFMs of a particular model while the red ones represent those shared between two or more models. To facilitate the interpretation of this plot, especially in conjunction with Supplementary Table 3, the enEFMs are displayed classified by C-class and order.

3.6. The role of β-carotene synthesis in the metabolic capabilities of Portiera

The most remarkable feature of Portiera's metabolism, regardless of the model variant considered, is that ATP production is necessarily associated to the synthesis of either β-carotene or certain amino acids. This implies, especially considering β-carotene, that if the needs for these molecules are not well balanced with those for energy, an overproduction will occur. Obviously, the nature and magnitude of these overproductions depend on the particular model considered. However, in the case presented here, it is impossible to characterize all the iJC86 variants. In consequence, the results obtained from a particular model will be extrapolated properly to obtain robust conclusions regarding this phenomenon in Portiera.

The nature and magnitude of the overproductions in the iJC86 model of Portiera's metabolism were determined after constraining the flux through β-carotene synthesis (note that the enEFM associated to the maximum ATP yield produces this molecule, Table 3) to different values (from zero to unbounded).

Because the majority of the amino acid production may be destined to the insect, a reaction representing the whitefly biomass was included in the model and several ratios of endosymbiont biomass to insect biomass were evaluated (see 5.2). Since no specific information on the amino acid composition and ATP requirements of either the whitefly or Portiera was found in the literature, the simulations were performed using an ensemble of randomized biomass equations derived from those corresponding to A. pisum and E. coli, respectively (see 5.8). It is important to note that no ATP cost is included in the whitefly biomass equation in iJC86 because Portiera is only involved in the export of amino acids (i.e. the ATP cost of assembling the proteins needed for the insect is assumed by the insect). The most relevant findings obtained through this analysis are displayed in Fig. 5.

Because the differences in the amino acid compositions of the two biomasses are less marked than those corresponding to ATP consumption, it can be assumed that an increase in the endosymbiont to insect biomass ratio is equivalent to an increase in the ATP requirements for the whole system (whitefly plus Portiera) to grow. Thus, attending to the results presented in the previous section, an increment in the overproduction of amino acids is expected. Results from constraining the flux through β-carotene synthesis demonstrate this point: as the ratio of endosymbiont biomass to insect biomass is increased, the total biomass produced (insect plus endosymbiont) quickly decreases at the same rate as the flux of overproduced amino acids increases (Fig. 5).

The effect on biomass production seems to be independent of the constraint on β-carotene synthesis, but this is not the case for the amino acid overflows (Fig. 5). If the β-carotene production accounts for at least around 70% of the total carbon source flux, no amino acid overflow is observed irrespective of the value of the symbiont-to-

insect biomass ratio. This suggests that, assuming the most relaxed case (symbiont-to-insect biomass ratio = 1:100 in the model with all the orphan reactions of Table 1 included), the number of β-carotene molecules produced per amino acid destined to biomass is at least 2. Because a minimum of 90% of the biomass maximum was set in these analyses, we asked at what fraction of the biomass maximum (if this maximum exists) there will be no β-carotene production and no amino acid overflow. Restricting the β-carotene and amino acid exchange fluxes to zero impedes the growth of the system, whatever the symbiont-to-insect biomass ratio. Even if this condition is relaxed (by allowing a maximum flux through these reactions of 1% of the total carbon source flux allowed to enter the symbiont), less than 10% of the maximum biomass of the corresponding models without these constraints is produced. Taking these results as a whole, for Portiera (no matter which orphan reactions are included in the model, see Table 1) to work in near-optimal (>90% of the maximum, as assessed above) biomass-producing conditions (either for itself or for the insect) it must overproduce either amino acids or β-carotene.

Fig. 5. Impact of β-carotene synthesis on the metabolic performance of the iJC86 model of Portiera's metabolism. Representation of the effect of β-carotene synthesis on maximum insect biomass production (green) and amino acid overflows (histidine, blue; leucine, magenta; histidine + leucine, black). Because part of the insect's biomass corresponds to Portiera, several symbiont-to-insect biomass ratios were tested (panels a, b, c and d). For each ratio, 90% of the maximum biomass production, in addition to the overflow of each produced amino acid, was calculated under different constraints on the β-carotene synthesis (max crt): from 0% to 100% of the total carbon sources flux. Only leucine and histidine were detected to be overproduced. Due to the lack of specific information regarding the biomass equation of either Portiera or the whitefly, the simulation described above was carried out on an ensemble of 100 randomized variants of these equations (see 5). The results plotted here account for the mean of all simulations, error bars representing the corresponding standard deviation. AFU: arbitrary flux units.

If we consider the nature of the overproduced amino acids (Fig. 5), it is not surprising that leucine and histidine/histidinol are the ones which are overproduced. As mentioned above, the leucine-histidine/histidinol and phenylalanine-leucine pairs are associated with the most effective second order enEFMs, with respect to the amount of ATP synthesized. Because leucine participates in all the pairs named, it is overproduced in higher amounts than histidine/histidinol. Phenylalanine is not overproduced essentially because it is required in higher amounts in the biomass equations compared to histidine, and because the leucine-histidine pair produces approximately 0.5 more ATP molecules per amino acid than the best phenylalanine-leucine-producing enEFM.

The results presented in this section show that, when the insect-symbiont system is required to grow at quasi-optimal rates (>90% of the maximum), the enEFMs employed are those associated to the greatest ATP yields. Moreover, if the synthesis of a particular molecule (e.g. lysine) can proceed along with energy production, the burden on the greatest ATP yield enEFMs is relaxed. If the flux through the productions of the most efficient enEFMs is limited, then the ATP synthesis is distributed between these modes and the next ones in order of energetic efficacy. Considering all of this, and with Supplementary Table 3 as a support, several general conclusions can be obtained regarding the iJC86 model variants (iJC86v, TV-iJC86v and BQ-iJC86v). First of all, and whatever the particular model considered, leucine producing enEFMs are the best alternatives to the β-carotene producing one. In fact, depending on the strain, and on the particular orphan reactions considered to be active, several of the former surpass the latter if some intermediates of the PPP (just glycerol-3-phosphate in certain cases) are allowed to be exchanged. In consequence, leucine is probably overproduced in essentially any circumstance in Portiera. Depending on whether order one leucine producing enEFMs exist or not, other amino acids (histidine, phenylalanine or tryptophan) will also be overproduced. In particular, because of the lack of transketolase activity, in strains PTV, TV, PBTB and PBTQ the leucine companions are limited to histidine and/or tryptophan. If the overflow of leucine is somehow limited, the better alternatives imply β-carotene or lysine producing enEFMs.

In conclusion, the energetics of Portiera seems to rely on the synthesis of leucine and β-carotene. The presence of certain orphan reactions and exchanges for intermediate sugars of the PPP conditions the magnitude of these overproductions and the presence of overflows of other amino acids (histidine, phenylalanine and/or tryptophan) whose synthesis supports that of leucine.

4. Discussion

The results of the stoichiometric analysis of the iJC86 model and its variants with regard to the amino acid metabolism are consistent with all previous studies on metabolic complementation between Portiera and its insect host. In addition, these complementations resemble those reported in Buchnera aphidicola, the endosymbiont of the aphid Acyrthosiphon pisum (Hansen and Moran, 2011; Macdonald et al., 2012). Most of these studies are based either on qualitative genome analysis (Rao et al., 2015; Santos-Garcia et al., 2015, 2012) or on experimental approaches (e.g. transcriptome analysis (Luan et al., 2015; Upadhyay et al., 2015)). A curated model, such as the iJC86 model presented here, offers additional opportunities for investigating the structural properties of the metabolic network with a systems biology approach, such as the EFM analysis. According to the iJC86 model and its variants, the energetics of Portiera relies (with the exception of the valine-producing enEFMs of order one, which rely only on the proton gap) entirely on its respiratory chain: the ATP produced depends on either the oxidation of the ubiquinol produced in the formation of the double bonds of the β-carotene or the oxidation of the NADH generated during the synthesis of certain amino acid sets (Fig. 2, Supplementary Table 3). These sets are established either because the NADPH balance must be maintained or due to links associated to the PPP. Regarding the NADPH balance, the amino acid sets involve a NADPH-consuming/energy-producing amino acid (leucine, valine or lysine) and one or more NADPH-producing/energy-consuming ones (phenylalanine, histidine/histidinol and/or tryptophan, depending on the particular iJC86 variant considered). The remaining amino acids are either not associated to any enEFM (arginine) or are part of the production of high order ones with almost negligible ATP yields and only in very particular conditions (isoleucine and threonine). The amount of energy produced in association with the synthesis of these amino acids ranges from 0.027 to 3.5 mol of ATP per mol of amino acid produced (Table 3). However, these results only hold if all the transport processes in Portiera are assumed to be passive. Although this assumption is reasonable considering the high concentration of sugars and non-essential amino acids in the insect diet (Dinant et al. 2010), it may not be wholly valid. Minor deviations from the all-passive transport assumption have important consequences when considering the energetics of Portiera. Essentially, the ATP production associated with the valine/histidine and lysine/histidine pairs would be abolished, leaving β-carotene and leucine production as the energetic motor of the symbiont. This situation highlights the importance of the knowledge gap regarding the transport processes in endosymbionts when analyzing their metabolic networks: the behaviour of the network can vary dramatically under slightly different scenarios. Nevertheless, the passive transport assumption allows all possible functionalities of the network to be considered, and robust behaviours can be identified, such as the case of β-carotene or the leucine pairs discussed previously.

Three remarkable features were revealed through the modelling of Portiera's energetic metabolism. The first one comprises the production of ATP associated with valine production, which does not rely on NADH synthesis (Supplementary Table 3). Because no substrate-level phosphorylation exists in the network, an energy production independent of the respiratory chain was unexpected, and raises the question of how the endosymbiont can generate the electrochemical proton gradient needed for the ATP synthase to operate. According to the iJC86 model predictions, the proton motive force needed is obtained through the export of cytosolic protons by including them in the structure of other molecules. Symmetrically, this contribution to the generation of proton motive force, defined here as a proton gap, can also be detrimental to energy production if protons are generated instead of consumed. The effects of the proton gap on the symbiont energetics are widespread and significant in the majority of cases (Supplementary Table 3). Because this effect is only predicted using stoichiometric analysis, empirical tests should also be used to assess its role in the system.

A second remarkable feature of Portiera's metabolism involves the β-carotene synthesis. Carotenoids and their derivatives have important functions in insects, such as coloration, vision, diapause, photoperiodism, mate choice or oxidative stress (Cazzonelli, 2011). The stoichiometric analysis of the iJC86 model and its variants reveals an additional possible function related to energetics and, ultimately, to the control of amino acid biosynthesis. Leucine, lysine, valine and, to a lesser extent, phenylalanine and tyrosine (obtained through phenylalanine oxidation), span a notable percentage of the proteome not only in insects (Russell et al., 2014) but also in other organisms (Sorimachi, 2009). In contrast, histidine is only required in small quantities (Russell et al., 2014; Sorimachi, 2009). Considering the iJC86 model, if no β-carotene-associated ATP production exists, Portiera will produce an excess of certain amino acids with a significant overproduction of leucine and histidine (Fig. 4). Moreover, this overproduction of leucine seems to appear in the other model variants presented in Sections 3.4 and 3.5. However, if β-carotene-associated ATP production is allowed, this restriction can be avoided. Nevertheless, the minimum amount of β-carotene synthesis needed for neglecting amino acid overproduction in the iJC86 model is about two times the production of amino acids, if all of them are destined to the whitefly, and much more if we also consider Portiera's growth. No better numbers are expected if the inclusion of phosphorylated sugar exchanges is taken into consideration because the ATP

Portiera's panmetabolism. Let us now consider the effects on the network of allowing the import of phosphoenolpyruvate (PEP) in Portiera's energy metabolism. The only destination of this metabolite in the network is the production of phenylalanine. Attending to Supplementary Table 3, the cost of synthesizing this amino acid depends, precisely, on the precursor employed in the production of PEP. Thus, by discounting the contribution of PEP synthesis to those enEFMs associated to phenylalanine production, the effects of the free import of PEP on the energy metabolism of Portiera can be determined. In particular, a new order two enEFM synthesizing phenylalanine/leucine will appear with an associated net production of 2.06 ATP per amino acid, in contrast to the 1.86 and 1.56 ATP obtained if PEP is produced from oxaloacetate and pyruvate, respectively. Again, no substantial changes appear if a free incorporation of PEP is considered, the main reason being that the most efficient ATP production in the network is still associated to the synthesis of β-carotene (2.5 ATP per β-carotene) and the pair leucine/histidine (2.33 ATP per amino acid).

The third and last remarkable characteristic of Portiera's metabolism relates to the PPP. As has been extensively explored in Section 3, the reactions and phosphorylated sugar exchanges associated to this pathway are essential for defining the viability of Portiera's production capabilities and the nature of its energetics. However, genomic evidence is only available for glucose kinase, glucose-6-phosphate dehydrogenase and transketolase (only in PAD and PLF strains) activities. Moreover, even after studying all the possible distributions of this pathway between the insect and the symbiont no clear scenario can be selected, especially if no transport limitations regarding the size and charge of the sugar can be set.
This limitations arises from the fact that, up to yields associated to non-leucine producing enEFMs do not im- our knowledge, no information of the transport processes between fi prove signi cantly (Fig. 4). Thus, for the insect-endosymbiont Portiera and the whitefly exist and nucleotides (or nucleotide system to growth with a significant yield (410% of its maximum), β precursors), which display high molecular weights, must be ne- either leucine and histidine or -carotene must be substantially cessarily incorporated to the symbiont. overproduced. This performance suggests two non-exclusive pos- In conclusion, Portiera shows the capability of simultaneously sible scenarios: the overproduced substances have a particular providing the whitefly with nine essential amino acids while being physiological or ecological (e.g. through excretion in honeydew) capable of synthesizing its own ATP. The ways nature has allowed fl fi fi function for the white y, or they have no signi cant tness effect this to be possible involve the efficient configuration of bio- in the system. However, a third scenario can be conceived taking synthesis pathways that reduces the catabolic reactions needed to into consideration recent evidence (Valmalette et al., 2012) which a minimum. This configuration is especially notable when β-car- suggests that β-carotene could be involved in light-induced re- þ otene synthesis is considered: just a few reactions allow a coarse duction of NAD in insects. Thus, if light can induce a cyclic control of amino acid production while serving as fuel for this electron transport in which the NADH oxidation through NADH- particular ATP-synthesis machinery. Because Portiera is just one of dehydrogenase takes part, ATP can be synthesized without the a number of reduced genome endosymbionts, it serves as a concomitant production of amino acids or β-carotene. Assuming milestone in the exploration of the energetics of these organisms. 
this as true, its energetic role in Portiera is independent of the amount of β-carotene synthesized, and thus overproduction of any kind is unexpected. 5. Methods It is worth to emphasize that the properties inferred from iJC86 model of Portiera’s panmetabolism can be extrapolated to similar 5.1. Flux balance analysis networks with few additional considerations. Lets consider two examples. Attending to comparative genomics (Santos-Garcia Flux balance analysis (FBA) approach combines the description et al., 2015), the only Portiera strain which deviate significantly of the flux space F (see background) with optimization techniques from the others is Candidatus Portiera aleyrodidarum BT-QVLC, to find a flux distribution v⃗that maximizes the growth rate (Orth hosted in B. tabaci. Specifically, this strain lacks genes involved in et al., 2010). The FBA problem is formulated as a linear programme lysine (dapB, dapF and lysA) and arginine (argG and argH) synthesis as follows: (Luan et al., 2015; Santos-Garcia et al., 2015). The inspection of the reactions associated to dapB, dapF and lysA shows that they con- Max: vBiomass tribute with 0.25 ATP per amino acid (by increasing the proton gap st.. by one per amino acid) to the net production of 0.5 ATP per amino → Nv⋅ = 0 acid associated to the enEFM’s in which lysine is produced (Ta- ble 2). Because arginine synthesis is not associated by any means αβjj≤≤∀∈vjJj to ATP or NADPH production, it can be concluded that the energy αj =∀∈0 jJirrev ()2 metabolism of Candidatus Portiera aleyrodidarum BT-QVLC is es- sentially the same than the one inferred from iJC86 model of where the flux vBiomass represents the growth rate of the organism. J. Calle-Espinosa et al. / Journal of Theoretical Biology 407 (2016) 303–317 315
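As a minimal sketch of the linear programme in (2), the following example solves FBA on a hypothetical four-reaction toy network (it is not iJC86). The function names `solve_square` and `fba`, the network and its bounds are illustrative assumptions. The optimum is found here by brute-force enumeration of basic solutions with exact fractions, which is only practical for very small problems; it is not the COBRApy and Gurobi workflow used in this work.

```python
from fractions import Fraction
from itertools import combinations, product


def solve_square(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination; None if A is singular."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [x / p for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [row[-1] for row in M]


def fba(N, lower, upper, objective):
    """Maximize v[objective] subject to N v = 0 and lower <= v <= upper.

    Assumes N has full row rank and all bounds are finite, so the optimum
    lies at a basic solution: n - m fluxes sit at a bound and the remaining
    m fluxes are fixed by the steady-state equations.
    """
    m, n = len(N), len(N[0])
    best = None
    for free in combinations(range(n), m):
        fixed = [j for j in range(n) if j not in free]
        for choice in product((0, 1), repeat=len(fixed)):
            v = [None] * n
            for j, at_upper in zip(fixed, choice):
                v[j] = Fraction(upper[j] if at_upper else lower[j])
            A = [[N[i][j] for j in free] for i in range(m)]
            b = [-sum(N[i][j] * v[j] for j in fixed) for i in range(m)]
            sol = solve_square(A, b)
            if sol is None:
                continue
            for j, x in zip(free, sol):
                v[j] = x
            feasible = all(lower[j] <= v[j] <= upper[j] for j in range(n))
            if feasible and (best is None or v[objective] > best[objective]):
                best = v
    return best


# Hypothetical toy network; columns: uptake (-> A), conversion (A -> B),
# leak (A ->), biomass (B ->). All reactions are irreversible (alpha_j = 0).
N = [[1, -1, -1, 0],   # mass balance of metabolite A
     [0, 1, 0, -1]]    # mass balance of metabolite B
lower = [0, 0, 0, 0]
upper = [10, 1000, 1000, 1000]

v_opt = fba(N, lower, upper, objective=3)
print([float(x) for x in v_opt])
```

In this toy case the uptake bound is the only limiting constraint, so the optimal distribution routes all 10 units of uptake to the biomass reaction and leaves the leak reaction inactive. For genome-scale models an LP solver is the appropriate tool.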

The flux $v_{Biomass}$, also referred to as the biomass equation, includes all the metabolites that are biomass components, in their specific proportions (Feist and Palsson, 2010). Since the FBA formulation is a linear programme, it can be solved using standard linear programming techniques.

5.2. Flux bounds and boundary conditions

To perform the previous calculations, the model's reactions were considered to be unbounded by setting $\beta_j = 10^6$ arbitrary flux units (AFU) and, for reversible reactions, $\alpha_j = -10^6$ AFU. Therefore, in order to bound the flux space, the exchanges between the system and the surrounding environment were constrained using the approach proposed by Burgard et al. Thus, a single constraint which limits the total import of carbon sources was introduced as follows:

$$
\sum_{i \in CS} v_i \le K \qquad (3)
$$

where CS is the set of transport reaction indexes, which includes glucose, pyruvate, oxaloacetate, α-ketoglutarate, ornithine, aspartate and geranylgeranyl diphosphate. When either phosphoenolpyruvate or any of the PPP intermediates (glycerol-3-phosphate, erythrose-4-phosphate, ribose-5-phosphate, xylulose-5-phosphate, fructose-6-phosphate and/or sedoheptulose-7-phosphate) were allowed to be exchanged, the corresponding transport reactions were included in CS. It is worth noting that the fluxes included in (3) are constrained to be non-negative. Finally, in order to make (3) the only active inequality constraint in the model, the value of K was fixed to 100 AFU. This constraint was used in all the analyses.

5.3. In-silico knock-out analysis

The fragility analysis of the network was performed by simulating knock-out experiments for each orphan reaction and exchange flux introduced in the model. The analysis consisted of bounding the flux of each given reaction to zero and using FBA to find the maximal value of the biomass reaction. If the optimal value of the biomass reaction is lower than a defined threshold, the reaction is classified as essential, otherwise as dispensable. The threshold used to define essentiality was 10⁻³.

5.4. Obtention of the iJC86 model variants

Three models were constructed based on the iJC86 model: iJC86v, TV-iJC86v and BQ-iJC86v. The iJC86v model was obtained by incorporating the importation and exportation reactions, along with the corresponding exchange reactions, for phosphoenolpyruvate (only importation), glycerol-3-phosphate, erythrose-4-phosphate, ribose-5-phosphate, xylulose-5-phosphate, fructose-6-phosphate and sedoheptulose-7-phosphate. To make the computation of EFMs possible, the 2,3-dihydroxy-3-methylbutanoate NADP⁺ oxidoreductase (isomerizing) activity (EC 1.1.1.86) was removed because an alternative two-reaction path exists which performs the same transformation (Fig. 4 in Appendix A). The remaining models are derived from iJC86v by removing the following activities: phosphoribosyl-ATP pyrophosphatase (EC 3.6.1.31) and transketolase (EC 2.2.1.1) in the TV-iJC86v model; and transketolase (EC 2.2.1.1), dihydrodipicolinate reductase (EC 1.3.1.26), diaminopimelate decarboxylase (EC 4.1.1.20), diaminopimelate epimerase (EC 5.1.1.7) and argininosuccinate lyase (EC 4.3.2.1) in the BQ-iJC86v model. In substitution of these activities (except transketolase), new reactions were included that represent the transformation carried out by the insect. These reactions are copies of the original ones in which the non-main metabolites (phosphate, NADPH, NADP⁺, protons, water, carbon dioxide and fumarate) are not included. All these variants are provided as SBML files.

5.5. Elementary flux modes analysis

An elementary flux mode (EFM) is a steady-state flux distribution with a minimal set of active reactions (i.e. reactions with non-zero fluxes) fulfilling all irreversibility constraints. Computation of the elementary flux modes (EFMs) was performed on a modified version of iJC86, where an ATP consumption reaction (referred to as energy) was incorporated as a label for the identification of enEFMs. EFMs in which energy is greater than zero are the enEFMs of the model (see Definition I in Results). The set of EFMs was calculated using EFMtool (Terzer and Stelling, 2008).

5.6. ATP-decoupling and determination of energetic sets

The ATP-decoupling of each enEFM was performed according to Definition VII. The selected reactions for the trivial set (Definition VI) were the exchanges of glutamine, glutamate, 2-oxoglutarate, AMP, ADP, AICAR, oxygen, carbon dioxide, ammonium, water, protons, pyrophosphate and phosphate, the carbonic anhydrase activity, the ATP formation via ATP synthase and the artefactual reactions employed in the different studies (free ATP hydrolysis, free ADP recycling to ATP and free NADP⁺-NADPH interchange). After ATP-decoupling an enEFM, its production (Definition III) was classified as energetic or not according to the criteria established in Definition VIII.

5.7. Determination of the impact of β-carotene synthesis on the metabolic performance of iJC86

In order to perform this analysis, the model was reconfigured to represent the Portiera-whitefly interaction in a more realistic way. First, a simplified biomass equation representing the fraction of each amino acid with respect to the total demand required for the insect's growth was added to the system, using the amino acid content of A. pisum as a reference (Russell et al., 2014). In particular, each amino acid produced by Portiera was added as a reactant with a stoichiometric coefficient equal to its content in A. pisum (with the exception of phenylalanine, whose coefficient also includes the content of tyrosine because phenylalanine is the necessary precursor of tyrosine). Once the stoichiometric coefficients were set, they were normalized so that they sum to one.

Because the relation between both biomass productions is unknown, several symbiont-to-insect biomass ratios were tested. For each of these ratios, the maximum biomass production of the system (insect + symbiont) was calculated through flux balance analysis (Orth et al., 2010), after constraining the β-carotene exportation flux to several values C ∈ [0, 100]. Then, for each C, the overproduction of every amino acid was evaluated by considering the minimum value computed using flux variability analysis (Terzer and Stelling, 2008) (biomass flux >90% of its maximum). To perform the previous calculations, the model was constrained as described in Section 5.2.

Since no specific information on the amino acid composition of either the whitefly or Portiera was found in the literature, the experiments were performed using an ensemble of 100 randomized variants of the biomass equations to make the results obtained more robust. It is worth noting that the relative amino acid composition does not vary greatly between even quite different taxa (Sorimachi, 2009). Accordingly, the randomization was conducted in the following way: for each amino acid, the stoichiometric coefficient was drawn from a uniform distribution in the interval (0.75 x₀, 1.25 x₀), where x₀ corresponds to the reference value. These values were normalized afterwards as mentioned above. The ATP consumption for the insect and symbiont production was then modified. For the insect, the ATP cost was set to zero. On the other hand, the ATP cost associated with the symbiont biomass equation was drawn from a uniform distribution in the interval [30, 70], which contains 59.81, the ATP cost of the E. coli biomass equation in iAF1260 (Feist et al., 2007). The range from which the ATP cost was randomly drawn was set arbitrarily and skewed towards low ATP values. The reason for this skew is that the lower the ATP cost value, the lower the overproduction, and thus more robust results can be achieved.

5.8. Computational tools

Constraint-based analysis was performed using the Python-based toolbox COBRApy (Ebrahim et al., 2013). LP and MILP problems were solved using the Gurobi Solver (Gurobi Optimization, 2012) accessed through COBRApy. The computation of connected components for the detection of the set of unconnected modules was performed using the Python NetworkX library (Hagberg et al., 2008). The scripts developed for this paper were programmed in the Python language (Jones et al., n.d.), and can be requested from the authors. Graphs were drawn using the yEd Graph Editor (30). All the computations were performed on a workstation using an Intel® Xeon® CPU X5660 2.80 GHz processor, with 32 GiB of RAM, running under Ubuntu 14.04.2 Linux OS.

Contributions

F.J.S. and D.S.G. provided functionally annotated genomic data prior to publication; J.C.E. and M.P.L. performed the experiments; J.C.E., M.P.L., F.M. and J.P. analyzed the data; J.C.E. wrote the manuscript with inputs from M.P.L., F.M., D.S.G., F.J.S. and J.P. All authors read and approved the final manuscript.

Competing financial interests

The authors declare no competing financial interests.

Acknowledgements

We would like to thank the Obra Social Programme of La Caixa Savings Bank for the doctoral fellowship granted to JCE. Financial support from the Spanish Government (Grant reference: BFU2012-39816-C02-02 and BFU2012-39816-C02, co-financed by FEDER funds and the Ministry of the Economy and Competitiveness) and the Regional Government of Valencia (Grant reference: PROMETEOII/2014/065) is also gratefully acknowledged.

Appendix A. Supporting information

Supplementary data associated with this article can be found in the online version at http://dx.doi.org/10.1016/j.jtbi.2016.07.024.

References

Baumann, P., 2005. Biology of bacteriocyte-associated endosymbionts of plant sap-sucking insects. Annu. Rev. Microbiol. 59, 155–189. http://dx.doi.org/10.1146/annurev.micro.59.030804.121041.

Belda, E., Silva, F.J., Peretó, J., Moya, A., 2012. Metabolic networks of Sodalis glossinidius: a systems biology approach to reductive evolution. PLoS One 7, e30652. http://dx.doi.org/10.1371/journal.pone.0030652.

Burgard, A.P., Vaidyaraman, S., Maranas, C.D. Minimal reaction sets for Escherichia coli metabolism under different growth requirements and uptake environments. Biotechnol. Prog. 17, 791–7. http://dx.doi.org/10.1021/bp0100880.

Cazzonelli, C.I., 2011. Goldacre Review: Carotenoids in nature: insights from plants and beyond. Funct. Plant Biol. 38, 833. http://dx.doi.org/10.1071/FP11192.

Cottret, L., Milreu, P.V., Acuña, V., Marchetti-Spaccamela, A., Stougie, L., Charles, H., Sagot, M.-F., 2010. Graph-based analysis of the metabolic exchanges between two co-resident intracellular symbionts, Baumannia cicadellinicola and Sulcia muelleri, with their insect host, Homalodisca coagulata. PLoS Comput. Biol. 6, e1000904. http://dx.doi.org/10.1371/journal.pcbi.1000904.

Dinant, S., Bonnemain, J.-L., Girousse, C., Kehr, J., 2010. Phloem sap intricacy and interplay with aphid feeding. C. R. Biol. 333, 504–515. http://dx.doi.org/10.1016/j.crvi.2010.03.008.

Ebrahim, A., Lerman, J.A., Palsson, B.O., Hyduke, D.R., 2013. COBRApy: Constraints-based reconstruction and analysis for python. BMC Syst. Biol. 7, 74. http://dx.doi.org/10.1186/1752-0509-7-74.

Feist, A.M., Henry, C.S., Reed, J.L., Krummenacker, M., Joyce, A.R., Karp, P.D., Broadbelt, L.J., Hatzimanikatis, V., Palsson, B.Ø., 2007. A genome-scale metabolic reconstruction for Escherichia coli K-12 MG1655 that accounts for 1260 ORFs and thermodynamic information. Mol. Syst. Biol. 3, 121. http://dx.doi.org/10.1038/msb4100155.

Feist, A.M., Palsson, B.O., 2010. The biomass objective function. Curr. Opin. Microbiol. 13, 344–349. http://dx.doi.org/10.1016/j.mib.2010.03.003.

González-Domenech, C.M., Belda, E., Patiño-Navarrete, R., Moya, A., Peretó, J., Latorre, A., 2012. Metabolic stasis in an ancient symbiosis: genome-scale metabolic networks from two Blattabacterium cuenoti strains, primary endosymbionts of cockroaches. BMC Microbiol. 12 (Suppl 1), S5. http://dx.doi.org/10.1186/1471-2180-12-S1-S5.

Gurobi Optimization, Inc., 2012. Gurobi Optimization, Inc., Houston, Texas.

Hagberg, A.A., Schult, D.A., Swart, P.J., 2008. Exploring network structure, dynamics, and function using NetworkX. In: Proceedings of the 7th Python in Science Conference (SciPy2008). Pasadena, CA USA, pp. 11–15.

Hansen, A.K., Moran, N.A., 2011. Aphid genome expression reveals host-symbiont cooperation in the production of amino acids. Proc. Natl. Acad. Sci. USA 108, 2849–2854. http://dx.doi.org/10.1073/pnas.1013465108.

Jones, E., Oliphant, T., Peterson, P., others, n.d. SciPy: Open source scientific tools for Python. PythonLabs, Virginia, USA.

Luan, J.-B., Chen, W., Hasegawa, D.K., Simmons, A.M., Wintermantel, W.M., Ling, K.-S., Fei, Z., Liu, S.-S., Douglas, A.E., 2015. Metabolic coevolution in the bacterial symbiosis of whiteflies and related plant sap-feeding insects. Genome Biol. Evol. 7, 2635–2647. http://dx.doi.org/10.1093/gbe/evv170.

Macdonald, S.J., Lin, G.G., Russell, C.W., Thomas, G.H., Douglas, A.E., 2012. The central role of the host cell in symbiotic nitrogen metabolism. Proc. Biol. Sci. 279, 2965–2973. http://dx.doi.org/10.1098/rspb.2012.0414.

McCutcheon, J.P., Moran, N.A., 2012. Extreme genome reduction in symbiotic bacteria. Nat. Rev. Microbiol. 10, 13–26. http://dx.doi.org/10.1038/nrmicro2670.

Moran, N.A., Bennett, G.M., 2014. The tiniest tiny genomes. Annu. Rev. Microbiol. 68, 195–215. http://dx.doi.org/10.1146/annurev-micro-091213-112901.

Moya, A., Peretó, J., Gil, R., Latorre, A., 2008. Learning how to live together: genomic insights into prokaryote-animal symbioses. Nat. Rev. Genet. 9, 218–229. http://dx.doi.org/10.1038/nrg2319.

Orth, J.D., Thiele, I., Palsson, B.Ø., 2010. What is flux balance analysis? Nat. Biotechnol. 28, 245–248. http://dx.doi.org/10.1038/nbt.1614.

Overbeek, R., Begley, T., Butler, R.M., Choudhuri, J.V., Chuang, H.-Y., Cohoon, M., de Crécy-Lagard, V., Diaz, N., Disz, T., Edwards, R., Fonstein, M., Frank, E.D., Gerdes, S., Glass, E.M., Goesmann, A., Hanson, A., Iwata-Reuyl, D., Jensen, R., Jamshidi, N., Krause, L., Kubal, M., Larsen, N., Linke, B., McHardy, A.C., Meyer, F., Neuweger, H., Olsen, G., Olson, R., Osterman, A., Portnoy, V., Pusch, G.D., Rodionov, D.A., Rückert, C., Steiner, J., Stevens, R., Thiele, I., Vassieva, O., Ye, Y., Zagnitko, O., Vonstein, V., 2005. The subsystems approach to genome annotation and its use in the project to annotate 1000 genomes. Nucleic Acids Res. 33, 5691–5702. http://dx.doi.org/10.1093/nar/gki866.

Ponce-de-León, M., Montero, F., Peretó, J., 2013. Solving gap metabolites and blocked reactions in genome-scale models: application to the metabolic network of Blattabacterium cuenoti. BMC Syst. Biol. 7, 114. http://dx.doi.org/10.1186/1752-0509-7-114.

Rao, Q., Rollat-Farnier, P.-A., Zhu, D.-T., Santos-Garcia, D., Silva, F.J., Moya, A., Latorre, A., Klein, C.C., Vavre, F., Sagot, M.-F., Liu, S.-S., Mouton, L., Wang, X.-W., 2015. Genome reduction and potential metabolic complementation of the dual endosymbionts in the whitefly Bemisia tabaci. BMC Genom. 16, 226. http://dx.doi.org/10.1186/s12864-015-1379-6.

Russell, C.W., Poliakov, A., Haribal, M., Jander, G., van Wijk, K.J., Douglas, A.E., 2014. Matching the supply of bacterial nutrients to the nutritional demand of the animal host. Proc. Biol. Sci. 281, 20141163. http://dx.doi.org/10.1098/rspb.2014.1163.

Santos-Garcia, D., Farnier, P.-A., Beitia, F., Zchori-Fein, E., Vavre, F., Mouton, L., Moya, A., Latorre, A., Silva, F.J., 2012. Complete genome sequence of "Candidatus Portiera aleyrodidarum" BT-QVLC, an obligate symbiont that supplies amino acids and carotenoids to Bemisia tabaci. J. Bacteriol. 194, 6654–6655. http://dx.doi.org/10.1128/JB.01793-12.

Santos-Garcia, D., Vargas-Chavez, C., Moya, A., Latorre, A., Silva, F.J., 2015. Genome evolution in the primary endosymbiont of whiteflies sheds light on their divergence. Genome Biol. Evol. 7, 873–888. http://dx.doi.org/10.1093/gbe/evv038.

Sloan, D.B., Moran, N.A., 2012. Endosymbiotic bacteria as a source of carotenoids in whiteflies. Biol. Lett. 8, 986–989. http://dx.doi.org/10.1098/rsbl.2012.0664.

Sorimachi, K., 2009. Evolution from primitive life to homo sapiens based on visible genome structures: the amino acid world. Nat. Sci. 01, 107–119. http://dx.doi.org/10.4236/ns.2009.12013.

Terzer, M., Stelling, J., 2008. Large-scale computation of elementary flux modes with bit pattern trees. Bioinformatics 24, 2229–2235. http://dx.doi.org/10.1093/bioinformatics/btn401.

Thao, M.L., Baumann, P., 2004. Evolutionary relationships of primary prokaryotic endosymbionts of whiteflies and their hosts. Appl. Environ. Microbiol. 70, 3401–3406. http://dx.doi.org/10.1128/AEM.70.6.3401-3406.2004.

Thomas, G.H., Zucker, J., Macdonald, S.J., Sorokin, A., Goryanin, I., Douglas, A.E., 2009. A fragile metabolic network adapted for cooperation in the symbiotic bacterium Buchnera aphidicola. BMC Syst. Biol. 3, 24. http://dx.doi.org/10.1186/1752-0509-3-24.

Upadhyay, S.K., Sharma, S., Singh, H., Dixit, S., Kumar, J., Verma, P.C., Chandrashekar, K., 2015. Whitefly genome expression reveals host-symbiont interaction in amino acid biosynthesis. PLoS One 10, e0126751. http://dx.doi.org/10.1371/journal.pone.0126751.

Valmalette, J.C., Dombrovsky, A., Brat, P., Mertz, C., Capovilla, M., Robichon, A., 2012. Light-induced electron transfer and ATP synthesis in a carotene synthesizing insect. Sci. Rep. 2, 579. http://dx.doi.org/10.1038/srep00579.