SUPPLEMENTARY TEXT

Uncovering novel pathways for enhancing hyaluronan synthesis in recombinant Lactococcus lactis: Genome-scale metabolic modelling and experimental validation

Abinaya Badri a,1, Karthik Raman 1,2,3,* and Guhan Jayaraman 1,*

1 Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai – 600 036, INDIA

2 Initiative for Biological Systems Engineering, IIT Madras

3 Robert Bosch Centre for Data Science and Artificial Intelligence (RBC-DSAI), IIT Madras

a Present address: Department of Chemical and Biological Engineering, Rensselaer Polytechnic Institute, Troy NY 12180, USA

* Correspondence: [email protected], GJ; [email protected], KR;

A. Supplementary Figures and Tables

Table A.1. 78 Reactions identified by FSEOF.

S.No. | Reaction Name | Reaction Formula
1 | 2 methylbutanal dehydrogenase acid forming | h2o[c] + nad[c] + 2mbal[c] <=> 2 h[c] + nadh[c] + 2mba[c]
2 | 2 methylbutanoic acid transport H symport | h[c] + 2mba[c] <=> h[e] + 2mba[e]
3 | 3 methyl 2 oxopentanoate decarboxylase | h[c] + 3mop[c] -> 2mbal[c] + co2[c]
4 | acetaldehyde dehydrogenase acetylating | coa[c] + nad[c] + acald[c] <=> h[c] + nadh[c] + accoa[c]
5 | acetaldehyde reversible transport | acald[e] <=> acald[c]
6 | deoxycytidine | atp[c] + dcyt[c] <=> adp[c] + h[c] + dcmp[c]
7 | adenine transport via proton symport reversible | h[e] + ade[e] <=> h[c] + ade[c]
8 | '' | atp[c] + amp[c] <=> 2 adp[c]
9 | adenylosuccinate | dcamp[c] -> amp[c] + fum[c]
10 | adenylosuccinate synthetase | asp_L[c] + gtp[c] + imp[c] -> 2 h[c] + pi[c] + dcamp[c] + gdp[c]
11 | adenosylhomocysteine nucleosidase | h2o[c] + ahcys[c] -> ade[c] + rhcys[c]
12 | alanine racemase | ala_L[c] <=> ala_D[c]
13 | L alanine transaminase | akg[c] + ala_L[c] <=> pyr[c] + glu_L[c]
14 | D alanine transport inout via proton symport | h[e] + ala_D[e] <=> h[c] + ala_D[c]
15 | L aspartate transport in via proton symport | h[e] + asp_L[e] -> h[c] + asp_L[c]
16 | ATP synthase three protons for one ATP | adp[c] + pi[c] + 3 h[e] <=> atp[c] + h2o[c] + 2 h[c]
17 | CO2 transport out via diffusion | co2[e] <=> co2[c]
18 | CTP synthase NH3 | atp[c] + nh4[c] + utp[c] -> adp[c] + 2 h[c] + pi[c] + ctp[c]
19 | cytidylate kinase dCMP | atp[c] + dcmp[c] <=> adp[c] + dcdp[c]
20 | deoxycytidine deaminase | h2o[c] + h[c] + dcyt[c] -> nh4[c] + duri[c]
21 | deoxyribose phosphate aldolase | 2dr5p[c] -> acald[c] + g3p[c]
22 | purine nucleoside phosphatase deoxyuridine | pi[c] + duri[c] <=> ura[c] + 2dr1p[c]
23 | 2 methyl butanoic acid exchange | 2mba[e] <=>
24 | 4 Aminobutanoate exchange | 4abut[e] <=>
25 | Acetaldehyde exchange | acald[e] <=>
26 | Adenine exchange | ade[e] <=>
27 | D Alanine exchange | ala_D[e] <=>
28 | L Aspartate exchange | asp_L[e] <=>
29 | CO2 exchange | co2[e] <=>
30 | L Glutamate exchange | glu_L[e] <=>
31 | L Isoleucine exchange | ile_L[e] <=>
32 | Inosine exchange | ins[e] <=>
33 | L Lactate exchange | lac_L[e] <=>
34 | Exchange for Serine | ser_L[e] <=>
35 | Succinate exchange | succ[e] <=>
36 | fructose bisphosphate aldolase | fdp[c] <=> dhap[c] + g3p[c]
37 | fructose bisphosphatase | h2o[c] + fdp[c] -> pi[c] + f6p[c]
38 | fumarate reductase NADH | h[c] + nadh[c] + fum[c] <=> nad[c] + succ[c]
39 | glucosamine 1 phosphate N acetyltransferase | accoa[c] + gam1p[c] -> coa[c] + h[c] + acgam1p[c]
40 | glycerol 3 phosphate dehydrogenase NAD | nad[c] + glyc3p[c] <=> h[c] + nadh[c] + dhap[c]
41 | glycerol 3 phosphate dehydrogenase NADP | nadp[c] + glyc3p[c] <=> h[c] + nadph[c] + dhap[c]
42 | UTP glucose 1 phosphate uridylyltransferase | h[c] + utp[c] + g1p[c] <=> ppi[c] + udpg[c]
43 | glutamine fructose 6 phosphate transaminase | gln_L[c] + f6p[c] -> glu_L[c] + gam6p[c]
44 | glutamine synthetase | atp[c] + glu_L[c] + nh4[c] -> adp[c] + h[c] + pi[c] + gln_L[c]
45 | 4 aminobutyrateglutamate antiport | glu_L[e] + 4abut[c] <=> glu_L[c] + 4abut[e]
46 | glutamate decarboxylase | h[c] + glu_L[c] -> co2[c] + 4abut[c]
47 | homocysteine S methyltransferase | amet[c] + hcys_L[c] -> h[c] + ahcys[c] + met_L[c]
48 | hypoxanthine phosphoribosyltransferase Hypoxanthine | prpp[c] + hxan[c] -> ppi[c] + imp[c]
49 | isoleucine transaminase | akg[c] + ile_L[c] <=> 3mop[c] + glu_L[c]
50 | L isoeucine transport inout via proton symport | h[e] + ile_L[e] <=> h[c] + ile_L[c]
51 | inosine transport in via proton symport reversible | h[e] + ins[e] <=> h[c] + ins[c]
52 | L lactate dehydrogenase | nad[c] + lac_L[c] + 1.125 pseud[c] <=> h[c] + nadh[c] + pyr[c]
53 | L lactate reversible transport via proton symport | h[e] + lac_L[e] <=> h[c] + lac_L[c]
54 | methionine adenosyltransferase | atp[c] + h2o[c] + met_L[c] -> pi[c] + ppi[c] + amet[c]
55 | nucleoside diphosphate kinase ATPUDP | atp[c] + udp[c] <=> adp[c] + utp[c]
56 | nucleoside diphosphate kinase ATPdCDP | atp[c] + dcdp[c] <=> adp[c] + dctp[c]
57 | phosphoglucosamine mutase | gam1p[c] <=> gam6p[c]
58 | phosphoglucomutase | g1p[c] <=> g6p[c]
59 | inorganic diphosphatase | h2o[c] + ppi[c] -> h[c] + 2 pi[c]
60 | phosphopentomutase deoxyribose | 2dr1p[c] <=> 2dr5p[c]
61 | phosphoribosylpyrophosphate synthetase | atp[c] + r5p[c] <=> h[c] + amp[c] + prpp[c]
62 | purine nucleoside phosphorylase Inosine | pi[c] + ins[c] <=> hxan[c] + r1p[c]
63 | pyrimidine nucleoside phosphorylase uracil | pi[c] + uri[c] <=> ura[c] + r1p[c]
64 | ribokinase | atp[c] + rib_D[c] -> adp[c] + h[c] + r5p[c]
65 | ribosylhomocysteinase | h2o[c] + rhcys[c] -> hcys_L[c] + rib_D[c]
66 | ribonucleoside triphosphate reductase CTP | ctp[c] + trdrd[c] -> h2o[c] + dctp[c] + trdox[c]
67 | L serine deaminase | ser_L[c] -> pyr[c] + nh4[c]
68 | L serine transport inout via proton symport | h[e] + ser_L[e] <=> h[c] + ser_L[c]
69 | succinate transporter inout via proton symport | h[e] + succ[e] <=> h[c] + succ[c]
70 | triose phosphate | dhap[c] <=> g3p[c]
71 | thioredoxin reductase NADPH | h[c] + nadph[c] + trdox[c] -> nadp[c] + trdrd[c]
72 | UDP N acetylglucosamine diphosphorylase | h[c] + utp[c] + acgam1p[c] -> uacgam[c] + ppi[c]
73 | UDPglucose 6 dehydrogenase | h2o[c] + 2 nad[c] + udpg[c] -> 3 h[c] + 2 nadh[c] + udpglcur[c]
74 | uridylate kinase UMP | atp[c] + ump[c] -> adp[c] + udp[c]
75 | uridine kinase ATPUridine | atp[c] + uri[c] -> adp[c] + h[c] + ump[c]
76 | Ha out | HA_monomer[e] ->
77 | HA c2e | HA_monomer[c] -> HA_monomer[e]
78 | HAS | uacgam[c] + udpglcur[c] -> 2 udp[c] + HA_monomer[c]


Figure A.1. Sub-network of over-expression targets from FSEOF analysis, created using Escher [2]. The potential contribution of the inosine feeding strategy to HA flux is highlighted with green lines.


Figure A.2. Reverse-phase chromatograms (absorbance in mAU at 250 nm versus time, 0 to 30 min) of spent media at (a) the start, showing only the inosine peak (RT = 13.5 min), and (b) the end, showing the hypoxanthine peak (RT = 11 min) with a superimposed hypoxanthine standard peak (dashed).


B. Supplementary Methods

B.1 Standard plots and HPLC protocol

Figure B.1. Clockwise from top-left: Standard plots for biomass, glucose, hypoxanthine and inosine estimation. Fitted lines shown on the plots:

• Biomass: dry cell weight (g/L) versus OD600, y = 0.413x, R² = 0.99173
• Glucose: OD505 versus glucose concentration (g/L), y = 0.3213x, R² = 0.99561
• Inosine: peak area at 250 nm (*10^5) versus inosine concentration (μg/mL), y = 0.949x, R² = 0.96786
• Hypoxanthine: peak area at 250 nm (*10^5) versus hypoxanthine concentration (μg/mL), y = 1.9904x, R² = 0.97395
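As an illustration of how the standard plots in Figure B.1 are used, the sketch below converts raw readings into concentrations with the zero-intercept fits. The pairing of each slope with its assay is inferred from the axis ranges of the plots, and the function and dictionary names are hypothetical; peak areas are assumed to be in the same scaled units as the plot axes.

```python
# Zero-intercept calibration slopes read from the standard plots (Figure B.1).
# Assumption: slope-to-assay pairing is inferred from the plot axis ranges.
SLOPES = {
    "biomass": 0.413,        # dry cell weight (g/L) per unit OD600
    "glucose": 0.3213,       # OD505 per (g/L) glucose
    "inosine": 0.949,        # scaled peak area at 250 nm per (ug/mL) inosine
    "hypoxanthine": 1.9904,  # scaled peak area at 250 nm per (ug/mL) hypoxanthine
}


def dry_cell_weight(od600: float) -> float:
    """Dry cell weight (g/L) from OD600; weight is the y variable of the fit."""
    return SLOPES["biomass"] * od600


def concentration(assay: str, signal: float) -> float:
    """Invert signal = slope * concentration for glucose, inosine or hypoxanthine."""
    if assay == "biomass":
        raise ValueError("use dry_cell_weight() for biomass")
    return signal / SLOPES[assay]
```

For example, `dry_cell_weight(2.0)` gives the biomass estimate for an OD600 of 2, and `concentration("hypoxanthine", area)` returns μg/mL for a scaled peak area.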

Reverse-phase gradient protocol for hypoxanthine and inosine estimation (adapted from [1])

• Column: monolithic Luna C-18 Phenomenex® column, length 250 mm, internal diameter 4.6 mm, particle size 5 μm and pore size 100 Å
• Detector: photodiode array (PDA) detector at 250 nm
• Flow rate: 0.6 mL/min
• Mobile phase: A, 0.05% (v/v) trifluoroacetic acid (TFA) in deionized water, pH 2.2; B, 100% methanol
• 32-minute time course per sample, A:B (v/v):
  - 95:5 at 0 min
  - 70:30 at 12 min
  - 10:90 at 13 min, held for 3 min
  - 95:5 at 17 min, held for 15 min (to elute all other components)
• Inosine and hypoxanthine elute at 13.5 and 11 min, respectively.
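The gradient time course above can be written as a small lookup table. The sketch below returns the methanol percentage at any time point; the breakpoints are transcribed from the protocol, while the linear ramps between breakpoints and all names are assumptions made for illustration.

```python
# Gradient breakpoints as (time in min, % methanol B), transcribed from the protocol.
# Assumption: the composition changes linearly between consecutive breakpoints.
GRADIENT = [
    (0.0, 5.0),
    (12.0, 30.0),
    (13.0, 90.0),
    (16.0, 90.0),  # end of the 3 min hold at 10:90
    (17.0, 5.0),
    (32.0, 5.0),   # end of the 15 min hold at 95:5
]


def percent_b(t_min: float) -> float:
    """Return % B (methanol) at time t_min by linear interpolation."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time outside the 32 min run")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable: breakpoints cover the whole run")


def percent_a(t_min: float) -> float:
    """Return % A (0.05% TFA in water) at time t_min."""
    return 100.0 - percent_b(t_min)
```

Under these assumptions, `percent_b(11.0)` and `percent_b(13.5)` give the nominal pump composition at the stated hypoxanthine and inosine retention times; the composition actually reaching the column lags by the system dwell volume, which is not modelled here.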

B.2 Adaptations to available L. lactis model

• Available L. lactis GSM (iNF518): 754 reactions, 650 metabolites and 518 genes.
• 35 exchange reactions with experimentally derived bounds were relaxed to default lower bounds of -1000 or 0 mmol/(g DCW·h), depending on reversibility, and upper bounds of 1000 mmol/(g DCW·h).
• Hyaluronan synthase, HA transport and HA exchange reactions were added, giving 757 reactions, 652 metabolites and 519 genes (iNF519).


Table B.1: Reactions added to relaxed model to enable HA production.

Reaction Name | Reaction Formula
HA Synthase | UDP-glucuronic acid[c] + UDP-N-acetylglucosamine[c] -> HA_Monomer[c] + 2 UDP[c]
HA Transport | HA_Monomer[c] -> HA_Monomer[e]
HA Exchange | HA_Monomer[e] ->

• iNF519 was constrained with SJR6 chemostat data for the glucose consumption rate and the lactate, acetate, ethanol and formate production rates [3].

Table B.2: Experimental flux bounds incorporated in the model.

Exchanged Metabolite | Lower bound of flux (mmol/(g DCW·h)) | Upper bound of flux (mmol/(g DCW·h))
Glucose | -9.78 | -2.94
Lactate | 3.44 | 14.1
Acetate | 0.083 | 0.237
Formate | 0 | 0.511
Ethanol | 0.39 | 1.326

• Model cleaning: the iNF518 model had 113 gaps in total (77 root gaps and 36 downstream gaps).

Table B.3: Reactions added to and removed from the model.

S.No. | Reactions added
1 | superoxide-forming NADH oxidase in L. lactis
2 | Octaprenyl pyrophosphate synthase
3 | Formation reaction for Acyl Carrier Protein (ACP)
4 | Formation reaction for N-formylmethionine (fMet)

S.No. | Reactions removed
1 | undecaprenol kinase
2 | N-hydroxyarylamine O-Acetyltransferase
3 | 4-carboxymuconolactone decarboxylase
4 | coproporphyrinogen oxidase 3
5 | hydroxybutyrate dehydrogenase
6 | Arbutin 6-phosphate glucohydrolase
7 | methylthioadenosine nucleosidase
8 | myo-inositol-1-phosphatase
9 | Glycerophosphodiester phosphodiesterase glycerophosphoinositol
10 | Glycerophosphodiester phosphodiesterase glycerophosphoethanolamine
11 | tetrahydropicolinate succinylase
12 | Amylomaltase maltotriose
13 | Amylomaltase maltotetraose
14 | Amylomaltase maltopentaose
15 | Amylomaltase maltohexaose
16 | CDP glycerol glycerophosphotransferase
17 | CDP ribitol phosphoribitoltransferase
18 | Chitinase
19 | 2-dehydro-3-deoxy phosphogluconate aldolase
20 | 5-formyltetrahydrofolate cyclo
21 | 5-methyltetrahydropteroyltriglutamate homocysteine S methyltransferase
22 | Salicin-6-phosphate glucohydrolase

• Final model used in this study: 741 reactions, 618 metabolites, 519 genes and only 7 root gaps.
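The bound handling described in this section (relaxing exchange bounds to defaults, then imposing the measured ranges of Table B.2) can be sketched with plain dictionaries. This is an illustration only and not the authors' implementation: the function names, the use of metabolite names as keys and the example reversibility flags are all hypothetical, and negative fluxes are taken to mean uptake, in line with the negative glucose bounds in Table B.2.

```python
DEFAULT_BOUND = 1000.0  # mmol/(g DCW·h), default magnitude stated in B.2


def relax_exchange_bounds(reversible_by_rxn):
    """Map each exchange reaction to default (lower, upper) bounds.

    Reversible exchanges get (-1000, 1000); irreversible ones get (0, 1000).
    """
    return {
        rxn: (-DEFAULT_BOUND if reversible else 0.0, DEFAULT_BOUND)
        for rxn, reversible in reversible_by_rxn.items()
    }


# Experimental bounds from Table B.2, mmol/(g DCW·h).
EXPERIMENTAL_BOUNDS = {
    "Glucose": (-9.78, -2.94),
    "Lactate": (3.44, 14.1),
    "Acetate": (0.083, 0.237),
    "Formate": (0.0, 0.511),
    "Ethanol": (0.39, 1.326),
}


def constrain(bounds, experimental=EXPERIMENTAL_BOUNDS):
    """Return a copy of bounds in which experimental ranges override defaults."""
    updated = dict(bounds)
    updated.update(experimental)
    return updated


def violations(fluxes, bounds):
    """List exchanged metabolites whose flux lies outside its (lower, upper) range."""
    return [m for m, v in fluxes.items() if not bounds[m][0] <= v <= bounds[m][1]]


# Hypothetical reversibility flags, for illustration only.
relaxed = relax_exchange_bounds(
    {"Glucose": True, "Lactate": True, "Inosine": True, "HA_monomer": False}
)
model_bounds = constrain(relaxed)
```

A call such as `violations({"Glucose": -5.0, "Lactate": 2.0}, model_bounds)` then flags any simulated exchange flux that falls outside the chemostat-derived ranges.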

C. Supplementary Results

C.1 Characteristics of in silico HA production

Following modification of the model as described in §B.2, the model predicted the L. lactis growth rate reasonably well (a predicted growth rate of 0.322 h⁻¹ for a chemostat run with a steady-state dilution rate of 0.3 h⁻¹). It also reproduced certain trends in HA flux that were previously reported in the literature. For example, the model reproduced the positive correlation reported between HA and glucose [3,4] at lower glucose uptake rates (Fig. C.1(a)). Simulations also showed an increase in HA flux with increasing glutamine uptake (Fig. C.1(b)), as previously reported [5–7]. The simulations further indicated a decrease in HA flux with increasing lactate production flux or flux towards fructose-6-phosphate (Fig. C.1(c) and (d), respectively), as reported earlier [3]. Finally, in silico HA flux increased with additional ATP formation flux (Fig. C.1(e)), in agreement with previous reports on the energy-intensive nature of HA synthesis [8].

We also examined the theoretical maximum HA flux that can be produced under the given conditions of metabolite uptake, secretion and growth rates. Ideally, this is obtained by setting the HA flux as the objective function. However, in this case, an objective function that maximized the product yielded zero biomass, while one that maximized biomass yielded zero product flux. Because biomass and HA fluxes are mutually exclusive at these optima, the standard theoretical maximum corresponds to an impractical zero-growth scenario.

To surmount this problem, the derivation of the theoretical maximum was slightly modified. To simulate a condition with a minimum non-zero growth, the HA-maximizing objective was augmented with a lower bound on biomass flux corresponding to a growth rate of 0.1 h⁻¹, for which experimental steady-state values were available. With all other flux bounds retained at the same values, the maximum HA flux under these conditions was 19.107 mmol/(g DCW·h). By comparison, the HA flux currently obtained in the recombinant L. lactis system at steady state in a CSTR at a dilution rate of 0.1 h⁻¹ is 0.023 mmol/(g DCW·h). This roughly 830-fold gap (nearly three orders of magnitude) between the theoretical maximum and the observed HA flux indicates considerable room for improvement in this organism. The simulations also show a steady decrease in the theoretical maximum HA flux with increasing enforced minimum biomass flux (Fig. C.2). This indicates that a small compromise on growth rate may substantially increase the theoretical maximum HA flux.
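The modified theoretical maximum described above can be written as a standard flux balance linear program with one additional constraint. The notation is introduced here for illustration and does not appear in the original: S is the stoichiometric matrix, v the flux vector with lower and upper bounds, and μ_min the enforced minimum growth rate (0.1 h⁻¹ in this case).

```latex
\begin{aligned}
\max_{v}\quad & v_{\mathrm{HA}} \\
\text{s.t.}\quad & S\,v = 0, \\
& v^{\mathrm{lb}}_j \le v_j \le v^{\mathrm{ub}}_j \quad \forall j, \\
& v_{\mathrm{biomass}} \ge \mu_{\min}
\end{aligned}
```

Dropping the last constraint recovers the standard formulation, which here yields zero biomass; sweeping μ_min over a range of values corresponds to the kind of trade-off curve shown in Figure C.2.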


Figure C.1: Correlation between theoretical maximum HA flux and (a) glucose uptake, (b) glutamine uptake, (c) lactate production, (d) fructose-6-phosphate production and (e) additional ATP formation.


Figure C.2. Plot of predicted theoretical maximum HA flux for different biomass fluxes.

C.2 Knock-outs predicted to increase the theoretical maximum

Several algorithms have been developed to date for identifying knock-out targets [9–13]. The most widely used of these is OptKnock, a bi-level framework that identifies genes whose removal simultaneously maximizes growth rate and product flux [9]. Analyzing our model with OptKnock suggested three knock-out gene targets: lactate dehydrogenase (ldh), alcohol dehydrogenase (adh) and acetate kinase (ak) (Table C.1(a)). Of these, only the ldh knock-out simulation showed a significant increase in the theoretical maximum HA flux; ldh is therefore the foremost knock-out target from this analysis. Previous reports of high HA titers from mutant strains lacking the ldh gene support this prediction [14].
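For orientation, the general shape of the OptKnock bi-level problem [9] is sketched below, specialized to HA as the product. This is the generic form only and not necessarily the exact settings used in this analysis; the binary variables y_j (0 when reaction j is knocked out) and the knock-out limit K are notation introduced here.

```latex
\begin{aligned}
\max_{y}\quad & v_{\mathrm{HA}} \\
\text{s.t.}\quad & v \in \arg\max_{v}\ \bigl\{\, v_{\mathrm{biomass}} \;:\; S\,v = 0,\ \
  v^{\mathrm{lb}}_j\, y_j \le v_j \le v^{\mathrm{ub}}_j\, y_j \ \ \forall j \,\bigr\}, \\
& y_j \in \{0,1\} \quad \forall j, \qquad \sum_j (1 - y_j) \le K
\end{aligned}
```

The inner problem models the cell maximizing growth for a given set of deletions, while the outer problem selects the deletions that give the highest product flux at that growth optimum.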

C.3 Knock-outs to reach the theoretical maximum

This section stems from the orders-of-magnitude difference between the theoretical maximum HA flux and the flux obtained at steady state in the organism. The difference indicates that knock-outs that take the system closer to the theoretical maximum from its current state could have a bigger impact on HA flux than those predicted to increase the theoretical maximum itself (§C.2). To identify these, we listed the reactions that are switched off (fluxes go from a non-zero value to zero) when the objective is shifted from maximizing biomass to maximizing HA. The major reactions put forth as knock-out targets in this category belonged to glycolysis and the pentose phosphate pathway (Table C.1(b)). This signifies that the major roadblock to achieving the theoretical maximum HA flux is competition for carbon from pathways related to growth and energy production.
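The comparison described above amounts to a set difference between two flux distributions. A minimal sketch follows, assuming each solution is available as a mapping from reaction identifier to flux; the identifiers and flux values are invented for illustration and are not results from this study.

```python
def switched_off(flux_max_biomass, flux_max_ha, tol=1e-9):
    """Reactions carrying flux when biomass is maximized but none when HA is maximized.

    A reaction missing from the HA-maximizing solution is treated as zero flux.
    """
    return sorted(
        rxn
        for rxn, v in flux_max_biomass.items()
        if abs(v) > tol and abs(flux_max_ha.get(rxn, 0.0)) <= tol
    )


# Made-up flux values purely for illustration; not results from the study.
biomass_case = {"G6PDH": 1.2, "PGL": 1.2, "HAS": 0.0, "PGI": 4.0}
ha_case = {"G6PDH": 0.0, "PGL": 0.0, "HAS": 2.5, "PGI": 3.1}
candidates = switched_off(biomass_case, ha_case)
```

The tolerance guards against solver round-off, since fluxes that are zero in principle are often returned as very small non-zero numbers.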


Table C.1. (a) Knock-outs to increase theoretical maximum (OptKnock) (b) Knock-outs to reach theoretical maximum.

(a)
S. No. | Knocked out | ∆Theoretical maximum
1 | Lactate dehydrogenase | 0.167
2 | Alcohol dehydrogenase | 0.004
3 | Acetate Kinase | E^-5

(b)
Knock-outs to reach theoretical maximum
Glucose-6-phosphate dehydrogenase
Phosphogluconate dehydrogenase
6-phosphogluconolactonase
''
L-threonine transport

References:

1. Farthing, D., Sica, D., Gehr, T., Wilson, B., Fakhry, I., Larus, T., Farthing, C., and Karnes, H.T. (2007). An HPLC method for determination of inosine and hypoxanthine in human plasma from healthy volunteers and patients presenting with potential acute cardiac ischemia. J. Chromatogr. B 854, 158–164. Available at: http://linkinghub.elsevier.com/retrieve/pii/S1570023207002814.
2. King, Z.A., Dräger, A., Ebrahim, A., Sonnenschein, N., Lewis, N.E., and Palsson, B.O. (2015). Escher: A Web Application for Building, Sharing, and Embedding Data-Rich Visualizations of Biological Pathways. PLOS Comput. Biol. 11, e1004321. Available at: http://dx.plos.org/10.1371/journal.pcbi.1004321.
3. Badle, S.S., Jayaraman, G., and Ramachandran, K.B. (2014). Ratio of intracellular precursors concentration and their flux influences hyaluronic acid molecular weight in Streptococcus zooepidemicus and recombinant Lactococcus lactis. Bioresour. Technol. 163, 222–227. Available at: http://dx.doi.org/10.1016/j.biortech.2014.04.027.
4. Prasad, S.B., Jayaraman, G., and Ramachandran, K.B. (2010). Hyaluronic acid production is enhanced by the additional co-expression of UDP-glucose pyrophosphorylase in Lactococcus lactis. Appl. Microbiol. Biotechnol. 86, 273–283. Available at: http://link.springer.com/10.1007/s00253-009-2293-0.
5. Blank, L.M., McLaughlin, R.L., and Nielsen, L.K. (2005). Stable production of hyaluronic acid in Streptococcus zooepidemicus chemostats operated at high dilution rate. Biotechnol. Bioeng. 90, 685–693. Available at: http://doi.wiley.com/10.1002/bit.20466.
6. Im, J.-H., Song, J.-M., Kang, J.-H., and Kang, D.-J. (2009). Optimization of medium components for high-molecular-weight hyaluronic acid production by Streptococcus sp. ID9102 via a statistical approach. J. Ind. Microbiol. Biotechnol. 36, 1337–1344. Available at: http://link.springer.com/10.1007/s10295-009-0618-8.
7. Shah, M. V., Badle, S.S., and Ramachandran, K.B. (2013). Hyaluronic acid production and molecular weight improvement by redirection of carbon flux towards its biosynthesis pathway. Biochem. Eng. J. 80, 53–60. Available at: http://dx.doi.org/10.1016/j.bej.2013.09.013.
8. Liu, L., Du, G., Chen, J., Wang, M., and Sun, J. (2008). Enhanced hyaluronic acid production by a two-stage culture strategy based on the modeling of batch and fed-batch cultivation of Streptococcus zooepidemicus. Bioresour. Technol. 99, 8532–8536. Available at: http://linkinghub.elsevier.com/retrieve/pii/S096085240800179X.
9. Burgard, A.P., Pharkya, P., and Maranas, C.D. (2003). Optknock: A bilevel programming framework for identifying gene knockout strategies for microbial strain optimization. Biotechnol. Bioeng. 84, 647–657. Available at: http://doi.wiley.com/10.1002/bit.10803.
10. Pharkya, P., Burgard, A.P., and Maranas, C.D. (2004). OptStrain: A computational framework for redesign of microbial production systems. Genome Res. 14, 2367–2376. Available at: http://www.genome.org/cgi/doi/10.1101/gr.2872004.
11. Pharkya, P., and Maranas, C.D. (2006). An optimization framework for identifying reaction activation/inhibition or elimination candidates for overproduction in microbial systems. Metab. Eng. 8, 1–13. Available at: http://linkinghub.elsevier.com/retrieve/pii/S1096717605000704.
12. Tepper, N., and Shlomi, T. (2010). Predicting metabolic engineering knockout strategies for chemical production: accounting for competing pathways. Bioinformatics 26, 536–543. Available at: http://bioinformatics.oxfordjournals.org/cgi/doi/10.1093/bioinformatics/btp704.
13. Yang, L., Cluett, W.R., and Mahadevan, R. (2011). EMILiO: A fast algorithm for genome-scale strain design. Metab. Eng. 13, 272–281. Available at: http://dx.doi.org/10.1016/j.ymben.2011.03.002.
14. Kaur, M., and Jayaraman, G. (2016). Hyaluronan production and molecular weight is enhanced in pathway-engineered strains of lactate dehydrogenase-deficient Lactococcus lactis. Metab. Eng. Commun. 3, 15–23. Available at: http://linkinghub.elsevier.com/retrieve/pii/S2214030116300037.
