A synthetic biology application in metabolic engineering

by

Nikolaos Anesiadis

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy

Graduate Department of Chemical Engineering and Applied Chemistry

University of Toronto

Copyright © 2014 by Nikolaos Anesiadis

Abstract

A synthetic biology application in metabolic engineering

Nikolaos Anesiadis

Doctor of Philosophy

Graduate Department of Chemical Engineering and Applied Chemistry

University of Toronto

2014

Since the 1970s, bioprocess engineering has focussed on the optimization of the production of chemicals via biological transformations. In particular, much emphasis has been placed on estimating the optimal process variables to maximize the production of the desired chemical. Notably, engineers were limited to using macroscopic process variables, such as the feed rate of the bioreactor. Optimization involves the trade-off between productivity and yield. High values of both metrics are required for a viable plant; however, the two metrics are in competition. Recently, the emergence of synthetic biology has enabled bioengineers to extend the optimization of bioprocesses from the macroscopic level to the genetic level.

With this in mind, we propose a novel synthetic biology approach for bioprocess optimization. Our case study involves a lactic acid-producing Escherichia coli strain with the adh (alcohol dehydrogenase) and pta (phosphotransacetylase) genes deleted. Deletion of these genes increases the yield of lactic acid but, at the same time, drastically decreases growth rate and productivity. First, we introduce the model-based design of an integrated genetic circuit that links a density sensory mechanism to a dynamic genetic controller, and subsequently to bacterial metabolism. In this way, the genetic circuit dynamically controls genes that contribute to growth and productivity. Then, we conduct a mathematical analysis of the model to guide the initial design and further optimization of the integrated circuit. The analysis can minimize the time required to design and troubleshoot the genetic circuit. It also showed that the induction time is the most important process variable we can optimize. Finally, we carried out experiments in an attempt to use the genetic toggle switch as a controller to manipulate genes adh and pta in an ON-OFF fashion.

While we expected to observe some growth restoration and productivity improvement, it is common for synthetic biology constructs to behave differently in different environments or strains. Indeed, the experimental results show that our assumption that the genetic toggle switch will restore wild-type levels of adh-pta expression may not be true. In summary, this work introduces a novel synthetic biology approach for the optimization of bioprocesses and attempts a proof-of-concept implementation of the strategy. Although the initial implementation was not successful, we carried out some troubleshooting of the problems involved and give suggestions for future experiments.

Acknowledgements

I am fortunate to have many people to acknowledge for the help, support and contributions that made this work possible. First, I thank my supervisors, Professor W. R. Cluett and Professor R. Mahadevan, for their continuous support and advice. They have both been an inspiration to me and they have stimulated my intellectual curiosity. To my advisory committee, Professor E. Edwards, Professor E. Master and Professor A. Yakunin, thank you for your advice and support. Your suggestions have improved this thesis in all respects; your encouragement is greatly appreciated.

Biozone made graduate school a great experience. I thank all the professors, students, and staff for creating a positive and pleasant environment. In particular, Laurence Yang, Victor Balderas, Pratish Gawand and Naveen Venayak have been very knowledgeable lab partners and friends. Susie Susilawati, Christina Heidorn and Weijun Gao have also been a continuous source of support.

Collaborators outside the University of Toronto have been tremendously helpful. Professor V. Martin and Dr. A. Ekins from Concordia University, Dr. H. Kobayashi from the Japan Agency for Marine-Earth Science and Technology (JAMSTEC), and Professor S. Fong from the Virginia Commonwealth University have shared their experience with our group.

Finally, I want to thank my family: my parents Anastasia and Michael, my brother Stelio, and my wife Azi for their constant trust, love and support.

Contents

List of Tables

List of Figures

List of Abbreviations and Symbols

1 Introduction

1.1 Motivation

1.2 Challenges and objectives

1.3 Contributions

1.3.1 Model-based design of dynamic metabolic engineering

1.3.2 Analysis of dynamic strategy

1.3.3 Experimental implementation of the dynamic strategy

1.3.4 Additional contributions

2 Literature review

2.1 Experimental and computational approaches to metabolic engineering

2.1.1 Experimental approaches

2.1.2 Computational approaches

2.1.3 The trade-off between yield and productivity of a process

2.2 Dynamic control of gene expression for bioprocesses optimization

2.2.1 Dynamic metabolic optimization

2.2.2 Comparison of the dynamic strategy to existing methods

2.3 Quorum sensing in bacteria

2.3.1 The mechanism of the Lux system

2.3.2 Potential for quorum sensing applications

2.4 Genetic controllers

2.4.1 The first synthetic biology construct: toggle switch

2.4.2 Model-based design of the toggle switch

2.4.3 Construction of the toggle switch

2.4.4 Logic gates

2.5 Density-dependent genetic networks

2.5.1 Density-dependent applications

2.5.2 Challenges

2.6 Lactate production in microbial hosts

2.6.1 Natural producers and strains

2.6.2 Lactate from E. coli strains

2.6.3 Anaerobic production in E. coli

2.6.4 Dual-phase production in E. coli

2.6.5 Summary

2.7 Summary and synthesis

2.7.1 Summary of the literature review

2.7.2 Synthesis and outline

3 Model-based design for dynamic metabolic engineering

3.1 Methods

3.1.1 Quorum sensing modelling

3.1.2 Toggle switch modelling

3.1.3 Dynamics of the genetic circuit

3.1.4 Strain design for serine production

3.1.5 Coupling the genetic circuit to the serine-producing strain

3.2 Results

3.2.1 Static strategy for serine production

3.2.2 Dynamic strategy for serine production

3.2.3 Comparison of static and dynamic strategy

3.3 Conclusions

4 Mathematical analysis of the dynamic strategy

4.1 Introduction

4.2 Methods

4.2.1 Global sensitivity analysis

4.2.2 Analysis of the most sensitive parameters

4.3 Results

4.3.1 Ideal dynamic strategy

4.3.2 Global sensitivity analysis

4.3.3 Effect of αC and γC

4.3.4 Effect of αC and LuxR

4.3.5 Effect of γC and LuxR

4.3.6 Summary on the effects of changing two parameters at a time

4.3.7 Effect of all three parameters

4.3.8 Preliminary design considerations

4.4 Conclusions

5 Experimental implementation of the dynamic strategy

5.1 Materials and methods

5.1.1 Strains and plasmids

5.1.2 Media and growth conditions for strain SUC-AE

5.1.3 Media and growth conditions for strains SUC-AN and LAC-AN

5.1.4 Genetic methods

5.1.5 Analytical techniques

5.2 Aerobic succinate production

5.3 Anaerobic lactate production: protocol development

5.3.1 Preliminary characterization of the lactate-producing strain

5.3.2 Expression of pTOG(pta) in minerals medium

5.3.3 Use of Luria broth as a supplement

5.3.4 Minimizing the use of Luria-Bertani supplement

5.3.5 Using inexpensive supplements

5.3.6 Different induction times

5.3.7 Bioreactor experiment

5.3.8 Use of pH buffer in the inoculum preparation

5.3.9 Protocol

5.4 Anaerobic lactate production: characterization

5.4.1 Characterization of wild-type and mutant in 100 mM of glucose

5.4.2 Characterization of wild-type and mutant in 50 mM of glucose

5.4.3 Characterization of the toggle switch in 50 mM of glucose

5.4.4 Synopsis of the batch characterizations

5.4.5 Conclusions

5.5 Troubleshooting

5.5.1 Individual signal testing

5.5.2 Sequencing

5.5.3 Flow cytometry

5.5.4 Conclusions

6 Conclusions and recommendations for future work

6.1 Conclusions

6.2 Recommendations for future work

Bibliography

Appendix

A Strain design for serine production

B Matlab code

B.1 M-files for Chapter 3

B.1.1 Dynamics of the genetic circuit (section 3.1.3)

B.1.2 Production envelope of strain designs predicted by EMILiO (section 3.1.4)

B.1.3 Static strategy for serine production (section 3.2.1)

B.1.4 Dynamic strategy for serine production (section 3.2.2)

B.2 M-files for Chapter 4

B.2.1 Ideal dynamic strategy (section 4.3.1)

B.2.2 Global sensitivity analysis (section 4.3.2)

B.2.3 Effect of αC and γC (section 4.3.3)

B.2.4 Effect of αC and LuxR (section 4.3.4)

B.2.5 Effect of γC and LuxR (section 4.3.5)

B.2.6 Effect of all three parameters (section 4.3.7)

C Mathematical analysis of the dynamic strategy

D Phase plane analysis of the genetic circuit

D.1 Introduction

D.2 Methods

D.3 Results

E Supplemental data for Chapter 5

F Carbon balances of the bioreactor experiments

G Plasmid sequences

G.1 Plasmid pTOG(adh,pta)

G.2 Plasmid pTAK132

G.3 Plasmid pTAK131

H Time profiles of triplicate experiments

List of Tables

2.1 The most common metabolic engineering strategies.

2.2 Comparison of some strain design algorithms for metabolic engineering.

2.3 Three simple digital logic gates: (a) NOT, (b) OR, and (c) AND. (a) For a repressor-promoter system, the absence of the repressor (i.e., input is ‘0’) results in the expression of the gene (i.e., output is ‘1’). The presence of the repressor (i.e., input is ‘1’) results in the repression of the gene (i.e., output is ‘0’). (b, c) For an activator-promoter system, an input of ‘1’ indicates the presence of an activator, and ‘0’ the absence. An output of ‘1’ indicates expression, and ‘0’ repression of the gene.

2.4 The NAND and NOR gates can be used to apply the dynamic ON-OFF strategy suggested by Gadkar et al. (2005b).

2.5 Comparison of natural lactate-producing and yeast strains.

2.6 Comparison of lactate-producing strains under strictly anaerobic conditions. Productivity: g/l/h; yield: g/g; titer: g/l.

2.7 Comparison of lactate-producing strains under two-phase aerobic-anaerobic conditions. Productivity: g/l/h; yield: g/g; titer: g/l.

2.8 Summary of dynamic optimization approaches to metabolic engineering.

3.1 Parameter values of the quorum sensing

3.2 Parameter values of the toggle switch

3.3 Growth rate and serine flux under aerobic conditions for a glucose uptake rate of 10 mmol/gDW/h from flux balance analysis.

3.4 Comparison of objective function values for static and dynamic strategies. The dynamic strategy is based on the nominal parameter values of the genetic circuit. The best values are highlighted in bold.

5.1 List of strains and plasmids

5.2 Gene expression scheme in the toggle switch. The symbol + indicates expression; the symbol – indicates repression.

5.3 Growth characteristics of wild-type, mutant and the mutants transformed with the pTOG(adh, pta) plasmid. The bioengineering objectives are estimated at the end of the batch from HPLC measurements. Lactate yield, productivity, and titer are in bold.

F.1 Carbon balance for the wild-type characterization. Yeast extract is not included in the total. The estimate of carbon dioxide is from the FBA model.

F.2 Carbon balance for the mutant characterization (static strategy). Yeast extract is not included in the total. The estimate of carbon dioxide is from the FBA model.

F.3 Carbon balance for the dynamic strategy characterization. The estimate of carbon dioxide is from the FBA model.

List of Figures

2.1 A simple schematic of a hypothetical production envelope. (A) The maximum theoretical yield can be calculated by setting growth rate equal to zero and maximizing the flux of the desired product. (B) Common solutions of strain design algorithms are located between the two extremes. (C) The wild-type operating point has maximum growth rate, while the desired product is normally not produced.

2.2 The production envelope of the dynamic strategy. (A) Initially, the growth phase operates as the wild-type point (high growth rate, no production). (B) Then, the second phase operates as the mutant, inducing the production phase (high production rate, lower growth rate).

2.3 The mechanism of quorum sensing in Vibrio fischeri: (a) low cell concentration, (b) high cell concentration.

2.4 (A) The bistable region lies between the dashed lines. Strong and similar strength promoters (i.e., the dimensionless parameters α1 and α2) facilitate a stable and robust bistable switch. (B) High cooperativity (parameters n1, n2 > 1 in the model) increases the area of the bistable region. If the cooperativity factor is low (e.g., 1.1), noise can induce random transition between the states. Figure adopted from Gardner et al. (2000). Permission to use the figure has been granted by the publisher.

2.5 (A) In the first state, gene lacI is expressed, and protein LacI represses the promoter Ptrc. Thus, lacI is in the ON-state, and cI is in the OFF-state. Expression of lacI is effected by inactivation of the temperature-sensitive CI repressor at 42 °C. (B) In the second state, gene cI is expressed, and protein CI represses the promoter PL. Thus, cI is in the ON-state, and lacI is in the OFF-state. Expression of cI is induced by inactivation of the LacI repressor protein with IPTG. Red circles represent inactive proteins, green circles represent active proteins.

2.6 The boxes P, S and C represent the process, sensor and controller, respectively. Variables u and y are the input and output. The measured output signal is ym, the set-point is ysp, and the difference between them is the error e.

2.7 The density-dependent expression of gfp from Kobayashi et al. (2004). Red colour indicates a repressed gene, green indicates an activated gene. (A) At low cell density, the AHL and LuxR production from the sensor plasmid are low. As a result, lacI expression is low, and expression of cI in the controller plasmid is ON, thus repressing the reporter gfp. (B) At densities above a certain threshold, the AHL-LuxR complex formed triggers expression of lacI. In return, cI is repressed, leading to expression of the reporter protein GFP.

3.1 The negative feedback design for density-dependent control of gene expression. Cell density is sensed through the signalling molecule AHL with the quorum sensing module. The AHL signal feeds the toggle switch depending on its value relative to the threshold. The manipulated genes are placed downstream of gene cI in the toggle switch, and therefore follow the profile of protein CI. If the value of AHL is below the threshold, the manipulated genes are expressed. When AHL is higher than the threshold, the genes are repressed. Genes that contribute to high growth rate are placed under the control of the toggle switch, thus linking the toggle switch and metabolism.

3.2 The quorum sensing plasmid. The signalling molecule AHL is proportional to biomass concentration. Once AHL reaches a threshold concentration, it binds to LuxR, forms a dimer complex, and activates the Plux promoter.

3.3 Design of the genetic controller plasmid and the link to the sensor plasmid. Initially, the sensor plasmid does not express the three genes, and CI protein is present. The figure shows the induction of the quorum sensing and lacI, which in turn represses gene cI.

3.4 Time profiles of the genetic circuit variables (A: biomass; B: AHL; C: LuxR/AHL complex; D: LacI protein; and E: CI protein). The simulations support the conclusion of Gardner et al. (2000) that the dynamics of switching a gene off (E) are notably faster than the dynamics of switching a gene on (D). This is to our advantage, since a sharp ON-OFF switch is required by Gadkar et al. (2005a) for optimization. The MATLAB code to generate the figure is given in the Appendix B.1.1.

3.5 Production envelope of E. coli strain design predicted by EMILiO. The baseline strain includes 7 fine-tuned fluxes (A). The serine-producing strain includes 3 knockouts and 7 fine-tuned fluxes (B). The MATLAB code to generate the figure is given in the Appendix B.1.2.

3.6 Static strategy profiles of biomass (A), growth rate (B), serine (C) and glucose (D) concentration of the mutant with 20 mM initial glucose concentration. The MATLAB code to generate the figure is given in the Appendix B.1.3.

3.7 Dynamic profiles of biomass and growth rate (A), serine (B) and glucose (D) concentration. Manipulated fluxes ACALD and LSERDH (C) are under the control of the toggle switch. Nominal values of the parameters were used here. The MATLAB code to generate the figure is given in the Appendix B.1.4.

4.1 Productivity (A), yield (B), serine titer (C) and batch time (D) as a function of the switching time. The maximum theoretical productivity is approximately 29.6% higher than the static strategy (i.e., switching time=0). The inset in panel A refers to a flux profile corresponding to a switching time of 4 h. The MATLAB code to generate the figure is given in the Appendix B.2.1.

4.2 Sensitivity indices of parameters γC (A), αC (B) and LuxR (C) for serine concentration. Total (ST), interaction (Inter.) and individual (Indiv.) indices are shown. The ranges refer to sensitivity values obtained every hour over the batch. Note that the variation of the sensitivity indices across time is not significant. The boxes show the lower quartile, the median and the upper quartile values. The whiskers represent 1.5 times the interquartile range. The MATLAB code to generate the figure is given in the Appendix B.2.2. The parameters are: γC, the CI protein degradation rate constant (h−1), αC, the CI protein production rate constant (µM h−1), and LuxR, the protein LuxR concentration (µM).

4.3 The genetic circuit consists of the sensor and the genetic controller plasmids (with genes sda, ydfG and mhpF controlling the fluxes of ACALD and LSERDHr in the toggle switch). The most significant parameters as identified by the global sensitivity analysis are highlighted in bold (αC, γC and LuxR concentration).

4.4 Effect of αC-γC on productivity (A), yield (B), titer (C), batch and switching time (D). The blue point shows the nominal values of parameters αC and γC. The flat surface in (A) shows the productivity of the static strategy. The MATLAB code to generate the figure is given in the Appendix B.2.3.

4.5 Effect of αC-LuxR on productivity (A), yield (B), titer (C), batch and switching time (D). The blue point shows the nominal values of parameters αC and LuxR. The MATLAB code to generate the figure is given in the Appendix B.2.4.

4.6 Effect of γC-LuxR on productivity (A), yield (B), titer (C), batch and switching time (D). The blue point shows the nominal values of parameters γC and LuxR. The MATLAB code to generate the figure is given in the Appendix B.2.5.

4.7 Effect of αC-γC-LuxR on productivity and yield. Isosurfaces are shown for values of productivity higher than 2.9 and 2.8 mmol serine/l/h and yield higher than 1.2 and 1.1 mmol serine/mmol glucose. The MATLAB code to generate the figure is given in the Appendix B.2.6.

4.8 Optimal parameter design space of αC and γC to achieve productivity higher than 2.9 mmol serine/l/h and yield higher than 1.2 mmol serine/mmol glucose. This figure is the top view of 4.7.A. Values of LuxR of this volume are between 0.5 and 1.5 µM.

5.1 The toggle switch plasmid backbone

5.2 Dynamics of the pTOG(adh,pta) plasmid. Genes lacI, adh and pta are expressed with heat shock, and repressed with the addition of IPTG.

5.3 Gene deletions for the aerobic succinate production strain: 1. ptsG (glucose PTS permease), 2. poxB (pyruvate oxidase), 3. ackA-pta (acetate kinase-phosphate acetyltransferase), 4. iclR (iclR transcriptional repressor), and 5. sdhAB (succinate dehydrogenase).

5.4 Optical density profiles for different induction times (flask experiment in LB supplemented with approximately 55 mM glucose). Induction time is 18 hr in A (triplicate experiments and one standard deviation shown) and 12 hr in B (duplicate experiments). The cells carrying the pTOG plasmid grow faster than the quintuple mutant.

5.5 Optical density profiles for induction time 18 hr (single flask experiment in minerals medium with approximately 55 mM glucose)

5.6 Characterization of the succinate-producing strain and effect of the dynamic expression of ptsG (single experiment). A: optical density at 600 nm, B: pyruvate concentration, C: glucose concentration, D: acetate concentration. The arrow at 12 hr in A indicates induction with IPTG. Succinate amounts were below 1 mM.

5.7 The genes adh and pta are deleted to decrease byproduct formation. The growth rate is severely affected since NAD+ regeneration and ATP production are decreased as a result of the gene deletions.

5.8 Optical density and metabolite profiles. A single experiment was conducted in 100 ml bottles and minimal medium supplemented by approximately 55 mM glucose.

5.9 Optical density profiles of mutant and mutant carrying the pTOG(pta) plasmid for two experiments in minerals medium with 55 mM glucose. The experiment was repeated two more times, in triplicates, and no growth was observed for the transformed mutant.

5.10 Optical density profiles of mutant and mutant carrying the pTOG(pta) plasmid for two experiments in minerals medium, supplemented with 25 g/l of LB and 11 mM glucose

5.11 Optical density, glucose and lactate time profiles for Control 1 (single experiment)

5.12 Optical density, glucose and lactate time profiles for Control 2 (single experiment)

5.13 Optical density, glucose and lactate time profiles for 0.1 g/l of peptone and 0.05 g/l of yeast extract supplements (single experiment)

5.14 Optical density profiles for induction times of 0 and 5 hours. The inoculum was prepared in different tubes for every experiment and thus may be a source of the large variation. The results are the average of triplicate experiments and one standard deviation.

5.15 Optical density profiles for induction times of 0 and 4 hours. The inoculum was prepared in the same tubes. Although the variation was smaller than the previous experiment, the error bars still overlap. The results are the average of triplicate experiments and one standard deviation.

5.16 Glucose and lactate concentration after 9 hr. The results are the average of triplicates and the error bars represent one standard deviation.

5.17 Optical density of the mutant carrying pTOG(gfp) and pTOG(pta) for a single experiment. The cultures did not grow.

5.18 The use of MOPS as a pH buffer allows the cells to grow for longer than 10 hr in the inoculum preparation stage. One experiment was conducted per case.

5.19 The protocol developed for running bioreactor and bottle experiments

5.20 Characterization of wild-type (A and B) and mutant (C and D) in 100 mM of glucose in a single experiment

5.21 Characterization of wild-type (A and B) and mutant (C and D) in 50 mM of glucose in a single experiment

5.22 Characterization of the mutant ∆(adh,pta) transformed with the plasmid pTOG(adh,pta) in 50 mM of glucose. (A-B) Heat shock was applied for 30 min at the beginning of the batch and no IPTG was added in the culture (single experiment). (C-D) The first phase of the experiment was carried out in 3 bottles of 100 ml each, with IPTG (2 mM) for 7 hr. Then, cells were washed 3 times to remove IPTG (between 7 and 10 hr). The second phase was carried out in the bioreactor with 300 ml of media, with heat shock for 30 min. The results shown in C and D are the average of triplicates and one standard deviation is also shown.

5.23 Characterization of the wild-type (square), mutant ∆(adh,pta) (circle), mutant ∆(adh,pta)-pTOG(adh,pta) with IPTG (triangle), mutant ∆(adh,pta)-pTOG(adh,pta) without IPTG (diamond). A: optical density, B: glucose, C: lactate. Average of triplicate experiments and one standard deviation is shown.

5.24 Gene cI sequences from plasmids pTAK131, pTAK132, and pTOG(adh,pta). A: The single nucleotide mutations on the cI gene are shown as yellow symbols at the bottom of the figure. The frame mutation is emphasized in the square box. B: The frame mutation is shown in this zoomed-in view of the area around the mutation.

5.25 The difference in the ribosomal binding site for plasmids pTAK132 (top) and pTAK131 (bottom) is highlighted in the sequences.

5.26 Map of plasmids pTAK132 and pTAK131. The genes adh and pta are also shown here to understand the expression pattern in the pTOG(adh,pta) plasmid.

5.27 Analysis of plasmids pTAK132 (A and B) and pTAK131 (C and D) in strain MG1655 by flow cytometry showing two treatments: heat shock at 42 °C without IPTG is tested in A and C; IPTG at 30 °C is tested in B and D. Plasmid pTAK132 always expresses GFP with both treatments; thus, it is a monostable switch and genes adh, pta are never expressed. Plasmid pTAK131 expresses YFP with heat shock as expected. Also, YFP is repressed in the presence of IPTG, also as expected.

List of Abbreviations and Symbols

A    AHL concentration

ACALD    acetaldehyde dehydrogenase flux

ackA    acetate kinase

AcP    acetyl phosphate

ACS    acetyl-CoA synthetase flux

adh    alcohol dehydrogenase

AHL    acylhomoserine lactone

ATC    anhydrotetracycline

C    CI repressor protein concentration

CBM    constraint-based modelling

cI    gene expressing protein CI

CI    repressor protein of the λ phage

dFBA    dynamic flux balance analysis

DSred2    red fluorescence protein

DySScO    Dynamic Strain Scanning Optimization algorithm

edd    6-phosphogluconate dehydratase

EMILiO    Enhancing Metabolism with Iterative Linear Optimization algorithm

FACS    fluorescence-activated cell sorting

FBA    flux balance analysis

focA    formate transporter

frd    fumarate reductase

GFP    green fluorescence protein

glnAp2    promoter of the nitrogen regulation system

gnd    gluconate-6-phosphate dehydrogenase

iclR    isocitrate lyase regulator

idi    isopentenyl diphosphate isomerase

IPTG    isopropyl β-D-1-thiogalactopyranoside

L    LacI repressor protein concentration

LAB    lactic acid bacteria

lacI    gene of the lac operon

LacI    repressor protein

LB    Luria-Bertani

ldh    lactate dehydrogenase

LP    linear programming

LSERDHr    L-serine dehydrogenase flux

lux    luciferase gene system

luxA-E,G,I    genes of the luciferase system

luxR    gene of the luciferase system

LuxR    protein product of the luxR gene and concentration of the protein

mhpF    acetaldehyde dehydrogenase 2

MOPS    3-(N-morpholino)propanesulfonic acid

MTHFD    methylenetetrahydrofolate flux

NADH    nicotinamide adenine dinucleotide

NADPH    nicotinamide adenine dinucleotide phosphate

Ntr    nitrogen regulation system

O.D.    optical density

pdc    pyruvate decarboxylase

PDH    pyruvate dehydrogenase

PEP    phosphoenolpyruvate

pfl    pyruvate formate lyase

PFL    pyruvate formate lyase flux

PGCD    phosphoglycerate dehydrogenase flux

pgi    glucosephosphate isomerase

PLA    polylactic acid

poxB    pyruvate oxidase

ppc    phosphoenolpyruvate carboxylase

pps    phosphoenolpyruvate synthase

PSE    process systems engineering

pta    phosphotransacetylase

PTAr    phosphotransacetylase flux

PTS    phosphotransferase system

ptsG    glucose phosphotransferase

R    AHL/LuxR complex concentration

RBS    ribosomal binding sites

S    stoichiometric matrix of the reactions

sda    L-serine deaminase

sdhAB    succinate dehydrogenase

SERDL    L-serine deaminase flux

TCA    tricarboxylic acid

TRPAS2    tryptophanase

vC    manipulated fluxes

vgrowth    growth rate obtained from FBA

vj    flux j

vmin    vector of lower limits of the fluxes

vmax    vector of upper limits of the fluxes

vproduct    vector of product fluxes

X    biomass concentration

ydfG    3-hydroxy acid dehydrogenase

Y.E.    yeast extract

YFP    yellow fluorescence protein

zwf    glucose-6-phosphate dehydrogenase

Greek letters

αL,C    LacI and CI production rate constant

βC,L    CI and LacI repression coefficient

γA,R,C,L    AHL, complex, CI and LacI degradation rate constant

θR    LuxR/AHL dimer activation coefficient

µ    growth rate

νA    AHL production rate constant

ρR    LuxR/AHL dimerization constant

Chapter 1

Introduction

1.1 Motivation

With fossil fuels becoming scarcer and more expensive, biotechnology offers an alternative way to produce fuels, chemicals, drugs, and proteins. Using microorganisms, such as bacteria and algae, as the catalysts of conversions offers several advantages. In addition, the use of renewable resources or agricultural waste, and the reduction of carbon dioxide emissions lead to sustainable processes. Although microorganisms naturally produce a wide variety of chemicals, further engineering is necessary to either introduce new biosynthetic pathways or improve existing ones.

In metabolic engineering, biochemical networks are genetically engineered to improve the production of a desired chemical. The main principle involves altering the metabolic pathways in order to guarantee that the resources and metabolic intermediates are supplied at the appropriate levels to maximize product yield. Chemicals naturally produced by the cells and relatively simple products require only a few genetic manipulations to improve yield. In simple cases, elimination of competing pathways or overexpression of certain genes is sufficient to achieve yield targets. Heterologous or more complex products require optimization and convergence of more pathways. If key cellular precursors are exhausted or toxic intermediates accumulate, the result can be deleterious to the cells. The complex nature of metabolic networks suggests that precise control of gene expression is necessary for system-level optimization.

Aside from yield, productivity has an impact on the economic viability of large-scale production. For low-volume, high-cost chemicals, the emphasis is on maximizing the yield of the product. In contrast, the trade-off between yield and productivity is critical for the feasibility of a high-volume, low-cost (bulk) chemical production process. Productivity becomes a significant parameter in the latter cases since the profit margin is lower. Low productivity is the main reason biofuel production processes cannot yet compete with petroleum-based fuel processes (Holtz and Keasling, 2010).

The trade-off between yield and productivity can be elucidated with the conservation of mass principle. The carbon consumed by microorganisms is converted into product and biomass. Balancing the partition between the two products is crucial for the optimization of a bioconversion, and determines the economics and the viability of any large-scale plant.

Protein production in fed-batch mode is a simple example of bioprocess optimization. In this paradigm, the batch is divided into two phases: a growth phase, in which biomass accumulates, and a production phase, in which carbon is primarily converted into product. Expression of the protein is implemented with inducible promoters; however, the cost of the inducer may be prohibitive for large-scale production. An alternative approach, used for anaerobic products, is to grow cells aerobically during the growth phase, and switch to anaerobic conditions in order to produce the desired chemical. However, heterogeneity in oxygen levels can be problematic in large-scale bioreactors, resulting in suboptimal growth and production. Also, the control is applied at the reactor level in both approaches, and monitoring the culture density is necessary to determine the switching time between the two phases.

Metabolic engineering has benefited from advances in synthetic biology, which have extended the optimization of bioprocesses to the genetic level. Synthetic biology is a discipline that combines principles from biology and engineering, and facilitates the re-engineering of genetic networks. The objective of synthetic biology is to develop well-characterized genetic parts that can be introduced into an organism to perform a specific task. The increasing number of available genetic parts has expanded the tool set for control and optimization of metabolic engineering applications.

Accordingly, this thesis has its origin at the interface of metabolic engineering and synthetic biology. Early synthetic biology constructs include the genetic toggle switch and the metabolic oscillator (Fung et al., 2004; Gardner et al., 2000). The toggle switch allows the dynamic control of gene expression in an ON-OFF or OFF-ON fashion. Since then, the toggle switch has been used in more complex genetic circuits to implement the aforementioned gene expression profile. A notable example is the density-dependent expression of a protein, achieved through the coupling of the toggle switch to the natural quorum sensing system (Kobayashi et al., 2004). The use of density-dependent gene expression can be invaluable in metabolic engineering. Gadkar et al. (2005a) studied in silico the optimization of mutants whose growth rate is impaired by the deletion of genes that lead to competing byproducts.

The computational study showed that an ON-OFF gene expression profile is preferred to gene deletions (Gadkar et al., 2005a). Genes contributing to growth should be ON initially to stimulate high growth rate. Once biomass accumulates, genes should be turned OFF to redirect carbon towards the desired product. The advantage of the dynamic control is that it balances the trade-off between productivity and yield. However, the study does not discuss the potential for an experimental application. This thesis will focus on the design, analysis, and implementation of the dynamic strategy for metabolic engineering, as outlined in the next section.

1.2 Challenges and objectives

As mentioned above, a high-priority challenge in metabolic engineering is to maximize the productivity of bioconversions. The computational study by Gadkar et al. (2005a) demonstrated that the dynamic control of gene expression can improve productivity; however, no experimental design to implement the strategy has been suggested yet. An initial model-based design of a density-dependent genetic controller is introduced in Chapter 3. The density dependence is applied through the natural quorum-sensing system from Vibrio fischeri, and the controller is implemented with the artificial genetic toggle switch.

Once we design a prototype model, the challenge is to assess the sensitivity of the design to the kinetic parameters of the individual components. Sensitivity analysis is of particular interest, since the genetic components are poorly characterized and the interactions highly non-linear. In addition, the goal of the analysis is to propose the optimal operating region for the parameters of interest. The mathematical analysis to address this problem is discussed in Chapter 4.

The next challenge is the implementation of the dynamic strategy using synthetic biology tools. A critical obstacle in the development of complex genetic circuits in synthetic biology is the connectivity between the individual components. Poor characterization of the components adds to the difficulty in designing more complex systems. The process of building a genetic circuit often proceeds in a trial-and-error manner, with many variants of each building component being tested. Compatibility is another major challenge in synthetic biology. Artificial genetic parts are introduced to hosts that have never encountered them, and they may disrupt the native gene expression. A common approach in synthetic biology to minimize the time and effort of the design stage is to start from a simple system. Thus, in Chapter 5, we implement the genetic toggle switch to control the expression of two genes contributing to growth. The expression of these genes is expected to improve growth and productivity of the process, compared to the process utilizing the mutant organism.

1.3 Contributions

As summarized above, the objective of this thesis is achieved by addressing three main challenges described in Chapters 3 to 5. Here, we outline the individual contributions from the work presented in each chapter. Copyright permission to use the material from all publications has been granted directly by the publishers.

1.3.1 Model-based design of dynamic metabolic engineering

The first challenge in this thesis is the design of a dynamic gene expression control system to implement the dynamic strategy of Gadkar et al. (2005a). Accordingly, we propose the design and a mathematical model for a density-dependent genetic circuit that utilizes the natural quorum sensing module and the artificial toggle switch (Chapter 3). The genetic circuit is based on the design by Kobayashi et al. (2004) for the density-dependent expression of the GFP protein. We built a model that integrates the quorum sensing module, the toggle switch, and the metabolic network, based on existing kinetic models of quorum sensing and the toggle switch, in addition to the flux balance analysis framework. The work relevant to Chapter 3 has been published and presented as listed below:

• N. Anesiadis, R. Mahadevan, W. R. Cluett (2008) Dynamic control of gene expression increases productivity, Metabolic Engineering, vol. 10, p. 255-266.

• N. Anesiadis, P. Gawand, A. Ekins, H. Kobayashi, V. Martin, R. Mahadevan “Rational approaches to improve bioprocess productivity,” 60th Annual Meeting of Society for Industrial Microbiology, San Francisco, CA, USA, August 2010 (Oral presentation).

• N. Anesiadis, R. Mahadevan, W. R. Cluett “Dynamic control of gene expression to maximize productivity of bioprocesses,” 7th Panhellenic Conference in Chemical Engineering, Patra, Greece, June 2009 (Oral presentation).

• N. Anesiadis, R. Mahadevan, W. R. Cluett “Programming bacteria for self-monitored dynamic metabolic engineering processes,” 58th Canadian Chemical Engineering Conference, Ottawa, ON, Canada, October 2008 (Oral presentation).

• N. Anesiadis, R. Mahadevan, W. R. Cluett “Dynamic control of gene expression to optimize product formation in Escherichia coli,” 9th Annual Ontario-Quebec Biotechnology Meeting, Toronto, ON, Canada, June 2007 (Poster presentation).

• N. Anesiadis, R. Mahadevan, W. R. Cluett “Dynamic control of gene expression to optimize product formation in Escherichia coli,” Statistics and Control Meeting, Hamilton, ON, Canada, May 2007 (Oral presentation).

• N. Anesiadis, R. Mahadevan, W. R. Cluett “Dynamic control of gene expression to optimize product formation in Escherichia coli,” Integrating Metabolism and Genomics Conference, Montreal, QC, Canada, April 30th-May 3rd 2007 (Oral presentation).

1.3.2 Analysis of dynamic strategy

The initial model of the dynamic strategy uses kinetic values from the literature to capture the dynamics of the quorum sensing and the toggle switch. Since characterization of these modules was conducted in different systems, the confidence intervals of the kinetic parameters are wide. Also, noise in biological systems can trigger responses not anticipated in the initial design. Thus, it is important to estimate the sensitivity of the design to the kinetic parameters of the genetic circuit. Furthermore, after identifying the most sensitive parameters, we are in a position to recommend the operating range for these parameters to achieve target values in productivity and yield (Chapter 4). The corresponding work has been published and presented as listed below:

• N. Anesiadis, H. Kobayashi, W.R. Cluett, R. Mahadevan (2013) Analysis and design of a genetic circuit for dynamic metabolic engineering, ACS Synthetic Biology, 2(8):442-452.

• N. Anesiadis, R. Mahadevan, W. R. Cluett (2011) Model-driven design based on sensitivity analysis for a synthetic biology application, In Proceedings of the 21st European Symposium on Computer-Aided Process Engineering-ESCAPE 21 (Pistikopoulos, E. N., Georgiadis, M. C., Kokossis, A. C., Eds.), Vol. 29, pp. 1446-1451, Elsevier, Amsterdam.

• N. Anesiadis, W. R. Cluett, H. Kobayashi, R. Mahadevan “Enhancing bioprocess productivity through dynamic control of gene expression,” Recent Advances in Fermentation Technology (RAFT IX), Marco Island, FL, USA, November 2011 (Invited talk).

• N. Anesiadis, H. Kobayashi, R. Mahadevan, W. R. Cluett “Model-driven design based on sensitivity analysis for a synthetic biology application,” 21st European Symposium on Computer Aided Process Engineering (ESCAPE), Porto Karras, Greece, May 29th-June 1st 2011 (Oral presentation).

• N. Anesiadis, R. Mahadevan, W. R. Cluett “Dynamic metabolic engineering strategy to improve productivity of bioprocesses,” 59th Annual Meeting of Society for Industrial Microbiology, Toronto, ON, Canada, July 2009 (Poster presentation).

1.3.3 Experimental implementation of the dynamic strategy

The first step towards creating the programmable genetic circuit is to couple the genetic toggle switch to bacterial metabolism. The dynamic strategy is applied in a lactate-producing strain of Escherichia coli, lacking genes adh (alcohol dehydrogenase) and pta (phosphotransacetylase). These genes lead to the competing byproducts ethanol and acetate, respectively, and they also contribute to growth rate. Accordingly, genes adh and pta are placed under the control of a toggle plasmid, and transformed into a ∆(adh, pta) mutant strain of Escherichia coli. In this proof-of-concept experiment, the expression of adh and pta is induced with the addition of isopropyl β-D-1-thiogalactopyranoside (IPTG).

The mutant strain carrying the toggle plasmid (dynamic strategy) is expected to outperform the mutant strain without the toggle plasmid (static strategy). The comparison of the two strategies (Chapter 5) has been presented at the following conference:

• N. Anesiadis, W. R. Cluett, H. Kobayashi, R. Mahadevan “Dynamic metabolic engineering for lactate production,” American Institute of Chemical Engineers Annual Meeting (AIChE), Minneapolis, USA, October 2011 (Oral presentation).

1.3.4 Additional contributions

Besides the common synthetic biology theme, this thesis has also investigated additional problems relevant to gene expression dynamics. Based on the dynamic model of a switch-like gene expression system (Chapter 3), we have investigated the significance of the timing between gene expression and diffusion of tissue cells. This problem is relevant to developmental biology, and resulted in the following paper and conference presentation:

• S. Javaherian, N. Anesiadis, R. Mahadevan, A.P. McGuigan (2013) Design principles for generating robust gene expression patterns in dynamic engineered tissues, Integrative Biology, 5(3): 578-589.

• S. Javaherian, N. Anesiadis, R. Mahadevan, A.P. McGuigan “Design principles for generating sharp gene expression patterns in vitro in dynamic and re-organizing tissues,” 13th International Conference on Systems Biology (ICSB), Toronto, ON, August 2012 (Poster presentation).

Chapter 2

Literature review

Metabolic engineering is the engineering of biological systems to improve the production of a chemical. With the understanding of metabolic and regulatory networks, rational approaches to maximize the production of a desired chemical have been developed. The target product can either be naturally produced by the organism, or introduced to the host from a different organism through genetic engineering. In the latter case, one or more genes necessary for the reaction pathway are transformed into the host (heterologous production).

Normally, the first step in metabolic engineering projects is to engineer microorganisms to produce some amount of the desired product. Then, the goal is to improve yield with further engineering of the strain. Computational models of metabolic networks in the post-genomic era have facilitated the systematic analysis and engineering of microorganisms. The most commonly used modeling framework for metabolic networks is flux balance analysis (FBA). Several FBA-based algorithms have been designed to predict the genetic manipulations required to maximize the production of the desired chemical. These algorithms, along with rational experimental strategies for metabolic engineering, are reviewed in sections 2.1.1 and 2.1.2.

Most metabolic engineering strategies aim at increasing product yield (i.e., the moles of product per mole of carbon source), whereas productivity (moles of product per unit time) is often neglected. In many cases, genetic engineering slows down growth, thus affecting the productivity of the process. Productivity is a critical parameter for the feasibility of a metabolic engineering strategy, and the trade-off between productivity and yield is of great importance (discussed in section 2.1.3). Holtz and Keasling (2010) indicated that the majority of metabolic engineering approaches apply static gene expression profiles. The term static is used to indicate that the level of gene expression is not manipulated over time. As a result, the gene expression may not be optimal for the overall production during the course of a batch.

In contrast, dynamic control of gene expression is inherent in cells for adapting to a constantly changing environment. One of the main principles in metabolic engineering is to facilitate the delivery of the necessary intermediates at the appropriate levels, in order to optimize production and growth. Thus, a dynamic control strategy has the potential to balance the trade-off between productivity and yield by alleviating the growth impairment. The principle and characteristics of the dynamic gene expression approach are discussed in section 2.2.

Although the design and implementation of dynamic approaches in metabolic engineering have been slow to date, recent advances in synthetic biology promise more tools for the dynamic control of gene expression (Holtz and Keasling, 2010). Synthetic biology offers tools with sensing and control capabilities, necessary components for a dynamically regulated system. The natural quorum sensing lux-system and some artificial genetic controllers are reviewed in sections 2.3 and 2.4, respectively.

Based on the sensor and control modules, some density-dependent genetic circuits have been constructed for several applications (section 2.5). Although these engineered genetic networks are not designed for metabolic engineering, they are sources of inspiration for the development of a dynamic metabolic engineering strategy for bioprocess optimization. In the last section of this chapter (section 2.6), we review the production of lactate as a case study that can benefit from the implementation of such a dynamic strategy.

2.1 Experimental and computational approaches to metabolic engineering

2.1.1 Experimental approaches

As mentioned above, the goal of metabolic engineering is to improve the yield of the desired product through engineering of the metabolic network. In this section, we review the most common intuitive experimental strategies for yield improvement (see Table 2.1, adapted from Lee et al., 2012).

The most intuitive strategy for metabolic engineering is to eliminate competing byproducts and increase the precursor availability. Elimination of competing byproducts and degradation pathways is achieved by deletion of the relevant genes. To increase the concentration of essential precursors, the appropriate proteins are normally overexpressed. Although these simple approaches have achieved yield improvement in several cases, there are some potential risks associated with gene deletion and protein overexpression. Elimination of pathways that contribute to biomass or energy generation can severely affect the growth rate. Large growth impairment is not desired, since it affects the productivity of a process as previously mentioned. In addition, cell growth is sensitive to the level of protein expression. If the level is not appropriate, it may lead to depletion of essential intermediates or accumulation of toxic ones. Thus, the effect of gene deletions and expression of heterologous proteins on growth and productivity cannot be neglected (Table 2.1).

A lot of emphasis in metabolic engineering is placed on the engineering of substrate utilization. Both Saccharomyces cerevisiae and Escherichia coli preferably consume six-carbon sugars first (e.g., glucose), and only once these are depleted do they start consuming five-carbon sugars (e.g., xylose). This effect, known as carbon catabolite repression, is caused by the repression of the xylose transporter by glucose. Since lignocellulose, a common feedstock for biofuel production, includes both five- and six-carbon sugars, it is important to engineer strains for the simultaneous uptake of both sugar types. Eiteman et al. (2008) used two substrate-selective E. coli strains (one that uses glucose, one that uses xylose) to improve the consumption of both sugars in a single process. However, control of growth rates in co-cultures is difficult, and not ideal for large-scale applications. Another potential problem with substrate uptake arises where phosphoenolpyruvate (PEP) is an intermediate for the desired product. PEP is part of the major phosphotransferase system (PTS), and as a result may not be available for production of the desired product. The substitution of the native PTS system in E. coli with a system from Corynebacterium glutamicum that does not require PEP has been used in the production of L-lysine (Lindner et al., 2011). Although engineering substrate utilization can improve the uptake, overexpression of membrane transporters is challenging (Table 2.1).

Cofactors, such as NADH and NADPH, are organic electron mediators necessary to carry out several biochemical reactions. The availability and the balance of the cofactors are crucial for the fast growth of an organism. Two engineering approaches have been used: increasing the production of the required cofactor, and changing the preference of the relevant enzymes to an alternative cofactor. Improving the availability of NADPH has been used for valine, leucocyanidin, and catechin production (Chemler et al., 2010; Park et al., 2007). Altering the availability of cofactors can, once again, impact growth rate. Although altering the preference of enzymes from NADH to NADPH is challenging, it has been illustrated for 2,3-butanediol dehydrogenase, ketol-acid reductoisomerase, and alcohol dehydrogenase (Bastian et al., 2011; Ehsani et al., 2009).

In some cases the original pathway may not lead to the highest possible flux towards the desired product, and more efficient enzymes can replace existing pathways. For example, the CoA-dependent pathway for 1-butanol production from the natural producer Clostridium species has been transferred to several fast-growing organisms with low resulting titers (e.g., 1.2 g/l in E. coli). This pathway is not very efficient because it needs both NADH and reduced ferredoxin as reducing power. As an alternative, Shen et al. (2011) replaced the Clostridium enzyme with one from Treponema denticola that needs only NADH. With further coupling to an acetyl-CoA driving force, a titer of 30 g/l was achieved. It is possible, however, that when a new pathway is introduced in a host strain, the pathway activity will be lower than in the native host.

Finally, transporter engineering has been implemented with a few successful results. The goal is to facilitate the secretion of the desired chemical from the cell, and block the uptake of the product. In combination with byproduct elimination, transporter engineering has been used to improve the production of threonine, valine and putrescine (Lee et al., 2007; Park et al., 2007; Qian et al., 2009). However, membrane crowding can cause growth retardation.

Table 2.1: The most common metabolic engineering strategies. Each strategy is followed by its potential complications.

Strategy: Byproduct elimination & precursor enrichment
Potential complications:
• Gene deletions can cause severe growth impairment
• Protein overexpression may lead to depletion of necessary and accumulation of toxic intermediates

Strategy: Substrate utilization engineering
Potential complications:
• Overexpression of membrane transporters can be difficult
• Control of co-culture growth is difficult
• Removing catabolite repression may reduce the uptake rates of the individual sugars

Strategy: Cofactor optimization
Potential complications:
• Altering cofactor availability might cause growth impairment
• Changing enzyme specificity is not always possible

Strategy: Rerouting metabolic pathways
Potential complications:
• New pathways may have lower activity in the host strain
• Efficiency of new pathways may need to be improved

Strategy: Transporter engineering
Potential complications:
• Membrane crowding may lead to growth impairment

The most common problem among the aforementioned strategies, often neglected, is growth impairment. Although metabolic networks have evolved over billions of years to adapt to changing environments, additions to or modifications of the network normally result in a lower growth rate than that of the wild-type strain. Severe growth rate impairment leads to a slow conversion rate, and the economics of a bioprocess become less favourable. Thus, the design of economically feasible large-scale bioprocesses requires both high cell growth and product formation.

Accordingly, this thesis addresses the need to take into consideration the effect of growth rate on the productivity of a bioprocess. In particular, this thesis focuses on metabolic engineering strategies that apply byproduct elimination (i.e., the first strategy reviewed in Table 2.1). The hypothesis is that, instead of eliminating byproducts that contribute to growth rate, dynamic expression of genes leading towards growth-coupled products improves the overall productivity of the process. In this dynamic strategy, the genes are initially expressed to stimulate high growth rate and biomass generation, and then are switched off to produce the desired chemical. The rest of the literature review will help lead us to a model-based design of the dynamic method, using synthetic biology tools.
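As a rough illustration of this hypothesis, the sketch below compares a static mutant with a two-phase (genes ON, then OFF) batch. It is not the model developed in this thesis: it assumes constant specific rates within each phase, an identical substrate uptake rate in both phases, and instantaneous switching, and every number and name in it (`V_S`, `MU_ON`, `batch`, and so on) is hypothetical.

```python
import math

# All values are hypothetical and chosen only for illustration.
V_S = 10.0                   # specific substrate uptake, mmol/gDW/h (same in both phases)
S0 = 100.0                   # initial substrate, mmol/L
X0 = 0.01                    # initial biomass, gDW/L
MU_ON, VP_ON = 0.5, 0.0      # genes ON: fast growth, no product
MU_OFF, VP_OFF = 0.25, 10.0  # genes OFF: slower growth, product is formed


def phase(x_start, s_avail, mu, v_p, t_max=math.inf):
    """Run one phase at constant specific rates until t_max or substrate exhaustion.

    Returns (duration, final biomass, substrate consumed, product formed).
    """
    # Time at which exponentially growing cells would exhaust the substrate.
    t_exhaust = math.log(1.0 + mu * s_avail / (V_S * x_start)) / mu
    t = min(t_max, t_exhaust)
    x_end = x_start * math.exp(mu * t)
    s_used = min(s_avail, V_S * (x_end - x_start) / mu)
    return t, x_end, s_used, (v_p / V_S) * s_used


def batch(t_switch):
    """Genes ON until t_switch (h), then OFF until the substrate runs out."""
    t1, x1, s1, p1 = phase(X0, S0, MU_ON, VP_ON, t_max=t_switch)
    t2, _, _, p2 = phase(x1, S0 - s1, MU_OFF, VP_OFF)
    product = p1 + p2
    return {"yield": product / S0, "productivity": product / (t1 + t2)}


for t_switch in (0.0, 4.0, 8.0):  # t_switch = 0 corresponds to the static mutant
    result = batch(t_switch)
    print(f"switch at {t_switch:4.1f} h: yield = {result['yield']:.3f} mol/mol, "
          f"productivity = {result['productivity']:.2f} mmol/L/h")
```

Under these assumed rates, delaying the switch trades a small amount of yield for a shorter batch, which is the qualitative behaviour the hypothesis relies on; the size of the effect depends entirely on the chosen parameters.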

2.1.2 Computational approaches

Although intuition has been valuable in metabolic engineering, the systematic analysis of metabolic networks with computational methods has uncovered several non-intuitive strategies. Metabolic networks normally consist of 3,000-4,000 reactions and metabolites, with an enormous number of interactions between the components, and different layers of hierarchy. The complexity of metabolic systems calls for mathematical modeling of their networks.

A mathematical model is a quantitative representation of a system (Ingalls, 2013). A mechanistic model is based on natural laws, and can be used to help understand or simulate the behaviour of the system. Model analysis can also give us insights into the behaviour of a system under varying conditions. Simulation and model analysis can be a useful guide to experimental design as well (Saltelli et al., 2008). However, the following points must be highlighted:

• A model is simply a demonstration of a hypothesis.

• Every model is subject to uncertainties and limitations.

• Models cannot be proven; they only improve our understanding.

• A model undergoes continuous tests, and it will be rejected or updated if it fails to explain the system behaviour.

In metabolic engineering, a model ideally would predict the gene expression level required to maximize the production of a desired chemical. Predictions can be generated with the Flux Balance Analysis (FBA) or Constraint-Based Modelling (CBM) framework, which is discussed in section 3.1.5. Briefly, linear programming (LP) is used to determine the optimal flux distribution that maximizes the formation of biomass, which is believed to be a reasonable objective function for microbes growing in lab conditions.
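For orientation, the LP described above is commonly written in the following generic form. The notation is assumed here for illustration and may differ from that used in section 3.1.5: S is the stoichiometric matrix, v the vector of reaction fluxes, c a weight vector that selects the biomass reaction, and v^min, v^max the flux bounds.

```latex
\begin{align}
\max_{v} \quad & c^{T} v \\
\text{subject to} \quad & S v = 0, \\
& v^{\min} \le v \le v^{\max}
\end{align}
```

The equality constraint states that every internal metabolite is at steady state, while the bounds encode reaction reversibility and the measured or assumed substrate uptake limits.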

Despite the development of genome-scale models of metabolism, the prediction of the optimal gene expression levels is still challenging. This is mainly because the genome-scale metabolic models predict fluxes (or reaction rates), rather than gene expression levels, and the relationship between the two is not well understood. Nevertheless, the FBA modeling framework has been a useful tool in developing strain design algorithms that predict target reactions that need to be modified to improve product yield.

The production envelope is a feature that can help us elucidate the concept, advantages, and disadvantages of strain design algorithms (Figure 2.1). A production envelope is simply a plot of all the feasible solutions of the FBA problem, with growth rate on the x-axis, and the flux of the desired product on the y-axis. FBA, along with the production envelope, helps us calculate and visualize the potential flux distributions of a metabolic network. For example, by setting growth rate equal to zero and maximizing the flux of the desired product, we simply force all the carbon flux towards the desired product. This is the maximum theoretical yield of the desired product, and it is shown as point A in Figure 2.1. Point A may seem ideal for metabolic engineering purposes at first; however, this is not the case, because the growth rate is zero, and biomass generation will be problematic in a realistic application. Therefore, point A is an extreme case and is of theoretical interest only.

[Figure 2.1. x-axis: Growth rate (hr-1); y-axis: Product flux (mmol/gDW/hr). Labels: A: maximum product flux; B: common algorithm solutions; C: wild-type; Strain #1; Strain #2.]

Figure 2.1: A simple schematic of a hypothetical production envelope. (A) The maximum theoretical yield can be calculated by setting growth rate equal to zero and maximizing the flux of the desired product. (B) Common solutions of strain design algorithms are located between the two extremes. (C) The wild-type operating point has maximum growth rate, while the desired product is normally not produced.

The other extreme of the production envelope is point C, the wild-type operating point. Point C is calculated by maximizing biomass production or growth rate (this implies the farthest right point), and it commonly results in zero flux of the desired product. Out of all the potential operating points enclosed in the production envelope, point C is favoured by evolution to represent the real flux distribution.
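The extreme points of Figure 2.1 can be reproduced on a toy problem. The sketch below is only an illustration under stated assumptions: the network is reduced to two free fluxes, all coefficients (`A_MU`, `B_VP`, `VS_MAX`, `K_COUPLE`) are hypothetical, and the LP is solved by brute-force vertex enumeration (`solve_lp_2d`, a helper written for this example) rather than by the LP solvers used for genome-scale models. A growth-coupled mutant is included to show a point of type B.

```python
# Toy network, already reduced to two free fluxes by the steady-state carbon
# balance: growth rate mu (1/h) and product flux v_p (mmol/gDW/h).
# All coefficients are hypothetical and chosen only for illustration.
A_MU = 20.0      # substrate consumed per unit of growth (mmol/gDW)
B_VP = 0.5       # substrate consumed per unit of product (mmol/mmol)
VS_MAX = 10.0    # upper bound on substrate uptake (mmol/gDW/h)
K_COUPLE = 40.0  # assumed growth coupling in a mutant: v_p >= K_COUPLE * mu

# Each constraint (a, b, c) encodes a*mu + b*v_p <= c.
WILD_TYPE = [
    (A_MU, B_VP, VS_MAX),  # carbon balance with bounded uptake
    (-1.0, 0.0, 0.0),      # mu >= 0
    (0.0, -1.0, 0.0),      # v_p >= 0
]
MUTANT = WILD_TYPE + [(K_COUPLE, -1.0, 0.0)]


def solve_lp_2d(objective, constraints, tol=1e-9):
    """Maximize objective[0]*mu + objective[1]*v_p by vertex enumeration.

    Brute force over pairwise intersections of constraint boundaries; only
    suitable for tiny two-variable problems like this one.
    """
    best = None
    for i, (a1, b1, c1) in enumerate(constraints):
        for a2, b2, c2 in constraints[i + 1:]:
            det = a1 * b2 - a2 * b1
            if abs(det) < tol:
                continue  # parallel boundaries do not define a vertex
            mu = (c1 * b2 - c2 * b1) / det  # Cramer's rule
            vp = (a1 * c2 - a2 * c1) / det
            feasible = all(a * mu + b * vp <= c + tol for a, b, c in constraints)
            value = objective[0] * mu + objective[1] * vp
            if feasible and (best is None or value > best[0]):
                best = (value, mu, vp)
    if best is None:
        raise ValueError("no feasible vertex found")
    return best[1] + 0.0, best[2] + 0.0  # adding 0.0 turns -0.0 into 0.0


# Point C: wild type maximizing growth.
point_c = solve_lp_2d((1.0, 0.0), WILD_TYPE)
# Point A: growth forced to zero (mu <= 0 added), product flux maximized.
point_a = solve_lp_2d((0.0, 1.0), WILD_TYPE + [(1.0, 0.0, 0.0)])
# Point B: growth-coupled mutant maximizing growth.
point_b = solve_lp_2d((1.0, 0.0), MUTANT)

for name, (mu, vp) in [("A", point_a), ("B", point_b), ("C", point_c)]:
    print(f"point {name}: growth = {mu:.3f} 1/h, product flux = {vp:.3f} mmol/gDW/h")
```

In this toy case the mutant's feasible region is the triangle spanned by the origin, point A and point B, so maximizing growth in the mutant forces a non-zero product flux, whereas the wild type reaches its maximum growth rate with no product at all.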

When genetic manipulations are applied, the production envelope is pruned in order to couple growth rate with the production of the desired chemical. In Figure 2.1, we show some common algorithm solutions (points shown as B), and two production envelopes of genetically modified strains or mutants. The production envelopes of mutant strains are defined by the axes origin, point A, and one of the points B (outlined with the dashed lines for two strain examples). Comparison of strain designs on a production envelope gives us some rough insight into the economics of a process. Product flux and growth rate, the two variables plotted in the production envelope, determine the productivity and yield of a bioconversion. These important economic factors, and the trade-off between them, are discussed in the next section. Next, we briefly discuss the existing strain design algorithms developed so far.

The algorithms developed in the last decade have focused on manipulation strategies for optimizing product yield (see Table 2.2). The number of possible combinations in genome-scale networks is extremely large, and the algorithms can be classified in two main categories: global and local methods. In global methods, the search for potential solutions is exhaustive, requiring very long computational times. For this reason, the number and type of manipulations is limited. In contrast, local or heuristic methods do not guarantee the global optimum, but they provide solutions close to the global in a timely manner.
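The distinction between global and local methods can be illustrated with a minimal sketch. It first counts the knockout combinations for an assumed pool of 3,000 candidate reactions, and then compares an exhaustive search with a greedy (local) search on a four-reaction toy problem. The scoring function is hypothetical: it stands in for an FBA-based evaluation of product yield and includes one synergistic pair so that the greedy search is misled. None of this corresponds to a specific published algorithm.

```python
from itertools import combinations
from math import comb

# Size of the exhaustive search space for an assumed pool of 3,000 reactions.
for k in range(1, 6):
    print(f"{k} knockout(s): {comb(3000, k):,} combinations")

# Hypothetical scores; in a real algorithm each evaluation would be an LP solve.
SINGLE_GAIN = {"r1": 3.0, "r2": 2.0, "r3": 2.0, "r4": 1.0}
SYNERGY = {frozenset({"r2", "r3"}): 4.0}  # bonus only when both are deleted


def score(knockouts):
    """Toy objective: sum of single gains plus any synergy bonuses."""
    selected = frozenset(knockouts)
    total = sum(SINGLE_GAIN[r] for r in selected)
    total += sum(bonus for pair, bonus in SYNERGY.items() if pair <= selected)
    return total


def exhaustive(candidates, k):
    """Global method: evaluate every set of exactly k knockouts."""
    return max(combinations(candidates, k), key=score)


def greedy(candidates, k):
    """Local method: add the single best knockout at each step."""
    chosen = []
    for _ in range(k):
        remaining = [r for r in candidates if r not in chosen]
        chosen.append(max(remaining, key=lambda r: score(chosen + [r])))
    return tuple(chosen)


reactions = ["r1", "r2", "r3", "r4"]
best_global = exhaustive(reactions, 2)
best_local = greedy(reactions, 2)
print("exhaustive:", best_global, score(best_global))
print("greedy:    ", best_local, score(best_local))
```

The greedy search needs far fewer evaluations but commits to the best single knockout first and therefore misses the synergistic pair, which mirrors the trade-off between computational time and optimality described above.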

The first algorithm developed to predict targets for manipulations was OptKnock (Burgard et al., 2003). Burgard’s algorithm searches for combinations of gene deletions only, and it was used to predict strategies for succinate, lactate and 1,3-propanediol production. Fong et al. (2005) experimentally verified some of the predicted strain designs for lactate production. The Maranas group later extended the Opt algorithms to include up- and down-regulations of genes in the OptReg algorithm (Pharkya and Maranas, 2006). Subsequently, a local search algorithm was developed by Lun et al. (2009) to allow for more modifications than OptKnock and OptReg. More recently, new algorithms such as OptForce and EMILiO, capable of predicting optimal flux levels for maximum production, have been developed (Ranganathan et al., 2010; Yang et al., 2011). The EMILiO algorithm is capable of generating hundreds of designs, at a much faster rate than global methods. However, validating the plethora of the predicted designs is challenging, since the results cannot be easily implemented.

Table 2.2: Comparison of some strain design algorithms for metabolic engineering.

Method: OptKnock (Burgard et al., 2003; Fong et al., 2005)
Implementation:
• Small number of knockouts (≤ 10)
• Maximize yield
Consequences:
• May not converge to global optimum
• Does not consider productivity

Method: OptReg (Pharkya and Maranas, 2006)
Implementation:
• Gene knockouts, activations and/or inhibitions
• Maximize yield
Consequences:
• Hard to estimate exact levels
• Does not consider productivity

Method: GDLS (Lun et al., 2009)
Implementation:
• Faster than global methods but limited by the local search
• Maximize yield
Consequences:
• May not converge to global optimum
• Does not consider productivity

Method: EMILiO (Yang et al., 2011)
Implementation:
• Manual tuning of optimization parameters is required
• Maximize yield
Consequences:
• Hard to determine the initial optimization parameters
• Does not consider productivity

Method: DySScO (Zhuang et al., 2013)
Implementation:
• Evaluate productivity, yield and titer
• Scan the production envelope
Consequences:
• Consider bioreactor-level properties
• Reach the limit of static strategies

The objective of these strain design algorithms is to maximize product yield, which typically comes at the expense of a lower growth rate and, consequently, lower productivity. Zhuang et al. (2013) addressed this issue by considering the dynamics of a batch process. Their Dynamic Strain Scanning Optimization (DySScO) algorithm integrates the dynamic Flux Balance Analysis (dFBA) modeling framework with existing strain design algorithms, such as the local search algorithm by Lun et al. (2009). Incorporating the dynamics of the process in the DySScO strategy provides a better estimate of the process economics by considering yield, titer and productivity.

To elucidate the implications of DySScO, we reexamine the potential solutions of strain design algorithms (points B in Figure 2.1). All strain design algorithms, except DySScO, calculate the point with the highest possible product flux (i.e., the highest y-axis value), without considering the effect of growth rate on productivity. In contrast, DySScO searches for a solution that achieves high productivity in addition to high yield. Despite this addition, the trade-off between yield and productivity is more complex, and is discussed in the next section. That discussion leads to the development of a dynamic control strategy to optimally balance productivity and yield (section 2.2).

2.1.3 The trade-off between yield and productivity of a process

Yield is defined as the moles of product formed per mole of carbon source consumed. The maximum theoretical yield is obtained by assuming that all carbon is directed towards the desired product. The production envelope, discussed in the previous section, helps us estimate the product yield of a strain (y-coordinate) and the maximum theoretical yield (point A). Productivity is defined as the amount of product, in moles, produced per unit time. The value of productivity depends on both the product flux (or yield) and the growth rate. When growth rate decreases as a result of metabolic engineering, productivity is also reduced.

The trade-off between yield and productivity becomes evident when we consider the carbon balance. In metabolic networks, carbon is directed towards products and biomass. Yield is associated with the amount of carbon that goes towards product formation. Productivity, however, is a function of both product and biomass generation, since additional biomass accelerates overall production. Therefore, an optimal point in terms of productivity exists in the production envelope (this is one of the points shown as B in Figure 2.1).
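The trade-off can be illustrated with a minimal numerical sketch. The Python snippet below is an illustration only and is not one of the reviewed algorithms: it assumes a batch culture with a constant growth rate and constant specific fluxes, and the function name, parameter names and the two hypothetical strains are assumptions chosen for the example.

```python
def batch(mu, v_p, v_s, x0=0.01, s0=100.0, dt=0.01):
    """Forward-Euler simulation of a batch culture until the substrate runs out.

    mu  : growth rate (1/hr), assumed constant
    v_p : specific product flux (mmol/gDW/hr), assumed constant
    v_s : specific substrate uptake (mmol/gDW/hr), assumed constant
    x0  : initial biomass (gDW/L); s0: initial substrate (mmol/L)
    Returns (yield, productivity): yield in mol product per mol substrate,
    productivity in mmol/L/hr averaged over the batch.
    """
    x, s, p, t = x0, s0, 0.0, 0.0
    while s > 0.0:
        ds = min(v_s * x * dt, s)   # cannot consume more substrate than is left
        p += (v_p / v_s) * ds       # product formed from the consumed substrate
        s -= ds
        x += mu * x * dt            # exponential growth
        t += dt
    return p / s0, p / t


# Two hypothetical strains: a fast grower with modest yield and
# a slow grower with high yield (values are illustrative assumptions).
fast_yield, fast_productivity = batch(mu=0.5, v_p=5.0, v_s=10.0)
slow_yield, slow_productivity = batch(mu=0.1, v_p=9.0, v_s=10.0)
print(fast_yield, fast_productivity)
print(slow_yield, slow_productivity)
```

Under these assumptions the slow-growing strain has the higher yield but takes much longer to consume the substrate, so its productivity is lower.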

The DySScO algorithm evaluates several strain design strategies (i.e., points in area B) in terms of productivity, yield and titer, to select the one with the highest productivity. However, this is not the only approach to optimization. All strain design algorithms we have reviewed so far, including DySScO, will be referred to as static, since they can be represented by a single point in the production envelope. With the development of synthetic biology tools, dynamic control of genetic networks has become practical. In the next section, we discuss the basis for moving towards a dynamic approach, and the advantages and disadvantages of such methods.

2.2 Dynamic control of gene expression for bioprocess optimization

Bioprocess optimization has been an active area of research since the early 1970s. The majority of the papers are devoted to the theoretical formulation of optimal control problems, and the experimental investigation of the solutions (Cuthrell and T., 1989; Hansen et al., 1993; Lee and Ramirez, 1994; San and Stephanopoulos, 1989). The traditional formulation has been "to calculate the substrate feed rate that optimizes a fed-batch fermentation". These studies are based on simple phenomenological models of bacterial growth and substrate/product inhibition. Despite this limitation, it is worthwhile to survey the problem formulations and the solutions derived.

Fermentation processes are considered to be irreversible autocatalytic reactions. This approach inherently incorporates the trade-off between yield and productivity, balancing production and growth. Batch and fed-batch operating policies follow a simple pattern, namely the separation of the growth and the production phases. Initially, resources must be allocated so that the biomass increases as fast as possible (i.e., the growth phase). Once enough biomass is generated, growth must be halted, and the carbon must be redirected towards the desired product. This approach has been suggested for penicillin and ethanol production, based on the results of phenomenological dynamic models of growth and production.

With the development of genome-scale networks, comprehensive models of metabolism have extended the understanding and predictive capability from the bioreactor to the genetic level (Mahadevan et al., 2002; Varma and Palsson, 1994). Dynamic flux balance analysis (dFBA) has been used in the past to predict fed-batch operating policies in S. cerevisiae (Hjersted and Henson, 2006). The result follows the same motif: an initial growth phase followed by a high ethanol production phase.

The control in the papers reviewed here is limited to the macroscopic level (i.e., substrate addition rate, oxygen level control, etc.). Advances in genetic engineering have allowed researchers to exert control at the genetic level. In the next section, we review dynamic optimization at the genetic level for metabolic engineering applications.

2.2.1 Dynamic metabolic optimization

Metabolic networks have intrinsic mechanisms that dynamically regulate gene expression. Although some of these mechanisms have been identified (e.g., allosteric regulation), the development of engineered, dynamic controllers or regulators has been slow. Synthetic biology, aided by mathematical models of gene expression, has facilitated the model-based design of several artificial dynamic systems.

The first engineered system with dynamic control of gene expression is that of Farmer and Liao (2000). Lycopene-producing E. coli strains exhibit excess glucose flux to acetate and growth retardation, resulting in low yield and productivity. Farmer and Liao (2000) used an acetyl phosphate-activated transcription factor and promoter pair as a sensor to measure excess flux through the glycolytic pathway. Acetyl phosphate (AcP) indicates excess glycolytic flux, and is also a regulator in several systems. One of them is the Ntr regulon (nitrogen regulation system), which activates the promoter glnAp2 in the presence of AcP. Two genes in the lycopene pathway, pps (phosphoenolpyruvate synthase) and idi (isopentenyl diphosphate isomerase), were placed under the control of glnAp2. With this design, genes pps and idi are expressed once AcP is present, thus diverting flux from acetate to lycopene. The dynamic AcP-dependent strategy allowed cells to grow faster than with constitutive (or static) expression of pps and idi, and resulted in higher yield, productivity and titer.

The disadvantage of this method is that it is specific to the system to which it was applied and cannot be generalized. For a product other than lycopene, one has to find a system similar in principle to the Ntr regulon that does not interact with any other components of the network. This property, known as orthogonality, was identified in the early stages of synthetic biology as an essential requirement for the design of genetic circuits. Orthogonality allows the design of modular systems, i.e., engineered subsystems that can be connected to a broader system to perform a specific task without interfering with it.

The modular design of Kobayashi et al. (2004) is an example of a dynamic programmable network. The structure of the programmable cell follows the design of a negative feedback loop. The three modules are the sensor, the controller, and the process to be controlled. In this design, engineered E. coli cultures sense their own density (biosensor module), and activate gene expression of an output (control or regulatory module) when a set-point is reached. Density-dependent gene expression is of key importance in this thesis, and is therefore discussed in detail, together with other density-dependent applications, in section 2.5.1.

Gadkar et al. (2006, 2005b) applied a dynamic gene expression approach to tackle the trade-off between productivity and yield in computational algorithms such as OptKnock. OptKnock identifies genes/reactions to eliminate in order to couple product formation to growth. This, however, may result in decreased productivity due to growth impairment. In contrast, Gadkar et al. (2005b) assumed that we can manipulate the genes dynamically, instead of deleting them. The algorithm predicts the optimal temporal gene expression profile that maximizes product concentration using a bilevel optimization framework. Gadkar et al. (2005b) showed that when gene deletions cause growth impairment, it is optimal to dynamically control gene expression instead, with a bang-bang type of control (San and Stephanopoulos, 1984) (i.e., an abrupt switch between two states). In the optimal ON-OFF policy, the genes are initially expressed at the wild-type level to achieve a high growth rate. Then, once enough biomass is generated, the genes are turned off to switch to the high production phase (Gadkar et al., 2005b).

The concept of the dynamic strategy is elucidated if we revisit the production envelope. Initially, the strain expresses genes contributing to growth and operates as the wild-type (point A in Figure 2.2). In the hypothetical production envelope, this means that cells are growing at the same growth rate as the wild-type, without making any product. In the second phase, the cells turn off the expression of the genes that the algorithm suggests, and operate at point B. At this point, cells are mostly making product, and the growth rate is significantly lower than that of the wild-type.

[Figure 2.2 plot: product flux (mmol/gDW/hr) versus growth rate (hr-1); point A: growth phase; point B: production phase.]

Figure 2.2: The production envelope of the dynamic strategy. (A) Initially, the growth phase operates as the wild-type point (high growth rate, no production). (B) Then, the second phase operates as the mutant, inducing the production phase (high production rate, lower growth rate).

Although theoretical, the work by Gadkar et al. (2006, 2005b) has significant advantages over existing static approaches, which we discuss in the next section. The central aim of this thesis is to design, analyze, assemble and implement a practical application of the dynamic strategy.

2.2.2 Comparison of the dynamic strategy to existing methods

The dynamic strategy can be thought of as an extension to existing strain design algorithms, yet it is not applicable to every engineered strain. Low growth rate and productivity are the reasons that many bioprocesses are financially infeasible and cannot compete with traditional production processes. Dynamic gene expression compensates for impaired growth rate and productivity. Therefore, this method is beneficial only when a genetically engineered strain grows poorly compared to the wild-type. The larger the growth impairment, the larger the advantage of using the dynamic strategy.

The main advantage of the dynamic method by Gadkar et al. (2005b) is that it allows the operator to set the productivity and yield values depending on market demand. This is achieved by controlling the duration of each phase. Thus, we introduce the main independent, manipulated variable, the switching time, defined as the time at which the switch from the growth phase to the production phase occurs.
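The role of the switching time can be explored with a short simulation. The sketch below is a hedged illustration and not the bilevel optimization of Gadkar et al. (2005b): it assumes a batch culture that operates at a wild-type-like point (growth, no product) before the switch and at a mutant-like point (slow growth, high product flux) afterwards. The function name and all parameter values are hypothetical.

```python
def two_phase_batch(t_switch,
                    growth=(0.5, 0.0, 10.0),       # (mu, v_p, v_s) before the switch
                    production=(0.05, 9.0, 10.0),  # (mu, v_p, v_s) after the switch
                    x0=0.01, s0=100.0, dt=0.01):
    """Forward-Euler batch simulation with an ON-OFF switch at t_switch (hr).

    mu is the growth rate (1/hr); v_p and v_s are the specific product and
    substrate fluxes (mmol/gDW/hr). The batch ends when the substrate runs out.
    Returns (yield, productivity).
    """
    x, s, p, t = x0, s0, 0.0, 0.0
    while s > 0.0:
        mu, v_p, v_s = growth if t < t_switch else production
        ds = min(v_s * x * dt, s)   # substrate consumed in this step
        p += (v_p / v_s) * ds       # product formed in this step
        s -= ds
        x += mu * x * dt
        t += dt
    return p / s0, p / t


# Scan a few switching times (hr); the values are illustrative only.
for t_switch in (0.0, 4.0, 8.0, 12.0):
    print(t_switch, two_phase_batch(t_switch))
```

With these assumed parameters, a later switch always lowers the yield, while productivity passes through a maximum at an intermediate switching time. This is the sense in which the switching time lets the operator choose a point between the two extremes.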

The remainder of the literature review will cover the tools necessary to assemble a genetic circuit in order to apply the dynamic strategy:

• the quorum sensing mechanism (a sensory module)

• the genetic toggle switch (a control module)

• the construction of density-dependent genetic networks

Finally, lactate production in E. coli, the system to test our proof of concept, is discussed.

The design of dynamic gene expression may initially seem complex, expensive, and time-consuming. However, better understanding, characterization, and standardization of gene expression tools will lead to cheaper and more complex constructs. Eventually, the advantages of dynamic control are expected to outweigh the time and effort needed to design such systems.

2.3 Quorum sensing in bacteria

Microorganisms thrive in constantly changing environmental conditions (e.g., temperature, osmolarity, pH and nutrient concentrations). Bacteria have developed several systems to quickly adapt to environmental variations. In addition to environmental signals, microbes also have the ability to monitor their own density and respond with the appropriate phenotype to improve their survival chances. This natural mechanism, known as quorum sensing, is more common than was thought when it was first discovered four decades ago (Whitehead et al., 2001).

Density sensing is achieved through the production and secretion of small signalling molecules. Gram-positive bacteria produce amino acids and short peptides, whereas Gram-negative bacteria utilize derivatives as signalling molecules. The concentration of the signalling molecules is correlated to the concentration of the cells. The signalling molecules diffuse across the membranes. Once a threshold concentration is reached, the formation of a complex triggers the expression of certain genes. The genetic mechanism of quorum sensing has been studied in several organisms, in vivo. Many natural quorum sensing systems have been engineered to modify their properties, and have been used in non-native hosts. In sections 2.3.1 and 2.3.2, we describe the genetic mechanism of quorum sensing and its applications in engineered systems.

2.3.1 The mechanism of the Lux system

Vibrio fischeri is the model system for quorum sensing in Gram-negative bacteria. This organism regulates expression of certain genes in a density-dependent manner. At low densities, V. fischeri is non-luminescent. When a culture reaches a threshold cell density, it emits blue-green light at 490 nm. Many symbiotic relationships between V. fischeri and fish or squid species are well studied. Some squid species allow the microorganism to grow to high cell densities in a specific light organ. The light produced by the microorganism eliminates the shadow that the squid casts in moonlight, thus protecting the squid from predators.

[Figure 2.3 panels: (a) low cell concentration; (b) high cell concentration. Each panel shows the lux gene cluster (luxR, luxI, C, D, A, B, E, G) together with AHL, LuxR and LuxI.]

Figure 2.3: The mechanism of quorum sensing in Vibrio fischeri: (a) low cell concentration, (b) high cell concentration.

The lux gene cluster is the main component of the quorum sensing mechanism (luxA-E, luxG, luxI and luxR in Figure 2.3). The key genes of the cluster, luxI and luxR, belong to two operons that are transcribed in opposite directions. Gene luxI encodes the enzyme LuxI, which produces a group of chemicals called acylhomoserine lactones (AHL) that diffuse across the cell membrane. Cells produce AHL at a concentration proportional to their density. Therefore, as the cell density increases, AHL concentration also increases. The product of gene luxR is a protein that enhances the expression of all the other genes of the cluster. Protein LuxR is produced at a basal rate that is significantly higher than that of AHL at low cell density (Egland and Greenberg, 1999).

At low cell density, the activator LuxR is unable to bind to the lux box, and the luminescence genes are weakly expressed. When the AHL concentration reaches 10 nM, AHL binds to the activator LuxR. Binding of AHL induces a conformational change in the LuxR protein, which becomes activated. The activated LuxR/AHL complex then binds to the lux box and induces high expression of the luminescence genes luxA-E, luxG, and luxI (Figure 2.3).
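The density-dependent response described above can be sketched as a simple threshold function. The snippet below is a minimal illustration under stated assumptions, not a validated model of the Lux system: AHL is taken to be proportional to cell density, and lux expression is modelled with a Hill function centred on the 10 nM threshold mentioned in the text. The proportionality constant, the Hill coefficient, the basal expression level and the function names are all hypothetical.

```python
def ahl_concentration(cell_density, k_ahl=1.0e-8):
    """AHL concentration (nM), assumed proportional to cell density (cells/mL).

    k_ahl (nM per cell/mL) is a hypothetical proportionality constant.
    """
    return k_ahl * cell_density


def lux_activation(ahl_nM, threshold_nM=10.0, hill=2.0, basal=0.02):
    """Fraction of maximal lux expression for a given AHL concentration.

    The Hill form, the coefficient and the basal level are assumptions;
    only the 10 nM threshold comes from the text.
    """
    response = ahl_nM ** hill / (threshold_nM ** hill + ahl_nM ** hill)
    return basal + (1.0 - basal) * response


# Expression stays near the basal level at low density and rises sharply
# once the AHL concentration approaches the threshold.
for density in (1e7, 1e8, 1e9, 1e10):
    print(density, lux_activation(ahl_concentration(density)))
```

The sigmoidal shape is what makes the mechanism usable as a density sensor: the output is weak below the threshold and close to maximal above it.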

Quorum systems similar in principle to the luxR/luxI pair have been found in Gram-negative bacteria such as Pseudomonas, Erwinia, Serratia, Yersinia and Rhizobium species and Agrobacterium tumefaciens. These systems share proteins with sequences similar to LuxR, and signalling molecules of the acylhomoserine lactone group.

2.3.2 Potential for quorum sensing applications

The ability of microorganisms to sense their own population and respond to dynamic environments is a valuable tool in metabolic engineering, since it allows researchers to design self-monitored cells and manipulate metabolic pathways based on the state of a culture. Also, the modular nature of quorum systems allows researchers to link them with actuators and regulate fluxes of metabolic pathways.

In the next section, we review the genetic toggle switch, a genetic control device that can be linked to the quorum sensing module. Linking actuators to the quorum sensing module leads to more elaborate genetic networks that perform more complex tasks in a density-dependent manner. The design of such genetic networks is discussed in section 2.5.

2.4 Genetic controllers

2.4.1 The first synthetic biology construct: toggle switch

Synthetic biology aims at the design, construction and optimization of complex genetic circuits that perform tasks which are not carried out naturally by the cells. Biological parts that have been constructed include sensors, switches, logic gates, oscillators, pulse generators, inverters and band pass filters. Genetic control devices are commonly thought of as analogs to the electrical parts of circuits, and they are valuable tools in metabolic engineering since they enable researchers to modulate metabolic pathways. In contrast to their electronic counterparts, compatibility, connectivity and tuning of genetic devices are not trivial tasks in biological systems. Synthetic biology emerged in 2000 with the construction of two artificial genetic devices: the toggle switch and the genetic oscillator.

The genetic toggle switch accommodates the dynamic ON-OFF profile that we want to apply, and therefore is introduced here.

2.4.2 Model-based design of the toggle switch

Gardner et al. (2000) designed an artificial genetic circuit, the toggle switch, using a simple mathematical model of gene expression. The design requirement is to build a genetic device that switches between two states when an external, transient signal is applied, and that holds each state after the signal is removed. This property, also known as bistability, is illustrated in Figure 2.4. A simple model was built, prior to the construction of the genetic switch, to guide the selection of genetic components that satisfy the design requirement (details and analysis of the model are covered in Appendix D). The model is based on biochemical rate equations of gene expression (Gardner et al., 2000). The effective rates of synthesis of the repressors, α1 and α2, are lumped parameters that model the net effect of RNA polymerase binding, complex formation, transcript elongation, transcript termination, repressor binding, ribosome binding and polypeptide elongation (Gardner et al., 2000). These parameters are dimensionless. Parameters n1 and n2 model the multimerization of the repressors and the cooperative binding of the multimers to the operator sites of the promoter. Analysis of the mathematical model revealed three essential conditions for a robust and stable switch:

• Symmetrical promoters: the components of the toggle have to be symmetric, i.e., the strength of the two promoters must be similar. If one promoter is significantly stronger than the other, then it will prevail, resulting in a one-state system.

• Strong promoters: a strong pair of promoters (i.e., high gene expression rates) results in a more robust bistable switch. If gene expression rates are low, the repressor concentration is also low and the system can switch between states spontaneously. The first two conditions suggest that parameters α1 and α2 should have high values that are close to each other (Figure 2.4.A).

• Cooperative binding between protein and DNA: the model indicates that cooperative binding of protein to DNA favours bistability. Positive cooperativity is the phenomenon that involves sequential binding events, in which the binding of a ligand to a site increases the affinity for binding to the following site. Parameters n1 and n2 capture the cooperativity in the model. At least one of the repressors must have an n value higher than one (Figure 2.4.B). Cooperative binding results in sigmoidal responses that resemble sharp ON-OFF dynamics.

Based on the insights from the model analysis (Figure 2.4), Gardner et al. (2000) constructed different toggle switches with the IPTG-inactivated LacI repressor, the heat-sensitive CI repressor, and the aTc-inactivated TetR protein.
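The conditions above can be checked numerically with a small simulation. The sketch below assumes the commonly cited dimensionless form of the two-repressor model, du/dt = α1/(1 + v^n1) − u and dv/dt = α2/(1 + u^n2) − v, where u and v are the two repressor concentrations. The assignment of n1 and n2 to the two equations, the parameter values, the initial conditions and the function name are assumptions made for illustration; the model actually used in this thesis is given in Appendix D.

```python
def simulate_toggle(u0, v0, alpha1=10.0, alpha2=10.0, n1=2.0, n2=2.0,
                    t_end=100.0, dt=0.01):
    """Forward-Euler integration of an assumed dimensionless toggle model:

        du/dt = alpha1 / (1 + v**n1) - u
        dv/dt = alpha2 / (1 + u**n2) - v

    u and v are the (dimensionless) concentrations of the two repressors.
    Returns (u, v) at t_end.
    """
    u, v = u0, v0
    for _ in range(round(t_end / dt)):
        du = alpha1 / (1.0 + v ** n1) - u
        dv = alpha2 / (1.0 + u ** n2) - v
        u, v = u + dt * du, v + dt * dv
    return u, v


# Strong, symmetric promoters with cooperativity (n = 2): the final state
# depends on the initial condition, i.e., the system is bistable.
print(simulate_toggle(5.0, 0.1))
print(simulate_toggle(0.1, 5.0))

# Without cooperativity (n = 1): both initial conditions reach the same state.
print(simulate_toggle(5.0, 0.1, n1=1.0, n2=1.0))
print(simulate_toggle(0.1, 5.0, n1=1.0, n2=1.0))
```

In this sketch the cooperative case settles into one of two states depending on where it starts, whereas the non-cooperative case has a single steady state, in line with the third condition listed above.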

2.4.3 Construction of the toggle switch

The optimal dynamic ON-OFF control described by Gadkar et al. (2005b) in Section 2.2 can be applied with the genetic toggle switch constructed by Gardner et al. (2000). In this section, we describe the toggle switch constructed from genes lacI and cI, two natural and well-studied genetic switches (Figure 2.5). This toggle was used in our experimental implementation of the dynamic metabolic engineering application.

Figure 2.4: (A) The bistable region lies between the dashed lines. Strong promoters of similar strength (i.e., high and similar values of the dimensionless parameters α1 and α2) facilitate a stable and robust bistable switch. (B) High cooperativity (parameters n1, n2 > 1 in the model) increases the area of the bistable region. If the cooperativity factor is low (e.g., 1.1), noise can induce random transitions between the states. Figure adapted from Gardner et al. (2000). Permission to use the figure has been granted by the publisher.

The lac operon transcribes genes lacZ, lacY, and lacA when E. coli needs to metabolize lactose. In the absence of lactose, the cells inhibit expression of the lac operon. This is achieved by a transcription factor, the lac repressor, which binds close to the promoter and inhibits transcription. In the presence of lactose, the lac repressor binds to the sugar and is unable to bind to the promoter, allowing expression of the genes. Since lactose induces expression of the lac operon, it is called an inducer. A common inducer used instead of lactose in molecular biology is isopropyl-β-D-thio-galactoside (IPTG), because it is not metabolized by E. coli. IPTG binds to the lac repressor and inhibits repression.

In the toggle switch, the lac repressor was paired with the Ptrc promoter. In the absence of IPTG, the lac repressor is expressed (ON-state) and represses the promoter Ptrc (Figure 2.5.A). When IPTG is added, repression of Ptrc is inhibited, and the other component of the toggle switch, CI, represses lacI expression (OFF-state, Figure 2.5.B).

The second component of the toggle switch comes from the phage λ. Upon infection of a bacterial host, phage λ establishes either lytic growth (i.e., the host cell breaks open) or lysogenic growth. In the first case, the production of several copies of the phage leads to the host’s death. In lysogeny, the phage DNA is integrated into the microbial genome and the phage is in a dormant state.

One of the genes involved in the decision, cI, is used in the toggle switch. Gene cI encodes protein CI, which forms homodimers that bind to the operator region of the promoter and regulate its expression. The operator region has three sites, and cooperative binding is observed: when a CI dimer binds to the first operator site, it increases the affinity for the second operator site. A temperature-sensitive mutant version of the CI repressor was paired with the PL phage promoter. At high temperature (i.e., 42°C), the CI repressor is not functional and is unable to repress the promoter PL (Figure 2.5.A). At temperatures lower than 37°C, the CI repressor functions normally and represses promoter PL (Figure 2.5.B).

Gardner et al. (2000) combined the lacI and cI systems to form a bistable switch. In state A, gene lacI is expressed at a high level (ON-state), and gene cI is repressed (OFF-state). This state is induced by heat shock at 42°C. In state B, which is induced by IPTG, gene lacI is repressed and gene cI is expressed. Therefore, in order to apply the optimal ON-OFF gene expression suggested by Gadkar et al. (2005b), we can place genes contributing to growth (e.g., adh and pta for lactate production) downstream of gene lacI.

2.4.4 Logic gates

Biological logic gates implement Boolean or digital functions inside a living cell. A logical operation is applied to one or more logical inputs and a single logical output is produced.

[Figure 2.5 panels: (A) and (B) show the construct cI, Ptrc, PL, lacI; state A is set at 42°C and state B is set by IPTG.]

Figure 2.5: (A) In the first state, gene lacI is expressed, and protein LacI represses the promoter Ptrc. Thus, lacI is in the ON-state, and cI is in the OFF-state. Expression of lacI is effected by inactivation of the temperature-sensitive CI repressor at 42°C. (B) In the second state, gene cI is expressed, and protein CI represses the promoter PL. Thus, cI is in the ON-state, and lacI is in the OFF-state. Expression of cI is induced by inactivation of the LacI repressor protein with IPTG. Red circles represent inactive proteins, green circles represent active proteins.

The simplest logic gate is the NOT gate. The NOT gate inverts the input, and is commonly called an inverter. The repression of gene expression from a single promoter can be thought of as a NOT gate. If the presence/absence of the repressor (input) is denoted by 1/0, and the expression/repression of the gene (output) by 1/0, then the NOT gate responds as in Table 2.3.a.

Gates OR and AND require a promoter that is regulated by two inputs or transcription factors (activators). The output of an OR gate is on when either, or both activators are present (Table 2.3.b). The OR gate is off, only in the absence of both activators. In contrast, an AND gate is on only when both activators are present (Table 2.3.c). If only one of the activators, or none of them is present, the AND gate is off. Biological NOT, OR and AND gates naturally exist in biological networks (Setty et al., 2003), and have been constructed from existing components (Guet et al., 2002).

(a) NOT gate        (b) OR gate              (c) AND gate

input  output       inputs    output         inputs    output
                    A   B                    A   B
0      1            0   0     0              0   0     0
1      0            1   0     1              1   0     0
                    0   1     1              0   1     0
                    1   1     1              1   1     1

Table 2.3: Three simple digital logic gates: (a) NOT, (b) OR, and (c) AND. (a) For a repressor-promoter system, the absence of the repressor (i.e., input is ‘0’) results in the expression of the gene (i.e., output is ‘1’). The presence of the repressor (i.e., input is ‘1’) results in the repression of the gene (i.e., output is ‘0’). (b, c) For an activator-promoter system, an input of ‘1’ indicates the presence of an activator, and ‘0’ the absence. An output of ‘1’ indicates expression, and ‘0’ repression of the gene.

More complex biological gates have been constructed in the last two years. Moon et al. (2012) created AND gates that require three and four inputs in order to turn on gene expression. Wang et al. (2011) connected an AND gate to a NOT gate to create the first biological NAND gate. The NAND gate expresses the output gene (i.e., value of 1) when none or only one of the inducers is present (i.e., the first three cases in Table 2.4.a). The output is repressed (i.e., output of 0) only when both inducers are present.

Another valuable gate, the NOR gate, was constructed by Tamsir et al. (2011). The NOR gate expresses the output gene (i.e., value of 1) when no inducer is present (i.e., both inputs are 0). When one or both inducers are added the output is repressed (i.e., output is 0). The last two gates are mentioned here since they provide us with a practical way to implement the dynamic ON-OFF gene expression profile suggested by Gadkar et al. (2005b). However, these systems are more complex than the toggle switch and are not preferred at this stage.

(a) NAND gate            (b) NOR gate

inputs    output         inputs    output
A   B                    A   B
0   0     1              0   0     1
1   0     1              1   0     0
0   1     1              0   1     0
1   1     0              1   1     0

Table 2.4: The NAND and NOR gates can be used to apply the dynamic ON-OFF strategy suggested by Gadkar et al. (2005b).
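The truth tables in Tables 2.3 and 2.4 can be reproduced with a few lines of Python. This is only a Boolean abstraction of the gates, with inputs and outputs coded as 1/0 as in the tables; the names GATES, NOT and truth_table are introduced here for illustration and say nothing about how the biological gates are built.

```python
from itertools import product

# Two-input gates; inputs and outputs are coded as 1 (present/expressed)
# and 0 (absent/repressed), as in Tables 2.3 and 2.4.
GATES = {
    "OR":   lambda a, b: int(a or b),
    "AND":  lambda a, b: int(a and b),
    "NAND": lambda a, b: int(not (a and b)),   # an AND gate followed by a NOT gate
    "NOR":  lambda a, b: int(not (a or b)),    # an OR gate followed by a NOT gate
}


def NOT(a):
    """Single-input inverter (Table 2.3.a)."""
    return int(not a)


def truth_table(gate):
    """Return {(A, B): output} for one of the two-input gates in GATES."""
    return {(a, b): GATES[gate](a, b) for a, b in product((0, 1), repeat=2)}


for name in GATES:
    print(name, truth_table(name))
```

The NAND entry mirrors the construction described in the text, in which the output of an AND gate is passed through a NOT gate.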

Callura et al. (2012) created a metabolic switchboard that controls the carbon flow through the Embden-Meyerhof, the Entner-Doudoroff and the pentose phosphate pathways. The metabolic switchboard regulates genes pgi, zwf, edd, and gnd, which control the flux towards the three carbon utilization pathways. These genes were deleted from the wild-type MG1655 strain and placed under the control of the switchboard. The promoters used in the switchboard respond to anhydrotetracycline (ATC), IPTG, AHL and Mg2+. The authors showed that, despite E. coli’s preference for the Embden-Meyerhof route, it is possible to switch to the other two pathways upon addition of the appropriate inducer. The performance of the switchboard was confirmed by transcriptomic, proteomic and metabolomic data.

In the next section, we review integrated engineered networks that consist of simple genetic devices connected with the natural genetic network of the organism into which they are introduced.

2.5 Density-dependent genetic networks

Synthetic biology facilitates the engineering of synthetic genetic networks to re-program cells in order to perform novel tasks. The natural quorum sensing and the artificial toggle switch, described in sections 2.3 and 2.4 respectively, are examples of simple components that can be used to construct more complex genetic networks. The design of complex genetic networks adopts concepts and principles from control theory and circuit design. From a control theory perspective, synthetic biology can be considered an extension of control from the macroscopic reactor level to the genetic level.

The simplest case of control in the context of fermentations is open loop control (Figure 2.6.a). In open loop control, the goal is to optimize the production of a chemical. The process or system of interest is the cell metabolism (P). The output of interest (y) is controlled directly by manipulating the input (u). An example of open loop control is protein production with an inducible promoter. The cells are initially grown at a maximum rate, while the cell density is monitored externally. When a target density is reached, inducer is added by an operator to initiate protein production.

In a closed loop system, the control is applied inside the cell (Figure 2.6.b). A native or artificial sensor (S) monitors the status of the cell and responds by sending the appropriate signal to a controller (C). The controller is linked to both the sensor and the cell metabolism. The example in Figure 2.6.b shows a negative feedback loop. The measured output (ym) is compared to a set-point (ysp), and the difference between the two signals gives the error (e). The error is then transmitted to the controller, which determines the control action, i.e., the manipulated input (u).
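The closed loop in Figure 2.6.b can be sketched in a few lines. The example below is an assumption-laden illustration rather than a model of any specific circuit: the measured output is the cell density, the sensor is ideal (ym equals the true density), and the controller is a latched ON-OFF element that turns growth-associated genes off once the error e = ysp − ym is no longer positive. The function name, growth rates and set-point values are hypothetical.

```python
def closed_loop_batch(y_sp=1.0, x0=0.01, mu_on=0.5, mu_off=0.05,
                      dt=0.01, t_end=24.0):
    """Density-triggered ON-OFF control of growth in a batch culture.

    y_sp   : density set-point (gDW/L)
    mu_on  : growth rate while the growth genes are ON (1/hr)
    mu_off : growth rate after the genes are switched OFF (1/hr)
    Returns (t_switch, final_density); t_switch is None if the set-point
    is never reached within t_end.
    """
    x, t = x0, 0.0
    u = 1                # control action: 1 = growth genes ON, 0 = OFF
    t_switch = None
    while t < t_end:
        y_m = x                      # sensor S: ideal density measurement (assumed)
        e = y_sp - y_m               # error between set-point and measurement
        if e <= 0.0 and u == 1:      # controller C: latched ON-OFF decision
            u, t_switch = 0, t
        mu = mu_on if u == 1 else mu_off
        x += mu * x * dt             # process P: exponential growth
        t += dt
    return t_switch, x


print(closed_loop_batch(y_sp=1.0))
print(closed_loop_batch(y_sp=2.0))
```

No operator intervention is needed in this sketch: the switch time follows from the set-point, so raising the set-point simply delays the switch.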

The advantage of a closed-loop control strategy is that the process does not need continuous monitoring or the addition of expensive inducers. Therefore, a self-monitoring, self-regulating system can be valuable for bioconversions. The ability to apply dynamic control gives us the tools to implement the dynamic strategy proposed by Gadkar et al. (2005b). Although the construction of complex genetic networks is challenging (see section 2.5.2), some inspiring examples from the literature are discussed in the following section.

a. Open loop control b. Closed loop control

Figure 2.6: The boxes P, S and C represent the process, sensor and controller, respectively. Variables u and y are the input and output. The measured output signal is ym, the set-point is ysp, and the difference between them is the error e.

2.5.1 Density-dependent applications

The natural quorum sensing mechanism has been exploited in many applications to program a specific response in a density-dependent manner. One of the first applications was that of You et al. (2004). The quorum sensor was coupled to a ‘killer’ gene that causes cell death. The circuit achieves the control of the population below a target concentration.

This system represents a closed-loop control architecture as described in Figure 2.6.b. The quorum sensing mechanism (i.e., the sensor S) is coupled to an actuator or controller C (i.e., the ‘killer’ gene), which expresses a protein that causes cell death when the cell concentration reaches the set-point ysp.

In a simple control architecture, Tsao et al. (2010) engineered the native quorum sensing system in E. coli to improve expression of the T7 polymerase. Overexpression of the recombinant protein decreases growth; it is therefore preferable to separate the growth phase from the production phase. The engineered system couples expression of the T7 promoter to E. coli’s quorum sensing, so that expression is induced only when the system has reached the stationary phase. This control strategy has two significant advantages over conventional protein expression processes. First, the use of an external inducer (typically IPTG) is eliminated, thus decreasing the operational costs of the process. Second, monitoring of the culture is no longer necessary.

Kobayashi et al. (2004) created a modular system that couples the quorum sensing from V. fischeri to the artificial toggle switch, to express a target gene when a threshold density is reached (Figure 2.7). In the engineered system, E. coli senses its own concentration through AHL (presented in section 2.3), whose synthesis is encoded in the plasmid pAHLb. This plasmid expresses the AHL-producing gene luxI and gene luxR from V. fischeri, and gene lacI from the toggle switch. This plasmid is not expressed at low cell density (Figure 2.7.A), and it is induced only above a threshold density (Figure 2.7.B). With this design, the toggle switch is coupled to the quorum sensing module through gene lacI. When the culture reaches the threshold density, it triggers lacI expression, which in turn represses gene cI, the second component of the toggle plasmid. Therefore, gene lacI follows an OFF-ON profile during bacterial growth, while gene cI responds in an ON-OFF manner (Kobayashi et al., 2004).

This design is well suited to the ON-OFF control suggested by Gadkar et al. (2005b), and it will be the basis for our implementation of the dynamic strategy we propose in Chapter 3.

2.5.2 Challenges

The analogy to electronic circuits helps us in the design of integrated circuits. However, the implementation of biological circuits has its own challenges. The genetic components are part of a large network with numerous interactions, many of them unknown, and integration in different conditions or organisms is not always trivial. Once a genetic circuit is integrated, its function may be altered by genetic mechanisms inherent in the host organism, or there may be complications in transferring the genetic circuit to the next generations. Thus, epigenetic inheritance, a persistent phenotypic response, is desired in synthetic biology.

Panels A and B: sensor (luxI, luxR, Plux, lacI); genetic controller (cI, Ptrc, PL*, lacI); reporter (gfp, PL*).

Figure 2.7: The density-dependent expression of gfp from Kobayashi et al. (2004). Red colour indicates a repressed gene, green indicates an activated gene. (A) At low cell density, the AHL and LuxR production from the sensor plasmid are low. As a result, lacI expression is low, and expression of cI in the controller plasmid is ON, thus repressing the reporter gfp. (B) At densities above a certain threshold, the AHL-LuxR complex formed triggers expression of lacI. In turn, cI is repressed, leading to expression of the reporter protein GFP.

Another concern is that tuning of the parameters of a genetic circuit is limited. For example, in the aforementioned system, one would like to be able to manipulate the threshold cell density that triggers the switch from the ON to the OFF state. Controlling the level of gene expression may also be challenging. Altering the ribosomal binding sites may offer some degree of tuning, but the limits are mostly determined by the selected components of the circuit (Salis et al., 2009).

Therefore, in this thesis we first attempt to implement the toggle switch on its own, to show the potential for dynamic control. After this, the next step will be to couple the toggle switch to the quorum sensing module in a similar way to the work of Kobayashi et al. (2004).

2.6 Lactate production in microbial hosts

Lactic acid is a bulk chemical whose market grew strongly from 150,000 tonnes in 2002 to approximately 400,000 tonnes in 2011. It is one of the top 30 chemical building block candidates according to the US Department of Energy, and it has potential for significant market growth. Lactic acid has long been used in the food and beverage industry as a preservative and pH buffer, as well as in the pharmaceutical and chemical industries as a solvent and raw material for other products. Recently, lactic acid has attracted a lot of interest as the monomer in the production of polylactic acid (PLA), a biodegradable plastic (NNFCC, 2013).

2.6.1 Natural producers and yeast strains

The lactic acid bacteria (LAB) Lactobacillus acidophilus and Streptococcus thermophilus produce 90% of the lactic acid worldwide. Productivity and titer values as high as 144 g/l/h and 771 g/l were achieved with processes involving either continuous product extraction or cell recycling. Although LAB produce lactate naturally, their growth requires complex nutrients, they cannot ferment pentose sugars, and they produce a mix of acids, which complicates downstream processes (Sauer et al., 2008). An exception is the filamentous fungus Rhizopus oryzae, which grows on mineral medium and several carbon sources (Liu et al., 2006). LAB produce L-lactic acid at 95% purity, which affects the properties of PLA. Yeast strains have been genetically engineered to achieve higher purity of the desired monomer. In Table 2.5, we review the main bioengineering objectives, namely productivity, yield and titer, as well as the specific nutrient requirements for batch characterizations (Demirci and Pometto, 1992; Hofvendahl et al., 1999; Kyla-Nikkila et al., 2000; Liu et al., 2006; Porro et al., 1999; Saitoh et al., 2005; Soccol et al., 1994; Tokuhiro et al., 2009). Productivity values range from 0.91 to 3.3 g/l/h, and yield values range between 0.60 and 1 g of lactate/g glucose, which is the maximum theoretical yield. Also, a titer as high as 122 g/l was achieved (Saitoh et al., 2005). The high costs of nutrients and downstream processing make most of these hosts inefficient biocatalysts for large-scale processes.

2.6.2 Lactate from E. coli strains

Escherichia coli is advantageous for commercial production of lactate since it grows rapidly and has simple nutritional requirements. Also, simple genetic engineering techniques allow metabolic engineering for optimization of lactate production. Here, we examine the bioengineering objectives associated with the economics of lactate production with E. coli strains. Fermentations are grouped into strictly anaerobic and dual-phase aerobic-anaerobic batches (Tables 2.6 and 2.7, respectively).

2.6.3 Anaerobic production in E. coli

Anaerobically, the best E. coli strains achieve yields between 93 and 99% of the maximum theoretical yield (Table 2.6). However, titer and productivity are limited when growing in minimal medium. Strains JP201 (Chang et al., 1999), SZ40, SZ58, SZ63 (Zhou et al., 2003), and SZ132 (Zhou et al., 2005) were not able to fully ferment 100 g/l of glucose in minimal medium, a typical requirement for large-scale fermentations. These strains were only able to fully consume 50 g/l of glucose, resulting in titers between 47 and 51 g/l. Productivity and growth rate were relatively low as a result of the mutations and the anaerobic conditions (productivity values between 0.27 and 0.44 g/l/h). The use of tryptone and yeast extract in strain FBR11 (Dien et al., 2001) and SZ132 (Zhou et al., 2005) induced faster growth and resulted in productivity values of 2.33 and 1.88 g/l/h, respectively, but these supplements are not recommended for industrial applications because of their high cost.

Table 2.5: Comparison of natural lactate-producing and yeast strains.

Strain           Productivity (g/l/h)  Yield (g/g)  Titer (g/l)  Year  Note
K. lactis        0.91                  0.60         109          1999  1
L. delbrueckii   1.17                  0.73         117          1992  2
R. oryzae        1.38                  0.78         94           1994  3
S. cerevisiae    1.83                  0.68         64           2009  4
Rhizopus sp.     2.20                  0.80         95           2006  5
S. cerevisiae    2.54                  0.61         122          2005  6
L. helveticus    3.00                  0.92         60           2000  7
L. lactis        3.30                  1.00         91           1999  8

1 The medium contains (per l): 50 g glucose; 30 g of dry solids of light corn steep water; 10 g yeast extract; 200 mg adenine.
2 Mutant consumed 160 g/l glucose, in 3% yeast extract and 0.05% oleic acid medium.
3 Medium contains 120 g/l glucose and minerals.
4 Medium contains (g/l): 100 glucose; 10 yeast extract; 20 peptone.
5 Scale-up study to 5 m3, using 120 g/l glucose in mineral medium.
6 Using cane juice-based medium (20 g/l sugar) with 0.3% yeast extract.
7 Using whey permeate medium (41 g/l lactose) supplemented with 40 g/l lactose and 20 g/l yeast extract.
8 Simultaneous saccharification-fermentation (SSF) process using a commercial enzyme with 90 g/l wheat flour and 5 g/l yeast extract. Productivity reported is the maximum rate, not the average.

Zhou et al. (2006a) replaced complex nutrients with betaine, a non-metabolized protective osmolyte. Betaine protects the cells from osmotic stress, and allowed strain SZ132 to fully ferment 100 g/l of glucose, resulting in improved growth and a productivity of 1.16 g/l/h. However, betaine resulted in the production of undesired byproducts and a lower product yield (0.84 g/g). In SZ194, byproducts were eliminated after deleting the heterologous genes casAB and celY from strain SZ132 (Zhou et al., 2006b). This step restored the yield to 0.95 g/g and resulted in the highest titer among the anaerobic processes. However, the productivity in strain SZ194 was slightly lower than in SZ132, due to a lower growth rate.

2.6.4 Dual-phase production in E. coli

To enhance growth rate and productivity, dual-phase fermentations have been applied. The initial phase is aerobic to generate biomass at the highest rate possible. Once high cell density is generated, switching to anaerobic or oxygen-limited conditions halts growth and initiates lactate production. All strains were grown aerobically for 10-14 h to a biomass concentration of approximately 10-11 g/l and then switched to anaerobic (strains JP201, JP203 and ALS974) or oxygen-limited (B0013 strains) conditions. Strictly anaerobic conditions were maintained by sparging with N2. Alternatively, oxygen-limited conditions were implemented by a low agitation rate (100 rpm) and no N2 sparging, which reduces the production cost. Glucose was fed when necessary to maintain the concentration above 10 g/l. The results of dual-phase fermentations are reviewed in Table 2.7.

Strains JP201 and JP203 were supplemented with tryptone and yeast extract, resulting in productivity values of 1.07 and 1.04 g/l/h, respectively. In these strains, genes pta and ppc are deleted to eliminate acetate and succinate production. Low yield and titer values were reported for both strains, despite the addition of tryptone and yeast extract supplements (Chang et al., 1999).

In strain ALS974, genes pfl and aceEF are deleted to prevent conversion of pyruvate to acetyl-CoA, and pps and poxB are deleted to prevent conversion of pyruvate to phosphoenolpyruvate and acetate, respectively. Also, the frd operon is deleted to prevent accumulation of succinate. Strain ALS974 was supplemented with acetate in the first phase to support growth. In the first study of strain ALS974, small amounts of succinate and ethanol accumulated during the second phase of the batch, resulting in a productivity of 3.54 g/l/h. Metabolic flux analysis with 13C-labeled substrates showed that succinate is produced anaerobically from residual acetate at the end of the first phase (Zhu et al., 2007). In the subsequent study, the timing of the switch between the aerobic and anaerobic phase was optimized. The purpose of the optimization was to eliminate succinate production by switching to the anaerobic phase when acetate was depleted. In addition to succinate elimination, cell recycling during the anaerobic phase improved the productivity to 4.20 g/l/h (Zhu et al., 2008).

Table 2.6: Comparison of lactate-producing strains under strictly anaerobic conditions. Productivity: g/l/h; yield: g/g; titer: g/l.

Strain  Characteristics             Productivity  Yield  Titer  Year  Note
JP201   ∆(pta)                      0.31          0.76   47     1999  1
FBR11   ∆(pflB), ldh::S.bovis ldh   2.33          0.93   73     2001  2
SZ40    ∆(pflB-frd)                 0.27          0.99   51     2003  1
SZ58    ∆(pflB-frd-adhE)            0.30          0.97   51           1
        +acetate (10mM)             0.33          0.97   48
SZ63    ∆(pflB-frd-adhE-ackA)       0.29          0.98   49           1
SZ132   ∆(focA-pdc-adhB-pflB)       (a) 1.23      0.88   44     2005  3
        adhE::FRT ∆ackA::FRT        (b) 0.44      0.92   63           1, 3
                                    (c) 1.88      0.95   90
SZ132   +betaine (1mM)              1.16          0.84   84     2006
SZ194   +betaine (1mM)              0.94          0.95   110    2006  4

1 These strains were not able to fully ferment 100 g/l of glucose in minimal medium. The results shown here are from fermentations of 50 g/l of glucose. All SZ strains are in NBS medium.
2 Fermentations were performed in 80 g/l glucose, supplemented with (per l): 10 g tryptone and 5 g yeast extract.
3 Genes casAB from K. oxytoca and celY from E. chrysanthemi were integrated into the chromosome. Cases (a) and (b) are based on fermentation of 50 and 100 g/l of glucose, respectively. Case (c) is in 100 g/l of glucose in LB medium.
4 Strain SZ194 is derived from SZ132 after deleting native gene frdBC and heterologous genes casAB and celY.

Table 2.7: Comparison of lactate-producing strains under two-phase aerobic-anaerobic conditions. Productivity: g/l/h; yield: g/g; titer: g/l.

Strain      Characteristics             Productivity  Yield  Titer  Year  Note
JP201       ∆(pta)                      1.07          0.56   60     1999  1
JP203       ∆(pta-ppc)                  1.04          0.71   62     1999
ALS974      ∆(aceEF-pfl-poxB-pps-frd)   3.54          0.86   138    2007  2
ALS974      Process optimization        4.20          -      -      2008  2
B0013-070                               3.32          0.87   125    2011  3
B0013-070B                              4.32          0.84   123    2012  4

1 Medium contains (per l): 10 g tryptone, 5 g yeast extract, 0.3 g of thiamine and 50 g of glucose.
2 GAM mineral medium was used, with glucose (20 g/l) and acetate (10 g/l) as carbon sources in the growth phase.
3 M9 medium was used in the bioreactor experiments.
4 The native promoters of ldh are replaced by the temperature-sensitive promoters pL and pR.

In the B0013-070 strains, oxygen-limited conditions were implemented in the second phase instead of strictly anaerobic conditions. Strain B0013-070 has 8 gene deletions that target elimination of byproducts (ackA-pta, pflB, poxB, adh), accumulation of the precursor pyruvate (pps), prevention of lactate utilization (dld), and disruption of the TCA cycle (frdA). Implementation of a dual-phase approach (11 h aerobic phase, followed by the oxygen-limited production phase in a fed-batch mode) resulted in a productivity of 3.32 g/l/h and a titer of 125 g/l, without using acetate as a carbon source in the first phase (Zhou et al., 2011).

Further improvements of strain B0013-070 were applied in the derivative strain B0013-070B. Lactate production in the growth phase, even in small amounts, decreases biomass yield. Strain B0013-070 produces 7 g/l of lactate by the end of the first phase, competing with biomass. The hypothesis of Zhou et al. (2012) is that complete elimination of lactate from the first phase increases biomass yield. This hypothesis was tested by replacing the native ldh promoters with the temperature-sensitive promoters pL and pR in strain B0013-070B (Zhou et al., 2012). The activity of these tandem promoters is low at temperatures between 30 and 35 °C, and increases sharply at temperatures higher than 35 °C. Zhou et al. (2012) ran the aerobic phase at 33 °C, although it is not the optimal temperature for growth. At this temperature lactate is produced in traces and the growth rate is only 4% less than the maximum growth rate observed at 37 °C. The production phase was performed at 42 °C to maximize the expression of ldh. This genetic design completely separates the growth and the production phase, by minimizing lactate generation during the first phase, while maintaining high growth rate, and then maximizing lactate production in the second phase.

Although neither operating temperature is the optimal growth temperature (37 °C), the genetic optimization increased productivity by 30% compared to the previous strain, B0013-070. Also, the productivity (4.32 g/l/h) is the highest reported among E. coli strains.
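As a quick check of the 30% figure, the relative increase can be computed from the productivity values listed for the two strains in Table 2.7. The symbol P (volumetric productivity, in g/l/h) is introduced here for illustration only.

```latex
\[
\frac{P_{\text{B0013-070B}} - P_{\text{B0013-070}}}{P_{\text{B0013-070}}}
  = \frac{4.32 - 3.32}{3.32} \approx 0.30
\]
```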

2.6.5 Summary

Several lactic acid bacteria and Rhizopus species produce lactate naturally, and are used in existing industrial applications with productivity values as high as 3.30 g/l/h. However, the complex nutrient requirements and the lack of genetic engineering tools limit the metabolic engineering of such strains. In contrast, genetic engineering of E. coli is straightforward, and the organism requires only simple nutrients for growth. For these reasons, E. coli is an attractive host for lactate production. Metabolic engineering of E. coli is focused on eliminating competing byproducts and forcing carbon flow towards lactate to improve its yield.

Mutations and anaerobic conditions have a negative impact on the growth rate and the productivity of the process. The productivity ranges between 0.27 and 0.44 g/l/h for cultures growing in minimal medium and 100 g/l glucose. Addition of tryptone and yeast extract improved productivity to 1.88 and 2.33 g/l/h, but the use of these supplements is not recommended in large-scale applications due to their high cost. When betaine, a protective osmolyte, was used at low concentration (1mM), cells were able to restore their growth, and productivities of 0.94 and 1.16 g/l/h were achieved. Even with these improvements, the productivity is much lower than that of the natural producers discussed in the previous paragraph.

To alleviate the effect of the anaerobic conditions on growth rate and productivity, dual-phase batches were applied. In the first phase, aerobic conditions favour growth at the expense of lactate production. When high biomass concentration is reached, switching to anaerobic or oxygen-limited conditions induces lactate production at high productivity.

With further improvements at the process and the genetic level, productivity values of 4.20 and 4.32 g/l/h were achieved. Although these values are higher than those achieved by natural producers, dual-phase fermentations are still not economically feasible for large-scale applications, due to the high cost of sparging to maintain aerobic and anaerobic conditions in two separate phases. In addition, the transition between the two modes of metabolism can be problematic in an industrial-scale setup.

Here, we present an alternative approach to improve growth rate and productivity, without the aerobic growth phase. Growth impairment due to gene deletions can be alleviated by dynamically expressing the genes that contribute to growth, instead of deleting them. Dynamic gene expression implies that the cells grow as fast as wild-type cells while the genes of interest are expressed. Then, the genes are repressed and the cells function as mutants, maximizing lactate production. Our genetic approach also separates the growth and the production phase; however, this is achieved in completely anaerobic conditions. The dynamic method is expected to improve the productivity of anaerobic processes similar to the ones reviewed in section 2.6.3.

2.7 Summary and synthesis

2.7.1 Summary of the literature review

Bioprocess optimization has engaged the Process Systems Engineering (PSE) community since the 1970s, with the development of optimization algorithms and methods at the reactor level. Optimization of protein and penicillin production has reached qualitatively similar results: an initial growth phase, followed by the production phase (first method in Table 2.8). In both cases, glucose is fed to ensure maximum growth rate in the first phase. In the second phase, production is induced with the appropriate inducer in the protein production case (Lee and Ramirez, 1994). Conditions that favour penicillin production are implemented in the latter case (San and Stephanopoulos, 1989).

Bacteria have evolved to improve their chances of survival in constantly changing environments. Thus, metabolic networks favour biomass production, rather than production of wasteful chemicals. In bioengineering and metabolic engineering, effort is made to shift the balance towards the latter. With the availability of whole genome sequences and genome-scale metabolic models in the last two decades, the opportunity for extending the optimization to the genetic level has opened. The creation of genome-scale models of metabolism has improved our understanding, but has also exposed the complexity of genetic networks. Computational algorithms have helped us systematically explore the potential of metabolic networks for chemicals production (reviewed in Table 2.2). Although the algorithms demand a lot of computational power and time, they have confirmed intuitive designs and uncovered some non-intuitive ones. Most of the strain design algorithms focus on improving the yield of the desired chemical, by coupling production with growth (OptKnock, OptReg, GDLS, EMILiO). Optimization of chemical transformations indicates that productivity is a factor that has to be considered along with yield. Low productivity is a common bottleneck for the implementation of strain designs, and it comes as a result of low growth rate. The trade-off between productivity and yield in metabolic engineering is the starting point of this thesis. Although DySScO addressed this issue and used a brute-force method to scan along the production envelope, growth rate still restricts productivity. In all the strain design algorithms discussed so far, the control is static; that is, gene expression (knockout, up- or down-regulation) does not change over time.

In contrast, dynamic strategies can be used to alleviate the low growth rate and improve productivity. Apart from reactor feeding and control of reactor conditions (e.g., oxygen level), dynamic gene expression strategies are of particular interest. The design of Farmer and Liao (2000) uses a genetically engineered system to redirect excess carbon flux toward lycopene; however, this approach is specific to lycopene production and cannot be applied to other strain designs (second case in Table 2.8). Gadkar et al. (2005b) showed that it is optimal to initially direct carbon towards growth, and then reroute carbon towards product (third method in Table 2.8). This dynamic strategy is not specific to a certain strain design; it can be applied in any strain that suffers from growth retardation. A graphic comparison of the dynamic and static strain design algorithms illustrates that every strain design can benefit from the application of the dynamic strategy. Even in DySScO, which is the only algorithm that takes productivity into account, the dynamic strategy can improve the productivity of the strain design.

The rest of the literature review covers the synthetic biology tools required for the application of the dynamic control of gene expression to a metabolic engineering process. Quorum sensing, the sensing element of the circuit, is a natural mechanism that allows bacteria to sense their own density. The genetic toggle switch is used as the controller of the system and is coupled to the quorum sensing. Challenges encountered during the construction of density-dependent genetic circuits are also discussed. Finally, succinate and lactate production are selected as fermentation case studies that can benefit from the dynamic strategy.

Table 2.8: Summary of dynamic optimization approaches to metabolic engineering.

Method                               Implementation                       Consequences

Bioreactor optimization (San &       First accumulate biomass as fast as  Two-phase fermentation
Stephanopoulos, 1989; Lee &          possible
Ramirez, 1994)                       Then induce protein or penicillin    Control at the reactor level
                                     production

Lycopene production (Farmer and      Linked to the NRI phosphorylation    Not generally applicable to other
Liao, 2000)                                                               strain designs
                                                                          Only responds to acetyl phosphate

Dynamic control of gene expression   Target knockouts can be dynamic      Bi-level optimization problem is
(Gadkar et al., 2005)                instead                              difficult
                                     Balances productivity and yield      Allows tuning between productivity
                                     trade-off                            and yield

Bioreactor optimization for ethanol  First grow cells aerobically for     Two-phase fermentation
production (Hjersted and Henson,     high growth
2006)                                Then switch to anaerobic conditions  Control at the reactor level
                                     for increased ethanol production

Dynamic metabolic engineering - DME  Propose in silico an experimental    Exploits quorum sensing and genetic
(Anesiadis et al., 2008)             implementation of the method by      toggle switch
                                     Gadkar et al. (2005)                 Control at the genetic level

Sensitivity analysis of DME          Sensitivity of the genetic circuit   Identify targets for design and
(Anesiadis et al., 2011)             parameters with respect to the       engineering
                                     product concentration

Parametric analysis of DME           Analyze the balance between          Identify the best operating region
(Anesiadis et al., 2013)             productivity and yield               in the parameter space

2.7.2 Synthesis and outline

Although Gadkar et al. (2005b) demonstrated in silico that dynamic control of gene expression can improve productivity, the major challenge remains: how do we implement dynamic control of gene expression in vivo? The main contribution of this thesis is an attempt to answer this question (second part of Table 2.8).

In Chapter 3, we develop a model-based design for the dynamic metabolic engineering strategy. The initial circuit design is based on synthetic biology concepts that were discussed in the literature review. This chapter includes material from Anesiadis et al. (2008), where the initial model-based design was applied for the anaerobic production of ethanol and succinate. Since a simple way to alleviate the growth impairment in anaerobic processes is to apply a dual aerobic-anaerobic strategy, here we apply the dynamic strategy for the aerobic production of serine. The serine case study was selected to stress the broad applicability of the method.

The questions that arise once we build the initial design are: how sensitive is the circuit design to the genetic circuit parameters, and what are the ranges of the parameters that will lead to the maximum productivity? The mathematical analysis to answer these questions is presented in Chapter 4. Material in this chapter is based on the publications by Anesiadis et al. (2011, 2013).

Finally, the in vivo implementation of the dynamic strategy is the concluding task of this thesis. The experiments conducted to this end, and their results, are discussed in Chapter 5.

Chapter 3

Model-based design for dynamic metabolic engineering

The design of a programmable, density-dependent, integrated genetic circuit is the objective of this chapter. The negative feedback system described in section 2.5 is a simple architecture to engineer a density-dependent circuit. The modular nature of gene regulatory systems allows the integration of a biosensor and a genetic controller with bacterial metabolism. Quorum sensing (section 2.3) is utilized as the biosensor, to detect the cell density of the culture and link it to the genetic controller. The genetic toggle switch (section 2.4) responds to the signal created by the quorum sensing, and regulates the output in an ON-OFF manner. The output placed under the control of the toggle switch is the gene(s) of the metabolic network that contribute towards biomass formation (and growth rate), at the expense of product. The integration of the three modules (shown in Figure 3.1) is elucidated in this chapter with the aid of an integrated mathematical model. The dynamic properties of the individual modules and the input/output relationships dictate the behaviour of the overall genetic circuit. Thus, modelling the overall integrated system facilitates our understanding of the system dynamics.

Block diagram labels: AHLsp, toggle switch, CI, cell metabolism, cellular biomass, AHL, quorum sensing.

Figure 3.1: The negative feedback design for density-dependent control of gene expression. Biomass is sensed through the signalling molecule AHL with the quorum sensing module. The AHL signal feeds the toggle switch depending on its value relative to the threshold. The manipulated genes are placed downstream of gene cI in the toggle switch, and therefore follow the profile of protein CI. If the value of AHL is below the threshold, the manipulated genes are expressed. When AHL is higher than the threshold, the genes are repressed. Genes that contribute to high growth rate are placed under the control of the toggle switch, thus linking the toggle switch and metabolism.

In this chapter, we develop a mathematical model of the individual modules of the genetic circuit (sections 3.1.1 and 3.1.2), and the dynamics of the genetic circuit (section 3.1.3). Then, we present a serine-producing strain, the case study for the dynamic method (section 3.1.4). Aerobic serine production was preferred over the anaerobic production of succinate and ethanol proposed by Anesiadis et al. (2008) for the following reason. Shortly after the publication of this earlier paper, the objection was made that the growth retardation caused by the gene deletions can be relieved simply by the implementation of a dual aerobic-anaerobic strategy. Aerobic conditions induce fast growth initially, and the switch to anaerobic conditions triggers the production of the desired metabolite. Although the aerobic-anaerobic strategy is straightforward for anaerobic production processes, the dynamic gene control method is still valuable. Consider a strain that produces a chemical aerobically with a growth rate significantly lower than that of the wild-type. In this case, there is no alternative for improving the growth rate other than the dynamic control of gene expression. In fact, recent strain design algorithms are capable of predicting genetic manipulations that achieve high yield aerobically at the expense of low growth rate. Serine production is one such example, and the strain design used here was obtained through the EMILiO algorithm (Yang et al., 2011). Finally, the coupling of the genetic circuit to bacterial metabolism through the flux balance analysis framework is discussed (section 3.1.5).

3.1 Methods

3.1.1 Quorum sensing modelling

In section 2.3 we saw that AHL is the natural signalling molecule secreted by Gram-negative bacteria to monitor the density of their culture and respond according to the signal. To equip E. coli with the sensing capability using AHL as the signalling molecule, the design of the sensor plasmid from Kobayashi et al. (2004) is used. In the sensor plasmid (Figure 3.2), gene luxI from V. fischeri is expressed from the PluxI promoter, along with genes luxR and lacI.

The protein LuxI is a synthetase that converts precursor metabolites into AHL. Thus, the extracellular concentration of AHL is proportional to the cell density (Kobayashi et al., 2004). Equation 3.1 models bacterial growth; the rate is proportional to the cell or biomass density (X is the biomass concentration, and µ is the growth rate). Equation 3.2 models the AHL dynamics (A). The rate of AHL production is proportional to cell density (first term), and AHL degrades with first-order kinetics (second term). In this model, AHL is assumed to diffuse freely, and its intracellular and extracellular concentrations are assumed to be equal (Kaplan and Greenberg, 1985). Parameters vA and γA are the AHL production and degradation rate constants, respectively. As a result, both biomass and AHL concentration are expected to increase exponentially.

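$$\text{Biomass } (X): \quad \frac{dX}{dt} = \mu \cdot X \tag{3.1}$$

$$\text{AHL } (A): \quad \frac{dA}{dt} = v_A \cdot X - \gamma_A \cdot A \tag{3.2}$$

$$\text{AHL/LuxR complex } (R): \quad \frac{dR}{dt} = \rho_R \cdot [\text{LuxR}]^2 \cdot A^2 - \gamma_R \cdot R \tag{3.3}$$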

[Figure 3.2 schematic: sensor plasmid carrying luxI, luxR, the Plux promoter and lacI, with AHL binding to LuxR.]

Figure 3.2: The quorum sensing plasmid. The signalling molecule AHL is proportional to biomass concentration. Once AHL reaches a threshold concentration, it binds to LuxR, forms a dimer complex, and activates the Plux promoter.

Once AHL reaches a critical concentration, it binds to protein LuxR. Upon binding and dimerization, the AHL/LuxR complex (R) becomes activated and binds to the relevant promoter site. This site is the promoter region of the toggle switch, thus linking the sensor and the controller. The toggle switch dynamics will be discussed next. The dynamics of the AHL/LuxR complex are described in Equation 3.3. Quadratic terms are used to model the dimerization of the initial complex to form an active transcription factor. Parameter ρR is the LuxR/AHL dimerization rate constant and γR is the complex degradation rate constant. All the parameters and their values are presented in Table 3.1.

Table 3.1: Parameter values of the quorum sensing

| Parameter | Description | Value | Reference |
|---|---|---|---|
| µ | Growth rate | 0.4 h−1 | From FBA |
| vA | AHL production rate constant | 1.6 µmol/(gr · h) | You et al. (2004) |
| γA | AHL degradation rate constant | 0.6 h−1 | You et al. (2004) |
| ρR | LuxR/AHL dimerization constant | 30 1/(µM^3 · h) | Basu et al. (2005) |
| LuxR | LuxR concentration | 0.5 µM | Basu et al. (2005) |
| γR | LuxR/AHL degradation rate constant | 1.386 h−1 | Basu et al. (2005) |
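The sensor model of Equations 3.1 to 3.3 can be sketched numerically as follows. This is a minimal illustration, not the thesis code in Appendix B: it swaps in a hand-written fourth-order Runge-Kutta step for the MATLAB solver, takes the Table 3.1 values at face value without unit conversion, and uses hypothetical function names (`sensor_rhs`, `rk4_step`, `simulate_sensor`, `ahl_closed_form`). The closed-form expression for AHL assumes a constant growth rate and A(0) = 0, and serves only as a check on the integration.

```python
import math

# Nominal values from Table 3.1, used as listed (no unit conversion attempted).
MU = 0.4         # growth rate, 1/h
V_A = 1.6        # AHL production rate constant
GAMMA_A = 0.6    # AHL degradation rate constant, 1/h
RHO_R = 30.0     # LuxR/AHL dimerization constant
LUXR = 0.5       # LuxR concentration
GAMMA_R = 1.386  # complex degradation rate constant, 1/h


def sensor_rhs(state):
    """Right-hand sides of Equations 3.1-3.3 for the state (X, A, R)."""
    x, a, r = state
    dx = MU * x
    da = V_A * x - GAMMA_A * a
    dr = RHO_R * LUXR**2 * a**2 - GAMMA_R * r
    return (dx, da, dr)


def rk4_step(rhs, state, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = rhs(state)
    k2 = rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
    k3 = rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
    k4 = rhs(tuple(s + h * k for s, k in zip(state, k3)))
    return tuple(
        s + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate_sensor(x0=0.1, t_end=12.0, h=0.01):
    """Integrate from (X, A, R) = (x0, 0, 0) and return the final state."""
    state = (x0, 0.0, 0.0)
    for _ in range(round(t_end / h)):
        state = rk4_step(sensor_rhs, state, h)
    return state


def ahl_closed_form(x0, t):
    """Solution of Equation 3.2 for X = x0 * exp(MU * t) and A(0) = 0."""
    return V_A * x0 / (MU + GAMMA_A) * (math.exp(MU * t) - math.exp(-GAMMA_A * t))
```

Calling `simulate_sensor()` returns the biomass, AHL and complex levels after 12 h. After the initial transient decays, the closed form shows AHL tracking biomass with the constant ratio vA / (µ + γA), which is the sense in which AHL reports cell density.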

3.1.2 Toggle switch modelling

The toggle switch used here comprises two genes, lacI and cI, which encode the transcriptional repressors LacI (L) and CI (C), respectively. Gene lacI is repressed by the CI repressor protein. Repression of lacI is modelled in the first term of Equation 3.4 (when C is high, the first term is practically zero). The second term models the induction of lacI in the quorum sensing plasmid and links the sensor with the controller. Biomass, AHL, and complex R have very low concentrations at the early stages of the batch. Therefore, the second term is initially zero. When complex R starts forming, the second term increases. At some critical point, the complex fully induces the lacI promoter and the LacI protein concentration (L) increases rapidly. The last term in Equation 3.4 models protein degradation.

$$\text{LacI protein } (L): \quad \frac{dL}{dt} = \frac{\alpha_{L1}}{1 + \left(\frac{C}{\beta_C}\right)^2} + \frac{\alpha_{L2} R^3}{\theta_R + R^3} - \gamma_L L \tag{3.4}$$

$$\text{CI protein } (C): \quad \frac{dC}{dt} = \frac{\alpha_C}{1 + \left(\frac{L}{\beta_L}\right)^2} - \gamma_C C \tag{3.5}$$

The dynamics of CI are modelled in Equation 3.5. Repression of the cI gene by the LacI protein is modelled in the first term of Equation 3.5. Initially, lacI is repressed, and the concentration of L is close to zero. As a result, protein CI is synthesized at a constant protein synthesis rate αC, and it degrades based on the second term of the equation (γC is the protein degradation constant). As the culture grows and lacI is induced, the concentration of L increases. The increase of L in the denominator of the first term results in repression of cI.

The parameters of the toggle switch, and their values are summarized in Table 3.2.

[Figure 3.3 schematic: sensor plasmid (luxI, luxR, Plux, lacI) and genetic controller plasmid (cI, Ptrc, PL*, lacI).]

Figure 3.3: Design of the genetic controller plasmid and the link to the sensor plasmid. Initially, the sensor plasmid does not express the three genes, and CI protein is present. The figure shows the induction of the quorum sensing and lacI, which in turn represses gene cI.

Figure 3.3 shows the genetic toggle switch (controller) and how it is linked to quorum sensing. Gene lacI in the sensor plasmid is induced by the AHL/LuxR complex (second term in Equation 3.4). When the complex concentration is high, lacI is fully induced and protein LacI is produced from the sensor plasmid at constant rate αL2. In addition, induction of lacI from the controller plasmid results in production of the LacI protein at a constant rate of αL1. Gene cI is initially expressed at a constant rate αC, and eventually drops to zero when lacI is induced.

The set of five differential equations introduced here models the dynamics of the genetic circuit, which is discussed in the next section. This model is the starting point of our design and analysis of the dynamic gene expression method.

Table 3.2: Parameter values of the toggle switch

| Parameter | Description | Value | Reference |
|---|---|---|---|
| αL1, αL2 | LacI production rate constant | 60 µM · h−1 | Basu et al. (2005) |
| αC | CI production rate constant | 60 µM · h−1 | Basu et al. (2005) |
| βC | CI repression coefficient | 0.008 µM | Basu et al. (2005) |
| βL | LacI repression coefficient | 0.8 µM | Basu et al. (2005) |
| θR | LuxR/AHL activation coefficient | 10^−4 µM | Basu et al. (2005) |
| γC | CI degradation rate constant | 4.152 h−1 | Basu et al. (2005) |
| γL | LacI degradation rate constant | 1.386 h−1 | Basu et al. (2005) |
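The two states of the toggle switch described by Equations 3.4 and 3.5 can be illustrated with a short sketch. It is an assumption-laden simplification rather than the thesis implementation: the complex level R is held constant instead of being driven by the sensor, explicit Euler steps replace the MATLAB solver, the Table 3.2 values are read in µM and h exactly as listed (the thesis simulations may use a different scaling), and the names `toggle_rhs` and `settle` are hypothetical.

```python
# Nominal values from Table 3.2, read as uM and h exactly as listed.
ALPHA_L1 = 60.0
ALPHA_L2 = 60.0
ALPHA_C = 60.0
BETA_C = 0.008
BETA_L = 0.8
THETA_R = 1e-4
GAMMA_C = 4.152
GAMMA_L = 1.386


def toggle_rhs(lac, ci, r):
    """Right-hand sides of Equations 3.4 and 3.5 for a given complex level r."""
    d_lac = (
        ALPHA_L1 / (1.0 + (ci / BETA_C) ** 2)
        + ALPHA_L2 * r**3 / (THETA_R + r**3)
        - GAMMA_L * lac
    )
    d_ci = ALPHA_C / (1.0 + (lac / BETA_L) ** 2) - GAMMA_C * ci
    return d_lac, d_ci


def settle(r, lac0, ci0, t_end=12.0, h=1e-3):
    """Integrate with explicit Euler steps while r is held constant."""
    lac, ci = lac0, ci0
    for _ in range(round(t_end / h)):
        d_lac, d_ci = toggle_rhs(lac, ci, r)
        lac += h * d_lac
        ci += h * d_ci
    return lac, ci


# CI level implied by Equation 3.5 at steady state when LacI is absent.
CI_ON = ALPHA_C / GAMMA_C

# No complex present: cI stays on and LacI stays negligible.
lac_low, ci_high = settle(r=0.0, lac0=0.0, ci0=CI_ON)

# Complex well above the activation coefficient: lacI is induced, cI is repressed.
lac_high, ci_low = settle(r=1.0, lac0=0.0, ci0=CI_ON)
```

Holding R fixed isolates the controller from the sensor, so the sketch shows only which state the switch settles into, not when the switch occurs during a batch.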

3.1.3 Dynamics of the genetic circuit

Here, we present the simulation of the genetic circuit for nominal values found in the literature (the values in Tables 3.1 and 3.2). The value of growth rate used here (0.4 h−1) is obtained from the FBA solution for the serine strain design. This strain design will be discussed in the next section. A constant growth rate was used here, since the purpose of the simulation is to demonstrate the dynamics of the concentrations involved in the genetic circuit. In the fully integrated model of the genetic circuit with bacterial metabolism, growth rate is calculated in the metabolic model. The initial conditions are 0.1 g/L for the biomass; zero for AHL, the complex and the protein LacI; and 0.029 mM for the protein CI, which is the steady-state value when gene cI is turned on. The integration of the genetic circuit model with the metabolic model is discussed in section 3.1.5.

[Figure 3.4 panels, each plotted against time (0 to 12 h): A, biomass (g/L); B, AHL (nM); C, LuxR/AHL complex (nM); D, LacI (mM); E, CI (mM).]

Figure 3.4: Time profiles of the genetic circuit variables (A: biomass; B: AHL; C: LuxR/AHL complex; D: LacI protein; and E: CI protein). The simulations support the conclusion of Gardner et al. (2000) that the dynamics of switching a gene off (E) are notably faster than the dynamics of switching a gene on (D). This is to our advantage, since a sharp ON-OFF switch is required by Gadkar et al. (2005a) for optimization. The MATLAB code to generate the figure is given in Appendix B.1.1.

The time profiles of the five concentrations in the genetic circuit are shown in Figure 3.4. Biomass and the signalling molecule AHL increase exponentially (Figure 3.4.A and 3.4.B). The AHL concentration is expected to grow exponentially since its production rate is proportional to biomass concentration. However, the AHL/LuxR complex is produced more slowly than the AHL, since AHL has to build up before it binds to the LuxR protein. The complex concentration increases noticeably after approximately 3-4 h, and at a much sharper rate after 6 h (Figure 3.4.C).

The toggle switch dynamics are shown in Figure 3.4.D and 3.4.E. The protein LacI is induced after approximately 2 h, and its concentration slowly reaches its final value after 10 h. In contrast, the repression of gene cI starts after approximately 2 h, and the CI protein concentration drops rapidly to zero within about an hour. Therefore, the model demonstrates that the dynamics of switching on a gene are significantly slower than the dynamics of switching a gene off. This result is consistent with the experimental characterization of the original toggle switch. Gardner et al. (2000) observed that switching from high to low expression took approximately 0.5 h, and that switching from low to high expression levels required almost 5 h. The sharp ON-OFF characteristic response of the toggle switch is a highly desirable feature for the dynamic control we want to apply.

To conclude, the CI protein concentration follows a time profile that resembles the optimal ON-OFF profile suggested by Gadkar et al. (2005b). Thus, the genetic circuit can be utilized to apply the dynamic control of metabolic genes contributing to growth. In the next section, we discuss the strain design for serine production. The genetic circuit will then be introduced to this strain to dynamically manipulate certain genes.

3.1.4 Strain design for serine production

Here, we design an aerobic serine-producing E. coli strain using the EMILiO algorithm (Yang et al., 2011). The EMILiO algorithm is used with a minimum growth rate constraint of 0.4 h−1. The strain design involves a total of ten modifications: three gene deletions and seven fine-tuned fluxes. The gene deletions remove the reactions of acetaldehyde dehydrogenase (ACALD), L-serine dehydrogenase (LSERDHr) and L-serine deaminase (SERDL). The fine-tuned fluxes are phosphoglycerate dehydrogenase (PGCD), phosphotransacetylase (PTAr), acetyl-CoA synthetase (ACS), methylenetetrahydrofolate dehydrogenase (MTHFD), pyruvate dehydrogenase (PDH), pyruvate formate lyase (PFL) and tryptophanase (TRPAS2). The values of the fine-tuned fluxes, calculated by EMILiO, are given in Appendix A.

The production envelope of the modifications mentioned above is shown in Figure 3.5.

The growth rate and serine flux obtained for the baseline strain with fine-tuned fluxes (A) and the mutant (B) are shown in Table 3.3. The higher serine production is associated with lower growth rate (i.e., 0.4 compared to 0.75 h−1). This makes the dynamic control of gene expression a favourable method for optimizing productivity.

[Figure 3.5 panels A and B: serine flux (mmol/gDW/hr) versus growth rate (hr−1).]

Figure 3.5: Production envelope of E. coli strain design predicted by EMILiO. The baseline strain includes 7 fine-tuned fluxes (A). The serine-producing strain includes 3 knockouts and 7 fine-tuned fluxes (B). The MATLAB code to generate the figure is given in Appendix B.1.2.

| Strain | Growth rate (h−1) | Serine flux (mmol/gDW/h) |
|---|---|---|
| 7 fixed fluxes | 0.75 | 2.7 |
| 3 knockouts + 7 fixed fluxes | 0.4 | 12.9 |

Table 3.3: Growth rate and serine flux under aerobic conditions for a glucose uptake rate of 10 mmol/gDW/h from flux balance analysis.

3.1.5 Coupling the genetic circuit to the serine-producing strain

In our earlier work, we proposed an initial, integrated design of a modular genetic circuit, similar to that of Kobayashi et al. (2004), that coupled the quorum sensing and the toggle switch module with bacterial metabolism (Anesiadis et al., 2008). The genetic circuit consists of the sensor and the genetic controller plasmids discussed in sections 3.1.1 and 3.1.2 (equations 3.1 to 3.5). In the integrated model these equations become equations 3.10 and 3.12 to 3.15.

The metabolism of E. coli is modelled using the dynamic Flux Balance Analysis (dFBA) framework (Mahadevan et al., 2002). Equations 3.6 to 3.8 model the Flux Balance Analysis (FBA). The FBA modelling approach represents the balance of metabolic fluxes or reactions for every metabolite, assuming that the network is at steady state. The outputs of the FBA are the growth rate vgrowth and the product fluxes vproduct. The dFBA involves solving the FBA to obtain the outputs and integrating equations 3.10 and 3.11 over a short interval. The FBA and integration are repeated over the course of the batch to obtain the time profiles of the biomass, glucose, and products.
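The alternation between an FBA solve and a short integration step can be sketched as a loop. The sketch below is deliberately a toy: `solve_fba_stub` is a hypothetical stand-in for the linear program of Equations 3.6 to 3.9 and simply returns the Table 3.3 fluxes of the static mutant while glucose remains, the initial biomass passed by the caller is an assumption (it is not stated for the dFBA runs), and the last interval is shortened with a linear approximation. It is not intended to reproduce the thesis results.

```python
import math

T_S = 0.05  # interval length Ts in hours, as stated in the text


def solve_fba_stub(glucose_mM):
    """Hypothetical stand-in for the LP of Equations 3.6-3.9.

    Returns (growth rate, serine flux, glucose uptake) for the static mutant,
    using the Table 3.3 values for as long as glucose is available.
    """
    if glucose_mM <= 0.0:
        return 0.0, 0.0, 0.0
    return 0.4, 12.9, 10.0


def run_dfba(x0, glucose0, t_max=24.0):
    """Alternate 'FBA' solves with integration of Equations 3.10-3.11 over Ts."""
    t, x, serine, glucose = 0.0, x0, 0.0, glucose0
    while t < t_max:
        mu, v_ser, v_glc = solve_fba_stub(glucose)
        if mu == 0.0:
            break
        # With fluxes held constant over the interval, biomass grows
        # exponentially and the time integral of X has a closed form.
        x_next = x * math.exp(mu * T_S)
        biomass_hours = (x_next - x) / mu
        needed = v_glc * biomass_hours
        if needed >= glucose:
            # Final interval: glucose runs out part-way through
            # (the fraction of the interval is a linear approximation).
            fraction = glucose / needed
            serine += fraction * v_ser * biomass_hours
            x += fraction * (x_next - x)
            t += fraction * T_S
            glucose = 0.0
            break
        serine += v_ser * biomass_hours
        glucose -= needed
        x = x_next
        t += T_S
    return t, x, serine, glucose
```

For example, `run_dfba(x0=0.01, glucose0=20.0)` runs a batch from an assumed 0.01 g/L inoculum. In the full model, the stub would be replaced by an LP solve on the genome-scale network, with the manipulated fluxes constrained by the current CI concentration.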

The three gene deletions of the serine strain design are candidates for dynamic gene expression. Of the three deletions, only fluxes ACALD and LSERDHr have a significant impact on growth and are therefore placed under the control of the genetic circuit. The flux of SERDL has little impact on growth and therefore the associated gene is deleted. Genes sda (L-serine deaminase), ydfG (3-hydroxy acid dehydrogenase) and mhpF (acetaldehyde dehydrogenase) control the fluxes of ACALD and LSERDHr, and are therefore expressed downstream of gene cI. Thus, ACALD and LSERDHr flux profiles are expected to follow the same profile as protein CI (Figure 3.4.E). The coupling of bacterial metabolism to the genetic circuit is captured in the third constraint of the FBA (Equation 3.9). In this equation, the manipulated fluxes vC (i.e., ACALD and LSERDHr) are proportional to the concentration of CI protein (C), since they are expressed from the same promoter.

The dFBA parameters include: S the stoichiometric matrix of the reactions, vj the flux j, vproduct the vector of product fluxes, vgrowth the growth rate, and vmin and vmax the vectors of lower and upper limits of the fluxes. The total number of metabolites is M, the total number of fluxes is N and the total number of manipulated fluxes is U. Also, X is the biomass concentration and P is the product concentration. The genetic circuit variables A, R, L and C are the AHL, LuxR-AHL complex, LacI and CI concentrations, respectively. Parameter vA is the AHL production rate constant, αL1, αL2 and αC are the LacI and CI protein synthesis rate constants, βL and βC are the LacI and CI repression coefficients, γA, γR, γL and γC are the AHL, complex, LacI and CI protein decay constants, θR is the complex activation coefficient, ρR is the complex dimerization constant and LuxR is the LuxR protein concentration. Finally, the batch time (t0 to tf) is divided into f intervals of length Ts equal to 0.05 h.

Although the majority of quorum sensing systems show no cooperativity (e.g., the LuxR-AHL complex has a Hill coefficient of 1), we used a Hill coefficient of 3 in Equation 3.14. This was done to achieve sharper dynamics, similar to the dynamics reported by Gardner et al. (2000). Highly cooperative protein-DNA interactions have been reported in the quorum sensing regulator CepR of Burkholderia cenocepacia, and therefore a Hill coefficient of 3 is realistic in an experimental context (Weingart et al., 2005).
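The effect of the Hill coefficient on the sharpness of the induction term can be quantified with a small sketch. It uses the activation form R^n / (θR + R^n) from the second term of Equation 3.14; the function names and the 10%-to-90% measure of sharpness are illustrative choices, not part of the thesis.

```python
def hill_activation(r, theta, n):
    """Activation term r^n / (theta + r^n), as in the second term of Equation 3.14."""
    return r**n / (theta + r**n)


def level_for_activation(fraction, theta, n):
    """Complex concentration at which the activation term equals `fraction`."""
    return (theta * fraction / (1.0 - fraction)) ** (1.0 / n)


def switching_span(theta, n):
    """Fold change in r needed to go from 10% to 90% activation."""
    low = level_for_activation(0.1, theta, n)
    high = level_for_activation(0.9, theta, n)
    return high / low
```

With this measure the span equals 81^(1/n) regardless of θR: an 81-fold change in complex concentration for n = 1, but only about a 4.3-fold change for n = 3, which is why the higher coefficient gives a sharper ON-OFF response.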

The latest genome-scale model of E. coli metabolism, iJO1366, was used in all simulations (Orth et al., 2011). The stoichiometric matrix S of the model includes 2583 reactions and 1805 metabolites. The maximum glucose and oxygen uptake rates used in the simulations are 10 and 20 mmol/gDW/h, respectively. These values are typically used in FBA simulations for aerobic growth (Orth et al., 2011). The integration was performed using the ode23s MATLAB solver (The Mathworks, Inc., Natick, MA) over 12 h of batch time to ensure full consumption of the glucose for the slowest growing case, which is the knockout static strategy. The LP problem was solved in MATLAB, using CPLEX 11.2 with the CPLEXINT MATLAB interface. All simulations were run on a Red Hat Enterprise Linux Server 5.8 with 8 hexa-core AMD Opteron processors and 256 GB of RAM.

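$$\text{maximize} \quad v^{\text{growth}}(t_i) \tag{3.6}$$

$$\text{subject to:} \quad \sum_{j=1}^{N} S_{kj}\, v_j(t_i) = 0 \quad \text{for } k = 1, \ldots, M \tag{3.7}$$

$$v_j^{\min} \le v_j(t_i) \le v_j^{\max} \quad \text{for } j = 1, \ldots, N \tag{3.8}$$

$$v_{c_j}(t_i) = \gamma_{c_j} \cdot C(t_i) \quad \text{for } c_j = 1, \ldots, U \tag{3.9}$$

$$\frac{dX}{dt} = v^{\text{growth}}(t_i) \cdot X \tag{3.10}$$

$$\frac{dP}{dt} = v^{\text{product}}(t_i) \cdot X \tag{3.11}$$

$$\frac{dA}{dt} = v_A X - \gamma_A A \tag{3.12}$$

$$\frac{dR}{dt} = \rho_R [\text{LuxR}]^2 A^2 - \gamma_R R \tag{3.13}$$

$$\frac{dL}{dt} = \frac{\alpha_{L1}}{1 + \left(\frac{C}{\beta_C}\right)^2} + \frac{\alpha_{L2} R^3}{\theta_R + R^3} - \gamma_L L \tag{3.14}$$

$$\frac{dC}{dt} = \frac{\alpha_C}{1 + \left(\frac{L}{\beta_L}\right)^2} - \gamma_C C \tag{3.15}$$

$$t_i = t_0 + i \cdot T_s, \quad \text{for } i = 0, \ldots, f$$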

3.2 Results

3.2.1 Static strategy for serine production

The static strategy is the baseline for our comparison with the dynamic strategy. Essentially, the static strategy refers to the mutant strain (3 deletions) with the fine-tuned fluxes fixed at their optimal values (7 fine-tuned fluxes). The term static is used since the genetic modifications do not change with time. In the production envelope, the characteristics of this strategy are shown in Figure 3.5.B. The mutant strain is expected to have the longest batch time, since the growth rate is 0.4 h−1. The batch is simulated using the dynamic flux balance formulation (dFBA) with 20 mM of initial glucose, the three genes knocked out and the seven fluxes fixed at their optimal values. The dynamic profile of the static strategy (Figure 3.6) results in a serine titer of 26.1 mM and batch time of 11.2 h. Based on these values, we can estimate the values for the three objectives: productivity, yield and titer in Table 3.4, where we later compare them with the results of the dynamic strategy.

3.2.2 Dynamic strategy for serine production

In the dynamic strategy the genetic circuit is coupled to bacterial metabolism with the model described in section 3.1.5. In this initial implementation, we used the nominal parameter values found in the literature (Tables 3.1 and 3.2 for the quorum sensing and the toggle switch, respectively). The manipulated genes and cI are initially in the ON-state, resulting in high growth rate and consequently production and accumulation of AHL. When the concentration of AHL reaches a critical threshold the LuxR-AHL dimer formed initiates transcription of gene lacI. The production of the protein repressor from lacI leads to the repression of the manipulated genes and cI. The manipulated fluxes are ACALD and LSERDHr, since they affect growth rate significantly.

[Figure 3.6 panels, each plotted against time (0 to 15 hr): A, biomass (g/L); B, growth rate (hr−1); C, serine (mM); D, glucose (mM).]

Figure 3.6: Static strategy profiles of biomass (A), growth rate (B), serine (C) and glucose (D) concentration of the mutant with 20 mM initial glucose concentration. The MATLAB code to generate the figure is given in Appendix B.1.3.

Figure 3.7 shows the time profile of the variables of interest associated with the dynamic strategy. Fluxes ACALD and LSERDHr are at their wild-type levels initially, start decreasing after approximately 4 h, and are fully turned off approximately one hour later (panel C). Biomass increases faster than in the static strategy in the first 4 h, since the growth rate is 0.75 h−1 (panel A). No serine is produced in the first 4 h (panel B). Once ACALD and LSERDHr drop to zero, the growth rate drops to 0.4 h−1, and serine production is initiated. Glucose is consumed faster than in the static strategy in the first 4 h, and is fully consumed in 7.8 h (panel D). In comparison, the static strategy took 11.2 h to consume the same amount of glucose.

3.2.3 Comparison of static and dynamic strategy

[Figure 3.7 panels, each plotted against time (0 to 10 hr): A, biomass (g/L) and growth rate (hr−1); B, serine (mM); C, manipulated fluxes ACALD and LSERDH (mmol/gDW/hr); D, glucose (mM).]

Figure 3.7: Dynamic profiles of biomass and growth rate (A), serine (B) and glucose (D) concentration. Manipulated fluxes ACALD and LSERDH (C) are under the control of the toggle switch. Nominal values of the parameters were used here. The MATLAB code to generate the figure is given in Appendix B.1.4.

Here, we compare some important bioengineering objectives for the static and the initial dynamic strategy implementation. The reaction yield is defined as the amount of product over the amount of reactant consumed. The maximum theoretical yield is the maximum amount of product that could be produced if 100% of the reactant was converted into it. Values close to the maximum are required to minimize downstream separation cost and waste byproducts. However, yield is not the only important factor of process economics, since it does not consider the time factor. Accordingly, productivity is defined as the amount of product produced over time. Productivity and yield are calculated at the end of the batch using the following equations:

$$\text{Productivity} = \frac{\text{serine produced}}{\text{batch time}} \; [=] \; \frac{\text{mmol serine}}{\text{hr}} \tag{3.16}$$

$$\text{Yield} = \frac{\text{serine produced}}{\text{glucose consumed}} \; [=] \; \frac{\text{mmol serine}}{\text{mmol glucose}} \tag{3.17}$$

Batch time is estimated as the time at which glucose is fully consumed. Serine concentration at the end of the batch is defined as titer. The values of the bioengineering objectives are shown below in Table 3.4.

| Objective | Static strategy | Dynamic strategy |
|---|---|---|
| Productivity (mmol serine/L/hr) | 2.33 | **2.99** |
| Yield (mmol ser./mmol glc.) | **1.30** | 1.14 |
| Serine titer (mM) | **26.1** | 22.7 |
| Batch time (hr) | 11.2 | **7.8** |

Table 3.4: Comparison of objective function values for static and dynamic strategies. The dynamic strategy is based on the nominal parameter values of the genetic circuit. The best values are highlighted in bold.

Even without any fine-tuning of the parameters associated with the genetic circuit, we notice a significant increase in productivity of approximately 28.3% (2.99 vs. 2.33 mmol serine/L/hr). This is a first proof of principle that dynamic gene expression can improve productivity. Activation of fluxes ACALD and LSERDHr for 4 hr indeed enabled rapid generation of biomass and faster overall consumption of glucose than in the static strategy.

Nevertheless, the increase in productivity comes at the expense of lower yield and titer compared to the static strategy. Yield and titer are lower in the dynamic strategy because in the first phase carbon is invested towards biomass. Finally, the batch time is significantly shorter in the case of the dynamic strategy, due to faster growth and glucose consumption.
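The objectives in Equations 3.16 and 3.17 and the relative productivity gain can be recomputed from the reported values with a few lines. This is an illustrative sketch with hypothetical function names; it works on a volumetric basis (titer in mM), so productivity comes out in mmol serine/L/hr as in Table 3.4, and it assumes that all 20 mM of initial glucose is consumed by the end of the batch.

```python
def productivity(titer_mM, batch_time_h):
    """Equation 3.16 on a volumetric basis: serine produced per unit batch time."""
    return titer_mM / batch_time_h


def product_yield(titer_mM, glucose_consumed_mM):
    """Equation 3.17: serine produced per glucose consumed."""
    return titer_mM / glucose_consumed_mM


def relative_change(new, old):
    """Fractional change of `new` with respect to `old`."""
    return (new - old) / old


GLUCOSE_CONSUMED = 20.0  # mM, assuming the initial glucose is fully consumed

static_productivity = productivity(26.1, 11.2)
static_yield = product_yield(26.1, GLUCOSE_CONSUMED)
dynamic_yield = product_yield(22.7, GLUCOSE_CONSUMED)
productivity_gain = relative_change(2.99, 2.33)
```

The static productivity and both yields agree with Table 3.4 to the reported precision, and the gain evaluates to about 28.3%. Recomputing the dynamic productivity from the rounded titer and batch time (22.7 mM, 7.8 hr) gives about 2.91 rather than the tabulated 2.99, so small differences from the table are to be expected when working from the rounded entries.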

3.3 Conclusions

The initial implementation of the dynamic strategy shows tremendous potential when productivity is the bottleneck of a bioprocess as a result of growth impairment. Although metabolic engineering focuses on increasing the yield of a product, many strain designs are not economically feasible because of low productivity. Thus, an experimental technique that allows bioengineers to modulate productivity and yield values is very valuable. Improving the productivity of a strain at the expense of yield may make an infeasible process feasible.

In this chapter, we presented the preliminary model-based design, and the simulation for serine production. The genetic circuit was based on nominal parameter values obtained from the literature. In the next chapter, we explore the effect of the genetic circuit parameters on the bioengineering objectives, as we attempt to further optimize the design. In particular, we quantify all the relevant variables (productivity, yield, titer, switching and batch times) with respect to the most significant parameters. The calculations provide a quantitative understanding of the trade-off between productivity and yield.

Chapter 4

Mathematical analysis of the dynamic strategy

4.1 Introduction

Mathematical modelling and model analysis play a significant role in the design and engineering of synthetic biology elements. The complexity emerging from the hierarchical structure of biochemical networks, the connectivity and interactions between the components, and the nonlinear and stochastic nature of biochemical processes can all be elucidated to a certain degree with mathematical models (Biliouris et al., 2011; Chandran et al., 2008; Haseltine and Arnold, 2007; Van Riel, 2006; Zheng and Sriram, 2010). Computational models can provide us with in silico experiments to explore different behaviours over a range of conditions and identify correlations between the parameters. Nevertheless, model uncertainty is an inherent attribute of mathematical models and therefore analyzing the impact of this uncertainty is crucial for the design of meaningful experiments (Kiparissides et al., 2009, 2011a; Mišković and Hatzimanikatis, 2011; Saltelli et al., 2008).

Phase plane analysis is a method used to study the behaviour of nonlinear dynamic systems modelled with differential equations. The construction of the genetic toggle switch was driven by the phase plane analysis of the dynamic model developed by Gardner et al. (2000) (section 2.4.2). This analysis, however, cannot treat the dFBA model developed in Chapter 3. Instead, we performed the phase plane analysis of the model developed by Gardner et al. (2000) for the toggle switch, with an additional term that models the quorum sensing (Appendix D). The results of the analysis are only qualitative and can be useful in troubleshooting the initial design.

In contrast, global sensitivity analysis (GSA) offers more quantitative results. GSA methods are typically applied first in the model building cycle presented in Kiparissides et al. (2011a) to quantify the significance of the parameters on an output of the model (Saltelli, 2002). Sensitivity analysis has provided in-depth insights in the design of experiments for the optimization of genetic circuits (Feng et al., 2004; Miller et al., 2010), RNA devices (Carothers et al., 2011), signalling pathways (Chu et al., 2007; Zheng and Rundell, 2006), bioremediation applications (Zhao et al., 2010, 2011), stem cell differentiation (Kiparissides et al., 2011b) and antibody production (Ho et al., 2012; Kontoravdi et al., 2005).

In Chapter 3, we presented the initial circuit design for a serine-producing strain.

The focus in Chapter 4 is the effect of the genetic circuit parameters on the important process variables such as productivity, yield, titer and batch time, along with determining a recommended operating range for the key design parameters.

First, we perform the analysis of a theoretical instantaneous ON-OFF switch (section 4.3.1). Although not realistic, this ideal switch allows us to explore the full potential of the dynamic strategy, and estimate the maximum theoretical increase in productivity. The theoretical analysis also gives us a quantitative understanding of the trade-off between productivity and yield.

After coupling the genetic circuit to bacterial metabolism in Chapter 3, global sensitivity analysis is applied to quantify the effect of circuit parameters on serine concentration (section 4.3.2). Based on the GSA results, we then investigate how the most important parameters impact bioprocess objectives (such as productivity, yield and titer), batch time and switching time of the toggle. This analysis allows us to identify the parameter space that satisfies targets, along with gaining a better understanding of the bistability and the switching time of the integrated circuit (sections 4.3.3 to 4.3.8). This work sets the stage for the experimental implementation of the dynamic metabolic engineering strategy.

4.2 Methods

4.2.1 Global sensitivity analysis

The dynamic model described in Chapter 3 involves a large number of parameters (i.e., 13) and highly nonlinear terms interacting with each other. With the abundance of integration routines in MATLAB and linear programming solvers that can be coupled to MATLAB, we can perform in silico experiments in a quick and efficient way to study model behaviour under different conditions. Statistical analysis of the model evaluations can give us insights on optimal experimental design and minimize the cost and effort of parameter estimation and optimization of the process. Global sensitivity analysis (GSA) is the most suitable and systematic approach for such model analysis. In contrast to local sensitivity methods, all parameters are varied simultaneously and as a result the outcome is independent of the nominal point, and perhaps more importantly, interactions between the parameters are revealed. The main disadvantage of these methods is the computational time required for the evaluation of the model for perturbed parameter values.

Here we used the Sobol’ global sensitivity method for parameter ranking (Saltelli, 2002; Sobol, 2001). This is a variance-based Monte Carlo method that explores the parameter space by changing all parameters simultaneously, as opposed to local methods, and generates estimates of the first-order, interaction and total sensitivity indices. The model parameters are assumed to be random variables and initially a random sample of the parameters is generated within a predetermined range for each one of them. The Sobol’ sampling sequence is a quasi-random sequence, which means that the samples are more uniformly distributed than those of pseudo-random generators, resulting in faster convergence of the estimates. Then the model is evaluated for every parameter set in the sample and the resulting output is also a random variable. In the last step of the analysis, the variance of the model output is broken down based on the analysis of variance (ANOVA) decomposition to allocate the output variation among the parameters.
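The difference between quasi-random and pseudo-random sampling can be illustrated in one dimension. The sketch below does not implement the Sobol’ sequence itself: it swaps in the simpler van der Corput (Halton) construction, which is also a low-discrepancy sequence, and compares how evenly each set of points covers the unit interval. The function names and the gap-based coverage measure are illustrative choices.

```python
import random


def van_der_corput(index, base):
    """Radical inverse of a positive integer index in the given base."""
    value, denom = 0.0, 1.0
    while index > 0:
        index, digit = divmod(index, base)
        denom *= base
        value += digit / denom
    return value


def halton_point(index, bases=(2, 3)):
    """Multi-dimensional Halton point built from one van der Corput sequence per base."""
    return tuple(van_der_corput(index, b) for b in bases)


def max_gap(values):
    """Largest gap between neighbouring points on [0, 1], including both ends."""
    pts = [0.0] + sorted(values) + [1.0]
    return max(b - a for a, b in zip(pts, pts[1:]))


def compare_coverage(n=127, seed=0):
    """Return the largest gap for n quasi-random and n pseudo-random points."""
    rng = random.Random(seed)
    quasi = [van_der_corput(i, 2) for i in range(1, n + 1)]
    pseudo = [rng.random() for _ in range(n)]
    return max_gap(quasi), max_gap(pseudo)
```

With 127 points, the base-2 sequence fills the interval on an even grid of spacing 1/128, whereas pseudo-random points leave larger holes. More even coverage of the parameter ranges is what allows the sensitivity estimates to converge with fewer model evaluations.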

Sobol’s algorithm calculates two measures of parameter sensitivity on the output: the first-order or individual sensitivity index Si, and the total sensitivity index STi. The individual index Si represents the independent effect of parameter i on the model output. The total sensitivity index, STi, is the sum of the individual effect of parameter i and the interaction between parameter i and all other parameters. This is a measure of the variance in the output created by parameter i. Thus, the higher the total sensitivity index of a parameter, the larger the share of the output variance attributable to that parameter. Typically, only a few parameters generate most of the output variance, whereas most of the parameters have negligible effect on the output variance. A common threshold STi value used in the literature is 0.1. Parameters with a total sensitivity index, STi, less than 0.1 have negligible effect on the output variance and can be fixed at their nominal values. Parameters with STi indices higher than 0.1 have a significant impact on the output variance. Also, the interaction effects Sint,i can be quantified if we consider that the total sensitivity index is the sum of the individual and the interaction effects:

$$S_{Ti} = S_i + S_{int,i} \tag{4.1}$$

Following Saltelli’s implementation (the methodology is described in detail in Appendix C), two random matrices of the parameter space, A and B, are generated using the MATLAB routine "sobolset.m" (Saltelli, 2002). Matrices A and B have dimensions K × n, where K is the sample size and n is the number of parameters varied. From matrix B, n matrices $C_i$ (i = 1, ..., n) are generated with all columns of B except the i-th, which is taken from matrix A. The model is then evaluated for all the parameter sets in matrices A, B and $C_i$ (i.e., K · (n + 2) times), and the outputs of interest (i.e., serine concentration) $y_A$, $y_B$ and $y_{C_i}$ are used to calculate the first-order or individual ($S_i$) and total ($S_{Ti}$) sensitivity estimates based on the following equations:

$$S_i = \frac{y_A \cdot y_{C_i} - \dfrac{1}{K}\sum_{j=1}^{K} y_A^{j}\, y_B^{j}}{y_A \cdot y_A - \left(\dfrac{1}{K}\sum_{j=1}^{K} y_A^{j}\right)^{2}} \qquad (4.2)$$

$$S_{Ti} = 1 - \frac{y_B \cdot y_{C_i} - \left(\dfrac{1}{K}\sum_{j=1}^{K} y_A^{j}\right)^{2}}{y_B \cdot y_B - \left(\dfrac{1}{K}\sum_{j=1}^{K} y_B^{j}\right)^{2}} \qquad (4.3)$$

Here the dot products are understood as averages over the K samples (e.g., $y_A \cdot y_{C_i} = \frac{1}{K}\sum_{j=1}^{K} y_A^{j}\, y_{C_i}^{j}$), so that both terms of each numerator and denominator are on the same scale. To elucidate the sensitivity index equations, one should keep in mind that the product $y_A \cdot y_{C_i}$ combines the output of matrix A with the output of matrix $C_i$, which share only the column of parameter $p_i$; that is, all parameters but $p_i$ are resampled. Based on this observation, the Saltelli estimates of the sensitivity indices imply that:

• If parameter $p_i$ is non-influential, then high and low values of $y_A$ and $y_{C_i}$ are randomly associated and the first-order sensitivity is small.

• If parameter $p_i$ is influential, then high values of $y_A$ are multiplied with high values of $y_{C_i}$ and the first-order sensitivity is high.
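The estimators in Equations 4.2 and 4.3 can be illustrated with a minimal sketch. It is not the MATLAB implementation used in this work: it assumes Python's pseudo-random generator in place of the Sobol’ quasi-random sequence, samples on the unit hypercube, and uses a hypothetical zero-mean toy model (`toy_model`) whose analytic indices are known. The function name `saltelli_indices` is likewise an illustrative choice.

```python
import random

def saltelli_indices(model, n_params, K, seed=0):
    """Estimate first-order (S_i) and total (S_Ti) indices, following Eqs. 4.2-4.3.

    Assumption: pseudo-random sampling on [0, 1]^n replaces the Sobol' sequence.
    """
    rng = random.Random(seed)
    # Two independent K x n sample matrices.
    A = [[rng.random() for _ in range(n_params)] for _ in range(K)]
    B = [[rng.random() for _ in range(n_params)] for _ in range(K)]
    yA = [model(row) for row in A]
    yB = [model(row) for row in B]

    def mean(v):
        return sum(v) / len(v)

    def avg_prod(u, v):
        # Sample-averaged product (1/K) * sum_j u_j * v_j.
        return sum(a * b for a, b in zip(u, v)) / len(u)

    S, ST = [], []
    for i in range(n_params):
        # C_i: every column from B except column i, which is taken from A.
        Ci = [rb[:i] + [ra[i]] + rb[i + 1:] for ra, rb in zip(A, B)]
        yC = [model(row) for row in Ci]
        S.append((avg_prod(yA, yC) - avg_prod(yA, yB))
                 / (avg_prod(yA, yA) - mean(yA) ** 2))
        ST.append(1.0 - (avg_prod(yB, yC) - mean(yA) ** 2)
                  / (avg_prod(yB, yB) - mean(yB) ** 2))
    return S, ST

def toy_model(p):
    # Hypothetical additive model; the third parameter has no effect.
    # Analytic first-order indices are 0.2, 0.8 and 0 (no interactions).
    return (p[0] - 0.5) + 2.0 * (p[1] - 0.5)

S, ST = saltelli_indices(toy_model, n_params=3, K=20000)
```

Because the toy model is additive, the estimated $S_i$ and $S_{Ti}$ of each parameter should nearly coincide, which is the no-interaction case described in the properties that follow.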

Also, some useful properties of the sensitivity indices for the interpretation of the results are the following:

• If the total sensitivity index $S_{Ti}$ ≈ 0, then parameter $p_i$ is not significant and can be fixed at its nominal or literature value.

• If $S_{Ti}$ ≥ 0.1, then parameter $p_i$ is significant and a proper experiment should be designed to estimate its value. Also, parameter $p_i$ could be a potential candidate for control purposes.

• The sum of all $S_i$ is always ≤ 1. A low value of the sum indicates a strong presence of interactions.

• The sum of all $S_{Ti}$ is always ≥ 1. The two sums take the value of 1 if and only if there are no interactions in the model.

In our model, all parameters of the genetic circuit (n = 13) are varied between 0.1× and 10× their nominal values. The GSA estimates converged at a sample size of about 8000; therefore, K = 8000 was used for the results presented here. The total number of model evaluations in Saltelli’s implementation is K · (n + 2) = 120,000.

The model evaluation step was anticipated to be the most time-consuming. To speed it up, we parallelized the model evaluations over 15 processors (i.e., n + 2), and the results were obtained within 2 days instead of 30 days.
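The bookkeeping behind these numbers can be sketched as follows. The evaluation count and the speed-up follow directly from the figures quoted above; the mapping of unit-interval samples onto the 0.1× to 10× range is an assumption, since the text does not state whether the range was sampled linearly or logarithmically (a log-uniform mapping is shown here). All function names are hypothetical.

```python
import math

def scale_log_uniform(u, nominal, low=0.1, high=10.0):
    """Map u in [0, 1] onto [low*nominal, high*nominal], uniformly in log10.

    Assumption: log-uniform scaling; the sampling scale is not given in the text.
    """
    exponent = math.log10(low) + u * (math.log10(high) - math.log10(low))
    return nominal * 10.0 ** exponent

def total_evaluations(K, n):
    """Model runs needed for matrices A, B and the n matrices C_i."""
    return K * (n + 2)

n, K = 13, 8000
runs = total_evaluations(K, n)          # 8000 * 15 model evaluations
serial_days = 30                        # serial run time quoted in the text
parallel_days = serial_days / (n + 2)   # one processor per matrix
```

With one processor assigned to each of the n + 2 matrices, the 30-day serial estimate divides evenly into the 2 days reported.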

Essentially, GSA quantifies the effect of the parameters, and of the interactions between them, on product formation. Typically, a small number of parameters creates most of the variation in the output, whereas the majority of the parameters make negligible contributions. GSA methods, however, do not provide any information on the direction of the parameters' effect. Further analysis is required to reveal the direction and quantify the effect of the parameters on the outputs of interest. The analysis of the most sensitive parameters is discussed in the next section.

4.2.2 Analysis of the most sensitive parameters

Once we had identified the three parameters with total sensitivity indices higher than 0.1, we evaluated their effect on important variables of the bioprocess. We performed brute-force simulations by varying two and three parameters at a time and plotted the objectives (i.e., productivity, yield and titer) and measures of the process dynamics (i.e., batch time and switching time). Batch time was estimated as the time at which glucose was fully consumed, and switching time as the time at which the manipulated fluxes reached 5% of their initial values.

To visualize the results, we varied the parameters two at a time (three possible pairs) and all three at a time over a 25-point grid in each dimension and plotted the four variables of interest, while all the other parameters of the genetic circuit were kept at their nominal values. Here, we vary the parameters over a range wider than that of the sensitivity analysis in order to examine all possible behaviours and parameter regions. Notice that the two extreme cases of this analysis are the mutant (i.e., when the switch turns off gene expression at the beginning of the batch) and the wild-type (i.e., when the switch does not turn off gene expression during the batch). This brute-force parametric analysis is a partial exploration of all the interaction effects. However, since most of the parameters have negligible sensitivity indices, we are capturing the majority of the interactions.
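A grid sweep of this kind can be sketched as below. This is an illustration rather than the code of Appendix B: the logarithmic spacing and the 0.01 to 100 bounds are assumptions read off the axes of the surface plots, and `model`, `nominal` and the parameter names are hypothetical placeholders for the integrated circuit model.

```python
import itertools
import math

def log_grid(low, high, points=25):
    """Return `points` values evenly spaced in log10 between low and high."""
    step = (math.log10(high) - math.log10(low)) / (points - 1)
    return [10.0 ** (math.log10(low) + k * step) for k in range(points)]

def sweep(model, nominal, names, low=0.01, high=100.0, points=25):
    """Evaluate `model` on the full grid over `names`; other parameters stay nominal."""
    grids = [log_grid(low, high, points) for _ in names]
    results = []
    for combo in itertools.product(*grids):
        params = dict(nominal)            # copy so the nominal set is untouched
        params.update(zip(names, combo))  # overwrite only the swept parameters
        results.append((combo, model(params)))
    return results

# Hypothetical stand-in for the circuit model and its nominal parameter set.
nominal = {"alpha_C": 1.0, "gamma_C": 1.0, "LuxR": 1.0}
pair_results = sweep(lambda p: p["alpha_C"] * p["gamma_C"], nominal,
                     ["alpha_C", "gamma_C"])
```

Each pair requires 25² = 625 model runs and the three-parameter sweep 25³ = 15,625, which is why the sweep is restricted to the parameters flagged by the sensitivity analysis.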

4.3 Results

4.3.1 Ideal dynamic strategy

In the ideal dynamic strategy, we assume that the fluxes of the genes being knocked out initially have the same values as in the wild-type and then go to zero instantaneously when turned off (inset in Figure 4.1.A). The fine-tuned fluxes remain at their optimal level. The ideal dynamic strategy refers to perfect ON-OFF control, since we assume that the fluxes go to zero instantaneously. In practice, this is not realistic because: i) gene repression is a process that happens on the order of minutes to hours and ii) even after the gene is repressed, the proteins already present continue to catalyse the reaction until they degrade. However, the instantaneous ON-OFF control serves as a basis of comparison for the dynamic strategy. The dynamics of this switch are very important for the optimization of the process and will be studied in the following section.

Here, we define the switching time (tS) as the time at which the switch from the ON to the OFF-state occurs (the switching time should not be confused with the duration or the dynamics of the switch). In the ideal dynamic strategy, we consider the switching time to be the manipulated independent (input) variable. Serine concentration and batch time are the dependent (output) variables, since they are functions of the switching time. These outputs in turn define the values of the three bioprocess objectives, namely productivity, yield and titer, shown in Figure 4.1.A, B and C, respectively. Notice that the two extreme values of the switching time tS give a batch equivalent to the mutant if tS = 0, and a batch equivalent to the wild-type if tS is greater than 6.5 h (Figure 4.1.A). All the lines in Figure 4.1 flatten out after 6.5 h because, if the switch does not occur within the first 6.5 h, the glucose is consumed within 6.5 h and the batch time is 6.5 h. Also, the greater the switching time, the more biomass and the less serine is generated. As a result, the yield and titer of the dynamic strategy are always lower than those of the static strategy (i.e., the mutant or tS = 0 in Figure 4.1.B and C). However, the trade-off between biomass and product leads to a maximum productivity of 3.02 mmol serine/l/h, which is 29.6% higher than the static strategy (2.33 mmol serine/l/h). The increase in productivity comes as a result of the decrease in the batch time, since we generate biomass at a faster rate initially. The maximum in productivity and the associated batch time in the optimum productivity region are emphasized in the boxed areas. Note that, to keep the yield and titer high, the switching time should be set below the optimal value (approximately 4 h): a switching time shorter than 4 h still improves the productivity while keeping yield and titer high.

[Figure 4.1: four panels plotted against switching time (hr): (A) productivity (mmol serine/l/hr), with the mutant and wild-type extremes marked and an inset of flux versus time; (B) yield (mmol ser/mmol gluc); (C) serine titer (mM); (D) batch time (hr).]

Figure 4.1: Productivity (A), yield (B), serine titer (C) and batch time (D) as a function of the switching time. The maximum theoretical productivity is approximately 29.6% higher than the static strategy (i.e., switching time = 0). The inset in panel A refers to a flux profile corresponding to a switching time of 4 h. The MATLAB code to generate the figure is given in Appendix B.2.1.
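The productivity comparison quoted above can be checked with a few lines of arithmetic. The sketch assumes that volumetric productivity is the final titer divided by the batch time, which is consistent with the units (mmol serine/l/h) but is not spelled out in this section; the function names are illustrative.

```python
def productivity(titer_mM, batch_time_h):
    """Volumetric productivity in mmol/l/h, assuming final titer / batch time."""
    return titer_mM / batch_time_h

def percent_gain(dynamic, static):
    """Relative improvement of the dynamic over the static strategy, in percent."""
    return 100.0 * (dynamic - static) / static

# Values quoted in the text: ideal ON-OFF optimum versus static strategy.
gain = percent_gain(3.02, 2.33)
```

Rounded to one decimal place, `gain` reproduces the 29.6% improvement reported for the ideal ON-OFF control.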

4.3.2 Global sensitivity analysis

Here, we use global sensitivity analysis to study the effect of the genetic circuit parameters on the initial flux levels, the switching time, the duration of the switch, and in turn on the bioengineering objectives. The switching time, tS, associated with the dynamic strategy is defined here as the time at which the manipulated fluxes reach 5% of their initial values and is seen to be approximately 5 h in Figure 3.7.
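The 5% criterion for the switching time, together with the batch-time criterion of Section 4.2.2, can be extracted from a simulated trajectory as sketched below. The time series shown is invented purely to exercise the functions, and the tolerance used to decide that glucose is "fully consumed" is an assumption.

```python
def switching_time(times, flux, fraction=0.05):
    """First time at which the flux has dropped to `fraction` of its initial value."""
    threshold = fraction * flux[0]
    for t, f in zip(times, flux):
        if f <= threshold:
            return t
    return None  # the flux never switched off during the batch

def batch_time(times, glucose, tol=1e-6):
    """First time at which glucose is depleted (below the assumed tolerance)."""
    for t, g in zip(times, glucose):
        if g <= tol:
            return t
    return None  # glucose not exhausted within the simulated horizon

# Invented trajectory, for illustration only.
times = [0, 1, 2, 3, 4, 5, 6, 7, 8]
flux = [10.0, 10.0, 10.0, 10.0, 5.0, 1.0, 0.4, 0.1, 0.0]
glucose = [20.0, 18.0, 15.0, 11.0, 8.0, 5.0, 2.0, 0.0, 0.0]
t_s = switching_time(times, flux)
t_b = batch_time(times, glucose)
```

In this toy series the flux first falls below 5% of its initial value at the sixth hour, while glucose runs out one hour later.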

In Figure 4.2, we show the total, interaction and individual sensitivity indices of the most significant parameters of the genetic circuit with respect to serine concentration, averaged over time. Out of 13 parameters, 3 account for more than 90% of the output variance, namely the toggle switch parameters γC and αC and the quorum sensing parameter LuxR (average total sensitivity indices of approximately 0.54, 0.35 and 0.10, respectively). This result suggests that the process can be optimized by focusing on these three parameters, while the rest of the parameters are fixed at their nominal values. The average sensitivities for all parameters of the genetic circuit are shown in Appendix C (Figure S.1).

Interestingly, parameters αL and γL of the genetic circuit have negligible sensitivity indices. In contrast, the toggle switch alone is a symmetric circuit, and previous analysis showed that high production and degradation rates of both genes (i.e., high αC, αL, γC and γL) favour a robust and stable switch (Gardner et al., 2000). Here, analysis of the integrated circuit indicates high sensitivity to the cI component of the toggle (i.e., αC and γC) and to parameter LuxR of the quorum sensing, but insignificant sensitivity to parameters αL and γL. The reason is that the integrated circuit is not symmetric and the quorum sensing induces lacI independently of the toggle switch.

The interaction indices are very significant for all 3 of these key parameters, as they account for an average of 58, 83 and 93% of the total sensitivities, respectively. However, the analysis used here does not estimate the individual interactions between parameters (i.e., second-order, third-order effects etc.) but only the total interaction effects. Also, the global sensitivity analysis does not provide the direction of the parameter effect (e.g., whether an increase or decrease in γC leads to an increase or decrease in the productivity). Therefore, in order to investigate the interaction effects further, we explore the effects of changing two and three parameters at a time in the following sections.
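Equation 4.1 allows the total index of each key parameter to be split into its individual and interaction parts. The sketch below applies the average interaction shares quoted above to the average total indices; this is only an approximation, because an average of ratios is not in general the ratio of the averages, and the exact values should be read from Figure 4.2. The dictionary keys and function name are illustrative.

```python
def split_total(total, interaction_share):
    """Split S_Ti into (individual, interaction) parts using S_Ti = S_i + S_int,i."""
    interaction = total * interaction_share
    return total - interaction, interaction

# (average S_Ti, average interaction share) as quoted in the text.
averages = {
    "gamma_C": (0.54, 0.58),
    "alpha_C": (0.35, 0.83),
    "LuxR": (0.10, 0.93),
}
parts = {name: split_total(st, share) for name, (st, share) in averages.items()}
```

Under this approximation the individual effect of LuxR is close to zero, so nearly all of its influence on serine concentration arises through interactions with other parameters.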

[Figure 4.2: box plots in three panels, (A) parameter γC, (B) parameter αC and (C) parameter LuxR. Vertical axis: sensitivity index value (0 to 0.5); horizontal axis: ST, Inter. and Indiv.]

Figure 4.2: Sensitivity indices of parameters γC (A), αC (B) and LuxR (C) for serine concentration. Total (ST), interaction (Inter.) and individual (Indiv.) indices are shown. The ranges refer to sensitivity values obtained every hour over the batch. Note that the variation of the sensitivity indices across time is not significant. The boxes show the lower quartile, the median and the upper quartile values. The whiskers represent 1.5 times the interquartile range. The MATLAB code to generate the figure is given in Appendix B.2.2. The parameters are: γC, the CI protein degradation rate constant (h⁻¹); αC, the CI protein production rate constant (µM h⁻¹); and LuxR, the protein LuxR concentration (µM).

[Figure 4.3: schematic of the genetic circuit. Labels in the sensor: luxI, luxR, Plux, lacI, LuxR, AHL and [LuxR]. Labels in the genetic controller: sda, ydfG, mhpF, cI, Ptrc, PL*, lacI, αC and γC (degradation).]

Figure 4.3: The genetic circuit consists of the sensor and the genetic controller plasmids (with genes sda, ydfG and mhpF controlling the fluxes of ACALD and LSERDHr in the toggle switch). The most significant parameters as identified by the global sensitivity analysis are highlighted in bold (αC, γC and LuxR concentration).

4.3.3 Effect of αC and γC

Figure 4.4.A shows that the productivity of the dynamic strategy is higher than that of the static strategy (flat surface shown for comparison) over a wide range of the αC-γC parameter space. Productivity exhibits a maximum plateau, which broadens with increasing values of αC and γC. A wide plateau is desirable as it ensures robustness of the design. The value of productivity within the plateau is between 2.95 and 3 mmol serine/l/h, which is approximately 27-29.6% higher than the static strategy (in comparison, the ideal ON-OFF controller has a maximum productivity of 3.02 mmol serine/l/h).

[Figure 4.4: surface plots against αC and γC: (A) productivity (mmol Ser./l/hr); (B) yield (mmol Ser./mmol Gluc.); (C) serine titer (mM); (D) batch time and switching time (hr).]

Figure 4.4: Effect of αC-γC on productivity (A), yield (B), titer (C), batch and switching time (D). The blue point shows the nominal values of parameters αC and γC. The flat surface in (A) shows the productivity of the static strategy. The MATLAB code to generate the figure is given in Appendix B.2.3.

In Figure 4.4.B and 4.4.C, yield and titer are always lower than in the static strategy as a consequence of the initial growth phase. They tend towards the static strategy for high γC and low αC values, since this leads to genes being turned off and cells growing as mutants (OFF-state). At this extreme, productivity also tends towards the value of the static strategy. At the other extreme, gene expression stays in the ON-state throughout the batch for high αC and low γC values, resulting in wild-type cells (ON-state), where the objective values are at their minima.

Figure 4.4.D illustrates the effect of the parameters on the batch and switching time. High γC and low αC values result in zero switching time (i.e., a mono-stable OFF-state switch, since the toggle is always in the OFF-state). In this region, the batch is equivalent to the mutant and the objective values tend to the values of the static strategy. At the other extreme, high αC and low γC values result in the switching time being equal to the batch time (i.e., a mono-stable ON-state switch, since the toggle is in the ON-state throughout the course of the batch). In this region, the batch is equivalent to the wild-type and the objective values are at their minima. For values between the two extremes, there is a third region where the switching time is between 4 and 6.5 h. In this region, the toggle switches between the ON and the OFF-state, resulting in what is referred to as a bistable switch.

The area of bistability increases with increasing values of αC and γC. In most of the bistable region the switching time is approximately 4 h, and it increases rapidly to 6.5 h when γC decreases, whereas αC does not affect the switching time significantly. In support of these conditions for bistability of the toggle, Gardner et al. (2000) and Kobayashi et al. (2004) demonstrated that strong promoters (i.e., high αC) and high degradation rates (i.e., high γC) increase the size of the bistable region. This result was derived from phase-plane analysis of a mechanistic toggle switch model and was also tested in a number of plasmids with different promoter strengths. In addition, protein degradation tags were used in the plasmids to increase protein degradation rates.
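The three regions described for Figure 4.4.D can be told apart from the switching time and the batch time alone. The following sketch shows one way to label a simulated batch; the tolerance and the label strings are assumptions introduced for illustration.

```python
def classify_regime(t_switch, t_batch, tol=1e-3):
    """Label a batch by comparing its switching time with its batch time (hours)."""
    if t_switch <= tol:
        # Genes are off from the start: the batch behaves like the mutant.
        return "mono-stable OFF (mutant-like)"
    if t_switch >= t_batch - tol:
        # Genes stay on until glucose runs out: the batch behaves like the wild-type.
        return "mono-stable ON (wild-type-like)"
    return "bistable (switches during the batch)"

# Hypothetical batches, for illustration only.
labels = [classify_regime(0.0, 12.0),
          classify_regime(4.0, 8.0),
          classify_regime(6.5, 6.5)]
```

Applied over the αC-γC grid, such a labelling would trace out the bistable region whose area grows with increasing αC and γC.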

4.3.4 Effect of αC and LuxR

Similarly, the productivity of the dynamic strategy is higher than that of the static strategy in most of the αC-LuxR parameter space (Figure 4.5.A). The maximum value in the plateau converges to the maximum productivity of the ideal ON-OFF controller (i.e., 3.02 mmol serine/l/h). However, the maximum plateau when varying these two parameters is narrow (in contrast with the αC-γC parameter space). The shape of the plateau suggests that fine-tuning of parameter LuxR is crucial to ensure maximum productivity.

[Figure 4.5: surface plots against αC and LuxR: (A) productivity (mmol Ser./l/hr); (B) yield (mmol Ser./mmol Gluc.); (C) serine titer (mM); (D) batch time and switching time (hr).]

Figure 4.5: Effect of αC-LuxR on productivity (A), yield (B), titer (C), batch and switching time (D). The blue point shows the nominal values of parameters αC and LuxR. The MATLAB code to generate the figure is given in Appendix B.2.4.

Figure 4.5.D reveals how parameter LuxR affects the batch and the switching time. The switching time (lower curve) depends primarily on parameter LuxR. Switching time ranges from zero for high LuxR values (i.e., approaching the static strategy, where the manipulated genes are turned off immediately) to the total batch time for small LuxR values (thus resulting in the other extreme, cells operating in the ON-state throughout the batch). Similar features are observed in the two extreme operating points for yield and titer (Figure 4.5.B and C). The process approaches the static strategy for high LuxR values and, as a result, yield and titer converge to their maximum values.

4.3.5 Effect of γC and LuxR

In Figure 4.6.A, the productivity of the dynamic strategy is again higher than that of the static strategy in most of the γC-LuxR parameter space, and the maximum productivity converges to the value of the ideal ON-OFF controller (3 mmol serine/l/h). The plateau is narrow, as in the αC-LuxR space; in this case, however, both γC and LuxR affect the productivity (in contrast to the previous pair of parameters, where mostly LuxR affected productivity). Productivity is favoured by either high LuxR and low γC, or low LuxR and high γC values.

[Figure 4.6: surface plots against γC and LuxR: (A) productivity (mmol Ser./l/hr); (B) yield (mmol Ser./mmol Gluc.); (C) serine titer (mM); (D) batch time and switching time (hr).]

Figure 4.6: Effect of γC-LuxR on productivity (A), yield (B), titer (C), batch and switching time (D). The blue point shows the nominal values of parameters γC and LuxR. The MATLAB code to generate the figure is given in Appendix B.2.5.

The effect of parameters γC and LuxR on the batch and the switching time is elucidated in Figure 4.6.D (notice that the axis directions are reversed). Here, both γC and LuxR affect the switching time, in contrast with the previous pair of parameters, where αC did not affect the switching time. This implies that the interaction between parameters γC and LuxR is strong. High values of γC and LuxR lead to an instantaneous switch, and the values of the objectives tend towards the static strategy values. At the other extreme, gene expression remains in the ON-state throughout the batch for low values of γC and LuxR, resulting in wild-type cells and minimum objective values.

4.3.6 Summary on the effects of changing two parameters at a time

Considering the objective surfaces in Figure 4.4, values of αC and γC must be chosen to ensure that the productivity lies in the optimal plateau region. To achieve balanced productivity and yield, high γC and low αC over the plateau are desired (i.e., values that correspond to the left side of the plateau shown in Figure 4.4.A). High αC values increase the optimal plateau area. However, very strong promoters could lead to a low growth rate due to the increased metabolic burden associated with the genetic circuit and should be avoided.

The analysis also suggests that LuxR concentration can be manipulated to fine-tune the switching time (Figure 4.5). The engineering of LuxR promoters and the design of synthetic ribosome binding sites (RBS) can be used to modify the value of LuxR concentration (Collins et al., 2005; Salis et al., 2009). We also showed that parameter LuxR strongly interacts with γC (Figure 4.6) and that they both affect the switching time. Therefore, switching time is sensitive to parameters LuxR and γC.

Up to this point, we have identified the three most influential parameters, the pair-wise interactions between them, and how they affect the key features of the dynamic strategy, namely the bistability of the toggle and the switching time. Parameters αC and γC affect the bistability of the toggle switch, and the design of the integrated circuit must guarantee that the circuit is bistable. Parameters γC and LuxR strongly influence the switching time in a synergistic way, and therefore fine-tuning of the switching time is possible. In the next section, the objective of the analysis is to determine the final parameter region that satisfies the yield and productivity targets.

4.3.7 Effect of all three parameters

To visualize the effect of all three parameters when varied at the same time, we generated isosurface plots of productivity and yield with respect to parameters αC, γC and LuxR (Figures S.2 and S.3 in Appendix C). The volume enclosed by the isosurface shows the solution space that satisfies the targets, and it increases as we relax the target values. First, Figure S.2 shows that high αC and γC values increase the solution space of the strategy, as indicated by the increased volume of the plot in the high αC and γC region (in agreement with Figure 4.4). Second, the shape of the volume reproduces the outcome of Figure 4.6 that either high LuxR and low γC or low LuxR and high γC values are optimal for productivity: in Figure S.2 we can isolate two subvolumes of maximum productivity, one for high LuxR and low γC and one for low LuxR and high γC values. Considering the effect of the parameters on the yield of the process (Figure S.3) allows us to prune the optimal parameter space. The maximum yield volume is located at the tip of the parameter cube, for low αC and high γC and LuxR values, which leans towards the static strategy.

In Figure 4.7, we overlap the two volumes for decreasing levels of productivity and yield to identify regions in the parameter space where the indicated target values are satisfied. In Figure 4.7.A, the targets for productivity and yield are very high (i.e., 2.9 mmol serine/l/h and 1.2 mmol serine/mmol glucose, respectively), and the volume in the parameter space that achieves these thresholds is a thin horizontal slice that lies between LuxR values of 0.5 and 1.5 µM. The shape of the volume indicates that, in order to achieve the highest values for productivity and yield, fine-tuning of LuxR concentration is crucial.

Next, the target for productivity is relaxed to 2.8 mmol serine/l/h (Figure 4.7.B). The volume lies between LuxR values of 0.5 and 2.5 µM in most of the parameter space, which supports the outcome of Figure 4.7.A that tight regulation of LuxR between 0.5 and 1.5 µM can lead to the highest values of productivity and yield. Another interesting feature here is the growth of a subvolume toward higher LuxR values for low γC. This is also observed when the yield target is reduced to 1.1 mmol serine/mmol glucose (Figure 4.7.C) and when both targets are reduced, to 2.8 mmol serine/l/h and 1.1 mmol serine/mmol glucose, respectively (Figure 4.7.D). The subvolume for low γC and LuxR > 2 µM is less robust than the horizontal subvolume, since it requires tight fine-tuning of both αC (0.4 to 10 µM·min⁻¹) and γC (1 to 2 min⁻¹). This implies that operating in this subvolume could potentially cause loss of bistability if αC and γC are not tuned precisely. Therefore, it is recommended that the operating volume be chosen as the volume shown in Figure 4.7.A.

Figure 4.7: Effect of αC-γC-LuxR on productivity and yield. Isosurfaces are shown for values of productivity higher than 2.9 and 2.8 mmol serine/l/h and yield higher than 1.2 and 1.1 mmol serine/mmol glucose. The MATLAB code to generate the figure is given in Appendix B.2.6.
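The overlap of the productivity and yield volumes amounts to keeping the grid points at which both targets are met. A minimal sketch of that filtering step follows; the four grid points and their outputs are invented for illustration and are not simulation results from this work.

```python
def feasible_points(results, min_productivity, min_yield):
    """Keep the parameter points whose productivity and yield both meet the targets."""
    return [point for point, (prod, yld) in results
            if prod >= min_productivity and yld >= min_yield]

# Hypothetical (alpha_C, gamma_C, LuxR) points with (productivity, yield) outputs.
results = [
    ((1.0, 2.0, 1.0), (2.95, 1.25)),
    ((1.0, 2.0, 2.0), (2.85, 1.22)),
    ((10.0, 1.5, 3.0), (2.82, 1.12)),
    ((0.1, 0.5, 0.1), (2.00, 0.60)),
]
strict = feasible_points(results, min_productivity=2.9, min_yield=1.2)
relaxed = feasible_points(results, min_productivity=2.8, min_yield=1.1)
```

Relaxing the targets can only add points to the feasible set, which mirrors the growth of the volume from panel A to panel D.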

4.3.8 Preliminary design considerations

The sensitivity-based model analysis presented here has given us insight into the design and optimization of the dynamic control strategy. The results of the global sensitivity analysis were used to reduce the model complexity by identifying a set of key parameters (i.e., αC, γC and LuxR) that have the greatest effect on serine production. Furthermore, we have identified the ranges of the key parameters that satisfy productivity and yield targets. In order to design a stable and robust system that balances productivity and yield, we need to identify an optimal operating region. Considering the uncertainties associated with parameter estimation and noise in gene expression, we believe that proposing an operating region is more appropriate than suggesting a single set of parameter values.

The key parameters can be adjusted using standard molecular biology techniques. To this end, the protein synthesis rate constant αC and the LuxR protein concentration can be manipulated by designing synthetic ribosome binding sites (Salis et al., 2009). Manipulating LuxR concentration can be challenging, as the protein degrades in the absence of AHL (reported half-life of 65 min (Manefield et al., 2002)). The fast growth in the first phase will lead to AHL production and formation of the slow-degrading LuxR-AHL complex, which accumulates at high concentration (Huang et al., 2012). The protein decay constant γC can be engineered either by using degradation tags to reduce the half-life of the proteins (Fung et al., 2004), or by using temperature-sensitive mutants of the LacI (McCabe et al., 2011).
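A half-life can be converted into a first-order decay constant, the form in which degradation enters rate equations, with k = ln 2 / t½. The sketch below applies this to the 65 min half-life quoted for free LuxR; treating LuxR degradation as a simple first-order process is an assumption made for illustration.

```python
import math

def decay_constant(half_life):
    """First-order decay constant (per unit time) from a half-life."""
    return math.log(2) / half_life

def fraction_remaining(t, half_life):
    """Fraction of protein left after time t under first-order decay."""
    return 0.5 ** (t / half_life)

k_luxr = decay_constant(65.0)             # per minute, roughly 0.0107
left_after_4h = fraction_remaining(240.0, 65.0)
```

Over a 4 h growth phase less than a tenth of the initial free LuxR would remain under this assumption, which illustrates why formation of the slow-degrading LuxR-AHL complex matters for the design.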

To summarize the effect of the main parameters, αC and γC have their greatest effect on the bistability of the genetic circuit. This result is in agreement with the previously published model analysis and experimental validation by Gardner et al. (2000) for the toggle switch alone. The interaction between αC and γC does not influence the switching time significantly. LuxR concentration has a major impact on both the bistability of the switch and the switching time. Specifically, when LuxR interacts with parameter γC, the switching time depends strongly on both LuxR and γC. In the last part of the analysis, changing all three parameters simultaneously revealed that, if we use values of γC and αC that ensure bistability of the switch, we can use LuxR to manipulate the switching time of the circuit in order to achieve high productivity and yield targets close to the maximum theoretical values.

A final recommended range for LuxR is between 0.5 and 1.5 µM, with preference for values close to 1.5 µM to achieve higher yield (the value measured by Basu et al., 2005 is 0.5 µM). To visualize the optimal parameter design space of αC and γC, the top view of Figure 4.7.A is shown in Figure 4.8. Parameter γC must be at least 1.5 min⁻¹ to achieve the target yield and productivity values. For increasing values of γC, the range of αC also increases. Due to stochastic effects arising from transcription and translation noise, individual cells will have slightly different boundaries from those in Figure 4.8, and as a consequence operating close to the boundaries will likely lead to lower objective values. It has been previously shown that the toggle switch is resistant to noise-induced transitions, and that this is due to the high transcription rates (Gardner et al., 2000). Low transcription rates (i.e., low αC) can result in spontaneous switching between the states (Elowitz et al., 2002; Isaacs et al., 2003). Hence, the recommended operating region is in the middle of the optimal design space shown in Figure 4.8, aiming towards higher values of αC and γC, where the width of the optimal design space increases. Nevertheless, very high values of both parameters should be avoided because of the metabolic burden associated with high expression (high values of αC and γC mean high production and degradation rates of proteins). These results now set the stage for the experimental implementation of the dynamic control strategy.

4.4 Conclusions

We have performed both analysis and design of our integrated model of a genetic circuit coupled to bacterial metabolism for the production of serine. By manipulating the switching time of the ideal ON-OFF control, we showed that the maximum theoretical productivity of the dynamic strategy is 29.6% higher than the static strategy, for an optimal switching time of approximately 4 h. The initial design of the genetic circuit used parameter values from the literature and showed a 28.3% increase in productivity compared to the static strategy, very close to the maximum theoretical productivity. To explore the sensitivity of serine concentration to the parameters of the genetic circuit, we applied global sensitivity analysis (GSA) to identify the parameters with the highest impact on serine concentration. GSA identified three key parameters (i.e., αC, γC and LuxR), thus reducing model complexity and allowing for further simulations to investigate the relationship between these parameters and the bioengineering objectives. In turn, these results have enabled us to identify the optimal parameter design space required to operate the genetic circuit at both high productivity and yield, setting the stage for experimental implementation.

Figure 4.8: Optimal parameter design space of αC and γC to achieve productivity higher than 2.9 mmol serine/l/h and yield higher than 1.2 mmol serine/mmol glucose. This figure is the top view of Figure 4.7.A. Values of LuxR in this volume are between 0.5 and 1.5 µM.

Chapter 5

Experimental implementation of the dynamic strategy

In this chapter, we utilize the genetic toggle switch to dynamically manipulate genes adh (alcohol dehydrogenase) and pta (phosphotransacetylase) in the lactic acid-producing E. coli mutant MG1655-∆(adh, pta). Although our initial target product was succinic acid, the mutant constructed with five gene deletions did not show significant succinate production. The genetic toggle we used throughout the experiments is based on the plasmid pTAK132 from Gardner et al. (2000). Gardner et al. tested plasmid pTAK132 in the E. coli strain JM2.300 (λ−, lacI22 rpsL135 (StrR), thi-1). However, this strain is no longer available, since the genotype could not be verified (personal communication with research assistant Linda Mattice from the Coli Genetic Stock Center). Thus, using plasmid pTAK132 in strain MG1655 can result in a nonfunctional toggle switch. As we will see in the troubleshooting section, this was the case when we tested the plasmid in the MG1655 background.


5.1 Materials and methods

5.1.1 Strains and plasmids

The wild-type E. coli strain MG1655 was used as the starting strain for all experiments. The first component of the strain name indicates the product (i.e., SUC: succinate and LAC: lactate) and the second indicates the conditions (i.e., AE: aerobic and AN: anaerobic). All succinate-producing strains were generated with the help of A. Ekins (Professor Vincent Martin lab from Concordia University). The lactate-producing strain was kindly donated by Professor Stephen S. Fong from Virginia Commonwealth University. The pTOG plasmids were derived from the pTAK132 plasmid using standard methods and were created by our collaborator Dr. Hideki Kobayashi (Japan Agency for Marine-Earth Science and Technology-JAMSTEC). A list of the final mutant strains and plasmids used is shown in Table 5.1.

The backbone of the toggle switch plasmid map is shown in Figure 5.1. The switch consists of genes lacI and cI857 (simply cI from now on), which encode the transcriptional regulatory proteins LacR and CI, respectively (Gardner et al., 2000). Gene lacI is expressed from promoter PL, which is repressed by protein CI. Gene cI is expressed from promoter Ptrc, which is repressed by protein LacR. This design gives cells two distinct phenotypic states: one where CI activity is high and lacI expression low, and one where LacR activity is high and the expression of cI is low. The metabolic genes that we want to manipulate dynamically are placed downstream of lacI (gene A and gene B shown in Figure 5.1). For example, in plasmid pTOG(adh, pta), genes adh and pta are placed in the positions annotated as gene A and gene B. Although genes can be placed on either side of the toggle, this design was preferred to enable expression of the metabolic genes with heat shock at the beginning of the batch and repression with IPTG.
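The two phenotypic states described above can be illustrated with the dimensionless two-repressor model published with the original toggle switch (Gardner et al., 2000). The following is a minimal sketch only: the assignment of u to LacR and v to CI, the function name, the parameter values and the forward Euler integration are all assumptions made for illustration and are not fitted to the pTOG plasmids.

```python
# Dimensionless toggle model in the form given by Gardner et al. (2000):
#   du/dt = alpha1 / (1 + v**beta)  - u
#   dv/dt = alpha2 / (1 + u**gamma) - v
# Assumption: u stands for LacR and v for CI. All parameter values are
# illustrative and are not measured for the pTOG plasmids.

def simulate_toggle(u0, v0, alpha1=10.0, alpha2=10.0, beta=2.0, gamma=2.0,
                    dt=0.01, t_end=50.0):
    """Integrate the model with forward Euler and return the final (u, v)."""
    u, v = u0, v0
    for _ in range(int(t_end / dt)):
        du = alpha1 / (1.0 + v ** beta) - u
        dv = alpha2 / (1.0 + u ** gamma) - v
        u, v = u + dt * du, v + dt * dv
    return u, v

# Two different initial conditions settle into two different stable states.
lacr_high = simulate_toggle(u0=5.0, v0=0.1)
ci_high = simulate_toggle(u0=0.1, v0=5.0)
print("LacR-high state:", lacr_high)
print("CI-high state:  ", ci_high)
```

With these illustrative parameters the model is bistable: the same circuit ends in either the LacR-high or the CI-high state depending only on where it starts, which is the property exploited by the heat shock and IPTG inputs.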

All pTOG plasmids carry the ampicillin resistance gene and the ColE1 origin of replication. A red fluorescent protein, DsRed2, is placed downstream of gene cI to visualize and quantify gene expression. Also, Dr. Kobayashi designed restriction sites XmaI and NheI, so that we can add extra genes in the pTOG plasmids that only carry gene A.

Table 5.1: List of strains and plasmids

Strain          Characteristics                                      Reference
MG1655          Wild-type (F−)                                       (1)
SUC-AE          MG1655(∆sdhAB, ackA-pta, poxB, iclR, ptsG::CmR)      (2)
SUC-AN          MG1655(∆ldh, adhE::CmR)                              (3)
LAC-AN          MG1655(∆adh, pta)                                    (4)

Plasmid         Characteristics                                      Reference
pTAK131         Backbone toggle plasmid                              (5)
pTAK132         Backbone toggle plasmid                              (5)
pTOG(ptsG)      Toggle carrying gene ptsG                            (6)
pTOG(adh)       Toggle carrying gene adh                             (6)
pTOG(pta)       Toggle carrying gene pta                             (6)
pTOG(adh,pta)   Toggle carrying genes adh, pta                       (6)

(1) Coli Genetic Stock Center (CGSC) strain no. 7740
(2) Mutant constructed based on the strain published by Lin et al. (2005)
(3) Mutant constructed based on the strain published by Burgard and Van Dien (2007)
(4) Mutant kindly provided by Fong et al. (2005)
(5) From Gardner et al. (2000)
(6) Plasmids constructed by H. Kobayashi

The inducing factors are illustrated in Figure 5.2. Green indicates that a gene is expressed (ON state), whereas red indicates that a gene is repressed (OFF state). In the upper left side of Figure 5.2, lacI is expressed and the protein produced is repressing gene cI. This is achieved by heat shocking the cells at 42°C since the CI repressor is deactivated at this temperature. In the upper right side of Figure 5.2, we show the transition to the other state. When IPTG is added, the Lac repressor is inactivated and therefore cI is expressed.

Figure 5.1: The toggle switch plasmid backbone (pTOG plasmid, 11,131 bp). Labelled elements: gene A, gene B, lacI, cI857, DsRed2, Amp and ColE1; restriction sites: XmaI, NheI, SacI, AatII and SwaI.

The expression of the related genes is shown in Table 5.2. The batch starts with a heat shock at 42°C to drive expression of lacI, adh and pta. Expression of these genes promotes growth. Addition of IPTG induces expression of cI, and consequently repression of lacI, adh and pta to promote production of lactate.

Phase     Inducer             cI   lacI   adh   pta   Outcome
Phase 1   Heat shock (42°C)   –    +      +     +     growth
Phase 2   Add IPTG            +    –      –     –     production

Table 5.2: Gene expression scheme in the toggle switch. The symbol + indicates expression; the symbol – indicates repression.
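The scheme of Table 5.2 can also be written as a small lookup, which is convenient when annotating time-course data with the expected expression state. This is only an illustrative encoding; the dictionary keys and the function name are hypothetical and do not come from any experimental software.

```python
# Expected expression state of each gene in the two phases of Table 5.2.
# True = expressed (+), False = repressed (-). Names are illustrative.
PHASES = {
    "heat shock": {"cI": False, "lacI": True, "adh": True, "pta": True},
    "IPTG": {"cI": True, "lacI": False, "adh": False, "pta": False},
}

def expressed_genes(inducer):
    """Return the sorted list of genes expected to be ON after the inducer."""
    return sorted(gene for gene, on in PHASES[inducer].items() if on)

print("Phase 1 (growth):    ", expressed_genes("heat shock"))
print("Phase 2 (production):", expressed_genes("IPTG"))
```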

Figure 5.2: Dynamics of the pTOG(adh,pta) plasmid. Genes lacI, adh and pta are expressed with heat shock at 42°C (Phase 1), and repressed with the addition of IPTG (Phase 2).

5.1.2 Media and growth conditions for strain SUC-AE

The medium used for strain SUC-AE was Luria broth (LB) with 2 g/l NaHCO3 and approximately 55 mM of glucose (Lin et al., 2005). The medium for the inoculum preparation was LB without additional glucose. NaHCO3 was added for its pH-buffering capacity and its ability to supply CO2. Flask experiments were run in 500 ml flasks with 100 ml as the working volume at 37°C and 200 rpm.

For the bioreactor experiment, the initial medium volume was 3 l in a 5-l Minifors stirred tank bioreactor (Infors HT). The inoculum was grown overnight aerobically in 500 ml flasks with 100 ml of LB, at 37°C and 200 rpm. Cells were centrifuged and resuspended in 10 ml of fresh LB. The appropriate volume of inoculum was then transferred to the bioreactors, so that the initial optical density was approximately 0.1 at 600 nm. The pH was measured using a Mettler Toledo 405-DPAS-SC-K8S electrode and controlled at 7.0 using 1.5 N HNO3 and 2 N Na2CO3. The temperature was maintained at 37°C and the agitation speed at 700 rpm. The dissolved oxygen was monitored using an Oxyferm O2 sensor (Hamilton) and maintained above 80%. Isopropyl-β-thiogalactopyranoside (IPTG) was added at a concentration of 2 mM to induce gene expression of the pTOG(ptsG) plasmid. Antibiotics were used in appropriate concentrations (ampicillin: 100 µg/ml and chloramphenicol: 25 µg/ml).

For each experiment the strains were freshly transformed with the appropriate pTOG plasmid. Then, a single colony was transferred into a 15 ml falcon tube containing 5 or 10 ml of LB with the appropriate antibiotic concentration and grown overnight aerobically at 37°C with 2 mM IPTG and shaking at 200 rpm. Cells were washed three times with fresh medium to remove IPTG and inoculated at 1% v/v for the aerobic case into 500 ml flasks containing 100 ml of fresh medium without IPTG (or 100 ml anaerobic bottles). The flasks were heat shocked at 42°C for 30 min and then the incubation was continued at 37°C.

5.1.3 Media and growth conditions for strains SUC-AN and LAC-AN

The strains were grown in mineral medium with approximately 50 mM glucose, 0.5 g/l yeast extract, 2 mM IPTG and 100 µg/ml ampicillin (if necessary) at 37°C in all stages of the inoculum preparation and the final characterization. For pH control, we used 4 M KOH in the bioreactor and MOPS during the inoculum preparation.

The mutant strain was transformed with the pTOG plasmid. A mutant carrying plasmid pTOG(gfp) served as the control. Cells were heat shocked at 42°C for 30 min to turn on gene expression of adh and pta. Repression of adh and pta was induced by the addition of IPTG (2 mM).

Fresh colonies from -80°C stocks were plated on LB plates with ampicillin (100 µg/ml). A single colony was transferred to a 15 ml falcon tube containing 10 ml of mineral salts and glucose medium with yeast extract, ampicillin and IPTG, and grown overnight aerobically at 37°C at 180 rpm. Overnight cultures were washed with fresh medium and inoculated at 10% v/v into a 10 ml anaerobic tube with medium. After approximately 10 hours, the culture was washed again with fresh medium and inoculated in 100 ml anaerobic medium (10% v/v) with MOPS as a pH buffer and grown for approximately 10 hr. Cultures were washed three times to remove IPTG and resuspended in fresh medium. Optical density was measured and the appropriate volume was inoculated into the 300 ml reactors to achieve an initial OD600 of 0.1 (approximately 30 ml were needed for the wild-type and 60 ml for the mutant). Anaerobic tubes and bottles were prepared by sparging with N2 for an appropriate time (10 and 30 min for the liquid phase, 2 and 10 min for the headspace of the tube and the bottle, respectively).
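The inoculum volume needed for a target initial OD600 follows from a simple dilution balance, OD_inoculum × V_inoculum = OD_target × V_final. The sketch below assumes that OD scales linearly with biomass in this range and that the 300 ml refers to the final working volume; the inoculum OD values of 1.0 and 0.5 are back-calculated from the 30 ml and 60 ml quoted above and are not measured values. The function name is hypothetical.

```python
def inoculum_volume_ml(target_od, final_volume_ml, inoculum_od):
    """Volume of inoculum (ml) that gives target_od in final_volume_ml.

    Assumes OD600 is proportional to biomass concentration, so that
    inoculum_od * V_inoculum = target_od * final_volume_ml.
    """
    if inoculum_od <= target_od:
        raise ValueError("inoculum must be denser than the target culture")
    return target_od * final_volume_ml / inoculum_od

# Assumed inoculum ODs (back-calculated, not measured): 1.0 and 0.5.
print(inoculum_volume_ml(0.1, 300.0, 1.0))  # wild-type case
print(inoculum_volume_ml(0.1, 300.0, 0.5))  # mutant case
```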

Bottle and reactor experiments were performed on a mineral medium containing mineral salts (per l: 3.5 g KH2PO4, 5 g K2HPO4, 3.5 g (NH4)2HPO4, 0.25 g MgSO4·7H2O, 0.015 g CaCl2·2H2O, 0.5 mg thiamine and 1 ml of trace metal stock) and 55 mM glucose. The trace metal stock was prepared in 0.1 M HCl (per l: 1.6 g FeCl3, 0.2 g CoCl2·6H2O, 0.1 g CuCl2, 0.2 g ZnCl2·4H2O, 0.2 g Na2MoO4, 0.05 g H3BO3).
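Because the recipe is given per litre, the amounts for a bottle or reactor follow by linear scaling. The snippet below is a minimal sketch of that arithmetic for the mineral salts only; the dictionary and function names are hypothetical, and thiamine and the trace metal stock (given in mg and ml) scale in the same way.

```python
# Mineral salts of the medium, in grams per litre, as listed in the text.
MINERAL_SALTS_G_PER_L = {
    "KH2PO4": 3.5,
    "K2HPO4": 5.0,
    "(NH4)2HPO4": 3.5,
    "MgSO4.7H2O": 0.25,
    "CaCl2.2H2O": 0.015,
}

def scale_recipe(volume_l, recipe=MINERAL_SALTS_G_PER_L):
    """Return the grams of each salt needed for volume_l litres of medium."""
    return {salt: grams * volume_l for salt, grams in recipe.items()}

# Example: one 100 ml anaerobic bottle.
for salt, grams in scale_recipe(0.1).items():
    print(f"{salt}: {grams:.4f} g")
```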

5.1.4 Genetic methods

The P1 transduction method was used to transfer a mutation from a BW25113 mutant strain (received from the Keio collection) to the MG1655 strain. When phage P1 grows and encapsidates DNA, it occasionally packages DNA from the bacterial host rather than its own. As a result, the individual particles in a lysate contain either packaged phage DNA or packaged bacterial DNA. Therefore, we can prepare a P1 lysate on the BW25113 donor cells (containing a kanamycin cassette in place of the gene to be deleted) and use it to infect the MG1655 host, transferring DNA pieces into the new strain. These DNA pieces then recombine and are permanently incorporated into the chromosome of the recipient.

Lysates were made as follows. BW25113 strains containing a single gene deletion were grown in LB broth overnight at 37°C. In a 2 ml eppendorf tube, 1 ml of LB, 5 µl of 1 M CaCl2, 10 µl of 20% glucose and 10 µl of the overnight culture were added. After incubating for 45 minutes at 37°C, 20 µl of P1 stock was added and the mixture was incubated at 37°C with shaking for at least 3 hours. Then 20 µl of chloroform was added, followed by vortexing and centrifugation. Finally, the supernatant was transferred to a new tube with 10 µl of chloroform.

To transduce the cells (i.e., the MG1655 strain), 500 µl of an overnight culture was resuspended in 500 µl MC buffer (containing 0.1 M MgSO4 and 0.005 M CaCl2 for attachment). Two dilutions of the phage lysate (usually 1:1 and 1:10) and two negative controls (one for the cells and one for the phage) were used.

Cells were incubated with the lysate at 37°C for 20 minutes, then 200 µl of citrate buffer (which prevents reattachment) and 600 µl of LB were immediately added. Cells were incubated for 1 hr at 37°C and resuspended in 50-75 µl. Finally, cells were plated on kanamycin plates for selection. Colonies from plates containing both cells and lysates were streaked out on kanamycin plates. These cells contain the kanamycin resistance cassette, which can be eliminated using the pCP20 plasmid.

5.1.5 Analytical techniques

Optical density was measured at 600 nm (visible range on Jenway 6320D). Organic acids and residual glucose were measured with a Shodex RI detector using an Aminex HPX-87H ion exchange column at 25°C with 0.5 and 5-10 mM H2SO4 as the mobile phase. Using two different concentrations of the eluent was necessary for satisfactory separation. Glucose was measured at the low H2SO4 concentration. Acids and IPTG separate well at H2SO4 concentrations higher than 5 mM. The conditions were chosen using the Dionex optimization software for the Bio-Rad HPX-87H organic acids column.

5.2 Aerobic succinate production

The metabolic engineering approach for this strain (Figure 5.3) aims at: 1) inactivation of the phosphotransferase uptake system (gene ptsG) to avoid formation of pyruvate from PEP through the transport of glucose, 2-3) elimination of the byproduct acetate (deletions of poxB and the ackA-pta operon), 4) removal of the glyoxylate shunt repressor (gene iclR), and 5) disruption of the TCA cycle (deletion of the sdhAB operon).

Figure 5.3: Gene deletions for the aerobic succinate production strain: 1. ptsG (glucose PTS permease), 2. poxB (pyruvate oxidase), 3. ackA-pta (acetate kinase-phosphate acetyltransferase), 4. iclR (iclR transcriptional repressor), and 5. sdhAB (succinate dehydrogenase).

Here, we investigate the effect of the dynamic expression of gene ptsG in the aerobic succinate-producing mutant SUC-AE. Gene ptsG is selected to be under dynamic control because its deletion is known to significantly decrease growth rate. Aerobic flask experiments were performed in 500 ml flasks with strain SUC-AE in two different media. The first medium is LB supplemented with approximately 55 mM of glucose, according to Lin et al. (2005) (Figure 5.4). Two induction times were applied (indicated by the arrows in the figure): 18 and 12 hr in panels A and B, respectively. In the second set of experiments we wanted to test the ability of the strain SUC-AE to grow in minerals medium without LB. Figure 5.5 shows the profiles of flask experiments in minerals medium with 55 mM of glucose. IPTG was added after 18 hr. The results shown are the average of triplicates. Strain SUC-AE without the pTOG(ptsG) plasmid was used as a control (shown as ∆5 in the figures). The strains and conditions used in the experiments presented in Figures 5.4 and 5.5 are listed below:

• IPTG(-) is the strain with the plasmid pTOG(ptsG), without induction (i.e., no addition of IPTG). This is essentially similar to a mutant with only four gene deletions, since ptsG is always expressed, and is a control experiment.

• IPTG(+) is the strain with the plasmid pTOG(ptsG) and induction with addition of IPTG at the times indicated by the arrows. This is the proposed dynamic strategy, since the gene expression is manipulated dynamically. Essentially, this mutant is a "hybrid" since it operates as a quadruple mutant in the first phase of the batch and as a quintuple mutant in the second.

• ∆5 is the quintuple mutant without the pTOG plasmid. This serves as a control and is expected to grow slower than the other two cases.

A comparison of the time profiles of the optical densities shows the advantage of the dynamic expression of ptsG. Both strains with the plasmid pTOG(ptsG) grow faster than the mutant without the toggle in LB (Figure 5.4). The advantage of the dynamic strategy in terms of growth rate is more prominent in minimal medium (Figure 5.5). The mutant with the dynamic expression (i.e., IPTG(+)) is expected to have O.D. between the other two since it is a "hybrid" of them. This is not observed in the top graph and this is due to the conditions of the experiment. Aeration, pH control, and initial O.D. are not controlled very accurately in the flask and generate large variation. In contrast, bioreactors provide precise control of variables (e.g., pH and aeration), thus they are more suitable for characterizations.

Figure 5.4: Optical density profiles for different induction times (flask experiment in LB supplemented with approximately 55 mM glucose). Induction time is 18 hr in A (triplicate experiments and one standard deviation shown) and 12 hr in B (duplicate experiments). The cells carrying the pTOG plasmid grow faster than the quintuple mutant.

Figure 5.5: Optical density profiles for induction time 18 hr (single flask experiment in minerals medium with approximately 55 mM glucose).

Batch experiments were carried out and the results are discussed next. Strain SUC-AE with and without the plasmid pTOG(ptsG) was grown in bioreactors at 37°C, with pH control at 7. The dissolved O2 was maintained above 80% saturation. The toggle was induced 12 hr after inoculation (indicated by the arrow). The metabolite profiles are compared in Figure 5.6. The mutant strain shows a lag in growth, compared to the strain carrying the toggle plasmid. The expression of gene ptsG results in faster glucose consumption in the first phase of the batch (i.e., before induction). The O.D. at the point of induction is very close to the maximum O.D. observed at 22 hr. This suggests that the switch was turned off relatively late.

Although the pyruvate and acetate profiles of the mutant are in good agreement with the published results, neither strain produces succinate in amounts comparable to Lin et al. (2005). This can be due to the different background strain used in our experiments. Lin et al. (2005) used a lab strain that was created on a spontaneous cadR mutant of MC4100, whereas the mutant we created is based on MG1655. However, the principle of our method was demonstrated in this experiment since the production of pyruvate and acetate was higher with the dynamic method and this was due to increased biomass production.

Figure 5.6: Characterization of the succinate-producing strain and effect of the dynamic expression of ptsG (single experiment). A: optical density at 600 nm, B: pyruvate concentration, C: glucose concentration, D: acetate concentration. The arrow at 12 hr in A indicates induction with IPTG. Succinate amounts were below 1 mM.

Since the strain used in the literature is a laboratory strain and we do not have access to it, we would have to focus on troubleshooting the MG1655 strain. Further examination of the products formed (e.g., formate accumulation) may lead us to an intuitive redesign of the mutant. Evolution of the strain can also be useful if the succinate production is coupled to growth. However, since strain development and evolution require more effort than working with the next candidate, the lactate-producing ∆(adh, pta) mutant, we will focus on the latter.

5.3 Anaerobic lactate production: protocol development

5.3.1 Preliminary characterization of the lactate-producing strain

The strain MG1655 ∆(adh, pta) used for the anaerobic production of lactate was provided by Professor S. Fong and has been evolved for lactate production (Fong et al., 2005). Deletion of genes adh and pta eliminates the production of the byproducts ethanol and acetate, respectively. The growth rate is severely affected since deletion of these pathways decreases the production of NAD+ and ATP (Figure 5.7). The coupling of lactic acid production and growth rate has been predicted computationally and observed experimentally during the evolution of the strain.

Figure 5.8 shows the results of the characterization of the mutant and the wild-type. The experiment was conducted in 100 ml anaerobic bottles and minerals medium supplemented with 55 mM of glucose. The growth rate and the biomass yield of the mutant are considerably lower than those of the wild-type. As a result, the final optical density of the mutant is significantly lower than that of the wild-type (approximately 0.1 for the mutant, compared to approximately 0.35 for the wild-type). Also, the mutant requires more time to consume all the available glucose. Pyruvate production in the mutant is below 5 mM and the final lactate titer is a little higher than 25 mM. The mutant produces no acetate, in contrast with the wild-type, which produces acetate at a final concentration of 10 mM.

Thus, the significant growth defect suggests that the dynamic expression of genes adh and pta can improve the productivity of the conversion. In the next experiment, we will test the expression of gene pta in the mutant with the plasmid pTOG(pta).
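For orientation, the titers quoted above can be turned into an approximate molar yield. The sketch below assumes, for illustration only, that the full 55 mM of glucose was consumed and uses 25 mM as the lactate titer (the text reports a value a little higher), so the result is a rough lower estimate; if less glucose was consumed, the yield is higher. The upper bound of 2 mol lactate per mol glucose is the stoichiometric limit of glycolysis. Function and variable names are hypothetical.

```python
MAX_LACTATE_PER_GLUCOSE = 2.0  # mol/mol, stoichiometric limit of glycolysis

def molar_yield(product_mm, substrate_consumed_mm):
    """Molar yield (mol product per mol substrate consumed)."""
    if substrate_consumed_mm <= 0:
        raise ValueError("substrate consumed must be positive")
    return product_mm / substrate_consumed_mm

# Assumptions: 25 mM lactate formed, all 55 mM glucose consumed.
lactate_yield = molar_yield(25.0, 55.0)
fraction_of_max = lactate_yield / MAX_LACTATE_PER_GLUCOSE
print(f"yield: {lactate_yield:.2f} mol/mol "
      f"({fraction_of_max:.0%} of the stoichiometric maximum)")
```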

5.3.2 Expression of pTOG(pta) in minerals medium

Duplicate experiments were conducted in 100 ml bottles, with mineral salts medium and 55 mM of glucose for two cases: the mutant and the mutant carrying the pTOG(pta) plasmid. The mutant ∆(adh, pta) grows to an optical density between 0.15 and 0.2, which is in good agreement with the previous experiment (Figure 5.9). The transformed mutant carrying plasmid pTOG(pta) does not grow. The experiment was repeated two more times, and no growth was observed for the transformed mutant. The lack of growth can be attributed to the cumulative stress created by the following factors:

• the antibiotic ampicillin

• the inducer IPTG

• heat shock at 42°C for 30 minutes

• metabolic burden associated with the plasmid pTOG(pta)

The use of supplements to support growth is suggested for the following experiments. The candidate supplements we examine are LB, yeast extract, and peptone.

Figure 5.7: The genes adh and pta are deleted to decrease byproduct formation. The growth rate is severely affected since NAD+ regeneration and ATP production are decreased as a result of the gene deletions.

Figure 5.8: Optical density and metabolite profiles of the wild-type and the ∆(adh, pta) mutant (A: optical density at 600 nm, B: lactate, C: glucose, D: acetate). A single experiment was conducted in 100 ml bottles and minimal medium supplemented with approximately 55 mM glucose.

5.3.3 Use of Luria broth as a supplement

Here, we used LB to supplement growth at 25 g/l, the concentration suggested by the manufacturer. The transformed mutant grows well to an optical density slightly lower than the mutant alone due to the metabolic burden associated with the plasmid (Figure 5.10). A transformed mutant carrying a control plasmid could serve as an appropriate control (e.g., pTOG(gfp)). The high concentration of LB (i.e., 25 g/l in addition to 2 g/l of glucose) is not desirable in an industrial application and we will attempt to minimize it in the following experiments.

Figure 5.9: Optical density profiles of mutant and mutant carrying the pTOG(pta) plasmid for two experiments in minerals medium with 55 mM glucose. The experiment was repeated two more times, in triplicate, and no growth was observed for the transformed mutant.

5.3.4 Minimizing the use of Luria-Bertani supplement

In this experiment we used 5% of the LB used in the previous experiment (i.e., 1.25 instead of 25 g/l) to supplement growth in 2 g/l or 11 mM glucose. The cultures grow well to approximately the same optical density as before. Two controls were used as described in the following paragraphs.
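The two glucose units used in this chapter (g/l and mM) are related through the molar mass of glucose, approximately 180.16 g/mol. The following sketch shows the conversion and the 5% LB dilution; the function names are hypothetical.

```python
GLUCOSE_MOLAR_MASS = 180.16  # g/mol

def g_per_l_to_mm(g_per_l, molar_mass=GLUCOSE_MOLAR_MASS):
    """Convert a mass concentration (g/l) to a molar concentration (mM)."""
    return g_per_l / molar_mass * 1000.0

def mm_to_g_per_l(mm, molar_mass=GLUCOSE_MOLAR_MASS):
    """Convert a molar concentration (mM) to a mass concentration (g/l)."""
    return mm * molar_mass / 1000.0

print(f"2 g/l glucose  = {g_per_l_to_mm(2.0):.1f} mM")
print(f"55 mM glucose  = {mm_to_g_per_l(55.0):.2f} g/l")
print(f"5% of 25 g/l LB = {0.05 * 25.0:.2f} g/l")
```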

Control 1 (Figure 5.11): The test case is the ∆(adh, pta)-pTOG(pta) strain with pta being turned off after 4 hr (i.e., IPTG(+)). The first control is the same strain without adding IPTG (i.e., IPTG(-)). Without IPTG addition, gene pta is always expressed in the toggle plasmid. The optical density, glucose consumption and lactate production are similar for both cases.

Figure 5.10: Optical density profiles of mutant and mutant carrying the pTOG(pta) plasmid for two experiments in minerals medium, supplemented with 25 g/l of LB and 11 mM glucose.

Figure 5.11: Optical density, glucose and lactate time profiles for Control 1 (single experiment).

Control 2 (Figure 5.12): The second control is strain ∆(adh, pta) carrying the plasmid pTOG(gfp). This strain is essentially equivalent to ∆(adh, pta), with the addition of the metabolic burden associated with the plasmid pTOG(gfp). The strain under the dynamic strategy grows faster and to a higher OD than the control, and it produces 21.2 mM of lactate after 7 hr, compared with 12.3 mM for the control at the same time.
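The two lactate titers reported for Control 2 translate directly into average volumetric productivities over the first 7 hr. The sketch below assumes a negligible initial lactate concentration, so that productivity is simply titer divided by elapsed time; the function name is hypothetical.

```python
def volumetric_productivity(titer_mm, time_h):
    """Average volumetric productivity (mM/h), assuming zero initial titer."""
    return titer_mm / time_h

dynamic = volumetric_productivity(21.2, 7.0)   # pTOG(pta), IPTG(+)
control = volumetric_productivity(12.3, 7.0)   # pTOG(gfp) control
relative_gain = dynamic / control - 1.0

print(f"dynamic: {dynamic:.2f} mM/h, control: {control:.2f} mM/h, "
      f"relative gain: {relative_gain:.0%}")
```

Since both titers refer to the same time point, the relative gain equals the ratio of the titers and is independent of the 7 hr used in the division.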

Figure 5.12: Optical density, glucose and lactate time profiles for Control 2 (single experiment).

5.3.5 Using inexpensive supplements

Here, we explore the use of yeast extract (Y.E.) and peptone as supplements instead of LB, since they are less expensive and more suitable for an industrial process. Different concentrations on a logarithmic scale were examined (0.05 and 0.5 g/L of yeast extract; 0.01, 0.1 and 1 g/L of peptone). Growth profiles in supplements of 1 g/L of peptone and 0.5 g/L of Y.E. are shown in Figure 5.13. Cultures in 0.1 g/L of peptone and 0.05 g/L of Y.E. did not grow. Also, peptone seems to be consumed in the early stage of the batch instead of glucose and this is not desirable. Therefore, we choose yeast extract as a supplement for all our future experiments at the concentration of 0.5 g/L.

Figure 5.13: Optical density, glucose and lactate time profiles for 1 g/l of peptone and 0.5 g/l of yeast extract supplements (single experiment).

5.3.6 Different induction times

In this experiment, different induction times are tested in triplicate (at 0 and 5 hours). The error bars in Figure 5.14 represent one standard deviation. Cultures induced after 5 hr are expected to have a growth advantage since the pta gene is expressed (when induction is applied at the beginning of the batch pta is repressed). The optical density profiles are consistent with this hypothesis. However, the error bars overlap and the confidence in the results is low.

The high variation is probably due to variations in the initial conditions. The inocula for the 100 ml bottles were prepared in separate 10 ml tubes and this could be a possible cause for the variation. To eliminate this variation the inoculum will be prepared in one bottle from now on.
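The overlap criterion used here (mean ± one sample standard deviation of triplicates) can be checked numerically. The sketch below uses hypothetical OD values chosen only to illustrate the calculation; they are not the measured data, and the function names are assumptions.

```python
from statistics import mean, stdev

def mean_sd(values):
    """Return the mean and the sample standard deviation of the replicates."""
    return mean(values), stdev(values)

def error_bars_overlap(a, b):
    """True if the mean +/- one SD intervals of the two groups overlap."""
    (mean_a, sd_a), (mean_b, sd_b) = mean_sd(a), mean_sd(b)
    return abs(mean_a - mean_b) <= sd_a + sd_b

# Hypothetical triplicate OD600 readings at a single time point.
induced_at_0h = [0.20, 0.26, 0.32]
induced_at_5h = [0.24, 0.31, 0.38]

print(mean_sd(induced_at_0h), mean_sd(induced_at_5h))
print("error bars overlap:", error_bars_overlap(induced_at_0h, induced_at_5h))
```

Overlapping one-standard-deviation bars are only a visual heuristic; a formal comparison would require a statistical test, which is outside the scope of this sketch.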

Figure 5.14: Optical density profiles for induction times of 0 and 5 hours. The inoculum was prepared in different tubes for every experiment and thus may be a source of the large variation. The results are the average of triplicate experiments and one standard deviation.

The same experiment is conducted with the inoculum prepared in the same bottle. Also, the glucose is added with the inoculum, resulting in less experimental variation in the initial conditions. However, this did not solve the issue of overlapping profiles (Figure 5.15).

Large variation is also observed in the initial glucose concentration. This is due to the very small volumes of glucose solution added to the bottles with the syringe. Adding volumes smaller than 100-200 µl with the syringe should be avoided. Instead, these volumes will be added with the inoculum (i.e., 10 ml culture in the 100 ml bottles). Glucose seems to be consumed faster with IPTG induction at 4 hr and growth seems to stop after 10 hr. In the slower growing strategy (i.e., IPTG induction at 0 hr), cells grow slower, glucose is consumed slower and less lactate is produced up to 10 hr. However, cultures grow for an extra hour, i.e., total batch time is 11 instead of 10 hr.

Figure 5.15: Optical density profiles for induction times of 0 and 4 hours. The inoculum was prepared in the same tubes. Although the variation was smaller than the previous experiment, the error bars still overlap. The results are the average of triplicate experiments and one standard deviation.

Then, we present the glucose and lactate concentration after 9 hr with different induction times. In addition to induction at t=0 and 4 hr, we include the case without any induction (shown as "No IPTG" in Figure 5.16). The glucose and lactate concentrations still show high variation. Due to the high variation in the results and the partial consumption of glucose, the use of bioreactors is highly recommended at this point. We expect that better control of variables and initial conditions will result in less variable measurements.

Figure 5.16: Glucose and lactate concentration after 9 hr for IPTG added after 0 hr, after 4 hr, and not added (No IPTG). The results are the average of triplicates and the error bars represent one standard deviation.

5.3.7 Bioreactor experiment

These experiments were conducted in the bioreactors under anaerobic conditions and the cells did not grow (Figure 5.17). The most likely reason is that the inoculum was left in the stationary phase for too long (approximately 20 hr). This was done in order to achieve more stringent anaerobic conditions in the bioreactors; however, the low pH in the late phase may have resulted in cell death. Thus, we suggest the use of the pH buffer 3-(N-morpholino)propanesulfonic acid (MOPS) to maintain neutral pH for a prolonged time. The dynamics of the growth in 100 ml bottles with and without MOPS (pH buffer) are studied in the next experiment.

Another potential problem to be taken into account is the temperature control in the bioreactors. The batch started at 42°C in order to heat shock the cells. The set-point was changed after 30 minutes to 37°C. However, it took an additional 20 min for the reactor to reach temperatures below 40°C.

Figure 5.17: Optical density of the mutant carrying pTOG(gfp) and pTOG(pta) for a single experiment. The cultures did not grow.

5.3.8 Use of pH buffer in the inoculum preparation

These experiments were carried out in 100 ml bottles with 5 g/L glucose with and without MOPS. Figures 5.18.A and B show the growth curves of the ∆(adh, pta) mutant carrying plasmids pTOG(pta) and pTOG(gfp), respectively. These results show that the cells grow for at least an extra 4 hr in the presence of MOPS. With this, we can also explain the failure of the previous experiment. The inoculum was left in 100 ml bottles for almost 20 hr to achieve anaerobic conditions in the bioreactor. Figures 5.18.A and B show that the cells reach the stationary phase after about 8 hr without pH buffer and they were probably dead after 20 hr. Thus, MOPS is used in the preparation of the inoculum in the next bioreactor experiment.

[Figure 5.18 plots: O.D. (600 nm) versus time (0 to 20 hr). A: pTOG(pta) with and without MOPS. B: pTOG(gfp) with and without MOPS.]

Figure 5.18: The use of MOPS as a pH buffer allows the cells to grow for longer than 10 hr in the inoculum preparation stage. One experiment was conducted per case.

5.3.9 Protocol

The protocol was developed through the experiments described in the previous sections (Figure 5.19). In the early stages of the experiments, the protocol included more steps to allow the cells to adapt from an initial LB culture to the mineral salts medium. Eventually, we realized that these steps were not necessary when yeast extract was used as a supplement, and they were omitted.

Fresh colonies from −80 °C stocks were plated on LB-ampicillin plates (100 µg/ml). A single colony was transferred to a 15 ml Falcon tube containing 10 ml of mineral salts and glucose medium with yeast extract (0.5 g/l), ampicillin (100 µg/ml) and IPTG (2 mM), and grown overnight aerobically at 37 °C (180 rpm). The overnight culture was washed in fresh medium and inoculated at 10% v/v into 10 ml of anaerobic medium. After approximately 10 hr, the culture was washed again with fresh medium, inoculated into 100 ml of anaerobic medium (10% v/v) with MOPS as a pH buffer, and grown for approximately 10 hr (late exponential phase). Cultures were washed three times to remove IPTG, resuspended in fresh medium, and inoculated with a volume chosen so that the initial biomass concentration in the bioreactor was approximately 0.05 O.D.
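The final inoculation step is a dilution calculation. The following is a minimal sketch, assuming that optical density is proportional to biomass concentration in this range and that the final volume includes the inoculum. The function name and the example values (a washed suspension at O.D. 0.6 and a 300 ml final volume) are illustrative assumptions; only the target O.D. of 0.05 comes from the protocol.

```python
def inoculum_volume_ml(od_suspension, od_target, final_volume_ml):
    """Volume of cell suspension needed to reach od_target in final_volume_ml.

    Assumes O.D. is proportional to biomass (C1 * V1 = C2 * V2) and that the
    final volume includes the inoculum.
    """
    if od_suspension <= od_target:
        raise ValueError("suspension must be denser than the target culture")
    return od_target * final_volume_ml / od_suspension


# Hypothetical example: washed suspension at O.D. 0.6, 300 ml final volume.
# The target initial O.D. of 0.05 is the value stated in the protocol.
volume = inoculum_volume_ml(od_suspension=0.6, od_target=0.05, final_volume_ml=300.0)
print(f"Inoculum volume: {volume:.1f} ml")
```

If the inoculum is instead added on top of a fixed medium volume, the denominator becomes the difference between the suspension O.D. and the target O.D.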

[Figure 5.19 schematic: from −80 °C stock through a 10 ml aerobic stage (8-10 hr) and 10 ml and 100 ml anaerobic stages (8-10 hr each, with MOPS as pH buffer in the 100 ml stage) to the 2 l vessel. Medium: minerals medium with glucose 2 g/L, yeast extract 0.5 g/L, ampicillin 100 mg/L, IPTG 2 mM.]

Figure 5.19: The protocol developed for running bioreactor and bottle experiments

5.4 Anaerobic lactate production: characterization

In this section, we characterize the relevant strains and transformants in the 0.5 l Applikon bioreactors. First, we characterize the wild-type and the mutant with 100 mM of initial glucose. The mutant is not able to consume all the available glucose, most likely due to product inhibition. The characterizations are then repeated successfully with 50 mM of initial glucose. The transformed mutant does not perform as expected, and further troubleshooting is suggested.

5.4.1 Characterization of wild-type and mutant in 100 mM of glucose

The anaerobic characterization of the control strain MG1655 wild-type and the mutant ∆(adh, pta) with 100 mM initial glucose is presented in Figure 5.20. The biomass (O.D. at 600 nm) and glucose (mM) concentrations are shown in A and C for the wild type and the mutant, respectively. The anaerobic products formate, acetate, ethanol, lactate, and succinate are shown in B and D.

The wild-type strain consumes the glucose within approximately 13 hr, and the biomass reaches a final value of approximately 3.6 O.D. (Figure 5.20.A). The estimated growth rate µWT is 0.49 hr−1. The main products of the wild-type strain are formate, acetate, and ethanol (respective titers of 123.8, 70.3 and 57.5 mM). Lactate, the desired product, has a lower titer (19.6 mM), as does succinate (10.2 mM) (Figure 5.20.B). The mutant ∆(adh, pta) grows slower than the wild-type (µmutant = 0.08 hr−1 compared to µWT = 0.49 hr−1) and to a lower final biomass concentration of approximately 0.5 O.D. (Figure 5.20.C). The glucose is not fully consumed (32.2 mM left after 78 hr), and consumption ceases after approximately 50 hr, when the lactate concentration reaches 100 mM. The main fermentation product in D is lactate (135.7 mM after 78 hr), and the only other products detected, formate and succinate, are present at significantly lower concentrations (10.5 and 3.1 mM, respectively).

While the results are in good agreement with the yield values published by Fong et al. (2005), growth is inhibited by lactate after approximately 55 hr.

[Figure 5.20 plots. A (wild-type MG1655) and C (mutant (adh,pta)): glucose (mM) and optical density (600 nm) versus time (h). B and D: formate, acetate, ethanol, lactate and succinate concentrations (mM) versus time (h).]

Figure 5.20: Characterization of wild-type (A and B) and mutant (C and D) in 100 mM of glucose in a single experiment

The most likely reason for the inhibition is the presence of product. Weak organic acids such as lactate do not completely dissociate. The undissociated lactate enters the cells, and there it dissociates. Lactic acid forms chelated molecules with essential growth metals, such as iron. Presser et al. studied the effect of lactate concentration and suggested that for total lactate concentrations between 25 and 200 mM, the undissociated lactate is the major inhibitory factor (Presser et al., 1997). To avoid the inhibition effect in this work, we perform the next experiment with 50 mM of initial glucose, which will result in less than 100 mM of lactate at the end of the batch.
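The growth rates quoted in this section are specific growth rates of the exponential phase. One common way to estimate such a rate is a least-squares fit of ln(O.D.) against time; the sketch below illustrates that approach. It is an assumption that this matches the fitting procedure used here, which the text does not state, and the data points are synthetic values generated for illustration, not measurements.

```python
import math


def specific_growth_rate(times_hr, optical_densities):
    """Least-squares slope of ln(O.D.) versus time, in hr^-1."""
    if len(times_hr) != len(optical_densities) or len(times_hr) < 2:
        raise ValueError("need at least two matched (time, O.D.) points")
    log_od = [math.log(od) for od in optical_densities]
    n = len(times_hr)
    mean_t = sum(times_hr) / n
    mean_y = sum(log_od) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_hr, log_od))
    sxx = sum((t - mean_t) ** 2 for t in times_hr)
    return sxy / sxx


# Synthetic, noise-free exponential-phase data generated with mu = 0.08 hr^-1.
times = [0.0, 5.0, 10.0, 15.0, 20.0]
ods = [0.05 * math.exp(0.08 * t) for t in times]
mu = specific_growth_rate(times, ods)
print(f"Estimated growth rate: {mu:.2f} hr^-1")
```

Only points from the exponential phase should enter the fit; including lag or stationary points biases the slope downward.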

5.4.2 Characterization of wild-type and mutant in 50 mM of glucose

Here, we use 50 mM of initial glucose instead of 100 mM. The toxicity effect is eliminated, and the mutant strain consumes all the available glucose. The wild-type is also characterized in 50 mM of glucose for completeness.

The wild-type consumes the glucose in approximately 10 hr, reaching a final O.D. of 2.2 (Figure 5.21.A). The estimated growth rate is 0.23 hr−1. The main products are again formate, acetate and ethanol. Lactate and succinate are produced at lower concentrations (Figure 5.21.B).

[Figure 5.21 plots. A, C, E: glucose (mM) and optical density (600 nm) versus time (h) for the wild-type, the mutant (adh,pta), and (adh,pta)-pTOG(adh,pta), respectively. B, D, F: formate, acetate, ethanol, lactate and succinate concentrations (mM) versus time (h) for the same three cases.]

Figure 5.21: Characterization of wild-type (A and B) and mutant (C and D) in 50 mM of glucose in a single experiment

The mutant ∆(adh, pta) consumes 50 mM of glucose in approximately 35 hr, reaching an O.D. of 0.62 (the growth rate is 0.07 hr−1). The final lactate concentration is 110 mM. Considering that the maximum theoretical yield is 2 mol lactate/mol glucose, the maximum theoretical production is 100 mM of lactate. The higher value is attributed to the extra carbon added as supplement with the yeast extract. Small amounts of formate and succinate are also detected (8 and 2 mM, respectively). A comparison of the product yields, growth rate, batch time, and final O.D. is presented at the end of this section.

5.4.3 Characterization of the toggle switch in 50 mM of glucose

In this section, we transform the mutant ∆(adh,pta) with the pTOG(adh,pta) plasmid to implement the dynamic strategy. The protocol for the dynamic expression of genes adh and pta was discussed in the "Materials and methods" section. Initially, the expression of the target genes is induced by heat shock for 30 min. If genes adh and pta are expressed in the first phase, the growth rate is expected to be higher than that of the mutant. In the second phase, the genes are turned off with the addition of IPTG, and the growth rate is expected to drop.

In Figures 5.22.A and B, we present the characterization of the transformed mutant. Although heat shock was applied at the beginning of the batch, the growth rate is slower than that of the mutant. The glucose is also consumed more slowly; the batch time is 55 hr compared to approximately 35 hr for the mutant without the toggle plasmid. The biomass concentration reaches a maximum O.D. of 0.39 and the growth rate is approximately 0.05 hr−1. Lactate is the main product, with a titer of 105 mM. Formate and succinate are produced at lower concentrations (10 and 2 mM, respectively).

Our hypothesis was that the expression of genes adh and pta from the pTOG plasmid with heat shock would improve the growth of the mutant; however, this was not observed. The growth of the transformed mutant is slower than that of the mutant alone, and it takes 55 hr to reach a final O.D. of 0.39. In contrast, the mutant reached an O.D. of 0.5 after approximately 35 hr. Interestingly, in the inoculum preparation stage, we noticed that the cultures growing with IPTG grow faster than those growing without IPTG. Thus, we decided to test whether IPTG improves growth.

[Figure 5.22 plots. Panel titles: "pTOG: Heat shock only" and "pTOG: IPTG (0-7 hr), washing (7-10 hr), 30 min heat shock at 10 hr". A and C: glucose (mM) and optical density (600 nm) versus time (h). B and D: lactate, formate, succinate and acetate concentrations (mM) versus time (h).]

Figure 5.22: Characterization of the mutant ∆(adh,pta) transformed with the plasmid pTOG(adh,pta) in 50 mM of glucose. (A-B) Heat shock was applied for 30 min at the beginning of the batch and no IPTG was added to the culture (single experiment). (C-D) The first phase of the experiment was carried out in 3 bottles of 100 ml each, with IPTG (2 mM) for 7 hr. Then, cells were washed 3 times to remove IPTG (between 7 and 10 hr). The second phase was carried out in the bioreactor with 300 ml of media, with heat shock for 30 min. The results shown in C and D are the average of triplicates, and one standard deviation is also shown.

In Figures 5.22.C and D, we show the results of triplicate experiments with the transformed mutant. The first phase of the batch was conducted in three bottles, with 100 ml working volume and 2 mM IPTG (0-7 hr). The cells were then washed three times to remove IPTG (7-10 hr). During this period, the remaining glucose at 7 hr was measured. The experiment resumed at 10 hr with a 30 min heat shock in the Applikon bioreactors with 300 ml working volume. In the bioreactor, we added an appropriate amount of glucose to ensure that the concentration at 10 hr was equal to the glucose concentration measured at 7 hr.

The use of IPTG seems to improve the growth rate in the first phase. After 7 hr, the O.D. reaches an average value of 0.265 (the standard deviation is 0.019 and the error bars overlap with the circle marker). In contrast, the mutant requires approximately 12 hr to reach an O.D. of 0.265. Between 7 and 10 hr, the cells were washed 3 times to remove the IPTG. A small amount of cells was lost during the washing step, and the cell density at 10 hr is slightly lower than at 7 hr (average O.D. at 10 hr is 0.235). Despite the loss of some cells during washing, rapid biomass generation results in fast glucose consumption. With this implementation, the batch is completed in 27 hr; in comparison, the mutant batch time was 35 hr.

Lactate is the main product of the fermentation (Figure 5.22.D). In the first phase, lactate concentration reaches approximately 15 mM after 7 hr. Once the cells are washed with fresh medium, the old medium is discarded, and the second phase is conducted in fresh medium without IPTG. Thus, at the beginning of the second phase (i.e., 10 hr), the concentration of the products is zero. However, in Figure 5.22.D, we consider the cumulative lactate production, and the product concentrations at the beginning of the second phase start from the values they have at 7 hr. The final lactate concentration, 90 mM, is lower than that of the mutant (110 mM). This is because more biomass is produced in these experiments compared to the mutant (final O.D. here is 0.84, compared to only 0.62 for the mutant). Formate, acetate and succinate are produced in concentrations below 1 mM. Since no acetate and ethanol were detected in the fermentation, it is not evident that the genes adh and pta are expressed. The fast growth in the last case can be attributed to several factors. One possible reason is that lactate is removed from the media after 7 hr. Thus, the product inhibition in the last pTOG characterization is lower than in the mutant, since the lactate concentration is lower. Also, it is possible that the genes adh and pta are expressed at very low levels. In the next section, we review the characterizations conducted here.

5.4.4 Synopsis of the batch characterizations

Here, we compare the characteristics of the experiments conducted in 50 mM of glucose. We review the product yields, lactate productivity and titer, growth rate, batch time and final optical density in Table 5.3. The carbon balances of the batches are also included in Appendix F.

The wild-type strain produces a mix of products (formate, acetate and ethanol account for 90% of the products on a molar basis), while the lactate yield is much lower (approximately 0.23 mmol/mmol; lactate yield is emphasized in bold). The growth rate is shown in the second group of parameters in Table 5.3. The growth rate of the wild-type strain is the highest among the three strains (0.23 hr−1), and it results in a short batch time (10 hr) and a high final O.D. of 2.2.

The mutant, or static strategy, shows the highest lactate yield among the four experiments (1.97 mmol/mmol), while acetate and ethanol are zero due to the deletion of genes adh and pta. The formate yield is very close to the one reported in Fong et al. (2005). The disadvantage of this strategy is the slow growth (0.07 hr−1), which results in lower biomass production and a much longer batch time (0.62 O.D. and 35 hr). The productivity value (3.15 mmol/hr/l) can be improved if we achieve rapid biomass generation at the beginning of the batch.

The implementation of the plasmid pTOG(adh, pta) with heat shock at the beginning of the batch showed an unexpected decrease in the growth rate and productivity. The batch was significantly longer than that of the mutant (55 compared to 35 hr). The lactate yield and titer were slightly lower, but close to those of the mutant, and formate production was slightly higher (yield of 0.22 compared to 0.13).

Strain                      Wild-type   ∆(adh,pta)   ∆(adh,pta)        ∆(adh,pta)
Plasmid                                              pTOG(adh, pta)    pTOG(adh, pta)
Strategy                                static       dynamic           dynamic
Phase 1                                              heat shock        IPTG (0-7 hr)
Phase 2                                                                heat shock at 10 hr

Yield (mmol/mmol)
  Formate                   1.15        0.13         0.22              0.02
  Acetate                   0.71        -            -                 0.03
  Ethanol                   0.66        -            -                 -
  Succinate                 0.05        0.05         0.04              0.02
  Lactate                   0.23        1.97         1.89              1.59

Productivity (mmol/hr/l)    1.03        3.15         1.93              3.78
Titer (mM lactate)          11.3        110.3        105.9             90.6

Growth rate (hr−1)          0.23        0.07         0.04              0.20-0.08
Batch time (hr)             10          35           55                27
Final O.D.                  2.2         0.62         0.39              0.84

Table 5.3: Growth characteristics of wild-type, mutant and the mutants transformed with the pTOG(adh, pta) plasmid. The bioengineering objectives are estimated at the end of the batch from HPLC measurements. Lactate yield, productivity, and titer are in bold.

When IPTG was applied in the first phase, growth rate increased to 0.20 hr−1. Although this value was calculated based on a few points, the biomass generated at the end of the first phase was higher than that of the mutant at the same time (i.e., at 7 hr). In the second phase, the growth rate dropped close to the value of the static strategy (0.08 hr−1), and lactate was the main product with a final overall yield of 1.59 mmol/mmol. The batch time was only 27 hr, compared to 35 hr for the static strategy. The decrease in the batch time resulted in the highest productivity among the four cases.

The values of the three main bioengineering objectives (productivity, yield and titer) are emphasized in Table 5.3. The static strategy has the highest yield and titer, at the expense of low productivity. In contrast, the use of the pTOG plasmid shows a 20% increase in productivity (i.e., 3.78 compared to 3.15 mmol lactate/hr/l for the static), at the expense of lower yield. This, however, was not achieved with the proposed protocol: heat shock to induce adh and pta expression, and IPTG to induce repression. Instead, the highest productivity was obtained when we reversed the signals: IPTG first, followed by a washing step, and then heat shock. Since this result is counterintuitive, further analysis and investigation are performed in the next section.
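The productivity comparison can be checked with a short calculation. The sketch below assumes that productivity is the final lactate titer divided by the batch time, a definition the text does not spell out; under that assumption the static entry of Table 5.3 is reproduced from its titer and batch time. The function names are illustrative.

```python
def volumetric_productivity(titer_mm, batch_time_hr):
    """Average volumetric productivity in mmol/hr/l (titer divided by batch time)."""
    return titer_mm / batch_time_hr


def relative_change(new, reference):
    """Fractional change of new with respect to reference."""
    return (new - reference) / reference


# Static strategy (mutant): 110.3 mM lactate in 35 hr (Table 5.3).
static_productivity = volumetric_productivity(110.3, 35.0)
print(f"Static: {static_productivity:.2f} mmol/hr/l")

# Reported productivities: 3.78 (IPTG first, then heat shock) versus 3.15 (static).
increase = relative_change(3.78, 3.15)
print(f"Increase over static: {increase:.0%}")
```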

5.4.5 Conclusions

In our initial scheme, we proposed that we induce expression of genes adh and pta with heat shock and repression with IPTG. This implementation is practical, since heat is applied at the beginning of the batch, and IPTG is added later. The pTOG plasmid was designed so that we can apply the above induction scheme. However, in our first test of the pTOG plasmid, we noticed that growth after the initial heat shock was slower than that of the mutant, rather than faster. To investigate this inconsistency, we reversed the order of the signals: IPTG was added at the beginning of the batch, then washed out, and the cells were immediately heat shocked. Although this induction scheme seems to improve growth rate and productivity, the improvement is most likely due to decreased lactate inhibition, and not to adh and pta expression. The fact that the pTOG did not work the way we expected raises questions about the design of the plasmid pTOG(adh,pta). To troubleshoot the pTOG plasmid, the following steps are discussed in the next section:

• perform the batch characterizations with one signal at a time: IPTG throughout the entire batch in one case, and heat shock at the beginning of the batch without IPTG in the second case,

• sequence the plasmid to confirm the order of the genes and identify potential mutations,

• use flow cytometry to quantify the expression of the fluorescence protein.

5.5 Troubleshooting

5.5.1 Individual signal testing

In this section, we compare the following four cases in triplicate experiments in 100 ml bottles with 50 mM of initial glucose:

• wild-type MG1655

• mutant ∆(adh,pta)

• mutant ∆(adh,pta)-pTOG(adh,pta) with IPTG (i.e., IPTG(+))

• mutant ∆(adh,pta)-pTOG(adh,pta) without IPTG, and heat shock for 30 min at the beginning of the batch (i.e., IPTG(-)).

The time profiles of the optical density, glucose and lactate are shown in Figure 5.23. The wild-type strain grows faster and consumes glucose faster than in all other scenarios; however, it produces the least lactate. In the other three scenarios, growth and glucose consumption are slower than in the wild-type. Also, the main product of the mutant and the transformed mutants is lactate. The time profiles of the other by-products are given in Appendix H. No acetate or ethanol is produced in any case other than the wild-type. This indicates that the toggle switch did not express genes adh and pta at a significant level.

To compare the data means of all the variables, the two-sample t-test was performed for every possible pair of samples after 5 hr. The results confirm that the data for the wild-type and the three mutant scenarios are different. Comparison between the mutant and the transformed mutant with two different treatments showed that there is no difference between the means. Thus, with this experiment, we are more confident that the toggle switch does not express genes adh and pta, regardless of the signal input. To conclude, we propose that every toggle we create should be tested with both signals individually (i.e., heat shock and IPTG) to ensure that the toggle reaches both states.
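The pairwise comparison described above can be sketched as follows. This is a minimal illustration using Student's two-sample t-test with pooled variance; whether the pooled or the unequal-variance form was used here is not stated, so the choice is an assumption. The triplicate readings are hypothetical values invented for the example, not thesis data, and the critical value 2.776 is the two-sided 5% value of Student's t for 4 degrees of freedom.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(sample_a, sample_b):
    """Student's two-sample t statistic (pooled variance) and its degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    dof = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / dof
    standard_error = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error, dof


# Two-sided critical value of Student's t for alpha = 0.05 and 4 degrees of
# freedom (two triplicates).
T_CRITICAL_DOF4 = 2.776

# Hypothetical triplicate O.D. readings at one time point (not thesis data).
wild_type = [0.81, 0.85, 0.83]
mutant = [0.30, 0.33, 0.31]
iptg_plus = [0.31, 0.29, 0.33]

for label, other in (("wild-type vs mutant", wild_type), ("IPTG(+) vs mutant", iptg_plus)):
    t_stat, dof = pooled_t_statistic(other, mutant)
    verdict = "different" if abs(t_stat) > T_CRITICAL_DOF4 else "not distinguishable"
    print(f"{label}: t = {t_stat:.2f} (dof = {dof}), means {verdict}")
```

With many pairs and time points tested, a correction for multiple comparisons may be worth considering.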

[Figure 5.23 plots for wild-type, mutant, IPTG(+) and IPTG(−). A: optical density (600 nm) versus time (h). B: glucose (mM) versus time (h). C: lactate (mM) versus time (h).]

Figure 5.23: Characterization of the wild-type (square), mutant ∆(adh,pta) (circle), mutant ∆(adh,pta)-pTOG(adh,pta) with IPTG (triangle), mutant ∆(adh,pta)-pTOG(adh,pta) without IPTG (diamond). A: optical density, B: glucose, C: lactate. The average of triplicate experiments and one standard deviation are shown.

5.5.2 Sequencing

The backbone plasmids pTAK131 and pTAK132, and the pTOG(adh,pta) plasmid, were sequenced with the Sanger sequencing method by primer walking. The template DNA samples were purified using the Sigma plasmid miniprep kit and resuspended in water. The amount of template needed is 200 ng per reaction. We designed 32 primers to cover the plasmid two-fold. The sequences of all three plasmids are included in Appendix G.

The plasmid was sequenced to compare the pTOG plasmid created by Dr. H. Kobayashi with the original pTAK toggle plasmids obtained from Professor Collins' lab. The original pTAK plasmids were never sequenced, and only theoretical sequences are available online (personal communication with Dr. Timothy Gardner). Comparison of the three sequences with the theoretical sequence of the individual genes shows several single nucleotide mutations in gene cI (Figure 5.24.A). A single frame mutation was also found in gene cI in all three plasmids (Figure 5.24.B). With the comparison of the sequences, we confirm that the pTOG plasmid originates from the plasmid backbone pTAK132, and that no mutations were introduced when the pTOG plasmid was constructed. The cI gene has worked successfully in plasmids pTAK131 and pTAK132 since the publication of the toggle switch paper in 2000 (Sekine et al., 2011). Thus, we believe that these mutations were present in the original pTAK plasmids. The genes of the host organism may interfere with the plasmid genes, and the bistable expression in strain JM2.300 may become monostable in our host strain MG1655. To clarify the lack of expression of genes lacI, adh and pta, we take the following two actions:

• calculate the translation rate of gene lacI for plasmids pTAK131 and pTAK132 in the host strain MG1655 based on the RBS sequences

• quantify the expression of both plasmids with flow cytometry.

The sequencing of the plasmids pTAK131 and pTAK132 confirmed the main difference between the two plasmids, the ribosomal binding sites of the gene lacI. The theoretical analysis of Gardner et al. (2000) showed that the bistability occurs when the expression of genes lacI and cI is strong and balanced. Gardner et al. (2000) created a set of toggle plasmids with different ribosomal binding sites to control the strength of lacI expression, while keeping the same promoter and RBS for gene cI. These ribosomal binding sites for gene lacI were tested to tune the expression level of the associated repressor protein. Here, we use the ribosome binding site calculator developed by Salis et al. (2009) to estimate the protein expression level in the host strain MG1655.

The calculator uses a model based on statistical mechanics, kinetics, and thermodynamics to estimate the protein expression level for a specific strain and RBS. In strain MG1655, the expression of gene lacI in the plasmid pTAK132 is significantly lower than the expression in pTAK131. Thus, although both plasmids show bistability in strain JM2.300, it is possible that the plasmid pTAK132 can only reach the high cI state in strain MG1655. If that is the case, then genes lacI, adh and pta are always repressed. This explains why we did not observe any acetate and ethanol production in the experiments discussed in the previous section. The expression of the plasmid reporters is quantified experimentally in the next section using flow cytometry.
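The argument that weak lacI expression can collapse the bistable switch into a single state can be illustrated with the dimensionless two-repressor model of the kind analyzed by Gardner et al. (2000). The sketch below is only an illustration: the synthesis rates and cooperativities are arbitrary dimensionless values chosen for the example (not fitted to pTAK131 or pTAK132), the mapping of RBS strength onto the lacI synthesis rate is an assumption, and the function names are hypothetical.

```python
def simulate_toggle(alpha_laci, alpha_ci, u0, v0, beta=2.0, gamma=2.0,
                    dt=0.01, steps=10000):
    """Forward-Euler integration of the dimensionless toggle model.

    u: LacI level, v: cI level.
    du/dt = alpha_laci / (1 + v**beta) - u
    dv/dt = alpha_ci / (1 + u**gamma) - v
    """
    u, v = u0, v0
    for _ in range(steps):
        du = alpha_laci / (1.0 + v ** beta) - u
        dv = alpha_ci / (1.0 + u ** gamma) - v
        u, v = u + dt * du, v + dt * dv
    return u, v


def reachable_states(alpha_laci, alpha_ci):
    """Final state labels reached from a LacI-high start and a cI-high start."""
    labels = set()
    for u0, v0 in ((alpha_laci, 0.0), (0.0, alpha_ci)):
        u, v = simulate_toggle(alpha_laci, alpha_ci, u0, v0)
        labels.add("LacI high" if u > v else "cI high")
    return labels


# Illustrative, dimensionless synthesis rates (assumptions, not fitted values).
print("balanced:", sorted(reachable_states(alpha_laci=10.0, alpha_ci=10.0)))
print("weak lacI:", sorted(reachable_states(alpha_laci=1.0, alpha_ci=10.0)))
```

With balanced rates, each starting point settles in its own state, so both states are reachable. With the weaker lacI rate, both starting points end in the cI-high state, which mirrors the monostable behavior proposed for pTAK132 in MG1655.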

5.5.3 Flow cytometry

Flow cytometry is used to measure the expression level of the reporter green fluorescence protein (GFP) for the plasmid pTAK132, and the yellow fluorescence protein (YFP) for the plasmid pTAK131 (fresh stocks of both plasmids were received from Professor Collins' lab). The pTOG(adh) and pTOG(adh, pta) plasmids we used in our experiments are based on plasmid pTAK132. Since all the pTAK plasmids created in the Gardner lab were tested in the E. coli strain JM2.300, we want to test the expression levels in the MG1655 ∆(adh, pta) mutant. The only difference between the two plasmids is the ribosomal binding sites discussed in the previous section. The overall maps of the plasmids are reviewed in Figure 5.26. Genes adh and pta are also included in the figure for completeness. The results from the fluorescence-activated cell sorting (FACS) are shown in Figure 5.27. Two experiments were conducted for each plasmid: the first at 42 °C without IPTG, and the second at 30 °C with IPTG.

Figure 5.24: Gene cI sequences from plasmids pTAK131, pTAK132, and pTOG(adh,pta). A: The single nucleotide mutations on the cI gene are shown as yellow symbols at the bottom of the figure. The frame mutation is emphasized in the square box. B: The frame mutation is shown in this zoomed-in view of the area around the mutation.

Figure 5.25: The difference in the ribosomal binding site for plasmids pTAK132 (top) and pTAK131 (bottom) is highlighted in the sequences.

The analysis of the plasmid pTAK132 in strain MG1655 ∆(adh, pta) confirms our experimental observation that genes adh and pta are not expressed in the pTOG(adh, pta) plasmid, regardless of the treatment. Both heat shock and IPTG result in high expression of the GFP protein; therefore, the genes adh and pta would always be repressed when downstream of lacI. Figures 5.27.A and B show that in both cases the GFP expression is high. Although the plasmid pTAK132 is a bistable switch in strain JM2.300, it only expresses gene cI in strain MG1655. This result is supported by the expression level estimated with the RBS calculator of Salis et al. (2009) in the previous section. Gene expression of lacI is significantly lower than that of gene cI in strain MG1655, and the plasmid always expresses the operon with genes cI and GFP.

In contrast, the plasmid pTAK131 operates as a functional bistable switch in the mutant strain. In this plasmid, the reporter yellow fluorescence protein (YFP) is placed downstream of gene lacI, and YFP expression is indicative of the expression of genes adh and pta. In Figures 5.27.C and D, we show that the plasmid pTAK131 operates as expected; heat shock turns the lacI gene on, and IPTG turns the gene off. The RBS calculator also predicts that the lacI RBS used in pTAK131 is stronger than that of pTAK132.

5.5.4 Conclusions

The analysis presented in this section shows that the plasmid pTAK132 is not a functional bistable switch in strain MG1655. Thus, we confirmed the experimental observation that the plasmid pTOG(adh, pta), which is based on pTAK132, does not operate as a dynamic ON-OFF controller in the mutant strain MG1655 ∆(adh, pta). As a result, the genes adh and pta are not expressed. In contrast, the plasmid pTAK131 is a bistable switch in the MG1655 strain, since it is able to reach both states. Therefore, we suggest that the next step is to express the genes adh and pta with the backbone plasmid pTAK131.

[Figure 5.26 schematic: plasmid maps of pTAK132 (lacI, cI, GFP) and pTAK131 (lacI, cI, YFP), with genes adh and pta indicated on each map.]

Figure 5.26: Map of plasmids pTAK132 and pTAK131. The genes adh and pta are also shown here to understand the expression pattern in the pTOG(adh,pta) plasmid.

[Figure 5.27 panels. A: pTAK132, 42 °C: ON. B: pTAK132, IPTG: ON. C: pTAK131, 42 °C: ON. D: pTAK131, IPTG: OFF.]

Figure 5.27: Analysis of plasmids pTAK132 (A and B) and pTAK131 (C and D) in strain MG1655 by flow cytometry showing two treatments: heat shock at 42 °C without IPTG is tested in A and C; IPTG at 30 °C is tested in B and D. Plasmid pTAK132 always expresses GFP with both treatments; thus, it is a monostable switch and genes adh and pta are never expressed. Plasmid pTAK131 expresses YFP with heat shock as expected. Also, YFP is repressed in the presence of IPTG, also as expected.

Chapter 6

Conclusions and recommendations for future work

6.1 Conclusions

The work presented in this thesis involves the development of a novel synthetic biology method for improving the productivity of metabolic engineering applications. The Process Systems Engineering community has approached the bioprocess optimization problem at the reactor level since the 1970s. With the development of rapid and reliable genetic engineering methods, metabolic engineering emerged in the early 1990s to develop the optimization of metabolic networks. Traditional metabolic engineering methods include static manipulations over time, such as gene deletion, up or downregulation and the introduction of heterologous genes to improve the yield of a desired product. However, these approaches do not consider the trade-off between productivity and yield. Productivity is a crucial factor for every bioprocess, and must not be neglected. Many of the strain designs developed with the aforementioned methods are not practical because they do not reach productivity targets for industrial applications. As a result, the cost of the bioprocess is significantly higher than that for conventional chemical processes, as is the price of the chemical produced.


To address the trade-off between productivity and yield in strain development, Gadkar et al. (2005) developed the hypothesis that dynamic gene expression can improve the productivity of a process. In the proposed dynamic strategy, instead of eliminating byproducts contributing to growth, the genes associated with biomass formation are initially expressed at the wild-type level. Once enough biomass is generated, the genes are turned off, and production of the desired chemical occurs at a rate faster than that for the traditional gene deletion approach.

Our hypothesis is that we can experimentally implement the theoretical work presented by Gadkar et al., using the genetic toggle switch constructed by Gardner et al. (2000). Based on this hypothesis, the individual contributions of this thesis are:

• First, we designed a genetic circuit that applies the optimal dynamic strategy of Gadkar et al. (2005). The proposed design is based on a mathematical model that couples a quorum-sensing module to the genetic toggle switch. The switch expresses the target genes from a non-native promoter in the required ON-OFF fashion. Model simulations showed density-dependent gene expression with the optimal dynamic characteristics required.

• Then, we analyzed the model to understand the influence of the dynamics on the variables of interest. First, we showed that there is an optimal switching time that maximizes the productivity of the process. Using sensitivity analysis, we identified the range of the genetic circuit parameters that gives the maximum productivity. The sensitivity-based model analysis gave insight into the implementation and optimization of the dynamic control strategy.

• Finally, we utilized the toggle switch pTAK132 to manipulate the genes adh and pta in a mutant lacking these genes. Our hypothesis was that expression of adh and pta with the toggle plasmid would restore the wild-type levels; however, this was not observed in the experiments. Our fermentation experiments, supported by flow cytometry data, showed that plasmid pTAK132 was not able to express adh and pta at all. In addition, sequencing of the plasmid and calculations based on the ribosomal binding site confirmed that the expression would be low. An alternative toggle plasmid, pTAK131, obtained from Professor Collins' lab, was also tested. Based on the sequence, higher expression of the lacI gene is expected with plasmid pTAK131. Flow cytometry showed that the plasmid pTAK131 is indeed a bistable switch in the strain MG1655. The comparison of the toggle plasmids pTAK131 and pTAK132 in the "Troubleshooting" section was conducted by the Ph.D. candidate Naveen Venayak, who has undertaken the project. These results, although inconclusive, are the first step towards the implementation of the dynamic gene expression strategy for metabolic engineering.

In this thesis, we presented a novel tool for bioprocess optimization. The synthetic biology application enables the re-engineering of E. coli to effectively manipulate the carbon distribution between biomass and product. The dynamic control of gene expression can be applied in addition to existing process-level optimization methods. In theory, it can be applied to any fermentation and strain, but it is most appropriate for slowly growing strains. In these cases, the dynamic method can significantly improve the productivity of bioprocesses.

6.2 Recommendations for future work

The dynamic gene expression model developed here has the potential to improve the productivity of bioprocesses by boosting the initial growth of a strain. Also, the dynamic method enables the engineer to adjust productivity and yield values. Mathematical analysis of the method showed that productivity can increase by as much as 100% compared to traditional static strains. These results suggest that many strain designs considered infeasible at the present time can become profitable for industrial applications if the dynamic strategy is utilized. Accordingly, the commercialization of several strains will be possible, and the price of biochemicals will be more competitive with that of conventional chemicals.

Next, we summarize some recommendations to complete this project, and some ideas for the practical implementation of the strategy at an industrial scale.

• Use the pTAK131 plasmid as the backbone to manipulate genes adh and pta: The flow cytometry experiments showed that the plasmid pTAK131 is bistable in the MG1655 strain. In agreement with Gardner et al. (2000), heat shock induces expression of the gene lacI, and IPTG represses it. Thus, if we introduce the genes adh and pta downstream of lacI, we will likely be able to express adh and pta in an ON-OFF fashion. The construction and characterization of the pTOG(adh, pta) plasmid with pTAK131 as the backbone is currently in progress.

• Quantify the level of adh and pta expression: In our model, we assumed that the toggle plasmid expresses the manipulated genes at a level that provides the same flux value as the wild-type. Large deviations from the wild-type flux may cause a suboptimal growth rate. Thus, fine-tuning of the expression level in the toggle plasmid may be required.

• Create toggle plasmids that use inexpensive inducers in a practical way: The application of heat shock and the addition of IPTG can be challenging in a large-scale fermentation process. Temperature dynamics in a reactor may slow the dynamics of the genetic switch. Also, IPTG contributes to the operating costs of the bioprocess and decreases the growth rate. Thus, it is useful to create a toggle plasmid that uses low-cost inducers to turn the genes on and off effectively.

• Integrate the toggle switch with the quorum sensing from V. fischeri: In the integrated genetic circuit, the quorum-sensing module induces the switch from the ON to the OFF state. Thus, the need for inducers is eliminated if we incorporate the quorum-sensing module. If the cells initially express the manipulated genes, quorum sensing will repress the genes once a critical density is reached. The V. fischeri quorum sensing system used in Kobayashi et al. (2004) turns off gene expression when the optical density reaches 0.06. This, however, is not practical for an industrial application. Typically, in industrial fermentations, cells are grown to a high optical density (on the order of 30 to 50) and are then switched to the production phase.

• Integrate the genes involved in the dynamic expression on the chromosomal DNA: Expression of multiple large plasmids places a metabolic burden on the cells and decreases the growth rate. To overcome this problem, the genes associated with the quorum sensing and the toggle switch can be integrated on the chromosome.

• Induce the repression at high optical density: Quorum sensing was first discovered in V. fischeri in the late 1960s. Since then, the phenomenon has been observed in other organisms, such as E. coli, Salmonella enterica, and Pseudomonas aeruginosa. Some systems showed induction of the quorum sensing genes at optical densities significantly higher than in V. fischeri. These systems can be useful for induction of genes at optical densities that are relevant to industrial applications.

• Apply the dynamic strategy for a product of high industrial interest: Lactic acid is the monomer for producing the biodegradable polymer polylactic acid (PLA). Currently, lactic acid is primarily produced by fermenting starch with lactic acid bacteria, which have simple growth requirements and grow relatively fast.

The dynamic method is generally applicable to any bioprocess; however, it is best suited to cases in which the growth rate is the bottleneck. Many metabolically engineered strains show improved product yield at the cost of impaired growth and productivity. Some of them show no growth and are impractical for a production process. These strains are excellent candidates for the dynamic metabolic engineering strategy. Products such as succinate and butanediol have a large market, and the productivity of the processes can be improved with a dynamic approach.

• Perform a techno-economic evaluation of the dynamic strategy and compare with current methods: The dual-phase fermentation reviewed in chapter 2.6 for lactate production is a straightforward solution for boosting the growth rate and the productivity. The productivity increased 10-fold when air or oxygen was supplied in the first phase of the fermentation, from approximately 0.3 g/l/h for purely anaerobic fermentations to values above 3 g/l/h for dual-phase fermentations in E. coli. Sparging with air or oxygen can be challenging and costly in large-scale fermentations, adding to the operating cost of a production process. A techno-economic comparison of the two methods is necessary to evaluate the feasibility and the profitability of the methods.

Bibliography

Anesiadis, N., W. R. Cluett, and R. Mahadevan. Model-driven design based on sensitivity analysis for a synthetic biology application. Computer-Aided Process Engineering-ESCAPE, 21:1446–1450, 2011.

Anesiadis, N., H. Kobayashi, W. R. Cluett, and R. Mahadevan. Analysis and design of

a genetic circuit for dynamic metabolic engineering. ACS Synthetic Biology, 2:442–452,

2013.

Anesiadis, N., W. R. Cluett, and R. Mahadevan. Dynamic metabolic engineering for increasing bioprocess productivity. Metabolic Engineering, 10(5):255–266, 2008.

Bastian, S., X. Liu, J. T. Meyerowitz, C. D. Snow, M. M. Y. Chen, and F. H. Arnold. Engineered ketol-acid reductoisomerase and alcohol dehydrogenase enable anaerobic 2-methylpropan-1-ol production at theoretical yield in Escherichia coli. Metabolic Engineering, 13(3):345–52, 2011.

Basu, S., Y. Gerchman, C. H. Collins, F. H. Arnold, and R. Weiss. A synthetic multicellular

system for programmed pattern formation. Nature, 434(7037):1130–4, April 2005.

Biliouris, K., P. Daoutidis, and Y. N. Kaznessis. Stochastic simulations of the tetracycline

operon. BMC Systems Biology, 5(1):9, 2011.

Burgard, A. P., P. Pharkya, and C. D. Maranas. Optknock: a bilevel programming framework for identifying gene knockout strategies for microbial strain optimization. Biotechnology and Bioengineering, 84(6):647–57, 2003.

Callura, J. M., C. R. Cantor, and J. J. Collins. Genetic switchboard for synthetic biology

applications. Proceedings of the National Academy of Sciences of the United States of

America, 109(15):5850–5, 2012.

Carothers, J. M., J. Goler, D. Juminaga, and J. D. Keasling. Model-driven engineering

of RNA devices to quantitatively program gene expression. Science, 334(6063):1716–9,

2011.

Causey, T. B., K. T. Shanmugam, L. P. Yomano, and L. O. Ingram. Engineering escherichia

coli for efficient conversion of glucose to pyruvate. Proceedings of the National Academy

of Sciences of the United States of America, 101(8):2235–2240, 2004.

Chandran, D., W. B. Copeland, S. C. Sleight, and H. M. Sauro. Mathematical modeling

and synthetic biology. Drug Discovery Today: Dis. Models, 5:299–309, 2008.

Chang, D., H. Jung, and J. Rhee. Homofermentative Production of D- or L-Lactate in Metabolically Engineered Escherichia coli RR1. Applied and Environmental Microbiology, 65(4):1384–1389, 1999.

Chemler, J., Z. L. Fowler, K. P. McHugh, and M. G. Koffas. Improving NADPH availability

for natural product biosynthesis in Escherichia coli by metabolic engineering. Metabolic

Engineering, 12(2):96–104, 2010.

Cherry, J. and F. Adler. How to make a biological switch. Journal of Theoretical Biology,

203(2):117 – 133, 2000.

Chu, Y., A. Jayaraman, and J. Hahn. Parameter sensitivity analysis of IL-6 signalling pathways. Engineering and Technology, (6):342–352, 2007.

Collins, C. H., F. H. Arnold, and J. R. Leadbetter. Directed evolution of Vibrio fischeri

LuxR for increased sensitivity to a broad spectrum of acyl-homoserine lactones. Molecular

Microbiology, 55(3):712–23, 2005.

Cuthrell, J. E. and L. T. Biegler. Simultaneous optimization and solution methods for batch reactor control profiles. Computers in Chemical Engineering, 13(1):49–62, 1989.

Demirci, A. and A. Pometto. Enhanced production of D (-)-lactic acid by mutants of

Lactobacillus delbrueckii ATCC 9649 *. Journal of Industrial Microbiology, 11(2889):

23–28, 1992.

Dien, B. S., N. N. Nichols, and R. J. Bothast. Recombinant Escherichia coli engineered

for production of L-lactic acid from hexose and pentose sugars. Journal of Industrial

Microbiology & Biotechnology, (July):259–264, 2001.

Egland, K. A. and E. P. Greenberg. Quorum sensing in Vibrio fischeri : elements of the

luxI promoter. Molecular Microbiology, 31(4):1197–1204, 1999.

Ehsani, M., M. R. Fernández, J. Biosca, and S. Dequin. Reversal of coenzyme specificity of 2,3-butanediol dehydrogenase from Saccharomyces cerevisae and in vivo functional analysis. Biotechnology and Bioengineering, 104(2):381–9, 2009.

Eiteman, M., S. Lee, and E. Altman. A co-fermentation strategy to consume sugar mixtures

effectively. Journal of Biological Engineering, 2:3, 2008.

Elowitz, M. B., A. J. Levine, E. D. Siggia, and P. S. Swain. Stochastic gene expression in

a single cell. Science, pages 1183–1186, 2002.

Feng, X., S. Hooshangi, D. Chen, G. Li, R. Weiss, and H. Rabitz. Optimizing genetic

circuits by global sensitivity analysis. Biophysical Journal, 87(4):2195–202, 2004.

Fong, S. S., A. P. Burgard, C. D. Herring, E. M. Knight, F. R. Blattner, C. D. Maranas, and B. O. Palsson. In silico design and adaptive evolution of Escherichia coli for production of lactic acid. Biotechnology and Bioengineering, 91(5):643–8, 2005.

Fung, E., W. W. Wong, J. K. Suen, T. Bulter, S. Lee, and J. C. Liao. A synthetic gene-

metabolic oscillator. Biomedical Engineering, pages 1–8, 2004.

Gadkar, K., R. Mahadevan, and F. Doyle. Optimal genetic manipulations in batch biore-

actor control. Automatica, 42(10):1723–1733, 2006.

Gadkar, K. G., F. J. Doyle, J. S. Edwards, and R. Mahadevan. Estimating optimal profiles

of genetic alterations using constraint-based models. Biotechnology and Bioengineering,

89:243–251, 2005a.

Gadkar, K. G., F. J. Doyle, J. S. Edwards, and R. Mahadevan. Estimating optimal profiles

of genetic alterations using constraint-based models. Biotechnology and Bioengineering,

89(2):243–51, 2005b.

Gardner, T. S., C. R. Cantor, and J. J. Collins. Construction of a genetic toggle switch in

Escherichia coli. Nature, 403(6767):339–42, 2000.

Govaerts, W., Y. Kuznetsov, A. Dhooge, H. G. E. Meijer, W. Mestrom, A. M. Riet, and

B. Sautois. MATCONT and CL MATCONT : Continuation toolboxes in matlab. 2006.

Guet, C. C., M. B. Elowitz, W. Hsing, and S. Leibler. Combinatorial synthesis of genetic

networks. Science, 296(5572):1466–70, 2002.

Hansen, J. M., H. C. Lim, and J. Hong. Optimization of autocatalytic reactions. Chemical

Engineering Science, 48(13):2375–2390, 1993.

Haseltine, E. L. and F. H. Arnold. Synthetic gene circuits: design with directed evolution.

Annual Review of Biophysics and Biomolecular Structure, 36(1):1–19, 2007.

Hjersted, J. L. and M. A. Henson. Optimization of fed-batch Saccharomyces cerevisiae fermentation using dynamic flux balance models. Biotechnology Progress, 22(5):1239–1248, 2006.

Ho, Y., A. Kiparissides, E. N. Pistikopoulos, and A. Mantalaris. Computational approach for understanding and improving GS-NS0 antibody production under hyperosmotic conditions. Journal of Bioscience and Bioengineering, 113(1):88–98, 2012.

Hofvendahl, K., C. Akerberg, G. Zacchi, and B. Hahn-Hagerdal. Simultaneous enzymatic

wheat starch saccharification and fermentation to lactic acid by lactococcus lactis. Applied

Microbiology and Biotechnology, 52(2):163–169, 1999.

Holtz, W. J. and J. D. Keasling. Engineering static and dynamic control of synthetic

pathways. Cell, 140(1):19–23, 2010.

Huang, D., W. J. Holtz, and M. M. Maharbiz. A genetic bistable switch utilizing nonlinear

protein degradation. Journal of Biological Engineering, 6(1):9, 2012.

Ingalls, B. Mathematical Modelling in Systems Biology : An Introduction. 2013.

Isaacs, F. J., J. Hasty, C. R. Cantor, and J. J. Collins. Prediction and measurement of

an autoregulatory genetic module. Proceedings of the National Academy of Sciences, 100

(13):7714–7719, 2003.

Joshi, A. and B. O. Palsson. Metabolic dynamics in the human red cell. Part IV–Data

prediction and some model computations. Journal of Theoretical Biology, 142(1):69–85,

1990.

Kaplan, H. B. and E. P. Greenberg. Diffusion of autoinducer is involved in regulation of

the Vibrio fischeri luminescence system. Journal of Bacteriology, 163:1210–1214, 1985.

Kiparissides, A., S. S. Kucherenko, A. Mantalaris, and E. N. Pistikopoulos. Global sensitivity analysis challenges in biological systems modeling. Industrial and Engineering Chemistry Research, 48(15):7168–7180, 2009.

Kiparissides, A., M. Koutinas, C. Kontoravdi, A. Mantalaris, and E. N. Pistikopoulos. Closing the loop in biological systems modeling: From the in silico to the in vitro. Automatica, 47(6):1147–1155, 2011a.

Kiparissides, A., M. Koutinas, T. Moss, J. Newman, E. N. Pistikopoulos, and A. Mantalaris.

Modelling the Delta1/Notch1 pathway: in search of the mediator(s) of neural stem cell

differentiation. PloS one, 6(2):e14668, 2011b.

Kobayashi, H., M. Kaern, M. Araki, K. Chung, T. S. Gardner, C. R. Cantor, and J. J. Collins. Programmable cells: interfacing natural and engineered gene networks. Proceedings of the National Academy of Sciences of the United States of America, 101(22):8414–9, 2004.

Kontoravdi, C., S. P. Asprey, E. N. Pistikopoulos, and A. Mantalaris. Application of global

sensitivity analysis to determine goals for design of experiments : an example study on

antibody-producing cell cultures. Biotechnology Progress, 21(4):1128–1135, 2005.

Kyla-Nikkila, K., M. Hujanen, M. Leisola, and A. Palva. Metabolic Engineering of Lactobacillus helveticus CNRZ32 for Production of Pure L-Lactic Acid. Applied and Environmental Microbiology, 66:3835–3841, 2000.

Lee, J. W., D. Na, J. M. Park, J. Lee, S. Choi, and S. Y. Lee. Systems metabolic engineering

of microorganisms for natural and non-natural chemicals. Nature , 8(6):

536–46, 2012.

Lee, J. and W. F. Ramirez. Optimal fed-batch control of induced foreign protein production

by recombinant bacteria. AIChE Journal, 40(5):899–907, 1994.

Lee, K. H., J. H. Park, T. Y. Kim, H. U. Kim, and S. Y. Lee. Systems metabolic engineering of Escherichia coli for L-threonine production. Molecular Systems Biology, 3(149):149, 2007.

Lin, H., G. N. Bennett, and K. Y. San. Genetic reconstruction of the aerobic central metabolism in Escherichia coli for the absolute aerobic production of succinate. Biotechnology and Bioengineering, 89:148–156, 2005.

Lindner, S. N., G. M. Seibold, A. Henrich, R. Krämer, and V. F. Wendisch. Phosphotransferase system-independent glucose utilization in Corynebacterium glutamicum by inositol permeases and glucokinases. Applied and Environmental Microbiology, 77(11):3571–81, 2011.

Liu, T., S. Miura, M. Yaguchi, T. Arimura, E. Y. Park, and M. Okabe. Scale-up of L-lactic

acid production by mutant strain Rhizopus sp. MK-96-1196 from 0.003 m3 to 5 m3 in

airlift bioreactors. Journal of Bioscience and Bioengineering, 101(1):9–12, 2006.

Lun, D. S., G. Rockwell, N. J. Guido, M. Baym, J. Kelner, B. Berger, J. E. Galagan, and

G. M. Church. Large-scale identification of genetic design strategies using local search.

Molecular Systems Biology, 5(296):296, 2009.

Ma, L. and P. Iglesias. Quantifying robustness of biochemical network models. BMC

Bioinformatics, 3(1):38, 2002.

Mahadevan, R., J. S. Edwards, and F. J. Doyle. Dynamic flux balance analysis of diauxic

growth in Escherichia coli. Biophysical Journal, 83(3):1331–40, 2002.

Manefield, M., T. B. Rasmussen, M. Henzter, J. B. Andersen, P. Steinberg, S. Kjelleberg,

and M. Givskov. Halogenated furanones inhibit quorum sensing through accelerated luxr

turnover. Microbiology, 148:1119–1127, 2002.

McCabe, K. M., E. J. Lacherndo, I. Albino-Flores, E. Sheehan, and M. Hernandez. Laci(ts)-

regulated expression as an in situ intracellular biomolecular thermometer. Applied Envi-

ronmental Microbiology, 77:2863, 2011.

Miller, G. M., B. A. Ogunnaike, J. S. Schwaber, and R. Vadigepalli. Robust dynamic balance of AP-1 transcription factors in a neuronal gene regulatory network. BMC Systems Biology, 4:171, 2010.

Mišković, L. and V. Hatzimanikatis. Modeling of uncertainties in biochemical reactions. Biotechnology and Bioengineering, 108(2):413–23, 2011.

Moon, T. S., C. Lou, A. Tamsir, B. C. Stanton, and C. Voigt. Genetic programs constructed

from layered logic gates in single cells. Nature, 491(7423):249–53, 2012.

Morohashi, M., A. E. Winn, M. T. Borisuk, H. Bolouri, J. Doyle, and H. Kitano. Robustness

as a measure of plausibility in models of biochemical networks. Journal of Theoretical

Biology, 216(1):19–30, 2002.

NNFCC. Lactic acid fact sheet, 2013.

Orth, J. D., T. M. Conrad, J. Na, J. Lerman, H. Nam, A. M. Feist, and B. O. Palsson. A comprehensive genome-scale reconstruction of Escherichia coli metabolism–2011. Molecular Systems Biology, 7(535):535, 2011.

Park, J. H., K. H. Lee, T. Y. Kim, and S. Y. Lee. Metabolic engineering of Escherichia

coli for the production of L-valine based on transcriptome analysis and in silico gene

knockout simulation. Proceedings of the National Academy of Sciences of the United

States of America, 104(19):7797–802, 2007.

Pharkya, P. and C. D. Maranas. An optimization framework for identifying reaction activation/inhibition or elimination candidates for overproduction in microbial systems. Metabolic Engineering, 8:1–13, 2006.

Porro, D., M. M. Bianchi, L. Brambilla, D. Bolzani, V. Carrera, C. Liu, B. M. Ranzi, L. Frontali, R. Menghini, J. Lievense, and L. Alberghina. Replacement of a metabolic pathway for large-scale production of lactic acid from engineered . Applied and Environmental Microbiology, 65(9):4211–4215, 1999.

Presser, K. A., D. A. Ratkowsky, and T. Ross. Modelling the growth rate of Escherichia coli as a function of pH and lactic acid concentration. Applied and Environmental Microbiology, 63:2355–2360, 1997.

Qian, Z.-G., X. Xia, and S. Y. Lee. Metabolic engineering of Escherichia coli for the

production of putrescine: a four carbon diamine. Biotechnology and Bioengineering, 104

(4):651–62, 2009.

Ranganathan, S., P. F. Suthers, and C. D. Maranas. OptForce: an optimization procedure for identifying all genetic manipulations leading to targeted overproductions. PLoS Computational Biology, 6(4):e1000744, 2010.

Rouwenhorst, R. J., J. F. Jzn, W. Scheffers, and J. P. van Dijken. Determination of protein concentration by total organic carbon analysis. Journal of Biochemical and Biophysical Methods, 22(2):119–28, 1991.

Saitoh, S., N. Ishida, T. Onishi, E. Nagamori, K. Kitamoto, K. Tokuhiro, and H. Takahashi. Genetically engineered wine yeast produces a high concentration of L-lactic acid of extremely high optical purity. Applied and Environmental Microbiology, 71(5):2789–2792, 2005.

Salis, H. M., E. A. Mirsky, and C. A. Voigt. Automated design of synthetic ribosome

binding sites to control protein expression. Nature Biotechnology, 27(10):946–950, 2009.

Saltelli, A. Making best use of model evaluations to compute sensitivity indices. Computer

Physics Communications, 145:280–297, 2002.

Saltelli, A., M. Ratto, T. Andres, F. Campolongo, J. Cariboni, D. Gatelli, M. Saisana, and

S. Tarantola. Global sensitivity analysis: the primer. Wiley, 2008.

San, K. Y. and G. Stephanopoulos. A note on the optimality criteria for maximum biomass production in a fed-batch fermentor. Biotechnology Bioengineering, 1261–1264, 1984.

San, K. Y. and G. Stephanopoulos. Optimization of fed-batch penicillin fermentation: a

case of singular optimal control with state constraints. Biotechnology Bioengineering, 34:

72–78, 1989.

Sauer, M., D. Porro, D. Mattanovich, and P. Branduardi. Microbial production of organic

acids: expanding the markets. Trends in Biotechnology, 26(2):100–8, 2008.

Sekine, R., M. Yamamura, S. Ayukawa, K. Ishimatsu, S. Akama, M. Takinoue, M. Hagiya,

and D. Kiga. Tunable synthetic phenotypic diversification on waddington’s landscape

through autonomous signaling. Proceedings of the National Academy of Sciences, 108

(44):17969–17973, 2011.

Setty, Y., E. Mayo, M. G. Surette, and U. Alon. Detailed map of a cis-regulatory input function. Proceedings of the National Academy of Sciences of the United States of America, 100(13):7702–7, 2003.

Shen, C. R., E. I. Lan, Y. Dekishima, A. Baez, K. M. Cho, and J. C. Liao. Driving

forces enable high-titer anaerobic 1-butanol synthesis in Escherichia coli. Applied and

Environmental Microbiology, 77(9):2905–15, 2011.

Sobol, I. M. Global sensitivity indices for nonlinear mathematical models and their Monte

Carlo estimates. Mathematics and Computers in Simulation, 55:271–280, 2001.

Soccol, C. R., B. Marin, M. Raimbault, and J. M. Lebeault. Potential of solid state fermentation for production of L(+)-lactic acid by Rhizopus oryzae. Applied Microbiology and Biotechnology, 41(3):286–290, 1994.

Tamsir, A., J. J. Tabor, and C. Voigt. Robust multicellular computing using genetically

encoded NOR gates and chemical ’wires’. Nature, 469(7329):212–5, 2011.

Tokuhiro, K., N. Ishida, E. Nagamori, S. Saitoh, T. Onishi, A. Kondo, and H. Takahashi. Double mutation of the PDC1 and ADH1 genes improves lactate production in the yeast Saccharomyces cerevisiae expressing the bovine lactate dehydrogenase gene. Applied Microbiology and Biotechnology, 82(5):883–90, 2009.

Tsao, C.-Y., S. Hooshangi, H. Wu, J. J. Valdes, and W. E. Bentley. Autonomous induction

of recombinant proteins by minimally rewiring native quorum sensing regulon of E. coli.

Metabolic Engineering, 12(3):291–7, 2010.

Van Riel, N. A. Dynamic modeling and analysis of biochemical networks: mechanism-based

models and model-based experiments. Briefings in Bioinformatics, pages 364–374, 2006.

Varma, A. and B. O. Palsson. Stoichiometric flux balance models quantitatively predict

growth and metabolic by-product secretion in wild-type Escherichia coli W3110. Applied

and Environmental Microbiology, 60(10):3724, 1994.

Wang, B., R. I. Kitney, N. Joly, and M. Buck. Engineering modular and orthogonal genetic

logic gates for robust digital-like synthetic biology. Nature Communications, 2:508, 2011.

Weingart, C. L., C. E. White, S. Liu, Y. Chai, H. Cho, C. S. Tsai, Y. Wei, N. R. Delay,

M. R. Gronquist, A. Eberhard, and C. S. Winans. Direct binding of the quorum sensing

regulator CepR of Burkholderia cenocepacia to two target promoters in vitro. Molecular

Microbiology, 57:452–467, 2005.

Whitehead, N. A., A. M. Barnard, H. Slater, N. J. Simpson, and G. P. Salmond. Quorum-

sensing in Gram-negative bacteria. FEMS Microbiology Reviews, 25:365–404, 2001.

Yang, L., W. R. Cluett, and R. Mahadevan. EMILiO: a fast algorithm for genome-scale

strain design. Metabolic Engineering, 13(3):272–81, 2011.

You, L., R. S. III Cox, R. Weiss, and F. H. Arnold. Programmed population control by

cell-cell communication and regulated killing. Nature, 428:868–871, 2004.

Zhao, J., Y. Fang, T. D. Scheibe, D. R. Lovley, and R. Mahadevan. Modeling and sensitivity analysis of electron capacitance for Geobacter in sedimentary environments. Journal of Contaminant Hydrology, 112(1-4):30–44, 2010.

Zhao, J., T. D. Scheibe, and R. Mahadevan. Model-based analysis of the role of biological,

hydrological and geochemical factors affecting uranium bioremediation. Biotechnology

and Bioengineering, 108(7):1537–48, 2011.

Zheng, Y. and A. Rundell. Comparative study of parameter sensitivity analyses of the

TCR-activated Erk-MAPK signalling pathway. IEE Proceedings in Systems Biology, 153

(4):201–211, 2006.

Zheng, Y. and G. Sriram. Mathematical modeling: bridging the gap between concept and

realization in synthetic biology. Journal of Biomedicine & Biotechnology, 2010:16, 2010.

Zhou, L., Z. Zuo, X. Chen, D. Niu, K. Tian, B. Prior, W. Shen, G. Shi, S. Singh, and Z. Wang. Evaluation of genetic manipulation strategies on D-lactate production by Escherichia coli. Current Microbiology, 62(3):981–9, 2011.

Zhou, L., D. Niu, K. Tian, X. Chen, B. Prior, W. Shen, G. Shi, S. Singh, and Z. Wang.

Genetically switched D-lactate production in Escherichia coli. Metabolic Engineering, 14

(5):560–8, 2012.

Zhou, S., L. P. Yomano, K. T. Shanmugam, and L. O. Ingram. Fermentation of 10% (w/v)

sugar to D(-)-lactate by engineered Escherichia coli B. Biotechnology Letters, 27(23-24):

1891–6, 2005.

Zhou, S., T. B. Grabar, K. T. Shanmugam, and L. O. Ingram. Betaine tripled the volumetric

productivity of D(-)-lactate by Escherichia coli strain SZ132 in mineral salts medium.

Biotechnology Letters, 28(9):671–6, 2006a.

Zhou, S., K. T. Shanmugam, L. P. Yomano, T. B. Grabar, and L. O. Ingram. Fermentation of 12% (w/v) glucose to 1.2 M lactate by Escherichia coli strain SZ194 using mineral salts medium. Biotechnology Letters, 28(9):663–70, 2006b.

Zhou, S., T. B. Causey, A. Hasona, K. T. Shanmugam, and L. O. Ingram. Production of optically pure D-lactic acid in mineral salts medium by metabolically engineered Escherichia coli W3110. Applied and Environmental Microbiology, 69(1):399–407, 2003.

Zhu, Y., M. Eiteman, K. DeWitt, and E. Altman. Homolactate fermentation by metabolically engineered Escherichia coli strains. Applied and Environmental Microbiology, 73(2):456–64, 2007.

Zhu, Y., M. Eiteman, and E. Altman. Indirect monitoring of acetate exhaustion and cell

recycle improve lactate production by non-growing Escherichia coli. Biotechnology Letters,

30(11):1943–6, 2008.

Zhuang, K., L. Yang, W. R. Cluett, and R. Mahadevan. Dynamic strain scanning optimization: an efficient strain design strategy for balanced yield, titer, and productivity. DySScO strategy for strain design. BMC Biotechnology, 13(1):8, 2013.

Appendix A

Strain design for serine production

Flux      Value (mmol/gDW/h)
PGCD      13.5927
PTAr      -0.1344
ACS        0.0986
PDH        1.6533
MTHFD      0.3518
PFL        0.0437
TRPAS2    -0.0277

Table S.1: Values of fine-tuned fluxes for the serine-producing strain design by the EMILiO algorithm.

Appendix B

Matlab code

The Matlab code associated with this thesis is located on the Sunfire server at the following path: /data2/nikanesiad/P hD/2012/Comp P aper Batch/V glc 10.

B.1 M-files for Chapter 3

B.1.1 Dynamics of the genetic circuit (section 3.1.3)

The first m-file, mod_solv.m, integrates the quorum sensing and toggle switch equations defined in the m-file QS_TS.m.

% File: mod_solv.m
% Integrates the differential equations in QS_TS.m

% Parameter approximations
aL1    = 0.06 ;          % mM hr^(-1)
aL2    = 0.06 ;
aC     = 0.12 ;          % 0.12
betaL  = 0.0008 ;        % mM
betaC  = 0.000008 ;
gammaC = 4.152 ;         % hr^(-1) 4.152
gammaA = 0.6 ;
gammaL = 1.386 ;
gammaR = 1.386 ;
thetaR = 1e-05 ;         % mM
n1     = 2 ;             % Hill coefficients
n3     = 2 ;
rhoR   = 3.1*10^10 ;     % mM^(-3) hr^(-1) 3.1*10^10
k      = 0.27 ;          % hr^(-1)
% Nm = 0.340*10^12 ;     % CFU/l
vA     = 4.8*10^(-16) ;  % mM*L/hr
LuxR   = 0.0005 ;        % mM 0.0005

% Integration
tspan = [0:0.25:20] ;
yinit = [1.5*10^10 0.8*10^(-05) 0 0 0.029] ;
sol = ode15s(@QS_TS,tspan,yinit,[],k,vA,gammaA,rhoR,LuxR,gammaR,aL1,...
    betaC,n1,aL2,n3,thetaR,gammaL,aC,betaL,gammaC) ;
time = sol.x ;
yvect = sol.y ;
y_1 = yvect(1,:)*6.65*10^(-13) ;  % Biomass in g/L
y_2 = yvect(2,:) ;                % AHL
y_3 = yvect(3,:) ;                % AHL/LuxR Complex
y_4 = yvect(4,:) ;                % LacI
y_5 = yvect(5,:) ;                % lambda CI

figure(1);
subplot(3,2,1,'FontSize',14)
plot(time,y_1,'k','LineWidth',4)
legend('Biomass')
ylabel('Biomass(g/L)')
subplot(3,2,3,'FontSize',14)
plot(time,y_2,'k','LineWidth',4)
legend('AHL')
ylabel('AHL(mM)')
subplot(3,2,5,'FontSize',14)
plot(time,y_3,'k','LineWidth',4)
legend('LuxR/AHL Complex')
xlabel('Time(h)')
ylabel('LuxR/AHL Complex(mM)')
subplot(3,2,2,'FontSize',14)
plot(time,y_4,'k','LineWidth',4)
legend('LacI')
ylabel('LacI(mM)')
subplot(3,2,4,'FontSize',14)
plot(time,y_5,'k','LineWidth',4)
legend('\lambdaCI')
xlabel('Time(h)')
ylabel('\lambdaCI(mM)')

The m-file with the definition of the dynamic equations, QS_TS.m, is shown below.

function dY = QS_TS(time,x,k,vA,gammaA,rhoR,LuxR,gammaR,aL1,betaC,n1,...
    aL2,n3,thetaR,gammaL,aC,betaL,gammaC)
dY = zeros(5,1) ;
dY = [k*x(1)                                  % N (cell density)
      vA*x(1)-gammaA*x(2)                     % AHL
      rhoR*LuxR^2*x(2).^2-gammaR*x(3)         % AHL/LuxR Complex
      aL1/(1+(x(5)./betaC).^n1)+aL2*x(3).^n3/(thetaR^n3+x(3).^n3)-gammaL*x(4)   % LacI
      aC/(1+(x(4)./betaL).^n3)-gammaC*x(5)];  % lambda CI
return
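For readers without Matlab, the same model can be sketched in Python using only the standard library. The block below transcribes the five equations of QS_TS.m together with the parameter values and initial conditions of mod_solv.m. The fixed-step fourth-order Runge-Kutta integrator (`rk4`) and its step size of 0.001 h are assumptions that stand in for ode15s, so the trajectories should be read as approximate rather than as a reproduction of the thesis figures.

```python
import math

# Parameter values copied from mod_solv.m (units as commented there).
P = dict(aL1=0.06, aL2=0.06, aC=0.12, betaL=0.0008, betaC=0.000008,
         gammaC=4.152, gammaA=0.6, gammaL=1.386, gammaR=1.386,
         thetaR=1e-05, n1=2, n3=2, rhoR=3.1e10, k=0.27,
         vA=4.8e-16, LuxR=0.0005)

def qs_ts(x, p=P):
    """Right-hand side of QS_TS.m: x = [N, AHL, AHL/LuxR complex, LacI, lambda CI]."""
    n_cells, ahl, cplx, lac_i, c_i = x
    d_n = p["k"] * n_cells
    d_ahl = p["vA"] * n_cells - p["gammaA"] * ahl
    d_cplx = p["rhoR"] * p["LuxR"] ** 2 * ahl ** 2 - p["gammaR"] * cplx
    d_lac_i = (p["aL1"] / (1 + (c_i / p["betaC"]) ** p["n1"])
               + p["aL2"] * cplx ** p["n3"] / (p["thetaR"] ** p["n3"] + cplx ** p["n3"])
               - p["gammaL"] * lac_i)
    # The source uses n3 (not a separate coefficient) in the CI equation as well.
    d_c_i = p["aC"] / (1 + (lac_i / p["betaL"]) ** p["n3"]) - p["gammaC"] * c_i
    return [d_n, d_ahl, d_cplx, d_lac_i, d_c_i]

def rk4(f, x0, t_end, h):
    """Classical fixed-step RK4 from t = 0 to t_end (assumed stand-in for ode15s)."""
    x = list(x0)
    for _ in range(round(t_end / h)):
        k1 = f(x)
        k2 = f([xi + 0.5 * h * ki for xi, ki in zip(x, k1)])
        k3 = f([xi + 0.5 * h * ki for xi, ki in zip(x, k2)])
        k4 = f([xi + h * ki for xi, ki in zip(x, k3)])
        x = [xi + h * (a + 2 * b + 2 * c + d) / 6
             for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
    return x

y0 = [1.5e10, 0.8e-05, 0.0, 0.0, 0.029]   # yinit from mod_solv.m
y_end = rk4(qs_ts, y0, 20.0, 1e-3)        # state after 20 h
print("LacI (mM):", y_end[3], " lambda CI (mM):", y_end[4])
```

The initial condition starts the switch in the CI-high, LacI-low state; as the cell density and AHL rise, the complex activates LacI expression and CI is repressed, which is the density-dependent ON-OFF behaviour the circuit is designed for.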

B.1.2 Production envelope of strain designs predicted by EMILiO (section 3.1.4)

load('iJO1366.mat')

S = model.S ;
b = model.b ;
lb = model.lb ;
ub = model.ub ;
c = -model.c ;

lb(164) = -10 ; ub(164) = -10 ;  % Glucose uptake rate
lb(252) = -20 ; ub(252) = -20 ;  % O2 exchange

% 1) The fine-tuned strain calculated here serves as the operating point #1
% These are the fine-tuned fluxes
lb(2076) = 13.5927 ; ub(2076) = 13.5927 ;  % PGCD
lb(2240) = -0.1344 ; ub(2240) = -0.1344 ;  % PTAr
lb(547)  = 0.0986 ;  ub(547)  = 0.0986 ;   % ACS
lb(1848) = 0.3518 ;  ub(1848) = 0.3518 ;   % MTHFD
lb(2047) = 1.6533 ;  ub(2047) = 1.6533 ;   % PDH
lb(2067) = 0.0437 ;  ub(2067) = 0.0437 ;   % PFL
lb(2466) = -0.0277 ; ub(2466) = -0.0277 ;  % TRPAS2

Max_Growth_v = cplexlp(c,[],[],S,b,lb,ub) ;
Max_Growth = Max_Growth_v(8) ;
Ser_AtMaxGrowth = Max_Growth_v(286) ;
Ser_WT_min(1) = Ser_AtMaxGrowth ;
Ser_WT_max(1) = Ser_AtMaxGrowth ;
Growth_WT_min(1) = Max_Growth ;
Growth_WT_max(1) = Max_Growth ;

NoLoops = 100 ;

c_min_Ser = c ; c_min_Ser(286,1) = +1 ;
c_max_Ser = c ; c_max_Ser(286,1) = -1 ;

for i = 1:NoLoops
    i
    lb(8) = Max_Growth*(NoLoops-i)/NoLoops ;  % Fix Growth
    ub(8) = Max_Growth*(NoLoops-i)/NoLoops ;  % Fix Growth
    v_min = cplexlp(c_min_Ser,[],[],S,b,lb,ub) ;
    v_max = cplexlp(c_max_Ser,[],[],S,b,lb,ub) ;
    Ser_WT_min(i+1) = v_min(286) ;
    Growth_WT_min(i+1) = v_min(8) ;
    Ser_WT_max(i+1) = v_max(286) ;
    Growth_WT_max(i+1) = v_max(8) ;
end

% 2) This is the fine−tuned mutant and serves as the operating point #2

% Re−Initialization S = model.S ; b = model.b ; lb = model.lb ; ub = model.ub ; c = −model.c ;

% 1) KNOCK−OUTS lb(493) = 0 ; ub(493) = 0 ;% ACALD lb(1698) = 0 ; ub(1698) = 0 ;% LSERDHr lb(2343) = 0 ; ub(2343) = 0 ;%SERD L

% 2) FINE−TUNED FLUXES Appendix B. Matlab code 159

lb(2076) = 13.5927 ; ub(2076) = 13.5927 ;% PGCD {phosphoglycerate dehydrogenase} lb(2240) = −0.1344 ; ub(2240) = −0.1344 ;% PTAr {phosphotransacetylase} lb(547) = 0.0986 ; ub(547) = 0.0986 ;% ACS {acetyl−CoA synthetase} lb(1848) = 0.3518 ; ub(1848) = 0.3518 ;% MTHFD {methylenetetrahydrofolate dehydrogenase} lb(2047) = 1.6533 ; ub(2047) = 1.6533 ;% PDH {pyruvate dehydrogenase} lb(2067) = 0.0437 ; ub(2067) = 0.0437 ;% PFL {pyruvate formate lyase} lb(2466) =−0.0277 ; ub(2466) =−0.0277 ;% TRPAS2 {Tryptophanase}

Max Growth v = cplexlp(c,[],[],S,b,lb,ub) ;

Max Growth = Max Growth v(8) ;

Ser AtMaxGrowth = Max Growth v(286) ;

Ser Mut min(1) = Ser AtMaxGrowth ;

Ser Mut max(1) = Ser AtMaxGrowth ;

Growth Mut min(1) = Max Growth ;

Growth Mut max(1) = Max Growth ;

for i = 1:NoLoops

i

lb(8) = Max Growth*(NoLoops−i)/NoLoops ;% Fix Growth

ub(8) = Max Growth*(NoLoops−i)/NoLoops ;% Fix Growth

v min = cplexlp(c min Ser,[],[],S,b,lb,ub) ;

v max = cplexlp(c max Ser,[],[],S,b,lb,ub) ;

Ser Mut min(i+1) = v min(286) ;

Growth Mut min(i+1) = v min(8);

Ser Mut max(i+1) = v max(286) ;

Growth Mut max(i+1) = v max(8) ; end

figure(1) Appendix B. Matlab code 160

plot(Growth WT min,Ser WT min,'k',Growth WT max,Ser WT max,'k',...

Growth Mut min,Ser Mut min,'k',Growth Mut max,Ser Mut max,'k','LineWidth',3) xlabel('Growth(hrˆ {−1})','FontSize',24) ylabel('Serine flux(mmol/gDW/hr)','FontSize',24) legend('Wild−type+7 fine −tunings','3KO+7 fine −tunings')

B.1.3 Static strategy for serine production (section 3.2.1)

The m-file Static_intsim.m solves the dFBA for the static strategy of the EMILiO serine-producing strain design. The differential equations are defined in the m-file StaticdFBA.m. The FBA is solved in the m-file fba_in.m.

% File: Static_intsim
% This file solves the dFBA for the static strategy.
global S b lb ub c f kLa T xinit N
kLa=80;
load('iJO1366.mat')
S = model.S ; b = model.b ; lb = model.lb ; ub = model.ub ; c = -model.c ;

% Conditions
lb(164) = -10 ; ub(164) = -10 ; % Glucose uptake rate
lb(252) = -20 ; ub(252) = -20 ; % O2 exchange

% 1) KNOCK-OUTS
lb(493) = 0 ; ub(493) = 0 ; % ACALD
lb(1698) = 0 ; ub(1698) = 0 ; % LSERDHr
lb(2343) = 0 ; ub(2343) = 0 ; % SERD_L

% 2) FINE-TUNING
lb(2076) = 13.5927 ; ub(2076) = 13.5927 ; % PGCD
lb(2240) = -0.1344 ; ub(2240) = -0.1344 ; % PTAr
lb(547) = 0.0986 ; ub(547) = 0.0986 ; % ACS
lb(1848) = 0.3518 ; ub(1848) = 0.3518 ; % MTHFD
lb(2047) = 1.6533 ; ub(2047) = 1.6533 ; % PDH
lb(2067) = 0.0437 ; ub(2067) = 0.0437 ; % PFL
lb(2466) = -0.0277 ; ub(2466) = -0.0277 ; % TRPAS2

% The order for the external metabolites is as follows
met=strvcat(({'Glc','Gl','Rib','Ac','Lac','For','Eth','Pyr','Succ','O2',...
    'CO2','biomass','N','AHL','AHL/LuxR Complex','LacI','lambdaCI' })');
xinit(1,1)=20; % Glc
xinit(1,2)=0; % Gl
xinit(1,3)=0; % Rib
xinit(1,4)=0; % Ac
xinit(1,5)=0; % Lac
xinit(1,6)=0; % For
xinit(1,7)=0; % Eth
xinit(1,8)=0; % pyr
xinit(1,9)=0; % succ
xinit(1,10)=0.0; % SERINE
xinit(1,11)=0.2; % O2
xinit(1,12)=0.5; % CO2
xinit(1,13)=0.01; % biomass

T=0.1 ; % The time interval
N=150 ; % The number of integration steps

% Set the initial conditions of the external metabolites
clear xaug x t vf;

% Starting the loop
t(1)=0;

% Initialising
x_oldex=xinit;
for i=1:N
    v=fba_in(x_oldex,S,b,lb,ub,c,kLa,T); % FBA solution
    delx(1,1) = v(164) ; % Glc
    delx(1,2) = v(174) ; % Gl
    delx(1,3) = v(281) ; % Rib
    delx(1,4) = v(36) ; % Ac
    delx(1,5) = v(208) ; % Lac
    delx(1,6) = v(138) ; % For
    delx(1,7) = v(124) ; % Eth
    delx(1,8) = v(277) ; % Pyr
    delx(1,9) = v(293) ; % Succ
    delx(1,10)= v(286) ; % SERINE
    delx(1,13)= v(8); % Growth
    mugx = v(8) ;
    delo2up = v(1975) ; % EX_o2(e)
    delo2diff = kLa*(.21-x_oldex(1,11)) ;
    delco2sec = v(842) ; % EX_co2(e)
    delcco2diff=kLa*(x_oldex(1,12)-0.5);
    % Integration
    if mugx>0
        x_oldex=max(zeros(size(x_oldex)),x_oldex);
        tspan = [(i-1)*T i*T] ;
        sol = ode23s(@StaticdFBA,tspan,x_oldex,[],mugx,...
            delx,kLa,delco2sec) ;
        x_newex = sol.y ;
        [m,n] = size(x_newex) ;
        x_newex = [x_newex(1:13,n)]' ;
        x_newex=max(zeros(size(x_newex)),x_newex) ;
    else
        x_newex = x_oldex ;
    end
    if x_newex(1,1)==0,
        x_newex(1,1)=0.01;
    end
    t(i+1)=t(i)+T;
    x(i,:)=x_newex;
    vf(i,:)=v';
    x_oldex=x_newex;
end
xaug=[xinit;x];
biomass = xaug(:,13) ; % Biomass from integration
Serine = xaug(length(xaug),10) ;
Acetate = xaug(length(xaug),4) ;
Res_Glc = xaug(length(xaug),1) ;
Biomass = biomass(length(biomass),1) ;
products = [xaug(end,1) xaug(end,2) xaug(end,3) xaug(end,4) xaug(end,5) xaug(end,6) xaug(end,7) xaug(end,8) xaug(end,9) xaug(end,10)];
display('Results for KOs+ Fine tuned strain:')
display('Initial glucose:')
xinit(1,1)
display('Final concentrations:')
display(' Glucose Glycerol Rib Acetate Lactate Formate EtOH Pyruv Succin Serine')
products

figure(10)
subplot(2,2,1,'FontSize',24), plot(t,biomass,'k','LineWidth',3)
annotation('textbox',[0.14 0.83 0.1 0.1],'String','A','FontSize',28,'LineStyle','none')
axis([0 15 0 1])
set(gca,'YTick',[0:0.2:1])
set(gca,'YTickLabel',[0:0.2:1])
ylabel('Biomass')
subplot(2,2,2,'FontSize',24), plot(t(1,2:N+1),vf(:,8),'k','LineWidth',3)
annotation('textbox',[0.87 0.83 0.1 0.1],'String','B','FontSize',28,'LineStyle','none')
axis([0 15 0 .5])
set(gca,'YTick',[0:0.1:.5])
set(gca,'YTickLabel',[0:0.1:.5])
ylabel('Growth rate(hr^{-1})')
subplot(2,2,3,'FontSize',24), plot(t,xaug(:,10),'k','LineWidth',3)
annotation('textbox',[0.14 0.36 0.1 0.1],'String','C','FontSize',28,'LineStyle','none')
axis([0 15 0 30])
ylabel('Serine(mM)')
xlabel('Time(hr)')
subplot(2,2,4,'FontSize',24), plot(t,xaug(:,1),'k','LineWidth',3)
annotation('textbox',[0.87 0.36 0.1 0.1],'String','D','FontSize',28,'LineStyle','none')
axis([0 15 0 20])
set(gca,'YTick',[0:5:20])
set(gca,'YTickLabel',[0:5:20])
ylabel('Glucose(mM)')
xlabel('Time(hr)')

The differential equations are included in the m-file StaticdFBA.m, shown below.

function dY = StaticdFBA(time,y,mugx,...
    delx,kLa,delco2sec)
dY = zeros(13,1) ;
dY(1) = delx(1,1)*y(13) ;
dY(2) = delx(1,2)*y(13) ;
dY(3) = delx(1,3)*y(13) ;
dY(4) = delx(1,4)*y(13) ;
dY(5) = delx(1,5)*y(13) ;
dY(6) = delx(1,6)*y(13) ;
dY(7) = delx(1,7)*y(13) ;
dY(8) = delx(1,8)*y(13) ;
dY(9) = delx(1,9)*y(13) ;
dY(10) = delx(1,10)*y(13) ;
dY(11) = 0 ;
dY(12) = delco2sec*y(13) + kLa*(0.5-y(12)) ;
dY(13) = mugx*y(13) ;
return
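In equation form, StaticdFBA.m integrates the following system over each interval of length T, with the exchange fluxes held constant at the values returned by the most recent FBA solution. This is a transcription of the listing; the symbols c_i (extracellular concentrations of states 1 to 10), X (biomass, state 13), v_i (the entries of delx) and mu (mugx) are notation assumed here.

```latex
\begin{align}
\frac{dc_i}{dt} &= v_i\,X, \qquad i = 1,\dots,10 \\
\frac{dc_{\mathrm{O_2}}}{dt} &= 0 \\
\frac{dc_{\mathrm{CO_2}}}{dt} &= v_{\mathrm{CO_2}}\,X + k_L a\,\bigl(0.5 - c_{\mathrm{CO_2}}\bigr) \\
\frac{dX}{dt} &= \mu\,X
\end{align}
```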

Also, the following function is required to solve the FBA in every iteration.

function X=fba_in(y,S,b,lb,ub,c,kLa,T);
ki=25;
upper_g=10*1/(1+y(10)/ki); % upper_g=10;
maxGlcup=min(upper_g,max(0.0,y(1,1)/(y(1,13)*T)));
lb(164) = -maxGlcup ; ub(164) = 0 ; % Glc uptake
if y(1,1)>0.05,
    X = cplexlp(c,[],[],S,b,...
        lb,ub) ;
else
    X=[zeros(2583,1)];
end
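The glucose-uptake bound computed in fba_in.m is the smaller of two limits: a kinetic limit of 10 mmol/gDW/hr reduced by the serine term 1/(1 + serine/ki), and the rate that would exhaust the remaining glucose within one interval T. The sketch below evaluates that expression for two illustrative states. The anonymous function glc_bound and the input numbers are hypothetical and are not part of the thesis code.

```matlab
% Worked example of the glucose-uptake bound used in fba_in.m.
% glc_bound is a hypothetical helper written for illustration only.
ki   = 25;    % serine inhibition constant, as in fba_in.m
vmax = 10;    % maximum glucose uptake rate, as in fba_in.m
glc_bound = @(glc, ser, X, T) min(vmax/(1 + ser/ki), max(0, glc/(X*T)));

% Early in the batch: glucose is plentiful, so the inhibited kinetic limit binds.
b_early = glc_bound(20, 5, 0.01, 0.1);   % 10/(1 + 5/25) = 8.3333
% Late in the batch: the glucose that is left limits the uptake instead.
b_late  = glc_bound(0.1, 0, 0.5, 0.1);   % 0.1/(0.5*0.1) = 2
fprintf('early: %.4f, late: %.4f\n', b_early, b_late);
```

The bound is then applied as lb(164) = -maxGlcup, because uptake is a negative exchange flux in the model.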

B.1.4 Dynamic strategy for serine production (section 3.2.2)

The m-file Dynamic_intsim.m solves the dFBA for the dynamic strategy of the EMILiO serine-producing strain design. The differential equations are defined in the m-file qsdFBA.m and the FBA in fba_in.m (the same file as in the previous section).

% File: Dynamic_intsim
% This file solves the dFBA for the dynamic strategy.
global S b lb ub c f kLa T xinit N
kLa=80;
load('iJO1366.mat')
S = model.S ; b = model.b ; lb = model.lb ; ub = model.ub ; c = -model.c ;

% Conditions
lb(164) = -10 ; ub(164) = -10 ; % Glucose uptake rate
lb(252) = -20 ; ub(252) = -20 ; % O2 exchange

% 1) KNOCK-OUT
lb(2343) = 0 ; ub(2343) = 0 ; % SERD_L

% 2) FINE-TUNING
lb(2076) = 13.5927 ; ub(2076) = 13.5927 ; % PGCD
lb(2240) = -0.1344 ; ub(2240) = -0.1344 ; % PTAr
lb(547) = 0.0986 ; ub(547) = 0.0986 ; % ACS
lb(1848) = 0.3518 ; ub(1848) = 0.3518 ; % MTHFD
lb(2047) = 1.6533 ; ub(2047) = 1.6533 ; % PDH
lb(2067) = 0.0437 ; ub(2067) = 0.0437 ; % PFL
lb(2466) = -0.0277 ; ub(2466) = -0.0277 ; % TRPAS2

% Parameters approximations
% Basal Values from Weiss and You
vA = 1.6 ;
gammaA = 0.60 ;
rhoR = 30 ;
LuxR = 0.5;
gammaR = 1.386 ;
aL1 = 0.06 ;
betaC = 0.000008 ;
aL2 = 0.06 ;
thetaR = 0.01 ;
gammaL = 1.386 ;
aC = 0.12 ;
betaL = 0.0008 ;
gammaC = 4.152 ;

% The order for the external metabolites is as follows
met=strvcat(({'Glc','Gl','Rib','Ac','Lac','For','Eth','Pyr','Succ','O2',...
    'CO2','biomass','N','AHL','AHL/LuxR Complex','LacI','lambdaCI' })');
xinit(1,1)=20; % Glc
xinit(1,2)=0; % Gl
xinit(1,3)=0; % Rib
xinit(1,4)=0; % Ac
xinit(1,5)=0; % Lac
xinit(1,6)=0; % For
xinit(1,7)=0; % Eth
xinit(1,8)=0; % pyr
xinit(1,9)=0; % SERINE
xinit(1,10)=0.0; % O2
xinit(1,11)=0.5; % CO2
xinit(1,12)=0.01; % biomass
xinit(1,13)=0.8*10^(-3)*xinit(1,12) ; % AHL
xinit(1,14)=0.0; % LuxR/AHL Complex
xinit(1,15)=0.0 ; % LacI
xinit(1,16)=0.029 ; % lambdaCI

T=0.05; % The time interval
N=200; % The number of integration steps
clear xaug x t vf;

% Starting the loop
t(1)=0;

% Initialising
x_oldex=xinit;
for i=1:N
    % ACALD
    lb(493) = 2.3346/0.029*x_oldex(1,16) ; ub(493) = 2.3346/0.029*x_oldex(1,16) ;
    % LSERDHr
    lb(1698) = 9.7169/0.029*x_oldex(1,16) ; ub(1698) = 9.7169/0.029*x_oldex(1,16) ;
    v=fba_in(x_oldex,S,b,lb,ub,c,kLa,T); % FBA solution
    delx(1,1) = v(164) ; % Glc
    delx(1,2) = v(174) ; % Gl
    delx(1,3) = v(281) ; % Rib
    delx(1,4) = v(36) ; % Ac
    delx(1,5) = v(208) ; % Lac
    delx(1,6) = v(138) ; % For
    delx(1,7) = v(124) ; % Eth
    delx(1,8) = v(277) ; % Pyr
    delx(1,9) = v(286) ; % SERINE
    delx(1,12)= v(8) ; % Growth
    mugx = v(8);
    delo2up = v(252) ; % EX_o2(e)
    delo2diff = kLa*(.21-x_oldex(1,10)) ;
    delco2sec = v(85) ; % EX_co2(e)
    delcco2diff=kLa*(x_oldex(1,11)-0.5);
    % Integration
    if mugx>0
        x_oldex=max(zeros(size(x_oldex)),x_oldex);
        tspan = [(i-1)*T i*T] ;
        sol = ode23s(@qsdFBA,tspan,x_oldex,[],aL1,aL2,aC,betaL,betaC,gammaA,...
            gammaC,gammaL,gammaR,thetaR,rhoR,vA,LuxR,mugx,...
            delx,kLa,delco2sec) ;
        x_newex = sol.y ;
        [m,n] = size(x_newex) ;
        x_newex = [x_newex(1:16,n)]' ;
        x_newex=max(zeros(size(x_newex)),x_newex) ;
    else
        x_newex = x_oldex ;
    end
    if x_newex(1,1)==0,
        x_newex(1,1)=0.01;
    end
    t(i+1)=t(i)+T;
    x(i,:)=x_newex;
    vf(i,:)=v';
    x_oldex=x_newex;
end
xaug=[xinit;x];
biomass = xaug(:,12) ; % Biomass from integration
Serine = xaug(length(xaug),9)
Acetate = xaug(length(xaug),4)
Res_Glc = xaug(length(xaug),1)
Biomass = biomass(length(biomass),1)

% Calculate the switching time
if t(find(xaug(:,16)<0.05*xaug(1,16),1,'first')) < 25
    t_5 = t(find(xaug(:,16)<0.05*xaug(1,16),1,'first')) ;
else
    t_5 = 25 ;
end
if t(find(xaug(:,16)<0.25*xaug(1,16),1,'first')) < 25
    t_25 = t(find(xaug(:,16)<0.25*xaug(1,16),1,'first')) ;
else
    t_25 = 25 ;
end
t_5 % The time when the manipulated flux reaches 5% of the initial value
t_25 % The time when the manipulated flux reaches 25% of the initial value

figure(1)
subplot(2,2,1,'FontSize',16), plot(t,biomass,t(1,2:N+1),vf(:,8),'LineWidth',3)
ylabel('Biomass-Growth rate')
axis([0 15 0 1])
subplot(2,2,2,'FontSize',16), plot(t,xaug(:,9),'LineWidth',3)
ylabel('Serine(mM)')
subplot(2,2,3,'FontSize',16), plot(t(1,2:N+1),vf(:,493),t(1,2:N+1),vf(:,1698),'LineWidth',3)
ylabel('ACALD-LSERDH flux')
xlabel('Time(hr)')
subplot(2,2,4,'FontSize',16), plot(t,xaug(:,1),'LineWidth',3)
ylabel('Glucose(mM)')
xlabel('Time(hr)')
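The switching-time calculation in the script above takes the first time point at which lambda CI (state 16) falls below a given fraction of its initial value, and falls back to 25 hr when that never happens. A minimal sketch of the same logic on a made-up trajectory follows. The vectors t and lCI and the cap t_max are hypothetical illustration values, and the explicit isempty check is an assumed, more readable way of handling the case where find returns nothing.

```matlab
% Hypothetical lambda CI trajectory (illustrative numbers, not simulation output).
t     = 0:0.5:5;                                                  % time (hr)
lCI   = 0.029*[1 1 0.9 0.6 0.28 0.2 0.1 0.04 0.02 0.01 0.01];     % lambda CI (mM)
t_max = 25;                                                       % fallback value (hr)

% First time at which lambda CI drops below 5% of its initial value.
idx5 = find(lCI < 0.05*lCI(1), 1, 'first');
if isempty(idx5), t_5 = t_max; else, t_5 = min(t(idx5), t_max); end

% First time at which lambda CI drops below 25% of its initial value.
idx25 = find(lCI < 0.25*lCI(1), 1, 'first');
if isempty(idx25), t_25 = t_max; else, t_25 = min(t(idx25), t_max); end

fprintf('t_5 = %.1f hr, t_25 = %.1f hr\n', t_5, t_25);
```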

The differential equations are described in the m-file qsdFBA.m, shown below.

function dY = qsdFBA(time,y,aL1,aL2,aC,betaL,betaC,gammaA,...
    gammaC,gammaL,gammaR,thetaR,rhoR,vA,LuxR,mugx,...
    delx,kLa,delco2sec)
dY = zeros(16,1) ;
dY(1) = delx(1,1)*y(12) ;
dY(2) = delx(1,2)*y(12) ;
dY(3) = delx(1,3)*y(12) ;
dY(4) = delx(1,4)*y(12) ;
dY(5) = delx(1,5)*y(12) ;
dY(6) = delx(1,6)*y(12) ;
dY(7) = delx(1,7)*y(12) ;
dY(8) = delx(1,8)*y(12) ;
dY(9) = delx(1,9)*y(12) ;
dY(10) = 0 ;
dY(11) = delco2sec*y(12) + kLa*(0.5-y(11)) ;
dY(12) = mugx*y(12) ;
dY(13) = vA*y(12) - gammaA*y(13); % AHL
dY(14) = rhoR*LuxR^2*y(13).^2 - gammaR*y(14); % AHL/LuxR Complex
dY(15) = aL1/(1+(y(16)./betaC).^2)+aL2*y(14).^3 ...
    /(thetaR+y(14).^3)-...
    gammaL*y(15); % LacI
dY(16) = aC/(1+(y(15)./betaL).^2)-gammaC*y(16); % lambda CI
return

B.2 M-files for Chapter 4

B.2.1 Ideal dynamic strategy (section 4.3.1)

The main m-file, OptimalTswitching.m, runs the integration in the file Ideal_intsim.m for different switching times.

% File: OptimalTswitching.m
% For different switching times ts, it calls the integrator file Ideal_intsim.m
% to run a batch simulation
global S b lb ub c f kLa T xinit N
kLa=80;
load('iJO1366.mat')
S = model.S ; b = model.b ; lb = model.lb ; ub = model.ub ; c = -model.c ;

% Conditions
lb(164) = -10 ; ub(164) = -10 ; % Glucose uptake rate
lb(252) = -20 ; ub(252) = -20 ; % O2 exchange

% FINE-TUNED FLUXES
lb(2076) = 13.5927 ; ub(2076) = 13.5927 ; % PGCD
lb(2240) = -0.1344 ; ub(2240) = -0.1344 ; % PTAr
lb(547) = 0.0986 ; ub(547) = 0.0986 ; % ACS
lb(1848) = 0.3518 ; ub(1848) = 0.3518 ; % MTHFD
lb(2047) = 1.6533 ; ub(2047) = 1.6533 ; % PDH
lb(2067) = 0.0437 ; ub(2067) = 0.0437 ; % PFL
lb(2466) = -0.0277 ; ub(2466) = -0.0277 ; % TRPAS2

% The order for the external metabolites is as follows
met=strvcat(({'Glc','Gl','Rib','Ac','Lac','For','Eth','Pyr','Succ','O2',...
    'CO2','biomass','N','AHL','AHL/LuxR Complex','LacI','lambdaCI' })');
T = 0.025 ; % The interval
ts = [0:0.25:8]; % The switching times
N = 480 ; % The number of integration steps
clear xaug x t vf;

for j = 1:length(ts)
    j
    Y = ts(1,j) ;
    [End_Serine(1,j),tbatch(1,j),End_Glucose(1,j)] = ...
        Ideal_intsim(Y,S,b,lb,ub,c,kLa,T,xinit,N);
end

Productivity = End_Serine./tbatch ;
Yield = End_Serine./(20 - End_Glucose) ;
save SensResults.mat ts End_Serine tbatch End_Glucose Productivity Yield

figure(3)
subplot(3,2,1,'FontSize',20)
plot(ts,Productivity,'LineWidth',3)
ylabel({'Productivity','(mM serine/hr)'})
subplot(3,2,3,'FontSize',20)
plot(ts,Yield,'LineWidth',3)
ylabel({'Yield','(mM ser/mM gluc)'})
subplot(3,2,5,'FontSize',20)
plot(ts,End_Serine,'LineWidth',3)
ylabel({'Serine titer','(mM)'})
xlabel('Switching time(hr)')
subplot(3,2,[2 4 6],'FontSize',20)
plot(ts,tbatch,'LineWidth',3)
xlabel('Switching time(hr)')
ylabel('Batch time(hr)')

The file Ideal_intsim.m is a modified version of the previous intsim.m files and is given below. The files StaticdFBA.m and fba_in.m must be in the same folder as well.

function [End_Serine,tbatch,End_Glucose] = ...
    Ideal_intsim(Y,S,b,lb,ub,c,kLa,T,xinit,N);
% Starting the loop
t(1)=0;

% Initialising
xinit(1,1)=20; % Glc
xinit(1,2)=0; % Gl
xinit(1,3)=0; % Rib
xinit(1,4)=0; % Ac
xinit(1,5)=0; % Lac
xinit(1,6)=0; % For
xinit(1,7)=0; % Eth
xinit(1,8)=0; % pyr
xinit(1,9)=0; % succ
xinit(1,10)=0.0; % SERINE
xinit(1,11)=0.2; % O2
xinit(1,12)=0.5; % CO2
xinit(1,13)=0.01; % biomass
x_oldex=xinit;

for i=1:N
    if t(i) < Y(1,1)
        lb(493) = 2.3346 ; ub(493) = 2.3346 ; % ACALD
        lb(1698) = 9.7169 ; ub(1698) = 9.7169 ; % LSERDHr
        lb(2343) = 0 ; ub(2343) = 0 ; % SERD_L
    else
        lb(493) = 0 ; ub(493) = 0 ; % ACALD
        lb(1698) = 0 ; ub(1698) = 0 ; % LSERDHr
        lb(2343) = 0 ; ub(2343) = 0 ; % SERD_L
    end
    v=fba_in(x_oldex,S,b,lb,ub,c,kLa,T); % FBA solution
    delx(1,1) = v(164) ; % Glc
    delx(1,2) = v(174) ; % Gl
    delx(1,3) = v(281) ; % Rib
    delx(1,4) = v(36) ; % Ac
    delx(1,5) = v(208) ; % Lac
    delx(1,6) = v(138) ; % For
    delx(1,7) = v(124) ; % Eth
    delx(1,8) = v(277) ; % Pyr
    delx(1,9) = v(293) ; % Succ
    delx(1,10)= v(286) ; % SERINE
    delx(1,13)= v(8); % Growth
    mugx = v(8) ;
    delo2up = v(1975) ; % EX_o2(e)
    delo2diff = kLa*(.21-x_oldex(1,11)) ;
    delco2sec = v(842) ; % EX_co2(e)
    delcco2diff=kLa*(x_oldex(1,12)-0.5);
    % Integration
    if mugx>0
        x_oldex=max(zeros(size(x_oldex)),x_oldex);
        tspan = [(i-1)*T i*T] ;
        sol = ode23s(@StaticdFBA,tspan,x_oldex,[],mugx,...
            delx,kLa,delco2sec) ;
        x_newex = sol.y ;
        [m,n] = size(x_newex) ;
        x_newex = [x_newex(1:13,n)]' ;
        x_newex=max(zeros(size(x_newex)),x_newex) ;
    else
        x_newex = x_oldex ;
    end
    if x_newex(1,1)==0,
        x_newex(1,1)=0.01;
    end
    t(i+1)=t(i)+T;
    x(i,:)=x_newex;
    vf(i,:)=v';
    x_oldex=x_newex;
end
xaug=[xinit;x];
TbatchIndex = find(xaug(:,1) == min(xaug(:,1)),1,'first');
tbatch = t(TbatchIndex) ;
End_Serine = xaug(TbatchIndex,10) ;
End_Glucose = xaug(TbatchIndex,1) ;
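The end-of-batch bookkeeping can be checked by hand on a small example. The sketch below applies the same rule (the batch ends at the first time point where glucose reaches its minimum) and the same productivity and yield definitions used in OptimalTswitching.m. The trajectory values are made up for illustration and are not simulation output.

```matlab
% Hypothetical trajectory (illustrative numbers only).
t       = 0:1:6;                        % time (hr)
glucose = [20 16 11 6 2 0.01 0.01];     % glucose (mM)
serine  = [0  2  6 11 16 18  18];       % serine (mM)

% The batch ends at the first index where glucose reaches its minimum.
TbatchIndex = find(glucose == min(glucose), 1, 'first');
tbatch      = t(TbatchIndex);
End_Serine  = serine(TbatchIndex);
End_Glucose = glucose(TbatchIndex);

Productivity = End_Serine / tbatch;                      % mM serine/hr
Yield        = End_Serine / (glucose(1) - End_Glucose);  % mM serine/mM glucose
fprintf('tbatch = %g hr, productivity = %.2f, yield = %.4f\n', ...
    tbatch, Productivity, Yield);
```

Taking the first occurrence of the minimum matters because the integrator holds glucose at a floor value once it is depleted, so later time points share the same minimum.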

B.2.2 Global sensitivity analysis (section 4.3.2)

The code used is based on the code provided by Prof. R. Vadigepalli from Thomas Jefferson University. The global sensitivity analysis begins with the file that generates the perturbed parameters, Generate_perturbed_parameters.m, which calls the file Get_perturbed_parameters.m to use the Sobol' generator of quasi-random numbers.

% File: Generate_perturbed_parameters.m
% This file generates matrices A and B required for the GSA
% by calling the file Get_perturbed_parameters.m
global vA gammaA rhoR
global LuxR gammaR aL1 betaC
global aL2 thetaR
global gammaL aC betaL gammaC
global glcnoichdxcon glcnoichdxsto f kLa T xinit N

% The above global parameters are included in the file par_initial.m
% They describe the initial values of the parameters that vary
% and the FBA arguments.
vA = 1.6 ;
gammaA = 0.6 ;
rhoR = 30 ;
LuxR = 0.5 ;
gammaR = 1.386 ;
aL1 = 0.06 ;
betaC = 0.000008 ;
aL2 = 0.06 ;
thetaR = 0.01 ;
gammaL = 1.386 ;
aC = 0.12 ;
betaL = 0.0008 ;
gammaC = 4.152 ;

Sample_Size_N = 8000;
index_Parameters_to_permute = [1:13];
Base_parameters = [vA,gammaA,rhoR,LuxR,gammaR,aL1,betaC,aL2,...
    thetaR,gammaL,aC,betaL,gammaC];
output_filename = 'A_and_B_from_13_parameters';
MAT_file_from_Get_perturbed_parameters = output_filename;
Index_of_Model_outputs = [9];
Model_time_vector = [0:0.5:25];
MAT_file_to_save_sensitivity_results = 'Sensitivity_Results';
Get_perturbed_parameters(Sample_Size_N, ...
    index_Parameters_to_permute, Base_parameters, output_filename);

The file Get_perturbed_parameters.m is shown below.

function Get_perturbed_parameters(Sample_Size_N, index_Parameters_to_permute,...
    Base_parameters, output_filename)
% This script uses a scrambled Sobol set to generate matrices of perturbed parameters
% Inputs are:
% Sample_Size_N: the number of rows in the perturbed parameter
%   matrices (A and B)
% index_Parameters_to_permute: an index of which parameters of all model
%   parameters to investigate
% Base_parameters: a vector of all basal model parameters and initial conditions
%   This must be a row vector (so dimensions 1 by something)
% output_filename refers to the name of the text file to be
%   generated by this function
% Output: None - this function generates a text file of filename "output_filename"
% Example function call:
% Get_perturbed_parameters(1e5, 1:3, [1,2,3,4,5], ...
%   'A_and_B_from_first_13_parameters')

% Define k
k = length(index_Parameters_to_permute) ;
% this will be the number of columns in A and B

% Create the matrices A and B
My_sobol_set = sobolset(k);
% Sobol set with dimensions of the number of parameters we're perturbing
My_scrambled_sobol_set = scramble(My_sobol_set,'MatousekAffineOwen')
My_parameter_Multiplier_matrix = net(My_scrambled_sobol_set, 2*Sample_Size_N);
% Pluck out the Sobol set between 1 and 2*N

% Initialize the matrix of permuted parameters
% Note: A_and_B has matrix A for rows 1:Sample_Size_N;
% then matrix B for rows N+1:end
A_and_B = zeros(2*Sample_Size_N, k);
% A = A_and_B(1:N,:); B = A_and_B(N+1:end,:);

for i = 1:2*Sample_Size_N % Given a range of param*0.1 - param*10
    A_and_B(i,:)=10.^(1-2.*My_parameter_Multiplier_matrix(i,:)).*...
        Base_parameters ;
end
save(output_filename)
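The expression 10.^(1-2.*u) in the loop above maps a draw u in [0, 1] onto a multiplier between 10 and 0.1, uniformly on a logarithmic scale, so each parameter is sampled over one decade on either side of its basal value. A short check with a few hand-picked values of u follows. The vector u and the choice of aC as the example parameter are illustrative assumptions.

```matlab
% Mapping from a unit-interval draw to a parameter multiplier.
u          = [0 0.25 0.5 0.75 1];     % example draws (hypothetical values)
multiplier = 10.^(1 - 2.*u);          % same expression as in Get_perturbed_parameters.m
base       = 0.12;                    % e.g. the basal value of aC
perturbed  = multiplier .* base;      % sampled values span base*10 down to base*0.1

disp([u; multiplier; perturbed]);
```

A draw of u = 0.5 returns the basal value itself, and equal steps in u give equal steps in log10 of the parameter.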

Once the matrices A and B are generated, we can run the file Parallel_init.m to initiate the simulations in the parallel computation mode. The simulation part is the most time-consuming, and parallelization is necessary to reduce the computation time. With the following implementation we can run 15 processors in parallel (where 15 = number of perturbed parameters + 2) to generate the responses of the output yA, yB, and yCi for i=1,...,13. The file Parallel_init.m calls the file Get_perturbed_model_simulations.m, in which the model is simulated for the relevant matrix (i.e., A, B or Ci). Notice that all the m-files generated below this level should be kept in a separate folder in order to run them in parallel.
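To illustrate how the outputs yA, yB and yCi combine into a sensitivity measure, the sketch below estimates first-order Sobol' indices for a toy model. It is a sketch under stated assumptions: the estimator mean(yA.*(yCi - yB))/Var(y) is one common choice and is not confirmed here as the one used for the thesis results, rand replaces the scrambled Sobol' set so that no toolbox is needed, and the model f is a stand-in for the dFBA simulation.

```matlab
% First-order Sobol' indices from the A, B, Ci sampling scheme (toy example).
rng(0);                               % reproducible pseudo-random draws
N = 1e5;  k = 2;
f = @(X) X(:,1) + 2*X(:,2);           % toy model, NOT the dFBA model
A = rand(N,k);  B = rand(N,k);        % rand stands in for the Sobol' set
yA = f(A);  yB = f(B);
V  = var([yA; yB]);                   % estimate of the total output variance
S  = zeros(1,k);
for i = 1:k
    Ci = B;  Ci(:,i) = A(:,i);        % Ci = B with the i-th column taken from A
    yCi  = f(Ci);
    S(i) = mean(yA .* (yCi - yB)) / V;   % first-order index estimate
end
disp(S)   % analytical values for this toy model are 0.2 and 0.8
```

Each Ci requires its own set of N model runs, independent of the others, which is why the evaluations of yA, yB and the 13 yCi can be distributed over 15 processors.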

% File: Parallel_init.m
% This file initiates the parallel evaluation of models to calculate the outputs
% y_A, y_B and y_Ci (i=1,..., 13)
load('A_and_B_from_13_parameters')
Index_of_Model_outputs = [9];
Model_time_vector = [0:0.5:25];
Get_perturbed_model_simulations('A_and_B_from_13_parameters', ...
    Index_of_Model_outputs, Model_time_vector)

In the following Get_perturbed_model_simulations.m file, we generate the file with the model output yC5 (notice that the loop is executed for i=5). The other yCi files can be generated with the appropriate value of i in separate folders. At the end of the file, we also include the code to generate the outputs yA and yB as comments (simply remove the % symbol to run the code).

function Get_perturbed_model_simulations(MAT_file_from_Get_perturbed_parameters, ...
    Index_of_Model_outputs,Model_time_vector)
% This script computes the y_A, y_B and y_Ci vectors necessary for the sensitivity
% analysis.
% Note: this will generate a text file of the model output.
% Their filenames will be extensions of the output_filename input to
% Get_perturbed_parameters.m
% Inputs:
% MAT_file_from_Get_perturbed_parameters: the name of the .mat file
%   generated by Get_perturbed_parameters.m
% Index_of_Model_outputs: an index of which model outputs will be evaluated
%   in the sensitivity analysis
% Model_time_vector: a vector of time points used by the Get_y function call
% Output: None - this function generates MANY text files for y_A, y_B and y_Ci
% Example function call:
% Get_perturbed_model_simulations('A_and_B_from_13_parameters',[2,3],0:0.1:20)

% Load A_and_B matrix
load(MAT_file_from_Get_perturbed_parameters)

% Calculate the number of columns to be saved in the y_A, y_B and y_Ci matrices
Number_of_columns = length(Index_of_Model_outputs)*length(Model_time_vector);

% Initialize variables for writing y_A, y_B and y_Ci text files
output_format_entry = '%0.5e \t';
output_format_line = repmat(output_format_entry,1,Number_of_columns);
output_format_line = [output_format_line ' \n'];

% Evaluate Model to get yCi
Ci = zeros(Sample_Size_N ,k);
% Recall that k was defined in the previous script, and loaded
% in the .mat file at the beginning of this script
for i=5
    Ci = A_and_B(Sample_Size_N + 1:end, :); % Ci = B
    Ci(:,i) = A_and_B(1:Sample_Size_N , i); % So the i-th column of Ci is taken from A
    file_to_write=fopen([output_filename,'_y_C',num2str(i),'.txt'],'w');
    Iter_Ci = i
    for j = 1:Sample_Size_N
        yCi = Get_y(Ci(j,:),index_Parameters_to_permute,...
            Index_of_Model_outputs,Model_time_vector,Base_parameters);
        Iter_yCi = j
        fprintf(file_to_write, output_format_line, yCi);
    end
    fclose(file_to_write);
end

% Evaluate Model to get yA and yB
% yA = zeros(Sample_Size_N,Number_of_columns);
% Rows correspond to perturbation index;
% Columns are the time-series of model outputs
% yB = zeros(Sample_Size_N,Number_of_columns);
% write yA and yB to text files
% file_to_write=fopen(['yA_',output_filename,'.txt'],'w');
% This will evaluate the model each time for a row, then write it to a text file
% for j = 1:Sample_Size_N
%     yA = Get_y(A_and_B(j,:),index_Parameters_to_permute,...
%         Index_of_Model_outputs,Model_time_vector,Base_parameters);
%     Iter_yA = j
%     fprintf(file_to_write, output_format_line, yA);
% end
% fclose(file_to_write);
% file_to_write=fopen(['yB_',output_filename,'.txt'],'w');
% for j = 1:Sample_Size_N
%     yB = Get_y(A_and_B(Sample_Size_N+j,:),index_Parameters_to_permute,...
%         Index_of_Model_outputs,Model_time_vector,Base_parameters);
%     Iter_yB = j
%     % Recall how B is saved in the matrix A_and_B
%     fprintf(file_to_write, output_format_line, yB);
% end
% fclose(file_to_write);

The file Get_y.m that is called in the file above is shown next. The file Get_y.m needs the following files: intsim.m, qsdFBA.m and fba_in.m to integrate the equations in the dFBA. The files given in the section "Dynamic strategy for serine production" can be used.

function y = Get_y(Perturbed_parameters,index_Parameters_to_permute,...
    Index_of_Model_outputs,Model_time_vector,Base_parameters)
% This function is supplied by the user to describe how to evaluate
% a model simulation with a given vector of perturbed parameters.
% Requirements: y must be returned as a row vector

% Calculate the number of columns to be saved in the yA, yB, and yCi matrices
Number_of_columns = length(Index_of_Model_outputs)*length(Model_time_vector);

% Initialize output
y = zeros(1,Number_of_columns);
% Rows correspond to perturbation index
% Columns are the time-series of model outputs
Parameters_to_be_evaluated_in_model(index_Parameters_to_permute) = ...
    Perturbed_parameters;

% User must supply the code below
% Model_evaluation evaluates the model with input
% Parameters_to_be_evaluated_in_model
% at the times described in the vector Model_time_vector
Model_evaluation = intsim(Parameters_to_be_evaluated_in_model);
Output = Model_evaluation(:,Index_of_Model_outputs);
y = [Output(:,1)'];

B.2.3 Effect of αC and γC (section 4.3.3)

The m-file intsim_parametric.m scans the plane of the parameters αC and γC and integrates the ODEs at every point of the grid. The files qsdFBA.m and fba_in.m therefore have to be in the same folder.

% File: intsim_parametric.m
% This file iterates the parameters alpha_C and gamma_C
% and integrates the ODEs for every value of the grid.
global S b lb ub c f kLa T xinit N
kLa=80;
load('iJO1366.mat')
S = model.S ; b = model.b ; lb = model.lb ; ub = model.ub ; c = -model.c ;

% Conditions
lb(164) = -10 ; ub(164) = -10 ; % Glucose uptake rate
lb(252) = -20 ; ub(252) = -20 ; % O2 exchange

% 1) KNOCK-OUT
lb(2343) = 0 ; ub(2343) = 0 ; % SERD_L

% 2) FINE-TUNING
lb(2076) = 13.5927 ; ub(2076) = 13.5927 ; % PGCD
lb(2240) = -0.1344 ; ub(2240) = -0.1344 ; % PTAr
lb(547) = 0.0986 ; ub(547) = 0.0986 ; % ACS
lb(1848) = 0.3518 ; ub(1848) = 0.3518 ; % MTHFD
lb(2047) = 1.6533 ; ub(2047) = 1.6533 ; % PDH
lb(2067) = 0.0437 ; ub(2067) = 0.0437 ; % PFL
lb(2466) = -0.0277 ; ub(2466) = -0.0277 ; % TRPAS2

% Parameters approximations
% Basal Values from Weiss and You
vA = 1.6 ;
gammaA = 0.6 ;
rhoR = 30 ;
LuxR = 0.5 ;
gammaR = 1.386 ;
aL1 = 0.06 ;
betaC = 0.000008 ;
aL2 = 0.06 ;
thetaR = 0.01 ;
gammaL = 1.386 ;
betaL = 0.0008 ;

nsteps = 25 ;
aC_min = log10(0.01) ;
aC_max = log10(10) ;
aC = logspace(aC_min,aC_max,nsteps); % #3 nominal value aC=0.12
msteps = 25 ;
gC_min = log10(0.1) ;
gC_max = log10(100) ;
gammaC = logspace(gC_min,gC_max,msteps) ; % #6 nominal value gC=4.152

% The order for the external metabolites is as follows
met=strvcat(({'Glc','Gl','Rib','Ac','Lac','For','Eth','Pyr','Succ','O2',...
    'CO2','biomass','N','AHL','AHL/LuxR Complex','LacI','lambdaCI' })');

for i_aC=1:length(aC)
    for i_gC=1:length(gammaC)
        progress_in_percent = 100*((i_aC-1)*msteps+i_gC)/nsteps/msteps
        xinit(1,1)=20; % Glc
        xinit(1,2)=0; % Gl
        xinit(1,3)=0; % Rib
        xinit(1,4)=0; % Ac
        xinit(1,5)=0; % Lac
        xinit(1,6)=0; % For
        xinit(1,7)=0; % Eth
        xinit(1,8)=0; % pyr
        xinit(1,9)=0; % Serine
        xinit(1,10)=0.0; % O2
        xinit(1,11)=0.5; % CO2
        xinit(1,12)=0.01; % biomass
        xinit(1,13)=0.8*10^(-3)*xinit(1,12) ; % AHL
        xinit(1,14)=0.0; % LuxR/AHL Complex
        xinit(1,15)=0.0 ; % LacI
        xinit(1,16)=0.029 ; % lambdaCI
        T=0.05;
        N=260;
        clear xaug x t vf;
        % Starting the loop
        t(1)=0;
        % Initialising
        x_oldex=xinit;
        for i=1:N
            lb(493) = 0 ; ub(493) = 2.3346/0.029*x_oldex(1,16) ; % ACALD
            lb(1698) = 0 ; ub(1698) = 9.7169/0.029*x_oldex(1,16) ; % LSERDHr
            v=fba_in(x_oldex,S,b,lb,ub,c,kLa,T); % FBA solution
            delx(1,1) = v(164) ; % Glc
            delx(1,2) = v(174) ; % Gl
            delx(1,3) = v(281) ; % Rib
            delx(1,4) = v(36) ; % Ac
            delx(1,5) = v(208) ; % Lac
            delx(1,6) = v(138) ; % For
            delx(1,7) = v(124) ; % Eth
            delx(1,8) = v(277) ; % Pyr
            delx(1,9) = v(286) ; % SERINE
            delx(1,12)= v(8) ; % Growth
            mugx = v(8);
            delo2up = v(252) ; % EX_o2(e)
            delo2diff = kLa*(.21-x_oldex(1,10)) ;
            delco2sec = v(85) ; % EX_co2(e)
            delcco2diff=kLa*(x_oldex(1,11)-0.5);
            % Integration
            if mugx>0
                x_oldex=max(zeros(size(x_oldex)),x_oldex);
                tspan = [(i-1)*T i*T] ;
                [time,yvect] = ode23s(@qsdFBA,tspan,x_oldex,[],aL1,aL2,...
                    aC(1,i_aC),betaL,betaC,gammaA,gammaC(1,i_gC),...
                    gammaL,gammaR,thetaR,rhoR,vA,LuxR,...
                    mugx,delx,kLa,delco2sec) ;
                x_newex = yvect ;
                [m,n] = size(x_newex) ;
                x_newex = [x_newex(m,1:16)] ;
                x_newex=max(zeros(size(x_newex)),x_newex);
            else
                x_newex = x_oldex ;
            end
            if x_newex(1,1)==0,
                x_newex(1,1)=0.01;
            end
            t(i+1)=t(i)+T;
            x(i,:)=x_newex;
            vf(i,:)=v';
            x_oldex=x_newex;
        end
        xaug=[xinit;x];
        % Estimate objectives
        clear TbatchIndex
        TbatchIndex = find(xaug(:,1) == min(xaug(:,1)),1,'first');
        tbatch(i_aC,i_gC) = t(TbatchIndex) ;
        SerineTiter(i_aC,i_gC) = xaug(end,9) ;
        GLC_end(i_aC,i_gC) = xaug(end,1) ;
        if t(find(xaug(:,16)<0.05*xaug(1,16),1,'first')) < 22
            t_5(i_aC,i_gC) = t(find(xaug(:,16)<0.05*xaug(1,16),1,'first')) ;
        else
            t_5(i_aC,i_gC) = t(TbatchIndex) ;
        end
        if t(find(xaug(:,16)<0.30*xaug(1,16),1,'first')) < 22
            t_30(i_aC,i_gC) = t(find(xaug(:,16)<0.30*xaug(1,16),1,'first')) ;
        else
            t_30(i_aC,i_gC) = t(TbatchIndex) ;
        end
    end
end

Productivity = SerineTiter./tbatch ;
Yield = SerineTiter./(xinit(1,1)-GLC_end) ;

% Plotting
aC_Limits = [min(aC) max(aC)] ;
gC_Limits = [min(gammaC) max(gammaC)] ;
StaticProductivity = [2.3 2.3;2.3 2.3] ;
figure(1)
subplot(2,2,1,'FontSize',22)
surf(aC,gammaC,Productivity','FaceColor','interp');
set(gca,'XScale','log');
set(gca,'YScale','log');
hold on
surf(aC_Limits,gC_Limits,StaticProductivity)
set(gca,'XScale','log');
set(gca,'YScale','log');
set(gca,'XTick',[1e-02 1e-01 1 10])
set(gca,'XTickLabel',[1e-02 1e-01 1 10])
set(gca,'YTick',[1e-01 1 10 100])
set(gca,'YTickLabel',[1e-01 1 10 100])
zlabel({'Productivity','(mM Ser./hr)'},'FontSize',24)
xlim([0.01 10])
ylim([0.1 100])
zlim([0.5 3])
hold on
scatter3(0.12,4.152,3,64,'filled')
subplot(2,2,2,'FontSize',22)
surf(aC,gammaC,Yield','FaceColor','interp');
set(gca,'XScale','log');
set(gca,'YScale','log');
set(gca,'XTick',[1e-02 1e-01 1 10])
set(gca,'XTickLabel',[1e-02 1e-01 1 10])
set(gca,'YTick',[1e-01 1 10 100])
set(gca,'YTickLabel',[1e-01 1 10 100])
zlabel({'Yield','(mM Ser./mM Gluc.)'},'FontSize',24)
xlim([0.01 10])
ylim([0.1 100])
zlim([0 1.5])
hold on
scatter3(0.12,4.152,1.25,64,'filled')
subplot(2,2,3,'FontSize',22)
surf(aC,gammaC,SerineTiter','FaceColor','interp');
set(gca,'XScale','log');
set(gca,'YScale','log');
set(gca,'XTick',[1e-02 1e-01 1 10])
set(gca,'XTickLabel',[1e-02 1e-01 1 10])
set(gca,'YTick',[1e-01 1 10 100])
set(gca,'YTickLabel',[1e-01 1 10 100])
zlabel('Serine titer(mM)','FontSize',24)
xlim([0.01 10])
ylim([0.1 100])
zlim([0 30])
hold on
scatter3(0.12,4.152,24,64,'filled')
subplot(2,2,4,'FontSize',22)
surf(aC,gammaC,t_30','FaceColor','interp');
set(gca,'XScale','log');
set(gca,'YScale','log');
hold on
surf(aC,gammaC,tbatch','FaceColor','interp');
set(gca,'XScale','log');
set(gca,'YScale','log');
set(gca,'XTick',[1e-02 1e-01 1 10])
set(gca,'XTickLabel',[1e-02 1e-01 1 10])
set(gca,'YTick',[1e-01 1 10 100])
set(gca,'YTickLabel',[1e-01 1 10 100])
set(gca,'ZTick',[0 3 6 9 12])
set(gca,'ZTickLabel',[0 3 6 9 12])
xlim([0.01 10])
ylim([0.1 100])
zlim([0 12])
zlabel('Time(hr)','FontSize',24)
view([-70 18])
hold on
scatter3(0.12,4.152,4.2,64,'filled')

B.2.4 Effect of αC and LuxR (section 4.3.4)

Here, parameters αC and LuxR are varied across the plane. The files qsdFBA.m and fba_in.m are also required to execute the main file below.

% File: intsim_parametric.m
% This file iterates the parameters alpha_C and LuxR
% and integrates the ODEs for every value of the grid.
global S b lb ub c f kLa T xinit N
kLa=80;
load('iJO1366.mat')
S = model.S ;
b = model.b ;
lb = model.lb ;
ub = model.ub ;
c = -model.c ;
% Conditions
lb(164) = -10 ; ub(164) = -10 ; % Glucose uptake rate
lb(252) = -20 ; ub(252) = -20 ; % O2 exchange
% 1) KNOCK-OUT
lb(2343) = 0 ; ub(2343) = 0 ; % SERD_L
% 2) FINE-TUNING
lb(2076) = 13.5927 ; ub(2076) = 13.5927 ; % PGCD
lb(2240) = -0.1344 ; ub(2240) = -0.1344 ; % PTAr
lb(547) = 0.0986 ; ub(547) = 0.0986 ; % ACS
lb(1848) = 0.3518 ; ub(1848) = 0.3518 ; % MTHFD
lb(2047) = 1.6533 ; ub(2047) = 1.6533 ; % PDH
lb(2067) = 0.0437 ; ub(2067) = 0.0437 ; % PFL
lb(2466) =-0.0277 ; ub(2466) =-0.0277 ; % TRPAS2
% Parameters approximations
% Basal Values from Weiss and You
vA = 1.6 ; gammaA = 0.6 ; rhoR = 30 ;
LuxR = 0.5 ; gammaR = 1.386 ; aL1 = 0.06 ; betaC = 0.000008 ;
aL2 = 0.06 ; thetaR = 0.01 ; gammaL = 1.386 ; betaL = 0.0008 ;
gammaC = 4.152;
nsteps = 25 ;
aC_min = log10(0.01) ;
aC_max = log10(10) ;
aC = logspace(aC_min,aC_max,nsteps); % #3 nominal value aC=0.12
msteps = 25 ;
LuxR_min = log10(0.01) ;
LuxR_max = log10(100) ;
LuxR = logspace(LuxR_min,LuxR_max,msteps) ;
% #6 nominal value gC=4.152
%the order for the external metabolites is as follows
met=strvcat(({'Glc','Gl','Rib','Ac','Lac','For','Eth','Pyr','Succ','O2',...
'CO2','biomass','N','AHL','AHL/LuxR Complex','LacI','lambdaCI' })');
for i_aC=1:length(aC)
for i_LuxR=1:length(LuxR)
progress_in_percent = 100*((i_aC-1)*msteps+i_LuxR)/nsteps/msteps
xinit(1,1)=20;%Glc

xinit(1,2)=0;%Gl

xinit(1,3)=0;%Rib

xinit(1,4)=0;%Ac

xinit(1,5)=0;%Lac

xinit(1,6)=0;%For

xinit(1,7)=0;%Eth

xinit(1,8)=0;%pyr

xinit(1,9)=0;%succ

xinit(1,10)=0.0;%O2

xinit(1,11)=0.5;%CO2

xinit(1,12)=0.01;%biomass

xinit(1,13)=0.8*10^(-3)*xinit(1,12) ;%AHL
xinit(1,14)=0.0;%LuxR/AHL Complex

xinit(1,15)=0.0 ;%LacI

xinit(1,16)=0.029 ;%lambdaCI

T=0.05;

N=260;

clear xaug x t vf;

%starting the loop

t(1)=0;

%initialising

x_oldex=xinit;
for i=1:N
lb(493) = 0 ; ub(493) = 2.3346/0.029*x_oldex(1,16) ; % ACALD
lb(1698) = 0 ; ub(1698) = 9.7169/0.029*x_oldex(1,16) ; % LSERDHr
v=fba_in(x_oldex,S,b,lb,ub,c,kLa,T); % FBA solution

delx(1,1) = v(164) ;% Glc

delx(1,2) = v(174) ;% Gl

delx(1,3) = v(281) ;% Rib

delx(1,4) = v(36) ;% Ac

delx(1,5) = v(208) ;% Lac

delx(1,6) = v(138) ;% For

delx(1,7) = v(124) ;% Eth

delx(1,8) = v(277) ;% Pyr

delx(1,9) = v(286) ;% SERINE

delx(1,12)= v(8) ;% Growth

mugx = v(8);

delo2up = v(252) ; % EX_o2(e)
delo2diff = kLa*(.21-x_oldex(1,10)) ;
delco2sec = v(85) ; % EX_co2(e)
delcco2diff=kLa*(x_oldex(1,11)-0.5);
% Integration
if mugx>0
x_oldex=max(zeros(size(x_oldex)),x_oldex);
tspan = [(i-1)*T i*T] ;
[time,yvect] = ode23s(@qsdFBA,tspan,x_oldex,[],aL1,aL2,...
aC(1,i_aC),betaL,betaC,gammaA,gammaC,...
gammaL,gammaR,thetaR,rhoR,vA,LuxR(1,i_LuxR),...
mugx,delx,kLa,delco2sec) ;
x_newex = yvect ;
[m,n] = size(x_newex) ;
x_newex = [x_newex(m,1:16)] ;
x_newex=max(zeros(size(x_newex)),x_newex);
else
x_newex = x_oldex ;
end
if x_newex(1,1)==0,
x_newex(1,1)=0.01;
end
t(i+1)=t(i)+T;
x(i,:)=x_newex;
vf(i,:)=v';
x_oldex=x_newex;

end

xaug=[xinit;x];

% Estimate objectives

clear TbatchIndex

TbatchIndex = find(xaug(:,1) == min(xaug(:,1)),1,'first');

tbatch(i_aC,i_LuxR) = t(TbatchIndex) ;
SerineTiter(i_aC,i_LuxR) = xaug(end,9) ;
GLC_end(i_aC,i_LuxR) = xaug(end,1) ;
if t(find(xaug(:,16)<0.05*xaug(1,16),1,'first')) < 22
t_5(i_aC,i_LuxR) = t(find(xaug(:,16)<0.05*xaug(1,16),1,'first')) ;
else
t_5(i_aC,i_LuxR) = t(TbatchIndex) ;
end
if t(find(xaug(:,16)<0.30*xaug(1,16),1,'first')) < 22
t_30(i_aC,i_LuxR) = t(find(xaug(:,16)<0.30*xaug(1,16),1,'first')) ;
else
t_30(i_aC,i_LuxR) = t(TbatchIndex) ;
end
end
end
Productivity = SerineTiter./tbatch ;
Yield = SerineTiter./(xinit(1,1)-GLC_end) ;
% Plotting
aC_Limits = [min(aC) max(aC)] ;
LuxR_Limits = [min(LuxR) max(LuxR)] ;
StaticProductivity = [2.3 2.3;2.3 2.3] ;
figure(1)
subplot(2,2,1,'FontSize',22)
surf(aC,LuxR,Productivity','FaceColor','interp');
set(gca,'XScale','log'); set(gca,'YScale','log');
hold on
surf(aC_Limits,LuxR_Limits,StaticProductivity)
set(gca,'XScale','log'); set(gca,'YScale','log');
set(gca,'XTick',[1e-02 1e-01 1 10])
set(gca,'XTickLabel',[1e-02 1e-01 1 10])
set(gca,'YTick',[1e-02 1e-01 1 10 100])
set(gca,'YTickLabel',[1e-02 1e-01 1 10 100])
zlabel({'Productivity' '(mM Ser./hr)'},'FontSize',24)
xlim([0.01 10]); ylim([0.01 100]); zlim([0.5 3])
view([-54 26])
hold on
scatter3(0.12,0.5,3,64,'filled')
subplot(2,2,2,'FontSize',22)
surf(aC,LuxR,Yield','FaceColor','interp');
set(gca,'XScale','log'); set(gca,'YScale','log');
set(gca,'XTick',[1e-02 1e-01 1 10])
set(gca,'XTickLabel',[1e-02 1e-01 1 10])
set(gca,'YTick',[1e-02 1e-01 1 10 100])
set(gca,'YTickLabel',[1e-02 1e-01 1 10 100])
zlabel({'Yield' '(mM Ser./mM Gluc.)'},'FontSize',24)
xlim([0.01 10]); ylim([0.01 100]); zlim([0 1.5])
view([-54 26])
hold on
scatter3(0.12,0.5,1.22,64,'filled')
subplot(2,2,3,'FontSize',22)
surf(aC,LuxR,SerineTiter','FaceColor','interp');
set(gca,'XScale','log'); set(gca,'YScale','log');
set(gca,'XTick',[1e-02 1e-01 1 10])
set(gca,'XTickLabel',[1e-02 1e-01 1 10])
set(gca,'YTick',[1e-02 1e-01 1 10 100])
set(gca,'YTickLabel',[1e-02 1e-01 1 10 100])
zlabel('Serine titer(mM)','FontSize',24)
xlim([0.01 10]); ylim([0.01 100]); zlim([0 30])
view([-54 26])
hold on
scatter3(0.12,0.4,23.5,64,'filled')
subplot(2,2,4,'FontSize',22)
surf(aC,LuxR,t_5','FaceColor','interp');
set(gca,'XScale','log'); set(gca,'YScale','log');
hold on
surf(aC,LuxR,tbatch','FaceColor','interp');
set(gca,'XScale','log'); set(gca,'YScale','log');
set(gca,'XTick',[1e-02 1e-01 1 10])
set(gca,'XTickLabel',[1e-02 1e-01 1 10])
set(gca,'YTick',[1e-02 1e-01 1 10 100])
set(gca,'YTickLabel',[1e-02 1e-01 1 10 100])
set(gca,'ZTick',[0 3 6 9 12])
set(gca,'ZTickLabel',[0 3 6 9 12])
xlim([0.01 10]); ylim([0.01 100]); zlim([0 12])
zlabel('Time(hr)','FontSize',24)
view([-82 24])
hold on
scatter3(0.12,0.5,4.5,64,'filled')

B.2.5 Effect of γC and LuxR (section 4.3.5)

Parameters γC and LuxR are varied across the plane. The files qsdFBA.m and fba_in.m are also required to execute the main file below.

% File: intsim_parametric.m
% This file iterates the parameters gamma_C and LuxR
% and integrates the ODEs for every value of the grid.
global S b lb ub c f kLa T xinit N
kLa=80;
load('iJO1366.mat')
S = model.S ;
b = model.b ;
lb = model.lb ;
ub = model.ub ;
c = -model.c ;
% Conditions
lb(164) = -10 ; ub(164) = -10 ; % Glucose uptake rate
lb(252) = -20 ; ub(252) = -20 ; % O2 exchange
% 1) KNOCK-OUT
lb(2343) = 0 ; ub(2343) = 0 ; % SERD_L
% 2) FINE-TUNING
lb(2076) = 13.5927 ; ub(2076) = 13.5927 ; % PGCD
lb(2240) = -0.1344 ; ub(2240) = -0.1344 ; % PTAr
lb(547) = 0.0986 ; ub(547) = 0.0986 ; % ACS
lb(1848) = 0.3518 ; ub(1848) = 0.3518 ; % MTHFD
lb(2047) = 1.6533 ; ub(2047) = 1.6533 ; % PDH
lb(2067) = 0.0437 ; ub(2067) = 0.0437 ; % PFL
lb(2466) =-0.0277 ; ub(2466) =-0.0277 ; % TRPAS2
% Parameters approximations
% Basal Values from Weiss and You
vA = 1.6 ; aC = 0.12; gammaA = 0.6 ; rhoR = 30 ;
gammaR = 1.386 ; aL1 = 0.06 ; betaC = 0.000008 ; aL2 = 0.06 ;
thetaR = 0.01 ; gammaL = 1.386 ; betaL = 0.0008 ;
nsteps = 25 ;
LuxR_min = log10(0.01) ;
LuxR_max = log10(10) ;
LuxR = logspace(LuxR_min,LuxR_max,nsteps);
% #3 nominal value aC=0.12
msteps = 25 ;
gC_min = log10(0.1) ;
gC_max = log10(100) ;
gammaC = logspace(gC_min,gC_max,msteps) ;
% #6 nominal value gC=4.152
%the order for the external metabolites is as follows
met=strvcat(({'Glc','Gl','Rib','Ac','Lac','For','Eth','Pyr','Succ','O2',...
'CO2','biomass','N','AHL','AHL/LuxR Complex','LacI','lambdaCI' })');
for i_LuxR=1:length(LuxR)
for i_gC=1:length(gammaC)
progress_in_percent = 100*((i_LuxR-1)*msteps+i_gC)/nsteps/msteps
xinit(1,1)=20;%Glc

xinit(1,2)=0;%Gl

xinit(1,3)=0;%Rib

xinit(1,4)=0;%Ac

xinit(1,5)=0;%Lac

xinit(1,6)=0;%For

xinit(1,7)=0;%Eth

xinit(1,8)=0;%pyr

xinit(1,9)=0;%succ

xinit(1,10)=0.0;%O2

xinit(1,11)=0.5;%CO2

xinit(1,12)=0.01;%biomass

xinit(1,13)=0.8*10^(-3)*xinit(1,12) ;%AHL
xinit(1,14)=0.0;%LuxR/AHL Complex

xinit(1,15)=0.0 ;%LacI

xinit(1,16)=0.029 ;%lambdaCI

T=0.05;

N=260;

clear xaug x t vf;
%starting the loop
t(1)=0;
%initialising
x_oldex=xinit;
for i=1:N
lb(493) = 0 ; ub(493) = 2.3346/0.029*x_oldex(1,16) ; % ACALD
lb(1698) = 0 ; ub(1698) = 9.7169/0.029*x_oldex(1,16) ; % LSERDHr
v=fba_in(x_oldex,S,b,lb,ub,c,kLa,T); % FBA solution

delx(1,1) = v(164) ;% Glc

delx(1,2) = v(174) ;% Gl

delx(1,3) = v(281) ;% Rib

delx(1,4) = v(36) ;% Ac

delx(1,5) = v(208) ;% Lac

delx(1,6) = v(138) ;% For

delx(1,7) = v(124) ;% Eth

delx(1,8) = v(277) ;% Pyr

delx(1,9) = v(286) ;% SERINE

delx(1,12)= v(8) ;% Growth

mugx = v(8);

delo2up = v(252) ; % EX_o2(e)
delo2diff = kLa*(.21-x_oldex(1,10)) ;
delco2sec = v(85) ; % EX_co2(e)
delcco2diff=kLa*(x_oldex(1,11)-0.5);
% Integration
if mugx>0
x_oldex=max(zeros(size(x_oldex)),x_oldex);
tspan = [(i-1)*T i*T] ;
[time,yvect] = ode23s(@qsdFBA,tspan,x_oldex,[],aL1,aL2,...
aC,betaL,betaC,gammaA,gammaC(1,i_gC),...
gammaL,gammaR,thetaR,rhoR,vA,LuxR(1,i_LuxR),...
mugx,delx,kLa,delco2sec) ;
x_newex = yvect ;
[m,n] = size(x_newex) ;
x_newex = [x_newex(m,1:16)] ;
x_newex=max(zeros(size(x_newex)),x_newex);
else
x_newex = x_oldex ;
end
if x_newex(1,1)==0,
x_newex(1,1)=0.01;
end
t(i+1)=t(i)+T;
x(i,:)=x_newex;
vf(i,:)=v';
x_oldex=x_newex;

end

xaug=[xinit;x];

% Estimate objectives

clear TbatchIndex

TbatchIndex = find(xaug(:,1) == min(xaug(:,1)),1,'first');

tbatch(i_LuxR,i_gC) = t(TbatchIndex) ;
SerineTiter(i_LuxR,i_gC) = xaug(end,9) ;
GLC_end(i_LuxR,i_gC) = xaug(end,1) ;
if t(find(xaug(:,16)<0.05*xaug(1,16),1,'first')) < 22
t_5(i_LuxR,i_gC) = t(find(xaug(:,16)<0.05*xaug(1,16),1,'first')) ;
else
t_5(i_LuxR,i_gC) = t(TbatchIndex) ;
end
if t(find(xaug(:,16)<0.30*xaug(1,16),1,'first')) < 22
t_30(i_LuxR,i_gC) = t(find(xaug(:,16)<0.30*xaug(1,16),1,'first')) ;
else
t_30(i_LuxR,i_gC) = t(TbatchIndex) ;
end
end
end
Productivity = SerineTiter./tbatch ;
Yield = SerineTiter./(xinit(1,1)-GLC_end) ;
% Plotting
gammaC_Limits = [min(gammaC) max(gammaC)] ;
LuxR_Limits = [min(LuxR) max(LuxR)] ;
StaticProductivity = [2.3 2.3;2.3 2.3] ;
figure(1)
subplot(2,2,1,'FontSize',22)
surf(gammaC,LuxR,Productivity,'FaceColor','interp');
set(gca,'XScale','log'); set(gca,'YScale','log');
hold on
surf(gammaC_Limits,LuxR_Limits,StaticProductivity)
set(gca,'XScale','log'); set(gca,'YScale','log');
set(gca,'XTick',[1e-01 1 10 100])
set(gca,'XTickLabel',[1e-01 1 10 100])
set(gca,'YTick',[1e-01 1 10 100])
set(gca,'YTickLabel',[1e-01 1 10 100])
zlabel({'Productivity' '(mM Ser./hr)'},'FontSize',24)
xlim([0.1 100]); ylim([0.1 100]); zlim([0.5 3])
hold on
scatter3(4.152,0.5,3,64,'filled')
subplot(2,2,2,'FontSize',22)
surf(gammaC,LuxR,Yield,'FaceColor','interp');
set(gca,'XScale','log'); set(gca,'YScale','log');
set(gca,'XTick',[1e-01 1 10 100])
set(gca,'XTickLabel',[1e-01 1 10 100])
set(gca,'YTick',[1e-01 1 10 100])
set(gca,'YTickLabel',[1e-01 1 10 100])
zlabel({'Yield' '(mM Ser./mM Gluc.)'},'FontSize',24)
xlim([0.1 100]); ylim([0.1 100]); zlim([0 1.5])
hold on
scatter3(4.152,0.5,1.2,64,'filled')
subplot(2,2,3,'FontSize',22)
surf(gammaC,LuxR,SerineTiter,'FaceColor','interp');
set(gca,'XScale','log'); set(gca,'YScale','log');
set(gca,'XTick',[1e-01 1 10 100])
set(gca,'XTickLabel',[1e-01 1 10 100])
set(gca,'YTick',[1e-01 1 10 100])
set(gca,'YTickLabel',[1e-01 1 10 100])
zlabel('Serine titer(mM)','FontSize',24)
xlim([0.1 100]); ylim([0.1 100]); zlim([0 30])
hold on
scatter3(4.152,0.5,24,64,'filled')
subplot(2,2,4,'FontSize',22)
surf(gammaC,LuxR,t_5,'FaceColor','interp');
set(gca,'XScale','log'); set(gca,'YScale','log');
hold on
surf(gammaC,LuxR,tbatch,'FaceColor','interp');
set(gca,'XScale','log'); set(gca,'YScale','log');
set(gca,'XDir','reverse'); set(gca,'YDir','reverse');
set(gca,'XTick',[1e-01 1 10 100])
set(gca,'XTickLabel',[1e-01 1 10 100])
set(gca,'YTick',[1e-01 1 10 100])
set(gca,'YTickLabel',[1e-01 1 10 100])
set(gca,'ZTick',[0 3 6 9 12])
set(gca,'ZTickLabel',[0 3 6 9 12])
xlim([0.1 100]); ylim([0.1 100]); zlim([0 12])
zlabel('Time(hr)','FontSize',24)
view([-40 12])
hold on
scatter3(4.152,0.5,4.7,64,'filled')

B.2.6 Effect of all three parameters (section 4.3.7)

All three parameters (αC, γC and LuxR) are varied over a three-dimensional grid. The files qsdFBA.m and fba_in.m are also required to execute the main file below.

% File: intsim_parametric.m
% This file iterates all three parameters
% and integrates the ODEs for every value of the grid.
global S b lb ub c f kLa T xinit N N_rows
kLa=80;
load('iJO1366.mat')
S = full(model.S) ;
b = full(model.b) ;
lb = full(model.lb) ;
ub = full(model.ub) ;
c = -full(model.c) ;
[N_rows,N_columns] = size(S);
% Conditions
lb(164) = -10 ; ub(164) = -10 ; % Glucose uptake rate
lb(252) = -20 ; ub(252) = -20 ; % O2 exchange
% 1) KNOCK-OUT
lb(2343) = 0 ; ub(2343) = 0 ; % SERD_L
% 2) FINE-TUNING
lb(2076) = 13.5927 ; ub(2076) = 13.5927 ; % PGCD
lb(2240) = -0.1344 ; ub(2240) = -0.1344 ; % PTAr
lb(547) = 0.0986 ; ub(547) = 0.0986 ; % ACS
lb(1848) = 0.3518 ; ub(1848) = 0.3518 ; % MTHFD
lb(2047) = 1.6533 ; ub(2047) = 1.6533 ; % PDH
lb(2067) = 0.0437 ; ub(2067) = 0.0437 ; % PFL
lb(2466) =-0.0277 ; ub(2466) =-0.0277 ; % TRPAS2
% Parameters approximations
% Basal Values from Weiss and You
vA = 1.6 ; aC = 0.12; gammaA = 0.6 ; rhoR = 30 ;
gammaR = 1.386 ; aL1 = 0.06 ; betaC = 0.000008 ; aL2 = 0.06 ;
thetaR = 0.01 ; gammaL = 1.386 ; betaL = 0.0008 ;
nsteps = 25 ;
aC_min = log10(0.1) ;
aC_max = log10(100) ;
aC = logspace(aC_min,aC_max,nsteps) ;
% #6 nominal value gC=4.152
msteps = 25 ;
gC_min = log10(0.1) ;
gC_max = log10(100) ;
gammaC = logspace(gC_min,gC_max,msteps) ;
% #6 nominal value gC=4.152
psteps = 25 ;
LuxR_min = log10(0.01) ;
LuxR_max = log10(10) ;
LuxR = logspace(LuxR_min,LuxR_max,psteps);
% #3 nominal value LuxR=0.5
% Memory pre-allocation
tbatch = zeros(length(aC),length(gammaC),length(LuxR));
SerineTiter = zeros(length(aC),length(gammaC),length(LuxR));
GLC_end = zeros(length(aC),length(gammaC),length(LuxR));
t_5 = zeros(length(aC),length(gammaC),length(LuxR));
t_30 = zeros(length(aC),length(gammaC),length(LuxR));
T=0.05;
N=260;
%the order for the external metabolites is as follows
met=strvcat(({'Glc','Gl','Rib','Ac','Lac','For','Eth','Pyr','Succ','O2',...
'CO2','biomass','N','AHL','AHL/LuxR Complex','LacI','lambdaCI' })');
for i_aC=1:length(aC)
for i_gC=1:length(gammaC)
for i_LuxR = 1:length(LuxR)
progress_in_percent = 100*((i_aC-1)*msteps*psteps+(i_gC-1)*...
psteps+i_LuxR)/nsteps/msteps/psteps

xinit(1,1)=20;%Glc

xinit(1,2)=0;%Gl

xinit(1,3)=0;%Rib

xinit(1,4)=0;%Ac

xinit(1,5)=0;%Lac

xinit(1,6)=0;%For

xinit(1,7)=0;%Eth

xinit(1,8)=0;%pyr

xinit(1,9)=0;%succ
xinit(1,10)=0.0;%O2
xinit(1,11)=0.5;%CO2
xinit(1,12)=0.01;%biomass
xinit(1,13)=0.8*10^(-3)*xinit(1,12) ;%AHL
xinit(1,14)=0.0;%LuxR/AHL Complex

xinit(1,15)=0.0 ;%LacI

xinit(1,16)=0.029 ;%lambdaCI

clear xaug x t vf;

%starting the loop

t(1)=0;

%initialising

x_oldex=xinit;
for i=1:N
lb(493) = 0 ; ub(493) = 2.3346/0.029*x_oldex(1,16) ; % ACALD
lb(1698) = 0 ; ub(1698) = 9.7169/0.029*x_oldex(1,16) ; % LSERDHr
v=fba_in(x_oldex,S,b,lb,ub,c,kLa,T,N_rows,N); % FBA solution

delx(1,1) = v(164) ;% Glc

delx(1,2) = v(174) ;% Gl

delx(1,3) = v(281) ;% Rib

delx(1,4) = v(36) ;% Ac

delx(1,5) = v(208) ;% Lac

delx(1,6) = v(138) ;% For

delx(1,7) = v(124) ;% Eth

delx(1,8) = v(277) ;% Pyr

delx(1,9) = v(286) ;% SERINE

delx(1,12)= v(8) ;% Growth

mugx = v(8);

delo2up = v(252) ; % EX_o2(e)
delo2diff = kLa*(.21-x_oldex(1,10)) ;
delco2sec = v(85) ; % EX_co2(e)
delcco2diff=kLa*(x_oldex(1,11)-0.5);
% Integration
if mugx>0
x_oldex=max(zeros(size(x_oldex)),x_oldex);
tspan = [(i-1)*T i*T] ;
[time,yvect] = ode23s(@qsdFBA,tspan,x_oldex,[],aL1,aL2,...
aC(i_aC),betaL,betaC,gammaA,gammaC(1,i_gC),...
gammaL,gammaR,thetaR,rhoR,vA,LuxR(1,i_LuxR),...
mugx,delx,kLa,delco2sec) ;
x_newex = yvect ;
[m,n] = size(x_newex) ;
x_newex = [x_newex(m,1:16)] ;
x_newex=max(zeros(size(x_newex)),x_newex);
else
x_newex = x_oldex ;
end
if x_newex(1,1)==0,
x_newex(1,1)=0.01;
end
t(i+1)=t(i)+T;
%analytical integration
x(i,:)=x_newex;
vf(i,:)=v';
x_oldex=x_newex;

end

xaug=[xinit;x];

% Estimate the objectives

clear TbatchIndex

TbatchIndex = find(xaug(:,1) == min(xaug(:,1)),1,'first');

tbatch(i_aC,i_gC,i_LuxR) = t(TbatchIndex) ;
SerineTiter(i_aC,i_gC,i_LuxR) = xaug(end,9) ;
GLC_end(i_aC,i_gC,i_LuxR) = xaug(end,1) ;
if t(find(xaug(:,16)<0.05*xaug(1,16),1,'first')) < 22
t_5(i_aC,i_gC,i_LuxR) = t(find(xaug(:,16)<0.05*xaug(1,16),1,'first')) ;
else
t_5(i_aC,i_gC,i_LuxR) = t(TbatchIndex) ;
end
if t(find(xaug(:,16)<0.30*xaug(1,16),1,'first')) < 22
t_30(i_aC,i_gC,i_LuxR) = t(find(xaug(:,16)<0.30*xaug(1,16),1,'first')) ;
else
t_30(i_aC,i_gC,i_LuxR) = t(TbatchIndex) ;
end
end
end
end
Productivity = SerineTiter./tbatch ;
Yield = SerineTiter./(xinit(1,1)-GLC_end) ;
% Plotting

[x,y,z] = meshgrid(gammaC,aC,LuxR);
V_P = Productivity;
V_Y = Yield;
SubplotSize = 0.4 ;
FontSize = 24 ;
BoxFontSize = 26 ;
Position_X1 = 0.13 ;
Position_X2 = 0.63 ;
Position_Y1 = 0.09 ;
Position_Y2 = 0.56 ;
Level_Y1 = [1.2] ; % max(Y)=1.3(static)
Level_Y2 = [1.1] ; % max(Y)=1.3(static)
Level_P1 = [2.9] ; % max(P)=3(optimal dynamic)
Level_P2 = [2.8] ; % max(P)=3(optimal dynamic)
str_array_Y1 = str2num(sprintf('%0.2f', Level_Y1));
str_array_Y1 = strread(num2str(str_array_Y1),'%s');
str_array_Y2 = str2num(sprintf('%0.2f', Level_Y2));
str_array_Y2 = strread(num2str(str_array_Y2),'%s');
str_array_P1 = str2num(sprintf('%0.2f', Level_P1));
str_array_P1 = strread(num2str(str_array_P1),'%s');
str_array_P2 = str2num(sprintf('%0.2f', Level_P2));
str_array_P2 = strread(num2str(str_array_P2),'%s');
str_array_1x1 = strcat({'Prod. > '},str_array_P1,{'Yield >'},str_array_Y1);
str_array_1x2 = strcat({'Prod. > '},str_array_P2,{'Yield >'},str_array_Y1);
str_array_2x1 = strcat({'Prod. > '},str_array_P1,{'Yield >'},str_array_Y2);
str_array_2x2 = strcat({'Prod. > '},str_array_P2,{'Yield >'},str_array_Y2);
DATA1 = smooth3(V_P > Level_P1 & V_Y > Level_Y1) ;
DATA2 = smooth3(V_P > Level_P2 & V_Y > Level_Y1) ;
DATA3 = smooth3(V_P > Level_P1 & V_Y > Level_Y2,'gaussian');
DATA4 = smooth3(V_P > Level_P2 & V_Y > Level_Y2) ;
figure(1)
h1 = subplot(2,2,1,'FontSize',FontSize);
ax1=get(h1,'Position');
set(h1,'Position',ax1);
set(h1,'position',[Position_X1 Position_Y2 SubplotSize SubplotSize]);
set(gca,'XTick',[1e-01 1 10 100])
set(gca,'XTickLabel',[1e-01 1 10 100])
set(gca,'YTick',[1e-01 1 10 100])
set(gca,'YTickLabel',[1e-01 1 10 100])
set(gca,'ZTick',[1e-02 1e-01 1 10])
set(gca,'ZTickLabel',[1e-02 1e-01 1 10])
[f1,v1] = isosurface(x,y,z,DATA1,0.5) ;
p1 = patch('Faces',f1,'Vertices',v1,...
'FaceColor','blue',...
'EdgeColor','none',...
'AmbientStrength',.2,...
'SpecularStrength',.7,...
'DiffuseStrength',.4);
patch(isocaps(x,y,z,DATA1,0.5),...
'FaceColor','interp',...
'EdgeColor','none')
colormap cool
pbaspect([1,1,1])
axis tight
view(3)
grid on
camlight right
camlight left
set(gcf,'Renderer','zbuffer');
lighting phong
set(gca,'XScale','log','YScale','log','ZScale','log')
grid on
xlabel('$\gamma_C$','interpreter','latex','FontSize',28)
ylabel('$\alpha_C$','interpreter','latex','FontSize',28)
zlabel('LuxR')
view(40,20)
xlim([min(min(min(x))) max(max(max(x)))])
ylim([min(min(min(y))) max(max(max(y)))])
zlim([min(min(min(z))) max(max(max(z)))])
h2 = subplot(2,2,2,'FontSize',FontSize);
ax2=get(h2,'Position');
set(h2,'Position',ax2);
set(h2,'position',[Position_X2 Position_Y2 SubplotSize SubplotSize]);
set(gca,'XTick',[1e-01 1 10 100])
set(gca,'XTickLabel',[1e-01 1 10 100])
set(gca,'YTick',[1e-01 1 10 100])
set(gca,'YTickLabel',[1e-01 1 10 100])
set(gca,'ZTick',[1e-02 1e-01 1 10])
set(gca,'ZTickLabel',[1e-02 1e-01 1 10])
[f2,v2] = isosurface(x,y,z,DATA2,0.5) ;
p2 = patch('Faces',f2,'Vertices',v2,...
'FaceColor','blue',...
'EdgeColor','none',...
'AmbientStrength',.2,...
'SpecularStrength',.7,...
'DiffuseStrength',.4);
patch(isocaps(x,y,z,DATA2,0.5),...
'FaceColor','interp',...
'EdgeColor','none')
colormap cool
pbaspect([1,1,1])
axis tight
view(3)
grid on
camlight right
camlight left
set(gcf,'Renderer','zbuffer');
lighting phong
set(gca,'XScale','log','YScale','log','ZScale','log')
grid on
xlabel('$\gamma_C$','interpreter','latex','FontSize',28)
ylabel('$\alpha_C$','interpreter','latex','FontSize',28)
zlabel('LuxR')
view(40,20)
xlim([min(min(min(x))) max(max(max(x)))])
ylim([min(min(min(y))) max(max(max(y)))])
zlim([min(min(min(z))) max(max(max(z)))])
h3 = subplot(2,2,3,'FontSize',FontSize);
ax3=get(h3,'Position');
set(h3,'Position',ax3);
set(h3,'position',[Position_X1 Position_Y1 SubplotSize SubplotSize]);
set(gca,'XTick',[1e-01 1 10 100])
set(gca,'XTickLabel',[1e-01 1 10 100])
set(gca,'YTick',[1e-01 1 10 100])
set(gca,'YTickLabel',[1e-01 1 10 100])
set(gca,'ZTick',[1e-02 1e-01 1 10])
set(gca,'ZTickLabel',[1e-02 1e-01 1 10])
[f3,v3] = isosurface(x,y,z,DATA3,0.5) ;
p3 = patch('Faces',f3,'Vertices',v3,...
'FaceColor','blue',...
'EdgeColor','none',...
'AmbientStrength',.2,...
'SpecularStrength',.7,...
'DiffuseStrength',.4);
patch(isocaps(x,y,z,DATA3,0.5),...
'FaceColor','interp',...
'EdgeColor','none')
colormap cool
pbaspect([1,1,1])
axis tight
view(3)
grid on
camlight right
camlight left
set(gcf,'Renderer','zbuffer');
lighting phong
set(gca,'XScale','log','YScale','log','ZScale','log')
grid on
xlabel('$\gamma_C$','interpreter','latex','FontSize',28)
ylabel('$\alpha_C$','interpreter','latex','FontSize',28)
zlabel('LuxR')
view(40,20)
xlim([min(min(min(x))) max(max(max(x)))])
ylim([min(min(min(y))) max(max(max(y)))])
zlim([min(min(min(z))) max(max(max(z)))])
h4 = subplot(2,2,4,'FontSize',FontSize);
ax4=get(h4,'Position');
set(h4,'Position',ax4);
set(h4,'position',[Position_X2 Position_Y1 SubplotSize SubplotSize]);
set(gca,'XTick',[1e-01 1 10 100])
set(gca,'XTickLabel',[1e-01 1 10 100])
set(gca,'YTick',[1e-01 1 10 100])
set(gca,'YTickLabel',[1e-01 1 10 100])
set(gca,'ZTick',[1e-02 1e-01 1 10])
set(gca,'ZTickLabel',[1e-02 1e-01 1 10])
[f4,v4] = isosurface(x,y,z,DATA4,0.5) ;
p4 = patch('Faces',f4,'Vertices',v4,...
'FaceColor','blue',...
'EdgeColor','none',...
'AmbientStrength',.2,...
'SpecularStrength',.7,...
'DiffuseStrength',.4);
patch(isocaps(x,y,z,DATA4,0.5),...
'FaceColor','interp',...
'EdgeColor','none')
colormap cool
pbaspect([1,1,1])
axis tight
view(3)
grid on
camlight right
camlight left
set(gcf,'Renderer','zbuffer');
lighting phong
set(gca,'XScale','log','YScale','log','ZScale','log')
grid on
xlabel('$\gamma_C$','interpreter','latex','FontSize',28)
ylabel('$\alpha_C$','interpreter','latex','FontSize',28)
zlabel('LuxR')
view(40,20)
xlim([min(min(min(x))) max(max(max(x)))])
ylim([min(min(min(y))) max(max(max(y)))])
zlim([min(min(min(z))) max(max(max(z)))])

Appendix C

Mathematical analysis of the dynamic strategy

The global sensitivity analysis involves the following steps:

Step 1:

• Select the range and distribution of the parameters

• Select the output(s) of interest

• For n perturbed parameters, set the base sample size to N = 2^n (e.g., for 13 parameters, N = 2^13 = 8192)

• Use quasi-random rather than pseudo-random sampling

Step 2:

For each of the n parameters, we draw 2N values from its distribution, where N is the base sample size. The first N draws form the N × n sampling matrix A and the remaining N draws form the N × n re-sampling matrix B, shown below.


Sampling matrix

$$A = \begin{pmatrix} p_{1,1} & p_{2,1} & \cdots & p_{n,1} \\ p_{1,2} & p_{2,2} & \cdots & p_{n,2} \\ \vdots & \vdots & \ddots & \vdots \\ p_{1,N} & p_{2,N} & \cdots & p_{n,N} \end{pmatrix}$$

Re-sampling matrix

$$B = \begin{pmatrix} p_{1,N+1} & p_{2,N+1} & \cdots & p_{n,N+1} \\ p_{1,N+2} & p_{2,N+2} & \cdots & p_{n,N+2} \\ \vdots & \vdots & \ddots & \vdots \\ p_{1,2N} & p_{2,2N} & \cdots & p_{n,2N} \end{pmatrix}$$

We create n matrices C_i that are identical to B except for the i-th column, which is taken from matrix A. C_i is therefore the matrix in which all parameters except p_i are re-sampled.

$$C_i = \begin{pmatrix} p_{1,N+1} & p_{2,N+1} & \cdots & p_{i,1} & \cdots & p_{n,N+1} \\ p_{1,N+2} & p_{2,N+2} & \cdots & p_{i,2} & \cdots & p_{n,N+2} \\ \vdots & \vdots & & \vdots & & \vdots \\ p_{1,2N} & p_{2,2N} & \cdots & p_{i,N} & \cdots & p_{n,2N} \end{pmatrix}$$
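The construction of A, B and C_i can be sketched in MATLAB as follows. This is a minimal illustration rather than the script used for the analysis. The three parameters, their bounds, the log-uniform distribution and the use of rand (a pseudo-random stand-in for the quasi-random sampler recommended in Step 1) are all assumptions made for the example.

```matlab
% Sketch: build sampling matrix A, re-sampling matrix B and mixed matrices C_i.
% ASSUMPTIONS (not from the thesis): 3 parameters, hypothetical bounds,
% log-uniform distributions, rand() standing in for a quasi-random sequence.
n  = 3;                        % number of perturbed parameters
N  = 2^n;                      % base sample size
lo = [0.01 0.1 0.01];          % hypothetical lower bounds
hi = [10   100 100 ];          % hypothetical upper bounds

rng(1);                        % fix the seed so the draw is reproducible
U = rand(2*N, n);              % 2N points in the unit hypercube

% Map the unit hypercube to log-uniformly distributed parameter values
logLo = repmat(log10(lo), 2*N, 1);
logHi = repmat(log10(hi), 2*N, 1);
P = 10.^(logLo + U.*(logHi - logLo));

A = P(1:N, :);                 % rows 1..N     -> sampling matrix
B = P(N+1:2*N, :);             % rows N+1..2N  -> re-sampling matrix

% C{i} equals B except for column i, which is copied from A
C = cell(1, n);
for i = 1:n
    C{i} = B;
    C{i}(:, i) = A(:, i);
end
```

Storing the C_i matrices in a cell array keeps one N × n matrix per parameter, which matches the n · N model evaluations counted in Step 3.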

Step 3:

Simulate the model to obtain the model output(s) for all the parameter sets in matrices A, B and C_i. The outputs are y_A = f(A), y_B = f(B), and y_{C_i} = f(C_i), respectively. The numbers of model evaluations are N, N and n · N, respectively, or (n + 2) · N in total. This is the most computationally expensive part of the global sensitivity analysis (GSA). For example, for N = 8000 and n = 13 we perform 120,000 model evaluations; at ≈ 25 s per dFBA simulation, the total simulation time is approximately 30 days.

Step 4:

We estimate the sensitivity indices based on the equations:

$$S_i = \frac{y_A \cdot y_{C_i} - \frac{1}{N}\sum_{j=1}^{K} y_A^{j} \cdot y_B^{j}}{y_A \cdot y_A - \left(\frac{1}{N}\sum_{j=1}^{K} y_A^{j}\right)^{2}} \tag{C.1}$$

$$S_{Ti} = 1 - \frac{y_B \cdot y_{C_i} - \left(\frac{1}{N}\sum_{j=1}^{K} y_A^{j}\right)^{2}}{y_B \cdot y_B - \left(\frac{1}{N}\sum_{j=1}^{K} y_B^{j}\right)^{2}} \tag{C.2}$$
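As a hedged illustration of Step 4, the sketch below evaluates equations C.1 and C.2 for a toy model. Several choices here are assumptions and not taken from the thesis: the additive test model y = p1 + 2·p2, uniform inputs on [0, 1], pseudo-random sampling with rand, the reading of each dot product as a sample mean (i.e., divided by N), and K = N.

```matlab
% Sketch: first-order (S) and total (ST) indices from equations C.1 and C.2.
% ASSUMPTIONS (not from the thesis): toy model, uniform inputs, rand() sampling,
% dot products taken as sample means (divided by N), and K = N.
model = @(P) P(:,1) + 2*P(:,2);     % hypothetical test model, y = p1 + 2*p2
n = 2;                              % number of parameters
N = 2^18;                           % base sample size

rng(0);                             % reproducible draw
A = rand(N, n);                     % sampling matrix
B = rand(N, n);                     % re-sampling matrix

yA = model(A);
yB = model(B);

S  = zeros(1, n);
ST = zeros(1, n);
for i = 1:n
    Ci = B;                         % C_i: B with column i taken from A
    Ci(:, i) = A(:, i);
    yC = model(Ci);

    % Equation C.1 (first-order index)
    S(i)  = (mean(yA.*yC) - mean(yA.*yB)) / (mean(yA.^2) - mean(yA)^2);
    % Equation C.2 (total index)
    ST(i) = 1 - (mean(yB.*yC) - mean(yA)^2) / (mean(yB.^2) - mean(yB)^2);
end
disp([S; ST])
```

For this additive toy model the analytical values are S = S_T = (0.2, 0.8), because the inputs have variances 1/12 and 4/12 and do not interact, so the estimates should land close to these numbers. The difference S_Ti − S_i is what Figure S.1 reports as the interaction contribution.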

[Figure S.1: bar chart of the sensitivity index (total, interaction and individual) for each parameter of the genetic circuit.]

Figure S.1: Averages of total, interaction and individual sensitivity indices of all parameters involved in the genetic circuit. Only parameters αC, γC and LuxR have total indices higher than 0.1, and are therefore significant.

Figure S.2: Effect of αC -γC -LuxR on productivity. Isosurfaces for values of productivity higher than 2.9, 2.8, 2.7 and 2.6 mM serine/h.

Figure S.3: Effect of αC -γC -LuxR on yield. Isosurfaces for values of yield higher than 1.3, 1.2, 1.1 and 1 mM serine/mM glucose.

Appendix D

Phase plane analysis of the genetic circuit

D.1 Introduction

A common problem in the analysis of nonlinear dynamic systems is to find the parameter boundaries at which the systems change their behaviour. Such changes include multistability (i.e., several steady states), oscillations and chaotic behaviour. For our bistable system, we are interested in locating the range of the parameters for which the toggle settles on a steady state that is zero (i.e., the off state).

The range, area or volume of the parameter space has been associated with the robustness of the system. Robustness analysis is an important concept in the analysis of regulatory circuits. Qualitatively, robustness analysis tells us whether a network is able to recognize and respond to the appropriate signal while ignoring small variations in the environment.

Most methods use the volume of the parameter space to quantify robustness. A small parameter space volume indicates low robustness, or high sensitivity, and very precise fine-tuning is required to achieve desirable performance. Robustness is therefore linked to a large volume and sensitivity to a small one. In some cases, the size is not enough to quantify robustness and the shape is also crucial. The shape of a region indicates how far perturbations around each parameter could disturb the circuit.

Bifurcation (or phase plane) analysis is a tool for analysing the influence of parameters on the location and stability of equilibrium points. Phase planes are the trajectories of the final or equilibrium states with respect to one or more parameters. Although limited to two or three varied parameters, the numerical continuation method is commonly used for bifurcation analysis (Govaerts et al., 2006). The numerical continuation method allows the computation of approximate solutions of a nonlinear system without repeatedly solving the system of dynamic equations. The basis of the method is that a solution of the system at a specific value of a parameter can be used as an initial guess to estimate the solution within a small range around the starting value. With an iterative approach and a sufficiently small step, the method converges and the phase plane can be computed much faster than with a brute-force method.
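The idea of reusing the previous solution as the initial guess can be illustrated with the simplest variant, natural-parameter continuation with a Newton corrector. The scalar test problem below, f(u, a) = a/(1 + u^2) − u, is a hypothetical example chosen for the sketch and is not the circuit model. Dedicated continuation packages use more elaborate predictor-corrector schemes that can also follow a branch around fold points, which this simple scheme cannot.

```matlab
% Sketch: natural-parameter continuation with Newton correction.
% ASSUMPTION (not from the thesis): scalar test problem f(u,a) = a/(1+u^2) - u.
f  = @(u, a) a./(1 + u.^2) - u;                 % steady-state residual
fu = @(u, a) -2*a.*u./(1 + u.^2).^2 - 1;        % derivative df/du

alphas = linspace(0, 10, 101);                  % parameter values to visit
u      = zeros(size(alphas));                   % solution branch u(alpha)

uk = 0;                                         % exact solution at alpha = 0
for k = 1:numel(alphas)
    a = alphas(k);
    % Predictor: reuse the previous solution uk as the initial guess.
    % Corrector: Newton iterations on f(u, a) = 0.
    for it = 1:50
        du = -f(uk, a) / fu(uk, a);
        uk = uk + du;
        if abs(du) < 1e-12
            break
        end
    end
    u(k) = uk;
end
```

Each parameter step needs only a few Newton iterations because the starting guess is already close to the new solution, which is the source of the speed-up over integrating the dynamic equations to steady state at every grid point.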

Bifurcation analysis has been used to elucidate oscillations in chemotaxis (Ma and Iglesias, 2002), cell cycle regulation (Morohashi et al., 2002) and multiple states in red cell metabolism (Joshi and Palsson, 1990). In bistable systems with switch-like response, nonlinear analysis has also been used to establish the conditions for the design of a robust genetic toggle switch (Gardner et al., 2000). Here, we use bifurcation analysis to update the conditions for the design of a toggle switch coupled to the quorum sensing mechanism.

D.2 Methods

Bifurcation analysis is a fundamental tool for studying dynamical systems. Nonlinear interactions give rise to complex behaviours such as multistability, oscillations and chaos. For a general set of nonlinear ordinary differential equations (ODEs) of the form:

$$\frac{du}{dt} = f(u, \alpha) \tag{D.1}$$

bifurcation or phase plane analysis is used to determine the critical values of parameters α that change the behaviour of the states u. Here, u is the vector of dependent variables or states (dimensions: n × 1) and α is the vector of parameters (dimensions: p × 1).

The most common condition for stable states is negative eigenvalues of the Jacobian. However, to distinguish between different topological types of stable states (i.e., mono-stable and bi-stable), additional tests need to be evaluated. Evaluating the Jacobian matrix at the equilibrium and examining its eigenvalues determines the type of a state. When an equilibrium state undergoes a critical change from mono-stable to bi-stable, one eigenvalue of the Jacobian is zero, while the other one is negative and real (imaginary eigenvalues result in oscillations).
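The eigenvalue test can be sketched for the two-repressor toggle model of Section D.3 (equations D.2 and D.3). The parameter values α1 = α2 = 10 are an assumption chosen for illustration, and the fixed-point iteration used to locate the asymmetric equilibrium is a convenience of this sketch, not the continuation method.

```matlab
% Sketch: classify equilibria of the toggle model (D.2)-(D.3) by Jacobian eigenvalues.
% ASSUMPTION (not from the thesis): alpha1 = alpha2 = 10 chosen for illustration.
alpha1 = 10; alpha2 = 10;
b = 2; g = 2;                 % cooperativity factors beta and gamma

% Jacobian of the right-hand side with respect to (u, v)
jac = @(u, v) [ -1, -alpha1*b*v^(b-1)/(1 + v^b)^2; ...
                -alpha2*g*u^(g-1)/(1 + u^g)^2, -1 ];

% Symmetric equilibrium u = v = s solves s*(1 + s^2) = alpha1
r = roots([1 0 1 -alpha1]);
s = real(r(abs(imag(r)) < 1e-9));

% Asymmetric equilibrium (high u, low v) by fixed-point iteration
u = alpha1; v = 0;
for k = 1:200
    u = alpha1/(1 + v^b);
    v = alpha2/(1 + u^g);
end

eigSym  = eig(jac(s, s));     % one positive eigenvalue -> unstable (saddle)
eigAsym = eig(jac(u, v));     % both negative           -> stable state
disp(eigSym.'); disp(eigAsym.');
```

With these values the symmetric equilibrium has one positive eigenvalue and is therefore unstable, while the asymmetric equilibrium has two negative real eigenvalues and is stable, which is the signature of a bistable switch.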

Continuation methods, as discussed earlier, compute the solution curves of the system f(u, α) = 0 recursively from the previous solutions with a prediction-correction continuation algorithm. MATCONT is a recent-generation bifurcation package, implemented as a MATLAB toolbox, that provides sophisticated and up-to-date numerical algorithms for the continuation of bifurcation points.

D.3 Results

Phase plane analysis has provided the conditions to design a robust bistable switch (Cherry and Adler, 2000). The first condition is the cooperative repression of transcription, which is modelled with a Hill function with a cooperativity factor greater than one (parameters β and γ in the following equations). In the study by Cherry and Adler (2000), it was also shown that Michaelis-Menten-type kinetics cannot generate a switch. The second condition required for a robust toggle switch is strong and balanced promoters. This result from Gardner et al. (2000) is reproduced in the following figure (solid line, for R=0). The lines mark the bistable region. The model used for this analysis is identical to the one presented by Gardner et al. (2000) and consists of the two dimensionless equations:

$$\frac{du}{dt} = \frac{\alpha_1}{1 + v^{\beta}} - u \tag{D.2}$$

$$\frac{dv}{dt} = \frac{\alpha_2}{1 + u^{\gamma}} - v \tag{D.3}$$

where u and v are the concentrations of the repressors, and α1 and α2 are the effective rates of repressor synthesis. These are lumped parameters that include all the effects occurring during transcription and translation. Coefficients β and γ are the cooperativity factors and their values are fixed to 2, in order to model the dimer formed by the repressor proteins and the binding of the dimer to the promoters.
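A minimal simulation sketch of equations D.2 and D.3 shows the switch-like behaviour directly. The values α1 = α2 = 10, the integration horizon and the two initial conditions are assumptions chosen for illustration. The solver ode45 is used here only because this small model is not stiff.

```matlab
% Sketch: integrate the toggle model (D.2)-(D.3) from two initial conditions.
% ASSUMPTIONS (not from the thesis): alpha1 = alpha2 = 10, horizon of 50 time
% units, and the two starting points below.
alpha1 = 10; alpha2 = 10;
b = 2; g = 2;                 % cooperativity factors beta and gamma, fixed to 2

% Right-hand side: first entry is du/dt (D.2), second entry is dv/dt (D.3)
rhs = @(t, y) [alpha1/(1+y(2)^b)-y(1); alpha2/(1+y(1)^g)-y(2)];

[~, yHighU] = ode45(rhs, [0 50], [5; 1]);   % start with u > v
[~, yHighV] = ode45(rhs, [0 50], [1; 5]);   % start with v > u

fprintf('start u>v: u = %.3f, v = %.3f\n', yHighU(end,1), yHighU(end,2));
fprintf('start v>u: u = %.3f, v = %.3f\n', yHighV(end,1), yHighV(end,2));
```

The two trajectories settle on different steady states, one with u high and v low and the other with the roles reversed, depending only on which repressor starts ahead.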

The solid line for R=0 (i.e., no quorum induction) supports both conditions discussed earlier. The bistable region lies between the lines, whereas outside the lines only one stable state is possible (i.e., either on or off). For β and γ values equal to 1, no bistable region is observed, justifying the condition for cooperative repression. The second condition, balanced and strong promoters, is supported by the shape of the bistable region. The bistable area is symmetrical (which indicates balanced promoters) and increases in size as the parameters α1 and α2 increase, which indicates that stronger promoters favour a robust controller.

To include the effect of coupling the toggle switch to the quorum sensing mechanism, a term modelling the activation by the dimer R was included in differential equation D.2. For the bifurcation analysis we use a single lumped parameter R that describes the activation strength, so that equations D.2 and D.3 become:

\[ \frac{du}{dt} = \frac{\alpha_1}{1 + v^{\beta}} + R - u \tag{D.4} \]

\[ \frac{dv}{dt} = \frac{\alpha_2}{1 + u^{\gamma}} - v \tag{D.5} \]

The phase plane analysis for increasing values of the lumped activation strength R is shown in Figure S.4. Increasing the strength of the quorum sensing reduces the robustness of the switch, since the area of bistability shrinks with increasing R. This indicates that the switch becomes more sensitive to variations in the parameters, and that tuning the complete circuit will be more delicate than tuning the toggle alone.

In addition, the shape of the area changes, and the symmetry with respect to the axes is lost with increasing R. The condition for balanced promoters has to be modified as follows: the promoter triggered by the complex R (i.e., the one described by Equation D.4) has to be weaker than the other. The rationale is that the additional force of complex R now activates this promoter.
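The shrinking of the bistable area can be checked with a coarse parameter scan of equations D.4 and D.5. The sketch below is an assumption-laden illustration rather than the MATCONT analysis used for the figure: the function names, the grid of α values (10 to 90 in steps of 10), and the grid-plus-bisection root search are all choices made here. A steady state is counted as stable when the product ab of the off-diagonal Jacobian entries is below one, since the eigenvalues are −1 ± √(ab).

```python
def count_stable_states(alpha1, alpha2, R=0.0, beta=2.0, gamma=2.0,
                        n_grid=2000):
    """Number of stable steady states of D.4-D.5 for one parameter set."""
    def v_of(u):
        # v-nullcline: dv/dt = 0
        return alpha2 / (1.0 + u ** gamma)

    def g(u):
        # du/dt evaluated on the v-nullcline
        return alpha1 / (1.0 + v_of(u) ** beta) + R - u

    def is_stable(u):
        v = v_of(u)
        a = -alpha1 * beta * v ** (beta - 1.0) / (1.0 + v ** beta) ** 2
        b = -alpha2 * gamma * u ** (gamma - 1.0) / (1.0 + u ** gamma) ** 2
        return a * b < 1.0  # both eigenvalues -1 +/- sqrt(a*b) negative

    roots = []
    step = alpha1 / n_grid  # all roots lie in [R, R + alpha1]
    lo, g_lo = R, g(R)
    for i in range(1, n_grid + 1):
        hi = R + i * step
        g_hi = g(hi)
        if g_lo == 0.0:
            roots.append(lo)
        elif g_lo * g_hi < 0.0:
            a, b, g_a = lo, hi, g_lo
            for _ in range(60):  # bisection on the bracketing interval
                mid = 0.5 * (a + b)
                g_mid = g(mid)
                if g_a * g_mid <= 0.0:
                    b = mid
                else:
                    a, g_a = mid, g_mid
            roots.append(0.5 * (a + b))
        lo, g_lo = hi, g_hi
    if g_lo == 0.0:
        roots.append(lo)
    return sum(1 for u in roots if is_stable(u))


def bistable_points(R, alphas):
    """Count grid points (alpha1, alpha2) that have two stable states."""
    return sum(
        1
        for a1 in alphas
        for a2 in alphas
        if count_stable_states(float(a1), float(a2), R) == 2
    )


alphas = range(10, 91, 10)  # coarse illustrative grid (an assumption)
for R in (0.0, 1.0, 2.0, 3.0):
    print("R =", R, "bistable grid points:", bistable_points(R, alphas))
```

On such a coarse grid the counts only approximate the area between the boundary lines; points close to a boundary can be misclassified, so a finer grid or a continuation method is preferable for locating the boundaries themselves.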

[Figure S.4: boundaries of the bistable region in the (α1, α2) plane for R=0, R=1, R=2 and R=3; both axes range from 0 to 90.]

Figure S.4: Phase plane diagrams for increasing strength of the quorum-sensing induction. The area between the lines is the bistable region. When the strength of the quorum-sensing module increases, the area of bistability shrinks. Also, in the absence of quorum-sensing, the two promoters must be of similar strength. In the presence of a strong quorum-sensing system, the promoter that is triggered by the quorum-sensing has to be weaker than the other one.

Appendix E

Supplemental data for Chapter 5

Figure 5.4.A.: Optical density values at 600 nm

Time (hr) IPTG(-)#1 IPTG(-)#2 IPTG(-)#3

0 0.004 0.004 0.006

5 0.115 0.132 0.116

10 0.455 0.56 0.442

18 2.47 2.84 1.865

24 2.619 2.979 2.934

29 3.02 2.97 3.09

34 3.04 3.09 3.2

44 3.05 3.09 3.22


Time (hr) IPTG(+)#1 IPTG(+)#2 IPTG(+)#3

0 0.008 0.008 0.008

5 0.15 0.115 0.137

10 0.64 0.444 0.561

18 2.8 2.285 2.64

24 3.06 2.673 2.754

29 3.25 2.82 2.88

34 3.27 3.09 3.15

44 3.21 3.11 3.11

Time (hr) ∆5#1 ∆5#2 ∆5#3

0 0.009 0.009 0.009

5 0.159 0.16 0.155

10 0.408 0.43 0.426

18 1.155 1.24 1.24

24 1.746 2.052 2.205

29 3 3.2 3.37

34 3.83 3.79 3.81

44 3.68 3.87 3.76

Figure 5.4.B.: Optical density values at 600 nm

Time (hr) IPTG(-)#1 IPTG(-)#2 IPTG(+)#1 IPTG(+)#2 ∆5#1 ∆5#2

0 0.009 0.009 0.009 0.01 0.011 0.013

4 0.058 0.071 0.071 0.102 0.111 0.109

8 0.258 0.29 0.297 0.358 0.294 0.287

12 0.605 0.66 0.67 0.733 0.57 0.55

16 1.257 1.44 1.299 1.398 1.086 1.038

20 2.185 2.52 2.125 2.38 1.88 1.77

24 3.64 3.72 3.3 3.31 2.72 2.58

28 3.86 3.69 3.6 3.81 3.62 3.67

32 3.29 3.32 3.64 3.41 3.94 4.07

36 3.55 3.36 3.63 3.85 4 3.92

41.75 3.34 3.33 3.16 3.15 3.93 3.79

Figure 5.5.: Optical density values at 600 nm

Time (hr) ∆5::Cm ∆-pTOG(ptsG)

0 0.057 0.047

8 0.083 0.065

16 0.144 0.213

18 0.172 0.287

22 0.274 0.502

24.2 0.398 0.664

27.2 0.545 0.98

30.2 0.864 1.179

32.5 1.065 1.329

34.5 1.143 1.377

35.5 1.281 1.323

36.5 1.395 1.326

37.5 1.494 1.341

38.5 1.569 1.335

Figure 5.6.: Strain: ∆5::Cm

Time (hr) O.D. (600nm) Glucose (mM) Pyruvate (mM) Acetate (mM)

0 0.137 51.62 0.75 0.68

4 0.347 55.09 1.65 0.99

8 0.576 61.47 2.33 1.54

12 1.015 54.22 2.29 2.13

18 3.78 32.54 18.00 9.34

22 4.98 7.65 51.33 11.32

27 4.39 3.27 56.00 16.23

Strain: ∆5-pTOG(ptsG)

Time (hr) O.D. (600nm) Glucose (mM) Pyruvate (mM) Acetate (mM)

0 0.102 0.21 0.73 0.65

4 0.374 0.21 2.79 1.04

8 0.94 0.36 5.22 1.86

12 3.725 0.69 25.57 7.69

18 4.19 2.30 38.94 15.28

22 4.57 2.56 47.22 17.76

27 4.08 2.78 51.08 22.48

Figure 5.8.: Strain: wild-type

Time (hr) Glucose (mM) Lactate (mM) Acetate (mM) O.D. (600 nm)

0 49.1896544 0 0 0.025

5 46.015902 0 1.5215062 0.038

15 43.7380186 0 1.7827284 0.077

23.5 0 0 9.7546911 0.336

Strain: ∆(adh,pta)

Time (hr) Glucose (mM) Lactate (mM) Acetate (mM) O.D. (600 nm)

0 43.82 3.97 0 0.02

5 45.73 7.37 0 0.069

15 18.15 15.56 0 0.071

23.5 7.90 24.58 0 0.083

35.5 0.00 26.17 0 0.111

Figure 5.9.: Optical density values at 600 nm

Time (hr) ∆(adh,pta) ∆(adh,pta) ∆(adh,pta)-pTOG ∆(adh,pta)-pTOG

0 0.04 0.066 0.017 0.04

6 0.083 0.098 0.036 0.031

8 0.091 0.111 0.042 0.048

12 0.099 0.126 0.045 0.063

16 0.117 0.16 0.05 0.063

20 0.131 0.18 0.062 0.058

24 0.128 0.194 0.053 0.052

31 0.155 0.174 0.071 0.062

Figure 5.10.: Optical density values at 600 nm

Time (hr) ∆(adh,pta) ∆(adh,pta) ∆(adh,pta)-pTOG ∆(adh,pta)-pTOG

0 0.027 0.02 0.009 0.022

4 0.107 0.095 0.086 0.083

8 0.413 0.416 0.373 0.38

14 0.426 0.414 0.394 0.386

18 0.428 0.418 0.39 0.376

21 0.426 0.412 0.373 0.383

Figure 5.11 and 5.12.:

IPTG(+)

Time (hr) O.D. (600nm) Glucose (mM) Lactate (mM)

0 0.024 10.75 0.00

2.5 0.09 11.02 1.50

4 0.161 7.74 4.27

5.5 0.296 3.87 10.47

7 0.446 0.00 21.21

IPTG(-)

Time (hr) O.D. (600nm) Glucose (mM) Lactate (mM)

0 0.021 11.32 0.00

2.5 0.085 11.17 1.73

4 0.165 7.05 4.24

5.5 0.293 4.16 10.81

7 0.4 1.36 18.27

8.5 0.416 0.00 20.61

GFP(+)

Time (hr) O.D. (600nm) Glucose (mM) Lactate (mM)

0 0.014 10.98 0.00

2.5 0.07 10.87 1.25

4 0.11 8.59 3.12

5.5 0.197 7.50 6.73

7 0.289 3.64 12.31

8.5 0.35 0.00 18.09

Figure 5.13.:

Yeast extract

Time (hr) O.D. (600 nm) Glucose (mM) Lactate (mM)

0 0.032 11 0

7 0.404 2.34 19.61

20 0.339 0 25.37

Peptone

Time (hr) O.D. (600 nm) Glucose (mM) Lactate (mM)

0 0.027 11 0

7 0.09 10.75 4.74

20 0.265 0 25.81

Figure 5.14.: Bottles 1-3 are for induction time 0 hr; bottles 4-6 for induction time 5 hr. Values are optical densities (O.D. at 600 nm).

Time (hr) 1 2 3 4 5 6

0 0.047 0.055 0.068 0.049 0.06 0.084

2 0.055 0.056 0.061 0.048 0.059 0.063

4 0.114 0.123 0.129 0.115 0.141 0.157

6 0.176 0.213 0.234 0.184 0.239 0.284

7 0.198 0.234 0.294 0.232 0.304 0.339

8 0.213 0.248 0.313 0.258 0.327 0.356

9 0.233 0.274 0.352 0.285 0.364 0.361

10 0.237 0.288 0.336 0.296 0.346 0.333

Figure 5.15 and 5.16.: Bottles 1-3 are for induction time 0 hr; bottles 4-6 for induction time 4 hr. Values are optical densities (O.D. at 600 nm).

Time (hr) 1 2 3 4 5 6

0 0.086 0.065 0.065 0.077 0.065 0.051

2 0.092 0.097 0.095 0.091 0.097 0.08

4 0.196 0.226 0.214 0.242 0.241 0.206

6 0.274 0.33 0.328 0.379 0.384 0.288

8 0.316 0.397 0.395 0.412 0.44 0.319

9 0.305 0.429 0.418 0.418 0.454 0.352

10 0.32 0.44 0.415 0.442 0.494 0.376

11 0.343 0.449 0.439 0.47 0.476 0.378

Glucose (mM)

Time (hr) 1 2 3 4 5 6

11 4.22 1.13 1.17 0.78 0.00 3.80

10 5.12 1.98 2.46 3.06 1.57 4.79

9 9.36 3.70 2.42 2.88 3.51 3.38

Lactate (mM)

Time (hr) 1 2 3 4 5 6

11 7.26 13.25 14.32 11.86 11.68 9.30

10 7.53 11.51 11.97 11.22 13.62 9.75

9 6.88 11.45 9.10 9.31 15.28 8.42

Figure 5.17.: Optical density values at 600 nm

Time (hr) ∆(adh, pta)pTOG(gfp) ∆(adh, pta)pTOG(pta)

0 0.088 0.082

2 0.048 0.051

5 0.055 0.057

8 0.061 0.06

11 0.063 0.056

15 0.071 0.061

18 0.09 0.058

20 0.088 0.08

25.5 0.118 0.081

37 0.074 0.063

Figure 5.18.: Optical density values at 600 nm

Time (hr) pTOG(pta) pTOG(pta)+MOPS pTOG(gfp) pTOG(gfp)+MOPS

0 0.04 0.032 0.04 0.035

3.5 0.118 0.115 0.089 0.098

6.5 0.299 0.271 0.238 0.231

9.5 0.382 0.35 0.368 0.324

12.5 0.378 0.378 0.36 0.34

15.5 0.355 0.375 0.343 0.383

Figure 5.20.: Optical density values at 600 nm. Products and glucose in mM.

Wild-type

Time (hr) O.D. Glucose Formate Acetate Ethanol Lactate Succinate

0 0.064 102.17 0.00 0.30 0.00 1.36 0.00

2 0.245 91.31 5.86 2.66 3.61 0.00 0.36

3 0.364 84.58 9.02 5.43 4.79 0.88 0.61

4 0.556 68.25 15.59 9.55 8.48 1.70 1.12

5 0.726 80.52 24.53 11.81 11.95 1.66 1.93

6 1.206 73.65 32.57 20.30 14.29 2.10 2.34

7 1.56 56.71 48.06 27.47 24.75 3.86 3.60

8 2.04 51.09 59.39 34.61 27.77 5.33 4.32

9 2.412 43.04 75.75 43.42 39.93 7.21 5.44

10 2.796 32.44 85.37 48.91 45.63 9.12 6.20

11 3.28 15.19 108.42 62.19 53.73 13.67 7.84

12 3.432 5.78 113.21 65.10 59.50 16.82 8.40

13 3.58 0.00 123.78 70.34 57.48 19.61 10.22

Mutant

Time (hr) O.D. Glucose Formate Acetate EtOH Lactate Succinate

0 0.066 100.1 1.40 0.00 0.00 0.44 0.00

2 0.094 98.38 2.50 0.00 0.00 3.86 0.14

3 0.111 86.84 1.54 0.00 0.00 3.89 0.20

4 0.119 87.06 0.93 0.00 0.00 4.85 0.21

5 0.132 74.42 1.48 0.00 0.00 5.02 0.21

6 0.149 71.12 1.97 0.00 0.00 5.84 0.20

7 0.156 64.86 3.86 0.00 0.00 7.11 0.00

8 0.155 81.78 2.72 0.00 0.00 9.47 0.30

11 0.197 75.12 1.57 0.00 0.00 12.23 0.36

14 0.225 67.19 4.17 0.00 0.00 17.07 0.81

23 0.325 63.37 3.09 0.31 0.00 30.07 0.90

29 0.38 59.85 3.30 0.48 0.00 45.14 1.24

47 0.451 47.57 2.73 1.08 0.00 96.62 3.08

53 0.456 43.45 6.27 1.15 0.00 104.80 2.43

76.5 0.393 32.24 10.46 1.59 0.00 135.66 3.08

Figure 5.21.: Optical density values at 600 nm. Products and glucose in mM.

Wild-type

Time (hr) O.D. Glucose Formate Acetate EtOH Lactate Succinate

0 0.08 50 0 0 0 0 0

1.75 0.224 45.69 5.21 3.12 3.18 0.38 0.22

2.75 0.34 38.18 11.01 5.58 6.43 1.54 0.45

5 0.726 28.99 14.99 8.47 8.15 2.19 0.63

6 0.912 31.13 20.57 11.45 11.46 3.99 1.13

7 1.28 26.30 28.82 16.51 17.12 5.27 1.64

8 1.592 12.13 33.06 18.52 18.26 6.46 1.99

10 2.25 1.29 55.30 31.61 30.87 10.73 3.09

11 2.2 0.65 57.45 35.58 32.82 11.29 3.42

Mutant

Time (hr) O.D. Glucose Lactate Succinate Acetate Formate

0.5 0.08 50.00 0.54 0.00 0.00 0.87

2.5 0.111 47.66 3.42 0.00 0.00 0.94

4 0.124 48.35 4.29 0.00 0.00 0.85

6 0.154 49.91 7.89 0.09 0.00 1.10

12 0.23 46.10 18.47 0.17 0.00 2.16

21.5 0.351 31.92 47.61 0.79 0.44 3.31

23 0.387 24.33 51.68 0.77 0.20 3.69

29 0.51 9.08 74.97 1.25 0.89 3.88

31 0.546 2.54 80.37 1.34 0.84 4.51

33 0.63 1.46 96.70 1.72 1.46 5.64

35 0.644 0.68 102.15 1.81 1.35 6.97

37.25 0.618 2.51 110.30 2.04 1.27 7.85

Figure 5.22.: Optical density values at 600 nm. Products and glucose in mM.

Heat-shock only

Time (hr) O.D. Glucose Lactate Succinate Acetate Formate

0.5 0.08 49.92 0.83 0.00 0.03 0.87

2.5 0.122 48.79 3.81 0.13 0.16 0.94

8 0.146 47.86 9.30 0.13 0.18 1.89

12 0.159 46.88 14.06 0.25 0.25 2.16

21.5 0.209 35.91 26.58 0.31 0.61 3.31

31 0.256 27.25 41.60 0.64 0.82 4.51

33 0.252 27.54 46.24 0.63 0.97 5.64

35 0.262 26.59 51.84 0.93 1.01 6.97

37.25 0.267 25.31 58.01 0.76 1.03 7.85

45.5 0.325 19.88 88.72 1.64 1.63 9.39

51 0.342 5.51 83.85 1.92 1.79 10.19

55 0.394 0.20 94.54 2.12 2.34 18.06

70 0.333 0.62 105.89 2.49 2.28 6.51

IPTG first, then washing and heat shock-Replicate #1

Time (hr) O.D. Glucose Lactate Succinate Acetate

0 0.04 54.48 0.14 0.00 0.00

2 0.0935 54.83 2.32 0.14 0.00

4 0.174 50.70 7.18 0.24 0.09

6 0.253 47.24 12.39 0.41 0.13

7 0.275 45.22 15.00 0.49 0.16

10 0.248 48.32 1.09 0.00 0.00

12 0.349 44.64 8.16 0.13 0.04

14 0.413 41.07 14.89 0.17 0.10

16 0.502 35.65 25.23 0.34 0.23

18 0.628 29.43 37.04 0.35 0.25

20 0.712 21.65 51.31 0.58 0.39

22 0.724 15.38 63.31 0.82 0.44

24 0.778 9.48 74.31 1.22 0.63

25 0.802 5.95 81.03 1.29 0.67

26 0.832 2.56 87.24 1.37 0.73

27 0.836 0.80 90.60 1.45 0.76

IPTG first, then washing and heat shock-Replicate #2

Time (hr) O.D. Glucose Lactate

0 0.048 52.38 2.20

7 0.242 42.79 15.23

10 0.229 42.79 1.63

12 0.34 39.55 8.85

14 0.401 34.47 14.85

16 0.477 30.00 28.17

18 0.573 25.62 36.96

20 0.6 21.84 49.15

22 0.627 15.14 62.64

24 0.635 4.20 74.10

25 0.639 2.17 78.86

IPTG first, then washing and heat shock-Replicate #3

Time (hr) O.D. Glucose Lactate

0 0.049 52.07 1.91

7 0.278 44.20 15.18

10 0.228 44.20 1.47

12 0.327 37.84 8.68

14 0.399 33.85 13.68

16 0.472 29.50 24.54

18 0.569 26.03 35.90

20 0.612 22.11 49.10

22 0.665 15.51 62.63

24 0.672 4.54 74.07

25 0.678 2.30 78.86

Figure 5.23.: Optical density values at 600 nm. Products and glucose in mM.

Wild-type triplicates

Time (hr) O.D.#1 O.D.#2 O.D.#3 Glc.#1 Glc.#2 Glc.#3

0 0.04 0.04 0.04 52 51 52

1.5 0.068 0.06 0.1 48.02 46.67 49.14

4 0.256 0.246 0.286 46.98 44.63 44.82

6 0.494 0.388 0.47 37.94 36.69 34.90

8 0.62 0.684 0.84 31.75 30.06 27.24

10 0.76 0.732 0.84 26.22 24.88 21.49

12 0.772 0.776 0.9 24.04 21.15 19.01

Mutant triplicates

Time (hr) O.D.#1 O.D.#2 O.D.#3 Glc.#1 Glc.#2 Glc.#3

0 0.033 0.04 0.045 54.68 54.40 55.16

3 0.094 0.121 0.133 53.70 53.56 53.18

6 0.259 0.307 0.288 49.22 47.56 47.95

9 0.356 0.396 0.368 43.84 41.80 43.78

12 0.352 0.388 0.34 39.53 37.54 39.66

13 0.354 0.388 0.336 37.93 35.57 37.98

Mutant-pTOG(adh,pta) with IPTG

Time (hr) O.D.#1 O.D.#2 O.D.#3 Glc.#1 Glc.#2 Glc.#3

0 0.04 0.04 0.04 51 51 51

3 0.097 0.101 0.098 49.25 49.36 48.80

6 0.249 0.256 0.256 48.13 45.75 44.45

9 0.33 0.352 0.352 41.74 41.11 39.50

12 0.344 0.354 0.352 36.07 36.57 34.07

13 0.346 0.36 0.356 34.63 35.20 32.65

Mutant-pTOG(adh,pta) without IPTG and heat shock for 30 min at the beginning of the batch

Time (hr) O.D.#1 O.D.#2 O.D.#3 Glc.#1 Glc.#2 Glc.#3

0 0.04 0.04 0.04 51 51 51

3 0.108 0.071 0.093 49.35 49.65 50.19

6 0.266 0.186 0.227 44.80 46.92 46.69

9 0.358 0.322 0.32 38.88 42.28 42.26

12 0.38 0.334 0.334 33.11 38.02 38.37

13 0.376 0.34 0.326 31.86 36.65 37.67

Wild-type triplicates

Time (hr) Lact.#1 Lact.#2 Lact.#3

0 0 0 0

1.5 0.03 0.04 0.10

4 0.18 0.26 0.36

6 2.07 1.82 3.00

8 7.68 7.34 8.54

10 14.20 13.60 14.36

12 17.50 16.51 17.23

Mutant triplicates

Time (hr) Lact.#1 Lact.#2 Lact.#3

0 0.17 0 0

3 2.49 3.44 3.85

6 10.99 13.18 12.80

9 20.78 23.47 21.50

12 28.61 32.09 29.04

13 32.12 36.10 32.41

Mutant-pTOG(adh,pta) with IPTG

Time (hr) Lact.#1 Lact.#2 Lact.#3

0 0 0 0

3 2.45 2.77 2.66

6 10.77 11.55 11.64

9 20.38 21.10 21.88

12 29.32 30.01 30.96

13 31.19 32.07 33.22

Mutant-pTOG(adh,pta) without IPTG and heat shock for 30 min at the beginning of the batch

Time (hr) Lact.#1 Lact.#2 Lact.#3

0 0 0 0

3 3.07 2.04 2.78

6 12.78 8.76 10.79

9 23.45 18.30 19.65

12 34.35 26.88 28.02

13 36.61 28.68 29.95

Appendix F

Carbon balances of the bioreactor experiments

Closing the carbon balance of a fermentation is essential in understanding the microbial physiology. Here, we quantify the carbon in the form of substrate and products and apply the mass balance for carbon.

Glucose and yeast extract (Y.E.) are the only sources of carbon. Carbon is converted into the measured products (formate, acetate, ethanol, lactate, succinate and biomass), and the non-measured CO2. Since CO2 was not measured, the FBA prediction was used as an estimate to close the carbon balance.

First, we determine the carbon content in yeast extract. According to most manufacturers, yeast extract is approximately 70% protein (Bioshop and Acumedia). Considering the average carbon concentration in protein to be 0.53 g/g (Rouwenhorst et al., 1991), the total carbon content for 0.5 g/l of yeast extract is:

\[ 0.5\,\frac{\text{g Y.E.}}{\text{l}} \times 0.7\,\frac{\text{g protein}}{\text{g Y.E.}} \times 0.53\,\frac{\text{g C}}{\text{g protein}} = 0.19\,\frac{\text{g C}}{\text{l}} \quad \text{or 15.8 mM of C} \]

To determine the carbon content in the biomass, the commonly used conversion between optical density and g DCW/l was used (Causey et al., 2004) (1 OD600 = 0.33 g DCW/l). A common value for the carbon composition of E. coli is 45% w/w DCW. Therefore, an optical density of 1 is equivalent to:

\[ 0.33\,\frac{\text{g DCW}}{\text{l}} \times 0.45\,\frac{\text{g C}}{\text{g DCW}} = 0.15\,\frac{\text{g C}}{\text{l}} \quad \text{or 12.5 mM of C} \]
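The two conversions above can be reproduced with a short calculation. This is a sketch under stated assumptions: the reported 15.8 mM and 12.5 mM are recovered only if the intermediate g C/l value is rounded to two decimals and a carbon atomic mass of 12 g/mol is used, which appears to be how the figures were obtained. The function and variable names are illustrative.

```python
CARBON_ATOMIC_MASS = 12.0  # g/mol; assumed rounded value


def carbon_mM(grams_carbon_per_litre):
    """Convert a carbon concentration from g C/l to mM of C."""
    return grams_carbon_per_litre / CARBON_ATOMIC_MASS * 1000.0


# Yeast extract: 0.5 g/l x 0.7 g protein/g Y.E. x 0.53 g C/g protein
yeast_extract_gC = round(0.5 * 0.7 * 0.53, 2)  # 0.1855 rounded to 0.19 g C/l
yeast_extract_mM = carbon_mM(yeast_extract_gC)

# Biomass: 0.33 g DCW/l per unit of OD600 x 0.45 g C/g DCW
biomass_gC_per_od = round(0.33 * 0.45, 2)  # 0.1485 rounded to 0.15 g C/l
biomass_mM_per_od = carbon_mM(biomass_gC_per_od)

print("Yeast extract carbon: %.1f mM of C" % yeast_extract_mM)
print("Biomass carbon per unit OD600: %.1f mM of C" % biomass_mM_per_od)
```

Without the intermediate rounding, the yeast extract value comes out slightly lower (about 15.5 mM of C); the difference is small relative to the roughly 300 mM of carbon supplied as glucose.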

The carbon balance of the wild-type is shown in Table F.1. Carbon dioxide is an estimate from FBA. Assuming that yeast extract is only used as a source of amino acids, the total carbon in the system is approximately 300 mM. Considering the lack of CO2 measurements, the balance closes reasonably well (approximately 9% error).

Wild-type        mmol         mmol C    mmol          mmol C
Glucose          50           300
Yeast extract    0.5 (g/L)    15.8
Formate                                 -57.5         -57.5
Acetate                                 -35.6         -71.2
Ethanol                                 -32.8         -65.6
Lactate                                 -11.3         -33.9
Succinate                               -2.6          -10.4
Biomass                                 2.2 (O.D.)    -27.5
CO2*                                    -5.75         -5.75
Total                         300                     -271.9

Table F.1: Carbon balance for the wild-type characterization. Yeast extract is not included in the total. The estimate of carbon dioxide is from the FBA model.
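The closure of the balance in Table F.1 can be recomputed from the tabulated values. The sketch below is illustrative: the dictionary layout and variable names are assumptions, while the numbers and the number of carbon atoms per molecule are those implied by the table.

```python
# Carbon atoms per molecule for each measured product.
CARBON_ATOMS = {"formate": 1, "acetate": 2, "ethanol": 2,
                "lactate": 3, "succinate": 4}

# Wild-type products (mmol), taken from Table F.1.
products_mmol = {"formate": 57.5, "acetate": 35.6, "ethanol": 32.8,
                 "lactate": 11.3, "succinate": 2.6}

glucose_carbon = 50 * 6   # 50 mmol glucose, 6 carbons each
biomass_carbon = 27.5     # O.D. of 2.2 at 12.5 mmol C per O.D. unit
co2_carbon = 5.75         # FBA estimate, not a measurement

product_carbon = sum(CARBON_ATOMS[name] * mmol
                     for name, mmol in products_mmol.items())
total_out = product_carbon + biomass_carbon + co2_carbon

# Fraction of the glucose carbon that is not accounted for.
gap = (glucose_carbon - total_out) / glucose_carbon

print("Carbon recovered: %.2f mmol C" % total_out)
print("Unaccounted fraction: %.1f%%" % (100.0 * gap))
```

The same bookkeeping applies to the other balances in this appendix once the product list and the biomass term are replaced by the corresponding table entries.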

Next, we estimate the carbon balance of the mutant characterization (Table F.2). In this case, the model does not predict any CO2 production. The total carbon in the products is 8.8% higher than the carbon coming from glucose. This could be due either to measurement errors or to yeast extract being consumed for biomass.

∆(adh,pta)       mmol         mmol C    mmol          mmol C
Glucose          55           330
Yeast extract    0.5 (g/L)    15.8
Formate                                 -7.1          -7.1
Acetate                                 -1.3          -2.6
Lactate                                 -110.3        -330.9
Succinate                               -2.7          -10.8
Biomass                                 0.62 (O.D.)   -7.75
CO2*                                    --
Total                         330                     -359.15

Table F.2: Carbon balance for the mutant characterization (static strategy). Yeast extract is not included in the total. The estimate of carbon dioxide is from the FBA model.

Finally, for the dynamic strategy characterization, we calculate the carbon balance at the end of the second phase (Table F.3). The difference between the two values is less than 2%.

∆(adh,pta)-pTOG(adh,pta)    mmol         mmol C    mmol           mmol C
Glucose                     48.3         289.8
Yeast extract               0.5 (g/L)    15.8
Formate                                            -0.79          -0.79
Acetate                                            -1.53          -3.06
Lactate                                            -91            -273
Biomass                                            0.582 (O.D.)   -7.3
CO2*                                               --
Total                                    289.8                    -284.2

Table F.3: Carbon balance for the dynamic strategy characterization. The estimate of carbon dioxide is from the FBA model.

Appendix G

Plasmid sequences

G.1 Plasmid pTOG(adh,pta)

GCCCTAGGTCTAGGGCGGCGGATTTGTCCTACTCAGGAGAGCGTTCACCGACAAACAACAGATAAAAC GAAAGGCCCAGTCTTTCGACTGAGCCTTTCGTTTTATTTGATGCCTCTAGCACGCGTACCATGGGATC CCGGGTTAAGCTGCTAAAGCGTAGTTTTCGTCGTTTGCTGCAGCGGATTTTTTCGCTTTTTTCTCAGC TTTAGCCGGAGCAGCTTCTTTCTTCGCTGCAGTTTCACCTTCTACATAATCACGACCGTAGTAGGTAT CCAGCAGAATCTGTTTCAGCTCGGAGATCAGCGGGTAACGCGGGTTAGCGCCGGTGCACTGGTCATCG AATGCATCTTCAGACAGTTTATCCACGTTCGCCAGGAAGTCTGCTTCCTGAACGCCAGCTTCACGGAT AGATTTCGGAATACCCAGTTCAGCTTTCAGCGTTTCCAGCCATGCCAGCAGTTTCTCGATCTTAGCAG CAGTACGGTCGCCCGGTGCGCTCAGACCCAAGAGGTCGGCAATTTCAGCATAACGACGGCGAGCCTGC GGACGGTCATACTGGCTGAATGCAGTCTGCTTGGTCGGGTTGTCGTTCGCATTGTAGCGAATAACGTT ACAAATCAGCAGGGCGTTTGCCAGACCGTGCGGAATATGGAACTGGGAACCCAGTTTGTGCGCCATTG AGTGACATACACCCAGGAAGGCGTTCGCAAACGCGATACCCGCGATAGTCGCTGCACTGTGAACACGT TCACGCGCTACCGGATTTTTAGACCCTTCGTGGTAGGACGCTGGCAGATATTCTTTCAGCAGTTTCAG TGCCTGCAGAGCCTGACCATCAGAGAACTCAGATGCCAGTACAGAAACATAAGCTTCCATGGCGTGAG TTACCGCGTCCAGACCACCGAAAGCACACAGGGACTTCGGCATGTCCATAACCAGGTTGGCGTCGACA ATCGCCATATCCGGAGTCAGCGCATAGTCTGCCAGCGGATATTTCTGACCAGTAGCGTCGTCAGTTAC AACCGCAAACGGAGTGACTTCAGAACCTGTACCAGAAGTGGTGGTGACAGCGATCATTTTCGCTTTCA CGCTCATTTTCGGGAACTTGTAGATACGTTTACGGATATCCATAAAGCGCAGCGCCAGCTCTTCGAAG TGAGTTTCCGGATGTTCGTACATAACCCACATGATCTTCGCGGCGTCCATCGGGGAACCACCACCCAG CGCGATAATCACGTCTGGTTTGAAGGAGTTTGCCAGTTCTGCACCTTTACGAACGATGCTCAGGGTCG GGTCCGCTTCTACTTCGAAGAAGACTTCAGTTTCAACGCCTGCTGCTTTCAGTACGGAAGTGATCTGA TCAGCATAACCATTGTTGAACAGGAAGCGGTCAGTCACGATGAGCGCACGTTTGTGGCCATCAGTAAT CACTTCATCCAGCGCGATTGGCAGGGAGCCACGGCGGAAGTAGATAGATTTCGGAAGTTTGTGCCACA


ACATGTTTTCAGCTCGCTTAGCAACGGTTTTCTTGTTGATCAGGTGTTTCGGACCAACGTTTTCAGAG ATGGAGTTACCACCCCAAGAACCACAACCCAGAGTCAGGGAAGGTGCGAGTTTGAAGTTATACAGGTC ACCGATACCACCCTGAGACGCTGGGGTGTTAATCAGGATACGCGCCGTTTTCATTTTCTGACCGAAGT AAGAAACGCGAGCCGGTTGGTTATCCTGGTCAGTGTACAGGCAAGAGGTATGACCGATACCGCCCATA GCAACCAGTTTCTCTGCTTTTTCTACCGCGTCTTCGAAATCTTTAGCGCGGTACATTGCCAGAGTCGG GGACAGTTTTTCATGTGCGAACGGTTCGCTTTCATCAACAACGGTCACTTCACCGATCAGAATCTTGG TGTTTTCTGGTACAGAGAAGCCTGCCAGTTCAGCAATTTTATAGGCTGGCTGACCAACGATAGCCGCG TTCAGCGCACCGTTTTTCAGGATAACATCCTGAACAGCTTTCAGCTCTTTACCCTGCAACAGATAGCC GCCGTGGGTTGCAAAACGTTCACGTACAGCGTCATAAACAGAGTCAACAACAACAACAGACTGTTCAG AAGCACAGATTACGCCGTCGTCGGAGGTTTTGGACATCAGTACAGATGCAACTGCACGTTTGATATCA GCAGTTTCATCGATAACAACTGGAGTGTTGCCCGCGCCTACACCGATAGCTGGTTTACCGGAGCTGTA TGCGGCTTTAACCATGCCCGGACCACCAGTCGCGAGGATCAGGTTGGTGTCTGGGTGGTGCATCAGTG CGTTAGACAGTTCAACAGAAGGTTGATCGATCCAGCCGACCAGATCTTTCGGAGCACCGGCAGCGATA GCAGCCTGCAGAACGATATCAGCCGCTTTGTTGGTGGCATCTTTTGCACGCGGGTGCGGGGAGAAGAT AATGGCGTTACGGGTCTTCAGACTGATCAGCGATTTGAAGATAGCAGTTGAAGTCGGGTTAGTGGTCG GAACGATACCGCAAATAATACCGATTGGTTCAGCGATAGTGATGGTACCAAAAGTGTCGTCTTCAGAC AGAACACCACAGGTTTTTTCATCTTTATAGGCGTTGTAGATATATTCAGAAGCAAAGTGGTTTTTGAT CACTTTATCTTCGACGATACCCATGCCGGATTCGGCAACGGCCATTTTCGCGAGTGGGATTCGAGCAT CTGCAGCAGCCAGAGCGGCGGCGCGGAAGATTTTGTCTACTTGCTCTTGAGTGAAACTGGCATATTCA CGCTGGGCTTTTTTTACACGCTCTACGAGTGCGTTAAGTTCAGCGACATTAGTAACAGCCATTTAATT AATTTCTCCTCTTTAATGGCGCGCGCTAGTTAAGCTGCTAAAGCGTAGTTTTCGTCGTTTGCTGCCTG CTGCTGTGCAGGCTGAATCGCAGTCAGCGCGATGGTGTAGACGATATCGTCAACCAGTGCGCCACGGG ACAGGTCGTTAACCGGCTTGCGCATACCCTGCAGCATCGGCCCGATGGAGATCAGGTCGGCAGAACGC TGTACCGCTTTGTAGGTGGTGTTACCGGTGTTCAGATCCGGGAAGATGAACACGGTAGCGCGACCTGC AACCGGAGAGTTCGGCGCTTTGGATTTCGCAACGTCAGCCATTACCGCAGCGTCGTACTGCAGCGGAC CGTCGATCATCAGGTCAGGACGTTTTTCCTGCGCCAGACGAGTTGCTTCGCGAACTTTTTCTACGTCG CTACCTGCACCAGAAGTACCGGTGGAGTAGGAGAGCATAGCAACGCGCGGCTCGATACCGAAGGCCGC AGCGGAATCAGCGGACTGAATCGCGATTTCTGCCAGCTGTTCAGCGGTCGGATCCGGGTTGATCGCAC 
AGTCACCGTAAACGTAAACCTGTTCCGGCAGCAGCATGAAGAACACGGAAGATACCAGGGAGCTGCCC GGTGCAGTTTTGATCAGCTGCAGCGGCGGACGGATGGTGTTTGCGGTAGTGTGAACAGCACCGGAAA CCAGACCATCAACTTCATCCTGTTCCAGCATCAGCGTACCGAGCACCACGTTGTCTTCCAGCTGTTCGC GGGCAACGGTTTCGGTCATGCCTTTGTTCTTACGCAGTTCGACCAGACGACCAACATAGCTTTCGCGA ACCACTTCTGGATCAACGATTTCAATCCCTGCACCCAGTTCTACACCCTGAGACGCTGCAACACGGTT GATCTCTGCCGGATTACCCAGCAGTACGCAAGTTGCGATACCACGTTCAGCACAGATAGCGGCTGCTT TAACGGTACGCGGTTCGTCACCTTCCGGCAGTACGATACGTTTGCCCGCTTTGCACGCAAGTTCAGTC AGCTGATAACGGAACGCAGGCGGAGACAGACGACGGCTGCGCTCAGAAGTGGCAGTCAGAGATTCGA TCCAGTCAGCGTTGATGTAGTTAGCAACGTATTCCTGAACTTTCTCGATACGTTCGTGATCGTCAACC GGAACTTCCAGGTTGAAGCTCTGCAGGCTCAGAGAGGTCTGCCAGGTGTTGGTGTTCACCATAAATAC CGGCAGGCCGGTAGCGAAAGCACGTTCGCACAGTTTAGAAATGCGCGCGTCCATTTCGTAACCGCCAG Appendix G. Plasmid sequences 246

TCAGCAGCAGGGCACCGATTTCTACGCCGTTCATGGCTGCCAGGCAAGCGGCCACCAGCACGTCAGGA CGGTCTGCGGAAGTCACCAGCAGAGAACCGGCACGGAAGTGCTCCAGCATGTGCGGAATGCTGCGTG CGCAGAAAGTGACGGATTTAACGCGGCGAGTATTGATGTCGCCTTCGTTGATGATGGTCGCATTCAGG TGGCGAGCCATATCGATCGCACGAGTCGCGATCAGGTCAAAGCTCCACGGCACAGCGCCGAGAACCGG CAGCGGGCTGGATTCTTGCAGCTTCGCCGGATCAACATTGTTTACTTTAGCTTTGGAAGAGTCGTCGA AAATCTCGGACAGATCCGGGCGAGTACGACCCTGTTCATCAACCGGTGCGTTCAGTTTGTTAACGATA ACGCCGGTGATGTTGGTGTTTTTGGCACCGCCGAAGCTGTTGCGGGTCAGTTCGATACGCTCTTTCAG CTGTTCCGGGGTGTCAGTGCCCTGAGACATAACGAAGACGATTTCCGCATTCAGCGTTTTAGCGATTT CGTAGTTCAGAGACTGGGCAAACTGGTGCTTACGTGTCGGGACCAGACCTTCAACCAGAACGACTTCA GCGTCTTTGGTGTTAGCGTGGTAGTTTGCGACGATCTCTTCCATCAGCACATCTTTCTGATTGCTGGA AAGCAGACCTTCAACGTAGCTCATTTTCAGCGGTTCAGCGGCCGTCGTGGTGGAAGAGTTCGCACGCA CGATAGTCGTAGTCTGATCGGGCGCATCGCCACCGGTACGCGGCTGAGCGATAGGTTTGAAAACGCTC AGACGAACGCCTTTGCGTTCCATTGCACGGATCACGCCAAGGCTGACGCTGGTCAGACCGACGCTGGT TCCGGTAGGGATCAGCATAATAATACGGGACACTTAATTAATTTCTCCTCTTTAATGGCGCGCCTCAC TGCCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGCTGCATTAATGAATCGGCCAACGCGCGGGGAGA GGCGGTTTGCGTATTGGGCGCCAGGGTGGTTTTTCTTTTCACCAGTGAGACGGGCAACAGCTGATTGC CCTTCACCGCCTGGCCCTGAGAGAGTTGCAGCAAGCGGTCCACGCTGGTTTGCCCCAGCAGGCGAAAA TCCTGTTTGATGGTGGTTAACGGCGGGATATAACATGAGCTGTCTTCGGTATCGTCGTATCCCACTAC CGAGATATCCGCACCAACGCGCAGCCCGGACTCGGTAATGGCGCGCATTGCGCCCAGCGCCATCTGAT CGTTGGCAACCAGCATCGCAGTGGGAACGATGCCCTCATTCAGCATTTGCATGGTTTGTTGAAAACCG GACATGGCACTCCAGTCGCCTTCCCGTTCCGCTATCGGCTGAATTTGATTGCGAGTGAGATATTTATG CCAGCCAGCCAGACGCAGACGCGCCGAGACAGAACTTAATGGGCCCGCTAACAGCGCGATTTGCTGGT GACCCAATGCGACCAGATGCTCCACGCCCAGTCGCGTACCGTCTTCATGGGAGAAAATAATACTGTTG ATGGGTGTCTGGTCAGAGACATCAAGAAATAACGCCGGAACATTAGTGCAGGCAGCTTCCACAGCAAT GGCATCCTGGTCATCCAGCGGATAGTTAATGATCAGCCCACTGACGCGTTGCGCGAGAAGATTGTGCA CCGCCGCTTTACAGGCTTCGACGCCGCTTCGTTCTACCATCGACACCACCACGCTGGCACCCAGTTGA TCGGCGCGAGATTTAATCGCCGCGACAATTTGCGACGGCGCGTGCAGGGCCAGACTGGAGGTGGCAA CGCCAATCAGCAACGACTGTTTGCCCGCCAGTTGTTGTGCCACGCGGTTGGGAATGTAATTCAGCTCC GCCATCGCCGCTTCCACTTTTTCCCGCGTTTTCGCAGAAACGTGGCTGGCCTGGTTCACCACGCGGGA 
AACGGTCTGATAAGAGACACCGGCATACTCTGCGACATCGTATAACGTTACTGGTTTCATGACGTCCA TCCGGCCGTCCTTTGCATACACCATAGGTGTGGTTTAATTTGATGCCCTTTTTCAGGGCTGGAATGTG TAAGAGCGGGGTTATTTATGCTGTTGTTTTTTTGTTACTCGGGAAGGGCTTTACCTCTTCCGCATAAA CGCTTCCATCAGCGTTTATAGTTAAAAAAATCTTTCGGAACTGGTTTTGCGCTTACCCCAACCAACAG GGGATTTGCTGCTTTCCATTGAGCCTGTTTCTCTGCGCGACGTTCGCGGCGGCGTGTTTGTGCATCCA TCTGGATTCTCCTGTCAGTTAGCTTTGGTGGTGTGTGGCAGTTGTAGTCCTGAACGAAAACCCCCCGC GATTGGCACATTGGCAGCTAATCCGGAATCGCACTTACGGCCAATGCTTCGTTTCGTATCACACACCC CAAAGCCTTCTGCTTTGAATGCTGCCCTTCTTCAGGGCTTAATTTTTAAGAGCGTCACCTTCATGGTG GTCAGTGCGTCCTGCTGATGTGCTCAGTATCACCGCCAGTGGTATTTATGTCAACACCGCCAGAGATA ATTTATCACCGCAGATGGTTATCTGTGCATGCATTTACGTTGACACCATCGAATGGCTGAAATGAGCT Appendix G. Plasmid sequences 247

GTTGACAATTAATCATCCGGCTCGTATAATGTGTGGAATTGTGAGCGGATAACAATTTCACACAGGAA ACCGGTAATGAGCACAAAAAGAAACCATTAACACAAGAGCAGCTTGAGGACGCACGTCGCCTTAAAGC AATTTATGAAAAAAAGAAAAATGAACTTGGCTTATCCCAGGAATCTGTCGCAGACAAGATGGGGATGG GGCAGTCAGGCGTTGGTGCTTTATTTAATGGCATCAATGCATTAAATGCTTATAACGCCGCATTGCTT ACAAAAATTCTCAAAGTTAGCGTTGAAGAATTTAGCCCTTCAATCGCCAGAGAAATCTACGAGATGTA TGAAGCGGTTAGTATGCAGCCGTCACTTAGAAGTGAGTATGAGTACCCTGTTTTTTCTCATGTTCAGG CAGGGATGTTCTCACCTAAGCTTAGAACCTTTACCAAAGGTGATGCGGAGAGATGGGTAAGCACAACC AAAAAAGCCAGTGATTCTGCATTCTGGCTTGAGGTTGAAGGTAATTCCATGACCGCACCAACAGGCTC CAAGCCAAGCTTTCCTGACGGAATGTTAATTCTCGTTGACCCTGAGCAGGCTGTTGAGCCAGGTGATT TCTGCATAGCCAGACTTGGGGGTGATGAGTTTACCTTCAAGAAACTGATCAGGGATAGCGGTCAGGTG TTTTTACAACCACTAAACCCACAGTACCCAATGATCCCATGCAATGAGAGTTGTTCCGTTGTGGGGAA AGTTATCGCTAGTCAGTGGCCTGAAGAGACGTTTGGCTGACTGCAGCATAAATAACCCCGCTCTTACA CATTCCAGCCCTGAAAAAGGGCATCAAATTAAACCACACCTATGGTGTATGCAAAGGAATTTAAATGG GTACCATGGCCTCCTCCGAGAACGTCATCACCGAGTTCATGCGCTTCAAGGTGCGCATGGAGGGCACC GTGAACGGCCACGAGTTCGAGATCGAGGGCGAGGGCGAGGGCCGCCCCTACGAGGGCCACAACACCG TGAAGCTGAAGGTGACCAAGGGCGGCCCCCTGCCCTTCGCCTGGGACATCCTGTCCCCCCAGTTCCAG TACGGCTCCAAGGTGTACGTGAAGCACCCCGCCGACATCCCCGACTACAAGAAGCTGTCCTTCCCCGA GGGCTTCAAGTGGGAGCGCGTGATGAACTTCGAGGACGGCGGCGTGGCGACCGTGACCCAGGACTCC TCCCTGCAGGACGGCTGCTTCATCTACAAGGTGAAGTTCATCGGCGTGAACTTCCCCTCCGACGGCCC CGTGATGCAGAAGAAGACCATGGGCTGGGAGGCCTCCACCGAGCGCCTGTACCCCCGCGACGGCGTGC TGAAGGGCGAGACCCACAAGGCCCTGAAGCTGAAGGACGGCGGCCACTACCTGGTGGAGTTCAAGTC TATCTACATGGCCAAGAAGCCCGTGCAGCTGCCCGGCTACTACTACGTGGACGCCAAGCTGGACATCA CCTCCCACAACGAGGACTACACCATCGTGGAGCAGTACGAGCGCACCGAGGGCCGCCACCACCTGTTC CTGTAGCGGCCGCGACTCTAGATCATAATCAGCCATACCACATTTGTAGAGGTTTTACTTGCTTTAAA AAACCTCCCACACCTCCCCCTGAACCTGAAACATAAAATGAATGCAATTGTTGTTGTTAACTTGTTTAT TGCAGCTTATAATGGTTACAAATAAAGCAATAGCATCACAAATTTCACAAATAAAGCATTTTTTTCAC TGCGCCGGCCCTAGCTTGGCTGTTTTGGCGGATGAGAGAAGATTTTCAGCCTGATACAGATTAAATCA GAACGCAGAAGCGGTCTGATAAAACAGAATTTGCCTGGCGGCAGTAGCGCGGTGGTCCCACCTGACCC CATGCCGAACTCAGAAGTGAAACGCCGTAGCGCCGATGGTAGTGTGGGGTCTCCCCATGCGAGAGTAG 
GGAACTGCCAGGCATCAAATAAAACGAAAGGCTCAGTCGAAAGACTGGGCCTTTCGTTTTATCTGTTG TTTGTCGGTGAACGCTCTCCTGAGTAGGACAAATCCGCCGGGAGCGGATTTGAACGTTGCGAAGCAAC GGCCCGGAGGGTGGCGGGCAGGACGCCCGCCATAAACTGCCAGGCATCAAATTAAGCAGAAGGCCAT CCTGACGGATGGCCTTTTTGYGTTTCTACAAACTCTTTTGTTTATTTTTCTAAAKACWTTCWAATATG TATCCGCTCATGASACAATAACCCTGATAAATGCTTCAATAATATTGAAAAAGGAAGAGTRTGAGTAT TCRACAKTTCCGTGTCGCCCTTATTCCCTTTTTTGCGGCATTTTGCCTTCCTGTTTTTGCTCACCCAGA ARCGCTGGTGAAAGTARAAGATGCTGAAGATCAGTTGGGTGCACGAGTGGGTTACATCAGCTAACCGC TTTTTTGCACAACATGGGGGATCATGTAACTCGCCTTGATCGTTGGGAACCGGAGCTGAATGAAGCCA TACCAAACGACGAGCGTGACACCACGATGCCTGTAGCAATGGCAACAACGTTGCGCAAACTATTAACT GGCGAACTACTTACTCTAGCTTCCCGGCAACAATTAATAGACTGGATGGAGGCGGATAAAGTTGCAGG Appendix G. Plasmid sequences 248

ACCACTTCTGCGCTCGGCCCTTCCGGCTGGCTGGTTTATTGCTGATAAATCTGGAGCCGGTGAGCGTG GGTCTCGCGGTATCATTGCAGCACTGGGGCCAGATGGTAAGCCCTCCCGTATCGTAGTTATCTACACG ACGGGGAGTCAGGCAACTATGGATGAACGAAATAGACAGATCGCTGAGATAGGTGCCTCACTGATTAA GCATTGGTAACTGTCAGACCAAGTTTACGAGCTCGCTTGGACTCCTGTTGATAGATCCAGTAATGACC TCAGAACTCCATCTGGATTTGTTCAGAACGCTCGGTTGCCGCCGGGCGTTTTTTATTGGTGAGAATCC AAGCACTAGTCATGACCAAAATCCCTTAACGTGAGTTTTCGTTCCACTGAGCGTCAGACCCCGTAGAA AAGATCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAATCTGCTGCTTGCAAACAAAAAAACC ACCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGAGCTACCAACTCTTTTTCCGAAGGTAACTGGCT TCAGCAGAGCGCAGATACCAAATACTGTCCTTCTAGTGTAGCCGTAGTTAGGCCACCACTTCAAGAAC TCTGTAGCACCGCCTACATACCTCGCTCTGCTAATCCTGTTACCAGTGGCTGCTGCCAGTGGCGATAA GTCGTGTCTTACCGGGTTGGACTCAAGACGATAGTTACCGGATAAGGCGCAGCGGTCGGGCTGAACG GGGGGTTCGTGCACACAGCCCAGCTTGGAGCGAACGACCTACACCGAACTGAGATACCTACAGCGTGA GCTATGAGAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCCGGTAAGCGGCAGGGTC GGAACAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAAACGCCTGGTATCTTTATAGTCCTGTCGGGT TTCGCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGGGGGGCGGAGCCTATGGAAAAAC GCCAGCAACGCGGCCTTTTTACGGTTCCTGGCCTTTTGCTGGCCTTTTGCTCACATGTTCTTTCCTGCG TTATCCCCTGATTCTGTGGATAACCGTATTACCGCCTTTGAGTGAGCTGATACCGCTCGCCGCAGCCG AACGCCCTAGGTCTAGGGCGGCGGATTTGTCCTACTCAGGAGAGCGTTCACCGACAAACAACAGATAA AACGAAAGGCCCAGTCTTTCGACTGAGCCTTTCGTTTTATTTGATGCCTCTAGCACGCGTACCATGGG ATCCCGGGTTAAGCTGCTAAAGCGTAGTTTTCGTCGTTTGCTGCAGCGGATTTTTTCGCTTTTTTCTC AGCTTTAGCCGGAGCAGCTTCTTTCTTCGCTGCAGTTTCACCTTCTACATAATCACGACCGTAGTAGG TATCCAGCAGAATCTGTTTCAGCTCGGAGATCAGCGGGTAACGCGGGTTAGCGCCGGTGCACTGGTCA TCGAATGCATCTTCAGACAGTTTATCCACGTTCGCCAGGAAGTCTGCTCC

G.2 Plasmid pTAK132

ATCAGGCTGAAAATCTTCTCTCATCCGCCAAAACAGCCAAGCTTATAAGGCGCGCCTCACTGCCCGCT TTCCAGTCGGGAAACCTGTCGTGCCAGCTGCATTAATGAATCGGCCAACGCGCGGGGAGAGGCGGTTT GCGTATTGGGCGCCAGGGTGGTTTTTCTTTTCACCAGTGAGACGGGCAACAGCTGATTGCCCTTCACC GCCTGGCCCTGAGAGAGTTGCAGCAAGCGGTCCACGCTGGTTTGCCCCAGCAGGCGAAAATCCTGTTT GATGGTGGTTAACGGCGGGATATAACATGAGCTGTCTTCGGTATCGTCGTATCCCACTACCGAGATAT CCGCACCAACGCGCAGCCCGGACTCGGTAATGGCGCGCATTGCGCCCAGCGCCATCTGATCGTTGGCA ACCAGCATCGCAGTGGGAACGATGCCCTCATTCAGCATTTGCATGGTTTGTTGAAAACCGGACATGGC ACTCCAGTCGCCTTCCCGTTCCGCTATCGGCTGAATTTGATTGCGAGTGAGATATTTATGCCAGCCAG CCAGACGCAGACGCGCCGAGACAGAACTTAATGGGCCCGCTAACAGCGCGATTTGCTGGTGACCCAAT GCGACCAGATGCTCCACGCCCAGTCGCGTACCGTCTTCATGGGAGAAAATAATACTGTTGATGGGTGT CTGGTCAGAGACATCAAGAAATAACGCCGGAACATTAGTGCAGGCAGCTTCCACAGCAATGGCATCCT GGTCATCCAGCGGATAGTTAATGATCAGCCCACTGACGCGTTGCGCGAGAAGATTGTGCACCGCCGCT TTACAGGCTTCGACGCCGCTTCGTTCTACCATCGACACCACCACGCTGGCACCCAGTTGATCGGCGCG Appendix G. Plasmid sequences 249

AGATTTAATCGCCGCGACAATTTGCGACGGCGCGTGCAGGGCCAGACTGGAGGTGGCAACGCCAATCA GCAACGACTGTTTGCCCGCCAGTTGTTGTGCCACGCGGTTGGGAATGTAATTCAGCTCCGCCATCGCC GCTTCCACTTTTTCCCGCGTTTTCGCAGAAACGTGGCTGGCCTGGTTCACCACGCGGGAAACGGTCTG ATAAGAGACACCGGCATACTCTGCGACATCGTATAACGTTACTGGTTTCATGACGTCCATCCGGCCGT CCTTTGCATACACCATAGGTGTGGTTTAATTTGATGCCCTTTTTCAGGGCTGGAATGTGTAAGAGCGG GGTTATTTATGCTGTTGTTTTTTTGTTACTCGGGAAGGGCTTTACCTCTTCCGCATAAACGCTTCCATC AGCGTTTATAGTTAAAAAAATCTTTCGGAACTGGTTTTGCGCTTACCCCAACCAACAGGGGATTTGCT GCTTTCCATTGAGCCTGTTTCTCTGCGCGACGTTCGCGGCGGCGTGTTTGTGCATCCATCTGGATTCT CCTGTCAGTTAGCTTTGGTGGTGTGTGGCAGTTGTAGTCCTGAACGAAAACCCCCCGCGATTGGCACA TTGGCAGCTAATCCGGAATCGCACTTACGGCCAATGCTTCGTTTCGTATCACACACCCCAAAGCCTTCT GCTTTGAATGCTGCCCTTCTTCAGGGCTTAATTTTTAAGAGCGTCACCTTCATGGTGGTCAGTGCGTC CTGCTGATGTGCTCAGTATCACCGCCAGTGGTATTTATGTCAACACCGCCAGAGATAATTTATCACCG CAGATGGTTATCTGTGCATGCATTTACGTTGACACCATCGAATGGCTGAAATGAGCTGTTGACAATTA ATCATCCGGCTCGTATAATGTGTGGAATTGTGAGCGGATAACAATTTCACACAGGAAACCGGTAATGA GCACAAAAAGAAACCATTAACACAAGAGCAGCTTGAGGACGCACGTCGCCTTAAAGCAATTTATGAAA AAAAGAAAAATGAACTTGGCTTATCCCAGGAATCTGTCGCAGACAAGATGGGGATGGGGCAGTCAGG CGTTGGTGCTTTATTTAATGGCATCAATGCATTAAATGCTTATAACGCCGCATTGCTTACAAAAATTC TCAAAGTTAGCGTTGAAGAATTTAGCCCTTCAATCGCCAGAGAAATCTACGAGATGTATGAAGCGGTT AGTATGCAGCCGTCACTTAGAAGTGAGTATGAGTACCCTGTTTTTTCTCATGTTCAGGCAGGGATGTT CTCACCTAAGCTTAGAACCTTTACCAAAGGTGATGCGGAGAGATGGGTAAGCACAACCAAAAAAGCCA GTGATTCTGCATTCTGGCTTGAGGTTGAAGGTAATTCCATGACCGCACCAACAGGCTCCAAGCCAAGC TTTCCTGACGGAATGTTAATTCTCGTTGACCCTGAGCAGGCTGTTGAGCCAGGTGATTTCTGCATAGC CAGACTTGGGGGTGATGAGTTTACCTTCAAGAAACTGATCAGGGATAGCGGTCAGGTGTTTTTACAAC CACTAAACCCACAGTACCCAATGATCCCATGCAATGAGAGTTGTTCCGTTGTGGGGAAAGTTATCGCT AGTCAGTGGCCTGAAGAGACGTTTGGCTGACTGCAGCATAAATAACCCCGCTCTTACACATTCCAGCC CTGAAAAAGGGCATCAAATTAAACCACACCTATGGTGTATGCAAAGGAATTTAAATGGGTACCATGAG TAAAGGAGAAGAACTTTTCACTGGAGTTGTCCCAATTCTTGTTGAATTAGATGGTGATGTTAATGGGC ACAAATTTTCTGTCAGTGGAGAGGGTGAAGGTGATGCAACATACGGAAAACTTACCCTTAAATTTATT 
TGCACTACTGGAAAACTACCTGTTCCATGGCCAACACTTGTCACTACTTTCGGTTATGGTGTTCAATG CTTTGCGAGATACCCAGATCATATGAAACAGCATGACTTTTTCAAGAGTGCCATGCCCGAAGGTTATG TACAGGAAAGAACTATATTTTTCAAAGATGACGGGAACTACAAGACACGTGCTGAAGTCAAGTTTGAA GGTGATACCCTTGTTAATAGAATCGAGTTAAAAGGTATTGATTTTAAAGAAGATGGAAACATTCTTGG ACACAAATTGGAATACAACTATAACTCACACAATGTATACATCATGGCAGACAAACAAAAGAATGGAA TCAAAGTTAACTTCAAAATTAGACACAACATTGAAGATGGAAGCGTTCAACTAGCAGACCATTATCAA CAAAATACTCCAATTGGCGATGGCCCTGTCCTTTTACCAGACAACCATTACCTGTCCACACAATCTGCC CTTTCGAAAGATCCCAACGAAAAGAGAGACCACATGGTCCTTCTTGAGTTTGTAACAGCTGCTGGGAT TACACATGGCATGGATGAGCTCTACAAATAAAAGCTAGCTTGGCTGTTTTGGCGGATGAGAGAAGATT TTCAGCCTGATACAGATTAAATCAGAACGCAGAAGCGGTCTGATAAAACAGAATTTGCCTGGCGGCAG TAGCGCGGTGGTCCCACCTGACCCCATGCCGAACTCAGAAGTGAAACGCCGTAGCGCCGATGGTAGTG Appendix G. Plasmid sequences 250

TGGGGTCTCCCCATGCGAGAGTAGGGAACTGCCAGGCATCAAATAAAACGAAAGGCTCAGTCGAAAGA CTGGGCCTTTCGTTTTATCTGTTGTTTGTCGGTGAACGCTCTCCTGAGTAGGACAAATCCGCCGGGAG CGGATTTGAACGTTGCGAAGCAACGGCCCGGAGGGTGGCGGGCAGGACGCCCGCCATAAACTGCCAG GCATCAAATTAAGCAGAAGGCCATCCTGACGGATGGCCTTTTTGCGTTTCTACAAACTCTTTTGTTTA TTTTTCTAAATACATTCAAATATGTATCCGCTCATGAGACAATAACCCTGATAAATGCTTCAATAATAT TGAAAAAGGAAGAGTATGAGTATTCAACATTTCCGTGTCGCCCTTATTCCCTTTTTTGCGGCATTTTG CCTTCCTGTTTTTGCTCACCCAGAAACGCTGGTGAAAGTAAAAGATGCTGAAGATCAGTTGGGTGCAC GAGTGGGTTACATCGAACTGGATCTCAACAGCGGTAAGATCCTTGAGAGTTTTCGCCCCGAAGAACGT TTTCCAATGATGAGCACTTTTAAAGTTCTGCTATGTGGCGCGGTATTATCCCGTGTTGACGCCGGGCA AGAGCAACTCGGTCGCSGCATACACTATTCTCAGAATGACTTGGTKGAGTACTCACCAGTCACAGAAA AGCATCTTACGGATGGCATGACAGTAAGAGAATTATGCAGTGCTGCCATAACCATGAGTGATAACACT GCGGCCAACTTACTTCTGACAACGATCGGAGGACCGAAGGAGCTAACCGCTTTTTTGCACAACATGGG GGATCATGTAACTCGCCTTGATCGTTGGGAACCGGAGCTGAATGAAGCCATACCAAACGACGAGCGTG ACACCACGATGCCTGTAGCAATGGCAACAACGTTGCGCAAACTATTAACTGGCGAACTACTTACTCTA GCTTCCCGGCAACAATTAATAGACTGGATGGAGGCGGATAAAGTTGCAGGACCACTTCTGCGCTCGGC CCTTCCGGCTGGCTGGTTTATTGCTGATAAATCTGGAGCCGGTGAGCGTGGGTCTCGCGGTATCATTG CAGCACTGGGGCCAGATGGTAAGCCCTCCCGTATCGTAGTTATCTACACGACGGGGAGTCAGGCAACT ATGGATGAACGAAATAGACAGATCGCTGAGATAGGTGCCTCACTGATTAAGCATTGGTAACTGTCAGA CCAAGTTTACTCATATATACTTTAGATTGATTTAAAACTTCATTTTTAATTTAAAAGGATCTAGGTGAA GATCCTTTTTGATAATCTCATGACCAAAATCCCTTAACGTGAGTTTTCGTTCCACTGAGCGTCAGACCC CGTAGAAAAGATCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAATCTGCTGCTTGCAAACAA AAAAACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGAGCTACCAACTCTTTTTCCGAAGGTA ACTGGCTTCAGCAGAGCGCAGATACCAAATACTGTCCTTCTAGTGTAGCCGTAGTTAGGCCACCACTT CAAGAACTCTGTAGCACCGCCTACATACCTCGCTCTGCTAATCCTGTTACCAGTGGCTGCTGCCAGTG GCGATAAGTCGTGTCTTACCGGGTTGGACTCAAGACGATAGTTACCGGATAAGGCGCAGCGGTCGGGC TGAACGGGGGGTTCGTGCACACAGCCCAGCTTGGAGCGAACGACCTACACCGAACTGAGATACCTACA GCGTGAGCTATGAGAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCCGGTAAGCGGC AGGGTCGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAAACGCCTGGTATCTTTATAGTCCTG 
TCGGGTTTCGCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGGGGGGCGGAGCCTATGG AAAAACGCCAGCAACGCGGCCTTTTTACGGTTCCTGGCCTTTTGCTGGCCTTTTGCTCACATGTTCTTT CCTGCGTTATCCCCTGATTCTGTGGATAACCGTATTACCGCCTTTGAGTGAGCTGATACCGCTCGCCG CAGCCGAACGACCGAGCGCAGCGAGTCAGTGAGCGAGGAAGCGGAAGAGCGAGTTTGTAGAAACGCA AAAAGGCCATCCGTCAGGATGGCCTTCTGCTTAATTTGATGCCTGGCAGTTTATGGCGGGCGTCCTGC CCGCCACCCTCCGGGCCGTTGCTTCGCAACGTTCAAATCCGCTCCCGGCGGATTTGTCCTACTCAGGA GAGCGTTCACCGACAAACAACAGATAAAACGAAAGGCCCAGTCTTTCGACTGAGCCTTTCGTTTTATT TGATGCCTGGCAGTTCCCTACTCTCGCATGGGGAGACCCCACACTACCATCGGCGCTACGGCGTTTCA CTTCTGAGTTCGGCATGGGGTCAGGTGGGACCACCGCGCTACTGCCGCCAGGCAAATTCTGTTTTATC AGACCGCTTCTGCGTTCTGATTTAATCTGT Appendix G. Plasmid sequences 251

G.3 Plasmid pTAK131

CCATCGAATGGCTGAAATGAGCTGTTGACAATTAATCATCCGGCTCGTATAATGTGTGGAATTGTGAG CGGATAACAATTTCACACAGGAAACCGGTAATGAGCACAAAAAGAAACCATTAACACAAGAGCAGCTT GAGGACGCACGTCGCCTTAAAGCAATTTATGAAAAAAAGAAAAATGAACTTGGCTTATCCCAGGAATC TGTCGCAGACAAGATGGGGATGGGGCAGTCAGGCGTTGGTGCTTTATTTAATGGCATCAATGCATTAA ATGCTTATAACGCCGCATTGCTTACAAAAATTCTCAAAGTTAGCGTTGAAGAATTTAGCCCTTCAATC GCCAGAGAAATCTACGAGATGTATGAAGCGGTTAGTATGCAGCCGTCACTTAGAAGTGAGTATGAGTA CCCTGTTTTTTCTCATGTTCAGGCAGGGATGTTCTCACCTAAGCTTAGAACCTTTACCAAAGGTGATG CGGAGAGATGGGTAAGCACAACCAAAAAAGCCAGTGATTCTGCATTCTGGCTTGAGGTTGAAGGTAAT TCCATGACCGCACCAACAGGCTCCAAGCCAAGCTTTCCTGACGGAATGTTAATTCTCGTTGACCCTGA GCAGGCTGTTGAGCCAGGTGATTTCTGCATAGCCAGACTTGGGGGTGATGAGTTTACCTTCAAGAAAC TGATCAGGGATAGCGGTCAGGTGTTTTTACAACCACTAAACCCACAGTACCCAATGATCCCATGCAAT GAGAGTTGTTCCGTTGTGGGGAAAGTTATCGCTAGTCAGTGGCCTGAAGAGACGTTTGGCTGACTGCA GCATAAATAACCCCGCTCTTACACATTCCAGCCCTGAAAAAGGGCATCAAATTAAACCACACCTATGG TGTATGCAAAGGAATTTAAATGGGTACCATGGTGAGCAAGGGCGAGGAGCTGTTCACCGGGGTGGTG CCCATCCTGGTCGAGCTGGACGGCGACGTARACGGCCACAAGTTCAGCGTGTCCGGCGAGGGCGAGG GCGATGCCRCCTACGGCAAGCTGACCCTGAAGTTCATYTGCACCACCGGCRAGCTGCCCGTGCCCTGG CCCACCCTCGTGACCACCCTGACCTGSGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGAAGCAG CACGACTTCTTCAAGTCCGCCATGCCYGAAGGCTACGTCCAGGAGCGCACCATCTTCTTCAAGGACGA CGGCAACTAYRAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACYGCATCKAGCTGA AGGGCATCGACTTCAAGGAGGACGGCAACATCCTGSGGCACRAGCTGGAGTACAACTACATCAGCCAC AACGTCTATATCACCGCCGACAAGCAGAAGAACGGCATCAAGGCCAACTTCAAGATCCKCCACAACAT CGAGGACGGCAGCGTGCAGCTCGCCGACCACTACCAGCAGAACACCCCCATCGGCGACGGCCCCGTGC TGCTGCCCGACAACCACTACCTGAGCACCCAGTCCGCCCTGAGCAAAGACCCCAACGAGAAGCGCGAT CACATGGTCCTGCTGGAGTTCGTGACCGCCGCCGGGATCACTCTCGGCATGGACGAGCTGTACAAGTA AGCTAGCTTGGCTGTTTTGGCGGATGAGAGAAGATTTTCAGCCTGATACAGATTAAATCAGAACGCAG AAGCGGTCTGATAAAACAGAATTTGCCTGGCGGCAGTAGCGCGGTGGTCCCACCTGACCCCATGCCGA ACTCAGAAGTGAAACGCCGTAGCGCCGATGGTAGTGTGGGGTCTCCCCATGCGAGAGTAGGGAACTG CCAGGCATCAAATAAAACGAAAGGCTCAGTCGAAAGACTGGGCCTTTCGTTTTATCTGTTGTTTGTCG GTGAACGCTCTCCTGAGTAGGACAAATCCGCCGGGAGCGGATTTGAACGTTGCGAAGCAACGGCCCGG 
AGGGTGGCGGGCAGGACGCCCGCCATAAACTGCCAGGCATCAAATTAAGCAGAAGGCCATCCTGACGG ATGGCCTTTTTGCGTTTCTACAAACTCTTTTGTTTATTTTTCTAAATACATTCAAATATGTATCCGCTC ATGAGACAATAACCCTGATAAATGCTTCAATAATATTGAAAAAGGAAGAGTATGAGTATTCAACATTT CCGTGTCGCCCTTATTCCCTTTTTTGCGGCATTTTGCCTTCCTGTTTTTGCTCACCCAGAAACGCTGGT GAAAGTAAAAGATGCTGAAGATCAGTTGGGTGCACGAGTGGGTTACATCGAACTGGATCTCAACAGCG GTAAGATCCTTGAGAGTTTTCGCCCCGAAGAACGTTTTCCAATGATGAGCACTTTTAAAGTTCTGCTA TGTGGCGCGGTATTATCCCGTGTTGACGCCGGGCAAGAGCAACTCGGTCGCCGCATACACTATTCTCA GAATGACTTGGTTGAGTACTCACCAGTCACAGAAAAGCATCTTACGGATGGCATGACAGTAAGAGAAT Appendix G. Plasmid sequences 252

TATGCAGTGCTGCCATAACCATGAGTGATAACACTGCGGCCAACTTACTTCTGACAACGATCGGAGGA CCGAAGGAGCTAACCGCTTTTTTGCACAACATGGGGGATCATGTAACTCGCCTTGATCGTTGGGAACC GGAGCTGAATGAAGCCATACCAAACGACGAGCGTGACACCACGATGCCTGTAGCAATGGCAACAACGT TGCGCAAACTATTAACTGGCGAACTACTTACTCTAGCTTCCCGGCAACAATTAATAGACTGGATGGAG GCGGATAAAGTTGCAGGACCACTTCTGCGCTCGGCCCTTCCGGCTGGCTGGTTTATTGCTGATAAATC TGGAGCCGGTGAGCGTGGGTCTCGCGGTATCATTGCAGCACTGGGGCCAGATGGTAAGCCCTCCCGTA TCGTAGTTATCTACACGACGGGGAGTCAGGCAACTATGGATGAACGAAATAGACAGATCGCTGAGATA GGTGCCTCACTGATTAAGCATTGGTAACTGTCAGACCAAGTTTACTCATATATACTTTAGATTGATTT AAAACTTCATTTTTAATTTAAAAGGATCTAGGTGAAGATCCTTTTTGATAATCTCATGACCAAAATCCC TTAACGTGAGTTTTCGTTCCACTGAGCGTCAGACCCCGTAGAAAAGATCAAAGGATCTTCTTGAGATC CTTTTTTTCTGCGCGTAATCTGCTGCTTGCAAACAAAAAAACCACCGCTACCAGCGGTGGTTTGTTTG CCGGATCAAGAGCTACCAACTCTTTTTCCGAAGGTAACTGGCTTCAGCAGAGCGCAGATACCAAATAC TGTCCTTCTAGTGTAGCCGTAGTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATACCTCG CTCTGCTAATCCTGTTACCAGTGGCTGCTGCCAGTGGCGATAAGTCGTGTCTTACCGGGTTGGACTCA AGACGATAGTTACCGGATAAGGCGCAGCGGTCGGGCTGAACGGGGGGTTCGTGCACACAGCCCAGCT TGGAGCGAACGACCTACACCGAACTGAGATACCTACAGCGTGAGCTATGAGAAAGCGCCACGCTTCCC GAAGGGAGAAAGGCGGACAGGTATCCGGTAAGCGGCAGGGTCGGAACAGGAGAGCGCACGAGGGAGC TTCCAGGGGGAAACGCCTGGTATCTTTATAGTCCTGTCGGGTTTCGCCACCTCTGACTTGAGCGTCGA TTTTTGTGATGCTCGTCAGGGGGGCGGAGCCTATGGAAAAACGCCAGCAACGCGGCCTTTTTACGGTT CCTGGCCTTTTGCTGGCCTTTTGCTCACATGTTCTTTCCTGCGTTATCCCCTGATTCTGTGGATAACCG TATTACCGCCTTTGAGTGAGCTGATACCGCTCGCCGCAGCCGAACGACCGAGCGCAGCGAGTCAGTGA GCGAGGAAGCGGAAGAGCGAGTTTGTAGAAACGCAAAAAGGCCATCCGTCAGGATGGCCTTCTGCTT AATTTGATGCCTGGCAGTTTATGGCGGGCGTCCTGCCCGCCACCCTCCGGGCCGTTGCTTCGCAACGT TCAAATCCGCTCCCGGCGGATTTGTCCTACTCAGGAGAGCGTTCACCGACAAACAACAGATAAAACGA AAGGCCCAGTCTTTCGACTGAGCCTTTCGTTTTATTTGATGCCTGGCAGTTCCCTACTCTCGCATGGG GAGACCCCACACTACCATCGGCGCTACGGCGTTTCACTTCTGAGTTCGGCATGGGGTCAGGTGGGACC ACCGCGCTACTGCCGCCAGGCAAATTCTGTTTTATCAGACCGCTTCTGCGTTCTGATTTAATCTGTATC AGGCTGAAAATCTTCTCTCATCCGCCAAAACAGCCAAGCTTTTACTTGTACAGCTCGTCCATGCCGAG 
AGTGATCCCGGCGGCGGTCACGAACTCCAGCAGGACCATGTGATCGCGCTTCTCGTTGGGGTCTTTGC TCAGGGCGGACTGGTAGCTCAGGTAGTGGTTGTCGGGCAGCAGCACGGGGCCGTCGCCGATGGGGGT GTTCTGCTGGTAGTGGTCGGCGAGCTGCACGCTGCCGTCCTCGATGTTGTGGCGGATCTTGAAGTTCA CCTTGATGCCGTTCTTCTGCTTGTCGGCCATGATATAGACGTTGTGGCTGTTGTAGTTGTACTCCAGC TTGTGCCCCAGGATGTTGCCGTCCTCCTTGAAGTCGATGCCCTTCAGCTCGATGCGGTTCACCAGGGT GTCGCCCTCGAACTTCACCTCGGCGCGGGTCTTGTAGTTGCCGTCGTCCTTGAAGAAGATGGTGCGCT CCTGGACGTAGCCTTCGGGCATGGCGGACTTGAAGAAGTCGTGCTGCTTCATGTGGTCGGGGTAGCG GGCGAAGCACTGCAGGCCGTAGCCGAAGGTGGTCACGAGGGTGGGCCAGGGCACGGGCAGCTTGCCG GTGGTGCAGATGAACTTCAGGGTCAGCTTGCCGTAGGTGGCATCGCCCTCGCCCTCGCCGGACACGCT GAACTTGTGGCCGTTTACGTCGCCGTCCAGCTCGACCAGGATGGGCACCACCCCGGTGAACAGCTCCT CGCCCTTGCTCACCATGGTACCTTTCTCCTCTTTAATGGCGCGCCTCACTGCCCGCTTTCCAGTCGGGA Appendix G. Plasmid sequences 253

AACCTGTCGTGCCAGCTGCATTAATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTGCGTATTGGGC GCCAGGGTGGTTTTTCTTTTCACCAGTGAGACGGGCAACAGCTGATTGCCCTTCACCGCCTGGCCCTG AGAGAGTTGCAGCAAGCGGTCCACGCTGGTTTGCCCCAGCAGGCGAAAATCCTGTTTGATGGTGGTTA ACGGCGGGATATAACATGAGCTGTCTTCGGTATCGTCGTATCCCACTACCGAGATATCCGCACCAACG CGCAGCCCGGACTCGGTAATGGCGCGCATTGCGCCCAGCGCCATCTGATCGTTGGCAACCAGCATCGC AGTGGGAACGATGCCCTCATTCAGCATTTGCATGGTTTGTTGAAAACCGGACATGGCACTCCAGTCGC CTTCCCGTTCCGCTATCGGCTGAATTTGATTGCGAGTGAGATATTTATGCCAGCCAGCCAGACGCAGA CGCGCCGAGACAGAACTTAATGGGCCCGCTAACAGCGCGATTTGCTGGTGACCCAATGCGACCAGATG CTCCACGCCCAGTCGCGTACCGTCTTCATGGGAGAAAATAATACTGTTGATGGGTGTCTGGTCAGAGA CATCAAGAAATAACGCCGGAACATTAGTGCAGGCAGCTTCCACAGCAATGGCATCCTGGTCATCCAGC GGATAGTTAATGATCAGCCCACTGACGCGTTGCGCGAGAAGATTGTGCACCGCCGCTTTACAGGCTTC GACGCCGCTTCGTTCTACCATCGACACCACCACGCTGGCACCCAGTTGATCGGCGCGAGATTTAATCG CCGCGACAATTTGCGACGGCGCGTGCAGGGCCAGACTGGAGGTGGCAACGCCAATCAGCAACGACTGT TTGCCCGCCAGTTGTTGTGCCACGCGGTTGGGAATGTAATTCAGCTCCGCCATCGCCGCTTCCACTTT TTCCCGCGTTTTCGCAGAAACGTGGCTGGCCTGGTTCACCACGCGGGAAACGGTCTGATAAGAGACAC CGGCATACTCTGCGACATCGTATAACGTTACTGGTTTCATGACGTCCATCGAACCGTCCTTTGCATAC ACCATAGGTGTGGTTTAATTTGATGCCCTTTTTCAGGGCTGGAATGTGTAAGAGCGGGGTTATTTATG CTGTTGTTTTTTTGTTACTCGGGAAGGGCTTTACCTCTTCCGCATAAACGCTTCCATCAGCGTTTATA GTTAAAAAAATCTTTCGGAACTGGTTTTGCGCTTACCCCAACCAACAGGGGATTTGCTGCTTTCCATT GAGCCTGTTTCTCTGCGCGACGTTCGCGGCGGCGTGTTTGTGCATCCATCTGGATTCTCCTGTCAGTT AGCTTTGGTGGTGTGTGGCAGTTGTAGTCCTGAACGAAAACCCCCCGCGATTGGCACATTGGCAGCTA ATCCGGAATCGCACTTACGGCCAATGCTTCGTTTCGTATCACACACCCCAAAGCCTTCTGCTTTGAATG CTGCCCTTCTTCAGGGCTTAATTTTTAAGAGCGTCACCTTCATGGTGGTCAGTGCGTCCTGCTGATGT GCTCAGTATCACCGCCAGTGGTATTTATGTCAACACCGCCAGAGATAATTTATCACCGCAGATGGTTA TCTGTGCATGCATTTACGTTGACA Appendix H

Time profiles of triplicate experiments

[Plot: optical density (600 nm) versus time (h), 0 to 15 h; legend: Wild-type, Mutant, IPTG(+), IPTG(−).]

Figure S.5: Optical density (600 nm) for wild-type (square), mutant ∆(adh,pta) (circle), mutant ∆(adh,pta)-pTOG(adh,pta) with IPTG (triangle), and mutant ∆(adh,pta)-pTOG(adh,pta) without IPTG (diamond).


[Plot: glucose concentration (mM) versus time (h), 0 to 15 h; legend: Wild-type, Mutant, IPTG(+), IPTG(−).]

Figure S.6: Glucose concentration for wild-type (square), mutant ∆(adh,pta) (circle), mutant ∆(adh,pta)-pTOG(adh,pta) with IPTG (triangle), and mutant ∆(adh,pta)-pTOG(adh,pta) without IPTG (diamond).

[Plot: lactate concentration (mM) versus time (h), 0 to 15 h; legend: Wild-type, Mutant, IPTG(+), IPTG(−).]

Figure S.7: Lactate concentration for wild-type (square), mutant ∆(adh,pta) (circle), mutant ∆(adh,pta)-pTOG(adh,pta) with IPTG (triangle), and mutant ∆(adh,pta)-pTOG(adh,pta) without IPTG (diamond).

[Plot: acetate concentration (mM) versus time (h), 0 to 15 h; legend: Wild-type, Mutant, IPTG(+), IPTG(−).]

Figure S.8: Acetate concentration for wild-type (square), mutant ∆(adh,pta) (circle), mutant ∆(adh,pta)-pTOG(adh,pta) with IPTG (triangle), and mutant ∆(adh,pta)-pTOG(adh,pta) without IPTG (diamond).

[Plot: ethanol concentration (mM) versus time (h), 0 to 15 h; legend: Wild-type, Mutant, IPTG(+), IPTG(−).]

Figure S.9: Ethanol concentration for wild-type (square), mutant ∆(adh,pta) (circle), mutant ∆(adh,pta)-pTOG(adh,pta) with IPTG (triangle), and mutant ∆(adh,pta)-pTOG(adh,pta) without IPTG (diamond).

[Plot: formate concentration (mM) versus time (h), 0 to 15 h; legend: Wild-type, Mutant, IPTG(+), IPTG(−).]

Figure S.10: Formate concentration for wild-type (square), mutant ∆(adh,pta) (circle), mutant ∆(adh,pta)-pTOG(adh,pta) with IPTG (triangle), and mutant ∆(adh,pta)-pTOG(adh,pta) without IPTG (diamond).

[Plot: succinate concentration (mM) versus time (h), 0 to 15 h; legend: Wild-type, Mutant, IPTG(+), IPTG(−).]

Figure S.11: Succinate concentration for wild-type (square), mutant ∆(adh,pta) (circle), mutant ∆(adh,pta)-pTOG(adh,pta) with IPTG (triangle), and mutant ∆(adh,pta)-pTOG(adh,pta) without IPTG (diamond).