Design, Construction and Characterization of Dynamic Genetic Circuits in Bacteria

A Thesis Presented

by

BORIS KIROV

Submitted to the

Ecole doctorale des Génomes Aux Organismes

of the

University of Evry Val-d’Essonne

in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

JANUARY 2014

Acknowledgements

I would like to thank all the people who helped me complete this research. I express my sincere gratitude to:

• all the members of the jury, who agreed to help me take the final step of a long journey
• my supervisor Alfonso Jaramillo, for taking me on this journey
• Jeff Hasty and his group, for the transfer of all aspects of microfluidics technology and for their constant support
• Mohammed Atari, Claudiu Giuraniuc, Fabio Cancare' and Jangir Selimkhanov, for their help with the development of the image processing scripts
• Octavio Mondragón-Palomino, Fabrice Monti and Ivan Razinkov, for teaching me microfluidics fabrication
• all my colleagues, or better, all my friends from iSSB, for the great environment, for the work we did and for the fun we had together

Finally, I would like to thank all my close friends for their constant support, understanding, patience and love. I could never have come this far without Kiril, Charles, Jirair, Father Emilian and my loving sister Vesela.

Abstract

Engineering of synthetic genetic devices capable of controlling different aspects of cellular physiology in a predefined manner and with precise timing is regarded as crucial for modern bioengineering and synthetic biology. The task of designing and constructing parts for synthetic biology is not simple and must meet a number of requirements. The parts used for the construction of genetic circuits should be modular, well-characterized, well-behaved and robust to changes in the environment. They should be insulated from cross-talk with the environment and be resilient to mutations. Finally, they should be properly modeled based on parameters derived from single-cell level experiments. In my thesis I studied in detail the general requirements for the engineering of individual parts such as promoters, ribosome binding sites and transcription factors, and of some important types of devices. Furthermore, I established a complete platform for the single-cell level characterization of engineered genetic devices. All the hardware and know-how required for the fabrication of microfluidics devices capable of sustaining bacterial growth were acquired. The whole process, from the design of microfluidics devices with the intended functionality to their fabrication and use in microbial experiments, was successfully developed. An efficient image processing tool for distributed computational analysis of the data acquired during the microscopy experiments was also developed. The experimental results showed that the engineered genetic devices behaved according to theoretical expectations. Furthermore, the established experimental procedures, fabrication process and automated data analysis proved to be efficient and well adapted to the task of single-cell characterization of engineered bacteria.

Table of contents

ACKNOWLEDGEMENTS
ABSTRACT
TABLE OF CONTENTS
INDEX OF ILLUSTRATIONS
INDEX OF TABLES
I. OVERVIEW
II. ENGINEERING OF GENETIC PARTS
    II.1. INTRODUCTION
    II.2. PROMOTER ENGINEERING
    II.3. SYNTHETIC PROMOTERS
    II.4. RIBOSOME BINDING SITE
    II.5. TRANSCRIPTION FACTORS
    II.6. FLUORESCENCE REPORTER PROTEINS
    II.7. PROTEIN DEGRADATION TAGS
    II.8. TRANSCRIPTIONAL TERMINATORS
    II.9. EXPRESSION VECTORS
    II.10. CONSTRUCTION OF GENETIC CIRCUITS
    II.11. GENETIC PARTS ENGINEERED FOR THIS RESEARCH
        II.11.1. Synthetic promoters (APPENDIX A)
        II.11.2. XOR gate promoters
    II.12. CONCLUSION
    II.13. REFERENCES
III. ENGINEERING OF GENETIC OSCILLATORS
    III.1. INTRODUCTION
    III.2. MATHEMATICAL MODELING
    III.3. SYNTHETIC GENETIC OSCILLATORS
    III.4. GENETIC OSCILLATORS ENGINEERED FOR THIS RESEARCH
        III.4.1. Goodwin-type oscillators
        III.4.2. Double genetic oscillators
        III.4.3. Oscillatory copy number plasmid
        III.4.4. Phage-communication-based oscillator
    III.5. CONCLUSION
    III.6. REFERENCES
IV. ENGINEERING OF MICROFLUIDICS DEVICES
    IV.1. INTRODUCTION
    IV.2. MICROFLUIDICS DEVICES DESIGN
        IV.2.1. Growth chambers
        IV.2.2. system
    IV.3. MICROFLUIDICS DEVICE FABRICATION
        IV.3.1. Photomask printing
        IV.3.2. Wafer spin-coating
        IV.3.3. Wafer UV exposure and development
        IV.3.4. Multilevel devices fabrication
        IV.3.5. PDMS structures stamping
        IV.3.6. Final devices bonding
    IV.4. SOFT LITHOGRAPHY SETUP DEVELOPED FOR THIS RESEARCH
    IV.5. CONCLUSION
    IV.6. REFERENCES
V. MICROSCOPY AND IMAGE PROCESSING
    V.1. MICROSCOPY
    V.2. IMAGE PROCESSING
        V.2.1. Introduction
        V.2.2. Image pre-processing
        V.2.3. Frame-to-frame object matching
        V.2.4. Single-cell fluorescence level tracking
    V.3. IMAGE PROCESSING ALGORITHMS
        V.3.1. Image processing algorithms developed in collaboration
        V.3.2. Image processing algorithm developed during this research
    V.4. CONCLUSION
    V.5. REFERENCES
VI. CHARACTERIZATION OF SYNTHETIC GENETIC PARTS AND DEVICES
    VI.1. INTRODUCTION
    VI.2. POPULATION-LEVEL CHARACTERIZATION
    VI.3. SINGLE-CELL-LEVEL CHARACTERIZATION
        VI.3.1. Experimental setup
        VI.3.2. Characterization examples
    VI.4. CONCLUSION
    VI.5. REFERENCES
VII. CONCLUSION
APPENDIX A
    PROMOTERS SEQUENCES
    XOR DEVICES
APPENDIX B
    IMAGE PROCESSING SCRIPTS
    FLUOROMETER PROCESSING SCRIPTS
    GENETIC PARTS AND DEVICES MODELS
    CHARACTERIZATION

Index of illustrations

Illustration 1. The engineering cycle for synthetic genetic devices
Illustration 2. The flow of biochemical information in living organisms
Illustration 3. Structure of the consensus σ70 promoter
Illustration 4. Repression mechanisms for the Plac promoter
Illustration 5. Examples of synthetic promoters design
Illustration 6. The effect of homologous recombination over the sequences of some promoters
Illustration 7. XOR gate design
Illustration 8. Schematic representation of the design for the construction of XOR gate
Illustration 9. Generalized scheme of an oscillator
Illustration 10. The Goodwin oscillator
Illustration 11. Assumption underlying the Hill formalism
Illustration 12. Positive-negative-feedback oscillator
Illustration 13. Rules for simplification of logics architectures
Illustration 14. Oscillations of the KaiC protein in cyanobacteria
Illustration 15. The repressilator circuit design
Illustration 16. Stricker et al. positive-negative-feedback oscillator
Illustration 17. Pattern formation circuit
Illustration 18. Oscillator with synchronization through quorum sensing
Illustration 19. General MATLAB model for dynamic genetic circuits
Illustration 20. Goodwin-type oscillators
Illustration 21. Characterization of a Goodwin-type oscillator
Illustration 22. Combinations of two different promoters engineered in the same cell
Illustration 23. Two oscillators in the same cell
Illustration 24. Oscillatory copy number
Illustration 25. Phage-communication-based synchronous oscillator
Illustration 26. E. coli colony growing inside a rectangular chamber
Illustration 27. Wang et al. microfluidics device
Illustration 28. Microfluidics device for the growth of yeast cell in a single line
Illustration 29. Microfluidics device for 2D growth of E. coli
Illustration 30. Shapes of channels to be avoided in microfluidics devices
Illustration 31. Finite element modeling simulation of the fluid flow
Illustration 32. Passive microfluidics switch device
Illustration 33. Passive microfluidics switching device with two inputs
Illustration 34. Delay-line device used as a mixer in microfluidics
Illustration 35. Design file sent to a photoplotting company
Illustration 36. Individual stages of the photolithography process
Illustration 37. Alignment features utilized to facilitate fabrication
Illustration 38. Design examples avoiding the need of a mask aligner
Illustration 39. Low-cost photolithography setup
Illustration 40. Watershed segmentation technique
Illustration 41. SinCePro web interface
Illustration 42. Image pre-processing steps
Illustration 43. Results from the gradual erosion of laterally merged cells
Illustration 44. Effect of parallelization over the image processing speed
Illustration 45. XOR characterization results
Illustration 46. Characterization results from a Goodwin-type Ptet/lac-TetR oscillator
Illustration 47. Characterization of the uncoupled system of two oscillators
Illustration 48. Characterization of oscillators by external forcing and periodograms
Illustration 49. Lissajous figures for the two types of double oscillators
Illustration 50. Conjugation event in microfluidics device

Index of tables

Table 1. Standard registry of biological parts vector plasmids
Table 2. Design limitations for different types of growth chambers used for cultivation of E. coli
Table 3. Values of the diffusion coefficients for some substances
Table 4. UV filters equipped in the TECAN500 fluorometer apparatus

I. Overview

Engineering of synthetic genetic devices capable of controlling different aspects of cellular physiology in a predefined manner and with precise timing is regarded as crucial for modern bioengineering and synthetic biology. There have already been some major breakthroughs in the field, accomplished through the engineering of a number of basic genetic devices such as toggle switches, oscillators, different types of logic gates, amplifiers and inverters. However, the final goal of engineering biological functions with a reliability comparable to what has been achieved in, for example, electronics is still far off.

The task of designing and constructing parts for synthetic biology is not simple and must meet three major types of requirements. First are the engineering requirements, which arise from the fact that the parts used in synthetic biology, and their assembly, should make it possible to generate final genetic circuits with defined and reliable behavior. To provide such functionality for the final circuits, their building blocks should be modular, well-characterized, well-behaved and robust to changes in the environment. The second group of requirements addresses the problems connected with the constant and multilateral crosstalk of synthetic-biology parts with the intracellular environment. An additional aspect of the interaction with the cellular environment is the contextual dependence of the outcome, which makes it significantly harder to keep the engineered genetic parts intact once they are transformed into a living cell. Finally, there are the fundamental issues with the mathematical modeling of biochemical processes that involve such small numbers of molecules and depend on so many stochastic events.

In this thesis the general requirements for the engineering of individual parts such as promoters, ribosome binding sites and transcription factors, and of some important types of devices (e.g. oscillators), were studied. The major theoretical considerations for the use and combination of such parts were derived. Based on those, the successful construction and characterization of synthetic promoters, genetic oscillators and logic gates is reported.

Illustration 1: The engineering cycle for synthetic genetic devices. In this thesis we covered the theory of genetic oscillators, the engineering of synthetic promoters and genetic oscillators, characterization in microfluidics devices and fluorescence microscopy and image processing.

Furthermore, a complete platform for the single-cell level characterization of engineered genetic devices was established. All the hardware and know-how required for the fabrication of microfluidics devices capable of sustaining bacterial growth were acquired. The whole process, from the design of microfluidics devices with the intended functionality to their fabrication and use in microbial experiments, was successfully developed. Those experiments allowed for long time-lapse fluorescence microscopy observation of bacteria growing in exponential phase in single layers. An efficient image processing tool for distributed computational analysis of the data acquired during the microscopy experiments was also developed. In this way, the characterization of the engineered genetic devices at the single-cell level was attained.

Overall, in this thesis the complete engineering cycle for synthetic genetic circuits was elaborated and exemplified with actual devices (Illustration 1). The cycle is covered from theoretical design, through in vivo implementation, to microscopy characterization at the single-cell level. For that purpose, an actual fabrication platform and a dedicated image-processing tool were also produced. The individual stages are analyzed theoretically in separate chapters and are then exemplified by cornerstone research from other groups and by our own accomplishments.

II. Engineering of genetic parts

II.1. Introduction

The design and construction of parts for synthetic biology need to meet three major types of requirements. First are the engineering requirements, which arise from the fact that the parts used in synthetic biology, and their assembly, should make it possible to generate final genetic circuits with defined and reliable behavior (Slusarczyk, Lin, & Weiss, 2012). To provide such functionality for the final circuits, their building blocks should be modular, well-characterized, well-behaved and robust to changes in the environment. To use the electronics analogy, the resistors, amplifiers, transistors, etc. that are combined to create an electronic circuit should have known parameters, a known response to inputs and a reliable function that remains stable in different circuit combinations.

The second group of requirements addresses the problems connected with the constant and multilateral crosstalk of synthetic-biology parts with the intracellular environment (Nandagopal & Elowitz, 2011). Within living cells everything is connected to everything else, directly or indirectly. Even if we manage to create a completely orthogonal genetic circuit, with all information molecules independent of the similar molecules in the cell, the energy used to drive the flow of this information would come from the same pool as the energy used by the cell to perform all of its functions. In our homes, the only reason why electrical appliances continue to work simultaneously without apparent distortions in their function is the huge oversupply of electricity. However, if we plug a large consumer such as an electric heater into the electrical system, its effect becomes immediately apparent in the quality of the light produced by the bulbs. In much the same way, the insertion of a circuit that drives the overproduction of a certain protein in a living cell affects the availability of energy and material for all the processes inside the cell. An additional aspect of the interaction with the cellular environment is the contextual dependence of the outcome (Randall, Guye, Gupta, Duportet, & Weiss, 2011). Since we use bacteria for the characterization of our synthetic parts and devices, we are constantly affected by the growth competition between the individual cells in the same colony. The “winner takes it all” phenomenon results in strong selective pressure towards the silencing of any genetic circuit that does not provide a direct competitive advantage. This is a source of a significant problem: keeping the original parts intact once they are transformed into a living cell.

Finally, there are the fundamental issues with the mathematical modeling of biochemical processes that involve such small numbers of molecules and depend on so many stochastic events (Ozbudak, 2004). For some of our circuits we could not have precise models of behavior even in theory. This is because the level of modeling is very low: we try to model the exact behavior of the carriers of information inside our circuits, i.e. the information molecules and their mediators. A very similar situation would arise in electronics if we tried to model the exact behavior of the electrons generating the electrical current and to predict the exact number of electrons in a given state at a given moment. This would literally amount to trying to violate the principle of indeterminacy. Therefore, our capacity to obtain precise parameters for genetic parts and circuits is limited.
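
The effect of small molecule numbers can be illustrated with a minimal sketch of a stochastic simulation (Gillespie algorithm) of a single species that is produced at a constant rate and degraded in proportion to its copy number. This is not one of the models used in this thesis; the function name, the rate constants k_prod and k_deg and their values are assumptions chosen only for illustration.

```python
import random

def gillespie_birth_death(k_prod, k_deg, n0, t_end, seed=0):
    """Simulate constant production (rate k_prod) and first-order
    degradation (rate k_deg * n) of one molecular species.
    Returns the event times and the copy number after each event."""
    rng = random.Random(seed)
    t, n = 0.0, n0
    times, counts = [t], [n]
    while True:
        a_prod = k_prod          # propensity of a production event
        a_deg = k_deg * n        # propensity of a degradation event
        a_total = a_prod + a_deg
        if a_total == 0.0:
            break                # no reaction can fire any more
        t += rng.expovariate(a_total)  # waiting time to the next event
        if t > t_end:
            break
        # pick the event in proportion to its propensity
        if rng.random() * a_total < a_prod:
            n += 1
        else:
            n -= 1
        times.append(t)
        counts.append(n)
    return times, counts

if __name__ == "__main__":
    # Hypothetical parameters; the deterministic steady state is k_prod / k_deg = 20
    for seed in (1, 2, 3):
        _, counts = gillespie_birth_death(2.0, 0.1, 0, 200.0, seed=seed)
        print(f"seed {seed}: final copy number = {counts[-1]}")
```

Runs started from identical conditions but with different random seeds generally end at different copy numbers scattered around the deterministic value, which is why only statistical statements can be made about such systems.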

However, even if exact quantification of the behavior of the devices used in synthetic biology is unattainable, comparing different parts immersed in the same environment can provide at least a qualitative understanding of the differences between them. Therefore, the use of standard parts that allow for modular recombination and have plug-and-play behavior is a must (Randall et al., 2011). In this way, many different parts of the same class can be compared to a standard part under standard conditions, which yields a method for characterization. This is exactly how the existing libraries of biological parts of the same type (promoters and ribosome binding sites) are characterized (Salis, Mirsky, & Voigt, 2009).

The genetic circuits in living organisms are based on the general flow of information from DNA to proteins (Illustration 2). Each of the individual conversion steps, i.e. transcription, translation, folding and protein activation, is susceptible to control by different effectors. In bacteria the transcription of DNA into RNA is performed by the RNA polymerase (RNAP) complex and is controlled by the sequence of the promoter and by some upstream and downstream DNA sequence elements. Next, the translation of mRNA into a peptide chain is performed by a ribosome and is controlled by the sequence of the ribosome binding site (RBS), by sequences capable of forming secondary structures in the mRNA, and by some small RNAs (sRNAs). Finally, the folding of the peptides into protein monomers, the multimerization of the proteins and their activation depend on certain enzymes and specific ligands. To construct synthetic genetic circuits we rely on exactly those processes. We combine different elements such as promoters, RBSs, sRNAs and proteins that interact with each other in a defined manner in order to generate a given function.
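
As a minimal sketch of how the two main control points (transcription and translation) enter a quantitative description, the flow from DNA to protein can be written as two ordinary differential equations, dm/dt = alpha - delta_m*m for the mRNA and dp/dt = beta*m - delta_p*p for the protein. This textbook two-stage model and all parameter names and values below are assumptions made for illustration, not parameters taken from this thesis.

```python
def simulate_expression(alpha, beta, delta_m, delta_p, t_end, dt=0.01):
    """Forward-Euler integration of
        dm/dt = alpha - delta_m * m      (transcription, mRNA decay)
        dp/dt = beta * m - delta_p * p   (translation, protein decay/dilution)
    starting from m = p = 0. Returns the final (m, p)."""
    m, p = 0.0, 0.0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        dm = alpha - delta_m * m
        dp = beta * m - delta_p * p
        m += dm * dt
        p += dp * dt
    return m, p

if __name__ == "__main__":
    # Hypothetical rate constants in arbitrary units
    alpha, beta, delta_m, delta_p = 2.0, 5.0, 0.2, 0.05
    m, p = simulate_expression(alpha, beta, delta_m, delta_p, t_end=400.0)
    print(f"simulated:  m = {m:.2f}, p = {p:.1f}")
    print(f"analytical: m = {alpha / delta_m:.2f}, "
          f"p = {alpha * beta / (delta_m * delta_p):.1f}")
```

In this picture a promoter or a transcriptional regulator changes alpha, while an RBS or an sRNA changes beta; the steady-state protein level alpha*beta/(delta_m*delta_p) depends on both.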

Illustration 2: The flow of biochemical information in living organisms. We are mostly interested in the direct flow from DNA to proteins (left to right) and particularly in regulation of direct transcription and of translation. This regulation could be accomplished through regulatory proteins or through the sequence of specific stretches of DNA or RNA.

In my research I engineered and used different synthetic promoters, together with their control by regulatory proteins, in order to generate the intended dynamics of novel genetic circuits. Therefore, the parts needed for the expression of proteins and for its regulation will be discussed in detail, namely promoters, RBSs, proteins and terminators. Additionally, the vectors for the expression of those parts and their properties will be examined. Finally, the assembly methods that we used will also be described.

II.2. Promoter engineering

The promoter sequence of E. coli has been very well studied and has a relatively simple organization. The function of this sequence is, first, to provide an attachment site for the RNAP and, second, to allow for the separation (melting) of the two strands of DNA, so that the polymerization of RNA can proceed on the non-coding strand. The RNAP itself consists of several subunits and a specificity factor called σ. It is the σ factor that is responsible for the recognition of the promoter sequence and the initial binding of the polymerase to it. There are several types of σ factors in E. coli, each responsible for the expression of proteins under specific conditions (Gruber & Gross, 2003). The factor that controls gene expression under standard growth conditions in exponential phase is σ70. The sequence of the promoters controlled by σ70 has been thoroughly studied and a consensus promoter sequence has been derived (Hook-Barnard & Hinton, 2007). This sequence consists of two major elements and several that are considered secondary; however, all of them have a certain role in promoting gene expression (Illustration 3). The two major sequences in the consensus promoter are the -35 and the -10. The names are derived from the position that the centers of those elements usually have with respect to the transcription start site, which is regarded as +1. Both of those sequences have been shown to interact directly with the specificity factor. The UP elements, on the other hand, are in direct contact with the α subunit of the polymerase and provide an efficient position for the regulatory proteins that require such interactions. The “TG” element at -15 and -14 is usually not mentioned, but it also contacts the polymerase directly and seems to be capable of recovering part of the activity of promoters with compromised -35 or -10 sequences. Finally, the spacers between the -35 and -10 and between the -10 and +1 work best with lengths of 17 bp and 7 bp, respectively. The sequences of the spacers are also of some importance for the proper functioning of the promoter. Consequently, even though some of the positions in the promoter sequence have a known function, it is the entire sequence that provides for the proper functioning of the promoter as a DNA stretch for RNAP binding and the subsequent direction of the polymerization process. Therefore, the promoter structure is not always modular, and a random combination of -35, -10 and/or other elements does not necessarily produce a functioning genetic element.
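
The arrangement described above (a -35 hexamer, a 17 bp spacer and a -10 hexamer) can be turned into a simple sequence scan. The sketch below is only illustrative: the hexamers TTGACA and TATAAT are the widely cited σ70 consensus sequences, which are not spelled out in the text above, and the function names, the scoring scheme and the threshold are assumptions. As stressed above, a high score does not guarantee a functional promoter.

```python
CONSENSUS_35 = "TTGACA"  # widely cited sigma70 -35 consensus (assumption, not from the text)
CONSENSUS_10 = "TATAAT"  # widely cited sigma70 -10 consensus (assumption, not from the text)
SPACER = 17              # optimal -35/-10 spacer length stated in the text

def matches(site, consensus):
    """Number of positions at which site agrees with consensus."""
    return sum(a == b for a, b in zip(site, consensus))

def scan_promoters(seq, spacer=SPACER, min_score=9):
    """Return (start, -35 site, -10 site, score) for every window whose two
    hexamers match the consensus in at least min_score of 12 positions."""
    seq = seq.upper()
    window = 6 + spacer + 6
    hits = []
    for i in range(len(seq) - window + 1):
        box35 = seq[i:i + 6]
        box10 = seq[i + 6 + spacer:i + window]
        score = matches(box35, CONSENSUS_35) + matches(box10, CONSENSUS_10)
        if score >= min_score:
            hits.append((i, box35, box10, score))
    return hits

if __name__ == "__main__":
    # Hypothetical test sequence: consensus hexamers around an arbitrary 17 bp spacer
    demo = "GGC" + "TTGACA" + "ACGTACGTACGTACGTA" + "TATAAT" + "GCATGCA"
    for hit in scan_promoters(demo):
        print(hit)
```

Calling the same function with a different spacer argument shows how strongly such a naive score depends on the spacing between the two elements.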

The bacterial promoter is also the DNA region that contains the sequences specifically recognized and bound by the regulatory proteins, activators or repressors. Those specific sequences are denoted “operators” and are usually placed in the UP region, between the -35 and the -10 sequences, or immediately downstream of the +1 element. Operators located in the UP region attract activator proteins, which enhance the contact between the α subunit of the RNAP and the DNA, thus increasing the rate of transcription from the promoter. However, if the same position holds an operator for a repressor protein instead, the repressor acts as a steric obstacle for the polymerase and reduces the effective transcription rate. The same effect is observed when operators between the -35 and the -10, or downstream of the +1 element, attract repressor proteins that block the binding of the polymerase to the promoter DNA or its progress through the transcribed region.

Distant operator elements can also interfere with efficient transcription; however, they cannot act on their own. Instead, distant operators provide additional binding sites for multimeric proteins. Binding of such a protein to an operator within the promoter region and to those additional operators results in bending of the DNA and in a very efficient restriction of the access of the RNAP to the promoter region, i.e. repression.

Illustration 3: Structure of the consensus σ70 promoter. The specificity factor binds directly the -35 and the -10 elements. The UP sequence is positioned at proximity to the α-subunit of RNAP allowing for protein-mediated activation of the promoter expression.

By far the most widely used wild-type bacterial promoters in biotechnology, bioengineering and synthetic biology are the promoter controlling the lactose utilization system, the promoter of the arabinose uptake system, the promoters of the λ phage, the promoter controlling quorum sensing in V. fischeri and the promoter transcribed by the polymerase of the phage T7. The promoter controlling the expression of the proteins responsible for the lactose uptake system (Chakerian & Matthews, 1992) is a typical inducible promoter. It is normally blocked by a repressor which, upon binding its cognate ligand (the metabolite, in this case a lactose molecule), loses affinity for DNA and releases the expression. The lactose promoter (Plac) is repressed by the LacI repressor (Illustration 4). Three operators, each able to bind a LacI dimer, are positioned at different sites around the promoter. The first operator (O1) has the highest affinity for the repressor and is positioned immediately downstream of the transcription start site. The second operator (O2) is 401 bp downstream of O1 and has the second-highest affinity for LacI. Finally, the third operator (O3) has the weakest affinity and is 92 bp upstream of O1. O3 is positioned immediately upstream of the UP element, which in this case is the binding site for the cyclic-AMP receptor protein (CRP). The function of the latter is as follows. When the preferred energy source is scarce, the intracellular level of cyclic AMP, which is synthesized from ATP by adenylate cyclase, rises. Thus, the presence of cyclic AMP in the cytoplasm is a sign of starvation and is used as a trigger for the uptake of alternative energy sources. This is accomplished by CRP, which attracts RNAP and is an activator for a number of different promoters. Thus, in the presence of lactose, LacI is prevented from binding and the RNAP can proceed with the transcription of the operon, with CRP acting as an attractor for the RNAP and consequently as an activator of the promoter. On the other hand, if there is no lactose in the cytoplasm, the LacI repressor can bind DNA. Binding to O1 alone is enough to reduce transcription substantially (4700-fold) (Müller, Oehler, & Müller-Hill, 1996). However, the auxiliary operators O2 and O3 allow for the formation of a DNA loop between O1 and O2 or between O1 and O3. Those structures significantly stabilize the binding of LacI to O1 and thus increase the efficiency of repression (up to 19 000-fold).
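
The two repression factors quoted above can be put into perspective with a short calculation. The sketch below converts fold repression into residual expression and, under the assumed simple equilibrium model fold = 1 + [R]/Kd (an illustrative assumption, not a model taken from the cited work), into the implied ratio of effective repressor concentration to dissociation constant. All function names are hypothetical.

```python
# Fold-repression values quoted in the text
FOLD_O1_ONLY = 4700.0      # LacI bound to O1 alone
FOLD_WITH_LOOPS = 19000.0  # O1 plus looping to the auxiliary operators

def residual_expression(fold_repression):
    """Fraction of the unrepressed expression that remains."""
    return 1.0 / fold_repression

def implied_r_over_kd(fold_repression):
    """[R]/Kd implied by the assumed model fold = 1 + [R]/Kd."""
    return fold_repression - 1.0

def looping_gain(fold_single, fold_looped):
    """How many times the auxiliary operators improve repression."""
    return fold_looped / fold_single

if __name__ == "__main__":
    for label, fold in (("O1 only", FOLD_O1_ONLY), ("with loops", FOLD_WITH_LOOPS)):
        print(f"{label}: residual expression = {residual_expression(fold):.2e}, "
              f"implied [R]/Kd = {implied_r_over_kd(fold):.0f}")
    print(f"gain from looping = {looping_gain(FOLD_O1_ONLY, FOLD_WITH_LOOPS):.2f}")
```

Within this simplified picture, DNA looping acts as if it raised the effective local concentration of LacI at O1 roughly fourfold.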

The promoter controlling arabinose uptake (PBAD) works in a different manner (Schleif, 2010). The regulatory protein (AraC) is an activator of expression from this promoter and attracts the RNAP when it is bound to DNA. The operator for AraC binding consists of two half-sites, I1 and I2, positioned upstream of the -35 sequence, with I2 overlapping the -35 by two bp. It has been shown that binding of AraC to I2 alone is sufficient for the activation of the promoter. In this case the regulatory protein is allosterically activated: binding of arabinose to AraC increases the affinity of the protein for DNA.

The λ-phage PR/PRM promoter system is somewhat more complex, because it is responsible for the precise timing of the expression of the different proteins controlling the proper functioning of the phage (Joung, Koepp, & Hochschild, 1994). This system consists of two promoters with opposing expression directions, which share a common operator. The elements of the PRM and the PR promoters are positioned sequentially as follows: -10 and -35 for PRM, then -35 and -10 for PR.

Illustration 4: Repression mechanisms for the Plac promoter. In the absence of its cognate repressor (LacI), the promoter is active and the RNAP is able to transcribe (A). When synthesized in sufficient quantities, LacI tetramerizes and binds to DNA. There are three lac operators in the promoter with different affinities. The repression is through DNA looping, either between O1 and O2 (B) or between O1 and O3 (C).

The operator with the highest binding affinity for the regulatory protein (λ CI), O1, lies downstream of the -35 of PR and overlaps its -10. Upstream of the same -35 lies the operator with the next-highest affinity, O2, which overlaps the two -35 sequences. Finally, the weakest operator, O3, lies between the -10 and the -35 of the PRM promoter. The gradual accumulation of the CI protein leads to its sequential binding to the respective operators in order of affinity. Consequently, binding to O1 represses the expression controlled by PR. Next, O2 is also occupied and the regulatory protein this time acts as an RNAP attractor, resulting in the activation of expression from the PRM promoter. Finally, binding of CI to O3 as well leads to the switching off of the whole system. Although the CI protein has dual functions, it is not widely adopted for the regulation of gene expression, because it is not inducible and hence can only provide for constitutive expression. However, the λ phage PL promoter sequence proved to be tolerant to mutations and to insertions of different operators at key positions, and has therefore been used as a template for the generation of many synthetic promoters (Knaus & Bujard, 1988).
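
The sequential occupation of the three operators can be sketched with a deliberately simplified equilibrium model. It assumes independent, non-cooperative binding with occupancy c/(c + Kd) for each operator (cooperativity between CI dimers is ignored), takes PR activity as the probability that O1 is free, and takes PRM activity as the probability that O2 is occupied while O3 is free. The dissociation constants and all names are hypothetical values chosen only to reproduce the ordering O1 > O2 > O3 described above.

```python
def occupancy(conc, kd):
    """Equilibrium probability that an operator with dissociation
    constant kd is bound at regulator concentration conc."""
    return conc / (conc + kd)

def promoter_activities(ci, kd_o1=1.0, kd_o2=10.0, kd_o3=100.0):
    """Relative activities (0..1) of PR and PRM at CI concentration ci.
    The Kd values are hypothetical and in arbitrary units."""
    theta1 = occupancy(ci, kd_o1)
    theta2 = occupancy(ci, kd_o2)
    theta3 = occupancy(ci, kd_o3)
    p_r = 1.0 - theta1              # PR is repressed when O1 is bound
    p_rm = theta2 * (1.0 - theta3)  # PRM needs O2 bound and O3 free
    return p_r, p_rm

if __name__ == "__main__":
    for ci in (0.1, 1.0, 10.0, 30.0, 100.0, 1000.0, 10000.0):
        p_r, p_rm = promoter_activities(ci)
        print(f"CI = {ci:>8.1f}  PR = {p_r:.3f}  PRM = {p_rm:.3f}")
```

Under these assumptions PR activity falls monotonically as CI accumulates, whereas PRM activity first rises and then falls again at high CI concentrations, which is the qualitative behavior described in the text.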

The promoter regulating the expression of the quorum sensing molecule is also well studied (Dunlap, 1999). The name of this promoter is lux PR and its function is very similar to that of PBAD. The operator sequence is positioned at the UP site and partially overlaps the -35 element of the promoter. Upon binding the quorum sensing molecule, an N-acyl homoserine lactone (AHL), the LuxR regulator increases its affinity for DNA and is thus capable of binding to the operator and attracting RNAP. In V. fischeri the product of this operon is the protein that synthesizes the AHL molecule, and thus this promoter acts as a trigger for the active production of the quorum sensing molecule.

Finally, PT7 is transcribed by the specific RNA polymerase of the T7 phage. In this way, upon infection, the phage is not subject to the control exerted over the activity of the host's RNAP and can easily reach very high expression levels of its own proteins. Unlike the promoters recognized by the host RNAP, this promoter (Imburgio, Rong, Ma, & McAllister, 2000) has a much shorter sequence (17 bp) and is thus much easier to clone. The use of this system in bioproduction is very much the same as in nature: a protein that needs to be over-expressed is put under the control of the T7 promoter and the T7 RNAP is simultaneously expressed in the host system. In this way very high concentrations of the end product can be reached. However, there is a real danger that this forcing of the cellular synthetic machinery may lead to cell death. To avoid that problem, synthetic variants of this promoter have been developed.

II.3. Synthetic promoters

The first successful generation of hybrid promoters was the tac group of promoters (Boer, Comstock, & Vasser, 1983). The Ptac promoter was initially created as a hybrid between the lacUV5 promoter (a derivative of the wild-type Plac) and the wild-type trp promoter. The aim was to increase the strength of the lacUV5 promoter by exchanging its original -35 sequence for the -35 sequence of the stronger trp promoter.

The approach worked, with a productivity increase of more than 5-fold. The Ptrc promoter (Brosius, Erfle, & Storella, 1985) resulted from fine-tuning the tac promoter spacer between the -35 and -10 sequences, which was lengthened from 16 to 17 bp. It is interesting to note that these hybrid promoters were not created with the aim of reproducing the σ70 consensus promoter sequence, yet their -35 and -10 regions match the consensus exactly.
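The comparison with the σ70 consensus can be made concrete with a short sketch. It assumes the commonly cited consensus hexamers TTGACA (-35) and TATAAT (-10) and an optimal spacer of 17 bp. The spacers below are placeholders that only reproduce the spacer lengths mentioned above; they are not the real Ptac or Ptrc sequences, and the function name is hypothetical.

```python
# Illustrative only: compare a promoter core with the sigma70 consensus.
CONSENSUS_35 = "TTGACA"   # commonly cited -35 consensus hexamer (assumption)
CONSENSUS_10 = "TATAAT"   # commonly cited -10 consensus hexamer (assumption)
OPTIMAL_SPACER = 17       # bp between the two hexamers (assumption)


def describe_core(minus35: str, spacer: str, minus10: str) -> dict:
    """Count hexamer positions matching the consensus and report the spacer offset."""
    match35 = sum(a == b for a, b in zip(minus35.upper(), CONSENSUS_35))
    match10 = sum(a == b for a, b in zip(minus10.upper(), CONSENSUS_10))
    return {
        "minus35_matches": match35,
        "minus10_matches": match10,
        "spacer_length": len(spacer),
        "spacer_offset": len(spacer) - OPTIMAL_SPACER,
    }


# Placeholder spacers (poly-N) that differ only in length, 16 bp versus 17 bp.
tac_like = describe_core("TTGACA", "N" * 16, "TATAAT")
trc_like = describe_core("TTGACA", "N" * 17, "TATAAT")
print(tac_like)
print(trc_like)
```

In this toy representation both cores match the hexamers at all six positions, and only the spacer offset distinguishes the 16 bp from the 17 bp variant.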

Next was the group of promoters engineered by Lutz and Bujard (Lutz & Bujard, 1997) (Illustration 5), which remain some of the most used promoters in bioengineering, both directly and as templates for the development of novel promoters. The distinctive feature of this work is the development of hybrid regulation for the promoters. Three promoters from this work are the most used to this day. The first two use the sequence of the λ-PL promoter as a template. The authors kept the promoter sequence intact except where they wanted to insert operators for regulators. Specifically, they replaced the UP sequence of the promoter and the spacer between the -35 and -10 elements with operators for a repressor, LacI or TetR depending on the promoter. The two hybrid promoters were named PLtet-O1 and PLlac-O1. The obtained repression levels were 2500-fold and 600-fold for TetR and LacI respectively. Since each promoter carries two operators for its repressor and the operators are longer than 17 bp, they had to be truncated where they overlap the -10 sequence. Still, the promoters maintained stable expression activity.

The other very important promoter developed in the same research is Plac/ara. This design was based on a stronger derivative of the Plac promoter. To maintain repression by LacI, the O1 operator sequence was placed immediately downstream of +1, and the strongest known operator for LacI, the symmetric Os, was inserted in the spacer between the -35 and the -10. An additional O1 operator was inserted at -448 to allow DNA looping. Finally, the two half-sites of the AraC operator were introduced at the position of the CRP binding site. To maintain the symmetry of the position of the activator, 5 bp of the I2 half-site had to be removed; however, the promoter still remained activated in the presence of AraC and arabinose. With LacI fully induced and AraC activated, the promoter has an activation range of almost 1800-fold.

Illustration 5: Examples of synthetic promoters design. The distances along the DNA axes (gray) do not reflect the actual ratios. The operators for transcription factors are presented as labeled boxes. The reference promoter structures are also presented as well as the original promoters they are derived from.

It is extremely important to note some implications of the exact sequences of the repressor operators of the PLtet-O1 and PLlac-O1 promoters. The most widespread versions of those promoters are the ones in the registry of standard biological parts, under part numbers BBa_R0040 and BBa_R0011 for PLtet-O1 and PLlac-O1 respectively. In those parts the two repressor operators surrounding the -35 promoter region have exactly the same sequence. This does not interfere with the promoters' expression function, nor with the efficiency of repression by their cognate regulators. However, it becomes a problem if those promoters are used to control the expression of a protein in a mostly induced manner, i.e. in a cellular environment without constant production of the repressors. The repetition of the operator sequence provides a convenient substrate for the RecA protein, which is responsible for homologous recombination. If a construct under the control of either promoter is transformed into a host that does not constantly produce the repressor and is not deficient in RecA function, and if this construct involves protein synthesis, the two operators merge in almost 100% of the cases observed in our lab. This removes the entire -35 sequence, rendering the promoter useless (Illustration 6).

The phenomenon is easy to explain: prokaryotic cells are in constant competition for the scarce resources in the environment, and outgrowing competitors is a guarantee for survival. In such a context, any bacterial cell that manages to remove the load of constant protein production that brings no survival advantage will outgrow the cells that keep this burden. Consequently it proliferates faster and soon takes over the whole population. This may lead to serious problems with the genetic stability of synthetic circuits and needs to be avoided. There are two approaches to overcome this issue. First, recA- mutants might be used to host such constructs; however, this seriously limits the choice of bacterial strain. Alternatively, slightly different sequences might be used for the two operators, thus reducing the probability of recombination. Interestingly, the original version of the PLlac-O1 promoter has two slightly different (degenerate) sequences for its two repressor operators.
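The loss of the -35 region through recombination between identical operators can be illustrated with a toy string model. The operator and downstream sequences below are invented placeholders, not the real operators of BBa_R0040 or BBa_R0011, and the function is a hypothetical helper that simply collapses two identical direct repeats into one copy.

```python
# Toy model: recombination between two identical direct repeats deletes
# one repeat copy together with everything that lies between the repeats.

def recombine_direct_repeats(seq: str, repeat: str) -> str:
    """Collapse the first two occurrences of `repeat` into a single copy."""
    first = seq.find(repeat)
    if first == -1:
        return seq
    second = seq.find(repeat, first + len(repeat))
    if second == -1:
        return seq  # fewer than two copies: nothing to recombine
    return seq[:first] + seq[second:]


OPERATOR = "TCGATCCGTAGGCATTCAG"   # hypothetical 19 bp operator (placeholder)
MINUS_35 = "TTGACA"                # -35 element flanked by the two operators
DOWNSTREAM = "GATACTGAGCAC"        # placeholder for the rest of the promoter

promoter = OPERATOR + MINUS_35 + OPERATOR + DOWNSTREAM
mutant = recombine_direct_repeats(promoter, OPERATOR)

print(MINUS_35 in promoter, MINUS_35 in mutant)
print(len(promoter) - len(mutant), "bp lost")
```

In this toy case the deletion spans one operator copy plus the -35 hexamer. Using two operators with slightly different sequences means the second `find` returns nothing and the sequence is left intact, which is the idea behind the degenerate operators mentioned above.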

Based on the work of Lutz and Bujard, the Ptet/lac promoter was developed for the iGEM competition in 2008 as BioBrick part BBa_K091101 (Illustration 5). It consists of the PLtet-O1 promoter with the addition of a LacI operator immediately downstream of the -10 element. This promoter is efficiently repressed by TetR, by LacI, or by both (as also confirmed by our own experience).

Finally, another very important engineered promoter is PT7-lac. This promoter consists of the native T7 sequence with the addition of a LacI operator sequence immediately downstream. As a result, in the presence of LacI, expression from this promoter is strongly repressed. Furthermore, by adding different concentrations of inducer, one can control the level of expression from this promoter. This avoids the problem of uncontrollable over-expression of proteins in the host while still allowing high product yields. Consequently, this promoter is very widely adopted in bioproduction.

Illustration 6: The effect of homologous recombination on the sequences of some of the most used synthetic promoters. The original promoters (above) lose the -35 region (below) and thus become defective for expression. The sequences of the mutated promoters have been confirmed more than once.

II.4. Ribosome binding site

The next element defining the rate of protein expression in a genetic circuit after the promoter is the RBS (Salis et al., 2009). This is the mRNA site at which the ribosome binds and from which it searches for a start codon to initiate translation. We regard as an RBS the short sequence bound by the 16S rRNA of the small ribosomal subunit. In the context of synthetic biology, a library of standard RBSs with different initiation rates is a tool to regulate protein expression without resorting to complex promoter engineering, whose results are difficult to predict. Therefore we chose to utilize the RBS library of the registry of standard biological parts, which has always worked robustly for us and offers exactly the required diversity of protein expression levels. We utilized members of the Community collection: mostly BBa_B0034 as the reference (initiation rate = 1) and a strong RBS, BBa_B0030 as an average-strength RBS (initiation rate = 0.6) and BBa_B0032 as a weak RBS (initiation rate = 0.3).
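As a rough illustration of how such a library is used, the sketch below scales a relative promoter strength by the relative initiation rates quoted above. The multiplicative model and the function name are simplifying assumptions made for this example, not a characterization performed in this work.

```python
# Relative initiation rates as quoted in the text (BBa_B0034 is the reference).
RBS_RELATIVE_RATE = {
    "BBa_B0034": 1.0,   # strong, reference
    "BBa_B0030": 0.6,   # average strength
    "BBa_B0032": 0.3,   # weak
}


def relative_expression(promoter_strength: float, rbs: str) -> float:
    """Crude estimate: expression ~ promoter strength x RBS initiation rate.

    Assumes transcription and translation initiation contribute independently,
    which ignores context effects such as mRNA secondary structure.
    """
    return promoter_strength * RBS_RELATIVE_RATE[rbs]


# Same (arbitrary) promoter strength combined with each RBS of the library.
for part in RBS_RELATIVE_RATE:
    print(part, relative_expression(1.0, part))
```

Swapping the RBS in this way changes the expected expression level without touching the promoter, which is the practical appeal of the library.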

II.5. Transcription factors

As stated before, we engineered only genetic circuits that are under the control of protein regulators of promoter expression, or transcription factors (TFs). These are used in combination with appropriate promoters and were already mentioned in the section on promoter design. LacI (Lewis, 2005) has a clear structure comprising three distinct domains. First, there is a helix-turn-helix domain at the N-terminus responsible for DNA binding. Second, the central domain is responsible for binding sugar ligands, i.e. lactose or its structural analogue IPTG. Finally, at the C-terminus of the protein is the tetramerization domain. LacI can also bind DNA as a dimer, but the full activity of the protein is observed in its tetrameric form. Thanks to this structure, the repressor can bind distant operator sites simultaneously and induce DNA looping, which increases the efficiency of repression many fold.

The TetR DNA binding motif is also a helix-turn-helix, and the protein is active as a homodimer (Ramos et al., 2005). However, the dissociation constant of the repressor-DNA complex is about 4 orders of magnitude lower than that of the LacI-DNA complex (Kamionka, Bogdanska-Urbaniak, Scholz, & Hillen, 2004), which makes the binding of TetR to DNA much tighter. At the same time, its affinity for non-operator DNA sequences is much lower, which also makes this repressor much more specific. There is a simple biological reason for the difference between the properties of the two transcription factors. LacI represses the expression of the lactose importer and in this way avoids an unneeded protein synthesis load. However, a complete lack of the importer would mean that even if lactose were present in the environment and the cell were starving, the sugar would enter the cell only in very low quantities. Therefore, occasional leaks from the Plac promoter are needed to provide a low number of importer molecules, which can trigger the activation of the whole lactose-assimilation system. Conversely, the TetR repressor blocks the expression of the tetracycline antiporter, which removes the antibiotic from the cytosol. This antiporter is needed only when molecules of the antibiotic are inside the cell, and even small quantities of this protein are a heavy burden on the cellular expression machinery. Therefore, repression by TetR must be very tight and leaks from the promoter are not tolerated. On the other hand, tetracycline itself is deleterious for the bacteria, so the induction of the repressor by its ligand must be very efficient. Consequently, TetR also responds to very low ligand (tetracycline or anhydrotetracycline) concentrations.
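The link between dissociation constant and promoter leakiness can be sketched with the simplest equilibrium binding model, in which the fraction of time an operator is occupied is [R] / ([R] + Kd). The absolute Kd values below are hypothetical; only the four-orders-of-magnitude ratio is taken from the text, and cooperativity, DNA looping and non-specific binding are all ignored.

```python
def operator_occupancy(repressor_nM: float, kd_nM: float) -> float:
    """Fraction of time a single operator is bound: [R] / ([R] + Kd)."""
    return repressor_nM / (repressor_nM + kd_nM)


KD_WEAK = 1.0              # hypothetical Kd in nM for the weaker binder
KD_TIGHT = KD_WEAK / 1e4   # four orders of magnitude lower, as stated in the text

for r in (0.01, 0.1, 1.0, 10.0):
    weak = operator_occupancy(r, KD_WEAK)
    tight = operator_occupancy(r, KD_TIGHT)
    # The unbound fraction serves as a rough proxy for promoter leakiness.
    print(f"[R] = {r:>5} nM  leak(weak) = {1 - weak:.4f}  leak(tight) = {1 - tight:.6f}")
```

Under these assumptions the tighter binder keeps the operator almost permanently occupied at repressor concentrations where the weaker binder still leaves it free a substantial fraction of the time.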

LuxR is also considered to be a TetR-type protein, which binds DNA through a helix-turn-helix motif. Allosteric control is exerted by the acyl-homoserine lactone (AHL) signal, and the protein is known to both repress and activate transcription. However, little is known about the precise activation mechanism in which this transcription factor is involved.

Variants of both the LacI and the TetR repressors with different operator specificities have been engineered (Krueger, Scholz, Wisshak, & Hillen, 2007; Falcon & Matthews, 2000). These mutated proteins have somewhat reduced DNA affinity, hence lower repression efficiency, but they are still controlled allosterically, which makes them a useful addition to the family of transcription factors available for genetic circuit engineering.

Less is known about the structure of the AraC protein (Schleif, 2010). There are two suspected helix-turn-helix motifs in the C-terminal domain of the protein that might bind the two operator half-sites I1 and I2, but this has not been established. However, the function of this protein is very well studied. It acts as a homodimer. When bound to DNA it is capable of both attracting RNAP and helping its conversion to the open complex ready for polymerization. The inducer of this protein is L-arabinose, which changes the spatial structure of the dimer in a way that permits efficient binding to the adjacent half-sites. The strong activation capacity of this transcription factor is very useful for the engineering of novel hybrid promoters that involve positive regulation.

The CI proteins from different phages (λ, 434, P22) all function as both activators and repressors and have very similar domain organization (Kim & Hu, 1997). They are active as dimers and have a DNA recognition domain at the N-terminus, which comprises an α-helix. Their structures are so similar that successful hybrid proteins have been created bearing the dimerization domain of one type of CI protein (λ or 434) and the DNA binding domain of another (λ/434, λ/P22 and 434/P22) (Di Lallo, Castagnoli, Ghelardini, & Paolozzi, 2001; Webster, Merryweather, & Brammar, 1992). These hybrid proteins dimerize successfully with their wild-type variants and bind to hybrid DNA operators. In this way a number of interesting logic functions can be created. Unfortunately, the CI proteins are under no allosteric control by metabolites; their activity therefore depends entirely on their expression levels, and circuits involving them cannot be fine-tuned.

II.6. Fluorescence reporter proteins

A very important part of every genetic circuit is the reporting device. This is a simple genetic device that represents the dynamics of a particular part of the genetic circuit, thus making it possible to assess its behavior. From an engineering point of view, the reporting device provides the observability of the system and is therefore indispensable for any type of modeling of synthetic biological circuits. The dynamic state of living cells is reported through fluorescence microscopy observations of fluorescent dyes or proteins. The latter are much more photostable (Shaner, Steinbach, & Tsien, 2005) and are, for now, the only realistic option for long time-lapse in vivo experiments. The available fluorescent reporter proteins are derivatives of the green fluorescent protein (GFP) originally isolated from A. victoria (Chalfie, 1995).

Fluorescent proteins have a few relevant characteristics to be taken into consideration when deciding which ones to work with. As already mentioned, one very important property for long time-lapse microscopy is photostability, i.e. the capacity to resist photobleaching after long exposure to UV light. All fluorescent proteins are prone to inactivation under long cumulative exposure to UV light; however, if this effect takes longer than the average turnover time of the protein, it remains without consequences. Another important property is brightness, since it determines the amount of UV light required for clear observation of the fluorescence under the microscope. The brighter the fluorescent protein, the smaller the total UV energy required for its excitation and the less damage inflicted on the cells. Finally, when more than one process in the cell needs to be reported simultaneously, a combination of different fluorescent reporters is required. However, since all fluorescent proteins are derived from the same original protein, their excitation and emission spectra lie very close together. Therefore, proteins to be combined should be selected so that their spectra overlap as little as possible. The characteristics of the available fluorescent proteins have been reviewed many times, e.g. in (Shaner et al., 2005). Keeping all that in mind, the standard fluorescent proteins from the registry of standard parts that we use alone are mCherry (red), GFP (green) and enhanced yellow fluorescent protein (eYFP). When pairs of fluorescent proteins are required, the combinations in order of preference are mCherry/GFP and cyan fluorescent protein (CFP)/eYFP. An important aspect of the fluorophores of these proteins is that they require oxygen to form properly, hence fluorescent proteins cannot be used in low-oxygen environments.

II.7. Protein degradation tags

Protein synthesis is a very important aspect of the functioning of genetic circuits. The interactions between the different nodes of the circuit are provided by the regulation that proteins exert on promoter expression. By diffusing through the cytosol, proteins provide connectivity between devices that are cloned on separate vectors. Furthermore, it is through the observation of fluorescence emitted by reporter proteins that we can assess the expression dynamics exhibited by a certain promoter and hence estimate the function of a certain device. Therefore, proteins in genetic circuits should be generated in sufficient quantities to perform all of their functions efficiently. However, these biomolecules can be quite stable, and when overproduced they may accumulate in the cytosol. Apart from being a general spatial hindrance for the movement of all other biomolecules in the cytosol, accumulated proteins may also affect the actual or observed dynamics of the system: they reduce its reactivity and act as a damper on the dynamics it can exhibit (Elowitz & Leibler, 2000). Therefore, a method for efficient removal of the synthesized proteins is usually required when engineering a dynamic genetic circuit. The most widely used method is to tag the proteins with protease tags that direct them to protease enzymes for fast degradation (Gottesman, Roche, Zhou, & Sauer, 1998). The most widely used such system in E. coli is the one that removes nascent peptide chains whose translation has been interrupted. This system directs the tagged peptide chain to the proteases ClpXP and ClpAP. The tag used for recognition of the compromised peptide is a short peptide of 11 amino acids (AA) attached to the C-terminus of the protein. This mechanism can be co-opted to increase the turnover of the regulatory proteins used in synthetic gene circuits by borrowing the peptide tag: the tag coding sequence is added immediately upstream of the stop codon of the protein coding sequence. This way, when normally expressed, the synthesized protein has a very short lifetime before it is degraded by the proteases. The degradation tag sequences differ in the last three AA of the peptide chain and are named after those three residues. The two most widely used protein degradation tags are AANDENYALAA and AANDENYALVA (Andersen et al., 1998), and some of the proteins provided by the registry of standard biological parts already have those tags in their sequences.
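The tagging step described above can be sketched at the amino-acid level. The two tag sequences are the ones quoted in the text; the function name, the use of "*" to mark the stop codon and the example peptide are assumptions made for this illustration.

```python
TAG_CORE = "AANDENYA"   # first eight residues shared by both tags quoted above
TAG_VARIANTS = {
    "LAA": TAG_CORE + "LAA",   # AANDENYALAA
    "LVA": TAG_CORE + "LVA",   # AANDENYALVA
}


def add_degradation_tag(protein: str, variant: str = "LAA") -> str:
    """Append the tag at the C-terminus, i.e. immediately before the stop ('*')."""
    body = protein.rstrip("*")           # drop the stop marker if present
    return body + TAG_VARIANTS[variant] + "*"


# Placeholder peptide, not a complete protein sequence.
tagged = add_degradation_tag("MSKGEELFTG*", "LVA")
print(tagged)
```

At the DNA level the same operation corresponds to inserting the tag coding sequence in frame immediately upstream of the stop codon.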

II.8. Transcriptional terminators

The process of translation ends when the ribosome reaches a stop codon and the nascent polypeptide chain is released into the cytosol. However, if there is a translation initiation site nearby downstream of the stop codon, a new translation event might begin. This is the normal process for the translation of polycistronically encoded proteins. However, if this process continues downstream of all the intended protein coding sequences, it has unwanted consequences. In the first place, it sequesters valuable cellular resources for the synthesis of unwanted peptides. Additionally, those small peptides have unknown functions, which might even be detrimental. Finally, any addition of large molecules to the cytosol adds to the general molecular noise in the system. The natural elements that efficiently stop transcription at the proper position are known as transcriptional terminators (Larson, Greenleaf, Landick, & Block, 2008). These are mRNA sequences consisting of a long stretch of predominantly G and C nucleotides followed by a run of 7-9 U nucleotides. G-C base pairs are very stable and form readily, hence the GC stretch folds into a hairpin-type secondary structure. The U-rich sequence in turn binds weakly to the DNA template and is easily pulled out of the RNAP during transcription, thus allowing the hairpin to form. Transcriptional terminators insulate genetic elements and reduce the crosstalk between them, therefore they are of crucial importance for the proper engineering of synthetic genetic circuits. In keeping with the aim of using standard parts as much as possible, unless stated otherwise, we use transcriptional terminators from the registry of standard biological parts. We usually utilize the BBa_B0015 double terminator.
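The structural signature described above (a GC-rich hairpin followed by a U-run) can be searched for with a very simple heuristic. The sketch works on the DNA coding strand (T instead of U), requires a perfectly complementary stem, and uses arbitrary thresholds; the function name, the parameters and the test sequence are all hypothetical, and this is not a substitute for a proper terminator prediction tool.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_terminator_like(seq, stem=6, min_loop=3, max_loop=8, min_t=6, min_gc=0.6):
    """Return (start, end) of the first GC-rich perfect hairpin followed by a T-run.

    `end` is the index just past the second stem arm; returns None if no hit.
    """
    seq = seq.upper()
    for i in range(len(seq) - 2 * stem - min_loop - min_t + 1):
        left = seq[i:i + stem]
        gc = (left.count("G") + left.count("C")) / stem
        if gc < min_gc:
            continue  # stem arm not GC-rich enough
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop
            right = seq[j:j + stem]
            if len(right) < stem:
                break
            if right != revcomp(left):
                continue  # arms cannot pair perfectly
            tail = seq[j + stem:j + stem + min_t + 2]
            if "T" * min_t in tail:
                return i, j + stem
    return None


# Invented test sequence: stem GCCCGC, loop GAAA, stem GCGGGC, then eight T's.
toy = "ATGAAC" + "GCCCGC" + "GAAA" + "GCGGGC" + "TTTTTTTT" + "ACG"
print(find_terminator_like(toy))
```

Real terminators tolerate mismatches and bulges in the stem, so a practical search would score hairpin stability rather than demand perfect complementarity.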

II.9. Expression vectors

We engineer synthetic genetic circuits that should function in living cells, therefore we aim to express and characterize them in vivo. To transform them into the host E. coli, we need a vector. One approach is to insert our synthetic parts directly in the chromosome. This method has many advantages. First, the chromosome has a much more stable copy number, with known variations that are connected to cellular growth and can easily be filtered out if needed. Additionally, no special incentive is needed for the cell to maintain the chromosome; it has to. The low copy number also means a lower expression load on the cell from the genetic circuit, which means more stable genetic maintenance and less distortion of normal metabolism.

The other possible vector, widely utilized in synthetic biology, is a plasmid. Plasmids are mostly circular DNA molecules that carry their own origin of replication and copy-number control mechanism. They therefore replicate independently from the chromosome and maintain their copy number within certain margins, which may however be very wide. The copy number of a plasmid may change sharply, especially after cell division with uneven distribution of copies between the daughter cells. There is therefore a large drift in plasmid copy number, which also affects the expression levels of all the carried genetic elements. However, plasmids are extremely easy to work with. Their DNA can be engineered using standard molecular cloning techniques. They are active immediately after transformation and need no additional operations to be inserted anywhere. Plasmid DNA is also extremely easy to isolate, sequence, store, distribute, etc. It is directly physically accessible, unlike the chromosome, which we approach only indirectly through specific enzymes and under special conditions. Finally, the higher copy number of plasmids with respect to the chromosome provides a higher level of expression of all the synthetic genetic elements, including the fluorescent reporter proteins. This means easier detection and reduced exposure of the engineered cells to the deleterious UV light. Characterization of dynamic genetic circuits requires long time-lapse experiments, and the possibility of reduced UV exposure is of key importance for the success of the experiment. Overall, plasmids are not perfect vectors for synthetic genetic circuits, but they are the easiest to work with and allow for the type of experiments we aim to perform. Therefore, we decided to utilize plasmids as expression vectors.

There are two major types of considerations involved when utilizing plasmids as vectors for genetic expression. First, there is the need to counteract the natural direction of bacterial evolution towards the removal of all unnecessary protein expression systems. As already discussed, any bacterium that manages to remove the plasmid from its cytosol would have a growth advantage and would soon take over the whole population. Therefore, the plasmids we transform into the cells need to provide some advantage to the host in order not to be selected against. The standard approach is to clone into the plasmid an antibiotic resistance system, whose loss leads to cell death or growth inhibition. There is a large number of known antibiotics with specific resistance systems that could be employed. The most standard of those are ampicillin, kanamycin and chloramphenicol. These antibiotics can be combined if more than one type of plasmid needs to be maintained in the same cell.

The other type of problem arising from plasmid usage is the limited size a plasmid should have in order to be efficiently transformed into a bacterial host (Chan, Dreolini, Flintoff, Lloyd, & Mattenley, 2002). This limitation leads to the need to clone different parts of a genetic circuit on more than one plasmid. In addition, modularity is another reason for separating the individual functions of a genetic circuit on separate plasmids and then recombining them through different co-transformations. However, plasmid copy number is controlled through a system characteristic of the origin of replication. This system is based on RNAs or proteins and affects all plasmids with the same origin of replication in the same cell. Therefore, if two different genetic elements are expressed from the same type of plasmid in the same cell, the exact copy number of each of the two plasmids is unknown. The total number is the one characteristic of the origin; however, the regulator (be it protein or RNA) cannot differentiate between the two plasmids. Consequently, the ratio of the copies of the two plasmids fluctuates constantly, which affects the dynamics of the genetic circuit in an unknown manner. Hence, plasmids that employ different modes of copy-number control should be utilized for the expression of different genetic devices. The registry of standard biological parts provides plasmid vectors with three different compatible origins of replication (Table 1; http://openwetware.org/wiki/Escherichia_coli/Vectors) and we utilized them for our research, unless stated otherwise.

Plasmid origin of replication    Typical copy number    BioBrick reference
pSC101                           ~5                     pSB4xx
p15A                             10-12                  pSB3xx
pMB1                             500-700                pSB1xx

Table 1: Standard vector plasmids from the registry of standard biological parts.
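The reason for choosing different origins of replication can be illustrated with a small stochastic sketch of two plasmids, A and B, that share one origin. It assumes that copy-number control only restores the total number of copies, that the copy to replicate is chosen at random, and that each copy is inherited by the tracked daughter cell with probability 0.5. Antibiotic selection for each plasmid is ignored, and all names and parameter values are hypothetical.

```python
import random


def simulate_shared_origin(copies_at_division=20, generations=200, seed=1):
    """Follow one cell lineage and return the fraction of plasmid A per generation."""
    rng = random.Random(seed)
    a = b = copies_at_division // 2          # start with an even split
    fractions = []
    for _ in range(generations):
        # Segregation: each copy goes to the tracked daughter with probability 0.5.
        a_d = sum(rng.random() < 0.5 for _ in range(a))
        b_d = sum(rng.random() < 0.5 for _ in range(b))
        if a_d + b_d == 0:
            continue  # plasmid-free daughter: skip it and redraw from the same mother
        a, b = a_d, b_d
        # Replication: the control system only senses the total, so randomly
        # chosen copies are duplicated until the total is restored.
        while a + b < copies_at_division:
            if rng.random() < a / (a + b):
                a += 1
            else:
                b += 1
        fractions.append(a / (a + b))
    return fractions


fractions = simulate_shared_origin()
print(min(fractions), max(fractions))
```

In this toy model the total stays fixed while the A:B ratio performs a random walk, and without separate selection one plasmid can eventually be lost altogether. Plasmids with compatible origins avoid this because each copy number is regulated independently.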

II.10. Construction of genetic circuits

For the handling, construction and transformation of the genetic circuits we used standard molecular cloning techniques such as plasmid DNA isolation (miniprep), restriction digestion, ligation, PCR, chemical transformation, electroporation, electrophoresis, gel DNA extraction, etc. The protocols employed were the standard protocols supplied by the companies producing the chemical consumables and/or the hardware. Protocols for DNA and bacterial strain handling and additional information were taken from the manual of Green and Sambrook (http://www.molecularcloning.com/). The BioBrick standard also standardizes the cloning itself, e.g. the restriction sites used and the standard upstream and downstream sequences flanking the genetic parts. As always, we tried to follow the standard of the registry as much as possible. This resulted in a standard spacer between the joined genetic parts, denoted “scar”, which is 8 bp long. This is very close to the 7 bp spacer of the σ70 consensus promoter between the -10 sequence and the transcriptional start site. Additionally, protein sequences provided by the registry that already have an RBS in front are separated from the RBS by a shorter 6 bp spacer, which is also close to the optimum of 7-9 bp.
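The effect of the scars on the final sequence can be sketched as simple string concatenation. The scar sequences below are the ones usually quoted for standard BioBrick assembly (an 8 bp scar, shortened to 6 bp in front of a coding sequence); they are stated here as an assumption, since the text only gives their lengths. The part sequences are short placeholders and the function is a hypothetical helper.

```python
SCAR = "TACTAGAG"            # assumed 8 bp scar between two standard parts
SCAR_BEFORE_CDS = "TACTAG"   # assumed 6 bp scar in front of a coding sequence


def assemble(parts):
    """Join parts in 5'->3' order; `parts` is a list of (sequence, is_cds) tuples."""
    assembled, _ = parts[0]
    for part_seq, is_cds in parts[1:]:
        assembled += (SCAR_BEFORE_CDS if is_cds else SCAR) + part_seq
    return assembled


# Placeholder parts: a promoter fragment, an RBS-like motif and a tiny ORF.
construct = assemble([
    ("TTGACA", False),
    ("AGGAGG", False),
    ("ATGAAATAA", True),
])
print(construct)
```

Keeping track of the scars in this way makes it easy to check the resulting spacings, for example the distance between the RBS and the start codon.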

II.11. Genetic parts engineered for this research

II.11.1. Synthetic promoters (APPENDIX A)

The first promoters that I engineered were a group of synthetic promoters for the needs of the BiomodularH2 project (http://www.biomodularh2.org/). In summary, the aim of this project was to increase the level of molecular hydrogen produced by cyanobacteria through local reduction of the oxygen concentration. To accomplish that, the project involved engineering a large number of synthetic devices and expressing them in the host organism. However, not much is known about promoter sequence function in cyanobacteria. Therefore, we aimed to base our hybrid promoters on templates known to function in Synechocystis, like Ptrc or its close variants Plac-UV5 and Plac8A. As another option we also engineered some promoters utilizing the -35 and -10 sequences from the λ-PL promoter, since phages are often not species-specialized. In these backbones, operator sites for different transcription factors were grafted into putative repressor sites (e.g. between the -35 and -10 regions). The transcription factors utilized are CI from phages λ or 434 and their mutations and combinations, TetR and its variant, LacI and its variant, and AraC.

Ptrc repressed by λ-CI

Ptrc repressed by mutated λ-CI (sequence and position of the operator were based on (Cox, Surette, & Elowitz, 2007))

Ptrc repressed by 434-CI

Ptrc repressed by a mixed 434/P22-CI and P22-CI

Ptrc repressed by 434-CI and 434/P22-CI

Ptrc repressed by 434-CI using O3 (weak operator)

Ptrc repressed by P22-CI using O3 (weak operator)

PL from lambda phage repressed by CI hybrid operator from phages 434/P22 (weak)

Plac8A promoter activated by AraC

Plac8A promoter activated by AraC another version

PL repressed by TetR’ or TetR

Plac-UV5 repressed by TetR

Ptac promoter repressed by TetR

As explained in the chapter dedicated to genetic oscillators, the PTrc1.x.TetR promoter was used for the engineering of a Goodwin-type oscillator. The hybrid promoter was ligated with the TetR transcription factor (BioBrick part: BBa_SO3518) in the pSB4A5 low-copy-number plasmid (pSC101 origin: 5 copies per cell). The TetR part contains an RBS in its upstream region. The PTrc1.x.TetR promoter was also used for the construction of a genetic amplifier with a double GFP with enhanced fluorescence. This amplifier was cloned in a high-copy-number plasmid with the pUC19-derived pMB1 replication origin (copy number of 100-300 per cell). The GFP part also contains an RBS. The co-transformation of those two parts was used for the characterization of the putative oscillator.

II.11.2. XOR gate promoters

Further work on synthetic promoters was carried out for the Bactocom project (http://www.bactocom.eu/). This project relies on the random combination of genetic parts encoded in plasmids, which are exchanged by conjugation within a cellular population. The host cells would promote the combinations of plasmids that produce a given output predefined by an input function controlled by the researcher. In this way, instead of top-down engineering, the design process is based on directed combinatorial evolution towards the best solution. To decide whether the output of a randomly combined circuit is close to the template input, a specific device had to be employed. This device had to produce a negative signal “punishing” the cell if it hosted a circuit whose response differed from the required one. Consequently, the digital logic of the device had to be the following. Whenever the two inputs are the same (both “0” or both “1”), the output (repression of the plasmids' replication) had to be “0”. Whenever the inputs differ (“01” or “10”), the output had to be activated and the spread of the bad combination of plasmids prevented (Illustration 7). This logic is known as an XOR gate or a comparator device. To construct this device we decided to employ a combination of two promoters with symmetric behavior. Both promoters had to respond to the same pair of inputs, but in an inverse manner. One of them had to be activated by input1 and repressed by input2. Thus, if only input1 was present the promoter would be active, and if only input2 was present the promoter would be repressed. If neither input was present the promoter would not be activated, and if both inputs were present, repression by input2 would take over and the promoter would be repressed. Consequently, this promoter would be active only when input1 alone was present. The inverse promoter had to function in the opposite manner and be active only when input2 alone was present. The logical disjunction of those two gates would produce precisely an XOR gate and would be accomplished by sequential cloning of the two promoters.
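The comparator logic described above can be checked with a minimal Python sketch. It treats each hybrid promoter as an idealized Boolean gate in which repression dominates over activation; the function names are hypothetical and chosen only for illustration, and no leakiness or graded response is modeled.

```python
from itertools import product


def promoter_1(input1: bool, input2: bool) -> bool:
    """Idealized promoter activated by input1 and repressed by input2.

    Repression dominates, so the promoter is ON only when input1 is alone.
    """
    return input1 and not input2


def promoter_2(input1: bool, input2: bool) -> bool:
    """Inverse promoter: activated by input2 and repressed by input1."""
    return input2 and not input1


def comparator(input1: bool, input2: bool) -> bool:
    """Logical disjunction of the two promoters (both drive the same output)."""
    return promoter_1(input1, input2) or promoter_2(input1, input2)


# Enumerate the four input combinations and print the truth table.
truth_table = {
    (a, b): comparator(a, b) for a, b in product([False, True], repeat=2)
}
for (a, b), out in truth_table.items():
    print(int(a), int(b), "->", int(out))
```

In this idealization the output is ON exactly when the two inputs differ, which is the XOR behavior the device requires.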

Illustration 7: XOR gate design. The molecular mechanism required transcription factors that act as both repressors and activators and for which functioning combinatorial promoters with both regulatory functions are known. The naturally occurring genetic parts exhibiting all of those features are the bacteriophage lysogeny control promoters. We chose the λ and the 434 PRM promoters because of the compatible lengths of the TFs' operators. The expression of the two CI proteins is controlled through LacI- and TetR-repressible promoters. The latter two TFs were expressed constitutively from a Z1 cassette inserted in the chromosome.

To obtain such behavior, the transcription factors used had to be capable of performing both activator and repressor functions. Additionally, it was preferable that those protein regulators be known to perform both functions in the same complex promoter, so that both functions are certain to be compatible with known promoter sequences. The one solution that emerged was the complex PR and PRM promoter pairs of the bacteriophages of the λ type. The 434 phage has a promoter structure very similar to the one already described for λ, with one main difference. The -35 sequence is shared between the two inversely oriented promoters, and the operator that provides repression of PR between its -35 and -10 sequences is actually at the UP element of the PRM promoter and is also the activating position for the latter (Illustration 8).

Therefore we designed two similar promoters based on the PRM from the λ and the 434 bacteriophages. In brief, we maintained the original O2 operator for the cognate CI protein of each promoter and grafted an operator for the other CI protein into the sequence between the -35 and -10 elements (Illustration 8). In this way, we expected the new promoters to be activated normally by their cognate proteins and repressed by the complementary protein.

Exchanging the spacer of the 434 promoter with an O2 for the λ-CI protein seemed risky, because the length of the operator is exactly 17 bp. Therefore the “TG” bp at -14 and -15 were also exchanged. The other hybrid promoter was expected to be more reliable, because the O2 of the 434-CI is only 14 bp and could be fitted immediately downstream of the -35, thus leaving the -14 and -15 bp intact. However, phenotypically it was exactly the latter promoter that did not work. Therefore we kept the design for the 434-template promoter and tried to develop an alternative for the λ-based system. Fortunately, a similar promoter had already been developed in another study (Guido et al., 2006). This promoter was based on the λ-PRM template, but had two major changes. First, the O3 operator was obliterated, hence the promoter was no longer repressed by CI at high concentrations of the transcription factor. Second, a LacI-O1 operator was added upstream of the promoter, which provided efficient repression by LacI. Therefore we kept the entire Collins promoter intact and only grafted an O1 for the 434-CI protein into the LacI operator site to provide repression by the 434-CI protein. We kept the center of the new operator as close as possible to the center of the original one. As a result, the intended function of the promoter was obtained (Illustration 8).

Illustration 8: Schematic representation of the design of hybrid promoters for the construction of the XOR gate. The designs were based on the original PR/PRM promoters from the phages λ and 434 (A). Initially the PRM sequences of both promoters were used to generate an activated template, and O2 operators for the alternative CI proteins were grafted between the -35 and -10 sequences (B.1 and B.2). Another version of the λ-activated and 434-repressed promoter was based on a synthetic promoter (B.3) repressed by LacI. The new version of B.2 used the same structure, but with a 434-O1 operator grafted into the lacO1 sequence (B.4).

The expression of the two CI proteins was placed under the control of the PLtet-O1 and PLlac-O1 promoters. In this way, if constant production of the TetR and LacI repressors is provided in the cell, the actual inputs to the system are the inducers of those repressors, namely aTc and IPTG. Therefore, we used specific strains for the characterization of the XOR gate. Those strains contained a Z1 cassette (Lutz & Bujard, 1997) inserted in the chromosome. This device contains TetR and LacI under the control of constitutive promoters, thus providing constant synthesis of those repressors.

II.12. Conclusion

We have reviewed the major components of genetic circuits. Based on the theoretical knowledge of the structure and function of those parts and on successful examples from published work we engineered a number of novel promoters. Some of the latter were characterized as parts of synthetic genetic circuits and appeared to be functional. Furthermore, some important conclusions were derived for the mutual dependence between choice of genetic parts for the engineering of a circuit, the required circuit function and the characterization method to be utilized.

II.13. References

Andersen, J., Sternberg, C., Poulsen, L., Bjorn, S., Givskov, M., & Molin, S. (1998). New unstable variants of green fluorescent protein for studies of transient gene expression in bacteria. Applied and environmental microbiology, 64(6), 2240–2246. Retrieved from http://aem.asm.org/cgi/content/abstract/64/6/2240

Boer, H. A. de, Comstock, L. J., & Vasser, M. (1983). The tac promoter: a functional hybrid derived from the trp and lac promoters. Proceedings of the National Academy of Sciences, 80(1), 21–25. Retrieved from http://www.pnas.org/content/80/1/21.abstract

Brosius, J., Erfle, M., & Storella, J. (1985). Spacing of the -10 and -35 regions in the tac promoter. Effect on its in vivo activity. The Journal of biological chemistry, 260(6), 3539–41. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/2579077

Chakerian, A. E., & Matthews, K. S. (1992). Effect of lac repressor oligomerization on regulatory outcome. Molecular microbiology, 6(8), 963–8. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/1584025

Chalfie, M. (1995). Green fluorescent protein. Photochemistry and photobiology, 62(4), 651–656.

Chan, V., Dreolini, L. F., Flintoff, K. A., Lloyd, S. J., & Mattenley, A. A. (2002). The Effect of Increasing Plasmid Size on Transformation Efficiency in Escherichia coli, 2(April), 207–223.

Cox, R. S., Surette, M. G., & Elowitz, M. B. (2007). Programming gene expression with combinatorial promoters. Molecular systems biology, 3(145), 145. doi:10.1038/msb4100187

Di Lallo, G., Castagnoli, L., Ghelardini, P., & Paolozzi, L. (2001). A two-hybrid system based on chimeric operator recognition for studying protein homo/heterodimerization in Escherichia coli. Microbiology (Reading, England), 147(Pt 6), 1651–6. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/11390696

Dunlap, P. (1999). Quorum regulation of luminescence in Vibrio fischeri. Journal of molecular microbiology and biotechnology, 1(1), 5–12.

Elowitz, M. B., & Leibler, S. (2000). A synthetic oscillatory network of transcriptional regulators. Nature, 403(6767), 335–8. doi:10.1038/35002125

Falcon, C. M., & Matthews, K. S. (2000). Operator DNA sequence variation enhances high affinity binding by hinge helix mutants of lactose repressor protein. Biochemistry, 39(36), 11074–83. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/10998245

Gottesman, S., Roche, E., Zhou, Y., & Sauer, R. T. (1998). The ClpXP and ClpAP proteases degrade proteins with carboxy-terminal peptide tails added by the SsrA-tagging system. Genes & development, 12(9), 1338–47. Retrieved from http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=316764&tool=pmcentrez&rendertype=abstract

Gruber, T. M., & Gross, C. A. (2003). Multiple sigma subunits and the partitioning of bacterial transcription space. Annual review of microbiology, 57, 441–66. doi:10.1146/annurev.micro.57.030502.090913

Guido, N. J., Wang, X., Adalsteinsson, D., McMillen, D., Hasty, J., Cantor, C. R., … Collins, J. J. (2006). A bottom-up approach to gene regulation. Nature, 439(7078), 856–60. doi:10.1038/nature04473

Hook-Barnard, I. G., & Hinton, D. M. (2007). Transcription initiation by mix and match elements: flexibility for polymerase binding to bacterial promoters. Gene regulation and systems biology, 1, 275–93. Retrieved from http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2613000&tool=pmcentrez&rendertype=abstract

Imburgio, D., Rong, M., Ma, K., & McAllister, W. T. (2000). Studies of Promoter Recognition and Start Site Selection by T7 RNA Polymerase Using a Comprehensive Collection of Promoter Variants†. Biochemistry, 39(34), 10419–10430. doi:10.1021/bi000365w

Joung, J. K., Koepp, D. M., & Hochschild, A. (1994). Synergistic activation of transcription by bacteriophage lambda cI protein and E. coli cAMP receptor protein. Science (New York, N.Y.), 265(5180), 1863–6. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/8091212

Kamionka, A., Bogdanska-Urbaniak, J., Scholz, O., & Hillen, W. (2004). Two mutations in the tetracycline repressor change the inducer anhydrotetracycline to a corepressor. Nucleic acids research, 32(2), 842–7. doi:10.1093/nar/gkh200

Kim, Y. I., & Hu, J. C. (1997). Oriented DNA binding by one-armed lambda repressor heterodimers and contacts between repressor and RNA polymerase at P(RM). Molecular microbiology, 25(2), 311–8. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/9282743

Knaus, R., & Bujard, H. (1988). PL of coliphage lambda: an alternative solution for an efficient promoter. The EMBO journal, 7(9), 2919–23. Retrieved from /pmc/articles/PMC457087/?report=abstract

Krueger, M., Scholz, O., Wisshak, S., & Hillen, W. (2007). Engineered Tet repressors with recognition specificity for the tetO-4C5G operator variant. Gene, 404(1-2), 93–100. doi:10.1016/j.gene.2007.09.002

Larson, M. H., Greenleaf, W. J., Landick, R., & Block, S. M. (2008). Applied force reveals mechanistic and energetic details of transcription termination. Cell, 132(6), 971–82. doi:10.1016/j.cell.2008.01.027

Lewis, M. (2005). The lac repressor. Comptes rendus biologies, 328(6), 521–48. doi:10.1016/j.crvi.2005.04.004

Lutz, R., & Bujard, H. (1997). Independent and tight regulation of transcriptional units in Escherichia coli via the LacR/O, the TetR/O and AraC/I1-I2 regulatory elements. Nucleic acids research, 25(6), 1203–10. Retrieved from http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=146584&tool=pmcentrez&rendertype=abstract

Müller, J., Oehler, S., & Müller-Hill, B. (1996). Repression of lac promoter as a function of distance, phase and quality of an auxiliary lac operator. Journal of molecular biology, 257(1), 21–9. doi:10.1006/jmbi.1996.0143

Nandagopal, N., & Elowitz, M. B. (2011). Synthetic biology: integrated gene circuits. Science (New York, N.Y.), 333(6047), 1244–8. doi:10.1126/science.1207084

Ozbudak, E. M. (2004). Noise and Multistability in Gene Regulatory by. Growth (Lakeland).

Ramos, J. L., Martínez-Bueno, M., Molina-Henares, A. J., Terán, W., Watanabe, K., Zhang, X., … Tobes, R. (2005). The TetR family of transcriptional repressors. Microbiology and molecular biology reviews: MMBR, 69(2), 326–56. doi:10.1128/MMBR.69.2.326-356.2005

Randall, A., Guye, P., Gupta, S., Duportet, X., & Weiss, R. (2011). Design and connection of robust genetic circuits. Methods in enzymology, 497, 159–86. doi:10.1016/B978-0-12-385075-1.00007-X

Salis, H. M., Mirsky, E. A., & Voigt, C. A. (2009). Automated design of synthetic ribosome binding sites to control protein expression. Nature biotechnology, 27(10), 946–50. doi:10.1038/nbt.1568

Schleif, R. (2010). AraC protein, regulation of the l-arabinose operon in Escherichia coli, and the light switch mechanism of AraC action. FEMS microbiology reviews, 34(5), 779–96. doi:10.1111/j.1574-6976.2010.00226.x

Shaner, N. C., Steinbach, P. A., & Tsien, R. Y. (2005). A guide to choosing fluorescent proteins. Nature methods, 2(12), 905–9. doi:10.1038/nmeth819

Slusarczyk, A. L., Lin, A., & Weiss, R. (2012). Foundations for the design and implementation of synthetic genetic circuits. Nature reviews. Genetics, 13(6), 406–20. doi:10.1038/nrg3227

Webster, C., Merryweather, A., & Brammar, W. (1992). Efficient repression by a heterodimeric repressor in Escherichia coli. Plant biology, 6, 371–377.

III. Engineering of genetic oscillators

III.1. Introduction

Oscillations are the source of spatio-temporal organization not only for living organisms (Berridge, Bootman, & Lipp, 1998), but for the whole physical universe as we know it. It is the oscillation of the Earth between the two extrema of its orbit that allows for the mild conditions on its surface and for a climate in which all life forms can thrive. Two effects drive our planet between its aphelion (the point of the orbit furthest from the Sun) and its perihelion (the point of the orbit closest to the Sun). Gravity attracts the Earth towards the Sun, causing it to fall faster and faster, and the momentum accumulated during this fall carries it back out into space. The momentum is gradually consumed by the opposing gravitational force until the Earth's inertia is so weak that gravity once again becomes predominant and the system returns to its initial state. During this oscillation the climate elements also change periodically, preventing extremes in temperature, humidity, radiation exposure and all other physical conditions that are vital for the survival of organisms.

These annual and daily fluctuations of the environmental properties require that organisms be able to adapt to them, or even to prepare for them in advance, in order to survive and develop (Vilar, Kueh, Barkai, & Leibler, 2002). A linear sensing system would suffice if the organism only had to conform directly to its environment, but not if complicated rearrangements of physiology must be completed before a climate change occurs. In the latter case only a non-linear system capable of entraining to the period of the fluctuating environmental factor could cope efficiently with the task. Thus, there is a strong selective pressure towards the development of oscillatory control systems in living organisms. Indeed, all synchronization, be it spatial or temporal, is performed by regulatory networks capable of oscillating within a certain set of parameters. From the timing of the cell cycle (Lavi, Ginsberg, & Louzoun, 2011) and the positioning of the division site (Shih & Zheng, 2013) during cell division, through pattern formation (Wu, Wu, & De Camilli, 2013) and development (González, Manosalva, Liu, & Kageyama, 2013), neural transmission (Hájos et al., 2013) and muscle contraction (M. De Bock et al., 2012), to mating synchronization (Bertossa et al., 2013), the regulation of all processes that provide life with the capability of survival, growth and expansion is based on biochemical oscillators.

Before tackling the problem of how oscillations are created in living systems, we need a definition of oscillators for the purposes of our research. Given the above examples, a biological system must meet two obvious requirements to be considered oscillatory. First, the system has to be dynamical, i.e. constantly changing with time. Since living organisms must constantly consume energy from the environment and develop all the time, an equilibrium state is incompatible with such system properties. Second, the organisms have to be capable of adapting properly to changes in the environmental conditions. Those changes are periodic and repeat themselves, therefore the open dynamic systems called living organisms need to be able to revert periodically to a previous state. Given also the requirement to anticipate environmental changes, this reversion has to be based on the system's own current state and not necessarily on the sensing of environmental conditions. Consequently, in order to drive their own oscillations internally, the organisms need to use the output function of the oscillatory logic as its own input function, i.e. the system should have a feedback loop. This also requires that there be a set of input parameters, defining an input function for the processing logic, that is returned as exactly the same set of parameters in the output function, which in turn is used as feedback (Illustration 9). Then and only then can sustained oscillations exist.


Illustration 9: Generalized scheme of an oscillator (above). The device is a black box, which processes the input signal for a given time and produces an output signal. Note that the period of the oscillations (below) is equal to the processing time.

III.2. Mathematical modeling

The theoretical study of biochemical oscillators was addressed long ago and has been the subject of numerous research efforts. Goodwin (Goodwin, 1965) was the first to develop and work out mathematically the idea that a delayed negative feedback system would suffice to produce oscillations. Such a system would comprise three chemical species: the first (A) is constantly produced and some of it is converted to the second species (B), some of the second species is converted to the third one (C), and the latter represses the synthesis of the first species. In practice, the Goodwin oscillator could consist of a gene encoding a repressor protein, the mRNA transcribed from the gene at a constant basal rate, and the protein itself in its unfolded and folded states. Thus A stands for the mRNA, B represents the unfolded protein and C is the final repressing form of the protein (e.g. a folded multimer) (Illustration 10). The original Goodwin model is based on the following equations:

$$\frac{dA}{dt} = k_1 \frac{K_i^n}{K_i^n + C^n} - k_2 A \tag{1}$$

$$\frac{dB}{dt} = k_3 A - k_4 B \tag{2}$$

$$\frac{dC}{dt} = k_5 B - k_6 C \tag{3}$$

Where $k_1$ is the basal rate of mRNA synthesis, $k_2$ is the mRNA decay rate, $k_3$ is the translation rate, $k_4$ is the protein degradation rate, $k_5$ is the folding rate, $k_6$ is the rate of degradation of the active form of the protein and $K_i$ is the constant of inhibition, which is directly related to the dissociation constant of the inhibitory protein binding to DNA. The term describing the inhibitory function, $\frac{K_i^n}{K_i^n + C^n}$, is known as the Hill function, in which $n$ is considered to be a measure of the non-linearity of the response of the system to the inhibitor concentration.
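To make the role of $n$ concrete, the following minimal Python sketch evaluates the repressive Hill term of equation (1) for a few values of the Hill coefficient. The function name and the numerical values are illustrative assumptions, not parameters measured in this work.

```python
def hill_repression(c: float, k_i: float = 1.0, n: int = 1) -> float:
    """Repressive Hill term K_i^n / (K_i^n + C^n) from equation (1)."""
    return k_i**n / (k_i**n + c**n)


# Compare the response below, at, and above the inhibition constant K_i = 1.
concentrations = (0.5, 1.0, 2.0)
for n in (1, 2, 4, 8):
    values = [round(hill_repression(c, 1.0, n), 4) for c in concentrations]
    print(f"n = {n}: {values}")
```

Whatever the value of $n$, the term equals one half at $C = K_i$; increasing $n$ only sharpens the transition around that point, which is the non-linearity referred to in the text.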

The input function to this system is the constant production of A. However, the concentration of A defines the concentration of B and the concentration of B defines the concentration of C. In addition, the concentration of C affects the concentration of A, thus effectively closing the feedback loop.


Illustration 10: The Goodwin oscillator. The individual biochemical transformations depicted are generalized representations of processes consisting of numerous elementary steps. The schematic representation of the interaction architecture allows direct visualization of the logic core of the oscillator. Note the specific symbols used to represent activation and repression (the colors are not obligatory, but are added for improved clarity).

Hence, the concentrations of the chemical species in the Goodwin oscillator change with time, i.e. they have dynamical properties, and there is a feedback control loop driving the behavior of the whole system. The described system thus exhibits the properties we have identified as prerequisites for sustained oscillations, and there exists a set of parameters for which this chemical system generates sustained oscillations. One obvious solution for the equivalence between the input and the output function is passive transmission instead of processing by the system. In that case the system would stall in a steady state and would by definition always conform to the second condition we defined. However, the system would then lose its dynamism and could not be regarded as oscillatory. Unfortunately, for a system as small as Goodwin's, the steady state is reached very easily. There is a direct linear dependence between the concentration of A and the concentration of C, and any small increase in A means an almost immediate increase in C. However, an increase in the concentration of C leads to repression of the synthesis of A, i.e. a decrease in the concentration of A. Therefore, A would constantly oscillate around some equilibrium point, and if the processing time of the input is short enough, this oscillation would be so fast and so small in amplitude that it would be practically nonexistent. Combined with the naturally occurring dissipation of the chemical signal (Gonze & Goldbeter, 2006), the steady state becomes the imminent end for the system we are studying.

Consequently, one way to obtain sustained oscillations in such a system is to increase the amplitude of the oscillations. This can be accomplished by changing the linear dependence between the concentrations of the different species and the synthesis of A to a steeper, non-linear function. In practice, non-linearity arises from molecular mechanisms in which the cooperative effect of a few representatives of the same species is larger than the simple sum of their individual actions. For example, some enzymes have more than one allosteric center for ligand binding, and the full enzyme efficiency is only reached when all those centers are occupied. Of course, such a biochemical mechanism could be represented explicitly, with a dedicated equation describing each of the possible enzyme states, avoiding non-linear terms in the mathematical description. However, for the sake of simplicity and/or because the precise chemical mechanism is unknown, non-linear mathematical formulations of biochemical processes are widely adopted. Effectively, all biochemical non-linear functions are an implicit form of several species transformation steps. For example, the Hill function used for modeling the Goodwin oscillator is derived from the following assumptions. The inhibitory protein sequesters some of the polymerase complexes through direct chemical bonding:

$$P + C \underset{k_2}{\overset{k_1}{\rightleftharpoons}} PC \tag{4}$$

Where C is the inhibitory protein, P is the polymerase complex, PC is the bound polymerase, and k1 and k2 are the reaction rates in the forward and reverse directions of this binding reaction. Consequently, the fraction of uninhibited polymerase available for mRNA synthesis is equal to the ratio between the unbound polymerase complexes and the total number of chemical species having the polymerase as a component. Assuming that a steady state is reached, this ratio can be derived as follows:

$$\frac{P}{P + PC} = \frac{P}{P + \frac{P \times C}{K_D}} = \frac{P}{P} \times \frac{1}{1 + \frac{C}{K_D}} = \frac{K_D}{K_D + C} \tag{5}$$

Where $K_D$ is the dissociation constant of the polymerase/inhibitor complex and denotes the steady-state ratio between the dissociated state of the complex, $P \times C$, and the bound state $PC$.

This is exactly the form of the Hill equation for $n = 1$ and $K_i = K_D$. However, if the number of inhibitory molecules binding the polymerase is more than 1, then the equations change as follows:

$$P + nC \rightleftharpoons PC_n \tag{6}$$

$$\frac{P}{P + PC_n} = \frac{P}{P + \frac{P \times C^n}{K_D}} = \frac{P}{P} \times \frac{1}{1 + \frac{C^n}{K_D}} = \frac{K_D}{K_D + C^n} \tag{7}$$

Here, to obtain the form used in (1), $K_i^n = K_D$ must hold, and this is the general equation. $K_i$ is also defined as the concentration of the inhibitor at which the enzyme activity reaches half of its normal rate.
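The relation between the binding equilibrium and the Hill term can be checked numerically. The sketch below is a minimal illustration that assumes the inhibitor is in large excess, so that its free concentration is not noticeably reduced by binding; the function names and the numerical values are hypothetical.

```python
def free_polymerase_fraction(c: float, k_d: float, n: int = 1) -> float:
    """Fraction of unbound polymerase, K_D / (K_D + C^n), as in equation (7)."""
    return k_d / (k_d + c**n)


def half_inhibition_concentration(k_d: float, n: int) -> float:
    """Inhibition constant K_i defined through K_i^n = K_D."""
    return k_d ** (1.0 / n)


# Illustrative values (assumptions): total polymerase, inhibitor level,
# dissociation constant and number of inhibitor molecules per complex.
p_total, c, k_d, n = 100.0, 3.0, 16.0, 2

p_free = p_total * free_polymerase_fraction(c, k_d, n)
p_bound = p_free * c**n / k_d  # mass-action equilibrium: PC_n = P * C^n / K_D
k_i = half_inhibition_concentration(k_d, n)

print("free:", p_free, "bound:", p_bound, "K_i:", k_i)
```

The free and bound pools add up to the total polymerase, and at $C = K_i$ exactly half of the polymerase is free, which matches the definition of $K_i$ given above.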


Therefore, the binding of ligands to enzymes is very precisely described using the Hill formalism when a single binding reaction is involved, i.e. for $n = 1$. However, if, as in our case, multiple molecules are expected to bind the enzyme, this approach rests on a statistically very demanding assumption. Explicitly, in order to follow Hill behavior all molecules involved in the reaction need to bind simultaneously, and the probability of such an event tends to zero (Illustration 11).

Therefore the Goodwin oscillator only seemingly involves so few intermediate chemical species. Still, this is the simplest mathematical description available, which makes it very useful for numerical simulations and easy to understand and analyze mechanistically. Consequently, the Hill form of the equation describing biochemical behaviors such as repression or activation should be regarded as simple and useful in some cases. However, the Hill coefficient should not be taken as a measure of the number of molecules involved in the reaction, but merely as a lower bound on it (Weiss, 1997). The precise calculations for the numerical properties of the required feedback function were developed by Griffith (Griffith, 1968). According to his work, the minimal value of the Hill coefficient allowing sustained oscillations in the Goodwin oscillator is 8.
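The dependence on the Hill coefficient can be explored with a short numerical experiment. The sketch below integrates equations (1)-(3) with a hand-written fourth-order Runge-Kutta scheme and compares a Hill coefficient below the threshold of 8 with one above it. All parameter values (equal degradation rates of 0.1, unit synthesis rates, $K_i = 1$) are arbitrary illustrative assumptions, and the helper names are hypothetical.

```python
def goodwin_rhs(state, n, k1=1.0, k2=0.1, k3=1.0, k4=0.1, k5=1.0, k6=0.1, k_i=1.0):
    """Right-hand side of the Goodwin equations (1)-(3)."""
    a, b, c = state
    da = k1 * k_i**n / (k_i**n + c**n) - k2 * a
    db = k3 * a - k4 * b
    dc = k5 * b - k6 * c
    return (da, db, dc)


def rk4_step(state, dt, n):
    """One classical Runge-Kutta step for the three-variable system."""
    def shifted(base, slope, factor):
        return tuple(x + factor * s for x, s in zip(base, slope))

    s1 = goodwin_rhs(state, n)
    s2 = goodwin_rhs(shifted(state, s1, dt / 2), n)
    s3 = goodwin_rhs(shifted(state, s2, dt / 2), n)
    s4 = goodwin_rhs(shifted(state, s3, dt), n)
    return tuple(
        x + dt / 6 * (p + 2 * q + 2 * r + s)
        for x, p, q, r, s in zip(state, s1, s2, s3, s4)
    )


def relative_amplitude(n, t_end=2000.0, dt=0.05, window=400.0):
    """Peak-to-peak swing of C over the final window, relative to its mean."""
    state = (0.0, 0.0, 0.0)
    steps = int(t_end / dt)
    first_recorded = steps - int(window / dt)
    c_values = []
    for i in range(steps):
        state = rk4_step(state, dt, n)
        if i >= first_recorded:
            c_values.append(state[2])
    mean_c = sum(c_values) / len(c_values)
    return (max(c_values) - min(c_values)) / mean_c


damped = relative_amplitude(n=4)      # below the threshold: settles to steady state
sustained = relative_amplitude(n=12)  # above the threshold: keeps oscillating
print("n = 4 :", damped)
print("n = 12:", sustained)
```

With equal degradation rates, linear stability analysis of the steady state gives the condition $n > 8$ for instability, so the run with $n = 4$ is expected to relax to a constant level while the run with $n = 12$ is expected to keep oscillating.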

Returning to the requirements for sustained oscillations, we can address the other source of stabilization of the system at the steady state, i.e. the processing time. We have already discussed that if the latter is very short, no significant deviation from the steady state can be accomplished within one cycle of material/information processing and the system remains stable. However, if this processing time is increased, the generation rate of the initial chemical species is enough to create a significant burst in the concentration of molecules, well above the steady-state level. This burst causes the synthesis of a large number of active repressors, which eventually block their own synthesis. Upon complete degradation of the existing chemical species, however, the system is restarted in its initial state and the oscillatory cycle is closed. Such a time delay is caused either by a very slow processing reaction or by a large number of intermediate steps involved in some of the generalized transformations during the processing. The former, however, is also assumed to be caused by a large number of unknown intermediate steps. Therefore, mechanistically we can adopt the understanding that it is always the small steps comprising the major process that provoke the time delay.

Illustration 11: Graphical representation of the assumption underlying the Hill formalism. In order for it to be valid all ligand molecules should bind their target simultaneously (above). The realistic mechanisms include sequential (below) or independent binding.

Consequently, the straightforward mathematical description of such a delayed event would be the explicit formulation of all the known small reactions (Gonze & Abou-Jaoudé, 2013). However, this approach leads to systems of equations that are quite difficult to solve, and not only because of the number of variables and parameters involved. The biggest numerical challenge for the simulation of such systems is their “stiffness” (Ilie, 2012). This means that rate constants which differ significantly coexist in the same system of equations. An algorithm used to simulate this type of combination has to be capable of simultaneous numerical integration of equations whose time constants are orders of magnitude apart. Systems are usually defined as “stiff” when the ratio between their fastest and slowest reaction rates is of the order of 10^4 or larger. Any system meeting this criterion can only be simulated with special algorithms and is by definition extremely demanding in terms of CPU power. Unfortunately, most of the naturally occurring regulatory biochemical systems in living organisms are exactly of that type.
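The practical consequence of stiffness can be illustrated with a toy example that is not taken from this work: two independent first-order decays whose rate constants differ by a factor of 10^4. The sketch compares the explicit (forward) Euler method with the implicit (backward) Euler method, a standard textbook choice for stiff problems; the rate constants, step size and function names are assumptions made for the illustration.

```python
def stiffness_ratio(rates):
    """Ratio between the fastest and the slowest rate constant."""
    return max(rates) / min(rates)


def explicit_euler_decay(rate, dt, steps, x0=1.0):
    """Forward Euler for dx/dt = -rate * x (stable only if dt < 2 / rate)."""
    x = x0
    for _ in range(steps):
        x += dt * (-rate * x)
    return x


def implicit_euler_decay(rate, dt, steps, x0=1.0):
    """Backward Euler for dx/dt = -rate * x (stable for any positive dt)."""
    x = x0
    for _ in range(steps):
        x = x / (1.0 + rate * dt)
    return x


k_fast, k_slow = 1.0e4, 1.0
dt = 1.0e-3  # perfectly adequate for the slow reaction, too large for the fast one

print("stiffness ratio:", stiffness_ratio([k_fast, k_slow]))
print("explicit, fast species:", explicit_euler_decay(k_fast, dt, 50))
print("implicit, fast species:", implicit_euler_decay(k_fast, dt, 50))
print("explicit, slow species:", explicit_euler_decay(k_slow, dt, 50))
```

The step size is dictated by the fastest reaction even though the dynamics of interest are set by the slowest one: with an explicit method the fast species diverges unless the step is reduced by several orders of magnitude, which is what makes explicit simulation of stiff systems so expensive.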

To avoid that problem, another mathematical formalism is usually adopted. It is based on the fact that only the dynamics of the slow reactions in a stiff system require explicit modeling. The fast reactions reach equilibrium much earlier than the slow ones and, consequently, can be assumed to be always at their steady state for the given conditions. Thus, the linear differential equations describing the dynamics of those reactions can be replaced with linear algebraic equations that require no special numerical methods to solve. Additionally, it is precisely the small intermediate steps that are the fastest in our types of systems. However, the immediate reaching of the steady state is just an assumption, and those reactions still require some time, small but nonzero, to reach it. Therefore, the fast intermediate steps can be incorporated as additional terms in the other equations governing the dynamics of the oscillatory system, but the time required for the concentrations of the species they describe to stabilize needs to be modeled explicitly (Novák & Tyson, 2008). If we return to the governing equations of the Goodwin model and assume that the steady state is reached fast, we obtain the following. For equation (2), $\frac{dB}{dt} = k_3 A - k_4 B$, the steady state means that all concentrations are constant, hence $\frac{dB}{dt} = 0$, consequently $k_3 A - k_4 B = 0$ and $k_3 A = k_4 B$, or finally for B:

$$B = \frac{k_3}{k_4} A \tag{8}$$

Similarly, for C we can easily derive that:

$$C = \frac{k_5}{k_6} B \tag{9}$$

which, based on (8), becomes:

$$C = \frac{k_5 B}{k_6} = \frac{k_5 \frac{k_3 A}{k_4}}{k_6} = \frac{k_5 k_3}{k_6 k_4} A = k_C A \tag{10}$$

where we substituted all rate constants with a single one, $k_C$. Finally, based on (10), we can also redefine the only remaining equation of our system, (1), as:

$$\frac{dA}{dt} = k_1 \frac{K_i^n}{K_i^n + C^n} - k_2 A = k_1 \frac{K_i^n}{K_i^n + k_C^n A^n} - k_2 A \tag{11}$$
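The steady-state reduction can be verified numerically. The following minimal sketch holds A fixed, integrates equations (2) and (3) until B and C stop changing, and compares the result with the lumped relation $C = k_C A$ of equation (10). The rate constants and the helper names are arbitrary illustrative assumptions.

```python
def lumped_constant(k3, k4, k5, k6):
    """k_C = (k3 * k5) / (k4 * k6), the single constant introduced in (10)."""
    return (k3 * k5) / (k4 * k6)


def relax_b_and_c(a, k3, k4, k5, k6, t_end=100.0, dt=0.01):
    """Integrate equations (2) and (3) with A held constant (forward Euler)."""
    b, c = 0.0, 0.0
    for _ in range(int(t_end / dt)):
        db = k3 * a - k4 * b
        dc = k5 * b - k6 * c
        b += dt * db
        c += dt * dc
    return b, c


def reduced_rhs(a, k1, k2, k_i, k_c, n):
    """Right-hand side of the single remaining equation (11)."""
    return k1 * k_i**n / (k_i**n + (k_c * a) ** n) - k2 * a


# Illustrative values (assumptions, not fitted parameters).
a_fixed, k3, k4, k5, k6 = 0.4, 2.0, 1.0, 3.0, 0.5

k_c = lumped_constant(k3, k4, k5, k6)
b_steady, c_steady = relax_b_and_c(a_fixed, k3, k4, k5, k6)
print("k_C * A:", k_c * a_fixed, "simulated C:", c_steady)
```

Once B and C have relaxed, the simulated value of C coincides with $k_C A$, so the three-equation system and the reduced equation (11) agree whenever the intermediate steps are fast compared with the changes in A.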

However, (11) contains no explicit time delay and thus assumes that the nascent peptide becomes active immediately, which is mechanistically untrue. Therefore, we should include a term for the explicit time delay of the protein participating in the inhibitory function. This is usually written as $(t - \tau)$, where $\tau$ is the time delay and $t$ is the current time. Consequently, a simplified mathematical description of a Goodwin-type oscillator is the following:

$$\frac{dA(t)}{dt} = k_1 \frac{K_i^n}{K_i^n + k_C^n A(t - \tau)^n} - k_2 A(t) \tag{12}$$
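A delay equation of this kind can be simulated by keeping the past values of A in a buffer. The sketch below is a minimal forward-Euler implementation of equation (12) with a constant history for $t \le 0$; the parameter values ($k_1 = 4$, $k_2 = 1$, $K_i = k_C = 1$, $n = 4$) and the function names are illustrative assumptions. For these values, linearization around the steady state gives a critical delay of roughly 0.75 time units, so one delay below and one above that value are compared.

```python
def simulate_delayed_goodwin(tau, k1=4.0, k2=1.0, k_i=1.0, k_c=1.0, n=4,
                             a0=0.5, t_end=200.0, dt=0.01):
    """Forward-Euler integration of equation (12) with A(t) = a0 for t <= 0."""
    delay_steps = int(round(tau / dt))
    history = [a0] * (delay_steps + 1)  # stored past values of A
    for _ in range(int(t_end / dt)):
        a_now = history[-1]
        a_delayed = history[-1 - delay_steps]  # A(t - tau)
        production = k1 * k_i**n / (k_i**n + (k_c * a_delayed) ** n)
        history.append(a_now + dt * (production - k2 * a_now))
    return history


def late_amplitude(trajectory, fraction=0.25):
    """Peak-to-peak amplitude over the final part of the trajectory."""
    tail = trajectory[-int(len(trajectory) * fraction):]
    return max(tail) - min(tail)


short_delay = late_amplitude(simulate_delayed_goodwin(tau=0.2))
long_delay = late_amplitude(simulate_delayed_goodwin(tau=2.0))
print("tau = 0.2:", short_delay)
print("tau = 2.0:", long_delay)
```

With the short delay the transient dies out and A settles at its steady state, whereas with the long delay the same single equation sustains oscillations, even though the Hill coefficient is well below the value required by the three-variable model without explicit delay.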

By using explicit time-delay models we can reduce the mathematical description of a complicated regulatory genetic circuit to a single differential equation. If the parameters are evaluated properly, we lose none of the information regarding the general dynamics of the system or the concentration of its main components. Furthermore, explicit modeling of the intermediate steps can obscure the importance of the values of some of the parameters of the system. Properly elaborated explicit time-delay models are much more informative, because they allow us to single out the governing parameters of the system and to experiment with them. The standing problem in such cases is understanding the precise physical meaning of the generalized parameters and their “engineer-ability”.

The conditions described so far for the non-linearity of the response and the length of the processing time unfortunately do not exhaust all the requirements for the dynamism of the system behavior. Even if the response of the oscillatory system is extremely non-linear and the processing time of the chemical signal is sufficiently long, the oscillations might still be damped and unsustained. This effect is usually provoked by a long protein lifetime. Mechanistically, the phenomenon is explained as follows. Our non-linear system produces a very large number of mRNA molecules, which in turn are used to produce a very large number of proteins, which fold and form efficient repressors. The latter prevent the synthesis of new mRNA, the old mRNA gets degraded and no more new proteins are synthesized. However, if the existing protein molecules are long-lived, they are degraded and diluted only gradually and their concentration falls extremely slowly. As a consequence, the gene remains repressed for a long time. Finally, some transcription of mRNA reappears, but not at its maximum, because there is partial repression from the remaining repressor molecules. Hence, new proteins are synthesized as an output, but not many. They repress the gene a little more and a new oscillatory period starts. Such a system tends to a steady state and cannot produce sustained oscillations, because the slowly degraded protein generates a stable pool of repressor, which acts as a damper on the whole system. To escape from the steady state the system needs to degrade the protein very fast with respect to the time delay of the system, which guarantees good dynamism and the capacity to oscillate continuously between zero and maximal protein synthesis. This is accomplished in nature by protein-degrading enzymatic machines, i.e. proteases. The latter have very high processivity and are attracted to the designated proteins by a molecular marker.
The dynamic behavior of the proteases has been found to be properly described by Michaelis-Menten kinetics (Wong, Tsai, & Liao, 2007) and could be readily integrated in the mathematical description of our stable oscillator:

$$D = \frac{D_{max}}{A + K_m} \qquad (13)$$

Where D is the actual degradation rate per unit concentration of tagged protein (so that the protein is removed at the rate D × A), Dmax is the maximal degradation rate, A is the tagged protein concentration and Km is the concentration at which the protease exhibits half of its maximal degradation rate. This kinetics is based on the following theoretical premises. There is a limited number of protease complexes in the cytoplasm and if the number of tagged molecules is large enough those proteases would be saturated and the degradation rate would approach a certain maximum, i.e. a saturation level. Therefore, the more tagged proteins there are in the system, the more difficult it would be for the proteases to remove them. An over-saturation with one protein would slow down the degradation of all the tagged proteins in the cell. Therefore, if a highly-expressed tagged molecule exhibits a certain dynamic pattern, all the other tagged proteins would also follow the same pattern, at least regarding their degradation. This way, the protease machinery might function as a synchronization mechanism in dynamic systems with more than one tagged protein. Consequently, over-expressed proteins should be avoided in order to prevent unwanted synchronization effects. Still, by introducing active protein degradation in our system we could keep the brevity and the clarity of the mathematical formalism and at the same time provide a fast response capacity and better dynamic properties.
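The saturation argument above can be illustrated numerically. The following is a minimal Python sketch, assuming the reading of equation (13) in which D multiplies the tagged protein concentration A; the function names and the values of Dmax and Km are hypothetical and chosen only for illustration.

```python
def degradation_coefficient(a, d_max, k_m):
    """Per-concentration degradation coefficient D = Dmax / (A + Km), as in equation (13)."""
    return d_max / (a + k_m)


def degradation_flux(a, d_max, k_m):
    """Amount of tagged protein removed per unit time, i.e. D * A."""
    return degradation_coefficient(a, d_max, k_m) * a


# Illustrative (assumed) parameter values, arbitrary units.
D_MAX = 10.0
K_M = 5.0

for a in (0.5, 5.0, 50.0, 500.0):
    d = degradation_coefficient(a, D_MAX, K_M)
    flux = degradation_flux(a, D_MAX, K_M)
    # As A grows, the flux approaches D_MAX while the per-molecule coefficient drops.
    print(f"A = {a:6.1f}  D = {d:6.3f}  flux = {flux:6.3f}")
```

With these assumed values the flux equals half of Dmax at A = Km and approaches Dmax for large A, while the per-molecule coefficient D keeps decreasing, which is the saturation effect described in the text.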

Up till now we have dealt with the problems for genetic oscillators that arise from the system's amplitude or time of response. We also introduced active protein degradation in order to avoid damping owing to reduced turnover of some of the chemical species. An oscillator constructed with those requirements in mind should be able to exhibit sustained oscillations for a long period of time. However, there is one major assumption implicitly stated in all the modeled mechanisms that we have disregarded until now. We assumed that the overall system behavior and the rate and direction of the individual constituent reactions are defined by the state of the system, i.e. by the values of the parameters and the concentrations of the variables (the chemical species). We stated that the level of mRNA synthesis depends in some describable way on some properties of the encoding gene and the concentration of the repressor protein, and that the protein synthesis depends on the concentration of the mRNA. Finally, we modeled the active repressor concentration as depending on the concentration of the protein molecules. We regarded this system as behaving according to some laws and the status of the system itself and the environment; in other words, we assumed that this system is deterministic. Unfortunately, at the intimate molecular level biochemical reactions are not entirely deterministic. The intersection of the trajectories of the DNA molecule encoding a protein, of an RNA polymerase and of a repressor complex does indeed depend on the concentrations of those chemical species, but not solely. The information molecules, the regulatory proteins and the large enzymes exist in the cell in such small numbers that their interaction at a certain moment is largely a random event. Furthermore, even if this interaction happens, it is not at all irreversible. A polymerase complex and a ribosome are basically “sliding” along the information molecules of the cells. Sometimes they “fall off”, sometimes they propagate in the wrong direction (L. V Bock et al., 2013). Overall, genetic expression is subjected to numerous sources of noise (Michael B. Elowitz, Levine, Siggia, & Swain, 2002) generated by the random behavior of the expression machinery (intrinsic noise) or by changes in intracellular parameters (extrinsic noise). Therefore, a deterministic model could be used to describe the averaged behavior of some genetic circuit, but could not be used for realistic prediction of the status of the system at any time point. To address such a type of system one needs to adopt an approach that takes into account the intimate “decision making” at the molecular level in the living cells.
There is a dedicated mathematical apparatus developed precisely for the modeling of stochastic biochemical network interactions (Gillespie, Hellander, & Petzold, 2013). The main idea that differentiates this approach (the Gillespie algorithm) is the understanding that biochemical reactions are performed in discrete reversible steps. The simulation is performed for a limited number of molecules step by step, where at every step a random number is used to determine the next state of each molecule, selecting from all states accessible from its current state. The probability of each reaction takes into account the governing rate constants. This way, this type of modeling avoids results of the type “3.5 molecules” and also takes into account the random nature of the events of molecular binding and unbinding, but still maintains the probability ratio defined by the nature of the reactions. This type of modeling is very precise and could be used to analyze the effect that stochasticity exerts on the behavior of dynamic systems; however, it is computationally very demanding because of the need to model each discrete reaction step.
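To make the step-by-step procedure concrete, the following is a minimal Python sketch of a Gillespie-type simulation for the simplest possible case, a single protein that is synthesized at a constant rate and degraded in a first-order reaction. The reaction scheme, the function name and all rate values are assumptions made for illustration; they are not taken from the cited work or from the circuits discussed here.

```python
import random


def gillespie_birth_death(k_syn, k_deg, n0, t_end, seed=0):
    """Stochastic simulation of synthesis (rate k_syn) and degradation (rate k_deg * n)."""
    rng = random.Random(seed)
    t, n = 0.0, n0
    times, counts = [t], [n]
    while True:
        a_syn = k_syn          # propensity of the synthesis reaction
        a_deg = k_deg * n      # propensity of the degradation reaction
        a_total = a_syn + a_deg
        if a_total == 0.0:
            break              # no reaction can occur any more
        # Waiting time to the next reaction is exponentially distributed.
        t += rng.expovariate(a_total)
        if t > t_end:
            break
        # Choose which reaction fires, with probability proportional to its propensity.
        if rng.random() * a_total < a_syn:
            n += 1
        else:
            n -= 1
        times.append(t)
        counts.append(n)
    return times, counts


# Illustrative (assumed) rates: the deterministic steady state would be k_syn / k_deg = 100.
times, counts = gillespie_birth_death(k_syn=10.0, k_deg=0.1, n0=0, t_end=200.0, seed=1)
print(len(counts), "reaction events, final copy number:", counts[-1])
```

The copy number is always a non-negative integer and changes by exactly one molecule per event, which is the property that excludes results of the type “3.5 molecules”. Every single event has to be simulated, which is why the cost grows quickly with the number of molecules and reactions.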

When the Gillespie algorithm is used for the simulation of a delayed-negative-feedback oscillator some new features of the system's behavior become distinct. The individual pulses of oscillation are not precisely the same and the system exhibits a significant coefficient of variation between the individual oscillatory cycles (Mather, Bennett, Hasty, & Tsimring, 2009). This is due to the difference in the protein production rates owing to the stochasticity of the separate biochemical processes involved. To avoid that discrepancy, additional molecular mechanisms need to be employed in order to make the system more deterministic. The stochasticity in the transcription rate, for example, is due to the random and reversible binding of the RNA polymerase to the DNA. Changing the natural characteristics of that enzyme is a very tedious and laborious task, therefore we would try to alter the properties of the DNA and, more precisely, its capacity to attract the polymerase. The latter is accomplished by the addition of an activator to the reaction initiation mechanism. The methods to do that are thoroughly examined in the dedicated chapter. For now, it is important to note that such an addition to the protein synthesis mechanism is equivalent to the addition of a novel feedback loop to the logic of the entire system. If the repressor protein production is also coupled to the generation of an activator protein, then the system would become one with coupled positive and negative feedback loops. The mechanism by which such an oscillator would function is the following. Initially the two types of proteins are generated. They fold and form the active forms of an activator and a repressor. If the dynamics of both of the protein effectors are equal, then the two feedback signals would arrive simultaneously and the repression would take over and cancel the effect of the positive feedback. The system would behave in exactly the same manner as with only a negative feedback.
However, if the formation of the activator is faster than the formation of the repressor, the behavior would be much different. After the initial transcription, the activator would form faster and would cause the protein synthesis to reach its maximum. Then an even bigger quantity of the two proteins is synthesized. Upon formation of the repressor complex, the system would be blocked and both of the proteins would start to be degraded. When the repressor concentration reaches the lower repressing threshold, the system would enter a protein synthesis phase again. The activator would again be formed faster than the repressor and would cause the synthesis to reach its maximum, and thus the cycle is closed. Unlike the previous oscillator design, in this case the protein synthesis would almost always reach its maximum rate, thus keeping the shape of the oscillation pulses much more uniform and providing a genetic circuit with a much lower coefficient of variation (Illustration 12).

Illustration 12: A schematic representation of the architecture and modus operandi of a positive-negative-feedback oscillator. The initial state of the system is no expression and no proteins available (1). The expression of the activator (B) is faster and this type of protein starts to accumulate (2). When a certain concentration of activator is reached, the promoter is activated (3). Consequently, all proteins start to be expressed at maximum rate (4), which leads to accumulation of the repressor. The latter also reaches its threshold of activity and represses the whole system (5). This leads to recycling back to the original zero state (1).

We have already stated that oscillatory circuits are responsible for many types of spatio-temporal organization including pattern formation in organisms. Spontaneous pattern formation is a common trait of all organisms at different levels of organization including the multicellular one, be it in a higher organism or in a colony of microbes. Obviously such behavior requires synchronization between the individual cells, otherwise a repeated pattern would be impossible. In addition, synchronization of the states of individual cells requires information exchange. Hence, inter-cellular communication should exist in order for pattern formation to appear at the multicellular level (Turing, 1952). However, up till now we gave examples only of genetic circuits involving the production and the action of proteins generated and degraded inside the same cell. To obtain the needed cell-cell communication another class of molecules is required, namely signaling molecules. The most widely-used signaling molecules in synthetic biology are the N-Acyl homoserine lactones (AHLs or N-AHLs), which are involved in different natural synchronization processes such as the regulation of the bioluminescent protein luciferase in the luminescent bacterium Vibrio fischeri (Kaplan & Greenberg, 1985). Usually those molecules are denoted as “quorum sensing” molecules because their main function is to detect the presence of other signal generating cells and to trigger a certain type of response when a certain concentration threshold is reached. Therefore, they also exhibit genetic regulation abilities.

If we endeavor to utilize a quorum sensing molecule in an oscillatory circuit, it should be the carrier of the slower information channel, because it would require a significant amount of time to diffuse to the other cells of the colony. However, based on our theoretical considerations and the previous example, we know that in order to obtain sustained oscillations we should employ a fast positive feedback and a slower negative feedback in a genetic circuit. Therefore, we should aim to use AHL as the negative feedback signal.

An oscillatory circuit comprising an activator generated locally and a repressor controlled by a quorum sensing molecule would function as follows. Initially the activator and the quorum sensing molecule would start to be synthesized in the whole cellular population. More and more quorum sensing molecule would accumulate in different concentrations at different spots owing to the extrinsic noise. At some random positions, those concentrations would surpass a certain threshold and at these locations the repression effect would be triggered and the generation of AHL would cease. Consequently, the cells there would stop emitting new quorum sensing molecules and, if there is a mechanism for the removal of the latter, the cells could revert back to their initial state and thus an oscillatory formation of random spots with a particular expression pattern could be observed. However, if there is no dedicated mechanism for the fast removal of quorum sensing molecules in the repressed state of the circuit, then the spots with repressed state would stabilize for an indefinite time period and/or until the ratio between activated and repressed cells in the colony reaches some threshold value, i.e. a steady state. If, conversely, the long-ranged interaction controlled the positive feedback in the circuit, we would obtain a completely different behavior. The cells would simply enter a continuous activated steady state stimulated both by the AHL molecules produced by them and the ones coming from the environment.

In conclusion, we have demonstrated that a single negative feedback system with a time delay and non-linear response is capable of maintaining sustained oscillations for as long as it manages to generate substantial feedback input. However, due to the stochasticity of biochemical reactions involving low numbers of molecules, the individual pulses of the oscillation differ from each other, exhibiting a certain coefficient of variation. The latter could be reduced if a positive feedback loop coupled to the negative feedback one is added to the system. The effect of the positive feedback is like a switch turning the protein synthesis always to the maximal level. However, biological control systems are characterized by redundancy in the elements and spread spatial distribution in order to provide increased reliability and failure protection. Consequently, many biological regulatory network elements, including oscillators, consist of many more than the 3 to 4 elements we have discussed. Still, if they are regarded as logic elements, the underlying basic architecture could easily be identified. The two most important rules are that parallel interactions are simply added, whereas serial interactions are multiplied. For example, if A represses B and at the same time activates B through another channel, then both of those interactions are accounted for. However, if A represses B and B activates C, then A indirectly represses C; the intermediate B just prolongs the time delay of the effect. Conversely, if A represses B and B represses C, then A activates C (Illustration 13).
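The two simplification rules can be written down as a few lines of Python. This is only an illustrative sketch: encoding activation as +1 and repression as -1, and the function names `serial` and `parallel`, are conventions assumed here and not part of the original formalism.

```python
ACTIVATION = +1
REPRESSION = -1


def serial(*signs):
    """Net sign of consecutive interactions: the individual signs are multiplied."""
    net = 1
    for sign in signs:
        net *= sign
    return net


def parallel(*paths):
    """Parallel paths are not multiplied: each path contributes its own net sign."""
    return [serial(*path) for path in paths]


# A represses B, B activates C: A indirectly represses C.
print(serial(REPRESSION, ACTIVATION))            # -1
# A represses B, B represses C: A activates C.
print(serial(REPRESSION, REPRESSION))            # 1
# A closed cycle of three repressions is a net negative feedback.
print(serial(REPRESSION, REPRESSION, REPRESSION))  # -1
# A acts on B through one repressing and one activating channel: both are kept.
print(parallel((REPRESSION,), (ACTIVATION,)))    # [-1, 1]
```

Under this encoding a closed loop behaves as a negative feedback whenever it contains an odd number of repressions.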

A typical example is the model used for the description of the oscillatory dynamics of the KaiC protein in cyanobacteria generated by the interaction with another protein, KaiA (Rust, Markson, Lane, Fisher, & O’Shea, 2007). The interaction network could be presented schematically as in Illustration 14.

Illustration 13: Graphical representation of the rules for simplification of logics architectures involving an intermediate with fast-reached steady state. Consecutive interactions are multiplied, whereas parallel are added. The simplest rule to remember is that a negation of a negation is an affirmation. Therefore, delayed negative feedbacks require an odd number of negations larger than 2.

The presented network seems to be much more complicated than the ones we already discussed. However, the rules of logic allow us to simplify its representation in order to be able to identify the basic structure. Firstly, keeping in mind the rule for the serial interactions, we could omit the KaiC phosphorylated on serine alone and simplify the network significantly. Thus, from left to right, the activation and repression interactions leading from the other phosphorylated forms of KaiC to KaiA could be re-written as direct repression. Accordingly, from right to left, the multiplication of repression and repression produces activation from KaiA to KaiC.

Consequently, KaiC represses KaiA and KaiA activates KaiC in two alternative mechanisms. However, we already stated that parallel interactions are simply added, thus the two activating mechanisms could be summed up in one. Finally, the obtained simplified architecture of the network is nothing more than a delayed negative feedback loop.

Illustration 14: The regulatory circuit driving the oscillations of the KaiC protein in cyanobacteria. The complete network of interactions (A) involves 3 protein species and 5 independent regulation paths. This network could be simplified to a 2-component delayed negative feedback loop (B).

III.3. Synthetic genetic oscillators

The first example of a synthetic genetic oscillator is also the first example of rational design of a dynamic genetic device in synthetic biology as a whole (M B Elowitz & Leibler, 2000). This work was reported in the year 2000 by Elowitz and Leibler in a famous Letter to Nature. The design they utilized is a direct implementation of the delayed negative feedback design. In brief, their oscillatory genetic circuit consists of a cycle of three elements, each of them producing a repressor protein that inhibits the production activity of the next element. Since the elements are three, the summed up effect of three repressions is a multiplication of three negative interactions, which is a negative interaction. This device was named the “repressilator” owing to the fact that its oscillatory dynamics depends entirely on repression.

The fact that the signal is transmitted through 3 different network elements, each consisting of a gene, mRNA, unfolded protein and active repressors, provides the system with a large enough delay to produce sustained oscillations. Complying with our previous analysis, the authors report that sustained oscillations are maintained better in systems based on strong protein expression and fast protein degradation. Therefore, the precise system design includes strong synthetic promoters based on naturally occurring phage elements and SsrA degradation tags made of peptide sequences directing the Clp family of protease enzymes towards the repressor proteins (Gottesman, Roche, Zhou, & Sauer, 1998). Explicitly, the first network element consists of the repressor of the tetracycline resistance system (TetR) controlled by a LacI-repressible promoter (PLlacO1), the second element controls the production of the repressor of the first promoter (LacI) through the wild-type PR promoter. The latter is controlled by the λ cI protein, which is produced from a TetR-repressible promoter (PLtetO1), thus closing the complete interaction network (Illustration 15).

Notably, both LacI and TetR interact with ligands that inhibit their action when bound to the cognate protein, namely a structural analogue of lactose (isopropyl β-D-1-thiogalactopyranoside, abbreviated IPTG) and a modified version of tetracycline (anhydrotetracycline, ATc). This way, the design of this regulatory network allows for tuning of the available concentrations of those repressors and consequently a better approximation to the parameters required for sustained oscillations. However, the individual cells still exhibit large variances between the oscillatory periods, the reasons for which we already studied above. The exact behavior of the engineered cells was characterized by the utilization of a fluorescence reporter protein, GFP. The latter was introduced on a dedicated plasmid under the control of another copy of the PLtetO1 promoter. Thus, the oscillations in the TetR concentration would result in oscillatory repression of the generation of GFP. This phenomenon was observed using single-layer fluorescence microscopy. The 2D growth of the cells was performed in a low-melt-agarose-based medium sealed between a microscope slide and a cover slip. As discussed elsewhere, this type of experimental setup poses limitations to the length of the time-lapse cellular observations, in this case to a maximum of 10 hours. The tracking of the fluorescence of the individual cells was performed manually. Despite the noisy behavior, the “repressilator” was the first example of engineering of a synthetic biology part based on the oscillator theory, thus opening up the whole new field of bottom-up design approach in biology.
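The cyclic three-repressor architecture described above can be explored with a small deterministic simulation. The sketch below uses a commonly used dimensionless three-gene model (one mRNA and one protein variable per repressor, Hill-type repression) integrated with a simple explicit Euler scheme. The parameter values, the function name and the integration scheme are assumptions chosen for illustration and should not be read as the values or the method of the original study.

```python
def simulate_repressilator(alpha=200.0, alpha0=0.2, beta=5.0, n=2.0,
                           dt=0.01, n_steps=10000):
    """Euler integration of a dimensionless three-repressor ring.

    Index 0 = LacI, 1 = TetR, 2 = cI. Gene i is repressed by protein (i - 1) % 3,
    i.e. LacI represses tetR, TetR represses cI and cI represses lacI.
    """
    m = [1.0, 0.0, 0.0]   # mRNA levels (asymmetric start, assumed)
    p = [0.0, 0.0, 0.0]   # protein levels
    history = []
    for _ in range(n_steps):
        new_m = [
            m[i] + dt * (-m[i] + alpha / (1.0 + p[(i - 1) % 3] ** n) + alpha0)
            for i in range(3)
        ]
        new_p = [p[i] + dt * (-beta * (p[i] - m[i])) for i in range(3)]
        m, p = new_m, new_p
        history.append(tuple(p))
    return history


history = simulate_repressilator()
# Print the three protein levels at a few time points to inspect the dynamics.
for step in range(0, len(history), 2000):
    lacI, tetR, cI = history[step]
    print(f"t = {step * 0.01:6.1f}  LacI = {lacI:8.2f}  TetR = {tetR:8.2f}  cI = {cI:8.2f}")
```

Whether the trajectories settle to a steady state or keep oscillating depends on the assumed promoter strength, leakiness and protein-to-mRNA lifetime ratio, which mirrors the requirement for strong expression and fast protein degradation noted in the text.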

Illustration 15: The repressilator circuit design. The information and material flow is cycling through the closed loop of the regulatory network. This cycling of the repressor proteins causes oscillations of the expression level of GFP. Inducers could be used for fine-tuning of the circuit dynamics at 2 connecting interactions.

The next fundamentally important example of synthetic biological oscillator design was based on a positive-negative feedback loop and came much later (Stricker et al., 2008). However, it is of no lesser importance to the field, owing to the fact that it was the first oscillator actually capable of sustaining oscillations for a long period of time and with a very low coefficient of variation between the individual cells. Once again, the engineering of this oscillator was based on the theoretical requirements already defined (Hasty, Isaacs, Dolnik, McMillen, & Collins, 2001). Hence, in order to support this kind of steady behavior it combines negative and positive feedback loops. To accomplish that, the researchers had to employ a specific class of synthetic promoters that combine the signals of two distinct types of effector proteins. Explicitly, the Plac/ara promoter was utilized. This promoter is based on a derivative of the wild-type Plac promoter and comprises three LacI operator sites and two AraC binding sites replacing the original Crp operators. Thus, the complete system consists of two elements controlling its dynamics. First, there is one copy of the combinatorial promoter controlling the synthesis of an activator protein. The second part consists of another copy of the same promoter controlling a repressor protein. The modus operandi of this oscillator is the following. Initially both of the promoters drive the synthesis of the two proteins at their basal rate. The activator protein requires less time to achieve a functional form and it affects the two promoters by raising their expression levels to the maximum. This also leads to the accumulation of the repressor protein, which, upon reaching some threshold concentration of its active form, inhibits the production of both of the proteins. The latter, as above, have SsrA protein degradation tags and are removed efficiently from the system in a very short time. Then, the system is again at its initial state and the next cycle of oscillation starts (Illustration 16).

The remarkable property of this oscillator is not only its robustness, but also the fact that it has a tunable period. Both proteins involved in the regulation of the dynamics of the circuit are allosterically affected by some simple molecules contained in the medium. The repressor protein is the repressor of the lactose assimilation system of E. coli (LacI) and is controlled naturally by the presence of lactose in the environment. LacI normally blocks the expression of the lactose utilization proteins and releases it in the presence of lactose, i.e. it is inducible. Since lactose is normally consumed by bacterial cells and its presence in the medium could alter the metabolic state and consequently the behavior of the cells, it is not utilized for LacI induction. Instead, a structural analog is used, namely IPTG. On the other hand, the activator through which the positive feedback is accomplished is also affected by a metabolite molecule. The AraC protein is a natural regulator of the arabinose uptake system of E. coli and has both repressing and activating capacity. The latter activity is due to its capacity to attract RNA polymerase when it is bound to DNA. The binding capacity is active when the protein is in complex with arabinose sugar. Consequently, by controlling the levels of the accessible inducers/activators, the authors could control the levels of activated controller protein complexes. The latter not only allows for the tuning of the oscillation period, but also for achieving proper system parameters that comply with sustained oscillations. Here once again the green fluorescence protein was used as a reporter of the system's behavior. The explicit design of the whole system is somewhat different than before. In this example the activator and the reporter units were cloned in the same plasmid and the repressor unit was cloned in a separate plasmid.

Illustration 16: Graphical representation of the Stricker et al. positive-negative-feedback oscillator. The two feedbacks are executed through an activator (AraC) and repressor (LacI) of the same promoter. Both of the proteins are allosterically controlled by ligands. The system is based on two circles of information. However, the positive feedback is faster, because it requires fewer reaction steps to assume its active form.

Illustration 17: Pattern formation circuit relying on a single distant messenger. The circuit inside the individual cells consists of a delayed negative feedback network. The positive feedback involves the generation of a quorum sensing molecule, AHL. The negative feedback is only activated when a certain threshold concentration of AHL is reached and is characterized by the production of a variant of RFP. This way, in a growing colony under exposure to UV light a typical pattern of red concentric rings is observed.

The fluorescence characterization itself was performed in a microfluidics device allowing for the continuous growth of the bacterial cells in a single layer. The fluorescence emission of the individual cells was tracked automatically using a purpose-made MATLAB processing algorithm. The automation of the data processing allowed for the tracking of a significant number of cells and the derivation of statistically-significant conclusions for the dependence between the oscillatory period and the inhibitors' concentrations.

The positive-negative-feedback-based oscillatory circuit was recently utilized for pattern formation (Payne et al., 2013). As we discussed above, the AHL molecules were utilized to carry the negative feedback signal. The complete circuit consists of the following elements. Firstly, there is a promoter controlling its own activator protein. The same protein also activates the generation of AHL and of another regulatory protein, LuxR. The latter, when bound to AHL, activates the expression of an inhibitor for the first activator protein. Explicitly, the first unit of this regulatory network consists of a T7 polymerase controlled by a T7 promoter. Another copy of the T7 promoter controls LuxR and LuxI, which produces AHL. Finally, a wild-type Plux promoter activated by LuxR in the presence of AHL controls the expression of T7 lysozyme. The latter inhibits the action of T7 polymerase by direct interaction with the enzyme (Illustration 17).

The modus operandi of the complete system is as follows. Initially the system starts to produce the first activator, AHL and the second activator. When AHL reaches a certain threshold concentration in some spots of the colony, the second activator becomes functional and the inhibitor protein is expressed. The latter results in shutting down the whole system. However, upon degradation of the inhibitor, the system resumes a new oscillatory cycle. The characterization of this engineered biological circuit was performed using a type of red fluorescent protein (mCherry) co-expressed in the same operon with the T7 lysozyme. The bacteria were grown as a single colony, which was imaged with a phase-contrast and fluorescent camera every 10 hours. The emergence and spread of fluorescence were measured automatically using a custom MATLAB algorithm. In this particular research, a typical pattern of concentric rings formed and remained stable throughout the whole experiment. The latter phenomenon was explained by the authors as a combined effect of the continuous colony growth and the reduction of the growth of the cells in the repressed state and might be regarded as a standing wave oscillation.

Finally, we would like to discuss an engineered synthetic biology circuit in which quorum sensing was utilized to synchronize the dynamic behavior of a bacterial colony (Danino, Mondragón-Palomino, Tsimring, & Hasty, 2010). Here again AHL was used as a cell-to-cell communication agent. The oscillatory circuit is a combined positive-negative feedback one. As before, the negative feedback involves the synthesis of an additional protein, thus being delayed with respect to the positive feedback. The precise structure of the oscillatory system comprises two operons controlled by identical promoters, which are affected by an activator protein synthesized constantly in the cell. The first operon controls the synthesis of AHL. The latter binds to the constantly present activator and converts it to its functional state. This way, the two operons are switched to their maximal synthesis rate. The first promoter generates more AHL and the second produces an inhibitor protein. The latter inhibits the generation of AHL, thus switching off the whole system. All the proteins produced inside the oscillatory circuit are tagged for protease enzymatic degradation, therefore the protein concentration decays fast. Upon reaching a lower threshold concentration of the inhibitor protein, the system reverts back to its original state and a new cycle of oscillation may start. The explicit design of this system is again distributed over two separate plasmids. In the first plasmid there is a wild-type Plux promoter controlling the generation of LuxI and another copy of the same promoter controlling the generation of a type of GFP (yemGFP). On the second plasmid a third copy of the same promoter was cloned, controlling the expression of the enzyme AiiA, which reduces the concentration of AHL (Illustration 18).

Illustration 18: Positive-negative-feedback logics oscillator allowing for synchronization of the separate cells through a quorum sensing molecule (AHL). The reaching of an activating threshold concentration of AHL triggers maximal rate of synthesis and thus increases the messenger concentration in the whole population. Consequently, the oscillators in all neighboring cells are activated, hence synchronization is achieved.

The originality here is in the utilization of a quorum sensing molecule, which makes it possible to synchronize the oscillations of all individual bacterial cells in a given colony. This is because the initial triggering of the fast synthesis state of a given cell would propagate the higher AHL concentration to the neighboring cells as well, even if they are not yet in that state. This way, at each switching on of the activated promoter state within some part of the cellular population, the whole colony is synchronized and all individual cells become entrained with the same frequency. This behavior was also characterized in a microfluidics chip with 2D growth chambers allowing for the observation of bacterial colonies growing in a single layer for a long period.

III.4. Genetic oscillators engineered for this research

We designed, built and partially characterized a large variety of oscillators. Some of them involve typical elements utilized in the engineering of a genetic circuit similar to the ones cited in the previous examples. However, we did not limit ourselves to proteins as the only channels for information transfer in living systems and/or the only oscillating species involved in periodic concentration changes.

III.4.1. Goodwin-type oscillators

The first oscillators we designed, built and studied were a series of Goodwin-type devices. As explained above, those oscillators consist of an operon controlling the expression of a protein capable of inhibiting the activity of its own promoter. Thus, if the time-delay caused by the folding and proper activation of the protein is sufficiently long and the reaction of the promoter activity to the presence of the repressor is non-linear, sustained oscillations could arise in such a system. However, the individual peaks of those oscillations tend to be very different between the separate cells and also within the same cell. This phenomenon is provoked by the stochastic nature of the biochemical processes involving low numbers of molecules. In addition, the accumulation of repressor protein might slow down the reactivity of the system and thus result in the emergence of damped oscillations. Therefore, the protein turnover should be as fast as possible. Thus, there are three major functional requirements for the engineering of a Goodwin oscillator with sustained oscillations. First, the delay of such a simple system could only be provided by repressor proteins that require multimerization for their activity to be achieved. Second, the non-linearity of the reaction could be accomplished only by strong tightly-repressible promoters. The strength of the promoters also increases the number of the synthesized protein molecules in each oscillatory cycle, thus reducing the stochasticity in the biochemical interactions. Furthermore, tight repression also reduces stochasticity by removing the protein molecules appearing as a consequence of the “leakiness” in promoter control. Third, fast protein turnover requires the employment of protease tags for the enzymatic degradation of the regulatory proteins. There is one additional requirement for the design of any synthetic biological circuit and it arises from purely engineering considerations. Namely, the parts used for the design and building of bigger circuits should preferably be standard and characterized. Thus, we should aim for the utilization of parts with known activity and function for our synthetic circuits. Last, but not least, engineered synthetic biological parts require characterization; therefore, a reporter fluorescence protein should be included in the design in a way which allows it to properly reflect the system's behavior, but which has limited effect on the dynamics of the circuit per se.

Thus, we have already outlined the general architecture and requirements of the individual parts from which we would build a Goodwin-type oscillator. Combining equations (12) and (13), we could also derive the complete mathematical model governing this type of device:

$$\frac{dA}{dt} = k_1 \times \frac{K_i^n}{K_i^n + k_C^n \times A(t-T)^n} - k_2 \times A - \frac{D_{max}}{A + K_m} \times A = k_1 \times \frac{K_i^n}{K_i^n + k_C^n \times A(t-T)^n} - A \times \left( k_2 + \frac{D_{max}}{A + K_m} \right) \tag{14}$$

The first term of equation (14) governs the promoter activity, where k1 is a rate constant representing the basal promoter activity, i.e. the capacity of the DNA sequence of the promoter to attract the polymerase. Ki also represents a property of the promoter, namely the inhibition constant of the repressor protein. In addition, the kC constant combines the rate of translation of the mRNA into a peptide chain, defined by the strength of the RBS, with the rate of folding of the peptide chain into active protein.
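To make the roles of the terms in equation (14) concrete, the following is a minimal Python sketch that integrates the delayed equation with a fixed-step explicit Euler scheme. It is not the MATLAB implementation used in this work: the function names, the constant history A(t) = a0 for t ≤ 0 and all parameter values are assumptions chosen only for illustration.

```python
def hill_repression(repressor, k_i, k_c, n):
    """Promoter activity fraction K_i^n / (K_i^n + (k_C * A)^n), as in equation (14)."""
    return k_i ** n / (k_i ** n + (k_c * repressor) ** n)


def simulate_goodwin(k1=10.0, k2=0.1, k_i=1.0, k_c=1.0, n=4,
                     d_max=5.0, k_m=1.0, delay=10.0,
                     a0=0.0, t_end=200.0, dt=0.01):
    """Integrate equation (14) with explicit Euler steps of size dt.

    All default values are placeholders, not measured constants.
    The history A(t) = a0 for t <= 0 is an assumption of this sketch.
    Returns (times, values) as two lists of equal length.
    """
    lag = int(round(delay / dt))      # delay T expressed in time steps
    steps = int(round(t_end / dt))
    a = [a0]
    for i in range(steps):
        delayed = a[i - lag] if i >= lag else a0       # A(t - T)
        synthesis = k1 * hill_repression(delayed, k_i, k_c, n)
        loss = a[i] * (k2 + d_max / (a[i] + k_m))      # first-order loss + protease
        a.append(max(a[i] + dt * (synthesis - loss), 0.0))
    times = [i * dt for i in range(steps + 1)]
    return times, a
```

Whether the trajectory oscillates, and with which period, depends entirely on the chosen constants; the sketch only shows how the delayed repression term and the saturable degradation term enter the update.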

As discussed above, the analysis of the behavior of any synthetic biology part or circuit is based on observing the dynamic concentration of a fluorescent reporter protein. The general approach for coupling the reporter concentration to the dynamics of the oscillatory system is to put the synthesis of the fluorescent reporter protein under the control of the same promoter. This can be accomplished polycistronically or with another copy of the promoter. However, it is also possible to use another type of promoter controlled by the same repressor protein. Therefore, the two defining rate constants of the promoter controlling the reporter synthesis generally do not match the ones used for the repressor protein. The same argument is valid for the kC rate constant: the RBS upstream of the fluorescent reporter coding sequence on the mRNA could be the same or different, but the folding rates of two different proteins always differ. Consequently, the general equation governing the concentration of the fluorescent reporter protein is the following:

$$\frac{dB}{dt} = k_{12} \times \frac{K_{i2}^n}{K_{i2}^n + k_{C2}^n \times A(t-T)^n} - B \times k_{22} \tag{15}$$

where all the rate constants with subscripts ending in “2” could be the same as or different from the ones in equation (14). Whatever promoter is used to control the synthesis of the fluorescent reporter, there is always a general preference for stronger expression capacity. This is because clear fluorescence imaging with short exposure times and low excitation energy can be accomplished only if a large number of fluorescent molecules contribute to the emission. However, a large pool of fluorescent molecules would artificially damp the observed dynamics of the system. Therefore, those molecules need to be degraded fast, which can only be accomplished by tagging them for enzymatic degradation. If the fluorescent reporter protein is also tagged for enzymatic degradation, this activity should be taken into account in the mathematical model of the protein concentration dynamics. The SsrA tags are the most widely used protease-targeting tags, and usually they are used for both the reporter and the regulatory proteins. As a result, both types of proteins are degraded simultaneously by the same machinery and, consequently, they load and saturate the proteases together. The active protease degradation should therefore be modeled by the same mathematical formalism for all the tagged protein species, and this formalism should include the concentrations of all of those species. Finally, if the reporter protein is also tagged with an SsrA tag, the two governing equations for a Goodwin-type oscillator with a fluorescent protein reporter would be the following:

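$$\frac{dA}{dt} = k_1 \times \frac{K_i^n}{K_i^n + k_C^n \times A(t-T)^n} - A \times \left( k_2 + \frac{D_{max}}{A + B + K_m} \right) \tag{16}$$

$$\frac{dB}{dt} = k_{12} \times \frac{K_{i2}^n}{K_{i2}^n + k_{C2}^n \times A(t-T)^n} - B \times \left( k_{22} + \frac{D_{max}}{A + B + K_m} \right) \tag{17}$$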
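The competition of the two tagged species for the same protease pool in equations (16) and (17) can be sketched as follows. This is a hedged Python illustration rather than the code used in this work: the function names, the dictionary keys and the numerical values in `EXAMPLE_PARAMS` are hypothetical placeholders.

```python
def shared_protease_rate(species_level, all_levels, d_max, k_m):
    """Protease loss of one tagged species competing with all tagged species.

    Implements the term X * D_max / (sum of tagged species + K_m)
    that appears in equations (16) and (17).
    """
    return species_level * d_max / (sum(all_levels) + k_m)


def derivatives(a, b, a_delayed, p):
    """Right-hand sides of equations (16) and (17).

    `a_delayed` is A(t - T); `p` is a dict of rate constants whose keys
    mirror the symbols in the text (key names chosen for this sketch).
    """
    n = p["n"]
    synth_a = p["k1"] * p["Ki"] ** n / (p["Ki"] ** n + (p["kC"] * a_delayed) ** n)
    synth_b = p["k12"] * p["Ki2"] ** n / (p["Ki2"] ** n + (p["kC2"] * a_delayed) ** n)
    tagged = [a, b]
    da = synth_a - p["k2"] * a - shared_protease_rate(a, tagged, p["Dmax"], p["Km"])
    db = synth_b - p["k22"] * b - shared_protease_rate(b, tagged, p["Dmax"], p["Km"])
    return da, db


# Placeholder values for illustration only; they are not fitted to any data.
EXAMPLE_PARAMS = {
    "n": 2, "k1": 10.0, "k12": 20.0, "Ki": 1.0, "Ki2": 1.0,
    "kC": 1.0, "kC2": 1.0, "k2": 0.1, "k22": 0.1, "Dmax": 5.0, "Km": 1.0,
}
```

With this form, adding reporter molecules lowers the protease loss experienced by the repressor at the same repressor concentration, which is the coupling that the shared denominator A + B + Km expresses.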

Good practice for the software modeling of the system described above is to build on modularity. Since one of the explicitly declared goals of synthetic biology is modularity and the development of exchangeable parts, the tools used to simulate such systems should be prepared for the exchange of similar parts. In practice, each of the constituent parts of the expression machinery in the system above can be modeled as a separate script, i.e. the promoters should exist as separate functions. This way they can be called with the appropriate variables (repressor concentration, RBS strength, etc.) and return the rate of protein expression.

In addition, the protease machinery can also be modeled independently, with inputs for a variable number of tagged protein species. If this approach is adopted, the dynamics of each of the protein species can be simulated simply by invoking small scripts. Furthermore, different combinations of parts require only a change of the function being called, instead of a rewrite of the complete model. Using those considerations we developed a number of MATLAB scripts describing the behavior of the repressible promoters that we used and the dynamics of synthesis and concentration of the proteins controlled by those promoters (Illustration 19). As a solver for the differential equations with an explicit delay we used the dedicated MATLAB function dde23. The complete list of developed scripts, as well as the scripts themselves, can be found in APPENDIX B.
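The idea of promoters as exchangeable functions can be illustrated with a short Python analogue. It does not reproduce the MATLAB scripts of APPENDIX B; the part library, the function names and all numerical values below are hypothetical and serve only to show how swapping a part reduces to swapping a function call.

```python
def make_repressible_promoter(strength, k_i, n, repressor):
    """Return a promoter function mapping repressor levels to activity.

    `levels` is a dict such as {"LacI": 0.3, "TetR": 1.2}; repressors
    that are absent from the dict are treated as zero.
    """
    def activity(levels):
        r = levels.get(repressor, 0.0)
        return strength * k_i ** n / (k_i ** n + r ** n)
    return activity


# Hypothetical part library: names follow the text, numbers are placeholders.
PROMOTERS = {
    "PLlacO1": make_repressible_promoter(1.0, 1.0, 2, "LacI"),
    "PLtetO1": make_repressible_promoter(1.0, 1.0, 2, "TetR"),
}


def expression_rate(promoter_name, rbs_strength, levels):
    """Rate of protein expression for a given promoter/RBS pair."""
    return rbs_strength * PROMOTERS[promoter_name](levels)
```

For example, `expression_rate("PLlacO1", 2.0, {"LacI": 1.0})` and `expression_rate("PLtetO1", 2.0, {"LacI": 1.0})` differ only in the name of the part being called, while the rest of a model built on `expression_rate` stays unchanged.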


Illustration 19: Structure of the general MATLAB model for dynamic genetic circuits. All the estimators are written as individual scripts. The same estimator may be called by one or more other estimators. If standard parts (promoters, RBS's, degradation tags) are used the input parameters might be reduced just to the initial conditions of the simulation, i.e. P 0.

For the engineering of the Goodwin-type oscillators we used the two best-studied repressors with known behavior and dynamics, TetR and LacI. The sequences of the proteins that we used are the ones found in the repository of genetic parts of the iGEM community, and they already have SsrA tags added. Those proteins were put under the control of promoters repressed by them. The sequences of the promoters that we used were obtained from the literature or were designed by us. The literature-based promoters are the well-known Lutz-Bujard PL lacO1 and PL tetO1. We also used a synthetic promoter designed and built by us, P trc_tet, as well as a hybrid promoter repressed by either of those proteins, which has the number BBa_K091101 in the repository of genetic parts. The latter part is considered to be an “AND” logic gate, because it can express the protein under its control only in the presence of both the inducer of LacI and the inducer of TetR. As RBS we used the BBa_B0034 part, which is used as a reference for RBS strength measurements and was originally derived from the Elowitz “repressilator paper”, which we already cited. For the polycistronic reporters we used YFP or mCherry, and the whole system was cloned in a pSC101 low-copy-number plasmid with ampicillin resistance. Conversely, when we adopted the approach with a fluorescent reporter on a separate plasmid, we used GFP and integrated the oscillatory circuit in a p15A plasmid with kanamycin resistance. The reporter construct was cloned in a high-copy-number plasmid with a pMB1 origin and ampicillin resistance. This way, when two plasmids were used, they had compatible origins of replication and different antibiotic resistances (Illustration 20).

Illustration 20: The Goodwin-type oscillators we constructed consist of a promoter that controls the production of its cognate repressor (left). We constructed a number of such oscillators based on standard biological parts (right). The reporter devices are not shown here.

We characterized those genetic circuits by fluorescence microscopy. For that purpose we cultivated the transformed bacterial cells in microfluidics devices for 2D growth. The starting culture was prepared from an overnight culture grown at 37 °C in LB medium supplemented with the appropriate antibiotics. The overnight culture was diluted 1000 times and then grown to OD = 0.4-0.5 in an Erlenmeyer flask. At that point the culture was concentrated 10 times and 0.05 % of surfactant was added to the obtained inoculum. Additionally, LB medium supplemented with antibiotics and the same concentration of surfactant was prepared for the continuous growth experiment. The two solutions were loaded into syringes, from which they were forced into the microfluidics chip. Initially the starting culture was forced through the chip and, by generating lateral shear stress, some bacteria were trapped in the growth chambers. Subsequently, the syringe with the fresh medium was connected to the microfluidics device and the medium was continuously pumped through at a constant rate of 500 µl/h. This system allows for the continuous growth of the bacterial culture and its maintenance in exponential growth phase for a long period of time. The fluorescence and phase-contrast imaging was programmed and performed as explained in the dedicated chapter. The raw microscopy data were analyzed using our image processing software. The resulting fluorescence level vectors can be used for further analysis of the data and for visualization of the results.

Some exemplary results of this analysis are presented in Illustration 21. Here we adopted two types of visualization. The first one is the dynamic curve of the fluorescence level of each individual cell against time. This way the characteristics of the individual oscillations can be analyzed easily and common traits can be highlighted. However, the number of cells that can be easily distinguished in this type of plot is limited. Conversely, if the fluorescence levels are represented as intensity levels, the behavior of a large number of cells can be visualized in a small space. Unfortunately, this type of representation makes it extremely hard to assess the precise characteristics of the individual peaks. These and other observations of the Goodwin-type oscillators that we built confirmed our theoretical expectation that they would exhibit very noisy oscillations. Those results also confirmed previous experimental data based on similar characterization of an oscillator consisting of a PL lacO1 promoter controlling the expression of the LacI repressor. This way we showed that we had developed a useful and efficient technology for the characterization of synthetic biological parts and circuits.


Illustration 21: Graphical representation of the characterization of a Goodwin-type oscillator. The dynamics of the fluorescence of individual cells could be presented as a plot against time (above) or as a 1D intensity plot (below).

III.4.2. Double genetic oscillators

As we already stated, modularity is very important for synthetic biology, because it allows genetic circuits of different dynamics and complexity to be built from more than one elementary part. Therefore, our next step was to engineer a system comprising two genetic oscillators and to characterize its behavior inside a bacterial cell. The simplest combination consists of the simple addition of two of the previous oscillators, controlled by different promoters, which in turn are repressed by different repressor proteins. However, other options are provided by the modularity of promoter structure, which allows the engineering of promoters under the control of more than one regulatory protein. One such promoter is the combinatorial P tet/lac promoter, repressed by either the LacI or the TetR repressor, or by both of them together. Because of the combined action exerted by two different repressor proteins on this promoter, it can be used as a node connecting the dynamics of several individual genetic circuits. Each synthetic biological device that includes LacI or TetR in the control of its dynamics would also affect the dynamics of the system controlled by the P tet/lac promoter. Additionally, if the latter controls the expression of the same protein, it would be a Goodwin oscillator on its own and at the same time would affect the dynamics of the first system. This way a number of different combinations can be created and explored.

Using this combinatorial promoter and the previously described PL lacO1 and PL tetO1 promoters, we were able to engineer the following combinations of two oscillators inside the same bacterial cell (Illustration 22):

a) Decoupled oscillators – the TetR-repressed and LacI-repressed oscillators are independent from each other and do not affect each other's dynamics.

b) TetR-coupled oscillators – this combination allows for the independent oscillation of the LacI-repressed oscillator; however, the TetR-repressed device is under the control of the P tet/lac promoter. Therefore, the dynamics of the TetR concentration is also affected by the LacI concentration.

c) LacI-coupled oscillator – this is the mirror image of the previous circuit. In this case it is LacI that is under the control of the P tet/lac promoter, and TetR is expressed from PL tetO1. Consequently, the dynamics of the concentration of TetR is that of a Goodwin-type oscillator, whereas the dynamics of the concentration of the other repressor protein is combinatorial.

d) Coupled oscillators – this circuit consists of two separate copies of the P tet/lac promoter controlling the expression of LacI and TetR respectively. This way, each of the proteins exerts influence on the expression of the other, and the dynamics of both of them could have a variety of properties.

Illustration 22: Combinations of two different promoters engineered in the same cell.
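The four configurations can be written compactly under one modelling assumption that is not made in the text: that the two repressors act independently on the combinatorial promoter, so that its activity is the product of two Hill-type repression terms of the kind used in equation (14). With $L$ and $R$ denoting the LacI and TetR concentrations, and with the translation and folding constant absorbed into the inhibition constants (all symbols here are chosen for illustration only), the promoter activities would be:

$$h_L = \frac{K_L^n}{K_L^n + L(t-T)^n}, \qquad h_R = \frac{K_R^m}{K_R^m + R(t-T)^m}, \qquad h_{tet/lac} = h_L \times h_R$$

and the synthesis terms of the two repressors in each configuration would read:

$$\begin{array}{lll}
 & \text{LacI synthesis} & \text{TetR synthesis} \\
\text{a) decoupled} & k_L \times h_L & k_R \times h_R \\
\text{b) TetR-coupled} & k_L \times h_L & k_R \times h_L \times h_R \\
\text{c) LacI-coupled} & k_L \times h_L \times h_R & k_R \times h_R \\
\text{d) coupled} & k_L \times h_L \times h_R & k_R \times h_L \times h_R
\end{array}$$

The loss terms would keep the form of equations (16) and (17), with all SsrA-tagged species sharing the same protease pool. The product form is only one possible description of a doubly repressed promoter; a measured transfer function of P tet/lac could replace it without changing the structure of the table.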

All the repressor proteins in the combinatorial circuits above were tagged. The combinations were cloned in a single p15A plasmid with kanamycin resistance.

There are also different possibilities for reporting the dynamics of those genetic circuits with fluorescent proteins. We could use an independent reporter protein for each of the two repressor proteins. In that case, the expression of each of those fluorescent proteins would be controlled by a promoter repressed by one of the two repressor proteins. We engineered such a double reporter device in a pMB1 plasmid with ampicillin resistance. The LacI reporter is mCherry with an LAA SsrA tag under the control of the PL lacO1 promoter, and the TetR reporter is GFP with the same tag under the control of the PL tetO1 promoter. However, there is also another option for characterizing the dynamics of this complex circuit, and it is again provided by the properties of the P tet/lac promoter. If the latter is used to control the expression of a single reporter protein, then the obtained fluorescence dynamics would reflect the combined concentration levels of both repressor proteins. Obviously, such an “integrating” reporter would be less informative; however, it could still be used to discriminate between different states of the system. Furthermore, the fluorescence microscopy observation of a single fluorescent protein instead of two requires half the UV-light exposure and, therefore, places far fewer limitations on the experimental setup as a whole. Both types of reporters were cloned in a pMB1 plasmid with ampicillin resistance.


Illustration 23: Two Goodwin-type oscillators in the same cell. In this case they are decoupled and the reporter response is integrating the dynamics of both the repressors (A). After microscopy and image processing the dynamics of the fluorescence levels of the individual cells could be obtained (B). The processing could continue with automatic period detection and periodograms for different experimental conditions could be obtained (C). This type of plot shows the frequency of appearance of certain oscillatory period in the whole cell population. The saturating level of one of the inducers obviously changes the structure of the periodogram.

A number of experiments were performed with the different combinations of two oscillators and each of the two types of reporters. Some exemplary results are presented in Illustration 23.
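The caption of Illustration 23 mentions automatic period detection and population periodograms. One simple way to obtain such a histogram from single-cell fluorescence traces is sketched below in Python. It is an illustrative stand-in, not the detection algorithm used in this work: peaks are taken as local maxima above a user-chosen threshold, and the function names, the threshold and the bin width are assumptions.

```python
import math
from collections import Counter


def peak_times(times, values, threshold):
    """Times of local maxima whose value exceeds `threshold`."""
    return [times[i] for i in range(1, len(values) - 1)
            if values[i] > threshold
            and values[i] > values[i - 1]
            and values[i] >= values[i + 1]]


def periods(times, values, threshold):
    """Peak-to-peak intervals of one single-cell trace."""
    peaks = peak_times(times, values, threshold)
    return [later - earlier for earlier, later in zip(peaks, peaks[1:])]


def period_histogram(cell_traces, times, threshold, bin_width):
    """Count how often each period (binned to `bin_width`) occurs in the population."""
    counts = Counter()
    for values in cell_traces:
        for p in periods(times, values, threshold):
            counts[bin_width * math.floor(p / bin_width)] += 1
    return dict(sorted(counts.items()))
```

On noisy experimental traces a smoothing step or a minimum peak separation would normally be added before peak picking; both are omitted here to keep the sketch short.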


III.4.3. Oscillatory copy number plasmid

Protein regulators are very versatile and can be used in genetic circuits in modular ways, which, combined with combinatorially engineered promoters, allows for the design and construction of a large variety of dynamical systems. However, there is a long-standing problem concerning protein regulation in synthetic biology: the limited number of readily available, well-studied repressors or activators known to researchers (Garg, Lohmueller, Silver, & Armel, 2012). The repressor proteins that one can use to build a genetic circuit are literally two – LacI and TetR. If one is willing to accept moderate risk, the phage family of CI proteins can be included, as well as LasR, and that is all. There are also proteins with dual regulatory activity, such as LuxR and AraC, and maybe others. But even if we extend the list to ten different usable repressor proteins, the number is still quite small. If we are to build a complex genetic circuit involving at least one oscillator, then that device would sequester at least one of the available repressors for the control of its dynamics. And if the intended circuit involves one oscillator and a couple of inverters, apart from other logic, then the problem becomes unsolvable. Therefore, the development of alternative methods for the dynamic control of gene expression is one of the main tasks of synthetic biology.

As described elsewhere, gene expression involves the conversion of information between a variety of forms. It starts from a DNA sequence encoding a peptide chain, which is transcribed into an mRNA sequence, which is finally translated into the peptide chain. Until now we have considered methods to regulate the first of those transformations, i.e. transcription. It is performed by a polymerase enzyme, whose activity can be either hindered by a repressor protein or promoted by an activator protein. However, this is not the only point in this sequence of events at which dynamic control can be exerted. Small RNAs are increasingly adopted as a means of post-transcriptional control. In addition, the quantity of DNA can also be regulated through different methods. Indeed, one of the main research directions in the field of biological rhythms is the study of the periodic changes in DNA replication activity connected to the cellular life cycle. Therefore, in one of our designs we tried to develop an oscillatory system based precisely on DNA replication control. We aimed to accomplish that goal by using the minimal possible set of genetic elements, i.e. by constructing the simplest possible genetic circuit. The latter is the Goodwin-type oscillator; therefore, we needed to generate a negative feedback loop in the DNA replication control. The most straightforward way to obtain such a circuit without whole-organism engineering is to put the copy number of a plasmid under the control of a dynamic system containing a negative feedback loop.

The BioBricks collection of biological parts (http://parts.igem.org) provides a standard plasmid that allows for inducible control of its copy number. This element is denoted pSB2K3 and is a plasmid with two origins of replication. The first one is a standard F' replication origin and the second is the lytic replication control system from the P1 bacteriophage (Heinrich, Riedel, Rückert, Lurz, & Schuster, 1995). The genetic elements that contribute to successful replication initiation from the P1 origin are put under the control of the wild-type P lac promoter. Consequently, if transformed into a strain constitutively expressing LacI, the pSB2K3 plasmid copy number would be under the control of the F' origin alone, whereas the P1 origin would be silenced. The control of the copy number is inducible, because if the LacI inducer IPTG is supplied with the medium, DNA replication also becomes possible from the P1 origin. Furthermore, within a certain range of IPTG concentrations, the plasmid DNA concentration is proportional to the inducer level.

We based our novel engineered oscillator on the pSB2K3 plasmid by generating a feedback loop in this type of copy-number control. Specifically, we cloned a copy of the LacI gene under the control of a medium-strength constitutive promoter (JS006 from the BioBricks). This way, after initial transformation into a LacI-negative strain, the plasmid would reach its maximal possible copy number, because the P1 origin is derepressed. However, during this increase in copy number, the LacI protein would also be constitutively expressed. At a certain threshold concentration, the regulator protein would be able to efficiently repress the promoter controlling the P1 replication enhancers. Consequently, the only remaining active replication origin would be F' and the plasmid replication would halt. Furthermore, the plasmid copy number would start to decrease through dilution caused by the continuous growth of the host cell. However, LacI synthesis is directly proportional to the plasmid copy number, because this protein is under the control of a constitutive promoter on the same plasmid. Therefore, together with the reduction of the plasmid copy number, LacI synthesis would also decline, until at a certain point it would be exceeded by protein degradation and the LacI concentration would start to diminish. When this concentration falls below a certain threshold, the repression would be relieved and the P1 replication origin would be activated again. This way, the plasmid would enter another oscillatory cycle (Illustration 24). Here again the fast degradation of the repressor protein is important; however, the general dynamics of the system are governed by the relatively slow rate of plasmid dilution in the cytoplasm. Consequently, much slower rhythms could be generated in such a system, which involves only one repressor protein.

The dynamics of our design were characterized by following the concentration of a fluorescent reporter protein (GFP) expressed under the control of a PL lacO1 promoter. The whole reporter device was cloned in a high-copy-number pMB1 plasmid with ampicillin resistance. The fluorescence measurements were performed with a microfluidics setup as described earlier, and the image processing was performed with dedicated software developed by Catalin Fetita et al. (Fetita, Kirov, Jaramillo, & Lefevre, 2012) (Illustration 24).

The results exhibit the typical noisy dynamics of a Goodwin-type oscillator and confirm that an oscillator based on plasmid copy number can be successfully engineered.
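The feedback loop described in this subsection can be summarized in a minimal two-variable sketch. It is offered only as an illustration of the verbal mechanism and is not a model derived or fitted in this work; the symbols $P$ (plasmid copy number), $R$ (LacI concentration), $r_F$ (replication rate from the F' origin), $r_{P1}$ (maximal additional replication rate from the P1 origin), $\mu$ (dilution rate due to growth), $k_R$ (LacI synthesis rate per plasmid copy) and $\delta$ (LacI degradation rate) are all assumptions:

$$\frac{dP}{dt} = r_F + r_{P1} \times \frac{K_i^n}{K_i^n + R(t-T)^n} - \mu \times P$$

$$\frac{dR}{dt} = k_R \times P - (\mu + \delta) \times R$$

The first equation expresses that P1-driven replication is repressed by LacI while the copy number is diluted by growth; the second expresses that LacI synthesis is proportional to the copy number. Without the delay $T$, the divergence of this system is $-(2\mu + \delta) < 0$ everywhere, so by the Bendixson criterion it could not sustain a limit cycle and would show at most damped oscillations. In this sketch, sustained rhythms therefore rely on the delay in LacI action or on stochastic effects.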

Illustration 24: Oscillatory copy number design (above) and characterization (below). Fluorescence time frames were taken every 5 minutes. The observed noisy oscillations have very long periods.

III.4.4. Phage-communication-based oscillator

Finally, we also designed an oscillatory system that uses inter-cellular communication to synchronize the oscillations between independent cells. However, to accomplish this synchronization we did not employ quorum-sensing molecules such as AHL, but derivatives of the bacterial phage M13 (De Paepe, De Monte, Robert, Lindner, & Taddei, 2010). The phages produced by a given cell would infect the neighbouring cells and trigger a dynamic response regulated by an engineered oscillatory circuit. The “trigger” for the oscillatory behavior is a protein encoded in the phagemid DNA.

This way we would directly exchange genetic information instead of messenger molecules, which provides an additional means of integrating cell-to-cell communication into genetic circuits designed for synthetic biology. It means that novel complex genetic circuits could be engineered that involve more than one intercellular communication channel and that, unlike different classes of AHL molecules, have absolutely no cross-talk with each other. The design of this system includes a dynamic system with positive and negative feedback. The phagemid itself carries the T7 polymerase gene. The positive feedback is provided by the synthesis of the GeneIII protein, which enables the successful packaging of new phagemid particles. The geneIII is under the control of a T7 promoter (P T7). Consequently, in the presence of the phagemid the T7 polymerase is expressed and starts transcribing geneIII, which leads to the spread of the phagemid to the neighboring cells.

However, another copy of P T7 controls the expression of the T7 lysozyme (LysY) (Wagner et al., 2008). The latter is an inhibitor of T7 RNAP activity and, upon reaching a certain concentration threshold, it can block expression from all P T7 promoters. The GeneIII synthesis would stop and, consequently, so would the spread of viral particles. If the ratio between the efficiency of phagemid infection and the rate of cell growth and division is properly tuned, new uninfected cells should be continuously generated. This would allow the phagemid particles to keep spreading into those susceptible cells, followed by a subsequent halt, so that the dynamics of phagemid spread could be periodic.

The explicit design of this circuit involves three different plasmids (Illustration 25). The first plasmid is actually a phagemid with an M13 phage packaging signal and a pMB1 origin of replication. This phagemid carries the gene for T7 RNAP under the control of a constitutive promoter and a standard ampicillin resistance cassette. We envision four different versions of the constitutive promoter in order to provide some tunability of the system at that point. The second plasmid would provide the dynamics of the system through the control of the expression of the feedback proteins. GeneIII and LysY would each be cloned under the control of a P T7 or a P T7_lac promoter. This way we would have four possible promoter combinations for controlling the expression of the feedback proteins, and thus more versatility in the final system setup. The “feedback plasmid” would have a low-copy pSC101 origin of replication and chloramphenicol resistance. Finally, all the other genes necessary for the successful replication and packaging of the phagemid, apart from geneIII, would be hosted on a “helper plasmid” with a p15A origin of replication and kanamycin resistance. Such a plasmid is readily available for purchase under the name Hyperphage M13K07 ∆pIII. In order to follow the spread of phagemids through the bacterial population, we would also need to put a fluorescent protein under the control of one of the dynamically changing elements of the circuit. For that purpose we envisioned putting GFP synthesis under the control of another copy of the P T7 promoter on the feedback plasmid.

Illustration 25: Phage-communication-based synchronous oscillator design. The oscillatory logic is a delayed negative feedback loop provided by the interaction of T7 RNAP and LysY. For the successful spread of the phagemid through the population of bacteria the products of all the three plasmids are required.

III.5. Conclusion


In this chapter we thoroughly examined the existing theoretical knowledge regarding genetic oscillators. Their function, structure and mathematical description were reviewed. Furthermore, the major practical accomplishments in the engineering of synthetic oscillatory devices were reviewed and analyzed. We demonstrated methods for efficient modeling of genetic oscillators. Simple Goodwin-type oscillators as well as double oscillatory devices were engineered and expressed in the same cell. Additionally, a novel type of oscillator, the oscillatory copy-number plasmid, was developed and characterized in a microfluidics setup. Finally, a novel device based on phage communication was designed and a specific biological construction was proposed.

III.6. References

Berridge, M. J., Bootman, M. D., & Lipp, P. (1998). Calcium--a life and death signal. Nature , 395 (6703), 645–8. doi:10.1038/27094

Bertossa, R. C., van Dijk, J., Diao, W., Saunders, D., Beukeboom, L. W., & Beersma, D. G. M. (2013). Circadian rhythms differ between sexes and closely related species of Nasonia wasps. PloS one , 8(3), e60167. doi:10.1371/journal.pone.0060167

Bock, L. V, Blau, C., Schröder, G. F., Davydov, I. I., Fischer, N., Stark, H., … Grubmüller, H. (2013). Energy barriers and driving forces in tRNA translocation through the ribosome. Nature Structural & Molecular Biology . doi:10.1038/nsmb.2690

Danino, T., Mondragón-Palomino, O., Tsimring, L., & Hasty, J. (2010). A synchronized quorum of genetic clocks. Nature , 463 (7279), 326–30. doi:10.1038/nature08753

De Bock, M., Wang, N., Bol, M., Decrock, E., Ponsaerts, R., Bultynck, G., … Leybaert, L. (2012). Connexin 43 hemichannels contribute to cytoplasmic Ca2+ oscillations by providing a bimodal Ca2+-dependent Ca2+ entry pathway. The Journal of biological chemistry , 287 (15), 12250–66. doi:10.1074/jbc.M111.299610

De Paepe, M., De Monte, S., Robert, L., Lindner, A. B., & Taddei, F. (2010). Emergence of variability in isogenic Escherichia coli populations infected by a filamentous virus. PloS one , 5(7), e11823. doi:10.1371/journal.pone.0011823

Elowitz, M B, & Leibler, S. (2000). A synthetic oscillatory network of transcriptional regulators. Nature , 403 (6767), 335–8. doi:10.1038/35002125

Elowitz, Michael B., Levine, A. J., Siggia, E. D., & Swain, P. S. (2002). Stochastic Gene Expression in a Single Cell. Science , 297 (5584), 1183–1186. doi:10.1126/science.1070919

Fetita, C., Kirov, B., Jaramillo, A., & Lefevre, C. (2012). An automated approach for single-cell tracking in epifluorescence microscopy applied to E. coli growth analysis on microfluidics biochips. In SPIE (p. 83170Z–83170Z–11). International Society for Optics and Photonics. doi:10.1117/12.911371

Garg, A., Lohmueller, J. J., Silver, P. A., & Armel, T. Z. (2012). Engineering synthetic TAL effectors with orthogonal target sites. Nucleic acids research , 40 (15), 7584–95. doi:10.1093/nar/gks404

Gillespie, D. T., Hellander, A., & Petzold, L. R. (2013). Perspective: Stochastic algorithms for chemical kinetics. The Journal of chemical physics , 138 (17), 170901. doi:10.1063/1.4801941

González, A., Manosalva, I., Liu, T., & Kageyama, R. (2013). Control of Hes7 expression by Tbx6, the Wnt pathway and the chemical Gsk3 inhibitor LiCl in the mouse segmentation clock. PloS one , 8(1), e53323. doi:10.1371/journal.pone.0053323

Gonze, D., & Abou-Jaoudé, W. (2013). The goodwin model: behind the hill function. (N. Monk, Ed.) PloS one , 8(8), e69573. doi:10.1371/journal.pone.0069573

Gonze, D., & Goldbeter, A. (2006). Circadian rhythms and molecular noise. Chaos (Woodbury, N.Y.) , 16 (2), 026110. doi:10.1063/1.2211767

Goodwin, B. C. (1965). Oscillatory behavior in enzymatic control processes. Advances in enzyme regulation , 3, 425–38. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/5861813

Gottesman, S., Roche, E., Zhou, Y., & Sauer, R. T. (1998). The ClpXP and ClpAP proteases degrade proteins with carboxy-terminal peptide tails added by the SsrA-tagging system. Genes & development , 12 (9), 1338–47. Retrieved from http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=316764&tool=pmcentrez&rendertype=abstract

Griffith, J. S. (1968). Mathematics of cellular control processes. I. Negative feedback to one gene. Journal of theoretical biology , 20 (2), 202–8. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/5727239

Hájos, N., Karlócai, M. R., Németh, B., Ulbert, I., Monyer, H., Szabó, G., … Gulyás, A. I. (2013). Input-output features of anatomically identified CA3 neurons during hippocampal sharp wave/ripple oscillation in vitro. The Journal of neuroscience : the official journal of the Society for Neuroscience , 33 (28), 11677–91. doi:10.1523/JNEUROSCI.5729-12.2013

Hasty, J., Isaacs, F., Dolnik, M., McMillen, D., & Collins, J. J. (2001). Designer gene networks: Towards fundamental cellular control. Chaos (Woodbury, N.Y.) , 11 (1), 207–220. doi:10.1063/1.1345702

Heinrich, J., Riedel, H. D., Rückert, B., Lurz, R., & Schuster, H. (1995). The lytic replicon of bacteriophage P1 is controlled by an antisense RNA. Nucleic acids research , 23 (9), 1468–74. Retrieved from http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=306884&tool=pmcentrez&rendertype=abstract

Ilie, S. (2012). Variable time-stepping in the pathwise numerical solution of the chemical Langevin equation. The Journal of chemical physics , 137 (23), 234110. doi:10.1063/1.4771660

Kaplan, H., & Greenberg, E. (1985). Diffusion of autoinducer is involved in regulation of the Vibrio fischeri luminescence system. Journal of bacteriology , 163 (3), 1210–1214.

Lavi, O., Ginsberg, D., & Louzoun, Y. (2011). Regulation of modular Cyclin and CDK feedback loops by an E2F transcription oscillator in the mammalian cell cycle. Mathematical biosciences and engineering : MBE , 8(2), 445–61. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/21631139

Mather, W., Bennett, M., Hasty, J., & Tsimring, L. (2009). Delay-Induced Degrade-and-Fire Oscillations in Small Genetic Circuits. Physical Review Letters , 102 (6). doi:10.1103/PhysRevLett.102.068105

Novák, B., & Tyson, J. J. (2008). Design principles of biochemical oscillators. Nature reviews. Molecular cell biology , 9(12), 981–91. doi:10.1038/nrm2530

Payne, S., Li, B., Cao, Y., Schaeffer, D., Ryser, M. D., & You, L. (2013). Temporal control of self-organized pattern formation without morphogen gradients in bacteria. Molecular systems biology , 9, 697. doi:10.1038/msb.2013.55

Rust, M. J., Markson, J. S., Lane, W. S., Fisher, D. S., & O’Shea, E. K. (2007). Ordered phosphorylation governs oscillation of a three-protein circadian clock. Science (New York, N.Y.) , 318 (5851), 809–12. doi:10.1126/science.1148596

Shih, Y.-L., & Zheng, M. (2013). Spatial control of the cell division site by the Min system in Escherichia coli. Environmental microbiology . doi:10.1111/1462-2920.12119

Stricker, J., Cookson, S., Bennett, M. R., Mather, W. H., Tsimring, L. S., & Hasty, J. (2008). A fast, robust and tunable synthetic gene oscillator. Nature , 456 (7221), 516–9. doi:10.1038/nature07389

Turing, A. M. (1952). The Chemical Basis of Morphogenesis. Philosophical Transactions of the Royal Society B: Biological Sciences , 237 (641), 37–72. doi:10.1098/rstb.1952.0012

Vilar, J. M. G., Kueh, H. Y., Barkai, N., & Leibler, S. (2002). Mechanisms of noise-resistance in genetic oscillators. Proceedings of the National Academy of Sciences of the United States of America , 99 (9), 5988–92. doi:10.1073/pnas.092133899

Wagner, S., Klepsch, M. M., Schlegel, S., Appel, A., Draheim, R., Tarry, M., … de Gier, J.-W. (2008). Tuning Escherichia coli for membrane protein overexpression. Proceedings of the National Academy of Sciences of the United States of America , 105 (38), 14371–6. doi:10.1073/pnas.0804090105


Weiss, J. N. (1997). The Hill equation revisited: uses and misuses. FASEB journal : official publication of the Federation of American Societies for Experimental Biology , 11 (11), 835–41. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/9285481

Wong, W. W., Tsai, T. Y., & Liao, J. C. (2007). Single-cell zeroth-order protein degradation enhances the robustness of synthetic oscillator. Molecular Systems Biology , 3. doi:10.1038/msb4100172

Wu, M., Wu, X., & De Camilli, P. (2013). Calcium oscillations-coupled conversion of actin travelling waves to standing oscillations. Proceedings of the National Academy of Sciences of the United States of America , 110 (4), 1339–44. doi:10.1073/pnas.122153811


IV. Engineering of microfluidics devices

IV.1. Introduction

Microfluidics is indispensable for modern synthetic biology for several reasons. First, the technique allows microorganisms to be observed at the single-cell level. Experiments of this type are performed in microfluidics chips designed for the growth of microbial cultures in single layers. When the culture is observed from above, the cells do not stack on top of each other, and each cell is easily discriminated and characterized (Illustration 26).

The growth of the cells in a single layer is accomplished in growth chambers that differ in horizontal dimensions and shape but usually share a similar design in the vertical dimension. Specifically, the depth of such growth chambers is normally between 1 and 1.5 times the normal thickness of the cells to be grown. In this way the growing microbes are mechanically prevented from stacking in the vertical direction and are forced to move in one plane only. This behavior has an additional useful consequence for the experimental setup: the growing cells continuously push each other towards the aperture(s) of the growth chamber. The surplus cells are thereby extruded into the medium-loading channel, which limits the total number of cells and effectively turns the growth chamber into a chemostat.

Illustration 26: A snapshot of an E. coli colony growing inside a rectangular chamber allowing for the growth of bacteria in a single layer. In this particular case the technique is utilized for the easy discrimination and counting of certain labeled plasmids inside each individual cell.

Second, the experiments are performed in a well-controlled environment. The microfluidics chips consist of interconnected channels and chambers with micron-scale dimensions. Cells and media are loaded and removed by fluid flows inside those channels and chambers. The fluid dynamics can be controlled externally by the supply apparatus (syringe pumps, etc.) or internally by integrated valves. This fluid control allows specific input fluid(s) to be directed precisely towards specific chambers containing cells. The small dimensions of the microfluidics channels increase the importance of fluid viscosity relative to inertia and render mixing extremely difficult. Consequently, the composition of every portion of fluid reaching a given part of the microfluidics device is precisely defined by the input flows and their dynamics. This is in contrast to standard lab-scale bioreactors, where the medium reaching the growing cells is a complex, practically undefinable mixture of fresh input medium, exhausted medium already utilized by the cells and waste products produced by the cells.
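The dominance of viscosity over inertia in such channels can be made concrete with the Reynolds number. The following is only a minimal sketch and not part of the experimental work: the fluid properties are those of water at room temperature, and the channel cross-section (100 µm × 10 µm) and the mean flow velocity (1 mm/s) are illustrative assumptions.

```python
def hydraulic_diameter(width_m, height_m):
    """Hydraulic diameter of a rectangular channel, D_h = 2wh / (w + h)."""
    return 2.0 * width_m * height_m / (width_m + height_m)


def reynolds_number(density, velocity, length_scale, viscosity):
    """Re = rho * v * L / mu, the ratio of inertial to viscous forces."""
    return density * velocity * length_scale / viscosity


# Assumed values (not taken from the thesis): water-like medium,
# a 100 um x 10 um channel and a mean velocity of 1 mm/s.
WATER_DENSITY = 1000.0    # kg/m^3
WATER_VISCOSITY = 1.0e-3  # Pa*s
d_h = hydraulic_diameter(100e-6, 10e-6)
re = reynolds_number(WATER_DENSITY, 1.0e-3, d_h, WATER_VISCOSITY)
print(f"Hydraulic diameter: {d_h * 1e6:.1f} um, Re = {re:.3f}")
```

Under these assumptions the Reynolds number is far below 1, which is the regime in which the flow is laminar and adjacent streams exchange material only by diffusion.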

Third, the precise control over the media quality and dynamics also means that the media input can be programmed as a dynamic function, unlike in the normal process of liquid cultivation. Consequently, a large variety of input functions can be tested to define the transfer function of a novel engineered system. Within the context of biological engineering as a whole, and synthetic biology in particular, this capacity is extremely important for the precise modeling of the designed parts and processes.
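As an illustration of what a programmed input function might look like, the sketch below generates a periodic on/off inducer profile. The function name, the 60-minute period and the relative inducer levels are hypothetical choices made for this example; in practice the profile would be translated into commands for the pumps or valves of the particular setup.

```python
def square_wave_input(t_min, period_min, duty_cycle, low, high):
    """Inducer level at time t_min for a periodic on/off input."""
    phase = (t_min % period_min) / period_min
    return high if phase < duty_cycle else low


# Hypothetical schedule: 60-minute period, inducer present half of the time,
# sampled every 15 minutes over two hours.
schedule = [(t, square_wave_input(t, 60, 0.5, 0.0, 1.0)) for t in range(0, 121, 15)]
for t, level in schedule:
    print(f"t = {t:3d} min -> relative inducer level {level}")
```

Other input shapes (ramps, pulses of varying length, sinusoids) can be defined in the same way and applied to the same chip, which is what makes the measurement of a transfer function practical.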

Fourth, the miniature size of the internal structures of a microfluidics chip allows for the easy provision of biological and technical replicates during an experiment. A fine understanding of the mechanisms driving biological parts requires reliable and representative data. However, biology, like all modern natural sciences, relies on sampling in order to describe the general behavior of a studied system. Statistically representative repetitions of the same experiment are therefore necessary before any conclusion can be drawn. In standard time-lapse microscopy, which is performed on agar pads in order to obtain single-layer cultures, obtaining a statistically representative set of results is extremely tedious. In contrast, the small size of the parts that make up a microfluidics device allows a battery of growth chambers and their supply channels to be combined easily. This provides the space and the means for the easy biological and technical replication of an otherwise complicated experimental setup.

Fifth, the miniature sizes also reduce the amount of consumables required for each experiment. In a typical biological lab setup, the medium (including all its expensive components such as inducers) used for a single batch culture amounts to 5 ml for a single test at a single combination of conditions. The same amount of medium is consumed in 10 hours by a microfluidics chip with hundreds of growth chambers. Within that period, and with the same overall amount of medium, a number of different conditions can also be tested, owing to the possibility of dynamically exchanging the conditions in a microfluidics experiment.
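The figures above translate into a mean flow rate and a medium budget per chamber as sketched below. The 5 ml volume and the 10 hour duration come from the text; the number of chambers (200) is an assumed example standing in for "hundreds".

```python
def mean_flow_rate_ul_per_min(volume_ml, duration_h):
    """Average flow rate needed to consume volume_ml in duration_h."""
    return volume_ml * 1000.0 / (duration_h * 60.0)


def medium_per_chamber_ul(volume_ml, n_chambers):
    """Medium available per growth chamber if the volume is shared equally."""
    return volume_ml * 1000.0 / n_chambers


flow = mean_flow_rate_ul_per_min(5.0, 10.0)    # 5 ml over 10 h
per_chamber = medium_per_chamber_ul(5.0, 200)  # 200 chambers is an assumption
print(f"Mean flow rate: {flow:.2f} ul/min, medium per chamber: {per_chamber:.0f} ul")
```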

Sixth, the procedure for obtaining an exponentially growing microbial culture in a single layer is straightforward and far less laborious. When traditional microscopy with slide and cover slip is used to prepare a sample culture in a single layer, one has to follow a challenging protocol. First, a fresh culture in the exponential phase of growth is prepared from an overnight culture. Then, some medium with a low agarose concentration (0.1-0.2%) is melted, poured on a microscope slide and left to cool down. Care must be taken because, as it cools, the agarose pad also loses moisture, and a substrate that is too dry would simply kill the microorganisms spread on top. Unfortunately, there is no objective way to measure the moisture in the pad, and the survival of the sample culture can only be confirmed later by direct observation under the microscope. Once the agarose is assumed to have reached the required temperature, a drop of the culture is carefully delivered onto the pad. At that point, the pad must be cut very quickly with another microscope slide so that its surface becomes smaller than a cover slip. Next, the cover slip is attached to the pad. If the procedure is stopped here, observations can be performed for 10-15 minutes, after which the agarose starts to dry rapidly under the heat of the microscope light source and the cells lyse. If the experimentalist wishes to prolong the observations, the agarose pad must be sealed between the microscope slide and the cover slip, usually with a layer of melted paraffin wax. However, this seal also prevents the exchange of oxygen, which triggers anaerobic growth of the organisms and hampers the proper folding of GFP and other reporter proteins.

In contrast, the observation of microbial cells growing in a single layer in a microfluidics chip requires only a few simple steps. First, the fresh culture is forced into the chip through its designated input port. Then, lateral stress is created by a shear movement of the supply tubes so that the cells are trapped in the growth chambers. Lastly, a proper medium supply is provided and the cells start to grow. The polymer normally used for the fabrication of microfluidics chips is polydimethylsiloxane (PDMS), a porous material that does not hinder gas exchange, including that of oxygen. The growing microorganisms are therefore provided with everything they need, and the period of observation of such a culture is limited only by the quality of the delivered medium and by internal physiological factors.

Last, but not least, the fabrication of microfluidics chips is easy, reproducible and cheap. The most widespread process consists of molding a polymer in a vessel whose bottom carries structures that are the negative of the intended channels and chambers of the chip. The cast thus obtained is then cut into the appropriate shape(s) and attached firmly to a hard surface. The latter seals the channels and chambers of the device, making it functional and ready to use. The substrate carrying the negative of the chip structure is usually quite rigid and can therefore be reused many (i.e. hundreds of) times. In addition, the size of that substrate allows a large number of devices to be fabricated in parallel, which renders the process very efficient. Finally, the price of the consumables for this type of fabrication is very attractive.

IV.2. Microfluidics devices design

The microfluidics devices used for the characterization of synthetic biology circuits in bacteria have two main functions. First, they need to allow for the continuous growth of the cells in appropriate structures (large traps, single layers, single lines, etc.). Second, they need to allow for dynamic control over the quality of the medium reaching the bacteria. In order to meet those requirements, such devices need to provide at least three types of spatial structures. First, they need to provide appropriate growth chambers for the microorganisms. Second, those chambers need to communicate with the incoming fluids through apertures and to allow for the easy removal of excess cells. Finally, there should be a system of channels that communicates with the outer environment of the chip through input and output ports and allows for easy dynamic exchange and/or mixing of the different input media.

IV.2.1. Growth chambers

The design of the first type of structures is defined by the spatial properties of the bacteria and the structural properties of the material used for the fabrication of the devices, namely PDMS. Whatever the shape of the growth chamber, the elastic properties of the material always limit the aspect ratio that can be used in the design of ceiling-collapse-proof devices. In the case of PDMS the maximal advisable support height:ceiling length ratio is 1:50. Furthermore, depending on the intended growth conditions, there are more requirements to be taken into account. For 2D growth, i.e. in a single layer, the horizontal dimensions of the chamber need not meet any further limitations apart from the ones imposed by the elasticity of the material. The vertical dimension, however, should not allow for stacking of cells and should therefore be between 1 and 1.5 times the diameter of the bacteria. Analogously, for 1D growth, i.e. in a single line, the size of the growth chamber is unlimited in only one of the horizontal dimensions. In the other two dimensions the size should again be between 1 and 1.5 times the diameter of the bacteria. Keeping all the above in mind, and given that the effective diameter of E. coli cells is approximately 1 micrometer, one can easily derive the practical limitations used in the design of growth chambers for this bacterium (Table 1).
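The rules of thumb above can be collected into a small design check. This is a sketch under stated assumptions rather than a validated design tool: the chamber labels and the function names are hypothetical, the 1:50 height-to-span ratio and the 1 to 1.5 cell-diameter window are taken from the text, and the unsupported span is computed from the actual depth (so a 1.5 µm deep layer permits spans up to 75 µm).

```python
MAX_SPAN_PER_HEIGHT = 50.0  # PDMS rule of thumb from the text: height:span = 1:50
MAX_DEPTH_FACTOR = 1.5      # confined dimensions: 1 to 1.5 cell diameters


def confined_range(cell_diameter_um):
    """Allowed size (um) of a dimension that must hold exactly one cell."""
    return cell_diameter_um, MAX_DEPTH_FACTOR * cell_diameter_um


def check_chamber(kind, depth_um, short_side_um, cell_diameter_um=1.0):
    """Return a list of violated design rules (an empty list means OK).

    kind is one of "deep", "single_layer", "single_line" (hypothetical labels).
    """
    if kind not in ("deep", "single_layer", "single_line"):
        raise ValueError(f"unknown chamber kind: {kind}")
    lo, hi = confined_range(cell_diameter_um)
    problems = []
    # Single-layer and single-line chambers must not let cells stack.
    if kind in ("single_layer", "single_line") and not lo <= depth_um <= hi:
        problems.append(f"depth must be within {lo}-{hi} um")
    if kind == "single_line":
        # The short horizontal side is confined like the depth.
        if not lo <= short_side_um <= hi:
            problems.append(f"width must be within {lo}-{hi} um")
    elif short_side_um >= MAX_SPAN_PER_HEIGHT * depth_um:
        # Otherwise the short side is limited by ceiling collapse.
        problems.append(
            f"unsupported span must stay below {MAX_SPAN_PER_HEIGHT * depth_um} um"
        )
    return problems


print(check_chamber("single_layer", depth_um=1.0, short_side_um=40.0))
print(check_chamber("single_layer", depth_um=1.0, short_side_um=60.0))
```

The default cell diameter of 1 µm corresponds to E. coli; for other organisms only that argument needs to change.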

| Chamber type | Vertical size limitations | Long horizontal size limitations | Short horizontal size limitations |
|---|---|---|---|
| Deep growth chamber | z | y | <50*z |
| Single layer growth chamber | 1-1.5 µm | y | <75 µm |
| Single line growth chamber | 1-1.5 µm | y | 1-1.5 µm |

Table 2: Design limitations for different types of growth chambers used for cultivation of E. coli

The second type of structures that the design of a microfluidics chip needs to provide are the apertures of the growth chambers. Those structures need to allow for sufficient exchange of compounds between the chamber and the supply channels so that the cells do not starve. In addition, there should also be apertures for seeding the chambers with the starting culture. Finally, the interface between the chambers and the channel system should prevent the complete washout of the bacteria so that a continuous culture can be maintained.

There are several possible design approaches for accomplishing those tasks. First, all the above-mentioned requirements can be met by a single aperture between the growth chamber and the supply channel. In this case the design considerations are fairly simple: the channel interface of the chamber should be large enough to allow cells to enter the chamber, yet as small as possible so that it blocks the exit of the same cells. This problem has only one solution, namely that the height of the aperture matches the diameter of the cells to be grown in the chip. This way the growth chamber can easily be seeded with cells. The dynamics of the fluids in the channels can also be maintained such as to allow for active diffusion of nutrients into the chamber. Finally, the surplus cells can be pushed out of the growth chamber by the other cells through the same aperture. Such design solutions are commonly used for single-layer or single-line growth devices. In this case the growth chambers are completely open to the supply channels laterally, and the height of the aperture is the same as the height of the growth chamber, i.e. between 1 and 1.5 times the diameter of the bacteria. Thus, the growing cells are retained in the chamber mechanically (they are literally “squeezed” inside) and the complete washout of the culture is impossible.

A similar design is the one in which the chamber opens to the supply channel through its ceiling. This approach is used for deep growth chambers and provides a large opening for the exchange of nutrients and for cell seeding. For cell retention, however, those devices rely on the weight of the microorganisms: the cells are not washed out of the chamber because they are pressed down by their own weight. Owing to the completely laminar flow in microfluidics channels, the only cells that enter the fluid flow are the excess ones, which protrude out of the growth chamber and into the supply channel.

Finally, there are microfluidics growth chamber designs with dedicated separate apertures for cell seeding and retention. In this type of chamber the seeding opening is similar to the first case discussed, where the aperture was simply the side of the chamber facing the supply channel. The cells are forced inside the chamber with the flow of culture, which is directed right through it. Thus, the cell seeding is not accidental and can be automated. This is in contrast to the approach based on shear lateral stress in the fluid flow, in which successful chamber seeding is highly random. However, because of mass conservation, the fluid forced through the growth chamber must eventually be allowed to leave. If this is provided simply by a second opening, there is a high probability that the cells will be washed directly out of the chamber. Therefore, the other opening needs to be constricted so that it blocks the cells inside the chamber. This constriction can be horizontal or vertical. In the first case (a constriction in the horizontal dimension), if the organism to be studied is E. coli, a vertical slit as high as the chamber and 500 nm wide could be used. However, a horizontal resolution of this order is very difficult to obtain with typical soft lithography and requires an expensive quartz crystal photomask and high-resolution UV projection equipment. In the second case (a constriction in the vertical dimension), a horizontal slit that is again 500 nm high requires an additional layer of this thickness to be constructed on the wafer, which in turn requires an additional photomask and one more coating, alignment and exposure procedure. If properly made, however, the horizontal dimensions of this additional feature require only a simple ink photomask and can be projected with standard optics.

When the cells are trapped in such a chamber, they receive nutrients from the medium flow either through the constricted opening or through the seeding aperture. In the first case the growing and dividing cells can move only towards the seeding opening. The excess cells are extruded through this aperture; however, the medium flow is at the opposite side of the chamber, so these cells are not removed. This would cause accumulation of cells and possibly clogging of the whole chip. Therefore, a dedicated flow should be continuously generated in order to wash those cells away. If, on the other hand, the cells receive their nutrients through the seeding opening, this can happen in two ways. First, the medium might flow in a channel parallel to the chamber, and the same flow can also be used to remove the excess cells. The other option is to force the medium through the growth chamber the same way the cells were seeded. In this case, however, the extruded cells would be pressed by the incoming medium flow against the aperture, where they would mechanically block the exchange of nutrients and again cause clogging of the chip. Therefore, in this type of design, a shunting channel running alongside the growth chamber is fabricated. In this latter case the three functions of the chamber apertures are all performed by separate dedicated structures: an opening for cell seeding, a constricted aperture for cell retention and a shunting channel for cell removal.

IV.2.1.1. Examples

A large number of successful microfluidics designs have been used in high-quality research aimed at an in-depth understanding of fundamental cellular processes or at the characterization of engineered biological parts. We present just a few of them, chosen for the simplicity of their design and the efficiency of their function. Some of them were designed and used for experiments with bacterial species; others were developed for the growth and observation of yeast or mammalian cells. In any case, the design principles adopted can readily be transferred to meet the needs of future research.

The first example we would like to consider in some detail is a very simple device for 1D growth of E. coli used for fundamental research in the field of bacterial growth (Wang et al., 2010). The growth part of the microfluidics chip consists of a series of narrow 1D growth chambers intersecting with a large supply channel (Illustration 27). The latter is used for culture inoculation, medium transport and removal of excess cells. Upon inoculation of a growth channel, the cells that are trapped inside can only position themselves along the chamber. Through growth and division the bacteria move and fill the whole growth channel. However, when they reach the end of the growth chamber, their displacement is blocked by the dead end, and the only direction in which they can continue to move is towards the aperture to the supply channel. As a result, the cell positioned at the dead end of the growth chamber cannot be displaced. Interestingly, this means that even after numerous divisions this cell remains a direct descendant of the end-placed cell that inoculated the chamber in the first place. Furthermore, because of the unidirectional displacement of all the cells inside the same chamber, after a small number of divisions all the cells trapped in this chamber are progeny of the same cell. Hence, the authors refer to the latter as the “mother cell” and named the whole microfluidics device the “mother machine”. The mother machine was successfully used to study the elongation of bacterial cells during the exponential growth phase. One of the findings was that, with some regularity, the mother cell elongates excessively (“filamentation process”) and, upon uneven division, regains its normal size in the next generation. One possible explanation proposed was that the old poles accumulate in the mother cell and that filamentation is a method for removing aging stress factors from the entire population.

Illustration 27: Design of the main part of the Wang et al. microfluidics device. The main channel is used for supply of nutrients and cell removal. The cells are trapped in the growth channels and grow only in one dimension towards the main channel.
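The lineage argument for the mother machine can be illustrated with a toy simulation. This is a deliberately idealized sketch, not a model from Wang et al.: all cells are assumed to divide synchronously, daughters stay adjacent, and the channel capacity of 8 cells and the three starting labels are hypothetical.

```python
def grow_one_generation(cells, capacity):
    """Every cell divides once; daughters stay adjacent, and cells pushed
    beyond the channel capacity are lost to the supply channel.
    Index 0 is the dead end of the growth channel."""
    doubled = []
    for lineage in cells:
        doubled.extend([lineage, lineage])
    return doubled[:capacity]


# Three cells of different origin are trapped; "mother" sits at the dead end.
channel = ["mother", "b", "c"]
history = [channel]
for _ in range(3):
    channel = grow_one_generation(channel, capacity=8)
    history.append(channel)

for generation, cells in enumerate(history):
    print(generation, cells)
```

In this idealized setting the lineage at the dead end takes over the whole channel once 2 to the power of the generation number reaches the channel capacity, which is the "small number of divisions" referred to above.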

Another remarkable microfluidics device design was proposed for research on yeast behavior (Rowat, Bird, Agresti, Rando, & Weitz, 2009). This chip was also created to allow for the growth of microorganisms in a single line; however, the authors had a different aim and hence adopted a different design approach. The aim of the paper was to study the heterogeneity of gene expression in yeast by comparing the production of a fluorescent reporter controlled by different genes. One of the aims of the authors was to show that some proteins exhibit synthesis heterogeneity through different generations, whereas others retain a relatively stable expression pattern. Therefore, the microfluidics device had to allow for the simultaneous observation of cells from the same cell lineage but from different generations. The approach was, first, to create a growth chamber in which inoculation would occur with only one starting cell, which guarantees that all the cells observed in this chamber are from the same lineage. Second, the chamber had to be as long as possible so that many generations of the initial cell could be studied in parallel. However, if the feeding channel is connected to the growth chamber laterally, as in the previous design, nutrients reach the cells only by diffusion. As discussed before, the distance of effective diffusion is limited in the growth channels. Consequently, a different type of design was adopted. Instead of diffusing from the supply channel, the nutrients were delivered directly to the cells with the whole medium flow, which was passed through the growth chamber. In order to avoid cell washout, the chamber was constricted laterally, which allows the medium to pass but blocks the removal of cells.

The explicit design (Illustration 28) consists of individual elements, each comprising a growth channel that is constricted at the end opposite the nutrient supply and connected to a shunting channel at the supply end. The growth channel is 5 µm wide and deep, which allows a single yeast cell to fit. The shunting channel is 2 times wider, which results in approximately 2 times smaller resistance and hence a 2 times larger fluid flow rate. The device is initially loaded with a mixture of concentrated starting cell culture and fresh medium. Most of this inoculum passes through the shunting channels of the individual growth units. However, a yeast cell occasionally enters one of the growth channels. This blocks the narrow passage at the end of the growth chamber and increases the resistance significantly. Consequently, most of the flow and virtually all the newly arriving cells pass through the shunting channel. This way, each growth chamber is loaded with a single cell that gives rise to the studied lineage. The growth channel length of 500 µm allows for the parallel growth and observation of 8 generations of yeast cells.

Illustration 28: Outline of a microfluidics device for the growth of yeast cells in a single line. The device consists of a series of repeated growth elements (left). An individual growth element (right) comprises a narrow growth channel and a wide shunting channel. When there is no cell trapped in the growth channel (A), the flow rate through the growth channel is half of the flow rate through the shunting channel. When a cell is trapped in the growth channel (B), the resistance increases and the flow is mostly redirected through the shunting channel.

This design is an example of a flow-through single-line horizontal chamber with horizontal constriction. As discussed above, such devices need dedicated mechanisms for the removal of excess cells. In this case the shunting channel provides this function. The new cells protruding out of the growth chamber are carried along with the fluid flow passing through the shunting channel and are later exported with the rest of the waste flow out of the chip. The precise dimensions of this chip are not appropriate for the cultivation of E. coli; however, the size of the channels can easily be adapted for other microbial species. We have designed a similar device with a depth of 1.5 µm and with widths of the growth and shunting channels of 1.5 µm and 3 µm, respectively.
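The redistribution of flow between the growth channel and the shunting channel can be sketched by treating the two channels as hydraulic resistors in parallel, in analogy with an electrical circuit. The 2:1 resistance ratio is the approximate figure quoted in the text; the parallel-resistor picture and the 50-fold resistance increase caused by a trapped cell are assumptions made only for illustration.

```python
def split_flow(total_flow, resistances):
    """Divide a total flow among parallel channels that share the same
    pressure drop: each branch carries a flow proportional to 1/R."""
    conductances = [1.0 / r for r in resistances]
    total = sum(conductances)
    return [total_flow * g / total for g in conductances]


R_SHUNT = 1.0                   # arbitrary units
R_GROWTH_EMPTY = 2.0 * R_SHUNT  # growth channel: about twice the resistance
BLOCKAGE_FACTOR = 50.0          # hypothetical increase when a cell is trapped

empty = split_flow(1.0, [R_GROWTH_EMPTY, R_SHUNT])
blocked = split_flow(1.0, [BLOCKAGE_FACTOR * R_GROWTH_EMPTY, R_SHUNT])
print(f"Empty growth channel:   growth {empty[0]:.3f}, shunt {empty[1]:.3f}")
print(f"Blocked growth channel: growth {blocked[0]:.3f}, shunt {blocked[1]:.3f}")
```

With an empty growth channel the shunt carries twice the flow of the growth channel; once a cell is trapped, almost all of the flow, and with it the newly arriving cells, goes through the shunt.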

Another important type of microfluidics device for the characterization of bacteria is the one that utilizes 2D growth chambers. These provide the opportunity for single-layer cell growth and thus for single-cell observations. With this type of device a much larger number of cells can be studied simultaneously; however, image processing steps such as cell segmentation and tracking are much more challenging. Among the best examples of such devices are the microfluidics chips used by the group of Jeff Hasty for the study of genetic oscillators in E. coli (Stricker et al., 2008). The growth chamber (denoted “trap” in the paper) in this chip is approximately 1 µm deep, which forces the bacteria to grow in a single layer by preventing their stacking in the vertical direction (Illustration 29). The horizontal shape of the trap is rectangular, with an edge of 50 µm perpendicular to the supply channel and a length of 40 to 60 µm along the channel. The rectangular shape of the growth chamber forces the cells to organize themselves in parallel lines perpendicular to the supply channel, thus allowing for an opening to be available in the growth direction (Volfson, Cookson, Hasty, & Tsimring, 2008). This phenomenon organizes the cellular displacement during growth and renders cell tracking during image processing somewhat easier. Given chamber dimensions of 50 µm × 50 µm and an average cell size of 5 µm × 1 µm, we obtain a theoretical maximum of 500 cells per growth chamber, while 200 cells are observed in reality. For this number of objects, segmentation and object tracking are very demanding tasks. Additionally, the excess cells are pushed from the growth chamber directly into the main channel, from where they are washed directly to the waste port, which makes the supply channel also a waste removal channel. It is interesting to note that, unlike in the previous example, here the structural constriction is in the vertical direction.


Illustration 29: Design of a microfluidics device with rectangular growth chamber for 2D growth of E. coli. The main channel in the middle is both medium supply and waste removal duct. The depth of the growth chambers is the same as the cell diameter.
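The capacity estimate for the 50 µm × 50 µm trap follows from a simple area ratio, sketched below with the dimensions quoted in the text. The function name is hypothetical, and the estimate ignores cell shape and packing defects, which is one reason why the observed number is lower.

```python
def max_cells(chamber_x_um, chamber_y_um, cell_length_um, cell_width_um):
    """Upper bound on the number of cells in a single layer (area ratio)."""
    return (chamber_x_um * chamber_y_um) / (cell_length_um * cell_width_um)


theoretical = max_cells(50.0, 50.0, 5.0, 1.0)
occupancy = 200 / theoretical  # 200 cells observed per chamber in practice
print(f"Theoretical maximum: {theoretical:.0f} cells, occupancy: {occupancy:.0%}")
```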

Generally speaking, in microfluidics devices horizontal and vertical features are interchangeable depending on the requirements of the design, the fabrication limitations and the intended functionality. The group of Jeff Hasty generously provided us with the design and the fabrication know-how for this type of device, for which we are very grateful. After the necessary modifications, we have been able to use microfluidics chips based on this design, re-engineered for the aims of our research, in a number of papers (Carrera, Rodrigo, Singh, Kirov, & Jaramillo, 2011), (Fetita, Kirov, Jaramillo, & Lefevre, 2012).

IV.2.2. Channel system

The last element of a microfluidics chemostat we need to look at is the channel system. As discussed above, this system has two main functions: 1) to provide an input/output interface between the growth chambers and the environment and 2) to provide the means for dynamic control over the medium reaching the growth chambers. The interface between the growth chambers and the channels was discussed in detail in the previous section. The interface between the channels and the outer environment of the chip is usually provided by input and output ports. Both types of ports have the same design and are connected to supply or waste tubing through standard metallic fittings. The metallic fittings are forced into the ports and are practically sealed there owing to the elastic properties of the chip material. The ports themselves are fabricated manually with a special instrument (hole puncher) of gauge 21. The fittings are a bit larger (gauge 23), which allows them to stick tightly in the perforation. The positions of the ports are normally marked with large, comfortable landing pads that can easily be seen with the naked eye. The pads usually consist of circles that have a diameter of a few hundred micrometers to 1 mm and are directly connected to the channel system. Thus, the fluids flowing through the metallic fittings enter or leave the channel system of the chip.

The horizontal channels that can be produced by soft photolithography have a rectangular cross-section. The depth of the channels is defined by the thickness of the photoresist layer produced during the spin-coating process: the lower the spinning rate, the thicker the photoresist layer. Since there is a minimal rotation speed for the even coating of a wafer with photoresist, the depth of the channels is limited technologically. A typical depth used when working with E. coli is 10 µm. The width of the channels should be large enough to prevent clogging and to provide a fluidic resistance low enough for the flow to be driven by lower pressures.


As discussed previously, in microfluidics the miniature dimensions of the channels render fluid inertia insignificant with respect to viscosity. A measure for this ratio is the dimensionless quantity called the Reynolds number (George & Stokes, 1905; Reynolds, 1883):

$$Re = \frac{\rho \times v \times L}{\mu} = \frac{v \times L}{\upsilon} \qquad (18)$$

Where Re is the Reynolds number, v is the linear speed of the fluid (m/s), L is the linear dimension of the channel cross-section (m), µ is the dynamic viscosity of the fluid (kg/(m·s)), υ is the kinematic viscosity (m²/s) and ρ is the fluid density (kg/m³).

The value of this number is assumed to define the type of flow observed in a fluid under certain conditions (linear speed). Reynolds numbers above 10³ define turbulent flow, between 40 and 10³ the flow is assumed to be mixed (with occasional vorticity), and flows at very low Reynolds numbers (smaller than 40) are assumed to be strictly laminar. The Reynolds number for water in microfluidics-sized channels is approximately 10⁻², and the flow is therefore purely laminar. Hence, there is no need to calculate complex fluid dynamics and the energy dissipation connected to it. The only two relevant questions regarding the fluid dynamics inside the microfluidics channel network are how much fluid will be transported through the channels and which way it will go at the channel junctions.
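The order of magnitude quoted above can be checked with a few lines of Python. This is only a minimal sketch: the water properties (density 1000 kg/m³, dynamic viscosity 10⁻³ kg/(m·s)), the flow speed of 1 mm/s and the choice of the 10 µm channel depth as the characteristic dimension are illustrative assumptions, and the function names are hypothetical.

```python
def reynolds_number(density, speed, length, dynamic_viscosity):
    """Reynolds number Re = rho * v * L / mu, as in equation (18)."""
    return density * speed * length / dynamic_viscosity


def flow_regime(re):
    """Classify the flow using the thresholds quoted in the text."""
    if re > 1e3:
        return "turbulent"
    if re >= 40:
        return "mixed"
    return "laminar"


# Illustrative values (assumptions, not measurements from this work)
RHO_WATER = 1000.0   # fluid density, kg/m^3
MU_WATER = 1.0e-3    # dynamic viscosity, kg/(m*s)
speed = 1.0e-3       # assumed linear flow speed, m/s
length = 10e-6       # channel depth used as characteristic dimension, m

re = reynolds_number(RHO_WATER, speed, length, MU_WATER)
print(f"Re = {re:.1e} -> {flow_regime(re)}")
```

With these assumed values the Reynolds number is of the order of 10⁻², several orders of magnitude below the laminar threshold.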

The quantity of the transported fluid or the volumetric flow rate through a microfluidics channel is proportional to the pressure drop (i.e. the pressure difference) across this channel and inversely proportional to the hydraulic resistance of the channel (Choi, Lee, & Park, 2010) :


$$\Delta P = R_h \times Q \qquad (19)$$

Where ΔP is the pressure drop (N/m²), Rh is the hydraulic resistance (N·s/m⁵) and Q is the flow rate (m³/s).

In a simple situation the channel will end with an output port and the pressure at this end will be zero. Therefore, the bigger the pressure at the other end, the faster the fluid will flow and the bigger the volume transported per unit of time. On the other hand, there is friction between the walls of the channels and the flowing fluid; this friction slows down the fluid and consequently reduces the volumetric flow rate. The bigger the part of the fluid that is in contact with the walls, the bigger the resistance of the channel. A large cross-section means that a smaller proportion of the fluid is in contact with the walls; hence, the wider and higher a channel is, the smaller its resistance. Conversely, a longer channel means longer contact between the fluid and the channel, thus more friction and higher resistance. An approximate expression for the hydraulic resistance of a channel with rectangular cross-section is the following (Bruus, 2008):

$$R_h \approx \frac{12 \times \mu L}{w \times h^{3} \times \left(1 - 0.630 \times h / w\right)} \qquad (20)$$

Where w and h are the width and the height of the channel, respectively, and h is assumed to be the smaller of the two dimensions. Note that the height carries more weight for the hydraulic resistance than the width, since it enters the expression to the third power: in a shallow channel the velocity profile is set mainly by the short distance between the top and the bottom walls.
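The relative weight of the height and the width in equation (20) can be illustrated numerically. The following is a minimal sketch, assuming water (µ = 10⁻³ kg/(m·s)) and a hypothetical channel of 1 cm × 50 µm × 10 µm; the function names are introduced here for illustration only.

```python
def hydraulic_resistance(length, width, height, viscosity=1.0e-3):
    """Approximate hydraulic resistance (N*s/m^5) of a rectangular channel,
    equation (20). Dimensions in m, viscosity in kg/(m*s).
    The approximation assumes height <= width.
    """
    if height > width:
        raise ValueError("equation (20) assumes height <= width")
    return 12.0 * viscosity * length / (
        width * height**3 * (1.0 - 0.630 * height / width)
    )


def flow_rate(pressure_drop, resistance):
    """Volumetric flow rate Q = dP / Rh (m^3/s), from equation (19)."""
    return pressure_drop / resistance


# Hypothetical channel: 1 cm long, 50 um wide, 10 um deep, filled with water
r_ref = hydraulic_resistance(1e-2, 50e-6, 10e-6)
r_half_height = hydraulic_resistance(1e-2, 50e-6, 5e-6)
r_half_width = hydraulic_resistance(1e-2, 25e-6, 10e-6)

print(f"reference resistance: {r_ref:.2e} N*s/m^5")
print(f"halving the height multiplies Rh by {r_half_height / r_ref:.1f}")
print(f"halving the width multiplies Rh by {r_half_width / r_ref:.1f}")
print(f"flow at an assumed 10 kPa drop: {flow_rate(1.0e4, r_ref):.2e} m^3/s")
```

For this geometry, halving the height increases the resistance several times more than halving the width does.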

In the typical case the microfluidics device consists of a network of channels with different resistances, and the cumulative resistance of the system needs to be derived for the flow to be assessed. If the flow is unidirectional, there are two possible combinations of channels in a network. The first is the parallel organization of the channels. The total flow rate through such a network is simply the sum of the flow rates through the individual channels. The pressure drop that drives the flow is the same for all the channels, and each individual flow is defined by the resistance of the respective channel. Since the flow rate is inversely proportional to the resistance, the total flow rate is inversely proportional to the cumulative resistance, and each individual flow rate is inversely proportional to the resistance of its channel. Consequently, the reciprocal value of the cumulative resistance is simply the sum of the reciprocal values of the individual resistances (Akers, Gassman, & Smith, 2006):

$$\frac{1}{R_t} = \frac{1}{R_1} + \dots + \frac{1}{R_n} \qquad (21)$$

Where Rt is the cumulative resistance, R1..n are the individual resistances and n is the total number of channels in the analyzed system.

Conversely, if the channels are organized in series, the cumulative flow rate is the same as the flow rate through each individual channel. The pressure drop across the whole system drives the fluid through each consecutive channel of the series, and in each channel the local characteristic resistance exerts friction on the flow. Thus, the pressure decreases along each individual channel until it reaches the basal pressure at the exit. Therefore, the pressure drops across the individual channels sum up to the total pressure drop. It was already shown that the flows through all individual channels are equal, and we know that the resistances are proportional to the pressure drops. Therefore, the resistances of the individual channels add up in the same way as the respective pressure drops. In other words, for systems of channels connected in series, the total resistance of the system is equal to the sum of the resistances of the individual channels (Akers et al., 2006):

$$R_t = R_1 + \dots + R_n \qquad (22)$$
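Equations (21) and (22) translate directly into two small helper functions. The sketch below is illustrative only: the network (one supply channel feeding two identical branches in parallel), the resistance values and the 10 kPa pressure drop are all assumptions.

```python
def series_resistance(resistances):
    """Cumulative resistance of channels in series, equation (22)."""
    return sum(resistances)


def parallel_resistance(resistances):
    """Cumulative resistance of channels in parallel, equation (21)."""
    return 1.0 / sum(1.0 / r for r in resistances)


# Hypothetical network: one supply channel followed by two identical
# branches in parallel (all values in N*s/m^5 are assumptions)
r_supply = 2.0e15
r_branch = 6.0e15

r_branches = parallel_resistance([r_branch, r_branch])
r_total = series_resistance([r_supply, r_branches])

pressure_drop = 1.0e4              # assumed driving pressure, N/m^2
q_total = pressure_drop / r_total  # equation (19), m^3/s

print(f"parallel branches: {r_branches:.2e} N*s/m^5")
print(f"whole network:     {r_total:.2e} N*s/m^5")
print(f"total flow rate:   {q_total:.2e} m^3/s")
```

Two identical branches in parallel present half the resistance of a single branch, so adding replicate growth regions in parallel lowers the cumulative resistance of the network.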


It is interesting to note the strong resemblance between the processes of fluid transfer through a microfluidics channel and electrical charge transfer through a conductor. In both cases a large number of small particles move in an organized manner along a spatially limited corridor. In the process of this movement the material of the transfer channel exerts friction on the particles and slows down their movement, i.e. the channel resists the current. Hence, channels with a larger cross-sectional area are characterized by smaller resistances and longer channels are characterized by higher resistances. This resemblance in the nature of the processes leads to homology in the mathematical description of the laws governing them. Indeed, the equation we formulated to connect the pressure drop, the volumetric flow rate and the hydraulic resistance (19) has exactly the same form as Ohm's law (Ohm, 2010) for electrical current:

V = I × R (23)

Where V is the potential difference, also called voltage drop (V), I is the electrical current (A) and R is the electrical resistance (Ω). Furthermore, the principle of conservation of electric charge gives rise to Kirchhoff's current law, stating that the algebraic sum of all electric currents entering and leaving a node is zero:

$$\sum_{k=1}^{n} I_k = 0 \qquad (24)$$

Where n is the total number of branches. Finally, the principles that we used to formulate the method for calculating the cumulative resistance of a system of channels, (21) and (22), are also valid for electrical circuits, and the cumulative electric resistance of a complex circuit is calculated in exactly the same manner.

The possibility for dynamic control over the medium reaching the growth chambers requires the capacity for selective directing of different media through the same feeding channel. This can be accomplished only through complex channel nodes where flows with different directions meet and exert pressure on each other. This behavior cannot be described in terms of systems of parallel or serial channels. To understand how such junctions function, the law of mass conservation is used. According to that law, in a closed system the total amount of mass is constant and cannot change except through mass or energy transfer. The fluid flow junction, or node, is a simple system with constant volume, which can contain a constant amount of fluid. For each fluid volume entering the node, the same volume of fluid must also leave the node. In other words, the algebraic sum of the fluid flows passing through a given channel node is zero. In practical terms, we could imagine two channels supplying two different media that meet at a node, from which a channel feeding medium to the growth chambers also starts. The medium with the higher flow rate would predominantly enter the supply channel. However, the other fluid flow cannot disappear. It will either also enter the supply channel, but at a lower rate, or be turned backwards by the pressure exerted by the other medium. Maintaining the precise equilibrium at which one of the media is simply blocked in its channel and only the other medium is supplied is virtually impossible. More sophisticated designs are required to secure reliable switching between two different media.
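The behavior of such a simple junction can be sketched by combining equation (19) with mass conservation at the node. The model below is an assumption made for illustration: two inlet channels driven by pressures p1 and p2 join at one node, from which a single outlet channel leads to a port at zero pressure. Requiring that the flows into the node equal the flow out of it gives the node pressure, and from it the individual flows. All numerical values and names are hypothetical.

```python
def junction_flows(p1, p2, r1, r2, r_out):
    """Flows (m^3/s) at a node joining two inlets and one outlet at zero pressure.

    Mass conservation, (p1 - pn)/r1 + (p2 - pn)/r2 = pn/r_out, is solved for
    the node pressure pn. A negative inlet flow means that this medium is
    pushed backwards into its own channel.
    """
    p_node = (p1 / r1 + p2 / r2) / (1.0 / r1 + 1.0 / r2 + 1.0 / r_out)
    q1 = (p1 - p_node) / r1
    q2 = (p2 - p_node) / r2
    q_out = p_node / r_out
    return q1, q2, q_out


R = 1.0e15  # assumed identical resistance of all three channels, N*s/m^5

# Moderate pressure difference: both media enter the supply channel
q1_a, q2_a, q_out_a = junction_flows(1.5e4, 1.0e4, R, R, R)

# Large pressure difference: medium 2 is turned backwards (q2 < 0)
q1_b, q2_b, q_out_b = junction_flows(3.0e4, 1.0e4, R, R, R)

for label, q2 in (("moderate difference", q2_a), ("large difference", q2_b)):
    direction = "enters the supply channel" if q2 > 0 else "is pushed backwards"
    print(f"{label}: medium 2 {direction} (q2 = {q2:.2e} m^3/s)")
```

In this model the second medium is exactly blocked only when p2 = p1 × r_out / (r1 + r_out), a single point between the two regimes, which illustrates why that equilibrium is so difficult to maintain in practice.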

All the previously discussed design considerations are a good basis for the engineering of efficient microfluidics chips that can be used for complex biological experiments with a precisely and dynamically controlled environment. However, there is one last important point to be taken into consideration for the design of microfluidics chips. Even the chips with the best design fail occasionally, and the most common reason for the failure of a microfluidics experiment is channel clogging. Living cells tend to adhere to hard surfaces and produce substances that attract other cells and help the further attachment of more and more cells. A single bacterium that manages to attach itself to the channel wall at a crack or a sharp curve will divide every 30 minutes and after a few hours will give rise to a whole colony. The colony will grow until it completely blocks the flow through the channel. At this point the nutrient supply will cease, the cells will start to starve and will enter the stationary state, which will make them more compact and difficult to remove. The occlusion of a supply channel will disturb the fluid dynamics throughout the whole chip and will lead to the starvation of at least part of the cells in the growth channels. Additionally, the clogging cells will produce unknown substances and release them into the common supply flows. Eventually, the whole experiment will be compromised and will need to be repeated; a lot of resources in terms of media, microfluidics chips, other consumables and, most importantly, labor will be wasted. Therefore, careful design also includes anti-clogging measures.

There are two different goals to be achieved in order to avoid the risk of experimental failure due to clogging. Firstly, the opportunities for the cells to adhere to a hard surface should be reduced to the minimum. A bacterium can attach itself to the channel wall if it is retained there for a longer period of time. On one hand, this retention might be provoked by mechanical imperfections of the wall. Occasional fabrication imperfections are unavoidable, and good quality control of the newly produced chips should be exerted so that the defective devices are rejected. On the other hand, sharp turns and edges in the channels may generate dead zones with very low or zero flow. Cells would virtually sediment in those zones and try to adhere to some hard surface. Therefore, sharp edges should be avoided whenever possible, e.g. if a 90° turn is needed, an arc and not a corner should be used (Illustration 30).

The second goal to be accomplished in order to reduce the risk of clogging is redundancy in the design. The growth chambers and their supply channels should always be designed in replicates, which allows the same experiment to be performed in a few alternative parts of the chip in case of clogging. The most straightforward manner to develop such a chip is to repeat the same structure several times. The supply channels for all replicas should branch from the same node in order to secure exactly the same medium supply for all of them. Also, the design of each repetition of the growth region should be exactly the same. An easy way to spatially arrange such a battery of devices is the parallel or radial organization. The channel for waste removal is also prone to clogging and should be treated in the same way as the supply channel. In addition, the redundancy in the design also provides the capacity for experimental repetitions, which are very important for reliable data analysis.

Illustration 30: Shapes of channels in microfluidics devices accessible to microorganisms. A) Sharp turns and corners should be avoided; B) Arches are the preferred solution when a change of direction is required.

Finally, the channel system should also provide sufficient nutrient transport for normal cell growth. Our experimental evidence shows that the final part of the supply channels, the one which communicates directly with the growth chambers, has to be 10 µm deep and at least 50-60 µm wide. However, the single-layer and single-line chambers are about 1 µm deep. Therefore, there are at least 2 depth levels in some of the most commonly used microfluidics chips.

Illustration 31: Finite element modeling simulation of the fluid flow in devices with similar designs. The device consists of two supply channels, between which are positioned rectangular growth chambers. In the first design (A) the supply channels are ten times deeper than the growth chambers. In the second case (B) the supply channels are as deep as the growth chamber.

Given all the considerations previously discussed, one can outline the main features of a microfluidics chip with the desired functional properties. The shape and dimensions of the growth chambers can be chosen, the dimensions that would produce channels with the required local resistances can be calculated, and junctions for media switching or mixing devices can be sketched. Once the final design has been decided upon, its graphical representation should be generated with suitable software. Although simpler packages could also be used, AutoCAD is the superior product for engineering purposes. In addition, the student version of AutoCAD is currently free to use, which makes it convenient for academic purposes. One has to keep in mind that each depth level in the microfluidics chip requires a separate design sketch in vector graphics form.

Unfortunately, the design process does not stop here. While some local dynamic properties of the microfluidics system can be estimated by simple calculations, possibly aided by a small script, the behavior of a complex system of interconnected channels and chambers is not so easy to predict. The transient behavior of the system during regime changes is practically impossible for a non-specialist to model. Therefore, the only practically applicable approach for the final evaluation of the chip design is finite element modeling. There are a number of available software packages for this purpose, the most widely used of which is COMSOL. AutoCAD and other vector format designs can be directly imported in COMSOL. After the successful generation of the finite element mesh of the design model, one can simulate fluid flow in different regimes, including transient behavior (Illustration 31). In addition, diffusion can be modeled independently or in combination with the fluid dynamics. If, at this point, the microfluidics device design does not exhibit the required behavior, the design needs to be revised. Once the model simulation shows the desired dynamics, the fabrication process may start.

IV.2.2.1. Examples

An example of a carefully designed switching device for microfluidics flows, based on differential flow pressures, was also engineered by the Jeff Hasty group (Ferry, Razinkov & Hasty, 2011). In this device the two incoming media do not simply compete for the same supply channel; there are also side shunts that allow the excess fluid to be redirected elsewhere (recycled back to the proper source channel or to the waste channel). Since the device is a standard microfluidics chip, which does not allow different vertical levels of supply, the shunting channel cannot be the same for the two different media. Therefore, overall, at the shunting point there are two incoming fluids supplied by two independent channels and three outgoing channels (Illustration 32). One of those outgoing channels is the supply channel feeding the cells in the growth region of the chip and the other two are the shunting channels.


Illustration 32: A schematic representation of a passive microfluidics switch device. The supplied media meet at the junction and are redistributed. The side shunts always transport the respective input media. The fluid conducted towards the cells by the supply channel is defined by the difference between the pressures of the input fluids.

The media entering the junction do not mix, because of the laminar flow. Each of the media maintains its position relative to the other for a long time. In addition, as already discussed, by analogy with Kirchhoff's law (24), the sum of all the flows entering a node should be equal to the sum of the flows leaving the node. Therefore, the total amount of fluid carried by the supply channel and the shunting channels is equal to the sum of the fluid delivered by the two input channels. The distribution of the volumetric flow rates between the separate outgoing channels depends only on the ratio between the hydraulic resistances of those channels. With careful design the resistances can be engineered in such a way as to produce a reliable and predictable flow of medium (or media) in the main supply channel. The utilization of such switches generally involves supplying the alternative media at such pressures that the selected medium fills not only its own shunt and the main output channel; some of it should also be forced down the other shunt. This is done to protect the supply flow from the local instabilities occasionally provoked in the fluid flow system of the chip. In practice, if the resistances of the output channels are as follows:

$$R_{hS1} : R_{hF} : R_{hS2} = 1 : 2 : 1$$

and if medium1 is required to be delivered, then

$$\frac{P_{M1}}{P_{M2}} > \frac{3}{1}$$

Where RhS1 , RhF and RhS2 are the hydraulic resistances of the shunt for medium1, of the feeding channel and of the shunt for medium2 respectively; and PM1 and PM2 are the input pressures of medium1 and medium2.

IV.2.2.2. Channel systems engineered for this research

We utilized a similar design for a medium switching device (Illustration 33) for the experiments in our genetic oscillator forcing paper (Rodrigo, Kirov, Shen, & Jaramillo, 2013). Our design also consists of two input channels for the two alternative media and three output channels: two shunts and one feeding channel for the cells. However, those channels do not meet at a node, but in a short connecting tube. The channel delivering nutrients to the growth chambers starts at the middle of the connecting tube. This way, the flow of one of the incoming media is split into two branches, one towards the nearer waste shunt and another towards the farther shunt through the connecting tube, passing by the supply channel and feeding the latter. The other medium flow is directed entirely towards its nearer waste shunt. Consequently, when the media are supplied with sufficiently different pressures, only one medium is delivered to the cells and the switching is complete. This type of switch is a bit more resistant to local pressure instabilities compared to the previous example. However, it requires larger pressure differences between the input media for clean switching without leaks of the undesired medium towards the cell growth region of the chip.

Illustration 33: A schematic representation of a passive microfluidics switching device with two inputs. The two input media are delivered in a connecting tube, which is used for redistribution of the outgoing flows between the waste channels and the feeding channel.

Simple switching between two media does not exhaust the desired dynamic behavior of the input function for a microfluidics-based synthetic biology experiment. Very often sinusoidal or more complex input functions are required. In addition, if the precise concentration of an inducer that triggers a certain event is sought, simple switching between a limited number of concentrations prepared in advance would be rather imprecise and inefficient. Therefore, designs that provide the possibility of generating mixes of different fluids are required. Unfortunately, this problem is not so easily solvable in microfluidics devices. The main hindrance originates from the very low Reynolds number, the phenomenon that has been so useful up to now. As already discussed, at the microfluidics scale the inertial forces have little effect on the fluid dynamics and the typical fluid flow is purely laminar. This means that the molecules of the fluid move together in very tidy streams and there is no active exchange between the streams of the flow, i.e. there is no mixing. Hence, the task is to obtain mixes without mixing. Fortunately, the slow passive diffusion is not prevented in conditions of laminar flow. Passive diffusion is caused by the Brownian motion of the molecules, and its efficiency decreases with increasing molecular weight. The simplest situation is that of only two streams of fluid flowing in parallel. In that case the complete mixing of the two streams requires that they remain in contact long enough for the particles of one stream to diffuse across the whole width of the other. Consequently, the smaller the horizontal dimension of the channel cross-section, the faster the complete mixing will take place. On the other hand, the time of contact is defined by the ratio between the length of the diffusion channel and the flow speed of the streams. Hence, for the design of a diffusion-based mixing channel, once the cross-sectional dimensions are known, the length of the channel is of key importance. The equation describing the minimal length required for the diffusion of a certain compound from one incoming fluid into another fluid moving in parallel is the following:

$$\Delta y_m = \frac{v \times l^{2}}{D} \qquad (25)$$

where Δym is the minimal length in m, v is the flow speed in m/s, l is the channel width in m and D is the diffusion coefficient in m²/s. The values of the diffusion coefficients for some substances frequently used in media for biological experiments are given in Table 3.


The minimal length of the mixing channel is often in the range of millimeters. A chip with a straight channel of that length is difficult to design and quite impractical to fabricate. In order to maintain the compact size of the chip, a serpentine-shaped channel is commonly used. For our practical needs we designed a mixer device with a serpentine shape, which had to comply with the required total volumetric flow rate of the nutrient supply as well as to provide enough time for two media to mix completely. Such devices generate a delay in the medium supply and are therefore called “delay lines”. The explicit requirements for our delay line were a medium flow of 500 µl/h and one small organic molecule dissolved in one of the media, having to diffuse completely into the other one. The typical size of an inducer molecule is very close to that of simple sugars, therefore glucose can be regarded as an example. In addition, the cross-section of the channel can be predefined as 10 µm × 10 µm, based on simplicity of fabrication and the requirement for a small width. Effectively, the term v × l² from (25) is the volumetric flow rate Q for such a square cross-section. Given the latter, equation (25) and the diffusion coefficient for glucose from Table 3, we can derive that the required length of a delay line in our case is:

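$$\Delta y_m = \frac{v \times l^{2}}{D} = \frac{Q}{D} = \frac{500 \times 10^{-6}\ \mathrm{l/h}}{6.7 \times 10^{-6} \times \left(10^{-2}\ \mathrm{m}\right)^{2} / \mathrm{s}} = \frac{500 \times 10^{-9}\ \mathrm{m^{3}} / 3600\ \mathrm{s}}{6.7 \times 10^{-10}\ \mathrm{m^{2}/s}} \approx 0.2\ \mathrm{m} \qquad (26)$$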

Consequently, the required channel length for the mixing device is 200 mm. Given that the normal linear size of a microfluidics chip is 1 to 2 cm, the required structure length is a challenging design target. We accomplished that by utilizing a unit channel of 200 µm with 1000 repetitions in one serpentine (Illustration 34).
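The estimate in (26) can be reproduced, and extended to the other substances of Table 3, with a short script. This is a minimal sketch under the same assumption as the calculation above, namely a square channel cross-section, so that v × l² equals the volumetric flow rate Q; the function and variable names are hypothetical.

```python
CM2_TO_M2 = 1.0e-4    # conversion factor from cm^2/s to m^2/s
MICROLITRE = 1.0e-9   # one microlitre in m^3


def mixing_length(flow_rate, diffusion_coefficient):
    """Minimal mixing length (m) from equation (25) for a square channel,
    where v * l^2 equals the volumetric flow rate Q (m^3/s)."""
    return flow_rate / diffusion_coefficient


# Diffusion coefficients from Table 3, converted to m^2/s
diffusion = {
    "Sodium ion": 1.3e-5 * CM2_TO_M2,
    "Glucose": 6.7e-6 * CM2_TO_M2,
    "Atto 655 dye": 4.3e-6 * CM2_TO_M2,
    "Bovine albumin": 5.9e-7 * CM2_TO_M2,
}

q = 500 * MICROLITRE / 3600.0   # 500 ul/h expressed in m^3/s

lengths = {name: mixing_length(q, d) for name, d in diffusion.items()}
for name, length in lengths.items():
    print(f"{name:15s} {length * 1000:8.0f} mm")
```

For glucose the script returns about 0.2 m, in agreement with (26). Larger molecules such as albumin would require a proportionally longer delay line at the same flow rate.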


Illustration 34: Schematic representation of a delay-line device used as a mixer in microfluidics. The serpentine shape provides a compact size and at the same time allows for a significant total channel length.

The repetitions are fitted in 5 large sections, each comprising 200 replicas of the unit detail. The delay-line device has only one depth level of 10 µm, which renders it fairly easy to fabricate.


| Name | Molecular weight (Da) | Diffusion coefficient (cm²/s) |
|---|---|---|
| Sodium ion (Na⁺) | 22.98 | 1.3×10⁻⁵ |
| Glucose | 180.16 | 6.7×10⁻⁶ |
| Atto 655 dye | 528 | 4.3×10⁻⁶ |
| Bovine albumin | 67,000 | 5.9×10⁻⁷ |

Table 3: Values of the diffusion coefficients for some substances frequently used in media for biological experiments

IV.3. Microfluidics device fabrication

The most widely used technique for the fabrication of microfluidics devices is polydimethylsiloxane (PDMS)-based soft photolithography (McDonald et al., 2000). This process consists of casting a polymer (PDMS) over a mold carrying the features of the future microfluidics device. The mold is produced by a process called photolithography, which usually involves photopolymerization of a resin coated on a Si wafer.

The preference towards this fabrication method is based on a number of reasons. First, the photolithography of the Si wafer mold is a standardized process used extensively in microelectronics, therefore very well studied and reliable. Second, the fabrication of the microfluidics devices themselves is relatively simple and requires no special conditions such as a clean room. Third, the polymer used for this fabrication (PDMS) has very advantageous properties (Lötters, Olthuis, Veltink, & Bergveld, 1997). PDMS has very high viscosity, which makes it very useful for the precise replication of molds. On the other hand, this polymer has very high flexibility (G ≈ 250 kPa), which renders its processing extremely easy. At room temperature PDMS has the transparency of glass, making it an extremely useful material for the direct observation of the inner structures. The chemical properties of the polymer are also unique. The surface is strongly hydrophobic and prevents leakage of water-based solutions. Upon exposure to oxygen plasma, however, groups of silicon bonded to 3-4 oxygen atoms are formed (Hillborg et al., 2000), and those groups give the material surface the capacity to form strong chemical bridges. Furthermore, PDMS is chemically stable, biologically inert and exhibits practically no temporal drift in its physicochemical properties. In addition, PDMS is highly porous, which provides easy access of gases to the liquid medium inside the chip. The latter is an important advantage when aerobic microbial cultivation is required and/or fluorescent reporter proteins are used, since oxygen is a limiting factor for their proper folding (Smith & Remington, 2006). Last but not least, it is regarded as harmless to humans, being widely used also as a foam control agent during food processing and preparation (Evaluation of Certain Food Additives: Sixty-ninth Report of the Joint FAO/WHO Expert Committee on Food Additives, 2009).

IV.3.1. Photomask printing

The first step of the physical fabrication of a novel microfluidics device is the printing of a photomask with the chip design. Each depth level needs to be printed as a separate mask. The size of each mask is the same as that of the final mold to be produced; if a standard 4” wafer is used, 6 individual masks can be fitted on an A4 sheet (Illustration 35).

The final layout of the whole sheet should be sent to the printing company. Different companies have different requirements for the design representation; however, as a general rule, the contours of the individual features of the device should be unbroken (i.e. in AutoCAD, polyline should be used whenever possible). If all the individual features of the design are larger than 5 µm, a 20,000 dpi (dots per inch) ink photomask can be used.
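The relation between the printing resolution and the smallest feature can be made explicit with a small calculation. The sketch below converts a nominal dpi value into a dot pitch and counts how many dots span a feature; it assumes that the nominal dot pitch is simply one inch divided by the dpi value, which ignores the real spot size and any limitations of a particular photoplotter. The function names are hypothetical.

```python
MICROMETRES_PER_INCH = 25_400.0


def dot_pitch_um(dpi):
    """Nominal distance between neighbouring dots (um) at a given resolution."""
    return MICROMETRES_PER_INCH / dpi


def dots_per_feature(feature_um, dpi):
    """Number of nominal dots spanning a feature of the given size (um)."""
    return feature_um / dot_pitch_um(dpi)


pitch = dot_pitch_um(20_000)
dots = dots_per_feature(5.0, 20_000)
print(f"dot pitch at 20,000 dpi: {pitch:.2f} um")
print(f"a 5 um feature spans about {dots:.1f} dots")
```

Under this assumption a 5 µm feature is drawn with only about four dots at 20,000 dpi, which is consistent with treating 5 µm as the practical lower limit for ink photomasks.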

However, for finer resolution a chromium photomask on a crystal substrate is required. The printing company would also require information on whether the image should be printed mirrored or not. This depends on the UV exposure approach used for the polymerization. If the mask is facing the polymer coating with its ink side, the resulting image will be a mirror of the printout; therefore, to keep it matching the design, the printout should itself be mirrored. Conversely, if the ink side of the mask is facing away from the polymerized surface, the printing should be normal.

Illustration 35: A snapshot of an actual design file sent to a photoplotting company. A) The whole A4 page could easily fit six individual mask designs, each one of them used for the fabrication of a photolithography layer on a silicon wafer. B) Each mask consists of multiple devices and some features giving information about the precise design and proper positioning of the mask.

The final thing that a photoplotting company needs to know is whether the printing should be positive or negative. Positive printing maintains the colors of the design: dark is printed as black and bright as transparent; negative printing does the opposite. The typical photoresist used in soft photolithography for microfluidics is negative, i.e. the exposed features polymerize and become insoluble. Therefore, the final features to be fabricated on the mold have to be transparent on the mask. However, the microfluidics device design is usually laid out as dark features on a bright background. Consequently, negative photoprinting is the method typically employed for photomask fabrication.
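The two printing decisions described in this section (mirroring and polarity) can be summarized as a small decision function. This is only an illustrative sketch with hypothetical names. The branch for a positive photoresist is an assumption added for completeness, based on the general behavior of positive resists (exposed regions become soluble); it is not part of the procedure used in this work.

```python
def mask_print_settings(resist_type, features_drawn_dark, ink_side_faces_resist):
    """Return (polarity, mirrored) to request from the photoplotting company.

    resist_type: "negative" (exposed regions remain) or "positive"
        (exposed regions are removed; assumed here for completeness).
    features_drawn_dark: True if the design shows dark features on a
        bright background.
    ink_side_faces_resist: True if the ink side of the mask touches the
        photoresist during the UV exposure.
    """
    if resist_type not in ("negative", "positive"):
        raise ValueError("resist_type must be 'negative' or 'positive'")

    # With a negative resist the mold features must be transparent on the mask.
    features_must_be_transparent = resist_type == "negative"

    # Positive printing keeps dark areas opaque; negative printing inverts them.
    if features_drawn_dark == features_must_be_transparent:
        polarity = "negative"
    else:
        polarity = "positive"

    # An ink side facing the resist mirrors the image, so the printout
    # itself has to be mirrored to compensate.
    mirrored = ink_side_faces_resist
    return polarity, mirrored


# Typical case from the text: negative resist, dark features on a bright background
typical = mask_print_settings("negative", True, True)
print(typical)
```

For the typical case described above the function returns negative printing, in agreement with the text.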

IV.3.2. Wafer spin-coating

Next, the photomask is used for the fabrication of the negative cast for the future microfluidics device. This is accomplished by exposing a solid flat substrate coated with a photoresist to UV light. Since the lithography methods for microfluidics are directly borrowed from microelectronics, the standard substrates are silicon wafers, although any flat polishable surface could be utilized. The coating procedure itself is performed as follows. The silicon wafer is firmly attached, usually by vacuum, to a holder capable of fast spinning. Next, the photoresist is applied on the substrate by pipetting. We utilize SU-8 photoresist by Microchem, which exhibits a number of advantageous features (Liu et al., 2004). First, its viscosity allows the generation of layers ranging from <1 µm up to 300 µm. Next, this photoresist has maximum absorption at 365 nm, which allows for very good resolution to be achieved. Finally, SU-8 has a highly cross-linked matrix, which makes it chemically and thermally very stable. The quantity of the resin to be used depends on the size of the wafer, the rule of thumb being 1 ml of photoresist for each inch of wafer diameter. The bigger the wafer surface, the more efficient the fabrication process is, since the coating itself requires the same amount of labor for any substrate size. Silicon wafers are fabricated in many sizes from 1 to 18 inches in diameter. However, standard equipment for silicon wafer treatment is made to handle wafers of 4 or at most 5 inches; therefore 4 inches is the preferred size for the fabrication process. After the wafer is firmly attached to the rotor and the polymer is applied on the substrate surface, the rotor starts to spin so that the viscous polymer is evenly distributed on the substrate surface. The spinning is usually performed in a couple of steps. First, there is a slow spin (500 rpm) for 5-10 seconds, which provides even distribution of the photoresist on the substrate; next, fast spinning (1000 – 3000 rpm) at the speed appropriate for the required final photoresist thickness is applied for 30 seconds. Subsequently, the wafer is removed and inspected for proper film generation. If the surface is evenly coated and without visible disturbances, a short soft bake (between 1 and 5 minutes, depending on the photoresist thickness) at 95 °C is applied for stabilization of the coating. A wafer prepared in that manner is quite stable and could be stored at 4 °C for weeks, if needed.
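The spin-coating recipe above can be captured as a small, checkable data structure. The following is a minimal sketch, not a controller for any real spin coater: the class and function names are hypothetical, and the range checks simply encode the typical values quoted in the text (500 rpm spread step for 5-10 s, 1000 – 3000 rpm for 30 s, 1 ml of resist per inch of wafer diameter). The fast speed for a given thickness must come from the photoresist data sheet and is not derived here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpinStep:
    rpm: int
    seconds: int


def resist_volume_ml(wafer_diameter_in: float) -> float:
    """Rule of thumb from the text: 1 ml per inch of wafer diameter."""
    if wafer_diameter_in <= 0:
        raise ValueError("wafer diameter must be positive")
    return 1.0 * wafer_diameter_in


def spin_program(fast_rpm: int, spread_seconds: int = 10) -> list[SpinStep]:
    """Build the two-step program: slow spread, then fast thinning."""
    if not 1000 <= fast_rpm <= 3000:
        raise ValueError("fast_rpm outside the typical 1000-3000 rpm range")
    if not 5 <= spread_seconds <= 10:
        raise ValueError("spread step is typically 5-10 seconds")
    return [SpinStep(rpm=500, seconds=spread_seconds),
            SpinStep(rpm=fast_rpm, seconds=30)]


if __name__ == "__main__":
    program = spin_program(fast_rpm=2000)  # 2000 rpm is an arbitrary example
    print("resist for a 4 inch wafer:", resist_volume_ml(4), "ml")
    for step in program:
        print(f"{step.rpm} rpm for {step.seconds} s")
```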

IV.3.3. Wafer UV exposure and development

The next stage of the wafer fabrication process is the UV light exposure. The coated wafer is fixed under the photomask and UV light is projected from the top. The distance between the wafer and the photomask is set between a couple of millimeters and hard contact (i.e. the photomask is firmly pressed against the wafer). Each projecting machine has specific requirements for the distance between the mask and the substrate. Another option that might be adjustable is the UV light intensity, which is set according to the photoresist material properties. Finally, the exposure time is also defined and the polymerization is conducted. The UV light is projected onto the polymer coating through the photomask and in the typical case (please refer to the previous paragraph) the polymerization occurs in the exposed zones. The wafer is removed from the UV light projecting machine and the polymerized photoresist is stabilized by another bake, again at 95 °C, for 1 minute. Finally, the exposed wafer is developed in an appropriate solvent for 1 – 5 minutes until the features are clearly visible. The non-polymerized part of the photoresist is dissolved and washed away. Another rinse with a weaker organic solvent (isopropanol) and subsequently with water is advisable. The wafer thus produced is ready to be used and the features on it are stable. For a graphical representation of the whole process of photolithography, please refer to Illustration 36.

If the final microfluidics device has only one depth level, the wafer fabrication stops here and the wafer needs to be thermally stabilized overnight at its future working temperature (80 °C for PDMS chips). However, if additional depth levels are required, each of them needs to be generated consecutively on the same wafer. This is performed by repeating all the steps from spin coating to development for each of the required levels. Thus, the already hardened, developed features of one level would be coated with another layer of photoresist, which would subsequently be exposed and developed.

Illustration 36: Graphical representation of the individual stages of the photolithography process. The process begins with a clean silicon wafer (a). After spin coating it is covered with an even layer of photoresist (b). Subsequently, the cross-linking of the resin is induced by exposure to UV light through a photomask (c). Finally, after development, the mold for the microfluidics device is fabricated on the silicon wafer surface (d).

The features of the next level need to be positioned precisely with respect to the features of the previous one(s) so that the final device is fabricated as designed. This is accomplished by the precise positioning of the second-level photomask over the wafer with the first level already developed. The non-polymerized photoresist is transparent; however, the depths of the different levels are only a few micrometers, which makes the alignment of the two layers extremely difficult.

Therefore, the UV projection is usually performed in dedicated machines (“mask aligners”) equipped with fine optics which allow for direct visual inspection of the photomask positioning over the wafer. If the thinner layers were fabricated on top of the thicker ones, the visibility through the new coating would be better and the precise positioning easier. However, if the new layer of photoresist were spun on features thicker than itself, the coating would be uneven and the features produced by the exposure of the new layer would be distorted (Illustration 35). Therefore, the different depth levels of the microfluidics devices are fabricated one by one, on top of each other, in order of increasing level thickness, starting from the thinnest.
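The ordering rule for multilevel wafers (thinnest level first, with the full cycle from spin coating to development repeated per level) can be sketched as a small planning helper. The level names and thicknesses below are hypothetical examples. Placing a single thermal stabilization after the last level is an assumption drawn from the single-level case described above.

```python
STEPS_PER_LEVEL = (
    "spin coating",
    "soft bake",
    "UV exposure",
    "post-exposure bake",
    "development",
)


def fabrication_plan(levels_um: dict[str, float]) -> list[str]:
    """Return the ordered processing steps for a multilevel wafer.

    levels_um maps a (hypothetical) level name to its thickness in µm.
    Levels are processed from the thinnest to the thickest.
    """
    plan = []
    for name, thickness_um in sorted(levels_um.items(), key=lambda item: item[1]):
        for step in STEPS_PER_LEVEL:
            plan.append(f"{name} ({thickness_um:g} µm): {step}")
    # Assumption: one overnight stabilization after the final level.
    plan.append("thermal stabilization overnight at 80 °C")
    return plan


if __name__ == "__main__":
    for line in fabrication_plan({"supply channel": 10.0, "growth channel": 1.0}):
        print(line)
```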

IV.3.4. Multilevel devices fabrication

There are two major types of approaches that could facilitate the alignment of the consecutively fabricated layers. The first method involves the fabrication of additional alignment features on the wafer. Those features are not a structural part of the final microfluidics devices and their only use is during precise consecutive layer generation. The more reference points are used, the more precise the alignment would be. In addition, mismatch between the layers should be easily detected; therefore the features should not allow for hidden overlaps. A straightforward example is a system based on consecutive concentric disks (Illustration 37). In this case again the smallest features are fabricated first in order to prevent distortion and subsequent difficulties during correct alignment.

Alignment of the masks for the consecutive layers provides a method for very precise feature position matching; however, the instruments utilized in that process have at least two major drawbacks. Firstly, mask aligners are relatively expensive and require a significant investment, which not all labs could afford. Even used equipment would cost about 50 000 Euros in the EU. Secondly, mask aligners are bulky apparatuses and, since they require a particle-free environment for their utilization, the building and maintenance of a clean room becomes unavoidable. Since we had a limited budget for building a soft lithography fabrication setup and did not have access to a nearby clean room, buying a mask aligner was not an option. However, one can always attempt to compensate for a lack of hardware with intelligent design.

Illustration 37: An example of alignment features utilized to facilitate the fabrication of consecutive layers during the photolithography process. Sometimes the distortion is obvious even from the frame, but in other cases the mismatch between the positions of the dedicated features is much more notable.

In our microfluidics device designs we adopted two approaches. Firstly, we tried to reduce as much as possible the need for multilevel microfluidics chips. When, however, this need was unavoidable, we tried to make a design that allows for a big margin of error during alignment while keeping the fabricated chip functional. If we take the design of Wang et al. as an example (Illustration 27), we could readily apply the two types of design approaches in order to remove the need for a mask aligner, but still obtain the same functionality of the device. Specifically, the device needs to allow for stable bacterial growth in a single line for a long period of time. In addition, the E. coli cells need to be supplied with nutrients through a lateral opening of the growth channel into a larger supply channel. First, to achieve those functionalities in a single-level device we need to reduce the depth of the main channel (Illustration 38).

Illustration 38: Schematics of microfluidics devices design examples avoiding the need of a mask aligner during the fabrication process. This is achieved by a single-level design (left) or by evenly-spaced repeating features guaranteeing at least one correct layer overlapping (right).

This device has the drawback of increased hydraulic resistance of the main channel. The latter could be overcome by increasing the width of the channel. However, the elastic properties of PDMS limit the support:ceiling ratio to a maximum of 1:50 without bending of the ceiling. The single depth level of a device for single-line cultivation of E. coli requires a depth of a single cell diameter, i.e. 1 µm. Consequently, the supply channel width is limited to 50 µm, which is not necessarily enough to provide the needed reduction of the hydraulic resistance. To overcome this hindrance, evenly-spaced supports are fabricated throughout the main channel, thus preventing ceiling bending. Still, the single-cell depth of the main channel would mean an increased risk of clogging during the inoculation process, when a concentrated starting culture is forced through this channel. Therefore, the inoculation should be performed in a different manner. We designed a dedicated cell port at the end of each growth chamber, which would be used for individual loading of each of the channels. This way, the medium would constantly run in the main channel and the concentrated starting culture could be forced into the growth chamber only until at least one cell is firmly positioned inside. Upon successful loading, the cell port could be blocked, thus effectively converting the cell channels to the same type as the ones of Wang et al. Unfortunately, the number of ports that could fit in the same microfluidics device is limited and this would also limit the number of growth channels and, consequently, the number of experiments observed in parallel. On the other hand, individual loading ports mean that each growth chamber could be loaded with a different type of bacteria, thus allowing for the parallel testing of different engineered organisms in the same conditions simultaneously.
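The trade-off between channel depth, width and hydraulic resistance can be illustrated numerically. The sketch below uses the standard approximation for a shallow rectangular channel, R ≈ 12 η L / (w h³ (1 − 0.63 h/w)), valid for h < w (see e.g. Bruus, 2008), together with the 1:50 depth-to-width limit quoted above. The function names, the channel length of 1 cm and the viscosity of water (1 mPa·s) are assumptions chosen only for illustration.

```python
def max_unsupported_width_um(depth_um: float, max_ratio: float = 50.0) -> float:
    """Widest ceiling that does not bend, from the 1:50 rule in the text."""
    return depth_um * max_ratio


def hydraulic_resistance(length_m: float, width_m: float, height_m: float,
                         viscosity_pa_s: float = 1.0e-3) -> float:
    """Approximate hydraulic resistance (Pa·s/m^3) of a rectangular channel.

    Uses R = 12*eta*L / (w*h^3*(1 - 0.63*h/w)), valid only for h < w.
    The default viscosity (water, about 1 mPa·s) is an assumption.
    """
    if not 0 < height_m < width_m:
        raise ValueError("approximation requires 0 < height < width")
    correction = 1.0 - 0.63 * height_m / width_m
    return 12.0 * viscosity_pa_s * length_m / (width_m * height_m ** 3 * correction)


if __name__ == "__main__":
    depth_um = 1.0
    width_um = max_unsupported_width_um(depth_um)
    r_limit = hydraulic_resistance(1e-2, width_um * 1e-6, depth_um * 1e-6)
    r_double = hydraulic_resistance(1e-2, 2 * width_um * 1e-6, depth_um * 1e-6)
    print(f"max unsupported width: {width_um:.0f} µm")
    print(f"resistance ratio (50 µm vs 100 µm wide): {r_limit / r_double:.2f}")
```

Since the resistance scales roughly with 1/(w h³), doubling the width only halves it, whereas the 1 µm depth dominates the total. This is why supports that allow a much wider channel are needed in the single-level design.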

The other approach, which removes the need for precise alignment in multilevel microfluidics designs, is the provision of repetition in the structures in such a way that at least one of the overlaps of badly aligned masks would produce the desired feature (Illustration 38). In order to achieve single-line growth we again need to provide a growth channel of a certain length supplied with nutrients by a deeper medium channel. The device in Illustration 27 has an overall width of 75 µm, and relying on manual superimposition even to overlap the main channel with the growth channels is not realistic. However, we could make a first-layer mask for very long growth channels, e.g. up to 5 mm, which is a size readily distinguishable by the naked eye. The next layer could consist of many parallel supply channels, spaced at the optimal nutrient diffusion distance for this type of growth channel (in this case about 100 µm). This way we obtain two separate grids of channels that are overlapped. Even with very bad alignment we could rely on at least one growth channel (usually many more) remaining overlapped by at least two parallel supply channels. The latter structure has optimal conditions for bacterial growth in a single line and could be used for our intended experiments. The only fields observed under the microscope would be the ones that consist of a growth channel properly enclosed between two supply channels. The incorrectly overlapped supply channels do not need to be used and could be blocked, e.g. by drops of superglue.
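The robustness of the repeated-grid design to misalignment can be illustrated with a simple count. The sketch below assumes an idealized geometry that is not spelled out in the text: supply channels are reduced to centerlines perpendicular to a growth channel, the misalignment is a pure translation along the growth channel (no rotation), and channel widths are ignored. Function names are hypothetical; lengths are whole micrometers.

```python
def crossings(growth_length_um: int, pitch_um: int, offset_um: int) -> int:
    """Count supply channel centerlines that land on one growth channel.

    The first centerline sits offset_um from the start of the growth
    channel and the others follow every pitch_um.
    """
    return len(range(offset_um, growth_length_um + 1, pitch_um))


def min_crossings(growth_length_um: int, pitch_um: int) -> int:
    """Worst case over all whole-micrometer offsets within one pitch."""
    return min(crossings(growth_length_um, pitch_um, offset)
               for offset in range(pitch_um))


if __name__ == "__main__":
    # Values from the text: 5 mm growth channels, 100 µm supply pitch.
    print("best case :", crossings(5000, 100, 0))
    print("worst case:", min_crossings(5000, 100))
```

Under these assumptions even the worst translation leaves a 5 mm growth channel crossed by tens of supply channels, far more than the two required for a usable field of view.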

IV.3.5. PDMS structures stamping

The thermally stabilized wafers also need to be silanized before they could be used for repeated molding of PDMS microfluidics chips. The silanization makes peeling the PDMS off the wafer surface easier, which reduces the risk of the photoresist breaking and detaching from the wafer surface. Wafers are silanized overnight by resting in the same sealed small chamber (a large petri dish sealed with parafilm would suffice) with 1 ml of silanizing agent in an open container. Care must be taken, for silanizing agents are extremely irritating to the eyes; therefore work in laminar boxes is advised.

The fabrication of the PDMS stamp over the wafer carrying the negative of the microfluidics device is accomplished by pouring the PDMS over the wafer. The cross-linking of the polymer is carried out at 80 °C. Initially, the pre-polymer is mixed with its curing agent in a vessel with a spatula. This process traps air bubbles in the mixture. During the polymerization and hardening of the material, those bubbles would form cavities in the chip and would compromise the structure and functionality of the device. Therefore, air is removed first by centrifugation and later by exposing the mixture, already poured over the wafer, to a vacuum, usually in a desiccating chamber connected to a vacuum pump. After the complete removal of the air from the polymer mixture, the PDMS is partially cross-linked for 2 hours at 80 °C in an oven with passive convection. This way, the PDMS stamp would already have the mechanical properties of an elastomer (i.e. it would be rubbery). This means that it would keep the shape of the mold even upon peeling off, while still being soft enough to allow for easy cutting and other mechanical manipulations. Hence, the PDMS stamp is peeled off and is cut into the separate chips that were fabricated on the wafer. The peeling off exposes the part of the PDMS stamp that was facing the wafer and that carries the channels and the growth chambers. Those features are very fine and need to be protected from particles settling on them. Therefore the cutting of the chip is performed in a laminar box equipped with a particle filter (HEPA filter). Subsequently, the input and output ports are punched at the appropriate spots, also in the laminar box. The positioning of the hole puncher could be done under a stereoscope in order to improve the precision. The finalized PDMS parts of the chips are transported in a closed container, for protection from particles, to be firmly attached to the hard substrate that would become the base of the chip.

IV.3.6. Final devices bonding

The microfluidics devices we are aiming for are to be used for microscopy characterization of bacterial physiology. Therefore, the finalized chips should allow for direct microscopy observation of the growth chambers. This is accomplished by using microscope cover slips as chip bases. This way, all device compartments that are in direct contact with the base would be visible through a microscope. The irreversible attachment of the PDMS chunks to a glass surface is normally not possible. This is owing to the inert surface properties of glass and the strong hydrophobicity of the PDMS surface. To overcome this impediment, an artificial sticky layer should be generated on the surfaces of both materials. This is accomplished by the exposure of the lower surface of the PDMS chunks and the cover slips to oxygen plasma. The latter is rich in ionized molecular species, which adhere mechanically to the presented surfaces and generate activated ionized layers on them. Upon direct contact those layers create direct chemical bonds, which are capable of holding tight the materials to which they are attached. Since this ionized state is generated artificially, it is not permanent; hence the PDMS chunks need to be placed on the cover slips within one minute after the end of the plasma exposure. The surfaces to be ionized need to be cleaned thoroughly of small particles that might later impede the proper attachment or distort the fine features of the microfluidics device. This is usually accomplished by gentle attachment of those surfaces to some sticky material capable of removing the small adherent particles. The materials thus cleaned are then positioned in the plasma chamber of a plasma cleaner. The latter is a device capable of generating oxygen plasma by radio frequency emission in a low vacuum environment (200 mbar). The firmly attached PDMS and cover slip are further left for complete cross-linking and hardening of the polymer at 80 °C in an oven with passive convection. This step also finalizes the microfluidics chip fabrication process and the obtained devices are ready for use.

IV.4. Soft lithography setup developed for this research

Since the soft lithography process for microfluidics device fabrication is based on techniques borrowed from microelectronics, it also relies on the use of clean rooms for the maintenance of a particle-free environment. The latter is a prerequisite for precise, reproducible fabrication of the designed devices. Most of the biological labs that rely on their own production of microfluidics chips use clean rooms for the wafer fabrication, either ones they have access to or ones they build themselves. However, the investment for the building of a clean room is substantial and this requirement is a huge impediment to the wide adoption of microfluidics in biological research. In addition, clean rooms were developed for microelectronics and had to protect devices with much finer features from particle sedimentation. Therefore, we decided to develop our own low-cost fabrication setup that would avoid the need for a clean room and would allow us to generate our own wafers with microfluidics designs and use soft lithography with PDMS.

The part of the microfluidics device fabrication process that is most sensitive and prone to failure owing to high particle concentration in the air is the wafer mold preparation. During this process settling airborne particles could irreversibly affect the precision of fabrication at a few key steps. Firstly, before the spin-coating the wafer surface needs to be perfectly clean, otherwise the distribution of the photoresist during the process might be uneven or even prevented in some surface areas. Therefore, wafer manipulation before the spin-coating is performed in a strictly particle-free environment. Secondly, the photoresist on the coated wafer is not polymerized and is still soft before the exposure. Any particle falling on that surface would mechanically disturb it and would compromise all the subsequent processing steps. Hence, the manipulation of coated wafers (i.e. soft baking) before UV exposure must also be performed in filtered air. The UV exposure itself is prone to failure from particle accumulation inside the exposure chamber. Dust residing anywhere between the UV light source and the photoresist coating of the wafer would inevitably absorb some of the light and change the final structure of the polymerized features on the wafer. Finally, even the exposed wafer before the hard bake could not be easily cleaned of settled particles, since the fine features fabricated on its surface are still fragile and mechanically unstable.

Hence, the whole sequence of processing steps for the fabrication of the silicon wafer mold of the microfluidics device features is necessarily performed in a particle-free environment. The latter is maintained by air filtration and the use of materials that do not release particles from their surfaces, ranging from wiping tissues to laboratory garments. When those requirements are applied to a whole clean room, the expenses to maintain such an environment include air filtration and positive pressure for 30-40 m³, building and maintenance of special floor, walls and ceiling, an entrance with a sticky floor, a changing room, etc. However, the actual space for the equipment required for the wafer mold fabrication is much more modest. The size of a bench-top spin coater is approximately 40x40x30 cm (height). The high-precision hot plates employed for the soft bake and the post-exposure bake are 20x30x10 cm. The development process is performed in a 10 cm deep vessel with a diameter of 5” (approximately 13 cm). Therefore, the space required for the above-mentioned equipment should have a surface of 120x60 cm and its height is limited only by the comfort of the personnel operating it.

However, the UV exposure step is typically performed in mask aligners, which are quite bulky. A typical size for this apparatus would be 70x120x50 cm. As discussed before, mask aligners are required for precision matching of the separate layers of multilevel microfluidics devices. The latter provide the soft lithography technology with much higher versatility and improve the possibility of creating appropriate devices. Nevertheless, if a decision is taken that this capacity is expendable, a much simpler device, which fixes the coated wafer and the photomask under UV irradiation, should suffice for the photoresist polymerization to occur. We opted for the latter type of apparatus in our lab, namely the UV-KUB from Kloe. This is a cubic device with a 26mm edge, which makes it extremely compact. Nevertheless, UV-KUB has an emission spectrum at 385 nm, which allows it to produce features with 2 µm resolution. The power of the UV lamp could be set from 1 to 100%, different cycles could be programmed and precise mechanics allow for the setting of the mask – substrate distance. Given those qualities, this apparatus is very versatile and efficient for UV exposure, lacking only alignment capacity.

The utilization of a compact exposure device allowed us to fit all the equipment necessary for the fabrication of silicon wafer molds in a single enclosure with filtered air. We used a standard double-length (180x60 cm) laminar box equipped with a HEPA filter and an active filter. This way we could work with the organic solvents required for the development process and secure proper air filtration. The combined usage of two filters rendered the laminar box taller than usual and a ceiling height of 4 m is recommended for the laboratory of installation. Our laminar box recycles 80% of the air, thus allowing secure particle-free work and protecting the operator from the organic fumes at the same time. The equipment is arranged in the box in the order of the process flow, preventing processing errors and allowing simultaneous work for two people.

For the spin coating we acquired a Laurell 650 device. This type of device needs to provide three main capabilities. Firstly, the hard substrate for the coating needs to be fixed tightly to the rotor. Next, the rotor needs to be able to spin at no less than 2000-3000 rpm. Finally, the wafer should be kept horizontal in order to prevent uneven distribution of the photoresist, which could lead to undefined profile and feature depth. The substrate fixation is accomplished by vacuum, which could be supplied directly to the device or generated by a vacuum chuck, which uses compressed air flow to generate it. A compressed air supply is required in any case, since the horizontal orientation of the substrate is provided by a compressed air cushion, which sustains the whole rotating part of the device so that the latter is virtually floating. Therefore, we decided to also use the vacuum chuck and avoid the need to provide a vacuum line in addition to the compressed air one.

Illustration 39: Low-cost photolithography setup. 1) Spin coater; 2) UV exposure device; 3) Organic solvent used for the development; 4) Vessel used for the development; 5) High-precision hot plate used for baking.

Finally, in order to prevent accidental triggering, we removed the UV lamp which is normally provided in the laminar box. The visible light source is located in a special compartment and allows for easy addition of standard yellow UV filters on the bulb. The whole setup, including the air compressor, was installed in a room without access to direct natural light. The light bulbs in this laboratory were also covered with yellow UV filters, which allowed for the manipulation of photoresist. The passive convection oven for the thermal stabilization of the wafers was also positioned in the same room.

The thermally stabilized wafers are the molds for stamping of the microfluidics devices in PDMS. As described earlier, this process involves complete degassing of the polymer mixture first by centrifugation and then in a vacuum desiccator. For the latter we used a desiccator for small samples directly connected to a vacuum pump. After the initial mixing and 2 minutes of centrifugation the PDMS mixture was poured into a cup made of thick aluminum foil with the wafer mold as its bottom. The cup was placed in a large petri dish without cover and the petri dish was set in the desiccator chamber. Upon connection to the vacuum pump through standard orange vacuum tubing, 40 minutes provided reliable degassing of the PDMS. The PDMS thus prepared was placed in a passive convection oven at 80 °C for 2 hours for initial cross-linking and hardening. The product of this processing step was an elastomer stamp of the features on the surface of the silicon wafer. This replica needs to be peeled off the wafer surface. Care must be taken not to destroy the wafer at this step. The first step was the careful peeling of the aluminum foil off the walls of the PDMS stamp and the wafer forming its bottom. Next, the edge of the PDMS rubber was separated from the wafer by gently forcing a scalpel tip between them and slowly sliding it along the edge until the whole structure was separated. The peeling of the PDMS off the wafer followed. This action should be performed as gently as possible, because silicon wafers are quite brittle and are prone to cracking. It is important to note that warmer wafers are more brittle; therefore proper cooling is advisable before commencing the peeling process. In addition, a more robust wafer construct could be easily created by attaching the wafer carrying the device features on top of another wafer with superglue or a thin layer of PDMS. The peeled-off PDMS should be placed in a clean container (once again a large petri dish is appropriate). The obtained elastomer is transparent and, under a proper angle of light exposure, the larger features of the devices stamped in it are visibly distinguishable.
The used wafer should be carefully cleaned of any remaining PDMS particles and should be closed in a proper container. Under no circumstances should the features on top of the wafer be brought into contact with hard objects. Compressed air is the best way to clean any impurity on them.

Next, the PDMS stamp should be cut into separate chips and proper ports should be punched at the designated positions. Since those operations require working outside the container, they need to be carried out in a laminar box. This laminar box cannot be the same as the one used for the wafer fabrication, since some particles might be released from the cut pieces of PDMS. The cutting is performed in the petri dish with the stamp placed with the features facing down. The PDMS is strong enough to sustain the pressure of its own weight; however, sliding of the polymer while the features are touching the bottom of the petri dish should be avoided. There is a wetting effect from the contact between the PDMS stamp and the petri dish, hence the contours of the separate devices become visible. After the latter have been cut into proper chips, they are placed with the features facing up under a stereoscope for better positioning of the hole puncher. As mentioned before, insertion of the puncher into the chip and its removal must be done vertically. Any rotation would lead to a crack starting from the port and would render the chip unusable.

The cut chips with the proper ports open are ready to be sealed to microscope cover slips. Beforehand, they need to be cleaned thoroughly of particles accumulated on the surfaces to be sealed. This is accomplished by gently applying strong adhesive tape to those surfaces and subsequently removing it. Next, the sealing step requires plasma activation of the surfaces of both types of materials. We use an ACE1KHz plasma cleaner supplied with vacuum by the same pump used for the degassing. This device allows for the complete automation of the process, since all the steps, including vacuum level, gas mixture and plasma generation time, are programmable. However, we do not use a standard mixture of oxygen and nitrogen for the plasma generation. Those gases are readily available in air and, after careful protocol elaboration, we discovered that 20 seconds of exposure to plasma produced from air at 200 mbar is enough to produce strong bonding between the PDMS chips and the cover slips. The only major difference from standard protocols is that we require longer final hardening of the chips, in our case at least 48 hours. However, by avoiding additional costs for supplies of pure oxygen and nitrogen, security considerations, provision of proper pressure controls, etc., we were able to accomplish a completely reliable and reproducible method for microfluidics device fabrication.

Overall, our complete setup (Illustration 39) costs less than EUR 35 000 vs. approximately EUR 200 000 for a setup with a small clean room and a cheap mask aligner. It is true that our setup does not allow for optical mask alignment; however, we believe that we have already demonstrated how intelligent design could compensate for a lack of hardware even when a device as sophisticated as a single-line growth chemostat is engineered.

IV.5. Conclusion

Here we described the theoretical aspects of microfluidics device design. The requirements for the structure of the individual parts of such devices were thoroughly investigated and the methods to develop and finalize a novel design were demonstrated. Based on the work of others and our own, examples of efficient microfluidics chips were discussed in detail. We also explained in detail the complete process of fabrication of microfluidics devices. Furthermore, a successful example of the development of a minimalistic fabrication platform requiring minimum investment and maintenance was demonstrated. The efficiency of this type of fabrication facility should make it affordable for a much larger number of biological labs to develop their own biochips according to their own needs. The connection between microfluidics chip design, cell physiology and experimental requirements was also demonstrated.

IV.6. References

Akers, A., Gassman, M., & Smith, R. (2006). Hydraulic Power System Analysis (Fluid Power and Control) (p. 400). CRC Press.

Bruus, H. (2008). Theoretical Microfluidics .

Carrera, J., Rodrigo, G., Singh, V., Kirov, B., & Jaramillo, A. (2011). Empirical model and in vivo characterization of the bacterial response to synthetic gene expression show that ribosome allocation limits growth rate. Biotechnology journal , 6(7), 773–83. doi:10.1002/biot.201100084

Choi, S., Lee, M. G., & Park, J.-K. (2010). Microfluidic parallel circuit for measurement of hydraulic resistance. Biomicrofluidics , 4(3). doi:10.1063/1.3486609

Evaluation of Certain Food Additives: Sixty-ninth Report of the Joint FAO/WHO Expert Committee on Food Additives. (2009) (p. 208). World Health Organization. Retrieved from http://books.google.com/books?id=r9pXEcj8TYoC&pgis=1

Ferry, M. S., Razinkov, I. A., & Hasty, J. (2011). Microfluidics for synthetic biology: from design to execution. Methods in Enzymology (Vol. 497, pp. 295–372). doi:10.1016/B978-0-12-385075-1.00014-7

Fetita, C., Kirov, B., Jaramillo, A., & Lefevre, C. (2012). An automated approach for single-cell tracking in epifluorescence microscopy applied to E. coli growth analysis on microfluidics biochips. In SPIE Medical Imaging (p. 83170Z–83170Z–11). International Society for Optics and Photonics. doi:10.1117/12.911371

George, S., & Stokes, G. (1905). ON THE EFFECT OF THE INTERNAL FRICTION OF FLUIDS ON THE MOTION OF PENDULUMS, 1–86.

Hillborg, H., Ankner, J. F., Gedde, U. W., Smith, G. D., Yasuda, H. K., & Wikström, K. (2000). Crosslinked polydimethylsiloxane exposed to oxygen plasma studied by neutron reflectometry and other surface specific techniques. Polymer , 41 (18), 6851–6863. Retrieved from http://www.sciencedirect.com/science/article/pii/S0032386100000392

Liu, J., Cai, B., Zhu, J., Ding, G., Zhao, X., Yang, C., & Chen, D. (2004). Process research of high aspect ratio microstructure using SU-8 resist. Microsystem Technologies, 10(4), 265–268. doi:10.1007/s00542-002-0242-2

Lötters, J. C., Olthuis, W., Veltink, P. H., & Bergveld, P. (1997). The mechanical properties of the rubber elastic polymer polydimethylsiloxane for sensor applications. Journal of Micromechanics and Microengineering , 7(3), 145–147. doi:10.1088/0960-1317/7/3/017

McDonald, J. C., Duffy, D. C., Anderson, J. R., Chiu, D. T., Wu, H., Schueller, O. J., & Whitesides, G. M. (2000). Fabrication of microfluidic systems in poly(dimethylsiloxane). Electrophoresis, 21(1), 27–40. doi:10.1002/(SICI)1522-2683(20000101)21:1<27::AID-ELPS27>3.0.CO;2-C

Ohm, G. S. (2010). Die Galvanische Kette, Mathematisch Bearbeitet (1827) (p. 250). Kessinger Publishing. Retrieved from http://www.amazon.co.uk/Die-Galvanische-Kette-Mathematisch-Bearbeitet/dp/1168419514

Reynolds, O. (1883). An Experimental Investigation of the Circumstances Which Determine Whether the Motion of Water Shall Be Direct or Sinuous, and of the Law of Resistance in Parallel Channels. Philosophical Transactions of the Royal Society of London , 174 (November 2009), 935–982. doi:10.1098/rstl.1883.0029

Rodrigo, G., Kirov, B., Shen, S., & Jaramillo, A. (2013). Theoretical and experimental analysis of the forced LacI-AraC oscillator with a minimal gene regulatory model. Chaos (Woodbury, N.Y.) , 23 (2), 025109. doi:10.1063/1.4809786

Rowat, A. C., Bird, J. C., Agresti, J. J., Rando, O. J., & Weitz, D. A. (2009). Tracking lineages of single cells in lines using a microfluidic device. Proceedings of the National Academy of Sciences of the United States of America, 106(43), 18149–54. doi:10.1073/pnas.0903163106

Smith, S. S. T. and G. G. / M. B. and J. L., & Remington, S. J. (2006). Fluorescent proteins: maturation, photochemistry and photophysics. Current Opinion in Structural Biology , 16 (6), 714–721. Retrieved from http://www.sciencedirect.com/science/article/pii/S0959440X06001734

Stricker, J., Cookson, S., Bennett, M. R., Mather, W. H., Tsimring, L. S., & Hasty, J. (2008). A fast, robust and tunable synthetic gene oscillator. Nature , 456 (7221), 516–9. doi:10.1038/nature07389

Volfson, D., Cookson, S., Hasty, J., & Tsimring, L. S. (2008). Biomechanical ordering of dense cell populations. Proceedings of the National Academy of Sciences , 105 (40), 15346–15351. doi:10.1073/pnas.0706805105

Wang, P., Robert, L., Pelletier, J., Dang, W. L., Taddei, F., & Wright, A. (2010). Robust Growth of Escherichia coli, 1099–1103. doi:10.1016/j.cub.2010.04.045


V. Microscopy and image processing

V.1. Microscopy

Microfluidics technology provides the means for microbial growth in proper conditions and direct optical access to the cells to be observed. However, it is microscopy that makes it possible to record the observed bacterial behavior with high resolution, for long periods of time and in an automated fashion. The performance of engineered synthetic biological parts is usually characterized by observing the quantity and quality of the reporter fluorescent proteins produced by the chassis carrying the engineered parts. For a successful analysis of their dynamics, the microscopy setup should provide good-quality data on the size and position of each cell and on the quantity and quality of the cellular fluorescence emission. In addition, for the observation and recording of long-term dynamics, the setup should provide automatic focusing and programmable time-lapse experiments. Finally, the cellular growth happens inside the microfluidics chips, which, in turn, are positioned in the optical path of the microscope. Therefore, there are additional requirements for microscope setups used for microfluidics experiments: they should provide easy access for the tubing connected to the chip and, optionally, some of the necessary growth conditions, such as adequate temperature, gas feed, etc.

To meet those requirements we built a system based on the Zeiss Axio Observer Z.1 inverted epifluorescence microscope. This microscope is specially designed for fluorescence experiments and has inverted optics. The latter means that the objectives are below the sample, which allows comfortable positioning of the microfluidics chip above the stage and plugging of the input/output tubes from the top. Our microscope is also equipped with a heated stage, which allows us to maintain the proper temperature inside the chip indirectly, without the need for expensive and bulky incubators or heaters integrated in the chip. The stage is also motorized, hence different observation spots can be programmed and memorized, and the microscope returns to exactly the same spot upon the execution of each cycle of the program. This property is indispensable for the programmability of time-lapse experiments.

Furthermore, we acquired the Definite Focus system from Zeiss, which uses an 835 nm LED light source to find the exact phase border between the liquid and the solid phase in the culture vessel. In the case of microfluidics chips, that is the border between the internal surface of the cover slip (facing the chip) and the liquid flowing through the chip. When the experimentalist focuses the microscope on a position inside a growth chamber, the Definite Focus first finds the exact place of that border and then calculates the distance from there to the focus chosen by the user. This way, when the focus is lost owing to the different sources of z drift, the precise focusing distance can be recovered.

For the UV light source we use the X-Cite Exacte system with closed-loop feedback technology, which eliminates a large part of the naturally occurring intensity drift characteristic of UV light sources. The final goal of our experiments is to quantify the fluorescence emission levels over long time periods. Furthermore, very often in living systems the light is emitted by an extremely low number of molecules (<10) and is very weak, making any source of noise detrimental to a successful analysis. Therefore, a UV light source without drift in the excitation light intensity is indispensable. The low number of molecules also reduces the emission intensity. This could be overcome by longer or more intense excitation. However, UV light is harmful to bacteria, and the period of exposure should be reduced to the minimum, otherwise phototoxic effects appear.

To resolve this problem, we utilize an EM-CCD camera, which is capable of multiplying the signal at the hardware level and of detecting even 10 photons per pixel. The maximum capacity of the camera, however, cannot be used, because it also multiplies the background noise and deteriorates the quality of the phase-contrast image. In our system the latter is captured through a 100x immersion objective, which, combined with the 10x built-in microscope magnification and the additional 1.6x lens, provides one of the best magnifications for phase-contrast imaging available nowadays. Finally, our microscope is also equipped with 4 sets of excitation and emission filters appropriate for the 4 most widely used reporter fluorescent proteins: GFP, RFP, YFP and CFP.


This system is under the control of the Zeiss AxioVision software, which allows complete programmability of the whole setup. Programming a long time-lapse microscopy experiment is a compromise between a number of factors. On one hand, the fluorescence signal should be enhanced with the minimal possible UV light exposure of the cells, hence the EM capacity of the camera should be used. However, the latter reduces the quality of the phase-contrast images, and the whole image segmentation depends on this quality. Furthermore, we need to use the Definite Focus system to avoid the risk of strong z drift, but this increases the time for image acquisition at each of the preset positions. On the other hand, the longer the period between two consecutive frames, the larger the displacement of the tracked objects and the harder it is to track them through the whole movie. Therefore, the number of followed spots should be reduced.

The absolute requirements in fluorescence microscopy, which cannot be traded off against the other conditions, are good-quality phase-contrast images, a fluorescence signal that is strong enough but does not kill the bacteria, and a short period between consecutive frames. To meet the first requirement we use an electronic gain of usually 70, and at most 100, for the EM-CCD camera. To meet the second requirement, the intensity of the excitation UV light is set to 40% of the maximum, which is strong enough to excite even a low number of molecules, yet has a reduced phototoxic effect. Finally, in order to reduce the pause between consecutive frames we follow only 4 different spots, and the period of the time-lapse is 30 seconds. The period for fluorescence imaging does not need to be so short, and we use intervals between 3 and 7 minutes.

The experiments are performed in a dark room without access to direct natural light. Setting up a microfluidics chip on a microscope stage for long-term observation involves several steps. First, the starting culture is prepared by centrifugation and concentration of 50 ml of exponentially growing bacterial culture. To enhance the penetration of the water-based starting culture into the hydrophobic polydimethylsiloxane (PDMS) channels of the chip, the surfactant Tween 20 is added to a final concentration of 0.05%. The same concentration of surfactant is also used in the media for the experiment. The media also contain all needed antibiotics and inducers. There is a substantial difference between the quantity of starting culture (5 ml) and of media prepared (5 ml for each 10 hours of experiment and an additional 5 ml for the pump dead volume and loading losses). The fluids thus prepared are poured into syringes of appropriate sizes. Then dispensing needles (gauge 23) inserted into Tygon tubes of appropriate size (0.5 mm internal diameter) are fixed on the syringes through Luer locks. The tubes for the media should be long enough to reach easily from the pressure source (in our case syringe pumps) to the chip position on the microscope stage. For the cell-loading tubing, 20 cm should suffice.

Once the syringes with the fluids are ready, the loading starts. It begins with forcing the starting culture into the chip manually through the syringe. The empty channels of a microfluidics device are readily visible at an appropriate angle, whereas the filled channels are transparent from any point of view, hence visual control of the channel filling is easy. Once the region of the chip containing the growth chambers is full, some lateral stress needs to be generated to force the cells into the chambers. This stress is produced by manually flicking the tubes. After a few repetitions, the cell input tubes are unplugged from the chip ports and waste output tubes are connected in their place. The waste tubes lead directly to a dedicated waste container, which is positioned lower than the microscope stage in order to avoid backflow upon a pressure drop. Next, the media input tubes are plugged into the input ports and the media are run gently through the chip. This initial flow is slower than the operational one (we usually use 100 µl/h vs. 500 µl/h) to prevent the cells from being washed out in case they are not yet firmly lodged inside the chambers.

The chip is now properly supplied with nutrients and the cells will start growing; however, there is no temperature control yet. Therefore, the microfluidics device is fixed on the microscope stage at its final position. If the immersion objective is going to be used, any residual oil from previous experiments needs to be wiped off and a drop of fresh oil added at the center of the objective lens. The microfluidics chip is positioned gently above the objective with the cover slip facing the lens (i.e. downwards) and the tubing plugged on top. To prevent even minimal lateral movement of the chip on the stage while the stage moves between the individual observation spots, the chip is fixed with adhesive tape. The tubing carries substantial weight, so the tubes can also be fixed to the stage with adhesive tape. At this point the temperature control of the heated stage is used to set the appropriate temperature for cellular growth. We found that for our setup a stage temperature of 39 °C provides good bacterial growth inside the chip. After 1–2 hours the media flow can be raised to the normal level, and the syringe pumps may be programmed for the final experimental sequence of flows. Next, the positions of the spots to be observed are selected and saved, and the complete time-lapse is programmed. Before the time-lapse microscopy experiment is started, the artificial light must be turned off, since it increases the background noise and affects the final image quality.

A microfluidics-based time-lapse fluorescence microscopy experiment might continue for up to 72 hours. Given that roughly 130 images are produced every hour for each observed spot and that every image has a size of approximately 1 MB, the data that needs to be processed might amount to almost 10 GB per spot (130 × 72 ≈ 9 400 images), and to several times that when a few spots are followed in parallel. The analysis of such a pile of images is a laborious and complicated task and requires elaborate image processing techniques.
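The estimate above can be reproduced with a few lines of arithmetic. The following is only a sketch: the function names are hypothetical, and it assumes a 30-second phase-contrast period, a single fluorescence channel imaged every 5 minutes (within the 3 to 7 minute range quoted earlier), 1 MB per image and 1 GB = 1000 MB.

```python
def images_per_hour(phase_period_s, fluo_period_min):
    """Images per spot per hour: phase-contrast frames plus fluorescence frames."""
    phase_frames = 3600 / phase_period_s
    fluo_frames = 60 / fluo_period_min
    return phase_frames + fluo_frames


def dataset_size_gb(hours, spots, per_hour, image_mb=1.0):
    """Raw data volume in GB, taking 1 GB as 1000 MB (assumed convention)."""
    return hours * spots * per_hour * image_mb / 1000


# Assumed settings: 30 s phase-contrast period, one fluorescence image every 5 min.
rate = images_per_hour(phase_period_s=30, fluo_period_min=5)
per_spot = dataset_size_gb(hours=72, spots=1, per_hour=rate)
all_spots = dataset_size_gb(hours=72, spots=4, per_hour=rate)
print(f"{rate:.0f} images/h, {per_spot:.1f} GB per spot, {all_spots:.1f} GB for 4 spots")
```

Under these assumptions a single spot yields about 9.5 GB over 72 hours, and four spots about 38 GB.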

V.2. Image processing

V.2.1. Introduction

The rapid development of microfluidics technology in recent years, combined with fluorescent protein reporting, has made microscopy for single-cell analysis accessible to an ever-growing number of biological labs (Vinuselvi et al., 2011). Unlike traditional fluorimetry in automated apparatuses like TECAN, microfluidics-based microscopy allows for single-cell observations. Fluorescence-activated cell sorting (FACS) also records the fluorescence levels of individual cells; however, it cannot keep track of the cells and follow the dynamics of their fluorescence with time. Therefore, microfluidics technology combined with fluorescence microscopy provides unique experimental opportunities, which are becoming essential for modern biology (Ferry, Razinkov, & Hasty, 2011).

Image processing has the task of automating and standardizing the analysis of the enormous amount of data produced during microscopy experiments. Without it, a large part of the achievements of microfluidics technology would be useless. This automation has to convert the raw microscopy images into data on the dynamics of the fluorescence emitted by the observed cells. Effectively, image processing filters the unnecessary parts out of the bulk of the microscopy data, reducing the final data storage size 10⁴–10⁵ times (Fero & Pogliano, 2010). One of the major advantages of microfluidics technology is the single-cell level of the observations. Therefore, it is crucial that the image processing techniques allow the separate cells to be discriminated and tracked through the individual image frames. Also, the image processing should be as fast as possible, because the engineering loop for novel genetic devices involves characterization and improvement of the design. Hence, any slowdown of this step is directly transmitted to the pace of the whole research in engineering of synthetic biological parts. Consequently, image processing of fluorescence microscopy time-lapse data acquired in microfluidics devices has three main tasks. First, proper segmentation of the individual cells is required. Second, precise tracking of the separate cells through the consecutive time frames is crucial. Third, the whole processing should be accelerated as much as possible.

V.2.2. Image pre-processing

V.2.2.1. Thresholding

Before the proper segmentation of the microscopy image can begin, however, the raw images need to be digitized in a manner that allows further logic operations by the software. The image processing operations need to be able to discriminate first between “cell” and “not cell” areas of the image. This discrimination is best done in an image composed only of black and white spots, i.e. having only two possible values for the individual pixels. However, the raw microscope images are grayscale and are characterized by pixel values within the range of the memory assigned to them. For example, in an 8-bit grayscale image the total number of pixel values is 2^8 = 256, and they lie between 0 (absolute black) and 255 (absolute white). This type of image needs to be binarized, i.e. converted to pixel values of 0 and 1 only. When phase-contrast microscopy is used, the dense objects (the cells) are darker and the background is brighter. Therefore, the binarization can be accomplished by employing a certain threshold pixel value (Wu, Merchant, & Castleman, 2008). All the pixels below that value become 0 (black) and all the pixels above that value become 1 (white). However, in image processing the objects of interest (the foreground) need to be white (by analogy with mountain peaks covered with snow), with a pixel value of 1, and the background should be black (the dark valleys), with a pixel value of 0. Hence, the image is usually first inverted in intensity by a simple global mathematical operation (27):

$$ p_i = p_M - p = 255 - p \tag{27} $$

Where $p_i$ is the pixel value in the inverted image, $p_M$ is the maximal pixel value allowed by the bit depth of the image and $p$ is the pixel value in the original image. After this operation all white pixels become black ($p_i = 255 - 255 = 0$) and vice versa. The thresholding of the inverted image can be performed according to a global value (the same for all pixels) or a local value, which is allocated for each image region with some geometrical properties. The usual automatic techniques employed are based on the image histogram profile. Whichever thresholding technique is used, the result is a binary image with white foreground and black background.
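The inversion (27) and a global, histogram-based threshold can be sketched as follows. This is an illustration only, written in plain Python on a toy 8-bit image. Otsu's method is used here as one common histogram-based choice; the text above does not state which automatic technique was actually applied, so that choice, like all function names, is an assumption.

```python
def invert(image, p_max=255):
    """Equation (27): p_i = p_M - p for every pixel."""
    return [[p_max - p for p in row] for row in image]


def otsu_threshold(image, levels=256):
    """Global threshold from the histogram (Otsu: maximal between-class variance)."""
    hist = [0] * levels
    for row in image:
        for p in row:
            hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(levels):
        w0 += hist[t]            # pixels with value <= t (background class)
        sum0 += t * hist[t]
        w1 = total - w0          # pixels with value > t (foreground class)
        if w0 == 0:
            continue
        if w1 == 0:
            break
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(image, threshold):
    """Pixels above the threshold become 1 (foreground), the rest 0."""
    return [[1 if p > threshold else 0 for p in row] for row in image]


# Toy phase-contrast image: a dark 2x2 "cell" on a bright background.
raw = [
    [200, 210, 205, 198],
    [202,  40,  45, 207],
    [199,  42,  38, 203],
    [204, 208, 201, 206],
]
inv = invert(raw)
t = otsu_threshold(inv)
binary = binarize(inv, t)
for row in binary:
    print(row)
```

After inversion the cell pixels are the brightest ones, so the thresholded image has a white (1) cell on a black (0) background, as required for the subsequent steps.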

V.2.2.2. Object connectivity

Unfortunately, the individual white objects in the binary image are not always clearly distinguishable, because they do not have defined borders between each other. Therefore, the separate objects in the binary image are usually discriminated based on their connectivity. There are two types of connectivity employed in image segmentation. The first is 4-connectivity, in which only pixels adjacent vertically (up or down) or horizontally (left or right) are considered connected. The other type is 8-connectivity, in which diagonally adjacent pixels are also considered connected. Hence, all 4-connected pixels are also 8-connected, but the converse is not always true. For microscopy images of cells, 8-connectivity is usually employed.
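The difference between the two connectivity types can be illustrated with a small labeling routine. This is a minimal sketch using a breadth-first flood fill in plain Python; the function name and the toy image are hypothetical.

```python
from collections import deque

N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def label(binary, connectivity=8):
    """Label connected foreground (1) pixels; returns (label image, object count)."""
    offsets = N8 if connectivity == 8 else N4
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or labels[r][c] != 0:
                continue
            count += 1                      # start a new object
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in offsets:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] == 1 and labels[ny][nx] == 0):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


# Two 2x2 squares touching only at a corner.
img = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
_, n4 = label(img, connectivity=4)
_, n8 = label(img, connectivity=8)
print(n4, n8)
```

The two squares touch only diagonally, so they count as two objects under 4-connectivity and as a single object under 8-connectivity.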

V.2.2.3. Noise removal

Proper image segmentation depends on the quality of the transmitted-light images and on the properties of the segmented objects. Apart from the hardware limitations, the quality of phase-contrast images is usually deteriorated by a low signal-to-noise ratio. As discussed previously, the main source of noise in this type of experiment is the scattered background light. The photons belonging to scattered light are usually very few in number and therefore undetectable. However, an EM-CCD camera multiplies all detected photons, thus strengthening the background light as well. This phenomenon generates “salt and pepper” noise throughout all the image frames. It consists of a varying number of small bright spots scattered all over the image surface. As a first step, an image enhancement technique like 2D median filtering can be employed. The 2D median filter is a type of rank filter that uses the values of the surrounding pixels to change the value of outlier pixels in the image. Depending on the size of the window used to evaluate the surroundings, different results can be obtained.
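A 2D median filter can be sketched in a few lines. The version below is only an illustration under stated assumptions: a square 3x3 window, border pixels filtered using only the neighbours that exist, and the upper median taken when a window holds an even number of pixels.

```python
def median_filter(image, size=3):
    """2D median filter with a size x size window (clipped at the image border)."""
    half = size // 2
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = sorted(
                image[y][x]
                for y in range(max(0, r - half), min(rows, r + half + 1))
                for x in range(max(0, c - half), min(cols, c + half + 1))
            )
            out[r][c] = window[len(window) // 2]
    return out


# Uniform background with a single bright "salt" pixel.
noisy = [[10] * 5 for _ in range(5)]
noisy[2][2] = 255
filtered = median_filter(noisy)
print(filtered[2][2])
```

An isolated bright pixel is always outvoted by its eight neighbours, so it is replaced by the background value.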

If the size of the additive noise spots is larger than the size of the window used for 2D median filtering, the noise particles would not only be preserved, but also enlarged. Still, the size of those particles is significantly smaller than the size of the objects of interest (i.e. the cells). Therefore, a morphological operation that removes objects consisting of fewer pixels than a certain threshold would remove a large part of this type of noise. Image area opening is a standard morphological operation for the removal of connected objects with an area smaller than some predefined threshold.
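Area opening can be sketched as a connected-component search followed by a size test. The code below is a plain-Python illustration; the function name, the `min_area` parameter and the toy image are hypothetical, and 8-connectivity is assumed by default.

```python
def remove_small_objects(binary, min_area, connectivity=8):
    """Area opening: drop connected foreground objects smaller than min_area pixels."""
    if connectivity == 8:
        offsets = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1) if (dy, dx) != (0, 0)]
    else:
        offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            # Collect all pixels of the object containing (r, c).
            seen[r][c] = True
            stack, component = [(r, c)], []
            while stack:
                y, x = stack.pop()
                component.append((y, x))
                for dy, dx in offsets:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            # Keep the object only if it is large enough.
            if len(component) >= min_area:
                for y, x in component:
                    out[y][x] = 1
    return out


# One 6-pixel "cell" and two isolated noise pixels.
speckled = [
    [1, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 0],
    [1, 0, 0, 0, 0, 0],
]
cleaned = remove_small_objects(speckled, min_area=4)
for row in cleaned:
    print(row)
```

In practice `min_area` would be chosen well below the smallest expected cell area, so that only noise particles are discarded.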


V.2.2.4. Objects' boundaries restoration

The image filtered in that manner is used for segmentation and discrimination between the individual cells. Initially the 8-connected objects are individualized and each is labeled with a unique number. This allows for the future tracking of the cells through the consecutive time frames. Unfortunately, some cells are often so closely attached to each other that they appear as one connected object upon initial segmentation. However, living cells do not have random shapes and sizes, and their geometrical properties can be used to recognize merged objects during segmentation. If such objects are found, some parts of them can be removed in a proper direction so that the attached cells can be discriminated from each other. This can be accomplished by simple image erosion or by the much finer technique of image opening. With the first approach, a predefined object (O) is slid along the outer edge of the object and everything it overlaps with is removed, whereas in the second case O is used to split extrusions from larger objects. As a side effect, image opening splits some intact original objects into smaller parts by removing some parts of them (subtractive noise). To compensate for this effect, image opening is often combined with image closing, which closes the cracks generated by the former technique. Another approach to tackle close neighbors that appear overlapping is the watershed algorithm. It is usually employed for grayscale images; however, there are methods for applying it to binary image segmentation as well. The essence of this processing technique is best described by the flooding analogy (Illustration 40). If the image is regarded as the surface of a hilly region, this algorithm identifies the watershed lines in this area by fixing a center at each local minimum and maintaining an object around each of the centers. This way two neighboring cells appear as two neighboring valleys, which upon flooding are separated by a watershed and hence can be discriminated.
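How image opening separates two attached objects can be shown on a toy example. The sketch below is written under assumptions: the predefined object O is taken to be a 3x3 square, pixels outside the image are treated as background, and the function names are hypothetical.

```python
def erode(binary, size=3):
    """Keep a pixel only if the whole size x size square around it is foreground."""
    half = size // 2
    rows, cols = len(binary), len(binary[0])
    return [[int(all(0 <= r + dy < rows and 0 <= c + dx < cols
                     and binary[r + dy][c + dx] == 1
                     for dy in range(-half, half + 1)
                     for dx in range(-half, half + 1)))
             for c in range(cols)] for r in range(rows)]


def dilate(binary, size=3):
    """Set a pixel if any pixel of the size x size square around it is foreground."""
    half = size // 2
    rows, cols = len(binary), len(binary[0])
    return [[int(any(0 <= r + dy < rows and 0 <= c + dx < cols
                     and binary[r + dy][c + dx] == 1
                     for dy in range(-half, half + 1)
                     for dx in range(-half, half + 1)))
             for c in range(cols)] for r in range(rows)]


def opening(binary, size=3):
    """Opening = erosion followed by dilation with the same structuring element."""
    return dilate(erode(binary, size), size)


# Two 3x3 "cells" joined by a one-pixel bridge at row 2, column 4.
merged = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
]
opened = opening(merged)
for row in opened:
    print(row)
```

The erosion removes the thin bridge together with the outer layer of each cell, and the subsequent dilation restores the cells but not the bridge, so the two objects become separately labelable.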

V.2.3. Frame-to-frame object matching

The segmentation, verified by comparing the obtained objects with a predefined template or with dimensional margins, is performed for all image frames generated by transmitted-light microscopy. The individual cell labels are assigned either to the pixels delineating only the borders of the objects or to the pixels composing the complete area of the objects. This way, the information contained in a stack of 2D grayscale images is converted into an array of pixel lists corresponding to object labels. Furthermore, the objects in consecutive time frames need to be matched to each other in a manner that allows the position of the same object to be followed properly through time. This matching is accomplished by comparing the morphological and/or logical characteristics of the objects in the two frames. The compared characteristics might be size, position, position of the centroid, neighbors, overlap, momentum, etc.


Illustration 40: Watershed segmentation technique. The original image (A) is regarded as a relief map and is flooded (B). The separate "valleys" are segmented as individual objects (C).

The evaluation of the matching between two objects is, in the general case, a polynomial function that includes one or more of those characteristics and assigns a different weight to each of the components used:

$$ \Delta P = c_1 \times \Delta p_1 + \dots + c_n \times \Delta p_n \tag{28} $$


Where $\Delta P$ is the total mismatch between the two compared objects, $\Delta p_n$ is the mismatch in the n-th individual component of the evaluation function and $c_n$ is the weight assigned to that component. A match is found when the evaluation function is minimal and/or smaller than some predefined threshold. The same result can be obtained by evaluating the match between the separate components instead of their mismatch, but in that case the match corresponds to the maximal non-zero value of the evaluation function. Thus, the final outcome of the tracking process is a matrix recording the matching between the individual object labels through the whole time series of the experiment. Combined with the previous array of pixel lists, this matrix can produce a simpler array of pixel lists with a reduced number of labels, thus reducing the total size of the recorded information once more.
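A frame-to-frame matching based on equation (28) can be sketched as follows. Only two components are used here, centroid displacement and area difference; the weights, the mismatch threshold, the object dictionaries and the function names are all illustrative assumptions. The sketch picks the best candidate for each object independently and does not handle cell division or enforce one-to-one assignments.

```python
import math


def mismatch(a, b, weights):
    """Equation (28): weighted sum of the per-component mismatches of two objects."""
    d_centroid = math.dist(a["centroid"], b["centroid"])
    d_area = abs(a["area"] - b["area"])
    return weights["centroid"] * d_centroid + weights["area"] * d_area


def match_frames(prev, curr, weights, max_mismatch):
    """Map each label in prev to the label in curr with minimal mismatch (or None)."""
    matches = {}
    for label_a, a in prev.items():
        best_label, best_score = None, max_mismatch
        for label_b, b in curr.items():
            score = mismatch(a, b, weights)
            if score < best_score:
                best_label, best_score = label_b, score
        matches[label_a] = best_label
    return matches


# Hypothetical objects from two consecutive frames (labels are arbitrary).
frame_t = {
    1: {"centroid": (10.0, 10.0), "area": 40},
    2: {"centroid": (30.0, 12.0), "area": 55},
}
frame_t1 = {
    7: {"centroid": (31.0, 12.5), "area": 57},
    8: {"centroid": (10.5, 11.0), "area": 41},
    9: {"centroid": (80.0, 80.0), "area": 40},
}
weights = {"centroid": 1.0, "area": 0.1}
matches = match_frames(frame_t, frame_t1, weights, max_mismatch=5.0)
print(matches)
```

Collecting such dictionaries for every pair of consecutive frames gives the matching matrix described above.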

V.2.4. Single-cell fluorescence level tracking

The final tracking array is further used to describe the dynamic fluorescence emission of each individual cell. Depending on the distribution of the fluorescent molecules inside the cell volume, there are two possible approaches for that operation. First, if the fluorescent molecules are localized, specific algorithms for fluorescence peak detection are employed to discover the positions of the individual loci of interest. Later on, those positions might be tracked with time inside the same cell or be recorded with respect to some specific morphological features of the cell (tips, centroid, nucleus, etc.). In the second case, the fluorescent molecules move inside the whole cell volume and the resulting fluorescence signal is distributed more or less homogeneously. Then, the average fluorescence per pixel is obtained for each cell. This is done by summing the intensities, taken from the fluorescence image, of the pixels composing the cell and dividing the sum by the total number of those pixels:

$$ F = \frac{\sum_{i=1}^{n} f_i}{n} \tag{29} $$


Where $F$ is the average fluorescence of a cell at a certain time point, $f_i$ is the fluorescence intensity of the individual pixels and $n$ is the total number of pixels composing the cell. Consequently, the vector consisting of the average cell intensities at each time point is considered to be the fluorescence dynamics of this cell. Thus, eventually, the sizable data contained in the thousands of images produced by the time-lapse microscopy experiment is reduced to a matrix of fluorescence values for each cell at each time point. In other words, the information from image files of tens of GB is finally compressed into a few kB in the computer memory.
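Equation (29) amounts to averaging the fluorescence image over the pixels carrying each cell label. A minimal sketch in plain Python, with a hypothetical function name and toy data, assuming label 0 denotes background:

```python
def mean_fluorescence(fluo_image, labels):
    """Equation (29): average fluorescence per pixel for every labelled cell."""
    sums, counts = {}, {}
    for fluo_row, label_row in zip(fluo_image, labels):
        for f, lab in zip(fluo_row, label_row):
            if lab == 0:          # background pixel, not part of any cell
                continue
            sums[lab] = sums.get(lab, 0) + f
            counts[lab] = counts.get(lab, 0) + 1
    return {lab: sums[lab] / counts[lab] for lab in sums}


# Label image from segmentation and the matching fluorescence frame.
labels = [
    [1, 1, 0, 2],
    [1, 0, 0, 2],
]
fluo = [
    [100, 110, 5, 50],
    [120,   4, 6, 70],
]
F = mean_fluorescence(fluo, labels)
print(F)
```

Repeating this for every fluorescence frame, with labels made consistent by the tracking step, yields one row of the fluorescence matrix per time point.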

V.3. Image processing algorithms

V.3.1. Image processing algorithms developed in collaboration

An image processing algorithm for the analysis of long time-lapse microscopy data was developed by Catalin Fetita and his co-workers in collaboration with our lab. This software was purpose-built for studying the behavior of bacterial cells during long-term exponential growth in single-cell layers. The microscopy setup was similar to the one already described. For efficient single-cell tracking we acquired phase-contrast images every minute; the fluorescence imaging period depended on the genetic construct, but was generally between 3 and 10 minutes. The number of traps that we could follow was limited by the 1-minute period and the time required for re-focusing; thus, we normally followed 3 separate traps. The cell segmentation in this software relied on the halo generated around the cells by phase-contrast imaging. The cell boundary was detected based on this difference in pixel values, and the cell contours were further closed based on the assumption that the cells are aligned along the short axis of the growth chamber and perpendicular to the supply channels (Volfson, Cookson, Hasty, & Tsimring, 2008). Furthermore, connectivity graphs were employed for the tracking of individual cells, which allowed the direct selection of a particular cellular path and of the cells that continuously remained in the growth chamber. This algorithm was applied specifically to the characterization data of the oscillatory copy number plasmid developed by us, and the results were published in a peer-reviewed paper (Fetita, Kirov, Jaramillo, & Lefevre, 2012).

V.3.2. Image processing algorithm developed during this research

For our specific experimental needs we developed a web-based tool for automated Single-Cell image Processing (SinCePro) of E. coli microscopy data. It is based on MATLAB and, apart from three parameters regarding the timing of the image acquisition and analysis, it is fully automated and unsupervised. Since it is web-based, unlike other similar tools, our tool requires no software installation and no powerful CPU, and it is written for parallel computing, which makes it significantly faster.

Our main interest is synthetic biology research, where the need for precise quantitative models of the engineered biological parts and of the circuits based on them is becoming more and more obvious. Such models can only be based on precise quantitative data on the dynamics of the molecular species involved in the described reactions. The development of microfluidics technology has allowed the cultivation of cells in single layers and in a precisely controlled environment. The combination of this tool with powerful light and fluorescence microscopy makes it possible to observe single cells for long time periods in a reproducible manner.

The analysis of the behavior of a synthetic part in a host organism involves the processing of very long time-lapse experimental data covering a statistically significant number of individual cells.

A number of packages already exist for the automation of this type of image processing. Among the most cited are CellTracer (http://www.stat.duke.edu/research/software/west/celltracer), MicrobeTracker (http://microbetracker.org) and CellProfiler (http://www.cellprofiler.org). They have all proved to be quite useful in a number of applications, such as the analysis of cell growth and shape (Fenton & Gerdes, 2013), cell wall (Desmarais, De Pedro, Cava, & Huang, 2013), cell motility (Nan et al., 2013), peptide aggregates (Coquel et al., 2013), secretion from bacteria (LeRoux et al., 2012), and yeast, drosophila and human cells (Wang, Niemi, Tan, You, & West, 2010). Unfortunately, they also have some drawbacks, which we aimed to overcome with the SinCePro package. The first is the requirement to install software, be it the package itself (CellProfiler) or third-party software, namely MATLAB (CellTracer and MicrobeTracker).

In contrast, our processing software is provided as a web-based tool and therefore requires only an Internet connection. Secondly, all of those products aim to be highly versatile and to handle different types of microscopy data. As a result, data processing has to be initialized by setting a number of parameters and pre-processing steps. We developed our software for the reproducible output of microfluidics-based epifluorescence microscopy, which we believe to be the future of high-throughput single-cell analysis. The only parameters SinCePro requires are the time settings used for the imaging. Next comes the question of CPU usage. The locally installed packages consume a substantial amount of computing power for a considerable time. Since SinCePro is web-based, it takes no CPU power from the resources of the lab. Finally, as the number of processed images is usually in the tens of thousands, parallel computing is a must for completing the analysis within a reasonable time. Unfortunately, parallel computing is not readily available in any of the above-mentioned packages. To address this problem in SinCePro, we dedicated considerable effort to parallelizing the most time-consuming processing steps, namely cell segmentation and cell tracking. As a result, we obtained a processing speed-up almost proportional to the number of cores dedicated to the processing. At present we host SinCePro on a computation node with 12 processors.

To access the SinCePro web tool, the user must first register for free on the website (https://absynth.issb.genopole.fr/Bioinformatics) of the joint Synthetic Biology platform of Genopole and the iSSB. Once the user is logged in, the interface of the tool is quite simple. The user browses for the files containing the phase-contrast and fluorescence movies and starts uploading them. In addition, three parameters have to be set before the image processing can be run: first, the period of phase-contrast imaging in minutes (e.g., if the imaging was performed every 30 seconds, the period is set to 0.5); second, the period of fluorescence imaging, set in the same way; and finally, the desired minimal duration of cell tracking (Illustration 41).

Illustration 41: SinCePro web interface. The required settings are on the right-hand side: i) the period between the consecutive fluorescence images in the time lapse; ii) the period between the consecutive phase-contrast images; iii) the desired minimal time used to decide which cell trajectories to be added in the final report. Next the user should just upload the images in the required format and click on the "PROCESS" button in the middle of the screen. Upon finalization of the image processing an email with a link for downloading the report is sent.

The SinCePro workflow consists of: i) pre-processing of the phase-contrast images, ii) segmentation of the pre-processed images, iii) tracking of the segmented objects (i.e. cells), and iv) post-processing of the obtained tracking data for the extraction of the dynamic fluorescence data. During this process we need to tackle different types of problems: hardware problems, software problems and problems caused by the physiology of the bacterial cells. The major hardware problem is the occasional slow drift of the sample during the long time-lapse experiments we perform. The software limitations arise from the fact that we use microfluidics chips with growth chambers containing about 100 cells, which over a relatively short period of 10 hours divide into millions of new cells. The rigorous tracking of such a number of cells is a serious challenge for the computational resources of a typical biological lab, including ours. Finally, the physiology of the cells we work with is an impediment because of two major factors. First, cells can sustain only a limited amount of UV light exposure before their DNA is irrevocably damaged. Therefore, we need to reduce the excitation time as much as possible, including by using the electronic gain of our EM-CCD camera. However, the latter results in reduced quality of the phase-contrast images. Second, the visible inhomogeneity of the internal density of the cells gives rise to wrong segmentation.

V.3.2.1. Image pre-processing

The pre-processing of the phase-contrast images (Illustration 42) starts with the generation of a movie from the separate images, followed by the black-and-white inversion of that movie (Illustration 42, (A2)) as already described (27). The inverted movie is then binarized (Illustration 42, (A3)) with an automatically selected threshold. However, we employed a somewhat unorthodox approach for the threshold selection. Unlike traditional methods, where the threshold value is determined from the image histogram profile, we decided to tackle this problem heuristically. Our procedure consists in testing all possible threshold values (from 255 down to 0) and detecting all 8-connected objects produced in the image by each of them. Next, our script counts the objects whose morphology falls within the limits of a typical cell and selects the threshold that produces the highest number of proper cells. In this way the procedure can easily be adapted to any kind of organism, as long as we are able to determine sensible morphological margins describing its normal size and shape. Unfortunately, performing such a trial-and-error routine for each of the tens of thousands of phase-contrast images would require substantial computational resources and would significantly slow down the whole image processing. However, we found empirically that the selected threshold value changes by no more than 1-2 intensity levels between the individual time frames. Such a small deviation does not significantly affect image segmentation; hence we search for the most adequate threshold value only once, in the first analyzed frame, and then use exactly the same value for all subsequent frames.
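The heuristic threshold search can be illustrated with a minimal sketch. It is written in plain Python rather than in the Matlab of the actual scripts, and it is not the SinCePro implementation: the function names are hypothetical, and the only morphological criterion assumed here is the object area (min_area and max_area, in pixels), whereas the real script may use other size and shape limits.

```python
from collections import deque


def label_objects(mask):
    """Return the 8-connected foreground objects as lists of (row, col) pixels."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    objects = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                seen[r][c] = True
                queue, pixels = deque([(r, c)]), []
                while queue:
                    y, x = queue.popleft()
                    pixels.append((y, x))
                    # Visit all 8 neighbours of the current pixel.
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < rows and 0 <= nx < cols
                                    and mask[ny][nx] and not seen[ny][nx]):
                                seen[ny][nx] = True
                                queue.append((ny, nx))
                objects.append(pixels)
    return objects


def count_cell_like(image, threshold, min_area, max_area):
    """Binarize at `threshold` and count objects whose area is cell-like."""
    mask = [[pixel >= threshold for pixel in row] for row in image]
    return sum(min_area <= len(obj) <= max_area for obj in label_objects(mask))


def select_threshold(image, min_area, max_area):
    """Try every threshold from 255 down to 0; keep the one giving most cells."""
    best_threshold, best_count = 255, -1
    for threshold in range(255, -1, -1):
        count = count_cell_like(image, threshold, min_area, max_area)
        if count > best_count:
            best_threshold, best_count = threshold, count
    return best_threshold, best_count
```

For a toy image with two bright 2 × 2 blobs joined by a dimmer bridge, the search settles on the highest threshold that separates the blobs; at lower thresholds the bridge merges them into one oversized object, which is rejected by the area limits.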

The next step of our image pre-processing flow is the compensation for the small horizontal mechanical drifts occurring during the experiment (Illustration 42, (B)). Those are provoked by the periodic repositioning of the microfluidics chip to each of the tracked sampling positions at each imaging time point. Despite the adhesion of both the chip and the tubing to the microscope stage, sometimes the lateral pressure is too strong and minor displacements occur. The compensation is accomplished by automatically selecting a large object in the field of view (usually a structure of the chip) to be used as an “anchor”. A morphological feature of this structure (its right-most point) is used as a reference point for repositioning the whole image so that the “anchor” lies over its original coordinates. Importantly, this procedure can be made much easier if a suitable structure is already envisioned in the chip design and built into the microfluidics device on purpose. This is a typical example of the mutual dependence between chip design and image processing.

Usually, at this point of the image processing a combination of image-enhancement techniques is used to improve the quality of the image and its suitability for segmentation (Illustration 42, (A4)). Unfortunately, there is no single theory prescribing fixed steps for obtaining certain image characteristics. This is largely due to the variety of hardware combinations found in different microscopy setups. Therefore, the normal approach is to derive empirically the best combination of enhancement techniques for each separate case (van Teeffelen, Shaevitz, & Gitai, 2012).
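The anchor-based drift compensation can be sketched as follows. This is an illustrative Python sketch under simplifying assumptions, not the actual script: the anchor is assumed to be already isolated as a binary mask, the shift is a whole number of pixels, pixels shifted in from outside the frame are filled with background, and all function names are hypothetical.

```python
def rightmost_point(mask):
    """Return (row, col) of the right-most foreground pixel (top-most on ties)."""
    best = None
    for r, row in enumerate(mask):
        for c, value in enumerate(row):
            if value and (best is None or c > best[1]):
                best = (r, c)
    return best


def shift_image(image, d_row, d_col, fill=0):
    """Translate the image by (d_row, d_col); uncovered pixels get `fill`."""
    rows, cols = len(image), len(image[0])
    shifted = [[fill] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            nr, nc = r + d_row, c + d_col
            if 0 <= nr < rows and 0 <= nc < cols:
                shifted[nr][nc] = image[r][c]
    return shifted


def compensate_drift(image, anchor_mask, reference_point):
    """Move the image so that the anchor's right-most point returns to
    `reference_point`, its position in the first frame."""
    current = rightmost_point(anchor_mask)
    d_row = reference_point[0] - current[0]
    d_col = reference_point[1] - current[1]
    return shift_image(image, d_row, d_col)
```

In use, reference_point would be computed once with rightmost_point on the first frame and then passed unchanged to compensate_drift for every later frame.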

One way to tackle this problem would be the emergence of standard microscopy techniques based on microfluidics devices. Our pre-processing continues with the removal of “salt and pepper” noise. As discussed before, this is usually accomplished by applying a 2-D median filter. The neighborhood we use for the pixel ranking was defined heuristically as a window of 8 × 2 pixels. This operation successfully removes some of the smaller additive noise; however, similar quantities of subtractive noise provoked by the same sources still remain in the image, in the form of a large number of small black holes in the identified objects. For their removal we employ the imfill function of Matlab, which is essentially a flood-fill operation: all background pixels (i.e. “0”) that cannot be reached from the edge of the image are converted to foreground (i.e. “1”), and thus the holes in the objects are filled.
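The flood-fill idea behind the hole filling can be shown with a short Python sketch. It is an illustration of the principle only, not a reimplementation of imfill; the function name is hypothetical, and the background is assumed here to be 4-connected.

```python
from collections import deque


def fill_holes(mask):
    """Set to 1 every background pixel that cannot be reached from the
    image border through 4-connected background pixels."""
    rows, cols = len(mask), len(mask[0])
    reachable = [[False] * cols for _ in range(rows)]
    queue = deque()
    # Seed the flood fill with all background pixels on the image border.
    for r in range(rows):
        for c in range(cols):
            on_border = r in (0, rows - 1) or c in (0, cols - 1)
            if on_border and not mask[r][c]:
                reachable[r][c] = True
                queue.append((r, c))
    # Spread through the background from the border inwards.
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and not mask[nr][nc] and not reachable[nr][nc]):
                reachable[nr][nc] = True
                queue.append((nr, nc))
    # Foreground stays foreground; unreachable background becomes foreground.
    return [[1 if mask[r][c] or not reachable[r][c] else 0
             for c in range(cols)] for r in range(rows)]
```

A ring-shaped object therefore becomes a solid one, while background that touches the border, directly or through other background pixels, is left unchanged.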

Illustration 42: Image pre-processing steps. Those include preparation of the image stacks and image filtering (A) and compensation for the occasional horizontal drift occurring during the long time-lapse experiments (B). The image binarization and filtering also leads to a significant reduction of the image file size (A, right column).

V.3.2.2. Image segmentation

At this point the filtered images obtained after the pre-processing are ready to be segmented. This process begins with an initial segmentation that labels the individual objects in the image. Those objects are then checked for compliance with the maximal allowed sizes of some of the morphological features of the cells. The typical E. coli cell growing in the chambers of our microfluidics devices has a diameter of 1 micrometer and a length between 5 (immediately after division) and 10 (immediately before division) micrometers. Since bacteria do not always divide properly, their length can vary significantly; therefore we use the cell diameter as the measure of proper segmentation. Given the magnification we use, each pixel is 50 nanometers wide, hence the cell diameter should be at most

$$\frac{1000 \times 10^{-9}\ \text{m}}{50 \times 10^{-9}\ \text{m/pixel}} = 20\ \text{pixels}.$$

If some of the objects are too big, they are processed by gradually repeated erosion with a structuring element consisting of 1 pixel (Illustration 43).

This erosion aims to generate small “seeds” from which to begin the segmentation of each of the final individual cells. During this loop of deformation of the missized objects, some existing protrusions are cut from the original shapes and produce new “salt and pepper” noise. The latter is removed by applying the 2-D median filter once again. In addition, new holes are formed in the existing large objects, and they are again refilled with the imfill command. When all the objects match the normal cell size requirements, they can be used as “seeds” for the generation of the final cells to be tracked. However, some of the objects have by now been eroded too much and some actual cells have been split into more than one seed. Therefore, the image area open function is used to remove all objects containing fewer than 200 pixels (typical cells are made of about 2000 pixels).

Illustration 43: Results from the gradual erosion of laterally merged cells. If some geometrical property of an object (e.g. the short axis) is bigger than the theoretical limit, this is usually due to the merging of two neighboring cells (A). In that case the object is gradually eroded, which leads to the formation of two (or more) clearly distinct objects (B).

Next, the seeds are expanded to match the relief of the initial binary image with a modified watershed algorithm, while maintaining the labels of the individual objects. This way, cells that were initially merged are segmented and discriminated. Finally, the segmented images are cleaned once more by removing small objects and “salt and pepper” noise. The final segmented images thus obtained are labeled with the bwlabel Matlab function, which produces a number of statistical data for the objects found in the image. For the purpose of the subsequent cell tracking and fluorescence data extraction, the list of pixels constituting each of the separate objects is recorded. The pixel list uses the linear pixel notation, where each image pixel is addressed by a single number obtained by counting the pixels row by row. Thus, in a 1000 × 1000 pixel image, pixel number 5456 refers to the 456th pixel in the 6th row of the image. More generally (11):

$$\frac{p_n}{l_r} = (r_p - 1)\ \text{remainder}\ c_p \qquad (30)$$

where $p_n$ is the pixel number, $l_r$ is the length of the rows of the image, $r_p$ is the row number of the pixel and $c_p$ is the column number of the specific pixel.
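The row-by-row addressing can be checked with a small Python sketch. The function names are hypothetical, and rows, columns and pixel numbers are counted from 1, as in the text. Subtracting 1 before the division is a choice made here so that the last pixel of a row (for which the plain division leaves a remainder of 0) is also converted correctly. Note also that Matlab's built-in linear indexing counts column by column, so the row-by-row convention described here is the one of the text, not the Matlab default.

```python
def to_linear(row, col, row_length):
    """Pixel number of the pixel at (row, col), counting row by row from 1."""
    return (row - 1) * row_length + col


def from_linear(pixel_number, row_length):
    """Inverse conversion: return the 1-based (row, col) of a pixel number."""
    row, col = divmod(pixel_number - 1, row_length)
    return row + 1, col + 1


# The example from the text: a 1000 x 1000 image, 456th pixel of the 6th row.
print(to_linear(6, 456, 1000))   # 5456
print(from_linear(5456, 1000))   # (6, 456)
```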

At this point image segmentation is completed and the process of cell tracking can begin. Of all the image processing so far, proper image segmentation is the bottleneck regarding time consumption. The segmentation of a single image frame is relatively fast and is slowed down only by the repeated erosion of the erroneous objects and by the read/write operations for loading the image and writing the segmentation outcome. If a relatively small number of images is processed, the slowdown can be overcome by keeping the segmentation statistics in memory as a single variable, which is written to disk only after the whole image sequence has been processed. However, each of the image statistics has a size of a few kilobytes, and with tens of thousands of images this results in keeping thousands of megabytes in the computer RAM. If the machine has a typical memory size of a few gigabytes, memory clogging becomes inevitable. Therefore, saving the segmentation results for each separate image frame is obligatory. However, this relatively fast operation, repeated tens of thousands of times, substantially slows down the complete image processing flow.

To tackle the inevitable slowing of the image processing at this point, the analysis of the separate frames is parallelized. For that purpose, and for the sake of clarity and orderliness, the segmentation function was written as a separate script, which is called by the main script, rather than as a local function. This approach has many advantages, such as providing the total script with modularity, readability by other programmers, reduced size of the separate modules, etc. In the context of parallelization, the most important advantage of writing the consecutive segmentation steps as individual functions is connected to issues of transparency. The possibility for automatic parallelization in Matlab is provided mainly by the parfor loop. Upon entering the latter, Matlab automatically distributes the individual iterations of the loop among the cores available to the matlabpool. This distribution follows no guaranteed order; therefore, keeping track of the required order of events in the loop is impossible. Hence, to avoid mixing of the labels, etc., all reading and writing of variables must be done inside the parfor loop itself. This also means that the load and save functions cannot be called directly in the loop body, and the only possible way to call them is as part of another function. The way to write a proper algorithm following those rules is not always obvious and requires skillful programming. However, we managed to devise a script that utilizes Matlab parallel computing and accelerates segmentation almost linearly with the number of cores employed (Illustration 44).
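The structure imposed by these rules, one self-contained function per frame that does its own loading and saving, can be sketched in Python as an analogy to the Matlab parfor loop. Everything in this sketch is an assumption made for illustration: the JSON file format, the function names, and the placeholder "segmentation" that merely lists the foreground pixels. A thread pool is used only to keep the sketch short; for CPU-bound work in CPython a process pool would be the closer counterpart of the Matlab workers.

```python
import json
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def segment_frame(frame_path, result_path):
    """Load one frame, process it and save the result. The function touches
    no shared variables, so frames can be processed in any order."""
    with open(frame_path) as handle:
        frame = json.load(handle)
    # Placeholder for the real segmentation: row-by-row pixel numbers
    # (counted from 1) of all foreground pixels.
    row_length = len(frame[0])
    pixels = [r * row_length + c + 1
              for r, row in enumerate(frame)
              for c, value in enumerate(row) if value]
    with open(result_path, "w") as handle:
        json.dump(pixels, handle)
    return result_path


def segment_all(frame_paths, result_dir, workers=4):
    """Distribute the frames among the workers; results are collected in
    frame order even though execution order is not guaranteed."""
    jobs = [(path, os.path.join(result_dir, os.path.basename(path) + ".seg.json"))
            for path in frame_paths]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(segment_frame, src, dst) for src, dst in jobs]
        return [future.result() for future in futures]


def demo():
    """Write two tiny frames to a temporary folder and segment them."""
    with tempfile.TemporaryDirectory() as folder:
        paths = []
        for index, frame in enumerate([[[0, 1], [1, 0]], [[1, 1], [0, 0]]]):
            path = os.path.join(folder, f"frame_{index}.json")
            with open(path, "w") as handle:
                json.dump(frame, handle)
            paths.append(path)
        results = []
        for result_path in segment_all(paths, folder):
            with open(result_path) as handle:
                results.append(json.load(handle))
        return results
```

Because each result is written to its own file, no large variable has to be kept in memory across frames, which mirrors the per-frame saving described above.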

V.3.2.3. Frame-to-frame cell tracking

Next in the image processing flow is the cell tracking. As described above, it is based on the pixel addresses of the separate objects found by the segmentation in consecutive image frames. Since we use an imaging period of 30 seconds, the cells are not substantially displaced between consecutive images and we can adopt a simpler way to evaluate object matching. Therefore, we use a simplified version of (9), namely:

$$P_{\cap} = p_i \cap p_j \qquad (31)$$

where $p_i$ and $p_j$ are the pixel addresses of the two compared objects. Matlab has a dedicated function, intersect, for calculating the intersection of two linear vectors.
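The overlap measure of (31) and its use for matching can be sketched with Python sets, which play the role of the Matlab intersect function here. The function names and the dictionary layout (label mapped to pixel list) are assumptions made for this illustration.

```python
def overlap(pixels_a, pixels_b):
    """Number of pixel addresses shared by two objects."""
    return len(set(pixels_a) & set(pixels_b))


def best_match(target, candidates):
    """Return (label, overlap) of the candidate sharing most pixels with
    `target`, or (None, 0) if no candidate overlaps it at all.
    `candidates` maps an object label to its list of pixel addresses."""
    best_label, best_overlap = None, 0
    for label, pixels in candidates.items():
        shared = overlap(target, pixels)
        if shared > best_overlap:
            best_label, best_overlap = label, shared
    return best_label, best_overlap
```

For example, an object with pixels [1, 2, 3, 4] compared against the candidates {1: [3, 4, 5], 2: [1, 2, 3, 9], 3: [10]} is matched to label 2, with which it shares three pixels.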

Illustration 44: Graphical representation of the effect of parallelization over the image processing speed. Note that the 1 X 2.4GHz and the 2 X 2.4GHz experiments were performed on the same Core2Duo machine without and with parallelization of the script respectively.

The pair of objects from two consecutive frames with the highest number of overlapping pixels is denoted as matching, and the two objects are considered to be one and the same cell throughout the tracking. Thus all the objects from frame 1 are compared to all the objects from frame 2, then all objects from frame 2 are compared to all objects from frame 3, etc. The comparison of two vectors for overlaps is computationally very light and requires no special resources. However, matching hundreds of objects to hundreds of cells through all consecutive pairs of frames multiplies the computational effort significantly and could also slow down the entire image processing. Therefore, parallelization is also at the core of the tracking script. Because parallel tasks are executed in an arbitrary order, the matching could not be parallelized over the consecutive pairs of frames; keeping track of the proper frame sequence would be rather difficult. Instead, we wrote the parallel loop over the separate objects, each of which is followed through all the frames by looking for its match frame after frame. This way parallelization was possible, but the total number of parallel loops had to be declared in advance. Therefore, we needed to limit the number of tracked cells in order to declare a fixed number prior to the start of the loop. However, E. coli bacteria divide once every 30 minutes, which means that within 10 hours each of the cells from the starting culture would generate $2^{20} \approx 10^{6}$ new cells. Most of those end up being pushed out of the growth chamber before they can divide. Still, the total number of cells generated in the growth chamber is approximately $10^{4}$. Since we are not able to track all possible cell trajectories, and since most of those cells reside in the growth chamber for a very short time, we had to reduce the number of tracked cells as much as possible. In addition, we had to narrow the tracking to the cells from which we could obtain the highest amount of information, i.e. the longest-residing ones. Starting from the first frame, it would be extremely difficult to predict which cells' progeny would end up being washed out of the chamber and which would remain inside. Therefore, we decided to reverse the tracking time vector, starting from the last frame and returning towards the first. This way we were inevitably tracking the longest-residing cells in the chamber, because by definition new cells are generated only from existing ones, so the last cells to remain had to descend from some of the first ones to enter.
This time reversal is introduced at the beginning of the pre-processing of the image frames, in the step where the movies are generated from the image stacks, by reversing the time order of the first movies to be generated, i.e. the phase-contrast and the fluorescence ones.
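The estimate of roughly a million descendants per starting cell follows directly from the figures quoted above, a 30-minute division time and a 10-hour experiment, under the idealized assumption that no cell is lost and every cell keeps dividing:

```latex
n = \frac{10\ \text{h} \times 60\ \text{min/h}}{30\ \text{min}} = 20
\quad\Longrightarrow\quad
2^{n} = 2^{20} = 1\,048\,576 \approx 10^{6}
```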

The actual tracking algorithm consists in loading into memory the pixel lists for the objects of each two consecutive frames. Next, each of the objects from the newer frame is compared to all the objects from the older one, and the pairs with the largest pixel overlap are considered to be consecutive images of the same cell. The numbers of objects found in consecutive frames are not necessarily equal. On the one hand, cells from the older frame might have been pushed out of the chamber before the next image was taken, which reduces the number of objects. On the other hand, upon division one cell splits into two, and thus the number of objects in the newer frame is higher. Finally, wrong segmentation might split the same cell into more than one object and give rise to even more mismatches between the total numbers of objects in consecutive frames. However, we are tracking a limited number of cells and every lost lineage is detrimental. Therefore, after the matching between the objects of the two frames, additional operations are required to reduce data losses. In the first place, cells from the newer frame without a match in the older one are noted as “badly tracked”. Furthermore, two or more cells from the newer frame might match the same object. This happens upon cell division, which appears as merging in reverse time order; hence two separate labels match best with a single one from the older frame. To avoid redundancy in the tracking, the new label is matched to only one of the two old labels, and the other is also noted as “badly tracked”. Finally, for all the “badly tracked” cells we try to find unmatched cells from the older frame and to start a new tracking lineage. Unmatched cells that are not used to start new lineages in place of “badly tracked” cells are lost. This way the total number of tracked cells is always the same. The tracking data is saved for each time frame with the label and pixel list for each cell. If there is a good match between two objects in consecutive frames, they have the same label and can therefore be followed later. Starting a new lineage means that a new label value is assigned, which increases the maximal label value in the frame and throughout the tracking, but not the total number of tracked cells.
This way we always have the same number of actually tracked cells in the same parallel tracking loop, but we increase our chances of obtaining longer individual tracks for the separate cells.
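One step of this bookkeeping can be sketched in Python. The sketch is a simplified, sequential illustration and not the parallel Matlab script: the function name, the data layout (a dictionary from label to pixel list for the tracked cells, a plain list of pixel lists for the objects of the next processed frame) and the tie-breaking rule (when two tracked cells claim the same object, the one with the larger overlap keeps it, the first one on ties) are all assumptions.

```python
def track_step(tracked, objects, next_label):
    """Advance all tracked cells by one frame.

    tracked    -- dict: label -> pixel list in the frame processed last
    objects    -- list of pixel lists found in the frame processed now
    next_label -- first unused label value
    Returns the updated dict and the new value of next_label."""
    claimed = {}   # object index -> (label, overlap) of the cell keeping it
    lost = []      # labels noted as "badly tracked"
    for label, pixels in tracked.items():
        best_index, best_overlap = None, 0
        for index, candidate in enumerate(objects):
            shared = len(set(pixels) & set(candidate))
            if shared > best_overlap:
                best_index, best_overlap = index, shared
        if best_index is None:
            lost.append(label)                      # no match at all
        elif best_index in claimed and claimed[best_index][1] >= best_overlap:
            lost.append(label)                      # object already taken
        else:
            if best_index in claimed:
                lost.append(claimed[best_index][0])  # displace weaker claim
            claimed[best_index] = (label, best_overlap)
    updated = {label: objects[index] for index, (label, _) in claimed.items()}
    # Restart each lost lineage on a still unmatched object, with a new label.
    free = [index for index in range(len(objects)) if index not in claimed]
    for _ in lost:
        if not free:
            break
        updated[next_label] = objects[free.pop(0)]
        next_label += 1
    return updated, next_label
```

In this sketch the number of tracked cells stays constant as long as enough unmatched objects are available to replace the lost lineages, while the maximal label value grows with every restarted lineage.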

V.3.2.4. Image post-processing

The post-processing of the image data begins with choosing the cells with the best tracking results, which will be the most informative. Usually the cells with the longest tracks are preferred. There are two options for selecting the best tracks. Firstly, a fixed number ($x$) of cells could be used for further analysis. In that case only the results for the $x$ longest tracks are retained and the rest are discarded. Alternatively, and this is our preferred method, all tracks longer than some minimal tracking time ($t_{tr}$) are analyzed. The extraction of the fluorescence intensity itself is accomplished as described previously (29). At this point of the processing, time reversal is no longer necessary, and the intensity data is recorded starting from the last frame to be tracked, i.e. the first acquired microscopy image. For that purpose only the tracking data for the phase-contrast frames that overlap in time with the fluorescence images is needed. The rest of the data can be discarded, unless required for later inspection. The retained fluorescence data is recorded as a single .txt file, in which each pair of lines represents, respectively, the vector of average fluorescence intensities at the recorded time points and the vector of the precise time points (in minutes). This way the data can easily be analyzed statistically or visualized. The user of the web-based tool SinCePro receives a link for downloading the data as a .png figure for inspection. Additionally, the .txt file containing the raw fluorescence and time vectors for each properly tracked cell is provided for further analysis.
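The selection by minimal tracking time and the layout of the report can be sketched as follows. Only the pairing of lines (intensities first, time points second) is taken from the description above; the function names, the space separator and the number formatting are assumptions, and the actual file written by SinCePro may differ in these details.

```python
def select_tracks(tracks, min_minutes):
    """Keep the tracks that span at least `min_minutes`.
    Each track is a pair (times, intensities) of equally long lists,
    with the times given in minutes."""
    return [(times, intensities) for times, intensities in tracks
            if times and times[-1] - times[0] >= min_minutes]


def format_report(tracks):
    """Two lines per cell: average fluorescence intensities, then times."""
    lines = []
    for times, intensities in tracks:
        lines.append(" ".join(f"{value:.2f}" for value in intensities))
        lines.append(" ".join(f"{value:.1f}" for value in times))
    return "\n".join(lines) + "\n"
```

A typical use would be to filter first and write afterwards, for instance format_report(select_tracks(tracks, 60)) to keep only the cells followed for at least one hour.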

Overall, the processing algorithm we developed, SinCePro, is an easy-to-use web-based tool, which requires no special training of the biologist user, no software installation and no powerful CPUs, but still produces valuable analysis of high-throughput single-cell microscopy data. The described steps of the package workflow are all written as separate independent modules, which allows for the straightforward development and expansion of the proposed package. We envision adding a segmentation module for yeast and some types of mammalian cells, as well as a post-processing module for spot detection and counting. Combined with a microfluidics experimental setup, this tool will be extremely important for the widespread use of single-cell microscopy characterization of engineered biological parts in a number of different organisms. The complete Matlab scripts implementing the SinCePro algorithm are appended at the end of this work (APPENDIX B).

V.4. Conclusion

In this chapter we studied the most important requirements for performing a successful single-cell microscopy experiment. The limitations arising from cellular physiology and the image quality needed for the subsequent image processing were discussed. We demonstrated our experimental setup and microscope programming. We also developed efficient software for the automated processing of microscopy images. The script was developed for use with E. coli. However, it is parametrized almost entirely on the geometrical properties of the cells, which makes it very versatile and easy to configure for other types of organisms. The results from the processing performed with our algorithm have already been used in some peer-reviewed papers, and the algorithm proved to be a valuable method for data analysis and accumulation. Furthermore, the computationally demanding functions of the software were successfully parallelized, which allows for a substantial acceleration of the processing and, with some minor changes, for on-line data analysis. Finally, we uploaded the entire script as a web-based tool, which entirely removes the demand for any specific software or CPU power. The only resource required from the client is an Internet connection. This way, the image processing becomes effectively decoupled from biological research, and biological labs can focus on their specific projects. Consequently, there will be no need to recruit highly paid software specialists or to dedicate large amounts of resources to the development of solutions for side tasks.

V.5. References

Coquel, A.-S., Jacob, J.-P., Primet, M., Demarez, A., Dimiccoli, M., Julou, T., … Berry, H. (2013). Localization of protein aggregation in Escherichia coli is governed by diffusion and nucleoid macromolecular crowding effect. PLoS computational biology, 9(4), e1003038. doi:10.1371/journal.pcbi.1003038

Desmarais, S. M., De Pedro, M. A., Cava, F., & Huang, K. C. (2013). Peptidoglycan at its peaks: how chromatographic analyses can reveal bacterial cell wall structure and assembly. Molecular microbiology, 89(1), 1–13. doi:10.1111/mmi.12266

Fenton, A. K., & Gerdes, K. (2013). Direct interaction of FtsZ and MreB is required for septum synthesis and cell division in Escherichia coli. The EMBO journal, 32(13), 1953–65. doi:10.1038/emboj.2013.129

Fero, M., & Pogliano, K. (2010). Automated quantitative live cell fluorescence microscopy. Cold Spring Harbor perspectives in biology, 2(8), a000455. doi:10.1101/cshperspect.a000455

Ferry, M. S., Razinkov, I. A., & Hasty, J. (2011). Microfluidics for synthetic biology: from design to execution. Methods in enzymology (Vol. 497, pp. 295–372). doi:10.1016/B978-0-12-385075-1.00014-7

Fetita, C., Kirov, B., Jaramillo, A., & Lefevre, C. (2012). An automated approach for single-cell tracking in epifluorescence microscopy applied to E. coli growth analysis on microfluidics biochips. In SPIE Medical Imaging (p. 83170Z–83170Z–11). International Society for Optics and Photonics. doi:10.1117/12.911371

LeRoux, M., De Leon, J. A., Kuwada, N. J., Russell, A. B., Pinto-Santini, D., Hood, R. D., … Mougous, J. D. (2012). Quantitative single-cell characterization of bacterial interactions reveals type VI secretion is a double-edged sword. Proceedings of the National Academy of Sciences of the United States of America, 109(48), 19804–9. doi:10.1073/pnas.1213963109

Nan, B., Bandaria, J. N., Moghtaderi, A., Sun, I.-H., Yildiz, A., & Zusman, D. R. (2013). Flagella stator homologs function as motors for myxobacterial gliding motility by moving in helical trajectories. Proceedings of the National Academy of Sciences of the United States of America, 110(16), E1508–13. doi:10.1073/pnas.1219982110

Van Teeffelen, S., Shaevitz, J. W., & Gitai, Z. (2012). Image analysis in fluorescence microscopy: bacterial dynamics as a case study. BioEssays: news and reviews in molecular, cellular and developmental biology, 34(5), 427–36. doi:10.1002/bies.201100148

Vinuselvi, P., Park, S., Kim, M., Park, J. M., Kim, T., & Lee, S. K. (2011). Microfluidic technologies for synthetic biology. International journal of molecular sciences, 12(6), 3576–93. doi:10.3390/ijms12063576

Volfson, D., Cookson, S., Hasty, J., & Tsimring, L. S. (2008). Biomechanical ordering of dense cell populations. Proceedings of the National Academy of Sciences, 105(40), 15346–15351. doi:10.1073/pnas.0706805105

Wang, Q., Niemi, J., Tan, C.-M., You, L., & West, M. (2010). Image segmentation and dynamic lineage analysis in single-cell fluorescence microscopy. Cytometry. Part A: the journal of the International Society for Analytical Cytology, 77(1), 101–10. doi:10.1002/cyto.a.20812

Wu, Q., Merchant, F., & Castleman, K. (2008). Microscope Image Processing (p. 576). Academic Press. Retrieved from http://www.amazon.com/Microscope-Image-Processing-Qiang-Wu/dp/012372578X.

VI. Characterization of synthetic genetic parts and devices

VI.1. Introduction

Engineering of genetic circuits from simple genetic parts requires precise knowledge of the functioning of the latter (Slusarczyk, Lin, & Weiss, 2012). The way a given part processes information should be known, described and defined; otherwise this part cannot be used for the construction of circuits with predictable behavior. However, we have no direct sensors for the expression function provided by a given promoter, the translation rate defined by a certain RBS, etc. In other words, we cannot directly observe the data processing properties of the studied unit. Yet the observability of all parts of a given system is necessary for the proper modeling of its behavior and consequently for the engineerability of the entire system (Liu, Slotine, & Barabási, 2013). If this condition is not met, some of the state variables of the system under study are unknown and the system is not controllable. Nevertheless, we can infer the behavior of a given circuit part from the behavior of another part if there is a known direct dependence between them. This is exactly the approach used in the engineering of biological parts (Kelly et al., 2009). Usually the reporting of an unknown internal parameter of the system is accomplished by means of a reporter system. From an engineering point of view such systems are transducers: they convert information detected as the presence of one type of information-carrying biomolecule into another type of information-carrying biomolecule. In our case this conversion consists of sensing a TF signal and converting it into a protein-coding sequence by a promoter-controlled RNAP (sensor), and reporting this protein-coding sequence as a fluorescent protein through translation by an RBS-controlled ribosome (actuator). Consequently, if the parameters describing the promoter activity and the RBS function are known, we can infer the dynamics of the TF concentration from the dynamics of the fluorescent protein. This way, by including in the final design of the genetic circuit one or more reporter devices consisting of a promoter controlled by some of the proteins involved in the circuit dynamics, we gain access to some of the system's variables.


Unfortunately, we as human beings have no means of detecting fluorescent proteins directly either; therefore more levels of transduction are needed before the concentration of the reporters can be quantified. Fluorescent proteins have a chemical structure that can be excited by light of specific wavelengths (Shaner, Steinbach, & Tsien, 2005). After a short period of time the excited molecules return to their ground state by releasing photons of a characteristic wavelength. It is exactly those photons that we can detect and quantify instrumentally. Such transduction of fluorescence intensity into digital information stored on electronic devices can be accomplished, for example, by simply imaging a Petri dish under a UV-light source. This approach can be useful for quality control; however, it is difficult to convert into a high-throughput method applicable to the inference of model parameter values. The latter requires a standard experimental setup that allows a large number of samples to be processed quickly and simultaneously, because biological experiments as a whole are characterized by high levels of noise of different origins. Stochasticity in biology is generated by subtle differences in environmental parameters such as temperature, oxygen concentration and evaporation. Additionally, biological objects, especially bacteria, evolve very fast, so the genetic parameters also change. Furthermore, small differences in the intracellular chemical content can also affect the detected behavior. In many of those cases random noise can be canceled by averaging the signal obtained in multiple simultaneous experiments. However, where biochemical kinetics is concerned, averaging can be misleading and can lead to wrong conclusions about the mechanism and the nature of a phenomenon (Wong, Tsai, & Liao, 2007).

If the aim of the research is to draw conclusions about population behavior, such experiments are completely valid and are actually very useful. However, if the purpose of a given characterization is to determine the parameter values that describe the behavior of a particular genetic circuit, another approach is required. The reason is simple. When modeling a genetic device we describe detailed molecular mechanisms that involve the actual binding of certain types of biomolecules to other types of molecules. However, most of the information-carrying biomolecules do not cross the cellular membrane and act only inside a single cell. Therefore, those molecular interactions cannot occur at the population level, and inferring them from population-level data would be an error of scale. The proper way to obtain valid experimental data for that purpose is to detect the photons emitted by individual cells.

VI.2. Population-level characterization

For the population-level characterization we employed a standard fluorometer technique. The cells transformed with the engineered devices were grown to the beginning of exponential phase with the appropriate inducers supplied in the medium and were then transferred to a 96-well plate. The exact protocol was as follows. Overnight cultures of the cells were prepared in LB medium with the required antibiotics. Those cultures were started from single clones picked from Petri dishes. The overnight cultures were subsequently diluted 1000 times, which gave an optical density at 600 nm (OD600) of about 0.0015. The fluorometer measurements were started at the beginning of exponential phase, i.e. at an OD600 of approximately 0.2. Consequently, we had to increase the culture concentration by a factor of 0.2/0.0015 ≈ 133.3. Given that log2(133) ≈ 7 and that the division time of our E. coli culture in M9 was about 50 minutes, we had to grow the cells in M9 for 7 × 50 minutes = 350 minutes ≈ 6 hours. The M9 medium was supplied with the required antibiotics and the inducers; therefore, by the beginning of the fluorometer measurements the expression systems had reached the steady-state level characteristic for the given inducer concentrations. Each cycle of the fluorometer time-lapse program consisted of incubation with shaking for 10 minutes followed by a relaxation time of 2 minutes, to avoid the influence of surface ripples on the optical measurements. The measurements were taken immediately after the relaxation and included optical density measurements and fluorescence excitation followed by emission detection with the filter combination characteristic for the fluorescent protein used. Each cycle lasted 15 minutes overall and measurements were taken for at least 4 hours. The filters for fluorescent reporters with which our TECAN500 fluorometer is equipped are presented in Table 4.
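The growth-time estimate above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not part of the original protocol scripts; the function name `growth_time_minutes` is hypothetical, and the calculation assumes ideal exponential growth with a constant doubling time and no lag phase.

```python
import math


def growth_time_minutes(od_start, od_target, doubling_time_min):
    """Estimate the time needed to grow from od_start to od_target.

    Assumes ideal exponential growth with a constant doubling time
    (no lag phase), which is a simplification.
    """
    fold_increase = od_target / od_start          # required concentration factor
    doublings = math.log2(fold_increase)          # number of divisions needed
    return fold_increase, doublings, doublings * doubling_time_min


# Values quoted in the text: OD600 0.0015 after dilution, target 0.2, 50 min per division.
fold, doublings, minutes = growth_time_minutes(0.0015, 0.2, 50.0)
print(f"fold increase: {fold:.1f}")
print(f"doublings:     {doublings:.2f}")
print(f"growth time:   {minutes:.0f} min ({minutes / 60:.1f} h)")
```

Without rounding the number of doublings to 7, the estimate is slightly above 350 minutes, which is still consistent with the roughly 6 hours used in the protocol.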

Fluorophore (EX peak-EM peak)    EX filter    EM filter
mCherry (587-610)                EX-580/20    EM-612/10
mCitrine (516-529)               EX-510/10    EM-544/25
EGFP (478-510)                   EX-465/35    EM-530/25
ECFP (433-475)                   EX-430/35    EM-485/20

Table 4: Fluorescence filters installed in the TECAN500 fluorometer. All wavelengths are given in nm. We have filters for each of the most commonly used fluorophore colors: red (mCherry), yellow (mCitrine), green (EGFP) and cyan (ECFP).

The raw data obtained in this manner were later analyzed by purpose-written MATLAB software when the aim was the characterization of promoter activity. The analysis algorithm was based on the work of Alper et al. (Alper, Fischer, Nevoigt, & Stephanopoulos, 2005). In summary, the data processing consisted of the following steps. Initially the raw data were imported and converted into vectors containing, for each time point, the measured OD600 and the fluorescence level of each well. The user-defined repeats and blanks were also detected and taken into account in the subsequent analysis. The mathematical analysis began with the automatic detection of the steepest slope of the growth curve obtained by plotting OD600 vs. time. The region exhibiting this slope is the period of fastest growth of the culture and was assumed to correspond to the mid-exponential phase. The slope itself was taken as the value of the specific growth rate. For the same data points the curve of fluorescence vs. OD600 was also plotted. The slope of the latter curve was used as the steady-state fluorescence level. Finally, the promoter activity was calculated with the following equation:

$$P = F \mu \left(1 + \frac{\mu}{M}\right) + F D \left(\frac{2\mu}{M} + \frac{D}{M} + 1\right) \tag{32}$$

where P is the promoter activity, F is the steady-state fluorescence, μ is the specific growth rate, M is the protein maturation rate (for GFP M = 1.54/h) and D is the degradation constant of the protein (for tagged GFP D = 0.24/h). If there is no degradation tag in the fluorescent protein sequence, D = 0 and the equation for P simplifies to:

$$P = F \mu \left(1 + \frac{\mu}{M}\right) \tag{33}$$

The values of the promoter activity and the specific growth rate obtained this way for the repeats (usually 3) of each experiment were then averaged, and a figure was automatically generated for fast assessment of the results. The scripts developed for the described operations are presented in APPENDIX B.
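The analysis steps described above can be illustrated with a short sketch. This is not the MATLAB code of APPENDIX B; it is a minimal Python illustration under stated assumptions. The function names, the fixed window width and the synthetic data are all hypothetical, and equation (32) is used in the form P = Fμ(1 + μ/M) + FD(2μ/M + D/M + 1), which reduces to equation (33) for D = 0. The sketch takes the slope of the supplied OD600 values against time directly, as the text describes; whether the OD values should first be log-transformed is left to the user.

```python
def linear_slope(xs, ys):
    """Least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def steepest_window(times, od, width=4):
    """Return (slope, start_index) of the window with the largest OD-vs-time slope."""
    best_slope, best_start = None, None
    for start in range(len(times) - width + 1):
        slope = linear_slope(times[start:start + width], od[start:start + width])
        if best_slope is None or slope > best_slope:
            best_slope, best_start = slope, start
    return best_slope, best_start


def promoter_activity(F, mu, M=1.54, D=0.0):
    """Equation (32); with D = 0 this is equation (33). Rates are per hour."""
    return F * mu * (1 + mu / M) + F * D * (2 * mu / M + D / M + 1)


def analyze_well(times, od, fluorescence, width=4, M=1.54, D=0.0):
    """Growth rate, steady-state fluorescence and promoter activity for one well."""
    mu, start = steepest_window(times, od, width)
    window = slice(start, start + width)
    # Slope of fluorescence vs. OD over the same data points.
    F = linear_slope(od[window], fluorescence[window])
    return mu, F, promoter_activity(F, mu, M, D)


# Synthetic, illustrative data only (not measurements from this work).
times = [0.0, 0.25, 0.5, 0.75, 1.0, 1.25, 1.5]      # hours
od = [0.20, 0.22, 0.26, 0.33, 0.41, 0.45, 0.47]     # OD600
fluo = [1000.0 * x + 50.0 for x in od]              # RFU, linear in OD by construction

mu, F, P = analyze_well(times, od, fluo)
print(f"growth-curve slope: {mu:.3f}, F: {F:.1f} RFU/OD, P: {P:.1f}")
```

Averaging over repeats then amounts to calling `analyze_well` once per repeat well and taking the mean of the returned values.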

This method for population-level characterization was used to test the XOR gate design (described in the chapter studying the design of genetic parts). For each tested culture, started from a single clone, four different types of experiments were performed. The first type was performed without the IPTG or aTc inducers. In this case the LacI and TetR produced by the Z1 cassette were not induced and the expression of the CI proteins was repressed; hence the input was “00”. There were also experiments with only aTc (at a concentration of 30 ng/ml) or only IPTG (200 µM), corresponding to the “10” and “01” inputs. Finally, medium with both IPTG and aTc was used for the last type of experiment, in which the “11” input was tested. Each experiment was run in triplicate in the same 96-well plate and was repeated at least three times on different days. The results are presented in (Illustration 45). The results obtained for the second version of promoter combinations were quite encouraging; however, the devices showed very low genetic stability.
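The four input combinations and the response expected from an ideal XOR gate can be written down explicitly. The sketch below is only an illustration: the assumption that a high fluorescence level encodes a logical 1, the threshold value and the example RFU numbers are all hypothetical and are not data from this work.

```python
def xor_expected(atc, iptg):
    """Ideal XOR response: output is on when exactly one inducer is present."""
    return atc != iptg


# Input labels as used in the text: first bit aTc, second bit IPTG.
INPUTS = {
    "00": (False, False),
    "10": (True, False),
    "01": (False, True),
    "11": (True, True),
}


def matches_xor(rfu_by_input, threshold):
    """Check whether thresholded fluorescence follows the XOR truth table."""
    return all(
        (rfu_by_input[label] > threshold) == xor_expected(*bits)
        for label, bits in INPUTS.items()
    )


# Hypothetical RFU values for illustration only.
xor_like_clone = {"00": 120.0, "10": 2400.0, "01": 2100.0, "11": 180.0}
deviating_clone = {"00": 120.0, "10": 2400.0, "01": 2100.0, "11": 2500.0}

print(matches_xor(xor_like_clone, threshold=1000.0))
print(matches_xor(deviating_clone, threshold=1000.0))
```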


Illustration 45: XOR characterization results. The fluorescence is customarily given in relative fluorescence units (RFU). As visible from the graph, the behavior of clone1 and clone2 is very close to the expected one, whereas clone3 behaves completely differently. This discrepancy in the results is due to the high genetic instability of this circuit.

VI.3. Single-cell-level characterization

As discussed before, single-cell-level measurements are much more informative when used for the characterization of static genetic devices and are indispensable for the characterization of dynamic genetic circuits. To obtain that level of precision we combined microfluidics bioreactor devices with fluorescence microscopy and subsequent image processing.

VI.3.1. Experimental setup

The microfluidics reactors that we used for bacterial growth during the characterization experiments are discussed in detail in the dedicated chapter. They comprise rectangular growth chambers that limit the stacking of individual cells and force them to grow in only two dimensions. This way, the microscopy image allows segmentation and differentiation of the individual cells and detection of their individual fluorescence levels. The input system of the microfluidics devices was of two types. The first type had only one input port for a single type of medium; experiments with constant parameters of the input medium were performed in these devices. The second type of biochip that we engineered and used had two separate input ports for two types of supply media. This architecture made it possible to switch dynamically between two alternative media, and thus different oscillatory inputs were programmed. The latter allowed the analysis of complex interactions between the genetic circuits and the dynamics of the supplied inducers, and the detection of phenomena such as entrainment and resonance.
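An oscillatory input of the kind described above can be thought of as a square wave that switches between the two supply media. The following is a minimal sketch of such a schedule; it does not reproduce the pump software used in this work, and the function name, the media labels "A" and "B" and the parameter values are hypothetical.

```python
def square_wave_schedule(period_min, duty_cycle, duration_min, step_min):
    """Return a list of (time_in_minutes, medium) pairs for a two-medium square wave.

    Medium "A" is supplied during the first `duty_cycle` fraction of each period,
    medium "B" during the remainder.
    """
    n_steps = int(duration_min / step_min)
    schedule = []
    for k in range(n_steps):
        t = k * step_min
        phase = (t % period_min) / period_min   # position within the current period
        schedule.append((t, "A" if phase < duty_cycle else "B"))
    return schedule


# Example: 30-minute forcing period, equal time in both media, sampled every 5 minutes.
example_schedule = square_wave_schedule(period_min=30.0, duty_cycle=0.5,
                                        duration_min=60.0, step_min=5.0)
for t, medium in example_schedule:
    print(f"{t:5.1f} min -> medium {medium}")
```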

The cultivation of bacterial cells inside the microfluidics chips can be performed in two distinct manners. In the first approach, no mechanical devices are used to force the solutions (media or starting cultures) into the biochip. This is pressure-driven microfluidics, which relies on gravitational force to drive the fluids through the chip. This method has some advantages, such as the lower pressure exerted on the cells and the absence of mechanical waves that could propagate through the liquids and affect the cells. However, pressure-driven microfluidics is also much more laborious, less reliable and less efficient than, for example, the use of syringe pumps. This is because the lower pressure of the medium has a reduced capacity to remove trapped air bubbles from the micro-channels. Therefore, an additional step is required in which the entire channel system of the chip is wetted with a solution containing a high concentration of surfactant (Tween20). Once this is accomplished, the loading of the traps with inoculum can start. The medium and the waste syringes are plugged in at the beginning of the operation. After the medium is reliably passing through the growth chamber region, the supply of concentrated culture is plugged into its dedicated port. The syringe containing the culture is initially placed at such a height that gravity drives the cell flow through the growth chamber region and towards the waste ports, but not against the medium flow. When this state of the flows is obtained, the tube carrying the culture is flicked, which generates a lateral stress and forces the cells into the growth traps. Once the inoculation of the traps is confirmed, the cell syringe is placed at a lower position, thus reversing the flow towards itself so that it becomes another waste. At this point the medium is flowing through the whole chip and is supplying the cells with fresh nutrients. When pressure-driven supply is used for microfluidics chips, the programming of dynamic inputs requires special mechanical setups and is not straightforward to accomplish.

On the other hand, cultivation of bacteria in microfluidics chips is much easier when syringe pumps are used to generate the flow. The pressure generated by the pumps is strong enough that occasional bubbles trapped in the channels of the chip are not a problem. Therefore, the first step in this case is to force the concentrated cell culture directly into the growth area of the chip. The tube is again flicked and some cells are trapped in the shallow growth chambers. Subsequently, the medium supplies are plugged into the ports and the pumps are started. The waste lines are also connected to the chip. The microfluidics chip can now be positioned on the microscope stage and the structures inside can be readily observed. After the air bubbles are pushed out by the flow, the final program of the input dynamics can be set in the pump interface and the experiment can begin.

All the other aspects of the operation of the microfluidics chips and of the microscopy setup were as discussed in the dedicated chapters. The raw microscopy data were processed by our software and the resulting fluorescence-level dynamics were subsequently analyzed by additional software modules, depending on the aim of the experiment.

VI.3.2. Characterization examples

VI.3.2.1. Copy-number load and limiting resource

We utilized single-cell characterization in a research paper (Carrera, Rodrigo, Singh, Kirov, & Jaramillo, 2011) focused on the analysis of limiting resources for the expression of heterologous circuits in a bacterial chassis. For that purpose an expression system containing an eYFP under the control of a constitutive promoter (BioBrick part BBa_J23106) was cloned in a plasmid with tunable copy number (BioBrick plasmid pSB2K3) and transformed into a Z1 strain. The replication system of that plasmid is under the control of LacI, and therefore the plasmid copy number can be controlled through the addition of IPTG to the medium. The experiments were performed in microfluidics devices driven by gravitational force. The engineered bacteria were inoculated into the microfluidics chips and grown in medium with different levels of IPTG. After the growth chambers were filled with cells, a microscopy time-lapse experiment was performed for 2 hours and the steady-state fluorescence levels were compared. For that purpose, a standard image processing algorithm was used to detect the fluorescence of the individual cells, and those results were then averaged. The cellular division time was obtained manually, but no dependence on the induction level was found.

VI.3.2.2. Dynamics of genetic oscillators

The Goodwin-type oscillators that were engineered were also characterized in gravity-driven microfluidics chips. The design of those oscillators was already discussed in detail in the dedicated chapter. The oscillators were transformed into the JS006 E. coli strain, which is LacI⁻ and AraC⁻ and was developed by the group of Jeff Hasty. The culture for inoculation of the microfluidics chip was prepared in the standard manner without inducers. The medium used for growth inside the biochip was LB, also without inducers. The phase-contrast images were obtained every 1 minute and the fluorescence images were taken every 3 minutes. The raw microscopy data were processed and exhibited the noisy behavior typical for this type of oscillator. An additional software module was developed for smoothing the curves with a Savitzky-Golay filter. An exemplary result is presented in (Illustration 46). The oscillators were too noisy, and no further experiments or characterization were attempted with them.
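As an illustration of the smoothing step, the sketch below applies the classic 5-point quadratic Savitzky-Golay filter, whose convolution coefficients are (-3, 12, 17, 12, -3)/35. It is not the module developed for this work: the window length and polynomial order actually used are not stated in the text, and leaving the two edge points at each end unsmoothed is a simplification chosen here. Library implementations such as `scipy.signal.savgol_filter` offer configurable windows and edge handling.

```python
# Convolution coefficients of the 5-point, second-order Savitzky-Golay smoother.
SG_COEFFS = (-3.0, 12.0, 17.0, 12.0, -3.0)
SG_NORM = 35.0


def savgol5(values):
    """Smooth a fluorescence trace with a 5-point quadratic Savitzky-Golay filter.

    The first and last two points are returned unchanged (a simplification).
    """
    smoothed = list(values)
    for i in range(2, len(values) - 2):
        window = values[i - 2:i + 3]
        smoothed[i] = sum(c * v for c, v in zip(SG_COEFFS, window)) / SG_NORM
    return smoothed


# Illustrative, synthetic trace with a single-frame spike (not experimental data).
noisy_trace = [10.0, 11.0, 12.0, 30.0, 14.0, 15.0, 16.0]
print(savgol5(noisy_trace))
```

Because the filter fits a local quadratic, a trace that is already quadratic passes through unchanged, while isolated spikes are attenuated.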

The oscillatory copy number filter is also a Goodwin-type oscillator. This construct utilizes the regulation of the copy number provided by the pSB2K3 device. LacI, the repressor of the copy number, was cloned under the control of a constitutive promoter in the same plasmid and thus provided a negative-feedback loop. This synthetic oscillator was also transformed into the JS006 strain and was characterized by microfluidics and fluorescence microscopy as described above. However, the image processing was performed with the dedicated software of the group of Catalin Fetita (Fetita, Kirov, Jaramillo, & Lefevre, 2012). A typical result was already presented in the chapter discussing genetic oscillators. As above, the behavior was very noisy and no further experiments were performed in that direction.

Illustration 46: Characterization results from a Goodwin-type Ptet/lac-TetR oscillator. The behavior is very noisy, showing a large coefficient of variation even within the same cell. The curves were smoothed with an FIR filter.

For the systems of two oscillators in the same cell we made use of the microfluidics chips with two inputs and programmed dynamics of the supply media. Those systems were also already described in the dedicated chapter. In summary, we co-transformed different combinations of Goodwin-type oscillators into the same cell with the purpose of studying their interactions. We were most interested in two of those combinations. First, we were curious to understand whether the behavior of the completely decoupled oscillators (PLlac-O1-LacI and PLtet-O1-TetR) would be completely independent and what their mutual effect would be, if any. The completely coupled oscillators (Ptet/lac-LacI and Ptet/lac-TetR) were also of interest, since there are many possibilities for the behavior of such a system. For example, they could be synchronized in phase, in antiphase or anywhere in between, or not synchronized at all. Initially we constructed an integrating reporter for the decoupled system. This device consisted of GFP under the control of a Ptet/lac promoter and was thus repressed by either of the two repressors. As a result, many of the differences between the dynamics of the two oscillators in the individual cells remained veiled. However, with the proper technique some of the details could still be recognized. For example, the presence of the inducer of one of the two repressors had a clearly visible effect: it changed the number of cells exhibiting a typical frequency of oscillation (Illustration 47).

Illustration 47: Characterization of the uncoupled system of two oscillators. In the presence of saturating levels of IPTG (left) some of the average periods (approx. 65 minutes) disappear. Conversely, when the aTc levels are high (right), some of the faster periods are reduced.

In order to obtain this kind of data, the cells were cultivated in the microfluidics chip in the standard manner. The two alternative media were LB with aTc (200 ng/ml) and LB with IPTG (20 mM). The experiments were initially performed with one of the media for at least 12 hours; the medium was then switched to the alternative one and a time-lapse of the same length was programmed. In this case the input was not changing dynamically. The dual-input option was used as a way to observe exactly the same traps, with cells from the same culture, under the influence of the two alternative inducers. The images with the microscopy data were processed. However, the simple fluorescence time dynamics was not sufficient for this type of analysis. Therefore we developed an additional processing module similar to the one already used by the group of Jeff Hasty (Mondragón-Palomino, Danino, Selimkhanov, Tsimring, & Hasty, 2011) for the analysis of the entrainment of their positive-negative-feedback oscillator by an external forcing signal. In summary, this approach consists of automatically detecting the oscillatory period exhibited by each individual cell and then presenting the data for all analyzed cells as a histogram. The ordinate of this histogram represents the percentage of cells whose period of oscillation falls within certain margins, and the abscissa groups close periods into appropriate bins. This way, the predominant behavior (if any) can be easily uncovered. Furthermore, resonance between the external forcing frequency and the genetic oscillator is evident from the sudden grouping of a large fraction of the cells around the input period (Illustration 48).
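The period-histogram analysis described above can be sketched as follows. This is a simplified illustration and not the processing module itself: the text does not state how the period of each cell is detected, so the mean peak-to-peak interval used here is an assumption, and the function names, the bin width and the synthetic sinusoidal traces are hypothetical.

```python
import math


def peak_indices(trace):
    """Indices of local maxima (strictly higher than the previous sample)."""
    return [i for i in range(1, len(trace) - 1)
            if trace[i - 1] < trace[i] >= trace[i + 1]]


def mean_period(trace, dt_min):
    """Mean peak-to-peak interval in minutes, or None if fewer than two peaks."""
    peaks = peak_indices(trace)
    if len(peaks) < 2:
        return None
    return (peaks[-1] - peaks[0]) / (len(peaks) - 1) * dt_min


def period_histogram(periods, bin_width_min):
    """Map the lower edge of each period bin to the percentage of cells in it."""
    counts = {}
    for p in periods:
        lower_edge = int(p // bin_width_min) * bin_width_min
        counts[lower_edge] = counts.get(lower_edge, 0) + 1
    return {edge: 100.0 * n / len(periods) for edge, n in sorted(counts.items())}


# Synthetic single-cell traces (illustration only): two cells with a 60-minute
# period and one with a 36-minute period, sampled every 3 minutes for 6 hours.
DT_MIN = 3.0
sample_times = [k * DT_MIN for k in range(121)]
traces = [[math.sin(2 * math.pi * t / period) for t in sample_times]
          for period in (60.0, 60.0, 36.0)]

cell_periods = [mean_period(trace, DT_MIN) for trace in traces]
histogram = period_histogram([p for p in cell_periods if p is not None], 10)
print(cell_periods)
print(histogram)
```

With real, noisy traces the peak detection would normally be applied after smoothing; otherwise spurious local maxima shorten the estimated periods.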

We further utilized this type of characterization, based on forcing with an external frequency, to characterize the dynamics of a type of minimal oscillator with respect to theoretical predictions and to prove that synthetic genetic circuits can be used for theory validation (Rodrigo, Kirov, Shen, & Jaramillo, 2013). The studied oscillator was the positive-negative-feedback device developed by Stricker et al., which was constantly supplied with arabinose. Small pulses of IPTG were employed to disturb the dynamics of the system and to search for resonance and other types of behavior.

Finally, we needed a special characterization technique for fast estimation of the possible modes of synchronization between the coupled oscillators in the same cell. The approaches developed up to this point did not allow this type of analysis, because they provided no evidence of the relative timing of the two individual devices.


Illustration 48: Characterization of oscillators by external forcing and periodograms. The uncoupled oscillators (A) are not entrained by an external forcing period of 30 min (below) and exhibit even larger variation than in the unforced case (above). When the coupled double oscillators (B) are forced with the same period (below), two clear periods become predominant: one close to a harmonic of the forcing period and one away from it. We used the Plac/ara positive-negative-feedback oscillator (C) as a positive control. When not forced by the input signal (above), this oscillator exhibits a predominant period of about 70 minutes. When forced with a 30 min period, the cells exhibit a clear resonance behavior close to a harmonic of the forcing, i.e. 60 min.

Therefore we first engineered another type of reporter, which produced a different fluorescence signal for each of the two TFs. This device consisted of GFP under the control of PLtet-O1 and mCherry under the control of PLlac-O1. This way we could infer the dynamics of the two circuits separately. However, concluding whether the two oscillators are in phase, out of phase or show any type of harmonic behavior required additional mathematical analysis and was not straightforward. Hence, we decided to borrow a technique from electronics that is employed precisely for the visualization of phase and amplitude differences between coupled oscillators, namely the Lissajous curves (Greenslade, 1993). Those curves are plots of the combined behavior of two oscillators, where each point of the curve represents the expression levels of both oscillators at the same moment.

This is accomplished by plotting the fluorescence level of one of the reporters on the ordinate and the fluorescence of the other reporter on the abscissa. In that case, if the two oscillators are in phase, the figure should be similar to a diagonal line inclined at 45°. Conversely, if the two oscillators have different periods, the faster oscillator should produce a sinusoid along its own axis, since it completes several oscillations while the other completes only one. Furthermore, the ratio between the periods of the two oscillators should be evident from the number of those oscillations. Some results are presented in (Illustration 49). With this technique the synchrony of the two oscillators and the differences in their phase and period become directly visible.
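The construction of a Lissajous figure can be illustrated with idealized signals. The sketch below is not the analysis code used in this work; it uses pure sinusoids instead of fluorescence traces, the function names are hypothetical, and the Pearson correlation of the point cloud is added here only as a convenient numerical summary of how close the figure lies to a diagonal line.

```python
import math


def lissajous_points(period_x, period_y, phase_y=0.0, dt=1.0, duration=360.0):
    """Return (x, y) pairs of two idealized oscillators sampled at the same times."""
    points = []
    for k in range(int(duration / dt) + 1):
        t = k * dt
        x = math.sin(2 * math.pi * t / period_x)
        y = math.sin(2 * math.pi * t / period_y + phase_y)
        points.append((x, y))
    return points


def pearson(points):
    """Pearson correlation of the x and y coordinates of the figure."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    return sxy / math.sqrt(sxx * syy)


in_phase = lissajous_points(60.0, 60.0)                      # diagonal line
antiphase = lissajous_points(60.0, 60.0, phase_y=math.pi)    # anti-diagonal line
quadrature = lissajous_points(60.0, 60.0, phase_y=math.pi / 2)  # circle
three_to_one = lissajous_points(60.0, 20.0)                  # y is three times faster

for name, pts in [("in phase", in_phase), ("antiphase", antiphase),
                  ("quadrature", quadrature), ("3:1 periods", three_to_one)]:
    print(f"{name:12s} correlation = {pearson(pts):+.3f}")
```

Plotting the points of each list, for example as a scatter or line plot, reproduces the familiar shapes: a diagonal for in-phase oscillators, a circle for a quarter-period shift, and a multi-lobed curve when the periods differ.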

Illustration 49: Lissajous figures for the two types of double oscillators - tightly coupled (right) and uncoupled (left). The coupled oscillators are very close in phase and produce a typical curve positioned near the diagonal. The uncoupled oscillators in this case show that the oscillator reported by mCherry is almost three times faster than the other one.


VI.3.2.3. Plasmid visualization

Another type of characterization that we developed is a specific method for counting the number of plasmids in a living bacterial cell. We employed a genetic system developed by Catherine Guynet et al. (Guynet, Cuevas, Moncalián, & de la Cruz, 2011) to study the distribution of plasmids between the daughter cells after division. The method consists of constitutive expression, in the receiver cells, of a chimeric GFP protein capable of binding a certain DNA operator. The conjugative plasmid carries a site consisting of multiple repeats of the same operator. Consequently, when conjugation takes place and the sender cells transfer the plasmid to the receiver, the chimeric GFP binds to the new plasmid and localizes to a single spot. This spot is readily observable in microfluidics devices and even allows the plasmid copy number in a single cell to be counted. We conducted two types of experiments with this system. In the first type the sender and receiver cells were grown separately in liquid cultures. Once the two cultures reached exponential phase, they were concentrated by centrifugation, washed with the original media containing antibiotics and placed on filter paper over a Petri dish with LB agar at 37 °C for 1 hour. Under these conditions conjugation took place, and the resulting culture was recovered from the filter paper and grown in antibiotics selective for the conjugated cells. The culture grown in this way was further cultivated in a microfluidics device and characterized in the standard manner. This allowed direct plasmid visualization and counting of the copy number. The latter could be used to remove the noise generated by drift of the plasmid copy number during the characterization of genetic circuits.

The second type of experiment involved inoculation of the microfluidics device with a mixture of concentrated receiver and sender cells. The cells were then cultivated under standard conditions and observed under the microscope. Bright GFP spots appeared during the cultivation, which showed that conjugation had taken place inside the microfluidics chip. To our knowledge, this is the first such experimental observation (Illustration 50).


Illustration 50: Conjugation event in a microfluidics device. The brighter cells constantly produce GFP, which diffuses freely in the cytosol. When a conjugative plasmid is present in the bacterium, the chimeric GFP binds to its cognate operators and forms a bright fluorescent spot.

VI.4. Conclusion

Overall, we developed a variety of characterization techniques for the study of genetic parts based on standard and microfluidics experimental setups. Those techniques were shown to be useful and were used in peer-reviewed research papers. The characterization approaches we established include promoter characterization by fluorescence measurements in a fluorometer, single-cell fluorescence time-lapse microscopy with subsequent automated computational analysis, and specific experimental approaches for plasmid visualization in microfluidics. We successfully transferred the existing characterization methods and developed new approaches for the analysis of the double devices. The microfluidics devices we developed, combined with the automated software we wrote, proved to be efficient and reliable. The analysis approaches also included some novel techniques borrowed from electronics, which allowed different aspects of the genetic circuits to be revealed in a straightforward manner.

VI.5. References

Alper, H., Fischer, C., Nevoigt, E., & Stephanopoulos, G. (2005). Tuning genetic control through promoter engineering. Proceedings of the National Academy of Sciences of the United States of America, 102(36), 12678–83. doi:10.1073/pnas.0504604102

Carrera, J., Rodrigo, G., Singh, V., Kirov, B., & Jaramillo, A. (2011). Empirical model and in vivo characterization of the bacterial response to synthetic gene expression show that ribosome allocation limits growth rate. Biotechnology Journal, 6(7), 773–83. doi:10.1002/biot.201100084

Fetita, C., Kirov, B., Jaramillo, A., & Lefevre, C. (2012). An automated approach for single-cell tracking in epifluorescence microscopy applied to E. coli growth analysis on microfluidics biochips. In SPIE Medical Imaging (p. 83170Z–83170Z–11). International Society for Optics and Photonics. doi:10.1117/12.911371

Greenslade, T. B. (1993). All about Lissajous figures. The Physics Teacher, 31(6), 364. doi:10.1119/1.2343802

Guynet, C., Cuevas, A., Moncalián, G., & de la Cruz, F. (2011). The stb operon balances the requirements for vegetative stability and conjugative transfer of plasmid R388. PLoS Genetics, 7(5), e1002073. doi:10.1371/journal.pgen.1002073

Kelly, J. R., Rubin, A. J., Davis, J. H., Ajo-Franklin, C. M., Cumbers, J., Czar, M. J., … Endy, D. (2009). Measuring the activity of BioBrick promoters using an in vivo reference standard. Journal of Biological Engineering, 3(1), 4. doi:10.1186/1754-1611-3-4

Liu, Y.-Y., Slotine, J.-J., & Barabási, A.-L. (2013). Observability of complex systems. Proceedings of the National Academy of Sciences of the United States of America, 110(7), 2460–5. doi:10.1073/pnas.1215508110

Mondragón-Palomino, O., Danino, T., Selimkhanov, J., Tsimring, L., & Hasty, J. (2011). Entrainment of a population of synthetic genetic oscillators. Science (New York, N.Y.), 333(6047), 1315–9. doi:10.1126/science.1205369

Rodrigo, G., Kirov, B., Shen, S., & Jaramillo, A. (2013). Theoretical and experimental analysis of the forced LacI-AraC oscillator with a minimal gene regulatory model. Chaos (Woodbury, N.Y.), 23(2), 025109. doi:10.1063/1.4809786

Shaner, N. C., Steinbach, P. A., & Tsien, R. Y. (2005). A guide to choosing fluorescent proteins. Nature Methods, 2(12), 905–9. doi:10.1038/nmeth819

Slusarczyk, A. L., Lin, A., & Weiss, R. (2012). Foundations for the design and implementation of synthetic genetic circuits. Nature Reviews Genetics, 13(6), 406–20. doi:10.1038/nrg3227

Wong, W. W., Tsai, T. Y., & Liao, J. C. (2007). Single-cell zeroth-order protein degradation enhances the robustness of synthetic oscillator. Molecular Systems Biology, 3. doi:10.1038/msb4100172


VII. Conclusion

We demonstrated clearly which steps are necessary to establish a complete experimental setup for a lab aiming at research in the field of synthetic genetic circuits. We studied theoretically some of the main genetic parts involved in the engineering of such circuits. Furthermore, we proposed particular genetic implementations and proved experimentally that those devices exhibit the expected behavior. We not only rebuilt existing devices as a proof of concept, but also developed novel types of genetic circuits. Oscillation of the plasmid copy number based on self-repression is indeed a very natural mechanism, but it has not yet been employed in synthetic biology. The different combinations of two plasmids also provide a vast field for research whose surface has barely been scratched.

Furthermore, we established a fabrication platform for the production of microfluidics devices of our own design and for our specific needs. The experimental results showed that the devices produced by this fabrication process are efficient and reliable. The platform has minimal requirements, both in space and in resource allocation. Virtually any lab could afford this type of investment.

The engineering loop (Illustration 1) was closed by the development of image processing software. This software automated the data processing and opened the door to high-throughput use of the microfluidics-and-microscopy experimental setup in our lab. Some results obtained with it have already been published in peer-reviewed journals. The software not only automates but also parallelizes the image processing computations, making it fast enough to be considered for on-line use. We also made the image processing software available to the community as a web-based tool, so that labs without access to powerful CPUs or expensive software packages can reach the same level of image processing as we have.

We envision this as the future of data processing in biology as a whole. The only requirement for this to become a reality would be the standardization of data formats, which in turn implies the standardization of experimental procedures and/or equipment. Indeed, the development of standard microfluidics devices for different types of experiments and their production at large scale would drastically reduce the overall cost of performing single-cell experiments. This task is no less important than the standardization of biological parts, because it is precisely the demand for equipment and specialists that limits the wide adoption of single-cell experiments in the community.

An unexpected outcome of covering and thoroughly examining all those different fields was the discovery of the mutual interdependence between genetic circuit design, microscopy setup, microfluidics device design and image processing. Every parameter choice in those aspects of the experiment is a compromise: between high production of fluorescent protein and genetic stability, between long UV-light exposure and cell survival, between a reduced number of images per minute and the efficiency of the tracking algorithm, and between the geometry of the growth chambers and the efficiency of the tracking algorithm. Overall, this kind of interdisciplinary work is required to find parameters that satisfy all those requirements and to propose optimal solutions that allow the fast development of well-characterized genetic circuits.


APPENDIX A

Promoter sequences:

GAATTCGCGGCCGCTTCTAGAG ...Promoter... TACTAGTAGCGGCCGCTGCAG

GAATTCGCGGCCGCTTCTAGAG - BioBrick prefix

...Promoter... - Promoter sequence

TACTAGTAGCGGCCGCTGCAG - BioBrick suffix
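As a minimal sketch of how the entries in this appendix can be read programmatically, the MATLAB snippet below strips the BioBrick prefix and suffix from a full construct and returns the promoter core. It is not part of the thesis scripts: the variable names are illustrative assumptions, and whitespace inside the listed sequences is assumed to be typesetting residue.

```matlab
% Illustrative sketch (not from the thesis scripts): recover the promoter
% core from a BioBrick-flanked sequence as listed in this appendix.
prefix = 'GAATTCGCGGCCGCTTCTAGAG';   % BioBrick prefix
suffix = 'TACTAGTAGCGGCCGCTGCAG';    % BioBrick suffix

% Ptrc repressed by 434-CI, copied from the list; spaces are assumed to be
% typesetting residue and are removed below.
construct = ['GAATTCGCGGCCGCTTCTAGAG GAGCTGTTGACAATTAATCATCCGGCTCGTATAATGTGTGGAA ' ...
             'AAACAAGAAAGTTTGTATTTCACACATACTAGTAGCGGCCGCTGCAG'];

seq = upper(construct(~isspace(construct)));   % drop whitespace, normalize case

% Check both flanks before slicing them off.
has_prefix = strncmp(seq, prefix, length(prefix));
has_suffix = length(seq) >= length(suffix) && ...
             strcmp(seq(end-length(suffix)+1:end), suffix);

if has_prefix && has_suffix
    promoter = seq(length(prefix)+1 : end-length(suffix));
else
    error('Sequence is not flanked by the BioBrick prefix and suffix.');
end

fprintf('Promoter core (%d bp): %s\n', length(promoter), promoter);
```

The same check also flags entries in which the prefix or suffix was altered during cloning or transcription of the sequence.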

Ptrc repressed by λ-CI

GAATTCGCGGCCGCTTCTAGAG GAGCTGTTGACAATTAATCATCCGGCTCGTATAATGTGTGGAA TACATCTGGCGGTGATATTTCACACATACTAGTAGCGGCCGCTGCAG

Ptrc repressed by mutated λ-CI (sequence and position of the operator were based on (Cox, Surette, & Elowitz, 2007)).

GAATTCGCGGCCGCTTCTAGAG GAGCTGTTGACAATTAATCATCCGGCTCGTATAATGTGTGGAATACA TCTGGCGGTGATATTTCACACATACTAGTAGCGGCCGCTGCAG

Ptrc repressed by 434-CI

GAATTCGCGGCCGCTTCTAGAG GAGCTGTTGACAATTAATCATCCGGCTCGTATAATGTGTGGAA AAACAAGAAAGTTTGTATTTCACACATACTAGTAGCGGCCGCTGCAG

Ptrc repressed by a mixed 434/P22-CI and P22-CI

GAATTCGCGGCCGCTTCTAGAG GAGCTGTTGACAATTAATCATCCGGCTCGTATAATGTGTGGAA CAACATATCTTAAATTTTCACACATACTAGTAGCGGCCGCTGCAG

Ptrc repressed by 434-CI and 434/P22-CI

GAATTCGCGGCCGCTTCTAGAG GAGCTGTTGACAATTAATCATCCGGCTCGTATAAT GTGTGGAAACAACATATCTTAAATATTTCACACATACTAGTAGCGGCCGCTGCAG

Ptrc repressed by 434-CI using O3 (weak operator)

GAATTCGCGGCCGCTTCTAGAG GAGCTGTTGACAATTAATCATCCGGCTCGTATAAT GTGTGGAAAAACAGTTTTTCTTGTATTTCACACATACTAGTAGCGGCCGCTGCAG

Ptrc repressed by P22-CI using O3 (weak operator)

GAATTCGCGGCCGCTTCTAGAG GAGCTGTTGACAATTAATCATCCGGCTCGTATAAT GTGTGGAACTTAAGTTTTTGTTTGATTTCACACATACTAGTAGCGGCCGCTGCAG

PL from lambda phage repressed by CI hybrid operator from phages 434/P22 (weak)

GAATTCGCGGCCGCTTCTAGAG AAAATTTATCAAAAAGAGTGTTGACATACAACATA TCTTAAATGATACTGAGCACATCAGCAGGACGCACTGACCTACTAGTAGCGGCCGCT GCAG

Plac8A promoter activated by AraC

GAATTCGCGGCCGCTTCTAGAG CATAGCATTTTTATCCATAAGATTAGCGGATCCTA AGCTTTACAATGAACTGTCCACTCCAGTATGATAGATTCGTCCATATTGCATCAGAC ATTGTACTAGTAGCGGCCGCTGCAG

Plac8A promoter activated by AraC another version

GAATTCGCGGCCGCTTCTAGAG CATAGCATTTTTATCCATAAGATTAGCGGATCCTA AGCTTTACAATGAACTGTCCACTCCAGTATGATAGATTCTAGCATTTTTATCCATATA CTAGTAGCGGCCGCTGCAG

PL repressed by TetR’ or TetR

GAATTCGCGGCCGCTTCTAGAG TCCCTATCAGTGATAGAGATTGACATCCCCGTCAG TGACGGAGATACTTGTGGAATACTAGTAGCGGCCGCTGCAG

Plac-UV5 repressed by TetR


GAATTCGCGGCCGCTTCTAGAG CCAGGCTTTACACTTTATGCTTCCGGCTCGTATAAT GTGTGGATCCCTATCAGTGATAGAGATTCACACATACTAGTAGCGGCCGCTGCAG

Ptac promoter repressed by TetR

GAATTCGCGGCCGCTTCTAGAG GAGCTGTTGACAATTAATCATCGGCTCGTATAATG TGTGGATCCCTATCAGTGATAGAGATTCACACTACTAGTAGCGGCCGCTGCAG


XOR devices

Version1 atacgggagggcttaccatctggccccagtgctgcaatgataccgcgagacccacgctcaccggctccagatttatcagcaataaaccagc cagccggaagggccgagcgcagaagtggtcctgcaactttatccgcctccatccagtctattccatggtgccacctgacgtctaagaaacca ttattatcatgacattaacctataaaaataggcgtatcacgaggcagaatttcagataaaaaaaatccttagctttcgctaaggatgatttctggaa ttcgcggccgcttctagagatACAATGTATCTTGTTTGTCACAACACGCACGGTGTTAGAAGATTG GGGGTAAatactAGAGCAACACGCACGGTGTTAGATAACAATGTATCTTGTtgaTAGATT TAACGTAtgTACTAGAGaaagaggagaaatactagatggtgagcaagggcgaggaggataacatggccatcatcaaggag ttcatgcgcttcaaggtgcacatggagggctccgtgaacggccacgagttcgagatcgagggcgagggcgagggccgcccctacgagg gcacccagaccgccaagctgaaggtgaccaagggtggccccctgcccttcgcctgggacatcctgtcccctcagttcatgtacggctccaa ggcctacgtgaagcaccccgccgacatccccgactacttgaagctgtccttccccgagggcttcaagtgggagcgcgtgatgaacttcgag gacggcggcgtggtgaccgtgacccaggactcctccttgcaggacggcgagttcatctacaaggtgaagctgcgcggcaccaacttcccc tccgacggccccgtaatgcagaagaagaccatgggctgggaggcctcctccgagcggatgtaccccgaggacggcgccctgaagggc gagatcaagcagaggctgaagctgaaggacggcggccactacgacgctgaggtcaagaccacctacaaggccaagaagcccgtgcag ctgcccggcgcctacaacgtcaacatcaagttggacatcacctcccacaacgaggactacaccatcgtggaacagtacgaacgcgccgag ggccgccactccaccggcggcatggacgagctgtacaagtaataatactagagccaggcatcaaataaaacgaaaggctcagtcgaaag actgggcctttcgttttatctgttgtttgtcggtgaacgctctctactagagtcacactggctcaccttcgggtgggcctttctgcgtttataTAC TAGTagcggccgctgcagtccggcaaaaaaacgggcaaggtgtcaccaccctgccctttttctttaaaaccgaaaagattacttcgcgtt atgcaggcttcctcgctcactgactcgctgcgctcggtcgttcggctgcggcgagcggtatcagctcactcaaaggcggtaatctcgagtcc cgtcaagtcagcgtaatgctctgccagtgttacaaccaattaaccaattctgattagaaaaactcatcgagcatcaaatgaaactgcaatttattc atatcaggattatcaataccatatttttgaaaaagccgtttctgtaatgaaggagaaaactcaccgaggcagttccataggatggcaagatcctg gtatcggtctgcgattccgactcgtccaacatcaatacaacctattaatttcccctcgtcaaaaataaggttatcaagtgagaaatcaccatgagt gacgactgaatccggtgagaatggcaaaagcttatgcatttctttccagacttgttcaacaggccagccattacgctcgtcatcaaaatcactc gcatcaaccaaaccgttattcattcgtgattgcgcctgagcgagacgaaatacgcgatcgctgttaaaaggacaattacaaacaggaatcga 
atgcaaccggcgcaggaacactgccagcgcatcaacaatattttcacctgaatcaggatattcttctaatacctggaatgctgttttcccgggg atcgcagtggtgagtaaccatgcatcatcaggagtacggataaaatgcttgatggtcggaagaggcataaattccgtcagccagtttagtctg accatctcatctgtaacatcattggcaacgctacctttgccatgtttcagaaacaactctggcgcatcgggcttcccatacaatcgatagattgtc gcacctgattgcccgacattatcgcgagcccatttatacccatataaatcagcatccatgttggaatttaatcgcggcctcgagcaagacgtttc ccgttgaatatggctcataacaccccttgtattactgtttatgtaagcagacagttttattgttcatgatgatatatttttatcttgtgcaatgtaacatc agagattttgagacacaacgtggctttgttgaataaatcgaacttttgctgagttgaaggatcagatcacgcatcttcccgacaacgcagaccg ttccgtggcaaagcaaaagttcaaaatcaccaactggtccacctacaacaaagctctcatcaaccgtggctccctcactttctggctggatgat ggggcgattcaggcctggtatgagtcagcaacaccttcttcacgaggcagacctcagcgctagcggagtgtatactggcttactatgttggc actgatgagggtgtcagtgaagtgcttcatgtggcaggagaaaaaaggctgcaccggtgcgtcagcagaatatgtgatacaggatatattcc gcttcctcgctcactgactcgctacgctcggtcgttcgactgcggcgagcggaaatggcttacgaacggggcggagatttcctggaagatg ccaggaagatacttaacagggaagtgagagggccgcggcaaagccgtttttccataggctccgcccccctgacaagcatcacgaaatctg acgctcaaatcagtggtggcgaaacccgacaggactataaagataccaggcgtttcccctggcggctccctcgtgcgctctcctgttcctgc ctttcggtttaccggtgtcattccgctgttatggccgcgtttgtctcattccacgcctgacactcagttccgggtaggcagttcgctccaagctgg actgtatgcacgaaccccccgttcagtccgaccgctgcgccttatccggtaactatcgtcttgagtccaacccggaaagacatgcaaaagca ccactggcagcagccactggtaattgatttagaggagttagtcttgaagtcatgcgccggttaaggctaaactgaaaggacaagttttggtga ctgcgctcctccaagccagttacctcggttcaaagagttggtagctcagagaaccttcgaaaaaccgccctgcaaggcggttttttcgttttca gagcaagagattacgcgcagaccaaaacgatctcaagaagatcatcttattaaggggtctgacgctcagtggaacgaaaactcacgttaag


ggattttggtcatgagattatcaaaaaggatcttcacctagatccttttaaattaaaaatgaagttttaaatcaatctaaagtatatatgagtaaactt ggtctgacagttaccaatgcttaatcagtgaggcacctatctcagcgatctgtctatttcgttcatccatagttgcctgactccccgtcgtgtagat aactacg


Version2 atacgggagggcttaccatctggccccagtgctgcaatgataccgcgagacccacgctcaccggctccagatttatcagcaataaaccagc cagccggaagggccgagcgcagaagtggtcctgcaactttatccgcctccatccagtctattccatggtgccacctgacgtctaagaaacca ttattatcatgacattaacctataaaaataggcgtatcacgaggcagaatttcagataaaaaaaatccttagctttcgctaaggatgatttctggaa ttcgcggccgcttctagagATACAATGTATCTTGTTTGTCACAACACGCACGGTGTTAGAAGATT GGGGGTAAAtactAGAGTGTTGTGTGGAATTacaagaaagtttgtATTCACACACCCGGCCGGG GCAACCATTATCACCGCCAGAGGTAAAATAGTCAACACGCACGGTGTTAGATATTTA TCCCTTGTGGTGATAGATTTAACGTATGaaagaggagaaatactagatggtgagcaagggcgaggaggataac atggccatcatcaaggagttcatgcgcttcaaggtgcacatggagggctccgtgaacggccacgagttcgagatcgagggcgagggcga gggccgcccctacgagggcacccagaccgccaagctgaaggtgaccaagggtggccccctgcccttcgcctgggacatcctgtcccctc agttcatgtacggctccaaggcctacgtgaagcaccccgccgacatccccgactacttgaagctgtccttccccgagggcttcaagtggga gcgcgtgatgaacttcgaggacggcggcgtggtgaccgtgacccaggactcctccttgcaggacggcgagttcatctacaaggtgaagct gcgcggcaccaacttcccctccgacggccccgtaatgcagaagaagaccatgggctgggaggcctcctccgagcggatgtaccccgag gacggcgccctgaagggcgagatcaagcagaggctgaagctgaaggacggcggccactacgacgctgaggtcaagaccacctacaag gccaagaagcccgtgcagctgcccggcgcctacaacgtcaacatcaagttggacatcacctcccacaacgaggactacaccatcgtggaa cagtacgaacgcgccgagggccgccactccaccggcggcatggacgagctgtacaagtaataatactagagccaggcatcaaataaaac gaaaggctcagtcgaaagactgggcctttcgttttatctgttgtttgtcggtgaacgctctctactagagtcacactggctcaccttcgggtggg cctttctgcgtttataTACTAGTagcggccgctgcagtccggcaaaaaaacgggcaaggtgtcaccaccctgccctttttctttaaaacc gaaaagattacttcgcgttatgcaggcttcctcgctcactgactcgctgcgctcggtcgttcggctgcggcgagcggtatcagctcactcaaa ggcggtaatctcgagtcccgtcaagtcagcgtaatgctctgccagtgttacaaccaattaaccaattctgattagaaaaactcatcgagcatca aatgaaactgcaatttattcatatcaggattatcaataccatatttttgaaaaagccgtttctgtaatgaaggagaaaactcaccgaggcagttcc ataggatggcaagatcctggtatcggtctgcgattccgactcgtccaacatcaatacaacctattaatttcccctcgtcaaaaataaggttatca agtgagaaatcaccatgagtgacgactgaatccggtgagaatggcaaaagcttatgcatttctttccagacttgttcaacaggccagccattac 
gctcgtcatcaaaatcactcgcatcaaccaaaccgttattcattcgtgattgcgcctgagcgagacgaaatacgcgatcgctgttaaaaggac aattacaaacaggaatcgaatgcaaccggcgcaggaacactgccagcgcatcaacaatattttcacctgaatcaggatattcttctaatacctg gaatgctgttttcccggggatcgcagtggtgagtaaccatgcatcatcaggagtacggataaaatgcttgatggtcggaagaggcataaattc cgtcagccagtttagtctgaccatctcatctgtaacatcattggcaacgctacctttgccatgtttcagaaacaactctggcgcatcgggcttccc atacaatcgatagattgtcgcacctgattgcccgacattatcgcgagcccatttatacccatataaatcagcatccatgttggaatttaatcgcgg cctcgagcaagacgtttcccgttgaatatggctcataacaccccttgtattactgtttatgtaagcagacagttttattgttcatgatgatatattttta tcttgtgcaatgtaacatcagagattttgagacacaacgtggctttgttgaataaatcgaacttttgctgagttgaaggatcagatcacgcatcttc ccgacaacgcagaccgttccgtggcaaagcaaaagttcaaaatcaccaactggtccacctacaacaaagctctcatcaaccgtggctccct cactttctggctggatgatggggcgattcaggcctggtatgagtcagcaacaccttcttcacgaggcagacctcagcgctagcggagtgtat actggcttactatgttggcactgatgagggtgtcagtgaagtgcttcatgtggcaggagaaaaaaggctgcaccggtgcgtcagcagaatat gtgatacaggatatattccgcttcctcgctcactgactcgctacgctcggtcgttcgactgcggcgagcggaaatggcttacgaacggggcg gagatttcctggaagatgccaggaagatacttaacagggaagtgagagggccgcggcaaagccgtttttccataggctccgcccccctgac aagcatcacgaaatctgacgctcaaatcagtggtggcgaaacccgacaggactataaagataccaggcgtttcccctggcggctccctcgt gcgctctcctgttcctgcctttcggtttaccggtgtcattccgctgttatggccgcgtttgtctcattccacgcctgacactcagttccgggtaggc agttcgctccaagctggactgtatgcacgaaccccccgttcagtccgaccgctgcgccttatccggtaactatcgtcttgagtccaacccgga aagacatgcaaaagcaccactggcagcagccactggtaattgatttagaggagttagtcttgaagtcatgcgccggttaaggctaaactgaa aggacaagttttggtgactgcgctcctccaagccagttacctcggttcaaagagttggtagctcagagaaccttcgaaaaaccgccctgcaa ggcggttttttcgttttcagagcaagagattacgcgcagaccaaaacgatctcaagaagatcatcttattaaggggtctgacgctcagtggaac gaaaactcacgttaagggattttggtcatgagattatcaaaaaggatcttcacctagatccttttaaattaaaaatgaagttttaaatcaatctaaa


gtatatatgagtaaacttggtctgacagttaccaatgcttaatcagtgaggcacctatctcagcgatctgtctatttcgttcatccatagttgcctga ctccccgtcgtgtagataactacg

APPENDIX B

Image processing scripts

function process5()

if matlabpool('size') > 0
    matlabpool close
end

matlabpool

%General settings
fluo_period=7; % the period between fluorescent images (in minutes)
phase_period=0.5; % the period between phase-contrast images (in minutes)
min_track_time=20; % the minimal period (in minutes) for tracking in order for a particle to be retained in the final results
save temp_settings fluo_period phase_period min_track_time

make_movies;
clear all; clc;

threshold('inversed.tif');
clear all; clc;

drift_compensation;
clear all; clc;

parsegment;
clear all; clc;

clc; close all; clear all;
partrack5;

clc; close all; clear all;
if exist('gfp_compensated.tif','file')
    fluodata_M9_4('gfp_compensated');
end

clc; close all; clear all;
if exist('rfp_compensated.tif','file')
    fluodata_M9_4('rfp_compensated');
end

clc; close all; clear all;
if exist('yfp_compensated.tif','file')
    fluodata_M9_4('yfp_compensated');
end

wd=cd;
cd('frames_data')
delete *.*
cd(wd)
zip('tracks.zip',{'tracked_frame_*'});
delete tracked_frame_*
if ~isempty(dir('cell_*'))
    zip('cells.zip',{'cell_*'});
    delete cell_*
end
delete inversed.tif
delete thresheld.tif
delete temp_settings.mat
results2;

close all; clear all; clc;

matlabpool close
end
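The timing settings saved by process5 determine how fluorescence frames are indexed against the phase-contrast stack in the later functions. As a worked example using the values above, the following sketch computes the derived quantities; the stack length of 100 frames is a hypothetical value chosen only for illustration.

```matlab
% Worked example (illustrative): how the timing settings map to frame indices.
fluo_period    = 7;     % minutes between fluorescent images (from process5)
phase_period   = 0.5;   % minutes between phase-contrast images (from process5)
min_track_time = 20;    % minimal tracking time in minutes (from process5)

n_phase_frames = 100;   % hypothetical stack length, for illustration only

% Number of phase-contrast frames per fluorescent frame.
fluo_delay = fluo_period / phase_period;

% Phase-contrast frame indices that have a matching fluorescent image.
fluo_frames = 1:fluo_delay:n_phase_frames;

% Minimal number of phase-contrast frames a cell must be tracked for.
min_track_length = ceil(min_track_time / phase_period);

fprintf('fluo_delay = %d, %d fluorescent frames, min_track_length = %d\n', ...
        fluo_delay, numel(fluo_frames), min_track_length);
```

With these settings every 14th phase-contrast frame has a fluorescent counterpart, and a cell must be followed for at least 40 phase-contrast frames to be retained.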


function make_movies()

load temp_settings

% fluo_period=7; % the period between fluorescent images (in minutes)
% phase_period=0.5; % the period between phase-contrast images (in minutes)
fluo_delay=fluo_period/phase_period;
wd=cd;
cd tifs;
% unzip tifs.zip
% delete tifs.zip
files = dir('*.tif');
ref_time=datevec(files(1).date);
time_sorter=zeros(length(files),2);
for id = 1:length(files)
    time_sorter(id,1)=id;
    time_sorter(id,2)=etime(datevec(files(id).date),ref_time);
end
time_sorter=sortrows(time_sorter,-2);

%Create the exact time vector corresponding to the images
time_vec=[];
for id = 1:length(files)
    movie_type = files(id).name(end-4);
    if movie_type=='6'
        temp_date=files(id).date;
        temp_time=datevec(temp_date);
        time_vec=[time_vec; temp_time];
    end
end
for i=1:size(time_vec,1)
    time(i)=etime(time_vec(i,:),time_vec(1,:))/60;
end
time=sort(time)';
time_final=time(1:fluo_delay:length(time));
time_final=sort(time_final,'descend');

%Create the tiff stack file
for i = 1:length(files)

    id=time_sorter(i,1);

    im = imread(files(id).name);
    movie_type = files(id).name(end-4);

    if movie_type=='6'
        output_file=strcat(wd,'/phase.tif');
        if exist(output_file,'file')
            imwrite(im, output_file, 'writemode', 'append');
        else
            imwrite(im, output_file);
        end;
        im = imcomplement(im);
        output_file=strcat(wd,'/inversed.tif');
        if exist(output_file,'file')
            imwrite(im, output_file, 'writemode', 'append');
        else
            imwrite(im, output_file);
        end;

    elseif movie_type=='0'
        output_file=strcat(wd,'/gfp.tif');
        if exist(output_file,'file')
            imwrite(im, output_file, 'writemode', 'append');
        else
            imwrite(im, output_file);
        end;

    elseif movie_type=='2'
        output_file=strcat(wd,'/yfp.tif');
        if exist(output_file,'file')
            imwrite(im, output_file, 'writemode', 'append');
        else
            imwrite(im, output_file);
        end;

    elseif movie_type=='3'
        output_file=strcat(wd,'/rfp.tif');
        if exist(output_file,'file')
            imwrite(im, output_file, 'writemode', 'append');
        else
            imwrite(im, output_file);
        end;
    end;

end;
cd (wd);
fileinfo_phase = imfinfo('phase.tif');

if exist('gfp.tif','file')
    fileinfo_fluo = imfinfo('gfp.tif');

elseif exist('rfp.tif','file')
    fileinfo_fluo = imfinfo('rfp.tif');

elseif exist('yfp.tif','file')
    fileinfo_fluo = imfinfo('yfp.tif');
end

phase_length=(length(fileinfo_fluo)-1)*fluo_delay+1;
if length(fileinfo_phase)>phase_length
    fluo_lag=length(fileinfo_phase)-phase_length;
    for i=1:phase_length
        im1=imread('inversed.tif',i+fluo_lag);
        im2=imread('phase.tif',i+fluo_lag);

function threshold(pos_filename)

%optimization mode (1: exhaustive, 2: montecarlo)
exploration_mode = 2;

%debug mode, set to 1 to see the intermediate results of all the analysis phases
debug_mode = 0;

%switch to the grayscale mode
colormap('gray');

%position image stack filename
output_file = 'thresheld.tif';

%parameters of the model
mu1 = 500;
mu2 = 1000;

% Load position images into the main memory
fileinfo = imfinfo(pos_filename);
n_position_images = length(fileinfo);
clear fileinfo;

%Function to get a position image by its index
function image = get_position_image_by_index(index)
    % Open the image to be processed
    I = imread(pos_filename, index);
    image(:,:) = I;
    clear I;
end

%Function to apply a background subtraction method to an image
function image_out = background_subtraction(image, mode)
    switch mode
        %%%%% Do nothing %%%%%
        case 1
            image_out = image;
        %%%%% contrast-limited adaptive %%%%%
        %%%%% and then simple background subtraction %%%%
        case 2
            image = adapthisteq(image);
            background = imopen(image, strel('disk',15));
            image_out = image - background;
        %simple background subtraction
        case 3
            background = imopen(image, strel('disk',15));
            image_out = image - background;
        %top-hat, bottom-hat method, rectangle
        case 4
            %MAXIMIZE CONTRAST
            se = strel('rectangle', [100, 20]);
            Itop = imtophat(image, se);
            Ibot = imbothat(image, se);
            image_out = imsubtract(imadd(Itop, image), Ibot);
        %top-hat, bottom-hat method, disk
        case 5
            %MAXIMIZE CONTRAST
            se = strel('disk',15);
            Itop = imtophat(image, se);
            Ibot = imbothat(image, se);
            image_out = imsubtract(imadd(Itop, image), Ibot);
        %TODO: Mixture of Gaussians?
        otherwise
            fprintf('\nInvalid background subtraction mode\n\n');
            return;
    end;
end

%Function to apply a preprocessing method to an image
function image_out = preprocessing(image, mode)
    switch mode
        %%%%% Do nothing %%%%%
        case 1
            image_out = image;
        %%%%% Histogram equalization %%%%%
        case 2
            image_out = adapthisteq(image);
        otherwise
            fprintf('\nInvalid preprocessing mode\n\n');
            return;
    end;
end

%Function to apply a thresholding method to an image
function image_out = thresholding(image, mode, input_thresh)
    switch mode
        %%%%% Global thresholding %%%%%
        case 1
            otsu_thresh = graythresh(image);
            image_out = im2bw(image, otsu_thresh);
        %%%%% Local thresholding %%%%%%
        case 2
            ws = 8;
            C = 2;
            mIM=imfilter(image,fspecial('average',ws),'replicate');
            sIM=mIM-image-C;
            bw=im2bw(sIM,0);
            image_out = imcomplement(bw);
        %%%%% Triangle thresholding %%%%%
        case 3
            thresh = triangle_th(imhist(image), 256);
            image_out = im2bw(image, thresh);
        case 4
            thresh = input_thresh/255;
            %Convert the grayscale image to a bw image using the Otsu method threshold
            image_out = im2bw(image, thresh);
        otherwise
            fprintf('\nInvalid thresholding mode\n\n');
            return;
    end;
end

%Function to apply a postprocessing method to an image
function image_out = postprocessing(image, mode)
    switch mode


function drift_compensation()
load temp_settings
fileinfo = imfinfo('phase.tif');
n_position_images = length(fileinfo);
clear fileinfo;
% fluo_period=7; % the period between fluorescent images (in minutes)
% phase_period=0.5; % the period between phase-contrast images (in minutes)
fluo_delay=fluo_period/phase_period;
fluo_images=1:fluo_delay:n_position_images;
wd=cd;
phase_file=strcat(wd,'/drift_compensated.tif');
im1_phase=imread('thresheld.tif',1);
[r c]=size(im1_phase);

%Find the left big object and its coordinates to align all the images
%accordingly
filteredImage1 = medfilt2(im1_phase,[8,2]);
filledImage1=imfill(filteredImage1,'holes');
L1=bwlabel(filledImage1, 4);
stats1=regionprops(L1,'Area', 'Extrema', 'Centroid', 'PixelIdxList');
drift=zeros(1,n_position_images);

%Start writing the new movie file
im1_phase(1:200,:)=0;
im1_phase(801:1000,:)=0;
imwrite(im1_phase, phase_file);
max_cross_sec=0;
for i=1:length(stats1)
    temp_area=stats1(i).Area;
    C_init=stats1(i).Centroid;
    if temp_area>=max_cross_sec
        if C_init(1)<=c/4 && C_init(2)>=r/4 && C_init(2)<=3*r/4
            big_object1=i;
            max_cross_sec=temp_area;
        end
    end
end
extr_coord1=stats1(big_object1).Extrema;
right_end1=max(extr_coord1(:,1));
pixels1=stats1(big_object1).PixelIdxList;

%Calculate the drift between the coordinates of the left big object in the
%different images
for i=2:n_position_images
    im2=imread('thresheld.tif',i);
    filteredImage2 = medfilt2(im2,[8,2]);
    filledImage2=imfill(filteredImage2,'holes');
    L2=bwlabel(filledImage2, 4);
    stats2=regionprops(L2,'Extrema', 'PixelIdxList');

    max_cross_sec=0;
    for j=1:length(stats2)
        pixels2=stats2(j).PixelIdxList;
        cross_sec=length(intersect(pixels1,pixels2));
        if cross_sec>=max_cross_sec
            max_cross_sec=cross_sec;
            extr_coord2=stats2(j).Extrema;
            right_end2=max(extr_coord2(:,1));
            drift(i)=right_end1-right_end2;
        end
    end
end

%Compensate the drift in the phase images
for i=2:n_position_images
    shifted_image=imread('thresheld.tif',i);
    im2=imread('thresheld.tif',i);
    if drift(i)~=0
        shifted_image(:,1:drift(i))=0;
        shifted_image(:,c-drift(i):c)=0;
        if drift(i)>0
            for k=1:r
                for j=drift(i):c
                    shifted_image(k,j)=im2(k,j-drift(i)+1);
                end
            end
        end

        if drift(i)<0
            for k=1:r
                for j=1:c+drift(i)
                    shifted_image(k,j)=im2(k,j-drift(i));
                end
            end
        end
    end
    shifted_image(1:200,:)=0;
    shifted_image(801:1000,:)=0;
    imwrite(shifted_image, phase_file, 'writemode', 'append');
end

%Compensate the drift in the gfp images
if exist('gfp.tif','file')
    gfp_file=strcat(wd,'/gfp_compensated.tif');
    im1_gfp=imread('gfp.tif',1);
    imwrite(im1_gfp, gfp_file);
    for i=2:length(fluo_images)
        shifted_image=imread('gfp.tif',i);
        im2=imread('gfp.tif',i);
        if drift(fluo_images(i))~=0
            shifted_image(:,1:drift(fluo_images(i)))=0;
            shifted_image(:,c-drift(fluo_images(i)):c)=0;
            if drift(fluo_images(i))>0
                for k=1:r
                    for j=drift(fluo_images(i)):c
                        shifted_image(k,j)=im2(k,j-drift(fluo_images(i))+1);
                    end
                end
            end
            if drift(fluo_images(i))<0
                for k=1:r
                    for j=1:c+drift(fluo_images(i))
                        shifted_image(k,j)=im2(k,j-drift(fluo_images(i)));
                    end
                end
            end
        end
        imwrite(shifted_image, gfp_file, 'writemode', 'append');
    end
end

%Compensate the drift in the yfp images
if exist('yfp.tif','file')
    yfp_file=strcat(wd,'/yfp_compensated.tif');
    im1_yfp=imread('yfp.tif',1);
    imwrite(im1_yfp, yfp_file);
    for i=2:length(fluo_images)
        shifted_image=imread('yfp.tif',i);
        im2=imread('yfp.tif',i);
        if drift(fluo_images(i))~=0
            shifted_image(:,1:drift(fluo_images(i)))=0;
            shifted_image(:,c-drift(fluo_images(i)):c)=0;
            if drift(fluo_images(i))>0
                for k=1:r
                    for j=drift(fluo_images(i)):c
                        shifted_image(k,j)=im2(k,j-drift(fluo_images(i))+1);
                    end
                end
            end
            if drift(fluo_images(i))<0
                for k=1:r
                    for j=1:c+drift(fluo_images(i))
                        shifted_image(k,j)=im2(k,j-drift(fluo_images(i)));
                    end
                end
            end
        end
        imwrite(shifted_image, yfp_file, 'writemode', 'append');
    end
end

%Compensate the drift in the rfp images
if exist('rfp.tif','file')
    rfp_file=strcat(wd,'/rfp_compensated.tif');
    im1_rfp=imread('rfp.tif',1);
    imwrite(im1_rfp, rfp_file);
    for i=2:length(fluo_images)
        shifted_image=imread('rfp.tif',i);
        im2=imread('rfp.tif',i);
        if drift(fluo_images(i))~=0
            shifted_image(:,1:drift(fluo_images(i)))=0;
            shifted_image(:,c-drift(fluo_images(i)):c)=0;
            if drift(fluo_images(i))>0
                for k=1:r
                    for j=drift(fluo_images(i)):c
                        shifted_image(k,j)=im2(k,j-drift(fluo_images(i))+1);
                    end
                end
            end
            if drift(fluo_images(i))<0
                for k=1:r
                    for j=1:c+drift(fluo_images(i))
                        shifted_image(k,j)=im2(k,j-drift(fluo_images(i)));
                    end
                end
            end
        end
        imwrite(shifted_image, rfp_file, 'writemode', 'append');
    end
end

end
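The drift compensation above shifts each frame horizontally and blanks the columns for which no source data exist. A minimal vectorized sketch of that idea is given below. It is a simplified illustration rather than the thesis code (the nested loops above use slightly different index offsets), and the test image and drift value are hypothetical.

```matlab
% Illustrative sketch: shift an image horizontally by d columns and
% zero-fill the columns that have no source data.
im = magic(4);        % hypothetical 4x4 test image
d  = 1;               % hypothetical drift in pixels; d > 0 shifts right

[r, c] = size(im);
shifted = zeros(r, c, class(im));   % start from an all-zero image

if d >= 0
    % Source columns 1..c-d land in destination columns d+1..c.
    shifted(:, d+1:c) = im(:, 1:c-d);
else
    % Source columns 1-d..c land in destination columns 1..c+d.
    shifted(:, 1:c+d) = im(:, 1-d:c);
end

disp(shifted);
```

Replacing the per-pixel loops with a single block assignment of this kind avoids r*c scalar copies per frame, which matters for long image stacks.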


function parsegment()

%Starting input file
input_file = 'drift_compensated.tif';

%File in which the segmented images are stored
movfile = 'segmented.tif';

% wd=cd;

% Load position images into the main memory
fileinfo = imfinfo(input_file);
n_position_images = length(fileinfo);
%Loop over the images
parfor index = 1:n_position_images
    index1=num2str(index);
    data_file=strcat('pixels_frame_',index1,'.mat');
    %Open the images to be processed
    bw = imread(input_file, index);
    filteredImage = medfilt2(bw,[8,2]);
    filledImage=imfill(filteredImage,'holes');
    tres_image=imread('thresheld.tif',index);
    BWnew=filledImage;
    %Get image properties
    Lseed=bwlabel(BWnew, 4);
    data1 = regionprops(Lseed,'MinorAxisLength');
    axis_lengths1 = [data1.MinorAxisLength];
    %Define processing parameters
    maxCellWidth=20;
    erosion_thresholding=1;
    se = strel('disk',erosion_thresholding);
    BWobj = tres_image;
    %Treat objects that do not match the parameters
    while sum(axis_lengths1 > maxCellWidth) > 0
        for obj = 1:length(axis_lengths1);
            if axis_lengths1(obj) > maxCellWidth
                BWobj(Lseed ~= obj) = 0;
                BWobj = imerode(BWobj,se);
                BWobj = medfilt2(BWobj,[4,2]);
                BWobj=imfill(BWobj,'holes');
                BWnew(Lseed == obj) = 0;
                BWnew = BWnew | BWobj;
                BWobj=BWnew;
            end
        end
        Lseed = bwlabel(BWnew,4);
        data_temp = regionprops(Lseed,'MinorAxisLength');
        axis_lengths1 = [data_temp.MinorAxisLength];
    end

    %remove small objects
    im2 = bwareaopen(BWnew,200,4);
    im = medfilt2(im2,[8,3]);
    Lseed_new = bwlabel(im,4);
    lambda=0.05;

    %generate dilated image for final fitting
    se = strel('disk',3);
    im4=medfilt2(tres_image,[3,3]);
    im3=imdilate(im4,se);
    final_image=im3;

    %Re-fill the generated object seeds to the size of the original
    %image
    Lprelim = IdentifySecPropagateSubfunction(double(Lseed_new),double(filteredImage),final_image,lambda);
    se = strel('disk',1);
    bw=imerode(Lprelim,se);

    %Save the segmented images in a new stack
    imwrite(bw,movfile,'Compression','none','WriteMode','append');

    %Get and save the images object properties
    stats=regionprops(bw,'PixelIdxList'); % retrieve some of the 'particle' properties - added the list of pixels -mod CG
    if ~isempty(stats), % empty stats causes crash

        %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
        %Not possible save directly - violates the transparency of the
        %function, therefore we create a special script to save and we call
        %it here
        % cd (strcat(wd,'/frames_data'));
        % save(data_file, 'stats');
        % cd(wd);
        %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

        savestats(stats, data_file);

    end;
end;

end


function partrack5()
index1='1';
wd=cd;
data_file1=strcat(wd,'/frames_data/pixels_frame_',index1,'.mat');
fileinfo = imfinfo('phase.tif');
frames=length(fileinfo);
load(data_file1);
stats1=stats;
clear stats
beads=length(stats1);
cells(beads).pixels=[];

% parfor i=1:beads
%     pixels=stats1(i).PixelIdxList;
%     cells(i).pixels=pixels;
% end
parfor i=1:beads
    pixels=stats1(i).PixelIdxList;
    cells(i).pixels=pixels;
    cells(i).cell_nr=i;
end
clear fileinfo stats1 pixels beads data_file1 i
mat_name=strcat('tracked_frame_',index1,'.mat');
save (mat_name, 'cells');
clear cells mat_name index1
for j=2:frames
    frame_index_old=num2str(j-1);
    frame_index=num2str(j);
    mat_name_old=strcat('tracked_frame_',frame_index_old,'.mat');
    mat_name=strcat('tracked_frame_',frame_index,'.mat');
    load (mat_name_old);
    beads=length(cells);
    data_file2=strcat(wd,'/frames_data/pixels_frame_',frame_index,'.mat');
    load(data_file2);
    stats2=stats;
    clear stats
    beads2=length(stats2);
    found_beads=zeros(beads,1);
    lost_beads=zeros(beads,1);
    parfor i=1:beads
        pixels_old=cells(i).pixels;
        pixels_new=pixels_old;
        if pixels_old~=0
            % best overlap found so far for cell i; initialized once per cell
            % so that the candidate with the largest overlap is retained
            max_sec=0;
            for k=1:beads2
                pixels_try=stats2(k).PixelIdxList;
                if length(pixels_try)>=450 && length(pixels_try)<=4500
                    cross_sec=length(intersect(pixels_old,pixels_try));
                    if cross_sec>max_sec
                        max_sec=cross_sec;
                        pixels_new=pixels_try;
                        found_beads(i)=k;
                    end
                end
            end
        else
            lost_beads(i)=1;
        end
        pixels=pixels_new;
        cells(i).pixels=pixels;
        cell_nrs(i)=cells(i).cell_nr;
    end
    repeats=zeros(length(found_beads),1);
    for l=1:length(found_beads)-1
        a=found_beads(l);
        for m=l+1:length(found_beads)
            b=found_beads(m);
            if a~=0 && a==b
                repeats(m)=1;
            end
        end
    end
    free_beads=[];
    for n=1:beads2
        if isempty(find(found_beads==n))
            free_beads=[free_beads n];
        end
    end
    max_bead=max(cell_nrs);
    clear cell_nrs beads2
    repeated_beads=find(repeats==1);
    lost_beads=find(lost_beads==1);
    bad_beads=[repeated_beads; lost_beads];
    for o=1:length(bad_beads)
        if length(free_beads)>=o
            cells(bad_beads(o)).pixels=stats2(free_beads(o)).PixelIdxList;
            cells(bad_beads(o)).cell_nr=max_bead+o;
        end
    end
    clear max_bead repeated_beads lost_beads bad_beads

    % repeats=zeros(length(cells),1);
    % for l=1:length(cells)-1
    %     a=cells(l).pixels;
    %     for m=l+1:length(cells)
    %         b=cells(m).pixels;
    %         if(length(intersect(a,b))==length(a))
    %             repeats(m)=1;
    %         end
    %     end
    % end
    % if length(find(repeats==1))>=length(cells)/2
    %     cell_counter=0;
    %     for n=1:beads2
    %         pixels_try=stats2(n).PixelIdxList;
    %         if length(pixels_try)>=450 && length(pixels_try)<=4500 && cell_counter

    save (mat_name, 'cells');
    clear pixels_try pixels_new pixels cells
end

% track_matrix(beads,frames).frame=[];
% track_matrix(beads,frames).bead=[];
% for l=1:beads
%     track_matrix(l,1).frame=1;
%     track_matrix(l,1).bead=l;
% end
%
% for j=2:frames
%     index2=num2str(j);
%     data_file2=strcat(wd,'/frames_data/pixels_frame_',index2,'.mat');
%     load(data_file2);
%     stats2=stats;
%     clear stats
%     index1=num2str(j-1);
%     data_file1=strcat(wd,'/frames_data/pixels_frame_',index1,'.mat');
%     load(data_file1);
%     stats1=stats;
%     clear stats
%     beads2=length(stats2);
%     prev_frame=track_matrix(:,j-1);
%     track_matrix(:,j)=prev_frame;
%     parfor i=1:beads
%         to=0;
%         max_sec=0;
%         current_cell=prev_frame(i);
% %         if current_cell>0
%         pixels=stats1(current_cell.bead).PixelIdxList;
%         if length(pixels)>=450 && length(pixels)<=4500
%             for k=1:beads2
%                 pixels2=stats2(k).PixelIdxList;
%                 if length(pixels2)>=450 && length(pixels2)<=4500
%                     cross_sec=length(intersect(pixels,pixels2));
%                     if cross_sec>max_sec
%                         max_sec=cross_sec;
%                         to=k;
%                     end
%                 end
%             end
%             track_matrix(i,j).bead=to;
%             track_matrix(i,j).frame=j;
%         end
% %         end
%     end
%     clear stats1 stats2
% end
%
% % track_matrix_new=[];
% %
% % for i=1:beads
% %     current_bead=track_matrix(i,:);
% % %     if isempty(find(current_bead==0))
% %     track_matrix_new=[track_matrix_new; current_bead];
% % %     end
% % end
% %
% % clear track_matrix
% % track_matrix=track_matrix_new;
% save track_matrix track_matrix
end
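The tracking step assigns each cell from the previous frame to the object in the current frame with which it shares the most pixels. The sketch below illustrates this matching rule on hypothetical pixel-index lists. The names `prev_cells`, `candidates` and `match` are assumptions made for illustration, and the size filter (450 to 4500 pixels) used in partrack5 is omitted for brevity.

```matlab
% Illustrative sketch: overlap-based matching between two consecutive frames.
% Each entry is a list of linear pixel indices (as in regionprops PixelIdxList).
prev_cells = {[1 2 3 4], [10 11 12]};          % hypothetical cells, frame j-1
candidates = {[3 4 5 6], [11 12 13], [1]};     % hypothetical objects, frame j

match = zeros(numel(prev_cells), 1);   % 0 means no overlapping object found
for i = 1:numel(prev_cells)
    best_overlap = 0;                  % reset once per cell
    for k = 1:numel(candidates)
        overlap = numel(intersect(prev_cells{i}, candidates{k}));
        if overlap > best_overlap
            best_overlap = overlap;
            match(i) = k;
        end
    end
end

disp(match');
```

Two previous cells that end up with the same `match` value correspond to the repeated assignments that partrack5 detects and resolves afterwards by handing out the unassigned objects.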


function fluodata_M9_4(image_type)
imagefile=strcat(image_type,'.tif');
output_file=strcat(image_type, '_data.csv');

% % SETTINGS
% min_track_time=60;% the minimal period for valid tracking of individual cells (in minutes)
% fluo_period=7; % the period between fluorescent images (in minutes)
% phase_period=0.5; % the period between phase-contrast images (in minutes)
load temp_settings
fluo_delay=fluo_period/phase_period;
min_track_length=ceil(min_track_time/phase_period);
clear fluo_period phase_period
fileinfo = imfinfo('phase.tif');
frames=length(fileinfo);
fluo_frames=1:fluo_delay:frames;
clear fileinfo fluo_delay
last_index=num2str(length(dir('tracked_frame_*')));
last_name=strcat('tracked_frame_',last_index,'.mat');
load (last_name)
clear last_index last_name
beads=length(cells);
for i=1:beads
    cell_nrs(i)=cells(i).cell_nr;
end
total_beads=max(cell_nrs);
clear cells cell_nrs
tracking_results=zeros(frames,total_beads);
for i=1:frames
    frame_index=num2str(i);
    mat_name=strcat('tracked_frame_',frame_index,'.mat');
    load (mat_name)
    for j=1:beads
        cell_nrs(j)=cells(j).cell_nr;
    end
    clear cells
    parfor j=1:total_beads
        if ~isempty(find(cell_nrs==j))
            tracking_results(i,j)=1;
        end
    end
    clear cell_nrs frame_index mat_name
end

221

good_tracking=[]; for i=1:total_beads if length(find(tracking_results(:,i)==1))>=min_track_length good_tracking=[good_tracking i]; end end load time.csv for i=1:length(good_tracking) temp_frames=find(tracking_results(:,good_tracking(i))==1); temp_phase_frames=intersect(temp_frames, fluo_frames); clear temp_frames fluo=zeros(length(temp_phase_frames),1); time_points=zeros(length(temp_phase_frames),1); for j=1:length(temp_phase_frames) time_points(j)=time(find(fluo_frames==temp_phase_frames(j))); frame_index=num2str(temp_phase_frames(j)); mat_name=strcat('tracked_frame_',frame_index,'.mat'); load (mat_name) cells1=cells; clear cells parfor k=1:beads cell_nrs(k)=cells1(k).cell_nr; end temp_bead=min(find(cell_nrs==good_tracking(i))); clear cell_nrs pixels=cells1(temp_bead).pixels; fluo_data=imread(imagefile,find(fluo_frames==temp_phase_frames(j))); parfor k=1:length(pixels) f(k)=fluo_data(pixels(k)); end f=mean(f); clear fluo_data pixels fluo(j)=f; end clear temp_phase_frames f if max(fluo)>10 cell_index=num2str(good_tracking(i)); cell_name=strcat('cell_',cell_index,'.mat'); save (cell_name, 'fluo', 'time_points'); end clear fluo time_points cell_index cell_name end clear time tracked_cells=dir('cell_*'); nr_tracked_cells=length(tracked_cells);


% h=figure;
% for i=1:nr_tracked_cells
%     load(tracked_cells(i).name);
%     plot(time_points, fluo);
%     hold all
% end
% saveas(h,strcat(cd,'/results'),'png');
% close all
fid=fopen(output_file, 'w');
for j=1:nr_tracked_cells
    load(tracked_cells(j).name);
    data=[];
    for i=1:length(fluo)
        data=strcat(data,num2str(fluo(i)),';');
    end
    data=strcat(data,'\n');
    fprintf(fid, data);
    data=[];
    for i=1:length(fluo)
        data=strcat(data,num2str(time_points(i)),';');
    end
    data=strcat(data,'\n');
    fprintf(fid, data);
end
fclose(fid);
end


function results2()
wd=cd;
rd=strcat(wd,'/results/');
if ~isempty(dir ('*.tif'))
    movefile('*.tif', rd);
end
if ~isempty(dir ('*.csv'))
    movefile('*.csv', rd);
end
if ~isempty(dir ('*.mat'))
    movefile('*.mat', rd);
end
if ~isempty(dir ('*.zip'))
    movefile('*.zip', rd);
end
if ~isempty(dir ('*.png'))
    movefile('*.png', rd);
end
end


Fluorometer processing scripts

function process()
if ~matlabpool('size')
    clear all; close all; clc;
    matlabpool
end
clear all; close all; clc;
data_extraction
clear all; close all; clc;
data_analysis
clear all; close all; clc;
data_presentation
clear all; close all; clc;
rd=strcat(cd,'/results');
movefile('*.mat',rd)
movefile('*.csv',rd)
h=figure;
plotting
saveas(h,strcat(rd,'/results'),'png');
clear all
matlabpool close
end


function data_extraction()

% Find and load the excel file with the data
wd=cd;
cd data
data_file=dir('*.xls');
data_file=data_file.name;
[num,txt,raw] = xlsread(data_file);
cd(wd)
first_well=min(find(~isnan(num(:,1))));
length_num=size(num,2);
if first_well>1
    num(1:first_well-1,:)=[];
end
if ~isempty(find(isnan(num(1,:))))
    last_measurement=min(find(isnan(num(1,:))));
    num(:,last_measurement:length_num)=[];
end
clear length_num
length_num=size(num,2);
clear wd data_file first_well last_measurement

% Retrieve the number of wells used in the measurement
num_wells=size(num,1);

% Retrieve the measurement labels used in the experiment (e.g. OD600, GFP, etc.)
labels_tmp=txt(1,:);
labels=[];
for i=1:length(labels_tmp)
    if ~isempty(cell2mat(labels_tmp(i)))
        if i<=length_num
            labels=[labels labels_tmp(i)];
        elseif length(cell2mat(labels_tmp(i)))==length('Well positions') && isempty(find(cell2mat(labels_tmp(i))~='Well positions'))
            wells_pos_field=i;
        elseif length(cell2mat(labels_tmp(i)))==length('Layout') && isempty(find(cell2mat(labels_tmp(i))~='Layout'))
            layout_field=i;
        end
    end
end
clear labels_tmp
exp_num=length(labels);

% Retrieve the time vector
parfor i=1:size(txt,2)
    time_tmp=cell2mat(txt(2,i));
    if length(time_tmp)>0
        time_vec(i)=str2num(time_tmp(1:end-1));
    end
end
time_vec2=time_vec(1:length(time_vec)/exp_num);
time_vec=time_vec2;
clear time_tmp time_vec2

% Retrieve a matrix with the numerical values for all measurements in the
% format (number of wells, number of labels used, number of time points)
results=zeros(num_wells,exp_num,length(time_vec));
for i=1:num_wells
    for j=1:exp_num
        for k=1:length(time_vec)
            results(i,j,k)=num(i,length(time_vec)*(j-1)+k);
        end
    end
end
parfor i=1:num_wells
    wells(i).names=txt(3+i-1,wells_pos_field);
end

% Retrieve the groups of experimental and blank repeats
tecan_labels=txt(3:3+num_wells-1,layout_field);
groups=[];
used_wells=[];
for i=1:num_wells
    if isempty(find(i==used_wells))
        current_name=cell2mat(tecan_labels(i));
        groups(length(groups)+1).wells=i;
        if ~isempty(find(current_name(1:2)=='BL'))
            groups(length(groups)).type='Blanc';
        elseif ~isempty(find(current_name(1:2)=='SM'))
            groups(length(groups)).type='Experiment';
        end
        used_wells=[used_wells i];
        for j=i:num_wells
            if isempty(find(j==used_wells))
                new_name=cell2mat(tecan_labels(j));
                if length(new_name)==length(current_name) && isempty(find(current_name~=new_name))
                    groups(length(groups)).wells=[groups(length(groups)).wells j];
                    used_wells=[used_wells j];
                end
            end
        end
    end
end


clear tecan_labels used_wells
for i=1:length(groups)
    groups(i).labels=[];
    for j=1:length(groups(i).wells)
        groups(i).labels=[groups(i).labels; wells(groups(i).wells(j)).names];
    end
end

% Clear the raw data obtained from the excel file
clear num txt raw

% Save the retrieved data for further analysis
save results results wells groups time_vec labels
end


function data_analysis ()
load results

% Definition of the constants
m=1.54; % Maturation constant of GFP in /h
D=0.24; % Degradation constant for GFP !!! If there is protease degradation the value is about 0.24/h

% Retrieve the OD data
for i=1:length(labels)
    current_name=cell2mat(labels(i));
    if length(current_name)==length('OD600') && isempty(find(current_name~='OD600')) % was 'OD600'
        OD_label=i;
    end
end
fluo_num=1:length(labels); % Number of non-OD measurements
fluo_num(fluo_num==OD_label)=[];
for i=1:length(fluo_num)
    exp_labels(i).name=cell2mat(labels(fluo_num(i)));
end
clear current_name

% Calculate mu
data_span=zeros(length(wells),2);
Mu=zeros(length(wells),1);
OD=zeros(1,length(time_vec));
for well=1:length(wells)
    OD(:)=results(well,OD_label,:);
    ln_OD=log(OD);
    test_slope=0;
    for i=1:length(ln_OD)-5
        for j=i+5:length(ln_OD)
            test_data=ln_OD(i:j);
            test_time=time_vec(i:j);
            [P,S]=polyfit(test_time,test_data,1);
            clear test_time test_data
            if S.normr<0.1 && P(1)>test_slope
                data_span(well,:)=[i j];
                test_slope=P(1);
            end
        end
    end
    Mu(well)=test_slope*3600;
end


% Calculate the steady-state fluorescence
fluo_num=1:length(labels); % Number of non-OD measurements
fluo_num(fluo_num==OD_label)=[];
OD=zeros(1,length(time_vec));
f_ss=zeros(length(wells),length(fluo_num));
for well=1:length(wells)
    OD(:)=results(well,OD_label,:);
    F=zeros(1,length(time_vec));
    for i=1:length(fluo_num)
        F(:)=results(well,fluo_num(i),:);
        P=polyfit(OD(data_span(well,1):data_span(well,2)), F(data_span(well,1):data_span(well,2)),1);
        f_ss(well,(i))=P(1);
    end
end

% Calculate the promoter strengths in the different wells for the different
% reporters
Prom_strength=zeros(length(fluo_num),length(wells));
for i=1:length(fluo_num)
    for j=1:length(wells)
        Prom_strength(i,j)=f_ss(j,i)*Mu(j)*(1+Mu(j)/m)+f_ss(j,i)*D*(2*Mu(j)/m+D/m+1);
    end
end
save results2 Prom_strength Mu OD_label exp_labels data_span
end
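For reference, the Prom_strength expression in data_analysis can be written as a formula. In the notation below, f_ss is the per-well slope of fluorescence against OD over the fitted growth window, mu is the growth rate (Mu), m the GFP maturation constant and D the GFP degradation constant, all as named in the listing. The second line is only an algebraic rearrangement of the first, added here for readability; it is not stated in the listing itself.

```latex
\begin{align}
P &= f_{ss}\,\mu\left(1+\frac{\mu}{m}\right)
   + f_{ss}\,D\left(\frac{2\mu}{m}+\frac{D}{m}+1\right) \\
  &= f_{ss}\,(\mu+D)\left(1+\frac{\mu+D}{m}\right)
\end{align}
```

Written this way, growth dilution and degradation enter only through the sum mu + D, and the expression reduces to f_ss (mu + D) when maturation is fast compared with mu + D.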


function data_presentation()
load results
load results2
for i=1:length(wells)
    complete_data(i).Well=cell2mat(wells(i).names);
    complete_data(i).Prom_strength=Prom_strength(i);
end
fid = fopen('data.csv', 'w');
for i=1:length(complete_data)
    data=strcat(complete_data(i).Well, ';', num2str(complete_data(i).Prom_strength),'\n');
    fprintf(fid, data);
end
fclose(fid);
clear complete_data

for i=1:length(exp_labels)
    data_name=['data_' exp_labels(i).name];
    data=groups;
    for j=1:length(data)
        if length(data(j).type)==length('Experiment')
            current_wells=data(j).wells;
            time_span=[];
            for k=1:length(current_wells)
                data(j).Prom_strength(k)=Prom_strength(i,current_wells(k));
                data(j).Mu(k)=Mu(current_wells(k));
                time_span=[time_span; time_vec(data_span(current_wells(k),1)) time_vec(data_span(current_wells(k),2))];
            end
            clear current_wells
            data(j).Average=mean(data(j).Prom_strength);
            data(j).St_dev=std(data(j).Prom_strength);
            data(j).time_span=time_span;
            clear time_span
        end
    end
    save(data_name,'data')
end
delete results.mat
delete results2.mat
end


function plotting()
cd results
data_file=dir('*.mat');
data_file=data_file.name;
load (data_file)
clear data_file
cd ..

% Retrieve the number of experiments done
exp_count=[];
for i=1:length(data)
    if length(data(i).type)==length('Experiment')
        exp_count=[exp_count i];
    end
end

% Retrieve the values of the mean and the error
values=zeros(length(exp_count),1);
stdevs=zeros(length(exp_count),1);
for i = 1:length(exp_count)
    values(i) = data(exp_count(i)).Average;
    stdevs(i) = data(exp_count(i)).St_dev;
end

% Plot the data
barweb(values, stdevs, 0.8); % for a barplot with standard deviation
end


Genetic parts and devices models

function dxdt = laci_gfp(t,x,Z)

% This is a delayed differential equation function connecting the lacI and GFP production to a
% promoter activity function (p_lac_ara by definition, but maybe another one also - see lines 32, 33).
% The lacI and GFP genes are controlled by two equal promoters, hence the different time delays
% and translation rates
%
% To run it simply type:
% D=[6 8]; t=500; sol = dde23(@laci_gfp,[D],[0 0],[0, t]); plot(sol.x,sol.y)
%
% This will produce a plot of lacI and GFP concentrations vs. time for time delays D(1) and D(2),
% timespan t and initial concentrations: x0=0; gfp0=0

%parameters
ktr1=0.7; % translation rate for LacI
ktr2=1.5; % translation rate for GFP

AraC=0; % concentration of AraC (uM)
a=0; % concentration of arabinose (uM)
I=0; % concentration of IPTG (uM)
d1=0.3; % rate of protein degradation for LacI
d2=0.2; % rate of protein degradation for GFP

Kmp=0.075; % proteasome Michaelis constant
Vp=0.4; % rate of proteasome degradation

%functions
Xd1=Z(1,1); % delayed function of LacI used in the repression calculation
Xd2=Z(1,2); % delayed function of LacI used in the repression calculation
PA1=p_lac_ara1(Xd1, AraC, a, I); % maybe also another user-defined promoter function
PA2=p_lac_ara2(Xd2, AraC, a, I); % maybe also another user-defined promoter function

%reactions
dxdt=zeros(2,1);
dxdt(1) = PA1*ktr1-d1*x(1)-Vp*x(1)/(Kmp+x(1)); % LacI concentration function
dxdt(2) = PA2*ktr2-d2*x(2)-Vp*x(2)/(Kmp+x(2)); % GFP concentration function

function dxdt = laci_gfp_arac(t,x,Z)

% This is a delayed differential equation function connecting the lacI, AraC and GFP production to a
% promoter activity function (p_lac_ara by definition, but maybe another one also - see lines 33, 34).
% The lacI gene on one hand, and the AraC and GFP genes on the other, are controlled by two equal promoters,
% hence the different time delays and translation rates
%
% To run it simply type:
% D=[6 7]; t=100; sol = dde23(@laci_gfp_arac,[D],[0 0 0],[0, t]); plot(sol.x,sol.y)
%
% This will produce a plot of the LacI, GFP and AraC concentrations vs. time for time delays D(1) and D(2),
% timespan t and initial concentrations: lacI0=0; GFP0=0; araC0=0

%parameters
ktr1=0.7; % translation rate for LacI
ktr2=1.5; % translation rate for GFP
ktr3=2; % translation rate for AraC
a=0; % concentration of arabinose (uM), passed as the inducer argument of p_lac_ara
I=0; % concentration of IPTG (uM)
d1=0.3; % rate of protein degradation for LacI
d2=0.2; % rate of protein degradation for GFP
d3=0.2; % rate of protein degradation for AraC

Kmp=0.075; % proteasome Michaelis constant
Vp=0.4; % rate of proteasome degradation

%functions
Xd1=Z(1,1); % delayed function of LacI used in the repression calculation
Xd2=Z(1,2); % delayed function of LacI used in the repression calculation for the second promoter (controlling AraC and GFP)
PA1=p_lac_ara(Xd1, x(3), a, I); % maybe also another user-defined promoter function
PA2=p_lac_ara(Xd2, x(3), a, I); % maybe also another user-defined promoter function

%reactions
dxdt=zeros(3,1);
dxdt(1) = PA1*ktr1-d1*x(1)-Vp*x(1)/(Kmp+x(1)); % LacI concentration function
dxdt(2) = PA2*ktr2-d2*x(2)-Vp*x(2)/(Kmp+x(2)); % GFP concentration function
dxdt(3) = PA2*ktr3-d3*x(3)-Vp*x(3)/(Kmp+x(3)); % AraC concentration function

function dxdt = laci_p_lac_tet(t,x,Z)

% This is a delayed differential equation function connecting the lacI production to a promoter activity function
% (p_lac_tet by definition, but maybe another one also).
%
% To run it simply type:
% D=[6]; t=500; sol = dde23(@laci_p_lac_tet,[D],[0],[0, t]); plot(sol.x,sol.y)
%
% This will produce a plot of lacI concentration vs. time for a time delay of D,
% timespan t and initial concentration: lacI0=0;

%parameters
ktr1=0.7; % translation rate for LacI
I=0; % concentration of IPTG (uM)
d1=0.3; % rate of protein degradation for LacI
Kmp=0.075; % proteasome Michaelis constant
Vp=0.4; % rate of proteasome degradation
tet=0; % TetR concentration (uM)
atc=0; % aTC concentration (uM)

%functions
Xd=Z; % delayed function of LacI used in the repression calculation
PA=p_lac_tet(tet, atc, Xd, I); % maybe also another user-defined promoter function

%reactions
dxdt = PA*ktr1-d1*x-Vp*x/(Kmp+x); % LacI concentration function


function dxdt = laci_p_l_lac(t,x,Z)

% This is a delayed differential equation function connecting the lacI production to a promoter activity
% function (p_l_lac by definition, but maybe another one also - see line 25)
%
% To run it simply type:
% D=6; t=500; sol = dde23(@laci_p_l_lac,[D],[0],[0, t]); plot(sol.x,sol.y)
%
% This will produce a plot of lacI concentration vs. time for a time delay of D, timespan t and initial
% concentration: x0=0

%parameters
ktr=0.7; % translation rate

I=10000; % concentration of IPTG (uM)
d1=0.3; % rate of protein degradation

Kmp=0.075; % proteasome Michaelis constant
Vp=0.4; % rate of proteasome degradation

%functions
xd=Z; % delayed function of LacI used in the repression calculation
PA=p_l_lac(xd, I); % maybe also another user-defined promoter function

%reaction
dxdt = PA*ktr-d1*x-Vp*x/(Kmp+x);
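The single-variable models above (laci_p_lac_tet and laci_p_l_lac) share one structure, which may be easier to read as an equation. In the sketch below, x is the LacI concentration, P_A is the promoter activity function called in the listing, and the symbol tau for the delay D passed to dde23 is notation introduced here rather than taken from the listing.

```latex
\frac{dx}{dt} = k_{tr}\,P_A\big(x(t-\tau),\,I\big) \;-\; d_1\,x(t) \;-\; \frac{V_p\,x(t)}{K_{mp}+x(t)}
```

The first term is delayed production under self-repression, the second is first-order degradation, and the third is the saturable (Michaelis-Menten type) degradation term with constants Vp and Kmp.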


function dxdt = lac_oscillator(t,x,Z)

% To run it simply type:
% D=6; t=100; sol = dde23(@lac_oscillator,[D],[0; 0],[0, t]); plot(sol.x,sol.y)
% for a time delay D for the loop formation and timespan t
% x0=0 for both LacI and mRNA

%parameters
Pt=10; % total number of plasmids
Pb=500; % basic promoter activity
ktr=0.7; % translation rate

I=0; % concentration of IPTG (uM)
ki=15.7; % constant of induction (uM)

L0=5; % looping coefficient
kl=0.4; % affinity constant of LacI for operator
d1=0.03; % rate of protein degradation
d2=0.3; % rate of mRNA degradation

Kmp=0.075; % proteasome Michaelis constant
Vp=4; % rate of proteasome degradation
kd=0.2; % LacI dimer constant
kt=0.004; % LacI tetramer constant

%functions
xd=Z(2); % delayed function of LacI used in the looping repression calculation
X=x(2)+kd*x(2)^2+kt*x(2)^4; % total concentration of LacI species
Ind=1+I/ki; % induction function

PA=Pb/(1+(2*x(2)/kl)/Ind^2+(2*xd*L0/kl)/Ind^4); % promoter activity function; LacI is taken to be x(2), as in xd and X

%reactions
dxdt=zeros(2,1);
dxdt(1)=PA*Pt-d2*x(1); % transcription and mRNA degradation
dxdt(2)=x(1)*ktr-d1*x(2)-Vp*X/(Kmp+X); % translation and protein degradation
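As a reading aid, the lac_oscillator model can be written as the following system, with x_1 the mRNA and x_2 the LacI concentration. Two points are assumptions made here, not statements from the listing: the delay is denoted tau (the value D passed to dde23), and the LacI term in the promoter activity is read as x_2, consistent with how xd and X are defined.

```latex
\begin{align}
X &= x_2 + k_d\,x_2^{2} + k_t\,x_2^{4}, \qquad \mathrm{Ind} = 1 + \frac{I}{k_i} \\
P_A &= \frac{P_b}{1 + \dfrac{2\,x_2/k_l}{\mathrm{Ind}^{2}} + \dfrac{2\,L_0\,x_2(t-\tau)/k_l}{\mathrm{Ind}^{4}}} \\
\frac{dx_1}{dt} &= P_A\,P_t - d_2\,x_1 \\
\frac{dx_2}{dt} &= k_{tr}\,x_1 - d_1\,x_2 - \frac{V_p\,X}{K_{mp}+X}
\end{align}
```

Only the looping term carries the delay; the direct repression term and both degradation terms depend on the current state.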


function f = p_lac_tet(x, y, z, w)

% This is a function calculating the activity of the Plac/tet promoter with
% respect to four inputs: p_lac_tet = f(x, y, z, w)
%
% Where:
% x is concentration of TetR in uM
% y is concentration of aTC in uM
% z is concentration of LacI in uM
% w is concentration of IPTG in uM

%parameters
Pb=2; % basal promoter activity (?)
Pmax=2500; % promoter activity at maximal induction (?)

Kr1=0.89; % constant of repression by TetR (uM)
Ki1=2.1; % constant of induction by aTc (ng/mL)

Ki2=15.7; % constant of induction by IPTG (uM)
Kr2=0.001; % constant defining the active repressor species
L0=0; % looping coefficient (zero for no looping)

Pt=3; % total number of plasmids

%functions
Rep1=1+(x/(Kr1)/((1+y/Ki1)^2))^2; % Promoter repression function TetR
Rep2=1+(z/Kr2)/((1+w/Ki2)^2)+L0*(z/Kr2)/((1+w/Ki2)^4); % Promoter repression function LacI

% Promoter activity
f=Pt*Pb*Pmax/Rep1/Rep2;
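Written out as a formula, p_lac_tet multiplies two independent repression factors, one for TetR/aTc and one for LacI/IPTG. The expression below is a direct transcription of the listing, with x, y, z and w as defined in its header comment; no parameter values are implied beyond those in the code.

```latex
\begin{align}
R_1 &= 1 + \left(\frac{x/K_{r1}}{\left(1+y/K_{i1}\right)^{2}}\right)^{2} \\
R_2 &= 1 + \frac{z/K_{r2}}{\left(1+w/K_{i2}\right)^{2}} + L_0\,\frac{z/K_{r2}}{\left(1+w/K_{i2}\right)^{4}} \\
f(x,y,z,w) &= \frac{P_t\,P_b\,P_{max}}{R_1\,R_2}
\end{align}
```

With no repressor present (x = 0 and z = 0) both factors equal 1 and f takes its maximum value P_t P_b P_max. With L0 = 0, as set in the listing, the looping term in R_2 vanishes.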


function f = p_l_lac(x, y)

% This is a function calculating the activity of the PLlac promoter with
% respect to two inputs: p_l_lac = f(x, y)
%
% Where:
% x is concentration of LacI in uM
% y is concentration of IPTG in uM

%parameters
Pb=35; % basal promoter activity
Pmax=620; % promoter activity at maximal induction

Ki=90; % constant of induction by IPTG (uM)
Kr=0.01; % constant defining the active repressor species

Pt=60; % total number of plasmids

%functions
Rep=1+(x/Kr)/((1+y/Ki)^2); % Promoter repression function

% Promoter activity
f=Pt*Pb*Pmax/Rep;
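The PLlac activity computed by p_l_lac corresponds to the following expression, transcribed from the listing with x the LacI and y the IPTG concentration:

```latex
f(x,y) = \frac{P_t\,P_b\,P_{max}}{1 + \dfrac{x/K_r}{\left(1 + y/K_i\right)^{2}}}
```

Two limits follow directly from this form: without LacI (x = 0) the denominator is 1 and the activity is at its maximum, and for IPTG concentrations well above K_i the repression term shrinks towards zero, so the activity approaches the same maximum regardless of x.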


function f = p_l_tet(x, y)

% This is a function calculating the activity of the PLtet promoter with
% respect to two inputs: p_l_tet = f(x, y)
%
% Where:
% x is concentration of TetR in uM
% y is concentration of aTC in uM

%parameters
Pb=2; % basal promoter activity
Pmax=2500; % promoter activity at maximal induction

Kr=0.89; % constant of repression by TetR (uM)
Ki=2.1; % constant of induction by aTc (ng/mL)
Pt=60; % total number of plasmids

%functions
Rep=(1+(x/Kr)/((1+y/Ki)^2))^2; % Promoter repression function

% Promoter activity
f=Pt*Pb*Pmax/Rep;


Characterization

function lissajou()
% Plots the smoothed RFP trajectory against the smoothed GFP trajectory
% (a Lissajous-type curve) for one tracked cell (column 29 of the data files)
load gfp_bact_traj.txt
gfp=gfp_bact_traj;
load rfp_bact_traj.txt
rfp=rfp_bact_traj;
% lissajou_data=[];
total=length(gfp);
for i=29
    x=gfp(:,i);
    if (length(find(x~=0))>=20)
        % Savitzky-Golay smoothing (order 2, window 13), applied twice
        golx=sgolayfilt(x, 2, 13);
        golx=sgolayfilt(golx, 2, 13);
        y=rfp(:,i);
        goly=sgolayfilt(y, 2, 13);
        goly=sgolayfilt(goly, 2, 13);
        i
        total
        golx=golx.*1000;
        goly=goly.*1000;
        plot(golx, goly)
        pause
        close all;
    end
end
end
