De Novo Automated Design of Small RNA Circuits for Engineering Synthetic Riboregulation in Living Cells
Total Page:16
File Type:pdf, Size:1020Kb
De novo automated design of small RNA circuits for engineering synthetic riboregulation in living cells Guillermo Rodrigoa,b,1, Thomas E. Landraina,1, and Alfonso Jaramilloa,2 aInstitute of Systems and Synthetic Biology, Genopole – Université d′Évry Val d′Essonne – Centre National de la Recherche Scientifique (CNRS UPS3509), 91030 Évry, France; and bInstituto de Biología Molecular y Celular de Plantas, Consejo Superior de Investigaciones Científicas – Universidad Politécnica de Valencia, 46022 Valencia, Spain Edited by David Baker, University of Washington, Seattle, WA, and approved August 6, 2012 (received for review March 20, 2012) A grand challenge in synthetic biology is to use our current knowl- tional problem, here we also rely on Watson–Crick interactions edge of RNA science to perform the automatic engineering of com- (i.e., canonical purine-pyrimidine pairs plus the G–U pair), pletely synthetic sequences encoding functional RNAs in living although it is possible to extend our approach to other types cells. We report here a fully automated design methodology and of interactions (28). Therefore, we are faced with the problem experimental validation of synthetic RNA interaction circuits work- of designing a set of RNA species with predefined structures ing in a cellular environment. The computational algorithm, based and with unspecified intermolecular interactions able to produce on a physicochemical model, produces novel RNA sequences by the intended allosteric regulation. Here, we show it is possible to exploring the space of possible sequences compatible with prede- solve such a combinatorial problem by developing a fully auto- fined structures. We tested our methodology in Escherichia coli by mated procedure that exploits physicochemical principles and designing several positive riboregulators with diverse structures structural constraints and that outputs the RNA sequences imple- and interaction models, suggesting that only the energy of forma- menting the predefined interactions. tion and the activation energy (free energy barrier to overcome for initiating the hybridization reaction) are sufficient criteria to engi- Results and Discussion neer RNA interaction and regulation in bacteria. The designed se- Computational Design of sRNA Circuits. Our methodology (see de- quences exhibit nonsignificant similarity to any known noncoding tails in SI Appendix) starts by choosing well-defined structures for RNA sequence. Our riboregulatory devices work independently all single species of the circuit. Because we focus on Watson– and in combination with transcription regulation to create complex Crick interactions, we implicitly assume that the secondary struc- logic circuits. Our results demonstrate that a computational meth- ture already determines a stable three-dimensional architecture odology based on first-principles can be used to engineer interact- (29). We also hypothesize that the interaction between two spe- ing RNAs with allosteric behavior in living cells. cies is nucleated by their unpaired nucleotides (Fig. 1A). In this interaction model, an initial intermolecular pairing driven by a post-transcriptional regulation ∣ evolutionary computation ∣ small sequence (toehold or seed site) nucleates a downhill reac- computational RNA design ∣ RNA synthetic biology tion where the size of the intermolecular pairs (represented as a reaction coordinate in Fig. 1A) rapidly increases until the hybri- he understanding of RNA interactions in living cells and their dization ends. Initially, the algorithm assigns random nucleotides subsequent exploitation as regulators is providing new syn- to the sequences of each RNA species, while respecting their T ΔGi thetic biology applications (1). RNA regulation is being studied designated secondary structure, which ensures a low form by A from natural systems by the analysis of the interactions of small solving an inverse folding problem (Fig. 1 ). It then explores RNAs (sRNAs) with messenger RNAs (mRNAs) (2), proteins the space of allowed nucleotide sequences by using an objective (3) or molecules (4). However, it is also possible to follow a for- function and a Monte Carlo simulated annealing (MCSA) search B ward engineering approach and attempt the de novo design of algorithm (30) (Fig. 1 ). The convergence of our algorithm is RNA regulators. Rational design techniques have been applied, coupled to the existence of large networks of neutral paths (of in both prokaryotes and eukaryotes, for repression or activation common structure) able to connect highly different sequences of translation (5–10), mRNA degradation (11), riboswitches and (31), and the algorithm scatters several initial random sequences ribozymes (12–14), transcription attenuation (15–18), and scaf- along these networks to perform an efficient exploration of the folding (19). On the other hand, computational methods allowed sequence space. In addition, to enlarge such neutral paths and designing nucleic-acid-based logic circuits in vitro (20–23), in- then improve the optimization, we allow non-neutral mutations cluding the redesign of allosteric ribozymes (23), hence, challen- perturbing, up to three base pairs, the structures specified for the ging the current knowledge of nucleic acid structure and function. single species. Previous RNA design approaches, however, have been mostly de- The objective function is defined to minimize by MCSA two i veloped to work with in vitro systems or required incorporating competing design goals: ( ) free energy of complex formation ii fragments of natural sequences. and ( ) activation energy of complex formation. The first term We now propose a fully automated sequence selection meth- accounts for the free energy difference between the interacting odology to design general circuits based on RNA interactions to and free species for all possible interactions in the circuit ΔG operate in living cells. Previous computational methodologies re- ( form). For the second term we consider a magnitude related lied on the dominance of the Watson–Crick interactions (24), but they were not adapted to in vivo operations where RNA could be Author contributions: G.R., T.E.L., and A.J. designed research, performed research, very unstable as it occurs in bacteria. Our approach consists of contributed new reagents/analytic tools, analyzed data, and wrote the paper. stabilizing RNA molecules by enforcing a given structure, as done The authors declare no conflict of interest. – in the inverse folding problem (25 27), together with targeted This article is a PNAS Direct Submission. interactions and conformational changes. The analysis of natural 1G.R. and T.E.L. contributed equally to this work. systems unveils another challenge: The kinetics of RNA interac- 2To whom correspondence should be addressed. E-mail: alfonso.jaramillo@issb tions is rate-limited by the initial interaction of solvent-accessible .genopole.fr. BIOPHYSICS AND nucleotides from each binding partner, as illustrated in the kis- This article contains supporting information online at www.pnas.org/lookup/suppl/ sing-loop mechanism (2). To make manageable the computa- doi:10.1073/pnas.1203831109/-/DCSupplemental. COMPUTATIONAL BIOLOGY www.pnas.org/cgi/doi/10.1073/pnas.1203831109 PNAS ∣ September 18, 2012 ∣ vol. 109 ∣ no. 38 ∣ 15271–15276 Downloaded by guest on September 29, 2021 A Unfolded state Reaction scheme of modified models harnessing experimental data (33) can be em- RNA interaction RBS ployed to improve the prediction of the functionality of the ribor- Transition state egulatory devices. RBS G i form Engineering Synthetic Riboregulation. As a case study for our meth- i G act odology we chose the challenging problem of engineering a syn- Free Energy thetic riboregulator able to trans-activate the translation of a Intermolecular Individual Gform cis SI Appendix folding state -repressed gene ( , Fig. S4). This problem provides folding state RBS an assessment of the generality of our methodology because it requires the design of two RNA species that experience a confor- 0 dinteract Reaction Coordinate mational change after their mutual interaction. In this case, the quantification of the regulatory activity by differential protein ex- B Optimization Individual folding pression already results in a characterization of the RNA inter- scheme action. Furthermore, our riboregulators enlarge the repertoire of Scoring RBS exogenous activators of bacterial gene expression for creating Sequences G = G + G interact act form gene circuits with complex dynamics (34). To design such ribor- sRNA Subjected to allosteric regulation Gconstr RBS Intermolecular folding << egulatory devices, we implemented a natural mechanism (35, 36), 5’-UTR mRNA 0 1 RBS = + ’ Gscore Ginteract (1 ) Gconstr where the sRNA activates translation by binding to the 5 -un- translated region (UTR) of a given mRNA. This binding is de- vised to produce a conformational change in the cis-repressed Mutations ribosome binding site (RBS) to become exposed to the solvent and, therefore, enable the docking of the 16S ribosomal unit dur- FinP-like ing translation initiation (SI Appendix, Fig. S4). In our designs, we C T1 T2 T5 30 20 90 C1 SokC-like 40 80 ’ 40 60 maintained as fixed the RBS sequence in the 5 -UTR. We se- 30 T4 50 lected an mRNA coding for a fluorescent protein (GFP) to facil- 100 70 CU G C itate the experimental characterization. The secondary structures A A 20 10 AUU GUC A A U 30 60 U A 50 30 50 110 A 70 A U A U ’ 20 A C G for both sRNA and 5 -UTR of the mRNA were specified as U G GFP 40 A U G G C G U 120 A 50 A U C U A structural constraints (Fig. 1 ), although these specifications are G 40 1 79 C G 1 10 55 G U U A G U G A T3 G C G U not strict and some perturbations in the structures are tolerated. G 40 20 60 U A DsrA-like A U 130 40 A U A A U U G A G C G U For the sRNA, we picked several known structures to assay the A 705050 U A G C 10 U A A U U G 140 30 U G versatility of our methodology: the predicted structures of the 10 G C 80 A U U A AUG 30 G C 1 10 7071 C G U A bacterial sRNAs SokC, FinP, and DsrA (T1, T2, and T3, respec- 50 G U 20 G C 1 55 ACAUCAAACGUCCAGCGCCUG UUUUUUU 150 ’ 110 20156 tively) and two artificial structures (T4 and T5) (6, 8).