Nucleic Acids Research, 2019
doi: 10.1093/nar/gkz865

SURVEY AND SUMMARY

Retrons and their applications in genome engineering

Anna J. Simon*, Andrew D. Ellington and Ilya J. Finkelstein*

Center for Systems and Synthetic Biology and Department of Molecular Biosciences, University of Texas at Austin, Austin, Texas 78712, USA

Received April 23, 2019; Revised September 19, 2019; Editorial Decision September 23, 2019; Accepted October 02, 2019

*To whom correspondence should be addressed. Tel: +1 512 475 6172; Email: [email protected]
Correspondence may also be addressed to Anna J. Simon. Email: [email protected]

© The Author(s) 2019. Published by Oxford University Press on behalf of Nucleic Acids Research. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

ABSTRACT

Precision genome editing technologies have transformed modern biology. These technologies have arisen from the redirection of natural biological machinery, such as bacteriophage lambda proteins for recombineering and CRISPR nucleases for eliciting site-specific double-strand breaks. Less well-known is a widely distributed class of bacterial retroelements, retrons, that employ specialized reverse transcriptases to produce noncoding intracellular DNAs. Retrons’ natural function and mechanism of genetic transmission have remained enigmatic. However, recent studies have harnessed their ability to produce DNA in situ for genome editing and evolution. This review describes retron biology and function in both natural and synthetic contexts. We also highlight areas that require further study to advance retron-based precision genome editing platforms.

INTRODUCTION

The ability to precisely edit genomic loci has been a long-standing goal of biotechnology. Targeted genome editing is being exploited to probe genotype-phenotype relationships, repair disease-causing alleles, develop designer organisms with modified genomes, and biologically record intra- and extracellular conditions. Classically, strain improvement has been achieved either via non-specific mutagenesis of genomes (1,2) or via targeted modifications of specific genes expressed on plasmids (3,4). Recombineering (5) and CRISPR-based genome editing (6,7) have dramatically increased the ease and throughput of targeting individual genomic loci. Multiplexing and iterating recombineering- and CRISPR-based editing technologies have enabled high-throughput exploration of genotype–phenotype relationships and continuous evolution of synthetic genomes with novel properties, for example in Multiplex Automated Genome Engineering (‘MAGE’) (8), Yeast Oligo-mediated Genome Engineering (‘YOGE’) (9), and iterative CRISPR EnAbled Trackable genome Engineering (‘iCREATE’) (10).

Retrons are a potentially useful tool for targeted genomic engineering because they produce intracellular DNAs via reverse transcription of noncoding structural RNAs (11,12). While the natural biology of these widespread retroelements is largely unknown, retrons have recently been repurposed as in situ sources of donor DNA for both recombineering- and CRISPR-based genome editing applications (13–15). This production of intracellular DNA has so far enabled recording of cellular states (13), high-throughput genetic editing (14), and targeted genome mutation (15).

Here, we review retron biology and retron-enabled genome engineering applications. We conclude with a discussion of future prospects and challenges in retron-based genome editing.

RETRON BIOLOGY

Retron discovery and biochemistry

Retrons were initially discovered in 1984 from the presence of short satellite RNA–DNA molecules in bacterial DNA preparations (16). Gel electrophoresis of phenol-extracted chromosomal DNA from Myxococcus xanthus and Stigmatella aurantiaca indicated a secondary satellite band with a mobility of ∼120–190 base pairs (bp) (16). Subsequent studies revealed that these bands, termed multicopy single-stranded DNAs (msDNAs), are composed of one strand of structured RNA, the ‘msr,’ connected to one strand of DNA, the ‘msd.’ The msr and msd molecules are joined by a 2′-5′ phosphodiester bond between a priming guanosine, located within a conserved AGC sequence in the msr, and the phosphate at the 5′ end of the msd; this bond covalently links the RNA and DNA strands into a single branched molecule (Figure 1). The msr and msd are encoded in a compact, contiguous transcriptional cassette that also includes a specialized reverse transcriptase (RT); we refer to this cassette as a whole as the retron. The discovery of retron RTs was particularly notable because they were the first discovered bacterially-encoded RTs (17). In vitro studies demonstrated that Escherichia coli retron RTs and their paired msrs were both necessary and sufficient to produce msDNA with the characteristic 2′-5′ linkage (18,19).
To date, 38 distinct retrons have been identified experimentally by their production of msDNA; 16 of these, which we subsequently refer to as the ‘experimentally validated retrons’, contain fully annotated and experimentally-validated msr-msd-RT cassettes. Table 1 summarizes all experimentally-characterized retrons reported to date, along with their host organisms. Bioinformatic approaches have also identified hundreds of putative retrons that have not been characterized experimentally (20–22).

The rapid discovery of hundreds of putative retrons has prompted us to develop a systematic naming convention for these retroelements. Currently, retrons are named by the first letters of their genus name, species name, and the length of reverse-transcribed DNA in their native msDNA sequence. This convention was initially developed by Inouye et al. in 1989, who termed a retron isolated from E. coli that contained 86 bases of reverse-transcribed DNA ‘Ec86’ (23).
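
As a minimal sketch of the historical scheme just described, the Python snippet below builds a name from the first letter of the genus, the first letter of the species, and the msd length. The function name `legacy_retron_name` and its arguments are illustrative assumptions and do not come from any published tool.

```python
def legacy_retron_name(genus: str, species: str, msd_length: int) -> str:
    """Return a historical-style retron name, e.g. 'Ec86'.

    Hypothetical helper for illustration only: genus initial (upper case),
    species initial (lower case), then the number of reverse-transcribed bases.
    """
    genus, species = genus.strip(), species.strip()
    if not genus or not species:
        raise ValueError("genus and species must be non-empty")
    if msd_length <= 0:
        raise ValueError("msd_length must be a positive number of bases")
    return f"{genus[0].upper()}{species[0].lower()}{msd_length}"


# The E. coli retron with 86 reverse-transcribed bases mentioned in the text.
print(legacy_retron_name("Escherichia", "coli", 86))
```

For the example in the text, this yields ‘Ec86’.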
However, this nomenclature is not easy to adapt for large-scale, systematic retron discovery for three reasons: (i) the length of the msd of most experimentally validated and putative retrons is unknown; (ii) two letters are insufficient to uniquely identify the genus and species of many retron-encoding bacteria; and (iii) the convention does not distinguish the entire retron from the RT. Therefore, we propose a more general naming convention that is adapted from the restriction enzyme literature (Table 1) (24).

I. Retrons are named as the first letter of their genus, first two letters of their species, and an Arabic numeral corresponding to the order in which they were discovered relative to other retrons from the same species, or from a different species with the same first genus letter and first two species letters.
II. The prefix ‘Retron-’ is added to signify that this object is a retron, and the prefixes ‘RT-’, ‘msr’, ‘msd’ and ‘msDNA’ are added when referring to the reverse transcriptase enzyme, msr, msd and msDNA, respectively.

Figure 1. Retron structure and organization. (A) Retrons are encoded as a single polycistronic transcriptional cassette containing a promoter, an msr (blue) including a conserved priming guanosine residue (green), an msd (red), self-complementary regions (yellow), and a reverse transcriptase (purple, not to scale) (12). The black arrow shows the direction of the coding strand for the retron cassette; the red arrow shows the direction

Under the new convention, ‘Ec86’ is re-named ‘Retron-Eco1’ and the corresponding RT is ‘RT-Eco1.’ For clarity and continuity, the rest of this review will use the new convention and will also include the historical names in parentheses (e.g.