Beyond Editing to Writing Large Genomes
Total Page:16
File Type:pdf, Size:1020Kb
PERSPECTIVES STUDY DESIGNS the technological advancements in genome OPINION engineering with those in both DNA synthesis and DNA sequencing, the key Beyond editing to writing components that are necessary to approach large-scale genome engineering have fallen into place. large genomes In this Opinion article, we describe the motivation for and current state of Raj Chari and George M. Church large-scale genome engineering and Abstract | Recent exponential advances in genome sequencing and engineering compare strategies for how it can be achieved. We provide an overview of key technologies have enabled an unprecedented level of interrogation into the genome-engineering technologies and how impact of DNA variation (genotype) on cellular function (phenotype). Furthermore, they can be utilized for large-scale editing. these advances have also prompted realistic discussion of writing and radically Finally, we detail the key features that will re‑writing complex genomes. In this Perspective, we detail the motivation for be necessary for a robust genome-writing large‑scale engineering, discuss the progress made from such projects in bacteria platform and provide a roadmap to engineer a large genome. As seen for whole-genome and yeast and describe how various genome‑engineering technologies will reading, the cost and quality of methods for contribute to this effort. Finally, we describe the features of an ideal platform and genome writing are improving by factors of provide a roadmap to facilitate the efficient writing of large genomes. millions, and rigorous, comprehensive tests for the accuracy of genome writing will shift from being a luxury to becoming essential DNA reading (sequencing) and writing can be performed in the same space, at and routine. (synthesis) are intimately connected: the same cost and with the same effort meaningful use of either requires the as a single reaction). Why should we engineer genomes? other. For example, the most widely Although much of the attention has When we refer to genome engineering used sequencing platforms read DNA been on the progress in DNA reading, and genome editing, our use of ‘genome’ during the synthesis of a complementary namely, advances in DNA sequencing specifically means large-scale engineering strand1–3. Targeted DNA reading, such technologies2,3, the improvements in or alteration of several loci across the as exome sequencing, depends on vast writing, which can largely be attributed to genome. We feel that such terminology libraries of synthetic oligonucleotides for harnessing and stimulating natural cellular is unhelpfully misapplied when referring target capture, and the annotation and processes such as homologous recombination to single-locus applications, such as using understanding of genome reads benefit (HR) to enable genome modification, genome engineering to refer to editing that greatly from genome-scale synthesis and should not be overlooked. From the initial is done directly in the genome rather than testing. Furthermore, meaningful use of work of describing HR at a very low level in a plasmid or transgenic setting or using genome engineering generally requires in higher-order eukaryotic cells to current the term genome editing when the changes whole-genome sequencing to design work, where we can achieve reasonably high are neither precise editing nor genome strategies for avoiding off-target writing efficiencies of HR in stem cells4–7 (FIG. 1), our scale. The question for this section is why and to verify the quality of the final genome ability to direct targeted changes in genomes should we design and generate whole (or product. Genome editing and writing have has opened a large number of avenues that nearly whole) genomes? An initiative known benefited from diverse nanomachines previously may not have seemed possible. as Genome Project–write has recently harvested by reading the biosphere. Since One of the main areas that can be advocated large-scale genome engineering9. 1975, reading and writing platforms have realistically pursued is the ability to perform The motivation and goals are well aligned exhibited increases in throughput of three- large-scale editing of genomes, even for both small-scale and large-scale genome million-fold (for reading; see https://www. multiplexing libraries of billions of changes8. engineering. With respect to non-human genome.gov/sequencingcostsdata/) and From an industrial perspective to a health organisms, bacteria are common chassis for one-billion-fold (for writing of raw oligo- care perspective and to researchers aiming various bioproduction applications, from nucleotides)1, with a steep increase at an for fundamental understanding of how uses in bioremediation to pharmaceuticals exponential rate since 2004. The technical biological systems work, the motivations to bioenergy10–12 (FIG. 2). From a purely progress of both reading and writing has to pursue ambitious genome-engineering functional perspective, re-engineering the depended not only on mere automation of projects originate from a diverse group of genomes of particular strains could render human protocols but also on rethinking the stakeholders, and the results of these projects the bacteria less susceptible to viruses and/ whole process to permit miniaturization and could have considerable positive impact or more efficient at producing a biomolecule multiplexing (in which millions of reactions on multiple aspects of society. Combining of interest13,14. Similarly, yeast strains, plants NATURE REVIEWS | GENETICS ADVANCE ONLINE PUBLICATION | 1 ©2017 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved. PERSPECTIVES The proportions and persistence of edited Bacteria Yeast Mammalian Project ongoing Range of recombination rates and unedited cells are particularly important 128–129,136 in diseases such as cancer, where residual phiC31 integrase 7 100 27,133 10 Cre recombinase50,125 CRISPR unedited, defective cells may have acquired 130,137 a selective growth advantage over edited ZFN 6 80 10 cells that could confer them with the ability, TALEN131–132,142 Richardson 126–127,135,138 (Sc2.0) number of edited sites Total over time, to eliminate edited cells from 10 I-SceI 105 the population. Ostrov77 It is also clear that many normal (RC57) 104 and disease phenotypes are governed Annaluru79 1 Dymond78 by concurrent genetic variants across (Sc2.0) (Sc2.0) 103 multiple alleles throughout the genome. 75 Hutchison Large-scale genome editing would enable, Isaacs13 (JCVI-syn3.0) 102 in the immediate term, the understanding Recombination rate (%) Recombination rate 0.1 (ReColi) Yang70 of complex genotype–phenotype 10 associations. From a more ambitious Endogenous Wang134 HR4 Capecchi–Smithies perspective, engineering human cells to be 0.01 1 less susceptible to disease and generating 1985 1990 1995 2000 2005 2010 2015 2020 2025 tissues to eventually produce organs of Year high value are additional applications Figure 1 | Timeline of gene targeting. The timeline depicts progress in geneNature targeting Reviews rates | Genetics through that would require large-scale genome recombination (red) and the number of edited sites in a single genome (blue). Recombination rates editing. Thus, it is likely that the most 4,27,28,50,58,63,64,125–147 with dashed lines denote ranges observed in the published literature . In the mid to immediate impact from editing human late 1980s, it was observed that in the presence of exogenous DNA, homologous recombination (HR) cells will emanate from small-scale could occur at a very low frequency (0.01%). Subsequently, it was shown that by inducing a double‑ engineering, whereas large-scale strand break (DSB), the rates of HR could be dramatically increased. Thus, different modalities have been discovered and employed to create DSBs at higher frequencies, from meganucleases (such as approaches will be crucial for obtaining a I‑SceI) to zinc‑finger nucleases (ZFNs) and transcription activator‑like effector nucleases (TALENs) to better understanding of phenotypes with the current CRISPR‑associated nucleases. In addition, approaches that do not use DSB‑stimulated HR, complex genetics. such as Cre recombinase or phiC31 integrase, have also shown tremendous promise for DNA replace‑ Tactically, if the number of desired ment but are restricted by the limited range of sequences that can be targeted. As a corollary, in changes is large and the changes span combination with the progress in DNA synthesis technologies, our improved ability to edit DNA has different areas of the genome, there is permitted increases in the total number of modifications being made in a single genome. In the past, a compelling reason to approach this 1 to 2 edits were typically made in a single genome. Ambitious projects such as RC57 and Sc2.0 have problem in a completely synthetic manner. aimed to make up to four or even five orders of magnitude more edits in a single genome. From a practical perspective, this approach is currently limited to genomes that are and livestock could be thought of in the mutations are the likely causes and drivers megabases in size (at most), such as those same manner, and re-engineering their of specific diseases16–22. Subsequently, in unicellular organisms. Here, rather than genomes could ensure the highest and most from a therapeutic perspective, repair of modifying an existing genomic template, efficient productivity. For some of these such defects would probably necessitate designing and building the genome of particular entities,