PERSPECTIVES

STUDY DESIGNS the technological advancements in genome OPINION engineering with those in both DNA synthesis and DNA sequencing, the key Beyond editing to writing components that are necessary to approach large-scale genome engineering have fallen into place. large genomes In this Opinion article, we describe the motivation for and current state of Raj Chari and George M. Church large-scale genome engineering and Abstract | Recent exponential advances in genome sequencing and engineering compare strategies for how it can be achieved. We provide an overview of key technologies have enabled an unprecedented level of interrogation into the genome-engineering technologies and how impact of DNA variation (genotype) on cellular function (phenotype). Furthermore, they can be utilized for large-scale editing. these advances have also prompted realistic discussion of writing and radically Finally, we detail the key features that will re‑writing complex genomes. In this Perspective, we detail the motivation for be necessary for a robust genome-writing large-scale engineering, discuss the progress made from such projects in bacteria platform and provide a roadmap to engineer a large genome. As seen for whole-genome and yeast and describe how various genome-engineering technologies will reading, the cost and quality of methods for contribute to this effort. Finally, we describe the features of an ideal platform and genome writing are improving by factors of provide a roadmap to facilitate the efficient writing of large genomes. millions, and rigorous, comprehensive tests for the accuracy of genome writing will shift from being a luxury to becoming essential DNA reading (sequencing) and writing can be performed in the same space, at and routine. (synthesis) are intimately connected: the same cost and with the same effort meaningful use of either requires the as a single reaction). Why should we engineer genomes? other. For example, the most widely Although much of the attention has When we refer to genome engineering used sequencing platforms read DNA been on the progress in DNA reading, and genome editing, our use of ‘genome’ during the synthesis of a complementary namely, advances in DNA sequencing specifically means large-scale engineering strand1–3. Targeted DNA reading, such technologies2,3, the improvements in or alteration of several loci across the as exome sequencing, depends on vast writing, which can largely be attributed to genome. We feel that such terminology libraries of synthetic oligonucleotides for harnessing and stimulating natural cellular is unhelpfully misapplied when referring target capture, and the annotation and processes such as to single-locus applications, such as using understanding of genome reads benefit (HR) to enable genome modification, genome engineering to refer to editing that greatly from genome-scale synthesis and should not be overlooked. From the initial is done directly in the genome rather than testing. Furthermore, meaningful use of work of describing HR at a very low level in a or transgenic setting or using genome engineering generally requires in higher-order eukaryotic cells to current the term genome editing when the changes whole-genome sequencing to design work, where we can achieve reasonably high are neither precise editing nor genome strategies for avoiding off-target writing efficiencies of HR in stem cells4–7 (FIG. 1), our scale. The question for this section is why and to verify the quality of the final genome ability to direct targeted changes in genomes should we design and generate whole (or product. Genome editing and writing have has opened a large number of avenues that nearly whole) genomes? An initiative known benefited from diverse nanomachines previously may not have seemed possible. as Genome Project–write has recently harvested by reading the biosphere. Since One of the main areas that can be advocated large-scale genome engineering9. 1975, reading and writing platforms have realistically pursued is the ability to perform The motivation and goals are well aligned exhibited increases in throughput of three- large-scale editing of genomes, even for both small-scale and large-scale genome million-fold (for reading; see https://www. multiplexing libraries of billions of changes8. engineering. With respect to non-human genome.gov/sequencingcostsdata/) and From an industrial perspective to a health organisms, bacteria are common chassis for one-billion-fold (for writing of raw oligo- care perspective and to researchers aiming various bioproduction applications, from nucleotides)1, with a steep increase at an for fundamental understanding of how uses in bioremediation to pharmaceuticals exponential rate since 2004. The technical biological systems work, the motivations to bioenergy10–12 (FIG. 2). From a purely progress of both reading and writing has to pursue ambitious genome-engineering functional perspective, re‑engineering the depended not only on mere automation of projects originate from a diverse group of genomes of particular strains could render human protocols but also on rethinking the stakeholders, and the results of these projects the bacteria less susceptible to viruses and/ whole process to permit miniaturization and could have considerable positive impact or more efficient at producing a biomolecule multiplexing (in which millions of reactions on multiple aspects of society. Combining of interest13,14. Similarly, yeast strains, plants

NATURE REVIEWS | GENETICS ADVANCE ONLINE PUBLICATION | 1 ©2017 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved.

PERSPECTIVES

The proportions and persistence of edited Bacteria Yeast Mammalian Project ongoing Range of recombination rates and unedited cells are particularly important

128–129,136 in diseases such as cancer, where residual phiC31 integrase 7 100 27,133 10 Cre recombinase50,125 CRISPR unedited, defective cells may have acquired 130,137 a selective growth advantage over edited ZFN 6 80 10 cells that could confer them with the ability, TALEN131–132,142 Richardson 126–127,135,138 (Sc2.0) Total number of edited sites over time, to eliminate edited cells from 10 I-SceI 105 the population. Ostrov77 It is also clear that many normal (RC57) 104 and disease phenotypes are governed Annaluru79 1 Dymond78 by concurrent genetic variants across (Sc2.0) (Sc2.0) 103 multiple alleles throughout the genome. 75 Hutchison Large-scale genome editing would enable, Isaacs13 (JCVI-syn3.0) 102 in the immediate term, the understanding Recombination rate (%) Recombination rate 0.1 (ReColi) Yang70 of complex genotype–phenotype 10 associations. From a more ambitious Endogenous Wang134 HR4 Capecchi–Smithies perspective, engineering human cells to be 0.01 1 less susceptible to disease and generating 1985 1990 1995 2000 2005 2010 2015 2020 2025 tissues to eventually produce organs of Year high value are additional applications Figure 1 | Timeline of gene targeting. The timeline depicts progress in geneNature targeting Reviews rates | Genetics through that would require large-scale genome recombination (red) and the number of edited sites in a single genome (blue). Recombination rates editing. Thus, it is likely that the most 4,27,28,50,58,63,64,125–147 with dashed lines denote ranges observed in the published literature . In the mid to immediate impact from editing human late 1980s, it was observed that in the presence of exogenous DNA, homologous recombination (HR) cells will emanate from small-scale could occur at a very low frequency (0.01%). Subsequently, it was shown that by inducing a double- engineering, whereas large-scale strand break (DSB), the rates of HR could be dramatically increased. Thus, different modalities have been discovered and employed to create DSBs at higher frequencies, from meganucleases (such as approaches will be crucial for obtaining a I‑SceI) to zinc-finger nucleases (ZFNs) and transcription activator-like effector nucleases (TALENs) to better understanding of phenotypes with the current CRISPR-associated nucleases. In addition, approaches that do not use DSB-stimulated HR, complex genetics. such as Cre recombinase or phiC31 integrase, have also shown tremendous promise for DNA replace‑ Tactically, if the number of desired ment but are restricted by the limited range of sequences that can be targeted. As a corollary, in changes is large and the changes span combination with the progress in DNA synthesis technologies, our improved ability to edit DNA has different areas of the genome, there is permitted increases in the total number of modifications being made in a single genome. In the past, a compelling reason to approach this 1 to 2 edits were typically made in a single genome. Ambitious projects such as RC57 and Sc2.0 have problem in a completely synthetic manner. aimed to make up to four or even five orders of magnitude more edits in a single genome. From a practical perspective, this approach is currently limited to genomes that are and livestock could be thought of in the mutations are the likely causes and drivers megabases in size (at most), such as those same manner, and re‑engineering their of specific diseases16–22. Subsequently, in unicellular organisms. Here, rather than genomes could ensure the highest and most from a therapeutic perspective, repair of modifying an existing genomic template, efficient productivity. For some of these such defects would probably necessitate designing and building the genome of particular entities, genome engineering small-scale genome editing only, whereby interest completely from chemically could address global problems such as food a limited number of genomic targets would synthesized DNA would not only provide shortages and environmental deterioration, need to be altered in order to restore normal the most efficient way to incorporate many for example, by altering sets of genes function. Whether the alteration becomes changes into a desired DNA target but in metabolic pathways to permit either a lasting cure for the disease will largely also enable complete control of the final increased or reduced production of specific depend on the proportion of target cells in sequence. The most compelling cases so biomolecules. Thus, the ability to perform which editing is achieved and the persistence far have included extensive changes in the genome engineering at a larger scale of the edited cells in their biological niche. genetic code (for biocontainment, efficient would enhance our chances of achieving these goals. Genome-wide association studies Glossary (GWAS) along with large-scale Homologous recombination CRISPR-associated nuclease (such as Cas9 or Cpf1) high-throughput sequencing of exomes (HR). Exchange of sequence between two highly recognizes before making its DNA break. Depending and genomes have uncovered hundreds, similar DNA molecules. on the CRISPR system used, this sequence must either flank the 3ʹ end or 5ʹ end of the sequence if not thousands, of small and large DNA Homology-directed repair of interest. sequence variations associated with a given (HDR). The process by which double-strand breaks (DSBs) phenotype, including medically important in DNA are repaired when in the presence of an additional Reading human traits15. For example, in numerous DNA moiety that has homology to the DNA sequence The use of massively parallel sequencing technologies to genetic disorders, studies are advancing surrounding the DSB. decipher nucleotide composition. from correlation to cause and effect (for Protospacer adjacent motif Writing instance, via genome engineering in animals (PAM). A short, characteristic sequence (typically The use of either genome-editing tools and/or DNA or organoids) to prove that particular between 3 and 8 nucleotides in length) that the synthesis to make desired changes in DNA sequence.

2 | ADVANCE ONLINE PUBLICATION www.nature.com/nrg ©2017 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved. ©2017 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved.

PERSPECTIVES

Applications from engineered organisms

Increase production Bioenergy Chemicals Proteins Vaccines

HO A B D NH2

C E F J HN O OH G H

e.g., biofuels, e.g., pharmaceuticals, e.g., antibodies, e.g., antigen biomass fragrances recombinant production proteins

Metabolic engineering Cleaning of water

A B D CO2 or CH4 removal C E F J Pollutant removal G H Environmental remediation

Engineer traits Increased size

Robust growth

Disease resistant Food supply

Generate Engineer variants stem cells 3D printing or xenotransplantation

Disease Genotype– Organ modelling phenotype replacement

Bacteria Yeast Plants Livestock/fish Mammals

Organism

Metabolic pathways Regulatory or disease-associated genetic variants A B D Wild-type Engineered Target of large-scale engineering C E F J

G H Gene/regulatory region

Figure 2 | Potential applications of large-scale genome engineering of variants aims to make these sources grow more robustlyNature Reviews (that is, tolerant| Genetics to different organisms. For applications such as bioenergy, chemicals, pro‑ different environmental conditions) and to be less susceptible to disease teins and vaccines, the goal is to increase production by engineering met‑ to minimize loss. Finally, for mammalian genome engineering, some of the abolic pathways in particular organisms. Similarly, engineering of the main key applications would be for deciphering complex genotype– metabolic pathways in these organisms could be used for environmental phenotype relationships, modelling disease and producing cells of interest. remediation, from maintaining clean water supplies to removing pollutants Subsequently, these engineered cells could be built into tissues and organs

and CO2 from the air. While for applications such as food supply, in addition ex vivo using approaches such as 3D printing or, if the engineering was to making plants and livestock larger, large-scale engineering of regulatory performed in pigs, organs could be harvested for xenotransplantation.

NATURE REVIEWS | GENETICS ADVANCE ONLINE PUBLICATION | 3 ©2017 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved. ©2017 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved.

. d e v r e s e r s t h g i r l l A . e r u t a N r e g n i r p S f o t r a p , d e t i m i L s r e h s i l b u P n a l l i m c a M 7 1 0 2 © . d e v r e s e r s t h g i r l l A . e r u t a N r e g n i r p S f o t r a p , d e t i m i L s r e h s i l b u P n a l l i m c a M 7 1 0 2 ©

4 4 www.nature.com/nrg PUBLICATION ONLINE ADVANCE |

efficiencies ranging from 10 from ranging efficiencies to 2% in in 2% to

. structure each underneath provided are numbers accession (PDB) Bank Data

−5 149–157

genome. SSOs alone have demonstrated demonstrated have alone SSOs genome. , and Protein Protein and , including multiple sgRNAs. Protein crystal structures were generated using Chimera using generated were structures crystal Protein sgRNAs. multiple including

148

by the Cas9 protein and, due to this simplicity, enables simultaneous targeting of multiple sites by by sites multiple of targeting simultaneous enables simplicity, this to due and, protein Cas9 the by the SSO will be incorporated in the host host the in incorporated be will SSO the

forming DSB is made made is DSB forming - end - blunt a which in location the specifies (sgRNA) RNA guide - single A nologies. a template, and the change encoded by by encoded change the and template, a

‑ tech engineering - genome current among promise most the demonstrated has system CRISPR–Cas9 percentage of instances, SSOs will be used as as used be will SSOs instances, of percentage

specific disruption within a sequence of interest. The The interest. of sequence a within disruption specific - site enable which (RT), transcriptase reverse

during DNA replication, in a very small small very a in replication, DNA during

c RNA-based recognition. Group II introns encode both a self-catalysing RNA moiety and a a and moiety RNA self-catalysing a both encode introns II Group recognition. RNA-based limited. |

homology to the target site. Mechanistically, Mechanistically, site. target the to homology

DNA to be exchanged. Currently, the repertoire of different recombinase recognition sites is quite quite is sites recognition recombinase different of repertoire the Currently, exchanged. be to DNA

increase stability and have a high degree of of degree high a have and stability increase

of short inverted repeats (IRs), flanking both the target site and donor DNA molecule, are required for for required are molecule, DNA donor and site target the both flanking (IRs), repeats inverted short of

in length, may contain modified bases to to bases modified contain may length, in

specific recombinases (SSRs) mediate editing with (Ser) or without (Tyr) using DSBs. A pair pair A DSBs. using (Tyr) without or (Ser) with editing mediate (SSRs) recombinases specific - site (Ser)

cleotides range between 20 and 200 bases bases 200 and 20 between range cleotides facilitate DNA editing. A new custom protein is required for each targeted site. Tyrosine (Tyr) and serine serine and (Tyr) Tyrosine site. targeted each for required is protein custom new A editing. DNA facilitate

and substitutions. Typically, these oligonu these Typically, substitutions. and - end forming DSBs to to DSBs forming end - staggered utilize meganucleases and (TALENs) nucleases effector like - activator

create small, site-specific insertions, deletions deletions insertions, site-specific small, create b Protein-based recognition. Zinc-finger nucleases (ZFNs), transcription transcription (ZFNs), nucleases Zinc-finger recognition. Protein-based manner. specific - site a in |

strand break (DSB) (DSB) break strand - double a direct to protein Ago the with along sequence DNA guide 24‑nucleotide cleotides (SSOs) alone can also be used to to used be also can alone (SSOs) cleotides

until the result is a single strain with all desired changes. The Argonaute (Ago) system utilizes a a utilizes system (Ago) Argonaute The changes. desired all with strain single a is result the until Transfection of single-stranded oligonu single-stranded of Transfection -

. This cycle can be repeated repeated be can cycle This . direction single a in be to engineered is transfer DNA of direction the that

for HR in human cells. in HR for 13

edited genomes (strains), one strain is designated a donor strain and the other a recipient strain such such strain recipient a other the and strain donor a designated is strain one (strains), genomes edited

eventual use as donor DNA molecules molecules DNA donor as use eventual

engineering (CAGE) enables the hierarchical assembly of parallel edited genomes. For every pair of of pair every For genomes. edited parallel of assembly hierarchical the enables (CAGE) engineering

propagate large pieces of DNA in bacteria for for bacteria in DNA of pieces large propagate

. Conjugative assembly genome genome assembly Conjugative . locations genomic multiple at changes desired of incorporation the

8

these technologies can be used to edit and and edit to used be can technologies these

beta and an (Exo). Designer oligonucleotides, along with these two proteins, can enable enable can proteins, two these with along oligonucleotides, Designer (Exo). exonuclease an and beta

has yielded limited success limited yielded has . Moreover, Moreover, .

λ stranded binding protein protein binding stranded - single the namely, , phage the from machinery utilizes (MAGE) engineering 32

in human cells cells human in β Red of version optimized

a nanomachines. DNA-editing Figure 3 3 Figure DNA-based recognition. Multiplex automated genome genome automated Multiplex recognition. DNA-based | |

, and the use of an an of use the and ,

◀ demonstrated been has 31

for eukaryotic cells, some conjugal transfer transfer conjugal some cells, eukaryotic for

on opposite DNA strands. These systems systems These strands. DNA opposite on optimized fully been not has approaches and there are very few Tyr/Ser SSRs that that SSRs Tyr/Ser few very are there and

through the action of FokI nuclease domains domains nuclease FokI of action the through these in used machinery the Although site in relatively close proximity is limiting, limiting, is proximity close relatively in site

domains and generate staggered DSBs DSBs staggered generate and domains . DNA homologous of presence the in occurring copies of a very specific target target specific very a of copies occurring

30

manner through their DNA-binding DNA-binding their through manner recombination of levels high stimulate can eukaryotes, the requirement of two naturally naturally two of requirement the eukaryotes,

bp in size, in a sequence-specific sequence-specific a in size, in bp 40 to 25 cells, bacterial into imported when and, highly efficient in both prokaryotes and and prokaryotes both in efficient highly

recognize DNA target sites, ranging from from ranging sites, target DNA recognize phage from originates , recET to similar this mechanism of genome engineering is is engineering genome of mechanism this

29

(Tyr/Ser SSRs) SSRs) (Tyr/Ser system, red ‑ λ the Briefly, multiplexability. . Although Although . present DNA donor the with TALENs and ZFNs .

) FIG. 3b ( 51

tyrosine/serine site-specific recombinases recombinases site-specific tyrosine/serine of degree high a demonstrated have and ligation ligation ‑ re and cleavage concerted require

nucleases (TALENs), meganucleases and and meganucleases (TALENs), nucleases prokaryotes in used widely been have designer double-strand nucleases, SSRs SSRs nucleases, double-strand designer

(ZFNs), transcription activator-like effector effector activator-like transcription (ZFNs), , (CAGE) engineering genome assembly SSRs do create DSBs, but unlike simpler simpler unlike but DSBs, create do SSRs

13

recognition, including zinc-finger nucleases nucleases zinc-finger including recognition, conjugative and recombination red ‑ λ using strand exchange without creating DSBs, Ser Ser DSBs, creating without exchange strand

8

majority of systems use protein-based target target protein-based use systems of majority (MAGE) engineering genome automated whereas Tyr SSRs utilize a mechanism of of mechanism a utilize SSRs Tyr whereas

Protein-based target recognition. target Protein-based multiplex approaches, major two the , can be deleted, inverted or replaced. Notably, Notably, replaced. or inverted deleted, be can The ) FIG. 3a (

DNA-based target recognition systems systems recognition target DNA-based the DNA sequence between the target sites sites target the between sequence DNA the

DNA targetable for editing. for targetable DNA For recognition. target DNA-based occur between a pair of target sites, where where sites, target of pair a between occur

(PAM) which would, in theory, make any any make theory, in would, which (PAM) inverted repeats, and recombination can can recombination and repeats, inverted

requirement of a a of requirement (DSBs). breaks double-strand make a short DNA sequence flanked by two two by flanked sequence DNA short a protospacer adjacent motif adjacent protospacer

is that, unlike CRISPR–Cas9, there is no no is there CRISPR–Cas9, unlike that, is they whether and RNA or Briefly, the target site comprises three parts, parts, three comprises site target the Briefly, ), TABLE 1 ; FIG. 3 (

in eukaryotic cells, a key potential advantage advantage potential key a cells, eukaryotic in protein DNA, using is, that site, target its . . genomes mammalian in (HDR)

repair

50 , 49

reliable cleavage efficacy at DNA target sites sites target DNA at efficacy cleavage reliable recognizes system the how by classified engineering tools to enable enable to tools engineering homology-directed homology-directed

length, were one of the earliest genome- earliest the of one were length, have to shown be can system Ago-based be can tools these Broadly, action. of

bp in in bp 40 and 30 between sequences target an If retracted. subsequently was paper mechanisms different have which

Tyr/Ser SSRs, which typically recognize recognize typically which SSRs, Tyr/Ser original the and questioned were results of most nanomachines, genome-editing of

39–41

DNA-binding domains DNA-binding these , temperatures physiological at cells number large a to access have we However, .

38 45–48

and by deriving fusions to programmable programmable to fusions deriving by and eukaryotic in DNA edit to reported initially . cells human in engineering genome

26–28

targeting sites using mutagenesis approaches approaches mutagenesis using sites targeting was NgAgo, , gregoryi Natronobacterium for adapted and repurposed been has that

to systematically increase the diversity of of diversity the increase systematically to from Ago different a Although published. pyogenes Streptococcus from system Cas9

this key limitation, work has been done done been has work limitation, key this been not has TtAgo using cells human in CRISPR– the on been has spotlight the of

targeting sites is quite limited. To address address To limited. quite is sites targeting activity Editing . °C ≥65 temperatures Much technologies. genome-engineering ) 37 (REF.

site. However, the natural repertoire of of repertoire natural the However, site. at only albeit target, DNA plasmid a of in developments recent the to attributed

a single custom biopolymer for each target target each for biopolymer custom single a cleavage Ago-based direct could guide DNA mainly be can genomes and regions

are reasonably high, and they only require require only they and high, reasonably are single-stranded a where interference, large edit to ability the surrounding

. The efficiencies of meganucleases meganucleases of efficiencies The . DSB DNA DNA-based provide to shown been has enthusiasm and excitement The

44

bp upon which they cut and induce a a induce and cut they which upon bp 40 (TtAgo) thermophilus Thermus bacterium Genome-editing systems — not just Cas9 just not — systems Genome-editing

14 between sequence DNA specific the from (Ago) protein Argonaute The and and

as homing endonucleases, recognize a a recognize endonucleases, homing as . modification of rates higher much generate . motifs regulatory

36 , 5 23–25

. Meganucleases, also known known also Meganucleases, . here further to nuclease designer a with conjunction unknown of presence or absence the reveal

43 , 42

detail and, thus, will not be elaborated elaborated be not will thus, and, detail in is SSOs of utility prevalent most the to ‘refactoring’ and resistance) multi-virus

have previously been discussed in great great in discussed been previously have However, . cells mammalian and acids amino non-standard of use

) TABLE 1 ( 33–35 PERSPECTIVES PERSPECTIVES

a DNA-based recognition CAGE MAGE λ-red Unedited cell DNA Edited replication cell 5′ 3′ Oligonucleotides with desired mutations RecA + DNA (3CMT) λ exonuclease + DNA (3SLP)

Argonaute Guide Double nicked Target Linear DNA (circular) DNA (Fwd) P or

P TtAgo + DNA (4NCB) Guide DNA (Rev)

b Protein-based recognition Zinc-finger nuclease (ZFN) Transcription activator-like effector nuclease (TALEN)

FokI FokI 5′ 5′ 5′ 5′ 3′ 3′ 3′ 3′ FokI Double-stranded FokI Double-stranded staggered cut A staggered cut 3 bp C 1 bp T Zif268 + DNA TALE PthXo1 G (1P47) (3UGM)

Meganuclease Tyr/Ser site-specific recombinase (SSR)

5′ IR 5′ 5′ 5′ IR IR 3′ 3 3′ 3′ ′ Double-stranded IR = inverted repeat staggered cut IR Excision

I-CreI Meganuclease + DNA Cre Recombinase + DNA (2XE0) (3CRX)

c RNA-based recognition Group II intron CRISPR–Cas9

RT

5′ 3′ Double-stranded Single-guide RNA Intron blunt-ended cut

Group II intron SpCas9 + DNA + sgRNA (3BWP) (4OO8)

Nature Reviews | Genetics

NATURE REVIEWS | GENETICS ADVANCE ONLINE PUBLICATION | 5 ©2017 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved. ©2017 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved.

PERSPECTIVES

Table 1 | Properties of DNA-editing nanomachines Name Target Use in Target Double-strand Maximum Maximum Maximum Refs scanning human diversity break number of observed observed cells simultaneously HDR (%)‡ NHEJ (%)‡ altered sites* Unstimulated DNA Yes High No 2 10−6 NA 158 endogenous HR MAGE (λ‑Red) DNA Yes High No 2¶ NA NA 32,159 CAGE DNA No High No NA NA NA 13 SSOs only DNA Yes High No 2 10−4 NA 33,35 ZFNs Protein Yes Moderate Yes § 6 43 65 160–162 TALENs Protein Yes High Yes § 2 10 50 6,163 Meganucleases Protein Yes Low Yes 2 1 5 164,165 Tyr/Ser SSRs Protein Yes Low No|| 6 10−5 NA 166 Group II introns RNA Yes High Yes NA NA NA 58,59 CRISPR–Cas9 RNA Yes High Yes § 62 33 79 7,70,167 *In mammalian cells (per individual genome). Homozygous changes counted as two altered sites. ‡Observed in mammalian stem cells without antibiotic selection. §Can be programmed as nickases (single-strand nucleases). ||Although Ser SSRs utilize double-strand breaks mechanistically, these breaks are probably not exposed to the NHEJ machinery in the manner of the various designer nucleases. ¶In bacteria, 12 sites have been simultaneously edited. CAGE, conjugative assembly genome engineering; HDR, homology-directed repair; HR, homologous recombination; MAGE, multiplex automated genome engineering; NA, data not available or not applicable; NHEJ, non-homologous end joining; SSO, single-stranded oligonucleotides; SSRs, site-specific recombinases; TALENs, transcription activator-like effector nucleases; ZFNs, zinc-finger nucleases. recognize different target sites. However, potential42,60–62. Furthermore, specific variants each segment is edited independently using the potential for recombinases in the of Cas9 have been made to enable gene an approach such as MAGE or CRISPR scheme of large-scale genome engineering editing without creating DSBs. A single (FIG. 4b). Subsequently, these segments can be is vast; thus, work to increase the diversity mutation to the Cas9 protein can generate a re‑assembled into the final desired sequence. of DNA sequences that SSRs can target by version that only creates a single-strand DNA In this section, we describe successful engineering new specificities, which has cut (nick)63,64. Fusion of a nuclease-dead applications of large-scale engineering in already started and is continuing52–57, will be Cas9 (dCas9) to a cytidine deaminase bacteria and yeast and how these approaches highly valuable. enables site-specific conversion of cytidine could be translated for higher-order to uracil65–67. With respect to large-scale eukaryotes. RNA-based target recognition. The class genome editing, the key relevant aspect Previously, our laboratory engineered an of RNA-based target recognition includes of CRISPR is the ease of designing and strain in which all known group II introns and CRISPR–Cas9 (FIG. 3c). multiplexing single-guide RNAs (sgRNAs) instances of the UAG stop codon were For group II introns, in a process known for targeting diverse genomic loci, which replaced with UAA, with the motivation of as retrohoming, DNA is first transcribed could enable excision and/or replacement of freeing the UAG codon to encode for novel into a catalytically active RNA and an DNA segments of various sizes. This aspect amino acids13,14. In this work, the E. coli mRNA encoding a reverse transcriptase. has been demonstrated in the epigenetic genome was separated into 32 different These two components form a complex and context of studying enhancer function22,68,69, segments, such that each segment contained invade DNA with sequence similarity to multi-kilobase gene replacement6 and approximately 10 UAG codons, then these a part of the intronic RNA. Subsequently, multiplexed disruption of 62 different segments were edited in parallel using MAGE. reverse transcription of the RNA molecule porcine endogenous retrovirus (PERV) Briefly, MAGE utilizes the phage λ‑red into cDNA permits insertion in a targeted sites in the pig genome70. recombination machinery along with short manner to disrupt gene function or, oligonucleotides carrying desired mutations when homologous DNA is introduced, Genome editing versus de novo synthesis to enable cumulative recombination in a to promote HDR58,59. At the current Various strategies have been outlined to cyclical process that can be repeated multiple juncture, this technology is not likely to be perform large-scale genome engineering, times. Once each segment obtained all a main contributor to large-scale genome from completely synthesizing genomes the necessary UAG changes, the segments engineering owing to its low activity in de novo, to massively editing an existing were hierarchically assembled into a single mammalian cells. genome scaffold, to a combination of both71. strain using CAGE. Notably, this final strain Finally, the CRISPR–Cas9 system In a fully synthetic approach, genomes exhibited decreased phage infectivity, and has been shown to be one of the most of interest are designed computationally, the UAG codon was successfully re‑assigned robust gene-editing platforms currently and then a rational, hierarchical strategy is to different non-standard amino acids14,24,25, available26–28. Review articles have covered devised to assemble long oligonucleotides demonstrating that this approach can the development of the CRISPR–Cas9 into larger pieces (FIG. 4a). Alternatively, be used to create potentially industrially system in terms of its mechanism of action, an existing genome scaffold can be robust strains. ability to be tethered to different effector substantially edited, whereby the scaffold By contrast, genomes have also been domains, and improvements in terms of is delineated into a manageable number of constructed completely de novo from higher specificity and multiplex editing segments of roughly equal size, and then synthesized DNA. Work from the J. Craig

6 | ADVANCE ONLINE PUBLICATION www.nature.com/nrg ©2017 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved. ©2017 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved.

PERSPECTIVES

a Complete de novo synthesis In silico In vitro In vivo (bacteria or yeast)

Genome design Oligonucleotides Short fragments (~1 –2 kbp) Segments (50 kbp to 1 Mbp)

b Editing an existing DNA scaffold In vivo In silico In vivo (bacteria, yeast and mammals) MAGE or CRISPR or HR

Existing DNA scaffold

Devise editing Sequential editing Massively edited scaffold strategy

Engineer replacement Figure 4 | Two main approaches to large-scale genome engineering in PCR-based approaches. Finally, these fragments can be assembled into seg‑ bacteria and yeast. a | For complete de novo synthesis, one must first com‑ ments in the order of megabase pairs (Mbp) by use of either bacteria or yeast putationally design oligonucleotide sequences that comprise the desired as a chassis. b | Extensive editing of an existing genomeNature scaffold. Reviews Depending | Genetics product while ensuring the highest compatibility with the experimental on the size of the DNA segment being edited, a segment is computationally strategy used for assembly. For example, the thermodynamic properties of divided into pieces with an approximately similar level of editing. Each the sequence, such as GC% and secondary structure formation, are impor‑ piece can then be edited independently in bacteria or yeast and, upon com‑ tant parameters to consider. Subsequently, oligonucleotides (~150–200 pletion, can be re‑assembled into a single final segment and then placed bases in length) can be assembled into 1–2 kbp fragments using in vitro into the final recipient cell.

Venter Institute (JCVI) has resulted in the and transformed into yeast; subsequently, of which are shared by both approaches, synthesis of numerous small genomes72–75. the native wild-type versions of these arms and potential solutions. Here, the eventual goal was to generate were inactivated. By contrast, for synthesis Although computationally designing a minimal genome (JCVI‑syn3.0) of chromosome III, a hierarchical approach and subsequently producing chemically using a completely synthetic genome was taken. Starting from thousands of short synthesized DNA offers much greater (JCVI‑syn1.0) as the starting point. Briefly, nucleotides (70 bases), ~400 building blocks control than starting from existing genomes, for JCVI‑syn1.0, the approach taken was to (750 bp) were assembled using PCR-based the cost to synthesize an entire human hierarchically assemble 1‑kb DNA fragments protocols to produce 83 overlapping segments genome is probably more expensive than into larger and larger segments by use of (2–4 kbp) in yeast by HR. For the replacement editing an existing scaffold. It should be HR in yeast and to subsequently transfer the of the native chromosome, synthetic segments noted that this cost comprises the constructed genome (~1 Mbp in size) into a were iteratively replaced over multiple cycles raw materials (synthetic DNA) and new recipient strain76. Similarly, our group until the entire chromosome was replaced. the process and labour of assembling the raw recently used synthesized DNA fragments Most recently, in addition to chromosome material into the final product. Thus, a for an entire E. coli genome (~4 Mbp in III, synthetic versions of chromosomes II, V, further reduction in cost will be necessary. size) along with yeast HR for assembly to VI, X and XII have been made, marking the However, this difference in cost is much re‑design the E. coli genome to use 57 out completion of more than one-third of the smaller than it once was, as we have of 64 codons, enabling re‑assignment of chromosomes in the Sc2.0 project80–84. observed a million-fold reduction in cost the seven unused codons to additional, in a decade for genome sequencing85 and non-standard amino acids77. Technical challenges solid-phase DNA synthesis86, to the point In yeast, the major project that is well It is clear that both large-scale editing of where the cost for sequencing is about underway is the Synthetic Yeast Genome existing scaffolds and de novo synthesis US$1,000, whereas synthesis runs Project (Sc2.0)78,79. Sc2.0 utilizes a full of DNA have yielded successful end about US$6,000 (REF. 9) per human genome. synthetic approach whereby chromosomes products, each with its own advantages and Furthermore, genomic regions with (or chromosome arms) are made entirely disadvantages. Furthermore, when scaling low sequence complexity (that is, those from chemically synthesized DNA. these approaches to larger genomes (such that are highly repetitive) can also pose Two chromosome arms (synIXR and as human), additional considerations would difficulties with current DNA synthesis semi-synVIL) were synthesized de novo as also need to be addressed. Here, we discuss platforms. Relatedly, the genome sequences a bacterial artificial chromosome (BAC) a number of key technical challenges, many of many higher-order eukaryotes are

NATURE REVIEWS | GENETICS ADVANCE ONLINE PUBLICATION | 7 ©2017 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved. ©2017 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved.

PERSPECTIVES

incomplete, largely due to difficulties designing and introducing many changes at example, for CRISPR–Cas9 or TALENs with reading low-complexity genomic once relative to the normal genome sequence and ZFNs, each target site would require regions. For example, further development may result in a non-viable phenotype as a a new sgRNA or new designer nuclease, and application of newer sequencing consequence of the editing process itself respectively, in addition to the donor DNA technologies that can generate long reads, (in the case of radically modifying an molecule specifying the new sequence. such as those obtained from Pacific existing scaffold) or as a result of the DNA Next, the majority of editing experiments Biosciences (PacBio) single-molecule sequence being incompatible with normal have been performed in cells that are real-time (SMRT) sequencing87 and Oxford function. Although we have gained a better continually undergoing mitosis, so another Nanopore Technologies (ONT) nanopore understanding of how to design prokaryotic77 ideal feature would be a system that can sequencing88–91, or high-resolution and lower-order eukaryotic genomes80, this also operate in postmitotic cells in a highly visualization approaches could be used to level of understanding does not exist for efficient manner. Although the level of complete our maps of these genomes92. more complex mammalian genomes. Thus, editing desired in postmitotic cells may Massively editing an existing scaffold it may be imperative that the upper limit not be as large in scale as the level desired also has limitations and consequences. First, of the number of modifications that can be in other cells, this feature would at least the multiple rounds of editing required to made in a single iteration be identified for increase the repertoire of cell types that achieve the desired genome sequence can each cell type, and if the number of changes could be edited. To date, Cre recombinase be quite laborious and time consuming exceeds this limit, the modifications would and CRISPR–Cas9 have been shown to be to perform. If the efficiency for multiple be incorporated in multiple iterations. effective in cardiomyocytes and neurons, targeting is not sufficiently high, cell As each approach provides distinct respectively99–101, with application in other populations would need to be screened for advantages and challenges, it may very well cell types yet to be demonstrated. the correct sequence over each iteration. be a hybrid of these two approaches that In addition to increasing the diversity Second, depending on the genome-editing will ultimately be most successful. One can of cell types, it is also important to be able technology used, the sequence of the envision a scenario in which an initial library to have broad target sequence specificity. genome itself can present challenges in of DNA building blocks that spans an entire As mentioned earlier, if Ago systems could targeting. For example, regions with a genome will be synthesized (or generated be optimized in the future for efficient high per cent of cytosines could present through PCR of genomic DNA), and DNA target cleavage in eukaryotic cells, challenges for site-specific deaminases and/ subsequent sequence variations to particular one of their more compelling features or regions that are A/T-rich would present sets of building blocks will be generated is the lack of a requirement for a PAM problems for CRISPR–Cas9. The discovery using genome-engineering technologies. sequence. Although this system would of the CRISPR system Cpf1 nuclease (as For example, starting with a BAC library provide a theoretically unlimited range of an alternative to Cas9) could address the comprising overlapping 100–300 kbp targetable sequences, it may also increase latter problem, as the PAM is composed of a segments of the human genome98, one can the likelihood of unintended mutagenesis thymidine homopolymer93. select an individual BAC encompassing via off-target activity. In the same vein, A notable consequence of genome editing a desired genomic location, edit this when improving CRISPR–Cas9 specificity, is the targeting of unintended sites. Targeting DNA segment in E. coli using MAGE and depending on the approach taken, a particular single-nucleotide variants (SNVs) subsequently use HR to incorporate this reduction in off-target activity may be or point mutations is difficult because all new DNA segment in the destination accompanied by a reduction in on‑target the genome-editing platforms can tolerate cell population. activity as well61. Thus, it is going to be a fine some level of mispairing. For unintended balance of having a system that has a broad mutagenesis, the use of a non-DSB- Desired genome-writing properties targeting range, high editing efficiency and inducing technology could alleviate this We have detailed two different approaches to high specificity. problem, but in general, the efficiencies performing large-scale genome engineering. The ability to achieve 100% gene of these approaches are lower than the Although building large genomes de novo disruption at a single locus has become efficiencies of DSB-inducing technologies. has many advantages, as we stated earlier, we feasible for some loci with many of the Thus, if a DSB-inducing approach believe that a combination of de novo DNA technologies discussed. However, the ideal were used, any off-target sites that are synthesis and genome-editing technologies feature with respect to efficiency is obtaining unintentionally altered could be identified will probably be utilized moving forward. a high frequency of multiplexed editing in a high-throughput manner61,94–97, and Thus, it will be important to understand where we have the ability to make hundreds even if there were DSBs, these could still be what properties we desire in a truly robust, of changes per cell simultaneously. Highly tolerated if the DSBs were not introduced high-throughput genome-writing efficient multiplexed editing would need to in important regions in the genome, such platform (TABLE 2). be done in conjunction with an approach as oncogenes or tumour suppressor genes. Of the different genome-engineering that minimizes toxicity to the cell. If the Similar to the above suggestion, an iterative technologies described, although it method of choice utilizes DSBs, creating approach could be used whereby after each has not been shown to work in human hundreds of DSBs could be problematic iteration, the target DNA is sequenced to cells, the λ‑red (and the similar recET) in that it could either induce cellular verify the absence of off-target mutations. recombination systems have the ideal apoptosis or chromosomal rearrangement. However, as the number of iterations characteristic that only one custom As highlighted earlier, the disruption of 62 required increases, the costs associated biopolymer (the donor DNA molecule PERV genes in a single genome with minimal with sequencing would also increase. carrying desired changes) is necessary to impact on genome architecture was probably For both de novo genome synthesis and make site-specific changes, whereas the a unique case, as there were no apparent substantial editing of existing genomes, other technologies require at least two. For rules for multiplex targeting that could be

8 | ADVANCE ONLINE PUBLICATION www.nature.com/nrg ©2017 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved. ©2017 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved.

PERSPECTIVES

Table 2 | Ideal features for genome writing or editing platforms risks to embryos. At this juncture, it seems difficult to draw sharp red lines for what Feature Description may be permitted versus forbidden; Simple Only one custom biopolymer for each HDR or HR event for example, designating areas of the Multiplex 100% targeting at multiple sites, not merely ‘high efficiency’ at a single site genome, such as the mitochondrial genome, Robust High efficiency in postmitotic cells (for example, neurons), broad where editing would be permitted. It sequence compatibility should also be noted that mitochondrial replacement therapy has already been Large regions High efficiency of 10 kbp to 10 Mbp scale replacements, alternative 120 strategies possible utilized in cases of IVF . A key component will be engaging various communities to Low toxicity Low apoptosis, high specificity, genome integrity intact imagine unexpected scenarios, including Germline editing Specific safeguards might include restriction to gametes and/or clonal respect for diverse cultures and faiths. analysis to reduce mosaicism Reading and testing Rapid feedback on many synthetic genomes via reporters, organoids A roadmap to writing large genomes and sequencing After these desired features of a large-scale HDR, homology-directed repair; HR, homologous recombination. genome-writing platform are achieved, the final step is implementation. Here, we consider a roadmap for how the discerned moving forward, and only a single to exchange and then the exchange event implementation could be accomplished sgRNA was required to target all 62 sites70. itself — and thus could be quite laborious. (FIG. 5). Starting with libraries of DNA oligo­ Thus, a more sophisticated strategy, such Intriguingly, RMCE has been combined nucleotides, small DNA fragments in the as manipulating DNA repair machinery with CRISPR–Cas9 to make this approach 3‑kbp range can be constructed in vitro using or utilizing non-DSB-inducing entities, more efficient115. As stated previously, a combination of DNA annealing, ligation would be necessary if this many genes were the main challenge with recombinases is the and amplification. By taking advantage of simultaneously targeted in other applications. limited diversity of targetable sites. In terms the high efficiency of HR in yeast, these Nonetheless, this level of disruption of tethering or pairing, placing the donor smaller fragments could then be assembled will be crucial for large-scale editing. template in close proximity to the site of into larger pieces of approximately 50 kbp Although efforts have been made to identify cleavage should increase the frequency at in size (or possibly into the megabase scale). approaches to improve HR and HDR, the which replacements occur. For example, Next, these 50‑kbp segments can be built efficiencies are still typically one order of when using CRISPR–Cas9, the desired into larger fragments (~4 Mbp) using an SSR magnitude lower than that of an alternative proximity could be ensured by covalently system in E. coli. Multiple bacterial or yeast mechanism of DSB repair, non-homologous attaching the donor DNA to the sgRNA. strains carrying different 4‑Mbp fragments end joining (NHEJ), in which the DNA Last, but not least, is germline editing. could be delivered via conjugation (or ends are re‑ligated, typically without The ability to edit the germline poses spheroplast fusion) to plant or animal cells incorporating a donor DNA sequence102–107. the usual ethical issues of risks, benefits, to build chromosome-sized elements31,121. Intriguingly, attempts to utilize the NHEJ safety and efficacy (adequately handled Finally, these large (40–250 Mbp) segments repair pathway to incorporate specific by organizations such as the US Food could be hierarchically assembled (usually DNA sequences have also been made with and Drug Administration (FDA), the with selectable markers) into single cells some degree of success108,109. Thus, work European Medicines Agency (EMA) and by microcell-mediated chromosomal will need to continue to identify innovative the China Food and Drug Administration transfer (MMCT)122. As the efficiency of HR ways to achieve high editing frequency (CFDA))116–119. One of the most difficult increases, we can miniaturize and multiplex simultaneously across multiple loci. challenges, from a logistical perspective, the HR reactions and reduce the need for Relatedly, another important feature will be understanding the long-term, selection. High-throughput and multiplex will be to increase the efficiency of larger potentially trans-generational, societal testing of large numbers of combinatorially changes. There is an inverse relationship impact of germline editing applications, created genomes greatly increases the observed between the efficiency of an which may otherwise be deemed ‘safe’ in chances of rapid success. alteration and the size of the alteration, a conventional medical sense. Ironically, It should also be noted that at various where smaller changes are much easier because reproductive technologies, such as points of this process, DNA segments of to incorporate than larger ones6,110. Two spermatogonial stem cell engineering, can different sizes could be modified using potential strategies could be pursued to use clonal sibling cells, which can be checked smaller-scale genome-editing technologies, address the efficiency. The first strategy for genomic goals, including zero undesired contingent on the ability to retrieve and would be to utilize recombinases, and the on‑target mutations (caused by NHEJ replace individual segments efficiently second would be to tether homologous rather than HDR), off-target errors and (such as using a multiplex barcoded DNA directly to genome-editing reagents. epigenetic issues, high-fidelity editing may library). Additional editing is advantageous Recombinase-mediated cassette exchange be more critical for somatic gene therapy. for various reasons. First, if a particular (RMCE) has been routinely used in Additionally, editing sperm or egg precursor segment was deemed incompatible for mammalian cells, with its most prevalent cells could result in considerable reduction viability, the segment could be edited until use in creating conditional mouse in embryo loss and spontaneous or induced it no longer created problems. Second, if strains111–114. This approach requires two abortion relative to in vitro fertilization additional sequence diversity was desired, separate gene-editing events — one event to (IVF) and hence might be attractive to it may be more efficient to edit the DNA place recombination sites flanking the area recessive carrier parents with concerns about segment than to re‑synthesize it entirely.

NATURE REVIEWS | GENETICS ADVANCE ONLINE PUBLICATION | 9 ©2017 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved. ©2017 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved.

PERSPECTIVES

Size Synthesized For example, if the editing was to be oligonucleotides performed in yeast, native HR could be Computational used, which is already highly efficient, or a design CRISPR-based approach could be used for 123,124 200 bases replacement . Similarly, if the segments were in E. coli, MAGE and/or CAGE could be employed. Taken together, achieving this collection of reading, editing, writing and testing Anneal (in vitro) goals may move us closer to safe, robust, inexpensive, egalitarian and effective genome editing that can change our Short DNA fragments environment and ourselves, provided ~3 kbp HR or CRISPR that such strategies are constantly under technical and ethical scrutiny and feedback.

HR/Assembly Raj Chari is at the Department of Genetics, Harvard (Yeast) Medical School, 77 Avenue Louis Pasteur, Boston, Massachusetts, 02115, USA.

Assembled fragments George M. Church is at the Department of Genetics, Harvard Medical School, 77 Avenue Louis Pasteur,

SSR, CRISPR Boston, Massachusetts, 02115, USA, and at the Wyss ~50 kbp– or MAGE 1 Mbp Institute for Biologically Inspired Engineering, 3 Blackfan Circle, Boston, Massachusetts, 02115, USA. Tyr/Ser SSR Correspondence to G.M.C. (E. coli) [email protected]

Megabase DNA segments doi:10.1038/nrg.2017.59 Published online 30 Aug 2017 ~4 Mbp SSR or CRISPR 1. Kosuri, S. & Church, G. M. Large-scale de novo DNA synthesis: technologies and applications. Nat. Methods 11, 499–507 (2014). 2. Reuter, J. A., Spacek, D. V. & Snyder, M. P. High-throughput sequencing technologies. Mol. Cell Conjugation or cell fusion 58, 586–597 (2015). (to mammalian cells) 3. Goodwin, S., McPherson, J. D. & McCombie, W. R. Coming of age: ten years of next-generation sequencing Chromosomes technologies. Nat. Rev. Genet. 17, 333–351 (2016). 4. Smithies, O., Gregg, R. G., Boggs, S. S., Koralewski, M. A. & Kucherlapati, R. S. Insertion of ~100 Mbp DNA sequences into the human chromosomal beta- globin locus by homologous recombination. Nature 317, 230–234 (1985). 5. DeWitt, M. A. et al. Selection-free genome editing of the sickle mutation in human adult hematopoietic stem/ MMCT progenitor cells. Sci. Transl Med. 8, 360ra134 (2016). 6. Byrne, S. M., Ortiz, L., Mali, P., Aach, J. & (Human) Church, G. M. Multi-kilobase homozygous targeted gene replacement in human induced pluripotent stem Large genome cells. Nucleic Acids Res. 43, e21 (2015). 7. Dever, D. P. et al. CRISPR/Cas9 β‑globin gene targeting in human haematopoietic stem cells. Nature 539, 384–389 (2016). 8. Wang, H. H. et al. Programming cells by multiplex Test genome engineering and accelerated evolution. 6 Gbp Nature 460, 894–898 (2009). 9. Boeke, J. D. et al. The Genome Project-Write. Science 353, 126–127 (2016). 10. Robertson, D. E. et al. A new dawn for industrial photosynthesis. Photosynth. Res. 107, 269–277 (2011). 11. Salhi, A. et al. DESM: portal for microbial knowledge | exploration systems. Nucleic Acids Res. 44, Figure 5 A roadmap to building large genomes using a combinationNature of de novo Reviews synthesis | Genetics D624–D633 (2016). and genome editing. Here, a hierarchical strategy is employed starting from short oligonucle‑ 12. Fischer, C. R., Klein-Marcuschamer, D. & otides (150 to 200 bases in length), which can be assembled into short DNA fragments ~3 kbp Stephanopoulos, G. Selection and optimization of microbial hosts for biofuels production. Metab. Eng. in size. The short DNA fragments are then be assembled into larger segments between 50 kbp 10, 295–304 (2008). and 1 Mbp in size. Segments ~4 Mbp in size can be built in bacteria or yeast and subsequently 13. Isaacs, F. J. et al. Precise manipulation of transferred into human cells by conjugation. These segments can then be increased to chromosomes in vivo enables genome-wide codon replacement. Science 333, 348–353 (2011). ~100 Mbp in size and transferred into recipient cells by microcell-mediated chromosomal trans‑ 14. Lajoie, M. J. et al. Genomically recoded organisms fer (MMCT). In each stage where a segment is built, genome-editing tools, such as multiplex expand biological functions. Science 342, 357–360 automated genome engineering (MAGE), CRISPR, homologous recombination (HR) and (2013). 15. Welter, D. et al. The NHGRI GWAS Catalog, a curated site-specific recombinases (SSRs), can be utilized to make further edits from what was specified resource of SNP-trait associations. Nucleic Acids Res. in the initial set of oligonucleotides. 42, D1001–D1006 (2014).

10 | ADVANCE ONLINE PUBLICATION www.nature.com/nrg ©2017 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved. ©2017 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved.

PERSPECTIVES

16. Wang, G. et al. Modeling the mitochondrial 45. Smith, J. et al. A combinatorial approach to create 73. Gibson, D. G. et al. Complete chemical synthesis, cardiomyopathy of Barth syndrome with induced artificial homing endonucleases cleaving chosen assembly, and cloning of a pluripotent stem cell and heart‑on‑chip technologies. sequences. Nucleic Acids Res. 34, e149 (2006). genome. Science 319, 1215–1220 (2008). Nat. Med. 20, 616–623 (2014). 46. Chevalier, B. S. et al. Design, activity, and structure of 74. Gibson, D. G. et al. Creation of a bacterial cell 17. Freedman, B. S. et al. Modelling kidney disease with a highly specific artificial endonuclease. Mol. Cell 10, controlled by a chemically synthesized genome. CRISPR-mutant kidney organoids derived from human 895–905 (2002). Science 329, 52–56 (2010). pluripotent epiblast spheroids. Nat. Commun. 6, 8715 47. Boissel, S. et al. megaTALs: a rare-cleaving nuclease 75. Hutchison, C. A. et al. Design and synthesis of a minimal (2015). architecture for therapeutic genome engineering. bacterial genome. Science 351, aad6253 (2016). 18. Drost, J. et al. Sequential cancer mutations in cultured Nucleic Acids Res. 42, 2591–2601 (2014). 76. Lartigue, C. et al. Creating bacterial strains from human intestinal stem cells. Nature 521, 43–47 48. Wolfs, J. M. et al. MegaTevs: single-chain dual genomes that have been cloned and engineered in (2015). nucleases for efficient gene disruption. Nucleic Acids yeast. Science 325, 1693–1696 (2009). 19. Matano, M. et al. Modeling colorectal cancer using Res. 42, 8816–8829 (2014). 77. Ostrov, N. et al. Design, synthesis, and testing toward CRISPR‑Cas9‑mediated engineering of human 49. Birling, M.‑C., Gofflot, F. & Warot, X. Site-specific a 57‑codon genome. Science 353, 819–822 (2016). intestinal organoids. Nat. Med. 21, 256–262 (2015). recombinases for manipulation of the mouse genome. 78. Dymond, J. S. et al. Synthetic chromosome arms 20. Schwank, G. et al. Functional repair of CFTR by Methods Mol. Biol. 561, 245–263 (2009). function in yeast and generate phenotypic diversity by CRISPR/Cas9 in intestinal stem cell organoids of cystic 50. Sauer, B. & Henderson, N. Site-specific DNA design. Nature 477, 471–476 (2011). fibrosis patients. Cell Stem Cell 13, 653–658 (2013). recombination in mammalian cells by the Cre 79. Annaluru, N. et al. Total synthesis of a functional 21. Liu, J. et al. CRISPR/Cas9 facilitates investigation of recombinase of bacteriophage P1. Proc. Natl Acad. designer eukaryotic chromosome. Science 344, neural circuit disease using human iPSCs: mechanism Sci. USA 85, 5166–5170 (1988). 55–58 (2014). of epilepsy caused by an SCN1A loss‑of‑function 51. Turan, S. & Bode, J. Site-specific recombinases: from 80. Richardson, S. M. et al. Design of a synthetic yeast mutation. Transl Psychiatry 6, e703 (2016). tag-and-target- to tag-and-exchange-based genomic genome. Science 355, 1040–1044 (2017). 22. Soldner, F. et al. Parkinson-associated risk variant in modifications. FASEB J. 25, 4088–4107 (2011). 81. Shen, Y. et al. Deep functional analysis of synII, a distal enhancer of α‑synuclein modulates target gene 52. Buchholz, F. & Stewart, A. F. Alteration of Cre 770‑kilobase synthetic yeast chromosome. Science expression. Nature 533, 95–99 (2016). recombinase site specificity by substrate-linked 355, eaaf4791 (2017). 23. Chan, L. Y., Kosuri, S. & Endy, D. Refactoring protein evolution. Nat. Biotechnol. 19, 1047–1052 82. Xie, Z.‑X. et al. ‘Perfect’ designer chromosome V and bacteriophage T7. Mol. Syst. Biol. 1, 2005.0018 (2001). behavior of a ring derivative. Science 355, eaaf4704 (2005). 53. Eroshenko, N. & Church, G. M. Mutants of Cre (2017). 24. Mandell, D. J. et al. Biocontainment of genetically recombinase with improved accuracy. Nat. Commun. 83. Zhang, W. et al. Engineering the ribosomal DNA in a modified organisms by synthetic protein design. 4, 2509 (2013). megabase synthetic chromosome. Science 355, Nature 518, 55–60 (2015). 54. Santoro, S. W. & Schultz, P. G. Directed evolution of eaaf3981 (2017). 25. Rovner, A. J. et al. Recoded organisms engineered to the site specificity of Cre recombinase. Proc. Natl 84. Mitchell, L. A. et al. Synthesis, debugging, and effects depend on synthetic amino acids. Nature 518, 89–93 Acad. Sci. USA 99, 4185–4190 (2002). of synthetic chromosome consolidation: synVI and (2015). 55. Saraf-Levy, T. et al. Site-specific recombination of beyond. Science 355, eaaf4831 (2017). 26. Jinek, M. et al. RNA-programmed genome editing in asymmetric lox sites mediated by a heterotetrameric 85. Shendure, J. et al. Accurate multiplex polony human cells. eLife 2, e00471 (2013). Cre recombinase complex. Bioorg. Med. Chem. 14, sequencing of an evolved bacterial genome. Science 27. Mali, P. et al. RNA-guided human genome engineering 3081–3089 (2006). 309, 1728–1732 (2005). via Cas9. Science 339, 823–826 (2013). 56. Gaj, T. et al. Enhancing the specificity of recombinase- 86. Tian, J. et al. Accurate multiplex gene synthesis from 28. Cong, L. et al. Multiplex genome engineering using mediated genome engineering through dimer interface programmable DNA microchips. Nature 432, CRISPR/Cas systems. Science 339, 819–823 (2013). redesign. J. Am. Chem. Soc. 136, 5047–5056 1050–1054 (2004). 29. Zhang, Y., Buchholz, F., Muyrers, J. P. & Stewart, A. F. (2014). 87. Gordon, D. et al. Long-read sequence assembly of the A new logic for DNA engineering using recombination 57. Wallen, M. C., Gaj, T. & Barbas, C. F. Redesigning gorilla genome. Science 352, aae0344 (2016). in Escherichia coli. Nat. Genet. 20, 123–128 (1998). recombinase specificity for safe harbor sites in the 88. Loose, M., Malla, S. & Stout, M. Real-time selective 30. Yu, D. et al. An efficient recombination system for human genome. PLoS ONE 10, e0139123 (2015). sequencing using nanopore technology. Nat. Methods chromosome engineering in Escherichia coli. Proc. 58. Mastroianni, M. et al. Group II intron-based gene 13, 751–754 (2016). Natl Acad. Sci. USA 97, 5978–5983 (2000). targeting reactions in eukaryotes. PLoS ONE 3, e3121 89. Szalay, T. & Golovchenko, J. A. De novo sequencing 31. Lacroix, B. & Citovsky, V. Transfer of DNA from (2008). and variant calling with nanopores using PoreSeq. bacteria to eukaryotes. mBio 7, e00863‑16 (2016). 59. Guo, H. et al. Group II introns designed to insert into Nat. Biotechnol. 33, 1087–1091 (2015). 32. Xu, K., Stewart, A. F. & Porter, A. C. G. Stimulation of therapeutically relevant DNA target sites in human 90. Goodwin, S. et al. Oxford Nanopore sequencing, hybrid oligonucleotide-directed gene correction by Redβ cells. Science 289, 452–457 (2000). error correction, and de novo assembly of a eukaryotic expression and MSH2 depletion in human HT1080 60. Shalem, O., Sanjana, N. E. & Zhang, F. High- genome. Genome Res. 25, 1750–1756 (2015). cells. Mol. Cells 38, 33–39 (2015). throughput functional genomics using CRISPR‑Cas9. 91. Fuller, C. W. et al. Real-time single-molecule electronic 33. Rios, X. et al. Stable gene targeting in human cells Nat. Rev. Genet. 16, 299–311 (2015). DNA sequencing by synthesis using polymer-tagged using single-strand oligonucleotides with modified 61. Tsai, S. Q. & Joung, J. K. Defining and improving the nucleotides on a nanopore array. Proc. Natl Acad. Sci. bases. PLoS ONE 7, e36697 (2012). genome-wide specificities of CRISPR‑Cas9 nucleases. USA 113, 5233–5238 (2016). 34. Igoucheva, O., Alexeev, V. & Yoon, K. Targeted gene Nat. Rev. Genet. 17, 300–312 (2016). 92. Beliveau, B. J. et al. Single-molecule super-resolution correction by small single-stranded oligonucleotides in 62. Komor, A. C., Badran, A. H. & Liu, D. R. CRISPR-based imaging of chromosomes and in situ haplotype mammalian cells. Gene Ther. 8, 391–399 (2001). technologies for the manipulation of eukaryotic visualization using Oligopaint FISH probes. Nat. 35. Aarts, M. & te Riele, H. Subtle gene modification in genomes. Cell 168, 20–36 (2017). Commun. 6, 7147 (2015). mouse ES cells: evidence for incorporation of 63. Mali, P. et al. CAS9 transcriptional activators for 93. Zetsche, B. et al. Cpf1 is a single RNA-guided unmodified oligonucleotides without induction of target specificity screening and paired nickases for endonuclease of a class 2 CRISPR-Cas system. Cell DNA damage. Nucleic Acids Res. 38, 6956–6967 cooperative genome engineering. Nat. Biotechnol. 31, 163, 759–771 (2015). (2010). 833–838 (2013). 94. Tsai, S. Q. et al. GUIDE-seq enables genome-wide 36. Quadros, R. M. et al. Easi-CRISPR: a robust method for 64. Ran, F. A. et al. Double nicking by RNA-guided CRISPR profiling of off-target cleavage by CRISPR-Cas one-step generation of mice carrying conditional and Cas9 for enhanced genome editing specificity. Cell nucleases. Nat. Biotechnol. 33, 187–197 (2015). insertion alleles using long ssDNA donors and CRISPR 154, 1380–1389 (2013). 95. Frock, R. L. et al. Genome-wide detection of DNA ribonucleoproteins. Genome Biol. 18, 92 (2017). 65. Komor, A. C., Kim, Y. B., Packer, M. S., Zuris, J. A. & double-stranded breaks induced by engineered 37. Swarts, D. C. et al. DNA-guided DNA interference by a Liu, D. R. Programmable editing of a target base in nucleases. Nat. Biotechnol. 33, 179–186 (2015). prokaryotic Argonaute. Nature 507, 258–261 (2014). genomic DNA without double-stranded DNA cleavage. 96. Kim, D. et al. Digenome-seq: genome-wide profiling of 38. Gao, F., Shen, X. Z., Jiang, F., Wu, Y. & Han, C. Nature 533, 420–424 (2016). CRISPR‑Cas9 off-target effects in human cells. Nat. DNA-guided genome editing using the 66. Nishida, K. et al. Targeted nucleotide editing using Methods 12, 237–243 (2015). Natronobacterium gregoryi Argonaute. hybrid prokaryotic and vertebrate adaptive immune 97. Lensing, S. V. et al. DSBCapture: in situ capture and Nat. Biotechnol. 34, 768–773 (2016). systems. Science 353, aaf8729 (2016). sequencing of DNA breaks. Nat. Methods 13, 39. Lee, S. H. et al. Failure to detect DNA-guided genome 67. Yang, L. et al. Engineering and optimising deaminase 855–857 (2016). editing using Natronobacterium gregoryi Argonaute. fusions for genome editing. Nat. Commun. 7, 13330 98. Osoegawa, K. et al. A bacterial artificial chromosome Nat. Biotechnol. 35, 17–18 (2016). (2016). library for sequencing the complete human genome. 40. Burgess, S. et al. Questions about NgAgo. Protein Cell 68. Fulco, C. P. et al. Systematic mapping of functional Genome Res. 11, 483–496 (2001). 7, 913–915 (2016). enhancer-promoter connections with CRISPR 99. Straub, C., Granger, A. J., Saulnier, J. L. & Sabatini, B. L. 41. Javidi-Parsijani, P. et al. No evidence of genome interference. Science 354, 769–773 (2016). CRISPR/Cas9‑mediated gene knock-down in post-mitotic editing activity from Natronobacterium gregoryi 69. Canver, M. C. et al. BCL11A enhancer dissection by neurons. PLoS ONE 9, e105584 (2014). Argonaute (NgAgo) in human cells. PLoS ONE 12, Cas9‑mediated in situ saturating mutagenesis. Nature 100. Agah, R. et al. Gene recombination in postmitotic e0177444 (2017). 527, 192–197 (2015). cells. Targeted expression of Cre recombinase 42. Kim, H. & Kim, J.‑S. A guide to genome engineering 70. Yang, L. et al. Genome-wide inactivation of porcine provokes cardiac-restricted, site-specific with programmable nucleases. Nat. Rev. Genet. 15, endogenous retroviruses (PERVs). Science 350, rearrangement in adult ventricular muscle in vivo. 321–334 (2014). 1101–1104 (2015). J. Clin. Invest. 100, 169–179 (1997). 43. Chandrasegaran, S. & Carroll, D. Origins of 71. Haimovich, A. D., Muir, P. & Isaacs, F. J. Genomes by 101. Suzuki, K. et al. In vivo genome editing via CRISPR/ programmable nucleases for genome engineering. design. Nat. Rev. Genet. 16, 501–516 (2015). Cas9 mediated homology-independent targeted J. Mol. Biol. 428, 963–989 (2016). 72. Smith, H. O., Hutchison, C. A., Pfannkoch, C. & integration. Nature 540, 144–149 (2016). 44. Chevalier, B. S. & Stoddard, B. L. Homing Venter, J. C. Generating a synthetic genome by whole 102. Pinder, J., Salsman, J. & Dellaire, G. Nuclear domain endonucleases: structural and functional insight into genome assembly: phiX174 bacteriophage from ‘knock‑in’ screen for the evaluation and identification the catalysts of intron/intein mobility. Nucleic Acids synthetic oligonucleotides. Proc. Natl Acad. Sci. USA of small molecule enhancers of CRISPR-based genome Res. 29, 3757–3774 (2001). 100, 15440–15445 (2003). editing. Nucleic Acids Res. 43, 9379–9392 (2015).

NATURE REVIEWS | GENETICS ADVANCE ONLINE PUBLICATION | 11 ©2017 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved. ©2017 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved.

PERSPECTIVES

103. Chu, V. T. et al. Increasing the efficiency of homology- genomes during DNA double-strand break-induced 150. Chen, Z., Yang, H. & Pavletich, N. P. Mechanism of directed repair for CRISPR‑Cas9‑induced precise gene gene targeting in human cells. Hum. Gene Ther. 21, homologous recombination from the RecA-ssDNA/ editing in mammalian cells. Nat. Biotechnol. 33, 543–553 (2010). dsDNA structures. Nature 453, 489–484 (2008). 543–548 (2015). 128. Robert, M.‑A. et al. Efficacy and site-specificity of 151. Sheng, G. et al. Structure-based cleavage mechanism 104. Maruyama, T. et al. Increasing the efficiency of precise adenoviral vector integration mediated by the phage of Thermus thermophilus Argonaute DNA guide genome editing with CRISPR‑Cas9 by inhibition of φC31 integrase. Hum. Gene Ther. Methods 23, strand-mediated DNA target cleavage. Proc. Natl nonhomologous end joining. Nat. Biotechnol. 33, 393–407 (2012). Acad. Sci. USA 111, 652–657 (2014). 538–542 (2015). 129. Ou, H. et al. A highly efficient site-specific integration 152. Muñoz, I. G. et al. Molecular basis of engineered 105. Adamson, B., Smogorzewska, A., Sigoillot, F. D., strategy using combination of homologous meganuclease targeting of the endogenous human King, R. W. & Elledge, S. J. A genome-wide recombination and the ΦC31 integrase. J. Biotechnol. RAG1 locus. Nucleic Acids Res. 39, 729–743 homologous recombination screen identifies the RNA- 167, 427–432 (2013). (2011). binding protein RBMX as a component of the DNA- 130. Coluccio, A. et al. Targeted gene addition in human 153. Gopaul, D. N., Guo, F. & Van Duyne, G. D. Structure of damage response. Nat. Cell Biol. 14, 318–328 (2012). epithelial stem cells by zinc-finger nuclease-mediated the Holliday junction intermediate in Cre-loxP site- 106. Agudelo, D. et al. Marker-free coselection for CRISPR- homologous recombination. Mol. Ther. 21, specific recombination. EMBO J. 17, 4175–4187 driven genome editing in human cells. Nat. Methods 1695–1704 (2013). (1998). 14, 615–620 (2017). 131. Beaton, B. P. et al. Inclusion of homologous DNA in 154. Peisach, E. & Pabo, C. O. Constraints for zinc finger 107. Lin, S., Staahl, B. T., Alla, R. K. & Doudna, J. A. nuclease-mediated gene targeting facilitates a higher linker design as inferred from X‑ray crystal structure of Enhanced homology-directed human genome incidence of bi‑allelically modified cells. tandem Zif268‑DNA complexes. J. Mol. Biol. 330, engineering by controlled timing of CRISPR/Cas9 Xenotransplantation 22, 379–390 (2015). 1–7 (2003). delivery. eLife 3, e04766 (2014). 132. Nakade, S. et al. Microhomology-mediated end- 155. Mak, A. N.‑S., Bradley, P., Cernadas, R. A., 108. He, X. et al. Knock‑in of large reporter genes in human joining-dependent integration of donor DNA in cells Bogdanove, A. J. & Stoddard, B. L. The crystal cells via CRISPR/Cas9‑induced homology-dependent and animals using TALENs and CRISPR/Cas9. Nat. structure of TAL effector PthXo1 bound to its DNA and independent DNA repair. Nucleic Acids Res. 44, Commun. 5, 5560 (2014). target. Science 335, 716–719 (2012). e85 (2016). 133. Richardson, C. D., Ray, G. J., DeWitt, M. A., 156. Toor, N., Keating, K. S., Taylor, S. D. & Pyle, A. M. 109. Maresca, M., Lin, V. G., Guo, N. & Yang, Y. Obligate Curie, G. L. & Corn, J. E. Enhancing homology-directed Crystal structure of a self-spliced group II intron. ligation-gated recombination (ObLiGaRe): custom- genome editing by catalytically active and inactive Science 320, 77–82 (2008). designed nuclease-mediated targeted integration CRISPR‑Cas9 using asymmetric donor DNA. Nat. 157. Nishimasu, H. et al. Crystal structure of Cas9 in through nonhomologous end joining. Genome Res. Biotechnol. 34, 339–344 (2016). complex with guide RNA and target DNA. Cell 156, 23, 539–546 (2013). 134. Wang, H. et al. One-step generation of mice carrying 935–949 (2014). 110. Yang, Y. & Seed, B. Site-specific gene targeting in mutations in multiple genes by CRISPR/Cas-mediated 158. Hatada, S., Nikkuni, K., Bentley, S. A., Kirby, S. & mouse embryonic stem cells with intact bacterial genome engineering. Cell 153, 910–918 (2013). Smithies, O. Gene correction in hematopoietic artificial chromosomes. Nat. Biotechnol. 21, 447–451 135. Puchta, H., Dujon, B. & Hohn, B. Homologous progenitor cells by homologous recombination. (2003). recombination in plant cells is enhanced by in vivo Proc. Natl Acad. Sci. USA 97, 13807–13811 (2000). 111. Bradley, A. et al. The mammalian gene function induction of double strand breaks into DNA by a 159. Wang, H. H. et al. Genome-scale promoter engineering resource: the International Knockout Mouse site-specific endonuclease. Nucleic Acids Res. 21, by coselection MAGE. Nat. Methods 9, 591–593 Consortium. Mamm. Genome 23, 580–586 (2012). 5034–5040 (1993). (2012). 112. Ordovás, L. et al. Efficient recombinase-mediated 136. Groth, A. C., Olivares, E. C., Thyagarajan, B. & 160. Liu, P.‑Q. et al. Generation of a triple-gene knockout cassette exchange in hPSCs to study the hepatocyte Calos, M. P. A phage integrase directs efficient site- mammalian cell line using engineered zinc-finger lineage reveals AAVS1 locus-mediated transgene specific integration in human cells. Proc. Natl Acad. nucleases. Biotechnol. Bioeng. 106, 97–105 inhibition. Stem Cell Rep. 5, 918–931 (2015). Sci. USA 97, 5995–6000 (2000). (2010). 113. Voziyanova, E. et al. Efficient Flp-Int HK022 dual 137. Bibikova, M. et al. Stimulation of homologous 161. Wang, J. et al. Homology-driven genome editing in RMCE in mammalian cells. Nucleic Acids Res. 41, recombination through targeted cleavage by chimeric hematopoietic stem and progenitor cells using ZFN e125 (2013). nucleases. Mol. Cell. Biol. 21, 289–297 (2001). mRNA and AAV6 donors. Nat. Biotechnol. 33, 114. Osterwalder, M. et al. Dual RMCE for efficient 138. Rouet, P., Smih, F. & Jasin, M. Expression of a site- 1256–1263 (2015). re‑engineering of mouse mutant alleles. Nat. Methods specific endonuclease stimulates homologous 162. Hoban, M. D. et al. Correction of the sickle cell disease 7, 893–895 (2010). recombination in mammalian cells. Proc. Natl Acad. mutation in human hematopoietic stem/progenitor 115. Zhang, X., Koolhaas, W. H. & Schnorrer, F. A versatile Sci. USA 91, 6064–6068 (1994). cells. Blood 125, 2597–2604 (2015). two-step CRISPR- and RMCE-based strategy for 139. Liu, P., Jenkins, N. A. & Copeland, N. G. A highly 163. Karakikes, I. et al. A comprehensive TALEN-based efficient genome engineering in Drosophila. G3 efficient recombineering-based method for generating knockout library for generating human-induced (Bethesda) 4, 2409–2418 (2014). conditional knockout mutations. Genome Res. 13, pluripotent stem cell–based models for 116. Bosley, K. S. et al. CRISPR germline engineering — the 476–484 (2003). cardiovascular diseases. Circ. Res. 120, 1561–1571 community speaks. Nat. Biotechnol. 33, 478–486 140. Testa, G. et al. Engineering the mouse genome with (2017). (2015). bacterial artificial chromosomes to create multipurpose 164. Donoho, G., Jasin, M. & Berg, P. Analysis of gene 117. Baltimore, D. et al. Biotechnology. A prudent path alleles. Nat. Biotechnol. 21, 443–447 (2003). targeting and intrachromosomal homologous forward for genomic engineering and germline gene 141. Epinat, J.‑C. et al. A novel engineered meganuclease recombination stimulated by genomic double-strand modification. Science 348, 36–38 (2015). induces homologous recombination in yeast and breaks in mouse embryonic stem cells. Mol. Cell. Biol. 118. Church, G. Perspective: encourage the innovators. mammalian cells. Nucleic Acids Res. 31, 2952–2962 18, 4070–4078 (1998). Nature 528, S7 (2015). (2003). 165. Lin, J. et al. Creating a monomeric endonuclease 119. Doudna, J. Perspective: embryo editing needs scrutiny. 142. Cermak, T. et al. Efficient design and assembly of TALE‑I‑SceI with high specificity and low genotoxicity Nature 528, S6 (2015). custom TALEN and other TAL effector-based in human cells. Nucleic Acids Res. 43, 1112–1122 120. Schandera, J. & Mackey, T. K. Mitochondrial constructs for DNA targeting. Nucleic Acids Res. 39, (2015). replacement techniques: divergence in global policy. e82 (2011). 166. Cai, D., Cohen, K. B., Luo, T., Lichtman, J. W. & Trends Genet. 32, 385–390 (2016). 143. Mercer, A. C., Gaj, T., Fuller, R. P. & Barbas, C. F. Sanes, J. R. Improved tools for the Brainbow toolbox. 121. Li, L. & Blankenstein, T. Generation of transgenic mice Chimeric TALE recombinases with programmable DNA Nat. Methods 10, 540–547 (2013). with megabase-sized human yeast artificial sequence specificity. Nucleic Acids Res. 40, 167. Ding, Q. et al. Enhanced efficiency of human chromosomes by yeast spheroplast-embryonic stem 11163–11172 (2012). pluripotent stem cell genome editing through cell fusion. Nat. Protoc. 8, 1567–1582 (2013). 144. Gaj, T., Mercer, A. C., Sirk, S. J., Smith, H. L. & replacing TALENs with CRISPRs. Cell Stem Cell 12, 122. Tomizuka, K. et al. Double trans-chromosomic mice: Barbas, C. F. A comprehensive approach to zinc-finger 393–394 (2013). maintenance of two individual human chromosome recombinase customization enables genomic targeting fragments containing Ig heavy and kappa loci and in human cells. Nucleic Acids Res. 41, 3937–3946 Acknowledgements expression of fully human antibodies. Proc. Natl Acad. (2013). This work was supported by NIH grant RM1HG008525 Sci. USA 97, 722–727 (2000). 145. Owens, J. B. et al. Transcription activator like effector “Center for Genomically Engineered Organs”. 123. DiCarlo, J. E. et al. Genome engineering in (TALE)-directed piggyBac transposition in human cells. Saccharomyces cerevisiae using CRISPR-Cas systems. Nucleic Acids Res. 41, 9197–9207 (2013). Competing interests statement Nucleic Acids Res. 41, 4336–4343 (2013). 146. Tsai, S. Q. et al. Dimeric CRISPR RNA-guided FokI The authors declare competing interests: see Web version for 124. EauClaire, S. F., Zhang, J., Rivera, C. G. & Huang, L. L. nucleases for highly specific genome editing. Nat. details. Combinatorial metabolic pathway assembly in the Biotechnol. 32, 569–576 (2014). yeast genome with RNA-guided Cas9. J. Ind. 147. Guilinger, J. P., Thompson, D. B. & Liu, D. R. Fusion of Publisher’s note Microbiol. Biotechnol. 43, 1001–1015 (2016). catalytically inactive Cas9 to FokI nuclease improves Springer Nature remains neutral with regard to jurisdictional 125. Wang, S.‑Z., Liu, B.‑H., Tao, H. W., Xia, K. & Zhang, L. I. the specificity of genome modification. Nat. claims in published maps and institutional affiliations. A genetic strategy for stochastic gene activation with Biotechnol. 32, 577–582 (2014). regulated sparseness (STARS). PLoS ONE 4, e4200 148. Pettersen, E. F. et al. UCSF Chimera — a visualization (2009). system for exploratory research and analysis. FURTHER INFORMATION 126. Gouble, A. et al. Efficient in toto targeted J. Comput. Chem. 25, 1605–1612 (2004). Church lab CRISPR resources: http://crispr.med.harvard.edu recombination in mouse liver by meganuclease- 149. Zhang, J., McCabe, K. A. & Bell, C. E. Crystal Addgene CRISPR–Cas9 and Resources: http:// induced double-strand break. J. Gene Med. 8, structures of λ exonuclease in complex with DNA www.addgene.org/crispr/ 616–622 (2006). suggest an electrostatic ratchet mechanism for Genome Project–write: http://engineeringbiologycenter.org/ 127. Gellhaus, K., Cornu, T. I., Heilbronn, R. & Cathomen, T. processivity. Proc. Natl Acad. Sci. USA. 108, ALL LINKS ARE ACTIVE IN THE ONLINE PDF Fate of recombinant adeno-associated viral vector 11872–11877 (2011).

12 | ADVANCE ONLINE PUBLICATION www.nature.com/nrg ©2017 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved. ©2017 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved.