<<

Downloaded from http://cshperspectives.cshlp.org/ on September 29, 2021 - Published by Cold Spring Harbor Laboratory Press

Synthetic DNA Synthesis and Assembly: Putting the Synthetic in

Randall A. Hughes1 and Andrew D. Ellington2

1Applied Research Laboratories, The University of Texas at Austin, Austin, Texas 78758 2Department of Molecular Biosciences, The University of Texas at Austin, Austin, Texas 78712 Corresponding authors: [email protected]

The chemical synthesis of DNA and their assembly into synthons, , circuits, and even entire genomes by synthesis methods has become an enabling technology for modern and enables the design, build, test, learn, and repeat cycle underpinning innovations in synthetic biology. In this perspective, we briefly review the techniques and technologies that enable the synthesis of DNA oligonucleotides and their assembly into larger DNA constructs with a focus on recent advancements that have sought to reduce synthesis cost and increase sequence fidelity. The development of lower-cost methods to produce high-quality synthetic DNA will allow for the exploration of larger biological hypotheses by lowering the cost of use and help to close the DNA read–write cost gap.

DNA neither cares nor knows. DNA just is. And we decades in both reading (sequencing) and writ- dance to its music. ing (synthesis) DNA sequences have made —Richard Dawkins marked changes in our ability to understand River Out of Eden: A Darwinian Life and engineer biological systems. These advance- t has been said that the 20th century was the ments have led to the development of ground- I“century of the atom” in which discoveries on breaking technologies for the design, assembly, the physical and chemical properties of the ele- and manipulation of DNA encoded genes, ma- ments led to breakthroughs such as atomic en- terials, circuits, and metabolic pathways, which ergy (and weaponry), medical diagnostics, are allowing for an ever greater manipulation of computers, and the microchip to name just a biological systems and even entire organisms. few. These advances had a dramatic effect on Thanks to next-generation sequencing our way of life and helped shape the promise (NGS) technologies capable of generating an and possibilities of science and technology. In estimated 15 petabases of sequence data per the early part of the 21st century, we are witness- year worldwide (Schatz and Phillippy 2012), ing what could very likely become known as the the current megagenomics era has led to the “century of DNA.” As the score to life’s intricate swelling of biological sequence repositories symphony, DNA provides the blueprint for bi- with DNA sequences isolated from every organ- ological function. Advances over the last few ism and environment imaginable. Associated

Editors: Daniel G. Gibson, Clyde A. Hutchison III, Hamilton O. Smith, and J. Additional Perspectives on Synthetic Biology available at www.cshperspectives.org Copyright # 2017 Cold Spring Harbor Laboratory Press; all rights reserved; doi: 10.1101/cshperspect.a023812 Cite this article as Cold Spring Harb Perspect Biol 2017;9:a023812

1 Downloaded from http://cshperspectives.cshlp.org/ on September 29, 2021 - Published by Cold Spring Harbor Laboratory Press

R.A. Hughes and A.D. Ellington

improvements in bioinformatics techniques Synthetic biology is emerging as an impor- and software allow researchers to obtain, ana- tant discipline with the potential to impact a lyze, and manipulate these DNA sequences in number of academic and industrial applications ways easier than ever. The ever-increasing avail- including the creation of novel therapeutics, ma- ability of biological sequence information from terials, biosensing, and manufacturing capabil- all branches of the “tree of life” has deepened ities. Although our current understanding of our understanding of biological systems and the biological systems is vast, it is still far from com- interrelated nature of organisms at the genetic plete. Adapting exogenous DNA sequences and level. NGS technologies have led to a deeper the functional components they encode that understanding of human diseases, the micro- have naturallyevolved within a networkof inter- biome, and the genetic diversity of organisms related sequences remains a principle challenge in our environment. This sequence boom is also of engineering biology. Currently, the engineer- allowing for the expansion of scientific disci- ing of biological systems requires a heavy dose plines such as metabolic engineering and syn- of empirical trial and error to evaluate novel thetic biology as researchers seek to use novel enzymes, expression systems, and pathways for sequences in the manipulation of biological sys- the desired function. Typically, this process is tems for anthropocentric means. In addition, accomplished by first designing a desired syn- this wealth of information is leading to the de- thetic biological circuit or pathway using com- velopment of a variety of diagnostics and ther- puter-aided design tools (Fig. 1). Next, the re- apeutics, which will contribute to the long-term sulting construct DNA is divided into smaller improvement of human health. overlapping pieces (typically 200–1500-bp seg- This biological sequence data bonanza has ments) that are easier to synthesize. These DNA been aided by a stream of innovations in in- components are then synthesized from a set of strumentation and techniques for generating overlapping single-stranded oligonucleotides sequencing data with high fidelity, increased in-house (or by commercial vendors). The re- throughput, and decreased cost, which has con- sulting overlapping synthons are then assem- tributed to making NGS a go-to technology bled into larger pieces of DNA and cloned into for many applications in biology. The ability to an expression vector and the sequence of the generate large DNA sequence data sets has al- resulting construct is then verified. The se- lowed researchers to create and test large scoped quence-verified constructs are then transformed biological hypotheses using NGS technologies into a and assayed for function. Depending that were not possible before their development. on the results, changes to the construct design Beyond analyzing biological systems at the DNA can be made and further iterations of the test sequence level, the need to construct DNA from cycle repeated. This design, build, test, learn, designed sequences has driven the parallel de- and repeat process has become the backbone velopment of methods and instrumentation to of synthetic biology (Fig. 1), which in turn has produce synthetic DNA at scale to enable the put a premium on automated processes and testing of engineered biological components. methods that can shorten the development cycle DNA reading by NGS when combined with and increase throughput. modern DNA synthesis technologies form the One of the attributes that makes biological two foundational technologies driving synthet- systems attractive from an engineering perspec- ic biology efforts and will eventually instill the tive is the fact that biological functions are en- predictability and reliability to engineered bio- coded to a large part in DNA. Therefore, a gross logical systems that chemical engineering has simplification of biological engineering can be brought to chemical systems. This being said, reduced to the design, production, and testing our ability to sequence DNA is currently better of DNA sequences. As researchers seek to engi- than our ability to synthesize DNA de novo, neer biological systems with novel DNA se- although new technologies are helping to close quences, the need for custom synthetic DNA the gap. sequences has grown. This is particularly true

2 Cite this article as Cold Spring Harb Perspect Biol 2017;9:a023812 Downloaded from http://cshperspectives.cshlp.org/ on September 29, 2021 - Published by Cold Spring Harbor Laboratory Press

Synthetic DNA Synthesis and Assembly

T A CGTACCGA TTGCTA CGT A T GCATGGCTAACGAT GCA

Sequence manipulation

A G C T

Synthesis Build Modify/repeat

Design Oligos Synthons Devices

Assemble and sequence-verify

Test

Functional assay

Transform

Figure 1. The synthetic biology test cycle. (From top, clockwise) Synthetic DNA constructs are designed and manipulated using computer-aided design software. The designed DNA is then divided into synthesizable pieces (synthons) up to 1–1.5 kbp. The synthons are then broken up into overlapping single-stranded sequences and chemically synthesized. The oligonucleotides are then assembled together into the designed synthons using gene synthesis techniques. If necessary, multiple synthons can be assembled together into larger DNA assemblies or devices. The assembled are then typically cloned into an expression vector and sequence-verified. Once verified, the synthetic constructs are transformed into a cell and the function of the synthetic construct is assayed. Depending on the results the constructs can then be modified or refined and the test cycle is repeated until a DNA construct is obtained that produces the desired function.

when the sequences to be engineered are derived are made. The cost of oligonucleotide synthesis from metagenomic sequences in which no or- has not decreased appreciably in more than a ganism may be available from which to isolate decade, generally ranging from $0.05 to $0.17 the DNA via other methods. The synthesis of per base depending on the synthesis scale, the synthetic DNA is often referred to generically length of the oligonucleotide, and the supplier as “gene synthesis,” which specifically is the (Kosuri and Church 2014). Traditionally, this synthesis of gene-length pieces of DNA (250– cost floor has been carried through to the pro- 2000 bp) directly from single-stranded synthet- duction of gene synthesis products. These syn- ic DNA oligonucleotides. Unfortunately, where- thetic genes currently range in cost from $0.10 as the cost of sequencing has decreased precip- to$0.30 per bp ($100–$300fora1-kb gene)from itously over time, the cost of gene synthesis and traditional commercial suppliers, although com- oligonucleotide synthesis in general has not panies exploiting newer lower-cost synthesis kept pace, although technological innovation methods are starting to bring the cost down. and market forces are progressively lowering Lowering the cost of gene synthesis would the cost of synthetic DNA. The cost of gene syn- enable the generation of larger data sets by mak- thesis is typically directly tied to the cost of ol- ing the cost of gene construction less expensive, igonucleotide synthesis from which the genes meaning that more constructs could be gener-

Cite this article as Cold Spring Harb Perspect Biol 2017;9:a023812 3 Downloaded from http://cshperspectives.cshlp.org/ on September 29, 2021 - Published by Cold Spring Harbor Laboratory Press

R.A. Hughes and A.D. Ellington

ated for the same cost investment. This would synthesisthat is used to this day for the commer- allow researchers the ability to sample a greater cial synthesis of oligonucleotides. Synthetic oli- amount of the design landscape. With a great- gonucleotides for gene synthesis applications er understanding of what does and does not are generally synthesized using variations of work in a given construct design, a set of general the phosphoramidite chemistry methods either design rules could be created that would im- on traditional column-based synthesizers or prove the success of the design process and even- microarray-based synthesizers that will be dis- tually shorten the design–build–test–learn cussed in the following sections. process. Because the major cost of gene synthe- sis is the reagents that are needed for oligo- COLUMN-BASED OLIGONUCLEOTIDE synthesis, approaches that reduce SYNTHESIS reagent consumption, improve the robustness and accuracy of the gene assembly process, The commonly used phosphoramidite synthe- and enable increased throughput have become sis chemistry consists of a four-step chain elon- valuable tools for advancing the usage of syn- gation cycle that adds one base per cycle onto thetic DNA by allowing for lower cost of use. In a growing oligonucleotide chain attached to a this perspective, a brief overview of the methods solid support matrix (Fig. 2). In the first step, and technologies that have contributed to the a dimethoxytrityl (DMT)-protected nucleo- production of synthetic DNAvia gene synthesis side phosphoramidite that is attached to a solid methods will be highlighted as well as some of support (usually contained within a synthe- the challenges that have had to be overcome to sis column) is deprotected by the addition of reduce the cost, increase the throughput, and trichloroacetic acid. This activates the sup- ensure the fidelity of synthetic DNA. port-attached phosphoramidite for chain elon- gation with the next phosphoramidite mono- mer. In the second step, the next base in the OLIGONUCLEOTIDE SYNTHESIS sequence is added in the form of a DMT-pro- Modern de novo gene synthesis techniques can tected phosphoramidite and is coupled to the trace their beginnings to the mid-1960s and Go- 50-hydroxyl group of the previous nucleoside bind Khorana and colleagues’ effortsto decipher phosphoramidite in the sequence forming a the genetic code and to reconstitute biological phosphite triester. Third, any unreacted 50-hy- function synthetically (Scheuerbrandt et al. droxyl groups are capped by acylation to render 1963; Khorana et al. 1968). These early efforts any unextended sequences inert in subsequent focused on chemically synthesizing and ligating rounds of the chain elongation cycle and thus together17oligonucleotides, whichencodedthe reducing deletion errors in the finished oligo- gene for a 77-nucleotide alanine tRNA using nucleotide sequences. In the fourth step, the synthetic chemistries still in their infancy in an phosphite triester linkage between the mono- effort that took more than 5 years to complete mers is converted to a phosphate linkage via (Agarwal et al. 1970; Khorana et al. 1972). With- oxidation with an iodine solution to produce in 7 years of this work, the first reported synthet- a cyanoethyl-protected phosphate backbone. ic gene encoding the 14-amino-acid hormone, The synthesis cycle then repeats for the next somatostatin, was expressed in recombinant base in the sequence via the removal of the 50- (Itakura et al. 1977). Subsequent terminal DMT protecting group. After the de- improvements to oligonucleotide synthesis sired sequence has been synthesized from the chemistries and techniques in the early 1980s 30 to 50, the oligonucleotide is chemically cleaved (reviewed in Caruthers 2011; Roy and Caruthers from the solid synthesis support and the pro- 2013) led to the development of solid-phase tecting groups on the bases and the backbone phosphoramidite chemistries whose robustness are removed. This process is highly amenable to and fidelity allowed for automated methods to automation and forms the basis for oligonucle- be developed to enable scalable oligonucleotide otide synthesizers, which can synthesize 96–

4 Cite this article as Cold Spring Harb Perspect Biol 2017;9:a023812 Downloaded from http://cshperspectives.cshlp.org/ on September 29, 2021 - Published by Cold Spring Harbor Laboratory Press

Synthetic DNA Synthesis and Assembly

Cycle start

Repeat Step 1 Next cycle DMTO O Basen–1

Deprotection (or release) O

DMTO O Basen

HO O Basen–1 O O P CN O O O O Basen–1 Step 2

DMTO O Basen O Coupling

O O DMTO O Basen P CN Oxidation and O O Basen–1 capping O O P CN O Steps 3 and 4 N

Figure 2. Phosphoramidite-based synthesis of oligonucleotides. This synthesis process is the most commonly used for the synthesis of DNA oligonucleotides for gene synthesis.

1536 distinct oligonucleotides simultaneous- and Heiner 1985; Septak 1996; LeProust et al. ly (Fig. 3A) (Rayner et al. 1998; Cheng et al. 2010). These spurious abasic sites can reduce 2002). These synthesizers typically synthesize the yields of full-length oligonucleotides by oligonucleotides at scales ranging from 10 to promoting the cleavage of the oligonucleotide 1000 nmol with a cost ranging between $0.05 phosphate backbone during the removal of the and $0.17 per base. These oligonucleotides can remaining and backbone protecting generally be synthesized up to 100 nt with groups following the final synthesis cycle. An error rates at or below 0.5%. Column-based additional reduction of synthesis quality is coupling yields are generally quite efficient caused by the introduction of synthesis-related (typically .99% per coupling); however, the errors into the synthesized sequences, primarily yield of full-length oligonucleotides typically in the form of single-base deletions. The main decrease with increasing oligonucleotide length. source of this type of error is the incomplete Yields of full-length oligonucleotides can be removal of the DMT protecting group or the affected by spurious depurination of the syn- combined inefficiencies of the coupling and thesized oligomer, especially at adenosines dur- capping steps during the synthesis cycle. Al- ing the acid deprotection steps of the synthesis though complete removal of synthesis-related cycle and becomes especially problematic for errors will likely be impossible because of less longer oligonucleotide sequences (Efcavitch than 100% reaction efficiencies in a step-wise

Cite this article as Cold Spring Harb Perspect Biol 2017;9:a023812 5 Downloaded from http://cshperspectives.cshlp.org/ ..Hge n ..Ellington A.D. and Hughes R.A. 6 ABColumn-based oligonucleotide synthesis Microarray-based oligonucleotide synthesis

CPG synthesis columns CombiMatrix (OligoArray) microarray

AGTC

CMOS electrode (feature)

96-column synthesis plate onSeptember29,2021-PublishedbyColdSpringHarborLaboratoryPress Tens of Oligonucleotides thousands of are synthesized Oligonucleotides oligonucleotide Microarray attached to the CPG beads sequences are CPG surface in oligonucleotides (synthesis support) synthesized on the synthesis CPG the array surface ieti ril as article this Cite column (one sequence per Microarray feature) Cleave from array Cleave from column

odSrn abPrpc Biol Perspect Harb Spring Cold ++

Oligonucleotide pools Stocks of individual oligonucleotides (one per column) (mixture of oligonucleotide sequences)

Figure 3. Methods for solid-phase synthesis of oligonucleotides. (A) Column-based oligonucleotide synthesis. This is the traditional method of synthesizing DNA using solid-phase phosphoramidite chemistry. The synthesis of a unique oligo- nucleotide sequence is done on the surface of controlled-porosity glass beads (CPG) contained with a synthesis column. During the synthesis process, the reagents flow through the column and across the packed CPG matrix and the oligonu- cleotide “grows” off from the bead surface. Only one sequence can be synthesized per column, although high-throughput synthesizers exist that can synthesize on multiple columns at once. A 96-column synthesis plate is shown as an example. (B) 2017;9:a023812 Microarray-based oligonucleotide synthesis. In this method, microarray chips containing tens of thousands of distinct features synthesize unique oligonucleotide sequences at once with one unique oligonucleotide sequence synthesized per chip feature. On standard arrays, there are no physical barriers between features, so following cleavage of the synthesized oligonucleotides from the chip surface the end product is a pool of sequences containing every oligonucleotide synthesized on the array. Subsequent processing steps are required to “fish” the desired oligonucleotide sequences out of the synthesis pool for subsequent gene synthesis. The OligoArray CMOS microarray chip is shown as an example, although a handful of different array formats exist for oligonucleotide synthesis. Downloaded from http://cshperspectives.cshlp.org/ on September 29, 2021 - Published by Cold Spring Harbor Laboratory Press

Synthetic DNA Synthesis and Assembly

multiple reaction synthesis, recent improve- trollers or photomasking. In particular, these ments in the synthesis process and the chemis- two synthesis technologies currently provide tries used will likely lead to gains in the length the best combination of cost, oligonucleotide and quality of oligonucleotides synthesized in synthesis length, and accuracy, which has led the not-too-distant future (LeProust et al. to their use in several recent gene synthesis ap- 2010). For gene synthesis applications, because plications (Cleary et al. 2004; Warneret al. 2010; every oligonucleotide synthesized is made on Patwardhan et al. 2012). Using microarray- individual synthesis columns the oligonu- based synthesizers, oligonucleotides are typical- cleotides necessary to assemble a gene must be ly synthesized at femtomolar synthesis scales, added together into an assembly pool postsyn- that is, typically two to four orders of magni- thesis, which inherently reduces the oligonucle- tude lower than that used in column-based otide synthesis pool complexity because only synthesis. This lower synthesis scale leads to a the oligonucleotides needed for a given synthe- concomitant decrease in the quantity of re- sis are included in the assembly mixture. agents needed for the synthesis. The reduction of reagent use in the synthesis process leads to dramatically lower costs. In addition to the re- MICROARRAY-BASED OLIGONUCLEOTIDE duced synthesis scale, chip-based oligonucleo- SYNTHESIS tide synthesizers offer levels of multiplexing not An alternative to traditional column-based oli- possible on traditional column-based synthe- gonucleotide synthesis has emerged from the sizers that allows for a greater number of unique use of microarray oligonucleotide synthesis oligonucleotide sequences to be synthesized in a (Fig. 3B). These technologies, which were orig- given synthesis run. With possible feature den- inally developed for producing oligonucleotide sities in the tens of thousands per chip and the microarrays for diagnostic applications, have capability for some synthesizers to synthesize emerged as a promising low-cost alternative to multiple chips simultaneously, the ability of column-based oligonucleotide synthesis. Early microarray-based synthesis methods far out- versions of these synthesizers synthesized the strip the capacity of even the largest column- oligonucleotides attached to a microchip sur- based instruments. The combined reduced-syn- face using a modification of the phosphorami- thesis scale and massive multiplexing capability dite synthesis process. Affymetrix was one of the means that oligonucleotide synthesis costs from early pioneers in this field and developed light- array-based platforms can range from $0.00001 activated chemistries to control the spatially to $0.0001 per nucleotide depending on the separated synthesis of oligonucleotides on the platform, oligonucleotide length and synthesis chip surface using standard mask-based photo- scale. When compared with the $0.05–$0.10 lithography techniques to selectively deprotect cost per base attainable for column-synthesized special photolabile nucleoside phosphorami- oligonucleotides, the appeal of chip-synthe- dites (Fodor et al. 1991; Pease et al. 1994). Com- sized oligonucleotides for gene synthesis appli- panies such as NimbleGen and LC Sciences cations becomes obvious. further simplified the light-directed synthesis Although array-based platforms offer supe- procedures by eliminating the photolithogra- rior synthesis capabilities in terms of multiplex- phy masks by using programmable micromir- ing and cost there are some challenges with ror devices to precisely control the light-based using these platforms for making oligonucleo- chemistries (Singh-Gasson et al. 1999; Gao et al. tides for gene synthesis applications. One prob- 2001). Alternatively, ink-jet printing-based lem is that planar array synthesized oligonucle- technologies developed by Agilent and semi- otides tend to be relatively low quality, which conductor-based chip technologies by Combi- leads to more synthesis-related errors when Matrix (CustomArray) allow for the use of compared with column-synthesized oligonu- standard phosphoramidites and reagents with- cleotides (Kosuri and Church 2014; Wan et al. out the need for expensive micromirror con- 2014). Despite this, recent improvements in

Cite this article as Cold Spring Harb Perspect Biol 2017;9:a023812 7 Downloaded from http://cshperspectives.cshlp.org/ on September 29, 2021 - Published by Cold Spring Harbor Laboratory Press

R.A. Hughes and A.D. Ellington

chip design and refinements to the synthesis DNA, individual phosphoramidite monomers process have led to an increase in the quality of are combined together to create individual array-synthesized oligonucleotides. One such oligonucleotides 60–100 nt in length. To source of reduced synthesis quality is the depu- facilitate the assembly of a synthetic double- rination of oligonucleotide sequences on chips, stranded DNA (synthon) from single-stranded which can occur spontaneously because of pro- oligonucleotides, adjacent oligonucleotides are longed exposure of the nascent oligonucleotides designed to contain overlapping sequences to the deprotecting reagents during the synthe- between the oligonucleotides encoded on the sis cycle. Optimizations to the reaction condi- opposing strands of the DNA duplex and are tions and the flow of reagents to the chip during assembled together during the gene synthesis the synthesis cycle has been shown to effectively process to make double-stranded synthons improve the quality of the oligonucleotides syn- from 200 to 2000 bp in length. Sequence-veri- thesized on chips by reducing depurination and fied synthons can then be assembled together increasing the overall synthesis yields, which by numerous methods to make larger synthetic in turn enables the synthesis of oligonucleotides DNA constructs encoding entire metabolic up to 200 nts in length (LeProust et al. 2010). pathways or genomes. Another source of reduced synthesis quality Early oligonucleotide assembly methods re- with chip produced oligonucleotides is because lied on the ligation of adjacent oligonucleotides of “edge effects,” which is caused by misalign- via the use of a DNA ligase (Agarwal et al. 1970; ment of the reagent droplets with the selected Sekiya et al. 1976, 1979). Initially, the ligation of chip feature, or inaccurate selective deprotec- adjacent oligonucleotides was done sequen- tion of the proper synthesis features caused by tially, but as assembly methods and the quality improper reagent sequestration or light beam of the synthetic oligonucleotides improved and drifts in synthesizers using light-activated syn- the use of thermostable ligases were used to im- thesis chemistries. Edge effects reduce oligo prove the stringency of the assembly reaction quality by reducing the selective control of the the one-pot assembly of DNA synthons via li- synthesis reactions in adjacent oligonucleotides, gation became possible (Au et al. 1998; Xiong which can lead to substitution or deletion errors et al. 2008b). Shortly after the discovery of the in oligonucleotides synthesized on adjacent fea- polymerase chain reaction (PCR) in the 1980s tures. Work to reduce edge effects during the ligation-free methods for using synthetic single- synthesis process has shown that refinements stranded oligonucleotides were developed to to the chip design can improve the fidelity of make completely synthetic pieces of DNAwith- array-synthesized oligonucleotides to rival col- out the need for a naturally derived DNA tem- umn-synthesized oligonucleotides (Saaem et al. plate (Dillon and Rosen 1990; Jayaraman et al. 2010). Continued improvement in array design 1991; Stemmer et al. 1995). Since this time, doz- and refinements to synthesis reagents and pro- ens of different methods have been created to cesses used will lead to routine robust synthesis assemble double-stranded DNA from single- of high-quality long oligonucleotides from ar- stranded oligonucleotides via PCR-like meth- rays, which will further their use as the go to ods. A summary of many of these methods source for low-cost oligonucleotides for gene has been reviewed in the literature (Xiong synthesis applications. et al. 2008a; Czar et al. 2009; Hughes et al. 2011; Ma et al. 2012) and will not be reviewed extensively here. All of these methods use sin- GENE SYNTHESIS FROM gle-stranded synthetic oligonucleotides with OLIGONUCLEOTIDES complementary overlapping sequences between Because DNA is a polymer made up of four adjacent oligonucleotides to assemble double- different nucleotide monomers, gene synthesis stranded DNA synthons using a thermostable and DNA assembly methods are in effect a form DNA polymerase and PCR. The only differences of hierarchical polymer synthesis. For synthetic between the myriad of PCR-based DNA assem-

8 Cite this article as Cold Spring Harb Perspect Biol 2017;9:a023812 Downloaded from http://cshperspectives.cshlp.org/ on September 29, 2021 - Published by Cold Spring Harbor Laboratory Press

Synthetic DNA Synthesis and Assembly

bly methods is in how the substituent oligonu- synthesis required to synthesize a given se- cleotides are designed to be assembled together quence. Once designed and synthesized, the and the reaction conditions under which they substituent oligonucleotides for an assembly are assembled, the end product of all of these are pooled together in equimolar concentra- methods is the same, a double-stranded DNA tions and cycled in a one-pot “assembly” reac- (dsDNA) synthon. Of these assembly methods, tion in which adjacent oligonucleotides are ran- variations of the polymerase cycle assembly domly extended in a nonexponential manner by (PCA) method is by far the most commonly a DNA polymerase to produce a mixture of oli- used to assemble DNA synthons from oligonu- gonucleotide extension products of various cleotides (Fig. 4) (Stemmer et al. 1995). Using lengths. This mixture is then used as a template PCA, a dsDNA sequence is divided into oli- to seed a second PCR reaction, in which the gonucleotide sequences (typically 60–80 nt), desired “full-length” product is amplified from which encode both strands of the DNA duplex the assembly mixture in the presence of an ex- with overlaps between adjacent oligonucleo- cess of the outermost assembly primers. In ad- tides that range from 15 to 25 nt in length. Typ- dition to ligation and PCA-based assembly ically, adjacent oligonucleotides are designed methods, Gibson and colleagues, as part of their with gaps between the forward and the reverse pioneering work in , have overlapping regions of the assembly oligonucle- developed both in vitro (Gibson et al. 2009; otides to reduce the amount of oligonucleotide Gibson 2011a), and in vivo (Gibson 2009,

DNA duplex sequence to be synthesized

Overlapping oligonucleotides to assemble DNA duplex

Progressive one-pot overlap extension assembly

Amplification of full-length assembled product

Figure 4. Polymerase chain assembly. Overlapping oligonucleotides encoding a DNA duplex are assembled together via progressive overlap extension assembly in a one-pot reaction. Following assembly the full-length assembled synthon is amplified out of the assembly mixture by polymerase chain reaction (PCR) with the outermost primers.

Cite this article as Cold Spring Harb Perspect Biol 2017;9:a023812 9 Downloaded from http://cshperspectives.cshlp.org/ on September 29, 2021 - Published by Cold Spring Harbor Laboratory Press

R.A. Hughes and A.D. Ellington

2011b, 2012) one-pot, single-step methods for Last, as mentioned previously, array-synthe- simultaneously assembling DNA synthons from sized oligonucleotides are in general lower qual- oligonucleotides and cloning them directly into ity as compared with their column-synthesized . In some variation, these three assem- counterparts, which manifests itself as sequence bly methods are used in most commercial and errors in the synthetic DNA assembled from academic gene synthesis applications reported them. So, in general, the use of array-synthe- to date. sized oligonucleotides requires the use of some sort of error reduction strategy to reduce the work required to obtain sequence-verified GENE SYNTHESIS FROM ARRAY-DERIVED constructs. OLIGONUCLEOTIDE POOLS To overcome these limitations, a couple of Over the last decade, the desire for cheaper different methods have been developed to sources of synthetic DNA has fueled an interest increase the effective oligonucleotide pool con- in array-synthesized DNA as a way to produce centration to ensure reliable assembly and re- oligonucleotides inexpensively. Beyond early producibility (Tian et al. 2004; Kosuri et al. demonstrations of gene synthesis from array 2010; Quan et al. 2011). Tian et al. (2004) ad- produced oligonucleotides (Richmond et al. dressed the concentration issue by including a 2004; Tian et al. 2004; Zhou et al. 2004), recent common priming sequence followed by a nick- innovations have made substantial improve- ing endonuclease recognition site on the 30-end ments to the quality, efficiency, and scalability of every oligonucleotide synthesized on the ar- of microarray-based oligonucleotide synthesis ray (Fig. 5A). The array-synthesized oligonucle- and gene assembly. To use microarray-synthe- otides serve as the template strand for a primer sized DNA for gene synthesis, several hurdles extension reaction using the common priming have had to be overcome to make this source sequence incorporated into the array-attached of DNA an attractive alternative to oligonucle- oligonucleotide pool. Following primer exten- otides synthesized by conventional column- sion, a nicking endonuclease included in the based approaches. Some of the properties that reaction mixture removes the common primer make array-synthesized oligonucleotides attrac- sequences to release a single-stranded DNA oli- tive as a low-cost source of DNA for gene syn- gonucleotide that is complementary to each ol- thesis present challenges for actually using oli- igonucleotide synthesized on the array. This gonucleotides synthesized in this way for gene process amplifies the array-synthesized oligonu- synthesis. The reduced synthesis scale of array- cleotide pool to increase the concentration of produced oligonucleotides is attractive from a the oligonucleotides to levels suitable for gene cost perspective because of the concomitant re- synthesis. In this method, because the array-syn- duction in reagent consumption needed to syn- thesized oligonucleotides remain attached to thesize a given DNA sequence. However, the the array, spatial control of oligonucleotide as- amount of DNA produced (fmol) is generally sembly can occur on chip. Toexploit this feature much lessthanwhat is needed (pmol) to reliably and to improve the multiplexing capabilities of assemble them into larger synthons. Similarly, chip-based oligonucleotide synthesis, Quan the massive multiplexing capabilities of array- et al. (2011) used a custom ink-jet synthesizer based oligonucleotide synthesis means that that synthesized oligonucleotides on specialized tens of thousands of unique oligonucleotide chips. Within these specialized chips, micro- sequences can be synthesized simultaneously compartmentalized assembly wells can be de- on the array surface. However, the sequence signed into the synthesis array with the oligonu- complexity of the oligonucleotide pools pro- cleotides necessary to assemble a unique syn- duced on the array can lead to problems with thon, synthesized, amplified, and assembled assembling them into larger constructs due within individual microwells (Fig. 5A). Using to the interference between oligonucleotides such a method effectively reduces the sequence that contain even modest sequence homology. complexity of the localized oligonucleotide

10 Cite this article as Cold Spring Harb Perspect Biol 2017;9:a023812 Downloaded from http://cshperspectives.cshlp.org/ ieti ril as article this Cite A On-chip DNA amplification and assembly B Off-chip DNA amplification and assembly

Microarray with microwells

Microarray oligos odSrn abPrpc Biol Perspect Harb Spring Cold = DNA primer = DNA nicking enzyme Cleave from array

= DNA polymerase onSeptember29,2021-PublishedbyColdSpringHarborLaboratoryPress Assembly

PCR-amplify subpools Pool primers

Amplification/ Oligonucleotide subpools Oligonucleotide primer removal 2017;9:a023812

Remove primer sequences

Processed oligo pools ytei N ytei n Assembly and Synthesis DNA Synthetic Microwell Assemble DNA via PCA On-chip amplification of oligonucleotides

Assembled DNA

Figure 5. Gene synthesis from microarray-synthesized oligonucleotides. (A) Synthons can be assembled using on-chip synthesis and assembly by including a single priming site into the 30-end of every oligonucleotide synthesized on the microarray. The oligonucleotides can then be amplified within microwells designed into the array by incubating with a common primer and a DNA polymerase. The primer sequence is removed from the assembly oligonucleotides using a nicking endonuclease, freeing the oligonucleotides to be assembled together via polymerase chain assembly within the same well. (B) Synthons can also be synthesized off-chip by first cleaving the oligonucleotide pools from the array. The cleaved synthesis oligonucleotide pool can be amplified into subpools by PCR using priming sites incorporated into both 11 ends of the synthesized oligonucleotide. The amplified subpools can then be further subdivided into assembly oligonu- cleotide pools by additional unique priming sites included in the oligonucleotide flanking sequences. Following segrega- tion by amplification the priming sequences are removed from the assembly oligonucleotides by digestion. The processed oligonucleotides can then be assembled together by polymerase cycle assembly. Downloaded from http://cshperspectives.cshlp.org/ on September 29, 2021 - Published by Cold Spring Harbor Laboratory Press

R.A. Hughes and A.D. Ellington

pool, which in turn increases the robustness of oligonucleotides is dealing with sequence errors assembly for gene synthesis while also allowing introduced into the sequences by synthesis for synthesis multiplexing that can occur in each by-products and the assembly process itself. of the many microwells designed into the chip. Following assembly, a synthetic DNA can be Another method first introduced by Kosuri thought of as a population of sequences con- et al. (2010) solves the oligonucleotide concen- taining a mixture of correct and incorrect se- tration and pool complexity problems by in- quences. Traditionally, the synthetic DNAwould cluding “bar-coded” priming sequences into be cloned into a vector and individual clones each of the oligonucleotides synthesized on the isolated and sequenced by Sanger sequencing. array (Fig. 5B). In this method, the oligo- However, the probability of any given sequence required to synthesize a desired syn- containing an error increases with the length of thon is flanked on both ends by sets of prede- DNA, which in turn means that more clones fined primer sequences. The outermost priming must be sequenced to obtain a completely cor- sequences allow for amplification of the array- rect clone. This procedure is time-consuming synthesized oligonucleotide pool, which has and the setup work required adds significantly been cleaved from the chip surface following to the cost of the DNA being produced. To re- synthesis. This step allows for one set of unique duce the number of clones necessary to obtain a primers to amplify the entire oligonucleotide correct sequence, some postsynthesis method pool such that a portion of the pool can be must be applied to the assembled DNA to sieve used for subsequent assembly or archival pur- correct sequences from incorrect ones. As a re- poses. Inboard of the pool amplification prim- sult, methods have been developed that approx- ers is a set of flanking primer-binding sequences imate the error-correcting process that cells use that allow for selective amplification of assembly to maintain sequence fidelity during DNA rep- subpools from the synthesis pool. Each unique lication. In general, most methods for removing synthon would be given a unique set of flanking DNA sequence errors from synthetic DNA start subpool primer sequences such that the oligo- with the formation of a DNA heteroduplex. This nucleotides necessary to assemble any given is done by heating up the synthetic DNA to synthon would be selectively amplified from denature and disassociate the strands followed the synthesis pool for subsequent assembly. Fol- by cooling the sample to reanneal the strands lowing amplification of the subpool oligonucle- together. For any given position within a DNA otides, the priming sequences are removed by sequence, most of the duplexes in the synthetic restriction digestion of the amplified dsDNA mixture will contain the correct base at that po- oligonucleotides using a unique enzyme recog- sition with errors sprinkled throughout the nition sequence flanking the assembly oligonu- population. Because sequence errors occur ran- cleotide sequence. This oligonucleotide design domly within an assembled DNA sequence this scheme simultaneously solves the oligonucleo- denaturation and reannealing process leads to tide concentration and pool complexity prob- the formation of heteroduplexed DNA at posi- lems associated with array-derived oligonucle- tions that contain errors. Positions that contain otides and does not require specialized chips or mutations within these heteroduplexes can be array synthesizers. This method is also highly acted on by proteins, which specifically recog- amenable to automation as each subpool can nize sequence mutations in DNA. One such be assembled in unique wells in a multiwell plate group of methods relies on the sequence mis- format using PCA assembly techniques. match recognition capabilities of the MutS pro- tein to specifically bind to sequence mismatches in synthetic DNA duplexes. Selective binding of ERROR CORRECTION AND SEQUENCE MutS to error-containing DNA can be used to VALIDATION sieve error-free sequences from those that con- One of the chief challenges in the synthesis and tain errors (Carr et al. 2004; Binkowski et al. assembly of synthetic DNA from unpurified 2005; Wan et al. 2014). These methods usually

12 Cite this article as Cold Spring Harb Perspect Biol 2017;9:a023812 Downloaded from http://cshperspectives.cshlp.org/ on September 29, 2021 - Published by Cold Spring Harbor Laboratory Press

Synthetic DNA Synthesis and Assembly

immobilize MutS to a solid matrix material and sized DNA to pairwise assemble 2271 131– then purify column-bound (error-containing) 250 bp synthons from a single oligonucleotide DNA sequences from unbound material (er- pool and used bar-coded primers to selectively ror-reduced). Error-containing heteroduplex isolatesequence-verified individuals (Klein et al. DNA can also be sieved using enzymes that rec- 2016). All of these methods point to ways in ognize and cut the DNA duplex at the site of the which NextGen technologies can be used to im- base mismatch (Young and Dong 2004; Fuhr- prove the quality of synthetic DNA. In addition, mann et al. 2005; Kosuri et al. 2010; Saaem et al. because sequence-based approaches evaluate 2012; Dormitzer et al. 2013). The use of endo- collections of individual molecules of DNA, nuclease enzymes (or enzyme cocktails), which they are suitable for the sequence verification recognize and cleave DNA heteroduplexes at the of synthetic DNA, which may contain regions sites of mismatches, has been shown to be highly of sequence degeneracy such as libraries for di- effective at reducing synthesis-related errors in rected evolution. Currently, each of these meth- synthetic genes allowing for time and material ods are limited by the sequence read-length savings such that in some cases the treated genes capabilities of the NextGen instrument used can be used directly in functional assays without and by sequencing-related errors. However, as cloning and sequence verification (Dormitzer NextGen instrumentation and techniques con- et al. 2013). These methods are a relatively tinue to improve, these limitations will become straightforward and cost-effective approach for less significant and allow for the accurate verifi- sieving error-containing sequences from syn- cation of longer pieces of synthetic DNA. thetic DNA and can reduce the observed error rate from 1/100 bases to 1/10,000 bases (Dor- MAKE IT BIGGER: SYNTHESIS OF LONGER mitzer et al. 2013; Kosuri and Church 2014). DNA ASSEMBLIES An alternative to enzyme-mediated error correction techniques is the direct sequencing Recent advancements in oligonucleotide syn- of oligonucleotides and assembled synthons us- thesis and techniques for gene synthesis have ing NextGen sequencing techniques. Although largely focused on reducing costs, increasing more expensive on a per-sequence basis than throughput, and the quality of synthetic DNA. enzyme-based error correction techniques, Concomitant advances in assembling DNA into NextGen sequencing techniques offers tremen- longer and longer pieces have led to methods dous multiplexing capabilities in which thou- to construct large enzyme complexes (Kodu- sands of sequence reads can be taken on multiple mal et al. 2004), entire metabolic pathways oligonucleotides or synthons simultaneously. In (Temme et al. 2012), and even entire genomes one such application, Matzas et al. (2010) used (Smith et al. 2003; Gibson et al. 2008a; Hutch- Roche 454 sequencing combined with a bead- ison et al. 2016) synthetically. Methods to as- picking robot to selectively remove oligonucle- semble synthons have been developed to assem- otide sequences (attached to beads) from the ble synthons and larger DNA assemblies in both sequencing array that showed perfect sequences in vitro (Gibson et al. 2010) and in vivo (Gibson that were then used for gene synthesis. Kim et al. et al. 2008b). Although restriction enzyme- (2012) used random sequence tags to mark in- based cloning techniques have been the de ri- dividual synthetic sequences in a454 sequencing gueur choice for manipulating DNA constructs reaction and used these tags to selectively ampli- for a couple of decades and were the basis of fy correct sequences from the sequenced pool. In early BioBrick and similar assembly methods a similar approach, Schwartz et al. (2012) used (Shetty et al. 2008; Lee et al. 2011), the need Illumina sequencing barcodes to “dial-out” cor- to simplify the cloning/assembly process while rect sequences via selective amplification of the reducing the limitations on sequence design has sequence-verified bar-coded sequences. Recent- led to the development of scar-less restriction- ly, they have refined this approach to improve enzyme-free cloning and assembly techniques. the multiplexing capacity of microarray-synthe- Seam-less assembly and cloning methods avail-

Cite this article as Cold Spring Harb Perspect Biol 2017;9:a023812 13 Downloaded from http://cshperspectives.cshlp.org/ on September 29, 2021 - Published by Cold Spring Harbor Laboratory Press

R.A. Hughes and A.D. Ellington

able to the modern DNA jockey include Gibson Gibson assembly was used to assemble the assembly (Gibson et al. 2009; Gibson 2011a), 16.3-kb mouse mitochondrial genome directly Golden Gate assembly (Engler et al. 2008), se- from 60 mer oligonucleotides (Gibson et al. quence and ligation-independent cloning 2010). Efficient assembly of even larger synthet- (SLIC) (Li and Elledge 2007), ligation cycling ic DNA segments can also be performed in vivo reaction (de Kok et al. 2014), paper-clip assem- by using the capa- bly (Trubitsyna et al. 2014), yeast assembly bilities of the yeast Saccharomyces cerevisiae.In (Gibson et al. 2008b; Gibson 2011b), and circu- an example of the exceptional ability of yeast to lar polymerase extension cloning (CPEC) assemble exogenous DNA into larger assemblies (Quan and Tian 2009), to name just a few. from overlapping synthons or subassemblies, Which DNA assembly technique to use is large- researchers at the J. Craig Venter Institute have ly a matter of choice, and multiple approaches successfully used yeast to assemble multiple are often applied in parallel. Of these methods, 0.5–1 Mbp bacterial genomes (Gibson et al. Gibson assembly is probably the most com- 2008a; Hutchison et al. 2016) and even assem- monly used to assemble multiple pieces of bled synthons directly from overlapping oligo- DNA together into largerconstructs. This meth- nucleotides (Gibson 2009, 2011b, 2012). Each od uses a one-pot isothermal technique, which of the aforementioned assembly techniques uses an enzyme mixture containing a thermo- could be automated to further increase the stable DNA polymerase, DNA ligase, and exo- throughput for constructing larger synthetic nuclease to chew-back, anneal and repair adja- DNAs and enable the exploration and testing cent overlapping DNA sequences to assemble of large biological hypotheses. the desired construct. Recent innovations in the design of unique overlapping sequences to CONCLUDING REMARKS direct the assembly process has further expand- ed the usage of the Gibson assembly method for The development of automatable robust chem- combinatorial assemblyof large DNA sequences istries for chemical DNA synthesis over the last (Guye et al. 2013; Torella et al. 2014). 40 years has contributed to the advancement of One of the hallmarks of synthetic biology is our understanding of biology and has laid the the application of rational design principles to groundwork for the predictable engineering of the design and assembly of biological compo- biological systems. Synthetic DNA is central nents; however, because it is often difficult to to the development of methods to engineer bi- know a priori how well a given DNA construct ology and when combined with the massive will work once introduced into a cell, it is often amounts of sequence data being generated by necessary to try several versions of the construct NGS efforts will contribute to the advancement to find which one will work best. Therefore, a of synthetic biology toward applications hereto- greater emphasis on the modular design of DNA fore unimaginable. To date, there have been a parts enables the assembly of a greater variety of handful of moonshot demonstrations such as potential constructs through mix-and-match the complete synthesis of an entire yeast chro- combinatorial assembly of DNA components. mosome (Annaluru et al. 2014), an entire bacte- In addition to simplifying the overall assembly rial genome (Gibson et al. 2008a), and the sub- process, modular design and assembly of DNA sequent synthesis of a minimal bacterial genome components makes automation of the process (Hutchison et al. 2016), which illustrate the use possible, which can reduce the time, labor, and of synthetic DNA and the capabilities of existing cost of making and testing multiple constructs. gene synthesis methods to accomplish large- Most of the aforementioned assembly methods scale synthetic biology efforts. These examples, can be used for the assembly in vitro, and Gib- when combined with numerous projects in son assembly has been applied to the assembly which synthetic DNA has been used to evaluate of DNA segments multiple kilobases in length functional biological components (Salis et al. (Gibson et al. 2009). In another such example, 2009; Callura et al. 2012; Brophy and Voigt

14 Cite this article as Cold Spring Harb Perspect Biol 2017;9:a023812 Downloaded from http://cshperspectives.cshlp.org/ on September 29, 2021 - Published by Cold Spring Harbor Laboratory Press

Synthetic DNA Synthesis and Assembly

2016), create synthetic vaccines (Dormitzeret al. of the first minimal bacterial genome (Hutch- 2013), and construct synthetic genetic circuits ison et al. 2016) and a deep appreciation of the (Salis et al. 2009; Sowa et al. 2015; Nielsen et al. complexities and challenges of engineering bi- 2016; Rubens et al. 2016) and when applied to ology. Much of which still remains to be ex- wholesale genomic editing (Wang et al. 2009; plored. In this, the century of DNA, the ques- Doudna and Charpentier 2014), point to a fu- tion remains. What is next? ture where the nuances of biological function can in part be understood by the design, syn- thesis, and assay of interchangeable synthetic REFERENCES components akin to synthetic drug develop- Agarwal KL, Buchi H, Caruthers MH, Gupta N, Khorana ment. HG, Kleppe K, Kumar A, Ohtsuka E, Rajbhandary UL, Van de Sande JH, et al. 1970. Total synthesis of the gene To enable larger-scale engineering efforts for an alanine transfer ribonucleic acid from yeast. Nature and to democratize the use of synthetic biology 227: 27–34. techniques and design principles, recent inno- Annaluru N, Muller H, Mitchell LA, Ramalingam S, Strac- vations have sought to lower the cost of synthe- quadanio G, Richardson SM, Dymond JS, Kuang Z, Scheifele LZ, Cooper EM, et al. 2014. Total synthesis of sizing DNA oligonucleotides and their subse- a functional designer eukaryotic . Science quent assembly into synthetic constructs ready 344: 55–58. for use. The use of microarray/microfluidic ol- Au LC, Yang FY, Yang WJ, Lo SH, Kao CF. 1998. Gene syn- igonucleotide synthesis techniques, when com- thesis by an LCR-based approach: High-level production of leptin-l54 using synthetic gene in Escherichia coli. Bio- bined with automated gene synthesis protocols chem Biophys Res Commun 248: 200–203. and improvements in synthetic DNA quality, Binkowski BF, Richmond KE, Kaysen J, Sussman MR, have made great strides toward this goal. In re- Belshaw PJ. 2005. Correcting errors in synthetic DNA through consensus shuffling. Nucleic Acids Res 33: sponse to market pressures, commercialization e55. of these technologies and the continued devel- Brophy JA, Voigt CA. 2016. Antisense transcription as a tool opment of other efficient DNA synthesis tech- to tune gene expression. Mol Syst Biol 12: 854. niques will further enhance the availability of Callura JM, Cantor CR, Collins JJ. 2012. Genetic switch- increasingly inexpensive synthetic DNA. board for synthetic biology applications. Proc Natl Acad Sci 109: 5850–5855. Although the synthesis of relatively small Carr PA, Park JS, Lee YJ, Yu T, Zhang S, Jacobson JM. 2004. synthons (,1 kb) can often be straightforward Protein-mediated error correction for de novo DNA syn- and accomplished at the undergraduate level thesis. Nucleic Acids Res 32: e162. the assembly of larger pieces of synthetic DNA Caruthers MH. 2011. A brief review of DNA and RNA chemical synthesis. Biochem Soc Trans 39: 575–580. can still pose several technical challenges to even Cheng JY, Chen HH, Kao YS, Kao WC, Peck K. 2002. High the most experienced DNA jockey. Problems throughput parallel synthesis of oligonucleotides with created by a high degree of secondary structure 1536 channel synthesizer. Nucleic Acids Res 30: e93. in the target sequence, as well as repetitive DNA Cleary MA, Kilian K, Wang Y, Bradshaw J, Cavet G, Ge W, Kulkarni A, Paddison PJ, Chang K, Sheth N, et al. 2004. sequences, can pose challenges to their assem- Production of complex nucleic acid libraries using highly bly. Sequences that are unknowingly toxic to the parallel in situ oligonucleotide synthesis. Nat Methods 1: host cells used for intermediate manipulations 241–248. of the DNA (e.g., cloning for sequence verifica- Czar MJ, Anderson JC, Bader JS, Peccoud J. 2009. Gene synthesis demystified. Trends Biotechnol 27: 63–72. tion) or assay of synthetic DNA constructs can de Kok S, Stanton LH, Slaby T, Durot M, Holmes VF, Patel also pose problems for the successful assembly KG, Platt D, Shapland EB, Serber Z, Dean J, et al. 2014. and use of some synthetic DNA constructs. Rapid and reliable DNA assembly via ligase cycling reac- Here to, further improvements in synthetic con- tion. ACS Synth Biol 3: 97–106. Dillon PJ, Rosen CA. 1990. A rapid method for the construc- struct design (Tang and Chilkoti 2016) and in tion of synthetic genes using the polymerase chain reac- the DNA assembly process will likely enable the tion. BioTechniques 9: 298–300. successful synthesis of even the most difficult Dormitzer PR, Suphaphiphat P, Gibson DG, Wentworth target sequences in the not-to-distant future. DE, Stockwell TB, Algire MA, Alperovich N, Barro M, Brown DM, Craig S, et al. 2013. Synthetic generation of What started with a synthetic tRNA gene more influenza vaccine for rapid response to pandem- than 40 years ago has led to the recent synthesis ics. Sci Transl Med 5: 185ra168.

Cite this article as Cold Spring Harb Perspect Biol 2017;9:a023812 15 Downloaded from http://cshperspectives.cshlp.org/ on September 29, 2021 - Published by Cold Spring Harbor Laboratory Press

R.A. Hughes and A.D. Ellington

Doudna JA, Charpentier E. 2014. . The new chemically synthesized gene for the hormone somatosta- frontier of genome engineering with CRISPR-Cas9. Sci- tin. Science 198: 1056–1063. ence 346: 1258096. Jayaraman K, Fingar SA, Shah J, Fyles J. 1991. Polymerase Efcavitch JW, Heiner C. 1985. Depurination as a yield de- chain reaction–mediated gene synthesis: Synthesis of a creasing mechanism in oligodeoxynucleotide synthesis. gene coding for isozyme c of horseradish peroxidase. Proc Nucleosides Nucleotides 4: 267–267. Natl Acad Sci 88: 4084–4088. Engler C, Kandzia R, Marillonnet S. 2008. A one pot, one Khorana HG, Buchi H, Caruthers MH, Chang SH, Gupta step, precision cloning method with high throughput NK, Kumar A, Ohtsuka E, Sgaramella V, Weber H. 1968. capability. PLoS ONE 3: e3647. Progress in the total synthesis of the gene for ala-tRNA. Fodor SP, Read JL, Pirrung MC, Stryer L, Lu AT, Solas D. Cold Spring Harb Symp Quant Biol 33: 35–44. 1991. Light-directed, spatially addressable parallel chem- Khorana HG, Agarwal KL, Buchi H, Caruthers MH, Gupta ical synthesis. Science 251: 767–773. NK, Kleppe K, Kumar A, Otsuka E, RajBhandary UL, Van Fuhrmann M, Oertel W, Berthold P, Hegemann P. 2005. de Sande JH, et al. 1972. Studies on . 103. Removal of mismatched bases from synthetic genes by Totalsynthesis of the structural gene for an alanine trans- enzymatic mismatch cleavage. Nucleic Acids Res 33: e58. fer ribonucleic acid from yeast. J Mol Biol 72: 209–217. Gao X, LeProust E, Zhang H, Srivannavit O, Gulari E, Yu P, Kim H, Han H, Ahn J, Lee J, Cho N, Jang H, Kim H, Kwon S, Nishiguchi C, Xiang Q, Zhou X. 2001. A flexible light- Bang D. 2012. “Shotgun DNA synthesis” for the high- directed DNA chip synthesis gated by deprotection using throughput construction of large DNA molecules. Nu- solution photogenerated acids. Nucleic Acids Res 29: cleic Acids Res 40: e140. 4744–4750. Klein JC, Lajoie MJ, Schwartz JJ, Strauch EM, Nelson J, Gibson DG. 2009. Synthesis of DNA fragments in yeast by Baker D, Shendure J. 2016. Multiplex pairwise assembly one-step assembly of overlapping oligonucleotides. Nu- of array-derived DNA oligonucleotides. Nucleic Acids Res cleic Acids Res 37: 6984–6990. 44: e43. Gibson DG. 2011a. Enzymatic assembly of overlapping Kodumal SJ, Patel KG, Reid R, Menzella HG, WelchM, Santi DNA fragments. Methods Enzymol 498: 349–361. DV. 2004. Total synthesis of long DNA sequences: Syn- Gibson DG. 2011b. Gene and genome construction in yeast. thesis of a contiguous 32-kb polyketide synthase gene In Current protocols in molecular biology (ed. Ausubel FM cluster. Proc Natl Acad Sci 101: 15573–15578. et al.), Chapter 3, Unit 3.22. Wiley, Hoboken, NJ. Kosuri S, Church GM. 2014. Large-scale de novo DNA syn- Gibson DG. 2012. Oligonucleotide assembly in yeast to pro- thesis: Technologies and applications. Nat Method 11: duce synthetic DNA fragments. Methods Mol Biol 852: 499–507. 11–21. Kosuri S, Eroshenko N, Leproust EM, Super M, Way J, Li JB, Gibson DG, Benders GA, Andrews-Pfannkoch C, Denisova Church GM. 2010. Scalable gene synthesis by selective EA, Baden-Tillson H, Zaveri J, Stockwell TB, Brownley A, amplification of DNA pools from high-fidelity micro- Thomas DW,Algire MA, et al. 2008a. Complete chemical chips. Nat Biotechnol 28: 1295–1299. synthesis, assembly, and cloning of a Mycoplasma geni- Lee TS, Krupa RA, Zhang F,Hajimorad M, Holtz WJ, Prasad talium genome. Science 319: 1215–1220. N, Lee SK, Keasling JD. 2011. BglBrick vectors and data- Gibson DG, Benders GA, Axelrod KC, Zaveri J, Algire MA, sheets: A synthetic biology platform for gene expression. J Moodie M, Montague MG, VenterJC, Smith HO, Hutch- Biol Eng 5: 12. ison CA III. 2008b. One-step assembly in yeast of 25 LeProust EM, Peck BJ, Spirin K, McCuen HB, Moore B, overlapping DNA fragments to form a complete synthetic Namsaraev E, Caruthers MH. 2010. Synthesis of high- Mycoplasma genitalium genome. Proc Natl Acad Sci 105: quality libraries of long (150mer) oligonucleotides by a 20404–20409. novel depurination controlled process. Nucleic Acids Res Gibson DG, Young L, Chuang RY,Venter JC, Hutchison CA 38: 2522–2540. III, Smith HO. 2009. Enzymatic assembly of DNA mol- Li MZ, Elledge SJ. 2007. Harnessing homologous recombi- ecules up to several hundred kilobases. Nat Method 6: nation in vitro to generate recombinant DNA via SLIC. 343–345. Nat Method 4: 251–256. Gibson DG, Smith HO, Hutchison CA III, VenterJC, Merry- Ma S, Tang N, Tian J. 2012. DNA synthesis, assembly and man C. 2010. Chemical synthesis of the mouse mito- applications in synthetic biology. Curr Opin Chem Biol chondrial genome. Nat Method 7: 901–903. 16: 260–267. Guye P, Li Y, Wroblewska L, Duportet X, Weiss R. 2013. Matzas M, Stahler PF, Kefer N, Siebelt N, Boisguerin V, Rapid, modular and reliable construction of complex Leonard JT, Keller A, Stahler CF, Haberle P, Gharizadeh mammalian gene circuits. Nucleic Acids Res 41: e156. B, et al. 2010. High-fidelity gene synthesis by retrieval of Hughes RA, Miklos AE, Ellington AD. 2011. Gene synthesis: sequence-verified DNA identified using high-through- Methods and applications. Methods Enzymol 498: 277– put pyrosequencing. Nat Biotechnol 28: 1291–1294. 309. Nielsen AA, Der BS, Shin J, Vaidyanathan P, Paralanov V, Hutchison CA 3rd, Chuang RY, Noskov VN, Assad-Garcia Strychalski EA, Ross D, Densmore D, Voigt CA. 2016. N, Deerinck TJ, Ellisman MH, Gill J, Kannan K, Karas BJ, Genetic circuit design automation. Science 352: aac7341. Ma L, et al. 2016. Design and synthesis of a minimal Patwardhan RP, Hiatt JB, Witten DM, Kim MJ, Smith RP, bacterial genome. Science 351: aad6253. May D, Lee C, Andrie JM, Lee SI, Cooper GM, et al. 2012. Itakura K, Hirose T,Crea R, Riggs AD, Heyneker HL, Bolivar Massively parallel functional dissection of mammalian F, Boyer HW. 1977. Expression in Escherichia coli of a enhancers in vivo. Nat Biotechnol 30: 265–270.

16 Cite this article as Cold Spring Harb Perspect Biol 2017;9:a023812 Downloaded from http://cshperspectives.cshlp.org/ on September 29, 2021 - Published by Cold Spring Harbor Laboratory Press

Synthetic DNA Synthesis and Assembly

Pease AC, Solas D, Sullivan EJ, Cronin MT, Holmes CP, Singh-Gasson S, Green RD, Yue Y, Nelson C, Blattner F, Fodor SP. 1994. Light-generated oligonucleotide arrays Sussman MR, Cerrina F. 1999. Maskless fabrication of for rapid DNA sequence analysis. Proc Natl Acad Sci 91: light-directed oligonucleotide microarrays using a digital 5022–5026. micromirror array. Nat Biotechnol 17: 974–978. Quan J, Tian J. 2009. Circular polymerase extension cloning Smith HO, Hutchison CA III, Pfannkoch C, VenterJC. 2003. of complex gene libraries and pathways. PLoS ONE 4: Generating a synthetic genome by whole genome assem- e6441. bly: wX174 bacteriophage from synthetic oligonucleo- Quan J, Saaem I, TangN, Ma S, Negre N, Gong H, White KP, tides. Proc Natl Acad Sci 100: 15440–15445. Tian J. 2011. Parallel on-chip gene synthesis and applica- Sowa SW, Gelderman G, Contreras LM. 2015. Advances in tion to optimization of protein expression. Nat Biotech- synthetic dynamic circuits design: Using novel synthetic nol 29: 449–452. parts to engineer new generations of gene oscillations. Rayner S, Brignac S, Bumeister R, Belosludtsev Y, Ward T, Curr Opin Biotechnol 36: 161–167. Grant O, O’Brien K, Evans GA, Garner HR. 1998. Mer- Stemmer WP, Crameri A, Ha KD, Brennan TM, Heyneker Made: An oligodeoxyribonucleotide synthesizer for high HL. 1995. Single-step assembly of a gene and entire plas- throughput oligonucleotide production in dual 96-well mid from large numbers of oligodeoxyribonucleotides. plates. Genome Res 8: 741–747. Gene 164: 49–53. Richmond KE, Li MH, Rodesch MJ, Patel M, Lowe AM, Kim Tang NC, Chilkoti A. 2016. Combinatorial codon scram- C, Chu LL, Venkataramaian N, Flickinger SF, Kaysen J, et bling enables scalable gene synthesis and amplification al. 2004. Amplification and assembly of chip-eluted DNA of repetitive proteins. Nat Mater 15: 419–424. (AACED): A method for high-throughput gene synthe- sis. Nucleic Acids Res 32: 5011–5018. TemmeK, Zhao D, VoigtCA. 2012. Refactoring the nitrogen fixation gene cluster from Klebsiella oxytoca. Proc Natl Roy S, Caruthers M. 2013. Synthesis of DNA/RNA and their Acad Sci 109: 7085–7090. analogs via phosphoramidite and H-phosphonate chem- istries. Molecules 18: 14268–14284. Tian J, Gong H, Sheng N, Zhou X, Gulari E, Gao X, Church G. 2004. Accurate multiplex gene synthesis from pro- Rubens JR, Selvaggio G, Lu TK. 2016. Synthetic mixed-sig- grammable DNA microchips. Nature 432: 1050–1054. nal computation in living cells. Nat Commun 7: 11658. Torella JP,Lienert F,Boehm CR, Chen JH, Way JC, Silver PA. Saaem I, Ma KS, Marchi AN, LaBean TH, Tian J. 2010. In 2014. Unique nucleotide sequence-guided assembly of situ synthesis of DNA microarrayon functionalized cyclic olefin copolymer substrate. ACS Appl Mater Interfaces 2: repetitive DNA parts for synthetic biology applications. 491–497. Nat Protoc 9: 2075–2089. Saaem I, Ma S, Quan J, Tian J. 2012. Error correction of Trubitsyna M, Michlewski G, Cai Y, Elfick A, French CE. microchip synthesized genes using Surveyor nuclease. 2014. PaperClip: Rapid multi-part DNA assembly from Nucleic Acids Res 40: e23. existing libraries. Nucleic Acids Res 42: e154. Salis HM, Mirsky EA, Voigt CA. 2009. Automated design of Wan W, Li L, Xu Q, Wang Z, Yao Y,Wang R, Zhang J, Liu H, synthetic ribosome binding sites to control protein ex- Gao X, Hong J. 2014. Error removal in microchip-syn- pression. Nat Biotechnol 27: 946–950. thesized DNA using immobilized MutS. Nucleic Acids Res 42: e102. Schatz MC, Phillippy AM. 2012. The rise of a digital im- mune system. GigaScience 1: 4. Wang HH, Isaacs FJ, Carr PA, Sun ZZ, Xu G, Forest CR, Scheuerbrandt G, Duffield AM, Nussbaum AL. 1963. Step- Church GM. 2009. Programming cells by multiplex ge- wise synthesis of deoxyribo-oligonucleotides. Biochem nome engineering and accelerated evolution. Nature 460: Biophys Res Commun 11: 152–155. 894–898. Schwartz JJ, Lee C, Shendure J. 2012. Accurate gene synthesis WarnerJR, Reeder PJ, Karimpour-Fard A, WoodruffLB, Gill with tag-directed retrieval of sequence-verified DNA RT. 2010. Rapid profiling of a microbial genome using molecules. Nat Method 9: 913–915. mixtures of barcoded oligonucleotides. Nat Biotechnol 28: 856–862. Sekiya T, Besmer P,Takeya T, Khorana HG. 1976. Total syn- thesis of the structural gene for the precursor of a tyrosine Xiong AS, Peng RH, Zhuang J, Gao F, Li Y, Cheng ZM, Yao suppressor transfer RNA from Escherichia coli. 7. Enzy- QH. 2008a. Chemical gene synthesis: Strategies, soft- matic joining of the chemically synthesized segments to wares, error corrections, and applications. FEMS Micro- form a DNA duplex corresponding to the nucleotide biol Rev 32: 522–540. sequence 1-26. J Biol Chem 251: 634–641. Xiong AS, Peng RH, Zhuang J, Liu JG, Gao F, Chen JM, Sekiya T, Takeya T, Brown EL, Belagaje R, Contreras R, Fritz Cheng ZM, YaoQH. 2008b. Non-polymerase-cycling-as- HJ, Gait MJ, Lees RG, Ryan MJ, Khorana HG. 1979. Total sembly-based chemical gene synthesis: Strategies, meth- synthesis of a tyrosine suppressor transfer RNA gene. ods, and progress. Biotechnol Adv 26: 121–134. XVI: Enzymatic joinings to form the total 207-base YoungL, Dong Q. 2004. Two-step total gene synthesis meth- pair-long DNA. J Biol Chem 254: 5787–5801. od. Nucleic Acids Res 32: e59. Septak M. 1996. Kinetic studies on depurination and detri- Zhou X, Cai S, Hong A, YouQ, Yu P,Sheng N, Srivannavit O, tylation of CPG-bound intermediates during oligonucle- Muranjan S, Rouillard JM, Xia Y,et al. 2004. Microfluidic otide synthesis. Nucleic Acids Res 24: 3053–3058. PicoArray synthesis of oligodeoxynucleotides and simul- Shetty RP,Endy D, Knight TF Jr. 2008. Engineering BioBrick taneous assembling of multiple DNA sequences. Nucleic vectors from BioBrick parts. J Biol Eng 2: 5. Acids Res 32: 5409–5417.

Cite this article as Cold Spring Harb Perspect Biol 2017;9:a023812 17 Downloaded from http://cshperspectives.cshlp.org/ on September 29, 2021 - Published by Cold Spring Harbor Laboratory Press

Synthetic DNA Synthesis and Assembly: Putting the Synthetic in Synthetic Biology

Randall A. Hughes and Andrew D. Ellington

Cold Spring Harb Perspect Biol 2017; doi: 10.1101/cshperspect.a023812

Subject Collection Synthetic Biology

Minimal Cells−−Real and Imagined Synthetic DNA Synthesis and Assembly: Putting John I. Glass, Chuck Merryman, Kim S. Wise, et al. the Synthetic in Synthetic Biology Randall A. Hughes and Andrew D. Ellington Synthetic Botany Design Automation in Synthetic Biology Christian R. Boehm, Bernardo Pollak, Nuri Evan Appleton, Curtis Madsen, Nicholas Roehner, Purswani, et al. et al. Synthetic Biology in Cell and Organ Cell-Free Synthetic Biology: Engineering Beyond Transplantation the Cell Sean Stevens Jessica G. Perez, Jessica C. Stark and Michael C. Jewett Genome-Editing Technologies: Principles and The Need for Integrated Approaches in Metabolic Applications Engineering Thomas Gaj, Shannon J. Sirk, Sai-lan Shui, et al. Anna Lechner, Elizabeth Brunk and Jay D. Keasling Alternative Watson−Crick Synthetic Genetic Synthetic Biology of Natural Products Systems Rainer Breitling and Eriko Takano Steven A. Benner, Nilesh B. Karalkar, Shuichi Hoshika, et al. Phage Therapy in the Era of Synthetic Biology At the Interface of Chemical and Biological E. Magda Barbu, Kyle C. Cady and Bolyn Hubby Synthesis: An Expanded Genetic Code Han Xiao and Peter G. Schultz Synthetic Morphogenesis Building Spatial Synthetic Biology with Brian P. Teague, Patrick Guye and Ron Weiss Compartments, Scaffolds, and Communities Jessica K. Polka, Stephanie G. Hays and Pamela A. Silver Engineering Gene Circuits for Mammalian Cell− Based Applications Simon Ausländer and Martin Fussenegger

For additional articles in this collection, see http://cshperspectives.cshlp.org/cgi/collection/

Copyright © 2017 Cold Spring Harbor Laboratory Press; all rights reserved