<<

arXiv:0807.3280v2 [cond-mat.soft] 22 Dec 2008

The self-assembly of DNA Holliday junctions studied with a minimal model

Thomas E. Ouldridge, Iain G. Johnston, and Ard A. Louis
Rudolf Peierls Centre for Theoretical Physics, 1 Keble Road, Oxford, UK OX1 3NP

Jonathan P. K. Doye
Physical and Theoretical Chemistry Laboratory, Department of Chemistry, University of Oxford, South Parks Road, Oxford OX1 3QZ, United Kingdom

(Dated: October 25, 2018)

In this paper, we explore the feasibility of using coarse-grained models to simulate the self-assembly of DNA nanostructures. We introduce a simple model of DNA where each nucleotide is represented by two interaction sites corresponding to the sugar-phosphate backbone and the base. Using this model, we are able to simulate the self-assembly of both DNA duplexes and Holliday junctions from single-stranded DNA. We find that assembly is most successful in the temperature window below the melting temperatures of the target structure and above the melting temperature of misbonded aggregates. Furthermore, in the case of the Holliday junction, we show how a hierarchical assembly mechanism reduces the possibility of becoming trapped in misbonded configurations. The model is also able to reproduce the relative melting temperatures of different structures accurately, and allows strand displacement to occur.

PACS numbers: 87.14.gk, 81.16.Dn, 87.15.ak

I. INTRODUCTION

The ability to design nanostructures which accurately self-assemble from simple units is central to the goal of engineering objects and machines on the nanoscale. Without self-assembly, structures must be laboriously constructed in a step by step fashion. Double-stranded DNA (dsDNA) has the ideal properties for a nanoscale building block,1,2 with structural length scales determined by the separation of base pairs, the helical pitch and its persistence length (approximately 0.33 nm, 3.4 nm (Ref. 3) and 50 nm,4 respectively). Over these distances, dsDNA acts as an almost rigid rod and so it is capable of forming well-defined three dimensional structures.

It is the selectivity of base pairing between single strands, however, that makes DNA ideal for controlled self-assembly. By designing sections of different strands to be complementary, a certain configuration of a system can be specified as the global minimum of the energy landscape. In this way the target structure (usually consisting of branched double helices) can be ‘programmed’ into the sequences. This approach was initially demonstrated for a four-armed junction by the Seeman group in 1983.5 Such junctions and more rigid double crossover motifs6 can then be used to create two-dimensional lattices.7,8 Yan et al.9 have also constructed ribbons and two dimensional lattices from larger four-armed structures, each arm of a junction consisting of four strands. Furthermore, using Rothemund’s “origami” approach an almost arbitrary variety of two-dimensional DNA shapes can be created.10

Progress in forming three-dimensional DNA nanostructures was initially much slower. The Seeman group managed to synthesize a DNA cube11 and a truncated octahedron,12 but only after a long series of steps and with a low final yield. More recently, approaches have been developed that allow polyhedral cages, such as tetrahedra,13 trigonal bipyramids,14 octahedra,15,16 dodecahedra and truncated icosahedra,17 to be obtained in high yields simply by cooling solutions of appropriately designed oligonucleotides from high temperature. Additional structures have also been produced using pre-assembled modular building blocks incorporating other organic molecules.18,19

In designing strand sequences, it is important to minimize the stability of competing structures with respect to the stability of the target configuration. In addition, systems can be designed to follow certain routes through configuration space (for example, through the hierarchical assembly of simple motifs20) if the target can potentially be reached more efficiently. A standard approach to hierarchical assembly, such as that described by He et al.,17 involves choosing sequences so that bonds between different pairs of oligonucleotides become stable at different temperatures. This allows certain motifs to form in isolation at high temperatures before bonding to each other as the solution is cooled. An elegant alternative system for programming assembly pathways has been proposed by Yin et al.,21 which relies on the metastability of single stranded loop structures and the possibility of catalyzing their interactions using other oligonucleotides.

Given these recent experimental advances in creating DNA nanostructures, it would be useful to have theoretical models that allow further insights into the self-assembly process. In particular, a successful model would be able to provide information on the formation pathways and free energy landscape associated with the self-assembly, and as such would be of use to experimentalists wishing to consider increasingly more complex designs. Atomistic simulations of DNA would potentially offer the most spatially-detailed descriptions of the self-assembly.

However, they are computationally very expensive, and are generally restricted to time scales that are too short to study self-assembly.22

Statistical approaches such as that of Poland and Scheraga23 and the nearest-neighbour model24 use simple expressions for the free energy of helix and random coil states to obtain equilibrium results for the bonding of two strands. Whilst the parameters in these models can be tuned to give very accurate correspondence with experimental data,25,26 they give no information on the dynamics and formation pathways and hence are only useful for ensuring that the target structure has significantly lower free energy than competing configurations. Furthermore, any description purely based on secondary structure (i.e. which bases are paired) is inherently incapable of accounting for topological effects such as linking of looped structures.27

Coarse-grained or minimal models offer a compromise between detail and computational simplicity, and are well suited to the study of hybridization of oligonucleotides. The aim of these models is to be capable of describing both the thermodynamic and kinetic behaviour of systems, a vital feature if kinetic metastability is inherent in assembly pathways.21 In developing such minimal models the approach is usually to retain just those physical features of the system that are essential to the behaviour that is of interest.

Dauxois, Peyrard and Bishop models,28 and modified versions such as that proposed by Buyukdagli, Sanrey, and Joyeux,29 constitute the simplest class of dynamical models. Although these are dynamic models in the sense that the energy is a function of the separation between each base, the nucleotides are constrained to move in one dimension. This lack of conformational freedom means that these models are incapable of capturing the nuances of the self-assembly from single-stranded DNA (ssDNA).

Recently, models have been proposed which capture the helicity of dsDNA using two30 or three31 interaction sites per nucleotide. These models, however, are optimized for studying deviations from the ideal double-stranded state, and so have not been used to examine self-assembly. Although they have been used to study the thermal denaturation of dsDNA, it is essential for our purposes to be able to simulate the assembly of a structure from ssDNA as it is this process that will reveal the kinetic traps and free energy landscape associated with the formation of a particular DNA nanostructure.

Simpler, linear models, also with two interaction sites per nucleotide, have been used to investigate duplex hybridization,32 hairpin formation33 and gelation of colloids functionalized with oligonucleotides.34 These models all use two interaction sites to represent one nucleotide, with backbone sites linked to each other to represent the sugar-phosphate chain, and interaction sites which represent the bases. This work investigates the possibility of extending the use of such coarse-grained models to study the self-assembly of nanostructures that involve multiple strands forming branched duplexes. We hypothesize that the self-assembly properties of DNA are dominated by the fact that ssDNA is a semi-flexible polymer with selective attractive interactions. We introduce an extremely simple model, similar to that of Starr and Sciortino, to test this hypothesis.34 This simplicity enables us to explore the thermodynamics and kinetics of self-assembly in the model in great depth, and hence examine the feasibility of simulating nanostructure formation through minimal models.

We first describe the model in Section II, then examine its success in reproducing the general features of hybridization in Section III A. Next, in Section III B, we apply it to the formation of a Holliday junction, a simple nanostructure consisting of a four-armed cross.8

II. METHODS

A. Model

We introduce an off-lattice model inspired by that which Starr and Sciortino used to study the gelation of four-armed DNA dendrimers.34 As our aim is to reproduce the basic physics with as simple a model as possible, we neglect contributions to the interactions due to base stacking, and the charge and asymmetry of the phosphate backbone. We do not attempt to include the detailed geometrical structure of DNA, but instead represent the oligonucleotides as a chain of monomer units, each corresponding to one nucleotide (Fig. 1). A monomer consists of a rod (chosen to be rigid for simplicity) of length $l$ with a repulsive backbone interaction site at the centre of the rod. In addition, each unit has a bonding interaction site (or base) at a distance of $0.3\,l$ from the backbone site (perpendicular to the rod). Each monomer is also assigned a base type (A,G,C,T) to model the selective nature of bonding. In this model we only consider bonds between the complementary pairs A-T and G-C.

FIG. 1: (Colour online) A schematic representation of the model. The thick lines represent the rigid backbone monomer units and the large circles the repulsive Lennard-Jones interactions at their centres. The smaller, darker circles represent the bases. The panels illustrate the definitions of (a) the bending angle between two units ($\theta$), and (b) the torsional angle ($\phi$) which is found after the monomers have been rotated to lie parallel.

We do not explicitly include any solvent molecules in our simulations, but instead use effective potentials to describe the interactions between the DNA. Sites interact through shifted-force Lennard-Jones (LJ) potentials, where, as well as truncating and shifting the potential, an extra term is included to ensure the force goes smoothly to zero at the cutoff $r_c$. For $r < r_c$

$$V_{\mathrm{sf}}(r) = V_{\mathrm{LJ}}(r) - V_{\mathrm{LJ}}(r_c) - (r - r_c)\left.\frac{dV_{\mathrm{LJ}}}{dr}\right|_{r=r_c} \tag{1}$$

where

$$V_{\mathrm{LJ}}(r) = 4\epsilon\left[\left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6}\right], \tag{2}$$

and $V_{\mathrm{sf}}(r) = 0$ for $r \geq r_c$. Backbone sites (except adjacent units on the same strand) interact through Eq. (1) with $\sigma = l$ and $r_c = 2^{1/6}\sigma$. This purely repulsive interaction models the steric repulsion between strands. Bonding sites (again excluding adjacent units on a strand) interact via Eq. (1) with $\sigma = 0.35\,l$ and $r_c = 2.5\,\sigma$ for complementary bases (to allow for attraction) and $r_c = 2^{1/6}\sigma$ for all other pairings. The depth of the resulting potential well between complementary bases, $\epsilon^{\mathrm{eff}}_{\mathrm{base}}$, is $0.396\,\epsilon$. In what follows we will measure the temperature in terms of a reduced temperature, $T^* = k_B T/\epsilon^{\mathrm{eff}}_{\mathrm{base}}$. The above choice of parameters ensures that the attractive interaction between complementary bases is largely shielded by backbone repulsion. Monomers therefore bond selectively and can only bond strongly to one other monomer at a given instant. These are the key features of Watson-Crick base pairing that make DNA so useful for self-assembly.

The model also includes potentials between consecutive monomers associated with bending and twisting the strand:

$$V_{\mathrm{bend}} = \begin{cases} k_1\,(1 - \cos\theta) & \text{if } \theta < \frac{3\pi}{4}, \\ \infty & \text{otherwise} \end{cases} \tag{3}$$

and

$$V_{\mathrm{twist}} = k_2\,(1 - \cos\phi). \tag{4}$$

We define $\theta$ as the angle between the vectors along adjacent monomer rods. As previously mentioned, consecutive backbone sites do not interact via LJ potentials. Instead, a hard cutoff is introduced in Eq. (3) to reflect the fact that an oligonucleotide cannot double back on itself. $\phi$ is taken as the angle between adjacent backbone to bonding site vectors after the monomers have been rotated to lie parallel (Fig. 1). For simplicity, we choose the torsional potential to have a minimum at $\phi = 0$. Thus, neither ssDNA nor dsDNA will be helical in our model.

$k_1$ is chosen to be $0.1\,\epsilon$ to give a persistence length, $l_{ps} = 3.149\,l$ at a reduced temperature of $T^* = 0.09677$ for ssDNA. We obtain this result by simulating a single strand 70 bases in length, and using the definition:35

$$l_{ps} = \frac{\langle \mathbf{L} \cdot \mathbf{l}_1 \rangle}{\langle l_1 \rangle}, \tag{5}$$

where $\mathbf{L}$ is the end to end vector of the strand, $\mathbf{l}_1$ is the vector associated with the first monomer and $\langle\,\rangle$ indicates a thermal average. Taking $l$ to be 6.3 Å, $T^* = 0.09677$ is mapped to 24 °C and so the model is consistent with experimental data for ssDNA in 0.445 M NaCl solution.36,37 $k_2$ is chosen to be $0.4\,\epsilon$.

In neglecting the geometrical structure of a double helix we do not accurately represent certain types of bonding. ‘Bulged’ bonding occurs when consecutive bases in one of the strands attach to non-consecutive bases in the other strand. ‘Internal loops’ consist of stretches of non-complementary bases (either symmetric or asymmetric in the number of bases involved in each strand). ‘Hairpins’ result when a single strand doubles back and bonds to itself. The details of these motifs are complicated but an empirical description of their thermodynamic properties is given in Ref. 26. Importantly, they are generally penalized due to the disruption of the geometry of DNA in a way which is not well reproduced by our model. In the case of the short strands we consider, these motifs will only play a small role as the strands are not specifically intended to have stable structures of these forms. In fact, as the base sequences we use were designed to form the Holliday junction, the possibility of forming these motifs at relevant temperatures was deliberately avoided.38

For simplicity we therefore include only two alterations to the model. Firstly, we define ‘kinked states’ as those for which the number of unpaired bases between two bonding pairs on either side of a duplex is not equal (including asymmetric loops and bulges). We impose an infinite energy penalty on the formation of these kinked states if the total number of intermediate bases is less than six. Secondly, we treat complementary units within six bases of each other on the same strand as non-complementary, but allow all other hairpins without penalty.

It should also be noted that this model neglects the directional asymmetry of the sugar-phosphate backbone. Therefore, parallel as well as anti-parallel bonding is possible in our model, whereas parallel bonding does not occur in experiment.

B. Monte Carlo Simulation

In a fully-atomistic model of DNA the natural way to simulate its dynamics would be to use molecular dynamics. However, the best way to simulate the dynamics in a coarse-grained model is an important, but not fully resolved, question, and one that will depend on the nature of the model. Clearly, for the current model standard molecular dynamics is inappropriate as it will lead to ballistic motion of the strands between collisions because of the absence of explicit solvent particles, whereas DNA in solution undergoes diffusive Brownian motion. An alternative approach is to use the Metropolis Monte Carlo (MC) algorithm39,40 where the moves are restricted to be local, as it has been argued that this can provide a reasonable approximation to the dynamics.41–43 This is the approach that we use here to simulate the dynamics of self-assembly of DNA duplexes and Holliday junctions. In particular, the local MC moves that we use are translation and rotation of whole strands and bending of a strand about a particular monomer, thus ensuring that the strands undergo an approximation to diffusive Brownian motion in the simulations. Therefore we expect the MC simulations, which are all initiated with free single strands, to mimic the real self-assembly processes in our model. It is important to note that this will include, as well as successful assembly into the target structure, kinetic trapping in non-equilibrium configurations and that when the latter occurs this reflects the inefficiency of the self-assembly under those conditions.

Although a true measure of time is impossible in Monte Carlo simulations, an approximate time scale for diffusion-limited processes can be found by comparing the diffusive properties of objects to experiment. By measuring the diffusion of isolated strands, and assuming diffusion coefficients comparable to those of double strands and hairpin loops of similar length,44 we conclude that one step per strand corresponds to a time scale of approximately 2 ps. Thus, our model allows our systems to be studied on millisecond time scales.

At the end of the above MC simulations, our systems will not necessarily have reached equilibrium, both because the energy barriers to escape from misbonded configurations can be difficult to overcome at low temperature and because of the low rate of association at higher temperatures. Therefore, as a comparison we also compute the equilibrium thermodynamic properties of our systems using umbrella sampling.40,45 Formally, we can write the thermal average of a function $B(\mathbf{r}^N)$ in the canonical ensemble as:

$$\langle B \rangle = \frac{\int \frac{B}{W(Q)}\,\big[W(Q)\exp(-V/k_B T)\big]\,d\mathbf{r}^N}{\int \frac{1}{W(Q)}\,\big[W(Q)\exp(-V/k_B T)\big]\,d\mathbf{r}^N} \tag{6}$$

where $Q = Q(\mathbf{r}^N)$ is an order parameter or reaction coordinate and $V = V(\mathbf{r}^N)$ is the potential energy. We are free to choose $W(Q)$, and by taking the term in square brackets as the weighting of states and keeping statistics for $B/W$ and $1/W$ at each step we can find $\langle B \rangle$. In standard Metropolis MC, $W = 1$, but by choosing $W(Q)$ in such a way that those states with intermediate values of $Q$ are visited more frequently, the effective free energy barrier between (meta)stable states can be lowered, allowing the system to pass easily between the free energy minima, and equilibrium to be reached.

To ensure that each value of $Q$ is equally likely to be sampled in an umbrella sampling simulation, one would choose $W(Q) = \exp(\beta A(Q))$, where $A(Q)$ is the free energy as a function of the order parameter. To achieve this, however, would require knowledge of $A(Q)$. Instead, there are standard methods to construct $W(Q)$ iteratively, but for the current examples it was possible to construct $W(Q)$ manually, because of the relative simplicity of the free energy profiles.

To a first approximation the interaction between fully bonded structures is negligible. Therefore, in the umbrella sampling simulations we consider systems containing the minimum number of strands required to form a given object (two for a duplex and four for a Holliday junction). We then use the relative weight of bound and free states to extrapolate the expected fractional concentrations for larger systems.46 The natural choice for the order parameter $Q$ is the number of correct bonds, where two monomers are defined to be bonded if their energy of interaction is negative.

III. RESULTS

A. Duplex Formation

We test the model by analysing the duplex bonding of two different complementary strands. We simulate systems of ten oligonucleotides, initially not bonded, in a periodic cell with a concentration of $5.49 \times 10^{-5}$ molecules $l^{-3}$ (or $3.65 \times 10^{-4}$ M). We separately consider strands consisting of 7 and 13 monomers, which correspond to two of the arms of the Holliday junction studied experimentally by Malo et al.8,38 and which we consider in Section III B:

$$\text{7 bases}\;\begin{cases}\text{G-A-G-T-T-A-G} \\ \text{C-T-A-A-C-T-C}\end{cases} \tag{7}$$

$$\text{13 bases}\;\begin{cases}\text{G-C-G-A-T-G-A-G-C-A-G-G-A} \\ \text{T-C-C-T-G-C-T-C-A-T-C-G-C}\end{cases} \tag{8}$$

where we have listed strands in the 5’–3’ sense for consistency with the literature. The yields of correctly-bonded and misbonded structures at the end of the simulations are depicted in Figure 2 as a function of temperature. We also display the predicted equilibrium fraction of correctly bonded strands for both systems obtained using umbrella sampling. For convenience, we define a correctly bonded structure to have more than 70% of the bonds of the complete duplex and no bonds to other strands. Any other structure is recorded as ‘misbonded’.

Figure 2 shows a maximum in the yield as a function of temperature. Such behaviour is typical of self-assembling systems47–51 and reflects the thermodynamic and dynamic constraints on the self-assembly process. Firstly, the yield is zero at high temperature where only ssDNA is stable, and rises just below the expected equilibrium

value as the temperature is decreased, the deviation arising due to the large number of steps required to reach equilibrium. At low temperatures, the yield falls away due to the presence of kinetic traps which are now stable with respect to isolated strands, as evidenced by the rise in ‘misbonded structures’ in Figure 2. Thus, there is a non-monotonic dependency of yield on temperature and an optimum region for successful assembly, which corresponds to the region where only the desired structure is stable against thermal fluctuations. Figure 3 is a snapshot from near the end of a simulation in this regime. It should be noted that neglecting helicity has the effect of increasing the flexibility of dsDNA in the direction perpendicular to the plane of bonding (a helix cannot bend in any direction without disturbing its internal structure whereas a ‘ladder’ can).

[Figure 2 plot: fractional yield (0 to 1) versus reduced temperature (0.04 to 0.12); curves labelled ‘correct’, ‘misbonded’ and ‘equilibrium’.]

FIG. 2: (Colour online) Yields of correctly-formed duplexes and misbonded configurations at the end of our MC simulations (lines with data points, as labelled) compared to the equilibrium probability of the strands adopting the correct structure as obtained by umbrella sampling. The solid and dashed lines represent results for strands with 7 and 13 monomers, respectively. The MC results are averages over ten runs of length $3 \times 10^{8}$ steps per strand with ten strands in the simulation cell.

FIG. 3: (Colour online) Snapshot of a fully-assembled configuration in a MC simulation of ten 13-base strands at $T^* = 0.0971$. In this image the colour of the backbone indicates the type of strand: red for G-C-G-A-T-G-A-G-C-A-G-G-A and grey for its complement. Backbone sites are indicated by the large spheres, and bases by the small, blue spheres.

The heat capacity obtained from umbrella sampling of pair formation is shown in Figure 4(a). The heat capacity peaks indicate a transition from single strands to a duplex. As the formation of duplexes is essentially a chemical equilibrium between monomers and clusters of a definite size (in this case two) the width of the peaks will remain finite as the number of strands is increased. The transition does, however, become increasingly narrow as the DNA strands become longer, as is evident from comparing the heat capacity peaks for the 7-mers and 13-mers.

Figure 4(b) shows the free energy profile, $F(Q)$, for the formation of a duplex. The initial peak at low $Q$ is accounted for by the entropic cost of bringing two strands together. Once bonds are formed, however, adding extra bonds costs much less entropy whilst providing a significant decrease in energy, explaining the monotonic decrease in $F(Q)$ beyond $Q = 2$. The rise between $Q = 1$ and $Q = 2$ is partly due to the fact that in order to form two bonds between strands the relative orientation of strands must be specified whereas this is not true for $Q = 1$: hence there is an additional entropy penalty to the formation of the second bond. In addition, there exist structures with only one correct bond that are stabilized by additional incorrect bonds and these misbonded configurations also contribute to $F(1)$. The constant gradient above $Q = 2$ indicates that the energetic gain and entropic cost of forming an extra bond are approximately constant at a given temperature, which is consistent with the assumptions underlying nearest-neighbour models of DNA melting.24

In Figure 5, we compare our melting curves to those predicted by a simple two-state model,24 using the same mapping of the reduced temperature as in Section II A. In the two-state model the molar concentrations of product (AB) and reactants (A,B) are given by the equilibrium relation:

$$\frac{[AB]}{[A]\,[B]} = \exp\left(\frac{-\Delta H_0 + T\,\Delta S_0}{k_B T}\right), \tag{9}$$

where $\Delta H_0$ and $\Delta S_0$ are assumed to be constants which depend only on the strand sequences and the salt concentration (we take [Na+] = 0.445 M as in Section II A). We use the enthalpy and entropy changes of duplex formation calculated by “HyTher”,52 a program that estimates these values using the “unified oligonucleotide nearest neighbour parameters”.25,53 The authors claim that the thermodynamic parameters predicted by HyTher give the melting temperature $T_m$ (the temperature at which the fraction of bonded strands is 1/2) of a duplex to

within a standard error of ±2.2 °C.25 Figure 5 shows that our system reflects the melting temperatures predicted by the two-state model with reasonable accuracy, excepting sequence dependent effects which are not included in our model, because the interaction energies between A-T and C-G complementary base pairs have for simplicity been taken to be the same. The widths of transitions are seen to be of the same order, but slightly larger for our model. This feature, which is typical of coarse-grained models,31 indicates that the degree of entropy loss on hybridization is too small in our model, and is due to a failure to accurately incorporate all degrees of freedom which become frozen on hybridization. However, the agreement is sufficiently good that the basic features of physical DNA assembly should be reproducible.

[Figure 4 plots: (a) heat capacity / $k_B$ (0 to 2500) versus reduced temperature (0.09 to 0.125), curves labelled 7 and 13; (b) free energy / $k_B T$ (−4 to 10) versus $Q$ (0 to 13), curves labelled 0.109, 0.112 and 0.115.]

FIG. 4: Thermodynamics for the formation of a single duplex. (a) Heat capacity curves for 7- and 13-base systems, as labelled. (b) Free energy profile associated with the formation of a 13-base duplex at different temperatures (as labelled), where $Q$ represents the number of correctly formed bonds in the duplex.

[Figure 5 plot: probability (0 to 1) versus temperature / °C (0 to 100); groups of curves labelled ‘7-base strands’ and ‘13-base strands’, with two-state lines labelled (a), (b), (c) and (d).]

FIG. 5: (Colour online) The bulk equilibrium probability of strands being in a correct duplex extrapolated from our umbrella sampling simulations (solid lines) compared to the predictions of an empirical two-state model (dashed lines). Results are presented for strands with 7 and 13 bases, as labelled. For the two-state model, as well as the results for the sequences in Eqs. (7) and (8) (lines (a) and (d)), sequences corresponding to the other two arms of the Holliday junction (Figure 7) are considered. Only one line is shown for the umbrella sampling results, because A-T and G-C have the same binding energy in our model. Temperatures in our model are converted to °C using the same mapping given in Section II A.

A further satisfying feature of the model is that ‘displacement’ was observed on several occasions. This process, during which a misbonded pair of strands is broken up by a third strand, is illustrated in Figure 6. The third strand is able to bond to the pair, as some bases are free in the misbonded structure. Thermal fluctuations allow the new strand to bond to sites previously involved in misbonding, in a process known as ‘’. Eventually one of the misbonded strands is completely displaced, leaving a correct duplex and an isolated single strand. This behaviour is observed in real DNA systems, and is the driving mechanism of some nanomachines54 and DNA catalyzed reactions.21,55

B. Holliday Junction

Encouraged by the above results, we next apply the model to the formation of a Holliday junction. Holliday junctions consist of four single strands which bind to form a four-armed cross. In our case we consider a Holliday junction with two long arms (13 bases long) and two short arms (7 bases long). We use the experimental base ordering of Malo et al.38 with the ‘sticky ends’ removed. (These sticky ends consist of six unpaired bases on the end of arms and their purpose is to allow the Holliday junctions to bond together to form a lattice.) The sequences of the four DNA strands and schematic diagrams of the possible junctions that they can form are shown in Figure 7.

Initially we studied a system of 20 strands (five of each type) that has the potential to form five separate junctions. We use a concentration of $1.56 \times 10^{-5}$ molecules $l^{-3}$

FIG. 6: (Colour online) Snapshots illustrating four stages in the process of displacement. (a) A third strand binds to a misbonded pair. (b) The third strand is prevented from forming a complete duplex by the misbond. (c) Thermal fluctuations cause bonds in the misbonded structure to break and be replaced by the correct duplex. (d) The misbonded strand is displaced and the correct duplex is formed.

(which corresponds to $1.04 \times 10^{-4}$ M). The results are displayed in Figure 8.

[Figure 7 diagram: the four strand sequences, with // marking the division between the two arm segments of each strand, and schematic junctions a) and b) with arms labelled α and β.]

1. CTA ACT C // AA TGC CTT CTG GA
2. CGC ATG AGC AGG A // GA GTT AG
3. TGT TCC G // TC CTG CTC ATC GC
4. TCC AGA AGG CAT T // CG GAA CA

FIG. 7: (Colour online) A schematic diagram showing the sequences of the strands used in our Holliday junction simulations, and the alternative bound states that are possible: (a) the square planar configuration and (b) the χ-stacked form.

[Figure 8 plot: fractional yield (0 to 1) versus reduced temperature (0.04 to 0.12); curves labelled ‘HJ’, ‘α’, ‘β’, ‘misbonded’, ‘HJ (eq)’, ‘α (eq)’ and ‘β (eq)’.]

FIG. 8: (Colour online) A comparison of the kinetics and thermodynamics for a system of 20 strands that can potentially form five Holliday junctions, where the MC simulations are initiated from a purely single-stranded configuration. The MC results (lines with data points) are the final yield of Holliday junctions, and the fraction of strands involved in a correctly-formed long (α) or short (β) arm, or in misbonding, as labelled. The results are averages over five runs of length $10^{9}$ steps per strand. For comparison, the equilibrium probabilities of being in an α-bonded dimer and a Holliday junction are also plotted, along with the equilibrium probability of being in a β-bonded dimer if the longer arms are not allowed to hybridize.

The results are as expected for the bonding of the longer arms (which we now describe as ‘α-bonding’). The yield again displays the characteristic non-monotonic dependence on temperature. We obtain very few complete junctions, however, which is due to two effects. Firstly, each simulation is performed at constant temperature, which means the hierarchical route to assembly is less favoured than when the system is cooled, as in the experiments.8,38 When the system is gradually cooled, Figure 8 suggests that at around $T_m(\alpha) = 0.111$ we would expect to find a region in which only α-bonded dimers were stable with respect to ssDNA. If the cooling were sufficiently slow on the timescale of bonding, all strands would form α-structures at around $T_m(\alpha)$. At lower temperatures, when the Holliday junction becomes stable with respect to the α-bonded dimers, many competing minima would then be inaccessible to the system as they would require the dissociation of stable α-bonded pairs. The free energy landscape of two α-structures forming a Holliday junction is consequently much simpler than that of four single strands forming a junction at a given temperature. Therefore, one expects the yield for self-assembly at constant temperature to be lower than when the system is cooled, because there is only a relatively narrow temperature window between where the Holliday junction becomes stable and misbonded configurations start to appear. Indeed, the shorter arms are only

[Figure 10 plot: fractional yield versus reduced temperature, with curves labelled HJ, misbonded, HJ (eq) and α (eq).]

marginally more stable than some competing minima, as evidenced by the rise in misbonded structures in Fig. 8 at temperatures just below where β-bonded structures first appear.

Secondly, even in the temperature range where Holliday junction formation is not hindered by the formation of misbonded configurations, the yield is low because the Metropolis MC algorithm artificially reduces the diffusion of bound pairs, and hence the likelihood that two pairs of α-bonded strands come together to form a junction is also reduced. This is because the acceptance probability of trial moves for bonded strands is much lower than for isolated strands,56 due to the energy penalty associated with trying to move a bound pair apart.

Interestingly, examination of the equilibrium lines in Figure 8 shows that the Holliday junctions are actually stable at a higher temperature than the individual shorter arms. This is because the total loss of entropy when two α-bonded dimers bind together is considerably less than that for two short arms in isolation (as fewer translational degrees of freedom are lost), whereas the energy change is comparable. Thus, there is a small temperature window at T* ≈ 0.1 where hierarchical assembly can occur at constant temperature as the short arms are only stable once α-bonding has taken place. However, due to the deficiencies in the MC simulations mentioned above, the yield of Holliday junctions in this region is practically zero. Instead, the maximum yield of Holliday junctions occurs at lower temperatures where non-hierarchical pathways that proceed by the addition of single strands become feasible.

The above simulations were only able to successfully model the first stage of the Holliday junction assembly, namely the formation of α-bonded dimers. To probe the second stage of assembly, we must first make two modifications to our simulation approach to overcome the two deficiencies mentioned above. Firstly, we study systems initially consisting of pairs of α-bonded strands, which we assume have successfully formed at some higher temperature; this is reasonable given the results of our earlier simulations. Secondly, we also include simple local cluster moves in addition to those which move only one strand, i.e. translations, rotations and bending of pairs of α-bonded strands. With these changes incorporated, we simulate the same system for 7.5 × 10^8 steps per strand at a range of temperatures below T_m(α). It should be noted that due to a change in the size of typical moves, one move per strand now corresponds to approximately 10 ps.

We find that Holliday junctions form over a wide range of intermediate temperatures, whilst kinetic traps at low temperature lead to incomplete bonding and consequently to the possibility of forming large clusters. A typical result from the high-yield regime is shown in Figure 9. As fully-bonded Holliday junctions are essentially inert, it is reasonable to analyse their assembly behaviour by considering only one junction. The smaller system size has the effect of increasing the assembly rate, because the strands have less distance to diffuse, but does not affect the basic assembly mechanism. We therefore simulated systems consisting of two α-bonded pairs with the same concentration as above.

FIG. 9: (Colour online) Snapshot showing five Holliday junctions formed at T = 0.0842 after 5.67 × 10^8 MC steps per strand. Again, the backbone colour indicates strand type (1: red, 2: grey, 3: orange, 4: yellow) where numbers refer to Figure 7.

We also introduced some modifications to the umbrella sampling scheme in order to compute more efficiently the thermodynamics of the second stage of Holliday junction formation. As well as cluster moves, we also introduced a ‘tethering’ component in the weighting function W(Q). We introduce a length r_min that corresponds to the shortest distance between any pair of backbone sites on different strands. We then split Q = 0 into two regions: we weight those states with r_min < 3l with W = 1 but for r_min ≥ 3l we use W = 0.1. This enables us to increase the rate of transitions between Q = 0 and 1, and reduces the time spent simply simulating the diffusion of α-bonded dimers waiting for a collision to occur.

The MC results are plotted in Figure 10 along with the equilibrium results obtained from umbrella sampling. With the cluster moves in place, we now see a high yield of Holliday junctions and a broad maximum in the yield as a function of temperature. The hierarchical pathway has the effect suggested earlier. Namely, the temperature window over which correct formation can occur is vastly increased, as the most significant competing minima are inaccessible because their formation would require dissociation of the α-bonded pairs.

FIG. 10: (Colour online) The yields of Holliday junctions (HJ) and misbonded configurations for MC simulations, where the initial configuration was a pair of α-bonded dimers. For comparison, the equilibrium probabilities of being in an α-bonded dimer and a Holliday junction are also plotted. The results are averages over five runs of length 7.5 × 10^8 steps per strand.

The model is therefore consistent with the experimentally observed hierarchical assembly of Holliday junctions as the system is cooled.8,38 It should be noted, however, that the junctions in our model usually form in the ‘square planar’ as opposed to the ‘χ-stacked’ shape (Figure 7) that is observed under normal experimental conditions. The preference for one structure is a subtle consequence of the concentration of cations and the precise helical geometry of DNA.57 This level of detail is not included in our coarse-grained model, so it is not surprising that it cannot reproduce the preference for χ-stacked structures. Moreover, it is relatively easy to see why in our model, which forms ‘ladders’ rather than helices, a planar geometry is preferred for the junction.

Some of the equilibrium thermodynamic properties associated with the formation of a Holliday junction are shown in Figure 11. In particular, Fig. 11(b) shows the free energy profile for the formation of a Holliday junction from two α-bonded pairs. The initial peak and subsequent drop are very similar to those for the duplexes and can be accounted for in the same way. However, the formation of the two arms is not like the zipping up of a 14-base duplex, because there is much more relative freedom of movement for the bases on either side of the α-bonded sections in the dimers than for consecutive bases on single-stranded DNA. Thus, there is a rise between M = 7 and M = 8 that is a result of the entropy penalty of bringing together the two ends to make the second short arm. We note that the penalty is much smaller than the initial cost of bringing the two α-bonded pairs together, and as a result, the value of T_m for the junction is higher than for the short arms in isolation, as noted earlier.

[Figure 11 plots: (a) heat capacity / k_B versus reduced temperature, with curves labelled α and HJ; (b) free energy / k_B T versus Q, with curves labelled by reduced temperatures 0.099, 0.102 and 0.105.]

FIG. 11: Thermodynamics for the formation of a single Holliday junction. (a) Heat capacity curves for two α-bonded dimers forming a Holliday junction (HJ) and four strands forming two α-bonded dimers, as labelled. (b) Free energy profiles for the formation of a Holliday junction from two α-bonded dimers at different temperatures, as labelled. Q represents the total number of correct bonds in the short arms of the junction.

An interesting feature of Figure 11(b) is the plateau between Q = 6 and Q = 7. In general, when two α-bonded pairs meet to form one short arm, there is an entropic penalty associated with the excluded volume that the remaining bases in the α-structures represent to each other. This excluded volume is a large fraction of the total available space if one complete short arm is formed, so that there are no free monomers between the short arm and the α-bonded sections. As a consequence, there is not the usual free energy benefit from forming the final bond in the short arm (the one closest to the centre of the Holliday junction), as the excluded volume penalty is large and those states that are allowed involve distortion of the backbones and bonds near the centre of the Holliday junction. Although the details of this free energy penalty and the other features in Fig. 11(b) will depend on the exact geometry of the system, we expect the calculated free energy profile to be representative of that for real DNA.

It is possible to extend the two-state model discussed in Section III A to the formation of a Holliday junction by considering the concentrations of all four isolated strands, the two α-bonded intermediates and the junction itself. We assume Eq. (9) holds for every possible transition, use the same thermodynamic parameters as

[Figures 12 and 13 plots: Fig. 12 shows probability versus temperature / °C; Fig. 13 shows fractional yield versus reduced temperature. Curve labels across the two panels include α dimers, HJ, misbonded, HJ (eq), (a) and (b).]

before and apply conservation of total strand number. To estimate ΔH^0 and ΔS^0 associated with the formation of a Holliday junction, we construct a single strand by linking the ends of the oligonucleotides together with four non-bonding bases. The thermodynamic parameters associated with the folding of this structure are predicted by “UNAFold”.58 The correction for the fact that our strands are not connected by loops is discussed by Zuker.59 This leaves five simultaneous equations (assuming perfect stoichiometry) which can be solved numerically. Figure 12 compares this extended two-state model (ETSM) with the bulk thermodynamics predicted by umbrella sampling (using the same temperature scaling as before). ETSM predictions for both stages of Holliday junction formation agree well with our results, which again supports our hypothesis that much of the physics of self-assembly can be reproduced by a simple coarse-grained model. The extra width of the transitions in our model occurs for the same reasons as mentioned in Section III A when discussing Fig. 5.

FIG. 12: (Colour online) Bulk equilibrium probability of strands being in a Holliday junction (HJ) or an α-bonded dimer computed by umbrella sampling (solid lines) and by the extended two-state model (dashed lines), where lines (a) and (b) represent the two possible α-bonded dimers.

FIG. 13: (Colour online) Simulation results for the badly-designed Holliday junction of Section III C, where the initial configuration was a pair of α-bonded dimers. The lines with data points give the yield of correctly-formed junctions and misbonded configurations, as labelled. For comparison the solid line gives the equilibrium probability that the well-designed sequences of Section III B adopt a Holliday junction.

FIG. 14: (Colour online) Example of a competing minimum for a badly designed Holliday junction. The snapshot is taken from a simulation at T* = 0.0936.

C. Negative Design

The hierarchical pathway for the formation of the Holliday junction is one aspect of the sequence design that aids the formation of the correct structure. The experimental base ordering of the Holliday junction, however, was also chosen to minimize the number of competing structures, a typical example of ‘negative design’.60 We illustrate the importance of such negative design by considering a badly-designed junction, where the complementary seven-base sections consist of just one base type each. We simulate a system of two α-bonded pairs with the short arms so modified under the same conditions as for Fig. 10 and the results are shown in Figure 13. Although there is some probability of forming the correct junction, the simulations are dominated by misbonded junctions, such as the one depicted in Figure 14. Although these competing structures are energetically less stable than the target junction because of the presence of unpaired bases at the ‘dangling’ ends, they are readily accessible, because the likelihood that the first bonds formed between two α-bonded pairs are in the same registry as the target structure is low. The yield of the correct Holliday junctions will then depend upon how readily the system is able to escape from these malformed junctions. Clearly, this process is slow on the time scales of the current simulations, and is also likely to hinder the location of the target structure in experiment.

IV. DISCUSSION

In this paper we have introduced a simple coarse-grained model of DNA in order to test the feasibility of modeling the self-assembly of DNA nanostructures by Monte Carlo simulations. Any such model involves a trade-off between detail and computational simplicity, and here we deliberately chose to keep the model as simple as possible in order to give us the best chance of being able to probe the time scales relevant to self-assembly. The model involves just two interaction sites per nucleotide.

The results from our model are very encouraging. Firstly, we have shown that using our model it is feasible to model the self-assembly of both DNA duplexes and a Holliday junction. The latter represents, to the best of our knowledge, the first example of the simulation of the self-assembly of a DNA structure beyond a duplex. Secondly, the model succeeds in reproducing many of the known thermodynamic and dynamic features of this self-assembly. For example, the equilibrium melting curves agree well with those predicted by the nearest-neighbour two-state model,25 which is known to predict melting temperatures very accurately. The model is also able to capture important dynamical phenomena such as strand displacement.

Thirdly, by analysing the thermodynamic and dynamic constraints on assembly, we have been able to gain some important physical insights into the nature of DNA self-assembly and how to control it. For example, the optimal conditions for self-assembly are in the temperature range just below the melting temperature of the target structure, where this structure is the only one stable with respect to the precursors, be they ssDNA or some intermediate in a hierarchical assembly pathway. At lower temperatures, misbonded configurations can be formed that act as kinetic traps and reduce the assembly yield. Similar trade-offs between the thermodynamic driving force and kinetic accessibility have been previously seen in a variety of self-assembling systems,47–51 and also give rise to a maximum in the yield near to and below the temperature at which the target structure becomes stable.

We have also seen how hierarchical self-assembly through cooling can be a particularly useful strategy to aid self-assembly, because the formation of stable intermediates at higher temperatures simplifies the free energy landscape for the assembly of the next stage in the hierarchy by reducing the number of misbonded configurations available to the system. This simplification of the energy landscape is likely to be a general feature of hierarchical self-assembly.

Thus, our results have confirmed the utility of using coarse-grained DNA models to study the self-assembly of DNA nanostructures, and supported our hypothesis that much of the physics can be explained by describing DNA as a semi-flexible polymer with selective attractive interactions. The model’s success in forming junctions in reasonable computational time suggests that it will be possible to develop further models that have an increased level of detail, but which can still access the time scales relevant to self-assembly.

The model has also highlighted some features which it would be advantageous to include in such models. For example, greater accuracy in the details of oligonucleotide geometry, particularly the helicity of dsDNA, would allow features such as the characteristically long persistence length of hybridized strands to be reproduced and give the appropriate degree of rigidity to simulated nanostructures. Such improvement might also allow more complicated motifs to be accounted for, such as the preference for χ-stacked Holliday junctions that the current model could not reproduce.

It should be noted that if one is to introduce helicity in a physically reasonable way it should also allow for ssDNA to undergo a stacking transition to a helical form. This transition may play a significant role in the thermodynamics and kinetics of self-assembly.61 Previously proposed coarse-grained DNA models that incorporate helicity have not been designed to accurately reproduce this feature. Incorporating extra degrees of freedom which are relevant to the stacking transition, such as the rotation of the base with respect to the sugar-base bond, may also help to increase the entropy change on hybridization and hence make the transition narrower as required.

The approximation to diffusive dynamics provided by the local move Metropolis Monte Carlo algorithm could also be improved. Currently the ‘local’ moves involve displacing, rotating or bending entire strands or pairs of strands; these effectively constitute cluster moves of groups of strongly bound nucleotides, and result in slow relaxation and translation times within bound structures. More realistic dynamics may be achievable by considering trial moves of individual nucleotides, and incorporating cluster moves in a more systematic fashion, such as in the ‘virtual move’ MC algorithm proposed by Whitelam and Geissler.51,62

One potential issue with any coarse-graining is how it preserves the different time scales in a system. In Section II B we assigned an approximate mapping between the number of Monte Carlo steps and physical time based upon comparison of diffusion coefficients. There are, however, other important time scales in the system, such as the time scale for the internal dynamics of an isolated strand and the time scale over which the ‘zipping-up’ of two strands occurs after a bond has been formed. Comparisons of experimental diffusion coefficients44 and melting and bubble formation from molecular dynamics simulations30,31 suggest a large separation in time scale between diffusion-limited processes and those that rely on the dynamics of individual nucleotides. Encouragingly, we observe a similar time scale separation in our model: zipping-up and thermal relaxation of isolated strands occur over time scales shorter than 10^5 steps per strand, whereas association typically required on the order of 10^7 to 10^8 steps per strand near the melting temperature (corresponding to tens or hundreds of microseconds). Furthermore, we would argue that it is this time scale separation, and not the precise ratios of the relevant rate constants, that is important to reproduce in self-assembly simulations.

We should also note that the mapping of the diffusion constants between the model and experiment will not necessarily ensure that the rate of association is accurate in our model, because although the frequency of collisions in our model should be correct, there is also the contribution to the association rate from the probability that a collision will lead to successful association. That we can reproduce the thermodynamics of the DNA melting transitions implies that the rates of association and dissociation have the right ratio, but not that they necessarily have the correct absolute value. For example, it is conceivable that helicity (both in dsDNA and possibly in ssDNA), which is not included in the current model, will influence the likelihood that a collision is successful.

Acknowledgments

The authors are grateful for financial support from the EPSRC and the Royal Society. We also wish to acknowledge helpful discussions with Jonathan Malo, John SantaLucia Jr and Michael Zuker.

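The registry argument of Section III C can be made concrete by counting how many relative alignments of two short arms still allow complementary contacts. The sketch below is illustrative only: the sequences, the antiparallel Watson-Crick pairing rule, the threshold of three contacts and the function names are assumptions made for the example, not the sequences or bonding rules simulated in this work.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the antiparallel partner that pairs fully with `seq`."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def pairs_at_offset(strand, partner, offset):
    """Count complementary contacts when the partner is slid by `offset` bases.

    The partner is reversed so that, for equal-length strands, offset 0 is
    the fully aligned target registry.
    """
    rev = partner[::-1]
    count = 0
    for i, base in enumerate(strand):
        j = i + offset
        if 0 <= j < len(rev) and COMPLEMENT[base] == rev[j]:
            count += 1
    return count


def registry_profile(strand, partner):
    """Map every overlapping offset to its number of complementary contacts."""
    n, m = len(strand), len(partner)
    return {off: pairs_at_offset(strand, partner, off)
            for off in range(-(n - 1), m)}


def competing_registries(strand, partner, min_pairs=3):
    """Offsets other than the target that still offer >= min_pairs contacts."""
    profile = registry_profile(strand, partner)
    return [off for off, k in profile.items() if off != 0 and k >= min_pairs]


if __name__ == "__main__":
    bad = "AAAAAAA"    # hypothetical single-base-type arm
    good = "ACCTGAT"   # hypothetical mixed-sequence arm
    for arm in (bad, good):
        partner = reverse_complement(arm)
        print(arm, partner, competing_registries(arm, partner))
```

For the single-base-type arm every shifted alignment with at least three overlapping bases is a viable misbonded registry (eight of them for a seven-base arm), whereas the mixed example sequence has none, which is the sense in which negative design removes competing minima.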
1 N. C. Seeman, Nature 421, 427 (2003).
2 S. Pitchaiya and Y. Krishnan, Chem. Soc. Rev. 35, 1111 (2006).
3 W. Saenger, Principles of Structure (Springer-Verlag, 1984).
4 P. J. Hagerman, Annu. Rev. Biophys. Biophys. Chem. 17, 265 (1988).
5 N. R. Kallenbach, R.-I. Ma, and N. C. Seeman, Nature 305, 829 (1983).
6 T. J. Fu and N. C. Seeman, Biochemistry 32, 3211 (1993).
7 E. Winfree, F. R. Liu, L. A. Wenzler, and N. C. Seeman, Nature 394, 539 (1998).
8 J. Malo, J. C. Mitchell, C. Venien-Bryan, J. R. Harris, H. Wille, D. J. Sherrat, and A. J. Turberfield, Angew. Chem. Int. Ed. 44, 3057 (2005).
9 H. Yan, S. H. Park, G. Finkelstein, J. H. Reif, and T. H. LaBean, Science 301, 1882 (2003).
10 P. W. K. Rothemund, Nature 440, 297 (2006).
11 J. H. Chen and N. C. Seeman, Nature 350, 631 (1991).
12 Y. W. Zhang and N. C. Seeman, J. Am. Chem. Soc. 116, 1661 (1994).
13 R. P. Goodman, I. A. T. Sharp, C. F. Tardin, C. M. Erben, R. M. Berry, C. F. Schmidt, and A. J. Turberfield, Science 310, 1661 (2005).
14 C. M. Erben, R. P. Goodman, and A. J. Turberfield, J. Am. Chem. Soc. 129, 6992 (2008).
15 W. M. Shih, J. D. Quispe, and G. F. Joyce, Nature 427, 618 (2004).
16 F. F. Anderson, B. Knudsen, C. L. P. Oliveira, R. F. Frøhlich, D. Krüger, J. Bungert, M. Agbandje-McKenna, R. McKenna, S. Juul, C. Veigaard, et al., Nucleic Acid Res. 36, 1113 (2008).
17 Y. He, T. Ye, M. Su, C. Zhang, A. Ribbe, W. Jiang, and C. Mao, Nature 452, 198 (2008).
18 F. A. Aldaye and H. F. Sleiman, J. Am. Chem. Soc. 129, 13376 (2007).
19 J. Zimmermann, M. P. J. Cebulla, S. Mönninnghoff, and G. von Kiedrowski, Angew. Chem. Int. Edit. 47, 3626 (2008).
20 C. Pistol, A. R. Lebeck, and C. Dwyer, in DAC ’06: Proceedings of the 43rd annual conference on Design automation (ACM, New York, 2006), pp. 919–924.
21 P. Yin, H. M. Choi, C. R. Calvert, and N. A. Pierce, Nature 451, 318 (2008).
22 T. E. Cheatham, III, Curr. Opin. Struct. Biol. 14, 360 (2004).
23 D. Poland and H. A. Scheraga, Theory of Helix-Coil Transitions in Biopolymers: Statistical Mechanical Theory of Order-disorder Transitions in Biological Macromolecules (Academic Press, New York, 1970).
24 R. Everaers, S. Kumar, and C. Simm, Phys. Rev. E 75, 041918 (2007).
25 J. SantaLucia, Jr., Proc. Natl. Acad. Sci. U.S.A 17, 1460 (1998).
26 J. SantaLucia, Jr. and D. Hicks, Annu. Rev. Biophys. Biomol. Struct. 33, 415 (2004).
27 J. Bois, S. Venkataraman, H. M. T. Choi, A. J. Spakowitz, Z. Wang, and N. A. Pierce, Nucleic Acids Res. 33, 4090 (2005).
28 T. Dauxois, M. Peyrard, and A. R. Bishop, Phys. Rev. E 47, 684 (1993).
29 S. Buyukdagli, M. Sanrey, and M. Joyeux, Chem. Phys. Lett. 419, 434 (2006).
30 K. Drukker, G. Wu, and G. C. Schatz, J. Chem. Phys. 114, 579 (2001).
31 T. A. Knotts, IV, N. Rathore, D. Schwartz, and J. de Pablo, J. Chem. Phys. 126 (2007).
32 J. C. Araque, A. Z. Panagiotopoulos, and M. A. Robert, ???? (2006).
33 M. Sales-Pardo, R. Guimera, A. A. Moreira, J. Widom, and L. Amaral, Phys. Rev. E 71, 051902 (2005).
34 F. W. Starr and F. Sciortino, J. Phys.: Condens. Matter 18, L347 (2006).
35 P. Cifra, Polymer 45, 5995 (2004).
36 M. C. Murphy, I. Resnik, W. Chang, T. M. Lohman, and T. Ha, Biophys. J. 86, 2530 (2004).
37 As with all ‘ladder’ models, there is an ambiguity in defining the value of l because it is also equal to the separation of bases along the chain in the double-stranded geometry, whereas in the double helix the rise per base pair is 3.3 Å due to the wrapping of single strands around the axis. As the absolute value of l is used to map onto the experimental persistence length of ssDNA we choose to use 6.3 Å.
38 J. Malo, Ph.D. thesis, Oxford University (?).
39 N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E. Teller, J. Chem. Phys. 21, 1087 (1953).
40 D. Frenkel and B. Smit, Computer Simulation of Liquids (Academic Press Inc. London, 2001).
41 K. Kikuchi, M. Yoshida, T. Maekawa, and H. Watanabe, Chem. Phys. Lett. 185, 335 (1991).
42 L. Berthier and W. Kob, J. Phys.: Condens. Matter 19, 205130 (2007).
43 G. Tiana, L. Sutto, and R. A. Broglia, Physica A 380, 241 (2007).
44 J. Lapham, J. P. Rife, P. B. Moore, and D. M. Crothers, J. Biomol. NMR 10, 252 (1997).
45 G. Torrie and J. P. Valleau, J. Comp. Phys. 23, 187 (1977).
46 T. E. Ouldridge, A. A. Louis and J. P. K. Doye, in preparation.
47 H. D. Nguyen, V. S. Reddy, and C. L. Brooks III, Nano Lett. 7, 338 (2006).
48 M. F. Hagan and D. Chandler, Biophys. J. 91, 42 (2006).
49 A. W. Wilber, J. P. K. Doye, A. A. Louis, E. G. Noya, M. A. Miller, and P. Wong, J. Chem. Phys. 127, 085106 (2007).
50 D. C. Rapaport (arXiv:0803.0115).
51 S. Whitelam, E. H. Feng, M. F. Hagan, and P. L. Geissler (arXiv:0806.390).
52 N. Peyret and J. SantaLucia, Jr., Hyther™ version 1.0, URL http://ozone3.chem.wayne.edu/cgi-bin/login/.
53 N. Peyret, P. A. Seneviratne, H. T. Allawi, and J. SantaLucia, Jr., Biochemistry 38, 3468 (1999).
54 B. Yurke, A. J. Turberfield, A. P. Mills, F. C. Simmel, and J. Neumann, Nature 406, 605 (2000).
55 D. Y. Zhang, A. Turberfield, B. Yurke, and E. Winfree, Science 318, 1121 (2007).
56 E. Luijten, Computing in Science & Engineering 8, 20 (2006).
57 M. Ortiz-Lombardia, A. Gonzalez, R. Eritja, J. Aymami, F. Azorin, and M. Coll, Nat. Struct. Biol. 6, 913 (1999).
58 N. Markham and M. R. Zuker, Bioinformatics: Volume II: Data, Sequence Analysis and Evolution (Humana Press, 2008).
59 M. Zuker, Nucleic Acids Res. 31, 3406 (2003).
60 J. P. K. Doye, A. A. Louis, and M. Vendruscolo, Phys. Biol. 1, P9 (2004).
61 J. Holbrook, M. Capp, R. Saecker, and M. Record, Biochemistry 38, 8409 (1999).
62 S. Whitelam and P. L. Geissler, J. Chem. Phys. 127, 154101 (2007).