Universal folding pathways of nets

Paul M. Dodda, Pablo F. Damascenob, and Sharon C. Glotzera,b,c,d,1

aChemical Engineering Department, University of Michigan, Ann Arbor, MI 48109; bApplied Physics Program, University of Michigan, Ann Arbor, MI 48109; cDepartment of Materials Science and Engineering, University of Michigan, Ann Arbor, MI 48109; and dBiointerfaces Institute, University of Michigan, Ann Arbor, MI 48109

Contributed by Sharon C. Glotzer, June 6, 2018 (sent for review January 17, 2018; reviewed by Nuno Araujo and Andrew L. Ferguson)

Low-dimensional objects such as molecular strands, ladders, and navigate, thermodynamically, from a denatured (unfolded) state sheets have intrinsic features that affect their propensity to fold to a natured (folded) one. Even after many decades of study, into 3D objects. Understanding this relationship remains a chal- however, a universal relationship between molecular sequence lenge for de novo design of functional structures. Using molecular and folded state—which could provide crucial insight into the dynamics simulations, we investigate the refolding of the 24 causes and potential treatments of many diseases—remains out possible 2D unfoldings (“nets”) of the three simplest Platonic of reach (11–13). shapes and demonstrate that attributes of a net’s topology— In this work, we study the thermodynamic foldability of 2D net compactness and leaves on the cutting graph—correlate nets for all five Platonic solids. Despite being the simplest and with thermodynamic folding propensity. To explain these cor- most symmetric 3D , the family of Platonic shapes relations we exhaustively enumerate the pathways followed suffices to demonstrate the rapid increase in design space as by nets during folding and identify a crossover temperature shapes become more complex: A has 2 possible net Tx below which nets fold via nonnative contacts (bonds must representations, and octahedra each have 11 nets, and break before the net can fold completely) and above which dodecahedra and icosahedra have, each, 43,380 distinct edge nets fold via native contacts (newly formed bonds are also unfoldings. Here we are interested in the thermodynamic self- present in the folded structure). Folding above Tx shows a uni- folding of these nets. Our goal is to understand how topology versal balance between reduction of entropy via the elimination affects yield in the stochastic folding of 3D objects. The advan- of internal degrees of freedom when bonds are formed and tage is threefold. First, by using a collection of sheets folding gain in potential energy via local, cooperative edge binding. into the same target shape, we isolate the geometric attributes Exploiting this universality, we devised a numerical method to responsible for high-yield folding. Second, the model allows efficiently compute all high-temperature folding pathways for exhaustive computation of the pathways followed by the nets dur- any net, allowing us to predict, among the combined 86,760 ing folding, elucidating how some nets achieve high yield. Third, nets for the remaining Platonic solids, those with highest fold- by studying increasingly more complex objects—from tetrahedra ing propensity. Our results provide a general heuristic for the to icosahedra—we can use the folding mechanisms quantified design of 2D objects to stochastically fold into target 3D geome- in the simplest objects to predict, and potentially validate, their tries and suggest a mechanism by which and folding occurrence in the more complex shapes. propensity are related above Tx, where native bonds dominate Beginning with a target Platonic shape (Fig. 1), we construct folding. a graph whose vertices and edges correspond to those in the polyhedron. This mapping of the shape to a graph facilitates the folding | origami | polyhedra nets | cooperativity exhaustive search of all distinct nets by allowing spanning tree enumeration (3). A set of prechosen edges (the cutting tree) is n the 16th century, the Dutch artist Albrecht Durer¨ inves- Itigated which 2D cuts of nonoverlapping, edge-joined poly- Significance gons could be folded into Platonic and Archimedean polyhedra. Durer¨ cuts were later called “nets” but, for a long time, the inter- What makes an object successful at thermal folding? Protein est around them was mostly restricted to the field of mathematics scientists study how sequence affects the pathways by which (1–3). A newer concept, self-folding origami, adds a modern chained amino acids fold and the structures into which they twist to the ancient art of paper folding. By providing a mech- fold. Here we investigate the inverse problem: Starting with anism to achieve complex 3D from low-dimensional a 3D object as a polyhedron we ask, which ones, among the objects—without the need for manipulation of the constituent many choices of 2D unfoldings, are able to fold most consis- parts—self-folding brings the concepts pioneered by Durer¨ to tently? We find that these “nets” follow a universal balance the forefront of many research fields, from medicine (4) to between entropy loss and potential energy gain, allowing robotics (5). us to explain why some of their geometrical attributes (such Several recent works have leveraged physical forces to achieve as compactness) represent a good predictor for the folding controllable folding of 3D objects including light (6), pH (7), propensity of a given shape. Our results can be used to guide capillary forces (8), cellular traction (9), and thermal expansion the stochastic folding of nanoscale objects into drug-delivery (10). Other works have investigated the relationship between devices and thermally folded robots. geometric attributes of the object being folded and its propen- sity for successful folding. In the macroscopic folding of kirigami Author contributions: P.M.D., P.F.D., and S.C.G. designed research; P.M.D. implemented sheets—origami-like structures containing cuts and creases—the the methods and performed numerical simulations; P.M.D., P.F.D., and S.C.G. analyzed effect of different cut patterns on the material’s stress–strain data; and P.M.D., P.F.D., and S.C.G. wrote the paper. behavior has been elucidated (11–13) and the “inverse design Reviewers: N.A., University of Lisbon; and A.L.F., University of Illinois at Urbana– problem” of finding cuts leading to the folding of a particular Champaign. target structure has been solved (14). For submillimeter-sized The authors declare no conflict of interest. capsules, formed via nonstochastic folding of nets into polyhe- Published under the PNAS license. dra, it has been suggested that nets fold with higher yield when 1 To whom correspondence should be addressed. Email: [email protected] they are compact (8, 15, 16), but the reason for this correlation This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10. remains unclear. In natural systems, the canonical example of 1073/pnas.1722681115/-/DCSupplemental. self-folding occurs for proteins, where a string of amino acids Published online July 3, 2018.

E6690–E6696 | PNAS | vol. 115 | no. 29 www.pnas.org/cgi/doi/10.1073/pnas.1722681115 Downloaded by guest on September 23, 2021 Downloaded by guest on September 23, 2021 tdb nodn oyern hc fte hwtehighest polyhedron? the original show the into them refold of to which propensity polyhedron, for gener- a nets possibility unfolding all This by Among ated question: observed. following are the raises misfolds traps traps kinetic net both kinetic and (quenched), for sys- possible decreased the allows are rapidly when nets is consequence, a the temperature As tem of contacts. edges nonnative original and between the native interactions to the explicitly of refer Unless we arise. state” can foldings “folded and, 3D by polyhedron. state other not otherwise, ground see, does the noted will This of from we configuration. uniqueness formed as polyhedron the ground-state the however, not the that guarantee, so edges also net is all a folding of between springs interactions by (more implicit joined (sticky) dynamics in attractive molecular suspended ranged Langevin net, of via in influence single details modeled the a is are and on solvent, spheres fluctuations The connected springs. thermal rigidly harmonic of via composed polygons a adjacent in solids to Platonic the for nets 86,784 algorith- all (19). computed database be list can We nets these (18). become for mically demonstrated interest might of recently subset nets been a all that has computing it of prohibitive, process computationally the (see while discovered shapes, are nets Methods distinct and be Materials all can this until (17), single, exhaustively enumerated a are For repeated nets net. create whose a to faces: shapes, nonoverlapping Platonic unfolding, of the edge sheet 2D called flat process and contiguous, a in cut, then cooling as particular calculated the then via shape is 3D configuration yield desired annealing folding protocol. final the The achieving slow the of shape. a probability reached, target the or the is with quench temperature compared fast final is a the either Once brought following WCA and protocol. repulsive temperature temperature purely high low a at via initialized to spheres spheres are Blue gray simulations with potential. The and potential. Lennard–Jones rigidly other spheres a each of via with union model interact interact MD a spheres the as In gray springs. modeled harmonic shown, net) then via hinges is (a along net faces tethered together, the the nonoverlapping held From of of cut. connection the being Each hinged those of results. planar, mark (edges) a tree vertices cutting tree, the the cutting to in correspond edges to Red graph in used polyhedron. tree (edges) is vertices cutting The representation so-called cuttings. graph the edge a of permutations clockwise), different moving enumerate and (top, 1. Fig. ode al. et Dodd si rtisadohrbooeue,tenonspecificity the biomolecules, other and proteins in As connected polygons rigid of sheet a as modeled is net Each oko o oyer odn td.Satn ihatre 3D target a with Starting study. folding polyhedra for Workflow .W sinnnpcfi,short- nonspecific, assign We Methods). and Materials o oedtis.W oeta o other for that note We details). more for ftent civ ihrta 0 ucs aefreither for rate success 50% none than 2C), and higher (Fig. folding, achieve in rate. the cooling nets success for 2 the 50% rates; revealed (Fig. cooling than of octahedron nets slow greater the at showed and only the nets cube the For the each of both differences: more for of has larger simulations net nets Similar even the 11 minimum. as the global expected the of is find This to folding higher net. time a each yielded 46% for simulations other gen- rate probability In cooling succeeded—the configurations. slower nets misfolded the in eral tetrahedron linear resulted target the experiments the the sim- of of into 125 54% folded Of only different protocol: nets while remarkably cooling triangular showed fast all 2A), the ulations, (Fig. for triangular net propensities the linear folding as (see the to protocol referred and hereafter cooling net tetrahedron, slow the a for nets and Methods each fast and for a simulations rials both cooling of using of polyhedron their hundreds net, into performed reliably fold we to origin able nets the identify Reliably To More Fold Nets “Leafy” Compact, More ihnme flae nterctigte n mle imtrusually diameter smaller a and tree cutting have best. their that fold on Nets leaves better: of folding fold number slow nets high compact for a more even the 50%) Generally shape. (below (C original poorly octahedron fold the for nets while 8/11 reactions (B) (B rates. nets cooling cube slower a rates, for the cooling found (Bottom)], is rapid linear propensity for folding visible rate. and similar is cooling but (Top) rate success [triangular low folding nets at in difference tetrahedron yield noticeable two lowest the to that For each highest simulations for (A) from Nets of rates. ordered cooling fraction two are for the polyhedron calculated was by state, folded defined the yield, reached Folding solids. Platonic 2. Fig. BC A feto e oooyo h odn rbblt o h simplest the for probability folding the on topology net of Effect eto o oedtis.Tetodistinct two The details). more for section otnt r nbet odit the into fold to unable are nets most ) PNAS | o.115 vol. | o 29 no. ,ol 3 only B), and | Mate- C E6691 For )

APPLIED PHYSICAL SCIENCES The misfolded configurations for the tetrahedron and cube AB4.0

nets were incomplete 3D geometries (showing, for instance, ) -4 faces collapsed on top of each other or bonds incompatible with 3.0 the formation of the respective target shape). In contrast, octa- hedron nets often folded into another 3D shape: a concave, 2.0 boat-like conformation with the same number of edge–edge con- tacts as the octahedron: in other words, a degenerate ground 1.0 state. This competing structure, which is less symmetric than the octahedron, has higher rotational entropy resulting in a lower 0.0 free energy than the octahedron (see SI Appendix, Fig. S1A C 1.0 for folding probabilities for the boat conformation). A compe- tition between similar degenerate structures was reported for finite clusters of six attractive spherical colloids (20), where sym- 0.5 metry breaking leads to the formation of the same boat-like conformation. Relative Flux Reactive Flux (x10 Fig. 2 shows that, despite having the same ground-state energy, 0.0 not all nets of a polyhedron are equivalent. In general, we observe 246 that the nets that fold most reliably are the most compact and have Temperature (kT) the highest number of leaves on its cutting graph (green solid cir- cles in nets in Fig. 2). A net is said to be more compact if it has a Fig. 3. Folding pathways for a representative cubic net. (A) Intermediate large number of leaves (the vertices with degree one on the cutting folding states (nets, in the diagram), arise when new bonds between edges are formed. States are connected with an arrow when a pathway from one tree) and a small diameter (the longest shortest path between any state to another is observed in the simulation. The thickness of such con- two faces on the face graph). Exact values for the leaves and diam- nection is proportional to the measured probability of this transition being eter are shown in SI Appendix, Table S1. Most strikingly, even observed. For better visualization, only the most visited states are shown. nets differing only by the location of a single face can have folding Red (green) faces correspond to a state where a nonnative (native) contact probabilities reduced from 99% to 17%. What causes one shape is formed. The pathways containing only native contacts follow a “sequen- to fold nearly perfectly every time while a slightly different one tial” folding process, where one face folds at a time. Nonnative pathways fails to do so almost as frequently? And why do net “leafiness” can lead to the native folded shape via one of two mechanisms: (i) Either the and “compactness” correlate with folding yield? misfold helps, bringing otherwise far away faces together, or (ii) the fold- ing proceeds sequentially after the misfold occurs until it reaches a point where the nonnative contact must break for folding to continue. In both High-Temperature Folding Happens via Native Contacts cases, if the correct polyhedron is achieved in the end, the nonnative con- To answer why small differences in net topology can have a tact is corrected along the pathway. The network represents 55% of the large impact on the net’s folding propensity, we used Markov- folding flux at T = 3kT.(B) The total probability flux, defined as the sum of the fluxes along all pathways, for the representative net in A as a func- state models (MSM) (21–23) to compute the pathways through tion of temperature. The peak around temperature T = 3kT shows that which nets fold into their folded state. Since quench rate was there is a temperature at which there is a maximum number of expected observed to affect the folding propensity of the nets, we run transitions from the unfolded to the folded state per unit time. (C) The rel- constant temperature simulations of each net while computing ative amount of flux that goes through pathways that use the native (green the rate of transitions between two states (flux). A represen- curve) and nonnative (red curve) pathways. The folding seems to happen tative folding network is shown in Fig. 3A (see SI Appendix, mostly via native contacts formation when the system is kept at higher Fig. S2 for other example networks). Arrows represent observed temperatures. transitions between different states and each arrow has a thick- ness proportional to the probability flux of the transition being is a maximum number of expected transitions from unfolded to observed (see Materials and Methods for more details). To sim- folded state per unit time τ. The representative network shown plify, we show only the most visited pathways (i.e., those whose in Fig. 3A was calculated at that peak temperature. Finally, at combined flux accounts for at least 50% of the total reactive high T the unfolded state is preferred and again the flux van- flux between unfolded and folded states). Intermediate config- ishes. These trends are also true for the other nets studied (SI urations can achieve the folded state via the formation of native Appendix, Fig. S3). If we separate the flux into those following (green) or nonnative (red) contacts. If a pathway includes native native and nonnative contacts, we see (Fig. 3C) that the fold- contacts only, every newly formed bond is compatible with the ing pathways at high temperature mostly follow the formation of final polyhedron and error correction is not needed. When the native contacts while the behavior inverts for low temperatures, folded state is achieved via incompatible bonds, we observe that and mostly nonnative contacts are observed. There were only two these nonnative contacts can sometimes have an active role by nets where a crossover temperature was not observed. The trian- bringing otherwise far-away native contacts closer together, facil- gular tetrahedral net does not exhibit a crossover temperature itating folding. In other cases the nonactive contacts contribute since it has no traps. The best-folding cubic net also does not only passively and folding occurs sequentially after the misfold have a crossover temperature in the range of temperatures we occurs until it reaches a point where the nonnative contact must study; while traps exist, the fraction of native contacts decreases break for folding to continue. In either case, nonnative con- and the temperature decreases but never falls below 50% (SI tacts must eventually break before a folded configuration can be Appendix, Fig. S3). This T dependency is also observed in col- achieved. loids, where the assembly of an is monomeric at high The total flux connecting unfolded and folded states (Fig. 3B) temperatures, but, at low temperature, particles first aggregate first increases with temperature and then decreases to zero at into large clusters (not necessarily compatible with icosahe- high temperature. At low T the folding flux is low because the dral symmetry) and those later rearrange into the ground-state states are mostly trapped into a few configurations; i.e., the slow structure (24). kinetics inhibit bond breaking. As the temperature increases, To gain an understanding of the mechanisms underlying the bonds can now break and the folding/unfolding process occurs observation that native contacts are favored at high temperatures at a higher rate. At intermediate T , a maximum in reactive flux during folding, we calculated the number of degrees of freedom is observed, meaning that there is a temperature at which there associated with each intermediate.

E6692 | www.pnas.org/cgi/doi/10.1073/pnas.1722681115 Dodd et al. Downloaded by guest on September 23, 2021 Downloaded by guest on September 23, 2021 uain ed ob rteueaigcniaebnsthat bonds candidate enumerating cal- first MSM by full a so dodeca- for do need the We the of culation. without nets icosahedron, 86,760 and the hedron for pathways high-temperature fdgeso reo vralptwy o ahgvnvleof value given each for pathways all number over the freedom of approx- averages of in weighted degrees freedom are of of points of data degrees The its degrees way. trade same internal will the of imately net each number that and find We contacts freedom. native parameters: order two 4. Fig. Appendix, (SI above lie explained S4 to balance Fig. found narrow are the folding from away the far of stages non- occur—and intermediate contacts when as native temperatures low at occur its not necessarily maximize does to maximization strives this system Importantly, the entropy. such process, conformational the locally of step happens each In at folding that, energy. between potential high-temperature of balance terms, gain fol- narrow and practical nets a freedom of all achieves degrees remarkably, that of of reduction that, pathway number folding shows the trade- a 4 in optimal low Fig. degeneracy the paths. high of optimal with one such nets along favoring primar- (ii) compact paths, and favoring should off leaves, connections, Folding many (local) with nearby temperature. nets via high (i) at happen folding ily the nets for general mechanism following of obser- the this hypothesize system From we the entropy. process, vation, conformational the its of maximize step to each strives at that, folding gain such high-temperature locally and terms, happens freedom practical In of energy. degrees potential of nar- of a reduction achieves between that balance pathway that, folding row shows a 4 follow Fig. nets octahedron. all and remarkably, cube, tetrahedron, the of nets high at occurs number contacts number native the forming calculate and degrees maximizing freedom between path- trade-off of this folding whether the test To thereby along ways. freedom, entropy of conformational degrees locally, fewest the fold the maximizing might reduces that nets manner that a suggests gen- in yield leaves higher many with with those fold and erally nets compact more that fact The and Entropy Between Enthalpy Balance Universal a Follow Nets ode al. et Dodd each freedom. for of plotted degree temperature DOF, The S1. Table by df. cannot is 1 given net plot least is the the at net that of losing fact corner without the right bond from upper a gain arising The region eye. forbidden the geometrically guide a to drawn are lines The

Normalized Conformational DOFs generate to algorithm an devised we hypothesis this Using rjce ahascmue rmteMMo l ftent onto nets the of all of MSM the from computed pathways Projected ). Q fntv otcsa e od.W os o l 24 all for so do We folds. net a as contacts native of Tetrahedron Octahedron T m + Cube ,and 0.5, Normalized NativeContacts N fitra ere ffedmadthe and freedom of degrees internal of T m srpre o ahntin net each for reported is geometrically forbidden IAppendix SI Q/ T folded we , , . cd 2–8.Teietfidtaeofbtenetoyand entropy between trade-off identified amino The of folding of the (26–28). and assembly particles, acids (20) (8), colloidal polyhedra and (24) of patchy folding irreversible octahedron macroscopic for observe we con- same that (20). the nets conformation form to boat-like Finally, prefer (24). spheres cave, sticky formation in colloidal contact of structure native systems overall to temperature, the process high with equivalent compatible at an are col- pathways that Similarly, bonds monomeric (27). forming via state further assemble native by the loids followed to form leading contacts low rearrangements nonnative At which (26). in native-only temperature collapse, via melting fold their to to T simulation close in when high observed contacts at pro- been small pathways Several have nets. contact polyhedron teins to native unique not for is temperature preference observed The Conclusion and Discussion face. another onto occur typically folds nets states face the trapped one to which when contrast for fur- shapes, sharp three in is other is states the This of trapped nets. the these for of increased and diversity ther tetrahedral the both and con- are motifs, boat there octahedral the nets, in icosahedral seen For as examples). tetrahedra, (see above partial motifs mentioned or formation are these full and There into nets, states. fold octahedral trapped can many in on arises motifs” second that “tetrahedral The complexity nets. the pro- many for is folding probability factor folding landscape the the energy boost net’s throughout the may funnel and rigidity sufficiently to the important is increasing cess by constraining further net that shapes ver- implies three the This leaf other rigid. loop about the the of folding renders case tices ), the cube, In tetrahedron, ver- df. leaf trapped 2 (i.e., about a retains folding loop enter after icosahedron, the still the tices, may of intermediate case the the In so state. interme- rigid), resulting the not (i.e., the is df to diate 1 due retain but vertex, loop, leaf the octa- the symmetry instance, about unique For bond bond. a a make form can nets edges hedron leaf sharing faces the the when by vertex factor retained leaf first freedom a The of role. degrees a of play number the may is that factors remains probability two exception are this higher there for elusive, with reason the folds While octahedron. which this the than to dodecahedron, exception the One is fold. instance, trend successfully to 20-sided For the unable increases. perfectly, is faces nearly icosahedron folds of tetrahedron number 4-sided the the while as decreases net to a correlates inversely it order and contact sequence low (25). acid rate that amino protein- folding shown the the away been on that far In how are has connections of state. contacts measure local unfolded a native is of specific the order of contact from amount the number literature fold the The folding to of well. required measure as are systems a leaves other is the in leaves and analogues propensity have folding may the and between folding correlation high-T correct The many and have for leaves also many propensity nets with higher those are polyhedra show expected, propensities diameter As folding small 6. corresponding Fig. in The plotted (see nets). with those yields specific nets, then folding for 86,760 We low the fold. of among and to find, high degrees contacts to native principles maximize these local, that used only pathways using octahedron and the the for freedom and illustrating 5A) (Fig. 5B), cube fol- the (Fig. pathways for nets example 11 combined largest all the the by the in lowed shows 5 or (see have Fig. net freedom that details). the of more ones on degrees other selecting of each by to number then next are and they intermediate if made be can u ipemdlteeoedascnetosbtenthe between connections draws therefore model simple Our of propensity folding that suggest observations our Overall, o ihhdohbct)teptwy hf oahydrophobic a to shift pathways the hydrophobicity) high (or i.S1A Fig. Appendix, SI PNAS is 5adS6 and S5 Figs. Appendix, SI ahast h rudstate. ground the to pathways aeil n Methods and Materials | o.115 vol. | o 29 no. o more for | E6693 for

APPLIED PHYSICAL SCIENCES A and a hexagonal approximate for the pentagonal faces. For each simula- tion the drag coefficient, γ, was set to the inverse of the number of spheres used to create a facet. The spheres in the center of the face interact via a Weeks–Chandler–Andersen (WCA) potential shown in blue in Fig. 1, while the spheres on the free edges of the net interact via a Lennard–Jones poten- tial; both potentials used  and σ values of 1.0. The rigid facets are tethered together using harmonic springs along the edges, using a spring constant of 800 and equilibrium distance of 1.0.

Quantifying Yield. To quantify the folding yield we ran 125 simulations start- ing from a high temperature and quenched the temperature to near zero (T = 0.1). We then defined the yield as the fraction of simulations that com- pletely folded into the target polyhedron. To distinguish between nets that had very low probability, we defined the folding propensity as the average fraction of native contacts averaged over all of the quenching simulations B we ran. If all nets folded perfectly in all runs, the folding propensity is one but is possibly nonzero even if all nets did not fold at all. We linearly quenched 125 systems from 0.1 ≤ T ≤ Tm + 2.5, where Tm is the folding temperature defined as the maximum melting temperature (temperature at which the net is unfolded 50% of the time) among all nets for a given target shape. We used two different cooling rates to investigate their influence in −6 −8 the folding yield: 2.5 ×10 T/t and 2.5 ×10 T/t. The Tm for each shape is listed in SI Appendix, Table S1.

MSMs and Folding Pathway Calculations. MSMs (21) have been used to study protein folding (22, 23) and virus capsid assembly (39) and can provide a

1 Fig. 5. Combined dominant pathways for the 11 nets of the cube (A) and Cube the 11 nets of the octahedron (B) calculated at high temperature. In both cases the dominant pathways are sequential (one face folding at a time) and include only native contacts. As in Fig. 3, arrows indicate transitions between two states, the arrow thickness being proportional to the probability of

Propensity (0.81, 0.002) (-0.81, 0.003) (0.79, 0.004) observing such a transition. 0 1 Octahedron

enthalpy that dictates high-temperature folding provides guid- ing principles for the assembly of 3D complex geometries from potentially simpler-to-fabricate 2D nets. We demonstrated the Propensity 0 (0.86, 0.001) (-0.23, 0.48) (0.68, 0.021) judicious pathway engineering via the selection of nets with cer- 1 (-0.77, 1.5e-8) tain characteristics. We found that nets at high temperature fold Dodecahedron through pathways that maximize the internal degrees of freedom, regardless of their propensity to fold, and the more compact nets fold with higher propensity. The compactness measures cor-

related with the number of pathways connecting the unfolded Propensity 0 (0.98, 8.3e-29) (0.61, 4.6e-5) and folded states, offering some understanding on why these 1 Icosahedron (-0.86, 7.2e-13) measures work well. In addition to giving insights into the ther- modynamics of folding in naturally occurring systems, our results could also provide a route for the fabrication of anisotropic

Brownian shells, paving the way for the self-assembly of com- Propensity 0 (0.97, 6.0e-26) (0.88, 4.7e-14) plex crystals from nanoshells and colloidal shells (29–32) capable 010011 of encapsulating cargo. We expect these results to impact future Leaves Diameter Paths experiments on folding of graphene sheets (13), graphene oxide layers (12), or DNA–origami polyhedral nets (33). Fig. 6. Correlation between different geometric and topological quantities calculated for the net and the folding propensity, defined as the average Materials and Methods fraction of native contacts formed across all 125 quenching simulations. Enumeration of Polyhedral Nets. We create nets from polyhedra via a pro- Leaves are the vertices with degree one on the cutting tree, the diame- cess known as edge unfolding. In edge unfolding, one cuts along a set of ter is the longest shortest path along the face graph, and the number of prechosen edges, called the cutting tree, of a polyhedron (e.g., a cube) to high-temperature pathways was calculated by enumerating the number create a single, contiguous flat 2D sheet of connected (square) faces: a net. of pathways that maximize the internal degrees of freedom at each step. We enumerate all of the nets of each polyhedron by generating random Each row corresponds to the cube (red), octahedron (green), dodecahedron weights for the edges of the skeleton-1 graph of the polyhedron on the (orange), and icosahedron (purple), respectively; data for the tetrahedron interval [0, 1]. The minimal spanning tree was found using Kruskal’s algo- were omitted because there are only two nets. Pearson coefficients and P rithm (34). We then converted the spanning tree to a net and added it to values are reported in each panel. We found that the number of leaves pro- the database if it did not already exist. We ran this loop for many iterations vided the strongest linear correlation for both the cube and the octahedron until we found all of the nets for each shape. (Pearson coefficients are 0.81 and 0.86, respectively) and so the number of leaves was used to try to predict the nets that would fold with both high and Langevin Dynamics and Molecular Dynamics Simulation. Langevin dynamics low propensity. In the bottom two rows (dodecahedron and icosahedron), are used to model the folding dynamics for each net using HOOMD-Blue the nets predicted to fold well are plotted with a solid circle and the nets (35–38). Each face of the net is approximated by a union of spheres acting predicted to fold poorly are plotted with a solid “x.” We find that while the as a rigid body, with an edge length of 10 spheres. The spheres are arranged number of leaves is a good predictor for the dodecahedron, the icosahedron in a hexagonal lattice for triangular faces, a square lattice for square faces, still folds with relatively low propensity.

E6694 | www.pnas.org/cgi/doi/10.1073/pnas.1722681115 Dodd et al. Downloaded by guest on September 23, 2021 Downloaded by guest on September 23, 2021 1 ilK,OknS,WilT,CoeaJ,VezV 20)Tepoenfolding protein The (2007) VA Voelz JD, Chodera TR, Weikl SB, Ozkan to nanopolyhedra KA, From Dill materials: thin-film 11. Self-folding (2012) D Gracias V, Shenoy 10. 2 huT,e l 21)Akrgm praht niern lsiiyi nanocomposites in elasticity engineering to approach kirigami A (2015) al. et TC, Shyu 12. 7 uknotF akrM(98 h ubro eso h eua ovxpolytopes convex regular the of nets of number The (1998) M Parker F, Buekenhout 17. the determines pluripotent Compactness Klobu (2009) to R, DH Kaplan route Gracias 16. A AM, Zarafshar kirigami: TG, lattice Leong A, Algorithmic Azam (2015) 15. al. et DM, kirigami. Graphene Sussman (2015) al. 14. et MK, Blees 13. sdtasto ahter 2,4,4)t eemn h ecieflux, reactive the determine to 44) of 43, (23, eigenvalues theory the path Appendix which transition (SI used constant at become time time, matrix the probability lag transition identifying The the of above. to described protocol combined MSM standard and the the steps in time 10 state dihedral every we a and computed define edges were until Bonded hinge temperature. each repeated each for and was angles net each process for steps) This (14 time 1,875 seeds. of total random a obtained new using jectories All angles. dihedral the of of variation list a the with (42) metric, with DBSCAN Manhattan using along clustered recorded then are is intermediates edges bonded of ode al. et Dodd as defined is flux reactive The intermediates. all 12.5 for temperature and algorithm volume DBSCAN the by returned clusters so, relabeling If a merged. net. to were the lead of could vertices molecular symmetry the the the of whether turn determine we can we graph, information, phisms a bonding into the model Using (MD) sym- dynamics net. the account the into of taking states metry different of angles dihedral the compare less is edges two between energy of potential part the not If (edges edges net. free the than of on pair the hinge) each of a between track energy keeping the by interval variables. compute the the collective degeneracy on of specify folding angle set completely “up”/“down” good dihedral a angles the therefore dihedral break are each The and We between net matrix. transitions a a of of be number in configuration the to recorded and snapshot is state, simulation state discrete of each a require computation as MSMs hierarchical classified (41), or function which (40) partition pathways, parameters assembly the land- compute order to folding monotonic methods the other require of to thermodynamics contrast and In dynamics scape. the into view detailed stefradcmitrpoaiiy(h rbblt httentwl fold will net the that probability (the state from probability committor forward the is rniinn ostate to transitioning ouflig.Fnly h e u sdfie as defined opposed is as flux folding, net is the net Finally, the unfolding). that to probability (the probability committor aha smvn forward. reac- moving the is then of pathway sum is the rate as folding state, defined The unfolded is the flux of reactive out the flux total using tive The algorithm, 43). “bottleneck” (23, the fluxes via net computed were paths dominant .KrbysiSieoiK neH aeciS(02 eloiai effligof Self-folding origami: Cell (2012) S Takeuchi H, polyhedra. Onoe self-folding K, of Kuribayashi-Shigetomi design Algorithmic 9. of (2011) folding M origami Ewing Controlled S, (2012) Pandey SM 8. Yang HC, Jeon CJ, Heo SH, Kim self- local using TS, building sheets Shim polymer for of method Self-folding 7. (2012) A MD Dickey (2014) J, R Genzer JK, Wood Boyles Y, D, Liu Rus 6. E, Demaine M, Tolley S, Felton and 5. encapsulation for containers polymeric Self-folding (2012) D Gracias R, Fernandes Universitat, 4. (Technische (2007) thesis J O’Rourke PhD ED, polyhedra. Demaine of 3. Nets (1997) W Schlickenrieder 2. nets. convex with polytopes Convex (1975) GC Shephard 1. obidteMMw a 2 needn iuain tconstant at simulations independent 125 ran we MSM the build To rbe:We ili esolved? be it will When problem: origami. graphene force. traction cell by driven 7:e51085. microstructures cell-laden dimensional three USA Sci Acad microcarriers. robust for reversibility 124:1449–1452. sustained with bilayers hydrogel absorption. light machines. folding drugs. of delivery Polyhedra Berlin). Soc n experiment. and defects. patterned through ndmnin 4. dimension? in self-assembly. octahedron and cube of success materials. E 78:389–403. bond = j ), rcNt cdSiUSA Sci Acad Natl Proc CmrdeUi rs,NwYork). New Press, Univ (Cambridge hnteegsaecniee ob odd n list a and bonded, be to considered are edges the then −5, π 108:19885–19890. i stepoaiiyo en nstate in being of probability the is sick ˇ iceeMath Discrete ri Life Artif otMatter Soft d rgDlvRev Deliv Drug Adv Science R Bull MRS d ,Pne 21)Bidn oyer ysl-seby Theory self-assembly: by polyhedra Building (2014) S Pandey J, y ´ (i j , ie h ytmi nstate in is system the given j ) k = 345:644–646. 20:409–439. fold a Mater Nat 37:847–854. α∈Aut(G 8:1764–1769. × = 186:69–94. min ,2π [0, 2)taetre o,euvlnl,2.3 equivalently, (or, trajectories 125) emti odn loihs ikgs Origami, Linkages, Algorithms: Folding Geometric urOi tutBiol Struct Opin Curr 112:7449–7453. F /τπ F net × 64:1–11. 14:785–789. = G ] r 10 ) stesmlto srnigw also we running is simulation the As . where , int max P Nature 6 yloiga h rp automor- graph the at looking By . i tp n hnbace h tra- the branched then and steps f LSOne PLoS |a ui where , α(i 524:204–207. π ),k r f stepoaiiyta the that probability the is f − ij 17:342–346. i ij + and , i ahPo abig Philos Cambridge Proc Math = 4:e4451. , a u = P j ,k q ij steufle state. unfolded the is max{f j + hc sawyto way a is which |, stepoaiiyof probability the is q π i − i P τ stebackward the is ij a on by found was , q ij i − − .We S7). Fig. , ne Chem Angew where , f ji The 0}. , LSOne PLoS rcNatl Proc × f ij 10 of , q 10 j + 4 rsa B(96 ntesots pnigsbreo rp n h traveling the and graph a of subtree spanning shortest the On (1956) JB Kruskal multi-arm with 34. nanostructures origami DNA wireframe Complex (2015) particles. al. into et polyhedral polyhedra F, Zhang of of self-assembly 33. Predictive assembly (2012) the SC Glotzer for M, roadmap Engel PF, A Damasceno (2012) 32. L Manna particles. J, polyhedral Graaf of de behaviour 31. Mesophase (2011) FA Escobedo into U, assembly their Agarwal and blocks 30. building of Anisotropy (2007) the MJ Solomon and SC, Glotzer mechanisms kinetics. protein-folding 29. folding in Cooperativity (1993) Protein HS Chan (1998) KM, Fiebig KA, PG Dill folding Wolynes 28. protein JN, determine Onuchic contacts ND, Native (2013) Socci WA 27. Eaton colloid G, patchy Hummer structure protein RB, of initio Best ab order learning Contact 26. (2002) D machine Baker J, Tsai Nonlinear I, Ruczinski R, (2014) Bonneau AL 25. Ferguson AW, Long 24. hubs. kinetic No are states 23. folded Protein (2010) VS molecular Pande by GR, kinetics Bowman folding protein 22. Describing (2004) F of Suits landscape JW, free-energy Pitera The WC, (2010) Swope VN 21. Manoharan MP, Brenner N, Arkus G, Meng 20. 9 odP 21)Dt rm“oyer es”Btukt https://bitbucket.org/the Bitbucket. nets.” “Polyhedra from Data (2018) PM Dodd 19. Ara 18. ie yAvne eerhCmuiga h nvriyo Michigan, of pro- University also services the was and at Foun- and resources Computing Science 140129, Research computational Arbor. Ann Advanced DMR National through by Discovery Award by part vided Engineering XSEDE in supported and ACI-1053575, supported is Science Sci- Grant Com- which Extreme Energy dation Program. the Basic (XSEDE), Bio- Fellowship Science, used funded for Environment Predoctoral of work the Rackham Center Center from Office putational Michigan support Research the Energy, of acknowledges P.F.D. Frontier of of DE-SC0000989. University Award Department part Energy under US ences, as an the and Science, by P.M.D.) (EFRI) Energy Innovation (to Inspired and then Research EFRI-1240264 in are Award Frontiers Emerging freedom Foundation, of Science degrees ACKNOWLEDGMENTS. of path- number of The largest pathway. sequence the the the are along returned. have by pathways intermediate lexicographically that the each state Finally, sorted of ways current graph. freedom and the the of graph between in degrees the link made added a from is then and state had taken was finding processing, candidate also intermediate by further that the This calculated for bonded and (local). queue be was one to a bonds of intermedi- needed to candidate distance still each of topological that For set intermediate a empty information. the a an on pathway queue, creating edges and the the algo- queue on the contain the initialize ate will to We state that bonds freedom. unfolded graph local the of temperature: adding degrees by high of rithm at number pathways maximizing folding and the for important high-T be enumerate to formed Pathways. High-Temperature Enumerating underestimate to known 47). is (46, game cases pebble the these since symme- of closed- hand, of by number degrees for checked high the were with then try determine intermediates finally, to and (48); formula freedom intermediate, closed-loop of each degrees a First, applied to 47). we game (46, motifs freedom pebble loop of the how- pebble degrees (45); the applied of symmetries, game number we group pebble the point has the underestimate or can use generic game can not is one linkage general are the dif- constraints if In be ever, which net. can deduce freedom to the difficult of face in be degrees the can redundant of it of number because diameter the calculate to general graph ficult In the net. is the contacts of diameter nonnative graph The The above. similarly. described calculated criteria correspond- were correct the the to to according bonded edge were ing that edges of number the counting Parameters. Folding aemnproblem. salesman vertices. junction structures. complex Science Mater structures. complex USA Sci Acad funnel. folding multidimensional simulations. atomistic in mechanisms prediction mechanisms. and pathways self-assembly simulations. USA off-equilibrium short Sci from Acad pathways folding of ensemble rium USA Sci Theory 1. simulations. dynamics spheres. hard attractive of clusters real esfrsl-odn kirigami. self-folding for nets ,Sch F, e ´ j A,d ot A ooote N edsJF(08 idn h optimal the Finding (2018) JFF Mendes SN, Dorogovtsev RA, Costa da NAM, ujo ´ pdodd/polyhedra 10:230–235. 337:417–418. 107:10890–10895. teC adnEjdnE ec ,WilT 20)Cntutn h equilib- the Constructing (2009) TR Weikl L, Reich E, Vanden-Eijnden C, utte ¨ rti Sci Protein 90:1942–1946. 106:19011–19016. a Nanotechnol Nat rcA ahSoc Math Am Proc Science Mater Nat h ubro aiecontacts, native of number The 11:1937–1944. nets. hswr a upre npr yteNational the by part in supported was work This 337:453–457. hsRvLett Rev Phys 6:557–562. † ahas w rnilswr sue to assumed were principles Two pathways. . rtisSrc uc Genet Funct Struct Proteins hsCe B Chem Phys J Science 10:779–784. rcNt cdSiUSA Sci Acad Natl Proc 7:48–50. hsCe B Chem Phys J PNAS 327:560–563. 120:188001. nehutv erhwsper- was search exhaustive An 108:6571–6581. | o.115 vol. 118:4228–4244. a acltdby calculated was Q, 110:17874–17879. 32:136–158. | o 29 no. rcNt Acad Natl Proc | rcNatl Proc rcNatl Proc E6695 Nat

APPLIED PHYSICAL SCIENCES 35. HOOMD-blue (2016) Version 1.3.3. Available at codeblue.umich.edu/hoomd-blue. 42. Ester M, Kriegel HP, Sander J, Xu X (1996) A density-based algorithm for discovering Accessed February 1, 2016. clusters in large spatial databases with noise. Proceedings of the Second International 36. Anderson JA, Lorenz CD, Travesset A (2008) General purpose molecular dynam- Conference on Knowledge Discovery and Data Mining (KDD-96), eds Simoudis E, ics simulations fully implemented on graphics processing units. J Comput Phys Han J, Fayyad U (AAAI Press, Menlo Park, CA), pp 226–231. 227:5342–5359. 43. Metzner P, Schutte¨ C, Vanden-Eijnden E (2009) Transition path theory for Markov 37. Nguyen TD, Phillips CL, Anderson JA, Glotzer SC (2011) Rigid body constraints realized jump processes, Multiscale Model Simul 7:1192–1219. in massively-parallel molecular dynamics on graphics processing units. Comput Phys 44. Weinan E, Vanden-Eijnden E (2010) Transition-path theory and path-finding Commun 182:2307–2313. algorithms for the study of rare events. Annu Rev Phys Chem 61:391–420. 38. Glaser J, et al. (2015) Strong scaling of general-purpose molecular dynamics 45. Lee A, Streinu I (2008) Pebble game algorithms and sparse graphs. Discrete Math simulations on GPUs. Comput Phys Commun 192:97–107. 308:1425–1437. 39. Perkett MR, Hagan MF (2014) Using Markov state models to study self-assembly. J 46. Schulze B, ichi Tanigawa S (2014) Linking rigid bodies symmetrically. Eur J Comb Chem Phys 140:214101. 42:145–166. 40. Allen RJ, Valeriani C, ten Wolde PR (2009) Forward flux sampling for rare event 47. Schulze B, Sljoka A, Whiteley W (2014) How does symmetry impact the flexibility of simulations. J Phys Condens Matter 21:463102. proteins? Philos Trans Ser A Math Phys Eng Sci 372:20120041. 41. Jankowski E, Glotzer SC (2009) A comparison of new methods for generating energy 48. Tay TS (1984) Rigidity of multi-graphs. I. Linking rigid bodies in n-space. J Comb Theor minimizing configurations of patchy particles. J Chem Phys 131:104104. Ser B 36:95–112.

E6696 | www.pnas.org/cgi/doi/10.1073/pnas.1722681115 Dodd et al. Downloaded by guest on September 23, 2021