Quick viewing(Text Mode)

Heterogeneous Interactions Promote Crystallization and Spontaneous Resolution of Chiral Molecules John E

Heterogeneous Interactions Promote Crystallization and Spontaneous Resolution of Chiral Molecules John E

Heterogeneous interactions promote and spontaneous resolution of chiral molecules John E. Carpenter and Michael Gr¨unwalda) Department of , University of Utah, Salt Lake City, 84112 Utah (Dated: 28 July 2019) Predicting the crystallization behavior of solutions of chiral molecules is a major challenge in the chemical sciences. In this paper, we use molecular dynamics computer simulations to study the crystallization of a family of coarse-grained models of chiral molecules with a broad range of molecular shapes and interactions. Our simulations reproduce the experimental crystallization behavior of real chiral molecules, including racemic and enantiopure crystals, as well as amorphous solids. Using efficient algorithms for the packing of shapes, we enumerate millions of low energy crystal structures for each model and analyze the thermodynamic landscape of polymorphs. In agreement with recent conjectures, our analysis shows that the ease of crystallization is largely determined by the number of competing polymorphs with low free energy. We find that this number, and hence crystallization outcomes, depend on molecular interactions in a simple way: Strongly heterogeneous interactions across molecules promote crystallization and favor the spontaneous resolution of racemic mixtures.

I. INTRODUCTION have shown, however, that thermodynamics alone can- not explain the bias towards racemic crystals observed 18,19,22 The crystallization of molecules is a necessary but of- in experiments. Clearly, kinetic effects of nucle- tentimes complicated step in many chemical processes. ation and growth play an important role in the crystal- While many elements readily crystallize in a single lization of molecules and frequently result in the forma- structure, even small molecules (and certainly large tion of amorphous solids and of crystals that are only 23,24 biomolecules) can be difficult to crystallize and display metastable. pronounced polymorphism.1,2 Bio-active organic com- Nevertheless, much information can in principle be pounds like drugs, pesticides, and herbicides, need to gleaned from a comprehensive list of possible crystal be available in crystalline forms that are pure, soluble structures and their (free) energies.25,26 Taking a purely in water, and thermodynamically stable. The chemi- thermodynamic view, modern computational methods cal industry therefore invests considerable time and ef- of crystal structure prediction (CSP) have evolved into fort in the discovery of suitable crystalline forms for new useful tools that can provide valuable guidance in the compounds.3 discovery of crystalline forms of organic and inorganic A large fraction of organic molecules are chiral and molecules.27–33 However, while CSP methods can now crystallization can play an important role in the sepa- usually find the polymorphs that form in experiments, ration of chiral molecules. Chiral compounds are most the majority of predicted crystal structures, even ones easily synthesized as racemic mixtures that usually need with low energies, are never realized in experiments.13 to be resolved into for use in drugs.4–6 A Can such kinetically inaccessible polymorphs be iden- straightforward and scalable method of chiral separa- tified by analyzing a molecule’s thermodynamic crystal tion is the crystallization into physical mixtures of enan- landscapes? The ”confusion principle”, developed in the tiopure crystals, so-called conglomerates.7 However, this context of glass forming materials, states that the pres- method of spontaneous is rarely applied ence of a large number of structural motifs with similar in industry because chiral compounds preferentially form low energies leads to kinetic competition and a low crys- racemic crystals8 and extensive searches for conditions tallization propensity.34,35 This principle has been shown that favor formation of enantiopure crystals are seldom to apply in at least a few inorganic systems,36,37 and has performed. been conjectured to apply much more broadly also for Will a molecule crystallize under given experimental organic molecules.26 Related ideas have been developed conditions? Will a spontaneously sepa- to describe kinetic trapping in protein folding,38 in the rate or form a racemic co-crystal? Why do salts of chi- self-assembly of atomistic39 and colloidal clusters,40 and ral molecules form conglomerates more readily than neu- in the packing of polyhedral shapes.41 Nevertheless, no tral versions of the same molecules?9–11 Despite much firm guiding principles are currently available that would research activity, no definitive answers to these ques- allow the prediction of crystallization behavior based on tions have been found.12–21 In principle, thermodynam- crystal energy landscapes and molecular properties. ics dictates that the crystal with the lowest free en- In this paper, we present a computational study that ergy prevails in equilibrium. Studies of chiral molecules aims to reveal thermodynamic and kinetic driving forces of chiral crystallization, and to establish connections between thermodynamic landscapes and crystallization a)Electronic mail: [email protected] outcomes. In our efforts, we build on previous computa- 2

FIG. 1. (a) The 11 chiral shapes studied in this work, each consisting of 5 coarse-grained functional groups. (b) Interaction potential for two functional groups as a function of distance r. (c) Example mapping of acetaminophen to coarse-grained shape s1. Functional groups that interact strongly, in this case via a hydrogen bond, are shown in light blue color. (d) Time series of snapshots from a MD simulation of the crystallization of molecule s2/1:3/5, illustrating the rapid formation of several enantiopure aggregates followed by slow coalescence and growth of high quality crystals. Labels indicate simulation time in units of 108 time steps. Enantiomers are shown in blue and red colors. (e) Temperature (red curve) and number of molecules in the largest cluster (blue curve) as a function of time, for the trajectory illustrated in (d). Background colors indicate the different stages of the applied temperature protocol, as described in the main text. The largest CQ score is obtained after 1.08 × 108 time steps (see the red aggregate visible in the corresponding snapshot in panel (d)). Intermittent fluctuations of the size of the largest cluster indicate transient binding of aggregates. tional work that has demonstrated the usefulness of sim- between triplets of connected functional groups. Further- ple molecular models in elucidating structure formation more, we focus only on chiral shapes, resulting in a total mechanisms of chiral molecules.16,20,21,42–45 Our study of 11 distinct molecular shapes,16 which we order accord- is based on a family of coarse-grained models of chiral ing to increasing radius of gyration. Functional groups molecules. The computational efficiency of these models interact via short-ranged pair potentials that represent allows us to simulate the crystallization dynamics of a effective interactions in solution, as illustrated in Figure wide range of molecular shapes and interactions, and to 1b (see Methods); effects are included in these in- fully determine the thermodynamic landscape of crystal teractions and solvent species are not modeled explicitly. structures. Despite the relative simplicity of our model, For simplicity, we fix the range of these pair potentials its analysis reveals general relations between molecular and use the attractive interaction strength att as the interactions, crystal energy landscapes, and crystalliza- only parameter. We therefore coarsely categorize func- tion trends that we argue are relevant for the crystalliza- tional groups according to their strength of attractive tion of real molecules. interactions with other groups, as exemplified in Figure 1c. Specifically, we will focus on molecules that have a single pair of functional groups that interact strongly II. RESULTS AND DISCUSSION with att = s (representing, for instance, a hydrogen bond), while other groups interact with uniform weaker A. A simple model for chiral molecules attractions, att = , which we also use as our unit of energy. A molecule can therefore be completely specified by its shape (i.e., 1-11), the pair of functional groups We represent molecules as rigid bodies consisting of that interact strongly (e.g., 1:4, 3:3, etc.), and the mag- five circular particles in two-dimensions, as illustrated nitude of strong interactions  . For example, we will in Figure 1a. Constituent particles represent functional s refer to the coarse-grained molecule shown in Figure 1c groups of the molecule and have a fixed diameter σ. To with  = 5 as ”s1/1:4/5”. For a fixed magnitude of limit the number of different molecular shapes, we only s strong interactions  , this family of models is comprised consider molecules with angles of 60, 120, or 180 degrees s 3 of a total of 159 distinct molecules that cover a wide lization outcomes, we first discuss results for the fam- 46 space of shapes and interactions. For a given molecular ily of racemic mixtures with s = 5, which approxi- shape, molecules with different interactions can be either mates the relative strength of a strong hydrogen bond interpreted as chemically different molecules with similar with respect to weaker van der Waals interactions.49 For shapes in the same solvent, or, alternatively, as identical these interactions, we identify the most crystalline ag- molecules in different solvent environments. gregates at temperatures between 0.80 and 1.20 /kB, which implies effective interactions of  ≈ 0.6 kcal/mol and s ≈ 3.0 kcal/mol at room temperature. Our model B. Molecular dynamics of crystal formation molecules display the same types of crystallization out- comes found in experiments.8 Specifically, we observe the formation of racemic crystals, enantiopure crystals, and We determine the crystallization behavior of a given aggregates with various degrees of disorder. Remarkably, molecule with molecular dynamics computer simulations we do not observe any crystals with ratios of enantiomers in the NVT ensemble at a molecular concentration of in the unit cell different from racemic and enantiopure. 0.04 σ−2 (see Methods). Simulations are initiated from This result appears to be consistent with experiments, dispersed racemic mixtures of 5184 molecules and run as we were not able to find reports of such crystals in for up to 4 × 108 time steps. To facilitate the forma- the literature. Examples of different crystallization out- tion of well-ordered solids, the temperature is adjusted comes are displayed in Figure 2; results for all types of during the simulation by an algorithm that operates in molecules are schematically summarized in Fig. 3c. Con- three stages (also see SI). In the first stage, starting from sistent with experiments,15 only a minority of molecules a temperature that is too high for structure formation, (27%) crystallize well, which we define by a CQ score of temperature is rapidly decreased, until an aggregate con- larger than 0.40. Among these good crystallizers, almost taining at least 50 molecules is observed. In the second all form a single racemic or enantiopure polymorph, as stage, temperature is automatically adjusted to achieve illustrated in Fig. 2a. In one case (s5/2:5/5), we ob- a growth rate of aggregates of approximately 5 molecules serve the concurrent formation of two polymorphs with per 106 time steps, until the largest aggregate has reached high CQ scores, one racemic, the other enantiopure. In a size of at least 200 molecules. In the third and final experiments, a prevalence of racemic crystals (approx- stage, temperature is slowly increased at a rate of 0.2/k B imately 9:1) is commonly cited in the literature.8,17,19 per 108 time steps, scanning a range of supersaturation In our 2D simulations, we observe similar numbers of values. Throughout the third stage, polymorph-specific racemic and enantiopure crystals, consistent with exper- order parameters (see Methods) are used to identify the imental accounts of quasi-2D racemic mixtures adsorbed molecular clusters with the highest crystalline quality on surfaces.50 The majority of molecules form aggregates (CQ) for both enantiopure and racemic structures. CQ with limited order or no order at all, including racemic is measured on a scale from zero (no discernible order) amorphous solids (Fig. 2b) and aggregates of enantiop- to one (a perfect bulk crystal); large crystalline aggre- ure or racemic micelles (Fig. 2c). We also observe gates with few defects typically obtain a CQ score of a large number of racemic disordered aggregates (Fig. 0.6 or larger but good crystalline order is evident in all 2d), which consist of enantiopure linear arrangements aggregates with CQ 0.4. Snapshots from a typical & that can stack on one another without apparent enan- simulation trajectory that yielded crystalline aggregates tiomeric preference, leading to large randomly stacked are shown in Figure 1d. Using the temperature protocol and partially ordered aggregates with varying compo- described above, crystallization is dominated by growth sition of enantiomers. This type of solid, called ”1D- rather than nucleation: In the first and second stages, a racemic”, is frequently observed in experiments of quasi- substantial number of aggregates rapidly form and grow, 2D systems on surfaces,51 but also occurs in 3D.52–55 typically displaying varying numbers of defects, as illus- In addition, we find that a few systems form disordered trated in Fig. 1e. In the third stage, as temperature is in- structures that are closely related to solid solutions and creased, these defects anneal and well-ordered aggregates that we will refer to as ”0D-racemic” (Fig. 2e). These can further grow, while disordered clusters fall apart. We structures are composed of enantiopure oligomers that find that this simulation protocol yields superior crystals are arranged with spatial but no compositional order. compared to straightforward MD simulations at constant Since both 1D-racemic and 0D-racemic solids do not have temperature and that independent simulation runs yield well-defined unit cells, assignment of CQ scores (which consistent values of CQ (see Table S1). We have fur- involves matching of aggregates to a known catalogue thermore confirmed that CQ values are insensitive to a of crystal structures, see Methods) to aggregates of this four-fold reduction of molecular concentration (see Table type is ambiguous. We therefore remove them from our S2). Similar annealing procedures have been found to op- analysis and indicate molecules that show this behavior timize self-assembly outcomes of different systems,47,48 by black colored cells in Fig 2a. We built a large data set of crystallization trajecto- ries for several different interaction strengths of func- In the following, we first discuss several trends that tional groups as well as for racemic and enantiopure are apparent from the results of our MD simulations. solutions. To give an overview of the various crystal- We then show that these trends can be rationalized and 4

FIG. 2. Crystallization outcomes of molecules with s = 5. (a) Snapshots of all crystals with CQ > 0.4. (b)-(e) Snapshots of examples of disordered aggregates: (b) amorphous racemic, (c) micelles, (d) 1D-racemic aggregates, (e) 0D-racemic (solid solution). predicted based on the underlying thermodynamic land- ers characterized the crystallization propensity of 319 or- scape of crystal polymorphs. ganic molecules in different .15 While an analysis Molecular shape is not a strong predictor of crystalliza- of this data set clearly showed that smaller molecules tion behavior. In a recent study, Pillong and cowork- tend to crystallize better, no strong connections be- 5 tween crystallization and molecular fingerprints sensitive see Fig. S2), a lower yield than for s = 5 (70%), but to molecular shape could be identified.56 The results of higher than the racemic case, where no crystals formed at our MD simulations of molecules with s = 5 (Fig. 3c) all (Fig. 3f and Fig. S1). Comparing the specific crystal are consistent with these experimental results. We do not structures obtained from racemic and enantiopure solu- find a strong correlation between molecular shape and tions, we find that molecules that form conglomerates quality or composition of crystals. In fact, most molecu- from racemic solutions crystallize in the same structures lar shapes can form both racemic and enantiopure crys- from enantiopure solutions, with comparable or better tals for different pairs of strongly interacting functional CQ values. In addition, several molecules that did not groups. While some shapes clearly produce more good produce good crystals from racemic mixtures crystallize crystals than others (e.g., s2 vs. s1), attempts to find well from enantiopure initial conditions. Furthermore, meaningful correlations between CQ scores and simple 0D-racemic and 1D-racemic disordered aggregates, which molecular descriptors of shape, degree of , and we observe frequently in the racemic case (black cells in position of strongly interacting functional groups on the Fig. 3a-d), cannot form in enantiopure systems. In fact, molecule were unsuccessful. However, we do observe a we find that these models tend to form high-quality crys- clear tendency of molecules with particularly elongated tals from enantiopure solutions. Crystalline quality does shapes (i.e., shapes 9, 10, and 11) to form 1D-racemic not, however, improve in all cases; molecules that formed aggregates. A similarly reduced crystallization propen- good racemic crystals from racemic solutions do not nec- sity might be expected in experiments for molecules with essarily form high quality enantiopure structures (e.g., elongated or planar shapes. s5/1:3, s9/3:5, and s11/3:5). Crystallization and chiral separation are enhanced when interactions are heterogeneous. We find that crys- tallization behavior depends markedly on the relative C. Crystal energy landscapes strengths of weak and strong interactions. In addition to the case of s = 5 discussed above, we have per- Are the crystals that form in our simulations always formed systematic MD simulations for smaller (s = the lowest (free) energy polymorph? Are there clear 3, 2, and 1) and larger (s = 7) interaction strengths, thermodynamic signatures of molecules that form dis- as illustrated in Fig. 3a-d. With decreasing s, the num- ordered solids? To reveal the thermodynamic landscapes ber of good crystallizers decreases sharply (29, 14, and 3, underlying the crystallization of our model molecules, for s = 5, 3, and 2, respectively). For completely uni- we have enumerated large numbers of polymorphs and form interactions, s = , none of the 11 molecular shapes calculated their lattice energies. Predicting accurate showed any crystalline tendencies (see Fig. S1). Increas- low-energy crystal structures for real molecules re- ing strong interactions to s = 7 affects racemic and quires computationally expensive methods and signifi- enantiopure crystals in different ways: while the number cant expertise.27–33,57,58 In the case of our coarse-grained of racemic crystals decreases noticeably, the number of molecules, low-energy polymorphs can be found much enantiopure crystals remains high, resulting in a increase more straightforwardly. To this end, we adapted a of the fraction of enantiopure crystals to 73% (compared computationally efficient algorithm for solving the ex- to 50% and 59% for s = 3 and 5, respectively). These act cover problem59,60 to generate dense packings of our results are summarized in the table shown in Fig. 3f. model molecules (see Methods). In the following, we While we do not have systematic data for even larger will refer to this polymorph enumeration algorithm as values of s, preliminary simulations indicate that enan- POLYNUM. The algorithm allows us to generate be- tiopure crystals form more readily also for molecules with tween 105 and 107 polymorphs for each molecule, with particularly strong interactions and we observe a clear unit cells containing up to 14 molecules, with all pos- trend towards more open crystal structures (see Fig. S3). sible molecular compositions ranging from enantiopure Enantiopure solutions crystallize markedly better than to racemic, and with different packing fractions. We racemic ones. In addition to the sensitivity to interac- find that the number of polymorphs generally increases tion strength discussed above, we find that crystalliza- rapidly with increasing unit cell size and with decreasing tion behavior also depends strongly on the composition packing fraction, preventing us from identifying poten- of the solution. As illustrated in Fig. 3e, crystallization tial low energy polymorphs with very large unit cells or propensity and also the quality and size of obtained crys- low packing fractions. However, an analysis of the gen- tals increases substantially when enantiopure solutions erated crystal polymorphs suggests that we identify the are simulated (at the same molecular concentration and vast majority of crystal polymorphs that have energies using the same crystallization protocol). Specifically, for low enough to be dynamically relevant (see SI). In partic- s = 5, 70% of all systems form high quality crystals, ular, all crystal structures that formed spontaneously in compared to 27% for the racemic solutions. We find our simulations were also found by POLYNUM (Fig. S8). the same qualitative dependence on interaction strength Note that throughout this paper we use zero temperature for the case of enantiopure solutions, but with overall crystal energies as estimates of free energies at the tem- higher crystallization propensity: For uniform interac- perature of crystal formation. To justify this approxima- tions, s = , 3 out of 11 models crystallize well (27%, tion, we have calculated free energy differences between 6

FIG. 3. Crystallization outcomes depend on interaction strength and composition. (a)-(d) Grid summary of CQ scores for molecules with different interaction strengths: (a) s = 2, (b) s = 3, (c) s = 5, (d) s = 7. Each molecule is represented by a cell and identified by its shape (vertical axis) and pair of strongly interacting functional groups (horizontal axis). Colors of cells reflect the highest CQ score achieved during simulation; negative values indicate enantiopure aggregates. Racemic structures are shown in red color, enantiopure structures in blue; larger magnitude indicates better crystalline quality. Black cells indicate molecules that formed 1D or 0D-racemic disordered aggregates, as discussed in the text. Several molecules with shape s9 are equivalent because of symmetry (i.e., 1:2 and 4:5, 2:2 and 4:4, etc.); redundant molecules are blacked out. (e) Grid summary of CQ scores for enantiopure solutions with s = 5. Darker colors indicate higher CQ scores. (f) Table summarizing crystallization outcomes. N is the number of simulated systems (those that form 0D-racemic and 1D-racemic structures are not counted), Npure and Nrac are the numbers of systems that formed enantiopure and racemic crystals with CQ ≥ 0.4, respectively, and Ncryst is the total number of systems with CQ ≥ 0.4. 7 a small number of polymorphs (see SI). Because of the short-ranged interactions between functional groups, en- tropy differences between polymorphs are small, on the order of 0.1  per molecule. We show below that differ- ences of this magnitude do not alter the crystallization behavior of molecules in our simulations substantially. There are many more racemic polymorphs than enan- tiopure ones, and most polymorphs have other molecular compositions. We find that the number of low-energy polymorphs varies greatly between molecular shapes, but general trends are immediately apparent. The numbers of polymorphs with different composition loosely follows binomial statistics (see Fig. S12), with enantiopure poly- morphs forming the smallest group and racemic poly- morphs the largest. Over the entire set of polymorphs calculated for all shapes, we find 8 times more racemic polymorphs than enantiopure ones. However, the group of polymorphs with other compositions of enantiomers is 5 times larger still. How are the energies of these poly- morphs distributed? In Fig. 4a-b, we show histograms of the energies of racemic and enantiopure polymorphs for two molecules with the same shape (s2) but different interactions (5:5 and 4:5). Although the different num- bers of racemic and enantiopure polymorphs are appar- ent, the energy distributions of both groups have similar shape and show an initial exponential increase with in- creasing energy. Both distributions show a maximum, followed by rapid decline. The position of these max- ima reflect the limitations of POLYNUM in enumerat- ing very large unit cells and crystals with low packing fraction. The true distributions should increase expo- nentially up to energies close to zero because many more periodic structures exist for larger number of molecules and larger available volume per molecule. The set of enumerated polymorphs is therefore clearly incomplete beyond the energies at which the polymorph distribu- tions peak. However, consistent with CSP studies,61 our MD simulations show that the dynamically relevant poly- morphs usually have energies within 5% of the lowest pos- sible energy and distributions of polymorphs found with POLYNUM peak at much higher energies. We therefore conclude that the set of polymorphs we analyze includes essentially all dynamically relevant crystal structures. Large crystal energy differences predict crystalliza- tion, but many molecules crystallize well without strong thermodynamic bias. Remarkably, the distributions of racemic and enantiopure polymorphs shown in Fig. 4a-b FIG. 4. Thermodynamics of chiral crystallization. His- are shifted with respect to each other in a way that is in- tograms of energies per molecule of racemic (red) and enan- dicative of crystallization behavior. To quantify the ther- tiopure (blue) polymorphs for (a) s2/5:5/5 and (b) s2/4:5/5. modynamic bias towards racemic or enantiopure crystals, Energy gaps ∆E between lowest-energy enantiopure and racemic crystals are indicated with red and blue arrow, re- it is useful to define an energy gap ∆E = E(P) − E(R) as 0 0 spectively. (c) Grid summary of CQ scores for s = 5 (as in (P) (R) the difference between the lowest energies E0 and E0 panel (c) of Fig. 3). Numbers indicate energy gaps ∆E. Sys- of any racemic (R) and enantiopure (P) polymorph, re- tems in panels (a) and (b) are highlighted with yellow circles. spectively. Positive and negative values of ∆E indicate (d) Scatter plot of CQ scores and energy gaps ∆E. a thermodynamic advantage for racemic and enantiop- ure crystals, respectively. The energy gaps correlate well with crystallization outcomes for the two specific exam- s2/5:5 and this molecule indeed forms a racemic crystal ples shown in Fig. 4a. We find a positive energy gap for of high quality in our MD simulations. By contrast, ∆E 8 is negative for s2/4:5 and chiral separation is observed. In Fig. 4c, we show values of ∆E, expressed in percent of the lowest crystal energy E0, for all simulated racemic solutions with s = 5 together with the CQ scores ob- tained in MD simulations; the same data is shown as a scatter plot in Fig. 4d. As expected from thermodynamic principles and consistent with CSP studies, we observe a clear correlation between ∆E and crystallization out- comes. We find the largest energy gaps (up to 20%) for racemic crystals, which we attribute to the much larger number of racemic crystal structures. Systems with such extreme energy gaps crystallize accordingly. However, in the group of polymorphs with zero to moderately large energy gaps (< 10%), which includes the majority of sys- tems, we find a much weaker correlation between ∆E and crystallization outcomes. Many systems with com- parably large energy gaps do not crystallize (e.g., s7/1:1, s8/1:3) and some systems without any thermodynamic bias (∆E = 0) form high-quality racemic or enantiop- ure crystals. For two system (s5/2:5 and s8/3:3) we even observe chiral separation at positive ∆E, that is against thermodynamic bias. Kinetic effects clearly are impor- tant in these cases. The likelihood of crystallization is (partly) encoded in the energetics of polymorphs with unusual stoichiome- try. While the largest group of polymorphs identified by POLYNUM are crystals with compositions other than racemic or enantiopure (referred to as ”OP” in the fol- lowing, short for ”odd polymorphs”), we did not ob- serve the formation of such crystals in any of our sim- ulations. One might therefore expect that OPs have FIG. 5. Polymorphs with unusual stoichiometries in- fluence crystallization. (a) Scatter plot of CQ values much higher energies and do not compete thermodynam- (OP) ically. To the contrary, we find that in about 60% of vs. ∆E for all simulated systems with s = 5. CQ grids shown above the plot separately display systems with all systems with s = 5, the lowest energy crystal is (OP) (OP) ∆E /E0 < 1.0 (left) and ∆E /E0 > 1.0 (right), re- an OP. We quantify the thermodynamic role of these spectively. (b) Scatter plot showing CQ scores of all simu- polymorphs by calculating the energy difference between lated racemic and enantiopure solutions vs. the number of the lowest-energy OP and the polymorph with the low- competing polymorphs N0.95. est energy among racemic and enantiopure polymorphs, (OP) (OP) (P) (R) ∆E = E0 − min(E0 ,E0 ). Remarkably, we (OP) find that ∆E is a reasonable predictor for crystalliza- of polymorphs N0.95 found by POLYNUM that have en- tion propensity in our simulations. As shown in Fig. 5, ergies within 95% of that of the lowest-energy crystal. In among the systems that have OPs at the lowest energies Fig. 5, we show the correlation between N0.95 and CQ (OP) (∆E /E0 > 1), very few form good crystals. Almost scores obtained in our simulations of racemic and enan- all good crystallizers are found among the systems that tiopure solutions. The plot shows that the best crystal- (OP) do not have competing OPs (∆E /E0 < 1). This ob- lizers tend to have the smallest numbers of competing servation suggests a potentially useful strategy to asses polymorphs and that enantiopure solutions have signif- the likelihood of a given molecule to crystallize, by using icantly smaller number of competing polymorphs than methods of CSP to predict crystals with unusual stoi- racemic ones (namely 28 competing polymorphs on av- chiometries. Our simulations predict that if low energy erage, compared to 96 for racemic solutions). We now crystals can be found with CSP, crystallization is likely turn to a discussion of the dependence of the number of complicated. competing polymorphs on interaction strength and poly- Why do OPs not crystallize and at the same time af- morph composition. fect the crystallization probability of racemic and enan- tiopure polymorphs? Our simulations suggest that these effects are a consequence of the sheer number of these D. An analysis of polymorph space polymorphs and their composition. We find that crys- tallization propensity is correlated with the number of In our model, only functional groups that are in im- competing polymorphs, which we define as the number mediate contact with one another have meaningful in- 9

FIG. 6. Thermodynamic competition between polymorphs. (a) Schematic illustration of polymorph space for a fictitious model with two types of contacts between functional groups (1:1 and 2:2). The enantiopure polymorph indicated by ~cP has the lowest energy because its contact vector has the largest scalar product with ~, as indicated by dashed lines. (b) Number of polymorphs of shape s2 enumerated by POLYNUM, as a function of ϕ~c, for different packing fractions γ: close-packed crystals (γ = 1, red curve), crystals with γ > 0.95 (green curve), and all crystals (blue curve). A uniform distribution of polymorphs on the simplex for γ = 1 is shown in black. (c) Number of competing polymorphs N0.95 as a function of ϕ~, for shape s2. Separate curves are shown in light colors for different pairs of strongly interacting groups. The average over all interactions is shown as a bold red curve. Vertical lines indicate values of ϕ~ corresponding to different interaction strengths s used in our MD simulations. (d) Relative number difference of competing enantiopure and racemic polymorphs as a function of ϕ~. Light-colored curves indicate data for all 11 different shapes, averaged over all interaction types. The bold red curve is an average over all shapes. vertical lines are as in panel (c). (e) Violin plot of distributions of energy gaps ∆E for different interaction strengths s used in our MD simulations. teractions. These contact interactions suggest a simple our models with 5 functional groups, n∗ varies between geometric model of the space of polymorphs.59 The en- 9 and 11, depending on the shape.) This condition im- P ∗ ergy per molecule of a given polymorph p can be written plies i cp,i = n . Geometrically, the contact vectors of as close-packed polymorphs define a simplex in the space of X the different types of contacts between functional groups. Ep = − icp,i, (For our models, this space is 15-dimensional.) Poly- i∈{1:1,1:2,... } morphs with smaller packing fraction form different sim- plices defined by P c = n < n∗. Fig. 6a illustrates where the sum goes over all types of interactions i (i.e., all i p,i the geometry of this situation for a fictitious system of pairs of functional groups), c is the number of contacts p,i molecules with only two types of possible contacts, which (per molecule) of type i in polymorph p, and  is the i we call 1:1 and 2:2 for consistency. Contact vectors of interaction strength associated with contacts of type i. racemic and enantiopure polymorphs are represented as (In our simulations,  =  for weakly interacting groups, i blue and red dots, respectively, on two different simplices and  =  for strongly interacting groups.) We will i s (i.e., lines in two dimensions) that represent groups of interpret the numbers c and  as the components of the p,i i polymorphs with different packing fractions γ = n/n∗. ”contact vector” ~c of polymorph p, and an ”interaction p Interactions between functional groups are represented vector” ~, respectively. The energy per molecules of p by ~, whose direction indicates strong interactions of type can then be written as the scalar product, E = −~ · ~c . p p 1:1 in this case. The energies of two particular poly- The larger the scalar product of a polymorph’s contact morphs, represented by ~c and ~c , are indicated in the vector with ~, the lower its energy. P R figure by the projections of ~c and ~c onto ~. As ev- What contact vectors are possible? In a close-packed P R ident from the figure, for a given packing fraction, the system of molecules on a lattice, all polymorphs have the polymorph with the largest number of 1:1 contacts must same total number of contacts per molecule, n∗. (For 10 have the lowest energy, as the interactions favor this type gle ϕ~. N0.95 decreases by many orders of magnitude of contact. as s is increased from  to 5, which is the interaction The number of competing polymorphs depends on the strength that results in the best CQ scores in our simu- choice of interactions ~. It is useful to introduce the lations. Interestingly, for many systems, increasing s to even larger values does not lower the number of compet- ”neutral” interaction vector ~0 = (, , . . . , ) which as- signs the same interaction strength to all pairs of func- ing polymorphs significantly. In some case, we even ob- tional groups, as illustrated in Fig. 6a. A simple mea- serve an increase in N0.95 for larger ϕ~, indicating that for sure for the heterogeneity of a given interaction vector these systems there exists an optimal interaction strength that results in the most crystals. ~ is the angle ϕ~ formed between ~ and the neutral in- teraction vector ~0. The larger ϕ~, the larger the dis- We find that the distributions of enantiopure and crepancy of interaction strengths of different functional racemic crystals are remarkably different. To show this, groups. The specific interactions we used in our MD sim- we calculate the difference in the numbers of competing ulations, i.e., s = , 2, 3, 5, and 7, correspond to an- (pur) ◦ ◦ ◦ ◦ ◦ enantiopure and racemic polymorphs, ∆N0.95 = N0.95 − gles ϕ~ = 0 , 13 , 24 , 38 , and 47 The largest possible (rac) (pur) value of ϕ (ϕ ≈ 75◦ in our 15-dimensional space) is N0.95 , as a function of ϕ~. The ratio ∆N0.95/(N0.95 + ~ max (rac) reached when only a single pair of functional groups has N0.95 ), as plotted in Fig. 6d clearly shows that for uni- attractive interactions. By the same token, we introduce form interactions (small ϕ~) most of competing poly- a contact vector ~c0 = (1, 1,..., 1), which describes a ficti- morphs are racemic. For more heterogeneous interac- tious polymorph that features equal numbers of contacts tions, the number of competing polymorphs decreases rapidly (Fig. 6d) but the ratio of enantiopure polymorphs of all types. The angle ϕ~cp between the contact vector of a given polymorph ~cp and ~c0 then describes hetero- increases. We find that around s = 5, the number geneities in the numbers of different types of contacts in of competing enantiopure polymorphs becomes equal or p; polymorphs with large numbers of contacts between even greater than the number of racemic ones, depending on shape and interactions. These changing numbers of particular functional groups have large ϕ~cp . competing polymorphs are directly reflected in the ther- What choice of interactions between functional groups modynamic preferences for enantiopure and racemic crys- ~ minimizes the number of competing polymorphs? It tals, as quantified by the energy gap ∆E. As shown in is clear from Fig. 6a that uniform interactions (those Fig. 6e, with increasing  we observe broader distribu- with small ϕ ) result in large numbers of competing poly- s ~ tions of ∆E, consistent with an overall decrease of the morphs. For completely uniform interactions ~ , all poly- 0 number of competing polymorphs. In addition, the rel- morphs with the same packing fraction have the same ative number of large negative ∆E values increases, as energy. Indeed, our MD simulations of this case ( = ) s enantiopure crystals form an increasing fraction of all show no crystallization at all. For more heterogeneous competing polymorphs. These observations rationalize interactions, however, energy differences between poly- our MD results (Fig. 3a-d), which show that the fraction morphs increase and the number of competing poly- of systems that form enantiopure crystals increases with morphs decreases. This effect is amplified by the dis- increasing interaction strength  . tribution of polymorphs on the simplex. We find that s polymorphs are not distributed uniformly on the simplex, Why are there comparably large numbers of compet- as illustrated in Fig. 6b: Most polymorphs are found in ing enantiopure polymorphs at large values of ϕ~, even the vicinity of the center of the simplex, at small val- though their total number is much smaller compared to ues of ϕ~cp , implying that in most polymorphs different racemic polymorphs? We hypothesize that the comple- types of contacts appear with similar frequency. (Note mentarity of chiral shapes, i.e., the ability of objects of that POLYNUM cannot enumerate polymorphs with the the same handedness to form locally dense arrangements smallest values of ϕ~cp , as these polymorphs have very akin to a handshake, favors the formation of multiple con- large unit cells.) Polymorphs with larger numbers of con- tacts between functional groups of the same type. This tacts between a few select functional groups (large val- argument is supported by an analysis of the oligomers ues of ϕ~cp ) are much rarer and, as evident from Fig. 6b, (dimers, trimers, etc.) that form in the early stages of our also tend to have lower packing fractions γ, as sug- MD simulations. Our choice of a single strong interaction gested by Fig. 6a. Our observation of increased crystal- s selects for oligomers that have several of those inter- lization propensity with increasing interactions strength actions. We indeed find that the majority of oligomers s can therefore be rationalized by the specific distri- observed in our simulations are enantiopure. However, bution of polymorphs. Strong interactions s between the frequency of formation of enantiopure polymorphs a small number of functional groups select for the few in our simulations cannot be rationalized by the number polymorphs that have large numbers of contacts between of competing polymorphs alone. As shown in Fig. 6d, these groups, endowing them with low energies and re- at modest interaction strength s = 3, the majority of ducing the number of competing polymorphs. This point competing polymorphs are racemic. Nevertheless, 50% is illustrated in Fig. 6c, which shows the number of of crystals that form in our simulations are enantiopure, competing polymorphs N0.95 that result from different suggesting that enantiopure crystals have an inherent ki- choices of interaction vectors ~, characterized by the an- netic advantage, which this study does not address. 11

III. CONCLUSIONS polymorphs share several structural motifs with the glass that forms at ambient conditions. At higher pressures, In this study, we have shown with simple molecular however, these polymorphs have much higher enthalpies models of chiral molecules that there is a clear correla- and do not compete any longer. tion between the number of competing low energy poly- While an analysis of the thermodynamic landscapes of morphs and a molecule’s likelihood to crystallize in sim- crystals can provide useful guidance in a statistical sense, ulations. Enantiopure solutions crystallize markedly bet- it cannot be used in a straightforward way to predict ter than racemic solutions because of a much smaller more specifically which polymorph will form. We have number of possible crystal structures. For racemic mix- shown that the energy gap between racemic and enan- tures, we find that most competing polymorphs are nei- tiopure crystals correlates well with crystallization out- ther racemic nor enantiopure, but have other ratios of comes, but energy gaps within groups of polymorphs of left- and right-handed molecules in the unit cell. These the same composition are much less predictive. Our anal- ”odd” polymorphs have so far been overlooked because ysis confirms that while thermodynamics exert a clear they never crystallize in experiments (and also not in our bias on crystallization behavior, there are many cases simulations). However, we find that only those molecules where crystals form or do not form for kinetic reasons. tend to crystallize well that do not have large numbers of While general trends can be inferred from thermody- competing odd polymorphs. Our results suggest that a namic landscapes, a thorough analysis of the mechanisms and rates of nucleation and growth of different poly- reduction in the number of competing polymorphs gener- 62,63 ally leads to an increase in crystallization likelihood and, morphs is needed to make specific predictions. Work for racemic mixtures, likelihood to spontaneously sepa- in this direction is underway. rate via formation of conglomerates. There are several parameters that determine the num- IV. METHODS ber of competing polymorphs and therefore crystalliza- tion yields. We have focused on the interaction strength between functional groups and demonstrated that the 1. Pair Potential. presence of a few strong interactions leads to markedly better crystallization outcomes and a larger fraction of We use a short-ranged pair potential consisting of a conglomerates because it selects for a small number of repulsive and an attractive part, polymorphs, many of which are enantiopure. The higher propensity of chiral organic salts to form conglomerates u(r) = urep(r) + uatt(r) compared to neutral versions of the same molecules9–11 64 can thus be rationalized by the presence of strong elec- The repulsive part is a WCA potential, trostatic interactions. Our results can also provide some ( h σ 12 σ 6i guidance for the chemical synthesis of molecules. When- rep r − 2 r if r < σ, urep(r) = ever chemical modification of a compound is an option, 0 else. introducing strongly interacting groups should enhance the propensity of molecules to crystallize and separate. The attractive part is given by Different numbers of competing polymorphs can help  rationalize several other recent observations. A recent −att, if r < σ, 50   h i  study by Gellman and coworkers showed that compared att (r−σ)π uatt(r) = − 2 cos ω + 1 if σ ≤ r < σ + ω, to bulk solutions the formation of enantiopure crystals is  enhanced when molecules are adsorbed on surfaces. We  0 if r ≥ σ + ω. argue that the number of polymorphs accessible to such quasi two-dimensional systems is likely much smaller The potential and its first derivative are continuous at than in bulk solution. Another example is the crystal- r = σ. In all MD simulations, we use rep = 5.0, σ = 1.0, lization of chiral molecules whose orientations are con- and ω = 0.2. For pairs of weakly interacting functional strained, for instance by an external field. Such molecules groups we use att = , which defines our unit of energy. crystallize and separate better, as recently demonstrated For pairs of strongly interacting groups we use att ≡ by Woszczyk and coworkers.16 We show in Table S3 that s = , 2, 3, 5, or 7, as specified in the main text. A the number of low-energy polymorphs decreases dramat- plot of the pair potential is shown in Fig. 1b. ically when molecular orientation are constrained. A third example is the use of pressure, as recently demon- 2. Molecular dynamics simulations. strated in the crystallization of B2O3, a material that forms a glass from the melt at ambient conditions but readily crystallizes into a single polymorph at elevated We use the software package HOOMD in all our MD pressures. A recent CSP study showed that this behav- simulation.65,66 We perform rigid-body molecular dy- ior can be rationalized by the presence of a large num- namics dynamics using a Langevin thermostat as imple- ber of low-enthalpy polymorphs at low pressure.36 These mented in HOOMD, with a time step of 0.004 and a 12 damping coefficient of 5.0. (We use the mass m and di- 4. Polymorph identification and CQ scores. ameter σ of functional groups as units of mass and length; the unit of energy is .) Simulation snapshots have been To determine the CQ score of a molecular cluster C, 67 generated with Ovito. we first find the polymorph P ∗ that best matches the structure of the cluster, and then compare the local en- vironment of each molecule in C to the environment of 3. Polymorph enumeration (POLYNUM). molecules in the unit cell of P ∗. To find P ∗ for a given cluster C, we consider as candi- We enumerate polymorphs of our models with a dates the 100 polymorphs that have the lowest energies, method inspired by algorithms used to solve a variant of as determined by POLYNUM. For each polymorph P in the exact cover problem, which can be stated as follows: this set, we determine a score that quantifies its similarity Given a box and a number of shapes, enumerate all dis- with the cluster, SC,P = 0.2 OC,P +0.2 RC,P +0.6 EP/E0. tinct ways to completely fill the box with the shapes. For The similarity score SC,P is determined by three order pa- arbitrary shapes, this problem can be very complicated. rameters: a parameter based on radial distribution func- On a lattice, this problem can be solved with a fast algo- tions (RDF), RC,P, a parameter based on molecular ori- rithm called Dancing Links, which exploits properties of entation, OC,P, and the potential energy per molecule 60 linked lists and was popularized by Knuth. We find that of the polymorph, EP, divided by the lowest polymorph enumerating crystal polymorphs of our model amounts energy E0 found by POLYNUM. The potential energy is to solving the exact cover problem on a hexagonal lat- included in SC,P to ensure that a low-energy polymorph is tice with periodic boundary conditions, as illustrated in selected for cases where several structurally similar poly- Fig. S4. Because of the simple geometry of our molecules morphs exist. The polymorph with the highest score SC,P and the short-ranged nature of interactions, functional is considered the best-matching polymorph for cluster C. groups reside near the vertices of a simple hexagonal lat- To determine the order parameters RC,P, we calcu- tice in all crystal structures and larger aggregates ob- late RDFs gi(r) for each of the five functional groups served in our simulations. The underlying lattice greatly i (only considering functional groups of the same type) reduces the search space for crystal polymorphs and pro- within a cutoff distance of 6σ. RDFs are averaged over all vides the connection to the exact cover problem. As evi- molecules of the cluster C, giving a set of 5 RDFs gi,C(r), dent from Fig. 2, not all crystals are close-packed; many as well as for a periodically replicated unit cell of a given have one or more ”holes” (i.e., unoccupied lattice sites) polymorph P , resulting in an analogous set gi,P(r). We in the unit cells. The exact cover algorithm can still be account for thermal expansion and noise present in clus- applied by treating holes as distinct ”molecules” that oc- ter configurations by calculating gi,P(r) from artificially cupy a single lattice site. To enumerate polymorphs, we thermalized bulk crystals, obtained by randomizing posi- proceed as follows: tions and orientations of all molecules of polymorph P by small displacements drawn from distributions estimated 1. On a 2D hexagonal lattice, enumerate all unit cells in MD simulation of a few different bulk crystals close that contain a minimum of 5 and a maximum of to the temperature of crystal formation. The order pa- M lattice sites. Eliminate geometrically equivalent rameter R , which quantifies structural similarity of unit cells and unit cells that are chiral enantiomers. C,P cluster C and polymorph P , is then determined from the 2. For each unit cell, apply the Dancing Links algo- Pearson correlation coefficients of the RDFs, rithm to generate all packings of molecules and 5 holes, considering periodic boundary conditions. X RC,P = hgi,C(r)gi,P(r)i − hgi,C(r)ihgi,P(r)i× 3. Eliminate equivalent structures from the set of solu- i=1 tions, based on contacts between functional groups.  2 2 2 2−1/2 hgi,C(r)i − hgi,C(r)i hgi,P(r)i − hgi,P(r)i . The algorithm returns the complete set of unique crystal 1 R 6σ structures, up to the specified maximum number of lat- Here, h· · · i = 6σ 0 dr ··· implies integration over the tice cells M. The number of solutions increases rapidly full range of the RDFs. with unit cell size; for our models, we find that we can To determine the orientational parameter OC,P, we calculate polymorphs for unit cells containing up to 14 characterize molecular environments by calculating the molecules and up to 4 holes (M = 74). Our analysis vectors that connect the center of mass (COM) of a given suggest that this limit is sufficient to identify the vast molecule to the COM of each of the 6 nearest neigh- majority of low energy polymorphs. The energy of crys- bor molecules. (Nearest neighbors are determined based tal polymorphs increases systematically with increasing on COM distance.) These vectors split the unit circle number of molecules in the unit cell and with increas- into 6 sectors. The central angles of these sectors define ing number of ”holes” in the unit cell, as illustrated in the components of a vector θ. We find the lexicograph- Fig. S7. Accordingly, the algorithm was able to identify ically minimal rotation68 of θ to ensure its invariance all crystal polymorphs observed in our MD simulations. with respect to global rotations of the coordinate system. A detailed description of the algorithm is given in the SI. θ can be straightforwardly calculated for all molecules 13 in cluster C. In polymorph unit cells with regular lat- 7H. Lorenzand A. Seidel-Morgenstern, “Processes to separate tice positions, however, several molecules can share the enantiomers,” Angewandte Chemie - International Edition 53, same COM distance from a given central molecule, mak- 1218–1250 (2014). 8J. Jacques, A. Collet, and S. H. Wilen, Enantiomers, Racemates, ing the selection of 6 nearest neighbors ambiguous. For and Resolutions. (John Wiley and Sons, 1981) p. 447 pp. each molecule in a unit cell, we therefore create 10 dif- 9J. Jacques, M. Leclercq, and M. J. Brienne, “La formation de ferent vectors θm (m = 1,..., 10), each independently sels augmente-t-elle la fr´equence des d´edoublements spontan´es?” randomized to approximate effects of thermal noise, as Tetrahedron 37, 1727–1733 (1981). 10 discussed in the last paragraph. Given the vectors θ K. Kinbara, Y. Hashimoto, M. Sukegawa, H. Nohira, and j,C K. Saigo, “Crystal structures of the salts of chiral primary for all molecules j in cluster C and the vectors θm,k,P for with achiral carboxylic acids: Recognition of the commonly- all molecules k in the unit cell of polymorph P , we deter- occurring supramolecular assemblies of hydrogen-bond networks mine for each molecule j a score that quantifies the sim- and their role in the formation of conglomerates,” Journal of the American Chemical Society 118, 3441–3449 (1996). ilarity of its molecular environment with molecules in P 11 P6 (l) (l)  A. Deyand E. Pidcock, “The relevance of chirality in space group by calculating s = minm,k l=1 |θm,k,P − θj,C| , where analysis: A database study of common hydrogen-bonding motifs l runs over the 6 components of θ. If s < 4◦, molecule j is and their symmetry preferences,” CrystEngComm 10, 1258–1264 (2008). categorized as ”crystalline”. The orientational order pa- 12E. D’Oria, P. G. Karamertzanis, and S. L. Price, “Spontaneous rameter is defined as the fraction of crystalline molecules resolution of enantiomers by crystallization: Insights from com- in the cluster, OC,P = Ncryst/N. puted crystal energy landscapes,” Crystal Growth and Design Finally, once the polymorph P ∗ that best matches the 10, 1749–1756 (2010). 13 structure of cluster C has been determined by finding the S. L. Price, “Why don’t we find more polymorphs?” Acta Crys- tallographica Section B: Structural Science, Crystal Engineering lowest score SC,P = 0.2 OC,P + 0.2 RC,P + 0.6 EP/E0 over and Materials 69, 313–328 (2013). the set of 100 low-energy polymorphs P , the CQ score of 14J. G. Wickerand R. I. Cooper, “Will it crystallise? Predicting C is simply defined as the orientational order parameter, crystallinity of molecular materials,” CrystEngComm 17, 1927– CQ = O ∗ . Note that the CQ score is sensitive to the 1934 (2015), arXiv:arXiv:1408.1149. C,P 15 size of the cluster and to the presence of defects within M. Pillong, C. Marx, P. Piechon, J. G. Wicker, R. I. Cooper, and T. Wagner, “A publicly available crystallisation data set and its the cluster, as molecules at the surface or near defects application in machine learning,” CrystEngComm 19, 3737–3745 will be categorized as non-crystalline. We have tuned (2017). 16 the weights that appear in the scoring function SC,P to A. Woszczykand P. Szabelski, “Creation of chiral adsorbed struc- optimize the performance of the polymorph identification tures using external inputs: results from lattice Monte Carlo sim- ulations,” Adsorption 22, 553–560 (2016). scheme. Examples of molecular clusters, assigned poly- 17A. Otero-De-La-Roza, J. E. Hein, and E. R. Johnson, “Reevaluat- morphs, and associated CQ values are given in Fig. S10. ing the Stability and Prevalence of Conglomerates: Implications for Preferential Crystallization,” Crystal Growth and Design 16, 6055–6059 (2016). 18 ACKNOWLEDGMENTS A. Gavezzottiand S. Rizzato, “Are racemic crystals favored over homochiral crystals by higher stability or by kinetics? Insights from comparative studies of crystalline stereoisomers,” Journal The authors thank Julio Facelli, Christoph Dellago, of Organic Chemistry 79, 4809–4816 (2014). and Wim Noorduin for useful discussions. J.C. thanks 19C. P. Brock, W. B. Schweizer, and J. D. Dunitz, “On the Valid- ity of Wallach’s Rule: On the Density and Stability of Racemic Merlin Carpenter for his assistance with the implemen- Crystals Compared with Their Chiral Counterparts,” Journal of tation of POLYNUM. The support and resources of the the American Chemical Society 113, 9811–9820 (1991). Center for High Performance Computing at the Univer- 20I. Paci, I. Szleifer, and M. A. Ratner, “Chiral separation: Mech- sity of Utah are gratefully acknowledged. This work anism modeling in two-dimensional systems,” Journal of the was supported by the National Science Foundation under American Chemical Society 129, 3545–3555 (2007). 21F. Latinwo, F. H. Stillinger, and P. G. Debenedetti, “Molecular Grant No. CHE-1900626. model for chirality phenomena,” Journal of Chemical Physics 145 (2016), 10.1063/1.4964678. 1 J. Bernstein, Polymorphism in Molecular Crystals(Google 22R. K. Hylton, G. J. Tizzard, T. L. Threlfall, A. L. Ellis, eBook), IUCr monographs on crystallography (Clarendon Press, S. J. Coles, C. C. Seaton, E. Schulze, H. Lorenz, A. Seidel- 2002) p. 428. Morgenstern, M. Stein, and S. L. Price, “Are the Crystal Struc- 2 A. J. Cruz-Cabeza, S. M. Reutzel-Edens, and J. Bernstein, “Facts tures of Enantiopure and Racemic Mandelic Acids Determined by and fictions about polymorphism,” Chemical Society Reviews 44, Kinetics or Thermodynamics?” Journal of the American Chem- 8619–8635 (2015). ical Society 137, 11095–11104 (2015). 3 A. Y. Lee, D. Erdemir, and A. S. Myerson, “Crystal Polymor- 23J. Bauer, S. Spanton, R. Henry, J. Quick, W. Dziki, W. Porter, phism in Chemical Process Development,” Annual Review of and J. Morris, “Ritonavir: An extraordinary example of confor- Chemical and Biomolecular Engineering 2, 259–280 (2011). mational polymorphism,” Pharmaceutical Research 18, 859–866 4 A. Calcaterraand I. D’Acquarica, “The market of : (2001). Chiral switches versus de novo enantiomerically pure com- 24A. Llin`asandJ. M. Goodman, “Polymorph control: past, present pounds,” Journal of Pharmaceutical and Biomedical Analysis and future,” (2008). 147, 323–340 (2018). 25S. L. Priceand S. M. Reutzel-Edens, “The potential of computed 5 L. A. Nguyen, H. He, and C. Pham-Huy, “Chiral drugs: an crystal energy landscapes to aid solid-form development,” Drug overview.” International journal of biomedical science : IJBS 2, Discovery Today 21, 912–923 (2016). 85–100 (2006). 26S. L. Price, “Computed crystal energy landscapes for understand- 6 S. W. Smith, “Chiral toxicology: It’s the same thing only differ- ing and predicting organic crystal structures and polymorphism,” ent,” Toxicological Sciences 110, 4–30 (2009). 14

Accounts of Chemical Research 42, 117–126 (2009). time-dependent interactions,” Journal of Chemical Physics 145 27F. Curtis, X. Li, T. Rose, A.´ V´azquez-Mayagoitia, S. Bhat- (2016), 10.1063/1.4972861. tacharya, L. M. Ghiringhelli, and N. Marom, “GAtor: A First- 49J. M. Berg, J. L. Tymoczko, and L. Stryer, Biochemistry, Fifth Principles Genetic Algorithm for Molecular Crystal Structure Edition, Biochemistry (W. H. Freeman, 2002). Prediction,” Journal of Chemical Theory and Computation 14, 50S. Duttaand A. J. Gellman, “ surface chemistry: Con- 2246–2264 (2018), arXiv:1802.08602. glomerate: versus racemate formation on surfaces,” Chemical So- 28J. Nymanand S. M. Reutzel-Edens, “Crystal structure prediction ciety Reviews 46, 7787–7839 (2017). is changing from basic science to applied technology,” Faraday 51Y. Q. Zhang, T. Lin, B. Cirera, R. Hellwig, C. A. Palma, Discussions 211, 459–476 (2018). Z. Chen, M. Ruben, J. V. Barth, and F. Klappenberger, “One- 29K. Seonah, A. M. Orendt, M. B. Ferraro, and J. C. Facelli, “Crys- Dimensionally Disordered Chiral Sorting by Racemic Tiling in tal structure prediction of flexible molecules using parallel genetic a Surface-Confined Supramolecular Assembly of Achiral Tec- algorithms with a standard force field,” Journal of Computa- tons,” Angewandte Chemie - International Edition 56, 7797–7802 tional Chemistry 30, 1973–1985 (2009). (2017). 30A. R. Oganovand C. W. Glass, “Crystal structure pre- 52C. Gervais, S. Beilles, P. Cardina¨el,S. Petit, and G. Coquerel, diction using ab initio evolutionary techniques: Principles “Oscillating crystallization in solution between (+)- and (-)-5- and applications,” Journal of Chemical Physics 124 (2006), ethyl-5-methylhydantoin under the influence of stirring,” Journal 10.1063/1.2210932. of Physical Chemistry B 106, 646–652 (2002). 31Y. Wang, J. Lv, L. Zhu, and Y. Ma, “Crystal structure pre- 53B. S. Greenand M. Knossow, “Lamellar twinning explains the diction via particle-swarm optimization,” Physical Review B - nearly racemic composition of chiral, single crystals of hexahe- Condensed Matter and Materials Physics 82, 1–8 (2010). licene,” Science 214, 795–797 (1981). 32A. M. Lund, G. I. Pagola, A. M. Orendt, M. B. Ferraro, and 54V. Y. Torbeev, K. A. Lyssenko, O. N. Kharybin, M. Y. Antipin, J. C. Facelli, “Crystal structure prediction from first principles: and R. G. Kostyanovsky, “ Lamellar Racemic Twinning as an The crystal structures of glycine,” Chemical Physics Letters 626, Obstacle for the Resolution of Enantiomers by Crystallization: 20–24 (2015). The Case of Me(All)N + (CH 2 Ph)Ph X - (X = Br, I) Salts ,” 33P. Avery, C. Toher, S. Curtarolo, and E. Zurek, “XTALOPT The Journal of Physical Chemistry B 107, 13523–13531 (2003). Version r12: An open-source evolutionary algorithm for crystal 55J. T. H. van Eupen, W. W. J. Elffrink, R. Keltjens, P. Bennema, structure prediction,” Computer Physics Communications 237, R. de Gelder, J. M. M. Smits, E. R. H. van Eck, A. P. M. Kent- 274–275 (2019). gens, M. A. Deij, H. Meekes, and E. Vlieg, “Polymorphism and 34C. H. Goodman, “Strained mixed-cluster model for glass struc- Migratory Chiral Resolution of the Free Base of . A ture,” Nature 257, 370–372 (1975). Remarkable Topotactical Solid State Transition from a Racemate 35A. Lindsay Greer, “Confusion by design,” Nature 366, 303–304 to a Racemic Conglomerate,” Crystal Growth & Design 8, 71–79 (1993). (2008). 36G. Ferlat, A. P. Seitsonen, M. Lazzeri, and F. Mauri, “Hidden 56Note that Pillong and coworkers also found that molecules with polymorphs drive vitrification in B 2 O 3,” Nature Materials 11, larger degree of flexibility tend to crystallize worse.15 Effects of 925–929 (2012). flexibility cannot be probed with our simple rigid models. 37R. Wangand M. D. Merz, “Non-crystallinity and polymorphism 57A. J. Cruz-Cabeza, “Crystal structure prediction: Are we there in elemental solids,” Nature 260, 35–36 (1976). yet?” Acta Crystallographica Section B: Structural Science, 38J. N. Onuchic, Z. Luthey-Schulten, and P. G. Wolynes, “THE- Crystal Engineering and Materials 72, 437–438 (2016). ORY OF PROTEIN FOLDING: The Energy Landscape Perspec- 58S. Royand A. J. Matzger, “Unmasking a third polymorph of a tive,” Annual Review of Physical Chemistry 48, 545–600 (1997). benchmark crystal-structureprediction compound,” Angewandte 39D. J. Wales, “Decoding the energy landscape: Extracting struc- Chemie - International Edition 48, 8505–8508 (2009). ture, dynamics and thermodynamics,” Philosophical Transac- 59J. Nicholls, G. P. Alexander, and D. Quigley, “Polyomino Models tions of the Royal Society A: Mathematical, Physical and En- of Surface Supramolecular Assembly: Design Constraints and gineering Sciences 370, 2877–2899 (2012). Structural Selectivity,” (2017), arXiv:1702.01994. 40S. Hormozand M. P. Brenner, “Design principles for self-assembly 60D. E. Knuth, “Dancing links,” Millennial Perspectives in with short-range interactions,” Proceedings of the National Computer Science: Proceedings of the 1999 OxfordMicrosoft Academy of Sciences 108, 5193–5198 (2011). Symposium in Honour of Sir Tony Hoare , 187–214 (2000), 41E. G. Teich, G. van Anders, and S. C. Glotzer, “Identity crisis arXiv:0011047 [cs]. in alchemical space drives the entropic colloidal glass transition,” 61S. L. Price, D. E. Braun, and S. M. Reutzel-Edens, “Can com- Nature Communications 10 (2019), 10.1038/s41467-018-07977-2. puted crystal energy landscapes help understand pharmaceutical 42P. F. Damasceno, A. S. Karas, B. A. Schultz, M. Engel, and S. C. solids?” Chemical Communications 52, 7065–7077 (2016). Glotzer, “Controlling Chirality of Entropic Crystals,” Physical 62D. S. Coombes, C. R. A. Catlow, J. D. Gale, A. L. Rohl, and S. L. Review Letters 115, 1–5 (2015), arXiv:1507.03712. Price, “Calculation of attachment energies and relative volume 43A. P. Gantapara, W. Qi, and M. Dijkstra, “A novel chi- growth rates as an aid to polymorph prediction,” Crystal Growth ral phase of achiral hard triangles and an entropy-driven and Design 5, 879–885 (2005). demixing of enantiomers,” Soft Matter 11, 8684–8691 (2015), 63J. Li, C. J. Tilbury, M. N. Joswiak, B. Peters, and M. F. Do- arXiv:arXiv:1504.03130v1. herty, “Rate Expressions for Kink Attachment and Detachment 44S. P. Carmichaeland M. S. Shell, “A simple mechanism for During Crystal Growth,” Crystal Growth & Design 16, 3313– emergent chirality in achiral hard particle assembly,” Journal of 3322 (2016). Chemical Physics 139 (2013), 10.1063/1.4826466. 64J. D. Weeks, D. Chandler, and H. C. Andersen, “Role of repulsive 45T. G. Lombardo, F. H. Stillinger, and P. G. Debenedetti, “Ther- forces in determining the,” The Journal of Chemical Physics 54, modynamic mechanism for solution phase chiral amplification 5237–5247 (1971). via a lattice model,” Proceedings of the National Academy of 65J. Glaser, T. D. Nguyen, J. A. Anderson, P. Lui, F. Spiga, Sciences 106, 15131–15135 (2009). J. A. Millan, D. C. Morse, and S. C. Glotzer, “Strong 46Note that because of the symmetry of s9, this shape gives rise to scaling of general-purpose molecular dynamics simulations on only 9, not 15, unique molecules. GPUs,” Computer Physics Communications 192, 97–107 (2015), 47D. Klotsaand R. L. Jack, “Controlling crystal self-assembly using arXiv:1412.3387. a real-time feedback scheme,” Journal of Chemical Physics 138, 66J. A. Anderson, C. D. Lorenz, and A. Travesset, “General 1–9 (2013), arXiv:arXiv:1210.2636v2. purpose molecular dynamics simulations fully implemented on 48C. J. Fullertonand R. L. Jack, “Optimising self-assembly through graphics processing units,” Journal of Computational Physics 15

227, 5342–5359 (2008). and Simulation in Materials Science and Engineering 18 (2010), 67A. Stukowski, “Visualization and analysis of atomistic simula- 10.1088/0965-0393/18/1/015012. tion data with OVITO-the Open Visualization Tool,” Modelling 68K. S. Booth, “Lexicographically least circular substrings,” Infor- mation Processing Letters 10, 240–242 (1980).