Model-driven engineering in supramolecular systems

Citation for published version (APA): Paffen, T. F. E. (2017). Model-driven engineering in supramolecular systems. Technische Universiteit Eindhoven.

Document status and date: Published: 15/11/2017

Document Version: Publisher’s PDF, also known as Version of Record (includes final page, issue and volume numbers)


Model‐driven engineering in supramolecular systems

THESIS

submitted to obtain the degree of doctor at Eindhoven University of Technology, on the authority of the rector magnificus prof.dr.ir. F.P.T. Baaijens, to be defended in public before a committee appointed by the Doctorate Board on Wednesday 15 November 2017 at 16:00

by

Tim Fransiscus Elizabeth Paffen

born in Kerkrade

This thesis has been approved by the promotors, and the composition of the doctoral committee is as follows:
chair: prof.dr.ir. E.J.M. Hensen
promotor: prof.dr. E.W. Meijer
copromotor: dr.ir. T.F.A. de Greef
members: prof.dr. J.S. Moore (University of Illinois)
prof.dr.ir. J. Huskens (Universiteit Twente)
dr. G.M. Pavan (University of Applied Sciences and Arts of Southern Switzerland)
dr.ir. P.A. Korevaar (Radboud Universiteit Nijmegen)
dr.ir. A.R.A. Palmans

The research or design described in this thesis has been carried out in accordance with the TU/e Code of Scientific Conduct.

Cover design: Tim Paffen

Printed by: Gildeprint – the Netherlands

A catalogue record is available from the Eindhoven University of Technology Library ISBN: 978‐90‐386‐4375‐5

This work has been financially supported by the Netherlands Organization for Scientific Research (NWO)

Table of Contents

Chapter 1 Increasing complexity of supramolecular systems
  1.1 Introduction
  1.2 Molecular competition
  1.3 Multivalency and cooperativity
  1.4 Ring‐chain equilibria of divalent molecules
  1.5 Supramolecular binding motifs: UPy‐NaPy dimerization
  1.6 Aim and outline
  1.7 References

Chapter 2 Mathematical modeling of supramolecular systems
  2.1 Introduction
  2.2 Thermodynamic equilibrium: mass balances
  2.3 Deterministic kinetic simulations
  2.4 Rule‐based modeling
  2.5 Non‐linear regression of experimental data
  2.6 Decreasing computational time
  2.7 Conclusion
  2.8 Experimental section
  2.9 References

Chapter 3 Supramolecular buffering by ring‐chain competition
  3.1 Introduction
  3.2 Results and discussion
    3.2.1 Model outline
    3.2.2 Model validation
    3.2.3 Design principles of supramolecular ring‐chain buffering
    3.2.4 Buffering via a ring‐chain mechanism versus other buffering mechanisms
  3.3 Conclusions
  3.4 Experimental section
    3.4.1 Materials and methods
    3.4.2 Synthetic procedures
    3.4.3 Model description of two‐component supramolecular buffering
    3.4.4 Assignment of 1H NMR urethane proton resonances and analysis of ring‐chain equilibrium
    3.4.5 Influence of oligomeric rings on buffering
    3.4.6 Calculation of confidence interval on CNaPy free
    3.4.7 UV‐VIS data of ditopic UPy 3‐NaPy 2 mixtures
  3.5 References

Chapter 4 Model‐driven engineering of improved supramolecular buffering by multivalency
  4.1 Introduction
  4.2 Results
  4.3 Discussion
  4.4 Experimental section
    4.4.1 Model description of tri‐ and tetravalent UPys with NaPy
    4.4.2 Determination of KUPy‐UPy, KUPy‐NaPy and EM1
    4.4.3 Fit of 1H NMR data of NaPy 5 mixtures with trivalent UPy 3 or tetravalent UPy 4
    4.4.4 Model description and fit of the tetravalent UPy 4 and NaPy 5 titration using DPU 6
  4.5 References

Chapter 5 Regulating competing supramolecular interactions using ligand concentration
  5.1 Introduction
  5.2 Results and discussion
    5.2.1 Model outline
    5.2.2 Non‐linear regression of experimental titration data
  5.3 Conclusion
  5.4 Experimental section
  5.5 References

Chapter 6 Model‐driven engineering of reaction kinetics using feedback
  6.1 Introduction
  6.2 Results and discussion
    6.2.1 Supramolecular self‐accelerating reaction
    6.2.2 Model‐driven engineering of linear kinetics
      6.2.2.1 UPy catalysis: autocatalysis and autoinduction
      6.2.2.2 NaPy and UPy mixtures: toward linear kinetics
      6.2.2.3 Increased rate‐acceleration by covalently linking catalyst and substrate
  6.3 Conclusion
  6.4 Experimental section
    6.4.1 Rule‐based model for the self‐accelerating reaction
    6.4.2 Kinetic model of the Michael addition between maleimide and pentanedione
    6.4.3 Kinetic model of the Michael addition between maleimide and UPy construct 6
  6.5 References

Summary

Curriculum Vitae

List of publications

Acknowledgements

Chapter 1

Increasing complexity of supramolecular systems

Abstract

Supramolecular chemists are creating systems that are becoming increasingly intricate, approaching the complexity often observed in biological systems. With each step advancing towards more life‐like systems, it becomes increasingly clear that the only way to unravel the underlying molecular mechanisms is through a combined experimental and theoretical approach. Here, we present a short overview of the increase in complexity of supramolecular systems, followed by several expositions of the most important topics employed in the rest of the thesis: molecular competition, multivalency, cooperativity, ring‐chain equilibria, and ureidopyrimidinone dimerization.

1.1 Introduction

Traditionally, supramolecular chemistry has been concerned with the synthesis of artificial binding motifs and the characterization of their non‐covalent interactions, with the aim of creating responsive assemblies in solution or materials that can respond to environmental cues in a ‘smart’ way.1 As the field progressed it also diversified, focusing on topics such as supramolecular polymerizations,2–4 single molecule folding,5–9 multi‐component mixtures,10–14 supramolecular catalysts,15,16 and dissipative self‐assembly.17–21 Within those topics, newly synthesized molecules and their functional properties became more complex over time.

For example, research on supramolecular polymers initially focused on obtaining high molecular weights, requiring bivalent molecules with high equilibrium binding constants.2,22 Over time, a classification of supramolecular polymers was made by their underlying polymerization mechanisms:3 isodesmic polymers (all polymerization steps having equal binding constants),23 cooperative polymers (non‐equal binding constants),24 and ring‐chain polymers (isodesmic polymers with the capacity to form intramolecular rings).25 Subsequently, studies of the polymerization kinetics led to the discovery of kinetic traps, i.e. polymer states with different structures that are only stable for limited amounts of time, after which they convert to thermodynamically stable states.26,27 This resulted in living supramolecular polymerizations, in which small amounts of ‘active’ seeds polymerize ‘dormant’ monomers (in kinetic traps) in such a manner that the dispersity of the growing polymer is tightly controlled.28–30

Research on multi‐component mixtures started out by mixing mono‐ or bivalent molecules equipped with different supramolecular binding groups to characterize the extent to which their interactions allow social and/or narcissistic self‐sorting.31,32 Once characterized, the interactions could serve as supramolecular protecting groups,33,34 as logic gates,35–37 and as activators for chemical reactions.38 The field was stimulated by the discovery of dynamic covalent bonds, which can be switched from covalent to reversible by changing environmental conditions, e.g. pH, photo irradiation, and/or temperature.39–41 By employing molecules equipped with several dynamic covalent bonds, large libraries can be constructed in which each combination of the starting compounds is represented.42 These libraries allow for efficient screening of binding templates,43–45 chemical ‘evolution’ of the fittest assembly,46,47 switching between different types of supramolecular assemblies,48,49 and optimization of catalysts.16 Moreover, when the exchange rate between components is higher than their formation rates, severe non‐linear kinetics are observed which are reminiscent of the complexity observed in cellular reaction networks.50

As the supramolecular systems under study become more complex and occasionally lead to intractable results, theoretical models are increasingly used as complementary analysis methods.51–54 The primary advantage of using such models is that they enumerate all molecular species in a system, allowing a thorough understanding of underlying design principles. This is in contrast to most experimental characterization techniques, which typically only measure average properties of molecular ensembles. Combining experimental synthesis and characterization with theoretical analysis can have a synergistic effect, in which both techniques steer each other in the desired direction: theoretical models have the freedom to explore large parameter spaces to optimize molecular designs, while experimental work gives new input to models in the form of new molecular mechanisms and properties. Currently, a level of experimental and theoretical understanding has been reached that allows forward engineering of novel functionalities by first using models to simulate if a specific molecular design is viable, followed by the actual synthesis of the target structures and validation of the model predictions, i.e. model‐driven engineering.55 Model‐driven engineering can greatly reduce the time required to complete a study, as the time spent on synthesizing new molecules is reduced.56,57

1.2 Molecular competition

Molecular competition is a non‐covalent mechanism that plays a large role in many supramolecular systems in chemistry and biology and can give rise to critical concentrations, buffering and increased non‐linearity. Indeed, it defines the characteristics of systems such as supramolecular ring‐chain polymers,58,59 self‐sorting in multi‐component mixtures,31 and template binding in dynamic covalent libraries.43 Competition between molecular species often results in a mixture of different types of aggregates which can be extremely sensitive to changes in concentration, temperature, or other environmental cues. In self‐sorting systems, competition manifests as the interplay between narcissistic (AA and BB) and social (AB) self‐sorting (Fig. 1.1A).31 When the heteromeric binding constant KAB is low, narcissistic self‐sorting is prevalent, while a high value of KAB leads to social self‐sorting. In the intermediate region, both types of aggregates are present and the composition depends strongly on the value of KAB.

In biological systems, one of the ways in which molecular competition is used is to generate large responses upon small changes in concentration.60,61 For example, Buchler and Louis found that in the presence of a strongly binding inhibitor (B), the concentration of free protein (A) gives an ultrasensitive response when increasing the total concentration of A (Fig. 1.1B).62 Due to the strong binding, B sequesters any A in the inactive complex AB until the equivalence point is reached. Subsequently, after all B is bound, the concentration of free A increases sharply. While this analysis is performed at thermodynamic equilibrium, it has been shown that this same simple inhibition mechanism leads to transient ultrasensitivity in many cellular regulatory systems.62,63
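
The sequestration mechanism described above can be sketched numerically. The following minimal Python example assumes a single 1:1 equilibrium A + B ⇌ AB with association constant K and solves the two mass balances for free A in closed form. The function name `free_a` and all parameter values are illustrative assumptions and are not taken from reference 62.

```python
import math


def free_a(a_total: float, b_total: float, k_assoc: float) -> float:
    """Free [A] for A + B <=> AB at equilibrium (assumed 1:1 binding model).

    Mass balances: a_total = [A] + [AB] and b_total = [B] + [AB],
    with k_assoc = [AB] / ([A][B]). Eliminating [B] and [AB] gives
    k*[A]^2 + (1 + k*(b_total - a_total))*[A] - a_total = 0.
    """
    b = 1.0 + k_assoc * (b_total - a_total)
    root = math.sqrt(b * b + 4.0 * k_assoc * a_total)
    if b >= 0.0:
        # Algebraically equivalent form of the positive root that avoids
        # subtracting two nearly equal numbers when binding is strong.
        return 2.0 * a_total / (b + root)
    return (root - b) / (2.0 * k_assoc)


if __name__ == "__main__":
    k = 1e9        # M^-1, hypothetical strongly binding inhibitor
    b_tot = 1e-6   # M, hypothetical total inhibitor concentration
    for a_tot in (0.25e-6, 0.5e-6, 1e-6, 2e-6, 4e-6):
        print(f"A_total = {a_tot:.2e} M -> free A = {free_a(a_tot, b_tot, k):.3e} M")
```

With these hypothetical values, free A stays orders of magnitude below the total concentration while A_total is smaller than B_total, and rises to roughly A_total minus B_total once the equivalence point is passed.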

Figure 1.1 (A) Speciation of a self‐sorting system versus the heteromer binding constant KAB. The parameter values used for the simulation are: KAA = KBB = 10⁴ M⁻¹, CA,total = CB,total = 10 mM. Image adapted from reference 31.

(B) Ultrasensitive response of the concentration of free A versus the total concentration of A (AT) in the presence of various amounts of inhibitor B. The magnitude of the response is quantified by the logarithmic two‐fold sensitivity S, which is defined by the change in the free A concentration upon a two‐fold change in the total A concentration. Image from reference 62.

1.3 Multivalency and cooperativity

Multivalency and cooperativity are phenomena that can cause extreme non‐linear responses.64 Thus, it is advantageous to acquire an intuitive understanding of the underlying molecular mechanisms and their consequences. Multivalent molecules are equipped with multiple binding groups, allowing multiple points of contact to be maintained between two molecular entities (Fig. 1.2A).65,66 Apart from the increase in overall binding strength this design strategy offers, the binding selectivity can be tuned by both the positioning and the type of binding groups, allowing a multivalent binder to discriminate between components of complex mixtures.67,68 Furthermore, the kinetics of binding and unbinding events are influenced, due to an increase in the local concentration upon creation of the first interaction. For example, Huskens et al. showed that surface diffusion of a multivalent construct is governed by three distinct kinetic mechanisms (Fig. 1.2B).69

Multivalent interactions play an important role in biological systems, for example in the binding of viruses to cells, cell‐cell adhesion and cell signaling.70 Supramolecular model systems allow for detailed studies of the fundamental mechanisms as they lack the complexities of in situ measurements in biological systems.71 Typical model systems include dendrimers, molecular printboards, and colloids.72–75 Interestingly, multivalency can be used to sharply respond to receptor concentration on cell surfaces, leading to super‐selective binding.76,77 Furthermore, the addition of competition in the form of weak complementary binders positioned on the multivalent construct can prevent non‐specific binding to untargeted receptors.78

Figure 1.2 (A) Valence terminology for multivalent constructs. Image adapted from reference 65. (B) Kinetic mechanisms governing surface diffusion of a divalent construct. Image from reference 69.

Cooperativity is a phenomenon in which separate binding steps influence each other through electronic or structural changes upon binding.79 It is often difficult to assess the presence and extent of cooperative effects from molecular characterization alone, as it requires a comparison with a non‐cooperative model in which isolated interactions behave independently of each other.80 Cooperativity can be both positive and negative, implying that initial binding steps are energetically less and more favorable, respectively. Three types of cooperativities have been defined: allosteric, chelate, and interannular cooperativity (Fig. 1.3).80,81 Allosteric cooperativity is perhaps the most widely known from oxygen‐hemoglobin binding, where the binding of a second oxygen molecule is more favorable than the first due to a change in protein conformation upon binding.82 Allosteric cooperativity has been extensively characterized for host‐guest complexes and cooperative supramolecular polymerizations.24,79 Chelate cooperativity describes the formation of intramolecular bonds, such as when two multivalent molecules bind (Fig. 1.3B). It is the only form of cooperativity that is dependent on concentration due to competition between intra‐ and intermolecular binding (Fig. 1.3C). Lastly, interannular cooperativity arises when two intramolecular binding events influence each other, thus necessitating the presence of chelate cooperativity (Fig. 1.3D).83 Since multivalent constructs can display all three forms of cooperativity, their contributions can easily be misinterpreted. Therefore, to avoid erroneous assignments, it is often vital to construct an accompanying theoretical model of the system under study.

Figure 1.3 (A) Schematic depiction of allosteric cooperativity, which occurs when the equilibrium binding constants K1 or K2 are not equal to K. (B) Schematic depiction of chelate cooperativity, which occurs when EM is larger than zero. (C) Speciation of the equilibria depicted in (B) in the absence (left) and presence (right) of chelate cooperativity as a function of dimensionless concentration. The parameters used in the simulation are EM = 0 or 10 mM (left or right, respectively), K = 10³ M⁻¹, [AA]tot = 0.1 mM. (D) Schematic depiction of interannular cooperativity, which occurs when the effective molarities EM1 or EM2 are not equal to EM. Image adapted from reference 80.

Some cooperative supramolecular polymerizations are governed by a distinct case of allosteric cooperativity, in which the formation of an oligomeric nucleus is energetically unfavorable with respect to the subsequent elongation regime. Thus, allosteric cooperativity can span several molecules, instead of the single molecule depicted in Fig 1.3. The cooperativity can be due to additional non‐covalent interactions that only occur upon reaching the final nucleus size, as exemplified by the seminal paper on cooperative supramolecular polymerizations by Zhao and Moore (Fig. 1.4A).24 Alternatively, the polymerization can display positive cooperativity due to an aggregation‐induced polarization along the length of the nucleus.84 The concentration‐dependent aggregation of such cooperative systems displays strong non‐linear effects as compared to isodesmic polymerizations, resulting in an asymmetric and non‐sigmoidal transition (Fig. 1.4B‐C). Negative cooperative systems have been reported as well, in which steric hindrance leads to attenuation of the binding constant as the polymer grows.85,86
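
The contrast between isodesmic and cooperative growth can be made concrete with a small calculation. The sketch below assumes a nucleated model with a dimeric nucleus, in which the dimerization constant is σK and every later step has the constant K, so that the dimensionless mass balance reads K·ct = (1 − σ)x + σx/(1 − x)² with x = K[A]; σ = 1 recovers the isodesmic case. The nucleus size, the function names and the bisection solver are assumptions made for illustration only.

```python
def monomer_x(x_total: float, sigma: float = 1.0) -> float:
    """Solve (1 - sigma)*x + sigma*x/(1 - x)**2 = x_total for x = K[A].

    x_total is the dimensionless total concentration K*c_t (must be > 0).
    The left-hand side increases monotonically on (0, 1), so plain
    bisection is sufficient.
    """
    lo, hi = 0.0, 1.0 - 1e-12
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        value = (1.0 - sigma) * mid + sigma * mid / (1.0 - mid) ** 2
        if value < x_total:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def monomer_fraction(x_total: float, sigma: float = 1.0) -> float:
    """Fraction of material present as free monomer, [A]/c_t."""
    return monomer_x(x_total, sigma) / x_total


if __name__ == "__main__":
    for sigma in (1.0, 1e-2, 1e-4):
        row = ", ".join(
            f"{monomer_fraction(x_tot, sigma):.3f}" for x_tot in (0.1, 0.5, 1.0, 2.0, 10.0)
        )
        print(f"sigma = {sigma:g}: monomer fraction at K*c_t = 0.1, 0.5, 1, 2, 10 -> {row}")
```

In this sketch, a small σ keeps nearly all material monomeric until K·ct approaches one, after which K[A] saturates close to one; for σ = 1 the monomer fraction decreases gradually over the whole concentration range.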

Figure 1.4 (A) Schematic depiction of positive allosteric cooperativity in supramolecular polymers. The green arrows represent the additional non‐covalent interactions that make elongation energetically more favorable. (B) Monomer fraction ([A]/ct) and the product of the monomer concentration and the equilibrium constant (K[A]) versus the dimensionless total concentration for an isodesmic supramolecular polymerization. (C) Monomer fraction versus the dimensionless total concentration for a cooperative supramolecular polymerization, for different values of the cooperativity parameter σ. Image adapted from reference 24.

1.4 Ring‐chain equilibria of divalent molecules

Ring‐chain polymerization is one of the three archetypal mechanisms of equilibrium supramolecular polymerization, the other two being isodesmic and cooperative polymerizations.3 Molecules that undergo ring‐chain polymerization are characterized by having two or more complementary binding groups separated by a flexible linker that allows intramolecular cyclization of monomers and/or aggregates. Thus, all ring‐chain polymerizations display chelate cooperativity by definition. For divalent molecules, the thermodynamic equilibrium state has been extensively analyzed theoretically.58,87 The cyclization tendency is quantified via the effective molarity (EM), which equals the ratio of the intra‐ and intermolecular equilibrium binding constants (Kintra and Kinter, respectively) while at the same time equaling the local concentration of binding groups around the ends of the divalent molecule (Fig. 1.4A). Thus, the EM quantifies the competition between intra‐ and intermolecular association.

Experimentally, the EM can be determined relatively easily via the concentration of cycles above the critical concentration Ccr, i.e. the concentration above which the cycle concentration remains constant (Fig. 1.4B). The concentration‐dependent behavior follows naturally by considering that at low total concentrations, the high local concentration of binding groups around the ends of the molecule leads to intramolecular cyclization. Subsequently increasing the total concentration to a value above the local concentration leads to intermolecular binding, competing with cyclization.

Figure 1.4 (A) Schematic depiction of local concentration and its relationship to changes in the bulk concentration. (B) Concentration dependent behavior of divalent molecules, calculated using the model by Ercolani et al.58 The parameter values used for the prediction are EM1 = 10 mM and Kinter = 2.4 × 10⁸ M⁻¹.
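
As a worked example of the definition of the effective molarity, the parameter values quoted for panel B can be converted into an intramolecular binding constant. This assumes that EM1 denotes the effective molarity of the monomeric cycle and that Kinter is read as 2.4 × 10^8 M^-1; the symbol $K_{\mathrm{intra},1}$ is introduced here only for illustration.

```latex
\[
  \mathrm{EM} = \frac{K_{\mathrm{intra}}}{K_{\mathrm{inter}}}
  \quad\Longrightarrow\quad
  K_{\mathrm{intra},1} = \mathrm{EM}_1 \, K_{\mathrm{inter}}
  = \left(10 \times 10^{-3}\ \mathrm{M}\right)
    \left(2.4 \times 10^{8}\ \mathrm{M}^{-1}\right)
  = 2.4 \times 10^{6}
\]
```

The resulting intramolecular constant is dimensionless, consistent with the EM carrying units of concentration.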

The value of the EM is mostly determined by the molecular structure through the linker flexibility and the steric requirements of the binding motifs.88–90 For relatively short linkers (<30 linker atoms), the EM value is dominated by steric effects such as the energetics of bond rotation of linker atoms and the spatial requirements of the binding mechanism.25,91 For longer linkers (>30 linker atoms), the behavior of the linker can usually be described by scaling laws from polymer physics, leading to so‐called strain‐free cycles.92 As the EM is determined by the molecular structure, careful molecular design can lead to novel functionalities. For example, the addition of methyl side‐groups at specific positions along the linker can influence its conformational space in such a way that the divalent molecule displays entropy‐driven ring‐opening polymerization, in which an increase in temperature leads to polymerization of cyclic species, resulting in increased viscosity.93 Embedding a photo‐switchable group in the linker gives control over its flexibility, allowing switching between cyclic and polymeric species, resulting in a photo‐responsive sol‐gel transition.94 Lastly, a linker functionalized with hydrogen‐bonding groups is reported to result in kinetically trapped conformations, which can be disrupted upon mechanical agitation, resulting in gelation of the solvent.95

1.5 Supramolecular binding motifs: UPy‐NaPy dimerization

Extensive characterization has been performed of the non‐covalent interactions between supramolecular binding motifs, ranging from metal‐ligand interactions to host‐guest complexation, solvophobic interactions, π‐π stacking and hydrogen bonding arrays. Of the hydrogen bonded supramolecular motifs, one of the most popular is the ureidopyrimidinone (UPy) group, due to its synthetic accessibility, high self‐associating dimerization constant, and fast dynamics (lifetime = 0.12 s, KUPy‐UPy = 6 × 10⁷ M⁻¹ in CHCl3, 25°C; Fig. 1.5A).96 UPys can self‐associate through a self‐complementary AADD fourfold hydrogen bond array. The magnitude of the UPy dimerization constant originates from minimal repulsive secondary interactions and a minimal amount of competing tautomeric forms.2,97,98 The binding strength can be tuned by changing the solvent, temperature or by changing the periphery of the UPy moiety.96,99,100

Figure 1.5 (A) Binding modes of the self‐associating UPy group and its dimerization with NaPy following tautomerization. (B) Simulated concentration dependent speciation of an equimolar mixture of monovalent UPys and NaPys (KUPy‐UPy = 6 × 10⁷ M⁻¹, KUPy‐NaPy = 5 × 10⁶ M⁻¹).

Small divalent UPy molecules have been employed to create the first supramolecular polymer with material properties comparable to covalent polymers.22 The reversibility and short lifetime of UPy dimers ensures that polymers in solution quickly reach thermodynamic equilibrium. Furthermore, the reversibility has useful applications in bulk materials, allowing for the creation of self‐healable polymer networks,101,102 thermoplastic elastomers,103–105 and reversible adhesives.106,107

Next to self‐association, UPys can also dimerize with 2,7‐diamido‐1,8‐naphthyridine (NaPy) groups, after tautomerization to an ADDA hydrogen bond array via hydrogen transfer and rotation about the NH‐CO bond (Fig. 1.5A).108,109 While the dimerization constant of UPy‐NaPy dimerization is about one order of magnitude lower (KUPy‐NaPy ≈ 5 × 10⁶ M⁻¹, depending on the exact molecular structures), the formation of UPy‐NaPy dimers at mM concentrations is mostly quantitative.110 This is due to UPy‐NaPy dimer formation being more competitive compared to mixtures of UPy‐UPy dimers and free NaPy molecules (NaPys have no self‐complementary hydrogen bonding array), i.e. at higher concentrations it is more favorable to have fewer particles (Fig. 1.5B). NaPy moieties have been employed as polymeric repeating units allowing grafting by UPy functionalized molecules,34 as a means to study the influence of selectivity on polymerization,59,111 and more recently as phase‐transfer catalysts (Chapter 6).112
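
The concentration dependence of this competition can be sketched with a small mass-balance calculation. The example below assumes only two equilibria for monovalent species, 2 UPy ⇌ UPy‐UPy and UPy + NaPy ⇌ UPy‐NaPy, with the constants quoted in the text (6 × 10^7 and 5 × 10^6 M^-1) used as defaults. The function name, the dictionary keys and the bisection solver are illustrative assumptions and not the model used elsewhere in this thesis.

```python
def upy_napy_speciation(c_upy: float, c_napy: float,
                        k_dim: float = 6e7, k_het: float = 5e6) -> dict:
    """Equilibrium speciation of monovalent UPy (U) and NaPy (N).

    Assumed model: 2 U <=> UU with k_dim = [UU]/[U]^2 and
    U + N <=> UN with k_het = [UN]/([U][N]); NaPy does not self-associate.
    Mass balances:
        c_upy  = [U] + 2[UU] + [UN]
        c_napy = [N] + [UN]
    """
    def upy_balance(u: float) -> float:
        # [UN] written in terms of free U by eliminating free N
        un = k_het * u * c_napy / (1.0 + k_het * u)
        return u + 2.0 * k_dim * u * u + un - c_upy

    # The balance is negative at u = 0, non-negative at u = c_upy and
    # increases monotonically in between, so bisection finds the root.
    lo, hi = 0.0, c_upy
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if upy_balance(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    u = 0.5 * (lo + hi)
    n = c_napy / (1.0 + k_het * u)
    return {"U": u, "N": n, "UU": k_dim * u * u, "UN": k_het * u * n}


if __name__ == "__main__":
    for c in (1e-6, 1e-5, 1e-4, 1e-3, 1e-2):
        s = upy_napy_speciation(c, c)
        print(f"c = {c:.0e} M: fraction of UPy in UPy-NaPy dimers = {s['UN'] / c:.2f}")
```

Under these assumptions, more than 80% of the UPy in an equimolar mixture ends up in UPy‐NaPy dimers at 1 mM, compared with less than 40% at 1 µM, in line with the qualitative argument that heterodimer formation becomes more favorable as the concentration increases.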

1.6 Aim and outline

Significant scientific progress has been made on the subjects of molecular competition, multivalency, cooperativity, ring‐chain equilibria, and four‐fold hydrogen bonding. However, with a few exceptions, the applications of theoretical models in supramolecular chemistry are limited to a posteriori analysis of experimental results. As current supramolecular systems are reaching a level of complexity that cannot be unraveled through experimental characterization alone, we postulate that the only way forward is through a synergistic experimental and theoretical approach. Thus, the aim of this thesis is to demonstrate the benefits of applying model‐driven engineering to supramolecular chemistry research. The research presented in this thesis has been performed in close collaboration with Bram Teunissen, who performed the majority of the synthesis and characterization.113

Chapter 2 is a tutorial chapter in which we introduce the modeling techniques that are used throughout this thesis. Relatively simple examples are given and thoroughly explained, demonstrating common pitfalls and their solutions or workarounds.

In Chapter 3, we describe a two‐component supramolecular system comprising a divalent UPy and a monovalent NaPy, in which the monovalent NaPy is buffered over a wide concentration range. Via a combination of experimental characterization and theoretical analysis we find that the buffering is caused by competition between cyclization of the divalent UPy and end‐capping of linear chains by NaPy. Using the model, we demonstrate that the effective molarity is the critical parameter in optimizing the broadness of the buffered region and the concentration of the buffered molecule.

Subsequently, Chapter 4 describes a model‐driven approach to improve the supramolecular buffering system of the previous chapter by using multivalency. Model simulations reveal that there is an odd‐even effect in which even‐valent molecules possess superior buffering capabilities. Additionally, we predict that buffering can be considerably improved using a tetravalent instead of a divalent UPy, which is confirmed by subsequent experimental validation.

Chapter 5 describes the theoretical analysis of a C2v‐symmetric trivalent UPy which can form two mutually exclusive cycles, synthesized by Bram Teunissen. The speciation of the molecule cannot be fully assigned by using experimental characterization alone, requiring the use of a thermodynamic binding model.

Lastly, in Chapter 6 we present model predictions on the scope and viability of a conceptual supramolecular self‐accelerating reaction, in which product cyclization leads to the non‐covalent release of additional catalyst. Additionally, we analyze a different reaction system using kinetic models, in which UPy and NaPy groups act as phase‐transfer catalysts, their non‐covalent interactions allowing precise tuning of the type of kinetics. The system displays direct positive feedback, and that feedback is optimized to increase control over the shape of the kinetics, demonstrating the potential of model‐driven engineering.

1.7 References

(1) Whitesides, G. M.; Mathias, J. P.; Seto, C. T. Science 1991, 254 (5036), 1312. (2) Brunsveld, L.; Folmer, B. J. B.; Meijer, E. W.; Sijbesma, R. P. Chem. Rev. 2001, 101 (12), 4071. (3) De Greef, T. F. A.; , M. M. J.; Wolffs, M.; Schenning, A. P. H. J.; Sijbesma, R. P.; Meijer, E. W. Chem. Rev. 2009, 109 (11), 5687. (4) Aida, T.; Meijer, E. W.; Stupp, S. I. Science 2012, 335 (6070), 813. (5) Gellman, S. H. Acc. Chem. Res. 1998, 31 (4), 173. (6) Hill, D. J.; Mio, M. J.; Prince, R. B.; Hughes, T. S.; Moore, J. S. Chem. Rev. 2001, 101 (12), 3893. (7) Pomposo, J. A. Polym. Int. 2014, 63 (4), 589. (8) Terashima, T.; Sugita, T.; Fukae, K.; Sawamoto, M. Macromolecules 2014, 47 (2), 589. (9) Baumgartner, R.; Fu, H.; Song, Z.; Lin, Y.; Cheng, J. Nat. Chem. 2017, 9 (7), 614. (10) Ludlow, R. F.; Otto, S. Chem. Soc. Rev. 2008, 37 (1), 101. (11) Giuseppone, N. Acc. Chem. Res. 2012, 45 (12), 2178. (12) Mattia, E.; Otto, S. Nat. Nanotechnol. 2015, 10 (2), 111. (13) Bissette, A. J.; Fletcher, S. P. Nat. Chem. 2015, 7 (1), 15. (14) Ashkenasy, G.; Hermans, T. M.; Otto, S.; Taylor, A. F. Chem. Soc. Rev. 2017. (15) Wilkinson, M. J.; Leeuwen, P. W. N. M. van; Reek, J. N. H. Org. Biomol. Chem. 2005, 3 (13), 2371. (16) Dydio, P.; Breuil, P.‐A. R.; Reek, J. N. H. Isr. J. Chem. 2013, 53 (1–2), 61. (17) Heinen, L.; Walther, A. Soft Matter 2015, 11 (40), 7857. (18) Boekhoven, J.; Hendriksen, W. E.; Koper, G. J. M.; Eelkema, R.; Esch, J. H. van. Science 2015, 349 (6252), 1075. (19) Maiti, S.; Fortunati, I.; Ferrante, C.; Scrimin, P.; Prins, L. J. Nat. Chem. 2016, 8 (7), 725. (20) Sorrenti, A.; Leira‐Iglesias, J.; Markvoort, A. J.; Greef, T. F. A. de; Hermans, T. M. Chem. Soc. Rev. 2017. (21) Sorrenti, A.; Leira‐Iglesias, J.; Sato, A.; Hermans, T. M. Nat. Commun. 2017, 8, ncomms15899.

(22) Sijbesma, R. P.; Beijer, F. H.; Brunsveld, L.; Folmer, B. J. B.; Hirschberg, J. H. K. K.; Lange, R. F. M.; Lowe, J. K. L.; Meijer, E. W. Science 1997, 278 (5343), 1601. (23) Martin, R. B. Chem. Rev. 1996, 96 (8), 3043. (24) Zhao, D.; Moore, J. S. Org. Biomol. Chem. 2003, 1 (20), 3471. (25) Mandolini, L. In Advances in Physical Organic Chemistry; Bethell, V. G. and D., Ed.; Academic Press, 1986; Vol. 22, pp 1–111. (26) Korevaar, P. A.; George, S. J.; Markvoort, A. J.; Smulders, M. M. J.; Hilbers, P. A. J.; Schenning, A. P. H. J.; De Greef, T. F. A.; Meijer, E. W. Nature 2012, 481 (7382), 492. (27) Haedler, A. T.; Meskers, S. C. J.; Zha, R. H.; Kivala, M.; Schmidt, H.‐W.; Meijer, E. W. J. Am. Chem. Soc. 2016, 138 (33), 10539. (28) Ogi, S.; Sugiyasu, K.; Manna, S.; Samitsu, S.; Takeuchi, M. Nat. Chem. 2014, 6 (3), 188. (29) Kang, J.; Miyajima, D.; Mori, T.; Inoue, Y.; Itoh, Y.; Aida, T. Science 2015, 347 (6222), 646. (30) van der Zwaag, D.; de Greef, T. F. A.; Meijer, E. W. Angew. Chem. Int. Ed. 2015, n/a. (31) Wu, A. X.; Isaacs, L. J. Am. Chem. Soc. 2003, 125 (16), 4831. (32) Safont‐Sempere, M. M.; Fernandez, G.; Wuerthner, F. Chem. Rev. 2011, 111 (9), 5784. (33) Scherman, O. A.; Ligthart, G. B. W. L.; Ohkawa, H.; Sijbesma, R. P.; Meijer, E. W. Proc. Natl. Acad. Sci. 2006, 103 (32), 11850. (34) Ohkawa, H.; Ligthart, G. B. W. L.; Sijbesma, R. P.; Meijer, E. W. Macromolecules 2007, 40 (5), 1453. (35) Wagner, N.; Ashkenasy, G. Chem. ‐ Eur. J. 2009, 15 (7), 1765. (36) Komatsu, H.; Matsumoto, S.; Tamaru, S.; Kaneko, K.; Ikeda, M.; Hamachi, I. J. Am. Chem. Soc. 2009, 131 (15), 5580. (37) Pischel, U.; Uzunova, V. D.; Remon, P.; Nau, W. M. Chem. Commun. 2010, 46 (15), 2635. (38) Smulders, M. M. J.; Nitschke, J. R. Chem. Sci. 2012, 3 (3), 785. (39) Rowan, S. J.; Cantrill, S. J.; Cousins, G. R. L.; Sanders, J. K. M.; Stoddart, J. F. Angew. Chem. Int. Ed. 2002, 41 (6), 898. (40) Lehn, J.‐M. 
In Constitutional Dynamic Chemistry; Barboiu, M., Ed.; Springer Berlin Heidelberg: Berlin, Heidelberg, 2011; Vol. 322, pp 1–32. (41) Jackson, A. W.; Fulton, D. A. Polym. Chem. 2013, 4 (1), 31. (42) Li, J.; Nowak, P.; Otto, S. J. Am. Chem. Soc. 2013, 135 (25), 9222. (43) Corbett, P. T.; Sanders, J. K. M.; Otto, S. J. Am. Chem. Soc. 2005, 127 (26), 9390. (44) Mahon, C. S.; Fulton, D. A. Nat. Chem. 2014, 6 (8), 665. (45) Mahon, C. S.; Fascione, M. A.; Sakonsinsiri, C.; McAllister, T. E.; Turnbull, W. B.; Fulton, D. A. Org. Biomol. Chem. 2015, 13 (9), 2756. (46) Sadownik, J. W.; Mattia, E.; Nowak, P.; Otto, S. Nat. Chem. 2016, 8 (3), 264. (47) Duim, H.; Otto, S. Beilstein J. Org. Chem. 2017, 13 (1), 1189. (48) Li, W.; Dong, Z.; Zhu, J.; Luo, Q.; Liu, J. Chem. Commun. 2014, 50 (94), 14744. (49) Nguyen, R.; Jouault, N.; Zanirati, S.; Rawiso, M.; Allouche, L.; Fuks, G.; Buhler, E.; Giuseppone, N. Soft Matter 2014, 10 (22), 3926. (50) Armao, J. J.; Lehn, J.‐M. J. Am. Chem. Soc. 2016, 138 (51), 16809.

(51) Smulders, M. M. J.; Schenning, A. P. H. J.; Meijer, E. W. J. Am. Chem. Soc. 2008, 130 (2), 606. (52) Smulders, M. M. J.; Nieuwenhuizen, M. M. L.; de Greef, T. F. A.; van der Schoot, P.; Schenning, A. P. H. J.; Meijer, E. W. Chem. – Eur. J. 2010, 16 (1), 362. (53) Nakano, Y.; Markvoort, A. J.; Cantekin, S.; Filot, I. A. W.; ten Eikelder, H. M. M.; Meijer, E. W.; Palmans, A. R. A. J. Am. Chem. Soc. 2013, 135 (44), 16497. (54) van der Zwaag, D.; Pieters, P. A.; Korevaar, P. A.; Markvoort, A. J.; Spiering, A. J. H.; de Greef, T. F. A.; Meijer, E. W. J. Am. Chem. Soc. 2015, 137 (39), 12677. (55) Korevaar, P. A.; Grenier, C.; Markvoort, A. J.; Schenning, A. P. H. J.; Greef, T. F. A. de; Meijer, E. W. Proc. Natl. Acad. Sci. 2013, 110 (43), 17205. (56) Kitano, H. Nature 2002, 420 (6912), 206. (57) Carothers, J. M.; Goler, J. A.; Juminaga, D.; Keasling, J. D. Science 2011, 334 (6063), 1716. (58) Ercolani, G.; Mandolini, L.; Mencarelli, P.; Roelens, S. J. Am. Chem. Soc. 1993, 115 (10), 3901. (59) de Greef, T. F. A.; Ercolani, G.; Ligthart, G. B. W. L.; Meijer, E. W.; Sijbesma, R. P. J. Am. Chem. Soc. 2008, 130 (41), 13755. (60) Koshland, D. E.; Goldbeter, A.; Stock, J. B. Science 1982, 217 (4556), 220. (61) Zhang, Q.; Bhattacharya, S.; Andersen, M. E. Open Biol. 2013, 3 (4), 130031. (62) Buchler, N. E.; Louis, M. J. Mol. Biol. 2008, 384 (5), 1106. (63) Kim, S. Y.; Ferrell Jr., J. E. Cell 2007, 128 (6), 1133. (64) Badjić, J. D.; Nelson, A.; Cantrill, S. J.; Turnbull, W. B.; Stoddart, J. F. Acc. Chem. Res. 2005, 38 (9), 723. (65) , A.; Huskens, J.; Reinhoudt, D. N. Org. Biomol. Chem. 2004, 2 (23), 3409. (66) Krishnamurthy, V. M.; Estroff, L. A.; Whitesides, G. M. Fragm.‐Based Approaches Drug Discov. 2006, 34, 11. (67) Fasting, C.; Schalley, C. A.; Weber, M.; Seitz, O.; Hecht, S.; Koksch, B.; Dernedde, J.; Graf, C.; Knapp, E.‐W.; Haag, R. Angew. Chem. Int. Ed. 2012, 51 (42), 10472. (68) Levine, P. M.; Carberry, T. P.; Holub, J. M.; Kirshenbaum, K. 
MedChemComm 2013, 4 (3), 493. (69) Perl, A.; Gomez‐Casado, A.; Thompson, D.; Dam, H. H.; Jonkheijm, P.; Reinhoudt, D. N.; Huskens, J. Nat. Chem. 2011, 3 (4), 317. (70) Xu, H.; Shaw, D. E. Biophys. J. 2016, 110 (1), 218. (71) Huskens, J. Curr. Opin. Chem. Biol. 2006, 10 (6), 537. (72) Corbin, P. S.; Lawless, L. J.; Li, Z.; Ma, Y.; Witmer, M. J.; Zimmerman, S. C. Proc. Natl. Acad. Sci. 2002, 99 (8), 5099. (73) W. Ludden, M. J.; N. Reinhoudt, D.; Huskens, J. Chem. Soc. Rev. 2006, 35 (11), 1122. (74) Ejima, H.; Richardson, J. J.; Caruso, F. Angew. Chem. Int. Ed. 2013, 52 (12), 3314. (75) Gerth, M.; Voets, I. K. Chem. Commun. 2017, 53 (32), 4414. (76) Martinez‐Veracoechea, F. J.; Frenkel, D. Proc. Natl. Acad. Sci. 2011, 108 (27), 10963. (77) Curk, T.; Dobnikar, J.; Frenkel, D. Proc. Natl. Acad. Sci. 2017, 114 (28), 7210. (78) Angioletti‐Uberti, S. Phys. Rev. Lett. 2017, 118 (6), 068001. (79) Shinkai, S.; Ikeda, M.; Sugasaki, A.; Takeuchi, M. Acc. Chem. Res. 2001, 34 (6), 494. (80) Ercolani, G.; Schiaffino, L. Angew. Chem. Int. Ed. 2011, 50 (8), 1762. (81) Hunter, C. A.; Anderson, H. L. Angew. Chem. Int. Ed. 2009, 48 (41), 7488.

(82) Shibayama, N.; Sugiyama, K.; Tame, J. R. H.; Park, S.‐Y. J. Am. Chem. Soc. 2014. (83) Wilson, G. S.; Anderson, H. L. Chem. Commun. 1999, 0 (16), 1539. (84) Filot, I. A. W.; Palmans, A. R. A.; Hilbers, P. A. J.; van Santen, R. A.; Pidko, E. A.; de Greef, T. F. A. J. Phys. Chem. B 2010, 114 (43), 13667. (85) Weegen, R. van der; A. Korevaar, P.; Voudouris, P.; K. Voets, I.; Greef, T. F. A. de; M. Vekemans, J. A. J.; W. Meijer, E. Chem. Commun. 2013, 49 (49), 5532. (86) Gershberg, J.; Fennel, F.; Rehm, T. H.; Lochbrunner, S.; Würthner, F. Chem. Sci. 2016, 7 (3), 1729. (87) Jacobson, H.; Stockmayer, W. H. J. Chem. Phys. 1950, 18 (12), 1600. (88) Adams, H.; Chekmeneva, E.; Hunter, C. A.; Misuraca, M. C.; Navarro, C.; Turega, S. M. J. Am. Chem. Soc. 2013, 135 (5), 1853. (89) Sun, H.; Hunter, C. A.; Navarro, C.; Turega, S. J. Am. Chem. Soc. 2013, 135 (35), 13129. (90) Sun, H.; Navarro, C.; Hunter, C. A. Org. Biomol. Chem. 2015. (91) ten Cate, A. T.; Kooijman, H.; Spek, A. L.; Sijbesma, R. P.; Meijer, E. W. J. Am. Chem. Soc. 2004, 126 (12), 3801. (92) Krishnamurthy, V. M.; Semetey, V.; Bracher, P. J.; Shen, N.; Whitesides, G. M. J. Am. Chem. Soc. 2007, 129 (5), 1312. (93) Folmer, B. J. B.; Sijbesma, R. P.; Meijer, E. W. J. Am. Chem. Soc. 2001, 123 (9), 2093. (94) Xu, J.‐F.; Chen, Y.‐Z.; Wu, D.; Wu, L.‐Z.; Tung, C.‐H.; Yang, Q.‐Z. Angew. Chem. Int. Ed. 2013, 52 (37), 9738. (95) Teunissen, A. J. P.; Nieuwenhuizen, M. M. L.; Rodríguez‐Llansola, F.; Palmans, A. R. A.; Meijer, E. W. Macromolecules 2014, 47 (23), 8429. (96) Sontjens, S. H. M.; Sijbesma, R. P.; van Genderen, M. H. P.; Meijer, E. W. J. Am. Chem. Soc. 2000, 122 (31), 7487. (97) Beijer, F. H.; Sijbesma, R. P.; Kooijman, H.; Spek, A. L.; Meijer, E. W. J. Am. Chem. Soc. 1998, 120 (27), 6761. (98) Wilson, A. J. Soft Matter 2007, 3 (4), 409. (99) Lafitte, V. G. H.; Aliev, A. E.; Hailes, H. C.; Bala, K.; Golding, P. J. Org. Chem. 2005, 70 (7), 2701. (100) Greef, T. F. A. de; Nieuwenhuizen, M. M. L.; Stals, P. J. 
M.; Fitié, C. F. C.; Palmans, A. R. A.; Sijbesma, R. P.; Meijer, E. W. Chem. Commun. 2008, No. 36, 4306. (101) R. Hart, L.; L. Harries, J.; W. Greenland, B.; M. Colquhoun, H.; Hayes, W. Polym. Chem. 2013, 4 (18), 4860. (102) Chirila, T. V.; Lee, H. H.; Oddon, M.; Nieuwenhuizen, M. M. L.; Blakey, I.; Nicholson, T. M. J. Appl. Polym. Sci. 2014, 131 (4), n/a. (103) Bosnian, A. W.; Brunsveld, L.; Folmer, B. J. B.; Sijbesma, R. P.; Meijer, E. W. Macromol. Symp. 2003, 201 (1), 143. (104) Appel, W. P. J.; Portale, G.; Wisse, E.; Dankers, P. Y. W.; Meijer, E. W. Macromolecules 2011, 44 (17), 6776. (105) Wietor, J.‐L.; van Beek, D. J. M.; Peters, G. W.; Mendes, E.; Sijbesma, R. P. Macromolecules 2011, 44 (5), 1211. (106) Heinzmann, C.; Coulibaly, S.; Roulin, A.; Fiore, G. L.; Weder, C. ACS Appl. Mater. Interfaces 2014, 6 (7), 4713.

(107) Balkenende, D. W. R.; Monnier, C. A.; Fiore, G. L.; Weder, C. Nat. Commun. 2016, 7, 10995. (108) Wang, X.‐Z.; Li, X.‐Q.; Shao, X.‐B.; Zhao, X.; Deng, P.; Jiang, X.‐K.; Li, Z.‐T.; Chen, Y.‐Q. Chem. ‐ Eur. J. 2003, 9 (12), 2904. (109) de Greef, T. F. A.; Ligthart, G. B. W. L.; Lutz, M.; Spek, A. L.; Meijer, E. W.; Sijbesma, R. P. J. Am. Chem. Soc. 2008, 130 (16), 5479. (110) Ligthart, G. B. W. L. Complementary quadruple hydrogen bonding. PhD Thesis, Eindhoven University of Technology: Eindhoven, 2006. (111) Ligthart, G. B. W. L.; Ohkawa, H.; Sijbesma, R. P.; Meijer, E. W. J. Am. Chem. Soc. 2005, 127 (3), 810. (112) Teunissen, A. J. P.; Haas, R. J. C. van der; Vekemans, J. A. J. M.; Palmans, A. R. A.; Meijer, E. W. Bull. Chem. Soc. Jpn. 2016, 89 (3), 308. (113) Teunissen, A. J. P. Competing Interactions in Chemical Reaction Networks. PhD Thesis, Eindhoven University of Technology: Eindhoven, 2017.

2 Mathematical modeling of supramolecular systems

Abstract

As supramolecular systems become more complex due to increasing numbers of interacting components, theoretical analysis and mathematical models become requisites for complete characterization. Here, we present a tutorial overview of various types of theoretical analyses as applied to supramolecular systems, focusing on simulations of thermodynamic equilibria and deterministic kinetics. Additionally, rule‐based modeling is introduced as a complementary method to describe complex systems which are characterized by repeating patterns. Lastly, the interface between models and real systems is discussed extensively, highlighting the importance of extensive model validation.

2.1 Introduction

The increasing complexity of supramolecular systems inevitably necessitates theoretical analysis techniques to complement experimental characterization.1–3 The increase in complexity stems from a growing number of interacting molecular components, while the components themselves concurrently become more sophisticated. The aggregation of supramolecular monomers into functional complex structures is inherently non-linear, e.g. a relatively simple dimerization reaction is described by cubic equations.4,5 Furthermore, the often observed phenomenon of cooperativity causes strongly non-linear responses which are not easily interpreted intuitively.6 Therefore, simulations and modeling are required not only to characterize current supramolecular systems but also to design new functionalities, i.e. model-driven engineering.7,8

Current theoretical analysis techniques range widely in the amount of detail that is calculated and the scope of the simulation. Density functional theory calculations are mostly used to determine steady-state electronic structures of a handful of molecules with full atomistic resolution by quantum mechanics.9,10 Molecular modeling typically operates on larger length scales using Newtonian mechanics, handling small molecular ensembles (10^2–10^4 molecules, nanometers) and timescales of nano- to microseconds.11 Even larger length- and timescales can be accessed by coarse-grained models, in which (parts of) molecules are simulated as single entities.12,13 Complex multicomponent supramolecular systems typically necessitate a wider simulation scope in which molecular details such as electronic structures and molecular motions are consolidated into binding constants and statistical factors.14 Thus, the focus of such analyses lies on the mathematical description of the binding equilibria that govern an entire supramolecular system.

Three methods are generally employed for the theoretical analysis of the equilibria in complex supramolecular systems. For systems in thermodynamic equilibrium, a mass balance is solved which includes terms for all species that can potentially be formed.15–18 For time-varying supramolecular systems in which the kinetics play a role, either deterministic or stochastic simulations are used, depending on the requirements of the system studied and the availability of computational power.19–21

Here, we present a hands-on introduction to the computational techniques employed in this thesis. While all techniques were implemented in MATLAB®, the general principles remain the same when other programming platforms or languages are employed.

2.2 Thermodynamic equilibrium: mass balances

The equilibrium state of a supramolecular system can be obtained by writing down all the equilibria involved in the system, followed by the construction and simultaneous solving of the corresponding mass balances for each component of the system. The construction and solving of mass balances typically requires less computational power than performing kinetic simulations. However, the mass balances can become so complex that manually writing down all possible reactions is not feasible and no straightforward mathematical relations are available (for example in section 4.4.1). In such cases, a priori limits can be introduced on the number or type of species included in the model, or rule-based modeling can be considered (section 2.4).

The typical process for constructing mass balances is to first write down all possible reactions along with their equilibrium constants and statistical factors. Secondly, the concentrations of the species are calculated, typically as a function of monomer concentrations. Thirdly, the equivalent monomer concentrations of all species are calculated. Fourthly, all the terms are added into mass balances (one balance per component), resulting in a system of coupled non-linear equations, which remains to be solved by either analytical or numerical means. Preferably, an analytical solution is found, as its computational time is lowest. However, for systems with three or more components, the chance of finding an analytical solution is low.22 Numerically solving the mass balance is always possible, but care should be taken that the correct solution is obtained.

As an example of the construction of a mass balance, we consider here the equilibrium between two different monovalent UPys, U1 and U2, and a monovalent NaPy, N. We assume that U1 and U2 have completely identical binding sites, so that their self-association constants are equal, but differ in the rest of their molecular structure. The following paragraphs are numbered according to the steps involved in the mass balance construction.

1. The possible reactions are the self-association of each UPy, the dimerization of the two UPys with each other, and the dimerization of each UPy with NaPy (eq. 2.1). Note that the binding constant K_U1U2 is a factor 2 larger than both K_U1U1 and K_U2U2. This statistical factor is needed as a correction term to account for the mixing entropy of U1 and U2.14 If the correction were not applied and K_U1U2 were set equal to K_U1U1 and K_U2U2, an equimolar mixture of U1 and U2 would produce 33% (U1)₂, 33% U1U2, and 33% (U2)₂, while the correct amounts are 25%, 50%, and 25%, respectively. The only other instance in which statistical factors need to be applied is when dealing with polyvalent molecules, i.e. molecules that have more than one binding site. In those cases, the statistical factors can be calculated using the complementary 'symmetry' or 'direct count' methods,23,14 which will not be covered here (for examples, see sections 3.4.3, 4.4.1, or 5.2.1).

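$$
\begin{aligned}
\mathrm{U1} + \mathrm{U1} &\overset{K_{\mathrm{U1U1}}}{\rightleftharpoons} (\mathrm{U1})_2 & K_{\mathrm{U1U1}} &= 6 \times 10^{7}\ \mathrm{M^{-1}} \\
\mathrm{U2} + \mathrm{U2} &\overset{K_{\mathrm{U2U2}}}{\rightleftharpoons} (\mathrm{U2})_2 & K_{\mathrm{U2U2}} &= 6 \times 10^{7}\ \mathrm{M^{-1}} \\
\mathrm{U1} + \mathrm{U2} &\overset{K_{\mathrm{U1U2}}}{\rightleftharpoons} \mathrm{U1U2} & K_{\mathrm{U1U2}} &= 2K_{\mathrm{U1U1}} = 2K_{\mathrm{U2U2}} = 1.2 \times 10^{8}\ \mathrm{M^{-1}} \\
\mathrm{U1} + \mathrm{N} &\overset{K_{\mathrm{U1N}}}{\rightleftharpoons} \mathrm{U1N} & K_{\mathrm{U1N}} &= 6 \times 10^{7}\ \mathrm{M^{-1}} \\
\mathrm{U2} + \mathrm{N} &\overset{K_{\mathrm{U2N}}}{\rightleftharpoons} \mathrm{U2N} & K_{\mathrm{U2N}} &= 6 \times 10^{7}\ \mathrm{M^{-1}}
\end{aligned}
\tag{2.1}
$$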

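2. To calculate the concentrations of the species, one simply uses the definition of the equilibrium constant (first two lines of eq. 2.2). Note that any statistical factors should also be included in these equations. Here, the statistical factor is incorporated into the value of K_U1U2.

$$
\begin{aligned}
K_{\mathrm{U1U1}} &= \frac{[(\mathrm{U1})_2]}{[\mathrm{U1}]^2} \\
[(\mathrm{U1})_2] &= K_{\mathrm{U1U1}} [\mathrm{U1}]^2 \\
[(\mathrm{U2})_2] &= K_{\mathrm{U2U2}} [\mathrm{U2}]^2 \\
[\mathrm{U1U2}] &= K_{\mathrm{U1U2}} [\mathrm{U1}][\mathrm{U2}] \\
[\mathrm{U1N}] &= K_{\mathrm{U1N}} [\mathrm{U1}][\mathrm{N}] \\
[\mathrm{U2N}] &= K_{\mathrm{U2N}} [\mathrm{U2}][\mathrm{N}]
\end{aligned}
\tag{2.2}
$$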

3. For the calculation of the equivalent monomer concentrations, one needs to consider how many monomers of each type make up each species. For example, since dimer (U1)₂ consists of two U1 monomers, its equivalent monomer concentration is equal to its own concentration multiplied by two (eq. 2.3). Similarly, since dimer U1U2 consists of one U1 and one U2 monomer, its equivalent monomer concentration is equal to its own concentration, but it appears in both the U1 and the U2 mass balance (eq. 2.4a-b).

$$
\begin{array}{ll}
\text{Species} & \text{Equivalent monomer concentration} \\
(\mathrm{U1})_2 & 2K_{\mathrm{U1U1}} [\mathrm{U1}]^2 \\
\mathrm{U1U2} & K_{\mathrm{U1U2}} [\mathrm{U1}][\mathrm{U2}]
\end{array}
\tag{2.3}
$$

4. The final system of mass balance equations is obtained by summing, for each molecule, the equivalent monomer concentrations of all species that contain it and equating the sum to the total concentration of that molecule, which gives one mass balance equation per molecule (eq. 2.4).

U1 U1 2 U1  U1U2  U1N (2.4a)  total    2      2 U1 2KKU1U1  U1  U1U2  U1 U2  K U1N  U1 N U2 U2 2 U2  U1U2  U2N (2.4b)  total    2      2 U2 2KKU2U2  U2  U1U2  U1 U2  K U2N  U2 N NNU1NU2N       (2.4c) total NU1NU2NKKU1N    U2N  

To solve this system of three equations, several methods can be used. Preferably, an analytical solution is obtained, and for supramolecular polymeric systems one can usually find one using the solutions of mathematical series (section 3.4.3).24,25 However, the more components present in a system, the lower the chance of finding an analytical solution. Indeed, trying to analytically solve this system of equations using Mathematica or the MATLAB® symbolic toolbox does not result in a solution. While this does not rule out the possibility of an analytical solution, it does indicate that finding one will probably not be easy. Therefore, the system of equations will be solved numerically. In general, numerically solving a system of equations is fastest when only one equation needs to be solved as the solution space is reduced to a single dimension. In that case, the MATLAB® function fzero can be used. Here, reducing the number of equations is not so straightforward, as only eq. 2.4c is linear. Therefore, the MATLAB® function fsolve is employed. The function needs only three inputs: the system of mass balances (including binding constants and total concentrations), an initial guess of the concentrations of species for which we are solving (U1, U2, and N), and a tolerance. The tolerance determines how close the algorithm should approach the solution, and should accordingly be set lower than the expected order of magnitude of the solution. Setting the tolerance too high will result in inaccurate solutions which often manifest as erratic discontinuities in the speciation (Fig. 2.1). Numerically solving a system of equations always requires one to verify that the solution obtained from the algorithm is acceptable. For example, solutions of polynomial equations might produce negative or imaginary concentrations. While these solutions can be mathematically correct, they are obviously not physically or chemically correct. 
In that case, supplying the algorithm with different initial guesses can lead to acceptable solutions. Alternatively, the MATLAB® function lsqnonlin can be used, which allows constraints on the values of the parameters. Another way of verifying that the correct solution is obtained is to calculate the total mass in the system from the solution provided by the algorithm. While it may seem obvious that total mass should be conserved, it should be checked nonetheless, as errors and typos are easily made during coding. The region in which the total mass deviates from the expected value often gives a good indication of which line of code contains an error. To illustrate, the MATLAB® code that calculates and visualizes the UPy-NaPy equilibria of this section contains 90 lines of code and approximately 2600 characters, all of which should be correct (section 2.8). For more complex systems, the code gains in length and/or complexity accordingly. Note that these model verifications only check the internal consistency of the model. Any applicability to experimental systems should be validated separately (section 2.5).
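The numerical solution of eq. 2.4 and the subsequent mass conservation check can be sketched as follows. This Python example is a minimal illustration under stated assumptions and is not the MATLAB® implementation of section 2.8. Instead of fsolve, it uses nested bisection: eq. 2.4c is solved explicitly for [N], eq. 2.4b is solved for [U2] at a given [U1], and eq. 2.4a is then solved for [U1]. All function names are hypothetical, the binding constants are those of eq. 2.1, and the total NaPy concentration of 5 mM is an arbitrary choice.

```python
K11 = K22 = 6e7    # M^-1, UPy self-association (eq. 2.1)
K12 = 1.2e8        # M^-1, U1-U2 dimerization, statistical factor included
K1N = K2N = 6e7    # M^-1, UPy-NaPy dimerization (eq. 2.1)


def bisect(func, lo, hi, iterations=100):
    """Plain bisection; assumes func(lo) <= 0 <= func(hi)."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if func(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def free_napy(u1, u2, n_tot):
    # eq. 2.4c is linear in [N] and can be solved explicitly
    return n_tot / (1.0 + K1N * u1 + K2N * u2)


def recovered_totals(u1, u2, n):
    """Total concentrations recalculated from free concentrations (eq. 2.4a-c)."""
    u1_tot = u1 + 2 * K11 * u1 ** 2 + K12 * u1 * u2 + K1N * u1 * n
    u2_tot = u2 + 2 * K22 * u2 ** 2 + K12 * u1 * u2 + K2N * u2 * n
    n_tot = n + K1N * u1 * n + K2N * u2 * n
    return u1_tot, u2_tot, n_tot


def solve_mass_balance(u1_tot, u2_tot, n_tot):
    def u2_given_u1(u1):
        # residual of eq. 2.4b as a function of [U2] at fixed [U1]
        def residual(u2):
            n = free_napy(u1, u2, n_tot)
            return recovered_totals(u1, u2, n)[1] - u2_tot
        return bisect(residual, 0.0, u2_tot)

    def residual_u1(u1):
        # residual of eq. 2.4a, with [U2] and [N] eliminated
        u2 = u2_given_u1(u1)
        n = free_napy(u1, u2, n_tot)
        return recovered_totals(u1, u2, n)[0] - u1_tot

    u1 = bisect(residual_u1, 0.0, u1_tot)
    u2 = u2_given_u1(u1)
    return u1, u2, free_napy(u1, u2, n_tot)


TOTALS = (10e-3, 5e-3, 5e-3)   # M; C_U1 and C_U2 as in Fig. 2.1, NaPy total assumed
u1, u2, n = solve_mass_balance(*TOTALS)

# Mass conservation check: relative deviation of each recalculated total
deviations = [abs(calc - ref) / ref for calc, ref in zip(recovered_totals(u1, u2, n), TOTALS)]
print("free concentrations (M):", u1, u2, n)
print("relative mass deviations:", deviations)
```

Restricting every search interval to the range between zero and the total concentration keeps the solution physically meaningful by construction, at the cost of many more function evaluations than a Newton-type solver such as fsolve would need.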

Figure 2.1 Simulated speciation (colored lines) and the sum of all fractions (black line) versus total NaPy concentration based on eq. 2.4, using a tolerance of 10^-6 (left) and 10^-3 (right). The relatively high tolerance value used for the graphs on the right side results in discontinuities in the speciation. Equilibrium binding constants have been set as in eq. 2.1, C_U1 = 10 mM, C_U2 = 5 mM.

2.3 Deterministic kinetic simulations

The kinetics of supramolecular polymers are often simulated using deterministic kinetics, in which the rate of change of the concentration of each molecular species is described by an ordinary differential equation (ODE).19,20,8 The use of ODEs implies that concentrations are continuous variables, which is a reasonable assumption for typical reactions performed in the lab. When considering systems in which low numbers of molecules are present (e.g. DNA or RNA molecules in the cell), that assumption does not hold, and stochastic methods, which inherently incorporate molecular fluctuations, should be employed.26

In general, the construction of ODEs describing a supramolecular system is performed by writing down each reaction to be included in the model, followed by the summation of all reaction rates that are involved in the creation or consumption of each molecular species into its corresponding ODE. Since a reaction in which a molecular species is consumed lowers the concentration of that species, a minus sign is added to the corresponding reaction rate term in the ODE. For most supramolecular systems, it is a reasonable assumption that the rates of the individual reactions follow mass action kinetics, i.e. that the reaction rate is directly proportional to the concentrations of the species involved in the reaction.27,28 Other assumptions might be more appropriate for different systems, e.g. Michaelis-Menten kinetics are often assumed for enzymatic reactions.29,30

To illustrate the construction and solving of a system of ODEs, we consider here the reversible association of monomers to a growing supramolecular polymer, in which all monomer addition steps have the same equilibrium constant, i.e. an isodesmic supramolecular polymerization.4 This approach of sequential monomer addition is conceptually simpler than the inclusion of fragmentation or coagulation reactions between oligomers/polymers, although that inclusion is sometimes needed to realistically describe certain experimental systems.21 The ODE of each molecular species except the monomer then contains only four terms: two for the addition of monomer to a growing chain, and two for monomer dissociation (eq. 2.5a). The ODE for the monomer contains terms pertaining to all aggregates, and a factor 2 for the dimerization reactions, since two monomers are consumed upon dimerization (eq. 2.5b).

kk ff  CCCnnn11  kkbb (2.5a) dCn kCf 11 Cnnbnn C  k  C 1  C  dt       dC  1 kC22 C  C  k  C   C (2.5b) dt fnbn11  2 nn23 where Cn is the concentration of a polymer chain of size n, kf is the forward rate constant, and kb is the backward rate constant. The summation in equation 2.5b goes to infinity. However, since solving an infinite number of coupled ODEs is no easy task, most simulations are limited to a certain maximum polymer length (eq. 2.6).

dC NN1 1 kC22 C  C  k  C   C dt fnbn11  2 nn23 (2.6) dC N kCC kC dt fNbN11 where N is the largest polymer size included in the model. To ensure an accurate and realistic description of the supramolecular polymerization, N should always be set high enough so that the majority of the populated species is below N. Additionally, models

23

Chapter 2 describing cooperative supramolecular polymers can warrant the use of an analytical relationship which lumps together all polymers larger than N.19,20 This extension is used to prevent excessive computational cost due to the large polymer sizes and the corresponding increase in the number of required ODEs. Generally, solving ODEs of supramolecular systems is performed numerically by a solver that handles ‘stiff’ ODEs. Stiffness is an ill‐defined property of ODE systems which is usually observed when ODEs change on different timescales. Since most supramolecular systems have several processes occurring on different timescales simultaneously, the ODEs describing their kinetics are usually stiff. The primary MATLAB® ODE solver for stiff equations is ode15s, which is used here to solve the system of ODEs (eq. 2.4a and 2.5; Fig. 2.2A). The input parameters of ode15s are the system of ODEs to be solved (including rate constants), the timespan over which to solve it, the initial concentration of each species, and a tolerance which determines the numerical accuracy. Similar to the function fsolve, setting the tolerance too high will result in unstable numerical solutions characterized by erratic discontinuities (vide supra). However, here the numerical instabilities will propagate, as the solver uses an iterative algorithm to proceed through simulation time.

Figure 2.2 (A) Simulated speciation (lines) and the total fraction (dashed line) versus time of a supramolecular isodesmic polymerization, based on eq. 2.5a and 2.6. (B) Concentration and polymer weight fraction versus chain length at the end of the kinetic simulation (dashed lines) and from the corresponding thermodynamic model (black lines). The parameters used in the simulation are C_1,t=0 = 10 mM, K_equilibrium = 10^3 M^-1, k_f = 10^8 M^-1 s^-1, k_b = 10^5 s^-1, N = 500, tolerance = 10^-6.

Similar to the validation of thermodynamic models, the internal consistency of kinetic models can be validated by checking mass conservation, i.e. the total mass should not vary significantly. Furthermore, given a long enough simulation time, any kinetic simulation should converge to thermodynamic equilibrium. Thus, one should always check convergence by comparing the kinetic simulation to the corresponding mass balance model prediction (Fig. 2.2B).24 However, special care should be taken when comparing more complicated models, as kinetic traps can give the appearance that thermodynamic equilibrium has been reached.19
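The construction of the truncated ODE system and the two consistency checks described above can be sketched in Python as follows. This is an illustration only, not the MATLAB® code used for Fig. 2.2. A fixed-step fourth-order Runge-Kutta scheme is swapped in for the stiff solver ode15s, which is only workable because the parameters are deliberately mild: the rate constants, total concentration, maximum chain length, and time step are all assumptions chosen to keep the example small (K·C_total = 2, so that chains remain short). All function names are hypothetical.

```python
import math


def rates(c, kf, kb):
    """Right-hand side of the truncated isodesmic ODEs; c[0] is the monomer, c[n] is C_(n+1)."""
    size = len(c)
    # Net flux of each step C_1 + C_n <-> C_(n+1), for n = 1 .. N-1
    flux = [kf * c[0] * c[n] - kb * c[n + 1] for n in range(size - 1)]
    dc = [0.0] * size
    # Monomer: dimerization consumes two monomers, every elongation step one
    dc[0] = -2.0 * flux[0] - sum(flux[1:])
    for n in range(1, size):
        # Each chain is formed from the shorter chain and consumed by growing further;
        # the longest chain (n = N) cannot grow
        dc[n] = flux[n - 1] - (flux[n] if n < size - 1 else 0.0)
    return dc


def rk4_step(c, dt, kf, kb):
    """One classical Runge-Kutta step (a stand-in for a stiff solver, valid for small dt only)."""
    k1 = rates(c, kf, kb)
    k2 = rates([x + 0.5 * dt * k for x, k in zip(c, k1)], kf, kb)
    k3 = rates([x + 0.5 * dt * k for x, k in zip(c, k2)], kf, kb)
    k4 = rates([x + dt * k for x, k in zip(c, k3)], kf, kb)
    return [x + dt / 6.0 * (a + 2 * b + 2 * d + e)
            for x, a, b, d, e in zip(c, k1, k2, k3, k4)]


def total_mass(c):
    """Total monomer-equivalent concentration, sum of n * C_n."""
    return sum((n + 1) * cn for n, cn in enumerate(c))


KF, KB = 1e3, 1.0        # M^-1 s^-1 and s^-1 (assumed, K = kf/kb = 10^3 M^-1)
C_TOT = 2e-3             # M, all mass starts as monomer (assumed)
N_MAX = 25               # largest chain length in the model (assumed)
DT, T_END = 0.02, 300.0  # s (assumed)

c_end = [C_TOT] + [0.0] * (N_MAX - 1)
for _ in range(int(round(T_END / DT))):
    c_end = rk4_step(c_end, DT, KF, KB)

# Thermodynamic reference: for an isodesmic polymer, K*C_tot = x / (1 - x)^2 with x = K*C_1
a = (KF / KB) * C_TOT
x_eq = (2 * a + 1 - math.sqrt(4 * a + 1)) / (2 * a)

print("relative mass deviation:", abs(total_mass(c_end) - C_TOT) / C_TOT)
print("K*C_1 from kinetics:", (KF / KB) * c_end[0], " from thermodynamics:", x_eq)
```

Writing the right-hand side in terms of net fluxes per reaction step, rather than term by term, makes mass conservation hold by construction, so any remaining deviation in the total mass points to a numerical problem rather than a bookkeeping error.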

2.4 Rule‐based modeling

Rule-based modeling allows simulations of systems that are characterized by repeating reaction patterns and has been developed to handle the complexity of biochemical systems, in which thousands of different components can interact. Rule-based models are particularly adept at dealing with systems that display combinatorial complexity.31 Combinatorial complexity arises when systems comprise polyvalent molecules, giving rise to a potentially huge number of molecular species. For example, an enzyme with seven phosphorylation sites has 2^7 − 1 = 127 attainable phosphorylated species, none of which can be excluded a priori as unneeded, although only a handful of species might be active in the actual experimental system. Such a system is then represented using reaction rules which map out the ways in which the enzyme can be modified, typically followed by automatic generation of a reaction network with an ODE for each potential species.

Although several languages have been developed for describing reaction rules, we will focus here on BioNetGen as it is currently the most widely used.32 On its own, it transforms reaction rules into a reaction network of ODEs, which sometimes requires considerable computational power. Extensions to BioNetGen such as the network-free simulator (NFSim) can forgo network generation and apply reaction rules directly in stochastic simulations, although the solver employs a fixed time step which makes simulations of 'stiff' systems computationally expensive.33

Here, we demonstrate rule-based modeling by simulating the same system as in the previous section, an isodesmic supramolecular polymerization. In that way, the model of the previous section can be used to verify the validity of the reaction rules and to compare the computational costs. In BioNetGen, one starts a model by declaring which types of components comprise the system along with their binding sites. Monomers that undergo isodesmic polymerization have two binding sites, which can be 'bound' to other monomers (eq. 2.7).

mon b1~0~1,2~0~1 b  (2.7) where mon is the name of the component, b1 and b2 are the names for the binding sites, and ~ precedes the states that the binding site can be in (0 for not bound and 1 for bound). The reaction rules are subsequently defined in such a way that bonds between binding sites are numbered (eq. 2.8). Note that one cannot simply write that b1 and b2 of any molecule can form bonds, as that would be interpreted as allowing intramolecular reactions. Therefore, a separate dimerization reaction rule is employed, which can only be applied to monomers with two free binding sites. Additionally, b1 and b2 are prevented

25

Chapter 2 from self‐association, as that would lead to polymers in which some monomers are mirrored, complicating the elongation reaction. Dimerization: mon b1~0,2~0 b  mon b 1~0,2~0 b 

mon b1~0, b 2!1~1. mon b 1!1~1, b 2~0 kfb k (2.8) Elongation: mon b2!1~1 . mon b 1!1~1, b 2~0   mon  b 1~0, b 2~0  

monb2!1~ 1 . monb 1!1~ 1, b 2!2 ~ 1  . monb 1!2 ~ 1, b 2 ~ 0  kf kb where ! precedes the bond number. The reaction rules are followed by the forward and backward reaction constants. A comparison of the simulation results to those obtained in the previous section shows no differences, confirming the validity of the rule‐based model (Fig 2.3A). As expected, the total computational time of these reaction rules is higher than the computational time needed for the model in section 2.3, since the reaction network has to be generated before the ODEs can be solved (Fig. 2.3B). The computational time required for network generation increases non‐linearly as a function of N, as BioNetGen iteratively applies the reaction rules to species, having to check whether the generated species is equal to any species from earlier iterations. Therefore, BioNetGen is not particularly well‐equipped to handle systems comprising large aggregates. However, while rule‐based modeling requires a greater overall computational power, the construction of reaction rules and their implementation is easier as compared to the model in the previous section, demonstrating the potential of this method.

Figure 2.3 (A) Simulated speciation of the rule-based model based on eq. 2.8 (lines), the deterministic kinetic model based on eq. 2.5a and 2.6 (crosses), and the total fraction (dashed line) versus time for a supramolecular isodesmic polymerization. The model parameters are equal to those employed for the model of section 2.3, except for N = 50. (B) Network generation times versus the maximum polymer length for the rule-based isodesmic polymerization.
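The idea of generating a reaction network by iteratively applying a rule, including the check against previously generated species, can be illustrated with the seven-site enzyme mentioned at the start of this section. The following Python sketch is a toy illustration and does not reproduce BioNetGen or its algorithm. It assumes a single hypothetical rule (any free site can be phosphorylated) and represents each species as a tuple of site states.

```python
from collections import deque

N_SITES = 7  # phosphorylation sites of the example enzyme


def apply_rule(species):
    """Rule: any unmodified site (0) can become phosphorylated (1); yields every product."""
    for site, state in enumerate(species):
        if state == 0:
            yield species[:site] + (1,) + species[site + 1:]


def generate_network(seed):
    """Apply the rule to every species until no new species appear."""
    species = {seed}
    reactions = []
    queue = deque([seed])
    while queue:
        reactant = queue.popleft()
        for product in apply_rule(reactant):
            reactions.append((reactant, product))
            # Check whether the product was already generated in an earlier iteration
            if product not in species:
                species.add(product)
                queue.append(product)
    return species, reactions


species, reactions = generate_network((0,) * N_SITES)
phosphorylated = [s for s in species if any(s)]
print(len(phosphorylated), "phosphorylated species,", len(reactions), "reactions")
```

A single rule thus expands into the full set of 2^7 − 1 phosphorylated species. In this toy example the duplicate check is a cheap set lookup, whereas for real molecular graphs, such as the polymers of eq. 2.8, comparing species is considerably more involved.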

2.5 Non‐linear regression of experimental data

The most important criterion for the validity of a theoretical model is that its predictions agree with experimental data. Unfortunately, since commonly used experimental characterization techniques yield averages of the molecular ensemble in solution, assumptions are often required concerning the relation between the experimental response (CD, UV, 1H NMR, etc.) and a corresponding model property (helicity, assembly length, ring size, etc.). Therefore, verification of the underlying assumptions that couple the model to the experimental data is important. To further validate the model, one should ideally perform global regression (i.e. fitting all data simultaneously) on a dataset that is as comprehensive and diverse as possible by varying concentration, temperature, solvent composition, and so on.5

Sometimes all of the model parameters are known from reference experiments or are reported in literature. In that case, model predictions can be compared directly to experimental responses. However, frequently not all of the model parameters are known in advance. Then, estimates of the parameter values can be obtained from non-linear regression techniques. Contrary to linear regression, in which a closed-form expression of the parameter estimates is available, non-linear regression is usually performed by numerically calculating consecutive estimates, each iteration describing the data better. The fit quality is usually quantified by the residual sum of squares (RSS), i.e. the sum of the squared differences between data and model (eq. 2.9).

$$\mathrm{RSS} = \sum_{i=1}^{n} \left( y_i - f(x_i) \right)^2 \qquad (2.9)$$

where $y_i$ is the i‐th value of the experimental response to be predicted (e.g. UV intensity), $x_i$ is the i‐th value of the environmental explanatory variable (e.g. temperature), $f(x_i)$ is the model prediction of $y_i$, and n is the total number of data points. Since the error landscape that is being navigated is non‐linear, multiple minima may exist in which the optimization algorithm will be satisfied. We differentiate between local and global minima, the latter being the minimum of the entire landscape and the goal of regression, as it provides the best fit of the model to the data. To ensure that the global minimum is obtained, the algorithm should be run multiple times, each time starting from a different set of initial parameter estimates. The initial parameter sets should be distributed evenly over the relevant parameter space, which can be achieved using Latin hypercube sampling (implemented in the MATLAB® function lhsdesign).34 After all optimizations have finished, the final solutions can be compared and the best one selected, i.e. the solution with the lowest RSS.

The first step in judging the accuracy of the obtained global minimum, and thereby also the validity of the model, is always to look at the data, the corresponding model prediction and the residuals. If the model prediction shows large or systematic deviations from the data, either the global minimum was not found or the model is not refined enough to describe the underlying fundamental mechanisms. If the model describes the data well, the next step is to calculate the uncertainty of the parameter estimates. This gives an indication of how much a parameter value can change before the fit quality is significantly affected, and concurrently provides an estimate of the parameter's importance. The parameter uncertainty is calculated using the Jacobian matrix at the global minimum, i.e. a matrix containing partial derivatives of the error landscape with respect to all parameters.35 Alternatively, comparing the parameter values of optimizations that have a low RSS (typically within 5% of the best fit) can also give an indication of parameter uncertainty. It is advised to check both, as conflicting uncertainties are sometimes found. Additionally, it is advisable to visually inspect the parameter values of the optimizations with low RSS values for any correlations.

Lastly, a situation can occur in which two different models describe the data (almost) equally well. In general, either the simplest model should be picked as the ‘correct’ model, or more experiments need to be performed to exclude one of the models. Usually the more complex model fits the data better since it has a larger parameter space. If the two models under comparison are nested, i.e.
one model can be transformed into the other by parameter constraints, an F‐test can be performed to quantify whether the addition of parameters is justified.35,36 Non‐nested models should be compared using the Akaike information criterion.37,38

To illustrate the techniques outlined above, we perform non‐linear regression on a simulated data set generated by a thermodynamic model for cooperative supramolecular polymerization.25 In the cooperative model, we assume that the equilibrium constant of the initial dimerization (Knucleation) is lower than that of any of the subsequent monomer additions (Kelongation; eq. 2.10).

$$C_1 \overset{K_\text{nucleation}}{\rightleftharpoons} C_2 \overset{K_\text{elongation}}{\rightleftharpoons} \dots \overset{K_\text{elongation}}{\rightleftharpoons} C_N \qquad (2.10)$$

Non‐linear regression is performed twice using two different models: an isodesmic polymerization model and the cooperative polymerization model (Fig. 2.4). The two models are nested, since removing cooperativity (i.e. setting the cooperativity parameter to unity) reduces the cooperative model to the isodesmic one. To mimic experimental error, randomly distributed noise with an amplitude of 0.2 is added to the simulated data. To ensure that the global minimum is obtained, 1000 optimizations are performed, each starting from different initial parameter values (Fig. 2.4B).
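The multi‐start strategy can be sketched in a few lines of code. The Python sketch below is illustrative only and is not the MATLAB® code used for this chapter: the two‐parameter exponential `toy_model`, the parameter bounds, the noise‐free data and the simple `compass_search` local optimizer are hypothetical stand‐ins, chosen so that the example needs nothing beyond the standard library. Starting points are drawn by Latin hypercube sampling, each one is optimized locally, and the solution with the lowest RSS (eq. 2.9) is kept.

```python
import math
import random


def toy_model(x, params):
    """Hypothetical two-parameter model, y = a * exp(-b * x)."""
    a, b = params
    return a * math.exp(-b * x)


def rss(params, xs, ys):
    """Residual sum of squares between data and model (eq. 2.9)."""
    return sum((y - toy_model(x, params)) ** 2 for x, y in zip(xs, ys))


def latin_hypercube(n_samples, bounds, rng):
    """Split every parameter range into n_samples equal strata and use
    each stratum exactly once, in an independently shuffled order."""
    columns = []
    for low, high in bounds:
        strata = list(range(n_samples))
        rng.shuffle(strata)
        width = (high - low) / n_samples
        columns.append([low + (s + rng.random()) * width for s in strata])
    return [list(point) for point in zip(*columns)]


def compass_search(objective, start, step=0.5, tol=1e-8, max_iter=10000):
    """Simple derivative-free local optimizer (stand-in for a real solver)."""
    best, best_val = list(start), objective(start)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i in range(len(best)):
            for delta in (step, -step):
                trial = list(best)
                trial[i] += delta
                val = objective(trial)
                if val < best_val:
                    best, best_val, improved = trial, val, True
        if not improved:
            step /= 2  # refine the step once no neighbor is better
    return best, best_val


# Noise-free synthetic data generated with a = 2.0 and b = 0.7 (assumed values).
xs = [0.5 * i for i in range(10)]
ys = [toy_model(x, (2.0, 0.7)) for x in xs]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
starts = latin_hypercube(20, [(0.0, 5.0), (0.0, 3.0)], rng)
fits = [compass_search(lambda p: rss(p, xs, ys), s) for s in starts]
best_params, best_rss = min(fits, key=lambda fit: fit[1])
print(best_params, best_rss)
```

Keeping all entries of `fits`, rather than only the best one, makes it possible to inspect afterwards how many starting points ended in the same minimum.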


Figure 2.4 (A) Simulated experimental data with random noise (crosses), the best fits using an isodesmic (line) and a cooperative (dashed line) supramolecular polymerization model, and their corresponding residuals (markers). A nucleus size of 2 was used in both the simulated data and the model used for regression. (B) Distribution of initial parameter values used in the regression of the cooperative model. (C) Distribution of optimized parameter log10(values) after optimization with a squared 2‐norm within 5% (red dots) and above 5% (blue dots) of the best fit.

The isodesmic polymerization model shows some deviations from the simulated data, while the cooperative model fits the data better. However, it is difficult to decide visually whether the addition of the extra parameter in the cooperative model is justified. Since the models are nested we can perform an F‐test, which shows that it is highly likely that the cooperative model is the correct model to describe the data (P = 8.3 × 10⁻⁷).35 More specifically, P is the probability of obtaining an improvement in fit at least this large if the isodesmic model were the correct one.39 The values of the optimized parameters show that the two parameters of the cooperative model are correlated, although they are spread over a very small range (Fig. 2.4C). Additionally, no local minima were encountered during the optimizations, as evidenced by the fact that all optimizations reached the same final parameter values (for several examples with local minima, see chapter 4).
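The quantities behind such a model comparison are simple to compute once both fits are available. The sketch below, a minimal illustration with hypothetical RSS values rather than the values of Fig. 2.4, evaluates the F ratio for two nested least‐squares fits and, for comparison, the least‐squares form of the Akaike information criterion. The function names and the numbers are assumptions made for the example.

```python
import math


def f_statistic(rss_simple, p_simple, rss_complex, p_complex, n_points):
    """F ratio for two nested least-squares fits.

    The numerator is the drop in RSS per added parameter, the denominator
    the RSS of the complex model per remaining degree of freedom.
    """
    numerator = (rss_simple - rss_complex) / (p_complex - p_simple)
    denominator = rss_complex / (n_points - p_complex)
    return numerator / denominator


def aic_least_squares(rss, n_params, n_points):
    """AIC for a least-squares fit with Gaussian errors, up to an additive
    constant that cancels when models are compared on the same data."""
    return n_points * math.log(rss / n_points) + 2 * n_params


# Hypothetical numbers: 50 data points, a one- and a two-parameter fit.
n = 50
f_value = f_statistic(rss_simple=12.0, p_simple=1,
                      rss_complex=4.0, p_complex=2, n_points=n)
delta_aic = aic_least_squares(12.0, 1, n) - aic_least_squares(4.0, 2, n)
print(f_value, delta_aic)
```

Converting the F ratio into a P value requires the F distribution with (p_complex − p_simple) and (n − p_complex) degrees of freedom, for instance via `scipy.stats.f.sf` if SciPy is available. A positive `delta_aic` favors the more complex model.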


2.6 Decreasing computational time

When dealing with large datasets or when performing non‐linear regression, the computational time can increase to impractical timescales. It can be decreased by several strategies, which are outlined here specifically for MATLAB®.

Jacobian matrix

Many of the optimization functions in MATLAB® accept a Jacobian matrix as additional input, i.e. a matrix containing all partial derivatives of the objective function (the function to be optimized) with respect to the parameters. If this matrix can be calculated analytically or symbolically (using Mathematica or the MATLAB® Symbolic Math Toolbox), supplying it to the optimization function can considerably decrease the required computational time.
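Before an analytic Jacobian is handed to an optimizer, it is worth checking it against a finite‐difference approximation. The following Python sketch does this for a hypothetical two‐parameter exponential model; the model, the parameter values and the function names are assumptions made for illustration and do not correspond to any model in this thesis. Note that the sketch differentiates the model prediction; if the optimizer works with residuals defined as data minus model, the sign of every entry flips.

```python
import math


def model(x, params):
    """Hypothetical model y = a * exp(-b * x), used only for illustration."""
    a, b = params
    return a * math.exp(-b * x)


def analytic_jacobian(xs, params):
    """Rows are data points, columns are dy/da and dy/db."""
    a, b = params
    return [[math.exp(-b * x), -a * x * math.exp(-b * x)] for x in xs]


def numeric_jacobian(xs, params, h=1e-6):
    """Central finite differences: two extra model evaluations per parameter
    and data point, which is the cost an analytic Jacobian avoids."""
    rows = []
    for x in xs:
        row = []
        for j in range(len(params)):
            up, down = list(params), list(params)
            up[j] += h
            down[j] -= h
            row.append((model(x, up) - model(x, down)) / (2 * h))
        rows.append(row)
    return rows


xs = [0.0, 1.0, 2.0, 3.0]
params = [2.0, 0.7]
exact = analytic_jacobian(xs, params)
approx = numeric_jacobian(xs, params)
max_diff = max(abs(e - a)
               for row_e, row_a in zip(exact, approx)
               for e, a in zip(row_e, row_a))
print(max_diff)
```

A large value of `max_diff` usually points to an algebra or sign error in the analytic derivative.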

Converting loops to matrix operations

MATLAB® is heavily optimized to perform matrix operations quickly and efficiently. Therefore, replacing a for or while loop with a matrix operation usually results in a significant speed gain. Although matrix operations can make the code less readable, the speed gain can be large enough to justify the loss in readability.

Parallelization

A normal instance of MATLAB® only uses a single processor core, while most modern computers are equipped with multi‐core processors. To enable the use of the other cores, MATLAB® offers the Parallel Computing Toolbox. Should that not be enough, the computational power can be increased further by employing a computer cluster, i.e. a group of computers that work together in such a way that they can be viewed as a single high‐performance system.

2.7 Conclusion

Several methods are presented for the mathematical description of supramolecular systems. The strengths and limitations of each method are highlighted, focusing on their practical application using relatively simple models and the importance of validating internal model consistency. Subsequently, the importance of model validation using experimental characterization is discussed, along with common analysis techniques and pitfalls in non‐linear regression of experimental data. We expect that this chapter gives enough starting points for any chemist inexperienced in mathematical analysis, and gives sufficient background information to thoroughly understand the techniques used in the remaining chapters.


2.8 Experimental section

The scripts used to calculate the UPy‐NaPy equilibria, the kinetics of isodesmic supramolecular polymerization, the corresponding rule‐based model, and the non‐linear regression analysis are available online as open source files at https://osf.io/kya4e/. The rule‐based model simulations were performed in a virtual machine (Oracle VirtualBox 4.3.12 r93733, Oracle Corporation) running Ubuntu (12.04 LTS, 64‐bit), MATLAB® (2012b, 8.0.0.783, Mathworks®) and BioNetGen (version 2.2.2). All other calculations were performed using MATLAB® (R2016a, version 9.0.0.341360, Mathworks®).

2.9 References

(1) Nitschke, J. R. Nature 2009, 462 (7274), 736.
(2) Lehn, J.‐M. Angew. Chem. Int. Ed. 2013, 52 (10), 2836.
(3) Ashkenasy, G.; Hermans, T. M.; Otto, S.; Taylor, A. F. Chem. Soc. Rev. 2017.
(4) De Greef, T. F. A.; Smulders, M. M. J.; Wolffs, M.; Schenning, A. P. H. J.; Sijbesma, R. P.; Meijer, E. W. Chem. Rev. 2009, 109 (11), 5687.
(5) Thordarson, P. Chem. Soc. Rev. 2011, 40 (3), 1305.
(6) Ercolani, G.; Schiaffino, L. Angew. Chem. Int. Ed. 2011, 50 (8), 1762.
(7) Carothers, J. M.; Goler, J. A.; Juminaga, D.; Keasling, J. D. Science 2011, 334 (6063), 1716.
(8) Korevaar, P. A.; Grenier, C.; Markvoort, A. J.; Schenning, A. P. H. J.; Greef, T. F. A. de; Meijer, E. W. Proc. Natl. Acad. Sci. 2013, 110 (43), 17205.
(9) Parr, R. G.; Weitao, Y. Density‐Functional Theory of Atoms and Molecules; Oxford University Press, 1994.
(10) Burke, K.; Werschnik, J.; Gross, E. K. U. J. Chem. Phys. 2005, 123 (6), 062206.
(11) Garzoni, M.; Baker, M. B.; Leenders, C. M. A.; Voets, I. K.; Albertazzi, L.; Palmans, A. R. A.; Meijer, E. W.; Pavan, G. M. J. Am. Chem. Soc. 2016, 138 (42), 13985.
(12) Brini, E.; Algaer, E. A.; Ganguly, P.; Li, C.; Rodríguez‐Ropero, F.; Vegt, N. F. A. van der. Soft Matter 2013, 9 (7), 2108.
(13) Bochicchio, D.; Pavan, G. M. ACS Nano 2017, 11 (1), 1000.
(14) Ercolani, G.; Piguet, C.; Borkovec, M.; Hamacek, J. J. Phys. Chem. B 2007, 111 (42), 12195.
(15) Cantekin, S.; ten Eikelder, H. M. M.; Markvoort, A. J.; Veld, M. A. J.; Korevaar, P. A.; Green, M. M.; Palmans, A. R. A.; Meijer, E. W. Angew. Chem. Int. Ed. 2012, 51 (26), 6426.
(16) Nakano, Y.; Markvoort, A. J.; Cantekin, S.; Filot, I. A. W.; ten Eikelder, H. M. M.; Meijer, E. W.; Palmans, A. R. A. J. Am. Chem. Soc. 2013, 135 (44), 16497.
(17) ten Eikelder, H. M. M.; Markvoort, A. J.; de Greef, T. F. A.; Hilbers, P. A. J. J. Phys. Chem. B 2012, 116 (17), 5291.
(18) Das, A.; Vantomme, G.; Markvoort, A. J.; ten Eikelder, H. M. M.; Garcia‐Iglesias, M.; Palmans, A. R. A.; Meijer, E. W. J. Am. Chem. Soc. 2017, 139 (20), 7036.
(19) Korevaar, P. A.; George, S. J.; Markvoort, A. J.; Smulders, M. M. J.; Hilbers, P. A. J.; Schenning, A. P. H. J.; De Greef, T. F. A.; Meijer, E. W. Nature 2012, 481 (7382), 492.
(20) van der Zwaag, D.; Pieters, P. A.; Korevaar, P. A.; Markvoort, A. J.; Spiering, A. J. H.; de Greef, T. F. A.; Meijer, E. W. J. Am. Chem. Soc. 2015, 137 (39), 12677.
(21) Markvoort, A. J.; Eikelder, H. M. M. ten; Hilbers, P. A. J.; de Greef, T. F. A. ACS Cent. Sci. 2016, 2 (4), 232.
(22) Douglass, E. F.; , C. J.; Sparer, G.; Shapiro, H.; Spiegel, D. A. J. Am. Chem. Soc. 2013, 135 (16), 6092.
(23) Benson, S. W. J. Am. Chem. Soc. 1958, 80 (19), 5151.
(24) Martin, R. B. Chem. Rev. 1996, 96 (8), 3043.
(25) Zhao, D.; Moore, J. S. Org. Biomol. Chem. 2003, 1 (20), 3471.
(26) Gillespie, D. T. J. Phys. Chem. 1977, 81 (25), 2340.
(27) Waage, P.; Guldberg, C. M. J. Chem. Educ. 1986, 63 (12), 1044.
(28) Érdi, P.; Tóth, J. Mathematical Models of Chemical Reactions: Theory and Applications of Deterministic and Stochastic Models; Manchester University Press, 1989.
(29) The Original Michaelis Constant: Translation of the 1913 Michaelis–Menten Paper ‐ Biochemistry (ACS Publications) http://pubs.acs.org/doi/suppl/10.1021/bi201284u (accessed Jul 13, 2017).
(30) Vera, J.; Balsa‐Canto, E.; Wellstead, P.; Banga, J. R.; Wolkenhauer, O. Cell. Signal. 2007, 19 (7), 1531.
(31) Faeder, J. R.; Blinov, M. L.; Goldstein, B.; Hlavacek, W. S. Complexity 2005, 10 (4), 22.
(32) Faeder, J. R.; Blinov, M. L.; Hlavacek, W. S. Methods Mol. Biol. Clifton NJ 2009, 500, 113.
(33) Sneddon, M. W.; Faeder, J. R.; Emonet, T. Nat. Methods 2011, 8 (2), 177.
(34) McKay, M. D.; Beckman, R. J.; Conover, W. J. Technometrics 1979, 21 (2), 239.
(35) Press, W. H.; Flannery, B. P.; Teukolsky, S. A.; Vetterling, W. T. Numerical Recipes in C: The Art of Scientific Computing, 2nd ed.; Cambridge University Press: Cambridge; New York, 1992.
(36) Field, A. Discovering Statistics Using IBM SPSS Statistics; SAGE, 2013.
(37) Bozdogan, H. Psychometrika 1987, 52 (3), 345.
(38) Akaike, H. In International Encyclopedia of Statistical Science; Lovric, M., Ed.; Springer Berlin Heidelberg, 2011; pp 25–25.
(39) Wasserstein, R. L.; Lazar, N. A. Am. Stat. 2016, 70 (2), 129.


3

Supramolecular buffering by ring‐chain competition

Abstract

Recently, our group has reported an organocatalytic system in which buffering of the molecular catalyst by supramolecular interactions results in a robust system displaying concentration‐independent catalytic activity. Here, we demonstrate the design principles of the supramolecular buffering by ring‐chain competition using a combined experimental and theoretical approach. Our analysis shows that supramolecular buffering of a molecule is caused by its participation as a chain stopper in supramolecular ring‐chain equilibria and we reveal here the influence of various thermodynamic parameters. Model predictions based on independently measured equilibrium constants corroborate experimental data of several molecular systems in which buffering occurs via competition between cyclization, growth of linear chains and end‐capping by the chain‐stopper. Our analysis reveals that the effective molarity is the critical parameter in optimizing the broadness of the concentration regime in which supramolecular ring‐chain buffering occurs as well as the maximum concentration of the buffered molecule. To conclude, a side‐by‐side comparison of supramolecular ring‐chain buffering, pH buffering and molecular titration is presented.

This chapter has been published as:

T.F.E. Paffen, G. Ercolani, T.F.A. de Greef, E.W. Meijer, Supramolecular Buffering by Ring‐Chain Competition, J. Am. Chem. Soc., 2015, 137, 1501.

Chapter 3

3.1 Introduction

Recent advances in functional molecular systems focus on increasing the number of components to, for example, design ‘smart’ materials that can integrate multiple inputs.1–3 This increase correlates with a transition in chemistry, which is expanding from the synthesis of individual molecules to the construction of chemical networks that are better equipped to adapt to a multitude of environmental cues.4–8 To match the level of complexity and responsiveness of biochemical pathways, artificial chemical networks should comprise tens of different types of molecules, each with its own well‐designed function.9 However, while individual molecules are at a point where their interactions can be rationally designed, only recently have chemists started to design large networks of interacting molecules.10–15 Using a top‐down approach, systems biologists are working to deduce the underlying mechanisms of a variety of cellular functions, while chemists are expanding their knowledge of molecular systems using a bottom‐up approach.16,17 Recent advances in the field of systems chemistry include, but are not limited to: molecular recognition using dynamic molecular networks,14,18 self‐replicating molecules by templating,19–23 self‐replicating aggregates24 and the construction of logic gates using self‐replicators.25,26 These fascinating advances are paving the way to systematically recreate specific cellular functions and to obtain a molecular‐level understanding of the design principles governing those functions.

Buffering is a well‐known term in chemistry and the concept is used in a multitude of applications such as biochemical synthesis and assays, organic synthesis,27 fermentation processes,28 and dye processing.29 However, it is striking that, in chemistry, the scope of the buffered molecule is limited mainly to protons.
Indeed, the very definition of a buffer within a chemical context is “any solution that maintains an approximately constant pH despite small additions of acid or base”.30 In contrast, in natural systems the scope of buffering is much broader, as biomolecular pathways regulate component concentrations so that important processes become robust to concentration fluctuations. In those pathways, regulation is achieved by various mechanisms such as active negative feedback loops, passive autoinhibition effects or molecular titration.31–33 These regulatory mechanisms can display behavior similar to that of ‘classical’ pH buffers. For example, molecular titration in gene regulatory circuits leads to ultrasensitive thresholds at the equivalence point, which is similar to the sharp increase in pH at the equivalence point of a titration curve. However, a challenge remains for chemists to broaden their definition of buffering and to recreate component regulation in synthetic systems.

Component buffering has also been observed in supramolecular systems, as exemplified by the buffering of amphiphilic molecules that can form micelles (critical micelle concentration) and by buffering in cooperative supramolecular polymerizations, where the free monomer concentration becomes independent of the total concentration at high concentrations.34 However, to date no applications employ the molecular buffering observed in these types of systems, owing to the inherent difficulty of designing a monomer that performs a function in its monomeric form but cannot perform the same function in the aggregated state. Examples do exist where a function is performed only by the aggregated state or where only the monomer, and not the aggregate, is racemizable,35–37 but component buffering has not been investigated in these systems. Furthermore, designing component buffering in cooperative supramolecular polymerizations is not straightforward because it requires a high degree of control over both the nucleation and the elongation phase.

Recently, our group reported on a system of two molecules that showed supramolecular buffering of one of the components.38,39 The system consists of ditopic 2‐ureido‐pyrimidinone (UPy) molecule 1, which can form both rings and chains, combined with monotopic 2,7‐diamido‐1,8‐naphthyridine (NaPy) molecule 2 that acts as a chain stopper (Figure 3.1A). It was found that NaPy 2 acts as an organocatalyst in the Michael reaction of trans‐β‐nitrostyrene and 2,4‐pentanedione. Subsequent investigations have shown that the catalytic activity of NaPy 2 is more complex than previously reported and that it most probably acts as a buffered phase transfer catalyst.40 Here, we focus on the observation that the concentration of free NaPy (CNaPy‐free) can be buffered over a large concentration range when the concentrations of both NaPy (CNaPy‐total) and ditopic UPy (Cditopic UPy) are increased simultaneously in a 1:1 ratio. The molecular mechanism by which component buffering occurs is described by a two‐component equilibrium model that describes competition between cyclization and chain growth of a ditopic molecule and end‐capping by a monotopic component. Furthermore, we show that supramolecular ring‐chain buffering is not limited to the system of ditopic UPy 1 and NaPy 2, but that this concept can be applied using various non‐covalent binding groups, as long as the molecular topology and binding constants are designed correctly.

3.2 Results and discussion

3.2.1 Model outline

To gain a more intuitive understanding of supramolecular ring‐chain buffering, a thermodynamic equilibrium model is constructed. The basis for the model is the Jacobson‐Stockmayer theory describing ring‐chain equilibria of ditopic molecules.41 A key parameter in this model is the effective molarity (EMi), which is the experimentally measured tendency towards ring formation of a chain consisting of i ditopic molecules. The EM is equivalent to the concept of effective concentration, which is the theoretical local concentration of associating groups around the ends of an end‐tethered ditopic molecule, assuming the linker follows Gaussian chain statistics.42 In such a case only strainless cycles are formed and the EM and the number of ditopic residues in a chain, i, are related in the following way:

$$EM_i = B \, i^{-5/2} \qquad (3.1)$$

where B is equal to the effective molarity of the strainless monomeric ring. The term $i^{-5/2}$ may be regarded as the product of $i^{-3/2}$ and $i^{-1}$, where the former relates to the probability that a Gaussian chain of i repeating units has its ends coincide and the latter to the number of equivalent bonds available for the ring‐opening of a cyclic i‐mer. The Jacobson‐Stockmayer model was later expanded to include finite intermolecular equilibrium constants (Kinter) and a description of the cycle distribution under dilute conditions.43

In applying the theory to describe experimental systems, the assumption of strainless rings is not always met. Most often this is the case when the number of linker atoms between the associating groups is less than 30 and monomeric rings are strained.44 In such a case, it is still possible to describe the distribution of rings and chains provided that the EM values of the strained rings are known. The EM values of the strainless rings can then be computed using equation (3.1), in which B corresponds to the effective molarity that the monomeric ring would have if it were strainless.43

In the two‐component model of supramolecular ring‐chain buffering, end‐capping reactions of chains are incorporated (Figure 3.1B; see section 3.4.3 for full model details). The input parameters of the model are the UPy‐UPy and UPy‐NaPy binding equilibrium constants (KUPy‐UPy and KUPy‐NaPy, respectively), the effective molarity of the monomeric ring

(EM1) and the ratio of NaPy to ditopic UPy (f). Gratifyingly, when the model is used to calculate the free NaPy concentration (CNaPy free) for various total concentrations of equimolar mixtures of ditopic UPy (Cdt‐UPy total) and NaPy (CNaPy total), a buffering plateau is observed (Figure 3.1C, region II). Based on these simulations, three distinct concentration regions are observed. At low concentrations, CNaPy free is equal to CNaPy total and the system consists of free NaPy and ditopic UPy rings (I). Upon increasing the concentrations of both components the buffering region is observed, where CNaPy free is almost constant (II). As the fraction of bound NaPy starts to increase, the composition of the system becomes a mixture of rings and end‐capped chains. At the highest concentrations CNaPy free increases linearly again and the system comprises mainly end‐capped chains (III). Indeed, in region III the CNaPy free curves of the systems with and without rings overlap (black and blue lines, respectively). Interestingly, the transition from region II to III occurs at the critical concentration (Ccr), at which point any subsequent addition of ditopic molecules results in formation of linear chains while the concentration of rings stays constant.43 Thus, in effect, component buffering is caused by competition between ring formation by UPy‐UPy association and end‐capping by UPy‐NaPy binding. This competition also explains why the onset of the buffering plateau, and concurrently of UPy‐NaPy binding, occurs at a much higher concentration than the binding of monotopic UPy and NaPy (KUPy‐NaPy = 5 × 10⁶ M⁻¹).45 In other words, in region I the local concentration of UPy groups is simply too high to allow UPy‐NaPy binding.

Figure 3.1 (A) Molecular structures of ditopic UPys 1a‐c and NaPy 2. (B) Schematic overview of the supramolecular ring‐chain buffering system with corresponding equilibrium constants. The employed statistical factors are a result of molecular symmetry and are directly related to the number of ways in which both reactants and products can be formed. (C) Predicted buffering of free NaPy based on the thermodynamic model for equimolar mixtures of ditopic UPy and NaPy both with and without the possibility of ring formation (KUPy‐UPy = 6 × 10⁷ M⁻¹, KUPy‐NaPy = 5 × 10⁶ M⁻¹, EM = 100 mM, f = 1).

To verify whether the ring‐chain supramolecular polymerization of ditopic UPy molecules 1a‐c can be described by a single effective molarity, concentration‐dependent 1H NMR measurements were performed. To this end, we probed the 1H NMR signal of the urethane proton, which shows a concentration‐dependent splitting into signals that are assigned to monomeric cycles, oligomeric cycles or linear chains (Figure 3.2; for assignment details see section 3.4.4).46 With the aggregation states assigned, the concentration‐dependent fractions were analyzed using the ring‐chain theory, employing the reported value of KUPy‐UPy in CHCl3 (KUPy‐UPy = 6 × 10⁷ M⁻¹)47 as a fixed constant. We performed non‐linear least‐squares analysis using two versions of the ring‐chain equilibrium model, which differ in the number of effective molarities used during the fitting routine. In the first model, all cycles are considered strainless and thus a single effective molarity is used. In the second model we employ two effective molarities, corresponding to the situation in which monomeric cycles behave as strained rings and dimeric and oligomeric rings are considered strainless.
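The difference between the two model versions lies only in how the list of EM values is built from equation (3.1). The following Python sketch illustrates this under stated assumptions: the function name, its arguments and the numerical values are hypothetical, and the sketch covers only the EM assignment, not the full ring‐chain equilibrium model of section 3.4.3. In the two‐EM version, B is back‐calculated from the strainless dimeric ring, so that it represents the effective molarity the monomeric ring would have if it were strainless.

```python
def effective_molarities(n_max, em1, em2=None):
    """Return [EM_1, ..., EM_n_max] in the same unit as em1.

    em2 is None: all rings are strainless, so B = EM_1 in eq. (3.1).
    em2 given:   EM_1 is an independent parameter and B follows from the
                 strainless dimeric ring, EM_2 = B * 2**(-5/2).
    """
    if em2 is None:
        b = em1
        return [b * i ** -2.5 for i in range(1, n_max + 1)]
    b = em2 * 2 ** 2.5
    return [em1] + [b * i ** -2.5 for i in range(2, n_max + 1)]


# Hypothetical values in mM, chosen only to show the two variants.
single_em = effective_molarities(4, em1=100.0)
two_ems = effective_molarities(4, em1=50.0, em2=10.0)
print(single_em)
print(two_ems)
```

In both variants the EM values of rings larger than the dimer decay as $i^{-5/2}$; only the monomeric ring is decoupled in the second one.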

Figure 3.2 Speciation plot of 1H NMR data (markers) of ditopic UPys 1a (A), 1b (B) and 1c (C) and fits (lines) based on the thermodynamic ring‐chain model using 1 (gray) and 2 (black) EMs. Residuals of both fits are shown.

In the assignment of the urethane peaks of ditopic UPy 1a it is not possible to distinguish chains and oligomeric cycles (Figure 3.2A). This indicates that the chemical environments of the urethane protons in the oligomeric cycle and in the chain conformation are similar. The two models describe the experimental data of ditopic UPy 1a equally well, suggesting that rings of ditopic UPy 1a of any size are strainless. Interestingly, for ditopic UPys 1b‐c, the second model fits the data considerably better than the first model (black versus gray solid lines in Figure 3.2B‐C, respectively). Here it is observed that the optimized value of EM1 obtained by non‐linear least‐squares analysis is higher than expected based on equation (3.1) and the optimized value of the effective molarity of the strainless dimeric cycle, EM2. This higher value of EM1 with respect to EM2 suggests that the monomeric ring is stabilized instead of being destabilized by ring strain. Indeed, this increase in stability is in line with a measured intramolecular hydrogen bond between the urethane proton and the carbonyl of the pyrimidinone in monomeric rings formed by 1b.46,48 Although the model employing two EM values gives a slightly better description of the ring‐chain competition of ditopic UPys 1b and 1c, the generality of supramolecular ring‐chain buffering, which is based on the use of a single EM value, is not affected. Upon addition of NaPy, oligomeric rings are formed in very low amounts, which reduces the sensitivity of the buffering towards changes in EM2 (for details see section 3.4.5). Therefore, in the remainder of this work, we will demonstrate the principle of supramolecular ring‐chain buffering by assuming that the ring‐chain equilibrium of ditopic UPys can be described by a single EM value.

To evaluate the effect of changing the magnitude of EM1 on the buffering behavior, several buffering curves were generated in which EM1 was varied and all other parameters were kept constant (Figure 3.3A). In line with the qualitative argument that buffering is predominantly caused by competition between ring formation and NaPy binding, both the concentration and the broadness of the buffering plateau increase with increasing EM1, i.e., the onset of the buffering plateau occurs at higher total concentrations since rings are more stable while KUPy‐NaPy remains constant. Indeed, the opposite effect is observed when increasing the strength of UPy‐NaPy binding by increasing KUPy‐NaPy, as the buffering onset occurs at lower total concentrations (Figure 3.3D). However, as a result of increasing KUPy‐NaPy the CNaPy free in region III also decreases, revealing the complexity of the supramolecular ring‐chain buffering. The parameter KUPy‐UPy also affects the onset of buffering, as it is not just the magnitude of EM1 that determines ring formation, but the product of EM1 and KUPy‐UPy (Figure 3.3C).49,50 While varying EM1 does not change CNaPy free in region III, varying KUPy‐UPy does, as the ratio of KUPy‐UPy to KUPy‐NaPy determines CNaPy free when the system consists exclusively of chains. Varying the ratio of NaPy to ditopic UPy molecules (f) has the expected effect that the average CNaPy free in the plateau decreases with decreasing f (Figure 3.3B). Interestingly, when f has a value around two, the slope of CNaPy free in region III is extremely sensitive to small changes in f. When f is increased above two, the buffering vanishes completely because the number of NaPy end‐groups exceeds the number of UPy end‐groups. This results in a system where end‐capping is too dominant and only end‐capped monomeric ditopic UPy molecules are present. Thus, buffering over an appreciably wide concentration range is observed only for f below two.


Figure 3.3 Predicted free NaPy concentration versus total concentration using different values for (A) EM1, (B) f, (C) KUPy‐UPy and (D) KUPy‐NaPy. The exact values used are shown next to the corresponding colorbar. Note that in (B) the values used for f are not monotonically increasing to demonstrate the sensitivity of the buffering at values of f close to 2. The round marks superimposed on the curves denote the buffering plateau, defined as the region where the moving three‐point average of CNaPy free varies less than 10%. Insets show the corresponding broadness of the plateau (in decades) versus the parameter that is varied. If parameters are not varied, the following values are used: KUPy‐UPy = 6 × 10⁷ M⁻¹, KUPy‐NaPy = 5 × 10⁶ M⁻¹, EM1 = 10 mM, f = 1.

3.2.2 Model validation

The validity of the model describing supramolecular ring‐chain buffering is confirmed by comparing model‐predicted buffering curves with 1H NMR measurements on several different combinations of ditopic UPy and monotopic NaPy molecules (Figure 3.4). The molecules were chosen for their known variations in equilibrium binding constants or EMs, in most cases varying only one parameter while the others are kept constant. The equilibration timescale of freshly prepared or diluted mixtures was in the order of seconds, thus supramolecular buffering can be used for most, if not all, practical applications. Since CNaPy free cannot be measured directly using 1H NMR, it is calculated by first determining the concentration of UPy groups in UPy‐NaPy contacts (CUPy•NaPy) via the following equation:

$$C_{\text{UPy}\bullet\text{NaPy}} = 2\, C_{\text{Ditopic UPy total}} \, \frac{I_{\text{UPy-NaPy}}}{I_{\text{UPy-NaPy}} + I_{\text{UPy-UPy}}} \qquad (3.2)$$

where IUPy‐NaPy and IUPy‐UPy are the integrals of the signals corresponding to the N‐H protons in the hydrogen bonding array of UPy‐NaPy and UPy‐UPy contacts, respectively. Subsequently, the free NaPy concentration is calculated via the mass balance of NaPy:

$$C_{\text{NaPy free}} = C_{\text{NaPy total}} - C_{\text{UPy}\bullet\text{NaPy}} \qquad (3.3)$$

where CNaPy total is the total concentration of NaPy present in solution. This method of calculating CNaPy free has the drawback of becoming increasingly uncertain when most of the NaPy molecules are bound. This uncertainty stems from the fact that as the fraction of free NaPy approaches zero, CNaPy total and CUPy‐NaPy approach values similar to each other.

Small variations in the calculation of CNaPy total and CUPy‐NaPy due to experimental errors are then sufficient to create large uncertainties in CNaPy free. In our experiments, there are several sources of experimental error, such as the uncertainty in determining the 1H NMR integral and weighing and volumetric errors. To quantify the effect of those errors on the calculated CNaPy free, we employ relative standard deviations based on instrumental specifications and reported accuracies of NMR integrals.51 As such, the plotted error bars are not based on multiple measurements of samples at the same total concentration, but are calculated based on a single measurement. The 95% confidence interval on CNaPy free is calculated by standard error propagation techniques (see section 3.4.6). Gratifyingly, all experimentally determined concentrations of free NaPy correspond well with model predictions based on reported binding constants and measured EMs. Interestingly, almost no difference in buffering is observed for ditopic UPy molecules 1a and 1c in mixtures with NaPy 2 (Figure 3.4A). Even though their EMs differ by a factor of 2, the predicted buffering regimes almost overlap. At high total concentrations, the experimental CNaPy free values become negative and the error intervals become increasingly large because the fraction of free NaPy becomes exceedingly small. Indeed, when the ratio of NaPy to ditopic UPy is increased to f = 2.5, so that the fraction of free NaPy does not approach zero at high concentrations, the uncertainty in CNaPy free remains relatively small (Figure 3.4B). Contrary to our earlier report, the buffering curve is not measured at f = 2, since for this value the calculated concentrations of CNaPy free are extremely sensitive to small errors (Figure 3.3B). Small amounts of impurities or weighing errors will then lead to vastly different buffering curves.
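The sensitivity described above can be illustrated with a first‐order (Gaussian) error propagation through eqs (3.2) and (3.3). The sketch below is an illustration under stated assumptions, not the procedure of section 3.4.6: it treats the mass, volume and integral errors as independent, applies the relative standard deviations quoted in the caption of Figure 3.4 (1, 1 and 5%) to each concentration and each integral, and uses 1.96σ for the 95% interval. All function names are hypothetical.

```python
import math

REL_SD_MASS, REL_SD_VOLUME, REL_SD_INTEGRAL = 0.01, 0.01, 0.05


def free_napy_with_ci(i_het, i_homo, c_ditopic_total, c_napy_total):
    """Return (C_NaPy free, 95% half-width) by first-order propagation."""
    r = i_het / (i_het + i_homo)  # fraction of UPy groups in UPy-NaPy contacts
    c_free = c_napy_total - 2.0 * c_ditopic_total * r

    # Assumption: each total concentration carries one weighing and one
    # volumetric error, combined in quadrature.
    rel_sd_conc = math.hypot(REL_SD_MASS, REL_SD_VOLUME)
    sd_napy = c_napy_total * rel_sd_conc
    sd_ditopic = c_ditopic_total * rel_sd_conc
    # dr/dI_het = I_homo/S^2 and dr/dI_homo = -I_het/S^2 (S = I_het + I_homo)
    # give sd_r = r (1 - r) sqrt(2) * rel_sd for two independent integrals.
    sd_r = r * (1.0 - r) * math.sqrt(2.0) * REL_SD_INTEGRAL

    variance = (sd_napy ** 2
                + (2.0 * r * sd_ditopic) ** 2
                + (2.0 * c_ditopic_total * sd_r) ** 2)
    return c_free, 1.96 * math.sqrt(variance)


# Hypothetical samples at f = 1 (1 mM each): little versus nearly all NaPy bound.
for i_het in (0.2, 0.98):
    c_free, half_width = free_napy_with_ci(i_het, 2.0 - i_het, 1.0, 1.0)
    print(f"C_free = {c_free:.3f} mM, 95% CI = +/- {half_width:.3f} mM")
```

In the second hypothetical sample the half‐width exceeds the calculated free concentration itself, which is the situation in which negative experimental values of CNaPy free can appear.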


Chapter 3

Figure 3.4 Free NaPy concentration versus total concentration for various experimental systems. (A) Mixtures of NaPy 2 with either ditopic UPy 1a or 1b (black and gray, respectively). (B) Mixtures of ditopic UPy 1c and NaPy 2 at varying f ratios. (C) Mixtures of ditopic aminoUPy 3 and 2. (D) Mixtures of adamantyl substituted NaPy 4 and ditopic UPy 1c. The plots show experimental points based on 1H NMR spectra (squares), UV‐VIS spectra (circles) and model predictions (lines). Error bars denote 95% confidence intervals based on assumed relative standard deviations of the mass, volume and NMR integral (1, 1 and 5%, respectively). Note that in (D) most of the error bars are smaller than the marker size.

To verify the model prediction of changing KUPy‐UPy, ditopic UPy 3 is synthesized following a modified literature procedure.52 The dibutylamino group on ditopic UPy 3 stabilizes the enol as compared to the keto tautomer, resulting in a DADA hydrogen bonding array. Due to the negative secondary interactions of a DADA array as compared to the DDAA array of ditopic UPy 1, the binding strength is a factor 70 lower (KUPy‐UPy = 9 × 10⁵ M⁻¹ in CHCl3).52–55 To obtain an EM1 close to that of ditopic UPys 1a‐c, a dodecane linker is employed, as used in the synthesis of 1b. The concentration‐dependent 1H NMR spectra show a splitting of the urethane proton resonance peak similar to the spectra of 1a‐c (see section 3.4.4). However, in contrast to said spectra, the relative intensities of the split peaks do not vary with increasing concentration. Instead, only a change in the chemical shift of one of the peaks is observed, from which Ccr is estimated to be 5 mM. This yields an EM1 of 1.9 mM, which is reasonably similar to the EM1 of ditopic UPy 1b (9.0 ± 0.5 mM). Thus, with all input parameters of the model determined, the model prediction is verified by measuring CNaPy free in mixtures of ditopic aminoUPy 3 and NaPy 2 (Figure 3.4C). Since the 1H NMR measurements of these mixtures were prone to particularly large errors, additional measurements of CNaPy free were performed using ultraviolet‐visible (UV‐VIS) spectroscopy (for details see section 3.4.7). Gratifyingly, the additional measurements overlap nicely with the model predictions.
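The step from Ccr to EM1 can be checked with the relation between the critical concentration and the effective molarity of strainless rings that is used in section 3.4.3 (eq 3.28, Ccr ≈ 2.612 × EM1). As a short worked example, assuming this relation applies to ditopic UPy 3:

```latex
\[
EM_1 \approx \frac{C_{cr}}{2.612} = \frac{5\ \mathrm{mM}}{2.612} \approx 1.9\ \mathrm{mM}
\]
```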

Lastly, the model prediction obtained by varying KUPy‐NaPy is verified by measuring mixtures of ditopic UPy 1c and NaPy 4, the latter having a much lower binding constant due to steric hindrance of the adamantyl group.56 As the heterocomplexation is less strong, the fraction of bound NaPy is lowered. Indeed, the model predicts the absence of buffering in agreement with the experiments (Figure 3.4D).

3.2.3 Design principles of supramolecular ring‐chain buffering

The experimental validation of the two component supramolecular ring‐chain buffering model strongly suggests that it is applicable to a range of different ditopic molecules that can bind to a stopper molecule. Thus, as long as the binding constants and the EM values are known, the model can accurately predict the buffering behavior. This allows the rational design of molecular buffering systems, since the binding constants of many supramolecular associating groups are already reported.57,58 While the EMs of smaller rings (<30 bonds) remain troublesome to predict, the order of magnitude of the EM for relatively large strainless rings (>30 bonds) can be accurately predicted.44 To derive the design principles of supramolecular ring‐chain buffering, two quantities are evaluated: the broadness of the buffering plateau and the concentration of chain stopper in the buffering regime (Figure 3.5). Interestingly, for the values of the binding constants used in Figure 3.5, a tradeoff is observed between the plateau broadness and concentration: a high plateau concentration can only be achieved in conjunction with a low broadness and vice versa. Thus, it is clear that in order to obtain a specific buffering behavior, the ratio of homo‐ and heterodimerization is important, and not the absolute values. Intriguingly, this means that the model predicts that using a ditopic molecule with weakly associating end groups, such as benzoic acid, can give rise to a similar buffering plateau as one with strongly associating end groups such as ditopic UPy 1.


Figure 3.5 (A) Buffering plateau broadness and (B) the logarithm of the average plateau concentration as a function of KUPy‐UPy and KUPy‐NaPy. EM1 and f are fixed at 10 mM and 1, respectively. In the hatched region no buffering is observed. The broadness is calculated analytically, while the plateau concentration is calculated numerically by taking the average of 10 logarithmically spaced points within the buffered range. Square markers denote the parameter values of the equilibrium binding constants that are used to generate the buffering curves in (C). All graphs of (C) have the same axis limits and scaling.

It is hypothesized that the observed tradeoff between the broadness of the buffering plateau and the concentration of free NaPy can be overcome. This is based on the observation that, for a single set of binding parameters, increasing the EM increases both the broadness and concentration simultaneously (Figure 3.3A). However, to verify that the EM can indeed overcome this tradeoff, a way is required to report the influence of changing the EM while the other model parameters are varied across realistic values. We opted for performing a multiparameter analysis in which every combination of two parameters is varied while the other two are kept at a constant value. Comparing the graphs in which the EM is varied, it is observed that increasing the EM indeed increases both the plateau broadness and concentration for all of the parameter values considered here (Figure 3.6, bottom graphs). This strongly suggests that increasing the EM has the same effect of overcoming the tradeoff for all combinations of the other parameters. In the optimization of the supramolecular ring‐chain buffering it becomes readily apparent that there are no clear requirements for an optimal type of buffer. Instead, optimality depends strongly on the desired buffer properties for a specific application. Thus, two example cases are considered here in terms of their required parameters. I ‐ Biological reporter molecules are usually present in low concentrations (nM regime) and can be buffered across a wide range of total concentrations. 
Thus, a buffer with a high broadness and low concentration is applicable, which requires for example a construct of two associating proteins linked by a flexible chain.42,59,60 Peptide chains can be designed to be flexible random coil chains, which are accurately described using the worm‐like chain or Gaussian chain model, which gives a high degree of control over the EM.61 Because the buffering is to take place at low concentrations, the EM of the ditopic construct can be relatively low (μM‐mM regime). The appropriate binding strengths of the ditopic and monotopic molecules can then be chosen by generating a graph similar to Figure 3.5.


Using Figure 3.5 as an example, the binding constants should be chosen to be in the right‐ lower region of the graph to obtain a buffer with a broad plateau and low concentration. II ‐ Chemical catalysts are mostly used at higher concentrations (mM to M regime) while their operating concentrations are limited to a relatively small range. Therefore, a buffer with high concentration and low broadness is best suited. The combination of supramolecular interactions and catalysis is well explored, thus it should pose no challenge designing ditopic molecules with binding groups that can deactivate a catalyst when bound to the catalyst.62–66 Since the buffering plateau is limited at high concentration by the critical concentration and since the magnitude of the critical concentration is a direct result of the magnitude of the EM, the EM should at least be equally high as the desired operating concentration. This imposes some strict design rules on the linker, as obtaining a high EM is not always straightforward.67,68 This stems from the fact that short linkers are conformationally more limited and give rise to odd‐even effects in the EM with respect to the linker length.44 The requirements on the binding constants are less strict, and can again be chosen by generating a graph similar to Figure 3.5.

Figure 3.6 Multiparameter analysis of plateau broadness (A) and the logarithm of the plateau concentration (B) as a function of all model parameters. Parameters that are not varied are set to the following values: KUPy‐UPy = 6 × 10⁷ M⁻¹, KUPy‐NaPy = 5 × 10⁶ M⁻¹, EM1 = 0.01 M and f = 1. Hatched regions indicate that no buffering is observed. Both the plateau broadness and concentration are calculated numerically. The artefacts in the left half of the subplots in the second column are due to numerical inaccuracies.

3.2.4 Buffering by a ring‐chain mechanism versus other buffering mechanisms

The principle of supramolecular ring‐chain buffering as demonstrated in the previous sections, where buffering is observed while the total concentrations of both components are changed, is dissimilar from ‘traditional’ pH buffering in the sense that one generally does not change the total concentrations of both the buffering and the buffered molecules in a pH buffer. Instead, in a pH buffer the concentration of the buffered molecule (protons) is made insensitive to addition of acids or bases by action of a weak acid that acts as the buffering agent (Figure 3.7A). Thus, the question is raised whether the free NaPy concentration is buffered when NaPy is added to a solution containing a fixed concentration of ditopic UPy.

Figure 3.7 (A) Simulated titration curve of an acidified solution of a weak acid (pKa = 4.7, 0.1 M, 50 mL) with a strong base (0.1 M). (B) Simulated supramolecular ring‐chain buffering curves describing the addition of NaPy to a solution containing fixed concentrations of ditopic UPy. KUPy‐UPy = 6 × 10⁷ M⁻¹, KUPy‐NaPy = 5 × 10⁶ M⁻¹, EM1 = 10 mM. (C) Simulated molecular titration curves describing the addition of NaPy to a solution containing fixed concentrations of monotopic UPy. KUPy‐UPy = 6 × 10⁷ M⁻¹, KUPy‐NaPy = 5 × 10⁶ M⁻¹.

Based on the observation that supramolecular ring‐chain buffering is caused by competition between the formation of cycles and end‐capped chains, it is expected that the initial concentration of ditopic UPy plays a crucial role in buffering. Indeed, the supramolecular ring‐chain buffering curves generated by employing ditopic UPy concentrations below the critical concentration have slopes smaller than unity, reminiscent of pH titration curves (Figure 3.7B). While the slopes of supramolecular buffering curves are not as low as those of pH titration curves – indicating room for improvement – this does present the first example of supramolecular buffering. In contrast to pH buffers, the scope of our system is much broader as it enables buffering of a wide range of molecules with, for example, various catalytic functions. Analogous to pH buffering, an increase in the initial ditopic UPy concentration, up to the critical concentration, leads to an increase in the buffer capacity.69 Increasing the initial ditopic UPy concentration above the critical concentration leads to ultrasensitive threshold behavior. Ultrasensitive threshold behavior can be generated by various molecular mechanisms, such as positive feedback or molecular titration.70 In the latter case, active components are stoichiometrically sequestered by reversible binding to strong inhibitors up until the equivalence point set by the concentration of the inhibitor.33,71 Once the inhibitor sink is filled, an increase in the total concentration of active component results in a steep increase in the concentration of free active monomer (Figure 3.7C). The molecular titration curves overlap with the curves of supramolecular ring‐chain buffering that have a ditopic UPy concentration above the critical concentration (Cmonotopic UPy = 2 × Cditopic UPy), indicating that ring formation is negligible at high ditopic UPy concentrations. Thus, supramolecular buffering by ring‐chain equilibria extends the ultrasensitivity of molecular titration to include buffering, or subsensitivity, at concentrations below the threshold, allowing for extended regulatory capabilities. The ubiquitous presence of ditopic, tritopic and multitopic proteins in biochemical pathways suggests that supramolecular ring‐chain buffering in biological networks is yet to be discovered.60

3.3 Conclusions

We report a method to expand the scope of buffering from exclusively protons to whole molecules that are equipped with supramolecular binding groups. This is achieved by a ditopic molecule that is able to form rings and chains, and which is functionalized with binding groups complementary to the buffered monotopic molecule. By studying a newly developed thermodynamic equilibrium model of buffering, we were able to deduce that buffering occurs by competition between ring formation and stopper binding. The influence of the key model parameters was determined and the model was validated using a library of buffering systems, each with varying physicochemical parameters. The design principles of supramolecular ring‐chain buffering are elucidated by further model evaluation, revealing that the EM is the critical model parameter in attaining a buffering plateau with both a high broadness and a high concentration. We expect that this is the first step in broadening the definition of buffering as currently used in chemistry. The present system might be further expanded to include A‐B type ditopic molecules or ternary systems that can be employed to generate more diverse behavior.72 Furthermore, in the following chapter we will improve the capacity of the supramolecular buffer by employing a model‐driven engineering approach.

3.4 Experimental section

3.4.1 Materials and Methods

All solvents except deuterated solvents were obtained from Biosolve, all other chemicals were obtained from Aldrich or Acros unless otherwise noted and used without further purification. Deuterated solvents were obtained from Cambridge Isotope Laboratories. Dry CDCl3 was obtained by adding oven dried molecular sieves (4 Å) at least 48 hours prior to the measurements. Ditopic UPys 1a‐c were generously provided by Francisco Rodriguez‐Llansola, adamantyl substituted NaPy 4 was generously provided by Ronald Ligthart and bis(adamantyl) NaPy 8 was generously provided by Marko Nieuwenhuizen.


1H NMR and 13C NMR spectroscopy measurements were conducted on a Varian 400MR 400 MHz (400 MHz for 1H NMR and 100 MHz for 13C NMR). Proton chemical shifts are reported in ppm downfield from tetramethylsilane (TMS) and carbon chemical shifts in ppm downfield of TMS using the resonance of the deuterated solvent as internal standard. Abbreviations used are s: singlet, d: doublet, d‐d: double doublet, t: triplet, m: multiplet, b: broad. Flash column chromatography was performed on a Biotage Isolera Spektra One Flash Chromatography system using KP‐Sil Silica Gel SNAP columns. Mass spectrometric characterization was performed on a Bruker Autoflex Speed MALDI‐TOF spectrometer. Sample preparation was performed using DCTB (2‐[(2E)‐3‐(4‐tert‐Butylphenyl)‐2‐methylprop‐2‐enylidene]‐malononitrile) or CHCA (α‐cyanohydroxycinnamic acid) as the matrix. Infrared spectroscopy measurements were performed on a Perkin Elmer Spectrum Two FT‐IR spectrometer. Ultraviolet‐visible (UV‐VIS) spectroscopy measurements were performed on a Jasco V‐650 Spectrophotometer equipped with a Jasco ETCT‐762 temperature controller. All UV‐VIS measurements were performed at 20 °C.

1H NMR dilution experiments were performed on a Varian Unity Inova, 500 MHz equipped with a 5 mm 1H/X Inverse Detection probe. Samples were not spinning during the measurements. Deconvolution of the NMR spectra was performed using the line fitting function of MestReNova version 7.1.1‐9649.

Simulations were performed using the Matlab software package (R2013a, version 8.1.0.604, Mathworks) along with its optimization, curve fitting and parallel computing toolboxes. The design principles figure (Figure 3.5) was made using equations (3.26) and (3.27) to calculate the broadness of the buffering plateau. Subsequently, 10 logarithmically spaced total concentrations in between the buffer plateau limits are simulated and the calculated free NaPy concentrations, at those total concentrations, are averaged to obtain the average buffer plateau concentration. In Figure 3.6, the plateau broadness is not calculated analytically, but numerically. This entails the calculation of a complete buffering curve and subsequent detection of the plateau by selecting the points where the moving three point average varies less than 10%. The average buffer plateau concentration is subsequently obtained by averaging all the calculated free NaPy concentrations, at varying total concentrations, which were detected to fall within the buffering plateau.
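The numerical plateau detection can be sketched as follows in Python. How exactly the 10% criterion is applied is not specified here, so this sketch makes an assumption: a point belongs to the plateau when its centred three point average differs by less than 10% from the averages centred on both neighbouring points. The result therefore depends on the spacing of the concentration grid, a single contiguous plateau is assumed, and all names are hypothetical.

```python
import math


def detect_plateau(c_free, tolerance=0.10):
    """Indices of points whose centred three-point average changes by less
    than `tolerance` (relative) with respect to both neighbouring averages."""
    n = len(c_free)
    avg = [sum(c_free[k - 1:k + 2]) / 3.0 for k in range(1, n - 1)]
    plateau = []
    for j in range(1, len(avg) - 1):
        left = abs(avg[j] - avg[j - 1]) / avg[j]
        right = abs(avg[j + 1] - avg[j]) / avg[j]
        if left < tolerance and right < tolerance:
            plateau.append(j + 1)  # avg[j] is centred on data point j + 1
    return plateau


def plateau_summary(c_total, c_free, tolerance=0.10):
    """Return (broadness in decades, mean plateau concentration) or None."""
    idx = detect_plateau(c_free, tolerance)
    if not idx:
        return None
    broadness = math.log10(c_total[idx[-1]] / c_total[idx[0]])
    mean_conc = sum(c_free[k] for k in idx) / len(idx)
    return broadness, mean_conc


def synthetic_curve(c):
    """Toy buffering curve (M): rise, flat between 1e-4 and 1e-2, rise again."""
    if c < 1e-4:
        return c
    if c <= 1e-2:
        return 1e-4
    return 1e-4 * c / 1e-2


c_total = [10 ** (-6 + 0.1 * k) for k in range(61)]  # 10 points per decade
c_free = [synthetic_curve(c) for c in c_total]
print(plateau_summary(c_total, c_free))
```

On this toy curve the detected plateau is slightly narrower than the flat region itself, because the averages centred on the two corner points still include a point from the rising branches.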

3.4.2 Synthetic procedures

N,N'‐(1,8‐naphthyridine‐2,7‐diyl)didodecanamide 2,40 2‐amino‐6‐(dibutylamino)pyrimidin‐4‐ol 6 and 1‐(4‐(dibutylamino)‐6‐hydroxypyrimidin‐2‐yl)‐3‐(6‐isocyanatohexyl)urea 7 were synthesized according to literature procedures.52 Ditopic UPy 3 is synthesized following a modified synthesis scheme from literature (Scheme 3.1).52 2‐Amino‐6‐chloro‐4‐pyrimidinol 5 is allowed to react with dibutylamine to yield 6 (45%), which is subsequently allowed to react in bulk diisocyanatohexane to yield 7 (50%). UPy‐isocyanate 7 is allowed to react with 1,12‐dodecanediol and purified using column chromatography to yield ditopic aminoUPy 3 in a yield of 20%.

Scheme 3.1: Synthesis of ditopic di(butyl)amino 3. a) dibutylamine, ethylene glycol, 135 °C, 18 h; 45%. b) hexamethylenediisocyanate, 60 °C, 1 h; 50%. c) 1,12‐dodecanediol, dibutyltin didodecanoate, CHCl3, 60 °C, 2 h; 20%.

Synthesis of dodecane‐1,12‐diyl bis((6‐(3‐(4‐(dibutylamino)‐6‐hydroxypyrimidin‐2‐yl)ureido)hexyl)carbamate) (3)

1‐(4‐(Dibutylamino)‐6‐hydroxypyrimidin‐2‐yl)‐3‐(6‐isocyanatohexyl)urea 7 (0.50 g, 1.23 mmol) was dissolved in dry CHCl3 (5.0 mL) and 1,12‐dodecanediol (0.121 g, 0.60 mmol) was added. Thereafter, a drop of dibutyltin dilaurate was added to the solution. The mixture was stirred for 2 hours at 60 °C. After this time, 0.5 g of silica and a drop of dibutyltin dilaurate were added to remove any excess of isocyanate. Subsequently, the mixture was stirred for 2 hours at 60 °C. After cooling to room temperature, the solution was filtered and the solvent was evaporated in vacuo. Column chromatography (SiO2, EtOAc) and subsequent drying of the material in vacuo afforded 0.12 g (19.7%) of the product as a white solid. 1H NMR (CDCl3): δ 12.58 (s, 2H; OH), 11.23 (s, 2H; NH), 9.61 (s, 2H; NH), 5.32 (s, 2H; Ar‐H), 4.68 (b, 2H; NH), 4.02 (s, 4H; CH2), 3.33 (m, 12H; CH2), 3.15 (m, 4H; CH2), 1.7‐1.1 (m, 52H; CH2), 0.96 (t, 12H; CH3). 13C NMR (CDCl3): δ 162.4, 156.8, 78.7, 64.8, 49.0, 40.8, 39.7, 30.2, 29.8, 29.5, 29.3, 29.1, 26.6, 26.4, 25.9, 20.4, 14.0. IR (ATR): ν = 3331, 3220, 3137, 3027, 2924, 2854, 1688, 1615, 1558, 1525, 1504, 1454, 1368, 1323, 1250, 1206, 1145, 1110, 1059, 987, 891 cm⁻¹. MALDI‐TOF‐MS (m/z): calcd.: 1015.38, observed: 1015.74 (M+H+). Anal. calcd for C52H94N12O8: C 61.51, H 9.33, N 16.55; found C 61.81, H 8.85, N 15.90.


3.4.3 Model description of two‐component supramolecular buffering

CAA = initial concentration of the ditopic AA molecule in mol L⁻¹

CB = initial concentration of the monofunctional B molecule in mol L⁻¹

Kra = equilibrium constant for the dimerization of a monofunctional reactant RA in mol⁻¹ L

Kra* = statistically corrected equilibrium constant for the dimerization of a monofunctional reactant RA in mol⁻¹ L

Kaa = equilibrium constant for intermolecular AA homo‐coupling in mol⁻¹ L

Kab = equilibrium constant for intermolecular AB hetero‐coupling in mol⁻¹ L

x = extent of homo‐coupling reaction in the linear fraction

$L_i$ = i‐meric chain made of AA homo‐bonds only (i running from 1 to ∞)

$L_i^{B}$ = i‐meric chain made of AA homo‐bonds, stopped at one end by the monofunctional B molecule (i running from 1 to ∞)

$L_i^{BB}$ = i‐meric chain made of AA homo‐bonds, stopped at both ends by the monofunctional B molecule (i running from 1 to ∞)

$C_i$ = equilibrium concentration of i‐meric ring (i running from 1 to ∞)

K(intra)i = equilibrium constant for the cyclization of $L_i$ to yield $C_i$

EMi = effective molarity relative to the ease of formation of the i‐meric ring. It is defined as the ratio K(intra)i / Kaa

Here we consider the addition of a monofunctional B molecule, which can act as a chain stopper, to a ditopic molecule that bears two equivalent functional groups at its ends (AA). When the equilibrium is attained, the initial monomer concentration, CAA, is partitioned into fractions of cyclic and linear oligomers.

$$ C_{AA} = \sum_{i=1}^{\infty} i\,C_i + \sum_{i=1}^{\infty} i\,L_i + \sum_{i=1}^{\infty} i\,L_i^{B} + \sum_{i=1}^{\infty} i\,L_i^{BB} \tag{3.4} $$

We first derive the expressions for the case that CB is zero. In this case eq (3.4) reduces to eq (3.5).

$$ C_{AA} = \sum_{i=1}^{\infty} i\,C_i + \sum_{i=1}^{\infty} i\,L_i \tag{3.5} $$

To evaluate the strength of the homo‐coupling interaction, it is convenient to define the dimerization equilibrium of a monofunctional reactant RA.

$$ \mathrm{R{-}A} + \mathrm{A{-}R} \overset{K_{ra}}{\rightleftharpoons} \mathrm{R{-}A \cdot A{-}R} $$


The equilibrium constant Kra is disfavored by a statistical factor 1/2 due to the symmetry number, σ = 2, of the dimer RA·AR.73 Accordingly, Kra = Kra*/2, where Kra*, the equilibrium dimerization constant of RA corrected for the statistical factor, takes into account the effective binding strength of the A·A interaction. In order to treat the case described by eq (3.5) it is convenient to use as the reference intermolecular reaction the dimerization equilibrium of bifunctional monomer AA (eq 39 in ref 43).

$$ 2\,\mathrm{A{-}A} \overset{K_{aa}}{\rightleftharpoons} \mathrm{A{-}A \cdot A{-}A} $$

The equilibrium constant Kaa can be evaluated from the constant Kra by considering that Kaa is favored by a statistical factor 2 due to the symmetry number, σ = 2, of both the monomer AA and the dimer AA·AA, thus Kaa = 2 Kra* = 4 Kra. The factor 4 can also be explained by the direct count method.73 If all binding sites are labeled, there are four different ways in which two ditopic monomers can react to form a dimer, while the formed dimer can only dissociate in one way.

By using Kaa as intermolecular reference reaction constant, it can be shown that eqs (3.6) and (3.7) hold (eqs 10 and 12 in ref 43).

$$ C_i = EM_i\,x^{i} \tag{3.6} $$

$$ L_i = \frac{x^{i}}{K_{aa}} \tag{3.7} $$

Here, x is the fraction of homo‐coupling in linear chains. It was pointed out by Jacobson and Stockmayer that under the condition that all linear chains are long enough to follow Gaussian statistics, eq (3.8) holds.41

$$ EM_i = EM_1\,i^{-5/2} \tag{3.8} $$

The factor i^(−5/2) in eq (3.8) may be regarded as the product of i^(−3/2) and i^(−1). The former term relates to the probability that a Gaussian chain of i repeating units has its ends coincide, while the latter term represents the number of equivalent bonds available for the ring‐opening of a cyclic i‐mer. Substituting eq (3.8) into eq (3.6), eq (3.9) is obtained.

$$ C_i = EM_1\,i^{-5/2}\,x^{i} \tag{3.9} $$

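Eq (3.9) implies that the monomer concentration stored in rings, the sum of i·Ci over all ring sizes, equals EM1 times the series Σ i^(−3/2) x^i. As a small numerical illustration (an addition with hypothetical function names), the Python sketch below sums this series and shows that it stays finite as x approaches 1, where it tends to ζ(3/2) ≈ 2.612. This is the origin of the factor 2.612 that relates the critical concentration to EM1 for strainless rings.

```python
def ring_series(x, rel_tol=1e-12, max_terms=2_000_000):
    """Sum of i**-1.5 * x**i for i >= 1, valid for 0 <= x < 1."""
    if not 0.0 <= x < 1.0:
        raise ValueError("x must satisfy 0 <= x < 1")
    total, power = 0.0, 1.0
    for i in range(1, max_terms + 1):
        power *= x
        term = power * i ** -1.5
        total += term
        # The neglected tail is bounded by term * x / (1 - x).
        if term * x <= rel_tol * total * (1.0 - x):
            break
    return total


def zeta_three_halves(n_terms=1000):
    """zeta(3/2) = sum of i**-1.5, using an Euler-Maclaurin tail correction."""
    partial = sum(i ** -1.5 for i in range(1, n_terms + 1))
    # Integral of t**-1.5 from n to infinity, minus half of the last term.
    return partial + 2.0 * n_terms ** -0.5 - 0.5 * n_terms ** -1.5


em1 = 10.0  # mM, hypothetical effective molarity of the strainless monomeric ring
for x in (0.5, 0.9, 0.99, 0.999):
    print(f"x = {x}: monomer in rings = {em1 * ring_series(x):.3f} mM")
print(f"limit for x -> 1: {em1 * zeta_three_halves():.3f} mM")
```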

Substituting eqs (3.7) and (3.9) into eq (3.5) and evaluating the infinite sum of i x^i, eq (3.10) is obtained.

$$ C_{AA} = EM_1 \sum_{i=1}^{\infty} i^{-3/2}\,x^{i} + \frac{1}{K_{aa}}\,\frac{x}{(1-x)^{2}} \tag{3.10} $$

If the assumption of strainless rings is not valid, eq (3.10) can be expanded to include a summation over the strained rings (eq 18 in ref 43) to yield eq (3.11).

$$ C_{AA} = \sum_{i=1}^{r-1} i\,EM_i\,x^{i} + B \sum_{i=r}^{\infty} i^{-3/2}\,x^{i} + \frac{1}{K_{aa}}\,\frac{x}{(1-x)^{2}} \tag{3.11} $$

Here, r is the ring size of the first strainless ring and B is the value EM1 would have if the cyclic monomer were strainless. We now allow for the appearance of the monofunctional stopper B with reference reaction:

$$ \mathrm{A} + \mathrm{B} \overset{K_{ab}}{\rightleftharpoons} \mathrm{A \cdot B} $$

By addition of the monofunctional B molecule, at an initial concentration of CB, the equilibria are perturbed since B can react with:

i) The A end of a chain $L_i$ to give the mono‐stopped chains $L_i^{B}$ according to eq (3.12):

$$ L_i + \mathrm{B} \overset{2K_{ab}}{\rightleftharpoons} L_i^{B} \tag{3.12} $$

where $L_i^{B} = \mathrm{AA \cdot (AA)_{i-2} \cdot AA \cdot B}$.

The equilibrium constant of eq (3.12) is favored by a factor of two because $L_i$ chains have two terminal A groups available. Combining eqs (3.7) and (3.12), we can find an expression for the concentration of $L_i^{B}$ chains:

$$ [L_i^{B}] = \frac{2 K_{ab}\,[B]}{K_{aa}}\,x^{i} \tag{3.13} $$


ii) Both A ends of a chain $L_i$ to give the double‐stopped chains $L_i^{BB}$ according to eq (3.14):

$$ L_i + 2\,\mathrm{B} \overset{K_{ab}^{2}}{\rightleftharpoons} L_i^{BB} \tag{3.14} $$

where $L_i^{BB} = \mathrm{B \cdot AA \cdot (AA)_{i-2} \cdot AA \cdot B}$.

Note that the statistical factor is 1 because species $L_i$ has two terminal –A groups available but, in the reverse reaction, $L_i^{BB}$ chains have two terminal stoppers that can undergo dissociation. Considering the definition of the equilibrium of eq (3.14) and eq (3.7), eq (3.15) is easily obtained:

$$ [L_i^{BB}] = \frac{K_{ab}^{2}\,[B]^{2}}{K_{aa}}\,x^{i} \tag{3.15} $$

Now consider the mass balance equation of the monofunctional B molecule:

$$ C_B = [B] + \sum_{i=1}^{\infty} [L_i^{B}] + 2 \sum_{i=1}^{\infty} [L_i^{BB}] \tag{3.16} $$

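The series in eq (3.16) are given by eqs (3.17) and (3.18), respectively:

$$ \sum_{i=1}^{\infty} [L_i^{B}] = \sum_{i=1}^{\infty} \frac{2 K_{ab}\,[B]\,x^{i}}{K_{aa}} = \frac{2 K_{ab}\,[B]\,x}{K_{aa}\,(1-x)} \tag{3.17} $$

$$ \sum_{i=1}^{\infty} [L_i^{BB}] = \sum_{i=1}^{\infty} \frac{K_{ab}^{2}\,[B]^{2}\,x^{i}}{K_{aa}} = \frac{K_{ab}^{2}\,[B]^{2}\,x}{K_{aa}\,(1-x)} \tag{3.18} $$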

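Combining eqs (3.17) and (3.18) with eq (3.16), we find for the mass balance of monofunctional B:

$$ \begin{aligned} C_B &= [B] + \frac{2 K_{ab}\,[B]\,x}{K_{aa}\,(1-x)} + \frac{2 K_{ab}^{2}\,[B]^{2}\,x}{K_{aa}\,(1-x)} \\ &= [B] \left( 1 + \frac{2 K_{ab}\,x\,(1 + K_{ab}[B])}{K_{aa}\,(1-x)} \right) \end{aligned} \tag{3.19} $$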


Rearranging eq (3.19) results in the following expression:

$$ 2 K_{ab}^{2}\,x\,[B]^{2} + (2 K_{ab}\,x - K_{aa}\,x + K_{aa})\,[B] - C_B\,K_{aa}\,(1-x) = 0 \tag{3.20} $$

which can be solved for [B] to yield eq (3.21)

$$ [B] = \frac{1}{4 K_{ab}^{2}\,x} \left( -(2 K_{ab}\,x - K_{aa}\,x + K_{aa}) + \sqrt{(2 K_{ab}\,x - K_{aa}\,x + K_{aa})^{2} + 8 K_{ab}^{2}\,C_B\,K_{aa}\,x\,(1-x)} \right) \tag{3.21} $$

The concentration of monomer A–A that has gone into the end‐capped oligomers is given by the following equations, where [B] is given by eq (3.21)

$$ \sum_{i=1}^{\infty} i\,[L_i^{B}] = \frac{2 K_{ab}\,[B]}{K_{aa}}\,\frac{x}{(1-x)^{2}} \tag{3.22} $$

$$ \sum_{i=1}^{\infty} i\,[L_i^{BB}] = \frac{K_{ab}^{2}\,[B]^{2}}{K_{aa}}\,\frac{x}{(1-x)^{2}} \tag{3.23} $$

Combining eqs (3.22) and (3.23) into eq (3.4) gives the mass balance of ditopic AA species, corrected for the presence of end‐capped oligomers:

$$ C_{AA} = EM_1 \sum_{i=1}^{\infty} i^{-3/2}\,x^{i} + \left( \frac{1}{K_{aa}} + \frac{2 K_{ab}\,[B]}{K_{aa}} + \frac{K_{ab}^{2}\,[B]^{2}}{K_{aa}} \right) \frac{x}{(1-x)^{2}} \tag{3.24} $$

Solving eq (3.21) and (3.24) simultaneously, results in a value for x which can be used to calculate the concentration of monomers present in cycles, linear chains, chains with a single stopped end‐group and chains with two stopped end‐groups.

Solving using Matlab and approximation of buffering plateau limits

Equations (3.21) and (3.24) are solved numerically using Matlab's built‐in fzero function, using as inputs Kaa, Kab, EM1, CAA and CB. To numerically evaluate the infinite series in the mass balance of rings, a recurrence relation is used that calculates the difference between the two leading terms and then divides it by the sum of the concentration of monomers in cycles. Terms for subsequent ring sizes are added until a specified tolerance is met.
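The numerical procedure can be sketched in a few lines. The version below is written in Python rather than Matlab and replaces fzero by a plain bisection on x, so it illustrates the approach under stated assumptions and is not the original script. It assumes strainless rings (eq 3.24), assumes that the right‐hand side of eq (3.24) increases monotonically with x at fixed CB, and truncates the ring series with a simple tail bound instead of the recurrence‐based criterion described above. All names are hypothetical.

```python
import math


def ring_series(x, rel_tol=1e-12, max_terms=2_000_000):
    """Sum of i**-1.5 * x**i (i >= 1); the tail is bounded by term*x/(1-x)."""
    total, power = 0.0, 1.0
    for i in range(1, max_terms + 1):
        power *= x
        term = power * i ** -1.5
        total += term
        if term * x <= rel_tol * total * (1.0 - x):
            break
    return total


def free_stopper(x, k_aa, k_ab, c_b):
    """Eq (3.21), written in an equivalent form that avoids cancellation."""
    a = 2.0 * k_ab ** 2 * x
    b = 2.0 * k_ab * x + k_aa * (1.0 - x)
    c = c_b * k_aa * (1.0 - x)
    return 2.0 * c / (b + math.sqrt(b * b + 4.0 * a * c))


def total_ditopic(x, k_aa, k_ab, em1, c_b):
    """Right-hand side of eq (3.24) for a trial value of x."""
    b_free = free_stopper(x, k_aa, k_ab, c_b)
    # 1 + 2 Kab [B] + Kab^2 [B]^2 = (1 + Kab [B])^2
    chains = (1.0 + k_ab * b_free) ** 2 / k_aa * x / (1.0 - x) ** 2
    return em1 * ring_series(x) + chains


def solve_buffer(k_aa, k_ab, em1, c_aa, c_b, iterations=200):
    """Bisect eq (3.24) for x in (0, 1); return x and the free stopper [B]."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid in (lo, hi):  # the interval can no longer be halved
            break
        if total_ditopic(mid, k_aa, k_ab, em1, c_b) < c_aa:
            lo = mid
        else:
            hi = mid
    x = 0.5 * (lo + hi)
    return x, free_stopper(x, k_aa, k_ab, c_b)


K_AA, K_AB, EM1 = 6e7, 5e6, 1e-2  # M^-1, M^-1, M (values from the figure captions)
for c_total in (1e-4, 1e-3, 1e-2):  # C_AA = C_B, i.e. f = 1
    x, b_free = solve_buffer(K_AA, K_AB, EM1, c_total, c_total)
    print(f"C_total = {c_total:.0e} M  x = {x:.4f}  [B] = {b_free:.3e} M")
```

Bisection only evaluates interior points of the bracket, which sidesteps the singularity of the chain term at x = 1; a bracketing root finder such as fzero serves the same purpose.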

When increasing CAA and CB simultaneously and keeping the ratio CB / CAA (= f) constant, a buffering plateau can be observed in the free stopper concentration (Figure 3.1B). The transition of region I to II can be approximated by assuming that only monomeric rings are present and considering the following equilibrium:

$$ \mathrm{AA_{ring}} + 2\,\mathrm{B} \overset{K}{\rightleftharpoons} \mathrm{B \cdot AA \cdot B} $$

K includes three reactions, i.e. breakage of the monomeric ring and binding of a B molecule to each end of the ditopic AA monomer, and is equal to:

$$ K = \frac{K_{ab}^{2}}{K_{intra}} = \frac{K_{ab}^{2}}{K_{inter}\,EM_1} \tag{3.25} $$

By definition K is also equal to:

$$ K = \frac{C_{\mathrm{B \cdot AA \cdot B}}}{C_{\mathrm{AA_{ring}}}\,C_B^{2}} \tag{3.26} $$

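From the midpoint of the considered equilibrium (CB·AA·B = CAAring), CB can be calculated as:

$$ C_{B,\,\mathrm{I \to II}} = \frac{1}{\sqrt{K}} = \frac{1}{\sqrt{\dfrac{K_{ab}^{2}}{K_{inter}\,EM_1}}} = \frac{1}{\sqrt{\dfrac{(3 \times 10^{6})^{2}}{6 \times 10^{7} \cdot 1 \times 10^{-1}}}} = 8.2 \times 10^{-4}\ \mathrm{M} \approx 0.8\ \mathrm{mM} \tag{3.27} $$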

This corresponds to the observed simulated transition between regions I and II. The transition II → III is close to the critical monomer concentration, Ccr:43

$$ C_{B,\,\mathrm{II \to III}} = 2.612\,B = 2.612 \cdot 1 \times 10^{-1} = 2.6 \times 10^{-1}\ \mathrm{M} = 260\ \mathrm{mM} \tag{3.28} $$

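The two plateau limits can be evaluated directly from eqs (3.25) to (3.28). The Python sketch below recovers the values of about 0.8 mM and 260 mM given in eqs (3.27) and (3.28), using Kab = 3 × 10⁶ M⁻¹, Kinter = 6 × 10⁷ M⁻¹ and EM1 = B = 0.1 M as read from those equations. As an assumption of this illustration, the plateau broadness is expressed as the base‐10 logarithm of the ratio of the two limits. Function names are hypothetical.

```python
import math

ZETA_3_2 = 2.612  # factor relating the critical concentration to EM1, eq (3.28)


def plateau_limits(k_inter, k_ab, em1):
    """Return (C_B at I->II, C_B at II->III) from eqs (3.27) and (3.28)."""
    k = k_ab ** 2 / (k_inter * em1)  # eq (3.25)
    lower = 1.0 / math.sqrt(k)       # midpoint condition applied to eq (3.26)
    upper = ZETA_3_2 * em1
    return lower, upper


lower, upper = plateau_limits(k_inter=6e7, k_ab=3e6, em1=0.1)
broadness = math.log10(upper / lower)  # in decades (definition assumed here)
print(f"I->II: {lower * 1e3:.2f} mM, II->III: {upper * 1e3:.0f} mM, "
      f"broadness: {broadness:.2f} decades")
```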

3.4.4 Assignment of 1H NMR urethane proton resonances and analysis of ring‐chain equilibrium

To determine the effective molarities of the ditopic UPy monomers, 1H NMR measurements are performed on the ditopic molecules at various concentrations.

EM determination of ditopic UPy molecules 1a‐c

Here only the concentration‐dependent spectra and the assignment of peaks of 1a and 1c are shown, since those of 1b have been published elsewhere.46,74 A concentration‐dependent splitting of the urethane proton is observed, which is attributed to the different conformations of the ditopic molecules (Figure 3.8). At each total concentration, the urethane resonances are deconvoluted using a maximum of eight peaks to completely describe the spectral region.

Figure 3.8: Zoom in of urethane resonance region of the 1H NMR spectra of ditopic UPys 1a (A) and 1c (B). The concentration ranges from 0.2 to 80 mM with logarithmically increasing concentrations in between. The top spectrum corresponds to the lowest concentration and the bottom spectrum to the highest concentration. The intensity of the spectra is scaled to show peaks at all concentrations. C) deconvoluted spectrum of 1a (25 mM, raw data ‐ black, fitted peaks ‐ blue, sum of fitted peaks ‐ pink, residuals ‐ red). D) deconvoluted spectrum of 1c (9 mM, raw data ‐ black, fitted peaks ‐ blue, sum of fitted peaks ‐ pink, residuals ‐ red).

The assignment of the peaks is done by plotting the calculated concentration of each species, as derived from the deconvoluted peaks, vs the total concentration of ditopic UPy on a double logarithmic plot. Simulations show that at total concentrations below Ccr, the concentration of the monomeric cycle vs. the total concentration will have a slope of unity on a double logarithmic plot, whereas oligomeric cycles will have larger slopes. Indeed, this is observed experimentally in the analysis of the 1H NMR dilution series of ditopic UPy 1c, where peaks 1, 2, 7 and 8 have a slope of unity and peaks 3, 4 and 6 have a slope of 1.5 (Figure 3.9). Above Ccr, the concentrations of the cycles level off as all additional ditopic molecules go into linear chains. As the linear aggregates only form at higher concentrations, peak 5 corresponds to linear aggregates.

[Plots: CPeak [mM] vs C ditopic UPy 1c [mM] on double logarithmic axes. A) Peaks 1, 2, 7 and 8 with a reference line of slope = 1. B) Peaks 3, 4 and 6 with a reference line of slope = 1.5.]

Figure 3.8: Double logarithmic plots displaying the calculated concentration of each deconvoluted peak vs the total concentration of ditopic UPy 1c. A) Monomeric cycles and B) oligomeric cycles. The black line displays the critical concentration (Ccr).
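The slope criterion used for the peak assignment can be illustrated with a least‐squares fit on log‐transformed data. This is a minimal sketch on synthetic numbers, not on the measured dilution series; the function name and the prefactors are hypothetical.

```python
import math


def loglog_slope(total_conc, species_conc):
    """Least-squares slope of log10(species) versus log10(total)."""
    xs = [math.log10(c) for c in total_conc]
    ys = [math.log10(c) for c in species_conc]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx


# Synthetic dilution series in mM (illustrative values, not measured data).
total = [0.2, 0.5, 1.0, 2.0, 5.0]
monomeric_like = [0.6 * c for c in total]            # proportional to C_total
oligomeric_like = [0.05 * c ** 1.5 for c in total]   # proportional to C_total^1.5

slope_mono = loglog_slope(total, monomeric_like)
slope_oligo = loglog_slope(total, oligomeric_like)
print(f"slopes: {slope_mono:.2f} and {slope_oligo:.2f}")
```

Only points below Ccr should enter such a fit, since the cycle concentrations level off above it.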

This assignment is less straightforward for ditopic UPy 1a, in which peaks 1, 2 and 7 show concentration independent behavior; i.e., their fractional integral values remain constant. Considering only the remaining peaks, peaks 5 and 6 are tentatively identified as corresponding to linear aggregates since they appear at higher concentrations (> 10 mM). Thus, the remaining peaks (3, 4 and 8) should correspond to cyclic aggregates. However, plotting the calculated concentration of those peaks vs. the total concentration yields a curve with a slope of unity, indicating that only monomeric rings can be assigned to the peaks (Figure 3.9). As such, we conclude that the signals for the oligomeric rings overlap with those of linear aggregates.


[Plot: CPeak [mM] vs C ditopic UPy 1a [mM] on double logarithmic axes. Peaks 3, 4 and 8 with a reference line of slope = 1.]

Figure 3.9: Double logarithmic plot displaying the calculated concentration of each deconvoluted peak vs the total concentration of ditopic UPy 1a. The black line displays the critical concentration (Ccr).

To obtain estimates of the EM values, the 1H NMR data are fitted to eq (3.11) with a custom‐written Matlab script that uses the function lsqcurvefit. This function uses the Levenberg‐Marquardt method to minimize the residual sum of squares. Latin hypercube sampling is performed to generate a diverse set of starting parameters for the fitting routine, so as to avoid local minima in the error landscape. Subsequently, the fit with the lowest residual sum of squares is chosen as the best fit. Estimates of the standard deviations of the optimized parameters are generated from the Jacobian and the normalized residuals.75 The optimized EM1 values decrease going from ditopic UPy 1a to 1c (Figure 3.10), which is caused by the increase in length of the linker between the two UPy groups.
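The multi‐start strategy can be sketched as follows. This is not the Matlab script used here: the model is a hypothetical two‐parameter saturation curve standing in for eq (3.11), all names are illustrative, and the local optimization step (lsqcurvefit in the text) is omitted, so the starting points are only ranked by their residual sum of squares.

```python
import random


def latin_hypercube(n_samples, bounds, rng):
    """Return n_samples points. Each parameter range is cut into n_samples
    equal strata and every stratum is used exactly once per parameter."""
    columns = []
    for low, high in bounds:
        width = (high - low) / n_samples
        column = [low + (i + rng.random()) * width for i in range(n_samples)]
        rng.shuffle(column)  # decouple the strata of different parameters
        columns.append(column)
    return [tuple(col[i] for col in columns) for i in range(n_samples)]


def residual_sum_of_squares(params, xs, ys, model):
    return sum((y - model(x, params)) ** 2 for x, y in zip(xs, ys))


def toy_model(x, params):
    # Hypothetical saturation curve, NOT eq (3.11).
    plateau, midpoint = params
    return plateau * x / (midpoint + x)


rng = random.Random(0)
xs = [0.2, 0.5, 1, 2, 5, 10, 20, 40, 80]
ys = [toy_model(x, (10.0, 5.0)) for x in xs]  # synthetic, noise-free data

starts = latin_hypercube(20, [(1.0, 20.0), (0.5, 50.0)], rng)
# In a full workflow every start would be refined by a local least-squares
# routine before the comparison; here the raw starts are compared directly.
best_start = min(starts, key=lambda p: residual_sum_of_squares(p, xs, ys, toy_model))
```

Stratified sampling guarantees that every part of each parameter range is probed once, which plain random sampling does not.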

[Plots: EM1 [mM] for 1a, 1b and 1c (axis from 0 to 20 mM) and EM2 [mM] for 1b and 1c (axis from 0 to 1 mM).]

Figure 3.10: Optimized values of EM1 and EM2 of ditopic UPys 1a‐c. Error bars denote the 95% confidence interval.


Supramolecular buffering by ring‐chain competition

To compare the goodness‐of‐fit of the two alternative models described in the main text, an extra sum of squares F‐test is performed.75 This statistical test indicates whether the addition of an extra parameter is justified, based on the reduction in the residual sum of squares and the loss of one degree of freedom. In this case, the additional parameter in model 2 compared to model 1 is EM2. The null hypothesis, H0, corresponds to the first model being correct, in which only EM1 is used as a fitting parameter. The alternative hypothesis, H1, corresponds to the second model being correct, in which both EM1 and EM2 are used as fitting parameters. The resulting p value gives the probability of obtaining an improvement in fit at least as large as the observed one if H0 is correct. Thus, if the p value is close to zero, H0 is rejected in favor of H1. The F‐test on the fit of ditopic UPy 1a shows that the first model is likely to be correct, while for ditopic UPys 1b‐c the second model is more likely to be correct (Table 3.1).

Table 3.1: F‐test values of the fits of ditopic UPys 1a‐c. SS = residual sum of squares; df = degrees of freedom.

                               1a                            1b                            1c
                               SS             df             SS             df             SS             df
H0 (fit EM1)                   0.0731         26             0.6540         58             0.3915         43
H1 (fit EM1 and EM2)           0.0731         25             0.4298         57             0.0528         42
Difference                     0              1              0.2242         1              0.3387         1
Difference (relative to H1)    0              4.00 × 10^‐2   5.22 × 10^‐1   1.75 × 10^‐2   6.41           2.38 × 10^‐2
Ratio (F)                      0                             30                            269
p value                        1                             1.11 × 10^‐6                  7.01 × 10^‐20
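The F ratios in Table 3.1 follow directly from the tabulated sums of squares and degrees of freedom. The sketch below reproduces that arithmetic; the function and variable names are hypothetical.

```python
def extra_sum_of_squares_f(ss_null, df_null, ss_alt, df_alt):
    """F ratio for nested models: relative drop in SS over relative drop in df."""
    rel_ss = (ss_null - ss_alt) / ss_alt
    rel_df = (df_null - df_alt) / df_alt
    return rel_ss / rel_df


# (SS, df) for H0 and H1, taken from Table 3.1.
fits = {
    "1a": ((0.0731, 26), (0.0731, 25)),
    "1b": ((0.6540, 58), (0.4298, 57)),
    "1c": ((0.3915, 43), (0.0528, 42)),
}

f_ratios = {name: extra_sum_of_squares_f(*h0, *h1) for name, (h0, h1) in fits.items()}
for name, f_ratio in f_ratios.items():
    print(f"{name}: F = {f_ratio:.1f}")
```

The p value then follows from the upper tail of the F distribution with (df of H0 minus df of H1, df of H1) degrees of freedom, for example via `scipy.stats.f.sf` if SciPy is available.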

EM1 determination of ditopic aminoUPy 3

The concentration‐dependent 1H NMR spectra of ditopic aminoUPy 3 show that the urethane signal is split, similar to the spectra of ditopic UPys 1a‐c (Figure 3.11A). However, in contrast to those spectra, the fractional integral values are concentration‐independent (Figure 3.11C). This is likely due to the overlap of peaks from cyclic and linear species. Instead of using the fractional integral values, the change in chemical shift of peak 3 is used to estimate Ccr to be around 5 mM (Figure 3.11D). This yields an EM1 of about 2 mM, which is reasonably similar to the EM1 of ditopic UPy 1b (~9 mM), which has the same spacer between the UPy groups.


[Panels A) and B): 1H NMR spectra. C) Fraction of peaks 1 to 4 vs C ditopic aminoUPy [mM] on a logarithmic axis. D) Chemical shift [ppm] of peaks 1 to 4 vs C ditopic aminoUPy [mM] on a logarithmic axis.]

Figure 3.11: A) Zoom in of urethane resonance region of the 1H NMR spectra of ditopic aminoUPy 3. B) deconvoluted spectrum of 3 (17 mM, raw data ‐ black, fitted peaks ‐ blue, sum of fitted peaks ‐ pink, residuals ‐ red). C) The fractional integrals of the deconvoluted peaks. D) Chemical shifts of the deconvoluted peaks.

3.4.5 Influence of oligomeric rings on buffering

The influence of oligomeric rings on the buffering of free NaPy is investigated by simulating several buffering curves in which all model parameters except EM2 are kept constant. By changing only EM2, the stability of the oligomeric rings is changed directly. A considerable difference in the buffering curves is only observed when EM2 is a factor of 10 larger than EM1, a situation not often encountered without considerable preorganization by the linker (Figure 3.12).76,77 Thus, the influence of oligomeric rings on the buffering is minimal, strengthening the conclusion that monomeric rings are the main contributors to the buffering. The lack of a large influence of oligomeric rings on the buffering of free NaPy can be explained by the fact that there are fewer oligomeric rings when the ditopic UPys are mixed with NaPy. Upon addition of NaPy, the maximum fraction of oligomeric rings quickly drops below 10% (Figure 3.13). Indeed, as with any (supramolecular) polymerization, the average degree of polymerization decreases quickly when chain stoppers are introduced.


Figure 3.12: Predicted buffering curves using EM1 = 10 mM and assuming either strainless larger rings or specific EM2 values. KUPy‐UPy = 6 × 10^7 M^‐1, KUPy‐NaPy = 3 × 10^6 M^‐1, f = 1.

[Panels A) to D): fraction of chains (total), cycles (total), monomeric cycles and oligomeric cycles vs concentration [mM] on a logarithmic axis from 10^‐2 to 10^3 mM. A) x axis: C ditopic UPy. B) to D) x axis: C ditopic UPy = C NaPy / f.]

Figure 3.13 Simulated speciation plots of ditopic UPy molecules using (A) the ring‐chain model without addition of NaPy, (B‐D) the two‐component buffering model including NaPy (f = 0.5, 1 and 2, respectively). Model input values are: KUPy‐UPy = 6 × 10^7 M^‐1, KUPy‐NaPy = 3 × 10^6 M^‐1, EM1 = 10 mM.


3.4.6 Calculation of confidence interval on CNaPy free

There are three sources of error in the measurement of CNaPy free: integration of the 1H NMR resonance spectrum, the weight of the sample and the volume of the deuterated solvent. The integration has been reported to have a reliability of 5%,51 which we interpret as the integral having a relative standard deviation (RSD) of 5%. This gives the standard deviation, ∆y, since the RSD is defined as:

$$RSD = \frac{\Delta y}{\bar{y}} \times 100 \tag{3.29}$$

where ȳ is the mean. The RSDs on both the sample weight and the solvent volume are specified by the manufacturers of the scale and syringe to be 1%. The first step is to determine the standard deviation on the concentration of UPy groups present in UPy‐NaPy contacts, ∆CUPy‐NaPy, which is calculated using regular error propagation:

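$$C_{UPy\text{-}NaPy} = 2\,C_{ditopic\ UPy\ total}\,\frac{I_{UPy\text{-}NaPy}}{I_{total}}$$

$$\Delta C_{UPy\text{-}NaPy} = C_{UPy\text{-}NaPy}\sqrt{\left(\frac{\Delta C_{ditopic\ UPy}}{C_{ditopic\ UPy}}\right)^{2} + \left(\frac{\Delta I_{UPy\text{-}NaPy}}{I_{UPy\text{-}NaPy}}\right)^{2} + \left(\frac{\Delta I_{total}}{I_{total}}\right)^{2}} \tag{3.30}$$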

where C is the concentration, IUPy.NaPy is the integral value corresponding to UPy‐NaPy contacts and Itotal is the combined integral value of both UPy‐NaPy and UPy‐UPy contacts. The standard deviations of the total concentrations of ditopic UPy and NaPy are calculated as:

$$C = \frac{m}{M\,V_{CDCl_3}}$$

$$\Delta C = C\sqrt{\left(\frac{\Delta m}{m}\right)^{2} + \left(\frac{\Delta V_{CDCl_3}}{V_{CDCl_3}}\right)^{2}} \tag{3.31}$$

where m is the weighed mass, M is the molecular weight and VCDCl3 is the volume of deuterated chloroform used to prepare the sample. The standard deviations of the integrals of the 1H NMR resonances are defined as:


$$I_{x,total} = \sum_{x} I_{x}$$

$$\Delta I_{x,total} = \sqrt{\sum_{x}\left(\Delta I_{x}\right)^{2}} \tag{3.32}$$

where Ix,total is either the sum of the integrals of the UPy‐NaPy signals or the sum of the UPy‐NaPy and UPy‐UPy signals of hydrogens a‐c (Scheme 3.2), and Ix is the integral value of a proton in a single aggregate type. Note that in the experiments with bis(adamantyl) NaPy the signals of hydrogens a‐c show concentration‐dependent splitting of the UPy‐UPy resonances, which are difficult to deconvolute. This splitting is probably due to the formation of chains consisting of only ditopic UPy, i.e. not end‐capped chains, resulting from the lack of UPy‐NaPy binding. Thus, in this case the signal of proton d was used, which showed a more resolvable concentration‐dependent splitting.

Scheme 3.2 UPy‐UPy and UPy‐NaPy contacts.

Lastly, the standard deviation of the free NaPy concentration, ∆CNaPy free, is calculated as:

$$C_{NaPy\ free} = C_{NaPy\ total} - C_{UPy\text{-}NaPy}$$

$$\Delta C_{NaPy\ free} = \sqrt{\left(\Delta C_{NaPy\ total}\right)^{2} + \left(\Delta C_{UPy\text{-}NaPy}\right)^{2}} \tag{3.33}$$
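The chain of calculations leading to ∆CNaPy free can be sketched as below, assuming standard first‐order propagation for independent errors: relative errors add in quadrature for products and quotients, absolute errors for sums and differences. The sample values are hypothetical and chosen only to illustrate the arithmetic, and the correlation between IUPy‐NaPy and Itotal is neglected for simplicity.

```python
import math


def rel_quadrature(*rel_errors):
    """Combined relative error of a product or quotient of independent terms."""
    return math.sqrt(sum(r ** 2 for r in rel_errors))


def abs_quadrature(*abs_errors):
    """Combined absolute error of a sum or difference of independent terms."""
    return math.sqrt(sum(e ** 2 for e in abs_errors))


RSD_INTEGRAL = 0.05  # from the text: 5% on each integral
RSD_MASS = 0.01      # from the text: 1% on the weighed mass
RSD_VOLUME = 0.01    # from the text: 1% on the solvent volume

# Hypothetical sample (not measured data).
c_upy_total = 5.0    # mM ditopic UPy
c_napy_total = 10.0  # mM NaPy
i_upy_napy = 0.6     # integral of UPy-NaPy contacts
i_upy_upy = 0.4      # integral of UPy-UPy contacts

# Concentrations from weighing, cf. eq (3.31)
rel_c = rel_quadrature(RSD_MASS, RSD_VOLUME)
d_c_napy_total = c_napy_total * rel_c

# Summed integral, cf. eq (3.32)
i_total = i_upy_napy + i_upy_upy
d_i_upy_napy = RSD_INTEGRAL * i_upy_napy
d_i_total = abs_quadrature(d_i_upy_napy, RSD_INTEGRAL * i_upy_upy)

# UPy groups in UPy-NaPy contacts, cf. eq (3.30)
c_upy_napy = 2 * c_upy_total * i_upy_napy / i_total
d_c_upy_napy = c_upy_napy * rel_quadrature(
    rel_c, d_i_upy_napy / i_upy_napy, d_i_total / i_total
)

# Free NaPy, cf. eq (3.33)
c_napy_free = c_napy_total - c_upy_napy
d_c_napy_free = abs_quadrature(d_c_napy_total, d_c_upy_napy)
print(f"C_NaPy_free = {c_napy_free:.2f} +/- {d_c_napy_free:.2f} mM")
```

Because the free concentration is a difference of two larger numbers, its relative uncertainty grows as CUPy‐NaPy approaches CNaPy total.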


3.4.7 UV‐VIS data of ditopic UPy 3‐NaPy 2 mixtures

In addition to the 1H NMR measurements performed to obtain CNaPy free, several UV‐VIS measurements were performed on mixtures of ditopic UPy 3 and NaPy 2 (Figure 3.14). Before the spectra could be decomposed, concentration‐dependent spectra of both free and bound NaPy 2 were obtained. These reference spectra were used to determine the molar extinction coefficients between 315 and 370 nm. Those coefficients were subsequently used to fit the spectra of the mixtures in order to obtain the free NaPy concentration. The analysis was performed using a custom‐written Matlab script.
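A decomposition of this kind can be sketched as a linear least‐squares problem. The sketch assumes that the absorbance of a mixture is the sum of the contributions of free and bound NaPy (Beer‐Lambert law). It is not the Matlab script used here, and the extinction coefficients, wavelengths and concentrations are hypothetical.

```python
def fit_two_components(absorbance, eps_free, eps_bound, path_cm):
    """Least-squares c_free and c_bound for
    A(lambda) = path * (eps_free(lambda) * c_free + eps_bound(lambda) * c_bound)."""
    a = [path_cm * e for e in eps_free]
    b = [path_cm * e for e in eps_bound]
    s_aa = sum(x * x for x in a)
    s_bb = sum(x * x for x in b)
    s_ab = sum(x * y for x, y in zip(a, b))
    s_ay = sum(x * y for x, y in zip(a, absorbance))
    s_by = sum(x * y for x, y in zip(b, absorbance))
    det = s_aa * s_bb - s_ab ** 2           # determinant of the normal equations
    c_free = (s_ay * s_bb - s_by * s_ab) / det
    c_bound = (s_by * s_aa - s_ay * s_ab) / det
    return c_free, c_bound


# Hypothetical extinction coefficients (M^-1 cm^-1) at four wavelengths.
eps_free = [12000.0, 9000.0, 5000.0, 2000.0]
eps_bound = [6000.0, 8000.0, 9000.0, 7000.0]
PATH_CM = 0.1  # optical path length, as stated for Figure 3.14

# Synthetic mixture spectrum built from known concentrations (M).
true_free, true_bound = 2e-5, 8e-5
spectrum = [PATH_CM * (ef * true_free + eb * true_bound)
            for ef, eb in zip(eps_free, eps_bound)]

c_free, c_bound = fit_two_components(spectrum, eps_free, eps_bound, PATH_CM)
```

The fit is well conditioned only when the free and bound reference spectra differ clearly in shape over the fitted wavelength range.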

[Plot: absorption (a.u.) vs wavelength (300 to 400 nm). Legend: C = 1.1 × 10^‐2, 2.6 × 10^‐2, 5.3 × 10^‐2, 1.1 × 10^‐1, 2.7 × 10^‐1 and 5.3 × 10^‐1 mM.]

Figure 3.14 UV‐VIS spectra of ditopic UPy 3‐NaPy 2 mixtures at various concentrations. The optical path length was 0.1 cm and the spectra were obtained at a temperature of 20 degrees Celsius.

3.5 References

(1) Stuart, M. A. C.; Huck, W. T. S.; Genzer, J.; Müller, M.; Ober, C.; Stamm, M.; Sukhorukov, G. B.; Szleifer, I.; Tsukruk, V. V.; Urban, M.; Winnik, F.; Zauscher, S.; Luzinov, I.; Minko, S. Nat. Mater. 2010, 9 (2), 101.
(2) de las Heras Alarcón, C.; Pennadam, S.; Alexander, C. Chem. Soc. Rev. 2005, 34 (3), 276.
(3) Rybtchinski, B. ACS Nano 2011, 5 (9), 6791.
(4) Giuseppone, N.; Lehn, J.‐M. J. Am. Chem. Soc. 2004, 126 (37), 11448.
(5) , E.; Cormos, G.; Giuseppone, N. Chem. Soc. Rev. 2012, 41 (3), 1031.
(6) Busseron, E.; Ruff, Y.; Moulin, E.; Giuseppone, N. Nanoscale 2013, 5 (16), 7098.
(7) Ikeda, M.; Tanida, T.; Yoshii, T.; Kurotani, K.; Onogi, S.; Urayama, K.; Hamachi, I. Nat. Chem. 2014, 6 (6), 511.
(8) Nguyen, R.; Jouault, N.; Zanirati, S.; Rawiso, M.; Allouche, L.; Fuks, G.; Buhler, E.; Giuseppone, N. Soft Matter 2014, 10 (22), 3926.
(9) Lehn, J.‐M. Angew. Chem. Int. Ed. 2013, 52 (10), 2836.
(10) Lee, D. H.; Severin, K.; Ghadiri, M. R. Curr. Opin. Chem. Biol. 1997, 1 (4), 491.


(11) Rowan, S. J.; Cantrill, S. J.; Cousins, G. R. L.; Sanders, J. K. M.; Stoddart, J. F. Angew. Chem. Int. Ed. 2002, 41, 898.
(12) Vidonne, A.; Philp, D. Eur. J. Org. Chem. 2009, No. 5, 593.
(13) Giuseppone, N. Acc. Chem. Res. 2012, 45 (12), 2178.
(14) Li, J.; Nowak, P.; Otto, S. J. Am. Chem. Soc. 2013, 135 (25), 9222.
(15) Ludlow, R. F.; Otto, S. Chem. Soc. Rev. 2008, 37 (1), 101.
(16) Ross, J.; Arkin, A. P. Proc. Natl. Acad. Sci. 2009, 106 (16), 6433.
(17) Lehn, J.‐M. In Constitutional Dynamic Chemistry; Barboiu, M., Ed.; Springer Berlin Heidelberg: Berlin, Heidelberg, 2011; Vol. 322, pp 1–32.
(18) Saggiomo, V.; Hristova, Y. R.; Ludlow, R. F.; Otto, S. J. Syst. Chem. 2013, 4 (1), 1.
(19) Tjivikua, T.; Ballester, P.; Rebek Jr, J. J. Am. Chem. Soc. 1990, 112 (3), 1249.
(20) von Kiedrowski, G. In Bioorganic Chemistry Frontiers; Springer Berlin Heidelberg, 1993; Vol. 3, pp 113–146.
(21) Dadon, Z.; Wagner, N.; Ashkenasy, G. Angew. Chem. Int. Ed. 2008, 47 (33), 6128.
(22) Wagner, N.; Ashkenasy, G. J. Chem. Phys. 2009, 130 (16), 164907.
(23) Moulin, E.; Giuseppone, N. In Constitutional Dynamic Chemistry; Barboiu, M., Ed.; Springer Berlin Heidelberg: Berlin, Heidelberg, 2011; Vol. 322, pp 87–105.
(24) Carnall, J. M. A.; Waudby, C. A.; Belenguer, A. M.; Stuart, M. C. A.; Peyralans, J. J.‐P.; Otto, S. Science 2010, 327 (5972), 1502.
(25) Allen, V. C.; Robertson, C. C.; Turega, S. M.; Philp, D. Org. Lett. 2010, 12 (9), 1920.
(26) Wagner, N.; Ashkenasy, G. Chem. ‐ Eur. J. 2009, 15 (7), 1765.
(27) Perrin, D. D.; Dempsey, B. In Buffers for pH and Metal Ion Control; Springer, 1974.
(28) Cross, J. S.; Clausen, E. C. Biomass Bioenergy 1993, 4 (4), 277.
(29) Youssef, Y. A.; Ahmed, N. S. E.; Mousa, A. A.; El‐Shishtawy, R. M. J. Appl. Polym. Sci. 2008, 108 (1), 342.
(30) Oxtoby, D. W.; Gillis, H. P.; Nachtrieb, N. H. Principles of modern chemistry; Thomson/Brooks/Cole, 2002.
(31) Becskei, A.; Serrano, L. Nature 2000, 405 (6786), 590.
(32) Whitaker, W. R.; Davis, S. A.; Arkin, A. P.; Dueber, J. E. Proc. Natl. Acad. Sci. 2012, 109 (44), 18090.
(33) Buchler, N. E.; Louis, M. J. Mol. Biol. 2008, 384 (5), 1106.
(34) De Greef, T. F. A.; Smulders, M. M. J.; Wolffs, M.; Schenning, A. P. H. J.; Sijbesma, R. P.; Meijer, E. W. Chem. Rev. 2009, 109 (11), 5687.
(35) Guler, M. O.; Stupp, S. I. J. Am. Chem. Soc. 2007, 129 (40), 12082.
(36) Rodríguez‐Llansola, F.; Escuder, B.; Miravet, J. F. J. Am. Chem. Soc. 2009, 131 (32), 11478.
(37) Cantekin, S.; ten Eikelder, H. M. M.; Markvoort, A. J.; Veld, M. A. J.; Korevaar, P. A.; Green, M. M.; Palmans, A. R. A.; Meijer, E. W. Angew. Chem. Int. Ed. 2012, 51 (26), 6426.
(38) Rodriguez‐Llansola, F.; Meijer, E. W. J. Am. Chem. Soc. 2013, 135 (17), 6549.
(39) Rodríguez‐Llansola, F.; Meijer, E. W. J. Am. Chem. Soc. 2015, 137 (26), 8654.
(40) Teunissen, A. J. P.; Haas, R. J. C. van der; Vekemans, J. A. J. M.; Palmans, A. R. A.; Meijer, E. W. Bull. Chem. Soc. Jpn. 2016, 89 (3), 308.
(41) Jacobson, H.; Stockmayer, W. H. J. Chem. Phys. 1950, 18 (12), 1600.


(42) Bastings, M. M. C.; de Greef, T. F. A.; van Dongen, J. L. J.; Merkx, M.; Meijer, E. W. Chem. Sci. 2010, 1 (1), 79.
(43) Ercolani, G.; Mandolini, L.; Mencarelli, P.; Roelens, S. J. Am. Chem. Soc. 1993, 115 (10), 3901.
(44) Mandolini, L. In Advances in Physical Organic Chemistry; Bethell, V. G. and D., Ed.; Academic Press, 1986; Vol. 22, pp 1–111.
(45) Ligthart, G. B. W. L.; Ohkawa, H.; Sijbesma, R. P.; Meijer, E. W. J. Am. Chem. Soc. 2005, 127 (3), 810.
(46) Teunissen, A. J. P.; Nieuwenhuizen, M. M. L.; Rodríguez‐Llansola, F.; Palmans, A. R. A.; Meijer, E. W. Macromolecules 2014, 47 (23), 8429.
(47) Söntjens, S. H. M.; Sijbesma, R. P.; van Genderen, M. H. P.; Meijer, E. W. J. Am. Chem. Soc. 2000, 122 (31), 7487.
(48) Lafitte, V. G. H.; Aliev, A. E.; Horton, P. N.; Hursthouse, M. B.; Hailes, H. C. Chem. Commun. 2006, No. 20, 2173.
(49) Hunter, C. A.; Anderson, H. L. Angew. Chem. Int. Ed. 2009, 48 (41), 7488.
(50) Ercolani, G.; Schiaffino, L. Angew. Chem. Int. Ed. 2011, 50 (8), 1762.
(51) Bain, A. D.; Fahie, B. J.; Kozluk, T.; Leigh, W. J. Can. J. Chem. 1991, 69 (8), 1189.
(52) Felder, T.; de Greef, T. F. A.; Nieuwenhuizen, M. M. L.; Sijbesma, R. P. Chem. Commun. 2014, 50 (19), 2455.
(53) Jorgensen, W. L.; Pranata, J. J. Am. Chem. Soc. 1990, 112 (5), 2008.
(54) Pranata, J.; Wierschke, S. G.; Jorgensen, W. L. J. Am. Chem. Soc. 1991, 113 (8), 2810.
(55) Quinn, J. R.; Zimmerman, S. C.; Del Bene, J. E.; Shavitt, I. J. Am. Chem. Soc. 2007, 129 (4), 934.
(56) Teunissen, A. J. P. Competing Interactions in Chemical Reaction Networks. PhD Thesis, Eindhoven University of Technology: Eindhoven, 2017.
(57) Zimmerman, S. C.; Corbin, P. S. In Molecular Self‐Assembly Organic Versus Inorganic Approaches; Springer, 2000; pp 63–94.
(58) Wilson, A. J. Soft Matter 2007, 3 (4), 409.
(59) Fegan, A.; White, B.; Carlson, J. C. T.; Wagner, C. R. Chem. Rev. 2010, 110 (6), 3315.
(60) Hobert, E. M.; Doerner, A. E.; Walker, A. S.; Schepartz, A. Isr. J. Chem. 2013, 53 (8), 567.
(61) Evers, T. H.; van Dongen, E. M. W. M.; Faesen, A. C.; Meijer, E. W.; Merkx, M. Biochemistry (Mosc.) 2006, 45 (44), 13183.
(62) Wilkinson, M. J.; Leeuwen, P. W. N. M. van; Reek, J. N. H. Org. Biomol. Chem. 2005, 3 (13), 2371.
(63) Dydio, P.; Breuil, P.‐A. R.; Reek, J. N. H. Isr. J. Chem. 2013, 53 (1–2), 61.
(64) Gianneschi, N. C.; Cho, S.‐H.; Nguyen, S. T.; Mirkin, C. A. Angew. Chem. 2004, 116 (41), 5619.
(65) Yoon, H. J.; Kuwabara, J.; Kim, J.‐H.; Mirkin, C. A. Science 2010, 330 (6000), 66.
(66) Wiester, M. J.; Ulmann, P. A.; Mirkin, C. A. Angew. Chem. Int. Ed. 2011, 50 (1), 114.
(67) Adams, H.; Chekmeneva, E.; Hunter, C. A.; Misuraca, M. C.; Navarro, C.; Turega, S. M. J. Am. Chem. Soc. 2013, 135 (5), 1853.
(68) Sun, H.; Hunter, C. A.; Navarro, C.; Turega, S. J. Am. Chem. Soc. 2013, 135 (35), 13129.


(69) Skoog, D. A.; West, D. M.; Holler, F. J.; Crouch, S. R. Fundamentals of Analytical Chemistry, 8th ed.; Cengage: Belmont, CA, 2004.
(70) Zhang, Q.; Bhattacharya, S.; Andersen, M. E. Open Biol. 2013, 3 (4), 130031.
(71) Mukherji, S.; Ebert, M. S.; Zheng, G. X. Y.; Tsang, J. S.; Sharp, P. A.; van Oudenaarden, A. Nat. Genet. 2011, 43 (9), 854.
(72) Douglass, E. F.; Miller, C. J.; Sparer, G.; Shapiro, H.; Spiegel, D. A. J. Am. Chem. Soc. 2013, 135 (16), 6092.
(73) Ercolani, G.; Piguet, C.; Borkovec, M.; Hamacek, J. J. Phys. Chem. B 2007, 111 (42), 12195.
(74) Rodríguez‐Llansola, F.; Meijer, E. W. J. Am. Chem. Soc. 2013, 135 (17), 6549.
(75) Press, W. H.; Flannery, B. P.; Teukolsky, S. A.; Vetterling, W. T. Numerical Recipes in C: The Art of Scientific Computing, 2nd ed.; Cambridge University Press: Cambridge; New York, 1992.
(76) Folmer, B. J. B.; Sijbesma, R. P.; Kooijman, H.; Spek, A. L.; Meijer, E. W. J. Am. Chem. Soc. 1999, 121 (39), 9001.
(77) ten Cate, A. T.; Kooijman, H.; Spek, A. L.; Sijbesma, R. P.; Meijer, E. W. J. Am. Chem. Soc. 2004, 126 (12), 3801.


4

Model‐driven engineering of improved supramolecular buffering by multivalency

Abstract

A supramolecular system in which the concentration of a molecule is buffered over several orders of magnitude is presented. Molecular buffering is achieved as a result of competition in a ring‐chain equilibrium of multivalent ureidopyrimidinone monomers and a monovalent naphthyridine molecule which acts as an end‐capper. While in the previous chapter only divalent ureidopyrimidinone monomers are considered, we now present a model‐driven engineering approach to improve molecular buffering using multivalent ring‐chain systems. Our theoretical models reveal an odd‐even effect where even‐valent molecules show superior buffering capabilities. Furthermore, we predict that supramolecular buffering can be significantly improved using a tetravalent instead of a divalent molecule, since the tetravalent molecule can form two intramolecular rings with different “stabilities” due to statistical effects. Our model predictions are validated against experimental 1H NMR data, demonstrating that model‐driven engineering has considerable potential in supramolecular chemistry.

This chapter has been submitted as:

T.F.E. Paffen, A.J.P. Teunissen, T.F.A. de Greef, E.W. Meijer, Model‐driven Engineering of Improved Supramolecular Buffering by Multivalency, Proc. Natl. Acad. Sci., in revision.

Chapter 4

4.1 Introduction

The high level of complexity found in biochemical systems often necessitates the synergy of a combined experimental and theoretical study.1,2 Moreover, a well‐established approach in the fields of systems and synthetic biology is to develop novel functionalities by modeling the required molecular mechanisms before any experimental work is performed, i.e. model‐driven engineering.3 The development of synthetic supramolecular systems is currently at a level of complexity that requires a similar synergistic experimental and theoretical treatment.4,5 Analogously, theoretical descriptions are beginning to approach the level of predictive accuracy required for model‐driven engineering.6 Multivalency is a ubiquitous phenomenon in biochemical systems that is associated with high binding affinity, increased selectivity as well as ultrasensitivity.7–11 Since these properties can be of invaluable use in unprecedented molecular engineering approaches, multivalency is often applied in synthetic systems.12,13 For example, research has shown that multivalent medication can have much lower toxicity while simultaneously having higher medical efficacy.9,14 More recently, multivalency has been recognized as a key molecular driving force in the formation of membraneless organelles in living cells.15 These phase‐separated cellular bodies are organized by dynamic multivalent interactions between proteins and RNA scaffolds and offer a compartmentalized liquid environment that promotes specific enzymatic reactions due to high local concentrations and insulates these reactions from competing substrates.16 Theoretical and experimental studies of multivalent systems have revealed several design parameters that are critical in obtaining effective multivalent constructs. 
Next to the binding affinity, linker flexibility plays an important role: rigid linkers require extremely precise ligand positioning to obtain high binding affinities and selectivity, while flexible linkers offer more freedom in molecular design at the cost of lower affinity and selectivity.17–19 Furthermore, additional competing equilibria can be used to enhance binding selectivity or to steer an assembly towards a preferred state.4,20 Most, if not all, multivalent biochemical systems are based on dimerization by specific host‐guest binding with minimal host‐host and/or guest‐guest interactions. As a result, most studies in multivalency focus on those types of systems. However, with the rise of supramolecular chemistry, an increasing number of multivalent constructs that have self‐associating groups are becoming available. The addition of self‐association broadens the behavior of the multivalent constructs to include more possibilities for intramolecular cyclization. In divalent homodimerizing systems, this can have various interesting effects such as mechanically induced gelation, entropy driven polymerization, or light switchable gelation.21–23 Self‐associating constructs with higher valencies are reported less often and are typically used for their gelation properties, where cyclization leads to less ‘effective’ gelation, or as polymer glasses.24–27 In chapter 3, we analyzed a two‐component supramolecular buffering system based on a self‐associating divalent ureidopyrimidinone (UPy) molecule and a monovalent naphthyridine (NaPy) molecule that undergoes dimerization with UPy chain ends. The buffering of NaPy originates from the competition between cyclization of the divalent UPy and end‐capping of linear oligomers by NaPy. Since the effectiveness of the buffering is controlled to a large degree by the cyclization tendency, we hypothesized that multivalent constructs with higher valencies might lead to improved buffering. Therefore, we present a systematic study in which we investigate how multivalency affects supramolecular buffering using a model‐driven engineering approach.

4.2 Results

To analyze how multivalency affects supramolecular buffering, we expanded our previous model describing supramolecular buffering by divalent UPy monomers.28 To this end, we developed models that describe ring‐chain competition of tri‐ and tetravalent UPy monomers, followed by the inclusion of NaPy dimerization with UPy chain ends. Models for the dimerization of monovalent molecules have already been established.29 While there is a multitude of theoretical models available that describe ring‐chain equilibria of tri‐ or tetravalent molecules with host‐guest binding groups, self‐association has, to the best of our knowledge, not been included.9,14,30–32 Furthermore, even for non‐cyclizing tri‐ and tetravalent molecules, it is analytically intractable to include aggregates with high degrees of polymerization (DPs) due to the exponential increase in the number of molecular species as a function of DP. Moreover, the inclusion of cyclization further increases the number of distinct species. However, since the stopper molecule will limit the formation of larger assemblies, the expected DP will remain low at intermediate concentrations as studied here. Thus, mass balances for the tri‐ and tetravalent molecules were constructed up to a DP of four for the multivalent molecule (section 4.4.1). The Jacobson‐Stockmayer theory, which describes the polymerization and cyclization of divalent molecules in reversible covalent polymerizations, forms the basis for the constructed models.33 The theory has been refined by allowing finite intermolecular binding constants (Kinter), i.e. supramolecular contacts instead of covalent bonds.34 In Jacobson‐Stockmayer theory, cyclization is taken into account via the effective molarity, which is the experimentally measured cyclization tendency of a chain consisting of i divalent molecules (EMi). When the value of EM for any i is known, the remaining values can be predicted by assuming that the linker follows Gaussian chain statistics, i.e. the linker is strainless. If the linker is not strainless, which is sometimes observed experimentally for relatively short linkers (<30 atoms), the behavior can be described using multiple EMi values.35
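For a strainless linker, the Jacobson‐Stockmayer scaling EMi = EM1 · i^(−5/2) fixes all ring stabilities once EM1 is known, and summing the monomer held in rings of all sizes gives a critical concentration of about 2.612 · EM1. The sketch below evaluates both. The function names are hypothetical, and the value EM1 = 10 mM is taken from the simulation input of Fig. 4.1 as an example.

```python
def effective_molarity(i, em1):
    """EM of a strainless ring of i monomers: EM_i = EM_1 * i**(-5/2)."""
    return em1 * i ** -2.5


def critical_concentration(em1, n_terms=10_000):
    """Monomer held in rings at saturation: sum of i * EM_i over all ring sizes."""
    partial = sum(i * effective_molarity(i, em1) for i in range(1, n_terms + 1))
    tail = 2.0 * em1 / n_terms ** 0.5  # integral estimate of the omitted terms
    return partial + tail


EM1_MM = 10.0  # mM, example value
em2 = effective_molarity(2, EM1_MM)
c_cr = critical_concentration(EM1_MM)
print(f"EM2 = {em2:.2f} mM, Ccr = {c_cr:.1f} mM")
```

When the linker is strained, the EMi values no longer follow this scaling and have to be supplied individually.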


In the theoretical models for the tri‐ and tetravalent molecules, cycle stabilities are calculated by assuming that the ring‐closure equilibrium constant of a chain with i tri‐ or tetravalent molecules is equal to that of a chain of i divalent molecules, while differences in statistical factors are taken into account (Fig. 4.1A‐B; section 4.4.1). The subsequent inclusion of NaPy dimerization with UPy chain ends leads to a considerable increase in the number of species as a result of combinatorial complexity.36 The input parameters for the models are the intermolecular UPy‐UPy and UPy‐NaPy equilibrium binding constants (KUPy‐UPy and KUPy‐NaPy), the effective molarity of the monomeric ring (EM1), and the ratio of NaPy to multivalent UPy (f). Using estimated values for the parameters that are appropriate for the multivalent UPy and NaPy system, both stopper titrations and 1:1 dilutions were simulated (Fig. 4.1C‐D). In the stopper titration simulations, the free NaPy concentration is calculated as a function of the total concentration of NaPy while the concentration of multivalent UPy is constant. The multivalent UPy concentration is set equal to the EM of the divalent molecule to ensure that no linear species are present before NaPy is added. The simulated titration curves, displaying the free concentration of NaPy as a function of the total NaPy concentration, reveal a shallow slope at low NaPy concentrations and a steep transition at the equivalence point for all valencies (Fig. 4.1C). Interestingly, at low NaPy concentrations, both the mono‐ and trivalent UPy titration curves have a slope of unity, while the di‐ and tetravalent UPy curves have lower slopes, the latter indicative of buffering of free NaPy by cyclic species.28 Furthermore, the slope of the tetravalent UPy titration curve at low NaPy concentrations is lower than that of the divalent. This lower slope is attributed to the fact that the two intramolecular cycles of the tetravalent UPy monomer have different stabilities. The difference in stability is not due to any cooperativity, but solely due to a difference in statistical factors for ring formation (section 4.4.1). Thus, during the stopper titration, the binding of NaPy stopper to the cyclized tetravalent molecule proceeds in a two‐step process, opening the cycles one by one (Fig. 4.2). This effect suggests that a multivalent construct with an even higher valency might lower the slope to zero, assuming it would be able to dissolve. 
As expected, the simulated trivalent UPy titration curve overlaps with that of the monovalent UPy at low concentrations, since the first UPy‐UPy contact to break upon addition of NaPy is the intermolecular contact.4 Thus, at low concentrations, the trivalent UPy acts as a monovalent UPy dimer with two additional cycles.
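The simplest of the titration curves, that of a monovalent UPy, can be sketched from two mass balances. This is an illustration under stated assumptions and not the model of section 4.4.1: the homodimer concentration is taken as KUPy‐UPy · u², the heterodimer as KUPy‐NaPy · u · n, the parameter values are those of Fig. 4.1, and all names are hypothetical.

```python
import math

K_UU = 6e7       # UPy-UPy association constant, M^-1 (input of Fig. 4.1)
K_UN = 5e6       # UPy-NaPy association constant, M^-1 (input of Fig. 4.1)
U_TOTAL = 10e-3  # total monovalent UPy, M (assumed equal to CmultiUPy)


def free_napy(n_total, u_total=U_TOTAL, k_uu=K_UU, k_un=K_UN):
    """Free NaPy from the mass balances
    u_total = u + 2*k_uu*u**2 + k_un*u*n  and  n_total = n + k_un*u*n."""
    def excess(u):
        n = n_total / (1.0 + k_un * u)
        return u + 2.0 * k_uu * u * u + k_un * u * n - u_total

    low, high = 0.0, u_total   # excess(low) < 0 < excess(high)
    for _ in range(200):       # bisection on the free UPy concentration
        mid = 0.5 * (low + high)
        if excess(mid) > 0.0:
            high = mid
        else:
            low = mid
    u = 0.5 * (low + high)
    return n_total / (1.0 + k_un * u)


def two_fold_sensitivity(n_total):
    """Ratio of free NaPy after and before doubling the total NaPy."""
    return free_napy(2.0 * n_total) / free_napy(n_total)


# log-log slope far below the equivalence point
low_slope = math.log2(two_fold_sensitivity(1e-6))
# response to a doubling of total NaPy that crosses the equivalence point
near_equivalence = two_fold_sensitivity(7e-3)
print(f"slope at low NaPy: {low_slope:.3f}")
print(f"two-fold response across equivalence: {near_equivalence:.1f}")
```

In this sketch the free UPy concentration hardly changes at low NaPy concentrations, so the free NaPy concentration is proportional to the total NaPy concentration, which is the slope of unity described above.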


Figure 4.1 (A‐B) A selection of species used in the tri‐ and tetravalent UPy models. Partially cyclized species and species larger than dimeric are omitted for clarity. Statistical factors are based on the reference reactions of inter‐ and intramolecular UPy‐UPy dimerization shown in the squares. (C) Simulated stopper titration showing the predicted free NaPy concentration versus the total NaPy concentration (solid lines) and the maximum two‐fold sensitivity (inset). (D) Simulated equimolar dilution showing the predicted free NaPy concentration versus the total multivalent UPy and NaPy concentration (main) and the fraction of cycles during the dilution (inset). The dashed lines indicate unreliable model predictions, where the fraction of species that have the highest degree of polymerization included in the model is larger than 30%. The input parameters for the predictions in (C) and (D) are: CmultiUPy = 10 mM, KUPy‐UPy = 6 × 10^7 M^‐1, KUPy‐NaPy = 5 × 10^6 M^‐1, and EM1,divalent = 10 mM.


Figure 4.2 Simulated UPy speciation during titration with NaPy. The input parameters for the speciation are the same as in Fig. 4.1C. The colors of the lines correspond to the DP of the tetravalent UPy (1: blue, 2: red, 3: green).

At NaPy concentrations around the equivalence point, all simulated curves show a sharp increase in the free NaPy concentration, similar to the ultrasensitive response observed in molecular titration.10 The magnitude of the response in this concentration regime can be characterized by the two‐fold sensitivity, which is defined as the change in output (free NaPy concentration) when a two‐fold change in input is applied (total NaPy concentration).10 Interestingly, the titration curves corresponding to the mono‐ and divalent UPy constructs have similar two‐fold sensitivities, while the titration curves of the tri‐ and tetravalent constructs have increasing sensitivities, indicating sharper transitions. This suggests that multivalency can be used to generate sharper transitions and improve supramolecular buffering in this system. For the 1:1 dilution simulations, both the concentrations of multivalent UPy and NaPy are changed simultaneously, keeping the ratio between the two constant at a value of unity. Previously, we have shown that for a divalent UPy construct, the concentration of free NaPy is independent of the total concentration of NaPy and UPy over a broad concentration regime (chapter 3).28 The simulations reveal that both the tri‐ and tetravalent UPy constructs should display a broad plateau indicative of supramolecular buffering (Fig. 4.1D). Interestingly, while the supramolecular buffering of the trivalent UPy construct only occurs at higher concentrations (compared to the divalent construct), the tetravalent UPy construct buffers at the same concentrations as the divalent construct but yields a broader buffering plateau. The inferior buffering by the trivalent construct is not entirely unexpected, since competition between cyclization and end capping is a key requirement for supramolecular buffering. 
Because the first association of the NaPy stopper with the trivalent molecule will initially disrupt any intermolecular UPy‐UPy contacts, the buffering is expected to occur only at higher concentrations when cycles are opened. The superior buffering by the tetravalent molecule is attributed to the fact that it lacks a critical concentration above which only chains are present, as is the case for divalent ring‐chain equilibria.28,33,34 Instead, the formation of intermolecular contacts does not prevent cyclization of the remaining UPy moieties, allowing further competition between cyclization and end capping and a continuation of the buffering plateau (Fig. 4.1D, inset). Therefore, it is expected that multivalent constructs with even higher valencies will show a similarly extended supramolecular buffering plateau.

To validate the model predictions, a library of multivalent UPy molecules was synthesized by Bram Teunissen, with valencies ranging from mono‐ to tetravalent (Fig. 4.3A).37 In an effort to exclude influence from steric repulsion on the linker flexibility and subsequently the cyclization tendency of multivalent UPys 2‐4, the di‐ and trivalent UPys (2 and 3) were equipped with methyl groups at the central branching position. While this approach does not completely exclude variations in the linker flexibility, it does prevent any additional attractive interactions between linker segments by avoiding heteroatoms.

To determine the correct model parameters to be used in the validation of the tri‐ and tetravalent UPy models, KUPy‐NaPy and EM1 were determined experimentally. The value of KUPy‐UPy is fixed at the reported value for 6‐methylureidopyrimidinone groups in CHCl3 (KUPy‐UPy = 6 × 10⁷ M⁻¹).28,38 The correlated value of KUPy‐NaPy was determined by first measuring 1H NMR spectra of equimolar mixtures containing monovalent UPy 1 and NaPy 5, followed by fitting of the measured distribution of UPy‐UPy and UPy‐NaPy contacts with a simple binding model that includes self‐association of the UPy groups and UPy‐NaPy dimerization (KUPy‐NaPy = (3.1 ± 0.2) × 10⁶ M⁻¹; section 4.4.2). The determination of the EM of the divalent molecule was performed by measuring concentration‐dependent 1H NMR spectra of divalent UPy 2 and subsequently fitting the data with a ring‐chain model for divalent molecules (EM1 = 5.3 ± 0.3 mM; section 4.4.2).34

Gratifyingly, using the optimized values of KUPy‐NaPy and EM1 to predict the buffering in a 1:1 dilution experiment of NaPy 5 and divalent UPy 2 mixtures results in an excellent prediction of the free NaPy concentration over a broad concentration range (Fig. 4.3B). Thus, with the parameters for dimerization and cyclization determined, the model predictions of the tri‐ and tetravalent models were tested. Various dilution experiments on mixtures of NaPy 5 and either trivalent UPy 3 or tetravalent UPy 4 were performed using 1H NMR spectroscopy while keeping f, the ratio between NaPy and multivalent UPy, constant (Fig. 4.3C‐D). While the model predictions completely overlap with the measured free NaPy concentration within the confidence bounds, we note that at low fractions of free NaPy the measurement of free NaPy concentration is not reliable. The free NaPy concentration is calculated indirectly from the total NaPy concentration and the concentration of UPy‐NaPy contacts, which leads to an increase in uncertainty at low free NaPy fractions (section 3.2.2). The concentration of UPy‐NaPy contacts is obtained from the UPy N‐H resonances in the 1H NMR spectra. To better validate both models, two separate global fits of the UPy N‐H resonances of 3 and 4 were performed with three free parameters (KUPy‐UPy, KUPy‐NaPy and EM1; section 4.4.3). Since the UPy speciation follows directly from the 1H NMR spectra, it is more reliable than the indirect calculation of the free NaPy concentration. The three UPy N‐H resonances in the 1H NMR spectra of tetravalent UPy 4 and NaPy 5 mixtures were assigned to UPy‐UPy contacts in monomeric rings of monomers, UPy‐UPy contacts in oligomeric rings and linear species, and UPy‐NaPy contacts (Fig. 4.4A). The speciations could be fitted well with the two models, and the best fits almost overlap with the model predictions based on the reference experiments (Fig. 4.4B, 4.11B, and 4.12A‐C). As expected, the best fit parameter values show that the values of KUPy‐UPy and KUPy‐NaPy are correlated linearly (Fig. 4.4C and 4.11C).

Figure 4.3 (A) Molecular structures of multivalent UPys 1‐4, NaPy 5 and DPU 6, used in this study. (B) 1H NMR data (squares) and model predictions (lines) of the free NaPy concentration in dilution experiments using mixtures containing divalent UPy 2 and f equivalents of NaPy 5. (C‐D) 1H NMR data (squares) and model predictions (lines) of the free NaPy concentration in dilution experiments using mixtures containing f equivalents of NaPy 5 and one equivalent of either trivalent UPy 3 (C) or tetravalent UPy 4 (D). The shaded regions denote the standard deviation of the model predictions. Error bars denote 95 % confidence intervals based on assumed relative standard deviations of the mass, volume and NMR integral (1, 1, and 5 %, respectively).28

Figure 4.4 (A) A subset of 1H NMR spectra obtained during the dilution experiments of tetravalent UPy 4 and NaPy 5 (f = 1). (B) Peak fractions as obtained from the 1H NMR spectra (circles, f = 1), the best fit using the tetravalent UPy model (lines), and the model predictions based on the parameter values determined in the reference experiments with monovalent UPy 1, NaPy 5, and divalent UPy 2 (dashed lines). (C) Fit parameter log10(values) after optimization with a squared 2‐norm within 5% (red dots) and above 5% (blue dots) of the best fit.

To further validate the tetravalent UPy model, several titrations with N,N’‐di‐2‐pyridylurea (DPU) 6 were performed on equimolar mixtures of tetravalent UPy 4 and NaPy 5 (Fig. 4.5 and 4.13). DPU 6 selectively binds to NaPy 5 due to its complementary ADDA hydrogen bonding array, effectively sequestering NaPy from the mixture.39 Furthermore, DPU 6 has no interaction with UPy groups, since the UPy groups cannot tautomerize to the complementary DAAD configuration. 1H NMR spectra obtained during the titration showed that the UPy‐NaPy contacts were disrupted, which is in line with NaPy sequestration by DPU, and that the fraction of UPy‐UPy contacts in monomeric rings increased, which is consistent with the dilution of the tetravalent UPy (Fig. 4.5B‐E).

Figure 4.5 (A) Representation of the titration of equimolar mixtures of tetravalent UPy 4 and NaPy 5, using DPU 6 as the titrant. The addition of DPU causes UPy‐NaPy contacts to break in favor of UPy‐UPy and NaPy‐DPU contacts. The titrant solution contains only DPU, which causes dilution of tetravalent UPy 4 and NaPy 5 during the titration. (B) Fit (lines) of 1H NMR data (markers) of the DPU titration (CtetravalentUPy,0 = CNaPy,0 = 1 mM). (C) UPy speciation. (D) NaPy contacts during the titration. Only NaPy‐UPy contacts could be deduced from the 1H NMR spectrum. (E) NaPy speciation. In the speciation plots (C + E), species with a DP > 3 of the tetravalent UPy are omitted for clarity. The colors of the lines correspond to the DP of the tetravalent UPy (1: blue, 2: red). Error bars denote 95% confidence intervals based on assumed relative standard deviations of the mass, volume and NMR integral (1, 1, and 5%, respectively).

To validate the titration data against model predictions, the tetravalent UPy model was adapted to include NaPy‐DPU dimerization and DPU self‐association (section 4.4.4). Since the reported binding constants for DPU were only approximately determined, a global fit of the titration data was performed using two free parameters (KDPU‐NaPy and KDPU‐DPU). The values for KUPy‐NaPy and EM were set to those determined by the reference experiments with monovalent UPy 1, NaPy 5 and divalent UPy 2, vide supra. The titration data could be fitted well, and the optimized parameters are in close correspondence with the reported approximate values (Fig. 4.5B and D; section 4.4.4).

4.3 Discussion

Both the DPU titration and the data of multivalent UPy‐NaPy mixtures show that the models for the tri‐ and tetravalent molecules can sufficiently describe the behavior of the multivalent molecules in the presence of NaPy stopper. Therefore, the models are correct in predicting the inferior buffering of the trivalent UPy and the superior buffering of the tetravalent UPy molecule, compared to the divalent construct. The good agreement between model predictions and experimental results shows that model‐driven engineering is an outstanding strategy to investigate new molecular topologies. While this improved supramolecular buffering system does not yet approach the performance of pH buffers in titration experiments, where slopes of zero can be obtained, we do show that multivalency can be used to improve both the capacity of the buffer and the sensitivity of the response. Our study on the effects of multivalency on supramolecular buffering revealed an odd‐even effect, where the buffering by molecules with odd‐numbered valencies is significantly inferior to that by molecules with even valencies. Curiously, a similar odd‐even effect was found in the catalytic activity of multivalent dendrimers equipped with catalysts capable of both single and double site catalysis.40 Supramolecular buffering can be substantially improved by employing a tetravalent molecule, as it is able to form two intramolecular rings with different stabilities due to statistical factors. Furthermore, we show that multivalency, while mostly employed to generate sharper responses, can also generate systems that are insensitive to changes in concentration.
The present system might be further developed by utilizing the tetravalent molecules described here analogously to multivalent protein and RNA constructs that form phase‐separated cellular bodies.16,41 Cellular bodies use phase separation to buffer components, isolate incompatible substrates or catalysts, and promote specific reaction rates by changes in local concentrations. Therefore, such an approach may provide a next step in the construction of artificial cells while simultaneously providing a fundamental framework for the effects of phase separation. The multivalent constructs could also be incorporated in chemical reaction networks, as combining multivalency and catalysis could lead to increased control over the reaction rate. It would allow for sharper switching between the on and off state and higher rates of catalysis due to increased local concentration. The kinetics of multivalent catalysts and multivalent substrates have been investigated in detail,40,42 and their incorporation in chemical reaction networks could shed more light on analogous molecular mechanisms in biochemical pathways.

4.4 Experimental section

Simulations were performed using the Matlab software package (R2016a, version 9.0.0.341360, Mathworks) along with its optimization, curve fitting and symbolic math toolboxes. Where appropriate, mass balances were solved analytically using the Mathematica software package (version 9.0.1.0, Wolfram Research, Inc.). Otherwise, mass balances were solved numerically using either the fzero or the fsolve function included in Matlab. Non‐linear least squares optimizations were performed using the lsqcurvefit function from Matlab's optimization toolbox. This function uses the Levenberg‐Marquardt method to minimize the residual sum of squares. A thousand fits were performed for each optimization. Initial parameters for the fits were distributed using Latin hypercube sampling (implemented in the lhsdesign function), which spreads the starting points evenly over the multidimensional parameter space and thereby makes it more likely that the global optimum is found. The optimization with the lowest squared 2‐norm is used as the best fit, while optimizations with a squared 2‐norm within 5 % of the best fit are considered equally good fits.
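The multi‐start procedure described above can be illustrated with a minimal Python sketch. It is not the Matlab code used in this work: the function names, the log10 parameter bounds and the choice to sample the starting points uniformly in log10 space are assumptions made for illustration only, and the least squares fit itself is left out.

```python
import random

def latin_hypercube(n_samples, n_dims, rng):
    """Return n_samples points in the unit hypercube, one per stratum in every dimension."""
    columns = []
    for _ in range(n_dims):
        # Draw one point uniformly inside each of the n_samples equal-width strata ...
        strata = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # ... and shuffle, so that strata are paired at random across dimensions.
        rng.shuffle(strata)
        columns.append(strata)
    return [tuple(col[i] for col in columns) for i in range(n_samples)]

def scale_log10(unit_point, log_bounds):
    """Map a unit-cube point onto parameter values spread uniformly in log10 space."""
    return tuple(10 ** (lo + u * (hi - lo)) for u, (lo, hi) in zip(unit_point, log_bounds))

def equally_good(sq_norms, tolerance=0.05):
    """Indices of fits whose squared 2-norm lies within `tolerance` of the lowest one."""
    best = min(sq_norms)
    return [i for i, s in enumerate(sq_norms) if s <= best * (1 + tolerance)]

# A thousand starting points for two parameters; the log10 bounds are hypothetical.
rng = random.Random(0)
starts = [scale_log10(p, [(5, 9), (4, 8)]) for p in latin_hypercube(1000, 2, rng)]
```

Each starting point would be passed to a local least squares optimizer, after which `equally_good` selects the fits that are reported alongside the best one.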

4.4.1 Model description of tri‐ and tetravalent UPys with NaPy

Here we consider the aggregation of tri‐ and tetravalent molecules in the presence of a stopper molecule. As the stopper molecule limits the aggregation of the multivalent molecules, the models are limited to a degree of polymerization (DP) of four for the multivalent molecule. This limitation is also a practical one, as the number of species with a DP of four for the tetravalent molecule is 86 (including NaPy bound species), and the extrapolated number of species for a DP of five is >100. First, we consider the self‐association of a trivalent UPy monomer. The monomer can cyclize only once, leaving a free binding group (Fig. 4.6). When Kinter is sufficiently high, this cyclized species has a large energetic penalty due to the presence of a free binding group.

Since the models are made for multivalent UPy molecules which have a high Kinter, we limit the species of the multivalent molecules to those which are fully, or to the highest degree, cyclized. Partially cyclized species are only included with the inclusion of monovalent stopper, vide infra. Interestingly, odd‐numbered aggregates always leave a single free binding group upon cyclization. With a large Kinter, the energetic penalty for these species is so high that they are essentially not populated. Thus, a trivalent, or any odd‐valent, molecule with a high Kinter will always aggregate in even‐numbered sizes in solution.

Figure 4.6 Trivalent UPy species included in the model and their corresponding stabilities.

As stated before, the inclusion of the stopper molecule leads to the inclusion of partially cyclized species as well, since free binding groups can now bind to the stopper (Fig. 4.7). Thus, the number of species is greatly increased. With the stabilities of all species determined, the species were all included in two coupled mass balances: one for the trivalent UPy and one for NaPy (Equation 4.1; see Fig. 4.6 and 4.7). The balances cannot be solved analytically, since they are fourth and sixth order polynomials (UPy and NaPy, respectively). Therefore, the numerical function fsolve was used. To reduce computational time, a Jacobian matrix was supplied, which was calculated using the jacobian function from the symbolic math toolbox.
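How an analytical Jacobian enters the numerical solution of two coupled mass balances can be sketched in Python. The example is a simplified stand‐in and not the trivalent model of Equation 4.1: it assumes a monovalent UPy that only self‐dimerizes or binds NaPy, it replaces fsolve by a hand‐written damped Newton iteration, and it solves for the logarithms of the free concentrations, which is a design choice made here to keep both concentrations positive. All names are hypothetical.

```python
import math

def solve_mass_balances(u_total, n_total, k_uu, k_un, tol=1e-12, max_iter=200):
    """Damped Newton iteration with an analytical Jacobian for a monovalent toy model.

    Unknowns are x = ln[U] and y = ln[N], the logarithms of the free UPy and
    NaPy concentrations. Returns the free concentrations ([U], [N]) in M.
    """
    x, y = math.log(u_total), math.log(n_total)
    for _ in range(max_iter):
        u, n = math.exp(x), math.exp(y)
        uu, un = k_uu * u * u, k_un * u * n      # [U2] and [UN]
        f1 = u + 2 * uu + un - u_total           # residual of the UPy mass balance
        f2 = n + un - n_total                    # residual of the NaPy mass balance
        if abs(f1) <= tol * u_total and abs(f2) <= tol * n_total:
            break
        # Analytical Jacobian of (f1, f2) with respect to (x, y).
        j11, j12 = u + 4 * uu + un, un
        j21, j22 = un, n + un
        det = j11 * j22 - j12 * j21
        # Newton step: solve J * (dx, dy) = -(f1, f2) for the 2 x 2 system.
        dx = (-f1 * j22 + f2 * j12) / det
        dy = (-f2 * j11 + f1 * j21) / det
        # Limit the step size so that a poor starting guess cannot overshoot.
        x += max(-2.0, min(2.0, dx))
        y += max(-2.0, min(2.0, dy))
    return math.exp(x), math.exp(y)

# Example with 1 mM of each component and the binding constants quoted in this chapter.
u_free, n_free = solve_mass_balances(1e-3, 1e-3, 6e7, 3.1e6)
```

Supplying the derivatives in closed form avoids the repeated function evaluations that a finite‐difference Jacobian would require, which is the reason given above for passing a Jacobian to the solver.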

Figure 4.7 Trivalent NaPy bound species included in the model and their corresponding stabilities.

Equation 4.1: Mass balances for the total trivalent UPy concentration (CtriUPy,total) and the total NaPy concentration (CNaPy,total). Both balances sum the contributions of all species shown in Fig. 4.6 and 4.7 and are written in terms of L, N, Kinter, Kintra and KUN. [The individual terms of both balances are not legible in this version of the text.]

We now consider the aggregation of the tetravalent molecule. The tetravalent UPy monomer can cyclize twice, with both cyclization reactions having different statistical factors, which leads to a difference in the cycle stability. Again, we limit the model to fully cyclized species, since species with free binding groups will not be populated (Fig. 4.8).

Figure 4.8 Tetravalent species included in the model and their corresponding stabilities.

As before, the number of species greatly increases upon introduction of the stopper (Fig. 4.9). The mass balances were constructed in a similar manner as for the trivalent UPy model, and solved numerically using fsolve, which was supplied with an analytical Jacobian matrix.

Figure 4.9 Tetravalent NaPy bound species included in the model and their corresponding stabilities.

4.4.2 Determination of KUPy‐UPy, KUPy‐NaPy and EM1

Due to the expected high value of KUPy‐UPy, it cannot be determined experimentally without changing the molecular structure to include a fluorescent dye.38 Since the molecular structure of the UPy moiety used in this study does not differ significantly from the reported structure, we assume that the value of KUPy‐UPy is fixed at the reported value for 6‐methylureidopyrimidinone groups in CHCl3 (KUPy‐UPy = 6 × 10⁷ M⁻¹).38 The correlated value of KUPy‐NaPy was determined by first measuring 1H NMR spectra of mixtures containing monovalent UPy 1 and NaPy 5, followed by fitting of the measured distribution of UPy‐UPy and UPy‐NaPy contacts with a simple binding model that includes UPy self‐association and UPy‐NaPy dimerization (KUPy‐NaPy = (3 ± 0.2) × 10⁶ M⁻¹; Fig. 4.10A‐B).
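For reference, one way to write the simple binding model is given below. This is a sketch under stated assumptions, not a transcription of the equations used for the fit: the symbols [U] and [N] for the free monovalent UPy and NaPy concentrations, C_U and C_N for the total concentrations, and the fractions φ are introduced here, and the dimerization constant is assumed to be defined as K = [U2]/[U]^2.

```latex
\begin{align}
C_{U} &= [U] + 2K_{\mathrm{UPy\text{-}UPy}}[U]^{2} + K_{\mathrm{UPy\text{-}NaPy}}[U][N] \\
C_{N} &= [N] + K_{\mathrm{UPy\text{-}NaPy}}[U][N] \\
\varphi_{\mathrm{UPy\text{-}UPy}} &= \frac{2K_{\mathrm{UPy\text{-}UPy}}[U]^{2}}{C_{U}}, \qquad
\varphi_{\mathrm{UPy\text{-}NaPy}} = \frac{K_{\mathrm{UPy\text{-}NaPy}}[U][N]}{C_{U}}
\end{align}
```

Under these assumptions, the two fractions correspond to the UPy homo‐ and heterodimer fractions that are compared with the integrated UPy N‐H resonances. Because KUPy‐UPy is fixed, only KUPy‐NaPy remains to be optimized.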

Figure 4.10 (A) 1H NMR spectra of equimolar mixtures containing monovalent UPy 1 and NaPy 5 at various concentrations. (B) Experimental fractions of UPy hetero‐ and homodimers (circles) and the best fit of the simple binding model (lines). (C) 1H NMR spectra of divalent UPy 2 at various concentrations. (D) Experimental fractions of UPy‐UPy contacts (markers) and the best fit of a ring‐chain equilibrium model (lines).

The determination of the EM of divalent UPy 2 was performed by measuring concentration‐dependent 1H NMR spectra and subsequently fitting the data (Fig. 4.10C). The data were analyzed with two versions of the model. In the first version, all rings are considered strainless and only EM1 is used as a free parameter. In the second version, the monomeric cycle is considered to be strained, so that two EMs are estimated (EM1 and EM2). Overlaying the best fits of both versions of the model with the data clearly shows that the second version fits considerably better (Fig. 4.10D, grey versus black lines). To ensure that the addition of a model parameter is justified, an extra sum of squares F‐test was performed, which showed that it is highly likely that the second model is correct (F = 28, p = 1.95 × 10⁻⁶).43 The optimized values of EM1 and EM2 show that the monomeric cycle is slightly stabilized instead of being strained (EM1 = 5.3 ± 0.3 mM, EM2 = 0.3 ± 0.03 mM). However, as was the case with supramolecular buffering by divalent UPys, EM1 is the key parameter for buffering since oligomeric rings are formed only sparingly in the presence of stopper molecules.28 Thus, only the optimized value of EM1 from the second version of the model is used as input parameter for the tri‐ and tetravalent UPy models.
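The extra sum of squares F‐test compares two nested models through their residual sums of squares. The general form of the statistic is sketched below; the symbols are introduced here for illustration (SS1 and SS2 are the residual sums of squares of the one‐ and two‐parameter versions, p1 = 1 and p2 = 2 are their numbers of free parameters, and n is the number of data points, which is not stated in the text).

```latex
\begin{equation}
F = \frac{\left(SS_{1} - SS_{2}\right)/\left(p_{2} - p_{1}\right)}{SS_{2}/\left(n - p_{2}\right)}
\end{equation}
```

The p value follows from the F distribution with (p2 − p1, n − p2) degrees of freedom. A small p value indicates that the reduction in the residual sum of squares is larger than expected from adding a parameter alone.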

4.4.3 Fit of 1H NMR data of NaPy 5 mixtures with trivalent UPy 3 or tetravalent UPy 4

The 1H NMR spectra of the mixtures of trivalent UPy 3 and NaPy 5 showed three resonances for the UPy N‐H protons (Fig. 4.11A). Based on their concentration‐dependent behavior, these peaks were attributed to UPy‐UPy contacts in monomeric rings of monomers and dimers, UPy‐UPy contacts in oligomeric rings and linear species, and UPy‐NaPy contacts. As expected from the model prediction that trivalent UPy 3 will form dimers with two monomeric cycles at low concentrations, the fraction of UPy‐UPy contacts in monomeric rings is close to 2/3 when no NaPy is present (Fig. 4.11B).

Figure 4.11 (A) A subset of 1H NMR spectra obtained during the dilution experiments (f = 1). (B) Peak fractions as obtained from the 1H NMR spectra (circles) and the best fit using the trivalent UPy model (lines). The dotted lines indicate the model prediction when the parameters are set to the values determined in the reference experiments with monovalent UPy 1, NaPy 5, and divalent UPy 2.

Figure 4.11 (Cont.) (C) Fit parameter log10(values) after optimization with a squared 2‐norm within 5% (red dots) and above 5% (blue dots) of the best fit. (D) Best fit values of each fitting parameter.

Figure 4.12 (A‐C) Peak fractions as obtained from the 1H NMR spectra (circles) and the best fit using the tetravalent UPy model (lines). The dotted lines indicate the model prediction when the parameters are set to the values determined in the reference experiments with monovalent UPy 1, NaPy 5, and divalent UPy 2. (D) Best fit values of each fitting parameter.

4.4.4 Model description and fit of the tetravalent UPy 4 and NaPy 5 titration using DPU 6

Since DPU can only self‐associate or dimerize with NaPy, its mass balance is quite simple (Eq. 4.2a). The DPU mass balance was analytically solved for the free DPU concentration using Wolfram Mathematica (Eq. 4.2b), and subsequently substituted in the NaPy mass balance of the tetravalent UPy model. That NaPy mass balance was also modified with the addition of the DPU‐NaPy dimer. The mass balance of the tetravalent UPy was not modified, since DPU has no interaction with UPy groups. With the DPU aggregation incorporated in the NaPy mass balance, both the tetravalent UPy and NaPy mass balances were solved numerically using fsolve.

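$$[\mathrm{DPU}]_{\mathrm{total}} = [\mathrm{DPU}] + [\mathrm{DPU\text{-}NaPy}] + 2[\mathrm{DPU}_{2}] = [\mathrm{DPU}] + K_{\mathrm{DPU\text{-}NaPy}}[\mathrm{DPU}][\mathrm{NaPy}] + 2K_{\mathrm{DPU\text{-}DPU}}[\mathrm{DPU}]^{2} \tag{4.2a}$$

$$[\mathrm{DPU}] = \frac{\sqrt{\left(1 + K_{\mathrm{DPU\text{-}NaPy}}[\mathrm{NaPy}]\right)^{2} + 8K_{\mathrm{DPU\text{-}DPU}}[\mathrm{DPU}]_{\mathrm{total}}} - 1 - K_{\mathrm{DPU\text{-}NaPy}}[\mathrm{NaPy}]}{4K_{\mathrm{DPU\text{-}DPU}}} \tag{4.2b}$$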
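The closed‐form solution of the DPU mass balance (Eq. 4.2b) can be checked with a short Python sketch. The function and argument names are hypothetical, and the rearranged form of the root is a choice made here for numerical stability rather than the expression used in this work; it is algebraically identical to the positive root of the quadratic in Eq. 4.2a.

```python
import math

def free_dpu(dpu_total, napy_free, k_dpu_napy, k_dpu_dpu):
    """Free DPU concentration from the DPU mass balance.

    Positive root of 2*K_dd*x**2 + (1 + K_dn*[NaPy])*x - [DPU]total = 0,
    where x is the free DPU concentration and [NaPy] is the free NaPy concentration.
    """
    b = 1.0 + k_dpu_napy * napy_free
    if k_dpu_dpu == 0.0:
        # Without self-association the mass balance is linear in x.
        return dpu_total / b
    disc = math.sqrt(b * b + 8.0 * k_dpu_dpu * dpu_total)
    # Equal to (disc - b) / (4*K_dd), but avoids cancellation when K_dd is small.
    return 2.0 * dpu_total / (b + disc)

# Example values: 5 mM total DPU, 0.2 mM free NaPy, and the reported approximate constants.
x = free_dpu(5e-3, 2e-4, 1.2e3, 5.0)
```

Substituting the result back into the mass balance should return the total DPU concentration to within rounding error.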

The model was used to perform a global fit of the data shown in the main text and here (Fig. 4.5 and 4.13), using as fit parameters the equilibrium dimerization constants KDPU‐NaPy and KDPU‐DPU. Gratifyingly, an excellent fit was obtained with parameter values close to the approximate values reported by Zimmerman et al. (KDPU‐NaPy = 1.2 × 10³ M⁻¹ at 23 °C in CHCl3, KDPU‐DPU = 5 M⁻¹ at 50 °C in CHCl3; Fig. 4.14).39

Figure 4.13 UPy fractions, residuals, and NaPy fractions of the best fit (lines) and based on 1H NMR spectra during titration. The starting concentrations of both multivalent UPy and NaPy are 3 mM (A) and 10 mM (B). Of the NaPy fractions, only the free NaPy fraction could be assigned from the spectra.

Figure 4.14 (A) Fit parameter log10(values) after optimization with a squared 2‐norm within 5% (red dots) and above 5% (blue dots) of the best fit. (B) Best fit values of each model parameter.

4.5 References

(1) Kitano, H. Nature 2002, 420 (6912), 206.
(2) Genot, A. J.; Fujii, T.; Rondelez, Y. Phys. Rev. Lett. 2012, 109 (20), 208102.
(3) Carothers, J. M.; Goler, J. A.; Juminaga, D.; Keasling, J. D. Science 2011, 334 (6063), 1716.
(4) Teunissen, A. J. P.; Paffen, T. F. E.; Ercolani, G.; de Greef, T. F. A.; Meijer, E. W. J. Am. Chem. Soc. 2016.
(5) Ashkenasy, G.; Hermans, T. M.; Otto, S.; Taylor, A. F. Chem. Soc. Rev. 2017.
(6) Korevaar, P. A.; Grenier, C.; Markvoort, A. J.; Schenning, A. P. H. J.; de Greef, T. F. A.; Meijer, E. W. Proc. Natl. Acad. Sci. 2013, 110 (43), 17205.
(7) Kiessling, L. L.; Gestwicki, J. E.; Strong, L. E. Angew. Chem. Int. Ed. 2006, 45 (15), 2348.
(8) Martinez‐Veracoechea, F. J.; Frenkel, D. Proc. Natl. Acad. Sci. 2011, 108 (27), 10963.
(9) Fasting, C.; Schalley, C. A.; Weber, M.; Seitz, O.; Hecht, S.; Koksch, B.; Dernedde, J.; Graf, C.; Knapp, E.‐W.; Haag, R. Angew. Chem. Int. Ed. 2012, 51 (42), 10472.
(10) Buchler, N. E.; Louis, M. J. Mol. Biol. 2008, 384 (5), 1106.
(11) Zhang, Q.; Bhattacharya, S.; Andersen, M. E. Open Biol. 2013, 3 (4), 130031.
(12) Mulder, A.; Huskens, J.; Reinhoudt, D. N. Org. Biomol. Chem. 2004, 2 (23), 3409.
(13) Badjić, J. D.; Nelson, A.; Cantrill, S. J.; Turnbull, W. B.; Stoddart, J. F. Acc. Chem. Res. 2005, 38 (9), 723.
(14) Levine, P. M.; Carberry, T. P.; Holub, J. M.; Kirshenbaum, K. MedChemComm 2013, 4 (3), 493.
(15) Hyman, A. A.; Weber, C. A.; Jülicher, F. Annu. Rev. Cell Dev. Biol. 2014, 30 (1), 39.
(16) Banani, S. F.; Rice, A. M.; Peeples, W. B.; Lin, Y.; Jain, S.; Parker, R.; Rosen, M. K. Cell 2016, 166 (3), 651.
(17) Mammen, M.; Choi, S.‐K.; Whitesides, G. M. Angew. Chem. Int. Ed. 1998, 37 (20), 2754.
(18) Krishnamurthy, V. M.; Semetey, V.; Bracher, P. J.; Shen, N.; Whitesides, G. M. J. Am. Chem. Soc. 2007, 129 (5), 1312.
(19) Mahon, C. S.; Fulton, D. A. Nat. Chem. 2014, 6 (8), 665.
(20) Angioletti‐Uberti, S. Phys. Rev. Lett. 2017, 118 (6), 068001.
(21) Teunissen, A. J. P.; Nieuwenhuizen, M. M. L.; Rodríguez‐Llansola, F.; Palmans, A. R. A.; Meijer, E. W. Macromolecules 2014, 47 (23), 8429.
(22) Folmer, B. J. B.; Sijbesma, R. P.; Meijer, E. W. J. Am. Chem. Soc. 2001, 123 (9), 2093.
(23) Xu, J.‐F.; Chen, Y.‐Z.; Wu, D.; Wu, L.‐Z.; Tung, C.‐H.; Yang, Q.‐Z. Angew. Chem. Int. Ed. 2013, 52 (37), 9738.
(24) Cohen, R. J.; Benedek, G. B. J. Phys. Chem. 1982, 86 (19), 3696.
(25) Semenov, A. N.; Rubinstein, M. Macromolecules 1998, 31 (4), 1373.
(26) Appel, E. A.; del Barrio, J.; Loh, X. J.; Scherman, O. A. Chem. Soc. Rev. 2012, 41 (18), 6195.
(27) Balkenende, D. W. R.; Monnier, C. A.; Fiore, G. L.; Weder, C. Nat. Commun. 2016, 7, 10995.
(28) Paffen, T. F. E.; Ercolani, G.; de Greef, T. F. A.; Meijer, E. W. J. Am. Chem. Soc. 2015, 137 (4), 1501.
(29) Thordarson, P. Chem. Soc. Rev. 2011, 40 (3), 1305.
(30) Ercolani, G.; Piguet, C.; Borkovec, M.; Hamacek, J. J. Phys. Chem. B 2007, 111 (42), 12195.
(31) Hunter, C. A.; Anderson, H. L. Angew. Chem. Int. Ed. 2009, 48 (41), 7488.
(32) Ercolani, G.; Schiaffino, L. Angew. Chem. Int. Ed. 2011, 50 (8), 1762.
(33) Jacobson, H.; Stockmayer, W. H. J. Chem. Phys. 1950, 18 (12), 1600.
(34) Ercolani, G.; Mandolini, L.; Mencarelli, P.; Roelens, S. J. Am. Chem. Soc. 1993, 115 (10), 3901.
(35) Mandolini, L. In Advances in Physical Organic Chemistry; Gold, V., Bethell, D., Eds.; Academic Press, 1986; Vol. 22, pp 1–111.
(36) Blinov, M. L.; Ruebenacker, O.; Moraru, I. I. IET Syst. Biol. 2008, 2 (5), 363.
(37) Teunissen, A. J. P. Competing Interactions in Chemical Reaction Networks. PhD Thesis, Eindhoven University of Technology: Eindhoven, 2017.
(38) Söntjens, S. H. M.; Sijbesma, R. P.; van Genderen, M. H. P.; Meijer, E. W. J. Am. Chem. Soc. 2000, 122 (31), 7487.
(39) Corbin, P. S.; Zimmerman, S. C.; Thiessen, P. A.; Hawryluk, N. A.; Murray, T. J. J. Am. Chem. Soc. 2001, 123 (43), 10475.
(40) Zaupa, G.; Scrimin, P.; Prins, L. J. J. Am. Chem. Soc. 2008, 130 (17), 5699.
(41) Chen, C.; Tan, J.; Hsieh, M.‐C.; Pan, T.; Goodwin, J. T.; Mehta, A. K.; Grover, M. A.; Lynn, D. G. Nat. Chem. 2017, advance online publication.
(42) McKay, C. S.; Finn, M. G. Angew. Chem. Int. Ed. 2016, 55 (41), 12643.
(43) Press, W. H.; Flannery, B. P.; Teukolsky, S. A.; Vetterling, W. T. Numerical Recipes in C: The Art of Scientific Computing, 2nd ed.; Cambridge University Press: Cambridge; New York, 1992.

5

Regulating competing supramolecular interactions using ligand concentration

Abstract

Competition in biochemical processes is often used to make molecular mechanisms adaptable to a range of conditions and to introduce selectivity in complex mixtures and various pathways. In recent years, competition has been studied in multi‐component supramolecular systems, giving rise to increased specificity for particular constructs. As a model system for competing supramolecular processes, we study a C2v symmetric trivalent UPy, synthesized by Bram Teunissen, which can form two mutually exclusive types of cycles. The self‐assembly of the trivalent monomer cannot be fully assigned using experimental characterization techniques, necessitating the use of a thermodynamic binding model. While competition between the two types of cycles is governed by their relative stabilities, their distribution is influenced by binding of a monovalent 2,7‐diamido‐1,8‐naphthyridine ligand. The model is used to predict how ligand selectivity can influence the cycle distribution to form each type of cycle exclusively. This work demonstrates the increased necessity for theoretical analysis of supramolecular systems that are becoming progressively more complex.

Part of this chapter has been published as: Teunissen, A. J. P.; Paffen, T. F. E.; Ercolani, G.; de Greef, T. F. A.; Meijer, E. W. J. Am. Chem. Soc. 2016, 138, 6852.

Chapter 5

5.1 Introduction

Competition is an important property of many biochemical mechanisms such as bistability, ultrasensitivity,1,2 and enhanced stochasticity.3 In turn, these mechanisms give the cell the ability to quickly adapt its internal workings to a variety of external conditions, and evolve by natural selection over longer timescales. For example, DNA repair in eukaryotic cells is regulated by two competing and complementary mechanisms,4 RNA transcription is regulated by the competition of seven different subunits each with its own binding affinity,5 and competing enzyme folding pathways are guided by chaperones.6 In contrast, traditional single‐component supramolecular assemblies are designed to form a single type of structure.7 However, recent advances in the field of supramolecular chemistry have shifted focus towards multi‐component systems, in which competition plays a larger role. For example, competing supramolecular interactions are used to obtain social or narcissistic self‐sorting in mixtures of supramolecular moieties.8,9 Furthermore, by employing competitive supramolecular displacement reactions, logic gates can be constructed or reactions can be activated.10–13 In many of these examples, competition is used to promote the formation of a single type of structure in a complex mixture of species. This is elegantly demonstrated in mixtures of molecular species that interchange through dynamic covalent chemistry, where the introduction of a template induces a shift in the species distribution in which the strongest binder is usually amplified most.14 Here, we present the theoretical study of trivalent UPy 1 in which one of the linkers connecting the UPy groups is shorter (Fig. 5.1A). This results in a C2v symmetric trivalent molecule that cyclizes in two different and mutually exclusive ways, i.e., a small and a large cycle (Fig. 5.1B). Small cycle C1B is preferably formed since there are two ways to form it and since its effective molarity is higher. 
After cyclization, the remaining free UPy forms an intermolecular contact, leading to the formation of three types of dimers: C1B1B, C1A1B, and C1A1A. The experimental characterization of the molecule gives a complex 1H NMR spectrum that cannot be assigned without the help of both a library of reference molecules and a thermodynamic binding model. This shows the emerging need for a combined experimental and theoretical approach, as molecules used in future studies will probably have increased levels of complexity. Furthermore, we show that the introduction of the NaPy ligand promotes the formation of cycle C1B. Simulations are used to explore the possibility of exclusively forming one type of cycle.

Regulating competing supramolecular interactions using ligand concentration

Figure 5.1 (A) Molecular structures of the trivalent UPy 1 and monovalent NaPy 2. (B) Schematic representation of the association behavior of the C2V symmetric trivalent UPy and the monovalent NaPy ligand.

5.2 Results and discussion

The synthesis and characterization of trivalent UPy 1 and a library of reference compounds were performed by Bram Teunissen.15 He showed that the signals in the 1H NMR spectrum of the trivalent UPy could be assigned using spectra of the reference compounds, and he predicted a likely distribution of cyclized dimers of the trivalent UPy. Subsequently, he performed a titration with NaPy 2 to study its influence on the cycle distribution. The spectra obtained during the titration could not be assigned unambiguously, which made it difficult to obtain the species distribution. To overcome this difficulty and to verify the species distribution in the absence of NaPy, a thermodynamic binding model of the trivalent UPy with stopper was constructed, which was subsequently used to analyze the experimental data.

5.2.1 Model outline

We first consider the case where the concentration of monovalent NaPy (CN) is zero, focusing on the self‐assembly of the trivalent UPy monomer 1. We denote the UPy groups connected by the longer linkers A‐type UPys, and the remaining UPy with the shorter linker B‐type UPys (Fig. 5.1B). Since the concentrations at which the trivalent molecule was characterized are lower than the values of the effective molarities of both the small and large cycles, it is sufficient to consider only monomeric cycles (Ctrivalent UPy = 2 mM,15 EMSC = 8.2 mM, EMLC = 5.8 mM). However, after a cycle is formed, the remaining binding group can still dimerize, forming a dimeric species with two monomeric cycles (C1A1A, C1A1B, and C1B1B, Scheme 5.1A). To determine the statistical factors for the formation of the trivalent UPy species we employed the symmetry method.16 The trivalent UPy monomer 1 has a C2 axis, dimer L2A has a C2 axis too, dimer L2B has no external symmetry axes but has an internal C2 axis due to the internal rotation of the left monomer, and dimer L2C has two independent external C2 axes and one internal C2 axis due to the internal rotation of one monomer with respect to the other. The symmetry numbers of L2A, L2B, and L2C are therefore 2, 2, and 8, against 2 × 2 = 4 for the two reacting monomers. Thus, the statistical factors for the corresponding equilibria are 2, 2, and ½, respectively. Based on statistical factors alone, dimers L2A and L2B have a higher tendency to form than dimer L2C. The subsequent cyclization reactions of the dimers to form C1A1A, C1A1B, and C1B1B have a statistical factor of 1, since the products belong to the same symmetry group as the reactants.

Scheme 5.1 (A) Overview of the trivalent UPy species included in the model. Inter‐ and intramolecular dimerization constants are based on the reference dimerization of monovalent UPys and reference ring formation of divalent UPys, respectively, shown in the squares. The association constants KAA, KAB, and KBB are microscopic intermolecular constants, which are corrected for statistical factors. (B) ‘Saddle’ species that were excluded from the model.

While assessing the species to be included in the model, a dimeric 'saddle' species was considered, which consisted of two types of dimers, SA and SB, with all UPy groups bound intermolecularly (Scheme 5.1B). However, the stabilities that were calculated for the saddle structures were quite low because two dimeric ring closure reactions are needed to form the structure. For strainless rings, the ring closure propensity scales with the ring size $i$ as $i^{-2.5}$ (Equation 3.1). Thus, two dimeric ring closure reactions have a propensity of $(2^{-2.5})^2 \approx 0.03$ compared to monomeric ring closure.17 Coupled with the appropriate statistical factors and assuming no interannular cooperativity, the simulated combined fraction of saddle structures was 5 %. Since no evidence for the structures was found experimentally, and the structures were not needed to fit the experimental data (vide infra), they were omitted from the simulations. The mass balance equation of the trivalent UPy then becomes:

$$
\begin{aligned}
[\mathrm{L_1}]_0 ={}& [\mathrm{L_1}] + [\mathrm{C_{1A}}] + [\mathrm{C_{1B}}] + 2\left([\mathrm{L_{2A}}] + [\mathrm{C_{1B1B}}] + [\mathrm{L_{2B}}] + [\mathrm{C_{1A1B}}] + [\mathrm{L_{2C}}] + [\mathrm{C_{1A1A}}]\right) \\
={}& [\mathrm{L_1}]\Bigl\{1 + K_{\mathrm{intraLC}} + 2K_{\mathrm{intraSC}} + 2[\mathrm{L_1}]\bigl(2K_{\mathrm{BB}}(1 + K_{\mathrm{intraSC}}^{2}) + 2K_{\mathrm{AB}}(1 + K_{\mathrm{intraLC}}K_{\mathrm{intraSC}}) + 0.5K_{\mathrm{AA}}(1 + K_{\mathrm{intraLC}}^{2})\bigr)\Bigr\}
\end{aligned}
\tag{5.1}
$$

where KintraLC and KintraSC are the equilibrium intramolecular cyclization constants of the monomeric large and small cycles C1A and C1B, and KAA, KAB, and KBB are the equilibrium UPy‐UPy dimerization constants forming AA, AB, and BB UPy contacts, respectively. Preliminary concentration‐dependent simulations of the trivalent UPy model predict that, at the experimental conditions, a mixture of dimers C1A1A, C1A1B, and C1B1B is present, of which the amounts are dictated by the balance between the statistical factors, the effective molarities, and the three intermolecular UPy‐UPy constants (Fig. 5.2A). As expected, increasing the value of a single KUPy‐UPy (KAA, KAB, or KBB) promotes the dimer that has the most of these contacts in its structure at the cost of the other dimer types (Fig. 5.2B‐D). Interestingly, since dimers C1A1B and C1B1B both have two AB contacts and one BB contact, changes to KAB and KBB do not alter their stabilities relative to each other (Fig. 5.2C‐D). Thus, forming either of the two dimers exclusively is only possible when the effective molarities of the two types of rings differ extremely. Likewise, decreasing the value of a single KUPy‐UPy reduces the stability of the dimer that has such a contact, promoting formation of the remaining dimer(s). However, since all dimers have at least one B‐B contact, decreasing KBB leads to a decrease in stability for all dimers (Fig. 5.2D).
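The speciation in the absence of NaPy can be sketched numerically as follows. This is a minimal illustration, not the Matlab code used in this work. It assumes that the mass balance has the form [L1]0 = [L1](1 + KintraLC + 2KintraSC) + 2[L1]^2 (2KBB(1 + KintraSC^2) + 2KAB(1 + KintraLC KintraSC) + 0.5KAA(1 + KintraLC^2)), and that each intramolecular constant equals the effective molarity times the intermolecular constant of the contact that closes the ring. The function name and the plain bisection solver are hypothetical choices made for this sketch.

```python
def speciation_without_napy(c_total, k_aa, k_ab, k_bb, em_sc, em_lc):
    """Fraction of trivalent UPy residing in each species (assumed model form)."""
    # Assumption: K_intra = EM * K_inter of the ring-closing contact.
    k_sc = em_sc * k_ab  # small cycle, closed by an A-B contact
    k_lc = em_lc * k_bb  # large cycle, closed by a B-B contact

    # Coefficients of [L1] for monomeric species.
    monomers = {"L1": 1.0, "C1A": k_lc, "C1B": 2.0 * k_sc}
    # Coefficients of [L1]^2 for dimeric species, statistical factors included.
    dimers = {
        "L2A": 2.0 * k_bb, "C1B1B": 2.0 * k_bb * k_sc**2,
        "L2B": 2.0 * k_ab, "C1A1B": 2.0 * k_ab * k_lc * k_sc,
        "L2C": 0.5 * k_aa, "C1A1A": 0.5 * k_aa * k_lc**2,
    }

    def total(l1):
        # Each dimer contains two trivalent molecules, hence the factor 2.
        return l1 * sum(monomers.values()) + 2.0 * l1**2 * sum(dimers.values())

    # total() increases monotonically with [L1], so bisection is sufficient.
    lo, hi = 0.0, c_total
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if total(mid) > c_total:
            hi = mid
        else:
            lo = mid
    l1 = 0.5 * (lo + hi)

    fractions = {name: c * l1 / c_total for name, c in monomers.items()}
    fractions.update({name: 2.0 * c * l1**2 / c_total for name, c in dimers.items()})
    return fractions


# Parameter values of Fig. 5.2A: equal UPy-UPy constants, 2 mM trivalent UPy.
fractions = speciation_without_napy(2e-3, 6e7, 6e7, 6e7, em_sc=8.2e-3, em_lc=5.8e-3)
for name in ("C1B1B", "C1A1B", "C1A1A"):
    print(f"{name}: {fractions[name]:.3f}")
```

Under these assumptions the ratio of C1A1B to C1B1B reduces to EMLC/EMSC, independent of KAB and KBB, which agrees with the observation that these two dimers can only be separated through the effective molarities.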

Figure 5.2 (A) Concentration dependent speciation of the trivalent UPy model, using KAA = KAB = KBB = 6 × 10⁷ M⁻¹, EMSC = 8.2 mM, and EMLC = 5.8 mM. The vertical black line marks the concentration at which the experimental characterization was performed (Ctrivalent UPy = 2 mM). (B‐D) Species distribution dependence on KAA (B), KAB (C), and KBB (D). The parameters that are not varied have the same value as in (A), Ctrivalent UPy = 2 mM. The vertical black line marks the literature value of UPy self‐association (KUPy‐UPy = 6 × 10⁷ M⁻¹).18

Scheme 5.2 Overview of the trivalent UPy and NaPy‐bound species included in the model. The binding constants of UPy‐NaPy dimerization (KBN and KAN) are based on the reference dimerization of monovalent UPys with monovalent NaPy, shown in the square.

We now consider the case where CN is nonzero. Since NaPy can bind to both A‐ and B‐type UPys, it is expected that it will first break up any intermolecular bonds of the trivalent UPy, followed by the intramolecular bonds. Thus, we consider only NaPy‐bound species in which one trivalent molecule is present (Scheme 5.2). The mass balance equation of the trivalent UPy molecule becomes:

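$$
\begin{aligned}
[\mathrm{L_1}]_0 ={}& [\mathrm{L_1}] + [\mathrm{C_{1A}}] + [\mathrm{C_{1B}}] + 2\left([\mathrm{L_{2A}}] + [\mathrm{C_{1B1B}}] + [\mathrm{L_{2B}}] + [\mathrm{C_{1A1B}}] + [\mathrm{L_{2C}}] + [\mathrm{C_{1A1A}}]\right) + [\mathrm{C_{1A}N}] + [\mathrm{C_{1B}N}] + [\mathrm{L_1N_3}] \\
={}& [\mathrm{L_1}]\Bigl\{1 + K_{\mathrm{intraLC}} + 2K_{\mathrm{intraSC}} + 2[\mathrm{L_1}]\bigl(2K_{\mathrm{BB}}(1 + K_{\mathrm{intraSC}}^{2}) + 2K_{\mathrm{AB}}(1 + K_{\mathrm{intraLC}}K_{\mathrm{intraSC}}) + 0.5K_{\mathrm{AA}}(1 + K_{\mathrm{intraLC}}^{2})\bigr) \\
& \qquad + [\mathrm{N}]\bigl(K_{\mathrm{intraLC}}K_{\mathrm{AN}} + K_{\mathrm{intraSC}}K_{\mathrm{BN}} + [\mathrm{N}]^{2}K_{\mathrm{AN}}K_{\mathrm{BN}}^{2}\bigr)\Bigr\}
\end{aligned}
\tag{5.2}
$$

Whereas the mass balance of the monovalent NaPy is:

$$
\begin{aligned}
[\mathrm{N}]_0 ={}& [\mathrm{N}] + [\mathrm{C_{1A}N}] + [\mathrm{C_{1B}N}] + 3[\mathrm{L_1N_3}] \\
={}& [\mathrm{N}]\Bigl\{1 + [\mathrm{L_1}]\bigl(K_{\mathrm{intraLC}}K_{\mathrm{AN}} + K_{\mathrm{intraSC}}K_{\mathrm{BN}} + 3[\mathrm{N}]^{2}K_{\mathrm{AN}}K_{\mathrm{BN}}^{2}\bigr)\Bigr\}
\end{aligned}
\tag{5.3}
$$

Equation (5.3) was solved analytically for [N] using the Mathematica software package. The solution was substituted into equation (5.2), which was subsequently solved numerically using Matlab's fzero function. With the model complete, we proceed to fit it to the experimental data obtained during the NaPy titration to obtain the species distribution.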
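The coupled mass balances can also be solved fully numerically, as sketched below. This is an illustrative alternative to the Mathematica/fzero route described above, not a reproduction of it. The sketch assumes the mass balances [L1]0 = [L1]{1 + KintraLC + 2KintraSC + 2[L1](dimer terms) + [N](KintraLC KAN + KintraSC KBN + [N]^2 KAN KBN^2)} and [N]0 = [N]{1 + [L1](KintraLC KAN + KintraSC KBN + 3[N]^2 KAN KBN^2)}. The function names, the dictionary of parameters, the value chosen for KAN, and the nested bisection are all hypothetical choices for this example.

```python
def totals(l1, n, p):
    """Total trivalent UPy and total NaPy implied by free [L1] and [N]."""
    k_lc, k_sc = p["K_intraLC"], p["K_intraSC"]
    dimers = (2.0 * p["K_BB"] * (1.0 + k_sc**2)
              + 2.0 * p["K_AB"] * (1.0 + k_lc * k_sc)
              + 0.5 * p["K_AA"] * (1.0 + k_lc**2))
    capped = k_lc * p["K_AN"] + k_sc * p["K_BN"]   # C1AN + C1BN per [L1][N]
    open_n3 = p["K_AN"] * p["K_BN"]**2 * n**2      # L1N3 per [L1][N]
    l_tot = l1 * (1.0 + k_lc + 2.0 * k_sc + 2.0 * l1 * dimers + n * (capped + open_n3))
    n_tot = n * (1.0 + l1 * (capped + 3.0 * open_n3))
    return l_tot, n_tot


def bisect(f, lo, hi, iterations=200):
    """Root of an increasing function f on [lo, hi] by plain bisection."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def solve_free(l_total, n_total, p):
    """Free [L1] and [N] that satisfy both mass balances."""
    def free_n(l1):
        # Inner problem: NaPy balance at fixed [L1], increasing in [N].
        return bisect(lambda n: totals(l1, n, p)[1] - n_total, 0.0, n_total)

    # Outer problem: UPy balance with [N] eliminated, increasing in [L1].
    l1 = bisect(lambda x: totals(x, free_n(x), p)[0] - l_total, 0.0, l_total)
    return l1, free_n(l1)


# Illustrative parameters: literature K values, K_intra = EM * K (assumption),
# and a hypothetical K_AN one order of magnitude below K_BN.
params = {
    "K_AA": 6e7, "K_AB": 6e7, "K_BB": 6e7, "K_BN": 5e6, "K_AN": 5e5,
    "K_intraSC": 8.2e-3 * 6e7, "K_intraLC": 5.8e-3 * 6e7,
}
l1_free, n_free = solve_free(2e-3, 2e-3, params)  # 1 equivalent of NaPy
print(f"free [L1] = {l1_free:.3e} M, free [N] = {n_free:.3e} M")
```

The nesting relies on both balances being monotonic in the free concentration being solved for, so no derivative information or starting guess is needed.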

5.2.2 Non‐linear regression of experimental titration data

The 1H NMR spectra obtained by Bram Teunissen during NaPy titration displayed a complex set of resonances which could be assigned by comparison to the spectra of a library of reference compounds (Fig. 5.3A‐B).15 However, since most resonances are assigned to multiple species it is challenging to directly deduce the species distribution. Therefore, the thermodynamic binding model is used to perform non‐linear regression on the normalized intensities of the 1H NMR resonances, using the assignment of the signals a‐f.

Figure 5.3 (A) Partial 1H NMR spectra obtained during the titration of trivalent UPy 1 with NaPy 2 showing the resonances of a UPy N‐H proton (red), as measured by Bram Teunissen15 (CUPy 1 = 2 mM, ‐15 °C in CDCl3). (B) Species present during the titration and the assignment of contacts to the resonances in (A), as assigned by Bram Teunissen.15

Parameter bounds were applied to prevent the regression from converging on unrealistic parameter values. Given the relatively long aliphatic chain connecting the B‐type UPys, no back folding of the UPys to polar groups is expected.19,20 Therefore we assumed the binding constants of linear B‐B and B‐N contacts to be equal to literature values (KBB = 6 × 10⁷ M⁻¹ and KBN = 5 × 10⁶ M⁻¹),18,21 while allowing a relatively small deviation of 5 %. Since it is unlikely that the other types of UPy and NaPy contacts are significantly stronger, an upper limit of 10⁸ M⁻¹ was introduced for these contacts. Furthermore, KAA and KAB were given lower limits of 10⁵ M⁻¹ and KAN a lower limit of 10³ M⁻¹, which was in line with reference experiments.15,22 Lastly, the effective molarities were constrained to the values that were obtained from divalent reference compounds, allowing for a 10 % deviation due to potential experimental error.15,22 A large number of non‐linear least squares optimizations with different initial values were performed to ensure that the global minimum was found (Fig. 5.4A). Since the model parameters are highly correlated, this results in a collection of parameter values that give equally good fits (Fig. 5.4B). Nonetheless, using this collection of optimized parameter values to calculate the molecular speciation resulted in almost identical species distributions (Fig. 5.4C‐D). Thus, while the parameter values cannot be determined exactly, the species distribution can.

Using the optimized parameters, the simulated speciation of trivalent UPy 1 in the absence of NaPy 2 is quite similar to the one predicted by Bram Teunissen, who assumed equal values for the UPy‐UPy binding strengths (Fig. 5.4C).15 However, as a result of the differences in UPy‐UPy binding strengths, i.e., B‐B > A‐B > A‐A, the fraction of C1B1B is slightly higher than expected, at the expense of C1A1A. Upon the addition of NaPy 2, the relatively weak linear UPy‐UPy contacts connecting the dimerized cycles are disrupted first, resulting in monomeric cycles with the third UPy bound to NaPy. At higher equivalents of NaPy the cycles open up, resulting in a trivalent UPy with NaPy bound to all UPys. Interestingly, signal “a” first increases and subsequently decreases during the NaPy titration, which implies that the cycle distribution is shifted towards small A‐B cycles upon addition of small amounts of NaPy (Fig. 5.3B and 5.4A). Signal “a” is assigned to A‐type UPys in cycles (C1BN, C1B1B, and C1A1B) and in the linear contact of C1A1A. To quantify the cycle distribution during the titration, we calculated weighted sums of the optimized fractions of species comprising small and large cycles, SC and LC, respectively (equation 5.4, Fig. 5.5A).

$$
\begin{aligned}
SC &= f_{\mathrm{C_{1B}}} + f_{\mathrm{C_{1B}N}} + f_{\mathrm{C_{1B1B}}} + 0.5\, f_{\mathrm{C_{1A1B}}} \\
LC &= f_{\mathrm{C_{1A}}} + f_{\mathrm{C_{1A}N}} + f_{\mathrm{C_{1A1A}}} + 0.5\, f_{\mathrm{C_{1A1B}}}
\end{aligned}
\tag{5.4}
$$

where fx is the optimized fraction of species x during the titration. The factor 0.5 for species C1A1B is used since the dimer comprises a small and a large cycle. After addition of 1 equivalent of NaPy, SC increases from 0.80 to 0.86, which is attributed to KBN being approximately one order of magnitude higher than KAN, along with the fact that there are two B‐type UPys to every A‐type UPy. Both of these effects increase the likelihood of NaPy binding to a B‐type UPy, which subsequently stabilizes C1BN with respect to C1AN. Thus, at low equivalents, NaPy 2 acts as a promoter for the formation of the small A‐B cycle in trivalent UPy 1. Having demonstrated the amplifying effect of NaPy 2 on the cycle ratio, we simulated the extent to which NaPy can influence the cycle distribution if it were given a different selectivity. Decreasing the value of KAN in the simulation further allows the exclusive formation of small A‐B cycles, while increasing KAN allows the exclusive formation of large B‐B cycles (Fig. 5.5B). In this manner, the concentration and selectivity of NaPy can be used to regulate the fraction of each type of cycle, without altering the molecular structure of trivalent UPy 1.

Figure 5.4 (A) Normalized peak areas of 1H NMR resonances observed for an N‐H proton of trivalent UPy 1 (CUPy 1 = 2 mM, ‐15 °C in CDCl3), during its titration with NaPy 2 (symbols) and the best fit based on the thermodynamic model (lines). (B) Fit parameter values of all non‐linear least squares optimizations that have a squared 2‐norm residual within 5 % of the optimal parameter set. (C) Calculated distribution of cyclized trivalent UPy dimers in the absence of NaPy, based on the parameter values of the best fits. (D) Average of the calculated speciation during NaPy titration, based on the parameter values of the best fits. Note that the 95% confidence interval is smaller than the linewidth of the plot, thus it was omitted.

Figure 5.5 (A) Calculated weighted sums SC and LC during the NaPy titration (Ctrivalent UPy = 2 mM), based on the model parameters of the best fits. The 95% confidence interval is smaller than the linewidth of the plot, thus it was omitted. (B) Average simulated weighted sums SC and LC during NaPy titration using various values of KAN and the model parameters of the best fits. The exact values of KAN are shown next to the colorbar. The errorbars denote the 95% confidence interval, calculated as two times the standard deviation.

5.3 Conclusion

We developed a thermodynamic binding model for the association of a C2V symmetric trivalent UPy, and subsequently used it to gain more insight into its experimental behavior. By expanding the model with monovalent NaPy association, the model could be used to fit experimental data obtained during a NaPy titration. The complete speciation of both molecules was determined, which was impossible to achieve using only experimental methods. Furthermore, we used the model to predict the influence of NaPy specificity on the cycle distribution during titration. These results demonstrate the importance of model‐assisted analysis for the characterization of complex supramolecular constructs.

5.4 Experimental section

Simulations were performed using the Matlab software package (R2014a, version 8.3.0.532, MathWorks) along with its optimization, curve fitting, and parallel computing toolboxes. Non‐linear least squares optimizations were performed using Matlab's lsqcurvefit function. Each fit was performed with a thousand optimizations using starting parameters that were generated using Latin hypercube sampling. Mass balance equations were solved analytically using the Wolfram Mathematica software package (version 9.0.1.0, Wolfram Research).
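The generation of starting parameters by Latin hypercube sampling can be illustrated with the sketch below. It is not the Matlab routine used in this work. Sampling on a logarithmic scale, the function name, and the fixed seed are assumptions made for this example; the bounds shown are the ones quoted for KAA, KAB, and KAN in section 5.2.2.

```python
import math
import random


def latin_hypercube_log(bounds, n_samples, seed=0):
    """Starting points between (low, high) bounds, stratified on a log10 scale."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        log_low, log_high = math.log10(low), math.log10(high)
        # One random point inside each of n_samples equal-width strata.
        strata = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # Shuffle so that strata of different parameters are paired at random.
        rng.shuffle(strata)
        columns.append([10 ** (log_low + u * (log_high - log_low)) for u in strata])
    # Transpose: one row per starting point, one column per parameter.
    return [list(row) for row in zip(*columns)]


# Bounds for K_AA, K_AB (1e5 to 1e8 M^-1) and K_AN (1e3 to 1e8 M^-1).
bounds = [(1e5, 1e8), (1e5, 1e8), (1e3, 1e8)]
starts = latin_hypercube_log(bounds, n_samples=1000)
print(len(starts), "starting points, first:", starts[0])
```

Each starting point would then be passed to a bounded least squares optimizer, and the fits with the lowest residuals retained. Stratification guarantees that every parameter range is covered evenly, which plain random sampling does not.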

5.5 References

(1) Kim, S. Y.; Ferrell Jr., J. E. Cell 2007, 128 (6), 1133.
(2) Buchler, N. E.; Louis, M. J. Mol. Biol. 2008, 384 (5), 1106.
(3) J. Chem. Phys. 2013, 139 (12), 121915.
(4) Kass, E. M.; Jasin, M. FEBS Lett. 2010, 584 (17), 3703.
(5) Maeda, H.; Fujita, N.; Ishihama, A. Nucleic Acids Res. 2000, 28 (18), 3497.
(6) Balchin, D.; Hayer‐Hartl, M.; Hartl, F. U. Science 2016, 353 (6294), aac4354.
(7) De Greef, T. F. A.; Smulders, M. M. J.; Wolffs, M.; Schenning, A. P. H. J.; Sijbesma, R. P.; Meijer, E. W. Chem. Rev. 2009, 109 (11), 5687.
(8) Wu, A. X.; Isaacs, L. J. Am. Chem. Soc. 2003, 125 (16), 4831.
(9) Safont‐Sempere, M. M.; Fernandez, G.; Wuerthner, F. Chem. Rev. 2011, 111 (9), 5784.
(10) Wagner, N.; Ashkenasy, G. Chem. ‐ Eur. J. 2009, 15 (7), 1765.
(11) Pischel, U.; Uzunova, V. D.; Remon, P.; Nau, W. M. Chem. Commun. 2010, 46 (15), 2635.
(12) Komatsu, H.; Matsumoto, S.; Tamaru, S.; Kaneko, K.; Ikeda, M.; Hamachi, I. J. Am. Chem. Soc. 2009, 131 (15), 5580.
(13) Smulders, M. M. J.; Nitschke, J. R. Chem. Sci. 2012, 3 (3), 785.
(14) Corbett, P. T.; Sanders, J. K. M.; Otto, S. J. Am. Chem. Soc. 2005, 127 (26), 9390.
(15) Teunissen, A. J. P. Competing Interactions in Chemical Reaction Networks. PhD Thesis, Eindhoven University of Technology: Eindhoven, 2017.
(16) Ercolani, G.; Piguet, C.; Borkovec, M.; Hamacek, J. J. Phys. Chem. B 2007, 111 (42), 12195.
(17) Ercolani, G.; Mandolini, L.; Mencarelli, P.; Roelens, S. J. Am. Chem. Soc. 1993, 115 (10), 3901.
(18) Söntjens, S. H. M.; Sijbesma, R. P.; van Genderen, M. H. P.; Meijer, E. W. J. Am. Chem. Soc. 2000, 122 (15), 7487.
(19) de Greef, T. F. A.; Nieuwenhuizen, M. M. L.; Sijbesma, R. P.; Meijer, E. W. J. Org. Chem. 2010, 75 (3), 598.
(20) de Greef, T. F. A.; Nieuwenhuizen, M. M. L.; Stals, P. J. M.; Fitié, C. F. C.; Palmans, A. R. A.; Sijbesma, R. P.; Meijer, E. W. Chem. Commun. 2008, No. 36, 4306.
(21) Wang, X. Z.; Li, X. Q.; Shao, X. B.; Zhao, X.; Deng, P.; Jiang, X. K.; Li, Z. T.; Chen, Y. Q. Chem. ‐ Eur. J. 2003, 9, 2904.
(22) Teunissen, A. J. P.; Paffen, T. F. E.; Ercolani, G.; de Greef, T. F. A.; Meijer, E. W. J. Am. Chem. Soc. 2016.

6 Model‐driven engineering of reaction kinetics using feedback

Abstract

In biochemical systems, feedback is used to control individual reaction rates of complex reaction networks so that the desired outcome is produced. To be able to construct chemical reaction networks of comparable complexity, chemists are developing synthetic systems with designed feedback. Here, we first theoretically explore a supramolecular strategy for obtaining direct positive feedback by way of a reaction product that releases additional phase‐transfer catalyst. Secondly, we perform a theoretical analysis of a complex Michael addition reaction in which both ureidopyrimidinone and 2,7‐diamido‐1,8‐naphthyridine groups act as phase‐transfer catalysts, displaying bimolecular, sigmoidal and pseudo 0th order reaction kinetics. The non‐covalent interactions of the catalysts allow for increased control over their relative activities. Additionally, model‐driven engineering is employed to further increase the linearity of the pseudo 0th order reaction kinetics.

Chapter 6

6.1 Introduction

One of the most important molecular features that enables control over reaction networks is feedback.1–3 Feedback allows dynamical control over individual reaction rates and entire pathways,4 allowing the cell to internally adjust to changes in its environment and respond to specific molecular cues.5–7 In biochemical systems, feedback originates from multiple mechanisms such as ligand‐induced conformational changes, covalent modification, and protein‐protein interactions.8–11 To reproduce such an adaptive reaction network and ultimately to synthesize artificial cells, chemists have made significant progress by creating synthetic systems displaying feedback.12,13 Early research focused on the creation of direct positive feedback loops, where reaction rates increase as a function of product concentration via either autocatalysis or autoinduction.14 While an autocatalytic product directly catalyzes its own formation, the product of an autoinductive reaction increases its formation rate by influencing another step in the reaction sequence, e.g. its binding as a ligand to the catalyst.15 Designed autocatalysis was achieved using a template to bring reactants together via non‐covalent interactions, producing more of the template.16–18 However, this templated autocatalysis approach is limited by the formation of catalytically inactive template homodimers, which accumulate as the reaction proceeds.19 Designed autoinduction has been achieved by a catalyst that is sandwiched between two layers, and thereby shielded from its environment.20,21 Upon binding of the reaction product to the catalyst, the two layers open up, and the catalyst activity increases due to its increased accessibility. Here, we first present a theoretical exploration of a strategy for obtaining direct positive feedback in a reaction catalyzed by the supramolecular moiety 2,7‐diamido‐1,8‐naphthyridine (NaPy). The strategy is based on the release of NaPy due to product cyclization, potentially affording a supramolecular self‐accelerating reaction. Subsequently, we perform a theoretical analysis of an experimental system in which both NaPy and ureidopyrimidinone (UPy) are phase‐transfer catalysts for a Michael addition. The Michael addition is slightly autocatalytic, while catalysts NaPy and UPy display bimolecular and sigmoidal kinetics, respectively. By comparing different versions of kinetic models, more insight is gained into the reaction mechanisms. Mixing the two catalysts results in linear reaction progress curves, and we use a model‐driven engineering approach to increase the linearity further.

6.2 Results and discussion

6.2.1 Supramolecular self‐accelerating reaction

Based on the observation that NaPy binding competes with cyclization of a divalent UPy (Chapter 3) and that NaPy acts as a phase transfer catalyst for K2CO3 in certain Michael additions (Fig. 6.1A),22,23 we hypothesized that a reaction in which monovalent UPys are converted to divalent UPys could lead to release of NaPy due to cyclization of the divalent product, increasing the reaction rate as more product is formed (Fig. 6.1B). Thus, while no additional catalyst is formed through a covalent reaction, the reaction rate is increased by the non‐covalent release of catalyst due to reversible cyclization of the product. Since this principle falls outside the scope of the definitions for autocatalysis and autoinduction, we employ the phrase ‘supramolecular self‐accelerating reaction’ for this system.

Figure 6.1 (A) Schematic of the reported NaPy catalyzed Michael addition between trans‐β‐nitrostyrene and 2,4‐pentanedione. A, D, and M stand for Michael acceptor, donor, and product, respectively. (B) Schematic of the proposed self‐accelerating reaction. During the reaction, additional NaPy catalyst is released due to ring formation of the divalent UPy.

Due to the concentration dependency of the supramolecular equilibria involved in the self‐accelerating reaction, it is expected that NaPy release will only occur in a specific concentration range, depending on the values of the equilibrium dimerization constants and the cyclization tendency of the product divalent UPy. To assess the concentration range in which rate‐acceleration is expected to occur, we computed the steady‐state fraction of free NaPy corresponding to the beginning and end of the reaction (Fig. 6.2A). At the beginning of the reaction, only monovalent UPys and NaPy are present, while at the end of the reaction a mixture of divalent UPys and monovalent NaPy is present. To calculate the fraction of free NaPy at the start of the reaction, a simple monovalent binding model was used, while the fraction of free NaPy at the end of the reaction was computed using the model developed in chapter 3 (section 3.4.3). The input parameters of both models are the equilibrium binding constants for UPy‐UPy and UPy‐NaPy dimerization KUPy‐UPy and KUPy‐NaPy, the ratio of NaPy to UPy groups r, and the effective molarity EM. Simulations using relevant parameter values revealed that there is an optimal concentration range in which the reaction is viable for self‐accelerating kinetics (Fig. 6.2A). However, this optimum is located at a concentration far below the concentration range typically used for NaPy catalysis, e.g. using 5‐20 mM of NaPy leads to reaction times on the order of days.
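The fraction of free NaPy at the start of the reaction can be sketched with a minimal monovalent binding model. The sketch assumes only two equilibria, UPy homodimerization ([U2] = KUPy‐UPy [U]^2) and UPy‐NaPy heterodimerization ([UN] = KUPy‐NaPy [U][N]); the exact model used in this work may differ in detail. The function name and the bisection solver are hypothetical.

```python
def free_napy_fraction_start(c_upy, c_napy, k_uu, k_un):
    """Fraction of NaPy that is unbound in a mixture with monovalent UPy."""
    def upy_total(u):
        # NaPy balance solved in closed form: [N] = C_N / (1 + K_UN [U]).
        n = c_napy / (1.0 + k_un * u)
        # Free UPy + two UPys per homodimer + one UPy per heterodimer.
        return u + 2.0 * k_uu * u**2 + k_un * u * n

    # upy_total() increases monotonically with [U], so bisection suffices.
    lo, hi = 0.0, c_upy
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if upy_total(mid) > c_upy:
            hi = mid
        else:
            lo = mid
    u = 0.5 * (lo + hi)
    return 1.0 / (1.0 + k_un * u)


# Literature binding constants, r = 1 (equal UPy and NaPy concentrations).
for c in (1e-6, 1e-5, 1e-4, 1e-3):
    f = free_napy_fraction_start(c, c, k_uu=6e7, k_un=5e6)
    print(f"C = {c:.0e} M: free NaPy fraction = {f:.2f}")
```

Scanning the total concentration in this way gives the starting curve of a plot such as Fig. 6.2A; the end‐of‐reaction curve additionally requires the cyclization model of chapter 3.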

Figure 6.2 (A) Simulated fractions of free NaPy versus the total NaPy concentration and the difference in fractions between the end and beginning of the reaction. The region in which fN,end ‐ fN,start > 0.5 is highlighted in grey. The input parameters for the predictions are: Cmonovalent UPy = CNaPy, Cdivalent UPy = 0.5 × CNaPy, KUPy‐UPy = 6 × 10⁷ M⁻¹, KUPy‐NaPy = 5 × 10⁶ M⁻¹, EM = 10 mM, r = 1. (B) Response of the maximum fraction difference of free NaPy and the corresponding optimal concentration as a function of 10‐fold changes in the input parameter values. The arrowhead represents a 10‐fold multiplication of the parameter value, while the end of the arrow represents a 10‐fold division. At the intersection, the parameter values are equal to those used in (A).

To investigate whether it is possible to shift the optimum to experimentally relevant concentrations, the model input parameters were varied individually and their influence on the optimum height and concentration was examined (Fig. 6.2B). Interestingly, increasing KUPy‐UPy or decreasing KUPy‐NaPy results in an increase of the optimal concentration, at the expense of the fraction of NaPy released during the reaction. Analogous to its influence on supramolecular buffering, increasing the EM leads to an increase in both the optimum concentration and fraction of released NaPy, as it increases cycle stability and thereby increases the competition between cyclization and NaPy binding (Chapter 3). However, increasing the EM to 100 mM is experimentally extremely challenging, as such values are mostly observed for rings comprising fewer than ~10 atoms or in rings where extremely precise pre‐organization is present.24,25 Lastly, varying the ratio of NaPy to UPy moieties, r, offers no increase in the optimal concentration. While lowering r increases the fraction of NaPy released during the reaction, it occurs at the cost of using lower NaPy concentrations, which is not desirable for increasing reaction rates. While the self‐accelerating reaction is thermodynamically viable at low concentrations, the question remains how large its effect on the kinetics will be. Therefore, a kinetic model was constructed including all changes in non‐covalent binding during the reaction. During the reaction, a mixture of two different monovalent UPys, divalent UPy, and NaPy is present. Due to the divalent UPy having two binding sites with four potential binding partners, this leads to 19+11n potential molecular species, where n is the number of chain contacts of the divalent UPy (Fig. 6.3). This effect of a few molecules potentially creating a large number of species is termed combinatorial complexity, and is a major hurdle in modeling biochemical systems.26 To circumvent the error‐prone manual entry of an ordinary differential equation (ODE) for each species, several rule‐based strategies have been developed in which models are defined by reaction rules between parts of molecules, e.g. binding or covalent reactions.27,28 Subsequently, the rules are automatically applied to form a reaction network of ODEs, which can then be solved using well‐established numerical techniques. The appropriate reaction rules for the self‐accelerating system are outlined in section 6.4.1.

Figure 6.3 Potential molecular species formed during the self‐accelerating reaction. A, D, and M stand for Michael acceptor, donor, and product, respectively.

Kinetic simulations were performed using optimized parameters obtained from non‐linear regression of experimental data on a Michael addition with NaPy as phase‐transfer catalyst (Eq. 6.1; $k_N = 0.18\ \mathrm{M^{-(1+a_N)}\,s^{-1}}$, aN = 2; Fig. 6.1A).22

$$
r_N = k_N\, C_{\mathrm{NaPy}}^{\,a_N}\, C_{\mathrm{substrate}}^{2}
\tag{6.1}
$$

where rN is the reaction rate of the Michael addition in M s⁻¹, kN is the kinetic constant, aN is the reaction order of NaPy, and C is the concentration of NaPy or substrate. Unfortunately, under those conditions, the simulations predicted that no reaction would take place on realistic timescales, i.e. the reaction would take longer than a year to complete. Therefore, to verify whether the self‐accelerating principle is viable on realistic timescales, we used different values for the kinetic parameters (kN = 10 M⁻² s⁻¹, aN = 1) and the EM (100 mM). With these new parameter values, the optimum concentration is shifted to 0.2 mM (Fig. 6.2B). Gratifyingly, these simulations showed a clear self‐accelerating effect as compared to reference bimolecular kinetic simulations using equation (6.1) for the reaction rate and a free NaPy concentration equal to that at the beginning of the supramolecular self‐accelerating reaction (Fig. 6.4B). As expected, the magnitude of the self‐acceleration is concentration dependent, which is clearly observed when comparing the time to 50 % conversion for the self‐accelerating reaction and the reference simulations (t50, Fig. 6.4B).
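For the reference simulations, in which the free NaPy concentration is held constant, the rate law can be integrated in closed form. The sketch below assumes the rate law rN = kN CNaPy^aN Csubstrate^2 with both substrates at the same concentration, so that 1/C − 1/C0 = kN CNaPy^aN t. The function names and the example concentrations are hypothetical and serve only as an illustration.

```python
def time_to_conversion(x, k_n, a_n, c_napy_free, c_substrate_0):
    """Time (s) to reach fractional conversion x at constant free NaPy."""
    # From 1/C - 1/C0 = k' t with C = (1 - x) C0 and k' = k_N * C_NaPy^a_N.
    return x / ((1.0 - x) * k_n * c_napy_free**a_n * c_substrate_0)


def conversion_at(t, k_n, a_n, c_napy_free, c_substrate_0):
    """Fractional conversion after time t (s) for the same rate law."""
    y = k_n * c_napy_free**a_n * c_substrate_0 * t
    return y / (1.0 + y)


# Illustrative values: k_N = 10 M^-2 s^-1, a_N = 1, and hypothetical
# concentrations of 0.1 mM free NaPy and 0.2 mM substrate.
t50 = time_to_conversion(0.50, 10.0, 1, 1e-4, 2e-4)
t99 = time_to_conversion(0.99, 10.0, 1, 1e-4, 2e-4)
print(f"t50 = {t50 / 86400:.1f} days, t99 = {t99 / 86400:.1f} days")
```

For second‐order decay in substrate the time to a given conversion scales as x/(1 − x), so t99 is 99 times t50 in the reference case. Any release of NaPy during the reaction shortens the later stages relative to this baseline.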

Interestingly, the concentration at which the difference in t50 is maximized does not overlap with the predicted optimal concentration based on the thermodynamic simulations (Fig. 6.4C inset). This shift is caused by the non‐linear and concentration‐dependent release of NaPy during the reaction (Fig. 6.4D). Indeed, comparing the concentration dependencies of t50 and t99, a clear shift to the thermodynamic optimal concentration is observed when going to higher conversions. The non‐linear release of NaPy is due to several factors. At concentrations below the optimal concentration and at low conversions, the decrease of monovalent UPy‐NaPy dimers as a function of conversion is lower, leading to a lower release rate of NaPy. At higher concentrations and conversions, less NaPy is released due to a lack of divalent UPy cyclization. The reaction progress curves do not show the sigmoidal shape indicative of an autocatalytic mechanism (Fig. 6.4B), which is attributed to two reasons. Firstly, due to the concentration dependence of the supramolecular equilibria, there exist no conditions in which no NaPy is free at the beginning of the reaction and all NaPy is free at the end (Fig. 6.4C), while this is the case for the product of an autocatalytic reaction. Secondly, at the optimal concentration, most NaPy is released at high conversions, causing the rate‐acceleration to occur at higher conversions (black lines; Fig. 6.4D).

Figure 6.4 (A) Schematic of the supramolecular self‐accelerating reaction. (B) Simulated conversion versus time for the self‐accelerating system (solid lines) and reference simulations (dotted lines) at various total NaPy concentrations. The conversion is defined as the ratio between the amounts of reacted substrate and substrate present at the beginning of the reaction (t = 0). The reference simulations are based on eq. 6.1 using a free NaPy concentration equal to that at the beginning of the self‐accelerating reaction. (C) The time to 50 % conversion versus the total NaPy concentration for the self‐accelerating system (red line, main), the reference simulations (black line, main), and the difference between the t50, t90 and t99 of the self‐accelerating system and the reference simulations (inset). (D) Equilibrium free NaPy and the UPy speciation of both mono‐ and divalent UPys versus conversion at various total NaPy concentrations. The exact values of the concentrations are shown next to the color bar. The input parameters for the simulations are: kN = 10 M⁻² s⁻¹, aN = 1, KUPy‐UPy = 6 × 10⁷ M⁻¹, KUPy‐NaPy = 5 × 10⁶ M⁻¹, EM = 100 mM, r = 1.


An optimized experimental system synthesized by Bram Teunissen displayed sigmoidal kinetics (personal communication, 2017). However, the sigmoidal kinetics are unlikely to be caused solely by NaPy release and are more likely due to UPy catalysis (vide infra). Unfortunately, the two effects could not be separated on the basis of the experimental data. Thus, the approach outlined for the supramolecular self‐accelerating reaction remains to be verified experimentally.

6.2.2 Model‐driven engineering of linear kinetics

During the experimental exploration of the self‐accelerating reaction, two discoveries were made by Bram Teunissen, leading the research in a different direction.29 Firstly, asymmetric NaPy 5 was synthesized, which displayed a lower KUPy‐NaPy binding constant and increased activity as phase‐transfer catalyst in the Michael addition, displaying bimolecular kinetics (Fig. 6.5). Secondly, UPy moiety 4 was found to also show activity as phase‐transfer catalyst due to its ester group, displaying sigmoidal kinetics. DFT calculations suggest that the ester groups fold back over the plane of the UPy dimer,29 thereby coordinating to K2CO3 and enabling phase‐transfer catalysis. The observation of sigmoidal kinetics, one of the defining characteristics of autocatalysis, spurred further kinetic analysis of experiments with both NaPy and UPy as catalysts. Some of these mixtures produced almost linear kinetics, presumably caused by the combination of bimolecular and sigmoidal kinetics and the non‐covalent interactions between both catalysts. Additionally, to improve the magnitude of the rate‐acceleration observed in the UPy catalyzed reactions, Bram Teunissen synthesized construct 6, in which one of the substrates is covalently linked to the catalytically active UPy. This construct showed a significant increase in the rate‐acceleration and an unexpected concentration independence of the reaction speed. Such control over the shape of reaction progress curves is rare in synthetic systems, and warranted a detailed theoretical investigation. Here, we present the kinetic models used for analyzing the contributions of each catalyst and for gaining insight into the molecular details of the catalysis. All synthesis and kinetic measurements in this chapter were performed by Bram Teunissen.29


Figure 6.5 Molecular structures of the reactants and product of the Michael addition, the phase‐transfer catalysts UPy, NaPy, and the UPy‐pentanedione construct.

6.2.2.1 UPy catalysis: autocatalysis and autoinduction

Since the UPy phase‐transfer catalyst displayed sigmoidal kinetics, it was expected that autocatalysis and/or autoinduction played a role in this reaction. Furthermore, it was experimentally shown that a separate complexation step was occurring, in which K2CO3 is slowly complexed by UPy 4 dimers.29 Therefore, to quantify the contributions of each process, measurements were performed in which the concentration of UPy groups was varied, and/or in the presence of pre‐added Michael product (markers, Fig. 6.6A‐B).29

Firstly, in the presence of only K2CO3, a very slow background reaction rate was observed. Secondly, the measurement with pre‐added product showed a relatively low conversion, compared to the measurements with UPy, ruling out autocatalysis as the sole cause of sigmoidal kinetics. Thirdly, a measurement in which UPy and K2CO3 were allowed to equilibrate before the addition of Michael substrates lacked a lag phase. While this measurement revealed that the likely cause of the sigmoidal kinetics was diUPy•K2CO3 complexation, the measurement with both UPy and product displayed an unexpectedly high conversion, suggesting that autoinduction also played a role. Thus, to delineate the complex interplay of the processes taking place during the reaction, a kinetic model is developed based on mass action kinetics that includes the background reaction, autocatalysis, diUPy•K2CO3 complexation, and autoinduction (Figure 6.6C; section 6.4.2).


Figure 6.6 (A‐B) The conversion of the Michael reaction between maleimide 1 and pentanedione 2 versus time (Csubstrates = 4 mM, CK2CO3 = 36 mM) as obtained by 1H NMR spectroscopy (markers), the best fit of the kinetic model using eq. 6.3 (lines), and the corresponding residuals. (A) K2CO3 with and without pre‐added Michael product (Cproduct = 10 mM), and with UPy and product present (CUPy = Cproduct = 10 mM). (B) K2CO3 and various amounts of UPy, one of which catalyzed by a pre‐equilibrated UPy‐K2CO3 mixture. (C) Schematic of the kinetic mass action model including the background reaction, autocatalysis, diUPy•K2CO3 complexation, and autoinduction. (D) Optimized parameters of the best fit of the experimental data, and their 95% confidence intervals. (E) Catalytic contributions of the background reaction, autocatalysis, UPy catalysis, and autoinduction in the Michael addition catalyzed by K2CO3 and UPy (CUPy = 10 mM), simulated using the optimized parameters in (D).

A global fit of all the data corresponding to the Michael addition, including NaPy catalysis data (vide infra), proved to be computationally expensive. Thus, three separate consecutive fits (lines, Fig. 6.6A‐B and 6.8A) were performed on orthogonal datasets: the first dataset contains all catalysis experiments without any UPy or NaPy present, the second contains all experiments where UPy is present, and the third dataset contains all experiments with NaPy present. The optimized parameters obtained from the first dataset (kK2CO3 and kP, Fig. 6.6C) were used as fixed constants during the non‐linear regression of the second dataset. The optimized model describes the experimental data well, implying that the kinetic model captures the most important features of the catalysis mechanism. Interestingly, the values of the optimized parameters suggest that the diUPy•product•K2CO3 complex is formed faster than the diUPy•K2CO3 complex, while their activity in the Michael addition is about equal (Fig. 6.6D). Using the optimized parameters to simulate the catalytic contributions during the reaction revealed that the most active process is autoinduction, which is responsible for over 50% of the conversion obtained in the measurement with 10 mM UPy (Fig. 6.6E).

To validate the inclusion of K2CO3 complexation and autoinduction, two non‐linear optimizations were performed with different variants of the model. The fit of the second dataset without K2CO3 complexation was unable to describe the lag phase observed in the experimentally determined curves (Fig. 6.7A). An F‐test was performed to compare the fits with and without K2CO3 complexation which revealed strong evidence for the inclusion of a separate complexation step for both the catalysts diUPy and diUPy•product (P < 0.0001).
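The text does not state which form of the F‐test was applied. A common choice for comparing nested least‐squares models is the extra‐sum‐of‐squares F‐test, which is sketched here under that assumption. The symbols are introduced for illustration only and do not appear in the original: RSS_1 and p_1 are the residual sum of squares and the number of free parameters of the reduced model (for example, the variant without K2CO3 complexation), RSS_2 and p_2 are those of the full model, and n is the number of data points.

```latex
F = \frac{\left(\mathrm{RSS}_1 - \mathrm{RSS}_2\right) / \left(p_2 - p_1\right)}
         {\mathrm{RSS}_2 / \left(n - p_2\right)}
```

Under the null hypothesis that the reduced model is adequate, F follows an F‐distribution with (p_2 − p_1, n − p_2) degrees of freedom, so a small P value favors keeping the additional parameters of the full model.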

Figure 6.7 (A) Best fit of the second dataset without complexation reactions for the catalysts diUPy and diUPy•product. (B) Best fit of the second dataset without diUPy autoinduction.

The inclusion of the diUPy•product•K2CO3 complex in the model was necessary since an analysis of the second dataset without this complex was unable to describe the data correctly. In particular, the experimental measurement in which pre‐synthesized product was added at the beginning of the reaction consistently showed higher conversions compared to the optimized model prediction (Figure 6.7B; crosses and dashed line). An F‐test was used to compare the fits with and without diUPy autoinduction and confirmed the strong evidence for the inclusion of the complex (P < 0.0001).

6.2.2.2 NaPy and UPy mixtures: toward linear kinetics

Measurements with only NaPy 5 as catalyst displayed typical bimolecular kinetics without any lag‐phase.29 Thus, it was hypothesized that the addition of UPy 4 as catalyst could lead to increased control over the shape of the reaction progress curves due to the superposition of bimolecular and autoinductive kinetics by free NaPy and UPy dimers in addition to the formation of relatively inactive UPy‐NaPy dimers.23 Indeed, experimental measurements showed that the reaction rate was decreased and the shape of the reaction progress curves became more linear as UPy 4 was added (markers, Fig. 6.8A).29 The decrease in reaction rate with increasing amounts of UPy 4 was attributed to the decrease in free NaPy concentration in favor of inactive UPy‐NaPy dimers. The increased linearity was expected to arise from slowing down of the initial rate by decreasing free NaPy concentration and the rate‐acceleration at higher conversions due to UPy autoinduction. At higher equivalents of UPy, when all NaPy is bound, the reaction rate increases again due to the increase in UPy dimer concentration. To delineate the contributions of each catalyst, the kinetic mass action model for UPy catalysis was expanded with terms for free NaPy and UPy‐NaPy dimers (Fig. 6.8B; section 6.4.2). Interestingly, no satisfactory description of the catalytic experiments with NaPy and UPy could be achieved using the optimized parameters obtained from the second dataset (kU2, kU2P, kU2c, and kU2Pc; Fig. 6.6D). Instead, those parameters were also set as free parameters in the non‐linear optimization of the third dataset, in addition to the NaPy catalysis parameters (kN, aN, kUN, and aUN; Fig. 6.8A‐C). The values of the optimized UPy parameters obtained from the fitting of the third dataset were mostly higher compared to those obtained from the second dataset, which suggests that NaPy plays an activating role in UPy catalysis which is not fully described by the model (Fig. 6.6D and 6.8C). As a result of the increased catalytic activity of UPy dimers, free NaPy and diUPy•K2CO3 catalysis dominate the simulated catalytic contributions to the total conversion, while the contribution of UPy autoinduction is only marginal (Fig. 6.8D, top right, and 6.8E).


Figure 6.8 (A) The conversion of the Michael reaction between maleimide 1 and pentanedione 2 versus time (Csubstrates = 4 mM, CK2CO3 = 36 mM) as obtained by 1H NMR spectroscopy (markers), the best fit of the kinetic model using eq. 6.3 (lines), and the corresponding residuals. The amount of NaPy is equal in all measurements (CNaPy = 8 mM), while the concentration of UPy is varied. (B) Schematic of the expanded kinetic mass action model including the background reaction, autocatalysis, diUPy•K2CO3 complexation, UPy autoinduction, NaPy catalysis, and UPy‐NaPy catalysis. (C) Boxplots of the optimized parameters of all fits with a squared 2‐norm residual within 5% of the best fit. (D) Concentrations of catalysts at the start of the reaction (top left), time until 50 % conversion is reached (bottom left), simulated catalytic contributions to the conversion after 10 days using the optimized parameters of the best fit (top right), and normalized residuals of linear regression on the experimentally determined conversion up till 75 % (bottom right) versus the equivalents of UPy. (E) Catalytic contributions of the background reaction, autocatalysis, UPy catalysis, UPy autoinduction, NaPy catalysis, and UPy‐NaPy catalysis in the Michael addition catalyzed by K2CO3, NaPy (CNaPy = 8 mM), and UPy (CUPy = 12 mM = 1.5 eq.), simulated using the optimized parameters of the best fit.


6.2.2.3 Increased rate‐acceleration by covalently linking catalyst and substrate

To further increase the rate‐acceleration of UPy catalysis, UPy construct 6 was synthesized, which covalently links the UPy catalyst with one of the Michael substrates.29 While the increased local concentration of UPy catalyst is expected to increase the overall reaction rate, the conversion of substrate to product is expected to increase the complexation rate of the dimerized UPy catalyst, thereby increasing the rate‐acceleration. Indeed, covalently attaching a substrate to a catalyst can result in increased reaction rates.30 Catalysis experiments with maleimide 1 and UPy construct 6 showed a large increase in the rate‐acceleration, leading to sigmoidal kinetics (Fig. 6.9A). Surprisingly, an eight‐fold change in the concentrations of maleimide 1 and UPy 6 leads to only a small change in the reaction rate. This concentration independence could be described by a kinetic mass action model, including UPy dimerization, diUPy•K2CO3 complexation, and intramolecular catalysis by dimers comprising UPy 6 (Fig. 6.9B; section 6.4.3). It was assumed that diUPy•K2CO3 complexation influences UPy dimerization so that complexed species do not participate in the dimerization equilibria. Non‐linear regression was performed on the concentration‐dependent data of UPy 6 catalysis, resulting in a good description of the conversion (Fig. 6.9A). While the optimized model can accurately describe both the initial lag phase and the concentration independence, some deviations between the fit and the data are observed at high conversions, which are attributed to small variations in experimental conditions. In line with the strong rate‐acceleration during the course of the reaction, the values of the optimized parameters increase going from the kinetic constants of UPy 6 dimers (UD2) to those of the UPy 6‐UPy 7 dimer (UDUP), to those of the product UPy dimer (UP2; Fig. 6.9C).
Thus, while UPy 6 is converted to UPy 7, the simulated reaction rate increases due to the increase in the values of the optimized kinetic constants. We hypothesized that the increased rate‐acceleration of UPy 6 could lead to more linear reaction progress curves upon the addition of NaPy. Thus, model predictions on the influence of NaPy on the Michael addition between maleimide 1 and UPy 6 were performed (Fig. 6.10A) using the optimized parameters obtained from the data on mixtures of maleimide 1, pentanedione 2, UPy 4, and NaPy 5 (Fig. 6.8B). At low amounts of NaPy, the kinetics are slowed down due to the formation of inactive UPy‐NaPy dimers, although rate‐acceleration is still present (Fig. 6.10B). Upon the addition of more NaPy, the model predicts that the kinetics should become more linear. Experimental validation of the predictions revealed that the optimal conditions were predicted successfully and that the reaction progress curves were more linear compared to using mixtures of UPy 4 and NaPy 5.29


Figure 6.9 (A) The conversion of the Michael reaction between equimolar mixtures of maleimide 1 and UPy 6 versus time (CK2CO3 = 36 mM) as obtained by 1H NMR spectroscopy (markers), the best fit of the kinetic model using eq. 6.5 (lines), and the corresponding residuals. (B) Schematic of the kinetic mass action model for the Michael addition between maleimide 1 and UPy 6 including diUPy•K2CO3 complexation, and inter‐ and intramolecular catalysis. (C) Optimized parameters of the best fit of the experimental data, and their 95% confidence intervals.


Figure 6.10 (A) Simulated conversion of the Michael reaction between mixtures of maleimide 1 (Cmaleimide = 4 mM), UPy 6 (CUPy6 = 4 mM) and various amounts of NaPy versus time. The exact values of the NaPy concentrations are shown next to the color bar. (B) Concentrations of catalysts at the start of the reaction (top) and normalized residuals of linear regression on the simulated conversion up till 75 % (bottom) versus the concentration of NaPy.

6.3 Conclusion

A detailed theoretical description of a supramolecular strategy for a self‐accelerating reaction is presented in which product cyclization leads to release of additional catalyst. Rule‐based kinetic models suggest that while rate‐acceleration is viable, no sigmoidal kinetics are expected due to the supramolecular equilibria that govern the reaction. Additionally, a theoretical analysis of a separate catalytic system is performed which delineates the contributions of each catalyst, demonstrating the synergy that occurs when model simulations and experimental characterization are performed simultaneously. Furthermore, the optimized models are used to predict how the linearity of the reaction progress curves can be increased. The work presented here demonstrates that the development of designed feedback in artificial synthetic systems still requires considerable effort, in which model‐driven engineering can play an important role. Additionally, it demonstrates the increased freedom that is inherent in model predictions, which can be employed to enhance molecular designs before synthesis has started.

6.4 Experimental section

Rule‐based simulations were performed in a virtual machine (Oracle VirtualBox 4.3.12 r93733, Oracle Corporation) running Ubuntu (12.04 LTS, 64‐bit), Matlab (2012b, 8.0.0.783, Mathworks) and BioNetGen (version 2.2.2). Other model simulations and non‐linear optimizations were performed using Matlab (R2016a, version 9.0.0.341360, Mathworks) along with its optimization, curve fitting and symbolic math toolboxes. Where appropriate, mass balances were solved analytically using Mathematica (version 9.0.1.0, Wolfram Research, Inc.). Otherwise, mass balances were solved numerically using either the fzero or fsolve function included in Matlab. Non‐linear least squares optimizations were performed using the lsqcurvefit function from Matlab's optimization toolbox. Here, the Levenberg‐Marquardt method was used to minimize the residual sum of squares. A thousand fits were performed for each optimization. Initial parameters for the fits were distributed using Latin hypercube sampling (implemented in the lhsdesign function), which spreads the starting points evenly over the multidimensional parameter space and thereby increases the chance of locating the global optimum. The optimization with the lowest squared 2‐norm of the residuals is used as the best fit, while optimizations with a squared 2‐norm within 5 percent of the best fit are considered equally good fits. Estimates of the standard deviations of the optimized parameters are generated from the Jacobian and normalized residuals.31
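The multi‐start strategy described above can be illustrated with a minimal sketch. It is written in Python rather than Matlab and is not the code used for this work: latin_hypercube is a hypothetical stand‐in for lhsdesign, the bounds are placeholders, and equally_good_fits only mimics the 5 percent selection rule on a list of (squared 2‐norm, parameters) pairs that a fitting routine would produce.

```python
import random


def latin_hypercube(n_samples, bounds, seed=0):
    """Draw n_samples starting points; every dimension is cut into n_samples
    equal-width strata and each stratum is used exactly once."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        width = (high - low) / n_samples
        # One random point inside each stratum of this dimension.
        column = [low + (i + rng.random()) * width for i in range(n_samples)]
        # Shuffle so that strata are paired randomly across dimensions.
        rng.shuffle(column)
        columns.append(column)
    # Transpose: one tuple of coordinates per starting point.
    return list(zip(*columns))


def equally_good_fits(fits, tolerance=0.05):
    """fits is a list of (squared_2_norm, parameters) pairs. Return the best
    fit and every fit whose squared 2-norm is within `tolerance` of it."""
    best = min(fits, key=lambda fit: fit[0])
    accepted = [fit for fit in fits if fit[0] <= best[0] * (1.0 + tolerance)]
    return best, accepted
```

A call such as latin_hypercube(1000, [(0.0, 1.0), (0.5, 2.0)]) would give a thousand starting points for a two‐parameter fit; each point would then be passed to the optimizer, and the resulting residual norms collected for equally_good_fits.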

6.4.1 Rule based model for the self‐accelerating reaction

The reaction rules for the NaPy catalyzed Michael addition, UPy‐UPy and UPy‐NaPy dimerization (Fig. 6.11), and divalent UPy elongation, end‐capping and cyclization (Fig. 6.12) are available online as open source files.32 The rules for cyclization were automatically written using a custom Matlab script, so that the maximum ring size could be set manually. The kinetic constants used in the simulations are based on experimentally determined values for UPy‐UPy dimerization.33 We assumed that the forward kinetic constant for UPy‐NaPy dimerization was equal to that of UPy‐UPy dimerization.

Figure 6.11 Reaction rules for the Michael addition, UPy‐UPy and UPy‐NaPy dimerization.


Figure 6.12 Reaction rules for divalent UPy dimerization, elongation, end‐capping and cyclization.

6.4.2 Kinetic model of the Michael addition between maleimide and pentanedione

The overall reaction rate of the Michael addition between maleimide 1 and pentanedione 2 is described by mass action kinetics, i.e. the rate of the reaction, rM1, is directly proportional to the concentrations of reagents and catalysts (Eq. 6.2; Scheme 6.1). The only exception to this is the catalysis by free NaPy and UPy‐NaPy dimers since the experimental data could not be described with a reaction order of unity. Therefore, the reaction orders were added as free parameters.


kkK CO  product KCO23 2 3 P 2 rkksubstrate U  U P  (6.2) M 1U2cUP2c22   aaUN N kkUNUN N  N where [substrate], [K2CO3], [product], [U2c], [U2Pc], [UN], and [N] are the concentrations of substrates and catalysts (K2CO3, Michael product, diUPy•K2CO3 complex, diUPy•product•K2CO3 complex, UPy‐NaPy dimers, and free NaPy, respectively), kK2CO3, kP, kU2, kU2P, kUN, and kN are their corresponding kinetic rate constants, and aN and aUN are the reaction orders of free NaPy and UPy‐NaPy dimer. Catalysts that have a subscripted c suffix have a separate complexation reaction with

K2CO3, before they become catalytically active. We assume that K2CO3 complex formation is not reversible, so only the forward reaction is included in the model.
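As an illustration of how the terms of Eq. 6.2 combine, the following Python sketch evaluates the overall rate from dictionaries of concentrations and rate constants. The function name and dictionary keys are hypothetical labels chosen for this example; the original implementation was written in Matlab and is not reproduced here.

```python
def rate_m1(conc, k, a_n=1.0, a_un=1.0):
    """Overall Michael addition rate with the structure of Eq. 6.2.

    conc and k are dictionaries of concentrations and rate constants;
    a_n and a_un are the reaction orders of free NaPy and UPy-NaPy dimers.
    """
    catalysis = (
        k["K2CO3"] * conc["K2CO3"]          # background reaction
        + k["P"] * conc["product"]          # autocatalysis by the product
        + k["U2"] * conc["U2c"]             # diUPy.K2CO3 complex
        + k["U2P"] * conc["U2Pc"]           # diUPy.product.K2CO3 complex
        + k["UN"] * conc["UN"] ** a_un      # UPy-NaPy dimers, free reaction order
        + k["N"] * conc["N"] ** a_n         # free NaPy, free reaction order
    )
    return conc["substrate"] ** 2 * catalysis
```

Keeping the reaction orders as separate arguments mirrors the statement that only the NaPy and UPy‐NaPy terms deviate from first order.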

Scheme 6.1 Kinetic mass action model describing catalysis by UPy 4 and NaPy 5, including the background reaction, autocatalysis, diUPy•K2CO3 complexation, UPy autoinduction, NaPy catalysis, and UPy‐NaPy catalysis.

The full set of ODEs describing the Michael addition comprises ODEs for the reactants, the product, free UPy dimers, the diUPy•K2CO3 complex, and the diUPy•product•K2CO3 complex (Eq. 6.3). Furthermore, it was assumed that the equilibria describing UPy‐UPy and UPy‐NaPy dimerization are not shifted during the reaction. While 1H NMR spectra obtained during the reaction show that the equilibria are shifted, the magnitude of the shift could not be quantitatively determined due to deuteration. It was estimated that the shifts in equilibria had a maximum change of 30 %, and inclusion of the equilibria in the model did not lead to a significantly better fit. Therefore, they were omitted from the model.


d substrate 2r dt M1 d 2 productrkMUP12232c  U P K CO  k UP U P  substrate  dt 22cc d 2 Ukk U K CO  U substrate 22232cUU22c    dt (6.3) 2 kkU P K CO U P substrate UP22cc2232c UP   d 2 U2ckkUU U 2  K 2 CO 3   U 2c  substrate dt 22c d 2 UP2ckkUP U 2 P  KCO 2 3  UP  UP 2c  substrate  dt 22cc

The ODEs were implemented in Matlab and solved using a stiff solver (ode15s). A Jacobian matrix was calculated using Matlab’s symbolic math toolbox and provided to the solver to decrease computational time.
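To show how a lag phase can arise from a separate complexation step, the sketch below integrates a reduced version of the model in Python. Several simplifications are assumptions of this example and not part of the original work: autoinduction and NaPy terms are left out, [K2CO3] is held constant, [substrate] is consumed twice as fast as product is formed, all parameter values are dimensionless placeholders, and a fixed‐step fourth‐order Runge‐Kutta scheme replaces the stiff solver ode15s only to keep the example free of dependencies.

```python
def derivatives(state, k, k2co3):
    """Right-hand side of the reduced model (no autoinduction, no NaPy)."""
    s, p, u2, u2c = state
    rate = s ** 2 * (k["K2CO3"] * k2co3 + k["P"] * p + k["U2"] * u2c)
    complexation = k["U2c"] * u2 * k2co3   # diUPy + K2CO3 -> diUPy.K2CO3
    turnover = k["U2"] * u2c * s ** 2      # complex is consumed per catalytic event
    return (-2.0 * rate, rate, turnover - complexation, complexation - turnover)


def rk4_step(state, dt, k, k2co3):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(slope, h):
        return tuple(y + h * dy for y, dy in zip(state, slope))

    s1 = derivatives(state, k, k2co3)
    s2 = derivatives(shifted(s1, dt / 2.0), k, k2co3)
    s3 = derivatives(shifted(s2, dt / 2.0), k, k2co3)
    s4 = derivatives(shifted(s3, dt), k, k2co3)
    return tuple(
        y + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for y, a, b, c, d in zip(state, s1, s2, s3, s4)
    )


def simulate(state0, k, k2co3, dt, n_steps):
    """Return the list of states visited, starting from state0."""
    trajectory = [tuple(state0)]
    for _ in range(n_steps):
        trajectory.append(rk4_step(trajectory[-1], dt, k, k2co3))
    return trajectory


# Illustrative, dimensionless values only; these are not the fitted parameters.
K_DEMO = {"K2CO3": 1e-3, "P": 0.0, "U2": 5.0, "U2c": 0.05}
STATE0 = (1.0, 0.0, 0.1, 0.0)  # substrate, product, free diUPy, diUPy.K2CO3
TRAJECTORY = simulate(STATE0, K_DEMO, k2co3=1.0, dt=0.01, n_steps=5000)
CONVERSION = [1.0 - state[0] / STATE0[0] for state in TRAJECTORY]
```

For parameter sets that span many orders of magnitude, as is typical for the fitted constants, an implicit stiff solver with an analytical Jacobian remains the appropriate choice; the explicit scheme above would require impractically small steps.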

6.4.3 Kinetic model for the Michael addition between maleimide and UPy construct 6

For the experiments where maleimide 1 and UPy construct 6 are used as reagents, the overall reaction rate of the Michael addition was also described by mass action kinetics, including the change in catalysts as a function of conversion, i.e. the conversion of UPy 6 dimers to UPy 7 dimers (Eq. 6.4). Furthermore, in addition to the intermolecular catalysis by the background reaction and by diUPy•K2CO3 complexes, the possibility of intramolecular catalysis was added for diUPy•K2CO3 complexes comprising UPy construct 6.

kkKCO UD KCO23 2 3 UD 2 2c 2 rkksubstrate UDUP  UP M 2 UDUP c UP2 2c (6.4) aaUN N kkUNUN N  N maleimidekk UD UDUP UD2 ,intra 2c UDUP,intra c where [UD], [UD2c], [UDUPc], and [UP2c] are the concentrations of UPypent, diUPypent•K2CO3,

UPypent•UPyproduct•K2CO3, and diUPyproduct•K2CO3, respectively, and kUD2, kUDUP, kUP2, kUD2,intra, and kUDUP,intra are the corresponding inter‐ and intramolecular rate constants. The full set of ODEs describing the Michael addition of maleimide 1 and UPy construct 6 was constructed with the assumption that the equilibria describing UPy‐UPy and UPy‐ NaPy dimerization shift during the reaction as a consequence of the consumption of

UPypent and the production of UPyproduct (Eq. 6.5). To accommodate the inclusion of dimerization and its possibility to shift during the reaction, a simple mass balance for

UPypent and UPyproduct dimerization was solved each time the numerical values of the ODEs

128

Model‐driven engineering of reaction kinetics using feedback

were calculated. Solving the mass balance yields [UD2], [UDUP], and [UP2] during the course of the reaction, which are used to calculate the reaction rates for the formation of

[UD2,c], [UDUPc], and [UP2,c]. A good description of the experimental data could only be obtained when we assumed that K2CO3 complexation influences the UPy‐UPy and UPy‐ NaPy dimerization equilibria. The ODEs were implemented in Matlab and solved using a stiff solver (ode15s). rk UD K CO UD22 ,cUD c  2 2 3

rkUDUP,c UDUPcUDUP K 2 CO 3 rk UP K CO UP22 ,c UP c 2 2 3 rk UD maleimide UD UD22 ,interUD  2c 

rkUDUP,inter UDUPUDUP c  maleimide UD rk UP maleimide UD UP22 ,inter UP 2c  rk UD maleimide UD22 ,intra UD ,intra2c

rkUDUP,intra UDUP,intraUDUP c  maleimide d maleimide r dt M 2 d UDrrM 2  2 UD,c  2 r UD,inter (6.5) dt 22

rrUDUP,c UDUP,inter r UD2 ,intra d UP rr   r dt M 2 UDUP,c UDUP,inter 22rr UP22 ,c UP ,inter

2rUDUP,intra d UD2crr UD ,c  UD ,inter  r UD ,intra dt 22 2 d UDUP rr   r dt c UDUP,c UDUP,inter UDUP,intra d UP2c P rr UP,c UP,inter dt 22

6.5 References

(1) Brandman, O.; Ferrell, J. E.; Li, R.; Meyer, T. Science 2005, 310 (5747), 496.
(2) Zeron, E. S. Math. Model. Nat. Phenom. 2008, 3 (2), 67.
(3) Ferrell, J. E.; Tsai, T. Y.‐C.; Yang, Q. Cell 2011, 144 (6), 874.
(4) Gerhart, J. C.; Pardee, A. B. J. Biol. Chem. 1962, 237 (3), 891.
(5) Ferrell Jr, J. E. Curr. Opin. Cell Biol. 2002, 14 (2), 140.
(6) Bretl, D. J.; Demetriadou, C.; Zahrt, T. C. Microbiol. Mol. Biol. Rev. 2011, 75 (4), 566.
(7) Wu, H. Cell 2013, 153 (2), 287.
(8) Monod, J.; Changeux, J.‐P.; Jacob, F. J. Mol. Biol. 1963, 6 (4), 306.
(9) Kobe, B.; Kemp, B. E. Nature 1999, 402 (6760), 373.
(10) Fastrez, J. ChemBioChem 2009, 10 (18), 2824.
(11) Viappiani, C.; Abbruzzetti, S.; Ronda, L.; Bettati, S.; Henry, E. R.; Mozzarelli, A.; Eaton, W. A. Proc. Natl. Acad. Sci. 2014, 111 (35), 12758.
(12) Blanco, V.; Leigh, D. A.; Marcos, V. Chem. Soc. Rev. 2015, 44 (15), 5341.
(13) Raynal, M.; Ballester, P.; Vidal‐Ferran, A.; Leeuwen, P. W. N. M. van. Chem. Soc. Rev. 2014, 43 (5), 1734.
(14) Bissette, A. J.; Fletcher, S. P. Angew. Chem. Int. Ed. 2013, 52 (49), 12800.
(15) Blackmond, D. G. Angew. Chem. 2009, 121 (2), 392.
(16) Tjivikua, T.; Ballester, P.; Rebek Jr, J. J. Am. Chem. Soc. 1990, 112 (3), 1249.
(17) Sievers, D.; von Kiedrowski, G. Nature 1994, 369 (6477), 221.
(18) Vidonne, A.; Philp, D. Eur. J. Org. Chem. 2009, No. 5, 593.
(19) von Kiedrowski, G. In Bioorganic Chemistry Frontiers; Springer Berlin Heidelberg, 1993; Vol. 3, pp 113–146.
(20) Yoon, H. J.; Mirkin, C. A. J. Am. Chem. Soc. 2008, 130 (35), 11590.
(21) Yoon, H. J.; Kuwabara, J.; Kim, J.‐H.; Mirkin, C. A. Science 2010, 330 (6000), 66.
(22) Rodríguez‐Llansola, F.; Meijer, E. W. J. Am. Chem. Soc. 2013, 135 (17), 6549.
(23) Teunissen, A. J. P.; Haas, R. J. C. van der; Vekemans, J. A. J. M.; Palmans, A. R. A.; Meijer, E. W. Bull. Chem. Soc. Jpn. 2016, 89 (3), 308.
(24) Mandolini, L. In Advances in Physical Organic Chemistry; Gold, V., Bethell, D., Eds.; Academic Press, 1986; Vol. 22, pp 1–111.
(25) Hogben, H. J.; Sprafke, J. K.; Hoffmann, M.; Pawlicki, M.; Anderson, H. L. J. Am. Chem. Soc. 2011, 133 (51), 20962.
(26) Blinov, M. L.; Ruebenacker, O.; Moraru, I. I. IET Syst. Biol. 2008, 2 (5), 363.
(27) Faeder, J. R.; Blinov, M. L.; Hlavacek, W. S. Methods Mol. Biol. Clifton NJ 2009, 500, 113.
(28) Sneddon, M. W.; Faeder, J. R.; Emonet, T. Nat. Methods 2011, 8 (2), 177.
(29) Teunissen, A. J. P. Competing Interactions in Chemical Reaction Networks. PhD Thesis, Eindhoven University of Technology: Eindhoven, 2017.
(30) Page, M. I.; Jencks, W. P. Proc. Natl. Acad. Sci. 1971, 68 (8), 1678.
(31) Press, W. H.; Flannery, B. P.; Teukolsky, S. A.; Vetterling, W. T. Numerical Recipes in C: The Art of Scientific Computing, 2nd ed.; Cambridge University Press: Cambridge; New York, 1992.
(32) BioNetGen reaction rules for the self‐accelerating reaction. https://osf.io/htkpu/.
(33) Söntjens, S. H. M.; Sijbesma, R. P.; van Genderen, M. H. P.; Meijer, E. W. J. Am. Chem. Soc. 2000, 122 (31), 7487.


Summary


Model‐driven engineering in supramolecular systems

Supramolecular systems comprise molecular building blocks that interact via non‐covalent forces. While early studies on synthetic supramolecular systems could be interpreted through experimental techniques, the field has now reached a point where systems are being studied that are too complex to understand without the aid of theoretical models. In this thesis, we present a predominantly theoretical study of supramolecular systems and explore its synergy with experimental techniques. Furthermore, we show that model‐driven engineering, i.e. using models to predict which molecular changes will improve specific functionalities, is a promising avenue for further scientific development within the field of supramolecular chemistry.

In chapter 1, we present an introduction to the field of supramolecular chemistry and highlight the most important mechanisms used in the construction of complex supramolecular systems: molecular competition, multivalency and cooperativity. Furthermore, the topic of ring‐chain equilibria is introduced, which forms the basis of most of the research in this thesis. Lastly, we introduce the supramolecular binding motifs ureidopyrimidinone (UPy) and 2,7‐diamido‐1,8‐naphthyridine (NaPy), which are used throughout the thesis.

In chapter 2, an overview is given of the various modeling techniques used in this thesis. The techniques range from constructing and solving mass balances for thermodynamic simulations, through solving coupled stiff ODE systems for kinetic simulations, to rule‐based modeling for kinetic simulations using the software package BioNetGen. The chapter details various approaches for solving problems both analytically and numerically, and covers challenges often encountered during non‐linear regression of experimental data.

In chapter 3, a two‐component supramolecular system which displays molecular buffering is investigated. The system comprises a monovalent naphthyridine (NaPy) molecule that is buffered by a divalent ureidopyrimidinone (UPy) molecule capable of forming both rings and chains through self‐association. Via an in‐depth combined experimental and theoretical analysis, it is shown that the buffering is due to competition between ring formation of the divalent UPy molecule, and end‐capping of a chain by the monovalent NaPy molecule. Analysis of the validated theoretical model reveals that the ring formation tendency is the critical parameter in optimizing the broadness of the buffered region as well as the maximum concentration of the buffered molecule.

In chapter 4, the supramolecular buffering system is expanded to include mono‐, tri‐ and tetravalent UPy molecules, to study the influence of multivalency on the buffering performance. We discover an odd‐even effect, where odd‐valent molecules display inferior and even‐valent molecules display superior buffering. A combined experimental and theoretical approach is used to obtain the molecular principles underlying the difference in buffering performance.

In chapter 5, a C2v‐symmetric trivalent UPy is studied, which has the ability to form two mutually exclusive intramolecular cycles. The experimental characterization of the trivalent molecule, performed by Bram Teunissen, was not enough to definitively prove its conformations in solution. Therefore, a thermodynamic model is constructed with which the experimental data is fitted, yielding a complete speciation and more insight into the underlying molecular principles.

In chapter 6, we present model predictions on the viability of a conceptual self‐accelerating reaction, in which product cyclization leads to release of catalyst. Furthermore, we study the kinetics of a K2CO3 catalyzed Michael addition, which is accelerated by phase‐transfer catalysts NaPy and UPy. The supramolecular dimerization of the UPy‐NaPy system leads to different types of kinetics, ranging from bimolecular to pseudo zero order to autocatalytic. A kinetic model is constructed to explain the different types of kinetics, and model predictions are used to obtain more linear kinetics.


Curriculum vitae

Tim Paffen was born on June 1, 1987 in Kerkrade (the Netherlands). After obtaining his atheneum diploma in 2005 at College Rolduc in Kerkrade, he studied Chemical Engineering and Chemistry at the Eindhoven University of Technology (TU/e). His master’s program included an international research internship in 2011 at the University of California in Santa Barbara (UCSB) in the United States. There, he investigated the self‐assembly behavior of histamine‐functionalized triblock copolymers in the group of prof. dr. Craig Hawker. He finished his master’s studies in Molecular Engineering in 2012 within the research group for Macromolecular and Organic Chemistry at TU/e, where he investigated the folding mechanism of single‐chain polymeric nanoparticles. In September 2012 he started his PhD project in the same group, supervised by prof. dr. E.W. Meijer and dr. ir. T.F.A. de Greef. The most important results of this research are described in this thesis.


Publications

Regulating kinetic feedback in the Michael addition using interacting supramolecular catalysts. A.J.P. Teunissen, T.F.E. Paffen, I.A.W. Filot, R.J.C. van der Haas, M.D. Lanting, T.F.A. de Greef, E.W. Meijer, in preparation.

Model‐driven Engineering of Improved Supramolecular Buffering by Multivalency. T.F.E. Paffen, A.J.P. Teunissen, T.F.A. de Greef, E.W. Meijer, Proc. Natl. Acad. Sci., in revision.

Regulating Competing Supramolecular Interactions Using Ligand Concentration. Teunissen, A. J. P.; Paffen, T. F. E.; Ercolani, G.; de Greef, T. F. A.; Meijer, E. W. J. Am. Chem. Soc. 2016, 138 (21), 6852.

Supramolecular Buffering by Ring–Chain Competition. Paffen, T. F. E.; Ercolani, G.; de Greef, T. F. A.; Meijer, E. W. J. Am. Chem. Soc. 2015, 137 (4), 1501.

Folding Polymers with Pendant Hydrogen Bonding Motifs in Water: The Effect of Polymer Length and Concentration on the Shape and Size of Single‐Chain Polymeric Nanoparticles. Stals, P. J. M.; Gillissen, M. A. J.; Paffen, T. F. E.; de Greef, T. F. A.; Lindner, P.; Meijer, E. W.; Palmans, A. R. A.; Voets, I. K. Macromolecules 2014, 47 (9), 2947.

Nitroxide‐mediated controlled radical polymerizations of styrene derivatives. Stals, P. J. M.; Phan, T. N. T.; Gigmes, D.; Paffen, T. F. E.; Meijer, E. W.; Palmans, A. R. A. J. Polym. Sci. A Polym. Chem. 2012, 50 (4), 780.

pH‐triggered self‐assembly of biocompatible histamine‐functionalized triblock copolymers. Lundberg, P.; Lynd, N. A.; Zhang, Y.; Zeng, X.; Krogstad, D. V.; Paffen, T.; Malkoch, M.; Nyström, A. M.; Hawker, C. J. Soft Matter 2012, 9 (1), 82.

Probing the Limits of the Majority‐Rules Principle in a Dynamic Supramolecular Polymer. Smulders, M. M. J.; Stals, P. J. M.; Mes, T.; Paffen, T. F. E.; Schenning, A. P. H. J.; Palmans, A. R. A.; Meijer, E. W. J. Am. Chem. Soc. 2010, 132 (2), 620.


Acknowledgements

The research presented in this thesis would not have been possible without the help of several amazing people. First and foremost, I would like to thank Bert for being an inspiring professor and for allowing me to do research in his group. Bert, I think our first official meeting was when I volunteered for the Santa Barbara exchange program. While I was quite nervous about taking such an uncharacteristically bold action, I was quickly relieved to learn that you appreciated the directness. You were always very pleasant in our exchanges, and I would like to especially thank you for your flexibility during and after Ella’s birth, which helped enormously. I think your personal approach is one of your best character traits, along with your scientific curiosity.

Tom, during my graduation and PhD I always felt welcome to discuss any subject with you, both scientific and personal. Your extensive knowledge and high focus always gave me insights to help me further on my way. I wish you all the best in the rest of your scientific career and in your personal life.

I thank Jeffrey Moore, Jurriaan Huskens, and Giovanni Pavan for carefully reviewing my thesis and making the trip to Eindhoven to be part of my committee even though we haven’t met yet. I feel honored to be able to discuss my thesis with such distinguished scientists. Peter, I am also honored to have you on my committee so that I may yet again answer your critical questions. Thank you for helping me prepare for my PhD and for showing that simulations and experiments go great together. Anja, we have known each other since my start as a Spinoza student, and I’ve greatly benefitted from your guidance since then. Thank you for being part of my committee.

Gianfranco Ercolani, although we have never met in person, I would like to thank you for taking the time to write your thoughts on ring‐chain equilibria and for answering any of my questions that would come up. Without you this thesis would likely be filled with errors concerning statistical factors and effective molarities. I also hope you cannot find any errors now that it has been printed.

Bram, as with your thesis, mine would have been completely different without your contributions. Your talents for synthesis and analysis led to the discovery of many new effects and strange results. Although I wasn’t always happy with having to redo the models, I find that I am missing our discussions now that you have left for New York. I wish you all the best in the rest of your career and I hope our paths will cross again sometime.

Müge, thank you for your enjoyable company in the office, the “good meurnings”, the delectable dishes, and of course for being my paranimf. I look forward to many more culinary exchanges in the future. Patrick, although you weren’t directly involved in my PhD, you did help me to prepare for it. Thank you for guiding me through the wonderful world of science, for telling me to go ask Bert about the exchange program, and for your good recommendations of fantasy authors. Robert, thank you for convincing me to try singing at Vokollage; I’m sure neither of us expected all the activities and concerts that followed. I hope we will have time to sing together again sometime soon. Thank you for being my paranimf; I am pretty sure there won’t be any questions in German for you during the defense.

René, spending our writing periods together was completely enjoyable. Being able to share and talk about the various sources of stress allowed me to keep things in perspective and remain sane. Helen, I enjoyed having you in our office and talking about bad movies, good books and new recipes. I wish you well in your academic career. Andreas 1, thank you for all the nice discussions we had over coffee/tea and the rigorous exercise program you introduced in our office. I wish you all the best in your future endeavors. Andreas 2, I mostly remember your enthusiasm and cheerfulness; I hope you continue to do well. Daan, thanks for all the discussions about Matlab, the organization of the Meijer Meeting and the matches of Super Smash Bros. I still hope to beat you one day… Beatrice, thank you for taking over the organization from Daan; I wish you all the best with the rest of your PhD. José, thank you for all the nice talks we had, and I wish you good luck in Groningen and in the rest of your scientific career. Tristan and Yiliu, thank you for all the talks about babies; they really helped me to explore this whole different world of parenting. Ghislaine, thank you for the work you’ve done on the evolving nanoparticles and your enthusiasm for research. I’m saddened that the project never really worked as intended. Elizabeth, you joined our office in a stressful period, but we still managed to get along well; I hope your writing period will involve less stress. Mathijs, I enjoyed working together on problematic Matlab scripts; good luck with your PhD. Alexandre, thank you for being so interested in the topic of my research and for being the first to ask for my thesis.

During my PhD I participated in the introduction and subsequent demise of the electronic lab journal in our research group. The meetings were always very interesting and I would like to thank Sjef, Joost, Bart and Johan for the enjoyable discussions, which gave me many new insights into university and company politics. Jolanda, thank you for arranging a spot in lab 2, and I apologize for not spending more time there. During my PhD I discovered that my affinity lies more with computers than with performing reactions. Also a huge thanks to the other members of the supporting staff, without whom I would have done far less research: Nora, Carla, Martina, Margot, Ralf, Lou, and Bas.

Mark, our gaming sessions always filled me with new energy and motivation to continue research. I thoroughly enjoyed them and I am sure that many more will follow. I wish you, Monique, Tijs and your baby all the best.

I’d like to thank my family for asking about my research and supporting me whenever I travelled south. Also many thanks to my unofficial in‐laws for making me feel at home and for being interested in my research and hobbies. My final thanks go to Klara, the love of my life, and Ella, the joy of my life. I feel that you are both continually making me a better person and your support helped me through this final difficult task of thesis writing. Klara, thank you for making me feel accepted. I look forward to growing old with you.

Thank you all,

Tim
