Design and Verification of Modular Components in Thermodynamic
Total Page:16
File Type:pdf, Size:1020Kb
Design and Verification of Modular Components in Thermodynamic Binding Networks By David Russell Haley Dissertation Submitted in partial satisfaction of the requirements for the degree of Doctor of Philosophy in Applied Mathematics in the Office of Graduate Studies of the University of California Davis Approved: David Doty, Chair Raissa D'Souza James Crutchfield Committee in Charge 2021 i Copyright © 2021 by David Russell Haley All rights reserved. This work is dedicated to everyone who has in some way supported me or encouraged me over the course of my career. This group is in total too numerous to name here, but the individuals know who they are, even though most will be modest and try to underplay their particular role. I especially thank my wife, Kelly Meyer, for her exceptional patience over the better part of a decade during which she had front-row tickets with which to observe the often trying experiences of graduate education. ii Contents Abstract . vi 1 Introduction 2 2 Thermodynamic Binding Networks Model 8 2.1 Model . .8 2.1.1 TBN . .8 2.1.2 Configuration . .8 2.1.3 Path . .9 2.1.4 Energy . 10 2.1.5 Barrier . 12 2.2 Saturated paths . 13 2.2.1 Bounds on energy change . 13 2.2.2 Saturated paths suffice . 14 2.3 Modeling bonds . 16 2.3.1 Bond-Aware Model . 16 2.3.2 Equivalence . 17 3 Engineering Systems with Programmable Energy Barriers 19 3.1 Translator cycle . 20 3.1.1 Construction . 20 3.1.2 Proof of Barrier . 30 3.2 Grid gate . 32 3.2.1 Construction . 32 3.2.2 Proof of Barrier . 33 3.2.3 Catalysis . 34 3.2.4 Autocatalysis . 39 3.3 Reducing binding site complexity for multistable TBNs . 41 3.3.1 Golfergate: Pairwise intersection 1 of state monomers . 42 iii 3.3.2 Catalysts . 44 3.3.3 Three states . 45 3.3.4 Many states . 46 p 3.3.5 Catalyst with Barrier n=2 − n ln n ................. 48 4 PSPACE-completeness of Reachability 51 4.1 Introduction . 51 4.1.1 Related work . 52 4.2 Model . 54 4.2.1 Chemical reaction networks . 54 4.2.2 Constraints on CRNs . 55 4.3 PSPACE-completeness of binary, reversible, singular, catalytic CRNs . 56 4.3.1 Definition of CRN simulation . 56 4.3.2 Construction . 58 4.3.3 Proof . 61 4.4 Singular ≤3-catalytic CRNs can simulate singular catalytic CRNs . 69 4.4.1 Simulation construction . 69 4.5 TBNs can simulate (restricted) CRNs . 74 4.5.1 Thermodynamic Binding Networks Model . 74 4.5.2 Simulation of CRNs . 76 4.5.3 Grid gate . 78 4.5.4 Modifications to grid gate for simulating CRNs . 79 5 Verification Software 89 5.1 Preliminaries . 89 5.1.1 Definitions . 89 5.1.2 Solvers . 91 5.2 Computing stable configurations of TBNs . 92 5.2.1 Finding stable configurations of TBNs . 92 5.2.2 Casting StableConfigs as an IP . 93 iv 5.2.3 Empirical running time measurements . 98 5.3 Computing bases of locally stable configurations of TBNs . 99 5.3.1 Equivalence of polymer bases and Hilbert bases . 99 5.3.2 Using the polymer basis to reason about TBN behavior . 102 5.3.3 A case example: Circular Translator Cascade . 103 6 Grid Gate Experiment 106 6.1 Designs . 106 6.2 Designing Orthogonal Domains . 110 6.2.1 Constraints . 110 6.3 Results . 116 6.3.1 Assay technique . 116 6.3.2 Stable formation of Grid Gate . 118 6.4 Next Steps . 123 v Abstract Design and Verification of Modular Components in Thermodynamic Binding Networks Designing engineered molecular systems typically requires specialized knowledge of the particular substrate; however, one can also reason about such systems in a substrate- independent fashion, by examining the underlying energetics that govern any chemical substrate: the formation of molecular bonds and the number of complexes formed. The thermodynamic binding networks (TBN) model was developed to study such systems, and in particular, to determine fault-tolerance in molecular systems such as DNA strand displacement cascades. This dissertation details an extended form of the model in which complexes can merge together or split apart at an energetic benefit/cost. This exten- sion allows one to also reason about reachability of configurations with respect to energy barriers. Several theoretical constructions are presented here which demonstrate that such energy barriers can be programmably large, implement catalytic and autocatalytic behavior, and be part of larger, modular systems in which complex behavior can be real- ized. Indeed, reasoning about the energy barrier between configurations in such systems is proved here to be PSPACE-hard, even to a c-factor approximation. This dissertation also contains details of integer and constraint programming formulations that can solve certain questions related to a system's energetics. Also made formal here is the con- nection between TBNs and the well-studied combinatorial concept of Hilbert bases, and examples are given which illustrate how one can use a Hilbert basis to verify particular aspects of TBN designs. Finally, the details of an experiment attempting to implement one of the programmable TBN constructions are given, along with empirical results and interpretations. vi Contents 1 Chapter 1 Introduction Recent experimental breakthroughs in DNA nanotechnology [19] have enabled the con- struction of intricate molecular machinery whose complexity rivals that of other biological macromolecules, even executing general-purpose algorithms [59]. A major challenge in creating synthetic DNA molecules that undergo desired chemical reactions is the occur- rence of erroneous \leak" reactions [45], driven by the fact that the products of the leak reactions are more energetically favorable. A promising design principle to mitigate such errors is to build \thermodynamic robustness" into the system, ensuring that leak reac- tions incur an energetic cost [55, 57] by logically forcing one of two unfavorable events: either many molecular bonds must break|an \enthalpic" cost|or many separate molecu- lar complexes (called polymers in this document) must simultaneously come together|an \entropic" cost. The model of thermodynamic binding networks (TBNs) [25] was defined as a combina- torial abstraction of such molecules, deliberately simplifying substrate-dependent details of DNA in order to isolate the foundational energetic contributions of forming bonds and separating polymers. A TBN consists of monomers containing specific binding sites, where binding site a can bind only to its complement a∗. A key aspect of the TBN model is the lack of geometry: a monomer is an unordered collection of binding sites such as fa; a; b∗; cg.A configuration of a TBN describes which monomers are grouped into poly- mers; bonds can only form within a polymer. One can formalize the \correctness" of a TBN by requiring that its desired configuration(s) be stable: the configuration maximizes 2 the number of bonds formed, a.k.a., it is saturated, and, among all saturated configura- tions, it maximizes the number of separate polymers.1 See Fig. 1.1 for an example. Stable configurations are meant to capture the minimum free energy structures of the TBN. Un- fortunately, answering basic questions such as \Is a particular TBN configuration stable?" turn out to be NP-hard [11]. a b a b a b a b a b a b a b a* b* a* b* a* b* a* b* a* b* a* b* a* b* a b a b a b a b a b a b a b not saturated saturated stable Figure 1.1: Example of a simple thermodynamic binding network (TBN). There are four monomers: ∗ ∗ m1 = fa ; b g; m2 = fa; bg; m3 = fag; m4 = fbg; with seven configurations shown: four of these configu- rations are saturated because they have the maximum of 2 bonds. Of these, three have 2 polymers and one has 3 polymers, making the latter the only stable configuration. Despite the suggestive lines between binding sites, the model of this paper ignores individual bonds, defining a configuration solely by how it partitions the set of monomers into polymers, assuming that a maximum number of bonds will form within each polymer. (Thus other configurations exist besides those shown, which would merge polymers shown without allowing new bonds to form.) Abstract mathematical models of molecular systems, such as chemical reaction net- works, have long been useful in natural science to study the properties of natural molecules. For a chemical system designed to perform computation, we can prescribe a chemical pro- gram with abstract chemical reactions such as A + C ! B + C (1.1) A ! B: (1.2) In particular, a program may require Eq. 1.1 and forbid Eq. 1.2. But what remains hidden at this level of abstraction is a well-known chemical constraint: if Eq. 1.1 is possible, then Eq. 1.2 must also be, no matter the exact substances. Knowing this, we might try to slow Eq. 1.2 by ensuring B has high free energy. But then B + C must also have high free energy, so Eq. 1.1 slows in tandem. The only option to slow Eq. 1.2 but not Eq. 1.1 is to 1This definition captures the limiting case (often approximated in practice in DNA nanotechnology) corresponding to increasing the strength of bonds, while diluting (increasing volume), such that the ratio of binding to unbinding rate goes to infinity. 3 use a kinetic barrier: designing A so that, although it is possible for A to reconfigure into B, the system must traverse a higher energy (less favorable) intermediate in the absence of C. It seems difficult to engineer kinetic energy barriers and catalysis in a way that is independent of the particular chemical substrate.