Novel Methods for Learning and Adaptation in Chemical Reaction Networks
by Peter Banda

A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Computer Science

Dissertation Committee:
Christof Teuscher, Chair
Melanie Mitchell
Bart Massey
Niles Lehman
Michael Bartlett

Portland State University
2015

© 2015 Peter Banda

ABSTRACT

State-of-the-art biochemical systems for medical applications and chemical computing are application-specific and cannot be re-programmed or trained once fabricated. The implementation of adaptive biochemical systems that would offer flexibility through programmability and autonomous adaptation faces major challenges because of the large number of required chemical species as well as the timing-sensitive feedback loops required for learning. Currently, biochemistry lacks a systems-level vision of what the user-level programming interface and abstraction, with a subsequent translation to chemistry, should look like. By developing adaptation in chemistry, we could replace multiple hard-wired systems with a single programmable template that can be (re)trained to match a desired input-output profile, benefiting smart drug delivery, pattern recognition, and chemical computing.

I aimed to address these challenges by proposing several approaches to learning and adaptation in Chemical Reaction Networks (CRNs), a type of simulated chemistry in which species are unstructured, i.e., identified by symbols rather than molecular structure, and their dynamics, or concentration evolution, are driven by reactions and rate constants that follow mass-action and Michaelis-Menten kinetics.

Several CRN and experimental DNA-based models of neural networks exist. However, these models successfully implement only the forward pass, i.e., the input-weight integration part of a perceptron model. Learning is delegated to a non-chemical system that computes the weights before converting them to molecular concentrations. Autonomous learning, i.e., learning implemented fully inside chemistry, has been absent from both theoretical and experimental research. The research in this thesis offers the first constructive evidence that learning in CRNs is, in fact, possible. I introduced the original concept of a chemical binary perceptron that can learn all 14 linearly separable two-input logic functions and is robust to perturbation of its rate constants. This shows that learning is universal and substrate-free. To simplify the model, I later proposed and applied "asymmetric" chemical arithmetic, which provides a compact way to represent negative numbers in chemistry.
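To make the CRN setting concrete: a CRN of this kind is just a set of symbolic reactions whose mass-action kinetics translate into a system of ODEs over species concentrations. The Python sketch below is a minimal illustration of that translation using an invented toy network (one catalytic reaction and one decay reaction); the species names, reactions, and rate constants are illustrative assumptions, not a model from this thesis.

```python
# Minimal mass-action CRN simulation (illustrative toy network, not a thesis model).
import numpy as np
from scipy.integrate import odeint

species = ["X", "W", "Y"]
idx = {s: i for i, s in enumerate(species)}

# Each reaction: (reactants, products, rate constant), stoichiometry as counts.
reactions = [
    ({"X": 1, "W": 1}, {"Y": 1, "W": 1}, 0.5),   # X + W -> Y + W  (W acts as a catalyst)
    ({"Y": 1},         {},               0.1),   # Y -> nothing    (decay)
]

def mass_action_odes(conc, t):
    """d[conc]/dt under mass-action kinetics: rate = k * product of reactant concentrations."""
    dcdt = np.zeros(len(species))
    for reactants, products, k in reactions:
        rate = k * np.prod([conc[idx[s]] ** n for s, n in reactants.items()])
        for s, n in reactants.items():
            dcdt[idx[s]] -= n * rate
        for s, n in products.items():
            dcdt[idx[s]] += n * rate
    return dcdt

t = np.linspace(0.0, 50.0, 500)
c0 = [1.0, 0.2, 0.0]                        # initial concentrations of X, W, Y
trajectory = odeint(mass_action_odes, c0, t)
print(trajectory[-1])                       # concentrations at the end of the run
```

Michaelis-Menten kinetics would be handled analogously, with the rate expression of a catalyzed reaction replaced by the familiar saturating form instead of a plain product of concentrations.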
To tackle more difficult tasks and to serve more complicated biochemical applications, I introduced several key modular building blocks, each addressing certain aspects of chemical information processing and learning. These parts combine organically into gradually more complex systems. First, instead of simple static Boolean functions, I tackled analog time-series learning and signal processing by modeling an analog chemical perceptron. To store past input concentrations as a sliding window, I implemented a chemical delay line, which feeds the stored values to the underlying chemical perceptron. That allows the system to learn, e.g., a linear moving average and, to some degree, to predict the highly nonlinear NARMA benchmark series.

Another important contribution to the area of chemical learning, which I have helped to shape, is the composability of perceptrons into larger multi-compartment networks. Each compartment hosts a single chemical perceptron, and compartments communicate with each other through a channel-mediated exchange of molecular species. Besides the feedforward pass, I implemented chemical error backpropagation analogous to that of feedforward neural networks. Also, after applying mass-action kinetics to the catalytic reactions, I succeeded in systematically analyzing the ODEs of my models and deriving closed exact and approximate formulas for both the input-weight integration and the weight update with learning-rate annealing. I proved mathematically that the formulas of certain chemical perceptrons equal those of the formal linear and sigmoid neurons, essentially bridging neural networks and adaptive CRNs.

For all my models, the basic methodology was to first design the species and reactions, and then set the rate constants either "empirically" by hand, automatically by a standard genetic algorithm (GA), or analytically where possible. I performed all simulations in my COEL framework, the first cloud-based chemistry modeling tool, accessible at coel-sim.org. I minimized the number of required molecular species and reactions to make a wet chemical implementation possible. I applied an automated mapping technique, Soloveichik's CRN-to-DNA-strand-displacement transformation, to the chemical linear perceptron and the manual signalling delay line, and obtained their full DNA-strand-specified implementations. As an alternative DNA-based substrate, I also mapped these two models to deoxyribozyme-mediated cleavage reactions, reducing the size of the strand-displacement variant to a third. Both DNA-based incarnations could directly serve as blueprints for wet biochemistry.

Besides the actual synthesis of my models and experiments in a biochemical laboratory, the most promising future work is to employ so-called reservoir computing (RC), a novel machine learning method based on recurrent neural networks. The RC approach is relevant because, for time-series prediction, it is clearly superior to classical recurrent networks. It can also be implemented in a variety of substrates, such as electrical circuits, physical systems such as a colony of Escherichia coli, and even water. RC's loose structural assumptions therefore suggest that it could be expressed in a chemical form as well. This could further enhance the expressivity and capabilities of chemically embedded learning.
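The delay-line-plus-perceptron arrangement described above has a direct formal-neuron analogue, which the derived closed formulas connect back to: a sliding window of past inputs feeds a linear unit trained online by the delta rule. The Python sketch below illustrates that formal analogue on the moving-average task; it is not the chemical implementation, and the window size, learning rate, and input signal are illustrative assumptions.

```python
# Formal-neuron analogue of the chemical delay line + linear perceptron (illustrative).
import numpy as np

rng = np.random.default_rng(0)
window = 3                            # delay-line length: number of stored past inputs
weights = rng.uniform(0.0, 0.1, window)
lr = 0.05                             # learning rate (annealing omitted for brevity)

signal = rng.uniform(0.0, 1.0, 2000)  # input "concentration" time series
errors = []
for i in range(window, len(signal)):
    x = signal[i - window:i][::-1]    # delay line: most recent input first
    target = x.mean()                 # task: linear moving average over the window
    y = weights @ x                   # input-weight integration (linear neuron)
    weights += lr * (target - y) * x  # delta-rule weight update
    errors.append((target - y) ** 2)

print("MSE over the last 100 steps:", np.mean(errors[-100:]))
```

In the chemical version, the window values and weights are species concentrations and the update is carried out by reactions rather than an explicit assignment, but the input-output behavior captured by the closed formulas is this same linear-neuron computation.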
My chemical learning systems may have applications in the areas of medical diagnosis and smart medication, e.g., concentration signal processing and monitoring, the detection of harmful species such as chemicals produced by cancer cells in a host (cancer miRNAs), or the detection of a severe event defined as a linear or nonlinear temporal concentration pattern. My approach could replace hard-coded solutions and would make it possible to specify, train, and reuse chemical systems without redesigning them. With time-series integration, biochemical computers could keep a record of changing biological systems and act as diagnostic aids and tools in preventative and highly personalized medicine.

DEDICATION

To my father, a biochemist, my first teacher, and the source of my scientific and moral inspiration.

ACKNOWLEDGMENTS

First of all, I would like to thank my advisor Christof Teuscher for his continuous support, guidance, and encouragement. Step by step, Christof led me forward and pushed me to fulfill my potential. He taught me the virtues of precision and diligence. He showed tremendous understanding of my ever-changing situation. Indeed, it has been a wild ride. He greatly influenced my decision to pursue an academic career.

Many indispensable discussions with Drew Blount helped to shape this work. This collaboration was fruitful and of far-reaching importance for me. Alireza Goudarzi has always expanded my narrow scientific view and asked fundamental, analytic questions that many people would consider overly ambitious. In my eyes he is a true scholar. I also express my appreciation to Josh Moles, who helped me formulate the idea of a chemically implemented delay line, and with whom I share a passion for (fancy) new technologies and frameworks. I deeply appreciated being a part of Teuscher Lab, one of the most productive collectives at PSU. I would like to thank my colleagues, especially Jens Bürger, for his friendship and for always being there to discuss everyday scientific and personal struggles in a very open fashion.

I wish to extend my thanks to the NSF project "Computing with Biomolecules: From Network Motifs to Complex and Adaptive Systems" for its generous support, which made my research possible. I would like to acknowledge all the members of the project, especially Darko Stefanovic, who, despite being at a different university, effectively became my second advisor, as well as Milan Stojanovic, Carl Brown, Matt Lakin, and Steven Graves, for expanding my knowledge of biochemistry and for patiently explaining very basic concepts.

I consider myself very fortunate to have been given the opportunity to attend Portland State University. The remainder of my support came from the Computer Science and Electrical Engineering Departments. A special credit goes to the CAT team