Physical Modeling of Biomolecular Computers: Models, Limitations, and Experimental Validation

Natural Computing 3: 411–426, 2004. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.

JOHN A. ROSE¹,*,† and AKIRA SUYAMA²,†

¹Department of Computer Science, and UPBSB, The University of Tokyo, and Japan Science and Technology Corporation, CREST, Japan (*Author for correspondence, e-mail: [email protected]); ²Institute of Physics, The University of Tokyo, and Japan Science and Technology Corporation, CREST, Japan

Abstract. A principal challenge facing the development and scaling of biomolecular computers is the design of physically well-motivated, experimentally validated simulation tools. In particular, accurate simulations of computational behavior are needed to establish the feasibility of new architectures, and to guide process implementation, by aiding strand design. Key issues accompanying simulator development include model selection, determination of appropriate level of chemical detail, and experimental validation. In this work, each of these issues is discussed in detail, as presented at the workshop on simulation tools for biomolecular computers (SIMBMC), held at the 2003 Congress on Evolutionary Computation. The three major physical models commonly applied to model biomolecular processes, namely molecular mechanics, chemical kinetics, and statistical thermodynamics, are compared and contrasted, with a focus on the potential of each to simulate various aspects of biomolecular computers. The fundamental and practical limitations of each approach are considered, along with a discussion of appropriate chemical detail, at the biopolymer, process, and system levels. The relationship between system analysis and design is addressed, and formalized via the DNA Strand Design problem (DSD). Finally, the need for experimental validation of both underlying parameter sets and overall predictions is discussed, along with illustrative examples.
Key words: biomolecular computing, DNA-based computing, DNA strand design, kinetics, molecular mechanics, statistical thermodynamics

† Authors contributed equally to the present work.

1. Introduction

Since the advent of biomolecular computing (Adleman, 1994), a principal difficulty facing development has been the need for physically well-motivated, experimentally validated simulation tools. In particular, accurate simulations of computational behavior are critical for evaluating feasibility, and for supporting selection of DNA encodings and reaction conditions which optimize reliability and efficiency. A key issue accompanying development is selection of a model that is theoretically adequate, employs experimentally valid parameters, and provides predictions that lend themselves to clear interpretation and experimental validation. For this reason, a number of algorithms and tools for simulation and design have been proposed in the context of DNA computing, including implementations of mass action, via statistical thermodynamic (Hartemink and Gifford, 1999; Rose and Deaton, 2001; Rose et al., 2001; Rose et al., 2002; Andronescu et al., 2002; Rose et al., 2004) and kinetic models (Nishikawa et al., 2001; Uejima and Hagiya, 2004), and implementations based on simple Watson–Crick sequence-similarity (Deaton et al., 1998; Condon et al., 2001; Garzon et al., 2004). Significantly, numerous well-established tools for modeling biopolymer physical behavior, which employ the same fundamental principles (Steger, 1994; Blake et al., 1999; Hofacker, 2003; SantaLucia and Hicks, 2004), are also available outside of the DNA computing field, resulting in the potential for considerable overlap between emerging and established models and packages.
To facilitate smooth integration, and to discuss difficulties facing implementation, a Workshop on Biomolecular Simulation Tools (SIMBMC’03) was recently organized, at the 2003 Congress on Evolutionary Computation. In this work, the physical models commonly used to analyze biopolymer systems are compared, along with a discussion of the relationship to system design (for a discussion of related issues regarding simulator design, see the companion paper (Blain et al., 2004)). Issues regarding model selection and application are discussed, including: appropriateness; limitations; level of detail; and experimental validation, at both parameter and prediction levels. Organization is as follows. Section 2 surveys the principal methods applied to model biomolecular processes, including single-molecule models, via molecular mechanics (Section 2.1) and approaches based on mass action (Section 2.2), including kinetics and statistical thermodynamics. Section 2.3 discusses the appropriate level of chemical detail for nucleic acid systems, at the biopolymer, process, and system levels. Section 2.4 addresses the relationship between analysis and design, via formulation of the DNA Strand Design problem. Section 3 considers issues regarding experimental validation, via a pair of simple examples: (1) a two-state model for characterizing the annealing of long oligonucleotides; and (2) a Hamming encoding strategy for low-error, intermediate-size Tag–Antitag (TAT) system design.

2. Alternative physical models for simulation

A number of approaches may be employed to model chemical systems.
These models generally fall into two categories: (1) methods based on single-molecule simulation (e.g., molecular mechanics), which provide a detailed picture of molecule dynamics (van Holde et al., 1998; Leach, 2001), and methods based on simulation of mass action, including (2) kinetics (Wetmur, 1991; Voit, 2000), and (3) statistical thermodynamics (Cantor and Schimmel, 1980; Wartell and Benight, 1985), which provide averaged measures of system behavior. The appropriate method depends upon the chemical system of interest, and the scales associated with the primary system processes under consideration.

2.1. Single-molecule simulation: molecular mechanics

The most accurate approach is the explicit modeling of a single instance of the molecule of interest. For purposes of modeling behavior following association, a hydrogen-bonded network (e.g., dsDNA) is often considered to be a single biopolymer. Although the most detailed method involves calculation of electronic wave functions (ab initio methods), a more popular approach for large molecular systems is to dispense with wave functions, in favor of an empirical, Newtonian ball-and-spring view of biopolymer energetics. In this approach, known as molecular mechanics, the energy of a conformation is computed as the sum of two components: the bonding interactions, and the non-bonding interactions. A brief overview of this process will now be provided. For a detailed discussion of force-field construction and use for modeling DNA structures, see von Kitzing (1992) and Cheatham and Kollman (2000). The overall bonding energy is defined in terms of the deformation energy of each two-atom, three-atom, and four-atom center, from the experimentally determined, context-independent equilibrium values characteristic of the group (bond-stretching, bond-angle bending, and dihedral angles, respectively). A harmonic force-field is distinguished by use of a harmonic potential to model the first two types of terms.
The low-level assignment of spring constants and equilibrium values, known as field parametrization, provides the field’s connection with experimental observation. Taken over all centers, the sum estimates the energy of forming the covalent bonds characterizing a given structure (i.e., configuration). At ambient temperatures, the large magnitude of this energy away from the vicinity of equilibrium values restricts dynamical changes to rotations about single bonds (i.e., variations in conformation), effectively limiting motion to a conformational subspace, within which folding is driven by more modest, non-bonding interactions. These include the long-range stabilizing forces (electrostatic, dipole–dipole, van der Waals, hydrogen bonding), taken over relevant pairs or groups of atoms. The number of degrees of freedom is commonly reduced by treating chemical groups whose internal dynamics are beyond the scope of the simulation (e.g., hydrogen vibrations) in terms of a static, ‘united atom’. An additional consideration when modeling a charged poly-ion such as DNA in buffered solution is the need to account for the stabilizing effect of partial counter-ion screening of the backbone charges (von Kitzing, 1992). The primary stabilizing interaction for a DNA duplex in solution, namely base-pair stacking, arises from the sum of three distinct favorable interactions: (1) van der Waals interaction between rings; (2) interactions between induced ring dipole moments; and (3) hydrophobic sequestering of aromatic rings (Wartell and Benight, 1985). In addition to estimation of the above enthalpic components of the free energy, a separate consideration must therefore be made of the entropy change accompanying solvation (e.g., modeling the hydrophobic interaction, which drives biopolymer folding (van Holde et al., 1998)). For each conformation, the sum over the bonding and non-bonding components yields the total conformational potential energy, V.
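
As a rough illustration of how the bonding and non-bonding sums combine into V, the following Python sketch evaluates a toy force field for a handful of atoms. It is a minimal sketch under stated assumptions, not the force field of any cited work: the functional forms (harmonic bond and angle terms, a 12-6 Lennard-Jones term for the van der Waals interaction, and an unscreened Coulomb term) are generic textbook choices, dihedral terms, solvation, and counter-ion screening are omitted, and all function names, argument layouts, and parameter values are hypothetical.

```python
import math

# Approximate Coulomb constant in kcal*Angstrom/(mol*e^2); assumed unit system.
COULOMB_CONST = 332.06


def harmonic(k, x, x0):
    """Harmonic deformation energy 0.5*k*(x - x0)^2 about the equilibrium value x0."""
    return 0.5 * k * (x - x0) ** 2


def angle(a, b, c):
    """Angle a-b-c in radians, with b the central atom."""
    v1 = [p - q for p, q in zip(a, b)]
    v2 = [p - q for p, q in zip(c, b)]
    dot = sum(p * q for p, q in zip(v1, v2))
    cos_t = dot / (math.hypot(*v1) * math.hypot(*v2))
    return math.acos(max(-1.0, min(1.0, cos_t)))  # clamp against round-off


def bonded_energy(coords, bonds, angles):
    """Sum of harmonic bond-stretching and bond-angle bending terms.

    bonds:  iterable of (i, j, k_b, r0)
    angles: iterable of (i, j, k, k_theta, theta0), with j the central atom
    """
    energy = 0.0
    for i, j, k_b, r0 in bonds:
        energy += harmonic(k_b, math.dist(coords[i], coords[j]), r0)
    for i, j, k, k_theta, theta0 in angles:
        energy += harmonic(k_theta, angle(coords[i], coords[j], coords[k]), theta0)
    return energy


def nonbonded_energy(coords, pairs, charges, epsilon, sigma):
    """Sum of Lennard-Jones and Coulomb terms over the listed atom pairs.

    A single (epsilon, sigma) is used for every pair, purely for brevity.
    """
    energy = 0.0
    for i, j in pairs:
        r = math.dist(coords[i], coords[j])
        sr6 = (sigma / r) ** 6
        energy += 4.0 * epsilon * (sr6 ** 2 - sr6)            # van der Waals
        energy += COULOMB_CONST * charges[i] * charges[j] / r  # electrostatic
    return energy


def total_potential(coords, bonds, angles, pairs, charges, epsilon, sigma):
    """Total conformational potential energy V = bonded + non-bonded."""
    return (bonded_energy(coords, bonds, angles)
            + nonbonded_energy(coords, pairs, charges, epsilon, sigma))


# Hypothetical three-atom fragment; every number here is a placeholder.
coords = [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0), (1.1, 1.0, 0.0)]
bonds = [(0, 1, 100.0, 1.0), (1, 2, 100.0, 1.0)]
angles = [(0, 1, 2, 50.0, math.pi / 2)]
V = total_potential(coords, bonds, angles, pairs=[(0, 2)],
                    charges=[0.1, 0.0, -0.1], epsilon=0.2, sigma=1.0)
```

Evaluating `total_potential` over many conformations of the same molecule would trace out the potential energy landscape discussed next; in this sketch the bonded terms vanish exactly at the equilibrium geometry, which mirrors the role of the equilibrium values set during field parametrization.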
For a biopolymer containing N atoms, the resulting values form a potential energy landscape of size 3N − 3. Molecular mechanics simulations on an energy landscape may be classified into two categories:
