MOIRAE: a Computational Strategy to Predict 3-D Structures of Polypeptides

Total Page:16

File Type:pdf, Size:1020Kb

MOIRAE: a Computational Strategy to Predict 3-D Structures of Polypeptides UNIVERSIDADE FEDERAL DO RIO GRANDE DO SUL INSTITUTO DE INFORMÁTICA PROGRAMA DE PÓS-GRADUAÇÃO EM COMPUTAÇÃO MÁRCIO DORN MOIRAE: A Computational Strategy to Predict 3-D Structures of Polypeptides Thesis presented in partial fulfillment of the requirements for the degree of Doctor of Computer Science Prof. Dr. Luis C. Lamb Advisor Profa.Dra. Luciana S. Buriol Coadvisor Porto Alegre, August 2012 CIP – CATALOGING-IN-PUBLICATION Dorn, Márcio MOIRAE: A Computational Strategy to Predict 3-D Structures of Polypeptides / Márcio Dorn. – Porto Alegre: PPGC da UFRGS, 2012. 217 f.: il. Thesis (Ph.D.) – Universidade Federal do Rio Grande do Sul. Programa de Pós-Graduação em Computação, Porto Alegre, BR–RS, 2012. Advisor: Luis C. Lamb; Coad- visor: Luciana S. Buriol. 1. 3-D protein structure prediction. 2. Artificial neural networks. 3. Genetic algorithms. 4. GA local-search oper- ator. 5. Structural bioinformatics. I. Lamb, Luis C.. II. Bu- riol, Luciana S.. III. Título. UNIVERSIDADE FEDERAL DO RIO GRANDE DO SUL Reitor: Prof. Carlos Alexandre Netto Pró-Reitor de Coordenação Acadêmica: Prof. Rui Vicente Oppermann Pró-Reitor de Pós-Graduação: Prof. Aldo Bolten Lucion Diretor do Instituto de Informática: Prof. Luis Cunha Lamb Coordenador do PPGC: Prof. Alvaro Freitas Moreira Bibliotecária-chefe do Instituto de Informática: Beatriz Regina Bastos Haro “Das interessanteste an unseren Universum ist, dass man es verstehen kann.” “O Interessante em nosso universo que podemos entende-lo." — Albert Einstein “In Greek mythology, the Moirae (often known in English as The Fates) were three sisters who determined the fate of both gods and humans. The three women were dismal, responsible for spinning (Clotho), weaving (Lachesis) and cut (Atropos) what would be the thread of life of all individuals: Clotho in Greek means "spinning", held the spindle and weaving the thread of life. Lachesis in Greek means "sort", pulled and wrapped the cord tissue. Atropos in Greek means "away", she cut the thread of life. In this thesis Spinning represents the fact of acquire and combine structural patterns from experimental protein structures, Sort represents the genetic algorithm developed to search the conformation space in order to find the protein native-like structure and Away represents the developed strategy to keep out bad solutions. ” AGRADECIMENTOS Primeiramente gostaria de agradecer aos meus pais Marlise e Valdir Dorn por sempre estarem presentes em todas as etapas da minha vida, me apoiando, me incen- tivando e me inspirando. Sem a ajuda de vocês mais este sonho não teria se tornado realidade. Agradeço também as pessoas que quando do início do meu doutoramento se faziam presentes e que hoje infelizmente já não estão mais entre nós: Vó Helga Kupske e Vô Roberto Kupske (in memoriam). Saudades eternas de vocês. Aos avós paternos Reinert e Berta Dorn, anos já se passaram mas a saudade continua. Vocês sempre foram um exemplo de pessoas. Aos demais familiares Marlene, Carlos, Hildegard, Egon, Claudio, Mariza, Morgane e Mônica. Obrigado pelo apoio, carinho e amizade. Aos meus amigos Leonardo Klein, Everton Becker, Eleonor Mayer, Paula Pavan, Melina, Guaraci, Elaine Costa, Neri Costa e Dione Taschetto, com os quais, sempre pude contar. Aos amigos e colegas de laboratório e da UFRGS, Fabiane Dillenburg, Daniel Farenzena, Diego Noble, Rodrigo Machado, Cristiano Reis dos Santos, Rodrigo Kassick, Michelle Leonhardt, Fernando Stefanello, Arton Dorneles, Tiago Primo, Kelen Bernardi e Rafael Parizi pelas conversas, cafés e momentos de descontração. Aos pais acadêmicos Luis Lamb e Luciana Buriol que me acolheram na UFRGS. Obrigado pela dedicação de vocês, pela paciência, pela confiança, pelo apoio, pelas conversas, pela compreensão e pelo incentivo. Vocês são exemplos de profissionais e de seres humanos. Obrigado por tudo! A todos os técnicos administrativos do INF e do PPGC, em especial Beatriz Haro, Claudia de Quadros, Silvania Vidal, Margareth Schäffer, Leivo Ortiz, Luis Otávio e Elisiane de Almeida. Obrigado também a todos os professores do INF em especial Álvaro Moreira e Lisandro Granville. Ao professor Hugo Verli (CBiot/UFRGS) pelo apoio, incentivo e disponibilidade para conversas e trocas de idéias. Aos professores Mario Inostroza (Universidade do Chile) e Lean- dro Wives (INF/UFRGS) pelos construtivos comentários sobre o conteúdo da tese. Aos colegas e amigos de outras instituições nacionais e internacionais pelo apoio e incentivo ao longo desta caminhada: André Braga (Universidade de Brasília), Carlos Humberto Llanos (Universidade de Brasília), Martin Baumman (KIT, Ale- manha), Florian Wilhelm (KIT, Alemanha), Eva Ketelaer (KIT, Alemanha), Math- ias Krause (KIT, Alemanha), Andrea Nestler (KIT, Alemanha), Mario Torres (Red Hat, Alemanha), Federica Ciacca (Itália), Maristela Brizzi (Unijuí), Priscila Gryn- berg (UFMG), Tálita Sono (UFMG) e Diogo Volpati (UNESP). CONTENTS LIST OF ABBREVIATIONS AND ACRONYMS . 8 LIST OF FIGURES . 10 LIST OF TABLES . 13 ABSTRACT . 14 RESUMO . 15 1INTRODUCTION.............................16 1.1 Motivation ................................. 17 1.2 Contributions ............................... 18 1.3 Thesis organization ............................ 18 2ONPROTEINS..............................19 2.1 Introduction ................................ 19 2.2 Amino acid residues and their structures .............. 20 2.3 The peptide bond ............................. 21 2.3.1 Side-chain conformations . 22 2.4 Description of protein structures ................... 24 2.4.1 Primary structure . 24 2.4.2 Secondary structure . 25 2.4.3 Tertiary structure . 26 2.4.4 Quaternary structure . 27 2.5 Protein taxonomy ............................ 27 2.6 Structural databases and structural parameters .......... 29 2.7 Chapter conclusions ........................... 30 3ONTHEPROTEINKINEMATICS...................31 3.1 Introduction ................................ 31 3.2 Three-dimensional protein structure representation ....... 31 3.3 Protein structure kinematics ...................... 33 3.4 Chapter conclusions ........................... 36 4TERTIARYPROTEINSTRUCTUREPREDICTIONMETHODS..37 4.1 Introduction ................................ 37 4.2 Ab initio methods: first principle methods without database information ................................. 39 4.2.1 Overview of ab initio approaches . 43 4.3 First principle methods with database information ........ 49 4.3.1 Overview of first principle methods with database information . 52 4.4 Fold recognition and threading methods .............. 56 4.4.1 Overview of fold recognition and threading methods . 58 4.5 Comparative modeling methods and sequence alignment strate- gies ...................................... 60 4.5.1 Overview of comparative modeling methods and sequence alignment strategies.................................. 64 4.6 Chapter conclusions ........................... 65 5MOIRAE:REDUCINGTHECONFORMATIONALSEARCHSPACE 66 5.1 Introduction ................................ 66 5.2 Proposed method ............................. 67 5.2.1 Polypeptide representation . 67 5.2.2 Structural templates . 68 5.3 Chapter conclusions ........................... 83 6MOIRAE:ALOCAL-SEARCH-BASEDGENETICALGORITHMTO SEARCH THE 3-D PROTEIN CONFORMATIONAL SPACE . 84 6.1 Introduction ................................ 84 6.2 Proposed method ............................. 84 6.2.1 Potential energy function and implicit solvation model . 84 6.2.2 Searchstrategy .............................. 85 6.3 Chapter conclusions ........................... 91 7EXPERIMENTALRESULTS.......................92 7.1 Introduction ................................ 92 7.2 Model and target proteins ....................... 92 7.3 Computing main-chain torsion angles intervals .......... 97 7.4 Searching the native-like structures of polypeptides .......104 7.5 Structural analysis ............................113 7.5.1 Secondary structure analysis . 116 7.5.2 Stereo-chemical analysis . 118 7.6 Comparison of protein 3-D structure prediction methods ....124 7.7 Chapter conclusions ...........................126 8CONCLUSIONS..............................127 9PUBLICATIONS..............................129 9.1 Published papers .............................129 9.1.1 Under review . 130 APPENDIX A PROTEIN KINEMATICS: ROTATION OF SIDE-CHAIN TORSION ANGLES . 132 APPENDIX B FIRST PRINCIPLE METHODS WITHOUT DATABASE INFORMATION . 135 APPENDIX C FIRST PRINCIPLE METHODS WITH DATABASE IN- FORMATION ........................ 139 APPENDIX D THREADING METHODS . 141 APPENDIX E COMPARATIVE MODELING METHODS SUMMARY . 143 APPENDIX F TORSION ANGLES OF —-TURNS . 145 APPENDIX G SIDE-CHAIN TORSION ANGLES . 147 APPENDIX H TORSION ANGLES INTERVALS . 149 APPENDIX I EXPERIMENTS: FITNESS GRAPHS . 165 APPENDIX J EXPERIMENTS: RAMACHANDRAN PLOTS . 173 APPENDIX K RESUMO ESTENDIDO . 181 K.1 Desenvolvimento da Pesquisa .....................181 K.1.1 Conclusões . 185 REFERENCES . 187 LIST OF ABBREVIATIONS AND ACRONYMS AMBER Assisted Model Building with Energy Refinement A3N Artificial Neural Network-N-gram based method ANN Artificial Neural Network ASA Accessible Surface Area BB Branch and Bound BOSS Biochemical and Organic Simulation System CATH Class, Architecture, Topology, Homologous super-family CASP Critical Assessment of protein Structure Prediction CCP Contact Capacity Potentials CE Combinatorial Extension CHARMM Chemistry at HARvard Macromolecular Mechanics CM Comparative Modeling CREF Central-REsidue-Fragment-based method CSA Conformation Space Annealing DDD Dali Domain Dictionary DSSP Define Secondary Structure
Recommended publications
  • 20 Years of Polarizable Force Field Development
    20 YEARS OF POLARIZABLE FORCEFIELDDEVELOPMENTfor biomolecular systems thor van heesch Supervised by Daan P. Geerke and Paola Gori-Giorg 1 contents 2 1 Introduction 3 2contentsForce fields: Basics, Caveats and Extensions 4 2.1 The Classical Approach . 4 2.2 The caveats of point-charge electrostatics . 8 2.3 Physical phenomenon of polarizability . 10 2.4 Common implementation methods of electronic polarization . 11 2.5 Accounting for anisotropic interactions . 14 3 Knitting the reviews and perspectives together 17 3.1 Polarizability: A smoking gun? . 17 3.2 New branches of electronic polarization . 18 3.3 Descriptions of electrostatics . 19 3.4 Solvation and polarization . 20 3.5 The rise of new challenges . 21 3.6 Parameterization or polarization? . 22 3.7 Enough response: how far away? . 23 3.8 The last perspectives . 23 3.9 A new hope: the next-generation force fields . 29 4 Learning with machines 29 4.1 Replace the functional form with machine learned force fields . 31 4.2 A different take on polarizable force fields . 33 4.3 Are transferable parameters an universal requirement? . 33 4.4 The difference between derivation and prediction . 34 4.5 From small molecules to long range interactions . 36 4.6 Enough knowledge to fold a protein? . 37 4.7 Boltzmann generators, a not so hypothetical machine anymore . 38 5 Summary: The Red Thread 41 introduction 3 In this literature study we aimed to answer the following question: What has changedabstract in the outlook on polarizable force field development during the last 20 years? The theory, history, methods, and applications of polarizable force fields have been discussed to address this question.
    [Show full text]
  • Parallel-Tempered Stochastic Gradient Hamiltonian Monte Carlo for Approximate Multimodal Posterior Sampling
    1st Symposium on Advances in Approximate Bayesian Inference, 20181–6 Parallel-tempered Stochastic Gradient Hamiltonian Monte Carlo for Approximate Multimodal Posterior Sampling ∗ Rui Luo [email protected] ∗ Qiang Zhang [email protected] Yuanyuan Liu [email protected] American International Group, Inc. Abstract We propose a new sampler that integrates the protocol of parallel tempering with the Nose-Hoover´ (NH) dynamics. The proposed method can efficiently draw representative samples from complex posterior distributions with multiple isolated modes in the presence of noise arising from stochastic gradient. It potentially facilitates deep Bayesian learning on large datasets where complex multi- modal posteriors and mini-batch gradient are encountered. 1. Introduction In Bayesian inference, one of the fundamental problems is to efficiently draw i.i.d. samples from the posterior distribution π¹θj Dº given the dataset D = fxg, where θ 2 D denotes the variable of interest. Provided the prior distribution π¹θº and the likelihood per datum `¹θ; xº, the posterior to be sampled can be formulated as Ö π¹θj Dº = π¹θº `¹θ; xº: (1) x 2D To facilitate posterior sampling, the framework of Markov chain Monte Carlo (MCMC) has been established, which has initiated a broad family of methods that generate Markov chains to propose new sample candidates and then apply tests of acceptance in order to guarantee the condition of detailed balance. Methods like the Metropolis-Hastings (MH) algorithm (Metropolis et al., 1953; Hastings, 1970), the Gibbs sampler (Geman and Geman, 1984), and the hybrid/Hamiltonian Monte Carlo (HMC) (Duane et al., 1987; Neal, 2011) are famous representatives for the MCMC family where different generating procedures of Markov chains are adopted; each of those methods has achieved great success on various tasks in statistics and related fields.
    [Show full text]
  • Open Babel Documentation Release 2.3.1
    Open Babel Documentation Release 2.3.1 Geoffrey R Hutchison Chris Morley Craig James Chris Swain Hans De Winter Tim Vandermeersch Noel M O’Boyle (Ed.) December 05, 2011 Contents 1 Introduction 3 1.1 Goals of the Open Babel project ..................................... 3 1.2 Frequently Asked Questions ....................................... 4 1.3 Thanks .................................................. 7 2 Install Open Babel 9 2.1 Install a binary package ......................................... 9 2.2 Compiling Open Babel .......................................... 9 3 obabel and babel - Convert, Filter and Manipulate Chemical Data 17 3.1 Synopsis ................................................. 17 3.2 Options .................................................. 17 3.3 Examples ................................................. 19 3.4 Differences between babel and obabel .................................. 21 3.5 Format Options .............................................. 22 3.6 Append property values to the title .................................... 22 3.7 Filtering molecules from a multimolecule file .............................. 22 3.8 Substructure and similarity searching .................................. 25 3.9 Sorting molecules ............................................ 25 3.10 Remove duplicate molecules ....................................... 25 3.11 Aliases for chemical groups ....................................... 26 4 The Open Babel GUI 29 4.1 Basic operation .............................................. 29 4.2 Options .................................................
    [Show full text]
  • Molecular Dynamics Simulations in Drug Discovery and Pharmaceutical Development
    processes Review Molecular Dynamics Simulations in Drug Discovery and Pharmaceutical Development Outi M. H. Salo-Ahen 1,2,* , Ida Alanko 1,2, Rajendra Bhadane 1,2 , Alexandre M. J. J. Bonvin 3,* , Rodrigo Vargas Honorato 3, Shakhawath Hossain 4 , André H. Juffer 5 , Aleksei Kabedev 4, Maija Lahtela-Kakkonen 6, Anders Støttrup Larsen 7, Eveline Lescrinier 8 , Parthiban Marimuthu 1,2 , Muhammad Usman Mirza 8 , Ghulam Mustafa 9, Ariane Nunes-Alves 10,11,* , Tatu Pantsar 6,12, Atefeh Saadabadi 1,2 , Kalaimathy Singaravelu 13 and Michiel Vanmeert 8 1 Pharmaceutical Sciences Laboratory (Pharmacy), Åbo Akademi University, Tykistökatu 6 A, Biocity, FI-20520 Turku, Finland; ida.alanko@abo.fi (I.A.); rajendra.bhadane@abo.fi (R.B.); parthiban.marimuthu@abo.fi (P.M.); atefeh.saadabadi@abo.fi (A.S.) 2 Structural Bioinformatics Laboratory (Biochemistry), Åbo Akademi University, Tykistökatu 6 A, Biocity, FI-20520 Turku, Finland 3 Faculty of Science-Chemistry, Bijvoet Center for Biomolecular Research, Utrecht University, 3584 CH Utrecht, The Netherlands; [email protected] 4 Swedish Drug Delivery Forum (SDDF), Department of Pharmacy, Uppsala Biomedical Center, Uppsala University, 751 23 Uppsala, Sweden; [email protected] (S.H.); [email protected] (A.K.) 5 Biocenter Oulu & Faculty of Biochemistry and Molecular Medicine, University of Oulu, Aapistie 7 A, FI-90014 Oulu, Finland; andre.juffer@oulu.fi 6 School of Pharmacy, University of Eastern Finland, FI-70210 Kuopio, Finland; maija.lahtela-kakkonen@uef.fi (M.L.-K.); tatu.pantsar@uef.fi
    [Show full text]
  • Ee9e60701e814784783a672a9
    International Journal of Technology (2017) 4: 611‐621 ISSN 2086‐9614 © IJTech 2017 A PRELIMINARY STUDY ON SHIFTING FROM VIRTUAL MACHINE TO DOCKER CONTAINER FOR INSILICO DRUG DISCOVERY IN THE CLOUD Agung Putra Pasaribu1, Muhammad Fajar Siddiq1, Muhammad Irfan Fadhila1, Muhammad H. Hilman1, Arry Yanuar2, Heru Suhartanto1* 1Faculty of Computer Science, Universitas Indonesia, Kampus UI Depok, Depok 16424, Indonesia 2Faculty of Pharmacy, Universitas Indonesia, Kampus UI Depok, Depok 16424, Indonesia (Received: January 2017 / Revised: April 2017 / Accepted: June 2017) ABSTRACT The rapid growth of information technology and internet access has moved many offline activities online. Cloud computing is an easy and inexpensive solution, as supported by virtualization servers that allow easier access to personal computing resources. Unfortunately, current virtualization technology has some major disadvantages that can lead to suboptimal server performance. As a result, some companies have begun to move from virtual machines to containers. While containers are not new technology, their use has increased recently due to the Docker container platform product. Docker’s features can provide easier solutions. In this work, insilico drug discovery applications from molecular modelling to virtual screening were tested to run in Docker. The results are very promising, as Docker beat the virtual machine in most tests and reduced the performance gap that exists when using a virtual machine (VirtualBox). The virtual machine placed third in test performance, after the host itself and Docker. Keywords: Cloud computing; Docker container; Molecular modeling; Virtual screening 1. INTRODUCTION In recent years, cloud computing has entered the realm of information technology (IT) and has been widely used by the enterprise in support of business activities (Foundation, 2016).
    [Show full text]
  • Melissa Gajewski & Jonathan Mane
    Melissa Gajewski & Jonathan Mane Molecular Mechanics (MM) Methods Force fields & potential energy calculations Example using noscapine Molecular Dynamics (MD) Methods Ensembles & trajectories Example using 18-crown-6 Quantum Mechanics (QM) Methods Schrödinger’s equation Semi-empirical (SE) Wave Functional Theory (WFT) Density Functional Theory (DFT) Hybrid QM/MM & MD Methods Comparison of hybrid methods 2 3 Molecular Mechanics (MM) Methods Force fields & potential energy calculations Example using noscapine Molecular Dynamics (MD) Methods Ensembles & trajectories Example using 18-crown-6 Quantum Mechanics (QM) Methods Schrödinger’s equation Semi-empirical (SE) Wave Functional Theory (WFT) Density Functional Theory (DFT) Hybrid QM/MM & MD Methods Comparison of hybrid methods 4 Useful for all system size ◦ Small molecules, proteins, material assemblies, surface science, etc … ◦ Based on Newtonian mechanics (classical mechanics) d F = (mv) dt ◦ The potential energy of the system is calculated using a force field € 5 An atom is considered as a single particle Example: H atom Particle variables: ◦ Radius (typically van der Waals radius) ◦ Polarizability ◦ Net charge Obtained from experiment or QM calculations ◦ Bond interactions Equilibrium bond lengths & angles from experiment or QM calculations 6 All atom approach ◦ Provides parameters for every atom in the system (including hydrogen) Ex: In –CH3 each atom is assigned a set of data MOLDEN MOLDEN (Radius, polarizability, netMOLDENMOLDENMOLDENMOLDENMOLDEN charge,
    [Show full text]
  • Chapter 2 Monte Carlo Simulations
    Chapter 2 Monte Carlo Simulations David J. Earl and Michael W. Deem Summary AdescriptionofMonteCarlomethodsforsimulationofproteinsisgiven. Advantages and disadvantages of the Monte Carlo approach are presented. The the- oretical basis for calculating equilibrium properties of biological molecules by the Monte Carlo method is presented. Some of the standard and some of the more re- cent ways of performing Monte Carlo on proteins are presented. A discussion of the estimation of errors in properties calculated by Monte Carlo is given. Keywords: Markov chain Metropolis algorithm Monte Carlo Protein simula- · · · tion Stochastic methods · 1 Introduction The term Monte Carlo generally applies to all simulations that use stochastic meth- ods to generate new configurations of a system of interest. In the context of mole- cular simulation, specifically, the simulation of proteins, Monte Carlo refers to importance sampling, which we describe in Sect. 2, of systems at equilibrium. In general, a Monte Carlo simulation will proceed as follows: starting from an initial configuration of particles in a system, a Monte Carlo move is attempted that changes the configuration of the particles. This move is accepted or rejected based on an ac- ceptance criterion that guarantees that configurations are sampled in the simulation from a statistical mechanics ensemble distribution, and that the configurations are sampled with the correct weight. After the acceptance or rejection of a move, one calculates the value of a property of interest, and, after many such moves, an accu- rate average value of this property can be obtained. With the application of statistical mechanics, it is possible to calculate the equilibrium thermodynamic properties of the system of interest in this way.
    [Show full text]
  • Calculation of Effective Interaction Potentials From
    Methodologies of systematic structure-based coarse-graining Alexander Lyubartsev Division of Physical Chemistry Department of Material and Environmental Chemistry Stockholm University E-CAM workshop: State of the art in mesoscale and multiscale modeling Dublin 29-31 May 2017 Outline 1. Multiscale and coarse-grained simulations 2. Systematic structure-based coarse-graining by Inverse Monte Carlo 3. Examples - water and ions - lipid bilayers and lipid assemblies 4. Software - MagiC Example of systematic coarse-graining: DNA in Chromatin: presentation 30 May Computer modeling: from 1st principles to mesoscale Levels of molecular modeling: First-principles Atomistic Meso-scale (Quantum Classical Langevine/ BD, Mechanics) Molecular Dynamics DPD... nuclei electrons atoms coarse-grained biomolecules atoms molecules molecules soft matter 0.1 nm 1.0 nm 10 nm 100 nm 1 000 nm Larger scale more approximations Mesoscale Simulations Length scale: > 10 nm ( nanoscale: 10 - 1000 nm) Atomistic modeling is generally not possible box 10 nm - more than 105 atoms even if doable for 105 - 106 particles - do not forget about time scale! larger size : longer time for equilibration and reliable sampling 105 atoms - time scale should be above 1 ms = 1000 ns Need approximations - coarse-graining Coarse-graining – an example Original size – 2.4Mb Compressed to 24 Kb Levels of coarse- graining Level of coarse-graining can be different. For example, for a DMPC lipid: All-atom model United-atom Coarse-grained Coarse-grained 118 atoms model: 10 sites: 3 sites: 46 united atoms Martini model Cooke model more details - chemical specificity faster computations; larger systems Coarse-graining of solvent: 1) Explicit solvent: 2) Implicit solvent: One or several solvent molecules are No solvent particles but their effect is united in a single site.
    [Show full text]
  • LAMMPS – an Object Oriented Scientific Application
    LAMMPS – An Object Oriented Scientific Application Dr. Axel Kohlmeyer (with a little help from several friends) Associate Dean for Scientific Computing College of Science and Technology Temple University, Philadelphia http://sites.google.com/site/akohlmey/ [email protected] Workshop on Advanced Techniques in Scientific Computing LAMMPS is a Collaborative Project A few lead developers and many significant contributors: ● Steve Plimpton, Aidan Thompson, Paul Crozier, Axel Kohlmeyer - Roy Pollock (LLNL), Ewald and PPPM solvers - Mike Brown (ORNL), GPU package - Greg Wagner (Sandia), MEAM package for MEAM potential - Mike Parks (Sandia), PERI package for Peridynamics - Rudra Mukherjee (JPL), POEMS package for rigid body motion - Reese Jones (Sandia), USER-ATC package for coupling to continuum - Ilya Valuev (JIHT), USER-AWPMD package for wave-packet MD - Christian Trott (Sandia), USER-CUDA package, KOKKOS package - A. Jaramillo-Botero (Caltech), USER-EFF electron force field package - Christoph Kloss (JKU), LIGGGHTS package for DEM and fluid coupling - Metin Aktulga (LBL), USER-REAXC package for C version of ReaxFF - Georg Gunzenmuller (EMI), USER-SPH package - Ray Shan (Sandia), COMB package, QEQ package - Trung Nguyen (ORNL), RIGID package, GPU package - Francis Mackay and Coling Denniston (U Western Ontario), USER-LB Workshop on Advanced Techniques in Scientific Computing LAMMPS is an Extensible Project ● ~2900 C/C++/CUDA files, 120 Fortran files, about 900,000 lines of code in core executable ● Only about 200 files are essential, about
    [Show full text]
  • Calculation of Effective Interaction Potentials From
    Systematic Coarse-Graining of Molecular Models by the Inverse Monte Carlo: Theory, Practice and Software Alexander Lyubartsev Division of Physical Chemistry Department of Material and Environmental Chemistry Stockholm University Modeling Soft Matter: Linking Multiple Length and Time Scales Kavli Institute of Theoretical Physics, UCSB, Santa Barbara 4-8 June 2012 Soft Matter Simulations length model method time 1Å electron w.f + ab-initio 100 ps 1 nm nuclei BOMD, CPMD 10 nm atomistic classical MD 100 ns coarse-grained Langevine MD, 100 nm µ more DPD, etc 100 s coarse-grained µ 1 m continuous The problem: Larger scale ⇔ more approximations Coarse-graining – an example Original size - 900K Compressed to 24K Coares-graining: reduction degrees of freedom All-atom model Coarse-grained model Large-scale 118 atoms 10 sites simulations We need to: 1) Design Coarse-Grained mapping: specify the important degrees of freedom 2) For “important” degrees of freedom we need interaction potential Question: what is the interaction potential for the coarse-grained model? Formal solution: N-body mean force potential Original (FG = fine grained system) FG ( ) =θ( ) H r 1, r 2,... ,rn R j r1, ...r n n = 118 j = 1,...,10 Usually, centers of mass of selected molecular fragments Partition function : n =∫∏ (−β ( ))= Z dr i exp H FG r 1, ...,r n i=1 n N =∫∏ ∏ δ( −θ ( )) (−β ( ))= dr i dR j R j j r1, ...rn exp H FG r1, ... ,rn = i1 j 1 N =∫∏ (−β ( )) dR j exp H CG R1, ... , RN j =1 1 where β= k B T n N ( )=−1 ∫∏ ∏ δ( −θ ( )) (− ( )) H CG R1, ..., R N β ln dr i R j
    [Show full text]
  • Density Functional Theory for Protein Transfer Free Energy
    Density Functional Theory for Protein Transfer Free Energy Eric A Mills, and Steven S Plotkin∗ Department of Physics & Astronomy, University of British Columbia, Vancouver, British Columbia V6T1Z4 Canada E-mail: [email protected] Phone: 1 604-822-8813. Fax: 1 604-822-5324 arXiv:1310.1126v1 [q-bio.BM] 3 Oct 2013 ∗To whom correspondence should be addressed 1 Abstract We cast the problem of protein transfer free energy within the formalism of density func- tional theory (DFT), treating the protein as a source of external potential that acts upon the sol- vent. Solvent excluded volume, solvent-accessible surface area, and temperature-dependence of the transfer free energy all emerge naturally within this formalism, and may be compared with simplified “back of the envelope” models, which are also developed here. Depletion contributions to osmolyte induced stability range from 5-10kBT for typical protein lengths. The general DFT transfer theory developed here may be simplified to reproduce a Langmuir isotherm condensation mechanism on the protein surface in the limits of short-ranged inter- actions, and dilute solute. Extending the equation of state to higher solute densities results in non-monotonic behavior of the free energy driving protein or polymer collapse. Effective inter- action potentials between protein backbone or sidechains and TMAO are obtained, assuming a simple backbone/sidechain 2-bead model for the protein with an effective 6-12 potential with the osmolyte. The transfer free energy dg shows significant entropy: d(dg)=dT ≈ 20kB for a 100 residue protein. The application of DFT to effective solvent forces for use in implicit- solvent molecular dynamics is also developed.
    [Show full text]
  • Tinker-HP : Accelerating Molecular Dynamics Simulations of Large Complex Systems with Advanced Point Dipole Polarizable Force Fields Using Gpus and Multi-Gpus Systems
    Tinker-HP : Accelerating Molecular Dynamics Simulations of Large Complex Systems with Advanced Point Dipole Polarizable Force Fields using GPUs and Multi-GPUs systems Olivier Adjoua,y Louis Lagardère*,y,z Luc-Henri Jolly,z Arnaud Durocher,{ Thibaut Very,x Isabelle Dupays,x Zhi Wang,k Théo Jaffrelot Inizan,y Frédéric Célerse,y,? Pengyu Ren,# Jay W. Ponder,k and Jean-Philip Piquemal∗,y,# ySorbonne Université, LCT, UMR 7616 CNRS, F-75005, Paris, France zSorbonne Université, IP2CT, FR2622 CNRS, F-75005, Paris, France {Eolen, 37-39 Rue Boissière, 75116 Paris, France xIDRIS, CNRS, Orsay, France kDepartment of Chemistry, Washington University in Saint Louis, USA ?Sorbonne Université, CNRS, IPCM, Paris, France. #Department of Biomedical Engineering, The University of Texas at Austin, USA E-mail: [email protected],[email protected] arXiv:2011.01207v4 [physics.comp-ph] 3 Mar 2021 Abstract We present the extension of the Tinker-HP package (Lagardère et al., Chem. Sci., 2018,9, 956-972) to the use of Graphics Processing Unit (GPU) cards to accelerate molecular dynamics simulations using polarizable many-body force fields. The new high-performance module allows for an efficient use of single- and multi-GPUs archi- 1 tectures ranging from research laboratories to modern supercomputer centers. After de- tailing an analysis of our general scalable strategy that relies on OpenACC and CUDA , we discuss the various capabilities of the package. Among them, the multi-precision possibilities of the code are discussed. If an efficient double precision implementation is provided to preserve the possibility of fast reference computations, we show that a lower precision arithmetic is preferred providing a similar accuracy for molecular dynamics while exhibiting superior performances.
    [Show full text]