Computer Physics Communications 128 (2000) 550–557 www.elsevier.nl/locate/cpc

The GROMOS96 benchmarks for molecular simulation

Alexandre M.J.J. Bonvin b,1, Alan E. Mark c,2, Wilfred F. van Gunsteren a,∗

a Laboratory of Physical Chemistry, Swiss Federal Institute of Technology Zürich, ETH Zentrum, CH-8092 Zürich, Switzerland
b Bijvoet Center for Biomolecular Research, NMR Spectroscopy, Utrecht University, Padualaan 8, 3584 CH Utrecht, The Netherlands
c Laboratory of Biophysical Chemistry, University of Groningen, Nijenborgh 4, 9747 AG Groningen, The Netherlands

∗ Corresponding author. E-mail: [email protected].
1 E-mail: [email protected].
2 E-mail: [email protected].

Received 22 November 1999

Abstract

A set of biomolecular systems is presented which can be used to benchmark the performance of simulation programs and computers. It is applied, using the GROMOS96 biomolecular simulation software, to a variety of computers. The dependence of the computing time on a number of model and computational parameters is investigated. An extended pair list technique to select non-bonded interaction pairs and long-range interactions is shown to increase the efficiency by a factor of 1.5 to 3 compared to standard procedures. The benchmark results can be used to estimate the computer time required for simulation studies, and to evaluate the efficiency of various computers regarding molecular simulations. © 2000 Elsevier Science B.V. All rights reserved.

Keywords: Molecular dynamics; Non-bonded interactions; Cut-offs; Extended pair list

1. Introduction

Simulation of the behaviour of biomolecular systems on a computer is a steadily expanding area of theoretical and computational biochemistry. In molecular dynamics (MD) simulation, Newton's equations of motion for thousands of atoms are integrated forward in time using small, i.e. femtosecond size, time steps during which the forces on the atoms may be assumed to be nearly constant. Today, a simulation study of a small protein containing about 100 amino acid residues in water involves about 10^6 of such time steps, covering a simulation period of 1 nsec. Although this seems sufficiently long to sample local, fast relaxing properties, it is too short to study slow, global processes such as protein folding and unfolding. In order to extend a simulation period as long as possible for a molecular system of a given size within a given time span, it is of paramount importance to use a very fast computer and a very efficient simulation program.

The time needed to complete a simulation project depends on a number of factors:
(1) Size of the molecular system, i.e. the number of atoms or particles or interaction sites.
(2) The length of the simulation that is needed to sufficiently sample the relevant motions of the system.
(3) The complexity of the potential energy function used, from which the forces on the atoms are derived.
(4) The spatial range of the forces between atoms.
(5) The settings of various interaction or computational parameters, such as the non-bonded interaction cut-off or the frequency of updating specific interactions.


(6) The handling of spatial boundary conditions, e.g., periodic versus vacuum.
(7) The spatial homogeneity of the system, i.e. whether the atom density is a smooth function of position or not.
(8) The characteristics of the computer used, e.g., its CPU, clock rate, memory handling and size.
(9) The structure of the program and algorithms involved.
(10) The programming language, compiler and compiler options used to produce an executable program.

In order to give an impression of the computing effort required to simulate a biomolecular system, a benchmark has been formulated and carried out on a variety of computers using the Groningen Molecular Simulation GROMOS96 software [1,2]. A number of the factors mentioned above, which determine the time required for a simulation, have been investigated:
– size of the system (1),
– range of the forces (4),
– frequency of non-bonded force update (5),
– vacuum versus periodic boundaries (6),
– computer architecture (8),
– use of an extended pair list technique to select non-bonded interaction pairs (9).
The results of the GROMOS96 benchmark can be used to estimate the CPU time required for a typical biomolecular simulation project and to evaluate the efficiency or performance-to-price ratio of various computers regarding simulation of biomolecular systems.

2. Methods

The bulk of the computational time required by a simulation time step is used for calculating the non-bonded interactions, that is, for finding the nearest-neighbor atoms and subsequently evaluating the van der Waals and electrostatic interaction terms for the obtained atom pairs. Since the non-bonded interaction between atoms decreases with the distance between them, only interactions between atoms closer to each other than a certain cut-off distance Rcp are generally taken into account in simulations. The GROMOS96 force field makes use of the concept of charge groups to reduce the errors due to the application of a cut-off radius Rcp. The atoms that belong to a charge group are chosen such that their partial atomic charges add up to zero, except for fully charged groups like the side-chains of Arg or Asp, where the partial atomic charges may add up to +e or −e. When the (partial) atomic charges of a group of atoms add up to exactly zero, the leading term of the electrostatic interaction between two such groups is of dipolar (1/r^3) character, and the error due to the application of a cut-off radius is reduced compared to the 1/r monopole contributions. Therefore, in the GROMOS96 non-bonded interaction routines the cut-off radius is used to select nearest-neighbor charge groups. The simplest way to find the neighboring charge groups of a given charge group, that is, the charge groups that lie within Rcp, is to scan all possible charge group pairs in the system. For a system consisting of Ncg charge groups, the number of pairs amounts to (1/2)Ncg^2, which makes the computer time required for finding the neighbors in this way proportional to Ncg^2. Once the neighbors have been found, the time required for calculating the non-bonded interaction is proportional to Ncg. We note that the non-bonded interaction within a charge group may need to be calculated when the charge group contains more than a few atoms.
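As a concrete illustration of this quadratic scan, the following minimal Python sketch (not GROMOS96 code; the names are hypothetical and the centre of geometry is used as the charge-group position, which need not be the exact GROMOS96 criterion) builds the short-range pair list by visiting all (1/2)Ncg^2 charge-group pairs:

    import numpy as np

    def charge_group_pair_list(atom_pos, atom_charge, group_index, r_cp):
        # atom_pos    : (Na, 3) array of atom coordinates [nm]
        # atom_charge : (Na,)   partial atomic charges [e]
        # group_index : (Na,)   charge-group number of each atom (0 .. Ncg-1)
        # r_cp        : short-range cut-off radius [nm]
        # Periodic boundary conditions are ignored for simplicity.
        n_cg = int(group_index.max()) + 1
        centres = np.array([atom_pos[group_index == g].mean(axis=0)
                            for g in range(n_cg)])
        # net group charges are 0, +1 or -1 e by construction of the force field
        net_charge = np.array([atom_charge[group_index == g].sum()
                               for g in range(n_cg)])

        pairs = []
        for i in range(n_cg - 1):            # O(Ncg**2) scan over all group pairs
            d = np.linalg.norm(centres[i + 1:] - centres[i], axis=1)
            pairs.extend((i, i + 1 + j) for j in np.nonzero(d < r_cp)[0])
        return pairs, net_charge

For a system containing Ncg = 6712 charge groups (the thrombin-in-water benchmark described below), such a scan visits about 2.3 × 10^7 group pairs at every pair list update.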
In order to evaluate the non-bonded interaction with sufficient accuracy, a long cut-off radius Rcl has to be used; for biomolecular systems a value of at least 1.4 nm seems necessary [3]. But such a range is very expensive if pair interactions are evaluated at every MD step; the number of neighbor atoms within 1.4 nm will exceed 900. Therefore, in GROMOS the non-bonded interaction can be evaluated using a triple-range method. The electrostatic interactions beyond the long-range cut-off Rcl, typically 1.4 nm, can be approximated by a Poisson–Boltzmann generalized reaction field term [4]. The non-bonded interactions are evaluated at every simulation step using the charge group pair list that is generated with a short-range cut-off radius Rcp. The longer-range non-bonded interactions, that is, those between charge groups at a distance longer than Rcp and smaller than Rcl, are evaluated less frequently, viz. only at every nth simulation step, when the pair list is also updated. They are kept unchanged between these updates. In this way the long-range non-bonded forces can be approximately taken into account, without increasing the computing effort significantly, at the expense of neglecting the fluctuation of the forces beyond Rcp during n simulation steps.
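To make the bookkeeping of this triple-range scheme concrete, the toy Python sketch below (illustrative only, not GROMOS96 code: it uses a bare Coulomb interaction between random point charges, ignores periodic boundaries and omits the reaction-field term beyond Rcl) recomputes the short-range (< Rcp) forces at every step, while the intermediate-range (Rcp to Rcl) forces are recomputed only when the pair list is rebuilt, every n_update steps, and are kept frozen in between:

    import numpy as np

    rng = np.random.default_rng(0)

    def pair_list(x, r_cut):
        # all i < j particle pairs closer than r_cut (no periodicity, for brevity)
        d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
        i, j = np.where((d < r_cut) & np.triu(np.ones(d.shape, dtype=bool), k=1))
        return list(zip(i.tolist(), j.tolist()))

    def coulomb_forces(x, q, pairs):
        # simple pairwise Coulomb forces (arbitrary units) for the given pair list
        f = np.zeros_like(x)
        for i, j in pairs:
            r = x[i] - x[j]
            fij = q[i] * q[j] * r / np.linalg.norm(r) ** 3
            f[i] += fij
            f[j] -= fij
        return f

    # toy system: 64 point charges of +/-0.5 e in a 2 nm cube
    x = rng.uniform(0.0, 2.0, size=(64, 3))
    v = np.zeros_like(x)
    q = np.where(rng.random(64) < 0.5, 0.5, -0.5)

    r_cp, r_cl, n_update, dt = 0.8, 1.4, 5, 0.002
    for step in range(20):
        if step % n_update == 0:
            short = pair_list(x, r_cp)                 # short-range pair list
            short_set = set(short)
            # intermediate-range pairs: within r_cl but beyond r_cp
            inter = [p for p in pair_list(x, r_cl) if p not in short_set]
            f_inter = coulomb_forces(x, q, inter)      # frozen for n_update steps
        # short-range forces are recomputed at every step; the reaction-field
        # contribution beyond r_cl [4] is omitted in this sketch
        f = coulomb_forces(x, q, short) + f_inter
        v += f * dt                                    # unit masses, no thermostat
        x += v * dt

The balance between the cost of the every-step loop and that of the every-nth-step update is what the benchmarks below probe when Rcp, Rcl and the update frequency are varied.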

The efficiency of the pair list generation on serial computers can be improved by using grid-search techniques [5–9], since the computational effort to select nearest-neighbor charge groups using a spatial grid is proportional to Ncg. The improvement over the scanning of all charge group pairs is only significant, however, for systems in which the non-bonded interaction cut-off radius Rcl is an order of magnitude shorter than the size of the simulated molecular system. For simulation of proteins in solution with Rcl ≈ 1.4 nm and periodic box lengths of about 5.0 nm, containing typically 10^4 atoms, the computational gain is very modest. Another method to speed up the non-bonded pair list generation consists of keeping a second, "extended" pair list generated using an extended cut-off radius Rcx > Rcl > Rcp and searching through this extended pair list to update the standard pair list, instead of scanning all (1/2)Ncg^2 charge group pairs in the system. Such an approach becomes especially effective for large systems. It is necessary to choose the extended cut-off Rcx larger than the long-range cut-off Rcl to ensure that the long-range forces are correctly evaluated. If the extended pair list is automatically updated once an atom has moved by more than the skin defined as Rcx − Rcl, the resulting trajectory is identical to that generated using a triple-range cut-off. In this case the update frequency will depend on the choice of Rcx, on the fastest moving particles in the system and on the temperature T. At T = 300 K, with an extended cut-off Rcx exceeding the long-range cut-off Rcl by 0.4 nm, update frequencies of around once per 100 MD time steps are obtained. A drawback of the extended pair list technique is the storage requirement, since, depending on the values of the various cut-offs, up to ten times the number of pairs in the standard non-bonded pair list has to be kept in memory. The idea of using a skin around the sphere with radius Rcp has been described previously [10,11], although in a slightly different way: the extended pair list generated with a cut-off Rcx is scanned at every MD step and the non-bonded interactions are only calculated for those atoms which lie within Rcp. In our implementation, the extended pair list is used to generate the normal non-bonded pair list with cut-off Rcp and to calculate the long-range forces within the long-range cut-off Rcl, which is done every nth MD step (typically 5 to 10) and allows a further reduction of the computing time.
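A minimal sketch of this extended-pair-list bookkeeping is given below (the function names and the displacement test are illustrative assumptions, not the GROMOS96 implementation): refresh_extended would be called at every MD step, and lists_from_extended replaces the full O(N^2) scan at every nth step, when the standard pair list and the long-range forces are updated.

    import numpy as np

    def extended_pair_list(x, r_cx):
        # full O(N**2) scan, performed only when the extended list must be rebuilt
        d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
        i, j = np.where((d < r_cx) & np.triu(np.ones(d.shape, dtype=bool), k=1))
        return np.column_stack((i, j))

    def lists_from_extended(x, ext, r_cp, r_cl):
        # derive the short (< r_cp) and intermediate (r_cp .. r_cl) pair lists by
        # scanning only the extended list instead of all (1/2)*N**2 pairs
        d = np.linalg.norm(x[ext[:, 0]] - x[ext[:, 1]], axis=1)
        return ext[d < r_cp], ext[(d >= r_cp) & (d < r_cl)]

    def refresh_extended(x, x_ref, ext, r_cl, r_cx):
        # rebuild the extended list only when some atom has moved further than the
        # skin r_cx - r_cl since the last rebuild (in practice ~once per 100 steps)
        if ext is None or np.linalg.norm(x - x_ref, axis=1).max() > (r_cx - r_cl):
            return extended_pair_list(x, r_cx), x.copy()
        return ext, x_ref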
3. Results

The six systems used for benchmarking are defined in Table 1. These are representative of the range of systems typically studied by MD simulations. The first two benchmarks are for a short cyclic peptide, cyclosporin A (11 residues), in vacuum taking all interactions into account (Rcp = ∞) (I), and in water in a truncated octahedron under periodic boundary conditions (II). The next two are for a protein of 295 residues, thrombin, in vacuum (III) and in water in a truncated octahedron under periodic boundary conditions (IV). The last two benchmarks correspond to medium- (V) and large-sized (VI) pure water systems in a cubic box under periodic boundary conditions. For benchmarks II, III, V and VI a non-bonded cut-off Rcp = Rcl of 0.8 nm was used. For benchmark IV, the long-range non-bonded interactions were calculated up to 1.4 nm (Rcl). In all benchmarks including explicit solvent, a reaction field contribution [4] was added to the atomic forces. The extended cut-off Rcx was chosen such that Rcx = Rcl + 0.4 nm, i.e. 1.2 nm for benchmarks II, III, V and VI and 1.8 nm for benchmark IV.

Results on a variety of computers for the standard and extended pair list versions are listed in Tables 3 and 4, respectively. The operating systems, compiler flags and SPECfp95 ratings of the various computers are listed in Table 2. A comparison of Tables 3 and 4 reveals that the use of an extended pair list allows a significant reduction of the computing time, especially for large systems. The speed-up is, however, very much dependent on the choice of parameters for the non-bonded interactions. For benchmark VI, the large water box system, a speed-up of a factor 2 to 3, depending on the type of computer, is achieved using the extended pair list. For the thrombin in water benchmark (IV), however, because of the use of a long-range cut-off, speed-ups of only a factor 1.5 to 2 are achieved.

Table 1
GROMOS96 benchmark systems

Benchmark | Molecules | Nsm | Nsa | Nsolvm | Nsolva | Na | Ncg | Boundary condition | NMD
I | cyclosporin A | 1 | 90 | 0 | 0 | 90 | 40 | vacuo | 1000
II | cyclosporin A in water | 1 | 90 | 764 | 2298 | 2388 | 804 | octa | 100
III | thrombin | 1 | 3078 | 0 | 0 | 3078 | 1285 | vacuo | 100
IV | thrombin in water | 1 | 3078 | 5427 | 16281 | 19359 | 6712 | octa | 10
V | H2O (medium) | 0 | 0 | 1728 | 5184 | 5184 | 1728 | cubic | 100
VI | H2O (large) | 0 | 0 | 13824 | 41472 | 41472 | 13824 | cubic | 10

Nsm = number of solute molecules, Nsa = number of solute atoms, Nsolvm = number of solvent molecules, Nsolva = number of solvent atoms, Na = total number of atoms, Ncg = total number of charge groups, NMD = number of MD time steps. The boundary conditions are: vacuo = vacuum boundary condition, octa = periodic truncated octahedron and cubic = periodic cubic box. The integration time step was 0.002 ps. The non-bonded pair list was updated every 5 steps. The non-bonded cut-off was infinite (Rcp = Rcl = ∞) for benchmark I and Rcp = Rcl = 0.8 nm for benchmarks II, III, V and VI. A twin-range cut-off of Rcp = 0.8 nm and Rcl = 1.4 nm was used for benchmark IV. All simulations were performed at constant particle number, volume and temperature (N, V, T).

Because of the long-range cut-off Rcl of 1.4 nm, an extended cut-off Rcx of 1.8 nm had to be used in benchmark IV, resulting in a much larger extended pair list (about 10 times the size of the normal pair list covering Rcp = 0.8 nm) than when a single cut-off is used, as in benchmark VI (Rcx = 1.2 nm, resulting in an extended pair list about 4.5 times longer than the normal pair list). In addition, more time is spent evaluating the long-range forces.

To assess the effect of the non-bonded cut-off on the computing time, benchmarks IV and V were run for increasing Rcp values (Fig. 1). The long-range cut-off Rcl was kept constant (1.4 nm) in benchmark IV. When no long-range contribution is evaluated, the computing time increases almost linearly with the number of pairs in the non-bonded pair list. A cut-off increase by a factor of two results in a time increase by a factor of 6 and an increase of the size of the non-bonded pair list by a factor of 8. When a long-range contribution is calculated, as in benchmark IV, the increase of the computing time as a function of the non-bonded cut-off is less pronounced (a factor of 2 only), the increase of the non-bonded pair list being compensated by a smaller number of long-range force evaluations.

Fig. 1. CPU time in seconds on a SGI Impact R10000, 195 MHz, as a function of the non-bonded cut-off radius Rcp. For benchmark IV (circles), the long-range cut-off radius Rcl was kept constant at Rcl = 1.4 nm, while for benchmark V (squares), it was taken equal to Rcp (Rcl = Rcp). See also Table 1 for the values of the other parameters. Results using the standard and extended pair-list versions (see text) are indicated by filled and open symbols, respectively.

The update frequency of the non-bonded pair list and of the long-range forces also affects the computing time (Fig. 2). Only minor speed-ups are obtained for update frequencies below once per 5 MD time steps, while heating effects, which affect the precision of the simulations, start appearing. This is especially true for simulations in water using long-range forces, where water reorientation becomes a source of heating at update frequencies of once per 10 steps or lower.
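The factor of 8 growth of the pair list quoted above follows directly from the volume of the cut-off sphere: for a roughly homogeneous system the number of pairs grows as Rcp^3. The two-line check below (which assumes a water-like atom number density of about 100 atoms/nm^3, a round value not taken from the paper) also reproduces the "more than 900 neighbour atoms within 1.4 nm" estimate of Section 2:

    import math

    def neighbours_within(r_cut, atom_density=100.0):
        # approximate number of neighbour atoms within r_cut [nm], assuming a
        # homogeneous water-like atom number density of ~100 atoms/nm**3
        return atom_density * 4.0 / 3.0 * math.pi * r_cut ** 3

    print(neighbours_within(1.4))                          # ~1150, i.e. more than 900
    print(neighbours_within(1.6) / neighbours_within(0.8))  # doubling the cut-off: 2**3 = 8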

Table 2
Description of the computers on which the GROMOS96 benchmarks were run

Computer | Operating system | SPECfp95 (a) | Compiler version and flags
CRAY YMP/C916 | UNICOS 9.0.2.2 | not available | cf77, 6.3: -Wf" -dp" -Oscalar3 -Ovector0 -Oinline2; -Wf" -dp -o aggress -em" -Zv (special flags for the vector non-bonded routines)
DEC alpha 600, 300 MHz | Digital UNIX T4.0D-2 | 12.2 | f77: -O5 -r8 -u -tune host
DEC 21164, 600 MHz | Redhat Linux 5.1, 2.0.34 | 27.0 | f77: -O5 -r8 -u -tune ev56 -non_shared (compiled under Digital UNIX T4.0D-2)
DEC 21264, AS8400, 575 MHz | OSF V4.0 | 47.1 | f77: -O5 -r8 -u -tune host
IBM 6000/397, 160 MHz | AIX 4.2 | 26.6 | xlf: -O3 -u
Pentium Pro, 200 MHz | Redhat Linux 5.1, 2.0.34 | 6.8 | egcs 1.0.3: -O3
Pentium II, 450 MHz | Solaris 2.6 | 13.3 | PGI f77 3.0: -Munroll -O3 -tp p6 -r8 -Mnoframe -Mdclchk -byteswapio
HP 9000-735, 99 MHz | HP-UX B.10.20 | 3.4 | f77: +O3 -R8
SGI O2, R5000, 180 MHz | IRIX 6.3 | 5.4 | f77, 7.1: -n32 -O3 -r8 -OPT:roundoff=3:IEEE_arithmetic=3:fast_sqrt=OFF
SGI O2, R10000, 175 MHz | | 8.8 |
SGI PowerChallenge R10000, 194 MHz | IRIX64 6.2 | 12.4 |
SGI Impact, R10000, 195 MHz | | 13.8 |
SGI Origin200, 180 MHz | IRIX64 6.4 | 15.6 | f77, 7.1: -mips3 -n32 -O3 -r8 -r5000 -OPT:roundoff=3:IEEE_arithmetic=3:fast_sqrt=OFF
SGI Octane, 195 MHz | | 17.0 |
Sun Ultra-1, 200 MHz | Solaris 2.6 | 9.4 | f77, 4.2: -r8 -native -O4 -libmil -dalign -xlibmopt -depend -unroll=4
Sun Ultra-10, 333 MHz | | 12.9 |
Sun Ultra-30, 300 MHz | | 14.9 |

Empty operating-system or compiler entries indicate the same value as in the row above.
(a) SPECfp95 obtained from the Standard Performance Evaluation Corporation (http://www.spec.org).

Table 3
Computer CPU time [s] required for MD simulations

Computer | Benchmark I | II | III | IV | V | VI
CRAY YMP/C916 (using special vector version) | 8.6 | 24.6 | 28.2 | 29.8 | 50.3 | 46.9
DEC alpha 600, 300 MHz | 8.8 | 39.6 | 38.1 | 84.4 | 89.0 | 151.4
DEC 21164, 600 MHz, Linux | 3.9 | 20.9 | 16.2 | 46.7 | 47.0 | 81.7
DEC 21264, AS8400, 575 MHz | 2.4 | 11.4 | 9.9 | 24.3 | 24.8 | 39.0
IBM 6000/397, 160 MHz | 8.7 | 51.8 | 38.6 | 114.8 | 104.3 | 173.6
Pentium Pro, 200 MHz, Linux | 11.5 | 63.7 | 46.6 | 122.7 | 151.7 | 222.2
Pentium II, 450 MHz, Solaris 2.6 | 4.2 | 22.5 | 18.7 | 42.5 | 48.2 | 73.7
HP 9000-735, 99 MHz | 18.0 | 96.9 | 75.7 | 218.4 | 210.5 | 374.2
SGI O2, R5000, 180 MHz | 14.6 | 69.3 | 60.3 | 132.5 | 150.1 | 252.8
SGI O2, R10000, 175 MHz | 7.5 | 39.9 | 32.8 | 78.1 | 87.6 | 149.5
SGI PowerChallenge R10000, 194 MHz | 7.8 | 43.8 | 33.4 | 77.4 | 97.6 | 156.8
SGI Impact, R10000, 195 MHz | 6.7 | 35.8 | 28.4 | 69.4 | 78.7 | 130.8
SGI Origin200, 180 MHz | 6.9 | 36.0 | 29.9 | 71.2 | 79.3 | 135.5
SGI Octane, 195 MHz | 6.4 | 33.3 | 28.1 | 66.0 | 73.4 | 125.8
Sun Ultra-1, 200 MHz | 11.6 | 49.3 | 48.1 | 104.9 | 105.6 | 166.6
Sun Ultra-10, 333 MHz | 7.4 | 33.8 | 30.7 | 68.9 | 74.0 | 128.8
Sun Ultra-30, 300 MHz | 8.0 | 34.3 | 34.5 | 72.6 | 73.9 | 126.7

All benchmarks were run with the standard GROMOS96 version [1,2]. See Table 1 for the benchmark definitions and Table 2 for a description of the operating systems, the compiler options used and the SPECfp95 values of the various computers. The CPU timing will depend on the machine configuration, the compiler options used, the operating system and the structure of the GROMOS code.

When the update is performed more frequently than once per 5 steps, the computing time starts to increase significantly, especially when long-range forces need to be evaluated (filled circles in Fig. 2). The use of an extended pair list almost completely removes the dependence of the required CPU time on the update frequency when a single cut-off is used (open squares in Fig. 2). The effect is less pronounced when long-range forces need to be evaluated (open circles in Fig. 2).

Fig. 2. CPU time in seconds on a SGI Impact R10000, 195 MHz, as a function of the number of MD time steps between non-bonded pair list updates. See also Table 1 for the values of the other parameters and the caption of Fig. 1 for the definition of the symbols.
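As an illustration of how the benchmark timings translate into attainable simulation lengths, the short sketch below (a rough estimate that assumes the 0.002 ps time step of Table 1 and neglects any I/O or analysis overhead) converts a measured CPU time into picoseconds of simulation per day of wall-clock time:

    def ps_per_day(cpu_seconds, n_md_steps, dt_ps=0.002):
        # simulated picoseconds per day of wall-clock time for a benchmark that
        # needed cpu_seconds of CPU time for n_md_steps MD steps of dt_ps each
        return 86400.0 / (cpu_seconds / n_md_steps) * dt_ps

    # Benchmark IV (thrombin in water, 10 MD steps) on the SGI Impact R10000, 195 MHz
    print(ps_per_day(69.4, 10))   # standard pair list (Table 3): ~25 ps/day
    print(ps_per_day(35.2, 10))   # extended pair list (Table 4): ~49 ps/day

Applying the same estimate to the fastest workstation in Table 4 (DEC 21264, 16.8 s for benchmark IV) gives roughly 100 ps per day, in line with the 50 to 100 psec per day quoted in the Discussion.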

4. Discussion

Benchmarks for the GROMOS96 molecular dynamics program have been presented for various molecular systems on a variety of computers, providing a reference for comparison with other MD programs and for the planning of future simulations.

Table 4
Computer CPU time [s] required for MD simulations using an extended cut-off for the non-bonded pair list generation

Computer | Benchmark II | III | IV | V | VI
DEC alpha 600, 300 MHz | 29.2 | 26.4 | 47.3 | 61.5 | 50.8
DEC 21164, 600 MHz, Linux | 16.2 | 13.3 | 25.4 | 33.6 | 28.5
DEC 21264, AS8400, 575 MHz | 8.6 | 7.7 | 16.8 | 17.5 | 18.9
IBM 6000/397, 160 MHz | 39.6 | 28.5 | 62.7 | 76.4 | 63.6
Pentium Pro, 200 MHz, Linux | 54.8 | 39.8 | 79.2 | 111.0 | 93.3
Pentium II, 450 MHz, Solaris 2.6 | 18.6 | 14.6 | 33.7 | 37.2 | 39.5
HP 9000-735, 99 MHz | 69.7 | 59.8 | 110.3 | 125.4 | 107.5
SGI O2, R5000, 180 MHz | 49.3 | 46.1 | 73.2 | 99.7 | 85.4
SGI O2, R10000, 175 MHz | 28.2 | 24.6 | 42.1 | 55.8 | 49.5
SGI PowerChallenge R10000, 194 MHz | 25.8 | 24.4 | 41.1 | 52.2 | 43.4
SGI Impact, R10000, 195 MHz | 25.8 | 21.3 | 35.2 | 48.3 | 40.6
SGI Origin200, 180 MHz | 25.1 | 22.7 | 37.1 | 52.3 | 43.0
SGI Octane, 195 MHz | 23.2 | 21.0 | 34.4 | 48.3 | 39.7
Sun Ultra-1, 200 MHz | 52.6 | 41.4 | 69.4 | 94.1 | 78.3
Sun Ultra-10, 333 MHz | 29.0 | 26.6 | 53.0 | 54.6 | 59.0
Sun Ultra-30, 300 MHz | 32.0 | 28.6 | 47.3 | 60.5 | 48.7

The benchmarks were run with the same parameters as for the standard version. The extended pair list cut-off Rcx exceeded the standard cut-off values given in Table 1 by 0.4 nm. The extended pair list was automatically updated when an atom had moved by more than 0.4 nm, resulting in typical update frequencies of around once per 100 MD steps. Benchmarks IV and VI were run for 100 MD steps and the resulting CPU times were divided by ten to allow comparison with the numbers in Table 3.

The dependence of the computing time on the treatment of the non-bonded interaction, in particular on the non-bonded cut-off and on the update frequency of the non-bonded pair list, has been discussed, and a new implementation for the generation of the non-bonded pair list based on an extended pair list has been described. The idea of an extended pair list originates from the work of Fincham and Ralston [10], who defined a skin around the spheres corresponding to the non-bonded cut-off standardly used in MD simulations. It has been adapted here to allow its use with triple-range cut-offs for updating the standard non-bonded pair list and the long-range forces. The approach allows for a significant speed-up of MD simulations for large biomolecular systems. We should add here that the use of triple-range cut-off radii does not introduce any new artefacts compared to the standard use of cut-off radii for the calculation of non-bonded interactions, since it only affects the way the non-bonded pair list is generated, but neither its content nor the force calculation itself. Further, the speed-up achievable with an extended pair list is not very sensitive to the composition of the system (e.g., pure solvent, protein, ionic solutions) but rather to the choice of non-bonded interaction parameters, such as the values of the twin-range cut-off radii and the non-bonded pair list update frequency.

Generally, MD simulation programs for molecular systems can be made faster using a variety of techniques and tricks, which can be classified as follows:
(1) vectorisation for vector processors such as Cray, Fujitsu and NEC computers,
(2) parallelisation for shared-memory systems or distributed-memory parallel computers such as the SGI Origin2000 and IBM SP,
(3) simplification of the algorithm and code by a simplification of the physical model or a reduction of the functionality, e.g., no Coulomb forces, no periodic boundary conditions, use of a cut-off radius,
(4) use of more efficient algorithms to carry out specific tasks, such as selecting non-bonded neighbours, imposing constraints or calculating square roots.

Here, we have only considered a technique of the last category, the use of an extended pair list to select non-bonded neighbours. It enhances the computational efficiency by 50 to 300%: for a medium-sized protein (≈300 residues) in water, 50 to 100 psec of simulation per day can be produced on a fast workstation or personal computer. Other improvements of the type of category (4) are possible. It is reasonable to expect at least another order of magnitude increase in speed, leading to a production of 1 nsec of simulation per day for a 300-residue protein in aqueous solution on current hardware. Using the shared-memory parallel version of GROMOS96 on a limited number of processors (up to 8), a speed-up of up to a factor 6 can be obtained. This means that, by combining a variety of techniques, MD simulations of a microsecond for a medium-sized protein in water will in the near future be within reach using readily available multiprocessor machines.

Acknowledgments

The authors thank Roland Bürgi, Wolfgang Damm and Xavier Daura for help in carrying out the benchmarks.

References

[1] W.F. van Gunsteren, S.R. Billeter, A.A. Eising, P.H. Hünenberger, P. Krüger, A.E. Mark, W.R.P. Scott, I.G. Tironi, Biomolecular Simulation: The GROMOS96 Manual and User Guide (Vdf Hochschulverlag AG an der ETH, Zürich, 1996) pp. 1–1042.
[2] W.R.P. Scott, P.H. Hünenberger, I.G. Tironi, A.E. Mark, S.R. Billeter, J. Fennen, A.E. Torda, T. Huber, P. Krüger, W.F. van Gunsteren, J. Phys. Chem. A 103 (1999) 3596.
[3] X. Daura, A.E. Mark, W.F. van Gunsteren, J. Comput. Chem. 19 (1998) 535.
[4] I.G. Tironi, R. Sperb, P.E. Smith, W.F. van Gunsteren, J. Chem. Phys. 102 (1995) 5451.
[5] B. Quentrec, C. Brot, J. Comput. Phys. 13 (1973) 430.
[6] R.W. Hockney, S.P. Goel, J.W. Eastwood, J. Comput. Phys. 14 (1974) 148.
[7] R.W. Hockney, J.W. Eastwood, Computer Simulation Using Particles (McGraw-Hill, New York, 1981).
[8] W.F. van Gunsteren, H.J.C. Berendsen, F. Colonna, D. Perahia, J. Hollenberg, D. Lellouch, J. Comput. Chem. 5 (1984) 272.
[9] M.P. Allen, D.J. Tildesley, Computer Simulation of Liquids (Clarendon, Oxford, 1987).
[10] D. Fincham, B.J. Ralston, Comput. Phys. Commun. 23 (1981) 134.
[11] S.M. Thompson, CCP5 Quarterly 8 (1983) 20.